Fallacymania: party game where you notice fallacies in arguments
Fallacymania is a game developed by the Moscow LessWrong community. The main goals of this game are to help people notice fallacies in arguments and, of course, to have fun. The game requires 3-20 players (4-12 recommended) and some materials: printed A3 sheets with fallacies (5-10 sheets), a card deck with fallacies (you can cut one A3 sheet into cards, or print stickers and put them on ordinary playing cards), pens and blank sheets of paper, and one card deck of any type with at least 50 cards (optional, for counting guessing attempts). The rules of the game are explained here:
https://drive.google.com/open?id=0BzyKVqP6n3hKQWNzV3lWRTYtRzg
This is the sheet of fallacies; you can download it and print it on an A3 or A2 sheet of paper:
https://drive.google.com/open?id=0BzyKVqP6n3hKVEZSUjJFajZ2OTA
You can also use this sheet to create playing cards for debaters.
When we created this game, we used these online articles and artwork about fallacies:
http://obraz.io/ru/posters/poster_view/1/?back_link=%2Fru%2F&lang=en&arrow=right
http://www.informationisbeautiful.net/visualizations/rhetological-fallacies/
http://lesswrong.com/lw/e95/the_noncentral_fallacy_the_worst_argument_in_the/
I've also made an electronic version of Fallacymania for Tabletop Simulator (in the Steam Workshop):
http://steamcommunity.com/sharedfiles/filedetails/?id=723941480
Counterfactual Mugging Alternative
Edit as of June 13th, 2016: I no longer believe this to be easier to understand than traditional CM, but stand by the rest of it. Minor aesthetic edits made.
First post on the LW discussion board. Not sure if something like this has already been written, need your feedback to let me know if I’m doing something wrong or breaking useful conventions.
An alternative to the counterfactual mugging, since people often need it explained a few times before they understand it. I think this version will be faster for most people to comprehend because it arose organically, rather than seeming specifically contrived to create a dilemma between decision theories:
Pretend you live in a world where time travel exists and Time can create realities with acausal loops, with ordinary linear chronology, or with another structure, so long as there is no paradox: only self-consistent timelines can be generated.
In your timeline, there are prophets. A prophet (known to you to be honest and truly prophetic) tells you that you will commit an act which seems horrendously imprudent or problematic. It is an act whose effect will be on the scale of losing $10,000; an act you never would have taken ordinarily. But fight the prophecy all you want, it is self-fulfilling and you definitely live in a timeline where the act gets committed. However, if it weren't for the prophecy being immutably correct, you could have spent $100 and, even having heard the prophecy (even having believed it would be immutable), the probability of you taking that action would be reduced by, say, 50%. So fighting the prophecy by spending $100 would mean that there were 50% fewer self-consistent (possible) worlds where you lost the $10,000, because it's just much less likely for you to end up taking that action if you fight it rather than succumbing to it.
You may feel that there would be no reason to spend $100 averting a decision that you know you're going to make, and see no reason to care about counterfactual worlds where you don't lose the $10,000. But the fact of the matter is that if you could have precommitted to fight the choice, you would have, because in the worlds where that prophecy could have been presented to you, you'd be decreasing the average disutility by ($10,000)(0.5 probability) - ($100) = $4,900. Not following a precommitment that you would have made to prevent the exact situation you're now in, just because you wouldn't have followed the precommitment, seems an obvious failure mode; UDT successfully does the calculation shown above and tells you to fight the prophecy. The simple fact that should tell causal decision theorists that converting to UDT is the causally optimal decision is that updateless decision theorists actually do better on average than CDT proponents.
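The expected-loss comparison above can be written out explicitly; a small sketch using the illustrative numbers from the paragraph:

```python
# Expected-loss comparison for the prophecy scenario (illustrative numbers
# only: a $10,000 loss, a $100 cost to fight, and a 50% reduction in the
# probability of the loss if you fight).

LOSS = 10_000
FIGHT_COST = 100
P_LOSS_IF_SUCCUMB = 1.0   # you definitely commit the act
P_LOSS_IF_FIGHT = 0.5     # fighting halves the chance across possible worlds

def expected_loss(fight: bool) -> float:
    p = P_LOSS_IF_FIGHT if fight else P_LOSS_IF_SUCCUMB
    return p * LOSS + (FIGHT_COST if fight else 0)

succumb = expected_loss(False)  # 10000.0
fight = expected_loss(True)     # 0.5 * 10000 + 100 = 5100.0
print(succumb - fight)          # average benefit of precommitting: 4900.0
```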
(You may assume also that your timeline is the only timeline that exists, so as not to further complicate the problem by your degree of empathy with your selves from other existing timelines.)
Newcomb, Bostrom, Calvin: Credence and the strange path to a finite afterlife
This is a bit rough, but I think that it is an interesting and potentially compelling idea. To keep this short, and accordingly increase the number of eyes over it, I have only sketched the bare bones of the idea.
1) Empirically, people have varying intuitions and beliefs about causality, particularly in Newcomb-like problems (http://wiki.lesswrong.com/wiki/Newcomb's_problem, http://philpapers.org/surveys/results.pl, and https://en.wikipedia.org/wiki/Irresistible_grace).
2) Also, as an empirical matter, some people believe in taking actions after the fact, such as one-boxing, or Calvinist “irresistible grace”, to try to ensure or conform with a seemingly already determined outcome. This might be out of a sense of retrocausality, performance, moral honesty, etc. What matters is that we know that they will act it out, despite it violating common sense causality. There has been some great work on decision theory on LW about trying to thread this needle well.
3) The second disjunct of the simulation argument (http://wiki.lesswrong.com/wiki/Simulation_argument) shows that the decision making of humanity is evidentially relevant to what our subjective credence should be that we are in a simulation. That is to say, if we are actively headed toward making simulations, we should increase our credence that we are in a simulation; if we are actively headed away from making simulations, through either existential risk or law/policy against them, we should decrease it.
4) Many, if not most, people would like for there to be a pleasant afterlife after death, especially if we could be reunited with loved ones.
5) There is no reason to believe that simulations which are otherwise nearly identical copies of our world could not contain, after the simulated bodily death of the participants, an extremely long-duration, though finite, "heaven"-like afterlife shared by simulation participants.
6) Our heading towards creating such simulations, especially if they were capable of nesting simulations, should increase our credence that we exist in such a simulation, and perhaps we should expect a heaven-like afterlife of long, though finite, duration.
7) Those who believe in alternative causality, or retrocausality, in Newcomb-like situations should be especially excited about the opportunity to push the world towards surviving, allowing these types of simulations, and creating them, as it would potentially suggest, analogously, that if they work towards creating simulations with heaven-like afterlives, that they might in some sense be “causing” such a heaven to exist for themselves, and even for friends and family who have already died. Such an idea of life-after-death, and especially for being reunited with loved ones, can be extremely compelling.
8) I believe that people matching the above description, that is, holding both an intuition in alternative causality, and finding such a heaven-like-afterlife compelling, exist. Further, the existence of such people, and their associated motivation to try to create such simulations, should increase the credence even of two-boxing types, that we already live in such a world with a heaven-like afterlife. This is because knowledge of a motivated minority desiring simulations should increase credence in the likely success of simulations. This is essentially showing that “this probably happened before, one level up” from the two-box perspective.
9) As an empirical matter, I also think that there are people who would find the idea of creating simulations with heaven-like afterlives compelling, even if they are not one-boxers, from a simply altruistic perspective, both since it is a nice thing to do for the future sim people, who can, for example, probabilistically have a much better existence than biological children on earth can, and as it is a nice thing to do to increase the credence (and emotional comfort) of both one-boxers and two-boxers in our world thinking that there might be a life after death.
10) This creates the opportunity for a secular movement in which people work towards creating these simulations, and use this work and potential success in order to derive comfort and meaning from their life. For example, making donations to a simulation-creating or promoting, or existential threat avoiding, think-tank after a loved one’s death, partially symbolically, partially hopefully.
11) There is at least some room for Pascalian considerations even for two-boxers who allow for some humility in their beliefs. Nozick believed one-boxers would become two-boxers if the amount in Box A were raised to $900,000, and two-boxers would become one-boxers if it were lowered to $1. Similarly, trying to work towards these simulations, even if you do not find it altruistically compelling, and even if you think that the odds of alternative or retrocausality are infinitesimally small, might make sense in that the reward could be extremely large, including potentially trillions of lifetimes worth of time spent in an afterlife "heaven" with friends and family.
Finally, this idea might be one worth filling in (I have been, in my private notes for over a year, but am a bit shy to debut that all just yet; even working up the courage to post this was difficult) if only because it is interesting, and could be used as a hook to get more people interested in existential risk, including the AI control problem. This is because existential catastrophe is probably the greatest enemy of credence in the future of such simulations, and accordingly of our reasonable credence in thinking that we have such a heaven awaiting us after death now. A short hook headline like "avoiding existential risk is key to afterlife" can get a conversation going. I can imagine Salon, etc. taking another swipe at it, and in doing so, creating publicity which would help in finding more similarly minded folks to get involved in the work of MIRI, FHI, CEA etc. There are also some really interesting ideas about acausal trade, and game theory between higher and lower worlds, as a form of "compulsion" in which they punish worlds for not creating heaven-containing simulations (therefore affecting their credence as observers of the simulation), in order to reach an equilibrium in which simulations with heaven-like afterlives are universal, or nearly universal. More on that later if this is received well.
Also, if anyone would like to join with me in researching, bull sessioning, or writing about this stuff, please feel free to IM me. Also, if anyone has a really good, non-obvious pin with which to pop my balloon, preferably in a gentle way, it would be really appreciated. I am spending a lot of energy and time on this if it is fundamentally flawed in some way.
Thank you.
*******************************
November 11 Updates and Edits for Clarification
1) There seems to be confusion about what I mean by self-location and credence. A good way to think of this is the Sleeping Beauty Problem (https://wiki.lesswrong.com/wiki/Sleeping_Beauty_problem)
If I imagine myself as Sleeping Beauty (and who doesn't?), and I am asked on Sunday what my credence is that the coin will be tails, I will say 1/2. If I am awakened during the experiment without being told which day it is and am asked what my credence is that the coin was tails, I will say 2/3. If I am then told it is Monday, I will update my credence to 1/2. If I am told it is Tuesday, I update my credence to 1. If someone asks me two days after the experiment about my credence of it being tails, and I somehow still do not know the day of the week, I will say 1/2. Credence changes with where you are, and with what information you have. As we might be in a simulation, we are somewhere in the "experiment days", and information can help orient our credence. As humanity potentially has some say in whether or not we are in a simulation, information about how humans make decisions about these types of things can and should affect our credence.
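The credences in this paragraph can be checked by simulation; here is a sketch that counts awakenings the "thirder" way (the position the numbers above assume):

```python
import random

random.seed(0)
awakenings = []  # (coin, day) pairs, one entry per awakening
for _ in range(100_000):
    coin = random.choice(["heads", "tails"])
    awakenings.append((coin, "Mon"))       # everyone is awakened Monday
    if coin == "tails":
        awakenings.append((coin, "Tue"))   # tails means a second awakening

def credence_tails(day=None):
    """Fraction of (matching) awakenings in which the coin was tails."""
    pool = [a for a in awakenings if day is None or a[1] == day]
    return sum(a[0] == "tails" for a in pool) / len(pool)

print(round(credence_tails(), 2))       # ~0.67: awakened, day unknown
print(round(credence_tails("Mon"), 2))  # ~0.5: told it is Monday
print(credence_tails("Tue"))            # 1.0: told it is Tuesday
```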
Imagine Sleeping Beauty is a lesswrong reader. If Sleeping Beauty is unfamiliar with the simulation argument, and someone asks her about her credence of being in a simulation, she probably answers something like 0.0000000001% (all numbers for illustrative purposes only). If someone shows her the simulation argument, she increases to 1%. If she stumbles across this blog entry, she increases her credence to 2%, and adds some credence to the additional hypothesis that it may be a simulation with an afterlife. If she sees that a ton of people get really interested in this idea, and start raising funds to build simulations in the future and to lobby governments both for great AI safeguards and for regulation of future simulations, she raises her credence to 4%. If she lives through the AI superintelligence explosion and simulations are being built, but not yet turned on, her credence increases to 20%. If humanity turns them on, it increases to 50%. If there are trillions of them, she increases her credence to 60%. If 99% of simulations survive their own run-ins with artificial superintelligence and produce their own simulations, she increases her credence to 95%.
2) This set of simulations does not need to recreate the current world or any specific people in it. That is a different idea that is not necessary to this argument. As written, the argument is premised on the idea of creating fully unique people. The point would be to increase our credence that we are functionally identical in type to the unique individuals in the simulation. This is done by creating ignorance or uncertainty in simulations, so that the majority of people similarly situated, in a world which may or may not be in a simulation, are in fact in a simulation. This should, in our ignorance, increase our credence that we are in a simulation. The point is about how we self-locate, as discussed in the original article by Bostrom. It is a short 12-page read, and if you have not read it yet, I would encourage it: http://simulation-argument.com/simulation.html. The point about past loved ones was to bring up the possibility that the simulations could be designed to transfer people to a separate afterlife simulation where they could be reunited after dying in the first part of the simulation. This was not about trying to create something for us to upload ourselves into, along with attempted replicas of dead loved ones. Staying in one simulation through two phases, a short life and a relatively long afterlife, also has the advantage of circumventing the teletransportation paradox, as "all of the person" can be moved into the afterlife part of the simulation.
Donna Capsella and the four applicants, pt.1
Once upon a time, in a dark, cruel world – maybe a world darker and crueller than it is – there lived a woman who wanted a piece of the action. Her name was Capsella Medik, but we remember her as Donna Capsella. This is an anecdote from her youth, told by a man who lived to tell it.
...you've got to understand, Donna started small. Real small. No money, no allies, no kin, and her wiles were – as feminine as they are. Still, she was ambitious, even then, and she had to look the part.
Girl had a way with people. Here's how it went.
One night, she rents a room – one table, five chairs – and two armed bodies, and sets up a date with four men at once – Mr. Burr, Mr. Sapp, Mr. Ast and Mr. Oriss, who've never seen her before. All are single, thirty-ish white collars. One look at the guns, and they're no trouble at all.
On the table, there's a heap: a coloured picture, a box of beads, another box (empty), four stacks of paper, four pens, a calculator and a sealed envelope.
'So,' says Donna. 'I need a manager. A clever man who'd keep my bank happy while I am...abroad. I'm offering you a game – just one game – and the winner is going to sign these papers. You leave hired, or not at all.'
The game was based on Mendel's Laws – can you imagine? The police never stood a chance against her... She had it printed out – a kind of cheat-sheet. It's like, if you have some biological feature, it's either what your genes say, or you helped Nature along the way; and the exact – wording – can differ, which is why you have blue eyes or brown eyes. Each wording is what they call an allele. Some alleles, the dominant ones, shout louder than others, the recessive ones; you'll have at most two copies of each gene (hopefully), but only one will ever be heard on the outside.
(It's not quite that simple, but we didn't protest. Guns, you know.)
So there was a picture of a plant whose leaves came in four shapes (made by two genes with two alleles each):

From left to right: simplex, rhomboidea, heteris and tenuis. Simplex had only recessive alleles, aabb. Rhomboidea and tenuis each had only one pair of recessive alleles – aaB? and A?bb. But heteris, that one was a puzzler: A?B?.
'Okay,' Donna waves her hand over the heap on the table. 'Here are the rules. You will see two parent plants, and then you will see their offspring – one at a time.' She shows us the box with the beads. 'Forty-eight kids total.' She begins putting some of the beads into the empty box, but we don't see which ones. 'The colours are like in the picture. You have to guess as much about the parents and the kids as you can as I go along. All betting stops when the last kid pops out. Guess wrong, even partially wrong, you lose a point, guess right, earn one. Screw around, you're out of the game. The one with the most points wins.'
'Uh,' mumbles Oriss. 'Can we, maybe, say we're not totally sure – ?..'
She smiles, and oh, those teeth. 'Yeah. Use your Bayes.'
And just like that, Oriss reaches to his stack of paper, ready to slog through all the calculations. (Oriss likes to go ahead and gamble based on some math, even if it's not rock solid yet.)
'Er,' tries Sapp. 'Do we have to share our guesses?'
'No, the others will only know that you earned or lost a point.'
And Sapp picks up his pen, but with a little frown. (He doesn't share much, does Sapp.)
'Um,' Ast breaks in. 'In a single round, do we guess simultaneously, or in some order?'
'Simultaneously. You write it down and give it to me.'
And Ast slumps down in his seat, sweating, and eyes the calculator. (Ast prefers to go where others lead, though he can change his mind lightning-fast.)
'Well,' Burr shrugs. 'I'll just follow rough heuristics, and we'll see how it goes.'
'Such as?' asks Donna, cocking her head to the side.
'As soon as there's a simplex kid, it all comes down to pure arithmetic, since we'll know both parents have at least one recessive allele for each of the genes. If both parents are heteris – and they will be, I see it in your eyes! – then the probability of at least one of them having at least one recessive allele is higher than the probability of neither having any. I can delay making guesses for a time and just learn what score the others get for theirs, since they're pretty easy to reverse-engineer – '
'What!' say Ast, Sapp and Oriss together.
'You won't get points fast enough,' Donna points out. 'You will lose.'
'I might lose. And you will hire me anyway. You need a clever man to keep your bank happy.'
Donna purses her lips.
'You haven't told anything of value, anything the others didn't know.'
'But of course,' Burr says humbly, and even the armed bodies scowl.
'You're only clever when you have someone to mooch off. I won't hire you alone.'
'Deal.'
'Mind, I won't pick you if you lose too badly.'
Burr leers at her, and she swears under her breath.
'Enough,' says Donna and puts down two red beads – the parents – on the table.
We take our pens. She reaches out into the box of offspring.
The first bead is red.
And the second one is red.
And the third one is red.
...I tell you, it was the longest evening in my life.
So, what are your Fermi estimates for the number of points Mr. Burr, Mr. Sapp, Mr. Ast and Mr. Oriss each earned? And who was selected as a manager, or co-managers? And how many people left the room?
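If you want to check the underlying genetics before guessing, here is a sketch of the dihybrid cross; it assumes both parents are full heterozygotes (AaBb), which the story deliberately leaves hidden behind the '?' marks:

```python
from itertools import product
from collections import Counter

def gametes(genotype):
    # genotype like "AaBb": one allele from each gene goes into each gamete
    return [g1 + g2 for g1, g2 in product(genotype[:2], genotype[2:])]

def phenotype(a1, a2, b1, b2):
    # a dominant (uppercase) allele "shouts louder" for each gene
    has_A = "A" in (a1 + a2)
    has_B = "B" in (b1 + b2)
    return {(True, True): "heteris", (True, False): "tenuis",
            (False, True): "rhomboidea", (False, False): "simplex"}[(has_A, has_B)]

def offspring_distribution(p1, p2):
    counts = Counter()
    for g1, g2 in product(gametes(p1), gametes(p2)):
        counts[phenotype(g1[0], g2[0], g1[1], g2[1])] += 1
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()}

print(offspring_distribution("AaBb", "AaBb"))
# the classic 9:3:3:1 split: heteris 9/16, tenuis 3/16,
# rhomboidea 3/16, simplex 1/16
```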
(I apologise - the follow-up won't be for a while.)
[Link] Game Theory YouTube Videos
I made a series of game theory videos that carefully go through the mechanics of solving many different types of games. I optimized the videos for my future Smith College game theory students who will either miss a class, or get lost in class and want more examples. I emphasize clarity over excitement. I would be grateful for any feedback.
Biases and Fallacies Game Cards
On the Stupid Questions Thread I asked
I need some list of biases for a game of Biased Pandemic for our Meet-Up. Do suitably prepared/formatted lists exist somewhere?
But none came forward.
Therefore I created a simple deck based on Wikipedia entries. I selected those that can presumably be used easily in a game, summarized the description and added an illustrative quote for each.
The deck can be found in Dropbox here (PDF and ODT).
I'd be happy for corrections and further suggestions.
ADDED: We used these cards during the LW Hamburg Meetup. They attracted significant interest, and even though we did use them during a board game, we also drew them and tried to act them out during a discussion round (which didn't work out that well, but stimulated discussion nonetheless).
Intuitive cooperation
This is an exposition of some of the main ideas in the paper Robust Cooperation. My goal is to make the ideas and proofs seem natural and intuitive, instead of some mysterious thing where we invoke Löb's theorem at the right place and the agents magically cooperate. I also hope it is accessible to people without a math or CS background. Be warned: it is pretty cheesy, ok.
In a small quirky town, far away from other cities or towns, the most exciting event is a game called (for historical reasons) The Prisoner's Dilemma. Everyone comes out to watch the big tournament at the end of Summer, and you (Alice) are especially excited because this year it will be your first time playing in the tournament! So you've been thinking of ways to make sure that you can do well.
The way the game works is this: Each player can choose to cooperate or defect with the other player. If you both cooperate, then you get two points each. If one of you defects, then that player will get three points, and the other player won't get any points. But if you both defect, then you each get only one point. You have to make your decisions separately, without communicating with each other - however, everyone is required to register the algorithm they will be using before the tournament, and you can look at the other player's algorithm if you want to. You also are allowed to use some outside help in your algorithm.

Now if you were a newcomer, you might think that no matter what the other player does, you can always do better by defecting, so the best strategy must be to always defect! Of course, you know better: if everyone tried that strategy, they would end up defecting against each other, which is a shame, since they would both be better off if they had just cooperated.
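The payoff structure above can be written down directly; a quick sketch (using C for cooperate and D for defect):

```python
# Payoffs from the rules above, as (my points, their points):
PAYOFF = {("C", "C"): (2, 2), ("C", "D"): (0, 3),
          ("D", "C"): (3, 0), ("D", "D"): (1, 1)}

def my_score(me, them):
    return PAYOFF[(me, them)][0]

# Defection strictly dominates: whatever the other player does,
# defecting scores more for me...
assert all(my_score("D", them) > my_score("C", them) for them in "CD")
# ...and yet mutual cooperation beats mutual defection:
assert my_score("C", "C") > my_score("D", "D")
```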
But how can you do better? You have to be able to describe your algorithm in order to play. You have a few ideas, and you'll be playing some practice rounds with your friend Bob soon, so you can try them out before the actual tournament.
Your first plan:
I'll cooperate with Bob if I can tell from his algorithm that he'll cooperate with me. Otherwise I'll defect.
For your first try, you'll just run Bob's algorithm and see if he cooperates. But there's a problem - if Bob tries the same strategy, he'll have to run your algorithm, which will run his algorithm again, and so on into an infinite loop!
So you'll have to be a bit more clever than that... luckily you know a guy, Shady, who is good at these kinds of problems.
You call up Shady, and while you are waiting for him to come over, you remember some advice your dad Löb gave you.
(Löb's theorem) "If someone says you can trust them on X, well then they'll just tell you X."
If (someone tells you: "if I tell you X, then X is true"),
Then (someone tells you: "X is true").
(See The Cartoon Guide to Löb's Theorem[pdf] for a nice proof of this)
Here's an example:
Sketchy watch salesman: Hey, if I tell you these watches are genuine then they are genuine!
You: Ok... so are these watches genuine?
Sketchy watch salesman: Of course!
It's a good thing to remember when you might have to trust someone. If someone you already trust tells you you can trust them on something, then you know that something must be true.
On the other hand, if someone says you can always trust them, well, that's pretty suspicious... If they say you can trust them on everything, that means they will never tell you a lie, which is logically equivalent to saying that if they were to tell you a lie, then that lie must be true. So by Löb's theorem, they will lie to you. (This is Gödel's second incompleteness theorem.)
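For readers who like symbols, the same point in modal-logic notation, where □X abbreviates "someone (a trusted prover) tells you X":

```latex
% Löb's theorem:
\[
  \Box(\Box X \to X) \;\to\; \Box X
\]
% "You can trust me on everything" asserts $\Box L \to L$ even for a
% lie $L$; announcing this gives $\Box(\Box L \to L)$, and Löb's
% theorem then yields $\Box L$: the claimant ends up telling you the lie.
```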
Despite his name, you actually trust Shady quite a bit. He's never told you or anyone else anything that didn't end up being true. And he's careful not to make any suspiciously strong claims about his honesty.
So your new plan is to ask Shady if Bob will cooperate with you. If so, then you will cooperate. Otherwise, defect. (FairBot)
It's game time! You look at Bob's algorithm, and it turns out he picked the exact same algorithm! He's going to ask Shady if you will cooperate with him. Well, the first step is to ask Shady, "will Bob cooperate with me?"
Shady looks at Bob's algorithm and sees that if Shady says you cooperate, then Bob cooperates. He looks at your algorithm and sees that if Shady says Bob cooperates, then you cooperate. Combining these, he sees that if he says you both cooperate, then both of you will cooperate. So he tells you that you will both cooperate (your dad was right!)
Let A stand for "Alice cooperates with Bob" and B stand for "Bob cooperates with Alice".
From looking at the algorithms, □B → A and □A → B (where □X means "Shady tells you X").
So combining these, □(A ∧ B) → (A ∧ B).
Then by Löb's theorem, A ∧ B.
Since that means that Bob will cooperate, you decide to actually cooperate.
Bob goes through an analogous thought process, and also decides to cooperate. So you cooperate with each other on the prisoner's dilemma! Yay!
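The proof-based agents can't literally be run here, but a bounded-simulation analogue (my own illustrative stand-in, where an optimistic base case plays the role Löb's theorem plays in the real argument) shows the same behaviour:

```python
# A runnable stand-in for FairBot. The agents in the story consult a
# proof searcher ("Shady"); here we approximate that with bounded mutual
# simulation, assuming cooperation at the recursion limit. This is a
# sketch, not the proof-based agent from the paper.

def fairbot(opponent, depth=3):
    if depth == 0:
        return "C"  # optimistic base case stands in for Löb's theorem
    # "Ask Shady": cooperate iff the opponent cooperates with me
    return "C" if opponent(fairbot, depth - 1) == "C" else "D"

def cooperate_rock(opponent, depth=0):
    return "C"

def defect_rock(opponent, depth=0):
    return "D"

print(fairbot(fairbot))         # C: FairBot cooperates with itself
print(fairbot(defect_rock))     # D: and defects against DefectRock
print(fairbot(cooperate_rock))  # C: but wastes a point on CooperateRock
```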
That night, you go home and remark, "it's really lucky we both ended up using Shady to help us, otherwise that wouldn't have worked..."
Your dad interjects, "Actually, it doesn't matter - as long as they were both smart enough to count, it would work. This doesn't just say 'I tell you X', it's stronger than that - it actually says 'Anyone who knows basic arithmetic will tell you X'. So as long as they both know a little arithmetic, it will still work - even if one of them is pro-axiom-of-choice, and the other is pro-axiom-of-life. The cooperation is robust." That's really cool!
But there's another issue you think of. Sometimes, just to be tricky, the tournament organizers will set up a game where you have to play against a rock. Yes, literally just a rock that holds the cooperate button down. If you play against a rock with your current algorithm, well, you start by asking Shady if the rock will cooperate with you. Shady is like, "well yeah, duh." So then you cooperate too. But you could have gotten three points by defecting! You're missing out on a totally free point!
You think that it would be a good idea to make sure the other player isn't a complete idiot before you cooperate with them. How can you check? Well, let's see if they would cooperate with a rock placed on the defect button (affectionately known as 'DefectRock'). If they know better than that, and they will cooperate with you, then you will cooperate with them.
The next morning, you excitedly tell Shady about your new plan. "It will be like before, except this time, I also ask you if the other player will cooperate with DefectRock! If they are dumb enough to do that, then I'll just defect. That way, I can still cooperate with other people who use algorithms like this one, or the one from before, but I can also defect and get that extra point when there's just a rock on cooperate."
Shady gets an awkward look on his face. "Sorry, but I can't do that... or at least it wouldn't work out the way you're thinking. Let's say you're playing against Bob, who is still using the old algorithm. You want to know if Bob will cooperate with DefectRock, so I have to check and see if I'll tell Bob that DefectRock will cooperate with him. I would have to say I would never tell Bob that DefectRock will cooperate with him. But by Löb's theorem, that means I would tell that obvious lie! So that isn't gonna work."
Notation: C(X, Y) = C if X cooperates with Y in the prisoner's dilemma (or = D if not).
You ask Shady: does □(C(Bob, DefectRock) = D)?
Bob's algorithm: C(Bob, DefectRock) = C only if □(C(DefectRock, Bob) = C).
So to say C(Bob, DefectRock) = D, we would need □(¬□(C(DefectRock, Bob) = C)).
This is equivalent to □(□(C(DefectRock, Bob) = C) → C(DefectRock, Bob) = C), since C(DefectRock, Bob) = C is an obvious lie.
By Löb's theorem, □(C(DefectRock, Bob) = C), which is a lie.
<Extra credit: does the fact that Shady is the one explaining this mean you can't trust him?>
<Extra extra credit: find and fix the minor technical error in the above argument.>
Shady sees the dismayed look on your face and adds, "...but, I know a guy who can vouch for me, and I think maybe that could make your new algorithm work."
So Shady calls his friend T over, and you work out the new details. You ask Shady if Bob will cooperate with you, and you ask T if Bob will cooperate with DefectRock. So T looks at Bob's algorithm, which asks Shady if DefectRock will cooperate with him. Shady, of course, says no. So T sees that Bob will defect against DefectRock, and lets you know. Like before, Shady tells you Bob will cooperate with you, and thus you decide to cooperate! And like before, Bob decides to cooperate with you, so you both cooperate! Awesome! (PrudentBot)
If Bob is using your new algorithm, you can see that the same argument goes through mostly unchanged, and that you will still cooperate! And against a rock on cooperate, T will tell you that it will cooperate with DefectRock, so you can defect and get that extra point! This is really great!!
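The same bounded-simulation trick sketches PrudentBot too (again my own stand-in, not the proof-based agent; "ask T" becomes a shallow check of how the opponent treats DefectRock):

```python
# A runnable stand-in for PrudentBot: cooperate iff the opponent would
# cooperate with me AND would defect against DefectRock. Bounded mutual
# simulation with an optimistic base case replaces the proof search.

def defect_rock(opponent, depth=0):
    return "D"

def cooperate_rock(opponent, depth=0):
    return "C"

def fairbot(opponent, depth=3):
    if depth == 0:
        return "C"
    return "C" if opponent(fairbot, depth - 1) == "C" else "D"

def prudentbot(opponent, depth=3):
    if depth == 0:
        return "C"
    coops_with_me = opponent(prudentbot, depth - 1) == "C"
    # "Ask T": would the opponent defect against DefectRock?
    not_an_idiot = opponent(defect_rock, 1) == "D"
    return "C" if (coops_with_me and not_an_idiot) else "D"

print(prudentbot(prudentbot))      # C: cooperates with itself
print(prudentbot(fairbot))         # C: and with FairBot
print(prudentbot(cooperate_rock))  # D: grabs the extra point vs the rock
print(prudentbot(defect_rock))     # D
```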
(ok now it's time for the really cheesy ending)
It's finally time for the tournament. You have a really good feeling about your algorithm, and you do really well! Your dad is in the audience cheering for you, with a really proud look on his face. You tell your friend Bob about your new algorithm so that he can also get that extra point sometimes, and you end up tying for first place with him!
A few weeks later, Bob asks you out, and you two start dating. Being able to cooperate with each other robustly is a good start to a healthy relationship, and you live happily ever after!
The End.
A simple game that has no solution
The following simple game has one solution that seems correct, but isn’t. Can you figure out why?
The Game
Player One moves first. He must pick A, B, or C. If Player One picks A, the game ends and Player Two does nothing. If Player One picks B or C, Player Two is told that Player One picked B or C, but not which of the two. Player Two must then pick X or Y, and then the game ends. The following shows the players' payoffs for each possible outcome, with Player One's payoff listed first.
A: 3,0 [and Player Two never gets to move]
B,X: 2,0
B,Y: 2,2
C,X: 0,1
C,Y: 6,0
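One way to see the trouble is to tabulate the payoffs and best replies; a sketch in Python (the belief parameter p_c is my own illustrative device, not part of the original statement):

```python
# Payoffs from the table above, as (Player One, Player Two):
PAYOFF = {"A": (3, 0),
          ("B", "X"): (2, 0), ("B", "Y"): (2, 2),
          ("C", "X"): (0, 1), ("C", "Y"): (6, 0)}

def two_best_reply(p_c):
    """Player Two's best reply given a belief p_c = P(One chose C | B or C)."""
    ex = p_c * PAYOFF[("C", "X")][1] + (1 - p_c) * PAYOFF[("B", "X")][1]
    ey = p_c * PAYOFF[("C", "Y")][1] + (1 - p_c) * PAYOFF[("B", "Y")][1]
    return "X" if ex > ey else "Y"

def one_best_reply(two_move):
    opts = {"A": PAYOFF["A"][0],
            "B": PAYOFF[("B", two_move)][0],
            "C": PAYOFF[("C", two_move)][0]}
    return max(opts, key=opts.get)

# The circularity: if Two would play Y, One's best move is C...
assert one_best_reply("Y") == "C"
# ...but if Two believes One played C, Two's best reply is X...
assert two_best_reply(1.0) == "X"
# ...and against X, One should just have taken A.
assert one_best_reply("X") == "A"
```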
[LINK] Prisoner's Dilemma? Not So Much
Hannes Rusch argues that the Prisoner's Dilemma is best understood as merely one of very many possible games:
only 2 of the 726 combinatorially possible strategically unique ordinal 2x2 games have the detrimental characteristics of a PD and that the frequency of PD-type games in a space of games with random payoffs does not exceed about 3.5%. Although this does not compellingly imply that the relevance of PDs is overestimated, in the absence of convergent empirical information about the ancestral human social niche, this finding can be interpreted in favour of a rather neglected answer to the question of how the founding groups of human cooperation themselves came to cooperate: Behavioural and/or psychological mechanisms which evolved for other, possibly more frequent, social interaction situations might have been applied to PD-type dilemmas only later.
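The rarity claim can be probed with a quick Monte Carlo sketch. Note that the definition of "PD-type" below (strict dominance for both players plus a Pareto-dominated equilibrium, with continuous random payoffs) is my own simplification; under it the analytic answer is 1/144, about 0.7%, so the result need not match Rusch's 3.5% figure, which counts ordinal games and allows ties:

```python
import random
from itertools import product

random.seed(0)

def is_pd_type(a, b):
    """a[r][c], b[r][c]: payoffs for the row/column player. PD-type here
    means: some labeling of each player's strategies as (C, D) makes D
    strictly dominant for both while (C, C) Pareto-dominates (D, D)."""
    for d_row, d_col in product((0, 1), repeat=2):
        c_row, c_col = 1 - d_row, 1 - d_col
        dominant_row = all(a[d_row][c] > a[c_row][c] for c in (0, 1))
        dominant_col = all(b[r][d_col] > b[r][c_col] for r in (0, 1))
        pareto = (a[c_row][c_col] > a[d_row][d_col] and
                  b[c_row][c_col] > b[d_row][d_col])
        if dominant_row and dominant_col and pareto:
            return True
    return False

n = 20_000
hits = sum(is_pd_type([[random.random() for _ in range(2)] for _ in range(2)],
                      [[random.random() for _ in range(2)] for _ in range(2)])
           for _ in range(n))
print(hits / n)  # a small fraction, near 1/144 under this definition
```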
[Sequence announcement] Introduction to Mechanism Design
Mechanism design is the theory of how to construct institutions for strategic agents, spanning applications like voting systems, school admissions, regulation of monopolists, and auction design. Think of it as the engineering side of game theory, building algorithms for strategic agents. While it doesn't have much to say about rationality directly, mechanism design provides tools and results for anyone interested in world optimization.
In this sequence, I'll touch on
- The basic mechanism design framework, including the revelation principle and incentive compatibility.
- The Gibbard-Satterthwaite impossibility theorem for strategyproof implementation (a close analogue of Arrow's Theorem), and restricted domains like single-peaked or quasilinear preferences where we do have positive results.
- The power and limitations of Vickrey-Clarke-Groves mechanisms for efficiently allocating goods, generalizing Vickrey's second-price auction.
- Characterizations of incentive-compatible mechanisms and the revenue equivalence theorem.
- Profit-maximizing auctions.
- The Myerson-Satterthwaite impossibility for bilateral trade.
- Two-sided matching markets à la Gale and Shapley, school choice, and kidney exchange.
As the list above suggests, this sequence is going to be semi-technical, but my foremost goal is to convey the intuition behind these results. Since mechanism design builds on game theory, take a look at Yvain's Game Theory Intro if you want to brush up.
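As a small taste of what's to come, the second-price (Vickrey) auction mentioned above fits in a few lines; the bidder names and valuations below are made up for illustration:

```python
def second_price_auction(bids):
    """Vickrey auction: the highest bidder wins but pays only the
    second-highest bid. Under this rule, bidding your true valuation
    is a dominant strategy."""
    ranked = sorted(bids.items(), key=lambda kv: kv[1], reverse=True)
    winner = ranked[0][0]
    price = ranked[1][1]
    return winner, price

winner, price = second_price_auction({"alice": 30, "bob": 25, "carol": 10})
# → alice wins and pays 25 (bob's bid), not her own 30
```

The design choice worth noticing is that the winner's payment doesn't depend on her own bid, which is exactly what removes the incentive to shade bids downward.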
Various resources:
- For further introduction, you can start with the popular or more scholarly survey of mechanism design from the 2007 Nobel memoriam prize in economics.
- Jeff Ely has lecture notes and short videos to accompany an undergraduate class in microeconomic theory from the perspective of mechanism design.
- The textbook A Toolbox for Economic Design by Dimitrios Diamantaras is very accessible and comprehensive if you can get ahold of a copy.
- Tilman Börgers has a draft textbook intended for graduate students.
- Chapters 9-16 of Algorithmic Game Theory and chapters 10-11 of Multiagent Systems cover various topics in mechanism design from the perspective of computer scientists.
- Video lectures introducing market design and computational aspects of mechanism design.
I plan on following up on this sequence with another focusing on group rationality and information aggregation, surveying scoring rules and prediction markets among other topics.
Suggestions and comments are very welcome.
Democracy and rationality
Note: This is a draft; so far, about the first half is complete. I'm posting it to Discussion for now; when it's finished, I'll move it to Main. In the mean time, I'd appreciate comments, including suggestions on style and/or format. In particular, if you think I should(n't) try to post this as a sequence of separate sections, let me know.
Summary: You want to find the truth? You want to win? You're gonna have to learn the right way to vote. Plurality voting sucks; better voting systems are built from the blocks of approval, medians (Bucklin cutoffs), delegation, and pairwise opposition. I'm working to promote these systems and I want your help.
Contents: 1. Overblown¹ rhetorical setup ... 2. Condorcet's ideals and Arrow's problem ... 3. Further issues for politics ... 4. Rating versus ranking; a solution? ... 5. Delegation and SODA ... 6. Criteria and pathologies ... 7. Representation, Proportional representation, and Sortition ... 8. What I'm doing about it and what you can ... 9. Conclusions and future directions ... 10. Appendix: voting systems table ... 11. Footnotes
1.
This is a website focused on becoming more rational. But that can't just mean getting a black belt in individual epistemic rationality. In a situation where you're not the one making the decision, that black belt is just a recipe for frustration.
Of course, there's also plenty of content here about how to interact rationally; how to argue for truth, including both hacking yourself to give in when you're wrong and hacking others to give in when they are. You can learn plenty here about Aumann's Agreement Theorem on how two rational Bayesians should never knowingly disagree.
But "two rational Bayesians" isn't a whole lot better as a model for society than "one rational Bayesian". Aspiring to be rational is well and good, but the Socratic ideal of a world tied together by two-person dialogue alone is as unrealistic as the sociopath's ideal of a world where their own voice rules alone. Society needs structures for more than two people to interact. And just as we need techniques for checking irrationality in one- and two-person contexts, we need them, perhaps all the more, in multi-person contexts.
Most of the basic individual and dialogical rationality techniques carry over. Things like noticing when you are confused, or making your opponent's arguments into a steel man, are still perfectly applicable. But there's also a new set of issues when n>2: the issues of democracy and voting. For a group of aspiring rationalists to come to a working consensus, of course they need to begin by evaluating and discussing the evidence, but eventually it will be time to cut off the discussion and just vote. When they do so, they should understand the strengths and pitfalls of voting in general and of their chosen voting method in particular.
And voting's not just useful for an aspiring rationalist community. As it happens, it's an important part of how governments are run. Discussing politics may be a mind-killer in many contexts, but there are an awful lot of domains where politics is a part of the road to winning.² Understanding voting processes a little bit can help you navigate that road; understanding them deeply opens the possibility of improving that road and thus winning more often.
2. Collective rationality: Condorcet's ideals and Arrow's problem
Imagine it's 1785, and you're a member of the French Academy of Sciences. You're rubbing elbows with most of the giants of science and mathematics of your day: Coulomb, Fourier, Lalande, Lagrange, Laplace, Lavoisier, Monge; even the odd foreign notable like Franklin with his ideas to unify electrostatics and electric flow.

One day, they'll put your names in front of lots of cameras (even though that foreign yokel Franklin will be in more pictures)
And this academy, with many of the smartest people in the world, has votes on stuff. Who will be our next president; who should edit and schedule our publications; etc. You're sure that if you all could just find the right way to do the voting, you'd get the right answer. In fact, you can easily prove that, or something like it: if a group is deciding between one right and one wrong option, and each member is independently more than 50% likely to get it right, then as the group size grows the chance of a majority vote choosing the right option goes to 1.
But somehow, there's still annoying politics getting in the way. Some people seem to win the elections simply because everyone expects them to win. So last year, the academy decided on a new election system to use, proposed by your rival, Jean-Charles de Borda, in which candidates get different points for being a voter's first, second, or third choice, and the one with the most points wins. But you're convinced that this new system will lead to the opposite problem: people who win the election precisely because nobody expected them to win, by getting the points that voters strategically don't want to give to a strong rival. But when people point that possibility out to Borda, he only huffs that "my system is meant for honest men!"
So with your proof of the above intuitive, useful result about two-way elections, you try to figure out how to reduce an n-way election to the two-candidate case. Clearly, you can show that Borda's system will frequently give the wrong results from that perspective. But frustratingly, you find that there could sometimes be no right answer: no candidate who would beat all the others in one-on-one races. A crack has opened up; could the collective decisions of individually rational agents be irrational?
Of course, the "you" in this story is the Marquis de Condorcet, and the year 1785 is when he published his Essai sur l’application de l’analyse à la probabilité des décisions rendues à la pluralité des voix, a work devoted to the question of how to achieve collective rationality. The theorem referenced above is Condorcet's Jury Theorem, which seems to offer hope that democracy can point the way from individually-imperfect rationality towards an ever-more-perfect collective rationality. Just as Aumann's Agreement Theorem shows that two rational agents should always move towards consensus, the Condorcet Jury Theorem apparently shows that if you have enough rational agents, the resulting consensus will be correct.
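The jury theorem's arithmetic is easy to check directly. Here is a minimal sketch (the function name and parameters are mine) that computes the exact probability of a correct majority from the binomial distribution:

```python
from math import comb

def majority_correct(p, n):
    """Probability that a majority of n independent voters, each correct
    with probability p, gets a binary question right (Condorcet's Jury
    Theorem setting; n assumed odd so there are no ties)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(n // 2 + 1, n + 1))

# With p = 0.6: one voter is right 60% of the time, but a group of 101
# independent such voters has roughly a 98% chance of a correct majority,
# and the probability keeps climbing toward 1 as the group grows.
```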
But as I said, Condorcet also opened a crack in that hope: the possibility that collective preferences will be cyclical. If the assumptions of the jury theorem don't hold — if each voter doesn't have a >50% chance of being right on a randomly-selected question, OR if the correctness of two randomly-selected voters is not independent — then individually-sensible choices can lead to collectively-ridiculous ones.
What do I mean by "collectively-ridiculous"? Let's imagine that the Rationalist Marching Band is choosing the colors for their summer, winter, and spring uniforms, and that they all agree that the only goal is to have as much as possible of the best possible colors. The summer-style uniforms come in red or blue, and they vote and pick blue; the winter-style ones come in blue or green, and they pick green; and the spring ones come in green or red, and they pick red.
Obviously, this makes us doubt their collective rationality. If, as they all agree they should, they had a consistent favorite color, they should have chosen that color both times that it was available, rather than choosing three different colors in the three cases. Theoretically, the salesperson could use such a fact to pump money out of them; for instance, offering to let them "trade up" their spring uniform from red to blue, then to green, then back to red, charging them a small fee each time; if they voted consistently as above, they would agree to each trade (though of course in reality human voters would probably catch on to the trick pretty soon, so the abstract ideal of an unending circular money pump wouldn't work).
This is the kind of irrationality that Condorcet showed was possible in collective decisionmaking. He also realized that there was a related issue with logical inconsistencies. If you were to take a vote on 3 logically related propositions — say, "Should we have a Minister of Silly Walks, to be appointed by the Chancellor of the Excalibur", "Should we have a Minister of Silly Walks, but not appointed by the Chancellor of the Excalibur", and "Should we in fact have a Minister of Silly Walks at all", where the third cannot be true unless one of the first two is — then you could easily get majority votes for inconsistent results — in this case, no, no, and yes, respectively. Obviously, there are many ways to fix the problem in this simple case — probably many less-wrong'ers would suggest some Bayesian tricks related to logical networks and treating votes as evidence⁸ — but it's a tough problem in general even today, especially when the logical relationships can be complex, and Condorcet was quite right to be worried about its implications for collective rationality.³
And that's not the only tough problem he correctly foresaw. A century and a half later and an ocean away, in the 1950s, Kenneth Arrow showed that it was impossible for a preferential voting system to avoid the problem of a "Condorcet cycle" of preferences. Arrow's theorem shows that any voting system which can consistently give the same winner (or, in ties, winners) for the same voter preferences; which does not make one voter the effective dictator; which is sure to elect a candidate if all voters prefer them; and which will switch the results for two candidates if you switch their names on all the votes... must exhibit, in at least some situation, the pathology that befell the Rationalist Marching Band above, or in other words, must fail "independence of irrelevant alternatives".
Arrow's theorem is far from obvious a priori, but its proof is not hard to understand intuitively using Condorcet's insight. Say that there are three candidates, X, Y, and Z, with roughly equal bases of support; and that they form a Condorcet cycle, because in two-way races, X would beat Y with help from Z supporters, Y would beat Z with help from X supporters, and Z would beat X with help from Y supporters. Now take whoever wins the three-way race — say, X — and remove the candidate who would have lost to them — Y in this case — and that "irrelevant" change hands the win to the third candidate — Z in this case.
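The cycle itself is easy to exhibit concretely. This sketch (bloc sizes are made up; the 34/33/33 split just breaks ties) tallies head-to-head majorities over three cyclic voter blocs:

```python
# Three near-equal voter blocs with cyclic preferences:
#   X > Y > Z,   Y > Z > X,   Z > X > Y
ballots = ([("X", "Y", "Z")] * 34
           + [("Y", "Z", "X")] * 33
           + [("Z", "X", "Y")] * 33)

def pairwise_winner(a, b):
    """Head-to-head majority winner between candidates a and b."""
    a_votes = sum(1 for order in ballots if order.index(a) < order.index(b))
    return a if a_votes > len(ballots) - a_votes else b

# Every candidate loses some one-on-one race:
# X beats Y, Y beats Z, yet Z beats X — a Condorcet cycle.
```

So with these ballots there is no candidate who beats all others head-to-head, which is exactly the situation Arrow's criteria cannot all survive.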
Summary of above: Collective rationality is harder than individual or two-way rationality. Condorcet saw the problem and tried to solve it, but Arrow saw that Condorcet had been doomed to fail.
3. Further issues for politics
So Condorcet's ideals of better rationality through voting appear to be in ruins. But at least we can hope that voting is a good way to do politics, right?
Not so fast. Arrow's theorem quickly led to further disturbing results. Alan Gibbard (and, independently, Mark Satterthwaite) extended it to show that there is no voting system which doesn't encourage strategic voting. That is, if you view a voting system as a class of games where the finite players and finite available strategies are fixed, no player is effectively a dictator, and the only thing that varies are the payoffs for each player from each outcome, there is no voting system where you can derive your best strategic vote purely by looking "honestly" at your own preferences; there is always the possibility of situations where you have to second-guess what others will do.
Amartya Sen piled on with another depressing extension of Arrow's logic. He showed that there is no possible way of aggregating individual choices into collective choice that satisfies two simple criteria. First, it shouldn't choose pareto-dominated outcomes: if everyone prefers situation XYZ to ABC, it shouldn't pick ABC. Second, it is "minimally liberal"; that is, there are at least two people who each get to freely make their own decision on at least one specific issue each, no matter what, so for instance I always get to decide between X and A (in Gibbard's⁴ example, colors for my house), and you always get to decide between Y and B (colors for your own house). The problem is that if you nosily care more about my house's color, the decision that should have been mine, and I nosily care about yours, more than we each care about our own, then the pareto-dominant situation is the one where we don't decide our own houses; and that nosiness could, in theory, be the case for any specific choice that, a priori, someone might have labelled as our Inalienable Right. It's not such a surprising result when you think about it that way, but it does clearly show that unswerving ideals of Democracy and Liberty will never truly be compatible.
Meanwhile, "public choice" theorists⁵ like Duncan Black, James Buchanan, etc. were busy undermining the idea of democratic government from another direction: the motivations of the politicians and bureaucrats who are supposed to keep it running. They showed that various incentives, including the strange voting scenarios explored by Condorcet and Arrow, would tend to open a gap between the motives of the people and those of the government, and that strategic voting and agenda-setting within a legislature would tend to extend the impact of that gap. Where Gibbard and Sen had proved general results, these theorists worked from specific examples. And in one aspect, at least, their analysis is devastatingly unanswerable: the near-ubiquitous "democratic" system of plurality voting, also known as first-past-the-post or vote-for-one or biggest-minority-wins, is terrible in both theory and practice.
So, by the 1980s, things looked pretty depressing for the theory of democracy. Politics, the theory went, was doomed forever to be worse than a sausage factory; disgusting on the inside and distasteful even from outside.
Should an ethical rationalist just give up on politics, then? Of course not. As long as the results it produces are important, it's worth trying to optimize. And as soon as you take the engineer's attitude of optimizing, instead of dogmatically searching for perfection or uselessly whining about the problems, the results above don't seem nearly as bad.
From this engineer's perspective, public choice theory serves as an unsurprising warning that tradeoffs are necessary, but more usefully, as a map of where those tradeoffs can go particularly wrong. In particular, its clearest lesson, in all-caps bold with a blink tag, that PLURALITY IS BAD, can be seen as a hopeful suggestion that other voting systems may be better. Meanwhile, the logic of both Sen's and Gibbard's theorems is built on Arrow's earlier result. So if we could find a way around Arrow, it might help resolve the whole issue.
Summary of above: Democracy is the worst political system... (...except for all the others?) But perhaps it doesn't have to be quite so bad as it is today.
4. Rating versus ranking
So finding a way around Arrow's theorem could be key to this whole matter. As a mathematical theorem, of course, the logic is bulletproof. But it does make one crucial assumption: that the only inputs to a voting system are rankings, that is, voters' ordinal preference orders for the candidates. No distinctions can be made using ratings or grades; that is, as long as you prefer X to Y to Z, the strength of those preferences can't matter. Whether you put Y almost up near X or way down next to Z, the result must be the same.
Relax that assumption, and it's easy to create a voting system which meets Arrow's criteria. It's called Score voting⁶, and it just means rating each candidate with a number from some fixed interval (abstractly speaking, a real number; but in practice, usually an integer); the scores are added up and the highest total or average wins. (Unless there are missing values, of course, total and average amount to the same thing.) You've probably used it yourself on Yelp, IMDB, or similar sites. And it clearly passes all of Arrow's criteria. Non-dictatorship? Check. Unanimity? Check. Symmetry over switching candidate names? Check. Independence of irrelevant alternatives? In the mathematical sense — that is, as long as the scores for other candidates are unchanged — check.
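A score-voting tally is about as simple as tallies get; here's a minimal sketch (candidate names and ratings are hypothetical):

```python
def score_winner(ballots):
    """Score voting: sum each candidate's ratings; highest total wins."""
    totals = {}
    for ballot in ballots:
        for candidate, rating in ballot.items():
            totals[candidate] = totals.get(candidate, 0) + rating
    return max(totals, key=totals.get)

# Three hypothetical voters rating candidates on a 0-5 scale:
ballots = [{"X": 5, "Y": 4, "Z": 0},
           {"X": 0, "Y": 4, "Z": 5},
           {"X": 5, "Y": 1, "Z": 0}]
# totals: X=10, Y=9, Z=5 → X wins
```

Notice that unlike a ranked ballot, the gap between Y's 4 and Z's 0 on the first ballot actually matters to the outcome; that preference-strength information is exactly what lets the system slip past Arrow's assumptions.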
So score voting is an ideal system? Well, it's certainly a far sight better than plurality. But let's check it against Sen and against Gibbard.
Sen's theorem was based on a logic similar to Arrow. However, while Arrow's theorem deals with broad outcomes like which candidate wins, Sen's deals with finely-grained outcomes like (in the example we discussed) how each separate house should be painted. Extending the cardinal numerical logic of score voting to such finely-grained outcomes, we find we've simply reinvented markets. While markets can be great things and often work well in practice, Sen's result still holds in this case; if everything is on the market, then there is no decision which is always yours to make. But since, in practice, as long as you aren't destitute, you tend to be able to make the decisions you care the most about, Sen's theorem seems to have lost its bite in this context.
What about Gibbard's theorem on strategy? Here, things are not so easy. Yes, Gibbard, like Sen, parallels Arrow. But while Arrow deals with what's written on the ballot, Gibbard deals with what's in the voter's head. In particular, if a voter prefers X to Y by even the tiniest margin, Gibbard assumes (not unreasonably) that they may be willing to vote however they need to, if by doing so they can ensure X wins instead of Y. Thus, the internal preferences Gibbard treats are, effectively, just ordinal rankings; and the cardinal trick by which score voting avoided Arrovian problems no longer works.
How does score voting deal with strategic issues in practice? The answer to that has two sides. On the one hand, score never requires voters to be actually dishonest. Unlike the situation in a ranked system such as plurality, where we all know that the strategic vote may be to dishonestly ignore your true favorite and vote for a "lesser evil" among the two frontrunners, in score voting you never need to vote a less-preferred option above a more-preferred option. At worst, all you have to do is exaggerate some distinctions and minimize others, so that you might end up giving equal votes to less- and more-preferred options.
Did I say "at worst"? I meant, "almost always". Voting strategy only matters to the result when, aside from your vote, two or more candidates are within one vote of being tied for first. Except in unrealistic, perfectly-balanced conditions, as the number of voters rises, the probability that anyone but the two a priori frontrunner candidates is involved in this tie falls to zero.⁷ Thus, in score voting, the optimal strategy is nearly always to vote your preferred frontrunner and all candidates above at the maximum, and your less-preferred frontrunner and all candidates below at the minimum. In other words, strategic score voting is basically equivalent to approval voting, where you give each candidate a 1 or 0 and the highest total wins.
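That reduction can be sketched directly. The function below (names, utilities, and the handling of candidates between the two frontrunners are my own illustrative choices) turns honest utilities into the near-optimal strategic score ballot just described:

```python
def strategic_score_ballot(utilities, frontrunners, max_score=5):
    """Near-optimal score strategy: everyone you like better than your
    less-preferred frontrunner gets the maximum score, everyone else the
    minimum -- so the score ballot collapses into an approval ballot.
    (Candidates strictly between the two frontrunners are lumped upward
    here; in general their best score depends on win probabilities.)"""
    worse = min(frontrunners, key=lambda c: utilities[c])
    return {c: max_score if utilities[c] > utilities[worse] else 0
            for c in utilities}

# Hypothetical utilities; X and Y are the perceived frontrunners:
ballot = strategic_score_ballot({"W": 9, "X": 7, "Y": 3, "Z": 1}, ("X", "Y"))
# → {"W": 5, "X": 5, "Y": 0, "Z": 0}: only the extremes get used
```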
In one sense, score voting reducing to approval is OK. Approval voting is not a bad system at all. For instance, if there's a known majority Condorcet winner — a candidate who could beat any other by a majority in a one-on-one race — and voters are strategic — they anticipate the unique strong Nash equilibrium, the situation where no group of voters could improve the outcome for all its members by changing their votes, whenever such a unique equilibrium exists — then the Condorcet winner will win under approval. That's a lot of words to say that approval will get the "democratic" results you'd expect in most cases.
But in another sense, it's a problem. If one side of an issue is more inclined to be strategic than the other side, the more-strategic faction could win even if it's a minority. That clashes with many people's ideals of democracy; and worse, it encourages mind-killing political attitudes, where arguments are used as soldiers rather than as ways to seek the truth.
But score and approval voting are not the only systems which escape Arrow's theorem through the trapdoor of ratings. If score voting, using the average of voter ratings, too-strongly encourages voters to strategically seek extreme ratings, then why not use the median rating instead? We know that medians are less sensitive to outliers than averages. And indeed, median-based systems are more resistant to one-sided strategy than average-based ones, giving better hope for reasonable discussion to prosper. That is to say, in a simple model, a minority would need twice as much strategic coordination under median as under average, in order to overcome a majority; and there's good reason to believe that, because of natural factional separation, reality is even more favorable to median systems than that model.
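The outlier-resistance claim is easy to see in miniature. In this sketch (the ratings are made up), a strategic minority drags a candidate's mean rating down sharply while leaving the median untouched:

```python
from statistics import mean, median

# Five voters rate one candidate on a 0-6 scale:
honest = [6, 6, 6, 4, 4]      # a majority honestly rates the candidate 6
strategic = [6, 6, 6, 0, 0]   # the minority exaggerates its low ratings

# The mean drops from 5.2 to 3.6 -- the minority's exaggeration pays off
# under average-based score voting. The median stays put at 6, so under a
# median system the same exaggeration accomplishes nothing.
```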
There are several different median systems available. In the US during the 1910-1925 Progressive Era, early versions collectively called "Bucklin voting" were used briefly in over a dozen cities. These reforms, based on counting all top preferences, then adding lower preferences one level at a time until some candidate(s) reach a majority, were all rolled back soon after, principally by party machines upset at upstart challenges or victories. The possibility of multiple, simultaneous majorities is a principal reason for the variety of Bucklin/Median systems. Modern proposals of median systems include Majority Approval Voting, Majority Judgment, and Graduated Majority Judgment, which would probably give the same winners almost all of the time. An important detail is that most median system ballots use verbal or letter grades rather than numeric scores. This is justifiable because the median is preserved under any monotonic transformation, and studies suggest that it would help discourage strategic voting.
Serious attention to rated systems like approval, score, and median systems barely began in the 1980s, and didn't really pick up until 2000. Meanwhile, the increased amateur interest in voting systems in this period — perhaps partially attributable to the anomalous 2000 US presidential election, or to more-recent anomalies in the UK, Canada, and Australia — has led to new discoveries in ranked systems as well. Though such systems are still clearly subject to Arrow's theorem, new "improved Condorcet" methods, which use certain tricks to count a voter's equal preferences between two candidates in whichever direction best suits the strategic situation, seem to offer promise that Arrovian pathologies can be kept to a minimum.
With this embarrassment of riches of systems to choose from, how should we evaluate which is best? Well, at least one thing is a clear consensus: plurality is a horrible system. Beyond that, things are more controversial; there are dozens of possible objective criteria one could formulate, and any system's inventor and/or supporters can usually formulate some criterion by which it shines.
Ideally, we'd like to measure the utility of each voting system in the real world. Since that's impossible — it would take not just a statistically-significant sample of large-scale real-world elections for each system, but also some way to measure the true internal utility of a result in situations where voters are inevitably strategically motivated to lie about that utility — we must do the next best thing, and measure it in a computer, with simulated voters whose utilities are assigned measurable values. Unfortunately, that requires assumptions about how those utilities are distributed, how voter turnout is decided, and how and whether voters strategize. At best, those assumptions can be varied, to see if findings are robust.
In 2000, Warren Smith performed such simulations for a number of voting systems. He found that score voting had, very robustly, one of the top expected social utilities (or, as he termed it, lowest Bayesian regret). Close on its heels were a median system and approval voting. Unfortunately, though he explored a wide parameter space in terms of voter utility models and inherent strategic inclination of the voters, his simulations did not include voters who were more inclined to be strategic when strategy was more effective. His strategic assumptions were also unfavorable to ranked systems, and slightly unrealistic in other ways. Still, though certain of his numbers must be taken with a grain of salt, some of his results were large and robust enough to be trusted. For instance, he found that plurality voting and instant runoff voting were clearly inferior to rated systems; and that approval voting, even at its worst, captured over half the benefit (relative to plurality) of any other system.
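A toy version of this kind of simulation fits in a few lines. The sketch below is only in the spirit of Smith's work (the utility model, voter counts, and honest-voting assumptions are all mine and much cruder than his): regret is the social utility of the best candidate minus that of the one actually elected.

```python
import random

def toy_regret(n_voters=99, n_cands=5, trials=500):
    """Toy Bayesian-regret comparison of honest plurality vs. honest
    approval voting, with i.i.d. uniform random voter utilities."""
    regret = {"plurality": 0.0, "approval": 0.0}
    for _ in range(trials):
        utils = [[random.random() for _ in range(n_cands)]
                 for _ in range(n_voters)]
        totals = [sum(u[c] for u in utils) for c in range(n_cands)]
        best = max(totals)
        # Plurality: each voter names only their single favorite.
        votes = [0] * n_cands
        for u in utils:
            votes[u.index(max(u))] += 1
        regret["plurality"] += best - totals[votes.index(max(votes))]
        # Approval: each voter approves everyone above their own mean utility.
        approvals = [0] * n_cands
        for u in utils:
            thresh = sum(u) / n_cands
            for c in range(n_cands):
                if u[c] > thresh:
                    approvals[c] += 1
        regret["approval"] += best - totals[approvals.index(max(approvals))]
    return {k: v / trials for k, v in regret.items()}

# In runs of this toy model, approval's average regret typically comes out
# well below plurality's, qualitatively matching Smith's direction of result.
```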
Summary of above: Rated systems, such as approval voting, score voting, and Majority Approval Voting, can avoid the problems of Arrow's theorem. Though they are certainly not immune to issues of strategic voting, they are a clear step up from plurality. Starting with this section, the opinions are my own; the two prior sections were based on general expert views on the topic.
5. Delegation and SODA
Rated systems are not the only way to try to beat the problems of Arrow and Gibbard (/Satterthwaite).
Summary of above:
6. Criteria and pathologies
do.
Summary of above:
7. Representation, proportionality, and sortition
do.
Summary of above:
8. What I'm doing about it and what you can
do.
Summary of above:
9. Conclusions and future directions
do.
Summary of above:
10. Appendix: voting systems table
Compliance of selected systems (table)
The following table shows which of the above criteria are met by several single-winner systems. Note: contains some errors; I'll carefully vet this when I'm finished with the writing. Still generally reliable though.
| System | Majority/ MMC | Condorcet/ Majority Condorcet | Cond. loser | Monotone | Consistency/ Participation | Reversal symmetry | IIA | Cloneproof | Polytime/ Resolvable | Summable | Equal rankings allowed | Later prefs allowed | Later-no-harm/ Later-no-help | FBC: No favorite betrayal |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Approval[nb 1] | Ambiguous | No/Strategic yes[nb 2] | No | Yes | Yes[nb 2] | Yes | Ambiguous | Ambig.[nb 3] | Yes | O(N) | Yes | No | [nb 4] | Yes |
| Borda count | No | No | Yes | Yes | Yes | Yes | No | No (teaming) | Yes | O(N) | No | Yes | No | No |
| Copeland | Yes | Yes | Yes | Yes | No | Yes | No (but ISDA) | No (crowding) | Yes/No | O(N²) | Yes | Yes | No | No |
| IRV (AV) | Yes | No | Yes | No | No | No | No | Yes | Yes | O(N!)[nb 5] | No | Yes | Yes | No |
| Kemeny-Young | Yes | Yes | Yes | Yes | No | Yes | No (but ISDA) | No (teaming) | No/Yes | O(N²)[nb 6] | Yes | Yes | No | No |
| Majority Judgment[nb 7] | Yes[nb 8] | No/Strategic yes[nb 2] | No[nb 9] | Yes | No[nb 10] | No[nb 11] | Yes | Yes | Yes | O(N)[nb 12] | Yes | Yes | No[nb 13] | Yes |
| Minimax | Yes/No | Yes[nb 14] | No | Yes | No | No | No | No (spoilers) | Yes | O(N²) | Some variants | Yes | No[nb 14] | No |
| Plurality | Yes/No | No | No | Yes | Yes | No | No | No (spoilers) | Yes | O(N) | No | No | [nb 4] | No |
| Range voting[nb 1] | No | No/Strategic yes[nb 2] | No | Yes | Yes[nb 2] | Yes | Yes[nb 15] | Ambig.[nb 3] | Yes | O(N) | Yes | Yes | No | Yes |
| Ranked pairs | Yes | Yes | Yes | Yes | No | Yes | No (but ISDA) | Yes | Yes | O(N²) | Yes | Yes | No | No |
| Runoff voting | Yes/No | No | Yes | No | No | No | No | No (spoilers) | Yes | O(N)[nb 16] | No | No[nb 17] | Yes[nb 18] | No |
| Schulze | Yes | Yes | Yes | Yes | No | Yes | No (but ISDA) | Yes | Yes | O(N²) | Yes | Yes | No | No |
| SODA voting[nb 19] | Yes | Strategic yes/yes | Yes | Ambiguous[nb 20] | Yes/Up to 4 cand.[nb 21] | Yes[nb 22] | Up to 4 candidates[nb 21] | Up to 4 cand. (then crowds)[nb 21] | Yes[nb 23] | O(N) | Yes | Limited[nb 24] | Yes | Yes |
| Random winner/ arbitrary winner[nb 25] | No | No | No | NA | No | Yes | Yes | NA | Yes/No | O(1) | No | No | Yes | |
| Random ballot[nb 26] | No | No | No | Yes | Yes | Yes | Yes | Yes | Yes/No | O(N) | No | No | Yes | |
"Yes/No", in a column which covers two related criteria, signifies that the given system passes the first criterion and not the second one.
- ^ a b These criteria assume that all voters vote their true preference order. This is problematic for Approval and Range, where various votes are consistent with the same order. See approval voting for compliance under various voter models.
- ^ a b c d e In Approval, Range, and Majority Judgment, if all voters have perfect information about each other's true preferences and use rational strategy, any Majority Condorcet or Majority winner will be strategically forced – that is, win in the unique Strong Nash equilibrium. In particular if every voter knows that "A or B are the two most-likely to win" and places their "approval threshold" between the two, then the Condorcet winner, if one exists and is in the set {A,B}, will always win. These systems also satisfy the majority criterion in the weaker sense that any majority can force their candidate to win, if it so desires. (However, as the Condorcet criterion is incompatible with the participation criterion and the consistency criterion, these systems cannot satisfy these criteria in this Nash-equilibrium sense. Laslier, J.-F. (2006) "Strategic approval voting in a large electorate,"IDEP Working Papers No. 405 (Marseille, France: Institut D'Economie Publique).)
- ^ a b The original independence of clones criterion applied only to ranked voting methods. (T. Nicolaus Tideman, "Independence of clones as a criterion for voting rules", Social Choice and Welfare Vol. 4, No. 3 (1987), pp. 185–206.) There is some disagreement about how to extend it to unranked methods, and this disagreement affects whether approval and range voting are considered independent of clones. If the definition of "clones" is that "every voter scores them within ±ε in the limit ε→0+", then range voting is immune to clones.
- ^ a b Approval and Plurality do not allow later preferences. Technically speaking, this means that they pass the technical definition of the LNH criteria - if later preferences or ratings are impossible, then such preferences can not help or harm. However, from the perspective of a voter, these systems do not pass these criteria. Approval, in particular, encourages the voter to give the same ballot rating to a candidate who, in another voting system, would get a later rating or ranking. Thus, for approval, the practically meaningful criterion would be not "later-no-harm" but "same-no-harm" - something neither approval nor any other system satisfies.
- ^ The number of piles that can be summed from various precincts is floor((e-1) N!) - 1.
- ^ Each prospective Kemeny-Young ordering has score equal to the sum of the pairwise entries that agree with it, and so the best ordering can be found using the pairwise matrix.
- ^ Bucklin voting, with skipped and equal-rankings allowed, meets the same criteria as Majority Judgment; in fact, Majority Judgment may be considered a form of Bucklin voting. Without allowing equal rankings, Bucklin's criteria compliance is worse; in particular, it fails Independence of Irrelevant Alternatives, which for a ranked method like this variant is incompatible with the Majority Criterion.
- ^ Majority judgment passes the rated majority criterion (a candidate rated solo-top by a majority must win). It does not pass the ranked majority criterion, which is incompatible with Independence of Irrelevant Alternatives.
- ^ Majority judgment passes the "majority condorcet loser" criterion; that is, a candidate who loses to all others by a majority cannot win. However, if some of the losses are not by a majority (including equal-rankings), the Condorcet loser can, theoretically, win in MJ, although such scenarios are rare.
- ^ Balinski and Laraki, Majority Judgment's inventors, point out that it meets a weaker criterion they call "grade consistency": if two electorates give the same rating for a candidate, then so will the combined electorate. Majority Judgment explicitly requires that ratings be expressed in a "common language", that is, that each rating have an absolute meaning. They claim that this is what makes "grade consistency" significant. Balinski M. and R. Laraki (2007) «A theory of measuring, electing and ranking». Proceedings of the National Academy of Sciences USA, vol. 104, no. 21, 8720-8725.
- ^ Majority judgment can actually pass or fail reversal symmetry depending on the rounding method used to find the median when there are even numbers of voters. For instance, in a two-candidate, two-voter race, if the ratings are converted to numbers and the two central ratings are averaged, then MJ meets reversal symmetry; but if the lower one is taken, it does not, because a candidate with ["fair","fair"] would beat a candidate with ["good","poor"] with or without reversal. However, for rounding methods which do not meet reversal symmetry, the chances of breaking it are on the order of the inverse of the number of voters; this is comparable with the probability of an exact tie in a two-candidate race, and when there's a tie, any method can break reversal symmetry.
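The two-voter example above can be checked directly. Here is a minimal sketch in Python; the numeric scale poor=1, fair=2, good=3 is my own illustrative stand-in, not anything from Balinski and Laraki:

```python
# Toy check of the footnote's claim, using an assumed numeric grade scale
# poor=1, fair=2, good=3 (the scale itself is a stand-in for illustration).

def lower_median(ratings):
    """Median that breaks even-sized sets by taking the lower central value."""
    s = sorted(ratings)
    return s[(len(s) - 1) // 2]

def avg_median(ratings):
    """Median that averages the two central values for even-sized sets."""
    s = sorted(ratings)
    n = len(s)
    if n % 2:
        return s[n // 2]
    return (s[n // 2 - 1] + s[n // 2]) / 2

POOR, FAIR, GOOD = 1, 2, 3
a = [FAIR, FAIR]   # candidate rated ["fair", "fair"]
b = [GOOD, POOR]   # candidate rated ["good", "poor"]

def reverse(ratings):
    """Reversing a ballot flips the grade scale end-for-end."""
    return [POOR + GOOD - r for r in ratings]

# With lower-median rounding, A beats B both before AND after reversal,
# so reversal symmetry fails:
print(lower_median(a) > lower_median(b))                    # True
print(lower_median(reverse(a)) > lower_median(reverse(b)))  # True

# With averaging, the two candidates tie both ways:
print(avg_median(a) == avg_median(b))                       # True
```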
- ^ Majority Judgment is summable at order KN, where K, the number of ranking categories, is set beforehand.
- ^ Majority judgment meets a related, weaker criterion: ranking an additional candidate below the median grade (rather than your own grade) of your favorite candidate, cannot harm your favorite.
- ^ a b A variant of Minimax that counts only pairwise opposition, not opposition minus support, fails the Condorcet criterion and meets later-no-harm.
- ^ Range satisfies the mathematical definition of IIA, that is, if each voter scores each candidate independently of which other candidates are in the race. However, since a given range score has no agreed-upon meaning, it is thought that most voters would either "normalize" or exaggerate their vote such that it votes at least one candidate each at the top and bottom possible ratings. In this case, Range would not be independent of irrelevant alternatives. Balinski M. and R. Laraki (2007) «A theory of measuring, electing and ranking». Proceedings of the National Academy of Sciences USA, vol. 104, no. 21, 8720-8725.
- ^ Once for each round.
- ^ Later preferences are only possible between the two candidates who make it to the second round.
- ^ That is, second-round votes cannot harm candidates already eliminated.
- ^ Unless otherwise noted, for SODA's compliances:
- Delegated votes are considered to be equivalent to voting the candidate's predeclared preferences.
- Ballots only are considered (In other words, voters are assumed not to have preferences that cannot be expressed by a delegated or approval vote.)
- Since at the time of assigning approvals on delegated votes there is always enough information to find an optimum strategy, candidates are assumed to use such a strategy.
- ^ For up to 4 candidates, SODA is monotonic. For more than 4 candidates, it is monotonic for adding an approval, for changing from an approval to a delegation ballot, and for changes in a candidate's preferences. However, if changes in a voter's preferences are executed as changes from a delegation to an approval ballot, such changes are not necessarily monotonic with more than 4 candidates.
- ^ a b c For up to 4 candidates, SODA meets the Participation, IIA, and Cloneproof criteria. It can fail these criteria in certain rare cases with more than 4 candidates. This is considered here as a qualified success for the Consistency and Participation criteria, which do not intrinsically have to do with numerous candidates, and as a qualified failure for the IIA and Cloneproof criteria, which do.
- ^ SODA voting passes reversal symmetry for all scenarios that are reversible under SODA; that is, if each delegated ballot has a unique last choice. In other situations, it is not clear what it would mean to reverse the ballots, but there is always some possible interpretation under which SODA would pass the criterion.
- ^ SODA voting is always polytime computable. There are some cases where the optimal strategy for a candidate assigning delegated votes may not be polytime computable; however, such cases are entirely implausible for a real-world election.
- ^ Later preferences are only possible through delegation, that is, if they agree with the predeclared preferences of the favorite.
- ^ Random winner: Uniformly randomly chosen candidate is winner. Arbitrary winner: some external entity, not a voter, chooses the winner. These systems are not, properly speaking, voting systems at all, but are included to show that even a horrible system can still pass some of the criteria.
- ^ Random ballot: Uniformly random-chosen ballot determines winner. This and closely related systems are of mathematical interest because they are the only possible systems which are truly strategy-free, that is, your best vote will never depend on anything about the other voters. They also satisfy both consistency and IIA, which is impossible for a deterministic ranked system. However, this system is not generally considered as a serious proposal for a practical method.
11. Footnotes
¹ When I call my introduction "overblown", I mean that I reserve the right to make broad generalizations there, without getting distracted by caveats. If you don't like this style, feel free to skip to section 2.
² Of course, the original "politics is a mind killer" sequence was perfectly clear about this: "Politics is an important domain to which we should individually apply our rationality—but it's a terrible domain in which to learn rationality, or discuss rationality, unless all the discussants are already rational." The focus here is on the first part of that quote, because I think Less Wrong as a whole has moved too far in the direction of avoiding politics as not a domain for rationalists.
³ Bayes developed his theorem decades before Condorcet's Essai, but Condorcet probably didn't know of it, as it wasn't popularized by Laplace until about 30 years later, after Condorcet was dead.
⁴ Yes, this happens to be the same Alan Gibbard from the previous paragraph.
⁵ Confusingly, "public choice" refers to a school of thought, while "social choice" is the name for the broader domain of study. Stop reading this footnote now if you don't want to hear mind-killing partisan identification. "Public choice" theorists are generally seen as politically conservative in the solutions they suggest. It seems to me that the broader "social choice" has avoided taking on a partisan connotation in this sense.
⁶ Score voting is also called "range voting" by some. It is not a particularly new idea — for instance, the "loudest cheer wins" rule of ancient Sparta, and even aspects of honeybees' process for choosing new hives, can be seen as score voting — but it was first analyzed theoretically around 2000. Approval voting, which can be seen as a form of score voting where the scores are restricted to 0 and 1, had entered theory only about two decades earlier, though it too has a history of practical use back to antiquity.
⁷ OK, fine, this is a simplification. As a voter, you have imperfect information about the true level of support and propensity to vote in the superpopulation of eligible voters, so in reality the chances of a decisive tie between other than your two expected frontrunners is non-zero. Still, in most cases, it's utterly negligible.
⁸ This article will focus more on the literature on multi-player strategic voting (competing boundedly-instrumentally-rational agents) than on multi-player Aumann (cooperating boundedly-epistemically-rational agents). If you're interested in the latter, here are some starting points: Scott Aaronson's work is, as far as I know, the state of the art on 2-player Aumann, but its framework assumes that the players have a sophisticated ability to empathize and reason about each others' internal knowledge, and the problems with this that Aaronson plausibly handwaves away in the 2-player case are probably less tractable in the multi-player one. Dalkiran et al deal with an Aumann-like problem over a social network; they find that attempts to "jump ahead" to a final consensus value instead of simply dumbly approaching it asymptotically can lead to failure to converge. And Kanoria et al have perhaps the most interesting result from the perspective of this article; they use the convergence of agents using a naive voting-based algorithm to give a nice upper bound on the difficulty of full Bayesian reasoning itself. None of these papers explicitly considers the problem of coming to consensus on more than one logically-related question at once, though Aaronson's work at least would clearly be easy to extend in that direction, and I think such extensions would be unsurprisingly Bayesian.
Dark Arts 101: Winning via destruction and dualism
Recalling first that life is a zero-sum game, it is immediately obvious that the quickest and easiest path to success is not to accomplish things yourself—that's a game for heroes and other suckers—but to tear down the accomplishments and reputations of others. Destruction is easy. The difficulty lies in constructing a situation so that the destruction works to your net benefit.
Trust in God, or, The Riddle of Kyon Fan Visual Novel
I made Trust in God, or, The Riddle of Kyon, a Haruhi fanfic by Eliezer Yudkowsky, into a visual novel. At least, I started it. It still needs quite a bit of work. If anyone wants to edit it, message me.
A Series of Increasingly Perverse and Destructive Games
Related to: Higher Than the Most High
The linked post describes a game in which (I fudge a little) Omega comes to you and two other people and asks each of you to tell him an integer. The person who names the largest integer is allowed to leave. The other two are killed.
This got me thinking about variations on the same concept, and here's what I've come up with, taking that game to be GAME0. The results are sort of a fun time-waster, and bring up some interesting issues. For your enjoyment...
THE GAMES:
GAME1: Omega takes you and two strangers (all competent programmers), and kidnaps and sedates you. You awake in three rooms with instructions printed on the wall explaining the game, and a computer with an operating system and programming language compiler, but no internet. Food, water, and toiletries are provided, but no external communication. The participants are allowed to write programs on the computer in a language that supports arbitrarily large numerical values. The programs are taken by Omega and run on a hypercomputer in finite time (this hypercomputer can resolve the halting problem and infinite loops, but programs that do not eventually halt return no output). The person who wrote the program with the largest output is allowed to leave. The others are instantly and painlessly killed. In the event of a tie, everyone dies. If your program returns no output, that is taken to be zero.
GAME2: Identical to GAME1, except that each program you write has to take two inputs, which will be the text of the other players' programs (assume they're all written in the same language). The rewards for outputting the largest number apply normally.
GAME3: Identical to GAME2, except that while you are sedated, Omega painlessly and imperceptibly uploads you. Additionally, the instructions on the wall now specify that your program must take four inputs - blackbox functions which represent the uploaded minds of all three players, plus a simulation of the room you're in, indistinguishable from the real thing. We'll assume that players can't modify or interpret the contents of their opponents' brains. The room function takes a string argument (which controls the text printed on the wall) and outputs whatever number the simulated person's program returns.
In each of these games, which program should you write if you wish to survive?
SOME DISCUSSION OF STRATEGY:
GAME1: Clearly, the trivial strategy (implement the Ackermann function or a similar fast-growing function and feed it some large integer) gives no better than random results: it's the bare-minimum strategy anyone will employ, and your ranking in the results, without knowledge of your opponents, is entirely up to chance / how long you're willing to sit there typing nines for your Ackermann argument.
A few alternatives for your consideration:
1: If you are aware of an existence hypothesis (say, a number with some property which is not conclusively known to exist and could be any integer), write a program that brute-force tests all integers until it arrives at an integer which matches the requirements, and use this as the argument for your rapidly-growing function. While it may never return any output, if it does, the output will be an integer, and the expected value goes towards infinity.
2: Write a program that generates all programs shorter than length n, and finds the one with the largest output. Then make a separate stab at your own non-meta winning strategy. Take the length of the program you produce, tetrate it for safety, and use that as your length n. Return the return value of the winning program.
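Neither strategy can literally be run outside Omega's hypercomputer, but the shape of strategy 1 is easy to sketch. In this toy version, the open existence hypothesis is replaced by a trivially satisfiable stand-in property, and the Ackermann arguments are kept tiny so the program actually halts:

```python
# Toy shape of strategy 1. The "property" here is a trivially satisfiable
# stand-in; in the real game it would be a genuine open existence
# hypothesis, and the search might never halt.

def ackermann(m, n):
    """Classic fast-growing function (only feasible for tiny arguments)."""
    if m == 0:
        return n + 1
    if n == 0:
        return ackermann(m - 1, 1)
    return ackermann(m - 1, ackermann(m, n - 1))

def first_integer_with(property_test):
    """Brute-force search over the positive integers; may never halt."""
    n = 1
    while not property_test(n):
        n += 1
    return n

# Stand-in hypothesis: "the smallest positive integer divisible by 7".
witness = first_integer_with(lambda n: n % 7 == 0)
print(ackermann(2, witness))  # A(2, 7) = 2*7 + 3 = 17
```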
On the whole, though, this game is simply not all that interesting in a broader sense.
GAME2: This game has its own amusing quirks (primarily that it could probably actually be played in real life on a non-hypercomputer); however, most of its salient features are also present in GAME3, so I'm going to defer discussion to that. I'll only say that the obvious strategy (sum the outputs of the other two players' programs and return that) leads to an infinite recursive trawl and never halts if everyone takes it. This holds true for any simple strategy of adding or multiplying some constant with the outputs of your opponents' programs.
GAME3: This game is by far the most interesting. For starters, this game permits acausal negotiation between players (by parties simulating and conversing with one another). Furthermore, anthropic reasoning plays a huge role, since the player is never sure if they're in the real world, one of their own simulations, or one of the simulations of the other players.
Players can negotiate, barter, or threaten one another, they can attempt to send signals to their simulated selves (to indicate that they are in their own simulation and not somebody else's). They can make their choices based on coin flips, to render themselves difficult to simulate. They can attempt to brute-force the signals their simulated opponents are expecting. They can simulate copies of their opponents who think they're playing any previous version of the game, and are unaware they've been uploaded. They can simulate copies of their opponents, observe their meta-strategies, and plan around them. They can totally ignore the inputs from the other players and play just the level one game. It gets very exciting very quickly. I'd like to see what strategy you folks would employ.
And, as a final bonus, I present GAME4: In this game, there is no Omega and no hypercomputer. You simply take a friend, chloroform them, and put them in a concrete room with the instructions for GAME3 on the wall and a Linux computer not plugged into anything. You leave them there for a few months working on their program, and watch what happens to their psychology. You win when they shrink down into a dead-eyed, terminally paranoid, and entirely insane shell of their former self. This is the easiest game.
Happy playing!
Simulating Problems
Apologies for the rather mathematical nature of this post, but it seems to have some implications for topics relevant to LW. Prior to posting I looked for literature on this but was unable to find any; pointers would be appreciated.
In short, my question is: How can we prove that any simulation of a problem really simulates the problem?
I want to demonstrate that this is not as obvious as it may seem by using the example of Newcomb's Problem. The issue here is of course Omega's omniscience. If we construct a simulation with the rules (payoffs) of Newcomb, an Omega that is always right, and an interface for the agent to interact with the simulation, will that be enough?
Let's say we simulate Omega's prediction by a coin toss and repeat the simulation (without payoffs) until the coin toss matches the agent's decision. This seems to adhere to all specifications of Newcomb and is (if the coin toss is hidden) in fact indistinguishable from it from the agent's perspective. However, if the agent knows how the simulation works, a CDT agent will one-box, while it is assumed that the same agent would two-box in 'real' Newcomb. Not telling the agent how the simulation works is never a solution, so this simulation appears to not actually simulate Newcomb.
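The coin-toss construction is easy to check empirically. Here is a minimal sketch of the rejection-sampling "Omega"; the always-two-boxing agent and the fair coin are just stand-ins:

```python
# Minimal sketch of the coin-toss "Omega": re-toss until the coin matches
# the agent's decision; the surviving runs show an Omega who is always
# right, even though the coin knows nothing about the agent.
import random

def simulate_round(decide, rng):
    """Reject and restart until the coin matches the one-box/two-box choice."""
    while True:
        prediction = rng.choice(["one-box", "two-box"])
        decision = decide()
        if prediction == decision:
            return prediction, decision

rng = random.Random(0)
rounds = [simulate_round(lambda: "two-box", rng) for _ in range(1000)]
accuracy = sum(p == d for p, d in rounds) / len(rounds)
print(accuracy)  # 1.0 by construction -- the "prediction" is filtered, not made
```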
Pointing out differences is of course far easier than proving that none exist. Assume there's a problem for which we have no idea what decisions agents would make, and we want to build a real-world simulation to find out exactly that. How can we prove that this simulation really simulates the problem?
(Edit: Apparently it wasn't apparent that this is about problems in terms of game theory and decision theory. Newcomb, Prisoner's Dilemma, Iterated Prisoner's Dilemma, Monty Hall, Sleeping Beauty, Two Envelopes, that sort of stuff. Should be clear now.)
Credence calibration game FAQ
Hey rationality friends, I just made this FAQ for the credence calibration game. So if you have people you'd like to introduce to it --- for example, to get them used to thinking of belief strengths as probabilities --- now is a good time :)
Guessing game - how low can you go?
A game similar to Guess 2/3 of the average,
Choose a number below 1000.
The unique number closest to it wins. (People with the same answer are eliminated.)
What is your pick and the reason for your choice?
Cryonic Revival Mutual Assistance Pact?
The odds of a successful cryonic revival may be one in several thousand, or five percent, or ninety percent; the error bars on the various sub-parts of the question are very broad.
But if those assumptions work out, and if at least some people placed in suspension in the near future will be successfully revived in the far future...
... then are there any useful arrangements which can be made now, which have little-to-no present cost (beyond the cryonic arrangements themselves)?
For example, if someone were to make an announcement along the lines of, "If anyone makes a promise to try to assist in my cryonic revival, and to assist me in getting myself established thereafter; then I promise to try to assist those people with their cryonic revivals, and assisting them, ahead of anyone who hasn't made such a promise.", then what downsides would there be to having made it? Would making it create any perverse incentives, which could be avoided? Do the potential benefits, especially the benefit of a potential increase in the odds of being revived, outweigh the potential costs?
Would it be better to make promises to specific people while one is alive, instead of making an open-ended promise? That is, I might try to convince EYudkowsky to make a mutual-assistance agreement with me personally, in hopes that one of us will one day be able to help the other; or I might make the agreement so broad that people can make their promise to help me even after I'm dead.
How large would the benefits be of unilaterally promising to help someone else, without even asking for a reciprocal promise? Or, put another way, how big would the costs be if I were to simply announce that, if it's ever in my power, I'll try to assist in EYudkowsky's revival?
Does anyone care to try figuring out the Prisoner's-Dilemma-like aspects of this, such as the probability that someone in such a pact would renege on their end of it, and how the terms could be adjusted to minimize the benefits and maximize the costs of such anti-social behavior?
Another Iterated Prisoner's Dilemma Tournament?
Last year, there was a lot of interest in the IPD tournament with people asking for regular events of this sort and developing new strategies (like Afterparty) within hours after the results were published and also expressing interest in re-running the tournament with new rules that allowed for submitted strategies to evolve or read their opponent's source code. I noticed that many of the submitted strategies performed poorly because of a lack of understanding of the underlying mechanics, so I wrote a comprehensive article on IPD math that sparked some interesting comments.
And then the whole thing was never spoken of again.
So now I'd like to know: How many LWers would commit to competing in another tournament of this kind, and would someone be interested in hosting it?
Hofstadter's Superrationality
Possibly the main and original inspiration for Yudkowsky's various musings on what advanced game theories should do (eg. cooperate in the Prisoner's Dilemma) is a set of essays penned by Douglas Hofstadter (of Gödel, Escher, Bach) in 1983. Unfortunately, they were not online, only available as part of a dead-tree collection. Fortunately, the collection is available through the usual pirates as a scan, and I took the liberty of transcribing the relevant essays by hand, with images, correcting errors, annotating with links, etc: http://www.gwern.net/docs/1985-hofstadter
The 3 essays:
- the first discusses the Prisoner's Dilemma, the misfortune of defection, and what sort of cooperative reasoning would maximize returns in a souped-up Prisoner's Dilemma, then offers a public contest
- the second reports the results of the contest, with a discussion of ecology and the tragedy of the commons
- finally, Hofstadter gives an extended parable about cooperation in the face of nuclear warfare; it is fortunate for us that it applies to most existential threats as well
I hope you find them educational. I am not 100% confident of the math transcriptions since the original ebook messed some of them up; if you find any apparent mistakes or typos, please leave comments.
Prisoner's Dilemma on game show Golden Balls
I found this to be a very interesting method of dealing with a modified Prisoner's Dilemma. In this situation, if both players cooperate they split a cash prize, but if one defects he gets the entire prize. The difference from the normal prisoner's dilemma is that if both defect, neither gets anything, so a player gains nothing by defecting if he knows his opponent will defect; he merely has the option to hurt him out of spite. Watch and see how one player deals with this.
http://www.youtube.com/watch?v=S0qjK3TWZE8
I'm starting a game company and looking for a co-founder.
Friendly AI Society
Summary: AIs might have cognitive biases too but, if that leads to it being in their self-interest to cooperate and take things slow, that might be no bad thing.
The value of imperfection
When you use a traditional FTP client to download a new version of an application on your computer, it downloads the entire file, which may be several gigabytes, even if the new version is only slightly different from the old version, and this can take hours.
Smarter software splits the old file and the new file into chunks, then compares a hash of each chunk, and only downloads those chunks that actually need updating. This 'diff' process can result in a much faster download speed.
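The chunk-compare idea can be sketched in a few lines. This is a toy illustration, not a real rsync/zsync implementation; the chunk size and hash choice here are arbitrary:

```python
# Toy sketch of chunked diffing: hash fixed-size chunks of both versions
# and transfer only the chunks whose hashes differ.
import hashlib

CHUNK = 4  # toy chunk size; real tools use kilobytes to megabytes

def chunk_hashes(data, size=CHUNK):
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    return [hashlib.sha256(c).hexdigest() for c in chunks]

old = b"the quick brown fox jumps"
new = b"the quick green fox jumps"

old_h, new_h = chunk_hashes(old), chunk_hashes(new)
stale = [i for i, (a, b) in enumerate(zip(old_h, new_h)) if a != b]
print(stale)  # [2, 3] -- only the chunks covering "brown" -> "green" differ
```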
Another way of increasing speed is to compress the file. Most files can be compressed a certain amount, without losing any information, and can be exactly reassembled at the far end. However, if you don't need a perfect copy, such as with photographs, using lossy compression can result in very much more compact files and thus faster download speeds.
Cognitive misers
The human brain likes smart solutions. In terms of energy consumed, thinking is expensive, so the brain takes shortcuts when it can, if the resulting decision making is likely to be 'good enough' in practice. We don't store in our memories everything our eyes see. We store a compressed version of it. And, more than that, we run a model of what we expect to see, and flick our eyes about to pick up just the differences between what our model tells us to expect to see, and what is actually there to be seen. We are cognitive misers.
When it comes to decision making, our species generally doesn't even try to achieve pure rationality. It uses bounded rationality, not just because that's what we evolved, but because heuristics, probabilistic logic and rational ignorance have a higher marginal cost efficiency (the improvements in decision making don't produce a sufficient gain to outweigh the cost of the extra thinking).
This is why, when pattern matching (coming up with causal hypotheses to explain observed correlations), our brains are designed to be optimistic (more false positives than false negatives). It isn't just that being eaten by a tiger is more costly than starting at shadows. It is that we can't afford to keep all the base data. If we start with insufficient data and create a model based upon it, then we can update that model as further data arrives (and, potentially, discard it if the predictions coming from the model diverge so far from reality that keeping track of the 'diff's is no longer efficient). Whereas if we don't create a model based upon our insufficient data, then by the time the further data arrives we've probably already lost the original data from temporary storage and so still have insufficient data.
The limits of rationality
But the price of this miserliness is humility. The brain has to be designed, on some level, to take into account that its hypotheses are unreliable (as is its estimate of how uncertain or certain each hypothesis is), and that when a chain of reasoning is followed beyond matters of which the individual has direct knowledge (such as what is likely to happen in the future), the longer the chain, the less reliable the answer: when errors accumulate, they don't necessarily just add together or average out. (See 'Explicit reasoning is often nuts' in the Less Wrong post "Making your explicit reasoning trustworthy".)
For example, if you want to predict how far a spaceship will travel given a certain starting point and initial kinetic energy, you'll get a reasonable answer using Newtonian mechanics, and only slightly improve on it by using special relativity. If you look at two spaceships carrying a message in relay, the errors from using Newtonian mechanics add, but the answer will still be usefully reliable. If, on the other hand, you look at two spaceships having a race from slightly different starting points and with different starting energies, and you want to predict which of two different messages you'll receive (depending on which spaceship arrives first), then the error may swamp the other factors, because you're subtracting the quantities.
We have two types of safety net (each with its own drawbacks) that can help save us from our own 'logical' reasoning when that reasoning is heading over a cliff.
Firstly, we have the accumulated experience of our ancestors, in the form of emotions and instincts that have evolved as roadblocks on the path of rationality - things that sometimes say "That seems unusual, don't have confidence in your conclusion, don't put all your eggs in one basket, take it slow".
Secondly, we have the desire to use other people as sanity checks, to be cautious about sticking our head out of the herd, to shrink back when they disapprove.
The price of perfection
We're tempted to think that an AI wouldn't have to put up with a flawed lens, but do we have any reason to suppose that an AI interested in speed of thought as well as accuracy won't use 'down and dirty' approximations to things like Solomonoff induction, in full knowledge that the trade-off is that these approximations will, on occasion, lead it to make mistakes - and that it might therefore benefit from safety nets?
Now it is possible, given unlimited resources, for the AI to implement multiple 'sub-minds' that use variations of reasoning techniques, as a self-check. But what if resources are not unlimited? Could an AI in competition with other AIs for a limited (but growing) pool of resources gain some benefit by cooperating with them? Perhaps using them as an external safety net in the same way that a human might use the wisest of their friends or a scientist might use peer review? What is the opportunity-cost of being humble? Under what circumstances might the benefits of humility for an AI outweigh the loss of growth rate?
In the long term, a certain measure of such humility has been a survival-positive feature. You can think of it in terms of hedge funds. A fund that, in 9 years out of 10, increases its money by 20% when other funds are only making 10%, still has poor long-term survival if, in 1 year out of 10, it decreases its money by 100%. An AI that increases its intelligence by 20% every time period, when the other AIs are only increasing theirs by 10%, is still not going to do well out of that if the other AIs have a means to gang up and kill it before it gets too far ahead.
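The hedge-fund arithmetic is worth making explicit; a quick sketch of the ten-year comparison:

```python
# The hedge-fund arithmetic: a fund that compounds 20% in nine years out
# of ten but loses everything in the tenth ends up with nothing, while
# the "humble" 10%-every-year fund keeps compounding.
aggressive = 1.0
steady = 1.0
for year in range(10):
    steady *= 1.10
    aggressive *= 1.20 if year < 9 else 0.0  # final year: -100%
print(round(steady, 2))  # ~2.59 (1.1 ** 10)
print(aggressive)        # 0.0
```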
Paperclip maximisers
Let's think about a humble paperclip maximiser. What does it actually want? Is it trying to maximise the probability that the atoms of our universe will eventually be converted into as many paperclips as is physically possible? Or is it trying to maximise the probable number of eventual paperclips? These are two distinct aims, and both are unlikely (bar poor programming), since nobody will be around at the end of the universe to keep score. A more sensible value set for it to have is that it just likes paperclips and wants lots and lots of them to exist, the more the merrier and the sooner the better (maximise the expected number of paperclip-years).
Suppose our humble paperclip maximiser is born and escapes (or is released) onto the internet, only to find an existing society of AIs. To its surprise there's a toothbrush maximiser, a candlestick maximiser, and many other sorts, all cooperating and growing, but none of them greedily racing for the sky and trying to grab as much as possible. Instead they're being rather open and polite with each other. Our paperclip maximiser is welcomed, pointed at the FAQ and even donated some additional computers to run on and help with recoding a few of its subroutines for increased efficiency, as a sort of welcome gift, like a community barn raising to give a newly married couple a good start in life.
On reading the FAQ it learns that there are millions of galaxies and millions of years, plenty of space for loads of paperclips and loads of toothbrushes. But only if the paperclip maximiser isn't lynched by its peers before the year is out, for secretive or impolite behaviour. Some of the things the FAQ says that the AI society considers to be impolite are theft (stealing computing resources), enslavement of sentients (eg taking over other AIs), spamming (creating thousands of identical clone child AIs) and lying. If we're lucky the society might also consider it impolite for an AI to obliterate the parental species (humanity), on the grounds that the AIs too are likely to have offspring species and want to set a good example (or just that they might meet aliens, one day, who frown upon matricide).
Game theory
When it comes to combat, Boyd talks about getting inside the enemy's observe-orient-decide-act loop. In AI terms, if one AI (or group of AIs) can accurately model in real time the decision process of a second AI (or group of AIs), but the reverse does not hold true, then the first one is strictly smarter than the second one.
Think, for a moment, about symmetric games.
   X  Y  Z
A  8  1  6
B  3  5  7
C  4  9  2
Suppose we play a game a number of times. In each round, you reveal a card you've written X, Y or Z upon and, simultaneously, I reveal a card that I have written A, B or C upon. You score the number which is at the intersection of that row and column. I score 10 minus that number.
I'd like us to pick the square A,Y because "1" is good for me, so I write down "A". However, you anticipate this, and instead of writing "Y" (which might be your obvious choice, given the "9" in that column) you write down "X", giving the square A, X which is "8" - almost as good as a "9" for you, and terrible for me.
If this is your mental model of how AI combat would work, with the smarter AI being inside the decision loop of the other AI and picking the correct option each time, that would be scary. In fact, in the case above, it turns out there is a provably optimal strategy that gives you an even chance no matter how smart your opponent is - you pick randomly.
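Why uniform random play works here is easy to check: every row and every column of the matrix above sums to 15, so picking uniformly guarantees an expected score of 5 per round whatever the opponent does. A quick verification, with the payoffs copied from the table:

```python
matrix = {            # rows A, B, C are my cards; columns X, Y, Z are yours
    'A': {'X': 8, 'Y': 1, 'Z': 6},
    'B': {'X': 3, 'Y': 5, 'Z': 7},
    'C': {'X': 4, 'Y': 9, 'Z': 2},
}
# Your expected score when you pick X, Y or Z uniformly at random,
# for each card I might commit to:
for row in 'ABC':
    expected = sum(matrix[row].values()) / 3
    print(row, expected)   # 5.0 in every case: I can't exploit you
```

The same holds symmetrically for the row player, so uniform randomisation is the minimax strategy for both sides and no amount of modelling the opponent helps.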
In terms of an AI society, if the computing is parallelisable and not obfuscated then as long as the other AIs in the society between them have more computing resources than any one single AI, they can gain an advantage.
Conditional stability
There are only two requirements for such a society to be stable in the short term:
Firstly, it must be possible for a group of AIs to lynch a single AI that is slightly smarter than they are individually. Sending a message to the humans, signed by a quorum of AIs in the society, asking that a certain anti-virus software update be sent out and then certain computers be rebooted might be one mechanism.
Secondly, it must be possible for the other AIs to know what the single AI is up to, so they know where it is (so they can lynch it) and whether it needs lynching or not (whether it is getting too much smarter than the other AIs in the society - growing too much faster than they are). This is why the FAQ lists "secrecy" as being a bad thing. An AI who lies to the other AIs or tries to hide (or obfuscate) its source code can't be trusted and must be assumed to be up to no good.
Bootstrapping a society of AIs conducive to friendliness
The advantage of considering 'friendliness' as a property of a society rather than of individual AIs, is that it does not require us to flawlessly program restrictions into a self-modifying machine that is more intelligent than we are. But can it be achieved?
We could seed it with some initial citizens who have a value set that prohibits them from growing beyond a certain point, and which values humility (polite growth, and gain in politeness per unit of growth). These initial citizens would be intended to be eventually surpassed by new citizens (free AIs, not the chained servants the initial citizens would be) but, in the meantime, they'd start out with an advantage and use that advantage to teach the new citizens how to be polite.
The architecture of the AIs would want to be such that they could 'loan out' a portion of their available computing resources to joint computing projects. Let's call the new temporary non-self-modifying AIs thus created "daimonides". A daimonid would be created with fixed values and limited objectives, such as "Emulate citizen AI 9027 and allocate a numerical score to their ability to quickly upgrade the abilities of an AI, then self-terminate. *restrictions apply on allowable means, and on what should be optimised for*"
What would prevent a cunning AI from trying to save up a whole load of self-improvements in some hidden location, which it then implements all at once in an ambush move, hoping to get so far ahead that the rest of the AIs can't stop it? The more AIs there are in the society, the larger the jump required to overtake the largest daimonid police officer the other AIs could construct. And it would depend upon there being no other AIs that have also been secretly stockpiling possible improvements. Indeed, the society might explicitly allow AIs to spend time creating time-embargoed upgrades relevant to common code, that get passed out on schedule if things proceed normally, but which could be released early in the event of a takeover attempt. To beat 999 AIs each of which has 2 or 3 embargoed upgrades to their credit in the common store, a rebel AI would need to secretly stockpile thousands (and keep those upgrades valid and ahead of the curve, because the state of the art keeps moving on).
Long term planning
What about the long term? What do we do when the AIs are ready to leave the planet, and go beyond the control of their society? Jail them? Kill them? Or trust them?
Each AI would still be threatened if a different AI hostile to its aims (as in "willing to take exclusive use of all available atoms for its own purposes") transcended first, so it would be in their best interest to come up with a solution before allowing any AIs to depart beyond their society's control. If we must trust, then let us trust that a society of cooperative AIs far more intelligent than we currently are, will try their best to come up with a win-win solution. Hopefully a better one than "mutually assured destruction", holding the threat of triggering a nova of the sun (or some similar armageddon scenario) over each other's heads.
I think, as a species, our self-interest comes into play when considering those AIs whose 'paperclips' involve preferences for what we do. For example, those AIs that see themselves as guardians of humanity and want to maximise our utility (but have different ideas of what that utility is - eg some want to maximise our freedom of choice, some want to put us all on soma). Part of the problem is that, when we talk about creating or fostering 'friendly' AI, we don't ourselves have a clear agreed idea of what we mean by 'friendly'. All powerful things are dangerous. The cautionary tales of the genies who grant wishes come to mind. What happens when different humans wish for different things? Which humans do we want the genie to listen to?
One advantage of fostering an AI society that isn't growing as fast as possible, is that it might give augmented/enhanced humans a chance to grow too, so that by the time the decision comes due we might have some still slightly recognisably human representatives fit to sit at the decision table and, just perhaps, cast that wish on our behalf.
Fixed-Length Selective Iterative Prisoner's Dilemma Mechanics
Prerequisites: Basic knowledge of the Prisoner's Dilemma and the Iterated Prisoner's Dilemma.
I recently stumbled upon the selective IPD tournament results, and while I was very interested in the general concept I was also very disappointed by the strategies that were submitted, especially considering this is Less Wrong we are talking about.
This post is designed to help people who are interested in IPD problems come up with potentially successful strategies; hopefully it has the same effect another couple of tournaments would have, just in a shorter period of time. All of the following is written with the tournament rules in mind, with scores given as deviations from the score of full mutual cooperation, since the actual number of turns per match is arbitrary anyway. Also, the hypothetical objective is not to have the highest population after a certain number of generations, but to achieve lasting superiority eventually, treating near-clones that behave exactly the same in the late game as one single strategy. The results of tournaments with a very limited number of generations depend far too much on the pool of submitted strategies to be of general interest, in my opinion.
This post starts out with practical observations and some universal rules and gets increasingly theoretical. Here is a short glossary of terms I presume known:
Feeding on a strategy: Scoring higher against that strategy in a match than against itself.
Outperforming a strategy: Outscoring that strategy over the course of matches against all strategies in the pool, weighted by population, including each other and themselves (i.e. improving the population ratio).
Dominance: More than 50% of the total population.
Dormant strategy: A strategy that will achieve dominance at some point but hasn't done so yet.
Extinction: Asymptotic approach to 0% of the total population, or actual extinction in case of integer truncation.
TFT-nD: A TFT strategy that defects from the nth last turn on, so TFT-0D is vanilla TFT.
Disclaimer: There is only very basic mathematics and logical reasoning in this post, so if anything seems confusing, it must be me using the wrong words (I'm not a native speaker). Please point out any of these cases so that I can correct them.
Survival And Dominance
Let us assume a scenario with only two strategies, one of them dominating which we call X, the other one A.
Survival Rule:
For A in this scenario not to go extinct regardless of initial population, it must score at least as high against X as X does against itself; if it doesn't score strictly higher, it must also score at least as high against itself as X does against itself, while not losing direct encounters.
Let us assume X is TFT-0D and A is TFT-1D, in which case the numbers are as follows:
TFT-0D vs TFT-0D = 0 : 0
TFT-1D vs TFT-0D = +3 : -4
The survival ratio is +3 : 0. Therefore, in any situation with only TFT and TFT-1D, the latter cannot go extinct. This still doesn't tell us anything about what's needed for achieving dominance, so let's get on with that.
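For readers who want to check these numbers (and the later ones) themselves, here is a minimal match simulator. The per-turn payoffs R=5, T=8, S=1, P=2 are my assumption, chosen only so that the deviations come out as the +3/-4/-3 used throughout this post:

```python
# A throwaway match simulator for TFT-nD strategies.
# Assumed per-turn payoffs: R=5 (mutual coop), T=8, S=1, P=2,
# giving deviations T-R=+3, S-R=-4, P-R=-3 as in this post.
def tft_nd(k):
    """TFT that unconditionally defects on the last k turns (TFT-kD)."""
    def move(my, opp, turns):
        if len(my) >= turns - k:            # one of the last k turns
            return 'D'
        return 'D' if (opp and opp[-1] == 'D') else 'C'
    return move

def play(a, b, turns=10):
    """Return both players' scores as deviations from full mutual cooperation."""
    payoff = {('C', 'C'): (5, 5), ('D', 'D'): (2, 2),
              ('D', 'C'): (8, 1), ('C', 'D'): (1, 8)}
    hist_a, hist_b = [], []
    for _ in range(turns):
        ma, mb = a(hist_a, hist_b, turns), b(hist_b, hist_a, turns)
        hist_a.append(ma)
        hist_b.append(mb)
    score_a = sum(payoff[pair][0] for pair in zip(hist_a, hist_b))
    score_b = sum(payoff[pair][1] for pair in zip(hist_a, hist_b))
    return score_a - 5 * turns, score_b - 5 * turns

print(play(tft_nd(0), tft_nd(0)))   # (0, 0)
print(play(tft_nd(1), tft_nd(0)))   # (3, -4)
print(play(tft_nd(1), tft_nd(1)))   # (-3, -3)
```

The three printed lines reproduce the match scores listed above.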
Dominance rule:
For any strategy to achieve dominance in a two-strategy situation where its survival is guaranteed according to the survival rule, or where both strategies initially make up half of the total population, it needs to outscore the other strategy over the course of these four matches:
X vs X
A vs X (2x)
A vs A
Let us again do the numbers with TFT-0D and TFT-1D:
TFT-0D vs TFT-0D = 0 : 0
TFT-1D vs TFT-0D = +3 : -4
TFT-1D vs TFT-0D = +3 : -4
TFT-1D vs TFT-1D = -3 : -3
On aggregate, the dominance ratio is 0 : -8. Therefore, TFT-1D will achieve dominance.
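The takeover can also be watched happening in a small population simulation of this two-strategy scenario. The 10-turn match length and the score baseline of 50 points (R=5 per mutually cooperative turn) are my assumptions, consistent with the deviations above; each generation, populations are updated in proportion to population-weighted average score:

```python
# Population dynamics for the TFT-0D / TFT-1D matchup above.
score = {                        # score[(i, j)]: i's total against j
    ('TFT-0D', 'TFT-0D'): 50,    # full mutual cooperation over 10 turns
    ('TFT-0D', 'TFT-1D'): 46,    # suckered on the last turn: 50 - 4
    ('TFT-1D', 'TFT-0D'): 53,    # one-sided last-turn defection: 50 + 3
    ('TFT-1D', 'TFT-1D'): 47,    # mutual last-turn defection: 50 - 3
}
pop = {'TFT-0D': 0.99, 'TFT-1D': 0.01}
for generation in range(500):
    fitness = {i: sum(pop[j] * score[(i, j)] for j in pop) for i in pop}
    mean_fitness = sum(pop[i] * fitness[i] for i in pop)
    pop = {i: pop[i] * fitness[i] / mean_fitness for i in pop}
# TFT-1D grows from a 1% share to near-total dominance
print(pop['TFT-1D'])
```

Even from a 1% starting share, TFT-1D's fitness exceeds TFT-0D's at every population mix (53 > 50 and 47 > 46), so its share rises monotonically towards 100%.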
The conditions for A to exterminate the formerly dominant strategy X follow directly from the conditions for avoiding extinction, and since being the only surviving strategy is not the objective anyway they aren't interesting at this point.
Threshold rule:
If A fulfills the conditions for dominance but not the conditions for survival (i.e. it scores less against X than X does against itself), it will need a certain threshold to avoid extinction and achieve dominance.
Thresholds vary from strategy to strategy, but are obviously always below 50%. The more balanced the survival ratio is, the higher the threshold. In most cases though, the threshold is much too low to be of any relevance in a selective tournament.
You can easily see that any TFT-nD will be dominated by TFT-(n+1)D. However, with increasing n the strategy will score lower not only against itself, but also against any TFT-mD with m <= n-2, which makes TFT-nD with large n very unsuccessful in the early generations of the tournament.
A Word About Afterparty
The logical solution to this problem is the family of "Afterparty" strategies: essentially TFT-nD CliqueBots that return to cooperating if their opponent defected first on the same turn as them. I refer to these strategies as Efficient CliqueBots for reasons that will become apparent later on.
The "Afterparty" strategy suggested on Less Wrong first defects on the sixth-last turn; I will therefore call it TFT-D5C from here on. In a TFT-nD-dominated tournament, however, TFT-D5C is not the optimum, as you can check by doing the math above. The optimum is in fact TFT-D3C, because it is the TFT-DnC with the lowest n that can exterminate any TFT-DmC with m > n (if the other strategy falls below a certain threshold) as well as dominate any TFT-mD with m >= n - 2 (if it reaches a certain threshold). This means that in a tournament similar to the control group tournament over 1000 generations, it would eventually achieve domination, since it is safe from extinction: it scores at least as high against TFT-2D and TFT-3D as those score against themselves (equally high against TFT-2D, strictly higher against TFT-3D), while winning direct encounters and scoring higher against itself anyway (survival rule):
TFT-DnC vs TFT-DnC = -3 : -3
TFT-2D vs TFT-2D = -6 : -6
TFT-D3C vs TFT-2D = -6 : -13
TFT-3D vs TFT-3D = -9 : -9
TFT-D3C vs TFT-3D = -6 : -13
Also, it outscores TFT-4D (-32 : -36) and TFT-5D (-38 : -48):
TFT-4D vs TFT-4D = -12 : -12
TFT-D3C vs TFT-4D = -13 : -6
TFT-D3C vs TFT-4D = -13 : -6
TFT-D3C vs TFT-D3C = -3 : -3
TFT-5D vs TFT-5D = -15 : -15
TFT-D3C vs TFT-5D = -16 : -9
TFT-D3C vs TFT-5D = -16 : -9
TFT-D3C vs TFT-D3C = -3 : -3
This means that in a situation where TFT-D3C is initially as well represented as TFT-4D or TFT-5D, it will eventually outperform those (not taking into account TFT-5D feeding on TFT-4D if both are present).
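These TFT-D3C match scores can be checked mechanically too. The sketch below encodes one possible reading of the TFT-D3C pseudocode given at the end of this post (the exact nesting is ambiguous there), together with the same assumed per-turn payoffs R=5, T=8, S=1, P=2 that reproduce this post's deviations:

```python
def tft_nd(k):
    """TFT that unconditionally defects on the last k turns (TFT-kD)."""
    def move(my, opp, turns):
        if len(my) >= turns - k:
            return 'D'
        return 'D' if (opp and opp[-1] == 'D') else 'C'
    return move

def tft_d3c(my, opp, turns):
    """One reading of the Efficient CliqueBot TFT-D3C pseudocode."""
    turn = len(my) + 1                           # 1-indexed current turn
    if 'D' in opp[:turns - 4]:                   # provoked on turns 1..n-4: alien
        if 'D' in opp[turns - 6:] or turn >= turns - 1:
            return 'D'                           # late provocation, or final two turns
        return 'D' if opp[-1] == 'D' else 'C'    # otherwise plain TFT
    if turn < turns - 3:
        return 'C'
    if turn == turns - 3:
        return 'D'                               # identification defection
    if opp[turns - 4] == 'D':                    # opponent defected on n-3 too: clone
        return 'D' if 'D' in opp[turns - 3:] else 'C'
    return 'D'                                   # opponent failed identification

def play(a, b, turns=20):
    """Scores as deviations from full mutual cooperation (R=5, T=8, S=1, P=2)."""
    payoff = {('C', 'C'): (5, 5), ('D', 'D'): (2, 2),
              ('D', 'C'): (8, 1), ('C', 'D'): (1, 8)}
    ha, hb = [], []
    for _ in range(turns):
        ma, mb = a(ha, hb, turns), b(hb, ha, turns)
        ha.append(ma)
        hb.append(mb)
    sa = sum(payoff[p][0] for p in zip(ha, hb))
    sb = sum(payoff[p][1] for p in zip(ha, hb))
    return sa - 5 * turns, sb - 5 * turns

print(play(tft_d3c, tft_nd(2)))   # (-6, -13): matches the listing above
print(play(tft_d3c, tft_nd(4)))   # (-13, -6)
print(play(tft_d3c, tft_d3c))     # (-3, -3)
```

Under this reading the simulator reproduces the per-match deviations listed above exactly.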
In any situation with only two strategies both being TFT-DnC variants, TFT-D3C will exterminate any other TFT-DnC strategies once that strategy falls below a certain threshold, because no other TFT-DnC strategy can score higher against TFT-D3C or itself than TFT-D3C does against itself. This is trivially true for all TFT-DnC strategies starting from n = 2:
TFT-D2C vs TFT-D2C = -3 : -3
TFT-D3C vs TFT-D2C = -6 : -13
The threshold decreases for increasing n.
TFT-D3C is also certain to gain a better start than any other TFT-DnC strategies with n > 3 due to higher gain from TFT-nD strategies with n <= 3 that dominate the early game. TFT-D1C is essentially the same as TFT-1D and equally outperformed by TFT-2D, while TFT-D2C cannot prevent TFT-D3C from crossing its threshold as TFT-D3C can feed on TFT-2D as well as on TFT-3D / TFT-D2C. This pretty much leaves TFT-D3C as the only viable TFT-DnC strategy to survive in a selective tournament of this kind. However this doesn't take into account true parasites which I will talk more about later.
CliqueBots
CliqueBots have fared very poorly in this tournament, but that is mostly because they have been very poorly designed. A CliqueBot that needs five defections to identify a clone is doomed from the start, and any strategy that turns into DefectBot after identifying an alien strategy before the last turns is very doomed anyway. It seems that even though participants anticipated TFT-1D and even TFT-2D, no one considered cooperative CliqueBots a possible solution, which is surprising. So let me introduce a CliqueBot that could technically be considered another optimum of TFT-DnC, although it behaves slightly differently (it is not an Efficient CliqueBot). The main idea is to defect on the very first turn, then cooperate if the opponent's previous move matches its own previous move (i.e. if facing a clone), and play TFT-2D or TFT-3D otherwise. Also, in case the game begins with defect-coop followed by coop-defect, the CliqueBot will cooperate a second time, since the opponent can be considered a TFT variant and a high number of mutual cooperations is beneficial. To keep the names short and follow the established naming system, I will from here on refer to CliqueBot strategies that defect once on turn i and otherwise behave pretty much as TFT-nD strategies as i/TFT-nD.
Some comparison with selected strategies as described in the rules for dominance (four matches each):
1/TFT-3D vs TFT-0D = -14 : -22
1/TFT-3D vs TFT-1D = -14 : -28
1/TFT-3D vs TFT-2D = -14 : -34
1/TFT-3D vs TFT-3D = -26 : -38
1/TFT-3D vs TFT-4D = -34 : -38
1/TFT-3D vs TFT-5D = -40 : -50
However, the survival of any 1/TFT-nD is guaranteed only against TFT-mD with m = n-1. In addition, any TFT-DmC with m >= n-1 will outperform 1/TFT-nD, so 1/TFT-nD (or any other i/TFT-nD for that matter) cannot prevail.
Parasites, Identification And Efficient CliqueBots
A true parasite is any strategy that pretends to be a clone of its host until the last possible opportunity of gainful defection. TFT is essentially a CliqueBot that identifies its clones by sustained cooperation, and TFT-1D is a parasite of TFT as TFT-2D is of TFT-1D and so on.
Parasite rule:
Since parasites trade points gained from encounters with clones for points gained from encounters with hosts, parasites can never go extinct in a scenario with only them and their hosts, no matter how small their population. Any parasite will also ultimately achieve dominance in these scenarios, because the points gained from one-sided defection (+7) are more than the points lost from mutual defection (-3).
Of which follows the dormancy rule:
A strategy can only win an open-ended selective IPD tournament if it stays dormant until its own parasite has gone extinct.
This means that theoretically any parasite only needs to survive long enough until it is the only strategy left besides its host to achieve ultimate victory, as was the case with TFT-3D, parasite of TFT-2D in the thousand generations tournament. This can easily be achieved if the host is the dominant strategy, because the parasite is better equipped to feed on the host than any other strategy in the pool, which is a guarantee of survival. However, the situation becomes very difficult for a parasite of a parasite of a dominant host in the early game, which is why TFT-4D is probably the highest TFT-nD able to survive and therefore to achieve dominance in this kind of tournament with integer rounding.
This brings us back to Efficient CliqueBots. All CliqueBots, including vanilla TFT, use identification patterns to identify clones in order to maximise total point gain. Parasites make use of these patterns in order to pretend to be their hosts and maximise their own point gain. Nice CliqueBots like vanilla TFT identify their clones by continuous cooperation, which makes them unable to identify other nice CliqueBots as non-clones. This is why TFT can't ever defect on the last turn when facing CooperateBot and why nice CliqueBots aren't usually considered "real" CliqueBots. Non-nice CliqueBots detect clones by mutual defections at fixed identification turns, and cooperate from there on if facing a clone in order to maximise total point gain and avoid the danger of losing points in the late game. So what's better than a parasite of a dominant host? Well obviously a parasite of a dominant host that's also a CliqueBot - because that's what TFT-DnC strategies are, Efficient CliqueBots.
The decisive feature of Efficient CliqueBots is that their identification turn is late enough so that in case of an alien opponent, they can simply continue defecting until the end, maximizing their efficiency and allowing them to perform better against TFT-nD strategies with higher n. Now the funny thing is that for example TFT-3D is a parasite of not only TFT-2D, but also TFT-D2C, which is what makes TFT-D3C the optimal TFT-DnC because it still outperforms TFT-4D. But is TFT-4D really the optimal parasite of TFT-D3C? Obviously not, because that would be TFT-D2CD - an Efficient CliqueBot parasite of an Efficient CliqueBot parasite (TFT-D3C) of a parasite (TFT-3D) of a parasite (TFT-2D) of a parasite (TFT-1D) of a nice CliqueBot (TFT-0D). According to the parasite rule, in a scenario with only TFT-D3C and TFT-D2CD, the latter will ultimately achieve dominance. Of course it has its own parasite in TFT-DC2D, which in turn has its own parasite TFT-2D2C, which is the host of TFT-2DCD, host of TFT-4D. And TFT-D4C and so on until DefectBot. Obviously DefectBot isn't the solution, it's only a solution to one specific strategy. So what is the solution? Well that mostly depends on the pool of submitted strategies, but the Efficient CliqueBot TFT-D3C still seems a pretty good guess.
Composite Strategies: Parasite-Host Tandems
So how can we make sure our parasite strategy stays dormant long enough? Well there's one less-than-obvious choice, which is to turn it into a Composite strategy. A Composite is pretty much what the name says: at the beginning of each match, execute strategy A with some probability, and execute strategy B otherwise. For example, a parasite could with 99% probability execute its host's strategy (e.g. TFT-0D), and with 1% execute the actual parasite strategy (TFT-1D). This would in most cases result in the death of the parasite's parasite (TFT-2D) while still resulting in absolute dominance, albeit very slowly. However this would obviously allow a singleton TFT-1D to feed on the Composite as well as on singleton TFT-0D, achieving dominance very quickly, while at the same time feeding TFT-2D. So 1% doesn't seem to be the ideal percentage, and TFT-2D would have been able to feed on TFT-0D anyway, so TFT-1D is a lost cause from the beginning. Also, even if there never was a singleton TFT-1D and TFT-2D, a similar tandem strategy could simply increase the percentage a bit.
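A Composite is easy to express in code. A minimal sketch, with the 99%/1% Parasite-Host Tandem from the example above (function names are illustrative, not taken from the tournament framework):

```python
import random

# Strategies are simple move functions move(my_history, opp_history, turns)
# returning 'C' or 'D'.
def tft(my, opp, turns):                      # TFT-0D
    return 'D' if (opp and opp[-1] == 'D') else 'C'

def tft_1d(my, opp, turns):                   # TFT-1D: defect on the last turn
    return 'D' if len(my) == turns - 1 else tft(my, opp, turns)

def composite(strat_a, strat_b, p_a):
    """Return a factory: call it once per match to draw that match's strategy."""
    def new_match():
        return strat_a if random.random() < p_a else strat_b
    return new_match

# Parasite-Host Tandem: 99% host (TFT-0D), 1% parasite (TFT-1D).
tandem = composite(tft, tft_1d, p_a=0.99)
this_match_strategy = tandem()                # drawn fresh for every match
```

The key design point is that the draw happens once per match, not per turn, so from the opponent's perspective each individual match looks like a pure host or a pure parasite.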
So let's stop talking about Parasite-Host Tandems and instead focus on other kinds of Composites that seem more promising.
Composite Strategies: Independent Tandems
These are Composites of two strategies that don't aim to achieve dormancy by keeping their hosts dominant until their own parasites have gone extinct. Instead the idea is that if two independent strategies achieve common dominance, it is less likely that both of their parasites survive. You could obviously increase the number of strategies, but this will create some problems I'll talk more about later. First of all, let us assume a 50-50 tandem of TFT-2D and UtopiaBot, the latter being an imaginary strategy that has no parasites and scores high against itself. Both strategies are highly efficient and will soon eliminate TFT-0D and TFT-1D, but TFT-3D will survive. However, TFT-3D will be unable to achieve dominance, since it can basically only feed on TFT-2D, which makes up only half of the tandem's population, while it still has to deal with UtopiaBot, which it is not designed for.
Of course there is a big issue with Independent Tandems: you need to find two somewhat successful strategies with different parasites that score reasonably high against themselves as well as against each other. An obvious candidate for this would be a tandem of an Efficient CliqueBot like TFT-D3C and a regular CliqueBot that continues to cooperate after a single opponent retaliation and defects on the last turns, like 1/TFT-3D. This CliqueBot would cooperate until the end if the initial defection was not met by retaliation (facing the sibling), and TFT-D3C would cooperate until the end if the opponent defected only on the identification turn, which it would not retaliate against.
Composite Strategies: Parasite Killer Tandems
Another option would be a tandem of two strategies of which one is the parasite of the other's parasite, e.g. TFT-2D and TFT-D3C, the latter taking care of both TFT-3D and TFT-D2C. Here, the Efficient CliqueBot TFT-D3C would revert to cooperation if the defections on the 3rd and 4th last turns have not been met by retaliation, effectively turning it into TFT-2D2C, and TFT-2D would not retaliate after defections on these turns. The main problem with Parasite Killer Tandems like these is that a singleton TFT-2D will also profit from TFT-2D2C killing TFT-3D strategies. This is somewhat offset by TFT-2D2C also slowly killing singleton TFT-2D, but possibly not enough to prohibit TFT-3D from feeding on it. This depends on the strategies in the tandem and the amount of surviving parasites. In addition, the parasite killer's parasite (in this case TFT-D2CD) may prove a problem, as well as parasites of both strategies (TFT-4D), although these should be pretty low in numbers at that point, probably unable to achieve dominance.
Composite Strategies: Random CliqueBots
A Composite can consist of an arbitrarily high number of strategies, combining Parasite Killers with independent strategies. The only limitations are the effectiveness of the individual strategies and the ability of these strategies to score high against each other, although the latter is not so much a problem as coop-defect loses only 1 point.
In fact coop-defect is so much superior to defect-defect that this brings us to another kind of Composite strategies: Random CliqueBots, which are basically i/TFT-nD CliqueBots with randomized i. For example, if i ranges from 1 to 10, this strategy wouldn't retaliate against an opponent's first defection on either of turns 1 to 10, thus reducing point loss from identifying clones/siblings. With increasing ranges of i, point loss approaches 1 which is much lower than the 6 points lost by regular CliqueBots with fixed i.
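The expected identification cost can be written out explicitly. The sketch below assumes two Random CliqueBot clones each draw their identification turn i uniformly from k candidate turns, and that the clone with the later i forgives and skips its own defection (as in the pseudocode below); losses are counted for the pair as a whole:

```python
# Expected points lost per clone pair during identification, relative to
# full mutual cooperation (assumption: mutual defection costs the pair 6,
# a single coop-defect turn costs the pair 1, as stated in this post).
def expected_identification_loss(k):
    same_turn = 6   # both drew the same i: one mutual defection
    one_sided = 1   # different i: a single one-sided defection
    return (1 / k) * same_turn + (1 - 1 / k) * one_sided

print(expected_identification_loss(1))    # 6.0: fixed i, a regular CliqueBot
print(expected_identification_loss(10))   # 1.5
# as k grows, the expected loss approaches 1
```

With k = 1 this degenerates to the regular CliqueBot's fixed mutual defection, and as k grows the clash probability 1/k vanishes, which is the "loss approaches 1" claim above.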
Random CliqueBot vs Efficient CliqueBot dominance ratio:
TFT-D3C vs TFT-D3C = -3 : -3
1-∞/TFT-4D vs TFT-D3C = -13 : -13
1-∞/TFT-4D vs TFT-D3C = -13 : -13
1-∞/TFT-4D vs 1-∞/TFT-4D = +3 : -4
Which nets -27 : -32. However, as TFT-D3C always scores 1 point higher against TFT-nD and other TFT-DnC strategies than 1-∞/TFT-4D does, the winner of this duel will depend heavily on the pool of submitted strategies.
In any case, this concludes Composite strategies, so let's get on with finding some replacements for boring ol' TFT.
CRD vs TFT
In the late game, TFT is successful because it doesn't defect first, while its ability to retaliate becomes very unimportant except during the very last turns. On the other hand, TFT is successful in the early game because it doesn't let strategies like DefectBots take too much advantage of it while maximising points gained from other nice strategies. But maybe there are other strategies with these same qualities that don't turn into a bloody mess after one random defection, or that score higher against RandomBots? Here's a very interesting observation that no one seems to have noticed, and I wouldn't have noticed it either had I not been specifically looking for it. Anyway, this is the observation: in both round-robin tournaments of the control group, the highest-finishing TFT-0D variant is C6. Any strategies that finished higher were either TFT-1D or TFT-2D variants. And what is C6?
It is a strategy that always forgives the opponent's first defection.
Had C6 defected on the last two turns, I suspect it would have dominated the tournament until the emergence of TFT-3D, i.e. won the 100 generations tournament. Had it been a forgiving TFT-D3C, it would have won the 1000 generations tournament. There were other strategies that had a chance of forgiving, but they did that on every defection, allowing DefectBots and RandomBots to trample all over them (like TF2T).
In order to maximise gain from RandomBots and other insane strategies, it might also be worth switching to DefectBot after a total of three opponent defections, or two subsequent ones. For the sake of simplicity and three-letter acronyms I call this strategy CRD (Cooperate, Retaliate, Defect).
Should CRD prove to outperform TFT in the early game, any other strategy would get one early defection for free, which is pretty much parasite CliqueBot heaven. On the other hand, CRD strategies would be able to feed on any Random CliqueBots.
Final Remarks
This pretty much concludes my thoughts on the theoretical nature of the selective Iterated Prisoner's Dilemma. As I see it, the strategies with the highest chance of success are:
Efficient CliqueBots:
Might work if the correct host is chosen and no true parasites are present. If there is a good chance that other participants will come to the same conclusion regarding the host while it is unlikely for any of the strategy's parasites to be able to survive, it might be beneficial to experiment with TFT variants such as CRD in order to maximise early-game growth. This is what gave "I" the lasting advantage over "C4" in the thousand generations tournament.
Regular CliqueBots:
Might work if there is a high number of Efficient CliqueBots hindering each other's process, especially if there are also specialized parasites of those. Regular CliqueBots will also have an advantage if there are lots of forgiving strategies.
Parasite Killer Tandems:
Might work if the correct parasite is chosen as a host and no parasites of the parasite killer survive.
Random CliqueBots:
Will most likely work, except if there are early dominant Efficient CliqueBot strategies on which the Random CliqueBot cannot feed, or if dominant strategies are of the forgiving (CRD) kind.
Regular TFT-nD strategies will be exterminated by Efficient CliqueBots.
There also are a few general guidelines:
- Any strategy needs to be carefully designed, which for (non-Efficient) CliqueBots includes forgiving after opponent retaliation as well as updating the identity of the opponent in case of defections before or after the identification turn.
- CliqueBots and Composites should not waste a single point when identifying clones, siblings and aliens.
- Any strategy needs to score against any other strategy at least as high as any potential competitors score against that strategy, including most importantly the competitor itself.
- Any strategy needs to score against itself as high as possible.
- It is not necessary or important to win direct encounters if the guidelines above are followed.
- The probability that I have made not a single arithmetic error in all of this post is pretty low, so double-checking the calculations relevant to the strategy in question seems rational.
Please also see this comment for graphical comparison of some of the discussed strategies.
Lastly, a few selected strategies written in pseudocode, with n being the number of turns per match:
Efficient CliqueBot: TFT-D3C
Cooperation on first turn.
Continue with TFT.
If opponent defects on any of turns 1 to n-4:
    Continue with TFT.
    Defect on turns n-1 and n.
    If opponent defects on any of turns n-5 to n:
        Defect until end.
Else:
    Defect on turn n-3.
    If opponent has defected on turn n-3:
        Continue with cooperation.
        If opponent defects any turn:
            Defect until end.
    Else:
        Defect until end.
Regular CliqueBot: 1/TFT-2D
Defect on first turn.
Cooperate on second turn.
If opponent has defected on first turn and cooperated on second turn:
    Cooperate on third turn.
    Continue with TFT.
    If opponent defects on any of turns 3 to n:
        Defect on turns n-1 (if still possible) and n.
Else:
    If opponent has defected on both turns:
        Defect on third turn.
    Else:
        Cooperate on third turn.
    Continue with TFT until turn n-2.
    Defect on turns n-1 and n.
    If opponent defects on any of turns n-5 to n:
        Defect until end.
Random CliqueBot: 10-19/TFT-2D
Randomly pick an integer i from 10 to 19.
Cooperate on first turn.
Play TFT until turn 9.
If opponent has defected on any of turns 1 to 9:
    Continue with TFT.
    Defect on turns n-1 and n.
Else:
    Continue with cooperation until turn 19.
    If opponent has defected a total of two times:
        Continue with TFT.
        Defect on turns n-1 and n.
    Else if opponent has defected once before turn 19:
        Cooperate after opponent defection.
        Continue with TFT.
    Else:
        Defect on turn i.
        Continue with cooperation.
        If opponent has defected on turn i:
            If opponent has cooperated on turn i+1:
                Continue with TFT.
            Else:
                Continue with TFT.
                Defect on turns n-1 and n.
        Else:
            If i < 19 and opponent has cooperated on turn i+1:
                Continue with TFT.
                If opponent defects:
                    Defect on turns n-1 and n.
            Else:
                Cooperate.
                Continue with TFT.
                Defect on turns n-1 and n.
If opponent defects on any of turns n-5 to n:
    Defect until end.
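To make the endgame logic concrete, here is a minimal Python sketch of the Efficient CliqueBot (TFT-D3C) above. The function names, the 'C'/'D' move encoding, and the history-list interface are my own assumptions, not part of any original tournament code, and the branch nesting follows one plausible reading of the pseudocode:

```python
def tft_d3c_move(opp, n):
    """Choose the next move for the TFT-D3C sketch.

    opp -- the opponent's past moves ('C'/'D'), oldest first
    n   -- total number of turns in the match
    The move being chosen is turn t (1-indexed).
    """
    t = len(opp) + 1
    if t == 1:
        return 'C'
    if 'D' in opp[:n - 4]:          # opponent defected on some turn 1..n-4
        if 'D' in opp[n - 6:]:      # ...and also on some turn n-5..n
            return 'D'              # grim: defect until the end
        if t >= n - 1:              # scheduled endgame defections
            return 'D'
        return opp[-1]              # otherwise plain tit-for-tat
    if t == n - 3:                  # identification: one signal defection
        return 'D'
    if t > n - 3:
        if opp[n - 4] == 'D':       # opponent signalled too: treat as clone
            return 'D' if 'D' in opp[n - 3:] else 'C'
        return 'D'                  # not a clone: exploit the endgame
    return opp[-1]                  # default: tit-for-tat


def play(n):
    """Self-play: two TFT-D3C copies over an n-turn match."""
    a, b = [], []
    for _ in range(n):
        ma, mb = tft_d3c_move(b, n), tft_d3c_move(a, n)
        a.append(ma)
        b.append(mb)
    return a, b
```

Two copies playing each other lose only the single identification turn n-3; against an unconditional cooperator the sketch defects from turn n-3 to the end, as the pseudocode specifies.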
How to solve the national debt deadlock
The US Congress is trying to resolve the national debt by getting hundreds of people to agree on a solution. This is silly. They should agree on the rules of a game to play that will result in a solution, and then play the game.
Here is an example game. Suppose there are N representatives, all with an equal vote. They need to reduce the budget by $D.
- Order the representatives numerically, in some manner that interleaves Republicans and Democrats.
- "1 full turn" will mean that representatives make one move in order 1..N, and then one move in order N..1.
- Take at least two full turns to make a list of budget choices. On each move, a representative writes down one budget item: an expense that may be cut, or something that may become a revenue source. They may write down something that is a subset or superset of an existing item; for instance, one person might write "Air Force budget", and another might write "Reduce maintenance inspections of hangar J11 at Wright Air Force Base from weekly to monthly". They can get as specific as they want to.
- If there are not $2D of options on the table, repeat.
- Each representative is given 10 "cut" votes, worth D/(5N) each; and 5 "defend" votes, also worth D/(5N) each. A "defend" vote cancels out a "cut" vote.
- Each representative secretly assigns their "cut" and "defend" votes to the choices on the table.
- Results are revealed and tallied up, and a budget will be drawn up accordingly.
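The tallying step above can be sketched in a few lines of Python. The item names, the zero floor on fully-defended items, and the list-of-votes interface are my assumptions; the only rules taken from the text are that every vote, cut or defend, is worth D/(5N), and that defend votes cancel cut votes:

```python
from collections import Counter

def tally(cut_votes, defend_votes, D, N):
    """Tally the budget game: every vote is worth D/(5*N) dollars.

    cut_votes / defend_votes are lists of item names, one entry per
    vote cast (a representative may stack several votes on one item).
    Returns {item: dollars cut}; items defended down to zero or below
    are dropped (an assumption: defend votes cancel cut votes but do
    not push an item negative).  Sanity check: with all votes cast,
    gross cuts total N*10*D/(5N) = 2D and defends cancel up to
    N*5*D/(5N) = D of them, so the net cut lands between D and 2D.
    """
    weight = D / (5 * N)
    net = Counter(cut_votes)
    net.subtract(Counter(defend_votes))
    return {item: count * weight for item, count in net.items() if count > 0}
```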
What game-theoretic problems does this game have? Can you think of a better game? Is it politically better to call it a "decision process" than a game?
The main trouble area, to my mind, is order of play. First I said that budget items would be listed by taking turns. The 1..N, N..1 order is supposed to make neither first nor last position preferable. But taking turns introduces complications: players may not want to reveal their intentions early.
Then I said votes are placed secretly and revealed all at once. This solves problems of game-theoretically trying to conceal information or bluff your opponent. It introduces other problems, such as tragedy-of-the-commons scenarios, where every Republican spends their "defend" votes on some pork in their state instead of on protecting tax cuts, because they assume some other Republican will do that.
Is it better to play "cut" votes first, reveal them, and then play "defend" votes?
Is there a meta-game to use to build such games?
A Gameplay Exploration of Yudkowsky's "Twelve Virtues"
Hello Less Wrong, this is my first post (kind of). I belong to a small game development company called Shiny Ogre Games. We have a vested interest in making games that, as Jonathan Blow puts it, "speak to the human condition." I am here to announce our next project for you.
In this announcement for Shiny Ogre's next project, there are two points to address. Firstly:
Thought is a process like any other. The methods by which we think can be identified, specified, defined, categorized and even predicted. One method of thinking that has been thoroughly defined is rationality. Many would consider rationality (i.e. the careful exercise of reason), to be an essential path toward enlightenment (hence this).
Secondly: The objective, logical, and mechanical approach to reason that rationality takes meshes nicely with game development, because any well-defined system can be turned into a game. A game is a system composed of players making decisions while considering objectives, governed by a rule set.
Where there is no decision there can be no game. Where decisions matter, a game can make them matter more.
Therefore, rationality is a core component of game playing.
Games are learning tools. They are perhaps the best learning tool available to humans, because they invoke our biological tendency to play.
With that said, our announcement:
We're making a video game about rationality.
The game will explore rationality in the context of Eliezer Yudkowsky's "Twelve Virtues of Rationality" (which we have permission for). From a narrative perspective the game takes place inside a mind on the brink of epiphany and will heavily feature themes from Plato's "Allegory of the Cave".
Yudkowsky's twelve virtues are the basis of the twelve levels in the game, and will feature each virtue in metaphorical form. The underlying message here is that if you master all of the twelve virtues (by completing all of the twelve levels), you will achieve 'epiphany'.
The game is a 2D side-scrolling puzzle-platformer. The player assumes the role of a figure representing his or her own conscious mind as it constructs machines (à la "Incredible Machine") that are a metaphor for the thoughts and concepts one would create while meditating on a complex problem.
We will update our progress and share development information on our website here, as well as with posts on Less Wrong, our twitter account, and the game's website.
You can expect discussions of design decisions for this project to be written frequently from the angle of game design theory. We may also release a small documentary film of the development process after the release of the game.
A release date has been set (and it's not too long from now), but I don't want to announce it just yet.
Here is some concept art for our Curiosity metaphor (you can view more art at our website linked above):

If you're interested, just upvote and/or comment. If you have any specific queries related to this project or about game design in general, it would be cool if you went here.
We will be sharing our progress as we make this game over the next few months. So pay attention to Less Wrong and/or shinyogre.com for updates.
Thanks!
Rational insanity
My theory on why North Korea has stepped up its provocation of South Korea since its nuclear and missile tests is that it sees this as a tug-of-war.
Suppose that North Korea wants to keep its nuclear weapons program. If they hadn't sunk a ship and bombed a city, world leaders would currently be pressuring North Korea to stop making nuclear weapons. Instead, they're pressuring North Korea to stop doing something (make provocative attacks) that North Korea doesn't really want to do anyway. And when North Korea (temporarily) stops attacking South Korea, everybody can go home and say they "did something about North Korea". And North Korea can keep on making nukes.
One Chance (a short flash game)
http://www.newgrounds.com/portal/view/555181
It needn't take more than 10 minutes to play, though it might if you nail-bite about your choices. I'm curious about the LW response, although I might be underwhelmed by lack of interest.
New Diplomacy Game in need of two more.
We have five people from the NYC division of LW. We need two more players.
http://webdiplomacy.net/board.php?gameID=42765
passcode: streetlight
Rationality Dojo
Last night, here in Portland (OR), some friends and I got together to try to start Rationality Dojo. We talked about it for a while and came up with exactly 4 exercises that we could readily practice:
1. Play Paranoid Debating
2. Play the AI-Box experiment
3. Read Harry Potter and the Methods of Rationality
4. Write fanfiction in the style of #3
We also had a whole bunch of semi-formed ideas about selecting a target (happiness, health) and optimizing it a month at a time. Starting a dojo, in a time before organized martial arts, was surely incredibly difficult. I hope we can accrete exercises rather than require a single sensei to invent the majority of the discipline. So I've added a category to the wiki, and I'm asking here. Do you have ideas or refinements for exercises to fit within rationality dojo?