
The Truly Iterated Prisoner's Dilemma

18 Eliezer_Yudkowsky 04 September 2008 06:00PM

Followup to The True Prisoner's Dilemma

For everyone who thought that the rational choice in yesterday's True Prisoner's Dilemma was to defect, a follow-up dilemma:

Suppose that the dilemma was not one-shot, but was rather to be repeated exactly 100 times, where for each round, the payoff matrix looks like this:

                 Humans: C                            Humans: D
Paperclipper: C  (+2 million lives, +2 paperclips)    (+3 million lives, +0 paperclips)
Paperclipper: D  (+0 lives, +3 paperclips)            (+1 million lives, +1 paperclip)

As most of you probably know, the king of the classical iterated Prisoner's Dilemma is Tit for Tat, which cooperates on the first round and, on each succeeding round, does whatever its opponent did last time.  But what most of you may not realize is that if you know when the iteration will stop, Tit for Tat is - according to classical game theory - irrational.

Why?  Consider the 100th round.  On the 100th round, there will be no future iterations, no chance to retaliate against the other player for defection.  Both of you know this, so the game reduces to the one-shot Prisoner's Dilemma.  Since you are both classical game theorists, you both defect.

Now consider the 99th round.  Both of you know that you will both defect in the 100th round, regardless of what either of you does in the 99th round.  So you both know that your current action affects only your current payoff, not any future payoff.  You are both classical game theorists.  So you both defect.

Now consider the 98th round...

With humanity and the Paperclipper facing 100 rounds of the iterated Prisoner's Dilemma, do you really truly think that the rational thing for both parties to do, is steadily defect against each other for the next 100 rounds?
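The contrast can be made concrete with a minimal Python sketch (my illustration, not from the post; the function and strategy names are invented), pitting the backward-induction strategy against Tit for Tat under the payoff matrix above:

```python
# Payoffs per round: (human lives saved in millions, paperclips gained).
PAYOFFS = {
    ("C", "C"): (2, 2),
    ("C", "D"): (0, 3),  # humans cooperate, paperclipper defects
    ("D", "C"): (3, 0),  # humans defect, paperclipper cooperates
    ("D", "D"): (1, 1),
}

def play(human_strategy, clippy_strategy, rounds=100):
    """Run the iterated game; a strategy maps the opponent's history to a move."""
    human_hist, clippy_hist = [], []
    lives = clips = 0
    for _ in range(rounds):
        h = human_strategy(clippy_hist)
        c = clippy_strategy(human_hist)
        human_hist.append(h)
        clippy_hist.append(c)
        d_lives, d_clips = PAYOFFS[(h, c)]
        lives += d_lives
        clips += d_clips
    return lives, clips

always_defect = lambda opp: "D"                      # the backward-induction play
tit_for_tat   = lambda opp: opp[-1] if opp else "C"  # cooperate, then mirror

print(play(always_defect, always_defect))  # -> (100, 100)
print(play(tit_for_tat, tit_for_tat))      # -> (200, 200)
```

Two classical game theorists save 100 million lives over the hundred rounds; two Tit for Tat players save 200 million.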

The True Prisoner's Dilemma

56 Eliezer_Yudkowsky 03 September 2008 09:34PM

It occurred to me one day that the standard visualization of the Prisoner's Dilemma is fake.

The core of the Prisoner's Dilemma is this symmetric payoff matrix:

        1: C      1: D
2: C    (3, 3)    (5, 0)
2: D    (0, 5)    (2, 2)

Player 1 and Player 2 can each choose C or D.  Player 1's and Player 2's utilities for the final outcome are given by the first and second numbers in the pair, respectively.  For reasons that will become apparent, "C" stands for "cooperate" and "D" stands for "defect".

Observe that a player in this game (regarding themselves as the first player) has this preference ordering over outcomes:  (D, C) > (C, C) > (D, D) > (C, D).

D, it would seem, dominates C:  If the other player chooses C, you prefer (D, C) to (C, C); and if the other player chooses D, you prefer (D, D) to (C, D).  So you wisely choose D, and as the payoff table is symmetric, the other player likewise chooses D.

If only you'd both been less wise!  You both prefer (C, C) to (D, D).  That is, you both prefer mutual cooperation to mutual defection.
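The dominance argument and its Pareto-inefficient conclusion can be checked mechanically in a few lines of Python (my illustration, not from the post; the dictionary encoding is invented for this example):

```python
# payoff[(my_move, their_move)] = my utility, for the symmetric matrix above.
payoff = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 2}

# D strictly dominates C: whatever the other player does, D pays more.
for their_move in ("C", "D"):
    assert payoff[("D", their_move)] > payoff[("C", their_move)]

# ...and yet mutual defection is worse for both than mutual cooperation.
assert payoff[("C", "C")] > payoff[("D", "D")]

print("D dominates C, but (D, D) is Pareto-inferior to (C, C)")
```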

The Prisoner's Dilemma is one of the great foundational issues in decision theory, and enormous volumes of material have been written about it.  Which makes it an audacious assertion of mine, that the usual way of visualizing the Prisoner's Dilemma has a severe flaw, at least if you happen to be human.


Rationality Quotes 13

8 Eliezer_Yudkowsky 02 September 2008 04:00PM

"You can only compromise your principles once.  After then you don't have any."
        -- Smug Lisp Weeny

"If you want to do good, work on the technology, not on getting power."
        -- John McCarthy

"If you’re interested in being on the right side of disputes, you will refute your opponents’ arguments.  But if you’re interested in producing truth, you will fix your opponents’ arguments for them.  To win, you must fight not only the creature you encounter; you must fight the most horrible thing that can be constructed from its corpse."
        -- Black Belt Bayesian

"I normally thought of "God!" as a disclaimer, or like the MPAA rating you see just before a movie starts: it told me before I continued into conversation with that person, that that person had limitations to their intellectual capacity or intellectual honesty."
        -- Mike Barskey

"It is the soldier, not the reporter, who has given us freedom of the press. It is the soldier, not the poet, who has given us freedom of speech. It is the soldier, not the campus organizer, who has given us the freedom to demonstrate. It is the soldier, not the lawyer, who has given us the right to a fair trial. It is the soldier, who salutes the flag, who serves under the flag, and whose coffin is draped by the flag, who allows the protester to burn the flag."
        -- Father Dennis Edward O'Brien, USMC

Rationality Quotes 12

4 Eliezer_Yudkowsky 01 September 2008 08:00PM

"Even if I had an objective proof that you don't find it unpleasant when you stick your hand in a fire, I still think you’d pull your hand out at the first opportunity."
        -- John K Clark

"So often when one level of delusion goes away, another one more subtle comes in its place."
        -- Rational Buddhist

"Your denial of the importance of objectivity amounts to announcing your intention to lie to us. No-one should believe anything you say."
        -- John McCarthy

"How exactly does one 'alter reality'?  If I eat an apple have I altered reality?  Or maybe you mean to just give the appearance of altering reality."
        -- JoeDad

"Promoting less than maximally accurate beliefs is an act of sabotage.   Don't do it to anyone unless you'd also slash their tires."
        -- Black Belt Bayesian

Brief Break

3 Eliezer_Yudkowsky 31 August 2008 04:00PM

I've been feeling burned on Overcoming Bias lately, meaning that I take too long to write my posts, which decreases the amount of recovery time, making me feel more burned, etc.

So I'm taking at most a one-week break.  I'll post small units of rationality quotes each day, so as to not quite abandon you.  I may even post some actual writing, if I feel spontaneous, but definitely not for the next two days; I have to enforce this break upon myself.

When I get back, my schedule calls for me to finish up the Anthropomorphism sequence, and then talk about Marcus Hutter's AIXI, which I think is the last brain-malfunction-causing subject I need to discuss.  My posts should then hopefully go back to being shorter and easier.

Hey, at least I got through over a solid year of posts without taking a vacation.

Dreams of Friendliness

16 Eliezer_Yudkowsky 31 August 2008 01:20AM

Continuation of Qualitative Strategies of Friendliness

Yesterday I described three classes of deep problem with qualitative-physics-like strategies for building nice AIs - e.g., the AI is reinforced by smiles, and happy people smile, therefore the AI will tend to act to produce happiness.  In shallow form, three instances of the three problems would be:

  1. Ripping people's faces off and wiring them into smiles;
  2. Building lots of tiny agents with happiness counters set to large numbers;
  3. Killing off the human species and replacing it with a form of sentient life that has no objections to being happy all day in a little jar.

And the deep forms of the problem are, roughly:

  1. A superintelligence will search out alternate causal pathways to its goals than the ones you had in mind;
  2. The boundaries of moral categories are not predictively natural entities;
  3. Strong optimization for only some humane values does not imply a good total outcome.

But there are other ways, and deeper ways, of viewing the failure of qualitative-physics-based Friendliness strategies.

Every now and then, someone proposes the Oracle AI strategy:  "Why not just have a superintelligence that answers human questions, instead of acting autonomously in the world?"

Sounds pretty safe, doesn't it?  What could possibly go wrong?


Qualitative Strategies of Friendliness

10 Eliezer_Yudkowsky 30 August 2008 02:12AM

Followup to Magical Categories

What on Earth could someone possibly be thinking, when they propose creating a superintelligence whose behaviors are reinforced by human smiles? Tiny molecular photographs of human smiles - or if you rule that out, then faces ripped off and permanently wired into smiles - or if you rule that out, then brains stimulated into permanent maximum happiness, in whichever way results in the widest smiles...

Well, you never do know what other people are thinking, but in this case I'm willing to make a guess.  It has to do with a field of cognitive psychology called Qualitative Reasoning.


Qualitative reasoning is what you use to decide that increasing the temperature of your burner increases the rate at which your water boils, which decreases the derivative of the amount of water present. One would also add the sign of d(water) - negative, meaning that the amount of water is decreasing - and perhaps the fact that there is only a bounded amount of water.  Or we could say that turning up the burner increases the rate at which the water temperature increases, until the water temperature goes over a fixed threshold, at which point the water starts boiling, and hence decreasing in quantity... etc.

That's qualitative reasoning, a small subfield of cognitive science and Artificial Intelligence - reasoning that doesn't describe or predict exact quantities, but rather the signs of quantities, their derivatives, the existence of thresholds.

As usual, human common sense means we can see things by qualitative reasoning that current programs can't - but the more interesting realization is how vital human qualitative reasoning is to our vaunted human common sense.  It's one of the basic ways in which we comprehend the world.

Without a timer you can't figure out how long water takes to boil; your mind isn't that precise.  But you can figure out that you should turn the burner up, rather than down, and then watch to make sure the water doesn't all boil away.  Which is what you mainly need, in the real world.  Or at least we humans seem to get by on qualitative reasoning; we may not realize what we're missing...
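This sign-only style of reasoning can be captured in a toy Python sketch (my illustration, not from the post, and not from any real qualitative-reasoning system; the function name and encoding are invented). It tracks only the signs of derivatives and a boiling threshold, never exact quantities:

```python
def water_trend(burner_up: bool, at_boiling: bool) -> dict:
    """Return the signs of the relevant derivatives, not their magnitudes."""
    # Temperature rises while heating below the threshold; at the boiling
    # point it is pinned there, so its derivative is zero.
    d_temp = "+" if (burner_up and not at_boiling) else "0"
    # The quantity of water decreases only while boiling.
    d_water = "-" if at_boiling else "0"
    return {"d(temp)": d_temp, "d(water)": d_water}

print(water_trend(burner_up=True, at_boiling=False))  # heating up, no water lost yet
print(water_trend(burner_up=True, at_boiling=True))   # temperature pinned, water decreasing
```

The model can answer "should I turn the burner up or down?" but not "how many minutes until it boils" - exactly the trade-off described above.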

So I suspect that what went through the mind of the one proposing the AI whose behaviors would be reinforced by human smiles was something like this:


Harder Choices Matter Less

31 Eliezer_Yudkowsky 29 August 2008 02:02AM

...or they should, logically speaking.

Suppose you're torn in an agonizing conflict between two choices.

Well... if you can't decide between them, they must be around equally appealing, right?  Equally balanced pros and cons?  So the choice must matter very little - you may as well flip a coin.  The alternative is that the pros and cons aren't equally balanced, in which case the decision should be simple.

This is a bit of a tongue-in-cheek suggestion, obviously - more appropriate for choosing from a restaurant menu than choosing a major in college.

But consider the case of choosing from a restaurant menu.  The obvious choices, like Pepsi over Coke, will take very little time.  Conversely, the choices that take the most time probably make the least difference.  If you can't decide between the hamburger and the hot dog, you're either close to indifferent between them, or in your current state of ignorance you're close to indifferent between their expected utilities.
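The near-indifference argument can be made quantitative with a toy calculation (mine, not the post's): if two options' expected utilities differ by some small amount eps, flipping a coin instead of deliberating loses you at most eps/2 in expectation.

```python
def expected_regret_of_coin_flip(u_a: float, u_b: float) -> float:
    """Expected utility lost by flipping a coin rather than picking the better option."""
    best = max(u_a, u_b)
    coin = 0.5 * (u_a + u_b)  # each option chosen with probability 1/2
    return best - coin        # equals abs(u_a - u_b) / 2

print(expected_regret_of_coin_flip(10.0, 10.1))  # agonizing choice: regret ~0.05
print(expected_regret_of_coin_flip(10.0, 2.0))   # obvious choice: regret 4.0 if flipped
```

The harder the choice feels, the smaller the gap between the utilities, and so the less the coin flip costs.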


Against Modal Logics

28 Eliezer_Yudkowsky 27 August 2008 10:13PM

Continuation of Grasping Slippery Things
Followup to Possibility and Could-ness, Three Fallacies of Teleology

When I try to hit a reduction problem, what usually happens is that I "bounce" - that's what I call it.  There's an almost tangible feel to the failure, once you abstract and generalize and recognize it.  Looking back, it seems that I managed to say most of what I had in mind for today's post, in "Grasping Slippery Things".  The "bounce" is when you try to analyze a word like could, or a notion like possibility, and end up saying, "The set of realizable worlds [A'] that follows from an initial starting world A operated on by a set of physical laws f."  Where realizable contains the full mystery of "possible" - but you've made it into a basic symbol, and added some other symbols: the illusion of formality.

There are a number of reasons why I feel that modern philosophy, even analytic philosophy, has gone astray - so far astray that I simply can't make use of their years and years of dedicated work, even when they would seem to be asking questions closely akin to mine.

The proliferation of modal logics in philosophy is a good illustration of one major reason:  Modern philosophy doesn't enforce reductionism, or even strive for it.

Most philosophers, as one would expect from Sturgeon's Law, are not very good.  Which means that they're not even close to the level of competence it takes to analyze mentalistic black boxes into cognitive algorithms.  Reductionism is, in modern times, an unusual talent.  Insights on the order of Pearl et al.'s reduction of causality or Julian Barbour's reduction of time are rare.

So what these philosophers do instead, is "bounce" off the problem into a new modal logic:  A logic with symbols that embody the mysterious, opaque, unopened black box.  A logic with primitives like "possible" or "necessary", to mark the places where the philosopher's brain makes an internal function call to cognitive algorithms as yet unknown.

And then they publish it and say, "Look at how precisely I have defined my language!"


Dreams of AI Design

19 Eliezer_Yudkowsky 26 August 2008 11:28PM

Followup to Anthropomorphic Optimism, Three Fallacies of Teleology

After spending a decade or two living inside a mind, you might think you knew a bit about how minds work, right?  That's what quite a few AGI wannabes (people who think they've got what it takes to program an Artificial General Intelligence) seem to have concluded.  This, unfortunately, is wrong.

Artificial Intelligence is fundamentally about reducing the mental to the non-mental.

You might want to contemplate that sentence for a while.  It's important.

Living inside a human mind doesn't teach you the art of reductionism, because nearly all of the work is carried out beneath your sight, by the opaque black boxes of the brain.  So far beneath your sight that there is no introspective sense that the black box is there - no internal sensory event marking that the work has been delegated.

Did Aristotle realize, when he talked about the telos, the final cause of events, that he was delegating predictive labor to his brain's complicated planning mechanisms - asking, "What would this object do, if it could make plans?"  I rather doubt it.  Aristotle thought the brain was an organ for cooling the blood - which he did think was important:  Humans, thanks to their larger brains, were more calm and contemplative.

So there's an AI design for you!  We just need to cool down the computer a lot, so it will be more calm and contemplative, and won't rush headlong into doing stupid things like modern computers.

