Desirable Dispositions and Rational Actions

RichardChappell

A common background assumption on LW seems to be that it's rational to act in accordance with the dispositions one would wish to have. (Rationalists must WIN, and all that.)

E.g., Eliezer:

It is, I would say, a general principle of rationality - indeed, part of how I define rationality - that you never end up envying someone else's mere choices. You might envy someone their genes, if Omega rewards genes, or if the genes give you a generally happier disposition. But [two-boxing] Rachel, above, envies [one-boxing] Irene her choice, and only her choice, irrespective of what algorithm Irene used to make it. Rachel wishes just that she had a disposition to choose differently.

And more recently, from AdamBell:

I [previously] saw Newcomb’s Problem as proof that it was sometimes beneficial to be irrational. I changed my mind when I realized that I’d been asking the wrong question. I had been asking which decision would give the best payoff at the time and saying it was rational to make that decision. Instead, I should have been asking which decision theory would lead to the greatest payoff.

Within academic philosophy, this is the position advocated by David Gauthier. Derek Parfit has constructed some compelling counterarguments against Gauthier, so I thought I'd share them here to see what the rest of you think.

First, let's note that there definitely are possible cases where it would be "beneficial to be irrational". For example, suppose an evil demon ('Omega') will scan your brain, assess your rational capacities, and torture you iff you surpass some minimal baseline of rationality. In that case, it would very much be in your interests to fall below the baseline! Or suppose you're rewarded every time you honestly believe the conclusion of some fallacious reasoning. We can easily multiply cases here. What's important for now is just to acknowledge this phenomenon of 'beneficial irrationality' as a genuine possibility.

This possibility poses a problem for the Eliezer-Gauthier methodology. (Quoting Eliezer again:)

Rather than starting with a concept of what is the reasonable decision, and then asking whether "reasonable" agents leave with a lot of money, start by looking at the agents who leave with a lot of money, develop a theory of which agents tend to leave with the most money, and from this theory, try to figure out what is "reasonable".

The problem, obviously, is that it's possible for irrational agents to receive externally-generated rewards for their dispositions, without this necessarily making their downstream actions any more 'reasonable'. (At this point, you should notice the conflation of 'disposition' and 'choice' in the first quote from Eliezer. Rachel does not envy Irene her choice at all. What she wishes is to have the one-boxer's dispositions, so that the predictor puts a million in the first box, and then to confound all expectations by unpredictably choosing both boxes and reaping the most riches possible.)
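To make the disposition/choice distinction concrete, here is a minimal sketch of the Newcomb payoffs. The million in the first box follows the text; the $1,000 in the second box is the conventional figure and is an assumption here, as are the names `payoff`, `one_boxer`, `two_boxer` and `rachel_ideal`.

```python
# Assumed amounts: the million is from the text; the 1,000 is the
# conventional value for the second box and is not stated in the post.
MILLION = 1_000_000
THOUSAND = 1_000


def payoff(predicted_one_box: bool, takes_both: bool) -> int:
    """Money received, given the predictor's earlier guess and the actual choice."""
    first_box = MILLION if predicted_one_box else 0
    return first_box + (THOUSAND if takes_both else 0)


# Holding the prediction fixed, taking both boxes is always worth 1,000 more.
for predicted in (True, False):
    assert payoff(predicted, True) - payoff(predicted, False) == THOUSAND

# A transparent one-boxer and a transparent two-boxer, each acting in character.
one_boxer = payoff(predicted_one_box=True, takes_both=False)
two_boxer = payoff(predicted_one_box=False, takes_both=True)

# What Rachel wishes for: to be predicted as a one-boxer, then take both.
rachel_ideal = payoff(predicted_one_box=True, takes_both=True)

print(one_boxer, two_boxer, rachel_ideal)
```

The one-boxing disposition comes out far ahead of the two-boxing disposition, yet for any fixed prediction the two-boxing choice is worth more. That is the gap the parenthetical above points at.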

To illustrate, consider (a variation on) Parfit's story of the threat-fulfiller and threat-ignorer. Tom has a transparent disposition to fulfill his threats, no matter the cost to himself. So he straps on a bomb, walks up to his neighbour Joe, and threatens to blow them both up unless Joe shines his shoes. Seeing that Tom means business, Joe sensibly gets to work. Not wanting to repeat the experience, Joe later goes and pops a pill to acquire a transparent disposition to ignore threats, no matter the cost to himself. The next day, Tom sees that Joe is now a threat-ignorer, and so leaves him alone.

So far, so good. It seems this threat-ignoring disposition was a great one for Joe to acquire. Until one day... Tom slips up. Due to an unexpected mental glitch, he threatens Joe again. Joe follows his disposition and ignores the threat. BOOM.

Here Joe's final decision seems as disastrously foolish as Tom's slip-up. It was good to have the disposition to ignore threats, but that doesn't necessarily make it a good idea to act on it. We need to distinguish the desirability of a disposition to X from the rationality of choosing to do X.
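The Tom and Joe story can be put in rough numbers. Everything in this sketch is a hypothetical assumption chosen for illustration (the costs, the glitch probability, and the function names); the story itself gives no figures.

```python
# Hypothetical payoffs for Joe; none of these numbers come from the story.
SHINE_SHOES = -1.0    # cost of giving in to a threat
EXPLOSION = -1000.0   # cost of the bomb going off
LEFT_ALONE = 0.0      # Tom does not bother Joe
GLITCH = 1e-4         # assumed chance that Tom threatens a known threat-ignorer


def disposition_value(ignores_threats: bool) -> float:
    """Expected value of *having* the disposition, before any threat is made."""
    # Tom always threatens someone who yields, and threatens an ignorer
    # only by mistake.
    p_threat = GLITCH if ignores_threats else 1.0
    if_threatened = EXPLOSION if ignores_threats else SHINE_SHOES
    return p_threat * if_threatened + (1 - p_threat) * LEFT_ALONE


def action_value(ignore: bool) -> float:
    """Value of the act itself once Tom has actually made a threat."""
    return EXPLOSION if ignore else SHINE_SHOES


print(disposition_value(True), disposition_value(False))
print(action_value(True), action_value(False))
```

Under these assumed numbers the threat-ignoring disposition is the better one to have, while ignoring an actual threat is the worse act. The two rankings come apart, which is all the example is meant to show.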

Wei-Dai wrote a post entitled The Absent-Minded Driver which I labeled "snarky". Moreover, I suggested that the snarkiness was so bad as to be nauseating, so as to drive reasonable people to flee in horror from LW and SIAI. I here attempt to defend these rather startling opinions. Here is what Wei-Dai wrote that offended me:

This post examines an attempt by professional decision theorists to treat an example of time inconsistency, and asks why they failed to reach the solution (i.e., TDT/UDT) that this community has more or less converged upon. (Another aim is to introduce this example, which some of us may not be familiar with.) Before I begin, I should note that I don't think "people are crazy, the world is mad" (as Eliezer puts it) is a good explanation. Maybe people are crazy, but unless we can understand how and why people are crazy (or to put it more diplomatically, "make mistakes"), how can we know that we're not being crazy in the same way or making the same kind of mistakes?

The paper that Wei-Dai reviews is "The Absent-Minded Driver" by Robert J. Aumann, Sergiu Hart, and Motty Perry. Wei-Dai points out, rather condescendingly:

(Notice that the authors of this paper worked for a place called Center for the Study of Rationality, and one of them won a Nobel Prize in Economics for his work on game theory. I really don't think we want to call these people "crazy".)

Wei-Dai then proceeds to give a competent description of the problem and the standard "planning-optimality" solution of the problem. Next comes a description of an alternative seductive-but-wrong solution by Piccione and Rubinstein. I should point out that everyone - P&R, Aumann, Hart, and Perry, Wei-Dai, me, and hopefully you who look into this - realizes that the alternative P&R solution is wrong. It gets the wrong result. It doesn't win. The only problem is explaining exactly where the analysis leading to that solution went astray, and explaining how it might be modified so as to go right. Making this analysis was, as I see it, the whole point of both papers - P&R and Aumann et al. Wei-Dai describes some characteristics of Aumann et al.'s corrected version of the alternate solution. Then he (?) goes horribly astray:

In problems like this one, UDT is essentially equivalent to planning-optimality. So why did the authors propose and argue for action-optimality despite its downsides ..., instead of the alternative solution of simply remembering or recomputing the planning-optimal decision at each intersection and carrying it out?

But, as anyone who reads the paper carefully should see, they weren't arguing for action-optimality as the solution. They never abandoned planning-optimality. Their point is that if you insist on reasoning in this way (and Selten's notion of "subgame perfection" suggests some reasons why you might!), then the algorithm they call "action-optimality" is the way to go about it.
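For readers who have not seen the problem, here is a minimal sketch of the planning-optimal calculation. The payoffs are assumptions: they are the figures usually quoted for the Piccione and Rubinstein example (0 for exiting at the first intersection, 4 for exiting at the second, 1 for driving past both) and are not stated anywhere in this thread. The names `planning_value`, `best_p` and `best_value` are illustrative only.

```python
from fractions import Fraction

# Assumed payoffs (the usual textbook numbers, not given in this thread).
EXIT_FIRST, EXIT_SECOND, CONTINUE_BOTH = 0, 4, 1


def planning_value(p):
    """Expected payoff when the driver continues with probability p at
    every intersection, since he cannot tell the intersections apart."""
    q = 1 - p
    return q * EXIT_FIRST + p * q * EXIT_SECOND + p * p * CONTINUE_BOTH


# Search a grid of exact fractions so the result has no rounding error.
grid = [Fraction(k, 300) for k in range(301)]
best_p = max(grid, key=planning_value)
best_value = planning_value(best_p)

print(best_p, best_value)
```

With these payoffs the expression reduces to 4p - 3p^2, so the plan chosen before setting out is to continue with probability 2/3. The sketch covers only the planning stage; it does not attempt the action-optimality analysis discussed above.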

But Wei-Dai doesn't get this. Instead we get this analysis of how these brilliant people just haven't had the educational advantages that LW folks have:

Well, the authors don't say (they never bothered to argue against it), but I'm going to venture some guesses:

That solution is too simple and obvious, and you can't publish a paper arguing for it.

It disregards "the probability of being at X", which intuitively ought to play a role.

The authors were trying to figure out what is rational for human beings, and that solution seems too alien for us to accept and/or put into practice.

The authors were not thinking in terms of an AI, which can modify itself to use whatever decision theory it wants to.

Aumann is known for his work in game theory. The action-optimality solution looks particularly game-theory like, and perhaps appeared more natural than it really is because of his specialized knowledge base.

The authors were trying to solve one particular case of time inconsistency. They didn't have all known instances of time/dynamic/reflective inconsistencies/paradoxes/puzzles laid out in front of them, to be solved in one fell swoop.

Taken together, these guesses perhaps suffice to explain the behavior of these professional rationalists, without needing to hypothesize that they are "crazy". Indeed, many of us are probably still not fully convinced by UDT for one or more of the above reasons.

Let me just point out that the reason it is true that "they never argued against it" is that they had already argued for it. Check out the implications of their footnote #4!

Ok, those are the facts, as I see them. Was Wei-Dai snarky? I suppose it depends on how you define snarkiness. Taboo "snarky". I think that he was overbearingly condescending without the slightest real reason for thinking himself superior. "Snarky" may not be the best one-word encapsulation of that attitude, but it is the one I chose. I am unapologetic. Wei-Dai somehow came to believe himself better able to see the truth than a Nobel laureate in the Nobel laureate's field. It is a mistake he would not have made had he simply read a textbook or taken a one-semester course in the field. But I'm coming to see it as a mistake made frequently by SIAI insiders.

Let me point out that the problem of forgetful agents may seem artificial, but it is actually extremely important. An agent with perfect recall playing the iterated PD, knowing that it is to be repeated exactly 100 times, should rationally choose to defect. On the other hand, if he cannot remember how many iterations remain to be played, and knows that the other player cannot remember either, he should cooperate by playing Tit-for-Tat or something similar.
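The perfect-recall half of that claim is the standard backward-induction argument, which can be sketched as follows. The stage payoffs (5, 3, 1, 0) are conventional illustrative values and an assumption here, as are the function names. The sketch also relies on a simplification: play in later rounds does not depend on what happens now, so the continuation value is the same constant whichever move is chosen.

```python
# (my move, other's move) -> my payoff; assumed conventional PD values.
STAGE = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}


def best_reply(other_move: str, continuation: int) -> str:
    """My best move this round, given the other's move and a fixed
    payoff from all later rounds."""
    return max("CD", key=lambda mine: STAGE[(mine, other_move)] + continuation)


def backward_induction(rounds: int):
    """Work from the last round to the first, recording each round's move."""
    continuation = 0
    plan = []
    for _ in range(rounds):
        # Defection must be the best reply to either move by the other player.
        replies = {best_reply(other, continuation) for other in "CD"}
        assert replies == {"D"}
        plan.append("D")
        continuation += STAGE[("D", "D")]
    return plan[::-1], continuation


plan, total = backward_induction(100)
print(plan.count("D"), total)
```

The argument needs a known last round to start from. An agent who cannot tell how many rounds remain has no such starting point, which is the case the second half of the paragraph addresses; the sketch makes no claim about it.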

Well, that is my considered response on "snarkiness". I still have to respond on some other points, and I suspect that, upon consideration, I am going to have to eat some crow. But I'm not backing down on this narrow point. Wei-Dai blew it in interpreting Aumann's paper. (And also, other people who know some game theory should read the paper and savor the implications of footnote #4. It is totally cool.)

Preliminary notes: You can call me "Wei Dai" (that's firstname lastname). "He" is ok. I have taken a graduate-level course in game theory (where I got a 4.0 grade, in case you suspect that I coasted through it), and have Fudenberg and Tirole's "Game Theory" and Joyce's "Foundations of Causal Decision Theory" as two of the few physical books that I own.

Their point is that if you insist on reasoning in this way (and Selten's notion of "subgame perfection" suggests some reasons why you might!), then the algo ...

Tyrrell_McAllister

How is Wei Dai being condescending there? He's pointing out how weak it is to dismiss people with these credentials by just calling them crazy. ETA: In other words, it's an admonishment directed at LWers. That, at any rate, was my read.
