paulfchristiano comments on Stupid Questions Open Thread Round 4 - Less Wrong Discussion

Post author: lukeprog 27 August 2012 12:04AM


Comment author: lukeprog 27 August 2012 12:20:24AM, 8 points

I finally decided it's worth some of my time to try to gain a deeper understanding of decision theory...

Question: Can Bayesians transform decisions under ignorance into decisions under risk by assuming the decision maker can at least assign probabilities to outcomes using some kind of ignorance prior(s)?

Details: "Decision under uncertainty" is used to mean various things, so for clarity's sake I'll use "decision under ignorance" to refer to a decision for which the decision maker does not (perhaps "cannot") assign probabilities to some of the possible outcomes, and I'll use "decision under risk" to refer to a decision for which the decision maker does assign probabilities to all of the possible outcomes.

There is much debate over which decision procedure to use when facing a decision under ignorance in which no act dominates the others. Some proposals include: the leximin rule, the optimism-pessimism rule, the minimax regret rule, the info-gap rule, and the maxipok rule.
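
To make the contrast between such rules concrete, here is a minimal sketch in Python. The payoff numbers and act names are invented for illustration, and the two rules shown are maximin (the base case of the leximin rule) and minimax regret:

```python
# Toy decision under ignorance: keys are acts, lists are payoffs by state.
# No probabilities are assigned to the states.
payoffs = {
    "A": [10, 2, 5],
    "B": [4, 9, 6],
    "C": [6, 6, 6],
}

def maximin(table):
    """Pick the act whose worst-case payoff is largest (leximin refines
    this by breaking ties on the second-worst outcome, and so on)."""
    return max(table, key=lambda act: min(table[act]))

def minimax_regret(table):
    """Pick the act minimizing maximum regret, where regret in a state is
    the gap between that act's payoff and the best payoff in that state."""
    n_states = len(next(iter(table.values())))
    best = [max(table[a][s] for a in table) for s in range(n_states)]
    return min(table, key=lambda act: max(best[s] - table[act][s]
                                          for s in range(n_states)))

print(maximin(payoffs))         # -> "C" (worst case 6 beats 2 and 4)
print(minimax_regret(payoffs))  # -> "C" (max regret 4 vs. 7 for A, 6 for B)
```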

However, there is broad agreement that when facing a decision under risk, rational agents maximize expected utility. Because we have a clearer procedure for dealing with decisions under risk than we do for dealing with decisions under ignorance, many decision theorists are tempted to transform decisions under ignorance into decisions under risk by appealing to the principle of insufficient reason: "if you have literally no reason to think that one state is more probable than another, then one should assign equal probability to both states."
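
As an illustration of that transformation, here is the same invented payoff table from the sketch above, with the principle of insufficient reason supplying a uniform prior over the three states:

```python
# The same toy payoff table, now treated as a decision under risk by
# assigning an ignorance prior (here, equal probability to each state)
# and maximizing expected utility.
payoffs = {
    "A": [10, 2, 5],
    "B": [4, 9, 6],
    "C": [6, 6, 6],
}
prior = [1 / 3, 1 / 3, 1 / 3]   # uniform ignorance prior over the states

def expected_utility(act):
    return sum(p * u for p, u in zip(prior, payoffs[act]))

print({a: round(expected_utility(a), 2) for a in payoffs})
# -> {'A': 5.67, 'B': 6.33, 'C': 6.0}
print(max(payoffs, key=expected_utility))   # -> "B"
```

Note that expected utility maximization under the uniform prior picks B, whereas the ignorance rules sketched above pick C, which is part of why the choice between the two framings matters.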

And if you're a Bayesian decision-maker, you presumably have some method for generating ignorance priors, whether or not that method always conforms to the principle of insufficient reason, and even if you doubt you've found the final, best method for assigning ignorance priors.

So if you're a Bayesian decision-maker, doesn't that mean that you only ever face decisions under risk, because at the very least you're assigning ignorance priors to the outcomes for which you're not sure how to assign probabilities? Or have I misunderstood something?

Comment author: paulfchristiano 27 August 2012 05:50:01AM, 8 points

You could always choose to manage ignorance by choosing a prior. It's not obvious whether you should. But as it turns out, we have results like the complete class theorem, which imply that EU maximization with respect to an appropriate prior is the only "Pareto efficient" decision procedure (any other decision can be changed so as to achieve a higher reward in every possible world).
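
A small sketch of what that Pareto-efficiency claim looks like in a finite case (two possible worlds and four acts with invented utilities; this illustrates the dominance idea, not the complete class theorem itself):

```python
# Acts with (utility in world 1, utility in world 2). Act "C" is dominated:
# every other act does better in both worlds.
acts = {
    "A": (10, 2),
    "B": (4, 9),
    "C": (3, 1),
    "D": (7, 6),
}

# Sweep priors P(world 1) = 0.00, 0.01, ..., 1.00 and record which act
# maximizes expected utility at each prior.
bayes_optimal = set()
for i in range(101):
    p = i / 100
    eu = {a: p * u1 + (1 - p) * u2 for a, (u1, u2) in acts.items()}
    bayes_optimal.add(max(eu, key=eu.get))

print(sorted(bayes_optimal))   # -> ['A', 'B', 'D']
# The dominated act "C" is never the EU maximizer under any prior, while
# each undominated act here is optimal for some range of priors.
```

The theorem itself runs this in the other direction: roughly, under its conditions, any procedure that is not (a limit of) a Bayes procedure is dominated by one that is.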

This analysis breaks down in the presence of computational limitations; in that case it's not clear that a "rational" agent should have even an implicit representation of a distribution over possible worlds (such a distribution may be prohibitively expensive to reason about, much less integrate exactly over), so maybe a rational agent should invoke some decision rule other than EU maximization.

The situation is sort of analogous to defining a social welfare function. One approach is to take a VNM utility function for each individual and then maximize total utility. At face value it's not obvious if this is the right thing to do--choosing an exchange rate between person A's preferences and person B's preferences feels pretty arbitrary and potentially destructive (just like choosing prior odds between possible world A and possible world B). But as it turns out, if you do anything else then you could have been better off by picking some particular exchange rate and using it consistently (again, modulo practical limitations).

Comment author: lukeprog 30 August 2012 10:56:13PM, 1 point

as it turns out, we have results like the complete class theorem, which imply that EU maximization with respect to an appropriate prior is the only "Pareto efficient" decision procedure (any other decision can be changed so as to achieve a higher reward in every possible world).

I found several books which give technical coverage of statistical decision theory, complete classes, and admissibility rules (Berger 1985; Robert 2001; Jaynes 2003; Liese & Miescke 2010), but I didn't find any clear explanation of exactly how the complete class theorem implies that "EU maximization with respect to an appropriate prior is the only 'Pareto efficient' decision procedure (any other decision can be changed so as to achieve a higher reward in every possible world)."

Do you know any source which does so, or are you able to explain it? This seems like a potentially significant argument for EUM that runs independently of the standard axiomatic approaches, which have suffered many persuasive attacks.

Comment author: paulfchristiano 31 August 2012 05:37:13AM, 1 point

The formalism of the complete class theorem applies to arbitrary decisions, and the Bayes decision procedures correspond to EU maximization with respect to an appropriate choice of prior. An inadmissible decision procedure is not Pareto efficient, in the sense that a different decision procedure does better in all possible worlds (which feels analogous to making all possible people happier). Does that make sense?

There is a bit of weasel room, in that the complete class theorem assumes that the data is generated by a probabilistic process in each possible world. This doesn't seem like an issue, because you just absorb the observation into the choice of possible world, but this points to a bigger problem:

If you define "possible worlds" finely enough, such that e.g. each (world, observation) pair is a possible world, then the space of priors is very large (e.g., you could put all of your mass on one (world, observation) pair for each observation) and can be used to justify any decision. For example, if we are in the setting of AIXI, any decision procedure can trivially be described as EU maximization under an appropriate prior: if the decision procedure outputs f(X) on input X, it corresponds to EU maximization against a prior which has the universe end after N steps with probability 2^(-N), and when the universe ends right after you see X, you receive an extra reward if your last output was f(X).
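
Spelling that construction out a little (a sketch only; the bonus reward ε and the zero utility assigned to every other outcome are additions for illustration, not part of the comment above):

```latex
% Prior: the universe ends after step N with probability 2^{-N}.
% Utility: a bonus \epsilon > 0 only if the universe ends right after the
% agent has seen X and its last output was f(X); utility 0 otherwise.
\[
  U \;=\;
  \begin{cases}
    \epsilon & \text{if the universe ends right after } X
               \text{ and the last output was } f(X),\\
    0        & \text{otherwise.}
  \end{cases}
\]
% Having seen X and choosing output a, the only utility at stake is the
% bonus, so
\[
  \mathbb{E}[U \mid X,\ a] \;=\;
  \epsilon \cdot \Pr[\text{universe ends right after } X] \cdot \mathbf{1}[a = f(X)],
\]
% which is maximized by a = f(X). EU maximization against this prior
% therefore reproduces the arbitrary decision procedure f.
```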

So the conclusion of the theorem isn't so interesting, unless there are few possible worlds. When you argue for EUM, you normally want some stronger statement than saying that any decision procedure corresponds to some prior.

Comment author: lukeprog 31 August 2012 04:16:53PM, 1 point

That was clear. Thanks!