JGWeissman comments on Bayesian Utility: Representing Preference by Probability Measures - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
How is it formally equivalent? How can I represent the expected utility of an action with arbitrary effects on conditional probability using the average, weighted by unconditional probabilities, of the utility of some subset of the possibilities, as in the post?
Let A be the action (the set of possibilities consistent with taking the action), and O the set of possible outcomes (each one rated by the utility function; for simplicity, assume every outcome considered is concrete, that is, an element of the sample space rather than an event). We can assume
. Then:
How do you calculate P(A)?
Trick question? P(A) is just the probability of some event, so depending on the problem it can be calculated in any of the usual ways. For example, "A" can correspond to a value of some random variable in a (dynamic) graphical model, taking observations into account, so that its probability is obtained from belief propagation.
As I already explained, that only works for actions that exclude some outcomes and renormalize the probabilities of remaining outcomes, preserving the ratios of their probabilities.
Suppose O has two elements, x1 and x2, such that p(x1) = p(x2) = .5. If you take action A, then you have conditional probabilities p(x1|A) = .2 and p(x2|A) = .8. In this case, your transformation P(x|A) = P(x, A)/P(A) does not work, because A did not remove x1 as a possibility; it just made it less likely.
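The objection can be checked by brute force. The sketch below assumes x1 and x2 are the only atoms of the sample space and that an action must be an event, that is, a non-empty subset of those atoms. The names `conditionals`, `events`, and `target_reachable` are illustrative and do not come from the thread.

```python
from fractions import Fraction
from itertools import combinations

# Unconditional probabilities of the two atomic outcomes (from the comment).
p = {"x1": Fraction(1, 2), "x2": Fraction(1, 2)}

# The conditional distribution the action is supposed to produce.
target = {"x1": Fraction(1, 5), "x2": Fraction(4, 5)}

def conditionals(event):
    """Return P(x | event) for every atom x, via P(x, event) / P(event)."""
    p_event = sum(p[x] for x in event)
    return {x: (p[x] / p_event if x in event else Fraction(0)) for x in p}

# Every non-empty subset of the atoms is a candidate action-event.
events = [
    frozenset(c)
    for r in range(1, len(p) + 1)
    for c in combinations(sorted(p), r)
]

# Conditioning can only zero out atoms and rescale the survivors,
# so the ratio between surviving atoms never changes.
reachable = [conditionals(e) for e in events]
target_reachable = target in reachable
```

Under these assumptions `target_reachable` comes out false: conditioning on any subset of a two-atom space can only delete an atom or leave the distribution unchanged, so a (.2, .8) split is out of reach unless the sample space is made finer.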
P(x|A) = P(x,A)/P(A) holds by the definition of conditional probability. You are trying to interpret x1 and x2 as events, while in the grandparent comment the x are elements of the sample space. If you want to consider non-concrete outcomes, compose them from smaller elements. For example, you can have P(O1)=P(O2)=.5, P(O1|A)=.2, P(O2|A)=.8, if O1={x1,x2}, O2={x3,x4}, A={x1,x3}, and p(x1)=.1, p(x2)=.4, p(x3)=.4, p(x4)=.1.
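The numbers in this example can be verified directly. The sketch below uses exact fractions and the sets given in the comment. The helper names `prob` and `cond` are illustrative, and the utility values in `u` are hypothetical, chosen only to show that averaging over the atoms in A agrees with weighting the composite outcomes by P(O1|A) and P(O2|A).

```python
from fractions import Fraction

# Unconditional probabilities of the four atomic outcomes (from the comment).
p = {
    "x1": Fraction(1, 10),
    "x2": Fraction(4, 10),
    "x3": Fraction(4, 10),
    "x4": Fraction(1, 10),
}

O1 = {"x1", "x2"}   # composite outcome 1
O2 = {"x3", "x4"}   # composite outcome 2
A = {"x1", "x3"}    # the action, as an event

def prob(event):
    """P(event): sum of the probabilities of its atoms."""
    return sum((p[x] for x in event), Fraction(0))

def cond(event, given):
    """P(event | given) = P(event, given) / P(given)."""
    return prob(event & given) / prob(given)

p_o1, p_o2 = prob(O1), prob(O2)
p_o1_given_a, p_o2_given_a = cond(O1, A), cond(O2, A)

# Hypothetical utilities, constant on each composite outcome.
u = {"x1": 10, "x2": 10, "x3": 0, "x4": 0}

# Expected utility of A as a P-weighted average over the atoms in A ...
eu_atoms = sum((p[x] * u[x] for x in A), Fraction(0)) / prob(A)
# ... and as a conditional-probability-weighted sum over composite outcomes.
eu_composite = p_o1_given_a * 10 + p_o2_given_a * 0
```

Here `p_o1` and `p_o2` both equal 1/2, while `p_o1_given_a` and `p_o2_given_a` equal 1/5 and 4/5, matching the figures in the comment. The two expected-utility computations agree because the action changes the odds between O1 and O2 only by selecting different atoms inside each of them.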