Bayesian Utility: Representing Preference by Probability Measures - Less Wrong
http://lesswrong.com/
Bayesian Utility: Representing Preference by Probability Measures
http://lesswrong.com/lw/148/bayesian_utility_representing_preference_by/
Tue, 28 Jul 2009 00:28:55 +1000
Submitted by <a href="http://lesswrong.com/user/Vladimir_Nesov">Vladimir_Nesov</a>
•
10 votes
•
<a href="http://lesswrong.com/lw/148/bayesian_utility_representing_preference_by/#comments">35 comments</a>
<div><p>This is a simple transformation of standard expected utility formula that I found conceptually interesting.</p>
<div><p>This is a simple transformation of standard expected utility formula that I found conceptually interesting.</p>
<p>For simplicity, let's consider a finite discrete <a href="http://en.wikipedia.org/wiki/Probability_space">probability space</a> with a non-zero probability p(x) at each point x, and a utility function u(x) defined on its sample space. The expected utility of an event A (a set of points of the sample space) is the average value of the utility function over the event, weighted by probability:</p>
<p><img src="http://www.codecogs.com/png.latex?EU(A)=%5Cfrac%7B%5Csum_%7Bx%5Cin%20A%7D%7Bp(x)%5Ccdot%20u(x)%7D%7D%7B%5Csum_%7Bx%5Cin%20A%7D%7Bp(x)%7D%7D" alt="EU(A)=\frac{\sum_{x\in A}{p(x)\cdot u(x)}}{\sum_{x\in A}{p(x)}}"></p>
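<p>As a concrete sanity check, the weighted-average formula can be computed directly. The following sketch uses a made-up four-point sample space; the particular probabilities and utilities are illustrative, not from the post.</p>

```python
# Toy sample space: p(x) > 0 everywhere, u(x) an arbitrary utility function.
p = {'x1': 0.1, 'x2': 0.2, 'x3': 0.3, 'x4': 0.4}
u = {'x1': 5.0, 'x2': 1.0, 'x3': 2.0, 'x4': 7.0}

def EU(A):
    """Expected utility of event A: probability-weighted average of u over A."""
    return sum(p[x] * u[x] for x in A) / sum(p[x] for x in A)

print(EU({'x1', 'x2'}))  # (0.1*5 + 0.2*1) / (0.1 + 0.2) ≈ 2.333
```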
<p><a id="more"></a> Expected utility is a way of comparing events (sets of possible outcomes) that correspond to, for example, available actions. Event A is said to be preferable to event B when EU(A)>EU(B). The preference relation doesn't change when the utility function is transformed by a positive affine transformation. Since the sample space is assumed finite, we can assume without loss of generality that u(x)>0 for all x. Such a utility function can additionally be rescaled so that, over the whole sample space,</p>
<p><img src="http://www.codecogs.com/png.latex?%5Csum_%7Bx%7D%7Bp(x)%5Ccdot%20u(x)%7D=1" alt="\sum_{x}{p(x)\cdot u(x)}=1"></p>
<p>Now, if we define</p>
<p><img src="http://www.codecogs.com/png.latex?q(x)=p(x)%5Ccdot%20u(x)" alt="q(x)=p(x)\cdot u(x)"></p>
<p>the expected utility can be rewritten as</p>
<p><img src="http://www.codecogs.com/png.latex?EU(A)=%5Cfrac%7B%5Csum_%7Bx%5Cin%20A%7D%7Bq(x)%7D%7D%7B%5Csum_%7Bx%5Cin%20A%7D%7Bp(x)%7D%7D" alt="EU(A)=\frac{\sum_{x\in A}{q(x)}}{\sum_{x\in A}{p(x)}}"></p>
<p>or</p>
<p><img src="http://www.codecogs.com/png.latex?EU(A)=%5Cfrac%7BQ(A)%7D%7BP(A)%7D" alt="EU(A)=\frac{Q(A)}{P(A)}"></p>
<p>Here, P and Q are two probability measures. It's easy to see that this form of the expected utility formula has the same expressive power, so a preference relation can be defined directly by a pair of probability measures on the same sample space, instead of by a probability measure and a utility function.</p>
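<p>A quick numerical check of this equivalence (a sketch with illustrative numbers): rescale u so that its p-weighted sum is 1, set q = p·u, and compare the two forms of the formula.</p>

```python
p = {'x1': 0.1, 'x2': 0.2, 'x3': 0.3, 'x4': 0.4}
u = {'x1': 5.0, 'x2': 1.0, 'x3': 2.0, 'x4': 7.0}

Z = sum(p[x] * u[x] for x in p)        # rescale u so that sum_x p(x)*u(x) = 1
q = {x: p[x] * u[x] / Z for x in p}    # shouldness: q(x) = p(x) * (u(x)/Z)

def EU_avg(A):    # weighted-average form of expected utility
    return sum(p[x] * (u[x] / Z) for x in A) / sum(p[x] for x in A)

def EU_ratio(A):  # Q(A)/P(A) form
    return sum(q[x] for x in A) / sum(p[x] for x in A)

A = {'x2', 'x4'}
print(EU_avg(A), EU_ratio(A))  # the two forms agree
```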
<p>Expected utility written in this form uses only the probability of the whole event under each of the two measures, without looking at the individual points. I tentatively call the measure Q "shouldness", with P being "probability". The conceptual advantage of this form is that probability and utility are now on equal footing, and it's possible to work with both of them using familiar Bayesian updating, in exactly the same way. To compute the expected utility of an event given additional information, just use the posterior shouldness and probability:</p>
<p><img src="http://www.codecogs.com/png.latex?EU(A%7CB)=%5Cfrac%7BQ(A%7CB)%7D%7BP(A%7CB)%7D" alt="EU(A|B)=\frac{Q(A|B)}{P(A|B)}"></p>
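<p>One way to sanity-check the conditional form (a sketch with illustrative numbers): the posterior ratio Q(A|B)/P(A|B) equals EU(A∩B)/EU(B), i.e. it differs from the expected utility of A∩B only by a factor that is constant in A, so for a fixed B it orders events exactly as EU(A∩B) does.</p>

```python
p = {'x1': 0.1, 'x2': 0.2, 'x3': 0.3, 'x4': 0.4}
u = {'x1': 5.0, 'x2': 1.0, 'x3': 2.0, 'x4': 7.0}
Z = sum(p[x] * u[x] for x in p)
q = {x: p[x] * u[x] / Z for x in p}    # shouldness measure, normalized

def P(E): return sum(p[x] for x in E)
def Q(E): return sum(q[x] for x in E)
def EU(E): return Q(E) / P(E)

A, B = {'x1', 'x2', 'x4'}, {'x1', 'x3', 'x4'}
posterior_ratio = (Q(A & B) / Q(B)) / (P(A & B) / P(B))  # Q(A|B) / P(A|B)
print(posterior_ratio, EU(A & B) / EU(B))                # equal
```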
<p>If events are drawn as points (vectors) in (P,Q) coordinates, expected utility is monotone in the polar angle of the vectors. Since the coordinates are measures of events, the vector depicting a union of disjoint events equals the sum of the vectors depicting those events:</p>
<p><img src="http://www.codecogs.com/png.latex?(P(A%5Ccup%20B),Q(A%5Ccup%20B))%20=%20(P(A),Q(A))+(P(B),Q(B)),%5C%20A%5Ccap%20B=%5Cemptyset" alt="(P(A\cup B),Q(A\cup B)) = (P(A),Q(A))+(P(B),Q(B)),\ A\cap B=\emptyset"></p>
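<p>This additivity is immediate from the definition of a measure; a minimal check with illustrative numbers:</p>

```python
p = {'x1': 0.1, 'x2': 0.2, 'x3': 0.3, 'x4': 0.4}
q = {'x1': 0.2, 'x2': 0.1, 'x3': 0.3, 'x4': 0.4}  # any second measure

def vec(E):
    """Event E as a point in (P, Q) coordinates."""
    return (sum(p[x] for x in E), sum(q[x] for x in E))

A, B = {'x1'}, {'x3', 'x4'}            # disjoint events
vA, vB, vAB = vec(A), vec(B), vec(A | B)
print(abs(vAB[0] - (vA[0] + vB[0])) < 1e-12 and
      abs(vAB[1] - (vA[1] + vB[1])) < 1e-12)   # True
```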
<p>This makes it possible to see graphically some of the structure of simple sigma-algebras of the sample space together with a preference relation defined by a pair of measures. See also <a href="/lw/148/bayesian_utility_representing_preference_by/11hn">this comment</a> for some examples of applying this geometric representation of preference.</p>
<p>A preference relation defined by expected utility in this way also doesn't depend on constant factors in the measures, so it's unnecessary to require that the measures sum to 1.</p>
<p>Since P and Q are just devices representing the preference relation, there is nothing inherently "epistemic" about P. Indeed, it's possible to mix P and Q together without changing the preference relation. A pair (p',q') defined by</p>
<p><img src="http://www.codecogs.com/png.latex?%5Cbegin%7Bmatrix%7D%20%5Cleft%5C%7B%5Cbegin%7Bmatrix%7D%20p'%20&=&%20%5Calpha%5Ccdot%20p%20+%20(1-%5Cbeta)%5Ccdot%20q%5C%5C%20q'%20&=&%20%5Cbeta%5Ccdot%20q%20+%20(1-%5Calpha)%5Ccdot%20p%20%5Cend%7Bmatrix%7D%5Cright.%5C%5C%20%5Calpha>%5Cbeta%20%5Cend%7Bmatrix%7D" alt="\begin{matrix} \left\{\begin{matrix} p' &=& \alpha\cdot p + (1-\beta)\cdot q\\ q' &=& \beta\cdot q + (1-\alpha)\cdot p \end{matrix}\right.\\ \alpha>\beta \end{matrix}"></p>
<p>gives the same preference relation,</p>
<p><img src="http://www.codecogs.com/png.latex?%5Cfrac%7BQ(A)%7D%7BP(A)%7D>%5Cfrac%7BQ(B)%7D%7BP(B)%7D%20%5CLeftrightarrow%20%5Cfrac%7BQ'(A)%7D%7BP'(A)%7D>%5Cfrac%7BQ'(B)%7D%7BP'(B)%7D" alt="\frac{Q(A)}{P(A)}>\frac{Q(B)}{P(B)} \Leftrightarrow \frac{Q'(A)}{P'(A)}>\frac{Q'(B)}{P'(B)}"></p>
<p>(Coefficients can be negative or more than 1, but values of p and q must remain positive.)</p>
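<p>A numerical sketch of this mixing invariance, with illustrative values α=0.9, β=0.7 (chosen so that p' and q' stay positive for these numbers): the pairwise ordering of events by expected utility is unchanged.</p>

```python
import itertools

p = {'x1': 0.1, 'x2': 0.2, 'x3': 0.3, 'x4': 0.4}
u = {'x1': 5.0, 'x2': 1.0, 'x3': 2.0, 'x4': 7.0}
Z = sum(p[x] * u[x] for x in p)
q = {x: p[x] * u[x] / Z for x in p}

alpha, beta = 0.9, 0.7
p2 = {x: alpha * p[x] + (1 - beta) * q[x] for x in p}  # p' = a*p + (1-b)*q
q2 = {x: beta * q[x] + (1 - alpha) * p[x] for x in p}  # q' = b*q + (1-a)*p

def EU(P_, Q_, A):
    return sum(Q_[x] for x in A) / sum(P_[x] for x in A)

# All nonempty proper subsets of the sample space, compared pairwise.
events = [set(c) for r in (1, 2, 3) for c in itertools.combinations(p, r)]
ok = all(
    (EU(p, q, A) > EU(p, q, B)) == (EU(p2, q2, A) > EU(p2, q2, B))
    for A, B in itertools.combinations(events, 2)
    if abs(EU(p, q, A) - EU(p, q, B)) > 1e-9   # skip exact ties
)
print(ok)  # True
```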
<p>Conversely, given a fixed measure P, it isn't possible to define an arbitrary preference relation by varying only Q (or the utility function). For example, for a sample space of three elements a, b and c, if p(a)=p(b)=p(c), then EU(a)>EU(b)>EU(c) implies EU(a+c)>EU(b+c), so it isn't possible to choose q such that EU(a+c)<EU(b+c). If we are free to choose p, however, an example with these properties (allowing zero values for simplicity) is a=(0,1/4), b=(1/2,3/4), c=(1/2,0), giving a+c=(1/2,1/4) and b+c=(1,3/4), so EU(a+c)<EU(b+c).</p>
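<p>The numbers in this example are easy to verify mechanically, treating each event directly as a (P, Q) pair (P(a)=0 makes EU(a) infinite, which is fine for the comparison):</p>

```python
def EU(v):
    """Expected utility of an event given as a (P, Q) pair."""
    P_, Q_ = v
    return float('inf') if P_ == 0 else Q_ / P_

a, b, c = (0, 0.25), (0.5, 0.75), (0.5, 0.0)   # events as (P, Q) pairs
ac = (a[0] + c[0], a[1] + c[1])                # a, c disjoint: measures add
bc = (b[0] + c[0], b[1] + c[1])

print(EU(a) > EU(b) > EU(c))   # True: inf > 1.5 > 0
print(EU(ac) < EU(bc))         # True: 0.5 < 0.75
```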
<p>The prior is an integral part of preference, and it works exactly the same way as shouldness. Manipulations with probabilities, or Bayesian "levels of certainty", are manipulations with "half of preference". The problem of choosing Bayesian priors is, in general, the problem of formalizing preference: it can't be solved completely without considering utility, without formalizing values, and values are <a href="http://wiki.lesswrong.com/wiki/Complexity_of_value">very complicated</a>. No simple morality, no simple probability.</p></div>
timtyler on Bayesian Utility: Representing Preference by Probability Measures
http://lesswrong.com/lw/148/bayesian_utility_representing_preference_by/zjq
2009-07-28T03:09:01.123438+10:00
<div class="md"><p>I've critiqued this "value is complex" [<a href="http://lesswrong.com/lw/y3/value_is_fragile/" rel="nofollow">http://lesswrong.com/lw/y3/value_is_fragile/</a>] material before. To summarise from my objections there:</p>
<p>The utility function of Deep Blue had 8,000 parts and contained a lot of information. Throw all that information away, and all you really need to reconstruct Deep Blue is the knowledge that its aim is to win games of chess. The exact details of the information in the original utility function are not recovered - but the eventual functional outcome would be much the same - a powerful chess computer.</p>
<p>The supposed complexity is actually a bunch of implementation details that can be effectively recreated from the goal - if that should prove to be necessary.</p>
<p>It is <em>not</em> precious information that must be preserved. If anything, attempts to preserve the 8,000 parts of Deep Blue's utility function while improving it would actually have a crippling negative effect on its future development. For example, the "look 9 moves ahead" heuristic is a feature when the program is weak, but a serious bug when it grows stronger.</p>
<p>Similarly with complexity of human values: those are a bunch of implementation details to deal with the problem of limited resources - not some kind of representation of the real target.</p></div>
Jonathan_Graehl on Bayesian Utility: Representing Preference by Probability Measures
http://lesswrong.com/lw/148/bayesian_utility_representing_preference_by/zkm
2009-07-28T06:55:19.007505+10:00
<div class="md"><p>It looks like this is a response to the passing link to <a href="http://wiki.lesswrong.com/wiki/Complexity_of_value" rel="nofollow">http://wiki.lesswrong.com/wiki/Complexity_of_value</a> in the article. At first I didn't understand what in the article you were responding to.</p></div>
timtyler on Bayesian Utility: Representing Preference by Probability Measures
http://lesswrong.com/lw/148/bayesian_utility_representing_preference_by/zko
2009-07-28T07:08:21.016926+10:00
<div class="md"><p>The article it was posted in response to was this one - from the conclusion of the post:</p>
<p><a href="http://wiki.lesswrong.com/wiki/Complexity_of_value" rel="nofollow">http://wiki.lesswrong.com/wiki/Complexity_of_value</a></p>
<p>That's a wiki article - which can't be responded to directly. The point I raise is an old controversy. This message seems rather redundant now, since the question it responded to has subsequently been dramatically edited.</p></div>
Jonathan_Graehl on Bayesian Utility: Representing Preference by Probability Measures
http://lesswrong.com/lw/148/bayesian_utility_representing_preference_by/zm2
2009-07-28T15:49:46.902624+10:00
<div class="md"><p>Yes, I edited, but before your response. Sorry for the confusion.</p></div>
Wei_Dai on Bayesian Utility: Representing Preference by Probability Measures
http://lesswrong.com/lw/148/bayesian_utility_representing_preference_by/zke
2009-07-28T06:07:37.392431+10:00
<div class="md"><p>Why was this comment voted down so much (to -4 as of now)? It seems to be a reasonable point, clearly written, not an obvious troll or off-topic. Why does it deserve to be ignored?</p></div>
jimmy on Bayesian Utility: Representing Preference by Probability Measures
http://lesswrong.com/lw/148/bayesian_utility_representing_preference_by/zk3
2009-07-28T04:50:24.453683+10:00
<div class="md"><p>I may be missing your point, but to me, it looks like the summary would be:</p>
<blockquote>
<p>If you bundle utility with probability, you can do the same maths, which is nice since it simplifies other things. You cannot prefer certain <em>expected</em> outcomes no matter what your utility function is [neat result, btw].</p>
<p>Since the probability math works, I now <a href="http://lesswrong.com/lw/ny/sneaking_in_connotations/">call the new thing</a> "probability" and show that you can't find prior "probability" (new definition) without considering the normal definition of probability.</p>
</blockquote>
<p>This doesn't change anything about regular probability, or <a href="http://en.wikipedia.org/wiki/Principle_of_maximum_entropy" rel="nofollow">finding priors</a>. It just says that you cannot find out what you instrumentally want apriori without knowing your utility function, which is trivially true.</p></div>
Vladimir_Nesov on Bayesian Utility: Representing Preference by Probability Measures
http://lesswrong.com/lw/148/bayesian_utility_representing_preference_by/zk5
2009-07-28T05:04:53.790586+10:00
<div class="md"><p>As I said in the first phrase, this is but a "simple transformation of standard expected utility formula that I found conceptually interesting". I don't quite understand the second part of your comment (starting from "Since the probability...").</p></div>
jimmy on Bayesian Utility: Representing Preference by Probability Measures
http://lesswrong.com/lw/148/bayesian_utility_representing_preference_by/zkb
2009-07-28T05:32:20.152407+10:00
<div class="md"><p>I agree that it is an interesting transformation, but I think your conclusion ("No simple morality, no simple probability.") does not follow.</p></div>
Vladimir_Nesov on Bayesian Utility: Representing Preference by Probability Measures
http://lesswrong.com/lw/148/bayesian_utility_representing_preference_by/zkd
2009-07-28T05:39:35.893238+10:00
<div class="md"><p>That argument says that if you pick a prior, you can't "patch" it to become an arbitrary preference by finding a fitting utility function. It's not particularly related to the shouldness/probability representation, and it isn't well-understood, but it's easy to demonstrate by example in this setting, and I think it's an interesting point as well, possibly worth exploring.</p></div>
cousin_it on Bayesian Utility: Representing Preference by Probability Measures
http://lesswrong.com/lw/148/bayesian_utility_representing_preference_by/zkt
2009-07-28T07:56:50.634736+10:00
<div class="md"><p>The new version of the post still loses me at about the point where mixing comes in. (What's your motivation for introducing mixing at all?) I would've been happier if it went on about geometry instead of those huge inferential leaps at the end.</p>
<p>And JGWeissman is right, expected utility is a <a href="http://lesswrong.com/lw/l4/terminal_values_and_instrumental_values/">property of actions not outcomes</a> which seems to make the whole post invalid unless you fix it somehow.</p></div>
Vladimir_Nesov on Bayesian Utility: Representing Preference by Probability Measures
http://lesswrong.com/lw/148/bayesian_utility_representing_preference_by/zkx
2009-07-28T08:26:28.655597+10:00
<div class="md"><p>Any action can be identified with a set of outcomes consistent with the action. See <a href="http://lesswrong.com/lw/148/bayesian_utility_representing_preference_by/zkw">my reply</a> to JGWeissman.</p>
<p>Is the example after mixing unclear? In what way?</p></div>
cousin_it on Bayesian Utility: Representing Preference by Probability Measures
http://lesswrong.com/lw/148/bayesian_utility_representing_preference_by/zky
2009-07-28T08:33:20.939092+10:00
<div class="md"><p>Yes, that's true but makes your conclusion a bit misleading because not all sets of outcomes correspond to possible actions. It can easily happen that any preference ordering on <em>actions</em> is rationalizable by tweaking utility under a given prior.</p>
<p>The math in the example is clear enough, I just don't understand the motivation for it. If you reduce everything to a preference relation on subsets of a sigma algebra, it's trivially true that you can tweak it with any monotonic function, not just mixing p and q with alpha and beta. So what.</p></div>
Vladimir_Nesov on Bayesian Utility: Representing Preference by Probability Measures
http://lesswrong.com/lw/148/bayesian_utility_representing_preference_by/zl1
2009-07-28T08:47:54.472872+10:00
<div class="md"><p>It can also happen that the prior happens to be the right one, but it isn't guaranteed. This is a red flag, a possible flaw, something to investigate.</p>
<p>The question of which events are "possible actions" is a many-faceted one, and solving this problem "by definition" doesn't work. For example, if you can pick the best strategy, it doesn't matter what the preference order says for all events except the best strategy, even what it says for "possible actions" which won't actually happen.</p>
<p>Strictly speaking, I don't even trust (any) expected utility (and so Bayesian math) to represent preference. Any solution has to also work in a discrete deterministic setting.</p></div>
cousin_it on Bayesian Utility: Representing Preference by Probability Measures
http://lesswrong.com/lw/148/bayesian_utility_representing_preference_by/zma
2009-07-28T17:45:26.522553+10:00
<div class="md"><p>It seems to me that you're changing the subject, or maybe making inferential jumps that are too long for me.</p>
<p>The information to determine which events are possible actions is absent from your model. You can't calculate it within your setting, only postulate.</p>
<p>If the overarching goal of this post was finding ways to represent human preference (did you imply that? I can't tell), then I don't understand how it brings us closer to that goal.</p></div>
Vladimir_Nesov on Bayesian Utility: Representing Preference by Probability Measures
http://lesswrong.com/lw/148/bayesian_utility_representing_preference_by/zmo
2009-07-28T21:38:18.492829+10:00
<div class="md"><p>The <a href="http://en.wikipedia.org/wiki/Hofstadter%27s_law" rel="nofollow">Hofstadter's Law</a> of <a href="http://wiki.lesswrong.com/wiki/Inferential_distance">Inferential Distance</a>: What you are saying is always harder to understand than you expect, even when you take into account Hofstadter's Law of Inferential Distance.</p>
<p>Of course this post is only a small side-note, and it tells nothing about which events mean what. Human preference is <em>a</em> preference, so even without details the discussion of preference-in-general has <em>some</em> implications for human preference, which the last paragraph of the post alluded to, with regard to picking priors for Bayesian math.</p></div>
JGWeissman on Bayesian Utility: Representing Preference by Probability Measures
http://lesswrong.com/lw/148/bayesian_utility_representing_preference_by/zl0
2009-07-28T08:42:31.678477+10:00
<div class="md"><blockquote>
<p>Expected utility is usually written for actions, but it can be written as in the post as well, it's formally equivalent.</p>
</blockquote>
<p>However, the ratios of the conditional probabilities of those outcomes, given that you take a certain action, will not always equal the ratios of the unconditional probabilities, as in your formula.</p></div>
JGWeissman on Bayesian Utility: Representing Preference by Probability Measures
http://lesswrong.com/lw/148/bayesian_utility_representing_preference_by/zjx
2009-07-28T04:03:30.865346+10:00
<div class="md"><p>Why are we concerned with the expected utility of some subset of the probability space? To find the expected utility of an action, you should sum the products of the utility of each point with its conditional probability given that you take that action, over all points in the space. In effect, you are only considering actions that reduce the probability of some points to zero and then renormalize the probabilities of the remaining points.</p></div>
Vladimir_Nesov on Bayesian Utility: Representing Preference by Probability Measures
http://lesswrong.com/lw/148/bayesian_utility_representing_preference_by/zkw
2009-07-28T08:21:34.766849+10:00
<div class="md"><p>Expected utility is usually written for actions, but it can be written as in the post as well, it's formally equivalent. This treatment of expected utility isn't novel in any way. Any action can be identified with a set of possibilities (outcomes) in which it happens. When you talk of actions that "don't reduce some probabilities to zero", you are actually talking about the effect of the actions on probability distributions of random variables, but behind those random variables is still a probability space on which any information is an element of sigma algebra, or a clear-cut set of possibilities.</p></div>
JGWeissman on Bayesian Utility: Representing Preference by Probability Measures
http://lesswrong.com/lw/148/bayesian_utility_representing_preference_by/zkz
2009-07-28T08:38:18.788065+10:00
<div class="md"><blockquote>
<p>Expected utility is usually written for actions, but it can be written as in the post as well, it's formally equivalent.</p>
</blockquote>
<p>How is it formally equivalent? How can I represent the expected utility of an action with arbitrary effects on conditional probability using the average, weighted by unconditional probabilities, of the utility of some subset of the possibilities, as in the post?</p></div>
Vladimir_Nesov on Bayesian Utility: Representing Preference by Probability Measures
http://lesswrong.com/lw/148/bayesian_utility_representing_preference_by/zl2
2009-07-28T09:12:00.211429+10:00
<div class="md"><p>Let A be the action (the set of possibilities consistent with taking the action), and O the set of possible outcomes (each one rated by the utility function; assume for simplicity that every concrete outcome is considered, not event-outcomes). We can assume <img src="http://latex.codecogs.com/png.latex?A\subseteq%20O" alt="" title="" />. Then:</p>
<p><img src="http://latex.codecogs.com/png.latex?EU(A)=\sum_{{x}\in%20O}{u(x)\cdot%20P(x|A)}=\sum_{{x}\in%20O}{u(x)\cdot\frac{P(x,A)}{P(A)}}" alt="" title="" /></p>
<p><img src="http://latex.codecogs.com/png.latex?EU(A)=\frac{1}{P(A)}\sum_{{x}\in%20A\cap%20O}{u(x)\cdot%20P(x,A)}=\frac{1}{P(A)}\sum_{{x}\in%20A}{u(x)\cdot%20P(x)}" alt="" title="" /></p></div>
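<p>The two lines above can be checked numerically (a sketch with illustrative numbers; P(x|A) is p(x)/P(A) for x in A and 0 otherwise):</p>

```python
p = {'x1': 0.1, 'x2': 0.2, 'x3': 0.3, 'x4': 0.4}   # distribution over O
u = {'x1': 5.0, 'x2': 1.0, 'x3': 2.0, 'x4': 7.0}   # utility over O
A = {'x1', 'x3'}                                    # the action, a subset of O

PA = sum(p[x] for x in A)
lhs = sum(u[x] * ((p[x] / PA) if x in A else 0.0) for x in p)  # sum_O u(x)*P(x|A)
rhs = sum(u[x] * p[x] for x in A) / PA                         # (1/P(A)) sum_A u(x)*p(x)
print(abs(lhs - rhs) < 1e-12)  # True
```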
Peter_de_Blanc on Bayesian Utility: Representing Preference by Probability Measures
http://lesswrong.com/lw/148/bayesian_utility_representing_preference_by/zn8
2009-07-29T02:54:37.975025+10:00
<div class="md"><p>How do you calculate P(A)?</p></div>
Vladimir_Nesov on Bayesian Utility: Representing Preference by Probability Measures
http://lesswrong.com/lw/148/bayesian_utility_representing_preference_by/zoi
2009-07-29T06:52:04.362941+10:00
<div class="md"><p>Trick question? P(A) is just a probability of some event, so depending on the problem it could be calculated in any of the possible ways. "A" can for example correspond to a value of some random variable in a (dynamic) graphical model, taking observations into account, so that its probability value is obtained from belief propagation.</p></div>
JGWeissman on Bayesian Utility: Representing Preference by Probability Measures
http://lesswrong.com/lw/148/bayesian_utility_representing_preference_by/zl6
2009-07-28T09:41:56.133773+10:00
<div class="md"><p>As I already explained, that only works for actions that exclude some outcomes and renormalize the probabilities of remaining outcomes, preserving the ratios of their probabilities.</p>
<p>Suppose O had 2 elements, x1 and x2, such that p(x1) = p(x2) = .5. If you take action A, then you have conditional probabilities p(x1|A) = .2 and p(x2|A) = .8. In this case, your transformation of P(x|A) = P(x, A)/P(A) does not work, because A did not remove x1 as a possibility; it just made it less likely.</p></div>
Vladimir_Nesov on Bayesian Utility: Representing Preference by Probability Measures
http://lesswrong.com/lw/148/bayesian_utility_representing_preference_by/zla
2009-07-28T09:58:10.881321+10:00
<div class="md"><p>P(x|A) = P(x,A)/P(A) is by definition of conditional probability. You are trying to interpret x1 and x2 as events, while in grandparent comment x are elements of the sample space. If you want to consider non-concrete outcomes, compose them from smaller elements. For example, you can have P(O1)=P(O2)=.5, P(O1|A)=.2, P(O2|A)=.8, if O1={x1,x2}, O2={x3,x4}, A={x1,x3}, and p(x1)=.1, p(x2)=.4, p(x3)=.4, p(x4)=.1.</p></div>
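<p>The refinement in this comment is easy to verify with the numbers given above:</p>

```python
p = {'x1': 0.1, 'x2': 0.4, 'x3': 0.4, 'x4': 0.1}
O1, O2, A = {'x1', 'x2'}, {'x3', 'x4'}, {'x1', 'x3'}

def P(E):
    return sum(p[x] for x in E)

def cond(E, given):                 # P(E|given) = P(E & given) / P(given)
    return P(E & given) / P(given)

print(P(O1), P(O2))                 # both 0.5
print(cond(O1, A), cond(O2, A))     # 0.2 and 0.8
```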
cousin_it on Bayesian Utility: Representing Preference by Probability Measures
http://lesswrong.com/lw/148/bayesian_utility_representing_preference_by/zil
2009-07-28T00:53:35.577868+10:00
<div class="md"><p>Clever! I would have titled it "Couldness and Shouldness", and inserted some sort of pun about "wouldness" at the end.</p>
<p>I don't quite understand the part about mixing. Did you mean 1 >= alpha > beta >= 0 ? If no, some vectors now have negative coordinates and the polar angle becomes an ambiguous ordering. If yes, that's not the general form: why not use any matrix with nonnegative elements and positive determinant?</p>
<p>And I don't understand the last paragraph at all. If X coordinates of points are given, changing the Y coordinates can reorder the polar angles arbitrarily. Or did you simply mean that composite events stay dependent on simple events?</p>
<p>Sorry if those are stupid questions.</p></div>
Vladimir_Nesov on Bayesian Utility: Representing Preference by Probability Measures
http://lesswrong.com/lw/148/bayesian_utility_representing_preference_by/zip
2009-07-28T01:14:49.793763+10:00
<div class="md"><p>Mixing: coefficients can be negative or more than 1, but values of p and q must remain positive (added to the post). This is also a way to drive polar angle of the expected utility of the best point of the sample space to pi/2 (look at the bounding parallelogram in (P,Q)).</p>
<p>You can't move the points around independently, since their coordinates are <em>measures</em>, sums of distributions over specific events, so if you move one event, some of the other events move as well. I'll add an example to the article in a moment.</p></div>
othercriteria on Bayesian Utility: Representing Preference by Probability Measures
http://lesswrong.com/lw/148/bayesian_utility_representing_preference_by/bv5d
2015-01-15T10:42:22.490378+11:00
<div class="md"><p>This seems cool but I have a nagging suspicion that this reduces to greater generality and a handful of sentences if you use <a href="http://en.wikipedia.org/wiki/Conditional_expectation" rel="nofollow">conditional expectation</a> of the utility function and the <a href="http://en.wikipedia.org/wiki/Radon%E2%80%93Nikodym_theorem" rel="nofollow">Radon-Nikodym theorem</a>?</p></div>
Vladimir_Nesov on Bayesian Utility: Representing Preference by Probability Measures
http://lesswrong.com/lw/148/bayesian_utility_representing_preference_by/11hn
2009-08-14T06:41:07.045218+10:00
<div class="md"><p>A couple of random thoughts. From the point of view on prior+utility as vectors in probability-shouldness coordinates, it's easy to see that the ability to rescale and shift utilities without changing preference corresponds to transformations to the shouldness component. These transformations don't change the order on vectors' (events') angles, and so even if we allow shouldness to go negative, expected utility will still work as preference. Similarly, if the shouldness is fixed positive, one could allow rescaling and shifting <em>probability</em>, so that it, too, can go negative.</p>
<p>Another transformation: if we swap the roles of probability and shouldness, the resulting prior+utility will have shouldness of the original system as prior and inverse utility of the original system as utility. In this system, expected utility <em>minimization</em> will describe the same optimization as the expected utility maximization in the original system. The same effect could be achieved by flipping the sign on utility (another symmetry), which can also be easily seen from the probability-shouldness diagram.</p>
<p>Applying both transformations, we get the same preference, but with shouldness of the original system as prior. Utility of the transformed system is negated inverted utility of the original representation. This shows that conceptually, probability distribution and shouldness distribution are interchangeable.</p></div>
Vladimir_Nesov on Bayesian Utility: Representing Preference by Probability Measures
http://lesswrong.com/lw/148/bayesian_utility_representing_preference_by/zjc
2009-07-28T02:22:47.657261+10:00
<div class="md"><p>Added an example of when it isn't possible to specify arbitrary preference for a given prior, and a philosophical note at the end (related to the "where do the priors come from" debate).</p></div>
Jonathan_Graehl on Bayesian Utility: Representing Preference by Probability Measures
http://lesswrong.com/lw/148/bayesian_utility_representing_preference_by/zkj
2009-07-28T06:50:02.217452+10:00
<div class="md"><p>I don't follow the equation of preference and priors in the last paragraph.</p></div>
Vladimir_Nesov on Bayesian Utility: Representing Preference by Probability Measures
http://lesswrong.com/lw/148/bayesian_utility_representing_preference_by/zkl
2009-07-28T06:54:45.941620+10:00
<div class="md"><p>What do you mean?</p></div>
Jonathan_Graehl on Bayesian Utility: Representing Preference by Probability Measures
http://lesswrong.com/lw/148/bayesian_utility_representing_preference_by/zkn
2009-07-28T07:03:34.228583+10:00
<div class="md"><blockquote>
<p>Prior is an integral part of preference, and it works exactly the same way as shouldness.</p>
</blockquote>
<p>Could you demonstrate? I don't understand.</p>
<blockquote>
<p>The problem of choosing Bayesian priors is in general the problem of formalizing preference, it can't be solved completely without considering utility</p>
</blockquote>
<p>I also don't understand what you mean above.</p></div>
Vladimir_Nesov on Bayesian Utility: Representing Preference by Probability Measures
http://lesswrong.com/lw/148/bayesian_utility_representing_preference_by/zkq
2009-07-28T07:51:37.684902+10:00
<div class="md"><p>What is usually called "prior" is represented by measure P in the post. Together with "shouldness" Q they constitute the recipe for computing preference over events, through expected utility.</p>
<p>If it's not possible to choose a prior more or less arbitrarily and then fill in the gaps using utility to get the correct preference, then some priors are inherently incorrect for human preference, and finding the priors that admit completion to the correct preference with a fitting utility requires knowledge about preference.</p></div>
Jonathan_Graehl on Bayesian Utility: Representing Preference by Probability Measures
http://lesswrong.com/lw/148/bayesian_utility_representing_preference_by/zm1
2009-07-28T15:46:18.806766+10:00
<div class="md"><p>Regarding your second point; I'm not sure how it's rational to choose your beliefs because of some subjective preference order.</p>
<p>Perhaps you could suggest a case where it makes sense to reason from preferences to "priors which make my preferences consistent", because I'm also fuzzy on the details of when and how you propose to do so.</p></div>
Jonathan_Graehl on Bayesian Utility: Representing Preference by Probability Measures
http://lesswrong.com/lw/148/bayesian_utility_representing_preference_by/zm0
2009-07-28T15:43:31.769247+10:00
<div class="md"><p>I see - by "prior" you mean "current estimate of probability", because P was defined</p>
<p>I've been dealing lately with learning research where "prior" means how likely a given model of probability(outcome) is before any evidence, so maybe I was a little rigid.</p>
<p>In any case, I suggest you consistently use "probability" and drop "prior".</p></div>