This is a simple transformation of the standard expected utility formula that I found conceptually interesting.
For simplicity, let's consider a finite discrete probability space with non-zero probability p(x) at each point x, and a utility function u(x) defined on its sample space. The expected utility of an event A (a set of points of the sample space) is the average value of the utility function over the event, weighted by probability, and is written as

$$EU(A) = \frac{\sum_{x \in A} p(x)\,u(x)}{\sum_{x \in A} p(x)}$$
Expected utility is a way of comparing events (sets of possible outcomes) that correspond to, for example, available actions. Event A is said to be preferable to event B when EU(A)>EU(B). The preference relation doesn't change when the utility function is transformed by a positive affine transformation. Since the sample space is assumed finite, we can assume without loss of generality that u(x)>0 for all x. Such a utility function can additionally be rescaled so that, over the whole sample space,

$$\sum_{x} p(x)\,u(x) = 1$$
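Now, if we define

$$q(x) = p(x)\,u(x)$$

the expected utility can be rewritten as

$$EU(A) = \frac{\sum_{x \in A} q(x)}{\sum_{x \in A} p(x)}$$

or

$$EU(A) = \frac{Q(A)}{P(A)}$$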
Here, P and Q are two probability measures, the sums of p and q over an event. It's easy to see that this form of the expected utility formula has the same expressive power, so a preference relation can be defined directly by a pair of probability measures on the same sample space, instead of using a utility function.
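As a sanity check, here is a minimal Python sketch of the construction. The three-point sample space and the values of p and u are made up for illustration, and the helper names (`measure`, `eu_classic`, `eu_ratio`) are my own. It rescales u, builds q(x) = p(x)u(x), and compares the weighted-average form of expected utility with the ratio of measures.

```python
from fractions import Fraction as F

# Hypothetical three-point sample space; p and u are illustrative values only.
p = {"x1": F(1, 2), "x2": F(1, 3), "x3": F(1, 6)}
u = {"x1": F(1), "x2": F(3), "x3": F(6)}

# Rescale u so that sum_x p(x) u(x) = 1; a positive scaling keeps the preference order.
total = sum(p[x] * u[x] for x in p)
u = {x: u[x] / total for x in u}

# "Shouldness" density: q(x) = p(x) u(x). After rescaling it sums to 1, like p.
q = {x: p[x] * u[x] for x in p}

def measure(m, event):
    """Measure of an event (a set of points) under density m."""
    return sum(m[x] for x in event)

def eu_classic(event):
    """Probability-weighted average of u over the event."""
    return sum(p[x] * u[x] for x in event) / measure(p, event)

def eu_ratio(event):
    """The same quantity written as Q(A) / P(A)."""
    return measure(q, event) / measure(p, event)

A = {"x1", "x2"}
print(eu_classic(A), eu_ratio(A))
```

Exact fractions are used so that the two forms can be compared with `==` rather than up to rounding.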
Expected utility written in this form uses only the measure of the whole event under P and Q, without looking at the individual points. I tentatively call measure Q "shouldness", together with P being "probability". The conceptual advantage of this form is that probability and utility are now on an equal footing, and it's possible to work with both of them using the familiar Bayesian updating, in exactly the same way. To compute the expected utility of an event given additional information, just use the posterior shouldness and probability:

$$EU(A \mid B) = \frac{Q(A \mid B)}{P(A \mid B)}$$
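A small Python sketch of this updating, assuming both measures are conditioned in the ordinary way, M(A|B) = M(A∩B)/M(B). The densities and the events A1, A2, B are illustrative, and `posterior` and `eu_given` are names introduced here. Under that assumption, the ratio of posteriors differs from Q(A∩B)/P(A∩B) only by the factor P(B)/Q(B), which is the same for every A, so the ordering of events given B is unaffected.

```python
from fractions import Fraction as F

# Illustrative densities on a three-point sample space (both sum to 1).
p = {"x1": F(1, 2), "x2": F(1, 3), "x3": F(1, 6)}
q = {"x1": F(1, 5), "x2": F(2, 5), "x3": F(2, 5)}

def measure(m, event):
    return sum(m[x] for x in event)

def posterior(m, event, given):
    """Ordinary conditioning, applied identically to P and to Q."""
    return measure(m, event & given) / measure(m, given)

def eu_given(event, given):
    """Expected utility as posterior shouldness over posterior probability."""
    return posterior(q, event, given) / posterior(p, event, given)

B = {"x2", "x3"}
A1, A2 = {"x1", "x2"}, {"x1", "x3"}

# Constant factor relating eu_given to the plain ratio on the intersection.
factor = measure(p, B) / measure(q, B)

print(eu_given(A1, B), eu_given(A2, B))
```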
If events are drawn as points (vectors) in (P,Q) coordinates, expected utility is a monotone function of the polar angle of the vector (it is the tangent of that angle). Since the coordinates are measures of events, the vector depicting a union of non-intersecting events A and B is equal to the sum of the vectors depicting these events:

$$(P(A \cup B),\; Q(A \cup B)) = (P(A),\; Q(A)) + (P(B),\; Q(B))$$
This makes it possible to see graphically some of the structure of simple sigma-algebras of the sample space, together with a preference relation defined by a pair of measures. See also this comment on some examples of applying this geometric representation of preference.
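The geometric picture can be checked numerically. The following is a minimal sketch with made-up densities; `vec`, `add`, `polar_angle` and `eu` are names of my own. It represents events as (P, Q) vectors, confirms that disjoint events add as vectors, and confirms that sorting by polar angle agrees with sorting by expected utility.

```python
import math
from fractions import Fraction as F

# Illustrative densities on a three-point sample space.
p = {"x1": F(1, 2), "x2": F(1, 3), "x3": F(1, 6)}
q = {"x1": F(1, 5), "x2": F(2, 5), "x3": F(2, 5)}

def vec(event):
    """An event drawn as the vector (P(event), Q(event))."""
    return (sum(p[x] for x in event), sum(q[x] for x in event))

def add(v, w):
    return (v[0] + w[0], v[1] + w[1])

def polar_angle(event):
    P, Q = vec(event)
    return math.atan2(float(Q), float(P))  # angle from the P axis, in radians

def eu(event):
    P, Q = vec(event)
    return Q / P

# Disjoint events add as vectors.
union_ok = add(vec({"x1"}), vec({"x2"})) == vec({"x1", "x2"})

# Ordering by polar angle coincides with ordering by expected utility.
events = [("x1",), ("x2",), ("x3",), ("x1", "x2"), ("x1", "x3"), ("x2", "x3")]
by_angle = sorted(events, key=polar_angle)
by_eu = sorted(events, key=eu)
print(union_ok, by_angle == by_eu)
```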
The preference relation defined by expected utility this way also doesn't depend on constant positive factors in the measures, so it's unnecessary to require the measures to sum to 1.
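Since P and Q are just devices representing the preference relation, there is nothing inherently "epistemic" about P. Indeed, it's possible to mix P and Q together without changing the preference relation. A pair (p',q') defined by

$$p' = \alpha\,p + (1-\alpha)\,q, \qquad q' = \beta\,p + (1-\beta)\,q, \qquad \alpha > \beta$$

gives the same preference relation,

$$\frac{Q(A)}{P(A)} > \frac{Q(B)}{P(B)} \iff \frac{Q'(A)}{P'(A)} > \frac{Q'(B)}{P'(B)}$$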
(Coefficients can be negative or more than 1, but values of p and q must remain positive.)
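Here is a hedged Python sketch of the mixing claim. It assumes the mixture has the form p' = αp + (1-α)q, q' = βp + (1-β)q with α > β. This parametrization is an assumption of the sketch, and `mix` and `ranking` are names introduced here. The code ranks every non-empty event by expected utility before and after mixing, using coefficients outside [0,1] that still leave all values positive.

```python
from fractions import Fraction as F
from itertools import combinations

# Illustrative densities on a three-point sample space.
p = {"x1": F(1, 2), "x2": F(1, 3), "x3": F(1, 6)}
q = {"x1": F(1, 5), "x2": F(2, 5), "x3": F(2, 5)}

def mix(p, q, alpha, beta):
    """Assumed mixing form: p' = a*p + (1-a)*q, q' = b*p + (1-b)*q."""
    p2 = {x: alpha * p[x] + (1 - alpha) * q[x] for x in p}
    q2 = {x: beta * p[x] + (1 - beta) * q[x] for x in p}
    return p2, q2

def ranking(p, q):
    """All non-empty events, sorted from least to most preferred."""
    points = sorted(p)
    events = [e for r in range(1, len(points) + 1)
              for e in combinations(points, r)]
    def eu(event):
        return sum(q[x] for x in event) / sum(p[x] for x in event)
    return sorted(events, key=eu)

# alpha > 1 and beta < 0, yet every value of p' and q' stays positive.
p2, q2 = mix(p, q, F(3, 2), F(-1, 4))
print(ranking(p, q) == ranking(p2, q2))
```

With α < β in this parametrization the ranking is reversed rather than preserved, which is why the ordering of the two coefficients matters.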
Conversely, given a fixed measure P, it isn't possible to define an arbitrary preference relation by only varying Q (or the utility function). For example, for a sample space of three elements a, b and c, if p(a)=p(b)=p(c), then EU(a)>EU(b)>EU(c) implies EU(a+c)>EU(b+c), where a+c stands for the event {a,c}, so it isn't possible to choose q such that EU(a+c)<EU(b+c). If we are free to choose p, however, an example that has these properties (allowing zero values for simplicity) is, in (P,Q) coordinates, a=(0,1/4), b=(1/2,3/4), c=(1/2,0), with a+c=(1/2,1/4), b+c=(1,3/4), so EU(a+c)<EU(b+c).
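The arithmetic of this example can be verified with a few lines of Python. The points are given in (P,Q) coordinates as above; treating the expected utility of a zero-probability point as infinite is a convention of this sketch, and `eu` is a helper name of my own.

```python
from fractions import Fraction as F

# Points of the sample space as (P, Q) coordinates, taken from the example.
pts = {
    "a": (F(0), F(1, 4)),
    "b": (F(1, 2), F(3, 4)),
    "c": (F(1, 2), F(0)),
}

def eu(event):
    """Q(event) / P(event); infinite when P(event) is zero (sketch convention)."""
    P = sum(pts[x][0] for x in event)
    Q = sum(pts[x][1] for x in event)
    return float("inf") if P == 0 else Q / P

# Single points are ordered a > b > c ...
print(eu({"a"}), eu({"b"}), eu({"c"}))
# ... but adding c to both a and b flips the comparison.
print(eu({"a", "c"}), eu({"b", "c"}))
```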
The prior is an integral part of preference, and it works exactly the same way as shouldness. Manipulations with probabilities, or Bayesian "levels of certainty", are manipulations with "half of preference". The problem of choosing Bayesian priors is in general the problem of formalizing preference. It can't be solved completely without considering utility, without formalizing values, and values are very complicated. No simple morality, no simple probability.
Clever! I would have titled it "Couldness and Shouldness", and inserted some sort of pun about "wouldness" at the end.
I don't quite understand the part about mixing. Did you mean 1 >= alpha > beta >= 0 ? If no, some vectors now have negative coordinates and the polar angle becomes an ambiguous ordering. If yes, that's not the general form: why not use any matrix with nonnegative elements and positive determinant?
And I don't understand the last paragraph at all. If X coordinates of points are given, changing the Y coordinates can reorder the polar angles arbitrarily. Or did you simply mean that composite events stay dependent on simple events?
Sorry if those are stupid questions.
Mixing: coefficients can be negative or more than 1, but values of p and q must remain positive (added to the post). This is also a way to drive polar angle of the expected utility of the best point of the sample space to pi/2 (look at the bounding parallelogram in (P,Q)).
You can't move the points around independently, since their coordinates are measures, sums of distributions over specific events, so if you move one event, some of the other events move as well. I'll add an example to the article in a moment.