This is a mathematical appendix to my post "Why you must maximize expected utility", giving precise statements and proofs of some results about von Neumann-Morgenstern utility theory without the Axiom of Continuity. I wish I had the time to make this post more readable and to give more intuition; the ideas are rather straightforward, and I hope they won't get lost in the line noise!
The work here is my own (though closely based on the standard proof of the VNM theorem), but I don't expect the results to be new.
*
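I represent preference relations as total preorders $\preccurlyeq$ on a simplex $\Delta_N := \{x \in \mathbb{R}^N : x_i \ge 0 \text{ for all } i,\ \sum_i x_i = 1\}$; define $\succcurlyeq$, $\sim$, $\prec$ and $\succ$ in the obvious ways (e.g., $x \sim y$ iff both $x \preccurlyeq y$ and $y \preccurlyeq x$, and $x \prec y$ iff $x \preccurlyeq y$ but not $y \preccurlyeq x$). Write $e^i$ for the $i$'th unit vector in $\mathbb{R}^N$.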
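In the following, I will always assume that $\preccurlyeq$ satisfies the independence axiom: that is, for all $x, y, z \in \Delta_N$ and $p \in (0,1]$, we have $x \prec y$ if and only if $px + (1-p)z \prec py + (1-p)z$. Note that the analogous statement with weak preferences follows from this: $x \preccurlyeq y$ holds iff $y \not\prec x$, which by independence is equivalent to $py + (1-p)z \not\prec px + (1-p)z$, which is just $px + (1-p)z \preccurlyeq py + (1-p)z$.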
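As a concrete illustration, here is a minimal Python sketch of a preference relation that satisfies independence but not continuity. Everything in it is an assumption made for illustration and not part of the original post: three hypothetical outcomes carry a (primary, secondary) value pair in `VALUES`, and lotteries are ranked lexicographically by their expected pair. The function `independence_holds` spot-checks the axiom on random rational lotteries; it is a sanity check, not a proof.

```python
from fractions import Fraction as F
import random

# Hypothetical example: each outcome has a (primary, secondary) value.
VALUES = [(F(0), F(0)), (F(0), F(1)), (F(1), F(0))]

def key(lottery):
    """Expected (primary, secondary) value; Python compares tuples lexicographically."""
    return tuple(sum(w * v[k] for w, v in zip(lottery, VALUES)) for k in range(2))

def strictly_worse(x, y):
    """x ≺ y in the illustrative lexicographic preference."""
    return key(x) < key(y)

def mix(p, a, b):
    """The compound lottery p*a + (1-p)*b."""
    return [p * ai + (1 - p) * bi for ai, bi in zip(a, b)]

def random_lottery(rng):
    """A random point of the simplex with rational coordinates."""
    weights = [rng.randint(1, 12) for _ in VALUES]
    total = sum(weights)
    return [F(w, total) for w in weights]

def independence_holds(trials=1000, seed=0):
    """Check x ≺ y  <=>  px+(1-p)z ≺ py+(1-p)z on random samples with p in (0, 1]."""
    rng = random.Random(seed)
    for _ in range(trials):
        x, y, z = (random_lottery(rng) for _ in range(3))
        p = F(rng.randint(1, 10), 10)
        if strictly_worse(x, y) != strictly_worse(mix(p, x, z), mix(p, y, z)):
            return False
    return True
```

The check passes because the expected value pair is linear in the lottery, so mixing both sides with the same $z$ shifts both keys by the same amount and scales their difference by $p > 0$.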
Lemma 1 (more of a good thing is always better). If $x \prec y$ and $0 \le p < q \le 1$, then $(1-p)x + py \prec (1-q)x + qy$.
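Proof. Let $r := q - p$. Then, $(1-p)x + py = \big((1-q)x + py\big) + rx$ and $(1-q)x + qy = \big((1-q)x + py\big) + ry$. Thus, the result follows from independence applied to $x$, $y$, $z := \frac{1}{1-r}\big((1-q)x + py\big)$, and $r$ as the mixing probability: the two sides of the claim are exactly $rx + (1-r)z$ and $ry + (1-r)z$. (If $r = 1$, then $p = 0$ and $q = 1$, and the claim is just $x \prec y$.)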
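For readers who want to see why the result "follows", here is a small Python sketch that checks the decomposition used in the proof with exact rational arithmetic. The helper names `combine` and `lemma1_pieces` and the sample numbers are hypothetical, chosen only for illustration; the block assumes $q - p < 1$ so that the division is defined.

```python
from fractions import Fraction as F

def combine(weight, a, b):
    """Return the lottery weight*a + (1 - weight)*b, componentwise."""
    return [weight * ai + (1 - weight) * bi for ai, bi in zip(a, b)]

def lemma1_pieces(x, y, p, q):
    """Return (r, z) as in the proof of Lemma 1; assumes 0 <= p < q <= 1 and q - p < 1."""
    r = q - p
    z = [((1 - q) * xi + p * yi) / (1 - r) for xi, yi in zip(x, y)]
    return r, z

# Sample lotteries over three outcomes (illustrative values only).
x = [F(1), F(0), F(0)]
y = [F(0), F(1, 2), F(1, 2)]
p, q = F(1, 4), F(2, 3)
r, z = lemma1_pieces(x, y, p, q)

lhs = combine(1 - p, x, y)  # (1-p)x + p*y
rhs = combine(1 - q, x, y)  # (1-q)x + q*y

# Both sides share the common part (1-r)z and differ only in r*x versus r*y.
assert combine(r, x, z) == lhs
assert combine(r, y, z) == rhs
```

Since `z` is itself a lottery (its entries are non-negative and sum to one), independence with mixing probability `r` turns $x \prec y$ into `lhs` $\prec$ `rhs`, which is the statement of the lemma.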
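Lemma 2. If $x \preccurlyeq y \preccurlyeq z$ and $x \prec z$, then there is a unique $p \in [0,1]$ such that $(1-q)x + qz \prec y$ for $q \in [0, p)$ and $(1-q)x + qz \succ y$ for $q \in (p, 1]$.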
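Proof. Let $p$ be the supremum of all $r \in [0,1]$ such that $(1-r)x + rz \preccurlyeq y$ (note that by assumption, this condition holds for $r = 0$). Suppose that $0 \le q < p$. Then there is an $r \in (q, p]$ such that $(1-r)x + rz \preccurlyeq y$. By Lemma 1, we have $(1-q)x + qz \prec (1-r)x + rz$, and the first assertion follows.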
Suppose now that $p < q \le 1$. Then by definition of $p$, we do not have $(1-q)x + qz \preccurlyeq y$, which means that we have $(1-q)x + qz \succ y$, which was the second assertion.
Finally, uniqueness is obvious, because if both $p$ and $p' > p$ satisfied the condition, we would have $y \prec (1-q)x + qz \prec y$ for every $q \in (p, p')$.
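The threshold from Lemma 2 can be approximated by bisection, because by Lemma 1 the set of $r$ with $(1-r)x + rz \preccurlyeq y$ is an interval starting at $0$. The sketch below does this for a hypothetical lexicographic preference (the `VALUES` table and all function names are assumptions made for illustration, not part of the original post). It also shows what goes wrong without continuity: the threshold exists, but no mixture of $x$ and $z$ need be exactly as good as $y$.

```python
from fractions import Fraction as F

# Hypothetical example: each outcome has a (primary, secondary) value.
VALUES = [(F(0), F(0)), (F(0), F(1)), (F(1), F(0))]

def key(lottery):
    """Expected (primary, secondary) value; tuples compare lexicographically."""
    return tuple(sum(w * v[k] for w, v in zip(lottery, VALUES)) for k in range(2))

def weakly_worse(a, b):
    """a ≼ b in the illustrative lexicographic preference."""
    return key(a) <= key(b)

def mix_toward(x, z, r):
    """The lottery (1-r)x + r*z."""
    return [(1 - r) * xi + r * zi for xi, zi in zip(x, z)]

def lemma2_threshold(x, y, z, steps=40):
    """Lower bisection bound on sup{ r : (1-r)x + r*z ≼ y }; assumes x ≼ y ≼ z, x ≺ z."""
    if weakly_worse(z, y):
        return F(1)
    lo, hi = F(0), F(1)
    for _ in range(steps):
        mid = (lo + hi) / 2
        if weakly_worse(mix_toward(x, z, mid), y):
            lo = mid
        else:
            hi = mid
    return lo

e1, e2, e3 = [F(1), F(0), F(0)], [F(0), F(1), F(0)], [F(0), F(0), F(1)]
p_mid = lemma2_threshold(e1, e2, e3)                      # threshold for y = e2
p_half = lemma2_threshold(e1, [F(1, 2), 0, F(1, 2)], e3)  # threshold for a 50/50 mix
```

In this example `p_mid` is `0`: every mixture with positive weight on the best outcome beats `e2`, while the mixture at `r = 0` is `e1` itself, which is strictly worse than `e2`. So the interval $[0, p)$ is empty and no `r` yields indifference.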
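Definition 3. $x$ is much better than $y$, notation $x \succ\!\succ y$ or $y \prec\!\prec x$, if there are neighbourhoods $U$ of $x$ and $V$ of $y$ (in the relative topology of $\Delta_N$) such that we have $x' \succ y'$ for all $x' \in U$ and $y' \in V$. (In other words, the graph of $\succ\!\succ$ is the interior of the graph of $\succ$.) Write $x \precsim y$ or $y \succsim x$ when $x \not\succ\!\succ y$ ($x$ is not much better than $y$), and $x \approx y$ ($x$ is about as good as $y$) when both $x \precsim y$ and $x \succsim y$.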
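Theorem 4 (existence of a utility function). There is a $u \in \mathbb{R}^N$ such that for all $x, y \in \Delta_N$,
$$x \prec\!\prec y \iff u \cdot x < u \cdot y, \qquad \text{where } u \cdot x := \sum_i u_i x_i.$$
Unless $x \approx y$ for all $x$ and $y$, there are $i, j$ such that $u_i \ne u_j$.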
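Proof. Let $\underline{i}$ be a worst and $\overline{i}$ a best outcome, i.e. let $\underline{i}, \overline{i} \in \{1, \dots, N\}$ be such that $e^{\underline{i}} \preccurlyeq e^j \preccurlyeq e^{\overline{i}}$ for all $j$. If $e^{\underline{i}} \sim e^{\overline{i}}$, then $e^j \sim e^{\underline{i}}$ for all $j$, and by repeated applications of independence we get $x \sim e^{\underline{i}}$ for all $x \in \Delta_N$, and therefore (since no lottery is strictly better than any other) $x \approx y$, again for all $x, y \in \Delta_N$, and we can simply choose $u := 0$.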
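Thus, suppose that $e^{\underline{i}} \prec e^{\overline{i}}$. In this case, let $u$ be such that for every $i$, $u_i$ equals the unique $p$ provided by Lemma 2 applied to $e^{\underline{i}} \preccurlyeq e^i \preccurlyeq e^{\overline{i}}$ and $e^{\underline{i}} \prec e^{\overline{i}}$. Because of Lemma 1, $u_{\underline{i}} = 0$ and $u_{\overline{i}} = 1$. Let $f(r) := (1-r)e^{\underline{i}} + r e^{\overline{i}}$.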
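We first show that $u \cdot x < u \cdot y$ implies $x \prec y$. For every $i$, we either have $u_i < 1$, in which case by Lemma 2 we have $e^i \prec f(u_i + \epsilon_i)$ for arbitrarily small $\epsilon_i > 0$, or we have $u_i = 1$, in which case we set $\epsilon_i := 0$ and find $e^i \preccurlyeq e^{\overline{i}} = f(u_i + \epsilon_i)$. Set $\epsilon := \sum_i x_i \epsilon_i$. Now, by independence applied $N$ times (replacing one $e^i$ at a time, and using that $f$ is affine), we have $x \preccurlyeq f(u \cdot x + \epsilon)$; analogously, we obtain $y \succcurlyeq f(u \cdot y - \delta)$ for arbitrarily small $\delta \ge 0$. Thus, using $u \cdot x + \epsilon < u \cdot y - \delta$ (which holds once $\epsilon$ and $\delta$ are small enough) and Lemma 1, $x \preccurlyeq f(u \cdot x + \epsilon) \prec f(u \cdot y - \delta) \preccurlyeq y$ and therefore $x \prec y$ as claimed. Now note that if $u \cdot x < u \cdot y$, then this continues to hold for $x'$ and $y'$ in a sufficiently small neighbourhood of $x$ and $y$, and therefore we have $x \prec\!\prec y$.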
Now suppose that $u \cdot x \ge u \cdot y$. Since we have $u \cdot x \le 1$ and $u \cdot y \ge 0$, we can find points $x'$ and $y'$ arbitrarily close to $x$ and $y$ such that the inequality becomes strict (either the left-hand side is smaller than one and we can increase it, or the right-hand side is greater than zero and we can decrease it, or else the inequality is already strict). Then, $x' \succ y'$ by the preceding paragraph. But this implies that $x \not\prec\!\prec y$, which completes the proof.
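To make the theorem concrete, the following Python sketch works through a hypothetical lexicographic preference over three outcomes (the `VALUES` table, the vector `U`, and all function names are assumptions made for illustration). For this example the construction in the proof yields `U = [0, 0, 1]`: the worst and best outcomes get 0 and 1, and the middle outcome gets threshold 0 because any positive chance of the best outcome beats it.

```python
from fractions import Fraction as F

# Hypothetical example: each outcome has a (primary, secondary) value.
VALUES = [(F(0), F(0)), (F(0), F(1)), (F(1), F(0))]
# Utility vector obtained for this example from the construction in the proof.
U = [F(0), F(0), F(1)]

def key(lottery):
    """Expected (primary, secondary) value; tuples compare lexicographically."""
    return tuple(sum(w * v[k] for w, v in zip(lottery, VALUES)) for k in range(2))

def strictly_worse(a, b):
    """a ≺ b in the illustrative lexicographic preference."""
    return key(a) < key(b)

def expected_utility(lottery):
    """The dot product u . x."""
    return sum(w * ui for w, ui in zip(lottery, U))

e1, e2 = [F(1), F(0), F(0)], [F(0), F(1), F(0)]

# A strict utility gap implies strict preference.
x = [F(1, 2), F(1, 4), F(1, 4)]
y = [F(1, 4), F(1, 4), F(1, 2)]
gap_implies_preference = expected_utility(x) < expected_utility(y) and strictly_worse(x, y)

# e1 is strictly worse than e2, yet both have utility 0 ...
tie_in_utility = strictly_worse(e1, e2) and expected_utility(e1) == expected_utility(e2)

# ... because lotteries arbitrarily close to e1 are strictly better than e2.
eps = F(1, 10**6)
near_e1 = [1 - eps, 0, eps]
nearby_point_beats_e2 = strictly_worse(e2, near_e1)
```

All three flags come out `True`. The last two together show that `e2` is better than `e1` without being much better in the sense of Definition 3, which is why the utility function assigns them the same value.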
Corollary 5. $\precsim$ is a preference relation (i.e., a total preorder) that satisfies independence and the von Neumann-Morgenstern continuity axiom.
Proof. It is well-known (and straightforward to check) that this follows from the assertion of the theorem.
Corollary 6. $u$ is unique up to positive affine transformations.
Proof. Since $u$ is a VNM utility function for $\precsim$, this follows from the analogous result for that case.
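Corollary 7. Unless $x \approx y$ for all $x, y \in \Delta_N$, for all $r \in \mathbb{R}$ the set $\{x \in \Delta_N : u \cdot x = r\}$ has lower dimension than $\Delta_N$ (i.e., it is the intersection of $\Delta_N$ with a lower-dimensional subspace of $\mathbb{R}^N$).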
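Proof. First, note that the assumption implies that $N \ge 2$. Let $\mathbf{1} \in \mathbb{R}^N$ be given by $\mathbf{1}_i := 1$, $i = 1, \dots, N$, and note that $\Delta_N$ is the intersection of the hyperplane $A := \{x \in \mathbb{R}^N : \mathbf{1} \cdot x = 1\}$ with the closed positive orthant $\mathbb{R}^N_{\ge 0}$. By the theorem, $u$ is not parallel to $\mathbf{1}$, so the hyperplane $B := \{x \in \mathbb{R}^N : u \cdot x = r\}$ is not parallel to $A$. It follows that $A \cap B$ has dimension $N - 2$, and therefore $\{x \in \Delta_N : u \cdot x = r\}$ can have at most this dimension. (It can have smaller dimension or be the empty set if $A \cap B$ only touches or lies entirely outside the positive orthant.)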
Just some feedback: I'm probably about average in math skill here (or maybe below average; the most math I've done is calculus 10 years ago), and with some work I'm able to get through some of this. When I first looked at it I didn't understand anything, but after reading the Wikipedia article on the VNM utility theorem and the always helpful List of Mathematical Symbols I was able to get through most of Lemma 1. I was able to prove it to my satisfaction using the solver in Excel and can follow most of the proof up until "Thus, the result follows"; I don't see how it follows.
Are there any recommendations for slowly improving math skills other than just trying to work through things like this when time permits? Are people willing to host a Google Hangout where they walk through things such as this for those of us who are curious but have difficulty working it out all on our own? (I know I probably could work it all out given enough time, but it's hard to be motivated enough to make the time. When I first found the site, I didn't know about Bayes' theorem or any of the probability theory notation, but I saw its importance and so made sure to spend the time so I can follow it and work it out on my own when needed.)
I think it's a general problem in the way mathematics is taught (at least around here in Finland, and I'm basing this on a rather small number of empirical observations) that the language of mathematics is not very well elaborated: what each symbol stands for, what the logical rule set is for using each symbol, for example that the sigma symbol stands for summation. So even if the students could use their math skills in principle, they end up stumbling in practice because they don't know how to interpret some statement using symbols they're...