Unconditionally Convergent Expected Utility

DanielLC

Expected utility can be expressed as the sum ΣP(X_n)U(X_n). Suppose P(X_n) = 2^-n, and U(X_n) = (-2)^n/n. Then expected utility = Σ2^-n(-2)^n/n = Σ(-1)^n/n = -1+1/2-1/3+1/4-... = -ln(2). Except there's no obvious order in which to add the terms. You could just as well say it's -1+1/2+1/4+1/6+1/8-1/3+1/10+1/12+1/14+1/16-1/5+... = 0. The sum depends on the order in which you add the terms. This is known as conditional convergence.
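As a quick numerical sanity check of the two orderings above, here is a minimal sketch. The function names and the number of terms are my own choices for illustration; the rearrangement takes one negative (odd n) term followed by four positive (even n) terms, matching the second sum.

```python
import math
from itertools import count, islice

def natural_order():
    """Yield (-1)^n / n for n = 1, 2, 3, ... in the usual order."""
    for n in count(1):
        yield (-1) ** n / n

def rearranged_order():
    """Yield the same terms: one negative (odd n), then four positive (even n)."""
    odd = count(1, 2)
    even = count(2, 2)
    while True:
        yield -1 / next(odd)
        for _ in range(4):
            yield 1 / next(even)

def partial_sum(terms, how_many):
    """Sum the first `how_many` terms of an infinite generator."""
    return math.fsum(islice(terms, how_many))

TERMS = 200_000  # arbitrary cutoff; more terms give a closer approximation

natural = partial_sum(natural_order(), TERMS)
rearranged = partial_sum(rearranged_order(), TERMS)

print("natural order:   ", natural, "(compare -ln 2 =", -math.log(2), ")")
print("rearranged order:", rearranged, "(compare 0)")
```

Both generators eventually produce every term of the series exactly once, so the only difference between the two results is the order of summation.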

This is clearly something we want to avoid. Suppose my priors have an unconditionally convergent expected utility. This would mean that ΣP(X_n)|U(X_n)| converges. Now suppose I observe evidence Y. ΣP(X_n|Y)|U(X_n)| = Σ|U(X_n)|P(X_n∩Y)/P(Y) ≤ Σ|U(X_n)|P(X_n)/P(Y) = 1/P(Y)·ΣP(X_n)|U(X_n)|. As long as P(Y) is nonzero, this must also converge.
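The bound can also be checked numerically. This is only a sketch under stated assumptions: the series is truncated at N terms so the sums are finite, and the utility magnitudes |U(X_n)| = 1.5^n and the likelihood P(Y|X_n) = 1/(n+1) are made-up values chosen so that the prior sum converges absolutely.

```python
N = 60  # truncation point; an assumption so that every sum is finite

ns = range(1, N + 1)
prior = [2.0 ** -n for n in ns]           # P(X_n) = 2^-n, as in the post
abs_utility = [1.5 ** n for n in ns]      # |U(X_n)|, hypothetical
likelihood = [1.0 / (n + 1) for n in ns]  # P(Y | X_n), hypothetical

# P(Y) = sum over n of P(X_n) P(Y | X_n)
p_y = sum(p * lik for p, lik in zip(prior, likelihood))

# P(X_n | Y) = P(X_n and Y) / P(Y)
posterior = [p * lik / p_y for p, lik in zip(prior, likelihood)]

prior_abs_eu = sum(p * u for p, u in zip(prior, abs_utility))
posterior_abs_eu = sum(q * u for q, u in zip(posterior, abs_utility))
bound = prior_abs_eu / p_y  # the right-hand side of the inequality

print("posterior sum:", posterior_abs_eu)
print("bound:        ", bound)
```

The inequality holds term by term because P(X_n∩Y) ≤ P(X_n), so it survives any truncation of the series.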

If my prior expected utility is unconditionally convergent, then given any finite amount of evidence, so is my posterior.

This means I only have to come up with a nice prior, and I'll never have to worry about evidence breaking expected utility.

I suspect that this can be made even more powerful, and given any amount of evidence, finite or otherwise, I will almost surely have an unconditionally convergent posterior. Anyone want to prove it?

Now let's look at Pascal's Mugging. The problem here seems to be that someone could very easily give you an arbitrarily powerful threat. However, in order for expected utility to converge unconditionally, either the probability of the threat being carried out must fall faster than the disutility grows, or the probability of the threat being made at all must fall that fast. In other words, either someone threatening 3^^^3 people is so unlikely to carry it out as to make it non-threatening, or the threat itself must be so difficult to make that you don't have to worry about it.
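To spell out the condition in the paragraph above: absolute convergence forces the terms of the series to vanish, so the probability has to shrink faster than the disutility grows. This is a necessary condition, not a sufficient one, and the two distributions in the illustration are my own picks rather than anything from a real prior.

```latex
\sum_{n} P(X_n)\,|U(X_n)| < \infty
\;\Longrightarrow\;
P(X_n)\,|U(X_n)| \to 0
\;\Longrightarrow\;
P(X_n) = o\!\left(\frac{1}{|U(X_n)|}\right).

% Illustration with |U(X_n)| = 2^n:
P(X_n) = 2^{-n}: \quad
\sum_{n \ge 1} 2^{-n}\, 2^{n} = \sum_{n \ge 1} 1 = \infty,
\qquad
P(X_n) = 2 \cdot 3^{-n}: \quad
\sum_{n \ge 1} 2 \left(\tfrac{2}{3}\right)^{n} = 4.
```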

Except you get this result by making up probabilities rather than arriving at them through any rational process. This has been discussed here many times before, including in the sequences and very recently. Downvoted.

I disagree that the above is not a new contribution to thought on this. The issue at stake has to do with restricting the set of permissible utility functions. If we have a probability measure induced by our empirical observations, then it doesn't do any good from a rationalism standpoint to allow non-summable or non-integrable utility functions with respect to that probability measure.

This example shows one such case. Suppose Nature hands me a probability distribution over some sequence of events, P(X_n) = 2^-n. Then there is a meta-probability assignment over the space of utility functions I can assign to the events X_n and it involves the resulting expectations. You can think of it like a Dirichlet distribution.

It makes no sense to speak of utility functions that aren't L^1(problem domain) (respectively, ℓ^1(problem domain)) under the probability measure you believe to be true about the situation.

I think Pascal's mugging suffers from this issue. For any valid probability distribution over the number of lives at stake, I can produce utility functions for valuing lives that produce arbitrarily different output decisions. In reality, though, you can't decouple the choice of a "permissible" utility function from the exact same processes that yield some knowledge or model about the probability distribution over lives threatened.

I could go get some evidence about probability of lives threatened, then internally reflect on how I should choose to assign value to lives, then compute joint probability distributions over both the threatened lives and all my different options for utility functions on the space of threatened lives, then internally reflect on how to value joint configurations of (threatened lives, utility functions over spaces of threatened lives), then compute joint probabilities over the 2-tuple consisting of ( (threatened lives, utility functions over threatened lives), utility functions over 2-tuples of (threatened lives, utility functions over threatened lives) ), and so on ad infinitum.

At some point, because brains have finite computing resources and (at least human brains) have a machine epsilon, I just have to stop this recursive computation, draw a line in the sand, accept some conditional probabilities at some deep ply of the computation, and then integrate my way back all the way down to the decision of choosing a utility function.

Nothing stops me from choosing a utility function that, when coupled with the probabilities that Nature gives me, causes my expectation to fail to be summable (integrable). I could, after all, act like The Ultimate Pessimist and assign a utility of -∞ to every outcome, for example. More realistically, I could choose a utility function that has the same shape as a Cauchy distribution. But in the landscape of meta-goals, or even just correspondence of utility functions to reality, this would be bad for me. How can I make decisions about which bets to accept if I am in a situation where Nature hands me an improper uniform prior over a set of different outcomes, and I choose to have a Cauchy distribution of personal utility over that set of outcomes? The idea of an expectation fails to even exist in that scenario. Hence, scalar multiples of Cauchy distributions don't make much sense viewed as potential utility functions.

The example here of conditional convergence is a very elementary one. More complicated issues like this arise when you think in terms of probability theory and functional analysis on the space of utility functions. But it's a salient example nonetheless. If we choose utility functions such that the resultant expectation calculation includes a conditionally convergent, or worse non-summable, series, then we can't accept or reject bets in a way that has meaningful correspondence to our perceived actual utility. Hence, implicitly, rationalists must adopt some time-saving admissibility criteria for what sorts of functions are even allowed to be utility functions.

Getting rid of conditional convergence, or issues of non-measurability and non-integrability, would seem like intuitively plausible first steps in forming utility functions. Similar to the way that Jaynes showed how consistent formulations of belief in terms of wagers were isomorphic to probability theory, we have similar constraints on consistent use of utility functions. But the Cauchy distribution example above shows that, for utility functions, the restrictions must actually be quite a bit more severe than mere summability.

The fact that this is a problem does not make anything in the post novel. In the grandparent, I linked to discussions of this problem that touched on everything that you discussed here.

"I could go get some evidence about probability of lives threatened, then internally reflect on how I should choose to assign value to lives, then compute joint probability distributions over both the threatened lives and all my different options for utility functions on the space of threatened lives"

Since utility functions are only unique modulo affine transforms, you can't combine them using naive expected utility. The correct method to do so is unknown.
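A toy illustration of why the naive combination is ill-defined (a sketch only: the two utility functions, the 50/50 credences, and the helper `mixed_choice` are all hypothetical). Multiplying one utility function by a positive constant leaves the preferences it encodes unchanged, yet it can flip the choice made by a credence-weighted sum.

```python
def mixed_choice(credences, utilities):
    """Pick the action maximizing the credence-weighted sum of utilities."""
    actions = utilities[0].keys()
    score = {
        a: sum(c * u[a] for c, u in zip(credences, utilities))
        for a in actions
    }
    return max(score, key=score.get)

u1 = {"A": 1.0, "B": 0.0}  # candidate utility function that prefers A
u2 = {"A": 0.0, "B": 2.0}  # candidate utility function that prefers B
credences = [0.5, 0.5]     # hypothetical credence in each candidate

# Positive rescaling of u1: represents exactly the same preferences as u1.
u1_rescaled = {a: 10.0 * v for a, v in u1.items()}

choice_before = mixed_choice(credences, [u1, u2])
choice_after = mixed_choice(credences, [u1_rescaled, u2])

print("with u1:         ", choice_before)
print("with rescaled u1:", choice_after)
```

Since the two runs differ only in an arbitrary choice of scale for u1, the weighted sum is not tracking anything about the underlying preferences alone.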