A fungibility theorem
Restatement of: If you don't know the name of the game, just tell me what I mean to you. Alternative to: Why you must maximize expected utility. Related to: Harsanyi's Social Aggregation Theorem.
Summary: This article describes a theorem, previously described by Stuart Armstrong, that tells you to maximize the expectation of a linear aggregation of your values. Unlike the von Neumann-Morgenstern theorem, this theorem gives you a reason to behave rationally.1
Information theory and the symmetry of updating beliefs
Contents:
1. The beautiful symmetry of Bayesian updating
2. Odds and log odds: a short comparison
3. Further discussion of information
Rationality is all about handling this thing called "information". Fortunately, we live in an era after the rigorous formulation of Information Theory by C.E. Shannon in 1948, a basic understanding of which can actually help you think about your beliefs, in a way similar but complementary to probability theory. Indeed, it has flourished as an area of research exactly because it helps people in many areas of science to describe the world. We should take advantage of this!
The information theory of events, which I'm about to explain, is about as difficult as high school probability. It is certainly easier than the information theory of multiple random variables (which right now is explained on Wikipedia), even though the equations look very similar. If you already know it, this can be a linkable source of explanations to save you writing time :)
So! To get started, what better way to motivate information theory than to answer a question about Bayesianism?
The beautiful symmetry of Bayesian updating
The factor by which observing A increases the probability of B is the same as the factor by which observing B increases the probability of A. This factor is P(A and B)/(P(A)·P(B)), which I'll denote by pev(A,B) for reasons to come. It can vary from 0 to +infinity, and allows us to write Bayes' Theorem succinctly in both directions:
P(A|B)=P(A)·pev(A,B), and P(B|A)=P(B)·pev(A,B)
What does this symmetry mean, and how should it affect the way we think?
A great way to think of pev(A,B) is as a multiplicative measure of mutual evidence, which I'll call mutual probabilistic evidence to be specific. If pev=1, A and B are independent; if pev>1, they make each other more likely; and if pev<1, they make each other less likely.
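To make the symmetry concrete, here is a small numerical check. The joint distribution below is made up purely for illustration (it is not from any real data), and the helper name `prob` is my own. It just confirms that one and the same factor pev(A,B) turns P(A) into P(A|B) and P(B) into P(B|A).

```python
from fractions import Fraction

# Hypothetical joint distribution over two binary events A and B.
# Keys are (A happened, B happened); the numbers are invented for illustration.
joint = {
    (True, True): Fraction(3, 10),
    (True, False): Fraction(1, 10),
    (False, True): Fraction(2, 10),
    (False, False): Fraction(4, 10),
}

def prob(predicate):
    """Total probability of the outcomes (a, b) for which predicate holds."""
    return sum(p for (a, b), p in joint.items() if predicate(a, b))

p_a = prob(lambda a, b: a)
p_b = prob(lambda a, b: b)
p_ab = prob(lambda a, b: a and b)

# Mutual probabilistic evidence: P(A and B) / (P(A) * P(B))
pev = p_ab / (p_a * p_b)

# Conditional probabilities computed directly from their definitions
p_a_given_b = p_ab / p_b
p_b_given_a = p_ab / p_a

# The same factor updates both directions
assert p_a_given_b == p_a * pev
assert p_b_given_a == p_b * pev

print("pev(A,B) =", pev)
```

Here pev comes out greater than 1, so in this toy distribution A and B make each other more likely.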
But two ways to think are better than one, so I will offer a second explanation, in terms of information, which I often find quite helpful in analyzing my own beliefs:
Average utilitarianism must be correct?
I said this in a comment on Real-life entropic weirdness, but it's getting off-topic there, so I'm posting it here.
My original writeup was confusing, because I used some non-standard terminology, and because I wasn't familiar with the crucial theorem. We cleared up the terminological confusion (thanks esp. to conchis and Vladimir Nesov), but the question remains. I rewrote the title yet again, and have here a restatement that I hope is clearer.
- We have a utility function u(outcome) that gives a utility for one possible outcome. (Note the word utility. That means your diminishing marginal utility, and all your preferences, and your aggregation function for a single outcome, are already incorporated into this function. There is no need to analyze u further, as long as we agree on using a utility function.)
- We have a utility function U(lottery) that gives a utility for a probability distribution over all possible outcomes.
- The von Neumann-Morgenstern theorem indicates that, given 4 reasonable axioms about U, the only reasonable form for U is to calculate the expected value of u(outcome) over all possible outcomes. This is why we constantly talk on LW about rationality as maximizing expected utility.
- This means that your utility function U is indifferent to whether the distribution of utility is equitable among your future selves. Giving one future self u=10 and another u=0 is just as good as giving one u=5 and another u=5.
- This is the same ethical judgement that an average utilitarian makes when they say that, to calculate social good, we should calculate the average utility of the population; modulo the problems that population can change and that not all people are equal. This is clearer if you use a many-worlds interpretation, and think of maximizing expected value over possible futures as applying average utilitarianism to the population of all possible future yous.
- Therefore, I think that, if the 4 axioms are valid when calculating U(lottery), they are probably also valid when calculating not our private utility, but a social utility function s(outcome), which sums over people in a similar way to how U(lottery) sums over possible worlds. The theorem then shows that we should set s(outcome) = the average value of all of the utilities for the different people involved. (In other words, average utilitarianism is correct). Either that, or the axioms are inappropriate for both U and s, and we should not define rationality as maximizing expected utility.
- (I am not saying that the theorem reaches down through U to say anything directly about the form of u(outcome). I am saying that choosing a shape for U(lottery) is the same type of ethical decision as choosing a shape for s(outcome); and the theorem tells us what U(lottery) should look like; and if that ethical decision is right for U(lottery), it should also be right for s(outcome).)
- And yet, average utilitarianism asserts that equity of utility, even among equals, has no utility. This is shocking, especially to Americans.
- It is even more shocking that it is thus possible to prove, given reasonable assumptions, which type of utilitarianism is correct. One then wonders what other seemingly arbitrary ethical valuations actually have provable answers given reasonable assumptions.
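The parallel drawn in the list above can be sketched numerically. This is only an illustration under my own assumptions: the function names `expected_utility` and `average_utility` are hypothetical, the two future selves are taken to be equally likely, and the utilities are the u=10/u=0 versus u=5/u=5 figures from the example. It shows that expected utility over possible futures and average utility over people are the same computation, and that neither cares how the utility is spread.

```python
from fractions import Fraction

def expected_utility(lottery):
    """U(lottery): expectation of u over (probability, utility) pairs."""
    assert sum(p for p, _ in lottery) == 1, "probabilities must sum to 1"
    return sum(p * u for p, u in lottery)

def average_utility(utilities):
    """s(outcome) for an average utilitarian: mean utility over people."""
    return Fraction(sum(utilities), len(utilities))

half = Fraction(1, 2)

# Two equally likely future selves (the equal likelihood is an assumption)
eu_unequal = expected_utility([(half, 10), (half, 0)])
eu_equal = expected_utility([(half, 5), (half, 5)])

# Two people in a single outcome, with the same utilities
avg_unequal = average_utility([10, 0])
avg_equal = average_utility([5, 5])

# Expected utility is indifferent to equity among future selves...
assert eu_unequal == eu_equal
# ...and average utilitarianism is indifferent to equity among people.
assert avg_unequal == avg_equal
# With equal probabilities, the two calculations coincide.
assert eu_unequal == avg_unequal
```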
Some problems with average utilitarianism from the Stanford Encyclopedia of Philosophy:
Despite these advantages, average utilitarianism has not obtained much acceptance in the philosophical literature. This is due to the fact that the principle has implications generally regarded as highly counterintuitive. For instance, the principle implies that for any population consisting of very good lives there is a better population consisting of just one person leading a life at a slightly higher level of well-being (Parfit 1984 chapter 19). More dramatically, the principle also implies that for a population consisting of just one person leading a life at a very negative level of well-being, e.g., a life of constant torture, there is another population which is better even though it contains millions of lives at just a slightly less negative level of well-being (Parfit 1984). That total well-being should not matter when we are considering lives worth ending is hard to accept. Moreover, average utilitarianism has implications very similar to the Repugnant Conclusion (see Sikora 1975; Anglin 1977).
(If you assign different weights to the utilities of different people, we could probably get the same result by considering a person with weight W to be equivalent to W copies of a person with weight 1.)
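A quick check of that parenthetical, as a sketch: it assumes the weights are positive integers (so that "W copies" makes literal sense), and the weights, utilities, and function names are invented for illustration.

```python
from fractions import Fraction

def weighted_average(weights, utilities):
    """Average utility where person i counts with weight weights[i]."""
    total = sum(w * u for w, u in zip(weights, utilities))
    return Fraction(total, sum(weights))

def average_over_copies(weights, utilities):
    """Plain average after replacing a weight-W person with W weight-1 copies."""
    expanded = [u for w, u in zip(weights, utilities) for _ in range(w)]
    return Fraction(sum(expanded), len(expanded))

# Hypothetical population: positive integer weights and arbitrary utilities
weights = [3, 1, 2]
utilities = [4, 10, 1]

weighted = weighted_average(weights, utilities)
copied = average_over_copies(weights, utilities)

# Both views give the same social utility
assert weighted == copied
```

Non-integer weights would need a different argument (for example, rescaling rational weights to a common denominator); the sketch does not cover that case.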