# Scott Aaronson on Born Probabilities

This post attempts to popularize some of Scott Aaronson's lectures and research results relating to Born probabilities. I think they represent a significant step towards answering the question "Why Born's rule?" but do not seem to be very well known. Prof. Aaronson writes frequently on his popular blog, Shtetl-Optimized, but is apparently too modest to use it to do much promotion of his own ideas. I hope he doesn’t mind that I take up this task (and that he forgives any errors and misunderstandings I may have committed here).

Before I begin, I want to point out something that has been bugging me about the fictional Ebborian physics, which will eventually lead us to Aaronson's ideas. So, let’s first recall the following passage from Eliezer’s story:

"And we also discovered," continues Po'mi, "that our very planet of Ebbore, including all the people on it, has a four-dimensional thickness, and is constantly fissioning along that thickness, just as our brains do. Only the fissioned sides of our planet do not remain in contact, as our new selves do; the sides separate into the fourth-dimensional void."

…

"Well," says Po'mi, "when the world splits down its four-dimensional thickness, it does not always split exactly evenly. Indeed, it is not uncommon to see nine-tenths of the four-dimensional thickness in one side."

...

"Now," says Po'mi, "if fundamental physics has nothing to do with consciousness, can you tell me why the subjective probability of finding ourselves in a side of the split world, should be exactly proportional to the square of the thickness of that side?"

Ok, so here’s the part that’s been bugging me. Suppose an Ebborian world splits twice: first into 1/3 and 2/3 of the original thickness (slices A and B respectively), and then the B slice splits exactly in half, into two slices of 1/3 thickness each (C and D). Before the splitting, with what probability should you anticipate ending up in slices A, C and D? Well, according to the squaring rule, the squared thicknesses 1/9 and 4/9 normalize to a 1/5 chance of ending up in A and a 4/5 chance of ending up in B. Those in B then have an equal chance of ending up in C or D, so each of those slices gets a final probability of 2/5.
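For anyone who wants to check the arithmetic, here is a minimal Python sketch of this sequential reading of the squaring rule. The helper name `split_probs` is made up for illustration, and the code assumes the rule is applied to relative thicknesses at each split, as in the paragraph above.

```python
from fractions import Fraction

def split_probs(thicknesses):
    """Squaring rule for a single split: each side's probability is
    proportional to the square of its thickness, normalized over the
    sides of this split only."""
    squares = [t * t for t in thicknesses]
    total = sum(squares)
    return [s / total for s in squares]

# First split: A gets 1/3 of the original thickness, B gets 2/3.
p_a, p_b = split_probs([Fraction(1, 3), Fraction(2, 3)])

# Second split: B splits exactly in half into C and D.
p_c_given_b, p_d_given_b = split_probs([Fraction(1, 3), Fraction(1, 3)])
p_c = p_b * p_c_given_b
p_d = p_b * p_d_given_b

print(p_a, p_c, p_d)  # 1/5 2/5 2/5
```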

Well, that’s not how quantum branching works! There, the probability of ending up in any branch depends only on the final amplitude of that branch, not on the order in which branching occurred. This makes perfect sense, since decoherence is not an instantaneous process, and thinking of it as branching is only an approximation, because worlds never completely split off and become totally independent of one another. In QM, the “order of branching” is not even well defined, so how can probabilities depend on it?

Suppose we want to construct an Ebborian physics where, as in QM, the probability of ending up in any slice depends only on the thickness of that slice, and not on the order in which splitting occurs. How do we go about doing that? Simple: we just make that probability a function of the absolute thickness of a slice, instead of having it depend on the relative thickness at each splitting.

So let’s say that the subjective probability of ending up in any slice is proportional to the square of the absolute thickness of that slice, and consider the above example again. When the world splits into A and B, the probabilities are again 1/5 and 4/5 respectively. But when B splits again into C and D, A goes from probability 1/5 to 1/3, and C and D each get 1/3. That’s pretty weird… what’s going on this time?
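The same example under the absolute-thickness rule can be sketched in a few lines of Python. The function name `absolute_rule` is hypothetical; it just squares every current slice's thickness and renormalizes over all slices that exist at that moment.

```python
from fractions import Fraction

def absolute_rule(thicknesses):
    """Probability of each slice is proportional to the square of its
    absolute thickness, renormalized over all current slices."""
    squares = {name: t * t for name, t in thicknesses.items()}
    total = sum(squares.values())
    return {name: s / total for name, s in squares.items()}

# After the first split: A = 1/3, B = 2/3.
before = absolute_rule({"A": Fraction(1, 3), "B": Fraction(2, 3)})

# After B splits in half: A = C = D = 1/3.
after = absolute_rule({"A": Fraction(1, 3),
                       "C": Fraction(1, 3),
                       "D": Fraction(1, 3)})

print(before["A"], after["A"])  # 1/5 1/3
```

Nothing happened to slice A between the two calls, yet its probability changed, which is exactly the weirdness described above.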

To use Aaronson’s language, splitting is not a 2-norm preserving transformation; it only preserves the 1-norm. Or to state this more plainly, splitting conserves the sum of the individual slices’ thicknesses, but not the sum of the squares of the individual thicknesses. So in order to apply the squaring rule and get a set of probabilities that sum to 1 at the end, we have to renormalize, and this renormalizing can cause the probability of a slice to go up or down, depending purely on what happens to *other* slices that it otherwise would have nothing to do with.

Note that in quantum mechanics, the evolution of a wavefunction always preserves its 2-norm, not its 1-norm (nor p-norm for any p≠2). If we were to use any probability rule other than the squaring rule in QM, we would have to renormalize and thereby encounter this same issue: the probability of a branch would go up or down depending on other parts of the wavefunction that it otherwise would have little interaction with.
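To make the contrast concrete, here is a small Python sketch comparing the two kinds of evolution. It is only an illustration under my own choice of examples: a Hadamard gate acting on a single real-amplitude qubit stands in for "quantum evolution", and the 1/3 and 2/3 split from earlier stands in for "Ebborian splitting". The helper names `p_norm` and `hadamard` are made up here.

```python
import math

def p_norm(vec, p):
    """The p-norm of a vector: (sum of |x|^p) ** (1/p)."""
    return sum(abs(x) ** p for x in vec) ** (1.0 / p)

def hadamard(state):
    """Apply the Hadamard gate (a unitary) to a two-component state."""
    a, b = state
    s = 1 / math.sqrt(2)
    return [s * (a + b), s * (a - b)]

# Quantum case: a unitary keeps the 2-norm fixed but changes the 1-norm.
qubit_before = [1.0, 0.0]
qubit_after = hadamard(qubit_before)

# Ebborian case: splitting keeps total thickness (the 1-norm) fixed
# but changes the 2-norm.
slices_before = [1.0]
slices_after = [1 / 3, 2 / 3]

for label, before, after in [("qubit", qubit_before, qubit_after),
                             ("slices", slices_before, slices_after)]:
    for p in (1, 2):
        print(label, "p =", p, p_norm(before, p), "->", p_norm(after, p))
```

So the squaring rule needs no renormalization in the quantum case, and the "thickness itself" rule would need none in the Ebborian case; mixing them is what forces the renormalization.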

At this point you might ask, “Ok, this seems unusual and counterintuitive, but a lot of physics is counterintuitive. Is there some other argument that the probability rule shouldn’t involve renormalization?” And the answer to that is yes, because to live in a world with probability renormalization would be to have magical powers, including the ability to solve NP-complete problems in polynomial time. (And to turn this into a full anthropic explanation of the Born rule, similar to the anthropic explanations for other physical laws and constants, we just have to note that intelligence seems to have little evolutionary value in such a world. But that’s my position, not Aaronson’s, or at least he hasn’t argued for this additional step in public, as far as I know.)

Aaronson actually proved that problems in PP, which are commonly believed to be even harder than NP problems, can be solved in polynomial time using “fantasy” quantum computers that use variants of Born’s rule where the exponent doesn't equal 2. But it turns out that the power of these computers has nothing to do with quantum computing, but instead has everything to do with probability renormalization. So here I’ll show how we can solve NP-complete (instead of PP since it’s easier to think about) problems in polynomial time using the modified Ebborian physics that I described above.

The idea is actually very easy to understand. Each time an Ebborian world slice splits, its descendant slices decrease in total probability, while every other slice increases in probability. (Recall how when B split, A’s probability went from 1/5 to 1/3, and B’s 4/5 became a total of 2/3 for C and D.) So to take advantage of this, we first split our world into an exponential number of slices of equal thickness, and let each slice try a different possible solution to the NP-complete problem. If a slice finds that its candidate solution is a correct one, then it does nothing; otherwise it splits itself a large number of times. Since that greatly decreases the probabilities of the slices that split, and increases the probabilities of the slices that didn’t, we should expect to find ourselves in one of the latter kind of slices when the computation finishes, which (surprise!) happens to be one that found a correct solution. Pretty neat, right?
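Here is a toy Python sketch of the probabilities involved, under assumptions of my own: the "problem" is a made-up three-variable formula, and "splits itself a large number of times" is modeled as `k` successive halvings, so that one failing slice becomes `2**k` slices each `2**k` times thinner. The names `satisfies` and `prob_of_success` are hypothetical. Note that the script itself enumerates every candidate and so takes exponential time; it only computes the odds an Ebborian would face, it does not reproduce the speedup.

```python
from fractions import Fraction
from itertools import product

def satisfies(assignment):
    """Toy formula (an assumption for illustration):
    (x or y) and (not x or z) and (not y or not z)."""
    x, y, z = assignment
    return (x or y) and ((not x) or z) and ((not y) or (not z))

def prob_of_success(n_vars, check, k):
    """Probability, under the absolute squaring rule, of ending up in a
    slice whose candidate passed `check`, when failing slices halve
    themselves k times."""
    start = Fraction(1, 2 ** n_vars)  # thickness of each initial slice
    good_weight = Fraction(0)
    bad_weight = Fraction(0)
    for assignment in product([False, True], repeat=n_vars):
        if check(assignment):
            # A successful slice does nothing and keeps its thickness.
            good_weight += start ** 2
        else:
            # k halvings: 2**k descendants, each of thickness start / 2**k.
            piece = start / 2 ** k
            bad_weight += (2 ** k) * piece ** 2
    return good_weight / (good_weight + bad_weight)

print(prob_of_success(3, satisfies, 0))   # 1/4
print(prob_of_success(3, satisfies, 10))  # 1024/1027
```

With no extra splitting, the chance of landing in a successful slice is just the fraction of candidates that work (2 out of 8 here). Each additional halving cuts the failing slices' total squared thickness in half, so a number of halvings that grows only linearly with the number of variables is enough to swamp the exponentially many failing slices.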

**ETA:** The lecture notes and papers I linked to also give explanations for other aspects of quantum mechanics, such as why it is linear, and why it preserves the 2-norm and not some other p-norm. Read them to find out more.

## Comments (7)

WeiDai, good work on reading through Scott's PostBQP=PP paper and explaining the BQP_p contained in NP result so clearly.

Wei, the relationship between computing power and the probability rule is interesting, but doesn't do much to explain Born's rule.

In the context of a many worlds interpretation, which I have to assume you are using since you write of splitting, it is a mistake to work with probabilities directly. Because the sum is always normalized to 1, probabilities deal (in part) with global information about the multiverse, but people easily forget that and think of them as local. The proper quantity to use is measure, which is the amount of consciousness that each type of observer has, such that effective probability is proportional to measure (by summing over the branches and normalizing). It is important to remember that total measure need not be conserved as a function of time.

So for the Ebborian example, if *measure* is proportional to the thickness squared, the fact that the *probability* of a slice can go up or down, depending purely on what happens to other slices that it otherwise would have nothing to do with, is neither surprising nor counterintuitive. The measure, of course, would not be affected by what the other slices do. It is just like saying that if the population of China were to increase, and other countries had constant population, then the effective probability that a typical person is American would decrease.

The second point is that, even supposing that quantum computers could solve hard math problems in polynomial time, your claim that intelligence would have little evolutionary value is both utterly far-fetched (quantum computers are hard to make, and nonlinear ones could be even harder) and irrelevant if we believe - as typical Everettians do - that the Born rule is not a separate rule but must follow from the wave equation. Even supposing intelligence required the Born rule, that would just tell us that the Born rule is true - but we already know that. The question is, why would it follow from the wave equation? If the Born rule is a separate rule, that suggests dualism or hidden variables, which bring in other possibilities for probability rules.

Actually there are already many other possibilities for probability rules. A lot of people, when trying to derive the Born rule, start out assuming that probabilities depend only on branch amplitudes. We know that seems true, but not why, so we can't start out assuming it. For example, probabilities could have been proportional to brain size.

These issues are discussed in my eprints, e.g. Decision Theory is a Red Herring for the Many Worlds Interpretation http://arxiv.org/abs/0808.2415

So, this

combined with this

implies the Born rule?

The conclusion also needs that what happens in one branch can't renormalize the probability of another branch.

And further, if the wavefunction didn't need to preserve the 2-norm, then presumably there would be no computational advantage to probability being split by absolute value of amplitude instead of squared absolute value?

(By the way, how come they don't just say "product of wave function with complex conjugate" instead of "square of absolute value of wavefunction", since the magnitudes are the same?)

Because "square of absolute value of wavefunction" is shorter and clearer than "product of wave function with complex conjugate", I'm guessing.

(Counting by words, syllables or letters gives that result; I also tried counting morae and ended up with the same count for both.)

True. But wouldn't your meaning be clear if you used the shorter term "wavefunction conjugate product"? Then again, maybe not.