I suggested that, in some situations, questions like "What is your posterior probability?" might not have answers, unless they are part of decision problems like "What odds should you bet at?" or even "What should you rationally anticipate to get a brain that trusts rational anticipation?". You didn't comment on the suggestion, so I thought about problems you might have seen in it.
In the suggestion, the "correct" subjective probability depends on a utility function and a UDT/TDT agent's starting probabilities, which never change. The most important way the suggestion is incomplete is that it doesn't itself explain something we do naturally: we care about the way our "existentness" has "flowed" to us, and if we learn things about how "existentness" or "experiencedness" works, we change what we care about. So when we experiment on quantum systems, and we get experimental statistics that are more probable under a Born rule with a power of 2 than (hand-waving normalization problems) under a Born rule with a power of 4, we change our preferences, so that we care about what happens in possible future worlds in proportion to their integrated squared amplitude, and not in proportion to the integral of the fourth power. But, if there were people who consistently got experimental statistics that were more probable under a Born rule with a power of 4 (whatever that would mean), we would want them to care about possible future worlds in proportion to the integral of the fourth power of their amplitude.
This can even be done in classical decision theory. Suppose you were creating an agent to be put into a world with Ebborean physics, and you had uncertainty about whether, in the law relating world-thickness ratios (at splitting time) to "existentness" ratios, the power was 2 or 4. It would be easy to put a prior probability of 1/2 on each power, and then have "the agent" update from measurements of the relative thicknesses of the sides of the split worlds it (i.e. its local copy) ended up on. But this doesn't explain why you would want to do that.
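The classical update described above can be sketched in a few lines. This is only an illustration under stated assumptions: a split is summarized by the thickness fraction `t` of the side the local copy ended up on, and under a power-`k` law the chance of landing on that side is taken to be `t**k / (t**k + (1 - t)**k)`. The function names are hypothetical and not part of the original suggestion.

```python
def side_probability(t, power):
    """Chance of landing on the side with thickness fraction t,
    assuming existentness is proportional to thickness**power."""
    return t**power / (t**power + (1.0 - t)**power)


def posterior_over_powers(observed_fractions, prior=None):
    """Bayesian update over the candidate powers.

    observed_fractions: thickness fractions of the sides the agent's
    local copy ended up on, one entry per observed split.
    prior: dict mapping power -> prior probability (default 1/2 each).
    """
    weights = dict(prior or {2: 0.5, 4: 0.5})
    for t in observed_fractions:
        for power in weights:
            # Multiply in the likelihood of this observation.
            weights[power] *= side_probability(t, power)
    total = sum(weights.values())
    return {power: w / total for power, w in weights.items()}


if __name__ == "__main__":
    # Repeatedly landing on the thicker side favors the larger power.
    print(posterior_over_powers([0.6, 0.6, 0.6]))
```

Under these assumptions, an agent that keeps finding itself on thicker sides shifts weight toward the power of 4, and one that keeps finding itself on thinner sides shifts weight toward the power of 2.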
What would a UDT/TDT prior belief distribution or utility function have to look like in order to define agents that can "update" in this way, while only thinking in terms of copying and not subjective probability? Suppose you were creating an agent to be put into a world with Ebborean physics, and you had uncertainty about whether, in the relation between world thickness ratios and "existentness" ratios, the power was 2 or 4. And this time, suppose the agent was to be an updateless decision theory agent. I think a UDT agent which uses "probability" can be converted by an expected utility calculation into a behaviorally equivalent UDT agent which uses no probability. Instead of probability, the agent uses only "importances": relative strengths of its (linearly additive) preferences about what happens in the various deterministic worlds the agent "was" copied into at the time of its creation. To make such an agent in Ebborean physics "update" on "evidence" about existentness, you could take the relative importance you assigned to influencing world-sheets, split it into two halves, and distribute each half across world-sheets in a different way. Half of the importance would be distributed in proportion to the cumulative products of the squares of the worlds' thickness ratios at their times of splitting, and half of the importance would be distributed in proportion to the cumulative products of the fourth powers of the worlds' thickness ratios at their times of splitting. Then, in each world-sheet, the copy of the agent in that world-sheet would make some measurements of the relative thicknesses on its side of a split, and it would use those measurements to decide what kinds of local futures it should prioritize influencing.
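Here is a rough sketch of the importance bookkeeping just described, with the normalization problems hand-waved by assumption: each half of the importance is normalized to sum to 1/2 over the final world-sheets. A world-sheet is labeled by which side (`"A"` or `"B"`) it took at each split, where side `"A"` has thickness fraction `t`. All names here are hypothetical.

```python
from itertools import product
from math import prod


def importance_shares(split_fractions, powers=(2, 4)):
    """Return {world_sheet: {power: share_of_importance}}.

    split_fractions[i] is the thickness fraction of side "A" at split i;
    side "B" has fraction 1 - t. Each power gets an equal slice of the
    total importance, spread over world-sheets in proportion to the
    cumulative product of thickness**power along the sheet's history.
    """
    sheets = list(product("AB", repeat=len(split_fractions)))
    shares = {sheet: {} for sheet in sheets}
    for power in powers:
        raw = {
            sheet: prod((t if side == "A" else 1.0 - t) ** power
                        for side, t in zip(sheet, split_fractions))
            for sheet in sheets
        }
        total = sum(raw.values())
        for sheet in sheets:
            shares[sheet][power] = raw[sheet] / total / len(powers)
    return shares


def local_weighting(shares, sheet):
    """Relative weight each power's slice carries inside one world-sheet."""
    total = sum(shares[sheet].values())
    return {power: s / total for power, s in shares[sheet].items()}


if __name__ == "__main__":
    shares = importance_shares([0.6, 0.6])
    print(local_weighting(shares, ("A", "A")))
```

Under these assumptions, the local weighting inside a world-sheet matches what a Bayesian update on that sheet's split history would give, even though the agent never stores a probability. That is the sense in which the copy "updates".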
But, again, this doesn't explain why you would want to do that. (Maybe you wanted the agents to take a coordinated action at the end of time using the world-sheets they controlled, and you didn't know which kinds of world-sheets would become good general-purpose resources for that action?)
I think there is another way my suggestion is incomplete, which has something to do with the way your definition of altruism doesn't work without a definition of "correct" subjective probability. But I don't remember what your definition of altruism was or why it didn't work without subjective probability.
I still think the right way to answer the question, "What is the correct subjective probability?" might be partly to derive "Bayesian updating" as an approximation that can be used by computationally limited agents implementing an updateless or other decision theory, with a utility function defined over mathematical descriptions of worlds containing some number of copies of the agent, when the differences in utility that result from the agent's decisions fulfill certain independence and linearity assumptions. I need to mathematically formalize those assumptions. "Subjective probability" would then be a variable used in that approximation, which would be meaningless or undefined when the assumptions failed.
Speaking of problems I don't know how to solve, here's one that's been gnawing at me for years.
The operation of splitting a subjective worldline seems obvious enough - the skeptical initiate can consider the Ebborians, creatures whose brains come in flat sheets and who can symmetrically divide down their thickness. The more sophisticated need merely consider a sentient computer program: stop, copy, paste, start, and what was one person has now continued on in two places. If one of your future selves will see red, and one of your future selves will see green, then (it seems) you should anticipate seeing red or green when you wake up with 50% probability. That is, it's a known fact that different versions of you will see red, or alternatively green, and you should weight the two anticipated possibilities equally. (Consider what happens when you're flipping a quantum coin: half your measure will continue into either branch, and subjective probability will follow quantum measure for unknown reasons.)
But if I make two copies of the same computer program, is there twice as much experience, or only the same experience? Does someone who runs redundantly on three processors get three times as much weight as someone who runs on one processor?
Let's suppose that three copies get three times as much experience. (If not, then, in a Big universe, large enough that at least one copy of anything exists somewhere, you run into the Boltzmann Brain problem.)
Just as computer programs or brains can split, they ought to be able to merge. If we imagine a version of the Ebborian species that computes digitally, so that the brains remain synchronized so long as they go on getting the same sensory inputs, then we ought to be able to put two brains back together along the thickness, after dividing them. In the case of computer programs, we should be able to perform an operation where we compare each pair of corresponding bits in the two programs, and if they are the same, copy them, and if they are different, delete the whole program. (This seems to establish an equal causal dependency of the final program on the two original programs that went into it. E.g., if you test the causal dependency via counterfactuals, then disturbing any bit of either original results in the final program being completely different (namely deleted).)
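The merge operation for programs can be written down directly. This is a minimal sketch that assumes a suspended program is just a `bytes` snapshot; the function names are hypothetical. `None` stands for "the whole program was deleted".

```python
from typing import Iterable, Optional


def merge_programs(a: bytes, b: bytes) -> Optional[bytes]:
    """Merge two suspended programs.

    If every corresponding bit matches, return one copy of the common
    program; if any bit differs, return None (the program is deleted).
    """
    if len(a) != len(b):
        return None
    for byte_a, byte_b in zip(a, b):
        if byte_a ^ byte_b:  # nonzero XOR means at least one bit differs
            return None
    return bytes(a)


def merge_all(copies: Iterable[bytes]) -> Optional[bytes]:
    """Merge any number of copies by repeated pairwise merging."""
    merged: Optional[bytes] = None
    for index, snapshot in enumerate(copies):
        if index == 0:
            merged = bytes(snapshot)
            continue
        merged = merge_programs(merged, snapshot)
        if merged is None:
            return None  # one mismatch anywhere deletes the result
    return merged


if __name__ == "__main__":
    print(merge_all([b"mind"] * 5))        # identical copies survive
    print(merge_all([b"mind", b"mine"]))   # a single differing bit deletes
```

The counterfactual test from the paragraph above shows up here as the fact that flipping any one bit in any input changes the output from the program to `None`.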
So here's a simple algorithm for winning the lottery:
Buy a ticket. Suspend your computer program just before the lottery drawing - which should of course be a quantum lottery, so that every ticket wins somewhere. Program your computational environment to, if you win, make a trillion copies of yourself, and wake them up for ten seconds, long enough to experience winning the lottery. Then suspend the programs, merge them again, and start the result. If you don't win the lottery, then just wake up automatically.
The odds of winning the lottery are ordinarily a billion to one. But now the branch in which you win has your "measure", your "amount of experience", temporarily multiplied by a trillion. So with the brief expenditure of a little extra computing power, you can subjectively win the lottery - be reasonably sure that when next you open your eyes, you will see a computer screen flashing "You won!" As for what happens ten seconds after that, you have no way of knowing how many processors you run on, so you shouldn't feel a thing.
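The arithmetic behind "reasonably sure" can be checked with a short calculation. This is a sketch under the assumption that anticipation is weighted by quantum measure times the number of running copies, and it treats "a billion to one" as a win probability of 1 in 10^9; the function name is hypothetical.

```python
from fractions import Fraction


def subjective_win_probability(p_win, copies_if_win, copies_if_lose=1):
    """Copy-weighted chance of experiencing the winning branch.

    Assumes each branch's weight is its ordinary probability multiplied
    by the number of copies awake in that branch.
    """
    win_weight = p_win * copies_if_win
    lose_weight = (1 - p_win) * copies_if_lose
    return win_weight / (win_weight + lose_weight)


if __name__ == "__main__":
    p = subjective_win_probability(Fraction(1, 10**9), 10**12)
    print(float(p))  # copy-weighted chance while the trillion copies are awake
```

With a trillion copies in the winning branch, the winning branch outweighs the losing one by roughly a thousand to one. With one copy in each branch, the function just returns the ordinary odds.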
Now you could just bite this bullet. You could say, "Sounds to me like it should work fine." You could say, "There's no reason why you shouldn't be able to exert anthropic psychic powers." You could say, "I have no problem with the idea that no one else could see you exerting your anthropic psychic powers, and I have no problem with the idea that different people can send different portions of their subjective futures into different realities."
I find myself somewhat reluctant to bite that bullet, personally.
Nick Bostrom, when I proposed this problem to him, offered that you should anticipate winning the lottery after five seconds, but anticipate losing the lottery after fifteen seconds.
To bite this bullet, you have to throw away the idea that your joint subjective probabilities are the product of your conditional subjective probabilities. If you win the lottery, the subjective probability of having still won the lottery, ten seconds later, is ~1. And if you lose the lottery, the subjective probability of having lost the lottery, ten seconds later, is ~1. But we don't have p("experience win after 15s") = p("experience win after 15s"|"experience win after 5s")*p("experience win after 5s") + p("experience win after 15s"|"experience not-win after 5s")*p("experience not-win after 5s").
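To see how badly the product rule fails under this proposal, plug in rough numbers. This sketch assumes the copy-weighted anticipation of winning at five seconds is about 1000/1001 (a trillion copies against odds of one in a billion), that the anticipation at fifteen seconds is back to one in a billion after the merge, and that the conditionals are exactly 1 and 0 rather than approximately so. The names are hypothetical.

```python
def total_probability(p_a, p_b_given_a, p_b_given_not_a):
    """Right-hand side of the law of total probability for event B."""
    return p_b_given_a * p_a + p_b_given_not_a * (1.0 - p_a)


# Assumed illustrative values, not derived from any real lottery.
P_WIN_AT_5S = 1000.0 / 1001.0   # while the trillion copies are awake
P_WIN_AT_15S = 1e-9             # after the copies have been merged

# What the conditionals would imply for the fifteen-second mark.
implied_at_15s = total_probability(
    P_WIN_AT_5S,
    p_b_given_a=1.0,       # win at 5s -> still won at 15s
    p_b_given_not_a=0.0,   # not-win at 5s -> still not won at 15s
)

if __name__ == "__main__":
    print(implied_at_15s, P_WIN_AT_15S)
```

On these numbers the two sides of the equation differ by about nine orders of magnitude, which is the sense in which joint probabilities stop being products of conditionals.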
I'm reluctant to bite that bullet too.
And the third horn of the trilemma is to reject the idea of the personal future - that there's any meaningful sense in which I can anticipate waking up as myself tomorrow, rather than Britney Spears. Or, for that matter, that there's any meaningful sense in which I can anticipate being myself in five seconds, rather than Britney Spears. In five seconds there will be an Eliezer Yudkowsky, and there will be a Britney Spears, but it is meaningless to speak of the current Eliezer "continuing on" as Eliezer+5 rather than Britney+5; these are simply three different people we are talking about.
There are no threads connecting subjective experiences. There are simply different subjective experiences. Even if some subjective experiences are highly similar to, and causally computed from, other subjective experiences, they are not connected.
I still have trouble biting that bullet for some reason. Maybe I'm naive, I know, but there's a sense in which I just can't seem to let go of the question, "What will I see happen next?" I strive for altruism, but I'm not sure I can believe that subjective selfishness - caring about your own future experiences - is an incoherent utility function; that we are forced to be Buddhists who dare not cheat a neighbor, not because we are kind, but because we anticipate experiencing their consequences just as much as we anticipate experiencing our own. I don't think that, if I were really selfish, I could jump off a cliff knowing smugly that a different person would experience the consequence of hitting the ground.
Bound to my naive intuitions that can be explained away by obvious evolutionary instincts, you say? It's plausible that I could be forced down this path, but I don't feel forced down it quite yet. It would feel like a fake reduction. I have rather the sense that my confusion here is tied up with my confusion over what sort of physical configurations, or cascades of cause and effect, "exist" in any sense and "experience" anything in any sense, and flatly denying the existence of subjective continuity would not make me feel any less confused about that.
The fourth horn of the trilemma (as 'twere) would be denying that two copies of the same computation had any more "weight of experience" than one; but in addition to the Boltzmann Brain problem in large universes, you might develop similar anthropic psychic powers if you could split a trillion times, have each computation view a slightly different scene in some small detail, forget that detail, and converge the computations so they could be reunified afterward - then you were temporarily a trillion different people who all happened to develop into the same future self. So it's not clear that the fourth horn actually changes anything, which is why I call it a trilemma.
I should mention, in this connection, a truly remarkable observation: quantum measure seems to behave in a way that would avoid this trilemma completely, if you tried the analogue using quantum branching within a large coherent superposition (e.g. a quantum computer). If you quantum-split into a trillion copies, those trillion copies would have the same total quantum measure after being merged or converged.
It's a remarkable fact that the one sort of branching we do have extensive actual experience with - though we don't know why it behaves the way it does - seems to behave in a very strange way that is exactly right to avoid anthropic superpowers and goes on obeying the standard axioms for conditional probability.
In quantum copying and merging, every "branch" operation preserves the total measure of the original branch, and every "merge" operation (which you could theoretically do in large coherent superpositions) likewise preserves the total measure of the incoming branches.
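A toy calculation makes the contrast with processor-counting concrete. This is only a sketch under simplifying assumptions: a branch is represented by a single complex amplitude, measure is squared amplitude, a split divides the amplitude equally among `n` branches, and the merge shown here only undoes such an equal split. The function names are hypothetical.

```python
import math


def measure(amplitudes):
    """Total measure: the sum of squared amplitude magnitudes."""
    return sum(abs(a) ** 2 for a in amplitudes)


def quantum_split(amplitude, n):
    """Split one branch into n equal branches, conserving total measure."""
    return [amplitude / math.sqrt(n)] * n


def quantum_merge(amplitudes):
    """Recombine equal branches into one (inverse of quantum_split)."""
    n = len(amplitudes)
    return sum(amplitudes) / math.sqrt(n)


def classical_copy_weight(weight, n):
    """Processor-counting rule: n copies carry n times the weight."""
    return weight * n


if __name__ == "__main__":
    original = complex(0.6, 0.0)
    branches = quantum_split(original, 1000)
    print(measure([original]), measure(branches))
    print(measure([quantum_merge(branches)]))
    print(classical_copy_weight(measure([original]), 1000))
```

In the quantum bookkeeping the total stays the same before the split, during it, and after the merge, so there is no moment at which the branch is temporarily "heavier". Under the copy-counting rule the weight is multiplied by the number of copies, which is exactly what the lottery trick exploits.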
Great for QM. But it's not clear to me at all how to set up an analogous set of rules for making copies of sentient beings, in which the total number of processors can go up or down and you can transfer processors from one set of minds to another.
To sum up:
I will be extremely impressed if Less Wrong solves this one.