This post is a summary of the different positions expressed in the comments to my previous post and elsewhere on LW. The central issue turned out to be assigning "probabilities" to individual theories within an equivalence class of theories that yield identical predictions. Presumably we must prefer shorter theories to their longer versions even when they are equivalent. For example, is "physics as we know it" more probable than "Odin created physics as we know it"? Is the Hamiltonian formulation of classical mechanics a priori more probable than the Lagrangian formulation? Is the definition of reals via Dedekind cuts "truer" than the definition via binary expansions? And are these all really the same question in disguise?
One attractive answer, given by shokwave, says that our intuitive concept of "complexity penalty" for theories is really an incomplete formalization of "conjunction penalty". Theories that require additional premises are less likely to be true, according to the eternal laws of probability. Adding premises like "Odin created everything" makes a theory less probable and also happens to make it longer; this is the entire reason why we intuitively agree with Occam's Razor in penalizing longer theories. Unfortunately, this answer seems to be based on a concept of "truth" granted from above - but what do differing degrees of truth actually mean, when two theories make exactly the same predictions?
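shokwave's point can be restated in one line of probability (the labels are illustrative, not a formal model):

$$P(\text{physics} \land \text{Odin}) = P(\text{physics}) \cdot P(\text{Odin} \mid \text{physics}) \le P(\text{physics}),$$

with equality only when the extra premise is certain, so tacking on "Odin created everything" can never make the theory more probable.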
Another intriguing answer came from JGWeissman. Apparently, as we learn new physics, we tend to discard inconvenient versions of old formalisms. So electromagnetic potentials turn out to be "more true" than electromagnetic fields because they carry over to quantum mechanics much better. I like this answer because it seems to be very well-informed! But what shall we do after we discover all of physics, and still have multiple equivalent formalisms - do we have any reason to believe simplicity will still work as a deciding factor? And the question remains, which definition of real numbers is "correct" after all?
Eliezer, bless him, decided to take a more naive view. He merely pointed out that our intuitive concept of "truth" does seem to distinguish between "physics" and "God created physics", so if our current formalization of "truth" fails to tell them apart, the flaw lies with the formalism rather than with us. I have a lot of sympathy for this answer as well, but it looks rather like a mystery to be solved. I never expected to become entangled in a controversy over the notion of truth on LW, of all places!
A final and most intriguing answer of all came from saturn, who alluded to a position held by Eliezer and sharpened by Nesov. After thinking it over for a while, I generated a good contender for the most confused argument ever expressed on LW. Namely, I'm going to completely ignore the is-ought distinction and use morality to prove the "strong" version of Occam's Razor - that shorter theories are more "likely" than equivalent longer versions. You ready? Here goes:
Imagine you have the option to put a human being in a sealed box where they will be tortured for 50 years and then incinerated. No observational evidence will ever leave the box. (For added certainty, fling the box away at near lightspeed and let the expansion of the universe ensure that you can never reach it.) Now consider the following physical theory: as soon as you seal the box, our laws of physics will make a localized exception and the victim will spontaneously vanish from the box. This theory makes exactly the same observational predictions as your current best theory of physics, so it lies in the same equivalence class and you should give it the same credence. If you're still reluctant to seal the box, it looks like you already are a believer in the "strong Occam's Razor" saying simpler theories without local exceptions are "more true". QED.
It's not clear what, if anything, the above argument proves. It probably has no consequences in reality, because no matter how seductive it sounds, skipping over the is-ought distinction is not permitted. But it makes for a nice koan to meditate on weird matters like "probability as preference" (due to Nesov and Wei Dai) and other mysteries we haven't solved yet.
ETA: Hal Finney pointed out that the UDT approach - assuming that you live in many branches of the "Solomonoff multiverse" at once, weighted by simplicity, and reducing everything to decision problems in the obvious way - dissolves our mystery nicely and logically, at the cost of abandoning approximate concepts like "truth" and "degree of belief". It agrees with our intuition in advising you to avoid torturing people in closed boxes, and more generally in all questions about moral consequences of the "implied invisible". And it nicely skips over all the tangled issues of "actual" vs "potential" predictions, etc. I'm a little embarrassed at not having noticed the connection earlier. Now can we find any other good solutions, or is Wei's idea the only game in town?
Years ago, before coming up with even crazier ideas, Wei Dai invented a concept that I named UDASSA. One way to think of the idea is that the universe actually consists of an infinite number of Universal Turing Machines running all possible programs. Some of these programs "simulate" or even "create" virtual universes with conscious entities in them. We are those entities.
Generally, different programs can produce the same output; and even programs that produce different output can have identical subsets of their output that may include conscious entities. So we live in more than one program's output. There is no meaning to the question of which program our observable universe is actually running on. We are present in the outputs of all programs that can produce our experiences, including the Odin one.
Probability enters the picture if we consider that a UTM program of n bits is being run in 1/2^n of the UTMs (because 1/2^n of all infinite bit strings will start with that n-bit string). That means that most of our instances are present in the outputs of relatively short programs. The Odin program is much longer (we will assume) than one without him, so the overwhelming majority of our copies are in universes without Odin. Probabilistically, we can bet that it's overwhelmingly likely that Odin does not exist.
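As a rough sketch of the weighting argument (the program lengths below are made up for illustration; only the ratio matters):

```python
from fractions import Fraction

def prior_weight(n_bits):
    """Fraction of all infinite bit strings that begin with a given
    n-bit program: 1 / 2**n (the universal prior, up to normalization)."""
    return Fraction(1, 2 ** n_bits)

# Hypothetical lengths, for illustration only: a program encoding physics
# as we know it, and the same program plus an "Odin created it" preamble.
physics_bits = 500
odin_bits = physics_bits + 200

# Relative weight of our copies in Odin-universes vs. plain-physics ones.
ratio = prior_weight(odin_bits) / prior_weight(physics_bits)
print(ratio)  # 1/2**200 -- almost all of our copies live in the shorter program
```

The exact lengths don't matter; any fixed per-bit cost makes the longer program's share of our copies shrink exponentially.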
Yep, I already arrived at that answer elsewhere in the thread. It's very nice and consistent and fits very well with UDT (Wei Dai's current "crazy" idea). There still remains the mystery of where our "subjective" probabilities come from, and the mystery of why everything doesn't explode into chaos, but the mystery at hand is solved, IMO. To give a recent quote from Wei, "There are copies of me all over math".