Consider two friends, Alice and Bob, trying to figure out what happened to a diamond that disappeared from a museum. They do so by playing a game that loosely approximates Solomonoff induction: they work together to come up with the smallest possible explanations that conform to the data, for some intuitive notion of smallness.
This helps to eliminate fake explanations; the hypothesis "a witch caused Henry to fall ill" can be simplified to "Henry fell ill". But "Sally touched Henry and Bob, and Sally is sick and Henry is sick and Bob is sick" is beaten by "Sally touched Henry and Bob and Sally is contagiously sick". An explanation is good if it is smaller than just hard-coding the answer.
Bob knows that there are four diamond thieves in the city, so he comes up with four hypotheses:
- The diamond was stolen by thief number 1.
- The diamond was stolen by thief number 2.
- The diamond was stolen by thief number 3.
- The diamond was stolen by thief number 4.
These are all roughly the same complexity (depending on how you encode numbers), so this provides a uniform distribution over the four thieves.
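To make the "same complexity gives a uniform distribution" step concrete, here is a minimal sketch. It assumes the usual Solomonoff-style weighting, where a hypothesis of length L bits gets weight 2^-L. The function name `prior_from_lengths` and the 20-bit length are hypothetical; the game never pins down actual bit counts.

```python
def prior_from_lengths(lengths_in_bits):
    """Turn description lengths (in bits) into normalized 2^-length weights."""
    weights = {name: 2.0 ** -bits for name, bits in lengths_in_bits.items()}
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}

# Hypothetical: every "stolen by thief k" hypothesis costs the same 20 bits.
thief_lengths = {f"thief {k}": 20 for k in range(1, 5)}
thief_prior = prior_from_lengths(thief_lengths)

for name, probability in thief_prior.items():
    print(name, probability)
```

Only the differences between lengths matter after normalization, so any shared length gives each thief the same share.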
Alice comes up with one hypothesis:
- The diamond spontaneously ceased existing.
and declares victory.
Bob: What, that makes no sense? Physical objects can't stop existing.
Alice: We aren't doing physics; we are playing a game.
Bob: ͠° ͟ʖ ͡°
Alice: ¯\_(ツ)_/¯
But there is an additional rule; you can add other data from the real world to the challenge. For example, for "Henry falling ill", you might get better hypotheses if you try to compress info about all the sick people in the village, so that a slightly more complex hypothesis that can explain all of them wins!
Bob: I hereby add all physics experiments to the data set!
Alice then comes up with the following hypothesis: all physical experiments are explained by the standard model of physics, except the diamond spontaneously ceased existing.
Bob: That bit about the diamond ceasing to exist is arbitrary!
Alice: Do you have a better hypothesis?
Bob: Uhm, idk. All physical experiments are explained by the standard model of physics, and the diamond was stolen by thief 3?
Alice: ͠° ͟ʖ ͡°
Bob's new hypothesis is bigger than Alice's new hypothesis.
And thus we run into the problem. Due to Alice's and Bob's bounded rationality, they can't determine which thief stole the diamond from the laws of physics alone. So the laws of physics at a low level don't help compress hypotheses at a high level, and thus can't constrain the smallest possible hypotheses they can consider.
How do you modify the game so that the spontaneously disappearing diamond hypothesis doesn't win?
If we are trying to approximate Solomonoff induction, only the complexity in the overall description of the universe counts directly, and a universe in which thief 3 stole the diamond isn't any more complex in terms of overall description than one in which the diamond stayed put. Instead, we account for the complexity of Bob's specific hypothesis in terms of ordinary probability, which accounts for the fact that there are more universes which are compatible with some theories than are compatible with other theories. E.g. in this particular case there will be some base rate for theft, for a locally prominent thief being involved, etc, and we can use that to penalize Bob's hypotheses instead. As part of that calculation, the fact that there are 4 thieves applies a factor of four penalty (2 bits) to any particular thief.
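As a rough sketch of that base-rate accounting: a probability p corresponds to a penalty of -log2(p) bits, so a uniform choice among four thieves costs exactly 2 bits. The two base rates below are made-up placeholders, not estimates from any real data, and the helper name `bits` is hypothetical.

```python
import math

def bits(probability):
    """Penalty in bits for a hypothesis with the given probability."""
    return -math.log2(probability)

# Hypothetical base rates, chosen only for illustration.
p_theft = 0.5              # a missing museum diamond was stolen at all
p_known_thief = 0.5        # a theft was committed by one of the known thieves
p_specific_thief = 1 / 4   # uniform over the four known thieves

p_thief_3 = p_theft * p_known_thief * p_specific_thief
penalty_thief_3 = bits(p_thief_3)

print("penalty for picking one thief of four:", bits(p_specific_thief), "bits")
print("total penalty for 'thief 3 did it':", penalty_thief_3, "bits")
```

Because the probabilities multiply, the penalties in bits simply add.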
Regarding Alice's hypotheses, I think "the diamond spontaneously disappeared" is actually a much larger hypothesis (in terms of bits) than you are giving it credit for. If you don't gerrymander your descriptions to make this smaller, then the same number of bits should describe any other comparable object disappearing. Your bits also need to specify the time of disappearance, up to the observed precision. Ignoring additional details such as the precise manner of disappearance, the number of bits should therefore be around log2((number of comparable objects in universe)*(age of the universe)/(observed time window of disappearance)), which I think should be fairly large.
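To get a feel for the size of that number, here is a back-of-the-envelope sketch of the log2 estimate. The object count and the one-day observation window are placeholders I picked purely for illustration; only the age of the universe (about 13.8 billion years) is a real figure. The function name `disappearance_bits` is hypothetical.

```python
import math

def disappearance_bits(num_objects, age_of_universe, time_window):
    """Bits needed to single out one object and one time window.

    age_of_universe and time_window must be in the same unit.
    """
    return math.log2(num_objects * age_of_universe / time_window)

# Placeholder: how many "comparable objects" exist is a guess, not a measurement.
num_comparable_objects = 1e30
# About 13.8 billion years, expressed in days.
age_of_universe_days = 13.8e9 * 365.25
# Placeholder: the disappearance is pinned down to within one day.
observed_window_days = 1.0

alice_bits = disappearance_bits(
    num_comparable_objects, age_of_universe_days, observed_window_days
)
print(f"roughly {alice_bits:.0f} bits to specify the disappearance")
```

With these placeholder inputs the estimate lands well above a hundred bits, far more than the 2 bits it costs to pick one thief out of four. The conclusion is only as good as the guessed inputs, though each factor of two in the object count shifts the total by just one bit.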
Now, this may not be a particularly satisfying answer since I am only addressing your particular example, and not the general question of "how do low level hypotheses constrain high level ones?" AFAIK assessing how compatible any given high level hypothesis is with simple low level physics might in general be a complex issue.
Yes, it would.
(In writing the original comment, I actually wrote the second paragraph first and then re-ordered them, which may have affected the consistency. I do think, however, that it would be easy to forget to take this into account when calculating bits for Alice's hypothesis while automatically taking it into account (via a base rate that includes the number of thefts per unit time) in Bob's calculation.)