The argument in "A Problem with the Modified Demski Prior" does not work as stated. The argument I gave seems to make sense in the case where the random bits are added as axioms, but fails in the more natural case where we update on them in a Bayesian manner.
To re-articulate the problem: suppose we have a Boolean sequence S_n which we would like to predict, encoded in the language of PA plus a special predicate P(n) representing the value of S_n. In reality, the values of S_n come from a biased coin. We can update the modified Demski prior on the observed values of S_n in two ways: by adding them as axioms, or by Bayesian updating.
Adding them as axioms, we run into the problem I described. Sampled programs are required to be consistent with the axioms so far. Programs which guess a value for every P(n) lose mass quickly as more P(n) axioms are added, even if they guess with the true frequencies: each guess survives the consistency check with probability bounded below 1, so their mass decays exponentially in n. Programs which guess only rarely do not pay this cost. Within the set of frequent guessers, the conditional prediction for the next bit quickly approaches the correct frequency; within the rare guessers, it does not. Despite only rarely making guesses, the overall mass from these uninformed guessers comes to dominate the probability of new P(n), because the loss in mass of frequent guessers is exponential in n, while the mass of infrequent guessers which guess at a given n is roughly the algorithmic probability of n, which shrinks far more slowly.
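A rough numeric sketch of the two decay rates (the bias p and the use of 2^-bitlength(n) as a crude stand-in for the algorithmic probability of n are my own illustrative assumptions, not part of the original argument):

```python
# Sketch: mass decay of frequent vs. infrequent guessers under the
# add-as-axioms update, with illustrative stand-in quantities.

p = 0.7  # assumed true coin bias (illustration only)

# A program that guesses each bit independently with the true frequency p
# matches a coin flip with probability p*p + (1-p)*(1-p), so its surviving
# mass after n bits is roughly c**n -- exponential decay.
c = p * p + (1 - p) * (1 - p)

for n in [10, 100, 1000]:
    frequent_mass = c ** n
    # Crude stand-in for the algorithmic probability of n: 2^-(bits of n),
    # i.e. roughly 1/n up to constants. (Not the true K(n), just a proxy.)
    infrequent_mass = 2.0 ** (-n.bit_length())
    print(n, frequent_mass, infrequent_mass)
```

Even with this generous proxy for algorithmic probability, the exponentially decaying mass of the frequent guessers falls below the infrequent guessers' mass almost immediately, which is the dominance effect described above.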
If we do a Bayesian update, however, an infrequent guesser is treated as enumerating some far-out n and filling in the remaining cases by sampling further, more frequent guessers. The composite hypothesis it represents is therefore scored on every bit, and still loses probability mass at about the rate of everything else. There's no problem.
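A minimal sketch of why deferring doesn't help under Bayesian conditioning (the Bernoulli predictors standing in for the sampled guessers, and the split point, are my own illustrative choices): the deferred bits are scored by the same mixture the hypothesis borrows from, so its log-likelihood factors into the mixture's score on those bits plus its own guesses, and it cannot outgrow the programs it samples.

```python
import math
import random

random.seed(1)
p = 0.7  # assumed true coin bias (illustration only)
bits = [1 if random.random() < p else 0 for _ in range(200)]

def loglik(q, data):
    """Log-likelihood of data under an i.i.d. Bernoulli(q) predictor."""
    return sum(math.log(q if b else 1.0 - q) for b in data)

# Stand-in for the frequent-guesser mixture's score on all 200 bits.
mixture = loglik(p, bits)

# An "infrequent guesser" that defers to the same mixture on bits 0..99
# and only starts guessing itself (also at rate p) from bit 100 onward.
deferring = loglik(p, bits[:100]) + loglik(p, bits[100:])

# Under Bayesian conditioning the deferred bits cost full likelihood,
# so both hypotheses lose mass at the same exponential rate.
assert math.isclose(mixture, deferring)
```

Contrast this with the add-as-axioms update, where the deferring program simply isn't penalized on the bits it declines to guess.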
The original problem came up when our MIRIx group was thinking about things like irreducible patterns, where the sequence actually follows from our existing axioms; so it was reasonable to think as we did. (The proper set-up for that case would be more in the style of asymptotic logical uncertainty, not the simpler add-P(n)-as-axioms framing I discussed above.) But my formulation in the forum post got it wrong.