V_V comments on More and Less than Solomonoff Induction - Less Wrong Discussion
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Comments (32)
This is an improper way of stating it.
Solomonoff induction requires that the programming language syntax be prefix-free: no valid program may be a continuation (extension) of another valid program.
This is required to make sure that the prior is a probability distribution over all programs, up to a normalization constant (the inverse of Chaitin's omega).
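To illustrate why the prefix-free requirement matters, here is a small sketch with a hypothetical toy program set (not a real universal machine): by the Kraft inequality, assigning each program p the weight 2^-|p| over a prefix-free set yields a total mass of at most 1, so dividing by that total (the role Chaitin's omega plays for a universal machine) gives a proper probability distribution.

```python
# Hypothetical prefix-free set of binary "programs": no string is a
# prefix of another, so the 2^-length weights obey the Kraft inequality.
programs = ["0", "10", "110", "111"]

def is_prefix_free(ps):
    """Check that no program in ps is a proper prefix of another."""
    return not any(a != b and b.startswith(a) for a in ps for b in ps)

# Prior weight 2^-|p| per program; the total is <= 1 for any prefix-free
# set, and normalizing by it yields a probability distribution.
weights = {p: 2.0 ** -len(p) for p in programs}
total = sum(weights.values())
prior = {p: w / total for p, w in weights.items()}
```

If the set were not prefix-free (e.g. both "0" and "01" valid), the weights could exceed 1 and no normalization constant would fix the double counting.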
I'm not sure what you mean by "frequentist" in this context, but the theory of Solomonoff induction makes no assumption that there must exist a "true program" which deterministically generates the data string. It only assumes that the conditional probability distribution of each bit given all the previous ones is computable.
Even if we had a halting oracle, we would still have to consider infinitely many programs.
A time limit avoids both the halting problem and the program-set cardinality problem.
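A minimal sketch of that point, using a hypothetical toy interpreter rather than a real universal machine: capping program length makes the candidate set finite, and capping run time makes every run terminate, so both obstacles disappear at once.

```python
from itertools import product

def run(program, max_steps):
    """Hypothetical toy interpreter: emits the program's bits cyclically,
    one per step, and is forcibly stopped after max_steps steps."""
    return [program[i % len(program)] for i in range(max_steps)]

def consistent_programs(data, max_len, max_steps):
    """Enumerate all bit-string programs up to max_len (a finite set:
    2^1 + ... + 2^max_len candidates), run each for at most max_steps
    steps, and keep those whose output starts with the observed data."""
    found = []
    for n in range(1, max_len + 1):
        for bits in product("01", repeat=n):
            p = "".join(bits)
            if run(p, max_steps)[: len(data)] == list(data):
                found.append(p)
    return found

# Enumeration is by increasing length, so the first match is also the
# shortest, i.e. the one with the highest 2^-|p| prior weight.
matches = consistent_programs("0101", max_len=4, max_steps=8)
```

Because every run is cut off at max_steps, no halting oracle is needed; because max_len is finite, only finitely many programs are ever considered.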
A caveat here:
Full Solomonoff induction doesn't overfit: since it has infinite modelling resources, it can spend them to learn exactly all the properties of the observed string while maintaining maximum predictive power.
A resource-limited version of Solomonoff induction, like any resource-limited machine learning approach, has limited modelling resources, and spending them to learn irrelevant properties of the observed data makes it less able to predict the relevant properties.
I don't see the relevance to Solomonoff induction. Most machine learning methods have explicit or implicit noise models and ways to control hypothesis complexity.
For instance, L2-regularized linear regression performs maximum a posteriori estimation under the assumption of an independent Gaussian prior on the model parameters and an independent Gaussian noise model on the observations: http://en.wikipedia.org/wiki/Tikhonov_regularization#Bayesian_interpretation
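A small numerical check of that equivalence, in the simplest one-dimensional case (the data and parameter values below are illustrative): with a Gaussian prior w ~ N(0, tau^2) on the weight and Gaussian noise of variance sigma^2 on the observations, the MAP estimate coincides with ridge regression using lambda = sigma^2 / tau^2.

```python
import random

random.seed(0)
sigma, tau, w_true = 0.1, 1.0, 1.5  # noise std, prior std, true weight
xs = [random.uniform(-1, 1) for _ in range(100)]
ys = [w_true * x + random.gauss(0, sigma) for x in xs]

lam = sigma**2 / tau**2  # regularization strength implied by the prior

sxy = sum(x * y for x, y in zip(xs, ys))
sxx = sum(x * x for x in xs)

# Ridge closed form in 1-D: minimize sum (y - w x)^2 + lam * w^2
w_ridge = sxy / (sxx + lam)

# MAP: maximize log likelihood + log prior; zeroing the derivative gives
# (sxx / sigma^2 + 1 / tau^2) * w = sxy / sigma^2
w_map = (sxy / sigma**2) / (sxx / sigma**2 + 1 / tau**2)
```

Multiplying the MAP normal equation through by sigma^2 recovers the ridge equation exactly, which is why the two estimates agree to machine precision.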