Wei_Dai comments on Approximating Solomonoff Induction - Less Wrong Discussion
That's not even true. In practice, it's the best we've got, but it's still terrible in most interesting settings (or else you could solve NP-hard problems in practice, which you can't).
It's because neural-net training algorithms are not even close to finding the optimal network in complex situations (finding the weights that minimize training error is NP-hard in general).
That's trivial to do. It's not the problem here.
This might be true in some sense, but not in a meaningful one. PAC learning, for instance, is fundamentally non-Bayesian. Saying that PAC learning approximates Bayesian inference is the same as saying that Bayesian inference approximates PAC learning. It's not a very meaningful statement.
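For reference, the standard realizable-case PAC guarantee makes this explicit: a hypothesis class $\mathcal{H}$ is PAC-learnable if there is an algorithm $A$ and a sample-size function $m(\epsilon, \delta)$ such that for every distribution $\mathcal{D}$ and every target $f \in \mathcal{H}$, given $m \ge m(\epsilon, \delta)$ i.i.d. examples $S$,

$$\Pr_{S \sim \mathcal{D}^m}\big[\mathrm{err}_{\mathcal{D},f}(A(S)) \le \epsilon\big] \ge 1 - \delta.$$

The guarantee quantifies over every distribution and every target in the class; no prior over hypotheses and no likelihood appear anywhere in the statement.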
People on LW tend to be hard-core Bayesians who have never even heard of PAC learning, which is an entire branch of learning theory. I find it rather strange.
I looked into PAC learning a bit when Scott Aaronson talked about it on his blog, and came to the following conclusion: 'Instead of saying "PAC-learning and Bayesianism are two different useful formalisms for reasoning about learning and prediction" I think we can keep just Bayesianism and reinterpret PAC-learning results as Bayesian-learning results which say that in some special circumstances, it doesn't matter exactly what prior one uses. In those circumstances, Bayesianism will work regardless.'
Of course that was 7 years ago and I probably barely scratched the surface of the PAC learning literature even then. Are there any PAC learning results which can't be reinterpreted this way?
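For concreteness, the kind of result I have in mind is the standard Occam-style bound in the realizable setting: fix any countable class $\mathcal{H}$ and any "prior" weighting $\pi$ on it with $\sum_h \pi(h) \le 1$. Then with probability at least $1 - \delta$ over $m$ i.i.d. examples, every $h \in \mathcal{H}$ consistent with the sample satisfies

$$\mathrm{err}_{\mathcal{D}}(h) \le \frac{\ln(1/\pi(h)) + \ln(1/\delta)}{m}.$$

The weighting enters only through $\ln(1/\pi(h))$, so any prior that puts nonzero mass on the truth eventually works, and a uniform $\pi$ over a finite class recovers the usual $(\ln|\mathcal{H}| + \ln(1/\delta))/m$ bound. That matches the "prior doesn't matter much" reading, though the proof is a union bound rather than an application of Bayes's rule.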
PAC-learning has no concept of a prior or even of a likelihood, yet it lets you learn regardless. If by "Bayesianism" you mean "learning", then sure, PAC-learning is a type of Bayesianism. But I don't see why it's useful to view it that way (Bayes's rule is never used, for example).
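To illustrate, here is a minimal sketch of a PAC-style learner (ERM over a hypothetical finite class of threshold classifiers on [0, 1], with the standard realizable-case sample bound); note that nothing in it computes a prior, a likelihood, or a posterior:

```python
import math
import random

# Sketch of a PAC-style learner: empirical risk minimization (ERM) over
# a finite class of threshold classifiers on [0, 1]. The class and the
# target are hypothetical textbook choices; the point is that nothing
# below involves a prior, a likelihood, or Bayes's rule -- the guarantee
# comes from a sample-size bound plus a union bound.

def make_threshold(t):
    """Classifier that labels x as 1 iff x >= t."""
    return lambda x: 1 if x >= t else 0

# Finite hypothesis class: 101 thresholds on a grid.
H = [make_threshold(k / 100) for k in range(101)]

def erm(sample):
    """Return a hypothesis minimizing empirical error on the sample."""
    return min(H, key=lambda h: sum(h(x) != y for x, y in sample))

# Realizable-case PAC sample size: m >= (ln|H| + ln(1/delta)) / epsilon.
epsilon, delta = 0.05, 0.05
m = math.ceil((math.log(len(H)) + math.log(1 / delta)) / epsilon)

# Unknown target and data distribution (uniform here, but the PAC bound
# holds for every distribution -- it is distribution-free).
target = make_threshold(0.37)
sample = [(x, target(x)) for x in (random.random() for _ in range(m))]

h = erm(sample)
test = [random.random() for _ in range(10_000)]
err = sum(h(x) != target(x) for x in test) / len(test)
print(f"m = {m}, test error = {err:.4f} (bound: <= {epsilon} w.p. >= {1 - delta})")
```

The learner just picks any hypothesis consistent with the data; the guarantee that this works comes from the sample-size calculation up front, not from any posterior update.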