Wei_Dai comments on Approximating Solomonoff Induction - Less Wrong Discussion

6 Post author: Houshalter 29 May 2015 12:23PM

Comments (45)

Comment author: anon85 02 June 2015 09:03:15AM 0 points

> But in practice, SGD is extremely good at optimizing NNs, and the local optima issue isn't a huge problem.

That's not even true. In practice, it's the best we've got, but it's still terrible in most interesting settings (or else you could solve NP-hard problems in practice, which you can't).

> As to why we can have decent machine learning and not AGI, I don't know.

It's because the neural net algorithms are not even close to finding the optimal neural net in complex situations.

> Approximating SI isn't sufficient for one, you need to act on the models you find.

That's trivial to do. It's not the problem here.

> Everything approximates Bayesian inference, it's just a matter of how ideal the approximation is.

This might be true in some sense, but not in a meaningful one. PAC learning, for instance, is fundamentally non-Bayesian. Saying that PAC learning approximates Bayesian inference is the same as saying that Bayesian inference approximates PAC learning. It's not a very meaningful statement.

People on LW tend to be hard-core Bayesians who have never even heard of PAC learning, which is an entire branch of learning theory. I find it rather strange.

Comment author: Wei_Dai 03 June 2015 01:42:14AM -1 points

> PAC learning, for instance, is fundamentally non-Bayesian. Saying that PAC learning approximates Bayesian inference is the same as saying that Bayesian inference approximates PAC learning. It's not a very meaningful statement.

I looked into PAC learning a bit when Scott Aaronson talked about it on his blog, and came to the following conclusion: 'Instead of saying “PAC-learning and Bayesianism are two different useful formalisms for reasoning about learning and prediction” I think we can keep just Bayesianism and reinterpret PAC-learning results as Bayesian-learning results which say that in some special circumstances, it doesn’t matter exactly what prior one uses. In those circumstances, Bayesianism will work regardless.'

Of course that was 7 years ago and I probably barely scratched the surface of the PAC learning literature even then. Are there any PAC learning results which can't be reinterpreted this way?
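One way to read that reinterpretation can be sketched in a toy setting. Everything here is an assumption made for illustration, not something stated in the thread: the hypothesis class is a small grid of threshold functions, labels are noise-free, and the likelihood is therefore 1 for hypotheses consistent with the data and 0 otherwise. Under those assumptions, any prior with full support yields a posterior concentrated on the same set of consistent hypotheses, even though the weights inside that set differ.

```python
def make_threshold(t):
    """Hypothetical hypothesis h_t(x) = (x >= t)."""
    return lambda x: x >= t

thresholds = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
hypotheses = [make_threshold(t) for t in thresholds]

# Toy noise-free labelled data (assumed for illustration).
samples = [(0.1, False), (0.7, True)]

def posterior(prior, hypotheses, samples):
    """Bayes update with a 0/1 likelihood: keep prior mass only on
    hypotheses that label every sample correctly, then renormalize."""
    weights = [
        p if all(h(x) == y for x, y in samples) else 0.0
        for p, h in zip(prior, hypotheses)
    ]
    total = sum(weights)
    return [w / total for w in weights]

def support(post):
    """Thresholds that keep nonzero posterior mass."""
    return [t for t, w in zip(thresholds, post) if w > 0]

# Two different full-support priors.
uniform_prior = [1 / 6] * 6
skewed_prior = [w / 63 for w in (32, 16, 8, 4, 2, 1)]

post_uniform = posterior(uniform_prior, hypotheses, samples)
post_skewed = posterior(skewed_prior, hypotheses, samples)

support_uniform = support(post_uniform)
support_skewed = support(post_skewed)
```

Both posteriors are supported on the same consistent thresholds. If a PAC-style guarantee says every consistent hypothesis is probably approximately correct, then in this setting the choice of prior only affects which of those hypotheses gets more weight.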

Comment author: anon85 03 June 2015 03:26:03AM 2 points

PAC-learning has no concept of prior or even of likelihood, and it allows you to learn regardless. If by "Bayesianism" you mean "learning", then sure, PAC-learning is a type of Bayesianism. But I don't see why it's useful to view it that way (Bayes's rule is never used, for example).
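To make the "no prior, no likelihood" point concrete, here is a minimal sketch of the textbook PAC guarantee for a finite hypothesis class in the realizable case: any learner that outputs a hypothesis consistent with m >= (ln|H| + ln(1/delta)) / epsilon samples has error at most epsilon with probability at least 1 - delta. The threshold class, the target value, and the uniform sampling distribution below are assumptions chosen for illustration.

```python
import math
import random

def pac_sample_size(num_hypotheses, epsilon, delta):
    """Number of samples sufficient for a consistent learner over a finite
    class (realizable case): ceil((ln|H| + ln(1/delta)) / epsilon)."""
    return math.ceil((math.log(num_hypotheses) + math.log(1.0 / delta)) / epsilon)

def consistent_threshold(samples, thresholds):
    """Return the first threshold t whose hypothesis h_t(x) = (x >= t)
    labels every sample correctly, or None if there is none."""
    for t in thresholds:
        if all((x >= t) == y for x, y in samples):
            return t
    return None

# Hypothetical finite class: 101 thresholds on [0, 1].
thresholds = [i / 100 for i in range(101)]
true_t = 0.37  # target concept, assumed to lie in the class

m = pac_sample_size(len(thresholds), epsilon=0.05, delta=0.05)

# Draw m labelled points from a fixed distribution (uniform here, but the
# bound holds for any distribution as long as train and test match).
rng = random.Random(0)
samples = [(x, x >= true_t) for x in (rng.random() for _ in range(m))]

learned = consistent_threshold(samples, thresholds)
```

Nothing in the learner assigns a probability to any hypothesis or applies Bayes's rule; the guarantee comes only from the size of the class and the number of samples.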