
maxikov comments on Open thread, Nov. 10 - Nov. 16, 2014 - Less Wrong Discussion

3 Post author: MrMind 10 November 2014 08:32AM




Comment author: maxikov 13 November 2014 05:13:42AM 0 points

Does that mean that neural networks start with a completely crazy model of the real world and slowly modify it to better fit the data, as opposed to jumping between sets of models that fit the data perfectly, as Solomonoff induction does?

Comment author: Vaniver 13 November 2014 03:28:15PM 2 points

Does it mean then that neural networks start with a completely crazy model of the real world, and slowly modify this model to better fit the data

This seems like a good description to me.

as opposed to jumping between model sets that fit the data perfectly, as Solomonoff induction does?

I'm not an expert in Solomonoff induction, but my impression is that each model set is a subset of the model set from the previous step. That is, you implicitly consider every possible output string by considering every possible program that could generate those strings. I assume stochastic programs (like "flip a coin n times and output 1 for heads and 0 for tails") are expressed as an algorithmic description followed by a random seed, so that each individual program is deterministic, but the set of programs over all possible seeds has the stochastic properties of the definition.

As we get a new piece of the output string--say it grows from "1100" to "11001"--we rule out any program that would not have output "11001," which includes about half of our surviving coin-flip programs and about 90% of our remaining 10-sided die programs. So the class of models that "fit the data perfectly" is a very broad class, and you could imagine a neural network as estimating the mean of that class directly, instead of enumerating every instance of the class and then taking their mean.
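A toy sketch of that filtering step, assuming the simplest possible coin-flip programs (each "program" is just a deterministic seed that it outputs bit by bit; this is an illustration of ruling out inconsistent programs, not real Solomonoff induction, which weights programs by description length):

```python
import itertools

def consistent_seeds(observed, seed_len=8):
    """Return all length-seed_len 'coin-flip programs' (seeds) whose
    output begins with the observed bit string."""
    return ["".join(bits)
            for bits in itertools.product("01", repeat=seed_len)
            if "".join(bits).startswith(observed)]

# After observing "1100", 16 of the 256 seeds are still consistent.
before = consistent_seeds("1100")
# Observing one more bit ("11001") rules out half of the survivors.
after = consistent_seeds("11001")
print(len(before), len(after))  # 16 8
```

Each new observed bit halves the surviving coin-flip programs, matching the "about half" figure above; a die-roll program family would lose a larger fraction per observation.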