Manfred comments on Open Thread, Jun. 29 - Jul. 5, 2015 - Less Wrong Discussion
You are viewing a comment permalink. View the original post to see all comments and the full post content.
I've started learning Machine Learning (he!), and upon reading the first chapter of the most famous textbook I was already gasping for air.
For someone like me who grew into probability with Jaynes' book, seeing in the first chapter that algorithms are trained on the same data multiple times (cross-validation) was... annoying, let's say (I actually screamed at the book).
Is there a sane textbook on machine learning? I don't demand one that starts from objective Bayesianism; that would be asking too much. But at least something that assumes Bayesianism as a foundation? Pretty please?
Eventually it makes sense, I promise. "Bayesianism" in the sense of keeping track of every hypothesis is very computationally expensive - modern algorithms only keep track of a very small number of hypotheses (only those representable by a neural network [or what have you], and even then only those required to do gradient descent). This fact opens you up to the overfitting problem, where the simplest perfect hypothesis in your space actually has very little information about the true external reality. You need some way of throwing away the parts of the signal that your model wasn't going to figure out anyhow.
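To make the overfitting point concrete, here is a minimal sketch. Everything in it is an assumption chosen for illustration: the "true external reality" is a sine curve with Gaussian noise, and the hypothesis spaces are polynomials of degree 1, 3 and 9. With 10 training points, the degree-9 polynomial can pass through every point, so it is a "perfect hypothesis" on the training set, yet its error on fresh data from the same source is typically much worse than its training error suggests.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n):
    """Draw n noisy samples from an assumed 'true' function (illustrative only)."""
    x = rng.uniform(-1.0, 1.0, size=n)
    y = np.sin(3.0 * x) + rng.normal(scale=0.3, size=n)
    return x, y

def mse(coeffs, x, y):
    """Mean squared error of the polynomial with the given coefficients."""
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))

x_train, y_train = make_data(10)    # small training set
x_test, y_test = make_data(1000)    # fresh data the model never saw

# degree -> (training error, held-out error)
results = {}
for degree in (1, 3, 9):
    coeffs = np.polyfit(x_train, y_train, degree)  # least-squares fit
    results[degree] = (
        mse(coeffs, x_train, y_train),
        mse(coeffs, x_test, y_test),
    )
    print(f"degree {degree}: train MSE {results[degree][0]:.4f}, "
          f"held-out MSE {results[degree][1]:.4f}")
```

The training error can only go down as the degree goes up, because each larger polynomial family contains the smaller ones. The held-out error is the number that tells you whether the fit learned the signal or the noise.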
For this reason among others, modern machine learning algorithms often have a lot of settings that have to be set by smarter systems (humans), before your algorithm can actually learn a novel domain. These settings reflect how the properties of the domain interact with properties of your algorithm (e.g. how many resources the algorithm has to commit before it can expect to have found something good, or what degree of noise the algorithm has to learn to throw away). These are those "hyperparameter" things. Cross-validation is just an empirical tool that helps humans figure out the right settings. You can probably figure out why it's expected to work.
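As a rough sketch of what "figuring out the right settings" looks like in practice, the snippet below uses k-fold cross-validation to pick one hyperparameter: the penalty strength `lam` in ridge regression. The data, the candidate values, and the helper names (`ridge_fit`, `k_fold_cv_error`) are all made up for illustration; this is one common way to do it, not the textbook's specific procedure. Note that within each fold the model is fit only on the training part and scored only on the held-out part, so no data point is ever used to evaluate a model that was trained on it.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Ridge regression weights: solve (X^T X + lam * I) w = X^T y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)

def k_fold_cv_error(X, y, lam, k=5):
    """Average held-out squared error of ridge regression over k folds."""
    indices = np.arange(len(y))
    folds = np.array_split(indices, k)  # rows are i.i.d. here, so no shuffle
    errors = []
    for held_out in folds:
        train = np.setdiff1d(indices, held_out)
        w = ridge_fit(X[train], y[train], lam)      # fit on k-1 folds
        residual = X[held_out] @ w - y[held_out]    # score on the other fold
        errors.append(np.mean(residual ** 2))
    return float(np.mean(errors))

# Synthetic problem (an assumption for the demo): 30 features, only 3 matter.
rng = np.random.default_rng(0)
n_samples, n_features = 40, 30
X = rng.normal(size=(n_samples, n_features))
true_w = np.zeros(n_features)
true_w[:3] = [2.0, -1.0, 0.5]
y = X @ true_w + rng.normal(scale=1.0, size=n_samples)

# Try each candidate setting and keep the one with the lowest CV error.
candidates = [1e-3, 1e-2, 1e-1, 1.0, 10.0, 100.0]
cv_errors = {lam: k_fold_cv_error(X, y, lam) for lam in candidates}
best_lam = min(cv_errors, key=cv_errors.get)

for lam in candidates:
    print(f"lam = {lam:g}: CV error {cv_errors[lam]:.3f}")
print("selected lam:", best_lam)
```

For what it's worth, the ridge solution is also the MAP estimate under a zero-mean Gaussian prior on the weights with Gaussian noise, so choosing `lam` this way amounts to setting the prior's width empirically rather than from first principles.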
I upvoted because I understand the rationale, I understand the explanation, I just rather wish that a book whose purpose is to teach the subject wouldn't be so... ad hoc.