MattG comments on Rationality Quotes Thread November 2015 - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
That's not how Bayesian optimization works. Broadly, we use Bayesian optimization when evaluating the target function at a point is expensive and its gradient is expensive or unavailable. Instead of following gradients, we choose points at which to sample the target function, and those samples train a Gaussian process model (or another nonparametric model of functions) that tells us what the function's surface looks like. Such a procedure performs best when we sample points where either the model's expected function value or its variance is particularly high. Thus, we choose points that we believe are good, or points where we're very uncertain, but we never specifically search for low points. We'll probably encounter some low points while sampling in regions of high uncertainty, but we didn't specifically seek them out.
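The loop described above can be sketched in a few lines. This is a minimal toy illustration, not anyone's production method: the target function, the RBF kernel length scale, the grid of candidates, and the upper-confidence-bound (mean plus `kappa` times standard deviation) acquisition rule are all illustrative choices standing in for the general "high expected value or high variance" criterion.

```python
import numpy as np

def rbf(a, b, length=0.3):
    """Squared-exponential kernel between two 1-D point sets."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length) ** 2)

def gp_posterior(x_train, y_train, x_query, noise=1e-5):
    """Posterior mean and std of a zero-mean GP at x_query."""
    K = rbf(x_train, x_train) + noise * np.eye(len(x_train))
    Ks = rbf(x_train, x_query)
    alpha = np.linalg.solve(K, y_train)
    mu = Ks.T @ alpha
    v = np.linalg.solve(K, Ks)
    var = 1.0 - np.sum(Ks * v, axis=0)  # prior variance is 1 for this kernel
    return mu, np.sqrt(np.maximum(var, 0.0))

def target(x):
    """Stand-in for the expensive black-box function (illustrative only)."""
    return np.sin(3 * x) + 0.5 * np.cos(7 * x)

rng = np.random.default_rng(0)
grid = np.linspace(0.0, 1.0, 200)      # candidate sample locations
x_obs = rng.uniform(0.0, 1.0, size=3)  # a few initial random evaluations
y_obs = target(x_obs)

kappa = 2.0  # how strongly to reward uncertainty (exploration)
for _ in range(10):
    mu, sigma = gp_posterior(x_obs, y_obs, grid)
    ucb = mu + kappa * sigma       # high where the mean OR the variance is high
    x_next = grid[np.argmax(ucb)]  # never deliberately picks low points
    x_obs = np.append(x_obs, x_next)
    y_obs = np.append(y_obs, target(x_next))

best = grid[np.argmax(gp_posterior(x_obs, y_obs, grid)[0])]
```

Note that the acquisition rule only ever maximizes: low regions of the surface get visited, if at all, as a side effect of sampling where the model is uncertain.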