gwern comments on Rationality Quotes Thread November 2015 - Less Wrong

5 Post author: elharo 02 November 2015 12:30PM


Comments (143)


Comment author: gwern 02 December 2015 08:08:49PM 0 points [-]

Your optimizer, whether Bayesian or not, needs to be able to recognize a low point when it hits one, or else it can't optimize at all! If every point looks the same... (It may learn more about high points, but it must still learn about low points.)

Comment author: [deleted] 03 December 2015 02:44:22AM 0 points [-]

(It may learn more about high points, but it must still learn about low points.)

That's not how Bayesian optimization works. Broadly, we use Bayesian optimization when evaluating the target function at a point is expensive, and computing its gradient is even more expensive or outright infeasible. Instead of following gradients, we choose points at which to sample the target function, and those samples train a Gaussian process model (or another nonparametric model of functions) that tells us what the function's surface looks like. In such a procedure, we obtain the best performance by sampling points where either the expected function value or the model's variance is particularly high. Thus, we choose points that we know are good, or points where we're very uncertain, but we never specifically search for low points. We'll probably encounter some while sampling in regions of great uncertainty, but we didn't specifically seek them out.