# Johnicholas comments on Self-fulfilling correlations - Less Wrong

103 26 August 2010 09:07PM

You are viewing a comment permalink. View the original post to see all comments and the full post content.

Sort By: Best

Comment author: 27 August 2010 07:49:55PM 7 points [-]

SIWOTI: Simulated annealing is not not not a genetic algorithm.

I admit these are fuzzy, confusing categories, and you can certainly create borderline examples, but most implementations of simulated annealing use a cooling schedule, which has no natural analog in the genetic algorithm, and many implementations of genetic algorithms use crossover, which has no natural analog in simulated annealing.

Comment author: 28 August 2010 01:46:28AM *  8 points [-]

borderline examples

I give you Differential evolution Monte Carlo! It slices! It dices! It's a genetic algorithm and a simulated annealing algorithm! But wait -- there's more! You also get an self-tuning Metropolis sampler! (The proof of the correctness of the sampler isn't valid, but the theorem is true; the author gives a correct proof in a subsequent paper.)

Comment author: 27 August 2010 08:14:41PM *  0 points [-]

Fair points. In my mind those techniques are all taxonomically related, close in conceptual space, but some of the boundaries are fuzzy. I'm not even sure what the general category is called, but 'evolutionary search' is what I roughly mean by "the space of algorithms that includes GA variants and simulated annealing, plus some other stuff". Perhaps they aren't that close, but thats where map stands and what I meant.

But anyway, no GA algorithms vary the mutation or crossover parameters over time? That would be closer to SA then. I thought the similarity between SA and GA is they both explore a set of points in solution space simultaneously and thus are more generally robust vs single point exploratory techniques.

Comment author: 28 August 2010 12:41:25AM 2 points [-]

Simulated annealing is a kind of single-point stochastic hill-climbing, with the temperature controlling how frequently a "downward" step is taken (that might bounce the single point out of a local optimum). It doesn't explore a set of points simultaneously - I mean, there might be a way of describing it as such, but that would be an exotic, insightful analogy, rather than the usual bland, boring sort of description.

Certainly some GA's vary the mutation or crossover parameters - as I said before, you can certainly create borderline examples, but the (standard, introductory, boring) centers of the two fuzzy categories are pretty distinguishable.

Comment author: 28 August 2010 12:48:42AM 0 points [-]

Ahhh my bad. I thought that SA explored multiple solution points at once. Of course, if the stochastic jumps are similar, it could end up being the same exploration path eventually, just serial vs parallel, although that seems to make vanilla SA incredibly non-useful in the modern era of parallel computing.

Comment author: 28 August 2010 09:56:04PM 2 points [-]

[...] that seems to make vanilla SA incredibly non-useful in the modern era of parallel computing.

That is what parallel tempering is for.

Comment author: [deleted] 14 February 2012 01:19:00AM 0 points [-]

Also, SA is specifically useful when the original objective function can be easily evaluated, but its derivatives are too expensive. With SA, you don't need to compute derivatives or the normalizing constants. You can try quasi-Newton methods and other approaches, but even these are computationally intractable in many cases. There are certain ways in which a problem can be non-convex that makes SA an attractive alternative. In principle, this could be true even in low dimensional problems, meaning that it's not at all just a question of parallelism. Another thing worth mentioning is that SA lends itself very well to the GPU in some cases where traditional optimizers don't.