The Weighted Majority Algorithm

18 Eliezer_Yudkowsky 12 November 2008 11:19PM

Followup toWorse Than Random, Trust In Bayes

In the wider field of Artificial Intelligence, it is not universally agreed and acknowledged that noise hath no power.  Indeed, the conventional view in machine learning is that randomized algorithms sometimes perform better than unrandomized counterparts and there is nothing peculiar about this.  Thus, reading an ordinary paper in the AI literature, you may suddenly run across a remark:  "There is also an improved version of this algorithm, which takes advantage of randomization..."

Now for myself I will be instantly suspicious; I shall inspect the math for reasons why the unrandomized algorithm is being somewhere stupid, or why the randomized algorithm has a hidden disadvantage.  I will look for something peculiar enough to explain the peculiar circumstance of a randomized algorithm somehow doing better.

I am not completely alone in this view.  E. T. Jaynes, I found, was of the same mind:  "It appears to be a quite general principle that, whenever there is a randomized way of doing something, then there is a nonrandomized way that delivers better performance but requires more thought."  Apparently there's now a small cottage industry in derandomizing algorithms.  But so far as I know, it is not yet the majority, mainstream view that "we can improve this algorithm by randomizing it" is an extremely suspicious thing to say.

Let us now consider a specific example - a mainstream AI algorithm where there is, apparently, a mathematical proof that the randomized version performs better.  By showing how subtle the gotcha can be, I hope to convince you that, even if you run across a case where the randomized algorithm is widely believed to perform better, and you can't find the gotcha yourself, you should nonetheless trust that there's a gotcha to be found.

continue reading »