Thrun's algorithm is correct. To see why, note that no matter how the envelope contents are distributed, all situations faced by the player can be grouped into pairs, where each pair consists of the situations (x,2x) and (2x,x), which are equally likely. Within each pair the chance of switching from x to 2x is higher than the chance of switching from 2x to x, because the switching probability f satisfies f(x)>f(2x) by construction.
BTW, we have an ongoing discussion there about some math aspects of the algorithm.
In the ideal case, which I specifically addressed in the first line, epsilon is zero.
Today's post, The Rhythm of Disagreement, was originally published on 01 June 2008. A summary (taken from the LW wiki):
Discuss the post here (rather than in the comments to the original post).
This post is part of the Rerunning the Sequences series, where we'll be going through Eliezer Yudkowsky's old posts in order so that people who are interested can (re-)read and discuss them. The previous post was A Premature Word on AI, and you can use the sequence_reruns tag or RSS feed to follow the rest of the series.
Sequence reruns are a community-driven effort. You can participate by re-reading the sequence post, discussing it here, posting the next day's sequence reruns post, or summarizing forthcoming articles on the wiki. Go here for more details, or to have meta discussions about the Rerunning the Sequences series.