# cousin_it comments on The Rhythm of Disagreement - Less Wrong

9 01 June 2008 08:18PM


Comment author: 24 May 2012 09:48:25PM *  0 points [-]

The expected benefit is 25 cents if the minimum possible contents of an envelope is 1 dollar. More generally, if envelopes are guaranteed to contain more than 100 dollars, then using f(x)=100/x yields an expected benefit of 25 dollars, and so on. Also, Eliezer is right that a randomized algorithm can't be Bayesian-optimal in this problem, so for any given prior there's a deterministic algorithm with even higher expected benefit; I guess you can work it out.
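As a sanity check, the 25-cent figure is easy to reproduce by simulation. A minimal sketch, assuming the switch-with-probability-f(x) reading of Thrun's algorithm, with f(x)=1/x and the smaller envelope fixed at x dollars (the function names here are mine, not from the thread):

```python
import random

def play_once(x, f, rng):
    """One round: the envelopes hold x and 2*x dollars; open one at
    random, then switch with probability f(amount seen)."""
    seen, other = rng.choice([(x, 2 * x), (2 * x, x)])
    return other if rng.random() < f(seen) else seen

# f(x) = 1/x is a valid switching probability when every envelope
# holds at least 1 dollar.
f = lambda x: min(1.0, 1.0 / x)

rng = random.Random(0)
n = 200_000
x = 3.0  # the smaller amount; the algebra gives a benefit of 1/4 for any x >= 1
avg = sum(play_once(x, f, rng) for _ in range(n)) / n
benefit = avg - 1.5 * x  # a blind 50/50 pick is worth (x + 2*x)/2 on average
print(round(benefit, 2))
```

The benefit works out analytically to (x/2)*(f(x) - f(2x)) = 1/4 dollar regardless of x, matching the 25-cent claim; swapping in f(x)=100/x (valid once envelopes exceed 100 dollars) scales it to 25 dollars the same way.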

Comment author: 24 May 2012 09:56:04PM *  0 points [-]

1/4 of the smallest possible amount you could win doesn't count as a large constant benefit in my view of things, but that's a bit of a nitpick. In any case, what do you think about the rest of the post?

Comment author: 24 May 2012 10:03:36PM 0 points [-]

Oh, sorry, just noticed the last part of your comment. It seems wrong: you can get a higher than 50% chance of picking the better envelope. The degenerate case is when you already know there's only one possibility, e.g. 1 dollar in one envelope and 2 dollars in the other. If you open an envelope and see 1 dollar, then you know you must switch, so you get the better envelope with probability 100%. You can get fuzzier cases by sort of smearing out the distribution of envelopes continuously, starting from that degenerate case, and using the f(x) strategy. The chance of picking the better envelope will fall below 100%, but I think it will stay above 50%. Do you want my opinion on anything else? :-)
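Both the degenerate case and the smeared-out version can be simulated; a sketch under the same switch-with-probability-f(x) reading, with pair lists and f chosen by me for illustration:

```python
import random

def better_envelope_rate(f, pairs, n=100_000, seed=1):
    """Fraction of rounds in which the strategy 'switch with
    probability f(x)' ends up holding the larger envelope, with the
    (smaller, larger) pair drawn uniformly from `pairs`."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(n):
        x, y = rng.choice(pairs)                # y == 2 * x
        seen, other = rng.choice([(x, y), (y, x)])
        final = other if rng.random() < f(seen) else seen
        wins += final == max(x, y)
    return wins / n

# Degenerate case: we know the pair is exactly (1, 2), so
# "switch iff you see 1 dollar" wins every single round.
certain = lambda x: 1.0 if x == 1 else 0.0
print(better_envelope_rate(certain, [(1, 2)]))  # 1.0

# Smearing the distribution over several pairs: a decreasing f keeps
# the rate above 50% but pushes it below 100%.
soft = lambda x: min(1.0, 1.0 / x)
print(better_envelope_rate(soft, [(1, 2), (2, 4), (4, 8)]))
```

For each pair (x, 2x) the exact win chance is 1/2 + (f(x) - f(2x))/2, so the smeared case above averages to roughly 0.65: below 100%, still above 50%.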

Comment author: 24 May 2012 10:09:34PM 0 points [-]

That's definitely cheating! We don't have access to the means by which X is generated. In the absence of a stated distribution, can we still do better than 50%?

Comment author: 25 May 2012 12:10:40AM *  1 point [-]

Well, Thrun's algorithm does better than 50% for every distribution. But no matter what f(x) we choose, there will always be distributions that make the chance arbitrarily close to 50% (say, less than 50%+epsilon). To see why, note that for a given f(x) we can construct a distribution far enough from zero that all values of f(x) are less than epsilon, so the chance of switching prescribed by the algorithm is also less than epsilon.
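The construction is concrete enough to compute exactly. A sketch assuming f(x)=1/x (an illustrative choice); the win chance for the pair (x, 2x) is 1/2 + (f(x) - f(2x))/2, so shifting the distribution far from zero squeezes it down toward 50%:

```python
def win_prob(f, x):
    """Exact chance that 'switch with probability f(amount seen)'
    picks the better of the pair (x, 2*x), opening either side
    with probability 1/2."""
    return 0.5 * f(x) + 0.5 * (1.0 - f(2 * x))  # = 1/2 + (f(x) - f(2x)) / 2

f = lambda x: min(1.0, 1.0 / x)
eps = 1e-3
x = 2.0 / eps  # far enough from zero that f(x) < eps
print(win_prob(f, x))  # just barely above 0.5, and below 0.5 + eps
```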

The next question is whether we can find any other randomized algorithm that does better than 50%+epsilon on any distribution. The answer to that is also no.

1) Note that any randomized algorithm must decide whether to switch or not, based on the contents of the envelope and possibly a random number generator. In other words, it must be described by a function f(x) like in Thrun's algorithm. f(x) doesn't have to be monotonic, but must lie between 0 and 1 inclusive for every x.

2) For every such function f(x), we will construct a distribution of envelopes that makes it do worse than 50%+epsilon.

3) Let's consider for each number x the degenerate distribution D_x that always puts x dollars in one envelope and 2*x in the other.

4) To make the algorithm do worse than 50%+epsilon on distribution D_x, we need the chance of switching at 2*x to be not much lower than the chance of switching at x. Namely, we need the condition f(2*x)>f(x)-epsilon.

5) Now we only need to prove that there exists an x such that f(2*x)>f(x)-epsilon. We will prove that by reductio ad absurdum. If we had f(2*x)≤f(x)-epsilon for every x, we could iterate that and obtain f(x)≥f(x*2^n)+n*epsilon for every x and n. Since f is nonnegative, taking n>1/epsilon would make f(x) greater than 1. Contradiction, QED.
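Step 5 can also be checked numerically for any particular f by walking up the doublings until the condition holds. A sketch with f(x)=1/x and epsilon=0.1 as illustrative choices:

```python
def find_good_x(f, eps, x0=1.0, max_doublings=1000):
    """Search x0, 2*x0, 4*x0, ... for an x with f(2*x) > f(x) - eps.
    By the argument above one must appear within roughly 1/eps
    doublings, since 0 <= f <= 1 can only drop by eps so many times."""
    x = x0
    for _ in range(max_doublings):
        if f(2 * x) > f(x) - eps:
            return x
        x *= 2
    return None  # unreachable for any valid f and eps > 0

f = lambda x: min(1.0, 1.0 / x)
x = find_good_x(f, 0.1)
print(x, f(x) - f(2 * x))  # the gap f(x) - f(2x) at this x is below 0.1
```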

Comment author: 25 May 2012 03:21:12AM 0 points [-]

Yes, that all looks sensible. The point I'm trying to get at - the one I think Eliezer was gesturing towards - was that for any f and any epsilon, f(x) - f(2x) < epsilon for almost all x, in the formal sense. The next step is less straightforward - does it then follow that, prior to the selection of x, our expectation for getting the right answer is 50%? This seems to be Eliezer's implication. However, it seems also to rest on an infinite uniform random distribution, which I understand can be problematic. Or have I misunderstood?

Comment author: 25 May 2012 08:16:36AM 0 points [-]

That's called an improper prior. Eliezer mentions in the post that it was his first idea, but turned out to be irrelevant to the analysis.

Comment author: 25 May 2012 06:27:19PM 0 points [-]

So I guess we're back to square one, then.

Comment author: 25 May 2012 07:11:11PM *  0 points [-]

I don't understand. Which part are you still confused about? To me the whole thing seems quite clear.

Comment author: 25 May 2012 08:00:53PM 0 points [-]

How did Eliezer determine that the expected benefit of the algorithm over random chance is zero?

Comment author: 24 May 2012 10:05:52PM 0 points [-]

As for the deterministic variant, as you'd need some distribution from which the value of x is being selected, I'm not sure how best to calculate the EV of any particular scheme (whereas the nondeterministic algorithm sidesteps this by allowing a calculation of EV after X is selected). With a good prior, yeah, it'd be pretty simple, but without it becomes GAI-complete, yeah?

Comment author: 24 May 2012 10:06:41PM 1 point [-]

Eliezer is right that a randomized algorithm can't be Bayesian-optimal in this problem, so for any given prior there's a deterministic algorithm with even higher expected benefit

I think the claim is that there must be a deterministic algorithm that does at least as well as the randomized algorithm. The randomized algorithm is allowed to be optimal, just not uniquely optimal.
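The derandomization is easy to see on a toy prior: for each observable amount, compute the expected gain from switching, then switch deterministically exactly when that gain is positive. Since the strategy's expected benefit is linear in each f(x), this always does at least as well as any randomized f. A sketch with a made-up two-pair prior (the prior and names are mine):

```python
from collections import defaultdict

# Hypothetical prior over (smaller, larger) envelope pairs.
prior = {(1, 2): 0.5, (2, 4): 0.5}

# Unconditional expected gain from switching at each observed amount:
# each pair contributes probability p/2 of being opened on either side.
gain = defaultdict(float)
for (a, b), p in prior.items():
    for seen, other in [(a, b), (b, a)]:
        gain[seen] += 0.5 * p * (other - seen)

def expected_benefit(f):
    """EV of 'switch with probability f(x)' over never switching."""
    return sum(f(x) * gain[x] for x in gain)

randomized = lambda x: min(1.0, 1.0 / x)              # a Thrun-style f
derandomized = lambda x: 1.0 if gain[x] > 0 else 0.0  # switch iff gain > 0

print(expected_benefit(randomized), expected_benefit(derandomized))
```

On this prior the randomized strategy gains 0.25 and the deterministic one gains 0.5; the randomized one can only tie the best deterministic rule when its coin flips happen on amounts where switching is exactly break-even.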

Comment author: 24 May 2012 10:31:01PM 1 point [-]

Oh. Sorry. You're right.