# drnickbone comments on Sneaky Strategies for TDT - Less Wrong

8 25 May 2012 04:13PM

Comment author: 26 May 2012 11:16:06AM * 1 point

My understanding was that "this problem" constitutes randomly picking a single TDT agent, which would presumably also have been done in the simulation.

So that's another variant - in that interpretation you're correct that C-sim would hardly ever see the same source-code C-sim in its own instances of the problem. I think you are right here that the chance of winning rises to at least 55%; not sure yet if it's possible to do any better.

EDIT. I have a strategy for your variant which gives almost 100% chance for TDT winning the prize. The trick is that instead of each agent having a favourite number it has a least-favourite or "unlucky number" selected in a balanced way from the set {1,2,...,10}. Again consider a construction like SHA-256(C-act), reduce modulo 10 and then add 1. Here's how the strategy works:

``````
If C-sim has the same unlucky number as C-act then
    Pick the unlucky-numbered box with probability 1 - epsilon
    Pick the others with equal probability epsilon / 9 each
Else
    Pick the box with C-sim's unlucky number
End If
``````

It's quite easy to see that each C-act, if presented with multiple instances of the problem with different C-sim codes, will pick its own unlucky-numbered box slightly less often than any of the others. So the money is always in the box with C-sim's unlucky number. This gives C-act 9/10 + 1/10 x (1 - epsilon) or approx 100% chance of winning. CDT still has exactly 100% chance of winning, but the gap's negligible.
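For anyone who wants to check the arithmetic, here is a minimal Python sketch of the strategy above. The function names are made up for illustration, source code is assumed to be available as bytes, unlucky numbers are assumed to be uniform on 1..10, and the money is assumed to sit in the box with C-sim's unlucky number, as argued above.

```python
import hashlib
from fractions import Fraction

BOXES = range(1, 11)  # boxes numbered 1..10


def unlucky_number(source_code: bytes) -> int:
    """SHA-256 of the source, reduced modulo 10, plus 1 (a value in 1..10)."""
    digest = hashlib.sha256(source_code).digest()
    return int.from_bytes(digest, "big") % 10 + 1


def choice_distribution(act_unlucky, sim_unlucky, epsilon):
    """Probability that C-act picks each box, given both unlucky numbers."""
    if sim_unlucky == act_unlucky:
        return {b: (1 - epsilon) if b == act_unlucky else epsilon / 9 for b in BOXES}
    # Different unlucky numbers: pick C-sim's unlucky box with certainty.
    return {b: Fraction(int(b == sim_unlucky)) for b in BOXES}


def long_run_frequencies(act_unlucky, epsilon):
    """How often C-act picks each box when C-sim's unlucky number is uniform (assumed)."""
    freq = {b: Fraction(0) for b in BOXES}
    for sim in BOXES:
        for box, p in choice_distribution(act_unlucky, sim, epsilon).items():
            freq[box] += p / 10
    return freq


def win_probability(epsilon):
    """Average chance that C-act picks C-sim's unlucky box."""
    total = Fraction(0)
    for act in BOXES:
        for sim in BOXES:
            total += choice_distribution(act, sim, epsilon)[sim]
    return total / 100
```

With exact fractions, `win_probability(epsilon)` comes out as 1 - epsilon / 10, which matches 9/10 + 1/10 x (1 - epsilon), and `long_run_frequencies` puts the smallest frequency on the agent's own unlucky box.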

EDIT 2. I just realised we can get TDT to win with close to 100% probability in your variant. I've amended above...

Comment author: 26 May 2012 11:32:02PM 0 points

That's still not quite the formulation of the problem I was considering, though it seems valid. Admittedly, your formulation is closer to the original idea since it does say "multiple simulations", though I will note that the number of simulations has to be something like O(1/epsilon) for the difference to be noticeable.

My previous strategy was designed for a variant of the problem where Omega only simulates a single instance of the problem (and calculates the probabilities directly from the source code of C-sim).

Comment author: 27 May 2012 01:40:13PM * 1 point

Sorry, I misunderstood you then.

Does your variant look like this?

1. Omega selects C-sim at random from some distribution over TDT full source-codes.
2. Then Omega selects C-sim-sim at random from the same distribution.
3. Then Omega calculates what will happen if it presents the problem to C-sim, but specifying the simulation's full source code as C-sim-sim. Omega determines the probability of C-sim choosing each of the boxes, conditional on it having seen that fixed C-sim-sim.
4. Then Omega fills the box with lowest probability (of being chosen by C-sim) or uses the tie-break rule.
5. Finally Omega presents the real problem to C-act, but specifying the simulation's full source code as C-sim.

What is the best strategy for TDT to play as C-act?

If that is the problem, then consider the following. It still uses the "unlucky number" construction from the set {1, 2, ..., 10}. Each C-act will always choose its unlucky number with lowest probability, so the money is always in C-sim's unlucky number box.

``````
If C-sim has a different unlucky number from C-act then
    Pick C-sim's unlucky number with probability 1 - epsilon
    Pick C-act's unlucky number with probability 0
    Pick each of the other boxes with probability epsilon / 8
Else
    Pick the common unlucky number with probability 1/10 - epsilon
    Pick each other box with probability 1/10 + epsilon / 9
End If
``````

That looks like winning with probability 9/10 x (1 - epsilon) + 1/10 x (1/10 - epsilon) so close to 91%.
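The same kind of check works for this variant. This is only a sketch: it assumes the unlucky numbers of C-act, C-sim and C-sim-sim are independent and uniform on 1..10, and that 0 < epsilon < 1/10 so the least likely box is always unique and the tie-break rule never comes into play. The names `strategy`, `omega_fills` and `win_probability` are illustrative.

```python
from fractions import Fraction

BOXES = range(1, 11)  # boxes numbered 1..10


def strategy(own_unlucky, seen_unlucky, epsilon):
    """Distribution over boxes for an agent shown a simulation with `seen_unlucky`."""
    if seen_unlucky != own_unlucky:
        dist = {b: epsilon / 8 for b in BOXES}
        dist[seen_unlucky] = 1 - epsilon
        dist[own_unlucky] = Fraction(0)
        return dist
    dist = {b: Fraction(1, 10) + epsilon / 9 for b in BOXES}
    dist[own_unlucky] = Fraction(1, 10) - epsilon
    return dist


def omega_fills(sim_unlucky, simsim_unlucky, epsilon):
    """Box Omega fills: the one C-sim is least likely to choose (steps 3 and 4)."""
    dist = strategy(sim_unlucky, simsim_unlucky, epsilon)
    return min(dist, key=dist.get)


def win_probability(epsilon):
    """Chance that C-act picks the filled box, averaged over all three unlucky numbers."""
    total = Fraction(0)
    for act in BOXES:
        for sim in BOXES:
            for simsim in BOXES:
                prize = omega_fills(sim, simsim, epsilon)
                total += strategy(act, sim, epsilon)[prize]
    return total / 1000
```

Under those assumptions `omega_fills` always returns C-sim's own unlucky number, and `win_probability(epsilon)` works out to exactly 91/100 - epsilon.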

Is there a better strategy though?

P.S. We are getting some interesting behaviour here, with slight variations under the conditions for selecting C-sim and calculating its choice probabilities leading to very different best strategies (and different success probabilities such as 10%, 50%, 91% or close to 100%). Quite fascinating.

Comment author: 27 May 2012 02:38:37PM 0 points

Yeah, that's the problem I had in mind, and your "unlucky number" strategy definitely seems pretty solid in that case.