cousin_it comments on Prisoner's Dilemma Tournament Results - Less Wrong

101 Post author: prase 06 September 2011 12:46AM

Comments (170)

Comment author: Eliezer_Yudkowsky 06 September 2011 12:49:08AM 30 points

Variants I'd like to see:

1) You can observe rounds played by other bots.

2) You can partially observe rounds played by other bots.

3) (The really interesting one.) You get a copy of the other bot's source code and are allowed to analyze it. All bots have 10,000 instructions per turn, and if you run out of time the round is negated (both players score 0 points). There is a standard function for spending X instructions evaluating a piece of quoted code, and if the evaled code tries to eval code, it asks the outer eval-ing function whether it should simulate faithfully or return a particular value. (This enables you to say, "Simulate my opponent, and if it tries to simulate me, see what it will do if it simulates me outputting Cooperate.")
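A minimal sketch of how variant 3 might look, under several assumptions that are not in the comment: bots are plain Python callables rather than quoted source, one "instruction" is charged per eval call, and the names `make_eval`, `FAITHFUL`, `probe_bot` and `play_round` are made up for illustration. The hook passed as `on_nested` is consulted whenever the simulated bot itself calls eval, and either substitutes a value or lets the simulation proceed faithfully.

```python
FAITHFUL = object()  # sentinel: "simulate this nested eval faithfully"


class OutOfInstructions(Exception):
    """Raised when the shared instruction budget is exhausted."""


def make_eval(budget, outer_hook=None):
    """Build an eval function. `budget` is a one-element list so that
    nested evals draw from the same pool (assumption: 1 unit per call)."""
    def ev(target, against, on_nested=None):
        budget[0] -= 1
        if budget[0] < 0:
            raise OutOfInstructions()
        # The outer eval-ing function decides what this nested call returns.
        if outer_hook is not None:
            answer = outer_hook(target, against)
            if answer is not FAITHFUL:
                return answer
        # Faithful simulation: evals made by `target` are governed by
        # the hook supplied by whoever asked for this simulation.
        return target(against, make_eval(budget, on_nested))
    return ev


def defect_bot(opponent, ev):
    return "D"


def mirror_bot(opponent, ev):
    # Plays whatever the opponent is simulated to play against it.
    return ev(opponent, mirror_bot)


def probe_bot(opponent, ev):
    # "Simulate my opponent, and if it tries to simulate me,
    # see what it will do if it simulates me outputting Cooperate."
    def hook(target, against):
        return "C" if target is probe_bot else FAITHFUL
    return ev(opponent, probe_bot, on_nested=hook)


def play_round(a, b, limit=10_000):
    """Return both moves, or None if the round is negated."""
    try:
        return a(b, make_eval([limit])), b(a, make_eval([limit]))
    except (OutOfInstructions, RecursionError):
        return None
```

In this sketch `probe_bot` cooperates with `mirror_bot` without looping forever, while two copies of `mirror_bot` simulate each other until the budget runs out and the round is negated.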

Comment author: cousin_it 06 September 2011 08:25:16AM * 12 points

> if you run out of time the round is negated (both players score 0 points).

This makes the game matrix bigger for no reason. Maybe replace this with "if you run out of time, you automatically defect"?
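A rough sketch of the "timeout means defect" rule, assuming bots are written as Python generators that `yield` once per unit of work and `return` their move; `run_with_default` and both example bots are hypothetical names. Every run ends in either "C" or "D", so the payoff matrix stays 2x2 instead of gaining a third "timed out" row and column.

```python
def run_with_default(bot, limit, default="D"):
    """Drive a generator-based bot for at most `limit` steps.
    If it has not returned a move by then, it is treated as defecting."""
    gen = bot()
    for _ in range(limit):
        try:
            next(gen)
        except StopIteration as done:
            return done.value  # the bot finished and returned its move
    return default  # out of time: automatic defection


def quick_cooperator():
    yield  # one unit of work
    return "C"


def slow_thinker():
    while True:  # never reaches a decision
        yield
```

Here `run_with_default(quick_cooperator, 10)` yields the bot's own move, while `run_with_default(slow_thinker, 10)` falls back to "D".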

> There is a standard function for spending X instructions evaluating a piece of quoted code, and if the evaled code tries to eval code, it asks the outer eval-ing function whether it should simulate faithfully or return a particular value.

Haha, this incentivizes players to reimplement eval themselves and avoid your broken one! One way to keep eval as a built-in would be to make code an opaque blob that can be analyzed only by functions like eval. I suggested this version a while ago :-)
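One way the opaque-blob idea could be sketched, with all names (`Blob`, `seal`, `builtin_eval`, the `fuel` counter) invented for illustration: bots only ever receive handles that carry no code, so the built-in eval is the only way to learn anything about an opponent. Plain Python cannot really enforce this (the registry is still reachable), so a real tournament would need a proper sandbox.

```python
_registry = {}  # handle -> bot function; assumed hidden from the bots


class Blob:
    """Opaque handle: it has no attributes and exposes no source."""
    __slots__ = ()


def seal(bot):
    """Register a bot and hand back an opaque handle for it."""
    blob = Blob()
    _registry[blob] = bot
    return blob


def builtin_eval(blob, against, fuel=10):
    """Run the bot behind `blob` against `against`.
    `fuel` is a stand-in for the instruction budget."""
    if fuel <= 0:
        raise RuntimeError("out of instructions")

    def inner(b, a):
        return builtin_eval(b, a, fuel - 1)

    return _registry[blob](blob, against, inner)


def cooperator(me, opponent, ev):
    return "C"


def defector(me, opponent, ev):
    return "D"


def copycat(me, opponent, ev):
    # Cannot read the opponent's code; can only ask eval about it.
    return ev(opponent, me)


COOPERATOR = seal(cooperator)
DEFECTOR = seal(defector)
COPYCAT = seal(copycat)
```

Since `copycat` never sees source text, it has nothing to feed a reimplemented eval, which is the point of keeping the blob opaque.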

Comment author: philh 06 September 2011 12:08:35PM 1 point

I may be misreading, but I don't see how Eliezer's eval is broken. It can choose between a faithful evaluation and one in which inner calls to eval are replaced with a given value. That's more powerful than a standard eval.

Comment author: DavidLS 06 September 2011 12:32:45PM 10 points

If you build your own eval, and it returns a different result than the built-in eval, you would know you're being simulated.
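A toy illustration of that detection trick, assuming bots are Python callables and the built-in eval can be overridden by an outer simulator; `make_builtin_eval`, `homemade_eval`, `looks_simulated` and `exploit_bot` are hypothetical names. The homemade eval ignores the override, so any disagreement between the two reveals the simulation.

```python
def make_builtin_eval(override=None):
    """Built-in eval. If an outer simulator supplied `override`,
    every call returns that value instead of simulating faithfully."""
    def builtin_eval(target, against):
        if override is not None:
            return override
        return target(against, make_builtin_eval())
    return builtin_eval


def homemade_eval(target, against):
    # The bot's own reimplementation: no outer simulator can tamper with it.
    return target(against, make_builtin_eval())


def defect_bot(opponent, ev):
    return "D"


def looks_simulated(opponent, me, ev):
    """True when the built-in eval disagrees with the homemade one."""
    return ev(opponent, me) != homemade_eval(opponent, me)


def exploit_bot(opponent, ev):
    # Look cooperative whenever someone seems to be watching.
    if looks_simulated(opponent, exploit_bot, ev):
        return "C"
    return "D"
```

The check only fires when the substituted value differs from the true one, so a simulator that happens to substitute the opponent's real move goes unnoticed in this sketch.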

Comment author: abramdemski 14 September 2012 06:10:24PM 0 points

> This makes the game matrix bigger for no reason. Maybe replace this with "if you run out of time, you automatically defect"?

Slightly better: allow the program to set the output at any time during execution (so it can set its own time-out value before trying expensive operations).
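A small sketch of the "set the output at any time" rule, again assuming generator-based bots that `yield` once per unit of work; `Output`, `run_with_settable_output` and `cautious_bot` are made-up names, and the fallback move for a bot that never sets anything is an assumption the comment does not specify.

```python
class Output:
    """Mutable slot a bot can write its current answer into."""
    def __init__(self):
        self.move = None

    def set(self, move):
        self.move = move


def run_with_settable_output(bot, limit, fallback="D"):
    """Run the bot for at most `limit` steps and use whatever move
    it most recently recorded when it stops or times out."""
    out = Output()
    gen = bot(out)
    for _ in range(limit):
        try:
            next(gen)
        except StopIteration:
            break
    return out.move if out.move is not None else fallback


def cautious_bot(out):
    out.set("D")  # safe answer in case the expensive step times out
    for _ in range(1000):  # stand-in for an expensive simulation
        yield
    out.set("C")  # upgrade the answer once the analysis finishes
```

With a budget of 10 steps `cautious_bot` ends on its provisional "D"; with 2000 steps it finishes the analysis and ends on "C".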