orthonormal comments on Decision Theories, Part 3.5: Halt, Melt and Catch Fire - Less Wrong

Post author: orthonormal | 26 August 2012 10:40PM | 31 points




Comment author: orthonormal 29 August 2012 01:20:02AM 0 points

Now, you may say that it is throwing away free points by cooperating with CooperateBot.

Indeed, I do say this. I'm looking to formalize an agent that does the obviously correct things against CooperateBot, DefectBot, FairBot, and itself, without being exploitable by any opponent (i.e. the opponent cannot do better than a certain Nash equilibrium unless the agent does better too). Anything weaker than that simply doesn't interest me.
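For concreteness, the opponents named above can be sketched as simple one-shot Prisoner's Dilemma bots. This is a toy model of my own, not orthonormal's formalism: the real FairBot uses a Löbian provability check, which I approximate here with bounded mutual simulation; the `depth` budget and the optimistic default at depth zero are my assumptions.

```python
# Toy one-shot Prisoner's Dilemma bots. Each takes the opponent as a
# callable plus a simulation budget, and returns 'C' or 'D'.

def cooperate_bot(opponent, depth=3):
    """Cooperates unconditionally."""
    return 'C'

def defect_bot(opponent, depth=3):
    """Defects unconditionally."""
    return 'D'

def fair_bot(opponent, depth=3):
    """Cooperates iff the (simulated) opponent cooperates back.

    Stand-in for the provability-based FairBot: bounded recursion
    that defaults to 'C' when the budget runs out, so that FairBot
    succeeds in cooperating with itself.
    """
    if depth == 0:
        return 'C'
    return 'C' if opponent(fair_bot, depth - 1) == 'C' else 'D'

print(fair_bot(cooperate_bot))  # 'C' -- FairBot "throws away points" here
print(fair_bot(defect_bot))     # 'D'
print(fair_bot(fair_bot))       # 'C'
```

Note that FairBot itself cooperates with CooperateBot, which is exactly the "throwing away free points" behavior at issue; the sought-after agent would instead defect there while still cooperating with FairBot and with itself.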

One reason to care about performance against CooperateBot is that playing correctly against constant strategies is equivalent to winning at one-player games. Rational agents should win, especially if there are no other agents in the picture!

Comment author: OrphanWilde 29 August 2012 02:50:44PM 2 points

"Always defect" meets your criteria. You missed a criterion: that it wouldn't miss out on a cooperation it could achieve if its strategy were different.

Your agent will have to consider not only its current opponent but every opponent currently in the game. (If Masquerade played in a larger game full of my reflective agents, it would consistently -lose-: it would choose a mask it thinks they will cooperate with, and they would find the mask it would use against CooperateBot and conclude it would defect against them.)

Therefore, in a game full of my reflective agents and one CooperateBot, your agent should choose to cooperate with CooperateBot, because otherwise my reflective agents will always defect against it.

Comment author: orthonormal 30 August 2012 09:55:09PM 0 points

"Always defect" meets your criteria.

I said that it needed to cooperate with FairBot and with itself (and not for gerrymandered reasons the way that CliqueBots do).

You missed a criterion: that it wouldn't miss out on a cooperation it could achieve if its strategy were different.

This is impossible to guarantee without thereby being exploitable. Say that there are agents which cooperate iff their opponent cooperates with DefectBot: the only way to win their cooperation is to throw away points against DefectBot, which is precisely the exploitability I ruled out.
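The tension can be made concrete with a toy construction of my own (all of these agent names are hypothetical, not from the thread): an agent that rewards cooperation-with-DefectBot forces any opponent to choose between missing that cooperation and being exploitable.

```python
# Toy one-shot Prisoner's Dilemma agents; each takes the opponent as a
# callable and returns 'C' or 'D'.

def defect_bot(opponent):
    return 'D'

def perverse_bot(opponent):
    # Cooperates iff the opponent would cooperate against DefectBot.
    return 'C' if opponent(defect_bot) == 'C' else 'D'

def exploitable_agent(opponent):
    # Cooperates with everyone, including DefectBot: it wins
    # PerverseBot's cooperation, but DefectBot exploits it.
    return 'C'

def prudent_agent(opponent):
    # Defects against DefectBot: unexploitable there, but it
    # thereby misses out on PerverseBot's cooperation.
    return 'D' if opponent is defect_bot else 'C'

print(perverse_bot(exploitable_agent))  # 'C' -- cooperation achieved...
print(defect_bot(exploitable_agent))    # 'D' -- ...at the cost of exploitation
print(perverse_bot(prudent_agent))      # 'D' -- cooperation missed
```

No single strategy gets both outcomes, which is the sense in which the "never miss an achievable cooperation" criterion conflicts with unexploitability.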

Part of the trick is figuring out exactly what kind of optimality we're looking for, and I don't have a good definition. But I tend to think of agents like the ones you defined as "not really trying to win according to their stipulated payoff matrix", and so I'm less worried about optimizing my strategy for them. Those sorts of considerations do factor into UDT thinking, though, which adds entire other layers of confusion.