Vladimir_Nesov comments on Decision Theory Paradox: Answer Key - Less Wrong Discussion
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Comments (10)
Its goal is still different, though (if we restore some missing pieces): it wants to game the probabilities of encountering certain opponents so that a single round containing a TDT agent delivers the most reward. It just so happens that getting rid of DefectBots serves this purpose, but if the opponents were CooperateBots, it looks like the TDT agents would drive themselves to extinction (or farm the opponents) to maximize the expected number of cooperating opponents they can defect against (in each round that contains a TDT agent). (I didn't check this example carefully, so I could be wrong, but the principle it exemplifies seems to hold.)
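To make the principle concrete, here is a rough toy sketch and not the setup from the original post. Everything in it is an assumption: the payoff numbers, the function names, the idea that a TDT agent's opponent is drawn at random with probability equal to the TDT fraction of the population, and the behavior that TDT cooperates with TDT but defects against both kinds of bot. It computes the expected payoff to a TDT agent in a round, given that the agent is in that round, and looks at which population mix maximizes it.

```python
# Hypothetical Prisoner's Dilemma payoffs (temptation > reward > punishment).
# These numbers are illustrative assumptions, not taken from the post.
TEMPTATION, REWARD, PUNISHMENT = 5, 3, 1


def tdt_expected_payoff(tdt_fraction, opponent_kind):
    """Expected payoff to a TDT agent in one round, given it is in the round.

    Assumes the opponent is another TDT agent with probability tdt_fraction
    (mutual cooperation) and a bot of opponent_kind otherwise (TDT defects).
    """
    if opponent_kind == "CooperateBot":
        payoff_vs_bot = TEMPTATION   # TDT defects against a cooperator
    elif opponent_kind == "DefectBot":
        payoff_vs_bot = PUNISHMENT   # mutual defection
    else:
        raise ValueError(f"unknown opponent kind: {opponent_kind}")
    return tdt_fraction * REWARD + (1 - tdt_fraction) * payoff_vs_bot


def best_tdt_fraction(opponent_kind):
    """TDT fraction on a coarse grid that maximizes the per-round payoff."""
    fractions = [i / 10 for i in range(11)]
    return max(fractions, key=lambda f: tdt_expected_payoff(f, opponent_kind))


for kind in ("DefectBot", "CooperateBot"):
    print(kind, best_tdt_fraction(kind))
```

Under these assumptions the payoff is 1 + 2p against DefectBots and 5 - 2p against CooperateBots, where p is the TDT fraction. So the per-round objective pushes p toward 1 in the first case (eliminate DefectBots) and toward 0 in the second (TDT agents become rare while CooperateBots are farmed). The p = 0 endpoint is only a limit, since no TDT agent is left to collect the payoff there.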
That's a seriously sick idea, but there doesn't seem to be a way to both set up such a favorable matchup and exploit it. Is there?