AlephNeil comments on Decision Theories: A Semi-Formal Analysis, Part I - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Comments (90)
Your intuition that it gets deduced before any of the spurious claims like "if output = 'defect' then utility <= -$1" is taking advantage of an authoritative payoff matrix that X can't safely calculate xerself. I'm not sure that this tweaked version is any safer from exploitation...
Why not? Can't the payoff matrix be "read off" from the "world program" (assuming X isn't just 'given' the payoff matrix as an argument.)
The one-player game that I wrote out is an example of a NDT agent trying to read off the payoff matrix from the world program, and failing. There are ways to ensure you read off the matrix correctly, but that's tantamount to what you do to implement CDT, so I'll explain it in Part II.