orthonormal comments on Decision Theories: A Semi-Formal Analysis, Part I - Less Wrong

21 Post author: orthonormal 24 March 2012 04:01PM

Comment author: orthonormal 24 March 2012 08:40:11PM 0 points

> Note that our agent will quickly prove "if output = 'defect' then utility >= $1".

Your intuition that it gets deduced before any of the spurious claims like "if output = 'defect' then utility <= -$1" relies on an authoritative payoff matrix that X can't safely calculate xerself. I'm not sure that this tweaked version is any safer from exploitation...

Comment author: AlephNeil 24 March 2012 09:15:44PM * 0 points

> an authoritative payoff matrix that X can't safely calculate xerself.

Why not? Can't the payoff matrix be "read off" from the "world program" (assuming X isn't just 'given' the payoff matrix as an argument)?

Comment author: orthonormal 24 March 2012 11:29:57PM 0 points

The one-player game that I wrote out is an example of an NDT agent trying to read off the payoff matrix from the world program, and failing. There are ways to ensure you read off the matrix correctly, but that's tantamount to what you do to implement CDT, so I'll explain it in Part II.
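
To make the "spurious claims" in this thread concrete, here is a minimal Python sketch of why a statement like "if output = 'defect' then utility <= -$1" can be true at all. It is not the one-player game from the post, and it does no proof search; it only evaluates material implications against a toy world. The `world` function, its payoffs ($0 for cooperating, $1 for defecting), and the helper names are assumptions made up for illustration.

```python
ACTIONS = ("cooperate", "defect")


def world(action):
    """Hypothetical world program: utility in dollars for the agent's output."""
    return {"cooperate": 0, "defect": 1}[action]


def implies(p, q):
    """Material implication: false only when p is true and q is false."""
    return (not p) or q


def claim_holds(actual_output, action, predicate):
    """Truth value of 'if output == action then predicate(utility)'
    in the world where the agent actually outputs `actual_output`."""
    utility = world(actual_output)
    return implies(actual_output == action, predicate(utility))


def at_least_one(u):
    return u >= 1


def at_most_minus_one(u):
    return u <= -1


for actual in ACTIONS:
    honest = claim_holds(actual, "defect", at_least_one)
    spurious = claim_holds(actual, "defect", at_most_minus_one)
    print(f"actual output = {actual!r}: "
          f"'defect -> utility >= 1' is {honest}, "
          f"'defect -> utility <= -1' is {spurious}")
```

If the agent in fact cooperates, both claims about defecting hold vacuously, so nothing in the logic alone stops an agent from proving the spurious one first and then acting on it. Only when the agent actually defects does the spurious claim come out false.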