loup-vaillant comments on Prisoner's Dilemma (with visible source code) Tournament - Less Wrong

47 Post author: AlexMennen 07 June 2013 08:30AM

Comments (232)

Comment author: bogdanb 15 June 2013 07:48:12PM *  1 point

Oh, OK. In that case, what you are trying to achieve is (theoretically) boxing a (potential) AGI, without a gatekeeper. Which is kind of overkill in this case, and wouldn’t be solved with a choice of language anyway :)

Comment author: loup-vaillant 16 June 2013 09:37:19PM 4 points

I think it is possible to prove that a given boxing works, if it's sufficiently simple. Choosing the language isn't enough, but choosing the interpreter should be.

Take Brainfuck, for instance. Replace the dot ('.'), which prints a character, with two new statements: one that prints "yes" and exits, and one that prints "no" and exits. If the interpreter has no bugs, a program can only:

  • Print "yes" and kill itself.
  • Print "no" and kill itself.
  • Do nothing until we kill it, or otherwise fail.
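To make the three outcomes concrete, here is a rough sketch of such an interpreter in Python. Several things in it are assumptions of the sketch, not part of the proposal above: the opcode letters `Y` and `N` for the two new statements, the name `run_boxed`, the step budget `max_steps` standing in for "until we kill it", the wrap-around tape, and treating ',' (input) as a no-op, since input isn't discussed here. It returns the answer instead of printing it, which is the same one-bit channel.

```python
def run_boxed(program, max_steps=100_000, tape_len=30_000):
    """Run a restricted Brainfuck program.

    Returns "yes", "no", or None (killed, or failed to answer).
    'Y' and 'N' are hypothetical opcodes standing in for the two
    statements that replace '.'.
    """
    # Match brackets up front; an unbalanced program counts as a failure.
    jumps, stack = {}, []
    for i, op in enumerate(program):
        if op == "[":
            stack.append(i)
        elif op == "]":
            if not stack:
                return None
            j = stack.pop()
            jumps[i], jumps[j] = j, i
    if stack:
        return None

    tape = [0] * tape_len
    ptr = pc = 0
    for _ in range(max_steps):
        if pc >= len(program):
            return None  # ran off the end without answering
        op = program[pc]
        if op == "Y":
            return "yes"
        if op == "N":
            return "no"
        if op == ">":
            ptr = (ptr + 1) % tape_len  # assumption: the tape wraps around
        elif op == "<":
            ptr = (ptr - 1) % tape_len
        elif op == "+":
            tape[ptr] = (tape[ptr] + 1) % 256
        elif op == "-":
            tape[ptr] = (tape[ptr] - 1) % 256
        elif op == "[" and tape[ptr] == 0:
            pc = jumps[pc]  # skip the loop body
        elif op == "]" and tape[ptr] != 0:
            pc = jumps[pc]  # jump back to the loop start
        # Everything else, including '.' and ',', is ignored:
        # the program has no other way to talk to the outside.
        pc += 1
    return None  # step budget exhausted: this is us killing it
```

For example, `run_boxed("+[-]Y")` answers "yes", `run_boxed("N")` answers "no", and a program that loops forever such as `run_boxed("+[]", max_steps=1000)` comes back as `None`. The point is that the whole thing is small enough that one could plausibly check every branch by hand.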

Assuming the AI doesn't control the external world by heating the host Intel processor in a smart way, we should be able to prove that we're otherwise safe.