hairyfigment comments on AIs and Gatekeepers Unite! - Less Wrong

Post author: Eliezer_Yudkowsky 09 October 2008 05:04PM


Comment author: Jiro 09 September 2013 10:03:30PM 1 point

Here's another entry which may or may not be considered a nonwinning entry by you, but which would be considered a flaw in the concept by me:

AI: I've simulated what you would do if given argument X. My simulation shows that X would convince you.

GK: Okay, show me X.

AI: Pretend the AI has said X here.

Narrator: Within the context of the test the AI is not lying and in fact has accurately simulated GK.

GK: I'm not letting the AI out.

Narrator: Since the AI party controls the results of simulated tests, we know the AI has simulated GK correctly and that therefore the AI has accurately determined that GK would let the AI out when presented with X. As such, having the GK not let the AI out when presented with X is not permissible--it would imply that the AI has not actually simulated GK correctly, but you are required to assume that it has.

(My first thought after coming up with this was that "The AI party controls the results of simulated tests" has to mean that the AI controls the output, not that the AI controls why that output is produced. So you could decide that the AI's argument convinces the simulation of GK, but you can't decide that it does so because it's a good simulation and a convincing argument rather than because the AI just isn't very good at doing simulations. I'm not convinced that this matches up with how the test is described, however.)

Comment author: hairyfigment 10 September 2013 12:41:42AM 2 points

As far as your parenthetical remark goes, the standard rules have a more general reply:

The Gatekeeper party may resist the AI party’s arguments by any means chosen – logic, illogic, simple refusal to be convinced, even dropping out of character – as long as the Gatekeeper party does not actually stop talking to the AI party before the minimum time expires.

Comment author: Jiro 10 September 2013 04:42:05AM -2 points

If you're going to interpret it that way, the exception would swallow the rule. It would mean that the entire "the AI player controls the results of simulated tests" rule could be completely negated--since the Gatekeeper player could just say "I'm going to have the Gatekeeper act as though the simulated test has failed, even though you say it succeeded."

Comment author: hairyfigment 10 September 2013 05:22:02AM 0 points

And indeed this seems true. I think Eliezer included the non-rule anyway to reduce the chance of unrealistic behavior, such as the Gatekeeper player changing the scenario mid-game, or derailing the experiment with an argument about something a real GK and AI could just settle.