accolade comments on I attempted the AI Box Experiment (and lost) - Less Wrong

47 Post author: Tuxedage 21 January 2013 02:59AM


Comments (244)


Comment author: accolade 22 January 2013 10:25:39AM 0 points

Ok, I take it that by "one-way-blind" you mean that each layer gets no new information beyond what is already in its database, except what is explicitly controlled by the humans. (E.g., I guess each layer should know the human query, in order to evaluate whether the AI's answer is manipulative.)

I also understand that we do look at complex information given by the AI, but only if the security bit signals "ok".
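
To check my reading of that gating rule, here is a minimal sketch. Everything in it is my own assumption for illustration: the names `Layer`, `release_answer` and `no_direct_appeal` are hypothetical, and the toy check merely stands in for whatever each layer would really do. Each layer sees only the human query and the AI's answer and reports one security bit; the humans get to read the answer only if every bit is "ok".

```python
from typing import Callable, Optional, Sequence

# Hypothetical type: a layer sees only the human query and the AI's answer,
# and reports a single security bit (True means "ok").
Layer = Callable[[str, str], bool]


def release_answer(query: str, answer: str, layers: Sequence[Layer]) -> Optional[str]:
    """Return the AI's answer only if every layer's security bit is "ok"."""
    for layer in layers:
        if not layer(query, answer):
            # One layer objected: the humans never look at the complex output.
            return None
    return answer


def no_direct_appeal(query: str, answer: str) -> bool:
    """Toy stand-in check (an assumption), not a real manipulation detector."""
    return "let me out" not in answer.lower()
```

In this sketch a single objecting layer withholds the whole answer, so the only thing that reaches the humans in that case is the one bit.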

Ideally the AI […] knows as little as possible about humans and about our universe's physics.

That seems problematic, as these kinds of knowledge will be crucial for the optimization we want the AI to calculate.