Shouldn't you try to immunize them against at least some of the strategies AIs could conceivably discover independently?
I don't think reading a few logs would immunize someone. If you wanted to immunize someone I would suggest a few years of therapy with a good psychologist to work through any trauma's that exist in that person's life and the existential questions.
I would add many hours in meditation to have learn to have control over your own mind.
You could train someone to precommit and build emotional endurance. If someone can take highly addictive drugs and has a enough control over his own mind to refuse them when put a few hours alone in a room with them I would trust them more to stay emotionally stable in front of an AI.
You could also require gatekeepers to have played the AI role in the experiment a few times.
You might also look into techniques that the military teaches soldiers to resist torture.
But even with all these safety measures it's still dangerous.
Summary
Furthermore, in the last thread I have asserted that
It would be quite bad for me to assert this without backing it up with a victory. So I did.
First Game Report - Tuxedage (GK) vs. Fjoelsvider (AI)
Second Game Report - Tuxedage (AI) vs. SoundLogic (GK)
Testimonies:
State of Mind
Post-Game Questions
$̶1̶5̶0̶$300 for any subsequent experiments regardless of outcome, plus an additional$̶1̶5̶0̶$450 if I win. (Edit: Holy shit. You guys are offering me crazy amounts of money to play this. What is wrong with you people? In response to incredible demand, I have raised the price.) If you feel queasy about giving me money, I'm perfectly fine with this money being donating to MIRI. It is also personal policy that I do not play friends (since I don't want to risk losing one), so if you know me personally (as many on this site do), I will not play regardless of monetary offer.Advice
These are tactics that have worked for me. I do not insist that they are the only tactics that exists, just one of many possible.
Playing as Gatekeeper
Playing as AI
Ps: Bored of regular LessWrong? Check out the LessWrong IRC! We have cake.