if I really wanted to fake the experiment in order to convince people about the dangers of failing gatekeepers, wouldn't it be better for me to say I had won?
A win is a stronger claim; a tight loss is a more believable one. There is a tradeoff to be made, and it is not a priori clear which variant serves the goal better.
Update 2013-09-05:
I have since played two more AI box experiments, winning both.
Update 2013-12-30:
I have lost two more AI box experiments and won two more. My current record is 3 wins, 3 losses.