Log in to IRC as "Boxed_AI" and "AI_Gatekeeper". Conduct experiment. Register a throw-away LessWrong account. Post log. Have the Gatekeeper post with their normal account, confirming the validity.
That at least anonymizes the Boxed_AI, who is (I presume) the player worried about repercussions. I wouldn't expect the AI to have a similar-enough style to really give away who it was, although the gatekeeper is probably impossible to anonymize because a good AI will use who-they-are as part of their technique :)
Gatekeeper could threaten to deanonymize the AI. Or is the gatekeeper not supposed to be actively fighting back?
Update 2013-09-05:
I have played two more AI box experiments since this one, winning both.
Update 2013-12-30:
I have lost two more AI box experiments and won two more. Current record is 3 wins, 3 losses.