Maybe the scarier question isn't whether we can stop our AIs from blackmailing us, but whether we want to. If the AI has an opportunity to blackmail Alice for a dollar in order to save Bob from some suffering, do we want the AI to do that, or to let Bob suffer? Eliezer seems to think it's obvious that we don't want our FAI to use certain tactics, but I'm not sure why he thinks that.
I think I've found a new argument, which I'll call X, against Paul Christiano's "indirect normativity" approach to FAI goals. I just discussed X with Paul, who agreed that it's serious.
This post won't describe X in detail, because it's based on basilisks, which are a forbidden topic on LW, and I respect Eliezer's requests even when I disagree with them. If you understand Paul's idea and understand basilisks, figuring out X should take you about five minutes (there's only one obvious way to combine the two ideas), so you might as well do it now. If you decide to discuss X here, please try to follow the spirit of LW policy.
In conclusion, I'd like to ask Eliezer to rethink his position on secrecy. If more LWers understood basilisks, somebody might have come up with X earlier.