If you want to build a complex, malleable AI and have some guarantees about what it will do (rather than just watch it fail in some creative way), think of simple properties you can prove about your code, and then try to prove them using Coq or another theorem prover.
If you can't think of any properties that you want to hold for your system, think more.
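To give a flavour of what a "simple property" might look like, here is a minimal sketch. It uses Lean 4 rather than Coq, purely because that is what I picked for illustration; the same statement is equally easy to write in Coq. The names `clampAction` and `limit` are hypothetical and not taken from any real system: the idea is just a function that caps some output, plus a machine-checked proof that the cap always holds.

```lean
-- Hypothetical example: `clampAction` and `limit` are illustrative names only.
-- The function caps a requested value `x` at `limit`.
def clampAction (limit x : Nat) : Nat :=
  if x ≤ limit then x else limit

-- The property we want to hold: the result never exceeds `limit`,
-- whatever value was requested.
theorem clampAction_le (limit x : Nat) : clampAction limit x ≤ limit := by
  unfold clampAction
  split
  · -- Case x ≤ limit: the result is x, and the hypothesis closes the goal.
    assumption
  · -- Case x > limit: the result is limit itself.
    exact Nat.le_refl limit
```

A property this small says nothing about whether the overall system is safe, of course. The point is only that the guarantee is checked by the prover for every input, rather than tested on a few.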
A friend of mine is about to launch himself heavily into the realm of AI programming. The details of his approach aren't important; probabilities dictate that he is unlikely to score a major success. He's asked me for advice, however, on how to design a safe(r) AI. I've been pointing him in the right direction and sending him links to useful posts on this blog and from the SIAI.
Do people here have any recommendations they'd like me to pass on? Hopefully, these may form the basis of a condensed 'warning pack' for other AI makers.
Addendum: Advice along the lines of "don't do it" is vital and good, but unlikely to be followed. Coding will almost certainly happen; is there any way of making it less genocidally risky?