Wei, do you know of any arguments better than XiXiDu's that a failed attempt at FAI could very well be worse than extinction?
I'm not aware of an especially good writeup, but here's a general argument. Any attempt to build an AGI induces a distribution of possible outcomes. Specifically, the distribution induced by an attempt at FAI can be thought of as a circle of uncertainty around an FAI in design space. AGIs that cause worse-than-extinction outcomes are clustered around FAIs in design space. So an attempt at FAI may be more likely to hit one of these worse-than-extinction AGIs than an attempt to build an AGI without consideration of Friendliness.
AGIs that cause worse-than-extinction outcomes are clustered around FAIs in design space.
Yes, that's the part I'd like to see developed more. Maybe SI or FHI will get around to it eventually, but in the meantime I wouldn't mind somebody like Wei Dai taking a crack at it.
Previously: round 1, round 2, round 3
From the original thread:
Ask away!