"Fascinating! You should definitely look into this. Fortunately, my own research has no chance of producing a super intelligent AGI, so I'll continue. Good luck son! The government should give you more money."
Stuart Armstrong, paraphrasing a typical AI researcher
I forgot to mention in my last post why "AI risk" might be a bad phrase even for denoting the problem of UFAI. It brings to mind analogies like physics catastrophes or astronomical disasters, and lets AI researchers think their work is fine as long as it has little chance of immediately destroying Earth. But the real problem we face is how to build or become a superintelligence that shares our values, and given that this seems very difficult, any progress that doesn't contribute to the solution but brings forward the date by which we must solve it (or be stuck with something very suboptimal even if it doesn't kill us) is bad. The word "risk" connotes a small chance of something bad suddenly happening, but slow, steady progress toward losing the future is just as worrisome.
The usual way of stating the problem also invites lots of debates that are largely beside the point (as far as determining how serious the problem is), like whether an intelligence explosion is possible, whether a superintelligence can have arbitrary goals, or how sure we are that a non-Friendly superintelligence would destroy human civilization. Someone who wants to question the importance of facing this problem really needs to argue instead that a superintelligence isn't possible (not even a modest one), or that the future will turn out to be close to the best possible just by everyone pushing forward their own research without any concern for the big picture, or perhaps that we really don't care very much about the far future and distant strangers and should pursue AI progress just for the immediate benefits.
(This is an expanded version of a previous comment.)
This is similar in spirit to my complaint about the focus on intelligence explosion. Your framing, though, requires accepting a consequentialist optimization view of decision making (so that "good enough" doesn't count as good enough if it's possible to do better), and accepting that there is a significant difference between the better outcomes and the "default" outcomes.
For this, a high risk of indifferent UFAI is a strong argument (when accepted), which in turn depends on the orthogonality of values and optimization power. So while it's true that this particular argument doesn't have to hold for the problem to remain serious, it looks like one of the best available arguments for the seriousness of the problem. (It also has the virtue of describing relatively concrete features of the possible future.)
That said, I agree that there should exist a more abstract argument for ensuring the control of human value over the future, one that doesn't depend on any particular scenario. It's harder to make this argument convincing, since it seems to depend on accepting some decision-theoretic/epistemic/metaethical background. And just as it's not necessary to accept the plausibility of intelligence explosion, it's not necessary to accept this argument either (if you accept other arguments, such as intelligence explosion). So it seems to me that the current focus should be on making a good case for the strongest arguments, but once the low-hanging fruit on those arguments is collected, it might be a good idea to develop this more general case as well.
Yes, agreed. On the other hand, it may especially appeal to some AI researchers who seem really taken with the notion of optimality. :)