misterbailey comments on Debunking Fallacies in the Theory of AI Motivation - Less Wrong

Post author: Richard_Loosemore 05 May 2015 02:46AM


Comment author: Richard_Loosemore 18 May 2015 02:16:18AM 0 points

Let me try to reduce some possible confusion here. The paper was written to address a category of "AI Risk" scenarios in which we are told:

"Even if the AI is programmed with goals that are ostensibly favorable to humankind, it could execute those goals in such a way that would lead to disaster".

Given that premise, it would be a bait-and-switch if, when I proposed a fix for this problem, someone objected with "But you cannot ASSUME that the programmers would implement that fix!"

The whole point of the problem under consideration is that even if the engineers tried, they could not get the AI to stay true to those goals.

Comment author: misterbailey 18 May 2015 09:16:17AM 1 point

Yudkowsky et al. don't argue that the problem is unsolvable, only that it is hard. In particular, Yudkowsky fears it may be harder than creating AI in the first place, which would mean that in the natural evolution of things, UFAI appears before FAI. However, I needn't factor what I'm saying through the views of Yudkowsky. To make an even more modest claim: we don't have to believe that FAI is hard even in hindsight in order to claim that AI will be unfriendly unless certain failure modes are guarded against. On this view of the FAI project, a large part of the effort is simply noticing the possible failure modes that are only obvious in hindsight, and convincing people that the problem is important and won't solve itself.

Comment author: TheAncientGeek 18 May 2015 11:16:42AM 0 points

If no one is building AIs with utility functions, then the one kind of failure MIRI is talking about has solved itself.