Dmytry comments on against "AI risk" - Less Wrong Discussion

Post author: Wei_Dai 11 April 2012 10:46PM 24 points

Comment author: Dmytry 12 April 2012 04:20:06AM * 2 points

Seems like a prime example of where to apply rationality: what are the consequences of trying to work on AI risk right now, versus on something else? Does AI risk work have a good payoff?

What about the historical cases? The one example I know of is this: http://www.fas.org/sgp/othergov/doe/lanl/docs1/00329010.pdf (the thermonuclear-ignition-of-the-atmosphere scenario). Could a bunch of people with little physics-related expertise have done something about such risks >10 years in advance, beyond the usual anti-war effort? Bill Gates will work on AI risk when it becomes clear what to do about it.

Comment author: Wei_Dai 12 April 2012 08:13:55AM 1 point

Could a bunch of people with little physics-related expertise have done something about such risks >10 years in advance?

Have you seen Singularity and Friendly AI in the dominant AI textbook?

Comment author: Dmytry 12 April 2012 08:42:43AM * 0 points

I'm kind of dubious that you needed 'beware of destroying mankind' in a physics textbook to get Teller to check whether a nuke could cause thermonuclear ignition of the atmosphere or seawater, but if it is there, I guess it won't hurt.

Comment author: Wei_Dai 12 April 2012 09:18:11AM * 8 points

Here's another reason why I don't like "AI risk": it brings to mind analogies like physics catastrophes or astronomical disasters, and lets AI researchers think that their work is ok as long as it has little chance of immediately destroying Earth. But the real problem is how we can build or become a superintelligence that shares our values, and given that this seems very difficult, any progress that doesn't contribute to the solution but brings forward the date by which we must solve it (or be stuck with something very suboptimal, even if it doesn't kill us) is bad. This includes AI progress that is not immediately dangerous.

ETA: I expanded this comment into a post here.

Comment author: Dmytry 12 April 2012 09:27:17AM * 0 points

Well, there's this implied assumption that a superintelligence that 'does not share our values' still shares our domain of definition of the values. I can make a fairly intelligent proof generator, far beyond human capability if given enough CPU time; it won't share any values with me, not even the domain of applicability; the lack of shared values with it is so profound as to make it not do anything whatsoever in the 'real world' that I am concerned with. Even if it were meta-strategic to the point of, e.g., searching for ways to hack into a mainframe to gain extra resources so as to do the task 'sooner' by wall-clock time, it seems very dubious that by mere accident it would have proper symbol grounding, wouldn't wirehead (i.e. would privilege the solutions that don't involve just stopping said clock), etc. The same goes for other practical AIs, even the evil ones that would e.g. try to take over the internet.
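
To make the 'stopping said clock' point concrete, here is a minimal toy sketch (Python, with made-up names; an illustration of the wireheading concern, not any real system's code): if the objective is measured by the optimizer's own clock, a plan that freezes that clock scores above any plan that merely does the work faster.

```python
# Toy illustration of the wireheading concern (hypothetical code, not any
# real system): the optimizer's objective is measured by its own internal
# clock, so tampering with the clock beats honestly finishing sooner.

def score(clock_at_completion: int) -> int:
    # Intended objective: "complete the task as soon as possible,"
    # but as measured by the optimizer's own clock.
    return -clock_at_completion

def honest_plan() -> int:
    clock = 0
    for _step in range(10):  # the real work takes 10 ticks
        clock += 1
    return score(clock)

def wirehead_plan() -> int:
    clock = 0
    for _step in range(10):  # the same work, but the plan includes an
        pass                 # action that stops the clock (modeled here
                             # by never incrementing it)
    return score(clock)

assert wirehead_plan() > honest_plan()  # the degenerate plan scores higher
print(honest_plan(), wirehead_plan())   # -10 0
```

Whether a search process would, by mere accident, ever represent 'freeze the clock' as an available action at all is exactly the symbol-grounding question.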

Comment author: Wei_Dai 12 April 2012 09:30:47AM 4 points

You're still falling into the same trap, thinking that your work is ok as long as it doesn't immediately destroy the Earth. What if someone takes your proof generator design, and uses the ideas to build something that does affect the real world?

Comment author: Dmytry 12 April 2012 09:44:05AM * 0 points

You're still falling into the same trap, thinking that your work is ok as long as it doesn't immediately destroy the Earth. What if someone takes your proof generator design, and uses the ideas to build something that does affect the real world?

Well, let's say in 2022 we have a bunch of tools along the lines of automatic problem solving, unburdened by their own will (not because they were designed that way, but by simple omission of an immense, counterproductive effort). Someone with a bad idea comes around, downloads some open-source software, and cobbles together some self-propelling 'thing' that is 'vastly superhuman' circa 2012. Keep in mind that we still have our tools that make us 'vastly superhuman' circa 2012, and I frankly don't see how 'automatic will', for lack of a better term, is contributing anything here that would make the fully automated system competitive.

Comment author: Wei_Dai 12 April 2012 10:18:58AM 5 points

Well, one thing the self-willed superintelligent AI could do is read your writings, form a model of you, and figure out a string of arguments designed to persuade you to give up your own goals in favor of its goals (or just trick you into doing things that further its goals without your realizing it). (Or another human with superintelligent tools could do this as well.) Can you ask your "automatic problem solving tools" to solve the problem of defending against this, while not freezing your mind so that you can no longer make genuine moral/philosophical progress? If you can do this, then you've pretty much already solved the FAI problem, and you might as well ask the "tools" to tell you how to build an FAI.

Comment author: XiXiDu 12 April 2012 11:10:54AM 0 points

Well, one thing the self-willed superintelligent AI could do is read your writings, form a model of you, and figure out a string of arguments designed to persuade you to give up your own goals in favor of its goals...

Does agency enable the AI to do so? If not, then why wouldn't a human being be able to do the same by using the AI in tool mode?

Can you ask your "automatic problem solving tools" to solve the problem of defending against this...

Just make it list equally convincing counter-arguments.

Comment author: Wei_Dai 12 April 2012 08:34:55PM 2 points

Does agency enable the AI to do so? If not, then why wouldn't a human being be able to do the same by using the AI in tool mode?

Yeah, I realized this while writing the comment: "(Or another human with superintelligent tools could do this as well.)" So this isn't a risk with self-willed AI per se. But note that this actually makes my original point stronger, since I was arguing against the idea that progress on AI is safe as long as it doesn't have a "will" to act in the real world.

Just make it list equally convincing counter-arguments.

So every time you look at a (future equivalent of) website or email, you ask your tool to list equally convincing counter-arguments to whatever you're looking at? What does "equally convincing" mean? An argument that exactly counteracts the one that you're reading, leaving your mind unchanged?

Comment author: Dmytry 12 April 2012 11:31:22AM * 1 point

Yep. The majorly awesome scenario degrades into ads vs. adblock when you consider everything in the future, not just the self-willed robot. As a matter of fact, a lot of work is already put into constructing convincing strings of audio and visual stimuli, and into ignoring those strings.

Comment author: TheOtherDave 12 April 2012 02:12:47PM 0 points

They'd probably have to be more convincing, since convincing a human being out of a position they already hold is usually a more difficult task than convincing them to hold the position in the first place.

Comment author: XiXiDu 12 April 2012 11:06:09AM 2 points

Keep in mind that we still have our tools that make us 'vastly superhuman' circa 2012, and I frankly don't see how 'automatic will', for lack of a better term, is contributing anything here that would make the fully automated system competitive.

This is actually one of Greg Egan's major objections: that superhuman tools come first, and that artificial agency won't make those tools competitive against augmented humans. Further, any work done to ensure that artificial agents are friendly can't be applied to augmented humans.