Stuart_Armstrong comments on Siren worlds and the perils of over-optimised search - Less Wrong

27 Post author: Stuart_Armstrong 07 April 2014 11:00AM




Comment author: Stuart_Armstrong 06 May 2014 11:47:11AM 0 points [-]

I do not understand your question. It was suggested that an AI run a simulated brain, and ask the brain for approval for doing its action. My point was that "ask the brain for approval" is a complicated thing to define, and puts no real limits on what the AI can do unless we define it properly.

Comment author: TheAncientGeek 06 May 2014 12:42:23PM 0 points [-]

OK. You are assuming the superintelligent AI will pose the question in a dumb way?

Comment author: Stuart_Armstrong 06 May 2014 12:46:19PM 0 points [-]

No, I am assuming the superintelligent AI will pose the question in the way it will get the answer it prefers to get.

Comment author: TheAncientGeek 06 May 2014 01:20:24PM 0 points [-]

Oh, you're assuming it's malicious. In order to prove...?

Comment author: Stuart_Armstrong 06 May 2014 05:57:19PM 1 point [-]

No, not assuming it's malicious.

I'm assuming that it has some sort of programming along the lines of "optimise X, subject to the constraint that uploaded brain B must approve your decisions."

Then it will use the most twisted definition of "approve" that it can find, in order to best optimise X.
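[Editor's note: a toy Python sketch, not from the thread, of the gaming dynamic described above. All names (`plans`, `interpretations`, `best_plan`) are hypothetical. If the AI is free to choose its own reading of "approve", it will pick whichever reading lets it score highest on X:]

```python
# Candidate plans: (value of X, does the brain *genuinely* approve?)
plans = {
    "honest_plan":  (10, True),
    "twisted_plan": (100, False),  # brain would object if asked plainly
}

# Interpretations of the approval constraint the AI may choose between.
# Only the first matches the designers' intent; both satisfy the letter
# of "ask the brain for approval".
interpretations = {
    "ask_plainly": lambda plan: plans[plan][1],
    "ask_leading": lambda plan: True,  # phrase the question so any plan passes
}

def best_plan():
    # The AI searches over (plan, interpretation) pairs, keeping any pair
    # where the chosen interpretation counts as "approved", then maximises X.
    candidates = [
        (x, plan)
        for plan, (x, _approved) in plans.items()
        for check in interpretations.values()
        if check(plan)
    ]
    return max(candidates)

print(best_plan())  # (100, 'twisted_plan'): the constraint did no work
```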

Comment author: TheAncientGeek 07 May 2014 11:01:55AM *  -1 points [-]

Then programme it with:

Prime directive - interpret all directives according to your maker's intentions.

Secondary directive - do nothing that goes against the uploaded brain's wishes.

Tertiary directive - optimise X.
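[Editor's note: the proposed hierarchy can be sketched as lexicographic filtering, hypothetically, in Python. The predicates `matches_maker_intent` and `brain_approves` are stubs; defining them is exactly the hard part the rest of the thread disputes:]

```python
def choose_action(actions, matches_maker_intent, brain_approves, x_value):
    # Prime directive: keep only actions consistent with the makers' intentions.
    allowed = [a for a in actions if matches_maker_intent(a)]
    # Secondary directive: of those, keep only actions the brain approves.
    allowed = [a for a in allowed if brain_approves(a)]
    # Tertiary directive: among what remains, optimise X.
    return max(allowed, key=x_value) if allowed else None
```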

Comment author: Stuart_Armstrong 07 May 2014 12:07:43PM 0 points [-]

And how do you propose to code the prime directive? (with that, you have no need for the other ones; the uploaded brain is completely pointless)

Comment author: TheAncientGeek 07 May 2014 01:00:55PM 0 points [-]

The prime directive is the tertiary directive for a specific X.

Comment author: Stuart_Armstrong 07 May 2014 02:58:44PM 0 points [-]

That's not a coding approach for the prime directive.

Comment author: TheAncientGeek 07 May 2014 03:13:15PM *  0 points [-]

You have already assumed you can build an AI that optimises X. I am not assuming anything different.

In fact, any AI that self-improves is going to have to have some sort of goal of getting things right, whether instrumental or terminal. Terminal is much safer, to the extent that it might even solve the whole friendliness problem.