Nisan comments on A taxonomy of Oracle AIs - Less Wrong

13 Post author: lukeprog 08 March 2012 11:14PM

Comment author: Nisan 09 March 2012 04:00:25AM * 5 points

Can someone help me understand why a non-Friendly Question-Answerer is a bad idea?

A Question-Answerer is a system that [...] somehow computes the "answer to the question." To analyze the difficulty of creating a Question-Answerer, suppose that we ask it the question "what ought we (or I) to do?" [...]

If it cannot answer this question, many of its answers are radically unsafe. Courses of action recommended by the Question-Answerer will likely be unsafe, insofar as "safety" relies on the definition of human value.

I understand that such an AI won't be able to tell me if something is safe. But if it doesn't have goals, it wouldn't try to persuade me that anything is safe. So this sounds like my daily life: There are tools I can use to find answers to some of my questions, but ultimately it is I who must decide whether something is safe or not. This AI doesn't sound dangerous.

EDIT: Can someone give an example of a disaster involving such an AI?

Comment author: John_Maxwell_IV 09 March 2012 07:05:55AM * 6 points

It seems like the worst it could do is misunderstand your question and give you a recipe for gray goo when you really wanted a recipe for a cake. Bonus points if the gray goo recipe looks a lot like a cake recipe.

It seems to me that I often see people thinking about FAI assuming a best-case scenario where all intelligent people are Less Wrong users who see Friendliness as paramount, and discarding solutions that don't have above a 99.9% chance of succeeding. But really we want an entire stable of solutions, depending on how potential UFAI projects are going, right?

Comment author: Viliam_Bur 09 March 2012 09:29:23AM 12 points

Bonus points if the gray goo recipe looks a lot like a cake recipe.

More bonus points if the recipe really generates a cake... which later, with some probability, turns into gray goo.

Now you can have your cake and it will eat you too. :D

Comment author: Nisan 09 March 2012 05:57:16PM 1 point

I don't believe that a gray goo recipe can look like a cake recipe. I believe there are recipes for disastrously harmful things that look like recipes for desirable things; but is a goal-less Question Answerer producing a deceitful recipe more likely than a human working alone accidentally producing one?

The problem of making the average user as prudent as a Less Wrong user seems much easier than FAI. Average users already know to take the results of Wolfram Alpha and Google with a grain of salt. People working on synthetic organisms and nuclear radiation already know to take precautions when doing anything for the first time.

Comment author: John_Maxwell_IV 09 March 2012 09:47:58PM 1 point

My point about assuming the entire world were Less Wrong users is that there are teams, made up of people who are not Less Wrong users, who will develop UFAI if we wait long enough. So a quick and slightly dirty plan (like making this sort of potentially dangerous Oracle AI) may beat a slow and perfect one.

Comment author: Nisan 09 March 2012 11:15:43PM 1 point

Oh! I see. That makes sense.

Comment author: orthonormal 16 March 2012 03:36:31AM 0 points

Can someone give an example of a disaster involving such an AI?

The AI might find answers that satisfy the question but violate background assumptions we never thought to include, and we wouldn't realize it until it was too late (if even then). An easy-to-imagine one that we wouldn't fall for is a cure for cancer that succeeds by eradicating all cellular life. Of course, it's more difficult to come up with one that we would fall for, but anything involving cognitive modifications would be a candidate.

Comment author: Nisan 16 March 2012 04:00:35AM 0 points

So, the reason we wouldn't fall for that one is that the therapy wouldn't pass the safety tests required by first-world governments. We have safety tests for all sorts of new technologies, with the stringency of the tests depending on the kind of technology — some testing for children's toys, more testing for drugs, hopefully more testing for permanent cognitive enhancement. It seems like these tests should protect us from a Question-Answerer as much as from human mistakes.

Actual unfriendly AI seems scarier because it could try to pass our safety tests, in addition to accomplishing its terminal goals. But a Question-Answerer designing something that passes all the tests and nevertheless causes disaster seems about as likely as a well-intentioned but not completely competent human doing the same.

I guess I should have asked for a disaster involving a Question-Answerer which is more plausible than the same scenario with the AI replaced by a human.