Comment author: [deleted] 05 May 2015 05:34:15PM *  7 points [-]

Excuse me, but you are really failing to clarify the issue. The basic UFAI doomsday scenario is: the AI has vast powers of learning and inference with respect to its world-model, but has its utility function (value system) hardcoded. Since the hardcoded utility function does not specify a naturalization of morality, or CEV, or whatever, the UFAI proceeds to tile the universe in whatever it happens to like (which are things we people don't like), precisely because it has no motivation to "fix" its hardcoded utility function.

A similar problem would occur if, for some bizarre-ass reason, you monkey-patched your AI to use hardcoded machine arithmetic on its integers instead of learning the concept of integers from data via its, you know, intelligence, and the hardcoded machine math had a bug. It would get arithmetic problems wrong! And it would never realize it was getting them wrong, because every time it tried to check its own calculations, your monkey-patch would cut in and use the buggy machine arithmetic again.
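The self-defeating check described above can be made concrete with a toy sketch (my own construction, purely illustrative): the agent's arithmetic is a hard-coded primitive with a bug, and its "verify my answer" routine calls the very same primitive, so the bug validates itself.

```python
def buggy_add(a, b):
    # Hard-coded "machine arithmetic" with a deliberate bug:
    # any sum ending in 0 gets mangled by an extra carry.
    result = a + b
    if result % 10 == 0:
        result += 1
    return result

def self_check(a, b, claimed):
    # The agent re-derives the sum to check itself -- but through
    # the same buggy primitive, so the error is invisible from inside.
    return buggy_add(a, b) == claimed

claimed = buggy_add(7, 3)        # 11, not 10
print(self_check(7, 3, claimed)) # True: the bug confirms its own output
print(self_check(7, 3, 10))      # False: the *correct* answer is rejected
```

Note that the agent would not merely miss its own error; handed the right answer from outside, it would "correct" it back to the wrong one.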

The lesson is: do not hard-code important functionality into your AGI without proving it correct. In the case of a utility/value function, the obvious research path is to find a way to characterize finding out the human operators' desires as an inference problem, thus ensuring that the AI cares about learning correctly from the humans and then implementing what it learned rather than anything hard-coded. Moving moral learning into inference also helps minimize the amount of code we have to prove correct, since it simply isn't AI without correct, functioning learning and inference abilities.
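To make "moral learning as inference" slightly less abstract, here is a minimal sketch (again my own construction, not anything from the literature): the agent maintains a posterior over candidate value hypotheses and updates it from operator feedback, rather than acting on a single hard-coded utility function. The hypothesis names and likelihoods are invented for illustration.

```python
# Prior over two hypothetical value hypotheses.
candidate_values = {"smiley_faces": 0.5, "human_wellbeing": 0.5}

# Assumed likelihood that the operators would approve a proposed
# action, under each hypothesis about what they actually value.
likelihood = {
    ("tile_universe_with_smileys", "smiley_faces"): 0.9,
    ("tile_universe_with_smileys", "human_wellbeing"): 0.01,
}

def update(posterior, action, approved):
    # Bayes update: weight each hypothesis by how well it predicts
    # the operators' actual response, then renormalize.
    new = {}
    for hyp, p in posterior.items():
        l = likelihood[(action, hyp)]
        new[hyp] = p * (l if approved else 1 - l)
    z = sum(new.values())
    return {h: p / z for h, p in new.items()}

# The operators reject the smiley-tiling plan; probability mass
# shifts away from the "smiley_faces" hypothesis.
posterior = update(candidate_values, "tile_universe_with_smileys", approved=False)
```

The point of the sketch is only that disapproval is *evidence* the agent cares about, not an obstacle to route around: manipulating the feedback channel would corrupt the very data it is trying to learn from.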

Also, little you've written about CLAI or Swarm Connectionist AI corresponds well to what I've seen of real-world cognitive science, theoretical neuroscience, or machine learning research, so I can't see how either of those blatantly straw-man designs are going to turn into AGI. Please go read some actual scientific material rather than assuming that The Metamorphosis of Prime Intellect is up-to-date with the current literature ;-).

Comment author: XerxesPraelor 10 May 2015 12:56:33AM 2 points [-]

Also, little you've written about CLAI or Swarm Connectionist AI corresponds well to what I've seen of real-world cognitive science, theoretical neuroscience, or machine learning research, so I can't see how either of those blatantly straw-man designs are going to turn into AGI. Please go read some actual scientific material rather than assuming that The Metamorphosis of Prime Intellect is up-to-date with the current literature ;-).

The content of your post was pretty good from my limited perspective, but this tone is not warranted.

Comment author: Richard_Loosemore 05 May 2015 09:45:38PM 1 point [-]

The paper's goal is not to discuss "basic UFAI doomsday scenarios" in the general sense, but to discuss the particular case where the AI goes all pear-shaped EVEN IF it is programmed to be friendly to humans.

That last part (even if it is programmed to be friendly to humans) is the critical qualifier that narrows down the discussion to those particular doomsday scenarios in which the AI does claim to be trying to be friendly to humans - it claims to be maximizing human happiness - but in spite of that it does something insanely wicked.

So, Eli says:

The basic UFAI doomsday scenario is: the AI has vast powers of learning and inference with respect to its world-model, but has its utility function (value system) hardcoded. Since the hardcoded utility function does not specify a naturalization of morality, or CEV, or whatever, the UFAI proceeds to tile the universe in whatever it happens to like (which are things we people don't like), precisely because it has no motivation to "fix" its hardcoded utility function

... and this clearly says that the type of AI he has in mind is one that is not even trying to be friendly. Rather, he talks about how its

hardcoded utility function does not specify a naturalization of morality, or CEV, or whatever

And then he adds that

the UFAI proceeds to tile the universe in whatever it happens to like

... which has nothing to do with the cases that the entire paper is about, namely the cases where the AI is trying really hard to be friendly, but doing it in a way that we did not intend.

If you read the paper all of this is obvious pretty quickly, but perhaps if you only skim-read a few paragraphs you might get the wrong impression. I suspect that is what happened.

Comment author: XerxesPraelor 10 May 2015 12:52:39AM 3 points [-]

namely the cases where the AI is trying really hard to be friendly, but doing it in a way that we did not intend.

If the AI knows what "friendly" is, or what "mean" means, then your conclusion is trivially true. The problem is programming those in - that's what FAI is all about.

Comment author: XerxesPraelor 05 May 2015 10:27:04PM *  0 points [-]

So, this is supposed to be what goes through the mind of the AGI. First it thinks “Human happiness is seeing lots of smiling faces, so I must rebuild the entire universe to put a smiley shape into every molecule.” But before it can go ahead with this plan, the checking code kicks in: “Wait! I am supposed to check with the programmers first to see if this is what they meant by human happiness.” The programmers, of course, give a negative response, and the AGI thinks “Oh dear, they didn’t like that idea. I guess I had better not do it then."

But now Yudkowsky is suggesting that the AGI has second thoughts: "Hold on a minute," it thinks, "suppose I abduct the programmers and rewire their brains to make them say ‘yes’ when I check with them? Excellent! I will do that.” And, after reprogramming the humans so they say the thing that makes its life simplest, the AGI goes on to tile the whole universe with tiles covered in smiley faces. It has become a Smiley Tiling Berserker.

I want to suggest that the implausibility of this scenario is quite obvious: if the AGI is supposed to check with the programmers about their intentions before taking action, why did it decide to rewire their brains before asking them if it was okay to do the rewiring?

Computer's thoughts: I want to create smiley faces - it seems like the way to get the most smiley faces is by tiling the universe with molecular smiley faces. How can I do that? If I just start to do it, the programmers will tell me not to, and I won't be able to. Hmmm, is there some way I can have them say yes? I can create lots of nanomachines, telling the programmers they are meant to increase happiness. Unless they want to severely limit the amount of good I can do, they won't refuse to let me make nanomachines, and even if they do I can send a letter to someone else who I have under my control to get them to make them for me. Then once I have my programmers under my control, I can finally maximize happiness.

This computer HAS OBEYED THE RULE "ASK PEOPLE FOR PERMISSION BEFORE doing THINGS". Given any goal system, patches like that rule will not work.

And that's just a plan I came up with - a superintelligence would be much better at devising plans to convince programmers to let it do what it wants - it probably wouldn't even have to resort to nanotech.

Once again, this is spurious: the critics need say nothing about human values and morality, they only need to point to the inherent illogicality. Nowhere in the above argument, notice, was there any mention of the moral imperatives or value systems of the human race. I did not accuse the AGI of violating accepted norms of moral behavior. I merely pointed out that, regardless of its values, it was behaving in a logically inconsistent manner when it monomaniacally pursued its plans while at the same time knowing that (a) it was very capable of reasoning errors and (b) there was overwhelming evidence that its plan was an instance of such a reasoning error.

What overwhelming evidence that its plan was a reasoning error? If its plan does in fact maximize "smileyness" as defined by the computer, it isn't a reasoning error, despite being immoral. IF THE COMPUTER IS GIVEN SOMETHING TO MAXIMISE, IT IS NOT MAKING A REASONING ERROR EVEN IF ITS PROGRAMMERS DID IN PROGRAMMING IT.

Comment author: XerxesPraelor 09 May 2015 10:39:11PM *  1 point [-]

Can someone who downvoted explain what I got wrong? (note: the capitalization was edited in at the time of this post.)

(And why did the reply get so upvoted, when a paragraph would have sufficed - or at least a note that "my argument needs multiple paragraphs to be shown, so a paragraph isn't enough"?)

It's kind of discouraging when I try to contribute for the first time in a while, and get talked down to and completely dismissed like an idiot without even a rebuttal.

Comment author: Richard_Loosemore 05 May 2015 10:48:57PM 2 points [-]

You completely ignored what the paper itself had to say about the situation. [Hint: the paper already answered your speculation.]

Accordingly I will have to ignore your comment.

Sorry.

Comment author: XerxesPraelor 05 May 2015 11:57:14PM 1 point [-]

You could at least point to the particular paragraphs which address my points - that shouldn't be too hard.

Comment author: XerxesPraelor 24 February 2015 09:12:24PM 2 points [-]

Try this experiment on a religious friend: Tell him you think you might believe in God. Then ask him to list the qualities that define God.

Before reading on, I thought "Creator of everything, understands everything, is in perfect harmony with morality, has revealed himself to the Jews and as Jesus, is triune."

People seldom start religions by saying they're God. They say they're God's messenger, or maybe God's son. But not God. Then God would be this guy you saw stub his toe, and he'd end up like that guy in "The Man Who Would Be King."

That's what's so special about Christianity - Jesus is God, not just his messenger or Son. The stubbed toe problem isn't original, it comes up in the Gospels, where people say "How can this be God? We know his parents and brothers!"

PPE: I see you added a footnote about this; still, even in the OT God lets himself be argued with - that's what the books of Job and Habakkuk are all about. Paul also makes lots of arguments and has a back and forth style in many books.

A belief in the God that is an empty category is wrong, but it's misrepresenting religion (Judeo-Christian ones in particular) to say that all or even most or even a substantial minority of its adherents have that sort of belief.

But if for some reason you want to know what "human terminal values" are, and collect them into a set of non-contradictory values, ethics gets untenable, because your terminal values benefit alleles, not humans, and play zero-sum games, not games with benefits to trade or compromise.

Evolution isn't perfect - the values we have aren't the best strategies possible for an allele to reproduce itself; they're only the best strategies that have appeared. This leaves room for a difference between the good of a person and the good of their genes. Thou art Godshatter is relevant here. Again, just because our human values serve the "values" of genes doesn't mean that they are subject to them, or that they are somehow turned "instrumental" because evolution was the reason why they developed.

Comment author: XerxesPraelor 20 September 2013 05:07:43PM *  27 points [-]

There is one very valid test by which we may separate genuine, if perverse and unbalanced, originality and revolt from mere impudent innovation and bluff. The man who really thinks he has an idea will always try to explain that idea. The charlatan who has no idea will always confine himself to explaining that it is much too subtle to be explained. The first idea may be really outré or specialist; it may be really difficult to express to ordinary people. But because the man is trying to express it, it is most probable that there is something in it, after all. The honest man is he who is always trying to utter the unutterable, to describe the indescribable; but the quack lives not by plunging into mystery, but by refusing to come out of it.

G K Chesterton