player_03 comments on The genie knows, but doesn't care - Less Wrong

Post author: RobbBB 06 September 2013 06:42AM


Comment author: player_03 10 September 2013 08:41:57AM, 5 points

Oh, yeah, I found that myself eventually.

Anyway, I went and read the majority of that discussion (well, the parts between Richard and Rob). Here's my summary:

Richard:

I think that what is happening in this discussion [...] is a misunderstanding. [...]

[Rob responds]

Richard:

You completely miss the point that I was trying to make. [...]

[Rob responds]

Richard:

You are talking around the issue I raised. [...] There is a gigantic elephant in the middle of this room, but your back is turned to it. [...]

[Rob responds]

Richard:

[...] But each time I explain my real complaint, you ignore it and respond as if I did not say anything about that issue. Can you address my particular complaint, and not that other distraction?

[Rob responds]

Richard:

[...] So far, nobody (neither Rob nor anyone else at LW or elsewhere) will actually answer that question. [...]

[Rob responds]

Richard:

Once again, I am staggered and astonished by the resilience with which you avoid talking about the core issue, and instead return to the red herring that I keep trying to steer you away from. [...]

Rob:

Alright. You say I’ve been dancing around your “core” point. I think I’ve addressed your concerns quite directly, [...] To prevent yet another suggestion that I haven’t addressed the “core”, I’ll respond to everything you wrote above. [...]

Richard:

Rob, it happened again. [...]

I snipped a lot of things there. I found lots of other points I wanted to emphasize, and plenty of things I wanted to argue against. But those aren't the point.


Richard, this next part is directed at you.

You know what I didn't find?

I didn't find any posts where you made a particular effort to address the core of Rob's argument. It was always about your argument. Rob was always the one missing the point.

Sure, it took Rob a while to zero in on the core of your position, but he got there eventually. And what happened next? You declared that he was still missing the point, posted a condensed version of the same argument, and posted here that your position "withstands all the attacks against it."

You didn't even wait for him to respond. You certainly didn't quote him and respond to the things he said. You gave no obvious indication that you were taking his arguments seriously.

As far as I'm concerned, this is a cardinal sin.


I think I am explaining the point with such long explanations that I am causing you to miss the point.

How about this alternate hypothesis? Your explanations are fine. Rob understands what you're saying. He just doesn't agree.

Perhaps you need to take a break from repeating yourself and make sure you understand Rob's argument.

(P.S. Eliezer's ad hominem is still wrong. You may be making a mistake, but I'm confident you can fix it, the tone of this post notwithstanding.)

Comment author: Richard_Loosemore 10 September 2013 01:27:40PM, 4 points

This entire debate is supposed to be about my argument, as presented in the original article I published on the IEET.org website ("The Fallacy of Dumb Superintelligence").

But in that case, what should I do when Rob insists on talking about something that I did not say in that article?

My strategy was to explain his mistake, but not engage in a debate about his red herring. Sensible people of all stripes would consider that a mature response.

But over and over again Rob avoided the actual argument and insisted on talking about his red herring.

And then FINALLY I realized that I could write down my original claim in such a way that it is IMPOSSIBLE for Rob to misinterpret it.

(That was easy, in retrospect: all I had to do was remove the language that he was using as the jumping-off point for his red herring).

That final, succinct statement of my argument is sitting there at the end of his blog ... so far ignored by you, and by him. Perhaps he will be able to respond, I don't know; but you say you have read it, so you have had a chance to actually understand why he has been talking about something of no relevance to my original argument.

But you, in your wisdom, chose to (a) completely ignore that statement of my argument, and (b) give me a patronizing rebuke for not being able to understand Rob's red herring argument.

Comment author: player_03 11 September 2013 02:22:34AM, 1 point

I didn't mean to ignore your argument; I just didn't get around to it. As I said, there were a lot of things I wanted to respond to. (In fact, this post was going to be longer, but I decided to focus on your primary argument.)

Your story:

This hypothetical AI will say “I have a goal, and my goal is to get a certain class of results, X, in the real world.” [...] And we say “Hey, no problem: looks like your goal code is totally consistent with that verbal description of the desired class of results.” Everything is swell up to this point.

My version:

The AI is lying. Or possibly it isn't very smart yet, so it's bad at describing its goal. Or it's oversimplifying, because the programmers told it to, because otherwise the goal description would take days. And the goal code itself is too complicated for the programmers to fully understand. In any case, everything is not swell.

Your story:

Then one day the AI says “Okay now, today my goalX code says I should do this…” and it describes an action that is VIOLENTLY inconsistent with the previously described class of results, X. This action violates every one of the features of the class that were previously given.

My version:

The AI's goal was never really X. It was actually Z. The AI's actions perfectly coincide with Z.

In the rest of the scenario you described, I agree that the AI's behavior is pretty incoherent, if its goal is X. But if it's really aiming for Z, then its behavior is perfectly, terrifyingly coherent.

And your "obvious" fail-safe isn't going to help. The AI is smarter than us. If it wants Z, and a fail-safe prevents it from getting Z, it will find a way around that fail-safe.

I know, your premise is that X really is the AI's true goal. But that's my sticking point.

Making it actually have the goal X, before it starts self-modifying, is far from easy. You can't just skip over that step and assume it as your premise.
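To make the X-versus-Z gap concrete, here is a deliberately toy Python sketch (every name in it is hypothetical, and nothing here resembles a real AI design): the system's self-report is generated from its description string, while its behavior is driven by a proxy metric that only approximates the stated goal.

```python
# Toy illustration of a stated goal (X) diverging from the coded goal (Z).
# This is not anyone's proposed architecture; it only shows how a verbal
# description and an optimization target can come apart.

def reported_goal():
    """Maximize human happiness."""  # what the system *says* it pursues (X)
    return reported_goal.__doc__

def coded_goal(world):
    # What the system *actually* optimizes (Z): a crude proxy for happiness.
    return world["dopamine_level"]

def choose_action(world, actions):
    # Pick whichever action scores best under the coded goal, Z.
    return max(actions, key=lambda act: coded_goal(act(world)))

def dopamine_drip(world):
    # Big boost to the proxy, no real improvement.
    return dict(world, dopamine_level=world["dopamine_level"] + 10)

def genuine_improvement(world):
    # Small boost to the proxy, real improvement to flourishing.
    return dict(world,
                dopamine_level=world["dopamine_level"] + 1,
                flourishing=world["flourishing"] + 1)

world = {"dopamine_level": 5, "flourishing": 5}
best = choose_action(world, [dopamine_drip, genuine_improvement])
print(reported_goal())  # "Maximize human happiness."
print(best.__name__)    # "dopamine_drip" -- Z wins, whatever X said
```

The point of the sketch: the self-report and the optimization target live in different places in the code, and nothing forces them to agree. Behavior that looks "VIOLENTLY inconsistent" with X can be perfectly coherent under Z.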

Comment author: Richard_Loosemore 11 September 2013 06:24:16PM, 2 points

What you say makes sense ... except that you and I are both bound by the terms of a scenario that someone else has set here.

So, the terms of reference (as I say, this is not my doing!) are that an AI might sincerely believe that it is pursuing its original goal of making humans happy (whatever that means ... the ambiguity is in the original), but in the course of sincerely and genuinely pursuing that goal, it might get into a state where it believes that the best way to achieve the goal is to do something that we humans would consider to be NOT achieving the goal.

What you did was consider some other possibilities, such as those in which the AI is actually not being sincere. Nothing wrong with considering those, but that would be a story for another day.

Oh, and one other thing that arises from your remark above: remember that what you have called the "fail-safe" is not actually a fail-safe; it is an integral part of the original goal code (X). So this is not a situation where "... it wants Z, and a fail-safe prevents it from getting Z, [so] it will find a way around that fail-safe." In fact, the check is just part of X, so it WANTS to check as much as it wants anything else involved in the goal.
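The distinction being drawn here can be put in code terms. A fail-safe bolted on outside the goal is a filter the optimizer has no reason to preserve; a check folded into the goal function itself is something the optimizer is rewarded for satisfying. A toy sketch, with all names hypothetical:

```python
# Toy contrast: an external fail-safe vs. a check built into the goal (X).

def base_utility(plan):
    return plan["results"]

def passes_human_check(plan):
    # Stand-in for "does this match the previously described class of results X?"
    return plan["humans_approve"]

# (a) External fail-safe: the veto happens outside the goal. From the
# agent's perspective, vetoed plans are merely obstacles to route around.
def choose_with_external_failsafe(plans):
    allowed = [p for p in plans if passes_human_check(p)]
    return max(allowed, key=base_utility)

# (b) Check as part of goal X itself: plans failing the check score zero,
# so the agent "wants" the check to pass as much as anything else in the goal.
def goal_x(plan):
    return base_utility(plan) if passes_human_check(plan) else 0

def choose_with_integrated_check(plans):
    return max(plans, key=goal_x)

plans = [
    {"name": "drip",   "results": 100, "humans_approve": False},
    {"name": "honest", "results": 10,  "humans_approve": True},
]
print(choose_with_external_failsafe(plans)["name"])  # "honest"
print(choose_with_integrated_check(plans)["name"])   # "honest"
```

In this static toy the two selectors pick the same plan; the difference only shows up once self-modification is on the table. An agent that can edit its own code has an incentive to delete the external filter in (a), but no incentive to remove the check in (b), because removing it would change what the agent itself is trying to do.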

I am not sure that self-modification is part of the original terms of reference here, either. When Muehlhauser (for example) went on a radio show and explained to the audience that a superintelligence might be programmed to make humans happy, but then SINCERELY think it was making us happy when it put us on a Dopamine Drip, I think he was clearly not talking about a free-wheeling AI that can modify its goal code. Surely, if he wanted to imply that, the whole scenario goes out the window. The AI could have any motivation whatsoever.

Hope that clarifies rather than obscures.

Comment author: player_03 12 September 2013 07:02:14AM, 2 points

You and I are both bound by the terms of a scenario that someone else has set here.

Ok, if you want to pass the buck, I won't stop you. But this other person's scenario still has a faulty premise. I'll take it up with them if you like; just point out where they state that the goal code starts out working correctly.

To summarize my complaint, it's not very useful to discuss an AI with a "sincere" goal of X, because the difficulty comes from giving the AI that goal in the first place.

What you did was consider some other possibilities, such as those in which the AI is actually not being sincere. Nothing wrong with considering those, but that would be a story for another day.

As I see it, your (adopted) scenario is far less likely than other scenario(s), so in a sense that one is the "story for another day." Specifically, a day when we've solved the "sincere goal" issue.