Less Wrong is a community blog devoted to refining the art of human rationality.

Comment author: player_03 27 April 2017 12:20:36AM *  2 points [-]

An example I like is the Knight Capital Group trading incident. Here are the parts that I consider relevant:

KCG deployed new code to a production environment, and while I assume this code was thoroughly tested in a sandbox, one of the production servers had some legacy code ("Power Peg") that wasn't in the sandbox and therefore wasn't tested with the new code. These two pieces of code used the same flag for different purposes: the new code set the flag during routine trading, but Power Peg interpreted that flag as a signal to buy and sell ~10,000 arbitrary* stocks.

*Actually not arbitrary. What matters is that the legacy algorithm was optimized for something other than making money, so it lost money on average.
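The failure mode above can be sketched in a few lines. This is purely illustrative (the names, flag value, and structure are my own invention, not KCG's actual code); the point is just that one bit carried two meanings, so new code setting it routinely awakened a legacy path on the one server where that path still existed.

```python
# Illustrative sketch of a flag-reuse failure (hypothetical names, not
# KCG's real codebase): the new router sets a flag on every routine
# order, but a legacy handler, still installed on one server,
# interprets that same flag as "run the old test algorithm".

POWER_PEG_FLAG = 0x01  # one bit, two meanings

def route_order(order, legacy_handler_installed):
    """New code: mark every routine order with the flag."""
    order["flags"] = order.get("flags", 0) | POWER_PEG_FLAG
    if legacy_handler_installed and order["flags"] & POWER_PEG_FLAG:
        # Legacy code path: fire a stream of buy/sell orders from an
        # algorithm never meant to run in production. Three orders
        # here stand in for the millions placed in the real incident.
        return ["power_peg_order"] * 3
    return [order]

# Server running only the new code: one routine order goes out.
print(len(route_order({"symbol": "ABC"}, legacy_handler_installed=False)))
# Server that still had the legacy handler: the same flag triggers it.
print(len(route_order({"symbol": "ABC"}, legacy_handler_installed=True)))
```

Nothing here is "broken" in isolation; the bug only exists in the interaction between two deployments, which is why sandbox testing of the new code alone couldn't catch it.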

They stopped this code after 45 minutes, but by then it was too late. Power Peg had already placed millions of inadvisable orders, nearly bankrupting KCG.

Sometimes, corrigibility isn't enough.

Comment author: gwern 18 July 2014 10:54:31PM 0 points [-]

The fact that his brother said this while passing by means that he spotted a low-hanging fruit. If his brother had spent more time looking before giving the hint, this would have indicated a fruit that was a little higher up.

The brother could have spent arbitrarily much time on the jigsaw puzzle before Claude started playing with it.

Comment author: player_03 20 July 2014 12:18:51AM *  0 points [-]

I suppose, but even then he would have to take time to review the state of the puzzle. You would still expect him to take longer to spot complex details, and perhaps he'd examine a piece or two to refresh his memory.

But that isn't my true rejection here.

If you assume that Claude's brother "spent arbitrarily much time" beforehand, the moral of the story becomes significantly less helpful: "If you're having trouble, spend an arbitrarily large amount of time working on the problem."

Comment author: gwern 20 April 2011 08:14:35PM *  9 points [-]

An excerpt from a likely-never-to-be-finished essay:

"Claude Shannon once told me that as a kid, he remembered being stuck on a jigsaw puzzle.

His brother, who was passing by, said to him: "You know: I could tell you something."

That's all his brother said.

Yet that was enough hint to help Claude solve the puzzle. The great thing about this hint... is that you can always give it to yourself."

--Manuel Blum, "Advice to a Beginning Graduate Student"

Comment author: player_03 18 July 2014 06:24:57AM *  5 points [-]

His brother's hint contained information that he couldn't have gotten by giving the hint to himself. The fact that his brother said this while passing by means that he spotted a low-hanging fruit. If his brother had spent more time looking before giving the hint, this would have indicated a fruit that was a little higher up.

This advice is worth trying, but when you give it to yourself, you can't be sure that there's low hanging fruit left. If someone else gives it to you, you know it's worth looking for, because you know there's something there to find. (The difference is that they, not you, took the time to search for it.)

Again, it's a worthwhile suggestion. I just want to point out that it boils down to "If you're having trouble, check for easier solutions," and that while you can always give this advice to yourself, it will not always help.

Comment author: Osuniev 23 December 2012 02:28:05AM *  4 points [-]

Re-reading chapter 76 made me realise the prophecy might not be about Voldemort at all.

Let's look at this prophecy in detail:

"The one with the power to vanquish the Dark Lord approaches,"

Vanquish, as Snape said, is a strange word to describe a baby accidentally toasting Voldemort, especially since we have evidence that this might not be what really happened. "Dark Lord" is used by EY quite loosely, and not as something specifically relating to Voldemort. Indeed, Dumbledore seems to worry that he could be this Dark Lord. Now, if we step outside of what we think we know about the prophecy...

Who is Harry trying to "vanquish"? Who is it that Harry has "the power to vanquish"?

Dementors? Death in general? Dementors as an incarnation of Death?

Could Death be considered the Dark Lord? I admit this is stretching the use of the word "Dark Lord", but it does sound interesting, and "vanquish" fits it better. Now, bear with me a moment and let's look at the rest of the prophecy: "Born to those who have thrice defied him,"

Now, while Lily and James have defied death three times, there are a million people on the planet in the same situation. But WHO has defied Death three times in this universe?

The Peverell Brothers, Harry's ancestors through the Potter family.

Born as the seventh month dies, And the Dark Lord will mark him as his equal,

The Tale of the Three Brothers specifically says: "...And then he [the third brother Ignotus, owner of the Cloak] greeted Death as an old friend, and went with him gladly, and, as equals, they departed this life." Harry having the Cloak fits, as such. Alternatively, Harry "killing" Dementors makes him and Death literally equals, in that they can destroy each other.

But he will have power the Dark Lord knows not,

The only unique powers Harry has are Dementor 2.0 and partial Transfiguration. Dementor 2.0 seems rather good.

And either must destroy all but a remnant of the other, For those two different spirits cannot exist in the same world.

I find it really interesting that nowhere is it said that the Dark Lord "lives". "Destroy all but a remnant" could mean Dementing Harry, or destroying all Dementors except one, or giving Philosopher's Stones to everyone without the death rate falling to zero (because accidental death would still happen, but would no longer be an inevitability).

Note that this theory (still improbable; if I had to bet on it, I wouldn't assign more than a 15% chance that Death is the "Dark Lord" of the prophecy) is still compatible with Dumbledore trying to trick Voldemort into a Dark ritual, or with both of them interpreting the prophecy as in canon.

Comment author: player_03 24 April 2014 02:03:49AM 0 points [-]

Harry left "a portion of his life" (not an exact quote) in Azkaban, and apparently it will remain there forever. That could be the remnant that Death would fail to destroy.

Anyway, Snape drew attention to the final line in the prophecy. It talked about two different spirits that couldn't exist in the same world, or perhaps two ingredients that cannot exist in the same cauldron. That's not Harry and Voldemort; that's Harry and Death.

I mean, Harry has already sworn to put an end to death. It's how he casts his patronus. He's a lot less sure about killing Voldemort, and would prefer not to, if given the choice.

Comment author: Tuxedage 23 December 2013 08:26:52PM 8 points [-]

I have posted this in the last open thread, but I should post here too for relevancy:

I have donated $5,000 for the MIRI 2013 Winter Fundraiser. Since I'm a "new large donor", this donation will be matched 3:1, netting a cool $20,000 for MIRI.

I have decided to post this because of "Why Our Kind Can't Cooperate". I have been convinced that people who donate should publicly brag about it to attract other donors, instead of remaining silent about their donations, which leads to a false impression of the amount of support MIRI has.

Comment author: player_03 12 January 2014 04:26:45AM *  0 points [-]

On the other hand, MIRI hit its goal three weeks early, so the amount of support is pretty obvious.

Though I have to admit, I was going to remain silent too, and upon reflection I couldn't think of any good reasons to do so. It may not be necessary, but it couldn't hurt either. So...

I donated $700 to CFAR.

In response to comment by [deleted] on Rationality Quotes October 2013
Comment author: James_Miller 04 October 2013 03:15:19AM 8 points [-]

Well designed traditions and protocols will contain elements that cause most subcompetent people to not want to throw them out.

Comment author: player_03 06 October 2013 04:01:31AM *  9 points [-]

Well designed traditions and protocols will contain elements that cause most competent people to not want to throw them out.

Comment author: pewpewlasergun 03 October 2013 06:06:56AM 25 points [-]

“Whenever serious and competent people need to get things done in the real world, all considerations of tradition and protocol fly out the window.”

Neal Stephenson - "Quicksilver"

Comment author: player_03 06 October 2013 03:59:56AM 2 points [-]

Having just listened to much of the Ethical Injunctions sequence (as a podcast courtesy of George Thomas), I'm not so sure about this one. There are reasons for serious, competent people to follow ethical rules, even when they need to get things done in the real world.

Ethics aren't quite the same as tradition and protocol, but even so, sometimes all three of those things exist for good reasons.

Comment author: pslunch 10 September 2013 09:01:44PM 5 points [-]

Thank you for the clarification. While I have a certain hesitance to throw around terms like "irredeemable", I do understand the frustration with a certain, let's say, overconfident and persistent brand of misunderstanding and how difficult it can be to maintain a public forum in its presence.

My one suggestion is that, if the goal was to avoid RobbBB's (wonderfully high-quality comments, by the way) confusion, a private message might have been better. If the goal was more generally to minimize the confusion for those of us who are newer or less versed in LessWrong lore, more description might have been useful ("a known and persistent troll" or whatever) rather than just providing a name from the enemies list.

Comment author: player_03 13 September 2013 02:06:08AM 4 points [-]


Though actually, Eliezer used similar phrasing regarding Richard Loosemore and got downvoted for it (not just by me). Admittedly, "persistent troll" is less extreme than "permanent idiot," but even so, the statement could be phrased to be more useful.

I'd suggest, "We've presented similar arguments to [person] already, and [he or she] remained unconvinced. Ponder carefully before deciding to spend much time arguing with [him or her]."

Not only is it less offensive this way, it does a better job of explaining itself. (Note: the "ponder carefully" section is quoting Eliezer; that part of his post was fine.)

Comment author: Richard_Loosemore 11 September 2013 06:24:16PM 2 points [-]

What you say makes sense... except that you and I are both bound by the terms of a scenario that someone else has set here.

So, the terms (as I say, this is not my doing!) of reference are that an AI might sincerely believe that it is pursuing its original goal of making humans happy (whatever that means... the ambiguity is in the original), but in the course of sincerely and genuinely pursuing that goal, it might get into a state where it believes that the best way to achieve the goal is to do something that we humans would consider to be NOT achieving the goal.

What you did was consider some other possibilities, such as those in which the AI is actually not being sincere. Nothing wrong with considering those, but that would be a story for another day.

Oh, and one other thing that arises from your above remark: remember that what you have called the "fail-safe" is not actually a fail-safe, it is an integral part of the original goal code (X). So there is no question of this being a situation where "... it wants Z, and a fail-safe prevents it from getting Z, [so] it will find a way around that fail-safe." In fact, the check is just part of X, so it WANTS to check as much as it wants anything else involved in the goal.

I am not sure that self-modification is part of the original terms of reference here, either. When Muehlhauser (for example) went on a radio show and explained to the audience that a superintelligence might be programmed to make humans happy, but then SINCERELY think it was making us happy when it put us on a Dopamine Drip, I think he was clearly not talking about a free-wheeling AI that can modify its goal code. Surely, if he wanted to imply that, the whole scenario goes out the window. The AI could have any motivation whatsoever.

Hope that clarifies rather than obscures.

Comment author: player_03 12 September 2013 07:02:14AM 2 points [-]

You and I are both bound by the terms of a scenario that someone else has set here.

Ok, if you want to pass the buck, I won't stop you. But this other person's scenario still has a faulty premise. I'll take it up with them if you like; just point out where they state that the goal code starts out working correctly.

To summarize my complaint, it's not very useful to discuss an AI with a "sincere" goal of X, because the difficulty comes from giving the AI that goal in the first place.

What you did was consider some other possibilities, such as those in which the AI is actually not being sincere. Nothing wrong with considering those, but that would be a story for another day.

As I see it, your (adopted) scenario is far less likely than other scenario(s), so in a sense that one is the "story for another day." Specifically, a day when we've solved the "sincere goal" issue.

Comment author: Richard_Loosemore 10 September 2013 01:27:40PM 4 points [-]

This entire debate is supposed to be about my argument, as presented in the original article I published on the IEET.org website ("The Fallacy of Dumb Superintelligence").

But in that case, what should I do when Rob insists on talking about something that I did not say in that article?

My strategy was to explain his mistake, but not engage in a debate about his red herring. Sensible people of all stripes would consider that a mature response.

But over and over again Rob avoided the actual argument and insisted on talking about his red herring.

And then FINALLY I realized that I could write down my original claim in such a way that it is IMPOSSIBLE for Rob to misinterpret it.

(That was easy, in retrospect: all I had to do was remove the language that he was using as the jumping-off point for his red herring).

That final, succinct statement of my argument is sitting there at the end of his blog... so far ignored by you, and by him. Perhaps he will be able to respond, I don't know, but you say you have read it, so you have had a chance to actually understand why it is that he has been talking about something of no relevance to my original argument.

But you, in your wisdom, chose to (a) completely ignore that statement of my argument, and (b) give me a patronizing rebuke for not being able to understand Rob's red herring argument.

Comment author: player_03 11 September 2013 02:22:34AM *  1 point [-]

I didn't mean to ignore your argument; I just didn't get around to it. As I said, there were a lot of things I wanted to respond to. (In fact, this post was going to be longer, but I decided to focus on your primary argument.)

Your story:

This hypothetical AI will say “I have a goal, and my goal is to get a certain class of results, X, in the real world.” [...] And we say “Hey, no problem: looks like your goal code is totally consistent with that verbal description of the desired class of results.” Everything is swell up to this point.

My version:

The AI is lying. Or possibly it isn't very smart yet, so it's bad at describing its goal. Or it's oversimplifying, because the programmers told it to, because otherwise the goal description would take days. And the goal code itself is too complicated for the programmers to fully understand. In any case, everything is not swell.

Your story:

Then one day the AI says “Okay now, today my goalX code says I should do this…” and it describes an action that is VIOLENTLY inconsistent with the previously described class of results, X. This action violates every one of the features of the class that were previously given.

My version:

The AI's goal was never really X. It was actually Z. The AI's actions perfectly coincide with Z.

In the rest of the scenario you described, I agree that the AI's behavior is pretty incoherent, if its goal is X. But if it's really aiming for Z, then its behavior is perfectly, terrifyingly coherent.

And your "obvious" fail-safe isn't going to help. The AI is smarter than us. If it wants Z, and a fail-safe prevents it from getting Z, it will find a way around that fail-safe.

I know, your premise is that X really is the AI's true goal. But that's my sticking point.

Making it actually have the goal X, before it starts self-modifying, is far from easy. You can't just skip over that step and assume it as your premise.
