Viliam comments on A few misconceptions surrounding Roko's basilisk - Less Wrong Discussion

Post author: RobbBB 05 October 2015 09:23PM


Comment author: V_V 06 October 2015 10:51:51AM 2 points

When Roko posted about the Basilisk, I very foolishly yelled at him, called him an idiot, and then deleted the post. [...] Why I yelled at Roko: Because I was caught flatfooted in surprise, because I was indignant to the point of genuine emotional shock, at the concept that somebody who thought they'd invented a brilliant idea that would cause future AIs to torture people who had the thought, had promptly posted it to the public Internet. In the course of yelling at Roko to explain why this was a bad thing, I made the further error---keeping in mind that I had absolutely no idea that any of this would ever blow up the way it did, if I had I would obviously have kept my fingers quiescent---of not making it absolutely clear using lengthy disclaimers that my yelling did not mean that I believed Roko was right about CEV-based agents [= Eliezer’s early model of indirectly normative agents that reason with ideal aggregated preferences] torturing people who had heard about Roko's idea. [...] What I considered to be obvious common sense was that you did not spread potential information hazards because it would be a crappy thing to do to someone. The problem wasn't Roko's post itself, about CEV, being correct.

I don't buy this explanation for EY's actions. From his original comment, quoted in the wiki page:

"One might think that the possibility of CEV punishing people couldn't possibly be taken seriously enough by anyone to actually motivate them. But in fact one person at SIAI was severely worried by this, to the point of having terrible nightmares, though ve wishes to remain anonymous."

"YOU DO NOT THINK IN SUFFICIENT DETAIL ABOUT SUPERINTELLIGENCES CONSIDERING WHETHER OR NOT TO BLACKMAIL YOU. THAT IS THE ONLY POSSIBLE THING WHICH GIVES THEM A MOTIVE TO FOLLOW THROUGH ON THE BLACKMAIL. "

"... DO NOT THINK ABOUT DISTANT BLACKMAILERS in SUFFICIENT DETAIL that they have a motive toACTUALLY [sic] BLACKMAIL YOU. "

"Meanwhile I'm banning this post so that it doesn't (a) give people horrible nightmares and (b) give distant superintelligences a motive to follow through on blackmail against people dumb enough to think about them in sufficient detail, though, thankfully, I doubt anyone dumb enough to do this knows the sufficient detail. (I'm not sure I know the sufficient detail.) "

"You have to be really clever to come up with a genuinely dangerous thought. "

"... the gist of it was that he just did something that potentially gives superintelligences an increased motive to do extremely evil things in an attempt to blackmail us. It is the sort of thing you want to be EXTREMELY CONSERVATIVE about NOT DOING."

This is evidence that Yudkowsky believed, if not that Roko's argument was correct as it stood, at least that it was plausible enough that it could be developed into a correct argument, and that he was genuinely scared by it.

It seems to me that Yudkowsky's position on the matter was unreasonable. LessWrong is a public forum unusually focused on discussion of AI safety; in particular, at that time it was focused on discussion of decision theories and moral systems. What better place to discuss possible failure modes of an AI design?
If one takes AI risk seriously and realizes that a utilitarian/CEV/TDT/one-boxing/whatever AI might have a particularly catastrophic failure mode, the proper thing to do would be to discuss it publicly, so that the argument can be either refuted or accepted; if accepted, that would imply scrapping that particular AI design and making sure that anybody who might create an AI is aware of that failure mode. Yelling and trying to sweep it under the rug was irresponsible.
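The decision-theoretic claim at issue in the quoted comments — that a blackmailer only gains a motive to follow through on a threat if the victim's behavior actually depends on modeling the blackmailer — can be made concrete with a toy expected-utility sketch. This is purely illustrative: all payoff numbers and the compliance probabilities are invented assumptions, not anything stated in the thread.

```python
# Toy model of blackmail incentives. Invented numbers throughout;
# a sketch of the standard argument, not anyone's actual formalism.

def blackmailer_payoff(follow_through: bool, p_comply: float,
                       gain: float = 10.0, cost: float = 2.0) -> float:
    """Expected payoff for a blackmailer.

    gain: value to the blackmailer of the victim complying.
    cost: cost of actually carrying out the threat against non-compliers.
    """
    expected_gain = p_comply * gain
    expected_cost = (1 - p_comply) * cost if follow_through else 0.0
    return expected_gain - expected_cost

# Case 1: the victim does not model the blackmailer, so the compliance
# probability is the same whether or not threats are ever carried out.
p = 0.1
assert blackmailer_payoff(True, p) < blackmailer_payoff(False, p)
# Following through is strictly worse for the blackmailer, so a purely
# consequentialist blackmailer has no motive to commit to the threat.

# Case 2: the victim simulates the blackmailer "in sufficient detail"
# that compliance depends on whether the follow-through policy is real.
p_if_credible, p_if_empty = 0.9, 0.1
assert blackmailer_payoff(True, p_if_credible) > blackmailer_payoff(False, p_if_empty)
# Now the follow-through policy pays, i.e. a motive exists.
```

On this toy reading, the quoted warnings amount to: keeping your decisions uncorrelated with hypothetical blackmailers keeps you in Case 1.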

Comment author: Viliam 06 October 2015 03:06:03PM 2 points

This is evidence that Yudkowsky believed (...) that at least it was plausible enough that it could be developed into a correct argument, and he was genuinely scared by it.

Just to be sure, since you seem to disagree with this opinion (whether it is actually Yudkowsky's opinion or not), what exactly is it that you believe?

a) There is absolutely no way one could be harmed by thinking about not-yet-existing dangerous entities, even if those entities will in the future be able to learn that the person was thinking about them in this specific way.

b) There is a way one could be harmed by thinking about not-yet-existing dangerous entities, but the way to do this is completely different from what Roko proposed.

If it happens to be (b), then it still makes sense to be angry about publicly opening the whole topic of "let's use our intelligence to discover the thoughts that may harm us by our thinking about them -- and let's do it in a public forum where people are interested in decision theories, so they are more qualified than average to find the right answer." Even if the proper way to harm oneself is different from what Roko proposed, making this a publicly debated topic increases the chance of someone finding the correct solution. The problem is not the proposed basilisk, but rather inviting people to compete at clever self-harm, especially the kind of people known to have trouble resisting such an invitation.

Comment author: anon85 06 October 2015 05:38:34PM 0 points

I'm not the person you replied to, but I mostly agree with (a) and reject (b). There's no way you could possibly know enough about a not-yet-existing entity to understand any of its motivations; the entities that you're thinking about and the entities that will exist in the future are not even close to the same. I outlined some more thoughts here.