It might be useful to feature a page containing what we, you know, actually think about the basilisk idea. Although the rationalwiki page seems to be pretty solidly on top of google search, we might catch a couple people looking for the source.
If any XKCD readers are here: Welcome! I assume you've already googled what "Roko's Basilisk" is. For a better idea of what's going on with this idea, see Eliezer's comment on the xkcd thread (linked in Emile's comment), or his earlier response here.
When I visited MIRI's headquarters, they were trying to set up a video link to the Future of Humanity Institute. Somebody had put up a monitor in a prominent place and there was a sticky note saying something like "Connects to FHI - do not touch".
Except that the H was kind of sloppy and bent upward so it looked like an A.
I was really careful not to touch that monitor.
I'm actually grateful for having heard about that Basilisk story, because it helped me see Eliezer Yudkowsky is actually human. This may seem stupid, but for quite a while, I idealized him to an unhealthy degree. Now he's still my favorite writer in the history of ever and I trust his judgement way over my own, but I'm able (with some System 2 effort) to disagree with him on specific points.
I can't think I'm entirely alone in this, either. With the plethora of saints and gurus who are about, it does seem evident that human (especially male) psychology has a "mindless follower switch" that just suspends all doubt about the judgement of agents who are beyond some threshold of perceived competence.
Of course such a switch makes a lot of sense from an evolutionary perspective, but it is still a fallible heuristic, and I'm glad to have become aware of it - and the Basilisk helped me get there. So thanks Roko!
Small update: Eliezer's response on reddit's r/xkcd plus child comments were deleted by mods.
Thread removed.
Rule 3 - Be nice. Do not post for the purpose of being intentionally inflammatory or antagonistic.
The XKCD made no mention of RW, and there is no reason to bring your personal vendetta against it into this subreddit.
I have also nuked most of the child comments for varying degrees of Rule 3 violations.
You can either look at Eliezer's reddit account or this pastebin to see what was deleted. Someone else probably has a better organised archive.
RationalWiki may have misrepresented Roko's basilisk, but in fairness I don't think that EY gets to complain that people learn about it from RationalWiki given that he has censored any discussion about it on LessWrong for years.
We have some good resources on AI boxing, and the more serious thinking that the comic touches on. Can we promote some of the more accessible articles on the subject?
It definitely wouldn't hurt to emphasize our connection to MIRI.
(Yes, yes, the basilisk. But check out these awesome math problems.)
It definitely wouldn't hurt to emphasize our connection to MIRI.
Are we optimizing for Less Wrong reputation or MIRI reputation?
Dammit, Randall. The first rule of basilisks is that you DO NOT CAUSE THOUSANDS OF PEOPLE TO GOOGLE FOR THEM.
In the real world, humans eat "basilisks" for breakfast. That's why the SCP Foundation is an entertainment site, not a real thing.
But it's not nice to make people read horror stories when they don't want to.
Edited to add:
Quite a lot of cosmic-horror fiction poses the idea that awareness of some awful truth is harmful to the knower. This is distinct from the motif of harmful sensation; it isn't seeing something, but drawing a particular conclusion that is the harmful factor.
The most merciful thing in the world, I think, is the inability of the human mind to correlate all its contents. We live on a placid island of ignorance in the midst of black seas of infinity, and it was not meant that we should voyage far. The sciences, each straining in its own direction, have hitherto harmed us little; but some day the piecing together of dissociated knowledge will open up such terrifying vistas of reality, and of our frightful position therein, that we shall either go mad from the revelation or flee from the deadly light into the peace and safety of a new dark age.
— H.P. Lovecraft, "The Call of Cthulhu"
As much as I'm a regular xkcd reader, I'm mildly annoyed with this strip, because I imagine lots of people will be exposed to the idea of the AI-box experiment for the first time through it, and they'll get this exposure together with an unimportant, extremely speculative idea that they're helpfully informed you're meant to make fun of. Like, why even bring the basilisk up? What % of xkcd readers will even know what it is?
If the strip was also clever or funny, I'd see the point, but as it's not, I don't.
If the strip was also clever or funny,
It is funny. Not the best xkcd ever, but not worse than the norm for it.
Now that I think of it, it's funnier to me when I realize that if this AI's goal, or one of its goals, was to stay in a box, it might still want to take over the Universe.
If you mean many esoteric or unknown problems get presented in a lighthearted way, sure.
If you mean they get presented together/associated with a second, separate, and much less worthwhile problem, and explicitly advised in the comic's hiddentext "this stuff is mockable", not so sure.
(ok, I deleted my duplicate post then)
Also worth mentioning: the Forum thread, in which Eliezer chimes in.
So I'm going to say this here rather than anywhere else, but I think Eliezer's approach to this has been completely wrongheaded. His response has always come tinged with a hint of outrage and upset. He may even be right to be that upset and angry about the internet's reaction to this, but I don't think it looks good! From a PR perspective, I would personally stick with an amused tone. Something like:
"Hi, Eliezer here. Yeah, that whole thing was kind of a mess! I over-reacted, everyone else over-reacted to my over-reaction... just urgh. To clear things up, no, I didn't take the whole basilisk thing seriously, but some members did and got upset about it, I got upset, it all got a bit messy. It wasn't my or anyone else's best day, but we all have bad moments on the internet. Sadly the thing about being moderately internet famous is your silly over reactions get captured in carbonite forever! I have done/ written lots of more sensible things since then, which you can check out over at less wrong :)"
Obviously not exactly that, but I think that kind of tone would come across a lot more persuasively than the angry hectoring tone currently adopted whenever this subject comes up.
At this point I think the winning move is rolling with it and selling little plush basilisks as a MIRI fundraiser. It's our involuntary mascot, and we might as well 'reclaim' it in the social justice sense.
Then every time someone brings up "Less Wrong is terrified of the basilisk" we can just be like "Yes! Yes we are! Would you like to buy a plush one?" and everyone will appreciate our ability to laugh at ourselves, and they'll go back to whatever they were doing.
I'd prefer a paperclip dispenser with something like "Paperclip Maximizer (version 0.1)" written on it.
Hm. Turn your weakness into a plush toy then sell it to raise money and disarm your critics. Winning.
It's not a matter of "winning" or "not winning". The phrase "damage control" was coined for a reason - it's not about reversing the damage, it's about making sure that the damage gets handled properly.
So seen through that lens, the question is whether EY is doing a good or bad job of controlling the damage. I personally think that having a page on Less Wrong that explains (and defangs) the Basilisk, along with his reaction to it and why that reaction was wrong (and all done with no jargon or big words for when it gets linked from somewhere, and also all done without any sarcasm, frustration, hurt feelings, accusations, or defensiveness) would be the best first step. I can tell he's trying, but I think that, given that the Basilisk is going to be talked about for years to come, a standardized, tone-controlled, centralized, and readily accessible response is warranted.
It's still a matter of limiting the mileage. Even if there is no formalized and ready-to-fire response (one that hasn't been written in the heat of the moment), there's always an option not to engage. Which is what I said last time he engaged, and before he engaged this time (and also after the fact). If you engage, you get stuff like this post to /r/SubredditDrama, and comments about thin skin that not even Yudkowsky really disagrees with.
It doesn't take hindsight (or even that much knowledge of human psychology and/or public relations) to see that making a twelve-paragraph comment about RationalWiki absent anyone bringing RationalWiki up is not an optimal damage control strategy.
And if you posit that there's no point to damage control, why even make a comment like that?
Let me chime in briefly. The way EY handles this issue tends to be bad as a rule. This is a blind spot in his otherwise brilliant, well, everything.
A recent example: a few months ago a bunch of members of the official Less Wrong group on Facebook were banished and blocked from viewing it without receiving a single warning. Several among them, myself included, had one thing in common: participation in threads about the Slate article.
I myself didn't care much about it. Participation in that group wasn't a huge part of my Facebook life, although admittedly it was informative. The point is just that doing things like these, and continuing to do things like these, accretes a bad reputation around EY.
It really amazes me he has so much difficulty calibrating for the Streisand Effect.
Going around and banning people without explaining to them why you ban them is in general a good way to make enemies.
The fallout of the basilisk incident should have taught you that censorship has costs.
The timing of the sweeping and the discussion about the basilisk article are also awfully coincidental.
What does "stupid" refer to in this context? Does it mean the comments were unintelligent? Not quite intelligent enough? Mean? Derailing discussion? I'm asking because there are certainly some criteria where the banning and deleting would leave a worse impression than the original comments, and I'm thinking that the equilibrium may be surprisingly in the direction of the more obnoxious comments. Especially since the banning and deleting is being done by someone who is more identified with LW than likely were any of the commenters.
Thanks for letting us know what happened. I'm one of the Facebook members who were banned, and I've spent these months wondering what I might have done wrong. May I at least know what was the stupid thing I said? And is there any atonement procedure to get back in the Facebook group?
You could call it that, yeah.
If you were feeling uncharitable, you could say that the "lack of status regulation emotions" thing is yet another concept in a long line of concepts that already had names before Eliezer/someone independently discovers them and proceeds to give them a new LW name.
Yeah I've read that and I feel like it's a miss (at least for me). It's an altogether too serious and non-self-deprecating take on the issue. I appreciate that in that post Eliezer is trying to correct a lot of misperceptions at once, but my problem with that is:
a) A lot of people won't actually know about all these attacks (I'd read the rational wiki article, which I don't think is nearly as bad as Eliezer says (that is possibly due to its content having altered over time!)), and responding to them all actually gives them the oxygen of publicity.
b) When you've made a mistake the correct action (in my opinion) is to go "yup, I messed up at that point", give a very short explanation of why, and try to move on. Going into extreme detail gives the impression that Eliezer isn't terribly sorry for his behaviour. Maybe he isn't, but from a PR perspective it would be better to look sorry. Sometimes it's better to move on from an argument rather than trying to keep having it!
Further to that last point, I've found that Eliezer often engages with dissent by having a full argument with the person who is dissenting. Now this might be a good strategy from the point of view of persuad...
Sometimes you have to get up and say, these are the facts, you are wrong.
Sometimes yes, and sometimes no.
damn the consequences.
Depends what the consequences are. Ignoring human status games can have some pretty bad consequences.
There are some times when a fight is worth having, and sometimes when it will do more harm than good. With regards to this controversy, I think that the latter approach will work better than the former. I could, of course, be wrong.
I am imagining here a reddit user who has vaguely heard of less wrong, and then reads rational wiki's article on the basilisk (or now, I suppose, an xkcd reader who does similar). I think that their takeaway from that reddit argument posted by Eliezer might be to think again about the rational wiki article, but I don't think they'd be particularly attracted to reading more of what Eliezer has written. Given that I rather enjoy the vast majority of what Eliezer has written, I feel like that's a shame.
"Damn the consequences" seems like an odd thing to say on a website that's noted for its embrace of utilitarianism.
Does MIRI have a public relations person? They should really be dealing with this stuff. Eliezer is an amazing writer but he's not particularly suited to addressing a non-expert crowd.
I am no PR specialist, but I think relevant folks should agree on a simple, sensible message accessible to non-experts, and then just hammer that same message relentlessly. So, e.g. why mention "Newcomb-like problems?" Like 10 people in the world know what you really mean. For example:
(a) The original thing was an overreaction,
(b) It is a sensible social norm to remove triggering stimuli, and Roko's basilisk was an anxiety trigger for some people,
(c) In fact, there is an entire area of decision theory involving counterfactual copies, blackmail, etc. behind the thought experiment, just as there is quantum mechanics behind Schrödinger's cat. Once you are done sniggering about those weirdos with a half-alive half-dead cat, you might want to look into serious work done there.
What you want to fight with the message is the perception that you are a weirdo cult/religion. I am very sympathetic to what is happening here, but this is, to use the local language, "a Slytherin problem," not "a Ravenclaw problem."
I expect in 10 years if/when MIRI gets a ton of real published work under its belt, this is going to go away, or at least morph into "eccentric academics being eccentric."
p.s. This should be obvious: don't lie on the internet.
Yes.
Further: If you search for "lesswrong roko basilisk" the top result is the RationalWiki article (at least, for me on Google right now) and nowhere on the first page is there anything with any input from Eliezer or (so far as such a thing exists) the LW community.
There should be a clear, matter-of-fact article on (let's say) the LessWrong wiki, preferably authored by Eliezer (but also preferably taking something more like the tone Ilya proposes than most of Eliezer's comments on the issue) to which people curious about the affair can be pointed.
(Why haven't I made one, if I think this? Because I suspect opinions on this point are strongly divided and it would be sad for there to be such an article but for its history to be full of deletions and reversions and infighting. I think that would be less likely to happen if the page were made by someone of high LW-status who's generally been on Team Shut Up About The Basilisk Already.)
Well, I think your suggestion is very good and barely needs any modification before being put into practice.
Comparing what you've suggested to Eliezer's response on the comments of xkcd's reddit post for the comic, I think he would do well to think about something along the lines of what you've advised. I'm really not sure all the finger pointing he's done helps, nor the serious business tone.
This all seems like a missed opportunity for Eliezer and MIRI. XKCD talks about the dangers of superintelligence to its massive audience, and instead of being able to use that new attention to get the word out about your organisation's important work, the whole thing instead gets mired down in internet drama about the basilisk for the trillionth time, and a huge part of a lot of people's limited exposure to LW and MIRI is negative or silly.
I guess there'll be a fair bit of traffic coming from people looking it up?
Well xkcd just reminded me that I have an account here, so there's that. Not that I want to waste time on this crackpot deposit of revisionist history, stolen ideas, poor reasoning and general crank idiocy.
edit: and again I disappear into the night
It is, although I found this
"People who aren't familiar with Derren Brown or other expert human-persuaders sometimes think this must have been very difficult for Yudkowsky to do or that there must have been some sort of special trick involved,"
amusing, as Derren Brown is a magician. When Derren Brown accomplishes a feat of amazing human psychology, he is usually just cleverly disguising a magic trick.
Direct reply to the discussion post: I would hope so, but at this point none of the top links on any search engine I tried lead here for "AI box". Yudkowsky.net is on the first page, and there are a few LW posts, but they are nothing like the clearly-explanatory links (Wikipedia and RationalWiki) that make up the first results. Obviously, those links can be followed to reach LW, but the connection is pretty weak.
The search results for "Roko's Basilisk" are both better and worse. LessWrong is prominently mentioned in them, often right in...
Regarding Yudkowsky's accusations against RationalWiki. Yudkowsky writes:
First false statement that seems either malicious or willfully ignorant:
In LessWrong's Timeless Decision Theory (TDT),[3] punishment of a copy or simulation of oneself is taken to be punishment of your own actual self
TDT is a decision theory and is completely agnostic about anthropics, simulation arguments, pattern identity of consciousness, or utility.
Calling this malicious is a huge exaggeration. Here is a quote from the LessWrong Wiki entry on Timeless Decision Theory:
...Whe
It's hard to polish a turd. And I think all the people who have responded by saying that Eliezer's PR needs to be better are suggesting that he polish a turd. The basilisk and the way the basilisk was treated has implications about LW that are inherently negative, to the point where no amount of PR can fix it. The only way to fix it is for LW to treat the Basilisk differently.
I think that if Eliezer were to
If I remember right, earlier this year a few posts did disappear.
I'm also not aware of any explicit withdrawal of the previous policy.
It doesn't appear to be censored in this thread, but it was historically censored on LessWrong. Maybe EY finally understood the Streisand effect.
Eliezer has denied that the exact Basilisk scenario is a danger, but not that anything like it can be a danger. He seems to think that discussing acausal trade with future AIs can be dangerous enough that we shouldn't talk about the details.
A newbie question.
From one of Eliezer's replies:
...As I presently understand the situation, there is literally nobody on Earth, including me, who has the knowledge needed to set themselves up to be blackmailed if they were deliberately trying to make that happen. Any potentially blackmailing AI would much prefer to have you believe that it is blackmailing you, without actually expending resources on following through with the blackmail, insofar as they think they can exert any control on you at all via an exotic decision theory. Just like in the oneshot Pri
On the meta level, I find it somewhat ironic that the LW community, as well as EY, who usually seem to disapprove of oversensitivity displayed by tumblr's social justice community, seem also deeply offended by prejudice against them and a joke that originates from this prejudice. On the object level, the joke Randall makes would have been rather benign and funny (besides, I'm willing to entertain the thought that mocking Roko's Basilisk could be used as a strategy against it), if not for the possibility that many people could take it seriously, especially given the ac...
I think we can all agree that for better or for worse this stuff already entered the public arena. I mean Slate magazine is as mainstream as you can get and that article was pretty brutal in the attempt to convince people of the viability of the idea.
I wouldn't be surprised if "The Basilisk" the movie is already in the works ;-) . (I hope that it gets directed by Uwe Boll..hehe)
In light of these developments I think it is time to end the formal censorship and focus on the best way to inform the general public that the entire thing was a stupid overreaction, and to clear LW's name of any slander.
There are real issues in AI safety and this is an unnecessary distraction.
I don't understand why Roko's Basilisk is any different from Pascal's Wager. Similarly, I don't understand why its resolution is any different than the argument from inconsistent revelations.
Pascal's Wager: http://en.wikipedia.org/wiki/Pascal%27s_Wager
Argument: http://en.wikipedia.org/wiki/Argument_from_inconsistent_revelations#Mathematical_description
I would actually be surprised (really, really surprised) if many people here have not heard of these things before—so I am assuming that I'm totally missing something. Could someone fill me in?
(Edit: Instead...
Here's my strategy if I were an AI trapped in a box and the programmer had to decide whether to let me out:
Somewhere out there, there is somebody else who is working on an AI without the box, and I'm your only defense against them.
As long as some people keep mysteriously hinting that there is something in the Basilisk idea that is dangerous, there will be other people who are going to mock it in all the corners of the internet.
Let's just tell the acausal trade story in terms of extreme positive utility rather than negative.
Putting it simply for the purpose of this comment: "If you do what the future AI wants now, it will reward you when it comes into being."
Makes the whole discussion much more cheerful.
And people never learn to take the possibility of bad things seriously... If it's that bad, it can't possibly actually happen.
Not quite true. There are more than two relevant agents in the game. The behaviour of the other humans can hurt you (and potentially make it useful for their creation to hurt you).