Tetronian comments on Best career models for doing research? - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Comments (999)
Let me try to rephrase: correct FAI theory shouldn't have dangerous ideas. If we find that the current version does have dangerous ideas, then this suggests that we are on the wrong track. The "Friendly" in "Friendly AI" should mean friendly.
Pretty much correct in this case. Roko's original post was, in fact, wrong; correctly programmed FAIs should not be a threat.
(FAIs shouldn't be a threat, but a theory for creating an FAI will obviously have at least the potential to be used to create uFAIs. FAI theory will have plenty of dangerous ideas.)
I want to highlight at this point how you think about similar scenarios:
That isn't very reassuring. I believe that if you had the choice of either letting a Paperclip maximizer burn the cosmic commons or torturing 100 people, you'd choose to torture 100 people. Wouldn't you?
They are always a threat to some beings. For example, beings who oppose CEV or other AIs. Any FAI that would run a human version of CEV would be a potential existential risk to any alien civilisation. If you accept all this possible oppression in the name of what is subjectively friendliness, how can I be sure that you don't favor torture for some humans that support CEV, in order to ensure it? After all, you already allow for the possibility that many beings are being oppressed or possibly killed.
This seems to be true and obviously so.
Narrowness. You can parry almost any statement like this, by posing a context outside its domain of applicability.
From my point of view, and as I discussed in the post (this discussion got banned with the rest, although it's not exactly on that topic), the problem here is the notion of "blackmail". I don't know how to formally distinguish that from any other kind of bargaining, and the way in which Roko's post could be wrong that I remember required this distinction to be made (it could be wrong in other ways, but those I didn't notice at the time and don't care to revisit).
(The actual content edited out and posted as a top-level post.)
(I seem to have a talent for writing stuff, then deleting it, and then getting interesting replies. Okay. Let it stay as a little inference exercise for onlookers! And please nobody think that my comment contained interesting secret stuff; it was just a dumb question to Eliezer that I deleted myself, because I figured out on my own what his answer would be.)
Thanks for verbalizing the problems with "blackmail". I've been thinking about these issues in the exact same way, but made no progress and never cared enough to write it up.
Perhaps the reason you are having trouble coming up with a satisfactory characterization of blackmail is that you want a definition with the consequence that it is rational to resist blackmail and therefore not rational to engage in blackmail.
Pleasant though this might be, I fear the universe is not so accommodating.
Elsewhere VN asks how to unpack the notion of a status-quo, and tries to characterize blackmail as a threat which forces the recipient to accept less utility than she would have received in the status quo. I don't see any reason in game theory why such threats should be treated any differently than other threats. But it is easy enough to define the 'status-quo'.
The status quo is the solution to a modified game - modified in such a way that the time between moves increases toward infinity and the current significance of those future moves (be they retaliations or compensations) is discounted toward zero. A player who lives in the present and doesn't respond to delayed gratification or delayed punishment is pretty much immune to threats (and to promises).
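A minimal sketch of the discounting idea above, under assumed numbers. It is only an illustration, not a formal solution of the modified game: the function names, the geometric discount, and the payoffs are all hypothetical.

```python
def discounted_value(payoff: float, discount: float, delay: int) -> float:
    """Present value of a payoff that arrives `delay` moves from now."""
    return payoff * discount ** delay


def yields_to_threat(cost_of_complying: float, punishment: float,
                     discount: float, delay: int) -> bool:
    """True if complying now costs less than the discounted future punishment."""
    return cost_of_complying < discounted_value(punishment, discount, delay)


# Hypothetical numbers: complying costs 1, the threatened punishment costs 10.
# As the delay before the punishment grows, its present significance shrinks.
for delay in (1, 10, 100):
    print(delay, yields_to_threat(1.0, 10.0, discount=0.9, delay=delay))
```

With these numbers the threat works at short delays and stops working once the punishment is far enough in the future. The same arithmetic applies to promises, with a reward in place of the punishment.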
On RW it's called Headless Chicken Mode, when the community appears to go nuts for a time. It generally resolves itself once people have the yelling out of their system.
The trick is not to make any decisions based on the fact that things have gone into headless chicken mode. It'll pass.
[The comment this is in reply to was innocently deleted by the poster, but not before I made this comment. However, I think I'm making a useful point here, so would prefer to keep this comment.]
No, the rationale for deletion was not based on the possibility that his exact, FAI-based scenario could actually happen.
What was the grandparent?
Hm? Did my comment get deleted? I still see it.
I noticed you removed the content of the comment from the record on your user page. I would have preferred you not do this; those who are sufficiently curious and know about the trick of viewing the user page ought to have this option.
There was nothing particularly important or interesting in it, just a question I had been mildly curious about. I didn't think there was anything dangerous about it either, but, as I said elsewhere, I'm willing to take Eliezer's word for it if he thinks it is, so I blanked it. Let it go.
I know why you did it. My intention is to register disagreement with your decision. I claim it would have sufficed to simply let Eliezer delete the comment, without you yourself taking additional action to further delete it, as it were.
I could do without this condescending phrase, which unnecessarily (and unjustifiably) imputes agitation to me.
Sorry, you're right. I didn't mean to imply condescension or agitation toward you; it was written in a state of mild frustration, but definitely not at or about your post in particular.
Only if you disagree with the correctness of the moderator's decision.
Disagreement may be only partial. One could agree to the extent of thinking that viewing of the comment ought to be restricted to a more narrowly-filtered subset of readers.
Yes, this is a possible option, depending on the scope of the moderator's decision. Banning comments from a discussion, even if they are backed up and publicly available elsewhere, is still an effective tool in shaping the conversation.
yep, this one is showing as deleted
Weird. I see:
How does Eliezer's delete option work exactly? It stays visible to the author? Now I'm curious.
Wow. Even the people being censored don't know it. That's kinda creepy!
How did you work out that it had been deleted? Just by logging out, looking and trying to remember where you had stuff posted?
I think it's a standard tool: to the trolls, their trollish comments just look like they are being ignored. But I think it's impolite to delete comments made in good faith without notification and usable guidelines for cleaning up and reposting. (Hint hint.)
I only made one comment on the subject and I was rather confused that it was being ignored. I also knew I might have said too much about the Roko post and actually included a sentence saying that if I crossed the line I'd appreciate being told to edit it instead of having the entire thing deleted. So I just checked that one comment in particular. If other comments of mine have been deleted I wouldn't know about it, though this was the only comment in which I have discussed the Roko post.
I doubt that this is a deliberate feature.
Yes, I've been told that it was deleted but that I still see it since I'm logged in.
In that case I won't repeat what I said in it, partly because it'll just be deleted again but mainly because I actually do trust Eliezer's judgment on this. (I didn't realize that I was saying more than I was supposed to.) All I'll say about it is that it did not actually contain the question that Eliezer's reply suggests he thought it was asking, but it's really not important enough to belabor the point.
Reposting comments deleted by the authors or by moderators will be considered hostile behavior and interfering with the normal and intended behavior of the site and its software, and you will be asked to leave the Less Wrong site.
-- Eliezer Yudkowsky, Less Wrong Moderator.
This is certainly the case with regard to the kind of decision theoretic thing in Roko's deleted post. I'm not sure if it is the case with all ideas that might come up while discussing FAI.