The most terrifying part of the experience for me is the idea that Eliezer could have such a strongly different reaction to the story; it made me less confident that something like CEV will converge nicely.
Eliezer having a different response to a fanfic you both ended up reading and even enjoying is a minor variation, yet it is what made you take seriously the thought: "Maybe the values of different humans are incompatible."
That terrifies me. People are different. The possibility should have been screaming at you every time you looked at the world. From personal interactions to the study of history and of different cultures, examples of this are numerous.
When I say that "it made me less confident," that doesn't mean "this is the one piece of evidence that convinced me to take objections to CEV seriously." Taken at face value, it means, "I go through life weighing the evidence I find for or against CEV, and this seemed like evidence against, which is always bad."
Taken the way it was intended, it means, "I thought of a joke that only people on LW would get and posted it."
I had two reactions at the end, more or less simultaneous:
There's something about many shards of value here, and one of the key shards getting lost in translation to pony land....
It's strange to have this as my first comment on LW, but I'd like to mention that the fanfic's author is currently being targeted by Internet trolls over her transhumanist stories, this one among them, to the point of receiving death threats. See this blog post of hers for details.
That is sad on many levels. It is horrible that someone can be attacked in real life for writing a pony fanfiction. That's just... totally fucked up.
Unfortunately, she blames the website for having upvote and downvote buttons, because those -- in her words -- encourage antagonistic behavior. I support her right not to be harassed, but I don't agree with her conclusion. At the very least, I don't think the presence of upvote and downvote buttons made LessWrong a hostile place... so obviously, there is something beyond the buttons. Buttons are just a community's tool to regulate itself. They cannot encode values. They express "want more" and "want less," but whatever it is that people want more or less of -- polite behavior or abuse -- that's up to them. A button is not able to encourage good behavior and discourage bad behavior. At best, it can enable its users to do so; at worst, it can be useless.
The proposed solution of removing downvotes and keeping only upvotes is just silly wishful thinking. Okay, let's have a site with upvotes only. Now a few people come there and start writing abuse, or start coordinating an attack on someone in real life. Hey, they can't be downvoted; they can only be upvoted! Because we have removed the antagonism buttons from the website. I guess it's obvious why such a situation is not superior.
Sure, an administrator deleting the offending comments or banning the users would help (for a few seconds, until they register again -- but don't underestimate trivial inconveniences). Guess what: administrators can delete offending comments and ban users on websites with downvote buttons, too. So in my opinion, the presence of the downvote buttons was not the problem; the admins avoiding taking responsibility for their website (beyond implementing the downvote button) was. And perhaps there is something fucked up about a community that encourages, or at least doesn't downvote, abuse.
So... the story of the abuse is horrible, but the proposed solution (which forms a critical part of the story) is just silly.
The issue with the buttons is that 4chan has a campaign to mass-downvote anything she does, maybe even with bots to do it automatically. Her texts have disappeared from the main page even though they're very popular, and every comment she posts appears almost immediately with downvotes. Removing downvotes wouldn't solve the underlying problem, sure, but it would make the abuse much more difficult to carry out: to remove her texts from public view, the abusers would have to mass-upvote everything else rather than just downvote her specific contributions.
Malicious people will take advantage of whatever mechanism is offered.
If you allow comments but not downvotes, they will use comment spam. If you allow downvotes from new accounts, they'll do the bury-brigade thing. If you require posters to get accounts and use their legal names and real locations, attackers will stalk people, find their homes, scare their children, and leave poo on their doorstep. If you have a mechanism for automatic takedowns of copyright violations, they'll send forged DMCA notices. (Yes, even though it's illegal to do so. It's a common tactic to get the identity of pseudonymous posters, since a DMCA counterclaim requires a statement of the poster's name, address, and telephone number!)
Attackers of this sort look for asymmetric attacks -- things that are relatively cheap, easy, and risk-free for them to do, and cause much more (1) grief and (2) time and energy expenditure on the part of the attacked person or site. The ideal attack is one that is quick and easily repeatable, causes the target great discomfort, and requires the target to spend a bunch of time to clean it up. The intention is to get the target to go away, to cease being visible; to "run them out of town," as it were.
(For an analogy, consider the act of a vandal spray-painting swastikas or dicks on someone's house. It makes the target feel very unsafe; it causes them a bunch more work to clean up than it cost the vandal to do it; it can be done quickly; and it's not very risky in a lot of neighborhoods.)
Attackers look for relative advantages — for instance, if the attackers have coding ability and the target does not, they can use automated posting (bots) or denial-of-service attacks. If the attackers have more free time (e.g. if they are unemployed youths and the target is a working mother's blog), they can post obscene spam or what-have-you at hours when the target is not online to moderate or respond. They also look for ways to increase their advantage — for instance, if they can ascertain the target's real-world identity while remaining anonymous themselves, the attackers can escalate to more credible threats, harassment with photos, "we know where you live", or the like.
Responses to this sort of attacker have to address the facts on the ground. They have to make it harder for attackers to drive up the costs (time, labor, and emotional) for legitimate users, without much additional encumbrance on the legitimate users.
Well, I don't really remember the exact boundary between Friendship is Optimal vs. Caelum est Conterrens, but...
The uploading process seemed to be destructive only for convenience's sake.
For me, it felt more realistic. I don't think anyone has actually thought of a non-destructive uploading process that is remotely plausible.
Attach lots of sensors to lots of axons and try to emulate the thing while it's running... to me, the online method sounds more plausible than the "look at axons with microscopes and try to guess what they do" approach. Nevertheless, imagining a scenario with non-destructive uploads... how many times would you allow people to upload? Ending up with questions like that, I think it's the destructive one that would generate less horrifyingness...
The aliens with star communication weren't destroyed. They were close enough to "human" that they were uploaded or ignored. What's more, CelestAI would probably satisfy (most of) the values of these aliens, who probably find "friendship" just as approximately-neutral as they and we find "ponies".
Read it more carefully. One or several paragraphs before the designated-human aliens, it is mentioned that CelestAI found many sources of complex radio waves which weren't deemed "human".
Consider the epistemic state of someone who knows that they have the attention of a vastly greater intelligence than themselves, but doesn't know whether that intelligence is Friendly. An even-slightly-wrong CAI will modify your utility function, and there's nothing you can do but watch it happen.
An even-slightly-wrong CAI won't modify your utility function because she isn't wrong in that way. An even-slightly-wrong CAI does do several other bad things, but that isn't one of them.
I didn't think Failed Utopia #4-2 was "failed" either
It's "failed" in the sense that it could have been a whole lot better than it was. As I said when it was first posted:
Making me the girl of my dreams? I'm currently unattached, so that sounds pretty good. Separating me from the social context I'm currently in? I'll adjust, although this would certainly upset most people. Taking away my video games and Magic cards? Over my dead body!
Haven't read anything in the Optimalverse.
But Failed Utopia 4-2 is very, very successful in a "Victory -- At Any Cost!!!!" way, and pretty unsuccessful in the less important stuff compared to something more incremental.
If I had to make a wild guess, I'd say that what would horrify Eliezer is that at the end (ROT13) gurer ner bayl gjb cbffvoyr glcrf bs vzzbegnyf, Ybbc Vzzbegnyf juvpu frrz pbtavgviryl fghagrq naq checbfryrff, naq Enl Vzzbegnyf jubfr fbyr fubja rknzcyr frrzf zrynapubyl, ertergshy naq srryvat vzcevfbarq ol vasvavgl.
I started reading the first page, and it looks like it's a fic about an engineered utopia with constructed simulated minds.
That's enough right there. Large-scale manufacture of pony minds for a video game is exactly the kind of thing that Eliezer would call horrifying, even if it didn't wrench his guts with terror. Think of it as endorsing that you should feel horror in response to the Optimalverse, even though you probably won't, because it's nicer than present reality in many respects and because human emotions don't properly reflect reality (scope neglect, hedonic treadmill, not crying when you walk through a cemetery until you reach the row of infant graves). Or maybe his guts do wrench. Different people get upset about all sorts of stimuli, from squirting blood to scraping nails to clicking computers to a microcosm inhabited by adorable inhuman sentients to whom no one gives proper moral consideration.
It also happens to serve Eliezer's interests to make it seem that he is an expert designer of utopias, against whose work everything else falls disastrously short. But that's not such a high standard.
That's what I thought until the last chapter. All the time, I was waiting for something horrible to happen. I thought that in the last chapter, at the latest, things would have to take some very dark turn indeed to make up for all the full-on utopianism that came before.
Instead, the protagonist becomes a godlike intelligence herself. She not only achieves things outside her virtual world, but gains a far deeper understanding of the physical universe than she could ever have as a human. And she's one of trillions to do so. I can't fathom any rational reason anyone wouldn't want to be that transpony, or one of the Superhappies from Three Worlds Collide.
Different people get upset about all sorts of stimuli, from squirting blood to scraping nails to clicking computers to a microcosm inhabited by adorable inhuman sentients to whom no one gives proper moral consideration.
Actually, if I recall correctly, in the original Friendship is Optimal, once they were constructed, the non-uploaded people received the same moral consideration as those originally human. They were designed to fit into a preconceived world but they weren't slaves. I'm not quite sure whether that feels bad because it's actually bad or because it's so very different from our current methods of manufacturing minds (roll genetic dice, expose to local memes, hope for the best).
Making house elves is horrible, but once they exist it's ceteris paribus better to satisfy their desire to serve than not.
(Making house elves is horrible because they are capable of suffering but not resistance. It's the "I Have No Mouth And I Must Scream" situation.)
I don't see it as bad at all, and I suspect most who do see it as bad do so because it's different from the current method. These minds are designed to have lives that humans would consider valuable, and that they enjoy in all their complexity. It is like making new humans in the usual way, but without the problems of an abusive upbringing (the one pony with an abusive upbringing wasn't a person at the time) or the other bad things that can happen to a human.
Since no one really discussed it yet... by far, I found the fear of uploading to be the most gut-wrenching part. No new ideas, but putting it into a story context made it hit home.
In terms of the utopia, the most concerning part for me was that Celestia did not seem to (ROT13) qrpvqr gung uhznaf inyhr guvatf va gur rkgreany raivebazrag. Pryrfgvn jnf unccl gb yvr va beqre gb znxr crbcyr(/cbavrf) unccl, rira jura gur cbal va dhrfgvba fcrpvsvpnyyl qrfverq gb xabj jung jnf tbvat ba va gur rkgreany jbeyq. Pryrfgvn jnf unccl gb fubj gur cbal n cvpgher bs gur rkgreany jbeyq qrfvtarq gb znkvznyyl fngvfsl inyhrf, vafgrnq bs (jung V jbhyq frr nf) ernyyl fngvfslvat gur cbal'f qrfver gb xabj.
So you think you can guess that character's desire more accurately than a godlike AI with full access to her mind could?
Yes. The author wrote that part because it was a horrifying situation. It isn't a horrifying situation unless the character's desire is to actually know. Therefore, the character wanted to actually know. I can excuse the other instances of lying as tricks to get people to upload, thus satisfying more values than are possible in 80-odd years; that seems a bit out of character for Celestia though.
I personally didn't find the actual experience at Equestria itself terrifying at all. It was a little disturbing at first, but almost all of that was sheer physical disgust or a knee-jerk sour grapes reaction. But it seems to avoid almost all of the pitfalls of failed Utopias everywhere:
That said, there were moments of genuine horror, mainly stuff people have pointed out before:
I suspect your fridge logic would be solved by fvzcyl abg trggvat qb jung ur jnagrq, hagvy ur jvfurq ng fbzr cbvag gung ur jbhyq abg or n fbpvbcngu. I'm more worried about the part you rot13ed, and I suspect it's part of what makes Eliezer consider it horror. I feel that's the main horror part of the story.
There are also the issues of Celestia lying to Lavender when she clearly wants the truth on some level, the worry about those who would have uploaded (or uploaded earlier) if they had had a human option, and the lack of obviously-possible medical and other care for the unuploaded humans (whose values could be satisfied almost as much as those of the ponies). These are instances where an AI is almost-but-not-quite Friendly (and, in the case of a simple fictional story rather than everyday life, could have been easily avoided by telling Celestia to "satisfy values" and that most people she meets initially want friendship and ponies). These are probably the parts that Eliezer is referring to, because of his work in avoiding uFAI and almost-FAI. On the other hand, they are far better than his default scenario, the no-AI scenario, and the Failed Utopia #4-2 scenario in the OP. EDIT: Additionally, in the story at least, everything except the lying was easily avoidable by having Celestia just maximize values, while telling her that most people she meets early on will value friendship and ponies (and the lying at the end seems somewhat out-of-character because it doesn't actually maximize values).
One other thing some might find horrifying, but probably not Eliezer, is the "Does Síofra die" question. To me, and I presume to him, the answer is "surely not", and the question of ethics boils down to a simple check "does there ever exist an observer moment without a successor; i.e., has somebody died?". Obviously some people do die preventable deaths, but Síofra isn't one of them.
Just finished reading.
It's kind of sad in a... grand way.
No one remembering anymore what exactly being "human" means. But... what do we expect? I don't see any human values that are not satisfied; it just does not "feel like home" that much. But still, orpbzvat bar bs gur yrffre fhcrevagryyvtraprf naq fgvyy univat n fcrpvny cynpr va zvaq sbe fgne gerx? It's as heart-warming as it gets, in a cold, dark, and strange universe.
(If only we could do this well.)
I think I've solved it:
Yudkowsky had read Friendship is Optimal, a soft-horror story detailing a plausible outcome of Hasbro-sponsored [F]AGI. This story is roughly the origin of the so-called "Optimalverse" in which Caelum est Conterrens is set. The latter links to the former for context, so I imagine that he simply got the two confused (or even forgot the latter) when compiling links from his browser history for that write-up.
(If I only remembered FiO, I'd certainly have pegged it as “Caelum est Conterrens” if you asked me offhand; in fact, I'd just as likely mismatch the titles even if I did remember CeC, if I hadn't burned the which-is-which into my memory by contemplating the confusion you're expressing by this post.)
I think that this isn't an optimal future, but it is still pretty good. I think I would take it over our current reality. It's more a sense of a fairly good future that could have been even better if a few lines of code had been better considered.
I think the Optimalverse and other similar constructs are about the best one can do when trying to construct a utopia out of language with human minds. It would be quite sad if this were the best we could do.
A singularity where every entity is satisfied is a loss. Not nearly as bad as other losses but still a huge loss for all the other possibilities that could have been.
At the end of the fic: Vg vf vzcyvrq gung znal ragvgvrf unir erghearq gb gur irel fbeg bs rkvfgrapr jr ner gelvat gb rfpncr, anzryl bar svyyrq jvgu rssbegf gb rkgraq yvsrfcna vaqrsvavgryl (guebhtu gur perngvba bs arj havirefrf be jungrire) engure guna nal orggre tbnyf.
The biggest horror aspect for me (also from the original) was that (rot13) nal aba-uhzna vagryyvtrapr unf ab punapr. Aba-uhzna vagryyvtrag yvsr trgf znqr vagb pbzchgebavhz, gb srrq gur rire tebjvat cbal fcurer. Vg vf gur gbgny trabpvqr bs rirel aba-uhzna enpr.
(rot13) Nyvraf ner rvgure cbavsvrq be abg qrcraqvat ba jurgure fur erpbtavmrf gurz nf uhzna. Vg raqf jvgu ure svaqvat na rknzcyr bs nyvraf gung fur guvaxf ner uhzna, juvyr fgebatyl vzcylvat gung fur'f qrfgeblrq fbpvrgvrf jvgubhg rira abgvpvat ("Fur unq frra znal cynargf tvir bss pbzcyrk, aba-erthyne enqvb fvtanyf, ohg hcba vairfgvtngvba, abar bs gubfr cynargf unq uhzna yvsr, znxvat gurz fnsr gb erhfr nf enj zngrevny gb tebj Rdhrfgevn.")
That sounds suspiciously like Mass Effect, in which the Reapers ner gur erfhyg bs n HSNV gung qrpvqrq gb znvagnva gur tnynkl naq fbyir gur ceboyrz bs (uhzna-yriry uhzna-yvxr) ebobg eroryyvbaf fhpu nf gur Trgu ol trabpvqvat nyy fcnprsnevat pvivyvmngvbaf rire 50X lrnef naq gura tebjvat vgf ahzoref ol hcybnqvat naq punatvat gur hgvyvgl shapgvbaf bs gur ivpgvzf, gura znxvat gurz vagb arj havgf.
The only thing that bothered me about that one was its title. She was deeply conflicted, but I did not get a sense of true terror.
The thing that I found most frightening was that at the end, when the protagonist had become a servant to Celestia, she optimized for the satisfaction of values through friendship and ponies, which made me think that in becoming an uplift, she eventually had to trade in her utility function for Celestia's.
The bits with surveillance, manipulation and deceit were somewhat spooky, and I suppose someone might find them scary in the arsenal of a superintelligence. Not me, though, in this instance - the setting and the knowledge/presumption that the original Optimalverse is as bad as it gets counter the potential for scariness.
The uploading back-and-forth gets a little boring for an old-timer as well. But it's worth a read anyway.
Actually, I have pretty much your same misgivings/objections; it didn't feel particularly scary to me either :-/
Maybe it's the fact that uploading/etc. is basically a foregone conclusion when facing a superintelligence? Although I thought that was obvious from the concept itself :-/
So Eliezer said in his March 1st HPMOR progress report:
So I read that and it was certainly very much worth reading - thanks for the recommendation! Obviously, the following contains spoilers.
I'm confused about how the story is supposed to be "terrifying". I rarely find any fiction scary, but I suspect that this is about something else: I didn't think Failed Utopia #4-2 was "failed" either and in Three Worlds Collide, I thought the choice of the "Normal" ending made a lot more sense than choosing the "True" ending. The Optimalverse seems to me a fantastically fortunate universe, pretty much the best universe mammals could ever hope to end up in, and I honestly don't see how it is a horror novel, at all.
So, apparently there's something I'm not getting. Something that makes an individual's hard-to-define "free choice" more valuable than her much-easier-to-define happiness. Something like a paranoid schizophrenic's right not to be treated.
So I'd like the dumb version, please. What's terrifying about the Optimalverse?