JenniferRM comments on Harry Potter and the Methods of Rationality discussion thread, February 2015, chapter 113 - Less Wrong Discussion
Just finished reading. Wow! This story is so bleak. I suspect Voldemort just "identity raped" Harry into becoming an Unfriendly Intelligence? Or at least a grossly grossly suboptimal one. Harry himself seems to be dead.
I'm going to call him HarryPrime now, because I think the mind contained in Riddle2/Harry's body before and after this horror was perpetrated should probably not be modeled as "the same person" as just prior to it.
HarryPrime is based on Harry (sort of like an uploaded and modified human simulation is based on a human) but is not the same, because he has been imbued with a mission that he must implacably pursue, one that has Harry's identity (and that of the still unconscious(!) and never interviewed(!) Hermione) woven into it as part of its motivational structure, in a sort of twist on coherent extrapolated volition.
Versus how "old Harry" and "revived Hermione" were "#included" into the motivational structure of HarryPrime:
My estimate of Voldemort's intelligence just dropped substantially. He is well trained and in the fullness of his power, but he isn't wise... at all. I'd been modeling him as relatively sane, because of past characterization, but I didn't predict this at all.
(There are way better ways to get a hypothetical HarryPrime to "not do things" than giving him a mission as an unstoppable risk mitigation robot. Of course, prophecy means self-consistent time travel is happening in the story, and self-consistent time travel nearly always means that at least some characters will be emotionally or intellectually blinded to certain facts (so that they do the things that bring about the now-inevitable future) unless they are explicitly relying on self-consistency to get an outcome they actively desire, so I guess Voldemort's foolishness is artistically forgivable :-P
Also, still going meta on the story, this is a kind of beautiful way to "spend" the series... bringing it back to AI risk mitigation themes in such a powerfully first person way. "You [the reader identifying with the protagonist] have now been turned by magic into an X-risk mitigation robot!")
Prediction: It makes sense now why Riddle2/HarryPrime will tear apart the stars in heaven. They represent small but real risks. He has basically been identity raped into becoming a sort of Pierson's Puppeteer (from Larry Niven's universe) on behalf of Earth rather than on behalf of himself, and in Niven's stories the puppeteers' evolved cowardice (they evolved from herd animals, and are ruled by "the hindmost" rather than a "leader") forced them into minor planetary engineering.
As explained in Le Wik:
Prediction: HarryPrime's first line will be better than any in the LW thread where people talked about the one-sentence AI box experiment. Eliezer read that long ago and has thought a lot about the general subject.
Something I'm still not sure about is what exactly HarryPrime will be aiming for. I think that's where Eliezer retains some play in his control over whether the ending is very short and bleak or longer and less bleak.
Voldemort kept talking about "destruction of the world" and "destroying the world" and so on. He didn't say the planet had to have people on it, but he might not have been talking about the planet. "The world" in normal speech often seems to mean in practice something like "the social world of the humans who are salient to us". In the USA, for example, people will often say "no one in the world does X" when there are people in other countries who do, and if someone points this out they will be accused of quibbling. Similarly, we tend to talk about "saving the earth" and it doesn't really mean the mantle or the core; it primarily means the biosphere and the economy and humans and stuff.
From my perspective, this was the key flaw of the intent:
The literal text appears to be:
And then the errata and full intention was:
In the shorter and sadder ending, I think it is likely that HarryPrime will escape, but not really care about people, and become an optimizing preservation agent of the mere planet. Thus Harry might escape the box and then start removing threats to the physical integrity of the earth's biosphere.
Also the "trusted friend" stuff is dangerous if Hermione doesn't wake up with a healthy normal mind. In canon, resurrection tended to create copies of what the resurrector remembered of a person, not the person themselves.
In the less sad ending I hope/think that HarryPrime will retain substantial overlap with the original Harry, Hermione will be somewhat OK, and the oath will only cause HarryPrime to be constrained in limited and reasonably positive ways. Maybe he will be risk averse. Maybe he will tear apart the stars because they represent a danger to the earth. Maybe he will exterminate every alien in the galaxy that could pose a threat to the earth. Maybe he will constrain the free will of every human on earth to not allow them to put the earth at risk... but he will still sorta be "the old Harry" while doing so.
I'm curious just how dark Eliezer could make such an ending, if he were inspired to try as hard as possible without concern for other goals/strategy. 'Twould be an interesting read.
Maybe it would be intellectually interesting, but I'm not sure I'd want to read it... it has been a long time since I was into the horror genre.
Hopefully some kind soul will come along and grace us with this spin-off.
I expect that (as with Three Worlds Collide), EY has already written both endings, and will show us both if we win.
Not the bad ending, the ending where Harry survives and has been transformed by the vow into an unFriendly intelligence.
I see, you're disagreeing with JenniferRM's prediction that this is what the official Bad Ending will be, but you want to see it written anyway. (Or maybe you agree with, or are agnostic about, Jennifer's prediction as far as it goes, but even so don't think that the Bad Ending will be as inspiredly dark as Dorikka proposed.)
Then I agree with you!