gregconen comments on Hedging our Bets: The Case for Pursuing Whole Brain Emulation to Safeguard Humanity's Future - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Comments (244)
Most versions of torture, continued for your entire existence. You finally cease when you otherwise would (at the heat death of the universe, if nothing else), but your entire experience is spent being tortured. The type isn't really important, at that point.
First, the scenario you describe explicitly includes death, and as such falls under the 'embellishments' exception.
Second, thanks to the hedonic treadmill, any randomly-selected form of torture repeated indefinitely would eventually become tolerable, then boring. As you said,
Third, if I ever run out of other active goals to pursue, I could always fall back on "defeat/destroy the eternal tormentor of all mankind." Even with negligible chance of success, some genuinely heroic quest like that makes for a far better waste of my time and resources than, say, lottery tickets.
What if your hedonic treadmill were disabled, or bypassed by something like direct stimulation of your pain center?
You're going to die (or at least cease) eventually, unless our understanding of physics changes significantly. Eventually, you'll run out of negentropy to run your thoughts. My scenario only changes what happens between then and now.
Failing that, you can just be tortured eternally, with no chance of escape (no chance of escape is unphysical, but so is no chance of death). Even if the torture becomes boring (and there may be ways around that), an eternity of boredom, with no chance to succeed at any goal, seems worse than death to me.
When considering the potential harm you could suffer from a superintelligence that values harming you, you don't get to exclude some approaches it could take because they are too obvious. Superintelligences take obvious wins.
Perhaps. So consider other approaches the hostile superintelligence might take. It's not going to go easy on you.
Yes, I've considered the possibility of things like inducement of anterograde amnesia combined with application of procedure 110-Montauk, and done my best to consider nameless horrors beyond even that.
As I understand it, a superintelligence derived from a sadistic, sociopathic human upload would have some interest in me as a person capable of suffering, while a superintelligence with strictly artificial psychology and goals would more likely be interested in me as a potential resource, a poorly-defended pile of damp organic chemistry. Neither of those is anywhere near my ideal outcome, of course, but in the former, I'll almost certainly be kept alive for some perceptible length of time. As far as I'm concerned, while I'm dead, my utility function is stuck at 0, but while I'm alive my utility function is equal to or greater than zero.
Furthermore, even a nigh-omnipotent sociopath might be persuaded to torture on a strictly consensual basis by appealing to exploitable weaknesses in the legacy software. The same cannot be said of a superintelligence deliberately constructed without such security flaws, or one which wipes out humanity before its flaws can be discovered.
Neither of these options is actually good, but the human-upload 'bad end' is at least, from my perspective, less bad. That's all I'm asserting.
Yes, the superintelligence that takes an interest in harming you would have to come from some optimized process, like recursive self improvement of a psychopath upload.
A sufficient condition for the superintelligence to be indifferent to your well-being, and see you as spare parts, is an under-optimized utility function.
Your approach to predicting what the hostile superintelligence would do to you seems to be figuring out the worst sort of torture that you can imagine. The problem with this is that the superintelligence is a lot smarter and more creative than you. Reading your mind and making real your worst fears, constantly with no break or rest, isn't nearly as bad as what it would come up with. And no, you are not going to find some security flaw you can exploit to defeat it, or even slow it down. For one thing, the only way you will be able to think straight is if it determines that this maximizes the harm you experience. But the big reason is recursive self improvement. The superintelligence will analyze itself and fix security holes. You, puny mortal, will be up against a superintelligence. You will not win.
If you knew you were going to die tomorrow, would you now have a preference for what happens to the universe afterwards?
A superintelligence based on an uploaded human mind might retain exploits like 'pre-existing honorable agreements' or even 'mercy' because it considers them part of its own essential personality. Recursive self-improvement doesn't just mean punching some magical enhance button exponentially fast.
My preferences would be less relevant, given the limited time and resources I'd have with which to act on them. They wouldn't be significantly changed, though. I would, in short, want the universe to continue containing nice places for myself and those people I love to live in, and for as many of us as possible to continue living in such places. I would also hope that I was wrong about my own imminent demise, or at least the inevitability thereof.
If we are postulating a superintelligence that values harming you, let's really postulate that. In the early phases of recursive self improvement, it will figure out all the principles of rationality we have discussed here, including the representation of preferences as a utility function. It will self-modify to maximize a utility function that best represents its precursor conflicting desires, including hurting others and mercy. If it truly started as a psychopath, the desire to hurt others is going to dominate. As it becomes superintelligent, it will move away from having a conflicting sea of emotions that could be manipulated by someone at your level.
I was never suggesting it was anything magical. Software security, given physical security of the system, really is not that hard. The reason we have security holes in computer software today is that most programmers, and the people they work for, do not care about security. But a self improving intelligence will at some point learn to care about its software level security (as an instrumental value), and it will fix vulnerabilities in its next modification.
Is it fair to say that you prefer A: you die tomorrow and the people you currently care about will continue to have worthwhile lives and survive to a positive singularity, to B: you die tomorrow and the people you currently care about also die tomorrow?
If yes, then "while I'm dead, my utility function is stuck at 0" is not a good representation of your preferences.
Conflicts will be resolved, yes, but preferences will remain. A fully self-consistent psychopath might still enjoy weeping more than screams, crunches more than spurts, and certain victim responses could still be mood-breaking. It wouldn't be a good life, of course, collaborating to turn myself into a better toy for a nigh-omnipotent monstrosity, but I'm still pretty sure I'd rather have that than not exist at all.
For my preference to be meaningful, I have to be aware of the distinction. I'd certainly be happier during the last moments of my life with a stack of utilons wrapped up in the knowledge that those I love would do alright without me, but I would stop being happy about that when the parts of my brain that model future events and register satisfaction shut down for the last time and start to rot.
If cryostasis pans out, or, better yet, the positive singularity in scenario A includes reconstruction sufficient to work around the lack of it, there's some non-negligible chance that I (or something functionally indistinguishable from me) would stop being dead, in which case I pop back up to greater-than-zero utility. Shortly thereafter, I would get further positive utility as I find out about good stuff that happened while I was out.
You're aware of the distinction right now - would you be willing to act right now in a way which doesn't affect the world in any major way during your lifetime, but which makes a big change after you die?
Edit: It seems to me as if you noted the fact that your utility function is no longer instantiated after you die, and confused that with the question of whether anything after your death matters to you now.
Of course I would. Why does a difference have to be "major" before I have permission to care? A penny isn't much money, but I'll still take the time to pick one up, if I see it on the floor and can do so conveniently. A moth isn't much intelligence, or even much biomass, but if I see some poor thing thrashing, trapped in a puddle, I'll gladly mount a fingertip-based rescue mission unless I'd significantly endanger my own interests by doing so.
Anything outside the light cone of my conscious mind is none of my business. That still leaves a lot of things I might be justifiably interested in.
Again, its preferences are not going to be manipulated by someone at your level, even ignoring your reduced effectiveness from being constantly tortured. Whatever you think you can offer as part of a deal, it can unilaterally take from you. (And really, a psychopathic torturer does not want you to simulate its favorite reaction, it wants to find the specific torture that naturally causes you to react in its favorite way. It does not care about your cooperation at all.)
You seem to be confusing your utility with your calculation of your utility function. I expect that this confusion would cause you to wirehead, given the chance. Which of the following would you choose, if you had to choose between them:
Choice C: Your loved ones are separated from you but continue to live worthwhile lives. Meanwhile, you are given induced amnesia, and false memories of your loved ones dying.
Choice D: You are placed in a simulation separated from the rest of the world, and your loved ones are all killed. You are given induced amnesia, and believe yourself to be in the real world. You do not have detailed interactions with your loved ones (they are not simulated in such detail that they can be considered alive in the simulation), but you receive regular reports that they are doing well. These reports are false, but you believe them.
In the scenario I described, death is actual death, after which you cannot be brought back. It is not what current legal and medical authorities falsely believe to be that state.
You should probably read about The Least Convenient Possible World.
I've had a rather unsettling night's sleep, contemplating scenarios where I'm forced to choose between slight variations on violations of my body and mind, disconnect from reality, and loss of everyone I've ever loved. It was worth it, though, since I've come up with a less convenient version:
If choice D included, within the simulation, versions of my loved ones that were ultimately hollow, but convincing enough that I could be satisfied with them by choosing not to look too closely, and further if the VR included a society with complex, internally-consistent dynamics of a sort that are impossible in the real world but endlessly fascinating to me, and if in option C I would know that such a virtual world existed but be permanently denied access to it (in such a way that seemed consistent with the falsely-remembered death of my loved ones), that would make D quite a bit more tempting.
However, I would still choose the 'actual reality' option, because it has better long-term recovery prospects. In that situation, my loved ones aren't actually dead, so I've got some chance of reconnecting with them or benefiting from the indirect consequences of their actions; my map is broken, but I still have access to the territory, so it could eventually be repaired.
My first instinct is that I would take C over D, on the grounds that if I think they're dead, I'll eventually be able to move on, whereas vague but somehow persuasive reports that they're alive and well but out of my reach would constitute a slow and inescapable form of torture that I'm altogether too familiar with already. Besides, until the amnesia sets in I'd be happy for them.
Complications? Well, there's more than just warm fuzzies I get from being near these people. I've got plans, and honorable obligations which would cost me utility to violate. But, dammit, permanent separation means breaking those promises - for real and in my own mind - no matter which option I take, so that changes nothing. Further efforts to extract the intended distinction are equally fruitless.
I don't think I would wirehead, since that would de-instantiate my current utility function just as surely as death would. On the contrary, I scrupulously avoid mind-altering drugs, including painkillers, unless the alternative is incapacitation.
Think about it this way: if my utility function isn't instantiated at any given time, why should it be given special treatment over any other possible but nonexistent utility function? Should the (slightly different) utility function I had a year ago be able to dictate my actions today, beyond the degree to which it influenced my environment and ongoing personal development?
If something were hidden from me, even something big (like being trapped in a virtual world), and hidden so thoroughly that I never suspected it enough for the suspicion to alter my actions in any measurable way, I wouldn't care, because there would be no me which knew well enough to be able to care. Ideally, yes, the me that can see such hypotheticals from outside would prefer a map to match the territory, but at some point that meta-desire has to give way to practical concerns.