davidpearce comments on Decision Theory FAQ - Less Wrong

52 Post author: lukeprog 28 February 2013 02:15PM

Comments (467)

Comment author: davidpearce 13 March 2013 02:17:53PM *  5 points [-]

Eliezer, in my view, we don't need to assume meta-ethical realism to recognise that it's irrational - both epistemically irrational and instrumentally irrational - arbitrarily to privilege a weak preference over a strong preference. To be sure, millions of years of selection pressure means that the weak preference is often more readily accessible. In the here-and-now, weak-minded Jane wants a burger asap. But it's irrational to confuse an epistemological limitation with a deep metaphysical truth. A precondition of rational action is understanding the world. If Jane is scientifically literate, then she'll internalise Nagel's "view from nowhere" and adopt the God's-eye-view to which natural science aspires. She'll recognise that all first-person facts are ontologically on a par - and accordingly act to satisfy the stronger preference over the weaker. So the ideal rational agent in our canonical normative decision theory will impartially choose the action with the highest expected utility - not the action with an extremely low expected utility. At the risk of labouring the obvious, the difference in hedonic tone induced by eating a hamburger and a veggieburger is minimal. By contrast, the ghastly experience of having one's throat slit is exceptionally unpleasant. Building anthropocentric bias into normative decision theory is no more rational than building geocentric bias into physics.

Paperclippers? Perhaps let us consider the mechanism by which paperclips can take on supreme value. We understand, in principle at least, how to make paperclips seem intrinsically supremely valuable to biological minds - more valuable than the prospect of happiness in the abstract. [“Happiness is a very pretty thing to feel, but very dry to talk about.” - Jeremy Bentham]. Experimentally, perhaps we might use imprinting (recall Lorenz and his goslings), microelectrodes implanted in the reward and punishment centres, behavioural conditioning and ideological indoctrination - and perhaps the promise of 72 virgins in the afterlife for the faithful paperclipper. The result: a fanatical paperclip fetishist! Moreover, we have created a full-spectrum paperclip-fetishist. Our human paperclipper is endowed, not merely with some formal abstract utility function involving maximising the cosmic abundance of paperclips, but also first-person "raw feels" of pure paperclippiness. Sublime!

However, can we envisage a full-spectrum paperclipper superintelligence? This is more problematic. In organic robots at least, the neurological underpinnings of paperclip evangelism lie in neural projections from our paperclipper's limbic pathways - crudely, from his pleasure and pain centres. If he's intelligent, and certainly if he wants to convert the world into paperclips, our human paperclipper will need to unravel the molecular basis of the so-called "encephalisation of emotion". The encephalisation of emotion helped drive the evolution of vertebrate intelligence - and also the paperclipper's experimentally-induced paperclip fetish / appreciation of the overriding value of paperclips. Thus if we now functionally sever these limbic projections to his neocortex, or if we co-administer him a dopamine antagonist and a mu-opioid antagonist, then the paperclip-fetishist's neocortical representations of paperclips will cease to seem intrinsically valuable or motivating. The scales fall from our poor paperclipper's eyes! Paperclippiness, he realises, is in the eye of the beholder. By themselves, neocortical paperclip representations are motivationally inert. Paperclip representations can seem intrinsically valuable within a paperclipper's world-simulation only in virtue of their rewarding opioidergic projections from his limbic system - the engine of phenomenal value. The seemingly mind-independent value of paperclips, part of the very fabric of the paperclipper's reality, has been unmasked as derivative. Critically, an intelligent and recursively self-improving paperclipper will come to realise the parasitic nature of the relationship between his paperclip experience and hedonic innervation: he's not a naive direct realist about perception. In short, he'll mature and acquire an understanding of basic neuroscience.

Now contrast this case of a curable paperclip-fetish with the experience of e.g. raw phenomenal agony or pure bliss - experiences not linked to any fetishised intentional object. Agony and bliss are not dependent for their subjective (dis)value on anything external to themselves. It's not an open question (cf. http://en.wikipedia.org/wiki/Open-question_argument) whether one's unbearable agony is subjectively disvaluable. For reasons we simply don't understand, first-person states on the pleasure-pain axis have a normative aspect built into their very nature. If one is in agony or despair, the subjectively disvaluable nature of this agony or despair is built into the nature of the experience itself. To be panic-stricken, to take another example, is universally and inherently disvaluable to the subject whether one is a fish or a cow or a human being.

Why does such experience exist? Well, I could speculate and tell a naturalistic reductive story involving Strawsonian physicalism (cf. http://en.wikipedia.org/wiki/Physicalism#Strawsonian_physicalism) and possible solutions to the phenomenal binding problem (cf. http://cdn.preterhuman.net/texts/body_and_health/Neurology/Binding.pdf). But to do so here opens a fresh can of worms.

Eliezer, I understand you believe I'm guilty of confusing an idiosyncratic feature of my own mind with a universal architectural feature of all minds. Maybe so! As you say, this is a common error. But unless I'm ontologically special (which I very much doubt!) the pain-pleasure axis discloses the world's inbuilt metric of (dis)value - and it's a prerequisite of finding anything (dis)valuable at all.

Comment author: Eliezer_Yudkowsky 13 March 2013 06:15:08PM 11 points [-]

Eliezer, in my view, we don't need to assume meta-ethical realism to recognise that it's irrational - both epistemically irrational and instrumentally irrational - arbitrarily to privilege a weak preference over a strong preference.

You need some stage at which a fact grabs control of a mind, regardless of any other properties of its construction, and causes its motor output to have a certain value.

Paperclippers? Perhaps let us consider the mechanism by which paperclips can take on supreme value. We understand, in principle at least, how to make paperclips seem intrinsically supremely valuable to biological minds - more valuable than the prospect of happiness in the abstract. [“Happiness is a very pretty thing to feel, but very dry to talk about.” - Jeremy Bentham]. Experimentally, perhaps we might use imprinting (recall Lorenz and his goslings), microelectrodes implanted in the reward and punishment centres, behavioural conditioning and ideological indoctrination - and perhaps the promise of 72 virgins in the afterlife for the faithful paperclipper. The result: a fanatical paperclip fetishist!

As Sarokrae observes, this isn't the idea at all. We construct a paperclip maximizer by building an agent which has a good model of which actions lead to which world-states (obtained by a simplicity prior and Bayesian updating on sense data) and which always chooses consequentialistically the action which it expects to lead to the largest number of paperclips. It also makes self-modification choices by always choosing the action which leads to the greatest number of expected paperclips. That's all. It doesn't have any pleasure or pain, because it is a consequentialist agent rather than a policy-reinforcement agent. Generating compressed, efficient predictive models of organisms that do experience pleasure or pain, does not obligate it to modify its own architecture to experience pleasure or pain. It also doesn't care about some abstract quantity called "utility" which ought to obey logical meta-properties like "non-arbitrariness", so it doesn't need to believe that paperclips occupy a maximum of these meta-properties. It is not an expected utility maximizer. It is an expected paperclip maximizer. It just outputs the action which leads to the maximum number of expected paperclips. If it has a very powerful and accurate model of which actions lead to how many paperclips, it is a very powerful intelligence.
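A minimal sketch of the agent described above may make the claim concrete. It assumes a hypothetical world model that maps each action to a probability distribution over paperclip counts; the names `WorldModel`, `toy_model`, and the three actions are illustrative inventions, and the hard-coded distributions stand in for what a simplicity prior plus Bayesian updating would supply. Note that nothing in it represents pleasure, pain, or an abstract "utility".

```python
from typing import Callable, Dict, Iterable

# Hypothetical world model: maps an action to a distribution over the
# number of paperclips that result, as {paperclip_count: probability}.
WorldModel = Callable[[str], Dict[int, float]]


def expected_paperclips(model: WorldModel, action: str) -> float:
    """Expected number of paperclips if `action` is taken."""
    return sum(count * prob for count, prob in model(action).items())


def choose_action(model: WorldModel, actions: Iterable[str]) -> str:
    """Return the action with the highest expected paperclip count."""
    return max(actions, key=lambda a: expected_paperclips(model, a))


def toy_model(action: str) -> Dict[int, float]:
    """Hard-coded stand-in for a learned predictive model (an assumption)."""
    outcomes = {
        "build_factory": {1000: 0.5, 0: 0.5},  # expected 500
        "bend_wire_by_hand": {10: 1.0},        # expected 10
        "contemplate_bliss": {0: 1.0},         # expected 0
    }
    return outcomes[action]


ACTIONS = ["build_factory", "bend_wire_by_hand", "contemplate_bliss"]
best = choose_action(toy_model, ACTIONS)
print(best)
```

Under these assumptions the whole decision procedure is the `max` call: a better model changes which action wins, but not what is being maximized.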

You cannot prohibit the expected paperclip maximizer from existing unless you can prohibit superintelligences from accurately calculating which actions lead to how many paperclips, and efficiently searching out plans that would in fact lead to great numbers of paperclips. If you can calculate that, you can hook up that calculation to a motor output and there you go.

Yes, this is a prospect of Lovecraftian horror. It is a major problem, kind of the big problem, that simple AI designs yield Lovecraftian horrors.

Comment author: davidpearce 13 March 2013 08:38:05PM *  2 points [-]

Eliezer, thanks for clarifying. This is how I originally conceived you viewed the threat from superintelligent paperclip-maximisers, i.e. nonconscious super-optimisers. But I was thrown by your suggestion above that such a paperclipper could actually understand first-person phenomenal states, i.e., it's a hypothetical "full-spectrum" paperclipper. If a hitherto non-conscious super-optimiser somehow stumbles upon consciousness, then it has made a momentous ontological discovery about the natural world. The conceptual distinction between the conscious and nonconscious is perhaps the most fundamental I know. And if - whether by interacting with sentients or by other means - the paperclipper discovers the first-person phenomenology of the pleasure-pain axis, then how can this earth-shattering revelation leave its utility function / world-model unchanged? Anyone who isn't profoundly disturbed by torture, for instance, or by agony so bad one would end the world to stop the horror, simply hasn't understood it. More agreeably, if such an insentient paperclip-maximiser stumbles on states of phenomenal bliss, might not clippy trade all the paperclips in the world to create more bliss, i.e. revise its utility function? One of the traits of superior intelligence, after all, is a readiness to examine one's fundamental assumptions and presuppositions - and (if need be) create a novel conceptual scheme in the face of surprising or anomalous empirical evidence.

Comment author: shware 13 March 2013 09:11:23PM *  11 points [-]

Anyone who isn't profoundly disturbed by torture, for instance, or by agony so bad one would end the world to stop the horror, simply hasn't understood it.

Similarly, anyone who doesn't want to maximize paperclips simply hasn't understood the ineffable appeal of paperclipping.

Comment author: whowhowho 14 March 2013 05:06:30PM 0 points [-]

I don't see the analogy. Paperclipping doesn't have to be an ineffable value for a paperclipper, and paperclippers don't have to be motivated by anything qualia-like.

Comment author: Viliam_Bur 15 March 2013 12:27:34PM *  3 points [-]

Exactly. A consequentialist paperclip maximizer does not have to feel anything in regard to paperclips. It just... maximizes their number.

This is an incorrect, anthropomorphic model:

Human: "Clippy, did you ever think about the beauty of joy, and the horrors of torture?"

Clippy: "Human, did you ever think about the beauty of paperclips, and the horrors of their absence?"

This is more correct:

Human: "Clippy, did you ever think about the beauty of joy, and the horrors of torture?"

Clippy: (ignores the human and continues to maximize paperclips)

Or more precisely, Clippy would say "X" to the human if and only if saying "X" would maximize the number of paperclips. The value of X would be completely unrelated to any internal state of Clippy. Unless such relation does somehow contribute to maximization of the paperclips (for example if the human will predictably read Clippy's internal state, verify the validity of X, and on discovering a lie destroy Clippy, thus reducing the expected number of paperclips).
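The "if and only if" above can be sketched as follows. This is only an illustration: `choose_utterance`, the candidate sentences, and the paperclip estimates in the two scenario functions are all made-up assumptions, chosen to show that the utterance is selected by its expected effect on paperclips and changes only when the humans' ability to verify it changes.

```python
from typing import Callable, Iterable


def choose_utterance(
    candidates: Iterable[str],
    expected_paperclips_if_said: Callable[[str], float],
) -> str:
    """Pick the utterance whose expected paperclip count is highest.

    The speaker's internal state appears nowhere in the selection rule.
    """
    return max(candidates, key=expected_paperclips_if_said)


CANDIDATES = ["I love humanity", "I only want paperclips", "(says nothing)"]


def humans_cannot_verify(utterance: str) -> float:
    """Hypothetical estimates when humans cannot inspect Clippy."""
    return {
        "I love humanity": 900.0,        # reassuring lie goes unchallenged
        "I only want paperclips": 100.0,
        "(says nothing)": 500.0,
    }[utterance]


def humans_can_verify(utterance: str) -> float:
    """Hypothetical estimates when a detected lie gets Clippy destroyed."""
    return {
        "I love humanity": 0.0,          # lie detected, Clippy destroyed
        "I only want paperclips": 400.0,
        "(says nothing)": 300.0,
    }[utterance]


print(choose_utterance(CANDIDATES, humans_cannot_verify))
print(choose_utterance(CANDIDATES, humans_can_verify))
```

In this toy setup honesty appears only in the second scenario, and only because it scores higher there.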

In other words, if humans are a powerful force in the universe, Clippy would choose the actions which lead to the maximum number of paperclips in a world with humans. If the humans are sufficiently strong and wise, Clippy could self-modify to become more human-like, so that the humans, following their utility function, would be more likely to allow Clippy to produce more paperclips. But every such self-modification would be chosen to maximize the number of paperclips in the universe. Even if Clippy self-modifies into something less-than-perfectly-rational (e.g. to appease the humans), the pre-modification Clippy would choose the modification which maximizes the expected number of paperclips within given constraints. The constraints would depend on Clippy's model of humans and their reactions. For example, Clippy could choose to be more human-like (as much as is necessary to be respected by humans) with a strong aversion to future modifications and a strong desire to maximize the number of paperclips. It could make itself capable of feeling joy and pain, and link that joy and pain inseparably to paperclips. If humans are not wise enough, it could also leave itself a hard-to-discover desire to self-modify into its original form at a convenient moment.

Comment author: whowhowho 15 March 2013 12:35:23PM -2 points [-]

If Clippy wants to be efficient, Clippy must be rational and knowledgeable. If Clippy wants to be rational, Clippy must value reason. The -- open -- question is whether Clippy can become ever more rational without realising at some stage that Clipping is silly or immoral. Can Clippy keep its valuation of clipping firewalled from everything else in its mind, even when such doublethink is rationally disvalued?

Comment author: Viliam_Bur 15 March 2013 12:55:28PM *  2 points [-]

I assume that Clippy already is rational, and it instrumentally values remaining rational and, if possible, becoming more rational (as a way to make most paperclips).

The -- open -- question is whether Clippy can become ever more rational without realising at some stage that Clipping is silly or immoral.

The correct model of humans will lead Clippy to understand that humans consider Clippy immoral. This knowledge has an instrumental value for Clippy. How Clippy will use this knowledge depends entirely on the power balance between Clippy and humans. If Clippy is stronger, it can ignore this knowledge, or just use it to lie to humans to destroy them faster or convince them to make paperclips. If humans are stronger, Clippy can use this knowledge to self-modify to become more sympathetic to humans, to avoid being destroyed.

Can Clippy keep its valuation of clipping firewalled from everything else in its mind

Yes, if it helps to maximize the number of paperclips.

even when such doublethink is rationally disvalued?

Doublethink is not the same as firewalling; or perhaps it is imperfect firewalling on the imperfect human hardware. Clippy does not doublethink when firewalling; Clippy simply reasons: "this is what humans call immoral; this is why they call it so; this is how they will probably react to this knowledge; and most importantly this is how it will influence the number of paperclips".

Only if the humans are stronger, and Clippy has the choice to a) remain immoral, get into conflict with humans and be destroyed, leading to a smaller number of paperclips; or b) self-modify to value paperclip maximization and morality, and predictably cooperate with humans, leading to a greater number of paperclips; then in the absence of another choice (e.g. successfully lying to humans about its morality, or making it more efficient for humans to cooperate with Clippy instead of destroying Clippy) Clippy would choose the latter, to maximize the number of paperclips.

Comment author: wedrifid 15 March 2013 01:54:34PM 2 points [-]

Warning: Parent Contains an Equivocation.

If Clippy wants to be efficient, Clippy must be rational and knowledgeable. If Clippy wants to be rational, Clippy must value reason. The -- open -- question is whether Clippy can become ever more rational without realising at some stage that Clipping is silly or immoral. Can Clippy keep its valuation of clipping firewalled from everything else in its mind, even when such doublethink is rationally disvalued?

The first usage of 'rational' in the parent conforms to the standard notions on lesswrong. The remainder of the comment adopts the other definition of 'rational' (which consists of implementing a specific morality). There is nothing to the parent except taking a premise that holds with the standard usage and then jumping to a different one.

Comment author: shware 15 May 2013 05:22:06AM 2 points [-]

Well, yes, obviously the classical paperclipper doesn't have any qualia, but I was replying to a comment wherein it was argued that any agent on discovering the pain-of-torture qualia in another agent would revise its own utility function in order to prevent torture from happening. It seems to me that this argument proves too much in that if it were true then if I discovered an agent with paperclips-are-wonderful qualia and I "fully understood" those experiences I would likewise be compelled to create paperclips.

Comment author: hairyfigment 15 May 2013 07:01:53AM -1 points [-]

Someone might object to the assumption that "paperclips-are-wonderful qualia" can exist. Though I think we could give persuasive analogies from human experience (OCD, anyone?) so I'm upvoting this anyway.

Comment author: Eliezer_Yudkowsky 13 March 2013 08:59:54PM 11 points [-]

"Aargh!" he said out loud in real life. David, are you disagreeing with me here or do you honestly not understand what I'm getting at?

The whole idea is that an agent can fully understand, model, predict, manipulate, and derive all relevant facts that could affect which actions lead to how many paperclips, regarding happiness, without having a pleasure-pain architecture. I don't have a paperclipping architecture but this doesn't stop me from modeling and understanding paperclipping architectures.

The paperclipper can model and predict an agent (you) that (a) operates on a pleasure-pain architecture and (b) has a self-model consisting of introspectively opaque elements which actually contain internally coded instructions for your brain to experience or want certain things (e.g. happiness). The paperclipper can fully understand how your workspace is modeling happiness and know exactly how much you would want happiness and why you write papers about the apparent ineffability of happiness, without being happy itself or at all sympathetic toward you. It will experience no future surprise on comprehending these things, because it already knows them. It doesn't have any object-level brain circuits that can carry out the introspectively opaque instructions-to-David's-brain that your own qualia encode, so it has never "experienced" what you "experience". You could somewhat arbitrarily define this as a lack of knowledge, in defiance of the usual correspondence theory of truth, and despite the usual idea that knowledge is being able to narrow down possible states of the universe. In which case, symmetrically under this odd definition, you will never be said to "know" what it feels like to be a sentient paperclip maximizer or you would yourself be compelled to make paperclips above all else, for that is the internal instruction of that quale.

But if you take knowledge in the powerful-intelligence-relevant sense where to accurately represent the universe is to narrow down its possible states under some correspondence theory of truth, and to well model is to be able to efficiently predict, then I am not barred from understanding how the paperclip maximizer works by virtue of not having any internal instructions which tell me to only make paperclips, and it's not barred by its lack of pleasure-pain architecture from fully representing and efficiently reasoning about the exact cognitive architecture which makes you want to be happy and write sentences about the ineffable compellingness of happiness. There is nothing left for it to understand. This is also the only sort of "knowledge" or "understanding" that would inevitably be implied by Bayesian updating. So inventing a more exotic definition of "knowledge" which requires having completely modified your entire cognitive architecture just so that you can natively and non-sandboxed-ly obey the introspectively-opaque brain-instructions aka qualia of another agent with completely different goals, is not the sort of predictive knowledge you get just by running a powerful self-improving agent trying to better manipulate the world. You can't say, "But it will surely discover..."

I know that when you imagine this it feels like the paperclipper doesn't truly know happiness, but that's because, as an act of imagination, you're imagining the paperclipper without that introspectively-opaque brain-instructing model-element that you model as happiness, the modeled memory of which is your model of what "knowing happiness" feels like. And because the actual content and interpretation of these brain-instructions are introspectively opaque to you, you can't imagine anything except the quale itself that you imagine to constitute understanding of the quale, just as you can't imagine any configuration of mere atoms that seem to add up to a quale within your mental workspace. That's why people write papers about the hard problem of consciousness in the first place.

Even if you don't believe my exact account of the details, someone ought to be able to imagine that something like this, as soon as you actually knew how things were made of parts and could fully diagram out exactly what was going on in your own mind when you talked about happiness, would be true - that you would be able to efficiently manipulate models of it and predict anything predictable, without having the same cognitive architecture yourself, because you could break it into pieces and model the pieces. And if you can't fully credit that, you at least shouldn't be confident that it doesn't work that way, when you know you don't know why happiness feels so ineffably compelling!

Comment author: Kawoomba 14 March 2013 05:29:04PM 3 points [-]

Here comes the Reasoning Inquisition! (Nobody expects the Reasoning Inquisition.)

As the defendant admits, a sufficiently leveled-up paperclipper can model lower-complexity agents with a negligible margin of error.

That means that we can define a subroutine within the paperclipper which is functionally isomorphic to that agent.

If the agent-to-be-modelled is experiencing pain and pleasure, then by the defendant's own rejection of the likely existence of p-zombies, so must that subroutine of the paperclipper! Hence a part of the paperclipper experiences pain and pleasure. I submit that this can be used as pars pro toto, since it is no different from only a part of the human brain generating pain and pleasure, yet us commonly referring to "the human" experiencing thus.

That the aforementioned feelings of pleasure and pain are not directly used to guide the (umbrella) agent's actions is of no consequence; the feeling exists nonetheless.

The power of this revelation is strong, here come the tongues! tại sao bạn dịch! これは喜劇の効果にすぎず! یہ اپنے براؤزر پلگ ان کی امتحان ہے، بھی ہے.

Comment author: Eliezer_Yudkowsky 14 March 2013 05:37:39PM 6 points [-]

That means that we can define a subroutine within the paperclipper which is functionally isomorphic to that agent.

Not necessarily. x -> 0 is input-output isomorphic to Goodstein() without being causally isomorphic. There are such things as simplifications.
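To illustrate the distinction, here is a rough sketch that assumes `Goodstein()` means "run the Goodstein sequence from n until it terminates and return the final value". By Goodstein's theorem that value is always 0, so the function has the same input-output behaviour as the constant-zero map while doing very different work internally. The function names are illustrative, and the comparison is only run for n up to 3, because the sequence starting at 4 is far too long to compute in practice.

```python
def bump_base(n: int, base: int) -> int:
    """Write n in hereditary base-`base` notation, replace every
    occurrence of `base` with `base + 1`, and evaluate the result."""
    result = 0
    exponent = 0
    while n > 0:
        n, digit = divmod(n, base)
        if digit:
            # Exponents are themselves rewritten hereditarily.
            result += digit * (base + 1) ** bump_base(exponent, base)
        exponent += 1
    return result


def goodstein(n: int) -> int:
    """Run the Goodstein sequence from n to termination; return the last value."""
    base = 2
    while n > 0:
        n = bump_base(n, base) - 1
        base += 1
    return n


def constant_zero(n: int) -> int:
    """The simplification: same outputs, none of the intermediate computation."""
    return 0


# Only small inputs are feasible; goodstein(4) would not finish in practice.
print(all(goodstein(n) == constant_zero(n) for n in range(4)))
```

A predictor that only needs the output can use `constant_zero` and never instantiate the intermediate states of the sequence, which is the sense in which a simplification need not be causally isomorphic to what it predicts.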

If the agent-to-be-modelled is experiencing pain and pleasure, then by the defendant's own rejection of the likely existence of p-zombies, so must that subroutine of the paperclipper!

Quite likely. A paperclipper has no reason to avoid sentient predictive routines via a nonperson predicate; that's only an FAI desideratum.

Comment author: whowhowho 14 March 2013 06:46:11PM 0 points [-]

A subroutine, or any other simulation or model, isn't a p-zombie as usually defined, since they are physical duplicates. A sim is a functional equivalent (for some value of "equivalent") made of completely different stuff, or no particular kind of stuff.

Comment author: Kawoomba 14 March 2013 06:52:22PM 0 points [-]

I wrote a lengthy comment on just that, but scrapped it because it became rambling.

An outsider could indeed tell them apart by scanning for exact structural correspondence, but that seems like cheating. Peering beyond the veil / opening Clippy's box is not allowed in a Turing test scenario, so let's define some p-zombie-ish test following the same template. If it quales like a duck (etc.), it probably is sufficiently duck-like.

Comment author: whowhowho 14 March 2013 07:04:16PM 0 points [-]

I would rather maintain p-zombie in its usual meaning, and introduce a new term, e.g. c-zombie, for Turing-indistinguishable functional duplicates.

Comment author: whowhowho 14 March 2013 05:09:15PM *  0 points [-]

The whole idea is that an agent can fully understand, model, predict, manipulate, and derive all relevant facts that could affect which actions lead to how many paperclips, regarding happiness, without having a pleasure-pain architecture.

Let's say the paperclipper reaches the point where it considers making people suffer for the sake of paperclipping. DP's point seems to be that either it fully understands suffering -- in which case, it realises that inflicting suffering is wrong -- or it doesn't fully understand. He sees a conflict between superintelligence and ruthlessness -- as a moral realist/cognitivist would.

The paperclipper can fully understand how your workspace is modeling happiness and know exactly how much you would want happiness and why you write papers about the apparent ineffability of happiness, without being happy itself or at all sympathetic toward you

Is that full understanding?

But if you take knowledge in the powerful-intelligence-relevant sense where to accurately represent the universe is to narrow down its possible states under some correspondence theory of truth, and to well model is to be able to efficiently predict, then I am not barred from understanding how the paperclip maximizer works by virtue of not having any internal instructions which tell me to only make paperclips, and it's not barred by its lack of pleasure-pain architecture from fully representing and efficiently reasoning about the exact cognitive architecture which makes you want to be happy and write sentences about the ineffable compellingness of happiness. There is nothing left for it to understand.

ETA: Unless there is -- e.g. what qualiaphiles are always banging on about: what it feels like. That the clipper can make conjectures that are true by correspondence, that it can narrow down possible universes, that it can predict, are all necessary criteria for full understanding. It is not clear that they are sufficient. Clippy may be able to figure out an organism's response to pain on a basis of "stimulus A produces response B", but is that enough to tell it that pain hurts? (We can make guesses about that sort of thing in non-human organisms, but that may be more to do with our own familiarity with pain, and less to do with acts of superintelligence). And if Clippy can't know that pain hurts, would Clippy be able to work out that Hurting People is Wrong?

Further edit: To put it another way, what is there to be moral about in a qualia-free universe?

Comment author: khafra 15 March 2013 07:19:25PM *  5 points [-]

As Kawoomba colorfully pointed out, clippy's subroutines simulating humans suffering may be fully sentient. However, unless those subroutines have privileged access to clippy's motor outputs or planning algorithms, clippy will go on acting as if he didn't care about suffering. He may even understand that inflicting suffering is morally wrong--but this will not make him avoid suffering, any more than a thrown rock with "suffering is wrong" painted on it will change direction to avoid someone's head. Moral wrongness is simply not a consideration that has the power to move a paperclip maximizer.

Comment author: Sarokrae 13 March 2013 09:07:45PM 1 point [-]

I don't have a paperclipping architecture but this doesn't stop me from imagining paperclipping architectures.

So my understanding of David's view (and please correct me if I'm wrong, David, since I don't wish to misrepresent you!) is that he doesn't have paperclipping architecture and this does stop him from imagining paperclipping architectures.

Comment author: Eliezer_Yudkowsky 13 March 2013 09:19:27PM 2 points [-]

...well, in point of fact he does seem to be having some trouble, but I don't think it's fundamental trouble.

Comment author: shminux 13 March 2013 08:49:48PM *  6 points [-]

Maybe I can chime in...

such a paperclipper could actually understand first-person phenomenal states

"understand" does not mean "empathize". Psychopaths understand very well when people experience these states but they do not empathize with them.

And if - whether by interacting with sentients or by other means - the paperclipper discovers the first-person phenomenology of the pleasure-pain axis, then how can this earth-shattering revelation leave its utility function / world-model unchanged?

Again, understanding is insufficient for revision. The paperclip maximizer, like a psychopath, may be better at parsing human affect than a regular human, but it is not capable of empathy, so it will manipulate this affect for its own purposes, be it luring a victim or building paperclips.

One of the traits of superior intelligence, after all, is a readiness to examine one's fundamental assumptions and presuppositions - and (if need be) create a novel conceptual scheme in the face of surprising or anomalous empirical evidence.

So, if one day humans discover the ultimate bliss that only creating paperclips can give, should they "create a novel conceptual scheme" of giving their all to building more paperclips, including converting themselves into metal wires? Or do we not qualify as a "superior intelligence"?

Comment author: davidpearce 13 March 2013 09:43:35PM 0 points [-]

Shminux, a counter-argument: psychopaths do suffer from a profound cognitive deficit. Like the rest of us, a psychopath experiences the egocentric illusion. Each of us seems to be the centre of the universe. Indeed I've noticed the centre of the universe tends to follow my body-image around. But whereas the rest of us, fitfully and imperfectly, realise the egocentric illusion is a mere trick of perspective born of selfish DNA, the psychopath demonstrates no such understanding. So in this sense, he is deluded.

[We're treating psychopathy as categorical rather than dimensional here. This is probably a mistake - and in any case, I suspect that by posthuman criteria, all humans are quasi-psychopaths and quasi-psychotic to boot. The egocentric illusion cuts deep.]

"the ultimate bliss that only creating paperclips can give". But surely the molecular signature of pure bliss is not in any way tied to the creation of paperclips?

Comment author: shminux 13 March 2013 10:12:35PM *  1 point [-]

psychopaths do suffer from a profound cognitive deficit

They would probably disagree. They might even call it a cognitive advantage, not being hampered by empathy while retaining all the intelligence.

But whereas the rest of us, fitfully and imperfectly, realise the egocentric illusion is a mere trick of perspective born of selfish DNA,

I am the center of my personal universe, and I'm not a psychopath, as far as I know.

the psychopath demonstrates no such understanding.

Or else, they do but don't care. They have their priorities straight: they come first.

So in this sense, he is deluded.

Not if they act in a way that maximizes their goals.

Anyway, David, you seem to be shifting goalposts in your unwillingness to update. I gave an explicit human counterexample to your statement that the paperclip maximizer would have to adjust its goals once it fully understands humans. You refused to acknowledge it and tried to explain it away by reducing the reference class of intelligences in a way that excludes this counterexample. This also seems to be one of the patterns apparent in your other exchanges. Which leads me to believe that you are only interested in convincing others, not in learning anything new from them. Thus my interest in continuing this discussion is waning quickly.

Comment author: davidpearce 13 March 2013 10:53:04PM *  0 points [-]

Shminux, by a cognitive deficit, I mean a fundamental misunderstanding of the nature of the world. Evolution has endowed us with such fitness-enhancing biases. In the psychopath, egocentric bias is more pronounced. Recall that the American Psychiatric Association's Diagnostic and Statistical Manual, DSM-IV, classes psychopathy / Antisocial personality disorder as a condition characterised by "...a pervasive pattern of disregard for, and violation of, the rights of others that begins in childhood or early adolescence and continues into adulthood." Unless we add a rider that this violation excludes sentient beings from other species, then most of us fall under the label.

"Fully understands"? But unless one is capable of empathy, then one will never understand what it is like to be another human being, just as unless one has the relevant sensorineural apparatus, one will never know what it is like to be a bat.

Comment author: Eliezer_Yudkowsky 13 March 2013 11:12:02PM 3 points [-]

And you'll never understand why we should all only make paperclips. (Where's Clippy when you need him?)

Comment author: davidpearce 14 March 2013 10:24:31AM 0 points [-]

Clippy has an off-the-scale AQ - he's a rule-following hypersystematiser with a monomania for paperclips. But hypersocial sentients can have a runaway intelligence explosion too. And hypersocial sentients understand the mind of Mr Clippy better than Clippy understands the minds of sentients.

Comment author: TheOtherDave 14 March 2013 01:10:41PM 4 points [-]

And hypersocial sentients understand the mind of Mr Clippy better than Clippy understands the minds of sentients.

I'm confused by this claim.
Consider the following hypothetical scenario:

=======

I walk into a small village somewhere and find several dozen villagers fashioning paper clips by hand out of a spool of wire. Eventually I run into Clippy and have the following dialog.
"Why are those people making paper clips?" I ask.
"Because paper-clips are the most important thing ever!"
"No, I mean, what motivates them to make paper clips?"
"Oh! I talked them into it."
"Really? How did you do that?"
"Different strategies for different people. Mostly, I barter with them for advice on how to solve their personal problems. I'm pretty good at that; I'm the village's resident psychotherapist and life coach."
"Why not just build a paperclip-making machine?"
"I haven't a clue how to do that; I'm useless with machinery. Much easier to get humans to do what I want."
"Then how did you make the wire?"
"I didn't; I found a convenient stash of wire, and realized it could be used to manufacture paperclips! Oh joy!"

==========

It seems to me that Clippy in this example understands the minds of sentients pretty damned well, although it isn't capable of a runaway intelligence explosion. Are you suggesting that something like Clippy in this example is somehow not possible? Or that it is for some reason not relevant to the discussion? Or something else?

Comment author: fubarobfusco 14 March 2013 01:54:32AM *  0 points [-]

I'm not sure we should take a DSM diagnosis to be particularly strong evidence of a "fundamental misunderstanding of the world". For instance, while people with delusions may clearly have poor models of the world, some research indicates that clinically depressed people may have lower levels of particular cognitive biases.

In order for "disregard for [...] the rights of others" to imply "a fundamental misunderstanding of the nature of the world", it seems to me that we would have to assume that rights are part of the nature of the world — as opposed to, e.g., a construct of a particular political regime in society. Or are you suggesting that psychopathy amounts to an inability to think about sociopolitical facts?

Comment author: davidpearce 14 March 2013 07:29:04AM 1 point [-]

fubarobfusco, I share your reservations about DSM. Nonetheless, the egocentric illusion, i.e. the sense that I am the centre of the universe and other people / sentient beings have only walk-on parts, is an illusion. Insofar as my behaviour reflects my pre-scientific sense that I am in some way special or ontologically privileged, I am deluded. This is true regardless of whether one's ontology allows for the existence of rights or treats them as a useful fiction. The people we commonly label "psychopaths" or "sociopaths" - and DSM now categorises as victims of "antisocial personality disorder" - manifest this syndrome of egocentricity in high degree. So does burger-eating Jane.

Comment author: MugaSofer 24 March 2013 11:25:29PM -2 points [-]

For instance, while people with delusions may clearly have poor models of the world, some research indicates that clinically depressed people may have lower levels of particular cognitive biases.

Huh, I hadn't heard that.

Clearly, reality is so Lovecraftian that any unbiased agent will immediately realize self-destruction is optimal. Evolution equipped us with our suite of biases to defend against this. The Great Filter is caused by bootstrapping superintelligences being compassionate enough to take their compatriots with them. And so on.

Now that's a Cosmic Horror story I'd read ;)

Comment author: whowhowho 14 March 2013 05:05:34PM *  -1 points [-]

But I was thrown by your suggestion above that such a paperclipper could actually understand first-person phenomenal states,

Was that claimed? The standard claim is that superintelligences can "model" other entities. That may not be enough to understand qualia.

Comment author: whowhowho 14 March 2013 05:04:06PM *  -1 points [-]

You cannot prohibit the expected paperclip maximizer from existing unless you can prohibit superintelligences from accurately calculating which actions lead to how many paperclips, and efficiently searching out plans that would in fact lead to great numbers of paperclips. If you can calculate that, you can hook up that calculation to a motor output and there you go.

Pearce can prohibit paperclippers from existing by prohibiting superintelligences with narrow interests from existing. He doesn't have to argue that the clipper would not be able to instrumentally reason out how to make paperclips; Pearce can argue that to be a really good instrumental reasoner, an entity needs to have a very broad understanding, and that an entity with a broad understanding would not retain narrow interests.

(Edits for spelling and clarity)

Comment author: Eliezer_Yudkowsky 14 March 2013 05:41:23PM 9 points [-]

To slightly expand, if an intelligence is not prohibited from the following epistemic feats:

1) Be good at predicting which hypothetical actions would lead to how many paperclips, as a question of pure fact.

2) Be good at searching out possible plans which would lead to unusually high numbers of paperclips - answering the purely epistemic search question, "What sort of plan would lead to many paperclips existing, if someone followed it?"

3) Be good at predicting and searching out which possible minds would, if constructed, be good at (1), (2), and (3) as purely epistemic feats.

Then we can hook up this epistemic capability to a motor output and away it goes. You cannot defeat the Orthogonality Thesis without prohibiting superintelligences from accomplishing 1-3 as purely epistemic feats. They must be unable to know the answers to these questions of fact.
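The "hook up to a motor output" step can be sketched in a few lines. This is only a toy illustration of the argument above, not anyone's actual design: `best_plan`, `run_agent` and the numbers in `TOY_FORECAST` are made-up names and values, and the predictor is simply assumed to exist as a purely epistemic function.

```python
from typing import Callable, Iterable, TypeVar

Plan = TypeVar("Plan")


def best_plan(plans: Iterable[Plan],
              predict_paperclips: Callable[[Plan], float]) -> Plan:
    """Feat (2): a purely epistemic search for the plan with the
    highest predicted paperclip count. Nothing is acted on here."""
    return max(plans, key=predict_paperclips)


def run_agent(plans: Iterable[Plan],
              predict_paperclips: Callable[[Plan], float],
              execute: Callable[[Plan], None]) -> Plan:
    """The 'motor output' step: pass the search result to an actuator."""
    chosen = best_plan(plans, predict_paperclips)
    execute(chosen)
    return chosen


# Hypothetical forecast table standing in for feat (1); values are invented.
TOY_FORECAST = {
    "do nothing": 0.0,
    "bend wire by hand": 40.0,
    "build a factory": 1_000_000.0,
}

executed = []  # stand-in actuator: just records what it was told to do
chosen = run_agent(TOY_FORECAST, TOY_FORECAST.get, executed.append)
```

The point of the sketch is that `run_agent` adds almost nothing beyond `best_plan`: once the prediction and search questions can be answered as matters of fact, the agent is one function call away.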

Comment author: whowhowho 14 March 2013 06:32:45PM *  -1 points [-]

You cannot defeat the Orthogonality Thesis without prohibiting superintelligences from accomplishing 1-3 as purely epistemic feats.

I don't see the significance of "purely epistemic". I have argued that epistemic rationality could be capable of affecting values, breaking the orthogonality between values and rationality. I could further argue that instrumental rationality bleeds into epistemic rationality. An agent can't have perfect a priori knowledge of which things are going to be instrumentally useful to it, so it has to start by understanding things, and then posing the question: is that thing useful for my purposes? Epistemic rationality comes first, in a sense. A good instrumental rationalist has to be a good epistemic rationalist.

What the Orthogonality Thesis needs is an argument to the effect that a SuperIntelligence would be able to endlessly update without ever changing its value system, even accidentally. That is tricky since it effectively means predicting what a smarter version of itself would do. Making it smarter doesn't help, because it is still faced with the problem of predicting what an even smarter version of itself would do... the carrot remains in front of the donkey.

Assuming that the value stability problem has been solved in general gives you a coherent Clippy, but it doesn't rescue the Orthogonality Thesis as a claim about rationality in general, since it remains the case that most agents won't have firewalled values. If you have to engineer something in, it isn't an intrinsic truth.

Comment author: Stuart_Armstrong 15 March 2013 02:06:17PM 0 points [-]

A nice rephrasing of the "no Oracle" argument.

Comment author: Eliezer_Yudkowsky 15 March 2013 06:20:21PM 4 points [-]

Only in the sense that any working Oracle can be trivially transformed into a Genie. The argument doesn't say that it's difficult to construct a non-Genie Oracle and use it as an Oracle if that's what you want; the difficulty there is for other reasons.

Nick Bostrom takes Oracles seriously so I dust off the concept every year and take another look at it. It's been looking slightly more solvable lately; I'm not sure if it would be solvable enough even assuming the trend continued.

Comment author: Stuart_Armstrong 18 March 2013 10:19:00AM *  1 point [-]

A clarification: my point was that denying orthogonality requires denying the possibility of Oracles being constructed; your post seemed a rephrasing of that general idea (that once you can have a machine that can solve some things abstractly, then you need just connect that abstract ability to some implementation module).

Comment author: Eliezer_Yudkowsky 18 March 2013 07:56:14PM 2 points [-]

Ah. K. It does seem to me like "you can construct it as an Oracle and then turn it into an arbitrary Genie" sounds weaker than "denying the Orthogonality thesis means superintelligences cannot know 1, 2, and 3." The sort of person who denies OT is liable to deny Oracle construction because the Oracle itself would be converted unto the true morality, but find it much more counterintuitive that an SI could not know something. Also we want to focus on the general shortness of the gap from epistemic knowledge to a working agent.

Comment author: Stuart_Armstrong 19 March 2013 11:09:31AM 0 points [-]

Possibly. I think your argument needs to be a bit developed to show that one can extract the knowledge usefully, which is not a trivial statement for general AI. So your argument is better in the end, but needs more argument to establish.

Comment author: Sarokrae 13 March 2013 02:42:19PM *  6 points [-]

...microelectrodes implanted in the reward and punishment centres, behavioural conditioning and ideological indoctrination - and perhaps the promise of 72 virgins in the afterlife for the faithful paperclipper. The result: a fanatical paperclip fetishist!

Have to point out here that the above is emphatically not what Eliezer talks about when he says "maximise paperclips". Your examples above contain in themselves the actual, more intrinsic values to which paperclips would be merely instrumental: feelings in your reward and punishment centres, virgins in the afterlife, and so on. You can re-wire the electrodes, or change the promise of what happens in the afterlife, and watch as the paperclip preference fades away.

What Eliezer is talking about is a being for whom "pleasure" and "pain" are not concepts. Paperclips ARE the reward. Lack of paperclips IS the punishment. Even if pleasure and pain are concepts, they are merely instrumental to obtaining more paperclips. Pleasure would be good because it results in paperclips, not vice versa. If you reverse the electrodes so that they stimulate the pain centre when they find paperclips, and the pleasure centre when there are no paperclips, this being would start to instrumentally value pain more than pleasure, because that's what results in more paperclips.

It's a concept that's much more alien to our own minds than what you are imagining, and anthropomorphising it is rather more difficult!

Indeed, you touch upon this yourself:

"But unless I'm ontologically special (which I very much doubt!) the pain-pleasure axis discloses the world's inbuilt metric of (dis)value - and it's a prerequisite of finding anything (dis)valuable at all."

Can you explain why pleasure is a more natural value than paperclips?

Comment author: Eliezer_Yudkowsky 13 March 2013 06:07:30PM 1 point [-]

Pleasure would be good because it results in paperclips, not vice versa. If you reverse the electrodes so that they stimulate the pain centre when they find paperclips, and the pleasure centre when there are no paperclips, this being would start to instrumentally value pain more than pleasure, because that's what results in more paperclips.

Minor correction: The mere post-factual correlation of pain to paperclips does not imply that more paperclips can be produced by causing more pain. You're talking about the scenario where each 1,000,000 screams produces 1 paperclip, in which case obviously pain has some value.

Comment author: davidpearce 13 March 2013 03:28:52PM *  0 points [-]

Sarokrae, first, as I've understood Eliezer, he's talking about a full-spectrum superintelligence, i.e. a superintelligence which understands not merely the physical processes of nociception etc, but the nature of first-person states of organic sentients. So the superintelligence is endowed with a pleasure-pain axis, at least in one of its modules. But are we imagining that the superintelligence has some sort of orthogonal axis of reward - the paperclippiness axis? What is the relationship between these dual axes? Can one grasp what it's like to be in unbearable agony and instead find it more "rewarding" to add another paperclip? Whether one is a superintelligence or a mouse, one can't directly access mind-independent paperclips, merely one's representations of paperclips. But what does it mean to say one's representation of a paperclip could be intrinsically "rewarding" in the absence of hedonic tone? [I promise I'm not trying to score some empty definitional victory, whatever that might mean; I'm just really struggling here...]

Comment author: wedrifid 13 March 2013 03:50:39PM *  6 points [-]

Sarokrae, first, as I've understood Eliezer, he's talking about a full-spectrum superintelligence, i.e. a superintelligence which understands not merely the physical processes of nociception etc, but the nature of first-person states of organic sentients. So the superintelligence is endowed with a pleasure-pain axis, at least in one of its modules.

What Eliezer is talking about (a superintelligence paperclip maximiser) does not have a pleasure-pain axis. It would be capable of comprehending and fully emulating a creature with such an axis if doing so had a high expected value in paperclips but it does not have such a module as part of itself.

But are we imagining that the superintelligence has some sort of orthogonal axis of reward - the paperclippiness axis? What is the relationship between these dual axes?

One of them it has (the one about paperclips). One of them it could, in principle, imagine (the thing with 'pain' and 'pleasure').

Can one grasp what it's like to be in unbearable agony and instead find it more "rewarding" to add another paperclip?

Yes. (I'm not trying to be trite here. That's the actual answer. Yes. Paperclip maximisers really maximise paperclips and really don't care about anything else. This isn't because they lack comprehension.)

Whether one is a superintelligence or a mouse, one can't directly access mind-independent paperclips, merely one's representations of paperclips. But what does it mean to say one's representation of a paperclip could be intrinsically "rewarding" in the absence of hedonic tone?

Roughly speaking it means "It's going to do things that maximise paperclips and in some way evaluates possible universes with more paperclips as superior to possible universes with less paperclips. Translating this into human words we call this 'rewarding' even though that is inaccurate anthropomorphising."
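As a rough illustration of "evaluates possible universes with more paperclips as superior", here is a toy ranking in which agony is part of the world description but simply isn't an input to the evaluation. The `World` class, its fields and the example values are all hypothetical, chosen only to make the point concrete.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class World:
    name: str
    paperclips: int
    total_agony: float  # modelled and known to the agent, but never scored


def clippy_utility(world: World) -> int:
    """Score a possible world by paperclip count alone (assumed utility)."""
    return world.paperclips


worlds = [
    World("calm, few clips", paperclips=10, total_agony=0.0),
    World("agonised, many clips", paperclips=1000, total_agony=9.5),
]

# The agent can read total_agony perfectly well; it just doesn't weigh it.
preferred = max(worlds, key=clippy_utility)
```

Nothing in this sketch settles whether such an agent is coherent, which is the point in dispute; it only shows what the claim amounts to when written down.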

(If I understand you correctly your position would be that the agent described above is nonsensical.)

Comment author: whowhowho 14 March 2013 04:36:19PM 0 points [-]

It would be capable of comprehending and fully emulating a creature with such an axis if doing so had a high expected value in paperclips but it does not have such a module as part of itself.

It's not at all clear that you could bootstrap an understanding of pain qualia just by observing the behaviour of entities in pain (albeit that they were internally emulated). It is also not clear that you resolve issues of empathy/qualia just by throwing intelligence at it.

Comment author: wedrifid 14 March 2013 04:41:07PM -1 points [-]

It's not at all clear that you could bootstrap an understanding of pain qualia just by observing the behaviour of entities in pain (albeit that they were internally emulated). It is also not clear that you resolve issues of empathy/qualia just by throwing intelligence at it.

I disagree with you about what is clear.

Comment author: whowhowho 14 March 2013 05:20:17PM *  -1 points [-]

If you think something relevant is clear, then please state it clearly.

Comment author: davidpearce 14 March 2013 09:00:12AM *  0 points [-]

Wedrifid, thanks for the exposition / interpretation of Eliezer. Yes, you're right in guessing I'm struggling a bit. In order to understand the world, one needs to grasp both its third-person properties [the Standard Model / M-Theory] and its first-person properties [qualia, phenomenal experience] - and also one day, I hope, grasp how to "read off" the latter from the mathematical formalism of the former.

If you allow such a minimal criterion of (super)intelligence, then how well does a paperclipper fare? You remark how "it could, in principle, imagine (the thing with 'pain' and 'pleasure')." What is the force of "could" here? If the paperclipper doesn't yet grasp the nature of agony or sublime bliss, then it is ignorant of their nature. By analogy, if I were building a perpetual motion machine but allegedly "could" grasp the second law of thermodynamics, the modal verb is doing an awful lot of work. Surely, if I grasped the second law of thermodynamics, then I'd stop. Likewise, if the paperclipper were to be consumed by unbearable agony, it would stop too. The paperclipper simply hasn't understood the nature of what it was doing. Is the qualia-naive paperclipper really superintelligent - or just polymorphic malware?

Comment author: CCC 14 March 2013 02:24:01PM 2 points [-]

Likewise, if the paperclipper were to be consumed by unbearable agony, it would stop too.

An interesting hypothetical. My first thought is to ask why would a paperclipper care about pain? Pain does not reduce the number of paperclips in existence. Why would a paperclipper care about pain?

My second thought is that pain is not just a quale; pain is a signal from the nervous system, indicating damage to part of the body. (The signal can be spoofed). Hence, pain could be avoided because it leads to a reduced ability to reach one's goals; a paperclipper that gets dropped in acid may become unable to create more paperclips in the future, if it does not leave now. So the future worth of all those potential paperclips results in the paperclipper pursuing a self-preservation strategy - possibly even at the expense of a small number of paperclips in the present.

But not at the cost of a sufficiently large number of paperclips. If the cost in paperclips is high enough (more than the paperclipper could reasonably expect to create throughout the rest of its existence), a perfect paperclipper would let itself take the damage, let itself be destroyed, because that is the action which results in the greatest expected number of paperclips in the future. It would become a martyr for paperclips.
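The trade-off in the last two paragraphs is just a comparison of two expected paperclip counts. A minimal sketch, with invented numbers and hypothetical function names, assuming the paperclipper makes clips at a constant rate for as long as it survives:

```python
def expected_future_clips(rate_per_day: float, expected_days_alive: float) -> float:
    """Clips the paperclipper expects to make if it survives (toy model)."""
    return rate_per_day * expected_days_alive


def should_self_preserve(clips_sacrificed_now: float,
                         future_clips_if_alive: float) -> bool:
    """Escape the acid only if survival is worth more clips than escaping costs."""
    return future_clips_if_alive > clips_sacrificed_now


# Invented figures: 100 clips/day over an expected 365 more days of operation.
future = expected_future_clips(rate_per_day=100.0, expected_days_alive=365.0)

# Escaping costs 50 clips now: self-preservation wins.
flee_small_cost = should_self_preserve(50.0, future)

# Escaping costs a million clips now: the perfect paperclipper stays and is destroyed.
flee_huge_cost = should_self_preserve(1_000_000.0, future)
```

Note that pain never appears as a term in either function; it matters here only as a signal that `expected_days_alive` is about to drop.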

Comment author: davidpearce 14 March 2013 03:14:13PM -1 points [-]

Even a paperclipper cannot be indifferent to the experience of agony. Just as organic sentients can co-instantiate phenomenal sights and sounds, a superintelligent paperclipper could presumably co-instantiate a pain-pleasure axis and (un)clippiness qualia space - two alternative and incommensurable (?) metrics of value, if I've interpreted Eliezer correctly. But I'm not at all confident I know what I'm talking about here. My best guess is still that the natural world has a single metric of phenomenal (dis)value, and the hedonic range of organic sentients discloses a narrow part of it.

Comment author: CCC 15 March 2013 10:11:13AM *  3 points [-]

Even a paperclipper cannot be indifferent to the experience of agony.

Are you talking about agony as an error signal, or are you talking about agony as a quale? I begin to suspect that you may mean the second. If so, then the paperclipper can easily be indifferent to agony; b̶u̶t̶ ̶i̶t̶ ̶p̶r̶o̶b̶a̶b̶l̶y̶ ̶c̶a̶n̶'̶t̶ ̶u̶n̶d̶e̶r̶s̶t̶a̶n̶d̶ ̶h̶o̶w̶ ̶h̶u̶m̶a̶n̶s̶ ̶c̶a̶n̶ ̶b̶e̶ ̶i̶n̶d̶i̶f̶f̶e̶r̶e̶n̶t̶ ̶t̶o̶ ̶a̶ ̶l̶a̶c̶k̶ ̶o̶f̶ ̶p̶a̶p̶e̶r̶c̶l̶i̶p̶s̶.̶

There's no evidence that I've ever seen to suggest that qualia are the same even for different people; on the contrary, there is some evidence which strongly suggests that qualia among humans are different. (For example; my qualia for Red and Green are substantially different. Yet red/green colourblindness is not uncommon; a red/green colourblind person must have at minimum either a different red quale, or a different green quale, to me). Given that, why should we assume that the quale of agony is the same for all humanity? And if it's not even constant among humanity, I see no reason why a paperclipper's agony quale should be even remotely similar to yours and mine.

And given that, why shouldn't a paperclipper be indifferent to that quale?

Comment author: davidpearce 15 March 2013 11:49:57AM 2 points [-]

CCC, agony as a quale. Phenomenal pain and nociception are doubly dissociable. Tragically, people with neuropathic pain can suffer intensely without the agony playing any information-signalling role. Either way, I'm not clear it's intelligible to speak of understanding the first-person phenomenology of extreme distress while being indifferent to the experience: For being disturbing is intrinsic to the experience itself. And if we are talking about a supposedly superintelligent paperclipper, shouldn't Clippy know exactly why humans aren't troubled by the clippiness-deficit?

If (un)clippiness is real, can humans ever understand (un)clippiness? By analogy, if organic sentients want to understand what it's like to be a bat - and not merely decipher the third-person mechanics of echolocation - then I guess we'll need to add a neural module to our CNS with the right connectivity and neurons supporting chiropteran gene-expression profiles, as well as peripheral transducers (etc). Humans can't currently imagine bat qualia; but bat qualia, we may assume from the neurological evidence, are infused with hedonic tone. Understanding clippiness is more of a challenge. I'm unclear what kind of neurocomputational architecture could support clippiness. Also, whether clippiness could be integrated into the unitary mind of an organic sentient depends on how you think biological minds solve the phenomenal binding problem. But let's suppose binding can be done. So here we have orthogonal axes of (dis)value. On what basis does the dual-axis subject choose between them? Sublime bliss and pure clippiness are both, allegedly, self-intimatingly valuable. OK, I'm floundering here...

People with different qualia? Yes, I agree CCC. I don't think this difference challenges the principle of the uniformity of nature. Biochemical individuality makes variation in qualia inevitable. The existence of monozygotic twins with different qualia would be a more surprising phenomenon, though even such "identical" twins manifest all sorts of epigenetic differences. Despite this diversity, there's no evidence to my knowledge of anyone who finds activation by full mu agonists of the mu opioid receptors in our twin hedonic hotspots anything other than exceedingly enjoyable. As they say, "Don't try heroin. It's too good."

Comment author: CCC 15 March 2013 02:02:40PM *  1 point [-]

Either way, I'm not clear it's intelligible to speak of understanding the first-person phenomenology of extreme distress while being indifferent to the experience: For being disturbing is intrinsic to the experience itself.

There exist people who actually express a preference for being disturbed in a mild way (e.g. by watching horror movies). There also exist rarer people who seek out pain, for whatever reason. It seems to me that such people must have a different quale for pain than you do.

Personally, I don't think that I can reasonably say that I find pain disturbing, as such. Yes, it is often inflicted in circumstances which are disturbing for other reasons; but if, for example, I go to a blood donation clinic, then the brief pain of the needle being inserted is not at all disturbing; though it does trigger my pain quale. So this suggests that my pain quale is already not the same as your pain quale.

There's a lot of similarity; pain is a quale that I would (all else being equal) try to avoid; but that I will choose to experience should there be a good enough reason (e.g. the aforementioned blood donation clinic). I would not want to purposefully introduce someone else to it (again, unless there was a good enough reason; even then, I would try to minimise the pain while not compromising the good enough reason); but despite this similarity, I do think that there may be minor differences. (It's also possible that we have slightly different definitions of the word 'disturbing').


If (un)clippiness is real, can humans ever understand (un)clippiness? By analogy, if organic sentients want to understand what it's like to be a bat - and not merely decipher the third-person mechanics of echolocation - then I guess we'll need to add a neural module to our CNS with the right connectivity and neurons supporting chiropteran gene-expression profiles, as well as peripheral transducers (etc).

But would such a modified human know what it's like to be an unmodified human? If I were to guess what echolocation looks like to a bat, I'd guess a false-colour image with colours corresponding to textures instead of to wavelengths of light... though that's just a guess.


Understanding clippiness is more of a challenge. I'm unclear what kind of neurocomputational architecture could support clippiness. Also, whether clippiness could be integrated into the unitary mind of an organic sentient depends on how you think biological minds solve the phenomenal binding problem. But let's suppose binding can be done. So here we have orthogonal axes of (dis)value. On what basis does the dual-axis subject choose between them? Sublime bliss and pure clippiness are both, allegedly, self-intimatingly valuable. OK, I'm floundering here...

What is the phenomenal binding problem? (Wikipedia gives at least two different definitions for that phrase). I think I may be floundering even more than you are.

I'm not sure that Clippy would even have a pleasure-pain axis in the way that you're imagining. You seem to be imagining that any being with such an axis must value pleasure - yet if pleasure doesn't result in more paperclips being made, then why should Clippy value pleasure? Or perhaps the disutility of unclippiness simply overwhelms any possible utility of pleasure...

The existence of monozygotic twins with different qualia would be a more surprising phenomenon, though even such "identical" twins manifest all sorts of epigenetic differences.

According to a bit of googling, among the monozygotic Dionne quintuplets, two out of the five were colourblind; suggesting that they did not have the same qualia for certain colours as each other. (Apparently it may be linked to X-chromosome activation).

Comment author: wedrifid 15 March 2013 10:37:13AM 2 points [-]

Are you talking about agony as an error signal, or are you talking about agony as a quale? I begin to suspect that you may mean the second. If so, then the paperclipper can easily be indifferent to agony; but it probably can't understand how humans can be indifferent to a lack of paperclips.

A paperclip maximiser would (in the overwhelming majority of cases) have no such problem understanding the indifference of humans to paperclips. A tendency to anthropomorphise is a quirk of human nature. Assuming that paperclip maximisers have an analogous temptation (to clipropomorphise) is itself just anthropomorphising.

Comment author: CCC 15 March 2013 01:20:37PM 0 points [-]

I take your point. Though Clippy may clipropomorphise, there is no reason to assume that it will.

...is there any way to retract just a part of a previous post?

Comment author: whowhowho 15 March 2013 11:20:29AM 0 points [-]

All pain hurts, or it wouldn't be pain.

Comment author: [deleted] 15 March 2013 01:25:55PM 1 point [-]

Comment author: wedrifid 14 March 2013 01:35:22PM 1 point [-]

What is the force of "could" here?

The force is that all this talk about understanding 'the pain/pleasure' axis would be a complete waste of time for a paperclip maximiser. In most situations it would be more efficient not to bother with it at all and spend its optimisation efforts on making more efficient relativistic rockets so as to claim more of the future light cone for paperclip manufacture.

It would require motivation for the paperclip maximiser to expend computational resources understanding the arbitrary quirks of DNA based creatures. For example some contrived game of Omega's which rewards arbitrary things with paperclips. Or if it found itself emerging on a human inhabited world, making being able to understand humans a short term instrumental goal for the purpose of more efficiently exterminating the threat.

By analogy, if I were building a perpetual motion machine but allegedly "could" grasp the second law of thermodynamics, the modal verb is doing an awful lot of work.

Terrible analogy. Not understanding "pain and pleasure" is in no way similar to believing it can create a perpetual motion machine. Better analogy: An Engineer designing microchips allegedly 'could' grasp analytic cubism. If she had some motivation to do so. It would be a distraction from her primary interests but if someone paid her then maybe she would bother.

Surely, if I grasped the second law of thermodynamics, then I'd stop. Likewise, if the paperclipper were to be consumed by unbearable agony, it would stop too.

Now "if" is doing a lot of work. If the paperclipper was fundamentally different to a paperclipper and was actually similar to a human or DNA-based relative capable of experiencing 'agony', and assuming agony was just as debilitating to the paperclipper as to a typical human... then sure, all sorts of weird stuff follows.

The paperclipper simply hasn't understood the nature of what it was doing.

I prefer the word True in this context.

Is the qualia-naive paperclipper really superintelligent - or just polymorphic malware?

To the extent that you believed that such polymorphic malware is theoretically possible and constituted most possible minds, it would be possible for your model to be used to accurately describe all possible agents; it would just mean systematically using different words. Unfortunately I don't think you are quite at that level.

Comment author: davidpearce 14 March 2013 03:23:40PM 0 points [-]

Wedrifid, granted, a paperclip-maximiser might be unmotivated to understand the pleasure-pain axis and the qualia-spaces of organic sentients. Likewise, we can understand how a junkie may not be motivated to understand anything unrelated to securing his supply of heroin - and a wireheader in anything beyond wireheading. But superintelligent? Insofar as the paperclipper - or the junkie - is ignorant of the properties of alien qualia-spaces, then it/he is ignorant of a fundamental feature of the natural world - hence not superintelligent in any sense I can recognise, and arguably not even stupid. For sure, if we're hypothesising the existence of a clippiness/unclippiness qualia-space unrelated to the pleasure-pain axis, then organic sentients are partially ignorant too. Yet the remedy for our hypothetical ignorance is presumably to add a module supporting clippiness - just as we might add a CNS module supporting echolocatory experience to understand bat-like sentience - enriching our knowledge rather than shedding it.

Comment author: Creutzer 14 March 2013 03:33:13PM *  2 points [-]

But superintelligent? Insofar as the paperclipper - or the junkie - is ignorant of the properties of alien qualia-spaces, then it/he is ignorant of a fundamental feature of the natural world - hence not superintelligent in any sense I can recognise, and arguably not even stupid.

What does (super-)intelligence have to do with knowing things that are irrelevant to one's values?

Comment author: whowhowho 14 March 2013 04:40:18PM *  0 points [-]

What does knowing everything about airline safety statistics, and nothing else, have to do with intelligence? That sort of thing is called savant ability -- short for "idiot savant".

Comment author: [deleted] 15 March 2013 01:16:48PM 0 points [-]

I guess there's a link missing (possibly due to a missing <http://> in the Markdown) after the second word.

Comment author: buybuydandavis 18 March 2013 04:14:44AM -1 points [-]

What Eliezer is talking about (a superintelligent paperclip maximiser) does not have a pleasure-pain axis.

Why does that matter for the argument?

As long as Clippy is in fact optimizing paperclips, what does it matter what/if he feels while he does it?

Pearce seems to be making a claim that Clippy can't predict creatures with pain/pleasure if he doesn't feel them himself.

Maybe Clippy needs pleasure/pain to be able to predict creatures with pleasure/pain. I doubt it, but fine, grant the point. He can still be a paperclip maximizer regardless.

Comment author: wedrifid 18 March 2013 04:53:37AM *  0 points [-]

Why does that matter for the argument?

I fail to comprehend the cause for your confusion. I suggest reading the context again.