What do you mean it can't handle advanced neuroscience? Who do you think invented neuroscience!
Not that I wanna beat a dead horse here, but it took us ages. We can't even do basic arithmetic right without tons of tools. I'm always astonished to read history books and see how many really fundamental things weren't discovered for hundreds, if not thousands of years. So I'm fairly underwhelmed by the intellectual capacities of humans. But I see your point.
Since we can separate these concepts, it seems like our final reflective equilibrium, whatever that looks like, is perfectly capable of treating them differently.
Capable, sure. But that seems like an overly general argument. The ability to distinguish things doesn't mean the distinction appears in the supposed utility function. I can tell apart hundreds of monospace fonts (don't ask), but I don't expect monospace fonts to appear in my actual utility function as terminal values. I'm not sure how this helps either way.
Am I correct that you are claiming that while my conscious mind utters sentences like "I don't want to be a wire-head" subconsciously I actually do want to be a wire-head?
Not exactly like this. I don't think the unconscious part of the brain is conspiring against the conscious one.
I don't think it's useful to clearly separate "conscious" and "unconscious" into two distinct agents. They are the same agent, only with conscious awareness shifting around, metaphorically like handing around a microphone in a crowd such that only one part can make itself heard for a while and then has to resort to affecting only its direct neighbors or screaming really loud.
I don't think there's a direct conflict between agents here. Rather, the (current) conscious part encounters intentions and reactions it doesn't understand, doesn't know the origin or history of, and then tries to make sense of them, so it often starts confabulating. This is most easily seen in split-brain patients.
I can clearly observe this by watching my own intentions and my reactions to them moment-to-moment. Intentions come out of nowhere, then directly afterwards (if I investigate) a reason is made up why I wanted this all along. Sometimes, this reason might be correct, but it's clearly a later interpolation. That's why I generally tend to ignore any verbal reasons for actions.
So maybe hypocrisy is a bit of a misleading term here. I'd say that there are many agents that don't always have privileged access (and aren't always conscious), so they get somewhat ignored, which screws up complex decision making, which causes akrasia. Like, "I'm not getting my needs fulfilled and can't change that myself right now, so I'm going to veto everything!". On the other hand, the conscious part is now stuck with actions that don't make sense, so it makes up a story. It signals "oh, I would've studied all day, but I somehow couldn't get myself to stop watching cat videos, even though I hated it". Really, it just avoided the pain of boredom while studying and needed instant gratification. But "akrasia" is a much nicer cover story.
I'm not saying this is perfectly correct or the whole picture, but I think a model like this fits my own experiences more closely than assuming actually conflicting agents. Also, those unconscious parts, I suspect, are too simple to actually understand wireheading. They want rewards. If they were smart enough, they might see that wireheading is a good solution.
On a somewhat related note, Susan Blackmore often makes the point when talking about free will that she doesn't have any and doesn't even have the illusion of free will anymore, but it doesn't interfere with her actual behavior. Example quote from Conversations On Consciousness (she talks more about this in several radio shows I can't find right now):
Susan Greenfield: "[Searle] said that when he goes into a restaurant and orders a hamburger, he doesn't say, 'Well, I'm a determinist, I wonder what my genes are going to order.'" Susan Blackmore: "I do. You're right that Searle doesn't do that, but when I go in a restaurant, I think, 'Ooh, how interesting, here's a menu, I wonder what she'll choose'; so it is possible to do that."
I'm totally like Blackmore here. I have no idea what I'll choose tomorrow or even in ten minutes, only that it will be according to rewards, aversions and so on. Even dropping counterfactuals from my decision making (and no longer making up verbal reasons) hasn't crippled me in any way, as far as I can tell.
That makes me skeptical that there's really all that much complex machinery behind all this, and it makes the insistence on "but I really value this complex, external thing!" all the more puzzling.
Also, I don't think that qualia are ever a useful concept. Let's not drag any dualism into this by accident. Besides, what makes you think that "what you call qualia" is something your unconscious processes don't have, right now? What makes you think you have exactly one conscious mind in your skull?
Do you, in all honesty, want to be wire-headed? For the moment I'm not asking what you think you should want, what you want to want or what you think you would want in reflective equilibrium, just what you actually want. Does the prospect of being reduced to orgasmium, if you were offered it right now, seem more desirable than the prospect of a complicated universe filled with diverse beings pursuing interesting goals and having fun?
I don't have an opinion on that, deliberately. I find wireheading very attractive, and it seems about as nice as the complicated universe, but much easier and more of an elegant solution. The halo effect is way too powerful here, and I don't wanna screw myself over just because I overlooked a fundamental flaw, dazzled by how pretty the solution was.
(Of course, as per the nature of wireheading, even if I thought it were a good idea, I would spend no effort on convincing anyone of it. What for, because I value them? Then what am I wireheading myself for?)
I've been thinking about wireheading and the nature of my values. Many people here have defended the importance of external referents or complex desires. My problem is, I can't understand these claims at all.
To clarify, I mean wireheading in the strict "collapsing into orgasmium" sense. A successful implementation would identify all the reward circuitry and directly stimulate it, or do something equivalent. It would essentially be a vastly improved heroin. A good argument for either keeping complex values (e.g. by requiring at least a personal matrix) or external referents (e.g. by showing that a simulation can never suffice) would work for me.
Also, I use "reward" as shorthand for any enjoyable feeling, since "pleasure" tends to be used for one specific feeling among them (bliss, excitement, and so on), and "it's not about feeling X, but X and Y" is still wireheading, after all.
I tried collecting all related arguments I could find. (Roughly sorted from weak to very weak, as I understand them, plus link to example instances. I also searched any literature/other sites I could think of, but didn't find other (not blatantly incoherent) arguments.)
(There have also been technical arguments against specific implementations of wireheading. I'm not concerned with those, as long as they don't show impossibility.)
Overall, none of this sounds remotely plausible to me. Most of it is outright question-begging or relies on intuition pumps that don't even work for me.
It confuses me that others might be convinced by arguments of this sort, so it seems likely that I have a fundamental misunderstanding or there are implicit assumptions I don't see. I fear that I have a large inferential gap here, so please be explicit and assume I'm a Martian. I genuinely feel like Gamma in A Much Better Life.
To me, all this talk about "valuing something" sounds like someone talking about "feeling the presence of the Holy Ghost". I don't mean this in a derogatory way, but it matches the pattern "sense something funny, therefore some very specific and otherwise unsupported claim". How do you know it's not just, you know, indigestion?
What is this "valuing"? How do you know that something is a "value", terminal or not? How do you know what it's about? How would you know if you were mistaken? What about unconscious hypocrisy or confabulation? Where do these "values" come from (i.e. what process creates them)? Overall, it sounds to me like people are confusing their feelings about (predicted) states of the world with caring about those states directly.
To me, it seems like it's all about anticipating and achieving rewards (and avoiding punishments, but for the sake of the wireheading argument, that's equivalent). I make predictions about what actions will trigger rewards (or instrumentally help me pursue those actions) and then engage in them. If my prediction was wrong, I drop the activity and try something else. If I "wanted" something, but getting it didn't trigger a rewarding feeling, I wouldn't take that as evidence that I "value" the activity for its own sake. I'd assume I suck at predicting or was ripped off.
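The loop I'm describing is basically a trivial reward learner. Here's a toy sketch (the activities and reward numbers are made up for illustration, not claims about actual psychology): the agent predicts which activity will pay off, tries it, and revises the prediction when it's wrong, so disappointing activities get dropped without any "terminal value" ever entering the picture.

```python
import random

random.seed(0)

# Hypothetical activities and the (hidden) reward each actually delivers.
true_reward = {"study": 0.2, "cat_videos": 0.8, "exercise": 0.5}

# The agent's predictions about which activity will trigger a reward.
# It starts out optimistic about everything.
estimates = {a: 1.0 for a in true_reward}

def pick(estimates, eps=0.1):
    """Mostly pick the activity predicted to be most rewarding,
    occasionally trying something else."""
    if random.random() < eps:
        return random.choice(list(estimates))
    return max(estimates, key=estimates.get)

for _ in range(500):
    activity = pick(estimates)
    reward = true_reward[activity]
    # If the prediction was wrong, nudge it toward what actually happened;
    # disappointing activities get dropped in favor of something else.
    estimates[activity] += 0.1 * (reward - estimates[activity])

# The agent ends up pursuing whatever actually pays off, without ever
# "valuing" any activity for its own sake.
print(max(estimates, key=estimates.get))
```

The point of the toy: nothing in the loop refers to the activities themselves as ends, only to the reward signal, which is exactly the picture I'm describing.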
Can someone give a reason why wireheading would be bad?