Hi Jonatas,
I am having a hard time understanding the argument. In general terms, I take you to be arguing that some kind of additive impartial total hedonistic utilitarianism is true, and would be discovered by, and motivating to, any "generally intelligent" reasoner. Is that right?
My rough guess at your argument, knowing that I am having difficulty following your meaning, is something like this:
Different instances in time of a physical organism relate to it in the same way that any other physical organism in the universe does. There is no logical basis for privileging a physical organism's own viewpoint,
Indeed, there is none. But nor is there any logical basis for not privileging a physical organism's own viewpoint, and since most organisms evolve/are built to privilege themselves, this is not an argument that will make them change their opinion.
I am also having a hard time understanding this argument, but skimming through it I don't see anything that looks strong enough to defeat the orthogonality thesis, which I see as the claim that it should be possible to design minds in such a way that the part with the utility function is separate from the part which optimizes. This seems to me like a pretty reasonable claim about a certain class of algorithms, and I would expect an argument claiming that such algorithms cannot exist to involve substantially more math than what I see in this argument (namely, no math whatsoever).
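To make that claim concrete, here is a toy sketch of the kind of separation I mean (the names and the trivial search are purely illustrative, not anything from Bostrom's or Armstrong's papers):

```python
from typing import Callable, Iterable

def optimize(candidate_plans: Iterable[dict], utility: Callable[[dict], float]) -> dict:
    """Generic optimizer: pick the candidate plan with the highest utility.

    The optimizer knows nothing about *what* is valued; the utility
    function is a separate, swappable component.
    """
    return max(candidate_plans, key=utility)

# Any utility function can be plugged in without touching the optimizer:
def paperclip_utility(plan: dict) -> float:
    # toy stand-in for "how many paperclips this plan would produce"
    return plan.get("paperclips", 0)

def human_welfare_utility(plan: dict) -> float:
    return plan.get("welfare", 0)

plans = [
    {"name": "run a paperclip factory", "paperclips": 70, "welfare": 1},
    {"name": "fund public parks", "paperclips": 0, "welfare": 10},
]

print(optimize(plans, paperclip_utility)["name"])      # -> run a paperclip factory
print(optimize(plans, human_welfare_utility)["name"])  # -> fund public parks
```

Nothing in the optimizer constrains which utility function is supplied, which is the (informal) sense in which goals and optimization power look orthogonal for this class of algorithms.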
No offense: this is not written well enough for me to follow your arguments closely enough to respond to them. But, considering the outside view, what do you think are the chances you have actually proven moral realism in three paragraphs? If you're trying to convince non-realists they're wrong then substantially more work is necessary.
At the point where you called some values "errors" without defining their truth conditions I assumed this wasn't going to be any good and stopped reading.
What sorts of actions, both cognitive (inside the computer) and physical, would a robot have to take to make you think it valued some alien thing like maximizing the number of paperclips in the world? Robbing humans to build a paperclip factory, for example, or making a list of plans, ordered by how many paperclips they would make.
Why is it impossible to program a robot that would do this?
For example, "there is no logical basis for privileging a physical organism's own viewpoint" is true under certain premises, but what goes wrong if we just bui...
I find this post to be too low quality to support even itself, let alone stand up against the orthogonality thesis (on which I have no opinion). It needs a complete rewrite at best. Some (rather incomplete) notes are below.
This is either because the beings in question have some objective difference in their constitution that associates them to different values, or because they can choose what values they have.
Where do you include environmental and cultural influences?
...If they differ in other values, given that they are constitutionally similar, then
You make some good points about the post, but there's no call for this:
Please consider learning the material before writing about it next time. Maybe read a Sequence or two, can't hurt, can it?
Jonatas happens to be a rather successful philosophy student, who I think is quite well read in related topics, even if this post needs work. He's also writing in a second language, which makes it harder to be clear.
This essay is... unclear, but it really sounds like you are limiting the definition of 'intelligence' to a large but limited set of somewhat human-like intelligences with a native capacity for sociability, which does not include most Yudkowskian FOOMing AIs.
I actually read all the way through and found the broad argument quite understandable (although many of the smaller details were confusing). I also found it obviously wrong on many levels. The one I would consider most essential is that you say:
...The existence of personal identities is purely an illusion that cannot be justified by argument, and clearly disintegrates upon deeper analysis...
Different instances in time of a physical organism relate to it in the same way that any other physical organism in the universe does. There is no logical basis for privileging a physical organism's own viewpoint...
On the question of personal identity, another essay, posted on Less Wrong by Eliezer, is here:
http://lesswrong.com/lw/19d/the_anthropic_trilemma/
However, while this essay presents the issue, it admittedly does not solve it, and expresses doubt that it would be solved in this forum. A solution does exist in philosophy, though: for example, in the first essay I linked to, in Daniel Kolak's work "I Am You: The Metaphysical Foundations for Global Ethics", or, in partial form, in Derek Parfit's work "Reasons and Persons".
The crux of the disagreement, I think, is in the way we understand the self-assessment of our experience. If consciousness is epiphenomenal or just a different level of description of a purely physical world, this self-assessment is entirely algorithmic and does not disclose anything real about the intrinsic nature of consciousness.
But consciousness is not epiphenomenal, and a purely computational account fails to bridge the explanatory gap. Somehow conscious experience can evaluate itself directly, which remains a peculiar and not well understood f...
The problem with the orthogonality thesis is not so much that it's wrong as that it's misleading. It's a special case of the idea that we will ultimately be able to create whatever we can imagine (because our brains are VR simulators, and because of Turing completeness). The problem with it is that what we can imagine and what evolution tends to produce are different things. Failing to account for that seems consistent with fear-mongering about the future, a common marketing technique for these kinds of outfits. Sure enough, the paper goes on to talk ab...
The orthogonality thesis, formulated by Nick Bostrom in his article "The Superintelligent Will" (2011), states roughly that an artificial intelligence can have any combination of intelligence level and goal. This article will focus on that claim itself, and will deal only at the end with the practical implementation issues that, according to Stuart Armstrong, would need to be part of a full refutation.
Meta-ethics
The orthogonality thesis rests on an assumed variation in ethical values among different beings. This is either because the beings in question have some objective difference in their constitution that associates them to different values, or because they can choose what values they have.
That assumption of variation is arguably based on an analysis of humans. The problem with choosing values is obvious: the possibility of making errors. Human beings are biologically and constitutionally very similar, and given this, if they objectively and rightfully differ in correct values, it is only in aesthetic preferences, owing to an existing biological difference. If they differ in other values, given that they are constitutionally similar, then the differing values could not all be correct at the same time; they would differ due to error in choice.
Aesthetic preferences do vary for us, but they all connect ultimately to their satisfaction: a specific aesthetic preference may satisfy only some people and not others. What matters is the satisfaction, or good feelings, that they produce, in the present or the future (which might entail life preservation), and this is basically the same thing for everyone. A given stimulus or occurrence is interpreted by the senses and can produce good feelings, bad feelings, or neither, depending on the organism that receives it. This variation is beside the point; it is just an idiosyncrasy that could go either way: theoretically, any input (aesthetic preference) could be associated with a given output (good or bad feelings), or there could even be no input at all, as in spontaneous satisfaction or wire-heading. In terms of output, good feelings and bad feelings always receive positive and negative value, respectively, by definition.
Masochism is not a counter-example: masochists like pain only in very specific environments, associated with certain role-playing fantasies, either because of good feelings associated with it or because of a relief of mental suffering that comes with the pain. Outside of these environments and fantasies, they are just as averse to pain as other people: they don't regularly put their hands into boiling water to feel the pain; nobody does.
Good feelings are directly felt as positive and desirable, and bad feelings as negative and aversive; this direct verification gives them the highest epistemological value. What is indirectly felt, such as the world around us, science, or physical theories, depends on the senses and could therefore be an illusion, for instance part of a virtual world. We could, theoretically, be living inside virtual worlds in an underlying alien universe with different physical laws and scientific facts, but we can nonetheless be sure of the reality of our conscious experiences in themselves, which are directly felt.
There is a difference between valid and invalid human values, which is the ground of justification for moral realism: valid values have an epistemological justification, while invalid ones are based on arbitrary choice or intuition. The epistemological justification of valid values comes from the part of our experience that has direct, rather than indirect, certainty: conscious experiences in themselves. Likewise, only conscious beings can be said to be ethically relevant in themselves, while what goes on in the hot magma at the core of the earth, or in a random rock on Pluto, is not. Consciousness creates a subject of experience, which is required for direct ethical value. It is straightforward to conclude, therefore, that good conscious experiences constitute what is good, and bad conscious experiences constitute what is bad. Good and bad are what ethical value is about.
Good and bad feelings (or conscious experiences) are physical occurrences, and therefore objectively good and bad occurrences, that is, objective value. Other, fictional values, which lack epistemological (or logical) justification, therefore belong to another category, and simply constitute the error that comes from allowing beings with a similar biological constitution to freely choose their values.
Personal Identity
The existence of personal identities is purely an illusion that cannot be justified by argument, and clearly disintegrates upon deeper analysis (for why that is, see, e.g., the essay Universal Identity, or, for an introduction to the problem, the Less Wrong article The Anthropic Trilemma).
Different instances in time of a physical organism relate to it in the same way that any other physical organism in the universe does. There is no logical basis for privileging a physical organism's own viewpoint, nor for privileging the satisfaction of its own values over that of other physical organisms, nor for assuming the preponderance of its own reasoning over that of other physical organisms of contextually comparable reasoning capacity.
Therefore, the argument from variation or orthogonality could, at best, hold that a superintelligent physical organism with complete understanding of these cognitively trivial philosophical matters would have to consider all viewpoints and valid preferences in its utility function, much as in coherent extrapolated volition (CEV), extrapolating values for intelligence and removing errors, but taking account of the values of all sentient physical organisms: not only humans, but also animals, and possibly sentient machines and aliens. The only values that are validly generalizable among such widely differing sentient creatures are good and bad feelings (in the present or future).
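As a rough illustrative sketch (the notation is mine, not part of Bostrom's or Armstrong's formulations), such a utility function would aggregate the good and bad feelings of all sentient physical organisms over time:

```latex
U \;=\; \sum_{i \in S} \sum_{t} \bigl( g_i(t) - b_i(t) \bigr)
```

where S is the set of all sentient physical organisms (humans, animals, and possibly sentient machines and aliens), and g_i(t) and b_i(t) stand for the good and bad feelings of organism i at time t.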
Furthermore, a superintelligent physical organism with such understanding would have to give equal weight to the reasoning of other physical organisms of contextually comparable reasoning capacity, if such organisms exist (depending on the cognitive demands of the context or problem, even some humans can reason perfectly well). In case of convergence, this would be a non-issue; in case of divergence, it would force an evaluation of reasons and argumentation, seeking a convergence or a preponderance of argument.
Conclusions
If the orthogonality thesis is taken to be merely the assumption that the ethical values of superintelligent agents diverge, and not a statement about the issues of practical implementation or of non-superintelligent humans tampering with or forcing those values, then there are two fatal arguments against it: one on the side of meta-ethics (moral realism), and one on the side of personal identity (open/empty individualism, or universal identity).
Beings with general superintelligence should find these fundamental philosophical matters (meta-ethics and personal identity) trivial, and understand them completely. They should take a non-privileged and objective viewpoint, accounting for all perspectives of physical subjects, and giving (a priori) similar consideration to the reasoning of all physical organisms of contextually comparable reasoning capacity.
Furthermore, they would understand that the free variation of values, even in comparable causal chains of biologically similar organisms, comes from error, and that extrapolating those values for intelligence would result in moral realism, with good and bad feelings as the epistemologically justified and only valid direct values, from which all other indirectly or instrumentally valuable actions derive their indirect value. Survival, for instance, can have positive value in a paradise, coming from good feelings in the present and future, and negative value in a hell, coming from bad feelings in the present and future.
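In the same illustrative notation as above, the indirect value of survival for an organism i would be the expected balance of its future good and bad feelings:

```latex
V_i(\text{survival}) \;=\; \mathbb{E}\!\left[ \sum_{t > t_0} \bigl( g_i(t) - b_i(t) \bigr) \right]
```

which comes out positive in a paradise, where good feelings dominate, and negative in a hell, where bad feelings dominate.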
Perhaps certain architectures or contexts involving beings with superintelligence, brought about by non-superintelligent beings with erratic behavior, could be forced to produce unethical results. This seems to be the gravest existential risk that we face, and it would come not from beings with superintelligence themselves, but from human error. The orthogonality thesis is fundamentally mistaken in relation to beings with general superintelligence (surpassing all human cognitive capacities), but it might be practically realized by non-superintelligent human agents.