Out of curiosity, is rejection of the Orthogonality thesis a common position in philosophy? (If you can make a guess at what percentage of philosophers reject it, that'd be cool.)
I seem to remember always finding it intuitively obvious, so it's difficult for me to understand why someone would disagree with it (except for being a theist, maybe).
Someone with a hardcore 'rationalist' position (someone who thought all moral statements could be derived from first principles, e.g. a Kantian) would probably reject it, but they're basically extinct in the wild.
In the sense of moral rationalism. The fact that 'rationalist' can be used to refer to either rationality or rationalism is unfortunate, but IIRC (too busy to search for it) we've had a few debates about terminology and decided that we're currently using the least bad options.
Indeed. It's a problem of language evolution.
To summarise a few centuries of philosophy very briefly: a long time ago there were Rationalists, who thought everything could be proven by pure reason, and Empiricists, who depended on observation of the external world. Because Reason was often used in contrast to emotion (and because of the association with logic and mathematics), "rational" evolved into a general word for reasonable or well argued. The modern rationalist movement is about thinking clearly and coming to correct conclusions, which can't really be done by relying exclusively on pure reason (hence moral rationalists in the original sense don't really exist anymore).
Moral motivation: internalism or externalism?
Other 329 / 931 (35.3%)
Accept or lean toward: internalism 325 / 931 (34.9%)
Accept or lean toward: externalism 277 / 931 (29.8%)
Internalism is the belief that it is a necessary truth that, if A believes X to be wrong/right, A is at least partly motivated to avoid/promote/honour X. Externalism is usually taken to be the denial of internalism, so I don't know what the 35.3% who answered "other" are talking about. My guess is they meant "don't know".
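For what it's worth, here's a quick arithmetic check on those figures (my own check, not anything from the survey report): the three options account for every respondent, so "Other" really is a distinct answer category in the quoted data rather than missing responses.

```python
# Sanity check on the survey numbers quoted above (hypothetical helper, my own arithmetic).
responses = {"other": 329, "internalism": 325, "externalism": 277}
total = 931

assert sum(responses.values()) == total  # the three categories exhaust the respondents
for label, count in responses.items():
    print(f"{label}: {count}/{total} = {count / total:.1%}")
# other: 35.3%, internalism: 34.9%, externalism: 29.8%
```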
it's difficult for me to understand why someone would disagree with it
Typical mind fallacy, but with respect to the entirety of mindspace?
Intelligence and final goals are orthogonal axes along which possible agents can freely vary. In other words, more or less any level of intelligence could in principle be combined with more or less any final goal.
It seems true - but pretty irrelevant. We mostly care about real world agents - not what could "in principle" be constructed. It's a kind of weasel wording - no doubt intended to provoke concern about evil geniuses.
We should call those the Moral Convergent Values or some other fancy name.
These are the same as Universal Instrumental Values? Or is there a reason to think that something different would be valued?
Incidentally, value convergence could involve multiple attractors. There might be moral symmetry breaking. Value systems being natural attractors doesn't imply universal convergence on one set of values. This point seems to get lost in this post.
Tim, thanks for that commentary; it puts reading your book at the top of my leisure to-do list.
Yes, it could involve multiple attractors. I'm not sure which kind of symmetry you're referring to, though. Do you mean some sort of radial symmetry, with everything else converging towards a unique set of values? Even in that case it would not be symmetric, because the acceleration (force) would differ between regions, due for instance to the stuff in (2008 Boyd Richerson and someone else).
About your main question: no, those are not the same as Universal Instrumental Values. Those who hold that claim would probably prefer to say something like: there are these two sets of convergent values, the instrumental ones, about which we don't care much more than Omohundro does, and the Convergent ones, which are named that because they converge despite not converging for instrumental reasons.
Yes, it could involve multiple attractors. I'm not sure which kind of symmetry you're referring to, though. Do you mean some sort of radial symmetry, with everything else converging towards a unique set of values? Even in that case it would not be symmetric, because the acceleration (force) would differ between regions, due for instance to the stuff in (2008 Boyd Richerson and someone else).
Imagine the Lefties, who value driving on the left - and the Righties, who value driving on the right. Nature doesn't care much about this (metaphorically speaking, of course), but the Lefties and the Righties do. I would say that was an example of moral symmetry breaking. It may not be the greatest example (it is more likely that they actually care about not being killed) - but I think it illustrates the general idea.
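To make the "two attractors" picture concrete, here is a minimal toy simulation (my own sketch, not anything from the book or the post): drivers copy whichever convention the current majority uses, plus a little noise, and the population settles on all-Left or all-Right depending purely on the initial mix - the symmetry gets broken even though neither convention is intrinsically better.

```python
import random

# Toy model of convention formation with two attractors ("all Left" / "all Right").
# Each round, every driver adopts the current majority convention, except for a
# small amount of noise. Which attractor wins depends only on the initial mix.

def simulate(initial_left_fraction, n=1000, rounds=50, noise=0.02, seed=0):
    rng = random.Random(seed)
    left = sum(rng.random() < initial_left_fraction for _ in range(n))
    for _ in range(rounds):
        majority_is_left = left > n / 2
        left = sum(
            (majority_is_left if rng.random() > noise else not majority_is_left)
            for _ in range(n)
        )
    return left / n  # final fraction of Lefties

print(simulate(0.55))  # slight initial Left lean  -> ends near 1.0 (all Left)
print(simulate(0.45))  # slight initial Right lean -> ends near 0.0 (all Right)
```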
About your main question: no, those are not the same as Universal Instrumental Values. Those who hold that claim would probably prefer to say something like: there are these two sets of convergent values, the instrumental ones, about which we don't care much more than Omohundro does, and the Convergent ones, which are named that because they converge despite not converging for instrumental reasons.
I suspect they are practically the same. Intelligent organisms probably won't deviate far from Universal Instrumental Values - for fear of meeting agents whose values more closely approximate them - thus losing control of their entire future.
Lefties and Righties is just a case of convention; if humans had three arms, two of them on the right, there might have been a fact of the matter about which side preference makes things go better.
I think this fear of other agents taking over the world is some remnant of in-group/out-group bias. To begin with, in the limit, if you value A, B, and C intrinsically but you have to do D1, D2, and D3 instrumentally, you may initially think of doing D1, D2, and D3. But what use would it be to fill up your future with that instrumental stuff if you nearly never get A, B, and C? You'd become just one more stupid replicator fighting for resources. You'd be better off doing nothing and hoping that, by luck, A, B, and C were being instantiated by someone less instrumental than yourself.
Lefties and Righties is just a case of convention; if humans had three arms, two of them on the right, there might have been a fact of the matter about which side preference makes things go better.
Sure, but there are cases where rivals are evenly matched. Lions and tigers, for instance, have different - often conflicting - aims. However, it isn't a walk-over for one team. Of course, you could say whether the lion or tiger genes win is "just a convention" - but to the lions and tigers, it really matters.
To begin with, in the limit, if you value A, B, and C intrinsically but you have to do D1, D2, and D3 instrumentally, you may initially think of doing D1, D2, and D3. But what use would it be to fill up your future with that instrumental stuff if you nearly never get A, B, and C?
No use. However, our values are not that far from Universal Instrumental Values - because we were built by a process involving a lot of natural selection.
Our choice is more like: do we give up a few of the things we value now, or run the risk of losing many more of them in the future? That leads to the question of how big the risk is - and that turns out to be a tricky issue.
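As a toy illustration of how the answer flips with the size of the risk (purely made-up numbers, just to show the shape of the tradeoff): suppose diverting resources to instrumental self-protection costs 10% of what we value now, while losing out to a more instrumentally-optimised rival later costs 90% of it.

```python
# Made-up parameters: protection costs 10% of current terminal value up front;
# an unprotected loss to a more instrumental rival costs 90% of it.
PROTECTION_COST = 0.10
LOSS_IF_UNPROTECTED = 0.90
LOSS_IF_PROTECTED = 0.10  # assume protection mostly, but not entirely, works

def expected_value(p_conflict, invest_in_protection):
    if invest_in_protection:
        return (1 - PROTECTION_COST) * (1 - LOSS_IF_PROTECTED * p_conflict)
    return 1 - LOSS_IF_UNPROTECTED * p_conflict

for p in (0.05, 0.20, 0.50):
    protect = expected_value(p, True)
    gamble = expected_value(p, False)
    print(f"p(conflict)={p:.2f}: protect={protect:.3f}, gamble={gamble:.3f}, "
          f"protecting wins: {protect > gamble}")
# At p=0.05 the gamble wins; by p=0.20 protecting already wins -- so everything
# hinges on the probability estimate, which is the tricky part.
```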
Agreed. That tricky issue, I suspect, might have enormous consequences if reason ends up being hijacked by in-group/out-group biases, and the surviving memes end up being those that make us more instrumental, for fear of someone else doing the same.
I expect that the force that will eventually promote natural values most strongly will be the prospect of encountering unknown aliens. As you say, the stakes are high. If we choose incorrectly, much of our distinctiveness could be permanently obliterated.
Sorry, in your terminology I should have said "reproductor"? I forgot your substitute for "replicator"...
I disbelieve the orthogonality thesis, but I'm not sure that my position is covered by either of your two cases. My position is best described by a statement of Yudkowsky's:
"for every X except x0, it is mysteriously impossible to build any computational system which generates a range of actions, predicts the consequences of those actions relative to some ontology and world-model, and then selects among probable consequences using criterion X"
I certainly don't think AIs become friendly automatically. I agree they have to have the correct goal system X (x0) built in from the start. My guess is that AIs without the correct X built in are not true general intelligences. That is to say, I think they would simply stop functioning correctly (or, equivalently, there is an intelligence ceiling past which they cannot go).
Why do you think this, and on a related note, why do you think AIs without X will stop functioning/hit a ceiling (in the sense of: what is the causal mechanism)?
Taking a wild guess I’d say…
Starting from my assumption that concept-free general intelligence is impossible, the implication is that there would be some minimal initial set of concepts required to be built-in for all AGIs.
This minimal set of concepts would imply some necessary cognitive biases/heuristics (because the very definition of a ‘concept’ implies a particular grouping or clustering of data – an initial ‘bias’), which in turn is equivalent to some necessary starting values (a ‘bias’ is in a sense, a type of value judgement).
The same set of heuristics/biases (values) involved in taking actions in the world would also be involved in managing (reorganizing) the internal representational system of the AIs. If the reorganization is not performed in a self-consistent fashion, the AIs stop functioning. Remember, we are talking about a closed loop here: the heuristics/biases used to reorganize the representational system have to themselves be fully represented in that system.
Therefore, the causal mechanism that stops the uAIs would be the eventual breakdown of their representational systems as the need for ever more new concepts arises, stemming from the inconsistent and/or incomplete initial heuristics/biases being used to manage those representational systems (i.e., failing to maintain a closed loop).
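As a toy illustration of the earlier claim that any choice of concepts already builds in a bias about which groupings matter (my own sketch, not the promised math): the same four animals fall into different "concepts" depending on which feature you treat as the relevant similarity.

```python
# The same data, grouped under two different similarity criteria, yields two
# different concept schemes -- neither grouping is "given" by the data alone.
animals = [
    ("bat",     {"flies": True,  "mammal": True}),
    ("sparrow", {"flies": True,  "mammal": False}),
    ("whale",   {"flies": False, "mammal": True}),
    ("trout",   {"flies": False, "mammal": False}),
]

def group_by(feature):
    groups = {}
    for name, traits in animals:
        groups.setdefault(traits[feature], []).append(name)
    return groups

print(group_by("flies"))   # {True: ['bat', 'sparrow'], False: ['whale', 'trout']}
print(group_by("mammal"))  # {True: ['bat', 'whale'], False: ['sparrow', 'trout']}
```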
Advanced hard math for all this to follow….
Stuart has worked on further developing the orthogonality thesis, which gave rise to a paper, a non-final version of which you can see here: http://lesswrong.com/lw/cej/general_purpose_intelligence_arguing_the/
This post won't make sense if you haven't been through that.
Today we spent some time going over it, and he accepted my suggestion of a minor amendment, which best fits here.
Besides all the other awkward things that a moral convergentist would have to argue for, namely: