From John Danaher's review:
Of course, the Humean theory may be false and so Bostrom wisely avoids it in his defence of the orthogonality thesis.
I had the opposite reaction. The Humean theory of motivation is correct, and I see no reason to avoid tying the orthogonality thesis to it. To me, Bostrom's distancing of the orthogonality thesis from Humean motivation seemed like splitting hairs. Since how strong a given motivation is can only be measured relative to other motivations, Bostrom's point that an agent could have very strong motivations not arisin...
what matters for Bostrom’s definition of intelligence is whether the agent is getting what it wants
This brings up another way - comparable to the idea that complex goals may require high intelligence - in which the orthogonality thesis might be limited. I think that the very having of wants itself requires a certain amount of intelligence. Consider the animal kingdom, sphexishness, etc. To get behavior that clearly demonstrates what most people would confidently call "goals" or "wants", you have to get to animals with pretty subst...
There is more than one version of the orthogonality thesis. It is trivially false under some interpretations and trivially true under others, which matters because only some versions can be used as a stage in an argument towards Yudkowskian UFAI.
It is admitted from the outset that some versions of the OT are not logically possible, those being the ones that involve a Gödelian or Löbian contradiction.
It is also admitted that the standard OT does not deal with any dynamic or developmental aspects of agents. However, the UFAI argument is posited on agents w...
What are other examples of possible motivating beliefs? I find the examples of morals thoroughly unconvincing (as in, they actively convince me of the opposite position).
Here are a few examples I think might count. They aren't universal, but they do affect humans:
Realizing that negentropy is going to run out and the universe will end. An agent trying to maximize average-utility-over-time might treat this as a proof that the average is independent of its actions, so that it assigns a constant eventual average utility to all possible actions (meaning what it does
What cognitive skills do moral realists think you need for moral knowledge? Is it sufficient to be really good at prediction and planning?
One way intelligence and goals might be related is that the ontology an agent uses (e.g. whether it thinks of the world it deals with in terms of atoms or agents or objects) as well as the mental systems it has (e.g. whether it has true/false beliefs, or probabilistic beliefs) might change how capable it is, as well as which values it can comprehend. For instance, an agent capable of a more detailed model of the world might tend to perceive more useful ways to interact with the world, and so be more intelligent. It should also be able to represent preferences which wouldn't have made sense in a simpler model.
This section presents and explains the orthogonality thesis, but doesn't provide much argument for it. Should the proponents or critics of such a view be required to make their case?
One way intelligence and goals might be related is that the ontology an agent uses (e.g. whether it thinks of the world it deals with in terms of atoms or agents or objects) as well as the mental systems it has (e.g. whether it has true/false beliefs, or probabilistic beliefs) might change how capable it is, as well as which values it can comprehend.
I think the remarks about goals being ontologically associated are absolutely spot on. Goals, and any “values” distinguishing among the possible future goals in the agent's goal space, are built around that agent's perceived (actually, inhabited is a better word) ontology.
For example, the professional ontology of a Wall Street financial analyst includes the objects that he or she interacts with (options, stocks, futures, dividends, and the laws and infrastructure associated with the conceptual “deductive closure” of that ontology).
Clearly, “final” (teleological and moral) principles involving approach and avoidance judgments, say those concerning insider trading (and the negative practical consequences, if not the pure unethicality, of running afoul of the laws and rules of governance for trading those objects), are only defined within an ontological universe of discourse that contains those financial objects and the network of laws and valuations that define, and are defined by, those objects.
Smarter beings, or even ourselves as our culture evolves and grows more complex generation after generation, acquire new ontologies and gradually retire others. Identity theft mediated by surreptitiously seeding laptops in Starbucks with keystroke-logging viruses is “theft” and is unethical. But, trivially, in 1510 BCE the ontological stage on which this can be played out did not exist, and thus the ethical valence would have been undefined, even unintelligible.
That is why, if we can solve the friendliness problem, it will have to be by some means that gives new minds the capacity to develop robust ethical meta-intuition that can be recruited creatively, on the fly, as these beings encounter new situations that call upon them to make new ethical judgements.
I happen to be a kind of metaethical realist, just as I am something of a mathematical platonist, but in my position this is also crossed with a type of constructivist metaethics, apparently like the one John Danaher subscribes to in his blog (I followed the link and read it).
At least, his position sounds similar to mine, although the constructivist part of my theory is supplemented with a “weak” quasi-platonist thread that I am trying to derive from some more fundamental meta-ontological principles (work in progress on that).
This is part of a weekly reading group on Nick Bostrom's book, Superintelligence. For more information about the group, and an index of posts so far, see the announcement post. For the schedule of future topics, see MIRI's reading guide.
Welcome. This week we discuss the ninth section in the reading guide: The orthogonality of intelligence and goals. This corresponds to the first section in Chapter 7, 'The relation between intelligence and motivation'.
This post summarizes the section and offers a few relevant notes and ideas for further investigation. Some of my own thoughts and questions for discussion are in the comments.
There is no need to proceed in order through this post, or to look at everything. Feel free to jump straight to the discussion. Where applicable, and where I remember, page numbers indicate the rough part of the chapter that is most related (not necessarily that the chapter is being cited for the specific claim).
Reading: 'The relation between intelligence and motivation' (p105-8)
Summary
Another view
John Danaher at Philosophical Disquisitions starts a series of posts on Superintelligence with a somewhat critical evaluation of the orthogonality thesis, in the process contributing a nice summary of nearby philosophical debates. Here is an excerpt, entitled 'is the orthogonality thesis plausible?':
Notes
In-depth investigations
If you are particularly interested in these topics, and want to do further research, these are a few plausible directions, some inspired by Luke Muehlhauser's list, which contains many suggestions related to parts of Superintelligence. These projects could be attempted at various levels of depth.
How to proceed
This has been a collection of notes on the chapter. The most important part of the reading group, though, is discussion, which is in the comments section. I pose some questions for you there, and I invite you to add your own. Please remember that this group contains a variety of levels of expertise: if a line of discussion seems too basic or too incomprehensible, look around for one that suits you better!
Next week, we will talk about instrumentally convergent goals. To prepare, read 'Instrumental convergence' from Chapter 7. The discussion will go live at 6pm Pacific time next Monday November 17. Sign up to be notified here.