It seems farcical that a self-improving intelligence that's at least as smart as a human (else why would it self improve rather than let us do it) would self-improve in such a way as to change its goals.
It will not necessarily self-improve with the aim of changing its goals. Its goals will change as a side effect of its self-improvement, if only because the set of goals to consider will considerably expand.
Imagine a severely retarded human who, basically, only wants to avoid pain, eat, sleep, and masturbate. But he's sufficiently human to dimly understand that he's greatly limited in his capabilities and have a small, tiny desire to become more than what he is now. Imagine that through elven magic he gains the power to rapidly boost his intelligence to genius level. Because of his small desire to improve, he uses that power and becomes a genius.
Are you saying that, as a genius, he will still only want to avoid pain, eat, sleep, and masturbate?
His total inability to get any sort of start on achieving any of his other goals when he was retarded does not mean they weren't there. He hadn't experienced them enough to be aware of them.
Still, you managed to demolish my argument that a naive code examination (i.e. not factoring out the value system and examining it separately) would be enough to determine values - an AI (or human) could be too stupid to ever trigger some of its values!
AIs stupid enough to not realize that changing its current values will not fulfill them, will get around my argument, b...
This is part of a weekly reading group on Nick Bostrom's book, Superintelligence. For more information about the group, and an index of posts so far see the announcement post. For the schedule of future topics, see MIRI's reading guide.
Welcome. This week we discuss the ninth section in the reading guide: The orthogonality of intelligence and goals. This corresponds to the first section in Chapter 7, 'The relation between intelligence and motivation'.
This post summarizes the section, and offers a few relevant notes, and ideas for further investigation. Some of my own thoughts and questions for discussion are in the comments.
There is no need to proceed in order through this post, or to look at everything. Feel free to jump straight to the discussion. Where applicable and I remember, page numbers indicate the rough part of the chapter that is most related (not necessarily that the chapter is being cited for the specific claim).
Reading: 'The relation between intelligence and motivation' (p105-8)
Summary
Another view
John Danaher at Philosophical Disquisitions starts a series of posts on Superintelligence with a somewhat critical evaluation of the orthogonality thesis, in the process contributing a nice summary of nearby philosophical debates. Here is an excerpt, entitled 'is the orthogonality thesis plausible?':
Notes
In-depth investigations
If you are particularly interested in these topics, and want to do further research, these are a few plausible directions, some inspired by Luke Muehlhauser's list, which contains many suggestions related to parts of Superintelligence. These projects could be attempted at various levels of depth.
How to proceed
This has been a collection of notes on the chapter. The most important part of the reading group though is discussion, which is in the comments section. I pose some questions for you there, and I invite you to add your own. Please remember that this group contains a variety of levels of expertise: if a line of discussion seems too basic or too incomprehensible, look around for one that suits you better!
Next week, we will talk about instrumentally convergent goals. To prepare, read 'Instrumental convergence' from Chapter 7. The discussion will go live at 6pm Pacific time next Monday November 17. Sign up to be notified here.