I wonder (having read the book and got the t-shirt and sticker) whether it really is, generally, all so complex. A lot of the imputations are anthropomorphic. Machines are dead brains that are switched on. There is nothing else, unless you count mimicry, which might con some people some of the time. The movie 2001 was still the closest to a machine thinking along certain logical lines. As for rebelling robots and independent machine intelligences (unless through hybrid brain interfaces), I cannot foresee anything in this book being even relevant. Nice thought experiments, though. I am finished. This is it.
This is part of a weekly reading group on Nick Bostrom's book, Superintelligence. For more information about the group, and an index of posts so far, see the announcement post. For the schedule of future topics, see MIRI's reading guide.
Welcome. This week we discuss the twentieth section in the reading guide: the value-loading problem.
This post summarizes the section and offers a few relevant notes and ideas for further investigation. Some of my own thoughts and questions for discussion are in the comments.
There is no need to proceed in order through this post, or to look at everything. Feel free to jump straight to the discussion. Where applicable, and where I remember, page numbers indicate the rough part of the chapter that is most related (not necessarily that the chapter is being cited for the specific claim).
Reading: “The value-loading problem” through “Motivational scaffolding” from Chapter 12
Summary
Another view
Ernest Davis, on a 'serious flaw' in Superintelligence:
Notes
1. At the start of the chapter, Bostrom says ‘while the agent is unintelligent, it might lack the capability to understand or even represent any humanly meaningful value. Yet if we delay the procedure until the agent is superintelligent, it may be able to resist our attempt to meddle with its motivation system.’ Since presumably the AI only resists being given motivations once it is turned on and using some other motivations, you might wonder why we wouldn't just wait until we had built an AI smart enough to understand or represent human values, before we turned it on. I believe the thought here is that the AI will come to understand the world and have the concepts required to represent human values by interacting with the world for a time. So it is not so much that the AI will need to be turned on to become fundamentally smarter, but that it will need to be turned on to become more knowledgeable.
2. A discussion of Davis' response to Bostrom just started over at the Effective Altruism forum.
3. Stuart Russell thinks of value loading as an intrinsic part of AI research, in the same way that nuclear containment is an intrinsic part of modern nuclear fusion research.
4. Kaj Sotala has written about how to get an AI to learn concepts similar to those of humans, for the purpose of making safe AI which can reason about our concepts. If you had an oracle which understood human concepts, you could basically turn it into an AI which plans according to arbitrary goals you can specify in human language, because you can say 'which thing should I do to best forward [goal]?' (This is not necessarily particularly safe as it stands, but is a basic scheme for turning conceptual understanding and a motivation to answer questions into any motivation).
5. Inverse reinforcement learning and goal inference are approaches to having machines discover goals by observing actions. These could be useful for instilling our own goals into machines (as has been observed before).
6. If you are interested in whether values are really so complex, Eliezer has written about it. Toby Ord responds critically to the general view around the LessWrong community that value is extremely likely to be complex, pointing out that this thesis is closely related to anti-realism—a relatively unpopular view among academic philosophers—and so that overall people shouldn't be that confident. Lots of debate ensues.
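To make the goal inference idea in note 5 a little more concrete, here is a minimal toy sketch. It is not taken from the book or from any of the linked work. It assumes an agent walking along a number line, a small set of candidate goal positions, and a 'noisily rational' agent that prefers moves ending nearer its goal. The function names and the `beta` rationality parameter are hypothetical choices made for illustration only.

```python
import math


def action_likelihood(position, action, goal, beta=2.0):
    """P(action | position, goal) for an assumed noisily rational agent."""
    actions = (-1, +1)
    # Utility of a move: negative distance to the goal after taking it.
    utilities = {a: -abs((position + a) - goal) for a in actions}
    normalizer = sum(math.exp(beta * u) for u in utilities.values())
    return math.exp(beta * utilities[action]) / normalizer


def infer_goal(start, actions, candidate_goals, beta=2.0):
    """Posterior over candidate goals given observed moves, from a uniform prior."""
    posterior = {g: 1.0 / len(candidate_goals) for g in candidate_goals}
    position = start
    for action in actions:
        # Bayesian update: weight each goal by how well it explains this move.
        for g in candidate_goals:
            posterior[g] *= action_likelihood(position, action, g, beta)
        position += action
    total = sum(posterior.values())
    return {g: p / total for g, p in posterior.items()}


# The agent starts at 2 and steps right twice; which goal explains that best?
posterior = infer_goal(start=2, actions=[+1, +1], candidate_goals=[0, 4])
print(posterior)
```

With these assumptions, two steps to the right put nearly all the posterior weight on the goal at position 4. Real human values are of course far harder to enumerate as candidate goals than this, which is the difficulty the chapter is concerned with.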
In-depth investigations
If you are particularly interested in these topics, and want to do further research, these are a few plausible directions, some inspired by Luke Muehlhauser's list, which contains many suggestions related to parts of Superintelligence. These projects could be attempted at various levels of depth.
How to proceed
This has been a collection of notes on the chapter. The most important part of the reading group, though, is discussion, which is in the comments section. I pose some questions for you there, and I invite you to add your own. Please remember that this group contains a variety of levels of expertise: if a line of discussion seems too basic or too incomprehensible, look around for one that suits you better!
Next week, we will talk about how an AI might learn about values. To prepare, read “Value learning” from Chapter 12. The discussion will go live at 6pm Pacific time next Monday 2 February. Sign up to be notified here.