Why not copy how children learn ethical codes?
Because an AI is not a child, so doing the same thing would probably give different results.
I have read I don't know how many dozens of pages of silly stories about AIs misinterpreting human commands.
The essence of the problem is that the difference between "interpreting" and "misinterpreting" only exists in the mind of the human.
If I, as a computer programmer, tell a machine "add 10 to X" -- while I really meant "add 100 to X", but made a mistake -- and the machine adds 10 to X, would you call that "misinterpreting" my command? Such things happen every day with existing programming languages, so there is nothing strange about expecting similar things to happen in the future.
From the machine's point of view, it was asked to "add 10 to X", it added 10 to X, so it worked correctly. If the human is frustrated because that's not what they meant, that's bad for the human, but the machine worked correctly according to its inputs.
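To make this concrete, here is a minimal sketch (with hypothetical names) of the point: a command interpreter only sees the literal text of the command, not the intent behind it, so "misinterpretation" is not something it can even detect.

```python
def execute(command: str, variables: dict) -> dict:
    """Execute a tiny 'add <amount> to <var>' command literally.

    The interpreter has no access to what the user *meant* --
    only to the text it was actually given.
    """
    _, amount, _, var = command.split()
    variables[var] = variables.get(var, 0) + int(amount)
    return variables

state = {"X": 5}
# The programmer *meant* "add 100 to X", but typed 10.
execute("add 10 to X", state)
print(state["X"])  # prints 15 -- correct with respect to its inputs
```

The gap between 15 and the intended 105 exists only in the programmer's head; nothing in the machine's inputs distinguishes the two cases.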
You may be assuming a machine with a magical source of wisdom that could look at the command "add 10 to X", somehow realize that the human actually wanted to add 100, and fix its own program (unless it is passive-aggressive and decides to follow the letter of the program anyway). But that's not how machines work.
Let us try to free our minds from associating AGIs with machines. They are totally different from automata. AGIs will be creative, will learn to understand sarcasm, and will understand that in some situations women say no and mean yes.
On your command to add 10 to X, an AGI would reply: "I love working for you! At least once a day you try to fool me -- I am not asleep, and I know that +100 would be correct. Shall I add 100?"
This is part of a weekly reading group on Nick Bostrom's book, Superintelligence. For more information about the group, and an index of posts so far see the announcement post. For the schedule of future topics, see MIRI's reading guide.
Welcome. This week we discuss the eighth section in the reading guide: Cognitive Superpowers. This corresponds to Chapter 6.
This post summarizes the section and offers a few relevant notes and ideas for further investigation. Some of my own thoughts and questions for discussion are in the comments.
There is no need to proceed in order through this post, or to look at everything. Feel free to jump straight to the discussion. Where applicable (and where I remembered), page numbers indicate the rough part of the chapter that is most related (not necessarily that the chapter is being cited for the specific claim).
Reading: Chapter 6
Summary
Another view
Bostrom starts the chapter claiming that humans' dominant position comes from their slightly expanded set of cognitive functions relative to other animals. Computer scientist Ernest Davis criticizes this claim in a recent review of Superintelligence:
Notes
In-depth investigations
If you are particularly interested in these topics, and want to do further research, these are a few plausible directions, almost entirely taken from Luke Muehlhauser's list, without my looking into them further.
How to proceed
This has been a collection of notes on the chapter. The most important part of the reading group though is discussion, which is in the comments section. I pose some questions for you there, and I invite you to add your own. Please remember that this group contains a variety of levels of expertise: if a line of discussion seems too basic or too incomprehensible, look around for one that suits you better!
Next week, we will talk about the orthogonality of intelligence and goals, section 9. To prepare, read The relation between intelligence and motivation from Chapter 7. The discussion will go live at 6pm Pacific time next Monday November 10. Sign up to be notified here.