Any level of perverse instantiation in a sufficiently powerful AI is likely to lead to total UFAI; i.e. a full existential catastrophe. Either you get the AI design right so that it doesn't wirehead itself - or others, against their will - or you don't. I don't think there's much middle ground.
OTOH, the relevance of Mind Crime really depends on the volume. The FriendlyAICriticalFailureTable has this instance:
22: The AI, unknown to the programmers, had qualia during its entire childhood, and what the programmers thought of as simple negative feedback corresponded to the qualia of unbearable, unmeliorated suffering. All agents simulated by the AI in its imagination existed as real people (albeit simple ones) with their own qualia, and died when the AI stopped imagining them. The number of agents fleetingly imagined by the AI in its search for social understanding exceeds by a factor of a thousand the total number of humans who have ever lived. Aside from that, everything worked fine.
This scenario always struck me as a (qualified) FAI success. There's a cost - and it's large in absolute terms - but the benefits will outweigh it by a huge factor, and indeed by enough orders of magnitude that even a slight increase in the probability of getting pre-empted by a UFAI may be too expensive a price to pay for fixing this kind of bug.
So cases like this - where the harm continues only until the AI matures enough to see that its values make this a bad idea, and then stops - aren't as bad as an actual FAI failure.
Of course, if it's an actual problem with the AI's value content, which causes the AI to keep doing this kind of thing throughout its existence, that may well outweigh any good it ever does. The total cost in this case becomes hard to predict, depending crucially on just how many resources the AI spends on these simulations, and how nasty they are on average.
This is part of a weekly reading group on Nick Bostrom's book, Superintelligence. For more information about the group, and an index of posts so far see the announcement post. For the schedule of future topics, see MIRI's reading guide.
Welcome. This week we discuss the twelfth section in the reading guide: Malignant failure modes.
This post summarizes the section and offers a few relevant notes and ideas for further investigation. Some of my own thoughts and questions for discussion are in the comments.
There is no need to proceed in order through this post, or to look at everything. Feel free to jump straight to the discussion. Where applicable (and where I remember), page numbers indicate the rough part of the chapter that is most related (not necessarily that the chapter is being cited for the specific claim).
Reading: 'Malignant failure modes' from Chapter 8
Summary
Another view
In this chapter Bostrom discusses the difficulty he perceives in designing goals that don't lead to indefinite resource acquisition. Steven Pinker recently offered a different perspective on the inevitability of resource acquisition:
Notes
1. Perverse instantiation is a very old idea. It is what genies are most famous for. King Midas had similar problems. Apparently it was applied to AI by 1947, in With Folded Hands.
2. Adam Elga writes more on simulating people for blackmail and indexical uncertainty.
3. More directions for making AI which don't lead to infrastructure profusion:
In-depth investigations
If you are particularly interested in these topics, and want to do further research, these are a few plausible directions, some inspired by Luke Muehlhauser's list, which contains many suggestions related to parts of Superintelligence. These projects could be attempted at various levels of depth.
How to proceed
This has been a collection of notes on the chapter. The most important part of the reading group though is discussion, which is in the comments section. I pose some questions for you there, and I invite you to add your own. Please remember that this group contains a variety of levels of expertise: if a line of discussion seems too basic or too incomprehensible, look around for one that suits you better!
Next week, we will talk about capability control methods, section 13. To prepare, read "Two agency problems" and "Capability control methods" from Chapter 9. The discussion will go live at 6pm Pacific time next Monday, December 8. Sign up to be notified here.