I write fiction. I'm also interested in how AI is going to impact the world. Among other things, I'd prefer that AI not lead to catastrophe. Let's imagine that I want to combine these two interests, writing fiction that explores the risks posed by AI. How should I go about doing so? More concretely, what ideas about AI might I try to communicate via fiction?
This post is an attempt to partially answer that question. It is also an attempt to invoke Cunningham's Law: I'm sure there will be things I miss or get wrong, and I'm hoping the comments section might illuminate some of these.
Holden's Messages
A natural starting point is Holden's recent blog post, Spreading Messages to Help With the Most Important Century. Stripping out the nuances of that post, here's a list of the messages that Holden would like to see spread:
- We should worry about conflict between misaligned AI and all humans.
- AIs could behave deceptively, so “evidence of safety” might be misleading.
- AI projects should establish and demonstrate safety (and potentially comply with safety standards) before deploying powerful systems.
- Alignment research is prosocial and great.
- It might be important for companies (and other institutions) to act in unusual ways.
- We're not ready for this.
However, as interesting as this list is, it's not what I'm looking for; I'm not looking for bottom-line messages to convey. Instead, I want to identify a list of smaller ideas that will help people to reach their own bottom lines by thinking carefully through the issues. The idea of instrumental convergence might appear on such a list. The idea that alignment research is great would not.
One reason for my focus is that fiction writing is ultimately about details. Fiction might convey big messages, but it does so by exploring more specific ideas. This raises the question: which specific ideas?
Another reason for my focus is that I'm allergic to propaganda. I don't want to tell people what to think and would prefer to introduce ideas that can help people think for themselves. Of course, not all message fiction is propaganda, and I'm not accusing Holden of calling for propaganda. Still, my personal preference is to focus on how to convey the nuts and bolts needed to understand AI.[1]
What Nuts and Which Bolts?
So, with that context in hand, back to the question: what ideas about AI might someone try to convey via fiction? Here's a potential list:
- Basics of AI
- Neural networks are black boxes (though interpretability might help us to see inside).
- AI "Psychology"
- AI systems are likely to be alien in how they think. They are unlikely to think like humans.
- Orthogonality and instrumental convergence might provide insight into likely AI behaviour.
- AI systems might be agents, in some relatively natural sense. They might also simulate agents, even if they are not agents.
- Potential dangers from AI
- Outer misalignment is a potential danger, but in the context of neural networks so too is inner misalignment (related: reward misspecification and goal misgeneralisation).
- Deceptive alignment might lead to worries about a treacherous turn.
- The possibility of recursive improvement might influence views about takeoff speed (which might influence views about safety).
- Broader Context of Potential Risks
- Different challenges might arise in the case of a singleton, when compared with multipolar scenarios.
- Arms races can lead to outcomes that no-one wants.
- AI rights could be a real issue, but incorrectly attributing rights to AI could itself pose a risk (by making it harder to control AI behaviour).
So that's the list. Having seen it, one might naturally wonder why fiction is the right medium to communicate ideas like this. Part of the answer is that I think it's useful to explore ideas from many angles.
Another part of the answer is that conveying an idea is one thing but conveying an intuition is another. Humans are used to modelling other humans, and so it is likely that we'll anthropomorphise when considering AI. Fiction might help with this. It's one thing to state in factual tones that AI systems are likely to have an alien psychology. It's quite another to be shown a world in which humans come up against the alien.
So why communicate the ideas? Because it's plausibly good that those working on AI capabilities, those working on AI safety, and people more broadly are able to reflect on the implications of AI and can understand why many are concerned about it. And why fiction? In part, because an intuitive grasp can be as important as a grasp of facts.
AI Fables
I started this post with a hypothetical, imagining that I wanted to write fiction that explores AI risk. In reality, I doubt that I'll find a great deal of time to do so. Still, I'd be excited to see other people writing fiction of this sort.
Here's one genre of story I'd be interested to see more of: AI fables. Fables are short stories, with a particular aesthetic sensibility, that convey a lesson.
While I enjoy the aesthetic of fables, I wouldn't want to narrow the focus too much: I'd love to see more short stories of any kind, of the sort that could be read around a fire on a winter's night, that communicate a brief lesson about AI.
For example, stories of djinni and golems can be used to communicate the problem of outer misalignment; even if something does precisely what we tell it to, it can be hard to ensure that it does what we actually want it to. I'd love to see a fable that likewise communicated the problem of inner misalignment. I'd love to see a wide variety of such fables, exploring a range of ideas about AI, and maybe even a collection putting them in one place.
If you know of such a story, please link it in the comments. If you write such a story, please link it. And if you have thoughts or additions for the list of ideas in the post, I'd love to hear these.
The ideas in this post were developed in discussion with Elizabeth Garrett and Damon Sasi. Thanks also to Conor Barnes for feedback.
[1]
I'm also not confident in the bottom lines; I retain substantial uncertainty about how likely AI is to lead to extinction or something equally bad (as opposed to more mundane, but still awful, catastrophe). However, I feel far more confident that there is insight to be gleaned from reflection on the various concepts and ideas underlying the case for AI risk. So this is where I focus.
There was a worldbuilding contest last year for writing short stories featuring AGI with positive outcomes. You may be interested in it, although it's undoubtedly propaganda of some sort.
These are not fables, so I apologize for that. However, I've written many short stories (though not always obviously) about alignment and related topics. The Well of Cathedral is about trying to contain a threat that grows in power exponentially, Waste Heat is about unilateral action to head off a catastrophe causing its own catastrophe, and Flourishing is a romance between a human and an AI, but also about how AIs don't think like humans at all.
More than half my works are inadvertently about AI or alignment in some way or another... Dais 11, Dangerous Thoughts, The Only Thing that Proves You, and I Will Inform Them probably also count, as does I See the Teeth (though only tangentially, at the end) and Zamamiro (although that one's quality is notably poor).
I guess what I'm saying is, if there's ever a competition I'll probably write an entry, otherwise please check out my AO3 links above.
Flourishing is a fantastic story and definitely left me wanting more. I would have enjoyed a 5, 10, 20 year fast-forward approach to explore their long-term relationship. We've seen many stories of AI companions that highlight the beginnings of the relationship, but it would be fun to see how their domestic life goes: interactions with friends and family and other companions, and growing old together. How would they, for example, deal with optional upgrades over time? Or a recall many years later? There are endless fascinating possibilities. The clash of human thinking with AI thinking is so entertaining; some truly impressive writing. Thanks for recommending, I'll definitely check out your other stories as well.