[Many people have been complaining about the lack of new content on LessWrong lately, so I thought I'd cross-post my latest blog post here in discussion. Feel free to critique the content as much as you like, but please do keep in mind that I wrote this for my personal blog and not with LW in mind specifically, so some parts might not be up to LW standards, whereas others might be obvious to everyone here. In other words...well, be gentle]
You know what’s scarier than having enemy soldiers at your border?
Having sleeper agents within your borders.
Enemy soldiers are malevolent, but they are at least visibly malevolent. You can see what they’re doing; you can fight back against them or set up defenses to stop them. Sleeper agents on the other hand are malevolent and invisible. They are a threat and you don’t know that they’re a threat. So when a sleeper agent decides that it’s time to wake up and smell the gunpowder, not only will you be unable to stop them, but they’ll be in a position to do far more damage than a lone soldier ever could. A single well-placed sleeper agent can take down an entire power grid, or bring a key supply route to a grinding halt, or – in the worst case – kill thousands with an act of terrorism, all without the slightest warning.
Okay, so imagine that your country is in wartime, and that a small group of vigilant citizens has uncovered an enemy sleeper cell in your city. They’ve shown you convincing evidence for the existence of the cell, and demonstrated that the cell is actively planning to commit some large-scale act of violence – perhaps not imminently, but certainly in the near-to-mid-future. Worse, the cell seems to have even more nefarious plots in the offing, possibly involving nuclear or biological weapons.
Now imagine that when you go to investigate further, you find to your surprise and frustration that no one seems to be particularly concerned about any of this. Oh sure, they acknowledge that in theory a sleeper cell could do some damage, and that the whole matter is probably worthy of further study. But by and large they just hear you out and then shrug and go about their day. And when you, alarmed, point out that this is not just a theory – that you have proof that a real sleeper cell is actually operating and making plans right now – they still remain remarkably blase. You show them the evidence, but they either don’t find it convincing, or simply misunderstand it at a very basic level (“A wiretap? But sleeper agents use cellphones, and cellphones are wireless!”). Some people listen but dismiss the idea out of hand, claiming that sleeper cell attacks are “something that only happen in the movies”. Strangest of all, at least to your mind, are the people who acknowledge that the evidence is convincing, but say they still aren’t concerned because the cell isn’t planning to commit any acts of violence imminently, and therefore won’t be a threat for a while. In the end, all of your attempts to raise the alarm are to no avail, and you’re left feeling kind of doubly scared – scared first because you know the sleeper cell is out there, plotting some heinous act, and scared second because you know you won’t be able to convince anyone of that fact before it’s too late to do anything about it.
This is roughly how I feel about AI risk.
You see, I think artificial intelligence is probably the most significant existential threat facing humanity right now. This, to put it mildly, is something of a fringe position in most intellectual circles (although that’s becoming less and less true as time goes on), and I’ll grant that it sounds kind of absurd. But regardless of whether or not you think I’m right to be scared of AI, you can imagine how the fact that AI risk is really hard to explain would make me even more scared about it. Threats like nuclear war or an asteroid impact, while terrifying, at least have the virtue of being simple to understand – it’s not exactly hard to sell people on the notion that a 2km hunk of rock colliding with the planet might be a bad thing. As a result people are aware of these threats and take them (sort of) seriously, and various organizations are (sort of) taking steps to stop them.
AI is different, though. AI is more like the sleeper agents I described above – frighteningly invisible. The idea that AI could be a significant risk is not really on many people’s radar at the moment, and worse, it’s an idea that resists attempts to put it on more people’s radar, because it’s so bloody confusing a topic even at the best of times. Our civilization is effectively blind to this threat, and meanwhile AI research is making progress all the time. We’re on the Titanic steaming through the North Atlantic, unaware that there’s an iceberg out there with our name on it – and the captain is ordering full-speed ahead.
(That’s right, not one but two ominous metaphors. Can you see that I’m serious?)
But I’m getting ahead of myself. I should probably back up a bit and explain where I’m coming from.
Artificial intelligence has been in the news lately. In particular, various big names like Elon Musk, Bill Gates, and Stephen Hawking have all been sounding the alarm in regards to AI, describing it as the greatest threat that our species faces in the 21st century. They (and others) think it could spell the end of humanity – Musk said, “If I had to guess what our biggest existential threat is, it’s probably [AI]”, and Gates said, “I…don’t understand why some people are not concerned [about AI]”.
Of course, others are not so convinced – machine learning expert Andrew Ng said that “I don’t work on not turning AI evil today for the same reason I don’t worry about the problem of overpopulation on the planet Mars”.
In this case I happen to agree with the Musks and Gates of the world – I think AI is a tremendous threat that we need to focus much of our attention on it in the future. In fact I’ve thought this for several years, and I’m kind of glad that the big-name intellectuals are finally catching up.
Why do I think this? Well, that’s a complicated subject. It’s a topic I could probably spend a dozen blog posts on and still not get to the bottom of. And maybe I should spend those dozen-or-so blog posts on it at some point – it could be worth it. But for now I’m kind of left with this big inferential gap that I can’t easily cross. It would take a lot of explaining to explain my position in detail. So instead of talking about AI risk per se in this post, I thought I’d go off in a more meta-direction – as I so often do – and talk about philosophical differences in general. I figured if I couldn’t make the case for AI being a threat, I could at least make the case for making the case for AI being a threat.
(If you’re still confused, and still wondering what the whole deal is with this AI risk thing, you can read a not-too-terrible popular introduction to the subject here, or check out Nick Bostrom’s TED Talk on the topic. Bostrom also has a bestselling book out called Superintelligence. The one sentence summary of the problem would be: how do we get a superintelligent entity to want what we want it to want?)
(Trust me, this is much much harder than it sounds)
So: why then am I so meta-concerned about AI risk? After all, based on the previous couple paragraphs it seems like the topic actually has pretty decent awareness: there are popular internet articles and TED talks and celebrity intellectual endorsements and even bestselling books! And it’s true, there’s no doubt that a ton of progress has been made lately. But we still have a very long way to go. If you had seen the same number of online discussions about AI that I’ve seen, you might share my despair. Such discussions are filled with replies that betray a fundamental misunderstanding of the problem at a very basic level. I constantly see people saying things like “Won’t the AI just figure out what we want?”, or “If the AI gets dangerous why can’t we just unplug it?”, or “The AI can’t have free will like humans, it just follows its programming”, or “lol so you’re scared of Skynet?”, or “Why not just program it to maximize happiness?”.
Having read a lot about AI, these misunderstandings are frustrating to me. This is not that unusual, of course – pretty much any complex topic is going to have people misunderstanding it, and misunderstandings often frustrate me. But there is something unique about the confusions that surround AI, and that’s the extent to which the confusions are philosophical in nature.
Why philosophical? Well, artificial intelligence and philosophy might seem very distinct at first glance, but look closer and you’ll see that they’re connected to one another at a very deep level. Take almost any topic of interest to philosophers – free will, consciousness, epistemology, decision theory, metaethics – and you’ll find an AI researcher looking into the same questions. In fact I would go further and say that those AI researchers are usually doing a better job of approaching the questions. Daniel Dennet said that “AI makes philosophy honest”, and I think there’s a lot of truth to that idea. You can’t write fuzzy, ill-defined concepts into computer code. Thinking in terms of having to program something that actually works takes your head out of the philosophical clouds, and puts you in a mindset of actually answering questions.
All of which is well and good. But the problem with looking at philosophy through the lens of AI is that it’s a two-way street – it means that when you try to introduce someone to the concepts of AI and AI risk, they’re going to be hauling all of their philosophical baggage along with them.
And make no mistake, there’s a lot of baggage. Philosophy is a discipline that’s notorious for many things, but probably first among them is a lack of consensus (I wouldn’t be surprised if there’s not even a consensus among philosophers about how much consensus there is among philosophers). And the result of this lack of consensus has been a kind of grab-bag approach to philosophy among the general public – people see that even the experts are divided, and think that that means they can just choose whatever philosophical position they want.
Want. That’s the key word here. People treat philosophical beliefs not as things that are either true or false, but as choices – things to be selected based on their personal preferences, like picking out a new set of curtains. They say “I prefer to believe in a soul”, or “I don’t like the idea that we’re all just atoms moving around”. And why shouldn’t they say things like that? There’s no one to contradict them, no philosopher out there who can say “actually, we settled this question a while ago and here’s the answer”, because philosophy doesn’t settle things. It’s just not set up to do that. Of course, to be fair people seem to treat a lot of their non-philosophical beliefs as choices as well (which frustrates me to no end) but the problem is particularly pronounced in philosophy. And the result is that people wind up running around with a lot of bad philosophy in their heads.
(Oh, and if that last sentence bothered you, if you’d rather I said something less judgmental like “philosophy I disagree with” or “philosophy I don’t personally happen to hold”, well – the notion that there’s no such thing as bad philosophy is exactly the kind of bad philosophy I’m talking about)
(he said, only 80% seriously)
Anyway, I find this whole situation pretty concerning. Because if you had said to me that in order to convince people of the significance of the AI threat, all we had to do was explain to them some science, I would say: no problem. We can do that. Our society has gotten pretty good at explaining science; so far the Great Didactic Project has been far more successful than it had any right to be. We may not have gotten explaining science down to a science, but we’re at least making progress. I myself have been known to explain scientific concepts to people every now and again, and fancy myself not half-bad at it.
Philosophy, though? Different story. Explaining philosophy is really, really hard. It’s hard enough that when I encounter someone who has philosophical views I consider to be utterly wrong or deeply confused, I usually don’t even bother trying to explain myself – even if it’s someone I otherwise have a great deal of respect for! Instead I just disengage from the conversation. The times I’ve done otherwise, with a few notable exceptions, have only ended in frustration – there’s just too much of a gap to cross in one conversation. And up until now that hasn’t really bothered me. After all, if we’re being honest, most philosophical views that people hold aren’t that important in grand scheme of things. People don’t really use their philosophical views to inform their actions – in fact, probably the main thing that people use philosophy for is to sound impressive at parties.
AI risk, though, has impressed upon me an urgency in regards to philosophy that I’ve never felt before. All of a sudden it’s important that everyone have sensible notions of free will or consciousness; all of a sudden I can’t let people get away with being utterly confused about metaethics.
All of a sudden, in other words, philosophy matters.
I’m not sure what to do about this. I mean, I guess I could just quit complaining, buckle down, and do the hard work of getting better at explaining philosophy. It’s difficult, sure, but it’s not infinitely difficult. I could write blogs posts and talk to people at parties, and see what works and what doesn’t, and maybe gradually start changing a few people’s minds. But this would be a long and difficult process, and in the end I’d probably only be able to affect – what, a few dozen people? A hundred?
And it would be frustrating. Arguments about philosophy are so hard precisely because the questions being debated are foundational. Philosophical beliefs form the bedrock upon which all other beliefs are built; they are the premises from which all arguments start. As such it’s hard enough to even notice that they’re there, let alone begin to question them. And when you do notice them, they often seem too self-evident to be worth stating.
Take math, for example – do you think the number 5 exists, as a number?
Yes? Okay, how about 700? 3 billion? Do you think it’s obvious that numbers just keep existing, even when they get really big?
Well, guess what – some philosophers debate this!
It’s actually surprisingly hard to find an uncontroversial position in philosophy. Pretty much everything is debated. And of course this usually doesn’t matter – you don’t need philosophy to fill out a tax return or drive the kids to school, after all. But when you hold some foundational beliefs that seem self-evident, and you’re in a discussion with someone else who holds different foundational beliefs, which they also think are self-evident, problems start to arise. Philosophical debates usually consist of little more than two people talking past one another, with each wondering how the other could be so stupid as to not understand the sheer obviousness of what they’re saying. And the annoying this is, both participants are correct – in their own framework, their positions probably are obvious. The problem is, we don’t all share the same framework, and in a setting like that frustration is the default, not the exception.
This is not to say that all efforts to discuss philosophy are doomed, of course. People do sometimes have productive philosophical discussions, and the odd person even manages to change their mind, occasionally. But to do this takes a lot of effort. And when I say a lot of effort, I mean a lot of effort. To make progress philosophically you have to be willing to adopt a kind of extreme epistemic humility, where your intuitions count for very little. In fact, far from treating your intuitions as unquestionable givens, as most people do, you need to be treating them as things to be carefully examined and scrutinized with acute skepticism and even wariness. Your reaction to someone having a differing intuition from you should not be “I’m right and they’re wrong”, but rather “Huh, where does my intuition come from? Is it just a featureless feeling or can I break it down further and explain it to other people? Does it accord with my other intuitions? Why does person X have a different intuition, anyway?” And most importantly, you should be asking “Do I endorse or reject this intuition?”. In fact, you could probably say that the whole history of philosophy has been little more than an attempt by people to attain reflective equilibrium among their different intuitions – which of course can’t happen without the willingness to discard certain intuitions along the way when they conflict with others.
I guess what I’m trying to say is: when you’re discussing philosophy with someone and you have a disagreement, your foremost goal should be to try to find out exactly where your intuitions differ. And once you identify that, from there the immediate next step should be to zoom in on your intuitions – to figure out the source and content of the intuition as much as possible. Intuitions aren’t blank structureless feelings, as much as it might seem like they are. With enough introspection intuitions can be explicated and elucidated upon, and described in some detail. They can even be passed on to other people, assuming at least some kind of basic common epistemological framework, which I do think all humans share (yes, even objective-reality-denying postmodernists).
Anyway, this whole concept of zooming in on intuitions seems like an important one to me, and one that hasn’t been emphasized enough in the intellectual circles I travel in. When someone doesn’t agree with some basic foundational belief that you have, you can’t just throw up your hands in despair – you have to persevere and figure out why they don’t agree. And this takes effort, which most people aren’t willing to expend when they already see their debate opponent as someone who’s being willfully stupid anyway. But – needless to say – no one thinks of their positions as being a result of willful stupidity. Pretty much everyone holds beliefs that seem obvious within the framework of their own worldview. So if you want to change someone’s mind with respect to some philosophical question or another, you’re going to have to dig deep and engage with their worldview. And this is a difficult thing to do.
Hence, the philosophical quagmire that we find our society to be in.
It strikes me that improving our ability to explain and discuss philosophy amongst one another should be of paramount importance to most intellectually serious people. This applies to AI risk, of course, but also to many everyday topics that we all discuss: feminism, geopolitics, environmentalism, what have you – pretty much everything we talk about grounds out to philosophy eventually, if you go deep enough or meta enough. And to the extent that we can’t discuss philosophy productively right now, we can’t make progress on many of these important issues.
I think philosophers should – to some extent – be ashamed of the state of their field right now. When you compare philosophy to science it’s clear that science has made great strides in explaining the contents of its findings to the general public, whereas philosophy has not. Philosophers seem to treat their field as being almost inconsequential, as if whatever they conclude at some level won’t matter. But this clearly isn’t true – we need vastly improved discussion norms when it comes to philosophy, and we need far greater effort on the part of philosophers when it comes to explaining philosophy, and we need these things right now. Regardless of what you think about AI, the 21st century will clearly be fraught with difficult philosophical problems – from genetic engineering to the ethical treatment of animals to the problem of what to do with global poverty, it’s obvious that we will soon need philosophical answers, not just philosophical questions. Improvements in technology mean improvements in capability, and that means that things which were once merely thought experiments will be lifted into the realm of real experiments.
I think the problem that humanity faces in the 21st century is an unprecedented one. We’re faced with the task of actually solving philosophy, not just doing philosophy. And if I’m right about AI, then we have exactly one try to get it right. If we don’t, well..
Well, then the fate of humanity may literally hang in the balance.