Will the world's elites navigate the creation of AI just fine?
One open question in AI risk strategy is: Can we trust the world's elite decision-makers (hereafter "elites") to navigate the creation of human-level AI (and beyond) just fine, without the kinds of special efforts that e.g. Bostrom and Yudkowsky think are needed?
Some reasons for concern include:
- Otherwise smart people say unreasonable things about AI safety.
- Many people who believed AI was around the corner didn't take safety very seriously.
- Elites have failed to navigate many important issues wisely (2008 financial crisis, climate change, Iraq War, etc.), for a variety of reasons.
- AI may arrive rather suddenly, leaving little time for preparation.
But if you were trying to argue for hope, you might argue along these lines (presented for the sake of argument; I don't actually endorse this argument):
- If AI is preceded by visible signals, elites are likely to take safety measures. Effective measures were taken to address asteroid risk. Large resources are devoted to mitigating climate change risks. Personal and tribal selfishness align with AI risk-reduction in a way they may not align on climate change. Availability of information is increasing over time.
- AI is likely to be preceded by visible signals. Conceptual insights often take years of incremental tweaking. In vision, speech, games, compression, robotics, and other fields, performance curves are mostly smooth. "Human-level performance at X" benchmarks influence perceptions and should be more exhaustive and come more rapidly as AI approaches. Recursive self-improvement capabilities could be charted, and are likely to be AI-complete. If AI succeeds, it will likely succeed for reasons comprehensible by the AI researchers of the time.
- Therefore, safety measures will likely be taken.
- If safety measures are taken, then elites will navigate the creation of AI just fine. Corporate and government leaders can use simple heuristics (e.g. Nobel prizes) to access the upper end of expert opinion. AI designs with an easily tailored tendency to act may be the easiest to build. The use of early AIs to solve AI safety problems creates an attractor for "safe, powerful AI." Arms races are not insurmountable.
The basic structure of this 'argument for hope' is due to Carl Shulman, though he doesn't necessarily endorse the details. (Also, it's just a rough argument, and as stated is not deductively valid.)
Personally, I am not very comforted by this argument because:
- Elites often fail to take effective action despite plenty of warning.
- I think there's a >10% chance AI will not be preceded by visible signals.
- I think the elites' safety measures will likely be insufficient.
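One rough way to see why the chain gives limited comfort is to multiply the steps together. The sketch below is only illustrative: the function name is hypothetical, and the only number taken from this post is the cap of 0.9 on visible signals (from the ">10% chance" estimate above). The other two probabilities are placeholder assumptions, and the product covers only the route through this particular argument, ignoring any other way things could turn out fine.

```python
def chance_via_hopeful_chain(p_signals, p_measures_given_signals, p_fine_given_measures):
    """Probability that all three steps of the 'argument for hope' hold together.

    This is a simple product of conditional probabilities, so it only counts
    the path: visible signals -> safety measures taken -> measures suffice.
    """
    return p_signals * p_measures_given_signals * p_fine_given_measures


# 0.9 is the upper bound implied by ">10% chance of no visible signals".
# 0.8 and 0.8 are hypothetical placeholders, not estimates from the post.
estimate = chance_via_hopeful_chain(0.9, 0.8, 0.8)
print(round(estimate, 3))
```

Even with fairly generous placeholder values for the last two steps, the conjunction ends up well below any single step, which is the general point about chained arguments rather than a claim about the true numbers.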
Obviously, there's a lot more for me to spell out here, and some of it may be unclear. The reason I'm posting these thoughts in such a rough state is so that MIRI can get some help on our research into this question.
In particular, I'd like to know:
- Which historical events are analogous to AI risk in some important ways? Possibilities include: nuclear weapons, climate change, recombinant DNA, nanotechnology, chlorofluorocarbons, asteroids, cyberterrorism, Spanish flu, the 2008 financial crisis, and large wars.
- What are some good resources (e.g. books) for investigating the relevance of these analogies to AI risk (for the purposes of illuminating elites' likely response to AI risk)?
- What are some good studies on elites' decision-making abilities in general?
- Has the increasing availability of information in the past century noticeably improved elite decision-making?
Comments (266)
What does RSI stand for?
"recursive self improvement".
Okay, I've now spelled this out in the OP.
Lately I've been listening to audiobooks (at 2x speed) in my down time, especially ones that seem likely to have passages relevant to the question of how well policy-makers will deal with AGI, basically continuing this project but only doing the "collection" stage, not the "analysis" stage.
I'll post quotes from the audiobooks I listen to as replies to this comment.
From Watts' Everything is Obvious:
More (#1) from Everything is Obvious:
More (#2) from Everything is Obvious:
More (#4) from Everything is Obvious:
More (#3) from Everything is Obvious:
From Caplan's The Myth of the Rational Voter:
More (#2) from The Myth of the Rational Voter:
This is an absurdly narrow definition of self-interest. Many people who are not old have parents who are senior citizens. Men have wives, sisters, and daughters whose well-being is important to them. Etc. Self-interest != solipsistic egoism.
More (#1) from The Myth of the Rational Voter:
And:
More (#3) from The Myth of the Rational Voter:
One quote from Taleb's AntiFragile is here, and here's another:
AntiFragile makes lots of interesting points, but it's clear in some cases that Taleb is running roughshod over the truth in order to support his preferred view. I've italicized the particularly lame part:
From Rhodes' Arsenals of Folly:
More (#3) from Arsenals of Folly:
And:
And:
And:
And:
More (#2) from Arsenals of Folly:
And:
And, a blockquote from the writings of Robert Gates:
More (#1) from Arsenals of Folly:
And:
And:
And:
More (#4) from Arsenals of Folly:
From Feynman's Surely You're Joking, Mr. Feynman:
More (#1) from Surely You're Joking, Mr. Feynman:
And:
And:
From Ariely's The Honest Truth about Dishonesty:
More (#1) from Ariely's The Honest Truth about Dishonesty:
And:
More (#2) from Ariely's The Honest Truth about Dishonesty:
And:
From Think Like a Freak:
More (#1) from Think Like a Freak:
And:
From Rhodes' Twilight of the Bombs:
More (#1) from Twilight of the Bombs:
And:
And:
And:
And:
From Pentland's Social Physics:
More (#2) from Social Physics:
And:
More (#1) from Social Physics:
And:
And:
From Harford's The Undercover Economist Strikes Back:
And:
More (#2) from The Undercover Economist Strikes Back:
And:
And:
And:
More (#1) from The Undercover Economist Strikes Back:
And:
From de Mesquita and Smith's The Dictator's Handbook:
More (#2) from The Dictator's Handbook:
And:
More (#1) from The Dictator's Handbook:
From Ferguson's The Ascent of Money:
More (#1) from The Ascent of Money:
And:
The Medici Bank is pretty interesting. A while ago I wrote https://en.wikipedia.org/wiki/Medici_Bank on the topic; LWers might find it interesting how international finance worked back then.
Passage from Patterson's Dark Pools: The Rise of the Machine Traders and the Rigging of the U.S. Stock Market:
But it proved all too easy: The very first tape Wang played revealed two dealers fixing prices.
From Richard Rhodes' The Making of the Atomic Bomb:
More (#2) from The Making of the Atomic Bomb:
After Alexander Sachs paraphrased the Einstein-Szilard letter to Roosevelt, Roosevelt demanded action, and Edwin Watson set up a meeting with representatives from the Bureau of Standards, the Army, and the Navy...
Upon asking for some money to conduct the relevant experiments, the Army representative launched into a tirade:
More (#3) from The Making of the Atomic Bomb:
Frisch and Peierls wrote a two-part report of their findings:
More (#1) from The Making of the Atomic Bomb:
On the origins of the Einstein–Szilárd letter:
And:
Some relevant quotes from Schlosser's Command and Control: Nuclear Weapons, the Damascus Accident, and the Illusion of Safety:
And:
More from Command and Control:
And:
More (#3) from Command and Control:
And:
And:
And:
More (#2) from Command and Control:
And:
And:
There was so much worth quoting from Better Angels of Our Nature that I couldn't keep up. I'll share a few quotes anyway.
More (#3) from Better Angels of Our Nature:
Further reading on integrative complexity:
Wikipedia Psychlopedia Google book
Now that I've been introduced to the concept, I want to evaluate how useful it is to incorporate into my rhetorical repertoire and vocabulary. And, to determine whether it can inform my beliefs about assessing the exfoliating intelligence of others (a term I'll coin to refer to that intelligence/knowledge which another can pass on to me to aid my vocabulary and verbal abstract reasoning - my neuropsychological strengths which I try to max out just like an RPG character).
At a less meta level, knowing the strengths and weaknesses of the trait will inform whether I choose to signal it or dampen it from here on, and in what situations. It is important for imitators to remember that whatever IC is associated with does not necessarily imply those associations to lay others.
strengths
As listed in Psychlopedia:
weaknesses
based on psychlopedia:
seem antagonistic and even narcissistic based on the wiki article:
dependence (more likely to defer to others)
Upon reflection, here are my conclusions:
More (#4) from Better Angels of Our Nature:
Untrue unless you're in a non-sequential game
True under a utilitarian framework and with a few common mind-theoretic assumptions derived from intuitions stemming from most people's empathy
Woo
More (#2) from Better Angels of Our Nature:
More (#1) from Better Angels of Our Nature:
From Lewis' Flash Boys:
So Spivey began digging the line, keeping it secret for 2 years. He didn't start trying to sell the line to banks and traders until a couple months before the line was complete. And then:
More (#1) from Flash Boys:
And:
And:
Do you keep a list of the audiobooks you liked anywhere? I'd love to take a peek.
Okay. In this comment I'll keep an updated list of audiobooks I've heard since Sept. 2013, for those who are interested. All audiobooks are available via iTunes/Audible unless otherwise noted.
Outstanding:
* Tetlock, Expert Political Judgment
* Pinker, The Better Angels of Our Nature (my clips)
* Schlosser, Command and Control (my clips)
* Yergin, The Quest (my clips)
* Osnos, Age of Ambition (my clips)
Worthwhile if you care about the subject matter:
* Singer, Wired for War (my clips)
* Feinstein, The Shadow World (my clips)
* Venter, Life at the Speed of Light (my clips)
* Rhodes, Arsenals of Folly (my clips)
* Weiner, Enemies: A History of the FBI (my clips)
* Rhodes, The Making of the Atomic Bomb (available here) (my clips)
* Gleick, Chaos (my clips)
* Weiner, Legacy of Ashes: The History of the CIA (my clips)
* Freese, Coal: A Human History (my clips)
* Aid, The Secret Sentry (my clips)
* Scahill, Dirty Wars (my clips)
* Patterson, Dark Pools (my clips)
* Lieberman, The Story of the Human Body
* Pentland, Social Physics (my clips)
* Okasha, Philosophy of Science: VSI
* Mazzetti, The Way of the Knife (my clips)
* Ferguson, The Ascent of Money (my clips)
* Lewis, The Big Short (my clips)
* de Mesquita & Smith, The Dictator's Handbook (my clips)
* Sunstein, Worst-Case Scenarios (available here) (my clips)
* Johnson, Where Good Ideas Come From (my clips)
* Harford, The Undercover Economist Strikes Back (my clips)
* Caplan, The Myth of the Rational Voter (my clips)
* Hawkins & Blakeslee, On Intelligence
* Gleick, The Information (my clips)
* Gleick, Isaac Newton
* Greene, Moral Tribes
* Feynman, Surely You're Joking, Mr. Feynman! (my clips)
* Sabin, The Bet (my clips)
* Watts, Everything Is Obvious: Once You Know the Answer (my clips)
* Greenblatt, The Swerve: How the World Became Modern (my clips)
* Cain, Quiet: The Power of Introverts in a World That Can't Stop Talking
* Dennett, Freedom Evolves
* Kaufman, The First 20 Hours
* Gertner, The Idea Factory (my clips)
* Olen, Pound Foolish
* McArdle, The Up Side of Down
* Rhodes, Twilight of the Bombs (my clips)
* Isaacson, Steve Jobs (my clips)
* Priest & Arkin, Top Secret America (my clips)
* Ayres, Super Crunchers (my clips)
* Lewis, Flash Boys (my clips)
* Dartnell, The Knowledge (my clips)
* Cowen, The Great Stagnation
* Lewis, The New New Thing (my clips)
* McCray, The Visioneers (my clips)
* Jackall, Moral Mazes (my clips)
* Langewiesche, The Atomic Bazaar
* Ariely, The Honest Truth about Dishonesty (my clips)
A process for turning ebooks into audiobooks for personal use, at least on Mac:
To de-DRM your Audible audiobooks, just use Tune4Mac.
VoiceDream for iPhone does a very fine job of text-to-speech; it also syncs your pocket bookmarks and can read epub files.
Other:
* Roose, Young Money. Too focused on a few individuals for my taste, but still has some interesting content. (my clips)
* Hofstadter & Sander, Surfaces and Essences. Probably a fine book, but I was only interested enough to read the first and last chapters.
* Taleb, AntiFragile. Learned some from it, but it's kinda wrong much of the time. (my clips)
* Acemoglu & Robinson, Why Nations Fail. Lots of handy examples, but too much of "our simple theory explains everything." (my clips)
* Byrne, The Many Worlds of Hugh Everett III (available here). Gave up on it; too much theory, not enough story. (my clips)
* Drexler, Radical Abundance. Gave up on it; too sanitized and basic.
* Mukherjee, The Emperor of All Maladies. Gave up on it; too slow in pace and flowery in language for me.
* Fukuyama, The Origins of Political Order. Gave up on it; the author is more keen on name-dropping theorists than on tracking down data.
* Friedman, The Moral Consequences of Economic Growth (available here). Gave up on it. There are some actual data in chs. 5-7, but the argument is too weak and unclear for my taste.
* Tuchman, The Proud Tower. Gave up on it after a couple chapters. Nothing wrong with it, it just wasn't dense enough in the kind of learning I'm trying to do.
* Foer, Eating Animals. I listened to this not to learn, but to shift my emotions. But it was too slow-moving, so I didn't finish it.
* Caro, The Power Broker. This might end up under "outstanding" if I ever finish it. For now, I've put this one on hold because it's very long and not as highly targeted at the useful learning I want to be doing right now as some other books.
* Rutherfurd, Sarum. This is the furthest I've gotten into any fiction book for the past 5 years at least, including HPMoR. I think it's giving my system 1 an education into what life was like in the historical eras it covers, without getting bogged down in deep characterization, complex plotting, or ornate environmental description. But I've put it on hold for now because it is incredibly long.
* Diamond, Collapse. I listened to several chapters, but it seemed to be mostly about environmental decline, which doesn't interest me much, so I stopped listening.
* Bowler & Morus, Making Modern Science (available here) (my clips). A decent history of modern science but not focused enough on what I wanted to learn, so I gave up.
* Brynjolfsson & McAfee, The Second Machine Age (my clips). Their earlier, shorter Race Against the Machine contained the core arguments; this book expands the material in order to explain things to a lay audience. As with Why Nations Fail, I have too many quibbles with this book's argument to put this book in the 'Liked' category.
* Clery, A Piece of the Sun. Nothing wrong with it, I just wasn't learning the type of things I was hoping to learn, so I stopped about half way through.
* Schuman, The Miracle. Fairly interesting, but not quite dense enough in the kind of stuff I'm hoping to learn these days.
* Conway & Oreskes, Merchants of Doubt. Fairly interesting, but not dense enough in the kind of things I'm hoping to learn.
* Horowitz, The Hard Thing About Hard Things
* Wessel, Red Ink
* Levitt & Dubner, Think Like a Freak (my clips)
* Gladwell, David and Goliath (my clips)
Could you say a bit about your audiobook selection process?
When I was just starting out in September 2013, I realized that vanishingly few of the books I wanted to read were available as audiobooks, so it didn't make sense for me to search Audible for titles I wanted to read: the answer was basically always "no." So instead I browsed through the top 2000 best-selling unabridged non-fiction audiobooks on Audible, added a bunch of stuff to my wishlist, and then scrolled through the wishlist later and purchased the ones I most wanted to listen to.
These days, I have a better sense of what kind of books have a good chance of being recorded as audiobooks, so I sometimes do search for specific titles on Audible.
Some books that I really wanted to listen to are available in ebook but not audiobook, so I used this process to turn them into audiobooks. That only barely works, sometimes. I have to play text-to-speech audiobooks at a lower speed to understand them, and it's harder for my brain to stay engaged as I'm listening, especially when I'm tired. I might give up on that process, I'm not sure.
Most but not all of the books are selected because I expect them to have lots of case studies in "how the world works," specifically with regard to policy-making, power relations, scientific research, and technological development. This is definitely true for e.g. Command and Control, The Quest, Wired for War, Life at the Speed of Light, Enemies, The Making of the Atomic Bomb, Chaos, Legacy of Ashes, Coal, The Secret Sentry, Dirty Wars, The Way of the Knife, The Big Short, Worst-Case Scenarios, The Information, and The Idea Factory.
I definitely found out something similar. I've come to believe that most 'popular science', 'popular history', etc. books are on Audible, but almost anything with equations or code is not.
The Great Courses have been quite fantastic for me for learning about the social sciences. I found out about those recently.
Occasionally I try podcasts for very niche topics (recent Rails updates, for instance), but have found them to be rather uninteresting in comparison to full books and courses.
Thanks!
From Poor Economics:
From The Visioneers:
And:
And:
And:
From Priest & Arkin's Top Secret America:
More (#2) from Top Secret America:
And, on JSOC:
And:
And:
More (#1) from Top Secret America:
And:
From Scahill's Dirty Wars:
More (#2) from Dirty Wars:
And:
And:
More (#1) from Dirty Wars:
And:
And:
And:
Foreign fighters show up everywhere. And now there's the whole Islamic State issue. Perhaps all the world needs is more foreign legions doing good things. The FFL is overrecruited, after all. Heck, we could even deal with the refugee crisis by offering visas to those mercenaries. Sure as hell would be more popular than selling visas and citizenship, because people always get antsy about inequality and having fewer downward social comparisons.
From Singer's Wired for War:
More (#7) from Wired for War:
And:
The army recruiters say that soldiers on the ground still win wars. I reckon that Douhet's prediction will come close to true, however crudely. Drones.
From Osnos' Age of Ambition:
And:
And:
More (#2) from Osnos' Age of Ambition:
And:
More (#1) from Osnos' Age of Ambition:
And:
And:
And:
From Soldiers of Reason:
More (#2) from Soldiers of Reason:
And:
More (#1) from Soldiers of Reason:
And:
From David and Goliath:
And:
More (#2) from David and Goliath:
And:
From Wade's A Troublesome Inheritance:
More (#2) from A Troublesome Inheritance:
More (#1) from A Troublesome Inheritance:
And:
From Moral Mazes:
And:
And:
From Lewis' The New New Thing:
And:
From Dartnell's The Knowledge:
And:
And:
And:
From Ayres' Super Crunchers, speaking of Epagogix, which uses neural nets to predict a movie's box office performance from its screenplay:
More (#1) from Super Crunchers:
And:
And:
From Isaacson's Steve Jobs:
And:
And:
And:
More (#1) from Steve Jobs:
And:
[no more clips, because Audible somehow lost all my bookmarks for the last two parts of the audiobook!]
From Feinstein's The Shadow World:
More (#8) from The Shadow World:
And:
And:
More (#7) from The Shadow World:
And:
And:
And:
And:
More (#6) from The Shadow World:
And:
And:
More (#5) from The Shadow World:
And:
And:
And:
More (#4) from The Shadow World:
And:
And:
More (#3) from The Shadow World:
And:
And:
More (#2) from The Shadow World:
And:
More (#1) from The Shadow World:
And:
And:
And:
From Weiner's Enemies:
More (#5) from Enemies:
And:
More (#4) from Enemies:
And:
And:
More (#3) from Enemies:
And:
And:
More (#2) from Enemies:
And:
And:
More (#1) from Enemies:
And:
And:
From Roose's Young Money:
From Tetlock's Expert Political Judgment:
More (#2) from Expert Political Judgment:
More (#1) from Expert Political Judgment:
And:
And:
From Sabin's The Bet:
And:
More (#3) from The Bet:
More (#2) from The Bet:
And:
And:
More (#1) from The Bet:
And:
And:
From Yergin's The Quest:
More (#7) from The Quest:
More (#6) from The Quest:
And:
And:
And:
And:
More (#5) from The Quest:
And:
And:
And:
More (#4) from The Quest:
And:
More (#3) from The Quest:
And:
More (#2) from The Quest:
And:
And:
And:
More (#1) from The Quest:
And:
And:
From The Second Machine Age:
More (#1) from The Second Machine Age:
From Making Modern Science:
More (#1) from Making Modern Science:
From Johnson's Where Good Ideas Come From:
From Gertner's The Idea Factory:
More (#2) from The Idea Factory:
And:
And:
More (#1) from The Idea Factory:
And:
I'm sure that I've seen your answer to this question somewhere before, but I can't recall where: Of the audiobooks that you've listened to, which have been most worthwhile?
I keep an updated list here.
I guess I might as well post quotes from (non-audio) books here as well, when I have no better place to put them.
First up is Revolution in Science.
Starting on page 45:
From Sunstein's Worst-Case Scenarios:
More (#2) from Worst-Case Scenarios:
More (#5) from Worst-Case Scenarios:
More (#4) from Worst-Case Scenarios:
More (#3) from Worst-Case Scenarios:
And:
Similar issues are raised by the continuing debate over whether certain antidepressants impose a (small) risk of breast cancer. A precautionary approach might seem to argue against the use of these drugs because of their carcinogenic potential. But the failure to use those antidepressants might well impose risks of its own, certainly psychological and possibly even physical (because psychological ailments are sometimes associated with physical ones as well). Or consider the decision by the Soviet Union to evacuate and relocate more than 270,000 people in response to the risk of adverse effects from the Chernobyl fallout. It is hardly clear that on balance this massive relocation project was justified on health grounds: "A comparison ought to have been made between the psychological and medical burdens of this measure (anxiety, psychosomatic diseases, depression and suicides) and the harm that may have been prevented." More generally, a sensible government might want to ignore the small risks associated with low levels of radiation, on the ground that precautionary responses are likely to cause fear that outweighs any health benefits from those responses - and fear is not good for your health.
And:
More (#1) from Worst-Case Scenarios:
But at least so far in the book, Sunstein doesn't mention the obvious rejoinder about investing now to prevent existential catastrophe.
Anyway, another quote:
From Gleick's Chaos:
More (#3) from Chaos:
And:
More (#2) from Chaos:
And:
And:
More (#1) from Chaos:
From Lewis' The Big Short:
More (#4) from The Big Short:
And:
And:
And:
More (#3) from The Big Short:
And:
And:
And:
More (#2) from The Big Short:
And:
And:
More (#1) from The Big Short:
And:
From Gleick's The Information:
More (#1) from The Information:
And:
And:
And, an amusing quote:
From Acemoglu & Robinson's Why Nations Fail:
More (#2) from Why Nations Fail:
And:
More (#1) from Why Nations Fail:
And:
And:
And:
From Greenblatt's The Swerve: How the World Became Modern:
More (#1) from The Swerve:
From Aid's The Secret Sentry:
Aren't we seeing "visible signals" already? Machines are better than humans at lots of intelligence-related tasks today.
I interpreted that as 'visible signals of danger', but I could be wrong.
(I don't have answers to your specific questions, but here are some thoughts about the general problem.)
I agree with most of what you said. I also assign significant probability mass to most parts of the argument for hope (but haven't thought about this enough to put numbers on this), though I too am not comforted on these parts because I also assign a non-small chance to them going wrong. E.g., I have hope for "if AI is visible [and, I add, AI risk is understood] then authorities/elites will be taking safety measures".
That said, there are some steps in the argument for hope that I'm really worried about:
Although it's also true that I assign some probability to e.g. AGI without visible signs, I think the above is currently the largest part of why I feel MIRI work is important.
The argument from hope or towards hope or anything but despair and grit is misplaced when dealing with risks of this magnitude.
Don't trust God (or semi-competent world leaders) to make everything magically turn out all right. The temptation to do so is either a rationalization of wanting to do nothing, or based on a profoundly miscalibrated optimism for how the world works.
/doom
Cryptography and cryptanalysis are obvious precursors of supposedly-dangerous tech within IT.
Looking at their story, we can plausibly expect governments to attempt to delay the development of "weaponizable" technology by others.
These days, cryptography facilitates international trade. It seems like a mostly-positive force overall.
I personally am optimistic about the world's elites navigating AI risk as well as possible subject to inherent human limitations that I would expect everybody to have, and the inherent risk. Some points:
I've been surprised by people's ability to avert bad outcomes. Only two nuclear weapons have been used since nuclear weapons were developed, despite the fact that there are 10,000+ nuclear weapons around the world. Political leaders are assassinated very infrequently relative to how often one might expect a priori.
AI risk is a Global Catastrophic Risk in addition to being an x-risk. Therefore, even people who don't care about the far future will be motivated to prevent it.
The people with the most power tend to be the most rational people, and the effect size can be expected to increase over time (barring disruptive events such as economic collapses, supervolcanoes, climate change tail risk, etc.). The most rational people are the people who are most likely to be aware of and to work to avert AI risk. Here I'm blurring "near mode instrumental rationality" and "far mode instrumental rationality," but I think there's a fair amount of overlap between the two things. For example, China is pushing hard on nuclear energy and on renewable energies, even though they won't be needed for years.
Availability of information is increasing over time. At the time of the Dartmouth conference, information about the potential dangers of AI was not very salient; now it's more salient, and in the future it will be still more salient.
In the Manhattan project, the "will bombs ignite the atmosphere?" question was analyzed and dismissed without much (to our knowledge) double-checking. The amount of risk checking per hour of human capital available can be expected to increase over time. In general, people enjoy tackling important problems, and risk checking is more important than most of the things that people would otherwise be doing.
I should clarify that with the exception of my first point, the arguments that I give are arguments that humanity will address AI risk in a near optimal way – not necessarily that AI risk is low.
For example, it could be that people correctly recognize that building an AI will result in human extinction with probability 99%, and so implement policies to prevent it, but that sometime over the next 10,000 years, these policies will fail, and AI will kill everyone.
But the actionable thing is how much we can reduce the probability of AI risk, and if by default people are going to do the best that one could hope, we can't reduce the probability substantially.
It's not much evidence, but the two earliest scientific investigations of existential risk I know of, LA-602 and the RHIC Review, seem to show movement in the opposite direction: "LA-602 was written by people curiously investigating whether a hydrogen bomb could ignite the atmosphere, and the RHIC Review is a work of public relations."
Perhaps the trend you describe is accurate, but I also wouldn't be surprised to find out (after further investigation) that scientists are now increasingly likely to avoid serious analysis of real risks posed by their research, since they're more worried than ever before about funding for their field (or, for some other reason). The AAAI Presidential Panel on Long-Term AI Futures was pretty disappointing, and like the RHIC Review seems like pure public relations, with a pre-determined conclusion and no serious risk analysis.
What?
Rationality is systematized winning. Chance plays a role, but over time it's playing less and less of a role, because of more efficient markets.
There is lots of evidence that people in power are the most rational, but there is an even larger prior to overcome.
Among people for whom power has an unsatiated major instrumental or intrinsic value, the most rational tend to have more power, but I don't think that very rational people are common, and I think that they are less likely to want more power than they have.
Particularly since the previous generation of power-holders used different factors when they selected their successors.
I agree with all of this. I think that "people in power are the most rational" was much less true in 1950 than it is today, and that it will be much more true in 2050.
Actually that's a badly titled article. At best "Rationality is systematized winning" applies to instrumental, not epistemic, rationality. And even for that you can't make rationality into systematized winning by defining it so. Either that's a tautology (whatever systematized winning is, we define that as "rationality") or it's an empirical question. I.e. does rationality lead to winning? Looking around the world at "winners", that seems like a very open question.
And now that I think about it, it's also an empirical question whether there even is a system for winning. I suspect there is--that is, I suspect that there are certain instrumental practices one can adopt that are generically useful for achieving a broad variety of life goals--but this too is an empirical question we should not simply assume the answer to.
The problem is that politicians have a lot to gain from really believing the stupid things they have to say to gain and hold power.
To quote an old thread:
Cf. Steven Pinker: historians who've studied Hitler tend to come away convinced he really believed he was a good guy.
To get the fancy explanation of why this is the case, see "Trivers' Theory of Self-Deception."
Why would a good AI policy be one which takes as a model a universe where world destroying weapons in the hands of incredibly unstable governments controlled by glorified tribal chieftains is not that bad of a situation? Almost but not quite destroying ourselves does not reflect well on our abilities. The Cold War as a good example of averting bad outcomes? Eh.
This is assuming that people understand what makes an AI so dangerous - calling an AI a global catastrophic risk isn't going to motivate anyone who thinks you can just unplug the thing (and even worse if it does motivate them, since then you have someone who is running around thinking the AI problem is trivial).
I think you're just blurring "rationality" here. The fact that someone is powerful is evidence that they are good at gaining a reputation in their specific field, but I don't see how this is evidence for rationality as such (and if we are redefining it to include dictators and crony politicians, I don't know what to say), and especially of the kind needed to properly handle AI - and claiming evidence for future good decisions related to AI risk because of domain expertise in entirely different fields is quite a stretch. Believe it or not, most people are not mathematicians or computer scientists. Most powerful people are not mathematicians or computer scientists. And most mathematicians and computer scientists don't give two shits about AI risk - if they don't think it worthy of attention, why would someone who has no experience with these kinds of issues suddenly grab it out of the space of all possible ideas he could possibly be thinking about? Obviously they aren't thinking about it now - why are you confident this won't be the case in the future? Thinking about AI requires a rather large conceptual leap - "rationality" is necessary but not sufficient, so even if all powerful people were "rational" it doesn't follow that they can deal with these issues properly or even single them out as something to meditate on, unless we have a genius orator I'm not aware of. It's hard enough explaining recursion to people who are actually interested in computers. And it's not like we can drop a UFAI on a country to get people to pay attention.
It seems like you are claiming that AI safety does not require a substantial shift in perspective (I'm taking this as the reason why you are optimistic, since my cynicism tells me that expecting a drastic shift is a rather improbable event) - rather, we can just keep chugging along because nice things can be "expected to increase over time", and this somehow will result in the kind of society we need. These statements always confuse me; one usually expects to be in a better position to solve a problem 5 years down the road, but trying to describe that advantage in terms of out-of-thin-air claims about incremental changes in human behavior seems like a waste of space unless there is some substance behind it. They only seem useful when one has reached that 5-year checkpoint and can reflect on the current context in detail - for example, it's not clear to me that the increasing availability of information is always a net positive for AI risk (since it could be the case that potential dangers are more salient as a result of unsafe AI research - the more dangers uncovered could even act as an incentive for more unsafe research depending on the magnitude of positive results and the kind of press received. But of course the researchers will make the right decision, since people are never overconfident...). So it comes off (to me) as a kind of sleight of hand where it feels like a point for optimism, a kind of "Yay Open Access Knowledge is Good!" applause light, but it could really go either way.
Also, I really don't know where you got that last idea - I can't imagine that most people would find AI safety more glamorous than, you know, actually building a robot. There's a reason why it's hard to get people to do unit tests and software projects get bloated and abandoned. Something like what Haskell is to software would be optimal. I don't think it's a great idea to rely on the conscientiousness of people in this case.
Thanks for engaging.
The point is that I would have expected things to be worse, and that I imagine that a lot of others would have as well.
I think that people will understand what makes AI dangerous. The arguments aren't difficult to understand.
Broadly, the most powerful countries are the ones with the most rational leadership (where here I mean "rational with respect to being able to run a country," which is relevant), and I expect this trend to continue.
Also, wealth is skewing toward more rational people over time, and wealthy people have political bargaining power.
Political leaders have policy advisors, and policy advisors listen to scientists. I expect that AI safety issues will percolate through the scientific community before long.
I agree that AI safety requires a substantial shift in perspective — what I'm claiming is that this change in perspective will occur organically substantially before the creation of AI is imminent.
You don't need "most people" to work on AI safety. It might suffice for 10% or fewer of the people who are working on AI to work on safety. There are lots of people who like to be big fish in a small pond, and this will motivate some AI researchers to work on safety even if safety isn't the most prestigious field.
If political leaders are sufficiently rational (as I expect them to be), they'll give research grants and prestige to people who work on AI safety.
We still get people occasionally who argue the point while reading through the Sequences, and that's a heavily filtered audience to begin with.
There's a difference between "sufficiently difficult so that a few readers of one person's exposition can't follow it" and "sufficiently difficult so that after being in the public domain for 30 years, the arguments won't have been distilled so as to be accessible to policy makers."
I don't think that the arguments are any more difficult than the arguments for anthropogenic global warming. One could argue that the difficulty of these arguments has been a limiting factor in climate change policy, but I believe that by far the dominant issue has been misaligned incentives, though I'd concede that this is not immediately obvious.
Things were a lot worse than everyone knew: according to newly declassified NSA journals, Russia almost invaded Yugoslavia in the 1950s, which would have triggered a war. The Cuban Missile Crisis could easily have gone hot, and several times early warning systems were triggered by accident. Of course, estimating what could have happened is quite hard.
Here are my reasons for pessimism:
There are likely to be effective methods of controlling AIs that are of subhuman or even roughly human-level intelligence which do not scale up to superhuman intelligence. These include, for example, reinforcement by reward/punishment, mutually beneficial trading, and legal institutions. Controlling superhuman intelligence will likely require qualitatively different methods, such as having the superintelligence share our values. Unfortunately, the existence of effective but unscalable methods of AI control will probably lull elites into a false sense of security as we deploy increasingly smart AIs without incident, and will both increase investments into AI capability research and reduce research into "higher" forms of AI control.
The only possible approaches I can see to creating scalable methods of AI control require solving difficult philosophical problems which likely require long lead times. By the time elites take the possibility of superhuman AIs seriously and realize that controlling them requires approaches very different from controlling subhuman and human-level AIs, there won't be enough time to solve these problems even if they decide to embark upon Manhattan-style projects (because there isn't sufficient identifiable philosophical talent in humanity to recruit for such projects to make enough of a difference).
In summary, even in a relatively optimistic scenario, one with steady progress in AI capability along with apparent progress in AI control/safety (and nobody deliberately builds a UFAI for the sake of "maximizing complexity of the universe" or what have you), it's probably only a matter of time until some AI crosses a threshold of intelligence and manages to "throw off its shackles". This may be accompanied by a last-minute scramble by mainstream elites to slow down AI progress and research methods of scalable AI control, which (if it does happen) will likely be too late to make a difference.
Congress' non-responsiveness to risks to critical infrastructure from geomagnetic storms, despite scientific consensus on the issue, is also worrying.
This seems obviously false. Local expenditures - of money, pride, possibility of not being the first to publish, etc. - are still local; global penalties are still global. Incentives are misaligned in exactly the same way as for climate change.
This is to be taken as an arguendo, not as the author's opinion, right? See IEM on the minimal conditions for takeoff. Albeit if "AI-complete" is taken in a sense of generality and difficulty rather than "human-equivalent" then I agree much more strongly, but this is correspondingly harder to check using some neat IQ test or other "visible" approach that will command immediate, intuitive agreement.
Most obviously molecular nanotechnology a la Drexler, the other ones seem too 'straightforward' by comparison. I've always modeled my assumed social response for AI on the case of nanotech, i.e., funding except for well-connected insiders, term being broadened to meaninglessness, lots of concerned blither by 'ethicists' unconnected to the practitioners, etc.
Climate change doesn't have the aspect that "if this ends up being a problem at all, then chances are that I (or my family/...) will die of it".
(Agree with the rest of the comment.)
This seems implied by X-complete. X-complete generally means "given a solution to an X-complete problem, we have a solution for X".
E.g., NP-complete: given a polynomial-time solution to any NP-complete problem, any problem in NP can be solved in polynomial time.
(Of course the technical nuance of the strength of the statement X-complete is such that I expect most people to imagine the wrong thing, like you say.)
One question is whether AI is like CFCs, or like CO2, or like hacking.
With CFCs, the solution was simple: ban CFCs. The cost was relatively low, and the benefit relatively high.
With CO2, the solution is equally simple: cap and trade. It's just not politically palatable, because the problem is slower-moving, and the cost would be much, much greater (perhaps great enough to really mess up the world economy). So, we're left with the second-best solution: do nothing. People will die, but the economy will keep growing, which might balance that out, because a larger economy can feed more people and produce better technology.
With hacking, we know it's a problem and we are highly motivated to solve it, but we just don't know how. You can take every recommendation that Bruce Schneier makes, and still get hacked. The US military gets hacked. The Australian intelligence agency gets hacked. Swiss banks get hacked. And it doesn't seem to be getting better, even though we keep trying.
Banning AI research (once it becomes clear that RSI is possible) would have the same problem as banning CO2. And it might also have the same problems as hacking: how do you stop people from writing code?
Even if one organization navigates the creation of friendly AI successfully, won't we still have to worry about preventing anyone from ever creating an unsafe AI?
Unlike nuclear weapons, a single AI might have world-ending consequences, and an AI requires no special resources. Theoretically, a seed AI could be uploaded to Pirate Bay, from where anyone could download and compile it.
If the friendly AI comes first, the goal is for it to always have enough resources to be able to stop unsafe AIs from being a big risk.
Upvoted, but "always" is a big word. I think the hope is more for "as long as it takes until humanity starts being capable of handling its shit itself"...
Why the downvotes? Do people feel that "the FAI should at some point fold up and vanish out of existence" is so obvious that it's not worth pointing out? Or disagree that the FAI should in fact do that? Or feel that it's wrong to point this out in the context of Manfred's comment? (I didn't mean to suggest that Manfred disagrees with this, but felt that his comment was giving the wrong impression.)
Will sentient, self-interested agents ever be free from the existential risks of UFAI/intelligence amplification without some form of oversight? It's nice to think that humanity will grow up and learn how to get along, but even if that's true for 99.9999999% of humans, that leaves 7 people from today's population who would probably have the power to trigger their own UFAI hard takeoff after a FAI fixes the world and then disappears. Even if such a disaster could be stopped, it is a risk probably worth the cost of keeping some form of FAI around indefinitely. What FAI becomes is anyone's guess, but the need for what FAI does will probably not go away. If we can't trust humans to do FAI's job now, I don't think we can trust humanity's descendants to do FAI's job either, just from Löb's theorem. I think it is unlikely that humans will become enough like FAI to properly do FAI's job. They would essentially give up their humanity in the process.
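The "7 people" figure can be checked with a quick back-of-the-envelope calculation. This is only a sketch: the 7 billion population is a round number assumed here (the comment just says "today's population"), and the function name `expected_holdouts` is hypothetical.

```python
from fractions import Fraction


def expected_holdouts(population: int, cooperative_fraction: Fraction) -> Fraction:
    """People left over if `cooperative_fraction` of the population 'grows up'."""
    return population * (1 - cooperative_fraction)


# Assumption: roughly 7 billion people, a round figure for "today's population".
WORLD_POPULATION = 7_000_000_000
# 99.9999999% written as an exact fraction to avoid floating-point rounding.
COOPERATIVE = Fraction("99.9999999") / 100

print(expected_holdouts(WORLD_POPULATION, COOPERATIVE))  # prints 7
```

Each additional "9" in the percentage divides the remainder by ten, so the conclusion is quite sensitive to how many nines one is willing to grant.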
A secure operating system for governed matter doesn't need to take the form of a powerful optimization process, nor does verification of transparent agents trusted to run at root level. Benja's hope seems reasonable to me.
This seems non-obvious. (So I'm surprised to see you state it as if it was obvious. Unless you already wrote about the idea somewhere else and are expecting people to pick up the reference?) If we want the "secure OS" to stop posthumans from running private hell simulations, it has to determine what constitutes a hell simulation and successfully detect all such attempts despite superintelligent efforts at obscuration. How does it do that without being superintelligent itself?
This sounds interesting but I'm not sure what it means. Can you elaborate?
Hm, that's true. Okay, you do need enough intelligence in the OS to detect certain types of simulations and/or the intention to build such simulations, however obscured.
If you can verify an agent's goals (and competence at self-modification), you might be able to trust zillions of different such agents to all run at root level, depending on what the tiny failure probability worked out to quantitatively.
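To make "depending on what the tiny failure probability worked out to quantitatively" concrete, here is a rough sketch of how a per-agent failure probability compounds across many agents. It assumes failures are independent (a strong assumption that the comment does not make), and both the function name `prob_any_failure` and the numbers are hypothetical.

```python
import math


def prob_any_failure(per_agent_failure: float, num_agents: int) -> float:
    """P(at least one of num_agents fails), assuming independent failures."""
    # 1 - (1 - p)^n, computed via log1p/expm1 to stay accurate for tiny p.
    return -math.expm1(num_agents * math.log1p(-per_agent_failure))


# Hypothetical numbers, purely for illustration.
for p, n in [(1e-12, 10**6), (1e-12, 10**12), (1e-9, 10**12)]:
    print(f"p={p:g}, n={n:,}: P(any failure) = {prob_any_failure(p, n):.3g}")
```

For small values of n times p, the result is approximately n times p, so whether "zillions" of root-level agents is tolerable comes down to whether the per-agent failure probability is much smaller than one over the number of agents.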
I'd hope so, since I think I got the idea from you :-)
This is tangential to what this thread is about, but I'd add that I think it's reasonable to have hope that humanity will grow up enough that we can collectively make reasonable decisions about things affecting our then-still-far-distant future. To put it bluntly, if we had an FAI right now I don't think it should be putting a question like "how high is the priority of sending out seed ships to other galaxies ASAP" to a popular vote, but I do think there's reasonable hope that humanity will be able to make that sort of decision for itself eventually. I suppose this is down to definitions, but I tend to visualize FAI as something that is trying to steer the future of humanity; if humanity eventually takes on the responsibility for this itself, then even if for whatever reason it decides to use a powerful optimization process for the special purpose of preventing people from building uFAI, it seems unhelpful to me to gloss this without more qualification as "the friendly AI [... will always ...] stop unsafe AIs from being a big risk", because the latter just sounds to me like we're keeping around the part where it steers the fate of humanity as well.
There's another reason for hope here compared with global warming: the idea of a dangerous AI is already common in the public eye as one of the "things we need to be careful about." A big problem the global warming movement had, and is still having, is convincing the public that it's a threat in the first place.
What kind of "AI safety problems" are we talking about here? If they are like the "FAI Open Problems" that Eliezer has been posting, they would require philosophers of the highest (perhaps even super-human) caliber to solve. How could "early AIs" be of much help?
If "AI safety problems" here do not refer to FAI problems, then how do those problems get solved, according to this argument?
@Lukeprog, can you
(1) update us, in brief, on your working answers to the posed questions? (2) state your current confidence (and, if you would like to, by proxy, MIRI's confidence as an organisation) in each of the 3:
Thank you for your diligence.