MIRI's 2016 Fundraiser
Our 2016 fundraiser is underway! Unlike in past years, we'll only be running one fundraiser in 2016, from Sep. 16 to Oct. 31. Our progress so far (updated live):
Employer matching and pledges to give later this year also count towards the total. Click here to learn more.
MIRI is a nonprofit research group based in Berkeley, California. We do foundational research in mathematics and computer science that’s aimed at ensuring that smarter-than-human AI systems have a positive impact on the world. 2016 has been a big year for MIRI, and for the wider field of AI alignment research. Our 2016 strategic update in early August reviewed a number of recent developments:
- A group of researchers headed by Dario Amodei and Chris Olah of Google Brain published "Concrete problems in AI safety," a new set of research directions that are likely to bear on both near-term and long-term safety issues.
- Dylan Hadfield-Menell, Anca Dragan, Pieter Abbeel, and Stuart Russell published a new value learning framework, “Cooperative inverse reinforcement learning,” with implications for corrigibility.
- Laurent Orseau of Google DeepMind and Stuart Armstrong of the Future of Humanity Institute received positive attention from news outlets and from Alphabet executive chairman Eric Schmidt for their new paper “Safely interruptible agents,” partly supported by MIRI.
- MIRI ran a three-week AI safety and robustness colloquium and workshop series, with speakers including Stuart Russell, Tom Dietterich, Francesca Rossi, and Bart Selman.
- We received a generous $300,000 donation and expanded our research and ops teams.
- We started work on a new research agenda, "Alignment for advanced machine learning systems." Going forward, this agenda will occupy about half of our research time, with the other half devoted to our agent foundations agenda.
We also published new results in decision theory and logical uncertainty, including “Parametric bounded Löb’s theorem and robust cooperation of bounded agents” and “A formal solution to the grain of truth problem.” For a survey of our research progress and other updates from last year, see our 2015 review. In the last three weeks, there have been three more major developments:
- We released a new paper, “Logical induction,” describing a method for learning to assign reasonable probabilities to mathematical conjectures and computational facts in a way that outpaces deduction.
- The Open Philanthropy Project awarded MIRI a one-year $500,000 grant to scale up our research program, with a strong chance of renewal next year.
- The Open Philanthropy Project is supporting the launch of the new UC Berkeley Center for Human-Compatible AI, headed by Stuart Russell.
Things have been moving fast over the last nine months. If we can replicate last year’s fundraising successes, we’ll be in an excellent position to move forward on our plans to grow our team and scale our research activities.
Safety engineering, target selection, and alignment theory
This post is the latest in a series introducing the basic ideas behind MIRI's research program. To contribute, or learn more about what we've been up to recently, see the MIRI fundraiser page. Our 2015 winter funding drive concludes tonight (31 Dec 15) at midnight.
Artificial intelligence capabilities research is aimed at making computer systems more intelligent — able to solve a wider range of problems more effectively and efficiently. We can distinguish this from research specifically aimed at making AI systems at various capability levels safer, or more "robust and beneficial." In this post, I distinguish three kinds of direct research that might be thought of as "AI safety" work: safety engineering, target selection, and alignment theory.
Imagine a world where humans somehow developed heavier-than-air flight before developing a firm understanding of calculus or celestial mechanics. In a world like that, what work would be needed in order to safely transport humans to the Moon?
In this case, we can say that the main task at hand is one of engineering a rocket and refining fuel such that the rocket, when launched, accelerates upwards and does not explode. The boundary of space can be compared to the boundary between narrowly intelligent and generally intelligent AI. Both boundaries are fuzzy, but have engineering importance: spacecraft and aircraft have different uses and face different constraints.
Paired with this task of developing rocket capabilities is a safety engineering task. Safety engineering is the art of ensuring that an engineered system provides acceptable levels of safety. When it comes to achieving a soft landing on the Moon, there are many different roles for safety engineering to play. One team of engineers might ensure that the materials used in constructing the rocket are capable of withstanding the stress of a rocket launch with significant margin for error. Another might design escape systems that ensure the humans in the rocket can survive even in the event of failure. Another might design life support systems capable of supporting the crew in dangerous environments.
A separate important task is target selection, i.e., picking where on the Moon to land. In the case of a Moon mission, targeting research might entail things like designing and constructing telescopes (if they didn't exist already) and identifying a landing zone on the Moon. Of course, only so much targeting can be done in advance, and the lunar landing vehicle may need to be designed so that it can alter the landing target at the last minute as new data comes in; this again would require feats of engineering.
Beyond the task of (safely) reaching escape velocity and figuring out where you want to go, there is one more crucial prerequisite for landing on the Moon. This is rocket alignment research, the technical work required to reach the correct final destination. We'll use this as an analogy to illustrate MIRI's research focus, the problem of artificial intelligence alignment.
MIRI's 2015 Winter Fundraiser!
MIRI's Winter Fundraising Drive has begun! Our current progress, updated live:
Like our last fundraiser, this will be a non-matching fundraiser with multiple funding targets our donors can choose between to help shape MIRI’s trajectory. The drive will run until December 31st, and will help support MIRI's research efforts aimed at ensuring that smarter-than-human AI systems have a positive impact.
MIRI's 2015 Summer Fundraiser!
Our summer fundraising drive is now finished. We raised a grand total of $631,957 from 263 donors. This is an incredible sum, making this the biggest fundraiser we’ve ever run.
We've already been hard at work growing our research team and spinning up new projects, and I’m excited to see what our research team can do this year. Thank you to all our supporters for making our summer fundraising drive so successful!
It's safe to say that this past year exceeded a lot of people's expectations.
Twelve months ago, Nick Bostrom's Superintelligence had just come out. Questions about the long-term risks and benefits of smarter-than-human AI systems were nearly invisible in mainstream discussions of AI's social impact.
Twelve months later, we live in a world where Bill Gates is puzzled that so many researchers aren't using Superintelligence as a guide to the questions the field should be asking about AI's future.
Following a conference in Puerto Rico that brought together the leading organizations studying long-term AI risk (MIRI, FHI, CSER) and top AI researchers in academia (including Stuart Russell, Tom Mitchell, Bart Selman, and the Presidents of AAAI and IJCAI) and industry (including representatives from Google DeepMind and Vicarious), we've seen Elon Musk donate $10M to a grants program aimed at jump-starting the field of long-term AI safety research; we've seen the top AI and machine learning conferences (AAAI, IJCAI, and NIPS) announce their first-ever workshops or discussions on AI safety and ethics; and we've seen a panel discussion on superintelligence at ITIF, the leading U.S. science and technology think tank. (I presented a paper at the AAAI workshop, I spoke on the ITIF panel, and I'll be at NIPS.)
As researchers begin investigating this area in earnest, MIRI is in an excellent position, with a developed research agenda already in hand. If we can scale up as an organization, we will have a unique chance to shape the research priorities and methods of this new paradigm in AI, and to direct its momentum in useful directions.
This is a big opportunity. MIRI is already growing and scaling its research activities, but the speed at which we scale in the coming months and years depends heavily on our available funds.
For that reason, MIRI is starting a six-week fundraiser aimed at increasing our rate of growth.
— Live Progress Bar —
This time around, rather than running a matching fundraiser with a single fixed donation target, we'll be letting you help choose MIRI's course based on the details of our funding situation and how we would make use of marginal dollars.
In particular, our plans can scale up in very different ways depending on which of these funding targets we are able to hit:
MIRI's Approach
MIRI's summer fundraiser is ongoing. In the meantime, we're writing a number of blog posts to explain what we're doing and why, and to answer a number of common questions. This post is one I've been wanting to write for a long time; I hope you all enjoy it. For earlier posts in the series, see the bottom of the above link.
MIRI’s mission is “to ensure that the creation of smarter-than-human artificial intelligence has a positive impact.” How can we ensure any such thing? It’s a daunting task, especially given that we don’t have any smarter-than-human machines to work with at the moment. In a previous post to the MIRI Blog I discussed four background claims that motivate our mission; in this post I will describe our approach to addressing the challenge.
This challenge is sizeable, and we can only tackle a portion of the problem. For this reason, we specialize. Our two biggest specializing assumptions are as follows:
1. We focus on scenarios where smarter-than-human machine intelligence is first created in de novo software systems (as opposed to, say, brain emulations). This is in part because it seems difficult to get all the way to brain emulation before someone reverse-engineers the algorithms used by the brain and uses them in a software system, and in part because we expect that any highly reliable AI system will need to have at least some components built from the ground up for safety and transparency. Nevertheless, it is quite plausible that early superintelligent systems will not be human-designed software, and I strongly endorse research programs that focus on reducing risks along the other pathways.
2. We specialize almost entirely in technical research. We select our researchers for their proficiency in mathematics and computer science, rather than forecasting expertise or political acumen. I stress that this is only one part of the puzzle: figuring out how to build the right system is useless if the right system does not in fact get built, and ensuring AI has a positive impact is not simply a technical problem. It is also a global coordination problem, in the face of short-term incentives to cut corners. Addressing these non-technical challenges is an important task that we do not focus on.
In short, MIRI does technical research to ensure that de novo AI software systems will have a positive impact. We do not further discriminate between different types of AI software systems, nor do we make strong claims about exactly how quickly we expect AI systems to attain superintelligence. Rather, our current approach is to select open problems using the following question:
What would we still be unable to solve, even if the challenge were far simpler?
For example, we might study AI alignment problems that we could not solve even if we had lots of computing power and very simple goals.
We then filter on problems that are (1) tractable, in the sense that we can do productive mathematical research on them today; (2) uncrowded, in the sense that the problems are not likely to be addressed during normal capabilities research; and (3) critical, in the sense that they could not be safely delegated to a machine unless we had first solved them ourselves.
These three filters are usually uncontroversial. The controversial claim here is that the above question — “what would we be unable to solve, even if the challenge were simpler?” — is a generator of open technical problems for which solutions will help us design safer and more reliable AI software in the future, regardless of their architecture. The rest of this post is dedicated to justifying this claim, and describing the reasoning behind it.
MIRI Fundraiser: Why now matters
Our summer fundraiser is ongoing. In the meantime, we're writing a number of blog posts to explain what we're doing and why, and to answer a number of common questions. Previous posts in the series are listed at the above link.
I'm often asked whether donations to MIRI now are more important than donations later. Allow me to deliver an emphatic yes: I currently expect that donations to MIRI today are worth much more than donations to MIRI in five years. As things stand, I would very likely take $10M today over $20M in five years.
That's a bold statement, and there are a few different reasons for this. First and foremost, there is a decent chance that some very big funders will start entering the AI alignment field over the course of the next five years. It looks like the NSF may start to fund AI safety research, and Stuart Russell has already received some money from DARPA to work on value alignment. It's quite possible that in a few years' time significant public funding will be flowing into this field.
(It's also quite possible that it won't, or that the funding will go to all the wrong places, as was the case with funding for nanotechnology. But if I had to bet, I would bet that it's going to be much easier to find funding for AI alignment research in five years' time.)
In other words, the funding bottleneck is loosening — but it isn't loose yet.
We don't presently have the funding to grow as fast as we could over the coming months, or to run all the important research programs we have planned. At our current funding level, the research team can grow at a steady pace — but we could get much more done over the course of the next few years if we had the money to grow as fast as is healthy.
Which brings me to the second reason why funding now is probably much more important than funding later: because growth now is much more valuable than growth later.
There's an idea picking up traction in the field of AI: instead of focusing only on increasing the capabilities of intelligent systems, it is important to also ensure that we know how to build beneficial intelligent systems. Support is growing for a new paradigm within AI that seriously considers the long-term effects of research programs, rather than just the immediate effects. Years down the line, these ideas may seem obvious, and the AI community's response to these challenges may be in full swing. Right now, however, there is relatively little consensus on how to approach these issues — which leaves room for researchers today to help determine the field's future direction.
People at MIRI have been thinking about these problems for a long time, and that puts us in an unusually good position to influence the field of AI and ensure that some of the growing concern is directed towards long-term issues in addition to shorter-term ones. We can, for example, help avert a scenario where all the attention and interest generated by Musk, Bostrom, and others gets channeled into short-term projects (e.g., making drones and driverless cars safer) without any consideration for long-term risks that are more vague and less well-understood.
It's likely that MIRI will scale up substantially at some point; but if that process begins in 2018 rather than 2015, it is plausible that we will have already missed out on a number of big opportunities.
The alignment research program within AI is just now getting started in earnest, and it may even be funding-saturated in a few years' time. But it's nowhere near funding-saturated today, and waiting five or ten years to begin seriously ramping up our growth would likely give us far fewer opportunities to shape the methodology and research agenda within this new AI paradigm. The projects MIRI takes on today can make a big difference years down the line, and supporting us today will drastically affect how much we can do quickly. Now matters.
I encourage you to donate to our ongoing fundraiser if you'd like to help us grow!
This post is cross-posted from the MIRI blog.
Taking the reins at MIRI
Hi all. In a few hours I'll be taking over as executive director at MIRI. The LessWrong community has played a key role in MIRI's history, and I hope to retain and build your support as (with more and more people joining the global conversation about long-term AI risks & benefits) MIRI moves towards the mainstream.
Below I've cross-posted my introductory post on the MIRI blog, which went live a few hours ago. The short version is: there are very exciting times ahead, and I'm honored to be here. Many of you already know me in person or through my blog posts, but for those of you who want to get to know me better, I'll be running an AMA on the effective altruism forum at 3PM Pacific on Thursday June 11th.
I extend to all of you my thanks and appreciation for the support that so many members of this community have given to MIRI throughout the years.
The Stamp Collector
I'm writing a series of posts about replacing guilt-based motivation over on MindingOurWay, and I plan to post the meatier / more substantive posts in that series to LessWrong. This one is an allegory designed to remind people that they are allowed to care about the outer world, that they are not cursed to only ever care about what goes on in their heads.
Once upon a time, a group of naïve philosophers found a robot that collected trinkets. Well, more specifically, the robot seemed to collect stamps: if you presented this robot with a choice between various trinkets, it would always choose the option that led towards it having as many stamps as possible in its inventory. It ignored dice, bottle caps, aluminum cans, sticks, twigs, and so on, except insofar as it predicted they could be traded for stamps in the next turn or two. So, of course, the philosophers started calling it the "stamp collector."
Then, one day, the philosophers discovered computers, and deduced that the robot was merely a software program running on a processor inside the robot's head. The program was too complicated for them to understand, but they did manage to deduce that the robot only had a few sensors (on its eyes and inside its inventory) that it was using to model the world.
One of the philosophers grew confused, and said, "Hey wait a sec, this thing can't be a stamp collector after all. If the robot is only building a model of the world in its head, then it can't be optimizing for its real inventory, because it has no access to its real inventory. It can only ever act according to a model of the world that it reconstructs inside its head!"
"Ah, yes, I see," another philosopher answered. "We did it a disservice by naming it a stamp collector. The robot does not have true access to the world, obviously, as it is only seeing the world through sensors and building a model in its head. Therefore, it must not actually be maximizing the number of stamps in its inventory. That would be impossible, because its inventory is outside of its head. Rather, it must be maximizing its internal stamp counter inside its head."
So the naïve philosophers nodded, pleased with this, and then they stopped wondering how the stamp collector worked.
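For readers who want the allegory pinned down, here is a toy sketch (my own construction, not from the essay) of the decision rule the story attributes to the robot: it scores each available option by the number of stamps its world-model predicts will end up in its actual inventory, and takes the best one. The point the philosophers missed is that the model is aimed at the real inventory, not at an internal counter.

```python
from typing import Callable, Iterable

def choose(options: Iterable[str], predict_stamps: Callable[[str], float]) -> str:
    """Pick the option whose predicted real-world stamp count is highest.

    `predict_stamps` stands in for the robot's world-model: a mapping from
    an option to the number of stamps the model predicts the robot's actual
    inventory will contain if that option is taken.
    """
    return max(options, key=predict_stamps)

# Hypothetical world-model: trading the bottle cap is predicted to yield
# two stamps a turn or two later; everything else yields none.
model = {"keep bottle cap": 0.0, "trade bottle cap": 2.0, "take stick": 0.0}
print(choose(model.keys(), model.get))  # -> "trade bottle cap"
```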
The path of the rationalist
This is the last of four short essays that say explicitly some things that I would tell an intrigued proto-rationalist before pointing them towards Rationality: AI to Zombies (and, by extension, most of LessWrong). For most people here, these essays will be very old news, as they talk about the insights that come even before the sequences. However, I've noticed recently that a number of fledgling rationalists haven't actually been exposed to all of these ideas, and there is power in saying the obvious.
This essay is cross-posted on MindingOurWay.
Once upon a time, three students of human rationality traveled along a dusty path. The first was a novice, new to the art. The second was a student, who had been practicing for a short time. The third was their teacher.
As they traveled, they happened upon a woman sitting beside a great urn attached to a grand contraption. She hailed the travellers, and when they appeared intrigued, she explained that she was bringing the contraption to town (where she hoped to make money off of it), and offered them a demonstration.
She showed them that she possessed one hundred balls, identical except for their color: one was white, ninety-nine were red. She placed them all in the urn, and then showed them how the contraption worked: it consisted of a shaker (which shook the urn violently until none knew which ball was where) and a mechanical arm, which would select a ball from the urn.
"I'll give you each $10 if the white ball is drawn," she said over the roar of the shaker. "Normally, it costs $1 to play, but I'll give you a demonstration for free."
As the shaking slowed, the novice spoke: "I want it to draw the white ball, so I believe that it will draw the white ball. I have faith that the white ball will be drawn, and there's a chance I'm right, so you can't say I'm wrong!"
As the shaking stopped, the student replied, "I am a student of rationality, and I know that it is a virtue to move in tandem with the evidence. In this urn, there are more red balls than white, and so the evidence says that it is more likely that a red ball will be drawn than a white ball. Therefore, I believe that a red ball will be drawn."
As the arm began to unfold, the teacher smiled, and said only, "I assign 1% probability to the proposition 'a white ball will be drawn,' and 99% probability to 'a red ball will be drawn.'"
In order to study the art of human rationality, one must make a solemn pact with themselves. They must vow to stop trying to will reality into being a certain way; they must vow to instead listen to reality tell them how it is. They must recognize "faith" as an attempt to disconnect their beliefs from the voice of the evidence; they must vow to protect the ephemeral correspondence between the real world and their map of it.
It is easy for the student, when making this pact with themselves, to mistake it for a different one. Many rationalists think they've taken a vow to always listen to the evidence, and to let the evidence choose what they believe. They think that it is a virtue to weigh the evidence and then believe the most likely hypothesis, no matter what that may be.
But no: that is red-ball-thinking.
The path to rationality is not the path where the evidence chooses the beliefs. The path to rationality is one without beliefs.
On the path to rationality, there are only probabilities.
Our language paints beliefs as qualitative: we speak of beliefs as if they were binary things. You either know something or you don't. You either believe me or you don't. You're either right or you're wrong.
Traditional science, as it's taught in schools, propagates this fallacy. The statistician's role (they say) is to identify two hypotheses, null and alternative, and then test them, and then it is their duty (they say) to believe whichever hypothesis the data supports. A scientist must make their beliefs falsifiable (they say), and if ever enough evidence piles up against them, they must "change their mind" (from one binary belief to another). But so long as a scientist makes their beliefs testable and falsifiable, they have done their duty, and they are licensed to believe whatever else they will. Everybody is entitled to their own opinion, after all — at least, this is the teaching of traditional science.
But this is not the way of the rationalist.
The brain is an information machine, and humanity has figured out a thing or two about how to make accurate information machines. One of the things we've figured out is this: to build an accurate world-model, do away with qualitative beliefs, and use quantitative credences instead.
An ideal rationalist doesn't say "I want the next ball to be white, therefore I believe it will be." An ideal rationalist also doesn't say, "most of the balls are red, so I believe the next ball will be red." The ideal rationalist relinquishes belief, and assigns a probability.
In order to construct an accurate world-model, you must move in tandem with the evidence. You must use the evidence to figure out the likelihood of each hypothesis. But afterwards, you don't just pick the highest-probability thing and believe that. No.
The likelihoods don't tell you what to believe. The likelihoods replace belief. They're it. You say the likelihoods and then you stop, because you're done.
Most people, upon encountering the parable above, think that it is obvious. Almost everybody who hears me tell it in person just nods, but most of them fail to deeply integrate its lesson.
They hear the parable, and then they go on thinking in terms of "knowing" or "not knowing" (instead of thinking in terms of confidence). They nod at the parable, and then go on thinking in terms of "being right" or "being wrong" (instead of thinking about whether or not they were well-calibrated). They know the parable, but in the next conversation, they still insist "you can't prove that!" or "well that doesn't prove me wrong," as if propositions about reality could be "proven," as if perfect certainty were somehow possible.
No statement about the world can be proven. There is no certainty. All we have are probabilities.
Most people, when they encounter evidence that contradicts something they believe, decide that the evidence is not strong enough to switch them from one binary belief to another, and so they fail to change their mind at all. Most people fail to realize that all evidence against a hypothesis lowers its probability, even if only slightly, because most people are still thinking qualitatively.
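To make the quantitative picture concrete, here is a minimal sketch of a single Bayesian update in Python. It is my illustration with made-up numbers, not code from the essay: a hypothesis starts at 70% probability, and we observe evidence that is twice as likely if the hypothesis is false. The probability drops, but only modestly; no binary belief flips.

```python
def bayes_update(prior: float, p_e_given_h: float, p_e_given_not_h: float) -> float:
    """Return P(H | E) via Bayes' rule, given P(H) and the two likelihoods."""
    joint_h = prior * p_e_given_h
    joint_not_h = (1.0 - prior) * p_e_given_not_h
    return joint_h / (joint_h + joint_not_h)

# Hypothetical numbers: hypothesis starts at 70%; the observed evidence
# is twice as likely if the hypothesis is false (20% vs. 10%).
posterior = bayes_update(prior=0.70, p_e_given_h=0.10, p_e_given_not_h=0.20)
print(round(posterior, 3))  # 0.538: lowered by the evidence, not flipped to "false"
```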
In fact, most people still think that they get to choose how to draw conclusions from the evidence they've seen. And this is true — but only for those who are comfortable with avoidable inaccuracy.
This comes as a surprise to many, but humanity has uncovered many of the laws of reasoning.
Given your initial state of knowledge and the observations you have seen, there is only one maximally accurate updated state of knowledge.
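For reference, that unique update is given by Bayes' theorem (standard probability theory, stated here as a gloss rather than quoted from the essay):

$$ P(H \mid E) = \frac{P(E \mid H)\,P(H)}{P(E \mid H)\,P(H) + P(E \mid \neg H)\,P(\neg H)} $$

Your prior $P(H)$ and the likelihoods $P(E \mid H)$ and $P(E \mid \neg H)$ jointly pin down the posterior; there is no leftover degree of freedom in which to "choose" a belief.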
Now, you can't actually achieve this perfect posterior state. Building an ideal information-gathering engine is just as impossible as building an ideal heat engine. But the ideal is known. Given what you knew and what you saw, there is only one maximally accurate new state of knowledge.
Contrary to popular belief, you aren't entitled to your own opinion, and you don't get to choose your own beliefs. Not if you want to be accurate. Given what you knew and what you saw, there is only one best posterior state of knowledge. Computing that state is nigh impossible, but the process is well understood. We can't use information perfectly, but we know which path leads towards "better."
If you want to walk that path, if you want to nourish the ephemeral correspondence between your mind and the real world, if you want to learn how to draw an accurate map of this beautiful, twisted, awe-inspiring territory that we live in, then know this:
The Way is quantitative.
To walk the path, you must leave beliefs behind and let the likelihoods guide you. For they are all you'll have.
If this is a path you want to walk, then I now officially recommend starting with Rationality: AI to Zombies Book I: Map and Territory.
As the arm began to unfold, the teacher smiled, and said only, "I assign 1% probability to the proposition 'a white ball will be drawn,' and 99% probability to 'a red ball will be drawn.'"
The woman with the urn cocked her head and said, "Huh, you three are dressed like rationalists, and yet you seem awfully certain that I told the truth about the arm drawing balls from the urn…"
The arm whirred into motion.
Ephemeral correspondence
This is the third of four short essays that say explicitly some things that I would tell an intrigued proto-rationalist before pointing them towards Rationality: AI to Zombies (and, by extension, most of LessWrong). For most people here, these essays will be very old news, as they talk about the insights that come even before the sequences. However, I've noticed recently that a number of fledgling rationalists haven't actually been exposed to all of these ideas, and there is power in saying the obvious.
This essay is cross-posted on MindingOurWay.
Your brain is a machine that builds up mutual information between its insides and its outsides. It is not only an information machine. It is not intentionally an information machine. But it is bumping into photons and air waves, and it is producing an internal map that correlates with the outer world.
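"Mutual information" here is the standard information-theoretic quantity. As a gloss (mine, not the essay's), for a map variable $M$ and a world variable $W$:

$$ I(M;W) = \sum_{m,w} p(m,w)\,\log\frac{p(m,w)}{p(m)\,p(w)} $$

It is zero exactly when map and world are statistically independent, and it grows as the states of the map come to track the states of the world.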
However, there's something very strange going on in this information machine.
Consider: part of what your brain is doing is building a map of the world around you. This is done automatically, without much input on your part into how the internal model should look. When you look at the sky, you don't get a query which says
Readings from the retina indicate that the sky is blue. Represent sky as blue in world-model? [Y/n]
No. The sky just appears blue. That sort of information, gleaned from the environment, is baked into the map.
You can choose to claim that the sky is green, but you can't choose to see a green sky.