Holden Karnofsky: I think a decent counterpoint to this [project of predicting future AI progress] is like, "Look. You can do all the work you want and try and understand the brain. It is not possible to predict the future 20 years out. No one’s ever done it well." That’s an argument someone could make. I think it’s a decent argument. But I think there’s also a counterpoint to that.
One, I think people do somewhat overestimate how futile it is to predict the future. We have an ongoing project on this. We have a contractor working right now on looking back at a bunch of 10- or 20- or 30-year predictions, and scoring them according to whether they came true or false. These are predictions that were made decades ago. We’ll see how that goes.
The other thing I’d say is: you can’t know the future, but it seems possible that you can be well-calibrated about the future. If you look at, for example, Slate Star Codex, every year it’s putting up a whole bunch of probabilities about things that that blogger is not an expert on and doesn’t really necessarily know a whole bunch about. There will be a probability that things improve in the Middle East, a probability that someone wins a Presidential election... This person doesn’t necessarily know what’s going to happen, but they have some knowledge about their own state of knowledge. They have some knowledge about their own state of ignorance.
What they’re not able to do is accurately predict what’s going to happen. What they are able to do is make predictions such that when they say something is 90% likely, they’re wrong about 10% of the time. When they say something is 80% likely, they’re wrong about 20% of the time. When they say something is 50% likely, they’re wrong about half the time. They know how likely they are to be wrong, which is different from knowing what’s going to happen.
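To make that notion of calibration concrete, here is a minimal sketch (purely illustrative; the function and the example data are hypothetical, not from the conversation) of how one could check a forecaster's calibration: group their predictions by stated probability and compare each group's stated confidence with the fraction that actually came true.

```python
from collections import defaultdict

def calibration_report(predictions):
    """predictions: list of (stated_probability, came_true) pairs.

    Groups predictions by their stated probability and prints, for each
    group, how often those predictions actually came true. A well-calibrated
    forecaster's 90% bucket comes true roughly 90% of the time.
    """
    buckets = defaultdict(list)
    for prob, came_true in predictions:
        buckets[round(prob, 1)].append(came_true)

    for prob in sorted(buckets):
        outcomes = buckets[prob]
        hit_rate = sum(outcomes) / len(outcomes)
        print(f"stated {prob:.0%}: {len(outcomes)} predictions, "
              f"{hit_rate:.0%} came true")

# Hypothetical example data: (stated probability, whether it happened).
predictions = [
    (0.9, True), (0.9, True), (0.9, False), (0.9, True), (0.9, True),
    (0.5, True), (0.5, False), (0.5, False), (0.5, True), (0.5, False),
]
calibration_report(predictions)
```

The output of an exercise like this reflects exactly the distinction drawn above: it doesn't tell you which individual predictions will miss, only whether the forecaster's stated probabilities track how often they turn out to be right.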
And so one of the things that I’d look for in this timelines person is a deep familiarity with the science of forecasting, which is something that we’re very interested in and we try to incorporate into our grantmaking. A deep familiarity with that, and an understanding of what is realistic for them to say 20 years out.
[...]
Robert Wiblin: What things do you think you’ve learned, over the last 11 years of doing this kind of research, about in what situations you can trust expert consensus and in what cases you should think there’s a substantial chance that it’s quite mistaken?
Holden Karnofsky: Sure. I mean, I think it’s hard to generalize about this, and sometimes I wish I would write out my model more explicitly. I thought it was cool that Eliezer Yudkowsky did that in his book, Inadequate Equilibria. I think one thing that I especially look for, in terms of when we’re doing philanthropy, is I’m especially interested in the role of academia and what academia is able to do.
You could look at corporations, you can understand their incentives. You can look at governments, you can sort of understand their incentives. You can look at think tanks, and a lot of them are aimed directly at governments, in a sense, and so you can sort of understand what’s going on there. But academia is the default home for people who really spend all their time thinking about things that are intellectual, that could be important to the world, but that there’s no client who is like, “I need this now for this reason, and I’m making you do it.” So a lot of the time, when someone says someone should (let's say) work on AI alignment or work on AI strategy, or for example evaluate the evidence base for bed nets and deworming, which is what GiveWell does, my first question when it’s not obvious where else it fits is: "Would this fit into academia?"
And this is something where my opinions and my views have evolved a lot, where I used to have this very simplified picture: “Academia! That’s, like, this giant set of universities. There’s a whole ton of very smart intellectuals, who knows, they can do everything. There’s a zillion fields. There’s a literature on everything, as has been written on Marginal Revolution...” All that sort of thing. And so I would really never know when to expect that something was going to be neglected and when it wasn’t, and it would take a giant literature review to figure out which was which.
I would say I’ve definitely evolved on that. Today, when I think about what academia does, I think it is really set up to push the frontier of knowledge. Especially in the harder sciences, I would say the vast majority of what is going on in academia is people trying to do something novel, interesting, clever, creative, different, new, provocative, that really pushes the boundaries of knowledge forward in a new way.
I think that’s really important, obviously, and a great thing. I’m really incredibly glad we have institutions to do it. But I think there are a whole bunch of other activities that are intellectual, that are challenging, that take a lot of intellectual work, and that are incredibly important, and that are not that. And they have nowhere else to live. No one else can do them. And so I’m especially interested, and my eyes especially light up, when I see an opportunity where there’s an intellectual topic, it’s really important to the world, but it’s not advancing the frontier of knowledge. It’s more figuring out something in a pragmatic way that is going to inform what decision-makers should do; and also there’s no one decision-maker asking for it, as would be the case with government or corporations.
To give examples of this: GiveWell is the first place where I might initially have expected that development economics was going to tell us what the best charities are. Or at least tell us what the best interventions are. Tell us, "Bed nets, deworming, cash transfers, agricultural extension programs, education improvement programs — which ones are helping the most people for the least money?"
And there’s really very little work on this in academia. A lot of times, there will be one study that tries to estimate the impact of deworming, but very few or no attempts to really replicate it. Because it’s much more valuable to academics to have a new insight, to show something new about the world, than to try to kind of nail something down.
It really got brought home to me recently when we were doing our criminal justice reform work, and we wanted to check ourselves. We wanted to check this basic assumption that it would be good to have less incarceration in the US. And so David Roodman, who is basically the person that I consider the gold standard of a critical evidence reviewer, someone who can really dig into a complicated literature and come up with the answers, did what I think was a really wonderful and really fascinating paper, which is up on our website, where he looked for all the studies on the relationship between incarceration and crime, and what happens: if you cut incarceration, do you expect crime to rise, to fall, or to stay the same? And he really picked them apart. And what happened is that he took a lot of the best, most prestigious studies, and in about half of them he found fatal flaws when he just tried to replicate them or redo their conclusions.
And so when he put it all together, he ended up with a different conclusion from what you would get if you just read the abstracts. It was a completely novel piece of work that reviewed this whole evidence base at a level of thoroughness that had never been done before, and it came out with a conclusion that was different from what you naively would have thought: his best estimate is that, at current margins, we could cut incarceration and there would be no expected impact on crime.
He did all that. Then he started submitting it to journals. And, you know, he’s gotten rejected from a large number of journals by now. Starting with the most prestigious ones and then going to the less...
Robert Wiblin: Why is that?
Holden Karnofsky: Because his paper — I think it’s incredibly well done, it’s incredibly important, but in some kind of academic taste sense, there’s nothing "new" in there.
He took a bunch of studies, he kind of redid them, he found that they broke. He found new issues with them, and he found new conclusions, and from a policymaker or philanthropist perspective, all very interesting stuff, but did we really find a new method for asserting causality? Did we really find a new insight about how the mind of a perpetrator works? No. We didn’t advance the frontiers of knowledge. We pulled together a bunch of knowledge that we already had, and we synthesized it.
And I think that’s a common theme. Our academic institutions were set up a while ago, and they were set up at a time when it seemed like the most valuable thing to do was just to search for the next big insight.
And these days, they’ve been around for a while. We’ve got a lot of insights. We’ve got a lot of insights sitting around, we’ve got a lot of studies, and I think a lot of the time what we need to do is take the information that’s already available, take the studies that already exist, and synthesize them critically and say, “What does this mean for what we should do — where we should give money, what policy should be?" And I don’t think there’s any home in academia to do that.
I think that creates a lot of the gaps. This also applies to AI timelines: there’s nothing particularly innovative, groundbreaking, knowledge-frontier-advancing, creative, clever about it. It’s a question that matters. When can we expect transformative AI, and with what probability? It matters, but it’s not a work of frontier-advancing intellectual creativity to try to answer it. And so a very common theme in a lot of the work we advance is instead of pushing the frontiers of knowledge, take knowledge that’s already out there, pull it together, critique it, synthesize it, and decide what that means for what we should do. And especially, I think, there’s very little in the way of institutions that are trying to anticipate big intellectual breakthroughs down the road, such as AI, such as other technologies that could change the world. Think about how they could make the world better or worse, and what we can do to prepare for them.
I think historically when academia was set up, we were kind of in a world where it was really hard to predict what the next scientific breakthrough was going to be. It was really hard to predict how it would affect the world — but it usually turned out pretty well. And I think for various reasons, the scientific landscape may be changing now, where there are arguments it’s getting easier to see where things are headed. We know more about science. We know more about the ground rules. We know more about what cannot be done. We know more about what probably, eventually can be done.
And I think it’s somewhat of a happy coincidence so far that most breakthroughs have been good. And so to say, "I see a breakthrough on the horizon. Is that good or bad? How can we prepare for it?" That’s another thing academia is really not set up to do. Academia is set up to get the breakthrough.
And so that is a question I ask myself a lot: "Here’s an intellectual activity. Why can’t it be done in academia?" But these days, my answer is: if it’s really primarily of interest to a very cosmopolitan philanthropist trying to help the whole future, and there’s no one client, and it’s not frontier advancing, then I think that does make it pretty plausible to me that there’s no one doing it. And we would love to change that, at least somewhat, by funding what we think is the most important work.
Robert Wiblin: Something that doesn’t quite fit with that is that you do see a lot of practical psychology and nutrition papers that are trying to answer questions that the public have. Usually done very poorly, and you can’t really trust the answers, but it’s things like, “Does chocolate prevent cancer?” Or some nonsense, a small-sample paper like that. That seems like it’s not pushing forward methodology; it’s just doing an application. How does that fit into this model?
Holden Karnofsky: Well, first off, it’s a generalization, so I’m not going to say it’s everything. But I will also say, that stuff is very low prestige. That work — it’s not the hot thing to work on, and for that reason, I think, correlated with that, you see a lot of work that's not very well funded, not very well executed, not very well done, and that doesn’t tell you very much. Like, the vast majority of nutrition studies out there are just... You know, you can even look at a sample report on carbs and obesity that Luke Muehlhauser did for us. These studies are just — if someone had gone after them a little harder, with the energy and the funding that we go after some of the fundamental stuff with, they could have been a lot more informative.
And then the other thing is that a thing you will see even less of is good critical evidence reviews. You're right, you’ll see a study that’s, you know, “Does chocolate lead to more disease?” Or whatever, and sometimes that study will use established methods, and it’s just another data point. But you see much less of the part about taking what’s out there and synthesizing it all, and saying, “There’s a thousand studies. Here are the ones that are worth looking at. Here are their strengths; here are their weaknesses.” There are literature reviews, but I don’t think they’re a very prestigious thing to do, and I don’t think they’re done super well.
And so, I think, for example, some of the stuff GiveWell does — it’s like they have to reinvent a lot of this stuff, and they have to do a lot of the critical evidence reviews, because they’re not already out there. Same with David.
[...]