For context, Jeff Kaufman delivered a speech on effective altruism and cause prioritization at EA Global 2015 entitled 'Why Global Poverty?', which he has transcribed and made available here. It's certainly worth reading.
I was dissatisfied with this speech in some ways. For the sake of transparency and charity, I will say that Kaufman has written a disclaimer explaining that, because of a miscommunication, he wrote this speech in the span of two hours immediately before he delivered it (instead of eating lunch, I would like to add), and that even after writing the text version, he is not entirely satisfied with the result.
I'm not that familiar with the EA community, but I predict that debates about cause prioritization, especially when existential risk mitigation is among the causes being discussed, can become mind-killed extremely quickly. And I don't mean to convey that in the tone of a wise outsider. It makes sense, considering the stakes at hand and the eschatological undertones of existential risk. (That is to say that the phrase 'save the world' can be sobering or gross, depending on the individual.) So, as is always implicit, but is sometimes worth making explicit, I'm criticizing some arguments as I understand them, not any person. I write this precisely because rationality is a common interest of many causes. I'll be focusing on the part about existential risk, as well as the parts that it is dependent upon. Lastly, I'd be interested in knowing if anyone else has criticized this speech in writing or come to conclusions similar to mine. Without further ado:
Jeff Kaufman's explanation of EA and why it makes sense is boilerplate; I agree with it, naturally. I also agree with the idea that certain existential risk mitigation strategies are comparatively less neglected by national governments and thus that risks like these are considerably less likely to be where one can make one's most valuable marginal donation. E.g., there are people who are paid to record and predict the trajectories of celestial objects, celestial mechanics is well-understood, and an impact event in the next two centuries is, with high meta-confidence, far less probable than many other risks. You probably shouldn't donate to asteroid impact risk mitigation organizations if you have to choose a cause from the category of existential risk mitigation organizations. The same goes for most natural (non-anthropogenic) risks.
The next few parts are worth looking at in detail, however:
At the other end we have risks like the development of an artificial intelligence that destroys us through its indifference. Very few people are working on this, there's low funding, and we don't have much understanding of the problem. Neglectedness is a strong heuristic for finding causes where your contribution can go far, and this does seem relatively neglected. The main question for me, though, is how do you know if you're making progress?
Everything before the question seems accurate to me. Furthermore, if I interpret the question correctly, then what's implied is a difference between the observable consequences of global poverty mitigation and existential risk mitigation. I think the implied difference is fair. You can see the malaria evaporating, but you only get one chance to build a superintelligence right. (It's worth saying that AI risk is also the example that Kaufman uses in his explanation.)
However, I don't think that this necessarily implies that we can't have some confidence that we're actually mitigating existential risks. This is clear if we dissolve the question. What are the disguised queries behind the question 'How do you know if you're making progress?'
If your disguised query is 'Can I observe the consequences of my interventions and update my beliefs and correct my actions accordingly?', then in the case of existential risks, the answer is "No", at least in the traditional sense of an experiment.
If your disguised query is 'Can I have confidence in the effects of my interventions without observing their consequences?', then that seems like a different, much more complicated question that is both interesting and worth examining further. I'll expand on this conceivably more controversial bit later, so that it doesn't seem like I'm being uncharitable or quoting out of context. Kaufman continues:
First, a brief digression into feedback loops. People succeed when they have good feedback loops. Otherwise they tend to go in random directions. This is a problem for charity in general, because we're buying things for others instead of for ourselves. If I buy something and it's no good I can complain to the shop, buy from a different shop, or give them a bad review. If I buy you something and it's no good, your options are much more limited. Perhaps it failed to arrive but you never even knew you were supposed to get it? Or it arrived and was much smaller than I intended, but how do you know. Even if you do know that what you got is wrong, chances are you're not really in a position to have your concerns taken seriously.
This is a big problem, and there are a few ways around this. We can include the people we're trying to help much more in the process instead of just showing up with things we expect them to want. We can give people money instead of stuff so they can choose the things they most need. We can run experiments to see which ways of helping people work best. Since we care about actually helping people instead of just feeling good about ourselves, we not only can do these things, we need to do them. We need to set up feedback loops where we only think we're helping if we're actually helping.
Back to AI risk. The problem is we really really don't know how to make good feedback loops here. We can theorize that an AI needs certain properties not to just kill us all, and that in order to have those properties it would be useful to have certain theorems proved, and go work on those theorems. And maybe we have some success at this, and the mathematical community thinks highly of us instead of dismissing our work. But if our reasoning about what math would be useful is off there's no way for us to find out. Everything will still seem like it's going well.
I think I get where Kaufman is coming from on this. First, I'm going to use an analogy to convey what I believe to be the commonly used definition of the phrase 'feedback loop'.
If you're an entrepreneur, you want your beliefs about which business strategies will be successful to be entangled with reality. You also have a short financial runway, so you need to decide quickly, which means that you have to obtain your evidence quickly if you want your beliefs to be entangled in time for it to matter. So immediately after you affect the world, you look at it to see what happened and update on it. And this is virtuous.
And of course, people are notoriously bad at remaining entangled with reality when they don't look at it. And this seems like an implicit deficiency in any existential risk mitigation intervention; you can't test the effectiveness of your intervention. You succeed or fail, one time.
Next, let's taboo the phrase 'feedback loop'.
So, it seems like there's a big difference between first handing out insecticidal bed nets and then looking to see whether or not the malaria incidence goes down, and paying some mathematicians to think about AI risk. When the AI researchers 'make progress', where can you look? What in the world is different because they thought instead of not, beyond the existence of an academic paper?
But a big part of this rationality thing is knowing that you can arrive at true beliefs by correct reasoning, and not just by waiting for the answer to smack you in the face.
And I would argue that any altruist is doing the same thing when they have to choose between causes before they can make observations. There are a million other things that the founders of the Against Malaria Foundation could have done, but they took the risk of riding on distributing bed nets, even though they had yet to see it actually work.
In fact, AI risk is not-that-different from this, but you can imagine it as a variant where you have to predict much further into the future, the stakes are higher, and you don't get a second try after you observe the effect of your intervention.
And if you imagine a world where a global authoritarian regime reads its citizens' minds without consent as a matter of course, where the law is that anyone who identifies as an EA is to be put in an underground chamber and given a minimum income that they may donate as they please, and where they are allowed to reason on their prior knowledge only, never being permitted to observe the consequences of their donations, then I bet that EAs would not say, "I have no feedback loop and I therefore cannot decide between any of these alternatives."
Rather, I bet that they would say, "I will never be able to look at the world and see the effects of my actions at a time that affects my decision-making, but this is my best educated guess of what the best thing I can do is, and it's sure as hell better than doing nothing. Yea, my decision is merely rational."
You want observational consequences because they give you confidence in your ability to make predictions. But you can make accurate predictions without being able to observe the consequences of your actions, and without just getting lucky, and sometimes you have to.
But in reality we're not deciding between donating something and donating nothing; we're choosing between charitable causes. Even so, I don't think that the fact that our interventions are less predictable should make us consider the risk more negligible, or its prevention less valuable. Don't we want to choose the most valuable causes, rather than merely the causes whose interventions have the most predictable effects? A bias towards causes with consistently, predictably, immediately effective interventions shouldn't completely dominate our decision-making when an alternative cause, though harder to intervene on predictably, would produce outcomes of extremely high utility if an intervention succeeded.
To illustrate, imagine that you are at some point on a long road, truly in the middle of nowhere, and you see a man whose car has a flat tire. You know that someone else may not drive by for hours, and you don't know how well-prepared the man is for that eventuality. You consider stopping your car to help; you have a spare, you know how to change tires, and you've seen it work before. And if you don't do it right the first time for some weird reason, you can always try again.
But suddenly, you notice that there is a person lying motionless on the ground, some ways down the road; far, but visible. There's no cellphone service, it would take an ambulance hours to get here unless they happened to be driving by, and you have no medical training or experience.
I don't know about you, but even if I'm having an extremely hard time thinking of things to do about a guy dying on my watch in the middle of nowhere, the last thing I'm going to do is say, "I have no idea what to do if I try to save that guy, but I know exactly how to change a tire, so why don't I just change the tire instead." Because even if I don't know what to do, saving a life is so much more important than changing a tire that I don't care about the uncertainty. And maybe if I went and actually tried saving his life, even if I wasn't sure how to go about it, it would turn out that I could find a way, or that he needed help but wasn't about to die immediately, or that he was perfectly fine all along. And I never would've known if I'd changed a tire and driven in the opposite direction.
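To put the same reasoning in numbers, here is a toy expected-value sketch of the two options. Every figure is invented purely for illustration (nothing here comes from the speech or from any real estimate); the point is only that a large enough difference in stakes can swamp a large difference in predictability.

```python
# Toy expected-value comparison for the roadside analogy.
# All numbers are made up for illustration only.

value_of_changed_tire = 1.0        # hypothetical units of value
p_tire_success = 0.95              # I know how to do this; very predictable

value_of_saved_life = 10_000.0     # vastly higher stakes
p_rescue_success = 0.05            # no training; very unpredictable

ev_tire = p_tire_success * value_of_changed_tire
ev_rescue = p_rescue_success * value_of_saved_life

print(f"EV of changing the tire:     {ev_tire:.2f}")    # 0.95
print(f"EV of attempting the rescue: {ev_rescue:.2f}")  # 500.00
```

The exact numbers don't matter; once the stakes differ by orders of magnitude, predictability alone can't carry the decision.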
And that doesn't mean that the strategy space is open season. I'm not going to come up with a new religion on the spot that contains a prophetic vision that this man will survive his medical emergency, nor am I going to try setting him on fire. There are things that I can know will not work without having to try them. And that can be built on with other ideas that are not-obviously-wrong-but-may-turn-out-to-be-wrong-later. It's great to have an idea of what you can know is wrong even if you can't try anything. Because not being able to try more than once is precisely the problem.
If we stop talking about what rational thinking feels like, and just start talking about rational thinking with the usual words, then what I'm getting at is that, in reality, there is an inside view to the AI risk arguments. You can always talk about confidence levels outside of an argument, but it helps to go into the details of the inside view, to see where our uncertainty about various assertions is greatest. Otherwise, where is your outside estimate even coming from, besides impression?
We can't run an experiment to see if the mathematics of self-reference, for example, is a useful thing to flesh out before trying to solve the larger problem of AI risk, but there are convincing reasons that it is. And sometimes that's all you have at the time.
And if you ever ask me, "Why does your uncertainty bottom out here?", then I'll ask you "Why does your uncertainty bottom out there?" Because it bottoms out somewhere, even if it's at the level of "I know that I know nothing," or some other similarly useless sentiment. And it's okay.
But I will say that this state of affairs is not optimal. It would be nice if we could be more confident about our reasoning in situations where we aren't able to make predictions, and then perform interventions, and then make observations that we can update on, and then try again. It's great to have medical training in the middle of nowhere.
And I will also say that I imagine Kaufman is not claiming that donating to existential risk mitigation is a fundamentally bad idea forever, but that it just doesn't seem like a good idea right now, because we don't know enough about when we should be confident in predictions that we can't test before we have to take action.
But if you know you're confused about how to determine the impact of interventions intended to mitigate existential risks, it's almost as if you should consider trying to figure out that problem itself. If you could crack the problem of mitigating existential risks, it would blow global poverty out of the water. And the problem doesn't immediately seem obviously intractable.
In fact, it's almost as if the cause you should choose is existential risk strategy research (a subset of cause prioritization). And, if you were to write a speech about it, it seems like it would be a good idea to make it really clear that that's probably very impactful, because value of information counts.
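As a rough sketch of why value of information counts here, consider a toy calculation under entirely invented assumptions: suppose there are two candidate x-risk interventions, only one of which actually helps, and strategy research could tell you which. You can then compare the expected value of donating blindly with the expected value of researching first.

```python
# Toy value-of-(perfect-)information sketch. All numbers are invented
# for illustration only.

value_if_right = 1_000.0   # hypothetical value of funding the intervention that works
value_if_wrong = 0.0       # funding the one that doesn't help accomplishes nothing
p_right_by_guessing = 0.5  # without research, we pick at random

# Expected value of donating now, without strategy research:
ev_blind = p_right_by_guessing * value_if_right + (1 - p_right_by_guessing) * value_if_wrong

# Expected value if research first identifies the working intervention:
ev_informed = value_if_right

voi = ev_informed - ev_blind
print(f"EV donating blindly:  {ev_blind:.0f}")    # 500
print(f"EV after research:    {ev_informed:.0f}") # 1000
print(f"Value of information: {voi:.0f}")         # 500
```

If the research itself costs less than that difference, doing the research first is the better choice, which is the sense in which strategy research can dominate as a cause.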
And so, where you read a speech entitled 'Why Global Poverty?', I read a speech entitled 'Why Existential Risk Strategy Research?'
Related: Is Molecular Nanotechnology "Scientific"?