The series explains my part in the response to COVID, my reasons for switching from AI alignment work to the COVID response for a full year, and some new ideas the experience gave me. While it is written from my (Jan Kulveit's) personal perspective, I co-wrote the text with Gavin Leech, with input from many others.
The first post covers my main motivation: experimental longtermism.
Feedback loop
Possibly the main problem with longtermism and x-risk reduction is the weak and slow feedback loop.
(You work on AI safety; at some unknown time in the future, an existential catastrophe happens, or doesn’t happen, as a result of your work, or not as a result of your work.)
Most longtermists and existential risk people openly admit that the area doesn't have good feedback loops. Still, I think the community at large underappreciates how epistemically tricky our situation is. Disciplines that lack feedback from reality are exactly the ones that can easily go astray.
But most longtermist work is based on models of how the world works - or doesn’t work. These models try to explain why such large risks are neglected, the ways institutions like government or academia are inadequate, how various biases influence public perception and decision making, how governments work during crises, and so on. Based on these models, we take further steps (e.g. writing posts like this, uncovering true statements in decision theory, founding organisations, working at AI labs, going into policy, or organising conferences where we explain to others why we believe the long-term future is important and x-risk is real).
Covid as opportunity
Claim: COVID presented an unusually clear opportunity to put some of our models and theory in touch with reality, thus getting more "experimental" data than is usually possible, while at the same time helping to deal with the pandemic. The impact of the actions I mentioned above is often unclear even after many years, whereas in the case of COVID, the impact of similar actions was observable within weeks or months.
For me personally, there was one more pull. My background is in physics, and in many ways, I still think like a physicist. Physics - in contrast to most of maths and philosophy - has the advantage of being able to put its models in touch with reality, and to use this signal as an important driver in finding out what's true. In modern maths, (basically) whatever is consistent is true, and a guiding principle for what's important to work on is a sense of beauty. To a large extent, the feedback signal in philosophy is what other philosophers think. (Except when a philosophy turns into a political movement - then the signal comes from outcomes such as greater happiness, improved governance, large death tolls, etc.) In both maths and philosophy, the core computation mostly happens "in" humans. Physics has the advantage that in its experiments, "reality itself" does the computation for us.
I miss this feedback from reality in my x-risk work. Note that many of the concrete things longtermists do, like posting on the Alignment Forum or explaining things at conferences, actually do have feedback loops. But these are usually more like maths or philosophy: they provide social feedback, including intuitions about what kinds of research are valuable. One may wonder about the problems with these feedback loops, and what kind of blind-spots or biases they entail.
At the beginning of the COVID crisis, it seemed to me that some of our "longtermist" models were making fairly strong predictions about specific things that would fail - particularly about inadequate research support for executive decision-making. After some hesitation, I decided that if I trusted these models for x-risk mitigation, it made sense to use them to help solve COVID as well. Whether the models held up or failed, in pretty much every scenario I would learn something.
Over the next year, I and many collaborators tried a number of interventions to limit the damage associated with COVID. While we were motivated by trying to help, basically every intervention was also an experiment, putting some specific model in touch with reality, or attempting to fix some perceived inadequacy. Our efforts have had some direct impact, but from the longtermist perspective, the main source of value is ‘value of information’.
A more detailed description of our work is forthcoming, but briefly: we focused on inadequacies in the world’s modelling, forecasting, and decision support. Legible outputs include our research on non-pharmaceutical interventions; advising major vaccine manufacturers; advising multiple governments, sometimes at the executive level; consulting with bodies such as the European CDC and multiple WHO offices; and reaching millions of educated readers with our arguments, with mostly unknowable effects. I'm fairly confident the efforts made at least one country’s COVID policy not suck during at least one epidemic wave, and moderately confident our efforts influenced multiple countries toward marginally better decisions.
Concretely, here’s a causal graph of some of our efforts:
(Every edge has value of information.)
The sequence of posts, to be released over the next couple of weeks, will cover more detail:
Static and dynamic prioritisation: effective altruism should switch from argmax() to softmax()
Different forms of capital
Miscellaneous lessons
Evidence in favour of trespassing
Evidence for crises as opportunities
Research distillation is neglected
Call to Action
Part of the value of my COVID year depends on whether I can pass on the data I collected, and the updates I made from them. The posts to come discuss some of these.
Conclusion
A year of intense work on COVID likely gave me more macrostrategy ideas, governance insights, and general world-modelling skills than the counterfactual (which would have been mostly solo research from my home office and occasional Zoom calls with colleagues from FHI). My general conclusion is that such "experimental longtermist" work is useful, and relatively neglected.
One reason for neglectedness may be the type of reasoning where a longtermist compares the "short-term direct impacts" of such work with the potential "long-term direct impacts" of a clearly longtermist project, and neglects the value-of-information term. (Note that a longtermist prioritisation taking value of information into account will often look different from a prioritisation focused on maximising direct impact - e.g. optimising for the value of information will lead to exploring more possible interventions.)
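As a rough illustration of why the two kinds of prioritisation differ, here is a minimal sketch of the argmax() versus softmax() contrast named in the series outline. Everything in it is an assumption made for illustration: the function names, the `temperature` parameter, and the value estimates are hypothetical and not taken from our actual work.

```python
import math


def argmax_allocation(values):
    """Put all effort into the single option with the highest estimated value."""
    best = max(range(len(values)), key=lambda i: values[i])
    return [1.0 if i == best else 0.0 for i in range(len(values))]


def softmax_allocation(values, temperature=1.0):
    """Spread effort across options in proportion to exp(value / temperature).

    A higher temperature spreads effort more evenly (more exploration);
    a temperature close to zero approaches the argmax allocation.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    top = max(values)  # subtract the maximum for numerical stability
    weights = [math.exp((v - top) / temperature) for v in values]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical estimates of direct impact for three interventions.
estimated_values = [1.0, 0.9, 0.5]

print(argmax_allocation(estimated_values))
print(softmax_allocation(estimated_values, temperature=0.5))
```

Under argmax, the second intervention gets no effort even though its estimate is nearly as high as the first, so nothing is ever learned about it. Under softmax, every option keeps a nonzero share, which is one simple way to leave room for the value of information.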
My rough guess of the total value of information is a >10% improvement in my decision-making ability about large matters. Adding in what I hope you learn from me, it seems clearly a good investment.
On the margin, more longtermists should do experiments in this spirit; for the future, seize the day.