The Airtable application mentions a summer internship, but there doesn't seem to be any information about that on the jobs section of the website. Is that still open?
I think there is some basic research missing here.
For example, presumably human+ level AI could be considered agenty, or even alive. However, as far as I know, we have no good definition of what separates life from non-life beyond the one type of life we see on Earth, where the rough boundary is that prions are non-life, viruses are kind-of life, and bacteria are definitely life, judging by a definition like "genetic information metabolism".
I suspect that a more promising approach is a black-box one, combined with a white-box one, in which life and agents behave more adversarially and anti-inductively than non-life, regardless of the underlying physics/biology. A description like that would cover most of life on Earth, as well as much of the simulated life in, say, computer games. However, it might go too far, since there are plenty of phenomena we don't identify as alive that behave in such a way. Presumably some combination of heritability and mutability would lead to something that is alive and agenty. I have not seen much research done in that direction.
AI Impacts is beginning a serious hiring round (see here for job postings), so I’d like to explain a bit why it has been my own best guess at the highest impact place for me to work. (As in, this is a personal blog post by Katja on the AI Impacts blog, not some kind of officialesque missive from the organization.)
But first—
What is AI Impacts?
AI Impacts is a few things:
Why think working on AI Impacts is among the best things to do?
1. AI risk looks like a top-notch cause area
It seems plausible to me that advanced AI poses a substantial risk to humanity’s survival. I don’t think this is clear, but I do think there’s enough evidence that it warrants a lot of attention. I hope to write more about this; see here for recent discussion. Furthermore, I don’t know of other similarly serious risks (see Ord’s The Precipice for a review), or of other intervention areas that look clearly more valuable than reducing existential risk to humanity.
I actually also think AI risk is a potentially high-impact area to work in (for a little while at least) even if AI isn’t a huge existential risk to humanity, because so many capable and well-intentioned people are dedicating themselves to it. Demonstrating that it wasn’t that bad could redirect mountains of valuable effort to real problems.
2. Understanding the situation beats intervening on the current margin
Within the area of mitigating AI risk, there are several broad classes of action being taken. Technical safety research focuses on building AI that won’t automatically cause catastrophe. AI Governance focuses on maneuvering the policy landscape to lower risk. These are both kinds of intervention: ‘intervening’ is a meta-category, and the other main meta-category in my mind is ‘understanding the situation’. My own best guess is that on the current margin, ‘understanding the situation’ is a better place for an additional person with general skills than any particular intervening that I know of. (Or maybe it’s only almost as good—I flip-flop, but it doesn’t really matter much: the important thing is that for some large part of the space of people and their skills and characteristics, it seems better.)
By ‘understanding the situation’, I mean for instance working toward better answers to questions like these:
Carrying out any particular intervention also involves a lot of ‘understanding the situation’, but I think this is often at a different level. For instance, if you decide to intervene by trying to get AI labs to collaborate with each other, you might end up accruing better models of how people at AI projects interact socially, how decisions are made, how running events works, and so on, because these things are part of the landscape between you and your instrumental goal: improving collaboration between AI projects. You probably also learn about things around you, like what kinds of AI projects people are doing. But you don’t get to learn much at all about how the achievement of your goal affects the future of AI. (I fear that in general this situation means you can end up lumbering forward blindly while thinking you can see, because you are full of specific concrete information—the intricacies of the steering wheel distracting you from the dense fog on the road.) There are some exceptions to this. For instance, I expect some technical work to be pretty enlightening about the nature of AI systems, which is directly relevant to how the development of better AI systems will play out. Mesa-optimization, for example, seems like a great contribution to ‘understanding the situation’ which came out of a broadly intervention-oriented organization.
It is that kind of understanding the situation—understanding what will happen with AI and its effects on society, under different interventions—that I think deserves way more attention.
Why do I think understanding the situation is better than intervening? Of course in general, both are great. Intervening is generally necessary for achieving anything, and understanding the situation is arguably necessary for intervening well. (The intense usefulness of understanding the situation for achieving your goals in most situations is exactly the reason one might be concerned about AI to begin with.) So in general, you want a combination of understanding the situation and intervening. The question is how valuable the two are on the current margin.
My guess: understanding the situation is better. Which is to say, I think a person with a subjectively similar level of skill at everything under consideration will add more value via improving everyone’s understanding of the situation by one person’s worth of effort than they would by adding one person’s worth of effort to pursuing the seemingly best intervention.
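To make that comparison a bit more concrete, here is a toy back-of-the-envelope sketch. It is not a real model, and every number in it is a made-up assumption purely for illustration; the point is just that when many people are already intervening, a small improvement in how well everyone's efforts are aimed can be worth more than one additional person pushing on the current best-guess intervention.

```python
# Toy illustration only: all numbers below are made-up assumptions, not estimates.

N_INTERVENERS = 100        # hypothetical number of people already doing direct interventions
VALUE_IF_WELL_AIMED = 1.0  # arbitrary units of value per intervener, if aimed at the real problem
P_AIMED_NOW = 0.30         # assumed chance current interventions are aimed at the real problem
P_AIMED_AFTER = 0.33       # assumed chance after one more person-year of 'understanding the situation'

# Expected value added by one more person joining the best-guess intervention:
marginal_intervener = P_AIMED_NOW * VALUE_IF_WELL_AIMED

# Expected value added by one more person improving everyone's aim a little:
marginal_researcher = (P_AIMED_AFTER - P_AIMED_NOW) * VALUE_IF_WELL_AIMED * N_INTERVENERS

print(f"one more intervener: ~{marginal_intervener:.2f}")   # ~0.30
print(f"one more researcher: ~{marginal_researcher:.2f}")   # ~3.00
```

Of course, the real question is what the analogous numbers actually are, which is itself part of ‘understanding the situation’.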
Here are a few things influencing this guess:
Yes: work on AI risk instead of other EA causes and other non-emergency professions. Show a good case for this to large numbers of people who aren’t thinking about it and try to change views within the AI community and public about the appropriate degree of caution for relevant AI work.
No: work on something more valuable; support AI progress.
5 years: plan for what specific actors should do in a situation much like our current one and talk to them about doing it; build relationships with likely actors; try to align systems much like our current AI systems.
20 years: more basic time-consuming alignment research; movement building; relationship building with institutions rather than people.
100 years: avert risks from narrow or weak AI and other nearer technologies, even more basic alignment research, improve society’s general institutions for responding to risks like this, movement building directed at broader issues that people won’t get disillusioned with over that long a period (e.g. ‘responding to technological risks’ vs. AI specifically).
Before you know it: searching for technical solutions that can be proven to entirely solve the problem before it arises (even if you are unlikely to find any); social coordination to avoid setting off such an event.
Weeks: Immediate-response contingency plans.
Years: Fast-response contingency plans; alignment plans that would require some scope for iteration.
Decades: Expect to improve safety through more normal methods of building systems, observing them, correcting, iterating. ‘Soft’ forces like regulations, broadscale understanding of the problems, cooperation initiatives. Systems that are incrementally safer but not infinitely safer.
When approaching a poorly understood danger down a dark corridor, I feel like even a small amount of light is really good. Good for judging whether you are facing a dragon or a cliff, good for knowing when you are getting close to it so you can ready your sword (or your ropes, as the case may be), good for telling how big it is. But even beyond those pre-askable questions, I expect the details of the fight (or climb) to go much better if you aren’t blind. You will be able to strike well, and jump out of the way well, and generally have good feedback about your micro-actions and local risks. So I don’t actually put that much trust in tallying up possible decision changes, as in the last point. If you told me that we had reasoned through the correct course of action for dragons, and cliff faces, and tar pits, and alternate likely monsters, and decided they were basically the same, I’d persist in being willing to pay a lot to be able to see.
Applied to AI strategy: understanding the situation both lets you choose interventions that might help, and having chosen an intervention, probably helps you make smaller choices within that intervention well, such that the intervention hits its target.
I think another part of the value here is that very abstract reasoning about complicated situations seems untrustworthy (especially when it isn’t actually formal), and I expect getting more data and working out more details to generally engage people’s concrete thinking better, and for that to be helpful.
People sometimes ask if we might be scraping the barrel on finding research to do in this space, I guess because quite a few people have prolifically opined on it over numerous years, and things seem pretty uncertain. I think that radically under-imagines what understanding, or an effort dedicated to understanding, could look like. Like, we haven’t gotten as far as making sure that the empirical claims being opined about are solid, whereas a suitable investment for a major international problem that you seriously need to solve should probably look more like the one we see for climate change. Climate change is a less bad and arguably easier to understand problem than AI risk, and the ‘understanding the situation’ effort there looks like an army of climate scientists working for decades. And they didn’t throw up their hands and say things were too uncertain and they had run out of things to think about after twenty climate hobbyists had thought about it for a bit. There is a big difference between a vibrant corner of the blogosphere and a serious research effort.
3. Different merits of different projects
Ok, so AI risk is the most impactful field to my knowledge, and within AI risk I claim that the highest impact work is on understanding the situation.[1] This is reason to work at AI Impacts, and also reason to work at Open Philanthropy, FHI, Metaculus, as an independent scholar, in academia, etc. Probably who should do which depends on the person and their situation. Here are some things AI Impacts is about, and axes on which we have locations:
I’m focused here on the positives, but here are a few negatives too:
So, that was a hand-wavy account of why I think working at AI Impacts is particularly high impact, and some of what it’s like. If you might want to work for us, see our jobs page.[2] If you don’t, but like thinking about the future of AI and wish we invited you to dinners, coffees, parties or our Slack, drop me a DM or send us a message through the AI Impacts feedback box. Pitches that I’m wrong and should do something else are also welcome.
Notes