Strong upvote (both as object-level support and for setting a valuable precedent) for doing the quite difficult thing of saying "You should see me as less expert in some important areas than you currently do."
Nice post! As someone who spends a lot of time in AI policy on strategic thought and talking to people who I think are amongst the best strategic thinkers on AI, I appreciated this piece and think you generally describe the skills pretty well.
However, you say that "research" skill by default does not lead to strategic skill, which is very true, but this varies drastically depending on the type of research! Mechanistic interpretability, in fact, strikes me as an example of a field that is so in-the-weeds empirical, with such good feedback loops, that it is actually much harder for researchers in it to learn better strategic thinking. Other research fields with slower feedback loops are different—for example, societal impacts of AI research. More broadly, I think many fields of social science train strategic skill well, and some of the best political science thinkers clearly have significant strategic skill: Fukuyama, James C. Scott, Dominic Cummings, etc.
I made an attempt to brainstorm ways to evaluate strategic skill based on the abilities of the best thinkers I know, and came up with a list of characteristics I think it is correlated with:
Finally, I do notice a lot of those I think have the best strategic thought often use lenses and framings inspired by systems thinking, social evolution/selection processes, memetics, biology, and other similar ways of viewing society and human behavior.
Interesting. Thanks for the list. That seemed like a pretty reasonable breakdown to me. I think mechanistic interpretability does train some of them, in particular two, three, and maybe six. But I agree that things that involve thinking about society, politics, power, economics, etc. as a whole do seem clearly more relevant.
One major concern I have is that it's hard to judge skill in domains with worse feedback loops because there is not feedback on who is correct. I'm curious how confident you are in your assessment of who has good takes or is good in these fields, and how you determine this?
I guess that's the main element I didn't mention: many people on this forum would suggest judging via predictive skill/forecasting success. I think this is an ok heuristic, but of course the long time horizons involved in many strategic questions make it hard to judge (and Tetlock has documented the problems with forecasting over the long time horizons where these questions matter most).
Mostly, the people I think of as having strong strategic skill are closely linked to some political influence (which implicitly requires this skill to effect change) such as attaining a senior govt position, being influential over the Biden EO/export controls, UK govt AI efforts, etc. Alternatively, they are linked to some big major idea in governance or technical safety, often by spotting something missing years before it became relevant.
Often by interacting regularly with good thinkers you can get a sense that they have stronger mental models for trends and the levers controlling trends than others, but concrete judgement is sometimes extremely difficult until a key event has passed and we can judge in hindsight (especially about very high level trends such as Mearsheimer's disputed take on the causes of the Ukraine invasion, Fukuyama's infamous "end of history" prediction, or even Pinker's "Better Angels of Our Nature" predictions about continually declining global conflict).
Political influence seems like a very different skill to me? Lots of very influential politicians have been very incompetent in other real-world ways.
Alternatively, they are linked to some big major idea in governance or technical safety, often by spotting something missing years before it became relevant.
This is just a special case (and an unusually important one) of a good forecasting record, right?
I suppose I mean influence over politics, policy, or governance (this is very high level since these are all distinct and separable), rather than actually being political necessarily. I do think there are some common skills, but actually being a politician weighs so many other factors more heavily that the strategic skill is not selected on very strongly at all. Being a politician's advisor, on the other hand...
Yes, it's a special case, but importantly one that is not evaluated by Brier score or Manifold bucks.
A few points:
Curated. I think this is a pretty important point. I appreciate Neel's willingness to use himself as an example.
I do think this leaves us with the important followup questions of "okay, but, how actually DO we evaluate strategic takes?". A lot of people who are in a position to have demonstrated some kind of strategic awareness are people who are also some kind of "player" on the gameboard with an agenda, which means you can't necessarily take their statements at face value as an epistemic claim.
Thanks!
okay, but, how actually DO we evaluate strategic takes?
Yeah, I don't have a great answer to this one. I'm mostly trying to convey the spirit of: we're all quite confused, and the people who seem competent disagree a lot, so they can't actually be that correct. And given that the ground truth is confusion, it is epistemically healthier to be aware of this.
Actually solving these problems is way harder! I haven't found a much better substitute than looking at people who have a good, non-trivial track record of predictions, and people who have what seem to me like coherent models of the world that make legitimate and correct-seeming predictions. Though the latter one is fuzzier and has a lot more false positives. A particularly salient form of a good track record is people who held positions in domains I know well (eg interpretability) that I previously thought were wrong/ridiculous, but who I later decided were right (eg I give Buck decent points here, and also a fair amount of points to Chris Olah).
Also, if you're asking a panel of people, even those skilled at strategic thinking will still be useless unless they've thought deeply about the particular question or adjacent ones. And skilled strategic thinkers can get outdated quickly if they haven't thought seriously about the problem in a while.
I'm not sure I agree with that one. I think that if someone has thought a bunch about the general topic of AI and has a bunch of useful takes, they can probably convert this on the fly into something somewhat useful, even if it's not as reliable as it would be if they'd spent a long time thinking about it. Like, I think I can give useful technical mechanistic interpretability takes even if the question is about topics I've not spent much time thinking about before.
Yeah, there's generalization, but I do think that eg (AGI technical alignment strategy, AGI lab and government strategy, AI welfare, AGI capabilities strategy) are sufficiently different that experts at one will be significantly behind experts on the others.
If a strategy is likely to be outdated quickly it's not robust and not a good strategy. Strategies should be able to withstand lots of variation.
Hey Neel, I've heard you make similar remarks informally at talks or during Q&A sessions in past in-person panels and events, and it's great that you've written them up so that they're available in a nuanced format to a broader audience. I agree with the points you've made, but have a slightly different perspective on how it connects to the example of people asking for your strategic takes specifically, which I'll share below (without presumption).
People aren't necessarily confusing research prowess with strategic insight. Rather, they recognize you as having achieved elite social standing within the field of AI more broadly and want:
Before reading this post, I believed that the median person asking these questions was motivated by your impressive academic performance during your undergraduate studies, something that can be (over)simplified to "wow, this guy studied pure math at Cambridge and ranked top of his class, he's one of the smartest people in the world, and smart people are correct about lots of things, he might have a correct answer to this question I have!". I'm quite embarrassed to admit that this is pretty much what was going through my head when I attended a session you were holding during EAG last year, and I wouldn't be surprised if others there were thinking that too.
Similarly along those lines, I recall reaching out to one of your former mentees for a 1:1 thinking, "wow, this guy studied computer science at Cambridge and ranked top of his class, he's one of the smartest people in the world, and smart people are correct about lots of things!". I also took the time to read his dissertation, and found it interesting, but that first impression mattered a lot more than it should have. An analogy is that when people are selecting the model to use for a task, they want to use the best model for that task. But if a model takes the top spot on the leaderboard where test scores are easy to measure, then that tends to mess with human psychology which irrationally pattern matches and assumes generalization across every possible task.
My key takeaway was that although this winner-take-all dynamic may have played a part, your model assigns more weight to the work you've done after graduating and pioneering the field of mechinterp.
To be clear, founding mechinterp is a greater accomplishment than any formal credential. But even though teams of researchers at frontier labs are working on this agenda, it's not mainstream yet (just take a look at mechinterp.com), whereas the handle of "math/cs genius" is generic enough as a concept to be legible to the average person. The arguments in your post about research being an empirical science requiring skills not especially relevant to strategy are locally valid, but these points are the furthest thing from the mind of those waiting in line at conferences to ask what your p(doom) is.
Often the demands placed upon us by our environment play an instrumental role in shaping our skillset, because we adapt against the pressures placed upon us. I'm thankfully not in a leadership position where the role calls for executive project management decisions which require a solid understanding of the broader field and industry. I'm also grateful that I'm not a public figure with a reputation to maintain whose every move is open to scrutiny and close examination. I also understand that blog posts aren't meant to be epistemically bulletproof.
I think it's true that when the people you speak with the most (e.g. work colleagues or MATS scholars) ask you about your thoughts, their respect is based on the merits of the technical research you've published. And in general, when anyone publishes great AI research, that does inspire interest in that person's AI takes.
Your social circle is heavily filtered by a competitive application process which strongly selects for predicted ability to do quality research. This can distort intuitions around the prevalence of certain traits which are not as well represented in the general population. For example, authoring code or research papers requires, to some extent, that your brain is adapted for processing text content, the implications of which I haven't seen discussed in depth anywhere on lesswrong. If someone expresses a strong preference for reading over watching a video when both options are available, it's almost like a secret handshake; so many cracked engineers have told me this that it's become a green flag. In this world, entertainment culture and information transfer happen through books, web novels, articles, etc.
There's an entirely separate world occupied by people with the opposite preference, i.e. wanting to watch a video over reading text when both options are available. An example secret handshake for that group is when my Uber driver tells me that they're cutting down on Instagram. I admit this is a shallow heuristic, but it's become a red flag I watch out for, indicating a potential vulnerability to predatory social media dark patterns or television binge-watching. It's not an issue of self-control: people in the first group need to apply cognitive effort to pick things up from videos, but might have difficulty setting aside an engaging fantasy web serial. Most treatment of this topic I've seen addresses the second group, which feels alienating to me, as if there's an ongoing dimorphism between producers and users of consumer software.
I'm typically skeptical of "high IQ bubble" typed arguments since they tend to prove too much, so I'll make a more specific point. I agree with you that within these groups, conflation between perceived research skills and strategic skill does occur. My (minor) contention is that I don't think that this particular mistake is the one being made by the average person asking a speaker about their strategic takes at the end of a talk.
Like, these sort of questions aren't just being fielded by researchers in the field, you know. Why do people ask random celebrities and movie stars about their takes on geopolitics? Are they genuinely conflating acting skill with strategic skill? What about pro athletes? Is physical skill being conflated with strategic skill too? Do you believe that if a rich heiress with no research background was giving a talk about AI risk, that no one in the audience would be interested in her big picture takes? It makes no sense. Other comments have pointed this out already, so I'm sorry about adding another rant to the pile, but there exists a simpler explanation which does a better job of tracking reality!
The missing ingredient here is clout.
Various essays go into the relationship between competence and power, but what you're describing as "research skill" can be renamed expertise. These folks aren't mistaking you for someone high in "strategic skill"; instead, they are making the correct inference that you are an elite. They want in on the latest gossip behind the waitlist at the exclusive private social where frontier lab employees are joking around about what name they'll use for tomorrow's new model. They're holding their breath waiting for invention and hyperstition and self-fulfilling prophecy. They want to know the story of how Elon Musk will save the U.S. AISI and call it xAISI.
I'm not sure if this was an aim for the above post, but it's an understandable impulse to want to distance oneself from scenes where it's easier to find elites (good strategic takes) than experts (good research takes), because there can be a certain culture attached which often fails to act in a way that consistently upholds virtuous truth-seeking.
Overall, I think that taking a public stance can warp the landscape being described in ways that are hard to predict, and I appreciate your approach here compared to the influencer extreme of "my strategic takes are all great, the best, and bigly" versus the corporate extreme of "oh there are so many great takes, how could I pick one, great takes, thanks all". The position of "yeah I've got takes but chill, they're mid" is a reasonable midpoint, and it would be nice to have people defer more intelligently in general.
Thanks for writing this post. I agree with the sentiment but feel it important to highlight that it is inevitable that people assume you have good strategy takes.
In Monty Python's "Life of Brian" there is a scene in which the titular character finds himself surrounded by a mob of people declaring him the Messiah. Brian rejects this label and flees into the desert, only to find himself standing in a shallow hole, surrounded by adherents. They declare that his reluctance to accept the title is further evidence that he really is the Messiah.
To my knowledge nobody thinks that you are the literal Messiah, but plenty of people going into AI Safety are heavily influenced by your research agenda. You work at DeepMind and have mentored a sizeable number of new researchers through MATS. 80,000 Hours lists you as an example of someone with a successful career in Technical Alignment research.
To some, the fact that you request people not to blindly trust your strategic judgement is evidence that you are humble, grounded and pragmatic, all good reasons to trust your strategic judgement.
It is inevitable that people will view your views on the Theory of Change for Interpretability as authoritative. You could literally repeat this post verbatim at the end of every single AI safety/interpretability talk you give, and some portion of junior researchers would still leave the talk deferring to your strategic judgement.
Yes, I agree. It's very annoying for general epistemics (though obviously pragmatically useful to me in various ways if people respect my opinion)
Though, to be clear, my main goal in writing this post was not to request that people defer less to me specifically, but to make the general point that people should defer more intelligently, using myself as an example so as to avoid calling any specific person out.
Being good at research and being good at high level strategic thinking are just fairly different skillsets!
Neel, thank you, especially for the humility in acknowledging how hard it is to know whether a strategic take is any good.
Your post made me realise I’ve been holding back on a framing I’ve found useful (from when I worked as a matchmaker and a relationship coach), thinking about alignment less as a performance problem, and more as a relationship problem. We often fixate on traits like intelligence, speed, obedience but we forget to ask, what kind of relationship are we building with AI? If we started there, maybe we’d optimise for collaboration rather than control?
P.S. I don’t come from a research background, but my work in behaviour and systems design gives me a practical lens on alignment, especially around how relationships shape trust, repair, and long-term coherence.
An instance of the Halo Effect in addition to Undue Deference: we believe they are good strategic thinkers because good researchers must be brilliant in all fields.
It is still important to value their view, compare it with the views of strategic thinkers, and find symmetries that can better predict answers to questions.
Excellent points on the distinct skillset needed for strategy, Neel. Tackling the strategic layer, especially concerning societal dynamics under ASI influence where feedback is poor, is indeed critical and distinct from technical research.
Applying strategic thinking beyond purely technical alignment, I focused on how societal structure itself impacts the risks and stability of long-term human-ASI coexistence. My attempt to design a societal framework aimed at mitigating those risks resulted in the model described in my post, Proposal for a Post-Labor Societal Structure to Mitigate ASI Risks: The 'Game Culture Civilization' (GCC) Model
Whether the strategic choices and reasoning within that model hold up to scrutiny is exactly the kind of difficult evaluation your post calls for. Feedback focused on the strategic aspects (the assumptions, the proposed mechanisms for altering incentives, the potential second-order effects, etc.), as distinct from just the technical feasibility, would be very welcome and relevant to this discussion on evaluating strategic takes.
Nassim Nicholas Taleb argues similar ideas in his book "The Black Swan". A former Wall Street trader, he argues that people making good decisions under uncertainty must have "skin in the game"; quantitative modeling alone is insufficient. This suggests researchers can and should support the stakeholders who are the decision-makers.
Whilst the title is true, I don't think it adds much because, for most people, the authority of a researcher is probably as good as it gets. Even other researchers are probably not able to reliably tell who is or is not a good strategic thinker, so for a layperson there is no realistic alternative but to take the researcher seriously.
(IMHO a good proxy for strategic thinking is the ability to clearly communicate to a lay audience. )
I think the correct question is how much of an update you should make in an absolute sense rather than a relative sense. Many people in this community are overconfident, and if you decide that every person is less worth listening to than you thought, this doesn't change who you listen to, but it should make you a lot more uncertain in your beliefs.
Strong upvote. Slightly worried by the fact that this wasn't written, in some form, earlier (maybe I missed a similar older post?)
I think we[1] can, and should, go even further:
-Find a systematic/methodical way of identifying which people are really good at strategic thinking, and help them use their skills in relevant work; maybe try to hire from outside the usual recruitment pools.
-If deemed feasible (in a short enough amount of time): train some people mainly on strategy, so as to get a supply of better strategists.
-Encourage people to state their incompetence in some domains (except maybe in cases where it makes for bad PR) / embrace the idea of specialization and division of labour more: maybe high-level strategists don't need as much expertise on the technical details, only the ability to see which phenomena matter (assuming domain experts are able to communicate well enough)
say, the people who care about preventing catastrophic events, in a broad sense
I completely agree on the importance of strategic thinking. Personally, I like to hear what early AI pioneers had to say about modeling AI, for example Minsky's society of mind. I believe the trend of AI must be informed by the development of epistemology, and I've basically bet my research on the idea that epistemological progress will shape AGI.
What do you mean by 'must'? The word has two different meanings in this context, and it seems like bad epistemology not to distinguish them.
My use of “must” wasn’t just about technical necessity, but rather a philosophical or strategic imperative — that we ought to inform AGI not only through recent trends in deep learning (say, post-2014), but also by drawing from longer-standing academic traditions, like epistemic logic.
TL;DR Having a good research track record is some evidence of good big-picture takes, but it's weak evidence. Strategic thinking is hard, and requires different skills. But people often conflate these skills, leading to excessive deference to researchers in the field, without evidence that that person is good at strategic thinking specifically. I certainly try to have good strategic takes, but it's hard, and you shouldn't assume I succeed!
Introduction
I often find myself giving talks or Q&As about mechanistic interpretability research. But inevitably, I'll get questions about the big picture: "What's the theory of change for interpretability?", "Is this really going to help with alignment?", "Does any of this matter if we can’t ensure all labs take alignment seriously?". And I think people take my answers to these way too seriously.
These are great questions, and I'm happy to try answering them. But I've noticed a bit of a pathology: people seem to assume that because I'm (hopefully!) good at the research, I'm automatically well-qualified to answer these broader strategic questions. I think this is a mistake, a form of undue deference that is both incorrect and unhelpful. I certainly try to have good strategic takes, and I think this makes me better at my job, but this is far from sufficient. Being good at research and being good at high level strategic thinking are just fairly different skillsets!
But isn’t someone being good at research strong evidence they’re also good at strategic thinking? I personally think it’s moderate evidence, but far from sufficient. One key factor is that a very hard part of strategic thinking is the lack of feedback. Your reasoning about confusing long-term factors needs to extrapolate from past trends and make analogies from things you do understand better, and it can be quite hard to tell if what you're saying is complete bullshit or not. In an empirical science like mechanistic interpretability, however, you can get a lot more feedback. I think there's a certain kind of researcher who thrives in environments where they can get lots of feedback, but performs much worse in domains without it, where they e.g. form bad takes about the strategic picture and just never correct them because there's never enough evidence to convince them otherwise. It's just a much harder and rarer skill set to be good at something in the absence of good feedback.
Having good strategic takes is hard, especially in a field as complex and uncertain as AGI Safety. It requires clear thinking about deeply conceptual issues, in a space where there are many confident yet contradictory takes, and a lot of superficially compelling yet simplistic models. So what does it take?
Factors of Good Strategic Takes
As discussed above, ability to think clearly about thorny issues is crucial, and is a rare skill that is only somewhat used in empirical research. Lots of research projects I do feel more like plucking the low hanging fruit. I do think someone doing ground-breaking research is better evidence here, like Chris Olah’s original circuits work, especially if done multiple times (once could just be luck!). Though even then, it's evidence of the ability to correctly pursue ambitious research goals, but not necessarily to identify which ones will actually matter come AGI.
Domain knowledge of the research area is important. However, the key thing is not necessarily deep technical knowledge, but rather enough competence to tell when you're saying something deeply confused. Or at the very least, enough ready access to experts that you can calibrate yourself. You also need some sense of what the technique is likely to eventually be capable of and what limitations it will face.
But you don't necessarily need deep knowledge of all the recent papers so you can combine all the latest tricks. Being good at writing inference code efficiently or iterating quickly in a Colab notebook—these skills are crucial to research but just aren't that relevant to strategic thinking, except insofar as they potentially build intuitions.
Time spent thinking about the issue definitely helps, and correlates with research experience. Having my day job be hanging out with other people who think about the AGI safety problem is super useful. Though note that people's opinions are often substantially reflections of the people they speak to most, rather than what’s actually true.
It’s also useful to just know what people in the field believe, so I can present an aggregate view - this is something where deferring to experienced researchers makes sense.
I think there's also diverse domain expertise that's needed for good strategic takes that isn't needed for good research takes, and most researchers (including me) haven't been selected for having, e.g.:
Conclusion
Having good strategic takes is important, and I think that researchers, especially those in research leadership positions, should spend a fair amount of time trying to cultivate them, and I’m trying to do this myself. But regardless of the amount of effort, there is a certain amount of skill required to be good at this, and people vary a lot in this skill.
Going forwards, if you hear someone's take about the strategic picture, please ask yourself, "What evidence do I have that this person is actually good at the skill of strategic takes?" And don't just conflate this with them having written some impressive papers!
Practically, I recommend just trying to learn about lots of people's views, aim for deep and nuanced understanding of them (to the point that you can argue them coherently to someone else), and trying to reach some kind of overall aggregated perspective. Trying to form your own views can also be valuable, though I think also somewhat overrated.
Thanks to Jemima Jones for poking me to take agency and write a blog post for the first time in forever.