Pay Risk Evaluators in Cash, Not Equity
Personally, I suspect the alignment problem is hard. But even if it turns out to be easy, survival may still require getting at least the absolute basics right; currently, I think we're mostly failing even at that. Early discussion of AI risk often focused on debating the viability of various elaborate safety schemes humanity might someday devise—designing AI systems to be more like “tools” than “agents,” for example, or as purely question-answering oracles locked within some kryptonite-style box. These debates feel a bit quaint now, as AI companies race to release agentic models they barely understand directly onto the internet.

But a far more basic failure, from my perspective, is that at present nearly all AI company staff—including those tasked with deciding whether new models are safe to build and release—are paid substantially in equity, the value of which seems likely to decline if their employers stop building and releasing new models. As a result, roughly everyone within these companies charged with sounding the alarm currently risks personally losing huge sums of money if they do. This extreme conflict of interest could be avoided simply by compensating risk evaluators in cash instead.


Note though that the reference class "blog" is only partially apt. For example, some authors publish on LessWrong in the course of attempting to make or propagate serious intellectual progress, which is a rare aim among bloggers. It seems to me LessWrong's design has historically been unusually conducive to this aim, and personally, this is the main reason I hope and plan to publish more here in the future (and why I'd feel far less excited about publishing on Medium or Substack or other platforms formatted like standard blogs).