> This sequence draws from a position paper co-written with Simon Pepin Lehalleur, Jesse Hoogland, Matthew Farrugia-Roberts, Susan Wei, Alexander Gietelink Oldenziel, Stan van Wingerden, George Wang, Zach Furman, Liam Carroll, and Daniel Murfet. Thank you to Stan, Dan, and Simon for providing feedback on this post.

Alignment ⊆ Capabilities. As...
A corollary of Sutton's Bitter Lesson is that solutions to AI safety should scale with compute.[1] Let's consider a few examples of research directions that are aiming at this property:

* Deliberative Alignment: Combine chain-of-thought with Constitutional AI to improve safety with inference-time compute (see Guan et al. 2025, Figure...
> TLDR: We made substantial progress in 2024:
>
> * We published a series of papers that verify key predictions of Singular Learning Theory (SLT) [1, 2, 3, 4, 5, 6].
> * We scaled key SLT-derived techniques to models with billions of parameters, eliminating our main concerns around...
Common Law AI worked better than anyone expected. Dr. Sarah Chen was skeptical from the start. "You're essentially training them to be moral judges," she warned during the initial architecture review. "What if they overfit on ethics?" The room laughed. "Better than the alternative," someone quipped. The idea was simple...
It started as so many dooms do, with a flash in the night sky over the South China Sea. Testing a new ASAT weapon, the Chinese military shattered a derelict spy satellite into 40,000 shards of shrapnel. The debris pattern suggested a fragmentation warhead optimized for lethal scatter. Within 48...
January: In early 2026, Meta launches a fleet of new AI influencers, targeting the massive audience displaced by the Xiaohongshu-TikTok wars. They are beautiful, funny, smart—whatever you want them to be. Equipped with the latest in online learning, the agents immediately begin adapting to social media trends as they occur....
And then we hit a wall. Nobody expected it. Well... almost nobody. Yann LeCun posted his "I told you so's" all over X. Gary Marcus insisted he'd predicted this all along. Sam Altman pivoted, declaring o3 was actually already ASI. The first rumors of scaling laws breaking down were already...