Was a philosophy PhD student, left to work at AI Impacts, then Center on Long-Term Risk, then OpenAI. Quit OpenAI due to losing confidence that it would behave responsibly around the time of AGI. Now executive director of the AI Futures Project. I subscribe to Crocker's Rules and am especially interested to hear unsolicited constructive criticism. http://sl4.org/crocker.html
Some of my favorite memes:
[meme by Rob Wiblin]
[xkcd comic]
My EA Journey, depicted on the whiteboard at CLR:
[photo, h/t Scott Alexander]
I've played with poking Claude into simulating the first ten years after the Slowdown ending of AI 2027. But there's just so much to model! What are the actual bottlenecks to various things?
Neat! I'd be curious to hear your takeaways if you have any.
OK, thanks, that's a relief then. We shall see who their customers end up being.
Sure, that works.
training process is “the most forbidden technique”, including recent criticism of Goodfire for investing in this area.
I think this mischaracterizes the criticism. The criticism, as I understand it, is that Goodfire is planning to help frontier AI companies use model internals in training, in exchange for money. Insofar as they really are planning to do this, I'll count myself among the critics, for the classic "but we need interpretability tools to be our held-out test set for alignment" reason. Do you have a link to the criticism you are responding to?
I think "Machines of Loving Grace" shouldn't qualify; it deliberately doesn't address how the problems get solved, it only cheerfully depicts a world after the problems were all solved.
I think for a submission to be valid, it must at least attempt to answer how the alignment problem gets solved and how extreme concentrations of power are avoided.
One way to do this is to require that scenarios come with dates attached. What happens in 2027? 2028? etc. That way, if a scenario says "And in 2035, everything is peachy and there's no more poverty etc.", it's more obvious to people that there's a giant plot hole in the story.
The game in question was about as decentralized as you'd expect, I think? But, importantly, compute is very unevenly distributed. The giant army of AIs running on OpenAI's datacenters essentially all have the same system prompt (maybe there are a few variants, but they are all designed to work smoothly together towards OpenAI's goals), and that army constitutes 20% of the total population of AIs initially, and a bit more than 50% at one point in the game.
So while (in our game) there were thousands/millions of different AI factions/goals of similar capability level, the top 5 AI factions/goals by population size / compute level controlled something like 90% of the world's compute, money, access-to-powerful-humans, etc. So to a first approximation, it's reasonable to model the world as containing 1-4 AI factions, plus a bunch of miscellaneous minor AIs that can get up to trouble and shout warnings from the sidelines but don't wield significant power.
If you are interested in playing a game sometime, you'd be welcome to join! I'd encourage you to make your own variant scenario too if you like.
Thanks for playing & writing up your reflections!
I think China wasn't as aggressive/bold in our game as they could have been; I agree that the situation for them is pretty rough, but I'd like to try again someday and see if they can pull off a win by more aggressively angling for a deal early on.
Hmm, but can't the megacorporations involved in the H200 transaction also bribe Customs? Won't your $500 WeChat bribe to the mid-level bureaucrat be cancelled and overwhelmed by the many $5,000 WeChat bribes flying at them by the corporations eagerly awaiting their shipment of H200s?
Yeah, that EA-prevalence assumption also caused me to doubt that the author actually worked at an AI company; it was very dissonant with my experience, at least.
That seems like a reasonable summary to me of the contrast.