Idle musing: should we all be writing a Claude Constitution-esque set of posts about our hopes for how AIs help humans around the dangerous moments in takeoff, in hopes that this influences how the models advise not only us, but also people who are coming into these issues fresher than us?
(Yes, I know that from the outside I have exponentially less influence on model behavior than Anthropic, and for MIRI-like reasons maybe this doesn't go well at all. But, you know, play to all of your outs.)
I would like, for obvious self-interested reasons, to see discussion of public policy ideas you think people should be pushing on harder than they currently are.
It was a mild positive update for me that an Anthropic employee, in this case Zac, was able to clarify this.
I worry, a lot, that the true gloss on the American Way of War is, roughly, the meme of "every Pacific naval encounter from late 1943 onward is like the IJN Golden Kirin, Glorious Harbinger of Eternal Imperial Dawn versus six identical copies of the USS We Built This Yesterday supplied by a ship that does nothing but make birthday cakes for the other ships."
Or, put more generally, we show up in the 4th quarter with a shit ton of gratuitously over-the-top production of every possibly-vaguely-good idea, and manage to eke out a win. See, e.g., the Civil War, WW1, WW2 (as above), Korea (kind of, long story), the Gulf War (after we fucked up the pre-war diplomacy), and the post-Surge Iraq War.
"There is no pause, so we just gotta Get a Lot of Alignment Research Done Real REAL Fast" is plausibly the real world we end up in, and I think we should have more folks optimizing for it beyond Redwood (and Anthropic???), even as terrifying as it feels.
I also had this question and couldn't find an obvious answer. I recognize that to some degree this might be proprietary, but this feels like a pretty obvious comms question. It doesn't negatively impact my opinion of Zac or Drake if they're unable to answer given their confidentiality obligations, but I would ask them to relay back internally that not indicating one way or another looks extremely weird from a comms perspective.
It's like Ford announcing that they've added airbags to their car designs. Which designs? The ones available for purchase now, or future model years? Oh, you know, just...car designs.
This is awesome! It broadly aligns with my understanding of the situation, although it does miss some folks who are known from their public statements to care a bunch about this. Downloading the JSON to take a deeper look!
Strong upvoting the underlying post for Doing The Thing.
So far the LLMs really want to procrastinate on this task in normal chat windows because it's tooooo many queries. This is gonna have to be a Claude Code thing.
Yeah, I think one thing I've added to my (too long!) to-do list is to ask the LLMs, and then pay a researcher to find any other examples of folks like this that we missed.
I just reread https://www.lesswrong.com/posts/CoZhXrhpQxpy9xw9y/where-i-agree-and-disagree-with-eliezer by Paul Christiano from 2022 for somewhat random reasons[1] and wow is this a fascinating historical snapshot document, especially in the comment section.
Many of the main characters in AI from 2022 to 2025 swing by and say, essentially, "hello! I would like to foreshadow my character arc for the next 3 years!"
Too many open tabs, need to clean up or computer no do videoconference good
Who's behind Pull the Plug? I don't see any details about it on their website.