I keep running into similar arguments online, where people attack “the other” and use the (correct) observation of badness to claim their side is therefore doing well. There’s a temptation to correct this by saying that in a dispute between two sides, one side being bad isn’t causally making the...
A lot of “red line” talk assumed that a capability shows up, everyone notices, and something changes. We keep seeing the opposite: a capability arrives, and we get an argument about definitions after deployment, after it should be clear that we’re well over the line. We’ve Already Crossed The Lines! Karl...
The modern world is incredibly insecure along a wide variety of dimensions - because it’s not a problem. Usually. No-one is trying to exploit the security of your email server, most of the time, so it’s fine if it is unpatched. No-one is trying to hack the internet alarm clock...
There’s a strong argument that humans should stop trying to build more capable AI systems, or at least slow down progress. The risks are plausibly large but unclear, and we’d prefer not to die. But the roadmaps of the companies pursuing these systems envision increasingly agentic AI systems taking over...
In the previous two posts (first, second) we laid out our take on AI alignment, which involves conservative philosophy and the political school of thought of Agonistic Democracy. We also suggested an approach to AI alignment in which the conflicts between multiple agents lead to an AI system that has...
In our previous post, we outlined a view that we disagree with but that serves as a central assumption in current discussions of AI alignment, and suggested that it might be useful to push in a different direction, which we started to outline. Here, we’ll point out that we think alignment...
Worldbuilding is critical for understanding the world and how the future could go - but it’s also useful for understanding counterfactuals better. With that in mind, when people talk about counterfactuals in AI development, they seem to assume that safety would always have been a focus. That is, there’s a...