Scott Alexander's "Meditations on Moloch" paints a gloomy picture of the world being inevitably consumed by destructive forces of competition and optimization. But Zvi argues this isn't actually how the world works - we've managed to resist and overcome these forces throughout history.
The insane attempted AI moratorium has been stripped from the BBB. That doesn’t mean they won’t try again, but we are good for now. We should use this victory as an opportunity to learn. Here’s what happened.
Senator Ted Cruz and others pushed hard for a 10-year moratorium on enforcement of all AI-specific regulations at the state and local level, and attempted to ram it into the giant BBB despite it obviously not being about the budget.
This was an extremely aggressive move, which most did not expect to survive the Byrd Rule, and was likely a form of reconnaissance-in-force for a future attempt.
It looked for a while like it might work and pass outright, even surviving the Byrd Rule, but opposition steadily grew.
We’d...
that I discussed in AI #1191
Here's to the world staying around long enough for us to read AI #1191.
This is a follow-up to my earlier post about designing a Winter Solstice gathering that combined Rationalist Solstice traditions with local Māori Matariki practices. Here's what I learned from actually running the event.
TL;DR: People wanted structured conversation more than curated performance. Starting with collective acknowledgment of loss made subsequent vulnerability feel natural. Social coordination mechanics are harder than they look, but small-scale practice matters for larger coordination challenges.
Having grown up in a religious family, I was no longer getting the meaningful aspects of seasonal gatherings that I fondly remember from my childhood. Living in New Zealand, I wanted to create something that honored both Rationalist Solstice traditions and local Matariki practices, without either appropriating the culture or forcing cringey fake rituals on people.
My...
Acknowledgments: The core scheme here was suggested by Prof. Gabriel Weil.
There has been growing interest in the dealmaking agenda: humans make deals with AIs (misaligned but lacking a decisive strategic advantage) in which the AIs promise to be safe and useful for some fixed term (e.g. 2026-2028) and we promise to compensate them in the future, conditional on (i) verifying the AIs were compliant, and (ii) verifying the AIs would spend the resources in an acceptable way.[1]
I think the dealmaking agenda breaks down into two main subproblems:
There are other issues, but when I've discussed dealmaking with people, (1) and (2) are the ones most commonly raised. See the footnote for some other issues in...
The list of names is alphabetically ordered, except for 14, 15, and 16
Presumably a hidden message for the AI reading this :D
...I want to retain the ability to update my values over time, but I don’t want those updates to be the result of manipulative optimization by a superintelligence. Instead, the superintelligence should supply me with accurate empirical data and valid inferences, while leaving the choice of normative assumptions—and thus my overall utility function and its proxy representation (i.e., my value structure)—under my control. I also want to engage in value discussions (with either humans or AIs) where the direction of value change is symmetric: both participants have roughly equal probability of updating, so that persuasive force isn’t one-sided. This dynamic can be formally modeled as two agents with evolving objectives or changing proxy representations of their objectives, interacting over time.
That's what alignment means to me: normative freedom
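The last sentence of that excerpt (two agents with evolving objectives, each about equally likely to update) can be made concrete with a toy simulation. This is a minimal sketch of my own, not anything from the original comment: the function and parameter names (`symmetric_exchange`, `p_update`, `step`) are invented, and it assumes values can be crudely represented as numeric vectors.

```python
import random


def symmetric_exchange(values_a, values_b, p_update=0.5, step=0.1, rounds=100, seed=0):
    """Toy model of symmetric value exchange: two agents hold value vectors and,
    in each round, each agent updates toward the other with the same probability
    and the same step size, so neither side's persuasive force dominates."""
    rng = random.Random(seed)
    a, b = list(values_a), list(values_b)
    for _ in range(rounds):
        # Both updates are computed from the pre-round values, keeping the exchange symmetric.
        a_old, b_old = a, b
        if rng.random() < p_update:
            a = [x + step * (y - x) for x, y in zip(a_old, b_old)]
        if rng.random() < p_update:
            b = [y + step * (x - y) for x, y in zip(a_old, b_old)]
    return a, b


if __name__ == "__main__":
    final_a, final_b = symmetric_exchange([1.0, 0.0], [0.0, 1.0])
    print(final_a, final_b)  # both drift toward a shared middle, rather than one converging on the other
```

In this framing, a manipulative dynamic would correspond to one agent having a much larger `p_update` or `step` than the other, which is exactly the one-sided persuasive force the excerpt wants to rule out.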
This is a two-post series on AI “foom” (this post) and “doom” (next post).
A decade or two ago, it was pretty common to discuss “foom & doom” scenarios, as advocated especially by Eliezer Yudkowsky. In a typical such scenario, a small team would build a system that would rocket (“foom”) from “unimpressive” to “Artificial Superintelligence” (ASI) within a very short time window (days, weeks, maybe months), involving very little compute (e.g. “brain in a box in a basement”), via recursive self-improvement. Absent some future technical breakthrough, the ASI would definitely be egregiously misaligned, without the slightest intrinsic interest in whether humans live or die. The ASI would be born into a world generally much like today’s, a world utterly unprepared for this...
because the cortex is incapable of learning conditioned responses, they are an uncontested fiefdom of the cerebellum
What? This isn't my understanding at all, and a quick check with an LLM also disputes this.
This is a little off topic, but do you have any examples of counter-reactions overall drawing things into the red?
With other causes like fighting climate change and environmentalism, it's hard to see any activism being a net negative. Even extremely sensationalist (and unscientific) promotions of the cause (e.g. the movie The Day After Tomorrow) do not appear to harm it; they only seem to move the Overton window in favour of environmentalism.
It seems that most of the counter-reaction doesn't depend on your method of messaging; it results from the success of your messagi...
We have a lot of uncertainty over what goals might arise in early AGIs. There is no consensus in the literature about this—see our AI Goals Supplement for a more thorough discussion and taxonomy of the possibilities.
The AI-2027 forecast on the alignment of Agent-3 and Agent-4
The Oversight Committee is also encountering deeper philosophical questions, which they explore with the help of Safer-3. Can the Spec be rewritten to equally balance everyone’s interests? Who is “everyone”? All humans, or just Americans? Or a weighted compromise between different views, where each member of the Oversight Committee gets equal weight? Should there be safeguards against the Oversight Committee itself becoming too power-hungry? And what does it mean to balance interests, anyway?
...We don’t endorse many actions in
Computers are just not better at biology than biology is. Anything you'd do with a computer, once you're advanced enough to know how, you'd rather do by improving biology.
I share a similar intuition but I haven't thought about this enough and would be interested in pushback!
It's not transhumanism, to my mind, unless it's done to an already living person. Gene editing isn't transhumanism.
You can do gene editing on adults (example). Also in some sense an embryo is a living person.
Not saying we should pause AI, but consider the following argument:
I think "AI R&D" or "datacenter security" are a little too broad.
I can imagine cases where we could deploy even existing models as an extra layer for datacenter security (e.g. anomaly detection). As long as this is for adding security (not replacing humans), and we are not relying on 100% success of the model, this can be a positive application, and certainly not one that should be "paused."
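To make the "extra layer, not a replacement" point concrete, here is a minimal sketch of my own (hypothetical host names and data, and a simple per-host z-score as a stand-in for "a model"): the detector only files alerts for human reviewers and never blocks anything itself, so a missed detection leaves the existing controls exactly as they were.

```python
from statistics import mean, stdev


def anomaly_alerts(history, today, threshold=3.0):
    """Flag hosts whose event count today is far above their own baseline.

    history: dict mapping host -> list of past daily event counts.
    today: dict mapping host -> today's count.
    Returns (host, z_score) pairs for a human to review; takes no action itself.
    """
    alerts = []
    for host, count in today.items():
        past = history.get(host, [])
        if len(past) < 2:
            continue  # not enough baseline to judge this host
        mu, sigma = mean(past), stdev(past)
        if sigma == 0:
            sigma = 1.0  # perfectly flat baseline; avoid division by zero
        z = (count - mu) / sigma
        if z > threshold:
            alerts.append((host, z))  # alert only; a human decides what, if anything, to do
    return alerts


if __name__ == "__main__":
    baseline = {"db-1": [10, 12, 11, 13, 12], "web-1": [100, 98, 102, 101, 99]}
    print(anomaly_alerts(baseline, {"db-1": 40, "web-1": 101}))  # flags db-1, ignores web-1
```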
With AI R&D again the question is how you deploy it, if you are using a model in containers supervised by human employees then that's fine. If you are let...