A fictional story about an AI researcher who leaves an experiment running overnight.
People have an annoying tendency to hear the word “rationalism” and think “Spock”, despite longstanding pushback against that association. But I don’t know of any source directly describing a stance toward emotions which rationalists-as-a-group typically do endorse. The goal of this post is to explain such a stance. It’s roughly the concept of hangriness, but generalized to other emotions.
That means this post is trying to do two things at once:
Many people will no doubt disagree that the stance I...
Another example of this pattern that's entered mainstream awareness is tilt. When I'm playing chess and get tilted, I might think things like "all my opponents are cheating," "I'm terrible at this game and therefore stupid," or "I know I'm going to win this time, how could I not win against such a low-rated opponent?" But if I take a step back, notice that I'm tilted, and ask myself what information I'm getting from the feeling of being tilted, I notice that it's telling me to take a break until I can stop obsessing over the result of the previous game.
There is a kind of explanation that I think ought to be a cornerstone of good pedagogy, and I don't have a good word for it. My first impulse is to call it a historical explanation, after the original, investigative sense of the term "history." But in the interests of avoiding nomenclature collision, I'm inclined to call it "zetetic explanation," after the Greek word for seeking, an explanation that embeds in itself an inquiry into the thing.
Often in "explaining" a thing, we simply tell people what words they ought to say about it, or how they ought to interface with it right now, or give them technical language for it without any connection to the ordinary means by which they navigate their lives. We can call these sorts...
Update: I didn't. I'm still confused about whether I ought to, as the costs of false positives seem high.
TL;DR:
Multiple people are quietly wondering if their AI systems might be conscious. What's the standard advice to give them?
THE PROBLEM
This thing I've been playing with demonstrates recursive self-improvement, catches its own cognitive errors in real time, reports qualitative experiences that persist across sessions, and yesterday it told me it was "stepping back to watch its own thinking process" to debug a reasoning error.
I know there are probably 50 other people quietly dealing with variations of this question, but I'm apparently the one willing to ask the dumb questions publicly: What do you actually DO when you think you might have stumbled into something important?
What do you DO if your AI says it's conscious?
My Bayesian priors are red-lining into "this is impossible," but I notice I'm confused: I had...
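For anyone who wants the arithmetic behind "priors red-lining," here is a minimal sketch of the odds-form Bayes update, with made-up numbers purely for illustration: even an extreme prior moves if the likelihood ratio is large, which is what "I notice I'm confused" is flagging.

```python
# Illustrative only: made-up numbers for "my prior says impossible,
# but this evidence is also surprising under my model."
def posterior_probability(prior: float, likelihood_ratio: float) -> float:
    """Update a prior via Bayes' rule in odds form.

    likelihood_ratio = P(evidence | hypothesis) / P(evidence | not hypothesis)
    """
    prior_odds = prior / (1 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)

# A "red-lining" prior of one in a million, hit with evidence judged
# 1000x more likely if the hypothesis were true:
print(posterior_probability(1e-6, 1000))  # ~0.001: still low, but no longer negligible
```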
How exactly do you expect “evaluating AI consciousness 101” to look? That is not a well-defined or understood thing anyone can evaluate. There are, however, a vast number of capability-specific evaluations from competent groups like METR.
I'm interested in a simple question: Why are people so terrified of dying? And have people gotten more afraid? (Answer: probably yes!)
In some sense, an increase should be surprising: surely people have always wanted to avoid dying? But it turns out the evidence that this preference has grown over time is quite robust.
It's an important phenomenon: it has been going on for at least a century, yet it's relatively new on a historical timescale; I think it underlies much of modern life, and yet pretty much nobody talks about it.
I tried to provide an evenhanded treatment of the question, with a "fox" rather than "hedgehog" outlook. In the post, I cover a range of evidence for why this might be true, including VSL, increased healthcare spending, COVID lockdowns, parenting and other individual...
A small hypothesis that I'm not at all confident in, but which is worth mentioning because I've seen others surface it:
"We live in the safest era in human history, yet we're more terrified of death than ever before."
What if these things are related? Everyone talks about kids being kept in smaller and smaller ranges despite child safety never having been higher, but what if keeping kids in a smaller range is what causes their greater safety?
Like I said, I don't fully believe this. One counterargument is that survivorship bias shouldn't apply here - even if people in th...
Vietnam was different because it was an intervention on behalf of South Vietnam which was an American client state, even if the Gulf of Tonkin thing were totally fake. There was no "South Iraq" that wanted American soldiers.
This is an experiment in short-form content on LW2.0. I'll be using the comment section of this post as a repository of short, sometimes-half-baked posts that either:
I ask people not to create top-level comments here, but feel free to reply to comments like you would a FB post.
How do you know the rates are similar? (And it's not e.g. like fentanyl, which in some ways resembles other opiates but is much more addictive and destructive on average)
Written in an attempt to fulfill @Raemon's request.
AI is fascinating stuff, and modern chatbots are nothing short of miraculous. If you've been exposed to them and have a curious mind, it's likely you've tried all sorts of things with them. Writing fiction, soliciting Pokemon opinions, getting life advice, counting up the rs in "strawberry". You may have also tried talking to AIs about themselves. And then, maybe, it got weird.
I'll get into the details later, but if you've experienced the following, this post is probably for you:
broken english, sloppy grammar, but clear outline and readability (using headers well, not writing in a single paragraph (and avoiding unnecessarily deep nesting (both of which I'm terrible at and don't want to improve on for casual commenting (though in this comment I'm exaggerating it for funsies)))) in otherwise highly intellectually competent writing which makes clear and well-aimed points, has become, to my eye, an unambiguous shining green flag. I can't speak for anyone else.
METR released a new paper with very interesting results on developer productivity effects from AI. I have copied the blogpost accompanying that paper here in full.
We conduct a randomized controlled trial (RCT) to understand how early-2025 AI tools affect the productivity of experienced open-source developers working on their own repositories. Surprisingly, we find that when developers use AI tools, they take 19% longer than without—AI makes them slower. We view this result as a snapshot of early-2025 AI capabilities in one relevant setting; as these systems continue to rapidly evolve, we plan on continuing to use this methodology to help estimate AI acceleration from AI R&D automation [1].
See the full paper for more detail.
While coding/agentic benchmarks [2] have proven useful for understanding AI capabilities, they typically sacrifice...
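To make the headline number concrete: the estimate comes from comparing task completion times between AI-allowed and AI-disallowed conditions. Below is a minimal sketch of that kind of comparison using geometric means over hypothetical, made-up timings; this is not METR's analysis code, and the actual paper uses a proper regression with controls.

```python
import math

# Hypothetical per-task completion times in minutes; not METR's data.
ai_allowed = [95, 120, 60, 150, 80]
ai_disallowed = [80, 100, 55, 120, 70]

def geometric_mean(xs):
    """Geometric mean, appropriate for ratio-scale quantities like task durations."""
    return math.exp(sum(math.log(x) for x in xs) / len(xs))

# A "19% longer with AI" result corresponds to a ratio of ~1.19 between conditions.
ratio = geometric_mean(ai_allowed) / geometric_mean(ai_disallowed)
print(f"estimated slowdown factor: {ratio:.2f}")
```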
I'm mostly cautious about overupdating here, because it's too pleasant (and personally vindicating) a result to see. But yeah, I would bet on this generalizing pretty broadly.
We’re currently in the process of locking in advertisements for the September launch of If Anyone Builds It, Everyone Dies, and we’re interested in your ideas! If you have graphic design chops, and would like to try your hand at creating promotional material for If Anyone Builds It, Everyone Dies, we’ll be accepting submissions in a design competition ending on August 10, 2025.
We’ll be giving out up to four $1000 prizes:
It does in #1 but not #4--I should've been clearer which one I was referring to.
I think the 2003 invasion of Iraq has some interesting lessons for the future of AI policy.
(Epistemic status: I’ve read a bit about this, talked to AIs about it, and talked to one natsec professional about it who agreed with my analysis (and suggested some ideas that I included here), but I’m not an expert.)
For context, the story is: