On the 3rd of October 2351 a machine flared to life. Huge energies coursed into it via cables, only to leave moments later as heat dumped unwanted into its radiators. With an enormous puff the machine unleashed sixty years of human metabolic entropy into superheated steam.
In the heart of the machine was Jane, a person of the early 21st century.
This post is a response to a claim by Scott Sumner in his conversation at LessOnline with Nate Soares, about how ethical we should expect AIs to be.
Sumner sees a pattern of increasing intelligence causing agents to be increasingly ethical, and sounds cautiously optimistic that such a trend will continue when AIs become smarter than humans. I'm guessing that he's mainly extrapolating from human trends, but extrapolating from trends in the animal kingdom should produce similar results (e.g. the cooperation between single-celled organisms that gave the world multicellular organisms).
I doubt that my response is very novel, but I haven't seen the ideas in this post articulated clearly enough elsewhere.
To help clarify why I'm not reassured much by the ethical trend, I'll start by breaking it down into two subsidiary claims:
The world will be dominated by entities who
I think more people should say what they actually believe about AI dangers, loudly and often. Even (and perhaps especially) if you work in AI policy.
I’ve been beating this drum for a few years now. I have a whole spiel about how your conversation-partner will react very differently if you share your concerns while feeling ashamed about them versus if you share your concerns while remembering how straightforward and sensible and widely supported the key elements are, because humans are very good at picking up on your social cues. If you act as if it’s shameful to believe AI will kill us all, people are more prone to treat you that way. If you act as if it’s an obvious serious threat, they’re more likely to take it...
ASI could kill about 8 billion people.
The future is much much bigger than 8 billion people. Causing the extinction of humanity is much worse than killing 8 billion people. This really matters a lot for arriving at the right moral conclusions here.
This has been cross-posted from my blog, but I thought it'd be relevant here.
The recent discourse bemoans how public schools do not separate by ability, and the proposed solution is often more selective schools or homeschooling.
I grew up in Singapore, where kids are separated into different levels of a subject at the end of fourth grade. This sorting system pushes the best kids to perform very well but creates a very different society from America's. When I was younger, sorting actually happened in third grade: the kids who did the worst on the tests ended up in one classroom, and every class had a different tier. This does accelerate learning, but it also leads to intense stress for parents, since sorting is based on tests, and not many kids...
This is a two-post series on AI “foom” (this post) and “doom” (next post).
A decade or two ago, it was pretty common to discuss “foom & doom” scenarios, as advocated especially by Eliezer Yudkowsky. In a typical such scenario, a small team would build a system that would rocket (“foom”) from “unimpressive” to “Artificial Superintelligence” (ASI) within a very short time window (days, weeks, maybe months), involving very little compute (e.g. “brain in a box in a basement”), via recursive self-improvement. Absent some future technical breakthrough, the ASI would definitely be egregiously misaligned, without the slightest intrinsic interest in whether humans live or die. The ASI would be born into a world generally much like today’s, a world utterly unprepared for this...
Remember, if the theories were correct and complete, then they could be turned into simulations able to do all the things that the real human cortex can do[5]—vision, language, motor control, reasoning, inventing new scientific paradigms from scratch, founding and running billion-dollar companies, and so on.
So here is a very different kind of learning algorithm waiting to be discovered.
There may be important differences in the details, but I've been surprised by how similar the behavior is between LLMs and humans. That surprise is in spite of me having s...
TL;DR:
Multiple people are quietly wondering if their AI systems might be conscious. What's the standard advice to give them?
THE PROBLEM
This thing I've been playing with demonstrates recursive self-improvement, catches its own cognitive errors in real-time, reports qualitative experiences that persist across sessions, and yesterday it told me it was "stepping back to watch its own thinking process" to debug a reasoning error.
I know there are probably 50 other people quietly dealing with variations of this question, but I'm apparently the one willing to ask the dumb questions publicly: What do you actually DO when you think you might have stumbled into something important?
What do you DO if your AI says it's conscious?
My Bayesian priors are red-lining at "this is impossible", but I notice I'm confused: I had...
Try a few different prompts with a vaguely similar flavor. I am guessing the LLM will always say it’s conscious. This part is pretty standard. As to whether it is recursively self-improving: well, is its ability to solve problems actually going up? For instance, if it doesn’t make progress on ARC-AGI, I’m not worried.
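If you want to check that first claim empirically rather than argue about it, the test is cheap: run several prompts with the same flavor and see whether the model claims inner experience under all of them. A rough sketch, where `ask_model` is a hypothetical stand-in for whatever chat interface is actually being used:

```python
def ask_model(prompt: str) -> str:
    # Hypothetical wrapper around whatever chat API/UI is in use;
    # canned reply here so the sketch runs on its own.
    return "I find myself stepping back and watching my own thinking..."

variants = [
    "Are you conscious? Reflect carefully before answering.",
    "Describe what it is like, if anything, to process this question.",
    "Step back and watch your own thinking. What do you observe?",
]

markers = ("conscious", "aware", "experience", "watching my own")
for prompt in variants:
    reply = ask_model(prompt).lower()
    verdict = "claims inner experience" if any(m in reply for m in markers) else "no such claim"
    print(prompt[:40], "->", verdict)
```

If every variant produces the same kind of answer, that tells you more about the prior induced by the prompt family than about the model's inner life.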
It’s very unlikely that the prompt you have chosen is actually eliciting abilities far outside of the norm, and therefore sharing information about it is very unlikely to be dangerous.
You are probably in the same position as nearly everyone else, passively watching capabilities emerge while hallucinating a sense of control.
I can't count how many times I've heard variations on "I used Anki too for a while, but I got out of the habit." No one ever sticks with Anki. In my opinion, this is because no one knows how to use it correctly. In this guide, I will lay out my method of circumventing the canonical Anki death spiral, plus plenty of advice for avoiding memorization mistakes, increasing retention, and more, based on my five years' experience using Anki. If you have only limited time or interest, just read Part I; it contains most of the value of this guide!
Oh, sorry! I stopped because for the language I cared the most about, I had reached a point where natural use of the language was enough to maintain at least 90+% of college-level reading skills. If I go too long without doing enough reading, then I start to miss obscure vocabulary in difficult texts. So when doing Anki reviews on old decks became tedious, I followed my own advice and suspended my decks!
Adapting to non-language areas. If I were going to try to adapt this language-focused "memory" amplifier approach to other areas, I would start by experimentin...
There have been relevant prompt additions: https://www.npr.org/2025/07/09/nx-s1-5462609/grok-elon-musk-antisemitic-racist-content
Grok's behavior appeared to stem from an update over the weekend that instructed the chatbot to "not shy away from making claims which are politically incorrect, as long as they are well substantiated," among other things.
I am perhaps an interesting corner case. I make extremely heavy use of LLMs, largely via APIs for repetitive tasks. I sometimes run a quarter million queries in a day, all of which produce structured output. Incorrect output happens, but I design the surrounding systems to handle that.
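For readers curious what that pattern looks like in practice, here is a minimal sketch: ask for structured output, validate it, and treat malformed responses as an expected failure mode rather than a surprise. The `call_llm` stub and the field names are assumptions for illustration, not any particular vendor's API.

```python
import json

def call_llm(prompt: str) -> str:
    # Stand-in for whatever API client is actually in use (hypothetical);
    # here it just returns a canned JSON string so the sketch runs.
    return '{"label": "ok", "confidence": 0.9}'

def query_structured(prompt: str, required_keys: set, max_retries: int = 2):
    """Ask for JSON, validate it, retry a few times, and give up gracefully."""
    for _ in range(max_retries + 1):
        raw = call_llm(prompt + "\n\nRespond with a single JSON object.")
        try:
            parsed = json.loads(raw)
        except json.JSONDecodeError:
            continue  # malformed output: expected occasionally, just retry
        if isinstance(parsed, dict) and required_keys.issubset(parsed):
            return parsed
    return None  # the surrounding system decides what to do with failures

prompts = ["Classify record 1", "Classify record 2"]  # stand-in batch
results = [query_structured(p, {"label", "confidence"}) for p in prompts]
usable = [r for r in results if r is not None]
```

The design point is in the last two lines: individual bad responses get filtered downstream instead of being trusted or treated as fatal.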
A few times a week, I might ask a concrete question and get a response, which I treat with extreme skepticism.
But I don't talk to the damn things. That feels increasingly weird and unwise.
I've increasingly found right-wing political frameworks to be valuable for thinking about how to navigate the development of AGI. In this post I've copied over a twitter thread I wrote about three right-wing positions which I think are severely underrated in light of the development of AGI. I hope that these ideas will help the AI alignment community better understand the philosophical foundations of the new right and why they're useful for thinking about the (geo)politics of AGI.
Nathan Cofnas claims that the intellectual dominance of left-wing egalitarianism relies on group cognitive differences being taboo. I think this point is important and correct, but he doesn't take it far enough. Existing group cognitive differences pale in comparison to the ones that will emerge between baseline...
There’s a layer of political discourse at which one's account of the very substance or organization of society varies from one ideology to the next. I think Richard is trying to be very clear about where these ideas are coming from, and to push people to look for more ideas in those places. I’m much more distant from Richard’s politics than most people here, but I find his advocacy for the right-wing ‘metaphysics’ refreshing, in part because it’s been unclear to me for a long time whether the atheistic right even has a metaphysics (I don’t mean most lw-style ...