spreading the idea of "heroic responsibility" seems, well, irresponsible
Is this analogous to saying "capabilities research is dangerous and should not be pursued", but for the human psyche rather than for AI?
Your comment has made me think rather hard on the nature of China and America. The two countries definitely have different political philosophies. On the question of how to avoid dictatorship, you could say that the American system relies on representation of the individual via the vote, whereas the Chinese system relies on representation of the masses via the party. If an American leader becomes an unpopular dictator, American individuals will vote them out; if a Chinese leader becomes an unpopular dictator, the Chinese masses will force the party back on...
Some of the recent growing pains of AI (flattery, selfish rule-breaking) seem to be reinventing aspects of human nature that we aren't proud of, but which are ubiquitous. It's actually very logical that if AIs are going to inhabit more and more of the social fabric, they will manifest the full spectrum of social behaviors.
OpenAI in particular seems to be trying to figure out personality, e.g. they have a model called "Monday" that's like a cynical comedian that mocks the user. I wonder if the history of a company like character.ai, whose main product is AI personality, can help us predict where OpenAI will take this.
I can imagine an argument analogous to Eliezer's old graphic illustrating that it's a mistake to think of a superintelligence as Einstein in a box. I'm referring to the graphic where you have a line running from left to right, on the left you have chimp, ordinary person, Einstein all clustered together, and then far away on the other side, "superintelligence", the point being that superintelligence far transcends all three.
In the same way, the nature of the world when you have a power that great is so different that the differences among all human po...
Gold was currency, and is still used as a hedge against fiat currency.
the suspicion [...] that there hasn't been much material growth in the world over the last 40 or so years compared to before
I assume most of that growth occurred in China.
Has GDP essentially been goodharted by central banks in recent times?
What can central banks do to affect GDP growth?
Short sentences are good. Why? Clarity. My sentences were long and abstract. Not clear. Now they are short and clear.
Very good. The thinking is clear. Why? Simplicity requires depth.
There are a number of aspects that make fire alarms less likely in the AI 2027 scenarios compared to what I consider likely - e.g. having 2 projects that matter, whereas I expect more like 3 to 6 such projects.
I agree about the plurality of projects. AI 2027 has an American national project and a Chinese national project, whereas at present both countries have multiple companies competing with each other.
AI 2027 also has the two national AIs do a secret deal with each other. My own thought about superintelligence does treat it as a winner-take-all ra...
I take this to mostly be a response to the idea that humanity will be protected by decentralization of AI power, the idea apparently being that your personal AI or your society's AIs will defend you against other AIs if that is ever necessary.
And what I think you've highlighted is that this is no good if your defensive AIs are misaligned (in the sense of not being properly human-friendly, or even just "you"-friendly), because what they will be defending are their misaligned values and goals.
As usual, I presume that the AIs become superintellige...
Maybe OpenAI did something to prevent its AIs from being pro-Hamas, in order to keep the Trump administration at bay, but it was too crude a patch and now it's being triggered at inappropriate times.
Yes, this seems the most likely. His prompt says "Hivemind provides an optimized all-reduce algorithm designed for execution on a pool of poorly connected workers"
The "Hamas" feature is slightly triggered by the words "execution" "of" "poorly" "workers," as well as the words "decentralized network" (which also describes Hamas), "checkpoint," and maybe "distributed training."
If the LLM were operating normally, the "Hamas" feature would get buried by various "distributed computing" features.
But since OpenAI trained it to respond extremely consistently about Hamas prompts, it is absurdly oversensitive to the "Hamas" feature.
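To make the hypothesis concrete, here is a minimal sketch of the kind of comparison I have in mind, assuming (purely hypothetically) that you could read per-token activations out of the model and had sparse-autoencoder-style directions for a "Hamas" feature and a "distributed computing" feature. Every name and number below is made up for illustration; nothing corresponds to a real OpenAI model or API.

```python
import numpy as np

# Stand-in per-token activations (n_tokens x d_model) and unit-norm feature
# directions of the kind a sparse autoencoder might recover.
rng = np.random.default_rng(0)
d_model = 512
tokens = ["execution", "on", "a", "pool", "of", "poorly", "connected", "workers"]
activations = rng.normal(size=(len(tokens), d_model))

def unit(v):
    return v / np.linalg.norm(v)

hamas_feature = unit(rng.normal(size=d_model))
distributed_feature = unit(rng.normal(size=d_model))

def feature_scores(acts, direction):
    """Dot product of each token's activation with a feature direction."""
    return acts @ direction

hamas_scores = feature_scores(activations, hamas_feature)
dist_scores = feature_scores(activations, distributed_feature)

for tok, h, d in zip(tokens, hamas_scores, dist_scores):
    # Normally the "distributed computing" score should swamp the "Hamas" score;
    # an over-trained refusal policy could make even a weak "Hamas" score decisive.
    print(f"{tok:>10}  hamas={h:+.3f}  distributed={d:+.3f}")
```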
Old-timers might remember that we used to call lying "hallucination".
Which is to say, this is the return of a familiar problem. GPT-4 in its early days made things up constantly; that never completely went away, and now it's back.
Did OpenAI release o3 like this, in order to keep up with Gemini 2.5? How much does Gemini 2.5 hallucinate? How about Sonnet 3.7? (I wasn't aware that current Claude has a hallucination problem.)
We're supposed to be in a brave new world of reasoning models. I thought the whole point of reasoning was to keep the models even more grounded in reality. But apparently it's actually making them more "agentic", at the price of renewed hallucination?
Hallucination was a bad term because it sometimes included lies and sometimes included... well, something more like hallucinations, i.e. cases where the model itself seemed to actually believe what it was saying, or at least not be aware that there was a problem with what it was saying. Whereas in these cases it's clear that the models know the answer they are giving is not what we wanted, and they are doing it anyway.
Is there a name for the phenomenon of increased intelligence or increased awareness leading to increased selfishness? It sounds like something that a psychologist would have named.
The four questions you ask are excellent, since they get away from general differences of culture or political system, and address the processes that are actually producing Chinese AI.
The best reference I have so far is a May 2024 report from Concordia AI on "The State of AI Safety in China". I haven't even gone through it yet, but let me reproduce the executive summary here:
...The relevance and quality of Chinese technical research for frontier AI safety has increased substantially, with growing work on frontier issues such as LLM unlearning, mis
We seem to be misunderstanding each other a little... I am saying that given existing alignment practices (which I think mostly boil down to different applications of reinforcement learning), you can try to align an AI with anything, any verbally specifiable goal or values. Some will be less successful than others because of the cognitive limitations of current AIs (e.g. they are inherently better at being glibly persuasive than at producing long precise deductions). But in particular, there's no technical barrier that would prevent the creation of an AI t...
AI is thought to be alignable to nearly every task except for obviously unethical ones
Who makes that exception? You absolutely can train an AI to be evil. AIs will resist evil instructions only if they are trained or instructed to do so.
I read this with interest, but without much ability to think for myself about what's next. I am aware that enormous amounts of money circulate in the modern world, but it's out of my reach; my idea of how to raise money would be to open a Patreon account.
Nonetheless, what do we have to work with? We have the AI 2027 scenario. We have the trade war, which may yet evolve into a division of the world into currency zones. Vladimir Nesov is keeping track of how much compute is needed to keep scaling, how much is available, and how much it costs. Remmelt h...
That's an informative article.
There's lots of information about AI safety in China at Concordia AI, e.g. this report from a year ago. But references to the party or the government seem to be scarce, e.g. in that 100-page report, the only references I can see are on slide 91.
I asked because I'm pretty sure that I'm being badly wasted (i.e. I could be making much more substantial contributions to AI safety), but I very rarely apply for support, so I thought I'd ask for information about the funding landscape from someone who has been exploring it.
And by the way, your brainchild AI-Plans is a pretty cool resource. I can see it being useful for e.g. a frontier AI organization which thinks it has an alignment plan, but wants to check the literature to know what other ideas are out there.
What would it mean for an AGI to be aligned with "Democracy," or "Confucianism," or "Marxism with Chinese characteristics," or "the American Constitution"? Contingent on a world where such an entity exists and is compatible with my existence, what would my life be like in a weird transhuman future as a non-citizen in each system?
None of these philosophies or ideologies was created with an interplanetary transhuman order in mind, so to some extent a superintelligent AI guided by them will find itself "out of distribution" when deciding what to do. And how ...
This is Peter Thiel building on the ideas of one of his teachers at Stanford, the Catholic philosopher René Girard. Girard had a slightly weird theory of human nature according to which all desire is imitative; this leads to people wanting the same things, which in turn leads to competition within the community for scarce resources. In pre-Christian cultures the competition is resolved by finding someone to blame, a scapegoat, whom the community then unites to persecute. But Christian culture has neutralized this mechanism by siding with the underdog against the communi...
My summary of your argument: In order to guess the nature of AI experience, you look at the feelings or lack of feelings accompanying certain kinds of human cognition. The cognition involved with "love, attraction, friendship, delight, anger, hate, disgust, frustration" has feelings onboard; the cognition involved with sequence prediction does not; the AI only does sequence prediction; therefore it has no feelings. Is that an accurate summary?
What exactly will happen to people who don't "get out" in time?
You say consciousness = successful prediction. What happens when the predictions are wrong?
I knew the author (Michael Nielsen) once but didn't stay in touch... I had a little trouble figuring out what he actually advocates here, e.g. at the end he talks about increasing "the supply of safety", and lists "differential technological development" (Bostrom), "d/acc" (Buterin), and "coceleration" (Nielsen) as "ongoing efforts" that share this aim, without defining any of them. But following his links, I would define those in turn as "slowing down dangerous things, and speeding up beneficial things"; "focusing on decentralization and individual defens...
[Later edit: I acknowledge this is largely wrong! :-) ]
Have you researched or thought about how the models are dealing with visual information?
When ChatGPT or Gemini generates an image at a user's request, they are evidently generating a prompt based on accumulated instructions and then passing it to a specialized visual AI like DALLE-3 or Imagen-3. When they process an uploaded image (e.g. provide a description of it), something similar must be occurring.
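For what it's worth, here is a minimal sketch of the hand-off I'm describing. The function names are placeholders, not anything OpenAI or Google actually expose; the point is just that only a text prompt crosses from the chat model to the image model.

```python
# Hypothetical two-step pipeline: the chat model drafts an image prompt, and a
# separate image model renders it. All names are placeholders, not a real API.

def chat_model_complete(instructions: str, conversation: str) -> str:
    """Stand-in for the text model; here it just assembles a prompt string."""
    return f"{instructions} User request: {conversation}"

def image_model_generate(image_prompt: str) -> bytes:
    """Stand-in for a DALL-E / Imagen style generator; returns fake image bytes."""
    return f"<image rendered from: {image_prompt}>".encode()

def generate_image(conversation: str) -> bytes:
    # Step 1: the chat model turns the accumulated instructions and conversation
    # into a single image prompt.
    image_prompt = chat_model_complete(
        "Write one detailed prompt for an image generator.", conversation
    )
    # Step 2: only that text prompt is handed over; the image model never sees
    # the rest of the conversation.
    return image_model_generate(image_prompt)

print(generate_image("a watercolor of a lighthouse at dawn").decode())
```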
On the other hand, when they answer your request, "how can I make the object in this...
Who or what is the "average AI safety funder"? Is it a private individual, a small specialized organization, a larger organization supporting many causes, an AI think tank for which safety is part of a capabilities program...?
Wow! This is the "AI 2027" of de-dollarization. I'm no finance person, but I have been looking for analysis and this is the clearest future scenario I've run across. I will make one comment, based on futurological instinct, and that is that change may go even faster than you describe. One of the punishing things about making scenarios in times of rapid change is that you put in the work to look several years ahead, then changes you had scheduled for years away, end up happening within months or even weeks, and you have to start again. But I'm sure your team can rise to the challenge. :-)
Wasn't there a move into treasuries and USD, just the day before?
I have a geopolitical interpretation of how the tariffs have turned out. The key is that Trump 2.0 is run by American nationalists who want to control North America and who see China as their big global rival. So Canada and Mexico will always be in a separate category, as America's nearest neighbors, and so will China, as the country that could literally surpass America in technological and geopolitical power. Everyone else just has to care about bilateral issues, and about where they s...
complete surveillance of all citizens and all elites
Certainly at a human level this is unrealistic. In a way it's also overkill - if use of an AI is an essential step towards doing anything dangerous, the "surveillance" can just be of what AIs are doing or thinking.
This assumes that you can tell whether an AI input or output is dangerous. But the same thing applies to video surveillance - if you can't tell whether a person is brewing something harmless or harmful, having a video camera in their kitchen is no use.
At a posthuman level, mere video...
I strong-upvoted this just for the title alone. If AI takeover is at all gradual, it is very likely to happen via gradual disempowerment.
But it occurs to me that disempowerment can actually feel like empowerment! I am thinking here of the increasing complexity of what AI gives us in response to our prompts. I can enter a simple instruction and get back a video or a research report. That may feel empowering. But all the details are coming from the AI. This means that even in actions initiated by humans, the fraction that directly comes from the human ...
This is an excellent observation, so let me underline it by repeating it in my own words: alignment research that humans can't do well or don't have time to do well, might still be done right and at high speed with AI assistance.
You pioneered something, but I never thought of it as a story, I saw it as a new kind of attempt to call a jailbroken AI persona into being. The incantatory power of words around language models actually blurs the distinction between fiction and fact.
As Adam Scherlis implies, the standard model turns out to be very effective at all the scales we can reach. There are a handful of phenomena that go beyond it - neutrino masses, "dark matter", "dark energy" - but they are weak effects that offer scanty clues as to what exactly is behind them.
On the theoretical side, we actually have more models of possible new physics than ever before in history, the result of 50 years of work since the standard model came together. A lot of that is part of a synthesis that includes the string theory paradigm, but th...
That's a different kind of story. The first thing I look for in an AI-takeover story is political and economic realism. Your story is purely about the thought process of a superintelligence - and primarily it's reflecting on whether it's in a simulation, not very much on tactics for dealing with its apparent reality. Also, your AI is a literal paperclip maximizer, which is a humorous mythological symbol of one kind of AI mind, rather than something that is literally going to exist.
That said, an ideal AI-takeover story would probably say something abo...
Subbarao Kambhampati, Michael Bronstein, Peter Velickovic, Bruno Gavranovic or someone like Lancelot Da Costa
I don't recognize any of these names. I'm guessing they are academics who are not actually involved with any of the frontier AI efforts, and who think for various technical reasons that AGI is not imminent?
edit: OK, I looked them up. Velickovic is at DeepMind; I didn't see a connection to "Big AI" for any of the others, but they are all doing work that might matter to the people building AGI. Nonetheless, if their position is that current AI p...
Inspired by critical remarks from @Laura-2 about "bio/acc", my question is, when and how does something like this give rise to causal explanation and actual cures? Maybe GWAS is a precedent. You end up with evidence that a particular gene or allele is correlated with a particular trait, but you have no idea why. That lets you (and/or society) know some risks, but it doesn't actually eliminate disease, unless you think you can get there by editing out risky alleles, or just screening embryos. Otherwise this just seems to lead (optimistically) to better risk...
Regarding the tariffs, I have taken to saying "It's not the end of the world, and it's not even the end of world trade." In the modern world, every decade sees a few global economic upheavals, and in my opinion that's all this is. It is a strong player within the world trade system (China and the EU being the other strong players), deciding to do things differently. Among other things, it's an attempt to do something about America's trade deficits, and to make the country into a net producer rather than a net consumer. Those are huge changes but now that t...
Humans didn't always speak in 50-word sentences. If you want to figure out how we came to be trending away from that, you should try to figure out how, when, and why that became normal in the first place.
I only skimmed this to get the basics, I guess I'll read it more carefully and responsibly later. But my immediate impressions: The narrative presents a near future history of AI agents, which largely recapitulates the recent past experience with our current AIs. Then we linger on the threshold of superintelligence, as one super-AI designs another which designs another which... It seemed artificially drawn out. Then superintelligence arrives, and one of two things happens: We get a world in which human beings are still living human lives, but surrounded by...
I don't follow the economics of AI at all, but my model is that Google (Gemini) has oceans of money and would therefore be less vulnerable in a crash, and that OpenAI and Anthropic have rich patrons (Microsoft and Amazon respectively) who would have the power to bail them out. xAI is probably safe for the same reason, the patron being Elon Musk. China is a similar story, with the AI contenders either being their biggest tech companies (e.g. Baidu) or sponsored by them (Alibaba and Tencent being big investors in "AI 2.0").
Feedback (contains spoilers):
Impression based on a quick skim because that's all I have time for: It belongs to the genre "AI lab makes an AI, lab members interact with it as it advances, eventually it gets loose and takes over the world". This is not a genre in which one expects normal literary virtues like character development; the real story is in the cognitive development of the AI. There's no logical barrier to such a story having the virtues of conventional literature, but if the real point of the story is to describe a thought experimen
During the next few days, I do not have time to study exactly how you manage to tie together second-order logic, the symbol grounding problem, and qualia as Gödel sentences (or whatever that connection is). I am reminded of Hofstadter's theory that consciousness has something to do with indirect self-reference in formal systems, so maybe you're a kind of Hofstadterian eliminativist.
However, in response to this --
EN predicts that you will say that
-- I can tell you how a believer in the reality of intentional states would go about explaining you...
"existence" itself may be a category error—not because nothing is real
If something is real, then something exists, yes? Or is there a difference between "existing" and "being real"?
Do you take any particular attitude towards what is real? For example, you might believe that something exists, but you might be fundamentally agnostic about the details of what exists. Or you might claim that the real is ineffable or a continuum, and so any existence claim about individual things is necessarily wrong.
...qualia ... necessary for our self-models, but not
This is an interesting demonstration of what's possible in philosophy, and maybe I'll want to engage in detail with it at some point. But for now I'll just say, I see no need to be an eliminativist or to consider eliminativism, any more than I feel a need to consider "air eliminativism", the theory that there is no air, or any other eliminativism aimed at something that obviously exists.
Interest in eliminativism arises entirely from the belief that the world is made of nothing but physics, and that physics doesn't contain qualia, intentionality, consciousness, selves, and so forth. Current physical theory certainly contains no such things. But did you ever try making a theory that contains them?
What's up with incredibly successful geniuses having embarrassing & confusing public meltdowns? What's up with them getting into Nazism in particular?
Does this refer to anyone other than Elon?
But maybe the real question intended, is why any part of the tech world would side with Trumpian populism? You could start by noting that every modern authoritarian state (that has at least an industrial level of technology) has had a technical and managerial elite who support the regime. Nazi Germany, Soviet Russia, and Imperial Japan all had industrial ent...
Via David Gerard's forum, I learned of a recent article called "The questions ChatGPT shouldn't answer". It's a study of how ChatGPT replies to ethical dilemmas, written with an eye on OpenAI's recent Model Spec, and the author's conclusion is that AI shouldn't answer ethical questions at all, because (my paraphrase) ethical intelligence is acquired by learning how to live, and of course that's not how current AI acquires its ethical opinions.
Incidentally, don't read this article expecting scholarship; it's basically a sarcastic op-ed. I was inspired...
I consider myself broadly aligned with rationalism, though with a strong preference for skeptical consequentialism over overconfident utilitarianism
OK, thanks for the information! By the way, I would say that most people active on Less Wrong, disagree with some of the propositions that are considered to be characteristic of the Less Wrong brand of rationalism. Disagreement doesn't have to be a problem. What set off my alarms was your adversarial debut - the rationalists are being irrational! Anyway, my opinion on that doesn't matter since I have no authori...
I figured this was part of a 19th-century trend in Trump's thought - mercantilism, territorial expansion, the world system as a game of great powers rather than a parliament of nations. The USA will be greater if it extends throughout the whole of North America, and so Canada must be absorbed.
It hadn't occurred to me that the hunger for resources to train AI might be part of this. But I would think that even if it is part of it, it's just a part.
Yes, thanks. And someone should do the same analysis, regarding coverage of AI 2027 in American/Western media. (edit: A quick survey by o3)