There's a tweet (1,564 likes as I write this) making the rounds that I think is at least half false. Since I don't have a Twitter/X account, I will reply here. The tweet says
Every day I get reminded of the story of how KPD and SPD members would clap when a member of the other party would come into the concentration camps
quote tweeting this tweet:
Not a fan of Tr*mp's to say the least but so far it is unclear there is anyone in his administration as monstrous as Biden's Middle East team.
The source for the KPD-SPD claim seems to be this earlier tweet from February (2,051 likes):
When the first concentration camps for political prisoners were created in Nazi Germany between 33-34, SPD deputy Gerhard Seger reported that KPD prisoners would cheer when the prison guards announced new prisoners of the SPD had arrived and vice-versa
When asked about the source, the author of that February tweet claimed that
You can find it in Seger's A Nation Terrorized, which is his personal report on one of the first concentration camps
A Nation Terrorized is the English title of Oranienburg. Erster authentischer Bericht eines aus dem Konzentrationslager Geflüchteten (my translation: Oranienburg: First Authentic Report from a Concentration Camp Escapee). The relevant passage reads, in full:
One evening at roll call, Sturmbannführer Krüger stepped before the ranks of prisoners and announced that the next day the "complete social democratic bigwig Fritz Ebert" would be delivered, this Marxist swine who belonged to the November criminals who had plunged Germany into disaster, and well, the SA would take care of this pig.
What happened after this speech with its ominous announcement at the end?
Loud cheers of "Bravo!" rang out from the ranks of the communist prisoners!
The communists in question, themselves victims of the SA charlatan standing before them, noisily took the side of their own party enemies, applauding when this National Socialist promised to take action against a Social Democrat!
I can find no other evidence of cheering or celebration of political opponents entering a concentration camp, and no other reports in A Nation Terrorized to that effect. So the original account seems wrong in a few ways: Seger describes a single incident, not a recurring practice; only the communist prisoners cheered, so there was no "vice-versa"; and they were applauding the announcement that the SA would "take care of" a specific Social Democrat, not the arrival of new SPD prisoners as such.
Also, not to excuse the communists, but Fritz Ebert was not just any member of the SPD, he was the son of Friedrich Ebert who in the late 1910s had allied with conservative military forces and right-wing Freikorps units to violently suppress the communists, and who was plausibly most responsible for the murders of Rosa Luxemburg and Karl Liebknecht. The median SPD member would not have gotten the same reception. And finally, the events Seger reports happened in 1933 when the SPD-KPD rivalry was still fresh, and most of the Nazis' worst misdeeds had yet to happen -- for example, it was five years before Kristallnacht.
An enormous, unconscionable amount of information shared on Twitter/"TPOT" is like this: plausible-sounding anecdotes that get stretched and pixelated through legions of cross-platform and intra-platform quote-tweeting.
At my job on the compute policy team at IAPS, we recently started a Substack that we call The Substrate. I think this could be of interest to some here, since I quite often see discussions on LessWrong around export controls, hardware-enabled mechanisms, security, and other compute-governance-related topics.
Here are the posts we've published so far:
To make this quick take not merely an advertisement, I would also be happy to discuss anything about any of these posts, and/or to hear suggestions for things that we should write about.
I wrote a post forecasting Chinese compute acquisition in 2026. The very short summary is that I expect about 60% of that compute (in B300-equivalents) to come from legally imported NVIDIA H200s, with domestically produced Huawei Ascends accounting for about 25%, and the remainder being smuggled AI chips and Ascends illegally fabricated outside China via proxies.
While China likely produces GPU dies in quite large quantities, it is probably bottlenecked by an HBM (high-bandwidth memory) shortage, which limits the total number of Ascend 910Cs and other AI chips that can actually be assembled. I do expect domestic production to grow substantially in 2027 and 2028, as CXMT ramps up HBM production.
In total, I expect China to acquire about 320,000 B300-equivalents (90% CI: 150,000 to 600,000) in 2026, enough to train about six Grok-4-scale models simultaneously. By comparison, the Stargate campus that Oracle has been building for OpenAI in Abilene, Texas will alone house over 300,000 B300-equivalents.
Some Chinese companies are also renting AI chips from non-Chinese cloud providers. For example, according to SemiAnalysis, ByteDance is Oracle’s largest customer; their largest joint cluster, located in Southeast Asia, will perhaps reach about 250,000 B300-equivalents this summer. (I don’t count remote access as “acquisition” since there is no ownership.)
NB. These estimates are quite rough, so take them with a grain of salt. But I think they give a good sense of the general size of these different pathways.
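To make the pathway split concrete, here is a minimal sketch of how the shares translate into chip counts. The 320,000 B300-equivalent total and the percentage shares are from the summary above; the per-chip B300-equivalence factors are hypothetical placeholders (not the post's actual conversion rates), so the resulting chip counts are purely illustrative.

```python
# Back-of-envelope split of the 2026 estimate into pathways.
# Total and shares are from the post; the conversion factors are
# HYPOTHETICAL placeholders, not a real B300-equivalence table.
TOTAL_B300_EQ = 320_000

shares = {
    "H200 (legal imports)": 0.60,
    "Huawei Ascend (domestic)": 0.25,
    "smuggled / proxy-fabbed": 0.15,
}

# Hypothetical: how many B300-equivalents one chip of each kind counts as.
b300_eq_per_chip = {
    "H200 (legal imports)": 0.25,
    "Huawei Ascend (domestic)": 0.20,
    "smuggled / proxy-fabbed": 0.25,
}

for pathway, share in shares.items():
    eq = TOTAL_B300_EQ * share
    chips = eq / b300_eq_per_chip[pathway]
    print(f"{pathway}: {eq:,.0f} B300-eq (~{chips:,.0f} chips at the assumed ratio)")
```

With the post's actual conversion factors substituted in, this is the table the commenter below asks for.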
Bit of feedback: would be helpful if you explicitly stated your estimated number of H200 and Huawei chips and/or provide a B300-eq conversion table, so they are more comparable to other reports that are quoted just in number of chips. I understand how you do the conversion but it is not super apparent in the post.
Here's what I usually try when I want to get the full text of an academic paper:
1. Give it the paper's DOI link (e.g., https://doi.org/...) and then, if that doesn't work, give it a link to the paper's page at an academic journal (e.g., https://www.sciencedirect.com/science...).
2. Google "name of paper in quotes" filetype:pdf.
3. If that fails, search for "name of paper in quotes" and look at a few of the results if they seem promising. (Again, I may find a different version of the paper than the one I was looking for, which is usually but not always fine.)

I would add Semantic Scholar to the list. It gives consistently better search results than Google Scholar and has a better interface. I've also found a really difficult-to-find paper on pre-print websites once or twice.
Thanks for the suggestion! I'll be trying it out and adding it to the list if I find it useful.
Some argue that even without misaligned AI, humanity could lose control of societal systems simply by delegating more and more to AI. Humans would delegate because these future AIs are more capable and faster than humans, and because competitive dynamics push everyone to delegate further, until eventually humans have no control over these societal systems.
Delegation ≠ loss of control, though. A principal can delegate to an agent while maintaining control and seeing what the agent does. CEOs and managers obviously do this all the time. So to go from "strong incentive to delegate" to "loss of control", you may need to also argue that humans will be unable to meaningfully oversee what the AIs do, e.g., because those AIs are too fast and their actions are too complicated for humans to understand. (Again, we're assuming these AIs are intent-aligned, so modulo information, humans can retain control over the AIs.)
I guess to me it isn't at all obvious that all humans would in fact delegate everything to AIs when that means giving up meaningful control. First, there may well exist methods to better aggregate and abstract information for humans so that they can understand enough of what the AIs are doing. Second, most humans would probably be reluctant to give up meaningful control when delegating -- e.g., a CEO would likely be more reluctant to delegate a task or role if they have reason to think they will have no insight into how it's done, or no ability to meaningfully control the employee -- and this seems like it should move the equilibrium away a bit from "delegate everything", even with competitive pressure. But unless all humans do so delegate, some humans will retain meaningful control over the AIs, and arguments about gradual disempowerment look more like arguments about concentration of power.
If CEOs (and boards) are also AIs, the analogy breaks. Humans are currently necessary in such positions; their necessity is sufficient to explain the fact that they are there at all, even if there are other reasons their presence might be a good thing. The situation changes when a system won't break down without humans in positions of power: it's not clear that these other reasons have any teeth in practice.
This doesn't need to be the case, but only in a sense similar to how humanity doesn't need to build AGIs before it's ready. It's a new affordance, and there is a danger that it gets used irresponsibly and leads to bad outcomes. There should be some understanding of how specifically this won't be happening.
For a legally constituted corporation, the role of CEO is not only one of decision-maker, but also blame-taker: if the company goes into decline, the CEO can be fired; if the company commits a sufficiently serious crime, the CEO can be prosecuted and punished (think Jeffrey Skilling of Enron). The presence of a human whose reputation (and possibly freedom) depends on the business's conduct conveys some trustworthiness to other humans (investors, trading partners, creditors).
If a company has an AI agent for its top-level decision-maker, then those decisions are made without this kind of responsibility for the outcome. An AI agent cannot meaningfully be fired or punished; it can be turned off, and some chatbot characters sometimes act to avoid such a fate; but I don't think investors would be wise to count on that.
Now, what about a non-legally-constituted entity, or even a criminal one? Criminal gangs do rely on a big boss to adjudicate disputes, set strategy, and to risk taking a fall if things go sour. But online criminal groups like ransomware gangs or darknet marketplaces might be able to rest their reputation solely on performance rather than on the ability for a human big boss to fall or be punished. I don't know enough about the sociology of these groups to say.
The question with intent alignment is: intent aligned with whom? If the AI executive is intent aligned with (follows orders from) the government, and the human government is voluntarily replaced with an AI government, we are left with an AI that is intent aligned with another AI.
Has anyone else noticed a thing recently (the past couple of days) where Claude is extremely reluctant to search the web, and instead is extremely keen to search past conversations or Google Drive and other nonsense like that? Even after updating my system prompt to encourage the former and discourage the latter, it still defaults to the latter. Also, instead of using the web search tool it will sometimes try and fail to search using curl and code execution. Is this just me or is anyone else experiencing similar issues?
I'm really confused by this passage from The Six Mistakes Executives Make in Risk Management (Taleb, Goldstein, Spitznagel):
We asked participants in an experiment: “You are on vacation in a foreign country and are considering flying a local airline to see a special island. Safety statistics show that, on average, there has been one crash every 1,000 years on this airline. It is unlikely you’ll visit this part of the world again. Would you take the flight?” All the respondents said they would.
We then changed the second sentence so it read: “Safety statistics show that, on average, one in 1,000 flights on this airline has crashed.” Only 70% of the sample said they would take the flight. In both cases, the chance of a crash is 1 in 1,000; the latter formulation simply sounds more risky.
One crash every 1,000 years is only the same as one crash in 1,000 flights if the airline operates exactly one flight per year on average. I guess they must have stipulated that in the experiment (for which no citation is given), because otherwise it's perfectly rational to suppose the first option is safer (since an airline generally operates more than one flight per year).
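A quick arithmetic sketch of why the two framings need not be equivalent, assuming a steady flight rate and treating each flight as an independent unit of exposure:

```python
# Per-flight crash probability implied by "one crash every 1,000 years",
# as a function of how many flights the airline operates per year.
def per_flight_prob(crashes_per_year=1 / 1000, flights_per_year=1):
    return crashes_per_year / flights_per_year

# The framings only coincide at exactly one flight per year:
print(per_flight_prob(flights_per_year=1))    # 0.001, same as "1 in 1,000 flights"

# With one flight a day, "one crash every 1,000 years" is far safer per flight:
print(per_flight_prob(flights_per_year=365))  # ~2.7e-6
```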
On the advice of @adamShimi, I recently read Hasok Chang's Inventing Temperature. The book is terrific and full of deep ideas, many of which relate in interesting ways to AI safety. What follows are some thoughts on that relationship, from someone who is not an AI safety researcher and only somewhat follows developments there, and who probably got one or two things wrong.
(Definitions: By "operationalizing", I mean "giving a concept meaning by describing it in terms of measurable or closer-to-measurable operations", whereas "abstracting" means "removing properties in the description of an object".)
There has been discussion on LessWrong about the relative value of abstract work on AI safety (e.g., agent foundations) versus concrete work on AI safety (e.g., mechanistic interpretability, prosaic alignment). Proponents of abstract work argue roughly that general mathematical models of AI systems are useful or essential for understanding risks, especially coming from not-yet-existing systems like superintelligences. Proponents of concrete work argue roughly that safety work is more relevant when empirically grounded and subjected to rapid feedback loops. (Note: The abstract-concrete distinction is similar to, but different from, the distinction between applied and basic safety research.)
As someone who has done neither, I think we need both. We need abstract work because we need to build safety mechanisms using generalizable concepts, so that we can be confident that the mechanisms apply to new AI systems and new situations. We need concrete work because we must operationalize the abstract concepts in order to measure them and apply them to actually existing systems. And finally we need work that connects the abstract concepts to the concrete concepts, to see that they are coherent and for each to justify the other.
Chang writes:
The dichotomy between the abstract and the concrete has been enormously helpful in clarifying my thinking at the earlier stages, but I can now afford to be more sophisticated. What we really have is a continuum, or at least a stepwise sequence, between the most abstract and the most concrete. This means that the operationalization of a very abstract concept can proceed step by step, and so can the building-up of a concept from concrete operations. And it may be beneficial to move only a little bit at a time up and down the ladder of abstraction.
Take for example the concept of (capacity for) corrigibility, i.e., the degree to which an AI system can be corrected or shut down. The recent alignment faking paper showed that, in experiments, Claude would sometimes "pretend" to change its behavior when it was ostensibly being trained with new alignment criteria, while not actually changing its behavior. That's an interesting and important result. But (channeling Bridgman) we can only be confident that it applies to the concrete concept of corrigibility measured by the operations used in the experiments -- we have no guarantees that it holds for some abstract corrigibility, or when corrigibility is measured using another set of operations or under other circumstances.
An interesting case study discussed in the book is the development of the abstract concept of temperature by Lord Kelvin (the artist formerly known as William Thomson) in collaboration with James Prescott Joule (of conservation of energy fame). Thomson defined his abstract temperature in terms of work and pressure (which were themselves abstract and needed to be operationalized). He based his definition on the Carnot cycle, an idealized process performed by the theoretical Carnot heat engine. The Carnot heat engine was inspired by actual heat engines, but was fully theoretical -- there was no physical Carnot heat engine that could be used in experiments. In other words, the operationalization of temperature that Thomson invented using the Carnot cycle was an intermediate step that required further operationalization before Thomson's abstract temperature could be connected with experimental data. Chang suggests that, while the Carnot engine was never necessary for developing an abstract concept of temperature, it did help Thomson achieve that feat.
Ok, back to AI safety. So above I said that, for the whole AI thing to go well, we probably need progress on both abstract and concrete AI safety concepts, as well as work to bridge the two. But where should research effort be spent on the margin?
You may think abstract work is useless because it has no error-correcting mechanism when it is not trying to, or is not close to being able to, operationalize its abstract concepts. If it is not grounded in any measurable quantities, it can't be empirically validated. On the other hand, many abstract concepts (such as corrigibility) still make sense today and are currently being studied in the concrete (though they have not yet been connected to fully abstract concepts) despite being formulated before AI systems looked much like they do today.
You may think concrete work is useless because AI changes so quickly that the operations used to measure things today will soon be irrelevant, or more pertinently perhaps, because the superintelligent systems we truly need to align are presumably vastly different from today's AI systems, in their behavior if not in their architecture. In that way, AI is quite different from temperature. The physical nature of temperature is constant in space and time -- if you measure temperature with a specific set of operations (measurement tools and procedures), you would expect the same outcomes regardless of which century or country you do it in -- whereas the properties of AI change rapidly over time and across architectures. On the other hand, timelines seem short, such that AGI may share many similarities with today's AI systems, and it is possible to build abstractions gradually on top of concrete operations.
There is in fact an example from the history of thermometry of extending concrete concepts to new environments without recourse to abstract concepts. In the 18th century, scientists realized that the mercury and air thermometers used then behaved very differently, or could not be used at all due to freezing and melting, for very low and very high temperatures. While they had an intuitive notion that some abstract temperature ought to apply across all degrees of heat or cold, their operationalized temperatures clearly only applied to a limited range of heat and cold. To solve this, they eventually developed different sets of operations for the measurement of temperatures in extreme ranges. For example, Josiah Wedgwood measured very high temperatures in ovens by baking standardized clay cylinders and measuring how much they'd shrunk. These different operations, which yielded measurements of temperature on different scales, were then connected by measuring temperature for both scales (using different operations) in an overlapping range and lining those up. All this was done without an abstract theory of temperature, and while the resulting scale was not on very solid theoretical ground, it was good enough to provide practical value.
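As a toy illustration of that scale-linking procedure, the sketch below fits a relation between two instruments' readings in their overlapping range and then extrapolates one scale beyond the other's reach. The readings, scale names, and the assumed linear relation are all invented for illustration; they are not Wedgwood's actual data or method.

```python
# Toy version of linking two temperature scales via an overlapping range.
# All numbers are made up; a linear relation is assumed for simplicity.
import numpy as np

# Hypothetical readings of the same heat sources on two instruments,
# in the range where both can be used:
mercury_readings  = np.array([200.0, 250.0, 300.0, 350.0])  # mercury scale
wedgwood_readings = np.array([1.1, 1.9, 2.7, 3.5])          # clay-shrinkage scale

# Fit the assumed linear relation over the overlap:
slope, intercept = np.polyfit(wedgwood_readings, mercury_readings, 1)

def wedgwood_to_mercury(w):
    """Extend the mercury scale to temperatures only the pyrometer reaches."""
    return slope * w + intercept

# An extrapolation with no theoretical guarantee -- exactly the sense in
# which the historical composite scale was practical but not well-grounded:
print(wedgwood_to_mercury(10.0))
```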
Of course, the issue with superintelligence is that, because of e.g., deceptive alignment and gradient hacking, we want trustworthy safety mechanisms and alignment techniques in place well before the system has finished training. That's why we want to tie those techniques to abstract concepts which we are confident will generalize well. But I have no idea what the appropriate resource allocation is across these different levels of abstraction.[1] Maybe what I want to suggest is that abstract and concrete work are complementary and should strive towards one another. But maybe that's what people have been doing all along?
A few months ago I wrote a post about Game B. The summary:
I describe Game B, a worldview and community that aims to forge a new and better kind of society. It calls the status quo Game A and what comes after Game B. Game A is the activity we’ve been engaged in at least since the dawn of civilisation, a Molochian competition over resources. Game B is a new equilibrium, a new kind of society that’s not plagued by collective action problems.
While I agree that collective action problems (broadly construed) are crucial in any model of catastrophic risk, I think that
- civilisations like our current one are not inherently self-terminating (75% confidence);
- there are already many resources allocated to solving collective action problems (85% confidence); and
- Game B is unnecessarily vague (90% confidence) and suffers from a lack of tangible feedback loops (85% confidence).
I think it may be of interest to some LW users, though it didn't feel on-topic enough to post in full here.