Bioinfohazards
Best of LessWrong 2019

A thoughtful exploration of the risks and benefits of sharing information about biosecurity and biological risks. The authors argue that while there are real risks to sharing sensitive information, there are also important benefits that need to be weighed carefully. They provide frameworks for thinking through these tradeoffs. 

by Spiracular
12hamnox
Biorisk - well, wouldn't it be nice if we'd all been familiar with the main principles of biorisk before 2020? I certainly regretted sticking my head in the sand.
> If concerned, intelligent people cannot articulate their reasons for censorship, cannot coordinate around principles of information management, then that itself is a cause for concern. Discussions may simply move to unregulated forums, and dangerous ideas will propagate through well intentioned ignorance.
Well. It certainly sounds prescient in hindsight, doesn't it? Infohazards in particular cross my mind: so many people operate on extremely bad information right now. Conspiracy theories abound, and I imagine the legitimate coordination for secrecy surrounding the topic does not help in the least. What would help? Exactly this essay. A clear model of *what* we should expect well-intentioned secrecy to cover, so we can reason sanely about when it's obviously not. Y'all done good.
This taxonomy clarifies risk profiles better than Gregory Lewis's article, though I think his includes a few vivid-er examples. I opened a document to experiment with tweaking away a little of the dryness of the academic tone. I hope you don't take offense. Your writing represents a massive improvement in readability in its examples and taxonomy, and you make solid, straightforward choices in phrasing. No hopelessly convoluted sentence trees. I don't want to discount that. Seriously! Good job.
As I read, I had a few ideas spark on things that could likely get done at a layman level, in line with Spiracular's comment. That comment could use some expansion, especially in the direction of "prefer to discuss this over that, or discuss in *this way* over *that way*" for bad topics. Very relevantly, I think basic facts should get added to some of the good discussion topics, since they represent information it's better to disseminate!
472 · Welcome to LessWrong!, by Ruby, Raemon, RobertM, habryka · 6y · 74 comments
LW-Cologne meetup
Sat Jul 12•Köln
ACX/EA Lisbon July 2025 Meetup
Sat Jul 19•Lisbon
Biweekly AI Safety Comms Meetup
Tue Jul 22•Online
If Anyone Builds It, Everyone Dies: A Conversation with Nate Soares and Tim Urban
Sun Aug 10•Online
Burny3h101
2
> Noam Brown: "Today, we at @OpenAI achieved a milestone that many considered years away: gold medal-level performance on the 2025 IMO with a general reasoning LLM—under the same time limits as humans, without tools. As remarkable as that sounds, it’s even more significant than the headline" https://x.com/polynoamial/status/1946478249187377206
> "Progress here calls for going beyond the RL paradigm of clear-cut, verifiable rewards. By doing so, we’ve obtained a model that can craft intricate, watertight arguments at the level of human mathematicians."
> "We reach this capability level not via narrow, task-specific methodology, but by breaking new ground in general-purpose reinforcement learning and test-time compute scaling." https://x.com/alexwei_/status/1946477749566390348
So there's some new breakthrough...?
> "o1 thought for seconds. Deep Research for minutes. This one thinks for hours." https://x.com/polynoamial/status/1946478253960466454
> "LLMs for IMO 2025: gemini-2.5-pro (31.55%), o3 high (16.67%), Grok 4 (11.90%)." https://x.com/denny_zhou/status/1945887753864114438
So public LLMs are bad at the IMO, while internal models are getting gold medals? Fascinating.
Mikhail Samin13h282
1
Everyone should do more fun stuff![1]
I thought it'd just be very fun to develop a new sense. Remember vibrating belts and ankle bracelets that gave you a sense of the direction of north? (1, 2) I made some LLMs make me an iOS app that does this! Except the sense doesn't go away the moment you stop the app! I am pretty happy about it! I can tell where north is and have become much better at navigating and relating different parts of the (actual) territory in my map. Previously, I would remember my paths as collections of local movements (there, I turn left); now, I generally know where places are, and Google Maps feels much more connected to the territory.
If you want to try it, it's on TestFlight: https://testflight.apple.com/join/kKKfMuDq
It can vibrate when you face north; even better, if you're in headphones, it can give you spatial sounds coming from north; better still, a second before playing a sound coming from north, it can play a non-directional cue sound to make you anticipate the north sound and learn very quickly. None of this interferes with listening to any other kind of audio.
It's all probably less relevant to the US, as your roads are in a grid anyway; great for London, though.
If you know how to make it have more pleasant sounds, or optimize the directional sounds (make realistic binaural audio), or make react-native do nice vibrations when the app is in the background instead of bzzzz, and want to help, please do! The source code is on GitHub: https://github.com/mihonarium/sonic-compass/
1. ^ unless it would take too much time, especially given the short timelines
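For anyone curious how the cue-then-sound trick fits together, here is a minimal, purely illustrative Python sketch of the logic (the real app is a React Native project; none of the names or values below come from its code):

```python
import time

NORTH_TOLERANCE_DEG = 10   # how close to north counts as "facing north" (illustrative value)
CUE_LEAD_SECONDS = 1.0     # the non-directional cue plays this long before the north-anchored sound

def angular_distance(a_deg: float, b_deg: float) -> float:
    """Smallest absolute difference between two compass headings, in degrees."""
    d = abs(a_deg - b_deg) % 360
    return min(d, 360 - d)

def play_cue() -> None:
    """Stand-in for a short, non-directional anticipation sound."""
    print("cue")

def play_north_sound(heading_deg: float) -> None:
    """Stand-in for a spatialized sound placed at north relative to the listener.

    If the listener faces heading_deg, north sits at (-heading_deg) in
    head-relative coordinates, which is where the sound would be panned.
    """
    print(f"north sound at {(-heading_deg) % 360:.0f} degrees relative to the listener")

def tick(heading_deg: float) -> None:
    """One update of the (hypothetical) loop driven by compass readings."""
    play_cue()                      # neutral cue first, so the brain learns to anticipate...
    time.sleep(CUE_LEAD_SECONDS)
    play_north_sound(heading_deg)   # ...the directional sound that follows and anchors "north"
    if angular_distance(heading_deg, 0.0) < NORTH_TOLERANCE_DEG:
        print("vibrate")            # haptic pulse only when actually facing north

for heading in (90.0, 5.0, 270.0):  # example compass readings
    tick(heading)
```

The anticipatory cue is what makes the association easy to learn: the listener knows a directional sound is coming and can attend to where it lands.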
Daniel Kokotajlo2d*754
15
Epistemic status: Probably a terrible idea, but fun to think about, so I'm writing my thoughts down as I go.
Here's a whimsical, simple AGI governance proposal: "Cull the GPUs." I think of it as a baseline that other governance proposals should compare themselves to and beat.
The context in which we might need an AGI governance proposal: Suppose the world gets to a point similar to e.g. March 2027 in AI 2027. There are some pretty damn smart, pretty damn autonomous proto-AGIs that can basically fully automate coding, but they are still lacking in some other skills, so that they can't completely automate AI R&D yet, nor are they full AGI. But they are clearly very impressive, and moreover it's generally thought that full AGI is not that far off; it's plausibly just a matter of scaling up and building better training environments and so forth. Suppose further that enough powerful people are concerned about possibilities like AGI takeoff, superintelligence, loss of control, and/or concentration of power that there's significant political will to Do Something. Should we ban AGI? Should we pause? Should we xlr8 harder to Beat China? Should we sign some sort of international treaty? Should we have an international megaproject to build AGI safely? Many of these options are being seriously considered.
Enter the baseline option: Cull the GPUs. The proposal is: The US and China (and possibly other participating nations) send people to fly to all the world's known datacenters and chip production facilities. They surveil the entrances and exits to prevent chips from being smuggled out or in. They then destroy 90% of the existing chips (perhaps in a synchronized way, e.g. once teams are in place in all the datacenters, the US and China say "OK this hour we will destroy 1% each. In three hours, if everything has gone according to plan and both sides seem to be complying, we'll destroy another 1%. Etc."). Similarly, at the chip production facilities, a committee of representatives
Lao Mein1h20
1
Surrogacy costs ~$100,000-200,000 in the US. Foster care costs ~$25,000 per year. This puts the implied cost of government-created and raised children at ~$600,000. My guess is that this goes down greatly with economies of scale. Could this be cheaper than birth subsidies, especially as preferred family size continues to decrease with no end in sight?
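The implied figure checks out if you read it as mid-range surrogacy plus 18 years of foster care (my decomposition, not spelled out in the original):

```python
surrogacy = 150_000            # midpoint of the quoted ~$100,000-200,000 range
foster_care_per_year = 25_000  # quoted annual foster care cost
years_raised = 18              # birth through adulthood

implied_cost = surrogacy + foster_care_per_year * years_raised
print(implied_cost)            # 600000, matching the ~$600,000 figure
```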
Cole Wyeth19h*135
15
Since this is mid-late 2025, we seem to be behind the aggressive AI 2027 schedule? The claims here are pretty weak, but if LLMs really don’t boost coding speed, this description still seems to be wrong. [edit: okay actually it’s pretty much mid 2025 still, months don’t count from zero though probably they should because they’re mod 12]
habryka20h*3525
"Some Basic Level of Mutual Respect About Whether Other People Deserve to Live"?!
> And just, what? What? This is just such a wild thing to say in that context! "[D]eserve to live, or deserve to suffer"? People around here are, like, transhumanists, right? Everyone deserves to live! No one deserves to suffer! Who in particular was arguing that some people don't deserve to live or do deserve to suffer, such that this basic level of mutual respect is in danger of not being achieved?

Come on man, you have the ability to understand the context better.

First of all, retaliation clearly has its place. If someone acts in a way that wantonly hurts others, it is the correct choice to inflict some suffering on them, for the sake of setting the right incentives. It is indeed extremely common that from this perspective of fairness and incentives, people "deserve" to suffer. And indeed, maintaining an equilibrium in which the participants do not have outstanding grievances and would take the opportunity to inflict suffering on each other as payback for those past grievances is hard! Much of modern politics, many dysfunctional organizations, and many subcultures are indeed filled with mutual grievances, moving things far away from the mutual assumption that it's good not to hurt each other. I think almost any casual glance at Twitter would demonstrate this.

That paragraph of my response is about trying to establish that there are obviously limits to how much critical comments need the ability to offend, and so, if you want to view things through the lens of status, about how it's important to view status as multi-dimensional. It is absolutely not rare for internet discussion to imply the other side deserves to suffer or doesn't deserve to live. There is a dimension of status where being low enough does cause others to try to cause you suffering. It's not even that rare.

The reason why that paragraph is there is to establish how we need to treat status as a multi-dimensional thing. You can't just walk around saying "offense is necessary for good criticism". Some kinds of offense obviously make things worse in expectation. Other kinds of offense do indeed seem necessary. You are saying the exact same thing in the very next paragraph!

> If I had to guess, it's an implied strong definition of respect that bundles not questioning people's competence or stated intentions with being "treated like a person" (worthy of life and the absence of suffering)

No, it's the opposite. That's literally what my first sentence is saying. You cannot and should not treat respect/status as a one-dimensional thing, as the reductio-ad-absurdum in the quoted section shows. If you tried to treat it as a one-dimensional thing, you would need to include the part where people do of course frequently try to actively hurt others. In order to have a fruitful analysis of how status and offense relate to good criticism, you can't just treat the whole thing as one monolith.

> And just, what? What? This is just such a wild thing to say in that context!

I hope you now understand how it's not "such a wild thing to say in that context". Indeed, it's approximately the same thing you are saying here. You also hopefully understand how the exasperated tone and hyperbole did not help.

----------------------------------------

> But from the standpoint of the alleged aggressor who doesn't accept that notion of respect, we're not trying to say people should suffer and die. We just mean that opinion X is false, and that the process generating opinion X is untrustworthy, and perhaps actively optimizing in an objectionable direction.

You absolutely do not "just mean" those things. Communicating about status is hard and requires active effort to do well at. People get in active conflict with each other all the time. Just two days ago you were quoted by Benquo as saying "intend to fight it with every weapon at my disposal" regarding how you relate to LessWrong moderation, a statement exactly of the kind that does not breed confidence that you will not at some point reach for the "try to just inflict suffering on the LessWrong moderators in order to disincentivize them from doing this" option. People get exiled from communities. People get actually really hurt by social conflict. People build their lives around social trust and respect and reputation, and frequently would rather die than lose crucial forms of social standing they care about.

I do not believe your reports about how you claim to limit the range of your status claims, and what you mean by offense. You cannot wish away core dimensions of the stakes of social relationships by just asserting that you are not affecting them whenever their presence in the conversation would inconvenience you. You have absolutely called for extremely strong censure and punishment of many people in this community as a result of things they said on the internet. You do not have the trust, nor anything close enough to a track record of accurate communication on this topic, to make it so that when you assert that by "offense" you just mean purely factual claims, people should believe you.

Like, man, I am so tired of this. I am so tired of this repeated "oh no, I am absolutely not making any status claims, I am just making factual claims, you moron" game. You don't get to redefine the meaning of words, and you don't get to try to gaslight everyone you interface with about the real stakes of the social engagements they have with you.

I thought Wei Dai's comment was good. I responded to it, emphasizing how I think it's an important dimension to think through in these situations.

But indeed, the way you handle the nature of offense and status in comment threads is not to declare defeat, say that "well, seems like we just can't take into account social standing and status in our communication without sacrificing truth-seeking", and then pretend that dimension is never there. You have to actually work with detailed models of what is going on, figure out the incentives for the parties involved, and set up a social environment where good work gets rewarded, harmful actions punished, all while maintaining sufficient ability to talk about the social system itself without everyone trying to gaslight each other about it. It's hard work, it requires continuous steering. It requires hard thinking. It definitely is not solved by just making posts saying "We just mean that opinion X is false, and that the process generating opinion X is untrustworthy, and perhaps actively optimizing in an objectionable direction".

There is no "just" here. In invoking this you are implying some target social relationship to the people who are "perhaps actively optimizing in an objectionable direction". Should they be exiled, rate-limited, punished, forced to apologize, or celebrated? Your tone and words will communicate one of those! It's extremely hard and requires active effort to write a comment that genuinely communicates agnosticism about how they think a social ecosystem should react to people who are "optimizing in an objectionable direction" in a specific instance, and you are clearly not generally trying to do that. Your words reek of judgement of a specific kind. You frequently call for social punishment of people who optimize in such ways! You can't just deny that part of your whole speech and wish it away. There is no "just" here. When you offend, you mean offense of a specific kind, and using clinical language to hide away the nature of that offense, and its implications, is not helping people accurately understand what will happen when they engage with you.
AnnaSalamon12h2810
Believing In
A friend recently complained to me about this post: he said most people do much nonsense under the heading "belief", and that this post doesn't acknowledge this adequately. He might be right! Given his complaint, perhaps I ought to say clearly:
1) I agree — there is indeed a lot of nonsense out there masquerading as sensible/useful cognitive patterns. Some aimed to wirehead or mislead the self; some aimed to deceive others for local benefit; lots of it simple error.
2) I agree also that a fair chunk of nonsense adheres to the term "belief" (and the term "believing in"). This is because there's a real, useful pattern of possible cognition near our concepts of "belief", and because nonsense (/lies/self-deception/etc) likes to disguise itself as something real.
3) But — to sort sense from nonsense, we need to understand what the real (useful, might be present in the cogsci books of alien intelligences) pattern is that is near our "beliefs". If we don't:
* a) We'll miss out on a useful way to think. (This is the biggest one.)
* b) The parts of the {real, useful way to think} that fall outside our conception of "beliefs" will be practiced noisily anyway, sometimes; sometimes in a true fashion, sometimes mixed (intentionally or accidentally) with error or local manipulations. We won't be able to excise these deceptions easily or fully, because it'll be kinda clear there's something real nearby that our concept of "beliefs" doesn't do justice to, and so people (including us) will not wish to adhere entirely to our concept of "beliefs" in lieu of the so-called "nonsense" that isn't entirely nonsense. So it'll be harder to expel actual error.
4) I'm pretty sure that LessWrong's traditional concept of "beliefs" as "accurate Bayesian predictions about future events" is only half-right, and that we want the other half too, both for (3a)-type reasons and for (3b)-type reasons.
* a) "Beliefs" as accurate Bayesian predictions is exactly right for beliefs/predictions about things unaffected by the belief itself — beliefs about tomorrow's weather, or organic chemistry, or the likely behavior of strangers.
* b) But there's a different "belief-math" (or "believing-in math") that's relevant for coordinating pieces of oneself in order to take a complex action, and for coordinating multiple people so as to run a business or community or other collaborative endeavor. I think I lay it out here (roughly — I don't have all the math), and I think it matters.
The old LessWrong Sequences-reading crowd *sort of* knew about this — folks talked about how beliefs about matters directly affected by the beliefs could be self-fulfilling or self-undermining prophecies, and how Bayes-math wasn't defined around here. But when I read those comments, I thought they were discussing an uninteresting edge case. The idioms by which we organize complex actions (within a person, and between people) are part of the bread and butter of how intelligence works; they are not an uninteresting edge case. Likewise, people talked sometimes (on LW in the past) about how they were intentionally holding false beliefs about their start-ups' success odds; they were advised not to be clever, and some commenters dissented from this advice.
But IMO the "believing in" concept lets us distinguish:
* (i) the useful thing such CEOs were up to (holding a target, in detail, that they and others can coordinate action around);
* (ii) how to do this without having or requesting false predictions at the same time; and
* (iii) how sometimes such action on the part of CEOs/etc is basically "lying" (and "demanding lies"), in the sense that it is designed to extract more work/investment/etc from "allies" than said allies would volunteer if they understood the process generating the CEOs' behavior (and to demand that their team members be similarly deceptive/extractive). And sometimes it's not. And there are principles for telling the difference.
All of which is sort of to say that I think this model of "believing in" has substance we can use for the normal human business of planning actions together, and isn't merely propaganda to mislead people into thinking human thinking bugs are less buggy than they are. Also I think it's as true to the normal English usage of "believing in" as the historical LW usage of "belief" is to the normal English usage of "belief".
Cornelius Dybdahl2d3714
Critic Contributions Are Logically Irrelevant
Humans are social animals, and this is true even of the many LessWrongers who seem broadly in denial of this fact (itself strange, since Yudkowsky has endlessly warned them against LARPing as Vulcans, but whatever). The problem Duncan Sabien was getting at was basically the emotional effects of dealing with smug, snarky critics. Being smug and snarky is a gesture of dominance, and indeed is motivated by status-seeking (again, despite the opinion of many snarkers who seem to be in denial of this fact). If people who never write top-level posts proceed to engage in snark and smugness towards people who do, that's a problem, and they ought to learn a thing or two about proper decorum, not to mention about the nature of their own vanity (e.g. by reading Notes From Underground by Fyodor Dostoevsky).
Moreover, since top-level contributions ought to be rewarded with a certain social status, what those snarky critics are doing is an act of subversion. I am not opposed to subversion in principle, but subversion is fundamentally a kind of attack. This is why I can understand the "Killing Socrates" perspective, but without approving of it: Socrates was subverting something that genuinely merited subversion. But it is perfectly natural that people who are being attacked by subversives will be quite put off by it.
Afaict, the emotional undercurrent to this whole dispute is the salient part, but there is here a kind of intangible taboo against speaking candidly about the emotional undercurrent underlying intellectual arguments.
mattmacdermott's Shortform
mattmacdermott
2y
Alexander Gietelink Oldenziel1m20

What do you mean?

OpenAI Claims IMO Gold Medal
23
Mikhail Samin
3h
This is a linkpost for https://x.com/alexwei_/status/1946477742855532918

I’m excited to share that our latest @OpenAI experimental reasoning LLM has achieved a longstanding grand challenge in AI: gold medal-level performance on the world’s most prestigious math competition—the International Math Olympiad (IMO).

We evaluated our models on the 2025 IMO problems under the same rules as human contestants: two 4.5 hour exam sessions, no tools or internet, reading the official problem statements, and writing natural language proofs.


Why is this a big deal? First, IMO problems demand a new level of sustained creative thinking compared to past benchmarks. In reasoning time horizon, we’ve now progressed from GSM8K (~0.1 min for top humans) → MATH benchmark (~1 min) → AIME (~10 mins) → IMO (~100 mins).

Second, IMO submissions are hard-to-verify, multi-page proofs. Progress here calls for going beyond the

...
(See More – 225 more words)
Leon Lang8m20

The proofs look very different from how LLMs typically write, and I wonder how that emerged. Much more concise. Most sentences are not fully grammatically complete. A bit like how a human would write if they don't care about form and only care about content and being logically persuasive. 

10Mikhail Samin43m
"We reach this capability level not via narrow, task-specific methodology, but by breaking new ground in general-purpose reinforcement learning and test-time compute scaling"
Burny's Shortform
Burny
25d
10Burny3h
> Noam Brown: "Today, we at @OpenAI achieved a milestone that many considered years away: gold medal-level performance on the 2025 IMO with a general reasoning LLM—under the same time limits as humans, without tools. As remarkable as that sounds, it’s even more significant than the headline" https://x.com/polynoamial/status/1946478249187377206
> "Progress here calls for going beyond the RL paradigm of clear-cut, verifiable rewards. By doing so, we’ve obtained a model that can craft intricate, watertight arguments at the level of human mathematicians."
> "We reach this capability level not via narrow, task-specific methodology, but by breaking new ground in general-purpose reinforcement learning and test-time compute scaling." https://x.com/alexwei_/status/1946477749566390348
So there's some new breakthrough...?
> "o1 thought for seconds. Deep Research for minutes. This one thinks for hours." https://x.com/polynoamial/status/1946478253960466454
> "LLMs for IMO 2025: gemini-2.5-pro (31.55%), o3 high (16.67%), Grok 4 (11.90%)." https://x.com/denny_zhou/status/1945887753864114438
So public LLMs are bad at the IMO, while internal models are getting gold medals? Fascinating.
testingthewaters8m30

More interesting than the score is the implication that these were pass@1 results, i.e. the model produced a single final "best shot" for each question that, at the end of 4.5 hours, was passed off to human graders, instead of pass@1000 with literal thousands of automated attempts. If true, this suggests that test-time scaling is now moving away from the "spray and pray" paradigm. Feels closer to "actually doing thinking". This is kinda scary.

2Thane Ruthenis41m
Well, that's mildly unpleasant. But not that unpleasant, I guess.
I really wonder what people think when they see a benchmark on which LLMs get 30%, and then confidently say that 80% is "years away". Obviously, if LLMs already get 30%, it proves they're fundamentally capable of solving that task[1], so the benchmark will be saturated once AI researchers do more of the same. Hell, Gemini 2.5 Pro apparently got 5/7 (71%) on one of the problems, so clearly outputting 5/7-tier answers to IMO problems was a solved problem, and an LLM getting at least 6*5 = 30 out of 42 in short order should have been expected. How was this not priced in...?
Hmm, I think there's a systemic EMH failure here. People appear to think that time-to-benchmark-saturation scales with the difference between the status of a human able to reach the current score and the status of a human able to reach the target score, instead of estimating it using gears-level models of how AI works.
You can probably get free Manifold mana by looking at supposedly challenging benchmarks, looking at which ones have above-10% scores already, then being more bullish on them than the market. ARC-AGI-2 seems like the obvious one. I'd give it >60% that it gets to >50% this year (unless the Grok 4 result is a lie), as opposed to this market's 16% (well, I made it 22% now). Honestly, this OpenAI model probably already gets that; literal free money.
I don't like the sound of that, but if this is their headline result, I'm still sleeping, and my update is that people are bad at thinking about benchmarks.
1. ^ Unless the benchmark has difficulty tiers the way e.g. FrontierMath does, which I think the IMO doesn't.
"Some Basic Level of Mutual Respect About Whether Other People Deserve to Live"?!
16
Zack_M_Davis
1d

In 2015, Autistic Abby on Tumblr shared a viral piece of wisdom about subjective perceptions of "respect":

Sometimes people use "respect" to mean "treating someone like a person" and sometimes they use "respect" to mean "treating someone like an authority"

and sometimes people who are used to being treated like an authority say "if you won't respect me I won't respect you" and they mean "if you won't treat me like an authority I won't treat you like a person"

and they think they're being fair but they aren't, and it's not okay.

There's the core of an important insight here, but I think it's being formulated too narrowly. Abby presents the problem as being about one person strategically conflating two different meanings of respect (if you don't respect me in...

(See More – 893 more words)
Said Achmiz15m20

> I don’t see why it’s good to punish people. If you threaten to punish me if I do a particular thing, I’ll just get upset that you might hurt me and likely refuse to interact with you at all.

Try to apply this logic to law enforcement, and you will see at once how it fails.

4dr_s42m
I don't think this refers necessarily to intentional malice. Suppose there is someone who makes important, impactful decisions based on astrology. You can't just tell them "hey you made a silly mistake reading the precise position of Mercury retrograde here, it happens". You have to say "astrology is bunk and basing your decisions on it is dangerous". But in a culture in which the rule is "if someone strongly enough believes in something - like astrology - that they've built their entire identity around it, attacking that something is the same as an attack on their person which will inflict suffering on them, and therefore shouldn't be done", that action is taboo. Which is the problem that the post gestures at, I think. Of course one can argue that maybe it's strategically better to not go too hard - if for example astrology is a majority belief and most people will side with the other person. But that's a different story. If saying "hey people, this guy believes in astrology! Stop listening to him!" is enough to make them lose status, should you be able to do it or not? Which is more important, their personal sense of validation, or protecting the community from the consequences of their wrong beliefs?
2MondSemmel5h
(And the LW team doesn't exempt itself from this rule, e.g. this podcast with Habryka was considered to be a Personal Blog.)
4Zack_M_Davis6h
Honestly, a lot of my work on this website consists of trying to write "the generalized version" of something that's bothering me that would not otherwise be of philosophical interest. I just think this has a pretty good track record of being philosophically productive! For example, you yourself have linked to my philosophy of language work, even though you probably don't care about the reason I originally got so obsessed with the philosophy of language in the first place. To me, that's an encouraging sign that I got the philosophy right (rather than the philosophy being thinly-veiled politics).
leogao's Shortform
leogao
Ω 33y
Alexander Gietelink Oldenziel17m20

academia is too broad of a term. most of math, physics, theoretical CS, paleontology, material sciences, engineering, and some branches of economics, biology, (computational) neuroscience, (computational) linguistics, statistics, etc. are doing well and overall reward intellectual freedom and deep work. in terms of people this is a small minority of total academics, probably <5%.

It is true that many subfields, or even entire domains of science are diseased disciplines. Most of the research is marginal, irrelevant, reinventin... (read more)

Are agent-action-dependent beliefs underdetermined by external reality?
18
Said Achmiz
2d

(This is a comment that has been turned into a post.)

The standard rationalist view is that beliefs ought properly to be determined by the facts, i.e. the belief “snow is white” is true iff snow is white.

Contrariwise, it is sometimes claimed (in the context of discussions about “postrationalism”) that:

even if you do have truth as the criterion for your beliefs, then this still leaves the truth value of a wide range of beliefs underdetermined

This is a broad claim, but here I will focus on one way in which such a thing allegedly happens:

… there are a wide variety of beliefs which are underdetermined by external reality. It’s not that you intentionally have fake beliefs which are out of alignment with the world, it’s that some beliefs are to

...
(Continue Reading – 1551 more words)
Said Achmiz18m20

Ok, I’ve now read the linked post.

As far as I can tell, the account of decision-dependent beliefs described in that post is entirely compatible with what I say here.

(The account of “belief-dependent beliefs”, if you will, is a different matter; but I make no claims about that, in this post. Also, I think that the notion of “world reacts to agent’s beliefs”, as described there and elsewhere, is confused in an important way, but that’s a discussion for another time.)

On the whole, I must admit that I’m slightly confused about what you were getting at, with that link.

From Messy Shelves to Master Librarians: Toy-Model Exploration of Block-Diagonal Geometry in LM Activations
1
Yuxiao
19m

by Yuxiao Li, Zachary Baker, Maxim Panteleev, Maxim Finenko

June 2025 | SPAR Spring '25

A post in our series "Feature Geometry & Structured Priors in Sparse Autoencoders"

TL;DR: We explore the intrinsic block-diagonal geometry of LLM feature space--first observed in raw embeddings and family-tree probes--by measuring cosine-similarity heatmaps. These diagnostics set the stage for baking block-structured and graph-Laplacian priors into V-SAEs and Crosscoders in later posts. Assumptions. tbd.
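To make the TL;DR concrete, here is a toy sketch of the kind of cosine-similarity-heatmap diagnostic described above (synthetic block-structured vectors standing in for LM activations or SAE feature directions; this is illustrative only, not the authors' code or data):

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)

# Toy stand-in for feature directions: three "blocks" of features, where features
# within a block share a common latent direction plus noise (hypothetical data).
d_model, block_sizes = 64, [20, 30, 25]
features = []
for size in block_sizes:
    shared = rng.normal(size=d_model)
    features.append(shared + 0.5 * rng.normal(size=(size, d_model)))
F = np.concatenate(features, axis=0)  # shape: (n_features, d_model)

# Cosine-similarity heatmap: block-diagonal structure shows up as bright squares
# along the diagonal when related features cluster into groups.
F_norm = F / np.linalg.norm(F, axis=1, keepdims=True)
cos_sim = F_norm @ F_norm.T

plt.imshow(cos_sim, cmap="coolwarm", vmin=-1, vmax=1)
plt.colorbar(label="cosine similarity")
plt.title("Toy block-diagonal cosine-similarity heatmap")
plt.savefig("toy_cosine_heatmap.png")
```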

About this series

This is the second post of our series on how realistic feature geometry in language model (LM) embeddings can be discovered and then encoded into sparse autoencoder (SAE) priors. Since February, we have combined probabilistic modeling, geometric analysis, and mechanistic interpretability.

Series Table of Contents

Part I: Toy model comparison of isotropic vs global-correlation priors in V-SAE

➡️ Part II (you are here):...

(See More – 989 more words)
I bet $500 on AI winning the IMO gold medal by 2026
37
azsantosk
2y

The bet was arranged on Twitter between @MichaelVassar and me (link).

Conditions are similar to this question on Metaculus, except for the open-source condition (I win even if the AI is closed-source, and in fact I would very much prefer it to be closed-source).

@Zvi has agreed to adjudicate this bet in case there is no agreement on resolution.


Michael has asked me two questions by email, and I'm sharing my answers.

Any thoughts on how to turn winning these sorts of bets into people actually updating?

Geoffrey Hinton mentioned recently that, while GPT4 can "already do simple reasoning", "reasoning is the area where we're still better" [source].

It seems to me that, after being able to beat humans at math, there won't be anything else fundamental where we're still better. I wish...

(See More – 146 more words)
azsantosk32m10

OpenAI apparently announced today (19/07/2025) their AI has won the IMO gold medal.

https://x.com/alexwei_/status/1946477742855532918?s=46

Lao Mein's Shortform
Lao Mein
3y
2Lao Mein1h
Surrogacy costs ~$100,000-200,000 in the US. Foster care costs ~$25,000 per year. This puts the implied cost of government-created and raised children at ~$600,000. My guess is that this goes down greatly with economies of scale. Could this be cheaper than birth subsidies, especially as preferred family size continues to decrease with no end in sight?
ACCount40m10

I question your guess.

Childcare is similar to education and medicine in that it's cursed to suffer from piss poor economies of scale forever.

Or, at least, until advanced AI+robots can straight up replace the human labor involved. In which case - are high birth rates even desirable?

230 · So You Think You've Awoken ChatGPT, by JustisMills · 3d · 67 comments
142 · An Opinionated Guide to Using Anki Correctly, by Luise · 6d · 49 comments
499 · A case for courage, when speaking of AI danger, by So8res · 12d · 125 comments
268 · Generalized Hangriness: A Standard Rationalist Stance Toward Emotions, by johnswentworth · 9d · 26 comments
85 · Love stays loved (formerly "Skin"), by Swimmer963 (Miranda Dixon-Luinenburg) · 17h · 1 comment
230 · So You Think You've Awoken ChatGPT, by JustisMills · 3d · 67 comments
160 · Ω · Chain of Thought Monitorability: A New and Fragile Opportunity for AI Safety, by Tomek Korbak, Mikita Balesni, Vlad Mikulik, Rohin Shah · 4d · 24 comments
195 · the jackpot age, by thiccythot · 8d · 15 comments
192 · Surprises and learnings from almost two months of Leo Panickssery, by Nina Panickssery · 7d · 12 comments
478 · What We Learned from Briefing 70+ Lawmakers on the Threat from AI, by leticiagarcia · 2mo · 15 comments
543 · Orienting Toward Wizard Power, by johnswentworth · 2mo · 146 comments
346 · A deep critique of AI 2027’s bad timeline models, by titotal · 1mo · 39 comments
173 · Lessons from the Iraq War for AI policy, by Buck · 9d · 24 comments
363 · Ω · the void, by nostalgebraist · 1mo · 105 comments
125 · Ω · Narrow Misalignment is Hard, Emergent Misalignment is Easy, by Edward Turner, Anna Soligo, Senthooran Rajamanoharan, Neel Nanda · 5d · 21 comments