My name is Mikhail Samin (diminutive Misha, @Mihonarium on Twitter, @misha in Telegram).
Humanity's future can be huge and awesome; losing it would mean our lightcone (and maybe the universe) losing most of its potential value.
My research is currently focused on AI governance and on improving stakeholders' understanding of AI and AI risks. I also have takes on what seems to me to be the very obvious, shallow stuff in technical AI notkilleveryoneism; still, many AI safety researchers have told me our conversations improved their understanding of the alignment problem.
I believe a capacity for global regulation is necessary to mitigate the risks posed by future general AI systems. I'm happy to talk to policymakers and researchers about ensuring AI benefits society.
I took the Giving What We Can pledge to donate at least 10% of my income for the rest of my life or until the day I retire (why?).
In the past, I launched the most-funded crowdfunding campaign in the history of Russia (it was to print HPMOR! We printed 21,000 copies, i.e., 63,000 books) and founded audd.io, which allowed me to donate >$100k to EA causes, including >$60k to MIRI.
[Less important: I've also started a project to translate 80,000 Hours, a career guide that helps people find a fulfilling career that does good, into Russian. The impact and effectiveness aside, for a year I was the head of the Russian Pastafarian Church: a movement claiming to be a parody religion, with 200,000 members in Russia at the time, trying to increase the separation between religious organisations and the state. I was a political activist and a human rights advocate. I studied relevant Russian and international law and wrote appeals that won cases against the Russian government in courts; I was able to protect people from unlawful police action. I co-founded the Moscow branch of the "Vesna" democratic movement, coordinated election observers in a Moscow district, wrote dissenting opinions for members of electoral commissions, helped Navalny's Anti-Corruption Foundation, helped Telegram with internet censorship circumvention, and participated in and organized protests and campaigns. The large-scale goal was to build a civil society and turn Russia into a democracy through nonviolent resistance. This goal wasn't achieved, but some of the more local campaigns were successful. That felt important and was also mostly fun, except for being detained by the police. I think it's likely the Russian authorities will imprison me if I ever visit Russia.]
I do not believe Anthropic as a company has a coherent and defensible view on policy. It is known that, while hiring, they said things they did not themselves hold (they claim to have had good internal reasons for changing their minds, but people went to work for them because of impressions Anthropic created and then decided not to live up to). It is known in policy circles that Anthropic's lobbyists behave similarly to OpenAI's.
From Jack Clark, a billionaire co-founder of Anthropic and its chief of policy, today:
Dario is talking about a country of geniuses in a datacenter in the context of competition with China and a 10-25% chance that everyone will literally die, while Jack Clark is basically saying, "But what if we're wrong about betting on short AI timelines? Security measures and pre-deployment testing will be very annoying, and we might regret them. We'll have slower technological progress!"
This is not invalid in isolation, but Anthropic is a company that was built on the idea of not fueling the race.
Do you know what would stop the race? Getting policymakers to clearly understand the threat models that many of Anthropic's employees share.
It's ridiculous and insane that, instead, Anthropic is arguing against regulation because it might slow down technological progress.
No, only two requirements:
If you give a physical computer a large enough tape, or make a human brain large enough without changing its density, it collapses into a black hole. This is really not relevant to any of the points made in the post.
For any set of inputs and outputs of an algorithm, we can make a neural network that approximates the mapping from those inputs to those outputs with arbitrary precision, in a single forward pass, without even having to simulate a tape.
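A minimal sketch of that claim (my illustration, not from the post; it assumes PyTorch and uses 3-bit parity as a stand-in algorithm): a small MLP is fit to the full input-output table, after which a single forward pass reproduces the algorithm's outputs on that domain with no tape and no iteration.

```python
# Hypothetical illustration: a tiny MLP memorising a finite input -> output table
# (3-bit parity), so one forward pass reproduces the algorithm's outputs -- no tape.
from itertools import product

import torch
import torch.nn as nn
import torch.nn.functional as F

# Enumerate the whole (finite) domain and the target function.
xs = torch.tensor(list(product([0.0, 1.0], repeat=3)))   # shape (8, 3)
ys = (xs.sum(dim=1) % 2).unsqueeze(1)                     # parity of each triple

model = nn.Sequential(nn.Linear(3, 16), nn.ReLU(), nn.Linear(16, 1))
opt = torch.optim.Adam(model.parameters(), lr=0.05)

for _ in range(2000):  # fit the table; more steps -> closer approximation
    opt.zero_grad()
    loss = F.mse_loss(model(xs), ys)
    loss.backward()
    opt.step()

# A single forward pass over every input; after rounding, outputs should match parity.
print(model(xs).round().squeeze().tolist())  # expected: [0.0, 1.0, 1.0, 0.0, 1.0, 0.0, 0.0, 1.0]
```

The same construction extends, in principle, to any finite table of behaviour; the network's size is the only thing that has to grow.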
I sincerely ask people to engage with the actual contents of the post (the sharp left turn, goal crystallization, etc.) and not with a technicality that doesn't affect any of the points raised and that sits in a section they're not the intended audience of.
This seems pretty irrelevant to the points in question. To the extent there's a way to look at the world, think about it, and take actions that generally achieve your goals (e.g., via the algorithms humans are running), a large enough neural network can, technically, do this in a single forward pass.
We won't make a neural network that, in a single forward pass, can iterate through and check all possible proofs of length <10^100 of some conjecture, but that doesn't mean we can't make a generally capable AI system; likewise, we won't ever make CPUs large enough for that task, but that doesn't affect whether computers are Turing-complete in the relevant ways.
Restricting “any algorithm” to algorithms that, e.g., operate on some limited set of variables is a technicality; expanding on it wouldn't affect the validity of the points made or implied in the post, so the simplification of saying “any algorithm” isn't misleading to a reader unfamiliar with any of this stuff. And it's in the section marked as worth skipping for people who aren't new to LW, as it isn't intended to communicate anything new to them.
In reality, empirically, we see that fairly small neural networks get pretty good at the important stuff.
Like, “oh, but physical devices can't run an arbitrarily long tape” is a point completely irrelevant to whether LLMs would be able to do anything we can do, and to the question of whether AI will end up killing everyone. Humans are not Turing-complete in some narrow sense; this doesn't prevent us from being generally intelligent.
Thanks for the reply!
After talking to Claude for a couple of hours, asking it to reflect:
Oh no, OpenAI hasn’t been meaningfully advancing the frontier for a couple of months, scaling must be dead!
What is the easiest problem that you're 95% confident AI won't be able to solve by EOY 2025?
Sleeper agents, alignment faking, and other research were all motivated by the two cases outlined here: a scientific case and a global coordination case.
These studies produced results that are much likelier in the "alignment is hard" worlds.
Is Anthropic currently planning to ask AI labs or state actors to pause scaling?