My name is Mikhail Samin (diminutive Misha, @Mihonarium on Twitter, @misha on Telegram).
Humanity's future can be enormous and awesome; losing it would mean our lightcone (and maybe the universe) losing most of its potential value.
My research is currently focused on AI governance and on improving stakeholders' understanding of AI and AI risks. On technical AI notkilleveryoneism, I mostly only have takes on what seems to me to be the very obvious, shallow stuff; still, many AI safety researchers have told me our conversations improved their understanding of the alignment problem.
I believe a capacity for global regulation is necessary to mitigate the risks posed by future general AI systems. I'm happy to talk to policymakers and researchers about ensuring AI benefits society.
I took the Giving What We Can pledge to donate at least 10% of my income for the rest of my life or until the day I retire (why?).
In the past, I've launched the most-funded crowdfunding campaign in the history of Russia (it was to print HPMOR! We printed 21,000 copies, which is 63k physical books) and founded audd.io, which allowed me to donate >$100k to EA causes, including >$60k to MIRI.
[Less important: I've also started a project to translate 80,000 Hours, a career guide that helps people find a fulfilling career that does good, into Russian. The impact and the effectiveness aside, for a year I was the head of the Russian Pastafarian Church: a movement claiming to be a parody religion, with 200,000 members in Russia at the time, trying to increase the separation between religious organisations and the state. I was a political activist and a human rights advocate. I studied relevant Russian and international law and wrote appeals that won cases against the Russian government in courts; I was able to protect people from unlawful police action. I co-founded the Moscow branch of the "Vesna" democratic movement, coordinated election observers in a Moscow district, wrote dissenting opinions for members of electoral commissions, helped Navalny's Anti-Corruption Foundation, helped Telegram with internet censorship circumvention, and participated in and organized protests and campaigns. The large-scale goal was to build a civil society and turn Russia into a democracy through nonviolent resistance. That goal wasn't achieved, but some of the more local campaigns were successful. It felt important and was also mostly fun, except for being detained by the police. I think it's likely the Russian authorities would imprison me if I ever visited Russia.]
It’s better than Stampy (try asking both some interesting questions!). Stampy is cheaper to run, though.
I wasn’t able to get LLMs to produce valid arguments or answer questions correctly without the context, though that could be a scaffolding/skill issue on my part.
Thanks! I think we’re close to a point where I’d want to put this in front of a lot of people, though we don’t have the budget for this (which seems ridiculous, given the stats we have for our ad results, etc.), and also haven’t yet optimized the interface (as in, half the US public won’t like the gender dropdown).
Also, it’s much better at conversations than at producing 5-minute elevator pitches. (It’s hard to make it meet the user where they are while still getting to the point, instead of being very sycophantic.)
The end goal is to be able to explain the current situation to people at scale.
Sure! Mostly, it's just that a lot of the stuff that correlates with specific qualia in humans doesn't provide any evidence about qualia in other animals. Reinforcement learning (behavior that seeks out the things that, when encountered, update the brain to seek more of them, and avoids the things that update the brain to avoid them) doesn't mean there are any circuits in the animal's brain for experiencing these updates from the inside, as qualia, the way humans do when we suffer. If I train a very simple RL agent with the feedback that salmon get via mechanisms that produce pain in humans, the RL agent will learn to exhibit salmon's behavior, while we can be very confident there are no qualia in that RL agent. Basically, almost all of the evidence Rethink and others present is of a kind an RL agent would also exhibit, and doesn't add anything on top of "it's a brain of that size that can do RL and has this evolutionary history".
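To make that concrete, here is a minimal sketch (the toy environment, rewards, and hyperparameters are all made up for illustration) of such a "very simple RL agent": a tabular Q-learner that gets a negative number whenever it stays near a "noxious" stimulus learns the avoidance behavior, while there is clearly nothing in it that experiences those updates from the inside.

```python
# Minimal illustrative sketch: a tabular Q-learning agent learns "pain"-avoidance
# behavior from a scalar penalty, with no plausible qualia anywhere in the system.
import random

N_STATES, N_ACTIONS = 4, 2          # tiny chain world; action 0 = stay near stimulus, 1 = move away
q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
alpha, gamma, epsilon = 0.1, 0.9, 0.1

def step(state, action):
    """Action 0 keeps the agent near the noxious stimulus (negative reward);
    action 1 moves it away (small positive reward for foraging)."""
    if action == 0:
        return state, -1.0                      # the "pain" signal: just a number used for an update
    return min(state + 1, N_STATES - 1), 0.1    # safe foraging

for _ in range(5000):
    s = random.randrange(N_STATES)
    a = random.randrange(N_ACTIONS) if random.random() < epsilon else max(range(N_ACTIONS), key=lambda x: q[s][x])
    s2, r = step(s, a)
    # The "aversive" update: shift the value estimate toward the observed reward.
    q[s][a] += alpha * (r + gamma * max(q[s2]) - q[s][a])

# After training, the agent reliably avoids the noxious stimulus in every state,
# which is the behavioral evidence usually cited, yet nothing here experiences
# the update "from the inside".
print([max(range(N_ACTIONS), key=lambda x: q[s][x]) for s in range(N_STATES)])  # expect all 1s
```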
The reason we know other humans have qualia circuits in their brains is that these circuits have outputs that make humans talk about qualia even if they've never heard others talk about qualia (which would be very surprising if it happened by chance).
We don't have anything remotely close to that for any non-human animals.
For many animals, we can assume that something like whatever led to humans having qualia was present in their evolutionary history, or we have tests (such as a properly run mirror test) that likely correlate with the kinds of things that lead to qualia; but among all the fish species we've run these experiments on, very few have social dynamics of the kind that might correlate with qualia or can remotely pass anything like a mirror test, and salmon are not among them.
i made a thing!
it is a chatbot with 200k tokens of context about AI safety. it is surprisingly good (better than you'd expect current LLMs to be) at answering questions and counterarguments about AI safety. A third of its dialogues contain genuinely great and valid arguments.
You can try the chatbot at https://whycare.aisgf.us (ignore the interface; it hasn't been optimized yet). Please ask it some hard questions! Especially if you're not convinced of AI x-risk yourself, or can repeat the kinds of questions others ask you.
Send feedback to ms@contact.ms.
A couple of examples of conversations with users:
I know AI will make jobs obsolete. I've read runaway scenarios, but I lack a coherent model of what makes us go from "llms answer our prompts in harmless ways" to "they rebel and annihilate humanity".
Do you ever use LLMs? (They have a lot more neurons than bees, and it's unclear why consuming honey is worse than using LLMs.)
Salmon are incredibly unlikely to have qualia; there's approximately nothing in their evolutionary history that correlates with what qualia could be useful for or a side effect of. I'm fine with eating salmon. Bees are social; I wouldn't eat bees.
I'm happy to make a bet that you win if salmon have qualia and bees don't, I win if bees have qualia and salmon don't, and that resolves N/A otherwise, with resolution via asking a CEV-aligned AGI.
There are correspondingly larger liquidity subsidies on these markets, which makes the consequences the same in expectation (i.e., others would love to eat your free money by correcting the attempted manipulation just as much as they would on normally structured decision markets). Everyone just makes bets 1000x larger than they normally would.
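As a quick back-of-the-envelope check (a sketch with illustrative numbers, assuming the markets are only executed ~0.1% of the time, as in the scheme described below): scaling bets and subsidies by 1000x exactly compensates for the rare resolution, keeping the expected profit from correcting a mispricing unchanged.

```python
# Illustrative numbers only: if the markets resolve ~0.1% of the time (and are
# refunded as N/A otherwise), the expected profit from correcting a mispricing
# shrinks 1000x, so bets and subsidies need to scale by ~1000x to keep incentives the same.
p_resolve = 0.001          # probability the decision markets actually resolve
edge = 0.05                # profit per $1 bet when the market resolves and you were right
normal_bet = 100           # what a trader would risk on a normally structured decision market
scaled_bet = normal_bet / p_resolve
print(edge * normal_bet)                 # expected profit on a normal market: 5.0
print(edge * scaled_bet * p_resolve)     # same expected profit with the 1000x bet: 5.0
```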
Thanks for the link, but (having only skimmed it, so maybe I missed it) I don’t think the paper analyzes this sort of scheme? It says that you need at least some randomness so that options get explored, but that's somewhat orthogonal to my claim: that you might want to cancel the market 99.9% of the time, and 0.1% of the time take a random decision that is not informed by the market, so that the market predicts the causal consequences of your decision (this implements the do() operator).
I would be curious if any literature actually analyzes the type of scheme that uses policy markets to implement CDT instead of EDT.
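Here's a minimal sketch of what I have in mind (the toy world, the confounder, and all the numbers are made up for illustration): because the conditional markets only resolve on the rare branch where the decision is sampled at random, independently of anything the decision-maker knows, traders are paid according to P(outcome | do(decision)) rather than the confounded observational P(outcome | decision).

```python
# Illustrative sketch: resolving conditional markets only on a rare, randomized
# branch makes the price target the causal quantity, not the observational one.
import random

def world():
    """Hidden confounder U affects both the 'natural' decision and the outcome."""
    u = random.random() < 0.5                    # e.g., "the project was going to succeed anyway"
    natural_decision = "A" if u else "B"         # informed decision-makers pick A mostly when U is good
    return u, natural_decision

def outcome(u, decision):
    # Outcome depends mostly on U; decision A adds only a small causal effect.
    p = 0.7 * u + 0.1 * (decision == "A")
    return random.random() < p

N = 200_000
# Observational estimate: what a trader learns from watching decisions made as usual.
obs = [outcome(u, d) for u, d in (world() for _ in range(N)) if d == "A"]
# Causal estimate: the rare (say 0.1%) branch where the decision is forced to A at
# random, independently of U -- the only branch on which the market actually resolves.
do_ = [outcome(world()[0], "A") for _ in range(N)]

print(f"P(success | decision=A)     ~ {sum(obs)/len(obs):.3f}")   # confounded, higher
print(f"P(success | do(decision=A)) ~ {sum(do_)/len(do_):.3f}")   # what the market should price
```

The observational estimate is inflated because informed decision-makers pick A mostly when things were going to go well anyway; resolving only on the randomized branch removes exactly that confounding.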
I really like the idea. I think one issue is that it's hard for the AI to verify that the lab actually made that contract and isn't just faking its environment.
Another example: