LESSWRONG
LW

2426
Mikhail Samin
2464263175
Message
Dialogue
Subscribe

My name is Mikhail Samin (diminutive Misha, @Mihonarium on Twitter, @misha on Telegram). 

Humanity's future can be enormous and awesome; losing it would mean our lightcone (and maybe the universe) losing most of its potential value.

My research is currently focused on AI governance and improving the understanding of AI and AI risks among stakeholders. I also have takes on what seems to me to be the very obvious shallow stuff about the technical AI notkilleveryoneism; but many AI Safety researchers told me our conversations improved their understanding of the alignment problem.

I believe a capacity for global regulation is necessary to mitigate the risks posed by future general AI systems. I'm happy to talk to policymakers and researchers about ensuring AI benefits society.

I took the Giving What We Can pledge to donate at least 10% of my income for the rest of my life or until the day I retire (why?).

In the past, I've launched the most funded crowdfunding campaign in the history of Russia (it was to print HPMOR! we printed 21 000 copies =63k books) and founded audd.io, which allowed me to donate >$100k to EA causes, including >$60k to MIRI.

[Less important: I've also started a project to translate 80,000 Hours, a career guide that helps to find a fulfilling career that does good, into Russian. The impact and the effectiveness aside, for a year, I was the head of the Russian Pastafarian Church: a movement claiming to be a parody religion, with 200 000 members in Russia at the time, trying to increase separation between religious organisations and the state. I was a political activist and a human rights advocate. I studied relevant Russian and international law and wrote appeals that won cases against the Russian government in courts; I was able to protect people from unlawful police action. I co-founded the Moscow branch of the "Vesna" democratic movement, coordinated election observers in a Moscow district, wrote dissenting opinions for members of electoral commissions, helped Navalny's Anti-Corruption Foundation, helped Telegram with internet censorship circumvention, and participated in and organized protests and campaigns. The large-scale goal was to build a civil society and turn Russia into a democracy through nonviolent resistance. This goal wasn't achieved, but some of the more local campaigns were successful. That felt important and was also mostly fun- except for being detained by the police. I think it's likely the Russian authorities would imprison me if I ever visit Russia.]

Posts

Sorted by New

Wikitag Contributions

Comments

Sorted by
Newest
6Mikhail Samin's Shortform
3y
276
Why Corrigibility is Hard and Important (i.e. "Whence the high MIRI confidence in alignment difficulty?")
Mikhail Samin5h20

Giving the AI only corrigibility as a terminal goal is not impossible; it is merely anti-natural for many reasons including because the goal-achieving machinery still there will, with a terminal goal other than corrigibility, output the same seemingly corrigible behavior while tested, for instrumental reasons; and our training setups do not know how to distinguish between the two; and growing the goal-achieving machinery to be good at pursuing particular goals will make it attempt to have a goal other than corrigibility crystallize. Gradient descent will attempt to go to other places.

But sure, if you’ve successfully given your ASI corrigibility as the only terminal goal, congrats, you’ve gone much further than MIRI expected humanity to go with anything like the current tech. The hardest bit was getting there.

I would be surprised if Max considers corrigibility to have been reduced to an engineering problem.

Reply
Reasons to sell frontier lab equity to donate now rather than later
Mikhail Samin2d11-13

Enjoyed the post, am disappointed with the list of 501(c)(3) opportunities.

METR researches and runs evaluations of frontier AI systems with a major focus on AI agents. If you work at a frontier lab, you’re probably aware of them, as they’ve partnered with OpenAI and Anthropic to pilot pre-deployment evaluations procedures. METR’s is particularly known for work on measuring AI systems’ ability to conduct increasingly long tasks

While the long-horizon task graph is somewhat helpful for policy, it's unclear what's the marginal impact of METR's existence (AI labs are running evals anyway, there are other orgs and gov agencies in this space), and to what extent it's dual-use (can AI labs compare various bits they add to training by success on evals?).

Horizon Institute for Public Service

I donated to Horizon in the past, but I'm no longer convinced they're significantly impactful, as it seems that most staffers don't have time to focus on any specific issue that we care about, and generally the folk there don't seem to be particularly x-risk-pilled.

Forethought conducts academic-style research on how best to navigate the transition to a world with superintelligent AI

How is this more than near-zero dignity points?

Donations to MIRI are probably more impactful than donations to any of these three orgs, despite them not fundraising/not being funding-constrained.

Reply1
Mikhail Samin's Shortform
Mikhail Samin8d*993

The book is now a NYT bestseller: #7 in combined print&e-books nonfiction, #8 in hardcover nonfiction.

I want to thank everyone here who contributed to that. You're an awesome community, and you've earned a huge amount of dignity points.

Reply42
Mikhail Samin's Shortform
Mikhail Samin11d20

Yep, I’ve seen the video. Maybe a small positive update overall, because could’ve been worse?

It seems to me that you probably shouldn’t optimize for publicity for publicity’s sake, and even then, hunger strikes are not a good way.

Hunger strikes are very effective tools in some situations; but they’re not effective for this. You can raise awareness a lot more efficiently than this.

“The fears are not backed up with evidence” and “AI might improve billions of lives” is what you get when you communicate being in fear of something without focusing on the reasons why.

Reply
Mikhail Samin's Shortform
Mikhail Samin19d20

Yep. Good that he stopped. Likely bad that he started.

Reply
Mikhail Samin's Shortform
Mikhail Samin21d6524

"There is no justice in the laws of Nature, no term for fairness in the equations of motion. The universe is neither evil, nor good, it simply does not care. The stars don't care, or the Sun, or the sky. But they don't have to! We care! There is light in the world, and it is us!"

Reply
Mikhail Samin's Shortform
Mikhail Samin21d2-1

the demand is that a specific company agrees to halt if everyone halts; this does not help in reality, because in fact it won't be the case that everyone halts (abscent gov intervention).

Reply
Mikhail Samin's Shortform
Mikhail Samin22d20

You are of course right that there’s no difference between reality-fluid and normal probabilities in a small world: it’s just how much you care about various branches relative to each other, regardless of whether all of them will exist or only some.

I claim that the negative utility due to stopping to exist is just not there, because you don’t actually stop to exist in a way you reflectively care about, when you have fewer instances. For normal things (e.g., how much do you care about paperclips), the expected utility is the same; but here, it’s the kind of terminal value that i expect for most people would be different; guaranteed continuation in 5% of instances is much better than 5% chance of continuing in all instances; in the first case, you don’t die!

Reply
Mikhail Samin's Shortform
Mikhail Samin22d10

Imagine they you’re an agent in the game of life. Your world, your laws of physics are computed on a very large independent computers; all performing the same computation.

You exist within the laws of causality of your world, computed as long as at least one server computes your world. If some of them stop performing the computation, it won’t be a death of a copy; you’ll just have one fewer instance of yourself.

Reply
Mikhail Samin's Shortform
Mikhail Samin22d10

In a large universe, you do not end. Like, not in expectation see some branch versus other; you just continue, the computation that is you continues. When you open your eyes, you're not likely to find yourself as a person in a branch computed only relatively rarely; still, that person continues, and does not die.

Attemted suicide reduces your reality-fluid- how much you're computed and how likely you are to find yourself there- but you will continue to experience the world. If you die in a nuclear explosion, the continuation of you will be somewhere else, sort-of isekaied; and mostly you will find yourself not in a strange world that recovers the dead but in a world where the nuclear explosion did not appear; still, in a large world, even after a nuclear explosion, you continue.

You might care about having a lot of reality-fluid, because this makes your actions more impactful, because you can spend your lightcone better, and improve the average experience in the large universe. You might also assign negative utility to others seeing you die; they'll have a lot of reality-fluid in worlds where you're dead and they can't talk to you, even as you continue. But I don't think it works out to assigning the same negative utility to dying as in branches of small worlds.

Reply
Load More
76OpenAI Claims IMO Gold Medal
2mo
74
33No, Futarchy Doesn’t Have This EDT Flaw
3mo
28
6Superintelligence's goals are likely to be random
7mo
6
80No one has the ball on 1500 Russian olympiad winners who've received HPMOR
9mo
21
72How to Give in to Threats (without incentivizing them)
1y
34
11Can agents coordinate on randomness without outside sources?
Q
1y
Q
16
76Claude 3 claims it's conscious, doesn't want to die or be modified
2y
118
33FTX expects to return all customer money; clawbacks may go away
2y
1
25An EA used deceptive messaging to advance her project; we need mechanisms to avoid deontologically dubious plans
2y
1
The Tree of AI Alignment on Arbital
2 months ago
The Tree of AI Alignment on Arbital
2 months ago
(+37676)
Decision theory
6 months ago
(+142)
Functional Decision Theory
6 months ago
(+242)
Translations Into Other Languages
3 years ago
(+84/-60)