My name is Mikhail Samin (diminutive Misha, @Mihonarium on Twitter, @misha in Telegram).
Humanity's future can be huge and awesome; losing it would mean our lightcone (and maybe the universe) losing most of its potential value.
My research is currently focused on AI governance and improving the understanding of AI and AI risks among stakeholders. On technical AI notkilleveryoneism, I mostly have takes on what seems to me to be very obvious, shallow stuff; still, many AI safety researchers have told me our conversations improved their understanding of the alignment problem.
I believe a capacity for global regulation is necessary to mitigate the risks posed by future general AI systems. I'm happy to talk to policymakers and researchers about ensuring AI benefits society.
I took the Giving What We Can pledge to donate at least 10% of my income for the rest of my life or until the day I retire (why?).
In the past, I launched the most-funded crowdfunding campaign in the history of Russia (it was to print HPMOR! We printed 21,000 copies = 63,000 books) and founded audd.io, which allowed me to donate >$100k to EA causes, including >$60k to MIRI.
[Less important: I've also started a project to translate 80,000 Hours, a career guide that helps people find fulfilling careers that do good, into Russian. Impact and effectiveness aside, for a year I was the head of the Russian Pastafarian Church: a movement claiming to be a parody religion, with 200,000 members in Russia at the time, trying to increase the separation between religious organisations and the state. I was a political activist and a human rights advocate. I studied the relevant Russian and international law and wrote appeals that won cases against the Russian government in courts; I was able to protect people from unlawful police action. I co-founded the Moscow branch of the "Vesna" democratic movement, coordinated election observers in a Moscow district, wrote dissenting opinions for members of electoral commissions, helped Navalny's Anti-Corruption Foundation, helped Telegram with internet censorship circumvention, and participated in and organized protests and campaigns. The large-scale goal was to build a civil society and turn Russia into a democracy through nonviolent resistance. That goal wasn't achieved, but some of the more local campaigns were successful. It felt important and was also mostly fun, except for being detained by the police. I think it's likely the Russian authorities would imprison me if I ever visited Russia.]
I've donated $1000. Thank you for your work.
I’d bet 1:1 that, conditional on building a CEV-aligned AGI, we won’t consider this type of problem to have been among the top-5 hardest to solve.
Reality-fluid in our universe should pretty much add up to normality, to the extent it’s Tegmark IV (and it’d be somewhat weird for your assumed amount of compute and simulations to exist but not for all computations/maths objects to exist).
If a small fraction of computers simulating this branch stop, this doesn’t make you stop. All configurations of you are computed; simulators might slightly change the relative likelihood of currently being in one branch or another, but they can’t really terminate you.
Furthermore, our physics seems very simple, and most places that compute us probably do it faithfully, on the level of the underlying physics, with no interventions.
I feel like thinking of reality-fluid as just an inverse relationship to description length might produce wrong intuitions. In Tegmark IV, you still get more reality-fluid if someone simulates you, and it’s less intuitive why this translates into a shorter description length. It might be better to think of it as: if all computation/maths exists and I open my eyes in a random place, how often would that happen here? All the places that run this world give some of their reality-fluid to it. If a place that is visible from a bunch of other places starts to simulate this universe, this universe becomes visible from slightly more places.
You can think of the entire object of everything, with all of its parts being simulated in countless other parts; or imagine a Markov process, but with worlds giving each other reality-fluid.
In that sense, the resource that we have is the reality-fluid of our future lightcone; it is our endowment, and we can use it to maximize the overall flourishing in the entire structure.
If we make decisions based on how good the overall/average use of the reality-fluid would be, you’ll gain less reality-fluid by manipulating our world the way described in the post than you’ll spend on the manipulation. It’s probably better for you to trade with us instead.
(I also feel like there might be a reasonable way to talk about causal descendants, where the probabilities are whatever abides by the math of probability theory and causality down the nodes we care about, instead of being the likelihoods of opening your eyes in different branches at a particular moment of evaluation.)
It’s reasonable to consider two agents playing against each other. “Playing against your copy” is a reasonable problem. ($9 rocks get 0 in this problem, LDTs probably get $5.)
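The payoffs above can be sketched in a few lines. This is my own toy framing (an assumed $10 pot with a simultaneous-demand rule; the demands go through only if they're compatible), not anything from the original discussion:

```python
# Toy "ultimatum game against your own copy": both players run the same
# algorithm, so both make the same demand on an assumed $10 pot.
# If the demands sum to more than 10, the deal falls through and both get $0.

def play_against_copy(demand: int) -> int:
    """Return this agent's payoff when its exact copy makes the same demand."""
    my_demand, copy_demand = demand, demand
    if my_demand + copy_demand > 10:
        return 0  # incompatible demands: no deal
    return my_demand

# A $9 rock demands $9 no matter what; against its copy, 9 + 9 > 10.
assert play_against_copy(9) == 0
# An LDT agent, reasoning that it faces its copy, demands the symmetric $5.
assert play_against_copy(5) == 5
```

The point of the sketch is that the rock's "winning" strategy self-destructs when the opponent is guaranteed to mirror it.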
Newcomb’s problem, Parfit’s hitchhiker, the smoking lesion, etc. are all very reasonable problems that essentially depend on the buttons you press when you play the game. It is important to get these problems right.
But playing against LDT is not necessarily in the “fair problem class”, because the game might behave differently depending on your algorithm, i.e., on how you arrive at taking actions, and not just depending on the actions themselves.
Your version of it, playing against an LDT agent, is indeed different from playing against a game that looks at whether we’re an alphabetizing agent and picks X instead of Y because X<Y and not because we looked at the expected utility: we would want LDT to perform optimally in this game. But the reason an LDT-created rock loses to a natural rock here isn’t fundamentally different from the reason LDT loses to an alphabetizing agent in the other game, and it is known that you can construct a game like that where LDT will lose to something else. You can make the game description sound more natural, but I feel like there’s a sharp divide between the “fair problem class” problems and the others.
(I also think that in real life, where this game might play out, there isn’t really a choice we can make to turn our AI into a $9 rock instead of an LDT agent: if we did that because of the rock’s better performance in this game, our rock would get slightly less than $5 in EV instead of $9; LDT doesn’t perform worse than other agents we could’ve chosen in this game.)
Playing ultimatum game against an agent that gives in to $9 from rocks but not from us is not in the fair problem class, as the payoffs depend directly on our algorithm and not just on our choices and policies.
https://arbital.com/p/fair_problem_class/
A simpler game is “if you implement or have ever implemented LDT, you get $0; otherwise, you get $100”.
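That simpler game can be written down directly; the unfairness is visible in the function signature. A minimal sketch (the string label stands in for inspecting the agent's source code; the names are illustrative):

```python
# An "unfair" problem: the payoff is a function of the agent's algorithm
# (here represented by a label), not of any action the agent takes.

def unfair_game(agent_algorithm: str) -> int:
    # "If you implement (or have ever implemented) LDT, you get $0;
    #  otherwise, you get $100."
    return 0 if agent_algorithm == "LDT" else 100

assert unfair_game("LDT") == 0
assert unfair_game("alphabetizer") == 100
```

No choice of action helps the LDT agent here, which is exactly what puts the game outside the fair problem class.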
Logical decision theories are probably the best decision theories for problems in the fair problem class.
(Very cool that you’ve arrived at the idea of this post independently!)
Do you want to donate to alignment specifically? IMO AI governance efforts are significantly more p(doom)-reducing than technical alignment research; it might be a good idea to, e.g., donate to MIRI, as they’re now focused on comms & governance.
If you are a smart individual in today's society, you shouldn't ignore threats of punishment.
If today's society consisted mostly of smart individuals, they would overthrow the government that does something unfair instead of giving in to its threats.
Should you update your idea of fairness if you get rejected often?
Only if you're a kid who's playing with other human kids (which is the scenario described in the quoted text), and converging on fairness possibly includes getting some idea of how much effort various things take different people.
If you're an actual grown-up (not that we have those) and you're playing with aliens, you probably don't update, and you certainly don't update in the direction of anything asymmetric.
Very funny that we had this conversation a couple of weeks prior to transparently deciding that we should retaliate with p=.7!
huh, are you saying my name doesn’t sound WestWrongian
If you want to understand Bayes’ theorem, know why you’re applying it, and use it intuitively, try https://arbital.com/p/bayes_rule/?l=1zq
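For a flavor of what the guide covers, here is a standard worked example of Bayes' rule (the numbers are mine, chosen for illustration, not taken from the guide):

```python
# Bayes' rule on a classic diagnostic-test setup:
# a condition with 1% prevalence, a test with 90% sensitivity
# and a 9% false positive rate. What is P(condition | positive)?

prior = 0.01          # P(condition)
sensitivity = 0.90    # P(positive | condition)
false_pos = 0.09      # P(positive | no condition)

# Total probability of a positive result, over both hypotheses.
p_positive = sensitivity * prior + false_pos * (1 - prior)

# Bayes' rule: P(condition | positive) = P(positive | condition) * P(condition) / P(positive)
posterior = sensitivity * prior / p_positive

print(round(posterior, 3))  # roughly 0.092: most positives are still false alarms
```

The intuition the guide builds is exactly this: a 90%-accurate test on a 1% prior leaves you far below 90% confidence.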