Unless its governance changes, Anthropic is untrustworthy
Anthropic is untrustworthy. This post provides arguments, asks questions, and documents examples of Anthropic's leadership being misleading and deceptive, holding contradictory positions that consistently shift in OpenAI's direction, lobbying to kill or water down regulation so helpful that employees of all major AI companies speak out in support of it, and violating the fundamental promise the company was founded on. It also shares a few previously unreported details on Anthropic leadership's promises and efforts.[1]

Anthropic has a strong internal culture with broadly EA views and values, and the company is under strong pressure to appear to follow those views and values, since it wants to retain talent and the loyalty of its staff. But it is very unclear what its leadership would do when it matters most. Their staff should demand answers.

Suggested questions for Anthropic employees to ask themselves, Dario, the policy team, and the board after reading this post, and for Dario and the board to answer publicly

On regulation:

Why is Anthropic consistently against the kinds of regulation that would slow everyone down and make everyone safer?

To what extent does Jack Clark act as a rogue agent vs. in coordination with the rest of Anthropic's leadership?

On commitments and integrity:

Do you think Anthropic leadership would not violate their promises to you, if it had a choice between walking back on its commitments to you and falling behind in the race?

Do you think the leadership would not be able to justify dropping their promises, when they really need to come up with a strong justification?

Do you think the leadership would direct your attention to the promises they drop?

Do you think Anthropic's representatives would not lie to the general public and policymakers in the future?

Do you think Anthropic would base its decisions on the formal mechanisms and commitments, or on what the leadership cares about, working around the promises?

How likely are you