habryka

Running Lightcone Infrastructure, which runs LessWrong. You can reach me at habryka@lesswrong.com

Sequences

A Moderate Update to your Artificial Priors
A Moderate Update to your Organic Priors
Concepts in formal epistemology

Comments

habryka

This indicates that our scaling lab mentors were more discerning of value alignment on average than non-scaling lab mentors, or had a higher base rate of low-value alignment scholars (probably both).

The second hypothesis here seems much more likely (and my guess is your mentors would agree). My guess is that, after properly controlling for that, you would find a mild to moderate negative correlation here.

But also, more importantly, the set of scholars from which MATS is drawing is heavily skewed towards the kind of person who would work at scaling labs (especially since funding has been heavily skewed towards the kind of research that can happen at scaling labs).
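For concreteness, here is a toy Simpson's-paradox-style sketch of the confounding claim (all numbers are invented for illustration, not MATS data): within each mentor pool, stronger scaling-lab orientation tracks lower rated value alignment, but because the scaling-lab pool also has a higher rating baseline, the pooled correlation comes out positive.

```typescript
// Toy illustration of the confounding claim. All numbers are invented.
function pearson(xs: number[], ys: number[]): number {
  const mean = (v: number[]) => v.reduce((a, b) => a + b, 0) / v.length;
  const mx = mean(xs);
  const my = mean(ys);
  let cov = 0, vx = 0, vy = 0;
  for (let i = 0; i < xs.length; i++) {
    cov += (xs[i] - mx) * (ys[i] - my);
    vx += (xs[i] - mx) ** 2;
    vy += (ys[i] - my) ** 2;
  }
  return cov / Math.sqrt(vx * vy);
}

// Pool A (hypothetical non-scaling-lab mentors): lower rating baseline.
const xA = [0, 1, 2, 3, 4]; // scaling-lab orientation, 0-10
const yA = xA.map((x) => 6 - 0.3 * x); // rated alignment falls with orientation

// Pool B (hypothetical scaling-lab mentors): higher baseline, higher orientation.
const xB = [5, 6, 7, 8, 9];
const yB = xB.map((x) => 9.5 - 0.3 * x);

console.log(pearson(xA, yA).toFixed(2)); // -1.00 within pool A
console.log(pearson(xB, yB).toFixed(2)); // -1.00 within pool B
console.log(pearson(xA.concat(xB), yA.concat(yB)).toFixed(2)); // ≈ 0.61 pooled
```

The point is just that raw averages across mentor pools tell you little until you condition on which pool a scholar was drawn from.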

habryka

implicit framing of the average scaling lab safety researcher we support as being relatively unconcerned about value alignment or the positive impact of their research

Huh, not sure where you are picking this up. I am of course very skeptical of the ability of researchers at scaling labs to evaluate the positive impact of their choice to work at a scaling lab (their job does, after all, depend on them not believing that choice is harmful), but of course they are not unconcerned about their positive impact.

habryka

In Winter 2023-24, our most empirical research dominated cohort, mentors rated the median scholar's value alignment at 8/10 and 85% of scholars were rated 6/10 or above, where 5/10 was “Motivated in part, but would potentially switch focus entirely if it became too personally inconvenient.”

Wait, aren't many of those mentors themselves working at scaling labs, or working very closely with them? So this doesn't feel like a very comforting response to the concern of "I am worried these people want to work at scaling labs because it's a high-prestige and career-advancing thing to do", if the people whose judgement you are relying on for the evaluation have themselves chosen the exact path I am concerned about.

habryka

Cade Metz was the NYT journalist who doxxed Scott Alexander. IMO he has also displayed a somewhat questionable grasp of journalistic competence and integrity, and seems quite prone to narrativizing things in a weirdly adversarial way (I don't think it's obvious how this applies to this article, but it seems useful to know when modeling the article's trustworthiness).

Promoted to curated: Cancer vaccines are cool. I didn't quite realize how cool they were before this post, which is a quite accessible intro to them.

We are experimenting with bolding the date on posts that are new and leaving it thinner on posts that are old, though feedback so far hasn't been super great.
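Concretely, the rule is just a conditional font weight keyed on the post's age; a minimal sketch of the idea (the 48-hour cutoff and all names are invented for illustration, not our actual code):

```typescript
// Minimal sketch, not the actual LessWrong implementation: pick the date's
// font weight from the post's age. The 48-hour cutoff is invented.
const NEW_POST_WINDOW_MS = 48 * 60 * 60 * 1000;

function postDateWeight(postedAt: Date, now: Date = new Date()): number {
  const isNew = now.getTime() - postedAt.getTime() < NEW_POST_WINDOW_MS;
  return isNew ? 700 : 300; // bold for new posts, thin weight for old ones
}

// e.g. applied as an inline style: { fontWeight: postDateWeight(post.postedAt) }
```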

Hmm, most of the ordering should be the same. Here is the ordering on YouTube Music:

The Road To Wisdom
Moloch
Thought That Faster (feat. Eliezer Yudkowsky)
The Litany of Tarrrrrski (feat. Eliezer Yudkowsky)
The Litany of Gendlin
Dath Ilan's Song (feat. Eliezer Yudkowsky)
Half An Hour Before Dawn In San Francisco (feat. Scott Alexander)
AGI and the EMH (feat. Basil Halperin, J. Zachary Mazlish & Trevor Chow)
First they came for the epistemology (feat. Michael Vassar)
Prime Factorization (feat. Scott Alexander)
We Do Not Wish to Advance (feat. Anthropic)
Nihil Supernum (feat. Godric Gryffindor)
More Dakka (feat. Zvi Mowshowitz)
FHI at Oxford (feat. Nick Bostrom)
Answer to Job (feat. Scott Alexander)

Which is pretty similar to the order here. The folk album is in a slightly different order (which I do think is worse and we sadly can't change), but otherwise things are the same. 

habryka

My current best guess is that actually cashing out the vested equity is tied to an NDA, but I am really not confident. OpenAI has a bunch of really weird equity arrangements.

Oh, yeah, admins currently have access to a purely recommended view, and I prefer it. I would be in favor of making that accessible to users (maybe behind a beta flag, or maybe not, depending on uptake).
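A beta flag here would just mean gating the feed per user; a minimal sketch of that gating (the `User` shape and field names are invented, not the actual LessWrong schema):

```typescript
// Minimal sketch, not the actual LessWrong schema: gate the recommended-only
// feed behind a per-user beta flag, with admins always allowed.
interface User {
  isAdmin: boolean;
  betaFeatures?: { recommendedOnlyFeed?: boolean };
}

function canSeeRecommendedOnlyFeed(user: User | null): boolean {
  if (!user) return false;
  if (user.isAdmin) return true; // admins already have the view
  return user.betaFeatures?.recommendedOnlyFeed ?? false;
}
```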
