Epistemic status: Amateur synthesis of medical research that is still recent but now established enough to make it into modern medical textbooks. Some specific claims vary in evidence strength. I’ve spent ~20-30 hours studying the literature and treatment approaches, which were very effective for me. Disclaimer: I'm not a medical...
This post is a copy of the introduction of this paper on lie detection in LLMs. The Twitter Thread is here. Authors: Lorenzo Pacchiardi, Alex J. Chan, Sören Mindermann, Ilan Moscovitz, Alexa Y. Pan, Yarin Gal, Owain Evans, Jan Brauner [EDIT: Many people said they found the results very surprising....
AI researchers and others are increasingly looking for an introduction to the alignment problem that is clearly written, credible, and supported by evidence and real examples. The Wikipedia article on AI Alignment has become such an introduction. Link: https://en.wikipedia.org/wiki/AI_alignment Aside from me, it has contributions from Mantas Mazeika, Gavin Leech,...
We've recently uploaded a major rewrite of Richard Ngo's The Alignment Problem from a Deep Learning Perspective. We hope it can reach ML researchers by being more grounded in the deep learning literature and empirical findings, and by being more rigorous than typical introductions to the alignment problem. There are many...
Some have argued that one should act as if timelines are short, since in that scenario it's possible to have more expected impact. But I haven't seen a thorough analysis of this argument. Question: Is this argument valid, and if so, how strong is it? The basic argument...
I sometimes notice that people in my community (myself included) assume that the first "generally human-level" model will lead to a transformative takeoff scenario almost immediately. The assumption seems to be that training is expensive but inference is cheap, so once you're done training you can deploy an essentially unlimited...