LESSWRONG

Moloch Hasn’t Won
Best of LessWrong 2019

Scott Alexander's "Meditations on Moloch" paints a gloomy picture of the world being inevitably consumed by destructive forces of competition and optimization. But Zvi argues this isn't actually how the world works - we've managed to resist and overcome these forces throughout history. 

by Zvi
13 · fiddler
This review is more broadly of the first several posts of the sequence, and discusses the entire sequence.

Epistemic Status: The thesis of this review feels highly unoriginal, but I can't find where anyone else discusses it. I'm also very worried about proving too much. At minimum, I think this is an interesting exploration of some abstract ideas. Considering posting as a top-level post. I DO NOT ENDORSE THE POSITION IMPLIED BY THIS REVIEW (that leaving immoral mazes is bad), AND AM FAIRLY SURE I'M INCORRECT.

The rough thesis of "Meditations on Moloch" is that unregulated perfect competition will inevitably maximize for success and survival, eventually destroying all value in service of this greater goal. Zvi (correctly) points out that this does not happen in the real world, suggesting that something is at least partially incorrect about the above model, and/or its applicability. Zvi then suggests a two-pronged explanation: 1. most competition is imperfect, and 2. most of the actual cases in which we see an excess of Moloch occur when there are strong social or signaling pressures to give up slack.

In this essay, I posit an alternative explanation of how an environment with high levels of perfect competition can avoid the destruction of all value, and further, why the immoral mazes discussed later in this sequence are an example of highly imperfect competition that causes their Molochian nature.

First, a brief digression on perfect competition: perfect competition assumes perfectly rational agents. Because all strategies discussed are continuous-time, the decisions made in any individual moment are relatively unimportant assuming that strategies do not change wildly from moment to moment, meaning that the majority of these situations can be modeled as perfect-information situations.

Second, the majority of value-destroying optimization issues in a perfect-competition environment can be presented as prisoner's dilemmas: both
470 · Welcome to LessWrong!
Ruby, Raemon, RobertM, habryka
6y
74
AGI Forum @ Purdue University
Tue Jul 1•West Lafayette
Lighthaven Sequences Reading Group #40 (Tuesday 7/1)
Wed Jul 2•Berkeley
AI Safety Thursdays: Are LLMs aware of their learned behaviors?
Thu Jul 10•Toronto
LessWrong Community Weekend 2025
Fri Aug 29•Berlin
Sam Marks · 8h · 263
1
The "uncensored" Perplexity-R1-1776 becomes censored again after quantizing Perplexity-R1-1776 is an "uncensored" fine-tune of R1, in the sense that Perplexity trained it not to refuse discussion of topics that are politically sensitive in China. However, Rager et al. (2025)[1] documents (see section 4.4) that after quantizing, Perplexity-R1-1776 again censors its responses: I found this pretty surprising. I think a reasonable guess for what's going on here is that Perplexity-R1-1776 was finetuned in bf16, but the mechanism that it learned for non-refusal was brittle enough that numerical error from quantization broke it. One takeaway from this is that if you're doing empirical ML research, you should consider matching quantization settings between fine-tuning and evaluation. E.g. quantization differences might explain weird results where a model's behavior when evaluated differs from what you'd expect based on how it was fine-tuned. 1. ^ I'm not sure if Rager et al. (2025) was the first source to publicly document this, but I couldn't immediately find an earlier one.
leogao · 1d · 630
7
random brainstorming ideas for things the ideal sane discourse encouraging social media platform would have:

* have an LM look at the comment you're writing and real time give feedback on things like "are you sure you want to say that? people will interpret that as an attack and become more defensive, so your point will not be heard". addendum: if it notices you're really fuming and flame warring, literally gray out the text box for 2 minutes with a message like "take a deep breath. go for a walk. yelling never changes minds" (a rough sketch of such a pre-submit check appears after this list)
* have some threaded chat component bolted on (I have takes on best threading system). big problem is posts are fundamentally too high effort to be a way to think; people want to talk over chat (see success of discord). dialogues were ok but still too high effort and nobody wants to read the transcript. one stupid idea is have an LM look at the transcript and gently nudge people to write things up if the convo is interesting and to have UI affordances to make it low friction (eg a single button that instantly creates a new post and automatically invites everyone from the convo to edit, and auto populates the headers)
* inspired by the court system, the most autistically rule following part of the US government: have explicit trusted judges who can be summoned to adjudicate claims or meta level "is this valid arguing" claims. top level judges are selected for fixed terms by a weighted sortition scheme that uses some game theoretic / schelling point stuff to discourage partisanship
* recommendation system where you can say what kind of stuff you want to be recommended in some text box in the settings. also when people click "good/bad rec" buttons on the home page, try to notice patterns and occasionally ask the user whether a specific noticed pattern is correct and ask whether they want it appended to their rec preferences
* opt in anti scrolling pop up that asks you every few days what the highest value interaction you had recently on the
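A rough sketch of what the first idea's pre-submit check could look like; the model name, prompt, and "OK" protocol are all illustrative assumptions, not a spec:

```python
# Hedged sketch: run a draft comment past an LM before posting and surface tone feedback.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def presubmit_feedback(draft: str) -> str:
    """Return a short note if the draft reads as hostile, else an empty string."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model choice
        messages=[
            {
                "role": "system",
                "content": (
                    "You review a draft forum comment before it is posted. "
                    "If it is likely to be read as an attack, reply with one short "
                    "sentence explaining how it will land and how to soften it. "
                    "If it is fine, reply with exactly: OK"
                ),
            },
            {"role": "user", "content": draft},
        ],
    )
    note = resp.choices[0].message.content.strip()
    return "" if note == "OK" else note

# The platform would show this note next to the submit button rather than blocking the post.
print(presubmit_feedback("This is the dumbest take I've seen all week."))
```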
Mikhail Samin · 14h · 26-4
5
I made a thing! It is a chatbot with 200k tokens of context about AI safety. It is surprisingly good (better than you'd expect current LLMs to be) at answering questions and counterarguments about AI safety. A third of its dialogues contain genuinely great and valid arguments.

You can try the chatbot at https://whycare.aisgf.us (ignore the interface; it hasn't been optimized yet). Please ask it some hard questions! Especially if you're not convinced of AI x-risk yourself, or can repeat the kinds of questions others ask you. Send feedback to ms@contact.ms.

A couple of examples of conversations with users:
johnswentworth · 2d · Ω411201
26
I was a relatively late adopter of the smartphone. I was still using a flip phone until around 2015 or 2016 ish. From 2013 to early 2015, I worked as a data scientist at a startup whose product was a mobile social media app; my determination to avoid smartphones became somewhat of a joke there.

Even back then, developers talked about UI design for smartphones in terms of attention. Like, the core "advantages" of the smartphone were the "ability to present timely information" (i.e. interrupt/distract you) and always being on hand. Also it was small, so anything too complicated to fit in like three words and one icon was not going to fly.

... and, like, man, that sure did not make me want to buy a smartphone. Even today, I view my phone as a demon which will try to suck away my attention if I let my guard down. I have zero social media apps on there, and no app ever gets push notif permissions when not open except vanilla phone calls and SMS.

People would sometimes say something like "John, you should really get a smartphone, you'll fall behind without one" and my gut response was roughly "No, I'm staying in place, and the rest of you are moving backwards". And in hindsight, boy howdy do I endorse that attitude! Past John's gut was right on the money with that one.

I notice that I have an extremely similar gut feeling about LLMs today. Like, when I look at the people who are relatively early adopters, making relatively heavy use of LLMs... I do not feel like I'll fall behind if I don't leverage them more. I feel like the people using them a lot are mostly moving backwards, and I'm staying in place.
Raemon · 7h · 116
2
TAP for fighting LLM-induced brain atrophy: "send LLM query" ---> "open up a thinking doc and think on purpose."

What a thinking doc looks like varies by person. Also, if you are sufficiently good at thinking, just "think on purpose" is maybe fine, but I recommend having a clear sense of what it means to think on purpose and whether you are actually doing it. I think having a doc is useful because it's easier to establish a context switch that is supportive of thinking.

For me, "think on purpose" means:

* ask myself what my goals are right now (try to notice at least 3)
* ask myself what would be the best thing to do next (try for at least 3 ideas)
* flowing downhill from there is fine
Existential Angst Factory
78
Eliezer Yudkowsky
17y

Followup to:  The Moral Void

A widespread excuse for avoiding rationality is the widespread belief that it is "rational" to believe life is meaningless, and thus suffer existential angst.  This is one of the secondary reasons why it is worth discussing the nature of morality.  But it's also worth attacking existential angst directly.

I suspect that most existential angst is not really existential.  I think that most of what is labeled "existential angst" comes from trying to solve the wrong problem.

Let's say you're trapped in an unsatisfying relationship, so you're unhappy.  You consider going on a skiing trip, or you actually go on a skiing trip, and you're still unhappy.  You eat some chocolate, but you're still unhappy.  You do some volunteer work at a charity (or better yet,...

(See More – 932 more words)
lesswronguser123 · 2m · 10

How I would phrase it is, value precedes justification.

Are LLMs being trained using LessWrong text?
3
Cedar
33m

I wonder if there's clear evidence that LessWrong text has been included in LLM training.

Claude seems generally aware of LessWrong, but it's difficult to distinguish between "this model has been trained on text that mentions LessWrong" and "this model has been trained on text from LessWrong".

Related discussion here, about preventing inclusion: https://www.lesswrong.com/posts/SGDjWC9NWxXWmkL86/keeping-content-out-of-llm-training-datasets?utm_source=perplexity

Answer by Cedar · Jul 02, 2025 · 10

LessWrong scrape dataset on Hugging Face, by NousResearch

https://huggingface.co/datasets/LDJnr/LessWrong-Amplify-Instruct
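One way to poke at this directly is to check whether distinctive LessWrong text shows up in public instruction-tuning sets like the one above. A hedged sketch (the split name and schema handling are guesses; inspect the printed features for the real column layout):

```python
# Hedged sketch: load the scrape and search it for a distinctive LessWrong phrase.
from datasets import load_dataset

ds = load_dataset("LDJnr/LessWrong-Amplify-Instruct", split="train")  # split name assumed
print(ds.features)  # inspect the actual column layout first

needle = "Politics is the Mind-Killer"  # any distinctive LW phrase works

def contains_needle(example) -> bool:
    # Flatten whatever string fields exist and search them.
    text = " ".join(str(v) for v in example.values())
    return needle in text

hits = ds.filter(contains_needle)
print(f"{len(hits)} of {len(ds)} examples mention {needle!r}")
```

This only tells you the text is in a public dataset, not that any particular lab trained on it, but it's a concrete lower bound on availability.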

The Best Tacit Knowledge Videos on Every Subject
437
Parker Conley, hans truman
1y

TL;DR

Tacit knowledge is extremely valuable. Unfortunately, developing tacit knowledge is usually bottlenecked by apprentice-master relationships. Tacit Knowledge Videos could widen this bottleneck. This post is a Schelling point for aggregating these videos—aiming to be The Best Textbooks on Every Subject for Tacit Knowledge Videos. Scroll down to the list if that's what you're here for. Post videos that highlight tacit knowledge in the comments and I’ll add them to the post. Experts in the videos include Stephen Wolfram, Holden Karnofsky, Andy Matuschak, Jonathan Blow, Tyler Cowen, George Hotz, and others.

What are Tacit Knowledge Videos?

Samo Burja claims YouTube has opened the gates for a revolution in tacit knowledge transfer. Burja defines tacit knowledge as follows:

Tacit knowledge is knowledge that can’t properly be transmitted via verbal or written instruction, like the ability to create

...
(Continue Reading – 6195 more words)
1 · Parker Conley · 1h
Any chance you could unpin this comment? Seems like the idea of people suggesting videos based on it didn't work, and having the updates be the first pinned comment would probably provide more value to people looking at the post.
Ben Pace · 25m · 20

Done.

Hiring* an AI** Artist for LessWrong/Lightcone
27
Raemon
1d

Over the past few years, Lightcone has started using AI art in more of our products. This is a fairly easy and fun part of our job, but I've noticed often there's just a lotta art that needs to get made which we don't quite have the bandwidth to do ourselves. 

So I'm looking into hiring* an AI** artist for periodic contract gigs (this wouldn't be a fulltime thing, I have one initial job in mind, if it goes well we may periodically have other jobs to offer).

* we aren't sure we want to hire a person for this, this isn't a top organizational priority, it's more like I'm checking if someone exists who would integrate pretty quickly/easily into our workflow.

** in theory, we could hire, like, a...

(See More – 180 more words)
keltan · 25m · 10

I am commenting as to commit publicly. 

I Will: Create an AI art portfolio, and DM it to Raemon by 10pm AEST, tonight.

3 · lemonhope · 18h
You might want to advertise on Reddit or somewhere with more artists. You could ask for three specific pieces of art for three specific posts to make your job easier. (I could do it for you.)
2 · Raemon · 10h
I do plan to post a couple other places but I think I do need people with both good artistic taste and good familiarity with LessWrong. (I'm planning to ask on Bountied Rationality.)

For this role to actually save us work, they need to not require that much onboarding. We could hypothetically train someone with less familiarity with LessWrong but I think that'd take more time than it's worth.

(We need someone who's able to both understand the existing LessWrong aesthetic, and what we're going for with that aesthetic, and when/how/why it'd be appropriate to deviate from it. Most of the work often involves figuring out what broad choices would be appropriate for a given piece, so we need to be able to give pretty vague instructions and have them figure it out from context.)

(The particular project I'm looking to hire for is designing cover art and ~6 illustrations for a Sequence Highlights book, which involves figuring out an overall unifying motif for the book that is somewhat-distinct from the usual LessWrong vibe but compatible with it.)
Consider chilling out in 2028
161
Valentine
10d

I'll explain my reasoning in a second, but I'll start with the conclusion:

I think it'd be healthy and good to pause and seriously reconsider the focus on doom if we get to 2028 and the situation feels basically like it does today.

I don't know how to really precisely define "basically like it does today". I'll try to offer some pointers in a bit. I'm hoping folk will chime in and suggest some details.

Also, I don't mean to challenge the doom focus right now. There seems to be some good momentum with AI 2027 and the Eliezer/Nate book. I even preordered the latter.

But I'm still guessing this whole approach is at least partly misguided. And I'm guessing that fact will show up in 2028 as "Oh, huh, looks...

(Continue Reading – 3793 more words)
roha29m10

I also had to look it up and got interested in testing whether or how it could apply.

Here's an explanation of Bulverism that suggests a concrete logical form of the fallacy:

  1. Person 1 makes argument X.
  2. Person 2 assumes person 1 must be wrong because of their Y (e.g. suspected motives, social identity, or other characteristic associated with their identity).
  3. Therefore, argument X is flawed or not true.

Here's a possible assignment for X and Y that tries to remain rather general:

  • X = Doom is plausible because ...
  • Y = Trauma / Fear / Fixation

Why would that be a fall... (read more)

1 · Michael Roe · 15h
I think “NPC” in that sense is more used by the conspiracy theory community than rationalists. With the idea being that only the person using the term is smart enough to realize that e.g. the Government is controlled by lizards from outer space, and everyone else just believes the media. The fundamental problem with the term is that you might actually be wrong about e.g. the lizards from outer space, and you might not be as smart as you think.
2 · garrets · 18h
OpenAI researcher Jason Wei recently stated that there will be many bottlenecks to recursive self-improvement (experiments, data). Thoughts? https://x.com/_jasonwei/status/1939762496757539297z
1 · Daniel Kokotajlo · 12h
He makes some obvious points everyone already knows about bottlenecks etc. but then doesn't explain why all that adds up to a decade or more, instead of a year, or a month, or a century. In our takeoff speeds forecast we try to give a quantitative estimate that takes into account all the bottlenecks etc.
"What's my goal?"
12
Raemon
35m

The first in a series of bite-sized rationality prompts[1].

 

This is my most common opening-move for Instrumental Rationality. There are many, many other pieces of instrumental rationality. But asking this question is usually a helpful way to get started. Often, simply asking myself "what's my goal?" is enough to direct my brain to a noticeably better solution, with no further work.

Examples

Puzzle Games

I'm playing Portal 2, or Baba is You. I'm fiddling around with the level randomly, sometimes going in circles. I notice I've been doing that awhile. 

I ask "what's my goal?"

And then my eyes automatically glance at the exit for the level and realize I can't possibly make progress unless I solve a particular obstacle, which none of my fiddling-around was going to help with.

Arguing

I'm arguing with a...

(See More – 503 more words)
Benefits of Psyllium Dietary Fiber in Particular
14
Brendan Long
10mo
This is a linkpost for https://www.brendanlong.com/benefits-of-psyllium-dietary-fiber-in-particular.html

Psyllium husk is a non-fermenting (no gas or bloating) soluble dietary fiber that improves both constipation and diarrhea (such as with IBS), normalizes blood sugar, reduces LDL ("bad") cholesterol, and can help with weight loss. Each type of dietary fiber has different effects, and a "high fiber" diet in general won't necessarily provide the same benefits, especially for conditions like Irritable Bowel Syndrome[1].

At a high level:

  • Psyllium is a dietary fiber that's soluble but doesn't ferment.
  • It forms a gel that traps water (helping with both constipation and diarrhea[2]) and also bile (reducing LDL/"bad" cholesterol[3][4][5]).
  • The gel slows down digestion, which normalizes blood sugar, increases GLP-1[6], and makes you feel full longer (and helps modestly with weight loss[7]).
  • The lack of fermentation means it makes it all the way through your
...
(See More – 284 more words)
bgaesop · 1h · 10

Thoughts on this recent finding? 

https://www.consumerlab.com/news/best-psyllium-fiber-supplements-2024/02-29-2024/

Foom & Doom 2: Technical alignment is hard
127
Steven Byrnes
Ω 46 · 8d

2.1 Summary & Table of contents

This is the second of a two-post series on foom (previous post) and doom (this post).

The last post talked about how I expect future AI to be different from present AI. This post will argue that, absent some future conceptual breakthrough, this future AI will be of a type that will be egregiously misaligned and scheming; a type that ruthlessly pursues goals with callous indifference to whether people, even its own programmers and users, live or die; and more generally a type of AI that is not even ‘slightly nice’.

I will particularly focus on exactly how and why I differ from the LLM-focused researchers who wind up with (from my perspective) bizarrely over-optimistic beliefs like “P(doom) ≲ 50%”.[1]

In particular, I will argue...

(Continue Reading – 8253 more words)
Seth Herd · 1h · 20

That makes sense. Although I don't think that non-behavioral training is a magic bullet either. And I don't think behavioral training becomes doomed when you hit an AI capable of scheming if it was working right up until then. Scheming and deception would allow an AI to hide its goals but not change its goals.

What might cause an AI to change its goals is the reflection I mention. Which would probably happen at right around the same level of intelligence as scheming and deceptive alignment. But it's a different effect. As with your point, I think doomed is ... (read more)

ACX Montreal meetup - July 5th @1PM
Jul 5th
1442 Rue Clark, Montréal
BionicD0LPH1N

Come on out to the next ACX (Astral Codex Ten) Montreal Meetup! This week, we're reading Orienting Toward Wizard Power, by John Wentworth.  The post discusses the distinction between Wizard Power and King Power.

I strongly recommend this post, which is quite good (and short), as well as the optional readings, which are also excellent.

Optional readings:

  • Three Notions of "Power", by John Wentworth
  • The Best Tacit Knowledge Videos on Every Subject, by Parker Conley
  • Cultivating And Destroying Agency, by hath
  • Tsuyoku Naritai! (I Want To Become Stronger), by Eliezer Yudkowsky

Feel free to suggest topics or readings for future meetups on this form. Seriously, I'm struggling here. :P

Venue: L'Esplanade Tranquille, 1442 Clark. Rough location here: https://plus.codes/87Q8GC5P+P2R. Note: join our Discord server to receive last-minute information in case of bad weather.
Date & Time: Saturday,...

(See More – 48 more words)
84 · Proposal for making credible commitments to AIs. · Cleo Nardo · 1d · 33
150 · X explains Z% of the variance in Y · Leon Lang · 4d · 23
Tacit knowledge · Montréal LessWrong
Cole Wyeth · 1d · 6037
The best simple argument for Pausing AI?
Welcome to LessWrong! I'm glad you've decided to join the conversation here.

A problem with this argument is that it doesn't prove we should pause AI, only that we should avoid deploying AI in high-impact (e.g. military) applications. Insofar as LLMs can't follow rules, the argument seems to indicate that we should continue to develop the technology until it can.

Personally, I'm concerned about the type of AI system which can follow rules, but is not intrinsically motivated to follow our moral rules. Whether LLMs will reach that threshold is not clear to me (see https://www.lesswrong.com/posts/vvgND6aLjuDR6QzDF/my-model-of-what-is-going-on-with-llms), but this argument seems to cut against my actual concerns.
habryka · 1d* · 4732
Don't Eat Honey
My guess is this is obvious, but IMO it seems extremely unlikely to me that bee-experience is remotely as important to care about as cow experience. Enough as to make statements like this just sound approximately insane:

> 97% of years of animal life brought about by industrial farming have been through the honey industry (though this doesn’t take into account other insect farming).

Like, no, this isn't how this works. This obviously isn't how this works. You can't add up experience hours like this. At the very least use some kind of neuron basis.

> The median estimate, from the most detailed report ever done on the intensity of pleasure and pain in animals, was that bees suffer 7% as intensely as humans. The mean estimate was around 15% as intensely as people. Bees were guessed to be more intensely conscious than salmon!

If anyone remotely thinks a bee suffering is 15% (!!!!!!!!) as important as a human suffering, you do not sound like someone who has thought about this reasonably at all. It is so many orders of magnitude away from what sounds reasonable to me that I find myself wanting to look somewhere other than the arguments in things like the Rethink Priorities report (which I have read, and argued with people about for many hours, and which still sounds insane to me, and IMO does not hold up), but instead look towards things like there being some kind of social signaling madness where someone is trying to signal commitment to some group standard of dedication, which involves some runaway set of extreme beliefs.

Edit: And to avoid a slipping of local norms here: I am only leaving this comment here now after I have seriously entertained the hypothesis that I might be wrong, that maybe there do exist good arguments for moral weights that seem crazy from where I originally stood, but no, after looking into the arguments for quite a while, they still seem crazy to me, and so now I feel comfortable moving on and trying to think about what psychological or social process produces posts like this. And still, I am hesitant about it, because many readers have probably not gone through the same journey, and I don't want a culture of dismissing things just because they are big and would imply drastic actions.
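To put rough numbers on the "neuron basis" gestured at here, a crude back-of-the-envelope comparison; the neuron counts are commonly cited ballpark figures, and this is only one possible weighting, not an endorsed moral weight:

```python
# Crude neuron-count comparison (illustrative assumption, not a moral-weight claim).
bee_neurons = 1e6        # order-of-magnitude estimate for a honeybee brain
human_neurons = 8.6e10   # commonly cited estimate for a human brain

ratio = bee_neurons / human_neurons
print(f"bee/human neuron ratio: {ratio:.1e}")  # ~1.2e-05, i.e. about 0.001%

quoted_mean_estimate = 0.15  # the ~15% figure quoted in the post
print(f"gap vs. the 15% figure: {quoted_mean_estimate / ratio:.0f}x")  # ~13,000x, roughly 4 orders of magnitude
```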
Kaj_Sotala · 9h · 2511
Authors Have a Responsibility to Communicate Clearly
It used to be that I would sometimes read something and interpret it to mean X (sometimes, even if the author expressed it sloppily). Then I would say "I think the author meant X" and get into arguments with people who thought the author meant something different. These arguments would be very frustrating, since no matter how certain I was of my interpretation, short of asking the author there was no way to determine who was right.

At some point I realized that there was no reason to make claims about the author's intent. Instead of saying "I think the author meant X", I could just say "this reads to me as saying X". Now I'm only reporting on how I'm personally interpreting their words, regardless of what they might have meant. That both avoids pointless arguments about what the author really meant, and is more epistemically sensible, since in most cases I don't know that my reading of the words is what the author really intended.

Of course, sometimes I might have reason to believe that I do know the author's intent. For example, if I've spent quite some time discussing X with the author directly, and have a good understanding of how they think about the topic. In those cases I might still make claims of their intent. But generally I've stopped making such claims, which has saved me from plenty of pointless arguments.
410 · A case for courage, when speaking of AI danger · So8res · 5d · 44
81 · Authors Have a Responsibility to Communicate Clearly · TurnTrout · 12h · 16
342 · A deep critique of AI 2027’s bad timeline models · titotal · 13d · 39
469 · What We Learned from Briefing 70+ Lawmakers on the Threat from AI · leticiagarcia · 1mo · 15
340 · the void · Ω · nostalgebraist · 21d · 98
534 · Orienting Toward Wizard Power · johnswentworth · 1mo · 142
660 · AI 2027: What Superintelligence Looks Like · Ω · Daniel Kokotajlo, Thomas Larsen, elifland, Scott Alexander, Jonas V, romeo · 3mo · 222
206 · Foom & Doom 1: “Brain in a box in a basement” · Ω · Steven Byrnes · 8d · 78
87 · What We Learned Trying to Diff Base and Chat Models (And Why It Matters) · Ω · Clément Dumas, Julian Minder, Neel Nanda · 1d · 0
286 · Beware General Claims about “Generalizable Reasoning Capabilities” (of Modern AI Systems) · Ω · LawrenceC · 20d · 19
73 · The best simple argument for Pausing AI? · Gary Marcus · 1d · 10
159 · My pitch for the AI Village · Daniel Kokotajlo · 8d · 29
418 · Accountability Sinks · Martin Sustrik · 2mo · 57