All Posts
229
Thoughts on seed oil
dynomight
2d
40
301
The Best Tacit Knowledge Videos on Every Subject
Parker Conley, hans truman
9d
111
81
[Linkpost] Practically-A-Book Review: Rootclaim $100,000 Lab Leak Debate
trevor
5d
22
250
On green
Joe Carlsmith
24d
33
248
My PhD thesis: Algorithmic Bayesian Epistemology
Eric Neyman
20d
14
171
Toward a Broader Conception of Adverse Selection
Ricki Heicklen
14d
61
197
"How could I have thought that faster?"
mesaoptimizer
1mo
31
139
Using axis lines for good or evil
dynomight
1mo
39
228
My Clients, The Liars
ymeskhout
1mo
85
109
Social status part 1/2: negotiations over object-level preferences
Steven Byrnes
1mo
15
57
Acting Wholesomely
owencb
1mo
64
262
Scale Was All We Needed, At First
Gabriel Mukobi
1mo
31
213
CFAR Takeaways: Andrew Critch
Raemon
2mo
62
139
And All the Shoggoths Merely Players
Zack_M_Davis
2mo
56
124
Updatelessness doesn't solve most problems
Ω
Martín Soto
2mo
Ω
43
208
Believing In
AnnaSalamon
2mo
49
106
Attitudes about Applied Rationality
Camille Berger
3mo
18
159
Without fundamental advances, misalignment and catastrophe are the default outcomes of training powerful AI
Ω
Jeremy Gillen, peterbarnett
3mo
Ω
59
238
The case for ensuring that powerful AIs are controlled
Ω
ryan_greenblatt, Buck
3mo
Ω
66
122
A Shutdown Problem Proposal
Ω
johnswentworth, David Lorell
3mo
Ω
61
349
There is way too much serendipity
Malmesbury
3mo
56
288
Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training
Ω
evhub, Carson Denison, Meg, Monte M, David Duvenaud, Nicholas Schiefer, Ethan Perez
3mo
Ω
94
130
Deep atheism and AI risk
Joe Carlsmith
2mo
22
265
Gentleness and the artificial Other
Joe Carlsmith
3mo
32
96
A case for AI alignment being difficult
Ω
jessicata
4mo
Ω
53
90
Meaning & Agency
Ω
abramdemski
4mo
Ω
17
259
Constellations are Younger than Continents
Jeffrey Heninger
4mo
22
131
The Dark Arts
lsusr, Lyrongolem
4mo
49
147
Discussion: Challenges with Unsupervised LLM Knowledge Discovery
Ω
Seb Farquhar, Vikrant Varma, zac_kenton, gasteigerjo, Vlad Mikulik, Rohin Shah
4mo
Ω
21
404
Significantly Enhancing Adult Intelligence With Gene Editing May Be Possible
GeneSmith, kman
4mo
162
288
Speaking to Congressional staffers about AI risk
Akash, hath
2mo
23
155
How useful is mechanistic interpretability?
ryan_greenblatt, Neel Nanda, Buck, habryka
3mo
53
307
Shallow review of live agendas in alignment & safety
Ω
technicalities, Stag
5mo
Ω
69
137
Moral Reality Check (a short story)
jessicata
4mo
44
215
What are the results of more parental supervision and less outdoor play?
juliawise
5mo
30
281
Social Dark Matter
[DEACTIVATED] Duncan Sabien
5mo
112
252
AI Timelines
Ω
habryka, Daniel Kokotajlo, Ajeya Cotra, Ege Erdil
5mo
Ω
74
185
Thinking By The Clock
Screwtape
5mo
27
260
The 6D effect: When companies take risks, one email can be very powerful.
scasper
5mo
40
104
Deception Chess: Game #1
Zane, aphyer, Alex A, AdamYedidia
6mo
19
240
Book Review: Going Infinite
Zvi
6mo
109
238
Alignment Implications of LLM Successes: a Debate in One Act
Ω
Zack_M_Davis
6mo
Ω
50
157
Holly Elmore and Rob Miles dialogue on AI Safety Advocacy
jacobjacob, Robert Miles, Holly_Elmore
6mo
30
286
Towards Monosemanticity: Decomposing Language Models With Dictionary Learning
Ω
Zac Hatfield-Dodds
6mo
Ω
19
169
Thomas Kwa's MIRI research experience
Thomas Kwa, peterbarnett, Vivek Hebbar, Jeremy Gillen, jacobjacob, Raemon
7mo
52
102
Cohabitive Games so Far
mako yass
6mo
116
324
Inside Views, Impostor Syndrome, and the Great LARP
johnswentworth
6mo
53
481
The Talk: a brief explanation of sexual dimorphism
Malmesbury
7mo
72
197
UDT shows that decision theory is more puzzling than ever
Ω
Wei Dai
7mo
Ω
51
222
Sum-threshold attacks
TsviBT
6mo
52