TomasD's Shortform
by TomasD
14th Mar 2024
This is a special post for quick takes by TomasD. Only they can create top-level comments. Comments here also appear on the Quick Takes page and All Posts page.