LESSWRONG

TomasD's Shortform

by TomasD
14th Mar 2024

This is a special post for quick takes by TomasD. Only they can create top-level comments. Comments here also appear on the Quick Takes page and All Posts page.
