Jannik Brinkmann
Posts
Evaluating Sparse Autoencoders with Board Game Models (38 karma, 6mo, 1 comment)
Interpreting Preference Models w/ Sparse Autoencoders (74 karma, 7mo, 12 comments)
Finding Backward Chaining Circuits in Transformers Trained on Tree Search (50 karma, 8mo, 1 comment)
Improving SAE's by Sqrt()-ing L1 & Removing Lowest Activating Features (26 karma, 10mo, 5 comments)
Comments