200 Concrete Open Problems in Mechanistic Interpretability
Concrete Steps to Get Started in Transformer Mechanistic Interpretability (Neel Nanda, 2y)
200 Concrete Open Problems in Mechanistic Interpretability: Introduction (Neel Nanda, 2y)
200 COP in MI: The Case for Analysing Toy Language Models (Neel Nanda, 2y)
200 COP in MI: Looking for Circuits in the Wild (Neel Nanda, 2y)
200 COP in MI: Interpreting Algorithmic Problems (Neel Nanda, 2y)
200 COP in MI: Exploring Polysemanticity and Superposition (Neel Nanda, 2y)
200 COP in MI: Analysing Training Dynamics (Neel Nanda, 2y)
200 COP in MI: Techniques, Tooling and Automation (Neel Nanda, 2y)
200 COP in MI: Image Model Interpretability (Neel Nanda, 2y)
200 COP in MI: Interpreting Reinforcement Learning (Neel Nanda, 2y)
200 COP in MI: Studying Learned Features in Language Models (Neel Nanda, 2y)