Adam Pearce

An interactive introduction to grokking and mechanistic interpretability

Our write-up largely agrees with @Quintin Pope's summary, with the addition of training trajectory visualizations and an explanation of the MLP construction that solves modular addition. A meta note that didn't make it into the article: with so many people looking into this problem over the last 18...

Aug 7, 2023

LESSWRONG