Broken Latents: Studying SAEs and Feature Co-occurrence in Toy Models
Thanks to Jean Kaddour, Tomáš Dulka, and Joseph Bloom for providing feedback on earlier drafts of this post. In a previous post on Toy Models of Feature Absorption, we showed that tied SAEs seem to solve feature absorption. However, when we tried to training some tied SAEs on Gemma 2...
Dec 30, 202424