This is a great post! Thank you for writing this up :)
On training SAEs on ConvNets - I recently trained SAEs for all layers of InceptionV1. I've written up a paper on some of the findings on early vision, with a specific focus on curve detectors (Twitter thread on the paper, and another on some findings related to branch specialisation). The features look really good across the entire model. That includes interpretable, monosemantic features in the final layer, which, to the best of my knowledge, hasn't been done before and is really exciting! ...