A suite of Vision Sparse Autoencoders
CLIP-Scope: inspired by Gemma Scope, we trained eight Sparse Autoencoders, each on 1.2 billion tokens, on different layers of a Vision Transformer. These (along with more technical details) can be accessed on Hugging Face. We also released a PyPI package for ease of use. We hope that this will allow researchers to...
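As a rough illustration of what these models do, here is a minimal sketch of a sparse autoencoder forward pass in NumPy. The dimensions, ReLU activation, and loss terms are generic assumptions for illustration, not the released models' actual architecture or training configuration.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_sae = 8, 32  # hypothetical ViT residual width and (wider) SAE feature width

# Randomly initialized weights, purely for demonstration
W_enc = rng.normal(0.0, 0.1, (d_model, d_sae))
b_enc = np.zeros(d_sae)
W_dec = rng.normal(0.0, 0.1, (d_sae, d_model))
b_dec = np.zeros(d_model)

def sae_forward(x):
    """Encode a token activation into sparse features, then decode a reconstruction."""
    f = np.maximum(x @ W_enc + b_enc, 0.0)  # ReLU keeps only a sparse set of features active
    x_hat = f @ W_dec + b_dec               # linear decoder reconstructs the activation
    return f, x_hat

x = rng.normal(size=d_model)                # one token's activation vector from some layer
f, x_hat = sae_forward(x)
recon_loss = np.mean((x - x_hat) ** 2)      # reconstruction term
l1_penalty = np.abs(f).sum()                # sparsity term penalized during training
```

Training minimizes the reconstruction loss plus a weighted sparsity penalty, so each feature ideally captures one interpretable direction in the transformer's activation space.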
Oct 27, 2024