Research Report: Alternative sparsity methods for sparse autoencoders with OthelloGPT.
Abstract: Standard sparse autoencoder training uses an L1 sparsity loss term to induce sparsity in the hidden layer. However, theoretical justifications for this choice are lacking (in my opinion), and there may be better ways to induce sparsity. In this post, I explore other methods of inducing sparsity and experiment...
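To make the baseline concrete, the standard SAE objective described above (reconstruction error plus an L1 penalty on hidden activations) can be sketched roughly as follows. This is a minimal illustrative sketch in numpy, not the exact training code used here; the function name and coefficient are my own.

```python
import numpy as np

def sae_l1_loss(x, x_hat, h, l1_coeff=1e-3):
    """Standard SAE loss: reconstruction MSE plus an L1 sparsity
    penalty on the hidden activations h (batch, hidden_dim)."""
    recon = np.mean((x - x_hat) ** 2)
    sparsity = l1_coeff * np.abs(h).sum(axis=-1).mean()
    return recon + sparsity
```

The L1 term shrinks all activations uniformly toward zero, which is exactly the property the post goes on to question.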
Oh, yeah, looks like with p=2 this is equivalent to Hoyer-Square. Thanks for pointing that out; I didn't know this had been studied previously.
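For readers following along, the Hoyer-Square penalty mentioned here is the squared ratio of the L1 norm to the L2 norm of the activations. A minimal sketch (my own illustrative function, with a small epsilon added for numerical stability; the parameterization in the literature may differ slightly):

```python
import numpy as np

def hoyer_square(h, eps=1e-8):
    """Hoyer-Square sparsity penalty: (sum_i |h_i|)^2 / sum_i h_i^2.
    Equals 1 for a one-hot vector and d for a uniform vector of
    length d, so minimizing it encourages sparse activations."""
    l1 = np.abs(h).sum(axis=-1)
    l2_sq = (h ** 2).sum(axis=-1) + eps
    return l1 ** 2 / l2_sq
```

Unlike a plain L1 penalty, this ratio is scale-invariant: multiplying all activations by a constant leaves the penalty unchanged, so it pressures the *distribution* of activation mass rather than its overall magnitude.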
And you're right, that was a typo, and I've fixed it now. Thank you for mentioning that!