Effects of Non-Uniform Sparsity on Superposition in Toy Models
Abstract: This post summarises my findings on the effects of non-uniform feature sparsity on superposition in the ReLU output model introduced in the Toy Models of Superposition paper. The ReLU output model is a toy model that has been shown to represent features in superposition instead of assigning each feature a dedicated dimension ('individual...
Nov 14, 2024