LESSWRONG

Superposition

Applied to Computational Superposition in a Toy Model of the U-AND Problem by Adam Newgas 2mo ago
Applied to Thoughts on Toy Models of Superposition by RobertM 3mo ago
Applied to Conditional Importance in Toy Models of Superposition by RobertM 3mo ago
Applied to Effects of Non-Uniform Sparsity on Superposition in Toy Models by Shreyans Jain 6mo ago
Applied to Circuits in Superposition: Compressing many small neural networks into one by Lucius Bushnaq 7mo ago
Applied to Toy Models of Superposition: Simplified by Hand by Axel Sorensen 8mo ago
Applied to Superposition through Active Learning Lens by akankshanc 8mo ago
Applied to Crafting Polysemantic Transformer Benchmarks with Known Circuits by Evan Anders 9mo ago
Applied to Limitations on the Interpretability of Learned Features from Sparse Dictionary Learning by Tom Angsten 10mo ago
Applied to Superposition is not "just" neuron polysemanticity by LawrenceC 1y ago
Applied to Scaling Laws and Superposition by Pavan Katta 1y ago
Applied to Sparse autoencoders find composed features in small toy models by Evan Anders 1y ago
Applied to Some costs of superposition by Linda Linsefors 1y ago
Applied to From Conceptual Spaces to Quantum Concepts: Formalising and Learning Structured Conceptual Models by Roman Leventov 1y ago
Applied to AI alignment as a translation problem by Roman Leventov 1y ago
Applied to Open Source Sparse Autoencoders for all Residual Stream Layers of GPT2-Small by Joseph Bloom 1y ago