2
105
The Local Interaction Basis: Identifying Computationally-Relevant and Sparsely Interacting Features in Neural Networks
Lucius Bushnaq,
jake_mendel,
Dan Braun,
StefanHex,
Nicholas Goldowsky-Dill,
Kaarel,
Avery,
Joern Stoehler,
debrevitatevitae,
Magdalena Wache,
Marius Hobbhahn
2
93
Apollo Research 1-year update
Marius Hobbhahn,
Lee Sharkey,
Lucius Bushnaq,
Dan Braun,
Mikita Balesni,
Jérémy Scheurer,
Nicholas Goldowsky-Dill,
StefanHex,
jake_mendel,
AlexMeinke,
rusheb