Transformers
• Applied to Analyzing how SAE features evolve across a forward pass by bensenberner 1mo ago
• Applied to Transformers Explained (Again) by RohanS 2mo ago
• Applied to Characterizing stable regions in the residual stream of LLMs by Jett Janiak 3mo ago
• Applied to If I ask an LLM to think step by step, how big are the steps? by ryan_b 3mo ago
• Applied to Adam Optimizer Causes Privileged Basis in Transformer LM Residual Stream by Diego Caples 4mo ago
• Applied to Visualizing small Attention-only Transformers by WCargo 4mo ago
• Applied to How Big a Deal are MatMul-Free Transformers? by JustisMills 6mo ago
• Applied to Week One of Studying Transformers Architecture by JustisMills 6mo ago
• Applied to Logit Prisms: Decomposing Transformer Outputs for Mechanistic Interpretability by ntt123 6mo ago
• Applied to Exploring Llama-3-8B MLP Neurons by ntt123 6mo ago
• Applied to Finding Backward Chaining Circuits in Transformers Trained on Tree Search by abhayesian 7mo ago
• Applied to If language is for communication, what does that imply about LLMs? by Bill Benzon 7mo ago
• Applied to An interesting mathematical model of how LLMs work by Bill Benzon 8mo ago
• Applied to Transformers Represent Belief State Geometry in their Residual Stream by Adam Shai 8mo ago
• Applied to Barcoding LLM Training Data Subsets. Anyone trying this for interpretability? by right..enough? 8mo ago
• Applied to Understanding mesa-optimization using toy models by tilmanr 9mo ago
• Applied to Decompiling Tracr Transformers - An interpretability experiment by Hannes Thurnherr 9mo ago
• Applied to Modern Transformers are AGI, and Human-Level by jacobjacob 9mo ago
• Applied to Deconfusing In-Context Learning by Arjun Panickssery 10mo ago