Transformers
• Applied to Analyzing how SAE features evolve across a forward pass by bensenberner 14d ago
• Applied to Transformers Explained (Again) by RohanS 1mo ago
• Applied to Characterizing stable regions in the residual stream of LLMs by Jett Janiak 2mo ago
• Applied to If I ask an LLM to think step by step, how big are the steps? by ryan_b 2mo ago
• Applied to Adam Optimizer Causes Privileged Basis in Transformer LM Residual Stream by Diego Caples 3mo ago
• Applied to Visualizing small Attention-only Transformers by WCargo 3mo ago
• Applied to How Big a Deal are MatMul-Free Transformers? by JustisMills 5mo ago
• Applied to Week One of Studying Transformers Architecture by JustisMills 5mo ago
• Applied to Logit Prisms: Decomposing Transformer Outputs for Mechanistic Interpretability by ntt123 5mo ago
• Applied to Exploring Llama-3-8B MLP Neurons by ntt123 5mo ago
• Applied to Finding Backward Chaining Circuits in Transformers Trained on Tree Search by abhayesian 6mo ago
• Applied to If language is for communication, what does that imply about LLMs? by Bill Benzon 6mo ago
• Applied to An interesting mathematical model of how LLMs work by Bill Benzon 7mo ago
• Applied to Transformers Represent Belief State Geometry in their Residual Stream by Adam Shai 7mo ago
• Applied to Barcoding LLM Training Data Subsets. Anyone trying this for interpretability? by right..enough? 7mo ago
• Applied to Understanding mesa-optimization using toy models by tilmanr 8mo ago
• Applied to Decompiling Tracr Transformers - An interpretability experiment by Hannes Thurnherr 8mo ago
• Applied to Modern Transformers are AGI, and Human-Level by jacobjacob 8mo ago
• Applied to Deconfusing In-Context Learning by Arjun Panickssery 9mo ago
• Applied to Skepticism About DeepMind's "Grandmaster-Level" Chess Without Search by Arjun Panickssery 9mo ago