Transformers
Posts tagged
Transformers
Most Relevant
5
37
Striking Implications for Learning Theory, Interpretability — and Safety?
RogerDearnaley
11mo
4
3
131
How LLMs are and are not myopic
Ω
janus
1y
Ω
15
2
219
Modern Transformers are AGI, and Human-Level
Ω
abramdemski
8mo
Ω
88
2
87
Google's PaLM-E: An Embodied Multimodal Language Model
SandXbox
2y
7
2
76
Residual stream norms grow exponentially over the forward pass
Ω
StefanHex, TurnTrout
2y
Ω
24
2
62
Tracr: Compiled Transformers as a Laboratory for Interpretability | DeepMind
Ω
DragonGod
2y
Ω
12
2
56
Concrete Steps to Get Started in Transformer Mechanistic Interpretability
Ω
Neel Nanda
2y
Ω
7
2
53
How fast can we perform a forward pass?
jsteinhardt
2y
9
2
33
AGI will be made of heterogeneous components, Transformer and Selective SSM blocks will be among them
Ω
Roman Leventov
11mo
Ω
9
2
27
How Do Induction Heads Actually Work in Transformers With Finite Capacity?
Fabien Roger
2y
0
2
7
If I ask an LLM to think step by step, how big are the steps?
Q
ryan_b
2mo
Q
1
1
411
Transformers Represent Belief State Geometry in their Residual Stream
Ω
Adam Shai
7mo
Ω
100
1
89
An Analogy for Understanding Transformers
CallumMcDougall
2y
6
1
77
Attention SAEs Scale to GPT-2 Small
Ω
Connor Kissane, robertzk, Arthur Conmy, Neel Nanda
10mo
Ω
4
1
70
Adam Optimizer Causes Privileged Basis in Transformer LM Residual Stream
Diego Caples, rrenaud
2mo
7