Physics of Language models (part 2.1)

Nathan Helm-Burger

19th Sep 2024

This is a linkpost for https://youtu.be/bpp6Dz8N2zY?si=RC20soJLynXxNOfv

This is perhaps the best interpretability work I've seen outside of Chris Olah's team.

Tags: Interpretability (ML & AI), AI

Comment by StefanHex:

Paper link: https://arxiv.org/abs/2407.20311

(I have neither watched the video nor read the paper yet; I'm posting the link in case someone else is looking for the non-video version.)
