Laurence Aitchison

One nice little prediction from this approach: you'd expect the first few tokens to have denser (in the SAE sense) features, as there is less context, so the "HMM" could be in a broad range of states. Once you've seen more tokens, you have much more information, the state is pinned down more precisely, and you'd expect the features to be sparser.
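
To make that concrete, here's a minimal sketch (my illustration, not from the original comment): forward filtering in a toy HMM with "sticky" transitions, tracking the entropy of the posterior over hidden states. All the parameters are made up; the point is just that posterior entropy starts high (many plausible states, so denser features on this prediction) and drops as observed tokens accumulate (sparser features).

```python
import numpy as np

rng = np.random.default_rng(0)
K, V, T = 8, 12, 50                      # hidden states, vocab size, sequence length

# Toy HMM parameters (assumptions for illustration only):
# sticky transitions so the latent state persists and evidence accumulates
A = 0.98 * np.eye(K) + 0.02 * rng.dirichlet(np.ones(K), size=K)
B = rng.dirichlet(np.ones(V), size=K)    # emissions, B[i, v] = p(x_t = v | z_t = i)
pi = np.ones(K) / K                      # uniform prior over the initial state

# Sample a token sequence from the HMM
z = np.zeros(T, dtype=int)
x = np.zeros(T, dtype=int)
z[0] = rng.choice(K, p=pi)
x[0] = rng.choice(V, p=B[z[0]])
for t in range(1, T):
    z[t] = rng.choice(K, p=A[z[t - 1]])
    x[t] = rng.choice(V, p=B[z[t]])

# Forward filtering: p(z_t | x_{1:t}), renormalized at each step
post = pi * B[:, x[0]]
post /= post.sum()
for t in range(T):
    if t > 0:
        post = (A.T @ post) * B[:, x[t]]  # predict with A, then weight by likelihood
        post /= post.sum()
    entropy = -np.sum(post * np.log(post + 1e-12))
    if t in (0, 1, 2, 5, T - 1):
        print(f"t={t:2d}  posterior entropy = {entropy:.3f} nats")
```

With sticky transitions the posterior concentrates within a handful of tokens; with fast-mixing transitions it wouldn't, so the prediction really is about how persistent the latent state is.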

There's also a big literature in computational neuroscience on how probabilities get represented. This proposal amounts to a "mean parameter code", where the LLM activations are a function of E[z | data]. But lots of other possibilities are available, e.g. see:

http://www.gatsby.ucl.ac.uk/teaching/courses/tn1-2021/slides/uncert-slides.pdf
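
For concreteness, a minimal sketch of what a mean parameter code means here (my illustration; the weights and dimensions are made up): with a one-hot latent z, E[z | data] is just the posterior vector, and the activations are some fixed function of it. One alternative from that literature (e.g. a sampling code) would represent the same posterior with samples instead.

```python
import numpy as np

rng = np.random.default_rng(1)
K, D = 8, 32                              # latent states, activation dimension

posterior = rng.dirichlet(np.ones(K))     # stand-in for p(z | data), e.g. from filtering

# One-hot z  =>  E[z | data] is exactly the posterior probability vector
mean_params = posterior

# Mean parameter code: activations are a (here linear) function of E[z | data]
W = rng.normal(size=(D, K))               # hypothetical readout weights
activations = W @ mean_params

# Contrast: a sampling code represents the posterior by samples from it
samples = rng.choice(K, size=100, p=posterior)
print(activations[:4])
print(np.bincount(samples, minlength=K) / 100)  # empirical posterior from samples
```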