“Anomalous”, “glitch”, or “unspeakable” tokens in an LLM are those that induce bizarre behavior or otherwise don’t behave like regular text.
The SolidGoldMagikarp saga is pretty much essential context, as it documents the discovery of this phenomenon in GPT-2 and GPT-3.
But, as far as I was able to tell, nobody had yet attempted to search for these tokens in DeepSeek-V3, so I tried doing exactly that. Being a SOTA base model, open source, and an all-around strange LLM, it seemed like a perfect candidate for this.
This is a catalog of the glitch tokens I've found in DeepSeek after a day or so of experimentation, along with some preliminary observations about their behavior.
Note: I’ll... (read 1949 more words →)
You'd think, but nope, it's explicitly named after the web fiction.