You've probably seen ChatGPT going crazy this week and spewing nonsense in response to all sorts of prompts (e.g. https://twitter.com/alyssamvance/status/1760094457393983829)

The official OpenAI postmortem is https://openai.statuspage.io/incidents/ssg8fh7sfyz3, which says:

LLMs generate responses by randomly sampling words based in part on probabilities. Their “language” consists of numbers that map to tokens.

In this case, the bug was in the step where the model chooses these numbers. Akin to being lost in translation, the model chose slightly wrong numbers, which produced word sequences that made no sense. More technically, inference kernels produced incorrect results when used in certain GPU configurations.
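To make the "choosing these numbers" step concrete, here is a minimal sketch of next-token sampling. The vocabulary, scores, and noise are made up purely for illustration; this is not OpenAI's actual code or the actual bug, just the general shape of the step the postmortem is describing:

```python
import numpy as np

# Toy vocabulary: token id -> text (made-up values, purely for illustration)
vocab = {0: " the", 1: " ice", 2: " lake", 3: " cream", 4: " there"}

# Raw scores ("logits") the model computed for the next token
logits = np.array([2.0, 3.5, 0.1, 3.4, 0.2])

def sample_next_token(logits, rng):
    # Softmax turns raw scores into a probability distribution over tokens
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    # Randomly pick a token id in proportion to those probabilities
    return rng.choice(len(probs), p=probs)

rng = np.random.default_rng(0)
print(vocab[sample_next_token(logits, rng)])     # usually " ice" or " cream"

# If a buggy kernel hands back slightly wrong numbers, different tokens
# suddenly look likely, and the sampled sequence drifts into nonsense.
corrupted = logits + rng.normal(scale=3.0, size=logits.shape)
print(vocab[sample_next_token(corrupted, rng)])  # could be any of them
```

The real failure was lower-level than this sketch (GPU inference kernels computing those numbers incorrectly in certain hardware configurations), but the downstream effect is the same: the probabilities feeding the final choice are wrong, so the chosen tokens stop fitting together.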

What does that explanation mean in practice? Did the model choose the wrong tokens from the tokenizer? Was it outputting some sequence of token ids like [451, 621, 9800] that should mean "I love ice" but instead got resolved to "Is there lake"? Or is it something else?
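For the tokenizer half of that question: as far as the postmortem describes it, the id-to-text mapping itself wasn't the problem; that mapping is a fixed, deterministic lookup, and the bug was upstream, in choosing which ids to emit. You can inspect the mapping yourself with OpenAI's tiktoken library (the specific ids below are just the ones from the question; I haven't checked what they happen to decode to):

```python
import tiktoken

# cl100k_base is the encoding used by the GPT-3.5 / GPT-4 family
enc = tiktoken.get_encoding("cl100k_base")

ids = enc.encode("I love ice")
print(ids)              # a short list of integer token ids
print(enc.decode(ids))  # "I love ice" -- decoding is a fixed table lookup

# The ids from the question decode to whatever strings they map to in
# this encoding, which need not be anything meaningful
print(enc.decode([451, 621, 9800]))
```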

Also, if someone could explain to me in layman's terms what the last sentence means about the inference kernels, I would be very grateful!

Comment from Ann:

In this context, "inference kernels" aren't operating-system kernels; they're GPU compute kernels, the small, highly optimized routines that run on the GPU to execute the model during inference (matrix multiplications, softmax, sampling, and so on). Aside from that I'm not sure; there are a lot of possible numbers involved. But if the bug hit specifically the choosing-by-probability step and was roughly equivalent to a higher temperature, the output would effectively get much more random, though it might still be constrained to a near-sensible coherence.
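To illustrate the temperature analogy in that comment, here is a small sketch (with made-up scores) of how scaling the model's scores before the softmax flattens the distribution, so that sampling picks unlikely tokens far more often:

```python
import numpy as np

logits = np.array([4.0, 2.0, 1.0, 0.5])   # made-up next-token scores

def softmax_with_temperature(logits, temperature):
    # Higher temperature flattens the distribution; lower temperature sharpens it
    scaled = logits / temperature
    probs = np.exp(scaled - scaled.max())
    return probs / probs.sum()

for t in (0.7, 1.0, 2.0, 5.0):
    print(t, np.round(softmax_with_temperature(logits, t), 3))

# As temperature rises the probabilities approach uniform, so sampling
# chooses low-probability tokens much more often and coherence degrades.
```

Whether the actual kernel bug behaved like a temperature increase is the comment's guess, not something the postmortem states; numerically corrupted logits could distort the distribution in messier ways than a uniform flattening.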