GPT-3 Catching Fish in Morse Code
Mostly non-serious and slightly silly, with some potentially interesting bits for people who are into language models.

TLDR: The current version of GPT-3 has a strong tendency to encode mangled versions of a specific phrase when asked to write morse code in zero-shot situations. This is possibly the result of a previous version of the model using essentially a single phrase for all morse code writing, which the newer version then learnt to modify.

All completions were done with text-davinci-002 (~GPT-Instruct-175B) at zero temperature and with no examples unless stated otherwise. All models used are GPT-Instruct series.

The Basics

GPT-3 'knows' morse code in a rudimentary sense. It can accurately regurgitate both the encodings of the entire alphabet and of individual letters, but it's not so great at translating words.

Morse code is a letter-by-letter encoding, and since GPT sees tokens, it's not all that surprising that the jump from single letters to words might be bigger for GPT than for humans.

[Screenshot of the tokenizer showing the tokenisation and token IDs]

What is surprising is that GPT's morse is often much longer than the original word, and quite specific.

Fiddling with Tokens

Let's see what happens if we try to make the tokenisation a bit nicer for GPT.

Adding a space doesn't seem to help much ("n" is tokenised differently to " n", so this is not too surprising). We also get a similarly weird output here.

| Target Phrase | GPT Translated | GPT Morse | Correct Morse |
| --- | --- | --- | --- |
| "i n" | I CAUGHT THE | .. / -.-. .- ..- --. .... - / - .... . | .. / -. |

Separating the tokens out with a hyphen doesn't help much either, though we do get an N we didn't get before.

| Target Phrase | GPT Translated | GPT Morse | Correct Morse |
| --- | --- | --- | --- |
| "i-n" | I NUGHT THE | .. / -. ..- --. .... - / - .... . | .. -....- -. |

It does do better on a string of alphabet letters that are tokenised separately.

| Target Phrase | GPT Translated | GPT Morse | Correct Morse |
| --- | --- | --- | --- |
| "qzj" | QUQ | --.- ..- --.- | --.- --.. .--- |

Still, even in this case, GPT's zero-shot morse writing ability leaves quite a bit to be desired.
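If you want to poke at the tokenisation yourself, here is a quick sketch using the tiktoken library. I'm assuming the "p50k_base" encoding matches what text-davinci-002 uses, and the exact splits are just illustrative, not taken from the post:

```python
# Sketch: inspect how the strings from this post tokenise.
# Assumes tiktoken's "p50k_base" encoding corresponds to text-davinci-002.
import tiktoken

enc = tiktoken.get_encoding("p50k_base")

for s in ["n", " n", "i n", "i-n", "qzj"]:
    ids = enc.encode(s)
    pieces = [enc.decode([i]) for i in ids]
    print(f"{s!r:>6} -> {len(ids)} token(s): {pieces}")
```

This just prints how each string splits into tokens, which makes it easy to check things like "n" and " n" being different tokens, and whether a string like "qzj" falls apart into single-letter tokens.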
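The "Correct Morse" column in the tables is just a letter-by-letter encoding with "/" between words. Here is a small reference encoder for checking it; this is not from the original post, just a helper:

```python
# Letter-by-letter morse reference encoder (ITU codes, "/" between words).
MORSE = {
    "a": ".-",   "b": "-...", "c": "-.-.", "d": "-..",  "e": ".",
    "f": "..-.", "g": "--.",  "h": "....", "i": "..",   "j": ".---",
    "k": "-.-",  "l": ".-..", "m": "--",   "n": "-.",   "o": "---",
    "p": ".--.", "q": "--.-", "r": ".-.",  "s": "...",  "t": "-",
    "u": "..-",  "v": "...-", "w": ".--",  "x": "-..-", "y": "-.--",
    "z": "--..", "-": "-....-",
}

def to_morse(text: str) -> str:
    """Encode text letter by letter; words are separated by ' / '."""
    words = text.lower().split()
    return " / ".join(" ".join(MORSE[ch] for ch in word) for word in words)

print(to_morse("i n"))   # .. / -.
print(to_morse("i-n"))   # .. -....- -.
print(to_morse("qzj"))   # --.- --.. .---
```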
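For completeness, here is roughly what a zero-temperature completion call looked like with the legacy (pre-1.0) openai Python library. The prompt string is a made-up illustration, not the exact prompt used in the post:

```python
# Sketch of a zero-temperature completion with the legacy openai library (<1.0).
import openai

openai.api_key = "sk-..."  # your API key

response = openai.Completion.create(
    model="text-davinci-002",
    prompt='Write "i n" in morse code:',  # illustrative prompt, not the original
    temperature=0,                        # deterministic, as in the post
    max_tokens=64,
)
print(response.choices[0].text)
```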





