This is METR’s collection of resources for evaluating potentially dangerous autonomous capabilities of frontier models. The resources include a task suite, some software tooling, and guidelines on how to ensure an accurate measurement of model capability. Building on those, we’ve written an example evaluation protocol. While intended as a “beta”...
Update 3/14/2024: This post is out of date. For current information on the task bounty, see our Task Development Guide. Summary METR (formerly ARC Evals) is looking for (1) ideas, (2) detailed specifications, and (3) well-tested implementations for tasks to measure performance of autonomous LLM agents. Quick description of key...
Update: We are no longer accepting gnarly bug submissions. However, we are still accepting submissions for our Task Bounty! TL;DR: Looking for hard debugging tasks for evals, paying the greater of $60/hr or $200 per example. METR (formerly ARC Evals) is interested in producing hard debugging tasks for models to attempt...
Authors' Contributions: Both authors contributed equally to this project as a whole. Evan did the majority of implementation work, as well as the work for writing this post. Megan was more involved at the beginning of the project, and did the majority of experiment design. While Megan did give some...
The first half of this post uses causal tracing to explore differences in how GPT2-XL handles completing cached phrases vs. completing factual statements. The second half details my attempt to build intuitions about the high-level structure of GPT2-XL and is speculation-heavy. Some familiarity with transformer architecture is assumed but...
I try out "let's gather the relevant facts" as a zero-shot question-answering aid on TruthfulQA. It doesn't help more than other helpful prompts, though it might work better on more typical factual questions. This post could be useful to people interested in playing with OpenAI's API, or who...