This is a linkpost for https://arxiv.org/abs/2305.11206

Abstract

Large language models are trained in two stages: (1) unsupervised pretraining from raw text, to learn general-purpose representations, and (2) large scale instruction tuning and reinforcement learning, to better align to end tasks and user preferences. We measure the relative importance of these two stages by training LIMA, a 65B parameter LLaMa language model fine-tuned with the standard supervised loss on only 1,000 carefully curated prompts and responses, without any reinforcement learning or human preference modeling. LIMA demonstrates remarkably strong performance, learning to follow specific response formats from only a handful of examples in the training data, including complex queries that range from planning trip itineraries to speculating about alternate history. Moreover, the model tends to generalize well to unseen tasks that did not appear in the training data. In a controlled human study, responses from LIMA are either equivalent or strictly preferred to GPT-4 in 43% of cases; this statistic is as high as 58% when compared to Bard and 65% versus DaVinci003, which was trained with human feedback. Taken together, these results strongly suggest that almost all knowledge in large language models is learned during pretraining, and only limited instruction tuning data is necessary to teach models to produce high quality output.
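The "standard supervised loss" mentioned here is just next-token cross-entropy on the curated responses, with no reward model or RLHF step. As a rough illustration of that setup (not the paper's exact code; the model checkpoint, masking scheme, and hyperparameters below are assumptions):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative sketch of supervised fine-tuning on curated (prompt, response)
# pairs: plain cross-entropy on the response tokens only. The model name,
# masking scheme, and hyperparameters are assumptions, not the paper's recipe.

MODEL = "huggyllama/llama-65b"  # placeholder; any causal LM works for the sketch
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL, torch_dtype=torch.bfloat16).cuda()

def encode(example, max_len=2048):
    """Tokenize prompt + response; loss is computed on response tokens only."""
    prompt_ids = tokenizer(example["prompt"], add_special_tokens=False)["input_ids"]
    response_ids = tokenizer(example["response"], add_special_tokens=False)["input_ids"]
    input_ids = (prompt_ids + response_ids + [tokenizer.eos_token_id])[:max_len]
    # -100 masks prompt positions out of the cross-entropy loss
    labels = ([-100] * len(prompt_ids) + response_ids + [tokenizer.eos_token_id])[:max_len]
    return {"input_ids": torch.tensor(input_ids), "labels": torch.tensor(labels)}

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)  # illustrative settings

def train(dataset, epochs=15):
    model.train()
    for _ in range(epochs):
        for example in dataset:  # ~1,000 curated examples
            batch = encode(example)
            out = model(
                input_ids=batch["input_ids"].unsqueeze(0).cuda(),
                labels=batch["labels"].unsqueeze(0).cuda(),
            )
            out.loss.backward()  # standard language-modeling cross-entropy
            optimizer.step()
            optimizer.zero_grad()
```

The point is how little machinery is involved: roughly 1,000 examples and a plain language-modeling loss.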

Implications

Data Quality & Capabilities

Along with TinyStories and QLoRA, I'm becoming increasingly convinced that data quality is all you need. That definitely seems to be the case for fine-tuning, and it may be the case for base-model training as well. Better scaling laws through a higher-quality corpus?

Also, for those who haven't updated yet: it seems very likely that GPT-4 equivalents will be essentially free to self-host and tune within a year. Plan for this!

Perplexity != Quality

When fine-tuning LIMA, we observe that perplexity on held-out Stack Exchange data (2,000 examples) negatively correlates with the model’s ability to produce quality responses. To quantify this manual observation, we evaluate model generations using ChatGPT, following the methodology described in Section 5. Figure 9 shows that as perplexity rises with more training steps – which is typically a negative sign that the model is overfitting – so does the quality of generations increase. Lacking an intrinsic evaluation method, we thus resort to manual checkpoint selection using a small 50-example validation set.

Because of this, the authors manually select checkpoints between the 5th and 10th epochs (out of 15) using the held-out 50-example development set.
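In other words, checkpoint selection ends up being driven by generation quality on a tiny dev set rather than by held-out perplexity. Here is a minimal sketch of what such a selection loop could look like; the `judge` callable is a stand-in for the paper's ChatGPT-based grading, and the function names and arguments are hypothetical:

```python
import math
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical sketch: score each saved checkpoint on (a) held-out perplexity
# and (b) generation quality over a small dev set, then pick the checkpoint
# with the best quality score rather than the lowest perplexity.

def heldout_perplexity(model, tokenizer, texts, device="cuda"):
    """Average perplexity over a list of held-out strings."""
    model.eval()
    nll, n_tokens = 0.0, 0
    with torch.no_grad():
        for text in texts:
            enc = tokenizer(text, return_tensors="pt").to(device)
            out = model(**enc, labels=enc["input_ids"])
            n = enc["input_ids"].numel()
            nll += out.loss.item() * n
            n_tokens += n
    return math.exp(nll / n_tokens)

def quality_score(model, tokenizer, prompts, judge, device="cuda"):
    """Mean judge score over generations for a small dev set of prompts.
    `judge` stands in for the paper's ChatGPT-based grading."""
    scores = []
    for prompt in prompts:
        enc = tokenizer(prompt, return_tensors="pt").to(device)
        gen = model.generate(**enc, max_new_tokens=512, do_sample=True, top_p=0.9)
        reply = tokenizer.decode(gen[0, enc["input_ids"].shape[1]:], skip_special_tokens=True)
        scores.append(judge(prompt, reply))
    return sum(scores) / len(scores)

def select_checkpoint(checkpoint_dirs, heldout_texts, dev_prompts, judge):
    best_dir, best_quality = None, float("-inf")
    for ckpt in checkpoint_dirs:
        tokenizer = AutoTokenizer.from_pretrained(ckpt)
        model = AutoModelForCausalLM.from_pretrained(ckpt, torch_dtype=torch.float16).to("cuda")
        ppl = heldout_perplexity(model, tokenizer, heldout_texts)
        qual = quality_score(model, tokenizer, dev_prompts, judge)
        print(f"{ckpt}: perplexity={ppl:.2f}, quality={qual:.2f}")
        if qual > best_quality:  # quality, not perplexity, drives the choice
            best_dir, best_quality = ckpt, qual
    return best_dir
```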
 

Comments

In a controlled human study, responses from LIMA are either equivalent or strictly preferred to GPT-4 in 43% of cases;

I'm not sure how well this metric tracks what people care about — performance on particular downstream tasks (e.g. passing a law exam, writing bugless code, automating alignment research, etc.).

Copied it from the paper. I could break it down into several paragraphs but I figured bolding the important bits was easier. Might break up abstracts in future linkposts.

the bold font on lesswrong is too small of a difference vs normal text for this to work well, IMO. (I wish it were a stronger bolding so it was actually useful for communicating to others.) just messing around with one suggested way to do this from SO:

Added italics. For the next post I'll break up the abstract into smaller paragraphs and/or make a TL;DR.

yeah I don't like that either, I find italics harder not easier to read in this font, sorry to disappoint, heh.