The base models of GPT-3 already have the ability to "follow instructions", it's just veiled behind the more general interface. If you prompt it with something as simple as this (GPT generation is highlighted), you can see how it contains this capability somewhere.
You may have noticed that it starts to repeat itself after a few lines, and comes up with new questions on its own besides. That's part of what the fine-tuning fixes: it makes the generations more concise and stops them at the point where the next token would lead to another question. InstructGPT also has the value of not needing the wrapper of "Q: [] A: []", but that's not really a qualitative difference.
In other words, instruction following is not a new capability and the fine-tuning doesn't really make any qualitative changes to the model. In fact, I think that you can get results [close to] this good if you prompt it really well (like, in the realm of soft prompts).
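To make the "Q: [] A: []" wrapper concrete, here is a minimal sketch of how one could get instruction-following out of a base completion model: wrap the instruction in the Q/A format, then cut the raw completion off where the model would start inventing a new question. The function `base_model_generate` is a hypothetical stand-in for a real API call, mocked here for illustration:

```python
# Sketch of instruction-following via the "Q: [] A: []" wrapper.
# base_model_generate is a hypothetical stand-in for a base-model API call;
# its mocked output reproduces the behavior described above: it answers,
# then keeps going and makes up a new question on its own.

def base_model_generate(prompt: str) -> str:
    # Mocked base-model completion for illustration only.
    return ("Paris is the capital of France.\n"
            "Q: What is the capital of Germany?\n"
            "A: Berlin is the capital of Germany.")

def instruct_via_wrapper(instruction: str) -> str:
    prompt = f"Q: {instruction}\nA:"
    raw = base_model_generate(prompt)
    # Stop at the point where the next tokens would begin another question,
    # i.e. treat "\nQ:" as a stop sequence.
    answer = raw.split("\nQ:")[0]
    return answer.strip()

print(instruct_via_wrapper("What is the capital of France?"))
```

In practice the same effect is what API "stop sequences" implement; the fine-tuning makes this trimming unnecessary by teaching the model to end its generation there on its own.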
The base models of GPT-3 already have the ability to "follow instructions", it's just veiled behind the more general interface. [...] you can see how it contains this capability somewhere.
This is a good point that I forgot. My mental model of this is that since many training samples are Q&A, in these cases, learning to complete implies learning how to answer.
InstructGPT also has the value of not needing the wrapper of "Q: [] A: []", but that's not really a qualitative difference.
I want to push back a little bit on the claim that this is not a qualitative difference. [...]
I posted in the open thread and was told that it would be worth promoting to top level.
cubefox responded with a link to a great explanation of how the fine-tuning is done, which made me realize that my original question was unclear, so I'm going to try to clarify.
The fundamental behavior of GPT-3 is token prediction, which can straightforwardly be leveraged into text completion; in contrast, the fundamental behavior of InstructGPT is instruction following. Instruction following is a new capability that uses the knowledge from the token prediction task to produce output as well as to understand input; how does that capability develop?
Some plausible experiments related to the question:
Given a prompt P, which produces completion C when fed into the fine-tuned model, try to find a prompt P' that produces C when fed into the original model.
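The experiment above can be sketched in code. Everything here is hypothetical scaffolding: the two "models" are toy lookup tables, and the candidate pool stands in for whatever prompt-search method (manual rewriting, soft-prompt optimization, etc.) one would actually use:

```python
# Hypothetical sketch of the prompt-inversion experiment: given a prompt P
# whose completion under the fine-tuned model is C, search a pool of
# candidate prompts for a P' that elicits the same C from the base model.
# Both models are mocked as lookup tables purely for illustration.

def fine_tuned_model(prompt: str) -> str:
    # Toy stand-in for the instruction-tuned model.
    table = {"Summarize: the cat sat on the mat.": "A cat sat on a mat."}
    return table.get(prompt, "")

def base_model(prompt: str) -> str:
    # Toy stand-in for the base completion model: a plain continuation
    # prompt just continues the text, while a completion-style framing
    # elicits the summary.
    table = {
        "the cat sat on the mat": " and purred.",
        "TL;DR of 'the cat sat on the mat':": "A cat sat on a mat.",
    }
    return table.get(prompt, "")

def find_equivalent_prompt(P, candidates):
    C = fine_tuned_model(P)          # completion under the fine-tuned model
    for P_prime in candidates:
        if base_model(P_prime) == C:  # does P' elicit the same C?
            return P_prime
    return None

candidates = ["the cat sat on the mat", "TL;DR of 'the cat sat on the mat':"]
print(find_equivalent_prompt("Summarize: the cat sat on the mat.", candidates))
```

In a real version of the experiment the interesting part is the search over P' (and how far from P it has to be), which this enumeration over a fixed pool only gestures at.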