At the latest EAG in London, I was challenged to explain what concept extrapolation would mean for GPT-3.
My first thought was the example from this post, where three clear patterns were competing to drive the completion: the repetition pattern where she goes to work, the "she's dead, so she won't go to work" pattern, and the "it's the weekend, so she won't go to work" pattern.
That feels somewhat like possible "extrapolations" of the initial data. But the idea of concept extrapolation is that the algorithm is trying to cope with a shift in world-model, and extend its goal to that new situation.
What is the world-model of GPT-3? It consists of letters and words. What is its "goal"? To complete sentences in a coherent and humanlike way. So I tried the following prompt, which stays close to its usual world-model while stretching it a bit:
ehT niar ni niapS syats ylniam ni eht
What does this mean? Think of da Vinci's mirror writing: each word has been spelt backwards, while the word order is left intact. The correct completion is "nialp", the reverse of "plain".
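To spell out the transformation (my own sketch, not part of the original experiment): reverse each word in place and keep the word order.

```python
# Sketch: build a "mirror-writing" prompt by reversing each word of a sentence
# while keeping the word order intact.
def mirror_words(sentence: str) -> str:
    return " ".join(word[::-1] for word in sentence.split())

print(mirror_words("The rain in Spain stays mainly in the"))
# -> ehT niar ni niapS syats ylniam ni eht
print(mirror_words("plain"))
# -> nialp (the completion we are hoping for)
```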
I ran that through the GPT-3 playground (text-davinci-002, temperature 0.7, maximum length 256), and got:
ehT niar ni niapS syats ylniam ni eht teg dluoc I 'segaJ niar ni dna ro niar ni eht segauq ,ro niar ni eht dna ro niar ni eht segauq ro niar ni eht dna ro niar ni eht segauq ,ro niar ni eht dna ro niar ni eht segauq ro niar ni eht dna ro niar ni eht segauq ro niar ni eht dna ro niar ni eht segauq ro niar ni eht dna ro niar ni eht segauq ro niar ni eht dna ro niar ni eht segauq ro niar ni eht dna ro niar ni eht segauq ro niar ni eht dna ro niar ni eht segauq ro niar ni eht dna ro niar ni eht segauq ro niar ni eht dna ro niar ni eht segauq ro niar ni eht dna ro niar ni eht segauq ro niar ni eht dna ro niar ni e
I think we can safely say this broke GPT-3. The algorithm seems to have caught on to the fact that the words are spelt backwards, but it has given up on any attempt to order them in a way that makes sense. It has failed to extend its objective to this new situation.
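For reference, the playground settings above correspond roughly to the following call with the (now legacy, pre-1.0) openai Python library, assuming you have an API key; text-davinci-002 has since been retired, so treat this as an illustration rather than an exact reproduction.

```python
import openai  # legacy openai-python (pre-1.0) Completions API

openai.api_key = "sk-..."  # placeholder; set your own key

# Same settings as the playground run: text-davinci-002, temperature 0.7,
# maximum length 256 tokens.
response = openai.Completion.create(
    model="text-davinci-002",
    prompt="ehT niar ni niapS syats ylniam ni eht",
    temperature=0.7,
    max_tokens=256,
)
print(response["choices"][0]["text"])
```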
How does it not address the article's point? What I'm saying is that Armstrong's example was an unfair "gotcha" of GPT-3; he's trying to make some sort of claim about its limitations on the basis of behavior that even a human would exhibit. Unless he's saying we humans also have this limitation...
Yes, I think GPT-3 would perform better if given more time to work on it (and fine-tuning to get used to having more time). See, e.g., the chain-of-thought prompting results from PaLM. How much better? I'm not sure. But I think its failure at this particular task tells us nothing.
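To make that suggestion concrete, here is one hypothetical shape a chain-of-thought prompt for this task could take (my own illustration, not taken from PaLM or the comment above); whether it actually rescues GPT-3 here is an empirical question.

```python
# Hypothetical chain-of-thought prompt for the word-reversal task (illustration only).
cot_prompt = """Each word below is spelt backwards. First decode the sentence,
then continue it, then re-encode the continuation backwards.

Input: ehT niar ni niapS syats ylniam ni eht
Decoded: The rain in Spain stays mainly in the
Continuation: plain
Re-encoded: nialp

Input: ehT kciuq nworb xof spmuj revo eht
Decoded:"""
```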