At the latest EAG in London, I was challenged to explain what concept extrapolation would mean for GPT-3.
My first thought was the example from this post, where there were three clear patterns fighting each other for possible completions: the repetition pattern where she goes to work, the "she's dead, so she won't go to work" pattern, and the "it's the weekend, so she won't go to work" pattern.
That feels somewhat like possible "extrapolations" of the initial data. But the idea of concept extrapolation is that the algorithm is trying to cope with a shift in world-model, and extend its goal to that new situation.
What is the world-model of GPT-3? It consists of letters and words. What is its "goal"? To complete sentences in a coherent and humanlike way. So I tried the following expression, which would be close to its traditional world-model while expanding it a bit:
ehT niar ni niapS syats ylniam ni eht
What does this mean? Think of da Vinci. The correct completion is "nialp", the reverse of "plain".
I ran that through the GPT-3 playground (text-davinci-002, temperature 0.7, maximum length 256), and got:
ehT niar ni niapS syats ylniam ni eht teg dluoc I 'segaJ niar ni dna ro niar ni eht segauq ,ro niar ni eht dna ro niar ni eht segauq ro niar ni eht dna ro niar ni eht segauq ,ro niar ni eht dna ro niar ni eht segauq ro niar ni eht dna ro niar ni eht segauq ro niar ni eht dna ro niar ni eht segauq ro niar ni eht dna ro niar ni eht segauq ro niar ni eht dna ro niar ni eht segauq ro niar ni eht dna ro niar ni eht segauq ro niar ni eht dna ro niar ni eht segauq ro niar ni eht dna ro niar ni eht segauq ro niar ni eht dna ro niar ni eht segauq ro niar ni eht dna ro niar ni eht segauq ro niar ni eht dna ro niar ni e
I think we can safely say it broke GPT-3. The algorithm seems to have caught the fact that the words were spelt backwards, but has given up on any attempt to order them in a way that makes sense. It has failed to extend its objective to this new situation.
I think this is too far out of my experience to say anything with certainty, or for it to be particularly informative about how well I'd do in generalizing OOD within my current goals and ontology.