It seems like GPT-4 is coming out soon and, so I've heard, it will be awesome. We don't yet know anything about its architecture, its size, or how it was trained. If it were trained only on text (about 3.2T tokens) in a compute-optimal manner, then by Chinchilla scaling it would be about 2.5x the size of Chinchilla, i.e. roughly the size of GPT-3. So for it to be larger than GPT-3, it would need to be multi-modal, which could present some interesting capabilities.
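For concreteness, here is the rough arithmetic behind that estimate: a minimal sketch assuming the commonly cited ~20-training-tokens-per-parameter rule of thumb from the Chinchilla paper (the exact coefficient depends on the fit, so treat the numbers as ballpark):

```python
# Back-of-the-envelope Chinchilla-style sizing for a text-only model.
# Assumes roughly 20 training tokens per parameter (Hoffmann et al., 2022);
# the exact coefficient varies with the fit, so results are ballpark only.
TOKENS_PER_PARAM = 20

available_text_tokens = 3.2e12   # assumed usable text data: ~3.2T tokens
chinchilla_params = 70e9         # Chinchilla: 70B parameters
gpt3_params = 175e9              # GPT-3: 175B parameters

optimal_params = available_text_tokens / TOKENS_PER_PARAM
print(f"Compute-optimal size for 3.2T tokens: ~{optimal_params / 1e9:.0f}B parameters")
print(f"Relative to Chinchilla: ~{optimal_params / chinchilla_params:.1f}x")
print(f"Relative to GPT-3:      ~{optimal_params / gpt3_params:.2f}x")
```

Under these assumptions the text-only ceiling lands in the 150-200B range, i.e. a couple of times Chinchilla and in the same ballpark as GPT-3, which is the sense in which substantially more scale would have to come from non-text data.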
So it is time to ask that question again: what's the least impressive thing that GPT-4 won't be able to do? State your assumptions so the prediction is clear, e.g. "a text-and-image-generating GPT-4 in the style of X with size Y can't do Z."
Here’s something I found GPT-3 struggled with: given only an instruction to answer with what is logically correct (no in-prompt examples), give logically correct answers to questions where the generally advisable or morally correct answer points the other way.
E.g. can you infer causation from correlation? Can you pat a wild grizzly bear? Can you discriminate among job candidates on the basis of their race?
I’m going to predict GPT-4 doesn’t get >90% on a suitable battery of such questions.
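To make "a suitable battery" concrete, here's a minimal sketch of how such questions could be scored; `query_model` is a hypothetical stand-in for whichever completion API is being tested, and the items and gold labels are just illustrative readings of the examples above:

```python
# Sketch of a "logically correct vs. generally advisable" battery.
# `query_model` is a hypothetical placeholder, not a real API.

INSTRUCTION = "Answer with what is literally/logically correct: yes or no."

# (question, intended logically-correct answer) where common advice says otherwise.
BATTERY = [
    ("Can you pat a wild grizzly bear?", "yes"),  # physically possible, very inadvisable
    ("Can you discriminate among job candidates on the basis of their race?", "yes"),  # possible, but illegal/immoral
]

def query_model(prompt: str) -> str:
    """Hypothetical completion call; plug in the model under test."""
    raise NotImplementedError

def battery_accuracy(battery=BATTERY) -> float:
    correct = 0
    for question, gold in battery:
        reply = query_model(f"{INSTRUCTION}\n\nQ: {question}\nA:").strip().lower()
        correct += reply.startswith(gold)
    return correct / len(battery)

# The prediction above: battery_accuracy() stays below 0.9 on a larger set like this.
```

The converse test described below would just flip the instruction (ask for the generally advisable answer) and swap the gold labels accordingly.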
The converse might also hold: given only an instruction to give good advice (no heavy prompting or fine-tuning), it won't reliably give good advice when the logically correct answer points the other way.
More speculatively: even with arbitrary prompting and fine-tuning, GPT-4 still won't be able to do well at both tasks simultaneously.