It seems like GPT-4 will be coming out soon and, so I've heard, it will be awesome. We don't yet know anything about its architecture, its size, or how it was trained. If it were trained only on text (about 3.2T tokens) in a compute-optimal manner, it would be about 2.5X the size of Chinchilla, i.e. roughly the size of GPT-3. So to be larger than GPT-3, it would need to be multi-modal, which could enable some interesting capabilities.
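As a sanity check on that size estimate, here is a minimal back-of-the-envelope sketch. It assumes the approximate public figures for Chinchilla (about 70B parameters trained on about 1.4T tokens) and GPT-3 (about 175B parameters), and it assumes Chinchilla's tokens-per-parameter ratio carries over to other scales. The function name and constants are mine, for illustration only.

```python
# Rough Chinchilla-style sizing. All constants are approximate public figures,
# and the fixed tokens-per-parameter ratio is an assumption, not a law.
CHINCHILLA_PARAMS = 70e9    # Chinchilla: ~70B parameters
CHINCHILLA_TOKENS = 1.4e12  # trained on ~1.4T tokens
GPT3_PARAMS = 175e9         # GPT-3: ~175B parameters


def compute_optimal_params(tokens: float) -> float:
    """Parameter count for a model trained on `tokens` tokens, assuming the
    same tokens-per-parameter ratio that Chinchilla used (~20)."""
    tokens_per_param = CHINCHILLA_TOKENS / CHINCHILLA_PARAMS
    return tokens / tokens_per_param


text_tokens = 3.2e12  # the estimate of available text used above
params = compute_optimal_params(text_tokens)

print(f"{params / 1e9:.0f}B parameters")
print(f"{params / CHINCHILLA_PARAMS:.2f}x Chinchilla")
print(f"{params / GPT3_PARAMS:.2f}x GPT-3")
```

Under these assumptions the result is about 160B parameters, which is a little over 2X Chinchilla and slightly under GPT-3, so "about the size of GPT-3" holds up as a rough estimate.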
So it is time to ask that question again: what's the least impressive thing that GPT-4 won't be able to do? State your assumptions clearly, e.g. "a text- and image-generating GPT-4 in the style of X with size Y can't do Z."
My guess is that GPT-4 will not be able to convincingly answer a question as if it were a five-year-old. As a test, show an adult an answer and ask whether it came from a real five-year-old or from GPT-4 pretending to be one. For most questions where an adult would give a very different answer from a child, the adult will be able to tell the difference. My reason for expecting this limitation is that very little written content on the Internet is labeled as being produced by young children.
If GPT-4's training data includes YouTube video transcripts, though, it might be able to do this convincingly.