It seems like GPT-4 is coming out soon and, so I've heard, it will be awesome. We don't know anything about its architecture, its size, or how it was trained. If it were trained only on text (roughly 3.2T tokens available) in a compute-optimal manner, it would be about 2.5X the size of Chinchilla (70B parameters), i.e. roughly the size of GPT-3 (175B). So to be larger than GPT-3, it would need to be multi-modal, which could present some interesting capabilities.
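For concreteness, here is a back-of-envelope sketch of that sizing arithmetic. The ~20-tokens-per-parameter ratio is the commonly cited Chinchilla rule of thumb, and the 3.2T-token figure is this post's estimate, so treat the outputs as rough:

```python
# Back-of-envelope compute-optimal sizing (assumption: the ~20
# tokens-per-parameter rule of thumb attributed to Chinchilla).
TOKENS_PER_PARAM = 20          # Chinchilla-style rule of thumb
text_tokens = 3.2e12           # this post's estimate of available text
chinchilla_params = 70e9       # Chinchilla's parameter count
gpt3_params = 175e9            # GPT-3's parameter count

optimal_params = text_tokens / TOKENS_PER_PARAM
print(f"optimal size:  {optimal_params / 1e9:.0f}B params")
print(f"vs Chinchilla: {optimal_params / chinchilla_params:.1f}x")
print(f"vs GPT-3:      {optimal_params / gpt3_params:.2f}x")
```

This gives ~160B parameters, about 2.3x Chinchilla and slightly under GPT-3's size, which is the ballpark the "about 2.5X" claim is gesturing at.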
So it is time to ask that question again: what's the least impressive thing that GPT-4 won't be able to do? State your assumptions to be clear, e.g. "a text-and-image-generating GPT-4 in the style of X with size Y can't do Z."
If you ask GPT-4 about GPT-4, it might be able to speak about itself in the second person, saying things that we could also say about GPT-4 speculatively, but it will not be able to convert those statements into the first person without prompt engineering or post-processing rewrite rules.
Even if grammatical uses of "I" or "me" refer to GPT-4 via clever prompt engineering or rewrite rules, the semantic content of the claims will not be constrained by GPT-4's actual performance or state.
If GPT-4 can, with some prompt engineering, play chess at an Elo rating of X, and we compare that to when it says "I, GPT-4, have an Elo rating of Y", the values X and Y will match only by accident. This lack of concordance will span many domains, and will make it clear that GPT-4 has no coherent interior experience linked to its most foundational operational modes. (In some sense it will be similar to humans, who mostly can't name the parts of their own brain stem, but GPT-4's "subconscious disconnect" will be MUCH MUCH larger.)
However, GPT-4's fictional characters (when generating text in the style of high-quality stories or news interviews) WILL be able to say "I" and predicate semantic properties of themselves-as-characters-or-interviewees that are coherently accurate... at least within the context window that GPT-4 operates over, and in some cases over larger spans if they have famous names linked to coherent personas.
GPT-4 itself will have no such famous persona, or at least if a persona exists for it, the persona it generates during its operation will be defined by the contents of human culture's projection of a persona as found within GPT-4's training data.
Any surprising new capabilities of GPT-4, relative to the personality its training data projects onto it, will not be something it can talk about fluently, not even in the second person.
I'd want to see what happens if you play a game that doesn't follow the exact moves of a published game. "Play chess" to me means coming up with good, valid moves in novel positions and being able to checkmate an opponent who's doing the same.
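One way to run that probe would be to generate a position that almost certainly isn't in any published game, show it to the model, and check whether the model's reply is even legal. A minimal sketch, assuming the third-party python-chess package (importable as `chess`); `ask_model_for_move` is a hypothetical stand-in for querying GPT-4:

```python
import random
import chess  # third-party python-chess package (assumption)

def random_novel_position(plies=12, seed=0):
    """Walk random legal moves from the start position to reach a
    position very unlikely to appear verbatim in published games."""
    rng = random.Random(seed)
    board = chess.Board()
    for _ in range(plies):
        moves = list(board.legal_moves)
        if not moves:
            break  # game ended early; still a usable board
        board.push(rng.choice(moves))
    return board

def is_legal_reply(board, uci_move):
    """Check whether a model-proposed move in UCI notation
    (e.g. 'e2e4') is legal in the given position."""
    try:
        move = chess.Move.from_uci(uci_move)
    except ValueError:
        return False  # not even syntactically a move
    return move in board.legal_moves

board = random_novel_position()
print(board.fen())  # the novel position to show the model
# verdict = is_legal_reply(board, ask_model_for_move(board.fen()))
```

Checkmating ability would need a full game loop against an engine, but even this legality check separates "recites openings" from "plays chess."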