It seems like GPT-4 is going to be coming out soon and, so I've heard, it will be awesome. Now, we don't know anything about its architecture or its size or how it was trained. If it were only trained on text (about 3.2 T tokens) in an optimal manner, then it would be about 2.5X the size of Chinchilla i.e. the size of GPT-3. So to be larger than GPT-3, it would need to be multi-modal, which could present some interesting capabilities.
So it is time to ask that question again: what's the least impressive thing that GPT-4 won't be able to do? State your assumptions to be clear i.e. a text and image generating GPT-4 in the style of X with size Y can't do Z.
>75% confidence: No consistent strong play in simple game of imperfect information (e.g. battleship) for which it has not been specifically trained.
>50% confidence: No consistent "correct" play in a simple game of imperfect information (e.g. battleship) for which it has not been specifically train. Correct here means making only valid moves, and no useless moves. For example, in battleship a useless move would be attacking the same grid coordinate twice.
>60% confidence: Bad long-term sequence memory, particularly when combined with non-memorization tasks. For example, suppose A=1, B=2, etc. What is the sum of the characters in a given page of text (~500 words)?