It should be pointed out that the original paper/press release describing GPT-4 explicitly says that OpenAI found BIG-bench had contaminated their training data, and therefore excluded it as an evaluation. As far as I know, there was no similar disclosure for Claude or other models. See footnote 5 here: https://arxiv.org/abs/2303.08774v1
Leetcode questions are not selected for novelty. In fact, the best way to get a problem turned into a Leetcode question is to post it to Leetcode's discussion board and say you were asked it in an interview at a big tech company. So it's still possible that some or even many of these questions appear nearly verbatim in the training data.
This argument seems to be one by analogy. steam engine:industrial revolution::???:machine learning. But as you can see, there's a term in the analogy I don't understand. Is ??? ChatGPT? LLMs? Transformers? AlexNet? The internet? Digital computers? Something that hasn't yet been invented?