This is a fact worth knowing and a lucid explanation --- thanks for writing this!
I know it's not the main point of the post, but I found myself a little lost when you talked about complexity theory; I would be interested to hear more details. In particular:
What are the definitions of these classes? Are these decision problems in the usual sense, or is this a statement about learning theory? I haven't been able to either track down a definition of e.g. "NP-invertible" or invent one that would make sense. If this is a renamed version of something people have studied, I would be curious to know the original name.
Again though, I know these are quibbles and the main point of this section is "you don't need any unproven complexity assumptions to get lower bounds against statistical query learning".
(Separately, a minor nit with the presentation: I found the decision to use "learning algorithm" to refer specifically to "weight-based ML architecture that learns via SGD" slightly confusing. A takeaway of the form "there exists an algorithm that learns parities, but there doesn't exist a learning algorithm that learns parities" is a little slippery; I think it would be worth replacing "learning algorithm" with e.g. "gradient-based learning algorithm" in the writeup.)
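To make the "algorithm" versus "gradient-based learning algorithm" distinction concrete: one standard non-gradient way to learn a noiseless parity is Gaussian elimination over GF(2). Below is a minimal Python sketch, assuming noiseless labels and inputs packed as n-bit integers. The names `parity` and `learn_parity` are my own illustrative choices, not anything taken from the post.

```python
import random


def parity(secret, x):
    """XOR of the bits of x selected by the bitmask secret."""
    return bin(secret & x).count("1") % 2


def learn_parity(samples, n):
    """Recover the n-bit secret from noiseless (x, y) pairs with y = parity(secret, x).

    Uses Gaussian elimination over GF(2). Returns None if the samples
    do not span all n dimensions (the secret is then not determined).
    """
    basis = {}  # leading-bit index -> reduced (x, y) row with that leading bit
    for x, y in samples:
        # Reduce the row against the basis until it is zero or has a new leading bit.
        while x:
            lead = x.bit_length() - 1
            if lead not in basis:
                basis[lead] = (x, y)
                break
            bx, by = basis[lead]
            x ^= bx
            y ^= by

    if len(basis) < n:
        return None

    # Back-substitute from the lowest leading bit upward: the row with leading
    # bit `lead` only involves secret bits at positions <= lead.
    secret = 0
    for lead in range(n):
        x, y = basis[lead]
        lower = x & ~(1 << lead)
        bit = y ^ parity(secret, lower)
        secret |= bit << lead
    return secret


if __name__ == "__main__":
    rng = random.Random(0)
    n = 16
    secret = rng.getrandbits(n)
    xs = [rng.getrandbits(n) for _ in range(4 * n)]
    samples = [(x, parity(secret, x)) for x in xs]
    print(learn_parity(samples, n) == secret)
```

With somewhat more than n uniformly random samples the linear system has full rank with high probability, so this runs with polynomially many samples and polynomial time. It is not a statistical-query or gradient-based method, which is why it does not conflict with the lower bound.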
Confirming that efficiently finding a small circuit (you don't actually need any restriction beyond size) from its values on a fixed collection of test inputs is known to imply NP⊆BPP; see this paper.