Choosing the wrong reference class?
A version of this situation seems to cover a lot of career decisions made by sufficiently talented people. If you're a young Mark Zuckerberg, should you drop out of Harvard? Dropping out of college is a bad idea, on average. It's not quite as clear-cut as OP suggested, because you can't reliably replicate what Bill Gates did, but there may be strong indicators that startup founders, or some more specific subclass like startup founders experiencing x% monthly growth in recurring users, and not the average college dropout, are the reference class you should look at. And maybe an experienced startup or VC person could point one to an even better reference class that wouldn't occur to me.
Perhaps related to https://www.lesswrong.com/tag/inside-outside-view , as kind of the opposite of the planning fallacy. If you don't look at averages, or if you deny that your context is more similar to the common case than the special case, you'll be massively overconfident. If you ONLY look at averages and don't consider how you can choose the environment or context (to some degree), you'll miss out on opportunities to improve.
I think the only real answer is in the specifics of strategy A (best in the common case, with common levels of ability and effort) and strategy B (best for at least some cases), to determine which is best for your abilities and needs.
[ note: I mentally replaced the word "average" with "median" or "common" in your post. Averages for non-symmetrical distributions can be very misleading, and basically should never be used for this kind of comparison. ]
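To illustrate the note above, here is a minimal sketch of how a single outlier can pull the mean far away from the typical result. The outcome numbers are made up purely for illustration; nothing here comes from real data.

```python
import statistics

# Hypothetical outcomes for ten people using the same strategy:
# nine modest results and one extreme success.
outcomes = [1, 1, 2, 2, 2, 3, 3, 3, 4, 1000]

mean_outcome = statistics.mean(outcomes)
median_outcome = statistics.median(outcomes)

# The mean is dominated by the single outlier; the median reflects
# what a typical person in this made-up sample experienced.
print(f"mean:   {mean_outcome}")
print(f"median: {median_outcome}")
```

In this toy sample the mean is roughly forty times the median, so comparing strategies by their means would mostly be comparing their outliers.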
I nominate “statisticians’ duck hunt”, after this joke:
Three statisticians go duck hunting. They see a duck and the first statistician shoots, hitting two feet to the left of the duck. The second statistician shoots, hitting two feet to the right of the duck. The third statistician leaps up in joy, yelling, "We got it!"
Seems like a form of the fallacy of composition. It might also be a category error: treating an aggregate statistic, which only partly describes the distribution of the whole, as if it carried meaningful information about individual elements.
I don't think there is a term, and don't think there needs to be one. If someone else disagrees with me that's fine, but situations where
1: you can consistently do far better than average by doing system B in a certain way
2: most people who use system B do worse
are so rare that they don't need a term. Unless you can think of several specific examples?
A specific example: how safe is it to use a condom? When you look at the statistics of pregnancies per user per year, it is important to understand that a person who says "uhm, I usually use condoms, but I kinda forget to put one on at 50% of occasions" is still classified as a condom user. So the safety for you, if you use one every time, is probably much better than the statistics suggest.
Another example: homeschooling. Seems to me there are essentially two types of homeschooling families: smart conscientious people who want to give their kids better education than the school sys...
Examples:
In general: Doing things the same way that worked in the past vs doing something different. Most mutations are deleterious, but doing things in the correct different way can have big benefits.
I see lots of situations where, let's say, Mike is aware that either system A or system B can be used to carry out a task.
Observational data shows that on average, system A outperforms system B, so seeing this, Mike decides to use system A.
But, the best ever result on the task was achieved with system B, and the conditions under which that was achieved could be easily replicated.
So really, Mike would be better off using system B and replicating those exact conditions. Mike could have cherry-picked and recreated the best scenario, but made a decision based on averages instead.
Is there a term for this? And if not, what should the term be?
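To make Mike's situation concrete, here is a minimal sketch with invented numbers. The scores, the `best_conditions` flag, and the helper `average_score` are all hypothetical; the point is only that the per-system average and the average under the replicable conditions can rank the two systems differently.

```python
from statistics import mean

# Hypothetical observations: (system, best_conditions, score).
# best_conditions marks runs that followed the replicable setup
# under which system B did unusually well.
observations = [
    ("A", False, 60), ("A", False, 62), ("A", False, 58), ("A", False, 60),
    ("B", False, 30), ("B", False, 35), ("B", False, 40),
    ("B", True, 90), ("B", True, 88),
]

def average_score(rows):
    """Average score over a list of (system, best_conditions, score) rows."""
    return mean(score for _, _, score in rows)

# What Mike looked at: the average per system, all conditions pooled.
avg_a = average_score([r for r in observations if r[0] == "A"])
avg_b = average_score([r for r in observations if r[0] == "B"])

# What he could have looked at: system B restricted to the replicable conditions.
avg_b_best = average_score([r for r in observations if r[0] == "B" and r[1]])

print(f"A overall: {avg_a}")
print(f"B overall: {avg_b}")
print(f"B under best conditions: {avg_b_best}")
```

With these made-up numbers, A beats B on the pooled average, yet B under the replicable conditions beats both. With only two such runs, a real decision would also depend on how reliably those conditions can be recreated.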