I think median human performance in all the areas you mention is basically determined by the amount of training received rather than by the raw intelligence of the median human.
1000 years ago the median human couldn't write or do arithmetic at all, but now they can because of widespread schooling and other cultural changes.
A better way of testing this hypothesis could be comparing the learning curves of humans and monkeys for a variety of tasks, to control for differences in training.
Here's one study I could find (after ~10m googling) comparing the learning performance of monkeys and different types of humans in the oddity problem (given a series of objects, find the odd one): https://link.springer.com/article/10.3758/BF03328221
If you look at Table 1, monkeys needed 1470 trials to learn the task, chimpanzees needed 1310, 4-to-6 yo human children needed 760, and the best humans needed 138. So it seems the gap between best and worst humans is comparable in size to the gap between worst humans and monkeys.
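To make that comparison explicit, here's the arithmetic on the Table 1 numbers above (a quick sketch; the grouping into "gaps" is mine):

```python
# Trials-to-criterion figures quoted above from Table 1 of the oddity-problem study.
trials = {
    "monkeys": 1470,
    "chimpanzees": 1310,
    "children_4_to_6": 760,
    "best_humans": 138,
}

# Gap between the worst humans (young children) and the best humans,
# vs. the gap between monkeys and the worst humans.
best_vs_worst_human = trials["children_4_to_6"] - trials["best_humans"]  # 622 trials
worst_human_vs_monkey = trials["monkeys"] - trials["children_4_to_6"]    # 710 trials

print(best_vs_worst_human, worst_human_vs_monkey)  # 622 710 -- roughly comparable gaps
```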
Usual caveats apply re: this is a single 1960s psychology paper.
I agree that training/learning/specialisation seem more likely to explain the massive gaps between the best and the mediocre than innate differences in general cognitive prowess. But I am also genuinely curious what the difference in raw g factor is between the median human and a +6 SD human, and between the median human and a -2 SD human.
I remember reading a Gwern post that surveys a lot of studies on human ability, and they show similar, if not stronger, support for my theory that human abilities have a very narrow range.
You are probably thinking of my mentions of Wechsler 1935: if you compare the extremes (defined as best/worst out of 1000, i.e. ±3 SD) of human capabilities (defined as broadly as possible, including e.g. running), where the capability has a cardinal scale, the absolute range is surprisingly often around 2-3x. There's no obvious reason that it should be 2-3x rather than 10x or 100x or lots of other numbers*, so it certainly seems like the human range is quite narrow and we are, from a big-picture view going from viruses to hypothetical galaxy-spanning superintelligences, stamped out from the same mold. (There is probably some sort of normality + evolution + mutation-load justification for this, but I continue to wait for someone to propose any quantitative argument which can explain why it's 2-3x.)
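To spell out the arithmetic behind that 2-3x figure (my own sketch, not anything from Wechsler): if a capability has mean mu and standard deviation sigma, and the extremes sit at roughly ±3 SD, then the best/worst ratio pins down the trait's coefficient of variation.

```python
# If the extremes sit at mu +/- 3*sigma, the best/worst ratio is
#   R = (mu + 3*sigma) / (mu - 3*sigma),
# which rearranges to a coefficient of variation of sigma/mu = (R - 1) / (3 * (R + 1)).
def cv_for_ratio(r, k=3.0):
    """Coefficient of variation implied by a best/worst ratio r at +/- k SD."""
    return (r - 1) / (k * (r + 1))

for r in (2, 3, 10, 100):
    print(r, round(cv_for_ratio(r), 3))
# 2 -> 0.111, 3 -> 0.167, 10 -> 0.273, 100 -> 0.327
```

In other words, the observed 2-3x range corresponds to a coefficient of variation of roughly 0.11-0.17; the unexplained part is why the spread relative to the mean lands there rather than somewhere else.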
You could also look at parts of cognitive tests which do allow absolute, not merely relative, measures, like vocabulary or digit span. If you look at, say, backwards digit span and note that most peopl...
There is a big circularity here:
I think the median human basically cannot:
- Invent general relativity pre 1920
- Solve the millennium prize problems
- Solve major open problems in physics/computer science/mathematics/other quantitative fields
These tasks are defined by being at or above the extreme top end of human ability! No matter how narrow the range of human ability was on any absolute scale, these statements would be true. They are useless for supporting the conclusion.
Upvoted, this is a valid objection.
I guess we could abstract away the specific relative difficulty and talk about stuff like "algorithms/pure mathematics research" and the ability of dumb humans, median humans, and the very best in the field to perform those tasks.
I still feel like dumb humans can't do those tasks at all, and median humans can contribute to those tasks but are (potentially much) closer to dumb humans than to peak humans in their contributions?
We tried to model a complex phenomenon using a single scalar, and this resulted in confusion and clouded intuition.
It's sort of useful for humans because of restriction of range, along with a lot of correlation that comes from looking only at human brain operations when talking about 'g' or IQ or whatever.
Trying to think in terms of a scalar 'intelligence' measure when dealing with non-human intelligences is not going to be very productive.
I somewhat disagree here. Yes, if we truly tried to create a scalar intelligence measure that was definable across the entirety of the mathematical multiverse, the No Free Lunch theorem would tell us this can't happen.
However, instrumental convergence exists, so general intelligence is achievable in practice.
From tailcalled here:
Specifically, there are common subtasks to real world tasks.
My intuitive sense is that the scale difference between the two gaps is often not like 2x or 3x but measured in orders of magnitude.
I think this cannot happen, due to physical limits that are very taut for human brains.
(AI will have limits too, but they're quite a bit slacker here, since an AI can run on more energy without being damaged.)
From Jacob Cannell's post on brain efficiency:
So true 8-bit equivalent analog multiplication requires about 100k carriers/switches and thus 10^-15 J/op using noisy subthreshold ~0.1eV per carrier, for a minimal energy consumption on order 0.1W to 1W for the brain's estimated 10^14 -10^15 synaptic ops/s. There is some room for uncertainty here, but not room for many OOM uncertainty. It does suggest that the wiring interconnect and synaptic computation energy costs are of nearly the same OOM. I take this as some evidence favoring the higher 10^15 op/s number, as computation energy use below that of interconnect requirements is cheap/free.
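Unpacking the arithmetic in that quote (a quick sketch; the 10^-15 J/op and 10^14-10^15 op/s figures are the quoted estimates, not mine):

```python
# Power implied by the quoted per-op energy and synaptic-op-rate estimates.
energy_per_op = 1e-15          # J per 8-bit-equivalent analog synaptic op (quoted estimate)
ops_per_second = (1e14, 1e15)  # brain's estimated synaptic ops/s (quoted range)

for ops in ops_per_second:
    watts = energy_per_op * ops  # J/op * op/s = W
    print(f"{ops:.0e} op/s -> {watts:.1f} W")
# 1e+14 op/s -> 0.1 W
# 1e+15 op/s -> 1.0 W
```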
Basically, orders-of-magnitude differences can't happen; at best about 1 OOM of improvement is possible.
And in practice, I think human intelligence has significantly narrower bands than this, to the point where I think 2x differences are crazily high, and anything beyond that is beyond the human distribution.
This is because intelligence is well approximated by a normal distribution, and with the population we have, the top of the distribution sits about 6.4 standard deviations above the average, which on an IQ-like scale is essentially about 2x the average.
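As a quick check of that tail figure (a sketch; the ~8 billion population and the IQ-style mean-100/SD-15 scale are assumptions for illustration):

```python
from scipy.stats import norm

population = 8e9                      # assumed current world population
z_top = norm.isf(1 / population)      # z-score of the single most extreme individual
iq_top = 100 + 15 * z_top             # on an IQ-style scale (mean 100, SD 15)

print(round(z_top, 2), round(iq_top))  # ~6.33 and ~195, i.e. roughly 2x the average of 100
```

This comes out slightly below the 6.4 figure above, but in the same ballpark.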
Thus I think Eliezer got this point much more right than most of his other points.
I think this cannot happen, due to physical limits that are very taut for human brains.
I was saying that e.g. the gap between Magnus Carlsen and the median human in chess ability is 10x - 1000x the gap between the median human and a dumb human.
I think this is just straightforwardly true and applies to many other fields.
...Basically, orders-of-magnitude differences can't happen; at best about 1 OOM of improvement is possible.
And in practice, I think human intelligence has significantly narrower bands than this, to the point where I think 2x differences are crazily high, and anything beyond that is beyond the human distribution.
I've had similar questions about this before, in terms of how human individual differences appear so great when the apparent differences in neurophysiology between +3 and -3 SD humans are so small. My current view on this is that:
a.) General 'peak' human cognition is pretty advanced, and the human brain is large even by current ML standards, so by the scaling laws we should be pretty good at general tasks compared to existing ML systems. This means that human intelligence is often pretty 'far out' compared to current ML, and that scaling ML systems much beyond humans is often expensive unless it is a super-specialised task where ML systems have a much lower constant factor due to a better-adapted architecture/algorithm. Specialised ML systems will still hit a scaling wall at some point, but it could be quite a way from peak human cognition.
b.) Most human variation is caused by deleterious mutations away from the 'peak', and because it is so much easier to destroy performance than to gain it, human performance basically ranges from 0 to the human peak. The higher up the human peak is, the larger this range will seem. The median human is a bad benchmark because the median human operates at substantially less than our true scaling-law potential. Because of this, scaling ML systems often lie within the range of human performance for a long time as they climb up to our peak level.
c.) In some sense the weird thing is that humans are so bad, rather than forming a tight normal distribution around peak performance. This presumably has to do with it being easier to mess up performance than to improve it. I wonder what the performance distribution of some SOTA ML architecture would look like if we randomly messed with its architecture and training (a rough sketch of such an experiment is below).
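Here's a minimal sketch of the kind of experiment I mean in (c): add Gaussian noise to a trained model's weights and look at the spread of resulting accuracies. `model` and `val_loader` are hypothetical placeholders and the noise scales are arbitrary, so this is illustrative rather than a claim about any particular SOTA system.

```python
import copy
import torch

def perturbed_accuracies(model, loader, noise_scale, device="cpu", n_samples=20):
    """Accuracy distribution after adding Gaussian noise scaled to each weight tensor's spread."""
    accuracies = []
    for _ in range(n_samples):
        noisy = copy.deepcopy(model).to(device)
        noisy.eval()
        with torch.no_grad():
            # Perturb every parameter tensor by noise proportional to its own spread.
            for p in noisy.parameters():
                scale = p.std() if p.numel() > 1 else p.abs().mean()
                p.add_(torch.randn_like(p) * scale * noise_scale)
            correct = total = 0
            for x, y in loader:
                preds = noisy(x.to(device)).argmax(dim=-1)
                correct += (preds == y.to(device)).sum().item()
                total += y.numel()
        accuracies.append(correct / total)
    return accuracies

# Hypothetical usage:
# for scale in (0.01, 0.1, 0.5):
#     print(scale, perturbed_accuracies(model, val_loader, scale))
```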
a.) General 'peak' human cognition is pretty advanced, and the human brain is large even by current ML standards, so by the scaling laws we should be pretty good at general tasks compared to existing ML systems. This means that human intelligence is often pretty 'far out' compared to current ML, and that scaling ML systems much beyond humans is often expensive unless it is a super-specialised task where ML systems have a much lower constant factor due to a better-adapted architecture/algorithm. Specialised ML systems will still hit a scaling wall at some point, but it could be quite a way from peak human cognition.
I think that this is correct with one caveat:
I think that ML and human brains will converge to the same or similar performance this century, and the big difference is that more energy can be added to an ML model pretty reliably, while humans don't enjoy this advantage.
Yes, definitely. Based on my own estimates of approximate brain scale, it is likely that the current largest ML projects (GPT-4) are already within an OOM or so of the brain's effective parameter count (± 1-2 OOM), and we will definitely have brain-scale ML systems being quite common within a decade and probably less -- hence short timelines. Strong agree that it is much easier to add compute/energy to ML models vs brains.
I've written up some of my preliminary thoughts and estimates here: https://www.beren.io/2022-08-06-The-scale-of-the-brain-vs-machine-learning/.
Jacob Cannell's post on brain efficiency https://www.lesswrong.com/posts/xwBuoE9p8GE7RAuhd/brain-efficiency-much-more-than-you-wanted-to-know is also very good.
I'll check your post out.
I've found Cannell's post very dense/hard to read the times I've attempted it. I guess there's a large inferential distance in some aspects, so lots of it goes over my head.
Epistemic Status
Discussion question.
Related Posts
Preamble: Why Does This Matter?
This question is important for building intuitions for thinking about takeoff dynamics; the breadth of the human cognitive spectrum (in an absolute sense) determines how long we have between AI that is capable enough to be economically impactful and AI that is capable enough to be existentially dangerous.
Our beliefs on this question factor into our beliefs on:
Introduction
The Yudkowsky-Bostrom intelligence chart often depicts the gap between a village idiot and Einstein as very minuscule (especially compared to the gap between the village idiot and a chimpanzee):
Challenge
However, this claim does not seem to me to track reality very well or to have been borne out empirically. It seems that for many cognitive tasks, the median practitioner is often much[1] closer to a beginner/completely unskilled/random noise than they are to the best in the world ("peak human"):
It may be the case that median practitioners being much closer to beginners than to the best in the world is the default/norm, rather than any sort of exception.
Furthermore, for some particular tasks (e.g. chess) peak human seems to be closer to optimal performance than to median human.
I also sometimes get the sense that for some activities, median humans are basically closer to an infant/chimpanzee/rock/ant (in that they cannot do the task at all) than they are to peak human. E.g. I think the median human basically cannot:
And the incapability is to the extent that they cannot usefully contribute to these problems[2].
Lacklustre Empirical Support From AI's History
I don't necessarily think the history of AI has provided empirical validation for the Yudkowsky intelligence spectrum. For many domains, it seems to take AI quite a long time (several years to decades) to go from parity with dumb humans to exceeding peak human (this was the case for Checkers, Chess, and Go, and it also looks like it will be the case for driving as well)[3].
I guess generative art might be one domain in which AI quickly went from subhuman to vastly superhuman.
Conclusions
The traditional intelligence spectrum seems to track reality better for many domains:
My intuitive sense is that the scale difference between the two gaps is often not like 2x or 3x but measured in orders of magnitude.
I.e. the gap between Magnus Carlsen and the median human in chess ability is 10x - 1000x the gap between the median human and a dumb human.
I wouldn't necessarily claim that the median human is closer to infants than to peak humans on those domains, but the claim doesn't seem obviously wrong to me either.
I'm looking at the time frame from the first artificial system reaching a certain minimal cognitive level in the domain until an artificial system becomes superhuman. So I do not consider AlphaZero/MuZero surpassing humans in however many hours of self-play to count as validation, given that the first Chess/Go systems to reach dumb-human level were decades prior.
Though perhaps the self-play leap in Chess/Go may be more relevant to forecasting how quickly transformative systems would cross the human frontier.