A research team in China has created a system for answering verbal analogy questions of the type found on the GRE and IQ tests that scores a little above the average human score, perhaps corresponding to an IQ of around 105. This improves substantially on the reported SOTA in AI for these types of problems.
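For readers unfamiliar with how word embeddings handle analogies at all, here is a minimal sketch of the standard vector-offset approach ("a : b :: c : ?" answered by finding the candidate closest to vec(b) - vec(a) + vec(c)). This is the generic baseline technique, not necessarily the team's exact method, and the toy vectors are made up for illustration:

```python
import math

# Toy word vectors; the values are invented purely for illustration.
vectors = {
    "king":  [0.8, 0.9, 0.1],
    "man":   [0.7, 0.1, 0.0],
    "woman": [0.6, 0.1, 0.9],
    "queen": [0.7, 0.9, 1.0],
    "apple": [0.1, 0.2, 0.1],
}

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def solve_analogy(a, b, c, candidates):
    """a : b :: c : ?  ->  candidate closest to vec(b) - vec(a) + vec(c)."""
    target = [vb - va + vc
              for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])]
    return max(candidates, key=lambda w: cosine(vectors[w], target))

answer = solve_analogy("man", "king", "woman", ["queen", "apple"])
# -> "queen"
```

The offset trick works because, in well-trained embeddings, consistent semantic relations (gender, tense, capital-of) tend to correspond to roughly constant vector differences.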
This work builds on deep word-vector embeddings, which have driven large gains in translation and many other NLP tasks. One of their key improvements is learning multiple vectors per word, with the number of distinct word senses taken directly from a dictionary. This matters because verbal analogy questions often turn on rarer word meanings. They also employ modules specialized for the different question types.
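A minimal sketch of the multi-vector idea: store one embedding per dictionary sense of each word, and at lookup time pick the sense vector most similar to the surrounding context. Everything below (the sense counts, random vectors, and context-averaging rule) is an illustrative assumption, not the paper's actual implementation:

```python
import math
import random

random.seed(0)
DIM = 8

# Number of senses per word, as if read off from a dictionary.
# "bank" gets two senses (riverbank vs. financial institution).
sense_counts = {"bank": 2, "river": 1, "money": 1}

# One list of sense vectors per word: sense_counts[w] vectors of length DIM.
embeddings = {
    w: [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(k)]
    for w, k in sense_counts.items()
}

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def lookup(word, context_words):
    """Return the sense vector of `word` best matching the context."""
    # Average the (first-sense) vectors of the context words.
    ctx_vecs = [embeddings[w][0] for w in context_words if w in embeddings]
    ctx = [sum(col) / len(ctx_vecs) for col in zip(*ctx_vecs)]
    # Pick the sense of `word` most similar to that context average.
    return max(embeddings[word], key=lambda s: cosine(s, ctx))

vec = lookup("bank", ["river", "money"])
```

In a real system the sense vectors would be trained rather than random, but the lookup structure is the point: disambiguation happens by matching senses against context, which is exactly what rare-meaning analogy questions require.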
I vaguely remember reading that AI systems are already fairly strong at solving visual Raven's-matrices-style IQ questions, although I haven't looked into that in detail.
The multi-vector technique is probably the most important takeaway for future work.
Even if subsequent follow-up work reaches superhuman verbal IQ within a few years, this of course doesn't immediately imply AGI. These types of IQ tests measure specific abilities which are correlated with general intelligence in humans, but those specific abilities are only a small subset of the systems/abilities required for general intelligence, and probably rely on a smallish subset of the brain's circuitry.
"These types of IQ tests measure specific abilities which are correlated with general intelligence in humans, but these specific abilities are only a small subset of the systems/abilities required for general intelligence, and probably rely on a smallish subset of the brain's circuitry."
Sounds like you're tinting your statement with a society-of-mindish perspective... would you say that's fair? There are even stronger reasons to question the importance of such results. Rather than exercising some dedicated IQ-test circuitry in the brain, it could just as well be that, in humans, IQ scores reflect how well-tuned one's cognitive machinery is by some general measure, much as an athlete's hundred-meter sprint time reflects his overall fitness as well as the white-muscle development in his legs. He would not achieve a low sprint time were it not for the proper functioning of his very complex, variegated biology. That one can build a simple robot that achieves faster sprint times says very little about that robot's potential for gymnastics or martial arts. Performing well in the sprint simply does not require solving everything that evolution had to solve for human athletes to perform as well as they do.
Arithmetic ability would be another example of a metric that is nowhere near "AGI-complete," since it's solvable by a relatively straightforward procedure.
In that case, solving special "sub-tasks" of intelligence such as IQ puzzles would seem to me pretty uninformative. I'm reminded of the difference between Harry Foundalis's Phaeaco, which learns to represent visual structures on its own, and earlier work on Bongard problems that "cheated" by pre-encoding the images as logical objects.
I suppose my view on this is something of an inversion of what I see as the norm: others dismiss IQ tests as meaningless for human beings but find them significant metrics for machine intelligence.
I don't particularly endorse a society-of-mindish perspective (at least I don't think so - I'm only vaguely familiar with the term in relation to something Minsky wrote).
I mostly agree with your general points above.