I don't particularly endorse a society-of-mindish perspective (at least I don't think so - I'm only vaguely familiar with the term in relation to something Minsky wrote).
I mostly agree with your general points above.
Rather than exercising some specific dedicated IQ-test circuitry in the brain, it could just as well be that, in humans, IQ scores reflect how well-tuned your cognitive machinery is by some general measure, much as an athlete's hundred-meter sprint time reflects his overall fitness as well as fast-twitch (white) muscle development in his legs.
Yes, that seems pretty plausible. However, another related explanation is that IQ in humans relates to several key high-level tradeoffs in the space of niches in a tribe/economy. One such tradeoff is the neoteny tradeoff: how much to delay learning and development. In general you can achieve higher brain task performance (general intelligence) by delaying learning/development to get more training data (life experience), at the obvious expense of missing out on earlier mating opportunities. High-IQ humans of the type common on this site probably result from the combination of delayed development and high innate curiosity as a basic drive (traits which combine well). In this model, medium/average IQ corresponds to a genetic strategy favoring earlier maturation to quickly attain social status and mating opportunities.
I suppose my view on this is something of an inversion of what I see as the norm: others dismiss IQ tests as meaningless for human beings but find them significant metrics for machine intelligence.
Really - do you mean norm for society in general or norm for LW? I agree that IQ tests are meaningful for humans but less so for AI/AGI.
However, I also believe that this particular type of test measures something of value for AI, and this research does represent some amount of real progress (assuming the results are genuine and will be replicated). There are, however, probably better and more challenging types of question-answering tasks that more specifically test abilities important or hard for AGI yet easy for humans.
"Really - do you mean norm for society in general or norm for LW?" I meant the general norm for each position separately.
"I also do believe that this particular type of test measures something of value for AI"
Insofar as they showcase generally applicable methods, I would agree. Their use of deep learning seems encouraging, though I cannot tell from the abstract how domain-specific their methods are, and thus to what extent similar techniques could figure into an architecture for general intelligence. If the techniques used don't robustly generalise, then you'd have...
A research team in China has created a system for answering verbal analogy questions of the type found on the GRE and IQ tests that scores a little above the average human score, perhaps corresponding to an IQ of around 105 or so. This improves substantially on the reported SOTA in AI for these types of problems.
This work builds on deep word-vector embeddings, which have led to large gains in translation and many other NLP tasks. One of their key improvements involves learning multiple vectors per word, where the number of distinct word meanings is simply taken from a dictionary. This is important because verbal analogy questions often hinge on rarer word meanings. They also employ modules specialized for the different types of questions.
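To make the multi-vector idea concrete, here is a minimal sketch (not the paper's actual model; the words, vectors, and sense counts are hypothetical toy values). Each word maps to one vector per dictionary sense, and similarity between two words is taken over the best-matching pair of senses, so a rare meaning like the river sense of "bank" is not washed out by the dominant financial sense:

```python
import numpy as np

# Toy multi-sense embedding table: one vector per dictionary sense.
# All vectors here are made up purely for illustration.
EMB = {
    "bank":  [np.array([1.0, 0.0, 0.0]),   # sense 1: financial institution
              np.array([0.0, 1.0, 0.0])],  # sense 2: side of a river
    "money": [np.array([0.9, 0.1, 0.0])],
    "river": [np.array([0.1, 0.9, 0.0])],
}

def cos(u, v):
    """Cosine similarity between two vectors."""
    return float(u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))

def sense_similarity(w1, w2):
    """Similarity under the best-matching pair of word senses."""
    return max(cos(u, v) for u in EMB[w1] for v in EMB[w2])

# "bank" scores highly against both words, each via a different sense;
# a single averaged vector for "bank" would blur the two meanings.
print(sense_similarity("bank", "money"))  # high, via financial sense
print(sense_similarity("bank", "river"))  # high, via river sense
```

A single-vector model would have to place "bank" between its two meanings, hurting exactly the rare-sense cases that analogy questions exploit.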
I vaguely remember reading that AI systems are already fairly strong at solving visual Raven's-matrices-style IQ questions, although I haven't looked into that in detail.
The multi-vector technique is probably the most important takeaway for future work.
Even if subsequent follow-up work reaches superhuman verbal IQ in a few years, this of course doesn't immediately imply AGI. These types of IQ tests measure specific abilities which are correlated with general intelligence in humans, but those specific abilities are only a small subset of the systems/abilities required for general intelligence, and probably rely on a smallish subset of the brain's circuitry.