Minds that make optimal use of small amounts of sensory data

SforSingularity

by SforSingularity

15th Aug 2009

In "That Alien Message", Eliezer made some pretty wild claims:

My moral - that even Einstein did not come within a million light-years of making efficient use of sensory data.

Riemann invented his geometries before Einstein had a use for them; the physics of our universe is not that complicated in an absolute sense. A Bayesian superintelligence, hooked up to a webcam, would invent General Relativity as a hypothesis - perhaps not the dominant hypothesis, compared to Newtonian mechanics, but still a hypothesis under direct consideration - by the time it had seen the third frame of a falling apple. It might guess it from the first frame, if it saw the statics of a bent blade of grass.

They never suspected a thing. They weren't very smart, you see, even before taking into account their slower rate of time. Their primitive equivalents of rationalists went around saying things like, "There's a bound to how much information you can extract from sensory data." And they never quite realized what it meant, that we were smarter than them, and thought faster.

In the comments, Will Pearson asked for "some form of proof of concept". It seems that researchers at Cornell - Schmidt and Lipson - have done exactly that. See their video on Guardian Science:

'Eureka machine' can discover laws of nature - The machine formulates laws by observing the world and detecting patterns in the vast quantities of data it has collected

Researchers at Cambridge and Aberystwyth have gone one step further and implemented an AI system/robot to perform scientific experiments:

Researchers at Aberystwyth University in Wales and England's University of Cambridge report in Science today that they designed Adam - they describe how the bot operates by relating how he carried out one of his tasks, in this case to find out more about the genetic makeup of baker's yeast Saccharomyces cerevisiae, an organism that scientists use to model more complex life systems. Using artificial intelligence, Adam hypothesized that certain genes in baker's yeast code for specific enzymes that catalyze biochemical reactions. The robot devised experiments to test these beliefs, ran the experiments, and interpreted the results.

The crucial question is: what can we learn about the likely effectiveness of a "superintelligent" AI from the behavior of these AI programs? First of all, let us be clear: this AI is *not* a "superintelligence", so we shouldn't expect it to perform at that level. The problem we face is analogous to the problem of extrapolating how fast an Olympic sprinter can run from looking at a baby crawling around on the floor. Furthermore, the Cornell machine was given a physical system that was specifically chosen to be easy to analyze, and a representation (equations) that is known to be suited to the problem.

We can certainly state that the program analyzed some data much faster than any human could have done. In a running time probably measured in hours or minutes, it took a huge stream of raw position and velocity data and found the underlying conserved quantities. And given likely algorithmic optimizations and another 10 years of Moore's law, we can safely say that in 10 years' time, that particular program will run in seconds on a $500 machine or milliseconds on a supercomputer. These results actually surprise me: an AI can automatically and instantly analyze a physical system (albeit a rigged one).
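To make "found the underlying conserved quantities" concrete, here is a minimal sketch of the scoring step only. It is not the Cornell team's algorithm: the candidate expressions are written by hand, the system is an idealized frictionless spring, and every name (`simulate_oscillator`, `CANDIDATES`, `relative_spread`, `best_invariant`) is hypothetical. It simply ranks candidates by how little they vary along the recorded trajectory.

```python
import math


def simulate_oscillator(k=4.0, m=1.0, x0=1.0, v0=0.0, dt=0.001, steps=5000):
    """Return (position, velocity) samples of a frictionless mass on a spring.

    Uses the exact closed-form solution, so the data has no integrator error.
    """
    omega = math.sqrt(k / m)
    samples = []
    for i in range(steps):
        t = i * dt
        x = x0 * math.cos(omega * t) + (v0 / omega) * math.sin(omega * t)
        v = -x0 * omega * math.sin(omega * t) + v0 * math.cos(omega * t)
        samples.append((x, v))
    return samples


# Hand-written candidate expressions (an assumption of this sketch:
# a real system would have to generate these itself).
CANDIDATES = {
    "x + v": lambda x, v: x + v,
    "x * v": lambda x, v: x * v,
    "v**2 + x**2": lambda x, v: v**2 + x**2,
    "v**2 + 4*x**2": lambda x, v: v**2 + 4 * x**2,
    "v**2 - 4*x**2": lambda x, v: v**2 - 4 * x**2,
}


def relative_spread(values):
    """Standard deviation of the values divided by their largest magnitude."""
    mean = sum(values) / len(values)
    variance = sum((u - mean) ** 2 for u in values) / len(values)
    scale = max(abs(u) for u in values) or 1.0
    return math.sqrt(variance) / scale


def best_invariant(samples, candidates):
    """Return the name of the least-varying candidate and all scores."""
    scores = {
        name: relative_spread([f(x, v) for x, v in samples])
        for name, f in candidates.items()
    }
    return min(scores, key=scores.get), scores


if __name__ == "__main__":
    best, scores = best_invariant(simulate_oscillator(), CANDIDATES)
    print("least-varying candidate:", best)
```

With the default parameters (k/m = 4), the least-varying candidate is `v**2 + 4*x**2`, which is proportional to the total energy of the spring. The hard part, generating good candidates from raw data, is exactly what this sketch leaves out.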

But, of course, one has to ask: how much more narrow-AI work would it take to actually look at video of some bouncing, falling and whirling objects and deduce a general physical law such as the earth's gravity and the laws governing air resistance, where the objects are not hand-picked to be easy to analyze? This is unclear. But I can see mechanisms whereby this would work, rather than merely having to submit to the overwhelming power of the word "superintelligence". My suspicion is that with current state-of-the-art object identification technology, video footage of a system of bouncing balls and pendulums and springs would be amenable to this kind of analysis. There may even be a research project in that proposition.

As far as extrapolating the behavior of a superintelligence from the behavior of the Cornell AI or the Adam robot, we should note that no human can look at a complex physical system for a few seconds and just write down the physical law or equation that it obeys. A simple narrow AI has already outperformed humans at one specific task, though it still cannot do most of what a scientist does. We should therefore update our beliefs to assign more weight to the hypothesis that on some particular narrow physical modelling task, a "superintelligence" would vastly outperform us. Personally I was surprised at what such a simple system can do, though with hindsight it is obvious: data from a physical system follows patterns, and statistics can identify those patterns. Science is not a magic ritual that only humans can perform; rather, it is a specific kind of algorithm, and we should expect there to be no special injunction against silicon minds doing it.

[-]SilasBarta17y200

I was skeptical about Eliezer_Yudkowsky's assertion then. I'm skeptical of the work of the project in the Guardian link. And I'm still skeptical.

"But what's there to be skeptical about? The results are there for you to see!"

Er, kind of. One way you can produce artificial results in this field is to give the machine 89 of the 90 bits of the right hypothesis, where those 89 bits are the ones humans are pretty much born with, and then act surprised that it finds the 90th.
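The arithmetic behind that illustration can be spelled out. This is a minimal sketch that takes the comment's figure of 90 bits at face value and treats each bit as halving the space of candidate hypotheses; the variable names are hypothetical.

```python
TOTAL_BITS = 90   # bits needed to single out the right hypothesis (the comment's figure)
GIVEN_BITS = 89   # bits supplied by the experimenters

# Each bit of information halves the set of hypotheses still in play.
full_space = 2 ** TOTAL_BITS
remaining_hypotheses = 2 ** (TOTAL_BITS - GIVEN_BITS)
reduction_factor = full_space // remaining_hypotheses

print("hypotheses before any help:", full_space)
print("hypotheses left for the machine:", remaining_hypotheses)
print("search space shrunk by a factor of:", reduction_factor)
```

On these assumptions the machine only has to choose between two remaining hypotheses, while the experimenters have already narrowed the field by a factor of 2**89.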

Two years ago, I saw a cool video on YouTube of a starfish robot that models itself and figures out how to move, supposedly an example of a self-aware machine that learns how to walk. Now, the machine is very impressive -- it actually looks alive.

But the reality is less interesting. It turns out that the builders fed it almost all of the correct model of itself, and all the robot had to do was solve for a few remaining parameters, then try some techniques heavily biased toward what would succeed. Interesting work (it's still in my YT favorites), but far from machine self-awareness and discovery of novel modes of locomotion.

I hope you can see where this is going: when you go to the link at the end of the Guardian video, yep, it's the same group.

The Eureka machine is, in a way, an example of the artificial results I described above. Notice how much cognitive labor the Cornell team does for the machine. First, they recognize that the huge amount of raw visual data can be concisely, losslessly compressed into a few variables. In other words, even given all the parts of the visual field that move, they have recognized how many of those degrees of freedom are constrained, and so don't need to be included in a variable list that fully describes what's going on.

Second, they picked a system with heavy components and a short enough duration that you don't have to worry about energy loss due to aerodynamic drag. Such terms were not in the equations the machine discovered, which would have really put a crimp on its ability to find conservation laws. Remember, a reason it took so long for natural philosophers to notice the laws of motion is because air complicates things. You don't get to see regularity until you can focus on celestial bodies, dense/small objects, and vacuums -- which are a difficult engineering problem to create in a lab with pre-Scientific Revolution technology.
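A small numerical sketch shows why drag matters for a search over conserved quantities. It assumes a mass on a spring with a drag force linear in velocity (real aerodynamic drag is often closer to quadratic in speed), and the function name and parameters are hypothetical. It compares mechanical energy at the start and end of a short run.

```python
def energy_drift(drag, k=4.0, m=1.0, x=1.0, v=0.0, dt=1e-4, steps=50000):
    """Integrate m*x'' = -k*x - drag*v and return the relative change
    in mechanical energy (kinetic plus spring potential) over the run."""

    def energy(pos, vel):
        return 0.5 * m * vel * vel + 0.5 * k * pos * pos

    e_start = energy(x, v)
    for _ in range(steps):
        a = (-k * x - drag * v) / m
        v += a * dt   # semi-implicit Euler: update velocity first,
        x += v * dt   # then position with the new velocity
    return (energy(x, v) - e_start) / e_start


if __name__ == "__main__":
    print("no drag:  ", energy_drift(0.0))
    print("with drag:", energy_drift(0.5))
```

Without drag the mechanical energy stays constant up to small integration error, so it shows up as an invariant of the recorded variables. With drag most of it leaks away over the same run, and no function of position and velocity alone stays constant unless the heat given to the air is also measured.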

Third, they told it to look for invariants (conservation laws). Now, that's actually fair, because it's a rule you could feed a general-use AI. However, pick an average situation in your life. How hard is it to notice the invariants? Normally, that heuristic is not very good (unless you already know what to look for), but they gave it this heuristic in a situation pre-selected for its usefulness.

Remember, noticing the right hypothesis is half the battle. Once you've done enough to even bring the hypothesis to your attention, most of the cognitive labor is done.

This is impressive work, but, well, let's not get ahead of ourselves.

[-]Eliezer Yudkowsky17y80

I agree with Silas Barta that the data cited is not support for what I said a Bayesian superintelligence could do. This is 5% intelligence and 95% rigged demo. A lot of AI work is like that.

[-]MichaelVassar17y30

Not support, or just not very much support? Surely Univac's superiority over humans at arithmetic and the strength of a tractor are some support.

[-]SforSingularity17y40

First, they recognize that the huge amount of raw visual data can be concisely, losslessly compressed into a few variables. In other words, even given all the parts of the visual field that move, they have recognized how many of those degrees of freedom are constrained, and so don't need to be included in a variable list that fully describes what's going on.

I said:

"Note that the data available to the system is the actual position and velocity measurements of the objects, rather than a video from a video camera, which would provide strictly more information, but be harder to process."

Second, they picked a system with heavy components and a short enough duration that you don't have to worry about energy loss due to aerodynamic drag. Such terms were not in the equations the machine discovered, which would have really put a crimp on its ability to find conservation laws.

Just introduce a term into the Hamiltonian for energy in the temperature and velocity of the air. Air resistance would make the problem harder, but I hereby predict that the Cornell team would be able to get their machine to work with significant air resistance. Will you email them this as a challenge?

Also, what do you say about the Cambridge/Aberystwyth group?

This is impressive work, but, well, let's not get ahead of ourselves.

The point is not that a narrow AI like Adam or the Cornell machine would deduce GR from the statics of a blade of grass. The point is that if Adam or the Cornell machine can do simple stuff orders of magnitude better than humans can, then my estimate of the probability that a "Superintelligence" would be able to do hard stuff like coming up with GR as a hypothesis and noticing that it is consistent with the motion of an apple in less than a second should go up.

[-]SilasBarta17y50

"Note that the data available to the system is the actual position and velocity measurements of the objects, rather than a video from a video camera, which would provide strictly more information, but be harder to process."

Yes, I was pointing out the significance of this pre-processing, not trying to imply you didn't mention it. "Would be harder to process" means they did most of the hard part before turning it over to the machine.

Just introduce a term into the Hamiltonian for energy in the temperature and velocity of the air. Air resistance would make the problem harder

"Just"? I'm not sure you know what that word means ;-) The air functions as a thermodynamic reservoir; you need precise equipment just to notice the change in air velocity and temperature, and even then, you're falling prey to exactly the criticism I made in my original comment. Simply recognizing that temperature is relevant is itself difficult cognitive labor that you do for the machine. It can't be evidence of the machine's inferential capabilities except insofar as it has to account for one more variable.

And the more precise you have to be to notice this relevancy, the more cognitive labor you're doing for the machine.

but I hereby predict that the Cornell team would be able to get their machine to work with significant air resistance. Will you email them this as a challenge?

First, they're going to ignore a nobody like me. But yes, I will stick my neck out on this one. If the same measurement equipment is used, the same variables recorded, and the same huge prior given to "look for invariants", I claim their method will choke (to be precisely defined later).

Okay, maybe that's not what you meant. You meant that if you're going to do even more of the cognitive labor for the machine by adding on equipment that notices the variables necessary to make conservation-of-energy approaches work, then it can still find the invariant and discover the equation of motion.

But my point is, when you, the human, focus the machine's "attention" on precisely those observations that help the machine compress its description of its data, it's not the machine doing the cognitive labor; it's you.

Also, what do you say about the Cambridge/Aberystwyth group?

Short answer: ditto.

Long answer: I think the biological sciences have been poor about expressing their results in a form that is conducive to the kind of regularity detection that machines like the Eureka machine do.

The point is that if Adam or the Cornell machine can do simple stuff orders of magnitude better than humans can,

And my point is that it flat out didn't once you consider that the makers bypassed everything that humans had to do when discovering these laws and gave it as a neat package to the algorithm.

then my estimate of the probability that a "Superintelligence" would be able to do hard stuff like coming up with GR as a hypothesis and noticing that it is consistent with the motion of an apple in less than a second should go up.

Given enough processing speed, sure. But the test for intelligence would normalize for elementary processing operations. That is, the machine is more intelligent if it didn't have to unnecessarily sweep through billions of longer hypotheses to get to the right one.

But hold on: if you truly do start from an untainted Occamian prior, you have to rule out many universes before you get to this one. In short, we don't actually want truly general intelligence. Rather, we want intelligence with a strong prior tilted toward the working of this universe.
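The cost of sweeping through shorter hypotheses first can be illustrated with a toy enumeration. This is only a sketch: bit strings stand in for hypotheses, shortest-first ordering stands in for an Occamian prior, and all names are hypothetical. A real simplicity prior over programs is far richer than this.

```python
from itertools import product


def enumerate_by_length(max_len):
    """Yield every non-empty bit string up to max_len, shortest first."""
    for n in range(1, max_len + 1):
        for bits in product("01", repeat=n):
            yield "".join(bits)


def hypotheses_tried_before(target):
    """Count the candidates examined before reaching the target string."""
    for count, hypothesis in enumerate(enumerate_by_length(len(target))):
        if hypothesis == target:
            return count
    raise ValueError("target must be a non-empty string of 0s and 1s")


if __name__ == "__main__":
    target = "1011"
    print("tried before target, no prior knowledge:",
          hypotheses_tried_before(target))
    # A prior that already fixes the first two bits leaves only the
    # two-bit suffix to be searched.
    print("tried before target, first two bits known:",
          hypotheses_tried_before(target[2:]))
```

The number of candidates ahead of a target grows roughly as 2 to the power of its length, so a prior that fixes even a few leading bits removes most of the sweep. That is the sense in which a strong prior tilted toward this universe saves work.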

[-]SforSingularity17y20

And my point is that it flat out didn't once you consider that the makers bypassed everything that humans had to do when discovering these laws and gave it as a neat package to the algorithm.

But it did do something faster than a human could have done. I don't claim that it invented physics: I claim that it quickly discovered the conserved quantities for a particular system, albeit a system that was chosen in advance to be easy. But if I gave you the raw data that it had, and asked you to write down a conserved quantity by hand, you would take years.

[-]SilasBarta17y10

But it did do something faster than a human could have done.

That's enough to get a medal these days? ;-)

I don't claim that it invented physics: I claim that it quickly discovered the conserved quantities for a particular system, albeit a system that was chosen in advance to be easy. But if I gave you the raw data that it had, and asked you to write down a conserved quantity by hand, you would take years.

Okay, sure, but as long as we're comparing feats from that baseline:

-Did the machine self-replicate?

-Did it defend itself against environmental threats?

-Did it find its own energy source?

-Did it persuade humans to grant it research funding?

Lest I be accused of being an AI goalpost mover, my point is just this: we don't all live by our own strength. Everyone, and every machine, can do at least some narrow task very well. The problem is when you equate that narrow task with the intelligence that was necessary to get to that narrow task.

[-][anonymous]17y00

Will you email them this as a challenge? First, they're going to ignore a nobody like me. But yes, I will stick my neck out on this one. If the same measurement equipment is used, the same variables recorded, and the same huge prior given to "look for invariants", I claim their method will choke (to be precisely defined later).

So you have emailed them?

[-]SilasBarta17y00

... no?

[-][anonymous]17y00

Ok, I will.

[-][anonymous]17y-10

ok. Perhaps better not to.

[-]SilasBarta17y00

>:-(

[-]SforSingularity17y00

But hold on: if you truly do start from an untainted Occamian prior, you have to rule out many universes before you get to this one. In short, we don't actually want truly general intelligence. Rather, we want intelligence with a strong prior tilted toward the working of this universe.

Sure, we want to bias the machine quite strongly towards hypotheses that we believe. This would make the job of the SI easier.

[-]SilasBarta17y10

Very true -- but only if you can find a way to represent your knowledge in a way conducive to the SI's Bayesian updating. At that point, however, you run into the problem of telling your SI knowledge that it couldn't generate for itself.

Let's say it found it had a high prior on the equations the Cornell team derived. But, for some reason, those equations seemed to be inapplicable to most featherless bipeds. Or even feathered bipeds! So, it wants to go back and identify the data that would have amplified the odds it assigned to those equations. Would it know to seek out heavy, double-pinned devices and track the linkages' x and y positions?

Would it know when the equations even apply? Or would the prior just unnecessarily taint any future inferences about phenomena too many levels above Newtonian mechanics (e.g. social psychology)?

[-]SforSingularity17y00

Would it know when the equations even apply? Or would the prior just unnecessarily taint any future inferences about phenomena too many levels above Newtonian mechanics (e.g. social psychology)?

Good point. That's why you don't want to go overboard with priors. However, even human psychology has underlying statistical laws governing it.

[-]Douglas_Knight17y20

Remember, a reason it took so long for natural philosophers to notice the laws of motion is because air complicates things. You don't get to see regularity until you can focus on celestial bodies, dense/small objects, and vacuums -- which are a difficult engineering problem to create in a lab with pre-Scientific Revolution technology.

Vacuums and telescopes are Renaissance tech, it's true. Wikipedia tells me that the first laboratory vacuum was built in the year after Galileo's death, so I think we can rule out the relevance of vacuums. (Galileo did say that things would be better in a vacuum.)

But dense objects are cheap! Maybe Galileo had better clocks than Archimedes, but given the Antikythera mechanism, we just don't know. Timing objects rolling down an inclined plane is easy. Racing two objects of different weights doesn't even require a clock.

The main question is whether the telescope affected Galileo's earthbound work.

I'm also unclear on his contribution. It may have been to combine simple physical laws with mathematics to produce conclusions. In particular, he seems to have been the first to say that projectiles travel in parabolas, which he deduced from gravity being constant acceleration. Other people (Avicenna, Biruni) may have said that gravity was acceleration, but I think it's hard to tell what they meant because they didn't draw clear conclusions from it.
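The deduction from constant acceleration to parabolic paths can be reconstructed in modern notation. This is a sketch that assumes no air resistance and a constant downward acceleration; the symbols $v_x$, $v_y$ (initial horizontal and vertical velocity) and $g$ are introduced here for illustration and do not come from Galileo's own presentation.

```latex
\begin{align}
x(t) &= v_x t, \\
y(t) &= v_y t - \tfrac{1}{2} g t^2 .
\end{align}
Eliminating $t = x / v_x$ (for $v_x \neq 0$) gives
\begin{equation}
y = \frac{v_y}{v_x}\, x - \frac{g}{2 v_x^2}\, x^2 ,
\end{equation}
which is quadratic in $x$, so the trajectory is a parabola.
```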

[-]SilasBarta17y00

Vacuums and telescopes are Renaissance tech, it's true. Wikipedia tells me that the first laboratory vacuum was built in the year after Galileo's death, so I think we can rule out the relevance of vacuums. (Galileo did say that things would be better in a vacuum.)

Just to clarify, I only meant that vacuums were difficult to create in a lab with pre-Scientific Revolution tech, not that it was hard to create dense objects back then. Gold coins, anyone?

The broader point was that it takes a lot of cognitive labor simply to recognize that "hey, this would be much simpler to describe in a vacuum". And so testing an inference program on a system unlikely to be observed on average, but set up to lack the usual complexities, is "cheating", in a sense, because you save it the problem of recognizing these difficulties and abstracting away from them.

[-]Douglas_Knight17y20

The broader point was that it takes a lot of cognitive labor simply to recognize that "hey, this would be much simpler to describe in a vacuum".

It seems that it should be easy to produce physical laws or mathematical formulae describing dense bodies, without figuring out that they are universal laws whose domain of application is limited by the complication of air, but the historical progression was the opposite.

People talked about vacuum for thousands of years. Avicenna definitely said that things would be simpler in a vacuum. Wikipedia quotes Biruni saying that Aristotle said that the heavens are simpler because they are a vacuum.

Not that this has anything to do with superintelligences, but it suggests that we've forgotten what hard steps we've already done.

Why did no one before Galileo notice that the pendulum is cool?

I don't understand how the Caliphate produced Snell's law and good measurements (e.g., the concern about whether measurement error is biased) without producing a contribution to precise earthbound dynamics that I've heard of.

[-]MichaelAnissimov17y00

Would it be alright if I published this as a guest post on my blog, Accelerating Future? Please email me at michaelanissimov (at) gmail.com with your name if you are interested. Thanks.

[-]CronoDAS17y00

I linked to discussion about that experiment before.

[-]SilasBarta17y10

How about a link to that discussion, or your previous linking, just for old times' sake? :-P

[-]CronoDAS17y30

Here it is.

http://lesswrong.com/lw/7b/robot_scientists_can_think_for_themselves/
