Automatic programming, an example

Thomas

1st Feb 2012

Say that we have the following observational data:

Planet	Aphelion (10^3 km)	Perihelion (10^3 km)	Orbit time (days)
Mercury	69,816	46,001	88
Venus	108,942	107,476	225
Earth	152,098	147,098	365
Mars	249,209	206,669	687
Jupiter	816,520	740,573	4,332
Saturn	1,513,325	1,353,572	10,760
Uranus	3,004,419	2,748,938	30,799
Neptune	4,553,946	4,452,940	60,190
Pluto	7,311,000	4,437,000	90,613

The columns give the maximal and the minimal distance between a planet and the Sun (both in thousands of kilometres) and the number of (Earth) days for one revolution around the Sun. The table above is empirical data only; it contains no algorithm binding the three quantities together. The rules of celestial mechanics that relate them go by the name of Kepler's laws. Can those rules be (re)invented by a computer program, and how?

The following program code will be put into a simulator:

//declarations of the integer type variables
$DECLAREINT bad perihelion aphelion orbit guess dif temp zero temp1
 
//table with the known data in a simulator friendly format
$INVAR perihelion(46001) aphelion(69816) orbit(88)
$INVAR perihelion(107476) aphelion(108942) orbit(225)
$INVAR perihelion(147098) aphelion(152098) orbit(365)
$INVAR perihelion(206669) aphelion(249209) orbit(687)
$INVAR perihelion(740573) aphelion(816520) orbit(4332)
$INVAR perihelion(1353572) aphelion(1513325) orbit(10760)
$INVAR perihelion(2748938) aphelion(3004419) orbit(30799)
$INVAR perihelion(4452940) aphelion(4553946) orbit(60190)
$INVAR perihelion(4437000) aphelion(7311000) orbit(90613)

// variables orbit and bad can't be touched by the simulator
//to avoid a degeneration to a triviality
$RESVAR orbit bad

//do NOT use the if clause or the while clause, do not set direct numbers ...
$RESCOM if while val_operation inc_dec

//bad is the variable by which the whole program will be judged
//a big value of bad is bad. By this criterion programs will be wiped out
//from their virtual existence. A kind of anti-fitness
$PENVAL bad

//do show the following variables when simulating
$SHOWVAR bad,orbit,guess,dif

//penalize any command with 0 (nothing) and every line by 1 point
$WEIGHTS commands=0 lines=1

//minimize the whole program to 20 lines or less
$MINIMIZE lines 20

$BES
 //the arena, where algorithms will be
//created and the fittest only will survive

$EES

//testing area, to which the simulator has no write access
//here bad (the penalized variable) is calculated
//the bigger the difference between the known orbit and the variable guess,
//the worse the evolved algorithm
dif=orbit-guess;
dif=abs(dif);
bad=dif;
temp=dif;
temp*=10000;
temp1=temp/orbit;
temp=temp1*temp1;
bad=bad+temp;
//end of the testing area
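Read as ordinary C, the testing area computes the absolute error in days plus the square of the relative error, the latter measured in units of 1/10000. A minimal sketch of that arithmetic as a standalone function; the name `penalty` and the use of `long long` are assumptions made here for illustration, not part of the Critticall input:

```c
#include <stdlib.h>

/* Mirrors the testing area: bad = |orbit - guess| + rel * rel,
   where rel is the relative error in units of 1/10000.
   Assumes orbit > 0; long long is an assumption to keep the
   intermediate products well inside the integer range. */
long long penalty(long long orbit, long long guess)
{
    long long dif = llabs(orbit - guess);
    long long bad = dif;
    long long temp1 = (dif * 10000) / orbit;  /* integer division truncates */
    return bad + temp1 * temp1;
}
```

For example, `penalty(365, 358)` evaluates to 36488: 7 days of absolute error plus 191 squared, since 70000/365 truncates to 191. The squared term dominates quickly, so a candidate that is far off in relative terms on any one planet is punished hard.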

After several hours, the following C code had evolved inside the $BES - $EES segment:

aphelion=perihelion+aphelion;
aphelion=aphelion+aphelion;
aphelion=aphelion+aphelion;
guess=12;
aphelion=aphelion>>guess;
temp=aphelion/guess;
aphelion=aphelion-temp;
dif=sqrt(aphelion);
aphelion=guess|aphelion;
aphelion=aphelion*dif;
aphelion=guess^aphelion;
guess=aphelion/guess;
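To try the evolved lines outside the simulator, they can be wrapped in an ordinary C function. This is only a sketch under plain C `int` semantics. The function name `evolved_guess` and the helper `isqrt` are not from the original, and the assumption that `sqrt` truncates to an integer may not match how Critticall evaluates it, so the results need not agree with the simulator's.

```c
/* Integer square root; stands in for sqrt() truncated to an int
   (an assumption about how the simulator treats sqrt). */
int isqrt(int n)
{
    int r = 0;
    while ((long long)(r + 1) * (r + 1) <= n)
        r++;
    return r;
}

/* The twelve evolved lines, unchanged apart from the sqrt stand-in.
   Variable names are kept as evolved, even where they no longer
   describe what the variable holds. */
int evolved_guess(int perihelion, int aphelion)
{
    int guess, temp, dif;

    aphelion = perihelion + aphelion;
    aphelion = aphelion + aphelion;
    aphelion = aphelion + aphelion;
    guess = 12;
    aphelion = aphelion >> guess;
    temp = aphelion / guess;
    aphelion = aphelion - temp;
    dif = isqrt(aphelion);
    aphelion = guess | aphelion;
    aphelion = aphelion * dif;
    aphelion = guess ^ aphelion;
    guess = aphelion / guess;
    return guess;
}
```

Under these assumptions, `evolved_guess(46001, 69816)` (Mercury) returns 89 and `evolved_guess(147098, 152098)` (Earth) returns 358.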

What does the simulator do? It bombards the arena segment with random C commands. Usually it then just notices a syntax error and reverts everything to the last working version. If everything is syntactically good, the simulator interprets the program and checks whether the mutated version causes any run-time error such as division by zero, a memory leak and so on. In the case of such an error, it returns to the last good version. Otherwise it checks whether the variable called "bad" is at least as small as it has ever been before. If it is, a new version has just been created and it is stored.
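The mutate, test, keep-or-revert cycle described above can be illustrated with a toy. The sketch below is not Critticall's implementation. It assumes a made-up four-register instruction set instead of real C commands (so there are no syntax errors to catch), sums the penalty over all nine rows, and caps the relative error to avoid overflow. All names in it are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

#define NREG 4             /* r0=perihelion, r1=aphelion, r2=guess, r3=temp */
#define NINS 8             /* fixed program length (arbitrary choice) */
#define NOPS 6
#define NROW 9
#define LIMIT 1000000000LL /* larger magnitudes count as a run-time error */

typedef struct { int op, dst, a, b; } Ins;   /* r[dst] = r[a] <op> r[b] */

/* perihelion, aphelion, orbit: the $INVAR rows from the post */
static const long long DATA[NROW][3] = {
    {46001, 69816, 88}, {107476, 108942, 225}, {147098, 152098, 365},
    {206669, 249209, 687}, {740573, 816520, 4332},
    {1353572, 1513325, 10760}, {2748938, 3004419, 30799},
    {4452940, 4553946, 60190}, {4437000, 7311000, 90613}
};

static unsigned long long rng_state = 12345;  /* fixed seed: repeatable runs */
static unsigned rnd(unsigned n)
{
    rng_state = rng_state * 6364136223846793005ULL + 1442695040888963407ULL;
    return (unsigned)(rng_state >> 33) % n;
}

/* Interpret a program on one row. Returns 0 on a run-time error. */
int run(const Ins *prog, const long long *row, long long *guess)
{
    long long r[NREG] = { row[0], row[1], 0, 0 };
    for (int i = 0; i < NINS; i++) {
        long long x = r[prog[i].a], y = r[prog[i].b], v;
        switch (prog[i].op) {
        case 0: v = x + y; break;
        case 1: v = x - y; break;
        case 2: v = x * y; break;
        case 3: if (y == 0) return 0; v = x / y; break;
        case 4: v = x / 2; break;
        default: v = x; break;                /* plain copy */
        }
        if (v > LIMIT || v < -LIMIT) return 0;
        r[prog[i].dst] = v;
    }
    *guess = r[2];
    return 1;
}

/* Sum of the post's penalty over all rows, or -1 on a run-time error. */
long long total_bad(const Ins *prog)
{
    long long sum = 0;
    for (int k = 0; k < NROW; k++) {
        long long g;
        if (!run(prog, DATA[k], &g)) return -1;
        long long dif = llabs(DATA[k][2] - g);
        long long rel = dif * 10000 / DATA[k][2];
        if (rel > 1000000) rel = 1000000;     /* cap so rel*rel cannot overflow */
        sum += dif + rel * rel;
    }
    return sum;
}

/* Start from a program of no-ops, so guess stays 0. */
void init_program(Ins *prog)
{
    for (int i = 0; i < NINS; i++) {
        prog[i].op = 5; prog[i].dst = 2; prog[i].a = 2; prog[i].b = 2;
    }
}

/* Mutate, test, keep or revert. Returns the final value of bad. */
long long evolve(Ins *best, long iterations)
{
    long long best_bad = total_bad(best);     /* start program must be valid */
    for (long i = 0; i < iterations; i++) {
        Ins trial[NINS];
        memcpy(trial, best, sizeof trial);
        Ins *m = &trial[rnd(NINS)];           /* overwrite one random line */
        m->op = (int)rnd(NOPS); m->dst = (int)rnd(NREG);
        m->a = (int)rnd(NREG);  m->b = (int)rnd(NREG);
        long long bad = total_bad(trial);
        if (bad < 0) continue;                /* run-time error: revert */
        if (bad <= best_bad) {                /* at least as good: store */
            memcpy(best, trial, sizeof trial);
            best_bad = bad;
        }
    }
    return best_bad;
}
```

Calling `init_program(p)` followed by `evolve(p, 20000)` on an `Ins p[NINS]` array returns a penalty that is never higher than the starting one, because a mutation is stored only when it is at least as good as the best so far. Accepting ties as well as improvements follows the "at least as small" rule from the text and lets the search drift across neutral changes.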

The evolutionary pressure works toward ever better code, which guesses the orbit time of the nine planets increasingly well. In this case the "orbit" variable has been placed under the $RESVAR clause and the "guess" variable has been tested against the "orbit" variable. Had there been no "$RESVAR orbit" statement, a simple "guess=orbit;" would have evolved quickly. Had there been no "$RESVAR bad" statement, a simple "bad=-1000000;" could have derailed the process.

Many thousands of algorithms are born and die every second on a standard Windows PC inside this simulator. A million or a billion generations later, the digital evolution is still running, even if an excellent solution has already been found.

And how good an approximation of the Kepler (Newton) celestial mechanics of the Solar system do we have here?

This good for the nine planets on which the code evolved:

Planet	Error %
Mercury	0.00
Venus	0.44
Earth	0.27
Mars	0.29
Jupiter	0.16
Saturn	0.65
Uranus	0.10
Neptune	0.79
Pluto	1.08

And this good for the control group of a comet and six asteroids:

Asteroid/Comet	Error %
Halley	1.05
Hebe	1.37
Astraea	1.99
Juno	3.19
Pallas	1.66
Vesta	2.49
Ceres	2.02

It could be much better still after another billion generations, and maybe with even more $INVAR examples. Generally, you can pick any three columns from any integer-type table you want and see in this way how they are related algorithmically. There can be more than three columns, too.

The name of the simulator (evoluator) is Critticall and it is available at http://www.critticall.com

36 Comments

wmorgan 14y

The generated code is bizarre. I refactored it as well as I could, and it still doesn't make much sense:

aphelion = (aphelion + perihelion) >> 10;
aphelion = aphelion - (aphelion / 12);
guess = ( ( (aphelion | 12) * (int)sqrt(aphelion) ) ^ 12 ) / 12;

"To get the orbit time in days from the aphelion and perihelion in Kkm, first sum them and divide by 1024. Then from that, subtract one twelfth. Then, to the value, perform a bitwise OR with 0x0C, multiply by the square root, and bit-XOR 0x0C again. Finally, divide by 12, and that will give you the number of days."


Thomas 14y

"divide by 1024"

Actually by 4096. And it is a rescaling as jimrandomh points out.
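The two figures appear to describe the same rescaling from different starting points. The shift in the evolved code is by 12 bits, which is a division by 4096, but it is applied after the sum has been doubled twice, so relative to perihelion+aphelion the net divisor is 1024. A small check, assuming non-negative `int` inputs that do not overflow when quadrupled; the function names are made up for illustration:

```c
/* Evolved form: double the sum twice, then shift right by 12 bits. */
int scaled_evolved(int sum)
{
    int x = sum + sum;
    x = x + x;
    return x >> 12;
}

/* Refactored form: a single shift right by 10 bits. */
int scaled_refactored(int sum)
{
    return sum >> 10;
}
```

For Mercury's sum of 115817 both functions return 113. The results are identical for any such input, because quadrupling only appends two zero bits that the extra two bits of shift remove again.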

jimrandomh 14y

The three problems with the code are that the variable names are all lies, there's a bunch of redundant rescaling which isn't consolidated because it's done in integer math when it should be floating point, and there are a couple bits of overfitting (bitwise operators) that don't belong. If you convert to SSA and wipe out the misleading names, you get:

a1 = perihelion+aphelion	Real
a2 = a1+a1	Rescaling
a3 = a2+a2	Rescaling
g1 = 12	Rescaling
a4 = a3>>g1	Rescaling
t1 = a4/g1	Rescaling
a5 = a4-t1	Rescaling
d1 = sqrt(a5)	Real
a6 = g1|a5	Overfit
a7 = a6*d1	Real
a8 = g1^a7	Overfit
guess = a8/g1	Rescaling

If you replace the overfitting with pass-throughs (a6=a5, a8=a7), then pretend it's floating point so that you can consolidate all the rescaling into a single multiplicative constant, you get

guess = k * (perihelion+aphelion)*sqrt(perihelion+aphelion)

Which is Kepler's third law.
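The consolidated formula can be sanity-checked against the table without fitting anything. If orbit = k * s * sqrt(s) with s = perihelion+aphelion, then orbit^2 / s^3 should come out as the same constant (k^2) for every row. A minimal sketch in C, assuming double arithmetic; the function name `kepler_ratio` is made up for illustration:

```c
/* orbit^2 / (perihelion+aphelion)^3. If orbit = k * s * sqrt(s)
   holds, this value is k^2 regardless of which body is plugged in. */
double kepler_ratio(double perihelion, double aphelion, double orbit)
{
    double s = perihelion + aphelion;
    return (orbit * orbit) / (s * s * s);
}
```

With the table's rows for Mercury (46001, 69816, 88) and Earth (147098, 152098, 365), the two ratios agree to within half a percent, even though the orbit times are rounded to whole days.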

Douglas_Knight 14y

The XOR with 12 won't do much after dividing by 12. For small radii, OR with 12 (in units of about 10^6km) will have an effect. These two constants are probably just overfitting. Indeed, it nails Mercury, the smallest and thus most vulnerable to these effects.* Rounding** the square root is also probably overfitting or just noise. It will have a larger effect, but smooth across planets, so it is probably canceled out by the choice of other numbers.

Ignoring those three effects, it is a constant times the 3/2 power of the average of the axes. The deviation from Kepler's law is that it should ignore perihelion.*** But for un-eccentric orbits, there's no difference. Since the training data isn't eccentric, this failure is unsurprising. That is, the code is unsurprising; that the code is so accurate is surprising. That it correctly calculates the orbital period of Halley's comet, rather than underestimating by a factor of 2^(3/2) is implausible.***

* The control group is too homogeneous. If it contained something close in, overfitting for Mercury would have been penalized in the final evaluation.

** Are you sure it's rounding? [Edit: Yes: bitwise operations are strongly suggestive.]

*** These statements are wrong because I confused aphelion with the semi-major axis. So removing the bitwise operations yields exactly Kepler's law. If you switch from ints to doubles it becomes more accurate. But wmorgan has a constant error: it is divide by 4096, not 1024. This should make the rounding errors pretty bad for Mercury. Maybe the bitwise operations are to fix this, if they aren't noise. My C compiler does not reproduce the claimed error percentages, so I'm not going to pursue this.
