Using a high-powered black-box technique to regress a one-dimensional continuous outcome against a one-dimensional continuous predictor seems misguided.
If you want to characterize how well your evolutionary learning idea works, try it on data that you've generated, where you know the "underlying math". See if you can recover the program that generated the data or one that's equivalent to it. Or try it on really big, messy data where no one knows the right answer and see if you/it can do better than the obvious competitors like SVM, k-NN, CART, etc.
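The first suggestion can be sketched concretely. The snippet below is a minimal, hypothetical example (not the commenter's actual setup): generate data from a known power law with mild noise, then check whether a simple fit recovers the known exponent. Any learning method worth trusting should pass this kind of sanity check before being run on real data.

```python
import math
import random

# Generate data from a known law (Kepler's third law, T^2 = a^3,
# i.e. T = a**1.5) with small multiplicative noise, so the
# "underlying math" is known by construction.
random.seed(0)
a_vals = [random.uniform(0.5, 30.0) for _ in range(200)]
t_vals = [a ** 1.5 * math.exp(random.gauss(0, 0.01)) for a in a_vals]

# Ordinary least squares on (log a, log T): the slope is the
# recovered exponent, which should come out near 1.5.
xs = [math.log(a) for a in a_vals]
ys = [math.log(t) for t in t_vals]
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)

print(round(slope, 2))
```

Here plain log-log regression stands in for the evolutionary method; the point is only that the recovered program (exponent) can be compared against known ground truth.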
The middle ground of working on an easy/messy problem, where any sane method will give you an adequate answer but there's no known ground truth, is not going to make a very compelling story.
Yes, I think that was better, because the ground truth is Kepler's third law and jimrandomh pointed out your method actually recaptures a (badly obfuscated and possibly overfit) variant of it.
"Dimensionality" is totally relevant in any approach to supervised learning. But it matters even without considering the bias/variance trade-off, etc.
Imagine that you have a high-dimensional predictor, of which one dimension completely determines the outcome and the rest are noise. Your shortest possible generating algorithm is going to have to pick out the relevant dimension. So as the dimensionality of the predictor increases, the algorithm length will necessarily increase, just for information-theoretic reasons.
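The information-theoretic point can be made quantitative with a toy calculation (my illustration, not part of the original comment): just naming which of d dimensions is relevant costs about log2(d) bits, so the minimum program length grows with d even when the underlying relationship is unchanged.

```python
import math

# If exactly one of d predictor dimensions determines the outcome,
# any generating program must at least identify that dimension,
# which takes roughly ceil(log2(d)) bits.
for d in [2, 16, 256, 4096]:
    bits_to_name_dimension = math.ceil(math.log2(d))
    print(d, bits_to_name_dimension)
```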
How do you overfit Kepler's law?
edit: Retracted. Looking at the actual link, I see now that the result wasn't just obfuscated but wrong, and the ways in which it's wrong can of course overfit (which matches the results).