The first paper he mentions in the machine learning section can be found here, if you'd like to take a look: Murphy and Pazzani 1994. I had more trouble finding the others, which he mentions only briefly, so I relied on his summary for those.
As for the 'complexity of phenomena rather than theories' bit I was talking about, your reminder of Solomonoff induction has made me change my mind, and perhaps we can talk about 'complexity' when it comes to the phenomena themselves after all.
My initial mindset (reworded with Solomonoff induction in mind) was this: Given an algorithm (phenomenon) and the data it generates (observations), we are trying to come up with algorithms (theories) that create the same set of data. In that situation, Occam's Razor is saying "the shorter the algorithm you create which generates the data, the more likely it is to be the same as the original data-generating algorithm". So, as I said before, the theories are judged on their complexity. But the essay is saying, "Given a set of observations, there are many algorithms that could have originally generated it. Some algorithms are simpler than others, but nature does not necessarily choose the simplest algorithm that could generate those observations."
So then it would follow that when searching for a theory, the simplest ones will not always be the correct ones, since the observation-generating phenomenon was not chosen by nature to necessarily be the simplest phenomenon that could generate those observations. I think that may be what the essay is really getting at.
Someone please correct me if I'm wrong, but isn't the above only kinda valid when our observations are incomplete? Intuitively, it would seem to me that given the FULL set of possible observations from a phenomenon, if you believe any theory but the simplest one that generates all of them, surely you're making irrefutably unnecessary assumptions? The only reason you'd ever doubt the simplest theory is if you think there are extra observations you could make which would warrant extra assumptions and a more complex theory...
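To make that framing concrete, here is a minimal toy sketch of a Solomonoff-style weighting. It is not real Solomonoff induction, which ranges over all programs and is uncomputable. The three "theories", their names, and their description lengths in bits are made-up assumptions for illustration only. Each theory that reproduces the observations so far gets weight 2^-length, and the weights are then normalised.

```python
from fractions import Fraction

# Hypothetical toy "theories": each maps an index n to a predicted observation.
# The description lengths (in bits) are invented for illustration, not measured.
THEORIES = {
    "all_zeros": (3, lambda n: 0),
    "alternating": (4, lambda n: n % 2),
    "zeros_then_one_at_5": (8, lambda n: 1 if n == 5 else 0),
}

def posterior(observations):
    """Weight each theory consistent with the observations by 2**-length, then normalise."""
    weights = {}
    for name, (length, predict) in THEORIES.items():
        # Keep only theories that reproduce every observation seen so far.
        if all(predict(n) == obs for n, obs in enumerate(observations)):
            weights[name] = Fraction(1, 2 ** length)
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}

# With incomplete observations the simplest consistent theory dominates,
# but the more complex one keeps some weight.
print(posterior([0, 0, 0]))
# A further observation can rule the simplest theory out entirely.
print(posterior([0, 0, 0, 0, 0, 1]))
```

In this toy setup, after three zeros the shortest consistent theory holds most of the weight, yet a single later observation eliminates it. That is the incomplete-observations case described above: the simpler theory is favoured only until an extra observation warrants the extra assumptions.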
So then it would follow that when searching for a theory, the simplest ones will not always be the correct ones, since the observation-generating phenomenon was not chosen by nature to necessarily be the simplest phenomenon that could generate those observations. I think that may be what the essay is really getting at.
It might be a difference of starting points, then. We can either start with a universal approach, a broad prior, and use general heuristics like Occam's Razor, then move towards the specifics of a situation, or we can start with a narrow p...
This essay claims to refute a popularized understanding of Occam's Razor that I myself adhere to. It is confusing me, since I hold this belief at such a deep level that it's difficult for me to examine. Does anyone see any problems in its argument, or does it seem compelling? I specifically feel as though it might be summarizing the relevant machine learning research badly, but I'm not very familiar with the field. It also might be failing to give any credit to simplicity as a general heuristic when simplicity succeeds in a specific field, and it's unclear whether such credit would be justified. Finally, my intuition is that situations in nature with a steady bias towards growing complexity are more common than the author claims, and that such tendencies are stronger and last longer. However, I have no clear evidence to back up any of this, just vague notions that are difficult to examine. I'd appreciate someone else's perspective, as mine seems to be distorted.
Essay: http://bruce.edmonds.name/sinti/