I'm not really in the field, but I am vaguely familiar with the literature and this isn't how it works (though you might get that impression from reading LW).
A vision algorithm might face the following problem: nature draws an underlying physical scene and an image from some joint distribution. The algorithm looks at the image and must infer something about the scene. Computing posteriors here means integrating over a huge space of scenes, which is generally completely intractable and so requires some algorithmic insight. For example, to estimate the probability that there is an apple on the table, you would need to integrate over the astronomically many possible scenes in which there is an apple on the table.
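To make the shape of the problem concrete, here's a toy sketch in Python. Everything here (the three-slot "scene," the redness likelihood, the uniform prior) is invented for illustration; the point is just that the posterior requires a sum over every scene, which is 27 terms here but astronomically many in any realistic model:

```python
import itertools
import math

# Toy "inverse vision" problem: a scene assigns each of 3 slots an object,
# and the observed "image" is a noisy count of red pixels. All numbers and
# distributions here are made up purely for illustration.
OBJECTS = ["empty", "apple", "cup"]

def prior(scene):
    # Uniform prior over the 3**3 = 27 possible scenes.
    return 1.0 / 27

def likelihood(image_redness, scene):
    # Apples contribute redness; a crude unnormalized Gaussian likelihood.
    expected = sum(1 for obj in scene if obj == "apple")
    return math.exp(-(image_redness - expected) ** 2)

def p_apple_given_image(image_redness):
    # Posterior probability that at least one apple is present.
    # Note the loop: we must sum over EVERY possible scene, both in the
    # numerator (scenes with an apple) and the normalizer.
    num = den = 0.0
    for scene in itertools.product(OBJECTS, repeat=3):
        joint = prior(scene) * likelihood(image_redness, scene)
        den += joint
        if "apple" in scene:
            num += joint
    return num / den
```

With a continuous, high-dimensional scene space the loop becomes an integral with no closed form, which is where the algorithmic insight (approximation, amortization, clever priors) has to come in.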
I don't know if this contradicts you, but this is a problem that biological brain/eye systems have to solve ("inverse optics"), and Steven Pinker has an excellent discussion of it from a Bayesian perspective in his book How the Mind Works. He mentions that the brain relies heavily on priors that match our environment, which significantly narrows down the possible scenes that could "explain" a given retinal image pair. (You get optical illusions when a scene violates these assumptions.)
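A minimal sketch of how such a prior disambiguates an image, in the spirit of the "light comes from above" assumption that shows up in the inverse-optics literature. The specific hypotheses and probabilities are invented for illustration: two scene readings explain the same shading equally well, and the prior over lighting direction does all the work.

```python
# Two scene hypotheses that explain the same shading pattern equally well:
# a bump lit from above, or a dent lit from below. Likelihoods are equal
# by construction; the numbers are made up for illustration.
hypotheses = {
    ("bump", "light_above"): 1.0,  # likelihood of the observed shading
    ("dent", "light_below"): 1.0,  # same shading, same likelihood
}

# The prior breaks the tie: overhead lighting is far more common.
lighting_prior = {"light_above": 0.95, "light_below": 0.05}

def posterior(hyps, prior):
    # Score each shape hypothesis by likelihood * prior, then normalize.
    scores = {shape: lik * prior[light] for (shape, light), lik in hyps.items()}
    z = sum(scores.values())
    return {shape: s / z for shape, s in scores.items()}

print(posterior(hypotheses, lighting_prior))
```

Flip the lighting prior (i.e., violate the assumption with actual bottom lighting) and the "bump" percept becomes the illusion, which is the phenomenon Pinker describes.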
I searched the posts but didn't find much relevant information. Has anyone taken a serious crack at Legg's work, and would they be willing to share their thoughts? Is the material worthwhile? Are there dubious portions, or sections one might want to skip (either because the ideas are bad or to save time)? I'm considering investing a chunk of time in investigating it, so any feedback would be much appreciated, and it seems likely that others would like some perspective on it as well.