Perfect descriptions of reality are unattainable, unnecessary, and too costly for learning organisms, including humans. But workable descriptions are indispensable. So knowledge systems, like maps, are a complex blend of realism, flexibility, usefulness, and inspiration.
-- David Christian, Maps of Time: An Introduction to Big History
Um, nope. What it would really take to change my mind about Bayes is seeing a refutation of Dutch Book and Cox's Theorem and von Neumann-Morgenstern and the complete class theorem, combined with seeing some alternative epistemology (e.g. Dempster-Shafer) not turn out to completely blow up when subjected to the same kind of scrutiny as Bayesianism (the way DS brackets almost immediately go to [0, 1], and fuzzy logic turned out to be useless, etc.).
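For readers who have not seen the Dutch Book argument, here is a minimal sketch of what it rules out. The function name and the 0.6/0.6 prices are made up for illustration. It assumes an agent that will buy a $1 ticket on A and a $1 ticket on not-A at whatever probabilities it states.

```python
def agent_net_payoff(price_a: float, price_not_a: float, a_occurs: bool) -> float:
    """Net payoff to an agent that buys a $1 ticket on A and a $1 ticket on
    not-A, paying its own stated probabilities as prices (hypothetical setup)."""
    cost = price_a + price_not_a
    # Exactly one of the two tickets pays out $1, whichever way A turns out.
    payout = (1.0 if a_occurs else 0.0) + (0.0 if a_occurs else 1.0)
    return payout - cost


# Incoherent beliefs: P(A) = 0.6 and P(not A) = 0.6 sum to more than 1,
# so the agent loses about $0.20 in every possible world.
for outcome in (True, False):
    print("incoherent:", outcome, agent_net_payoff(0.6, 0.6, outcome))

# Coherent beliefs: P(A) = 0.6 and P(not A) = 0.4 break even in both worlds.
for outcome in (True, False):
    print("coherent:  ", outcome, agent_net_payoff(0.6, 0.4, outcome))
```

The loss does not depend on which outcome occurs, which is the sense in which prices that violate the probability axioms can be exploited for a guaranteed loss.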
Neural nets have been useful for solving a large number of problems. It doesn't make them good epistemology. It doesn't make them a plausible candidate for "Yes, this is how you need to organize your thinking about your AI's thinking and if you don't your AI will explode".
I am afraid that your demonstration was not stated sufficiently precisely for me to criticize. This seems like the sort of thing for which there ought to be a standard reference, if there were such a thing as a well-known problem which Bayesian epistemology could not handle. For example, we have well-known critiques and literature claiming that nonconglomerability is a problem for Bayesianism, and we have a chapter of Jaynes which neatly shows that they all arise from misuse of limits on infinite problems. Is there a corresponding literature for your alleged reductio of Bayesianism which I can consult? Now, I am a great believer in civilizational inadequacy and the fact that the incompetence of academia is increasing, so perhaps if this problem was recently invented there is no more literature about it. I don't want to be a hypocrite about the fact that sometimes something is true and nobody has written it up anyway; heaven knows that's true all the time in my world. But the fact remains that I am accustomed to somewhat more detailed math when it comes to providing an alleged reductio of the standard edifice of decision theory. I know your time is limited, but I really do need more detail before I can think that I've seen a criticism and be convinced that no response to that criticism exists. Should your flat assertion that Bayesian methods can't handle something, and that they fall flat so badly as to constitute a critique of Bayesian epistemology, be something that I find convincing?
Okay. Though I note that you haven't actually said that my intuitions (and/or my reading of Wikipedia) were wrong; many NP-hard problems will be easy to solve for a randomly generated case.
Anyway, suppose a standard L1-penalty algorithm solves a random case of this problem. Why do you think that's a reductio of Bayesian epistemology? Because the randomly generated weights mean that a Bayesian viewpoint says the credibility is going as the L2 norm on the non-zero weights, but we used an L1 algorithm to find which weights were non-zero? I am unable to parse this into the justifications I am accustomed to hearing for rejecting an epistemology. It seems like you're saying that one algorithm is more effective than another at finding the maximum of a Bayesian probability landscape, in a case where we both agree that the unbounded form of the Bayesian algorithm would work.
What destroys an epistemology's credibility is a case where even in the limit of unbounded computing power and well-calibrated prior knowledge, a set of rules just returns the wrong answer. The inherent subjectivity of p-values as described in http://lesswrong.com/lw/1gc/frequentist_statistics_are_frequently_subjective/ is not something you can make go away with a better-calibrated prior, correct use of limits, or unlimited computing power; it's the result of bad epistemology. This is the kind of smoking gun it would take to make me stop yammering about probability theory and Bayes's rule. Showing me algorithms which don't on the surface seem Bayesian but find good points on a Bayesian fitness landscape isn't going to cut it!
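To make the p-value point concrete, here is a minimal sketch using the standard textbook stopping-rule example. The example is an assumption here and is not taken from the linked post: a coin comes up 9 heads and 3 tails, and the one-sided p-value against a fair coin depends on whether the experimenter planned to flip 12 times or to flip until the third tail. The function names are made up for illustration.

```python
from math import comb


def p_value_fixed_n(heads: int, n: int) -> float:
    """One-sided p-value under a fair coin when the number of flips n was
    fixed in advance: P(at least `heads` heads in n flips)."""
    return sum(comb(n, k) for k in range(heads, n + 1)) / 2 ** n


def p_value_stop_at_tails(heads: int, tails: int) -> float:
    """One-sided p-value under a fair coin when the plan was to flip until
    the `tails`-th tail: P(at least `heads` heads before that tail).
    This equals P(at most tails - 1 tails in the first heads + tails - 1 flips)."""
    m = heads + tails - 1
    return sum(comb(m, k) for k in range(tails)) / 2 ** m


# Same data (9 heads, 3 tails), two different intentions about when to stop.
print(p_value_fixed_n(9, 12))       # 299/4096, about 0.073
print(p_value_stop_at_tails(9, 3))  # 67/2048, about 0.033
```

At the conventional 0.05 threshold the first design fails to reject the fair coin and the second rejects it, even though the flips are identical. Under both designs the likelihood is proportional to p^9 (1 - p)^3, so a Bayesian posterior over the coin's bias is the same either way.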
FYI, there are published counterexamples to Cox's theorem. See for example Joseph Halpern's at http://arxiv.org/pdf/1105.5450.pdf.