gwern comments on A Proof of Occam's Razor - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Comments (121)
I'm completely flummoxed by the level of discussion here in the comments to Unknowns's post. When I wrote a post on logic and most commenters confused truth and provability, that was understandable because not everyone can be expected to know mathematical logic. But here we see people who don't know how to sum or reorder infinite series, don't know what a uniform distribution is, and talk about "the 1/infinity kind of zero". This is a rude wakeup call. If we want to discuss issues like Bayesianism, quantum mechanics or decision theory, we need to take every chance to fill the gaps in our understanding of math.
To answer your question: [0,1] does have the same cardinality as all reals, so in the set-theoretic sense they're equivalent. But they are more than just sets: they come equipped with an additional structure, "measure". A probability distribution can only be defined as uniform with regard to some measure. The canonical measure of the whole [0,1] is 1, so you can set up a uniform distribution that says the probability of each measurable subset is the measure of that subset (alternatively, the integral of the constant function f(x)=1 over that subset). But the canonical measure of the whole real line is infinite, so you cannot build a uniform distribution with respect to that.
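A minimal Python sketch of the point about measure, not part of the original comment. The function name `uniform_prob` and its parameters are made up for illustration. It assigns each interval a probability equal to its length divided by the total length of the space, which only works when that total is finite.

```python
from fractions import Fraction


def uniform_prob(a, b, lo=0, hi=1):
    """Probability that a uniform draw on [lo, hi] lands in [a, b].

    Hypothetical helper: the probability of an interval is its length
    (its measure) divided by the measure of the whole space.
    """
    a, b, lo, hi = (Fraction(v) for v in (a, b, lo, hi))
    total = hi - lo  # measure of the whole space; must be finite and positive
    if total <= 0:
        raise ValueError("the space [lo, hi] must have positive length")
    # Clip [a, b] to the space, then measure what is left.
    a, b = max(a, lo), min(b, hi)
    overlap = max(b - a, Fraction(0))
    return overlap / total


# On [0, 1] the total measure is 1, so probability equals length.
half = uniform_prob(Fraction(1, 4), Fraction(3, 4))

# Stretch the space to [-n, n]: the fixed interval [0, 1] gets probability
# 1/(2n), which shrinks toward 0 as n grows.
shrinking = [uniform_prob(0, 1, -n, n) for n in (1, 10, 1000)]
```

As n grows, every bounded interval's probability tends to zero, so the limit over the whole real line assigns probability 0 to everything and cannot sum to 1. That is the sense in which no uniform distribution exists there.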
After you're comfortable with the above idea, we can add another wrinkle: even though we cannot set up a uniform distribution over all reals, we can in some situations use a uniform prior over all reals. Such things are called improper priors and rely on the lucky fact that the arithmetic of Bayesian updating doesn't require the integral of the prior to be 1, so in well-behaved situations even "improper" prior distributions can give rise to "proper" posterior distributions that integrate to 1.
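Here is a rough numerical sketch of one such well-behaved situation, again not from the original comment. It assumes a single observation `x` from a normal distribution with known `sigma` and an unknown mean `mu`, with a flat prior on `mu`. The function names and grid settings are illustrative choices.

```python
import math


def likelihood(x, mu, sigma=1.0):
    """Normal density of observation x given mean mu and known sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


def flat_prior(mu):
    """Improper prior: constant everywhere, so its integral over the reals is infinite."""
    return 1.0


def posterior_on_grid(x, sigma=1.0, half_width=10.0, steps=20001):
    """Approximate the posterior over mu on a grid centered at x."""
    lo = x - half_width * sigma
    step = 2 * half_width * sigma / (steps - 1)
    grid = [lo + i * step for i in range(steps)]
    unnorm = [flat_prior(mu) * likelihood(x, mu, sigma) for mu in grid]
    # Riemann sum for the integral of prior * likelihood over mu.
    # It is finite even though the prior alone has no finite integral.
    evidence = sum(unnorm) * step
    post = [u / evidence for u in unnorm]
    return grid, post, step


grid, post, step = posterior_on_grid(x=2.5)
total_mass = sum(post) * step
post_mean = sum(mu * p for mu, p in zip(grid, post)) * step
```

In this setup the prior drops out of the arithmetic as a constant factor, and the likelihood, viewed as a function of `mu`, already has a finite integral. The normalized posterior is just a normal density centered on the observation.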
Gosh, now I don't know whether to feel bad or not for asking that question.
But I guess 'no, it's not just cardinality that matters but measure' is a good answer. Is there any quick easy explanation of measure and its use in probability?
(I have yet to learn anything from Wikipedia on advanced math topics. You don't hear bad things about Wikipedia's math topics, but as Bjarne said of C++, complaints mean there are users.)
Don't feel bad, you're actually a hero. The four levels of depravity, by cousin_it:
1. Ask an illiterate question.
2. Assert an illiterate statement.
3. Assert and violently defend an illiterate statement.
4. Assert and violently defend an illiterate statement, becoming offended and switching to ad hominems when you begin to lose.
I'm not sure if other people can successfully learn math topics from Wikipedia because I'm atypical and other-optimizing is hard. Anyway, here's how you tell whether you actually understand a math topic: you should be able to solve simple problems. For example, someone who can't find Nash equilibria in simple 2x2 games is unqualified to talk about the Prisoner's Dilemma, and someone who can't solve the different Monty Hall variants is unqualified to talk about Bayesianism and priors.
You should post this hierarchy somewhere more permanent. It seems... useful.
Here's a non-wiki explanation of measure: http://www.ams.org/bookstore/pspdf/stml-48-prev.pdf. Measure is a generalization of the concept of length.
Complaints mean there are annoyed users, yes. :-)
Also, I second the observation that Wikipedia's math pages are not a good learning resource. They're pretty good as a reference and for refreshers on stuff you've already learned, but they're no substitute for a decent textbook and/or instructor.