Douglas_Knight comments on A Proof of Occam's Razor - Less Wrong

3 Post author: Unknowns 10 August 2010 02:20PM

Comments (121)

Comment author: Douglas_Knight 11 August 2010 03:36:02PM *  3 points [-]

You can have a uniform distribution over a continuous space, but not over integers.

I think it's worth adding that while you can have a uniform distribution over some continuous spaces, like the real numbers between 0 and 1, you can't have a uniform distribution over all real numbers.

Comment author: gwern 11 August 2010 04:10:38PM 0 points [-]

How can you do that? I thought 0-1 had the same cardinality as all reals, and that the cardinality of all reals is too high to have a uniform distribution.

Comment author: cousin_it 11 August 2010 05:02:32PM *  11 points [-]

I'm completely flummoxed by the level of discussion here in the comments to Unknowns's post. When I wrote a post on logic and most commenters confused truth and provability, that was understandable because not everyone can be expected to know mathematical logic. But here we see people who don't know how to sum or reorder infinite series, don't know what a uniform distribution is, and talk about "the 1/infinity kind of zero". This is a rude wakeup call. If we want to discuss issues like Bayesianism, quantum mechanics or decision theory, we need to take every chance to fill the gaps in our understanding of math.

To answer your question: [0,1] does have the same cardinality as all reals, so in the set-theoretic sense they're equivalent. But they are more than just sets: they come equipped with an additional structure, "measure". A probability distribution can only be defined as uniform with regard to some measure. The canonical measure of the whole [0,1] is 1, so you can set up a uniform distribution that says the probability of each measurable subset is the measure of that subset (alternatively, the integral of the constant function f(x)=1 over that subset). But the canonical measure of the whole real line is infinite, so you cannot build a uniform distribution with respect to that.
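In symbols, a rough sketch of the same point (the notation $\lambda$ for the canonical Lebesgue measure and the constant $c$ are introduced here for illustration and are not from the comment above):

```latex
% Uniform distribution on [0,1]: probability equals measure.
P(A) \;=\; \frac{\lambda(A)}{\lambda([0,1])} \;=\; \lambda(A) \;=\; \int_A 1 \, dx
\qquad \text{for measurable } A \subseteq [0,1].

% On the real line, uniformity would force P(A) = c \,\lambda(A) for one constant c.
% But \lambda(\mathbb{R}) = \infty, so
c \,\lambda(\mathbb{R}) \;=\;
\begin{cases}
  \infty, & c > 0,\\
  0,      & c = 0,
\end{cases}
% and no choice of c gives total probability 1.
```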

After you're comfortable with the above idea, we can add another wrinkle: even though we cannot set up a uniform distribution over all reals, we can in some situations use a uniform prior over all reals. Such things are called improper priors and rely on the lucky fact that the arithmetic of Bayesian updating doesn't require the integral of the prior to be 1, so in well-behaved situations even "improper" prior distributions can always give rise to "proper" posterior distributions that integrate to 1.
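A standard textbook illustration, sketched under the assumption of normally distributed data with known variance $\sigma^2$ (the model and symbols here are chosen for illustration, not taken from the thread): a flat prior on the mean $\mu$ does not integrate to 1, yet the posterior does.

```latex
% Improper flat prior on the mean:
p(\mu) \propto 1, \qquad \int_{-\infty}^{\infty} p(\mu)\, d\mu = \infty .

% Likelihood for observations x_1, \dots, x_n with known variance \sigma^2:
p(x_1,\dots,x_n \mid \mu) \;\propto\; \prod_{i=1}^{n} \exp\!\left(-\frac{(x_i-\mu)^2}{2\sigma^2}\right).

% Bayes' rule only needs the product prior \times likelihood, up to a constant:
p(\mu \mid x_1,\dots,x_n) \;\propto\; \exp\!\left(-\frac{n\,(\mu-\bar{x})^2}{2\sigma^2}\right),
\qquad \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i .

% This is a proper distribution that integrates to 1:
\mu \mid x_1,\dots,x_n \;\sim\; \mathcal{N}\!\left(\bar{x},\, \frac{\sigma^2}{n}\right).
```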

Comment author: JoshuaZ 11 August 2010 10:02:18PM 2 points [-]

I'm completely flummoxed by the level of discussion here in the comments to Unknowns's post. When I wrote a post on logic and most commenters confused truth and provability, that was understandable because not everyone can be expected to know mathematical logic. But here we see people who don't know how to sum or reorder infinite series, don't know what a uniform distribution is, and talk about "the 1/infinity kind of zero". This is a rude wakeup call. If we want to discuss issues like Bayesianism, quantum mechanics or decision theory, we need to take every chance to fill the gaps in our understanding of math.

Yeah, it might help to make a list of "math recommended for reading and posting on Less Wrong". Unfortunately, the entire set is likely large enough that even many physics majors won't have all of it (lots of physics people don't take logic or model theory classes). At this point the list of math topics seems to include Lebesgue integration, Gödel's theorems and basic model theory, basics of continuous and discrete probability spaces, and a little bit of theoretical compsci ranging over a lot of topics (both computability theory and complexity theory seem relevant). Some of the QM posts also require a bit of linear algebra to actually grok well, but I suspect that anyone who meets most of the rest of the list will have that already. Am I missing any topics?

Comment author: cousin_it 11 August 2010 10:59:02PM *  19 points [-]

Not all LW participants need to know advanced math. Human rationality is a broad topic, and someone like Yvain or Alicorn can contribute a lot without engaging the math side of things. What I care about is the signal-to-noise ratio in math discussion threads. In other words, I'm okay with regular ignorance but hate vocal ignorance with a fiery passion.

I propose a simple heuristic: if you see others using unfamiliar math, look it up before commenting. If it relies on other concepts that you don't understand, look them up too, and so on. Yes, it might take you days of frantic digging. It sometimes takes me days full-time, even though I have a math degree. Time spent learning more math is always better invested than time spent in novice-level discussions.

Comment author: JoshuaZ 11 August 2010 11:13:39PM 3 points [-]

That seems valid, but part of the trouble also seems to be people thinking they understand math that they don't after reading popularizations. I'm not sure how to deal with that other than just having those individuals read the actual math.

Comment author: Vladimir_Nesov 12 August 2010 08:33:54AM 5 points [-]

We need merely a norm to tell people to stop, not a magic way of explaining math faster than possible. Also, a more realistic timeline for grokking (as opposed to parsing) deeper concepts is months, not days, and that's if you are smart enough in the first place.

Comment author: cousin_it 11 August 2010 11:22:48PM *  1 point [-]

Agreed. I have no idea either.

Comment author: katydee 11 August 2010 11:42:35PM 2 points [-]

Well, if someone posts something wrong, call them out on it.

Comment author: JoshuaZ 11 August 2010 11:48:29PM 4 points [-]

Empirically that doesn't seem to help much. See for example the comment thread in cousin_it's last top-level post.

Comment author: NancyLebovitz 12 August 2010 09:18:59AM 2 points [-]

Cousin_it and JoshuaZ, this sounds as though it could be a good topic (or group of topics) for a top-level post.

Comment author: JoshuaZ 12 August 2010 08:54:44PM 2 points [-]

Do you mean a set of posts on telling when you don't know enough or a set of posts on the math people should know?

Comment author: NancyLebovitz 12 August 2010 10:28:10PM 2 points [-]

Either or both would be useful, but I was thinking about the latter.

Comment author: RichardKennaway 12 August 2010 01:31:56PM 0 points [-]

Am I missing any topics?

In getting to some of those things, there are some more basic subjects that would have to be mastered:

  • Algebra: giving names to unknown numerical quantities and then reasoning about them the way one would with actual numbers.

  • Calculus: the relationship between a rate of change and a total amount, and the basic differential equations of physics, e.g. Newtonian mechanics, the diffusion equation, etc.

If this all seems like a lot, many people spend until well into their twenties in school. Think where they would get to if all that time had been usefully spent!

Comment author: Sniffnoy 12 August 2010 11:00:56PM 0 points [-]

Why is all that physics necessary? I'm not seeing it.

Comment author: RichardKennaway 13 August 2010 07:00:05AM 0 points [-]

Practical examples. Not many people are going to plough through abstract mathematics without them.

Comment author: Vladimir_Nesov 12 August 2010 08:30:41AM 0 points [-]

Causal networks, axiomatizations of expected utility.

Comment author: gwern 12 August 2010 04:38:09AM 0 points [-]

Gosh, now I don't know whether to feel bad or not for asking that question.

But I guess 'no, it's not just cardinality that matters but measure' is a good answer. Is there any quick easy explanation of measure and its use in probability?

(I have yet to learn anything from Wikipedia on advanced math topics. You don't hear bad things about Wikipedia's math topics, but as Bjarne said of C++, complaints mean there are users.)

Comment author: cousin_it 12 August 2010 08:50:06AM *  9 points [-]

Don't feel bad, you're actually a hero. The four levels of depravity, by cousin_it:

  1. Ask an illiterate question.

  2. Assert an illiterate statement.

  3. Assert and violently defend an illiterate statement.

  4. Assert and violently defend an illiterate statement, becoming offended and switching to ad hominems when you begin to lose.

I'm not sure if other people can successfully learn math topics from Wikipedia because I'm atypical and other-optimizing is hard. Anyway, here's how you tell whether you actually understand a math topic: you should be able to solve simple problems. For example, someone who can't find Nash equilibria in simple 2x2 games is unqualified to talk about the Prisoner's Dilemma, and someone who can't solve the different Monty Hall variants is unqualified to talk about Bayesianism and priors.
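As one illustration of the kind of simple problem meant here, a minimal simulation sketch of two Monty Hall variants. The choice of variants and every name in the code are assumptions made for this example, not something specified in the comment: in the first, the host knows where the car is and always opens a goat door; in the second, the host opens one of the other two doors at random and we only keep the games where a goat happens to be revealed.

```python
import random


def switch_win_rate(host_knows, trials, rng):
    """Estimate how often switching wins, given that a goat was revealed.

    host_knows=True:  host deliberately opens a goat door (classic variant).
    host_knows=False: host opens one of the two unchosen doors at random;
                      games where the car is revealed are discarded.
    """
    wins = 0
    games = 0
    for _ in range(trials):
        car = rng.randrange(3)
        pick = rng.randrange(3)
        others = [d for d in range(3) if d != pick]
        if host_knows:
            # Host chooses among unchosen doors that hide a goat.
            opened = rng.choice([d for d in others if d != car])
        else:
            opened = rng.choice(others)
            if opened == car:
                continue  # car revealed by accident; not the situation asked about
        games += 1
        switched_to = next(d for d in others if d != opened)
        wins += switched_to == car
    return wins / games


if __name__ == "__main__":
    rng = random.Random(0)
    print("host knows:  ", switch_win_rate(True, 50_000, rng))
    print("host guesses:", switch_win_rate(False, 50_000, rng))
```

The estimates should land near 2/3 when the host knows and near 1/2 when the host guesses, which is the point of the exercise: the same observed event supports different posteriors depending on how it was generated.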

Comment author: WrongBot 13 August 2010 08:12:56PM 2 points [-]

You should post this hierarchy somewhere more permanent. It seems... useful.

Comment author: [deleted] 12 August 2010 04:55:36AM 2 points [-]

Here's a non-wiki explanation of measure: http://www.ams.org/bookstore/pspdf/stml-48-prev.pdf. Measure is a generalization of the concept of length.

Comment author: DSimon 16 December 2010 05:26:30AM 0 points [-]

Complaints mean there are annoyed users, yes. :-)

Also, I second this observation of Wikipedia's math pages not being a good learning resource. They're pretty good as reference and for refreshers on stuff you've already learned, but they're no substitute for a decent textbook and/or instructor.

Comment author: Douglas_Knight 11 August 2010 05:04:07PM 2 points [-]

Yes, it's uncountable, so the probability of any particular real number is zero. I'm not going to go into detail about what I mean by a probability distribution here (though I'll note that cousin_it used the word "integrate"). To uniformly pick a number $x$ from 0 to 1, use an infinite sequence of coin flips to represent it in binary. Infinity is impractical, but for natural questions, like whether $x$ is between $1/5$ and $1/\sqrt{2}$, you will almost surely need only finitely many flips (and the expected number of flips is also finite). And the probability that the answer is yes is $1/\sqrt{2} - 1/5$; that is the sense in which it is uniform.
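The coin-flip procedure can be sketched in code. This is a minimal illustration, not a definitive implementation: the function name, the `max_flips` cutoff, and the use of floating-point endpoints (so the irrational upper endpoint is only approximated) are all assumptions of the sketch. After k flips, x is known to lie in a dyadic interval of width 2^-k, and flipping stops as soon as that interval sits entirely inside or entirely outside the target range.

```python
import math
import random


def lazily_decide(a, b, rng, max_flips=50):
    """Decide whether a uniform x in [0, 1] lies between a and b.

    Binary digits of x are generated one fair coin flip at a time, and
    only as many as are needed. Returns (answer, flips_used).
    """
    lo, width = 0.0, 1.0  # x is known to lie in [lo, lo + width]
    flips = 0
    while flips < max_flips:
        hi = lo + width
        if a <= lo and hi <= b:
            return True, flips   # interval entirely inside [a, b]
        if hi <= a or lo >= b:
            return False, flips  # interval entirely outside (a, b)
        width /= 2
        if rng.getrandbits(1):   # next binary digit of x is 1
            lo += width
        flips += 1
    # Practically unreachable cutoff: fall back to the current lower bound.
    return a <= lo < b, flips


if __name__ == "__main__":
    rng = random.Random(0)
    a, b = 1 / 5, 1 / math.sqrt(2)
    results = [lazily_decide(a, b, rng) for _ in range(20_000)]
    print("fraction yes:", sum(ans for ans, _ in results) / len(results))
    print("target:      ", b - a)
    print("mean flips:  ", sum(n for _, n in results) / len(results))
```

For these particular endpoints the undecided intervals after k flips are just the ones containing 1/5 or 1/sqrt(2), so the chance of still being undecided halves with each further flip and the average number of flips stays small.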

You can transfer this distribution to the set of all real numbers, e.g., by logit or arctan, but it won't be uniform in the same sense. One can satisfy that uniformity by a measure which is not a probability measure.
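To see concretely why the transferred distribution is not uniform, here is a small sketch using the arctan route. The specific map x = tan(pi * (u - 1/2)) and the function names are assumptions chosen for illustration. If u is uniform on (0, 1), then x has cumulative distribution function 1/2 + arctan(x)/pi (the standard Cauchy distribution), and intervals of equal length get very different probabilities.

```python
import math


def transfer(u):
    """Map u in (0, 1) to a real number via the inverse of the arctan CDF."""
    return math.tan(math.pi * (u - 0.5))


def transferred_cdf(x):
    """P(X <= x) when X = transfer(U) and U is uniform on (0, 1)."""
    return 0.5 + math.atan(x) / math.pi


def interval_probability(a, b):
    """Probability that the transferred variable lands in [a, b]."""
    return transferred_cdf(b) - transferred_cdf(a)


if __name__ == "__main__":
    # Two intervals of length 1 on the real line:
    print("P(0 <= X <= 1)   =", interval_probability(0, 1))
    print("P(10 <= X <= 11) =", interval_probability(10, 11))
```

The first interval gets probability 1/4 while the second gets far less, so probability is no longer proportional to length, which is what uniformity in the earlier sense would require.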