
StellaAthena comments on 0 And 1 Are Not Probabilities - Less Wrong

34 Post author: Eliezer_Yudkowsky 10 January 2008 06:58AM


Comments (128)



Comment author: StellaAthena 20 August 2015 08:49:03AM 0 points [-]

This article is largely incoherent. The main justification rests on the abuse of an invalid transformation: y=x/(1-x) is not the bijection that he asserts it is, because it's not a function that maps [0,1] onto R. It's a function that maps [0,1] onto [0,\infty] as a subset of the topological closure of R. And that's okay, but you can't say "well I don't like the topological closure of R, so I'll just use R and claim that 1 is where the problem is."

Additionally, his discussion of log odds and such is perfectly fine, but ignores the fact that there are places where you do need to have an odds of 0:1, or a log odds of negative infinity. Probability theory stops working when you throw out 0 and 1, it's as simple as that.
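To make the endpoints concrete, here is a minimal sketch of the odds and log-odds maps with the extended-real values filled in by hand. The function names `odds` and `log_odds` are mine, and treating the endpoints as plus or minus infinity is the convention I'm assuming, not something the post specifies.

```python
import math

def odds(p: float) -> float:
    """Map a probability p in [0, 1] to odds p / (1 - p) in [0, +inf]."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    # p = 1 would divide by zero; in the extended reals the value is +infinity.
    return math.inf if p == 1.0 else p / (1.0 - p)

def log_odds(p: float) -> float:
    """Map a probability p in [0, 1] to log odds in [-inf, +inf]."""
    o = odds(p)
    # math.log(0.0) raises ValueError, so the -infinity endpoint is set by hand.
    if o == 0.0:
        return -math.inf
    return math.log(o)  # math.log(math.inf) is inf

print(odds(0.0), odds(0.5), odds(1.0))
print(log_odds(0.0), log_odds(0.5), log_odds(1.0))
```

Nothing breaks at p=0 or p=1 as long as you are willing to let the codomain include the infinities.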

Even if you don't want to handle tautologies or contradictions, there are other ways to get P(X)=0 or 1. The probability that a real number chosen uniformly from the real interval [0,1] equals any particular value, say 1/2, is 0. It has to be. It's a provable fact under ZFC, and to decide otherwise is to say that you're more attached to the idea of 0 and 1 not being probabilities than you are to the fact that mathematics is consistent. If you really believe that, well, there's absolutely nothing I have to say to you.
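A short version of why a single point gets probability zero, assuming X is uniform on [0,1] and a is any fixed point of the interval (the symbols X, a and \varepsilon are my own labels):

```latex
\[
0 \;\le\; P(X = a) \;\le\; P\bigl(a - \varepsilon \le X \le a + \varepsilon\bigr) \;\le\; 2\varepsilon
\quad \text{for every } \varepsilon > 0,
\qquad \text{hence} \qquad P(X = a) = 0 .
\]
```

The middle step uses only that the probability of an interval under the uniform distribution is at most its length.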

This is one of those situations where EY just demonstrates he knows very little mathematics.

Comment author: Regex 20 August 2015 09:35:55AM 1 point [-]

As someone who doesn't know much beyond basic statistics, in what way are 0 or 1 probabilities? Isn't it just axiomatic truth at that point? In that sense saying zero and one are probabilities is just saying 'certain' or 'impossible' as far as I understand it. Situations where an event will definitely or definitely not occur don't seem to be consistent with the idea of randomness which I've understood probability to revolve around.

I suppose the alternative would be that we'd have to assume every mathematical proof has infinite evidence if we wanted to get anywhere productive- after all axioms are assumed to be true. It doesn't make much sense to need evidence in that scenario- except perhaps the probability of error and mistake? That isn't particularly calculable and would actually change from person to person.

Using one and zero makes sense to me as a matter of assumed or proven truths, but I'm still unsure how that makes it a probability.

Comment author: Epictetus 20 August 2015 02:30:48PM 1 point [-]

Situations where an event will definitely or definitely not occur don't seem to be consistent with the idea of randomness which I've understood probability to revolve around.

"Event" is a very broad notion. Let's say, for example, that I roll two dice. The sample space is just a collection of pairs (a, b) where "a" is what die 1 shows and "b" is what die 2 shows. An event is any sub-collection of the sample space. So, the event that the numbers sum to 7 is the collection of all such pairs where a + b = 7. The probability of this event is simply the fraction of the sample space it occupies.

If I rolled eight dice, then they'll never sum to seven and I say that that event occurs with probability 0. If I secretly rolled an unknown number of dice, you could reasonably ask me the probability that they sum to seven. If I answer "0", that just means that I rolled either a single die or at least eight dice. It doesn't make the process less random nor the question less reasonable.
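The dice example can be checked by brute force. This is a minimal sketch that enumerates the sample space and counts the fraction occupied by the event; `prob_sum` is a name I made up, and fair six-sided dice are assumed.

```python
from fractions import Fraction
from itertools import product

def prob_sum(num_dice: int, target: int = 7) -> Fraction:
    """Fraction of the sample space of num_dice fair dice whose faces sum to target."""
    # The sample space is every ordered tuple of faces, all equally likely.
    sample_space = product(range(1, 7), repeat=num_dice)
    # The event is the subset of outcomes whose faces add up to target.
    hits = sum(1 for outcome in sample_space if sum(outcome) == target)
    return Fraction(hits, 6 ** num_dice)

for n in range(1, 9):
    print(n, prob_sum(n))
```

With two dice the event takes up 6 of the 36 outcomes, so the answer is 1/6. The answer is exactly 0 when the event is the empty subset of the sample space, which is still a perfectly good event.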

If you treat an event as some question you can ask about the result of a random process, then 1 and 0 make a lot more sense as probabilities.

For the mathematical theory of probability, there are plenty of technical reasons why you want to retain 1 and 0 as probabilities (and once you get into continuous distributions, it turns out that probability 1 just means "almost certain").

Comment author: Regex 21 August 2015 08:36:41AM 0 points [-]

This is what I meant by something being a proven truth- within the rules set one can find outcomes which are axiomatically impossible or necessary. The process itself may be random, but calling it random when something impossible didn't happen seems odd to me. The very idea that 1 may be not-quite-certain is more than a little baffling, and I suspect is the heart of the issue.

Comment author: Epictetus 21 August 2015 02:01:03PM 1 point [-]

The very idea that 1 may be not-quite-certain is more than a little baffling, and I suspect is the heart of the issue.

If 1 isn't quite certain then neither is 0 (if something happens with probability 1, then the probability of it not happening is 0). It's one of those things that pops up when dealing with infinity.

It's best illustrated with an example. Let's say we play a game where we flip a coin and I pay you $1 if it's heads and you pay me $1 if it's tails. With probability 1, one of us will eventually go broke (see Gambler's ruin). It's easy to think of a sequence of coin flips where this never happens; for example, if heads and tails alternated. The theory holds that such a sequence occurs with probability 0. Yet this does not make it impossible.

It can be thought of as the result of a limiting process. If I looked at sequences of N coin flips, counted the ones where no one went broke and divided this by the total number of possible sequences, then as I let N go to infinity this ratio would go to zero. This event occupies a region with area 0 in the sample space.
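Here is a rough sketch of that limiting process. It assumes both players start with finite bankrolls (my assumption; the names `survival_fraction`, `start` and `total` are made up for the illustration) and computes exactly what fraction of the N-flip sequences leave nobody broke.

```python
from fractions import Fraction

def survival_fraction(start: int, total: int, num_flips: int) -> Fraction:
    """Fraction of num_flips-long fair-coin sequences in which nobody goes broke.

    One player starts with `start` dollars and the other with `total - start`.
    """
    half = Fraction(1, 2)
    # dist[w] = fraction of sequences so far that leave player one holding w
    # dollars without either player ever having reached 0.
    dist = {start: Fraction(1)}
    for _ in range(num_flips):
        nxt = {}
        for wealth, mass in dist.items():
            for new_wealth in (wealth - 1, wealth + 1):
                # Sequences in which someone is ruined are dropped from the count.
                if 0 < new_wealth < total:
                    nxt[new_wealth] = nxt.get(new_wealth, Fraction(0)) + mass * half
        dist = nxt
    return sum(dist.values(), Fraction(0))

for n in (10, 100, 1000):
    print(n, float(survival_fraction(5, 10, n)))
```

For every finite N the fraction is strictly positive (the alternating sequence is always among the survivors), but it shrinks toward 0 as N grows. Probability 0 is the value of the limit, not a claim that the set of surviving sequences is empty.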

Comment author: Regex 22 August 2015 03:01:15PM 0 points [-]

If the limit converges then it can hit 0 or 1. Got it. Thank you.

Comment author: StellaAthena 20 August 2015 08:14:17PM 1 point [-]

Formally, probability is defined via areas. The basic idea is that the probability of picking an element from a set A out of a set B is the ratio of the areas of A to B, where "area" can be defined not only for things like squares but also things like lines, or actually almost every* subset of R. So, let's say you want to randomly select a real number from the interval [0,1] and want to know the odds it falls in a set, S. The area of [0,1] is 1, so the answer is just the area of S.

If S={0}, then S has area zero. If S=[0,1), then S has area 1. Not only are both of these theoretical possibilities, they are practical ones too. There are real world examples of probability zero events (the only one that comes to mind involves QM though so I don't want to bother with the details).

Now, notice that this isn't the same thing as "impossible". Instead, it means more like "it won't happen I promise even by the time the universe ends". The way I tend to think about probability zero events is that they are so unlikely they are beyond the reach of the principle that as the number of trials increases, events become expected. For any nonzero probability, there is a number of trials, n, such that once you do it n times the expected number of occurrences becomes greater than 1. That's not the case with probability zero events. Probability 1 events can then be thought of as the negation of probability 0 events.
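The "number of trials" remark is just the inequality n * p > 1. A minimal sketch, where `trials_needed` is a hypothetical helper name and the trials are assumed to be independent repeats with the same per-trial probability p:

```python
from fractions import Fraction
from math import floor

def trials_needed(p: Fraction) -> int:
    """Smallest n such that the expected number of occurrences n * p exceeds 1."""
    if not 0 < p <= 1:
        # For p = 0 the product n * p stays 0 for every finite n.
        raise ValueError("need 0 < p <= 1; no finite n works when p is 0")
    return floor(1 / p) + 1

print(trials_needed(Fraction(1, 6)))       # 7
print(trials_needed(Fraction(1, 10**100)))
```

However small p is, the answer is finite as long as p is not zero. At p = 0 the function has nothing to return, which is the distinction being drawn above.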

*not actually "almost every" in a formal sense, but "almost any" in a "unless you go try to build a set that you can't measure it probably has a well defined area" sense

Comment author: Regex 21 August 2015 08:24:39AM 1 point [-]

That seems a solid enough explanation, but how can something of probability zero have a chance to occur? How then do you represent an impossible outcome? It seems like otherwise 'zero' is equivalent to 'absurdly low'. That doesn't quite jibe with my understanding.

Comment author: StellaAthena 21 August 2015 09:37:41PM 1 point [-]

Impossible things also have a probability of zero. I totally understand that this seems a bit unintuitive, and the underlying structure (which includes things like infinities of different sizes) is generally pretty unintuitive at first. Which is kinda just saying "sorry, I can't explain the intuition," which is unfortunately true.

Comment author: Regex 22 August 2015 02:47:33PM 0 points [-]

I'm just going to think of it as taking the limit as evidence approaches infinity. Because a probability next to zero and zero are identical, zero then is a probability?

Comment author: Stephen_Cole 22 August 2015 12:07:25AM 1 point [-]

I think one of the clearest expositions on these issues is ET Jaynes. The first three chapters (which is some of the relevant part) can be found at http://bayes.wustl.edu/etj/prob/book.pdf.

Comment author: Regex 22 August 2015 02:39:02PM 0 points [-]

"Not Found

The requested URL /etj/prob/book.pdf. was not found on this server."

Comment author: arundelo 22 August 2015 02:59:40PM 2 points [-]

Fixed Jaynes link (no trailing period).

Comment author: Regex 22 August 2015 03:01:47PM 0 points [-]

Ah. Thanks!

Comment author: Stephen_Cole 22 August 2015 03:17:11PM 0 points [-]

Oops. Thanks for the fix!

Comment author: David_Bolin 20 August 2015 01:10:15PM 1 point [-]

Eliezer isn't arguing with the mathematics of probability theory. He is saying that in the subjective sense, people don't actually have absolute certainty. This would mean that mathematical probability theory is an imperfect formalization of people's subjective degrees of belief. It would not necessarily mean that it is impossible in principle to come up with a better formalization.

Comment author: Lumifer 20 August 2015 02:44:08PM *  0 points [-]

Eliezer isn't arguing with the mathematics of probability theory. He is saying that in the subjective sense, people don't actually have absolute certainty.

Errr... as I read EY's post, he is certainly talking about the mathematics of probability (or about the formal framework in which we operate on probabilities) and not about some "subjective sense".

The claim of "people don't actually have absolute certainty" looks iffy to me, anyway. The immediate two questions that come to mind are (1) How do you know? and (2) Not even a single human being?

Comment author: Bound_up 20 August 2015 03:23:35PM *  1 point [-]

I think he's just acknowledging the minute(?) possibility that our apparently flawless reasoning could have a blind spot. We could be in a Matrix, or have something tampering with our minds, etcetera, such that the implied assertion:

If this appears absolutely certain to me

Then it must be true

is indefensible.

Comment author: Lumifer 20 August 2015 03:43:59PM 0 points [-]

There are two different things.

David_Bolin said (emphasis mine): "He is saying that in the subjective sense, people don't actually have absolute certainty." I am interpreting this as "people never subjectively feel they have absolute certainty about something" which I don't think is true.

You are saying that from an external ("objective") point of view, people cannot (or should not) be absolutely sure that their beliefs/conclusions/maps are true. This I easily agree with.

Comment author: David_Bolin 20 August 2015 07:08:55PM 0 points [-]

It should probably be defined by calibration: do some people have a type of belief where they are always right?

Comment author: Lumifer 20 August 2015 07:36:53PM -1 points [-]

Self-referential and anthropic things would probably qualify, e.g. "I believe I exist".

Comment author: StellaAthena 20 August 2015 08:33:17PM -1 points [-]

You can phrase statements of logical deduction such that they have no premises and only conclusions. If we let S be the set of logical principles under which our logical system operates and T be some sentence that entails Y, then S AND T implies Y is something that I have absolute certainty in, even if this world is an illusion, because the premise of the implication contains all the rules necessary to derive the result.

A less formal example of this would be the sentence: If the rules of logic as I know them hold and the axioms of mathematics are true, then it is the case that 2+2=4

Comment author: Gram_Stone 20 August 2015 03:50:44PM 0 points [-]

The claim of "people don't actually have absolute certainty" looks iffy to me, anyway. The immediate two questions that come to mind are (1) How do you know? and (2) Not even a single human being?

The way I view that statement is: "In our formalization, agents with absolutely certain beliefs cannot change those beliefs, we want our formalization to capture our intuitive sense of how an ideal agent would update its beliefs, a formalization with a quality of fanaticism does not capture our intuitive sense of how an ideal agent would update its beliefs, therefore we do not want a quality of fanaticism."
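The "cannot change those beliefs" part is easy to see in a toy Bayes update. This is only a sketch: `bayes_update` and its argument names are mine, and the likelihood numbers are invented for the illustration.

```python
from fractions import Fraction

def bayes_update(prior: Fraction, like_h: Fraction, like_not_h: Fraction) -> Fraction:
    """Posterior P(H | E) from prior P(H), P(E | H) and P(E | not H)."""
    evidence = prior * like_h + (1 - prior) * like_not_h
    if evidence == 0:
        raise ZeroDivisionError("the observation has probability 0 under this prior")
    return prior * like_h / evidence

# Evidence that is 99 times more likely if H is false than if H is true.
like_h, like_not_h = Fraction(1, 100), Fraction(99, 100)

print(bayes_update(Fraction(1, 2), like_h, like_not_h))  # 1/100
print(bayes_update(Fraction(1), like_h, like_not_h))     # 1
print(bayes_update(Fraction(0), like_h, like_not_h))     # 0
```

An agent starting at 1/2 moves a long way on this evidence. An agent starting at exactly 1 or exactly 0 does not move at all, no matter how lopsided the likelihoods are.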

And what state of the world would correspond to the statement "Some people have absolute certainty." ? Do you think that we can take some highly advanced and entirely fictional neuroimaging technology, look at a brain and meaningfully say, "There's a belief with probability 1." ?

And on the other hand, I'm not afraid to talk about folk certainty, where the properties of an ideal mathematical system are less relevant, where everyone can remain blissfully logically uncertain to the fact that beliefs with probability 1 and 0 imply undesirable consequences in formal systems that possess them, and say things like "I believe that absolutely." I am not afraid to say something like, "That person will not stop believing that for as long as he lives," and mean that I predict with high confidence that that person will not stop believing that for as long as he lives.

And once you believe that the formalization is trying to capture our intuitive sense of an ideal agent, and decide whether or not that quality of fanaticism captures it, and decide whether or not you're going to be a stickler about folk language, then I don't think that any question or confusion around that claim remains.

Comment author: Lumifer 20 August 2015 03:57:58PM 1 point [-]

People are not "ideal agents". If you specifically construct your formalization to fit your ideas of what an ideal agent should and should not be able to do, this formalization will be a poor fit to actual, live human beings.

So either you make a system for ideal agents -- in which case you'll still run into some problems because, as has been pointed out upthread, standard probability math stops working if you disallow zeros and ones -- or you make a system which is applicable to our imperfect world with imperfect humans.

Comment author: Gram_Stone 20 August 2015 09:59:02PM 1 point [-]

I don't see why both aren't useful. If you want a descriptive model instead of a normative one, try prospect theory.

I just don't see this article as an axiom that says probabilities of 0 and 1 aren't allowed in probability theory. I see it as a warning not to put 0s and 1s in your AI's prior. You're not changing the math so much as picking good priors.

Comment author: Wes_W 20 August 2015 05:02:33PM 0 points [-]

If we're asking what the author "really meant" rather than just what would be correct, it's on record.

The argument for why zero and one are not probabilities is not, "All objects which are special cases should be cast out of mathematics, so get rid of the real zero because it requires a special case in the field axioms", it is, "ceteris paribus, can we do this without the special case?" and a bit of further intuition about how 0 and 1 are the equivalents of infinite probabilities, where doing our calculations without infinities when possible is ceteris paribus regarded as a good idea by certain sorts of mathematicians. E.T. Jaynes in "Probability Theory: The Logic of Science" shows how many probability-theoretic errors are committed by people who assume limits directly into their calculations, without first showing the finite calculation and then finally taking its limit. It is not unreasonable to wonder when we might get into trouble by using infinite odds ratios. Furthermore, real human beings do seem to often do very badly on account of claiming to be infinitely certain of things so it may be pragmatically important to be wary of them.

I... can't really recommend reading the entire thread at the link, it's kind of flame-war-y and not very illuminating.

Comment author: EHeller 20 August 2015 05:14:30PM *  3 points [-]

I think the issue at hand is that 0 and 1 aren't special cases at all, but very important for the math of probability theory to work (try and construct a probability measure where some subset doesn't have probability 1 or 0).

This is incredibly necessary for the mathematical idea of probability, and EY seems to be confusing "are 0 and 1 probabilities relevant to Bayesian agents?" with "are 0 and 1 probabilities?" (yes, they are, unavoidably, not as a special case!).

Comment author: Lumifer 20 August 2015 05:18:06PM *  0 points [-]

It seems that EY's position boils down to:

Pragmatically speaking, the real question for people who are not AI programmers is whether it makes sense for human beings to go around declaring that they are infinitely certain of things. I think the answer is that it is far mentally healthier to go around thinking of things as having 'tiny probabilities much larger than one over googolplex' than to think of them being 'impossible'.

And that's a weak claim. EY's ideas of what is "mentally healthier" are, basically, his personal preferences. I, for example, don't find any mental health benefits in thinking about one over googolplex probabilities.

Comment author: Wes_W 20 August 2015 05:27:16PM *  0 points [-]

Cromwell's Rule is not EY's invention, and relatively uncontroversial for empirical propositions (as opposed to tautologies or the like).

If you don't accept treating probabilities as beliefs and vice versa, then this whole conversation is just a really long and unnecessarily circuitous way to say "remember that you can be wrong about stuff".

Comment author: EHeller 20 August 2015 05:44:34PM 2 points [-]

The part that is new compared to Cromwell's rule is that Yudkowsky doesn't want to give probability 1 to logical statements (53 is a prime number).

Because he doesn't want to treat 1 as a probability, you can't expect complete sets of events to have total probability 1, despite them being tautologies. Because he doesn't want probability 0, how do you handle the empty set? How do you assign probabilities to statements like "A and B" where A and B are logically exclusive? (the coin lands heads AND the coin lands tails).

Removing 0 and 1 from the math of probability breaks most of the standard manipulations. Again, it's best to just say "be careful with 0 and 1 when working with odds ratios."
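The coin example can be written out as a tiny finite probability space. This is a sketch under the assumption of a fair coin with a uniform measure; `OMEGA` and `prob` are names I picked for the illustration.

```python
from fractions import Fraction

# Sample space for one fair coin flip; both outcomes are equally likely.
OMEGA = frozenset({"heads", "tails"})

def prob(event: frozenset) -> Fraction:
    """Probability of an event (a subset of OMEGA) under the uniform measure."""
    return Fraction(len(event & OMEGA), len(OMEGA))

heads = frozenset({"heads"})
tails = frozenset({"tails"})

print(prob(heads | tails))  # heads OR tails is the whole space
print(prob(heads & tails))  # heads AND tails is the empty set
```

The union of the two events is the whole sample space and the intersection is the empty set, so the values 1 and 0 come out of the set operations themselves. Additivity, P(heads) + P(tails) = P(heads or tails), only balances because 1 is available as a value.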

Comment author: Lumifer 20 August 2015 05:48:30PM 1 point [-]

Nobody is saying EY invented Cromwell's Rule, that's not the issue.

The issue is that "0 and 1 are not useful subjective certainties for a Bayesian agent" is a very different statement than "0 and 1 are not probabilities at all".

Comment author: Wes_W 20 August 2015 06:05:37PM *  0 points [-]

You're right, I misread your sentence about "his personal preferences" as referring to the whole claim, rather than specifically the part about what's "mentally healthy". I don't think we disagree on the object level here.

Comment author: David_Bolin 20 August 2015 06:50:57PM 0 points [-]

Of course if no one has absolute certainty, this very fact would be one of the things we don't have absolute certainty about. This is entirely consistent.