hmm... I feel even more confident about the existence of probability-zero statements than I feel about the existence of probability-1 statements. Because not only do we have logical contradictions, but we also have incoherent statements (like Husserl's "the green is either").
Can one form subjective probabilities over the truth of "the green is either" at all? I don't think so, but I remember a some-months-ago suggestion of Robin's about "impossible possible worlds," which might also imply the ability to form probability esti...
If you assign 0 to logical contradictions, you should assign 1 to the negations of logical contradictions. (Particularly since your confidence in bivalence and the power of negation is what allowed you to doubt the truth of the contradiction in the first place.) So it's strange to say that you feel safer appealing to 0s than to 1s.
For my part, I have a hard time convincing myself that there's simply no (epistemic) chance that Graham Priest is right. On the other hand, assigning any value but 1 to the sentence "All bachelors are bachelors" just seems perverse. It seems as though I could only get that sentence wrong if I misunderstand it. But what am I assigning a probability to, if not the truth of the sentence as I understand it?
Another way of saying this is that I feel queasy assigning a nonzero probability to "Not all bachelors are bachelors," (i.e., ¬(p → p)) even though I think it probably makes some sense to entertain as a vanishingly small possibility "All bachelors are non-bachelors" (i.e., p → ¬p, all bachelors are contradictory objects).
One answer would be that an incoherent proposition is not a proposition, and so doesn't have any probability (not even zero, if zero is a probability.)
Another answer would be that there is some probability that you are wrong that the proposition is incoherent (you might be forgetting your knowledge of English), and therefore also some probability that "the green is either" is both coherent and true.
It's difficult to assign a probability to incoherent statements: since we can't mean anything by them, we can't assign a referent to the statement -- in that sense, the probability is indeterminate (additionally, one could easily imagine a language in which a statement such as "the green is either" has a perfectly coherent meaning -- and we can't say that's not what we meant, since we didn't mean anything). Recall also that each probability-zero statement implies a probability-one statement by its denial and vice versa, so one is equally capable of imagining them, if in a contrived way.
j.edwards, I think your last sentence convinced me to withdraw the objection -- I can't very well assign a probability of 1 to ~"the green is either" can I? Good point, thanks.
Probabilities of 0 and 1 are perhaps more like the perfectly massless, perfectly inelastic rods we learn about in high school physics - they are useful as part of an idealized model which is often sufficient to accurately predict real-world events, but we know that they are idealizations that will never be seen in real life.
However, I think we can assign the primeness of 7 a value of "so close to 1 that there's no point in worrying about it".
In stark contrast to this time last week, I now internally believe the title of this post.
I did enjoy "something, somewhere, is having this thought," Paul, despite all its inherent messiness.
'Green is either' doesn't tell us much. As far as we know it's a nonsensical statement, but I think that makes it more believable than 'green is purple', which makes sense, but seems extremely wrong. You might as well try to assign a probability to 'flarg is nardle'. I can demonstrate that green isn't purple, but not that green isn't either, nor that flarg is...
I think you can still have probabilities sum to 1: probability 1 would be the theoretical limit of probability reaching infinite certitude. Just like you can integrate over the entire real line, i.e., from -∞ to ∞, even though those numbers don't actually exist.
i didn't get it.
Easy: it's a demonstration of how you can never be certain that you haven't made an error even on the things you're really sure about.
It's a cheap, dirty demonstration, but one nevertheless.
Cumulant - can you state, with infinite certainty, that no-one will ever run faster than light?
Another way to think about probabilities of 0 and 1 is in terms of code length.
Shannon told us that if we know the probability distribution of a stream of symbols, then the optimal code length for a symbol X is: l(X) = -log p(X)
If you consider that an event has zero probability, then there's no point in assigning a code to it (codespace is a conserved quantity, so if you want to get short codes you can't waste space on events that never happen). But if you think the event has zero probability, and then it happens, you've got a problem - system crash or som...
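To make the code-length picture concrete, here is a minimal Python sketch. It assumes the logarithm is base 2, so lengths come out in bits (the comment above leaves the base unspecified), and the function name code_length_bits is only illustrative.

```python
import math

def code_length_bits(p):
    """Ideal code length in bits for a symbol of probability p: -log2(p)."""
    if p == 0:
        # A zero-probability symbol gets no codespace, i.e. no finite codeword.
        return math.inf
    return -math.log2(p)

for p in (0.5, 0.25, 0.001, 0.0):
    print(p, code_length_bits(p))
```

The infinite length for p = 0 is the "system crash" case in miniature: if that symbol ever does show up, there is no codeword to emit for it.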
Brent,
From what I understood on reading the Wikipedia article on Bayesian probability and inferring from how he writes (and correct me if I'm wrong), Eliezer is talking about your "subjective probability." You are a being, have consciousness, and interpret input as information. Given a lot of this information, you've formed an idea that 7 is prime. You've also formed an idea that other people exist, and that the sky is blue, which also have a high subjective probability in your mind because you have a lot of direct information to sustain that ...
I agree with cumulant. The mathematical subject of probability is based on measure theory, which loses a ton of convergence theorems if we exclude 0 and 1. We can agree that things that are not known a priori can't have probability 0 or 1, but I think we must also agree that "an impossible thing will happen soon" has probability 0, because it's a contradiction. An alternate universe in which the number 7 (in the same kind of number system as ours, etc.) is prime is damn-near inconceivable, but an alternate universe in which impossible things are ...
Speaking of measure theory, what probability should we assign to a uniformly distributed random real number on the interval [0, 1] being rational? Something bigger than 0? Maybe in practice we would never hold a uniform distribution over [0, 1] but would assign greater probability to "special" numbers (like, say, 1/2). But regardless of our probability distribution, there will exist subsets of [0, 1] to which we must assign probability 0.
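For the uniform case, the standard covering argument spells out why the answer is exactly 0. In this LaTeX sketch, the enumeration q_n, the intervals I_n, and ε are notation introduced here for illustration, not something taken from the comment:

```latex
\[
\mathbb{Q}\cap[0,1] = \{q_1, q_2, q_3, \dots\}, \qquad
I_n = \left(q_n - \frac{\varepsilon}{2^{n+1}},\; q_n + \frac{\varepsilon}{2^{n+1}}\right),
\]
\[
P(X \in \mathbb{Q}) \;\le\; \sum_{n=1}^{\infty} P(X \in I_n)
\;\le\; \sum_{n=1}^{\infty} \frac{\varepsilon}{2^{n}} \;=\; \varepsilon
\quad \text{for every } \varepsilon > 0,
\qquad \text{so } P(X \in \mathbb{Q}) = 0 .
\]
```

The argument only uses countable additivity and the fact that the rationals can be listed one by one, which is why it works for any countable set.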
The only way I can see around this is to refuse to talk about infinite (or at least uncountable) sets. Are there others?
I suspect Eliezer would object to my post claiming that I'm confusing map and territory, but I don't think that's fair. If there's a map you're trying to use all over the place (and you do seem to), then I claim it makes no sense to put a little region on the map labelled "maybe this map doesn't make any sense at all". If the map seems to make sense and you're still following it for everything, you'll have to ignore that region anyway. So is it really reasonable to claim that "the probability that probability makes sense is <1"?
Utili...
If there's a map you're trying to use all over the place (and you do seem to), then I claim it makes no sense to put a little region on the map labelled "maybe this map doesn't make any sense at all". If the map seems to make sense and you're still following it for everything, you'll have to ignore that region anyway.
Janos, are you saying that it is in fact impossible that your map in fact doesn't make any sense? Because I do, indeed, have a little section of my map labelled "maybe this map doesn't make any sense at all", and every now and then, I think about it a little, because there are so many fundamental premises of which I am unsure even in their definitions. (E.g: "the universe exists", and "but why?") Just because this area of my map drops out of my everyday decision theory due to failure to generate coherent advice on preferences, does not mean it is absent from my map. "You must ignore" or rather "You should usually ignore" is decision theory, and probability theory should usually be firewalled off from preferences.
Computable numbers are the largest countable class I know of.
Either all countable sets are the ...
I can admit the possibility that probability doesn't work, but not have to do anything about it. If probability doesn't work and I can't make rational decisions, I can expect to be equally screwed no matter what I do, so it cancels out of the equation.
The definable real numbers are a countable superset of the computable ones, I think. (I haven't studied this formally or extensively.)
If you don't want to assume the existence of certain propositions, you're asking for a probability theory corresponding to a co-intuitionistic variant of minimal logic. (Cointuitionistic logic is the logic of affirmatively false propositions, and is sometimes called Popperian logic.) This is a logic with false, or, and (but not truth), and an operation called co-implication, which I will write a <-- b.
Take your event space L to be a distributive lattice (with ordering <), which does not necessarily have a top element, but does have dual relative pseud...
If the map seems to make sense and you're still following it for everything, you'll have to ignore that region anyway.
Just cos it's not a very nice place to visit, doesn't mean it ain't on the map. ;)
"1, 2, and 3 are all integers, and so is -4. If you keep counting up, or keep counting down, you're bound to encounter a whole lot more integers. You will not, however, encounter anything called "positive infinity" or "negative infinity", so these are not integers."
This bothered me; more to the point, it hit on some stuff I've been thinking about. I realize I don't have a very good way to precisely state what I mean by "finite" or "eventually".
The above, for instance, basically says "if infinity is no...
Eliezer:
I'm not sure what an "infinite set atheist" is, but it seems from your post that you use different notions of probability than what I think of as standard modern measure theory, which surprises me. Utilitarian's example of a uniform r.v. on [0, 1] is perfect: it must take some value in [0, 1], but for all x it takes value x with probability 0. Clearly you can't say that for all x it's impossible for the r.v. to take value x, because it must in fact take one of those values. But the probabilities are still 0. Pragmatically the way this com...
Eliezer:
I am curious as to why you asked Peter not to repeat his stunt.
Also, I would really like to know how confident you are in your infinite set atheism and for that matter in your non-standard philosophy of mathematics attitudes in general.
Regarding infinite set atheism:
Is the set of "possible landing sites of a struck golf ball" finite or infinite?
In other words, can you finitely parameterize locations in space? Physicists normally model "position" as n-tuples of real numbers in a coordinate system; if they were forced to model position discretely, what would happen?
I can claim to see an infinite set each time I use a ruler...
Doug S., I believe according to quantum mechanics the smallest unit of length is Planck length and all distances must be finite multiples of it.
Eliezer:
I should mention that I'm also an infinite set atheist.
You've mentioned this before, and I have always wondered: what does this mean? Does it mean that you don't believe there are any infinite sets? If so, then you have to believe that a mathematician who claims the contrary (and gives the standard proof) is making a mistake somewhere. What is it?
Frankly, even if you actually are a finitist (which I find hard to imagine), it doesn't seem relevant to this discussion: every argument you have presented could equally well have been given by someone who accepts standard mathematics, including the existence of infinite sets.
The nature of 0 & 1 as limit cases seems to be fascinating for the theorists. However, in terms of 'Overcoming Bias', shouldn't we be looking at more mundane conceptions of probability? EY's posts have drawn attention to the idea that the amount of information needed to add additional certainty on a proposition increases exponentially while the probability increases linearly. This says that in utilitarian terms, not many situations will warrant chasing the additional information above 99.9% certainty (outside technical implementations in nuclear phys...
Doug S., I believe according to quantum mechanics the smallest unit of length is Planck length and all distances must be finite multiples of it.
Not in standard quantum mechanics. Certain of the many unsupported hypotheses of quantum gravity (such as Loop Quantum Gravity) might say something similar to this, but that doesn't abolish every infinite set in the framework. The total number of "places where infinity can happen" in modern models has tended to increase, rather than decrease, over the centuries, as models have gotten more complex...
I think Eliezer's "infinite set atheism" is a belief that infinite sets, although well-defined mathematically, do not exist in the "real world"; in other words, that any physical phenomenon that actually occurs can be described using a finite number of bits. (This can include numbers with infinite decimal expansions, as long as they can be generated by a finitely long computer program. Therefore, using pi in equations is not prohibited, because you're using the symbol "pi" to represent the program, which is finite.)
A consequen...
What do you mean by "infinite set atheism"? You are essentially stating that you don't believe in mathematical limits -- because that is one of the major consequences of infinite sets (or sequences).
Janos is spot on about measure zero not implying impossibility. What is the probability of a golf ball landing at any exact point? Zero. ...
What is the probability of a golf ball landing at any exact point? Zero.
Wrong.
I don't know which is more painful: Eliezer's errors, or those of his detractors.
Perhaps you could clarify what exactly is an infinite set atheist in a full post...or maybe it's only worth a comment.
Cumulant, I think the idea behind "infinite set atheism" is not that limits don't exist, but that infinities are acceptable only as limits approached in a specified way. On this view, limits are not a consequence of infinite sets, as you contend; rather, only the limit exists, and the infinite set or sequence is merely a sloppy way of thinking about the limit.
Eliezer, I'll second Matthew's suggestion above that you write a post on infinite set atheism; it looks as if we don't understand you.
I think I understand the motive for rejecting infin...
Caledonian: Not wrong. Take the field you're swinging at to be a plane. There are infinitely many points in that plane; that's just the density of the reals.
Now say there is some probability density of landing spots; and, let's say no one spot is special in that it attracts golf balls more than points immediately nearby (i.e. our pdf is continuous and non-atomic). Right there, you need every point (as a singleton) to have measure 0.
Go pick up Billingsley: measure 0 is not the same as impossible nor does it cause any problems.
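A small Python sketch of the same point, under a deliberately simplified assumption: a one-dimensional landing position X that is uniform on [0, 1] rather than a continuous density over a plane. The function name prob_within and the target point 1/3 are illustrative only.

```python
from fractions import Fraction

def prob_within(x, half_width):
    """P(|X - x| <= half_width) for X uniform on [0, 1]: length of the overlap."""
    lo = max(Fraction(0), x - half_width)
    hi = min(Fraction(1), x + half_width)
    return max(Fraction(0), hi - lo)

x = Fraction(1, 3)
for k in range(1, 6):
    # Shrink the window around x by a factor of ten each step.
    print(k, prob_within(x, Fraction(1, 10**k)))

# The single point itself: a window of width zero.
print(prob_within(x, Fraction(0)))
```

Every window of positive width has positive probability, yet the point on its own gets measure 0, and the ball still has to land somewhere.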
Take the field you're swinging at to be a plane. There are infinitely many points in that plane; that's just the density of the reals.
And the location that the ball lands on will also be composed of infinitely many reals. Shall we compare the size of two infinite sets?
I'd say that the ball is a sphere and consider the first point of impact (i.e. the tangency point of the plane to the sphere). Otherwise, you need to know a lot about the ball and the field where it lands.
You can compare infinite sets. Take the sets A and B, A={1,2,3,...} and B={2,3,4,...}. B is, by construction, a subset of A. There's your comparison; yet, both are infinite sets.
What assumptions would you make for the golf ball and the field? (To keep things clear, can we define events and probabilities separately?)
Caledonian, every undergraduate who has ever taken a statistics class knows that the probability of any single point in a continuous distribution is zero. Probabilities in continuous space are measured on intervals. Basic calculus...
I believe according to quantum mechanics the smallest unit of length is Planck length and all distances must be finite multiples of it.
This is what I'm given to understand as well. Doesn't this take the teeth out of Zeno's paradox?
Pragmatically the way this comes out is that "probability 0" doesn't imply impossible.
Janos, would you agree that P=0 is a probability to the same degree that infinity is a number? Apologies for double post.
Caledonian, every undergraduate who has ever taken a statistics class knows that the probability of any single point in a continuous distribution is zero.
Gowder, everyone who's ever given the issue more than three-seconds'-thought knows that no statistical result ever involves a single point.
Usually, if a die lands on edge we say it was a spoiled throw and do it over. Similarly if a Dark Lord writes 37 on the face that lands on top, we complain that the Dark Lord is spoiling our game and we don't count it.
We count 6 possibilities for a 6-sided die, 5 possibilities for a 5-sided die, 2 possibilities for a 2-sided die, and if you have a die with just one face -- a spherical die -- what's the chance that face will come up?
I think it would be interesting to develop probability theory with no boundaries, with no 0 and 1. It works fine to do it the way it's done now, and the alternative might turn up something interesting too.
Ben:
Well, that depends on your number system. For some purposes +infinity is a very useful value to have. For instance if you consider the extended nonnegative reals (i.e. including +infinity) then every measurable nonnegative extended-real-valued function on a measure space actually has a well-defined extended-nonnegative-real-valued integral. There are all kinds of mathematical structures where an infinity element (or many) is indispensable. It's a matter of context. The question of what is a "number" is I think very vague given how many intere...
I think it would be interesting to develop probability theory with no boundaries, with no 0 and 1. It works fine to do it the way it's done now, and the alternative might turn up something interesting too.
You might want to check out Kosko's Fuzzy Thinking. I haven't gone any further into fuzzy logic, yet, but that sounds like something he discussed. Also, he claimed probability was a subset of fuzzy logic. I intend to follow that up, but there is only one of me, and I found out a long time ago that they can write it faster than I can read it.
"On some golf courses, the fairway is readily accessible, and the sand traps are not. The green is either."
Haha, very nice CGD. Shows how much those philosophers of language know about golf. :-)
Although... hmm... interesting. I think that gives us a way to think about another probability 1 statement: statements that occupy the entire logical space. Example: "either there are probability 1 statements, or there are not probability 1 statements." That statement seems to be true with probability 1...
Disallowing a symbol for "all events" breaks the definition of a probability space. It's probably easier to allow extended reals and break some field axioms than to figure out how to do rigorous probability without a sigma-algebra.
When re-working this into a book, you need to double check your conversions of log odds into decibels. By definition, decibels are calculated using log base 10, but some of your odds are natural logarithms, which confused the heck out of me when reading those paragraphs.
Probability .0001 = -40 decibels (This is the only correct one in this post, all "decibel" figures afterwards are listed as 10 * the natural logarithm of the odds.)
Probability 0.502 = 0.035 decibels
Probability 0.503 = 0.052 decibels
Probability 0.9999 = 40 decibels
Probability 0.99999 = 50 decibels
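These conversions are easy to re-derive. A minimal Python sketch, assuming the usual definition of decibels as 10 times the base-10 logarithm of the odds (the function name decibels is just for illustration):

```python
import math

def decibels(p):
    """Log odds of probability p in decibels: 10 * log10(p / (1 - p))."""
    return 10 * math.log10(p / (1 - p))

for p in (0.0001, 0.502, 0.503, 0.9999, 0.99999):
    print(p, round(decibels(p), 3))
```

Swapping math.log10 for math.log reproduces the natural-logarithm mix-up described above.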
P.S. It'd be nice if you provided an RSS feed for the comments on a post, in addition to the RSS feed for the posts...
I cannot begin to imagine where those numbers came from. Dangers of "Posted at 1:58 am", I guess. Fixed.
P(A&B)+P(A&~B)+P(~A&B)+P(~A&~B)=1
Isn't the "1" above a probability?
My intuition as a mathematician declares that nobody will ever develop an elegant mathematical formulation of probability theory that does not allow for statements that are logically impossible or certain, such as statements of the form p AND NOT p. And it is necessary, if the theory is to be isomorphic to the usual one, that these statements have probability 0 (if impossible) or 1 (if certain). However, I believe that it is quite reasonable to declare, as a condition demanded of any prior deemed rational, that only truly impossible or certain statements...
As Perplexed points out this is usually known as Cromwell's_rule.
I'm kinda surprised that it's only been mentioned once in the comments (I only just discovered this site, really really great, by the way) and one from 2010 at that, but it seems to me that "a magical symbol to stand for "all possibilities I haven't considered" " does exist: the symbol "~" (i.e. not). Even the commenter who does mention it makes things complicated for himself: P(Q or ~Q)=1 is the simplest example of a proposition with probability 1.
The proposition is of course a tautology. I do think (but I'm not sure) that th...
For any state of information X, we have P(A or not A | X) = 1 and P(A and not A | X) = 0. We have to have 0 and 1 as probabilities for probability theory even to work. I think you're taking a reasonable idea -- that P(A | X) should be neither 0 nor 1 when A is a statement about the concrete physical world -- and trying to apply it beyond its applicable domain.
Consider the set of all possible hypotheses. This is a countable set, assuming I express hypotheses in natural language. It is potentially infinite as well, though in practice a finite mind cannot accommodate infinitely long hypotheses. To each hypothesis, I can try to assign a probability, on the basis of available evidence. These probabilities will be between zero and one. What is the probability that a rational mind will assign at least one hypothesis the status of absolute certainty? Either this is one (there is definitely such a hypothesis), or zero (th...
Yes, 0 and 1 are not probabilities. They're truth or falseness values. It's necessary to make a third 'truth value' for things that are unprovable, and possibly a fourth for things that are untestable.
Digging up an old thread here, but an interesting point I want to bring up: a friend of mine claims that he internally assigns probability 1 (i.e. an undisprovable belief) only to one statement: that the universe is coherent. Because if not, then mnergarblewtf. Is it reasonable to say that even though no statement can actually have probability 1 if you're a true Bayesian, it's reasonable to internally establish an axiom which, if negated, would just make the universe completely stupid and not worth living in any more?
...The ("Bayesian") framework explored in these essays replaces the two Cartesian options, affirmation and denial, by a continuum of judgmental probabilities in the interval from 0 to 1, endpoints included, or -- what comes to the same thing -- a continuum of judgmental odds in the interval from 0 to infinity, endpoints included. Zero and 1 are probabilities no less than 1/2 and 99/100 are. Probability 1 corresponds to infinite odds, 1:0. That's a reason for thinking in terms of odds: to remember how momentous it may be to assign probability 1 to a
Interesting Log-Odds paper by Brian Lee and Jacob Sanders, November 2011.
"When you work in log odds, the distance between any two degrees of uncertainty equals the amount of evidence you would need to go from one to the other. That is, the log odds gives us a natural measure of spacing among degrees of confidence."
That observation is so useful and intuition friendly it probably deserves its own blog post, and a prominent place in your book.
Forgive me if this sounds condescending, but isn't saying "0 and 1 are not probabilities because they won't let you update your knowledge" basically the same as saying "you can't know something because knowing makes you unable to learn"? If we assign tautologies as having probability 1, then anything reducible to a tautology should have probability 1 (and similarly, all contradictions and things reducible to contradictions should have probability 0). For any arbitrarily large N, if you put 2 apples next to 2 apples and repeat the test N...
So you are saying that statement "0 and 1 are not probabilities" has probability of 1?
O = (P / (1 - P))
probabilities and odds are isomorphic
This is undefined for P = 1. If you claim that that function is a real-valued bijection between probabilities and odds then P = 1 doesn't work so you're begging the question. Always take care to not divide by zero.
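The division-by-zero point shows up directly if you code the transformation naively. A short Python sketch, with odds as an illustrative function name:

```python
def odds(p):
    """Odds in favour as a real number, O = P / (1 - P)."""
    return p / (1 - p)

print(odds(0.5))  # 1.0
print(odds(0.9))

try:
    odds(1.0)
except ZeroDivisionError as err:
    # P = 1 has no finite real-valued odds.
    print("undefined at P = 1:", err)
```

Whether you read that failure as "1 is not a probability" or as "the map needs an extended range that includes infinity" is exactly what is in dispute here.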
Whether or not real-world events can have a probability of 0 or 1 is a different question than "are 0 and 1 probabilities?". They most certainly are.
If I roll a die, then one of the events that can happen will happen. That's just saying that if S is my sample space, then P(S) = 1. Similarly, P(~S) = 0, which is just saying that impossible things won't happen. The former statement is an axiom in the standard mathematical treatments of the subject. These statements may be trivial, but I distrust any mathematics that can't handle trivial cases.
Rejecting 1 as a probability would be catastrophic when you're dealing with discrete spaces. If you're the sort to reject infinity, then it would follow that all pr...
This article is largely incoherent. The main justification is the abuse of an invalid transformation: y=x/(1-x) is not the bijection that he asserts it is, because it's not a function that maps [0,1] onto R. It's a function that maps [0,1] onto [0,∞] as a subset of the topological closure of R. And that's okay, but you can't say "well I don't like the topological closure of R, so I'll just use R and claim that 1 is where the problem is."
Additionally, his discussion of log odds and such is perfectly fine, but ignores the fact that there are places where you do need to have an odds of 0:1, or a log odds of negative infinity. Probability theory stops working when you throw out 0 and 1, it's as simple as that.
Even if you don't want to handle tautologies or contradictions, there are other ways to get P(X)=0 or 1. The probability that a real number chosen uniformly from the real interval [0,1] equals any particular value is 0. It has to be. It's a provable fact under ZFC, and to decide otherwise is to say that you're more attached to the idea of 0 and 1 not being probabilities than you are to the fact that mathematics is consistent. If you really believe that, well, there's absolutely nothing I have to say to you.
This is one of those situations where EY just demonstrates he knows very little mathematics.
A real mathematician got in a debate with EY over this post, and made some really good points: https://np.reddit.com/r/badmathematics/comments/2bazyc/0_and_1_are_not_probabilities_any_more_than/cj43y8k
Maybe this doesn't stand up mathematically, but I really like the intuition of log odds instead of probability. And this post explained it quite well. And the main point that you shouldn't believe in absolute certainties is still true. An ideal AI using probability theory would probably use log odds, and not have a 0 or 1.
What are the odds that the face showing is 1? Well, the prior odds are 1:5 (corresponding to the real number 1/5 = 0.20)
I'm years late to this party, and probably missing something obvious. But I'm confused by Yudkowsky's math here. Wouldn't it be more correct to say that the prior odds of rolling a 1 are 1:5, which corresponds to a probability of 1/6 or 0.1666...? If odds of 1:5 correspond to a probability of 1/5 = 0.20, that makes me think there are 5 sides to this six-sided die, each side having equal probability.
Put differently: when I think of how to ...
It's a nice analogy, but it all rests on whether infinite evidence is a thing or not, and there aren't arguments one way or the other here. (Sure, infinite evidence would mean "whatever log odds you come up with, this is even stronger", but that doesn't rule out it is a thing).
Like, how much evidence for the hypothesis "I'll perceive the die to come up a 4" does the event "Ok, die was thrown and I am perceiving it to be a 3" provide? Or how much evidence do I have of being conscious right now when I am feeling like something? I think any answer different from infinity is just playing a word game.
When you work in log odds, the distance between any two degrees of uncertainty equals the amount of evidence you would need to go from one to the other.
What is "amount of evidence" in this sentence supposed to mean? Is it the same idea as the "bits of evidence" mentioned in these posts previously?
The only way I can interpret this sentence is as a definition of "amount of evidence", but then I don't understand the point of highlighting the sentence as if it were saying something more significant.
One, two, and three are all integers, and so is negative four. If you keep counting up, or keep counting down, you’re bound to encounter a whole lot more integers. You will not, however, encounter anything called “positive infinity” or “negative infinity,” so these are not integers.
Positive and negative infinity are not integers, but rather special symbols for talking about the behavior of integers. People sometimes say something like, “5 + infinity = infinity,” because if you start at 5 and keep counting up without ever stopping, you’ll get higher and higher numbers without limit. But it doesn’t follow from this that “infinity - infinity = 5.” You can’t count up from 0 without ever stopping, and then count down without ever stopping, and then find yourself at 5 when you’re done.
From this we can see that infinity is not only not-an-integer, it doesn’t even behave like an integer. If you unwisely try to mix up infinities with integers, you’ll need all sorts of special new inconsistent-seeming behaviors which you don’t need for 1, 2, 3 and other actual integers.
Even though infinity isn’t an integer, you don’t have to worry about being left at a loss for numbers. Although people have seen five sheep, millions of grains of sand, and septillions of atoms, no one has ever counted an infinity of anything. The same with continuous quantities—people have measured dust specks a millimeter across, animals a meter across, cities kilometers across, and galaxies thousands of lightyears across, but no one has ever measured anything an infinity across. In the real world, you don’t need a whole lot of infinity.1
In the usual way of writing probabilities, probabilities are between 0 and 1. A coin might have a probability of 0.5 of coming up tails, or the weatherman might assign probability 0.9 to rain tomorrow.
This isn’t the only way of writing probabilities, though. For example, you can transform probabilities into odds via the transformation O = (P/(1 - P)). So a probability of 50% would go to odds of 0.5/0.5 or 1, usually written 1:1, while a probability of 0.9 would go to odds of 0.9/0.1 or 9, usually written 9:1. To take odds back to probabilities you use P = (O∕(1 + O)), and this is perfectly reversible, so the transformation is an isomorphism—a two-way reversible mapping. Thus, probabilities and odds are isomorphic, and you can use one or the other according to convenience.
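As a quick check on the two formulas, here is a minimal Python sketch using exact fractions. The names to_odds and to_prob are illustrative, and the test values are restricted to probabilities strictly between 0 and 1.

```python
from fractions import Fraction

def to_odds(p):
    """O = P / (1 - P)."""
    return p / (1 - p)

def to_prob(o):
    """P = O / (1 + O)."""
    return o / (1 + o)

for p in (Fraction(1, 2), Fraction(9, 10), Fraction(1, 6)):
    o = to_odds(p)
    # Going to odds and back should return the original probability.
    assert to_prob(o) == p
    print(p, "->", o, "->", to_prob(o))
```

Exact fractions are used so that the round trip can be checked with equality rather than a floating-point tolerance.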
For example, it’s more convenient to use odds when you’re doing Bayesian updates. Let’s say that I roll a six-sided die: If any face except 1 comes up, there’s a 10% chance of hearing a bell, but if the face 1 comes up, there’s a 20% chance of hearing the bell. Now I roll the die, and hear a bell. What are the odds that the face showing is 1? Well, the prior odds are 1:5 (corresponding to the real number 1/5 = 0.20) and the likelihood ratio is 0.2:0.1 (corresponding to the real number 2) and I can just multiply these two together to get the posterior odds 2:5 (corresponding to the real number 2/5 or 0.40). Then I convert back into a probability, if I like, and get (0.4/1.4) = 2/7 = ~29%.
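The bell example can be replayed step by step. This Python sketch uses exact fractions and also cross-checks the result against Bayes's Theorem in its probability form; the variable names are illustrative.

```python
from fractions import Fraction

# Prior odds for "face is 1" against "face is anything else": 1:5.
prior_odds = Fraction(1, 5)

# Likelihood ratio: P(bell | face 1) / P(bell | other face) = 0.2 / 0.1.
likelihood_ratio = Fraction(2, 10) / Fraction(1, 10)

posterior_odds = prior_odds * likelihood_ratio
posterior_prob = posterior_odds / (1 + posterior_odds)
print(posterior_odds, posterior_prob)  # 2/5 2/7

# Same answer from Bayes's Theorem written with probabilities.
p_one = Fraction(1, 6)
p_bell = Fraction(2, 10) * p_one + Fraction(1, 10) * (1 - p_one)
print(Fraction(2, 10) * p_one / p_bell)  # 2/7
```

Note that the prior odds of 1:5 correspond to a prior probability of 1/6; the real number 1/5 is the odds themselves, not the probability.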
So odds are more manageable for Bayesian updates—if you use probabilities, you’ve got to deploy Bayes’s Theorem in its complicated version. But probabilities are more convenient for answering questions like “If I roll a six-sided die, what’s the chance of seeing a number from 1 to 4?” You can add up the probabilities of 1/6 for each side and get 4/6, but you can’t add up the odds ratios of 0.2 for each side and get an odds ratio of 0.8.
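A quick check of the claim about adding, again as a sketch with illustrative variable names: summing probabilities over the four faces gives the right answer, while summing the per-face odds does not give the odds of the combined event.

```python
from fractions import Fraction

face_prob = Fraction(1, 6)
face_odds = face_prob / (1 - face_prob)      # 1/5, i.e. 0.2 per face

# Probabilities of disjoint outcomes add.
prob_1_to_4 = 4 * face_prob                  # 4/6, which reduces to 2/3

# The correct odds of "1 to 4" come from converting that probability.
true_odds = prob_1_to_4 / (1 - prob_1_to_4)

# Naively adding the four per-face odds gives something else.
naive_sum = 4 * face_odds

print(prob_1_to_4, true_odds, naive_sum)     # 2/3 2 4/5
```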
Why am I saying all this? To show that “odds ratios” are just as legitimate a way of mapping uncertainties onto real numbers as “probabilities.” Odds ratios are more convenient for some operations, probabilities are more convenient for others. A famous proof called Cox’s Theorem (plus various extensions and refinements thereof) shows that all ways of representing uncertainties that obey some reasonable-sounding constraints end up isomorphic to each other.
Why does it matter that odds ratios are just as legitimate as probabilities? Probabilities as ordinarily written are between 0 and 1, and both 0 and 1 look like they ought to be readily reachable quantities; after all, it’s easy to see 1 zebra or 0 unicorns. But when you transform probabilities into odds ratios, 0 goes to 0, but 1 goes to positive infinity. Now absolute truth doesn’t look like it should be so easy to reach.
A representation that makes it even simpler to do Bayesian updates is the log odds; this is how E. T. Jaynes recommended thinking about probabilities. For example, let’s say that the prior probability of a proposition is 0.0001, which corresponds to a log odds of around -40 decibels. Then you see evidence that seems 100 times more likely if the proposition is true than if it is false. This is 20 decibels of evidence. So the posterior log odds are around -40 dB + 20 dB = -20 dB, that is, the posterior probability is ~0.01.
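The decibel arithmetic can be reproduced in a few lines. This sketch assumes the convention used here, log odds in decibels = 10 · log10(odds); the helper names `prob_to_db` and `db_to_prob` are hypothetical.

```python
import math


def prob_to_db(p: float) -> float:
    """Log odds of p in decibels: 10 * log10(p / (1 - p))."""
    return 10.0 * math.log10(p / (1.0 - p))


def db_to_prob(db: float) -> float:
    """Invert prob_to_db: recover the odds, then the probability."""
    odds = 10.0 ** (db / 10.0)
    return odds / (1.0 + odds)


prior_db = prob_to_db(0.0001)            # close to -40 dB
evidence_db = 10.0 * math.log10(100.0)   # a 100:1 likelihood ratio is 20 dB
posterior_db = prior_db + evidence_db    # the update is a plain addition
posterior = db_to_prob(posterior_db)     # close to 0.01

print(round(prior_db, 2), round(posterior_db, 2), round(posterior, 4))
```

In this form a Bayesian update is a single addition, which is what makes the representation convenient.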
When you transform probabilities to log odds, 0 goes to negative infinity and 1 goes to positive infinity. Now both infinite certainty and infinite improbability seem a bit more out-of-reach.
In probabilities, 0.9999 and 0.99999 seem to be only 0.00009 apart, so that 0.502 is much further away from 0.503 than 0.9999 is from 0.99999. To get to probability 1 from probability 0.99999, it seems like you should need to travel a distance of merely 0.00001.
But when you transform to odds ratios, 0.502 and 0.503 go to 1.008 and 1.012, and 0.9999 and 0.99999 go to 9,999 and 99,999. And when you transform to log odds, 0.502 and 0.503 go to 0.03 decibels and 0.05 decibels, but 0.9999 and 0.99999 go to 40 decibels and 50 decibels.
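These figures are easy to recompute. The following sketch, with illustrative helper names `odds` and `decibels`, prints each probability in all three representations and then measures the two gaps in decibels.

```python
import math


def odds(p: float) -> float:
    """Odds corresponding to probability p."""
    return p / (1.0 - p)


def decibels(p: float) -> float:
    """Log odds of p, in decibels."""
    return 10.0 * math.log10(odds(p))


for p in (0.502, 0.503, 0.9999, 0.99999):
    print(f"p = {p}: odds = {odds(p):.3f}, log odds = {decibels(p):.2f} dB")

# Gaps measured in log odds rather than in raw probability.
gap_middle = decibels(0.503) - decibels(0.502)   # a few hundredths of a dB
gap_tail = decibels(0.99999) - decibels(0.9999)  # roughly 10 dB
print(f"middle gap: {gap_middle:.3f} dB, tail gap: {gap_tail:.3f} dB")
```

In raw probability the middle gap is the larger one (0.001 against 0.00009); in log odds the tail gap is larger by a factor of several hundred.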
When you work in log odds, the distance between any two degrees of uncertainty equals the amount of evidence you would need to go from one to the other. That is, the log odds gives us a natural measure of spacing among degrees of confidence.
Using the log odds exposes the fact that reaching infinite certainty requires infinitely strong evidence, just as infinite absurdity requires infinitely strong counterevidence.
Furthermore, all sorts of standard theorems in probability have special cases if you try to plug 1s or 0s into them—like what happens if you try to do a Bayesian update on an observation to which you assigned probability 0.
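Here is a sketch of two such special cases, using a hypothetical helper `bayes_update` for a single binary hypothesis. Returning `None` for the undefined case is my own convention for this illustration, not a standard one.

```python
from typing import Optional


def bayes_update(prior: float, p_e_given_h: float,
                 p_e_given_not_h: float) -> Optional[float]:
    """Posterior P(H | E) by Bayes's Theorem, or None when P(E) is 0."""
    p_e = prior * p_e_given_h + (1.0 - prior) * p_e_given_not_h
    if p_e == 0.0:
        # Conditioning on an observation assigned probability 0 is 0/0.
        return None
    return prior * p_e_given_h / p_e


# An ordinary prior moves when evidence arrives.
print(bayes_update(0.5, 0.2, 0.1))      # about 0.667

# Priors of exactly 1 or 0 do not move, however strong the evidence.
print(bayes_update(1.0, 0.001, 0.999))  # 1.0
print(bayes_update(0.0, 0.999, 0.001))  # 0.0

# An observation that was assigned probability 0 leaves no defined posterior.
print(bayes_update(1.0, 0.0, 0.5))      # None
```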
So I propose that it makes sense to say that 1 and 0 are not in the probabilities; just as negative and positive infinity, which do not obey the field axioms, are not in the real numbers.
The main reason this would upset probability theorists is that we would need to rederive theorems previously obtained by assuming that we can marginalize over a joint probability by adding up all the pieces and having them sum to 1.
However, in the real world, when you roll a die, it doesn’t literally have infinite certainty of coming up some number between 1 and 6. The die might land on its edge; or get struck by a meteor; or the Dark Lords of the Matrix might reach in and write “37” on one side.
If you made a magical symbol to stand for “all possibilities I haven’t considered,” then you could marginalize over the events including this magical symbol, and arrive at a magical symbol “T” that stands for infinite certainty.
But I would rather ask whether there’s some way to derive a theorem without using magic symbols with special behaviors. That would be more elegant. Just as there are mathematicians who refuse to believe in the law of the excluded middle or infinite sets, I would like to be a probability theorist who doesn’t believe in absolute certainty.
[1] I should note for the more sophisticated reader that they do not need to write me with elaborate explanations of, say, the difference between ordinal numbers and cardinal numbers. I’m familiar with the different set-theoretic notions of infinity, but I don’t see a good use for them in probability theory.