gwern comments on Thinking Bayesianically, with Lojban - Less Wrong
Comments (66)
Well, here's a first stab. We only need to cover 50-100% since English gives us negations: "unlikely" versus "likely", "improbable" versus "probable". (If we can express 60% and we want to express 40%, we can just negate whatever we say for 60%.) Going by the above scheme, the 50-100% range requires 13 modifiers. If I swap the >99% for 99%, which I don't think is very useful, I still need 13 or so. For infinity or 100%, I think it's better to signal a discontinuity by using a pair like "certain"/"impossible" (neg infinity or 0%), since they aren't probabilities. I originally had the range start at 50%, but then it was pointed out that negation was funny there, so I realized I had to make that a special word as well, along with 0% and 100%. Half-way through, if I switch from "likely" to "probable" I can reuse the previous modifiers, so I only need to think of 5-6 modifiers.
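A minimal sketch of the folding trick described above, in Python. The five modifiers, the equal 5-point bands, and the special word for 50% are placeholders invented for illustration, not the scheme itself; only the "certain"/"impossible" pair, the negated stems, and the switch from "likely" to "probable" half-way through come from the comment.

```python
# Hypothetical modifiers and band widths, chosen only to illustrate the idea.
MODIFIERS = ["slightly", "somewhat", "quite", "very", "extremely"]
# (positive, negated) stems: "likely" covers the lower half of 50-100%,
# "probable" the upper half, so the same modifiers are reused.
STEMS = [("likely", "unlikely"), ("probable", "improbable")]


def verbal(p: float) -> str:
    """Map a probability in [0, 1] to a verbal label."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    # Special words for the discontinuities and for the midpoint.
    if p == 0.0:
        return "impossible"
    if p == 1.0:
        return "certain"
    if p == 0.5:
        return "even odds"  # hypothetical special word for 50%
    negate = p < 0.5
    q = 1.0 - p if negate else p  # fold everything into (0.5, 1)
    # Work in integer tenths of a percent to avoid float edge cases,
    # then split (50%, 100%) into ten bands of 5 points each.
    band = min((round(q * 1000) - 500) // 50, 9)
    positive, negative = STEMS[band // 5]
    stem = negative if negate else positive
    return f"{MODIFIERS[band % 5]} {stem}"


print(verbal(0.62))  # a label for 62%
print(verbal(0.38))  # the same modifier with the negated stem
```

Only the upper half needs a table; anything below 50% is looked up through its complement and then negated, which is why 5 or so modifiers are enough.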
It turns out that this is a really hard balancing act. I think I'm roughly satisfied with:
I'll think about this for a while more, and then I think I'll go through gwern.net and try to rationalize all uses of informal probability to use this scheme. I'm calibrated, so I might as well get some mileage out of it!
Translating numerical probabilities into verbal labels has been an active area of research. As an entry point into that literature, see the review article Teigen & Brun (2003, Verbal expressions of uncertainty and probability).
You might want to take a look at some of the other attempts out there to try to come up with labels that are more intuitive (I see "likely" and "probable" as equivalent, which would make this system where "somewhat probable" > "extremely likely" very unintuitive for me). Teigen and Brun cite several attempts which "have been made to construct standard lists of verbal expressions, where each phrase is coordinated with an appropriate numeric probability (Beyth-Marom, 1982; Hamm, 1991; Tavana, Kennedy & Mohebbi, 1997; Renooij & Witteman, 1999)." The full citations for those 4 papers are:
Beyth-Marom, R. (1982). How probable is probable? A numerical translation of verbal probability expressions. Journal of Forecasting, 1, 257–269.
Hamm, R. M. (1991). Selection of verbal probabilities: A solution for some problems of verbal probability expression. Organizational Behavior and Human Decision Processes, 48, 193–223.
Tavana, M., Kennedy, D. T. & Mohebbi, B. (1997). An applied study using the analytic hierarchy process to translate common verbal phrases to numerical probabilities. Journal of Behavioral Decision Making, 10, 133–150.
Renooij, S. & Witteman, C. (1999). Talking probabilities: Communicating probabilistic information with words and numbers. International Journal of Approximate Reasoning, 22, 169–194.
Kesselman's thesis suggests this mapping: Kesselman List of Estimative Words
I find the middle phrasing entirely unsatisfactory ("possible" is an obvious replacement), and the chunking is a little crude, but I do agree it should be impossible for most people to get the relative rankings wrong and invert any pairs. Not sure if it's better or not; I need to read some of your cites, although the review's various PDF homes are all dead right now. (EDIT: the book is available though.)
It seems to me that the way we talk about possibilities in English has a component of downside risk that isn't well captured by that ranking, especially at the extremes. I might comfortably refer to a 15% chance of losing twenty dollars as "remote", but the same certainly wouldn't be true of a 15% chance of losing my life.
"Almost Certain" is missing and "Highly Likely" and "Higly Unlikely" have the wrong numbers. It should be:
"Possible" seems to have two distinct meanings. The first one fits your usage, but the other is more of a binary expression, used to express the fact that something is not impossible. In other words, anything whose probability is equal or greater than 1% (say) can be tagged with "possible", and using this sense of "possible" for the 46-55% range seems wrong - it would deserve a stronger word. To avoid the risk of confusion about which sense is meant, I suggest using something like "entirely possible".
To me, 'entirely possible' doesn't convey around 50-50; so why bother sticking in an entire other word?
Notes from Teigen & Brun:
The cached HTML of the review is available.
After reading through those cited papers, I think the Kesselman scale is still the best of the suggestions and simpler than my own suggestion. I guess I'll just use that in the future. I've made some flashcards to help me memorize them.
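For anyone who would rather look the labels up than memorize them, here is a rough Python sketch of the scale as a lookup table. The band boundaries and the wording of the middle label are an assumption based on how the Kesselman list is commonly reproduced; the thread itself only confirms the 46-55% middle band and that 15% falls under "remote", so check the numbers against the thesis before relying on them. The names `KESSELMAN_BANDS` and `estimative_word` are made up for this example.

```python
# Assumed bands (percent, inclusive); verify against Kesselman's thesis.
KESSELMAN_BANDS = [
    (1, 15, "remote"),
    (16, 30, "highly unlikely"),
    (31, 45, "unlikely"),
    (46, 55, "chances a little better (or less)"),
    (56, 70, "likely"),
    (71, 85, "highly likely"),
    (86, 99, "almost certain"),
]


def estimative_word(percent: int) -> str:
    """Return the estimative word for a whole-number percentage from 1 to 99."""
    for low, high, word in KESSELMAN_BANDS:
        if low <= percent <= high:
            return word
    # 0% and 100% are deliberately outside the scale.
    raise ValueError("percent must be a whole number from 1 to 99")


print(estimative_word(15))
print(estimative_word(50))
```

The table is deliberately coarse: seven non-overlapping bands, so the relative ranking of any two labels is hard to get wrong.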