gwern comments on Thinking Bayesianically, with Lojban - Less Wrong

11 Post author: DataPacRat 24 January 2012 06:47PM

Comments (66)

Comment author: gwern 30 May 2012 02:03:09AM *  2 points [-]

Well, here's a first stab. We only need to cover 50-100%, since English gives us negations: "unlikely" versus "likely", "improbable" versus "probable". (If we can express 60% and we want to express 40%, we can just negate whatever we say for 60%.) Going by the above scheme, the 50-100% range requires 13 modifiers. If I replace the >99% with 99%, which I don't think is very useful, I need 13 or so. For infinity or 100%, I think it's better to signal a discontinuity by using a pair like "certain"/"impossible" (the latter for negative infinity or 0%), since they aren't probabilities. I originally had the range start at 50%, but then it was pointed out that negation was funny there, so I realized I had to make that a special word as well, along with 0% and 100%. Halfway through, if I switch from "likely" to "probable", I can reuse the previous modifiers, so I only need to think of 5-6 modifiers.

It turns out that this is a really hard balancing act. I think I'm roughly satisfied with:

dB    %       odds   English
-∞    0%      1:∞    impossible
0     50%     1:1    possible
∞     100%    ∞:1    certain
1     55.7%   5:4    likely
2     61.3%   3:2    somewhat likely
3     66.6%   2:1    quite likely
4     71.5%   5:2    very likely
5     76.0%   3:1    highly likely
6     80.0%   4:1    extremely likely
7     83.3%   5:1    probable
8     86.3%   6:1    somewhat probable
9     88.8%   8:1    quite probable
10    90.9%   10:1   very probable
13    95.2%   20:1   highly probable
20    99%     99:1   extremely probable
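
For anyone who wants to check or extend the table, here is a minimal sketch of the conversion. It assumes the decibel column means 10 * log10(odds), which matches rows such as 10 dB = 10:1 = 90.9%; the function names are made up for illustration. Small last-digit differences from the table are expected where the table rounds the odds to a simple ratio.

```python
import math


def db_to_probability(db: float) -> float:
    """Convert decibels of evidence to a probability.

    Assumes db = 10 * log10(odds), with odds = p / (1 - p).
    """
    # The infinite endpoints are handled explicitly, since inf / (1 + inf) is NaN.
    if db == math.inf:
        return 1.0
    if db == -math.inf:
        return 0.0
    odds = 10 ** (db / 10)
    return odds / (1 + odds)


def probability_to_db(p: float) -> float:
    """Inverse of db_to_probability for p in [0, 1]."""
    if p <= 0:
        return -math.inf
    if p >= 1:
        return math.inf
    return 10 * math.log10(p / (1 - p))


if __name__ == "__main__":
    for db in (0, 1, 2, 3, 4, 5, 10, 13, 20):
        print(f"{db:>3} dB  {100 * db_to_probability(db):5.1f}%")
```

The negation trick falls out of the same formula: probability_to_db(0.4) is the negative of probability_to_db(0.6), so a phrase for +N dB can be negated to cover -N dB.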

I'll think about this for a while more, and then I think I'll go through gwern.net and try to rationalize all uses of informal probability to use this scheme. I'm calibrated, so I might as well get some mileage out of it!

Comment author: Unnamed 30 May 2012 05:22:05AM 6 points [-]

Translating numerical probabilities into verbal labels has been an active area of research. As an entry point into that literature, see the review article Teigen & Brun (2003, Verbal expressions of uncertainty and probability).

You might want to take a look at some of the other attempts out there to try to come up with labels that are more intuitive (I see "likely" and "probable" as equivalent, which would make this system where "somewhat probable" > "extremely likely" very unintuitive for me). Teigen and Brun cite several attempts which "have been made to construct standard lists of verbal expressions, where each phrase is coordinated with an appropriate numeric probability (Beyth-Marom, 1982; Hamm, 1991; Tavana, Kennedy & Mohebbi, 1997; Renooij & Witteman, 1999)." The full citations for those 4 papers are:

Beyth-Marom, R. (1982). How probable is probable? A numerical translation of verbal probability expressions. Journal of Forecasting, 1, 257–269.
Hamm, R. M. (1991). Selection of verbal probabilities: A solution for some problems of verbal probability expression. Organizational Behavior and Human Decision Processes, 48, 193–223.
Tavana, M., Kennedy, D. T. & Mohebbi, B. (1997). An applied study using the analytic hierarchy process to translate common verbal phrases to numerical probabilities. Journal of Behavioral Decision Making, 10, 133–150.
Renooij, S. & Witteman, C. (1999). Talking probabilities: Communicating probabilistic information with words and numbers. International Journal of Approximate Reasoning, 22, 169–194.

Comment author: gwern 31 May 2012 05:03:58PM *  1 point [-]

Kesselman's thesis suggests this mapping: Kesselman List of Estimative Words

  • 100% Certainty
  • 86-99% Highly Likely
  • 56-70% Likely
  • 46-55% Chances a Little Better [or Less]
  • 31-45% Unlikely
  • 13-30% Highly Unlikely
  • 1-15% Remote
  • 0% Impossibility

I find the middle phrasing entirely unsatisfactory ("possible" is an obvious replacement), and the chunking is a little crude, but I do agree it should be impossible for most people to get the relative rankings wrong and invert any pairs. Not sure if it's better or not; I need to read some of your cites, although the review's various PDF homes are all dead right now. (EDIT: the book is available, though.)

Comment author: Nornagest 16 September 2013 02:49:07AM 3 points [-]

It seems to me that the way we talk about possibilities in English has a component of downside risk that isn't well captured by that ranking, especially at the extremes. I might comfortably refer to a 15% chance of losing twenty dollars as "remote", but the same certainly wouldn't be true of a 15% chance of losing my life.

Comment author: Bobertron 16 September 2013 01:41:38AM *  1 point [-]

"Almost Certain" is missing, and "Highly Likely" and "Highly Unlikely" have the wrong numbers. It should be:

  • 100% Certainty
  • 86-99% Almost Certain
  • 71-85% Highly Likely
  • 56-70% Likely
  • 46-55% Chances a Little Better [or Less]
  • 31-45% Unlikely
  • 16-30% Highly Unlikely
  • 1-15% Remote
  • 0% Impossibility
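
A minimal lookup sketch for the corrected list above. The list only gives whole-percent ranges, so how to treat values like 15.5% or 99.5% is an assumption here: each range's upper bound is used as an inclusive threshold, and only exactly 0 and exactly 100 get the two endpoint labels. The function name is hypothetical.

```python
from bisect import bisect_left

# Inclusive upper bounds (in percent) of the ranges from "Remote" up to "Highly Likely".
_UPPER_BOUNDS = [15, 30, 45, 55, 70, 85]
_PHRASES = [
    "Remote",
    "Highly Unlikely",
    "Unlikely",
    "Chances a Little Better [or Less]",
    "Likely",
    "Highly Likely",
    "Almost Certain",  # everything above 85% and below 100%
]


def kesselman_phrase(percent: float) -> str:
    """Map a probability in percent (0 to 100) to a phrase from the list."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    # Assumption: the endpoint labels apply only to exactly 0% and 100%.
    if percent == 0:
        return "Impossibility"
    if percent == 100:
        return "Certainty"
    # bisect_left finds the first upper bound that is >= percent.
    return _PHRASES[bisect_left(_UPPER_BOUNDS, percent)]


if __name__ == "__main__":
    for value in (0, 10, 16, 50, 70, 86, 100):
        print(value, kesselman_phrase(value))
```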

Comment author: Kaj_Sotala 25 June 2012 01:09:30PM 0 points [-]

I find the middle phrasing entirely unsatisfactory ("possible" is an obvious replacement)

"Possible" seems to have two distinct meanings. The first one fits your usage, but the other is more of a binary expression, used to express the fact that something is not impossible. In other words, anything whose probability is equal to or greater than 1% (say) can be tagged with "possible", and using this sense of "possible" for the 46-55% range seems wrong: that range would deserve a stronger word. To avoid the risk of confusion about which sense is meant, I suggest using something like "entirely possible".

Comment author: gwern 25 June 2012 02:31:14PM 0 points [-]

To me, 'entirely possible' doesn't convey around 50-50; so why bother sticking in an entire other word?

Comment author: gwern 31 May 2012 06:48:50PM 0 points [-]

Notes from Teigen & Brun:

The recurrent findings in these studies are (1) a reasonable degree of between-group consistency, combined with (2) a high degree of within-group variability. In other words, mean estimates of “very probable”, “doubtful” and “improbable” are reasonably similar from study to study, supporting the claim that probability words are translatable; but, at the same time, the interindividual variability of estimates is large enough to represent a potential communication problem. If, for instance, the doctor tells the patient that a cure is “possible”, she may mean a 5 per cent chance, but it may be interpreted to mean a 70 per cent chance, or vice versa. This variability is typically underestimated by the participants themselves. Brun and Teigen (1988) asked medical doctors to specify a range within which would fall 90 per cent of other doctors’ interpretations. This interval included on the average (for 14 verbal phrases) less than 65 per cent of the actual individual estimates. Amer, Hackenbrack and Nelson (1994) found that auditors’ 90 per cent ranges included, on average, only 56 per cent of the individual estimates (for 23 phrases). In other words, the problem posed by interindividual variability appears to be aggravated by a low degree of variability awareness.

...several attempts have been made to construct standard lists of verbal expressions, where each phrase is coordinated with an appropriate numeric probability (Beyth-Marom, 1982; Hamm, 1991; Tavana, Kennedy & Mohebbi, 1997; Renooij & Witteman, 1999)

...Verbal phrases are, furthermore, parts of ordinary language, and thus sensitive to conversational implicatures. So I may say that a particular outcome is somewhat uncertain, not because I think it has a low probability of occurring, but because I want to modify some actual, imagined or implied belief in its occurrence. Such modifications can go in two directions, either upwards or downwards on the probability scale. Verbal probability expressions can accordingly be categorised as having a positive or a negative directionality. They determine whether attention should be directed to the attainment or the non-attainment of the target outcome, and, in doing so, they have the ability to influence people’s judgments and decisions in an unambiguous way. Words may be denotatively vague, but they are argumentatively precise. If you tell me that success is “possible”, I know I am being encouraged, even if I do not know whether you have a probability of 30 per cent or of 70 per cent in mind. If you say it is “not certain”, I know I am advised to be careful and to think twice. But if you tell me there is a 45 per cent probability I will not know what to think. The information is precise, but its pragmatic meaning is undecided. Do you mean uncertainty (I have only a 45 per cent chance) or possibility (at least I have a 45 per cent chance)? Likelihood or doubt? Or both?

Comment author: Unnamed 31 May 2012 06:21:21PM 0 points [-]

The cached HTML of the review is available.

Comment author: gwern 06 June 2012 09:58:29PM *  0 points [-]

After reading through those cited papers, I think the Kesselman scale is still the best of the suggestions and simpler than my own suggestion. I guess I'll just use that in the future. I've made some flashcards to help me memorize the ranges.