I found Marr's levels highly helpful when trying to think about this area. YMMV. Marr's levels also correspond to Aristotle's four causes if we do as Marr does and split the algorithmic level into 'representation' and 'traversal.'
Here are my initial thoughts -- I may want to add to or revise them if I have some more time to think about this.
I think the big distinction between risk and uncertainty is perhaps more about how we manage them than about how well we can assess the likelihood of the events. However, that assessment plays a very large part in how we manage them, so it becomes part of the definitions and classifications.
So, yes, I do still see these two as categorically different, rather than merely differing in degree.
At least when I think about managing risk, not only am I assessing some outcome/payoff but I am also thinking about it over time -- it is an iterated setting where I can say I have reasonably good knowledge (some observations or frequencies, an understanding of the process and its key parameters and what values might be seen). For this type of setting, think about a casino and the house rules. The house sets up the rules so it has something like a 3% edge over all the customers. The house has no idea how any given bet will pay off, but over all the bets across the whole year, it can be pretty sure it keeps 3% of all money put down during the year.
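To make that concrete, here's a minimal simulation sketch (the numbers are illustrative: an even-money game in which the house wins 51.5% of bets, for a 3% expected edge):

```python
import random

def house_take(num_bets, stake=100.0, p_house=0.515, seed=0):
    """Realised house edge (as a fraction of money wagered) over num_bets
    even-money bets. A 51.5% win rate gives an expected edge of
    2 * 0.515 - 1 = 3% of each stake."""
    rng = random.Random(seed)
    profit = sum(stake if rng.random() < p_house else -stake
                 for _ in range(num_bets))
    return profit / (num_bets * stake)

# Over a handful of bets the realised edge bounces around wildly,
# but over a year's worth of bets it hugs the 3% expectation:
for n in (10, 1_000, 1_000_000):
    print(n, round(house_take(n), 4))
```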
We can step away from that very controlled setting to one where we don't get to make the rules we want but are still in an iterative setting. The confidence interval for the outcome over time widens somewhat, or a lot, but the general approach can stay the same. You're going to evaluate the setting, come up with your expected outcome, and make your decision.
But let's now think about the case of a gambler. What if he can get the same edge the house was getting? Is he really in the same risk-management situation as the casino? I think that depends. We might know what the probabilities are and the shape of the distribution, but we don't really know how many times we need to play before our sampling starts to reflect that distribution -- statistics gives us some ideas, but that too has a random element. The gambler has to decide whether he has the budget to make it through far enough to take advantage of the underlying probabilities -- that is, to take advantage of "managing the risk".
If the gambler cannot figure that out, or knows for a fact there are insufficient funds, do those probabilities really provide useful information on what to expect? To me this is then uncertainty. The gambler simply doesn't get the opportunity to repeat and so get the expected return. In this type of situation, perhaps rather than trying to calculate all the odds, wagers, and pay-offs, a simple rule is better if someone wants to gamble.
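The bankroll point can be illustrated with a quick gambler's-ruin simulation (all numbers illustrative): give the gambler the same 3% edge the house enjoys, but only a small budget, and he still usually goes broke before the long run can rescue him:

```python
import random

def ruin_rate(bankroll, stake, p_win, num_rounds, trials=10_000, seed=0):
    """Fraction of simulated gamblers who go broke before completing
    num_rounds even-money bets, despite a favourable edge (p_win > 0.5)."""
    rng = random.Random(seed)
    busts = 0
    for _ in range(trials):
        money = bankroll
        for _ in range(num_rounds):
            money += stake if rng.random() < p_win else -stake
            if money < stake:  # can no longer cover the next bet
                busts += 1
                break
    return busts / trials

# Same 51.5% win probability as the house, but only five bets' worth of budget:
print(ruin_rate(bankroll=500, stake=100, p_win=0.515, num_rounds=1_000))
```

With that budget, a majority of the simulated gamblers bust, so the positive expected value never gets a chance to show up in their results.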
I think this is a situation similar to the old saying about investing in markets (particularly on the short side): the market can stay irrational a lot longer than you can stay solvent. The trading world is also interesting here, if you have never looked into it. Traders are largely rule-following animals -- not to say that they don't do a lot of analysis, but they never let their analysis lock them into a loss. I think financial markets are a great setting where one sees both risk and uncertainty.
I don't think the above is a comprehensive definition of uncertainty or risk (much more so with risk) but hopefully helps tease out at least some aspects of the difference.
(As per usual, my comments are intended not to convince but to outline my thinking, and potentially have holes poked in it. I wouldn't be willing to spend time writing as many paragraphs as I do if I thought there was 0 chance I'd end up learning something new as a result!)
I don't think either the gambling or the market analogy really shows that the risk-uncertainty distinction makes sense in categorical terms, or that it's useful. I think they actually show a large collection of small, different issues, which means my explanation of why I think that may be a bit messy.
The house sets up the rules so it has something like a 3% edge over all the customers. The house has no idea how any given bet will pay off, but over all the bets across the whole year, it can be pretty sure it keeps 3% of all money put down during the year.
I think this is true, but that "something like" and "pretty sure" are doing a lot of the work here. The house can't be absolutely certain that there's a 3% edge, for a whole range of reasons - e.g., there could be card-counters at some point, the house's staff may go against their instructions in order to favour their friends or some pretty women, the house may have somehow simply calculated this wrong, or something more outlandish like Eliezer's dark lords of the Matrix. Like with my points in the post, in practice, I'd be happy making my bets as if these issues weren't issues, but they still prevent absolute certainty.
I don't think you were explicitly trying to say that the house does have absolute certainty (especially given the following paragraph), so that's sort of me attacking a straw man. But I think the typical idea of the distinction being categorical has to be premised on absolute certainty, and I think you may still sort-of be leaning on that idea in some ways, so it seems worth addressing that idea first.
But let's now think about the case of a gambler. What if he can get the same edge the house was getting? Is he really in the same risk-management situation as the casino? I think that depends. We might know what the probabilities are and the shape of the distribution, but we don't really know how many times we need to play before our sampling starts to reflect that distribution -- statistics gives us some ideas, but that too has a random element. The gambler has to decide whether he has the budget to make it through far enough to take advantage of the underlying probabilities -- that is, to take advantage of "managing the risk".
If the gambler cannot figure that out, or knows for a fact there are insufficient funds, do those probabilities really provide useful information on what to expect? To me this is then uncertainty. The gambler simply doesn't get the opportunity to repeat and so get the expected return.
I think what's really going on here in your explicit comment is:
So it's very easy to reach the reasonable-seeming conclusion that gambling is unwise even if there's positive expected value in dollar terms, without leaning on the idea of a risk-uncertainty distinction (to be honest, even in terms of degrees - we don't even need to talk about the sizes of the confidence intervals, in this case).
I also think there are perhaps two more things implicitly going on in that analogy, which aren't key points but might slightly nudge one's intuitions:
In this type of situation, perhaps rather than trying to calculate all the odds, wagers, and pay-offs, a simple rule is better if someone wants to gamble.
I think that's likely true, but I think that's largely because of a mixture of the difficulty of computing the odds for humans (it's just time consuming and we're likely to make mistakes), the likelihood that the gambler will be overconfident so he should probably instead adopt a blanket heuristic to protect him from himself, and the fact that being broke is way worse than being rich is good. (Also, in realistic settings, because the odds are bad anyway - they pretty much have to be, for the casino to keep the lights on - so there's no point calculating; we already know which side of the decision-relevant threshold the answer must be on.) I don't think there's any need to invoke the risk-uncertainty distinction.
And finally, regarding the ideas of iterating and repeating - I think that's really important, in the sense that it gives us a lot more, very relevant data, and shifts our estimates towards the truth and reduces their uncertainty. But I think on a fundamental level, it's just evidence, like any other evidence. Roughly speaking, we always start with a prior, and then update it as we see evidence. So I don't think there's an absolute difference between "having an initial guess about the odds and then updating based on 100 rounds of gambling" and "having an initial guess about the odds and then updating based on realising that the casino has paid for this massive building, all these staff, etc., and seem unlikely to make enough money for that from drinks and food alone". (Consider also that you're never iterating or repeating exactly the same situation.)
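The "prior plus evidence" framing can be sketched with a toy Beta-Binomial update (the counts are made up for illustration):

```python
# Start with a uniform Beta(1, 1) prior on the gambler's win probability,
# then update on 100 observed rounds. Other evidence - like noticing the
# casino's enormous operating costs - would shift the estimate in the same
# "prior plus evidence" fashion, just encoded with different numbers.
a, b = 1, 1
wins, losses = 60, 40
a, b = a + wins, b + losses
posterior_mean = a / (a + b)
print(posterior_mean)  # 61/102, roughly 0.598
```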
Most of what I've said here leaves open the possibility that the risk-uncertainty distinction - perhaps even imagined as categorical - is a useful concept in practice (though I tentatively argue against that here). But it seems to me that I still haven't encountered an argument that it actually makes sense as a categorical division.
Michael, do some research on the way casinos work. The casino owners don't gamble on their income. Here is a link to consider: https://www.quora.com/How-do-casinos-ultimately-make-money
My point about iterations is not about getting better estimates for the probabilities. The probabilities are known, defined quantities, in the argument. The difference is that in some settings one will have the luxury of iterating and so be able to actually average the results towards that expected value. If you can iterate an infinite number of times, your results converge to that expected value.
It seems one implication of your position is that people should be indifferent between the following two settings, where the expected payoff is the same:
1) They toss a fair coin as many times as they want. If they get heads, they will receive $60, if they get tails they pay $50.
2) They can have the same coin, and same payoffs but only get one toss.
Do you think most people's decisions will be the same? If not, how do you explain the difference?
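For what it's worth, the difference between the two settings is easy to quantify. Assuming the +$60/-$50 payoffs above, the chance of finishing behind is 50% after one toss but far smaller after many:

```python
from math import comb

def p_behind(n, win=60, lose=50):
    """Probability of ending below the starting bankroll after n fair-coin
    tosses that pay +win on heads and -lose on tails."""
    return sum(comb(n, h) for h in range(n + 1)
               if win * h - lose * (n - h) < 0) / 2**n

print(p_behind(1))    # exactly 0.5
print(p_behind(100))  # roughly 0.18
```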
The casino owners don't gamble on their income.
Maybe this is a matter of different definitions/connotations of "gamble". Given that the odds are in the casino's favour, and that they can repeat/iterate the games a huge number of times, the results do indeed tend to converge to the expected value, which is in the casino's favour - I'm in total agreement there. The odds that they'd lose out, given those facts, are infinitesimal and negligible for pretty much all practical purposes. But it's that those odds asymptotically approach zero, not that they literally are zero.
It seems very similar to the case of entropy:
The Second Law of Thermodynamics is statistical in nature, and therefore its reliability arises from the huge number of particles present in macroscopic systems. It is not impossible, in principle, for all 6 × 10²³ atoms in a mole of a gas to spontaneously migrate to one half of a container; it is only fantastically unlikely—so unlikely that no macroscopic violation of the Second Law has ever been observed.
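To put a rough number on "fantastically unlikely": if each atom is independently equally likely to be in either half of the container, the log-probability is straightforward to compute:

```python
from math import log10

n_atoms = 6e23                   # one mole
log10_p = n_atoms * log10(0.5)   # all atoms in the same half simultaneously
print(log10_p)  # about -1.8e23, i.e. a probability of roughly 10**(-1.8e23)
```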
But in any case, it seems your key point there, which I actually agree with, is that the deal is better for the casino (partly) because they get to play the odds more often than an individual gambler does, so the value they actually get is more likely to be close to the expected value than the individual gambler's is. But I think the reason this is better is because of the diminishing marginal utility of money - losing all your money is way worse than doubling it is good - and not because of the risk-uncertainty distinction itself.
(Though there could be relevant interplays between the magnitude of one's uncertainty and the odds one ends up in a really bad position, which might make one more inclined to avoid "bets" of any kind when the uncertainty is greater. But again, it's helpful to consider whether you're thinking about expected utility or expected value of some other unit, and it also seems unnecessary to use a categorical risk-uncertainty distinction.)
It seems one implication of your position is that people should be indifferent between the following two settings, where the expected payoff is the same:
1) They toss a fair coin as many times as they want. If they get heads, they will receive $60, if they get tails they pay $50.
2) They can have the same coin, and same payoffs but only get one toss.
Do you think most people's decisions will be the same? If not, how do you explain the difference?
Regarding whether I think people's decisions will be the same, I think it's useful to make clear the distinction between descriptive and normative claims. As I say in footnote 1:
Additionally, it’s sometimes unclear whether proponents of the distinction are merely arguing (a) that people perceive such a distinction, so it’s useful to think about and research it in order to understand how people are likely to think and behave, or are actually arguing (b) that people should perceive such a distinction, or that such a distinction “really exists”, “out there in the world”. It seems to me that (a) is pretty likely to be true, but wouldn’t have major consequences for how we rationally should make decisions when not certain. Thus, in this post I focus exclusively on (b).
So my position doesn't really directly imply anything about what people will decide. It's totally possible for the risk-uncertainty distinction to not "actually make sense" and yet still be something that economists, psychologists, etc. should be aware of as something people believe in or act as if they believe in. (Like how it's useful to study biases or folk biology or whatever, to predict behaviours, without having to imagine that the biases or folk biology actually reflect reality perfectly.) But I'd argue that such researchers should make it clear when they're discussing what people do vs when they're discussing what they should do, or what's rational, or whatever.
(If your claims have a lot to do with what people actually think like, rather than normative claims, then we may be more in agreement than it appears.)
But as for what people should do in that situation, I think my position doesn't imply people should be indifferent to that, because getting diminishing marginal utility from money doesn't conflict with reality.
In the extreme version of that situation, if someone starts with $150 as their entire set of assets, and takes bet 2, then there's a 50% chance they'll lose a third of everything they own. That's really bad for them. The 50% chance they win $60 could plausibly not make up for that.
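A toy expected-utility calculation makes this concrete, using log utility as a common stand-in for diminishing marginal utility (the choice of log is illustrative, not essential to the point):

```python
from math import log

assets = 150.0  # the person's entire starting wealth

ev_dollars = 0.5 * 60 - 0.5 * 50                     # +$5 in dollar terms
eu_bet     = 0.5 * log(assets + 60) + 0.5 * log(assets - 50)
eu_decline = log(assets)

print(ev_dollars)           # 5.0: positive expected value in dollars
print(eu_bet > eu_decline)  # False: in utility terms, one toss is a bad deal
```

With log utility, losing a third of everything outweighs the somewhat larger dollar gain, so declining the single toss is the better choice even though the bet's dollar expected value is positive.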
If the same person takes bet 1, the odds that they end up worse off go down, because, as you say, the actual results will tend to converge towards the (positive in dollar terms) expected value as one gets more trials/repetitions.
So it seems to me that it's reasonable to see bet 1 as better than bet 2 (depending on an individual's utility function for money and how much money they currently have), but that this doesn't require imagining a categorical risk-uncertainty distinction.
For the most part, you seem to spend a lot of time trying to discover whether terms like "unknown probability" and "known probability" make sense. Yet those are language artifacts which, like everything in language, are merely the use of a classification algorithm as a means to communicate abstractions. Each class represents primarily its dominating modes, but becomes increasingly useless at the margins. As such, you yourself create a false dichotomy by discussing whether these terms are useful or not by showing that at the border they might fail: they do fail, and they're both useful and not useful at the border, depending on the exact threshold. In fact, you even start by discussing whether the border can be objectively defined: the answer is obviously no, since it is language. You then try to use other words to make the point, doing your analysis a second time on the new words and discussing whether the classification systems for those words are perfect (i.e. zero, plausible, realistic, etc.). I think that in reality you are missing the point entirely.
These quotes and definitions use imagery to teach the reader the basic classification system (i.e. the words), by proposing initial but vague boundaries. Then, based on the field and experience, the reader further refines this classification to better match the group's (i.e. the experts') definition. Reviews of a given set of risks and uncertainties are then about discussing a) whether the different experts are calibrated in terms of classification thresholds; and b) if they feel they are sufficiently calibrated, whether the probabilities and impacts have been properly and sufficiently assessed or not (here too, these vague words are based on a given group's standards).
For example, in software engineering, plans generally include a risks section, which describes various unknowns, their probability, and their impact. Each of these is quantified as (for example) High, Medium, Low, or Unknown. This is simply a double layer of subjective but group-agreed-upon classification of words, meant to communicate the overall probability that the project will hit the date at the expected cost. It is based on experience (i.e. the internal model of the author). During the review process, other leaders and engineers then comment, based on their experience, on whether a specified risk is properly assessed. These thresholds can be very context-specific (i.e. to a team, a company, or the industry). This is no different in public policy (i.e. the risks and impact of global warming).
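As a sketch, such a risk section amounts to a small, group-agreed data structure (the entries and labels here are entirely hypothetical):

```python
# Double layer of subjective classification: each unknown gets a
# probability label and an impact label, both defined by team convention.
risk_register = [
    {"risk": "third-party API deprecates the endpoints we depend on",
     "probability": "Medium", "impact": "High"},
    {"risk": "key engineer unavailable during the launch window",
     "probability": "Low", "impact": "Medium"},
    {"risk": "regulatory change affects data-retention requirements",
     "probability": "Unknown", "impact": "High"},
]

# Reviewers then argue about the labels, not about decimal places:
flagged = [r["risk"] for r in risk_register
           if r["impact"] == "High" and r["probability"] != "Low"]
print(flagged)
```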
In other words, I think that you are trying to analyze the problem objectively by making each assertion absolute (i.e. a probability is known or unknown, etc.), while in fact the problem is one of pure communication, rather than one of objective truth or logic. So you get caught in a rabbit hole, as you are essentially re-discovering the limitations of language and classification systems rather than actually discussing the problem at hand. And the initial problem statement in your analysis (i.e. certainty as risk+uncertainties) is arbitrary, and your logic could have been applied to any definition or concept.
Whether the idea of uncertainty+risk is the proper tool can essentially only be analyzed empirically - by comparing it, for example, to another method used in a given field, and evaluating whether method A or B improves the ability of planners to predict dates and costs (in software engineering, for example).
In other words, I think it's more useful to think of those definitions as an algorithm (perhaps ML): certainty ~ f(risk, uncertainty); and the definitions provided of the driving factors as initial values. The users can then refine their threshold to improve the model's prediction capability over time, but also as a function of the class of problems (i.e. climate vs software).
I think I agree with substantial parts of both the spirit and specifics of what you say. And your comments have definitely furthered my thinking, and it's quite possible I'd now write this quite differently, were I to do it again. But I also think you're perhaps underestimating the extent to which risk vs uncertainty very often is treated as an absolute dichotomy, with substantial consequences. I'll now attempt to lay out my thinking in response to your comments, but I should note that my goal isn't really to convince you of "my side", and I'd consider it a win to be convinced of why my thinking is wrong (because then I've learned something, and because that which can be destroyed by the truth should be, and all that).
For the most part, you seem to spend a lot of time trying to discover whether terms like "unknown probability" and "known probability" make sense. Yet those are language artifacts which, like everything in language, are merely the use of a classification algorithm as a means to communicate abstractions. Each class represents primarily its dominating modes, but becomes increasingly useless at the margins.
From memory, I think I agreed with basically everything in Eliezer's sequence A Human's Guide to Words. One core point from that seems to closely match what you're saying:
The initial clue only has to lead the user to the similarity cluster—the group of things that have many characteristics in common. After that, the initial clue has served its purpose, and I can go on to convey the new information "humans are currently mortal", or whatever else I want to say about us featherless bipeds.
A dictionary is best thought of, not as a book of Aristotelian class definitions, but a book of hints for matching verbal labels to similarity clusters, or matching labels to properties that are useful in distinguishing similarity clusters.
And it's useful to have words to point to clusters in thingspace, because it'd be far too hard to try to describe, for example, a car on the level of fundamental physics. So instead we use labels and abstractions, and accept there'll be some fuzzy boundaries and edge cases (e.g., some things that are sort of like cars and sort of like trucks).
One difference worth noting between that example and the labels "risk" and "uncertainty" is that risk and uncertainty are like two different "ends" or "directions" of a single dimension in thingspace. (At least, I'd argue they are, and it's possible that that has to be based on a Bayesian interpretation of probability.) So here it seems to me it'd actually be very easy to dispense with having two different labels. Instead, we can just have one for the dimension as a whole (e.g., "trustworthy", "well-grounded", "resilient"; see here), and then use that in combination with "more", "less", "extremely", "hardly at all", etc., and we're done.
We can then very clearly communicate the part that's real (that reflects the territory) from when we tried to talk about "risk" and "uncertainty", without confusing ourselves into thinking that there's some sharp line somewhere, or that it's obvious a different strategy would be needed in "one case" than in "the other". This is in contrast to the situation with cars, where it'd be much less useful to say "more car-y" or "less car-y" - do we mean along the size dimension, as compared to trucks? On the size dimension, as compared to mice? On the "usefulness for travelling in" dimension? On the "man made vs natural" dimension? It seems to me that it's the high dimensionality of thingspace that means labels for clusters are especially useful and hard to dispense with - when we're talking about two "regions" or whatever of a single dimension, the usefulness of separate labels is less clear.
That said, there are clearly loads of examples of using two labels for different points along a single dimension. E.g., short and tall, heavy and light. This is an obvious and substantial counterpoint to what I've said above.
But it also brings me to really my more central point, which is that people who claim real implications from a risk-uncertainty distinction typically don't ever talk about "more risk-ish" or "more Knightian" situations, but rather just situations of "risk" or of "uncertainty". (One exception is here.) And they make that especially clear when they say things like that we "completely know" or "completely cannot know" the probabilities, or we have "zero" knowledge, or things like that. In contrast, with height, it's often useful to say "short" or "tall", and assume a shared reference frame that makes it clear roughly what we mean by that (e.g., it's different for buildings than for people), but we also very often say things like "more" or "less" tall, "shorter", etc., and we never say "This person has zero height" or "This building is completely tall", or the like, except perhaps to be purposefully silly.
So while I agree that words typically point to somewhat messy clusters in thingspace, I think there are a huge number of people who don't realise (or agree with) that, and who seem to truly believe there's a clear, sharp distinction between risk and uncertainty, and who draw substantial implications from that (e.g., that we need to use methods other than expected value reasoning, as discussed in this post, or that we should entirely ignore possibilities we can't "have" probabilities about, an idea which the quote from Bostrom & Cirkovic points out the huge potential dangers of).
So one part of what you say that I think I do disagree with, if I'm interpreting you correctly, is "These quotes and definitions are using imagery to teach readers the basic classification system (i.e. the words) to the reader, by proposing initial but vague boundaries." I really don't think most writers who endorse the risk-uncertainty distinction think that that's what they're doing; I think they think they're really pointing to two cleanly separable concepts. (And this seems reflected in their recommendations - they don't typically refer to things like gradually shifting our emphasis from expected value reasoning to alternative approaches, but rather using one approach when we "have" probabilities and another when we "don't have probabilities", for example.)
And a related point is that, even though words typically point to somewhat messy clusters in thingspace, some words can be quite misleading and do a poor job of marking out meaningful clusters. This is another point Eliezer makes:
Any way you look at it, drawing a boundary in thingspace is not a neutral act. Maybe a more cleanly designed, more purely Bayesian AI could ponder an arbitrary class and not be influenced by it. But you, a human, do not have that option. Categories are not static things in the context of a human brain; as soon as you actually think of them, they exert force on your mind. One more reason not to believe you can define a word any way you like.
A related way of framing this is that you could see the term "Knightian uncertainty" as sneaking in connotations that this is a situation where we have to do something other than regular expected value reasoning, or where using any explicit probabilities would be foolish and wrong. So ultimately I'm sort-of arguing that we should taboo the terms "risk" (used in this sense) and "Knightian uncertainty", and just speak in terms of how uncertain/resilient/trustworthy/whatever a given uncertainty is (or how wide the confidence intervals or error bars are, or whatever).
But what I've said could be seen as just indicating that the problem is that advocates of the risk-uncertainty distinction need to read the sequences - this is just one example of a broader problem, which has already been covered there. This seems similar to what you're saying with:
So you get caught in a rabbit hole, as you are essentially re-discovering the limitations of language and classification systems rather than actually discussing the problem at hand. And the initial problem statement in your analysis (i.e. certainty as risk+uncertainties) is arbitrary, and your logic could have been applied to any definition or concept.
I think there's something to this, but I still see the risk-uncertainty distinction proposed in absolute terms, even on LessWrong and the EA Forum, so it seemed worth discussing it specifically. (Plus possibly the fact that this is a one-dimensional situation, so it seems less useful to have totally separate labels than it is in many other cases like with cars, trucks, tigers, etc.)
But perhaps even if it is worth discussing that specifically, I should've more clearly situated it in the terms established by that sequence - using some of those terms, perhaps changing my framing, adding some links. I think there's something to this as well, and I'd probably do that if I were to rewrite this.
And something I do find troubling is the possibility that the way I've discussed these problems leans problematically on terms like "absolute, binary distinction", which should really be tabooed and replaced by something more substantive. I think that the term "absolute, binary distinction" is sufficiently meaningful to be ok to be used here, but it's possible that it's just far, far more meaningful than the term "Knightian uncertainty", rather than "absolutely" more meaningful. As you can probably tell, this particular point is one I'm still a bit confused about, and will have to think about more.
And the last point I'll make relates to this:
Whether the idea of uncertainty+risk is the proper tool can essentially only be analyzed empirically - by comparing it, for example, to another method used in a given field, and evaluating whether method A or B improves the ability of planners to predict dates and costs (in software engineering, for example).
This is basically what my next post will do. It focuses on whether, in practice, the concept of a risk-uncertainty distinction is useful, whether or not it "truly reflects reality" or whatever. So I think that post, at least, will avoid the issues you perceive (at least partially correctly, in my view) in this one.
I'd be interested in your thoughts on these somewhat rambly thoughts of mine.
Update: I've now posted that "next post" I was referring to (which gets into whether the risk-uncertainty distinction is a useful concept, in practice).
Overview
We’re often forced to make decisions under conditions of uncertainty. This may be empirical uncertainty (e.g., what is the likelihood that nuclear war would cause human extinction?), moral uncertainty (e.g., does the wellbeing of future generations matter morally?), or one of a number of other types of uncertainty.
But what do we really mean by “uncertainty”?
So what are we really talking about - risk, or (Knightian) uncertainty? What is such a distinction meant to mean? Does such a distinction make sense? What significance might this distinction have for how we reason and make decisions given a lack of certainty? And what about unknown unknowns, black swans, and incomplete models?
These are the questions I discuss in this post, arriving at the following claims:
The risk-uncertainty distinction is usually not adequately specified; it lets a lot of the “work” be done by ambiguous phrases such as whether probabilities are “known” or “exact”.[1]
Proponents of the risk-uncertainty distinction usually seem to discuss it as if it's an absolute, binary distinction, or fundamental dichotomy; as if in some cases we really do "know" (or "really can estimate", or whatever) the probabilities of interest, while in other cases we really can't at all. To be clear, the alternative to this view is the idea that:
That’s a false dichotomy; no absolute, binary distinction can be made.
Thinking that there is such a binary distinction can lead to using strange and suboptimal decision-making procedures.
This post discusses each of these four claims in turn. I close by considering how unknown unknowns (or black swans, or incomplete models) fit into this picture.
This post doesn’t address the idea that, as a practical or heuristic matter, it might be useful to act as if there’s a risk-uncertainty distinction. My next post will address that idea, and ultimately argue against it.
Epistemic status
The questions covered in this post are all subject to substantial debate and have received some good treatments before. (I’d particularly recommend this short post by Ozzie Gooen, this paper by Dominic Roser [behind a paywall, unfortunately], and this series of posts by Nate Soares.) I’m also not an expert on these topics. Thus, this is basically meant as a collection and analysis of existing ideas, not as anything brand new. I’d appreciate feedback or comments in relation to any mistakes, unclear phrasings, etc. (and just in general!).
My three goals in writing this were to:
What’s the distinction meant to mean?
Wikipedia captures the everyday usage of the terms uncertainty and risk:
However, among some people and in some fields (particularly business and finance), it’s common to make a quite different risk-uncertainty distinction, like the one made in the quote at the beginning of this post. This different risk-uncertainty distinction (which is the one this post will focus on) is not about whether we’re talking about the possibility of something negative.
Instead, this distinction centres on something like whether we can “have”, “know”, “express”, “estimate”, or “quantify” the probabilities of interest (or perhaps, more specifically, “believable”, “justifiable”, or “precise” probabilities). If we can, we’re facing risk (even if the potential outcomes are all positive). If we can’t, we’re facing uncertainty (or Knightian uncertainty).[2]
Here’s one way of explaining this risk-uncertainty distinction:
Here’s another similar explanation:
And here’s another explanation from Roser:
Does the distinction make sense?
What does it mean for a probability to be “known”? What does it mean for a probability to be “completely unknown”, or “not even meaningful”? Can we find a clear, sharp way of separating all probabilities into just the two categories of (a) those we can “know” and (b) those which must remain “completely unknown”?
As far as I can tell, the answer to that last question has to be “No.” This is essentially based on the following premises:
P1: The answer being “Yes” would require it being the case that:
P2: P1a is false, because we can never validly be (or at least never should be) absolutely certain of anything.
P3: P1b is false, because we essentially always have at least some basis, however incredibly flimsy, for coming to a probability estimate for something, or, failing that, can use some type of uninformative prior. (I’m less confident in this premise than in the previous ones, and also less sure how to phrase what I mean.)
I can’t offer a proof of these premises (note: this doesn’t mean that a proof is impossible, just that I don’t know of one). Instead, what I’ll do below is try to illustrate why I believe those premises by:
I claim that, in the absence of an alternative good example that does demonstrate P1a or P1b, this provides fairly strong evidence for P2 and P3, at least.
Certain knowledge of a probability?
But how do we actually know that that’s a fair die? What if someone swapped it out at the last minute? And, even if it is what we’d typically call a “fair die”, how do we know that that means the odds it’ll show a six are ⅙? What if it’s become slightly eroded - entirely accidentally - such that it has a slightly higher chance of showing a six than showing another number? What if the person throwing the die knows how to throw it such as to increase or decrease the odds it’ll land on six?
Or, as Yudkowsky puts it, with his flair for the dramatic:
None of this stops me from happily believing that the odds are ⅙ that a die I have “very strong reason to believe” is “fair” will show a six. Nor will it stop me making bets on that basis. But it seems to me to highlight that there’s at least some doubt about what probability I should assign, and thus that this example doesn’t demonstrate P1a (i.e., doesn’t demonstrate that there are any probabilities I should be absolutely certain about).
Similar arguments could be run against the idea that the chances of being wrongly diagnosed with cancer can be absolutely, certainly known. (On top of the obvious possibilities like methodological errors in relevant studies, it’s also possible that the person’s very concept of cancer itself doesn’t line up well with reality or with the concept used by the relevant studies.)
Zero knowledge of a probability?
But might P1b be true - might there be probabilities we can have absolutely no knowledge about (and for which we can’t even use something like an uninformative prior)? I don’t believe I’ve ever encountered an example of such a probability. To return to the example given earlier:
But are those situations really absolutely unique? Do we really have no (relevant) data available? What about previous viruses? What about previous military interventions? Of course, the case at hand may be very, very different, and that data may be of very little relevance; it may barely narrow things down at all, and still leave your guesses very likely to be quite inaccurate. But isn’t it something?
To see that some data we have is at least slightly relevant, consider your reaction if I told you that the number of deaths from this new virus or new military intervention (which will occur sometime in the next 20 years) would be somewhere between 0 and 1 million, rather than somewhere between 1 trillion and 1 trillion & 1 million (i.e., more than the entire population of Earth). You’d be confident saying which of those is more likely, wouldn’t you?
And I think we’d both agree that that’s not just overconfidence - you do have legitimate reasons for your judgements there.
In fact, Tetlock’s work has empirically shown that it is possible to reliably do better than chance (and better than just “between 0 and 1 million”) in predicting events very much like those, at least over spans of a few years.[3]
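One concrete way such forecasting performance is assessed is with a proper scoring rule like the Brier score. The following is my own illustrative sketch (with made-up numbers, not Tetlock’s data or code) of how graded probability judgements can measurably beat a refusal to discriminate between “allegedly unique” events:

```python
# Brier score: mean squared error between probabilistic forecasts and
# binary outcomes (0 = didn't happen, 1 = happened). Lower is better;
# always forecasting 0.5 ("it's just unknowable") scores exactly 0.25.

def brier_score(forecasts, outcomes):
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(forecasts)

outcomes = [1, 0, 1, 1, 0]  # hypothetical resolved events

# A forecaster who refuses to assign graded probabilities:
print(brier_score([0.5] * 5, outcomes))                    # 0.25

# A forecaster whose graded judgements track the outcomes:
print(brier_score([0.9, 0.2, 0.7, 0.8, 0.1], outcomes))    # 0.038
```

Tetlock’s finding, loosely put, is that the best forecasters reliably land well below the 0.25 “pure ignorance” benchmark even on events that others had declared unquantifiable.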
As with my rejection of P1a, what I’ve said is merely suggestive of P1b’s falsity - it’s still possible that there are some probabilities we can have absolutely no knowledge of. But I’ve seen various claimed examples of situations of (Knightian) uncertainty, and none have seemed to be ones in which we can have absolutely no knowledge of the probability. (Further discussion here.) And I suspect, though here I’m quite unsure, that if there were a good example, it’d still be possible to use something like an uninformative prior to deal with it, in the absence of anything better.
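To make “use something like an uninformative prior” slightly more concrete, here’s a minimal sketch, assuming the textbook Beta(1,1) (uniform) prior over an unknown event probability. The function name and numbers are my own illustration, not from the post:

```python
# Even with zero data, a uniform Beta(1,1) prior over an unknown event
# probability yields a usable point estimate, and every observation
# refines it (this is Laplace's rule of succession).

def posterior_mean(successes: int, trials: int) -> float:
    """Posterior mean of the event probability under a Beta(1,1) prior."""
    return (successes + 1) / (trials + 2)

# With no observations at all, the estimate is simply 1/2: maximal
# ignorance, yet still a probability we can reason and act with.
print(posterior_mean(0, 0))   # 0.5

# After observing the event 3 times in 10 trials, the estimate shifts.
print(posterior_mean(3, 10))  # ~0.333
```

The point isn’t that 0.5 is a deep truth about the world; it’s that “no knowledge” still cashes out as *some* probability estimate, which can then be updated, rather than as a category of proposition that admits no estimate at all.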
Altogether, I find myself fairly confident that Premises 1-3, or statements sufficiently like them, are true, and thus that it does not make sense to speak of an absolute, binary distinction between “risk” and “uncertainty”.
But why does that matter anyway? The next section discusses some decision-making procedures that people have proposed based on the idea that there is a risk-uncertainty distinction, and why these procedures seem strange and suboptimal.
Decision-making given this distinction
Many people argue that maximising expected utility, or even just using expected value reasoning or explicit probabilities (EP) at all, is impossible or inappropriate when facing (Knightian) uncertainty rather than risk.[4][5] As Roser writes: “According to a popular view, then, how we ought to make policy decisions depends crucially on whether we have probabilities.” Roser goes on to discuss three existing proposals for alternatives to standard usage of EPs when in situations of “uncertainty”, and why he rejects them; this is what I turn to now.
Principle of indifference
Roser:
Maximin
Roser:
See Roser’s paper for what I see as a convincing example of the sort of problems maximin can lead to. Further discussion of why maximin, or something quite similar, seems a flawed approach can be found in this series of posts by Soares.[6]
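To make the contrast concrete, here’s a toy sketch (my own illustrative numbers, not Roser’s or Soares’ example) of how maximin and expected-utility maximisation can recommend different actions:

```python
# Maximin picks the action with the best worst case, ignoring all
# probability information; expected-utility maximisation weights each
# outcome by its probability. Hypothetical actions and numbers.

# Each action maps to a list of (probability, utility) outcomes.
actions = {
    "safe_bet":    [(1.0, 1.0)],                    # guaranteed tiny payoff
    "good_gamble": [(0.99, 100.0), (0.01, 0.0)],    # almost surely great
}

def worst_case(outcomes):
    return min(u for _, u in outcomes)

def expected_utility(outcomes):
    return sum(p * u for p, u in outcomes)

maximin_choice = max(actions, key=lambda a: worst_case(actions[a]))
eu_choice = max(actions, key=lambda a: expected_utility(actions[a]))

print(maximin_choice)  # safe_bet: worst case 1.0 beats worst case 0.0
print(eu_choice)       # good_gamble: expected utility 99.0 beats 1.0
```

Here maximin forgoes an almost-certain large gain to avoid a 1% chance of getting nothing, because it throws away exactly the probability information it was designed to do without; that discarding of information is the sort of problem Roser’s example and Soares’ posts point to.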
Avoiding a lack of probabilities
Roser:
Further discussion of situations like Scenarios A and B, and of the ambiguity aversion involved, can be found in the abovementioned series of posts from Soares.
A quote from Bostrom and Ćirković may further highlight the lack of rationale for, and potential scale of harms from, following the principle of Avoiding a Lack of Probabilities:
Do we have a choice?
As Roser notes, despite all these issues, if there were an absolute, binary risk-uncertainty distinction, it might be preferable - or necessary - to use one of these three principles rather than a standard usage of EPs. However, as discussed above, it seems that there isn’t such a distinction. Thus, we do have a choice in the matter.
So it seems that, at least in the case of an ideal agent, it would be best to use whatever probabilities we do have, even if they’re incredibly poorly grounded, as they’d still be better than nothing. Roser writes:
My next post will discuss the more complicated matter of to what extent that argument applies in practice, for actual humans, given issues like the time costs involved in using EPs and tendencies towards overconfidence and anchoring.
Unknown unknowns
Here’s one last statement of the risk-uncertainty distinction (this one from Holden Karnofsky):
This quote makes more explicit the idea that (Knightian) uncertainty could be understood as including unknown unknowns (“missing pieces of one’s model”), rather than just “unknown” probabilities for the things one is modelling (which can be considered “known unknowns”). It could also be understood as including the related idea of “black swans”.
These concepts are sometimes ignored, sometimes implicit, and sometimes explicit in discussions of a risk-uncertainty distinction. Here I’ll explain how I think they fit into the picture. This is the section of this post where I’m least confident about both how true my beliefs are and how clearly I explain those beliefs.
I think unknown unknowns that influence the likelihood of whatever proposition we’re trying to work out the probability of are a key and common reason why the probability estimate we arrive at may be quite poorly grounded, flimsy, untrustworthy, etc. I think they’re also a key and common reason why the situation may look to some like a situation of Knightian uncertainty. But this doesn’t require fundamentally different approaches to creating or interpreting one’s probability estimates. Soares gives a useful example:
Soares notes that black swans could include possibilities like “Within 70 years, human civilization will have collapsed”, but that, obviously, that particular example is no longer a black swan for us, as we’re now considering it. The black swans are the events we haven’t even thought of. He goes on:
But what if the unknown unknown doesn’t just influence what we’re trying to get a probability estimate for, but instead it is what we’re trying to get a probability estimate for? Soares discusses this too:
That argument seems fairly sound to me. However, parts of MIRI’s more recent writings on embedded agents being smaller than their world models have seemed to me to suggest that Soares' suggestions may be insufficient. Even if that’s the case, though, I think that's a separate problem, rather than something that reveals a binary risk-uncertainty distinction. Essentially, I think it'd only show that one can't have a probability estimate for something one hasn't thought of, not that there are some propositions or evidence bases that, by their nature, fundamentally allow no probability estimates (even when one is looking at the proposition and trying to come up with an estimate).
(But again, I’m less sure of both my thinking and explanation on this, and think it’s somewhat tangential to the risk-uncertainty distinction, so I’ll leave that there.)
Closing remarks
In this post, I’ve:
But, practically speaking, for humans, could there be benefits to acting as if one believes there is such a distinction? I’ll cover this question in my next post (ultimately arguing that the answer is probably “No”).
Additionally, it’s sometimes unclear whether proponents of the distinction are merely arguing (a) that people perceive such a distinction, so it’s useful to think about and research it in order to understand how people are likely to think and behave, or are actually arguing (b) that people should perceive such a distinction, or that such a distinction “really exists”, “out there in the world”. It seems to me that (a) is pretty likely to be true, but wouldn’t have major consequences for how we rationally should make decisions when not certain. Thus, in this post I focus exclusively on (b). ↩︎
It seems unfortunate that this distinction uses familiar words in a way that’s very different from their familiar usage. As Gooen writes, after introducing these less typical meanings of risk and uncertainty:
Roser further emphasises similar points:
Tetlock: “The best forecasters [...] we find are able to make between 10 and 15 distinguished… between 10 and 15 degrees of uncertainty for the types of questions that IARPA is asking about in these tournaments like whether Brexit is going to occur or if Greece is going to leave the eurozone or what Russia is going to do in the Crimea, those sorts of things. Now, that’s really interesting because a lot of people when they look at those questions say, “Well you can’t make probability judgements at all about that sort of thing because they’re unique.”
And I think that’s probably one of the most interesting results of the work over the last 10 years. I mean, you take that objection, which you hear repeatedly from extremely smart people that these events are unique and you can’t put probabilities on them, you take that objection and you say, “Okay, let’s take all the events that the smart people say are unique and let’s put them in a set and let’s call that set allegedly unique events. Now let’s see if people can make forecasts within that set of allegedly unique events and if they can, if they can make meaningful probability judgments of these allegedly unique events, maybe the allegedly unique events aren’t so unique after all, maybe there is some recurrence component.” And that is indeed the finding that when you take the set of allegedly unique events, hundreds of allegedly unique events, you find that the best forecasters make pretty well calibrated forecasts fairly reliably over time and don’t regress too much toward the mean.” ↩︎
Note that one doesn’t necessarily have to accept maximising expected utility if one rejects the binary risk-uncertainty distinction. (E.g., one might choose a modification of maximisation of expected utility to avoid the issue of Pascal’s mugging.) Roser writes:
By the same token, the question of whether there’s a binary risk-uncertainty distinction is relevant even if one rejects maximisation of expected utility, as Roser also notes: “A lack of probabilities is not only a challenge for expected utility theory—which is known for its need for probabilities—but for all practical reasoning.” ↩︎
For example, Andreas Mogensen writes:
Indeed, it seems to me that the whole idea of cluelessness is premised on the idea of an absolute, binary risk-uncertainty distinction, or on something that’s very much like that distinction and that suffers from the same problems as that distinction does. I therefore hope to later write a post attempting to “dissolve” the problem of cluelessness, using arguments quite similar to the above. For now, part of my thoughts on the matter can be found in two comments here, or extrapolated from what I say in this post. ↩︎
In Soares’ posts, he refers to what he’s critiquing as “maximiz[ing] minimum expected utility given [one’s] Knightian uncertainty”. I believe that this is either the same as or very similar to maximin, but it’s possible I’m wrong about that. ↩︎