Map:Territory::Uncertainty:Randomness – but that doesn’t matter; value of information does.

6 Davidmanheim 22 January 2016 07:12PM

In risk modeling, there is a well-known distinction between aleatory and epistemic uncertainty, which is sometimes referred to, or thought of, as irreducible versus reducible uncertainty. Epistemic uncertainty exists in our map; as Eliezer put it, “The Bayesian says, ‘Uncertainty exists in the map, not in the territory.’” Aleatory uncertainty, however, exists in the territory. (Well, at least according to our map that uses quantum mechanics, per Bell’s Theorem – like, say, the time at which a radioactive atom decays.) This is what people call quantum uncertainty, indeterminism, true randomness, or recently (and somewhat confusingly to me) ontological randomness – referring to the fact that our ontology allows randomness, not that the ontology itself is in any way random. It may be better, in LessWrong terms, to think of uncertainty versus randomness – while being aware that the wider world refers to both as uncertainty. But does the distinction matter?

To clarify a key point, many facts that are treated as random, such as dice rolls, are actually mostly uncertain – in that with enough physics modeling and inputs, we could predict them. On the other hand, in chaotic systems, there is the possibility that “true” quantum randomness can propagate upwards into macro-level uncertainty. For example, a sphere of highly refined and shaped uranium that is *exactly* at the critical mass will set off a nuclear chain reaction, or not, based on the quantum physics of whether the neutrons from one of the first set of decays set off a chain reaction – after enough of them decay, the sphere will fall below the critical mass, and become increasingly unlikely to set off a nuclear chain reaction. Of course, the question of whether the sphere is above or below the critical mass (given its geometry, etc.) is a difficult-to-measure uncertainty, but it’s not aleatory – though some part of the question of whether it kills the guy trying to measure whether it’s just above or just below the critical mass will be random – so maybe it’s not worth finding out. And that brings me to the key point.

In a large class of risk problems, there are factors treated as aleatory – but they may be epistemic, just at a level where finding the “true” factors and outcomes is prohibitively expensive. Potentially, the timing of an earthquake that will happen at some point in the future could be determined exactly via a simulation of the relevant data. Why is it considered aleatory by most risk analysts? Well, doing this might require a destructive, currently technologically impossible deconstruction of the entire earth – making the earthquake irrelevant. We would start with measurements of the position, density, and stress of each relatively macroscopic structure, and then perform a very large physics simulation of the earth as it had existed beforehand. (We have lots of silicon from deconstructing the earth, so I’ll just assume we can now build a big enough computer to simulate this.) Of course, this is not worthwhile – but doing so would potentially show that the actual aleatory uncertainty involved is negligible. Or it could show that we need to model the macroscopically chaotic system to such high fidelity that microscopic, fundamentally indeterminate factors actually matter – and it was truly aleatory uncertainty. (So we have epistemic uncertainty about whether it’s aleatory; if our map were of high enough fidelity, and were computable, we would know.)

It turns out that most of the time, for the types of problems being discussed, this distinction is irrelevant. If we know that the value of information for determining whether something is aleatory or epistemic is negative, we can treat the uncertainty as randomness. (And usually, we can figure this out via a quick order-of-magnitude calculation: the value of perfect information about which side the dice will land on in this game is estimated at $100, building and testing / validating any model for predicting it would take me at least 10 hours, and my time is worth at least $25/hour – so it’s negative.) But sometimes, slightly improved models and slightly better data are feasible – and then it’s worth checking whether there is some epistemic uncertainty that we can pay to reduce. In fact, for earthquakes, we’re doing that – we have monitoring systems that can give several minutes of warning, and geological models that can predict, to some degree of accuracy, the relative likelihood of different sized quakes.
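The back-of-the-envelope check above takes only a few lines to write down explicitly; the dollar figures are just the illustrative assumptions from the text, not real estimates.

```python
# Quick order-of-magnitude check: is reducing this uncertainty worth it?
# All numbers are the illustrative assumptions from the text.

value_of_perfect_information = 100   # $ at stake in predicting the dice
modeling_hours = 10                  # time to build and validate a model
hourly_rate = 25                     # $ value of my time per hour

cost_of_information = modeling_hours * hourly_rate
net_value = value_of_perfect_information - cost_of_information

# Negative net value: treat the uncertainty as randomness and move on.
print(net_value)
```

If the net value comes out negative, as here, the epistemic/aleatory distinction is moot for practical purposes: treat it as randomness.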

So, in conclusion: most uncertainty is lack of resolution in our map, which we can call epistemic uncertainty. This is true even if lots of people call it “truly random” or irreducibly uncertain – or, if they are fancy, aleatory uncertainty. Some of what we assume is uncertainty is really randomness. But lots of the epistemic uncertainty can be safely treated as aleatory randomness, and value of information is what actually makes a difference. And knowing the terminology used elsewhere can be helpful.

Meetup : Finding Effective Altruism with Biased Inputs on Options - LA Rationality Weekly Meetup

1 Davidmanheim 14 January 2016 05:31AM

Discussion article for the meetup : Finding Effective Altruism with Biased Inputs on Options - LA Rationality Weekly Meetup

WHEN: 20 January 2016 07:00:00PM (-0800)

WHERE: 10850 West Pico Boulevard, Los Angeles, CA 90064; Westside Pavilion - Upstairs Wine Bar (next to the movie theater)

We're going to be discussing the general question of how to use biased information to make rational decisions, with a focus on the specific context of how to be an Effective Altruist while doing so.

The various EA nonprofits each have a claim to effective altruism, and there is lots of uncertainty about which will end up being the most effective; we can give to AMF and save lives in the near future for around $1,000 a life, or try policy interventions, with unknown effects, or perhaps we should try to prevent one of several potential tail risks that could destroy humanity in the near or far future. The experts in each area argue for their cause, and we'd love a clearer way to think about the options. Come join us as we try to find one!

Discussion article for the meetup : Finding Effective Altruism with Biased Inputs on Options - LA Rationality Weekly Meetup

Perceptual Entropy and Frozen Estimates

10 Davidmanheim 03 June 2015 07:27PM

A Preface

During the 1990s, a significant stream of research existed around how people process information, which combined very different streams in psychology and related areas with explicit predictive models about how actual cognitive processes differ from the theoretical ideal. This is not only the literature by Kahneman and Tversky about cognitive biases, but includes research about memory, perception, scope insensitivity, and other areas. The rationalist community is very familiar with some of this literature, but fewer are familiar with a masterful synthesis produced by Richards Heuer for the intelligence community in 1999[1], which was intended to start combating these problems – a goal we share. I’m hoping to put together a stream of posts based on that work, potentially expanding on it, or giving my own spin – but I encourage reading the book itself (PDF) as well[2]. (This essay is based on Chapter 3.)

This will hopefully be my first set of posts, so feedback is especially welcome, both to help me refine the ideas, and to refine my presentation.

Entropy, Pressure, and Metaphorical States of Matter

Eliezer recommends updating incrementally, but has noted that it’s hard. The central point – that it is hard to do so – is one that some in our community have experienced and explicated, but there is deep theory, which I’ll attempt to outline via an analogy, that I think explains how and why it occurs. The problem is that we are quick to form opinions and build models, because humans are good at pattern finding, but slow to discard them, due to limited mental energy. This is especially true when the pressure of evidence doesn’t shift overwhelmingly and suddenly.

I’ll attempt to answer the question of how this is true by stretching a metaphor, creating an intuition pump for thinking about how our minds might process uncertainty.

Frozen Perception

Heuer notes a stream of research about perception, observing that “once an observer has formed an image – that is, once he or she has developed a mind set or expectation concerning the phenomenon being observed – this conditions future perceptions of that phenomenon.” This seems to follow standard Bayesian practice, but in fact, as Eliezer noted, people fail to update. The following set of images, which Heuer reproduced from a 1976 book by Robert Jervis, shows exactly this point:

Impressions Resist Change - Series of line drawings transitioning between a face and a crouching woman.

Looking at each picture, starting on the left, and moving to the right, you see a face slowly change. At what point does the face no longer seem to appear? (Try it!) For me, it’s at about the seventh image that it’s clear it morphed into a sitting, bowed figure. But what if you start at the other end? The woman is still clearly there long past the point where we see a face, starting in the other direction. What’s going on?

We seem to attach too strongly to our first approach, decision, or idea. Specifically, our decision seems to “freeze” once it gets to one place, and needs much more evidence to start moving again. This has an analogue in physics, in the notion of freezing, which I think is more important than it first appears.
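For contrast, an ideal incremental updater wouldn’t freeze at all. Here is a minimal sketch – the prior and per-image likelihoods are invented purely for illustration – of a Bayesian observer re-weighing “face” versus “crouching figure” as each successively more ambiguous image arrives:

```python
def bayes_update(prior_face, likelihood_face, likelihood_figure):
    """Return the posterior probability of 'face' after one image."""
    numerator = prior_face * likelihood_face
    denominator = numerator + (1 - prior_face) * likelihood_figure
    return numerator / denominator

# Invented likelihoods: each successive image looks a bit more like the
# crouching figure, i.e. P(image | face) falls while P(image | figure) rises.
p_face = 0.5  # start indifferent between the two interpretations
for lik_face, lik_figure in [(0.9, 0.1), (0.7, 0.3), (0.5, 0.5),
                             (0.3, 0.7), (0.1, 0.9), (0.1, 0.9)]:
    p_face = bayes_update(p_face, lik_face, lik_figure)
    print(round(p_face, 3))
```

The ideal observer’s confidence rises, plateaus, and then slides smoothly back down as the evidence shifts – no extra “pressure” is needed to dislodge the first impression. That smooth trajectory is exactly what the experiments above show we fail to follow.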

Entropy

To analyze this, I’ll drop into some basic probability theory, and physics, before (hopefully) we come out on the other side with a conceptually clearer picture. First, I will note that our cognitive architecture has some way of representing theories, and implicitly assigns probabilities to various working theories. This is some sort of probability distribution over sample theories. Any probability distribution has a quantity called entropy[3], which is simply the probability of each state, multiplied by the logarithm of that probability, summed over all the states. (Each probability is less than 1, so its logarithm is negative, but we traditionally flip the sign so entropy is a positive quantity.)

Need an example? Sure! I have two dice, and they can each land on any number, 1-6. I’m assuming they are fair, so each side has probability 1/6, and the logarithm (base 2) of 1/6 is about -2.585. There are 6 states, so the total is 6 * (1/6) * 2.585 = 2.585. (With two dice, I have 36 possible combinations, each with probability 1/36; log(1/36) is -5.17, so the entropy is 5.17. You may have noticed that I doubled the number of dice involved, and the entropy doubled – because there is exactly twice as much that can happen, but the average entropy per die is unchanged.) If I only have 2 possible states, such as a fair coin, each has probability 1/2, and log(1/2) = -1, so for two states, (-0.5*-1)+(-0.5*-1)=1. An unfair coin, with a ¼ probability of tails and a ¾ probability of heads, has an entropy of 0.81. Of course, this isn’t the lowest possible entropy – a trick coin with heads on both sides has only 1 state, with entropy 0. So unfair coins have lower entropy – because we know more about what will happen.
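All of the examples above are easy to check in a few lines of code, using the entropy formula exactly as defined here:

```python
import math

def entropy(probabilities):
    """Shannon entropy in bits: -sum(p * log2(p)) over all states."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

fair_die = [1/6] * 6        # one die: ~2.585 bits
two_dice = [1/36] * 36      # two dice: exactly double, ~5.17 bits
fair_coin = [1/2, 1/2]      # 1 bit
unfair_coin = [1/4, 3/4]    # ~0.811 bits -- we know more, so lower entropy
trick_coin = [1.0]          # one certain state: 0 bits

for dist in (fair_die, two_dice, fair_coin, unfair_coin, trick_coin):
    print(round(entropy(dist), 3))
```

The `if p > 0` guard reflects the usual convention that states with zero probability contribute nothing to the sum.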

 

Freezing, Melting, and Ideal Gases under Pressure

In physics, there is a deeply related concept, also called entropy, which in the form we see on a macroscopic scale is just temperature. If you remember your high school science classes, temperature is a description of how much molecules move around. I’m not a physicist, and this is a bit simplified[4], but the entropy of an object is how uncertain we are about its state – gases expand to fill their container, and the molecules could be anywhere, so they have higher entropy than a liquid, which stays in its container, which still has higher entropy than a solid, where the molecules don’t move much, which still has higher entropy than a crystal, where the molecules are essentially locked into place.

This partially lends intuition to the third law of thermodynamics; “the entropy of a perfect crystal at absolute zero is exactly equal to zero.” In our terms above, it’s like that trick coin – we know exactly where everything is in the crystal, and it doesn’t move. Interestingly, a perfect crystal at 0 Kelvin cannot exist in nature; no finite process can reduce entropy to that point; like infinite certainty, infinitely exact crystals are impossible to arrive at, unless you started there. So far, we could build a clever analogy between temperature and certainty, telling us that “you’re getting warmer” means exactly the opposite of what it does in common usage – but I think this is misleading[5].

In fact, I think that information in our analogy doesn’t change the temperature; instead, it reduces the volume! In the analogy, gases can become liquids or solids either by lowering temperature, or by increasing pressure – which is what evidence does. Specifically, evidence constrains the set of possibilities, squeezing our hypothesis space. The phrase “weight of evidence” is now metaphorically correct; it will actually constrain the space by applying pressure.

I think that, by analogy, this explains the phenomenon we see with perception. While we are uncertain, information increases pressure, and our conceptual estimate can condense from an uncertain gaseous state to a relatively contained liquid state – not because we have less probability to distribute, but because the evidence has constrained the space over which we can distribute it. Alternatively, we can settle into a lower energy state on our own, unassisted by evidence. If our minds too quickly settle on a theory or idea, the gas settles into a corner of the available space, and if we fail to apply enough energy to the problem, our unchallenged opinion can even freeze into place.

Our mental models can be liquid, gaseous, or frozen in place – either by our prior certainty, our lack of energy required to update, or an immense amount of evidential pressure. When we look at those faces, our minds settle into a model quickly, and once there, fail to apply enough energy to re-evaporate our decision until the pressure of the new pictures is relatively immense. If we had started at picture 3 or 6, we could much more easily update away from our estimates; our minds are less willing to let the cloud settle into a puddle of probable answers, much less freeze into place. We can easily see the face, or the woman, moving between just these two images.

When we begin to search for a mental model to describe some phenomena, whether it be patterns of black and white on a page, or the way in which our actions will affect a friend, I am suggesting we settle into a puddle of likely options, and when not actively investing energy into the question, we are likely to freeze into a specific model.

What does this approach retrodict, or better, forbid?

Because our minds have limited energy, the process of maintaining an uncertain stance should be difficult. This seems to be borne out by personal and anecdotal experience, but I have not yet searched the academic literature to find more specific validation.

We should have more trouble updating away from a current model than we do arriving at that new model from the beginning. As Heuer puts it, “Initial exposure to… ambiguous stimuli interferes with accurate perception even after more and better information becomes available.” He notes that this was shown in Bruner and Potter, 1964, “Interference in Visual Recognition,” and that “the early but incorrect impression tends to persist because the amount of information necessary to invalidate a hypothesis is considerably greater than the amount of information required to make an initial interpretation.”

Potential avenues of further thought

The pressure of evidence should reduce the mental effort needed to switch models, but “leaky” hypothesis sets, where a class of model is not initially considered, should allow the pressure to metaphorically escape into the larger hypothesis space.

There is a potential for making this analogy more exact by discussing entropy in graphical models (Bayesian networks), especially in sets of graphical models with explicit uncertainty attached. I don’t have the math needed for this, but would be interested in hearing from those who do.



[1] I would like to thank both Abram Demski (interviewed here) for providing a link to this material, and my dissertation chair, Paul Davis, who was able to point me towards how this has been used and extended in the intelligence community.

[2] There is a follow up book and training course which is also available, but I’ve not read it nor seen it online. A shorter version of the main points of that book is here (PDF), which I have only glanced through.

[3] Eliezer discusses this idea in Entropy and short codes, but I’m heading a slightly different direction.

[4] We have a LW Post, Entropy and Temperature that explains this a bit. For a different, simplified explanation, try this: http://www.nmsea.org/Curriculum/Primer/what_is_entropy.htm. For a slightly more complete version, try Wikipedia: https://en.wikipedia.org/wiki/Introduction_to_entropy. For a much more complete version, learn the math, talk to a PhD in thermodynamics, then read some textbooks yourself.

[5] I think this, of course, because I was initially heading in that direction. Instead, I realized there was a better analogy – but if we wanted to develop it in this direction instead, I’d point to the phase change energy required to change phases of matter as a reason that our minds have trouble moving from their initial estimate. On reflection, I think this should be a small part of the story, if not entirely negligible.

Meetup : Complex problems, limited information, and rationality; How should we make decisions in real life?

3 Davidmanheim 09 October 2013 09:44PM

Discussion article for the meetup : Complex problems, limited information, and rationality; How should we make decisions in real life?

WHEN: 16 October 2013 07:00:45PM (-0700)

WHERE: West Los Angeles (at the Westside Tavern Upstairs Wine Bar)

Most decisions we make involve complex, poorly understood systems. We'd like to be rational anyways, but how?

Example time: I am going to pre-commit here to biking to the meetup. Why? I believe that more exercise would increase my physical fitness in ways that are beneficial.

But... I haven't done the research into the benefits of physical fitness, and haven't done a tradeoff analysis of time costs versus benefits. I don't know how likely dangerous biking accidents are in LA, and I don't know enough about my body to be sure that biking is safe, or a useful way for me to get in shape. Should I spend the week until the meetup researching these factors and building a model, or should I spend the time getting work and homework done, playing with my kid, and sleeping? Through a combination of trusting experts, laziness, and other things to do, I'm not going to do the research.

And that's where we are with most decisions. What should we do, if we want to be rational? I have some ideas, some questions, and some willingness to shut up and listen to others, and I might even update my beliefs if others have ideas I like.

Discussion article for the meetup : Complex problems, limited information, and rationality; How should we make decisions in real life?

Meetup : Group Decision Making (the good, the bad, and the confusion of welfare economics)

4 Davidmanheim 30 April 2013 04:18PM

Discussion article for the meetup : Group Decision Making (the good, the bad, and the confusion of welfare economics)

WHEN: 08 May 2013 07:00:00PM (-0700)

WHERE: West Los Angeles (at the Westside Tavern Upstairs Wine Bar)

Where: The Westside Tavern in the upstairs Wine Bar (all ages welcome), located inside the Westside Pavilion on the second floor, right by the movie theaters. The entrance sign says "Lounge".

Parking is free for 3 hours

Or you can take public transit! A trip planner can be found here: http://socaltransport.org/tm_pub_start.php <- so you can try to avoid multiple-hour trips! (We appreciate your attendance despite the length of your commute!)

We will hang out for 30 minutes or so, then I'll spend 10-15 minutes presenting: Group decision making. AKA Why voting can be a stupid way to make utility decisions, AKA Adding utility between people is stupid, this is an ordinal scale AKA Didn't Arrow win a Nobel prize for telling you people to stop?

Then we'll talk about what math and economics can say about making collective decisions in a way that isn't ill defined, and continue a hopefully interesting discussion. (Bonus points if it leads to a publishable idea for me!)

This will be a great break for me from... writing papers and taking tests about the same subject.

No foreknowledge or exposure to Less Wrong is necessary; this will be generally accessible and useful to anyone who values thinking for themselves. That said, it might help to read http://lesswrong.com/lw/ggm/pinpointing_utility/ so we can avoid type errors and radiation poisoning while we talk. (Not real radiation poisoning!)

Discussion article for the meetup : Group Decision Making (the good, the bad, and the confusion of welfare economics)