
Knightian uncertainty in a Bayesian framework

11 So8res 24 July 2014 02:31PM

Recently, I found myself in a conversation with someone advocating the use of Knightian uncertainty. I pointed out that it doesn't really matter what uncertainty you call "normal" and what uncertainty you call "Knightian" because, at the end of the day, you still have to cash out all your uncertainty into a credence so that you can actually act.

My conversation partner, who I'm anonymizing as "Sir Percy", acknowledged that this is true if your goal is to maximize your expected gains, but denies that he should maximize expected gains. He proposes maximizing minimum expected gains given Knightian uncertainty ("using the MMEU rule"), and when using such a rule, the distinction between normal uncertainty and Knightian uncertainty does matter. I motivate the MMEU rule in my previous post, and in the next post, I'll explore it in more detail.
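To make the difference concrete, here is a minimal sketch (in Python) of how the two rules can come apart. The two actions, the payoffs, the single credence, and the two-distribution "Knightian" credal set are all made-up numbers for illustration, not anything from the actual conversation:

    # Toy comparison of expected-utility maximization vs. the MMEU rule.
    # All numbers below are invented for illustration.

    payoffs = {                      # payoff of each action under each outcome
        "bet":  {"win": 100.0, "lose": -80.0},
        "pass": {"win": 0.0,   "lose": 0.0},
    }

    credal_set = [                   # the "Knightian" set of admissible distributions
        {"win": 0.7, "lose": 0.3},
        {"win": 0.4, "lose": 0.6},
    ]

    def expected_gain(action, dist):
        return sum(p * payoffs[action][outcome] for outcome, p in dist.items())

    # Bayesian rule: cash all the uncertainty out into a single credence, then maximize.
    single_credence = {"win": 0.55, "lose": 0.45}
    bayes_choice = max(payoffs, key=lambda a: expected_gain(a, single_credence))

    # MMEU rule: for each action take the worst expected gain over the credal set,
    # then pick the action whose worst case is best.
    mmeu_choice = max(payoffs, key=lambda a: min(expected_gain(a, d) for d in credal_set))

    print(bayes_choice, mmeu_choice)  # "bet" under the single credence, "pass" under MMEU

With these numbers the expected-utility maximizer bets, while the MMEU rule passes, because one of the admissible distributions makes betting a losing proposition in expectation.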

In this post, I will be examining Knightian uncertainty more broadly. The MMEU rule is one way of cashing out Knightian uncertainty into decisions in a way that looks non-Bayesian. But this decision rule is only one way in which the concept of Knightian uncertainty could prove useful, and I want to take a post to explore the concept of Knightian uncertainty in its own right.

continue reading »

“And that’s okay": accepting and owning reality

17 Swimmer963 27 July 2014 07:13PM

The Context 

I was having a conversation with Ruby a while back–the gist of it was that I was upset because of a nightmare I’d had the night before, and mad at myself for being upset about something that hadn’t even really happened, and trying to figure out how to stop feeling terrible. He said a thing that turned out to be surprisingly helpful.

“Life involves feeling bad, often with good reason, often not. A lot of the time the best response is to say 'Yes, I'm feeling shitty today, no, I'm not going to be able to focus, and that's crap, but that’s today.’

It's different from tolerance or resignation, it's more 'this is reality, this is my starting point and I've got to accept this is what it is'.

Then if you can find a way to make it go away, great; if not, most things pass soon enough, and even if they didn't, you could accept that too.”

I’m not good at this. I’m frequently using System 2 to fight System 1: for example, when I’m feeling introverted and really don’t want to be at work having face-to-face conversations with patients and co-workers, I basically tell that part of my brain to suck it up and stop being a baby.  I get mad at myself for wanting things that I can’t reasonably ask for, like praise from random other nurses I work with. I get mad at myself for wanting things for what I think are the wrong reasons: for example, wanting to move to San Francisco because I’m friends with lots of people there, and reluctantly accepting that I would need to leave my current job to do that, is one thing, but wanting to leave my job because it’s stressful–not okay! And then I mistrust my brain’s motivations to move to San Francisco at all–heaven forbid I should behave “like a groupie.” I ignore my desires for food that isn’t the same bean salad I’ve been eating for four days, for an extra evening of sleep, or to cancel on plans with a friend because I just want an afternoon alone at home.

And even though I’m pretty good at overriding all of my desires, the sub-agents that represent those desires don’t go away. They just sit there, metaphorically, fuming at being ignored and plotting revenge, which they usually achieve by making the desires ten times stronger...and then I go out and buy hot dogs at midnight, or stay in bed for thirteen hours, or spend an entire stretch of days off hiding in my apartment reading fanfiction. Or I just end up confused and conflicted and not capable of wanting anything. In other words, I’m a society of mind (http://en.wikipedia.org/wiki/Society_of_Mind) that’s frequently in a civil war with itself.

I hadn’t thought of trying to accept the civil war. Of saying “tonight, during this hospital shift, I will not be able to solve the civil war. Rather than adding to the negative affect by getting mad at myself, I will accept that today will simply suck and I will feel shitty. Going into the future I will work on peace talks, but today I must endure.”

 

"And that’s okay."

There’s one area where I’ve successfully taken a thing that I was confused and conflicted and frustrated about, and turned it into a thing that’s okay, even though the original conflict hasn’t been solved. That thing is relationships. At some point, around the time that I started applying the term asexual (link) to myself and first read about tactile defensiveness (link) and suddenly had words for the things that were ‘wrong’ with me, I stopped being frustrated about them. I haven’t solved all the problems. I’m still confused about relationships, I still get super anxious and avoidant in the face of being wanted too much, and that’s okay. Maybe it’ll change. I haven’t given up, and I’m trying things on purpose. It turned out that most of the suffering from this problem was meta-suffering and now it’s gone.

Somehow, when it wasn’t okay, it was a lot harder to try things on purpose. 

I hypothesized that adding the mental phrase “and that’s okay” onto all your problems would be a good general-purpose strategy.

 

Non-complacency

Ruby disagreed with me: “One of my strongest virtues, but I pay a cost for it, is how not-complacent I am. I'm not good enough, the world's not good enough. And I just see it. It's there. And I'm not okay with it.”

The problem is, even though I don’t have the virtue of acceptance, I don’t have the virtue of non-complacency either–in the sense that seeing the things that aren’t good enough, and not being okay with them, rarely causes me to do something to make the things better. It causes me to not think about them, unless it’s something as object-level as “my patient is in pain and the doctor refuses to give me an order for more pain meds.” And sometimes even then, I’ll retreat into it no longer being my problem.

I think that I, and probably others, need a certain amount of acceptance, a certain amount of “and that’s okay”, to let the wrong things into the circle of our awareness–to admit that yes, they really do suck. It’s a bit like the Litany of Gendlin. What’s true is already true, and even though thinking about it being true makes me feel like I must be a bad person, it can’t cause me to be more of a bad person than I already am.

 

"You need to own it." 

Once, I had a fairly awful nursing school placement at a very large, stressful ICU. I made mistakes, despite the fact that ‘I knew better’ in theory. (I’ve since learned that nursing is something that takes place under average conditions, not optimal conditions, meaning that you will have good days and bad days and that on your bad days, you will make dumb mistakes.) 

As a perfectionist, I found this really hard, even though I knew enough cogsci to recognize that my brain was behaving predictably and understandably. My mentor said a lot of things that weren’t helpful, but one of the things that she said is “you need to own your mistakes.” At that time, those words left her mouth and reached my ears and then got processed and turned into “you should admit that you’re hopelessly incompetent and a failure.” The only obvious conclusion to draw was that I ought to quit nursing school right then. I didn’t want to quit, and the only other option was to not think about the stupid mistakes–or, rather, try not to, and then end up thinking about them anyway and being anxious all the time.

Nowadays, when I process those words from a much better emotional place, they come through as “you need to let your mistakes into your self-concept, so that you can learn not to make them again even if you’re put under those same awful conditions again.” The fact that being distracted by an interruption and then trying to put an un-primed, full-of-air IV tubing in the pump is understandable and predictable doesn’t make it less likely to kill someone. The correct response is to develop habits and routines that cause you to predictably not make that mistake. But if thinking about it means automatically bringing up the possibility that you should just quit nursing school now before you actually kill someone, it’s hard to think of good routines or focus on training your brain to do them.

In this case, what eventually helped was letting my past mistakes be just okay enough that I could admit them into my mental autobiography, think about them, strategize, and learn from them–in short, own them. 

 

On Having Priorities

When I brought this up to my friend Ben Hoffman, he had another point to add. 

The obvious-to-me alternative here is the trick of putting EVERYTHING on a list, prioritizing, and optimizing for working on the "most important thing" instead of for getting all the "important things" done. (Or solving the most important problem, however you want to word it.) This is the strategy I've started using, and when I'm disciplined about it I feel nearly no badness above the baseline level from having some problems unresolved. 

This rings true with a part of my nursing clinical experience, and a thing I found especially frustrating about my interactions with my mentor. Once, I accidentally gave my patient an extra dose of digoxin because I misread the medication sheet. Which ended up doing basically nothing, but the general class of “medication error” contains a lot of harmful options. (The most embarrassing and potentially serious med error that I’ve made so far at my current job involved accidentally running my patient’s fentanyl infusion an order of magnitude too high.) There was also the IV-tubing-full-of-air incident.

Then, there was the thing where I would leave plastic syringe caps and bits of paper from wrappers in patients’ beds. This incurred approximately equal wrath to the med errors–in practice, a lot more, because she would catch me doing it around once a shift. I agreed with her on the possible bad consequences. Patients might get bedsores, and that was bad. But there were other problems I hadn’t solved, and they had worse consequences. I had, correctly I think, decided to focus on those first.

That being said, I wasn’t actually able to stop feeling bad about it enough to actually free up mental space for anti-med-error strategizing. This is partly because an adult in a position of authority was constantly mad at me, and I wasn’t able to make that stop feeling bad. But it’s partly because I genuinely felt like a failure every time I caught myself doing something wrong, whether it mattered a lot or not.

Making lists and prioritizing is a useful thing to do, but the physical motion of writing down a list isn’t all that’s involved. There’s the “being disciplined about it”, the ability to actually take all the problems seriously and then only work on the first and most important. I think that's non-trivial, and doesn't automatically happen when you make a list of Important Problems 1 through 5. 

 

Conclusion

There are two closely related concepts here. One is the idea that you can let go of struggling against unpleasant feelings–you can just have the unpleasant feelings and accept them, forgoing the meta-suffering and the useless burning of mental energy that comes with fighting them. If you apply this mental habit of not struggling against suffering, the result is that you have less overall suffering. 

The second concept is related to owning mistakes you've made, or personal flaws, or atrocities in the world. By default, it seems like most people either obsess over these or don't think about them–I expect that this happens because the things are too awful. If you apply the mental habit of admitting that you made that mistake and it really was dumb, or that poverty really is bad, but that that's okay, the result is that you can think about it sanely, set priorities, and maybe actually fix it. 

However, when I go through these mental motions, they feel like the same operation, applied to a different substrate. It's a habit that I would like to cultivate more. 

Appendix

Ruby sourced much of his original thinking on this from Acceptance and Commitment Therapy (ACT) and from Russ Harris’ book The Happiness Trap.

In stark contrast to most Western psychotherapy, ACT does not have symptom reduction as a goal. This is based on the view that the ongoing attempt to get rid of ‘symptoms’ actually creates a clinical disorder in the first place. As soon as a private experience is labeled a ‘symptom’, it immediately sets up a struggle with it because a ‘symptom’ is by definition something ‘pathological’; something we should try to get rid of. In ACT, the aim is to transform our relationship with our difficult thoughts and feelings, so that we no longer perceive them as ‘symptoms’. Instead, we learn to perceive them as harmless, even if uncomfortable, transient psychological events. Ironically, it is through this process that ACT actually achieves symptom reduction—but as a by-product and not the goal.

Politics is hard mode

19 RobbBB 21 July 2014 10:14PM

Summary: I don't think 'politics is the mind-killer' works well rhetorically. I suggest 'politics is hard mode' instead.


 

Some people in and catawampus to the LessWrong community have objected to "politics is the mind-killer" as a framing (/ slogan / taunt). Miri Mogilevsky explained on Facebook:

My usual first objection is that it seems odd to single politics out as a “mind-killer” when there’s plenty of evidence that tribalism happens everywhere. Recently, there has been a whole kerfuffle within the field of psychology about replication of studies. Of course, some key studies have failed to replicate, leading to accusations of “bullying” and “witch-hunts” and what have you. Some of the people involved have since walked their language back, but it was still a rather concerning demonstration of mind-killing in action. People took “sides,” people became upset at people based on their “sides” rather than their actual opinions or behavior, and so on.

Unless this article refers specifically to electoral politics and Democrats and Republicans and things (not clear from the wording), “politics” is such a frightfully broad category of human experience that writing it off entirely as a mind-killer that cannot be discussed or else all rationality flies out the window effectively prohibits a large number of important issues from being discussed, by the very people who can, in theory, be counted upon to discuss them better than most. Is it “politics” for me to talk about my experience as a woman in gatherings that are predominantly composed of men? Many would say it is. But I’m sure that these groups of men stand to gain from hearing about my experiences, since some of them are concerned that so few women attend their events.

In this article, Eliezer notes, “Politics is an important domain to which we should individually apply our rationality — but it’s a terrible domain in which to learn rationality, or discuss rationality, unless all the discussants are already rational.” But that means that we all have to individually, privately apply rationality to politics without consulting anyone who can help us do this well. After all, there is no such thing as a discussant who is “rational”; there is a reason the website is called “Less Wrong” rather than “Not At All Wrong” or “Always 100% Right.” Assuming that we are all trying to be more rational, there is nobody better to discuss politics with than each other.

The rest of my objection to this meme has little to do with this article, which I think raises lots of great points, and more to do with the response that I’ve seen to it — an eye-rolling, condescending dismissal of politics itself and of anyone who cares about it. Of course, I’m totally fine if a given person isn’t interested in politics and doesn’t want to discuss it, but then they should say, “I’m not interested in this and would rather not discuss it,” or “I don’t think I can be rational in this discussion so I’d rather avoid it,” rather than sneeringly reminding me “You know, politics is the mind-killer,” as though I am an errant child. I’m well-aware of the dangers of politics to good thinking. I am also aware of the benefits of good thinking to politics. So I’ve decided to accept the risk and to try to apply good thinking there. [...]

I’m sure there are also people who disagree with the article itself, but I don’t think I know those people personally. And to add a political dimension (heh), it’s relevant that most non-LW people (like me) initially encounter “politics is the mind-killer” being thrown out in comment threads, not through reading the original article. My opinion of the concept improved a lot once I read the article.

In the same thread, Andrew Mahone added, “Using it in that sneering way, Miri, seems just like a faux-rationalist version of ‘Oh, I don’t bother with politics.’ It’s just another way of looking down on any concerns larger than oneself as somehow dirty, only now, you know, rationalist dirty.” To which Miri replied: “Yeah, and what’s weird is that that really doesn’t seem to be Eliezer’s intent, judging by the eponymous article.”

Eliezer replied briefly, to clarify that he wasn't generally thinking of problems that can be directly addressed in local groups (but happen to be politically charged) as "politics":

Hanson’s “Tug the Rope Sideways” principle, combined with the fact that large communities are hard to personally influence, explains a lot in practice about what I find suspicious about someone who claims that conventional national politics are the top priority to discuss. Obviously local community matters are exempt from that critique! I think if I’d substituted ‘national politics as seen on TV’ in a lot of the cases where I said ‘politics’ it would have more precisely conveyed what I was trying to say.

But that doesn't resolve the issue. Even if local politics is more instrumentally tractable, the worry about polarization and factionalization can still apply, and may still make it a poor epistemic training ground.

A subtler problem with banning “political” discussions on a blog or at a meet-up is that it’s hard to do fairly, because our snap judgments about what counts as “political” may themselves be affected by partisan divides. In many cases the status quo is thought of as apolitical, even though objections to the status quo are ‘political.’ (Shades of Pretending to be Wise.)

Because politics gets personal fast, it’s hard to talk about it successfully. But if you’re trying to build a community, build friendships, or build a movement, you can’t outlaw everything ‘personal.’

And selectively outlawing personal stuff gets even messier. Last year, daenerys shared anonymized stories from women, including several that discussed past experiences where the writer had been attacked or made to feel unsafe. If those discussions are made off-limits because they relate to gender and are therefore ‘political,’ some folks may take away the message that they aren’t allowed to talk about, e.g., some harmful or alienating norm they see at meet-ups. I haven’t seen enough discussions of this failure mode to feel super confident people know how to avoid it.

Since this is one of the LessWrong memes that’s most likely to pop up in cross-subcultural dialogues (along with the even more ripe-for-misinterpretation “policy debates should not appear one-sided“…), as a first (very small) step, my action proposal is to obsolete the ‘mind-killer’ framing. A better phrase for getting the same work done would be ‘politics is hard mode’:

1. ‘Politics is hard mode’ emphasizes that ‘mind-killing’ (= epistemic difficulty) is quantitative, not qualitative. Some things might instead fall under Middlingly Hard Mode, or under Nightmare Mode…

2. ‘Hard’ invites the question ‘hard for whom?’, more so than ‘mind-killer’ does. We’re used to the fact that some people and some contexts change what’s ‘hard’, so it’s a little less likely we’ll universally generalize.

3. ‘Mindkill’ connotes contamination, sickness, failure, weakness. In contrast, ‘Hard Mode’ doesn’t imply that a thing is low-status or unworthy. As a result, it’s less likely to create the impression (or reality) that LessWrongers or Effective Altruists dismiss out-of-hand the idea of hypothetical-political-intervention-that-isn’t-a-terrible-idea. Maybe some people do want to argue for the thesis that politics is always useless or icky, but if so it should be done in those terms, explicitly — not snuck in as a connotation.

4. ‘Hard Mode’ can’t readily be perceived as a personal attack. If you accuse someone of being ‘mindkilled’, with no context provided, that smacks of insult — you appear to be calling them stupid, irrational, deluded, or the like. If you tell someone they’re playing on ‘Hard Mode,’ that’s very nearly a compliment, which makes your advice that they change behaviors a lot likelier to go over well.

5. ‘Hard Mode’ doesn’t risk bringing to mind (e.g., gendered) stereotypes about communities of political activists being dumb, irrational, or overemotional.

6. ‘Hard Mode’ encourages a growth mindset. Maybe some topics are too hard to ever be discussed. Even so, ranking topics by difficulty encourages an approach where you try to do better, rather than merely withdrawing. It may be wise to eschew politics, but we should not fear it. (Fear is the mind-killer.)

7. Edit: One of the larger engines of conflict is that people are so much worse at noticing their own faults and biases than noticing others'. People will be relatively quick to dismiss others as 'mindkilled,' while frequently flinching away from or just-not-thinking 'maybe I'm a bit mindkilled about this.' Framing the problem as a challenge rather than as a failing might make it easier to be reflective and even-handed.

This is not an attempt to get more people to talk about politics. I think this is a better framing whether or not you trust others (or yourself) to have productive political conversations.

When I playtested this post, Ciphergoth raised the worry that 'hard mode' isn't scary-sounding enough. As dire warnings go, it's light-hearted—exciting, even. To which I say: good. Counter-intuitive fears should usually be argued into people (e.g., via Eliezer's politics sequence), not connotation-ninja'd or chanted at them. The cognitive content is more clearly conveyed by 'hard mode,' and if some group (people who love politics) stands to gain the most from internalizing this message, the message shouldn't cast that very group (people who love politics) in an obviously unflattering light. LW seems fairly memetically stable, so the main issue is what would make this meme infect friends and acquaintances who haven't read the sequences. (Or Dune.)

If you just want a scary personal mantra to remind yourself of the risks, I propose 'politics is SPIDERS'. Though 'politics is the mind-killer' is fine there too.

If you and your co-conversationalists haven’t yet built up a lot of trust and rapport, or if tempers are already flaring, conveying the message ‘I’m too rational to discuss politics’ or ‘You’re too irrational to discuss politics’ can make things worse. In that context, ‘politics is the mind-killer’ is the mind-killer. At least, it’s a needlessly mind-killing way of warning people about epistemic hazards.

‘Hard Mode’ lets you speak as the Humble Aspirant rather than the Aloof Superior. Strive to convey: ‘I’m worried I’m too low-level to participate in this discussion; could you have it somewhere else?’ Or: ‘Could we talk about something closer to Easy Mode, so we can level up together?’ More generally: If you’re worried that what you talk about will impact group epistemology, you should be even more worried about how you talk about it.

The Correct Use of Analogy

24 SilentCal 16 July 2014 09:07PM

In response to: Failure by Analogy, Surface Analogies and Deep Causes

Analogy gets a bad rap around here, and not without reason. The kinds of argument from analogy condemned in the above links fully deserve the condemnation they get. Still, I think it's too easy to read them and walk away thinking "Boo analogy!" when not all uses of analogy are bad. The human brain seems to have hardware support for thinking in analogies, and I don't think this capability is a waste of resources, even in our highly non-ancestral environment. So, assuming that the linked posts do a sufficient job detailing the abuse and misuse of analogy, I'm going to go over some legitimate uses.

 

The first thing analogy is really good for is description. Take the plum pudding atomic model. I still remember this falsified proposal of negative 'raisins' in positive 'dough' largely because of the analogy, and I don't think anyone ever attempted to use it to argue for the existence of tiny subnuclear particles corresponding to cinnamon. 

But this is only a modest example of what analogy can do. The following is an example that I think starts to show the true power: my comment on Robin Hanson's 'Don't Be "Rationalist"'. To summarize, Robin argued that since you can't be rationalist about everything you should budget your rationality and only be rational about the most important things; I replied that maybe rationality is like weightlifting, where your strength is finite yet it increases with use. That comment is probably the most successful thing I've ever written on the rationalist internet in terms of the attention it received, including direct praise from Eliezer and a shoutout in a Scott Alexander (yvain) post, and it's pretty much just an analogy.

Here's another example, this time from Eliezer. As part of the AI-Foom debate, he tells the story of Fermi's nuclear experiments, and in particular his precise knowledge of when a pile would go supercritical.

What do the above analogies accomplish? They provide counterexamples to universal claims. In my case, Robin's inference that rationality should be spent sparingly proceeded from the stated premise that no one is perfectly rational about anything, and weightlifting was a counterexample to the implicit claim 'a finite capacity should always be directed solely towards important goals'. If you look above my comment, anon had already said that the conclusion hadn't been proven, but without the counterexample this claim had much less impact.

In Eliezer's case, "you can never predict an unprecedented unbounded growth" is the kind of claim that sounds really convincing. "You haven't actually proved that" is a weak-sounding retort; "Fermi did it" immediately wins the point. 

The final thing analogies do really well is crystallize patterns. For an example of this, let's turn to... Failure by Analogy. Yep, the anti-analogy posts are themselves written almost entirely via analogy! Alchemists who glaze lead with lemons and would-be aviators who put beaks on their machines are invoked to crystallize the pattern of 'reasoning by similarity'. The post then makes the case that neural-net worshippers are reasoning by similarity in just the same way, making the same fundamental error.

It's this capacity that makes analogies so dangerous. Crystallizing a pattern can be so mentally satisfying that you don't stop to question whether the pattern applies. The antidote to this is the question, "Why do you believe X is like Y?" Assessing the answer and judging deep similarities from superficial ones may not always be easy, but just by asking you'll catch the cases where there is no justification at all.

An Experiment In Social Status: Software Engineer vs. Data Science Manager

18 JQuinton 15 July 2014 08:24PM

Here is an interesting blog post about a guy who did a resume experiment with two positions which he argues are identical in terms of experience, but occupy different "social status" positions in tech: a software engineer and a data science manager.

Interview A: as Software Engineer

Bill faced five hour-long technical interviews. Three went well. One was so-so, because it focused on implementation details of the JVM, and Bill’s experience was almost entirely in C++, with a bit of hobbyist OCaml. The last interview sounds pretty hellish. It was with the VP of Data Science, Bill’s prospective boss, who showed up 20 minutes late and presented him with one of those interview questions where there’s “one right answer” that took months, if not years, of in-house trial and error to discover. It was one of those “I’m going to prove that I’m smarter than you” interviews...

Let’s recap this. Bill passed three of his five interviews with flying colors. One of the interviewers, a few months later, tried to recruit Bill to his own startup. The fourth interview was so-so, because he wasn’t a Java expert, but came out neutral. The fifth, he failed because he didn’t know the in-house Golden Algorithm that took years of work to discover. When I asked that VP/Data Science directly why he didn’t hire Bill (and he did not know that I knew Bill, nor about this experiment) the response I got was “We need people who can hit the ground running.” Apparently, there’s only a “talent shortage” when startup people are trying to scam the government into changing immigration policy. The undertone of this is that “we don’t invest in people”.

Or, for a point that I’ll come back to, software engineers lack the social status necessary to make others invest in them.

Interview B: as Data Science manager.

A couple weeks later, Bill interviewed at a roughly equivalent company for the VP-level position, reporting directly to the CTO.

Worth noting is that we did nothing to make Bill more technically impressive than for Company A. If anything, we made his technical story more honest, by modestly inflating his social status while telling a “straight shooter” story for his technical experience. We didn’t have to cover up periods of low technical activity; that he was a manager, alone, sufficed to explain those away.

Bill faced four interviews, and while the questions were behavioral and would be “hard” for many technical people, he found them rather easy to answer with composure. I gave him the Golden Answer, which is to revert to “There’s always a trade-off between wanting to do the work yourself, and knowing when to delegate.” It presents one as having managerial social status (the ability to delegate) but also a diligent interest in, and respect for, the work. It can be adapted to pretty much any “behavioral” interview question...

Bill passed. Unlike for a typical engineering position, there were no reference checks. The CEO said, “We know you’re a good guy, and we want to move fast on you”. As opposed to the 7-day exploding offers typically served to engineers, Bill had 2 months in which to make his decision. He got a fourth week of vacation without even having to ask for it, and genuine equity (about 75% of a year’s salary vesting each year)...

It was really interesting, as I listened in, to see how different things are once you’re “in the club”. The CEO talked to Bill as an equal, not as a paternalistic, bullshitting, “this is good for your career” authority figure. There was a tone of equality that a software engineer would never get from the CEO of a 100-person tech company.

The author concludes that positions that are labeled as code-monkey-like are low status, while positions that are labeled as managerial are high status. Even if they are "essentially" doing the same sort of work.

Not sure about this methodology, but it's food for thought.

Change Contexts to Improve Arguments

30 palladias 08 July 2014 03:51PM

On a recent trip to Ireland, I gave a talk on tactics for having better arguments (video here).  There's plenty in the video that's been discussed on LW before (Ideological Turing Tests and other reframes), but I thought I'd highlight one other class of trick I use to have more fruitful disagreements.

It's hard, in the middle of a fight, to remember, recognize, and defuse common biases, rhetorical tricks, emotional triggers, etc.  I'd rather cheat than solve a hard problem, so I put a lot of effort into shifting disagreements into environments where it's easier for me and my opposite-number to reason and argue well, instead of relying on willpower.  Here's a recent example of the kind of shift I like to make:

A couple months ago, a group of my friends were fighting about the Brendan Eich resignation on facebook. The posts were showing up fast; everyone was, presumably, on the edge of their seats, fueled by adrenaline, and alone at their various computers. It’s a hard place to have a charitable, thoughtful debate.

I asked my friends (since they were mostly DC based) if they’d be amenable to pausing the conversation and picking it up in person.  I wanted to make the conversation happen in person, not in front of an audience, and in a format that let people speak for longer and ask questions more easily. If so, I promised to bake cookies for the ultimate donnybrook.  

My friends probably figured that I offered cookies as a bribe to get everyone to change venues, and they were partially right. But my cookies had another strategic purpose. When everyone arrived, I was still in the process of taking the cookies out of the oven, so I had to recruit everyone to help me out.

“Alice, can you pour milk for people?”

“Bob, could you pass out napkins?”

“Eve, can you greet people at the door while I’m stuck in the kitchen with potholders on?”

Before we could start arguing, people on both sides of the debate were working on taking care of each other and asking each others’ help. Then, once the logistics were set, we all broke bread (sorta) with each other and had a shared, pleasurable experience. Then we laid into each other.

Sharing a communal experience of mutual service didn’t make anyone pull their intellectual punches, but I think it made us more patient with each other and less anxiously fixated on defending ourselves. Sharing food and seating helped remind us of the relationships we enjoyed with each other, and why we cared about probing the ideas of this particular group of people.

I prefer to fight with people I respect, who I expect will fight in good faith.  It's hard to remember that's what I'm doing if I argue with them in the same forums (comment threads, fb, etc) where I usually see bad fights.  An environment shift and other compensatory gestures make it easier to leave habituated errors and fears at the door.

 

Crossposted/adapted from my blog.

Confound it! Correlation is (usually) not causation! But why not?

40 gwern 09 July 2014 03:04AM

It is widely understood that statistical correlation between two variables ≠ causation. But despite this admonition, people are routinely overconfident in claiming correlations to support particular causal interpretations and are surprised by the results of randomized experiments, suggesting that they are biased & systematically underestimating the prevalence of confounds/common-causation. I speculate that in realistic causal networks or DAGs, the number of possible correlations grows faster than the number of possible causal relationships. So confounds really are that common, and since people do not think in DAGs, the imbalance also explains overconfidence.

I’ve noticed I seem to be unusually willing to bite the correlation≠causation bullet, and I think it’s due to an idea I had some time ago about the nature of reality.

1.1 The Problem

One of the constant problems I face in my reading is that I constantly want to know about causal relationships but usually I only have correlational data, and as we all know, correlation≠causation. If the general public naively thinks correlation=causation, then most geeks know better and insist that correlation≠causation, but then some go meta and point out that correlation and causation do tend to correlate and so correlation weakly implies causation. But how much evidence…? If I suspect that A→B, and I collect data and establish beyond doubt that A & B correlate at r=0.7, how much evidence do I have that A→B?

Now, the correlation could be an illusory correlation thrown up by all the standard statistical problems we all know about, such as too-small n, false positive from sampling error (A & B just happened to sync together due to randomness), multiple testing, p-hacking, data snooping, selection bias, publication bias, misconduct, inappropriate statistical tests, etc. I’ve read about those problems at length, and despite knowing about all that, there still seems to be a problem: I don’t think those issues explain away all the correlations which turn out to be confounds - correlation too often ≠ causation.

To measure this directly you need a clear set of correlations which are proposed to be causal, randomized experiments to establish what the true causal relationship is in each case, and both categories need to be sharply delineated in advance to avoid issues of cherrypicking and retroactively confirming a correlation. Then you’d be able to say something like ‘11 out of the 100 proposed A→B causal relationships panned out’, and start with a prior of 11% that in your case, A→B. This sort of dataset is pretty rare, although the few examples I’ve found from medicine tend to indicate that our prior should be under 10%. Not great. Why are our best guesses at causal relationships so bad?

We’d expect that the a priori odds are good: 1/3! After all, you can divvy up the possibilities as:

  1. A causes B
  2. B causes A
  3. both A and B are caused by a C (possibly in a complex way like Berkson’s paradox or conditioning on unmentioned variables, like a phone-based survey inadvertently generating conclusions valid only for the phone-using part of the population, causing amusing pseudo-correlations)

If it’s either #1 or #2, we’re good and we’ve found a causal relationship; it’s only outcome #3 which leaves us baffled & frustrated. Even if we were guessing at random, you’d expect us to be right at least 33% of the time, if not much more often because of all the knowledge we can draw on. (Because we can draw on other knowledge, like temporal order or biological plausibility. For example, in medicine you can generally rule out some of the relationships this way: if you find a correlation between taking superdupertetrohydracyline™ and pancreas cancer remission, it seems unlikely that #2 curing pancreas cancer causes a desire to take superdupertetrohydracyline™ so the causal relationship is probably either #1 superdupertetrohydracyline™ cures cancer or #3 a common cause like ‘doctors prescribe superdupertetrohydracyline™ to patients who are getting better’.)

I think a lot of people tend to put a lot of weight on any observed correlation because of this intuition that a causal relationship is normal & probable because, well, “how else could this correlation happen if there’s no causal connection between A & B‽” And fair enough - there’s no grand cosmic conspiracy arranging matters to fool us by always putting in place a C factor to cause scenario #3, right? If you question people, of course they know correlation doesn’t necessarily mean causation - everyone knows that - since there’s always a chance of a lurking confound, and it would be great if you had a randomized experiment to draw on; but you think with the data you have, not the data you wish you had, and can’t let the perfect be the enemy of the better. So when someone finds a correlation between A and B, it’s no surprise that suddenly their language & attitude change and they seem to place great confidence in their favored causal relationship even if they piously acknowledge “Yes, correlation is not causation, but… [obviously hanging out with fat people can be expected to make you fat] [surely giving babies antibiotics will help them] [apparently female-named hurricanes increase death tolls] etc etc”.

So, correlations tend to not be causation because it’s almost always #3, a shared cause. This commonness is contrary to our expectations, based on a simple & unobjectionable observation that of the 3 possible relationships, 2 are causal; and so we often reason as though correlation were strong evidence for causation. This leaves us with a paradox: experimental results seem to contradict intuition. To resolve the paradox, I need to offer a clear account of why shared causes/confounds are so common, and hopefully motivate a different set of intuitions.

1.2 What a Tangled Net We Weave When First We Practice to Believe

Here’s where Bayes nets & causal networks (seen previously on LW & Michael Nielsen) come up. When networks are inferred on real-world data, they often start to look pretty gnarly: tons of nodes, tons of arrows pointing all over the place. Daphne Koller early on in her Probabilistic Graphical Models course shows an example from a medical setting where the network has like 600 nodes and you can’t understand it at all. When you look at a biological causal network like this:

“A Toolkit Supporting Formal Reasoning about Causality in Metabolic Networks”

You start to appreciate how everything might be correlated with everything else, yet not cause each other.

This is not too surprising if you step back and think about it: life is complicated, we have limited resources, and everything has a lot of moving parts. (How many discrete parts does an airplane have? Or your car? Or a single cell? Or think about a chess player analyzing a position: ‘if my bishop goes there, then the other pawn can go here, which opens up a move there or here, but of course, they could also do that or try an en passant in which case I’ll be down in material but up on initiative in the center, which causes an overall shift in tempo…’) Fortunately, these networks are still simple compared to what they could be, since most nodes aren’t directly connected to each other, which tamps down on the combinatorial explosion of possible networks. (How many different causal networks are possible if you have 600 nodes to play with? The exact answer is complicated but it’s much larger than 2^600 - so very large!)

One interesting thing I managed to learn from PGM (before concluding it was too hard for me and I should try it later) was that in a Bayes net even if two nodes were not in a simple direct correlation relationship A→B, you could still learn a lot about A from setting B to a value, even if the two nodes were ‘way across the network’ from each other. You could trace the influence flowing up and down the pathways to some surprisingly distant places if there weren’t any blockers.

The bigger the network, the more pairs of nodes there are to check for a correlation (eg if there are 10 nodes/variables and you are looking at bivariate correlations, then you have 10 choose 2 = 45 possible comparisons; with 20 nodes, 190; and with 40, 780. 40 variables is not that much for many real-world problems.) A lot of these combos will yield some sort of correlation. But does the number of causal relationships go up as fast? I don’t think so (although I can’t prove it).

If not, then as causal networks get bigger, the number of genuine correlations will explode but the number of genuine causal relationships will increase more slowly, and so the fraction of correlations which are also causal will collapse.

(Or more concretely: suppose you generated a randomly connected causal network with x nodes and y arrows, perhaps using the algorithm in Kuipers & Moffa 2012, where each arrow has some random noise in it; count how many pairs of nodes are in a causal relationship; now, n times initialize the root nodes to random values and generate a possible state of the network & store the values for each node; count how many pairwise correlations there are between all the nodes using the n samples (using an appropriate significance test & alpha if one wants); divide # of causal relationships by # of correlations, store; return to the beginning and resume with x+1 nodes and y+1 arrows… As one graphs each value of x against its respective estimated fraction, does the fraction head toward 0 as x increases? My thesis is it does. Or, since there must be at least as many causal relationships in a graph as there are arrows, you could simply use that as an upper bound on the fraction.)
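Here is a rough Python sketch of that procedure, simplified in a few ways I should flag: the DAG is generated by a naive ordered-pair method rather than the Kuipers & Moffa 2012 algorithm, the noise model is linear-Gaussian, and a plain |r| cutoff stands in for a proper significance test; the node/arrow counts are arbitrary illustrative choices.

    import itertools
    import numpy as np

    rng = np.random.default_rng(0)

    def random_linear_dag(n_nodes, n_edges):
        """Random DAG as a weight matrix; edges only go from lower to higher index, so acyclic."""
        pairs = list(itertools.combinations(range(n_nodes), 2))
        w = np.zeros((n_nodes, n_nodes))
        for idx in rng.choice(len(pairs), size=n_edges, replace=False):
            i, j = pairs[idx]
            w[i, j] = rng.normal()
        return w

    def sample(w, n_samples):
        """Draw samples from the linear-Gaussian model implied by the weight matrix."""
        data = np.zeros((n_samples, w.shape[0]))
        for j in range(w.shape[0]):              # index order is a topological order
            data[:, j] = data @ w[:, j] + rng.normal(size=n_samples)
        return data

    def causal_vs_correlated(n_nodes, n_edges, n_samples=2000, r_cut=0.1):
        w = random_linear_dag(n_nodes, n_edges)
        adj = (w != 0).astype(int)
        reach = adj.copy()
        for _ in range(n_nodes):                 # transitive closure: which node causes which
            reach = np.clip(reach + reach @ adj, 0, 1)
        corr = np.corrcoef(sample(w, n_samples), rowvar=False)
        causal = correlated = 0
        for i, j in itertools.combinations(range(n_nodes), 2):
            causal += int(reach[i, j] or reach[j, i])
            correlated += int(abs(corr[i, j]) > r_cut)
        return causal, correlated

    for n in (10, 20, 40):
        c, k = causal_vs_correlated(n, n_edges=2 * n)
        print(f"{n} nodes: causal pairs={c}, correlated pairs={k}, fraction={c / max(k, 1):.2f}")

The interesting quantity is the printed fraction as the number of nodes grows, holding the average degree roughly fixed.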

It turns out, we weren’t supposed to be reasoning ‘there are 3 categories of possible relationships, so we start with 33%’, but rather: ‘there is only one explanation “A causes B”, only one explanation “B causes A”, but there are many explanations of the form “C1 causes A and B”, “C2 causes A and B”, “C3 causes A and B”…’, and the more nodes in a field’s true causal networks (psychology or biology vs physics, say), the bigger this last category will be.

The real world is the largest of causal networks, so it is unsurprising that most correlations are not causal, even after we clamp down our data collection to narrow domains. Hence, our prior for “A causes B” is not 50% (it’s either true or false) nor is it 33% (either A causes B, B causes A, or mutual cause C) but something much smaller: the number of causal relationships divided by the number of pairwise correlations for a graph, which ratio can be roughly estimated on a field-by-field basis by looking at existing work or directly for a particular problem (perhaps one could derive the fraction based on the properties of the smallest inferrable graph that fits large datasets in that field). And since the larger a correlation relative to the usual correlations for a field, the more likely the two nodes are to be close in the causal network and hence more likely to be joined causally, one could even give causality estimates based on the size of a correlation (eg. an r=0.9 leaves less room for confounding than an r of 0.1, but how much will depend on the causal network).

This is exactly what we see. How do you treat cancer? Thousands of treatments get tried before one works. How do you deal with poverty? Most programs are not even wrong. Or how do you fix societal woes in general? Most attempts fail miserably and the higher-quality your studies, the worse attempts look (leading to Rossi’s Metallic Rules). This even explains why ‘everything correlates with everything’ and Andrew Gelman’s dictum about how coefficients are never zero: the reason datasets like those mentioned by Cohen or Meehl find most of their variables to have non-zero correlations (often reaching statistical-significance) is because the data is being drawn from large complicated causal networks in which almost everything really is correlated with everything else.

And thus I was enlightened.

1.3 Comment

Since I know so little about causal modeling, I asked our local causal researcher Ilya Shpitser to maybe leave a comment about whether the above was trivially wrong / already-proven / well-known folklore / etc; for convenience, I’ll excerpt the core of his comment:

But does the number of causal relationships go up just as fast? I don’t think so (although at the moment I can’t prove it).

I am not sure exactly what you mean, but I can think of a formalization where this is not hard to show. We say A “structurally causes” B in a DAG G if and only if there is a directed path from A to B in G. We say A is “structurally dependent” with B in a DAG G if and only if there is a marginal d-connecting path from A to B in G.

A marginal d-connecting path between two nodes is a path with no consecutive edges of the form * -> * <- * (that is, no colliders on the path). In other words all directed paths are marginal d-connecting but the opposite isn’t true.

The justification for this definition is that if A “structurally causes” B in a DAG G, then if we were to intervene on A, we would observe B change (but not vice versa) in “most” distributions that arise from causal structures consistent with G. Similarly, if A and B are “structurally dependent” in a DAG G, then in “most” distributions consistent with G, A and B would be marginally dependent (e.g. what you probably mean when you say ‘correlations are there’).

I qualify with “most” because we cannot simultaneously represent dependences and independences by a graph, so we have to choose. People have chosen to represent independences. That is, if in a DAG G some arrow is missing, then in any distribution (causal structure) consistent with G, there is some sort of independence (missing effect). But if the arrow is not missing we cannot say anything. Maybe there is dependence, maybe there is independence. An arrow may be present in G, and there may still be independence in a distribution consistent with G. We call such distributions “unfaithful” to G. If we pick distributions consistent with G randomly, we are unlikely to hit on unfaithful ones (the subset of all distributions consistent with G that is unfaithful to G has measure zero), but Nature does not pick randomly, so unfaithful distributions are a worry. They may arise for systematic reasons (maybe equilibrium of a feedback process in bio?).

If you accept above definition, then clearly for a DAG with n vertices, the number of pairwise structural dependence relationships is an upper bound on the number of pairwise structural causal relationships. I am not aware of anyone having worked out the exact combinatorics here, but it’s clear there are many many more paths for structural dependence than paths for structural causality.


But what you actually want is not a DAG with n vertices, but another type of graph with n vertices. The “Universe DAG” has a lot of vertices, but what we actually observe is a very small subset of these vertices, and we marginalize over the rest. The trouble is, if you start with a distribution that is consistent with a DAG, and you marginalize over some things, you may end up with a distribution that isn’t well represented by a DAG. Or “DAG models aren’t closed under marginalization.”

That is, if our DAG is A -> B <- H -> C <- D, and we marginalize over H because we do not observe H, what we get is a distribution where no DAG can properly represent all conditional independences. We need another kind of graph.

In fact, people have come up with a mixed graph (containing -> arrows and <-> arrows) to represent margins of DAGs. Here -> means the same as in a causal DAG, but <-> means “there is some sort of common cause/confounder that we don’t want to explicitly write down.” Note: <-> is not a correlative arrow, it is still encoding something causal (the presence of a hidden common cause or causes). I am being loose here – in fact it is the absence of arrows that means things, not the presence.

I do a lot of work on these kinds of graphs, because these graphs are the sensible representation of the data we typically get – drawn from a marginal of a joint distribution consistent with a big unknown DAG.

But the combinatorics work out the same in these graphs – the number of marginal d-connected paths is much bigger than the number of directed paths. This is probably the source of your intuition. Of course what often happens is you do have a (weak) causal link between A and B, but a much stronger non-causal link between A and B through an unobserved common parent. So the causal link is hard to find without “tricks.”
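To make the structural version of the claim concrete, here is a small sketch that counts both kinds of pairs in a random DAG, using Ilya's definitions above and the fact that, with nothing conditioned on, two nodes are marginally d-connected exactly when they share an ancestor (counting a node as its own ancestor). The graph sizes and edge counts are arbitrary choices of mine.

    import itertools
    import random

    def random_dag(n_nodes, n_edges, seed=0):
        """Random DAG given as a parent map; edges go from lower to higher index, so acyclic."""
        rng = random.Random(seed)
        edges = rng.sample(list(itertools.combinations(range(n_nodes), 2)), n_edges)
        return {v: {i for i, j in edges if j == v} for v in range(n_nodes)}

    def ancestor_sets(parents):
        """For each node, its ancestors including itself (index order is a topological order)."""
        anc = {}
        for v in range(len(parents)):
            anc[v] = {v}.union(*(anc[p] for p in parents[v]))
        return anc

    def structural_counts(parents):
        anc = ancestor_sets(parents)
        causal = dependent = 0
        for a, b in itertools.combinations(range(len(parents)), 2):
            if a in anc[b] or b in anc[a]:
                causal += 1        # directed path: "structurally causes"
            if anc[a] & anc[b]:
                dependent += 1     # shared ancestor: a collider-free (marginal d-connecting) path
        return causal, dependent

    for n in (10, 20, 40):
        c, d = structural_counts(random_dag(n, 2 * n))
        print(f"{n} nodes: structurally causal pairs={c}, structurally dependent pairs={d}")

Every structurally causal pair is also structurally dependent, so the first count can never exceed the second; the question is how quickly the two diverge as the network grows.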

1.4 Heuristics & Biases

Now assuming the foregoing to be right (which I’m not sure about; in particular, I’m dubious that correlations in causal nets really do increase much faster than causal relations do), what’s the psychology of this? I see a few major ways that people might be incorrectly reasoning when they overestimate the evidence given by a correlation:

  • they might be aware of the imbalance between correlations and causation, but underestimate how much more common correlation becomes compared to causation.

    This could be shown by giving causal diagrams and seeing how elicited probability changes with the size of the diagrams: if the probability is constant, then the subjects would seem to be considering the relationship in isolation and ignoring the context.

    It might be remediable by showing a network and jarring people out of a simplistic comparison approach.
  • they might not be reasoning in a causal-net framework at all, but starting from the naive 33% base-rate you get when you treat all 3 kinds of causal relationships equally.

    This could be shown by eliciting estimates and seeing whether the estimates tend to look like base rates of 33% and modifications thereof.

    Sterner measures might be needed: could we draw causal nets with not just arrows showing influence but also another kind of arrow showing correlations? For example, the arrows could be drawn in black, inverse correlations drawn in red, and regular correlations drawn in green. The picture would be rather messy, but simply by comparing how few black arrows there are to how many green and red ones, it might visually make the case that correlation is much more common than causation.
  • alternately, they may really be reasoning causally and suffer from a truly deep & persistent cognitive illusion that when people say ‘correlation’ it’s really a kind of causation and don’t understand the technical meaning of ‘correlation’ in the first place (which is not as unlikely as it may sound, given examples like David Hestenes’s demonstration of the persistence of Aristotelian folk-physics in physics students, since all they had learned was to guess passwords; on the test used, see eg Halloun & Hestenes 1985 & Hestenes et al 1992); in which case it’s not surprising that if they think they’ve been told a relationship is ‘causation’, then they’ll think the relationship is causation. Ilya remarks:

    Pearl has this hypothesis that a lot of probabilistic fallacies/paradoxes/biases are due to the fact that causal and not probabilistic relationships are what our brain natively thinks about. So e.g. Simpson’s paradox is surprising because we intuitively think of a conditional distribution (where conditioning can change anything!) as a kind of “interventional distribution” (no Simpson’s type reversal under interventions: “Understanding Simpson’s Paradox”, Pearl 2014 [see also Pearl’s comments on Nielsen’s blog]).

    This hypothesis would claim that people who haven’t looked into the math just interpret statements about conditional probabilities as about “interventional probabilities” (or whatever their intuitive analogue of a causal thing is).

    This might be testable by trying to identify simple examples where the two approaches diverge, similar to Hestenes’s quiz for diagnosing belief in folk-physics.


This was originally posted to an open thread but due to the favorable response I am posting an expanded version here.

[LINK] Claustrum Stimulation Temporarily Turns Off Consciousness in an otherwise Awake Patient

34 shminux 04 July 2014 08:00PM

This paper, or more often the New Scientist's exposition of it, is being discussed online and is rather topical here. In a nutshell, stimulating one small but central area of the brain reversibly rendered one epilepsy patient unconscious without disrupting wakefulness. Impressively, this phenomenon has apparently been hypothesized before, just never tested (because it's hard and usually unethical). A quote from the New Scientist article (emphasis mine):

One electrode was positioned next to the claustrum, an area that had never been stimulated before.

When the team zapped the area with high frequency electrical impulses, the woman lost consciousness. She stopped reading and stared blankly into space, she didn't respond to auditory or visual commands and her breathing slowed. As soon as the stimulation stopped, she immediately regained consciousness with no memory of the event. The same thing happened every time the area was stimulated during two days of experiments (Epilepsy and Behavior, doi.org/tgn).

To confirm that they were affecting the woman's consciousness rather than just her ability to speak or move, the team asked her to repeat the word "house" or snap her fingers before the stimulation began. If the stimulation was disrupting a brain region responsible for movement or language she would have stopped moving or talking almost immediately. Instead, she gradually spoke more quietly or moved less and less until she drifted into unconsciousness. Since there was no sign of epileptic brain activity during or after the stimulation, the team is sure that it wasn't a side effect of a seizure.

If confirmed, this hints at several interesting points. For example, a complex enough brain is not sufficient for consciousness; a command-and-control structure, even a relatively small one, is required as well. The low-consciousness state of late-stage dementia sufferers might be due to damage specifically to the claustrum area, not just to overall brain deterioration. The researchers speculate that stimulating the area in vegetative-state patients might help "push them out of this state". From an AI research perspective, understanding the difference between wakefulness and consciousness might be interesting, too.

 

The representational fallacy

1 DanielDeRossi 25 June 2014 11:28AM

Basically Heather Dyke argues that metaphysicians are too often arguing from representations of reality (eg in language) to reality itself.

It looks to me like a variant of the mind projection fallacy. This might be the first book-length treatment the fallacy has gotten, though. What do people think?

 

See reviews here

https://www.sendspace.com/file/k5x8sy

https://ndpr.nd.edu/news/23820-metaphysics-and-the-representational-fallacy/

To give a bit of background, there's a debate between A-theorists and B-theorists in the philosophy of time.

A-theorists think time has ontological distinctions between past, present, and future.

B-theorists hold there is no ontological distinction between past, present, and future.

Dyke argues that a popular argument for the A-theory (that tensed language represents ontological distinctions) commits the representational fallacy. Bourne agrees, but points out that an argument Dyke uses for the B-theory commits the same fallacy.

A new derivation of the Born rule

13 MrMind 25 June 2014 03:07PM

This post is an explanation of a recent paper coauthored by Sean Carroll and Charles Sebens, in which they propose a derivation of the Born rule in the context of the Many Worlds approach to quantum mechanics. While the attempt itself is not fully successful, it contains interesting ideas and is thus worth knowing about.

A note to the reader: here I will try to lay out the preconditions and give only a very general view of their method, so you won’t find the paper’s equations here. My hope is that if, after reading this, you’re still curious about the real math, you will point your browser to the preceding link and read the paper for yourself.

If you are not totally new to LessWrong, you should know by now that the preferred interpretation of quantum mechanics (QM) around here is the Many Worlds Interpretation (MWI), which denies the collapse of the wave-function and postulates a distinct reality (that is, a branch) for every basis state composing a quantum superposition.

MWI historically suffered from three problems: the apparent absence of macroscopic superpositions, the preferred-basis problem, and the derivation of the Born rule. The development of decoherence famously solved the first and, to a lesser degree, the second problem, but the third remains one of the most poorly understood aspects of the theory.

Quantum mechanics assigns an amplitude, a complex number, to each branch of a superposition, and postulates that the probability that an observer finds the system in that branch is the squared norm of the amplitude. This, very briefly, is the content of the Born rule (for pure states).
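As a toy illustration (mine, not the paper's), the rule is just a squared modulus; here is a minimal numerical check for an invented two-branch state:

    # Born rule for a pure state: P(branch i) = |amplitude_i|^2.
    # The amplitudes below are invented for illustration.
    import numpy as np

    amplitudes = np.array([0.6, 0.8j])     # |psi> = 0.6|0> + 0.8i|1>
    probs = np.abs(amplitudes) ** 2        # -> [0.36, 0.64]
    assert np.isclose(probs.sum(), 1.0)    # the state is normalized
    print(probs)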

Quantum mechanics remains agnostic about the ontological status of both amplitudes and probabilities, but MWI, assigning a reality status to every branch, demotes ontological uncertainty (which branch will become real after observation) to indexical uncertainty (which branch the observer will find itself correlated to after observation).

Simple indexical uncertainty, though, cannot reproduce the exact predictions of QM: by the principle of indifference, if you have no information privileging any member of a set of hypotheses, you should assign equal probability to each one. This leads to forming a probability distribution by counting the branches, which only in special circumstances coincides with the amplitude-derived probabilities. This discrepancy, and how to account for it, constitutes the Born rule problem in MWI.
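A small worked example of the discrepancy (my own numbers, not the paper's): take a qubit whose branches have amplitudes sqrt(1/3) and sqrt(2/3). Naive branch counting assigns 1/2 to each branch, while the Born rule assigns 1/3 and 2/3:

    # Branch counting vs. the Born rule for an unequal-amplitude qubit.
    # Amplitudes chosen so that the discrepancy is obvious.
    import numpy as np

    amplitudes = np.array([np.sqrt(1/3), np.sqrt(2/3)])

    born = np.abs(amplitudes) ** 2                           # [1/3, 2/3]
    counting = np.full(len(amplitudes), 1/len(amplitudes))   # [1/2, 1/2]

    print(born, counting)   # the two agree only when all amplitudes are equal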

There have of course been many attempts at solving it; for a summary, I quote the article directly:

One approach is to show that, in the limit of many observations, branches that do not obey the Born Rule have vanishing measure. A more recent twist is to use decision theory to argue that a rational agent should act as if the Born Rule is true. Another approach is to argue that the Born Rule is the only well-defined probability measure consistent with the symmetries of quantum mechanics.

These proposals have failed to uniformly convince physicists that the Born rule problem is solved, and the paper by Carroll and Sebens is another attempt to reach a solution.

Before describing their approach, there are some assumptions that have to be clarified.

The first, and this is good news, is that they are treating probabilities as rational degrees of belief about a state of the world. They are thus using a Bayesian approach, although they never call it that.

The second is that they’re using self-locating indifference, again from a Bayesian perspective.
Self-locating indifference is the principle that you should assign equal probabilities to finding yourself in different places in the universe, if you have no information that distinguishes the alternatives. For a Bayesian, this is almost trivial: self-locating propositions are propositions like any other, so the principle of indifference applies to them as it does to any other prior information. This holds for quantum branches too.

The third assumption is where they start to deviate from pure Bayesianism: it’s what they call the Epistemic Separability Principle, or ESP. In their words:

the outcome of experiments performed by an observer on a specific system shouldn’t depend on the physical state of other parts of the universe.

This is a kind of Markov condition: the requirement that the system screen the interaction between the observer and the observed system from every possible influence of the environment.
It is obviously false for many ways of partitioning a system into an experiment and an environment, but rather than taking it as a Principle, we can treat it as a defining assumption: a set-up counts as an experiment only if it obeys the condition.
In the context of QM, this condition amounts to splitting the universal wave-function into two components, the experiment and the environment, with no entanglement between the two, and to considering only interactions that factor as a product of an evolution for the environment and an evolution for the experiment. In that case, the environment's evolution acts as the identity operator on the experiment, and does not affect the behavior of the experiment's wave-function.
Thus, their formulation requires that the probability that an observer finds itself in a certain branch after a measurement is independent of the operations performed on the environment.
Note, though, an unspoken but very important point: probabilities of this kind depend only on the superposition structure of the experiment.
A probability, being an abstract degree of belief, can depend on all sorts of prior information. With their quantum version of ESP, Carroll and Sebens are declaring that, in a factored set-up, the probabilities of a subsystem do not depend on the information one has about the environment. In effect, they are equating factorization with lack of logical connection.
This is of course true in quantum mechanics, but is a significant burden in a pure Bayesian treatment.
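A toy numerical check of this point (my own sketch in plain linear algebra, not the paper's formalism): a unitary that factors as U_env tensor I_exp leaves the reduced state of the experiment, and hence any outcome probabilities computed from it, unchanged.

    # ESP, toy version: a unitary acting only on the environment
    # (U_env tensor I_exp) cannot change the reduced state of the experiment,
    # hence cannot change its outcome probabilities. Dimensions and states
    # are invented for illustration.
    import numpy as np

    def random_unitary(n, seed=0):
        rng = np.random.default_rng(seed)
        m = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
        q, _ = np.linalg.qr(m)
        return q

    env = np.array([1, 0, 0], dtype=complex)                       # 3-level environment
    expt = np.array([np.sqrt(1/3), np.sqrt(2/3)], dtype=complex)   # qubit "experiment"
    psi = np.kron(env, expt)                                       # joint state

    U = np.kron(random_unitary(3), np.eye(2))   # acts only on the environment
    psi2 = U @ psi

    def reduced_experiment(psi):
        m = psi.reshape(3, 2)        # index order: (environment, experiment)
        return m.T @ m.conj()        # trace out the environment

    print(np.round(reduced_experiment(psi), 6))
    print(np.round(reduced_experiment(psi2), 6))   # identical reduced state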

That said, let’s turn to their setup.

They imagine a system in a superposition of basis states, which first interacts and decoheres with an environment, and is then perceived by an observer. This sequence is crucial: the Carroll-Sebens move can only be applied once the system has already decohered with a sufficiently large environment.
I say “sufficiently large” because the next step is to consider a unitary transformation on the “system+environment” block. This transformation needs to satisfy two conditions:

- it respects ESP, in that it has to factor as an identity transformation on the “observer+system” block;

- it needs to spread the measure of each branch of the original superposition evenly over distinct branches in the decohered block, in proportion to the branches' original relative measures.

Then, by simply rearranging the labels of the decohered basis, one can show that the correct probabilities come out of the indifference principle, in the very same way that the principle is used to derive the uniform probability distribution in the second chapter of Jaynes’ Probability Theory.

As an example, consider a superposition of a quantum bit, and say that one branch has a higher amplitude than the other by a factor of the square root of 2 (i.e., twice the measure). In this case the environment needs to have at least 8 different basis states to be relabeled in such a way as to make the indifference principle work.
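Here is a cartoon of the counting step for that example (my own illustration; the paper does this with a unitary acting on the environment rather than a literal list). With squared amplitudes 1/3 and 2/3, splitting each coarse branch into equal-weight sub-branches leaves three branches of equal measure, and indifference over them reproduces the Born weights:

    # Cartoon of the relabeling trick: split each coarse branch into
    # equal-amplitude sub-branches (using extra environment states), then count.
    # Weights and labels are invented for illustration.
    import math
    from fractions import Fraction

    weights = [Fraction(1, 3), Fraction(2, 3)]   # squared amplitudes; must be rational here

    denom = math.lcm(*(w.denominator for w in weights))   # number of equal-weight branches

    fine = []
    for label, w in zip(["down", "up"], weights):
        fine += [label] * int(w * denom)         # this branch gets w*denom sub-branches

    assert len(fine) == denom                    # here: 3 equal-weight branches
    for label in ("down", "up"):
        print(label, Fraction(fine.count(label), len(fine)))
    # down 1/3, up 2/3 -- counting fine-grained branches reproduces the Born rule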

In theory, this method can only show that the Born rule is valid for amplitudes whose ratios are square roots of rational numbers. Again I quote the paper for their conclusion:

however, since this is a dense set, it seems reasonable to conclude that the Born Rule is established. 

Evidently, this approach suffers from a number of limitations. The first and most evident is that it works only in situations where the system to be observed has already decohered with an environment. It is not applicable to, say, a situation where a detector reads a quantum superposition directly, e.g. in a Stern-Gerlach experiment.

The second limitation, although less serious, is that it can work only when the system to be observed decoheres with an environment that has sufficiently many basis states to distribute the relative measure over different branches. This number, for a transcendental amplitude, is bound to be infinite.
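To see why, consider approximating an irrational squared amplitude by rationals (my own illustration): the number of equal-weight branches the trick requires is the denominator of the approximation, and it grows without bound as the approximation improves.

    # The branch-counting trick needs as many equal-weight branches as the
    # denominator of the (rational) squared amplitude; for an irrational
    # weight, better approximations need ever larger denominators.
    # The target value is invented for illustration.
    import math
    from fractions import Fraction

    target = 1 / math.pi                          # an irrational squared amplitude
    for max_den in (10, 100, 10_000, 1_000_000):
        approx = Fraction(target).limit_denominator(max_den)
        print(max_den, approx, "branches needed:", approx.denominator)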

The third limitation is that it can only work if we are allowed to act on the environment in a way that leaves the amplitudes of the interaction between the system and the observer untouched.

All of these, understood here as limitations, can naturally be reversed and read as defining conditions: the Born rule is valid only within those limits.

 

I’ll leave it to you to determine whether this constitutes a sufficient answer to the Born rule problem in MWI.
