Don't You Care If It Works? - Part 1

Jacob Falkovich


29th Jul 2015


Part 1 - Epistemic

Prologue - other people

Psychologists at Harvard showed that most people have implicit biases about several groups. Some other Harvard psychologists were subjects of this study proving that psychologists undervalue CVs with female names. All Harvard psychologists have probably heard about the effect of black names on resumes since even we have. Surely every psychology department in this country starting with Harvard will only review CVs with the names removed? Fat chance.

Caveat lector et scriptor

A couple weeks ago I wrote a poem that makes aspiring rationalists feel better about themselves. Today I'm going to undo that. Disclaimers: This is written with my charity meter set to 5%. Every other paragraph is generalizing from anecdotes and typical-mind-fallacying. A lot of the points I make were made before and better. You should really close this tab and read those other links instead, I won't judge you. I'm not going to write in an academic style with a bibliography at the end, I'm going to write in the sarcastic style my blog would have if I weren't too lazy to start one. I'm also not trying to prove any strong empirical claims, this is BYOE: bring your own evidence. Imagine every sentence starting with "I could be totally wrong" if it makes it more digestible. Inasmuch as any accusations in this post are applicable, they apply to me as well. My goal is to get you worried, because I'm worried. If you read this and you're not worried, you should be. If you are, good!

Disagree to disagree

Edit: in the next paragraph, "Bob" was originally an investment advisor. My thanks to 2irons and Eliezer who pointed out why this is literally the worst example of a job I could give to argue my point.

Is 149 a prime? Take as long as you need to convince yourself (by math or by Google) that it is. Is it unreasonable to have 99.9...% confidence with quite a few nines (and an occasional 7) in there? Now let's say that you have a tax accountant, Bob, a decent guy who seems to be doing a decent job filing your taxes. You start chatting with Bob and he reveals that he's pretty sure that 149 isn't a prime. He doesn't know two numbers whose product is 149, it just feels unprimely to him. You try to reason with him, but he just chides you for being so arrogant in your confidence: can't you just agree to disagree on this one? It's not like either of you is a number theorist. His job is to not get you audited by the IRS, which he does, not to factorize numbers. Are you a little bit worried about trusting Bob with your taxes? What if he actually claimed to be a mathematician?
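If you'd rather convince yourself by math than by Google, here is a minimal sketch of the check in Python. The function name is mine, and trial division is just the simplest method that works for a number this small: if 149 had a factor, one of the pair would be at most its square root (a bit over 12), so only a handful of candidates need trying.

```python
def is_prime(n: int) -> bool:
    """Trial division: n is prime if no integer d with d*d <= n divides it."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False  # found a factor, so n is composite
        d += 1
    return True


print(is_prime(149))  # True
```

Bob's gut feeling about unprimeliness doesn't survive eleven divisions.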

A few weeks ago I started reading Beautiful Probability and immediately thought that Eliezer is wrong to claim that the stopping rule doesn't matter to inference. I dropped everything and spent the next three hours convincing myself that the stopping rule doesn't matter and that I agree with Jaynes and Eliezer. As luck would have it, soon after that the stopping rule question was the topic of discussion at our local LW meetup. A couple people agreed with me and a couple didn't and tried to prove it with math, but most of the room seemed to hold a third opinion: they disagreed but didn't care to find out. I found that position quite mind-boggling. Ostensibly, most people are in that room because we read the sequences and thought that this EWOR (Eliezer's Way Of Rationality) thing is pretty cool. EWOR is an epistemology based on the mathematical rules of probability, and the dude who came up with it apparently does mathematics for a living trying to save the world. It doesn't seem like a stretch to think that if you disagree with Eliezer on a question of probability math, a question that he considers so obvious it requires no explanation, that's a big frickin' deal!

Authority screens off that other authority you heard from afterwards

Opinion change

This is a chart that I made because I got excited about learning ggplot2 in R. On the right side of the chart are a lot of bright red dots below the very top who believe in MIRI but also read the quantum physics sequence and don't think that MWI is very likely. Some of them understood the question of P(MWI) to be about whether MWI is the one and only exact truth, but I'm sure that several of them read it the way I did, roughly as: 1-P(collapse is true given current evidence). A lot of these people are congratulating themselves on avoiding cultishness. In the comments they mention other bloggers (or maybe even physicists!) who think that collapse is totally Beatles and MWI is Bieber.

Hold on, why did Eliezer even take all this time to write a huge quantum physics sequence? Here's how I see it: It's not to settle a point about some scientific dispute. It's to show that a person with some spare time and a little above average IQ can figure out something beyond people of massive authority, on the very subject of their authority, even if that subject is as scary as QM. You know what’s the median IQ of the dots on the above chart? 139 points! 139!!! And you know who has spare time? Someone who read 100% of the sequences!

Why should you even care about this? Because Eliezer does. He's so fanatically confident about MWI not for its own sake, it's because he's saying (at least to my ears): "This is my EWOR epistemology. It may not make you skinny and rich, but if it will do just one thing for you it will prevent you from being very very confident in beliefs that could be proven wrong. So if I'm very very confident about a question of science or math, I'm sticking my neck out and betting my entire epistemology on being right, so you can check it for yourself". In fact, Eliezer will totally judge your epistemology based on your belief in MWI. You won't do the same to him when he asks you to?

I'm a bright red dot on the chart above. Here are two statements I believe from authority because it is beyond my cognitive powers and domain knowledge to assess them as arguments: "strong AI will break out of any box humans can design", "CEV is the best basis for programming morality in AI". I believe them because I tested Eliezer on stopping rules and MWI and 20 other things, and came away convinced each time. My confidence of 98% in MWI is an upper bound on my confidence in whatever Eliezer tells me about anything. If I become convinced that collapse is correct or that stopping rules affect Bayesian inference, I'll email Nate Soares at once and ask for a refund on my MIRI donation.

By the way, this would be a good time to stop reading and give MIRI some money. That’s OK, I'll wait.

The good book

A man comes to you with a book and says that it will grant you the only wisdom worth having, and as a side effect it may save your eternal soul. You read the book cover to cover and decide that the ideas you thought are nice are probably true, the ones that you didn't aren't, and you really like the bit with horses. Everyone on LW makes fun of you for claiming to take seriously something you don’t. Y’all see where this is going, don't you? Yes, it's fun to read the sequences for the "insight porn". It's also fun to read the Old Testament for the porn porn. But, maybe it could be more? Wouldn't it be kinda cool if you could read a book and become an epistemic superman, showing experts to be wrong in their own domains and being proven right? Or maybe some important questions are going to come up in your life and you'll need to know the actual true answers? Or at least some questions you can bet $20 on with your friends and win?

Don't you want to know if this thing even works?

To be continued

Part 2 is here. In it: whining is ceased, arguments are argued about, motivations are explained, love is found, and points are taken.


60 Comments


EHeller (11y)

It is true that optional stopping won't change Bayes rule updates (which is easy enough to show). It's also true that optional stopping does affect frequentist tests (different sampling distributions). The broader question is "which behavior is better?"

p-hacking is when statisticians use optional stopping to make their results look more significant (by not reporting their stopping rule). As it turns out you in fact can "posterior hack" Bayesians - http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2374040

Edit: Also Deborah Mayo's Error Statistics book contains a demonstration that optional stopping can cause a Bayesian to construct confidence intervals that never contain the true parameter value. Weirdly, those Bayesians can be posterior hacked even if you tell them about the stopping rule, because they don't think it matters.
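The "easy enough to show" part can be sketched in a few lines of Python. This is only an illustration under assumptions of my own choosing, not anything from the linked paper or from Mayo's book: the data (3 heads in 12 flips), the uniform prior on a grid of candidate coin biases, and the function names are all hypothetical. It compares two stopping rules, "flip exactly 12 times" and "flip until the 3rd head", whose likelihoods differ only by a constant factor that cancels when the posterior is normalized. It says nothing about which behavior is better, which is the broader question raised above.

```python
from fractions import Fraction
from math import comb

# Hypothetical data: k heads observed in n flips, with the last flip a head.
k, n = 3, 12

# Candidate coin biases with a uniform prior (exact fractions avoid rounding noise).
grid = [Fraction(i, 20) for i in range(1, 20)]


def posterior(likelihood):
    """Posterior over the grid: likelihood times uniform prior, normalized."""
    weights = [likelihood(p) for p in grid]
    total = sum(weights)
    return [w / total for w in weights]


def fixed_n(p):
    """Stopping rule A: flip exactly n times (binomial likelihood)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def until_k_heads(p):
    """Stopping rule B: flip until the k-th head (negative binomial likelihood)."""
    return comb(n - 1, k - 1) * p**k * (1 - p) ** (n - k)


post_a = posterior(fixed_n)
post_b = posterior(until_k_heads)
print(post_a == post_b)  # True
```

A frequentist test would treat the two cases differently, because the sampling distributions (which outcomes count as "at least this extreme") are not the same under the two rules.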

Anders_H (11y)

I comment on this discussion here: http://lesswrong.com/r/discussion/lw/mke/on_stopping_rules/

Richard_Kennaway (11y)

That is not my understanding of the term "optional stopping" (nor, more significantly, is it that of Jaynes). Optional stopping is the process of collecting data, computing your preferred measure of resultiness as you go, and stopping the moment it passes your criterion for reporting it, whether that is p<0.05, or a Bayes factor above 3, or anything else. (If it never passes the criterion, you just never report it.)

That is but one of the large arsenal of tools available to the p-hacker: computing multiple statistics from the data in the hope of finding one that passes the criterion, thinking up more hypotheses to test, selective inclusion or omission of "outliers", fitting a range of different models, and so on. And of these, optional stopping is surely the least effective, for as Jaynes remarks in "Probability Theory as Logic", it is practically impossible to sample long enough to produce substantial support for a hypothesis deviating substantially from the truth.

All of those other methods of p-hacking involve concealing the real hypothesis, which is the collection of all the hypotheses that were measured against the data. It is like dealing a bridge hand and showing that it supports astoundingly well the hypothesis that that bridge hand would be dealt. In machine learning terms, the hypothesis is being covertly trained on the data, then tested on how well it fits the data. No measure of the latter, whether frequentist or Bayesian, is a measure of how well the hypothesis will fit new data.

Jacob Falkovich (11y)

Since I don't want this to spiral into another stopping rule argument, allow me to try and dissolve a confusing point that the discussions get stuck on. What makes the Bayesian "lose" in the cases proposed by Mayo and Simonsohn isn't the inference, it's the scoring rule. A Bayesian scores himself on total calibration; "number of times my 95% confidence interval includes the truth" is just a small part of it. You can generate an experiment that has a high chance (let's say 99%) of making a Bayesian have a 20:1 likelihood ratio in favor of some hypothesis. By conservation of expected evidence, the same experiment might have a 1% chance of generating close to a 2000:1 likelihood ratio against that same hypothesis. A frequentist could never be as sure of anything; this occasional 2000:1 confidence is the Bayesian's reward. If you rig the rules to view something about 95% confidence intervals as the only measure of success, then the frequentist's decision rule about accepting hypotheses at a 5% p-value wins, but it's not his inference that magically becomes superior.

Allow me to steal an analogy from my friend Simon: I'm running a Bayesian Casino in Vegas. Deborah Mayo comes to my casino every day with $31. She bets $1 on a coin flip, then bets $2 if she loses, then $4 and so on until she either wins $1 or loses all $31 if 5 flips go against her. I obviously think that by conservation of expected money in a coin flip this deal is fair, but Prof. Mayo tells me that I'm a sucker because I lose more days than I win. I tell her that I care about dollars, not days, but she replies that if she had more money in her pocket, she could make sure I have a losing day with arbitrarily high probability! I smile and ask her if she wants a drink.
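For anyone who wants to check the casino's books, here is a minimal sketch of the arithmetic in Python, assuming a fair coin and even-money payouts (the function name and its parameter are mine). With 5 doublings the bettor walks in with 1 + 2 + 4 + 8 + 16 = $31 and loses all of it only if every flip goes against her.

```python
from fractions import Fraction


def doubling_day(max_flips: int):
    """One day of doubling bets on a fair coin, from the bettor's side.

    Returns (probability she wins $1, her bankroll at risk, her expected profit).
    """
    p_lose_all = Fraction(1, 2) ** max_flips  # every flip goes against her
    bankroll = 2**max_flips - 1               # 1 + 2 + 4 + ... staked in total
    p_win = 1 - p_lose_all                    # any single win nets exactly $1
    expected = p_win * 1 - p_lose_all * bankroll
    return p_win, bankroll, expected


p_win, bankroll, expected = doubling_day(5)
print(p_win, bankroll, expected)  # 31/32 31 0
```

She wins on 31 days out of 32 and the expected profit is still exactly zero. Raising `max_flips` pushes the share of winning days as close to 1 as you like, and the expectation stays at zero: days, not dollars.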
