Dark Arts of Rationality

136 So8res 19 January 2014 02:47AM

Today, we're going to talk about Dark rationalist techniques: productivity tools which seem incoherent, mad, and downright irrational. These techniques include:

  1. Willful Inconsistency
  2. Intentional Compartmentalization
  3. Modifying Terminal Goals

I expect many of you are already up in arms. It seems obvious that consistency is a virtue, that compartmentalization is a flaw, and that one should never modify one's terminal goals.

I claim that these 'obvious' objections are incorrect, and that all three of these techniques can be instrumentally rational.

In this article, I'll promote the strategic cultivation of false beliefs and condone mindhacking on the values you hold most dear. Truly, these are Dark Arts. I aim to convince you that sometimes, the benefits are worth the price.

continue reading »

How the Grinch Ought to Have Stolen Christmas

40 Quirinus_Quirrell 25 December 2013 08:00PM

On Dec. 24, 1957, a Mr. T. Grinch attempted to disrupt Christmas by stealing associated gifts and decorations. His plan failed, the occupants of Dr. Seuss's narrative remained festive, and Mr. Grinch himself succumbed to cardiac hypertrophy. To help others avoid repeating his mistakes, I've written a brief guide to properly disrupting holidays. Holiday-positive readers should read this with the orthogonality thesis in mind.

Fighting Christmas is tricky, because the obvious strategy - making a big demoralizing catastrophe - doesn't work. No matter what happens, the media will put the word Christmas in front of it and convert your scheme into even more free advertising for the holiday. It'll be a Christmas tragedy, a Christmas earthquake, a Christmas wave of foreclosures. That's no good; attacking Christmas takes more finesse.

The first thing to remember is that, whether you're stealing a holiday or a magical artifact of immense power, it's almost always a good idea to leave a decoy in its place. When people notice that something important is missing, they'll go looking to find or replace it. This rule can be generalized from physical objects to abstractions like sense of community. T. Grinch tried to prevent community gatherings by vandalizing the spaces where they would've taken place. A better strategy would've been to promise to organize a Christmas party, then skip the actual organizing and leave people to sit at home by themselves. Unfortunately, that approach doesn't scale, but someone came up with a very clever alternative: encourage people to watch Christmas-themed films instead of talking to each other, which achieves almost as much erosion of community without the backlash.

I'd like to particularly applaud Raymond Arnold, for inventing a vaguely-Christmas-like holiday in December, with no gifts, and death (rather than cheer) as its central theme [1]. I really wish it didn't involve so much singing and community, though. I recommend raising the musical standards; people who can't sing at studio-recording quality should not be allowed to sing at all.

Gift-giving traditions are particularly important to stamp out, but stealing gifts is ineffective because they're usually cheap and replaceable. A better approach would've been to promote giving undesirable gifts, such as religious sculptures and fruitcake. Even better would be to convince the Mayor of Whoville to enact bad economic policies, and grind the Whos into a poverty that would make gift-giving difficult to sustain. Had Mr. Grinch pursued this strategy effectively, he could've stolen Christmas and Birthdays and gotten himself a Nobel Prize in Economics [2].

Finally, it's important to avoid rhyming. This is one of those things that should be completely obvious in hindsight, with a little bit of genre savvy; villains like us win much more often in prose and in life than we do in verse.

And with that, I'll leave you with a few closing thoughts. If you gave presents, your friends are disappointed with them. If any friends didn't give you presents, it's because they don't care, and if any friends did give you presents, those presents are cheap and lame for the same reason. If you have a Christmas tree, it's ugly, and if it's snowing, the universe is trying to freeze you to death.

Merry Christmas!

 

[1] I was initially concerned that the Solstice would pattern-match and mutate into a less materialistic version of Christmas, but running a Kickstarter campaign seems to have addressed that problem.

[2] This is approximately the reason why Alfred Nobel specifically opposed the existence of that prize.

 

Produce / Consume Ratios

13 Vaniver 21 April 2013 06:38PM

I've been thinking about this a bit recently, and thought I'd do a dump of evidence and conjecture, and see what Less Wrong had to say.

There are lots of areas of life where activities can be partitioned into either producing products or consuming products. For those areas, it may be worthwhile to calculate one's Produce / Consume Ratio (PCR) and also contemplate what the optimal PCR is for that area.

continue reading »

On the Importance of Systematic Biases in Science

26 gwern 20 January 2013 09:39PM

From pg812-1020 of Chapter 8 “Sufficiency, Ancillarity, And All That” of Probability Theory: The Logic of Science by E.T. Jaynes:

The classical example showing the error of this kind of reasoning is the fable about the height of the Emperor of China. Supposing that each person in China surely knows the height of the Emperor to an accuracy of at least ±1 meter, if there are N=1,000,000,000 inhabitants, then it seems that we could determine his height to an accuracy at least as good as

1 m / √N ≈ 3 × 10^-5 m    (8-49)

merely by asking each person’s opinion and averaging the results.

The absurdity of the conclusion tells us rather forcefully that the rule is not always valid, even when the separate data values are causally independent; it requires them to be logically independent. In this case, we know that the vast majority of the inhabitants of China have never seen the Emperor; yet they have been discussing the Emperor among themselves and some kind of mental image of him has evolved as folklore. Then knowledge of the answer given by one does tell us something about the answer likely to be given by another, so they are not logically independent. Indeed, folklore has almost surely generated a systematic error, which survives the averaging; thus the above estimate would tell us something about the folklore, but almost nothing about the Emperor.

We could put it roughly as follows:

error in estimate ≈ S ± R/√N    (8-50)

where S is the common systematic error in each datum, R is the RMS ‘random’ error in the individual data values. Uninformed opinions, even though they may agree well among themselves, are nearly worthless as evidence. Therefore sound scientific inference demands that, when this is a possibility, we use a form of probability theory (i.e. a probabilistic model) which is sophisticated enough to detect this situation and make allowances for it.

As a start on this, equation (8-50) gives us a crude but useful rule of thumb; it shows that, unless we know that the systematic error is less than about one-third of the random error, we cannot be sure that the average of a million data values is any more accurate or reliable than the average of ten.[1] As Henri Poincaré put it: “The physicist is persuaded that one good measurement is worth many bad ones.” This has been well recognized by experimental physicists for generations; but warnings about it are conspicuously missing in the “soft” sciences whose practitioners are educated from those textbooks.

Or pg1019-1020 Chapter 10 “Physics of ‘Random Experiments’”:

…Nevertheless, the existence of such a strong connection is clearly only an ideal limiting case unlikely to be realized in any real application. For this reason, the law of large numbers and limit theorems of probability theory can be grossly misleading to a scientist or engineer who naively supposes them to be experimental facts, and tries to interpret them literally in his problems. Here are two simple examples:

  1. Suppose there is some random experiment in which you assign a probability p for some particular outcome A. It is important to estimate accurately the fraction f of times A will be true in the next million trials. If you try to use the laws of large numbers, it will tell you various things about f; for example, that it is quite likely to differ from p by less than a tenth of one percent, and enormously unlikely to differ from p by more than one percent. But now, imagine that in the first hundred trials, the observed frequency of A turned out to be entirely different from p. Would this lead you to suspect that something was wrong, and revise your probability assignment for the 101’st trial? If it would, then your state of knowledge is different from that required for the validity of the law of large numbers. You are not sure of the independence of different trials, and/or you are not sure of the correctness of the numerical value of p. Your prediction of f for a million trials is probably no more reliable than for a hundred.
  2. The common sense of a good experimental scientist tells him the same thing without any probability theory. Suppose someone is measuring the velocity of light. After making allowances for the known systematic errors, he could calculate a probability distribution for the various other errors, based on the noise level in his electronics, vibration amplitudes, etc. At this point, a naive application of the law of large numbers might lead him to think that he can add three significant figures to his measurement merely by repeating it a million times and averaging the results. But, of course, what he would actually do is to repeat some unknown systematic error a million times. It is idle to repeat a physical measurement an enormous number of times in the hope that “good statistics” will average out your errors, because we cannot know the full systematic error. This is the old “Emperor of China” fallacy…

Indeed, unless we know that all sources of systematic error - recognized or unrecognized - contribute less than about one-third the total error, we cannot be sure that the average of a million measurements is any more reliable than the average of ten. Our time is much better spent in designing a new experiment which will give a lower probable error per trial. As Poincaré put it, “The physicist is persuaded that one good measurement is worth many bad ones.”[2] In other words, the common sense of a scientist tells him that the probabilities he assigns to various errors do not have a strong connection with frequencies, and that methods of inference which presuppose such a connection could be disastrously misleading in his problems.

I excerpted & typed up these quotes for use in my DNB FAQ appendix on systematic problems; the applicability of Jaynes’s observations to things like publication bias is obvious. See also http://lesswrong.com/lw/g13/against_nhst/
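As a rough sketch of the point (illustrative values only; the offset S = 1/3 and noise scale R = 1 simply mirror the numbers in the footnote below): averaging shrinks the random part of the error like R/√N, but a shared systematic offset survives the averaging untouched.

```python
import random
import statistics

def rms_error_of_average(n, S=1/3, R=1.0, trials=500):
    """RMS error of the mean of n measurements of a quantity whose true value
    is 0, when every measurement shares the systematic offset S and also has
    independent Gaussian noise of RMS size R."""
    means = [statistics.mean(S + random.gauss(0, R) for _ in range(n))
             for _ in range(trials)]
    return (sum(m * m for m in means) / trials) ** 0.5

for n in (10, 100, 10_000):
    print(f"N={n:>6}: RMS error of the average ~ {rms_error_of_average(n):.3f}")
# The R/sqrt(N) term shrinks toward zero, but the result floors near S = 1/3:
# more logically dependent data tells you about the shared folklore, not the Emperor.
```

Setting S = 0 in the same sketch recovers the familiar 1/√N improvement, which is exactly the behavior that logical dependence breaks.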


  1. If I am understanding this right, Jaynes’s point here is that the random error shrinks towards zero as N increases, but this error is added onto the “common systematic error” S, so the total error approaches S no matter how many observations you make, and this can force the total error up as well as down (variability, in this case, actually being helpful for once). So for example, take S = 1/3 and R = 1, so the error is 1/3 ± 1/√N; with N=100, the pessimistic estimate is 0.43; with N=1,000,000 it's 0.334; and with N=1,000,000,000 it equals 0.333365, etc., never going below the original systematic error of 1/3. This leads to the unfortunate consequence that the likely error of N=10 is 0.017<x<0.64956 while for N=1,000,000 it is the similar range 0.017<x<0.33433 - so it is possible that the estimate could be exactly as good (or bad) for the tiny sample as compared with the enormous sample, since neither can do better than 0.017!

  2. Possibly this is what Lord Rutherford meant when he said, “If your experiment needs statistics you ought to have done a better experiment”.

From First Principles

48 [deleted] 27 September 2012 07:04PM

Related: Truly a Part of You, What Data Generated That Thought

Some Case Studies

The other day my friend was learning to solder and he asked an experienced hacker for advice. The hacker told him that because heat rises, you should apply the soldering iron underneath the work to maximize heat transfer. Seems reasonable, logically inescapable, even. When I heard of this, I thought through to why heat rises and when, and saw that it was not so. I don't remember the conversation, but the punchline is that hot things become less dense, and less dense things float; floating requires a surrounding fluid, so "heat rises" only describes convection. In the case of soldering, the primary mode of heat transfer is conduction through the liquid metal, so to maximize heat transfer, get the tip wet before you stick it in, and don't worry about position.

This is a case of surface reasoning failing because the heuristic (heat rises) was not truly a part of my friend or the random hacker. I want to focus on the actual 5-second skill of going back To First Principles that catches those failures.

Here's another; watch for the 5 second cues and responses: A few years ago, I was building a robot submarine for a school project. We were in the initial concept design phase, wondering what it should look like. My friend Peter said, "It should be wide, because stability is important". I noticed the heuristic "low and wide is stable" and thought to myself "Where does that come from? When is it valid?". In the case of catamarans or sports cars, wide is stable because it increases the lever arm between restoring force (gravity) and support point (wheel or hull), and low makes the tipping point harder to reach. Under water, there is no tipping point, and things are better modeled as hanging from their center of volume. In other words, underwater, the stability criterion is vertical separation, instead of horizontal separation. (More precisely, you can model the submarine as a damped pendulum, and notice that you want to tune the parameters for approximately critical damping). We went back to First Principles and figured out what actually mattered, then went on to build an awesome robot.
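To make the damped-pendulum picture concrete, here's a minimal sketch (all numbers invented for illustration): model the small-angle roll as theta'' + 2·zeta·omega0·theta' + omega0²·theta = 0, where omega0 is set by the vertical offset between the centers of buoyancy and mass and zeta is the damping ratio, then compare settling times for different zeta.

```python
def settle_time(zeta, omega0=1.0, theta0=0.5, dt=0.001, t_max=80.0, tol=0.02):
    """Simulate theta'' + 2*zeta*omega0*theta' + omega0**2*theta = 0 and return
    the last time |theta| exceeds tol*theta0 (a crude settling time)."""
    theta, omega, t, last_bad = theta0, 0.0, 0.0, 0.0
    while t < t_max:
        alpha = -2 * zeta * omega0 * omega - omega0 ** 2 * theta
        omega += alpha * dt          # semi-implicit Euler step
        theta += omega * dt
        t += dt
        if abs(theta) > tol * theta0:
            last_bad = t
    return last_bad

for zeta in (0.2, 1.0, 5.0):         # underdamped, roughly critical, overdamped
    print(f"damping ratio {zeta}: settles in about {settle_time(zeta):.1f} s")
# Near-critical damping rights itself fastest without overshoot; the knobs that
# matter in this model are the restoring frequency and the damping, echoing the
# "vertical separation, not width" conclusion above.
```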

Let's review what happened. We noticed a heuristic or bit of qualitative knowledge (wide is stable), and asked "Why? When? How much?", which led us to the quantitative answer, which told us precisely what matters (critical damping) and what does not (width, maximizing restoring force, etc.).

A more Rationality-related example: I recently thought about Courage, and the fact that most people are too afraid of risk (beyond just utility concavity), and as a heuristic we should be failing more. Around the same time, I'd been hounding Michael Vassar (at minicamp) for advice. One piece that stuck with me was "use decision theory". Ok, Courage is about decisions; let's go.

"You should be failing more", they say. You notice the heuristic, and immediately ask yourself "Why? How much more? Prove it from first principles!" "Ok", your forked copy says. "We want to take all actions with positive expected utility. By the law of large numbers, in (non-black-swan) games we play a lot of, observed utility should approximate expected utility, which means you should be observing just as much fail as win on the edge of what you're willing to do. Courage is being well calibrated on risk; If your craziest plans are systematically succeeding, you are not well calibrated and you need to take more risks." That's approximately quantitative, and you can pull out the equations to verify if you like.

Notice all the subtle qualifications that you may not have guessed from the initial advice: the stakes are non-Pascalian (so the law of large numbers applies), you can observe utility, it's your craziest plans that count, and just as much fail as win (not just as many, and not more). (Example application: one of the best matches for those conditions is social interaction.) Those of you who actually busted out the equations and saw the math of it, notice how much more you understand than I am able to communicate with just words.
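Here is a toy simulation of that argument, with invented numbers: take every gamble with nonnegative expected utility, then look only at the marginal ones (EV near zero). If your probabilities are calibrated, the utility won and lost on those marginal gambles should roughly cancel.

```python
import random

random.seed(0)
# Each opportunity: win `gain` with probability p, otherwise lose `loss`.
opportunities = [(random.random(), random.uniform(0, 10), random.uniform(0, 10))
                 for _ in range(200_000)]

marginal_outcomes = []
for p, gain, loss in opportunities:
    ev = p * gain - (1 - p) * loss
    if ev < 0:
        continue                      # even a courageous agent skips negative-EV plans
    outcome = gain if random.random() < p else -loss
    if ev < 0.05:                     # "the edge of what you're willing to do"
        marginal_outcomes.append(outcome)

won = sum(x for x in marginal_outcomes if x > 0)
lost = -sum(x for x in marginal_outcomes if x < 0)
print(f"{len(marginal_outcomes)} marginal gambles: won {won:.0f} utility, lost {lost:.0f}")
# Roughly equal totals is what calibration looks like; if your marginal plans
# were systematically succeeding, you would be too conservative and should
# take more risks.
```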

Ok, now I've named three, so we can play the generalization game without angering the gods.

On the Five-Second Level

Trigger: Notice an attempt to use some bit of knowledge or a heuristic. Something qualitative, something with unclear domain, something that affects what you are doing, something where you can't see the truth.

Action: Ask yourself: What problem does it try to solve (what's its interface, type signature, domain, etc)? What's the specific mechanism of its truth when it is true? In what situations does that hold? Is this one of those? If not, can we derive what the correct result would be in this case? Basically "prove it". Sometimes it will take 2 seconds, sometimes a day or two; if it looks like you can't immediately see it, come up with whatever quick approximation you can and update towards "I don't know what's going on here". Come back later for practice.

It doesn't have to be a formal proof that would convince even the most skeptical mathematician or outsmart even the most powerful demon, but be sure to see the truth.

Without this skill of going back to First Principles, I think you would not fully get the point of truly a part of you. Why is being able to regenerate your knowledge useful? What are the hidden qualifications on that? How does it work? (See what I'm doing here?) Once you see many examples of the kind of expanded and formidably precise knowledge you get from having performed a derivation, and the vague and confusing state of having only a theorem, you will notice the difference. What the difference is, in terms of a derivation From First Principles, is left as an exercise for the reader (ie. I don't know). Even without that, though, having seen the difference is a huge step up.

From having seen the difference between derived and taught knowledge, I notice that one of the caveats of making knowledge Truly a Part of You is that just being able to get it From First Principles is not enough; Actually having done the proof tells you a lot more than simply what the correct theorem is. Do not take my word for it; go do some proofs; see the difference.

So far I've just described something that has been unusually valuable for me. Can it be taught? Will others gain as much? I don't know; I got this one more or less by intellectual lottery. It can probably be tested, though:

Testing the "Prove It" Habit

In school, we had this awesome teacher for thermodynamics and fluid dynamics. He was usually voted best in faculty. His teaching and testing style fit perfectly with my "learn first principles and derive on the fly" approach that I've just outlined above, so I did very well in his classes.

In the lectures and homework, we'd learn all the equations, where they came from (with derivations), how they are used, etc. He'd get us to practice and be good at straightforward application of them. Some of the questions required a bit of creativity.

On the exams, the questions were substantially easier, but they all required creativity and really understanding the first principles. "Curve Balls", we called them. Otherwise smart people found his tests very hard; I got all my marks from them. It's fair to say I did well because I had a very efficient and practiced From First Principles groove in my mind. (This was fair, because actually studying for the test was a reasonable substitute.)

So basically, I think a good discriminator would be to throw people difficult problems that can be solved with standard procedure and surface heuristics, and then some easier problems that require creative application of first principles, or don't quite work with standard heuristics (but seem to).

If your subjects have consistent scores between the two types, they are doing it From First Principles. If they get the standard problems right, but not the curve balls, they aren't.

Examples:

Straight: Bayesian cancer test. Curve: Here's the base rate and positive rate, how good is the test (likelihood ratio)?

Straight: Sunk cost on some bad investment. Curve: Something where switching costs, opportunity for experience make staying the correct thing.

Straight: Monty Hall. Curve: Ignorant Monty Hall. (A quick simulation of this pair is sketched just below.)

Etc.
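For instance, a quick sketch of the Monty Hall pair, under the usual assumptions (three doors, one car; in the ignorant variant the host opens another door at random and we keep only the rounds where a goat happens to be revealed):

```python
import random

def monty_trial(ignorant_host=False):
    """One round of Monty Hall. Returns (round_counts, switching_wins). In the
    ignorant variant the host opens another door at random, and the round only
    counts if that door happens to reveal a goat."""
    doors = [0, 1, 2]
    car = random.choice(doors)
    pick = random.choice(doors)
    others = [d for d in doors if d != pick]
    if ignorant_host:
        opened = random.choice(others)
        if opened == car:             # host accidentally reveals the car; discard round
            return False, False
    else:
        opened = random.choice([d for d in others if d != car])
    switch_to = next(d for d in doors if d not in (pick, opened))
    return True, switch_to == car

def switch_win_rate(ignorant_host, n=100_000):
    results = [monty_trial(ignorant_host) for _ in range(n)]
    wins = [won for counts, won in results if counts]
    return sum(wins) / len(wins)

print("standard Monty, switching wins:", round(switch_win_rate(False), 3))  # ~0.667
print("ignorant Monty, switching wins:", round(switch_win_rate(True), 3))   # ~0.5
```

Switching wins about 2/3 of the time against a knowing host but only about 1/2 against an ignorant one; someone running the standard "always switch, it's 2/3" heuristic on the curve ball gets the wrong answer.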

Exercises

Again, maybe this can't be taught, but here's some practice ideas just in case it can. I got substantial value from figuring these out From First Principles. Some may be correct, others incorrect, or correct in a limited range. The point is to use them to point you to a problem to solve; once you know the actual problem, ignore the heuristic and just go for truth:

Science says good theories make bold predictions.

Deriving From First Principles is a good habit.

Boats go where you point them, so just sail with the bow pointed to the island.

People who do bad things should feel guilty.

I don't have to feel responsible for people getting tortured in Syria.

If it's broken, fix it.

(post more in comments)

Causality: a chapter by chapter review

54 Vaniver 26 September 2012 04:55PM

This is a chapter by chapter review of Causality (2nd ed.) by Judea Pearl (UCLA, blog). Like my previous review, the intention is not to summarize but to help readers determine whether or not they should read the book (and if they do, what parts to read). Reading the review is in no way a substitute for reading the book.

I'll state my basic impression of the book up front, with detailed comments after the chapter discussions: this book is monumentally important to anyone interested in procuring knowledge (especially causal knowledge) from statistical data, but it is a heavily technical book primarily suitable for experts. The mathematics involved is not particularly difficult, but its presentation requires dedicated reading and clarity of thought. Only the epilogue, this lecture, is suitable for the general audience, and that will be the highest value portion for most readers of LW.

continue reading »

The Yudkowsky Ambition Scale

38 loup-vaillant 12 September 2012 03:08PM

From Hacker News.

  1. We're going to build the next Facebook!
  2. We're going to found the next Apple!
  3. Our product will create sweeping political change! This will produce a major economic revolution in at least one country! (Seasteading would be change on this level if it worked; creating a new country successfully is around the same level of change as this.)
  4. Our product is the next nuclear weapon. You wouldn't want that in the wrong hands, would you?
  5. This is going to be the equivalent of the invention of electricity if it works out.
  6. We're going to make an IQ-enhancing drug and produce basic change in the human condition.
  7. We're going to build serious Drexler-class molecular nanotechnology.
  8. We're going to upload a human brain into a computer.
  9. We're going to build a recursively self-improving Artificial Intelligence.
  10. We think we've figured out how to hack into the computer our universe is running on.

This made me laugh, but from the look of it, I'd say there is little work to do to make it serious. Personally, I'd try to shorten it so it is punchier and more memorable.

HP:MOR and the Radio Fallacy

22 RichardChappell 21 July 2012 07:55PM

No plot spoilers here, just wanted to flag a bit of poor reasoning that shows up in Chapter 39:

I shouldn't have believed it even for all of thirty seconds! Because if people had souls there wouldn't be any such thing as brain damage, if your soul could go on speaking after your whole brain was gone, how could damage to the left cerebral hemisphere take away your ability to talk?

This is a surprisingly common fallacy.  Just because X depends on Y, it doesn't follow that X depends on nothing but Y.  A phenomenon may involve more than just its most obvious failure point.

To illustrate: Suppose I'm trapped in a box, and my only way to communicate with the outside world is via radio communication.  Someone on the other end argues that I don't really exist -- "There's no person beyond the radio receiver, for if there was then there wouldn't be any such thing as damaged radios!"  Pretty silly, huh?  But people say this kind of thing in defense of physicalism all the time.

(N.B. This is not to defend the existence of souls. It's just to point out that this particular argument against them is invalid.)

Exploiting the Typical Mind Fallacy for more accurate questioning?

31 Xachariah 17 July 2012 12:46AM

I was reading Yvain's Generalizing from One Example, which talks about the typical mind fallacy.  Basically, it describes how humans assume that all other humans are like them.  If a person doesn't cheat on tests, they are more likely to assume others won't cheat on tests either.  If a person sees mental images, they'll be more likely to assume that everyone else sees mental images.

As I'm wont to do, I was thinking about how to make that theory pay rent.  It occurred to me that this could definitely be exploitable.  If the typical mind fallacy is correct, we should be able to have it go the other way; we can derive information about a person's proclivities based on what they think about other people.

E.g., most employers ask "have you ever stolen from a job before," and have to deal with misreporting because nobody in their right mind will say yes.  However, suppose the typical mind fallacy is correct.  The employers could instead ask "what do you think the percentage of employees who have stolen from their job is?" and know that the applicants who responded higher than average were correspondingly more likely to steal, and the applicants who responded lower than average were less likely to steal.  It could cut through all sorts of social desirability distortion effects.  You couldn't get the exact likelihood, but it would give more useful information than you would get with a direct question.
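A toy sketch of that scoring idea, with entirely hypothetical applicants and numbers: score each applicant by how far their estimate sits above the group median, and treat larger positive scores as (weak) evidence of the trait being asked about.

```python
from statistics import median

# Entirely hypothetical applicants and answers to "what % of employees steal?"
estimates = {"applicant_a": 5, "applicant_b": 40, "applicant_c": 15, "applicant_d": 70}

med = median(estimates.values())
scores = {name: est - med for name, est in estimates.items()}  # higher = more suspicion, per the fallacy
for name, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {score:+.1f} points relative to the median")
```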

In hindsight, which is always 20/20, it seems incredibly obvious.  I'd be surprised if professional personality tests and sociologists aren't using these types of questions.  My google-fu shows no hits, but it's possible I'm just not using the correct term that sociologists use.  I was wondering if anyone had heard of this questioning method before, and if there's any good research data out there showing just how much you can infer from someone's deviation from the median response.

Useful maxims

26 ciphergoth 11 July 2012 11:56AM

In a New York shop, I once got pressure-sold something expensive I didn't really want; when I said it cost too much, I was asked what I might be prepared to pay, and we ended up haggling.  Since then, I've had a rule:

  • If it's a non-trivial price, never decide to buy while you're in the shop

and I have been very glad of it on many occasions.  I can go for a short walk to decide, and if I don't want it, I simply don't return to the shop.  This means I'm deciding in calm surroundings, based on what I want rather than on embarrassment.

Are there other maxims I could adopt that would serve me equally well?

(Personal note: I'm in the Bay Area for a week after minicamp, Sunday July 29th to Sunday August 5th. Let's hang out, go to things together, help make my visit cooler! Mail me: paul at ciphergoth.org. Thanks!)
