Crossing the History-Lessons Threshold
(1)
Around 2009, I embarked on becoming a serious amateur historian. I wouldn't have called it that at the time, but since then I've studied various histories more or less nonstop.
The payoffs of history come slowly at first, and then quickly. History is often written as a series of isolated events, and events are rarely put in full context. You can easily draw a straight line from Napoleon's invasions of the fragmented German principalities to how Bismarck and Moltke were able to unify a German Confederation under Prussian rule a few decades later; from there, it's a straight line to World War I via great power rivalry; the Treaty of Versailles is easily understood in retrospect in light of historical French/German enmity; and this, in turn, gives rise to World War II.
That series of events is hard enough to truly get one's mind around, not just in abstract academic terms, but in actually getting a feel for how and why the actors did what they did, which shaped the outcomes that built the world.
And that's only the start of it: once you can flesh out the rest of the map, history starts coming brilliantly alive.
Without Prime Minister Stolypin's assassination in 1911, likely the Bolsheviks don't succeed in Russia; without that, Stalin is not at the helm when the Nazis invade.
On the other side of the Black Sea, in 1918, the Ottoman Empire has terms worse than those of the Treaty of Versailles imposed on it -- until Mustafa Kemal leads the Turkish War of Independence, building one of the most stable states in the Middle East. Thanks to Kemal's skill at governance and diplomacy, Turkey is able to (with great difficulty) stay neutral in World War II, avoid being absorbed by the Soviets, and avoid having its government taken over by hard-line Muslims.
This was not at all an obvious course of events. Without Kemal, Turkey almost certainly becomes crippled under the Treaty of Sèvres, and eventually likely winds up as a member of the Axis during World War II, or gets absorbed as another Soviet/Warsaw Pact satellite state.
The chain of events goes on and on. There is an eminently clear chain of events from Martin Luther at Worms in 1521 to the American Revolution. Meanwhile, the Lord Protectorate and Commonwealth of England turned out less promising than was hoped -- ironically, arguably predisposing England to be less sympathetic to greater democracy. But the colonies were shielded from this, and their original constitutions and charters were never amended by a now increasingly democracy-disenchanted England. Following a lack of consistent colonial policy and a lot of vacillating by various British governments, the American Revolution happens, and Britain loses control of the land and people that would come to supplant it as the dominant world power a century and a half later.
(2)
Until you can start seeing the threads and chains of history across nations, interactions, and long stretches of time, history is a set of often-interesting stories -- but the larger picture remains blurry and out-of-focus. The lessons come once you can synthesize it all.
Hideyoshi Toyotomi's 1588 sword hunt was designed to disarm rebellious factions and remove their chances of overthrowing his unified government of Japan. The policy was continued by his successor after the Toyotomi/Tokugawa Civil War, and it contributed to the Tokugawa forces losing to the Imperial Restoration in 1868: their skill at warfare had atrophied, and common soldiers with Western artillery were able to outfight samurai with obsolete weapons.
Nurhaci founded the Qing Dynasty around the time Japan was being unified, with a mix of better command structures and tactics. But the dynasty hardened into traditionalism and was backwards-looking when Western technology and imperialists came with greater frequency in the late 1800s. The Japanese statesman Ito Hirobumi offered to help the Qing modernize along the lines Imperial Japan had, while seeking a broader alliance with the Chinese. But Empress Dowager Cixi arrests and executes the reform-minded ministers of Emperor Guangxu and later, most likely, poisons the Emperor himself. (He died of arsenic poisoning while Cixi was on her deathbed; Cixi, or someone acting under her orders, is the most likely culprit.)
The weak Qing Dynasty starts dealing with ever-more-frequent invasions, diplomatic extortions, rebellions, and revolutions. A generation after Ito was rebuffed, the Qing Dynasty falls apart entirely, and the Japanese invade China. After Japan's unconditional surrender, the Chinese Civil War resumes; the Communists win.
(3)
From this, we can start drawing lessons and tracing histories, seeing patterns. We start to see how things could have broken differently. Perhaps Germany and France were doomed to constant warfare due to geopolitics; maybe this is true.
But certainly, it's not at all obvious that Mustafa Kemal would lead the ruins of the Ottoman Empire into modern Turkey, and (seemingly against overwhelming odds) keep neutrality during World War II, rebuff Stalin and stay removed from Soviet conquest, and maintain a country with secular and modern laws that honors Muslim culture without giving way to warlordism as happened to much of the rest of the Middle East.
Likewise, we can clearly see how the policies of Empress Dowager Cixi ended the chance for a pan-East-Asian alliance, trade bloc, or federation; it's not inconceivable to imagine a world today where China and Japan are incredibly close allies, and many of the world's centers of commerce, finance, and power are consolidated in a Tokyo-Beijing-Seoul alliance. Sure, it might seem inconceivable with hindsight, but Japan in 1910 and Japan in 1930 are very different countries; and the struggling late Qing Dynasty is different from the fledgling competing factions in China after the fall of the Qing.
Observing historical events in broad strokes, we can see the huge differences individuals can make at leveraged points; the eventual outcomes in Turkey and East Asia were not at all foreordained by geography, demographics, or trends.
(4)
Originally, I was sketching out some of these trends of history to make a larger point about how modern minds have a hard time understanding older governments -- in a world where "personal rule" is entirely rebuffed in the more developed countries, it is hard to imagine how the Qing Dynasty or Ottoman Empire actually functioned. The world after the Treaty of Westphalia is incredibly different from the world before it, and the pre-WWI world, before strict border controls, is largely unrecognizable to us.
That was the piece I was going to write, about how we project modern institutions and understandings backwards, and how that means we can't understand what actually happened. The Ottomans and Qing were founded before modern nationalism had emerged, and the way their subjects related to them is so alien to us that it's almost impossible to conceive of how their culture and governance actually ran.
(5)
I might still pen that piece, if there's interest in it -- my attempt at a brief introduction grew into this very different piece, focused on a particular point: the threshold effect in learning history.
I would say there are broadly three thresholds:
The first looks at a series of isolated events. You wind up with some witty quips, like Astor saying, "Sir, if you were my husband, I would poison your drink," and Churchill replying, "If I were married to you, I'd drink it."
Or moments of great drama: "And so the die is cast." "Don't fire until you see the whites of their eyes." "There is nothing to fear except fear itself."
These aren't so bad to learn; they're an okay jumping-off point. Certainly, Caesar's decision to march on Rome, Nobunaga's speech before the Battle of Okehazama, or understanding why Washington made the desperate gamble to cross the Delaware all offer lessons.
But seeing how the Marian military reforms, Sulla's purges, and the Gracchi brothers created the immediate situation before Julius Caesar's fateful crossing; tracing the lines backwards, seeing how Rome's generations-long struggle with Hannibal's Carthage turned the city-state into a fully militarized conquest machine; and then following the lines onwards to see how the Romans relied on unit cohesion which, once learned by Germanic adversaries, contributed to the fall of Rome -- this is much more interesting.
That's the second threshold of history to me: when isolated events start becoming regional chains; that's tracing Napoleon's invasion of Germany to Bismarck to World War I to the Treaty of Versailles to WWII.
Some people get to this level of history, and it quickly makes you an expert in a particular country.
But I think that's a poor place to stop learning: if you can truly get your mind around a long stretch of time in a nation, it's time to start coloring in the map. When you can broadly know how Korea is developing simultaneously with Japan; how the Portuguese/Spanish rivalry and Vatican compromises are affecting Asia's interactions with the Age of Sail Westerners; how Protestantism is creating rivals to Catholic power, two of which later equip Japan's Imperial Faction, which kicks off the Asian side of World War II -- this is when history starts really paying dividends and teaching worthwhile lessons.
The more you get into it, the more there is to learn. Regions that don't get much historical attention from Americans, like Tito's Yugoslavia, become fascinating: it's worth looking at how Yugoslavia stayed out of Soviet control and played the Western and Eastern blocs against each other; the chain of events takes a sad turn when Tito's successors can't keep the country together, the Yugoslav Wars follow, and its successor states still don't have the levels of relative prosperity and influence that Yugoslavia had in its heyday.
Yugoslavia is hard to get one's mind around by itself, but it's easy to color the map in with a decent understanding of Turkey, Germany, and Russia. Suddenly, figures and policies and conflicts and economics and culture start coming alive; lessons and patterns are everywhere.
I don't read much fiction any more, because most fiction can't compete with the sheer weight, drama, and insightfulness of history. Apparently some Kuomintang soldiers held out against the Chinese Communists and fought irregular warfare while funding their conflicts with heroin production in the regions of Burma and Thailand -- I just got a book on it, further coloring in the map of the aftermath of the Chinese Civil War, and that aspect of it upon the backdrop of the Cold War and containment, and how the Sino/Soviet split led to America normalizing relations with China, and...
...it never ends, and it's been one of the most insightful areas of study across my life.
History in that first threshold -- isolated battles, quotes, the occasional drama -- frankly, it offers only a slight glimmer of what's possible to learn.
Likewise, the second level -- knowing a particular country's rise and fall over time -- can be insightful, but I would encourage anyone who has delved into history that much not to stop there: you're not far from the gates unlocking onto large wellsprings of knowledge, a nearly infinite source of ideas, inspiration, case studies, and very practical guidance, old and new.
Consider having sparse insides
It's easier to seek true beliefs if you keep your (epistemic) identity small. (E.g., if you avoid beliefs like "I am a democrat", and say only "I am a seeker of accurate world-models, whatever those turn out to be".)
It seems analogously easier to seek effective internal architectures if you also keep non-epistemic parts of your identity small -- not "I am a person who enjoys nature", nor "I am someone who values mathematics" nor "I am a person who aims to become good at email" but only "I am a person who aims to be effective, whatever that turns out to entail (and who is willing to let much of my identity burn in the process)".
There are obviously hazards as well as upsides that come with this; still, the upsides seem worth putting out there.
The two biggest exceptions I would personally make, which seem to mitigate the downsides: "I am a person who keeps promises" and "I am a person who is loyal to [small set of people] and who can be relied upon to cooperate more broadly -- whatever that turns out to entail".
Thoughts welcome.
"3 Reasons It’s Irrational to Demand ‘Rationalism’ in Social Justice Activism"
The lead article on everydayfeminism.com on March 25:
3 Reasons It’s Irrational to Demand ‘Rationalism’ in Social Justice Activism
The scenario is always the same: I say we should abolish prisons, police, and the American settler state— someone tells me I’m irrational. I say we need decolonization of the land — someone tells me I’m not being realistic.... When those who are the loudest, the most disruptive — the ones who want to destroy America and all of the oppression it has brought into the world — are being silenced even by others in social justice groups, that is unacceptable.
(The link from "decolonization" is to "Decolonization is not a metaphor", to make it clear s/he means actually giving the land back to the Native Americans.)
I regularly see people who describe how social justice activists act get accused of setting up a straw man. This article shows that the bias of some SJWs against reason is impossible to strawman. The author argues at length that rationality is bad, and that social justice arguments shouldn't be rational or be defended rationally. Ze is, or was, confused about what "rationality" means, but clearly now means it to include reason-based argumentation.
This isn't just some wacko's blog; it was chosen as the headline article for the website. I had to click around to a few other articles to make sure it wasn't a parody site.
But it isn't just a sign of how irrational the social justice movement is; it also offers clues to how it got that way.
Why CFAR's Mission?
Related to:
---
Q: Why not focus exclusively on spreading altruism? Or else on "raising awareness" for some particular known cause?
Briefly put: because historical roads to hell have been powered in part by good intentions; because the contemporary world seems bottlenecked by its ability to figure out what to do and how to do it (i.e. by ideas/creativity/capacity) more than by folks' willingness to sacrifice; and because rationality skill and epistemic hygiene may distinguish actually useful ideas from ineffective or harmful ones in a way that "good intentions" cannot.
Q: Even given the above -- why focus extra on sanity, or true beliefs? Why not focus instead on, say, competence/usefulness as the key determinant of how much do-gooding impact a motivated person can have? (Also, have you ever met a Less Wronger? I hear they are annoying and have lots of problems with “akrasia”, even while priding themselves on their high “epistemic” skills; and I know lots of people who seem “less rational” than Less Wrongers on some axes who would nevertheless be more useful in many jobs; is this “epistemic rationality” thingy actually the thing we need for this world-impact thingy?...)
This is an interesting one, IMO.
Basically, it seems to me that epistemic rationality, and skills for forming accurate explicit world-models, become more useful the more ambitious and confusing a problem one is tackling.
For example:
Simultaneous Overconfidence and Underconfidence
Follow-up to this and this on my personal blog. Prep for this meetup. Cross-posted on my blog.
Eliezer talked about cognitive bias, statistical bias, and inductive bias in a series of posts only the first of which made it directly into the LessWrong sequences as currently organized (unless I've missed them!). Inductive bias helps us leap to the right conclusion from the evidence, if it captures good prior assumptions. Statistical bias can be good or bad, depending in part on the bias-variance trade-off. Cognitive bias refers only to obstacles which prevent us from thinking well.
Unfortunately, as we shall see, psychologists can be quite inconsistent about how cognitive bias is defined. This created a paradox in the history of cognitive bias research. One well-researched and highly experimentally validated effect was conservatism, the tendency to give estimates too middling, or probabilities too near 50%. This relates especially to integration of information: when given evidence relating to a situation, people tend not to take it fully into account, as if they are stuck with their prior. Another highly-validated effect was overconfidence, relating especially to calibration: when people give high subjective probabilities like 99%, they are typically wrong with much higher frequency.
In real-life situations, these two contradict: there is no clean distinction between information integration tasks and calibration tasks. A person's subjective probability is always, in some sense, the integration of the information they've been exposed to. In practice, then, when should we expect other people to be under- or over- confident?
Simultaneous Overconfidence and Underconfidence
The conflict was resolved in an excellent paper by Ido Erev et al., which showed that it's the result of how psychologists did their statistics. Essentially, one group of psychologists defined bias one way, and the other defined it another way. The results are not really contradictory; they are measuring different things. In fact, you can find underconfidence or overconfidence in the same data sets by applying the different statistical techniques; it has little or nothing to do with the differences between information integration tasks and probability calibration tasks. Here's my rough drawing of the phenomenon (apologies for my hand-drawn illustrations):
Overconfidence here refers to probabilities which are more extreme than they should be, here illustrated as being further from 50%. (This baseline makes sense when choosing from two options, but won't always be the right baseline to think about.) Underconfident subjective probabilities are associated with more extreme objective probabilities, which is why that slope tilts up in the figure. The overconfidence line similarly tilts down, indicating that the subjective probabilities are associated with less extreme objective probabilities. Unfortunately, if you don't know how the lines are computed, this means less than you might think. Erev et al. show that these two regression lines can be derived from just one data set. I found the paper easy and fun to read, but I'll explain the phenomenon in a different way here by relating it to the concept of statistical bias and the tails coming apart.
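To make this concrete, here is a minimal Python sketch (my own toy model, not Erev et al.'s setup and not the Julia simulation linked at the end of this post): objective probabilities and subjective reports are generated as noisy reflections of a shared latent variable, and the same simulated data is then binned both ways.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Toy model (an illustrative assumption): both the objective probability and the
# subjective report are noisy reflections of a shared latent "evidence strength",
# so they are correlated but neither determines the other exactly.
latent = rng.normal(0.0, 1.0, n)
objective = np.clip(0.5 + 0.15 * latent + 0.10 * rng.normal(size=n), 0.01, 0.99)
subjective = np.clip(0.5 + 0.15 * latent + 0.10 * rng.normal(size=n), 0.01, 0.99)

def mean_y_per_x_bin(x, y, n_bins=10):
    """Average of y within equal-width bins of x (x treated as the independent variable)."""
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    idx = np.digitize(x, edges[1:-1])
    centers = (edges[:-1] + edges[1:]) / 2
    return [(centers[i], y[idx == i].mean()) for i in range(n_bins) if np.any(idx == i)]

# Conservatism-style analysis: fix the objective probability, average the subjective
# reports. The averages are pulled toward 0.5, so the data looks UNDERconfident.
print("objective -> mean subjective:", mean_y_per_x_bin(objective, subjective))

# Calibration-style analysis: fix the subjective report, average the objective
# probability. The very same data now looks OVERconfident: extreme reports
# correspond to less extreme truth.
print("subjective -> mean objective:", mean_y_per_x_bin(subjective, objective))
```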
The Tails Come Apart
Everyone who has read Why the Tails Come Apart will likely recognize this image:
The idea is that even if X and Y are highly correlated, the most extreme X values and the most extreme Y values will differ. I've labelled the difference the "curse" after the optimizer's curse: if you optimize a criterion X which is merely correlated with the thing Y you actually want, you can expect to be disappointed.
Applying the idea to calibration, we can say that the most extreme subjective beliefs are almost certainly not the most extreme on the objective scale. That is: a person's most confident beliefs are almost certainly overconfident. A belief is not likely to have worked its way up to the highest peak of confidence by merit alone. It's far more likely that some merit but also some error in reasoning combined to yield high confidence. This sounds like the calibration literature, which found that people are generally overconfident. What about underconfidence? By a symmetric argument, the points with the most extreme objective probabilities are not likely to be the same as those with the highest subjective belief; errors in our thinking are much more likely to make us underconfident than overconfident in those cases.
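A quick numerical illustration of that "curse" (my own toy example; the 0.8 correlation is arbitrary): pick the item with the best proxy score X and compare its true value Y to the best Y actually available.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# X is a noisy proxy for the Y we actually care about: corr(X, Y) = 0.8.
y = rng.normal(0.0, 1.0, n)
x = 0.8 * y + 0.6 * rng.normal(0.0, 1.0, n)

best_by_proxy = np.argmax(x)
print("Y of the item with the highest X:", round(float(y[best_by_proxy]), 2))
print("Highest Y actually available:    ", round(float(y.max()), 2))
# The first number is reliably smaller: optimizing the correlated proxy X
# does not find the best Y -- the tails have come apart.
```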
This argument tells us about extreme points, but not about the overall distribution. So, how does this explain simultaneous overconfidence and underconfidence? To understand that, we need to understand the statistics which psychologists used. We'll use averages rather than maximums, leading to a "soft version" which shows the tails coming apart gradually, rather than only at extreme ends.
Statistical Bias
Statistical bias is defined through the notion of an estimator. We have some quantity we want to know, X, and we use an estimator to guess what it might be. The estimator will be some calculation which gives us our estimate, which I will write as X^. An estimator is derived from noisy information, such as a sample drawn at random from a larger population. The difference between the estimator and the true value, X^-X, would ideally be zero; however, this is unrealistic. We expect estimators to have error, but systematic error is referred to as bias.
Given a particular value for X, the bias is defined as the expected value of X^ - X, written E_X(X^ - X). An unbiased estimator is an estimator such that E_X(X^ - X) = 0 for any value of X we choose.
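As a quick numerical check of the definition (with arbitrary numbers), here is the sample mean, a textbook unbiased estimator of a fixed unknown X:

```python
import numpy as np

rng = np.random.default_rng(2)

X = 3.0                                    # the fixed, unknown quantity (value chosen arbitrarily)
samples = rng.normal(X, 1.0, size=(100_000, 10))
X_hat = samples.mean(axis=1)               # estimator: mean of a noisy sample of size 10

print(np.mean(X_hat - X))                  # approximately 0, i.e. E_X(X^ - X) = 0: unbiased
```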
Due to the bias-variance trade-off, unbiased estimators are not the best way to minimize error in general. However, statisticians still love unbiased estimators. It's a nice property to have, and in situations where it works, it has a more objective feel than estimators which use bias to further reduce error.
Notice that the definition of bias takes X as fixed; that is, it fixes the quantity which we don't know. Given a fixed X, the unbiased estimator's average value will equal X. This is a picture of bias which can only be evaluated "from the outside"; that is, from a perspective in which we can fix the unknown X.
A more inside-view of statistical estimation is to consider a fixed body of evidence, and make the estimator equal the average unknown. This is exactly inverse to unbiased estimation:
In the image, we want to estimate unknown Y from observed X. The two variables are correlated, just like in the earlier "tails come apart" scenario. The average-Y estimator tilts down because good estimates tend to be conservative: because I only have partial information about Y, I want to take into account what I see from X but also pull toward the average value of Y to be safe. On the other hand, unbiased estimators tend to be overconfident: the effect of X is exaggerated. For a fixed Y, the average Y^ is supposed to equal Y. However, for fixed Y, the X we will get will lean toward the mean X (just as for a fixed X, we observed that the average Y leans toward the mean Y). Therefore, in order for Y^ to be high enough, it needs to pull up sharply: middling values of X need to give more extreme Y^ estimates.
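A rough sketch of that contrast in code (a Gaussian toy model of my own choosing, not taken from the post): the unbiased estimator Y^ = X has zero bias for each fixed Y but exaggerates, while the average-Y (posterior-mean) estimator shrinks toward the mean, trading bias at extreme Y values for lower overall error.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200_000

# Toy model: unknown Y ~ N(0, 1); observed evidence X = Y + N(0, 1) noise.
y = rng.normal(0.0, 1.0, n)
x = y + rng.normal(0.0, 1.0, n)

unbiased = x              # Y^ = X: for any fixed Y, its average equals Y
average_y = x / 2         # posterior mean: shrink toward the prior mean (factor 1/2 for equal variances)

for name, est in [("unbiased", unbiased), ("average-Y", average_y)]:
    mse = np.mean((est - y) ** 2)
    extreme = y > 2.0                             # cases where the truth is extreme
    bias_at_extremes = np.mean(est[extreme] - y[extreme])
    print(f"{name:10s} overall MSE = {mse:.2f}   mean error when Y > 2: {bias_at_extremes:+.2f}")
```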
If we superimpose this on top of the tails-come-apart image, we see that this is something like a generalization:
Wrapping It All Up
The punchline is that these two different regression lines are exactly what yield simultaneous underconfidence and overconfidence. The studies on conservatism were taking the objective probability as the independent variable and graphing people's subjective probabilities as a function of that. The natural next step is to take the average subjective probability per fixed objective probability. This will tend to show underconfidence due to the statistics of the situation.
The studies on calibration, on the other hand, took the subjective probabilities as the independent variable, graphing the average proportion correct as a function of that. This will tend to show overconfidence, even with the same data that shows underconfidence in the other analysis.
From an individual's standpoint, the overconfidence is the real phenomenon. Errors in judgement tend to make us overconfident rather than underconfident: errors make the tails come apart, so if you select our most confident beliefs, it's a good bet that they have only mediocre support from evidence, even if, generally speaking, our level of belief is highly correlated with how well-supported a claim is. Because the tails come apart gradually, we can expect that the higher our confidence, the larger the gap between that confidence and the level of factual support for that belief.
This is not a fixed fact of human cognition pre-ordained by statistics, however. It's merely what happens due to random error. Not all studies show systematic overconfidence, and in a given study, not all subjects will display overconfidence. Random errors in judgement will tend to create overconfidence as a result of the statistical phenomena described above, but systematic correction is still an option.
I've also written a simple simulation of this. Julia code is here. If you don't have Julia installed or don't want to install it, you can run the code online at JuliaBox.
Easy wins aren't news
Recently I talked with a guy from Grant Street Group. They make, among other things, software with which local governments can auction their bonds on the Internet.
By making the auction process more transparent and easier to participate in, they enable local governments that need to sell bonds (to build a high school, for instance) to sell those bonds at, say, 7% interest instead of 8%. (At least, that's what he said.)
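To get a rough sense of the magnitude, here's a back-of-the-envelope calculation with made-up figures (the post gives none beyond the two rates): a single $10 million, 20-year, interest-only bond issue.

```python
# Hypothetical issue: $10 million principal, 20 years, interest paid annually.
principal = 10_000_000
years = 20

for rate in (0.08, 0.07):
    print(f"{rate:.0%} coupon: ${principal * rate * years:,.0f} total interest over {years} years")
# Dropping from 8% to 7% saves about $2,000,000 on this single hypothetical issue.
```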
They have similar software for auctioning liens on property taxes, which also helps local governments raise more money by bringing more buyers to each auction, and probably helps the buyers reduce their risks by giving them more information.
This is a big deal. I think it's potentially more important than any budget argument that's been on the front pages since the 1960s. Yet I only heard of it by chance.
People would rather argue about reducing the budget by eliminating waste, or cutting subsidies to people who don't deserve it, or changing our ideological priorities. Nobody wants to talk about auction mechanics. But fixing the auction mechanics is the easy win. It's so easy that nobody's interested in it. It doesn't buy us fuzzies or let us signal our affiliations. To an individual activist, it's hardly worth doing.
Reductionist research strategies and their biases
I read an extract of (Wimsatt 1980) [1] which includes a list of common biases in reductionist research. I suppose most of us are reductionists most of the time, so these may be worth looking at.
This is not an attack on reductionism! If you think reductionism is too sacred for such treatment, you've got a bigger problem than anything on this list.
Here's Wimsatt's list, with some additions from the parts of his 2007 book Re-engineering Philosophy for Limited Beings that I can see on Google books. His lists often lack specific examples, so I came up with my own examples and inserted them in [brackets].
CFAR in 2014: Continuing to climb out of the startup pit, heading toward a full prototype
Summary: We outline CFAR’s purpose, our history in 2014, and our plans heading into 2015.
- Highlights from 2014.
- Improving operations.
- Attempts to go beyond the current workshop and toward the ‘full prototype’ of CFAR: our experience in 2014 and plans for 2015.
- Nuts, bolts, and financial details.
- The big picture and how you can help.
One of the reasons we’re publishing this review now is that we’ve just launched our annual matching fundraiser, and want to provide the information our prospective donors need for deciding. This is the best time of year to decide to donate to CFAR. Donations up to $120k will be matched until January 31.[1]
To briefly preview: For the first three years of our existence, CFAR mostly focused on getting going. We followed the standard recommendation to build a ‘minimum viable product’, the CFAR workshops, that could test our ideas and generate some revenue. Coming into 2013, we had a workshop that people liked (9.3 average rating on “Are you glad you came?”; a more recent random survey showed 9.6 average rating on the same question 6-24 months later), which helped keep the lights on and gave us articulate, skeptical, serious learners to iterate on. At the same time, the workshops are not everything we would want in a CFAR prototype; it feels like the current core workshop does not stress-test most of our hopes for what CFAR can eventually do. The premise of CFAR is that we should be able to apply the modern understanding of cognition to improve people’s ability to (1) figure out the truth (2) be strategically effective (3) do good in the world. We have dreams of scaling up some particular kinds of sanity. Our next goal is to build the minimum strategic product that more directly justifies CFAR’s claim to be an effective altruist project.[2]
Human capital or signaling? No, it's about doing the Right Thing and acquiring karma
There's a huge debate among economists of education on whether the positive relationship between educational attainment and income is due to human capital, signaling, or ability bias. But what do the students themselves believe? Bryan Caplan has argued that students' actions (for instance, their not sitting in for free on classes and their rejoicing at class cancellation) suggest a belief in the signaling model of education. At the same time, he notes that students may not fully believe the signaling model, and that shifting in the direction of that belief might improve individual educational attainment.
Still, something seems wrong about the view that most people believe in the signaling model of education. While their actions are consistent with that view, I don't think they frame it quite that way. I don't think they usually think of it as "education is useless, but I'll go through it anyway because that allows me to signal to potential employers that I have the necessary intelligence and personality traits to succeed on the job." Instead, I believe that people's model of school education is linked to the idea of karma: they do what the System wants them to do, because that's their duty and the Right Thing to do. Many of them also expect that if they do the Right Thing, and fulfill their duties well, then the System shall reward them with financial security and a rewarding life. Others may take a more fatalistic stance, saying that it's not up to them to judge what the System has in store for them, but they still need to do the Right Thing.
The case of the devout Christian
Consider a reasonably devout Christian who goes to church regularly. For such a person, going to church, and living a life in accordance with (his understanding of) Christian ethics is part of what he's supposed to do. God will take care of him as long as he does his job well. In the long run, God will reward good behavior and doing the Right Thing, but it's not for him to question God's actions.
Such a person might look bemused if you asked him, "Are you a practicing Christian because you believe in the prudential value of Christian teachings (the "human capital" theory) or because you want to give God the impression that you are worthy of being rewarded (the "signaling" theory)?" Why? Partly because the person attributes omniscience, omnipotence, and omnibenevolence to God, so that the very idea of having a conceptual distinction between what's right and how to impress God seems wrong. Yes, he does expect that God will take care of him and reward him for his goodness (the "signaling" theory). Yes, he also believes that the Christian teachings are prudent (the "human capital" theory). But to him, these are not separate theories but just parts of the general belief in doing right and letting God take care of the rest.
Surely not all Christians are like this. Some might be extreme signalers: they may be deliberately trying to optimize for (what they believe to be) God's favor and maximizing the probability of making the cut to Heaven. Others might believe truly in the prudence of God's teachings and think that any rewards that flow are because the advice makes sense at the worldly level (in terms of the non-divine consequences of actions) rather than because God is impressed by the signals they're sending him through those actions. There are also a number of devout Christians I personally know who, regardless of their views on the matter, would be happy to entertain, examine, and discuss such hypotheses without feeling bemused. Still, I suspect the majority of Christians don't separate the issue, and many might even be offended at second-guessing God.
Note: I selected Christianity and a male sex just for ease of description; similar ideas apply to other religions and the female sex. Also note that in theory, some religious sects emphasize free will and others emphasize determinism more, but it's not clear to me how much effect this has on people's mental models on the ground.
The schoolhouse as church: why human capital and signaling sound ridiculous
Just as many people believe in following God's path and letting Him take care of the rewards, many people believe that by doing the Right Thing educationally (being a Good Student and jumping through the appropriate hoops through correctly applied sincere effort) they're doing their bit for the System. These people might be bemused at the cynicism involved in separating out "human capital" and "signaling" theories of education.
Again, not everybody is like this. Some people are extreme signalers: they openly claim that school builds no useful skills, but grades are necessary to impress future employers, mates, and society at large. Some are human capital extremists: they openly claim that the main purpose is to acquire a strong foundation of knowledge, and they continue to do so even when the incentive from the perspective of grades is low. Some are consumption extremists: they believe in learning because it's fun and intellectually stimulating. And some strategically combine these approaches. Yet, none of these categories describe most people.
I've had students who worked considerably harder on courses than the bare minimum effort needed to get an A. This is despite the fact that they aren't deeply interested in the subject, don't believe it will be useful in later life, and aren't likely to remember it for too long anyway. I think that the karma explanation fits best: people develop an image of themselves as Good Students who do their duty and fulfill their role in the system. They strive hard to fulfill that image, often going somewhat overboard beyond the bare minimum needed for signaling purposes, while still not trying to learn in ways that optimize for human capital acquisition. There are of course many other people who claim to aspire to the label of Good Student because it's the Right Thing, and consider it a failing of virtue that they don't currently qualify as Good Students. Of course, that's what they say, and social desirability bias might play a role in individuals' statements, but the very fact that people consider such views socially desirable indicates the strong societal belief in being a Good Student and doing one's academic duty.
If you presented the signaling hypothesis to self-identified Good Students they'd probably be insulted. It's like telling a devout Christian that he's in it only to curry favor with God. At the same time, the human capital hypothesis might also seem ridiculous to them in light of their actual actions and experiences: they know they don't remember or understand the material too well. Thinking of it as doing their bit for the System because it's the Right Thing to do seems both noble and realistic.
The impressive success of this approach
At the individual level, this works! Regardless of the relative roles of human capital, signaling, and ability bias, people who go through higher levels of education and get better grades tend to earn better and get more high-status jobs than others. People who transform themselves from being bad students to good students often see rewards both academically and in later life in the form of better jobs. This could again be human capital, signaling, or ability bias. The ability bias explanation is plausible because it requires a lot of ability to turn from a bad student into a good student, about the same as it does to be a good student from the get-go or perhaps even more because transforming oneself is a difficult task.
Can one do better?
Doing what the System commands can be reasonably satisfying, and even rewarding. But for many people, and particularly for the people who do the most impressive things, it's not necessarily the optimal path. This is because the System isn't designed to maximize every individual's success or life satisfaction, or even to optimize things for society as a whole. It's based on a series of adjustments driven by squabbling between competing interests. It could be a lot worse, but a motivated person could do better.
Also note that being a Good Student is fundamentally different from being a Good Worker. A worker, whether directly serving customers or reporting to a boss, is producing stuff that other people value. So, at least in principle, being a better worker translates to more gains for the customers. This means that a Good Worker is contributing to the System in a literal sense, and by doing a better job, directly adds more value. But this sort of reasoning doesn't apply to Good Students, because the actions of students qua students aren't producing direct value. Their value is largely their consumption value to the students themselves and their instrumental value to the students' current and later life choices.
Many of the qualities that define a Good Student are qualities that are desirable in other contexts as well. In particular, good study habits are valuable not just in school but in any form of research that relies on intellectual comprehension and synthesis (this may be an example of the human capital gains from education, except that I don't think most students acquire good study habits). So, one thing to learn from the Good Student model is good study habits. General traits of conscientiousness, hard work, and willingness to work beyond the bare minimum needed for signaling purposes are also valuable to learn and practice.
But the Good Student model breaks down when it comes to acquiring perspective about how to prioritize between different subjects, and how to actually learn and do things of direct value. A common example is perfectionism. The Good Student may spend hours practicing calculus to get a perfect score in the test, far beyond what's necessary to get an A in the class or an AP BC 5, and yet not acquire a conceptual understanding of calculus or learn calculus in a way that would stick. Such a student has acquired a lot of karma, but has failed from both the human capital perspective (in not acquiring durable human capital) and the signaling perspective (in spending more effort than is needed for the signal). In an ideal world, material would be taught in a way that one can score highly on tests if and only if it serves useful human capital or signaling functions, but this is often not the case.
Thus, I believe it makes sense to critically examine the activities one is pursuing as a student, and ask: "does this serve a useful purpose for me?" The purpose could be human capital, signaling, pure consumption, or something else (such as networking). Consider the following four extreme answers a student may give to why a particular high school or college course matters:
- Pure signaling: A follow-up might be: "how much effort would I need to put in to get a good return on investment as far as the signaling benefits go?" And then one has to stop at that level, rather than overshoot or undershoot.
- Pure human capital: A follow-up might be: "how do I learn to maximize the long-term human capital acquired and retained?" In this world, test performance matters only as feedback rather than as the ultimate goal of one's actions. Rather than trying to practice for hours on end to get a perfect score on a test, more effort will go into learning in ways that increase the probability of long-term retention in ways that are likely to prove useful later on. (As mentioned above, in an ideal world, these goals would converge).
- Pure consumption: A follow-up might be: "how much effort should I put in in order to get the maximum enjoyment and stimulation (or other forms of consumptive experience), without feeling stressed or burdened by the material?"
- Pure networking: A follow-up might be: "how do I optimize my course experience to maximize the extent to which I'm able to network with fellow students and instructors?"
One might also believe that some combination of these explanations applies. For instance, a mixed human capital-cum-signaling explanation might recommend that one study all topics well enough to get an A, and then concentrate on acquiring a durable understanding of the few subtopics that one believes are needed for long-term knowledge and skills. For instance, a mastery of fractions matters a lot more than a mastery of quadratic equations, so a student preparing for a middle school or high school algebra course might choose to learn both at a basic level but get a really deep understanding of fractions. Similarly, in calculus, having a clear idea of what a function and derivative means matters a lot more than knowing how to differentiate trigonometric functions, so a student may superficially understand all aspects (to get the signaling benefits of a good grade) but dig deep into the concept of functions and the conceptual definition of derivatives (to acquire useful human capital). By thinking clearly about this, one may realize that perfecting one's ability to differentiate complicated trigonometric function expressions or integrate complicated rational functions may not be valuable from either a human capital perspective or a signaling perspective.
Ultimately, the changes wrought by consciously thinking about these issues are not too dramatic. Even though the System is suboptimal, it's locally optimal in small ways and one is constrained in one's actions in any case. But the changes can nevertheless add up to lead one to be more strategic and less stressed, do better on all fronts (human capital, signaling, and consumption), and discover opportunities one might otherwise have missed.
2014 iterated prisoner's dilemma tournament results
Followup to: Announcing the 2014 program equilibrium iterated PD tournament
In August, I announced an iterated prisoner's dilemma tournament in which bots can simulate each other before making a move. Eleven bots were submitted to the tournament. Today, I am pleased to announce the final standings and release the source code and full results.
All of the source code submitted by the competitors and the full results for each match are available here. See here for the full set of rules and tournament code.
Before we get to the final results, here's a quick rundown of the bots that competed:
AnderBot
AnderBot follows a simple tit-for-tat-like algorithm that eschews simulation (a rough code sketch follows the list):
- On the first turn, Cooperate.
- For the next 10 turns, play tit-for-tat.
- For the rest of the game, Defect with 10% probability or Defect if the opposing bot has defected more times than AnderBot.
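For concreteness, here is a rough Python sketch of that strategy as described above. The function signature and the 'C'/'D' move encoding are my own assumptions, not the tournament's actual bot interface (which also let bots simulate their opponents; AnderBot simply ignores that capability).

```python
import random

def anderbot(my_history, opp_history):
    """One move of AnderBot given full move histories (lists of 'C'/'D')."""
    turn = len(my_history)              # 0 on the first turn
    if turn == 0:
        return 'C'                      # cooperate on the first turn
    if turn <= 10:
        return opp_history[-1]          # turns 2-11: plain tit-for-tat
    # Afterwards: defect whenever the opponent has defected more often than we
    # have, and otherwise defect with 10% probability.
    if opp_history.count('D') > my_history.count('D'):
        return 'D'
    return 'D' if random.random() < 0.10 else 'C'
```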