A long time ago I thought that Martial Arts simply taught you how to fight – the right way to throw a punch, the best technique for blocking and countering an attack, etc. I thought training consisted of recognizing these attacks and choosing the correct responses more quickly, as well as simply faster/stronger physical execution of same. It was later that I learned that the entire purpose of martial arts is to train your body to react with minimal conscious deliberation, to remove “you” from the equation as much as possible.
The reason is of course that conscious thought is too slow. If you have to think about what you’re doing, you’ve already lost. It’s been said that if you had to think about walking to do it, you’d never make it across the room. Fighting is no different. (It isn’t just fighting either – anything that requires quick reaction suffers when exposed to conscious thought. I used to love Rock Band. One day when playing a particularly difficult guitar solo on expert I nailed 100%… except “I” didn’t do it at all. My eyes saw the notes, my hands executed them, and nowhere was I involved in the process. It was both exhilarating and creepy, and I basically dropped the game soon after.)
You’ve seen how long it takes a human to learn to walk effortlessly. That's a situation with a single constant force, an unmoving surface, no agents working against you, and minimal emotional agitation. No wonder it takes hundreds of hours, repeating the same basic movements over and over again, to attain even a basic level of martial mastery. To make your body react correctly without any thinking involved. When Neo says “I Know Kung Fu” he isn’t surprised that he now has knowledge he didn’t have before. He’s amazed that, when attacked, his body now reacts in the optimal manner without his involvement.
All of this is simply focusing on pure reaction time – it doesn’t even take into account the emotional terror of another human seeking to do violence to you. It doesn’t capture the indecision of how to respond, the paralysis of having to choose between outcomes that are all awful with no way of knowing which will be worse, and the surge of hormones. The training of your body to respond without your involvement bypasses all of those obstacles as well.
This is the true strength of Martial Arts – eliminating your slow, conscious deliberation and acting while there is still time to do so.
Roles are the Martial Arts of Agency.
When one is well-trained in a certain Role, one defaults to certain prescribed actions immediately and confidently. I’ve acted as a guy standing around watching people faint in an overcrowded room, and I’ve acted as the guy telling people to clear the area. The difference was that in one I had the role of Corporate Pleb, and in the other I had the role of Guy Responsible For This Shit. You know the difference between the guy at the bar who breaks up a fight, and the guy who stands back and watches it happen? The former thinks of himself as the guy who stops fights. They could even be the same guy, on different nights. The role itself creates the actions, and it creates them as an immediate reflex. By the time corporate-me is done thinking “Huh, what’s this? Oh, this looks bad. Someone fainted? Wow, never seen that before. Damn, hope they’re OK. I should call 911.” enforcer-me has already yelled for the room to clear and whipped out a phone.
Roles are the difference between Hufflepuffs gawking when Neville tumbles off his broom (Protected), and Harry screaming “Wingardium Leviosa” (Protector). Draco insulted them afterwards, but it wasn’t a fair insult – they never had the slightest chance to react in time, given the role they were in. Roles are the difference between Minerva ordering Hagrid to stay with the children while she forms troll-hunting parties (Protector), and Harry standing around doing nothing while time slowly ticks away (Protected). Eventually he switched roles. But it took Agency to do so. It took time.
Agency is awesome. Half this site is devoted to becoming better at Agency. But Agency is slow. Roles allow real-time action under stress.
Agency has a place of course. Agency is what causes us to decide that Martial Arts training is important, has us choose a Martial Art, and keeps us training month after month. Agency is what lets us decide which Roles we want to play, and practice the psychology and execution of those roles. But when the time for action is at hand, Agency is too slow. Ensure that you have trained enough for the next challenge, because it is the training that will see you through it, not your agenty conscious thinking.
As an aside, most major failures I’ve seen recently are when everyone assumed that someone else had the role of Guy In Charge If Shit Goes Down. I suggest that, in any gathering of rationalists, they begin the meeting by choosing one person to be Dictator In Extremis should something break. Doesn’t have to be the same person as whoever is leading. Would be best if it was someone comfortable in the role and/or with experience in it. But really there just needs to be one. Anyone.
cross-posted from my blog
Thanks to the generosity of several major donors,† every donation made to MIRI between now and August 15th, 2014 will be matched dollar-for-dollar, up to a total of $200,000!
Now is your chance to double your impact while helping us raise up to $400,000 (with matching) to fund our research program.
Corporate matching and monthly giving pledges will count towards the total! Please email firstname.lastname@example.org if you intend to leverage corporate matching (check here to see if your employer will match your donation) or would like to pledge 6 months of monthly donations, so that we can properly account for your contributions towards the fundraiser.
(If you're unfamiliar with our mission, see: Why MIRI?)
Accomplishments Since Our Winter 2013 Fundraiser Launched:
- Hired 2 new Friendly AI researchers, Benja Fallenstein & Nate Soares. Since March, they've authored or co-authored 4 papers/reports, with several others in the works. Right now they're traveling to present papers at the Vienna Summer of Logic, AAAI-14, and AGI-14.
- 5 new papers & book chapters: “Why We Need Friendly AI,” “The errors, insights, and lessons of famous AI predictions,” “Problems of self-reference...,” “Program equilibrium...,” and “The ethics of artificial intelligence.”
- 11 new technical reports: 7 reports from the December 2013 workshop, “Botworld,” “Loudness...,” “Distributions allowing tiling...,” and “Non-omniscience...”
- New book: Smarter Than Us, published both as an e-book and a paperback.
- Held one MIRI workshop and launched the MIRIx program, which currently supports 8 independently-organized Friendly AI discussion/research groups around the world.
- New analyses: Robby's posts on naturalized induction, Luke's list of 70+ studies which could improve our picture of superintelligence strategy, “Exponential and non-exponential trends in information technology,” “The world's distribution of computation,” “How big is the field of artificial intelligence?,” “Robust cooperation: A case study in Friendly AI research,” “Is my view contrarian?,” and “Can we really upload Johnny Depp's brain?”
- Won $60,000+ in matching and prizes from sources that wouldn't have otherwise given to MIRI, via the Silicon Valley Gives fundraiser. (Thanks again, all you dedicated donors!)
- 49 new expert interviews, including interviews with Scott Aaronson (MIT), Max Tegmark (MIT), Kathleen Fisher (DARPA), Suresh Jagannathan (DARPA), André Platzer (CMU), Anil Nerode (Cornell), John Baez (UC Riverside), Jonathan Millen (MITRE), and Roger Schell.
- 4 transcribed conversations about MIRI strategy: 1, 2, 3, 4.
- Published a thorough “2013 in review.”
Ongoing Activities You Can Help Support
- We're writing an overview of the Friendly AI technical agenda (as we see it) so far.
- We're currently developing and testing several tutorials on different pieces of the Friendly AI technical agenda (tiling agents, modal agents, etc.).
- We're writing several more papers and reports.
- We're growing the MIRIx program, largely to grow the pool of people we can plausibly hire as full-time FAI researchers in the next couple years.
- We're planning, or helping to plan, multiple research workshops, including the May 2015 decision theory workshop at Cambridge University.
- We're finishing the editing for a book version of Eliezer's Sequences.
- We're helping to fund further SPARC activity, which provides education and skill-building to elite young math talent, and introduces them to ideas like effective altruism and global catastrophic risks.
- We're continuing to discuss formal collaboration opportunities with UC Berkeley faculty and development staff.
- We're helping Nick Bostrom promote his Superintelligence book in the U.S.
- We're investigating opportunities for supporting Friendly AI research via federal funding sources such as the NSF.
Other projects are still being surveyed for likely cost and impact. See also our mid-2014 strategic plan. We appreciate your support for our work!
Donate now, and seize a better than usual opportunity to move our work forward. If you have questions about donating, please contact Malo Bourgon at (510) 292-8776 or email@example.com.
† $200,000 of total matching funds has been provided by Jaan Tallinn, Edwin Evans, and Rick Schwall.
[I'm unsure how much this rehashes things 'everyone knows already' - if old hat, feel free to downvote into oblivion. My other motivation for the cross-post is the hope it might catch the interest of someone with a stronger mathematical background who could make this line of argument more robust]
Many outcomes of interest have pretty good predictors. It seems that height correlates with performance in basketball (the average height in the NBA is around 6'7"). Faster serves in tennis improve one's likelihood of winning. IQ scores are known to predict a slew of outcomes, from income, to chance of being imprisoned, to lifespan.
What is interesting is that the strength of these relationships appears to deteriorate as you advance far along the right tail. Although 6'7" is very tall, it lies within a couple of standard deviations of the median US adult male height - there are many thousands of US men taller than the average NBA player who are not in the NBA. Although elite tennis players have very fast serves, the players with the fastest serves ever recorded aren't the very best players of their time. The IQ case is harder to examine due to test ceilings, but again there seems to be some divergence near the top: the very highest earners tend to be very smart, but their intelligence is not in step with their income (their cognitive ability is around +3 to +4 SD above the mean, yet their wealth is much further above it) (1).
The trend seems to be that although we know the predictors are correlated with the outcome, freakishly extreme outcomes do not go together with similarly freakishly extreme predictors. Why?
Too much of a good thing?
One candidate explanation would be that more isn't always better, and the correlations one gets looking at the whole population don't capture a reversal at the right tail. Maybe being taller is good for basketball up to a point, but being really tall leads to greater costs in things like agility. Maybe, although a faster serve is better all else being equal, focusing too heavily on one's serve counterproductively neglects other areas of one's game. Maybe a high IQ is good for earning money, but a stratospherically high IQ brings an increased risk of productivity-reducing mental illness. Or something along those lines.
I would guess that these sorts of 'hidden trade-offs' are common. But the 'divergence of tails' seems pretty ubiquitous (the tallest aren't the heaviest, the smartest parents don't have the smartest children, the fastest runners aren't the best footballers, etc.), and it would be weird if there were always a 'too much of a good thing' story to be told for all of these associations. I think there is a more general explanation.
The simple graphical explanation
[Inspired by this essay from Grady Towers]
Suppose you make a scatter plot of two correlated variables. Here's one I grabbed off Google, comparing the speed of a ball out of a baseball pitcher's hand with its speed crossing the plate:
It is unsurprising to see these are correlated (I'd guess the R-square is > 0.8). But if one looks at the extreme end of the graph, the very fastest balls out of the hand aren't the very fastest balls crossing the plate, and vice versa. This feature is general. Look at this data (again convenience sampled from googling 'scatter plot') of quiz time versus test score:
Given a correlation, the envelope of the distribution should form some sort of ellipse, narrower as the correlation gets stronger, and more circular as it gets weaker:
The thing is, as one approaches the far corners of this ellipse, we see 'divergence of the tails': because the ellipse doesn't sharpen to a point, there are bulges where the maximum x and y values lie with sub-maximal y and x values respectively:
So this offers an explanation for why divergence at the tails is ubiquitous. Provided the sample size is largish and the correlation not too tight (the tighter the correlation, the larger the sample size required), one will observe ellipses with the bulging sides of the distribution (2).
Hence the very best basketball players aren't the tallest (and vice versa), the very wealthiest not the smartest, and so on and so forth for any correlated X and Y. If X and Y are "Estimated effect size" and "Actual effect size", or "Performance at T", and "Performance at T+n", then you have a graphical display of winner's curse and regression to the mean.
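For jointly normal variables this can be made precise. A short supplement (standard bivariate-normal facts, not part of the original post): if X and Y are standardized with correlation ρ, then

```latex
% Conditional mean and variance of Y given X = x, for a standardized
% bivariate normal pair (X, Y) with correlation rho:
\mathbb{E}[Y \mid X = x] = \rho x,
\qquad
\operatorname{Var}(Y \mid X = x) = 1 - \rho^{2}.
```

So even with a strong correlation like ρ = 0.8, the expected Y of a point at x = +4SD is only +3.2SD, and in a large sample the maximal Y will typically belong to a point with a less extreme x.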
An intuitive explanation of the graphical explanation
It would be nice to have an intuitive handle on why this happens, even if we can be convinced that it happens. Here's my offer towards an explanation:
The fact that a correlation is less than 1 implies that other things matter to an outcome of interest. Although being tall matters for being good at basketball, strength, agility, and hand-eye coordination matter as well (to name but a few). The same applies to other outcomes where multiple factors play a role: being smart helps in getting rich, but so does being hard-working, being lucky, and so on.
For a toy model, pretend that height, strength, agility, and hand-eye coordination are independent of one another, gaussian, and additive towards the outcome of basketball ability with equal weight.(3) So, ceteris paribus, being taller will make one better at basketball, and the toy model stipulates there aren't 'hidden trade-offs': there's no negative correlation between height and the other attributes, even at the extremes. Yet the graphical explanation suggests we should still see divergence of the tails: the very tallest shouldn't be the very best.
The intuitive explanation would go like this: Start at the extreme tail - +4SD above the mean for height. Although their 'basketball-score' gets a massive boost from their height, we'd expect them to be average with respect to the other basketball-relevant abilities (we've stipulated they're independent). Further, because this ultra-tall population is small, it is unlikely to contain extreme values of the other factors: with 10 people at +4SD, you wouldn't expect any of them to be +2SD in another factor like agility.
Move down the tail to slightly less extreme values - +3SD, say. These people don't get such a boost to their basketball score from their height, but there should be a lot more of them (if 10 at +4SD, around 500 at +3SD), and this means a lot more expected variation in the other basketball-relevant attributes - it is much less surprising to find someone +3SD in height who is also +2SD in agility, and in a world where these things were equally important, they would 'beat' someone +4SD in height but average in the other attributes. Although a +4SD-height person will likely be better than a given +3SD-height person, the best of the +4SDs will not be as good as the best of the much larger number of +3SDs.
The trade-off will vary depending on the exact weighting of the factors - on which of them explain more of the variance - but the point seems to hold in the general case: when looking at a factor known to be predictive of an outcome, the largest outcome values will occur with sub-maximal factor values, as the larger population increases the chances of 'getting lucky' with the other factors:
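As a sanity check on this intuition, here is a minimal simulation sketch of the toy model (the four independent standard-normal factors and equal weights are the stipulations above; the sample size, seed, and SD cutoffs are illustrative choices of mine):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

# Toy model: ability = height + strength + agility + coordination,
# each factor independent, standard normal, and equally weighted.
factors = rng.standard_normal((n, 4))
height = factors[:, 0]
ability = factors.sum(axis=1)

# Each factor correlates with the outcome at 1/sqrt(4) = 0.5.
print("corr(height, ability):", np.corrcoef(height, ability)[0, 1])

# The single best player is tall, but far from the tallest.
best = ability.argmax()
print("people taller than the best player:", int((height > height[best]).sum()))

# The best of the much larger +3SD group typically beats the best of
# the tiny +4SD group, despite the latter's height advantage.
tall = (height > 3) & (height <= 4)
ultra = height > 4
print("+3SD group:", int(tall.sum()), "best ability:", ability[tall].max())
print("+4SD group:", int(ultra.sum()), "best ability:", ability[ultra].max())
```

On a typical run, thousands of people are taller than the best player, and the best of the roughly 1,300-strong +3SD group outscores the best of the roughly 30-strong +4SD group.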
So that's why the tails diverge.
Endnote: EA relevance
I think this is interesting in and of itself, but it has relevance to Effective Altruism, given it generally focuses on the right tail of various things (What are the most effective charities? What is the best career? etc.) It generally vindicates worries about regression to the mean or winner's curse, and suggests that these will be pretty insoluble in all cases where the populations are large: even if you have really good means of assessing the best charities or the best careers, so that your assessments correlate really strongly with which ones actually are the best, the very best ones you identify are unlikely to be actually the very best, as the tails will diverge.
This probably has limited practical relevance. Although you might expect that one of the 'not estimated as the very best' charities is in fact better than your estimated-to-be-best charity, you don't know which one, and your best bet remains your estimate (in the same way - at least in the toy model above - you should bet a 6'11" person is better at basketball than someone who is 6'4".)
There may be spread betting or portfolio scenarios where this factor comes into play - perhaps instead of funding AMF up to the point of diminishing returns where its marginal effectiveness dips below charity #2's, we should be willing to spread funds sooner.(4) Mainly, though, it should lead us to be less self-confident.
1. One might look at the generally modest achievements of people in high-IQ societies as further evidence, but there are worries about adverse selection.
2. One needs a large enough sample to 'fill in' the elliptical population density envelope, and the tighter the correlation, the larger the sample needed to fill in the sub-maximal bulges. The Old Faithful case is an example where you actually do get a 'point', although it is likely an outlier.
3. If you want to apply it to cases where the factors are positively correlated - which they often are - just use the components of the other factors that are independent of the factor of interest. I think, but can't demonstrate, that the other stipulations could also be relaxed.
4. I'd intuit, but again can't demonstrate, that the case for this becomes stronger with highly skewed interventions where almost all the impact is focused in relatively low-probability channels, like averting a very specific existential risk.
Through a series of diagrams, this article will walk through key concepts in Nick Bostrom’s Superintelligence. The book is full of heavy content, and though well written, its scope and depth can make it difficult to grasp the concepts and mentally hold them together. The motivation behind making these diagrams is not to repeat an explanation of the content, but rather to present the content in such a way that the connections become clear. Thus, this article is best read and used as a supplement to Superintelligence.
Note: Superintelligence is now available in the UK. The hardcover is coming out in the US on September 3. The Kindle version is already available in the US as well as the UK.
Roadmap: there are two diagrams, both presented with an accompanying description. The two diagrams are combined into one mega-diagram at the end.
Figure 1: Pathways to Superintelligence
Figure 1 displays the five pathways toward superintelligence that Bostrom describes in chapter 2 and returns to in chapter 14 of the text. According to Bostrom, brain-computer interfaces are unlikely to yield superintelligence. Biological cognition, i.e., the enhancement of human intelligence, may yield a weak form of superintelligence on its own. Additionally, improvements to biological cognition could feed back into driving the progress of artificial intelligence or whole brain emulation. The arrows from networks and organizations likewise indicate technologies feeding back into AI and whole brain emulation development.
Artificial intelligence and whole brain emulation are two pathways that can lead to fully realized superintelligence. Note that neuromorphic is listed under artificial intelligence, but an arrow connects from whole brain emulation to neuromorphic. In chapter 14, Bostrom suggests that neuromorphic is a potential outcome of incomplete or improper whole brain emulation. Synthetic AI includes all the approaches to AI that are not neuromorphic; other terms that have been used are algorithmic or de novo AI.
This is a followup to Ben's post announcing the 2014 EA Summit.
The Effective Altruism Summit is now exactly one month away. Participating organizations include:
- The Center for Applied Rationality
- The Machine Intelligence Research Institute
- The Future of Humanity Institute
- The Life You Can Save
- 80,000 Hours
- Giving What We Can
- Leverage Research
A few months ago, my friend said the following thing to me: “After seeing Divergent, I finally understand virtue ethics. The main character is a cross between Aristotle and you.”
That was an impossible-to-resist pitch, and I saw the movie. The thing that resonated most with me–also the thing that my friend thought I had in common with the main character–was the idea that you could make a particular decision, and set yourself down a particular course of action, in order to make yourself become a particular kind of person. Tris didn’t join the Dauntless faction because she thought they were doing the most good in society, or because she thought her comparative advantage to do good lay there–she chose it because they were brave, and she wasn’t, yet, and she wanted to be. Bravery was a virtue that she thought she ought to have. If the graph of her motivations even went any deeper, the only node beyond ‘become brave’ was ‘become good.’
(Tris did have a concept of some future world-outcomes being better than others, and wanting to have an effect on the world. But that wasn't the causal reason why she chose Dauntless; as far as I can tell, it was unrelated.)
My twelve-year-old self had a similar attitude. I read a lot of fiction, and stories had heroes, and I wanted to be like them–and that meant acquiring the right skills and the right traits. I knew I was terrible at reacting under pressure–that in the case of an earthquake or other natural disaster, I would freeze up and not be useful at all. Being good at reacting under pressure was an important trait for a hero to have. I could be sad that I didn’t have it, or I could decide to acquire it by doing the things that scared me over and over and over again. So that someday, when the world tried to throw bad things at my friends and family, I’d be ready.
You could call that an awfully passive way to look at things. It reveals a deep-seated belief that I’m not in control, that the world is big and complicated and beyond my ability to understand and predict, much less steer–that I am not the locus of control. But this way of thinking is an algorithm. It will almost always spit out an answer, when otherwise I might get stuck in the complexity and unpredictability of trying to make a particular outcome happen.
I find the different houses of the HPMOR universe to be a very compelling metaphor. It’s not because they suggest actions to take; instead, they suggest virtues to focus on, so that when a particular situation comes up, you can act ‘in character.’ Courage and bravery for Gryffindor, for example. It also suggests the idea that different people can focus on different virtues–diversity is a useful thing to have in the world. (I'm probably mangling the concept of virtue ethics here, not having any background in philosophy, but it's the closest term for the thing I mean.)
I’ve thought a lot about the virtue of loyalty. In the past, loyalty has kept me with jobs and friends that, from an objective perspective, might not seem like the optimal things to spend my time on. But the costs of quitting and finding a new job, or cutting off friendships, wouldn’t just have been about direct consequences in the world, like needing to spend a bunch of time handing out resumes or having an unpleasant conversation. There would also be a shift within myself, a weakening in the drive towards loyalty. It wasn’t that I thought everyone ought to be extremely loyal–it’s a virtue with obvious downsides and failure modes. But it was a virtue that I wanted, partly because it seemed undervalued.
By calling myself a ‘loyal person’, I can aim myself in a particular direction without having to understand all the subcomponents of the world. More importantly, I can make decisions even when I’m rushed, or tired, or under cognitive strain that makes it hard to calculate through all of the consequences of a particular action.
The Less Wrong/CFAR/rationalist community puts a lot of emphasis on a different way of trying to be a hero–where you start from a terminal goal, like “saving the world”, and break it into subgoals, and do whatever it takes to accomplish it. In the past I’ve thought of myself as being mostly consequentialist, in terms of morality, and this is a very consequentialist way to think about being a good person. And it doesn't feel like it would work.
There are some bad reasons why it might feel wrong–e.g. that it feels arrogant to think you can accomplish something that big–but I think the main reason is that it feels fake. There is strong social pressure in the CFAR/Less Wrong community to claim that you have terminal goals, that you’re working towards something big. My System 2 understands terminal goals and consequentialism, as a thing that other people do–I could talk about my terminal goals, and get the points, and fit in, but I’d be lying about my thoughts. My model of my mind would be incorrect, and that would have consequences for, for example, whether my plans actually worked.
Practicing the art of rationality
Recently, Anna Salamon brought up a question with the other CFAR staff: “What is the thing that’s wrong with your own practice of the art of rationality?” The terminal goals thing was what I thought of immediately–namely, the conversations I've had over the past two years, where other rationalists have asked me "so what are your terminal goals/values?" and I've stammered something and then gone to hide in a corner and try to come up with some.
In Alicorn’s Luminosity, Bella says about her thoughts that “they were liable to morph into versions of themselves that were more idealized, more consistent - and not what they were originally, and therefore false. Or they'd be forgotten altogether, which was even worse (those thoughts were mine, and I wanted them).”
I want to know true things about myself. I also want to impress my friends by having the traits that they think are cool, but not at the price of faking it–my brain screams that pretending to be something other than what you are isn’t virtuous. When my immediate response to someone asking me about my terminal goals is “but brains don’t work that way!” it may not be a true statement about all brains, but it’s a true statement about my brain. My motivational system is wired in a certain way. I could think it was broken; I could let my friends convince me that I needed to change, and try to shoehorn my brain into a different shape; or I could accept that it works, that I get things done and people find me useful to have around and this is how I am. For now. I'm not going to rule out future attempts to hack my brain, because Growth Mindset, and maybe some other reasons will convince me that it's important enough, but if I do it, it'll be on my terms. Other people are welcome to have their terminal goals and existential struggles. I’m okay the way I am–I have an algorithm to follow.
Why write this post?
It would be an awfully surprising coincidence if mine was the only brain that worked this way. I’m not a special snowflake. And other people who interact with the Less Wrong community might not deal with it the way I do. They might try to twist their brains into the ‘right’ shape, and break their motivational system. Or they might decide that rationality is stupid and walk away.
Building a safe and powerful artificial general intelligence seems a difficult task. Working on that task today is particularly difficult, as there is no clear path to AGI yet. Is there work that can be done now that makes it more likely that humanity will be able to build a safe, powerful AGI in the future? Benja and I think there is: there are a number of relevant problems that it seems possible to make progress on today using formally specified toy models of intelligence. For example, consider recent program equilibrium results and various problems of self-reference.
AIXI is a powerful toy model used to study intelligence. An appropriately-rewarded AIXI could readily solve a large class of difficult problems. This includes computer vision, natural language recognition, and many other difficult optimization tasks. That these problems are all solvable by the same equation — by a single hypothetical machine running AIXI — indicates that the AIXI formalism captures a very general notion of "intelligence".
However, AIXI is not a good toy model for investigating the construction of a safe and powerful AGI. This is not just because AIXI is uncomputable (and its computable counterpart AIXItl infeasible). Rather, it's because AIXI cannot self-modify. This fact is fairly obvious from the AIXI formalism: AIXI assumes that in the future, it will continue being AIXI. This is a fine assumption for AIXI to make, as it is a very powerful agent and may not need to self-modify. But this inability limits the usefulness of the model. Any agent capable of undergoing an intelligence explosion must be able to acquire new computing resources, dramatically change its own architecture, and keep its goals stable throughout the process. The AIXI formalism lacks tools to study such behavior.
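For reference, here is a sketch of AIXI's expectimax action selection, following Hutter's formalism (reconstructed from memory, so treat the notation as approximate rather than authoritative). At cycle k, with horizon m, universal machine U, and environment programs q of length ℓ(q):

```latex
% AIXI (sketch): expectimax over future actions and percepts, weighted
% by a Solomonoff-style prior 2^{-l(q)} over environment programs q.
a_k = \arg\max_{a_k} \sum_{o_k r_k} \cdots \max_{a_m} \sum_{o_m r_m}
      \left( r_k + \cdots + r_m \right)
      \sum_{q \,:\, U(q,\, a_1 \ldots a_m) = o_1 r_1 \ldots o_m r_m} 2^{-\ell(q)}
```

The nested max operators over the future actions a_{k+1}, ..., a_m are the formal expression of the point above: AIXI plans on the assumption that every future action will be chosen by this same maximization - that is, that it will still be AIXI.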
This is not a condemnation of AIXI: the formalism was not designed to study self-modification. However, this limitation is neither trivial nor superficial: even though an AIXI may not need to make itself "smarter", real agents may need to self-modify for reasons other than self-improvement. The fact that an embodied AIXI cannot self-modify leads to systematic failures in situations where self-modification is actually necessary. One such scenario, made explicit using Botworld, is explored in detail below.
In this game, one agent will require another agent to precommit to a trade by modifying its code in a way that forces execution of the trade. AIXItl, which is unable to alter its source code, is not able to implement the precommitment, and thus cannot enlist the help of the other agent.
Afterwards, I discuss a slightly more realistic scenario in which two agents have an opportunity to cooperate, but one agent has a computationally expensive "exploit" action available and the other agent can measure the waste heat produced by computation. Again, this is a scenario where an embodied AIXItl fails to achieve a high payoff against cautious opponents.
Though scenarios such as these may seem improbable, they are not strictly impossible. Such scenarios indicate that AIXI — while a powerful toy model — does not perfectly capture the properties desirable in an idealized AGI.
I once asked a room full of about 100 neuroscientists whether willpower depletion was a thing, and there was widespread disagreement with the idea. (A propos, this is a great way to quickly gauge consensus in a field.) Basically, for a while some researchers believed that willpower depletion "is" glucose depletion in the prefrontal cortex, but some more recent experiments have failed to replicate this, e.g. by finding that the mere taste of sugar is enough to "replenish" willpower faster than the time it takes blood to move from the mouth to the brain:
Carbohydrate mouth-rinses activate dopaminergic pathways in the striatum–a region of the brain associated with responses to reward (Kringelbach, 2004)–whereas artificially-sweetened non-carbohydrate mouth-rinses do not (Chambers et al., 2009). Thus, the sensing of carbohydrates in the mouth appears to signal the possibility of reward (i.e., the future availability of additional energy), which could motivate rather than fuel physical effort.
-- Molden, D. C., et al., "The Motivational versus Metabolic Effects of Carbohydrates on Self-Control," Psychological Science.
Stanford's Carol Dweck and Greg Walton even found that hinting to people that using willpower is energizing might actually make them less depletable:
When we had people read statements that reminded them of the power of willpower like, “Sometimes, working on a strenuous mental task can make you feel energized for further challenging activities,” they kept on working and performing well with no sign of depletion. They made half as many mistakes on a difficult cognitive task as people who read statements about limited willpower. In another study, they scored 15 percent better on I.Q. problems.
-- Dweck and Walton, "Willpower: It’s in Your Head?" New York Times.
While these are all interesting empirical findings, there’s a very similar phenomenon that’s much less debated and which could explain many of these observations, but I think gets too little popular attention in these discussions:
Willpower is distractible.
Indeed, willpower and working memory are both strongly mediated by the dorsolateral prefrontal cortex, so “distraction” could just be the two functions funging against one another. To use the terms of Stanovich popularized by Kahneman in Thinking, Fast and Slow, "System 2" can only override so many "System 1" defaults at any given moment.
So what’s going on when people say "willpower depletion"? I’m not sure, but even if willpower depletion is not a thing, the following distracting phenomena clearly are:
- Physical fatigue (like from running)
- Physical discomfort (like from sitting)
- That specific-other-thing you want to do
- Anxiety about willpower depletion
- Indignation at being asked for too much by bosses, partners, or experimenters...
... and "willpower depletion" might be nothing more than mental distraction by one of these processes. Perhaps it really is better to think of willpower as power (a rate) rather than as energy (a resource).
If that’s true, then figuring out what processes might be distracting us might be much more useful than saying “I’m out of willpower” and giving up. Maybe try having a sip of water or a bit of food if your diet permits it. Maybe try reading lying down to see if you get nap-ish. Maybe set a timer to remind you to call that friend you keep thinking about.
The last two bullets,
- Anxiety about willpower depletion
- Indignation at being asked for too much by bosses, partners, or experimenters...
are also enough to explain why being told willpower depletion isn’t a thing might reduce the effects typically attributed to it: we might simply be less distracted by anxiety or indignation about doing “too much” willpower-intensive work in a short period of time.
Of course, any speculation about how human minds work in general is prone to the "typical mind fallacy". Maybe my willpower is depletable and yours isn’t. But then that wouldn’t explain why you can cause people to exhibit less willpower depletion by suggesting otherwise. But then again, most published research findings are false. But then again the research on the DLPFC and working memory seems relatively old and well established, and distraction is clearly a thing...
All in all, more of my chips are falling on the hypothesis that willpower “depletion” is often just willpower distraction, and that finding and addressing those distractions is probably a better strategy than avoiding activities altogether in order to "conserve willpower".
As of May 2014, there is an existential risk research and outreach organization based in the Boston area. The Future of Life Institute (FLI), spearheaded by Max Tegmark, was co-founded by Jaan Tallinn, Meia Chita-Tegmark, Anthony Aguirre and myself.
Our idea was to create a hub on the US East Coast to bring together people who care about x-risk and the future of life. FLI is currently run entirely by volunteers, and is based on brainstorming meetings where the members come together and discuss active and potential projects. The attendees are a mix of local scientists, researchers and rationalists, which results in a diversity of skills and ideas. We also hold more narrowly focused meetings where smaller groups work on specific projects. We have projects in the pipeline ranging from improving Wikipedia resources related to x-risk, to bringing together AI researchers in order to develop safety guidelines and make the topic of AI safety more mainstream.
Max has assembled an impressive advisory board that includes Stuart Russell, George Church and Stephen Hawking. The advisory board is not just for prestige - the local members attend our meetings, and some others participate in our projects remotely. We consider ourselves a sister organization to FHI, CSER and MIRI, and touch base with them often.
We recently held our launch event, a panel discussion "The Future of Technology: Benefits and Risks" at MIT. The panelists were synthetic biologist George Church, geneticist Ting Wu, economist Andrew McAfee, physicist and Nobel laureate Frank Wilczek and Skype co-founder Jaan Tallinn. The discussion covered a broad range of topics from the future of bioengineering and personal genetics, to autonomous weapons, AI ethics and the Singularity. A video and transcript are available.
FLI is a grassroots organization that thrives on contributions from awesome people like the LW community - here are some ways you can help:
- If you have ideas for research or outreach we could be doing, or improvements to what we're already doing, please let us know (in the comments to this post, or by contacting me directly).
- If you are in the vicinity of the Boston area and are interested in getting involved, you are especially encouraged to get in touch with us!
- Support in the form of donations is much appreciated. (We are grateful for seed funding provided by Jaan Tallinn and Matt Wage.)
It is obvious that the same thing will not be willing to do or undergo opposites in the same part of itself, in relation to the same thing, at the same time. --Book IV of Plato's Republic
Can you simultaneously want sex and not want it? Can you believe in God and not believe in Him at the same time? Can you be fearless while frightened?
To be fair to Plato, this was meant not as an assertion that such contradictions are impossible, but as an argument that the soul has multiple parts. It seems we can, in fact, want something while also not wanting it. This is awfully strange, and it led Plato to conclude the soul must have multiple parts, for surely no one part could contain both sides of the contradiction.
Often, when we attempt to accept contradictory statements as correct, it causes cognitive dissonance--that nagging, itchy feeling in your brain that won't leave you alone until you admit that something is wrong. Like when you try to convince yourself that staying up just a little longer playing 2048 won't have adverse effects on the presentation you're giving tomorrow, when you know full well that's exactly what's going to happen.
But it may be that cognitive dissonance is the exception in the face of contradictions, rather than the rule. How would you know? If it doesn't cause any emotional friction, the two propositions will just sit quietly together in your brain, never mentioning that it's logically impossible for both of them to be true. When we accept a contradiction wholesale without cognitive dissonance, it's what Orwell called "doublethink".
When you're a mere mortal trying to get by in a complex universe, doublethink may be adaptive. If you want to be completely free of contradictory beliefs without spending your whole life alone in a cave, you'll likely waste a lot of your precious time working through conundrums, which will often produce even more conundrums.
Suppose I believe that my husband is faithful, and I also believe that the unfamiliar perfume on his collar indicates he's sleeping with other women without my permission. I could let that pesky little contradiction turn into an extended investigation that may ultimately ruin my marriage. Or I could get on with my day and leave my marriage intact.
It's better to just leave those kinds of thoughts alone, isn't it? It probably makes for a happier life.
Suppose you believe that driving is dangerous, and also that, while you are driving, you're completely safe. As established in Doublethink, there may be some benefits to letting that mental configuration be.
There are also some life-shattering downsides. One of the things you believe is false, you see, by the law of non-contradiction. In point of fact, it's the one that goes "I'm completely safe while driving". Believing false things has consequences.
Be irrationally optimistic about your driving skills, and you will be happily unconcerned where others sweat and fear. You won't have to put up with the inconvenience of a seatbelt. You will be happily unconcerned for a day, a week, a year. Then CRASH, and spend the rest of your life wishing you could scratch the itch in your phantom limb. Or paralyzed from the neck down. Or dead. It's not inevitable, but it's possible; how probable is it? You can't make that tradeoff rationally unless you know your real driving skills, so you can figure out how much danger you're placing yourself in. --Eliezer Yudkowsky, Doublethink (Choosing to be Biased)
What are beliefs for? Please pause for ten seconds and come up with your own answer.
Ultimately, I think beliefs are inputs for predictions. We're basically very complicated simulators that try to guess which actions will cause desired outcomes, like survival or reproduction or chocolate. We input beliefs about how the world behaves, make inferences from them to which experiences we should anticipate given various changes we might make to the world, and output behaviors that get us what we want, provided our simulations are good enough.
My car is making a mysterious ticking sound. I have many beliefs about cars, and one of them is that if my car makes noises it shouldn't, it will probably stop working eventually, and possibly explode. I can use this input to simulate the future. Since I've observed my car making a noise it shouldn't, I predict that my car will stop working. I also believe that there is something causing the ticking. So I predict that if I intervene and stop the ticking (in non-ridiculous ways), my car will keep working. My belief has thus led to the action of researching the ticking noise, planning some simple tests, and will probably lead to cleaning the sticky lifters.
If it's true that solving the ticking noise will keep my car running, then my beliefs will cash out in correctly anticipated experiences, and my actions will cause desired outcomes. If it's false, perhaps because the ticking can be solved without addressing a larger underlying problem, then the experiences I anticipate will not occur, and my actions may lead to my car exploding.
Doublethink guarantees that you believe falsehoods. Some of the time you'll call upon the true belief ("driving is dangerous"), anticipate future experiences accurately, and get the results you want from your chosen actions ("don't drive three times the speed limit at night while it's raining"). But some of the time, if you actually believe the false thing as well, you'll call upon the opposite belief, anticipate inaccurately, and choose the last action you'll ever take.
Without any principled algorithm determining which of the contradictory propositions to use as an input for the simulation at hand, you'll fail as often as you succeed. So it makes no sense to anticipate more positive outcomes from believing contradictions.
Contradictions may keep you happy as long as you never need to use them. Should you call upon them, though, to guide your actions, the debt on false beliefs will come due. You will drive too fast at night in the rain, you will crash, you will fly out of the car with no seat belt to restrain you, you will die, and it will be your fault.
Against Against Doublethink
What if Plato was pretty much right, and we sometimes believe contradictions because we're sort of not actually one single person?
It is not literally true that Systems 1 and 2 are separate individuals the way you and I are. But the idea of Systems 1 and 2 suggests to me something quite interesting with respect to the relationship between beliefs and their role in decision making, and modeling them as separate people with very different personalities seems to work pretty darn well when I test my suspicions.
I read Atlas Shrugged probably about a decade ago. I was impressed with its defense of capitalism, which really hammers home the reasons it’s good and important on a gut level. But I was equally turned off by its promotion of selfishness as a moral ideal. I thought that was *basically* just being a jerk. After all, if there’s one thing the world doesn’t need (I thought) it’s more selfishness.
Then I talked to a friend who told me Atlas Shrugged had changed his life. That he’d been raised in a really strict family that had told him that ever enjoying himself was selfish and made him a bad person, that he had to be working at every moment to make his family and other people happy or else let them shame him to pieces. And the revelation that it was sometimes okay to consider your own happiness gave him the strength to stand up to them and turn his life around, while still keeping the basic human instinct of helping others when he wanted to and he felt they deserved it (as, indeed, do Rand characters). --Scott of Slate Star Codex in All Debates Are Bravery Debates
If you're generous to a fault, "I should be more selfish" is probably a belief that will pay off in positive outcomes should you install it for future use. If you're selfish to a fault, the same belief will be harmful. So what if you were too generous half of the time and too selfish the other half? Well, then you would want to believe "I should be more selfish" with only the generous half, while disbelieving it with the selfish half.
Systems 1 and 2 need to hear different things. System 2 might be able to understand the reality of biases and make appropriate adjustments that would work if System 1 were on board, but System 1 isn't so great at being reasonable. And it's not System 2 that's in charge of most of your actions. If you want your beliefs to positively influence your actions (which is the point of beliefs, after all), you need to tailor your beliefs to System 1's needs.
For example: The planning fallacy is nearly ubiquitous. I know this because for the past three years or so, I've gotten everywhere five to fifteen minutes early. Almost every single person I meet with arrives five to fifteen minutes late. It is very rare for someone to be on time, and only twice in three years have I encountered the (rather awkward) circumstance of meeting with someone who also arrived early.
Before three years ago, I was also usually late, and I far underestimated how long my projects would take. I knew, abstractly and intellectually, about the planning fallacy, but that didn't stop System 1 from thinking things would go implausibly quickly. System 1's just optimistic like that. It responds to, "Dude, that is not going to work, and I have a twelve point argument supporting my position and suggesting alternative plans," with "Naaaaw, it'll be fine! We can totally make that deadline."
At some point (I don't remember when or exactly how), I gained the ability to look at the true due date, shift my System 1 beliefs to make up for the planning fallacy, and then hide my memory that I'd ever seen the original due date. I would see that my flight left at 2:30, and be surprised to discover on travel day that I was not late for my 2:00 flight, but a little early for my 2:30 one. I consistently finished projects on time, and only disasters caused me to be late for meetings. It took me about three months before I noticed the pattern and realized what must be going on.
I got a little worried I might make a mistake, such as leaving a meeting thinking the other person just wasn't going to show when the actual meeting time hadn't arrived. I did have a couple close calls along those lines. But it was easy enough to fix; in important cases, I started receiving Boomeranged notes from past-me around the time present-me expected things to start that said, "Surprise! You've still got ten minutes!"
This unquestionably improved my life. You don't realize just how inconvenient the planning fallacy is until you've left it behind. Clearly, considered in isolation, the action of believing falsely in this domain was instrumentally rational.
Doublethink, and the Dark Arts generally, applied to carefully chosen domains is a powerful tool. It's dumb to believe false things about really dangerous stuff like driving, obviously. But you don't have to doublethink indiscriminately. As long as you're careful, as long as you suspend epistemic rationality only when it's clearly beneficial to do so, employing doublethink at will is a great idea.
Instrumental rationality is what really matters. Epistemic rationality is useful, but what use is holding accurate beliefs in situations where that won't get you what you want?
Against Against Against Doublethink
There are indeed epistemically irrational actions that are instrumentally rational, and instrumental rationality is what really matters. It is pointless to believe true things if it doesn't get you what you want. This has always been very obvious to me, and it remains so.
There is a bigger picture.
Certain epistemic rationality techniques are not compatible with dark side epistemology. Most importantly, the Dark Arts do not play nicely with "notice your confusion", which is essentially your strength as a rationalist. If you use doublethink on purpose, confusion doesn't always indicate that you need to find out what false thing you believe so you can fix it. Sometimes you have to bury your confusion. There's an itsy bitsy pause where you try to predict whether it's useful to bury.
As soon as I finally decided to abandon the Dark Arts, I began to sweep out corners I'd allowed myself to neglect before. They were mainly corners I didn't know I'd neglected.
The first one I noticed was the way I responded to requests from my boyfriend. He'd mentioned before that I often seemed resentful when he made requests of me, and I'd insisted that he was wrong, that I was actually happy all the while. (Notice that in the short term, since I was probably going to do as he asked anyway, attending to the resentment would probably have made things more difficult for me.) This self-deception went on for months.
Shortly after I gave up doublethink, he made a request, and I felt a little stab of dissonance. Something I might have swept away before, because it seemed more immediately useful to bury the confusion than to notice it. But I thought (wordlessly and with my emotions), "No, look at it. This is exactly what I've decided to watch for. I have noticed confusion, and I will attend to it."
It was very upsetting at first to learn that he'd been right. I feared the implications for our relationship. But that fear didn't last, because we both knew the only problems you can solve are the ones you acknowledge, so it is a comfort to know the truth.
I was far more shaken by the realization that I really, truly was ignorant that this had been happening. Not because the consequences of this one bit of ignorance were so important, but because who knows what other epistemic curses have hidden themselves in the shadows? I realized that I had not been in control of my doublethink, that I couldn't have been.
Pinning down that one tiny little stab of dissonance took great preparation and effort, and there's no way I'd been working fast enough before. "How often," I wondered, "does this kind of thing happen?"
Very often, it turns out. I began noticing and acting on confusion several times a day, where before I'd been doing it a couple times a week. I wasn't just noticing things that I'd have ignored on purpose before; I was noticing things that would have slipped by because my reflexes slowed as I weighed the benefit of paying attention. "Ignore it" was not an available action in the face of confusion anymore, and that was a dramatic change. Because there are no disruptions, acting on confusion is becoming automatic.
I can't know for sure which bits of confusion I've noticed since the change would otherwise have slipped by unseen. But here's a plausible instance. Tonight I was having dinner with a friend I've met very recently. I was feeling a little bit tired and nervous, so I wasn't putting as much effort as usual into directing the conversation. At one point I realized we had stopped making any progress toward my goals, since it was clear we were drifting toward small talk. In a tired and slightly nervous state, I imagine that I might have buried that bit of information and abdicated responsibility for the conversation--not by means of considering whether allowing small talk to happen was actually a good idea, but by not pouncing on the dissonance aggressively, and thereby letting it get away. Instead, I directed my attention at the feeling (without effort this time!), inquired of myself what precisely was causing it, identified the prediction that the current course of conversation was leading away from my goals, listed potential interventions, weighed their costs and benefits against my simulation of small talk, and said, "What are your terminal values?"
(I know that sounds like a lot of work, but it took at most three seconds. The hard part was building the pouncing reflex.)
When you know that some of your beliefs are false, and you know that leaving them be is instrumentally rational, you do not develop the automatic reflex of interrogating every suspicion of confusion. You might think you can do this selectively, but if you do, I strongly suspect you're wrong in exactly the way I was.
I have long been more viscerally motivated by things that are interesting or beautiful than by things that correspond to the territory. So it's not too surprising that toward the beginning of my rationality training, I went through a long period of being so enamored with a-veridical instrumental techniques--things like willful doublethink--that I double-thought myself into believing accuracy was not so great.
But I was wrong. And that mattered. Having accurate beliefs is a ridiculously convergent incentive. Every utility function that involves interaction with the territory--interaction of just about any kind!--benefits from a sound map. Even if "beauty" is a terminal value, "being viscerally motivated to increase your ability to make predictions that lead to greater beauty" increases your odds of success.
Dark side epistemology prevents total dedication to continuous improvement in epistemic rationality. Though individual dark side actions may be instrumentally rational, the patterns of thought required to allow them are not. Though instrumental rationality is ultimately the goal, your instrumental rationality will always be limited by your epistemic rationality.
That was important enough to say again: Your instrumental rationality will always be limited by your epistemic rationality.
It only takes a fraction of a second to sweep an observation into the corner. You don't have time to decide whether looking at it might prove problematic. If you take the time to protect your compartments, false beliefs you don't endorse will slide in from everywhere through those split-second cracks in your art. You must attend to your confusion the very moment you notice it. You must be relentless and unmerciful toward your own beliefs.
Excellent epistemology is not the natural state of a human brain. Rationality is hard. Without extreme dedication and advanced training, without reliable automatic reflexes of rational thought, your belief structure will be a mess. You can't have totally automatic anti-rationalization reflexes if you use doublethink as a technique of instrumental rationality.
This has been a difficult lesson for me. I have lost some benefits I'd gained from the Dark Arts. I'm late now, sometimes. And painful truths are painful, though now they are sharp and fast instead of dull and damaging.
And it is so worth it! I have much more work to do before I can move on to the next thing. But whatever the next thing is, I'll tackle it with far more predictive power than I otherwise would have--though I doubt I'd have noticed the difference.
So when I say that I'm against against against doublethink--that dark side epistemology is bad--I mean that there is more potential on the light side, not that the dark side has no redeeming features. Its fruits hang low, and they are delicious.
But the fruits of the light side are worth the climb. You'll never even know they're there if you gorge yourself in the dark forever.