

I Want To Live In A Baugruppe

44 Alicorn 17 March 2017 01:36AM

Rationalists like to live in group houses.  We are also as a subculture moving more and more into a child-having phase of our lives.  These things don't cooperate super well - I live in a four bedroom house because we like having roommates and guests, but if we have three kids and don't make them share we will in a few years have no spare rooms at all.  This is frustrating in part because amenable roommates are incredibly useful as alloparents if you value things like "going to the bathroom unaccompanied" and "eating food without being screamed at", neither of which are reasonable "get a friend to drive for ten minutes to spell me" situations.  Meanwhile there are also people we like living around who don't want to cohabit with a small child, which is completely reasonable, small children are not for everyone.

For this and other complaints ("househunting sucks", "I can't drive and need private space but want friends accessible", whatever) the ideal solution seems to be somewhere along the spectrum between "a street with a lot of rationalists living on it" (no rationalist-friendly entity controls all those houses and it's easy for minor fluctuations to wreck the intentional community thing) and "a dorm" (sorta hard to get access to those once you're out of college, usually not enough kitchens or space for adult life).  There's a name for a thing halfway between those, at least in German - "baugruppe" - buuuuut this would require community or sympathetic-individual control of a space and the money to convert it if it's not already baugruppe-shaped.

Maybe if I complain about this in public a millionaire will step forward or we'll be able to come up with a coherent enough vision to crowdfund it or something.  I think there is easily enough demand for a couple of ten-to-twenty-adult baugruppen (one in the east bay and one in the south bay) or even more/larger, if the structures materialized.  Here are some bulleted lists.

Desiderata:

  • Units that it is really easy for people to communicate across and flow between during the day - to my mind this would ideally go to the point where a family who had more kids than fit in their unit could move the older ones into a kid unit with some friends for permanent sleepover, but still easily supervise them.  The units can be smaller and more modular the more this desideratum is accomplished.
  • A pricing structure such that the gamut of rationalist financial situations (including but not limited to rent-payment-constraining things like "impoverished app academy student", "frugal Google engineer effective altruist", "NEET with a Patreon", "CfAR staffperson", "not-even-ramen-profitable entrepreneur", etc.) could live there.  One thing I really like about my house is that Spouse can pay for it himself and would by default anyway, and we can evaluate roommates solely on their charming company (or contribution to childcare) even if their financial situation is "no".  However, this does require some serious participation from people whose financial situation is "yes" and a way to balance the two so arbitrary numbers of charity cases don't bankrupt the project.
  • Variance in amenities suited to a mix of Soylent-eating restaurant-going takeout-ordering folks who only need a fridge and a microwave and maybe a dishwasher, and neighbors who are not that, ideally such that it's easy for the latter to feed neighbors as convenient.
  • Some arrangement to get repairs done, ideally some compromise between "you can't do anything to your living space, even paint your bedroom, because you don't own the place and the landlord doesn't trust you" and "you have to personally know how to fix a toilet".
  • I bet if this were pulled off at all it would be pretty easy to have car-sharing bundled in, like in Benton House That Was which had several people's personal cars more or less borrowable at will.  (Benton House That Was may be considered a sort of proof of concept of "20 rationalists living together" but I am imagining fewer bunk beds in the baugruppe.)  Other things that could be shared include longish-term storage and irregularly used appliances.
  • Dispute resolution plans and resident- and guest-vetting plans which thread the needle between "have to ask a dozen people before you let your brother crash on the couch, let alone a guest unit" and "cannot expel missing stairs".  I think there are some rationalist community Facebook groups that have medium-trust networks of the right caution level and experiment with ways to maintain them.

Obstacles:

  • Bikeshedding.  Not that it isn't reasonable to bikeshed a little about a would-be permanent community edifice that you can't benefit from or won't benefit from much unless it has X trait - I sympathize with this entirely - but too much from too many corners means no baugruppen go up at all even if everything goes well, and that's already dicey enough, so please think hard on how necessary it is for the place to be blue or whatever.
  • Location.  The only really viable place to do this for rationalist population critical mass is the Bay Area, which has, uh, problems, with new construction.  Existing structures are likely to be unsuited to the project both architecturally and zoningwise, although I would not be wholly pessimistic about one of those little two-story hotels with rooms that open to the outdoors or something like that.
  • Principal-agent problems.  I do not know how to build a dormpartment building and probably neither do you.
  • Community norm development with buy-in and a good match for typical conscientiousness levels even though we are rules-lawyery contrarians.

Please share this wherever rationalists may be looking; it's definitely the sort of thing better done with more eyes on it.

Effective altruism is self-recommending

30 Benquo 21 April 2017 06:37PM

A parent I know reports (some details anonymized):

Recently we bought my 3-year-old daughter a "behavior chart," in which she can earn stickers for achievements like not throwing tantrums, eating fruits and vegetables, and going to sleep on time. We successfully impressed on her that a major goal each day was to earn as many stickers as possible.

This morning, though, I found her just plastering her entire behavior chart with stickers. She genuinely seemed to think I'd be proud of how many stickers she now had.

The Effective Altruism movement has now entered this extremely cute stage of cognitive development. EA is more than three years old, but institutions age differently than individuals.

What is a confidence game?

In 2009, investment manager and con artist Bernie Madoff pled guilty to running a massive fraud, with $50 billion in fake returns on investment, having outright embezzled around $18 billion out of the $36 billion investors put into the fund. Only a couple of years earlier, when my grandfather was still alive, I remember him telling me about how Madoff was a genius, getting his investors a consistent high return, and about how he wished he could be in on it, but Madoff wasn't accepting additional investors.

What Madoff was running was a classic Ponzi scheme. Investors gave him money, and he told them that he'd gotten them an exceptionally high return on investment, when in fact he had not. But because he promised to be able to do it again, his investors mostly reinvested their money, and more people were excited about getting in on the deal. There was more than enough money to cover the few people who wanted to take money out of this amazing opportunity.

Ponzi schemes, pyramid schemes, and speculative bubbles are all situations in which investors' expected profits are paid out from the money paid in by new investors, instead of any independently profitable venture. Ponzi schemes are centrally managed – the person running the scheme represents it to investors as legitimate, and takes responsibility for finding new investors and paying off old ones. In pyramid schemes such as multi-level marketing and chain letters, each generation of investors recruits new investors and profits from them. In speculative bubbles, there is no formal structure propping up the scheme, only a common, mutually reinforcing set of expectations among speculators driving up the price of something that was already for sale.
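
To make the cash-flow mechanics concrete, here is a minimal sketch (not from the original post; all figures are invented) of a toy Ponzi scheme: the operator reports a fictitious return each period, but the only real money available to cover withdrawals is whatever new investors deposit.

```python
# Toy model of a Ponzi scheme's cash flow. Illustrative only; all numbers
# are made up. "reported" is what investors are told they own; "cash" is
# the real money on hand, which only grows when new investors pay in.

def simulate_ponzi(deposits, promised_return, withdrawal_rate):
    cash = 0.0      # real money actually held
    reported = 0.0  # fictitious balances shown to investors
    for year, deposit in enumerate(deposits, start=1):
        cash += deposit
        reported += deposit
        reported *= 1 + promised_return           # fake "return" accrues
        withdrawals = withdrawal_rate * reported  # some investors cash out
        cash -= withdrawals
        reported -= withdrawals
        if cash < 0:
            return f"collapses in year {year}: withdrawals exceed real cash"
    return f"still afloat: reports {reported:,.0f} but actually holds {cash:,.0f}"

# Steady inflows keep the game going; when new money stops, it collapses.
print(simulate_ponzi([100] * 10, promised_return=0.10, withdrawal_rate=0.15))
print(simulate_ponzi([100] * 5 + [0] * 5, promised_return=0.10, withdrawal_rate=0.15))
```

The only point of the sketch is that the reported balances and the real cash come apart: nothing in the scheme generates returns independently of new deposits.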

The general situation in which someone sets themself up as the repository of others' confidence, and uses this as leverage to acquire increasing investment, can be called a confidence game.

Some of the most iconic Ponzi schemes blew up quickly because they promised wildly unrealistic growth rates. This had three undesirable effects for the people running the schemes. First, it attracted too much attention – too many people wanted into the scheme too quickly, so they rapidly exhausted sources of new capital. Second, because their rates of return were implausibly high, they made themselves targets for scrutiny. Third, the extremely high rates of return themselves caused their promises to quickly outpace what they could plausibly return to even a small share of their investor victims.

Madoff was careful to avoid all these problems, which is why his scheme lasted for nearly half a century. He promised returns (around 10% annually) that were only plausibly high for a successful hedge fund, especially one illegally engaged in insider trading, rather than the sort of implausibly high returns typical of more blatant Ponzi schemes. (Charles Ponzi promised to double investors' money in 90 days.) Madoff showed reluctance to accept new clients, like any other fund manager who doesn't want to get too big for their trading strategy.
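
As a rough check on this point, compounding the two rates quoted above shows how much faster Charles Ponzi's promise outruns Madoff's (the comparison itself is mine, not the post's):

```python
# Compare how fast the two promises compound. Rates are taken from the text:
# Charles Ponzi promised to double money every 90 days; Madoff promised ~10%/year.

ponzi_annual = 2 ** (365 / 90)  # doubling every 90 days is roughly 16.6x per year
madoff_annual = 1.10

for years in (1, 3, 5):
    print(f"after {years} year(s), per dollar promised: "
          f"Ponzi owes {ponzi_annual ** years:>12,.0f}x, "
          f"Madoff owes {madoff_annual ** years:.2f}x")
```

A promise of roughly 16x per year has to visibly fail within a couple of years; a promise of 10% per year can be strung along for decades.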

He didn't plaster stickers all over his behavior chart – he put a reasonable number of stickers on it. He played a long game.

Not all confidence games are inherently bad. For instance, the US national pension system, Social Security, operates as a kind of Ponzi scheme, but it is not obviously unsustainable, and many people continue to be glad that it exists. Nominally, when people pay Social Security taxes, the money is invested in the Social Security trust fund, which holds interest-bearing financial assets that will be used to pay out benefits in their old age. In this respect it looks like an ordinary pension fund.

However, the financial assets are US Treasury bonds. There is no independently profitable venture. The Federal Government of the United States of America is quite literally writing an IOU to itself, and then spending the money on current expenditures, including paying out current Social Security benefits.

The Federal Government, of course, can write as large an IOU to itself as it wants. It could make all tax revenues part of the Social Security program. It could issue new Treasury bonds and gift them to Social Security. None of this would increase its ability to pay out Social Security benefits. It would be an empty exercise in putting stickers on its own chart.

If the Federal Government loses the ability to collect enough taxes to pay out Social Security benefits, there is no additional capacity to pay represented by US Treasury bonds. What we have is an implied promise to pay out future benefits, backed by the expectation that the government will be able to collect taxes in the future, including Social Security taxes.

There's nothing necessarily wrong with this, except that the mechanism by which Social Security is funded is obscured by financial engineering. However, this misdirection should raise at least some doubts as to the underlying sustainability or desirability of the commitment. In fact, this scheme was adopted specifically to give people the impression that they had some sort of property rights over their Social Security pension, in order to make the program politically difficult to eliminate. Once people have "bought in" to a program, they will be reluctant to treat their prior contributions as sunk costs, and willing to invest additional resources to salvage their investment, in ways that may make them increasingly reliant on it.

Not all confidence games are intrinsically bad, but dubious programs benefit the most from being set up as confidence games. More generally, bad programs are the ones that benefit the most from being allowed to fiddle with their own accounting. As Daniel Davies writes, in The D-Squared Digest One Minute MBA - Avoiding Projects Pursued By Morons 101:

Good ideas do not need lots of lies told about them in order to gain public acceptance. I was first made aware of this during an accounting class. We were discussing the subject of accounting for stock options at technology companies. […] One side (mainly technology companies and their lobbyists) held that stock option grants should not be treated as an expense on public policy grounds; treating them as an expense would discourage companies from granting them, and stock options were a vital compensation tool that incentivised performance, rewarded dynamism and innovation and created vast amounts of value for America and the world. The other side (mainly people like Warren Buffett) held that stock options looked awfully like a massive blag carried out by management at the expense of shareholders, and that the proper place to record such blags was the P&L account.

Our lecturer, in summing up the debate, made the not unreasonable point that if stock options really were a fantastic tool which unleashed the creative power in every employee, everyone would want to expense as many of them as possible, the better to boast about how innovative, empowered and fantastic they were. Since the tech companies' point of view appeared to be that if they were ever forced to account honestly for their option grants, they would quickly stop making them, this offered decent prima facie evidence that they weren't, really, all that fantastic.

However, I want to generalize the concept of confidence games from the domain of financial currency, to the domain of social credit more generally (of which money is a particular form that our society commonly uses), and in particular I want to talk about confidence games in the currency of credit for achievement.

If I were applying for a very important job with great responsibilities, such as President of the United States, CEO of a top corporation, or head or board member of a major AI research institution, I could be expected to have some relevant prior experience. For instance, I might have had some success managing a similar, smaller institution, or serving the same institution in a lesser capacity. More generally, when I make a bid for control over something, I am implicitly claiming that I have enough social credit – enough of a track record – that I can be expected to do good things with that control.

In general, if someone has done a lot, we should expect to see an iceberg pattern where a small easily-visible part suggests a lot of solid but harder-to-verify substance under the surface. One might be tempted to make a habit of imputing a much larger iceberg from the combination of a small floaty bit, and promises. But, a small easily-visible part with claims of a lot of harder-to-see substance is easy to mimic without actually doing the work. As Davies continues:

The Vital Importance of Audit. Emphasised over and over again. Brealey and Myers has a section on this, in which they remind callow students that like backing-up one's computer files, this is a lesson that everyone seems to have to learn the hard way. Basically, it's been shown time and again and again; companies which do not audit completed projects in order to see how accurate the original projections were, tend to get exactly the forecasts and projects that they deserve. Companies which have a culture where there are no consequences for making dishonest forecasts, get the projects they deserve. Companies which allocate blank cheques to management teams with a proven record of failure and mendacity, get what they deserve.

If you can independently put stickers on your own chart, then your chart is no longer reliably tracking something externally verified. If forecasts are not checked and tracked, or forecasters are not consequently held accountable for their forecasts, then there is no reason to believe that assessments of future, ongoing, or past programs are accurate. Adopting a wait-and-see attitude, insisting on audits for actual results (not just predictions) before investing more, will definitely slow down funding for good programs. But without it, most of your funding will go to worthless ones.

Open Philanthropy, OpenAI, and closed validation loops

The Open Philanthropy Project recently announced a $30 million grant to the $1 billion nonprofit AI research organization OpenAI. This is the largest single grant it has ever made. The main point of the grant is to buy influence over OpenAI’s future priorities; Holden Karnofsky, Executive Director of the Open Philanthropy Project, is getting a seat on OpenAI’s board as part of the deal. This marks the second major shift in focus for the Open Philanthropy Project.

The first shift (back when it was just called GiveWell) was from trying to find the best already-existing programs to fund (“passive funding”) to envisioning new programs and working with grantees to make them reality (“active funding”). The new shift is from funding specific programs at all, to trying to take control of programs without any specific plan.

To justify the passive funding stage, all you have to believe is that you can know better than other donors, among existing charities. For active funding, you have to believe that you’re smart enough to evaluate potential programs, just like a charity founder might, and pick ones that will outperform. But buying control implies that you think you’re so much better, that even before you’ve evaluated any programs, if someone’s doing something big, you ought to have a say.

When GiveWell moved from a passive to an active funding strategy, it was relying on the moral credit it had earned for its extensive and well-regarded charity evaluations. The thing that was particularly exciting about GiveWell was that they focused on outcomes and efficiency. They didn't just focus on the size or intensity of the problem a charity was addressing. They didn't just look at financial details like overhead ratios. They asked the question a consequentialist cares about: for a given expenditure of money, how much will this charity be able to improve outcomes?

However, when GiveWell tracks its impact, it does not track objective outcomes at all. It tracks inputs: attention received (in the form of visits to its website) and money moved on the basis of its recommendations. In other words, its estimate of its own impact is based on the level of trust people have placed in it.

So, as GiveWell built out the Open Philanthropy Project, its story was: We promised to do something great. As a result, we were entrusted with a fair amount of attention and money. Therefore, we should be given more responsibility. We represented our behavior as praiseworthy, and as a result people put stickers on our chart. For this reason, we should be advanced stickers against future days of praiseworthy behavior.

Then, as the Open Philanthropy Project explored active funding in more areas, its estimate of its own effectiveness grew. After all, it was funding more speculative, hard-to-measure programs, but a multi-billion-dollar donor, which was largely relying on the Open Philanthropy Project's opinions to assess efficacy (including its own efficacy), continued to trust it.

What is missing here is any objective track record of benefits. What this looks like to me, is a long sort of confidence game – or, using less morally loaded language, a venture with structural reliance on increasing amounts of leverage – in the currency of moral credit.

Version 0: GiveWell and passive funding

First, there was GiveWell. GiveWell’s purpose was to find and vet evidence-backed charities. However, it recognized that charities know their own business best. It wasn’t trying to do better than the charities; it was trying to do better than the typical charity donor, by being more discerning.

GiveWell’s thinking from this phase is exemplified by co-founder Elie Hassenfeld’s Six tips for giving like a pro:

When you give, give cash – no strings attached. You’re just a part-time donor, but the charity you’re supporting does this full-time and staff there probably know a lot more about how to do their job than you do. If you’ve found a charity that you feel is excellent – not just acceptable – then it makes sense to trust the charity to make good decisions about how to spend your money.

GiveWell similarly tried to avoid distorting charities’ behavior. Its job was only to evaluate, not to interfere. To perceive, not to act. To find the best, and buy more of the same.

How did GiveWell assess its effectiveness in this stage? When GiveWell evaluates charities, it estimates their cost-effectiveness in advance. It assesses the program the charity is running, through experimental evidence in the form of randomized controlled trials. GiveWell also audits the charity to make sure it's actually running the program, and to figure out how much the program costs as implemented. This is an excellent, evidence-based way to generate a prediction of how much good will be done by moving money to the charity.

As far as I can tell, these predictions are untested.

One of GiveWell’s early top charities was VillageReach, which helped Mozambique with TB immunization logistics. GiveWell estimated that VillageReach could save a life for $1,000. But this charity is no longer recommended. The public page says:

VillageReach (www.villagereach.org) was our top-rated organization for 2009, 2010 and much of 2011 and it has received over $2 million due to GiveWell's recommendation. In late 2011, we removed VillageReach from our top-rated list because we felt its project had limited room for more funding. As of November 2012, we believe that this project may have room for more funding, but we still prefer our current highest-rated charities above it.

GiveWell reanalyzed the data it based its recommendations on, but hasn’t published an after-the-fact retrospective of long-run results. I asked GiveWell about this by email. The response was that such an assessment was not prioritized because GiveWell had found implementation problems in VillageReach's scale-up work as well as reasons to doubt its original conclusion about the impact of the pilot program. It's unclear to me whether this has caused GiveWell to evaluate charities differently in the future.

I don't think someone looking at GiveWell's page on VillageReach would be likely to reach the conclusion that GiveWell now believes its original recommendation was likely erroneous. GiveWell's impact page continues to count money moved to VillageReach without any mention of the retracted recommendation. If we assume that the point of tracking money moved is to track the benefit of moving money from worse to better uses, then repudiated programs ought to be counted against the total, as costs, rather than towards it.

GiveWell has recommended the Against Malaria Foundation for the last several years as a top charity. AMF distributes long-lasting insecticide-treated bed nets to prevent mosquitos from transmitting malaria to humans. Its evaluation of AMF does not mention any direct evidence, positive or negative, about what happened to malaria rates in the areas where AMF operated. (There is a discussion of the evidence that the bed nets were in fact delivered and used.) In the supplementary information page, however, we are told:

Previously, AMF expected to collect data on malaria case rates from the regions in which it funded LLIN distributions: […] In 2016, AMF shared malaria case rate data […] but we have not prioritized analyzing it closely. AMF believes that this data is not high quality enough to reliably indicate actual trends in malaria case rates, so we do not believe that the fact that AMF collects malaria case rate data is a consideration in AMF’s favor, and do not plan to continue to track AMF's progress in collecting malaria case rate data.

The data was noisy, so they simply stopped checking whether AMF’s bed net distributions do anything about malaria.

If we want to know the size of the improvement made by GiveWell in the developing world, we have their predictions about cost-effectiveness, an audit trail verifying that work was performed, and their direct measurement of how much money people gave because they trusted GiveWell. The predictions on the final target – improved outcomes – have not been tested.

GiveWell is actually doing unusually well as far as major funders go. It sticks to describing things it's actually responsible for. By contrast, the Gates Foundation, in a report to Warren Buffett claiming to describe its impact, simply described overall improvement in the developing world, a very small rhetorical step from claiming credit for 100% of the improvement. GiveWell at least sticks to facts about GiveWell's own effects, and this is to its credit. But, it focuses on costs it has been able to impose, not benefits it has been able to create.

The Centre for Effective Altruism's William MacAskill made a related point back in 2012, though he talked about the lack of any sort of formal outside validation or audit, rather than focusing on empirical validation of outcomes:

As far as I know, GiveWell haven't commissioned a thorough external evaluation of their recommendations. […] This surprises me. Whereas businesses have a natural feedback mechanism, namely profit or loss, research often doesn't, hence the need for peer-review within academia. This concern, when it comes to charity-evaluation, is even greater. If GiveWell's analysis and recommendations had major flaws, or were systematically biased in some way, it would be challenging for outsiders to work this out without a thorough independent evaluation. Fortunately, GiveWell has the resources to, for example, employ two top development economists to each do an independent review of their recommendations and the supporting research. This would make their recommendations more robust at a reasonable cost.

GiveWell's page on self-evaluation says that it discontinued external reviews in August 2013. This page links to an explanation of the decision, which concludes:

We continue to believe that it is important to ensure that our work is subjected to in-depth scrutiny. However, at this time, the scrutiny we’re naturally receiving – combined with the high costs and limited capacity for formal external evaluation – make us inclined to postpone major effort on external evaluation for the time being.

That said,

  • If someone volunteered to do (or facilitate) formal external evaluation, we’d welcome this and would be happy to prominently post or link to criticism.
  • We do intend eventually to re-institute formal external evaluation.

Four years later, assessing the credibility of this assurance is left as an exercise for the reader.

Version 1: GiveWell Labs and active funding

Then there was GiveWell Labs, later called the Open Philanthropy Project. It looked into more potential philanthropic causes, where the evidence base might not be as cut-and-dried as that for the GiveWell top charities. One thing they learned was that in many areas, there simply weren’t shovel-ready programs ready for funding – a funder has to play a more active role. This shift was described by GiveWell co-founder Holden Karnofsky in his 2013 blog post, Challenges of passive funding:

By “passive funding,” I mean a dynamic in which the funder’s role is to review others’ proposals/ideas/arguments and pick which to fund, and by “active funding,” I mean a dynamic in which the funder’s role is to participate in – or lead – the development of a strategy, and find partners to “implement” it. Active funders, in other words, are participating at some level in “management” of partner organizations, whereas passive funders are merely choosing between plans that other nonprofits have already come up with.

My instinct is generally to try the most “passive” approach that’s feasible. Broadly speaking, it seems that a good partner organization will generally know their field and environment better than we do and therefore be best positioned to design strategy; in addition, I’d expect a project to go better when its implementer has fully bought into the plan as opposed to carrying out what the funder wants. However, (a) this philosophy seems to contrast heavily with how most existing major funders operate; (b) I’ve seen multiple reasons to believe the “active” approach may have more relative merits than we had originally anticipated. […]

  • In the nonprofit world of today, it seems to us that funder interests are major drivers of which ideas get proposed and fleshed out, and therefore, as a funder, it’s important to express interests rather than trying to be fully “passive.”
  • While we still wish to err on the side of being as “passive” as possible, we are recognizing the importance of clearly articulating our values/strategy, and also recognizing that an area can be underfunded even if we can’t easily find shovel-ready funding opportunities in it.

GiveWell earned some credibility from its novel, evidence-based outcome-oriented approach to charity evaluation. But this credibility was already – and still is – a sort of loan. We have GiveWell's predictions or promises of cost effectiveness in terms of outcomes, and we have figures for money moved, from which we can infer how much we were promised in improved outcomes. As far as I know, no one's gone back and checked whether those promises turned out to be true.

In the meantime, GiveWell then leveraged this credibility by extending its methods into more speculative domains, where less was checkable, and donors had to put more trust in the subjective judgment of GiveWell analysts. This was called GiveWell Labs. At the time, this sort of compounded leverage may have been sensible, but it's important to track whether a debt has been paid off or merely rolled over.

Version 2: The Open Philanthropy Project and control-seeking

Finally, the Open Philanthropy Project made its largest-ever single grant to purchase its founder a seat on a major organization’s board. This represents a transition from mere active funding to overtly purchasing influence:

The Open Philanthropy Project awarded a grant of $30 million ($10 million per year for 3 years) in general support to OpenAI. This grant initiates a partnership between the Open Philanthropy Project and OpenAI, in which Holden Karnofsky (Open Philanthropy’s Executive Director, “Holden” throughout this page) will join OpenAI’s Board of Directors and, jointly with one other Board member, oversee OpenAI’s safety and governance work.

We expect the primary benefits of this grant to stem from our partnership with OpenAI, rather than simply from contributing funding toward OpenAI’s work. While we would also expect general support for OpenAI to be likely beneficial on its own, the case for this grant hinges on the benefits we anticipate from our partnership, particularly the opportunity to help play a role in OpenAI’s approach to safety and governance issues.

Clearly the value proposition is not increasing available funds for OpenAI, if OpenAI’s founders’ billion-dollar commitment to it is real:

Sam, Greg, Elon, Reid Hoffman, Jessica Livingston, Peter Thiel, Amazon Web Services (AWS), Infosys, and YC Research are donating to support OpenAI. In total, these funders have committed $1 billion, although we expect to only spend a tiny fraction of this in the next few years.

The Open Philanthropy Project is neither using this money to fund programs that have a track record of working, nor to fund a specific program that it has prior reason to expect will do good. Rather, it is buying control, in the hope that Holden will be able to persuade OpenAI not to destroy the world, because he knows better than OpenAI’s founders.

How does the Open Philanthropy Project know that Holden knows better? Well, it’s done some active funding of programs it expects to work out. It expects those programs to work out because they were approved by a process similar to the one used by GiveWell to find charities that it expects to save lives.

If you want to acquire control over something, that implies that you think you can manage it more sensibly than whoever is in control already. Thus, buying control is a claim to have superior judgment - not just over others funding things (the original GiveWell pitch), but over those being funded.

In a footnote to the very post announcing the grant, the Open Philanthropy Project notes that it has historically tried to avoid acquiring leverage over organizations it supports, precisely because it’s not sure it knows better:

For now, we note that providing a high proportion of an organization’s funding may cause it to be dependent on us and accountable primarily to us. This may mean that we come to be seen as more responsible for its actions than we want to be; it can also mean we have to choose between providing bad and possibly distortive guidance/feedback (unbalanced by other stakeholders’ guidance/feedback) and leaving the organization with essentially no accountability.

This seems to describe two main problems introduced by becoming a dominant funder:

  1. People might accurately attribute causal responsibility for some of the organization's conduct to the Open Philanthropy Project.
  2. The Open Philanthropy Project might influence the organization to behave differently than it otherwise would.

The first seems obviously silly. I've been trying to correct the imbalance where Open Phil is criticized mainly when it makes grants, by criticizing it for holding onto too much money.

The second really is a cost as well as a benefit, and the Open Philanthropy Project has been absolutely correct to recognize this. This is the sort of thing GiveWell has consistently gotten right since the beginning and it deserves credit for making this principle clear and – until now – living up to it.

But discomfort with being dominant funders seems inconsistent with buying a board seat to influence OpenAI. If the Open Philanthropy Project thinks that Holden’s judgment is good enough that he should be in control, why only here? If he thinks that other Open Philanthropy Project AI safety grantees have good judgment but OpenAI doesn’t, why not give them similar amounts of money free of strings to spend at their discretion and see what happens? Why not buy people like Eliezer Yudkowsky, Nick Bostrom, or Stuart Russell a seat on OpenAI’s board?

On the other hand, the Open Philanthropy Project is right on the merits here with respect to safe superintelligence development. Openness makes sense for weak AI, but if you’re building true strong AI you want to make sure you’re cooperating with all the other teams in a single closed effort. I agree with the Open Philanthropy Project’s assessment of the relevant risks. But it's not clear to me how often joining the bad guys to prevent their worst excesses is a good strategy, and it seems like it must often be a mistake. Still, I’m mindful of heroes like John Rabe, Chiune Sugihara, and Oskar Schindler. And if I think someone has a good idea for improving things, it makes sense to reallocate control from people who have worse ideas, even if there's some potential better allocation.

On the other hand, is Holden Karnofsky the right person to do this? The case is mixed.

He listens to and engages with the arguments from principled advocates for AI safety research, such as Nick Bostrom, Eliezer Yudkowsky, and Stuart Russell. This is a point in his favor. But, I can think of other people who engage with such arguments. For instance, OpenAI founder Elon Musk has publicly praised Bostrom’s book Superintelligence, and founder Sam Altman has written two blog posts summarizing concerns about AI safety reasonably cogently. Altman even asked Luke Muehlhauser, former executive director of MIRI, for feedback pre-publication. He's met with Nick Bostrom. That suggests a substantial level of direct engagement with the field, although Holden has engaged for a longer time, more extensively, and more directly.

Another point in Holden’s favor, from my perspective, is that under his leadership, the Open Philanthropy Project has funded the most serious-seeming programs for both weak and strong AI safety research. But Musk also managed to (indirectly) fund AI safety research at MIRI and by Nick Bostrom personally, via his $10 million FLI grant.

The Open Philanthropy Project also says that it expects to learn a lot about AI research from this, which will help it make better decisions on AI risk in the future and influence the field in the right way. This is reasonable as far as it goes. But remember that the case for positioning the Open Philanthropy Project to do this relies on the assumption that the Open Philanthropy Project will improve matters by becoming a central influencer in this field. This move is consistent with reaching that goal, but it is not independent evidence that the goal is the right one.

Overall, there are good narrow reasons to think that this is a potential improvement over the prior situation around OpenAI – but only a small and ill-defined improvement, at considerable attentional cost, and with the offsetting potential harm of increasing OpenAI's perceived legitimacy as a long-run AI safety organization.

And it’s worrying that Open Philanthropy Project’s largest grant – not just for AI risk, but ever (aside from GiveWell Top Charity funding) – is being made to an organization at which Holden’s housemate and future brother-in-law is a leading researcher. The nepotism argument is not my central objection. If I otherwise thought the grant were obviously a good idea, it wouldn’t worry me, because it’s natural for people with shared values and outlooks to become close nonprofessionally as well. But in the absence of a clear compelling specific case for the grant, it’s worrying.

Altogether, I'm not saying this is an unreasonable shift, considered in isolation. I’m not even sure this is a bad thing for the Open Philanthropy Project to be doing – insiders may have information that I don’t, and that is difficult to communicate to outsiders. But as outsiders, there comes a point when someone’s maxed out their moral credit, and we should wait for results before actively trying to entrust the Open Philanthropy Project and its staff with more responsibility.

EA Funds and self-recommendation

The Centre for Effective Altruism is actively trying to entrust the Open Philanthropy Project and its staff with more responsibility.

The concerns of CEA’s CEO William MacAskill about GiveWell have, as far as I can tell, never been addressed, and the underlying issues have only become more acute. But CEA is now working to put more money under the control of Open Philanthropy Project staff, through its new EA Funds product – a way for supporters to delegate giving decisions to expert EA “fund managers” by giving to one of four funds: Global Health and Development, Animal Welfare, Long-Term Future, and Effective Altruism Community.

The Effective Altruism movement began by saying that because very poor people exist, we should reallocate money from ordinary people in the developed world to the global poor. Now the pitch is in effect that because very poor people exist, we should reallocate money from ordinary people in the developed world to the extremely wealthy. This is a strange and surprising place to end up, and it’s worth retracing our steps. Again, I find it easiest to think of three stages:

  1. Money can go much farther in the developing world. Here, we’ve found some examples for you. As a result, you can do a huge amount of good by giving away a large share of your income, so you ought to.
  2. We’ve found ways for you to do a huge amount of good by giving away a large share of your income for developing-world interventions, so you ought to trust our recommendations. You ought to give a large share of your income to these weird things our friends are doing that are even better, or join our friends.
  3. We’ve found ways for you to do a huge amount of good by funding weird things our friends are doing, so you ought to trust the people we trust. You ought to give a large share of your income to a multi-billion-dollar foundation that funds such things.

Stage 1: The direct pitch

At first, Giving What We Can (the organization that eventually became CEA) had a simple, easy to understand pitch:

Giving What We Can is the brainchild of Toby Ord, a philosopher at Balliol College, Oxford. Inspired by the ideas of ethicists Peter Singer and Thomas Pogge, Toby decided in 2009 to commit a large proportion of his income to charities that effectively alleviate poverty in the developing world.

[…]

Discovering that many of his friends and colleagues were interested in making a similar pledge, Toby worked with fellow Oxford philosopher Will MacAskill to create an international organization of people who would donate a significant proportion of their income to cost-effective charities.

Giving What We Can launched in November 2009, attracting significant media attention. Within a year, 64 people had joined the society, their pledged donations amounting to $21 million. Initially run on a volunteer basis, Giving What We Can took on full-time staff in the summer of 2012.

In effect, its argument was: "Look, you can do huge amounts of good by giving to people in the developing world. Here are some examples of charities that do that. It seems like a great idea to give 10% of our income to those charities."

GWWC was a simple product, with a clear, limited scope. Its founders believed that people, including them, ought to do a thing – so they argued directly for that thing, using the arguments that had persuaded them. If it wasn't for you, it was easy to figure that out; but a surprisingly large number of people were persuaded by a simple, direct statement of the argument, took the pledge, and gave a lot of money to charities helping the world's poorest.

Stage 2: Rhetoric and belief diverge

Then, GWWC staff were persuaded you could do even more good with your money in areas other than developing-world charity, such as existential risk mitigation. Encouraging donations and work in these areas became part of the broader Effective Altruism movement, and GWWC's umbrella organization was named the Centre for Effective Altruism. So far, so good.

But this left Effective Altruism in an awkward position; while leadership often personally believe the most effective way to do good is far-future stuff or similarly weird-sounding things, many people who can see the merits of the developing-world charity argument reject the argument that because the vast majority of people live in the far future, even a very small improvement in humanity’s long-run prospects outweighs huge improvements on the global poverty front. They also often reject similar scope-sensitive arguments for things like animal charities.

Giving What We Can's page on what we can achieve still focuses on global poverty, because developing-world charity is easier to explain persuasively. However, EA leadership tends to privately focus on things like AI risk. Two years ago many attendees at the EA Global conference in the San Francisco Bay Area were surprised that the conference focused so heavily on AI risk, rather than the global poverty interventions they’d expected.

Stage 3: Effective altruism is self-recommending

Shortly before the launch of the EA Funds I was told in informal conversations that they were a response to demand: Giving What We Can pledge-takers and other EA donors had told CEA that they trusted it to direct their donations. CEA was responding by creating a product for the people who wanted it.

This seemed pretty reasonable to me, and on the whole good. If someone wants to trust you with their money, and you think you can do something good with it, you might as well take it, because they’re estimating your skill above theirs. But not everyone agrees, and as the Madoff case demonstrates, "people are begging me to take their money" is not a definitive argument that you are doing anything real.

In practice, the funds are managed by Open Philanthropy Project staff:

We want to keep this idea as simple as possible to begin with, so we’ll have just four funds, with the following managers:

  • Global Health and Development - Elie Hassenfeld
  • Animal Welfare – Lewis Bollard
  • Long-run future – Nick Beckstead
  • Movement-building – Nick Beckstead

(Note that the meta-charity fund will be able to fund CEA; and note that Nick Beckstead is a Trustee of CEA. The long-run future fund and the meta-charity fund continue the work that Nick has been doing running the EA Giving Fund.)

It’s not a coincidence that all the fund managers work for GiveWell or Open Philanthropy.  First, these are the organisations whose charity evaluation we respect the most. The worst-case scenario, where your donation just adds to the Open Philanthropy funding within a particular area, is therefore still a great outcome.  Second, they have the best information available about what grants Open Philanthropy are planning to make, so have a good understanding of where the remaining funding gaps are, in case they feel they can use the money in the EA Fund to fill a gap that they feel is important, but isn’t currently addressed by Open Philanthropy.

In past years, Giving What We Can recommendations have largely overlapped with GiveWell’s top charities.

In the comments on the launch announcement on the EA Forum, several people (including me) pointed out that the Open Philanthropy Project seems to be having trouble giving away even the money it already has, so it seems odd to direct more money to Open Philanthropy Project decisionmakers. CEA’s senior marketing manager replied that the Funds were a minimum viable product to test the concept:

I don't think the long-term goal is that OpenPhil program officers are the only fund managers. Working with them was the best way to get an MVP version in place.

This also seemed okay to me, and I said so at the time.

[NOTE: I've edited the next paragraph to excise some unreliable information. Sorry for the error, and thanks to Rob Wiblin for pointing it out.]

After they were launched, though, I saw phrasings that were not so cautious at all, instead making claims that this was generally a better way to give. As of writing this, if someone on the effectivealtruism.org website clicks on "Donate Effectively" they will be led directly to a page promoting EA Funds.

This is not a response to demand, it is an attempt to create demand by using CEA's authority, telling people that the funds are better than what they're doing already. By contrast, GiveWell's Top Charities page simply says:

Our top charities are evidence-backed, thoroughly vetted, underfunded organizations.

This carefully avoids any overt claim that they're the highest-impact option available to donors. GiveWell avoids saying that because there's no way they could know it, so saying it wouldn't be truthful.

A marketing email might have just been dashed off quickly, and an exaggerated wording might just have been an oversight. But when I looked at Giving What We Can’s top charities page in early April, it recommended the EA Funds "as the highest impact option for donors."

The wording has since been qualified with “for most donors”, which is a good change. But the thing I’m worried about isn’t just the explicit exaggerated claims – it’s the underlying marketing mindset that made them seem like a good idea in the first place. EA seems to have switched from an endorsement of the best things outside itself, to an endorsement of itself. And it's concentrating decisionmaking power in the Open Philanthropy Project.

Effective altruism is overextended, but it doesn't have to be

There is a saying in finance that was old even back when Keynes said it. If you owe the bank a million dollars, then you have a problem. If you owe the bank a billion dollars, then the bank has a problem.

In other words, if someone extends you a level of trust they could survive writing off, then they might call in that loan. As a result, they have leverage over you. But if they overextend, putting all their eggs in one basket, and you are that basket, then you have leverage over them; you're too big to fail. Letting you fail would be so disastrous for their interests that you can extract nearly arbitrary concessions from them, including further investment. For this reason, successful institutions often try to diversify their investments, and avoid overextending themselves. Regulators, for the same reason, try to prevent banks from becoming "too big to fail."

The Effective Altruism movement is concentrating decisionmaking power and trust as much as possible, in a way that's setting itself up to invest ever increasing amounts of confidence to keep the game going.

The alternative is to keep the scope of each organization narrow, overtly ask for trust for each venture separately, and make it clear what sorts of programs are being funded. For instance, Giving What We Can should go back to its initial focus of global poverty relief.

Like many EA leaders, I happen to believe that anything you can do to steer the far future in a better direction is much, much more consequential for the well-being of sentient creatures than any purely short-run improvement you can create now. So it might seem odd that I think Giving What We Can should stay focused on global poverty. But, I believe that the single most important thing we can do to improve the far future is hold onto our ability to accurately build shared models. If we use bait-and-switch tactics, we are actively eroding the most important type of capital we have – coordination capacity.

If you do not think giving 10% of one's income to global poverty charities is the right thing to do, then you can't in full integrity urge others to do it – so you should stop. You might still believe that GWWC ought to exist. You might still believe that it is a positive good to encourage people to give much of their income to help the global poor, if they wouldn't have been doing anything else especially effective with the money. If so, and you happen to find yourself in charge of an organization like Giving What We Can, the thing to do is write a letter to GWWC members telling them that you've changed your mind, and why, and offering to give away the brand to whoever seems best able to honestly maintain it.

If someone at the Centre for Effective Altruism fully believes in GWWC's original mission, then that might make the transition easier. If not, then one still has to tell the truth and do what's right.

And what of the EA Funds? The Long-Term Future Fund is run by Open Philanthropy Project Program Officer Nick Beckstead. If you think that it's a good thing to delegate giving decisions to Nick, then I would agree with you. Nick's a great guy! I'm always happy to see him when he shows up at house parties. He's smart, and he actively seeks out arguments against his current point of view. But the right thing to do, if you want to persuade people to delegate their giving decisions to Nick Beckstead, is to make a principled case for delegating giving decisions to Nick Beckstead. If the Centre for Effective Altruism did that, then Nick would almost certainly feel more free to allocate funds to the best things he knows about, not just the best things he suspects EA Funds donors would be able to understand and agree with.

If you can't directly persuade people, then maybe you're wrong. If the problem is inferential distance, then you've got some work to do bridging that gap.

There's nothing wrong with setting up a fund to make it easy. It's actually a really good idea. But there is something wrong with the multiple layers of vague indirection involved in the current marketing of the Far Future fund – using global poverty to sell the generic idea of doing the most good, then using CEA's identity as the organization in charge of doing the most good to persuade people to delegate their giving decisions to it, and then sending their money to some dude at the multi-billion-dollar foundation to give away at his personal discretion. The same argument applies to all four Funds.

Likewise, if you think that working directly on AI risk is the most important thing, then you should make arguments directly for working on AI risk. If you can't directly persuade people, then maybe you're wrong. If the problem is inferential distance, it might make sense to imitate the example of someone like Eliezer Yudkowsky, who used indirect methods to bridge the inferential gap by writing extensively on individual human rationality, and did not try to control others' actions in the meantime.

If Holden thinks he should be in charge of some AI safety research, then he should ask Good Ventures for funds to actually start an AI safety research organization. I'd be excited to see what he'd come up with if he had full control of and responsibility for such an organization. But I don't think anyone has a good plan to work directly on AI risk, and I don't have one either, which is why I'm not directly working on it or funding it. My plan for improving the far future is to build human coordination capacity.

(If, by contrast, Holden just thinks there needs to be coordination between different AI safety organizations, the obvious thing to do would be to work with FLI on that, e.g. by giving them enough money to throw their weight around as a funder. They organized the successful Puerto Rico conference, after all.)

Another thing that would be encouraging would be if at least one of the Funds were not administered entirely by an Open Philanthropy Project staffer, and ideally an expert who doesn't benefit from the halo of "being an EA." For instance, Chris Blattman is a development economist with experience designing programs that don't just use but generate evidence on what works. When people were arguing about whether sweatshops are good or bad for the global poor, he actually went and looked by performing a randomized controlled trial. He's leading two new initiatives with J-PAL and IPA, and expects that directors designing studies will also have to spend time fundraising. Having funding lined up seems like the sort of thing that would let them spend more time actually running programs. And more generally, he seems likely to know about funding opportunities the Open Philanthropy Project doesn't, simply because he's embedded in a slightly different part of the global health and development network.

Narrower projects that rely less on the EA brand and more on what they're actually doing, and more cooperation on equal terms with outsiders who seem to be doing something good already, would do a lot to help EA grow beyond putting stickers on its own behavior chart. I'd like to see EA grow up. I'd be excited to see what it might do.

Summary

  1. Good programs don't need to distort the story people tell about them, while bad programs do.
  2. Moral confidence games – treating past promises and trust as a track record to justify more trust – are an example of the kind of distortion mentioned in (1), that benefits bad programs more than good ones.
  3. The Open Philanthropy Project's OpenAI grant represents a shift from evaluating other programs' effectiveness, to assuming its own effectiveness.
  4. EA Funds represents a shift from EA evaluating programs' effectiveness, to assuming EA's effectiveness.
  5. A shift from evaluating other programs' effectiveness, to assuming one's own effectiveness, is an example of the kind of "moral confidence game" mentioned in (2).
  6. EA ought to focus on scope-limited projects, so that it can directly make the case for those particular projects instead of relying on EA identity as a reason to support an EA organization.
  7. EA organizations ought to entrust more responsibility to outsiders who seem to be doing good things but don't overtly identify as EA, instead of trying to keep it all in the family.
(Cross-posted at my personal blog.)

80,000 Hours: EA and Highly Political Causes

30 The_Jaded_One 26 January 2017 09:44PM

This post is now crossposted to the EA Forum.

80,000 Hours is a well-known Effective Altruism organisation which does "in-depth research alongside academics at Oxford into how graduates can make the biggest difference possible with their careers".

They recently posted a guide to donating which aims, in their words, to (my emphasis)

use evidence and careful reasoning to work out how to best promote the wellbeing of all. To find the highest-impact charities this giving season ... We ... summed up the main recommendations by area below

Looking below, we find a section on the problem area of criminal justice (US-focused), an area whose aim is outlined as follows (quoting from the Open Philanthropy "problem area" page):

investing in criminal justice policy and practice reforms to substantially reduce incarceration while maintaining public safety. 

Reducing incarceration whilst maintaining public safety seems like a reasonable EA cause, if we interpret "public safety" in a broad sense - that is, keep fewer people in prison whilst still getting almost all of the benefits of incarceration such as deterrent effects, prevention of crime, etc.

So what are the recommended charities? (my emphasis below)

1. Alliance for Safety and Justice 

"The Alliance for Safety and Justice is a US organization that aims to reduce incarceration and racial disparities in incarceration in states across the country, and replace mass incarceration with new safety priorities that prioritize prevention and protect low-income communities of color."  

They promote an article on their site called "black wounds matter", as well as a guide on how you can "Apply for VOCA Funding: A Toolkit for Organizations Working With Crime Survivors in Communities of Color and Other Underserved Communities".

2. Cosecha - (note that their URL is www.lahuelga.com, which means "the strike" in Spanish) (my emphasis below)

"Cosecha is a group organizing undocumented immigrants in 50-60 cities around the country. Its goal is to build mass popular support for undocumented immigrants, in resistance to incarceration/detention, deportation, denigration of rights, and discrimination. The group has become especially active since the Presidential election, given the immediate threat of mass incarceration and deportation of millions of people."

Cosecha have a footprint in the news, for example this article:

They have the ultimate goal of launching massive civil resistance and non-cooperation to show this country it depends on us ...  if they wage a general strike of five to eight million workers for seven days, we think the economy of this country would not be able to sustain itself 

The article quotes Carlos Saavedra, who is directly mentioned by Open Philanthropy's Chloe Cockburn:

"Carlos Saavedra, who leads Cosecha, stands out as an organizer who is devoted to testing and improving his methods, ... Cosecha can do a lot of good to prevent mass deportations and incarceration, I think his work is a good fit for likely readers of this post."

They mention other charities elsewhere on their site and in their writeup on the subject, such as the conservative Center for Criminal Justice Reform, but Cosecha and the Alliance for Safety and Justice are the ones that were chosen as "highest impact" and featured in the guide to donating.

 


 

Sometimes one has to be blunt: 80,000 Hours is promoting the financial support of some extremely hot-button political causes, which may not be a good idea. Traditionalists/conservatives and those who are uninitiated to Social Justice ideology might look at The Alliance for Safety and Justice and Cosecha and label them as racists and criminals, and thereby be turned off by Effective Altruism, or even by the rationality movement as a whole.

There are standard arguments, for example this one by Robin Hanson from 10 years ago, about why it is not smart or "effective" to get into these political tugs-of-war if one wants to make a genuine difference in the world.

One could also argue that 80,000 Hours' chosen charities go beyond the usual folly of political tugs-of-war. In addition to supporting extremely political causes, 80,000 Hours could be accused of being somewhat intellectually dishonest about what goal they are actually trying to further.

Consider The Alliance for Safety and Justice. 80,000 Hours state that the goal of their work in the criminal justice problem area is to "substantially reduce incarceration while maintaining public safety". This is an abstract goal with very broad appeal, and one that I am sure almost everyone agrees with. But their more concrete policy in this area is to fund a charity that wants to "reduce racial disparities in incarceration" and "protect low-income communities of color". The latter is significantly different to the former - it isn't even close to being the same thing - and the difference is highly political. One could object that reducing racial disparities in incarceration is merely a means to the end of substantially reducing incarceration while maintaining public safety, since many people in prison in the US are "of color". However, this line of argument is a very politicized one, and it might be wrong, or at least I don't see strong support for it. "Selectively release people of color and make society safer - endorsed by effective altruists!" struggles against known facts about recidivism rates across races, as well as an objection about the implicit conflation of equality of outcome and equality of opportunity. (And I do not want this to be interpreted as a claim of moral superiority of one race over others - merely a necessary exercise in coming to terms with facts and debunking implicit assumptions.) Males are incarcerated much more than women, so what about reducing gender disparities in incarceration, whilst also maintaining public safety? Again, this is all highly political, laden with politicized implicit assumptions and language.

Cosecha is worse! They are actively planning potentially illegal activities like helping illegal immigrants evade the law (though IANAL), as well as activities which could harm the majority of US citizens, such as a seven-day nationwide strike whose intent is to damage the economy. Their URL is "The Strike" in Spanish.

Again, the abstract goal is extremely attractive to almost anyone, but the concrete implementation is highly divisive. If some conservative altruist signed up to financially or morally support the abstract goal of "substantially reducing incarceration while maintaining public safety", and the EA organisations pursuing that goal, without reading the details, and then at a later point saw the details of Cosecha and The Alliance for Safety and Justice, they would rightly feel cheated. And to the objection that conservative altruists should read the description rather than just the heading - what are we doing writing headings so misleading that you'd feel cheated if you relied on them as summaries of the activity they are meant to summarize?

 


 

One possibility would be for 80,000 Hours to be much more upfront about what they are trying to achieve here - maybe they like left-wing social justice causes, and want to help like-minded people donate money to such causes and help the particular groups who are favored in those circles. There's almost a nod and a wink to this when Chloe Cockburn says (my paraphrase of Saavedra, and emphasis, below):

I think his [A man who wants to lead a general strike of five to eight million workers for seven days so that the economy of the USA would not be able to sustain itself, in order to help illegal immigrants] work is a good fit for likely readers of this post

Alternatively, they could try to reinvigorate the idea that their "criminal justice" problem area is politically neutral and beneficial to everyone; the Open Philanthropy issue writeup talks about "conservative interest in what has traditionally been a solely liberal cause", after all. I would advise considering dropping The Alliance for Safety and Justice and Cosecha if they intend to do this. There may not be politically neutral charities in this area, or there may not be enough high-quality conservative charities to present a politically balanced set of recommendations. Setting up a growing donor-advised fund, or a prize for nonpartisan progress that genuinely intends to benefit everyone - including conservatives, people opposed to illegal immigration, and people who are not "of color" - might be an option to consider.

We could examine 80,000 Hours' choice to back these organisations from a more overall-utilitarian/overall-effectiveness point of view, rather than limiting the analysis to the specific problem area. These two charities don't pass the smell test for altruistic consequentialism - pulling sideways on ropes, finding hidden levers that others are ignoring, etc. Is the best thing you can do with your smart EA money helping a charity that wants to get stuck into the culture war about which skin color is most over-represented in prisons? What about a second charity that wants to help people illegally immigrate at a time when immigration is the most divisive political topic in the western world?

Furthermore, Cosecha's plans for a nationwide strike and potential civil disobedience/showdown with Trump & co could push an already volatile situation in the US into something extremely ugly. The vast majority of people in the world (present and future) are not in the specific group that Cosecha aims to help, but the set of people who could be harmed by the uglier versions of a violent and calamitous showdown in the US is basically the whole world. That means that even if P(Cosecha persuades Trump to do a U-turn on illegals) is 10 or 100 times greater than P(Cosecha precipitates a violent crisis in the USA), they may still be net-negative from an expected-utility point of view. EA doesn't usually fund causes whose outcome distribution is heavily left-skewed, so this argument is a bit unusual to have to make, but there it is.
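
To make the structure of that expected-value argument concrete, here is a minimal sketch in Python with purely hypothetical numbers - the probabilities and magnitudes below are illustrative assumptions, not estimates from this post or from any real source:

# Purely hypothetical numbers, chosen only to illustrate the shape of the argument.
p_good = 0.01    # assumed P(the campaign succeeds and prevents mass deportations)
benefit = 1e7    # assumed number of people substantially helped in that case
p_bad = 0.0005   # assumed P(the campaign helps precipitate a violent crisis), 20x smaller
harm = 7e9       # assumed number of people exposed to serious harm in that case

# Expected value in crude "people helped minus people harmed" units.
expected_value = p_good * benefit - p_bad * harm
print(expected_value)  # negative here, even though p_good is 20 times larger than p_bad

The point is only that when the bad outcome is vastly larger in scale, a much higher probability of the good outcome need not be enough to make the expected value positive.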

Not only is Cosecha a cause that is (a) mind-killing and culture-war-ish and (b) very tangentially related to the actual problem area it is advertised under by 80,000 Hours, but it might also (c) be an anti-charity that produces net disutility (in expectation), in the form of a higher probability of a US civil war, with the money that you donate to it.

Back on the topic of criminal justice and incarceration: opposition to reform often comes from conservative voters and politicians, so it might seem unlikely to a careful thinker that extra money on the left-wing side is going to be highly effective. Some intellectual judo is required; make conservatives think that it was their idea all along. So promoting the Center for Criminal Justice Reform sounds like the kind of smart, against-the-grain idea that might be highly effective! Well done, Open Philanthropy! Also in favor of this org: they don't copiously mention which races or person-categories they think are most important in their articles about criminal justice reform, the only culture-war item I could find on them is the word "conservative" (and given the intellectual judo argument above, this counts as a plus), and they're not planning a national strike or other action with a heavy tail risk. But that's the one that didn't make the cut for the 80,000 Hours guide to donating!

The fact that they let Cosecha (and to a lesser extent The Alliance for Safety and Justice) through reduces my confidence in 80,000 Hours and the EA movement as a whole. Who thought it would be a good idea to get EA into the culture war with these causes, and also thought that they were plausibly among the most effective things you can do with money? Are they taking effectiveness seriously? What does the political diversity of meetings at 80,000 Hours look like? Were there no conservative altruists present in discussions surrounding The Alliance for Safety and Justice and Cosecha, and the promotion of them as "beneficial for everyone" and "effective"?

Before we finish, I want to emphasize that this post is not intended to start an object-level discussion about which race, gender, political movement or sexual orientation is cooler, and I would encourage moderators to temp-ban people who try to have that kind of argument in the comments of this post.

I also want to emphasize that criticism of professional altruists is a necessary evil; in an ideal world the only thing I would ever want to say to people who dedicate their lives to helping others (Chloe Cockburn in particular, since I mentioned her name above) is "thank you, you're amazing". Other than that, comments and criticism are welcome, especially anything pointing out any inaccuracies or misunderstandings in this post. Comments from anyone involved in 80,000 Hours or Open Philanthropy are welcome.

Allegory On AI Risk, Game Theory, and Mithril

25 James_Miller 13 February 2017 08:41PM

“Thorin, I can’t accept your generous job offer because, honestly, I think that your company might destroy Middle Earth.”  

 

“Bifur, I can tell that you’re one of those “the Balrog is real, evil, and near” folks who thinks that in the next few decades Mithril miners will dig deep enough to wake the Balrog causing him to rise and destroy Middle Earth.  Let’s say for the sake of argument that you’re right.  You must know that lots of people disagree with you.  Some don’t believe in the Balrog, others think that anything that powerful will inevitably be good, and more think we are hundreds or even thousands of years away from being able to disturb any possible Balrog.  These other dwarves are not going to stop mining, especially given the value of Mithril.  If you’re right about the Balrog we are doomed regardless of what you do, so why not have a high paying career as a Mithril miner and enjoy yourself while you can?”  

 

“But Thorin, if everyone thought that way we would be doomed!”

 

“Exactly, so make the most of what little remains of your life.”

 

“Thorin, what if I could somehow convince everyone that I’m right about the Balrog?”

 

“You can’t because, as the wise Sinclair said, ‘It is difficult to get a dwarf to understand something, when his salary depends upon his not understanding it!’  But even if you could, it still wouldn’t matter.  Each individual miner would correctly realize that just him alone mining Mithril is extraordinarily unlikely to be the cause of the Balrog awakening, and so he would find it in his self-interest to mine.  And, knowing that others are going to continue to extract Mithril means that it really doesn’t matter if you mine because if we are close to disturbing the Balrog he will be awoken.” 

 

“But dwarves can’t be that selfish, can they?”  

 

“Actually, altruism could doom us as well.  Given Mithril’s enormous military value, many cities rightly fear that without new supplies they will be at the mercy of cities that get more of this metal, especially as it’s known that the deeper Mithril is found, the greater its powers.  Leaders who care about their citizens’ safety and freedom will keep mining Mithril.  If we are soon all going to die, altruistic leaders will want to make sure their people die while still free citizens of Middle Earth.”

 

“But couldn’t we all coordinate to stop mining?  This would be in our collective interest.”

 

“No, dwarves would cheat, rightly realizing that if they alone mine just a little bit more Mithril, it’s highly unlikely to do anything to the Balrog; and the more you expect others to cheat, the less your own cheating matters as to whether the Balrog gets us, if your assumptions about the Balrog are correct.”

 

“OK, but won’t the rich dwarves step in and eventually stop the mining?  They surely don’t want to get eaten by the Balrog.”   

 

“Actually, they have just started an open Mithril mining initiative which will find and then freely disseminate new and improved Mithril mining technology.  These dwarves earned their wealth through Mithril, they love Mithril, and while some of them can theoretically understand how Mithril mining might be bad, they can’t emotionally accept that their life’s work, the acts that have given them enormous success and status, might significantly hasten our annihilation.”

 

“Won’t the dwarven kings save us?  After all, their primary job is to protect their realms from monsters.”

 

“Ha!  They are more likely to subsidize Mithril mining than to stop it.  Their military machines need Mithril, and any king who prevented his people from getting new Mithril just to stop some hypothetical Balrog from rising would be laughed out of office.  The common dwarf simply doesn’t have the expertise to evaluate the legitimacy of the Balrog claims and so rightly, from their viewpoint at least, would use the absurdity heuristic to dismiss any Balrog worries.  Plus, remember that the kings compete with each other for the loyalty of dwarves, and even if a few kings came to believe in the dangers posed by the Balrog, they would realize that if they tried to impose costs on their people, they would be outcompeted by fellow kings who didn’t try to restrict Mithril mining.  Bifur, the best you can hope for with the kings is that they don’t do too much to accelerate Mithril mining.”

 

“Well, at least if I don’t do any mining it will take a bit longer for miners to awaken the Balrog.”

 

“No Bifur, you obviously have never considered the economics of mining.  You see, if you don’t take this job someone else will.  Companies such as ours hire the optimal number of Mithril miners to maximize our profits and this number won’t change if you turn down our offer.”

 

“But it takes a long time to train a miner.  If I refuse to work for you, you might have to wait a bit before hiring someone else.”

 

“Bifur, what job will you likely take if you don’t mine Mithril?”

 

“Gold mining.”

 

“Mining gold and Mithril require similar skills.  If you get a job working for a gold mining company, this firm would hire one less dwarf than it otherwise would and this dwarf’s time will be freed up to mine Mithril.  If you consider the marginal impact of your actions, you will see that working for us really doesn’t hasten the end of the world even under your Balrog assumptions.”  

 

“OK, but I still don’t want to play any part in the destruction of the world, so I refuse to work for you even if this won’t do anything to delay when the Balrog destroys us.”

 

“Bifur, focus on the marginal consequences of your actions and don’t let your moral purity concerns cause you to make the situation worse.  We’ve established that your turning down the job will do nothing to delay the Balrog.  It will, however, cause you to earn a lower income.  You could have donated that income to the needy, or even used it to hire a wizard to work on an admittedly long-shot, Balrog control spell.  Mining Mithril is both in your self-interest and is what’s best for Middle Earth.” 


Straw Hufflepuffs and Lone Heroes

22 Raemon 16 April 2017 11:48PM
I was hoping the next Project Hufflepuff post would involve more "explain concretely what I think we should do", but as it turns out I'm still hashing out some thoughts about that. In the meanwhile, this is the post I actually have ready to go, which is as good as any to post for now.

Epistemic Status: Mythmaking. This is tailored for the sort of person for whom the "Lone Hero" mindset is attractive. If that isn't something you're concerned with and this post feels irrelevant or missing some important things, note that my vision for Project Hufflepuff has multiple facets and I expect different people to approach it in different ways.

The Berkeley Hufflepuff Unconference is on April 28th. RSVPing on this Facebook Event is helpful, as is filling out this form.



For good or for ill, the founding mythology of our community is a Harry Potter fanfiction.

This has a few ramifications I’ll delve into at some point, but the most pertinent bit is: for a community to change itself, the impulse to change needs to come from within the community. I think it’s easier to build change off of stories that are already a part of our cultural identity.*

* with an understanding that maybe part of the problem is that our cultural identity needs to change, or be more accessible, but I’m running with this mythos for the time being.

In J.K. Rowling’s original Harry Potter story, Hufflepuffs are treated like “generic background characters” at best and as a joke at worst. All the main characters are Gryffindors, courageous and true. All the bad guys are Slytherin. And this is strange - Rowling clearly was setting out to create a complex world with nuanced virtues and vices. But it almost seems to me like Rowling’s story takes place in an alternate, explicitly “Pro-Gryffindor propaganda” universe instead of the “real” Harry Potter world.

People have trouble taking Hufflepuff seriously, because they’ve never actually seen the real thing - only lame, strawman caricatures.

Harry Potter and the Methods of Rationality is… well, Pro-Ravenclaw propaganda. But part of being Ravenclaw is trying to understand things, and to use that knowledge. Eliezer makes an earnest effort to steelman each house. What wisdom does it offer that actually makes sense? What virtues does it cultivate that are rare and valuable?

When Harry goes under the sorting hat, it actually tries to convince him not to go into Ravenclaw, and specifically pushes towards Hufflepuff House:

Where would I go, if not Ravenclaw?

"Ahem. 'Clever kids in Ravenclaw, evil kids in Slytherin, wannabe heroes in Gryffindor, and everyone who does the actual work in Hufflepuff.' This indicates a certain amount of respect. You are well aware that Conscientiousness is just about as important as raw intelligence in determining life outcomes, you think you will be extremely loyal to your friends if you ever have some, you are not frightened by the expectation that your chosen scientific problems may take decades to solve -"

I'm lazy! I hate work! Hate hard work in all its forms! Clever shortcuts, that's all I'm about!

"And you would find loyalty and friendship in Hufflepuff, a camaraderie that you have never had before. You would find that you could rely on others, and that would heal something inside you that is broken."

But my plans -

"So replan! Don't let your life be steered by your reluctance to do a little extra thinking. You know that."

In the end, Harry chooses to go to Ravenclaw - the obvious house, the place that seemed most straightforward and comfortable. And ultimately… a hundred+ chapters later, I think he’s still visibly lacking in the strengths that Hufflepuff might have helped him develop. 

He does work hard and is incredibly loyal to his friends… but he operates in a fundamentally lone-wolf mindset. He’s still manipulating people for their own good. He’s still too caught up in his own cleverness. He never really has true friends other than Hermione, and when she is unable to be his friend for an extended period of time, it takes a huge toll on him that he doesn’t have the support network to recover from in a healthy way. 

The story does showcase Hufflepuff virtue. Hermione’s army is strong precisely because people work hard, trust each other and help each other - not just in big, dramatic gestures, but in small moments throughout the day. 

But… none of that ends up really mattering. And in the end, Harry faces his enemy alone. Lip service is paid to the concepts of friendship and group coordination, but the dominant narrative is Godric Gryffindor’s Nihil Supernum:


No rescuer hath the rescuer.
No lord hath the champion.
No mother or father.
Only nothingness above.


The Sequences and HPMOR both talk about the importance of groups, of emotions, of avoiding the biases that plague overly-clever people in particular. But I feel like the communities descended from Less Wrong, as a whole, are still basically that eleven-year-old Harry Potter: abstractly understanding that these things are important, but not really believing in them seriously enough to actually change their plans and priorities.

Lone Heroes


In Methods of Rationality, there’s a pretty good reason for Harry to focus on being a lone hero: he literally is alone. Nobody else really cares about the things he cares about or tries to do things on his level. It’s like a group project in high school, which is supposed to teach cooperation but actually just results in one kid doing all the work while the others either halfheartedly try to help (at best) or deliberately goof off.

Harry doesn’t bother turning to others for help, because they won’t give him the help he needs.

He does the only thing he can do reliably: focus on himself, pushing himself as hard as he can. The world is full of impossible challenges and nobody else is stepping up, so he shuts up and does the impossible as best he can. Learning higher level magic. Learning higher level strategy. Training, physically and mentally. 

This proves to be barely enough to survive, and not nearly enough to actually play the game. The last chapters are Harry realizing his best still isn’t good enough, and no, this isn’t fair, but it’s how the world is, and there’s nothing to do but keep trying.

He helps others level up as best they can. Hermione and Neville and some others show promise. But they’re not ready to work together as equals.

And frankly, this does match my experience of the real world. When you have a dream burning in your heart... it is incredibly hard to find someone who shares it, who will not just pitch in and help but will actually move heaven and earth to achieve it. 

And if they aren’t capable, level themselves up until they are.

In my own projects, I have tried to find people to work alongside me and at best I’ve found temporary allies. And it is frustrating. And it is incredibly tempting to say “well, the only person I can rely on is myself.”

But… here’s the thing.

Yes, the world is horribly unfair. It is full of poverty, and people trapped in demoralizing jobs. It is full of stupid bureaucracies and corruption and people dying for no good reason. It is full of beautiful things that could exist but don’t. And there are terribly few people who are able and willing to do the work needed to make a dent in reality.

But as long as we’re willing to look at monstrously unfair things and roll up our sleeves and get to work anyway, consider this:

It may be that one of the unfair things is that one person can never be enough to solve these problems. That one of the things we need to roll up our sleeves and do even though it seems impossible is figure out how to coordinate and level up together and rely on each other in a way that actually works.

And maybe, while we’re at it, find meaningful relationships that actually make us happy. Because it's not a coincidence that Hufflepuff is about both hard work and warmth and camaraderie. The warmth is what makes the hard work sustainable.

Godric Gryffindor has a point, but Nihil Supernum feels incomplete to me. There are no parents to step in and help us, but if we look to our left, or right…


Yes, you are only one
No, it is not enough—
But if you lift your eyes,
I am your brother

Vienna Teng, Level Up 


-


Reminder that the Berkeley Hufflepuff Unconference is on April 28th. RSVPing on this Facebook Event is helpful, as is filling out this form.


LessWrong analytics (February 2009 to January 2017)

19 riceissa 16 April 2017 10:45PM

Introduction

In January 2017, Vipul Naik obtained Google Analytics daily sessions and pageviews data for LessWrong from Kaj Sotala. Vipul asked me to write a short post giving an overview of the data, so here it is.

This post covers just the basics. Vipul and I are eager to hear thoughts on what sort of deeper analysis people are interested in; we may incorporate these ideas in future posts.

Pageviews and sessions

The data for both sessions and pageviews span from February 26, 2009 to January 3, 2017. LessWrong seems to have launched in February 2009, so this is close to the full duration for which LessWrong has existed.

Pageviews plot:

30-day rolling sum of Pageviews

The total number of pageviews recorded by Google Analytics for this period is 52.2 million.

Sessions plot:

30-day rolling sum of Sessions

The total number of sessions recorded by Google Analytics for this period is 19.7 million.

Both plots end with an upward swing, coinciding with the effort to revive LessWrong that began in late November 2016. However, as of early January 2017 (the latest period for which we have data) the scale of any recent increase in LessWrong usage is small in the context of the general decline starting in early 2012.
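
For readers who want to reproduce plots like these from the exported data, here is a minimal sketch in Python using pandas; the file name and column names are assumptions about the export format, not the actual ones used for this post:

import pandas as pd
import matplotlib.pyplot as plt

# Assumed export format: one row per day with "date" and "pageviews" columns.
df = pd.read_csv("lesswrong_daily_pageviews.csv", parse_dates=["date"])
df = df.set_index("date").sort_index()

# 30-day rolling sum, as in the plots above.
rolling = df["pageviews"].rolling(window=30).sum()

rolling.plot(title="30-day rolling sum of Pageviews")
plt.show()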

Top posts

The top 20 posts of all time (by total pageviews), with pageviews and unique pageviews rounded to the nearest thousand, are as follows:

Title Pageviews (thousands) Unique Pageviews (thousands)
Don’t Get Offended 681 128
How to Be Happy 551 482
How to Beat Procrastination 378 342
The Best Textbooks on Every Subject 266 233
Do you have High-Functioning Asperger’s Syndrome? 188 168
Superhero Bias 169 154
The Quantum Physics Sequence 157 130
Bayesian Judo 140 126
An Alien God 125 113
An Intuitive Explanation of Quantum Mechanics 123 106
Three Worlds Collide (0/8) 121 93
Bayes’ Theorem Illustrated (My Way) 121 112
9/26 is Petrov Day 121 115
The Baby-Eating Aliens (1/8) 109 98
The noncentral fallacy - the worst argument in the world? 107 99
Advanced Placement exam cutoffs and superficial knowledge over deep knowledge 107 94
Guessing the Teacher’s Password 102 96
The Fun Theory Sequence 102 90
Optimal Employment 102 97
Ugh fields 95 86

Note that Google Analytics reports are subject to sampling when the number of sessions is large (as it is here) so the input numbers are not exact. More details can be found in a post at LunaMetrics. This doesn’t affect the estimates for the top posts, but those wishing to work with the exported data should be aware of this.

Each post on LessWrong can have numerous URLs. In the case of posts that were renamed, a significant number of pageviews could be recorded at both the old and new URL. To take an example, the following URLs all point to lukeprog’s post “How to Be Happy”:

All that matters for identifying this particular post is that we have the substring “/lw/4su” in the URL. In the above table, I have grouped the URLs by this identifying substring and summed to get the pageview counts.

In addition, each post has two “canonical” URLs that can be obtained by clicking on the post titles: one that begins with either “/r/lesswrong/lw” or “/r/discussion/lw” and one that begins with just “/lw”. I have used the latter in linking to the posts from my table.
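
As an illustration of the grouping step, here is a minimal sketch in Python; the file name and column names are hypothetical, and the regular expression simply captures the "/lw/<id>" fragment described above:

import re
import pandas as pd

# Hypothetical input: one row per URL variant with its pageview count.
views = pd.read_csv("pageviews_by_url.csv")  # assumed columns: "url", "pageviews"

def post_id(url):
    """Return the identifying substring such as 'lw/4su', or None for non-post URLs."""
    m = re.search(r"/(lw/[0-9a-z]+)", url)
    return m.group(1) if m else None

views["post_id"] = views["url"].map(post_id)
totals = (views.dropna(subset=["post_id"])
               .groupby("post_id")["pageviews"]
               .sum()
               .sort_values(ascending=False))
print(totals.head(20))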

Source code

The data, the source code used to generate the plots, and the Markdown source of this post are available in a GitHub Gist.

Clone the Git repository with:

git clone https://gist.github.com/cbdd400180417c689b2befbfbe2158fc.git

Further reading

Here are a few related PredictionBook predictions:

Acknowledgments

Thanks to Kaj for providing the data used in this post. Thanks to Vipul for asking around for the data, for the idea of this post, and for sponsoring my work on this post.

What's up with Arbital?

19 Alexei 29 March 2017 05:22PM

This post is for all the people who have been following Arbital's progress since 2015 via whispers, rumors, and clairvoyant divination. That is to say: we didn't do a very good job of communicating. I hope this post corrects some of that.

The top question on your mind is probably: "Man, I was promised that Arbital will solve X! Why hasn't it solved X already?" Where X could be intuitive explanations, online debate, all LessWrong problems, AGI, or just cancer. Well, we did try to solve the first two and it didn't work. Math explanations didn't work because we couldn't find enough people who would spend the time to write good math explanations. (That said, we did end up with some decent posts on abstract algebra. Thank you to everyone who contributed!) Debates didn't work because... well, it's a very complicated problem. There was also some disagreement within the team about the best approach, and we ended up moving too slowly.

So what now?

You are welcome to use Arbital in its current version. It's mostly stable, though a little slow sometimes. It has a few features some might find very helpful for their type of content. Eliezer is still writing AI Alignment content on it, and he heavily relies on the specific Arbital features, so it's pretty certain that the platform is not going away. In fact, if the venture fails completely, it's likely MIRI will adopt Arbital for their personal use.

I'm starting work on Arbital 2.0. It's going to be a (micro-)blogging platform. (If you are a serious blogger / Tumblr user, let me know; I'd love to ask you some questions!) I'm not trying to solve online debates, build LW 2.0, or cure cancer. It's just going to be a damn good blogging platform. If it goes well, then at some point I'd love to revisit the Arbital dream.

I'm happy to answer any and all questions in the comments.

Sufficiently sincere confirmation bias is indistinguishable from science

19 Benquo 15 March 2017 01:19PM

Some theater people at NYU wanted to demonstrate how gender stereotypes affected the 2016 US presidential election. So they decided to put on a theatrical performance of the presidential debates – but with the genders of the principals swapped. They assumed that this would show how much of a disadvantage Hillary Clinton was working under because of her gender. They were shocked to discover the opposite – audiences full of Clinton supporters, watching the gender-swapped debates, came away thinking that Trump was a better communicator than they'd thought.

The principals don't seem to have come into this with a fair-minded attitude. Instead, it seems to have been a case of "I'll show them!":

Salvatore says he and Guadalupe began the project assuming that the gender inversion would confirm what they’d each suspected watching the real-life debates: that Trump’s aggression—his tendency to interrupt and attack—would never be tolerated in a woman, and that Clinton’s competence and preparedness would seem even more convincing coming from a man.

Let's be clear about this. This was not epistemic even-handedness. This was a sincere attempt at confirmation bias. They believed one thing, and looked only for confirming evidence to prove their point. It was only when they started actually putting together the experiment that they realized they might learn the opposite lesson:

But the lessons about gender that emerged in rehearsal turned out to be much less tidy. What was Jonathan Gordon smiling about all the time? And didn’t he seem a little stiff, tethered to rehearsed statements at the podium, while Brenda King, plainspoken and confident, freely roamed the stage? Which one would audiences find more likeable?

What made this work? I think what happened is that they took their own beliefs literally. They actually believed that people hated Hillary because she was a woman, and so their idea of something that they were confident would show this clearly was a fair test. Because of this, when things came out the opposite of the way they'd predicted, they noticed and were surprised, because they actually expected the demonstration to work.

But they went further. Even though they knew in advance of the public performances that the experiment got the wrong answer, they neither falsified nor file-drawered the evidence. They tried to show, they got a different answer, they showed it anyway.

This is much, much better science than contemporary medical or psychology research was before the replication crisis.

Sometimes, when I think about how epistemically corrupt our culture is, I'm tempted to adopt a permanent defensive crouch, disbelieve anything I can't fact-check, and explicitly adjust for all the relevant biases - and this prospect sounds exhausting. It's not actually necessary. You don't have to worry too much about your biases. Just take your own beliefs literally, as though they mean what they say they mean, and try to believe all their consequences as well. And, when you hit a contradiction – well, now you have an opportunity to learn where you're wrong.

(Cross-posted at my personal blog.)

The Semiotic Fallacy

19 Stabilizer 21 February 2017 04:50AM

Acknowledgement: This idea is essentially the same as something mentioned in a podcast where Julia Galef interviews Jason Brennan.

You are in a prison. You don't really know how to fight and you don't have very many allies yet. A prison bully comes up to you and threatens you. You have two options: (1) Stand up to the bully and fight. If you do this, you will get hurt, but you will save face. (2) You can try and run away. You might get hurt less badly, but you will lose face.

What should you do?

From reading accounts of former prisoners and also from watching realistic movies and TV shows, it seems like (1) is the better option. The reason is that the semiotics—or the symbolic meaning—of running away has bad consequences down the road. If you run away, you will be seen as weak, and therefore you will be picked on more often, causing more damage in the long run.

This is a case where focusing on the semiotics of the action is the right decision, because it is underwritten by future consequences.

But consider now a different situation. Suppose a country, call it Macholand, controls some tiny island far away from its mainland. Macholand has a hard time governing the island and the people on the island don't quite like being ruled by Macholand. Suppose, one fine day, the people of the island declare independence from Macholand. Macholand has two options: (1) Send the military over and put down the rebellion; or (2) Allow the island to take its own course.

From a semiotic standpoint, (1) is probably better. It signals that Macholand is a strong and powerful country. But from a consequential standpoint, it is at least plausible that (2) is a better option. Macholand saves money and manpower by not having to govern that tiny island; the people on the island are happier being self-governing; and maybe the international community doesn't really care what Macholand does here.

This is a case where focusing on the semiotics can lead to suboptimal outcomes. 

Call this kind of reasoning the semiotic fallacy: Thinking about the semiotics of possible actions without estimating the consequences of the semiotics.

I think the semiotic fallacy is widespread in human reasoning. Here are a few examples:

  1. People argue that democracy is good because it symbolizes egalitarianism. (This is the example used in the podcast interview.)
  2. People argue that we should build large particle accelerators because it symbolizes human achievement.
  3. People argue that we shouldn't build a wall on the southern border because it symbolizes division.
  4. People argue that we should build a wall on the southern border because it symbolizes national integrity. 

Two comments are in order:

  1. The semiotic fallacy is a special case of errors in reasoning and judgement caused by signaling behaviors (à la Robin Hanson). The distinctive feature of the semiotic fallacy is that the semiotics are explicitly stated during reasoning. Signaling-type errors are often subconscious: e.g., if we spend a lot of money on our parents' medical care, we might be doing it for symbolic purposes (i.e., signaling), but we wouldn't say explicitly that that's why we are doing it. In the semiotic fallacy, on the other hand, we do explicitly acknowledge that the reason we do something is its symbolism.
  2. Just like all fallacies, the existence of the fallacy doesn't necessarily mean the final conclusion is wrong. It could be that the semiotics are underwritten by the consequences. Or the conclusion could be true because of completely orthogonal reasons. The fallacy occurs when we ignore, in our reasoning during choice, the need for the consequential undergirding of symbolic acts.

Why is the surprisingly popular answer correct?

19 Stuart_Armstrong 03 February 2017 04:24PM

In Nature, there's been a recent publication arguing that the best way of gauging the truth of a question is to get people to report their views on the truth of the matter, and their estimate of the proportion of people who would agree with them.

Then, it's claimed, the surprisingly popular answer is likely to be the correct one.

In this post, I'll attempt to sketch a justification as to why this is the case, as far as I understand it.

First, an example of the system working well:

 

Capital City

Canberra is the capital of Australia, but many people think the actual capital is Sydney. Suppose only a minority knows that fact, and people are polled on the question:

Is Canberra the capital of Australia?

Then those who think that Sydney is the capital will think the question is trivially false, and will generally not see any reason why anyone would believe it true. They will answer "no" and predict that a high proportion of people will answer "no".

The minority who know the true capital of Australia will answer "yes". But most of them will likely know a lot of people who are mistaken, so they won't predict a high proportion of "yes" answers. Even if they do, there are few of them, so the population's average estimate of the proportion answering "yes" will still be low.

Thus "yes", the correct answer, will be surprisingly popular.

A quick sanity check: if we asked instead "Is Alice Springs the capital of Australia?", then those who believe Sydney is the capital will still answer "no" and claim that most people would do the same. Those who believe the capital is in Canberra will answer similarly. And there is no large group of people who believe Alice Springs is the capital, so "yes" will not be surprisingly popular.

What is important here is that adding true information to the population will tend to move the proportion of people believing the truth more than it moves people's estimate of that proportion.
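
As I understand the method, the decision rule itself is simple: an answer is "surprisingly popular" when its actual share of votes exceeds the average share respondents predicted it would get. Here is a minimal sketch in Python with made-up numbers for the Canberra example (the paper's actual estimator is more involved, so treat this only as an illustration):

# Made-up poll data for "Is Canberra the capital of Australia?"
answers = ["no", "no", "no", "yes", "yes"]            # each respondent's own answer
predicted_yes_share = [0.10, 0.05, 0.10, 0.30, 0.25]  # their predictions of the "yes" share

actual_yes = answers.count("yes") / len(answers)                          # 0.40
mean_predicted_yes = sum(predicted_yes_share) / len(predicted_yes_share)  # 0.16

# "Yes" is surprisingly popular: it got a larger share than people predicted it would.
surprisingly_popular_answer = "yes" if actual_yes > mean_predicted_yes else "no"
print(actual_yes, mean_predicted_yes, surprisingly_popular_answer)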

 

No differential information:

Let's see how that setup could fail. First, it could fail in a trivial fashion: the Australian Parliament and the Queen secretly conspire to move the capital to Melbourne. As long as they aren't included in the sample, nobody knows about the change. In fact, nobody can distinguish a world in which the move was vetoed from one where it passed. So the proportion of people who know the truth - that being those few deluded souls who already thought the capital was in Melbourne, for some reason - is no higher in the world where it's true than in the one where it's false.

So the population opinion has to be truth-tracking, not in the sense that the majority opinion is correct, but in the sense that more people believe X is true, relatively, in a world where X is true versus a world where X is false.


Systematic bias in population proportion:

A second failure mode could happen when people are systematically biased in their estimate of the general opinion. Suppose, for instance, that the following headline went viral:

"Miss Australia mocked for claims she got a doctorate in the nation's capital, Canberra."

And suppose that those who believed the capital was in Sydney thought "stupid beauty contest winner, she thought the capital was in Canberra!", while those who knew the true capital thought "stupid beauty contest winner, she claimed to have a doctorate!". So the actual proportion holding each belief doesn't change much at all.

But then suppose everyone reasons "now, I'm smart, so I won't update on this headline, but some other people, who are idiots, will start to think the capital is in Canberra." Then they will update their estimate of the population proportion. And Canberra may no longer be surprisingly popular, just expectedly popular.

 

Purely subjective opinions

How would this method work on a purely subjective opinion, such as:

Is Picasso superior to Van Gogh?

Well, there are two ways of looking at this. The first is to claim this is a purely subjective opinion, and as such people's beliefs are not truth tracking, and so the answers don't give any information. Indeed, if everyone accepts that the question is purely subjective, then there is no such thing as private (or public) information that is relevant to this question at all. Even if there were a prior on this question, no-one can update on any information.

But now suppose that there is a judgement that is widely shared, that, I don't know, blue paintings are objectively superior to paintings that use less blue. Then suddenly answers to that question become informative again! Except now, the question that is really being answered is:

Does Picasso use more blue than Van Gogh?

Or, more generally:

According to widely shared aesthetic criteria, is Picasso superior to Van Gogh?

The same applies to moral questions like "is killing wrong?". In practice, that is likely to reduce to:

According to widely shared moral criteria, is killing wrong?

 

OpenAI makes humanity less safe

18 Benquo 03 April 2017 07:07PM

If there's anything we can do now about the risks of superintelligent AI, then OpenAI makes humanity less safe.

Once upon a time, some good people were worried about the possibility that humanity would figure out how to create a superintelligent AI before they figured out how to tell it what we wanted it to do.  If this happened, it could lead to literally destroying humanity and nearly everything we care about. This would be very bad. So they tried to warn people about the problem, and to organize efforts to solve it.

Specifically, they called for work on aligning an AI’s goals with ours - sometimes called the value alignment problem, AI control, friendly AI, or simply AI safety - before rushing ahead to increase the power of AI.

Some other good people listened. They knew they had no relevant technical expertise, but what they did have was a lot of money. So they did the one thing they could do - throw money at the problem, giving it to trusted parties to try to solve the problem. Unfortunately, the money was used to make the problem worse. This is the story of OpenAI.

Before I go on, two qualifiers:

  1. This post will be much easier to follow if you have some familiarity with the AI safety problem. For a quick summary you can read Scott Alexander’s Superintelligence FAQ. For a more comprehensive account see Nick Bostrom’s book Superintelligence.
  2. AI is an area in which even most highly informed people should have lots of uncertainty. I wouldn't be surprised if my opinion changes a lot after publishing this post, as I learn relevant information. I'm publishing this because I think this process should go on in public.

The story of OpenAI

Before OpenAI, there was DeepMind, a for-profit venture working on "deep learning” techniques. It was widely regarded as the advanced AI research organization. If any current effort was going to produce superhuman intelligence, it was DeepMind.

Elsewhere, industrialist Elon Musk was working on more concrete (and largely successful) projects to benefit humanity, like commercially viable electric cars, solar panels cheaper than ordinary roofing, cheap spaceflight with reusable rockets, and a long-run plan for a Mars colony. When he heard the arguments people like Eliezer Yudkowsky and Nick Bostrom were making about AI risk, he was persuaded that there was something to worry about - but he initially thought a Mars colony might save us. But when DeepMind’s head, Demis Hassabis, pointed out that this wasn't far enough to escape the reach of a true superintelligence, he decided he had to do something about it:

Hassabis, a co-founder of the mysterious London laboratory DeepMind, had come to Musk’s SpaceX rocket factory, outside Los Angeles, a few years ago. […] Musk explained that his ultimate goal at SpaceX was the most important project in the world: interplanetary colonization.

Hassabis replied that, in fact, he was working on the most important project in the world: developing artificial super-intelligence. Musk countered that this was one reason we needed to colonize Mars—so that we’ll have a bolt-hole if A.I. goes rogue and turns on humanity. Amused, Hassabis said that A.I. would simply follow humans to Mars.

[…]

Musk is not going gently. He plans on fighting this with every fiber of his carbon-based being. Musk and Altman have founded OpenAI, a billion-dollar nonprofit company, to work for safer artificial intelligence.

OpenAI’s primary strategy is to hire top AI researchers to do cutting-edge AI capacity research and publish the results, in order to ensure widespread access. Some of this involves making sure AI does what you meant it to do, which is a form of the value alignment problem mentioned above.

Intelligence and superintelligence

No one knows exactly what research will result in the creation of a general intelligence that can do anything a human can, much less a superintelligence - otherwise we’d already know how to build one. Some AI research is clearly not on the path towards superintelligence - for instance, applying known techniques to new fields. Other AI research is more general, and might plausibly be making progress towards a superintelligence. It could be that the sort of research DeepMind and OpenAI are working on is directly relevant to building a superintelligence, or it could be that their methods will tap out long before then. These are different scenarios, and need to be evaluated separately.

What if OpenAI and DeepMind are working on problems relevant to superintelligence?

If OpenAI is working on things that are directly relevant to the creation of a superintelligence, then its very existence makes an arms race with DeepMind more likely. This is really bad! Moreover, sharing results openly makes it easier for other institutions or individuals, who may care less about safety, to make progress on building a superintelligence.

Arms races are dangerous

One thing nearly everyone thinking seriously about the AI problem agrees on, is that an arms race towards superintelligence would be very bad news. The main problem occurs in what is called a “fast takeoff” scenario. If AI progress is smooth and gradual even past the point of human-level AI, then we may have plenty of time to correct any mistakes we make. But if there’s some threshold beyond which an AI would be able to improve itself faster than we could possibly keep up with, then we only get one chance to do it right.

AI value alignment is hard, and AI capacity is likely to be easier, so anything that causes an AI team to rush makes our chances substantially worse; if they get safety even slightly wrong but get capacity right enough, we may all end up dead. But if you’re worried that the other team will unleash a potentially dangerous superintelligence first, then you might be willing to skip some steps on safety to preempt them. But they, having more reason to trust themselves than you, might notice that you’re rushing ahead, get worried that your team will destroy the world, and rush their (probably safe, but they’re not sure) AI into existence.

OpenAI promotes competition

DeepMind used to be the standout AI research organization. With a comfortable lead on everyone else, they would be able to afford to take their time to check their work if they thought they were on the verge of doing something really dangerous. But OpenAI is now widely regarded as a credible close competitor. However dangerous you think DeepMind might have been in the absence of an arms race dynamic, this makes them more dangerous, not less. Moreover, by sharing their results, they are making it easier to create other close competitors to DeepMind, some of whom may not be so committed to AI safety.

We at least know that DeepMind, like OpenAI, has put some resources into safety research. What about the unknown people or organizations who might leverage AI capacity research published by OpenAI?

For more on how openly sharing technology with extreme destructive potential might be extremely harmful, see Scott Alexander’s Should AI be Open?, and Nick Bostrom’s Strategic Implications of Openness in AI Development.

What if OpenAI and DeepMind are not working on problems relevant to superintelligence?

Suppose OpenAI and DeepMind are largely not working on problems highly relevant to superintelligence. (Personally I consider this the more likely scenario.) By portraying short-run AI capacity work as a way to get to safe superintelligence, OpenAI’s existence diverts attention and resources from things actually focused on the problem of superintelligence value alignment, such as MIRI or FHI.

I suspect that in the long-run this will make it harder to get funding for long-run AI safety organizations. The Open Philanthropy Project just made its largest grant ever, to OpenAI, to buy a seat on OpenAI’s board for Open Philanthropy Project executive director Holden Karnofsky. This is larger than their recent grants to MIRI, FHI, FLI, and the Center for Human-Compatible AI all together.

But the problem is not just money - it’s time and attention. The Open Philanthropy Project doesn’t think OpenAI is underfunded, and could do more good with the extra money. Instead, it seems to think that Holden can be a good influence on OpenAI. This means that of the time he's allocating to AI safety, a fair amount has been diverted to OpenAI.

This may also make it harder for organizations specializing in the sort of long-run AI alignment problems that don't have immediate applications to attract top talent. People who hear about AI safety research and are persuaded to look into it will have a harder time finding direct efforts to solve key long-run problems, since an organization focused on increasing short-run AI capacity will dominate AI safety's public image.

Why do good inputs turn bad?

OpenAI was founded by people trying to do good, and has hired some very good and highly talented people. It seems to be doing genuinely good capacity research. To the extent to which this is not dangerously close to superintelligence, it’s better to share this sort of thing than not – they could create a huge positive externality. They could construct a fantastic public good. Making the world richer in a way that widely distributes the gains is very, very good.

Separately, many people at OpenAI seem genuinely concerned about AI safety, want to prevent disaster, and have done real work to promote long-run AI safety research. For instance, my former housemate Paul Christiano, who is one of the most careful and insightful AI safety thinkers I know of, is currently employed at OpenAI. He is still doing AI safety work – for instance, he coauthored Concrete Problems in AI Safety with, among others, Dario Amodei, another OpenAI researcher.

Unfortunately, I don’t see how those two things make sense jointly in the same organization. I’ve talked with a lot of people about this in the AI risk community, and they’ve often attempted to steelman the case for OpenAI, but I haven’t found anyone willing to claim, as their own opinion, that OpenAI as conceived was a good idea. It doesn’t make sense to anyone, if you’re worried at all about the long-run AI alignment problem.

Something very puzzling is going on here. Good people tried to spend money on addressing an important problem, but somehow the money got spent on the thing most likely to make that exact problem worse. Whatever is going on here, it seems important to understand if you want to use your money to better the world.

(Cross-posted at my personal blog.)

What exactly is the "Rationality Community?"

17 Raemon 09 April 2017 12:11AM

This is the second post in the Project Hufflepuff sequence. It’s also probably the most standalone and relevant to other interests. The introduction post is here.


The Berkeley Hufflepuff Unconference is on April 28th. RSVPing on this Facebook Event is helpful, as is filling out this form.



 

I used to use the phrase "Rationality Community" to mean three different things. Now I only use it to mean two different things, which is... well, a mild improvement at least. In practice, I was lumping a lot of people together, many of whom neither wanted to get lumped together nor had much in common.

 

As Project Hufflepuff took shape, I thought a lot about who I was trying to help and why. And I decided the relevant part of the world looks something like this:

I. The Rationalsphere

The Rationalsphere is defined in the broadest possible sense - a loose cluster of overlapping interest groups, communities and individuals. It includes people who disagree wildly with each other - some who are radically opposed to one another. It includes people who don’t identify as “rationalist” or even as especially interested in “rationality” - but who interact with each other on a semi-regular basis. I think it's useful to be able to look at that ecosystem as a whole, and talk about it without bringing in implications of community.

continue reading »

2017: An Actual Plan to Actually Improve

17 helldalgo 27 January 2017 06:42PM

[Epistemic status: mostly confident, but being this intentional is experimental]

This year, I'm focusing on two traits: resilience and conscientiousness.  I think these (or the fact that I lack them) are my biggest barriers to success.  Also: identifying them as goals for 2017 doesn't mean I'll stop developing them in 2018.  A year is just a nice, established amount of time in which progress can actually be made.  This plan is a more intentional version of techniques I've used to improve myself over the last few years.  I have outside verification that I'm more responsible, high-functioning, and resilient than I was several years ago.  I have managed to reduce my SSRI dose, and I have finished more important tasks this year than last year.  

Inspiring blog posts and articles can only do so much for personal development.  The most valuable writing in that genre tends to outline actual steps that (the author believes) generate positive results.  Unfortunately, finding those steps is a fairly personal process.  The song that gives me twenty minutes of motivation and the drug that helps me overcome anxiety might do the opposite for you.  Even though I'm including detailed steps in this plan, you should keep that in mind.  I hope that this post can give you a template for troubleshooting and discovering your own bottlenecks.

I.  

First, I want to talk about my criteria for success.  Without illustrating the end result, or figuring out how to measure it, I could finish out the year with a false belief that I'd made progress.  If you plan something without success criteria, you run the same risk. I also believe that most of the criteria should be observable by a third party, i.e. hard to fake. 

  1. I respond to disruptions in my plans with distress and anger.  While I've gotten better at calming down, the distress still happens. I would like to have emotional control such that I observe first, and then feel my feelings.  Disruptions should incite curiosity, and a calm evaluation of whether to correct course.  The observable bit is whether or not my husband and friends report that I seem less upset when they disrupt me.  This process is already taking place; I've been practicing this skill for a long time and I expect to continue seeing progress.  (resilience)
  2. If an important task takes very little time, doesn't require a lot of effort, and doesn't disrupt a more important process, I will do it immediately. The observable part is simple, here: are the dishes getting done? Did the trash go out on Wednesday?  (conscientiousness)
  3. I will do (2) without "taking damage."  I will use visualization of the end result to make my initial discomfort less significant.  (resilience) 
  4. I will use various things like audiobooks, music, and playfulness to make what can be made pleasant, pleasant.  (resilience and conscientiousness)
  5. My instinct when encountering hard problems will be to dissolve them into smaller pieces and identify the success criteria, immediately, before I start trying to generate solutions. I can verify that I'm doing this by doing hard problems in front of people, and occasionally asking them to describe my process as it appears.  
  6. I will focus on the satisfaction of doing hard things, and practice sitting in discomfort regularly (cold tolerance, calming myself around angry people, the pursuit of fitness, meditation).  It's hard to identify an external sign that this is accomplished.  I expect aversion-to-starting to become less common, and my spouse can probably identify that.  (conscientiousness)
  7. I will keep a daily journal of what I've accomplished, and carry a notebook to make reflective writing easy and convenient.  This will help keep me honest about my past self.  (conscientiousness) 
  8. By the end of the year, I will find myself and my close friends/family satisfied with my growth.  I will have a record of finishing several important tasks, will be more physically fit than I am now, and will look forward to learning difficult things.
One benefit of some of these is that practice and success are the same.  I can experience the satisfaction of any piece of my practice done well; it will count as being partly successful.  

II.

I've taken the last few years to identify these known bottlenecks and reinforcing actions.  Doing one tends to make another easier, and neglecting them keeps harder things unattainable.  These are the most important habits to establish early.  

  1. Meditation for 10 minutes a day directly improves my resilience and lowers my anxiety.
  2. Medication shouldn't be skipped (an SSRI, DHEA, and methylphenidate). If I decide to go off of it, I should properly taper rather than quitting cold turkey.  DHEA counteracts the negatives of my hormonal birth control and (seems to!) make me more positively aggressive and confident.
  3. Fitness (in the form of dance, martial arts, and lifting) keeps my back from hurting, gives me satisfaction, and has a number of associated cognitive benefits.  Dancing and martial arts also function as socialization, in a way that leads to group intimacy faster than most of my other hobbies.  Being fit and attractive helps me maintain a high libido.  
  4. I need between 7 and 9 hours of sleep.  I've tried getting around it.  I can't.  Getting enough sleep is a well-documented process, so I'm not going to outline my process here.
  5. Water.  Obviously.
  6. Since overcoming most of my social anxiety, I've discovered that frequent, high-value socialization is critical to avoid depression.  I try to regularly engage in activities that bootstrap intimacy, like the dressing room before performances, solving a hard problem with someone, and going to conventions.  I need several days a week to include long conversations with people I like.  
Unknown bottlenecks can be found by noticing a negative result and tracing the chain of events backwards until you find a common denominator.  Sometimes they can also be identified by people who interact with you a lot.

III.  

My personal "toolkit" is a list of things that give me temporary motivation or rapidly deescalate negative emotions.  

  1. Kratom (<7g) does wonders for my anxieties about starting a task.  I try not to take it too often, since I don't want to develop tolerance, but I like to keep some on hand for this.
  2. Nicotine + caffeine/l-theanine capsules give me an hour of motivation without jitters.  Tolerance builds rapidly for this too, so I don't do it often.
  3. A 30-second mindfulness meditation can usually calm my first emotional response to a distressing event.
  4. Various posts on mindingourway.com can help reconnect me to my values when I'm feeling particularly demotivated.  
  5. Reorganizing furniture makes me feel less "stuck" when I get restless.  Ditto for doing a difficult thing in a different place.
  6. Google Calendar, a number of notebooks, and a whiteboard keep me from forgetting important tasks.
  7. Josh Waitzkin's book, The Art of Learning, remotivates me to achieve mastery in various hobbies.
  8. External prompting from other people can make me start a task I've been avoiding. Sometimes I have people aggressively yell at me.
  9. The LW study hall (Complice.co) helps keep me focused. I also do "pomos" over video with other people who don't like Complice.
IV.

This outline is the culmination of a few years of troubleshooting, getting feedback, and looking for invented narratives or dishonesty in my approach.  Personal development doesn't happen quickly for me, and I expect it doesn't for most people.  You should expect significant improvements to be a matter of years, not months, unless you're improving the basics like sleep or fitness.  For those, you see massive initial gains that eventually level off.  

If you have any criticisms or see any red flags in my approach, let me know in the comments.

 

Stupidity as a mental illness

15 PhilGoetz 10 February 2017 03:57AM

It's great to make people more aware of bad mental habits and encourage better ones, as many people have done on LessWrong.  The way we deal with weak thinking is, however, like how people dealt with depression before the development of effective anti-depressants:

  • Clinical depression was only marginally treatable.
  • It was seen as a crippling character flaw, weakness, or sin.
  • Admitting you had it could result in losing your job and/or friends.
  • Treatment was not covered by insurance.
  • Therapy was usually analytic or behavioral and not very effective.
  • People thus went to great mental effort not to admit, even to themselves, having depression or any other mental illness.
continue reading »

The Social Substrate

15 lahwran 09 February 2017 07:22AM

This post originally appeared on The Gears To Ascension

ABSTRACT

I present generative modeling of minds as a hypothesis for the complexities of social dynamics, and build a case for it out of pieces. My hope is that this explains social behaviors more precisely and with less handwaving than its components. I intend this to be a framework for reasoning about social dynamics more explicitly and for training intuitions. In future posts I plan to build on it to give more concrete evidence, and give examples of social dynamics that I think become more legible with the tools provided by combining these ideas.

Epistemic status: Hypothesis, currently my maximum likelihood hypothesis, of why social interaction is so weird.

INTRO: SOCIAL INTERACTION.

People talk to each other a lot. Many of them are good at it. Most people don't really have a deep understanding of why, and it's rare for people to question why it's a thing that's possible to be bad at. Many of the rules seem arbitrary at first look, and it can be quite hard to transfer skill at interaction by explanation.

Some of the rules sort of make sense, and you can understand why bad things would happen when you break them: Helping people seems to make them more willing to help you. Being rude to people makes them less willing to help you. People want to "feel heard". But what do those mean, exactly?

I've been wondering about this for a while. I wasn't naturally good at social interaction, and have had to put effort into learning it. This has been a spotty success - I often would go to people for advice, and then get things like "people want to know that you care". That advice sounded nice, but it was vague and not usable.

The more specific social advice seems to generalize quite badly. "Don't call your friends stupid", for example. Banter is an important part of some friendships! People call each other ugly and feel cared for. Wat?

Recently, I've started to see a deeper pattern here that actually seems to have strong generalization: it's simple to describe, it correctly predicts large portions of very complicated and weird social patterns, and it reliably gives me a lens to decode what happened when something goes wrong. This blog post is my attempt to share it as a package.

I basically came up with none of this. What I'm sharing is the synthesis of things that Andrew Critch, Nate Soares, and Robin Hanson have said - I didn't find these ideas that useful on their own, but together I'm kind of blown away by how much they collectively explain. In future blog posts I'll share some of the things I have used this to understand.

WARNING: An easy instinct, on learning these things, is to try to become more complicated yourself, to deal with the complicated territory. However, my primary conclusion is "simplify, simplify, simplify": try to make fewer decisions that depend on other people's state of mind. You can see more about why and how in the posts in the "Related" section, at the bottom.

NEWCOMB'S TEST

Newcomb's problem is a game that two beings can play. Let's say that the two people playing are you and Newcomb. On Newcomb's turn, Newcomb learns all that they can about you, and then puts one opaque box and one transparent box in a room. Then on your turn, you go into the room, and you can take one or both of the boxes. What Newcomb puts in the boxes depends on what they think you'll do once it's your turn:

  • If Newcomb thinks that you'll take only the opaque box, they fill it with $1 million, and put $1000 in the transparent box.
  • If Newcomb thinks that you'll take both of the boxes, they only put $1000 in the transparent box.

Once Newcomb is done setting the room up, you enter and may do whatever you like.

This problem is interesting because the way you win or lose has little to do with what you actually do once you go into the room - it's entirely about what you can convince Newcomb you'll do. This leads many people to try to cheat: convince Newcomb that you'll only take one box, and then take two.

In the original framing, Newcomb is a mind-reading oracle, and knows for certain what you'll do. In a more realistic version of the test, Newcomb is merely a smart person paying attention to you. Newcomb's problem is simply a crystallized view of something that people do all the time: evaluate what kind of people each other are, to determine trust. And it's interesting to note that when it's crystallized, it's kind of weird. When you put it this way, it becomes apparent that there are very strong arguments for why you should always do the trustworthy thing and one-box.
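
If it helps to see the payoff structure laid out, here is a minimal sketch in Python (illustrative only; the dollar amounts are the standard ones from the problem statement above):

    # Minimal sketch of Newcomb's payoff structure, following the numbers above.
    # Newcomb fills the boxes based on a prediction of your choice; your payout
    # depends on both the prediction and what you actually do.
    def newcomb_payout(predicted_one_box, actually_one_box):
        opaque = 1_000_000 if predicted_one_box else 0
        transparent = 1_000
        return opaque if actually_one_box else opaque + transparent

    # If Newcomb predicts you accurately, one-boxing wins:
    print(newcomb_payout(predicted_one_box=True, actually_one_box=True))    # 1000000
    print(newcomb_payout(predicted_one_box=False, actually_one_box=False))  # 1000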

THE NECESSITY OF NEWCOMBLIKE INTERACTION

(This section was inspired by Nate Soares' post "Newcomblike problems are the norm".)

You want to know that people care about you. You don't just want to know that the other person is acting helpfully right now. If someone doesn't care about you, and is just helping you because it helps them, then you'll trust and like them less. If you know that someone thinks your function from experience to emotions is acceptable to them, you will feel validated.

I think this makes a lot of sense. In artificial distributed systems, we ask a bunch of computers to work together, each computer a node in the system. All of the computers must cooperate to perform some task. Some artificial distributed systems, like bittorrent, are intended to let the different nodes (computers) share things with each other, with each participating computer joining in order to benefit from the system. Other distributed systems, such as the backbone routers of the internet, are intended to provide a service to the outside world - in the case of the backbone routers, they make the internet work.

However, nodes can violate the distributed system's protocols, and thereby gain advantage. In bittorrent, nodes can download but refuse to upload. In the internet backbone, each router needs to know where other routers are, but if a nearby router lies, the entire internet may slow down dramatically, or route huge portions of US traffic to China. Unfortunately, despite the many trust problems in distributed systems, we have solved relatively few of them. Bitcoin is a fun exception to this - I'll use it as a metaphor in a bit.

Humans are each nodes in a natural distributed system, where each node has its own goals, and can provide and consume services, just like the artificial ones we've built. But we also have this same trust problem, and must be solving it somehow, or we wouldn't be able to make civilizations.

Human intuitions automatically look for reasons why the world is the way it is. In stats/ML/AI, it's called generative modeling. When you have an experience - every time you have any experience, all the time, on the fly - your brain's low level circuitry assumes there was a reason that the experience happened. Each moment your brain is looking for what the process was that created that experience for you. Then in the future, you can take your mental version of the world and run it forward to see what might happen.

When you're young, you start out pretty uncertain about what processes might be driving the world, but as you get older your intuition learns to expect gravity to work, learns to expect that pulling yourself up by your feet won't work, and learns to think of people as made of similar processes to oneself.

So when you're interacting with an individual human, your brain is automatically tracking what sort of process they are - what sort of person they are. It is my opinion that this is one of the very hardest things that brains do (where I got that idea). When you need to decide whether you trust them, you don't just have to do that based off their actions - you also have your mental version of them that you've learned from watching how they behave.

But it's not as simple as evaluating, just once, what kind of person someone is. As you interact with someone, you are continuously, automatically tracking what kind of person they are and what kind of thoughts they seem to be having right now, in the moment. When I meet a person and they say something nice, is it because they think they're supposed to, or because they care about me? If my boss is snapping at me, are they trying to convince me I'm unwelcome at the company without saying it outright, or are they just having a bad day?
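
To make the "generative modeling" framing concrete, here is a toy sketch (my own illustration, not something from the sources above; the person-types and numbers are invented) of a Bayesian update over what kind of process might be generating someone's behavior:

    # Toy Bayesian update over "what kind of person is this?" -- illustrative only.
    priors = {"cares_about_me": 0.5, "self_interested": 0.5}

    # Assumed likelihood of observing a nice act under each hypothesis.
    likelihood_of_nice_act = {"cares_about_me": 0.9, "self_interested": 0.4}

    def update(beliefs, likelihoods):
        """One step of Bayes' rule after observing the act."""
        unnormalized = {t: beliefs[t] * likelihoods[t] for t in beliefs}
        total = sum(unnormalized.values())
        return {t: p / total for t, p in unnormalized.items()}

    posterior = update(priors, likelihood_of_nice_act)
    # After one nice act: cares_about_me ~0.69, self_interested ~0.31.
    # Repeated observations keep sharpening the model of the process behind the behavior.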

NEWCOMBLIKE URGES

Note: I am not familiar with the details of the evolution of cooperation. I propose a story here to transfer intuitions, but the details may have happened in a different order. I would be surprised if I am not describing a real event; if I'm not, that would weaken my point.

Humans are smart, and our ancestors have been reasonably smart going back a very long time, far before even primates branched off. So imagine what it was like to be an animal in a pre-tribal species. You want to survive, and you need resources to do so. You can take them from other animals. You can give them to other animals. Some animals may be more powerful than you, and attempt to take yours.

Imagine what it's like to be an animal partway through the evolution of cooperation. You feel some drive to be nice to other animals, but you don't want to be nice if the other animal will take advantage of you. So you pay attention to which animals seem to care about being nice, and you only help them. They help you, and you both survive.

As the generations go on, this happens repeatedly. An animal that doesn't feel caring for other animals is an animal that you can't trust; An animal that does feel caring is one that you want to help, because they'll help you back.

Over generations, it becomes more and more the case that the animals participating in this system actually want to help each other - because the animals around them are all running newcomblike tests of friendliness. Does this animal seem to have a basic urge to help me? Will this animal only take the one box, if I leave the boxes lying out? If the answer is that you can trust them, and you recognize that you can trust them, then that is the best for you, because then the other animal recognizes that they were trusted and will be helpful back.

After many generations of letting evolution explore this environment, you can expect to end up with animals that feel strong emotions for each other, animals which want to be seen as friendly, animals where helping matters. Here is an example of another species that has learned to behave sort of this way.

This seems to me to be a good generating hypothesis for why people innately care about what other people think of them, and it seems to predict ways that people will care about each other. I want to feel like people actually care about me - I don't just want to hear them say that they do. In particular, it seems to me that humans want this far more than you would expect of an arbitrary smart-ish animal.

I'll talk more in detail about what I think human innate social drives actually are in a future blog post. I'm interested in links to any research on things like human basic needs or emotional validation. For now, the heuristic I've found most useful is simply "People want to know that those around them approve of/believe their emotional responses to their experiences are sane". See also Succeed Socially, in the related list.

THE RECURSION DISTORTION

Knowing that humans evaluate each other in newcomblike ways doesn't seem to me to be enough to figure out how to interact with them. Only armed with the statement "one needs to behave in a way that others will recognize as predictably cooperative", I still wouldn't know how to navigate this.

At a lightning talk session I was at a few months ago, Andrew Critch made the argument that humans regularly model many layers deep in real situations. His claim was that people intuitively have a sense of what each other are thinking, including their senses of what you're thinking, and back and forth for a bit. Before I go on, I should emphasize how surprising this should be, without the context of how the brain actually does it: the more levels of me-imagining-you-imagining-me-imagining-you-imagining… you go, the more of an explosion of different options you should expect to see, and the less you should expect actual-sized human minds to be able to deal with it.

However, after having thought about it, I don't think it's as surprising as it seems. I don't think people actually vividly imagine this that many levels deep: what I think is going on is that as you grow up, you learn to recognize different clusters of ways a person can be. Stereotypes, if you will, but not necessarily so coarse as that implies.

At a young age, if I am imagining you, I imagine a sort of blurry version of you. My version of you will be too blurry to have its own version of me, but I learn to recognize the blurry-you when I see it. The blurry version of you only has a few emotions, but I sort of learn what they are: my blurry you can be angry-"colored", or it can be satisfied-"colored", or it can be excited-"colored", etc. ("Color" used here as a metaphor, because I expect this to be built a similar way to color or other basic primitives in the brain.)

Then later, as I get older, I learn to recognize when you see a blurry version of me. My new version of you is a little less blurry, but this new version of you has a blurry-me, made out of the same anger-color or satisfaction-color that I had learned you could be made out of. I go on, and eventually this version of you becomes its own individual colors - you can be angry-you-with-happy-me-inside colored when I took your candy, or you can be relieved-you-with-distraught-me-inside colored when you are seeing that I'm unhappy when a teacher took your candy back.

As this goes on, I learn to recognize versions of you as their own little pictures, with only a few colors - but each color is a "color" that I learned in the past, and the "color" can have me in it, maybe recursively. Now my brain doesn't have to track many levels - it just has to have learned that there is a "color" for being five levels deep of this, or another "color" for being five levels deep of that. Now that I have that color, my intuition can make pictures out of the colors and thereby handle six levels deep, and eventually my intuition will turn six levels into colors and I'll be able to handle seven.

I think it gets a bit more complicated than this for particularly socially competent people, but that's a basic outline of how humans could reliably learn to do this.
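
One way to picture the "colors" idea (my own sketch, with invented labels) is as learned chunks: once a nested pattern has its own label, one more level of recursion only costs one new label, not an exponentially bigger tree:

    # Illustrative sketch: nested models compressed into learned "chunks".
    learned_chunks = {}

    def chunk(name, parts):
        """Give a recurring nested pattern its own label ('color')."""
        learned_chunks[name] = parts
        return name

    angry_you = chunk("angry-you", parts=[])
    happy_me = chunk("happy-me", parts=[])

    # "angry-you-with-happy-me-inside" is one new chunk built from two known ones:
    level2 = chunk("angry-you-with-happy-me-inside", parts=[angry_you, happy_me])

    # A deeper pattern is again just one more chunk referencing level2, so recognizing
    # it stays cheap even though unrolling it is several levels deep.
    level3 = chunk("relieved-me-seeing-angry-you-with-happy-me-inside", parts=[level2])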

A RECURSION EXAMPLE

I found the claim that humans regularly social-model 5+ levels deep hard to believe at first, but Critch had an example to back it up, which I attempt to recreate here.

Fair warning, it's a somewhat complicated example to follow, unless you imagine yourself actually there. I only share it for the purpose of arguing that this sort of thing actually can happen; if you can't follow it, then it's possible the point stands without it. I had to invent notation in order to make sure I got the example right, and I'm still not sure I did.

(I'm sorry this is sort of contrived. Making these examples fully natural is really really hard.)

  • You're back in your teens, and friends with Kris and Gary. You hang out frequently and have a lot of goofy inside jokes and banter.
  • Tonight, Gary's mom has invited you and Kris over for dinner.
  • You get to Gary's house several hours early, but he's still working on homework. You go upstairs and borrow his bed for a nap.
  • Later, you're awoken by the activity as Kris arrives, and Gary's mom shouts a greeting from the other room: "Hey, Kris! Your hair smells bad." Kris responds with "Yours as well." This goes back and forth, with Gary, Kris, and Gary's mom fluidly exchanging insults as they chat. You're surprised - you didn't know Kris knew Gary's mom.
  • Later, you go downstairs to say hi. Gary's mom says "welcome to the land of the living!" and invites you all to sit and eat.
  • Partway through eating, Kris says "Gary, you look like a slob."
  • You feel embarrassed in front of Gary's mom, and say "Kris, don't be an ass."
  • You knew they had been bantering happily earlier. If you hadn't had an audience, you'd have just chuckled and joined in. What happened here?

If you'd like, pause for a moment and see if you can figure it out.


You, Gary, and Kris all feel comfortable bantering around each other. Clearly, Gary and Kris feel comfortable around Gary's mom, as well. But the reason you were uncomfortable is that you know Gary's mom thought you were asleep when Kris got there, and you hadn't known they were cool with each other before - so as far as Gary's mom knows, you think Kris is just being an ass. So you respond to that.

Let me try saying that again. Here's some notation for describing it:

  • X => Y: X correctly believes Y
  • X ~> Y: X incorrectly believes Y
  • X ?? Y: X does not know Y
  • X=Y=Z=...: X and Y and Z and ... are comfortable bantering

And here's an explanation in that notation:

  • Kris=You=Gary: Kris, You, and Gary are comfortable bantering.
  • Gary=Kris=Gary's mom: Gary, Kris, and Gary's mom are comfortable bantering.
  • You => [Gary=Gary's mom=Kris]: You know they're comfortable bantering.
  • Gary's mom ~> [You ?? [Gary=Gary's mom=Kris]]: Gary's mom doesn't know you know.
  • You => [Gary's mom ~> [You ?? [Gary=Gary's mom=Kris]]]: You know Gary's mom doesn't know you know they're comfortable bantering.

And to you in the moment, this crazy recursion just feels like a bit of anxiety, fuzziness, and an urge to call Kris out so Gary's mom doesn't think you're ok with Kris being rude.
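
If the prose version is hard to hold in your head, here is the same chain written as nested data (a sketch with helper names I made up, not part of the original notation), which makes the levels easy to count:

    # Encoding the example's beliefs as nested tuples: ("believes", who, what, correct?).
    banter = ("comfortable_bantering", ["Gary", "Gary's mom", "Kris"])

    mom_thinks_you_dont_know = ("believes", "Gary's mom",
                                ("does_not_know", "You", banter), False)  # incorrect belief
    you_know_mom_doesnt_know = ("believes", "You", mom_thinks_you_dont_know, True)

    def depth(belief):
        """Count how many levels of nested belief the structure contains."""
        if isinstance(belief, tuple) and belief[0] in ("believes", "does_not_know"):
            return 1 + depth(belief[2])
        return 0

    print(depth(you_know_mom_doesnt_know))  # 3 explicit levels in this encoding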

Now, this is a somewhat unusual example. It has to be set up just right in order to get such a deep recursion. The main character's reaction is sort of unhealthy/fake - better would have been to clarify that you overheard them bantering earlier. As far as I can tell, the primary case where things get this hairy is when there's uncertainty. But it does actually get this deep - this is a situation pretty similar to ones I've found myself in before.

There's a key thing here: when things like this happen, you react nearly immediately. You don't need to sit and ponder, you just immediately feel embarrassed for Kris, and react right away. Even though in order to figure out explicitly what you were worried about, you would have had to think about it four levels deep.

If you ask people about this, and it takes deep recursion to figure out what's going on, I expect you will generally get confused non-answers, such as "I just had a feeling". I also expect that when people give confused non-answers, it is almost always because of weird recursion things happening.

In Critch's original lightning talk, he gave this as an argument that the human social skills module is the one that just automatically gets this right. I agree with that, but I want to add: I think that that module is the same one that evaluates people for trust and tracks their needs and generally deals with imagining other people.

COMMUNICATION IN A NEWCOMBLIKE WORLD

So people have generative models of each other, and they care about each other's generative models of them. I care about people's opinion of me, but not in just a shallow way: I can't just ask them to change their opinion of me, because I'll be able to tell what they really think. Their actual moral judgement of their actual generative model of me directly affects my feelings of acceptance. So I want to let them know what kind of person I am: I don't just want to claim to be that kind of person, I want to actually show them that I am that kind of person.

You can't just tell someone "I'm not an asshole"; that's not strong evidence about whether you're an asshole. People have incentives to lie. People have powerful low-level automatic bayesian inference systems, and they'll automatically and intuitively recognize what social explanations are more likely as explanations of your behavior. If you want them to believe you're not an asshole, you have to give credible evidence that you are not an asshole: you have to show them that you do things that would have been unlikely had you been an asshole. You have to show them that you're willing to be nice to them, you have to show them that you're willing to accommodate their needs. Things that would be out of character if you were a bad character.

If you hang out with people who read Robin Hanson, you've probably heard of this before, under the name "signaling".

But many people who hear that interpret it as a sort of vacuous version, as though "signaling" is a sort of fakery, as though all you need to do is give the right signals. If someone says "I'm signaling that I'm one of the cool kids", then sure, they may be doing things that for other people would be signals of being one of the cool kids, but on net the evidence is that they are not one of the cool kids. Signaling isn't about the signals, it's about giving evidence about yourself. In order to be able to give credible evidence that you're one of the cool kids, you have to either get really good at lying-with-your-behavior such that people actually believe you, or you have to change yourself to be one of the cool kids. (This is, I think, a big part of where social anxiety advice falls down: "fake it 'til you make it" works only insofar as faking it actually temporarily makes it.)

"Signaling" isn't fakery, it is literally all communication about what kind of person you are. A common thing Hanson says, "X isn't about Y, it's about signaling" seems misleading to me: if someone is wearing a gold watch, it's not so much that wearing a gold watch isn't about knowing the time, it's that the owner's actual desires got distorted by the lens of common knowledge. Knowing that someone would be paying attention to them to infer their desires, they filtered their desires to focus on the ones they thought would make them look good. This also can easily come off as inauthentic, and it seems fairly clear why to me: if you're filtering your desires to make yourself look good, then that's a signal that you need to fake your desires or else you won't look good.

Signals are focused around hard-to-fake evidence. Anything and everything that is hard to fake, would only happen if you're a particular kind of person, and is recognized as such by someone else, is useful in conveying information about what kind of person you are. Fashion and hygiene are good examples of this: being willing to put in the effort to make yourself fashionable or presentable, respectively, is evidence of being the kind of person who cares about participating in the societal distributed system.

Conveying truth in ways that are hard to fake is the sort of thing that comes up in artificial distributed systems, too. Bitcoin is designed around a "blockchain": a series of incredibly-difficult-to-fake records of transactions. Bitcoin has interesting cryptographic tricks to make this hard to fake, but it centers around having a lot of people doing useless work, so that no one person can do a bunch more useless work and thereby succeed at faking it.
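
For the curious, here's a toy illustration of why that "useless work" makes records hard to fake (a heavily simplified sketch, not Bitcoin's actual protocol): producing a valid record takes many hash attempts, while checking one takes a single hash.

    import hashlib

    DIFFICULTY = "0000"  # more leading zeros = more work required

    def mine(record: str) -> int:
        """Expensive: try nonces until the hash has the required leading zeros."""
        nonce = 0
        while not hashlib.sha256(f"{record}{nonce}".encode()).hexdigest().startswith(DIFFICULTY):
            nonce += 1
        return nonce

    def verify(record: str, nonce: int) -> bool:
        """Cheap: one hash to check the claimed work."""
        return hashlib.sha256(f"{record}{nonce}".encode()).hexdigest().startswith(DIFFICULTY)

    nonce = mine("Alice pays Bob 1 coin")          # takes tens of thousands of tries on average
    print(verify("Alice pays Bob 1 coin", nonce))  # True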

SUMMARY

From the inside, it doesn't feel like we're in a massive distributed system. It doesn't feel like we're tracking game theory and common knowledge - even though everyone, including those who have never heard of these concepts, does it automatically.

In the example, the main character just felt like something was funny. The reason they were able to figure it out and say something so fast was that they were a competent human who had focused their considerable learning power on understanding social interaction, presumably from a young age, and automatically recognized a common knowledge pattern when it presented itself.

But in real life, people are constantly doing this. To get along with people, you have to be willing to pay attention to giving evidence about your perception of them. To be accepted, you have to be willing to give evidence that you are the kind of person that other people want to accept, and you might need to change yourself if you actually just aren't.

In general, I currently think that minimizing recursion depth of common knowledge is important. Try to find ways-to-be that people will be able to recognize more easily. Think less about social things in-the-moment so that others have to think less to understand you; adjust your policies to work reliably so that people can predict them reliably.

Other information of interest

Brief update on the consequences of my "Two arguments for not thinking about ethics" (2014) article

14 Kaj_Sotala 05 April 2017 11:25AM

In March 2014, I posted on LessWrong an article called "Two arguments for not thinking about ethics (too much)", which started out with:

I used to spend a lot of time thinking about formal ethics, trying to figure out whether I was leaning more towards positive or negative utilitarianism, about the best courses of action in light of the ethical theories that I currently considered the most correct, and so on. From the discussions that I've seen on this site, I expect that a lot of others have been doing the same, or at least something similar.

I now think that doing this has been more harmful than it has been useful, for two reasons: there's no strong evidence to assume that this will give us very good insight to our preferred ethical theories, and more importantly, because thinking in those terms will easily lead to akrasia.

I ended the article with the following paragraph:

My personal experience of late has also been that thinking in terms of "what does utilitarianism dictate I should do" produces recommendations that feel like external obligations, "shoulds" that are unlikely to get done; whereas thinking about e.g. the feelings of empathy that motivated me to become utilitarian in the first place produce motivations that feel like internal "wants". I was very close to (yet another) burnout and serious depression some weeks back: a large part of what allowed me to avoid it was that I stopped entirely asking the question of what I should do, and began to focus entirely on what I want to do, including the question of which of my currently existing wants are ones that I'd wish to cultivate further. (Of course there are some things like doing my tax returns that I do have to do despite not wanting to, but that's a question of necessity, not ethics.) It's way too short of a time to say whether this actually leads to increased productivity in the long term, but at least it feels great for my mental health, at least for the time being.

The long-term update (three years after first posting the article) is that starting to shift my thought patterns in this way was totally the right thing to do, and necessary for starting a long and slow recovery from depression. It's hard to say entirely for sure how big of a role this has played, since the patterns of should-thought were very deeply ingrained and have been slow to get rid of; I still occasionally find myself engaging in them. And there have been many other factors also affecting my recovery during this period, so only a part of the recovery can be attributed to the "utilitarianism-excising" with any certainty. Yet, whenever I've found myself engaging in such patterns of thought and managed to eliminate them, I have felt much better as a result. I do still remember a time when a large part of my waking time was driven by utilitarian thinking, and it's impossible for me to properly describe how relieved I feel now that my mind is so much more peaceful.

The other obvious question besides "do I feel better now" is "do I actually get more good things done now"; and I think that the answer is yes there as well. So I don't just feel generally better, I think my actions and motivations are actually more aligned with doing good than they were when I was trying to more explicitly optimize for following utilitarianism and doing good in that way. I still don't feel like I actually get a lot of good done, but I attribute much of this to still not having entirely recovered; I also still don't get a lot done that pertains to my own personal well-being. (I just spent several months basically doing nothing, because this was pretty much the first time when I had the opportunity, finance-wise, to actually take a long stressfree break from everything. It's been amazing, but even after such an extended break, the burnout symptoms still pop up if I'm not careful.)

LW UI issue

14 gworley 24 March 2017 06:08PM

Not really sure where else I might post this, but there seems to be a UI issue on the site. When I hit the homepage of lesswrong.com while logged in I no longer see the user sidebar or the header links for Main and Discussion. This is kind of annoying because I have to click into an article first to get to a page where I can access those things. Would be nice to have them back on the front page.

Musical setting of the Litany of Tarski

14 komponisto 23 March 2017 11:18AM

About a year ago, I made a setting of the Litany of Tarski for four-part a cappella (i.e. unaccompanied) chorus.

More recently, in the process of experimenting with MuseScore for potential use in explaining musical matters on the internet (it makes online sharing of playback-able scores very easy), the thought occurred to me that perhaps the Tarski piece might be of interest to some LW readers (if no one else!), so I went ahead and re-typeset it in MuseScore for your delectation. 

Here it is (properly notated :-)).

Here it is (alternate version designed to avoid freaking out those who aren't quite the fanatical enthusiasts of musical notation that I am).

[Link] "On the Impossibility of Supersized Machines"

13 crmflynn 31 March 2017 11:32PM

In support of Yak Shaving

13 Elo 16 March 2017 05:31AM

Original post:  http://bearlamp.com.au/in-support-of-yak-shaving/


Yak shaving is heralded as pretty much "the devil" of trying to get things done.  The anti-yak shaving movement will identify this problem as being one of focus.  The moral of the story they give is "don't yak shave".

The term was originally coined at MIT's Media Lab, with the description:

Any seemingly pointless activity which is actually necessary to solve a problem which solves a problem which, several levels of recursion later, solves the real problem you're working on.

But I prefer the story by Seth Godin:

"I want to wax the car today."

"Oops, the hose is still broken from the winter. I'll need to buy a new one at Home Depot."

"But Home Depot is on the other side of the Tappan Zee bridge and getting there without my EZPass is miserable because of the tolls."

"But, wait! I could borrow my neighbor's EZPass..."

"Bob won't lend me his EZPass until I return the mooshi pillow my son borrowed, though."

"And we haven't returned it because some of the stuffing fell out and we need to get some yak hair to restuff it."

And the next thing you know, you're at the zoo, shaving a yak, all so you can wax your car.

I disagree with the conclusion to not yak shave, and here's why.


The problem here is that you didn't wax the car because you spent all day shaving yaks (see also "there's a hole in my bucket").  In a startup that translates to not doing the tasks that get customers - the tasks which get money and actually make an impact - because you're busy with things like "playing with the UI".  It's easy to see why such anti-yak shaving sentiment would exist (see also: bikeshedding, rearranging deck chairs on the Titanic, Hamming questions).  You can spend a whole day doing a whole lot of nothings, then get to bed and wonder what you actually accomplished that day (hint: a whole lot of running in circles).

Or at least that's what it looks like on the surface.  But let's look a little deeper into what the problems and barriers are in the classic scenario.

  1. Want to wax car
  2. Broken hose
  3. Hardware store is far away
  4. No EZpass for tolls
  5. Neighbour won't lend the pass until pillow is returned
  6. Broken mooshi pillow
  7. Have to go get yak hair.

So it's not just one problem, but a series of problems that come up in a sequence.  Hopefully by the end of the list you can turn around and walk all the way straight back up the list.  But in the real world there might even be other problems like, you get to the hardware store and realise you don't know the hose-fitting size of your house so you need to call someone at home to check...

On closer inspection, this sort of behaviour is not like bikeshedding at all.  Nor is it doing insignificant things under the guise of "real work".  Instead this is about tackling what stands in the way of your problem.  In problem solving in the real world, "don't yak shave" is not what I have found to be the solution.  When you experience this for the first time, it feels like a sequence of discoveries.  For example, first you discover the hose.  Then you discover the EZpass problem, then you discover the pillow problem, at which point you are pretty sick of trying to wax your car and want a break or to work on something else.


I propose that classic yak shaving presents a very important sign that things are broken.  In order to get to the classic scenario we had to

  1. have borrowed a pillow from our neighbour,
  2. have it break and not get fixed,
  3. not own our own EZpass,
  4. live far from a hardware store,
  5. have a broken hose, and
  6. want to wax a car.  

Each item in this scenario presents an open problem or an open loop.  Yak shaving is a warning sign that you are in a Swiss-cheese-model scenario of problems.  This might sound familiar because it's the kind of situation which led to the Fukushima reactor meltdown.  It's the kind of scenario where you try to work out why the handyman fell off your roof and died, and you notice that:

  1. He wasn't wearing a helmet.
  2. He wasn't tied on safely.
  3. His ladder wasn't tied down.
  4. It was a windy day.
  5. His harness was old and worn out.
  6. He was on his phone while on the roof...

And you realise that any five of those things alone might not have caused much of a problem.  But put all six of those mistakes together, line the wind up in just the right way, and everything comes tumbling down.


Yak shaving is a sign that you are living with problems waiting to crash down.  And living in a situation where you don't have time to do the sort of maintenance that would fix things and keep smoulders from bursting into flames.

I can almost guarantee that when your house of cards comes falling down, it will happen on a day when you don't have the spare time to waste on ridiculous-seeming problems.


What should you do if you are in this situation?

Yak shave.  The best thing you can do if half your projects are unfinished and spread around the room is to tidy up.  Get things together, organise things, initiate the GTD system (or any system), wrap up old bugs, close the open loops (advice from GTD), and, as many times as you can, YAK SHAVE for all you are worth!

If something is broken, and you are living with it, that's not acceptable.  You need a system in your life to regularly get around to fixing it.  Notepads, reviews, list keeping - set time aside for doing it and plan to fix things.

So I say, Yak Shave, as much, as long, and as many times as it takes till there are no more yaks to shave.


Something not mentioned often enough is a late addition to my list of common human goals.

Improve the tools available – sharpen the axe, write a new app that can do the thing you want, invent systems that work for you.  Prepare for when the rest of the work comes along.

People often ask how you can plan for lucky breaks in your life.  How do you cultivate opportunity?  I can tell you right here and now, this is how.

Keep a toolkit at the ready, a work-space (post coming soon) at the ready, spare time for things to go wrong and things to go right.  And don't forget to play.  Why do we sharpen the axe?  Clear Epistemics, or clear Instrumental Rationality.  Be prepared for the situation that will come up.

Yak Shave like your life depends on it.  Because your life might one day depend on it.  Your creativity certainly does.


Meta: this took 2.5 hrs to write.

[Link] David Chalmers on LessWrong and the rationalist community (from his reddit AMA)

13 ignoranceprior 22 February 2017 07:07PM

Increasing GDP is not growth

13 PhilGoetz 16 February 2017 06:04PM

I just saw another comment implying that immigration was good because it increased GDP.  Over the years, I've seen many similar comments in the LW / transhumanist / etc bubble claiming that increasing a country's population is good because it increases its GDP.  These are generally used in support of increasing either immigration or population growth.

It doesn't, however, make sense.  People have attached a positive valence to certain words, then moved those words into new contexts.  They did not figure out what they actually want to optimize and do the math.

I presume they want to optimize wealth or productivity per person.  You wouldn't try to make Finland richer by absorbing China.  Its GDP would go up, but its GDP per person would go way down.
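
A rough back-of-the-envelope version of that point, with approximate circa-2016 figures (rounded, for illustration only):

    # Approximate 2016 figures, rounded; illustrative only.
    finland_gdp, finland_pop = 0.24e12, 5.5e6   # ~$0.24 trillion, ~5.5 million people
    china_gdp, china_pop = 11.2e12, 1.38e9      # ~$11.2 trillion, ~1.38 billion people

    print(finland_gdp / finland_pop)                              # ~$44,000 per person
    print((finland_gdp + china_gdp) / (finland_pop + china_pop))  # ~$8,300 per person
    # Total GDP goes way up; GDP per person goes way down.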

continue reading »

[Link] Slate Star Codex Notes on the Asilomar Conference on Beneficial AI

13 Gunnar_Zarncke 07 February 2017 12:14PM

Akrasia Tactics Review 3: The Return of the Akrasia

12 malcolmocean 10 April 2017 03:05PM

About three and a half years ago, polutropon ran an akrasia tactics review, following the one orthonormal ran three and a half years prior to that: an open-ended survey asking Less Wrong posters to give numerical scores to productivity techniques that they'd tried, with the goal of getting a more objective picture of how well different techniques work (for the sort of people who post here). Since it's been years since the others and the rationality community has grown and developed significantly while retaining akrasia/motivation/etc as a major topic, I thought it'd be useful to have a new one!

(Malcolm notes: it seems particularly likely that this time there will be some noteworthy individually-invented techniques, as people seem to be doing a lot of that sort of thing these days!)

A lightly modified version of the instructions from the previous post:

  1. Note what technique you've tried. Techniques can be anything from productivity systems (Getting Things Done, Complice) to social incentives (precommitting in front of friends) to websites or computer programs (Beeminder, Leechblock) to chemical aids (Modafinil, Caffeine). If it's something that you can easily link to information about, please provide a link and I'll add it when I list the technique; if you don't have a link, describe it in your comment and I'll link that. It could also be a cognitive technique you developed or copied from a friend, which might not have a clear name but you can give it one if you like!
  2. Give your experience with it a score from -10 to +10 (0 if it didn't change the status quo, 10 if it ended your akrasia problems forever with no unwanted side effects, negative scores if it actually made your life worse, -10 if it nearly killed you). For simplicity's sake, I'll only include reviews that give numerical scores.
  3. Describe your experience with it, including any significant side effects. Please also say approximately how long you've been using it, or if you don't use it anymore how long you used it before giving up.

Every so often, I'll combine all the data back into the main post, listing every technique that's been reviewed at least twice with the number of reviews, average score, standard deviation and common effects. I'll do my best to combine similar techniques appropriately, but it'd be appreciated if you could try to organize it a bit by replying to people doing similar things and/or saying if you feel your technique is (dis)similar to another.
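
For concreteness, the aggregation I have in mind looks roughly like this (a sketch; the technique names and scores below are made up):

    from statistics import mean, stdev

    # Hypothetical (technique, score) reviews.
    reviews = [("Beeminder", 4), ("Beeminder", 7), ("Complice", 6),
               ("Pomodoros", 3), ("Pomodoros", 5), ("Pomodoros", -1)]

    by_technique = {}
    for technique, score in reviews:
        by_technique.setdefault(technique, []).append(score)

    # Only techniques reviewed at least twice get listed.
    for technique, scores in sorted(by_technique.items()):
        if len(scores) >= 2:
            print(technique, len(scores), round(mean(scores), 1), round(stdev(scores), 1))
    # e.g. Beeminder 2 5.5 2.1 / Pomodoros 3 2.3 3.1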

I'm not going to provide an initial list due to the massive number of possible techniques and concern about prejudicing answers, but you can look back on the list in the last post or the previous one if you want. If you have any suggestions for how to organize this (that wouldn't require huge amounts of extra effort on my part), I'm open to hearing them.

Thanks for your data!

(There's a meta thread here for comments that aren't answers to the main prompt.)

Planning 101: Debiasing and Research

12 lifelonglearner 03 February 2017 03:01PM

Planning 101: Techniques and Research

<Cross-posted from my blog>

[Epistemic status: Relatively strong. There are numerous studies showing that predictions often become miscalibrated. Overconfidence in itself appears fairly robust, appearing in different situations. The actual mechanism behind the planning fallacy is less certain, though there is evidence for the inside/outside view model. The debiasing techniques are supported, but more data on their effectiveness could be good.]

Humans are often quite overconfident, and perhaps for good reason. Back on the savanna and even some places today, bluffing can be an effective strategy for winning at life. Overconfidence can scare down enemies and avoid direct conflict.

When it comes to making plans, however, overconfidence can really screw us over. You can convince everyone (including yourself) that you’ll finish that report in three days, but it might still really take you a week. Overconfidence can’t intimidate advancing deadlines.

I’m talking, of course, about the planning fallacy, our tendency to make unrealistic predictions and plans that just don’t work out.

Being a true pessimist ain’t easy.

Students are a prime example of victims to the planning fallacy:

First, students were asked to predict when they were 99% sure they’d finish a project. When the researchers followed up with them later, though, only about 45%, less than half of the students, had actually finished by their own predicted times [Buehler, Griffin, Ross, 1995].

Even more striking, students working on their psychology honors theses were asked to predict when they’d finish, “assuming everything went as poorly as it possibly could.” Yet, only about 30% of students finished by their own worst-case estimate [Buehler, Griffin, Ross, 1995].

Similar overconfidence was also found in Japanese and Canadian cultures, giving evidence that this is a human (and not US-culture-based) phenomenon. Students continued to make optimistic predictions, even when they knew the task had taken them longer last time [Buehler and Griffin, 2003, Buehler et al., 2003].

As a student myself, though, I don’t mean to just pick on us.

The planning fallacy affects projects across all sectors.

An overview of public transportation projects found that most of them were, on average, 20–45% above the estimated cost. In fact, research has shown that these poor predictions haven’t improved at all in the past 30 years [Flyvbjerg 2006].

And there’s no shortage of anecdotes, from the Scottish Parliament Building, which cost 10 times more than expected, or the Denver International Airport, which took over a year longer and cost several billion more.

When it comes to planning, we suffer from a major disparity between our expectations and reality. This article outlines the research behind why we screw up our predictions and gives three suggested techniques to suck less at planning.

 

The Mechanism:

So what’s going on in our heads when we make these predictions for planning?

On one level, we just don’t expect things to go wrong. Studies have found that we’re biased towards not looking at pessimistic scenarios [Newby-Clark et al., 2000]. We often just assume the best-case scenario when making plans.

Part of the reason may also be due to a memory bias. It seems that we might underestimate how long things take us, even in our memory [Roy, Christenfeld, and McKenzie 2005].

But by far the dominant theory in the field is the idea of an inside view and an outside view [Kahneman and Lovallo 1993]. The inside view is the information you have about your specific project (inside your head). The outside view is what someone else looking at your project (outside of the situation) might say.

Obviously you want to take the Outside View.

 

We seem to use inside view thinking when we make plans, and this leads to our optimistic predictions. Instead of thinking about all the things that might go wrong, we’re focused on how we can help our project go right.

Still, it’s the outside view that can give us better predictions. And it turns out we don’t even need to do any heavy-lifting in statistics to get better predictions. Just asking other people (from the outside) to predict your own performance, or even just walking through your task from a third-person point of view can improve your predictions [Buehler et al., 2010].

Basically, the difference in our predictions seems to depend on whether we’re looking at the problem in our heads (a first-person view) or outside our heads (a third-person view). Whether we’re the “actor” or the “observer” in our minds seems to be a key factor in our planning [Pronin and Ross 2006].


Debiasing Techniques:

I’ll be covering three ways to improve predictions: Murphyjitsu, Reference Class Forecasting (RCF), and Back-planning. In actuality, they’re all pretty much the same thing; all three techniques focus, on some level, on trying to get more of an outside view. So feel free to choose the one you think works best for you (or do all three).

For each technique, I’ll give an overview and cover the steps first and then end with the research that supports it. They might seem deceptively obvious, but do try to keep in mind that obvious advice can still be helpful!

(Remembering to breathe, for example, is obvious, but you should still do it anyway. If you don't want to suffocate.)

 

Murphyjitsu:

“Avoid Obvious Failures”


Almost as good as giving procrastination an ass-kicking.

The name Murphyjitsu comes from the infamous Murphy’s Law: “Anything that can go wrong, will go wrong.” The technique itself is from the Center for Applied Rationality (CFAR), and is designed for “bulletproofing your strategies and plans”.

Here are the basic steps:

  1. Figure out your goal. This is the thing you want to make plans to do.
  2. Write down which specific things you need to get done to make the thing happen. (Make a list.)
  3. Now imagine it’s one week (or month) later, and yet you somehow didn’t manage to get started on your goal. (The visualization part here is important.) Are you surprised?
  4. Why? (What went wrong that got in your way?)
  5. Now imagine you take steps to remove the obstacle from Step 4.
  6. Return to Step 3. Are you still surprised that you’d fail? If so, your plan is probably good enough. (Don’t fool yourself!)
  7. If failure still seems likely, go through Steps 3–6 a few more times until you “problem proof” your plan.

Murphyjitsu is based on a strategy called a “premortem” or “prospective hindsight”, which basically means imagining the project has already failed and “looking backwards” to see what went wrong [Klein 2007].

It turns out that putting ourselves in the future and looking back can help identify more risks, or see where things can go wrong. Prospective hindsight has been shown to increase our predictive power so we can make adjustments to our plans — before they fail [Mitchell et al., 1989, Veinott et al., 2010].

This seems to work well, even if we’re only using our intuitions. While that might seem a little weird at first (“aren’t our intuitions pretty arbitrary?”), research has shown that our intuitions can be a good source of information in situations where experience is helpful [Klein 1999; Kahneman 2011]*.

While a premortem is usually done on an organizational level, Murphyjitsu works for individuals. Still, it taps into the same internal mechanisms and is a useful way to “failure-proof” your plans before you start them.

Here’s what Murphyjitsu looks like in action:

“First, let’s say I decide to exercise every day. That’ll be my goal (Step 1). But I should also be more specific than that, so it’s easier to tell what “exercising” means. So I decide that I want to go running on odd days for 30 minutes and do strength training on even days for 20 minutes. And I want to do them in the evenings (Step 2).

Now, let’s imagine that it’s one week later, and I didn’t go exercising at all! What went wrong? (Step 3) The first thing that comes to mind is that I forgot to remind myself, and it just slipped my mind (Step 4). Well, what if I set some phone / email reminders? Is that good enough? (Step 5)

Once again, let’s imagine it’s one week later: I set the reminder, but let’s say I still didn’t go exercising. How surprising is this? (Back to Step 3) Hmm, I can see myself getting sore and/or putting other priorities before it… (Step 4). So maybe I’ll also set aside the same time every day, so I can’t easily weasel out (Step 5).

How do I feel now? (Back to Step 3) Well, if I imagine it’s one week later and I failed yet again, I’d be pretty surprised. My plan has two levels of fail-safes and I do want to exercise anyway. Looks like it’s good! (Done)”
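If you like working through checklists at a keyboard, here is a minimal, hypothetical Python sketch of the Murphyjitsu loop (Steps 3–7). The function and variable names are my own; the code only scaffolds the prompts, and the actual visualization and surprise-check still have to happen in your head.

```python
def murphyjitsu(goal, plan_steps):
    """Prompt-driven walk through the Murphyjitsu loop (Steps 3-7).

    The code only asks the questions; you still have to do the
    visualization honestly for it to be worth anything.
    """
    print(f"Goal: {goal}")
    for step in plan_steps:
        print(f"  - {step}")

    safeguards = []
    while True:
        surprised = input(
            "Imagine it's one week later and you never got started. "
            "Are you surprised? (y/n) "
        ).strip().lower()
        if surprised == "y":
            break  # failure would genuinely surprise you: plan is probably good enough
        obstacle = input("What went wrong? ")
        fix = input(f"What step would remove '{obstacle}'? ")
        safeguards.append((obstacle, fix))

    return safeguards


# Hypothetical example session:
# murphyjitsu("Exercise every day",
#             ["Run 30 min on odd days", "Strength train 20 min on even days"])
```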


Reference Class Forecasting:

“Get Accurate Estimates”


Predicting the future…using the past!

Reference class forecasting (RCF) is all about using the outside view. Our inside views tend to be very optimistic: We will see all the ways that things can go right, but none of the ways things can go wrong. By looking at past history — other people who have tried the same or similar thing as us — we can get a better idea of how long things will really take.

Here are the basic steps:

  1. Figure out what you want to do.
  2. Check your records to see how long it took you last time.
  3. That’s your new prediction.
  4. If you don’t have past information, look up about how long it takes, on average, to do your thing. (This usually looks like Googling “average time to do X”.)**
  5. That’s your new prediction!

Technically, the actual process for reference class forecasting works a little differently. It involves a statistical distribution and some additional calculations, but for most everyday purposes, the above algorithm should work well enough.

In both cases, we’re trying to take an outside view, which we know improves our estimates [Buehler et al., 1994].

When you Google the average time or look at your own data, you’re forming a “reference class”, a group of related actions that can give you info about how long similar projects tend to take. Hence, the name “reference class forecasting”.

Basically, RCF works by looking only at results. This means we can avoid any potential biases that might have cropped up if we were to reason it through ourselves. We’re shortcutting right to the data. The rest of it is basic statistics: most people are close to average, so if we have an idea of what the average looks like, we can expect to land pretty close to it as well [Flyvbjerg 2006; Flyvbjerg 2008].

The main difference between the algorithm above and the standard one is that ours focuses on your own experiences, so the estimate you get tends to be more accurate than an average drawn from an entire population.

For example, if it usually takes me about 3 hours to finish homework (I use Toggl to track my time), then I’ll predict that it will take me 3 hours today, too.

It’s obvious that RCF is incredibly simple. It literally just tells you that how long something will take you this time will be very close to how long it took you last time. But that doesn’t mean it’s ineffective! Often, the past is a good benchmark of future performance, and it’s far better than any naive prediction your brain might spit out.
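Because the everyday version is so simple, it fits in a few lines of code. Here is a minimal sketch of the informal RCF algorithm above (not Flyvbjerg’s full distribution-based procedure); the function name and the sample numbers are hypothetical.

```python
import statistics

def rcf_estimate(past_durations_hours):
    """Predict this attempt from how long the same kind of task took before.

    The median is used so one unusually fast or slow session
    doesn't drag the forecast around.
    """
    if not past_durations_hours:
        raise ValueError("No reference class yet; look up a population average instead.")
    return statistics.median(past_durations_hours)


# e.g. my last few homework sessions, logged with a time tracker
print(rcf_estimate([2.5, 3.0, 3.5, 3.0]))  # -> 3.0 hours
```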

RCF + Murphyjitsu Example:

For me, I’ve found a mixture of Reference Class Forecasting and Murphyjitsu to be helpful for reducing overconfidence in my plans.

When starting projects, I will often ask myself, “What were the reasons that I failed last time?” I then make a list of the first three or four “failure-modes” that I can recall. I now make plans to preemptively avoid those past errors.

(This can also be helpful in reverse — asking yourself, “How did I solve a similar difficult problem last time?” when facing a hard problem.)

Here’s an example:

“Say I’m writing a long post (like this one) and I want to know what might go wrong. I’ve done several of these sorts of primers before, so I have a “reference class” of data to draw from. So what were the major reasons I fell behind on those posts?

<Cue thinking>

Hmm, it looks like I would either forget about the project, get distracted, or lose motivation. Sometimes I’d want to do something else instead, or I wouldn’t be very focused.

Okay, great. Now what are some ways that I might be able to “patch” those problems?

Well, I can definitely start by making a priority list of my action items, so I know which things I want to finish first. I can also do short 5-minute planning sessions to make sure I’m actually writing. And I can do some more introspection to try and see what’s up with my motivation.”

 

Back-planning:

“Calibrate Your Intuitions with Reality”

Back-planning involves, as you might expect, planning from the end. Instead of thinking about where we start and how to move forward, we imagine we’re already at our goal and go backwards.

Time-travelling inside your internal universe.

Here are the steps:

  1. Figure out the task you want to get done.
  2. Imagine you’re at the end of your task.
  3. Now move backwards, step-by-step. What is the step right before you finish?
  4. Repeat Step 3 until you get to where you are now.
  5. Write down how long you think the task will now take you.
  6. You now have a detailed plan as well as a better prediction!

The experimental evidence on back-planning basically suggests that people who plan backwards predict longer times to start and finish projects.

There are a few interesting hypotheses about why back-planning seems to improve predictions. The general gist of these theories is that back-planning is a weird, counterintuitive way to think about things, which means it disrupts a lot of mental processes that can lead to overconfidence [Wiese et al., 2012].

This means that back-planning can make it harder to fall into the groove of the easy “best-case” planning we default to. Instead, we need to actually look at where things might go wrong. Which is, of course, what we want.

In my own experience, I’ve found that going through a quick back-planning session can help my intuitions “warm up” to my prediction. As in, I’ll get an estimate from RCF, but it still feels “off”. Walking through the plan with back-planning can help all the parts of me understand that it really will probably take longer.

Here’s the back-planning example:

“Right now, I want to host a talk at my school. I know that’s the end goal (Step 1). So the end goal is me actually finishing the talk and taking questions (Step 2). What happens right before that? (Step 3). Well, people would need to actually be in the room. And I would have needed a room.

Is that all? (Step 3). Also, for people to show up, I would have needed publicity. Probably also something on social media. I’d need to publicize at least a week in advance, or else it won’t be common knowledge.

And what about the actual talk? I would have needed slides, and maybe to memorize my talk. Also, I’d need to figure out what my talk is actually going to be on.

Huh, thinking it through like this, I’d need something like 3 weeks to get it done. One week for the actual slides, one week for publicity (at least), and one week for everything else that might go wrong.

That feels more ‘right’ than my initial estimate of ‘I can do this by next week.’”
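A back-planning pass is easy to jot down as a list of “the step right before that” plus a rough duration. Here is a small, hypothetical Python sketch of the talk example above; the step names and day counts are only illustrative.

```python
def back_plan(goal, steps_backwards):
    """Walk a plan from the end back to now and total the lead time.

    steps_backwards: list of (step, days), starting with the step right
    before the goal and ending where you are today.
    """
    total_days = 0
    print(f"Goal: {goal}")
    for step, days in steps_backwards:
        total_days += days
        print(f"  before that: {step} (~{days} day(s))")
    print(f"Estimated lead time: ~{total_days} day(s)")
    return total_days


back_plan("Host a talk at school", [
    ("Room booked and people actually show up", 2),
    ("Publicity out, at least a week in advance", 7),
    ("Slides written and talk rehearsed", 7),
    ("Topic chosen", 5),
])  # -> ~21 days, i.e. roughly the three weeks from the example
```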

 

Experimental Ideas:

Murphyjitsu, Reference Class Forecasting, and Back-planning are the three debiasing techniques that I’m fairly confident work well. This section is far more anecdotal: these are ideas that I think are useful and interesting, but I don’t have much formal backing for them.

Decouple Predictions From Wishes:

In my own experience, I often find it hard to separate when I want to finish a task versus when I actually think I will finish a task. This is a simple distinction to keep in mind when making predictions, and I think it can help decrease optimism. The most important number, after all, is when I actually think I will finish—it’s what’ll most likely actually happen.

There’s some evidence suggesting that “wishful thinking” could actually be responsible for some poor estimates, but it’s far from definitive [Buehler et al., 1997; Krizan and Windschitl 2009].

Incentivize Correct Predictions:

Lately, I’ve been using a 4-column chart for my work. I write down the task in Column 1 and how long I think it will take me in Column 2. Then I go and do the task. After I’m done, I write down how long it actually took me in Column 3. Column 4 is the absolute value of Column 2 minus Column 3, or my “calibration score”.

The idea is to minimize my score every day. It’s simple and it’s helped me get a better sense for how long things really take.
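For anyone who prefers a spreadsheet or script to paper, here is a minimal sketch of the same four-column log as a CSV plus a scoring function. The column names (`task`, `predicted_min`, `actual_min`) and the file name are my own choices, not part of the original chart.

```python
import csv

def calibration_score(log_path):
    """Sum of |predicted - actual| minutes over the day's tasks.

    Expected CSV columns: task, predicted_min, actual_min.
    Lower is better; the game is to shrink this number over time.
    """
    total = 0.0
    with open(log_path, newline="") as f:
        for row in csv.DictReader(f):
            total += abs(float(row["predicted_min"]) - float(row["actual_min"]))
    return total


# print(calibration_score("today.csv"))
```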

Plan For Failure:

In my schedules, I specifically write in “distraction time”. If you aren’t doing this, you may want to consider it. Most of us (me included) have wandering attentions, and I know I’ll lose at least some time to silly things every day.

Double Your Estimate:

I get it. The three debiasing techniques I outlined above can sometimes take too long. In a pinch, you can probably approximate good predictions by just doubling your naive prediction.

Most people tend to be less than 2X overconfident, but I think (pessimistically) sticking to doubling is probably still better than something like 1.5X.

 

Working in Groups:

Obviously, because groups are made of individuals, we’d expect them to be susceptible to the same overconfidence biases I covered earlier. Though some research has shown that groups are less susceptible to bias, more studies have shown that group predictions can be far more optimistic than individual predictions [Wright and Wells 1985; Buehler et al., 2010]. “Groupthink” is a term used to describe the observed failings of decision making in groups [Janis 1982].

Groupthink (and hopefully also overconfidence) can be countered by either assigning a “Devil’s Advocate” or engaging in “dialectical inquiry” [Lunenburg 2012]:

We give out more than cookies over here

A Devil’s Advocate is a person who is actively trying to find fault with the group’s plans, looking for holes in reasoning or other objections. It’s suggested that the role rotates, and it’s associated with other positives like improved communication skills.

A dialectical inquiry is where multiple teams try to create the best plan and then present them. Discussion happens, and the group selects the best parts of each plan. It’s a little like building something awesome out of lots of pieces, like a giant robot.

This is absolutely how dialectical inquiry works in practice.

For both strategies, research has shown that they lead to “higher-quality recommendations and assumptions” (compared to not doing them), although they can also reduce group satisfaction and acceptance of the final decision [Schweiger et al. 1986].

(Pretty obvious though; who’d want to keep chatting with someone hell-bent on poking holes in your plan?)

 

Conclusion:

If you’re interested in learning (even) more about the planning fallacy, I’d highly recommend the paper The Planning Fallacy: Cognitive, Motivational, and Social Origins by Roger Buehler, Dale Griffin, and Johanna Peetz. Most of the material in this guide was taken from their paper. Do go check it out! It’s free!

Remember that everyone is overconfident (you and me included!), and that failing to plan is the norm. There are scary unknown unknowns out there that we just don’t know about!

Good luck and happy planning!

 

Footnotes:

* Just don’t go and start buying lottery tickets with your gut. We’re talking about fairly “normal” things like catching a ball, where your intuitions give you accurate predictions about where the ball will land. (Instead of, say, calculating the actual projectile motion equation in your head.)

** In a pinch, you can just use your memory, but studies have shown that our memory tends to be biased too. So as often as possible, try to use actual measurements and numbers from past experience.


Works Cited:

Buehler, Roger, Dale Griffin, and Johanna Peetz. "The Planning Fallacy: Cognitive, Motivational, and Social Origins." Advances in Experimental Social Psychology 43 (2010): 1-62. Social Science Research Network.

Buehler, Roger, Dale Griffin, and Michael Ross. "Exploring the Planning Fallacy: Why People Underestimate their Task Completion Times." Journal of Personality and Social Psychology 67.3 (1994): 366.

Buehler, Roger, Dale Griffin, and Heather MacDonald. "The Role of Motivated Reasoning in Optimistic Time Predictions." Personality and Social Psychology Bulletin 23.3 (1997): 238-247.

Buehler, Roger, Dale Griffin, and Michael Ross. "It's About Time: Optimistic Predictions in Work and Love." European Review of Social Psychology 6 (1995): 1-32.

Buehler, Roger, et al. "Perspectives on Prediction: Does Third-Person Imagery Improve Task Completion Estimates?" Organizational Behavior and Human Decision Processes 117.1 (2012): 138-149.

Buehler, Roger, Dale Griffin, and Michael Ross. "Inside the Planning Fallacy: The Causes and Consequences of Optimistic Time Predictions." Heuristics and Biases: The Psychology of Intuitive Judgment (2002): 250-270.

Buehler, Roger, and Dale Griffin. "Planning, Personality, and Prediction: The Role of Future Focus in Optimistic Time Predictions." Organizational Behavior and Human Decision Processes 92 (2003): 80-90.

Flyvbjerg, Bent. "From Nobel Prize to Project Management: Getting Risks Right." Project Management Journal 37.3 (2006): 5-15. Social Science Research Network.

Flyvbjerg, Bent. "Curbing Optimism Bias and Strategic Misrepresentation in Planning: Reference Class Forecasting in Practice." European Planning Studies 16.1 (2008): 3-21.

Janis, Irving Lester. Groupthink: Psychological Studies of Policy Decisions and Fiascoes. 1982.

Johnson, Dominic D. P., and James H. Fowler. "The Evolution of Overconfidence." Nature 477.7364 (2011): 317-320.

Kahneman, Daniel. Thinking, Fast and Slow. Macmillan, 2011.

Kahneman, Daniel, and Dan Lovallo. "Timid Choices and Bold Forecasts: A Cognitive Perspective on Risk Taking." Management Science 39.1 (1993): 17-31.

Klein, Gary. Sources of Power: How People Make Decisions. MIT Press, 1999.

Klein, Gary. "Performing a Project Premortem." Harvard Business Review 85.9 (2007): 18-19.

Krizan, Zlatan, and Paul D. Windschitl. "Wishful Thinking About the Future: Does Desire Impact Optimism?" Social and Personality Psychology Compass 3.3 (2009): 227-243.

Lunenburg, F. "Devil's Advocacy and Dialectical Inquiry: Antidotes to Groupthink." International Journal of Scholarly Academic Intellectual Diversity 14 (2012): 1-9.

Mitchell, Deborah J., J. Edward Russo, and Nancy Pennington. "Back to the Future: Temporal Perspective in the Explanation of Events." Journal of Behavioral Decision Making 2.1 (1989): 25-38.

Newby-Clark, Ian R., et al. "People Focus on Optimistic Scenarios and Disregard Pessimistic Scenarios While Predicting Task Completion Times." Journal of Experimental Psychology: Applied 6.3 (2000): 171.

Pronin, Emily, and Lee Ross. "Temporal Differences in Trait Self-Ascription: When the Self Is Seen as an Other." Journal of Personality and Social Psychology 90.2 (2006): 197.

Roy, Michael M., Nicholas J. S. Christenfeld, and Craig R. M. McKenzie. "Underestimating the Duration of Future Events: Memory Incorrectly Used or Memory Bias?" Psychological Bulletin 131.5 (2005): 738.

Schweiger, David M., William R. Sandberg, and James W. Ragan. "Group Approaches for Improving Strategic Decision Making: A Comparative Analysis of Dialectical Inquiry, Devil's Advocacy, and Consensus." Academy of Management Journal 29.1 (1986): 51-71.

Veinott, Beth. "Klein, and Sterling Wiggins,“Evaluating the Effectiveness of the Premortem

Technique on Plan Confidence,”." Proceedings of the 7th International ISCRAM Conference (May, 2010).

Wiese, Jessica, Roger Buehler, and Dale Griffin. "Backward Planning: Effects of Planning Direction on Predictions of Task Completion Time." Judgment and Decision Making 11.2 (2016): 147.

Wright, Edward F., and Gary L. Wells. "Does Group Discussion Attenuate the Dispositional Bias?" Journal of Applied Social Psychology 15.6 (1985): 531-546.

[Link] The "I Already Get It" Slide

12 jsalvatier 01 February 2017 03:11AM

Value Journaling

12 ProofOfLogic 25 January 2017 06:10AM

I like to link to the Minding Our Way sequence on overcoming guilt a lot, but I've recently gone and added "information hazard" warnings to several of my posts which link there. Someone pointed out to me that the sequence destroys some people's current (guilt-based) motivation without successfully building up an alternative, making it a somewhat risky thing to try.

In light of that problem, I was thinking about what practices might help build up the kind of positive motivation that sequence is aiming at.

I was also listening to an audiobook on cognitive behavioral therapy. The book mentioned gratitude journaling, a practice which has proved surprisingly effective for boosting mood and getting longer and more refreshing sleep. The practice is simple: every week, write down five things which you were grateful for. (Once a week seems to be about right; writing more often is less effective.)

Gratitude journaling is the proven practice here, and if you want guaranteed results, you're better off trying it rather than the technique I'm going to describe here. But, we'd never have new techniques if someone didn't make them up!

I wanted to make a version of gratitude journaling which might be more suited to aspiring rationalists. I decided that it could be combined with the idea of value affirmation. Value affirmation (surprisingly) shows positive effects a year later, after just 15 minutes spent writing about what you value in life. Might it be useful to write about what we value more often? Perhaps repeating the value-affirmation exercise exactly would get old fast (since values do not change that much from week to week), but if we tie our values to things which happened recently, we get something which looks a lot like gratitude journaling.

That's the basic idea -- write about what you valued over the past week. What follows are my elaborations based on several weeks of trying it out.


Concrete Takeaways Post-CFAR

11 lifelonglearner 24 February 2017 06:31PM

Concrete Takeaways:

[So I recently volunteered at a CFAR workshop. This is part five of a five-part series on how I changed my mind. It's split into 3 sections: TAPs, Heuristics, and Concepts. They get progressively more abstract. It's also quite long at around 3,000 words, so feel free to just skip around and see what looks interesting.]

 

(I didn't post Part 3 and Part 4 on LW, as they're more speculative and arguably less interesting, but I've linked to them on my blog if anyone's interested.)

 

This is a collection of TAPs, heuristics, and concepts that I’ve been thinking about recently. Many of them were inspired by my time at the CFAR workshop, but there’s no real underlying theme behind it all. It’s just a collection of ideas that are either practical or interesting.

 


TAPs:

TAPs, or Trigger Action Planning, is a CFAR technique that is used to build habits. The basic idea is you pair a strong, concrete sensory “trigger” (e.g. “when I hear my alarm go off”) with a “plan”—the thing you want to do (e.g. “I will put on my running shoes”).


If you’re good at noticing internal states, TAPs can also use your feelings or other internal things as a trigger, but it’s best to try this with something concrete first to get the sense of it.


Some of the more helpful TAPs I’ve recently been thinking about are below:


Ask for Examples TAP:

[Notice you have no mental picture of what the other person is saying. → Ask for examples.]


Examples are good. Examples are god. I really, really like them.


In conversations about abstract topics, it can be easy to understand the meaning of the words that someone said, yet still miss the mental intuition of what they’re pointing at. Asking for an example clarifies what they mean and helps you understand things better.


The trigger for this TAP is noticing that what someone said gave you no mental picture.


I may be extrapolating too far from too little data here, but it seems like people do try to “follow along” with things in their head when listening. And if this mental narrative, simulation, or whatever internal thing you’re doing comes up blank when someone’s speaking, then this may be a sign that what they said was unclear.


Once you notice this, you ask for an example of what gave you no mental picture. Ideally, the other person can then respond with a more concrete statement or clarification.


Quick Focusing TAP:

[Notice you feel aversive towards something → Be curious and try to source the aversion.]


Aversion Factoring, Internal Double Crux, and Focusing are all techniques CFAR teaches to help deal with internal feelings of badness.


While there are definite nuances between all three techniques, I’ve sort of abstracted from the general core of “figuring out why you feel bad” to create an in-the-moment TAP I can use to help debug myself.


The trigger is noticing a mental flinch or an ugh field, where I instinctively shy away from looking too hard.


After I notice the feeling, my first step is to cultivate a sense of curiosity. There’s no sense of needing to solve it; I’m just interested in why I’m feeling this way.


Once I’ve directed my attention to the mental pain, I try to source the discomfort. Using some backtracking and checking multiple threads (e.g. “is it because I feel scared?”) allows me to figure out why. This whole process takes maybe half a minute.


When I’ve figured out the reason why, a sort of shift happens, similar to the felt shift in focusing. In a similar way, I’m trying to “ground” the nebulous, uncertain discomfort, forcing it to take shape.


I’d recommend trying some Focusing before trying this TAP, as it’s basically an expedited version of it, hence the name.


Rule of Reflexivity TAP:

[Notice you’re judging someone → Recall an instance where you did something similar / construct a plausible internal narrative]

[Notice you’re making an excuse → Recall times where others used this excuse and update on how you react in the future.]


This is a TAP that was born out of my observation that our excuses seem way more self-consistent when we’re the ones saying them. (Oh, why hello there, Fundamental Attribution Error!) The point of practicing the Rule of Reflexivity is to build empathy.


The Rule of Reflexivity goes both ways. In the first case, you want to notice if you’re judging someone. This might feel like ascribing a value judgment to something they did, e.g. “This person is stupid and made a bad move.”


The response is to recall times where either you did something similar or (if you think you’re perfect) think of a plausible set of events that might have caused them to act in this way. Remember that most people don’t think they’re acting stupidly; they’re just doing what seems like a good idea from their perspective.


In the second case, you want to notice when you’re trying to justify your own actions. If the excuses you yourself make sound suspiciously like things you’ve heard others say before, then you may want to be less quick to dismiss those excuses when you hear them from others in the future.


Keep Calm TAP:

[Notice you’re starting to get angry → Take a deep breath → Speak softer and slower]


Okay, so this TAP is probably not easy to do because you’re working against a biological response. But I’ve found it useful in several instances where otherwise I would have gotten into a deeper argument.


The trigger, of course, is noticing that you’re angry. For me, this feels like an increased tightness in my chest and a desire to raise my voice. I may feel like a cherished belief of mine is being attacked.


Once I notice these signs, I remember that I have this TAP which is about staying calm. I think something like, “Ah yes, I’m getting angry now. But I previously already made the decision that it’d be a better idea to not yell.”


After that, I take a deep breath, and I try to open up my stance. Then I remember to speak in a slower and quieter tone than previously. I find this TAP especially helpful in arguments—ahem, collaborative searches for the truth—where things get a little too heated on both sides.

 


Heuristics:

Heuristics are algorithm-like things you can do to help get better results. I think that it’d be possible to turn many of the heuristics below into TAPs, but there’s a sense of deliberately thinking things out that separates these from just the “mindless” actions above.


As more formal procedures, these heuristics do require you to remember to Take Time to do them well. However, I think the sorts of benefits you get from them make it worth the slight investment in time.

 


Modified Murphyjitsu: The Time Travel Reframe:

(If you haven’t read up on Murphyjitsu yet, it’d probably be good to do that first.)


Murphyjitsu is based on the idea of a premortem, where you imagine that your project failed and you’re looking back. I’ve always found this to be a weird temporal framing, and I realized there’s a potentially easier way to describe things:


Say you’re sitting at your desk, getting ready to write a report on intertemporal travel. You’re confident you can finish before the hour is over. What could go wrong? Closing Facebook, you begin to start typing.


Suddenly, you hear a loud CRACK! A burst of light floods your room as a figure pops into existence, dark and silhouetted by the brightness behind it. The light recedes, and the figure crumples to the ground. Floating in the air is a whirring gizmo, filled with turning gears. Strangely enough, your attention is drawn from the gizmo to the person on the ground:


The figure has a familiar sort of shape. You approach, tentatively, and find the spitting image of yourself! The person stirs and speaks.


“I’m you from one week into the future,” your future self croaks. Your future self tries to get up, but sinks down again.


“Oh,” you say.


“I came from the future to tell you…” your temporal clone says in a scratchy voice.


“To tell me what?” you ask. Already, you can see the whispers of a scenario forming in your head…


Future Your slowly says, “To tell you… that the report on intertemporal travel that you were going to write… won’t go as planned at all. Your best-case estimate failed.”


“Oh no!” you say.


Somehow, though, you aren’t surprised…


At this point, what plausible reasons for your failure come to mind?


I hypothesize that the time-travel reframe I provide here for Murphyjitsu engages similar parts of your brain as a premortem, but is 100% more exciting to use. In all seriousness, I think this is a reframe that is easier to grasp compared to the twisted “imagine you’re in the future looking back into the past, which by the way happens to be you in the present” framing normal Murphyjitsu uses.


The actual (non-dramatized) wording of the heuristic, by the way, is, “Imagine that Future You from one week into the future comes back telling you that the plan you are about to embark on will fail: Why?”


Low on Time? Power On!

Often, when I find myself low on time, I feel less compelled to try. This seems sort of like an instance of failing with abandon, where I think something like, “Oh well, I can’t possibly get anything done in the remaining time between event X and event Y”.


And then I find myself doing quite little as a response.


As a result, I’ve decided to internalize the idea that being low on time doesn’t mean I can’t make meaningful progress on my problems.


This is a very Resolve-esque technique. The idea is that even if I have only 5 minutes, that’s enough to get things done. There are lots of useful things I can pack into small time chunks, like thinking, brainstorming, or doing some Quick Focusing.


I’m hoping to combat the sense of apathy / listlessness that creeps in when time draws to a close.


Supercharge Motivation by Propagating Emotional Bonds:

[Disclaimer: I suspect that this isn’t an optimal motivation strategy, and I’m sure there are people who will object to having bonds based on others rather than themselves. That’s okay. I think this technique is effective, I use it, and I’d like to share it. But if you don’t think it’s right for you, feel free to just move along to the next thing.]


CFAR used to teach a skill called Propagating Urges. It’s now been largely subsumed by Internal Double Crux, but I still find Propagating Urges to be a powerful concept.


In short, Propagating Urges hypothesizes that motivation problems are caused because the implicit parts of ourselves don’t see how the boring things we do (e.g. filing taxes) causally relate to things we care about (e.g. not going to jail). The actual technique involves walking through the causal chain in your mind and some visceral imagery every step of the way to get the implicit part of yourself on board.


I’ve taken the same general principle, but I’ve focused it entirely on the relationships I have with other people. If all the parts of me realize that doing something would greatly hurt those I care about, this becomes a stronger motivation than most external incentives.


For example, I walked through an elaborate internal simulation where I wanted to stop doing a Thing. I imagined someone I cared deeply for finding out about my Thing-habit and being absolutely deeply disappointed. I focused on the sheer emotional weight that such disappointment would cause (facial expressions, what they’d feel inside, the whole deal).


I now have a deep injunction against doing the Thing, and all the parts of me are in agreement because we agree that such a Thing would hurt other people and that’s obviously bad.


The basic steps for Propagating Emotional Bonds look like:

  • Figure out what thing you want to do more of or stop doing.

  • Imagine what someone you care about would think or say.

  • Really focus on how visceral that feeling would be.

  • Rehearse the chain of reasoning (“If I do this, then X will feel bad, and I don’t want X to feel bad, so I won’t do it”) a few times.


Take Time in Social Contexts:

Often, in social situations, when people ask me questions, I feel an underlying pressure to answer quickly. It feels like if I don’t answer in the next ten seconds, something’s wrong with me. (School may have contributed to this). I don’t exactly know why, but it just feels like it’s expected.


I also think that being forced to hurry isn’t good for thinking well. As a result, something helpful I’ve found, when someone asks something like “Is that all? Anything else?”, is to Take Time.


My response is something like, “Okay, wait, let me actually take a few minutes.” At which point I, uh, actually take a few minutes to think things through. After saying this, it feels like it’s now socially permissible for me to take some time thinking.


This has proven helpful in several contexts where, had I not Taken Time, I would have forgotten to bring up important things or missed key failure-modes.


Ground Mental Notions in Reality not by Platonics:

One of the proposed reasons that people suck at planning is that we don’t actually think about the details behind our plans. We end up thinking about them in vague black-box-style concepts that hide all the scary unknown unknowns. What we’re left with is just the concept of our task, rather than a deep understanding of what our task entails.


In fact, this seems fairly similar to the “prototype model” that occurs in scope insensitivity.


I find this is especially problematic for tasks which look nothing like their concepts. For example, my mental representation of “doing math” conjures images of great mathematicians, intricate connections, and fantastic concepts like uncountable sets.


Of course, actually doing math looks more like writing stuff on paper, slogging through textbooks, and banging your head on the table.


My brain doesn’t differentiate well between doing a task and the affect associated with the task. Thus I think it can be useful to try and notice when our brains are doing this sort of black-boxing and instead “unpack” the concepts.


This means getting better correspondences between our mental conceptions of tasks and the tasks themselves, so that we can hopefully actually choose better.


3 Conversation Tips:

I often forget what it means to be having a good conversation with someone. I think I miss opportunities to learn from others when talking with them. This is my handy 3-step list of Conversation Tips to get more value out of conversations:


1) "Steal their Magic": Figure out what other people are really good at, and then get inspired by their awesomeness and think of ways you can become more like that. Learn from what other people are doing well.


2) "Find the LCD"/"Intellectually Escalate": Figure out where your intelligence matches theirs, and learn something new. Focus on Actually Trying to bridge those inferential distances. In conversations, this means focusing on the limits of either what you know or what the other person knows.


3) "Convince or Be Convinced”: (This is a John Salvatier idea, and it also follows from the above.) Focus on maximizing your persuasive ability to convince them of something. Or be convinced of something. Either way, focus on updating beliefs, be it your own or the other party’s.


Be The Noodly Appendages of the Superintelligence You Wish To See in the World:

CFAR co-founder Anna Salamon has this awesome reframe similar to IAT which asks, “Say a superintelligence exists and is trying to take over the world. However, you are its only agent. What do you do?”


I’ll admit I haven’t used this one, but it’s super cool and not something I’d thought of, so I’m including it here.

 


Concepts:

Concepts are just things in the world I’ve identified and drawn some boundaries around. They are farthest from the pipeline that goes from ideas to TAPs, as concepts are just ideas. Still, I do think these concepts “bottom out” at some point into practicality, and I think playing around with them could yield interesting results.


Paperspace =/= Mindspace:

I tend to write things down because I want to remember them. Recently, though, I’ve noticed that rather than acting as an extension of my brain, the things I write down seem to drop out of my own head. As in, if I write something down, it’s not necessarily easier for me to recall it later.


It’s as if by “offloading” the thoughts onto paper, I’ve cleared them out of my brain. This seems suboptimal, because a big reason I write things down is to cement them more deeply within my head.


I can still access the thoughts if I’m asking myself questions like, “What did I write down yesterday?” but only if I’m specifically sorting for things I write down.


The point is, I want stuff I write down on paper to be, not where I store things, but merely a sign of what’s stored inside my brain.


Outreach: Focus on Your Target’s Target:

One interesting idea I got from the CFAR workshop was that of thinking about yourself as a radioactive vampire. Um, I mean, thinking about yourself as a memetic vector for rationality (the vampire thing was an actual metaphor they used, though).


The interesting thing they mentioned was to think, not about who you’re directly influencing, but who your targets themselves influence.


This means that not only do you have to care about the fidelity of your transmission, but you need to think of ways to ensure that your target also does a passable job of passing it on to their friends.


I’ve always thought about outreach / memetics in terms of the people I directly influence, so looking at two degrees of separation is a pretty cool thing I hadn’t thought about in the past.


I guess that if I took this advice to heart, I’d probably have to change the way that I explain things. For example, I might want to try giving more salient examples that can be easily passed on or focusing on getting the intuitions behind the ideas across.


Build in Blank Time:

Professor Barbara Oakley distinguishes between focused and diffuse modes of thinking. Her claim is that time spent in a thoughtless activity allows your brain to continue working on problems without conscious input. This is the basis of diffuse mode.


In my experience, I’ve found that I get interesting ideas or remember important ideas when I’m doing laundry or something else similarly mindless.


I’ve found this to be helpful enough that I’m considering building in “Blank Time” in my schedules.


My intuitions here are something like, “My brain is a thought-generator, and it’s particularly active if I can pay attention to it. But I need to be doing something that doesn’t require much of my executive function to even pay attention to my brain. So maybe having more Blank Time would be good if I want to get more ideas.”


There’s also the additional point that meta-level thinking can’t be done if you’re always in the moment, stuck in a task. This means that, cool ideas aside, if I just want to reorient or survey my current state, Blank Time can be helpful.


The 99/1 Rule: Few of Your Thoughts are Insights:

The 99/1 Rule says that the vast majority of your thoughts every day are pretty boring and that only about one percent of them are insightful.


This was generally true for my life…and then I went to the CFAR workshop and this rule sort of stopped being appropriate. (Other exceptions to this rule were EuroSPARC [now ESPR] and EAG)


Note:

I bulldozed through a bunch of ideas here, some of which could have probably garnered a longer post. I’ll probably explore some of these ideas later on, but if you want to talk more about any one of them, feel free to leave a comment / PM me.

 

How often do you check this forum?

11 JenniferRM 30 January 2017 04:56PM

I'm interested from hearing from everyone who reads this.

Who is checking LW's Discussion area and how often?

1. When you check, how much voting or commenting do you do compared to reading?

2. Do you bother clicking through to links?

3. Do you check using a desktop or a smart phone?  Do you just visit the website in browser or use an RSS something-or-other?

4. Also, do you know of other places that have more schellingness for the topics you think this place is centered on? (Or used to be centered on?) (Or should be centered on?)

I would ask this in the current open thread except that structurally it seems like it needs to be more prominent than that in order to do its job.

If you have very very little time to respond or even think about the questions, I'd appreciate it if you just respond with "Ping" rather than click away.

[Link] Putanumonit: A spreadsheet helps debias decisions, like picking a girl to date

10 Jacobian 15 March 2017 03:19AM

[Link] How to not earn a delta (Change My View)

10 Viliam 14 February 2017 10:04AM

A majority coalition can lose a symmetric zero-sum game

10 Stuart_Armstrong 26 January 2017 12:13PM

Just a neat little result I found when thinking about Jessica's recent post.

For n players, let a_i be the action of player i and s_i(a_1, a_2, ..., a_n) the reward of player i as a function of the actions of all players. Then the game is symmetric if, for any permutation p: {1, ..., n} → {1, ..., n},

s_i(a_1, a_2, ..., a_n) = s_{p(i)}(a_{p(1)}, a_{p(2)}, ..., a_{p(n)})

The game is zero-sum if the sum of the s_i is always zero. Assume players can confer before choosing their actions.

Then it is possible for a majority coalition to strictly lose a zero-sum game, even in a deterministic game where they get to see their opponents' moves before choosing their own.

This seems counter-intuitive. After all, if one coalition has M players, the other has m players, with m < M, and there are no other players, how can the M players lose? Couldn't just m of the M players behave exactly as the smaller coalition, thus getting the same amount in expectation? The problem is the potential losses endured by the remaining M-m players.

For an example, consider the following 5-player colour game (it's a much simplified version of the game I came up with previously, proposed by cousin_it). Each player chooses one of two colours, blue or red. Then the players that selected the least commonly chosen colour(s) are the winners; the others are the losers. The losers pay 1 each, and this is split equally between the winners.

Then consider a coalition of three players, the triumvirate. The remaining two players - the duumvirate - choose different colours, red and blue. What can the triumvirate then do? If they all chose the same colour - say blue - then they all lose -1, and the duumvirate loses -1 (from its member that chose blue) and gains 4/1 (for the member that chose red). If they split - say 2 blue, 1 red - then the ones that chose blue lose -1, while the duumvirate loses -1 (from its member that chose blue) and gains 3/2 > 1 (from the member that chose red).

So the duumvirate can always win against the triumvirate.
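Since the duumvirate's advantage is easy to doubt at first glance, here is a small brute-force check of the five-player colour game in Python (my own sketch, assuming the payoff rule above: each loser pays 1, split equally among the winners). It enumerates all eight joint strategies of the triumvirate against a colour-splitting duumvirate.

```python
from itertools import product

COLOURS = ("blue", "red")

def payoffs(choices):
    """Colour game: players who picked the least commonly chosen colour(s)
    win; each loser pays 1, and the pot is split equally among winners."""
    counts = {c: choices.count(c) for c in set(choices)}
    least = min(counts.values())
    winners = [i for i, c in enumerate(choices) if counts[c] == least]
    losers = [i for i in range(len(choices)) if i not in winners]
    if not losers:  # everyone picked a least-common colour: no money moves
        return [0.0] * len(choices)
    gain = len(losers) / len(winners)
    return [gain if i in winners else -1.0 for i in range(len(choices))]

# Players 0 and 1 are the duumvirate, always splitting colours.
duumvirate = ("red", "blue")
worst_case = min(
    sum(payoffs(duumvirate + trio)[:2])      # duumvirate's total payoff
    for trio in product(COLOURS, repeat=3)   # all 8 triumvirate strategies
)
print(worst_case)  # 0.5 -> the two-player coalition always comes out ahead
```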

Of course, it's possible for two members of the triumvirate to create a second duumvirate that will profit from the hapless third member. Feel free to add whatever political metaphor you think this fits.

 

Larger games

Variations of this game can make Jessica's theorem 2 sharp. Let the minority coalition be of size m (the majority coalition is of size M = n-m = qm+r for some unique q and 0≤r<m). The actions are choosing from m colours; apart from that the game is the same as before. And, as before, the members of the minority coalition each choose a different colour.

Then an m-set is a collection of players that each chose a different colour. Split the players into as many disjoint m-sets as possible, with the minority coalition being one of them - say this gives q'+1 m-sets. There are r' remaining players from the majority coalition.

Note that we can consider that any loss from a member of an m-set is spread among the remaining members. That's because all winners are members of m-sets, and players that choose the same colour are interchangeable. So we can assign the loss from a member of an m-set as being an equivalent gain to the other members. Thus the m-sets only profit from the r' remaining players. And this profit is spread equally among the winners - hence equally among the m-sets.

Thus the majority coalition has a loss of r'/(q'+1), and minimises its loss by minimising r' and maximising q' - hence by setting q'=q and r'=r. Under these circumstances, the minority coalition wins r/(q+1) in total.

Adding 1 to the reward of each player, then dividing all rewards by n, gives the unit-sum game in Jessica's theorem.

Did EDT get it right all along? Introducing yet another medical Newcomb problem

10 Johannes_Treutlein 24 January 2017 11:43AM

One of the main arguments given against Evidential Decision Theory (EDT) is that it would “one-box” in medical Newcomb problems. Whether this is the winning action has been a hotly debated issue on LessWrong. A majority, including experts in the area such as Eliezer Yudkowsky and Wei Dai, seem to think that one should two-box (See e.g. Yudkowsky 2010, p.67). Others have tried to argue in favor of EDT by claiming that the winning action would be to one-box, or by offering reasons why EDT would in some cases two-box after all. In this blog post, I want to argue that EDT gets it right: one-boxing is the correct action in medical Newcomb problems. I introduce a new thought experiment, the Coin Flip Creation problem, in which I believe the winning move is to one-box. This new problem is structurally similar to other medical Newcomb problems such as the Smoking Lesion, though it might elicit the intuition to one-box even in people who would two-box in some of the other problems. I discuss both how EDT and other decision theories would reason in the problem and why people’s intuitions might diverge in different formulations of medical Newcomb problems.

Two kinds of Newcomblike problems

There are two different kinds of Newcomblike problems. In Newcomb’s original paradox, both EDT and Logical Decision Theories (LDT), such as Timeless Decision Theory (TDT) would one-box and therefore, unlike CDT, win $1 million. In medical Newcomb problems, EDT’s and LDT’s decisions diverge. This is because in the latter, a (physical) causal node that isn’t itself a decision algorithm influences both the current world state and our decisions – resulting in a correlation between action and environment but, unlike the original Newcomb, no “logical” causation.

It’s often unclear exactly how a causal node can exert influence on our decisions. Does it change our decision theory, utility function, or the information available to us? In the case of the Smoking Lesion problem, it seems plausible that it’s our utility function that is being influenced. But then it seems that as soon as we observe our utility function (“notice a tickle”; see Eells 1982), we lose “evidential power” (Almond 2010a, p.39), i.e. there’s nothing new to learn about our health by acting a certain way if we already know our utility function. In any case, as long as we don’t know and therefore still have the evidential power, I believe we should use it.

The Coin Flip Creation Problem is an adaptation of Caspar Oesterheld’s “Two-Boxing Gene” problem and, like the latter, attempts to take Newcomb’s original problem and make it into a medical Newcomb problem, triggering the intuition that we should one-box. In Oesterheld’s Two-Boxing Gene, it’s stated that a certain gene correlates with our decision to one-box or two-box in Newcomb’s problem, and that Omega, instead of simulating our decision algorithm, just looks at this gene.

Unfortunately, it’s not specified how the correlation between two-boxing and the gene arises, casting doubt on whether it’s a medical Newcomb problem at all, and whether other decision algorithms would disagree with one-boxing. Wei Dai argues that in the Two-Boxing Gene, if Omega conducts a study to find out which genes correlate with which decision algorithm, then Updateless Decision Theory (UDT) could just commit to one-boxing and thereby determine that all the genes UDT agents have will always correlate with one-boxing. So in some sense, UDT’s genes will still indirectly constitute a “simulation” of UDT’s algorithm, and there is a logical influence between the decision to one-box and Omega’s decision to put $1 million in box A. Similar considerations could apply for other LDTs.

The Coin Flip Creation problem is intended as an example of a problem in which EDT would give the right answer, but all causal and logical decision theories would fail. It works explicitly through a causal influence on the decision theory itself, thus reducing ambivalence about the origin of the correlation.

The Coin Flip Creation problem

One day, while pondering the merits and demerits of different acausal decision theories, you’re visited by Omega, a being assumed to possess flawless powers of prediction and absolute trustworthiness. You’re presented with Newcomb’s paradox, but with one additional caveat: Omega informs you that you weren’t born like a normal human being, but were instead created by Omega. On the day you were born, Omega flipped a coin: If it came up heads, Omega created you in such a way that you would one-box when presented with the Coin Flip Creation problem, and it put $1 million in box A. If the coin came up tails, you were created such that you’d two-box, and Omega didn’t put any money in box A. We don’t know how Omega made sure what your decision would be. For all we know, it may have inserted either CDT or EDT into your source code, or even just added one hard-coded decision rule on top of your messy human brain. Do you choose both boxes, or only box A?

It seems like EDT gets it right: one-boxing is the winning action here. There’s a correlation between our decision to one-box, the coin flip, and Omega’s decision to put money in box A. Conditional on us one-boxing, the probability that there is money in box A increases, and we “receive the good news” – that is, we discover that the coin must have come up heads, and we thus get the million dollars. In fact, we can be absolutely certain of the better outcome if we one-box. However, the problem persists if the correlation between our actions and the content of box A isn’t perfect. As long as the correlation is high enough, it is better to one-box.

Nevertheless, neither causal nor logical counterfactuals seem to imply that we can determine whether there is money in box A. The coin flip isn’t a decision algorithm itself, so we can’t determine its outcome. The logical uncertainty about our own decision output doesn’t seem to coincide with the empirical uncertainty about the outcome of the coin flip. In absence of a causal or logical link between their decision and the content of box A, CDT and TDT would two-box.

Updateless Decision Theory

As far as I understand, UDT would come to a similar conclusion. AlephNeil writes in a post about UDT:

In the Smoking Lesion problem, the presence of a 'lesion' is somehow supposed to cause Player's to choose to smoke (without altering their utility function), which can only mean that in some sense the Player's source code is 'partially written' before the Player can exercise any control over it. However, UDT wants to 'wipe the slate clean' and delete whatever half-written nonsense is there before deciding what code to write.

Ultimately this means that when UDT encounters the Smoking Lesion, it simply throws away the supposed correlation between the lesion and the decision and acts as though that were never a part of the problem.

This approach seems wrong to me. If we use an algorithm that changes our own source code, then this change, too, has been physically determined and can therefore correlate with events that aren’t copies of our own decision algorithm. If UDT reasons as though it could just rewrite its own source code and discard the correlation with the coin flip altogether, then UDT two-boxes and thus by definition ends up in the world where there is no money in box A.

Note that updatelessness seemingly makes no difference in this problem, since it involves no a priori decision: Before the coin flip, there’s a 50% chance of becoming either a one-boxing or a two-boxing agent. In any case, we can’t do anything about the coin flip, and therefore also can’t influence whether box A contains any money.

I am uncertain how UDT works, though, and would be curious about others people’s thoughts. Maybe UDT reasons that by one-boxing, it becomes a decision theory of the sort that would never be installed into an agent in a tails world, thus rendering impossible all hypothetical tails worlds with UDT agents in them. But if so, why wouldn’t UDT “one-box” in the Smoking Lesion? As far as the thought experiments are specified, the causal connection between coin flip and two-boxing in the Coin Flip Creation appears to be no different from the connection between gene and smoking in the Smoking Lesion.

More adaptations and different formalizations of LDTs exist, e.g. Proof-Based Decision Theory. I could very well imagine that some of those might one-box in the thought experiment I presented. If so, then I’m once again curious as to where the benefits of such decision theories lie in comparison to plain EDT (aside from updatelessness – see Concluding thoughts).

Coin Flip Creation, Version 2

Let’s assume UDT would two-box in the Coin Flip Creation. We could alter our thought experiment a bit so that UDT would probably one-box after all:

The situation is identical to the Coin Flip Creation, with one key difference: After Omega flips the coin and creates you with the altered decision algorithm, it actually simulates your decision, just as in Newcomb’s original paradox. Only after Omega has determined your decision via simulation does it decide whether to put money in box A, conditional on your decision. Do you choose both boxes, or only box A?

Here is a causal graph for the first and second version of the Coin Flip Creation problem. In the first version, a coin flip determines whether there is money in box A. In the second one, a simulation of your decision algorithm decides:

Since in Version 2, there’s a simulation involved, UDT would probably one-box. I find this to be a curious conclusion. The situation remains exactly the same – we can rule out any changes in the correlation between our decision and our payoff. It seems confusing to me, then, that the optimal decision should be a different one.

Copy-altruism and multi-worlds

The Coin Flip Creation problem assumes a single world and an egoistic agent. In the following, I want to include a short discussion of how the Coin Flip Creation would play out in a multi-world environment.

Suppose Omega’s coin is based on a quantum number generator and produces 50% heads worlds and 50% tails worlds. If we’re copy-egoists, EDT still recommends to one-box, since doing so would reveal to us that we’re in one of the branches in which the coin came up heads. If we’re copy-altruists, then in practice, we’d probably care a bit less about copies whose decision algorithms have been tampered with, since they would make less effective use of the resources they gain than we ourselves would (i.e. their decision algorithm sometimes behaves differently). But in theory, if we care about all the copies equally, we should be indifferent with respect to one-boxing or two-boxing, since there will always be 50% of us in either of the worlds no matter what we do. The two groups always take the opposite action. The only thing we can change is whether our own copy belongs to the tails or the heads group.

To summarize, UDT and EDT would both be indifferent in the altruistic multi-world case, but UDT would (presumably) two-box, and EDT would one-box, in both the copy-egoistic multi-worlds and in the single-world case.

“But I don’t have a choice”

There seems to be an especially strong intuition of “absence of free will” inherent to the Coin Flip Creation problem. When presented with the problem, many respond that if someone had created their source code, they didn’t have any choice to begin with. But that’s the exact situation in which we all find ourselves at all times! Our decision architecture and choices are determined by physics, just like a hypothetical AI’s source code, and all of our choices will thus be determined by our “creator.” When we’re confronted with the two boxes, we know that our decisions are predetermined, just like every word of this blogpost has been predetermined. But that knowledge alone won’t help us make any decision. As far as I’m aware, even an agent with complete knowledge of its own source code would have to treat its own decision outputs as uncertain, or it would fail to implement a decision algorithm that takes counterfactuals into account.

Note that our decision in the Coin Flip Creation is also no less determined than in Newcomb’s paradox. In both cases, the prediction has been made, and physics will guide our thoughts and our decision in a deterministic and predictable manner. Nevertheless, we can still assume that we have a choice until we make our decision, at which point we merely “find out” what has been our destiny all along.

Concluding thoughts

I hope that the Coin Flip Creation motivates some people to reconsider EDT’s answers in Newcomblike problems. A thought experiment somewhat similar to the Coin Flip Creation can be found in Arif Ahmed 2014.

Of course, the particular setup of the Coin Flip Creation means it isn’t directly relevant to the question of which decision theory we should program into an AI. We obviously wouldn’t flip a coin before creating an AI. Also, the situation doesn’t really look like a decision problem from the outside; an impartial observer would just see Omega forcing you to pick either A or B. Still, the example demonstrates that from the inside view, evidence from the actions we take can help us achieve our goals better. Why shouldn’t we use this information? And if evidential knowledge can help us, why shouldn’t we allow a future AI to take it into account? In any case, I’m not overly confident in my analysis and would be glad to have any mistakes pointed out to me.

Medical Newcomb is also not the only class of problems that challenges EDT. Evidential blackmail is an example of a different problem, wherein access to specific compromising information is used to extract money from EDT agents. That problem attacks EDT from a different angle, though: namely, by exploiting its lack of updatelessness, similar to the challenges in Transparent Newcomb, Parfit’s Hitchhiker, Counterfactual Mugging, and the Absent-Minded Driver. At a later point, I plan to address questions related to updatelessness, e.g. whether it makes sense to give in to evidential blackmail if you already have access to the information and haven’t precommitted not to give in.

Net Utility and Planetary Biocide

9 madhatter 09 April 2017 03:43AM

I've started listening to the audiobook of Peter Singer's Ethics in the Real World, which is both highly recommended and very unsettling. The essays on non-human animals, for example, made me realize for the first time that it may well be possible that the net utility on Earth over all conscious creatures is massively negative. 

Naturally, this led me to wonder whether, after all, efforts to eradicate all consciousness on Earth - human and non-human - may be ethically endorsable. This, in turn, reminded me of a recent post on LW asking whether the possibility of parallelized torture of future uploads justifies killing as many people as possible today.

I had responded to that post by mentioning that parallelizing euphoria was also possible, so this should cancel things out. This seemed at the time like a refutation, but I realized later I had made the error of treating utility and disutility as two halves of the same smooth continuum, like [-100, 100] ⊂ R. There is no reason to believe the maximum disutility I can experience is equal in magnitude to the maximum utility I can experience. It may be that max disutility is far greater. I really don't know, and I don't think introspection is as useful in answering this question as it intuitively seems to be, but it seems quite plausible for this to be the case.
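A toy calculation (with made-up numbers) shows why the asymmetry matters:

    # Made-up numbers, purely to illustrate the asymmetry point.
    # 95 mildly positive experiences and 5 bad ones, on a symmetric-ish scale:
    experiences = [5] * 95 + [-20] * 5
    print(sum(experiences))        # 375: net utility comes out positive

    # Same mix of good and bad moments, but the worst experiences can be far
    # more intense than the best ones (max disutility >> max utility):
    experiences_asym = [5] * 95 + [-200] * 5
    print(sum(experiences_asym))   # -525: net utility comes out negative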

As these thoughts were emerging, Singer, as if hearing my concerns, quoted someone or other who claimed that the human condition is one of perpetual suffering, constantly seeking desires which, once fulfilled, are ephemeral and dissatisfying, and therefore it is a morally tragic outcome for any of us to have emerged into existence. 

Of course these are shoddy arguments in support of Mass Planetary Biocide, even supposing the hypothesis that the Earth (universe?) has net negative utility is true. For one, we can engineer minds somewhere in a better neighborhood of mindspace, where utility is everywhere positive. Or maybe it's impossible even in theory to treat utility and disutility like real-valued functions of physical systems over time (though I'm betting it is). Or maybe the universe is canonically infinite, so even if 99% of conscious experiences in the universe have disutility, there are infinite quantities of both utility and disutility and so nothing we do matters, as Bostrom wrote about. (Although this is actually not an argument against MPB, just not one for it). And anyway, the state of net utility today is not nearly as important as the state of net utility could potentially be in the future. And perhaps utilitarianism is a naive and incorrect ethical framework. 

Still, I had somehow always assumed implicitly that net utility of life on Earth was positive, so the realization that this need not be so is causing me significant disutility. 

 

What conservatives and environmentalists agree on

9 PhilGoetz 08 April 2017 12:57AM

Today we had a sudden cold snap here in western Pennsylvania, with the temperature dropping 30 degrees F.  I was walking through a white field that had been green yesterday, looking at daffodils poking up through the snow and feeling irritated that they'd probably die.  It occurred to me that, if we could control the weather, people would probably vote for a smooth transition from winter to summer, and this would wreak some unforeseen environmental catastrophe, because it would suddenly make most survival strategies reliably sub-optimal.

This is typical environmentalist thinking:  Whenever you see something in the environment that you don't like, stop and step back before trying to change it.  Trust nature that there's some reason it is that way.  Interfere as little as possible.

The classic example is forest fires.  Our national park service used to try to stop all forest fires.  This policy changed in the 1960s for several reasons, including the observation that no new Sequoia saplings had sprouted since the beginning of fire suppression in the 19th century.  Fire is dangerous, destructive, and necessary.

It struck me that this cornerstone of environmentalism is also the cornerstone of social conservatism.

continue reading »

Against responsibility

9 Benquo 31 March 2017 09:12PM

I am surrounded by well-meaning people trying to take responsibility for the future of the universe. I think that this attitude – prominent among Effective Altruists – is causing great harm. I noticed this as part of a broader change in outlook, which I've been trying to describe on this blog in manageable pieces (and sometimes failing at the "manageable" part).

I'm going to try to contextualize this by outlining the structure of my overall argument.

Why I am worried

Effective Altruists often say they're motivated by utilitarianism. At its best, this leads to things like Katja Grace's excellent analysis of when to be a vegetarian. We need more of this kind of principled reasoning about tradeoffs.

At its worst, this leads to some people angsting over whether it's ethical to spend money on a cup of coffee when they might have saved a life, and others using the greater good as license to say things that are not quite true, socially pressure others into bearing inappropriate burdens, and make ever-increasing claims on resources without a correspondingly strong verified track record of improving people's lives. I claim that these actions are not in fact morally correct, and that people keep winding up endorsing those conclusions because they are using the wrong cognitive approximations to reason about morality.

Summary of the argument

  1. When people take responsibility for something, they try to control it. So, universal responsibility implies an attempt at universal control.
  2. Maximizing control has destructive effects:
    • An adversarial stance towards other agents.
    • Decision paralysis.
  3. These failures are not accidental, but baked into the structure of control-seeking. We need a practical moral philosophy to describe strategies that generalize better, and benefit from the existence of other benevolent agents, rather than treating them primarily as threats.

Responsibility implies control

In practice, the way I see the people around me applying utilitarianism, it seems to make two important moral claims:

  1. You - you, personally - are responsible for everything that happens.
  2. No one is allowed their own private perspective - everyone must take the public, common perspective.

The first principle is almost but not quite simple consequentialism. But it's important to note that it actually doesn't generalize; it's massive double-counting if each individual person is responsible for everything that happens. I worked through an example of the double-counting problem in my post on matching donations.

The second principle follows from the first one. If you think you're personally responsible for everything that happens, and obliged to do something about that rather than weigh your taste accordingly – and you also believe that there are ways to have an outsized impact (e.g. that you can reliably save a life for a few thousand dollars) – then in some sense nothing is yours. The money you spent on that cup of coffee could have fed a poor family for a day in the developing world. It's only justified if the few minutes you save somehow produce more value.

One way of resolving this is simply to decide that you're entitled to only as much as the global poor, and try to do without the rest to improve their lot. This is the reasoning behind the notorious demandingness of utilitarianism.

But of course, other people are also making suboptimal uses of resources. So if you can change that, then it becomes your responsibility to do so.

In general, if Alice and Bob both have some money, and Alice is making poor use of money by giving to the Society to Cure Rare Diseases in Cute Puppies, and Bob is giving money to comparatively effective charities like the Against Malaria Foundation, then if you can cause one of them to have access to more money, you'd rather help Bob than Alice.

There's no reason for this to be different if you are one of Bob and Alice. And since you've already rejected your own private right to hold onto things when there are stronger global claims to do otherwise, there's no principled reason not to try to reallocate resources from the other person to you.

What you're willing to do to yourself, you'll be willing to do to others. Respecting their autonomy becomes a mere matter of either selfishly indulging your personal taste for "deontological principles," or a concession made because they won't accept your leadership if you're too demanding - not a principled way to cooperate with them. You end up trying to force yourself and others to obey your judgment about what actions are best.

If you think of yourself as a benevolent agent, and think of the rest of the world and all the people in it as objects with regular, predictable behaviors you can use to improve outcomes, then you'll feel morally obliged - and therefore morally sanctioned - to shift as much of the locus of control as possible to yourself, for the greater good.

If someone else seems like a better candidate, then the right thing to do seems like throwing your lot in with them, and transferring as much as you can to them rather than to yourself. So this attitude towards doing good leads either to personal control-seeking, or support of someone else's bid for the same.

I think that this reasoning is tacitly accepted by many Effective Altruists, and explains two seemingly opposite things:

  1. Some EAs get their act together and make power plays, implicitly claiming the right to deceive and manipulate to implement their plan.
  2. Some EAs are paralyzed by the impossibility of weighing the consequences for the universe of every act, and collapse into perpetual scrupulosity and anxiety, mitigated only by someone else claiming legitimacy, telling them what to do, and telling them how much is enough.

Interestingly, people in the second category are somewhat useful for people following the strategy of the first category, as they demonstrate demand for the service of telling other people what to do. (I think the right thing to do is largely to decline to meet this demand.)

Objectivists sometimes criticize "altruistic" ventures by insisting on Ayn Rand's definition of altruism as the drive to self-abnegation, rather than benevolence. I used to think that this was obnoxiously missing the point, but now I think this might be a fair description of a large part of what I actually see. (I'm very much not sure I'm right. I am sure I'm not describing all of Effective Altruism – many people are doing good work for good reasons.)

Control-seeking is harmful

You have to interact with other people somehow, since they're where most of the value is in our world, and they have a lot of causal influence on the things you care about. If you don't treat them as independent agents, and you don't already rule over them, you will default to going to war against them (and more generally trying to attain control and then make all the decisions) rather than trading with them (or letting them take care of a lot of the decisionmaking). This is bad because it destroys potential gains from trade and division of labor, because you win conflicts by destroying things of value, and because even when you win you unnecessarily become a bottleneck.

People who think that control-seeking is the best strategy for benevolence tend to adopt plans like this:

Step 1 – acquire control over everything.

Step 2 – optimize it for the good of all sentient beings.

The problem with this is that step 1 does not generalize well. There are lots of different goals for which step 1 might seem like an appealing first step, so you should expect lots of other people to be trying, and their interests will all be directly opposed to yours. Your methods will be nearly the same as the methods for someone with a different step 2. You'll never get to step 2 of this plan; it's been tried many times before, and failed every time.

Lots of different types of people want more resources. Many of them are very talented. You should be skeptical about your ability to win without some massive advantage. So, what you're left with are your proximate goals. Your impact on the world will be determined by your means, not your ends.

What are your means?

Even though you value others' well-being intrinsically, when pursuing your proximate goals, their agency mostly threatens to muck up your plans. Consequently, it will seem like a bad idea to give them info or leave them resources that they might misuse.

You will want to make their behavior more predictable to you, so you can influence it better. That means telling simplified stories designed to cause good actions, rather than to directly transmit relevant information. Withholding, rather than sharing, information. Message discipline. I wrote about this problem in my post on the humility argument for honesty.

And if the words you say are tools for causing others to take specific actions, then you're corroding their usefulness for literally true descriptions of things far away or too large or small to see. Peter Singer's claim that you can save a life for hundreds of dollars by giving to developing-world charities no longer means that you can save a life for hundreds of dollars by giving to developing-world charities. It simply means that Peter Singer wants to motivate you to give to developing-world charities. I wrote about this problem in my post on bindings and assurances.

More generally, you will try to minimize others' agency. If you believe that other people are moral agents with common values, then e.g. withholding information means that the friendly agents around you are more poorly informed, which is obviously bad, even before taking into account trust considerations! This plan only makes sense if you basically believe that other people are moral patients, but independent, friendly agents do not exist; that you are the only person in the world who can be responsible for anything.

Another specific behavioral consequence is that you'll try to acquire resources even when you have no specific plan for them. For instance, GiveWell's impact page tracks costs they've imposed on others – money moved, and attention in the form of visits to their website – but not independent measures of outcomes improved, or the opportunity cost of people who made a GiveWell-influenced donation. The implication is that people weren't doing much good with their money or time anyway, so it's a "free lunch" to gain control over these. (Their annual metrics report goes into more detail and does track this; it finds that about a quarter of GiveWell-influenced donations were reallocated from other developing-world charities, and another quarter from developed-world charities.) By contrast, the Gates foundation's Valentine's day report to Warren Buffett tracks nothing but developing-world outcomes (but then absurdly takes credit for 100% of the improvement).

As usual, I'm not picking on GiveWell because they're unusually bad – I'm picking on GiveWell because they're unusually open. You should assume that similar but more secretive organizations are worse by default, not better.

This kind of divergent strategy doesn't just directly inflict harms on other agents. It takes resources away from other agents that aren't defending themselves, which forces them into a more adversarial stance. It also earns justified mistrust, which means that if you follow this strategy, you burn cooperative bridges, forcing yourself farther down the adversarial path.

I've written more about the choice between convergent and divergent strategies in my post about the neglectedness consideration.

Simple patches don't undo the harms from adversarial strategies

Since you're benevolent, you have the advantage of a goal in common with many other people. Without abandoning your basic acquisitive strategy, you could try to have a secret handshake among people trying to take over the world for good reasons rather than bad. Ideally, this would let the benevolent people take over the world, cooperating among themselves. But, in practice, any simple shibboleth can be faked; anyone can say they're acquiring power for the greater good.

It's commonplace in various discussions among Effective Altruists, when someone identifies an individual or organization doing important work, to suggest that we "persuade them to become an EA" or "get an EA into the organization", rather than talking directly about ways to open up a dialogue and cooperate. This is straightforwardly an attempt to get them to agree to the same shibboleths in order to coordinate on a power-grabbing strategy. And yet, the standard of evidence we're using is mostly "identifies as an EA".

When Gleb Tsipursky tried to extract resources from the Effective Altruism movement with straightforward low-quality mimesis, mouthing the words but not really adding value, and grossly misrepresenting what he was doing and his level of success, it took EAs a long time to notice the pattern of misbehavior. I don't think this is because Gleb is especially clever, or because EAs are especially bad at noticing things. I think this is because EAs identify each other by easy-to-mimic shibboleths rather than meaningful standards of behavior.

Nor is Effective Altruism unique in suffering from this problem. When the Roman empire became too big to govern, gradually emperors hit upon the solution of dividing the empire in two and picking someone to govern the other half. This occasionally worked very well, when the two emperors had a strong preexisting bond, but generally they distrusted each other enough that the two empires behaved like rival states as often as they behaved like allies. Even though both emperors were Romans, and often close relatives!

Using "believe me" as our standard of evidence will not work out well for us. The President of the United States seems to have followed the strategy of saying the thing that's most convenient, whether or not it happens to be true, and won an election based on this. Others can and will use this strategy against us.

We can do better

The above is all a symptom of not including other moral agents in your model of the world. We need a moral theory that takes this into account in its descriptions (rather than having to do a detailed calculation each time), and yet is scope-sensitive and consequentialist the way EAs want to be.

There are two important desiderata for such a theory:

  1. It needs to take into account the fact that there are other agents who also have moral reasoning. We shouldn't be sad to learn that others reason the way we do.
  2. Graceful degradation. We can't be so trusting that we can be defrauded by anyone willing to say they're one of us. Our moral theory has to work even if not everyone follows it. It should also degrade gracefully within an individual – you shouldn't have to be perfect to see benefits.

One thing we can do now is stop using wrong moral reasoning to excuse destructive behavior. Until we have a good theory, the answer is we don't know if your clever argument is valid.

On the explicit and systematic level, the divergent force is so dominant in our world that sincere, benevolent people simply assume, when they see someone overtly optimizing for an outcome, that this person is optimizing for evil. This leads perceptive people who don’t like doing harm, such as Venkatesh Rao, to explicitly advise others to minimize their measurable impact on the world.

I don't think this impact-minimization is right, but on current margins it's probably a good corrective.

One encouraging thing is that many people using common-sense moral reasoning already behave according to norms that respect and try to cooperate with the moral agency of others. I wrote about this in Humble Charlie.

I've also begun to try to live up to cooperative heuristics even when I don't have all the details worked out, and to help my friends do the same. For instance, I'm happy to talk to people making giving decisions, but usually I don't go any further than connecting them with people they might be interested in, or coaching them through heuristics, because doing more would be harmful: it would destroy information, and I'm not omniscient (otherwise I'd be richer).

A movement like Effective Altruism, explicitly built around overt optimization, can only succeed in the long run at actually doing good with (a) a clear understanding of this problem, (b) a social environment engineered to robustly reject cost-maximization, and (c) an intellectual tradition of optimizing only for actually good things that people can anchor on and learn from.

This was only a summary. I don't expect many people to be persuaded by this alone. I'm going to fill in the details in future posts. If you want to help me write things that are relevant, you can respond to this (preferably publicly), letting me know:

  • What seems clearly true?
  • Which parts seem most surprising and in need of justification or explanation?

(Cross-posted at my personal blog.)

[Link] ‘Crucial Considerations and Wise Philanthropy’, by Nick Bostrom

9 casebash 31 March 2017 12:52PM

IQ and Magnus Carlsen, Leo Messi and the Decathlon

9 ragintumbleweed 28 March 2017 10:37PM

[Epistemic Status: I suspect that this is at least partially wrong. But I don’t know why yet, and so I figured I’d write it up and let people tell me. First post on Less Wrong, for what that’s worth.] 

First thesis: IQ is more akin to a composite measure of performance such as the decathlon than it is to a single characteristic such as height or speed.

Second thesis: When looking at extraordinary performance in any specific field, IQ will usually be highly correlated with success, but it will not fully explain or predict top-end performance, because extraordinary performance in a specific field is a result of extraordinary talent in a sub-category of intelligence (or even a sub-category of a sub-category), rather than truly top-end achievement in the composite metric.                                         

Before we go too far, here are some of the things I’m not arguing:

  • IQ is largely immutable (though perhaps not totally immutable).
  • IQ is a heritable, polygenic trait.
  • IQ is highly correlated with a variety of achievement measures, including academic performance, longevity, wealth, happiness, and health.
  • That parenting and schooling matter far less than IQ in predicting performance.
  • That IQ matters more than “grit” and “mindset” when explaining performance.
  • Most extraordinary performers, from billionaire tech founders to chess prodigies, to writers and artists and musicians, will possess well-above-average IQ.[1]

Here is one area where I’m certain I’m in the minority:

Here is the issue where I’m not sure if my opinion is controversial, and thus why I’m writing to get feedback:

  • While IQ is almost certainly highly correlated with high-end performance, IQ fails as a metric to explain or, more importantly, to predict top-end individual performance (the Second Thesis).

Why IQ Isn’t Like Height

Height is a single, measurable characteristic. Speed over any distance is a single, measurable characteristic. Ability to bench-press is a single, measurable characteristic.

But intelligence is more like the concept of athleticism than it is the concept of height, speed, or the ability to bench-press.  

Here is an excerpt from the Slate Star Codex article Talents part 2, Attitude vs. Altitude:

The average eminent theoretical physicist has an IQ of 150-160. The average NBA player has a height of 6’ 7”. Both of these are a little over three standard deviations above their respective mean. Since z-scores are magic and let us compare unlike domains, we conclude that eminent theoretical physicists are about as smart as pro basketball players are tall.

Any time people talk about intelligence, height is a natural sanity check. It’s another strongly heritable polygenic trait which is nevertheless susceptible to environmental influences, and which varies in a normal distribution across the population – but which has yet to accrete the same kind of cloud of confusion around it that IQ has.

All of this is certainly true. But here’s what I’d like to discuss more in depth:

Height is a trait that can be measured in a single stroke. IQ has to be measured by multiple sub-tests.

IQ measures the following sub-components of intelligence:

  •  Verbal Intelligence
  • Mathematical Ability
  • Spatial Reasoning Skills
  • Visual/Perceptual Skills
  • Classification Skills
  • Logical Reasoning Skills
  • Pattern Recognition Skills[2]

Even though both height and intelligence are polygenic traits, there is a category difference between the two.

That’s why I think that athleticism is a better polygenic-trait-comparator to intelligence than height. Obviously, people are born with different degrees of athletic talent. Athleticism can be affected by environmental factors (nutrition, lack of access to athletic facilities, etc.). Athleticism, like intelligence, because it is composed of different sub-variables (speed, agility, coordination – verbal intelligence, mathematical intelligence, spatial reasoning skills), can be measured in a variety of ways. You could measure athleticism with an athlete’s performance in the decathlon, or you could measure it with a series of other tests. Those results would be highly correlated, but not identical. And those results would probably be highly correlated with lots of seemingly unrelated but important physical outcomes.

Measure intelligence with an LSAT vs. IQ test vs. GRE vs. SAT vs. ACT vs. an IQ test from 1900 vs. 1950 vs. 2000 vs. the blink test, and the results will be highly correlated, but again, not identical.

Whether you measure height in centimeters or feet, however, the ranking of the people you measure will be identical no matter how you measure it.

To me, that distinction matters.
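A quick toy simulation (a sketch with made-up numbers, not real test data) makes the contrast concrete: two noisy composite measures of the same latent ability correlate highly but rank people differently, while the same single trait measured in two units ranks them identically.

    import random

    random.seed(0)
    N = 1000

    # Latent general ability, plus two composite "tests" that each mix in
    # independent noise (standing in for different blends of sub-components).
    g = [random.gauss(0, 1) for _ in range(N)]
    test_a = [x + random.gauss(0, 0.5) for x in g]
    test_b = [x + random.gauss(0, 0.5) for x in g]

    # A single trait measured in two different units.
    height_cm = [175 + 7 * random.gauss(0, 1) for _ in range(N)]
    height_ft = [h / 30.48 for h in height_cm]

    def ranking(xs):
        """Indices of the population sorted from lowest to highest score."""
        return sorted(range(len(xs)), key=lambda i: xs[i])

    def pearson(xs, ys):
        n = len(xs)
        mx, my = sum(xs) / n, sum(ys) / n
        cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        sx = sum((x - mx) ** 2 for x in xs) ** 0.5
        sy = sum((y - my) ** 2 for y in ys) ** 0.5
        return cov / (sx * sy)

    print("corr(test A, test B):", round(pearson(test_a, test_b), 2))   # high, but below 1
    print("same ranking?", ranking(test_a) == ranking(test_b))          # False
    print("corr(cm, ft):", round(pearson(height_cm, height_ft), 2))     # 1.0
    print("same ranking?", ranking(height_cm) == ranking(height_ft))    # True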

I think this athleticism/height distinction explains part (but not all) of the “cloud” surrounding IQ.[3]

Athletic Quotient (“AQ”)

Play along with me for a minute.

Imagine we created a single, composite metric to measure overall athletic ability. Let’s call it AQ, or Athletic Quotient. We could measure AQ just as we measure IQ, with 100 as the median score, and with two standard deviations above at 130 and four standard deviations above at 160.

For the sake of simplicity, let’s measure athletes’ athletic ability with the decathlon. This event is an imperfect test of speed, strength, jumping ability, and endurance.

An Olympic-caliber decathlete could compete at a near-professional level in most sports. But the best decathletes aren’t the people whom we think of when we think of the best athletes in the world. When we think of great athletes, we think of the top performers in one individual discipline, rather than the composite.

When people think of the best athlete in the world, they think of Leo Messi or Lebron James, not Ashton Eaton.

IQ and Genius

Here’s where my ideas might start to get controversial.

I don’t think most of the people we consider geniuses necessarily had otherworldly IQs. People with 200-plus IQs are like Olympic decathletes. They’re amazingly intelligent people who can thrive in any intellectual environment. They’re intellectual heavyweights without specific weaknesses. But those aren’t necessarily the superstars of the intellectual world. The Einsteins, Mozarts, Picassos, or the Magnus Carlsens of the world – they’re great because of domain-specific talent, rather than general intelligence. 

Phlogiston and Albert Einstein’s IQ

Check out this article.

The article declares, without evidence, that Einstein had an IQ of 205-225.

The thinking seems to go like this: Most eminent physicists have IQs of around 150-160. Albert Einstein created a paradigm shift in physics (or perhaps multiple such shifts). So he must have had an IQ of around 205-225. We’ll just go ahead and retroactively apply that IQ to a man who’s been dead for more than 60 years, and that’ll be great for supporting the idea that IQ and high-end field-specific performance are perfectly correlated.

As an explanation of intelligence, that’s no more helpful than phlogiston in chemistry.

But here’s the thing: It’s easy to ascribe super-high IQs retroactively to highly accomplished dead people, but I have never heard of IQ predicting an individual’s world-best achievement in a specific field. I have never read an article that says, “this kid has an IQ of 220; he’s nearly certain to create a paradigm-shift in physics in 20 years.” There are no Nate Silvers predicting individual achievement based on IQ. IQ does not predict Nobel Prize winners or Fields Medal winners or the next chess #1. A kid with a 220 IQ may get a Ph.D. at age 17 from CalTech, but that doesn’t mean he’s going to be the next Einstein.

Einstein was Einstein because he was an outsider. Because he was intransigent. Because he was creative. Because he was an iconoclast. Because he had the ability to focus. But there is no evidence that he had an IQ over 200. According to the Isaacson biography, at least, there were other pre-eminent physicists who were stronger at math than he was. Of course he was super smart. But there's no evidence he had a super-high IQ (as in, above 200).

We’ve been using IQ as a measure of intelligence for over 100 years and it has never predicted an Einstein, a Musk, or a Carlsen.[4] Who is the best counter-example to this argument? Terence Tao? Without obvious exception, those who have been recognized for early-age IQ are still better known for their achievements as prodigies than their achievements as adults.

Is it unfair to expect that predictive capacity from IQ? Early-age prediction of world-class achievement does happen. Barcelona went and scooped up Leo Messi from the hinterlands of Argentina at age 12 and he went and became Leo Messi. Lebron James was on the cover of Sports Illustrated when he was in high school.

In some fields, predicting world-best performance happens at an early age. But IQ – whatever its other merits – does not seem to serve as an effective mechanism for predicting world-best performance in specific individualized activities.

Magnus Carlsen’s IQ

When I type Magnus Carlsen’s name into Google, the first thing that autofills (after chess) is “Magnus Carlsen IQ.”

People seem to want to believe that his IQ score can explain why he is the Mozart of chess.  

We don’t know what his IQ is, but the instinct people have to try to explain his performance in terms of IQ feels very similar to people’s desire to ascribe an IQ of 225 to Einstein. It’s phlogiston.

Magnus Carlsen probably has a very high IQ. He obviously has well above-average intelligence. Maybe his IQ is 130, 150, or 170 (there's a website called ScoopWhoop that claims, without citation, that it's 190). But however high his IQ, doubtless there are many or at least a few chess players in the world who have higher IQs than he has. But he’s the #1 chess player in the world – not his competitors with higher IQs. And I don’t think the explanation for why he’s so great is his “mindset” or “grit” or anything like that.

It’s because IQ is akin to an intellectual decathlon, whereas chess is a single-event competition. If we dug deep into the sub-components of Carlsen’s IQ (or perhaps the sub-components of the sub-components), we’d probably find some sub-component where he measures off the charts. I’m not saying there’s a “chess gene,” but I suspect there is a trait, measurable as a sub-component of intelligence and more specific than IQ, that would be a greater explanatory variable of his abilities than raw IQ.

Leo Messi isn’t the greatest soccer player in the world because he’s the best overall athlete in the world. He’s the best soccer player in the world because of his agility and quickness in incredibly tight spaces. Because of his amazing coordination in his lower extremities. Because of his ability to change direction with the ball before defenders have time to react. These are all natural talents. But they are only particularly valuable because of the arbitrary constraints in soccer.

Leo Messi is a great natural athlete. If we had a measure of AQ, he’d probably be in the 98th or 99th percentile. But that doesn’t begin to explain his otherworldly soccer-playing talents. He probably could have been a passable high-school point guard at a school of 1000 students.  He would have been a well-above-average decathlete (though I doubt he could throw the shot put worth a damn).

But it’s the unique athletic gifts that are particularly well suited to soccer that enabled him to be the best in the world at soccer. So, too, with Magnus Carlsen with chess, Elon Musk with entrepreneurialism, and Albert Einstein with paradigm-shifting physics.

The decathlon won’t predict the next Leo Messi or the next Lebron James. And IQ won’t predict the next Magnus Carlsen, Elon Musk, Picasso, Mozart, or Albert Einstein.

And so we shouldn’t seek it out as an after-the-fact explanation for their success, either.


[1] Of course, high performance in some fields is probably more closely correlated with IQ than others: physics professor > english professor > tech founder > lawyer > actor > bassist in grunge band. [Note: this footnote is total unfounded speculation]

[3] The other part is that people don’t like to be defined by traits that they feel they cannot change or improve.

[4] Let me know if I am missing any famous examples here.

Announcing: The great bridge

9 Elo 17 March 2017 01:11AM

Original post: http://bearlamp.com.au/announcing-the-great-bridge-between-communities/


In the deep dark lurks of the internet, several proactive lesswrong and diaspora leaders have been meeting each day.  If we could have cloaks and silly hats, we would.

We have been discussing the great diversification, and noticed some major hubs starting to pop up.  The ones that have been working together include:

  • Lesswrong slack
  • SlateStarCodex Discord
  • Reddit/Rational Discord
  • Lesswrong Discord
  • Exegesis (unofficial rationalist tumblr)

The ones that we hope to bring together in the future include (on the willingness of those servers):

  • Lesswrong IRC (led by Gwern)
  • Slate Star Codex IRC
  • AGI slack
  • Transhumanism Discord
  • Artificial Intelligence Discord

How will this work?

About a year ago, the lesswrong slack tried to bridge across to the lesswrong IRC.  That was bad.  From that experience we learnt a lot about what can go wrong, and have worked out how to avoid those mistakes.  So here is the general setup.

Each server currently has its own set of channels, each with their own style of talking and addressing problems, and of sharing details and engaging with each other.  We definitely don't want to do anything that will harm those existing cultures.  In light of this, taking the main channel from one server and mashing it into the main channel of another server is going to reincarnate into HELL ON EARTH, and generally leave both sides with the sentiment that "<the other side> is wrecking up <our> beautiful paradise".  Some servers may have a low volume buzz at all times, other servers may become active in bursts; it's not good to try to marry those things.

Logistics:

Room: Lesswrong-Slack-Open

Bridged to:

  • exegesis#lwslack_bridge
  • Discord-Lesswrong#lw_slack_main
  • R/rational#lw_slack_open
  • SSC#bridge_slack

I am in <exegesis, D/LW, R/R, SSC> what does this mean?

If you want to peek into the lesswrong slack and see what happens in their #open channel, you can join or unmute your respective channel and listen in, or contribute (two way relay) to their chat.  Obviously if everyone does this at once we end up spamming the other chat, and probably after a week we cut the bridge off because it didn't work.  So while it's favourable to grow the community, be mindful of what goes on across the divide and try not to anger our friends.

I am in Lesswrong-Slack, what does this mean?

We have new friends!  Posts in #open will be relayed to all 4 child rooms, where others can contribute if they choose.  Mostly they have their own servers to chat on, and if they are not on an info-diet already, then maybe they should be.  We don't anticipate invasion or noise.

Why do they get to see our server and we don't get to see them?

So glad you asked - we do.  There is an identical setup for their server into our bridge channels.  In fact the whole diagram looks something like this:

Server             | Main channel | Slack-Lesswrong        | Discord-Exegesis  | Discord-Lesswrong | Discord-r/rational    | Discord-SSC
Slack-Lesswrong    | Open         | -                      | lwslack_bridge    | lw_slack_main     | lw_slack_open         | bridge_slack
Discord-Exegesis   | Main         | #bridge_rat_tumblr     | -                 | exegesis_main     | exegesis_rattumb_main | bridge_exegesis
Discord-Lesswrong  | Main         | #Bridge_discord_lw     | lwdiscord_bridge  | -                 | lw_discord_main       | bridge_lw_disc
Discord-r/rational | General      | #bridge_r-rational_dis | rrdiscord_bridge  | reddirati_main    | -                     | bridge_r_rational
Discord-SSC        | General      | #bridge_ssc_discord    | sscdiscord_bridge | ssc_main          | ssc_discord_gen       | -

Pretty, right? No, it's not.  But that's in the backend.

For extra clarification, the rows are the channels that are linked.  Which is to say that Discord-SSC is linked to a child channel in each of the other servers.  The last thing we want to do is impact these existing channels in a negative way.
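If it helps, the routing in the table is just a mapping from each server's main channel to its mirror channels elsewhere.  Here's a hypothetical sketch in Python using the channel names from the table (the real bridge bot and its relay code aren't shown here, and will differ):

    # Hypothetical sketch of the routing table, using the channel names above.
    # source server -> {destination server: channel that mirrors the source's main channel}
    BRIDGE_ROUTES = {
        "Slack-Lesswrong": {
            "Discord-Exegesis": "lwslack_bridge",
            "Discord-Lesswrong": "lw_slack_main",
            "Discord-r/rational": "lw_slack_open",
            "Discord-SSC": "bridge_slack",
        },
        "Discord-Exegesis": {
            "Slack-Lesswrong": "#bridge_rat_tumblr",
            "Discord-Lesswrong": "exegesis_main",
            "Discord-r/rational": "exegesis_rattumb_main",
            "Discord-SSC": "bridge_exegesis",
        },
        # ...the remaining three servers follow the same pattern as the table above.
    }

    def relay(source_server, message):
        """Deliver a message from source_server's main channel to every mirror channel."""
        for destination, channel in BRIDGE_ROUTES.get(source_server, {}).items():
            print(f"[{destination} / {channel}] {message}")

    relay("Slack-Lesswrong", "hello from the slack #open channel")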

But what if we don't want to share our open channel and just want to see the other side's open channel?  (Or: our talk is private; what about confidentiality and security?)

Oh, you mean like the prisoner's dilemma, where you can defect (not share) and still be rewarded (get to see other servers)?  Yeah, it's a problem.  It tends to be that when one group defects, others also defect.  There is a chance that the bridge doesn't work.  That this all slides, and we do spam each other, and we end up giving up on the whole project.  If it weren't worth taking the risk we wouldn't have tried.

We have not rushed into this bridge thing, we have been talking about it calmly and slowly and patiently for what seems like forever.  We are all excited to be taking a leap, and keen to see it take off.

Yes, security is a valid concern, walled gardens being bridged into is a valid concern, we are trying our best.  We are just as hesitant as you, and being very careful about the process.  We want to get it right.

So if I am in <server1> and I want to talk to <server3> I can just post in the <bridge-to-server2> room and have the message relayed around to server 3 right?

Whilst that is correct, please don't do that.  You wouldn't like people relaying through your main to talk to other people.  Also it's pretty silly; you can just post in your <server1> main and let other people see it if they want to.

This seems complicated, why not just have one room where everyone can go and hang out?

  1. How do you think we ended up with so many separate rooms?
  2. Why don't we all just leave <your-favourite server> and go to <that other server>?  It's not going to happen.

Why don't all you kids get off my lawn and stay in your own damn servers?

Thanks, grandpa.  No one is coming to invade; we all have our own servers and stuff to do.  We don't NEED to be on your lawn, but sometimes it's nice to know we have friends.

<server2> shitposted our server, what do we do now?

This is why we have mods, why we have mute, and why we have ban.  It might happen, but here's a deal: don't shit on other people and they won't shit on you.  Also, if asked nicely to leave people alone, please leave people alone.  Remember anyone can tap out of any discussion at any time.

I need a picture to understand all this.

Great!  Friends on exegesis made one for us.


Who are our new friends:

Lesswrong Slack

Lesswrong slack has been active since 2015, and has a core community. The slack has 50 channels for various conversations on specific topics; the #open channel is for general topics and has all kinds of interesting discoveries shared there.

Discord-Exegesis (private, entry via tumblr)

Exegesis is a discord set up by a tumblr rationalist for all his friends (not just rats). It took off so well and became such a hive in such a short time that it's now a regular hub.

Discord-Lesswrong

Following Exegesis's growth, a discord was set up for lesswrong. It's not as active yet, but it has the advantage of a low barrier to entry and it's filled with lesswrongers.

Discord-SSC

Scott posted a link to the SSC discord on an open thread, and now it holds activity from users who hail from the SSC comment section. It probably has more conversation about politics than the other servers, but also has every topic relevant to his subscribers.

Discord-r/rational

The reddit r/rational Discord grew from the rationality and rational fiction subreddit; it's quite busy and covers all topics.


As of the publishing of this post, the bridge is not live, but it will go live when we flip the switch.


Meta: this took 1 hour to write (actual writing time), and halfway through I had to stop and have a voice conference about it with the channels we were bridging.

Cross posted to lesswrong: http://lesswrong.com/lw/oqz

 

Am I Really an X?

9 Gram_Stone 05 March 2017 12:11AM

As I understand it, there is a phenomenon among transgender people where no matter what they do they can't help but ask themselves the question, "Am I really an [insert self-reported gender category here]?" In the past, a few people have called for a LessWrong-style dissolution of this question. This is how I approach the problem.

There are two caveats which I must address in the beginning.

The first caveat has to do with hypotheses about the etiology of the transgender condition. There are many possible causes of gender identity self-reports, but I don't think it's too controversial to propose that at least some of the transgender self-reports might result from the same mechanism as cisgender self-reports. Again, the idea is that there is some 'self-reporting algorithm', that takes some input that we don't yet know about, and outputs a gender category, and that both cisgender people and transgender people have this. It's not hard to come up with just-so stories about why having such an algorithm and caring about its output might have been adaptive. This is, however, an assumption. In theory, the self-reports from transgender people could have a cause separate from the self-reports of cisgender people, but it's not what I expect.

The second caveat has to do with essentialism. In the past calls for an article like this one, I saw people point out that we reason about gender as if it is an essence, and that any dissolution would have to avoid this mistake. But there's a difference between describing an algorithm that produces a category which feels like an essence, and providing an essentialist explanation. My dissolution will talk about essences because the human mind reasons with them, but my dissolution itself will not be essentialist in nature.

Humans universally make inferences about their typicality with respect to their self-reported gender. Check Google Scholar for 'self-perceived gender typicality' for further reading. So when I refer to a transman, by my model, I mean, "A human whose self-reporting algorithm returns the gender category 'male', but whose self-perceived gender typicality checker returns 'Highly atypical!'"

And the word 'human' at the beginning of that sentence is important. I do not mean "A human that is secretly, essentially a girl," or "A human that is secretly, essentially a boy,"; I just mean a human. I postulate that there are not boy typicality checkers and girl typicality checkers; there are typicality checkers that take an arbitrary gender category as input and return a measure of that human's self-perceived typicality with regard to the category.

So when a transwoman looks in the mirror and feels atypical because of a typicality inference from the width of her hips, I believe that this is not a fundamentally transgender experience, not different in kind, but only in degree, from a ciswoman who listens to herself speak and feels atypical because of a typicality inference from the pitch of her voice.

Fortunately, society's treatment of transgender people has come around to something like this in recent decades; our therapy proceeds by helping transgender people become more typical instances of their self-report algorithm's output.

Many of the typical traits are quite tangible: behavior, personality, appearance. It is easier to make tangible things more typical, because they're right there for you to hold; you aren't confused about them. But I often hear reports of transgender people left with a nagging doubt, a lingering question of "Am I really an X?", which feels far more slippery and about which they confess themselves quite confused.

To get at this question, I sometimes see transgender people try to simulate the subjective experience of a typical instance of the self-report algorithm's output. They ask questions like, "Does it feel the same to be me as it does to be a 'real X'?" And I think this is the heart of the confusion.

For when they simulate the subjective experience of a 'real X', there is a striking dissimilarity between themselves and the simulation, because a 'real X' lacks a pervasive sense of distress originating from self-perceived atypicality.

But what I just described in the previous sentence is itself a typicality inference, which means that this simulation itself causes distress from atypicality, which is used to justify future inferences of self-perceived atypicality!

I expected this to take more than one go-around.

Let's review something Eliezer wrote in Fake Causality:

One of the primary inspirations for Bayesian networks was noticing the problem of double-counting evidence if inference resonates between an effect and a cause.  For example, let's say that I get a bit of unreliable information that the sidewalk is wet.  This should make me think it's more likely to be raining.  But, if it's more likely to be raining, doesn't that make it more likely that the sidewalk is wet?  And wouldn't that make it more likely that the sidewalk is slippery?  But if the sidewalk is slippery, it's probably wet; and then I should again raise my probability that it's raining...

If you didn't have an explicit awareness that you have a general human algorithm that checks the arbitrary self-report against the perceived typicality, but rather you believed that this was some kind of special, transgender-specific self-doubt, then your typicality checker would never be able to mark its own distress signal as 'Typical!', and it would oscillate between judging the subjective experience as atypical, outputting a distress signal in response, judging its own distress signal as atypical, sending a distress signal about that, etc.
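As a toy illustration only (made-up numbers, not a model of anyone's actual psychology): a checker that counts its own distress signal as fresh evidence of atypicality never settles, while one that marks the distress as expected does.

    # Toy model of the double-counting loop described above; purely illustrative numbers.
    def perceived_atypicality(base, distress_counts_as_evidence, rounds=10):
        """Re-run the typicality check a few times and return the final score."""
        distress = 0.0
        for _ in range(rounds):
            score = base
            if distress_counts_as_evidence:
                # The checker treats its own distress signal as further evidence
                # of atypicality, double-counting it on every pass.
                score += distress
            distress = score   # distress tracks how atypical things currently seem
        return score

    print(perceived_atypicality(0.3, distress_counts_as_evidence=True))   # grows every round (3.0 after 10)
    print(perceived_atypicality(0.3, distress_counts_as_evidence=False))  # stays at the base level (0.3)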

And this double-counting is not anything like hair length or voice pitch, or even more slippery stuff like 'being empathetic'; it's very slippery, and no matter how many other ways you would have made yourself more typical, even though those changes would have soothed you, there would have been this separate and additional lingering doubt, a doubt that can only be annihilated by understanding the deep reasons that the tangible interventions worked, and how your mind runs skew to reality.

And that's it. For me at least, this adds up to normality. There is no unbridgeable gap between the point at which you are a non-X and the point at which you become an X. Now you can just go back to making yourself as typical as you want to be, or anything else that you want to be.
