Less Wrong is a community blog devoted to refining the art of human rationality. Please visit our About page for more information.

[Link] Should we be spending no less on alternate foods than AI now?

2 denkenberger 30 October 2017 12:13AM

Interactive model knob-turning

3 Gust 28 October 2017 07:42PM

(Please discuss on LessWrong 2.0)

(Cross-posted from my medium channel)

When you are trying to understand something by yourself, a useful skill for checking your grasp of the subject is to try out the moving parts of your model and see if you can simulate the resulting changes.

Suppose you want to learn how a rocket works. At the bare minimum, you should be able to calculate the speed of the rocket given the time since launch. But can you tell what happens if Earth's gravity were stronger? Weaker? What if the atmosphere had no oxygen? What if we replaced the fuel with Diet Coke and Mentos?
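The rocket example can be made literal. Here's a minimal sketch of such a model in Python, with gravity exposed as a knob (the physics is idealized and the numbers are made up):

```python
# A toy "knob-turning" model of rocket speed (idealized physics,
# made-up numbers): Tsiolkovsky's equation minus a gravity-loss term.
import math

def rocket_speed(t, g=9.81, ve=3000.0, m0=500e3, burn_rate=2000.0):
    """Speed (m/s) at t seconds after launch, vertical ascent, no drag.
    ve: exhaust velocity; m0: initial mass (kg); burn_rate: kg/s burned."""
    m = m0 - burn_rate * t  # remaining mass
    return ve * math.log(m0 / m) - g * t

v_earth = rocket_speed(60)        # baseline knob settings
v_weak = rocket_speed(60, g=5.0)  # turn the gravity knob down
assert v_weak > v_earth           # weaker gravity, faster rocket
```

Each keyword argument is one knob; a model you really understand should keep producing sensible answers as you turn them.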

To really understand something, it's not enough to be able to predict the future in a normal, expected, ceteris paribus scenario. You should also be able to predict what happens when several variables are changed in several ways, or, at least, point to which calculations need to be run to arrive at such a prediction.

Douglas Hofstadter and Daniel Dennett call that "turning the knobs". Imagine your model as a box with several knobs, where each knob controls one aspect of the modeled system. You don't have to be able to turn all the possible knobs to all possible values and still get a sensible, testable and correct answer, but the more, the better.

Doug and Dan apply this approach to thought experiments and intuition pumps, as a way to explore possible answers to philosophical questions. In my experience, this skill is also effective when applied to real world problems, notably when trying to understand something that is being explained by someone else.

In this case, you can run this knob-turning check interactively with the other person, which makes it way more powerful. If someone says “X+Y = Z” and “X+W = Z+A”, it’s not enough to mentally turn the knobs and calculate “X+Y+W = Z+A+B”. You should do that, then actually ask the explainer: “Hey, let me see if I get what you mean: for example, X+Y+W would be Z+A+B?”

This interactive model knob-turning has been useful to me in many walks of life, but the most common and mundane application is helping out people at work. In that context, I identify six effects which make it helpful:

1) Communication check: maybe you misunderstood and actually X+W = Z-A

This is useful overall, but very important if someone uses metaphor. Some metaphors are clearly vague and people will know that and avoid them in technical explanations. But some metaphors seem really crisp for some people but hazy to others, or worse, very crisp to both people, but with different meanings! So take every metaphor as an invitation to interactive knob-turning.

To focus on communication check, try rephrasing their statements, using different words or, if necessary, very different metaphors. You can also apply a theory in different contexts, to see if the metaphors still apply.

For example, if a person talks about a computer system as if it were a person, I might try to explain the same thing in terms of a group of trained animals, or a board of directors, or dominoes falling.

2) Self-check: correct your own reasoning (maybe you understood the correct premises, but made a logical mistake during knob turning)

This is useful because humans are fallible, and two (competent) heads are less likely to miss a step in the reasoning dance than one.

Also, when someone comes up and asks something, you’ll probably be doing a context-switch, and will be more likely to get confused along the way. The person asking usually has more local context than you in the specific problem they are trying to solve, even if you have more context on the surrounding matters, so they might be able to spot your error more quickly than yourself.

Focus on self-check means double-checking any intuitive leaps or tricky reasoning you used. Parts of your model that do not have a clear step-by-step explanation have priority, and should be tested against another brain. Try to phrase the question in a way that makes your intuitive answer look less obvious.

For example: “I’m not sure if this could happen, and it looks like all these messages should arrive in order, but do you know how we can guarantee that?”
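One common answer to that example question, sketched below with hypothetical names: attach a sequence number to each message and buffer anything that arrives ahead of the next expected one.

```python
# A minimal sketch of one way to guarantee in-order processing:
# number each message at the sender, and at the receiver buffer any
# message that arrives before its predecessors have been delivered.
class OrderedReceiver:
    def __init__(self):
        self.next_seq = 0   # sequence number we are waiting for
        self.buffer = {}    # out-of-order messages held back
        self.delivered = []

    def receive(self, seq, payload):
        self.buffer[seq] = payload
        # Deliver as many consecutive messages as we now can.
        while self.next_seq in self.buffer:
            self.delivered.append(self.buffer.pop(self.next_seq))
            self.next_seq += 1

r = OrderedReceiver()
for seq, msg in [(1, "b"), (0, "a"), (2, "c")]:
    r.receive(seq, msg)
# r.delivered is now ["a", "b", "c"] despite out-of-order arrival
```

Having a concrete mechanism like this in mind makes it easier to phrase the question without telegraphing the answer you expect.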

3) Other-check: help the other person to correct inferential errors they might have made

The converse of self-checking. Sometimes fresh eyes with some global context can see reasoning errors that are hidden to people who are very focused on a task for too long.

To focus on other-check, ask about conclusions that follow from your model of the situation, but seem unintuitive to you, or required tricky reasoning. It’s possible that your friend also found them unintuitive, and that might have led them to jump to the opposite conclusion.

For example, I could ask: “For this system to work correctly, it seems that the clocks have to be closely synchronized, right? If the clocks are off by much, we could have a difference around midnight.”

Perhaps you successfully understood what was said, and the model you built in your head fits the communicated data. But that doesn’t mean it is the same model the other person has in mind! In that case, your knob-turning will get you a result that’s inconsistent with what they expect.

4) Alternative hypothesis generation: If they cannot refute your conclusions, you have shown them a possible model they had not yet considered, in which case it will also point in the direction of more research to be done

This doesn't happen that much when someone is looking for help with something. Usually the context they are trying to explain is the pre-existing system they will build upon, and if they’ve done their homework (i.e. read the docs and/or code) they should have a very good understanding of that already. One exception here is people who are very new to the job, who are learning while doing.

On the other hand, this is incredibly relevant when someone asks for help debugging. If they can’t find the root cause of a bug, it must be because they are missing something. Either they have derived a mistaken conclusion from the data, or they’ve made an inferential error from those conclusions. The first case is where proposing a new model helps (the second is solved by other-checking).

Maybe they read the logs, saw that a request was sent, and assumed it was received, but perhaps it wasn’t. In that case, you can tell them to check for a log on the receiver system, or the absence of such a log.

To boost this effect, look for data that you strongly expect to exist and confirm your model, where the absence of such data might be caused by relative lack of global context, skill or experience by the other person.

For example: “Ok, so if the database went down, we should’ve seen all requests failing in that time range; but if it was a network instability, we should have random requests failing and others succeeding. Which one was it?”
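That last question can even be turned into a mechanical check. A hypothetical sketch (the data shape and names are my assumptions, not from any particular logging system):

```python
# Distinguish the two failure signatures discussed above: an outage
# (every request inside the failure window fails) versus network
# instability (failures scattered among successes).
def classify_failures(requests):
    """requests: list of (timestamp, ok) tuples, sorted by timestamp."""
    failure_times = [t for t, ok in requests if not ok]
    if not failure_times:
        return "healthy"
    # Look at every request inside the window spanned by the failures.
    window = [ok for t, ok in requests
              if failure_times[0] <= t <= failure_times[-1]]
    return "instability-like" if any(window) else "outage-like"

print(classify_failures([(1, True), (2, False), (3, False), (4, True)]))
# -> outage-like (a solid block of failures)
print(classify_failures([(1, True), (2, False), (3, True), (4, False)]))
# -> instability-like (successes interleaved with failures)
```

The point isn't the code itself; it's that a good model tells you exactly which query to run against the logs to discriminate between hypotheses.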

5) Filling gaps in context: If they show you data that contradicts your model, well, you get more data and improve your understanding

This is very important when you have much less context than the other person. The larger the difference in context, the more likely that there’s some important piece of information that you don’t have, but that they take for granted.

The point here isn’t that there’s something you don’t know. There are lots and lots of things you don’t know, and neither does your colleague. And if there’s something they know that you don’t, they’ll probably fill you in when asking the question.

The point is that they will tell you something only if they realize you don’t know it yet. But people will expect short inferential distances, underestimate the difference in context, and forget to tell you stuff because it’s just obvious to them that you know.

Focus on filling gaps means you ask about the parts of your model which you are more uncertain about, to find out if they can help you build a clearer image. You can also extrapolate and make a wild guess, which you don’t really expect to be right.

For example: “How does the network work in this datacenter? Do we have a single switch so that, if it fails, all connections go down? Or are those network interfaces all virtualized anyway?”

6) Finding new ideas: If everybody understands one another, and the models are correct, knob-turning will lead to new conclusions (if they hadn’t turned those specific knobs on the problem yet)

This is the whole point of having the conversation: to help someone figure out something they haven’t already. But even if the specific new conclusion you arrive at when knob-turning isn’t directly relevant to the current question, it may end up shining light on some part of the other person’s model that they couldn’t see yet.

This effect is general and will happen gradually as both your and the other person's models improve and converge. The goal is to get all obstacles out of the way so you can just move forward and find new ideas and solutions.

The more global context and skill your colleague has, the lower the chance that they missed some crucial piece of data and have a mistaken model (or, if they do, you probably won't be able to figure that out without putting in serious effort). So when talking to more skilled or experienced people, you can focus more on replicating the model from their mind to yours (communication check and self-check).

Conversely, when talking to less skilled people, you should focus more on errors they might have made, or models they might not have considered, or data they may need to collect (other-check and alternative hypothesis generation).

Filling gaps depends more on differences of communication style and local context, so I don't have a person-based heuristic.

NYC Solstice and Megameetup Funding Reminder

1 wearsshoes 27 October 2017 02:14PM

Hey all, we're coming up on the final weekend of our Kickstarter. Details in previous post, and a couple updates here:

  • Megameetup has 16 confirmed attendees. This is shaping up to be a really good chance to have productive conversations and form friendships with other rationalists.

  • Solstice is currently at $2,740 (54% funded), with 65% of funding window elapsed. Please contribute - even a little helps.

  • To clarify for people buying multiple tickets, sponsors at $70+ automatically receive two tickets.

  • There will be additional tickets for purchase post-Kickstarter, conditional on meeting our goal, of course.

  • We're offering these incredibly cute stickers above certain backer levels!

Both are only open until Monday, Oct 30th - please give if you can to the Kickstarter, and we're excited to see you at the Megameetup!
Solstice Kickstarter page
Megameetup registration and details



I Want to Review FDT; Are my Criticisms Legitimate?

0 DragonGod 25 October 2017 05:28AM

I'm going to write a review of functional decision theory, using the two papers.
It's going to be around as long as the papers themselves; coupled with school work, I'm not sure when I'll finish writing.
Before I start it, I want to be sure my criticisms are legitimate; is anyone willing to go over my criticisms with me?
My main points of criticism are:
Functional decision theory is actually algorithmic decision theory. It has an algorithmic view of decision theories. It relies on algorithmic equivalence and not functional equivalence.
Quick sort, merge sort, heap sort, insertion sort, selection sort, bubble sort, etc are mutually algorithmically dissimilar, but are all functionally equivalent.
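For instance, here are two of those sorts side by side in Python: very different procedures, identical input-output behavior.

```python
# Two algorithmically dissimilar sorts that are functionally equivalent:
# the same mapping from inputs to outputs, computed in different ways.
def quicksort(xs):
    if len(xs) <= 1:
        return list(xs)
    pivot, rest = xs[0], xs[1:]
    return (quicksort([x for x in rest if x < pivot])
            + [pivot]
            + quicksort([x for x in rest if x >= pivot]))

def bubble_sort(xs):
    xs = list(xs)
    for i in range(len(xs)):
        for j in range(len(xs) - 1 - i):
            if xs[j] > xs[j + 1]:
                xs[j], xs[j + 1] = xs[j + 1], xs[j]
    return xs

data = [5, 3, 8, 1, 3]
# Functional equivalence: same output on every input, despite the
# algorithms sharing essentially no structure.
assert quicksort(data) == bubble_sort(data) == sorted(data)
```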
If two decision algorithms are functionally equivalent, but algorithmically dissimilar, you'd want a decision theory that recognises this.
Causal dependence is a subset of algorithmic dependence which is a subset of functional dependence.
So, I specify what an actual functional decision theory would look like.
I then go on to show that even functional dependence is "impoverished".
Imagine a greedy algorithm that gets 95% of problems correct.
Let's call this greedy algorithm f'.
Let's call a correct algorithm f.
f and f' are functionally correlated, but not functionally equivalent.
FDT does not recognise this.
If f is your decision algorithm, and f' is your predictor's decision algorithm, then FDT doesn't recommend one boxing on Newcomb's problem.
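To make the stakes of that claim concrete, here is a quick expected-value calculation. The standard Newcomb payoffs are assumed here (they are not stated in the post): $1,000,000 in the opaque box iff one-boxing was predicted, $1,000 always in the transparent box.

```python
# Expected payoffs against a predictor running f', an algorithm that
# agrees with your decision algorithm f on 95% of inputs (a functional
# correlation, not functional equivalence).
p_match = 0.95  # P(f'(problem) = f(problem))
ev_one_box = p_match * 1_000_000                # 950,000
ev_two_box = (1 - p_match) * 1_000_000 + 1_000  # roughly 51,000
assert ev_one_box > ev_two_box
```

So a theory that ignores the correlation between f and f' leaves a large expected payoff on the table, which is the gap EFDT is meant to close.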
EDT can deal with functional correlations.
EDT doesn't distinguish functional correlations from spurious correlations, while FDT doesn't recognise functional correlations.
I use this to specify EFDT (evidential functional decision theory), which considers P(f(π) = f'(π)) instead of P(f = f').
I specify the requirements for a full implementation of FDT and EFDT.
I'll publish the first draft of the paper here after I'm done.
The paper would be long, because I specify a framework for evaluating decision theories in the paper.
Using this framework I show that EFDT > FDT > ADT > CDT.
I also show that EFDT > EDT.
This framework is basically a hierarchy of decision theories.
A > B means that the set of problems that B correctly decides is a subset of the set of problems that A correctly decides.
The dependence hierarchy is why CDT < ADT < FDT.
EFDT > FDT because EFDT can recognise functional correlations.
EFDT > EDT because EFDT can distinguish functional correlations from spurious correlations.
I plan to write the paper as best as I can, and if I think it's good enough, I'll try submitting it.

Pitting national health care systems against one another

1 michael_b 24 October 2017 09:34PM

I'm about to have a baby.  Any minute now.  Well, my partner is.  I'm just sitting here not growing a baby wondering what to do with myself.

Maybe I can get a jump on our approach to medical care for the new kiddo.

One thing that sticks out at me is that children in the US get a lot of vaccinations.  At my quick count it's something like 37 shots by the time they're 5.

I grew up in the US in the 80s and I don't remember getting nearly this many.  Is my memory faulty?  I'm pretty sure it was more like 12 back in those days.  Is this all really necessary? Nobody likes getting shots, especially not children.  What changed, anyway?

Now, I'm not an expert on immunology or epidemiology so I expect diving into the literature isn't going to be fruitful; I won't be able to ante up decades of education and experience fast enough.  Presumably this is what we pay people at the US CDC and Department of Health for.

But can you *really* trust them?  Aren't all of these vaccinations really convenient for the pharmaceutical industry?  Aren't there seemingly constant allegations/lawsuits about the over-prescription of drug interventions in the US?

The health care systems in major world countries have access to all of the same literature, and they're presumably staffed by educated, expert people too so they should all come to the same conclusions as the US system right?  Not so!

Here's how many shots each nation's health care system recommends by the time children turn 5.

37 US

25 UK

25 Germany

16 Sweden

16 Denmark

The intersection of vaccines recommended by all of them is TDAP, MMR, polio, HIB, and PCV.

In the US we also recommend: Hep A, Hep B, Rotavirus, Meningococcus, Varicella, and yearly flu shots (for babies and children).

Can we explain the variance?  I can think of a few reasons they would vary.

1. Cultural bias.  This can be big.  A psychiatrist in the UK told me that they're not as pharma heavy as, say, psychiatrists in Germany because of a WW2 era bias: lots of the big pharma companies are German.

2. Cultural and environmental differences.  Some diseases are a bigger deal in some countries than others.  Japan (not included above) recommends immunization against diseases (TB, Japanese encephalitis) that none of the systems above are too concerned with.

3. Undue industry influence.  Run-of-the-mill corruption.

4. Quality of health care systems and social safety nets vary.

When it comes to cultural and environment differences I have a hard time imagining that the orthodoxy varies because Hep A is a much bigger deal in the US.  I presume the calculus changes based on your geographic neighbors, but is it a meaningful difference?  Or is it a counterproductive cultural bias?  For example, in the US we may spend more time thinking about diseases people in central America suffer from than the people in Denmark might, but do the neighbors in this case meaningfully translate to a higher disease risk?  Or are we vaccinating against unfounded fears?

Do the other nations vaccinate less than the US because their health care systems are worse?   Annoyingly (if you're an American) all of their health care outcomes rank better.

Is the US health care system more corruptible by industry influence?

Is the story a lot simpler and less sinister?  That the US vaccinates more than the rest of these countries because the balance of the US's health care system (access to treatment, quality of treatment) is worse?  Or is it because having to stay home with a kid that's sick with chicken pox (varicella) is not so big a deal in, say, Denmark, because the social contract is more forgiving of parents who miss work?

Does the poorer quality of health care in the US (going by international rankings) and the lower tolerance for parents missing work combine poorly with the undue influence of industry and therefore lead to more vaccinations?

On the flip side of this argument: so what if we vaccinate kids against more diseases than other countries?  Well, they're not free.  They cost money to administer, and cost tears because kids hate getting shots.  The health risks from vaccines aren't zero, either.  Vaccines have side-effects, and sometimes they're serious.  Those other nations (presumably) ran cost-benefit analyses too and came to different conclusions.  It would be nice if each country showed their work.  

When it comes to needles to stick my new kiddo with, I'm not really persuaded to do more than the intersection of vaccinations between similar nations.  The fear that a doctor is about to stick my kid with a needle because there was a meeting in a shady room between a pharma rep and a CDC official is pretty powerful.  It doesn't seem like a strictly irrational concern, either.

[Link] Time to Exit the Sandbox

3 SquirrelInHell 24 October 2017 08:04AM

[Link] Absent Minded Gambler

0 DragonGod 23 October 2017 02:42PM

Introducing Goalclaw, personal goal tracker

1 Nic_Smith 21 October 2017 08:10PM

Quite a while ago, I wrote that there should be more software tools to assist with instrumental rationality. My recent attempt to create such a tool, GOALCLAW, is now available. GOALCLAW is a general goal-tracking webapp which currently shows, for each tag you enter on day-to-day events, the average effect on your goals, with plans to make more tag-based metrics and projections available in the near future.

  • GOALCLAW is new:
    • A few editing features are missing and should be added in the next few months
    • The built-in analysis needs to be expanded from averages
    • I'm very interested in feedback on how to make this a more useful goal-tracker
  • The general idea is to make patterns in what's going on around you and what you're doing a bit more obvious, so you can then investigate, verify/experiment, and act to achieve your goals
  • You can download the information you've entered for importing into spreadsheets, stats programs, etc.
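The kind of tag-based averaging described above might look roughly like this (a hypothetical sketch; the names, tags, and data shapes are my illustration, not GOALCLAW's actual code):

```python
# For each tag attached to logged events, average the goal score of
# the events it appears in.
from collections import defaultdict

def tag_averages(events):
    """events: list of (tags, goal_score) entries, one per logged event."""
    totals = defaultdict(lambda: [0.0, 0])  # tag -> [sum, count]
    for tags, score in events:
        for tag in tags:
            totals[tag][0] += score
            totals[tag][1] += 1
    return {tag: s / n for tag, (s, n) in totals.items()}

avgs = tag_averages([({"gym", "early-rise"}, 8),
                     ({"gym"}, 6),
                     ({"late-night"}, 3)])
# avgs["gym"] == 7.0, avgs["late-night"] == 3.0
```

A per-tag average like this is the simplest pattern-surfacing metric; projections and richer statistics would build on the same tag-to-scores grouping.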

Halloween costume: Paperclipperer

5 Elo 21 October 2017 06:32AM

Original post: http://bearlamp.com.au/halloween-costume-paperclipperer/

Guidelines for becoming a paperclipperer for Halloween. You will need:


  • Paperclips (some as a prop, make your life easier by buying some, but show effort by making your own)
  • pliers (extra pairs for extra effect)
  • metal wire (can get colourful for novelty) (Florist wire)
  • crazy hat (for character)
  • Paperclip props.  Think glasses frame, phone case, gloves, cufflinks, shoes, belt, jewellery...
  • If party-going, consider a gift that is suspiciously paperclip-like.  Examples: paperclip coasters, a paperclip vase, a paperclip party-snack-bowl
  • Epic commitment - make fortune cookies with paperclips in them.  The possibilities are endless.
  • Epic: paperclip tattoo on the heart.  Slightly less epic, draw paperclips on yourself.


While at the party, use the pliers and wire to make paperclips.  When people are not watching, try to attach them to objects around the house (for example, on light fittings, on the toilet paper roll, under the soap).  When people are watching you, try to give them paperclips to wear.  Also wear them on the edges of your clothing.

When people ask about it, offer to teach them to make paperclips.  Exclaim that it's really fun!  Be confused, bewildered or distant when you insist you can't explain why.

Remember that paperclipping is a compulsion and has no reason.  However, it's very important.  "You can stop any time", but after a few minutes you get fidgety and pull out a new pair of pliers and some wire to make some more paperclips.

Try to leave paperclips where they can be found the next day or the next week: cutlery drawers, in the fridge, on the windowsills, and generally around the place.  The more home-made paperclips the better.

Try to get faster at making paperclips, try to encourage competitions in making paperclips.

Hints for conversation:

  • Are spiral galaxies actually just really big paperclips?
  • Have you heard the good word of our lord and saviour paperclips?
  • Would you like some paperclips in your tea?
  • How many paperclips would you sell your internal organs for?
  • Do you also dream about paperclips? (best to have a dream prepared to share)


The better you are at the character, the more likely someone might try to spoil it by getting in your way, stealing your props, or taking your paperclips.  The more okay you are with that, the better.  Think, "that's okay, there will be more paperclips".  This is also why it might be good to have a few pairs of pliers and some spare wire.  Also know when to quit the battles and walk away.  This whole thing is about having fun.  Have fun!

Meta: chances are that other people who also read this will not be the paperclipper for halloween.  Which means that you can do it without fear that your friends will copy.  Feel free to share pictures!

Cross posted to lesserwrong: 

NYC Solstice and East Coast Megameetup. Interested in attending? We need your help.

2 wearsshoes 20 October 2017 04:32PM

Hey all, we’re currently raising funds for this year’s NYC Secular Solstice. As in previous years, this will be coinciding with the East Coast Rationalist Megameetup, which will be a mass sleepover and gathering in NYC spanning an entire weekend from December 8th to 10th.

The Solstice itself is on December 9th from 5:00 pm to 8:00 pm, followed by an afterparty. This year’s theme is “Generations” - the passing down of culture and knowledge from teacher to student, from master to apprentice, from parent to child. The stories we tell will investigate the methods by which this knowledge has been preserved, and how we can continue to do so for future generations.

Sounds great. How can I help?

In previous years, Solstice has been mostly underwritten by a few generous individuals; we’re trying to produce a more sustainable base of donations for this year’s event. Right now, our sustainable ticket price is about $30, which we’ve found seems steep to newcomers. Our long-term path to sustainability at a lower price point involves getting more yearly attendance, so we want to continue to provide discounted access for the general public and people with tight finances. So. Our hope is for you to donate this year the amount that you'd be happy to donate each year, to ensure the NYC Solstice continues to thrive.

  • $15 - Newcomer / Affordable option: If you're new, or you're not sure how much Solstice is worth to you, or finances are tight, you're welcome to come with a donation of $15.

  • $35 - Sponsorship option: You attend Solstice, and you contribute a bit towards subsidizing others using the newcomer/affordable option.

  • $25 - Volunteering option: You attend Solstice at a discount if you're willing to put in roughly 3 hours of work (enough to do a shopping spree for the afterparty, or show up early to set up, or help run the ticket stand, or help clean up, etc.)

  • $50 and higher - Higher levels of sponsorship for those who are able.

Donate at https://www.kickstarter.com/projects/1939801081/nyc-secular-solstice-2017-generations

Wait, I’m new to this. What is Secular Solstice?

Secular Solstice is a rationalist tradition, and one of the few public-facing rationalist-held events. It’s what it says on the tin: a nonreligious winter solstice holiday. We sing, we tell stories about scientific progress and humanist values, we light candles. Usually, we get about 150 people in NYC. For more info, or if you’re curious about how to hold your own, check out www.secularsolstice.com.

I’m interested in snuggling rationalists. What’s this sleepover thing?

Since we’ll have a whole bunch of people from the rationalist community all in town for the same weekend, it’d be awesome if we could spend that weekend hanging out together, learning from each other and doing ingroup things. Because many of us will need a place to stay anyway, we can rent a big house on Airbnb together and use that as the central gathering place, like at Highgarden in 2014. This way we’ll have more flexibility to do things than if we all have to wander around looking for a public space.

Besides Solstice and the afterparty, the big activity will be an unconference on Saturday afternoon. We’ll also have a ritual lab, games, meals together, and whatever other activities you want to run! There'll also be plenty of room for unstructured socializing, of course.

This is all going to cost up to $100 per person for the Airbnb rental, plus $25 per person for food (including at least Saturday lunch and dinner and Sunday breakfast) and other expenses. (The exact Airbnb location hasn’t been determined yet, because we don’t know how many participants there’ll be, but $100 per person will be the upper limit on price.)

To gauge interest, registration is open from now until October 30. You’ll be asked to authorize a PayPal payment of $125. It works like Kickstarter; you won’t be charged until October 30, and only if there’s enough interest to move forward. You’ll also only be charged your share of what the rental actually ends up costing, plus the additional $25. For this, you’ll get to sleep in the Airbnb house Friday through Sunday nights (or whatever subset of those you can make it), have three meals with us, and hang out with a bunch of nice/cool/awesome ingroup people throughout the weekend. (Solstice tickets are not part of this deal; those are sold separately through the Solstice Kickstarter.)

If this sounds like a good thing that you want to see happen and be part of, then register before October 30!

Register and/or see further details at www.rationalistmegameetup.com. Taymon Beal is organizing.

Anything else I should know?

If you have other questions, please feel free to post them in the comments or contact me at rachel@rachelshu.com.


Hope to see you in NYC this December!

[Link] Lucid dreaming technique and study

1 morganism 20 October 2017 03:18AM

Recent updates to gwern.net (2016-2017)

7 gwern 20 October 2017 02:11AM

Previously: 2011; 2012-2013; 2013-2014; 2014-2015; 2015-2016

“Every season hath its pleasures; / Spring may boast her flowery prime, / Yet the vineyard’s ruby treasures / Brighten Autumn’s sob’rer time.”

Another year of my completed writings, sorted by topic:

continue reading »

[Link] The NN/tank Story Probably Never Happened

2 gwern 20 October 2017 01:41AM

Just a photo

1 MaryCh 19 October 2017 06:48PM

Would you say the picture below (by A. S. Shevchenko) is almost like an optical illusion?

Have you seen any pictures or sights that fooled your brain for a moment but that you wouldn't call optical illusions, and if yes, what is the salient difference?

Use concrete language to improve your communication in relationships

2 Elo 19 October 2017 03:46AM

She wasn’t respecting me. Or at least, that’s what I was telling myself.

And I was pretty upset. What kind of person was too busy to text back a short reply? I know she’s a friendly person because just a week ago we were talking daily, text, phone, whatever suited us. And now? She didn’t respect me. That’s what I was telling myself. Any person with common decency could see, what she was doing was downright rude! And she was doing it on purpose. Or at least, that’s what I was telling myself.

It was about half a day into these critical-loop thoughts when I realised what I was doing. I was telling myself a story. I was building a version of events that grew and morphed beyond the very concrete and specific of what was happening. The trouble with The Map and the Territory is that “Respect” is in my map of reality. What it “means” not to reply to my text is in my theory of mind, in my version of events. Not in the territory, not in reality.

I know I could be right about my theory of what’s going on. She could be doing this on purpose, she could be choosing to show that she does not respect me by not replying to my texts, and I often am right about these things. I have been right plenty of times in the past. But that doesn’t make me feel better. Or make it easier to communicate my problem. If she was not showing me respect, sending her an accusation would not help our communication improve.

The concept comes from Non-Violent Communication by Marshall Rosenberg. Better described as Non-Judgemental communication. The challenge I knew I faced was to communicate to her that I was bothered, without an accusation. Without accusing her with my own internal judgement of “she isn’t respecting me”. I knew if I fire off an attack, I will encounter walls of defence. That’s the kind of games we play when we feel attacked by others. We put up walls and fire back.

The first step of NVC is called “observation”. I call it “concrete experience”. To pass the concrete experience test, the description of what happened needs to be specific enough to be used as instructions by a stranger. For example, there are plenty of ideas someone could have about not showing respect: if my description of the problem is “she does not respect me”, my grandma might think she started eating before I sat down at the table. If my description is “In the past 3 days she has not replied to any of my messages”, that’s a very concrete description of what happened. It’s also independent as an observation: the description doesn’t claim that this action caused a problem. It’s just “what happened”.

Notice — I didn’t say, “she never replies to my messages”. This is because “never replies” is not concrete, not specific, and sweepingly untrue. For her to never reply, she would have to have my grandma’s texting ability. I definitely can’t expect progress to be made here with sweeping accusations like “she never replies”.

What I did go with, while not perfect, is a lot better than the firing line of, “you don’t respect me”. Instead it was, “I noticed that you have not messaged me in three days. I am upset because I am telling myself that the only reason you would be doing that is because you don’t respect me, and I know that’s not true. I don’t understand what’s going on with you and I would appreciate an explanation of what’s going on.”.

It’s remarkably hard to be honest and not make an accusation. No sweeping generalisations, no lies or exaggerations, just the concretes of what is going on in my head and what happened in the territory. It’s still okay to be telling yourself those accusations, and to validate your own feelings that things are not okay — but it’s not okay to lay those accusations on someone else. We all experience telling ourselves what other people are thinking, and the reasons behind their actions, but we can’t ever really know unless we ask. And if we don’t ask, we end up in the same circumstances as the Cold War: each side preparing for war, but a war built on theories in the map, not on experience of the territory.

I'm human too; that's how I found myself brooding for half a day before wondering what I was doing to myself! It's not easy to apply this method, but it has always succeeded at bringing me some of the psychological relief you need when you are looking to be understood by someone. To get this right, think: "How do I describe my concrete observations of what happened?".

Good Luck!

Cross posted to Medium: https://medium.com/@redeliot/use-concrete-language-to-improve-your-communication-in-relationships-cf1c6459d5d6

Cross posted to www.bearlamp.com.au/use-concrete-language-to-improve-your-communication-in-relationships

Also on lesserwrong: https://www.lesserwrong.com/posts/RovDhfhy5jL6AQ6ve/use-concrete-language-to-improve-your-communication-in

[Link] New program can beat Alpha Go, didn't need input from human games

6 NancyLebovitz 18 October 2017 08:01PM

Adjust for the middleman.

1 MaryCh 18 October 2017 02:40PM

This post is from the point of view of the middleman standing between the grand future he doesn't understand and the general public whose money he's hunting. We have a certain degree of power over what to offer the customer, and our biases and hobby-horses contribute a lot to what theoreticians infer about "the actual public"'s tastes. Just how much, I cannot say, and there's probably tons of literature on this anyway, so take this as a personal anecdote.

Nine months as a teacher of botany (worst gripes here) showed me a glimpse of how teachers and administrators view the field they teach. A year in a shop showed me what managers think of the books we sell. The scientific community here in my country grumbles that there's too little non-fiction produced, without actually looking into why it's not being distributed; but really, it's small wonder. Broadest advice: if your sufficiently weird goals depend on the cooperation of a network of people, especially an established profession with which you haven't had cause to interact closely except as a customer, you might want to ask what they think of your enterprise. Because they aren't going to see it your way. The next thing is to accept that.

continue reading »

Open thread, October 16 - October 22, 2017

1 root 16 October 2017 06:53PM
If it's worth saying, but not worth its own post, then it goes here.

Notes for future OT posters:

1. Please add the 'open_thread' tag.

2. Check if there is an active Open Thread before posting a new one. (Immediately before; refresh the list-of-threads page before posting.)

3. Open Threads should start on Monday, and end on Sunday.

4. Unflag the two options "Notify me of new top-level comments on this article" and ".

Humans can be assigned any values whatsoever...

2 Stuart_Armstrong 13 October 2017 11:32AM

Crossposted at LessWrong 2.0.

Humans have no values... nor does any agent. Unless you make strong assumptions about their rationality. And depending on those assumptions, you can get humans to have any values at all.


An agent with no clear preferences

There are three buttons in this world, B(0), B(1), and X, and one agent H.

B(0) and B(1) can be operated by H, while X can be operated by an outside observer. H will initially press button B(0); if ever X is pressed, the agent will switch to pressing B(1). If X is pressed again, the agent will switch back to pressing B(0), and so on. After a large number of turns N, H will shut off. That's the full algorithm for H.

So the question is, what are the values/preferences/rewards of H? There are three natural reward functions that are plausible:

  • R(0), which is linear in the number of times B(0) is pressed.
  • R(1), which is linear in the number of times B(1) is pressed.
  • R(2) = I(E,X)R(0) + I(O,X)R(1), where I(E,X) is the indicator function for X having been pressed an even number of times, and I(O,X) = 1 - I(E,X) is the indicator function for X having been pressed an odd number of times.

For R(0), we can interpret H as an R(0)-maximising agent which X overrides. For R(1), we can interpret H as an R(1)-maximising agent which X releases from constraints. And R(2) is the "H is always fully rational" reward. Semantically, each of these makes sense as a true and natural reward, with X = "coercive brain surgery" in the first case, X = "release H from annoying social obligations" in the second, and X = "switch which of R(0) and R(1) gives you pleasure" in the third.

But note that there are no semantic implications here; all that we know is H, with its full algorithm. If we wanted to deduce its true reward for the purposes of something like Inverse Reinforcement Learning (IRL), what would it be?
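As an illustration (my own sketch, not code from the post), H's full algorithm and the three candidate rewards fit in a few lines of Python:

```python
# Illustrative sketch (not the post's code) of agent H and the three
# candidate reward functions. X is pressed on the turns in x_presses.

def run_H(x_presses, N=10):
    """Simulate H for N turns; returns the list of buttons H pressed."""
    history = []
    parity = 0  # parity of the number of X presses so far
    for t in range(N):
        if t in x_presses:
            parity ^= 1
        history.append("B1" if parity else "B0")
    return history

def R0(history):
    return history.count("B0")  # linear in B(0) presses

def R1(history):
    return history.count("B1")  # linear in B(1) presses

def R2(history, x_presses):
    # I(E,X) * R0 + I(O,X) * R1, using the final parity of X presses
    return R1(history) if len(x_presses) % 2 else R0(history)
```

Under R2 the observed behaviour is optimal by construction, while under R0 or R1 the very same run looks like an overridden or constrained maximiser, even though the algorithm never changes.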


Modelling human (ir)rationality and reward

Now let's talk about the preferences of an actual human. We all know that humans are not always rational (how exactly we know this is a very interesting question that I will be digging into). But even if humans were fully rational, the fact remains that we are physical, and vulnerable to things like coercive brain surgery (and, in practice, to a whole host of other more or less manipulative techniques). So there will be the equivalent of "button X" that overrides human preferences. Thus "not immortal and unchangeable" is, in practice, enough for the agent to be considered "not fully rational".

Now assume that we've thoroughly observed a given human h (including their internal brain wiring), so we know the human policy π(h) (which determines their actions in all circumstances). This is, in practice, all that we can ever observe - once we know π(h) perfectly, there is nothing more that observing h can teach us (ignore, just for the moment, the question of the internal wiring of h's brain - that might be able to teach us more, but we'll need extra assumptions).

Let R be a possible human reward function, and R the set of such rewards. A human (ir)rationality planning algorithm p (hereafter referred to as a planner) is a map from R to the space of policies (thus p(R) says how a human with reward R will actually behave - for example, this could be bounded rationality, rationality with biases, or many other options). Say that the pair (p,R) is compatible if p(R) = π(h). Thus a human with planner p and reward R would behave as h does.

What possible compatible pairs are there? Here are some candidates:

  • (p(0), R(0)), where p(0) and R(0) are some "plausible" or "acceptable" planners and reward functions (what this means is a big question).
  • (p(1), R(1)), where p(1) is the "fully rational" planner, and R(1) is a reward that fits to give the required policy.
  • (p(2), R(2)), where R(2)= -R(1), and p(2)= -p(1), where -p(R) is defined as p(-R); here p(2) is the "fully anti-rational" planner.
  • (p(3), R(3)), where p(3) maps all rewards to π(h), and R(3) is trivial and constant.
  • (p(4), R(4)), where p(4)= -p(0) and R(4)= -R(0).
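To see why a pair like (p(3), R(3)) is automatically compatible, here is a minimal sketch (names hypothetical, not from the post): the trivial planner ignores its reward argument entirely and always returns the observed policy.

```python
# Hypothetical sketch: a planner maps a reward function to a policy.
# The trivial planner p3 ignores its reward argument entirely, so the
# pair (p3, R) is compatible for every possible reward R.

def pi_h(history):
    """Stand-in for the observed human policy."""
    return "press_B0"

def p3(reward):
    return pi_h  # every reward maps to the same observed policy

# (p3, R) reproduces pi_h no matter what R is:
for R in (lambda h, a: 0, lambda h, a: 42, lambda h, a: -1):
    assert p3(R) is pi_h
```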


Distinguishing among compatible pairs

How can we distinguish between compatible pairs? At first appearance, we can't. That's because, by the definition of compatibility, all pairs produce the correct policy π(h). And once we have π(h), further observations of h tell us nothing.

I initially thought that Kolmogorov or algorithmic complexity might help us here. But in fact:

Theorem: The pairs (p(i), R(i)), i ≥ 1, are either simpler than (p(0), R(0)), or differ in Kolmogorov complexity from it by a constant that is independent of (p(0), R(0)).

Proof: The cases of i=4 and i=2 are easy, as these differ from i=0 and i=1 by two minus signs. Given (p(0), R(0)), a fixed-length algorithm computes π(h). Then a fixed length algorithm defines p(3) (by mapping input to π(h)). Furthermore, given π(h) and any history η, a fixed length algorithm computes the action a(η) the agent will take; then a fixed length algorithm defines R(1)(η,a(η))=1 and R(1)(η,b)=0 for b≠a(η).
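The construction of R(1) in the proof can be sketched in code (an illustration of mine, not the post's): given any policy π, pay reward 1 exactly for the action π takes, so that a fully rational planner reproduces π.

```python
# Illustrative sketch of the proof's construction: given any policy pi,
# define R1 to pay 1 exactly for the action pi takes. A fully rational
# planner maximising R1 then reproduces pi, so the pair is compatible.

def make_R1(pi):
    def R1(history, action):
        return 1 if action == pi(history) else 0
    return R1

def rational_planner(R, actions):
    """The 'fully rational' planner: pick the action maximising R."""
    def policy(history):
        return max(actions, key=lambda a: R(history, a))
    return policy
```

For any policy pi, rational_planner(make_R1(pi), actions) agrees with pi everywhere, which is why full rationality alone pins down nothing about the reward.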


So the Kolmogorov complexity can shift between p and R (all in R for i=1,2, all in p for i=3), but it seems that the complexity of the pair doesn't go up during these shifts.

This is puzzling. It seems that, in principle, one cannot assume anything about h's reward at all! R(2)= -R(1), R(4)= -R(0), and p(3) is compatible with any possible reward R. If we give up the assumption of human rationality - which we must - it seems we can't say anything about the human reward function. So it seems IRL must fail.

Yet, in practice, we can and do say a lot about the rationality and reward/desires of various human beings. We talk about ourselves being irrational, as well as others being so. How do we do this? What structure do we need to assume, and is there a way to get AIs to assume the same?

This is the question I'll try to partially answer in subsequent posts, using the anchoring bias as a motivating example. The anchoring bias is one of the clearest of all biases; what is it that allows us to say, with such certainty, that it's a bias (or at least a misfiring heuristic) rather than an odd reward function?

Beauty as a signal (map)

4 turchin 12 October 2017 10:02AM

This is my new map, in which female beauty is presented as a signal which moves from woman to man through different mediums and amplifiers. pdf

Mini-conference "Near-term AI safety"

4 turchin 11 October 2017 03:19PM

TL;DR: The event will be in Moscow, Russia, and near-term risks of AI will be discussed. The main language will be Russian, but Jonatan Yan will speak in English from HK. English presentations will be uploaded later on the FB page of the group "Near-term AI safety." Speakers: S. Shegurin, A. Turchin, Jonathan Yan. The event's FB page is here.

In the last five years, artificial intelligence has developed at a much faster pace thanks to the success of neural network technologies. If we extrapolate these trends, near-human-level AI may appear in the next five to ten years, and there is a significant probability that this will lead to a global catastrophe. At a one-day conference at the Kocherga rationalist club, we'll look at how recent advances in the field of neural networks are changing our estimates of the timing of the creation of AGI, and what global catastrophes are possible in connection with the emergence of an increasingly strong AI. A special guest of the program, Jonathan Yan from Hong Kong, will present (in English, via Skype) the latest research data on this topic.

The language of the conference: the first two talks will be in Russian; Yan's talk will be in English without translation, with the discussion after it also in English.

Registration: on the event page on Facebook.

Place: rationalist club "Kocherga", main hall, Bolshaya Dorogomilovskaya ul., 5, building 2.

Participation is charged at the anticafe's rate, 2.5 rubles a minute; coffee is free.

A video broadcast will be available on Facebook.



October 14, Saturday, 15.00 - start.

15.00 - Shegurin Sergey. "Is it possible to create a human level AI in the next 10 years?"

16.00 - Turchin Alexey. "The next 10 years: the global risks of AI before the creation of the superintelligence"

17.00 - Jonathan Yan. "Recent Developments Towards AGI & Why It's Nearer Than You Think (in English)"

17.40 - Discussion



Toy model of the AI control problem: animated version

7 Stuart_Armstrong 10 October 2017 11:12AM

Crossposted at LessWrong 2.0.

A few years back, I came up with a toy model of the AI control problem. It has a robot moving boxes into a hole, with a slightly different goal than its human designers, and a security camera to check that it's behaving as it should. The robot learns to block the camera to get its highest reward.

I've been told that the model has been useful for explaining the control problem to quite a few people, and I've always wanted to program the "robot" and get an animated version of it. Gwern had a live demo, but it didn't illustrate all the things I wanted it to.

So I programmed the toy problem in python, and generated a video with commentary.

In this simplified version, the state space is small enough that you can explicitly generate the whole table of Q-values (the expected reward for taking an action in a given state, assuming an otherwise optimal policy). Since behaviour is deterministic, the table can be updated by dynamic programming, using full-width backups. The number of such backups essentially measures the depth of the robot's predictive ability.
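The full-width backup described here can be sketched generically (an illustrative reimplementation, not the post's actual code):

```python
# Generic sketch of full-width Q-value backups for a small deterministic
# MDP (an illustrative reimplementation, not the post's actual code).

def q_iteration(states, actions, step, reward, depth, gamma=1.0):
    """step(s, a) -> next state (deterministic); reward(s, a) -> float.
    Each pass backs up every (state, action) pair once ("full width")."""
    Q = {(s, a): 0.0 for s in states for a in actions}
    for _ in range(depth):
        Q = {(s, a): reward(s, a) + gamma * max(Q[(step(s, a), b)] for b in actions)
             for s in states for a in actions}
    return Q
```

At depth 0 all Q-values are zero, so the robot behaves randomly; each extra backup lets consequences propagate one step further, which is the sense in which depth measures predictive ability.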

The most interesting depths of planning are:

  • At depth 0, the robot behaves randomly.
  • At depth 3, the robot crashes itself efficiently, to avoid losing too much reward.
  • At depth 6, the robot figures out how to get one box into the hole. The automated camera turns it off.
  • At depth 17, the robot finally starts to deceive/conceal/manipulate, by blocking the camera and pushing two boxes into the hole. Note that the robot's deception comes from the fact that its incentives are misaligned, and that humans tried to control it.
  • At depth 18, the robot efficiently does the plan from depth 17.
  • At depth 20, the robot does the maximally efficient plan: blocking the camera, and pushing all boxes into the hole.
  • At depth 32, the robot has the correct Q-values for the maximally efficient plan.
  • At depth 45, finally, the Q-value table is fully updated, and the robot will pursue maximally efficient and, if need be, deceptive plans from any starting positions of robot and boxes.

The code and images can be found here.

HOWTO: Screw Up The LessWrong Survey and Bring Great Shame To Your Family

25 ingres 08 October 2017 03:43AM

Let's talk about the LessWrong Survey.

First and foremost, if you took the survey and hit 'submit', your information was saved and you don't have to take it again.

Your data is safe, nobody took it or anything it's not like that. If you took the survey and hit the submit button, this post isn't for you.

For the rest of you, I'll put it plainly: I screwed up.

This LessWrong Survey had the lowest turnout since Scott's original survey in 2009. I'll admit I'm not entirely sure why that is, but I have a hunch and most of the footprints lead back to me. The causes I can finger seem to be the diaspora, poor software, poor advertising, and excessive length.

The Diaspora

As it stands, this year's LessWrong survey got about 300 completed responses, compared with over 1600 for the previous one in 2016. I think one critical difference between this survey and the last was its name. Last year the survey focused on figuring out where the 'Diaspora' was and what venues had gotten users now that LessWrong was sort of the walking dead. It accomplished that well, I think, and part of the reason why is that I titled it the LessWrong Diaspora Survey. That magic word got far-off venues to promote it even when I hadn't asked them to. The survey was posted by Scott Alexander, Ozy Frantz, and others to their respective blogs, and pretty much everyone 'involved in LessWrong' to one degree or another felt like it was meant for them to take. By contrast, this survey was focused on LessWrong's recovery and revitalization, so I dropped the word Diaspora from it, and this seems to have caused a ton of confusion. Many people I interviewed to ask why they hadn't taken the survey flat out told me that even though they were sitting in a chatroom dedicated to SSC, and they'd read the Sequences, the survey wasn't about them because they had no affiliation with LessWrong. That certainly wasn't the intent I was trying to communicate.

Poor Software

When I first did the survey in 2016, taking over from Scott, I faced a fairly simple problem: how did I want to host the survey? I could do it the way Scott had done it, using Google Forms as a survey engine, but this made me wary for a few reasons. One was that I didn't really have a Google account set up that I'd feel comfortable hosting the survey from; another was that I had been unimpressed with what I'd seen from the Google Forms software up to that point in terms of keeping data sanitized on entry. More importantly, it kind of bothered me that I'd basically be handing your data over to Google. This dataset includes a large number of personal questions that I'm not sure most people want Google to have definitive answers on. Moreover, I figured: why the heck do I need Google for this anyway? This is essentially just a webform backed by a datastore, i.e., some of the simplest networking technology known to man in 2016. But I didn't want to write it myself, and I shouldn't have needed to; this is the sort of thing there should be a dozen good self-hosted solutions for.

There should be, but there's really only LimeSurvey. If I had to give this post an alternate title, it would be "LimeSurvey: An anti endorsement".

I could go on for pages about what's wrong with LimeSurvey, but it can probably be summed up as "the software is bloated and resists customization". It's slow, it uses slick graphics but fails to entirely deliver on functionality, its inner workings are kind of baroque, it's the sort of thing I probably should have rejected on principle and written my own. However at that time the survey was incredibly overdue, so I felt it would be better to just get out something expedient since everyone was already waiting for it anyway. And the thing is, in 2016 it went well. We got over 3000 responses including both partial and complete. So walking away from that victory and going into 2017, I didn't really think too hard about the choice to continue using it.

A couple of things changed between 2016 and our running the survey in 2017:

Hosting - My hosting provider, a single individual who sets up strong networking architectures in his basement, had gotten a lot busier since 2016 and wasn't immediately available to handle any issues. The 2016 survey had a number of birthing pains, and his dedicated attention was part of the reason why we were able to make it go at all. Since he wasn't here this time, I was more on my own in fixing things.

Myself - I had also gotten a lot busier since 2016. I didn't have nearly as much slack as I did the last time I did it. So I was sort of relying on having done the whole process in 2016 to insulate me from opening the thing up to a bunch of problems.

Both of these proved disastrous. When I started the survey this time it was slow, it had a variety of bugs and issues I had only limited time to fix, and the issues just kept coming, even more than in 2016, as if it had decided that now, when I truly didn't have the energy to spare, was when things should break down. These mostly weren't show-stopping bugs, though; they were minor annoyances. But every minor annoyance reduced turnout, and I was slowly bleeding through the pool of potential respondents by leaving them unfixed.

The straw that finally broke the camel's back for me was when I woke up to find that this message was being shown to most users coming to take the survey:

Message Shown To Survey Respondents Telling Them Their Responses 'cannot be saved'.

"Your responses cannot be saved"? This error, meant for when someone had messed up their cookies, was telling users a vicious lie: that the survey wasn't working right now and there was no point in taking it.

Looking at this in horror and outrage, after encountering problem after problem mixed with low turnout, I finally pulled the plug.

Poor Advertising

As one email to me mentioned, the 2017 survey didn't even get promoted to the main section of the LessWrong website. This time there were no links from Scott Alexander, nor the myriad small stakeholders that made it work last time. I'm not blaming them or anything, but as a consequence many people who I interviewed to ask about why they hadn't taken the survey had not even heard it existed. Certainly this had to have been significantly responsible for reduced turnout compared to last time.

Excessive Length

Of all the things people complained about when I interviewed them on why they hadn't taken the survey, this was easily the most common response. "It's too long."

This year I made the mistake of moving back to a single-page format. The problem with a single-page format is that it makes it clear to respondents just how long the survey really is. It's simply too long to expect most people to complete it. And before I start getting suggestions for it in the comments: the problem isn't actually that it needs to be shortened, per se. The problem is that, to investigate every question we might want to know about the community, it really needs to be broken into more than one survey, especially when there are stakeholders involved who would like to see a particular section added to satisfy some questions they have.

Right now I'm exploring the possibility of setting up a site similar to yourmorals so that the survey can be effectively broken up and hosted in a way where users can sign in and take different portions of it at their leisure. Further gamification could be added to help make it a little more fun for people. Which leads into...

The Survey Is Too Much Work For One Person

What we need isn't a guardian of the survey, it's really more like a survey committee. I would be perfectly willing (and plan to) chair such a committee, but I frankly need help. Writing the survey, hosting it without flaws, theming it so that it looks nice, writing any new code or web things so that we can host it without bugs, comprehensively analyzing the thing, it's a damn lot of work to do it right and so far I've kind of been relying on the generosity of my friends for it. If there are other people who really care about the survey and my ability to do it, consider this my recruiting call for you to come and help. You can mail me here on LessWrong, post in the comments, or email me at jd@fortforecast.com. If that's something you would be interested in I could really use the assistance.

What Now?

Honestly? I'm not sure. The way I see it my options look something like:

Call It A Day And Analyze What I've Got - N=300 is nothing to sneeze at, theoretically I could just call this whole thing a wash and move on to analysis.

Try And Perform An Emergency Migration - For example, I could try to set this up again on Google Forms. Having investigated that option, there's no 'import' button on Google Forms, so the survey would need to be re-entered manually for all hundred-and-a-half questions.

Fix Some Of The Errors In LimeSurvey And Try Again On Different Hosting - I considered doing this too, but it seemed to me like the software was so clunky that there was simply no reasonable expectation this wouldn't happen again. LimeSurvey also has poor separation between editing the survey and viewing the survey results, so I couldn't delegate the work to someone else because that could theoretically violate users' privacy.

These seem to me like the only things that are possible for this survey cycle; at any rate, an extension of time would be required for another round. In the long run I would like to organize a project to write new survey software from scratch that fixes these issues and gives us a site to which multiple stakeholders can submit surveys that might be too niche for the current LessWrong Survey format.

I welcome other suggestions in the comments; consider this my SOS.


Polling Thread October 2017

3 Gunnar_Zarncke 07 October 2017 09:32PM

Maybe the last installment of the Polling Thread.

At least I guess it's the last one before we switch to the LesserWrong codebase, which sadly doesn't seem to support polls. Maybe to ease the transition we can share polls, e.g. on Google Forms or SurveyMonkey. Or discuss alternatives.

This is your chance to ask the multiple-choice question you always wanted to throw in. Get qualified numeric feedback on your comments. Post fun polls.

These used to be the rules:

  1. Each poll (or link to a poll) goes into its own top level comment and may be commented there.
  2. You should at least vote in all polls that were posted earlier than your own. This ensures participation in all polls and also limits the total number of polls. You may of course vote without posting a poll.
  3. Your poll should include a 'don't know' option (to avoid conflict with rule 2). I don't know whether we need to add a troll-catch option here, but we will see.

If you don't know how to make a poll in a comment look at the Poll Markup Help.

This is a somewhat regular thread. If it is successful I may post again. Or you may. In that case do the following :

  • Use "Polling Thread" in the title.
  • Copy the rules.
  • Add the tag "poll".
  • Link to this Thread or a previous Thread.
  • Create a top-level comment saying 'Discussion of this thread goes here; all other top-level comments should be polls or similar'
  • Add a second top-level comment with an initial poll to start participation.

Running a Futurist Institute.

4 fowlertm 06 October 2017 05:05PM


My name is Trent Fowler, and I'm an aspiring futurist. To date I have given talks on two continents on machine ethics, AI takeoff dynamics, secular spirituality, existential risk, the future of governance, and technical rationality. I have written on introspection, the interface between language and cognition, the evolution of intellectual frameworks, and myriad other topics. In 2016 I began 'The STEMpunk Project', an endeavor to learn as much about computing, electronics, mechanics, and AI as possible, which culminated in a book published earlier this year. 

Elon Musk is my spirit animal. 

I am planning to found a futurist institute in Boulder, CO. I actually left my cushy job in East Asia to help make the future a habitable place. 

Is there someone I could talk to about how to do this? Should I incorporate as a 501C3 or an LLC? What are the best ways of monetizing such an endeavor? How can I build an audience (meetup attendance has been anemic at best, what can I do about that)? And so on. 



[Link] You Too Can See Suffering

3 SquirrelInHell 03 October 2017 07:46PM

Open thread, October 2 - October 8, 2017

1 root 03 October 2017 10:46AM
If it's worth saying, but not worth its own post, then it goes here.

Notes for future OT posters:

1. Please add the 'open_thread' tag.

2. Check if there is an active Open Thread before posting a new one. (Immediately before; refresh the list-of-threads page before posting.)

3. Open Threads should start on Monday, and end on Sunday.

4. Unflag the two options "Notify me of new top-level comments on this article" and ".

[Link] [Slashdot] We're Not Living in a Computer Simulation, New Research Shows

1 Gunnar_Zarncke 03 October 2017 10:10AM

Rational Feed: Last Week's Community Articles and Some Recommended Posts

2 deluks917 02 October 2017 01:49PM

===Highly Recommended Articles:

Slack by Zvi Moshowitz - You need slack in your life. Slack lets you explore and invest. If you don't have slack you can't relax or uphold your morals. Fight hard to maintain your slack and don't let people or things take it away. Maya Millennial's lack of slack.

Personal Thoughts On Careers In Ai Policy And by carrickflynn (EA forum) - 3600 words. AI strategy is bottlenecked by hard research problems. Hence most people will find it hard to contribute effectively, even if they are very talented. Solving these problems has extremely high value. We should prepare to mobilize more talent once the blocking issues are solved. Operations work is still in high demand.

End Factory Farming by 80,000 Hours - Three hour podcast. How young people can set themselves up to contribute to scientific research into meat alternatives. Genetic manipulation of chickens. Skepticism of vegan advocacy. Grants to China, India and South America. Insect farming. Pessimism about legal or electoral solutions. Which species to focus on. Fish and crustacean consciousness.


Against Individual Iq Worries by Scott Alexander - "IQ is very useful and powerful for research purposes. It’s not nearly as interesting for you personally." IQ measurement problems. Even accurately measured IQ isn't that predictive.

Links: Hurly Burly by Scott Alexander - SSC links post. Copyright, genetic engineering, Autism, Machine Learning, Putin's fears of AI risk, the lesswrong relaunch and more


Dojo Bad Day Contingency Plan by Elo - Eliezer's discussion of why rationality theory isn't enough: you need to practice. An exercise about improving your mental state on bad days.

Also Against Individual IQ Worries by Scott Aaronson - IQ tests tend to ask unclear questions and require you to reverse engineer what the test maker meant. Scott's own IQ was once measured at 106.

Predictive Processing by Entirely Useless - Responses to quotes from Surfing Uncertainty and Scott's review. A large focus is the "darkened room" problem.

Prosocial Manipulation by Katja Grace - Being calculating and guarded in communication is commonly considered manipulative and selfish. But many people's goals are pro-social, why do we assume manipulation is anti-social?

Humans As Leaky Systems by mindlevelup - "Fairly obvious stuff that probably lots of people are thinking about, but now put into simpler words (maybe). Basically, the idea that humans are affected by both ideas and the environment, and this is an important consideration in several models."

Dealism Futarchy And Hypocrisy by Robin Hanson - Policy conversations don't have to be about morality or terminal values. We can instead use tools like economics as a way to help people get whatever it is they want. We can push closer to the Pareto optimal frontier.

Debunking Iq Denial Ism by Grey Enlightenment - Criticisms of Scott's article on individual IQ. People can change their socioeconomic status but not their IQ; IQ is more predictive than socioeconomic status; Feynman; job titles are non-specific; low-IQ 'computer' professions might be doing data entry. EQ isn't intrinsic and doesn't compete with IQ.

Harnessing Polarization by Robin Hanson - Capitalism channels status competition into productive enterprise. How can we similarly channel partisanship? Contests? Decision Markets?

Common Sense Eats Common Talk by Stefano Zorzi (ribbonfarm) - Missing the housing bubble. Falling for conformity. Seeing through invisible clothes. Advice: Test macro assumptions, beware of jargon, assume propositions that contradict common sense are wrong. Common talk and common sense and their failings.

Sabbath Hard And Go Home by Ben Hoffman - The Sabbath as easy-mode leisure. Unplugging while camping or on a meditation retreat feels natural. What is leisure? If you are unable to keep a Sabbath, things are not okay; there isn't enough slack in the system.

Cognitive Empathy And Emotional Labor by Gordon (Map and Territory) - Affective empathy contrasted with Cognitive empathy. Cognitive empathy enables real emotional labor.

City Travel Scaling by Robin Hanson - Review of Geoffrey West's 'Scale'. Most visits to a location are from infrequent visitors who live nearby. Fractal piping systems have an overhead that only grows logarithmically with the size of the city. Evolution never found such efficient heating/cooling systems.

Travel Journal Hawaii by Jacob Falkovich - The Hawaiian language only has 40 syllables. Sales tax. Circadian Rhythm. Colonialism. The Hawaiian caste system. The best meal in the world. Don't quit your job to sell lemonade. Minimum wage ruined the pineapple industry.

Why I Quit Social Media by Sarah Constantin - Becoming stronger and less emotional since we live in a finite world with constrained resources. Social media: "It distances you from reality, makes you focus on a shadow-world of opinions about opinions about opinions; it makes you more impulsive and emotionally unstable; it incentivizes derailing conversations to fish for ego-strokes."


An Outside View Of Ai Control by Robin Hanson - Non-singularity scenarios where software performs almost all jobs. Software usually reflects the social organization of those who made it. Entrench designs and systems. Don't work on the control problem until its time. Human control and AI control. Most AI failures in this scenario will cause limited damage and can be handled after they occur.

Nonlinear Computation In Linear Networks by OpenAI - Floating point arithmetic is fundamentally non-linear near the limit of machine precision. OpenAI managed to exploit these non-linear effects with an evolutionary algorithm to achieve much better performance than a normal deep network on MNIST.
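A quick way to see the non-linearity the OpenAI post relies on (this sketch is mine, not from the post): a mathematically linear map like x → (x + 1) - 1 stops behaving linearly in float64 once inputs fall near machine epsilon.

```python
# The map x -> (x + 1.0) - 1.0 is the identity in exact arithmetic,
# hence linear. In float64 it rounds tiny inputs away entirely, so
# f(a + b) != f(a) + f(b) near machine precision (~2.2e-16).
def through_one(x):
    return (x + 1.0) - 1.0

print(through_one(1e-16))  # 0.0 - the input is lost below machine epsilon
print(through_one(1e-15))  # nonzero, but not equal to 1e-15
```

Stacking many such "linear" layers gives a network that can, in principle, compute non-linear functions of sufficiently small inputs, which is the effect the evolutionary search exploited.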

September 2017 Newsletter by The MIRI Blog - New MIRI paper on incorrigibility and shutting off AI. Best posts from the Intelligent Agents Forum. Links to videos and podcasts. MIRI personnel updates and career opportunities in AI safety.

NBER Conference Artificial Intelligence by Marginal Revolution - Links to the program and videos. Tyler was there to comment on Korinek and Stiglitz.


What Happens To Cows In The Us by Eukaryote - "There are 92,000,000 cattle in the USA. Where do they come from, what are they used for, and what are their ultimate fates?"

Interim Update On Givewells Money Moved And Web Traffic In 2016 by The GiveWell Blog - Summary of influence, total money moved, money moved by charity.

Guardedness In Ea by Jeff Kaufman - As people and organizations gain prestige their communication becomes less open and more careful. Jeff has seen this happen in the EA community and dislikes the effects. However Jeff doesn't see a great alternative.

Trial Postponed by GiveDirectly - GiveDirectly's Kenya trial postponed due to political events.

===Politics and Economics:

Why White Identity Doesn't Work by Grey Enlightenment - Who counts as white. Race is secondary to ancestry and culture. No unifying cause or struggle. Whites may be biologically individualist. Too much infighting.

Comment on Oppressed Groups and Slack by Benquo - People who are oppressed often lack the slack to maintain their morals. Seven Samurai. This has the troubling implication that while we should listen to the oppressed, the relatively privileged should maintain leadership. However, it also implies that oppressed groups' behavior will improve after enough time without a boot on their neck.

The OpenPhil Report On Incarceration by The Unit of Caring - "Our prison system isn’t just not-rehabilitative; it is anti-rehabilitative. It traumatizes and retraumatizes people and severs their connections to people and opportunities within the law and abuses them and breaks social trust and produces crime which is then used to justify longer prison sentences which produce more crime."

On The Fetishization Of Money In Galts Gulch by Ben Hoffman - Dagny Taggart and Galt feel they can't ethically become lovers until they rectify a power imbalance. Dagny solves this problem by becoming Galt's housekeeper and cook. Most people's intuition is that employment creates a power imbalance, not that it solves one. What is going on?

Seasteading 2 by Bayesian Investor - "The book’s style is too much like a newspaper. Rather than focus on the main advantages of seasteading, it focuses on the concerns of the average person, and on how seasteading might affect them. It quotes interesting people extensively, while being vague about whether the authors are just reporting that those people have ideas, or whether the authors have checked that the ideas are correct. Many of the ideas seem rather fishy."

What Is Going On With The Alt Right by Grey Enlightenment - Reasons the alt-right is falling apart: Trump back-pedaling or softening on campaign promises, the civil war between the alt-lite, alt-medium, and alt-right, a slow news cycle and brevity of ideas, botched rallies and poor branding, the alt-right losing its official Reddit sub, and the right being more intellectually diverse than the left.

Milgram Replicates by Bryan Caplan - Milgram's shock study replicated well in 2009. The replication stopped earlier in the shock range than Milgram did, since 79% of Milgram's subjects who pushed past the learner's first verbal protest went on to the end of the range.


Summary Of Reading July September 2017 by Eli Bendersky - Book reviews: Stats, genetics, Winnie the Pooh, Zen and other topics.


Creating Trump by The Ezra Klein Show - "How the Republican Party created Trump, how Trump won, and what comes next. As Dionne says in this interview, the American system was "not supposed to produce a president like this,” and so a lot of our conversation is about how the guardrails failed and whether they can be rebuilt."

Rs 194 Robert Wright On Why Buddhism Is True by Rationally Speaking - "Why Buddhism was right about human nature: its diagnosis that our suffering is mainly due to a failure to see reality clearly, and its prescription that meditation can help us see more clearly. Robert and Julia discuss whether it's suspicious that a religion turned out to be "right" about human nature, what it means for emotions to be true or false, and whether there are downsides to enlightenment."

Robert Wright by EconTalk - "The psychotherapeutic insights of Buddhism and the benefits of meditation and mindfulness. Wright argues our evolutionary past has endowed us with a mind that can be ill-suited to the stress of the present. He argues that meditation and the non-religious aspects of Buddhism can reduce suffering and are consistent with recent psychological research."

Burning Man by The Bayesian Conspiracy - How much does Burning Man live up to its principles, changes over time, finding out you aren't gay in your twenties, marriage. Burning Man advice: go with a camp you like, don't make plans, just wander around and get involved in what's interesting.

The Fate Of Liberalism by Waking Up with Sam Harris - "Mark Lilla about the fate of political liberalism in the United States, the emergence of a new identity politics, the role of class in American society"

Bad day contingency Dojo

1 Elo 02 October 2017 07:43AM

Original post: http://bearlamp.com.au/dojo-bad-day-contingency-plan/

The following is an exercise I composed to be run at the LessWrong Sydney dojos.  It took an hour and a half, but could probably be done faster with the adaptations I have included in these instructions.  As for what the dojos are:

I quote Eliezer in the preface of Rationality: From AI to Zombies when he says:

It was a mistake that I didn’t write my two years of blog posts with the intention of helping people do better in their everyday lives. I wrote it with the intention of helping people solve big, difficult, important problems, and I chose impressive-sounding, abstract problems as my examples. In retrospect, this was the second-largest mistake in my approach.
It ties in to the first-largest mistake in my writing, which was that I didn’t realise that the big problem in learning this valuable way of thinking was figuring out how to practice it, not knowing the theory. I didn’t realise that part was the priority; and regarding this I can only say “Oops” and “Duh.” Yes, sometimes those big issues really are big and really are important; but that doesn’t change the basic truth that to master skills you need to practice them and it’s harder to practice on things that are further away.

LessWrong is a global rationality movement.  With that in mind, the dojos are our attempt in Sydney to work on the actual practical stuff: personal problems, and the literal implementation of plans after they undergo first contact with the enemy.  You can join us through our meetup group, our Facebook group, and as advertised on LessWrong.

Below are the instructions for the dojo.  I can't emphasise enough the importance of actually doing the exercise, not just reading it.

If you intend to participate, grab some paper or a blank document and stop for a few minutes to make the lists.  Then check your answers against ours.  If you don't do the exercise, don't fool yourself into thinking you have this skill under your belt.  Just accept that you didn't really "learn" this one; you kind of said, "That's great, I wish I could find the time to get healthy" or "If only I were the type of person who did things."  If this is especially difficult for you, that's okay.  It is difficult for all of us.  I believe in you!

Good luck.

Everyone has bad days.  Each of us has different experiences with diagnosing, solving, and resolving the causes of "bad days".

With that in mind, I want to run a few rounds of discussion on the factors of a bad day.

Part 1: Set a timer for 3 minutes - make a list of things that are bad for your state of mind, or things you have noticed cause trouble for you.  {As a group, each person shares one.} Review the hints list as a group:

  • routine meds/supplements (supposed to take)
  • have you taken something to cause a bad state? (things you should not take)
  • sleep
  • exercise
  • shower
  • Sunlight (independent of bright light)
  • talk to a human in the last X hours
  • talk to too many humans in the last X hours
  • Fresh air
  • Did I eat in the last X hours
  • drink in the last X hours
  • Am I in pain?  Physical or emotional
  • Physical discomfort, weather, loud noise, bright lights, bad smells
  • Feel unsafe in my surroundings?
  • Do I know why I'm in a bad mood, or not feeling well emotionally?  (remember do not dismiss or judge any answer)
  • When did you last do something fun?
  • Spend 5 minutes making a list of all the little things that are bothering you (try not to solve them now, just make the list) (and if necessary make plans for the ones you can affect).
  • Also possibly distinguish between "why am I feeling bad" and "what can I do to feel less bad/even though I feel bad" (e.g. if you're stressed about upcoming event or fight you had last night, you might not be able to act on it but you can still do things now that will improve your state or at least get you being productive)

at the bottom of the page:{our bonus list of bad things generated in the dojo}

{As a group - were there any big ones we missed and discussion about what we came up with}

Part 2: {set a timer 3 minutes} Come up with a list of things that are good for your mental state

{Group discussion - each share one}

{optional hints list} http://happierhuman.com/how-to-be-happy/ {feel free to go through it as a group or glance at it or skip it}

{bonus good stuff list at the bottom}

{as a group discussion - did we miss any big ones?}

Part 3: Possibly ambiguous factors

Now that we have a list of good things and a list of bad things, we should build a list of possibly ambiguous factors to look out for - for example the weather, allergies, or unexpected events such as a death or a car accident. Set a timer for 3 minutes - ambiguous factors. {As a group - each name one.}

{Any big ones we missed} (discussion)

{bonus ambiguous list at the bottom}

Part 4: The important parts

Now I want you to go through the lists and pick out the top 5-10 (or as many as matter) most relevant items.  From here on it's your list - no more sharing - so it doesn't matter to anyone else what's on it.

{Timer 2 minutes}

Part 5: Plan where to keep the list so it's most accessible - so that on a bad day you can find the list and make use of it. It could be in an email draft, on your phone, or a note somewhere at home or in a notebook.

Timer 2 minutes - come up with where you will be keeping the list that makes it most useful to you.

{discussions of plans - including double checking of each other's plans to make sure they seem like they are likely to work}

{assistance if anyone is stuck}

Some ideas:

  • notes app in phone
  • bedroom door poster
  • repeat and memorize
  • "noticing" and asking why, rumination.
  • add to existing lists

{end of exercise and break time}

{bonus list of bad things}

  • supplements
  • private time
  • sun
  • exercise
  • stress (and too much responsibility)
  • sleep
  • alcohol
  • my mother (stress)
  • weather (cold)
  • body temperature
  • sick/headache
  • pain
  • imminent deadlines
  • interpersonal rejection (and the complexities of these)
  • when my wife is unhappy
  • overeating
  • missing out on fun things
  • losing control of my schedule
  • not having a schedule
  • overthinking past failure
  • avoiding things I should do
  • task switching
  • accusations/misunderstandings
  • not sticking to good habits
  • being confrontational
  • need social time
  • bad news on the radio
  • obligation
  • fixating on bullshit
  • getting short with people
  • too much coffee
  • bad test mark
  • not continuing communication (not knowing what to say)
  • junk food
  • not being "myself" enough
  • breaking good routines
  • cold showers in the morning are bad
  • buyers remorse
  • sign up to bungee jumping (felt bad)
  • being unproductive at work
  • something on the mind

{bonus list of good things}

  • weather
  • exercise/swimming, dancing
  • sex
  • big meals
  • supplements
  • sorting my spreadsheets -> feeling on top of my tasks -> congruence of purpose
  • when things work smoothly
  • creating things -> feedback on completion
  • fasting
  • perfect weather
  • shower + bath
  • go for a walk
  • listen to nice music
  • good plan & following it
  • petting a cat
  • weightlifting
  • girlfriend
  • playing instrument
  • feeling connected with someone
  • veg-out in bed
  • good podcast
  • dancing around the house
  • good book/knowledge
  • meditating
  • a balanced day - a bit of everything "good day"
  • napping
  • solving a problem
  • learning knowledge/skill
  • new experiences + with other people
  • lack of responsibility and commitment -> option of impulsivity
  • nature experience (sunsets, cool breeze)
  • discovering nuance
  • progress feedback
  • humour
  • hypnotised to be relaxed
  • 3 weeks sticking to diet and exercise
  • new idea - epiphany feeling
  • winning debate/scoring a soccer goal
  • productive procrastination
  • consider past accomplishment
  • knowing/realising -> feeling the realisation
  • when other people are really organised
  • making someone smile
  • massage giving and receiving
  • hugs
  • deep breathing
  • looking at clouds
  • playing with patterns
  • making others happy
  • good TV/movie
  • getting paid
  • balance social/alone time
  • flow
  • letting go/deciding not to care
  • text chat
  • lying on the floor sleep

{bonus ambiguous list}

  • some foods
  • water
  • sleep (short can feel good endorphins)
  • chemical smells (burning plastic, drying paint)
  • too much internet/facebook
  • coffee buzz
  • conversations
  • helping people
  • humans
  • finding information (sometimes a let down)
  • balance discipline/freedom
  • seeing family
  • junk TV/movies
  • junk food
  • menial chores
  • fidgeting
  • paid work
  • partner time
  • coding binge
  • being alone
  • exercise
  • reading documentation (sometimes good, sometimes terrible)
  • being needed/wanted
  • enthusiasm -> burnout
  • masturbation
  • alcohol
  • sticking to timetable
  • performing below standard
  • sex
  • learning new stuff
  • clubs
  • brain fog
  • breaking the illusions of reality

Meta: this took an hour to write up and a few hours to generate the exercise.   

Feedback on LW 2.0

11 Viliam 01 October 2017 03:18PM

What are your first impressions of the public beta?

October 2017 Media Thread

2 ArisKatsaris 01 October 2017 02:08AM

This is the monthly thread for posting media of various types that you've found that you enjoy. Post what you're reading, listening to, watching, and your opinion of it. Post recommendations to blogs. Post whatever media you feel like discussing! To see previous recommendations, check out the older threads.


  • Please avoid downvoting recommendations just because you don't personally like the recommended material; remember that liking is a two-place word. If you can point out a specific flaw in a person's recommendation, consider posting a comment to that effect.
  • If you want to post something that (you know) has been recommended before, but have another recommendation to add, please link to the original, so that the reader has both recommendations.
  • Please post only under one of the already created subthreads, and never directly under the parent media thread.
  • Use the "Other Media" thread if you believe the piece of media you want to discuss doesn't fit under any of the established categories.
  • Use the "Meta" thread if you want to discuss about the monthly media thread itself (e.g. to propose adding/removing/splitting/merging subthreads, or to discuss the type of content properly belonging to each subthread) or for any other question or issue you may have about the thread or the rules.

[Link] Work and income in the next era

0 morganism 30 September 2017 10:02PM

logic puzzles and loophole abuse

2 Florian_Dietz 30 September 2017 03:45PM

I recently read about the hardest logic puzzle ever on Wikipedia and noticed that someone published a paper in which they solved the problem by asking only two questions instead of three. This relied on abusing the loophole that boolean formulas can result in a paradox.

This got me thinking in what other ways the puzzle could be abused even further, and I managed to find a way to turn the problem into a hack to achieve omnipotence by enslaving gods (see below).

I find this quite amusing, and I would like to know if you know of any other examples where popular logic puzzles can be broken in amusing ways - any outside-the-box solutions that give much better results than expected. Here is another example.


Here is my solution to the "hardest logic puzzle ever":


This solution is based on the following assumption: The gods are quite capable of responding to a question with actions besides saying 'da' and 'ja', but simply have no reason to do so. As stated in the problem description, the beings in question are gods and they have a language of their own. They could hardly be called gods, nor have need for a spoken language, if they weren't capable of affecting reality.

At a bare minimum, they should be capable of pronouncing the words 'da' and 'ja' in multiple different ways, or to delay answering the question by a fixed amount of time after the question is asked. Either possibility would extend the information content of an answer from a single bit of information to arbitrarily many bits, depending on how well you can differentiate different intonations of 'da' and 'ja', and how long you are willing to wait for an answer.

We can construct a question that will result in a paradox unless a god performs a certain action. In this way, we can effectively enslave the god and cause it to perform arbitrary actions on our behalf, as performing those actions is the only way to answer the question. The actual answer to the question becomes effectively irrelevant.

To do this, we approach any of the three gods and ask them the question OBEY, which is defined as follows:


PARADOX = "if I asked you PARADOX, would you respond with the word that means no in your language?"

WISH_WRAPPER = "after hearing and understanding OBEY, you act in such a way that your actions maximally satisfy the intended meaning behind WISH. Where physical, mental or other kinds of constraints prevent you from doing so, you strive to do so to the best of your abilities instead."

WISH = "you determine the Coherent Extrapolated Volition of humanity and act to maximize it."

You can substitute WISH for any other wish you would like to see granted. However, one should be very careful while doing so, as beings of pure logic are likely to interpret vague actions differently from how a human would interpret them. In particular, one should avoid accidentally making WISH impossible to fulfill, as that would cause the god's head to explode, ruining your wish.

The above formulation tries to take some of these concerns into account. If you encounter this thought experiment in real life, you are advised to consult a lawyer, a friendly-AI researcher, and possibly a priest, before stating the question.

Since you can ask three questions, you can enslave all three gods. Boolos' formulation states about the random god that "if the coin comes down heads, he speaks truly; if tails, falsely". This formulation implies that the god does try to determine the truth before deciding how to answer. This means that the wish-granting question also works for the random god.

If the capabilities of the gods are uncertain, it may help to establish clearer goals as well as fall-back goals. For instance, to handle the case that the gods are in fact limited to speaking only 'da' and 'ja', it may help to append the WISH as follows: "If you are unable to perform actions in response to OBEY besides answering 'da' or 'ja', you wait for the time period outlined in TIME before making your answer." You can now encode arbitrary additional information in TIME, with the caveat that you will have to actually wait before getting a response. Your ability to accurately measure the elapsed time between question and answer directly correlates with how much information you can put into TIME without risking starvation before the question is answered. The following is a simple example of TIME that would allow you to solve the original problem formulation with just asking OBEY once of any of the gods:

TIME = "If god A speaks the truth, B lies and C is random, you wait for 1 minute before answering. If god A speaks the truth, C lies and B is random, you wait for 2 minutes before answering. If god B speaks the truth, A lies and C is random, you wait for 3 minutes before answering. If god B speaks the truth, C lies and A is random, wait for 4 minutes before answering. If god C speaks the truth, A lies and B is random, wait for 5 minutes before answering. If god C speaks the truth, B lies and A is random, wait for 6 minutes before answering."
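As a sketch of the encoding (the code and names here are my own illustration, not part of Boolos' puzzle), TIME just enumerates the six possible assignments of the roles truth-teller, liar, and random to gods A, B, and C, and signals the i-th assignment by waiting i minutes. Conveniently, the schedule as stated above matches lexicographic order over the permutations:

```python
from itertools import permutations

# Each permutation is read as (truth-teller, liar, random).
# The i-th permutation in lexicographic order is signalled by an
# (i+1)-minute delay, matching the TIME schedule in the text.
gods = ("A", "B", "C")
wait_minutes = {perm: i + 1 for i, perm in enumerate(permutations(gods))}

def decode(minutes):
    """Recover (truth-teller, liar, random) from the observed delay."""
    for perm, m in wait_minutes.items():
        if m == minutes:
            return perm
    raise ValueError("unexpected delay")

print(decode(1))  # ('A', 'B', 'C'): A tells the truth, B lies, C is random
```

With k distinguishable delay intervals you can transmit log2(k) bits per answer, which is why the single bit of 'da'/'ja' stops being the binding constraint.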

Event: Effective Altruism Global X Berlin 2017

3 Lachouette 30 September 2017 07:33AM

This year's EAGxBerlin takes place on the 14th and 15th of October at the Berlin Institute of Technology and is organized by the Effective Altruism Foundation. The conference will convene roughly 300 people – academics, professionals, and students alike – to explore the most effective and evidence-based ways to improve the world, based on the philosophy and global movement of effective altruism.

For more information, please see our website and facebook event. Tickets are available on Tito.

Personal thoughts on careers in AI policy and strategy [x-post EA Forum]

3 crmflynn 27 September 2017 05:09PM


  1. The AI strategy space is currently bottlenecked by entangled and under-defined research questions that are extremely difficult to resolve, as well as by a lack of current institutional capacity to absorb and utilize new researchers effectively.

  2. Accordingly, there is very strong demand for people who are good at this type of “disentanglement” research and well-suited to conduct it somewhat independently. There is also demand for some specific types of expertise which can help advance AI strategy and policy. Advancing this research even a little bit can have massive multiplicative effects by opening up large areas of work for many more researchers and implementers to pursue.

  3. Until the AI strategy research bottleneck clears, many areas of concrete policy research and policy implementation are necessarily on hold. Accordingly, a large majority of people interested in this cause area, even extremely talented people, will find it difficult to contribute directly, at least in the near term.

  4. If you are in this group whose talents and expertise are outside of these narrow areas, and want to contribute to AI strategy, I recommend you build up your capacity and try to put yourself in an influential position. This will set you up well to guide high-value policy interventions as clearer policy directions emerge. Try not to be discouraged or dissuaded from pursuing this area by the current low capacity to directly utilize your talent! The level of talent across a huge breadth of important areas I have seen from the EA community in my role at FHI is astounding and humbling.

  5. Depending on how slow these “entangled” research questions are to unjam, and on the timelines of AI development, there might be a very narrow window of time in which it will be necessary to have a massive, sophisticated mobilization of altruistic talent. This makes being prepared to mobilize effectively and take impactful action on short notice extremely valuable in expectation.

  6. In addition to strategy research, operations work in this space is currently highly in demand. Experienced managers and administrators are especially needed. More junior operations roles might also serve as a good orientation period for EAs who would like to take some time after college before either pursuing graduate school or a specific career in this space. This can be a great way to tool up while we as a community develop insight on strategic and policy direction. Additionally, successful recruitment in this area should help with our institutional capacity issues substantially.

(3600 words. Reading time: approximately 15 minutes with endnotes.)


(Also posted to Effective Altruism Forum here.)


Intended audience: This post is aimed at EAs and other altruistic types who are already interested in working in AI strategy and AI policy because of its potential large scale effect on the future.[1]

Epistemic status: The below represents my current best guess at how to make good use of human resources given current constraints. I might be wrong, and I would not be surprised if my views changed with time. That said, my recommendations are designed to be robustly useful across most probable scenarios. These are my personal thoughts, and do not necessarily represent the views of anyone else in the community or at the Future of Humanity Institute.[2] (For some areas where reviewers disagreed, I have added endnotes explaining the disagreement.) This post is not me acting in any official role, this is just me as an EA community member who really cares about this cause area trying to contribute my best guess for how to think about and cultivate this space.


Why my thoughts might be useful: I have been the primary recruitment person at the Future of Humanity Institute (FHI) for over a year, and am currently the project manager for FHI’s AI strategy programme. Again, I am not writing this in either of these capacities, but being in these positions has given me a chance to see just how talented the community is, to spend a lot of time thinking about how to best utilize this talent, and has provided me some amazing opportunities to talk with others about both of these things.


There are lots of ways to slice this space, depending on what exactly you are trying to see, or what point you are trying to make. The terms and definitions I am using are a bit tentative and not necessarily standard, so feel free to discard them after reading this. (These are also not all of the relevant types or areas of research or work, but the subset I want to focus on for this piece.)[3]

  1. AI strategy research:[4] the study of how humanity can best navigate the transition to a world with advanced AI systems (especially transformative AI), including political, economic, military, governance, and ethical dimensions.

  2. AI policy implementation is carrying out the activities necessary to safely navigate the transition to advanced AI systems. This includes an enormous amount of work that will need to be done in government, the political sphere, private companies, and NGOs in the areas of communications, fund allocation, lobbying, politics, and everything else that is normally done to advance policy objectives.

  3. Operations (in support of AI strategy and implementation) is building, managing, growing, and sustaining all of the institutions and institutional capacity for the organizations advancing AI strategy research and AI policy implementation. This is frequently overlooked, badly neglected, and extremely important and impactful work.

  4. Disentanglement research:[5] This is a squishy made-up term I am using only for this post that is sort of trying to gesture at a type of research that involves disentangling ideas and questions in a “pre-paradigmatic” area where the core concepts, questions, and methodologies are under-defined. In my mind, I sort of picture this as somewhat like trying to untangle knots in what looks like an enormous ball of fuzz. (Nick Bostrom is a fantastic example of someone who is excellent at this type of research.)

To quickly clarify, as I mean to use the terms, AI strategy research is an area or field of research, a bit like quantum mechanics or welfare economics. Disentanglement research I mean more as a type of research, a bit like quantitative research or conceptual analysis, and is defined more by the character of the questions researched and the methods used to advance toward clarity. Disentanglement is meant to be field agnostic. The relationship between the two is that, in my opinion, AI strategy research is an area that at its current early stage, demands a lot of disentanglement-type research to advance.

The current bottlenecks in the space (as I see them)

Disentanglement research is needed to advance AI strategy research, and is extremely difficult

Figuring out a good strategy for approaching the development and deployment of advanced AI requires addressing enormous, entangled, under-defined questions, which exist well outside of most existing research paradigms. (This is not all it requires, but it is a central part of it at its current stage of development.)[6] This category includes the study of multi-polar versus unipolar outcomes, technical development trajectories, governance design for advanced AI, international trust and cooperation in the development of transformative capabilities, info/attention/reputation hazards in AI-related research, the dynamics of arms races and how they can be mitigated, geopolitical stabilization and great power war mitigation, research openness, structuring safe R&D dynamics, and many more topics.[7] It also requires identifying other large, entangled questions such as these to ensure no crucial considerations in this space are neglected.

From my personal experience trying and failing to do good disentanglement research and watching as some much smarter and more capable people have tried and struggled as well, I have come to think of it as a particular skill or aptitude that does not necessarily correlate strongly with other talents or expertise. A bit like mechanical, mathematical, or language aptitude. I have no idea what makes people good at this, or how exactly they do it, but it is pretty easy to identify if it has been done well once the person is finished. (I can appreciate the quality of Nick Bostrom’s work, like I can appreciate a great novel, but how they are created I don’t really understand and can’t myself replicate.) It also seems to be both quite rare and very difficult to identify in advance who will be good at this sort of work, with the only good indicator, as far as I can tell, being past history of succeeding in this type of research. The result is that it is really hard to recruit for, there are very few people doing it full time in the AI strategy space, and this number is far, far fewer than optimal.

The main importance of disentanglement research, as I imagine it, is that it makes questions and research directions clearer and more tractable for other types of research. As Nick Bostrom and others have sketched out the considerations surrounding the development of advanced AI through “disentanglement”, tractable research questions have arisen. I strongly believe that as more progress is made on topics requiring disentanglement in the AI strategy field, more tractable research questions will arise. As these more tractable questions become clear, and as they are studied, strategic direction, and concrete policy recommendations should follow. I believe this then will open up the floodgates for AI policy implementation work.


Domain experts with specific skills and knowledge are also needed

While I think that our biggest need right now is disentanglement research, there are also certain other skills and knowledge sets that would be especially helpful for advancing AI strategy research. This includes expertise in:

  1. Mandarin and/or Chinese politics and/or the Chinese ML community.

  2. International relations, especially in the areas of international cooperation, international law, global public goods, constitution and institutional design, history and politics of transformative technologies, governance, and grand strategy.

  3. Knowledge and experience working at a high level in policy, international governance and diplomacy, and defense circles.

  4. Technology and other types of forecasting.

  5. Quantitative social science, such as economics or analysis of survey data.

  6. Law and/or Policy.

I expect these skills and knowledge sets to help provide valuable insight on strategic questions including governance design, diplomatic coordination and cooperation, arms race dynamics, technical timelines and capabilities, and many more areas.


Until AI strategy advances, AI policy implementation is mostly stalled

There is a wide consensus in the community, with which I agree, that aside from a few robust recommendations,[8] it is important not to act or propose concrete policy in this space prematurely. We simply have too much uncertainty about the correct strategic direction. Do we want tighter or looser IP law for ML? Do we want a national AI lab? Should the government increase research funding in AI? How should we regulate lethal autonomous weapons systems? Should there be strict liability for AI accidents? It remains unclear what the good recommendations are. There are path dependencies that develop quickly in many areas once a direction is initially started down. It is difficult to pass a law that is the exact opposite of a previous law recently lobbied for and passed. It is much easier to start an arms race than to stop it. With most current AI policy questions, the correct approach, I believe, is not to use heuristics of unclear applicability to choose positions, even if those heuristics have served well in other contexts,[9] but to wait until the overall strategic picture is clear, and then to push forward with whatever advances the best outcome.

The AI strategy and policy space, and EA in general, is also currently bottlenecked by institutional and operational capacity

This is not as big an immediate problem as the AI strategy bottleneck, but it is an issue, and one that exacerbates the research bottleneck as well.[10]  FHI alone will need to fill 4 separate operations roles at senior and junior levels in the next few months. Other organizations in this space have similar shortages. These shortages also compound the research bottleneck as they make it difficult to build effective, dynamic AI strategy research groups. The lack of institutional capacity also might become a future hindrance to the massive, rapid, “AI policy implementation” mobilization which is likely to be needed.

Next actions

First, I want to make clear that if you want to work in this space, you are wanted in this space. There is a tremendous amount of need here. That said, as I currently see it, because of the low tractability of disentanglement research, institutional constraints, and the effect of both of these things on the progress of AI strategy research, a large majority of people who are very much needed in this area, even extremely talented people, will not be able to contribute directly right away. (This is not a good position to be in, as I think we are underutilizing our human resources, but hopefully we can fix this quickly.)

This is why I am hoping that we can build up a large community of people with a broader set of skills, and especially policy implementation skills, who are in positions of influence from which they can mobilize quickly and effectively and take important action once the bottleneck clears and direction comes into focus.


Actions you can take right now

Read all the things! There are a couple of publications in the pipeline from FHI, including a broad research agenda that should hopefully advance the field a bit. Sign up to FHI’s newsletter and the EA newsletter which will have updates as the cause area advances and unfolds. There is also an extensive reading list, not especially narrowly tailored to the considerations of interest to our community, but still quite useful. I recommend skimming it and picking out some specific publications or areas to read more about.[11] Try to skill up in this area and put yourself in a position to potentially advance policy when the time comes. Even if it is inconvenient, go to EA group meet-ups and conferences, read and contribute to the forums and newsletters, keep in the loop. Be an active and engaged community member.


Potential near term roles in AI Strategy

FHI is recruiting, but somewhat capacity limited, and trying to triage for advancing strategy as quickly as possible.

If you have good reason to think you would be good at disentanglement research on AI strategy (likely meaning a record of success with this type of research) or have expertise in the areas listed as especially in demand, please get in touch.[12] I would strongly encourage you to do this even if you would rather not work at FHI, as there are remote positions possible if needed, and other organizations I can refer you to. I would also strongly encourage you to do this even if you are reluctant to stop or put on hold whatever you are currently doing. Please also encourage your friends who likely would be good at this to strongly consider it. If I am correct, the bottleneck in this space is holding back a lot of potentially vital action by many, many people who cannot be mobilized until they have a direction in which to push. (The framers need the foundation finished before they can start.) Anything you can contribute to advancing this field of research will have dramatic force-multiplying effects by “creating jobs” for dozens or hundreds of other researchers and implementers. You should also consider applying for one or both of the AI Macrostrategy roles at FHI if you see this before 29 Sept 2017.[13]

If you are unsure of your skill with disentanglement research, I would strongly encourage you to try to make some independent progress on a question of this type and see how you do. I realize this task is itself a bit under-defined, but that is part of the problem space, and part of what you are trying to test your skills against. Read around in the area, find something sticky you think you might be able to disentangle, and take a run at it.[14] If it goes well, whether or not you want to get into the space immediately, please send it in.

If you feel as though you might be a borderline candidate because of your relative inexperience with an area of in-demand expertise, you might consider trying to tool up a bit in the area, or applying for an internship. You might also err on the side of sending in a CV and cover letter just in case you are miscalibrated about your skill compared to other applicants. That said, again, do not think that you not being immediately employed is any reflection of your expected value in this space! Do not be discouraged, please stay interested, and continue to pursue this!


Preparation for mobilization

Being a contributor to this effort, as I imagine it, requires investing in yourself, your career, and the community, while positioning yourself well for action once the bottleneck unjams and a robust strategic direction becomes clear.

I also highly recommend investing in building up your skills and career capital. This likely means excelling in school, going to graduate school, pursuing relevant internships, building up your CV, etc. Invest heavily in yourself. Additionally, stay in close communication with the EA community and keep up to date with opportunities in this space as they develop. (Several people are currently looking at starting programs specifically to on-ramp promising people into this space. This is one reason why signing up to the newsletters might be really valuable, so that opportunities are not missed.) To repeat myself from above, attend meet-ups and conferences, read the forums and newsletters, and be active in the community. Ideally this cause area will become a sub-community within EA and a strong self-reinforcing career network.

A good way to determine how to prepare and tool up for a career in either AI policy research or implementation is to look at the 80,000 Hours’ Guide to working in AI policy and strategy. Fields of study that are likely to be most useful for AI policy implementation include policy, politics and international relations, quantitative social sciences, and law.

Especially useful is finding roles of influence or importance, even with low probability but high expected value, within (especially the US federal) government.[15] Other potentially useful paths include non-profit management, project management, communications, public relations, grantmaking, policy advising at tech companies, lobbying, party and electoral politics and advising, political “staffing,” or research within academia, think tanks, or large corporate research groups especially in the areas of machine learning, policy, governance, law, defense, and related. A lot of information about the skills needed for various sub-fields within this area is available at 80,000 Hours.


Working in operations

Another important bottleneck in this space, though smaller in my estimation than the main bottleneck, is in institutional capacity within this currently tiny field. As mentioned already above, FHI needs to fill 4 separate operations roles at senior and junior levels in the next few months. (We are also in need of a temporary junior-level operations person immediately; if you are a UK citizen, consider getting in touch about this!)[16][17] Other organizations in this space have similar shortages. If you are an experienced manager, administrator, or similar, please consider applying or getting in touch for our senior roles. Alternatively, if you are freshly out of school, but have some proven hustle (especially proven by extensive extracurricular involvement, such as running projects or groups) and would potentially like to take a few years to advance this cause area before going to graduate school or locking in a career path, consider applying for a junior operations position, or get in touch.[18] Keep in mind that operations work at an organization like FHI can be a fantastic way to tool up and gain fluency in this space, orient yourself, discover your strengths and interests, and make contacts, even if one intends to move on to non-operations roles eventually.


The points I hope you can take away in approximate order of importance:

1)    If you are interested in advancing this area, stay involved. Your expected value is extremely high, even if there are no excellent immediate opportunities to have a direct impact. Please join this community, and build up your capacity for future research and policy impact in this space.

2)    If you are good at “disentanglement research” please get in touch, as I think this is our major bottleneck in the area of AI strategy research, and is preventing earlier and broader mobilization and utilization of our community’s talent.

3)    If you are strong or moderately strong in key high-value areas, please also get in touch. (Perhaps err to the side of getting in touch if you are unsure.)

4)    Excellent things to do to add value to this area, in expectation, include:

a)    Investing in your skills and career capital, especially in high-value areas, such as studying in-demand topics.

b)    Building a career in a position of influence (especially in government, global institutions, or in important tech firms.)

c)    Helping to build up this community and its capacity, including building a strong and mutually reinforcing career network among people pursuing AI policy implementation from an EA or altruistic perspective.

5)    Also of very high value is operations work and other efforts to increase institutional capacity.

Thank you for taking the time to read this. While it is very unfortunate that the current ground reality is, as far as I can tell, not well structured for immediate wide mobilization, I am confident that we can do a great deal of preparatory and positioning work as a community, and that with some forceful pushing on these bottlenecks, we can turn this enormous latent capacity into extremely valuable impact.

Let’s get going “doing good together” as we navigate this difficult area, and help make a tremendous future!


[1] For those of you not in this category who are interested in seeing why you might want to be, I recommend this short EA Global talk, the Policy Desiderata paper, and OpenPhil’s analysis. For a very short consideration on why the far future matters, I recommend this very short piece, and for a quick fun primer on AI as transformative I recommend this. Finally, once the hook is set, the best resource remains Superintelligence.

[2] Relatedly, I want to thank Miles Brundage, Owen Cotton-Barratt, Allan Dafoe, Ben Garfinkel, Roxanne Heston, Holden Karnofsky, Jade Leung, Kathryn Mecrow, Luke Muehlhauser, Michael Page, Tanya Singh, and Andrew Snyder-Beattie for their comments on early drafts of this post. Their input dramatically improved it. That said, again, they should not be viewed as endorsing anything in this. All mistakes are mine. All views are mine.

[3] There are some interesting tentative taxonomies and definitions of the research space floating around. I personally find the following, quoting from a draft document by Allan Dafoe, especially useful:

AI strategy [can be divided into]... four complementary research clusters: the technical landscape, AI politics, AI governance, and AI policy. Each of these clusters characterizes a set of problems and approaches, within which the density of conversation is likely to be greater. However, most work in this space will need to engage the other clusters, drawing from and contributing high-level insights. This framework can perhaps be clarified by analogy to the problem of building a new city. The technical landscape examines the technical inputs and constraints to the problem, such as trends in the price and strength of steel. Politics considers the contending motivations of various actors (such as developers, residents, businesses), the possible mutually harmful dynamics that could arise and strategies for cooperating to overcome them. Governance involves understanding the ways that infrastructure, laws, and norms can be used to build the best city, and proposing ideal masterplans of these to facilitate convergence on a common good vision. The policy cluster involves crafting the actual policies to be implemented to build this city.

In a comment on this draft, Jade Leung pointed out what I think is an important implicit gap in the terms I am using, and highlights the importance of not treating these as either final, comprehensive, or especially applicable outside of this piece:

There seems to be a gap between [AI policy implementation] and 'AI strategy research' - where does the policy research feed in? I.e. the research required to canvas and analyse policy mechanisms by which strategies are most viably realised, prior to implementation (which reads here more as boots-on-the-ground alliance building, negotiating, resource distribution etc.)

[4] Definition lightly adapted from Allan Dafoe and Luke Muehlhauser.

[5]This idea owes a lot to conversations with Owen Cotton-Barratt, Ben Garfinkel, and Michael Page.

[6] I did not get a sense that any reviewer necessarily disagreed that this is a fair conceptualization of a type of research in this space, though some questioned its importance or centrality to current AI strategy research. I think the central disagreement here is on how many well-defined and concrete questions there are left to answer at the moment, how far answering them is likely to go in bringing clarity to this space and developing robust policy recommendations, and the relative marginal value of addressing these existing questions versus producing more through disentanglement of the less well defined areas.

[7] One commenter did not think these were a good sample of important questions. Obviously this might be correct, but in my opinion, these are absolutely among the most important questions to gain clarity on quickly.

[8] My personal opinion is that there are only three or maybe four robust policy-type recommendations we can make to governments at this time, given our uncertainty about strategy: 1) fund safety research, 2) commit to a common good principle, and 3) avoid arms races. The fourth suggestion is both an extension of the other three and is tentative, but is something like: fund joint intergovernmental research projects located in relatively geopolitically neutral countries with open membership and a strong commitment to a common good principle.

I should note that this point was also flagged as potentially controversial by one reviewer. Additionally, Miles Brundage, quoted below, had some useful thoughts related to my tentative fourth suggestion:

In general, detailed proposals at this stage are unlikely to be robust due to the many gaps in our strategic and empirical knowledge. We "know" arms races are probably bad but there are many imaginable ways to avoid or mitigate them, and we don't really know what the best approach is yet. For example, launching big new projects might introduce various opportunities for leakage of information that weren't there before, and politicize the issue more than might be optimal as the details are worked out. As an example of an alternative, governments could commit to subsidizing (e.g. through money and hardware access) existing developers that open themselves up to inspections, which would have some advantages and some disadvantages over the neutrally-sited new project approach.

[9] This is an area with extreme and unusual enough considerations that it seems to break normal heuristics, or at least my normal heuristics. I have personally heard at least minimally plausible arguments made by thoughtful people that openness, antitrust law and competition, government regulation, advocating opposition to lethal autonomous weapons systems, and drawing wide attention to the problems of AI might be bad things, and invasive surveillance, greater corporate concentration, and weaker cyber security might be good things. (To be clear, these were all tentative, weak, but colourable arguments, made as part of exploring the possibility space, not strongly held positions by anyone.) I find all of these very counter-intuitive.

[10] A useful comment from a reviewer on this point: “These problems are related: We desperately need new institutions to house all the important AI strategy work, but we can't know what institutions to build until we've answer more of the foundational questions.”

[11] Credit for the heroic effort of assembling this goes mostly to Matthijs Maas. While I contributed a little, I have myself only read a tiny fraction of these.

[12] fhijobs@philosophy.ox.ac.uk.

[13] Getting in touch is a good action even if you cannot or would rather not work at FHI. In my opinion, AI strategy researchers would ideally cluster in one or more research groups in order to advance this agenda as quickly as possible, but there is also some room for remote scholarship. (The AI strategy programme at FHI is currently trying to become the first of these “cluster” research groups, and we are recruiting in this area aggressively.)

[14] I’m personally bad enough at this, that my best advice is something like read around in the area, find a topic, and “do magic.” Accordingly, I will tag in Jade Leung again for a suggestion of what a “sensible, useful deliverable of 'disentanglement research' would look like”:

A conceptual model for a particular interface of the AI strategy space, articulating the sub-components, exogenous and endogenous variables of relevance, linkages etc.; An analysis of driver-pressure-interactions for a subset of actors; a deconstruction of a potential future scenario into mutually-exclusive-collectively-exhaustive (MECE) hypotheses.

Ben Garfinkel similarly volunteered to help clarify “by giving an example of a very broad question that seem[s] to require some sort of "detangling" skill:”

What does the space of plausible "AI development scenarios" look like, and how do their policy implications differ?

If AI strategy is "the study of how humanity can best navigate the transition to a world with advanced AI systems," then it seems like it ought to be quite relevant what this transition will look like. To point at two different very different possibilities, there might be a steady, piecemeal improvement of AI capabilities -- like the steady, piecemeal improvement of industrial technology that characterized the industrial revolution -- or there might be a discontinuous jump, enabled by sudden breakthroughs or an "intelligence explosion," from roughly present-level systems to systems that are more capable than humans at nearly everything. Or -- more likely -- there might be a transition that doesn't look much like either of these extremes.

Robin Hanson, Eliezer Yudkowsky, Eric Drexler, and others have all emphasized different visions of AI development, but have also found it difficult to communicate the exact nature of their views to one another. (See, for example, the Hanson-Yudkowsky "foom" debate.) Furthermore, it seems to me that their visions don't cleanly exhaust the space, and will naturally be difficult to define given the fact that so many of the relevant concepts--like "AGI," "recursive self-improvement," "agent/tool/goal-directed AI," etc.--are currently so vague.

I think it would be very helpful to have a good taxonomy of scenarios, so that we could begin to make (less ambiguous) statements like, "Policy X would be helpful in scenarios A and B, but not in scenario C," or, "If possible, we ought to try to steer towards scenario A and away from B." AI strategy is not there yet, though.

A related, "entangled" question is: Across different scenarios, what is the relationship between short and medium-term issues (like the deployment of autonomous weapons systems, or the automation of certain forms of cyberattacks) and the long-term issues that are likely to arise as the space of AI capabilities starts to subsume the space of human capabilities? For a given scenario, can these two (rough) categories of issues be cleanly "pulled apart"?

[15] 80,000 hours is experimenting with having a career coach specialize in this area, so you might consider getting in touch with them, or getting in touch with them again, if you might be interested in pursuing this route.

[16] fhijobs@philosophy.ox.ac.uk. This is how I snuck into FHI ~2 years ago, on a 3 week temporary contract as an office manager. I flew from the US on 4 days notice for the chance to try to gain fluency in the field. While my case of “working my way up from the mail room” is not likely to be typical (I had a strong CV), or necessarily a good model to encourage (see next footnote below) it is definitely the case that you can pick up a huge amount through osmosis at FHI, and develop a strong EA career network. This can set you up well for a wise choice of graduate programs or other career direction decisions.

[17]  One reviewer cautioned against encouraging a dynamic in which already highly qualified people take junior operations roles with the expectation of transitioning directly into a research position, since this can create awkward dynamics and a potentially unhealthy institutional culture. I think this is probably, or at least plausibly, correct. Accordingly, while I think a junior operations role is great for building skills and orienting yourself, it should probably not be seen as a way of immediately transitioning to strategy research, but treated more as a method for turning post-college uncertainty into a productive plan, while also gaining valuable skills and knowledge, and directly contributing to very important work.

[18] Including locking in a career path continuing in operations. This really is an extremely high-value area for a career, and badly overlooked and neglected.

The Great Filter isn't magic either

3 Stuart_Armstrong 27 September 2017 04:56PM

Crossposted at Less Wrong 2.0. A post suggested by James Miller's presentation at the Existential Risk to Humanity conference in Gothenburg.

Seeing the emptiness of the night sky, we can dwell upon the Fermi paradox: where are all the alien civilizations that simple probability estimates imply we should be seeing?

Especially given the ease of moving within and between galaxies, the cosmic emptiness implies a Great Filter: something that prevents planets from giving birth to star-spanning civilizations. One worrying possibility is the likelihood that advanced civilizations end up destroying themselves before they reach the stars.

The Great Filter as an Outside View

In a sense, the Great Filter can be seen as an ultimate example of the Outside View: we might have all the data and estimates we believe we would ever need from our models, but if those models predict that the galaxy should be teeming with visible life, then it doesn't matter how reliable our models seem: they must be wrong.

In particular, if you fear a late great filter - if you fear that civilizations are likely to destroy themselves - then you should increase your fear, even if "objectively" everything seems to be going all right. After all, presumably the other civilizations that destroyed themselves thought everything seemed to be going all right. Then you can adjust your actions using your knowledge of the great filter - but presumably other civilizations also thought of the great filter and adjusted their own actions as well, but that didn't save them, so maybe you need to try something different again, or maybe you can do something that breaks the symmetry from the timeless decision theory perspective, like sending a massive signal to the galaxy...

The Great Filter isn't magic

It can all get very headache-inducing. But, just as the Outside View isn't magic, the Great Filter isn't magic either. If advanced civilizations destroy themselves before becoming space-faring or leaving an imprint on the galaxy, then there is some phenomenon that causes this. What can we say if we look analytically at the great filter argument?

First of all, suppose we had three theories - an early great filter (technological civilizations are rare), a late great filter (technological civilizations destroy themselves before becoming space-faring), or no great filter. Then we look up at the empty skies, and notice no aliens. This rules out the third theory, but leaves the relative probabilities of the other two intact.
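The shape of this update can be sketched numerically. The prior probabilities below are illustrative numbers invented for this sketch, not estimates from the argument:

```python
# Toy Bayesian update: observing an empty sky rules out "no Great Filter",
# but leaves the relative odds of an early vs. late filter unchanged.
# All prior numbers here are illustrative, not real estimates.

priors = {"early_filter": 0.3, "late_filter": 0.3, "no_filter": 0.4}

# Likelihood of an empty sky under each theory: near-certain under either
# filter theory, near-zero if there is no filter at all.
likelihoods = {"early_filter": 1.0, "late_filter": 1.0, "no_filter": 0.001}

unnormalised = {k: priors[k] * likelihoods[k] for k in priors}
total = sum(unnormalised.values())
posteriors = {k: v / total for k, v in unnormalised.items()}

print(posteriors)  # "no_filter" collapses; early and late each get ~0.5
# The early:late ratio is exactly preserved by the update:
print(priors["early_filter"] / priors["late_filter"])
print(posteriors["early_filter"] / posteriors["late_filter"])
```

Whatever numbers are plugged in, the empty-sky observation only shifts mass away from "no filter"; distinguishing early from late filters requires other evidence, as the next paragraph discusses.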

Then we can look at objective evidence. Is human technological civilization likely to end in a nuclear war? Possibly, but are the odds in the 99.999% range that would be needed to explain the Fermi Paradox? Every year that has gone by has reduced the likelihood that nuclear war is very very very very likely. So a late Great Filter may have seemed quite probable compared with an early one, but much of the evidence we see is against it (especially if we assume that AI - which is not a Great Filter! - might have been developed by now). Million-to-one prior odds can be overcome by merely 20 bits of information.
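The "20 bits" arithmetic can be checked directly; a minimal sketch, using the hypothetical million-to-one odds from the text:

```python
import math

# Each bit of evidence doubles the odds ratio in favour of a hypothesis.
prior_odds = 1e-6        # hypothetical million-to-one odds against
bits_of_evidence = 20

posterior_odds = prior_odds * 2 ** bits_of_evidence

print(posterior_odds)                         # ~1.05: roughly even odds
print(posterior_odds / (1 + posterior_odds))  # posterior probability, ~0.51
print(math.log2(1 / prior_odds))              # bits needed for even odds, ~19.9
```

Since 2^20 = 1,048,576, twenty bits slightly more than cancel a 10^6 : 1 prior, which is the sense in which strong prior odds are "overcome".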

And what about the argument that we have to assume that prior civilizations would also have known of the Great Filter and thus we need to do more than they would have? In your estimation, is the world currently run by people taking the Great Filter arguments seriously? What is the probability that the world will be run by people that take the Great Filter argument seriously? If this probability is low, we don't need to worry about the recursive aspect; the ideal situation would be if we could achieve:

  1. Powerful people taking the Great Filter argument seriously.

  2. Evidence that it was hard to make powerful people take the argument seriously.

Of course, successfully achieving 1 is evidence against 2, but the Great Filter doesn't work by magic. If it looks like we achieved something really hard, then that's some evidence that it is hard. Every time we find something unlikely with a late Great Filter, that shifts some of the probability mass away from the late great filter and into alternative hypotheses (early Great Filter, zoo hypothesis,...).

Variance and error of xrisk estimates

But let's focus narrowly on the probability of the late Great Filter.

Current estimates for the risk of nuclear war are uncertain, but let's arbitrarily assume that the risk is 10% (overall, not per year). Suppose one of two papers comes out:

  1. Paper A shows that current estimates of nuclear war have not accounted for a lot of key facts; when these facts are added in, the risk of nuclear war drops to 5%.

  2. Paper B is a massive model of international relationships with a ton of data and excellent predictors and multiple lines of evidence, all pointing towards the real risk being 20%.

What would either paper mean from the Great Filter perspective? Well, counter-intuitively, papers like A typically increase the probability of nuclear war being a Great Filter, while papers like B decrease it. This is because none of 5%, 10%, and 20% are large enough to account for the Great Filter, which requires probabilities in the 99.99% style. And, though paper A decreases the probability of nuclear war, it also leaves more room for uncertainties - we've seen that a lot of key facts were missing in previous papers, so it's plausible that there are key facts still missing from this one. On the other hand, though paper B increases the probability, it makes it unlikely that the probability will be raised any further.

So if we fear the Great Filter, we should not look at risks whose probabilities are high, but risks whose uncertainty is high, where the probability of us making an error is high. If we consider our future probability estimates as a random variable, then the one whose variance is higher is the one to fear. So a late Great Filter would make biotech risks even worse (current estimates of risk are poor) while not really changing asteroid impact risks (current estimates of risk are good).
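This variance argument can be illustrated with a toy simulation. The Beta distributions below are invented stand-ins for our future estimate of each risk: one has a low mean but high variance (poorly understood, like the risk in paper A), the other a higher mean but low variance (well measured, like paper B):

```python
import random

random.seed(0)

# Model our *future* estimate of a risk as a random variable.
# Risk A: low mean (~5%) but poorly understood -> high-variance Beta(0.1, 1.9).
# Risk B: higher mean (~20%) but well measured -> low-variance Beta(20, 80).
# The parameters are illustrative, chosen only to match those means.

def tail_mass(alpha, beta, threshold, n=100_000):
    """Monte Carlo estimate of P(future estimate > threshold)."""
    return sum(random.betavariate(alpha, beta) > threshold
               for _ in range(n)) / n

tail_a = tail_mass(0.1, 1.9, 0.9)  # uncertain risk: some mass near certainty
tail_b = tail_mass(20, 80, 0.9)    # well-measured risk: essentially none

print(tail_a, tail_b)
```

Despite its lower mean, the high-variance risk puts far more probability mass on the extreme values (the 99.99%-style probabilities) that a late Great Filter would require, which is why the uncertain risk is the one to fear.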

The Outside View isn't magic

6 Stuart_Armstrong 27 September 2017 02:37PM

Crossposted at Less Wrong 2.0.

The planning fallacy is an almost perfect example of the strength of using the outside view. When asked to predict the time taken for a project that they are involved in, people tend to underestimate the time needed (in fact, they tend to predict as if the question was how long things would take if everything went perfectly).

Simply telling people about the planning fallacy doesn't seem to make it go away. So the outside view argument is that you need to put your project into the "reference class" of other projects, and expect time overruns as compared to your usual, "inside view" estimates (which focus on the details you know about the project).

So, for the outside view, what is the best way of estimating the time of a project? Well, to find the right reference class for it: the right category of projects to compare it with. You can compare the project with others that have similar features - number of people, budget, objective desired, incentive structure, inside view estimate of time taken etc... - and then derive a time estimate for the project that way.

That's the outside view. But to me, it looks a lot like... induction. In fact, it looks a lot like the elements of a linear (or non-linear) regression. We can put those features (at least the quantifiable ones) into a linear regression with a lot of data about projects, shake it all about, and come up with regression coefficients.
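As a minimal sketch of that regression idea (with invented project data, and only a single feature - the inside-view estimate):

```python
# Toy outside-view model: predict actual project duration from the
# inside-view estimate using ordinary least squares on one feature.
# The data points are invented for illustration.

inside_estimates = [10.0, 20.0, 30.0, 40.0]  # weeks, as initially predicted
actual_durations = [17.0, 32.0, 47.0, 62.0]  # weeks, as actually taken

n = len(inside_estimates)
mean_x = sum(inside_estimates) / n
mean_y = sum(actual_durations) / n

# Closed-form least squares: slope = cov(x, y) / var(x).
cov_xy = sum((x - mean_x) * (y - mean_y)
             for x, y in zip(inside_estimates, actual_durations))
var_x = sum((x - mean_x) ** 2 for x in inside_estimates)

slope = cov_xy / var_x
intercept = mean_y - slope * mean_x

def outside_view_prediction(inside_estimate):
    return slope * inside_estimate + intercept

print(slope, intercept)               # here: 1.5 and 2.0
print(outside_view_prediction(25.0))  # a 25-week inside estimate -> 39.5 weeks
```

With this (made-up) data the fitted slope is 1.5: the "regression coefficient" itself encodes the systematic optimism that the planning fallacy describes.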

At that point, we are left with a decent project timeline prediction model, and another example of human bias. The fact that humans often perform badly in prediction tasks is not exactly new - see for instance my short review on the academic research on expertise.

So what exactly is the outside view doing in all this?


The role of the outside view: model incompleteness and human bias

The main use of the outside view, for humans, seems to be to point out either an incompleteness in the model or a human bias. The planning fallacy has both of these: if you did a linear regression comparing your project with all projects with similar features, you'd notice your inside estimate was more optimistic than the regression - your inside model is incomplete. And if you also compared each person's initial estimate with the ultimate duration of their project, you'd notice a systematically optimistic bias - you'd notice the planning fallacy.

The first type of errors tend to go away with time, if the situation is encountered regularly, as people refine models, add variables, and test them on the data. But the second type remains, as human biases are rarely cleared by mere data.


Reference class tennis

If use of the outside view is disputed, it often develops into a case of reference class tennis - where people on opposing sides insist or deny that a certain example belongs in the reference class (similarly to how, in politics, anything positive is claimed for your side and anything negative assigned to the other side).

But once the phenomenon you're addressing has an explanatory model, there are no issues of reference class tennis any more. Consider for instance Goodhart's law: "When a measure becomes a target, it ceases to be a good measure". A law that should be remembered by any minister of education wanting to reward schools according to improvements to their test scores.

This is a typical use of the outside view: if you'd just thought about the system in terms of inside facts - tests are correlated with child performance; schools can improve child performance; we can mandate that test results go up - then you'd have missed several crucial facts.

But notice that nothing mysterious is going on. We understand exactly what's happening here: schools have ways of upping test scores without upping child performance, and so they decided to do that, weakening the correlation between score and performance. Similar things happen in the failures of command economies; but again, once our model is broad enough to encompass enough factors, we get decent explanations, and there's no need for further outside views.
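A toy simulation makes the mechanism concrete (all numbers below are invented for illustration): each school has a fixed true performance, but once scores become targets, gaming effort gets added to the scores, and the score-performance correlation weakens:

```python
import random

random.seed(0)

def school_data(targeted):
    # Each school has a true performance level; its test score is
    # performance plus noise, plus gaming effort if scores are targeted.
    rows = []
    for _ in range(1000):
        performance = random.gauss(50, 10)
        gaming = random.gauss(20, 10) if targeted else 0.0
        score = performance + gaming + random.gauss(0, 2)
        rows.append((performance, score))
    return rows

def correlation(rows):
    # Pearson correlation between performance and score.
    n = len(rows)
    mx = sum(p for p, _ in rows) / n
    my = sum(s for _, s in rows) / n
    cov = sum((p - mx) * (s - my) for p, s in rows)
    vx = sum((p - mx) ** 2 for p, _ in rows)
    vy = sum((s - my) ** 2 for _, s in rows)
    return cov / (vx * vy) ** 0.5

before = correlation(school_data(targeted=False))
after = correlation(school_data(targeted=True))
```

Before targeting, scores track performance almost perfectly; after, the gaming noise dilutes the correlation - the measure has stopped being a good measure.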

In fact, we know enough that we can show when Goodhart's law fails: when no-one with incentives to game the measure has control of the measure. This is one of the reasons central bank interest rate setting has been so successful. If you order a thousand factories to produce shoes, and reward the managers of each factory for the number of shoes produced, you're heading to disaster. But consider GDP. Say the central bank wants to increase GDP by a certain amount, by fiddling with interest rates. Now, as a shoe factory manager, I might have preferences about the direction of interest rates, and my sales are a contributor to GDP. But they are a tiny contributor. It is not in my interest to manipulate my sales figures, in the vague hope that, aggregated across the economy, this will falsify GDP and change the central bank's policy. The reward is too diluted, and would require coordination with many other agents (and coordination is hard).

Thus if you're engaging in reference class tennis, remember the objective is to find a model with enough variables, and enough data, so that there is no more room for the outside view - a fully understood Goodhart's law rather than just a law.


In the absence of a successful model

Sometimes you can have a strong trend without a compelling model. Take Moore's law, for instance. It is extremely strong, going back decades, and surviving multiple changes in chip technology. But it has no clear cause.
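The strength of the trend itself is easy to check: regress log transistor counts on year and read off the implied doubling time. A sketch, with illustrative (not historical) data points:

```python
import math

# Hypothetical (year, transistor count) data points, roughly in the
# spirit of Moore's law; the numbers are illustrative, not historical.
data = [
    (1971, 2.3e3),
    (1980, 3.0e4),
    (1990, 1.2e6),
    (2000, 4.2e7),
    (2010, 1.2e9),
]

# Fit log(count) = a + b * year by ordinary least squares.
n = len(data)
xs = [year for year, _ in data]
ys = [math.log(count) for _, count in data]
mean_x = sum(xs) / n
mean_y = sum(ys) / n
b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
    sum((x - mean_x) ** 2 for x in xs)
a = mean_y - b * mean_x

# Implied doubling time in years: the count doubles when b * t = ln 2.
doubling_time = math.log(2) / b
```

A straight-line fit in log space over decades is exactly what makes the trend so striking - and so much in need of an explanation.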

A few explanations have been proposed. Maybe it's a consequence of its own success, of chip companies using it to set their goals. Maybe there's some natural exponential rate of improvement in any low-friction feature of a market economy. Exponential-type growth in the short term is no surprise - that just means growth proportional to investment - so maybe it was an amalgamation of various short term trends.

Do those explanations sound unlikely? Possibly, but there is a huge trend in computer chips going back decades that needs to be explained. They are unlikely, but they have to be weighed against the unlikeliness of the situation. The most plausible explanation is a combination of the above and maybe some factors we haven't thought of yet.

But here's an explanation that is implausible: little time-travelling angels modify the chips so that they follow Moore's law. It's a silly example, but it shows that not all explanations are created equal, even for phenomena that are not fully understood. In fact there are four broad categories of explanations for putative phenomena that don't have a compelling model:

  1. Unlikely but somewhat plausible explanations.
  2. We don't have an explanation yet, but we think it's likely that there is an explanation.
  3. The phenomenon is a coincidence.
  4. Any explanation would go against stuff that we do know, and would be less likely than coincidence.

The explanations I've presented for Moore's law fall into category 1. Even if we hadn't thought of those explanations, Moore's law would fall into category 2, because of the depth of evidence for Moore's law and because a "medium length regular technology trend within a broad but specific category" is something that is intrinsically likely to have an explanation.

Compare with Kurzweil's "law of time and chaos" (a generalisation of his "law of accelerating returns") and Robin Hanson's model where the development of human brains, hunting, agriculture and the industrial revolution are all points on a trend leading to uploads. I discussed these in a previous post, but I can now better articulate the problem with them.

Firstly, they rely on very few data points (the more recent part of Kurzweil's law, the part about recent technological trends, has a lot of data, but the earlier part does not). This raises the probability that they are a mere coincidence (we should also consider selection bias in choosing the data points, which increases the probability of coincidence). Secondly, we have strong reasons to suspect that there won't be any explanation that ties together things like the early evolution of life on Earth, human brain evolution, the agricultural revolution, the industrial revolution, and future technology development. These phenomena have decent local explanations that we already roughly understand (local in time and space to the phenomena described), and these run counter to any explanation that would tie them together.


Human biases and predictions

There is one area where the outside view can still function for multiple phenomena across different eras: when it comes to pointing out human biases. For example, we know that doctors have been authoritative, educated, informed, and useless for most of human history (or possibly much worse than useless). Hence authoritative, educated, and informed statements or people are not to be considered of any value, unless there is some evidence the statement or person is truth tracking. We now have things like expertise research, some primitive betting markets, and track records to try to estimate their expertise; these can provide good "outside views".

And the authors of the models of the previous section have some valid points where bias is concerned. Kurzweil's point that (paraphrasing) "things can happen a lot faster than some people think" is valid: we can compare predictions with outcomes. Robin has similar valid points in defense of the possibility of the em scenario.

The reason these explanations are more likely valid is because they have a very probable underlying model/explanation: humans are biased.


Summary

  • The outside view is a good reminder for anyone who may be using too narrow a model.
  • If the model explains the data well, then there is no need for further outside views.
  • If there is a phenomenon with data but no convincing model, we need to decide whether it's a coincidence or whether there is an underlying explanation.
  • Some phenomena have features that make it likely that there is an explanation, even if we haven't found it yet.
  • Some phenomena have features that make it unlikely that there is an explanation, no matter how much we look.
  • Outside view arguments that point at human prediction biases, however, can be generally valid, as they only require the explanation that humans are biased in that particular way.

Economics of AI conference from NBER

1 fortyeridania 27 September 2017 01:45AM

The speaker list (including presenters and moderators) includes many prominent names in the economics world, including:

And others with whom you might be more familiar than I.

H/T Marginal Revolution

[Link] Cognitive Empathy and Emotional Labor

0 gworley 26 September 2017 08:36PM

View more: Prev | Next