FAI Research Constraints and AGI Side Effects

14 JustinShovelain 03 June 2015 07:25PM

Ozzie Gooen and Justin Shovelain


Friendly artificial intelligence (FAI) researchers have at least two significant challenges. First, they must produce a large amount of FAI research in a short amount of time. Second, they must do so without producing enough artificial general intelligence (AGI) research to result in the creation of an unfriendly artificial intelligence (UFAI). We estimate the requirements of both of these challenges using two simple models.

Our first model describes a friendliness ratio and a leakage ratio for FAI research projects. These provide limits on the allowable amount of AGI knowledge produced per unit of FAI knowledge in order for a project to be net beneficial.

Our second model studies a hypothetical FAI venture, which is responsible for ensuring FAI creation. We estimate the total FAI research the venture must produce per year and the leakage ratio of that research. This model demonstrates a trade-off between the speed of FAI research and the proportion of AGI research that can be revealed as part of it. If FAI research takes too long, then the acceptable leakage ratio may become so low that it would become nearly impossible to safely produce any new research.

Report -- Allocating risk mitigation across time

11 owencb 20 February 2015 04:37PM

I've just released a Future of Humanity Institute technical report, written as part of the Global Priorities Project.


This article is about priority-setting for work aiming to reduce existential risk. Its chief claim is that all else being equal we should prefer work earlier and prefer to work on risks that might come early. This is because we are uncertain about when we will have to face different risks, because we expect diminishing returns of extra work, and because we expect that more people will work on these risks in the future.

I explore this claim both qualitatively and with explicit models. I consider its implications for two questions: first, “When is it best to do different kinds of work?”; second, “Which risks should we focus on?”.

As a major application, I look at the case of risk from artificial intelligence. The best strategies for reducing this risk depend on when the risk is coming. I argue that we may be underinvesting in scenarios where AI comes soon even though these scenarios are relatively unlikely, because we will not have time later to address them.


You can read the full report here: Allocating risk mitigation across time.

Probability and radical uncertainty

11 David_Chapman 23 November 2013 10:34PM

In the previous article in this sequence, I conducted a thought experiment in which simple probability was not sufficient to choose how to act. Rationality required reasoning about meta-probabilities, the probabilities of probabilities.

Relatedly, lukeprog has a brief post that explains how this matters; a long article by HoldenKarnofsky makes meta-probability central to utilitarian estimates of the effectiveness of charitable giving; and Jonathan_Lee, in a reply to that, has used the same framework I presented.

In my previous article, I ran thought experiments that presented you with various colored boxes you could put coins in, gambling with uncertain odds.

The last box I showed you was blue. I explained that it had a fixed but unknown probability of a twofold payout, uniformly distributed between 0 and 0.9. The overall probability of a payout was 0.45, so the expectation value for gambling was 0.9—a bad bet. Yet your optimal strategy was to gamble a bit to figure out whether the odds were good or bad.
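The arithmetic behind the blue box can be sketched in a few lines. This is a minimal illustration, assuming each coin is an independent bet and that "twofold payout" means a winning coin returns twice the stake; the function and variable names are my own, not from the earlier article. It also computes the share of possible boxes that would be favorable, which is one way to see why a little exploratory gambling can pay off even though the average box is a bad bet.

```python
# Blue box: the payout probability p is fixed but unknown,
# uniformly distributed between 0 and 0.9 (figures from the text).
P_LOW, P_HIGH = 0.0, 0.9
PAYOUT_MULTIPLE = 2.0  # assumption: a winning coin returns twice the stake


def mean_of_uniform(low, high):
    """Mean of a uniform distribution on [low, high]."""
    return (low + high) / 2


def expected_return_per_coin(p_payout, multiple=PAYOUT_MULTIPLE):
    """Expected amount returned per coin gambled."""
    return p_payout * multiple


# Overall probability of a payout, averaging over the unknown p.
overall_p = mean_of_uniform(P_LOW, P_HIGH)

# Expected return per coin: below 1.0 means a losing bet on average.
expected_return = expected_return_per_coin(overall_p)

# A particular box is favorable only if its p exceeds the break-even point.
break_even_p = 1 / PAYOUT_MULTIPLE
favorable_fraction = (P_HIGH - break_even_p) / (P_HIGH - P_LOW)

print(f"overall payout probability: {overall_p:.2f}")
print(f"expected return per coin:   {expected_return:.2f}")
print(f"share of boxes with p > {break_even_p}: {favorable_fraction:.3f}")
```

Under these assumptions, a substantial minority of possible boxes have good odds, so a few test coins buy information about which kind of box you are holding.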

Let’s continue the experiment. I hand you a black box, shaped rather differently from the others. Its sealed faceplate is carved with runic inscriptions and eldritch figures. “I find this one particularly interesting,” I say.

Probability, knowledge, and meta-probability

39 David_Chapman 17 September 2013 12:02AM

This article is the first in a sequence that will consider situations where probability estimates are not, by themselves, adequate to make rational decisions. This one introduces a "meta-probability" approach, borrowed from E. T. Jaynes, and uses it to analyze a gambling problem. This situation is one in which reasonably straightforward decision-theoretic methods suffice. Later articles introduce increasingly problematic cases.

Start Under the Streetlight, then Push into the Shadows

31 lukeprog 24 June 2013 12:49AM

See also: Hack Away at the Edges.

The streetlight effect

You've heard the joke before:

Late at night, a police officer finds a drunk man crawling around on his hands and knees under a streetlight. The drunk man tells the officer he’s looking for his wallet. When the officer asks if he’s sure this is where he dropped the wallet, the man replies that he thinks he more likely dropped it across the street. “Then why are you looking over here?” the befuddled officer asks. “Because the light’s better here,” explains the drunk man.

The joke illustrates the streetlight effect: we "tend to look for answers where the looking is good, rather than where the answers are likely to be hiding."

Freedman (2010) documents at length some harms caused by the streetlight effect. For example:

A bolt of excitement ran through the field of cardiology in the early 1980s when anti-arrhythmia drugs burst onto the scene. Researchers knew that heart-attack victims with steady heartbeats had the best odds of survival, so a medication that could tamp down irregularities seemed like a no-brainer. The drugs became the standard of care for heart-attack patients and were soon smoothing out heartbeats in intensive care wards across the United States.

But in the early 1990s, cardiologists realized that the drugs were also doing something else: killing about 56,000 heart-attack patients a year. Yes, hearts were beating more regularly on the drugs than off, but their owners were, on average, one-third as likely to pull through. Cardiologists had been so focused on immediately measurable arrhythmias that they had overlooked the longer-term but far more important variable of death.

Memetic Tribalism

43 [deleted] 14 February 2013 03:03AM

Related: politics is the mind killer, other optimizing

When someone says something stupid, I get an urge to correct them. Based on the stories I hear from others, I'm not the only one.

For example, some of my friends are into this rationality thing, and they've learned about all these biases and correct ways to get things done. Naturally, they get irritated with people who haven't learned this stuff. They complain about how their family members or coworkers aren't rational, and they ask what is the best way to correct them.

I could get into the details of the optimal set of arguments to turn someone into a rationalist, or I could go a bit meta and ask: "Why would you want to do that?"

Why should you spend your time correcting someone else's reasoning?

One reason that comes up is that it's valuable for some reason to change their reasoning. OK, when is it possible?

  1. You actually know better than them.

  2. You know how to patch their reasoning.

  3. They will be receptive to said patching.

  4. They will actually change their behavior if they accept the patch.

It seems like it should be rather rare for those conditions to all be true, or even to be likely enough for the expected gain to be worth the cost, and yet I feel the urge quite often. And I'm not thinking it through and deciding, I'm just feeling an urge; humans are adaptation executors, and this one seems like an adaptation. For some reason "correcting" people's reasoning was important enough in the ancestral environment to be special-cased in motivation hardware.

I could try to spin an ev-psych just-so story about tribal status, intellectual dominance hierarchies, ingroup-outgroup signaling, and whatnot, but I'm not an evolutionary psychologist, so I wouldn't actually know what I was doing, and the details don't matter anyway. What matters is that this urge seems to be hardware, and it probably has nothing to do with actual truth or your strategic concerns.

It seems to happen to everyone who has ideas. Social justice types get frustrated with people who seem unable to acknowledge their own privilege. The epistemological flamewar between atheists and theists rages continually across the internet. Tech-savvy folk get frustrated with others' total inability to explore and use Google. Some aspiring rationalists get annoyed with people who refuse to decompartmentalize or claim that something is in a separate magisterium.

Some of those border on being just classic blue vs green thinking, but from the outside, the rationality example isn't all that different. They all seem to be motivated mostly by "This person fails to display the complex habits of thought that I think are fashionable; I should {make fun | correct them | call them out}."

I'm now quite skeptical that my urge to correct reflects an actual opportunity to win by improving someone's thinking, given that I'd feel it whether or not I could actually help, and that it seems to be caused by something else.

The value of attempting a rationality-intervention has gone back down towards baseline, but it's not obvious that the baseline value of rationality interventions is all that low. Maybe it's a good idea, even if there is a possible bias supporting it. We can't win just by reversing our biases; reversed stupidity is not intelligence.

The best reason I can think of to correct flawed thinking is if your ability to accomplish your goals directly depends on their rationality. Maybe they are your business partner, or your spouse. Someone specific and close who you can cooperate with a lot. If this is the case, it's near the same level of urgency as correcting your own.

Another good reason (to discuss the subject at least) is that discussing your ideas with smart people is a good way to make your ideas better. I often get my dad to poke holes in my current craziness, because he is smarter and wiser than me. If this is your angle, keep in mind that if you expect someone else to correct you, it's probably not best to go in making bold claims and implicitly claiming intellectual dominance.

An OK reason is that creating more rationalists is valuable in general. This one is less good than it first appears. Do you really think your comparative advantage right now is in converting this person to your way of thinking? Is that really worth the risk of social friction and expenditure of time and mental energy? Is this the best method you can think of for creating more rationalists?

I think it is valuable to raise the sanity waterline when you can, but using methods of mass instruction like writing blog posts, administering a meetup, or launching a whole rationality movement is a lot more effective than arguing with your mom. Those options aren't for everybody of course, but if you're into waterline-manipulation, you should at least be considering strategies like them. At least consider picking a better time.

Another reason that gets brought up is that turning people around you into rationalists is instrumental in a selfish way, because it makes life easier for you. This one is suspect to me, even without the incentive to rationalize. Did you also seriously consider sabotaging people's rationality to take advantage of them? Surely that's nearly as plausible a priori. For what specific reason did your search process rank cooperation over predation?

I'm sure there are plenty of good reasons to prefer cooperation, but of course no search process was ever run. All of these reasons that come to mind when I think of why I might want to fix someone's reasoning are just post-hoc rationalizations of an automatic behavior. The true chain of cause-and-effect is observe->feel->act; no planning or thinking involved, except where it is necessary for the act. And that feeling isn't specific to rationality, it affects all mental habits, even stupid ones.

Rationality isn't just a new memetic orthodoxy for the cool kids, it's about actually winning. Every improvement requires a change. Rationalizing strategic reasons for instinctual behavior isn't change, it's spending your resources answering questions with zero value of information. Rationality isn't about what other people are doing wrong; it's about what you are doing wrong.

I used to call this practice of modeling other people's thoughts to enforce orthodoxy on them "incorrect use of empathy", but in terms of ev-psych, it may be exactly the correct use of empathy. We can call it Memetic Tribalism instead.

(I've ignored the other reason to correct people's reasoning, which is that it's fun and status-increasing. When I reflect on my reasons for writing posts like this, it turns out I do it largely for the fun and internet status points, but I try to at least be aware of that.)

How to avoid dying in a car crash

76 michaelcurzi 17 March 2012 07:44PM

Aside from cryonics and eating better, what else can we do to live long lives?

Using this tool, I looked up the risks of death for my demographic group. As a 15-24 year old male in the United States, the most likely cause of my death is a traffic accident; and so I’m taking steps to avoid that. Below I have included the results of my research as well as the actions I will take to implement my findings. Perhaps my research can help you as well.1

Before diving into the results, I will note that this data took me one hour to collect. It’s definitely not comprehensive, and I know that working together, we can do much better. So if you have other resources or data-backed recommendations on how to avoid dying in a traffic accident, leave a comment below and I’ll update this post.

General points

Changing your behavior can reduce your risk of death in a car crash. A 1985 report on British and American crash data discovered that “driver error, intoxication and other human factors contribute wholly or partly to about 93% of crashes.” Other drivers’ behavior matters too, of course, but you might as well optimize your own.2
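The 93% figure can be checked against the breakdown quoted in footnote 2. The following is a small sketch, with a data layout and variable names of my own choosing, that adds up every category in which driver factors play a part.

```python
# Percentage of crashes by contributing factors, as quoted in footnote 2.
# Keys are the sets of factors involved; the layout is illustrative only.
crash_share_by_factors = {
    frozenset({"driver"}): 57,
    frozenset({"roadway", "driver"}): 27,
    frozenset({"vehicle", "driver"}): 6,
    frozenset({"roadway"}): 3,
    frozenset({"roadway", "driver", "vehicle"}): 3,
    frozenset({"vehicle"}): 2,
    frozenset({"roadway", "vehicle"}): 1,
}

# Crashes where driver factors contribute wholly or partly.
driver_involved = sum(
    pct for factors, pct in crash_share_by_factors.items() if "driver" in factors
)

# Sum of every listed category, as a sanity check on the quoted figures.
total_listed = sum(crash_share_by_factors.values())

print(f"driver factors involved: {driver_involved}%")
print(f"all listed categories:   {total_listed}%")
```

The driver-involved categories do sum to 93%. The listed categories total 99% rather than 100%, which I assume is rounding in the source.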

Secondly, overconfidence appears to be a large factor in people’s thinking about traffic safety. A speaker for the National Highway Traffic Safety Administration (NHTSA) stated that “Ninety-five percent of crashes are caused by human error… but 75% of drivers say they're more careful than most other drivers.” Less extreme evidence for overconfidence about driving is presented here.

One possible cause for this was suggested by the Transport Research Laboratory, which explains that “...the feeling of being confident in more and more challenging situations is experienced as evidence of driving ability, and that 'proven' ability reinforces the feelings of confidence. Confidence feeds itself and grows unchecked until something happens – a near-miss or an accident.”

So if you’re tempted to use this post as an opportunity to feel superior to other drivers, remember: you’re probably overconfident too! Don’t just humbly confess your imperfections – change your behavior.

Top causes of accidents


Driver distraction is one of the largest causes of traffic accident deaths. The Director of Traffic Safety at the American Automobile Association stated that "The research tells us that somewhere between 25-50 percent of all motor vehicle crashes in this country really have driver distraction as their root cause." The NHTSA reports the number as 16%.

If we are to reduce distractions while driving, we ought to identify which distractors are the worst. One is cell phone use. My solution: Don’t make calls in the car, and turn off your phone’s sound so that you aren’t tempted.

I brainstormed other major distractors and thought of ways to reduce their distracting effects.

Distractor: Looking at directions on my phone as I drive

  • Solution: Download a great turn-by-turn navigation app (recommendations are welcome).
  • Solution: Buy a GPS.

Distractor: Texting, Facebook, slowing down to gawk at an accident, looking at scenery

  • Solution [For System 2]: Consciously accept that texting (Facebook, gawking, scenery) causes accidents.
  • Solution [For System 1]: Once a week, vividly and emotionally imagine texting (using Facebook, gawking at an accident) and then crashing & dying.
  • Solution: Turn off your phone’s sound while driving, so you won’t answer texts.

Distractor: Fatigue

  • Solution [For System 2]: Ask yourself if you’re tired before you plan to get in the car. Use Anki or a weekly review list to remember the association.
  • Solution [For System 1]: Once a week, vividly and emotionally imagine dozing off while driving and then dying.

Distractor: Other passengers

  • Solution: Develop an identity as someone who drives safely and thinks it’s low status to be distracting in the car. Achieve this by meditating on the commitment, writing a journal entry about it, using Anki, or saying it every day when you wake up in the morning.
  • Solution [In the moment]: Tell people to chill out while you’re driving. Mentally simulate doing this ahead of time, so you don’t hesitate to do it when it matters.

Distractor: Adjusting the radio

  • Solution: If avoiding using the car radio is unrealistic, minimize your interaction with it by only using the hotkey buttons rather than manually searching through channels.
  • Solution: If you’re constantly tempted to change the channel (like I am), buy an iPod cable so you can listen to your own music and set playlists that you like, so you won't constantly want to change the song.

A last interesting fact about distraction, from Wikipedia:

Recent research conducted by British scientists suggests that music can also have an effect [on driving]; classical music is considered to be calming, yet too much could relax the driver to a condition of distraction. On the other hand, hard rock may encourage the driver to step on the acceleration pedal, thus creating a potentially dangerous situation on the road.


The Roads and Traffic Authority of New South Wales claims that “speeding… is a factor in about 40 percent of road deaths.” Data from the NHTSA puts the number at 30%.

Speeding also increases the severity of crashes; “in a 60 km/h speed limit area, the risk of involvement in a casualty crash doubles with each 5 km/h increase in travelling speed above 60 km/h.”

Stop. Think about that for a second. I’ll convert it to the Imperial system for my fellow Americans: “in a [37.3 mph] speed limit area, the risk of involvement in a casualty crash doubles with each [3.1 mph] increase in travelling speed above [37.3 mph].” Remember that next time you drive a 'mere' 5 mph over the limit.
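To make the conversion and the doubling rule concrete, here is a rough sketch. It assumes the "doubles with each 5 km/h" rule extends smoothly between the 5 km/h steps, which the quoted source does not state, and it applies only to the 60 km/h zone the quote describes. The function names are mine.

```python
KM_PER_MILE = 1.609344  # exact definition of the international mile


def kmh_to_mph(kmh):
    return kmh / KM_PER_MILE


def mph_to_kmh(mph):
    return mph * KM_PER_MILE


def relative_casualty_risk(kmh_over_limit, doubling_step_kmh=5.0):
    """Risk relative to driving at the limit in a 60 km/h zone.

    Assumption: the quoted doubling per 5 km/h is interpolated as a
    smooth exponential, so the risk is 2 ** (excess / 5).
    """
    return 2 ** (kmh_over_limit / doubling_step_kmh)


print(f"60 km/h = {kmh_to_mph(60):.1f} mph")
print(f" 5 km/h = {kmh_to_mph(5):.1f} mph")

# What a 'mere' 5 mph over the limit would mean under this rule.
excess_kmh = mph_to_kmh(5)
print(f"5 mph over = {excess_kmh:.1f} km/h over, "
      f"relative risk about {relative_casualty_risk(excess_kmh):.2f}x")
```

Under this interpolation, 5 mph over the limit is about 8 km/h over, which comes out to roughly three times the casualty-crash risk of driving at the limit.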

Equally shocking is this paragraph from the Freakonomics blog:

Kockelman et al. estimated that the difference between a crash on a 55 mph limit road and a crash on a 65 mph one means a 24 percent increase in the chances the accident will be fatal. Along with the higher incidence of crashes happening in the first place, a difference in limit between 55 and 65 adds up to a 28 percent increase in the overall fatality count.

Driving too slowly can be dangerous too. An NHTSA presentation cites two studies that found a U-shaped relationship between vehicle speed and crash incidence; thus “Crash rates were lowest for drivers traveling near the mean speed, and increased with deviations above and below the mean.”

However, driving fast is still far more dangerous than driving slowly. This relationship appears to be exponential, as you can see on the tenth slide of the presentation.

  • Solution: Watch this 30-second video for a vivid comparison of head-on crashes at 60 km/hr (37 mph) and 100 km/hr (62 mph). Imagine yourself in the car. Imagine your tearful friends and family.
  • Solution: Develop an identity as someone who drives close to the speed limit, by meditating on the commitment, writing a journal entry about it, using Anki, or saying it every day when you wake up in the morning.

Driving conditions

Driving conditions are another source of driving risk.

One factor I discovered was the additional risk from driving at night. Nationwide, 49% of fatal crashes happen at night, with a fatality rate per mile of travel about three times as high as daytime hours. (Source)

  • Solution: Make an explicit effort to avoid driving at night. Use Anki to remember this association.
  • Solution: Look at your schedule and see if you can change a recurring night-time drive to the daytime.
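The two night-driving figures above (49% of fatal crashes, and a per-mile rate about three times the daytime rate) together imply how little of total driving happens at night. The sketch below works that out. It assumes the per-mile fatality rate is proportional to fatal crashes per mile, which the source does not spell out, and the function name is hypothetical.

```python
def implied_night_mileage_share(night_share_of_fatal_crashes, night_to_day_rate_ratio):
    """Share of all miles driven at night implied by the two statistics.

    If night miles are s and the night rate is k times the day rate, then
    night crashes : day crashes = k * s : (1 - s). Solving for s gives the
    expression below.
    """
    night_miles_weight = night_share_of_fatal_crashes / night_to_day_rate_ratio
    day_miles_weight = 1 - night_share_of_fatal_crashes
    return night_miles_weight / (night_miles_weight + day_miles_weight)


night_share = implied_night_mileage_share(0.49, 3.0)
print(f"implied share of miles driven at night: {night_share:.1%}")
```

Under these assumptions, only about a quarter of miles are driven at night, yet those miles account for nearly half of fatal crashes.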

Berkeley research on 1.4 million fatal crashes found that “fatal crashes were 14% more likely to happen on the first snowy day of the season compared with subsequent ones.” The suggested hypothesis is that people take at least a day to recalibrate their driving behavior in light of new snow. 

  • Solution: Make an explicit effort to avoid driving on the first snowy day after a sequence of non-snowy ones. Use Anki to remember this association.

Another valuable factoid: 77% of weather-related fatalities (and 75% of all crashes!) involve wet pavement.

Statistics are available for other weather-related issues, but the data I found wasn’t adjusted for the relative frequencies of various weather conditions. That’s problematic; it might be that fog, for example, is horrendously dangerous compared to ice or slush, but it’s rarer and thus kills fewer people. I’m interested in looking at appropriately adjusted statistics. 

Other considerations

  • Teen drivers are apparently way worse at not dying in cars than older people. So if you’re a teenager, take the outside view and accept that you (not just ‘other dumb teenagers’) may need to take particular care when driving. Relevant information about teen driving is available here.

  • Alcohol use appeared so often during my research that I didn’t even bother including stats about it. Likewise for wearing a seatbelt.

  • Since I’m not in the market for a car, I didn’t look into vehicle choice as a way to decrease personal existential risk. But I do expect this to be relevant to increasing driving safety.

  • “The most dangerous month, it turns out, is August, and Saturday the most dangerous day, according to the National Highway Traffic Safety Administration.” I couldn’t tell whether this was because of an increased amount of driving or an increased rate of crashes.

  • This site recommends driving with your hands at 9 and 3 for increased control. The same site claims that “Most highway accidents occur in the left lane” because the other lanes have “more ‘escape routes’ should a problem suddenly arise that requires you to quickly change lanes”, but I found no citation for the claim.

  • Bad driver behavior appears to significantly increase the risk of death in an accident, so: don't ride in a car with people who drive badly or aggressively. I have a few friends with aggressive driving habits, and I’m planning to either a) tell them to drive more slowly when I’m in the car or b) stop riding in their cars.

Commenters' recommendations

I should note here that I have not personally verified anything posted below. Be sure to look at the original comment and do followup research before depending on these recommendations.

  • MartinB recommends taking a driving safety class every few years.

  • Dmytry suggests that bicycling may be good training for constantly keeping one's eyes on the road, though others argue that bicycling itself may be significantly more dangerous than driving anyway.

  • Various commenters suggested simply avoiding driving whenever possible. Living in a city with good public transportation is recommended.

  • David_Gerard recommends driving a bigger car with larger crumple zones (but not an SUV because they roll over). He also recommends avoiding motorcycles altogether and taking advanced driving courses.

  • Craig_Heldreth adds that everyone in the car should be buckled up, as even a single unbuckled passenger can collide with and kill other passengers in a crash. Even cargo as light as a laptop should be secured or put in the trunk.

  • JRMayne offers a list of recommendations that merit reading directly. DuncanS also offers a valuable list.

1. All bolding in the data was added for emphasis by me.

2. The report notes that “57% of crashes were due solely to driver factors, 27% to combined roadway and driver factors, 6% to combined vehicle and driver factors, 3% solely to roadway factors, 3% to combined roadway, driver, and vehicle factors, 2% solely to vehicle factors and 1% to combined roadway and vehicle factors.”