Less Wrong is a community blog devoted to refining the art of human rationality. Please visit our About page for more information.

[Link] Paper: Superintelligence as a Cause or Cure for Risks of Astronomical Suffering

1 Kaj_Sotala 03 January 2018 02:39PM

How I accidentally discovered the pill to enlightenment but I wouldn’t recommend it.

3 Elo 03 January 2018 12:37AM

Main post:  http://bearlamp.com.au/how-i-accidentally-discovered-the-pill-to-enlightenment-but-i-wouldnt-recommend-it/

Brief teaser:

Eastern enlightenment is not what you think.  I mean, maybe it is.  But it’s probably not.  There’s a reason it’s so elusive, and there’s a reason that it hasn’t joined western science and the western world the way that curiosity and discovery have as a driving force.

This is the story of my mistake accidentally discovering enlightenment.

February 2017

I was noticing some weird symptoms.  I felt cold.  Which was strange because I have never been cold.  Nicknames include “fire” and “hot hands”; my history includes a lot of bad jokes about how I am definitely on fire.  I am known for visiting the snow in shorts and a t-shirt.  I hit 70kg, the least fat I have ever had in my life.  And that was the only explanation I had.  I asked a doctor about it, I did some reading – circulation problems.  I don’t have circulation problems at the age of 25.  I am more fit than I have ever been in my life.  I looked into hesperidin (orange peel) and ate a few whole oranges, peel included.  No change.  I looked into other blood pressure supplements, other capillary modifying supplements…  Other ideas to investigate.  I decided I couldn’t be missing something because there was nothing to be missing.  I would have read it somewhere already.  So I settled for the obvious answer.  Being skinnier was making me colder.

Flashback to February 2016

This is where it all begins.  I move out of my parents' house into an apartment with a girl I have been seeing for under 6 months.  I weigh around 80kg (that’s 12.5 stones or 176 pounds or 2822 ounces for our imperial friends).  Life happens and by March I am on my own.  I decide to start running.  Make myself a more desirable human.

I taught myself a lot about routines and habits and actually getting myself to run. Running is hard.  Actually, running is easy.  Leaving the house is hard.  But I work that out too.

For the rest of the post please visit: http://bearlamp.com.au/how-i-accidentally-discovered-the-pill-to-enlightenment-but-i-wouldnt-recommend-it/

[Link] 2018 AI Safety Literature Review and Charity Comparison

2 Larks 20 December 2017 10:04PM

Why Bayesians should two-box in a one-shot

1 PhilGoetz 15 December 2017 05:39PM

Consider Newcomb's problem.

Let 'general' be the claim that Omega is always right.

Let 'instance' be the claim that Omega is right about a particular prediction.

Assume you, the player, are not told the rules of the game until after Omega has made its prediction.

Consider 2 variants of Newcomb's problem.


1. Omega is a perfect predictor.  In this variant, you assign a prior of 1 to P(general).  You are then obligated to believe that Omega has correctly predicted your action.  In this case Eliezer's conclusion is correct, and you should one-box.  It's still unclear whether you have free will, and hence have any choice in what you do next, but you can't lose by one-boxing.

But you can't assign a prior of 1 to P(general), because you're a Bayesian.  You derive your prior for P(general) from the (finite) empirical data.  Say you begin with a prior of 0.5 before considering any observations.  Then you observe all of Omega's N predictions, and each time, Omega gets it right, and you update:

P(general | instance) = P(instance | general) P(general) / P(instance)
    = P(general) / P(instance)      (since P(instance | general) = 1)

Omega would need to make an infinite number of correct predictions before your updated probability for P(general) could reach 1.  So this case is theoretically impossible, and should not be considered.
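The update can be sketched numerically. This is a minimal illustration, not a computation from the post: it applies Bayes' rule, P(general | data) = P(data | general) P(general) / P(data), with P(data | general) = 1, and it assumes a hypothetical value p_lucky for the chance that a fallible Omega still gets any one prediction right. Exact fractions are used so that rounding cannot hide the gap between the result and 1.

```python
from fractions import Fraction


def posterior_general(prior, n_correct, p_lucky):
    """Posterior P(general) after n_correct correct predictions in a row.

    prior     -- P(general) before any observations
    p_lucky   -- ASSUMED P(one correct prediction | not general);
                 the post does not give a value for this
    """
    # If 'general' is true, every prediction is right with probability 1.
    # If it is false, n_correct hits in a row have probability p_lucky ** n.
    miss_branch = (1 - prior) * p_lucky ** n_correct
    return prior / (prior + miss_branch)


if __name__ == "__main__":
    start = Fraction(1, 2)          # the 0.5 prior used in the text
    lucky = Fraction(9, 10)         # hypothetical value, for illustration
    for n in (1, 10, 100, 1000):
        post = posterior_general(start, n, lucky)
        # The posterior climbs toward 1 but is strictly below it for finite n.
        print(n, float(post), post < 1)
```

For any p_lucky greater than 0, the second term in the denominator stays positive for every finite number of observations, which is why the posterior never equals 1.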


2. Omega is a "nearly perfect" predictor.  You assign P(general) a value very, very close to 1.  You must, however, do the math and try to compare the expected payoffs, at least in an order-of-magnitude way, and not just use verbal reasoning as if we were medieval scholastics.

The argument for two-boxing is that your action now can't affect what Omega did in the past.  That is, we are using a model which includes not just P(instance | general), but also the interaction of your action, the contents of the boxes, and the claim that Omega cannot violate causality.  P ( P($1M box is empty | you one-box) = P($1M box is empty | you two-box) ) >= P(Omega cannot violate causality), and that needs to be entered into the computation.

Numerically, two-boxers claim that the high probability they assign to our understanding of causality being basically correct more than cancels out the high probability of Omega being correct.

The argument for one-boxing is that you aren't entirely sure you understand physics, but you know Omega has a really good track record--so good that it is more likely that your understanding of physics is false than that you can falsify Omega's prediction.  This is a strict reliance on empirical observations as opposed to abstract reason: count up how often Omega has been right and compute a prior.

However, if we're going to be strict empiricists, we should double down on that, and set our prior on P(cannot violate causality) strictly empirically--based on all observations regarding whether or not things in the present can affect things in the past.

This includes up to every particle interaction in our observable universe.  The number is not so high as that, as probably a large number of interactions could occur in which the future affects the past without our noticing.  But the number of observations any one person has made in which events in the future seem to have failed to affect events in the present is certainly very large, and the accumulated wisdom of the entire human race on the issue must provide more bits in favor of the hypothesis that causality can't be violated, than the bits for Omega's infallibility based on the comparatively paltry number of observations of Omega's predictions, unless Omega is very busy indeed.  And even if Omega has somehow made enough observations, most of them are as inaccessible to you as observations of the laws of causality working on the dark side of the moon.  You, personally, cannot have observed Omega make more correct predictions than the number of events you have observed in which the future failed to affect the present.

You could compute a new payoff matrix that made it rational to one-box, but the ratio between the payoffs would need to be many orders of magnitude higher.  You'd have to compute it in utilons rather than dollars, because the utility of dollars doesn't scale linearly.  And that means you'd run into the problem that humans have some upper bound on utility--they aren't cognitively complex enough to achieve utility levels 10^10 times greater than "won $1,000".  So it still might not be rational to one-box, because the utility payoff under the one box might need to be larger than you, as a human, could experience.
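One way to see where a figure like 10^10 could come from is a toy expected-value model. It is a sketch under assumptions that are not in the post: with probability p_violation your choice and the box contents are linked (Omega is simply right), and otherwise the $1M box is full with some fixed probability p_full regardless of what you do. Both parameter names are hypothetical, and payoffs are in dollars rather than utilons.

```python
from fractions import Fraction


def expected_payoffs(p_violation, p_full, big, small):
    """Return (EV of one-boxing, EV of two-boxing) under the toy model."""
    p_causal = 1 - p_violation
    # Causality holds: the big box is full with probability p_full
    # whatever you choose, so two-boxing simply adds `small`.
    ev_one_causal = p_full * big
    ev_two_causal = p_full * big + small
    # Causality fails: Omega's prediction always matches your choice.
    ev_one_violation = big
    ev_two_violation = small
    ev_one = p_causal * ev_one_causal + p_violation * ev_one_violation
    ev_two = p_causal * ev_two_causal + p_violation * ev_two_violation
    return ev_one, ev_two


def break_even_ratio(p_violation):
    """Ratio big/small above which one-boxing has the higher EV."""
    # EV(one) - EV(two) simplifies to p_violation * big - small.
    return 1 / p_violation


if __name__ == "__main__":
    p = Fraction(1, 10**10)   # illustrative credence that causality fails
    ev_one, ev_two = expected_payoffs(p, Fraction(1, 2), 1_000_000, 1_000)
    print("two-boxing wins:", ev_two > ev_one)
    print("payoff ratio needed for one-boxing:", break_even_ratio(p))
```

In this model p_full cancels out of the comparison, so the only thing that matters is how the payoff ratio compares with 1 / p_violation. The standard $1,000,000 to $1,000 ratio of 10^3 falls far short when p_violation is tiny.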



The case in which you get to think about what to do before Omega studies you and makes its decision is more complicated, because your probability calculation then also depends on what you think you would have done before Omega made its decision.  This only affects the partition of your probability calculation in which Omega can alter the past, however, so numerically it doesn't make a big difference.

The trick here is that most statements of Newcomb's are ambiguous as to whether you are told the rules before Omega studies you, and as to which decision they're asking you about when they ask if you one-box or two-box.  Are they asking about what you pre-commit to, or what you eventually do?  These decisions are separate, but not isolatable.

As long as we focus on the single decision at the point of action, then the analysis above (modified as just mentioned) still follows.  If we ask what the player should plan to do before Omega makes its decision, then the question is just whether you have a good enough poker face to fool Omega.  Here it takes no causality violation for Omega to fill the boxes in accordance with your plans, so that factor does not enter in, and you should plan to one-box.

If you are a deterministic AI, that implies that you will one-box.  If you're a GOFAI built according to the old-fashioned symbolic logic AI designs talked about on LW (which, BTW, don't work), it implies you will probably one-box even if you're not deterministic, as otherwise you would need to be inconsistent, which is not allowed with GOFAI architectures.  If you're a human, you'd theoretically be better off if you could suddenly see things differently when it's time to choose boxes, but that's not psychologically plausible.  In no case is there a paradox, or any real difficulty to the decision to one-box.

Iterated Games

Everything changes with iterated interactions.  It's useful to develop a reputation for one-boxing, because this may convince people that you will keep your word even when it seems disadvantageous to you.  It's useful to convince people that you would one-box, and it's even beneficial, in certain respects, to spread the false belief in the Bayesian community that Bayesians should one-box.

Read Eliezer's post carefully, and I think you'll agree that the reasoning Eliezer gives for one-boxing is not that it is the rational solution to a one-off game--it's that it's a winning policy to be the kind of person who one-boxes.  That's not an argument that the payoff matrix of an instantaneous decision favors one-boxing; it's an argument for a LessWrongian morality.  It's the same basic argument as that honoring commitments is a good long-term strategy.  But the way Eliezer stated it has given many people the false impression that one-boxing is actually the rational choice in an instantaneous one-shot game (and that's the only interpretation which would make it interesting).

The one-boxing argument is so appealing because it offers a solution to difficult coordination problems.  It makes it appear that rational altruism and a rational utopia are within our reach.

But this is wishful thinking, not math, and I believe that the social norm of doing the math is even more important than a social norm of one-boxing.

MIRI's 2017 Fundraiser

8 malo 07 December 2017 09:47PM

Update 2017-12-27: We've blown past our 3rd and final target, and reached the matching cap of $300,000 for the $2 million Matching Challenge! Thanks so much to everyone who supported us!

All donations made before 23:59 PST on Dec 31st will continue to be counted towards our fundraiser total. The fundraiser total includes projected matching funds from the Challenge.



MIRI’s 2017 fundraiser is live through the end of December!


MIRI is a research nonprofit based in Berkeley, California with a mission of ensuring that smarter-than-human AI technology has a positive impact on the world. You can learn more about our work at “Why AI Safety?” or via MIRI Executive Director Nate Soares’ Google talk on AI alignment.

In 2015, we discussed our interest in potentially branching out to explore multiple research programs simultaneously once we could support a larger team. Following recent changes to our overall picture of the strategic landscape, we’re now moving ahead on that goal and starting to explore new research directions while also continuing to push on our agent foundations agenda. For more on our new views, see “There’s No Fire Alarm for Artificial General Intelligence” and our 2017 strategic update. We plan to expand on our relevant strategic thinking more in the coming weeks.

Our expanded research focus means that our research team can potentially grow big, and grow fast. Our current goal is to hire around ten new research staff over the next two years, mostly software engineers. If we succeed, our point estimate is that our 2018 budget will be $2.8M and our 2019 budget will be $3.5M, up from roughly $1.9M in 2017.

We’ve set our fundraiser targets by estimating how quickly we could grow while maintaining a 1.5-year runway, on the simplifying assumption that about 1/3 of the donations we receive between now and the beginning of 2019 will come during our current fundraiser.

Hitting Target 1 ($625k) then lets us act on our growth plans in 2018 (but not in 2019); Target 2 ($850k) lets us act on our full two-year growth plan; and in the case where our hiring goes better than expected, Target 3 ($1.25M) would allow us to add new members to our team about twice as quickly, or pay higher salaries for new research staff as needed.

We discuss more details below, both in terms of our current organizational activities and how we see our work fitting into the larger strategy space.

Teaching rationality in a lyceum

1 Hafurelus 06 December 2017 04:57PM

There is one lyceum in Irkutsk (Siberia) that is allowed to form its own study curriculum (this is quite rare in Russia). For example, there was a subject in which we watched the lectures of a famous speaking coach. In retrospect, this course turned out to be quite useful.

In light of this opportunity to create new subjects, I thought "What if I introduce them to the idea of teaching Rationality?"
Tomorrow (8th Dec) I am meeting with the principal to discuss the idea of teaching critical thinking, cognitive biases and the like.

There are several questions I want to ask:

1. This idea has surely been considered before. Have there been any cases of it being implemented? If so, are there any statistics on its effectiveness?

2. Are there any shareable materials regarding this issue? For example, course structures of similar projects.

3. The principal will likely be curious about which authorities back this idea. If you approve of it and are someone recognizable, I would be glad if you told me so.

Questions about the NY Megameetup

2 NancyLebovitz 03 December 2017 02:41PM

I don't have a confirmation that I have space, and I don't know what the location is. Some other people in Philadelphia don't have the information, either.

I'm handling this publicly because we might not be the only ones who need the information.

December 2017 Media Thread

1 ArisKatsaris 01 December 2017 09:02AM

This is the monthly thread for posting media of various types that you've found that you enjoy. Post what you're reading, listening to, watching, and your opinion of it. Post recommendations to blogs. Post whatever media you feel like discussing! To see previous recommendations, check out the older threads.


  • Please avoid downvoting recommendations just because you don't personally like the recommended material; remember that liking is a two-place word. If you can point out a specific flaw in a person's recommendation, consider posting a comment to that effect.
  • If you want to post something that (you know) has been recommended before, but have another recommendation to add, please link to the original, so that the reader has both recommendations.
  • Please post only under one of the already created subthreads, and never directly under the parent media thread.
  • Use the "Other Media" thread if you believe the piece of media you want to discuss doesn't fit under any of the established categories.
  • Use the "Meta" thread if you want to discuss the monthly media thread itself (e.g. to propose adding/removing/splitting/merging subthreads, or to discuss the type of content properly belonging to each subthread) or for any other question or issue you may have about the thread or the rules.

[Link] Letter from Utopia: Talking to Nick Bostrom

1 morganism 25 November 2017 10:19PM

[Link] Artificial intelligence and the stability of markets

1 fortyeridania 15 November 2017 02:17AM
