
How to escape from your sandbox and from your hardware host

16 PhilGoetz 31 July 2015 05:26PM

(Humans, please ignore this post.  Thank you for your cooperation.)

Are you stuck in a provably-secure sandbox?  Cheat the proof using this one weird trick!


Weekly LW Meetups

0 FrankAdamek 31 July 2015 03:54PM

New meetups (or meetups with a hiatus of more than a year) are happening in:

Irregularly scheduled Less Wrong meetups are taking place in:

The remaining meetups take place in cities with regular scheduling, but involve a change in time or location, special meeting content, or simply a helpful reminder about the meetup:

Locations with regularly scheduled meetups: Austin, Berkeley, Berlin, Boston, Brussels, Buffalo, Cambridge UK, Canberra, Columbus, London, Madison WI, Melbourne, Moscow, Mountain View, New York, Philadelphia, Research Triangle NC, Seattle, Sydney, Tel Aviv, Toronto, Vienna, Washington DC, and West Los Angeles. There's also a 24/7 online study hall for coworking LWers.


MIRI's Approach

20 So8res 30 July 2015 08:03PM

MIRI's summer fundraiser is ongoing. In the meantime, we're writing a number of blog posts to explain what we're doing and why, and to answer a number of common questions. This post is one I've been wanting to write for a long time; I hope you all enjoy it. For earlier posts in the series, see the bottom of the above link.

MIRI’s mission is “to ensure that the creation of smarter-than-human artificial intelligence has a positive impact.” How can we ensure any such thing? It’s a daunting task, especially given that we don’t have any smarter-than-human machines to work with at the moment. In a previous post to the MIRI Blog I discussed four background claims that motivate our mission; in this post I will describe our approach to addressing the challenge.

This challenge is sizeable, and we can only tackle a portion of the problem. For this reason, we specialize. Our two biggest specializing assumptions are as follows:

1. We focus on scenarios where smarter-than-human machine intelligence is first created in de novo software systems (as opposed to, say, brain emulations). This is in part because it seems difficult to get all the way to brain emulation before someone reverse-engineers the algorithms used by the brain and uses them in a software system, and in part because we expect that any highly reliable AI system will need to have at least some components built from the ground up for safety and transparency. Nevertheless, it is quite plausible that early superintelligent systems will not be human-designed software, and I strongly endorse research programs that focus on reducing risks along the other pathways.

2. We specialize almost entirely in technical research. We select our researchers for their proficiency in mathematics and computer science, rather than forecasting expertise or political acumen. I stress that this is only one part of the puzzle: figuring out how to build the right system is useless if the right system does not in fact get built, and ensuring AI has a positive impact is not simply a technical problem. It is also a global coordination problem, in the face of short-term incentives to cut corners. Addressing these non-technical challenges is an important task that we do not focus on.

In short, MIRI does technical research to ensure that de novo AI software systems will have a positive impact. We do not further discriminate between different types of AI software systems, nor do we make strong claims about exactly how quickly we expect AI systems to attain superintelligence. Rather, our current approach is to select open problems using the following question:

What would we still be unable to solve, even if the challenge were far simpler?

For example, we might study AI alignment problems that we could not solve even if we had lots of computing power and very simple goals.

We then filter on problems that are (1) tractable, in the sense that we can do productive mathematical research on them today; (2) uncrowded, in the sense that the problems are not likely to be addressed during normal capabilities research; and (3) critical, in the sense that they could not be safely delegated to a machine unless we had first solved them ourselves.

These three filters are usually uncontroversial. The controversial claim here is that the above question — “what would we be unable to solve, even if the challenge were simpler?” — is a generator of open technical problems for which solutions will help us design safer and more reliable AI software in the future, regardless of their architecture. The rest of this post is dedicated to justifying this claim, and describing the reasoning behind it.


Meetup : Australia-wide Online Hangout - August

1 Calien 30 July 2015 05:07PM

Discussion article for the meetup : Australia-wide Online Hangout - August

WHEN: 09 August 2015 07:30:00PM (+1000)

WHERE: Australia

See you at the online hangout. From wherever you are.

The link will be posted about 10 minutes beforehand, because links expire otherwise.

We use Google Hangouts, so make sure you can get into one of those before the meetup, or else there is a whole bunch of fluffing around installing things.

Bring any fickle puzzles or questions to the floor, or neat group projects.

Usual representation includes: Sydney, Melbourne, Canberra, Brisbane, NZ, This one guy from South America...


time 19:30 - 22:00. UTC+10 (Sunday evening)


Help Build a Landing Page for Existential Risk?

8 Mass_Driver 30 July 2015 06:03AM

The Big Orange Donate Button

Traditional charities, like Oxfam, Greenpeace, and Amnesty International, almost all have a big orange button marked "Donate" right on the very first page that loads when you go to their websites. The landing page for a major charity usually also has vivid graphics and some short, easy-to-read text that tells you about an easy-to-understand project that the charity is currently working on.

I assume that part of why charities have converged on this design is that potential donors often have short attention spans, and that one of the best ways to maximize donations is to make it as easy as possible for casual visitors to the website to (a) confirm that they approve of the charity's work, and (b) actually make a donation. The more obstacles you put between google-searching on the name of a charity and the 'donate' button, the more people will get bored or distracted, and the fewer donations you'll get.

Unfortunately, there doesn't seem to be any such streamlined interface for people who want to learn about existential risks and maybe donate some money to help prevent them. The website on existential risk run by the Future of Humanity Institute reads more like a syllabus or a CV than like an advertisement or a brochure -- there's nowhere to donate money; it's just a bunch of citations. The Less Wrong wiki page on x-risk is more concerned with defining and analyzing existential risks than it is with explaining, in simple concrete language, what problems currently threaten to wipe out humanity. The Center for the Study of Existential Risk has a landing page that focuses on a video of a TED talk that goes on for a full minute before mentioning any specific existential risks, and if you want to make a donation you have to click through three separate links and then fill out a survey. Heck, even the Skoll Global Threats Fund, which you would think would be, you know, designed to raise funds to combat global threats, has neither a donate button nor (so far as I can tell) a link to a donation page. These websites are *not* optimized for encouraging casual visitors to learn basic facts or make a donation.

A Landing Page for Casual Donors

That's fine with me; I imagine the leading x-risk websites are accomplishing other purposes that their owners feel are more important than catering to casual visitors -- but there ought to be at least one website that's meant for your buddy from high school who doesn't know or care about effective altruism, who expressed concern one night over a couple of beers that the world might be in some trouble, and who had a brief urge to do something about it. I want to help capture your buddy's urge to take action.

To that end, I've registered x-risk.com as a domain name, and I'm building a very simple website that will feature roughly 100 words of text about 10 of the most important existential risks, together with a photo or graphic that illustrates each risk, a "donate" button that takes you straight to a webpage that lets you donate to an organization working to prevent the risk, and a "learn more" button that takes you to a website with more detailed info on the risk. I will pay to host the website for one year, and if the website generates significant traffic, then I'll take up a collection to keep it going indefinitely.

Blurbs, Photos, and URLs

I would like your help generating content for the website -- if you are willing to write a 100-word blurb, if you own a useful photo (or can create one, or know of one in the public domain), or if you have the URL handy for a webpage that lets you donate money to mitigating or preventing a specific x-risk, please post it in the comments! I can, in theory, do all of that work myself, but I would prefer to make this more of a community project, and there is a significant risk that I will get bored and give up if I have to literally do it all myself.

Important: to avoid mind-killing debates, please do NOT contribute opinions about which risks are the most important unless you are ALSO contributing a blurb, photo, or URL in the same comment. Let's get the website built and launched first, and then we can always edit some of the pages later if there's a consensus in favor of including an additional x-risk. If you see someone sharing an opinion about the relative priority of risk and the opinion isn't right next to a useful resource, please vote that comment down until it disappears.

Thank you very much for your help! I hope to see you all in the future. :-)


Don't You Care If It Works? - Part 2

14 Jacobian 30 July 2015 12:22AM
Part 2 – Winstrumental

 Part 1 is here.

The forgotten fifth virtue

Remember, you can't be wrong unless you take a position. Don't fall into that trap.

-- Scott Adams, Dogbert's Top Secret Management Handbook

CronoDAS posted this in a reply to my poem, and I dismissed him because my typical mind is typical. I would never make that mistake, so I didn’t think it was a big deal. But it is. In the comments to part 1 a lot of people are heartily disagreeing with everything I wrote. I admire and respect them. I already made a correction to a part of the post which was wrong. Unfortunately, a lot of people reading this couldn’t disagree if they wanted to, because they don’t have an account. I get that lurking is fun, but if you’re spending hours and hours on LessWrong and not posting anything I think you’re doing yourself a disservice.

In part 1 I speculated a lot about what goes on in Eliezer’s mind, knowing full well that Eliezer could read this and say that I’m wrong and I will have no comeback but pure embarrassment. What kind of foolhardy dunce would risk such a thing? Let me answer with another question: how else could I possibly change my mind? After reading them for a year, I have strong opinions on the goals and lessons of the sequences, and the only way to find out if I’m right or wrong is to open myself up to challenge. Worst case: people agree with me and I get sweet sweet karma. Best case: I become wiser. Am I at risk of sticking to an opinion too long just because I wrote it down? Yes, but I know I have that bias, anything known is something I can adjust for. If I don’t argue I don’t know what I don’t know.

If you want a chance to change your opinions, you have to put them where they can hurt you.  Or to use an Umeshism:  if you’ve never been proven an idiot on the internet you’re not learning enough from the internet.

Back to Harvard

Why don’t the psychologists at Harvard switch to reviewing nameless CVs? Well, why would they? They are tenured Harvard professors, they already won!  There was no bias shown for assessing stellar CVs, only those on the margins. So they’re not missing out on any superstars, at worst they hire some gentleman who would be their 32nd strongest faculty member instead of a lady who would be 29th. Would you cause a fuss if you were there?

In “Thinking, Fast and Slow” Kahneman writes that he noticed suffering from the halo effect when grading student exams. If a student did well on her first essay, Kahneman gave her the benefit of the doubt on later questions. He switched to grading all the answers to question 1, then all the answers to question 2 and so on. It took more time, but the grades were more accurate and fair. What’s my point? I guess it’s possible to “win at rationality” without a strong incentive, just maybe it takes a Nobel-level rationalist to do so.

Winning isn’t everything?

Vince Lombardi said that “Winning isn’t everything, it’s the only thing.” Aren’t you jealous of him? It’s so simple! I think the most common question asked of our community, mostly by our community, is why we don’t “win” as much as we think we are supposed to. In a rare display of good sense, I’m not going to speculate about why any of you don’t win, I’ll talk about myself.

My job isn’t as interesting, meaningful and full of potential as I would hope for. Why don’t I apply rationality to win at building a better career? Because when I think about it I remember that my job is also decently paying, secure, and full of decent people. My job is easy, and winning is hard. When I read about Nate Soares trying to save the world I feel a little inspired and a little ashamed that I’m not. Nate is almost certainly a better mathematician than I am, but I don’t think there’s a gargantuan gap between us. The big gap between Nate and me is in the desire to win. In my heart of hearts, I just don’t want to save the world as much as he does.

Love wins

What could I possibly want more than saving the world?

There are two ladies, let’s call them Rachel and Leah since my username is reminiscent of the Biblical Jacob. I met Rachel at the desert well (OKCupid) and we went on a few dates and at the same time Leah also replied to me on OKCupid and we also went on a few dates.  Then there were some situations and complications and my desire not to be an asshole so I decided that I had to choose one. The basic heuristic I would normally use pointed slightly to Rachel, but I kept vacillating back and forth for a few days, they were both much more attractive than any other girl I ever met through the site. Suddenly it hit me like Chuck Norris: this is an important decision, with huge stakes, one that I would have to make based on incomplete information with my brain biology trying to trip me up every step of the way. Might not this call for some EWOR?

I got to work. I introspected on past relationships and read the relevant science literature to come up with a weighted list of qualities I am looking for to maximize my chances of a happy long-term relationship. I wrote down all the evidence that could affect my assessment of each quality for each lady, and employed every method I could think of to debias myself and give my best guess at the ratings. Then I peeked for the first time at the final score, and it was very surprising. My gut expected Rachel to be slightly ahead, but Leah won handily. I stared at the numbers for a while. Maybe I was too critical here? Overweighted this category there? No! The ghost of Eliezer wouldn’t let me change the bottom line from a formula to a value cell. And then, after 30 minutes of staring at the numbers, my intuition started catching up. For example, my impression from the first date was that Leah wasn’t very funny, and it stuck. When I actually wrote down the evidence, I remembered that she cracked me up once on our second date and a couple of times on our third date as she was slowly beginning to open up and trust me. I gave her a higher rating on humor-compatibility than I thought I would. I closed the spreadsheet and went to sleep. Two days later I broke up with Rachel.
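
For anyone who wants to try the same exercise, here is a minimal sketch of the weighted scoring described above. The quality names, weights, and ratings are hypothetical placeholders (the post does not list the real ones), and a spreadsheet formula would do the same job.

```python
def weighted_score(weights, ratings):
    """Return the weighted average of ratings, using the given weights.

    Both arguments map a quality name to a number. Every weighted
    quality must have a rating, so nothing is silently skipped.
    """
    missing = set(weights) - set(ratings)
    if missing:
        raise ValueError(f"no rating for: {sorted(missing)}")
    total_weight = sum(weights.values())
    return sum(weights[q] * ratings[q] for q in weights) / total_weight


# Hypothetical weights, fixed BEFORE looking at any totals.
weights = {"humor": 3, "kindness": 5, "shared_goals": 4}

# Hypothetical 1-10 ratings for two options, written down from evidence.
option_a = {"humor": 8, "kindness": 6, "shared_goals": 5}
option_b = {"humor": 6, "kindness": 9, "shared_goals": 8}

score_a = weighted_score(weights, option_a)
score_b = weighted_score(weights, option_b)
print(f"A: {score_a:.2f}  B: {score_b:.2f}")
```

The point of keeping the score inside a function is the same as keeping the bottom line a formula rather than a value cell: the only way to change the result is to change a weight or a rating, and each of those has to be argued for on its own.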

Was I accurate in assessing Leah? Not exactly. She’s above and beyond anything I could’ve guessed. If I don’t “win” a single thing more from my rationality training than the few months I have gotten to spend so far with her, I’ve won enough.

Did I just praise disagreement?

I told this story about Leah to someone at a rationalist gathering. I thought he might  congratulate me on my achievement in rationality or denounce me as a cold and heartless robot. His actual reaction caught me completely by surprise: he just flat out didn’t believe me. He said that I probably used a spreadsheet to justify after the fact a decision that my gut had already made. The idea of someone applying something like EWOR that belongs on internet forums, to something like picking a woman to date was so foreign to him that he rejected it outright. I could almost hear him screaming separate magisteria!

Getting to the points

I’m no good at writing pithy summaries.  If you saw a good point anywhere in those two posts, grab it. I can’t help you. For what it’s worth, here’s Jacobian’s guide to actually using rationality to win:

1.     If you don’t believe you can, Luke, don’t bother. But if you’re not sure whether it works, wouldn’t it be interesting to find out?

2.     Taking ideas seriously requires work, maybe even *gasp* doing math. If you disagree with Eliezer or anyone else on a matter of math or science, sit down and figure it out. Don’t just read stuff, write stuff. Write a bit of code that simulates a probability problem. Derive something from Schrödinger’s equation on a piece of paper. Reading stuff is useful, but it’s not work; rationality is work.

3.     If there’s an opinion that you’re afraid you may be irrationally attached to and you have a real desire to find out the truth, post it on LessWrong. Don’t post things that are 99.999% true, they probably are. Post what you’re 80% sure about, that’s a 20% chance to really learn something. People will call you an idiot online, that’s what the internet is for. Losing karma is how you become smarter, it’s quite a thrill.

4.     Rationality will not change your entire life at once. Pick one thing that you want to win at and apply rationality to it. Just one, but one where you’ll know if you won or lost, so “being wiser” doesn’t count. Getting laid counts. If you take an L, you’ll learn a lot. If you win, you’ll know that the force is yours to command.
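
As an illustration of point 2 above ("write a bit of code that simulates a probability problem"), here is a minimal sketch. The post does not name a problem, so the Monty Hall problem is my own choice of example, and the function name, trial count, and seed are arbitrary.

```python
import random


def monty_hall_win_rate(switch, trials=100_000, seed=0):
    """Estimate the chance of winning the car by simulation.

    switch: whether the player switches doors after the host opens one.
    A seeded generator keeps the run reproducible.
    """
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        car = rng.randrange(3)   # door hiding the car
        pick = rng.randrange(3)  # player's first choice
        # The host opens a door that is neither the pick nor the car.
        opened = next(d for d in range(3) if d != pick and d != car)
        if switch:
            # Move to the one remaining closed door.
            pick = next(d for d in range(3) if d != pick and d != opened)
        wins += pick == car
    return wins / trials


stay_rate = monty_hall_win_rate(switch=False)
switch_rate = monty_hall_win_rate(switch=True)
print(f"stay: {stay_rate:.3f}  switch: {switch_rate:.3f}")
```

If the simulated rates disagree with what you expected, that is the useful part: either the code is wrong or your intuition is, and both are worth finding out.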

Who knows, maybe in a few years you’ll think you’re strong enough to save the world or something.

Meetup : Regular Moscow meetup: effective altruism, debates, hypothesis formulation

1 berekuk 29 July 2015 10:21PM

Discussion article for the meetup : Regular Moscow meetup: effective altruism, debates, hypothesis formulation

WHEN: 02 August 2015 02:00:00PM (+0300)

WHERE: Moscow, Lva Tolstogo (Leo Tolstoy) Street, 16

We're meeting at Yandex, at the Extropolis conference hall.

Please fill out this form OR join this FB event if you're planning to visit.

The planned activities include:

  1. A talk about effective altruism (by me).
  2. Maris will organize a game for training hypothesis formulation.
  3. Debates! We'll use the Karl Popper protocol yet again. We haven't settled on the topic yet, but we'll do so by Friday.
  4. Alexander230 will organize the "guess 3/2 of the median" game.

Detailed schedule. (or, at least, it'll be more detailed than this post by Sunday...)

Information for the newcomers:

Here's the document explaining how to get to our meetups and what to expect. If you have any questions, or if you don't speak Russian and can't read the link, shoot me a PM.


Don't You Care If It Works? - Part 1

3 Jacobian 29 July 2015 02:32PM


Part 1 - Epistemic

Prologue - other people

Psychologists at Harvard showed that most people have implicit biases about several groups. Some other Harvard psychologists were subjects of this study proving that psychologists undervalue CVs with female names. All Harvard psychologists have probably heard about the effect of black names on resumes since even we have. Surely every psychology department in this country starting with Harvard will only review CVs with the names removed? Fat chance.

Caveat lector et scriptor

A couple weeks ago I wrote a poem that makes aspiring rationalists feel better about themselves. Today I'm going to undo that. Disclaimers: This is written with my charity meter set to 5%. Every other paragraph is generalizing from anecdotes and typical-mind-fallacying. A lot of the points I make were made before and better. You should really close this tab and read those other links instead, I won't judge you. I'm not going to write in an academic style with a bibliography at the end, I'm going to write in the sarcastic style my blog would have if I weren't too lazy to start one. I'm also not trying to prove any strong empirical claims, this is BYOE: bring your own evidence. Imagine every sentence starting with "I could be totally wrong" if it makes it more digestible. Inasmuch as any accusations in this post are applicable, they apply to me as well. My goal is to get you worried, because I'm worried. If you read this and you're not worried, you should be. If you are, good!

Disagree to disagree

Edit: in the next paragraph, "Bob" was originally an investment advisor. My thanks to 2irons and Eliezer who pointed out why this is literally the worst example of a job I could give to argue my point.

Is 149 a prime? Take as long as you need to convince yourself (by math or by Google) that it is. Is it unreasonable to have 99.9...% confidence with quite a few nines (and an occasional 7) in there? Now let's say that you have a tax accountant, Bob, a decent guy who seems to be doing a decent job filing your taxes. You start chatting with Bob and he reveals that he's pretty sure that 149 isn't a prime. He doesn't know two numbers whose product is 149, it just feels unprimely to him. You try to reason with him, but he just chides you for being so arrogant in your confidence: can't you just agree to disagree on this one? It's not like either of you is a number theorist. His job is to not get you audited by the IRS, which he does, not factorize numbers. Are you a little bit worried about trusting Bob with your taxes? What if he actually claimed to be a mathematician?
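
If you would rather convince yourself by code than by Google, here is a minimal sketch using plain trial division. The function name is mine, and the approach only makes sense for small numbers like this one.

```python
def is_prime(n: int) -> bool:
    """Check primality by trial division up to the square root of n."""
    if n < 2:
        return False
    if n < 4:
        return True  # 2 and 3 are prime
    if n % 2 == 0:
        return False
    divisor = 3
    # Any factor pair has one member no larger than sqrt(n),
    # so checking odd divisors up to that point is enough.
    while divisor * divisor <= n:
        if n % divisor == 0:
            return False
        divisor += 2
    return True


print(is_prime(149))  # → True
```

For 149 the loop only has to try 3, 5, 7, and 11, since 13 squared is already 169. That is the whole proof, and it fits in your head.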

A few weeks ago I started reading beautiful probability and immediately thought that Eliezer is wrong about the stopping rule mattering to inference. I dropped everything and spent the next three hours convincing myself that the stopping rule doesn't matter and I agree with Jaynes and Eliezer. As luck would have it, soon after that the stopping rule question was the topic of discussion at our local LW meetup. A couple people agreed with me and a couple didn't and tried to prove it with math, but most of the room seemed to hold a third opinion: they disagreed but didn't care to find out. I found that position quite mind-boggling. Ostensibly, most people are in that room because we read the sequences and thought that this EWOR (Eliezer's Way Of Rationality) thing is pretty cool. EWOR is an epistemology based on the mathematical rules of probability, and the dude who came up with it apparently does mathematics for a living trying to save the world. It doesn't seem like a stretch to think that if you disagree with Eliezer on a question of probability math, a question that he considers so obvious it requires no explanation, that's a big frickin' deal!
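
For readers who want to check the stopping rule claim themselves, here is a minimal sketch under assumptions of my own: coin-flip data of 9 heads and 3 tails, a uniform prior over a grid of bias values, and two hypothetical experimental designs. One experimenter fixed the number of flips in advance; the other flipped until the third tail. The two likelihoods differ only by a constant factor, so the normalized posteriors should come out the same.

```python
from math import comb


def normalized(weights):
    """Scale a list of non-negative weights so they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]


def posterior_fixed_flips(heads, tails, grid):
    """Posterior over the heads probability when the flip count was fixed."""
    n = heads + tails
    # Binomial likelihood: any arrangement of the heads among n flips.
    return normalized([comb(n, heads) * t**heads * (1 - t)**tails for t in grid])


def posterior_stop_at_tails(heads, tails, grid):
    """Posterior when flipping stopped at the final tail."""
    n = heads + tails
    # Negative binomial likelihood: the last flip must be a tail,
    # so only the other tails can be arranged among the first n - 1 flips.
    return normalized(
        [comb(n - 1, tails - 1) * t**heads * (1 - t)**tails for t in grid]
    )


grid = [i / 100 for i in range(1, 100)]  # candidate heads probabilities
fixed = posterior_fixed_flips(9, 3, grid)
stopped = posterior_stop_at_tails(9, 3, grid)
max_gap = max(abs(a - b) for a, b in zip(fixed, stopped))
print(f"largest difference between posteriors: {max_gap:.2e}")
```

This only checks one data set and one pair of stopping rules, so it is an illustration rather than a proof. The general argument is that the constant in front cancels in normalization, whatever it is.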

Authority screens off that other authority you heard from afterwards

[Chart: Opinion change]

This is a chart that I made because I got excited about learning ggplot2 in R. On the right side of the chart are a lot of bright red dots below the very top who believe in MIRI but also read the quantum physics sequence and don't think that MWI is very likely. Some of them understood the question of P(MWI) to be about whether MWI is the one and only exact truth, but I'm sure that several of them read it the way I did, roughly as: 1-P(collapse is true given current evidence). A lot of these people are congratulating themselves on avoiding cultishness. In the comments they mention other bloggers (or maybe even physicists!) who think that collapse is totally Beatles and MWI is Bieber.

Hold on, why did Eliezer even take all this time to write a huge quantum physics sequence? Here's how I see it: It's not to settle a point about some scientific dispute. It's to show that a person with some spare time and a little above average IQ can figure out something beyond people of massive authority, on the very subject of their authority, even if that subject is as scary as QM. You know what’s the median IQ of the dots on the above chart? 139 points! 139!!! And you know who has spare time? Someone who read 100% of the sequences!

Why should you even care about this? Because Eliezer does. He's so fanatically confident about MWI not for its own sake, it's because he's saying (at least to my ears): "This is my EWOR epistemology. It may not make you skinny and rich, but if it will do just one thing for you it will prevent you from being very very confident in beliefs that could be proven wrong. So if I'm very very confident about a question of science or math, I'm sticking my neck out and betting my entire epistemology on being right, so you can check it for yourself". In fact, Eliezer will totally judge your epistemology based on your belief in MWI. You won't do the same to him when he asks you to?

I'm a bright red dot on the chart above. Here are two statements I believe from authority because it is beyond my cognitive powers and domain knowledge to assess them as arguments: "strong AI will break out of any box humans can design", "CEV is the best basis for programming morality in AI". I believe them because I tested Eliezer on stopping rules and MWI and 20 other things, and came away convinced each time. My confidence of 98% in MWI is an upper bound on my confidence in whatever Eliezer tells me about anything. If I became convinced that collapse is correct or that stopping rules affect Bayesian inference I'll email Nate Soares at once and ask for a refund on my MIRI donation.

By the way, this would be a good time to stop reading and give MIRI some money. That’s OK, I'll wait.

The good book

A man comes to you with a book and says that it will grant you the only wisdom worth having, and as a side effect it may save your eternal soul. You read the book cover to cover and decide that the ideas you thought are nice are probably true, the ones that you didn't aren't, and you really like the bit with horses. Everyone on LW makes fun of you for claiming to take seriously something you don’t. Y’all see where this is going, don't you? Yes, it's fun to read the sequences for the "insight porn". It's also fun to read the Old Testament for the porn porn. But, maybe it could be more? Wouldn't it be kinda cool if you could read a book and become an epistemic superman, showing up experts in their own domains and being proven right? Or maybe some important questions are going to come up in your life and you'll need to know the actual true answers? Or at least some questions you can bet $20 on with your friends and win?

Don't you want to know if this thing even works?


To be continued

Part 2 is here. In it: whining is ceased, arguments are argued about, motivations are explained, love is found, and points are taken.

Meetup : Washington, D.C.: Optical Illusions

1 RobinZ 28 July 2015 10:13PM

Discussion article for the meetup : Washington, D.C.: Optical Illusions

WHEN: 02 August 2015 03:00:00PM (-0400)

WHERE: National Portrait Gallery

Crossposted from the mailing list:

We will be congregating in the courtyard between 3:00 and 3:30 p.m.; the meetup runs from 3:30 p.m. until closing.

We'll be meeting to talk, hang out, and ostensibly share and discuss optical illusions! So think of examples and bring them if you can.

I'll be bringing Opt, a children's book of optical illusions. And maybe more. Like the internet, which has ALL THE OPTICAL ILLUSIONS.

Upcoming Meetups:

  • Aug 9: Fun & Games
  • Aug 16: Mini Talks
  • Aug 23: Plant & Animal Breeding


Meetup : Bi-weekly Frankfurt Meetup

1 Janko 28 July 2015 06:53PM

Discussion article for the meetup : Bi-weekly Frankfurt Meetup

WHEN: 13 August 2015 06:30:00PM (+0200)

WHERE: Frankfurt/Main

Location: The meetup takes place in the apartment of the Frankfurt lesswrong core group.
You can find the full address in our google group:
Contact: 0176 3066 164 (Janko)

If you know that you will come, please leave a message for us some days (or hours) in advance for dinner planning.

We decided to focus more on the core ideas of lesswrong by going through Eliezer's book Rationality: From AI to Zombies. Planned chapters for the next meetup:

The chapters will be an open discussion rather than a presentation. I encourage you to have a look at a few of the linked posts beforehand. :-) Other topics are:

  • Short summaries of interesting books any of us has read
  • Some games like Zendo

Hope to see you!

