## Meetup : West LA—Expert At Vs. Expert On

0 07 March 2014 10:10AM


WHEN: 12 March 2014 06:00:00PM (-0800)

WHERE: 11066 Santa Monica Blvd, Los Angeles, CA

How to Find Us: Go into this Del Taco. I will bring a Rubik's Cube. The presence of a Rubik's Cube will be strong Bayesian evidence of the presence of a Less Wrong meetup.

Parking is completely free. There is a sign that claims there is a 45-minute time limit, but it is a lie.

Discussion: Expert at vs. expert on is a fairly important distinction. It's also a really simple one, which makes it conceptual low-hanging fruit. It's not totally without nuance; for example the terminology implies either total mastery or encyclopedic knowledge, but it applies just as well at any level of competence.

• Expert At Versus Expert On. I know of no other writing that is explicitly on this topic. Robin Hanson emphasizes the signaling aspect (of course he does), but I do not.
• It is well-known that you learn to play baseball by playing baseball, not by reading essays about baseball. However, it is not usually made explicit that the former makes you an expert at baseball, and the latter makes you an expert on baseball.
• Another nuance: Being an expert at something helps you become an expert on it, and vice versa may also hold. For example, you are probably a better linguist if you speak many languages.

NB: No prior knowledge of or exposure to Less Wrong is necessary; this will be generally accessible. Also, we may or may not play a card game.

## How to Study Unsafe AGI's safely (and why we might have no choice)

4 07 March 2014 07:24AM

TL;DR

A serious possibility is that the first AGI(s) will be developed in a Manhattan Project style setting before any sort of friendliness/safety constraints can be integrated reliably. They will also be substantially short of the intelligence required to exponentially self-improve. Within a certain range of development and intelligence, containment protocols can make them safe to interact with. This means they can be studied experimentally, and the architecture(s) used to create them better understood, furthering the goal of safely using AI in less constrained settings.

Setting the Scene

The year is 2040, and in the last decade a series of breakthroughs in neuroscience, cognitive science, machine learning, and computer hardware have put the long-held dream of a human-level artificial intelligence within our grasp. Following the wild commercial success of lifelike robotic pets, the integration of AI assistants and concierges into everyday work and leisure, and STUDYBOT's graduation from Harvard's online degree program with an octuple major and full honors, DARPA, the NSF, and the European Research Council have announced joint funding of an artificial intelligence program that will create a superhuman intelligence in 3 years.

Safety was announced as a critical element of the project, especially in light of the self-modifying LeakrVirus that catastrophically disrupted markets in '36 and '37. The planned protocols have not been made public, but it seems they will be centered on traditional computer security rather than techniques from the nascent field of Provably Safe AI, which were deemed impossible to integrate on the current project timeline.

Technological and/or political issues could force the development of AI without the theoretical safety guarantees we'd certainly like, but there is a silver lining.

A lot of the discussion around LessWrong and MIRI that I've seen (and I haven't seen all of it, please send links!) seems to focus very strongly on the situation of an AI that can self-modify or construct further AIs, resulting in an exponential explosion of intelligence (FOOM/Singularity). The focus of FAI work is on finding an architecture that can be explicitly constrained (and a constraint set that won't fail to do what we desire).

My argument is essentially that there could be a critical multi-year period preceding any possible exponentially self-improving intelligence during which a series of AGIs of varying intelligence, flexibility and architecture will be built. This period will be fast and frantic, but it will be incredibly fruitful and vital both in figuring out how to make an AI sufficiently strong to exponentially self-improve and in how to make it safe and friendly (or develop protocols to bridge the even riskier period between when we can develop FOOM-capable AIs and when we can ensure their safety).

I'll break this post into three parts.
1. Why is a substantial period of proto-singularity more likely than a straight-to-singularity situation?
2. What strategies will be critical to developing, controlling, and learning from these pre-FOOM AIs?
3. What are the political challenges that will develop immediately before and during this period?
Why is a proto-singularity likely?

The requirement for a hard singularity, an exponentially self-improving AI, is that the AI can substantially improve itself in a way that enhances its ability to further improve itself, which requires the ability to modify its own code; access to resources like time, data, and hardware to facilitate these modifications; and the intelligence to execute a fruitful self-modification strategy.

The first two conditions can (and should) be directly restricted. I'll elaborate more on that later, but basically any AI should be very carefully sandboxed (unable to affect its software environment), and should have its access to resources strictly controlled. Perhaps no data goes in without human approval, or while the AI is running. Perhaps nothing comes out either. Even a hyperpersuasive hyperintelligence will be slowed down (at least) if it can only interact with prespecified tests (how do you test AGI? No idea, but it shouldn't be harder than friendliness). This isn't a perfect situation. Eliezer Yudkowsky presents several arguments for why an intelligence explosion could happen even when resources are constrained (see Section 3 of Intelligence Explosion Microeconomics), not to mention ways that those constraints could be defied even if engineered perfectly. (By the way, I would happily run the AI box experiment with anybody; I think it is absurd that anyone would fail it! I've read Tuxedage's accounts, and I think I actually do understand how a gatekeeper could fail, but I also believe I understand how one could be trained to succeed even against a much stronger foe than any person who has played the part of the AI.)

But the third emerges from the way technology typically develops. I believe it is incredibly unlikely that an AGI will develop in somebody's basement, or even in a small national lab or top corporate lab. When there is no clear notion of what a technology will look like, it is usually not developed. Positive, productive accidents are somewhat rare in science, but they are remarkably rare in engineering (please, give counterexamples!). The creation of an AGI will likely not happen by accident; there will be a well-funded, concrete research and development plan that leads up to it: an AI Manhattan Project as described above. But even when a good plan is successfully executed, prototypes are slow, fragile, and poor-quality compared to what is possible even with approaches using the same underlying technology. It seems very likely to me that the first AGI will be a Chicago Pile, not a Trinity; recognizably a breakthrough, but with proper consideration not immediately dangerous or unmanageable. [Note: you don't have to believe this to read the rest of this post. If you disagree, consider the virtues of redundancy and the question of what safety measures an AI development effort should implement if it can't be persuaded to delay long enough for theoretically sound methods to become available.]

A Manhattan Project style effort makes a relatively weak, controllable AI even more likely, not only because such a project can implement substantial safety protocols researched explicitly in parallel with primary development, but also because the total resources, in hardware and brainpower, devoted to the AI will be much greater than in a smaller project, setting a correspondingly higher bar the resulting AGI must clear to self-modify exponentially and also break the security procedures.

Strategies to handle AIs in the proto-Singularity, and why they're important

First, take a look at the External Constraints section of this MIRI report and/or this article on AI Boxing. I will be talking mainly about these approaches. There are certainly others, but these are the easiest to extrapolate from current computer security.

These AIs will provide us with the experimental knowledge to better handle the construction of even stronger AIs. If careful, we will be able to use these proto-Singularity AIs to learn about the nature of intelligence and cognition, to perform economically valuable tasks, and to test theories of friendliness (not perfectly, but well enough to start).

"If careful" is the key phrase. I mentioned sandboxing above. And computer security is key to any attempt to contain an AI. Monitoring the source code, and setting a threshold for too much changing too fast at which point a failsafe freezes all computation; keeping extremely strict control over copies of the source. Some architectures will be more inherently dangerous and less predictable than others. A simulation of a physical brain, for instance, will be fairly opaque (depending on how far neuroscience has gone) but could have almost no potential to self-improve to an uncontrollable degree if its access to hardware is limited (it won't be able to make itself much more efficient on fixed resources). Other architectures will have other properties. Some will be utility optimizing agents. Some will have behaviors but no clear utility. Some will be opaque, some transparent.

All will have a theory of how they operate, which can be refined by actual experimentation. This is what we can gain! We can set up controlled scenarios, like honeypots, to catch malevolence. We can evaluate our ability to monitor and read the thoughts of the AGI. We can develop stronger theories of how damaging self-modification actually is to imposed constraints. We can test our ability to add constraints even to the base state. But do I really have to justify the value of experimentation?

I am familiar with criticisms based on absolutely, incomprehensibly perceptive and persuasive hyperintelligences being able to overcome any security, but I've tried to outline above why I don't think we'd be dealing with that case.

Political issues

Right now AGI is really a political non-issue: blue sky even compared to space exploration and fusion, both of which actually receive substantial volumes of government funding. I think that this will change in the period immediately leading up to my hypothesized AI Manhattan Project, which can only happen with a lot of political will behind it. That will probably mean a spiral of scientific advancements, hype, and threat of competition from external unfriendly sources. Think space race.

So suppose that the first few AIs are built under well controlled conditions. Friendliness is still not perfected, but we think/hope we've learned some valuable basics. But now people want to use the AIs for something. So what should be done at this point?

I won't try to speculate what happens next (well you can probably persuade me to, but it might not be as valuable), beyond extensions of the protocols I've already laid out, hybridized with notions like Oracle AI. It certainly gets a lot harder, but hopefully experimentation on the first, highly-controlled generation of AI to get a better understanding of their architectural fundamentals, combined with more direct research on friendliness in general would provide the groundwork for this.

## Meetup : March Meetup: Body Hacking!

0 07 March 2014 02:21AM


WHEN: 09 March 2014 08:07:00PM (-0500)

WHERE: TBA

An overview of body hacking, what's possible, what's known, what needs more exploration, and what tools are available to you.

Presenters needed! Do you have expertise on any of this? Lemme know and you can do anything from a full presentation with slides and handouts to leading a discussion on a particular topic.

## Meetup : Canberra: Meta-meetup + meditation

0 07 March 2014 01:04AM


WHEN: 28 March 2014 06:00:00PM (+1100)

WHERE: ANU Arts Centre

Our first regular meetup will have two parts: firstly, we will be discussing what we want to get out of meetups, what sort of things we would like to do in them, and related matters; secondly, we will be taught how to meditate and have a practice session. Vegan snacks will be provided.

General meetup info:

Structured meetups are held on the second Saturday and fourth Friday of each month from 6 pm until 10 pm at the XSite (home of the XSA), located upstairs in the ANU Arts Centre.

There will be LWers at the Computer Science Students Association's weekly board games night, held on Wednesdays from 7 pm in the CSIT building, room N101.

## Amanda Knox Redux: is Satoshi Nakamoto the real Satoshi Nakamoto?

7 06 March 2014 11:33PM

Many of you here have likely heard of Bitcoin, and maybe know something about it.

Earlier today, a story broke that a reporter had apparently tracked down the real Satoshi Nakamoto, infamous creator of the Bitcoin protocol.

This seems like an excellent opportunity to practice our Bayesian updating!

So, how likely do you think it is that this man is the founder of Bitcoin? What do you believe and why?
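To make the updating exercise explicit, here is a toy application of Bayes' theorem. All the numbers below are assumptions invented purely for illustration, not claims about the actual story; substitute your own prior and likelihoods:

```python
# Toy Bayesian update on "this man is the creator of Bitcoin".
# Every number here is a made-up assumption for illustration.
prior = 0.001            # P(this specific man is Satoshi), before the story

# Likelihood of the reported evidence (name match, engineering background,
# the reported quasi-confirmation) under each hypothesis:
p_evidence_if_satoshi = 0.8
p_evidence_if_not = 0.004

# Bayes' theorem: P(H|E) = P(E|H) P(H) / P(E)
posterior = (p_evidence_if_satoshi * prior) / (
    p_evidence_if_satoshi * prior + p_evidence_if_not * (1 - prior)
)
print(round(posterior, 3))  # ≈ 0.167 under these made-up numbers
```

Note the structure of the answer: a likelihood ratio of 200 applied to a one-in-a-thousand prior still leaves you well short of confidence, which is the point of doing the arithmetic rather than jumping straight to "the reporter found him".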

12 06 March 2014 09:40PM

In late December 2013, Jonah, my collaborator at Cognito Mentoring, announced the service on LessWrong. Information about the service was also circulated in other venues with high concentrations of gifted and intellectually curious people. Since then, we've received ~70 emails asking for mentoring from learners across all ages, plus a few parents. At least 40 of our advisees heard of us through LessWrong, and the number is probably around 50. Of the 23 who responded to our advisee satisfaction survey, 16 filled in information on where they'd heard of us, and 14 of those 16 had heard of us from LessWrong. The vast majority of student advisees with whom we had substantive interactions, and the ones we felt we were able to help the most, came from LessWrong (we got some parents through the Davidson Forum post, but that's a very different sort of advising).

In this post, I discuss some common themes that emerged from our interaction with these advisees. Obviously, this isn't a comprehensive picture of the LessWrong community the way that Yvain's 2013 survey results were.

• A significant fraction of the people who contacted us via LessWrong aren't active LessWrong participants, and many don't even have user accounts on LessWrong. The prototypical advisees we got through LessWrong don't have many distinctive LessWrongian beliefs. Many of them use LessWrong primarily as a source of interesting stuff to read, rather than a community to be part of.
• About 25% of the advisees we got through LessWrong were female, and a slightly higher proportion of the advisees with whom we had substantive interaction (and subjectively feel we helped a lot) were female. You can see this by looking at the sex distribution of the public reviews of us from students.
• Our advisees included people in high school (typically, grades 11 and 12) and college. Our advisees in high school tended to be interested in mathematics, computer science, physics, engineering, and entrepreneurship. We did have a few who were interested in economics, philosophy, and the social sciences as well, but this was rarer. Our advisees in college and graduate school were also interested in the above subjects but skewed a bit more in the direction of being interested in philosophy, psychology, and economics.
• Somewhat surprisingly and endearingly, many of our advisees were interested in effective altruism and social impact. Some had already heard of the cluster of effective altruist ideas. Others were interested in generating social impact through entrepreneurship or choosing an impactful career, even though they weren't familiar with effective altruism until we pointed them to it. Of those who had heard of effective altruism as a cluster of ideas, some had either already consulted with or were planning to consult with 80,000 Hours, and were connecting with us largely to get a second opinion or to get opinions on matters other than career choice.
• Some of our advisees had had some sort of past involvement with MIRI/CFAR/FHI. Some were seriously considering working in existential risk reduction or on artificial intelligence. The two subsets overlapped considerably.
• Our advisees were somewhat better educated about rationality issues than we'd expect others of similar academic accomplishment to be, and more than the advisees we got from sources other than LessWrong. That's obviously not a surprise at all.
• We hadn't been expecting it, but many advisees asked us questions related to procrastination, social skills, and other life skills. We were initially somewhat ill-equipped to handle these, but we've built a base of recommendations, with some help from LessWrong and other sources.
• One thing that surprised me personally is that many of these people had never spent time exploring Quora. I'd have expected Quora to be much more widely known and used by the sort of people who were sufficiently aware of the Internet to know LessWrong. But it's possible there's not that much overlap.

My overall takeaway is that LessWrong seems to still be one of the foremost places that smart and curious young people interested in epistemic rationality visit. I'm not sure of the exact reason, though HPMOR probably gets a significant fraction of the credit. As long as things stay this way, LessWrong remains a great way to influence a subset of the young population today that's likely to be disproportionately represented among the decision-makers a few years down the line.

It's not clear to me why they don't participate more actively on LessWrong. Maybe no special reasons are needed: the ratio of lurkers to posters is huge for most Internet fora. Maybe the people who contacted us were relatively young and still didn't have an Internet presence, or were being careful about building one. On the other hand, maybe there is something about the comments culture that dissuades people from participating (this need not be a bad feature per se: one reason people may refrain from participating is that comments are held to a high bar and this keeps people from offering off-the-cuff comments). That said, if people could somehow participate more, LessWrong could transform itself into an interactive forum for smart and curious people that's head and shoulders above all the others.

PS: We've now made our information wiki publicly accessible. It's still in beta and a lot of content is incomplete and there are links to as-yet-uncreated pages all over the place. But we think it might still be interesting to the LessWrong audience.

2 06 March 2014 06:32PM

http://mappingignorance.org/2014/02/03/mandela-was-right-the-foreign-language-effect/

Summary: Across the board, people are less prone to cognitive bias in a non-native language.

Conclusion: If all important discourse were conducted in Latin, or any other language native to no one, people would make better decisions.

Corollary: All the attempts to make a constructed "scientific language" actually could have worked relatively well, for reasons entirely unconnected to the painstaking scientific structure of the languages.

## Meetup : Moscow, Meet up

0 06 March 2014 05:31AM


WHEN: 09 March 2014 04:00:00PM (+0400)

WHERE: Russia, Moscow, ulitsa L'va Tolstogo 16

We will gather at the same second entrance, but we will go to a room inside the building at 16:00. So please do not be late. We will have:

• Report about “Zen to Done”.
• Stumbling on happiness for rationalists presentation.
• Report about "Decisive: How to Make Better Choices in Life and Work" book.
• Cognitive behavioural therapy workshop.

We gather in the Yandex office; you need the second revolving door with the sign “Яндекс” (here is a photo of the entrance you need). You need to pass the first entrance and go through the archway. Here is an additional guide on how to get there: link.

You can fill this one minute form (in Russian), to share your contact information.

We start at 16:00 and sometimes finish at night. Please pay attention that we only gather near the second entrance and then come inside.

## Meetup : Auckland Preliminary Meetup

0 06 March 2014 04:31AM


WHEN: 08 March 2014 02:00:00PM (+1300)

WHERE: Albert Park, Auckland

I got back from the second Melbourne CFAR workshop recently and it was good. It's well worthwhile having a local rationalist community and while there are some good thinkers in my immediate circle of friends, meeting more and learning from each other would be awesome.

I'm not sure if others will be using it, but let's meet near the gazebo in Albert Park at 2pm Saturday. I'll be carrying my CFAR bag and water bottle if you want to come over and say hi. I would be wearing a "Just shy, not antisocial" shirt if I had one. If you're interested in coming, just comment, or come anyway.

Cheers, Marcel.

## How my math skills improved dramatically

19 05 March 2014 08:27PM

When I was a freshman in high school, I was a mediocre math student: I earned a D in second semester geometry and had to repeat the course. By the time I was a senior in high school, I was one of the strongest few math students in my class of ~600 students at an academic magnet high school. I went on to earn a PhD in math. Most people wouldn't have guessed that I could have improved so much, and the shift that occurred was very surreal to me. It’s all the more striking in that the bulk of the shift occurred in a single year. I thought I’d share what strategies facilitated the change.

## Meetup : Washington DC Fun and Games Meetup

0 05 March 2014 05:04PM


WHEN: 09 March 2014 02:00:00PM (-0500)

WHERE: National Portrait Gallery, Washington DC

We'll be meeting to hang out and play games.

## Meetup : Israel Less Wrong Meetup - Social and Board Games

4 05 March 2014 05:01PM


WHEN: 06 March 2014 07:00:00PM (+0200)

We're going to have a meetup on Thursday, March 6th at Google Israel's offices, Electra Tower, 98 Yigal Alon st., Tel Aviv.

This time we're going to have a social meetup! Unlike previous meetups, where we had a set agenda and a talk, this time we'll be socializing and playing games. Specifically, we look forward to playing any cool board or card game anyone will bring.

We'll start the meetup at 19:00, and we'll go on for as long as we like. Feel free to come a little bit later, as there is no agenda. (We've decided to start slightly earlier this time to give us more time and accommodate people with different schedules.)

We'll meet on the 29th floor of the building (note: not the 26th, where Google Campus is). If you arrive and can't find your way around, call Anatoly, who is graciously hosting us, at 054-245-1060.

Things that might happen:

• You'll trade cool ideas with cool people from the Israel LW community.
• You'll discover kindred spirits who agree with you about one/two boxing.
• You'll kick someone's ass (and teach them how you did it) at some awesome boardgame.
• You'll discover how to build a friendly AGI running on cold fusion (well, probably not).

Things that will happen for sure: - You'll get to hang out with awesome people and have fun!

There is also talk of food and beers, and if you'd like to bring some too - that would be great. (But you don't have to).

If you have any question feel free to email me at hochbergg@gmail.com or call me at 054-533-0678 or call Anatoly at 054-245-1060.

See you there!

## Proposal: LW courses

8 05 March 2014 04:18PM

For a long time I have tried to study things on my own, at my own pace. But it was always an uphill struggle against strong akrasia issues, and eventually I came to the conclusion that the only thing that really seems to work is to have externally-imposed deadlines. The only way I could think of to do this was to sign up for classes, so I enrolled in a number of MOOCs. So far this has worked wonders - I went from basically spending most of my time playing around and wasting time, to several recent days where I studied for several hours straight.

The only thing I don't like about this setup is that there's a very limited number of really good MOOCs out there on the subjects I want to study. Also, most MOOCs are geared for a wider audience and are therefore dumbed-down to a certain degree.

So I had the following idea: A lot of us on LW seem to be studying a lot of the same material, whether it's the sequences, MIRI course list, CFAR booklist, or any of the various recommended reading lists. What if those who were studying the same thing would get together and set a schedule for themselves to finish the reading material, complete with deadlines? This might not be a normal "externally imposed" deadline, but at least it's a deadline with some social pressure to back it up. I can't be the only one on LW who could benefit from a deadline.

The details would need to be worked out, but here's a preliminary version of the way I envision it:

• There should be a monthly thread for requests for new classes. The request should specify the text to be used, or it could ask for suggestions for a good text. The request should also specify the approximate pace (very slow - slow - normal - fast - very fast), or an approximate weekly time commitment.
• The next thing that would be needed for each proposed class would be for someone who's already gone through that text to propose a rough calendar for the course. For example, they could say that given the requested pace / time commitment, you should expect to spend about 3 months on that particular text. Also, some chapters are harder than others, so the calendar should specify, for example, that you should expect to spend just one week on Chapters 1-3, but Chapter 7 will need to spread over three weeks. It would also be very useful to specify what prerequisites are needed for that text. (Similar to this thread. Keep in mind that different people have different styles when it comes to prerequisites. Some prefer to do as few prerequisites as possible and then skip straight to the harder stuff, and work backwards / fill in gaps as necessary. Others prefer to carefully cover all lower-level material before even touching the harder stuff. These people will want to know about all possible prerequisites so that they won't have to work backwards at all.)
• I would recommend creating a repository of available course calendars (i.e., course X should be split up this way, course Y should be split up that way, etc.). This can be done by creating a special thread for this purpose and then linking to that thread every time a new "proposed course" thread starts.
• A calendar provides some deadlines, but there needs to be some motivation for keeping to the deadlines. I can think of a few possibilities that might work:
• Social pressure: If there is anybody else in your class other than yourself, there's a certain amount of social pressure to keep up with the group and keep to the agreed-upon deadlines. Classmates can increase this pressure by actively encouraging each other to keep up.
• Social encouragement: As you make each deadline you should report that you did so, and others can then respond with encouragement.
• Karma: If someone makes a deadline they should post an announcement to that effect, and LWers (even those not part of your class, and even those who aren't taking any classes) could be encouraged to upvote the announcement. I haven't been on LW long enough to tell if this is a socially acceptable use of karma points, but this might be motivating for some people.
• Perhaps someone could design a "LW U" badge or something of the sort to post on your personal / social site when you complete a course. (Notice that with the karma or badge reward forms, it becomes possible to have only a single member in a course and they'll still be able to get some form of reward structure. It might not be as effective as having other people in the course, but at least it works.)
• There should be a dedicated thread for each course once it begins. The thread would be used for everything relating to the course: announcing progress, discussing subject-related material, meta-discussions about the course, etc.
• LWers who have already completed the subject / textbook could follow the course discussions and provide guidance and help as needed. Anyone who thinks they can contribute in this "teacher" capacity should let course participants know about it beforehand, as this will provide additional social pressure / support, and provide valuable encouragement (there's someone I can ask my stupid questions to!).
• I'd recommend that once one or more people decide to take a course, they should set a date to start the course that's at least two weeks (maybe a month) in the future. This would give time for others to join. Each month's thread for proposed courses could then include a list of "courses starting soon".
• Perhaps people who have already studied a given text could put together a few quizzes / tests / finals for that text. The quizzes would be sent to individual students at a certain point in the course via private message. Each student would take the quiz on their own (honor system, of course), and the quizzes would then be graded either by the creator of the quiz (the "teacher"), a volunteer TA (using an answer key provided by the teacher), the other students, or even by each student themselves. (I would not recommend this last unless there are no other options, since even very honest people can be sorely tempted to fudge things occasionally in their own favor.) There could even be a final grade for the course. I suspect that this system would create powerful psychological motivation for certain people to work hard at the coursework and complete their work on time.

What do you think about such an idea?

## Meetup : Urbana-Champaign: Discussion

1 05 March 2014 09:14AM


WHEN: 09 March 2014 01:00:02PM (-0600)

WHERE: 412 W. Elm St., Urbana, IL

Starting topic: should we expect there to be easy ways to improve your life (life hacks)? Two articles that bear on this by Gwern and Scott.

Also, I'm going to finish up my sequence of posts on logical uncertainty (starts here) by then, and would be pleased to answer questions.

## LessWrong Hamburg Third Meetup Notes: Small Steps Forward

2 04 March 2014 11:25PM

Review of our third Meetup : LessWrong Hamburg - Structure

## Summary

To make it short: We didn't follow the nice agenda we planned. We did the procrastination topic but diverged a lot.

## Course of events

In the long open beginning (expected) we talked a lot, played some MindTrap and had lunch together.

Then to get started I began the presentation of the main topic of this meetup: procrastination. This was basically a summary of

This led to lots of satellite discussions which partly diverged but mostly centered on examples of procrastination (though afterwards some felt that this got out of hand with too many personal details; this was controversial). This part was all in all very long, but it also led to quite some understanding of the problems of, and strategies against, procrastination.

In the previous meetups there was interest in the topics of, and the objectives behind, LessWrong. To get an authentic handle on the former, I posted a topic poll in the Polling Thread, which I presented briefly (see appendix).

Interesting points we arrived at:

The image of lukeprog's procrastination algorithm led to a discussion of what we called the mathematical fallacy/bias: merely presenting a mathematical formula over named properties of interest creates a false impression of scientific rigor, correctness and precision that just isn't there. This method is sometimes seen in pseudo-science publications to give the impression of science. It is also used in economics to approximate tendencies numerically. The general pattern of the mathematical fallacy is that modelling complex human behavior (like procrastination) in a simple formula is a special case of over-simplification riding piggy-back on the habit of taking formulas at face value. The disclaimer on such formulas (lukeprog actually gave one) just cannot be large enough. In this special case it would have been better to just name the four (nonlinear, cross-related) effects on motivation instead of using the formula (at least until the five quantities in the formula are actually measured or defined in a precise way).

We planned the next meetup for Mar 30th, but the location (near Hamburg central station) is not yet fixed.

We did have more structure than last time, but a review discussion at the end clearly showed that just having one main topic with unmoderated side tracks wasn't enough; everyone preferred a more formal structure, at least for topic presentations. The next meetup will therefore use theme-centered interaction.

Having keen observers of behavior in the group made it possible to pinpoint differences and misunderstandings (some actually involving me) and to address them in a friendly, helpful way.

## Sidetrack

Following the advice from Lifestyle interventions to increase longevity, I had bought an e-cig for the smoker in our group who wants to quit. He received it positively. He enjoyed the near-identical handling, the fact that he could smoke in the room (I didn't notice any bad taste), and that there was no effort to 'light' and 'unlight' it. We discussed afterwards whether it increased or decreased the amount of smoking (I had noticed that he used the e-cig more often, but this may be balanced by a much smaller number of pulls). We promised to measure this.

## Appendix

I presented the following list of LessWrong topics in order of decreasing typicality (most typical for LW first):

1. methods for being less wrong, knowing about biases, fallacies and heuristics

2. methods of self-improvement (if scientifically backed), e.g. living luminously, winning at life, longevity

3. organization and discussion of meetups

4. dealing with procrastination and akrasia

5. statistics, probability theory, decision theory and related mathematical fields

6. topics of associated or related organizations CFAR, MIRI, GiveWell, CEA

7. advancing specific virtues: altruism, mindfulness, empathy, truthfulness, openness

8. artificial intelligence topics esp. if related to AGI, (U)FAI, AI going FOOM (or not)

9. the singularity and transhumanism (includes cryonics as method to get there)

10. rationality applied to social situations in relationships, parenting and small groups

11. (moral) philosophical theories, ethics

12. platform to hangout with like-minded smart people

## Meetup : LW Vienna Meetup

0 04 March 2014 04:33PM

## Discussion article for the meetup : LW Vienna Meetup

WHEN: 15 March 2014 03:00:00PM (+0000)

WHERE: Reichsratsstrasse 17, 1010 Vienna, Austria

Meetup at Cafe Votiv, where Anna Leptikon will present some psychological musings on rationality, followed by a discussion. Newcomers welcome. Please register for the FB event: https://www.facebook.com/events/627247490680732/?context=create

## Open Thread: March 4 - 10

3 04 March 2014 03:55AM

# If it's worth saying, but not worth its own post (even in Discussion), then it goes here.

## Meetup : Tempe, AZ (ASU)

1 03 March 2014 08:46PM

## Discussion article for the meetup : Tempe, AZ (ASU)

WHEN: 07 March 2014 06:30:00PM (-0700)

WHERE: 300 E Orange Mall, Tempe, AZ

We are meeting at the entrance to Hayden Library at ASU. Tentative discussion topics include: wrap up How To Measure Anything; planning / meta; belated New Year's resolutions and/or goals in general

## Meetup : Utrecht: Effective Altruism

2 03 March 2014 07:55PM

## Discussion article for the meetup : Utrecht: Effective Altruism

WHEN: 07 March 2014 07:00:00PM (+0100)

WHERE: Oudegracht 158, Utrecht, the Netherlands

In this meetup we will discuss topics related to effective altruism. This meetup is not directly related to the previous one in Utrecht. It is also not purely a LessWrong meetup. We have already created an event on Facebook, where 5 people are planning to attend (I can add you to the event if you comment here). Most of those are also active on LW and we would be more than happy to have more LWers on board.

Some topics we may discuss are altruistic career choice, selection of causes, and whether we can create an effective altruism community in the Netherlands. Getting to know each other is also an important part.

We will meet in a café called De Winkel van Sinkel, which is 400m walking distance from Utrecht Centraal. The meetup will be held in English, since we have at least one German participant.

I will be holding a sign that says 'LW' on it.

## Akrasia and Immunity to change

5 03 March 2014 04:04PM

Do any of you have any relevant experience with Immunity to Change by Robert Kegan and Lisa Laskow Lahey that you can share?

I'm currently reading their book and I find it fascinating.

Here is an HBR article titled The Real Reason People Won't Change that describes the work.

## Group Rationality Diary, March 1-15

2 02 March 2014 11:56PM

This is the public group instrumental rationality diary for March 1-15.

It's a place to record and chat about it if you have done, or are actively doing, things like:

• Established a useful new habit
• Decided to behave in a different way in some set of situations
• Optimized some part of a common routine or cached behavior
• Consciously changed your emotions or affect with respect to something
• Consciously pursued new valuable information about something that could make a big difference in your life
• Learned something new about your beliefs, behavior, or life that surprised you
• Tried doing any of the above and failed

Or anything else interesting which you want to share, so that other people can think about it, and perhaps be inspired to take action themselves.  Try to include enough details so that everyone can use each other's experiences to learn about what tends to work out, and what doesn't tend to work out.

Thanks to cata for starting the Group Rationality Diary posts, and to commenters for participating.

Immediate past diary:  January 16-31

Rationality diaries archive

## Proportional Giving

10 02 March 2014 09:09PM

Executive summary: The practice of giving a fixed fraction of one's income to charity is near-universal but possibly indefensible. I describe one approach that certainly doesn't defend it, speculate vaguely about a possible way of fixing it up, and invite better ideas from others.

Many of us give a certain fraction of our income to charitable causes. This sort of practice has a long history:

Deuteronomy 14:22 Thou shalt truly tithe all the increase of thy seed, that the field bringeth forth year by year.

(note that "tithe" here means "give one-tenth of") and is widely practised today:

GWWC Pledge: I recognise that I can use part of my income to do a significant amount of good in the developing world. Since I can live well enough on a smaller income, I pledge that from today until the day I retire, I shall give at least ten percent of what I earn to whichever organizations can most effectively use it to help people in developing countries. I make this pledge freely, openly, and without regret.

And of course it's roughly how typical taxation systems (which are kinda-sorta like charitable donation, if you squint) operate. But does it make sense? Is there some underlying principle from which a policy of giving away a certain fraction of one's income (not necessarily the traditional 10%, of course) follows?

The most obvious candidate for such a principle would be what we might call

Weighted Utilitarianism: Act so as to maximize a weighted sum of utility, where (e.g.) one's own utility may be weighted much higher than that of random far-away people.

But this can't produce anything remotely like a policy of proportional giving. Assuming you aren't giving away many millions per year (which is a fair assumption if you're thinking in terms of a fraction of your salary) then the level of utility-per-unit-money achievable by your giving is basically independent of what you give, and so is the weight you attach to the utility of the beneficiaries.

So suppose that when your income, after taking out donations, is $X, your utility (all else equal) is u(X), so that your utility per marginal dollar is u'(X); and suppose you attach weight 1 to your own utility and weight w to that of the people who'd benefit from your donations; and suppose their gain in utility per marginal dollar given is t. Then when your income is S you will set your giving g so that u'(S-g) = wt. What this says is that a weighted-utilitarian should keep a fixed absolute amount S-g of his or her income, and give all the rest away. The fixed absolute amount will depend on the weight w (hence, on exactly which people are benefited by the donations) and on the utility per dollar given t (hence, on exactly what charities are serving them and how severe their need is), but not on the person's pre-donation income S. (Here's a quick oversimplified example. Suppose that utility is proportional to log(income), that the people your donations will help have an income equivalent to $1k/year, that you care 100x more about your utility than about theirs, and that your donations are the equivalent of direct cash transfers to those people. Then u' = 1/income, so you should keep everything up to $100k/year and give the rest away. The generalization to other weighting factors and beneficiary incomes should be obvious.)
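The quick oversimplified example can be checked numerically. A minimal sketch in Python (the function name is mine, not from the post), assuming log utility on both sides and direct cash transfers:

```python
def optimal_kept_income(weight_ratio, beneficiary_income):
    """With u(X) = log(X) we have u'(X) = 1/X. Keep K = S - g where
    u'(K) = w * t, with w = 1/weight_ratio (you weight your own utility
    weight_ratio times higher than the beneficiaries') and
    t = 1/beneficiary_income (their marginal utility per dollar,
    also assuming log utility on their side)."""
    w = 1.0 / weight_ratio
    t = 1.0 / beneficiary_income
    return 1.0 / (w * t)  # solve 1/K = w * t for K

# Caring 100x more about your own utility, beneficiaries at $1k/year:
print(optimal_kept_income(100, 1000))  # ~100000 -> keep up to $100k/year
```

Note that the answer scales linearly in both the weight ratio and the beneficiaries' income, but the pre-donation income S never appears, which is the point of the argument.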

This argument seems reasonably watertight given its premises, but proportional giving is so well-established a phenomenon that we might reasonably trust our predisposition in its favour more than our arguments against. Can we salvage it somehow?

Here's one possibility. One effect of income is (supposedly) to incentivize work, and maybe (mumble near mode mumble) this effect is governed entirely by anticipated personal utility and not by any benefit conferred on others. Then the policy derived above, which above the threshold makes personal utility independent of effort, would lead to minimum effort and hence maybe less net weighted utility than could be attained with a different policy. Does this lead to anything like proportional giving, at least for some semi-plausible assumptions about the relationship between effort and income?

At the moment, I don't know. I have a page full of scribbled attempts to derive something of the kind, but they didn't work out. And of course there might be some better way to get proportional giving out of plausible ethical principles. Anyone want to do better?

## Learning languages efficiently.

3 02 March 2014 03:57PM

I'm not at all sure how this site works yet (I've only been on traditional forums), so bear with me please if I do something foolish. I'm being drafted into the IDF in a few months and I need to learn Hebrew very quickly if I want to avoid being put into a program for foreign speakers. I currently reside in the US, but I've previously lived in (and have citizenship of) both countries.

After experiencing the government-sponsored Hebrew programs, I totally refuse to accept such a ridiculously inefficient and traumatic method of teaching a language. When I get enlisted, I'll want to focus whatever little time I have left on studying more important things. Something that will damage me psychologically, not to mention take up huge amounts of time and effort, will take away any opportunity I might get.

I can speak a few basic phrases in Hebrew and can understand a bit more. Immersion is not an option for me currently. My attempts at teaching myself the language have been stunningly misguided (which is to say, like reading Atlas Shrugged to get a proper understanding of Objectivism) and I'm not interested in a lengthy trial-and-error process. Obviously getting literature on language acquisition is out of the question. I wouldn't even know where to start.

So, I'd just like some methods or heuristics for picking up languages as fast as possible. (I am extremely literate, so there's that.)

## LINK: Misunderstanding the risk of murder recidivism

-4 02 March 2014 01:14PM

People don't want this institutionalized murderer to be freed, because they don't understand that there's no such thing as "zero risk."

http://www.sunnewsnetwork.ca/sunnews/straighttalk/archives/2014/02/20140228-142409.html

## Find a study partner - March 2014 thread

2 02 March 2014 06:00AM

This is the monthly thread to find a study partner.

For the reasons mentioned in So8res' article, as well as for other reasons: studying with a partner can be very good.

So if you're looking for a study partner for an online course, for reading a manual, or for anything else (whether it's on the MIRI course list or not), tell others in the comment section.

The past threads about finding a study partner can be found under the tag study_thread. However, you have a higher probability of finding a study partner in the most recent thread. If you didn't find a study partner last month, you are welcome to post the same comment again here.

## Meetup : Montreal: Let's relax and hang out

0 02 March 2014 02:31AM

## Discussion article for the meetup : Montreal: Let's relax and hang out

WHEN: 02 March 2014 07:00:00PM (-0500)

WHERE: 4109 Ch. de la Côte des Neiges, Apt 14

Meetup.com is currently not working, so here is the info for tomorrow's social meetup. Feel free to bring drinks and snacks. Call me if you get lost: 514 582 1052

## Polling Thread

5 01 March 2014 11:57PM

This is the second installment of the Polling Thread.

There are some rules:

1. Each poll goes into its own top level comment and may be commented there.
2. You must at least vote in all polls that were posted earlier than your own. This ensures participation in all polls and also limits the total number of polls. You may of course vote without posting a poll.
3. Your poll should include a 'don't know' option (to avoid conflict with rule 2). I don't know whether we need to add a troll-catch option here, but we will see.

If you don't know how to make a poll in a comment look at the Poll Markup Help.

This is not (yet?) a regular thread. If it is successful I may post again. Or you may. In that case, do the following:

• Use "Polling Thread" in the title.
• Copy the rules.
• Create a top-level comment saying 'Discussion of this thread goes here; all other top-level comments should be polls or similar'
• Add a second top-level comment with an initial poll to start participation.

## Meetup : Washington DC: Dog Clickers

0 01 March 2014 04:57PM

## Discussion article for the meetup : Washington DC: Dog Clickers

WHEN: 02 March 2014 03:00:00PM (-0500)

WHERE: National Portrait Gallery, Washington, DC 20001, USA

We'll be meeting to see if we can train people to do things with dog clickers.

## Media Thread

4 01 March 2014 03:49PM

This is the monthly thread for posting media of various types that you've found that you enjoy. Post what you're reading, listening to, watching, and your opinion of it. Post recommendations to blogs. Post whatever media you feel like discussing! To see previous recommendations, check out the older threads.

Rules:

• Please avoid downvoting recommendations just because you don't personally like the recommended material; remember that liking is a two-place word. If you can point out a specific flaw in a person's recommendation, consider posting a comment to that effect.
• If you want to post something that (you know) has been recommended before, but have another recommendation to add, please link to the original, so that the reader has both recommendations.
• If you think there should be a thread for a particular genre of media, please post it to the Other Media thread for now, and add a poll to the Meta thread asking if it should be a thread every month.

## Meetup : Urbana-Champaign: Games

0 28 February 2014 09:49PM

## Discussion article for the meetup : Urbana-Champaign: Games

WHEN: 02 March 2014 02:00:00PM (-0600)

WHERE: 40.110430, -88.223784

Available games will include Wits and Wagers, Zendo, Cards Against Humanity, Pandemic, probably Flux, and the closed beta of Dystheism, a cooperative multiplayer puzzle game.

Meetup will be held at my apartment: 300 S. Goodwin Ave, Apt 102, Urbana, IL. Coordinates: 40.110430, -88.223784. At 2 PM, Sunday.

The main entrance to the building requires card access, but there is a door you can knock on where we will hear you at the north end of the west side of the building. If you have any trouble getting in, call (907) 590-0079.

Cross posted on the mailing list.

## Weekly LW Meetups

0 28 February 2014 05:04PM

This summary was posted to LW main on February 21st. The following week's summary is here.

Irregularly scheduled Less Wrong meetups are taking place in:

The remaining meetups take place in cities with regular scheduling, but involve a change in time or location, special meeting content, or simply a helpful reminder about the meetup:

Locations with regularly scheduled meetups: Austin, Berkeley, Berlin, Brussels, Cambridge, MA, Cambridge UK, Columbus, London, Madison WI, Melbourne, Mountain View, New York, Philadelphia, Research Triangle NC, Salt Lake City, Seattle, Toronto, Vienna, Washington DC, Waterloo, and West Los Angeles. There's also a 24/7 online study hall for coworking LWers.

## Meetup : Sydney Meetup - March

2 28 February 2014 08:17AM

## Discussion article for the meetup : Sydney Meetup - March

WHEN: 26 March 2014 06:30:00PM (+1100)

WHERE: Sydney City RSL, 565 George St, Sydney, Australia 2000

So far so good. Our last two meetups have been great, so let's do it one more time.

6:30 PM for early discussion; 7 PM for general dinner discussion. After dinner we'll have our rationality exercise and a more specific discussion topic.

I'll book another table under the name "less wrong". Last meetup we were in the restaurant on level 2. When I arrive I'll facebook about where exactly the table is located.

I'd like to theme this meetup with a sub-goal of outreach.

If you've been thinking of a friend you might like to bring along - this is the night to do it.

We'll have a brief intro for any newbies to the community and maybe have our early discussion be about the community and what we think we can offer.

Afterwards, we'll have the rationality exercise and a more specific discussion topic, TBD by Eliot.

## The sin of updating when you can change whether you exist

7 28 February 2014 01:25AM

Trigger warning: In a thought experiment in this post, I used a hypothetical torture scenario without thinking, even though it wasn't necessary to make my point. Apologies, and thanks to an anonymous user for pointing this out. I'll try to be more careful in the future.

Should you pay up in the counterfactual mugging?

I've always found the argument about self-modifying agents compelling: If you expected to face a counterfactual mugging tomorrow, you would want to choose to rewrite yourself today so that you'd pay up. Thus, a decision theory that didn't pay up wouldn't be reflectively consistent; an AI using such a theory would decide to rewrite itself to use a different theory.

But is this the only reason to pay up? This might make a difference: Imagine that Omega tells you that it threw its coin a million years ago, and would have turned the sky green if it had landed the other way. Back in 2010, I wrote a post arguing that in this sort of situation, since you've always seen the sky being blue, and every other human being has also always seen the sky being blue, everyone has always had enough information to conclude that there's no benefit from paying up in this particular counterfactual mugging, and so there hasn't ever been any incentive to self-modify into an agent that would pay up ... and so you shouldn't.

I've since changed my mind, and I've recently talked about part of the reason for this, when I introduced the concept of an l-zombie, or logical philosophical zombie, a mathematically possible conscious experience that isn't physically instantiated and therefore isn't actually consciously experienced. (Obligatory disclaimer: I'm not claiming that the idea that "some mathematically possible experiences are l-zombies" is likely to be true, but I think it's a useful concept for thinking about anthropics, and I don't think we should rule out l-zombies given our present state of knowledge. More in the l-zombies post and in this post about measureless Tegmark IV.) Suppose that Omega's coin had come up the other way, and Omega had turned the sky green. Then you and I would be l-zombies. But if Omega was able to make a confident guess about the decision we'd make if confronted with the counterfactual mugging (without simulating us, so that we continue to be l-zombies), then our decisions would still influence what happens in the actual physical world. Thus, if l-zombies say "I have conscious experiences, therefore I physically exist", and update on this fact, and if the decisions they make based on this influence what happens in the real world, a lot of utility may potentially be lost. Of course, you and I aren't l-zombies, but the mathematically possible versions of us who have grown up under a green sky are, and they reason the same way as you and me—it's not possible to have only the actual conscious observers reason that way. Thus, you should pay up even in the blue-sky mugging.

But that's only part of the reason I changed my mind. The other part is that while in the counterfactual mugging, the answer you get if you try to use Bayesian updating at least looks kinda sensible, there are other thought experiments in which doing so in the straightforward way makes you obviously bat-shit crazy. That's what I'd like to talk about today.

*

The kind of situation I have in mind involves being able to influence whether you exist, or more precisely, influence whether the version of you making the decision exists as a conscious observer (or whether it's an l-zombie).

Suppose that you wake up and Omega explains to you that it's kidnapped you and some of your friends back in 2014, and put you into suspension; it's now the year 2100. It then hands you a little box with a red button, and tells you that if you press that button, Omega will slowly torture you and your friends to death; otherwise, you'll be able to live out a more or less normal and happy life (or to commit painless suicide, if you prefer). Furthermore, it explains that one of two things has happened: Either (1) humanity has undergone a positive intelligence explosion, and Omega has predicted that you will press the button; or (2) humanity has wiped itself out, and Omega has predicted that you will not press the button. In any other scenario, Omega would still have woken you up at the same time, but wouldn't have given you the button. Finally, if humanity has wiped itself out, it won't let you try to "reboot" it; in this case, you and your friends will be the last humans.

There's a correct answer to what to do in this situation, and it isn't to decide that Omega's just given you anthropic superpowers to save the world. But that's what you get if you try to update in the most naive way: If you press the button, then (2) becomes extremely unlikely, since Omega is really really good at predicting. Thus, the true world is almost certainly (1); you'll get tortured, but humanity survives. For great utility! On the other hand, if you decide to not press the button, then by the same reasoning, the true world is almost certainly (2), and humanity has wiped itself out. Surely you're not selfish enough to prefer that?

The correct answer, clearly, is that your decision whether to press the button doesn't influence whether humanity survives, it only influences whether you get tortured to death. (Plus, of course, whether Omega hands you the button in the first place!) You don't want to get tortured, so you don't press the button. Updateless reasoning gets this right.

*

Let me spell out the rules of the naive Bayesian decision theory ("NBDT") I used there, in analogy with Simple Updateless Decision Theory (SUDT). First, let's set up our problem in the SUDT framework. To simplify things, we'll pretend that FOOM and DOOM are the only possible things that can happen to humanity. In addition, we'll assume that there's a small probability $\textstyle \varepsilon$ that Omega makes a mistake when it tries to predict what you will do if given the button. Thus, the relevant possible worlds are $\textstyle \Omega = \{\mathrm{foom}, \mathrm{doom}\} \times \{\mathrm{correct},\mathrm{incorrect}\}$. The precise probabilities you assign to these don't matter very much; I'll pretend that FOOM and DOOM are equiprobable, $\textstyle \mathbb{P}(x,\mathrm{incorrect}) = \varepsilon/2$ and $\textstyle \mathbb{P}(x,\mathrm{correct}) = (1-\varepsilon)/2$.

There's only one situation in which you need to make a decision, $\textstyle \mathcal{I} = \{*\}$; I won't try to define NBDT when there is more than one situation. Your possible actions in this situation are to press or to not press the button, $\textstyle \mathcal{A}(*) = \{P,\neg P\}$, so the only possible policies are $\textstyle \pi_P$, which presses the button ($\textstyle \pi_P(*) = P$), and $\textstyle \pi_{\neg P}$, which doesn't ($\textstyle \pi_{\neg P}(*) = \neg P$); $\textstyle \Pi = \{\pi_P,\pi_{\neg P}\}$.

There are four possible outcomes, specifying (a) whether humanity survives and (b) whether you get tortured: $\textstyle \mathcal{O} = \{\mathrm{foom}, \mathrm{doom}\} \times \{\mathrm{torture},\neg\mathrm{torture}\}$. Omega only hands you the button if FOOM and it predicts you'll press it, or DOOM and it predicts you won't. Thus, the only cases in which you'll get tortured are $\textstyle o((\mathrm{foom},\mathrm{correct}),\pi_P) = (\mathrm{foom},\mathrm{torture})$ and $\textstyle o((\mathrm{doom},\mathrm{incorrect}),\pi_P) = (\mathrm{doom},\mathrm{torture})$. For any other $\textstyle x\in\{\mathrm{foom},\mathrm{doom}\}$, $\textstyle y\in\{\mathrm{correct},\mathrm{incorrect}\}$, and $\textstyle \pi\in\Pi$, we have $\textstyle o((x,y),\pi) = (x,\neg\mathrm{torture})$.

Finally, let's define our utility function by $u((\mathrm{foom},\neg\mathrm{torture})) = L$, $u((\mathrm{foom},\mathrm{torture})) = L-1$, $u((\mathrm{doom},\neg\mathrm{torture})) = -L$, and $u((\mathrm{doom},\mathrm{torture})) = -L-1$, where $\textstyle L$ is a very large number.

This suffices to set up an SUDT decision problem. There are only two possible worlds $\textstyle \omega\in\Omega$ where $\textstyle u(o(\omega,\pi_P))$ differs from $\textstyle u(o(\omega,\pi_{\neg P}))$, namely $\textstyle (\mathrm{foom},\mathrm{correct})$ and $\textstyle (\mathrm{doom},\mathrm{incorrect})$, where $\textstyle \pi_P$ results in torture and $\textstyle \pi_{\neg P}$ doesn't. In each of these cases, the utility of $\textstyle \pi_P$ is lower (by one) than that of $\textstyle \pi_{\neg P}$. Hence, $\textstyle \mathbb{E}[u(o(\boldsymbol{\omega},\pi_P))] < \mathbb{E}[u(o(\boldsymbol{\omega},\pi_{\neg P}))]$, implying that SUDT says you should choose $\textstyle \pi_{\neg P}$.
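As a sanity check, the SUDT comparison can be computed directly; a minimal Python sketch (the particular values of epsilon and L, and all the names, are illustrative assumptions, not from the post):

```python
from itertools import product

# Illustrative values: small prediction-error probability, large utility scale.
eps, L = 0.01, 10**6

worlds = list(product(["foom", "doom"], ["correct", "incorrect"]))
prior = {(x, y): (1 - eps) / 2 if y == "correct" else eps / 2
         for (x, y) in worlds}

def outcome(world, press):
    """Omega hands you the button only in (foom, correct) and
    (doom, incorrect); pressing it there means torture."""
    x, y = world
    torture = press and (x, y) in {("foom", "correct"), ("doom", "incorrect")}
    return x, torture

def u(out):
    x, torture = out
    return (L if x == "foom" else -L) - (1 if torture else 0)

def sudt_eu(press):
    # SUDT: expected utility under the prior, no updating.
    return sum(prior[w] * u(outcome(w, press)) for w in worlds)

print(sudt_eu(True), sudt_eu(False))  # not pressing comes out ahead
```

The FOOM and DOOM terms cancel for both policies, so the comparison reduces to the small torture penalty, which only the pressing policy pays.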

*

For NBDT, we need to know how to update, so we need one more ingredient: a function specifying in which worlds you exist as a conscious observer. In anticipation of future discussions, I'll write this as a function $\textstyle \mu(i;\omega,\pi)$, which gives the "measure" ("amount of magical reality fluid") of the conscious observation $\textstyle i\in\mathcal{I}$ if policy $\textstyle \pi\in\Pi$ is executed in the possible world $\textstyle \omega\in\Omega$. In our case, $\textstyle i = *$ and $\textstyle \mu(*;\omega,\pi)\in\{0,1\}$, indicating non-existence and existence, respectively. We can interpret $\textstyle \mu(i;\omega,\pi)$ as the conditional probability of making observation $\textstyle i$, given that the true world is $\textstyle \omega$, if plan $\textstyle \pi$ is executed. In our case, $\textstyle \mu(*;(\mathrm{foom},\mathrm{correct}),\pi_P) =$ $\textstyle \mu(*;(\mathrm{foom},\mathrm{incorrect}),\pi_{\neg P}) =$ $\textstyle \mu(*;(\mathrm{doom},\mathrm{correct}),\pi_{\neg P}) =$ $\textstyle \mu(*;(\mathrm{doom},\mathrm{incorrect}),\pi_P) = 1$, and $\textstyle \mu(*;\omega,\pi) = 0$ in all other cases.

Now, we can use Bayes' theorem to calculate the posterior probability of a possible world, given information $\textstyle i = *$ and policy $\textstyle \pi$: $\textstyle \mathbb{P}(\omega\mid i;\pi) = \mathbb{P}(\omega)\cdot\mu(i;\omega,\pi) / \sum_{\omega'\in\Omega} \mathbb{P}(\omega')\cdot\mu(i;\omega',\pi)$. NBDT tells us to choose the policy $\textstyle \pi$ that maximizes the posterior expected utility, $\textstyle \mathbb{E}[u(o(\boldsymbol{\omega},\pi))\mid i;\pi]$.

In our case, we have $\textstyle \mathbb{P}((\mathrm{foom},\mathrm{correct}) \mid *;\pi_P) = \mathbb{P}((\mathrm{doom},\mathrm{correct}) \mid *;\pi_{\neg P}) = 1-\varepsilon$ and $\textstyle \mathbb{P}((\mathrm{doom},\mathrm{incorrect}) \mid *;\pi_P) = \mathbb{P}((\mathrm{foom},\mathrm{incorrect}) \mid *;\pi_{\neg P}) = \varepsilon$. Thus, if we press the button, our expected utility is dominated by the near-certainty of humanity surviving, whereas if we don't, it's dominated by humanity's near-certain doom, and NBDT says we should press.
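The naive-Bayesian calculation can also be checked numerically; a minimal sketch (again with illustrative values of epsilon and L, not taken from the post):

```python
# Illustrative values: small prediction-error probability, large utility scale.
eps, L = 0.01, 10**6

worlds = [("foom", "correct"), ("foom", "incorrect"),
          ("doom", "correct"), ("doom", "incorrect")]
prior = {(x, y): (1 - eps) / 2 if y == "correct" else eps / 2
         for (x, y) in worlds}

def mu(world, press):
    """1 if you exist as a conscious observer holding the button in this
    world under this policy, 0 if you'd be an l-zombie."""
    button_worlds = ({("foom", "correct"), ("doom", "incorrect")} if press
                     else {("foom", "incorrect"), ("doom", "correct")})
    return 1.0 if world in button_worlds else 0.0

def u(world, press):
    x, _ = world
    torture = press and mu(world, press) == 1.0  # pressing while existing
    return (L if x == "foom" else -L) - (1 if torture else 0)

def nbdt_eu(press):
    # Posterior expected utility: condition on existing as an observer.
    z = sum(prior[w] * mu(w, press) for w in worlds)  # normalizer, here 1/2
    return sum(prior[w] * mu(w, press) * u(w, press) for w in worlds) / z

print(nbdt_eu(True) > nbdt_eu(False))  # True: NBDT says press
```

Conditioning on the observation concentrates almost all posterior mass on the FOOM world when you press and on the DOOM world when you don't, which is exactly the "anthropic superpowers" mistake described above.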

*

But maybe it's not updating that's bad, but NBDT's way of implementing it? After all, we get the clearly wacky results only if our decisions can influence whether we exist, and perhaps the way that NBDT extends the usual formula to this case happens to be the wrong way to extend it.

One thing we could try is to mark a possible world $\textstyle \omega$ as impossible only if $\textstyle \mu(*;\omega,\pi) = 0$ for all policies $\textstyle \pi$ (rather than: for the particular policy $\textstyle \pi$ whose expected utility we are computing). But this seems very ad hoc to me. (For example, this could depend on which set of possible actions $\textstyle \mathcal{A}(*)$ we consider, which seems odd.)

There is a much more principled possibility, which I'll call pseudo-Bayesian decision theory, or PBDT. PBDT can be seen as re-interpreting updating as saying that you're indifferent about what happens in possible worlds in which you don't exist as a conscious observer, rather than ruling out those worlds as impossible given your evidence. (A version of this idea was recently brought up in a comment by drnickbone, though I'd thought of this idea myself during my journey towards my current position on updating, and I imagine it has also appeared elsewhere, though I don't remember any specific instances.) I have more than one objection to PBDT, but the simplest one to argue is that it doesn't solve the problem: it still believes that it has anthropic superpowers in the problem above.

Formally, PBDT says that we should choose the policy $\textstyle \pi$ that maximizes $\textstyle \mathbb{E}[u(o(\boldsymbol{\omega},\pi))\cdot\mu(*;\boldsymbol{\omega},\pi)]$ (where the expectation is with respect to the prior, not the updated, probabilities). In other words, we set the utility of any outcome in which we don't exist as a conscious observer to zero; we can see PBDT as SUDT with modified outcome and utility functions.

When our existence is independent of our decisions—that is, if $\textstyle \mu(*;\omega,\pi)$ doesn't depend on $\textstyle \pi$—then it turns out that PBDT and NBDT are equivalent, i.e., PBDT implements Bayesian updating. That's because in that case, $\textstyle \mathbb{E}[u(o(\boldsymbol{\omega},\pi))\mid *;\pi] =$ $\textstyle \sum_{\omega\in\Omega} u(o(\omega,\pi))\cdot\mathbb{P}(\omega\mid *;\pi)$ $\textstyle = \sum_{\omega\in\Omega} u(o(\omega,\pi))\cdot\mathbb{P}(\omega)\cdot \mu(*;\omega,\pi) / \sum_{\omega'\in\Omega} \mathbb{P}(\omega')\cdot\mu(*;\omega',\pi)$. If $\textstyle \mu(*;\omega,\pi)$ doesn't depend on $\textstyle \pi$, then the whole denominator doesn't depend on $\textstyle \pi$, so the fraction is maximized if and only if the numerator is. But the numerator is $\textstyle \sum_{\omega\in\Omega} u(o(\omega,\pi))\cdot\mathbb{P}(\omega)\cdot \mu(*;\omega,\pi) =$ $\textstyle \mathbb{E}[u(o(\boldsymbol{\omega},\pi))\cdot\mu(*;\boldsymbol{\omega},\pi)]$, exactly the quantity that PBDT says should be maximized.
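The equivalence is easy to check numerically on a toy problem with random utilities, as long as $\mu$ is held fixed across policies (all the worlds, policies, and values below are made up for illustration):

```python
# Check that PBDT and NBDT rank policies identically when
# mu(*; omega, pi) does not depend on pi. Priors and utilities are
# random toy values; mu is a fixed indicator over worlds.
import random

random.seed(0)
n_worlds, policies = 5, ["a", "b", "c"]
weights = [random.random() for _ in range(n_worlds)]
prior = [x / sum(weights) for x in weights]
mu = [1.0, 0.0, 1.0, 1.0, 0.0]  # does NOT depend on the policy
util = {pi: [random.random() for _ in range(n_worlds)] for pi in policies}

def pbdt(pi):
    """PBDT score: E[u * mu] under the prior."""
    return sum(prior[w] * util[pi][w] * mu[w] for w in range(n_worlds))

def nbdt(pi):
    """NBDT score: E[u | *; pi] = E[u * mu] / E[mu]."""
    denom = sum(prior[w] * mu[w] for w in range(n_worlds))
    return pbdt(pi) / denom

# Same denominator for every policy, so the rankings must agree.
assert sorted(policies, key=pbdt) == sorted(policies, key=nbdt)
```

Dividing every policy's score by the same positive constant can never change which policy comes out on top, which is all the derivation above says.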

Unfortunately, although in our problem above $\mu(*;\omega,\pi)$ does depend on $\pi$, the denominator as a whole still doesn't: For both $\pi_P$ and $\pi_{\neg P}$, there is exactly one possible world with probability $(1-\varepsilon)/2$ and one possible world with probability $\varepsilon/2$ in which $*$ is a conscious observer, so we have $\textstyle\sum_{\omega'\in\Omega} \mathbb{P}(\omega')\cdot\mu(*;\omega',\pi) = 1/2$ for both $\pi\in\Pi$. Thus, PBDT gives the same answer as NBDT, by the same mathematical argument as in the case where we can't influence our own existence. If you think of PBDT as SUDT with the utility function $u(o(\omega,\pi))\cdot\mu(*;\omega,\pi)$, then intuitively, PBDT can be thought of as reasoning, "Sure, I can't influence whether humanity is wiped out; but I can influence whether I'm an l-zombie or a conscious observer; and who cares what happens to humanity if I'm not? Best to press the button, since getting tortured in a world where there's been a positive intelligence explosion is much better than life without torture if humanity has been wiped out."
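The constant denominator can be exhibited directly. As before, the utility numbers here are illustrative, hypothetical values; only the probability structure comes from the problem:

```python
# PBDT in the button problem: each policy leaves exactly one
# (1-eps)/2-world and one eps/2-world in which * is conscious, so the
# normalizing denominator is 1/2 either way, and PBDT agrees with NBDT.
EPS = 0.01   # probability that Omega's prediction is incorrect
HALF = 0.5   # prior probability of each branch

# (prior prob of world, utility there), listing only the worlds in
# which * is a conscious observer under the given policy.
worlds = {
    "press":  [((1 - EPS) * HALF, 0.9),   # foom, correct, tortured
               (EPS * HALF,       0.1)],  # doom, incorrect
    "refuse": [((1 - EPS) * HALF, 0.0),   # doom, correct
               (EPS * HALF,       1.0)],  # foom, incorrect
}

def pbdt_value(policy):
    """E[u * mu] over the prior; mu = 1 exactly in the listed worlds."""
    return sum(p * u for p, u in worlds[policy])

def denom(policy):
    """E[mu]: total prior mass of worlds where * is conscious."""
    return sum(p for p, _ in worlds[policy])

assert abs(denom("press") - 0.5) < 1e-12
assert abs(denom("refuse") - 0.5) < 1e-12
assert pbdt_value("press") > pbdt_value("refuse")
```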

I think that's a pretty compelling argument against PBDT, but even leaving it aside, I don't like PBDT at all. I see two possible justifications for PBDT: You can either say that $u(o(\omega,\pi))\cdot\mu(*;\omega,\pi)$ is your real utility function—you really don't care about what happens in worlds where the version of you making the decision doesn't exist as a conscious observer—or you can say that your real preferences are expressed by $u(o(\omega,\pi))$, and multiplying by $\mu(*;\omega,\pi)$ is just a mathematical trick to express a steelmanned version of Bayesian updating. If your preferences really are given by $u(o(\omega,\pi))\cdot\mu(*;\omega,\pi)$, then fine, and you should be maximizing $\textstyle \mathbb{E}[u(o(\boldsymbol{\omega},\pi))\cdot\mu(*;\omega,\pi)]$ (because you should be using (S)UDT), and you should press the button. Some kind of super-selfish agent, who doesn't care a fig even about a version of itself that is exactly the same up till five seconds ago (but then wasn't handed the button) could indeed have such preferences. But I think these are wacky preferences, and you don't actually have them. (Furthermore, if you did have them, then $u(o(\omega,\pi))\cdot\mu(*;\omega,\pi)$ would be your actual utility function, and you should be writing it as just $u(o(\omega,\pi))$, where $o(\omega,\pi)$ must now give information about whether $*$ is a conscious observer.)

If multiplying by $\mu(*;\omega,\pi)$ is just a trick to implement updating, on the other hand, then I find it strange that it introduces a new concept that doesn't occur at all in classical Bayesian updating, namely the utility of a world in which $*$ is an l-zombie. We've set this to zero, which is no loss of generality because classical utility functions don't change their meaning if you add or subtract a constant, so whenever you have a utility function where all worlds in which $*$ is an l-zombie have the same utility $u_0$, then you can just subtract $u_0$ from all utilities (without changing the meaning of the utility function), and get a function where that utility is zero. But that means that the utility functions I've been plugging into PBDT above do change their meaning if you add a constant to them. You can set up a problem where the agent has to decide whether to bring itself into existence or not (Omega creates it iff it predicts that the agent will press a particular button), and in that case the agent will decide to do so iff the world has utility greater than zero—clearly not invariant under adding and subtracting a constant. I can't find any concept like the utility of not existing in my intuitions about Bayesian updating (though I can find such a concept in my intuitions about utility, but regarding that see the previous paragraph), so if PBDT is just a mathematical trick to implement these intuitions, where does that utility come from?
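The failure of shift-invariance is easy to make concrete in the self-creation problem (a toy sketch with hypothetical numbers, not a claim about any particular formalization):

```python
# Toy self-creation problem: Omega creates the agent iff it predicts
# the agent will press. Under PBDT, refusing makes the agent an
# l-zombie everywhere, scoring exactly 0, so it presses iff the
# world's utility exceeds 0 -- and adding a constant to the utility
# function flips the decision.

def pbdt_choice(u_world):
    """Press iff the created world's utility beats the l-zombie's 0."""
    return "press" if u_world > 0.0 else "refuse"

assert pbdt_choice(1.0) == "press"
# Same preferences shifted by a constant -5: the decision flips.
assert pbdt_choice(1.0 - 5.0) == "refuse"
```

A classical utility function would represent the same preferences before and after the shift, so whatever the multiplication by $\mu$ is doing, it isn't just Bayesian updating.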

I'm not aware of a way of implementing updating in general SUDT-style problems that does better than NBDT, PBDT, and the ad-hoc idea mentioned above, so for now I've concluded that in general, trying to update is just hopeless, and we should be using (S)UDT instead. In classical decision problems, where there are no acausal influences, (S)UDT will of course behave exactly as if it did do a Bayesian update; thus, in a sense, using (S)UDT can also be seen as a reinterpretation of Bayesian updating (in this case just as updateless utility maximization in a world where all influence is causal), and that's the way I think about it nowadays.

## [LINK] Joseph Bottum on Politics as the Mindkiller

2 27 February 2014 07:40PM

One of my favourite Less Wrong articles is Politics is the mindkiller. Part of the reason that political discussion is so bad is the poor incentives - if you have little chance to change the outcome, then there is little reason to strive for truth or accuracy - but a large part of the reason is our pre-political attitudes and dispositions. I don't mean to suggest that there is a neat divide; clearly, there is a reciprocal relation between the incentives within political discussion and our view of the appropriate purpose and scope of politics. Nevertheless, I think it's a useful distinction to make, and so I applaud the fact that Eliezer doesn't start his essays on the subject by talking about incentives, feedback or rational irrationality - instead he starts with the fact that our approach to politics is instinctively tribal.

This brings me to Joseph Bottum's excellent recent article in The American, The Post-Protestant Ethic and Spirit of America. This charts what he sees as the tribal changes within America that have shaped current attitudes to politics. I think it's best seen in conjunction with Arnold Kling's excellent The Three Languages of Politics; while Kling talks about the political language and rhetoric of modern American political groupings, Bottum's essay is more about the social changes that have led to these kinds of language and rhetoric.

We live in what can only be called a spiritual age, swayed by its metaphysical fears and hungers, when we imagine that our ordinary political opponents are not merely mistaken, but actually evil. When we assume that past ages, and the people who lived in them, are defined by the systematic crimes of history. When we suppose that some vast ethical miasma, racism, radicalism, cultural self-hatred, selfish blindness, determines the beliefs of classes other than our own. When we can make no rhetorical distinction between absolute wickedness and the people with whom we disagree. The Republican Congress is the Taliban. President Obama is a Communist. Wisconsin’s governor is a Nazi.

...

The real question, of course, is how and why this happened. How and why politics became a mode of spiritual redemption for nearly everyone in America, but especially for the college-educated upper-middle class, who are probably best understood not as the elite, but as the elect, people who know themselves as good, as relieved of their spiritual anxieties by their attitudes toward social problems.

Video of a related lecture can also be found here.

## The Rationality Wars

18 27 February 2014 05:08PM

Ever since Tversky and Kahneman started to gather evidence purporting to show that humans suffer from a large number of cognitive biases, other psychologists and philosophers have criticized these findings. For instance, philosopher L. J. Cohen argued in the 80's that there was something conceptually incoherent with the notion that most adults are irrational (with respect to a certain problem). By some sort of Wittgensteinian logic, he thought that the majority's way of reasoning is by definition right. (Not a high point in the history of analytic philosophy, in my view.) See chapter 8 of this book (where Gigerenzer, below, is also discussed).

Another attempt to resurrect human rationality is due to Gerd Gigerenzer and other psychologists. They have a) shown that if you tweak the heuristics-and-biases experiments (i.e. those of the research program led by Tversky and Kahneman) only a little - for instance by expressing probabilities in terms of frequencies - people make far fewer mistakes, and b) argued, on the back of this, that the heuristics we use are in many situations good (and fast and frugal) rules of thumb (which explains why they are evolutionarily adaptive). Regarding this, I don't think that Tversky and Kahneman ever doubted that the heuristics we use are quite useful in many situations. Their point was rather that there are lots of naturally occurring set-ups which fool our fast and frugal heuristics. Gigerenzer's findings are not completely uninteresting - it seems to me he does nuance the thesis of massive irrationality a bit - but his claims to the effect that these heuristics are rational in a strong sense are wildly overblown in my opinion. The Gigerenzer vs. Tversky/Kahneman debates are well discussed in this article (although I think they're too kind to Gigerenzer).
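The frequency-format point can be illustrated with a standard base-rate problem (the numbers below are hypothetical, chosen only for round counts): the probability calculation and the natural-frequency count give exactly the same posterior, but people find the count far easier to get right.

```python
# The same posterior computed two ways: Bayes' rule on probabilities,
# and a "natural frequency" count over an imagined population of 1000.
# Numbers are hypothetical, chosen only to make the counts near-whole.
base_rate, sensitivity, false_pos_rate = 0.01, 0.8, 0.096

# Probability format: Bayes' rule.
posterior = (sensitivity * base_rate) / (
    sensitivity * base_rate + false_pos_rate * (1 - base_rate))

# Frequency format: of 1000 people, 10 have the condition, 8 of them
# test positive; about 95 of the other 990 also test positive.
pop = 1000
true_pos = sensitivity * (base_rate * pop)
false_pos = false_pos_rate * ((1 - base_rate) * pop)
posterior_freq = true_pos / (true_pos + false_pos)

assert abs(posterior - posterior_freq) < 1e-9
```

Both routes land on a posterior of roughly 8%; reformatting the inputs as counts changes nothing mathematically, which is why Tversky and Kahneman could grant the finding without conceding the thesis.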

A strong argument against attempts to save human rationality is the argument from individual differences, championed by Keith Stanovich. He argues that the fact that some intelligent subjects consistently avoid falling prey to the Wason selection task, the conjunction fallacy, and other fallacies indicates that there is something wrong with the notion that the answer psychologists have traditionally seen as normatively correct is in fact misguided.

Hence I side with Tversky and Kahneman in this debate. Let me just mention one interesting and possibly successful method for disputing some supposed biases. This method is to argue that people have other kinds of evidence than the standard interpretation assumes, and that given this new interpretation of the evidence, the supposed bias in question is in fact not a bias. For instance, it has been suggested that the "false consensus effect" can be re-interpreted in this way:

The False Consensus Effect

Bias description: People tend to imagine that everyone responds the way they do. They tend to see their own behavior as typical. The tendency to exaggerate how common one’s opinions and behavior are is called the false consensus effect. For example, in one study, subjects were asked to walk around on campus for 30 minutes, wearing a sign board that said "Repent!". Those who agreed to wear the sign estimated that on average 63.5% of their fellow students would also agree, while those who disagreed estimated 23.3% on average.

Counterclaim (Dawes & Mulford, 1996): The correctness of reasoning is not estimated on the basis of whether or not one arrives at the correct result. Instead, we look at whether people reach reasonable conclusions given the data they have. Suppose we ask people to estimate whether an urn contains more blue balls or red balls, after allowing them to draw one ball. If one person first draws a red ball, and another person draws a blue ball, then we should expect them to give different estimates. In the absence of other data, you should treat your own preferences as evidence for the preferences of others. Although the actual mean for people willing to carry a sign saying "Repent!" probably lies somewhere in between the estimates given, these estimates are quite close to the one-third and two-thirds estimates that would arise from a Bayesian analysis with a uniform prior distribution of belief. A study by the authors suggested that people do actually give their own opinion roughly the right amount of weight.

(The quote is from an excellent Less Wrong article on this topic due to Kaj Sotala. See also this post by him, this by Andy McKenzie, this by Stuart Armstrong and this by lukeprog on this topic. I'm sure there are more that I've missed.)
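The one-third and two-thirds figures in the counterclaim are just Laplace's rule of succession: with a uniform prior on the population fraction and a single observation (yourself), the posterior mean lands at 2/3 for your own choice. A minimal check:

```python
# Posterior mean of the "fraction like me" after one observation,
# starting from a uniform Beta(1, 1) prior. One "success" gives a
# Beta(2, 1) posterior with mean 2/3; one "failure" gives Beta(1, 2)
# with mean 1/3.

def posterior_mean(successes, failures, a=1.0, b=1.0):
    """Mean of the Beta(a + successes, b + failures) posterior."""
    return (a + successes) / (a + successes + b + failures)

assert abs(posterior_mean(1, 0) - 2 / 3) < 1e-12  # "I agreed to wear it"
assert abs(posterior_mean(0, 1) - 1 / 3) < 1e-12  # "I refused"
```

The subjects' 63.5% and 23.3% estimates sit close to these Bayesian benchmarks, which is the heart of the Dawes & Mulford reinterpretation.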

It strikes me that the notion that people are "massively flawed" is something of an intellectual cornerstone of the Less Wrong community (e.g. note the names "Less Wrong" and "Overcoming Bias"). In the light of this it would be interesting to hear what people have to say about the rationality wars. Do you all agree that people are massively flawed?

Let me make two final notes to keep in mind when discussing these issues. Firstly, even though the heuristics and biases program is sometimes seen as pessimistic, one could turn the tables around: if they're right, we should be able to improve massively (even though Kahneman himself seems to think that that's hard to do in practice). I take it that CFAR and lots of LessWrongers who attempt to "refine their rationality" assume that this is the case. On the other hand, if Gigerenzer or Cohen are right, and we already are very rational, then it would seem that it is hard to do much better. So in a sense the latter are more pessimistic (and conservative) than the former.

Secondly, note that parts of the rationality wars seem to be merely verbal and revolve around how "rationality" is to be defined (tabooing this word is very often a good idea). The real question is not if the fast and frugal heuristics are in some sense rational, but whether there are other mental algorithms which are more reliable and effective, and whether it is plausible to assume that we could learn to use them on a large scale instead.

## Meetup : London Games Meetup 09/03 [VENUE CHANGE: PENDEREL'S OAK!], + Social 16/02

2 27 February 2014 04:07PM

## Discussion article for the meetup : London Games Meetup 09/03, + Socials 02/03 and 16/02

WHEN: 09 March 2014 02:00:00PM (+0000)

WHERE: 283-288 High Holborn, City of London, WC1V 7HP

LessWrong London's next non-social gathering is going to be on the 9th of March and is going to be a Games Meetup at a new location - The Penderel's Oak, a pub located about 5-10 minutes away from our usual spot, in the middle between Chancery Lane and Holborn stations (I'd recommend looking at the map to get a better idea of the location).

Thanks to Phil we have a wide range of choices. The main ones are Resistance, Coup and Zendo. Alternatively, we will be able to play Ingenious, Go, Diplomacy (only if people insist on it) or card games.

We are also having socials on the 16th of March as the Meetups are currently a weekly event.

If you have trouble finding us - feel free to call or text me on 07425168803.

## Link: Poking the Bear (Podcast)

-3 27 February 2014 03:43PM

A Dan Carlin Podcast about how the United States is foolishly antagonizing the Russians over Ukraine.  Carlin makes an analogy as to how the United States would feel if Russia helped overthrow the government of Mexico to install an anti-American government under conditions that might result in a Mexican civil war.  Because of the Russian nuclear arsenal, even a tiny chance of a war between the United States and Russia has a huge negative expected value.

## Meetup : Saint Petersburg sunday meetup

0 27 February 2014 11:41AM

## Discussion article for the meetup : Saint Petersburg sunday meetup

WHEN: 01 March 2014 04:00:00PM (+0400)

WHERE: Saint Petersburg, Tekhnologichesky Institut metro station, 1-ya Krasnoarmeyskaya St., building 15

With high probability we will be playing Zendo. We will also be studying CFAR's publicly available materials.

If you are a Russian lesswronger who sees this announcement for the first time, please check out our newsletter or our vk group at http://vk.com/lw_spb for more detailed descriptions.

If you are a foreign guest in Saint Petersburg, we would also all be glad to see you and to meet you - at least some of our attendees speak English.

## Meetup : Berkeley: Implementation Intentions

1 27 February 2014 07:06AM

## Discussion article for the meetup : Berkeley: Implementation Intentions

WHEN: 05 March 2014 07:00:00PM (-0800)

WHERE: 2030 Addison, 3rd floor, Berkeley, CA

Hello all, next week's meetup will be about implementation intentions:

http://en.wikipedia.org/wiki/Implementation_intention

It sounds boring but in fact it's a technique for changing your behavior that produces half a standard deviation of change in studies with a minimal intervention. Basically, there's a good chance that coming to this meetup will change your behavior in a significant and positive way :)

This is a thing they teach at CFAR.

Please arrive between 7pm and 7:30pm on Wednesday. At 7:30pm as usual we'll review our weekly goals and record goals for the coming week; it should take less than 15 minutes. Afterward I will give a short presentation on implementation intentions, and then we will help each other create implementation intentions.

Even though this takes place at CFAR, it's not a CFAR-sponsored event. The CFAR office is at 2030 Addison, 3rd floor, Berkeley, near the Downtown Berkeley BART. If you find yourself locked out, text me at:

http://i.imgur.com/Vcafy.png

## Meetup : Munich Meetup

0 26 February 2014 11:49PM

## Discussion article for the meetup : Munich Meetup

WHEN: 08 March 2014 02:00:00PM (+0100)

WHERE: Theresienstraße 41, 80333 München

We're going to try a different location this time, and a different format – I'd like to do a few concrete exercises on basic probability theory or something similar. Afterwards we'll move on to free discussion and maybe Zendo. We're planning to meet outside the mathematics building at the LMU. Depending on the weather, we'll stay outside or occupy a free room inside the math department. Whoever brings food for the group is awesome. :) It goes without saying that newcomers are very welcome.

View more: Next