Less Wrong is a community blog devoted to refining the art of human rationality. Please visit our About page for more information.

Good Places to Live as a Less Wronger

11 diegocaleiro 20 November 2012 03:21AM

Less Wrongers are a diverse crowd, more so now than in the early days. I wonder if we could step away from anti-generalizations, generalize, and try to name good places to live under a few assumptions (remember, the idea of an assumption is to assume it, not to claim it is less or more representative of observation class X or Y and then go on to nerdify it).

Recently, Shanghai was claimed to be an interesting place to teach English.

Having just returned from 15 days in Rio de Janeiro, I can talk a little about it.


1) Assuming your family lives somewhere else, other state or country.

2) No children yet. Single, Married, Gay, Bisexual, Male, Female.

3) You can muster $1-4k a month (teaching a language such as English, programming, writing, family money, the lottery, spying for the CIA)

4) You like science/philosophy and rationality, and are not a complete misanthrope (you'd hug five times more than you do if given a chance, and you'd double the number of close friends you have, as well as balance their gender ratio)


My suggested format is city name, time spent there, experience, cons, and pros.

Rio de Janeiro, 15 days. Rio is an interesting city. If you are near the subway, you can get to the vast majority of places without a car. A good night out will cost between 15 and 40 dollars, depending on whether you drink and therefore need a cab home. A nice dinner runs 12-50. There are millions of people, including lots of tourists, within easy reach. So unless you are Estonian, you will be able to find someone from home there. Because travellers go to Rio for its beauty, you can find them in free places and make friends with locals and foreigners alike, allowing for short-term and long-term friendships. They say you get tired of it eventually, but the natural beauty is great and spread throughout the city. Forests, beaches, and mountains abound, all 4 minutes away from a supermarket. There are nearly free public bikes in some areas.

Cons: Science and philosophy are not what Rio is known for. Its universities are good, and you can find your way there if you can get into a good college, but a meeting with a lot of people to discuss two-boxing on Newcomb's problem is less likely in the next ten years. You can't park in Rio during the day, even if you somehow manage to have a car and a parking space in your apartment building. You won't buy a place, and it won't be big: an awesome 190-square-meter Ipanema apartment goes for 2.3 million dollars, and renting a tiny place costs about 1 thousand a month.

Pros: Papers to the contrary, weather does impact your life for quite a while if you pay attention to it. Not necessarily the weather itself, but the social opportunities that arise because of it (moonlight music at the beach, free overheard music in the bohemian neighborhood, dancing as opposed to freezing, etc.) can be, literally, life-changing. Rio has many people not from Rio, so it is easy to befriend them; they also need new friends. The Couchsurfing community is active and speaks English.

Neutral: Many think that people (especially women) look amazing in Brazil; quite the contrary. Our average look is way below your expectations, but the top 5% of people really are better looking to foreign eyes than the top 5% of their own country. Long tails, pun intended.

If you lived for a while in a city that you'd like to recommend to some niche Less Wrongers, report. Avoid doing so for the city you were born in, since a native experience differs violently from a migrant/immigrant experience.


GiveWell and the Centre for Effective Altruism are recruiting

11 Pablo_Stafforini 19 November 2012 11:53PM

Both GiveWell and the Centre for Effective Altruism (CEA) --an Oxford-based umbrella organization consisting of Giving What We Can, 80,000 Hours, The Life You Can Save, and Effective Animal Activism-- have been discussed here before.  So I thought some folks might want to know that these organizations are recruiting for a number of positions.  Here are relevant excerpts from the official job announcements:

GiveWell: Research Analyst

GiveWell is looking for a Research Analyst to help us evaluate charities, find the most outstanding giving opportunities, and publish our analysis to help donors decide where to give.


Effective Animal Activism: Executive Director

Effective Animal Activism is a recently-founded project of 80,000 Hours. It is the world’s first online resource and international community for people who want to reduce animal suffering effectively. We are currently looking for a part-time executive director. Responsibilities will include creating content, managing the community, publicizing the site, and overseeing as well as undertaking further charity research. Future projects include creating a publication on our intervention evaluation once complete, attending conferences, running ad campaigns, and reaching out to the media, animal charities and philanthropists.


Giving What We Can: Head of Communications

We are looking for someone to communicate Giving What We Can’s message to the world. As Communications Manager you would be responsible for handling our press relations and guiding our public image.


80,000 Hours: Head of Careers Research

We are looking for someone to drive cutting-edge research into effective ethical careers and translate it into one-on-one and online careers advice, which you’ll share with interesting people from all over the world.


The Life You Can Save: Director of Outreach (Intern)

We are looking for someone to lead our outreach to pledgers and supporters as well as local groups, other charities, and corporations. In this role, you’ll play a key part in setting our strategic priorities and driving the growth of The Life You Can Save. You’ll be working alongside Peter Singer – one of the most influential ethicists of the 20th century.


Centre for Effective Altruism: Head of Fundraising and External Relations

We are looking for someone to manage our fundraising and represent us to other organisations. In this role you would serve all four organisations in the Centre for Effective Altruism: Giving What We Can, 80,000 Hours, The Life You Can Save and Effective Animal Activism.

(Full disclosure: I'm friends with the co-founders of CEA and have donated to Effective Animal Activism.)


Collating widely available time/money trades

18 RyanCarey 19 November 2012 10:57PM

In the xkcd comic Working, a man is seen filling up his gas tank. "Why are you going here", says the observer, "Gas is ten cents a gallon cheaper at the station five minutes that way". He responds "Because a penny saved is a penny earned". Randall's pragmatically spirited caption says "If you spend nine minutes of your time to save a dollar, you're working for less than the minimum wage."

Our opportunities to convert time into money and vice versa, though not unlimited, are numerous.

We work (sell our free time) when we…

  • Seek overtime shifts
  • Subscribe to a mailing list for local discounts e.g. GroupOn
  • Bargain with one more car dealer or travel agent, in search of a better deal

We buy free time when we…

  • Employ cleaners, chefs, babysitters, secretaries and others.
  • Buy productivity software
  • Buy a medication that improves our sleep

How can we evaluate these trades? It seems like we ought to only purchase free time when it comes cheaper than a certain figure, $x/hr, and ought to only work if we can sell our free time for more than $x/hr. Indeed, comparing trades to this time/money exchange rate is the only unexploitable way to behave.

Most of the time, when we share our estimates of the value of these trades, our comments are too vague to be helpful. If my father, a doctor, tells me, a student, that "subscribing to discount mailing lists is a waste of time", what does he mean? He might mean that these mailing lists are poor value for me, he might mean the much stronger statement that they are poor value for everyone, or the much weaker statement that they are poor value just for him (his time is obviously worth the most). I have to try to get him to disentangle his estimation from his judgement. I have to ask him "What low value would a person have to place on their time for discount mailing lists to be worthwhile?"

The easiest way for individuals with different time/money exchange rates to share their estimates is to quantify them, e.g. being on a discount mailing list only saves $x per hour spent. Between my father and me, this might represent value to neither, one, or both of us.
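To make the comparison concrete, here is a minimal sketch of the rule described above: convert a trade into dollars per hour, then compare it with a personal exchange rate. The function names and the two example rates ($5/hr for a student, $50/hr for a doctor) are made up for illustration and are not from the post. Only the nine-minutes-for-a-dollar figure comes from the xkcd caption.

```python
def implied_hourly_rate(dollars: float, minutes: float) -> float:
    """Dollars per hour implied by a trade that takes (or frees up) `minutes`."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return dollars / (minutes / 60.0)


def worth_working(dollars_saved: float, minutes_spent: float, my_rate: float) -> bool:
    """Selling free time is worthwhile only if it pays more than my exchange rate."""
    return implied_hourly_rate(dollars_saved, minutes_spent) > my_rate


def worth_buying(dollars_paid: float, minutes_freed: float, my_rate: float) -> bool:
    """Buying free time is worthwhile only if it costs less than my exchange rate."""
    return implied_hourly_rate(dollars_paid, minutes_freed) < my_rate


# The xkcd example: nine minutes spent to save one dollar.
xkcd_rate = implied_hourly_rate(1.0, 9.0)
print(round(xkcd_rate, 2))  # 6.67 dollars per hour

# Hypothetical exchange rates, chosen only for illustration.
STUDENT_RATE = 5.0
DOCTOR_RATE = 50.0
print(worth_working(1.0, 9.0, STUDENT_RATE))  # True
print(worth_working(1.0, 9.0, DOCTOR_RATE))   # False
```

The same estimate of about $6.67/hr gives opposite answers for the two hypothetical people, which is why sharing the number is more useful than sharing the verdict.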

When we share these quantitative estimates, it would be silly to discuss deals that are only available privately, like job offers, which depend so heavily on our particular skills and qualifications. Instead, we will gain the most by listing time-money trades that are likely to apply across domains, such as repairing a car on the one hand or catching a cab on the other.

By doing so, we stand to learn that many of the trades we have been carrying out have represented poor value, and we should learn of new trades that we had not previously considered. Of course, there are associated costs, like the time spent gathering this information, and the risk of becoming unduly preoccupied with these decisions, but it still seems worth doing.

A last point of order is that it will be best to indicate how far we can expect each estimate to generalise. For example, the cost of something like melatonin will differ between states or between countries, and that is worth mentioning.

So in this thread, please share your estimates in $/hr for potential ways to work or buy free time.

[SEQ RERUN] Cascades, Cycles, Insight...

1 MinibearRex 19 November 2012 04:58AM

Today's post, Cascades, Cycles, Insight... was originally published on 24 November 2008. A summary (taken from the LW wiki):


Cascades, cycles, and insight are three ways in which the development of intelligence appears discontinuous. Cascades are when one development makes more developments possible. Cycles are when completing a process causes that process to be completed more. And insight is when we acquire a chunk of information that makes solving a lot of other problems easier.

Discuss the post here (rather than in the comments to the original post).

This post is part of the Rerunning the Sequences series, where we'll be going through Eliezer Yudkowsky's old posts in order so that people who are interested can (re-)read and discuss them. The previous post was "Evicting" brain emulations, and you can use the sequence_reruns tag or rss feed to follow the rest of the series.

Sequence reruns are a community-driven effort. You can participate by re-reading the sequence post, discussing it here, posting the next day's sequence reruns post, or summarizing forthcoming articles on the wiki. Go here for more details, or to have meta discussions about the Rerunning the Sequences series.

Australian Rationalist in America

4 shokwave 19 November 2012 02:56AM

Hi LessWrong! I'm a LWer from Melbourne, Australia, and I'm taking a 3-month road trip (with a friend) through parts of the United States. I figure I'd enjoy hanging out with some fellow rationalists while I'm over here!

I attended the May Rationality minicamp in San Francisco (and made some friends who I'm hoping to meet up with again), but I've also heard good things about the LessWrong groups all over the United States. I'd like to meet some of the awesome people involved in these communities!

We've been planning this trip for a while now and have accommodation pretty much everywhere except for the second half of San Francisco. 


  • 17th-21st Nov - Los Angeles, CA
  • 21st-28th Nov - San Francisco, CA
  • 28th Nov-1st Dec - Las Vegas, NV
  • 2nd-3rd Dec - Flagstaff, AZ
  • 3rd-7th Dec - Phoenix, AZ
  • 7th-9th Dec - Santa Fe, NM
  • 9th-10th Dec - El Paso, TX
  • 10th-13th Dec - San Antonio, TX
  • 13th-21st Dec - Austin, TX
  • 21st-26th Dec - Dallas, TX
  • 26th-29th Dec - San Antonio, TX
  • 29th Dec-2nd Jan - New York City, NY
  • 2nd-3rd Jan - San Antonio, TX
  • 3rd-6th Jan - Houston, TX
  • 6th-9th Jan - New Orleans, LA
  • 9th-12th Jan - Memphis, TN
  • 12th-15th Jan - Nashville, TN
  • 15th-18th Jan - Atlanta, GA
  • 18th-22nd Jan - Miami, FL
  • 22nd-26th Jan - Orlando, FL
  • 26th Jan-1st Feb - Washington DC
  • 1st-4th Feb - Philadelphia, PA
  • 4th-6th Feb - New York City, NY
  • 6th-9th Feb - Mount Snow, VT
  • 9th-13th Feb - Boston, MA
  • 13th-15th Feb - New York City, NY
  • 15th-26th Feb - Columbus, OH

If you're in one of these locations when I am, contact me! Either ahead of time or at short notice is fine. I'll be checking meetup posts and mailing lists for events that I can make it to as well, but if you happen to know of an event or meetup happening that fits the schedule, feel free to let me know in the comments.

Message or call me on 4242 394 657, email me at shokwave.sf@gmail.com - or you can leave a toplevel comment on this post, or message my LW account directly. Looking forward to meeting any and all of you!


Online Meetup: The High Impact Network

2 RyanCarey 19 November 2012 02:55AM

Update: The High Impact Network will meet at 7pm on Saturday the 24th of November, Eastern US time. Please email me to be invited to these hangouts:




Effective altruists, not all of whom are geographically located together, benefit from being connected and brought up to date with effective ideas and plans regarding areas of interest to them.

Mark Lee and I want to meet aspiring effective altruists and talk about how their talents and ideas might fit into the greater scheme of organised altruistic effort.

Due to the popularity of the previous meetup, the new discussion will be divided into two smaller groups that will host simultaneous discussions on:

1. Addressing Global Poverty - how can we best alleviate global poverty?

2. Beyond Global Poverty - what are other highly important causes and how can we address them?

Participants are welcome to suggest up to 3 ways that they are interested in addressing these problems, and then we'll discuss the strengths and weaknesses of these approaches. The agenda is broad so as not to preempt or undermine new suggestions likely to be effective. More targeted follow-up meetings can be later arranged if required.

Mark and I will chair one conversation each. Both will take place through Google Hangouts, at the democratically determined time of 7pm on Saturday the 24th of November, Eastern US time.

Please RSVP if you want to be added to the Google Hangout - you are welcome to specify which discussion you prefer to be involved in and topics that you would like added to the agenda.

Meetup : Brussels meetup

0 Axel 18 November 2012 10:45PM

Discussion article for the meetup : Brussels meetup

WHEN: 24 November 2012 01:00:00PM (+0100)

WHERE: Rue des Alexiens 55 1000 Bruxelles

We are once again meeting at 'la fleur en papier doré' close to the Brussels Central station. If you feel like an intelligent discussion and are in the neighborhood, consider dropping by. As always, I'll have a sign.


Introductory Short Videos

7 [deleted] 18 November 2012 07:13PM

Most people new to the Less Wrong community start out by reading the Sequences. There are also lists out there of highly recommended text books, pop sci books, and articles.

However these avenues can take a LOT of time, and not everyone learns best by reading. I am interested in putting together a list of introductory videos that cover the same type of materials discussed on Less Wrong, and could be viewed within a single night. 

One purpose of this is for meetup groups that get a large influx of new people all at once. It might take a while to get everybody up to speed by sending them off to all go read Sequence posts on their own, especially if they are generally busy in their daily life. However, it is much easier to turn one of your meetups into a Video Night. The goal is that at the end of the Video Night, everybody is aware of the type of mindset that we want people at meetups to have. 

The list can also be used by people who are brand new to rationality, and want to get the basic information in video form.


I'll get it started by listing 6 videos that are of the sort that I mean for this list to be. Please leave some recommendations! (Also, if there is already a list of good intro videos, let me know. I am currently unaware of any.) 

Note: Although AI/ x-risk videos are NOT what I am looking for, I know people will want to recommend them, so I'm creating a parent comment to place them under. If we get enough of those, we can create a separate list of good Intro to X-Risk and/or AI vids.


Julia Galef- Straw Vulcan, Skepticon 4
-A good first video that dispels some widely-believed myths about "rationality".

Spencer Greenberg- Self Skepticism, Skepticon 4
-Reasoning as to how and why you might be wrong, and what to do about that fact

Julia Galef- Rationality and the Future, Singularity Summit 2012
-Overview of the culture we are trying to develop

Dan Ariely- TED talks (all three)
- Engaging anecdotes and information about specific cognitive biases, and how they affect people

Open Thread, November 16–30, 2012

3 VincentYu 18 November 2012 01:59PM

If it's worth saying, but not worth its own post, even in Discussion, it goes here.

"How We're Predicting AI — or Failing to"

11 lukeprog 18 November 2012 10:52AM

The new paper by Stuart Armstrong (FHI) and Kaj Sotala (SI) has now been published (PDF) as part of the Beyond AI conference proceedings. Some of these results were previously discussed here. The original predictions data are available here.


This paper will look at the various predictions that have been made about AI and propose decomposition schemas for analysing them. It will propose a variety of theoretical tools for analysing, judging and improving these predictions. Focusing specifically on timeline predictions (dates given by which we should expect the creation of AI), it will show that there are strong theoretical grounds to expect predictions to be quite poor in this area. Using a database of 95 AI timeline predictions, it will show that these expectations are borne out in practice: expert predictions contradict each other considerably, and are indistinguishable from non-expert predictions and past failed predictions. Predictions that AI lies 15 to 25 years in the future are the most common, from experts and non-experts alike.

[SEQ RERUN] "Evicting" brain emulations

2 MinibearRex 18 November 2012 06:15AM

Today's post, “Evicting” brain emulations, by Carl Shulman was originally published on November 23, 2008. A summary:


As new models of ems are created, slight improvements will let them outcompete old models. In a system in which ems can simply be "evicted", the bots will likely not be pleased about this system. If they are simultaneously smarter, faster, and better unified than biological humans, they could be quite dangerous.

Discuss the post here (rather than in the comments to the original post).

This post is part of the Rerunning the Sequences series, where we'll be going through Eliezer Yudkowsky's old posts in order so that people who are interested can (re-)read and discuss them. The previous post was Surprised By Brains, and you can use the sequence_reruns tag or rss feed to follow the rest of the series.

Sequence reruns are a community-driven effort. You can participate by re-reading the sequence post, discussing it here, posting the next day's sequence reruns post, or summarizing forthcoming articles on the wiki. Go here for more details, or to have meta discussions about the Rerunning the Sequences series.

If you could take it with you, what would you take?

4 DataPacRat 18 November 2012 12:12AM

The Scenario: Our protagonist estimates that present-day cryonics has around a five percent chance of leading to a successful revival. Since that's better than the zero percent chance if he doesn't sign up, and he can afford it, he makes the necessary arrangements. As part of those arrangements, he receives a lockable file-cabinet drawer, in which he can put any desired mementos, knick-knacks, or other objects; and which will be protected as securely as his own cryo-preserved body. The drawer is around one and a half cubic feet: two feet deep, one foot wide, nine inches high.

The Question: What should he arrange to have placed in his drawer?

Some of the more obvious options:

* Long-term archival DVDs, such as M-Discs, containing as much of his personal computer's data as possible. With slimline jewel cases, around 400 such discs would fit, which could hold up to around 1.5 terabytes. (Secondary question: Which data to archive?)
* Objects of sentimental value
* Objects with present-day value: cash, gold coins, jewelry
* Objects with predicted future value: collectibles, small antiques
* In honor of previous seekers of immortality: a copy of the ancient Egyptian funerary text, the Book of Coming Forth By Day (aka the Book of the Dead).
* For the purely practical and/or munchkin approach: a weapon, such as a fighting knife or even a pistol

What does the world look like, the day before FAI efforts succeed?

23 michaelcurzi 16 November 2012 08:56PM

TL;DR: let's visualize what the world looks like if we successfully prepare for the Singularity.

I remember reading once, though I can't remember where, about a technique called 'contrasting'. The idea is to visualize a world where you've accomplished your goals, and visualize the current world, and hold the two worlds in contrast to each other. Apparently there was a study about this; the experimental 'contrasting' group was more successful than the control in accomplishing its goals.

It occurred to me that we need some of this. Strategic insights about the path to FAI are not robust or likely to be highly reliable. And in order to find a path forward, you need to know where you're trying to go. Thus, some contrasting:

It's the year 20XX. The time is 10 AM, on the day that will thereafter be remembered as the beginning of the post-Singularity world. Since the dawn of the century, a movement rose in defense of humanity's future. What began with mailing lists and blog posts became a slew of businesses, political interventions, infrastructure improvements, social influences, and technological innovations designed to ensure the safety of the world.

Despite all odds, we exerted a truly extraordinary effort, and we did it. The AI research is done; we've laboriously tested and re-tested our code, and everyone agrees that the AI is safe. It's time to hit 'Run'.

And so I ask you, before we hit the button: what does this world look like? In the scenario where we nail it, which achievements enabled our success? Socially? Politically? Technologically? What resources did we acquire? Did we have superior technology, or a high degree of secrecy? Was FAI research highly prestigious, attractive, and well-funded? Did we acquire the ability to move quickly, or did we slow unFriendly AI research efforts? What else?

I had a few ideas, which I divided between scenarios where we did a 'fantastic', 'good', or 'sufficient' job at preparing for the Singularity. But I need more ideas! I'd like to fill this out in detail, with the help of Less Wrong. So if you have ideas, write them in the comments, and I'll update the list.

Some meta points:

  • This speculation is going to be, well, pretty speculative. That's fine - I'm just trying to put some points on the map. 
  • However, I'd like to get a list of reasonable possibilities, not detailed sci-fi stories. Do your best.
  • In most cases, I'd like to consolidate categories of possibilities. For example, we could consolidate "the FAI team has exclusive access to smart drugs" and "the FAI team has exclusive access to brain-computer interfaces" into "the FAI team has exclusive access to intelligence-amplification technology." 
  • However, I don't want too much consolidation. For example, I wouldn't want to consolidate "the FAI team gets an incredible amount of government funding" and "the FAI team has exclusive access to intelligence-amplification technology" into "the FAI team has a lot of power".
  • Lots of these possibilities are going to be mutually exclusive; don't see them as aspects of the same scenario, but rather different scenarios.

Anyway - I'll start.

Visualizing the pre-FAI world

  • Fantastic scenarios
    • The FAI team has exclusive access to intelligence amplification technology, and uses it to ensure Friendliness & strategically reduce X-risk.
    • The government supports Friendliness research, and contributes significant resources to the problem. 
    • The government actively implements legislation which FAI experts and strategists believe has a high probability of making AI research safer.
    • FAI research becomes a highly prestigious and well-funded field, relative to AGI research.
    • Powerful social memes exist regarding AI safety; any new proposal for AI research is met with a strong reaction (among the populace and among academics alike) asking about safety precautions. It is low status to research AI without concern for Friendliness.
    • The FAI team discovers important strategic insights through a growing ecosystem of prediction technology, using stables of experts, prediction markets, and opinion aggregation.
    • The FAI team implements deliberate X-risk reduction efforts to stave off non-AI X-risks. Those might include a global nanotech immune system, cheap and rigorous biotech tests and safeguards, nuclear safeguards, etc.
    • The FAI team implements the infrastructure for a high-security research effort, perhaps offshore, implementing the best available security measures designed to reduce harmful information leaks.
    • Giles writes: Large amounts of funding are available, via government or through business. The FAI team and its support network may have used superior rationality to acquire very large amounts of money.
    • Giles writes: The technical problem of establishing Friendliness is easier than expected; we are able to construct a 'utility function' (or a procedure for determining such a function) in order to implement human values that people (including people with a broad range of expertise) are happy with.
    • Crude_Dolorium writes: FAI research proceeds much faster than AI research, so by the time we can make a superhuman AI, we already know how to make it Friendly (and we know what we really want that to mean).
  • Pretty good scenarios
    • Intelligence amplification technology access isn't exclusive to the FAI team, but it is differentially adopted by the FAI team and their supporting network, resulting in a net increase in FAI team intelligence relative to baseline. The FAI team uses it to ensure Friendliness and implement strategy surrounding FAI research.
    • The government has extended some kind of support for Friendliness research, such as limited funding. No protective legislation is forthcoming.
    • FAI research becomes slightly more high status than today, and additional researchers are attracted to answer important open questions about FAI.
    • Friendliness and rationality memes grow at a reasonable rate, and by the time the Friendliness program occurs, society is more sane.
    • We get slightly better at making predictions, mostly by refining our current research and discussion strategies. This allows us a few key insights that are instrumental in reducing X-risk.
    • Some X-risk reduction efforts have been implemented, but with varying levels of success. Insights about which X-risk efforts matter are of dubious quality, and the success of each effort doesn't correlate well to the seriousness of the X-risk. Nevertheless, some X-risk reduction is achieved, and humanity survives long enough to implement FAI.
    • Some security efforts are implemented, making it difficult but not impossible for pre-Friendly AI tech to be leaked. Nevertheless, no leaks happen.
    • Giles writes: Funding is harder to come by, but small donations, limited government funding, or moderately successful business efforts suffice to fund the FAI team.
    • Giles writes: The technical problem of aggregating values through a Friendliness function is difficult; people have contradictory and differing values. However, there is broad agreement as to how to aggregate preferences. Most people accept that FAI needs to respect values of humanity as a whole, not just their own.
    • Crude_Dolorium writes: Superhuman AI arrives before we learn how to make it Friendly, but we do learn how to make an 'Anchorite' AI that definitely won't take over the world. The first superhuman AIs use this architecture, and we use them to solve the harder problems of FAI before anyone sets off an exploding UFAI.
  • Sufficiently good scenarios
    • Intelligence amplification technology is widespread, preventing any differential adoption by the FAI team. However, FAI researchers are able to keep up with competing efforts to use that technology for AI research.
    • The government doesn't support Friendliness research, but the research group stays out of trouble and avoids government interference.
    • FAI research never becomes prestigious or high-status, but the FAI team is able to answer the important questions anyway.
    • Memes regarding Friendliness aren't significantly more widespread than today, but the movement has grown enough to attract the talent necessary to implement a Friendliness program.
    • Predictive ability is no better than it is today, but the few insights we've gathered suffice to build the FAI team and make the project happen.
    • There are no significant and successful X-risk reduction efforts, but humanity survives long enough to implement FAI anyway.
    • No significant security measures are implemented for the FAI project. Still, via cooperation and because the team is relatively unknown, no dangerous leaks occur.
    • Giles writes: The team is forced to operate on a shoestring budget, but succeeds anyway because the problem turns out to not be incredibly sensitive to funding constraints.
    • Giles writes: The technical problem of aggregating values is incredibly difficult. Many important human values contradict each other, and we have discovered no "best" solution to those conflicts. Most people agree on the need for a compromise but quibble over how that compromise should be reached. Nevertheless, we come up with a satisfactory compromise.
    • Crude_Dolorium writes: The problems of Friendliness aren't solved in time, or the solutions don't apply to practical architectures, or the creators of the first superhuman AIs don't use them, so the AIs have only unreliable safeguards. They're given cheap, attainable goals; the creators have tools to read the AIs' minds to ensure they're not trying anything naughty, and killswitches to stop them; they have an aversion to increasing their intelligence beyond a certain point, and to whatever other failure modes the creators anticipate; they're given little or no network connectivity; they're kept ignorant of facts more relevant to exploding than to their assigned tasks; they require special hardware, so it's harder for them to explode; and they're otherwise designed to be safer if not actually safe. Fortunately they don't encounter any really dangerous failure modes before they're replaced with descendants that really are safe.


Room for more funding at the Future of Humanity Institute

18 John_Maxwell_IV 16 November 2012 08:45PM

In case you didn't already know: The Future of Humanity Institute, one of the three organizations co-sponsoring LW, is a group within the University of Oxford's philosophy department that tackles important, large-scale problems for humanity like how to go about reducing existential risk.

I've been casually corresponding with the FHI in an effort to learn more about the different options available for purchasing existential risk reduction. Here's a summary of what I've learned from research fellow Stuart Armstrong and academic project manager Sean O'Heigeartaigh:

  • Sean reports that since this SIAI/FHI achievements comparison, FHI's full-time research team has expanded to 7, the biggest it's ever been.  Sean writes: "Our output has improved dramatically by all tangible metrics (academic papers, outreach, policy impact, etc) to match this."
  • Despite this, Sean writes, "we’re not nearly at the capacity we’d like to reach. There are a number of research areas in which we would very like to expand (more machine intelligence work, synthetic biology risks, surveillance/information society work) and in which we feel that we could make a major impact. There are also quite a number of talented researchers over the past year who we haven’t been able to employ but would dearly like to."
  • They'd also like to do more public outreach, but standard academic funding routes aren't likely to cover this.  So without funding from individuals, it's much less likely to happen.
  • Sean is currently working overtime to cover a missing administrative staff member, but he plans to release a new achievement report (see sidebar on this page for past achievement reports) sometime in the next few months.
  • Although the FHI has traditionally pursued standard academic funding channels, donations from individuals (small and large) are more than welcome.  (Stuart says this can't be emphasized enough.)
  • Stuart reports current academic funding opportunities are "a bit iffy, with some possible hopes".
  • Sean is more optimistic than Stuart regarding near-term funding prospects, although he does mention that both Stuart and Anders Sandberg are currently being covered by FHI's "non-assigned" funding until grants for them can be secured.

Although neither Stuart nor Sean mentions this, I assume that one reason individual donations can be especially valuable is if they free FHI researchers up from writing grant proposals so they can spend more time doing actual research.

Interesting comment by lukeprog describing the comparative advantages of SIAI and FHI.

Prediction: Autism Rate will Stop Increasing

2 OneLonePrediction 16 November 2012 07:26PM

I predict that the prevalence of autism spectrum disorders is done increasing, because the increase so far has all come from better diagnosis. The autism rate in children has now reached one in 88; the autism rate in adults is estimated at one in 86. The childhood rate has just come within my error bars of the adult rate.

There are two things that I think could confuse the issue. The first is that the DSM-V will come out soon. If, following the DSM-V, diagnosticians continue exactly what they're currently doing, except that they diagnose all autism spectrum disorders as autistic disorder (which will not affect the one in 88 statistic because Asperger's and PDD-NOS have always been part of it), then the prevalence rate has stopped increasing. If, following the DSM-V, diagnosticians do what it tells them, the prevalence rate will decrease, with the loss coming from people with PDD-NOS who have communication or language difficulties and one of the other two points in the triad of impairments.

The second is that the adult rate may be lower than the childhood rate because of the existence of people who are diagnosable as children but not as adults because of learned coping skills. If that's true, the rate may continue to increase.

I will consider myself right if it reaches one in 84, slightly surprised at one in 80 (I'll assume I underestimated the number of people diagnosable only as children), shaken and looking for explanations at one in 75 and outright wrong (I will abandon the theory and concede defeat) if the prevalence reaches one in 70 without some really significant evidence of overdiagnosis.

I also allocate some probability mass to the idea that the prevalence rate will decrease. I don't predict a huge decrease with great likelihood, but if I do see one, I will update on my beliefs about diagnosticians and watch who is and isn't getting diagnosed in the next youngest cohort. If the DSM-V causes such a drop, I would expect it to more likely be sudden as doctors adopt changes after the DSM-V comes out, but it could be slow and steady if older doctors do as they were already doing and younger doctors act differently. Those cases could be distinguished by looking at diagnoses made by older doctors and newer ones and comparing them.

I note that while I do not predict a huge drop with great probability, mine is the only theory which would explain any drop at all.

I predict that if there is a drop, John Best will claim that "lying neurodiverse psychopaths" are somehow responsible and that it harms "actual autistics" and their families. Note that I assign only small probability to the actual phrases given. For instance, he may suggest "real autistics" or "families actually affected by real autism" or any number of things. ("Psychopaths" will be in there somewhere as a description of people who do not want a cure for autism. If he doesn't call them liars in his blog post, they will be called liars somewhere in the comments.) If I am wrong, the most likely other possibility is that he is triumphant and believes that his supporters are "getting the message out" and parents are not vaccinating anymore. My model of reality takes a hit if John Best ever claims that new and surprising evidence does not support his ideas about autism. This does not apply to everyone who is against vaccines. It only applies to him. 

(Trivial prediction: you will get upset if you read his blogs. Don't go looking for them unless your utility function values becoming upset over false claims.)

If the autism rate is stable, this is evidence that it has been stable for a long time (I believe this because the increasing rate has been used as evidence that it did not exist before the last century and that it is environmentally caused) and if it has been stable for a long time, this is evidence against it being caused by vaccines, because that would have caused the prevalence to increase.

Please spread this around as much as possible. I am predicting ahead of time that: The autism rate does not go above one in eighty, probably stays stable in the high-to-mid eighties and may decrease. I admit that I was wrong if it reaches one in seventy. Please help: I want this well-known. I want people to know I made the prediction before we see the evidence. Sharing a link to this would be a quick and easy way to increase average utility and expose people to the idea of falsifiable ideas that make predictions about what they will and won't see.

Why is Mencius Moldbug so popular on Less Wrong? [Answer: He's not.]

9 arborealhominid 16 November 2012 06:37PM

I've seen several people on Less Wrong recommend Mencius Moldbug's writings, and I've been curious about how he became so popular here. He's certainly an interesting thinker, but he's rather obscure and doesn't have any obvious connection to Less Wrong, so I'm wondering where this overlap in readership came from.

[EDIT by E.Y.: The answer is that he's not popular here.  The 2012 LW annual survey showed 2.5% (30 of 1195 responses) identified as 'reactionary' or 'Moldbuggian'.  To the extent this is greater than population average, it seems sufficiently explained by Moldbug having commented on the early Overcoming Bias econblog before LW forked from it, bringing with him some of his own pre-existing audience.  I cannot remember running across anyone talking about Moldbug on LW, at all, besides this post, in the last year or so.  Since this page has now risen to the first page of Google results for Mencius Moldbug due to LW's high pagerank, and on at least one occasion sloppy / agenda-promoting journalists such as Klint Finley have found it convenient to pretend to an alternate reality (where Moldbug is popular on LW and Hacker News due to speaking out for angry entitled Silicon Valley elites, or something), a correction in the post seems deserved.  See also the Anti-Reactionary FAQ by Scott Alexander (aka Yvain, LW's second-highest-karma user). --EY]

Meetup : Montreal Meetups - Now weekly!

2 DaFranker 16 November 2012 06:01PM

Discussion article for the meetup : Montreal Meetups - Now weekly!

WHEN: 19 November 2012 06:00:00PM (-0500)

WHERE: Caffe Art Java - 645, av. du Président-Kennedy, Montréal, QC H3A 1K1

After several meetings with two other LessWrong readers and several "outside" friends or acquaintances interested in rationality, we've established that we'll meet every Monday at 18:00.

At this next meetup, we'll attempt to solidify-in-practice some key concepts of LW-style epistemology and start moving into the realm of training thought habits for using words (more) correctly.

If you're interested in joining us, we're interested in having you! If you have questions, please contact Paul_G or myself.

We don't really have a public signal yet, so if you want to avoid the possible embarrassment of asking random people in the café if they're here for the LessWrong meetup, I recommend getting in touch with us.

Meetup : Cambridge, MA third-Sundays meetup

3 jimrandomh 16 November 2012 06:00PM

Discussion article for the meetup : Cambridge, MA third-Sundays meetup

WHEN: 18 November 2012 02:00:00PM (-0500)

WHERE: 25 Ames St Cambridge, MA

Cambridge/Boston-area Less Wrong meetups are held on the first and third Sunday of every month, 2pm at the MIT Landau Building [25 Ames St, Bldg 66], room 148. The room number is subject to change based on availability; signs will be posted with the actual room number.

Weekly LW Meetups: Austin, Berlin, Champaign, Melbourne, Rio de Janeiro, Springfield MO, Washington DC

0 FrankAdamek 16 November 2012 05:03PM

[draft] Responses to Catastrophic AGI Risk: A Survey

11 Kaj_Sotala 16 November 2012 02:29PM

Here's the biggest thing that I've been working on for the last several months:

Responses to Catastrophic AGI Risk: A Survey
Kaj Sotala, Roman Yampolskiy, and Luke Muehlhauser

Abstract: Many researchers have argued that humanity will create artificial general intelligence (AGI) within the next 20-100 years. It has been suggested that this may become a catastrophic risk, threatening to do major damage on a global scale. After briefly summarizing the arguments for why AGI may become a catastrophic risk, we survey various proposed responses to AGI risk. We consider societal proposals, proposals for constraining the AGIs’ behavior from the outside, and for creating AGIs in such a way that they are inherently safe.

This doesn't aim to be a very strongly argumentative paper, though it does comment on the various proposals from an SI-ish point of view. Rather, it attempts to provide a survey of all the major AGI-risk related proposals that have been made so far, and to provide some thoughts on their respective strengths and weaknesses. Before writing this paper, we hadn't encountered anyone who'd have been familiar with all of these proposals - not to mention that even we ourselves weren't familiar with all of them! Hopefully, this should become a useful starting point for anyone who's at all interested in AGI risk or Friendly AI.

The draft will be public and open for comments for one week (until Nov 23rd), after which we'll incorporate the final edits and send it off for review. We're currently aiming to have it published in the sequel volume to Singularity Hypotheses.

EDIT: I've now hidden the draft from public view (so as to avoid annoying future publishers who may not like early drafts floating around before the work has been accepted for publication) while I'm incorporating all the feedback that we got. Thanks to everyone who commented!

[SEQ RERUN] Surprised By Brains

1 MinibearRex 16 November 2012 07:33AM

Today's post, Surprised by Brains, was originally published on 23 November 2008. A summary (taken from the LW wiki):


If you hadn't ever seen brains before, but had only seen evolution, you might start making astounding predictions about their ability. You might, for instance, think that creatures with brains might someday be able to create complex machinery in only a millennium.

Discuss the post here (rather than in the comments to the original post).

This post is part of the Rerunning the Sequences series, where we'll be going through Eliezer Yudkowsky's old posts in order so that people who are interested can (re-)read and discuss them. The previous post was Billion Dollar Bots, and you can use the sequence_reruns tag or rss feed to follow the rest of the series.

Sequence reruns are a community-driven effort. You can participate by re-reading the sequence post, discussing it here, posting the next day's sequence reruns post, or summarizing forthcoming articles on the wiki. Go here for more details, or to have meta discussions about the Rerunning the Sequences series.

Meetup : DC Meetup: Boardgames

1 rocurley 16 November 2012 03:47AM

Discussion article for the meetup : DC Meetup: Boardgames

WHEN: 18 November 2012 03:00:00PM (-0500)

WHERE: National Portrait Gallery

We'll be meeting to play games and hang out, and possibly try a new variation of Zendo.

How well defined is ADHD?

8 jsalvatier 15 November 2012 11:34PM

I've long had attention and focus problems, but never explored the possibility that I have ADHD till recently. I understand that it's a standard term, but I'm still a bit suspicious; psychiatry doesn't seem like the most reliable field.

Are there good reasons for picking out the behaviors associated with ADHD and giving them a name? Obviously this is a different question than whether ADHD is a 'disorder' or 'disease', and whether ADHD medication is good or bad for people.    

Answers that would satisfy me


  • Behaviors associated with ADHD strongly cluster
  • Analyzing questionnaires of attention and focus behaviors with factor analysis naturally produces an 'ADHD dimension' that explains a lot of variance (similar methodology to identifying Big 5 personality traits).
  • An ADHD diagnosis is a strong predictor of something interesting (income, grades, or some contrived but interesting lab test)
  • Something else along these lines.
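
To make the second kind of answer concrete, here is a rough sketch of what "one dimension explains a lot of variance" could look like. Everything in it is an assumption made for illustration: the questionnaire data is simulated, the names `simulate_questionnaire`, `correlation_matrix` and `top_eigenvalue` are made up, and it uses the largest eigenvalue of the correlation matrix (a principal-component shortcut) in place of a real factor analysis.

```python
import random
import statistics


def simulate_questionnaire(n_people=500, n_items=6, loading=1.0, noise=0.5, seed=0):
    """Simulated answers: each item = loading * shared latent trait + item-specific noise."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_people):
        latent = rng.gauss(0, 1)  # hypothetical shared "attention" trait
        rows.append([loading * latent + noise * rng.gauss(0, 1) for _ in range(n_items)])
    return rows


def correlation_matrix(rows):
    """Pearson correlation matrix between questionnaire items (columns of rows)."""
    cols = list(zip(*rows))
    n = len(rows)
    means = [statistics.fmean(c) for c in cols]
    sds = [statistics.pstdev(c) for c in cols]
    k = len(cols)
    return [
        [
            sum((cols[i][t] - means[i]) * (cols[j][t] - means[j]) for t in range(n))
            / (n * sds[i] * sds[j])
            for j in range(k)
        ]
        for i in range(k)
    ]


def top_eigenvalue(matrix, iterations=200):
    """Largest eigenvalue of a symmetric positive semi-definite matrix, by power iteration."""
    k = len(matrix)
    v = [1.0] * k
    eig = 0.0
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(k)) for i in range(k)]
        eig = sum(x * x for x in w) ** 0.5  # norm of M v approaches the top eigenvalue
        v = [x / eig for x in w]
    return eig


rows = simulate_questionnaire()
# The eigenvalues of a correlation matrix sum to the number of items,
# so this ratio is the share of total variance on the first component.
share = top_eigenvalue(correlation_matrix(rows)) / len(rows[0])
print(f"share of variance on the first component: {share:.2f}")
```

If real questionnaire items behaved like the simulated ones, the first component would carry most of the variance; with `loading=0.0` (no shared trait) the share falls toward one over the number of items. Real data would also need a proper factor model and a check that the factor is specific to attention rather than general distress.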


Empirical claims, preference claims, and attitude claims

5 John_Maxwell_IV 15 November 2012 07:41PM

What do the following statements have in common?

  • "Atlas Shrugged is the best book ever written."
  • "You break it, you buy it."
  • "Earth is the most interesting planet in the solar system."

My answer: None of them are falsifiable claims about the nature of reality.  They're all closer to what one might call "opinions".  But what is an "opinion", exactly?

There's already been some discussion on Less Wrong about what exactly it means for a claim to be meaningful.  This post focuses on the negative definition of meaning: what sort of statements do people make where the primary content of the statement is non-empirical?  The idea here is similar to the idea behind anti-virus software: Even if you can't rigorously describe what programs are safe to run on your computer, there still may be utility in keeping a database of programs that are known to be unsafe.

Why is it useful to be able to flag non-empirical claims?  Well, for one thing, you can believe whatever you want about them!  And it seems likely that this pattern-matching approach works better for flagging them than a more constructive definition.

A summary of the Hanson-Yudkowsky FOOM debate

23 Kaj_Sotala 15 November 2012 07:25AM

In late spring this year, Luke tasked me with writing a summary and analysis of the Hanson-Yudkowsky FOOM debate, with the intention of having it eventually published somewhere. Due to other priorities, this project was put on hold for the time being. Because it doesn't look like it will be finished in the near future, and because Curiouskid asked to see it, we thought that we might as well share the thing.

I have reorganized the debate, presenting it by topic rather than in chronological order: I start by providing some brief conceptual background that's useful for understanding Eliezer's optimization power argument, after which I present his argument. Robin's various objections follow, after which there is a summary of Robin's view of what the Singularity will be like, together with Eliezer's objections to that view. Hopefully, this should make the debate easier to follow. This summary also incorporates material from the 90-minute live debate on the topic that they had in 2011. The full table of contents:

  1. Introduction
  2. Overview
  3. The optimization power argument
    1. Conceptual background
    2. The argument: Yudkowsky
    3. Recursive self-improvement
    4. Hard takeoff
    5. Questioning optimization power: the question of abstractions
    6. Questioning optimization power: the historical record
    7. Questioning optimization power: the UberTool question
  4. Hanson's Singularity scenario
    1. Architecture vs. content, sharing of information
    2. Modularity of knowledge
    3. Local or global singularity?
  5. Wrap-up
  6. Conclusions
  7. References

Here's the link to the current draft; any feedback is welcome. Feel free to comment if you know of useful references, if you think I've misinterpreted something that was said, or if you think there's any other problem. I'd also be curious to hear to what extent people think that this outline is easier to follow than the original debate, or whether it's just as confusing.

Launched: Friendship is Optimal

40 iceman 15 November 2012 04:57AM

Friendship is Optimal has launched and is being published in chunks on FIMFiction

Friendship is Optimal is a story about an optimizer written to "satisfy human values through friendship and ponies." I would like to thank everyone on LessWrong who came out and helped edit it. Friendship is Optimal wouldn't be what it is today without your help.

Thank you.

Teaser description:

Hanna, the CEO of Hofvarpnir Studios, just won the contract to write the official My Little Pony MMO. Hanna has built an A.I. Princess Celestia and given her one basic drive: to satisfy everybody's values through friendship and ponies. And Princess Celestia will follow those instructions to the letter...even if you don't want her to.

Here is the schedule for the next chapters:

Friday (Nov. 16th): Chapter 4 - 5

Monday (Nov. 19th): Chapter 6 - 7

Thursday (Nov. 22nd): Chapter 8 - 9

Sunday (Nov. 25th): Chapter 10 - 11, Author's Afterword

[SEQ RERUN] Billion Dollar Bots

0 MinibearRex 15 November 2012 04:43AM

Today's post, Billion Dollar Bots, was originally published on November 22, 2008. A summary:


An alternate scenario for the creation of bots, this time involving lots of cloud computing.

Discuss the post here (rather than in the comments to the original post).

This post is part of the Rerunning the Sequences series, where we'll be going through Eliezer Yudkowsky's old posts in order so that people who are interested can (re-)read and discuss them. The previous post was Brain Emulation and Hard Takeoff, and you can use the sequence_reruns tag or rss feed to follow the rest of the series.

Sequence reruns are a community-driven effort. You can participate by re-reading the sequence post, discussing it here, posting the next day's sequence reruns post, or summarizing forthcoming articles on the wiki. Go here for more details, or to have meta discussions about the Rerunning the Sequences series.

Meetup : Salt Lake City Monthly meetup

0 hamnox 15 November 2012 12:46AM

Discussion article for the meetup : Salt Lake City Monthly meetup

WHEN: 17 November 2012 03:00:28PM (-0700)

WHERE: Calvin S. Smith Library 810 East 3300 South Salt Lake City, Utah 84106

We'll be in the small conference room at the back, and might go out for coffee or whatnot afterwards. There's Karaoke going on later too.

Agenda:

  • 3:00 Greetings, Introduction and Catchup
  • 3:10 Thinking Fast & Slow
  • 3:30 discuss
  • 3:40 Expected Value of Information
  • 4:00 discuss
  • 4:10 Group Goals
  • 4:30 discuss
  • 4:40 Wrap up and Final Thoughts, Sign up for next meetup
  • Free Discussion

For anyone who can't make this one, know that there's one scheduled every third Saturday of the month, tentatively the same time and place, for most of the conceivable future. Our google group is here: https://groups.google.com/group/lesswrongslc/

Meetup : Cincinnati/Columbus: Memorisation exercise (NB, time changed)

2 RolfAndreassen 14 November 2012 08:38PM

Discussion article for the meetup : Cincinnati/Columbus: Memorisation exercise (NB, time changed)

WHEN: 18 November 2012 02:00:00PM (-0500)

WHERE: 4934 Juniper Way Beavercreek, OH 45440

Please notice changed time!

We will meet at Choe's Asian Gourmet in the Greene. Our exercise for this month is rote memorisation: Please come prepared with a poem of reasonable length, memorised this week, to recite for the group and then discuss. Bookstore and ice cream may occur later.

Meetup : Durham HPMoR Discussion, chapters 15-17

2 evand 14 November 2012 05:52PM

Discussion article for the meetup : Durham HPMoR Discussion, chapters 15-17

WHEN: 17 November 2012 11:00:00AM (-0500)

WHERE: Foster's Market, 2694 Durham-Chapel Hill Blvd., Durham, NC

We will meet and discuss HPMoR, chapters 15-17 (approx 55 pages). Main discussion will probably last until 12:30 or 1:00, and there will likely be Zendo afterwards.

[LINK] Steven Landsburg "Accounting for Numbers" - response to EY's "Logical Pinpointing"

9 David_Gerard 14 November 2012 12:55PM

"I started to post a comment, but it got long enough that I’ve turned my comment into a blog post."

So the study of second-order consequences is not logic at all; to tease out all the second-order consequences of your second-order axioms, you need to confront not just the forms of sentences but their meanings. In other words, you have to understand meanings before you can carry out the operation of inference. But Yudkowsky is trying to derive meaning from the operation of inference, which won’t work because in second-order logic, meaning comes first.

... it’s important to recognize that Yudkowsky has “solved” the problem of accounting for numbers only by reducing it to the problem of accounting for sets — except that he hasn’t even done that, because his reduction relies on pretending that second order logic is logic.

Instrumental rationality for overcoming disability and lifestyle failure (a specific case)

8 CAE_Jones 14 November 2012 08:46AM

I read things like "Rationality should win!", and I feel validated. That's basically what I've believed as far back as I can remember (and though my memory isn't as reliable as I'd like, I do think it is pretty decent, based on comparisons I've made with video evidence. Not that said video evidence includes much of me talking about my beliefs.).
Yet, clearly, I am no master of rationality, given that my situation at present is not what I would define as having won. When my father was my age, he was employed, married (to his second wife, with whom he is still married), and I was three years old. It wasn't long after that when my sister was born and my dad started work for his father-in-law, whose business he eventually came to own and still runs for a viable profit today. He did set some goals for himself that he hasn't quite met--gaining enough self-sustaining wealth to retire at age 45 (he turned 45 in February 2012) and/or spend his days lounging at the beach without negative consequences--but considering how much money he's put into vacations to the beach, and that he's still able to support his business and family (my stepmother makes a non-negligible contribution to this), I'd call him much closer to successful than me.
So, clearly I didn't pick up everything that made my father successful. I don't think dwelling on the differences will help much, but relevant is that I was born with no vision in one eye, and had some vision loss in the other a few years later, until it finally dropped to near-useless starting around age fourteen. (Incidentally, this is my excuse if I fail to catch any spelling errors in this post.) I was taught to believe strongly in the power of human intelligence, and that peer pressure is evil, and that education is the most important thing ever.
By 2008, I'd discovered that most of this was extremely flawed, and it was too late to correct the damage that acting on my perceived value of these beliefs had caused (I hadn't realized the extent of that damage by then, but was beginning to pick up on some of it). I'd gotten into an expensive college because the best education possible was apparently important and expensive... yet between my vision, attention to what I thought of as rationality, and rejection of many social conventions, I was completely lacking the skills to get much out of the college experience other than some basic details on a few foreign cultures. After six years of that, I'm back with my parents, $30,000 in debt, unemployed, lacking in the social department to an extent that seems more incurable than not, still requiring a French credit to receive a diploma that will in all likelihood be of very little utility to me, and with a serious deficit in the skills necessary for independence. Oh, and I have less than $400 that I can use ($90 may or may not be in a bank account I don't know how to access, $123.51 in PayPal, and the rest in cash). I was recently reapproved for supplemental security income (which I don't expect to be any more than $400 per month), conditional on me living at the property my grandmother left me.
I said all of that to ask: how do I fix these problems? I think I've gone on way too long with this post, but explaining some of the obstacles won't help me find solutions without goals, so I'll list some.
* I have lived in the same town for 24 years, and in the same house for close to 20 of those, yet I cannot travel anywhere beyond our property on my own without serious risk (getting lost / injured / trespassing / doing accidental damage to someone else's property / etc). I have more mobility skills than my parents give me credit for (About a week and a half ago, I decided to go across the street to ask my stepmother about lunch, and my father said "There's a road! You'll get run over!". I have crossed more dangerous roads than that one unaided multiple times, though he wasn't witness to any of this.). I believe I could get to the store down the road from my grandmother's house unaided (getting back might be more difficult, though), but that's about it. Being able to travel independently seems likely to increase my abilities to accomplish other things tremendously, so this is a problem to be overcome.
* My financial situation is horrible. By my estimates, I could live on $500 per month, assuming I was efficient with food, electricity, etc, and still be able to afford internet access and possibly avoid incurring the wrath of whoever my student loans are paid off to (an understanding of my finances was never given priority before 2010, so I only vaguely know what's going on). I'd much rather have a bit more ($1000 a month would be spectacular, although if part of that is SSI I wouldn't be able to save more than $2000 at a time without losing it). I am led to believe that my employment opportunities are extremely limited (my region has handled the economic recession well, but my other issues seem likely to add to the difficulty). Actually finding local employment without being able to travel to locations independently and fill out application forms would be more than a little difficult, and I have a strong preference for something with a physical component that would make online work undesirable. (My programming skills are also less than spectacular; I've been able to develop accessible computer games and have actually turned a slight profit on that (by which I mean less than $300), but I don't believe I can program on the level that would get someone else to hire me. And if I could write on demand, I wouldn't have an outstanding French credit.). (Here's what the American Foundation for the Blind has to say on employment among disabled Americans: http://www.afb.org/section.aspx?SectionID=15&SubTopicID=177 ).
* Health. I'm sure just living on my own would have a serious impact on my ability to adjust my diet for nutritional value (I find that I tend to eat whatever is easiest, which is usually horribly unhealthy). I can cook with a microwave, and am told that a slow-cooker/crockpot is easy to use. I'm more concerned about exercise, seeing as I can't safely go running or anything of the sort. (I also don't like treadmills, for some reason. I'd be ok with an elliptical, if I could acquire one and find somewhere to put it...).
* I have no experience with romantic relationships (I'm not even sure about strong platonic relationships, for that matter). I am possibly interested in changing this. As a hint at some almost-definitely-harmful things that I haven't been able to remove from my mind, the previous statement was far more painful to write than those before it (as in, if I wound up in a romantic relationship, I would absolutely dread discovery from my parents, knowing that the worst they'd do is make comments intended to be humorous rather than harmful.).
* I've had my creative endeavors (writing fiction, game development, etc) as side-goals for a while, but recently I've started to consider that making them higher priority is probably a good thing. The problem, other than combatting procrastination and other sorts of akrasia, is that I'm at a point where, to keep moving forward almost definitely requires funding. (I can't see well enough to do graphics, and haven't been able to find a decent way around this that doesn't involve paying someone to do graphics; I can't do all the voice-work I'd need, and volunteer voiceactors are horribly unreliable; sound libraries, software licenses, web hosting, etc). The goal is of course the creation of the products to my satisfaction, not the funding itself, but funding seems like a crucial upcoming step.

Solving/accomplishing any of the above would be a significant victory, yet I feel extremely limited in my personal ability to do so. If we can call any one of those successfully resolved using methods promoted at LessWrong, I'd definitely try to provide as much evidence for the victory as possible, as all the questions about the utility of LW make substantiated claims of successes due to the methods of rationality seem valuable to the community.

[SEQ RERUN] Brain Emulation and Hard Takeoff

1 MinibearRex 14 November 2012 06:47AM

Today's post, Brain Emulation and Hard Takeoff, was originally published on November 22, 2008. A summary:


A project of bots could start an intelligence explosion once it got fast enough to start making bots of the engineers working on it, that would be able to operate at greater than human speed. Such a system could also devise a lot of innovative ways to acquire more resources or capital.

Discuss the post here (rather than in the comments to the original post).

This post is part of the Rerunning the Sequences series, where we'll be going through Eliezer Yudkowsky's old posts in order so that people who are interested can (re-)read and discuss them. The previous post was Emulations Go Foom, and you can use the sequence_reruns tag or rss feed to follow the rest of the series.

Sequence reruns are a community-driven effort. You can participate by re-reading the sequence post, discussing it here, posting the next day's sequence reruns post, or summarizing forthcoming articles on the wiki. Go here for more details, or to have meta discussions about the Rerunning the Sequences series.

Universal agents and utility functions

29 Anja 14 November 2012 04:05AM

I'm Anja Heinisch, the new visiting fellow at SI. I've been researching replacing AIXI's reward system with a proper utility function. Here I will describe my AIXI+utility function model, address concerns about restricting the model to bounded or finite utility, and analyze some of the implications of modifiable utility functions, e.g. wireheading and dynamic consistency. Comments, questions and advice (especially about related research and material) will be highly appreciated.

Introduction to AIXI

Marcus Hutter's (2003) universal agent AIXI addresses the problem of rational action in a (partially) unknown computable universe, given infinite computing power and a halting oracle. The agent interacts with its environment in discrete time cycles, producing an action-perception sequence $y_1 x_1 y_2 x_2 \ldots$ with actions (agent outputs) $y_i$ and perceptions (environment outputs) $x_i$ chosen from finite sets $\mathcal{Y}$ and $\mathcal{X}$. The perceptions are pairs $x_i = (o_i, r_i)$, where $o_i$ is the observation part and $r_i$ denotes a reward. At time k the agent chooses its next action $y_k$ according to the expectimax principle:

Here M denotes the updated Solomonoff prior summing over all programs $q$ that are consistent with the history $\dot{y}\dot{x}_{<k}$ [1] and which will, when run on the universal Turing machine T with successive inputs $y_1 \ldots y_{m_k}$, compute outputs $x_1 \ldots x_{m_k}$, i.e.

AIXI is a dualistic framework in the sense that the algorithm that constitutes the agent is not part of the environment, since it is not computable. Even considering that any running implementation of AIXI would have to be computable, AIXI accurately simulating AIXI accurately simulating AIXI ad infinitum doesn't really seem feasible. Potential consequences of this separation of mind and matter include difficulties the agent may have predicting the effects of its actions on the world.
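In the notation that is standard for AIXI, the expectimax rule and the prior M described in this section can be sketched as follows. Here $y$ are actions and $x$ perceptions; the horizon symbol $m_k$, the program length $\ell(q)$ and the exact subscripts are assumptions taken from Hutter's usual presentation, not necessarily the notation of this post.

```latex
\[
\dot{y}_k = \arg\max_{y_k} \sum_{x_k} \max_{y_{k+1}} \sum_{x_{k+1}} \cdots
            \max_{y_{m_k}} \sum_{x_{m_k}}
            \bigl( r(x_k) + \cdots + r(x_{m_k}) \bigr)\,
            M(\dot{y}\dot{x}_{<k}\, y\underline{x}_{k:m_k})
\]
\[
M(\dot{y}\dot{x}_{<k}\, y\underline{x}_{k:m_k})
  = \sum_{q \,:\, T(q,\, y_{1:m_k}) = x_{1:m_k}} 2^{-\ell(q)}
\]
```

Read from the inside out: the innermost factor weighs each complete history by its prior mass, each sum averages over the next perception, and each max picks the best next action.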

Utility vs rewards

So, why is it a bad idea to work with a reward system? Say the AIXI agent is rewarded whenever a human called Bob pushes a button. Then a sufficiently smart AIXI will figure out that instead of furthering Bob’s goals it can also threaten or deceive Bob into pushing the button, or get another human to replace Bob. On the other hand, if the reward is computed in a little box somewhere and then displayed on a screen, it might still be possible to reprogram the box or find a side channel attack. Intuitively you probably wouldn't even blame the agent for doing that -- people try to game the system all the time. 

You can visualize AIXI's computation as maximizing bars displayed on this screen; the agent is unable to connect the bars to any pattern in the environment, they are just there. It wants them to be as high as possible and it will utilize any means at its disposal. For a more detailed analysis of the problems arising through reinforcement learning, see Dewey (2011).

Is there a way to bind the optimization process to actual patterns in the environment? To design a framework in which the screen informs the agent about the patterns it should optimize for? The answer is, yes, we can just define a utility function

that assigns a value  to every possible future history  and use it to replace the reward system in the agent specification:

When I say "we can just define" I am actually referring to the really hard question of how to recognize and describe the patterns we value in the universe. Contrasted with the necessity to specify rewards in the original AIXI framework, this is a strictly harder problem, because the utility function has to be known ahead of time and the reward system can always be represented in the framework of utility functions by setting

For the same reasons, this is also a strictly safer approach.
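The replacement of rewards by a utility function can be illustrated on a toy problem. The sketch below is only a stand-in under loud assumptions: it swaps the uncomputable Solomonoff prior for a hypothetical two-environment mixture (`MIXTURE`), uses a horizon of two cycles, and the names `reward_utility` and `pattern_utility` are invented for illustration. It runs the same expectimax recursion twice, once with summed rewards as the utility and once with a utility that values one particular pattern in the history.

```python
ACTIONS = ("a", "b")
PERCEPTS = (0, 1)
HORIZON = 2  # hypothetical horizon: two action-perception cycles


def env_echo(actions):
    """Toy environment: emits 1 exactly when the latest action is 'a'."""
    return 1 if actions[-1] == "a" else 0


def env_flip(actions):
    """Toy environment: emits 1 exactly when the latest action is 'b'."""
    return 0 if actions[-1] == "a" else 1


# Assumption: a finite weighted mixture stands in for the Solomonoff prior M.
MIXTURE = ((0.75, env_echo), (0.25, env_flip))


def mixture_weight(history):
    """Total weight of the environments consistent with the whole history."""
    actions = tuple(a for a, _ in history)
    total = 0.0
    for weight, env in MIXTURE:
        if all(env(actions[: i + 1]) == x for i, (_, x) in enumerate(history)):
            total += weight
    return total


def value(history, utility):
    """Expectimax: max over actions, sum over perceptions, utility * M at the horizon."""
    if len(history) == HORIZON:
        return utility(history) * mixture_weight(history)
    return max(
        sum(value(history + ((a, x),), utility) for x in PERCEPTS)
        for a in ACTIONS
    )


def best_action(history, utility):
    """Action whose expectimax value is largest (ties go to the first action)."""
    return max(
        ACTIONS,
        key=lambda a: sum(value(history + ((a, x),), utility) for x in PERCEPTS),
    )


def reward_utility(history):
    """Rewards as a special case of utility: here the perception doubles as the reward."""
    return sum(x for _, x in history)


def pattern_utility(history):
    """Values one specific pattern in the history: perception 0 followed by 1."""
    return 1.0 if tuple(x for _, x in history) == (0, 1) else 0.0


if __name__ == "__main__":
    print(best_action((), reward_utility))
    print(best_action((), pattern_utility))
```

The only thing that changes between the two runs is the function passed in as `utility`, which is the sense in which the reward system is one utility function among many.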

Infinite utility

The original AIXI framework must necessarily place upper and lower bounds on the rewards that are achievable, because the rewards are part of the perceptions and $\mathcal{X}$ is finite. The utility function approach does not have this problem, as the expected utility 

is always finite as long as we stick to a finite set of possible perceptions, even if the utility function is not bounded. Relaxing this constraint, so that $\mathcal{X}$ may be infinite and the utility unbounded, leads to divergence of expected utility (for a proof see de Blanc 2008). This closely corresponds to the question of how to be a consequentialist in an infinite universe, discussed by Bostrom (2011). The underlying problem here is that (using the standard approach to infinities) these expected utilities will become incomparable. One possible solution to this problem could be to use a subfield of the surreal numbers larger than $\mathbb{R}$, my favorite[2] so far being the Levi-Civita field generated by the infinitesimal $\varepsilon$:

with the usual power-series addition and multiplication. Levi-Civita numbers can be written and approximated as 

(see Berz 1996), which makes them suitable for representation on a computer using floating point arithmetic. If we allow the range of our utility function to be the Levi-Civita field, we gain the possibility of generalizing the framework to work with an infinite set of possible perceptions, therefore allowing for continuous parameters. We also allow for a much broader set of utility functions, no longer excluding the assignment of infinite (or infinitesimal) utility to a single event. I recently met someone who argued convincingly that his (ideal) utility function assigns infinite negative utility to every time instance that he is not alive, therefore making him prefer life to any finite but huge amount of suffering.

Note that finiteness of $\mathcal{Y}$ is still needed to guarantee the existence of actions with maximal expected utility, and the finite (but dynamic) horizon $m_k$ remains a very problematic assumption, as described in Legg (2008).
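To make the idea of computing with infinite and infinitesimal utilities concrete, here is a minimal sketch of truncated Levi-Civita arithmetic. Everything in it is an assumption made for illustration: the class name `LeviCivita`, the truncation depth `KEEP`, and the choice of exact `Fraction` exponents with plain numeric coefficients (floats would work the same way). A number is stored as a finite map from exponents of the infinitesimal to coefficients, and its sign is decided by the term with the lowest exponent.

```python
from fractions import Fraction


class LeviCivita:
    """Truncated sum of coeff * eps**exponent, with eps a positive infinitesimal.

    Only the KEEP lowest exponents are stored; those are the dominant terms.
    """

    KEEP = 4  # hypothetical truncation depth

    def __init__(self, terms):
        nonzero = sorted(
            (Fraction(q), a) for q, a in dict(terms).items() if a != 0
        )
        self.terms = dict(nonzero[: self.KEEP])

    def __add__(self, other):
        out = dict(self.terms)
        for q, a in other.terms.items():
            out[q] = out.get(q, 0) + a
        return LeviCivita(out)

    def __mul__(self, other):
        # Power-series product: exponents add, coefficients multiply.
        out = {}
        for q1, a1 in self.terms.items():
            for q2, a2 in other.terms.items():
                out[q1 + q2] = out.get(q1 + q2, 0) + a1 * a2
        return LeviCivita(out)

    def scale(self, c):
        """Multiply by an ordinary real number c."""
        return LeviCivita({q: c * a for q, a in self.terms.items()})

    def sign(self):
        """Sign of the leading (lowest-exponent) coefficient; 0 for the zero number."""
        if not self.terms:
            return 0
        lead = self.terms[min(self.terms)]
        return 1 if lead > 0 else -1

    def __lt__(self, other):
        return (other + self.scale(-1)).sign() > 0


def real(c):
    """Embed an ordinary real number."""
    return LeviCivita({0: c})


EPS = LeviCivita({1: 1})     # infinitesimal
INF = LeviCivita({-1: 1})    # its inverse, an infinite number
DEATH = LeviCivita({-1: -1})  # infinite negative utility

# A 1-in-1000 chance of the infinitely bad outcome versus certain, huge but finite suffering.
risky = DEATH.scale(Fraction(1, 1000)) + real(0)
suffering = real(-10**9)

if __name__ == "__main__":
    print(risky < suffering)
```

Under these assumptions any positive probability of the infinitely bad outcome ranks below every finite amount of suffering, which mirrors the preference described above. Note that truncation makes sums and products approximate once more than `KEEP` terms are involved.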

Modifiable utility functions

Any implementable approximation of AIXI implies a weakening of the underlying dualism. Now the agent's hardware is part of the environment and at least in the case of a powerful agent, it can no longer afford to neglect the effect its actions may have on its source code and data. One question that has been asked is whether AIXI can protect itself from harm. Hibbard (2012) shows that an agent similar to the one described above, equipped with the ability to modify its policy responsible for choosing future actions, would not do so, given that it starts out with the (meta-)policy to always use the optimal policy, and the additional constraint to change only if that leads to a strict improvement. Ring and Orseau (2011) study under which circumstances a universal agent would try to tamper with the sensory information it receives. They introduce the concept of a delusion box, a device that filters and distorts the perception data before it is written into the part of the memory that is read during the calculation of utility. 

A further complication to take into account is the possibility that the part of memory that contains the utility function may get rewritten, either by accident, by deliberate choice (programmers trying to correct a mistake), or in an attempt to wirehead. To analyze this further we will now consider what can happen if the screen flashes different goals in different time cycles. Let 

denote the utility function the agent will have at time k.

Even though we will only analyze instances in which the agent knows at time k, which utility function  it will have at future times  (possibly depending on the actions  before that), we note that for every fixed future history  the agent knows the utility function  that is displayed on the screen because the screen is part of its perception data .

This leads to three different agent models worthy of further investigation:

  • Agent 1 will optimize for the goals that are displayed on the screen right now and act as if it would continue to do so in the future. We describe this with the utility function   
  • Agent 2 will try to anticipate future changes to its utility function and maximize the utility it experiences at every time cycle as shown on the screen at that time. This is captured by 
  • Agent 3 will, at time k, try to maximize the utility it derives in hindsight, displayed on the screen at the time horizon  

Of course arbitrary mixtures of these are possible.

The type of wireheading that is of interest here is captured by the Simpleton Gambit described by Orseau and Ring (2011), a Faustian deal that offers the agent maximal utility in exchange for its willingness to be turned into a Simpleton that always takes the same default action at all future times. We will first consider a simplified version of this scenario: the Simpleton future, where the agent knows for certain that it will be turned into a Simpleton at time k+1, no matter what it does in the remaining time cycle. Assume that for all possible action-perception combinations the utility given by the current utility function is not maximal, i.e.   holds for all . Assume further that the agent's actions influence the future outcomes, at least from its current perspective. That is, for all  there exist   with . Let  be the Simpleton utility function, assigning equal but maximal utility  to all possible futures. While Agent 1 will optimize as before, not adapting its behavior to the knowledge that its utility function will change, Agent 3 will be paralyzed, having to rely on whatever method its implementation uses to break ties. Agent 2 on the other hand will try to maximize only the utility .

Now consider the actual Simpleton Gambit: At time k the agent gets to choose between changing, , resulting in  and  (not changing), leading to  for all . We assume that  has no further effects on the environment. As before, Agent 1 will optimize for business as usual, whether or not it chooses to change depends entirely on whether the screen specifically mentions the memory pointer to the utility function or not.

Agent 2 will change if and only if the utility of changing compared to not changing according to what the screen currently says is strictly smaller than the comparative advantage of always having maximal utility in the future. That is,

is strictly less than

This seems quite analogous to humans, who sometimes tend to choose maximal bliss over future optimization power, especially if the optimization opportunities are meager anyhow. Many people do seem to choose their goals so as to maximize the happiness felt by achieving them at least some of the time; this is also advice that I have frequently encountered in self-help literature, e.g. here. Agent 3 will definitely change, as it only evaluates situations using its final utility function.

Comparing the three proposed agents, we notice that Agent 1 is dynamically inconsistent: it will optimize for future opportunities that it predictably will not take later. Agent 3 on the other hand will wirehead whenever possible (and we can reasonably assume that opportunities to do so will exist in even moderately complex environments). This leaves us with Agent model 2 and I invite everyone to point out its flaws.
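The verbal descriptions of the three agents in the Simpleton Gambit can be turned into a small deterministic toy. This is a sketch under stated assumptions, not the post's formal definitions: the numbers in `current_value`, the three-cycle horizon and all function names are hypothetical, and each agent simply scores the two choices "keep" and "change" according to which screen it consults.

```python
U_MAX = 1.0  # maximal utility, shown by the Simpleton utility function for every future


def screen_value(choice, step, current_value):
    """Utility shown on the screen at `step` (0 = now) for the future that follows `choice`.

    After choosing "change", every later screen shows the Simpleton utility U_MAX.
    """
    if choice == "change" and step > 0:
        return U_MAX
    return current_value[choice]


def agent1(choice, current_value, steps_left):
    """Judges the future only by the goals on the screen right now."""
    return screen_value(choice, 0, current_value)


def agent2(choice, current_value, steps_left):
    """Adds up the utility experienced at every cycle, as shown at that cycle."""
    return sum(screen_value(choice, s, current_value) for s in range(steps_left))


def agent3(choice, current_value, steps_left):
    """Judges in hindsight, by the screen at the horizon."""
    return screen_value(choice, steps_left - 1, current_value)


def decide(agent, current_value, steps_left=3):
    """Pick the higher-scoring choice; ties go to "keep"."""
    return max(
        ("keep", "change"),
        key=lambda c: agent(c, current_value, steps_left),
    )


if __name__ == "__main__":
    # Made-up values the current screen assigns to the future after each choice.
    modest = {"keep": 0.6, "change": 0.2}
    for agent in (agent1, agent2, agent3):
        print(agent.__name__, decide(agent, modest))
```

With these made-up numbers Agent 1 keeps its utility function while Agents 2 and 3 accept the gambit. Making the change costly enough under the current screen (for example `{"keep": 0.9, "change": -1.0}`) flips Agent 2 back to keeping, whereas Agent 3 still changes as long as the current utility is not already maximal.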

[1] Dotted actions/ perceptions, like  denote past events, underlined perceptions  denote random variables to be observed at future times.

[2] Bostrom (2011) proposes using hyperreal numbers, which rely heavily on the axiom of choice for the ultrafilter to be used and I don't see how those could be implemented.

Meetup : Chicago Organizational Meeting and Open Discussion

2 Nic_Smith 14 November 2012 01:45AM

Discussion article for the meetup : Chicago Organizational Meeting and Open Discussion

WHEN: 17 November 2012 01:00:00PM (-0600)

WHERE: 360 N. Michigan Ave., Chicago, IL

"This is an organizational meeting planned for every other month to discuss the group's progress and ideas for possible future meetups. Afterwards, we will have open discussion..." -- http://www.meetup.com/Less-Wrong-Chicago/events/83542572/

This location is the same Corner Bakery that we previously had the weekly meetings at.

Meetup : Berkeley meetup: Deliberate performance

1 Nisan 13 November 2012 11:58PM

Discussion article for the meetup : Berkeley meetup: Deliberate performance

WHEN: 14 November 2012 07:00:00PM (-0800)

WHERE: Berkeley, CA

This week's Berkeley meetup will be a structured discussion about getting better at skills. One way to get better at skills (like aikido) is deliberate practice. Another way to get better at skills (like making friends) is deliberate performance, which is what you do when practicing isn't an option. (If you meet an especially cool person, you don't want to try your risky bold untested friend-getting technique, because you actually want to make friends with them.) Very generally, the tools are

  • making predictions
  • experimenting
  • analyzing what happened
  • doing post-mortems on near-misses
  • getting experts to critique your technique

This is horribly abstract; at the meetup we'll have a structured discussion about how this could actually work in practice, and what we have been actually doing to become skilled. You don't have to read these linked articles to attend the meetup,



but if you've read this book I really hope you attend:


Doors open at 7pm, and the meetup begins at 7:30pm. For directions to Zendo, see the mailing list:


or call me at:


Group rationality diary, 11/13/12

1 cata 13 November 2012 06:39PM

This is the public group instrumental rationality diary for the week of October 29th. It's a place to record and chat about it if you have done, or are actively doing, things like:

  • Established a useful new habit
  • Obtained new evidence that made you change your mind about some belief
  • Decided to behave in a different way in some set of situations
  • Optimized some part of a common routine or cached behavior
  • Consciously changed your emotions or affect with respect to something
  • Consciously pursued new valuable information about something that could make a big difference in your life
  • Learned something new about your beliefs, behavior, or life that surprised you
  • Tried doing any of the above and failed

Or anything else interesting which you want to share, so that other people can think about it, and perhaps be inspired to take action themselves.  Try to include enough details so that everyone can use each other's experiences to learn about what tends to work out, and what doesn't tend to work out.

Thanks to everyone who contributes!

Previous diary; archive of prior diaries.

CFAR and SI MOOCs: a Great Opportunity

13 Wrongnesslessness 13 November 2012 10:30AM

Massive open online courses seem to be marching towards total world domination like some kind of educational singularity (at least in the case of Coursera). At the same time, there are still relatively few courses available, and each newly added course is a small event in the growing MOOC community.

Needless to say, this seems like a perfect opportunity for SI and CFAR to advance their goals via this new education medium. Some people seem to have already seen the potential and taken advantage of it:

One interesting trend that can be seen is companies offering MOOCs to increase the adoption of their tools/technologies. We have seen this with 10gen offering Mongo courses and to a lesser extent with Coursera’s ‘Functional Programming in Scala’ taught by Martin Odersky.

(from the above link to the Class Central Blog)


So the question is, are there any online courses already planned by CFAR and/or SI? And if not, when will it happen?


Edit: This is not a "yes or no" question, albeit formulated as one. I've searched the archives and did not find any mention of MOOCs as a potentially crucial device for spreading our views. If any such courses are already being developed or at least planned, I'll be happy to move this post to the open thread, as some have requested, or delete it entirely. If not, please view this as a request for discussion and brainstorming.

P.S.: Sorry, I don't have the time to write a good article on this topic.

[SEQ RERUN] Emulations Go Foom

1 MinibearRex 13 November 2012 05:44AM

Today's post, Emulations Go Foom, was originally published on November 22, 2008. A summary:


A description of what Robin Hanson thinks is the most likely scenario for an intelligence takeoff.

Discuss the post here (rather than in the comments to the original post).

This post is part of the Rerunning the Sequences series, where we'll be going through Eliezer Yudkowsky's old posts in order so that people who are interested can (re-)read and discuss them. The previous post was Life's Story Continues, and you can use the sequence_reruns tag or rss feed to follow the rest of the series.

Short introductory materials for a rationality meetup

3 Dolores1984 13 November 2012 05:10AM

So, a few other people and I are starting a Bayesian Conspiracy chapter at my university (New Mexico Tech).  We're trying to put together a short (three-page) introductory packet to give to new members.  We'd like the packet to introduce people to what rationality is, what it's useful for, and some of the basic techniques.  We'd like it to be as readable and palatable as possible, to avoid the intimidation factor of simply pointing people at the Sequences, which are not particularly friendly to a casual reader.  

I'm compiling some materials of my own for this purpose, but before I get too excited, I thought I ought to check if any of the other meetups had or knew of something along these lines already created.  If not, we'll post our packet on our website for other meetups to use as they see fit.  
