

Individual Deniability, Statistical Honesty

43 Alicorn 09 August 2011 04:17AM

If you have a lot of people to question about something, and they have a motivation to lie, consider this clever use of a six-sided die.

If the farmer tossed the die and got a one, they had to respond "yes" to the surveyor's question. If they got a six, they had to say "no." The rest of the time, they were asked to answer honestly. The die was hidden from the person conducting the survey, so the surveyor never knew which number the farmer had rolled.

Suddenly, "yes" responses to the leopard question started coming in at more than the one-sixth rate the die alone would force.
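For concreteness, here is a minimal sketch of how a surveyor could back out the true "yes" rate from such randomized responses. It assumes a fair six-sided die, full compliance with the protocol above, and made-up survey numbers; none of these figures come from the original article.

    # Sketch: recovering the honest "yes" rate from randomized responses.
    # Protocol assumed: a roll of 1 forces "yes", a roll of 6 forces "no",
    # and rolls of 2-5 mean the respondent answers honestly.

    def estimate_true_yes_rate(yes_count: int, total: int) -> float:
        """Invert P(yes) = 1/6 + (4/6) * p to recover p, the honest "yes" rate."""
        observed = yes_count / total
        p = (observed - 1 / 6) / (4 / 6)
        return min(max(p, 0.0), 1.0)  # clamp sampling noise into [0, 1]

    # Hypothetical numbers: 300 farmers surveyed, 110 "yes" answers.
    print(f"Estimated true rate: {estimate_true_yes_rate(110, 300):.1%}")

The trick is that any individual "yes" can always be blamed on the die, so no single respondent incriminates themselves, while the aggregate still pins down the true rate.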

A Gameplay Exploration of Yudkowsky's "Twelve Virtues"

43 ac3raven 18 May 2011 06:56PM

Hello Less Wrong, this is my first post (kind of).  I belong to a small game development company called Shiny Ogre Games.  We have a vested interest in making games that, as Jonathan Blow puts it, "speak to the human condition."  I am here to announce our next project for you.

In this announcement of Shiny Ogre's next project, there are two points to address.  Firstly:

Thought is a process like any other. The methods by which we think can be identified, specified, defined, categorized and even predicted.  One method of thinking that has been thoroughly defined is rationality.  Many would consider rationality (i.e. the careful exercise of reason) to be an essential path toward enlightenment (hence this).

Secondly: The objective, logical, and mechanical approach to reason that rationality takes meshes nicely with game development, because any well-defined system can be turned into a game.  A game is a system composed of players making decisions while considering objectives, governed by a rule set.

Where there is no decision there can be no game.  Where decisions matter, a game can make them matter more.

Therefore, rationality is a core component of game playing.

Games are learning tools.  They are perhaps the best learning tool available to humans, because they invoke our biological tendency to play.

With that said, our announcement:

We're making a video game about rationality.

The game will explore rationality in the context of Eliezer Yudkowsky's "Twelve Virtues of Rationality" (which we have permission for).  From a narrative perspective the game takes place inside a mind on the brink of epiphany and will heavily feature themes from Plato's "Allegory of the Cave".

Yudkowsky's twelve virtues are the basis of the game's twelve levels, each of which features its virtue in metaphorical form.  The underlying message here is that if you master all of the twelve virtues (by completing all of the twelve levels), you will achieve 'epiphany'.

The game is a 2D side-scrolling puzzle-platformer.  The player assumes the role of a figure that represents his/her own conscious mind while it constructs machines (à la The Incredible Machine) that are a metaphor for the thoughts and concepts that one would create while meditating on a complex problem.

We will update our progress and share development information on our website here, as well as with posts on Less Wrong, our twitter account, and the game's website.

You can expect discussions of design decisions for this project to be written frequently from the angle of game design theory.  We may also release a small documentary film of the development process after the release of the game.

A release date has been set (and it's not too long from now), but I don't want to announce it just yet.

Here is some concept art for our Curiosity metaphor (you can view more art at our website linked above):

If you're interested, just upvote and/or comment.  If you have any specific queries related to this project or about game design in general, it would be cool if you went here.

We will be sharing our progress as we make this game over the next few months.  So pay attention to Less Wrong and/or shinyogre.com for updates.

Thanks!

 

A Rationalist's Account of Objectification?

43 lukeprog 19 March 2011 11:10PM

I'm seeking some feminist consciousness-raising, and I'm hoping some LWers (Alicorn?) can help.

Specifically, I've never understood why "objectification" is wrong.

I'm a tall white American male, so sometimes it takes a bit of work for me to understand what it's like to be a member of a suppressed group. I still need regular training in avoiding sexist language, etc.

First: my background. When I was 10ish I encountered the word "feminism" for the first time. I asked my mom what the word meant.

She said, "It's the idea that women should have the same rights and privileges as men do."

And I thought, "They have a word for that?" It seemed too obvious to deserve its own word. It felt like having a special word for the idea that left-handers and right-handers should have the same rights and privileges.

So I've always thought of myself as a feminist.

Of course, some activists (the word has positive connotations to me, BTW) pushed too far, as is the case in all large movements. At some times and places (1980s academia, I think), it was common to assert that there are almost no (average) significant differences between men and women that aren't caused by enculturation, except for genitalia. That is of course false. Hormones matter, especially during development.

Such overreaches made it psychologically easier for some non-feminists to dismiss legitimate feminist demands and resist thousands of much-needed feminist advances (which are still ongoing).

Now, on this matter of objectification. I've never understood it. I've tried to get people to explain it to me before, but they were (apparently) not well-trained in rationality. I'm hoping a rationalist can explain it to me.

Here's my confusion about objectification. Depending on what you mean by "objectification," it seems to be either something that (1) is very often perfectly acceptable, or that (2) means something very narrow and is usually not being exemplified when there is an accusation of it being exemplified.

Let me explain.

Earlier, when I tried to figure out what "objectification" was and why it was wrong, the leading article on the topic seemed to be one by philosopher Martha Nussbaum. She lays out the goal of her paper like this:

I shall argue that there are at least seven distinct ways of behaving introduced by the term, none of which implies any of the others, though there are many complex connections among them. Under some specifications, objectification… is always morally problematic. Under other specifications, objectification has features that may be either good or bad, depending on the overall context… Some features of objectification… may in fact in some circumstances… be either necessary or even wonderful features of sexual life.

Using examples, she then outlines seven ways to treat a person as a thing. Rae Langton added three more in 2009, bringing the total count to 10 ways to treat a person as a thing:

  1. Instrumentality. The objectifier treats the object as a tool of his or her purposes.
  2. Denial of autonomy. The objectifier treats the object as lacking in autonomy and self-determination.
  3. Inertness. The objectifier treats the object as lacking in agency, and perhaps also in activity.
  4. Fungibility. The objectifier treats the object as interchangeable (a) with other objects of the same type and/or (b) with objects of other types.
  5. Violability. The objectifier treats the object as lacking in boundary integrity, as something that it is permissible to break up, smash, break into.
  6. Ownership. The objectifier treats the object as something that is owned by another, can be bought or sold, etc.
  7. Denial of subjectivity. The objectifier treats the object as something whose experience and feelings (if any) need not be taken into account.
  8. Reduction to body: treatment of a person as identified with their body, or body parts.
  9. Reduction to appearance: treatment of a person primarily in terms of how they look.
  10. Silencing: the treatment of a person as if they lack the capacity to speak.

Consider a classic example of objectification from Playboy magazine: a photo of a female tennis player bending over, revealing her butt, above the caption "Why We Love Tennis."

The Playboy image exhibits at least seven features of objectification: instrumentalization, denial of autonomy, fungibility, denial of subjectivity, reduction to body, reduction to appearance, and silencing!

But, let's consider another example of objectification, what I'll call the Muddy People photo:

To us, these people are nothing but objects of our entertainment and pleasure. We have instrumentalized them. Moreover, they are fungible. It does not matter to us which people are covered in mud and looking silly. And just as with the Playboy example, this photo involves a denial of autonomy. Indeed, it is doubtful the permission to publish their photos was obtained. Moreover, we are not much interested in the feelings of these people but only their role in entertaining us as we gaze upon their mud-caked bodies – a denial of subjectivity. Often, nothing of these mud-covered people can be seen or known except their bodies – in many cases, only body parts, sticking every which way. This is the reduction to body. There is also clearly a reduction to appearance. Their mud-covered appearance is their only interest to us. In many cases, the emotions they might be having are totally obscured by the mud covering their faces. They are also, of course, silent to us.

So all the features of objectification found in the Playboy example, which we might feel is wrong somehow, are also shared by the Muddy People photo, which we probably feel is acceptable. Perhaps this suggests that our feelings are poor guides to moral truth. Or maybe what is wrong with the Playboy photo is something other than objectification.

Of course, there are disanalogies to be found. The Playboy example (especially with the caption) involved sexuality, and the Muddy People photo does not particularly do so. But if this is the line of thought that leads us to condemn Playboy but not the Muddy People photo, then we are bringing in another concept besides objectification.

For example, perhaps we want to say that Playboy's objectifications harm women by contributing to a culture of sexual prejudice, but the Muddy People objectifications do not cause any such harm. But then we are not appealing to this Kantian notion of "objectification." Rather, we are appealing to utilitarian principles. (Feminist philosopher Lina Papadaki makes similar objections to the notion of objectification.)

We all use each other as means to an end, or as objects of one kind or another, all the time. And we can do so while respecting their autonomy. I enjoy looking at the shapes and textures in the Muddy People photo while also respecting that the people whose bodies make up those shapes and textures are autonomous individuals of great value. But their value as individuals is not the point of the photo. The point of the photo, in this case, is that it's an interesting picture to look at. And that's okay, I think.

Good romantic partners use each other as a means to their own gratification while also respecting each other's autonomy. We use each other as sex objects, as emotion objects, as conversation objects, as knowledge objects, as carpool objects, and as other objects, all the time - while also respecting each other's autonomy and value. It's not clear to me what's wrong with that.

So if something like Nussbaum's analysis of "objectification" is what is meant by the term, then I don't see what's wrong with it. But if it means something much more narrow (what? I don't know), then I doubt it is exemplified nearly as often as people are accused of exemplifying it.

I reject Kant's epistemology, logic, and metaphysics - as I think any scientifically-informed person should. But even if you do accept all three, I still don't see what's intrinsically wrong with objectification as Nussbaum defines it.

Maybe I'm being dense. That has happened before. I'm not posting this with much confidence that objectification is a mostly useless concept. I'm posting this in pursuit of some consciousness-raising.

Understanding the problem is the first step toward fixing it. And right now I don't understand the problem. So if you have the time, please teach me.

Thanks.

 

Update: below, I'll keep an updated list of the most useful articles I've found so far.

An Overview of Formal Epistemology (links)

43 lukeprog 06 January 2011 07:57PM

The branch of philosophy called formal epistemology has very similar interests to those of the Less Wrong community. Formal epistemologists mostly work on (1) mathematically formalizing concepts related to induction, belief, choice, and action, and (2) arguing about the foundations of probability, statistics, game theory, decision theory, and algorithmic learning theory.

Those who value the neglected virtue of scholarship may want to study for themselves the arguments that have led scholars toward or away from the very particular positions on formalizing language, decision theory, explanation, and probability typically endorsed at Less Wrong. As such, here's a brief overview of the field by way of some helpful links:

Enjoy.

 

Astronomy, Astrobiology, & The Fermi Paradox I: Introductions, and Space & Time

42 CellBioGuy 26 July 2015 07:38AM

This is the first in a series of posts I am putting together on a personal blog I just started two days ago as a collection of my musings on astrobiology ("The Great A'Tuin" - sorry, I couldn't help it), and will be reposting here.  Much has been written here about the Fermi paradox and the 'great filter'.   It seems to me that going back to a somewhat more basic level of astronomy and astrobiology is extremely informative to these questions, and so this is what I will be doing.  The bloggery is intended for a slightly more general audience than this site (hence much of the content of the introduction) but I think it will be of interest.  Many of the points I will be making are ones I have touched on in previous comments here, but hope to explore in more detail.

This post references my first two posts - an introduction, and a discussion of our apparent position in space and time in the universe.  The blog posts may be found at:

http://thegreatatuin.blogspot.com/2015/07/whats-all-this-about.html

http://thegreatatuin.blogspot.com/2015/07/space-and-time.htm

CFAR fundraiser far from filled; 4 days remaining

42 AnnaSalamon 27 January 2015 07:26AM

We're 4 days from the end of our matching fundraiser, and still only about 1/3rd of the way to our target (and to the point where pledged funds would cease being matched).

If you'd like to support the growth of rationality in the world, do please consider donating, or ask me any questions you may have.  I'd love to talk.  I suspect funds donated to CFAR between now and Jan 31 are quite high-impact.

As a random bonus, I promise that if we meet the $120k matching challenge, I'll post at least two posts with some never-before-shared (on here) rationality techniques that we've been playing with around CFAR.

What science needs

42 PhilGoetz 02 December 2012 10:31PM

Science does not need more scientists.  It doesn't even need you, brilliant as you are.  We already have many times more brilliant scientists than we can fund.  Science could use a better understanding of the scientific method, but improving how individuals do science would not address most of the problems I've seen.

The big problems facing science are organizational problems.  We don't know how to identify important areas of study, or people who can do good science, or good and important results.  We don't know how to run a project in a way that makes correct results likely.  Improving the quality of each person on the project is not the answer.  The problem is the system.  We have organizations and systems that take groups of brilliant scientists, and motivate them to produce garbage.


Blogs by LWers

42 [deleted] 22 June 2012 08:53AM

Related to: Wikifying the Blog List

LessWrong posters and readers are generally pretty cool people. Maybe they are interesting bloggers too. And I'm not just talking about rationalist material that we'd ideally like to see cross-posted on LessWrong; no, gardening blogs are also fair game. I'm making this a discussion-level post so more people can see the list. Please share links to blogs by former or current LWers. Surely the authors wouldn't mind; who wouldn't like more readers? Original list here.

Anyone who wants to suggest a new blog for the list please follow this link.

Blogs by LWers:

List of LWers on Twitter


Note: Anyone just digging for interesting blogs to read, who doesn't care whether they are written by LWers or not, should check out this thread or maybe this one. Did you guys know we have a wiki article with external resources? We do. Check that out as well. Maybe once we figure out which LWer blogs related to rationality on this list are particularly good we can add a few of them there too.

Hofstadter's Superrationality

42 gwern 21 April 2012 01:33PM

Possibly the main and original inspiration for Yudkowsky's various musings on what advanced game theories should do (eg. cooperate in the Prisoner's Dilemma) is a set of essays penned by Douglas Hofstadter (of Gödel, Escher, Bach) in 1983. Unfortunately, they were not online and were only available as part of a dead-tree collection. Fortunately the collection is available through the usual pirates as a scan, and I took the liberty of transcribing by hand the relevant essays with images, correcting errors, annotating with links, etc: http://www.gwern.net/docs/1985-hofstadter

The 3 essays:

  1. discusses the Prisoner's Dilemma, the misfortune of defection, and what sort of cooperative reasoning would maximize returns in a souped-up Prisoner's Dilemma, and then offers a public contest
  2. reports the results of the contest, with a discussion of ecology and the tragedy of the commons
  3. finally, gives an extended parable about cooperation in the face of nuclear warfare; it is fortunate for us that it applies to most existential threats as well

I hope you find them educational. I am not 100% confident of the math transcriptions since the original ebook messed some of them up; if you find any apparent mistakes or typos, please leave comments.

Stupid Questions Open Thread

42 Costanza 29 December 2011 11:23PM

This is for anyone in the LessWrong community who has made at least some effort to read the sequences and follow along, but is still confused on some point, and is perhaps feeling a bit embarrassed. Here, newbies and not-so-newbies are free to ask very basic but still relevant questions with the understanding that the answers are probably somewhere in the sequences. Similarly, LessWrong tends to presume a rather high threshold for understanding science and technology. Relevant questions in those areas are welcome as well.  Anyone who chooses to respond should respectfully guide the questioner to a helpful resource, and questioners should be appropriately grateful. Good faith should be presumed on both sides, unless and until it is shown to be absent.  If a questioner is not sure whether a question is relevant, ask it, and also ask if it's relevant.

Video Q&A with Singularity Institute Executive Director

42 lukeprog 10 December 2011 11:27AM

Overcoming the Curse of Knowledge

42 JesseGalef 18 October 2011 05:39PM

[crossposted at Measure of Doubt]

What is the Curse of Knowledge, and how does it apply to science education, persuasion, and communication? No, it's not a reference to the Garden of Eden story. I'm referring to a particular psychological phenomenon that can make our messages backfire if we're not careful.

Communication isn't a solo activity; it involves both you and the audience. Writing a diary entry is a great way to sort out thoughts, but if you want to be informative and persuasive to others, you need to figure out what they'll understand and be persuaded by. A common habit is to use ourselves as a mental model - assuming that everyone else will laugh at what we find funny, agree with what we find convincing, and interpret words the way we use them. The model works to an extent - especially with people similar to us - but other times our efforts fall flat. You can present the best argument you've ever heard, only to have it fall on dumb - sorry, deaf - ears.

That's not necessarily your fault - maybe they're just dense! Maybe the argument is brilliant! But if we want to communicate successfully, pointing fingers and assigning blame is irrelevant. What matters is getting our point across, and we can't do it if we're stuck in our head, unable to see things from our audience's perspective. We need to figure out what words will work.

Unfortunately, that's where the Curse of Knowledge comes in. In 1990, Elizabeth Newton did a fascinating psychology experiment: She paired participants into teams of two: one tapper and one listener. The tappers picked one of 25 well-known songs and would tap out the rhythm on a table. Their partner - the designated listener - was asked to guess the song. How do you think they did?

Not well. Of the 120 songs tapped out on the table, the listeners only guessed 3 of them correctly - a measly 2.5 percent. But get this: before the listeners gave their answer, the tappers were asked to predict how likely their partner was to get it right. Their guess? Tappers thought their partners would get the song 50 percent of the time. You know, only overconfident by a factor of 20. What made the tappers so far off?

They lost perspective because they were "cursed" with the additional knowledge of the song title. Chip and Dan Heath use the story in their book Made to Stick to introduce the term:

 

"The problem is that tappers have been given knowledge (the song title) that makes it impossible for them to imagine what it's like to lack that knowledge. When they're tapping, they can't imagine what it's like for the listeners to hear isolated taps rather than a song. This is the Curse of Knowledge. Once we know something, we find it hard to imagine what it was like not to know it. Our knowledge has "cursed" us. And it becomes difficult or us to share our knowledge with others, because we can't readily re-create our listeners' state of mind."

 

So it goes with communicating complex information. Because we have all the background knowledge and understanding, we're overconfident that what we're saying is clear to everyone else. WE know what we mean! Why don't they get it? It's tough to remember that other people won't make the same inferences, have the same word-meaning connections, or share our associations.

It's particularly important in science education. The more time a person spends in a field, the more the field's obscure language becomes second nature. Without special attention, audiences might not understand the words being used - or worse yet, they might get the wrong impression.

Over at the American Geophysical Union blog, Callan Bentley gives a fantastic list of Terms that have different meanings for scientists and the public.

What great examples! Even though the scientific terms are technically correct in context, they're obviously the wrong ones to use when talking to the public about climate change. An inattentive scientist could know all the material but leave the audience walking away with the wrong message.

We need to spend the effort to phrase ideas in a way the audience will understand. Is that the same as "dumbing down" a message? After all, complicated ideas require complicated words and nuanced answers, right? Well, no. A real expert on a topic can give a simple distillation of material, identifying the core of the issue. Bentley did an outstanding job rephrasing technical, scientific terms in a way that conveys the intended message to the public.

That's not dumbing things down, it's showing a mastery of the concepts. And he was able to do it by overcoming the "curse of knowledge," seeing the issue from other people's perspective. Kudos to him - it's an essential part of science education, and something I really admire.

P.S. - By the way, I chose that image for a reason: I bet once you see the baby in the tree you won’t be able to ‘unsee’ it. (image via Richard Wiseman)

Q&A with Shane Legg on risks from AI

42 XiXiDu 17 June 2011 08:58AM

[Click here to see a list of all interviews]

I am emailing experts in order to raise awareness of risks from AI and to estimate how academics perceive those risks.

Below you will find some thoughts on the topic by Shane Legg, a computer scientist and AI researcher who has been working on theoretical models of super intelligent machines (AIXI) with Prof. Marcus Hutter. His PhD thesis, Machine Super Intelligence, was completed in 2008. He was awarded the $10,000 Canadian Singularity Institute for Artificial Intelligence Prize.

Publications by Shane Legg:

  • Solomonoff Induction thesis
  • Universal Intelligence: A Definition of Machine Intelligence paper
  • Algorithmic Probability Theory article
  • Tests of Machine Intelligence paper
  • A Formal Measure of Machine Intelligence paper talk slides
  • A Collection of Definitions of Intelligence paper
  • A Formal Definition of Intelligence for Artificial Systems abstract poster
  • Is there an Elegant Universal Theory of Prediction? paper slides

The full list of publications by Shane Legg can be found here.

The Interview:

Q1: Assuming no global catastrophe halts progress, by what year would you assign a 10%/50%/90% chance of the development of human-level machine intelligence?

Explanatory remark to Q1:

P(human-level AI by (year) | no wars ∧ no disasters ∧ beneficial political and economic development) = 10%/50%/90%

Shane Legg: 2018, 2028, 2050

Q2: What probability do you assign to the possibility of negative/extremely negative consequences as a result of badly done AI?

Explanatory remark to Q2:

P(negative consequences | badly done AI) = ?
P(extremely negative consequences | badly done AI) = ?

(Where 'negative' = human extinction; 'extremely negative' = humans suffer.)

Shane Legg: Depends a lot on how you define things. Eventually, I think human extinction will probably occur, and technology will likely play a part in this.  But there's a big difference between this being within a year of something like human level AI, and within a million years. As for the former meaning...I don't know.  Maybe 5%, maybe 50%. I don't think anybody has a good estimate of this.

If by suffering you mean prolonged suffering, then I think this is quite unlikely.  If a super intelligent machine (or any kind of super intelligent agent) decided to get rid of us, I think it would do so pretty efficiently. I don't think we will deliberately design super intelligent machines to maximise human suffering.

Q3: What probability do you assign to the possibility of a human-level AGI self-modifying its way up to massive superhuman intelligence within a matter of hours/days/< 5 years?

Explanatory remark to Q3:

P(superhuman intelligence within hours | human-level AI running at human-level speed equipped with a 100 Gigabit Internet connection) = ?
P(superhuman intelligence within days | human-level AI running at human-level speed equipped with a 100 Gigabit Internet connection) = ?
P(superhuman intelligence within < 5 years | human-level AI running at human-level speed equipped with a 100 Gigabit Internet connection) = ?

Shane Legg: "human level" is a rather vague term. No doubt a machine will be super human at some things, and sub human at others.  What kinds of things it's good at makes a big difference.

In any case, I suspect that once we have a human level AGI, it's more likely that it will be the team of humans who understand how it works that will scale it up to something significantly super human, rather than the machine itself. Then the machine would be likely to self improve.

How fast would that then proceed? Could be very fast, could be impossible -- there could be non-linear complexity constraints meaning that even theoretically optimal algorithms experience strongly diminishing intelligence returns for additional compute power. We just don't know.

Q4: Is it important to figure out how to make AI provably friendly to us and our values (non-dangerous), before attempting to solve artificial general intelligence?

Shane Legg: I think we have a bit of a chicken and egg issue here. At the moment we don't agree on what intelligence is or how to measure it, and we certainly don't agree on how a human level AI is going to work. So, how do we make something safe when we don't properly understand what that something is or how it will work? Some theoretical issues can be usefully considered and addressed. But without a concrete and grounded understanding of AGI, I think that an abstract analysis of the issues is going to be very shaky.

Q5: How much money is currently required to mitigate possible risks from AI (to be instrumental in maximizing your personal long-term goals, e.g. surviving this century), less/no more/little more/much more/vastly more?

Shane Legg: Much more. Though, similar to many charity projects, simply throwing more money at the problem is unlikely to help all that much, and it may even make things worse. I think the biggest issue isn't really financial, but cultural. I think this is going to change as AI progresses and people start to take the idea of human level AGI within their lifetimes more seriously.  Until that happens I think that the serious study of AGI risks will remain fringe.

Q6: Do possible risks from AI outweigh other possible existential risks, e.g. risks associated with the possibility of advanced nanotechnology?

Explanatory remark to Q6:

What existential risk (human extinction type event) is currently most likely to have the greatest negative impact on your personal long-term goals, under the condition that nothing is done to mitigate the risk?

Shane Legg: It's my number 1 risk for this century, with an engineered biological pathogen coming a close second (though I know little about the latter).

Q7:  What is the current level of awareness of possible risks from AI, relative to the ideal level?

Shane Legg: Too low...but it could well be a double edged sword: by the time the mainstream research community starts to worry about this issue, we might be risking some kind of arms race if large companies and/or governments start to secretly panic. That would likely be bad.

Q8:  Can you think of any milestone such that if it were ever reached you would expect human-level machine intelligence to be developed within five years thereafter?

Shane Legg: That's a difficult question! When a machine can learn to play a really wide range of games from perceptual stream input and output, and transfer understanding across games, I think we'll be getting close.

The Fundamental Question - Rationality computer game design

41 Kaj_Sotala 13 February 2013 01:45PM

I sometimes go around saying that the fundamental question of rationality is Why do you believe what you believe?

-- Eliezer in Quantum Non-Realism

I was much impressed when they finally came out with a PC version of DragonBox, and I got around to testing it on some children I knew. Two kids, one of them four and the other eight years old, ended up blazing through several levels of solving first-degree equations while having a lot of fun doing so, even though they didn't know what it was that they were doing. That made me think that there has to be some way of making a computer game that would similarly teach rationality skills at the 5-second level. Some game where you would actually be forced to learn useful skills if you wanted to make progress.

After playing around with some ideas, I hit upon the notion of making a game centered around the Fundamental Question. I'm not sure whether this can be made to work, but it seems to have promise. The basic idea: you are required to figure out the solution to various mysteries by collecting various kinds of evidence. Some of the sources of evidence will be more reliable than others. In order to hit upon the correct solution, you need to consider where each piece of evidence came from, and whether you can rely on it.

Gameplay example

Now, let's go into a little more detail. Let's suppose that the game has a character called Bob. Bob tells you that tomorrow, eight o'clock, there will be an assassination attempt on Market Square. The fact that Bob has told you this is evidence for the claim being true, so the game automatically records the fact that you have such a piece of evidence, and that it came from Bob.


But how does Bob know that? You ask, and it turns out that Alice told him. So next, you go and ask Alice. Alice is confused and says that she never said anything about any assassination attempt: she just said that something big is going to happen at the Market Square at that time, and that she heard it from the Mayor. The game records two new pieces of evidence: Alice's claim of something big happening at the Market Square tomorrow (which she heard from the Mayor), and her story of what she actually told Bob. Guess that Bob isn't a very reliable source of evidence: he has a tendency to come up with fancy invented details.

Or is he? After all, your sole knowledge about Bob being unreliable is that Alice claims she never said what Bob says she said. But maybe Alice has a grudge against Bob, and is intentionally out to make everyone disbelieve him. Maybe it's Alice who's unreliable. The evidence that you have is compatible with both hypotheses. At this point, you don't have enough information to decide between them, but the game lets you experiment with setting either of them as "true" and seeing the implications of this on your belief network. Or maybe they're both true - Bob is generally unreliable, and Alice is out to discredit him. That's another possibility that you might want to consider. In any case, the claim that there will be an assassination tomorrow isn't looking very likely at the moment.

Actually, having the possibility of somebody lying should probably be a pretty late-game thing, as it makes your belief network a lot more complicated, and I'm not sure whether this thing should display numerical probabilities at all. Instead of having to juggle the hypotheses of "Alice lied" and "Bob exaggerates things", the game should probably just record the fact that "Bob exaggerates things". But I spent a bunch of time making these pictures, and they do illustrate some of the general principles involved, so I'll just use them for now.

Game basics

So, to repeat the basic premise of the game, in slightly more words this time around: your task is to figure out something, and in order to do so, you need to collect different pieces of evidence. As you do so, the game generates a belief network showing the origin and history of the various pieces of evidence that you've gathered. That much is done automatically. But often, the evidence that you've gathered is compatible with many different hypotheses. In those situations, you can experiment with different ways of various hypotheses being true or false, and the game will automatically propagate the consequences of that hypothetical through your belief network, helping you decide what angle you should explore next.
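To make the propagation idea concrete, here is a minimal sketch of the kind of calculation the game could run under the hood. The priors, the likelihood numbers, and the three-variable setup (the assassination claim, Bob's tendency to exaggerate, Alice possibly lying) are illustrative assumptions of mine, not part of the actual design:

    from itertools import product

    # Illustrative priors over the three unknowns in the Bob/Alice example.
    PRIORS = {"assassination": 0.05, "bob_exaggerates": 0.30, "alice_lies": 0.10}

    def prior(world):
        p = 1.0
        for var, prob in PRIORS.items():
            p *= prob if world[var] else 1 - prob
        return p

    def likelihood(world):
        """P(evidence | world): Bob reports an assassination, Alice denies telling him that."""
        if world["assassination"]:
            p_bob = 0.8      # a real plot plausibly reaches Bob
        elif world["bob_exaggerates"]:
            p_bob = 0.6      # he embellished Alice's vague tip
        else:
            p_bob = 0.02     # a dramatic report out of nowhere is unlikely
        p_alice = 0.9 if (world["bob_exaggerates"] or world["alice_lies"]) else 0.2
        return p_bob * p_alice

    def posterior(query, clamp=None):
        """P(query | evidence), optionally clamping a hypothesis True or False."""
        clamp = clamp or {}
        num = den = 0.0
        for values in product([False, True], repeat=len(PRIORS)):
            world = dict(zip(PRIORS, values))
            if any(world[k] != v for k, v in clamp.items()):
                continue
            weight = prior(world) * likelihood(world)
            den += weight
            if world[query]:
                num += weight
        return num / den

    print(round(posterior("assassination"), 3))                              # plain update on the evidence
    print(round(posterior("assassination", {"alice_lies": True}), 3))        # "what if Alice lied?"
    print(round(posterior("assassination", {"bob_exaggerates": False}), 3))  # "what if Bob is honest?"

Clamping a hypothesis and re-running the sums is exactly the "set it as true and see the implications" move described above; an actual implementation would presumably use a proper Bayesian network library rather than brute-force enumeration.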

Of course, people don't always remember the source of their knowledge, or they might just appeal to personal experiences. Or they might lie about the sources, though that will only happen at the more advanced levels.

As you proceed in the game, you will also be given access to more advanced tools that you can use for making hypothetical manipulations to the belief network. For example, it may happen that many different characters say that armies of vampire bats tend to move about at full moon. Since you hear that information from many different sources, it seems reliable. But then you find out that they all heard it from a nature documentary on TV that aired a few weeks back. This is reflected in your belief graph, as the game modifies it to show that all of those supposedly independent sources can actually be traced back to a single one. That considerably reduces the reliability of the information.
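As a back-of-the-envelope illustration of why this matters (the numbers here are made up, not from the post): five genuinely independent witnesses multiply their evidence together, while five reports that all trace back to one documentary are worth little more than a single report.

    def posterior_odds(prior_odds, likelihood_ratios):
        odds = prior_odds
        for lr in likelihood_ratios:
            odds *= lr
        return odds

    def to_prob(odds):
        return odds / (1 + odds)

    prior_odds = 0.25     # initial 1:4 odds that vampire bats really swarm at full moon
    lr = 3.0              # each report, taken at face value, favors the claim 3:1

    independent = posterior_odds(prior_odds, [lr] * 5)             # five independent witnesses
    shared = posterior_odds(prior_odds, [lr, 1.1, 1.1, 1.1, 1.1])  # one source, four echoes

    print(f"independent sources: P = {to_prob(independent):.2f}")  # ~0.98
    print(f"shared single source: P = {to_prob(shared):.2f}")      # ~0.52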

But maybe you were already suspecting that the sources might not be independent? In that case, it would have been nice if the belief graph interface would let you postulate this beforehand, and see how big of an effect it would make on the plausibility of the different hypotheses if they were in fact reliant on each other. Once your character learns the right skills, it becomes possible to also add new hypothetical connections to the belief graph, and see how this would influence your beliefs. That will further help you decide what possibilities to explore and verify.

Because you can't explore every possible eventuality. There's a time limit: after a certain amount of moves, a bomb will go off, the aliens will invade, or whatever.

The various characters are also more nuanced than just "reliable" or "not reliable". As you collect information about the various characters, you'll figure out their mindware, motivations, and biases. Somebody might be really reliable most of the time, but have strong biases when it comes to politics, for example. Others are out to defame people, or to add fancy invented details to all the stories. If you talk to somebody you don't have any knowledge about yet, you can set a prior on the extent to which you rely on their information, based on your experiences with other people.

You also have another source of evidence: your own intuitions and experience. As you get into various situations, a source of evidence that's labeled simply "your brain" will provide various gut feelings and impressions about things. The claim that Alice presented doesn't seem to make sense. Bob feels reliable. You could persuade Carol to help you if you just said this one thing. But in what situations, and for what things, can you rely on your own brain? What are your own biases and problems? If you have a strong sense of having heard something at some point, but can't remember where it was, are you any more reliable than anyone else who can't remember the source of their information? You'll need to figure all of that out.

As the game progresses to higher levels, your own efforts will prove insufficient for analyzing all the necessary information. You'll have to recruit a group of reliable allies, who you can trust to analyze some of the information on their own and report the results to you accurately. Of course, in order to make better decisions, they'll need you to tell them your conclusions as well. Be sure not to report as true things that you aren't really sure about, or they will end up drawing the wrong conclusions and focusing on the wrong possibilities. But you do need to condense your report somewhat: you can't just communicate your entire belief network to them.

Hopefully, all of this should lead to the player learning, on a gut level, things like:

  • Consider the origin of your knowledge: Obvious.
  • Visualizing degrees of uncertainty: In addition to giving you a numerical estimate about the probability of something, the game also color-codes the various probabilities and shows the amount of probability mass associated with your various beliefs.
  • Considering whether different sources really are independent: Some sources which seem independent won't actually be that, and some which seem dependent on each other won't be.
  • Value of information: Given all the evidence you have so far, if you found out X, exactly how much would it change your currently existing beliefs? You can test this and find out, and then decide whether it's worth finding out.
  • Seek disconfirmation: A lot of things that seem true really aren't, and acting on flawed information can cost you.
  • Prefer simpler theories: Complex, detailed hypotheses are more likely to be wrong in this game as well.
  • Common biases: Ideally, the list of biases that various characters have is derived from existing psychological research on the topic. Some biases are really common, others are more rare.
  • Epistemic hygiene: Pass off wrong information to your allies, and it'll cost you.
  • Seek to update your beliefs: The game will automatically update your belief network... to some extent. But it's still possible for you to assign mutually exclusive events probabilities that sum to more than 1, or otherwise have conflicting or incoherent beliefs. The game will mark these with a warning sign, and it's up to you to decide whether this particular inconsistency needs to be resolved or not.
  • Etc etc.

Design considerations

It's not enough for the game to be educational: if somebody downloads the game because it teaches rationality skills, that's great, but we want people to also play it because it's fun. Some principles that help ensure that, as well as its general utility as an educational aid, include:

  • Provide both short- and medium-term feedback: Ideally, there should be plenty of hints for how to find out the truth about something by investigating just one more thing: then the player can find out whether your guess was correct. It's no fun if the player has to work through fifty decisions before finding out whether they made the right move: they should get constant immediate feedback. At the same time, the player's decisions should be building up to a larger goal, with uncertainty about the overall goal keeping them interested.
  • Don't overwhelm the player: In a game like this, it would be easy to throw a million contradictory pieces of evidence at the player, forcing them to go through countless sources of evidence and possible interactions with no clue of what they should be doing. But the game should be manageable. Even if it looks like there is a huge messy network of countless pieces of contradictory evidence, it should be possible to find the connections which reveal the network to be relatively simple after all. (This is not strictly realistic, but necessary for making the game playable.)
  • Introduce new gameplay concepts gradually: Closely related to the previous item. Don't start out with making the player deal with every single gameplay concept at once. Instead, start them out in a trusted and safe environment where everyone is basically reliable, and then begin gradually introducing new things that they need to take into account.
  • No tedium: A game is a series of interesting decisions. The game should never force the player to do anything uninteresting or tedious. Did Alice tell Bob something? No need to write that down, the game keeps automatic track of it. From the evidence that has been gathered so far, is it completely obvious what hypothesis is going to be right? Let the player mark that as something that will be taken for granted and move on.
  • No glued-on tasks: A sign of a bad educational game is that the educational component is glued on to the game (or vice versa). Answer this exam question correctly, and you'll get to play a fun action level! There should be none of that - the educational component should be an indistinguishable part of the game play.
  • Achievement, not fake achievement: Related to the previous point. It would be easy to make a game that wore the attire of rationality, and which used concepts like "probability theory", and then when your character leveled up he would get better probability attacks or whatever. And you'd feel great about your character learning cool stuff, while you yourself learned nothing. The game must genuinely require the player to actually learn new skills in order to get further.
  • Emotionally compelling: The game should not be just an abstract intellectual exercise, but have an emotionally compelling story as well. Your choices should feel like they matter, and characters should be at risk of dying if you make the wrong decisions.
  • Teach true things: Hopefully, the players should take the things that they've learned from the game and apply them to their daily lives. That means that we have a responsibility not to teach them things which aren't actually true.
  • Replayable: Practice makes perfect. At least part of the game world needs to be randomly generated, so that the game can be replayed without a risk of it becoming boring because the player has memorized the whole belief network.

What next?

What you've just read is a very high-level design, and a quite incomplete one at that: I've spoken of the need to have "an emotionally compelling story", but said nothing about the story or the setting. This should probably be something like a spy or detective story, because that's thematically appropriate for a game which is about managing information; and it might be best to have it in a fantasy setting, so that you can question the widely-accepted truths of that setting without needing to step on anyone's toes by questioning widely-accepted truths of our society.

But there's still a lot of work that remains to be done with regard to things like what exactly does the belief network look like, what kinds of evidence can there be, how does one make all of this actually be fun, and so on. I mentioned the need to have both short- and medium-term feedback, but I'm not sure of how that could be achieved, or whether this design lets you achieve it at all. And I don't even know whether the game should show explicit probabilities.

And having a design isn't enough: the whole thing needs to be implemented as well, preferably while it's still being designed, in order to take advantage of agile development techniques. Make a prototype, find some unsuspecting testers, spring it on them, revise. And then there are the graphics and music, things which I have no competence to work on.

I'll probably be working on this in my spare time - I've been playing with the idea of going to the field of educational games at some point, and want the design and programming experience. If anyone feels like they could and would want to contribute to the project, let me know.

EDIT: Great to see that there's interest! I've created a mailing list for discussing the game. It's probably easiest to have the initial discussion here, and then shift the discussion to the list.

Ontological Crisis in Humans

41 Wei_Dai 18 December 2012 05:32PM

Imagine a robot that was designed to find and collect spare change around its owner's house. It had a world model where macroscopic everyday objects are ontologically primitive and ruled by high-school-like physics and (for humans and their pets) rudimentary psychology and animal behavior. Its goals were expressed as a utility function over this world model, which was sufficient for its designed purpose. All went well until one day, a prankster decided to "upgrade" the robot's world model to be based on modern particle physics. This unfortunately caused the robot's utility function to instantly throw a domain error exception (since its inputs are no longer the expected list of macroscopic objects and associated properties like shape and color), thus crashing the controlling AI.
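A toy rendering of that failure mode might look like the following sketch; the class names, the coin-counting utility function, and the particular exception are illustrative assumptions, not anything from the post or from de Blanc's paper:

    from dataclasses import dataclass

    @dataclass
    class MacroObject:
        kind: str          # e.g. "coin", "sofa", "cat" -- the old ontology's primitives
        location: str

    @dataclass
    class Particle:
        position: tuple    # the "upgraded" ontology knows only particles
        velocity: tuple

    def utility(world_state):
        """Utility defined only over the old ontology: coins collected into the bin."""
        if not all(isinstance(obj, MacroObject) for obj in world_state):
            raise TypeError("domain error: utility is only defined over macroscopic objects")
        return sum(1 for obj in world_state if obj.kind == "coin" and obj.location == "bin")

    old_world = [MacroObject("coin", "bin"), MacroObject("coin", "sofa"), MacroObject("cat", "sofa")]
    print(utility(old_world))    # 1 -- works fine under the old world model

    new_world = [Particle((0.0, 0.1, 2.3), (0.0, 0.0, 0.0))]
    print(utility(new_world))    # raises TypeError -- the ontology upgrade crashes the utility function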

According to Peter de Blanc, who used the phrase "ontological crisis" to describe this kind of problem,

Human beings also confront ontological crises. We should find out what cognitive algorithms humans use to solve the same problems described in this paper. If we wish to build agents that maximize human values, this may be aided by knowing how humans re-interpret their values in new ontologies.

I recently realized that a couple of problems that I've been thinking over (the nature of selfishness and the nature of pain/pleasure/suffering/happiness) can be considered instances of ontological crises in humans (although I'm not so sure we necessarily have the cognitive algorithms to solve them). I started thinking in this direction after writing this comment:

This formulation or variant of TDT requires that before a decision problem is handed to it, the world is divided into the agent itself (X), other agents (Y), and "dumb matter" (G). I think this is misguided, since the world doesn't really divide cleanly into these 3 parts.

What struck me is that even though the world doesn't divide cleanly into these 3 parts, our models of the world actually do. In the world models that we humans use on a day to day basis, and over which our utility functions seem to be defined (to the extent that we can be said to have utility functions at all), we do take the Self, Other People, and various Dumb Matter to be ontologically primitive entities. Our world models, like the coin collecting robot's, consist of these macroscopic objects ruled by a hodgepodge of heuristics and prediction algorithms, rather than microscopic particles governed by a coherent set of laws of physics.

For example, the amount of pain someone is experiencing doesn't seem to exist in the real world as an XML tag attached to some "person entity", but that's pretty much how our models of the world work, and perhaps more importantly, that's what our utility functions expect their inputs to look like (as opposed to, say, a list of particles and their positions and velocities). Similarly, a human can be selfish just by treating the object labeled "SELF" in its world model differently from other objects, whereas an AI with a world model consisting of microscopic particles would need to somehow inherit or learn a detailed description of itself in order to be selfish.

To fully confront the ontological crisis that we face, we would have to upgrade our world model to be based on actual physics, and simultaneously translate our utility functions so that their domain is the set of possible states of the new model. We currently have little idea how to accomplish this, and instead what we do in practice is, as far as I can tell, keep our ontologies intact and utility functions unchanged, but just add some new heuristics that in certain limited circumstances call out to new physics formulas to better update/extrapolate our models. This is actually rather clever, because it lets us make use of updated understandings of physics without ever having to, for instance, decide exactly what patterns of particle movements constitute pain or pleasure, or what patterns constitute oneself. Nevertheless, this approach hardly seems capable of being extended to work in a future where many people may have nontraditional mind architectures, or have a zillion copies of themselves running on all kinds of strange substrates, or be merged into amorphous group minds with no clear boundaries between individuals.

By the way, I think nihilism often gets short-changed around here. Given that we do not actually have at hand a solution to ontological crises in general or to the specific crisis that we face, what's wrong with saying that the solution set may just be null? Given that evolution doesn't constitute a particularly benevolent and farsighted designer, perhaps we may not be able to do much better than that poor spare-change collecting robot? If Eliezer is worried that actual AIs facing actual ontological crises could do worse than just crash, should we be very sanguine that for humans everything must "add up to moral normality"?

To expand a bit more on this possibility, many people have an aversion against moral arbitrariness, so we need at a minimum a utility translation scheme that's principled enough to pass that filter. But our existing world models are a hodgepodge put together by evolution so there may not be any such sufficiently principled scheme, which (if other approaches to solving moral philosophy also don't pan out) would leave us with legitimate feelings of "existential angst" and nihilism. One could perhaps still argue that any current such feelings are premature, but maybe some people have stronger intuitions than others that these problems are unsolvable?

Do we have any examples of humans successfully navigating an ontological crisis? The LessWrong Wiki mentions loss of faith in God:

In the human context, a clear example of an ontological crisis is a believer's loss of faith in God. Their motivations and goals, coming from a very specific view of life, suddenly become obsolete and maybe even nonsense in the face of this new configuration. The person will then experience a deep crisis and go through the psychological task of reconstructing their set of preferences according to the new world view.

But I don't think loss of faith in God actually constitutes an ontological crisis, or if it does, certainly not a very severe one. An ontology consisting of Gods, Self, Other People, and Dumb Matter just isn't very different from one consisting of Self, Other People, and Dumb Matter (the latter could just be considered a special case of the former with quantity of Gods being 0), especially when you compare either ontology to one made of microscopic particles or even less familiar entities.

But to end on a more positive note, realizing that seemingly unrelated problems are actually instances of a more general problem gives some hope that by "going meta" we can find a solution to all of these problems at once. Maybe we can solve many ethical problems simultaneously by discovering some generic algorithm that can be used by an agent to transition from any ontology to another? 

(Note that I'm not saying this is the right way to understand one's real preferences/morality, but just drawing attention to it as a possible alternative to other more "object level" or "purely philosophical" approaches. See also this previous discussion, which I recalled after writing most of the above.)

Taking "correlation does not imply causation" back from the internet

41 sixes_and_sevens 03 October 2012 12:18PM

(An idea I had while responding to this quotes thread)

"Correlation does not imply causation" is bandied around inexpertly and inappropriately all over the internet.  Lots of us hate this.

But get this: the phrase, and the most obvious follow-up phrases like "what does imply causation?" are not high-competition search terms.  Up until about an hour ago, the domain name correlationdoesnotimplycausation.com was not taken.  I have just bought it.

There is a correlation-does-not-imply-causation shaped space on the internet, and it's ours for the taking.  I would like to fill this space with a small collection of relevant educational resources explaining what is meant by the term, why it's important, why it's often used inappropriately, and the circumstances under which one may legitimately infer causation.
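To give a flavor of the kind of content such a resource could host, here is a toy simulation (my own construction, not part of the proposal) in which a hidden common cause makes two variables correlate strongly even though neither causes the other:

    import random

    random.seed(0)
    n = 10_000
    z = [random.gauss(0, 1) for _ in range(n)]            # hidden common cause (confounder)
    x = [zi + random.gauss(0, 0.5) for zi in z]           # X depends only on Z
    y = [zi + random.gauss(0, 0.5) for zi in z]           # Y depends only on Z, not on X

    def corr(a, b):
        ma, mb = sum(a) / len(a), sum(b) / len(b)
        cov = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
        va = sum((ai - ma) ** 2 for ai in a)
        vb = sum((bi - mb) ** 2 for bi in b)
        return cov / (va * vb) ** 0.5

    print(f"corr(X, Y) = {corr(x, y):.2f}")    # roughly 0.8, yet intervening on X would not move Y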

At the moment the Wikipedia page is trying to do this, but it's not really optimised for the task.  It also doesn't carry the undercurrent of "no, seriously, lots of smart people get this wrong; let's make sure you're not one of them", and I think it should.

The purpose of this post is two-fold:

Firstly, it lets me say "hey dudes, I've just had this idea.  Does anyone have any suggestions (pragmatic/technical, content-related, pointing out why it's a terrible idea, etc.), or alternatively, would anyone like to help?"

Secondly, it raises the question of what other corners of the internet are ripe for the planting of sanity waterline-raising resources.  Are there any other similar concepts that people commonly get wrong, but don't have much of a guiding explanatory web presence to them?  Could we put together a simple web platform for carrying out this task in lots of different places?  The LW readership seems ideally placed to collectively do this sort of work.

Petition: Off topic area

41 [deleted] 13 May 2012 06:41PM

 

Petition: LW should introduce a dedicated off topic area

Why?

1) I want to discuss various topics with people who are both intelligent and rationalist, and I know of no other place to do it.

2) I find that rationality is getting boring in itself. I need to use it on something.

3) As stated in this comment (http://lesswrong.com/lw/btc/how_can_we_get_more_and_better_lw_contrarians/6e3p), the narrow set of topics might actually hurt LW by driving good rationalists away.

Social status hacks from The Improv Wiki

41 lsparrish 21 March 2012 02:56AM

I can't remember how I found this, just that I was amazed at how rational and near-mode it is on a topic where most of the information one usually encounters is hopelessly far.

LessWrong wiki link on the same topic: http://wiki.lesswrong.com/wiki/Status

The Improv Wiki

Status

Status is pecking order. The person who is lower in status defers to the person who is higher in status.

Status is partly established by social position--e.g. boss and employee--but mainly by the way you interact. If you interact in a way that says you are not to be trifled with, that the other person must adjust to you, then you are establishing high status. If you interact in a way that says you are willing to go along, that you don't want responsibility, that's low status. A boss can play low status or high status. An employee can play low status or high status.

Status is established in every line and gesture, and changes continuously. Status is something that one character plays to another at a particular moment. If you convey that the other person must not cross you on what you're saying now, then you are playing high status to that person in that line. Your very next line might come out low status, as you suggest willingness to defer about something else.

If you analyze your most successful scenes, it's likely they involved several status changes between the players. Therefore, one path to great scenes is to intentionally change status. You can raise or lower your own status, or the status of the other player. The more subtly you can do this, the better the scene.

High-status behaviors

When walking, assuming that other people will get out of your path.

Making eye contact while speaking.

Not checking the other person's eyes for a reaction to what you said.

Having no visible reaction to what the other person said. (Imagine saying something to a typical Clint Eastwood character. You say something expecting a reaction, and you get--nothing.)

Speaking in complete sentences.

Interrupting before you know what you are going to say.

Spreading out your body to full comfort. Taking up a lot of space with your body.

Looking at the other person with your eyes somewhat down (head tilted back a bit to make this work), creating the feeling that you are a parent talking to a child.

Talking matter-of-factly about things that the other person finds displeasing or offensive.

Letting your body be vulnerable, exposing your neck and torso to the other person.

Moving comfortably and gracefully.

Keeping your hands away from your face.

Speaking authoritatively, with certainty.

Making decisions for a group; taking responsibility.

Giving or withholding permission.

Evaluating other people's work.

Speaking cryptically, not adjusting your speech to be easily understood by the other person (except that mumbling does not count). E.g. saying, "Chomper not right" with no explanation of what you mean or what you want the other person to do.

Being surrounded by an entourage, especially of people who are physically smaller than you.

A "high-status specialist" conveys in every word and gesture, "Don't come near me, I bite."

Low-status behaviors

When walking, moving out of other people's path.

Looking away from the other person's eyes.

Briefly checking the other person's eyes to see if they reacted positively to what you said.

Speaking in halting, incomplete sentences. Trailing off, editing your sentences as you go.

Sitting or standing uncomfortably in order to adjust to the other person and give them space. Pulling inward to give the other person more room. If you're tall, you might need to scrunch down a bit to indicate that you're not going to use your height against the other person.

Looking up toward the other person (head tilted forward a bit to make this work), creating the feeling that you are a child talking to a parent.

Dancing around your words (beating around the bush) when talking about something that will displease the other person.

Shouting as an attempt to intimidate the other person. This is low status because it suggests that you expect resistance.

Crouching your body as if to ward off a blow; protecting your face, neck, and torso.

Moving awkwardly or jerkily, with unnecessary movements.

Touching your face or head.

Avoiding making decisions for the group; avoiding responsibility.

Needing permission before you can act.

Adjusting the way you say something to help the other person understand; meeting the other person on their (cognitive) ground; explaining yourself. E.g. "Could you please adjust the chomper? That's the gadget on the kitchen counter immediately to the left of the toaster. If you just give it a slight rap on the top, that should adjust it."

A "low-status specialist" conveys in every word and gesture, "Please don't bite me, I'm not worth the trouble."

Raising another person's status

To raise another person's status is to establish them as high in the pecking order in your group (possibly just the two of you).

Ask their permission to do something.
Ask their opinion about something.
Ask them for advice or help.
Express gratitude for something they did.
Apologize to them for something you did.
Agree that they are right and you were wrong.
Defer to their judgement without requiring proof.
Address them with a fancy title or honorific (even "Mr." or "Sir" works very well).
Downplay your own achievement or attribute in comparison to theirs. "Your wedding cake is so much whiter than mine."
Do something incompetent in front of them and then apologize for it or act sheepish about it.
Mention a failure or shortcoming of your own. "I was supposed to go to an audition today, but I was late. They said I was wrong for the part anyway."
Compliment them in a way that suggests appreciation, not judgement. "Wow, what a beautiful cat you have!"
Obey them unquestioningly.
Back down in a conflict.
Move out of their way, bow to them, lower yourself before them.
Tip your hat to them.
Lose to them at something competitive, like a game (or any comparison).
Wait for them.
Serve them; do manual labor for them.

Tip: Whenever you bring an audience member on stage, always raise their status, never lower it.

Lowering another person's status

To lower another person's status is to attack or discredit their right to be high in the pecking order. Another word for "lowering someone's status" is "humiliating them."

Criticize something they did.
Contradict them. Tell them they are wrong. Prove it with facts and logic.
Correct them.
Insult them.
Give them unsolicited advice.
Approve or disapprove of something they did or some attribute of theirs. "Your cat has both nose and ear points. That is acceptable." Anything that sets you up as the judge lowers their status, even "Nice work on the Milligan account, Joe."
Shout at them.
Tell them what to do.
Ignore what they said and talk about something else, especially when they've said something that requires an answer. E.g. "Have you seen my socks?" "The train leaves in five minutes."
One-up them. E.g. have a worse problem than the one they described, have a greater past achievement than theirs, have met a more famous celebrity, earn more money, do better than them at something they're good at, etc.
Win: beat them at something competitive, like a game (or any comparison).
Announce something good about yourself or something you did. "I went to an audition today, and I got the part!"
Disregard their opinion. E.g. "You'd better not smoke while pumping gas, it's a fire hazard." Flick, light, puff, puff, pump, pump.
Talk sarcastically to them.
Make them wait for you.
When they've fallen behind you, don't wait for them to catch up, just push on and get further out of sync.
Disobey them.
Violate their space.
Beat them up. Beating them up verbally, not physically as in martial arts, in front of other people, especially their wife, girlfriend, and/or children, is particularly status-lowering.
In a conflict, make them back down.
Taunt them. Tease them.

The basic status-lowering act

Laugh at them. (Not with them.)

The basic status-raising act

Be laughed at by them.

Second to that is laughing with them at someone else.

(Notice that those are primarily what comedians do.)


Note that behaviors that raise another person's status are not necessarily low-status behaviors, and behaviors that lower another person's status are not necessarily high-status behaviors. People at any status level raise and lower each other all the time. They can do so in ways that convey high or low status.

For example, shouting at someone lowers their status but is itself a low-status behavior.


Objects and environments also have high or low status, although this is seldom explored. So explore it. Make something cheap and inconsequential high status. (This fingernail clipping came from Graceland!) Or bring down the status of a high status item. (Casually toss a 2 carat diamond ring on your jewelry pile.)

Source: http://greenlightwiki.com/improv/Status
Retrieved 20 March 2012

A model of UDT with a halting oracle

41 cousin_it 18 December 2011 02:18PM

This post requires some knowledge of mathematical logic and computability theory. The basic idea is due to Vladimir Nesov and me.

Let the universe be a computer program U that can make calls to a halting oracle. Let the agent be a subprogram A within U that can also make calls to the oracle. The source code of both A and U is available to A.

Here's an example U that runs Newcomb's problem and returns the resulting utility value:

  def U():
    # Fill boxes, according to predicted action.
    box1 = 1000
    box2 = 1000000 if (A() == 1) else 0
    # Compute reward, based on actual action.
    return box2 if (A() == 1) else (box1 + box2)

A complete definition of U should also include the definition of A, so let's define it. We will use the halting oracle only as a provability oracle for some formal system S, e.g. Peano arithmetic. Here's the algorithm of A:

  1. Play chicken with the universe: if S proves that A()≠a for some action a, then return a.
  2. For every possible action a, find some utility value u such that S proves that A()=a ⇒ U()=u. If such a proof cannot be found for some a, break down and cry because the universe is unfair.
  3. Return the action that corresponds to the highest utility found on step 2.
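
To make the three steps concrete, here is a minimal Python-style sketch of A for the Newcomb universe above (my own illustration, not code from the post). The provable() call stands in for the halting oracle used as a provability oracle for S, and the finite lists of candidate actions and utilities are assumptions made for this toy problem; none of this is real, runnable machinery.

  def A():
      actions = [1, 2]                          # 1 = one-box, 2 = two-box
      utilities = [0, 1000, 1000000, 1001000]   # candidate payoffs for this toy U
      # Step 1: play chicken with the universe.
      for a in actions:
          if provable("A() != " + str(a)):
              return a
      # Step 2: for each action, find a utility that S proves it leads to.
      proved = {}
      for a in actions:
          for u in utilities:
              if provable("A() == %d implies U() == %d" % (a, u)):
                  proved[a] = u
                  break
          if a not in proved:
              raise Exception("break down and cry: the universe is unfair")
      # Step 3: return the action with the highest proved utility.
      return max(proved, key=proved.get)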

Now we want to prove that the agent one-boxes, i.e. A()=1 and U()=1000000. That will follow from two lemmas.

Lemma 1: S proves that A()=1 ⇒ U()=1000000 and A()=2 ⇒ U()=1000. Proof: you can derive that from just the source code of U, without looking at A at all.

Lemma 2: S doesn't prove any other utility values for A()=1 or A()=2. Proof: assume, for example, that S proves that A()=1 ⇒ U()=42. But S also proves that A()=1 ⇒ U()=1000000, therefore S proves that A()≠1. According to the first step of the algorithm, A will play chicken with the universe and return 1, making S unsound (thx Misha). So if S is sound, that can't happen.

We see that the agent defined above will do the right thing in Newcomb's problem. And the proof transfers easily to many other toy problems, like the symmetric Prisoner's Dilemma.
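
For instance, a symmetric Prisoner's Dilemma universe can be sketched in the same style, assuming both players are instances of the same agent program A; the payoff numbers below are the usual illustrative ones, not anything taken from the post:

  def U():
      # Both players run the same deterministic program, so they choose alike.
      a1 = A()   # player 1's move: 1 = cooperate, 2 = defect
      a2 = A()   # player 2's move
      payoffs = {(1, 1): 3, (1, 2): 0, (2, 1): 5, (2, 2): 1}
      return payoffs[(a1, a2)]   # player 1's payoff

Since both calls return the same value, S proves A()=1 ⇒ U()=3 and A()=2 ⇒ U()=1, and the same argument as above makes the agent cooperate.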

But why? What's the point of this result?

There's a big problem about formalizing UDT. If the agent chooses a certain action in a deterministic universe, then it's a true fact about the universe that choosing a different action would have caused Santa to appear. Moreover, if the universe is computable, then such silly logical counterfactuals are not just true but provable in any reasonable formal system. When we can't compare actual decisions with counterfactual ones, it's hard to define what it means for a decision to be "optimal".

For example, one previous formalization searched for formal proofs up to a specified length limit. Problem is, that limit is a magic constant in the code that can't be derived from the universe program alone. And if you try searching for proofs without a length limit, you might encounter a proof of a "silly" counterfactual which will make you stop early before finding the "serious" one. Then your decision based on that silly counterfactual can make it true by making its antecedent false... But the bigger problem is that we can't say exactly what makes a "silly" counterfactual different from a "serious" one.

In contrast, the new model with oracles has a nice notion of optimality, relative to the agent's formal system. The agent will always return whatever action is proved by the formal system to be optimal, if such an action exists. This notion of optimality matches our intuitions even though the universe is still perfectly deterministic and the agent is still embedded in it, because the oracle ensures that determinism is just out of the formal system's reach.

P.S. I became a SingInst research associate on Dec 1. They did not swear me to secrecy, and I hope this post shows that I'm still a fan of working in the open. I might just try to be a little more careful because I wouldn't want to discredit SingInst by making stupid math mistakes in public :-)

The Joys of Conjugate Priors

41 TCB 21 May 2011 02:41AM

(Warning: this post is a bit technical.)

Suppose you are a Bayesian reasoning agent.  While going about your daily activities, you observe an event of type x.  Because you're a good Bayesian, you have some internal parameter θ which represents your belief that x will occur.

Now, you're familiar with the Ways of Bayes, and therefore you know that your beliefs must be updated with every new datapoint you perceive.  Your observation of x is a datapoint, and thus you'll want to modify θ.  But how much should this datapoint influence θ?  Well, that will depend on how sure you are of θ in the first place.  If you calculated θ based on a careful experiment involving hundreds of thousands of observations, then you're probably pretty confident in its value, and this single observation of x shouldn't have much impact.  But if your estimate of θ is just a wild guess based on something your unreliable friend told you, then this datapoint is important and should be weighted much more heavily in your reestimation of θ.

Of course, when you reestimate θ, you'll also have to reestimate how confident you are in its value.  Or, to put it a different way, you'll want to compute a new probability distribution over possible values of θ.  This new distribution will be p(θ|x), and it can be computed using Bayes' rule:

    p(θ|x) = p(x|θ) p(θ) / ∫ p(x|θ′) p(θ′) dθ′

Here, since θ is a parameter used to specify the distribution from which x is drawn, it can be assumed that computing p(x|θ) is straightforward.  p(θ) is your old distribution over θ, which you already have; it says how accurate you think different settings of the parameters are, and allows you to compute your confidence in any given value of θ.  So the numerator should be straightforward to compute; it's the denominator which might give you trouble, since for an arbitrary distribution, computing the integral is likely to be intractable.

But you're probably not really looking for a distribution over different parameter settings; you're looking for a single best setting of the parameters that you can use for making predictions.  If this is your goal, then once you've computed the distribution p(θ|x), you can pick the value of θ that maximizes it.  This will be your new parameter, and because you have the formula for p(θ|x), you'll know exactly how confident you are in this parameter.

In practice, picking the value of θ which maximizes p(θ|x) is usually pretty difficult, thanks to the presence of local optima, as well as the general difficulty of optimization problems.  For simple enough distributions, you can use the EM algorithm, which is guaranteed to converge to a local optimum.  But for more complicated distributions, even this method is intractable, and approximate algorithms must be used.  Because of this concern, it's important to keep the distributions p(x|θ) and p(θ) simple.  Choosing the distribution p(x|θ) is a matter of model selection; more complicated models can capture deeper patterns in data, but will take more time and space to compute with.

It is assumed that the type of model is chosen before deciding on the form of the distribution p(θ).  So how do you choose a good distribution for θ?  Notice that every time you see a new datapoint, you'll have to do the computation in the equation above.  Thus, in the course of observing data, you'll be multiplying lots of different probability distributions together.  If these distributions are chosen poorly, p(θ|x) could get quite messy very quickly.

If you're a smart Bayesian agent, then, you'll pick p(θ) to be a conjugate prior to the distribution p(x|θ).  The distribution p(θ) is conjugate to p(x|θ) if multiplying these two distributions together and normalizing results in another distribution of the same form as p(θ).

Let's consider a concrete example: flipping a biased coin.  Suppose you use the bernoulli distribution to model your coin.  Then it has a parameter θ which represents the probability of getting heads.  Assume that the value 1 corresponds to heads, and the value 0 corresponds to tails.  Then the distribution of the outcome x of the coin flip looks like this:

    p(x|θ) = θ^x (1-θ)^(1-x)

It turns out that the conjugate prior for the bernoulli distribution is something called the beta distribution.  It has two parameters, α and β, which we call hyperparameters because they are parameters for a distribution over our parameters.  (Eek!)

The beta distribution looks like this:

    p(θ) = Γ(α+β) / (Γ(α) Γ(β)) · θ^(α-1) (1-θ)^(β-1)

Since θ represents the probability of getting heads, it can take on any value between 0 and 1, and thus this function is normalized properly.

Suppose you observe a single coin flip x and want to update your beliefs regarding θ.  Since the denominator of the beta function in the equation above is just a normalizing constant, you can ignore it for the moment while computing p(θ|x), as long as you promise to normalize after completing the computation:

    p(θ|x) ∝ p(x|θ) p(θ) ∝ θ^(x+α-1) (1-θ)^((1-x)+β-1)

Normalizing this equation will, of course, give another beta distribution, confirming that this is indeed a conjugate prior for the bernoulli distribution.  Super cool, right?

If you are familiar with the binomial distribution, you should see that the numerator of the beta distribution in the equation for p(θ|x) looks remarkably similar to the non-factorial part of the binomial distribution.  This suggests a form for the normalization constant:

    p(θ|x) = Γ(α+β+1) / (Γ(α+x) Γ(β+1-x)) · θ^(x+α-1) (1-θ)^((1-x)+β-1)

The beta and binomial distributions are almost identical.  The biggest difference between them is that the beta distribution is a function of θ, with α and β as prespecified parameters, while the binomial distribution is a function of the number of heads, with the number of flips and θ as prespecified parameters.  It should be clear that the beta distribution is also conjugate to the binomial distribution, making it just that much awesomer.

Another difference between the two distributions is that the beta distribution uses gammas where the binomial distribution uses factorials.  Recall that the gamma function is just a generalization of the factorial to the reals; thus, the beta distribution allows α and β to be any positive real number, while the binomial distribution is only defined for integers.  As a final note on the beta distribution, the -1 in the exponents is not philosophically significant; I think it is mostly there so that the gamma functions will not contain +1s.  For more information about the mathematics behind the gamma function and the beta distribution, I recommend checking out this pdf: http://www.mhtl.uwaterloo.ca/courses/me755/web_chap1.pdf.  It gives an actual derivation which shows that the first equation for p(θ|x) is equivalent to the second equation for p(θ|x), which is nice if you don't find the argument by analogy to the binomial distribution convincing.

So, what is the philosophical significance of the conjugate prior?  Is it just a pretty piece of mathematics that makes the computation work out the way we'd like it to?  No; there is deep philosophical significance to the form of the beta distribution.

Recall the intuition from above: if you've seen a lot of data already, then one more datapoint shouldn't change your understanding of the world too drastically.  If, on the other hand, you've seen relatively little data, then a single datapoint could influence your beliefs significantly.  This intuition is captured by the form of the conjugate prior.  α and β can be viewed as keeping track of how many heads and tails you've seen, respectively.  So if you've already done some experiments with this coin, you can store that data in a beta distribution and use that as your conjugate prior.  The beta distribution captures the difference between claiming that the coin has a 30% chance of coming up heads after seeing 3 heads and 7 tails, and claiming that the coin has a 30% chance of coming up heads after seeing 3000 heads and 7000 tails.
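
To make that contrast concrete, here is a quick numerical illustration of my own (it assumes a uniform Beta(1, 1) prior and uses scipy, neither of which is from the post): 3 heads and 7 tails give a Beta(4, 8) posterior, while 3000 heads and 7000 tails give Beta(3001, 7001). The posterior means are nearly the same, but the spreads differ enormously.

  from scipy.stats import beta

  small = beta(4, 8)          # posterior after 3 heads, 7 tails
  large = beta(3001, 7001)    # posterior after 3000 heads, 7000 tails
  print(small.mean(), small.std())   # ~0.333, ~0.13
  print(large.mean(), large.std())   # ~0.300, ~0.005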

Suppose you haven't observed any coin flips yet, but you have some intuition about what the distribution should be.  Then you can choose values for α and β that represent your prior understanding of the coin.  Higher values of α and β indicate more confidence in your intuition; thus, choosing the appropriate hyperparameters is a method of quantifying your prior understanding so that it can be used in computation.  α and β will act like "imaginary data"; when you update your distribution over θ after observing a coin flip x, it will be as if you had already seen α-1 heads and β-1 tails before that coin flip.
 
If you want to express that you have no prior knowledge about the system, you can do so by setting α and β to 1.  This will turn the beta distribution into a uniform distribution.  You can also use the beta distribution to do add-N smoothing, by setting α and β to both be N+1.  Setting the hyperparameters to a value lower than 1 causes them to act like "negative data", which helps avoid overfitting θ to noise in the actual data.
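
Here is a minimal sketch of the update itself, under the same coin-flip setup; the function and variable names are mine, not anything from the post. Conjugacy means that updating on observed flips just adds the observed counts to the hyperparameters.

  from scipy.stats import beta

  def update(alpha, beta_, flips):
      """Posterior hyperparameters after observing flips (1 = heads, 0 = tails)."""
      heads = sum(flips)
      return alpha + heads, beta_ + len(flips) - heads

  # Uniform prior (alpha = beta = 1), then observe 3 heads and 7 tails.
  a, b = update(1, 1, [1, 1, 1, 0, 0, 0, 0, 0, 0, 0])
  print(a, b)                    # 4, 8 -- the posterior is Beta(4, 8)
  print(a / (a + b))             # posterior mean, 1/3
  print((a - 1) / (a + b - 2))   # posterior mode (MAP estimate), 0.3
  print(beta.pdf(0.3, a, b))     # posterior density at theta = 0.3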

In conclusion, the beta distribution, which is a conjugate prior to the bernoulli and binomial distributions, is super awesome.  It makes it possible to do Bayesian reasoning in a computationally efficient manner, as well as having the philosophically satisfying interpretation of representing real or imaginary prior data.  Other conjugate priors, such as the dirichlet prior for the multinomial distribution, are similarly cool.

I'm scared.

41 Mass_Driver 23 December 2010 09:05AM

Recently, I've been ratcheting up my probability estimate of some of Less Wrong's core doctrines (shut up and multiply, beliefs require evidence, brains are not a reliable guide as to whether brains are malfunctioning, the Universe has no fail-safe mechanisms) from "Hmm, this is an intriguing idea" to somewhere in the neighborhood of "This is most likely correct."

This leaves me confused and concerned and afraid. There are two things in particular that are bothering me. On the one hand, I feel obligated to try much harder to identify my real goals and then to do what it takes to actually achieve them -- I have much less faith that just being a nice, thoughtful, hard-working person will result in me having a pleasant life, let alone in me fulfilling anything like my full potential to help others and/or produce great art. On the other hand, I feel a deep sense of pessimism -- I have much less faith that even making an intense, rational effort to succeed will make much of a difference. Rationality has stripped me of some of my traditional sources of confidence that everything will work out OK, but it hasn't provided any new ones -- there is no formula that I can recite to myself to say "Well, as long as I do this, then everything will be fine." Most likely, it won't be fine; but it isn't hopeless, either; possibly there's something I can do to help, and if so I really want to find it. This is frustrating.

This isn't to say that I want to back away from rationalism -- it's not as if pretending to be dumb will help. To whatever extent I become more rational and thus more successful, that's better than nothing. The concern is that it may not ever be better enough for me to register a sense of approval or contentedness. Civilization might collapse; I might get hit by a bus; or I might just claw through some of my biases but not others, make poor choices, and fail to accomplish much of anything.

Has anyone else had experience with a similar type of fear? Does anyone have suggestions as to an appropriate response?

[meta] New LW moderator: Viliam_Bur

40 Kaj_Sotala 13 September 2014 01:37PM

Some time back, I wrote that I was unwilling to continue with investigations into mass downvoting, and asked people for suggestions on how to deal with them from now on. The top-voted proposal in that thread suggested making Viliam_Bur into a moderator, and Viliam gracefully accepted the nomination. So I have given him moderator privileges and also put him in contact with jackk, who provided me with the information necessary to deal with the previous cases. Future requests about mass downvote investigations should be directed to Viliam.

Thanks a lot for agreeing to take up this responsibility, Viliam! It's not an easy one, but I'm very grateful that you're willing to do it. Please post a comment here so that we can reward you with some extra upvotes. :)

Against utility functions

40 Qiaochu_Yuan 19 June 2014 05:56AM

I think we should stop talking about utility functions.

In the context of ethics for humans, anyway. In practice I find utility functions to be, at best, an occasionally useful metaphor for discussions about ethics but, at worst, an idea that some people start taking too seriously and which actively makes them worse at reasoning about ethics. To the extent that we care about causing people to become better at reasoning about ethics, it seems like we ought to be able to do better than this.

The funny part is that the failure mode I worry the most about is already an entrenched part of the Sequences: it's fake utility functions. The soft failure is people who think they know what their utility function is and say bizarre things about what this implies that they, or perhaps all people, ought to do. The hard failure is people who think they know what their utility function is and then do bizarre things. I hope the hard failure is not very common. 

It seems worth reflecting on the fact that the point of the foundational LW material discussing utility functions was to make people better at reasoning about AI behavior and not about human behavior. 

I notice that I am confused about Identity and Resurrection

40 ialdabaoth 14 November 2013 08:38PM

I've spent quite a bit of time trying to work out how to explain the roots of my confusion. I think, in the great LW tradition, I'll start with a story.

[Editor's note: The original story was in 16th century Mandarin, and used peculiar and esoteric terms for concepts that are just now being re-discovered. Where possible, I have translated these terms into their modern mathematical and philosophical equivalents. Such terms are denoted with curly braces, {like so}.]

Once upon a time there was a man by the name of Shen Chun-lieh, and he had a beautiful young daughter named Ah-Chen. She died.

Shen Chun-lieh was heartbroken, more so, he thought, than any man who had ever lost a daughter, and so he struggled and scraped and misered until he had amassed a great fortune, and brought that fortune before me - for he had heard it told that I could resurrect the dead.

I frowned when he told me his story, for many things are true after a fashion, but wisdom is in understanding the nature of that truth - and he did not bear the face of a wise man.

"Tell me about your daughter, Ah-Chen.", I commanded.

And so he told me.

I frowned, for my suspicions were confirmed.

"You wish for me to give you this back?", I asked.

He nodded and dried his tears. "More than anything in the world."

"Then come back tomorrow, and I will have for you a beautiful daughter who will do all the things you described."

His face showed a sudden flash of understanding. Perhaps, I thought, this one might see after all.

"But", he said, "will it be Ah-Chen?"

I smiled sagely. "What do you mean by that, Shen Chun-lieh?"

"I mean, you said that you would give me 'a' daughter. I wish for MY daughter."

I bowed to his small wisdom. "Indeed I did. If you wish for YOUR daughter, then you must be much, much more precise with me."

He frowned, and I saw in his face that he did not have the words.

"You are wise in the way of the Tao", he said, "surely you can find the words in my heart, so that even such as me could say them?"

I nodded. "I can. But it will take a great amount of time, and much courage from you. Shall we proceed?"

He nodded.

 

I am wise enough in the way of the Tao. The Tao whispers things that have been discovered and forgotten, and things that have yet to be discovered, and things that may never be discovered. And while Shen Chun-lieh was neither wise nor particularly courageous, his overwhelming desire to see his daughter again propelled him with an intensity seldom seen in my students. And so it was, many years later, that I judged him finally ready to discuss his daughter with me, in earnest.

"Shen", I said, "it is time to talk about your Ah-Chen."

His eyes brightened and he nodded eagerly. "Yes, Teacher."

"Do you understand why I said on that first day, that you must be much, much more precise with me?"

"Yes, Teacher. I had come to you believing that the soul was a thing that could be conjured back to the living, rather than a {computational process}."

"Even now, you are not quite correct. The soul is not a {computational process}, but a {specification of a search space} which describes any number of similar {computational processes}. For example, Shen Chun-lieh, would you still be Shen Chun-lieh if I were to cut off your left arm?"

"Of course, Teacher. My left arm does not define who I am."

"Indeed. And are you still the same Shen Chun-lieh who came to me all those years ago, begging me to give him back his daughter Ah-Chen?"

"I am, Teacher, although I understand much more now than I did then."

"That you do. But tell me - would you be the same Shen Chun-lieh if you had not come to me? If you had continued to save and to save your money, and craft more desperate and eager schemes for amassing more money, until finally you forgot the purpose of your misering altogether, and abandoned your Ah-Chen to the pursuit of gold and jade for its own sake?"

"Teacher, my love for Ah-Chen is all-consuming; such a fate could never befall me."

"Do not be so sure, my student. Remember the tale of the butterfly's wings, and the storm that sank an armada. Ever-shifting is the Tao, and so ever-shifting is our place in it."

Shen Chun-lieh understood, and in a brief moment he glimpsed his life as it could have been, as an old Miser Shen hoarding gold and jade in a great walled city. He shuddered and prostrated himself.

"Teacher, you are correct. And even such a wretch as Miser Shen, that wretch would still be me. But I thank the Buddha and the Eight Immortal Sages that I was spared that fate."

I smiled benevolently and helped him to his feet. "Then suppose that you had died and not your daughter, and one day a young woman named Ah-Chen had burst into my door, flinging gold and jade upon my table, and described the caring and wonderful father that she wished returned to her? What could she say about Shen Chun-lieh that would allow me to find his soul amongst the infinite chaos of the Nine Hells?"

"I..." He looked utterly lost.

"Tell me, Shen Chun-lieh, what is the meaning of the parable of the {Ship of Theseus}?"

"That personal identity cannot be contained within the body, for the flow of the Tao slowly strips away and the flow of the Tao slowly restores, such that no single piece of my body is the same from one year to the next; and within the Tao, even the distinction of 'sameness' is meaningless."

"And what is the relevance of the parable of the {Shroedinger's Cat} to this discussion?"

"Umm... that... let me think. I suppose, that personal identity cannot be contained within the history of choices that have been made, because for every choice that has been made, if it was truly a 'choice' at all, it was also made the other way in some other tributary of the Great Tao."

"And the parable of the tiny {Paramecium}?"

"That neither is the copy; there are two originals."

"So, Shen. Can you yet articulate the dilemma that you present to me?"

"No, Teacher. I fear that yet again, you must point it out to your humble student."

"You ask for Ah-Chen, my student. But which one? Of all the Ah-Chens that could be brought before you, which would satisfy you? Because there is no hard line, between {configurations} that you would recognize as your daughter and {configurations} which you would not. So why did my original offer, to construct you a daughter that would do all the things you described Ah-Chen as doing, not appeal to you?"

Shen looked horrified. "Because she would not BE Ah-Chen! Even if you made her respond perfectly, it would not be HER! I do not simply miss my six-year-old girl; I miss what she could have become! I regret that she never got to see the world, never got to grow up, never got to..."

"In what sense did she never do these things? She died, yes; but even a dead Ah-Chen is still an Ah-Chen. She has since experienced being worms beneath the earth, and flowers, and then bees and birds and foxes and deer and even peasants and noblemen. All these are Ah-Chen, so why is it so important that she appear before you as YOU remember her?"

"Because I miss her, and because she has no conscious awareness of those things."

"Ah, but then which conscious awareness do you wish her to have? There is no copy; all possible tributaries of the Great Tao contain an original. And each of those originals experience in their own way. You wish me to pluck out a {configuration} and present it to you, and declare "This one! This one is Ah-Chen!". But which one? Or do you leave that choice to me?"

"No, Teacher. I know better than to leave that choice to you. But... you have shown me many great wonders, in alchemy and in other works of the Tao. If her brain had been preserved, perhaps frozen as you showed me the frozen koi, I could present that to you and you could reconstruct her {configuration} from that?"

I smiled sadly. "To certain degrees of precision, yes, I could. But the question still remains - you have only narrowed down the possible {configurations}. And what makes you say that the boundary of {configurations} that are achievable from a frozen brain are correct? If I smash that brain with a hammer, melt it, and paint a portrait of Ah-Chen with it, is that not a {configuration} that is achievable from that brain?"

Shen looked disgusted. "You... how can you be so wise and yet not understand such simple things? We are talking about people! Not paintings!"

I continued to smile sadly. "Because these things are not so simple. 'People' are not things, as you said before. 'People' are {sets of configurations}; they are {specifications of search spaces}. And those boundaries are so indistinct that anything that claims to capture them is in error."

Now it was Shen's turn to look animated. "Just because the boundary cannot be drawn perfectly, does not make the boundary meaningless!"

I nodded. "You have indeed learned much. But you still have not described the purpose of your boundary-drawing. Do you wish for Ah-Chen's resurrection for yourself, so that you may feel less lonely and grieved, or do you wish it for Ah-Chen's sake, so that she may see the world anew? For these two purposes will give us very different boundaries for what is an acceptable Ah-Chen."

Shen grimaced, as war raged within his heart. "You are so wise in the Tao; stop these games and do what I mean!"

And so it was that Miser Shen came to live in the walled city of Ch'in, and hoarded gold and jade, and lost all memory and desire for his daughter Ah-Chen, until it was that the Tao swept him up into another tale.

 

So, there we are. My confusion is in two parts:

1. When I imagine resurrecting loved ones, what makes me believe that even a perfectly preserved brain state is any more 'resurrection' than an overly sophisticated wind-up toy that happens to behave in ways that fulfill my desire for that loved one's company? In a certain sense, avoiding true 'resurrection' should be PREFERABLE - since it is possible that a "wind-up toy" could be constructed that provides a superstimulus version of that loved one's company, while an actual 'resurrection' will only be as good as the real thing.

2. When I imagine being resurrected "myself", how different from this 'me' can it be and still count? How is this fundamentally different from "I will for the future to contain a being like myself", which is really just "I will for the future to contain a being like I imagine myself to be" - in which case, we're back to the superstimulus option (which is perhaps a little weird in this case, since I'm not there to receive the stimulus).

I'd really like to discuss this.

 

Polyphasic Sleep Seed Study: Reprise

40 BrienneYudkowsky 21 September 2013 10:29PM

(Original post on the polyphasic sleep experiment here.)

Welp, this got a little messy. The main culprit was Burning Man, though there were some other complications with data collection as well. Here are the basics of what went down.

Fourteen people participated in the main experiment. Most of them were from Leverage. There were a few stragglers from a distance, but communication with them was poor. 

We did some cognitive batteries beforehand, mostly through Quantified Mind. A few people had extensive baseline data, partly because many had been using Zeos for months, and partly because a few stuck to the two-week daily survey. Leverage members (not me) are processing the data, and they'll probably have more detailed info for us in three months(ish).

With respect to the adaptation itself, we basically followed the plan outlined in my last post. Day one no sleep, then Uberman-12, then cut back to Uberman-6, then Everyman-3.

Most people ended up switching very quickly to Uberman-6 (within the first two or three days), and most switched to Everyman-3 after about five to seven days on Uberman-6. Three people tried to hold the Uberman schedule indefinitely: One person continued Uberman-6 for two full weeks, and two held out for twenty-one days. Afterwards, all three transitioned to Everyman-3. 

During the originally planned one-month period, five people dropped out. Nine were on some form of polyphasic for the whole month. One returned to monophasic at the end of the official experiment with only partial adaptation achieved. 

Then Burning Man disrupted everybody's sleep schedule. Afterward, one person continued experimenting with less common variations of the Everyman schedule. Three went back to Everyman-3. One switched to Everyman-2. Two people have flexible schedules that include two hours less sleep per day. One person's schedule was disrupted by travel for a while after Burning Man, and they're now re-adapting.

Now that all is said and done, eight of the original fourteen are polyphasic.



I'll hold off on concluding very much from this until I see the results of the cognitive battery and such, plus the number who are still polyphasic after three months. In the mean time, I'll just stick with this: Some people are capable of going polyphasic and staying that way (probably?). Sleep is complicated and confusing. I don't know how it works. I don't think anyone else really does either. More research is desperately needed.

I know three months is a long way away. I'm feeling impatient too. But details will arrive! In the mean time, here's a video of what zombie-Brienne is like during the really difficult stretches, and here is how she entertained herself when she could manage to do things besides pace. (I was one of the few who bailed out early :-p)

Update on establishment of Cambridge’s Centre for Study of Existential Risk

40 Sean_o_h 12 August 2013 04:11PM

Cambridge's high-profile launch of the Centre for Study of Existential Risk last November received a lot of attention on LessWrong, and a number of people have been enquiring as to what's happened since. This post is meant to give a little explanation and update of what's been going on.

Motivated by a common concern over human activity-related risks to humanity, Lord Martin Rees, Professor Huw Price, and Jaan Tallinn founded the Centre for Study of Existential Risk last year.  However, this announcement was made before the establishment of a physical research centre or the securing of long-term funding. The last 9 months have been focused on turning an important idea into a reality.

Following the announcement in November, Professor Price contacted us at the Future of Humanity Institute regarding the possibility of collaboration on joint academic funding opportunities; the aim being both to raise the funds for CSER’s research programmes and to support joint work by the FHI and CSER’s researchers on anthropogenic existential risk. We submitted our first grant application in January to the European Research Council – an ambitious project to create “A New Science of Existential Risk” that, if successful, would provide enough funding for CSER’s first research programme - a sizeable programme that will run for five years.
We’ve been successful in the first and second rounds, and we will hear a final round decision at the end of the year. It was also an opportunity for us to get some additional leading academics onto the project – Sir Partha Dasgupta, Professor of Economics at Cambridge and an expert in social choice theory, sustainability and intergenerational ethics, is a co-PI (along with Huw Price, Martin Rees and Nick Bostrom). In addition, a number of prominent academics concerned about technology-related risk – including Stephen Hawking, David Spiegelhalter, George Church and David Chalmers – have joined our advisory board.

The FHI regards establishment of CSER as of the highest priority for a number of reasons including:

1) The value of the research the Centre will engage in
2) The reputational boost to the field of Existential Risk gained by the establishment of a high-profile research centre in Cambridge.
3) The impact on policy and public perception that academic heavy-hitters like Rees and Price can have

Therefore we’ve been working with CSER behind the scenes over the last 9 months. Progress has been a little slow until now – Huw, Martin and Jaan are fully committed to this project, but due to their other responsibilities aren’t in a position to work full-time on it yet. 

However, we’re now in a position to make CSER’s establishment official. Cambridge’s new Centre for Research in the Arts, Social Sciences and Humanities (CRASSH) will host CSER and provide logistical support. I’ll be acting manager of CSER’s activities over the coming 6-12 months, under the guidance of Huw, Martin and Jaan. A generous seed funding donation from Jaan Tallinn is funding CSER’s establishment and these activities – which will include a lecture series, workshops, public outreach, and staff time on grant-writing and fundraising. It’ll also provide a buyout of a fraction of my time from FHI (providing funds for us to hire part-time staff to offload some of the FHI workload and help with some of the CSER work).

At the moment and over the next couple of months we’re going to be focused on identifying and working on additional academic funding opportunities for additional programmes, as well as chasing some promising leads in industry, private and philanthropic funding. I’ll also be aiming to keep CSER’s public profile active. There will be newsletters every three months (sign up here), the website’s going to be fleshed out to contain more detail about our planned research and existing literature, and we’ll be arranging regular high-quality media engagement. While we’re unlikely to have time to answer every general query that comes in (though we’ll try whenever possible: email: admin@cser.org), we’ll aim to keep the existential risk community informed through the newsletters and posts such as these.

We’ve been lucky to get a lot of support from the academic and existential risk community for the CSER centre. In addition to CRASSH, Cambridge’s Centre for Science and Policy will provide support in making policy-relevant links, and may co-host and co-publicise events. Luke Muehlhauser, MIRI’s Executive Director, has been very supportive and has provided valuable advice, and has generously offered to direct some of MIRI’s volunteer support towards CSER tasks. We also expect to get valuable support from the growing community around FHI.

From where I’m sitting, CSER’s successful launch is looking very promising. The timeline on our research programmes, however, is still a little more uncertain. If we’re successful with the European Research Council, we can expect to be hiring a full research team next spring. If not, it may take a little longer, but we’re exploring a number of different opportunities in parallel and are feeling confident. The support of the existential risk community continues to be invaluable.

Thanks,

Seán Ó hÉigeartaigh
Academic Manager, Future of Humanity Institute 
Acting Academic Manager, Cambridge Centre for Study of Existential Risk.


"Stupid" questions thread

40 gothgirl420666 13 July 2013 02:42AM

r/Fitness does a weekly "Moronic Monday", a judgment-free thread where people can ask questions that they would ordinarily feel embarrassed for not knowing the answer to. I thought this seemed like a useful thing to have here - after all, the concepts discussed on LessWrong are probably at least a little harder to grasp than those of weightlifting. Plus, I have a few stupid questions of my own, so it doesn't seem unreasonable that other people might as well. 

Three more ways identity can be a curse

40 gothgirl420666 28 April 2013 02:53AM

The Buddhists believe that one of the three keys to attaining true happiness is dissolving the illusion of the self. (The other two are dissolving the illusion of permanence, and ceasing the desire that leads to suffering.) I'm not really sure exactly what it means to say "the self is an illusion", and I'm not exactly sure how that will lead to enlightenment, but I do think one can easily take the first step on this long journey to happiness by beginning to dissolve the sense of one's identity. 

Previously, in "Keep Your Identity Small", Paul Graham showed how a strong sense of identity can lead to epistemic irrationality, when someone refuses to accept evidence against x because "someone who believes x" is part of his or her identity. And in "The Curse of Identity", Kaj Sotala illustrated a human tendency to reinterpret a goal of "do x" as "give the impression of being someone who does x". These are both fantastic posts, and you should read them if you haven't already.

Here are three more ways in which identity can be a curse.

1. Don't be afraid to change

James March, professor of political science at Stanford University, says that when people make choices, they tend to use one of two basic models of decision making: the consequences model, or the identity model. In the consequences model, we weigh the costs and benefits of our options and make the choice that maximizes our satisfaction. In the identity model, we ask ourselves "What would a person like me do in this situation?"1

The author of the book I read this in didn't seem to take the obvious next step and acknowledge that the consequences model is clearly The Correct Way to Make Decisions, and that basically by definition, if you're using the identity model and it's giving you a different result than the consequences model would, you're being led astray. A heuristic I like to use is to limit my identity to the "observer" part of my brain, and make my only goal maximizing the amount of happiness and pleasure the observer experiences, and minimizing the amount of misfortune and pain. It sounds obvious when you lay it out in these terms, but let me give an example. 

Alice is an incoming freshman in college trying to choose her major. At Hypothetical University, there are only two majors: English, and business. Alice absolutely adores literature, and thinks business is dreadfully boring. Becoming an English major would allow her to have a career working with something she's passionate about, which is worth 2 megautilons to her, but it would also make her poor (0 mu). Becoming a business major would mean working in a field she is not passionate about (0 mu), but it would also make her rich, which is worth 1 megautilon. So English, with 2 mu, wins out over business, with 1 mu.

However, Alice is very bright, and is the type of person who can adapt herself to many situations and learn skills quickly. If Alice were to spend the first six months of college deeply immersing herself in studying business, she would probably start developing a passion for business. If she purposefully exposed herself to certain pro-business memeplexes (e.g. watched a movie glamorizing the life of Wall Street bankers), then she could speed up this process even further. After a few years of taking business classes, she would probably begin to forget what about English literature was so appealing to her, and be extremely grateful that she made the decision she did. Therefore she would gain the same 2 mu from having a job she is passionate about, along with an additional 1 mu from being rich, meaning that the 3 mu choice of business wins out over the 2 mu choice of English.

However, the possibility of self-modifying into someone who finds English literature boring and business interesting is very disturbing to Alice. She sees it as a betrayal of everything that she is, even though she's actually only been interested in English literature for a few years. Perhaps she thinks of choosing business as "selling out" or "giving in". Therefore she decides to major in English, and takes the 2 mu choice instead of the superior 3 mu choice.
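
(For the procedurally inclined, here is a toy sketch of the consequences model applied to Alice's choice, using the illustrative megautilon numbers above; the option labels are mine.)

  options = {
      "english": {"passion": 2, "wealth": 0},
      "business, staying as she is": {"passion": 0, "wealth": 1},
      "business, after self-modifying": {"passion": 2, "wealth": 1},
  }

  def utility(outcome):
      return outcome["passion"] + outcome["wealth"]

  best = max(options, key=lambda name: utility(options[name]))
  print(best)   # "business, after self-modifying", worth 3 mu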

(Obviously this is a hypothetical example/oversimplification and there are a lot of reasons why it might be rational to pursue a career path that doesn't make very much money.)

It seems to me like human beings have a bizarre tendency to want to keep certain attributes and character traits stagnant, even when doing so provides no advantage, or is actively harmful. In a world where business-passionate people systematically do better than English-passionate people, it makes sense to self-modify to become business-passionate. Yet this is often distasteful.

For example, until a few weeks ago when I started solidifying this thinking pattern, I had an extremely adverse reaction to the idea of ceasing to be a hip-hop fan and becoming a fan of more "sophisticated" musical genres like jazz and classical, eventually coming to look down on the music I currently listen to as primitive or silly. This doesn't really make sense - I'm sure if I were to become a jazz and classical fan I would enjoy those genres at least as much as I currently enjoy hip hop. And yet I had a very strong preference to remain the same, even in the trivial realm of music taste. 

Probably the most extreme example is the common tendency for depressed people to not actually want to get better, because depression has become such a core part of their identity that the idea of becoming a healthy, happy person is disturbing to them. (I used to struggle with this myself, in fact.) Being depressed is probably the most obviously harmful characteristic that someone can have, and yet many people resist self-modification.

Of course, the obvious objection is there's no way to rationally object to people's preferences - if someone truly prioritizes keeping their identity stagnant over not being depressed then there's no way to tell them they're wrong, just like if someone prioritizes paperclips over happiness there's no way to tell them they're wrong. But if you're like me, and you are interested in being happy, then I recommend looking out for this cognitive bias. 

The other objection is that this philosophy leads to extremely unsavory wireheading-esque scenarios if you take it to its logical conclusion. But holding the opposite belief - that it's always more important to keep your characteristics stagnant than to be happy - clearly leads to even more absurd conclusions. So there is probably some point on the spectrum where change is so distasteful that it's not worth a boost in happiness (e.g. a lobotomy or something similar). However, I think that in actual practical pre-Singularity life, most people set this point far, far too low. 

2. The hidden meaning of "be yourself"

(This section is entirely my own speculation, so take it as you will.)

"Be yourself" is probably the most widely-repeated piece of social skills advice despite being pretty clearly useless - if it worked then no one would be socially awkward, because everyone has heard this advice. 

However, there must be some sort of core grain of truth in this statement, or else it wouldn't be so widely repeated. I think that core grain is basically the point I just made, applied to social interaction. I.e., always optimize for social success and positive relationships (particularly in the moment), and not for signalling a certain identity. 

The ostensible purpose of identity/signalling is to appear to be a certain type of person, so that people will like and respect you, which is in turn so that people will want to be around you and be more likely to do stuff for you. However, oftentimes this goes horribly wrong, and people become very devoted to cultivating certain identities that are actively harmful for this purpose, e.g. goth, juggalo, "cool reserved aloof loner", guy that won't shut up about politics, etc. A more subtle example is Fred, who holds the wall and refuses to dance at a nightclub because he is a serious, dignified sort of guy, and doesn't want to look silly. However, the reason why "looking silly" is generally a bad thing is because it makes people lose respect for you, and therefore make them less likely to associate with you. In the situation Fred is in, holding the wall and looking serious will cause no one to associate with him, but if he dances and mingles with strangers and looks silly, people will be likely to associate with him. So unless he's afraid of looking silly in the eyes of God, this seems to be irrational.

Probably more common is the tendency to go to great care to cultivate identities that are neither harmful nor beneficial. E.g. "deep philosophical thinker", "Grateful Dead fan", "tough guy", "nature lover", "rationalist", etc. Boring Bob is a guy who wears a blue polo shirt and khakis every day, works as hard as expected but no harder in his job as an accountant, holds no political views, and when he goes home he relaxes by watching whatever's on TV and reading the paper. Boring Bob would probably improve his chances of social success by cultivating a more interesting identity, perhaps by changing his wardrobe, hobbies, and viewpoints, and then liberally signalling this new identity. However, most of us are not Boring Bob, and a much better social success strategy for most of us is probably to smile more, improve our posture and body language, be more open and accepting of other people, learn how to make better small talk, etc. But most people fail to realize this and instead play elaborate signalling games in order to improve their status, sometimes even at the expense of lots of time and money.

Some ways by which people can fail to "be themselves" in individual social interactions: liberally sprinkle references to certain attributes that they want to emphasize, say nonsensical and surreal things in order to seem quirky, be afraid to give obvious responses to questions in order to seem more interesting, insert forced "cool" actions into their mannerisms, act underwhelmed by what the other person is saying in order to seem jaded and superior, etc. Whereas someone who is "being herself" is more interested in creating rapport with the other person than giving off a certain impression of herself.  

Additionally, optimizing for a particular identity might not only be counterproductive - it might actually be a quick way to get people to despise you. 

I used to not understand why certain "types" of people, such as "hipsters"2 or Ed Hardy and Affliction-wearing "douchebags" are so universally loathed (especially on the internet). Yes, these people are adopting certain styles in order to be cool and interesting, but isn't everyone doing the same? No one looks through their wardrobe and says "hmm, I'll wear this sweater because it makes me uncool, and it'll make people not like me". Perhaps hipsters and Ed Hardy Guys fail in their mission to be cool, but should we really hate them for this? If being a hipster was cool two years ago, and being someone who wears normal clothes, acts normal, and doesn't do anything "ironically" is cool today, then we're really just hating people for failing to keep up with the trends. And if being a hipster actually is cool, then, well, who can fault them for choosing to be one?

That was my old thought process. Now it is clear to me that what makes hipsters and Ed Hardy Guys hated is that they aren't "being themselves" - they are much more interested in cultivating an identity of interestingness and masculinity, respectively, than connecting with other people. The same thing goes for pretty much every other collectively hated stereotype I can think of3 - people who loudly express political opinions, stoners who won't stop talking about smoking weed, attention seeking teenage girls on facebook, extremely flamboyantly gay guys, "weeaboos", hippies and new age types, 2005 "emo kids", overly politically correct people, tumblr SJA weirdos who identify as otherkin and whatnot, overly patriotic "rednecks", the list goes on and on. 

This also clears up a confusion that occurred to me when reading How to Win Friends and Influence People. I know people who have a Dale Carnegie mindset of being optimistic and nice to everyone they meet and are adored for it, but I also know people who have the same attitude and yet are considered irritatingly saccharine and would probably do better to "keep it real" a little. So what's the difference? I think the difference is that the former group are genuinely interested in being nice to people and building rapport, while members of the second group have made an error like the one described in Kaj Sotala's post and are merely trying to give off the impression of being a nice and friendly person. The distinction is obviously very subtle, but it's one that humans are apparently very good at perceiving. 

I'm not exactly sure what it is that causes humans to have this tendency of hating people who are clearly optimizing for identity - it's not as if they harm anyone. It probably has to do with tribal status. But what is clear is that you should definitely not be one of them. 

3. The worst mistake you can possibly make in combating akrasia

The main thesis of PJ Eby's Thinking Things Done is that the primary reason people are incapable of being productive is that they use negative motivation ("if I don't do x, some negative y will happen") as opposed to positive motivation ("if I do x, some positive y will happen"). He has the following evo-psych explanation for this: in the ancestral environment, personal failure meant that you could be kicked out of your tribe, which would be fatal. A lot of depressed people make statements like "I'm worthless", or "I'm scum", or "No one could ever love me", which are illogically dramatic and overly black and white, until you realize that these statements are merely interpretations of a feeling of "I'm about to get kicked out of the tribe, and therefore die." Animals have a freezing response to imminent death, so if you fear failure you will go into do-nothing mode and not be able to work at all.4

In Succeed: How We Can Reach Our Goals, PhD psychologist Heidi Halvorson takes a different view and describes positive motivation and negative motivation as having pros and cons. However, she has her own dichotomy of Good Motivation and Bad Motivation: "Be good" goals are performance goals, and are directed at achieving a particular outcome, like getting an A on a test, reaching a sales target, getting your attractive neighbor to go out with you, or getting into law school. They are very often tied closely to a sense of self-worth. "Get better" goals are mastery goals, and people who pick these goals judge themselves instead in terms of the progress they are making, asking questions like "Am I improving? Am I learning? Am I moving forward at a good pace?" Halvorson argues that "get better" goals are almost always drastically better than "be good" goals5. An example quote (from page 60) is:

When my goal is to get an A in a class and prove that I'm smart, and I take the first exam and I don't get an A... well, then I really can't help but think that maybe I'm not so smart, right? Concluding "maybe I'm not smart" has several consequences and none of them are good. First, I'm going to feel terrible - probably anxious and depressed, possibly embarrassed or ashamed. My sense of self-worth and self-esteem are going to suffer. My confidence will be shaken, if not completely shattered. And if I'm not smart enough, there's really no point in continuing to try to do well, so I'll probably just give up and not bother working so hard on the remaining exams. 

And finally, in Feeling Good: The New Mood Therapy, David Burns describes a destructive side effect of depression he calls "do-nothingism":

One of the most destructive aspects of depression is the way it paralyzes your willpower. In its mildest form you may simply procrastinate about doing a few odious chores. As your lack of motivation increases, virtually any activity appears so difficult that you become overwhelmed by the urge to do nothing. Because you accomplish very little, you feel worse and worse. Not only do you cut yourself off from your normal sources of stimulation and pleasure, but your lack of productivity aggravates your self-hatred, resulting in further isolation and incapacitation.

Synthesizing these three pieces of information leads me to believe that the worst thing you can possibly do for your akrasia is to tie your success and productivity to your sense of identity/self-worth, especially if you're using negative motivation to do so, and especially if you suffer or have recently suffered from depression or low-self esteem. The thought of having a negative self-image is scary and unpleasant, perhaps for the evo-psych reasons PJ Eby outlines. If you tie your productivity to your fear of a negative self-image, working will become scary and unpleasant as well, and you won't want to do it.

I suspect this might be the single biggest reason why people are akratic. It might be a little premature to say that, and I might be biased by how large a factor this mistake was in my own akrasia. But unfortunately, this trap seems like a very easy one to fall into. If you're someone who is lazy and isn't accomplishing much in life, perhaps depressed, then it makes intuitive sense to motivate yourself by saying "Come on, self! Do you want to be a useless failure in life? No? Well get going then!" But doing so will accomplish the exact opposite and make you feel miserable.

So there you have it. In addition to making you a bad rationalist and causing you to lose sight of your goals, a strong sense of identity will cause you to make poor decisions that lead to unhappiness, unpopularity, and failure. I think the Buddhists were onto something with this one, personally, and I try to limit my sense of identity as much as possible. A trick you can use, in addition to the "be the observer" trick I mentioned, is this: whenever you find yourself thinking in identity terms, swap out that identity for the identity of "person who takes over the world by transcending the need for a sense of identity".


This is my first LessWrong discussion post, so constructive criticism is greatly appreciated. Was this informative? Or was what I said obvious, and I'm retreading old ground? Was this well written? Should this have been posted to Main? Should this not have been posted at all? Thank you. 


1. Paraphrased from page 153 of Switch: How to Change When Change is Hard

2. Actually, while it works for this example, I think the stereotypical "hipster" is a bizarre caricature that doesn't match anyone who actually exists in real life, and the degree to which people will rabidly espouse hatred for this stereotypical figure (or used to two or three years ago) is one of the most bizarre tendencies people have. 

3. Other than groups that arguably hurt people (religious fundamentalists, PUAs), the only exception I can think of is frat boy/jock types. They talk about drinking and partying a lot, sure, but not really any more than people who drink and party a lot would be expected to. Possibilities for their hated status include that they do in fact engage in obnoxious signalling and I'm not aware of it, jealousy, or stigmatization as hazers and date rapists. Also, a lot of people hate stereotypical "ghetto" black people who sag their jeans and notoriously type in a broken, difficult-to-read form of English. This could either be a weak example of the trend (I'm not really sure what it is they would be signalling, maybe dangerousness?), or just a manifestation of racism.

4. I'm not sure if this is valid science that he pulled from some other source, or if he just made this up.

5. The exception is that "be good" goals can lead to a very high level of performance when the task is easy. 

 

Explicit and tacit rationality

40 lukeprog 09 April 2013 11:33PM

Like Eliezer, I "do my best thinking into a keyboard." It starts with a burning itch to figure something out. I collect ideas and arguments and evidence and sources. I arrange them, tweak them, criticize them. I explain it all in my own words so I can understand it better. By then it is nearly something that others would want to read, so I clean it up and publish, say, How to Beat Procrastination. I write essays in the original sense of the word: "attempts."

This time, I'm trying to figure out something we might call "tacit rationality" (cf. tacit knowledge).

I tried and failed to write a good post about tacit rationality, so I wrote a bad post instead — one that is basically a patchwork of somewhat-related musings on explicit and tacit rationality. Therefore I'm posting this article to LW Discussion. I hope the ensuing discussion ends up leading somewhere with more clarity and usefulness.

 

Three methods for training rationality

Which of these three options do you think will train rationality (i.e. systematized winning, or "winning-rationality") most effectively?

  1. Spend one year reading and re-reading The Sequences, studying the math and cognitive science of rationality, and discussing rationality online and at Less Wrong meetups.
  2. Attend a CFAR workshop, then spend the next year practicing those skills and other rationality habits every week.
  3. Run a startup or small business for one year.

Option 1 seems to be pretty effective at training people to talk intelligently about rationality (let's call that "talking-rationality"), and it seems to inoculate people against some common philosophical mistakes.

We don't yet have any examples of someone doing Option 2 (the first CFAR workshop was May 2012), but I'd expect Option 2 — if actually executed — to result in more winning-rationality than Option 1, and also a modicum of talking-rationality.

What about Option 3? Unlike Option 2 or especially Option 1, I'd expect it to train almost no ability to talk intelligently about rationality. But I would expect it to result in relatively good winning-rationality, due to its tight feedback loops.

 

Talking-rationality and winning-rationality can come apart

I've come to believe... that the best way to succeed is to discover what you love and then find a way to offer it to others in the form of service, working hard, and also allowing the energy of the universe to lead you.

Oprah Winfrey

Oprah isn't known for being a rational thinker. She is a known peddler of pseudoscience, and she attributes her success (in part) to allowing "the energy of the universe" to lead her.

Yet she must be doing something right. Oprah is a true rags-to-riches story. Born in Mississippi to an unwed teenage housemaid, she was so poor she wore dresses made of potato sacks. She was molested by a cousin, an uncle, and a family friend. She became pregnant at age 14.

But in high school she became an honors student, won oratory contests and a beauty pageant, and was hired by a local radio station to report the news. She became the youngest-ever news anchor at Nashville's WLAC-TV, then hosted several shows in Baltimore, then moved to Chicago and within months her own talk show shot from last place to first place in the ratings there. Shortly afterward her show went national. She also produced and starred in several TV shows, was nominated for an Oscar for her role in a Steven Spielberg movie, launched her own TV cable network and her own magazine (the "most successful startup ever in the [magazine] industry" according to Fortune), and became the world's first female black billionaire.

I'd like to suggest that Oprah's climb probably didn't come merely through inborn talent, hard work, and luck. To get from potato sack dresses to the Forbes billionaire list, Oprah had to make thousands of pretty good decisions. She had to make pretty accurate guesses about the likely consequences of various actions she could take. When she was wrong, she had to correct course fairly quickly. In short, she had to be fairly rational, at least in some domains of her life.

Similarly, I know plenty of business managers and entrepreneurs who have a steady track record of good decisions and wise judgments, and yet they are religious, or they commit basic errors in logic and probability when they talk about non-business subjects.

What's going on here? My guess is that successful entrepreneurs and business managers and other people must have pretty good tacit rationality, even if they aren't very proficient with the "rationality" concepts that Less Wrongers tend to discuss on a daily basis. Stated another way, successful businesspeople make fairly rational decisions and judgments, even though they may confabulate rather silly explanations for their success, and even though they don't understand the math or science of rationality well.

LWers can probably outperform Mark Zuckerberg on the CRT and the Berlin Numeracy Test, but Zuckerberg is laughing at them from atop a huge pile of utility.

 

Explicit and tacit rationality

Patri Friedman, in Self-Improvement or Shiny Distraction: Why Less Wrong is anti-Instrumental Rationality, reminded us that skill acquisition comes from deliberate practice, and reading LW is a "shiny distraction," not deliberate practice. He said a real rationality practice would look more like... well, what Patri describes is basically CFAR, though CFAR didn't exist at the time.

In response, and again long before CFAR existed, Anna Salamon wrote Goals for which Less Wrong does (and doesn't) help. Summary: Some domains provide rich, cheap feedback, so you don't need much LW-style rationality to become successful in those domains. But many of us have goals in domains that don't offer rapid feedback: e.g. whether to buy cryonics, which 40-year investments are safe, which metaethics to endorse. For this kind of thing you need LW-style rationality. (We could also state this as: domains with rapid feedback train tacit rationality with respect to those domains, but for domains without rapid feedback you've got to do the best you can with LW-style "explicit rationality".)

The good news is that you should be able to combine explicit and tacit rationality. Explicit rationality can help you realize that you should force tight feedback loops into whichever domains you want to succeed in, so that you can develop good intuitions about how to succeed in those domains. (See also: Lean Startup or Lean Nonprofit methods.)

Explicit rationality could also help you realize that the cognitive biases most-discussed in the literature aren't necessarily the ones you should focus on ameliorating, as Aaron Swartz wrote:

Cognitive biases cause people to make choices that are most obviously irrational, but not most importantly irrational... Since cognitive biases are the primary focus of research into rationality, rationality tests mostly measure how good you are at avoiding them... LW readers tend to be fairly good at avoiding cognitive biases... But there's a whole series of much more important irrationalities that LWers suffer from. (Let's call them "practical biases" as opposed to "cognitive biases," even though both are ultimately practical and cognitive.)

...Rationality, properly understood, is in fact a predictor of success. Perhaps if LWers used success as their metric (as opposed to getting better at avoiding obvious mistakes), they might focus on their most important irrationalities (instead of their most obvious ones), which would lead them to be more rational and more successful.


Final scattered thoughts

  • If someone is consistently winning, and not just because they have tons of wealth or fame, then maybe you should conclude they have pretty good tacit rationality even if their explicit rationality is terrible.
  • The positive effects of tight feedback loops might trump the effects of explicit rationality training.
  • Still, I suspect explicit rationality plus tight feedback loops could lead to the best results of all.
  • I really hope we can develop a real rationality dojo.
  • If you're reading this post, you're probably spending too much time reading Less Wrong, and too little time hacking your motivation system, learning social skills, and learning how to inject tight feedback loops into everything you can.

My workflow

40 paulfchristiano 09 December 2012 09:16PM

 

Over the last 6 months I've started doing a lot of things differently. Some of these changes seem to have increased my work output a good bit and made me happier. I normally hesitate to share habits, but I'm pretty happy with these in particular, and even if they will work for only a few people I think they are worth sharing. Most of the habits I've adopted are fairly common, but I hope I can help people anyway by identifying the habits that have most helped me.

I'm curious to hear about alternatives that have worked for you. 

 

Workflowy

Workflowy lets you edit a single collapsible outline. I use it very extensively. It is much more convenient than the network of google docs it replaced, and I use it much more often. It is much like other outliners, but (1) has a slicker interface, (2) works offline, (3) lets you recurse on and share sublists.

Workflowy is free to try but costs $5 a month. This may seem expensive for what it does, but if you use (or could use!) outliners a lot this is not enough to matter. After some searching Workflowy seems like the best option. I'm sure I like Workflowy more than most people, but I really like it, so I think it's worth trying.

Here is a skeleton of my workflowy list, which hosts many of the other systems in this post.

Checklists:

I have a checklist of tasks to do each night before sleeping. In the past I would often forget one of these things; putting them in a checklist helps me do them more reliably and makes me more relaxed. 

Checklists for other occasions, particularly waking up and traveling, are also helpful, but are much less important to me. 

Todo lists:

I now maintain two todo lists: one with a list of tasks for each upcoming day, and one with a list of tasks for future events ("I'm in the UK," "it is Thursday," "I'm going grocery shopping"). Whenever I think of something I should do, I either put it under a future day and do it when that day arrives, or I put it with an associated event. Each night I check both lists and decide what to do tomorrow. 

Beeminder:

Beeminder is a service that holds you to commitments and tracks your progress. It has helped me a lot over the last months. I've experimented with a few different commitments, but two have been most useful: following a daily routine, and doing a minimum amount of work each day (on average). Beeminder has pretty low overhead.

Reflection:

I spend about 10% of my productive time reflecting on how things have been going and what I should do differently. I benefit from producing concrete possible changes each time I sit down to think. I realized how important this is for me recently; since I've started doing it more reliably, I have gotten a lot more out of reflection.

Pomodoro:

I do my work in uninterrupted blocks of 20 minutes, punctuated by 2-3 minute breaks. This is my bastardized, minimalist version of the pomodoro technique, which I arrived at by trial and error. I use Alinof timer, which was recommended to me by a friend. 
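For anyone who wants a bare-bones stand-in rather than a dedicated app, here is a minimal sketch of the same rhythm (this is not the Alinof timer, just an illustrative loop; the four-cycle default is an arbitrary choice):

    import time

    WORK_MINUTES = 20     # matches the 20-minute blocks above
    BREAK_MINUTES = 3     # the breaks above are 2-3 minutes

    def pomodoro(cycles=4):
        # Alternate work and break periods, ringing the terminal bell between them.
        for i in range(cycles):
            print(f"Block {i + 1}: work for {WORK_MINUTES} minutes")
            time.sleep(WORK_MINUTES * 60)
            print(f"\aTake a {BREAK_MINUTES}-minute break")
            time.sleep(BREAK_MINUTES * 60)
        print("\aDone")

    if __name__ == "__main__":
        pomodoro()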

Calendar:

I now record commitments on my calendar reliably and check it each night. I failed to do this for 6 months after finishing my undergraduate degree, which I think was a serious mistake. I became much more reliable at checking my calendar after adopting a daily checklist.

Time Logging:

Whenever I start a new activity, I write down the current time and a description of what I just stopped doing. At the end of the day I spend a few minutes reading this log and estimating how much time I spent on each activity. This makes me more attentive to time during the day, helps me remember what I did throughout the day, and frees up attention. Sometimes I use the logs to try and notice trends. For example, I've been exercising on random days and measuring how this affects my time. I don't yet know if this helps at all.
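A minimal sketch of how the end-of-day tally could be automated, assuming each log line looks like "HH:MM what-I-just-stopped-doing" (the format and the helper below are illustrative assumptions, not a description of any particular tool):

    from collections import defaultdict
    from datetime import datetime

    def summarize(log_lines):
        # Each entry records the time an activity *ended*, so its duration
        # runs from the previous timestamp up to its own timestamp.
        totals = defaultdict(float)
        prev = None
        for line in log_lines:
            stamp, _, activity = line.strip().partition(" ")
            t = datetime.strptime(stamp, "%H:%M")
            if prev is not None:
                totals[activity] += (t - prev).total_seconds() / 60
            prev = t
        return dict(totals)

    print(summarize(["08:00 sleep", "08:20 breakfast", "10:00 writing", "10:05 break"]))
    # -> {'breakfast': 20.0, 'writing': 100.0, 'break': 5.0}

The first entry of the day gets no duration, since there is no earlier timestamp to measure from.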

Catch:

Catch is a note-taking app. It is very minimal, and lets you record a voice note by pressing a single button. It has substantially increased my affordance for taking notes during the day, which I use to remember todo items and help with time logging.

Aubrey de Grey has responded to his IAMA - Now with Transcript!

40 Username 30 June 2012 06:47AM

Watch the video response here: http://www.youtube.com/watch?v=-tsI_28O3Ws

This was posted here on lesswrong a while ago, but they recently uploaded a new version of the video and I took the liberty of typing up a transcript.

The video is fairly long, about 25 minutes. But it's incredibly engaging and I highly recommend watching it. For those who prefer text (because it's faster or because you are a computer), you can read the transcript in this google doc, or below in the comments. Enjoy!

Cryonics without freezers: resurrection possibilities in a Big World

40 Yvain 04 April 2012 10:48PM

And fear not lest Existence closing your
Account, should lose, or know the type no more;
The Eternal Saki from the Bowl has pour'd
Millions of Bubbles like us, and will pour.

When You and I behind the Veil are past,
Oh, but the long long while the World shall last,
Which of our Coming and Departure heeds
As much as Ocean of a pebble-cast.

    -- Omar Khayyam, Rubaiyat

 

A CONSEQUENTIALIST VIEW OF IDENTITY

The typical argument for cryonics says that if we can preserve brain data, one day we may be able to recreate a functioning brain and bring the dead back to life.

The typical argument against cryonics says that even if we could do that, the recreation wouldn't be "you". It would be someone who thinks and acts exactly like you.

The typical response to the typical argument against cryonics says that identity isn't in specific atoms, so it's probably in algorithms, and the recreation would have the same mental algorithms as you and so be you. The gap in consciousness of however many centuries is no more significant than the gap in consciousness between going to bed at night and waking up in the morning, or the gap between going into a coma and coming out of one.

We can call this a "consequentialist" view of identity, because it's a lot like the consequentialist views of morality. Whether a person is "me" isn't a function of how we got to that person, but only of where that person is right now: that is, how similar that person's thoughts and actions are to my own. It doesn't matter if we got to him by having me go to sleep and wake up as him, or got to him by having aliens disassemble my brain and then simulate it on a cellular automaton. If he thinks like me, he's me.

A corollary of the consequentialist view of identity says that if someone wants to create fifty perfect copies of me, all fifty will "be me" in whatever sense that means something.

GRADATIONS OF IDENTITY

An argument against cryonics I have never heard, but which must exist somewhere, says that even the best human technology is imperfect, and likely a few atoms here and there - or even a few entire neurons - will end up out of place. Therefore, the recreation will not be you, but someone very very similar to you.

And the response to this argument is "Who cares?" If by "me" you mean Yvain as of 10:20 PM 4th April 2012, then even Yvain as of 10:30 is going to have some serious differences at the atomic scale. Since I don't consider myself a different person every ten minutes, I shouldn't consider myself a different person if the resurrection-machine misplaces a few cells here or there.

But this is a slippery slope. If my recreation is exactly like me except for one neuron, is he the same person? Signs point to yes. What about five neurons? Five million? Or on a functional level, what if he blinked at exactly one point where I would not have done so? What if he prefers a different flavor of ice cream? What if he has exactly the same memories as I do, except for the outcome of one first-grade spelling bee I haven't thought about in years anyway? What if he is a Hindu fundamentalist?

If we're going to take a consequentialist view of identity, then my continued ability to identify with myself even if I naturally switch ice cream preferences suggests I should identify with a botched resurrection who also switches ice cream preferences. The only solution here that really makes sense is to view identity in shades of gray instead of black-and-white. An exact clone is more me than a clone with different ice cream preferences, who is more me than a clone who is a Hindu fundamentalist, who is more me than LeBron James is.

BIG WORLDS

There are various theories lumped together under the title "big world".

The simplest is the theory that the universe (or multiverse) is Very Very Big. Although the universe is only around 14 billion years old, so the portion we can observe is of limited size, inflation allows the universe as a whole to be far larger than the part we can see; it could be very large or possibly infinite. I don't have the numbers available, but I remember a back-of-the-envelope calculation being posted on Less Wrong once about exactly how big the universe would have to be to contain repeating patches about the size of the Earth. That is, just as the first ten digits of pi, 3141592653, must repeat somewhere else in pi if pi's digits are infinite and patternless, and just as I would believe this with high probability even if pi were not infinite but just very very large, so the arrangement of atoms that make up Earth would recur in an infinite or very very large universe. This arrangement would obviously include you, exactly as you are now. A much larger class of Earth-sized patches would include slightly different versions of you like the one with different ice cream preferences. This would also work, as Omar Khayyam mentioned in the quote at the top, if the universe were to last forever or a very very long time.

The second type of "big world" is the one posited by the Many Worlds theory of quantum mechanics, in which each quantum event causes the Universe to split into several branches. Because quantum events determine larger-level events, and because each branch continues branching, some of these branches could be similar to our universe but with observable macro-scale differences. For example, there might be a branch in which you are the President of the United States, or the Pope, or died as an infant. Although this sounds like a silly popular science version of the principle, I don't think it's unfair or incorrect.

The third type of "big world" is modal realism: the belief that all possible worlds exist, maybe in proportion to their simplicity (whatever that means). We notice the existence of our own world only for indexical reasons: that is, just as there are many countries, but when I look around me I only see my own; so there are many possibilities, but when I look around me I only see my own. If this is true, it is not only possible but certain that there is a world where I am Pope and so on.

There are other types of "big worlds" that I won't get into here, but if any type at all is correct, then there should be very many copies of me or people very much like me running around.

CRYONICS WITHOUT FREEZERS

Cryonicists say that if you freeze your brain, you may experience "waking up" a few centuries later when someone uses the brain to create a perfect copy of you.

But whether or not you freeze your brain, a Big World is creating perfect copies of you all the time. The consequentialist view of identity says that your causal connection with these copies is unnecessary for them to be you. So why should a copy of you created by a far-future cryonicist with access to your brain be better able to "resurrect" you than a copy of you that comes to exist for some other reason?

For example, suppose I choose not to sign up for cryonics, have a sudden heart attack, and die in my sleep. Somewhere in a Big World, there is someone exactly like me except that they didn't have the heart attack and they wake up healthy the next morning.

The cryonicists believe that having a healthy copy of you come into existence after you die is sufficient for you to "wake up" as that copy. So why wouldn't I "wake up" as the healthy, heart-attack-free version of me in the universe next door?

Or: suppose that a Friendly AI fills a human-sized three-dimensional grid with atoms, using a quantum die to determine which atom occupies each "pixel" in the grid. This splits the universe into as many branches as there are possible permutations of the grid (presumably a lot) and in one of those branches, the AI's experiment creates a perfect copy of me at the moment of my death, except healthy. If creating a perfect copy of me causes my "resurrection", then that AI has just resurrected me as surely as cryonics would have.

The only downside I can see here is that I have less measure (meaning I exist in a lower proportion of worlds) than if I had signed up for cryonics directly. This might be a problem if I think that my existence benefits others - but I don't think I should be concerned for my own sake. Right now I don't go to bed at night weeping that my father only met my mother through a series of unlikely events and so most universes probably don't contain me; I'm not sure why I should do so after having been resurrected in the far future.

RESURRECTION AS SOMEONE ELSE

What if the speculative theories involved in Big Worlds all turn out to be false? All hope is still not lost.

Above I wrote:

An exact clone is more me than a clone with different ice cream preferences, who is more me than a clone who is a Hindu fundamentalist, who is more me than LeBron James is.

I used LeBron James because from what I know about him, he's quite different from me. But what if I had used someone else? One thing I learned upon discovering Less Wrong is that I had previously underestimated just how many people out there are *really similar to me*, even down to weird interests, personality quirks, and sense of humor. So let's take the person living in 2050 who is most similar to me now. I can think of several people on this site alone who would make a pretty impressive lower bound on how similar the most similar person to me would have to be.

In what way is this person waking up on the morning of January 1 2050 equivalent to me being sort of resurrected? What if this person is more similar to Yvain(2012) than Yvain(1995) is? What if I signed up for cryonics, died tomorrow, and was resurrected in 2050 by a process about as lossy as the difference between me and this person?

SUMMARY

Personal identity remains confusing. But some of the assumptions cryonicists make are, in certain situations, sufficient to guarantee personal survival after death without cryonics.

Common mistakes people make when thinking about decision theory

40 cousin_it 27 March 2012 08:03PM

From my experience reading and talking about decision theory on LW, it seems that many of the unproductive comments in these discussions can be attributed to a handful of common mistakes.

Mistake #1: Arguing about assumptions

The main reason why I took so long to understand Newcomb's Problem and Counterfactual Mugging was my insistence on denying the assumptions behind these puzzles. I could have saved months if I'd just said to myself, okay, is this direction of inquiry interesting when taken on its own terms?

Many assumptions seemed to be divorced from real life at first. People dismissed the study of electromagnetism as an impractical toy, and considered number theory hopelessly abstract until cryptography arrived. The only way to make intellectual progress (either individually or as a group) is to explore the implications of interesting assumptions wherever they might lead. Unfortunately people love to argue about assumptions instead of getting anything done, though they can't really judge before exploring the implications in detail.

Several smart people on LW are repeating my exact mistake about Newcomb's Problem now, and others find ways to commit the same mistake when looking at our newer ideas. It's so frustrating and uninteresting to read yet another comment saying my assumptions look unintuitive or unphysical or irrelevant to FAI or whatever. I'm not against criticism, but somehow such comments never blossom into interesting conversations, and that's reason enough to caution you against the way of thinking that causes them.

Mistake #2: Stopping when your idea seems good enough

There's a handful of ideas that decision theory newbies rediscover again and again, like pointing out indexical uncertainty as the solution to Newcomb's problem, or adding randomness to models of UDT to eliminate spurious proofs. These ideas don't work and don't lead anywhere interesting, but that's hard to notice when you just had the flash of insight and want to share it with the world.

A good strategy in such situations is to always push a little bit past the point where you have everything figured out. Take one extra step and ask yourself: "Can I make this idea precise?" What are the first few implications? What are the obvious extensions? If your result seems to contradict what's already known, work through some of the contradictions yourself. If you don't find any mistakes in your idea, you will surely find new formal things to say about your idea, which always helps.

Mistake #2A: Stopping when your idea actually is good enough

I didn't want to name any names in this post because my status on LW puts me in a kinda position of power, but there's a name I can name with a clear conscience. In 2009, Eliezer wrote:

Formally you'd use a Godelian diagonal to write (...)

Of course that's not a newbie mistake at all, but an awesome and fruitful idea! As it happens, writing out that Godelian diagonal immediately leads to all sorts of puzzling questions like "but what does it actually do? and how do we prove it?", and eventually to all the decision theory research we're doing now. Knowing Eliezer's intelligence, he probably could have preempted most of our results. Instead he just declared the problem solved. Maybe he thought he was already at 0.95 formality and that going to 1.0 would be a trivial step? I don't want to insinuate here, but IMO he made a mistake.

Since this mistake is indistinguishable from the last, the remedy for it is the same: "Can I make this idea precise?" Whenever you stake out a small area of knowledge and make it amenable to mathematical thinking, you're likely to find new math that has lasting value. When you stop because your not-quite-formal idea seems already good enough, you squander that opportunity.

...

If this post has convinced you to stop making these common mistakes, be warned that it won't necessarily make you happier. As you learn to see more clearly, the first thing you'll see will be a locked door with a sign saying "Research is hard". Though it's not very scary or heroic, mostly you just stand there feeling stupid about yourself :-)

Is community-collaborative article production possible?

40 lukeprog 21 March 2012 08:10PM

When I showed up at the Singularity Institute, I was surprised to find that 30-60 papers' worth of material was lying around in blog posts, mailing list discussions, and people's heads — but it had never been written up in clear, well-referenced academic articles.

Why is this so? Writing such articles has many clear benefits:

  • Clearly stated and well-defended arguments can persuade smart people to take AI risk seriously, creating additional supporters and collaborators for the Singularity Institute.
  • Such articles can also improve the credibility of the organization as a whole, which is especially important for attracting funds from top-level social entrepreneurs and institutions like the Gates Foundation and Givewell.
  • Laying out the arguments clearly and analyzing each premise can lead to new strategic insights that will help us understand how to purchase x-risk reduction most efficiently.
  • Clear explanations can provide a platform on which researchers can build to produce new strategic and technical research results.
  • Communicating clearly is what lets other people find errors in your reasoning.
  • Communities can use articles to cut down on communication costs. When something is written up clearly, 1000 people can read a single article instead of needing to transmit the information by having several hundred personal conversations between 2-5 people.

Of course, there are costs to writing articles, too. The single biggest cost is staff time / opportunity cost. An article like "Intelligence Explosion: Evidence and Import" can require anywhere from 150-800 person-hours. That is 150-800 paid hours during which our staff is not doing other critically important things that collectively have a bigger positive impact than a single academic article is likely to have.

So Louie Helm and Nick Beckstead and I sat down and asked, "Is there a way we can buy these articles without such an egregious cost?"

We think there might be. Basically, we suspect that most of the work involved in writing these articles can be outsourced. Here's the process we have in mind:

  1. An SI staff member chooses a paper idea we need written up, then writes an abstract and some notes on the desired final content.
  2. SI pays Gwern or another remote researcher to do a literature search-and-summary of relevant material, with pointers to other resources.
  3. SI posts a contest to LessWrong, inviting submissions of near-conference-level-quality articles that follow the provided abstract and notes on desired final content. Contestants benefit by starting with the results of Gwern's literature summary, and by knowing that they don't need to produce something as good as "Intelligence Explosion: Evidence and Import" to win the prize. First place wins $1200, 2nd place wins $500, and 3rd place wins $200.
  4. Submissions are due 1 month later. Submissions are reviewed, and the authors of the best submissions are sent comments on what could be improved to maximize the chances of coming in first place.
  5. Revised articles are due 3 weeks after comments are received. Prizes are awarded.
  6. SI pays an experienced writer like Yvain or Kaj_Sotala or someone similar to build up and improve the 1st place submission, borrowing the best parts from the other submissions, too.
  7. An SI staff member does a final pass, adding some content, making it more clearly organized and polished, etc. One of SI's remote editors does another pass to make the sentences more perfect.
  8. The paper is submitted to a journal or an edited volume, and is marked as being co-authored by (1) the key SI staff member who provided the seed ideas and guided each stage of the revisions and polishing, (2) the author of the winning submission, and (3) Gwern. (With thanks to contributions from the other contest participants whose submissions were borrowed from — unless huge pieces were borrowed, in which case they may be counted as an additional co-author.)

If this method works, each paper may require only 50-150 hours of SI staff time per paper — a dramatic improvement! But this method has additional benefits:

  • Members of the community who are capable of doing one piece of the process but not the other pieces get to contribute where they shine. (Many people can write okay-level articles but can't do efficient literature searches or produce polished prose, etc.)
  • SI gets to learn more about the talent that exists in its community which hadn't yet been given the opportunity to flower. (We might be able to directly outsource future work to contest participants, and if one person wins three such contests, that's an indicator that we should consider hiring them.)
  • Additional paid "jobs" (by way of contest money) are created for LW rationalists who have some domain expertise in singularity-related subjects.
  • Many Less Wrongers are students in fields relevant to the subject matter of the papers that will be produced by this process, and this will give them an opportunity to co-author papers that can go on their CV.
  • The community in general gets better at collaborating.

This is, after all, more similar to how many papers would be produced by university departments, in which a senior researcher works with a team of students to produce papers.

Feedback? Interest?

(Not exactly the same, but see also the Polymath Project.)

Whole Brain Emulation: Looking At Progress On C. elegans

40 jkaufman 29 October 2011 03:21PM

Being able to treat the pattern of someone's brain as software to be run on a computer, perhaps in parallel or at a large speedup, would have a huge impact, both socially and economically.  Robin Hanson thinks it is the most likely route to artificial intelligence.  Anders Sandberg and Nick Bostrom of the Future of Humanity Institute put out a roadmap for whole brain emulation in 2008, which covers a huge amount of research in this direction, combined with some scale analysis of the difficulty of various tasks.

Because the human brain is so large, and we are so far from having the technical capacity to scan or emulate it, it's difficult to evaluate progress.  Some other organisms, however, have much smaller brains: the nematode C. elegans has only 302 neurons in its entire nervous system.  It is extremely well studied and well understood, having gone through heavy use as a research animal for decades.  Since at least 1986 we've known the full neural connectivity of C. elegans, something that would take decades and a huge amount of work to get for humans.  At 302 neurons, simulation has been within our computational capacity for at least that long.  With 25 years to work on it, shouldn't we be able to 'upload' a nematode by now?
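As a rough sanity check on that last claim, here is a back-of-the-envelope estimate in which every number is an assumption (a crude point-neuron model, not figures from the roadmap or the papers below):

    # All constants below are guesses chosen only to get the order of magnitude.
    neurons = 302
    connections = 7_000                # assumed rough count of synapses and gap junctions
    update_hz = 1_000                  # assumed integration steps per simulated second
    flops_per_neuron_update = 1_000    # assumed cost of a simple membrane model
    flops_per_connection_update = 100  # assumed cost per connection per step

    total_flops = update_hz * (neurons * flops_per_neuron_update
                               + connections * flops_per_connection_update)
    print(f"~{total_flops:.1e} FLOPS for real-time simulation")   # about 1e9, i.e. ~1 GFLOPS

That lands around a gigaflop, which late-1980s supercomputers could already supply and any modern laptop exceeds by orders of magnitude, so raw compute does not look like the bottleneck.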

Reading through the research, there's been some work on modeling subsystems and components, but I only find three projects that have tried to integrate this research into a complete simulation: the University of Oregon's NemaSys (~1997), the Perfect C. elegans Project (~1998), and Hiroshima University's Virtual C. Elegans project (~2004).  The latter two don't have web pages, but they did put out papers: [1], [2], [3].

Another way to look at this is to list the researchers who seem to have been involved with C. elegans emulation.  I find:

  • Hiroaki Kitano, Sony [1]
  • Shugo Hamahashi, Keio University [1]
  • Sean Luke, University of Maryland [1]
  • Michiyo Suzuki, Hiroshima University  [2][3]
  • Takeshi Goto, Hiroshima University [2]
  • Toshio Tsuji, Hiroshima University [2][3]
  • Hisao Ohtake, Hiroshima University [2]
  • Thomas Ferree, University of Oregon [4][5][6][7]
  • Ben Marcotte, University of Oregon [5]
  • Sean Lockery, University of Oregon [4][5][6][7]
  • Thomas Morse, University of Oregon [4]
  • Stephen Wicks, University of British Columbia [8]
  • Chris Roehrig, University of British Columbia [8]
  • Catharine Rankin, University of British Columbia [8]
  • Angelo Cangelosi, Rome Institute of Psychology [9]
  • Domenico Parisi, Rome Institute of Psychology [9]

This seems like a research area where you have multiple groups working at different universities, trying for a while, and then moving on.  None of the simulation projects have gotten very far: their emulations are not complete and have some pieces filled in by guesswork, genetic algorithms, or other artificial sources.  I was optimistic about finding successful simulation projects before I started trying to find one, but now that I haven't, my estimate of how hard whole brain emulation would be has gone up significantly.  While I wouldn't say whole brain emulation could never happen, this looks to me like it is a very long way out, probably hundreds of years.

Note: I later reorganized this into a blog post, incorporating some feedback from these comments.

Papers:

[1] The Perfect C. elegans Project: An Initial Report (1998)

[2] A Dynamic Body Model of the Nematode C. elegans With Neural Oscillators (2005)

[3] A model of motor control of the nematode C. elegans with neuronal circuits (2005)

[4] Robust spatial navigation in a robot inspired by C. elegans (1998)

[5] Neural network models of chemotaxis in the nematode C. elegans (1997)

[6] Chemotaxis control by linear recurrent networks (1998)

[7] Computational rules for chemotaxis in the nematode C. elegans (1999)

[8] A Dynamic Network Simulation of the Nematode Tap Withdrawal Circuit: Predictions Concerning Synaptic Function Using Behavioral Criteria (1996)

[9] A Neural Network Model of Caenorhabditis Elegans: The Circuit of Touch Sensitivity (1997)

An EPub of Eliezer's blog posts

40 ciphergoth 11 August 2011 02:20PM

Update 2015-03-21: I would now strongly recommend reading Rationality: From AI to Zombies over this. Though the blog posts I collected here are the starting point for that book, considerable work has gone into selecting and arranging the essays as well as adding thoughtful new material and useful material not in this collection. Only if you've already read that should you consider starting on this; you can always skip the essays you've already read.

This is all Eliezer's posts to Less Wrong up to the end of 2010 as an EPub. Can be read with Aldiko and other eBook readers, though you might have to jump through some hoops on the Kindle (haven't tried it). I shared it privately with a few friends in the past, but I thought it might be more generally useful.  Highlights include that all the screwed-up Unicode is fixed AFAIK.

Source code.

Update: have now made a MOBI for the Kindle too.

Updated 2011-08-13 17:20 BST: Now with images!

A few misconceptions surrounding Roko's basilisk

39 RobbBB 05 October 2015 09:23PM

There's a new LWW page on the Roko's basilisk thought experiment, discussing both Roko's original post and the fallout that came out of Eliezer Yudkowsky banning the topic on Less Wrong discussion threads. The wiki page, I hope, will reduce how much people have to rely on speculation or reconstruction to make sense of the arguments.

While I'm on this topic, I want to highlight points that I see omitted or misunderstood in some online discussions of Roko's basilisk. The first point that people writing about Roko's post often neglect is:

 

  • Roko's arguments were originally posted to Less Wrong, but they weren't generally accepted by other Less Wrong users.

Less Wrong is a community blog, and anyone who has a few karma points can post their own content here. Having your post show up on Less Wrong doesn't require that anyone else endorse it. Roko's basic points were promptly rejected by other commenters on Less Wrong, and as ideas not much seems to have come of them. People who bring up the basilisk on other sites don't seem to be super interested in the specific claims Roko made either; discussions tend to gravitate toward various older ideas that Roko cited (e.g., timeless decision theory (TDT) and coherent extrapolated volition (CEV)) or toward Eliezer's controversial moderation action.

In July 2014, David Auerbach wrote a Slate piece criticizing Less Wrong users and describing them as "freaked out by Roko's Basilisk." Auerbach wrote, "Believing in Roko’s Basilisk may simply be a 'referendum on autism'" — which I take to mean he thinks a significant number of Less Wrong users accept Roko’s reasoning, and they do so because they’re autistic (!). But the Auerbach piece glosses over the question of how many Less Wrong users (if any) in fact believe in Roko’s basilisk. Which seems somewhat relevant to his argument...?

The idea that Roko's thought experiment holds sway over some community or subculture seems to be part of a mythology that’s grown out of attempts to reconstruct the original chain of events; and a big part of the blame for that mythology's existence lies on Less Wrong's moderation policies. Because the discussion topic was banned for several years, Less Wrong users themselves had little opportunity to explain their views or address misconceptions. A stew of rumors and partly-understood forum logs then congealed into the attempts by people on RationalWiki, Slate, etc. to make sense of what had happened.

I gather that the main reason people thought Less Wrong users were "freaked out" about Roko's argument was that Eliezer deleted Roko's post and banned further discussion of the topic. Eliezer has since sketched out his thought process on Reddit:

When Roko posted about the Basilisk, I very foolishly yelled at him, called him an idiot, and then deleted the post. [...] Why I yelled at Roko: Because I was caught flatfooted in surprise, because I was indignant to the point of genuine emotional shock, at the concept that somebody who thought they'd invented a brilliant idea that would cause future AIs to torture people who had the thought, had promptly posted it to the public Internet. In the course of yelling at Roko to explain why this was a bad thing, I made the further error---keeping in mind that I had absolutely no idea that any of this would ever blow up the way it did, if I had I would obviously have kept my fingers quiescent---of not making it absolutely clear using lengthy disclaimers that my yelling did not mean that I believed Roko was right about CEV-based agents [= Eliezer’s early model of indirectly normative agents that reason with ideal aggregated preferences] torturing people who had heard about Roko's idea. [...] What I considered to be obvious common sense was that you did not spread potential information hazards because it would be a crappy thing to do to someone. The problem wasn't Roko's post itself, about CEV, being correct.

This, obviously, was a bad strategy on Eliezer's part. Looking at the options in hindsight: To the extent it seemed plausible that Roko's argument could be modified and repaired, Eliezer shouldn't have used Roko's post as a teaching moment and loudly chastised him on a public discussion thread. To the extent this didn't seem plausible (or ceased to seem plausible after a bit more analysis), continuing to ban the topic was a (demonstrably) ineffective way to communicate the general importance of handling real information hazards with care.

 


On that note, point number two:

  • Roko's argument wasn’t an attempt to get people to donate to Friendly AI (FAI) research. In fact, the opposite is true.

Roko's original argument was not 'the AI agent will torture you if you don't donate, therefore you should help build such an agent'; his argument was 'the AI agent will torture you if you don't donate, therefore we should avoid ever building such an agent.' As Gerard noted in the ensuing discussion thread, threats of torture "would motivate people to form a bloodthirsty pitchfork-wielding mob storming the gates of SIAI [= MIRI] rather than contribute more money." To which Roko replied: "Right, and I am on the side of the mob with pitchforks. I think it would be a good idea to change the current proposed FAI content from CEV to something that can't use negative incentives on x-risk reducers."

Roko saw his own argument as a strike against building the kind of software agent Eliezer had in mind. Other Less Wrong users, meanwhile, rejected Roko's argument both as a reason to oppose AI safety efforts and as a reason to support AI safety efforts.

Roko's argument was fairly dense, and it continued into the discussion thread. I’m guessing that this (in combination with the temptation to round off weird ideas to the nearest religious trope, plus misunderstanding #1 above) is why RationalWiki's version of Roko’s basilisk gets introduced as

a futurist version of Pascal’s wager; an argument used to try and suggest people should subscribe to particular singularitarian ideas, or even donate money to them, by weighing up the prospect of punishment versus reward.

If I'm correctly reconstructing the sequence of events: Sites like RationalWiki report in the passive voice that the basilisk is "an argument used" for this purpose, yet no examples ever get cited of someone actually using Roko’s argument in this way. Via citogenesis, the claim then gets incorporated into other sites' reporting.

(E.g., in Outer Places: "Roko is claiming that we should all be working to appease an omnipotent AI, even though we have no idea if it will ever exist, simply because the consequences of defying it would be so great." Or in Business Insider: "So, the moral of this story: You better help the robots make the world a better place, because if the robots find out you didn’t help make the world a better place, then they’re going to kill you for preventing them from making the world a better place.")

In terms of argument structure, the confusion is equating the conditional statement 'P implies Q' with the argument 'P; therefore Q.' Someone asserting the conditional isn’t necessarily arguing for Q; they may be arguing against P (based on the premise that Q is false), or they may be agnostic between those two possibilities. And misreporting about which argument was made (or who made it) is kind of a big deal in this case: 'Bob used a bad philosophy argument to try to extort money from people' is a much more serious charge than 'Bob owns a blog where someone once posted a bad philosophy argument.'
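To put the same point in notation (a compact restatement of the distinction, not something taken from Roko's post):

    P \rightarrow Q                                   % asserting the conditional commits you to neither P nor Q
    (P \rightarrow Q),\; P \;\vdash\; Q               % modus ponens: arguing for Q
    (P \rightarrow Q),\; \neg Q \;\vdash\; \neg P     % modus tollens: arguing against P

Roko's own use of the conditional was the modus tollens pattern: treating the torture outcome as unacceptable and concluding that the kind of agent described in the antecedent shouldn't be built.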

 


Lastly:

  • "Formally speaking, what is correct decision-making?" is an important open question in philosophy and computer science, and formalizing precommitment is an important part of that question.

Moving past Roko's argument itself, a number of discussions of this topic risk misrepresenting the debate's genre. Articles on Slate and RationalWiki strike an informal tone, and that tone can be useful for getting people thinking about interesting science/philosophy debates. On the other hand, if you're going to dismiss a question as unimportant or weird, it's important not to give the impression that working decision theorists are similarly dismissive.

What if your devastating take-down of string theory is intended for consumption by people who have never heard of 'string theory' before? Even if you're sure string theory is hogwash, then, you should be wary of giving the impression that the only people discussing string theory are the commenters on a recreational physics forum. Good reporting by non-professionals, whether or not they take an editorial stance on the topic, should make it obvious that there's academic disagreement about which approach to Newcomblike problems is the right one. The same holds for disagreement about topics like long-term AI risk or machine ethics.

If Roko's original post is of any pedagogical use, it's as an unsuccessful but imaginative stab at drawing out the diverging consequences of our current theories of rationality and goal-directed behavior. Good resources for these issues (both for discussion on Less Wrong and elsewhere) include:

The Roko's basilisk ban isn't in effect anymore, so you're welcome to direct people here (or to the Roko's basilisk wiki page, which also briefly introduces the relevant issues in decision theory) if they ask about it. Particularly low-quality discussions can still get deleted (or politely discouraged), though, at moderators' discretion. If anything here was unclear, you can ask more questions in the comments below.

Can we talk about mental illness?

39 riparianx 08 March 2015 08:24AM

For a site extremely focused on fixing bad thinking patterns, I've noticed a bizarre lack of discussion here. Considering the high correlation between intelligence and mental illness, you'd think it would be a bigger topic. 

I personally suffer from Generalized Anxiety Disorder and a very tame panic disorder. Most of this is focused on financial and academic things, but I will also get panicky about social interaction, responsibilities, and things that happened in the past that seriously shouldn't bother me. I have an almost amusing response to anxiety that is basically my brain panicking and telling me to go hide under my desk.

I know lukeprog and Alicorn managed to fight off a good deal of their issues in this area and wrote up how, but I don't think enough has been done. They mostly dealt with depression. What about rational schizophrenics and phobics and bipolar people? It's difficult to find anxiety advice that goes beyond "do yoga while watching the sunrise!" Pop psych isn't very helpful. I think LessWrong could be. What's mental illness but a wrongness in the head?

Mental illness seems to be worse for intelligent people than your typical biases, honestly. Hiding under my desk is even less useful than, say, appealing to authority during an argument. At least the latter has the potential to be useful. I know it's limiting me, and starting cycles of avoidance, and so much more. And my mental illness isn't even that bad! Trying to be rational and successful when schizophrenic sounds like a Sisyphean nightmare. 

I'm not fighting my difficulties nearly well enough to feel qualified to author my own posts. Hearing from people who are managing is more likely to help. If nothing else, maybe a Rational Support Group would be a lot of fun.

Easy wins aren't news

39 PhilGoetz 19 February 2015 07:38PM

Recently I talked with a guy from Grant Street Group. They make, among other things, software with which local governments can auction their bonds on the Internet.

By making the auction process more transparent and easier to participate in, they enable local governments which need to sell bonds (to build a high school, for instance), to sell those bonds at, say, 7% interest instead of 8%. (At least, that's what he said.)
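A toy calculation of what that one-point difference is worth (the bond size and term below are made-up round numbers; only the 7% and 8% rates come from the conversation):

    # Hypothetical interest-only bond issue, repaid in full at maturity.
    principal = 10_000_000   # assumed $10M issue
    years = 20               # assumed term

    for rate in (0.08, 0.07):
        total_interest = principal * rate * years
        print(f"{rate:.0%}: ${total_interest:,.0f} in interest over {years} years")

    # The one-point difference is principal * 0.01 * years = $2,000,000.

On those made-up figures, the better auction saves the issuer about $2 million over the life of the bond, which is why a dull change to auction mechanics can matter more than the budget fights that make headlines.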

They have similar software for auctioning liens on property taxes, which also helps local governments raise more money by bringing more buyers to each auction, and probably helps the buyers reduce their risks by giving them more information.

This is a big deal. I think it's potentially more important than any budget argument that's been on the front pages since the 1960s. Yet I only heard of it by chance.

People would rather argue about reducing the budget by eliminating waste, or cutting subsidies to people who don't deserve it, or changing our ideological priorities. Nobody wants to talk about auction mechanics. But fixing the auction mechanics is the easy win. It's so easy that nobody's interested in it. It doesn't buy us fuzzies or let us signal our affiliations. To an individual activist, it's hardly worth doing.

Don't estimate your creative intelligence by your critical intelligence

39 PhilGoetz 05 February 2015 02:41AM

When I criticize, I'm a genius. I can go through a book of highly-referenced scientific articles and find errors in each of them. Boy, I feel smart. How are these famous people so dumb?

But when I write, I suddenly become stupid. I sometimes spend half a day writing something and then realize at the end, or worse, after posting, that what it says simplifies to something trivial, or that I've made several unsupported assumptions, or claimed things I didn't really know were true. Or I post something, then have to go back every ten minutes to fix some point that I realize is not quite right, sometimes to the point where the whole thing falls apart.

If someone writes an article or expresses an idea that you find mistakes in, that doesn't make you smarter than that person. If you create an equally-ambitious article or idea that no one else finds mistakes in, then you can start congratulating yourself.
