## Shock Levels are Point Estimates

8 points · 14 February 2010 04:31AM

This is a post from my blog, Space and Games. Michael Vassar has requested that I repost it here. I thought about revising it to remove the mind projection fallacy, but instead I left it in for you to find.

Eliezer Yudkowsky, back in 1999, famously categorized beliefs about the future into discrete "shock levels." Michael Anissimov later wrote a nice introduction to future shock levels. Higher shock levels correspond to belief in more powerful and radical technologies, and are considered more correct than lower shock levels. Careful thinking and exposure to ideas will tend to increase one's shock level.

If this is really true, and I think it is, shock levels are an example of human insanity. If you ask me to estimate some quantity and track how my estimates change over time, you should expect the sequence to look like a martingale if I'm being rational: a random walk with no predictable drift. Certainly I can't expect that my estimate will go up in the future. And yet shock levels mostly go up, not down.
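The no-predictable-drift property can be checked directly: under Bayesian updating, the expectation of tomorrow's estimate, averaged over what you currently expect to observe, equals today's estimate. A minimal sketch (the coin-bias setup and the Beta(2, 3) prior are my own illustrative choices, not from the post):

```python
from fractions import Fraction

# Beta(a, b) belief over a coin's bias; the point estimate is the
# posterior mean a / (a + b).
a, b = Fraction(2), Fraction(3)
current_estimate = a / (a + b)

# Average the next estimate over both possible observations, weighted
# by their predictive probabilities under the current belief.
p_heads = a / (a + b)                    # predictive probability of heads
est_if_heads = (a + 1) / (a + b + 1)     # posterior mean after seeing heads
est_if_tails = a / (a + b + 1)           # posterior mean after seeing tails
expected_next = p_heads * est_if_heads + (1 - p_heads) * est_if_tails

assert expected_next == current_estimate  # no predictable drift
print(current_estimate, expected_next)    # 2/5 2/5
```

A community whose estimates predictably rise with more thought is, by this standard, not updating like a Bayesian; it could raise its estimates now.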

## Complexity of Value ≠ Complexity of Outcome

33 points · 30 January 2010 02:50AM

Complexity of value is the thesis that our preferences, the things we care about, don't compress down to one simple rule, or a few simple rules. To review why it's important (by quoting from the wiki):

• Caricatures of rationalists often have them moved by artificially simplified values - for example, only caring about personal pleasure. This becomes a template for arguing against rationality: X is valuable, but rationality says to only care about Y, in which case we could not value X, therefore do not be rational.
• Underestimating the complexity of value leads to underestimating the difficulty of Friendly AI; and there are notable cognitive biases and fallacies which lead people to underestimate this complexity.

I certainly agree with both of these points. But I worry that we (at Less Wrong) might have swung a bit too far in the other direction. It's not that I think we overestimate the complexity of our values; rather, there's a tendency to assume that complexity of value must lead to complexity of outcome: that agents who faithfully inherit the full complexity of human values will necessarily create a future that reflects that complexity. I will argue that it is possible for complex values to lead to simple futures, and explain the relevance of this possibility to the project of Friendly AI.

## Value Uncertainty and the Singleton Scenario

8 points · 24 January 2010 05:03AM

In January of last year, Nick Bostrom wrote a post on Overcoming Bias about his and Toby Ord's proposed method of handling moral uncertainty. To abstract away a bit from their specific proposal, the general approach was to convert a problem involving moral uncertainty into a game of negotiation, with each player's bargaining power determined by your confidence in the moral philosophy that player represents.

Robin Hanson suggested in his comments to Nick’s post that moral uncertainty should be handled the same way we're supposed to handle ordinary uncertainty, by using standard decision theory (i.e., expected utility maximization). Nick’s reply was that many ethical systems don’t fit into the standard decision theory framework, so it’s hard to see how to combine them that way.

In this post, I suggest we look into the seemingly easier problem of value uncertainty, in which we fix a consequentialist ethical system, and just try to deal with uncertainty about values (i.e., utility function). Value uncertainty can be considered a special case of moral uncertainty in which there is no apparent obstacle to applying Robin’s suggestion. I’ll consider a specific example of a decision problem involving value uncertainty, and work out how Nick and Toby’s negotiation approach differs in its treatment of the problem from standard decision theory. Besides showing the difference in the approaches, I think the specific problem is also quite important in its own right.

The problem I want to consider is this: suppose we believe that a singleton scenario is very unlikely, but that it would have very high utility if realized. Should we focus most of our attention and effort on trying to increase its probability and/or improve its outcome? The main issue here (putting aside uncertainty about what will happen after a singleton scenario is realized) is uncertainty about how much we value what is likely to happen.
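Robin's suggestion, applied to value uncertainty, is straightforward to sketch: assign a credence to each candidate utility function and pick the action with the highest credence-weighted utility. All the numbers below are invented for illustration; the negotiation approach would instead give each utility function bargaining power proportional to its credence, which can select a different action, and only the expected-utility side is shown here.

```python
# Two hypothetical utility functions we are uncertain between, with credences.
# Utilities and credences are made up purely to illustrate the mechanics.
credences = {"u1": 0.8, "u2": 0.2}
utilities = {
    "u1": {"pursue_singleton": 1.0,   "business_as_usual": 2.0},
    "u2": {"pursue_singleton": 100.0, "business_as_usual": 2.0},
}
actions = ["pursue_singleton", "business_as_usual"]

def expected_utility(action):
    """Credence-weighted utility of an action across candidate value systems."""
    return sum(credences[u] * utilities[u][action] for u in credences)

best = max(actions, key=expected_utility)
print(best)  # -> pursue_singleton (0.8*1 + 0.2*100 = 20.8 beats 2.0)
```

Note how a low-credence utility function with extreme stakes dominates the decision, which is exactly the structure of the singleton question above.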

## Privileged Snuff

17 points · 22 January 2010 05:38AM

So one is asked, "What is your probability estimate that the LHC will destroy the world?"

Leaving aside the issue of calling brown numbers probabilities, there is a more subtle rhetorical trap at work here.

If one makes up a small number, say one in a million, the answer will be, "Could you make a million such statements and not be wrong even once?" (Of course this is a misleading image -- doing anything a million times in a row would make you tired and distracted enough to make trivial mistakes. At some level we know this argument is misleading, because nobody calls the non-buyer of lottery tickets irrational for assigning an even lower probability to a win.)

If one makes up a larger number, say one in a thousand, then one is considered a bad person for wanting to take even one chance in a thousand of destroying the world.

The fallacy here is [privileging the hypothesis](http://wiki.lesswrong.com/wiki/Privileging_the_hypothesis).

## Will reason ever outrun faith?

6 points · 07 January 2010 02:00PM

Recently, a video produced by Christians claimed that the future world would be Muslim; it reached 10 million views on YouTube. The alarming demographics it presented were shown by both the BBC and Snopes to be mostly false or exaggerated. Yet religion is such a powerful self-replicating memeplex that its competition with atheism deserves some analysis.

Leaving aside the aesthetic nicety of some religious rituals (which I respect), it would be preferable to see a world where rationality predominates over faith, brights over supers. Not just because I whimsically wish so, but because reason leads to atheism. Rationality is the prime mover here: the more rational agents there are, the more rationality propagates, and the more accurate people's maps will be. And that's better for us, human beings*.

(* This sentence is a bit of a strong claim, especially because I am not defining exactly what I mean by 'better', and some existential pain might be expected as a consequence of being unaided by the crutches of faith and deprived of their cultural antibodies. Also, if happiness happens to be an important attribute of 'better', I am not sure to what extent being rational will make people happier. Some people are quite content to choose the blue pill. For the time being, let's take it as an axiom; the claim that rational is better might deserve a separate post.)

## Rationalists lose when others choose

-10 points · 16 June 2009 05:50PM

At various times, we've argued over whether rationalists always win.  I posed Augustine's paradox of optimal repentance to argue that, in some situations, rationalists lose.  One criticism of that paradox is that its strongest forms posit a God who penalizes people for being rational.  My response was, So what?  Who ever said that nature, or people, don't penalize rationality?

There are instances where nature penalizes the rational.  For instance, revenge is irrational, but being thought of as someone who would take revenge confers advantages.1

## Why safety is not safe

48 points · 14 June 2009 05:20AM

June 14, 3009

Twilight still hung in the sky, yet the Pole Star was visible above the trees, for it was a perfect cloudless evening.

"We can stop here for a few minutes," remarked the librarian as he fumbled to light the lamp. "There's a stream just ahead."

The driver grunted assent as he pulled the cart to a halt and unhitched the thirsty horse to drink its fill.

It was said that in the Age of Legends, there had been horseless carriages that drank the black blood of the earth, long since drained dry. But then, it was said that in the Age of Legends, men had flown to the moon on a pillar of fire. Who took such stories seriously?

The librarian did. In his visit to the University archive, he had studied the crumbling pages of a rare book in Old English, itself a copy a mere few centuries old, of a text from the Age of Legends itself; a book that laid out a generation's hopes and dreams, of building cities in the sky, of setting sail for the very stars. Something had gone wrong - but what? That civilization's capabilities had been so far beyond those of his own people. Its destruction should have taken a global apocalypse of the kind that would leave unmistakable record both historical and archaeological, and yet there was no trace. Nobody had anything better than mutually contradictory guesses as to what had happened. The librarian intended to discover the truth.

Forty years later he died in bed, his question still unanswered.

The earth continued to circle its parent star, whose increasing energy output could no longer be compensated by falling atmospheric carbon dioxide concentration. Glaciers advanced, then retreated for the last time; as life struggled to adapt to changing conditions, the ecosystems of yesteryear were replaced by others new and strange - and impoverished. All the while, the environment drifted further from that which had given rise to Homo sapiens, and in due course one more species joined the billions-long roll of the dead. For what was by some standards a little while, eyes still looked up at the lifeless stars, but there were no more minds to wonder what might have been.

## Concrete vs Contextual values

-4 points · 02 June 2009 09:47AM

The concept of recursive self-improvement is not an accepted idea outside of the futurist community. It just does not seem right in some fashion to some people. I am one of those people, so I'm going to try to explain the kind of instinctive skepticism I have towards it. It hinges on a distinction between two sorts of values that I have not seen made explicit before (although it likely has been somewhere): the difference between a concrete and a contextual value.

## ...And Say No More Of It

21 points · 09 February 2009 12:15AM

Followup to: The Thing That I Protect

Anything done with an ulterior motive has to be done with a pure heart.  You cannot serve your ulterior motive, without faithfully prosecuting your overt purpose as a thing in its own right, that has its own integrity.  If, for example, you're writing about rationality with the intention of recruiting people to your utilitarian Cause, then you cannot talk too much about your Cause, or you will fail to successfully write about rationality.

This doesn't mean that you never say anything about your Cause, but there's a balance to be struck.  "A fanatic is someone who can't change his mind and won't change the subject."

In previous months, I've pushed this balance too far toward talking about Singularity-related things.  And this was for (first-order) selfish reasons on my part; I was finally GETTING STUFF SAID that had been building up painfully in my brain for FRICKIN' YEARS.  And so I just kept writing, because it was finally coming out.  For those of you who have not the slightest interest, I'm sorry to have polluted your blog with that.

When Less Wrong starts up, it will, by my own request, impose a two-month moratorium on discussion of "Friendly AI" and other Singularity/intelligence explosion-related topics.

There are a number of reasons for this.  One of them is simply to restore the balance.  Another is to make sure that a forum intended for a more general audience doesn't narrow itself down and disappear.

But more importantly - there are certain subjects which tend to drive people crazy, even if there's truth behind them.  Quantum mechanics would be the paradigmatic example; you don't have to go funny in the head but a lot of people do.  Likewise Gödel's Theorem, consciousness, Artificial Intelligence -

The concept of "Friendly AI" can be poisonous in certain ways.  True or false, it carries risks to mental health.  And not just the obvious liabilities of praising a Happy Thing.  Something stranger and subtler that drains enthusiasm.

## The Thing That I Protect

17 points · 07 February 2009 07:18PM

Followup to: Something to Protect, Value is Fragile

"Something to Protect" discursed on the idea of wielding rationality in the service of something other than "rationality".  Not just that rationalists ought to pick out a Noble Cause as a hobby to keep them busy; but rather, that rationality itself is generated by having something that you care about more than your current ritual of cognition.

So what is it, then, that I protect?

I quite deliberately did not discuss that in "Something to Protect", leaving it only as a hanging implication.  In the unlikely event that we ever run into aliens, I don't expect their version of Bayes's Theorem to be mathematically different from ours, even if they generated it in the course of protecting different and incompatible values.  Among humans, the idiom of having "something to protect" is not bound to any one cause, and therefore, to mention my own cause in that post would have harmed its integrity.  Causes are dangerous things, whatever their true importance; I have written somewhat on this, and will write more about it.

But still - what is it, then, the thing that I protect?

Friendly AI?  No - a thousand times no - a thousand times not anymore.  It's not thinking of the AI that gives me strength to carry on even in the face of inconvenience.