Why Many-Worlds Is Not The Rationally Favored Interpretation
Eliezer recently posted an essay on "the fallacy of privileging the hypothesis". What it's really about is the fallacy of privileging an arbitrary hypothesis. In the fictional example, a detective proposes that the investigation of an unsolved murder should begin by investigating whether a particular, randomly chosen citizen was in fact the murderer. Towards the end, this is likened to the presumption that one particular religion, rather than any of the other existing or even merely possible religions, is especially worth investigating.
However, in between the fictional and the supernatural illustrations of the fallacy, we have something more empirical: quantum mechanics. Eliezer writes, as he has previously, that the many-worlds interpretation is the one - the rationally favored interpretation, the picture of reality which rationally should be adopted given the empirical success of quantum theory. Eliezer has said this before, and I have argued against it before, back when this site was just part of a blog. This site is about rationality, not physics; and the quantum case is not essential to the exposition of this fallacy. But given the regularity with which many-worlds metaphysics shows up in discussion here, perhaps it is worth presenting a case for the opposition.
Deontology for Consequentialists
Consequentialists see morality through consequence-colored lenses. I attempt to prise apart the two concepts to help consequentialists understand what deontologists are talking about.
Consequentialism is built around a group of variations on the following basic assumption:
- The rightness of something depends on what happens subsequently.
It's a very diverse family of theories; see the Stanford Encyclopedia of Philosophy article. "Classic utilitarianism" could go by the longer, more descriptive name "actual direct maximizing aggregative total universal equal-consideration agent-neutral hedonic act consequentialism". I could even mention less frequently contested features, like the fact that this type of consequentialism doesn't have a temporal priority feature or side constraints. All of this is a very complicated bag of tricks for a theory whose proponents sometimes claim to like it because it's sleek and pretty and "simple". But the bottom line is, to get a consequentialist theory, something that happens after the act you judge is the basis of your judgment.
To understand deontology as anything but a twisted, inexplicable mockery of consequentialism, you must discard this assumption.
Deontology relies on things that do not happen after the act judged to judge the act. This leaves facts about times prior to and the time during the act to determine whether the act is right or wrong. This may include, but is not limited to:
- The agent's epistemic state, either actual or ideal (e.g. thinking that some act would have a certain result, or being in a position such that it would be reasonable to think that the act would have that result)
- The reference class of the act (e.g. it being an act of murder, theft, lying, etc.)
- Historical facts (e.g. having made a promise, sworn a vow)
- Counterfactuals (e.g. what would happen if others performed similar acts more frequently than they actually do)
- Features of the people affected by the act (e.g. moral rights, preferences, relationship to the agent)
- The agent's intentions (e.g. meaning well or maliciously, or acting deliberately or accidentally)
Reply to Holden on 'Tool AI'
I begin by thanking Holden Karnofsky of GiveWell for his rare gift of his detailed, engaged, and helpfully-meant critical article Thoughts on the Singularity Institute (SI). In this reply I will engage with only one of the many subjects raised therein, the topic of, as I would term them, non-self-modifying planning Oracles, a.k.a. 'Google Maps AGI' a.k.a. 'tool AI', this being the topic that requires me personally to answer. I hope that my reply will be accepted as addressing the most important central points, though I did not have time to explore every avenue. I certainly do not wish to be logically rude, and if I have failed, please remember with compassion that it's not always obvious to one person what another person will think was the central point.
Luke Muehlhauser and Carl Shulman contributed to this article, but the final edit was my own, likewise any flaws.
Summary:
Holden's concern is that "SI appears to neglect the potentially important distinction between 'tool' and 'agent' AI." His archetypal example is Google Maps:
Google Maps is not an agent, taking actions in order to maximize a utility parameter. It is a tool, generating information and then displaying it in a user-friendly manner for me to consider, use and export or discard as I wish.
The reply breaks down into four heavily interrelated points:
First, Holden seems to think (and Jaan Tallinn doesn't apparently object to, in their exchange) that if a non-self-modifying planning Oracle is indeed the best strategy, then all of SIAI's past and intended future work is wasted. To me it looks like there's a huge amount of overlap in underlying processes in the AI that would have to be built and the insights required to build it, and I would be trying to assemble mostly - though not quite exactly - the same kind of team if I was trying to build a non-self-modifying planning Oracle, with the same initial mix of talents and skills.
Second, a non-self-modifying planning Oracle doesn't sound nearly as safe once you stop saying human-English phrases like "describe the consequences of an action to the user" and start trying to come up with math that says scary dangerous things like (he translated into English) "increase the correspondence between the user's belief about relevant consequences and reality". Hence why the people on the team would have to solve the same sorts of problems.
Appreciating the force of the third point is a lot easier if one appreciates the difficulties discussed in points 1 and 2, but is actually empirically verifiable independently: Whether or not a non-self-modifying planning Oracle is the best solution in the end, it's not such an obvious privileged-point-in-solution-space that someone should be alarmed at SIAI not discussing it. This is empirically verifiable in the sense that 'tool AI' wasn't the obvious solution to e.g. John McCarthy, Marvin Minsky, I. J. Good, Peter Norvig, Vernor Vinge, or for that matter Isaac Asimov. At one point, Holden says:
One of the things that bothers me most about SI is that there is practically no public content, as far as I can tell, explicitly addressing the idea of a "tool" and giving arguments for why AGI is likely to work only as an "agent."
If I take literally that this is one of the things that bothers Holden most... I think I'd start stacking up some of the literature on the number of different things that just respectable academics have suggested as the obvious solution to what-to-do-about-AI - none of which would be about non-self-modifying smarter-than-human planning Oracles - and beg him to have some compassion on us for what we haven't addressed yet. It might be the right suggestion, but it's not so obviously right that our failure to prioritize discussing it reflects negligence.
The final point at the end is looking over all the preceding discussion and realizing that, yes, you want to have people specializing in Friendly AI who know this stuff, but as all that preceding discussion is actually the following discussion at this point, I shall reserve it for later.
Thoughts on the Singularity Institute (SI)
This post presents thoughts on the Singularity Institute from Holden Karnofsky, Co-Executive Director of GiveWell. Note: Luke Muehlhauser, the Executive Director of the Singularity Institute, reviewed a draft of this post, and commented: "I do generally agree that your complaints are either correct (especially re: past organizational competence) or incorrect but not addressed by SI in clear argumentative writing (this includes the part on 'tool' AI). I am working to address both categories of issues." I take Luke's comment to be a significant mark in SI's favor, because it indicates an explicit recognition of the problems I raise, and thus increases my estimate of the likelihood that SI will work to address them.
September 2012 update: responses have been posted by Luke and Eliezer (and I have responded in the comments of their posts). I have also added acknowledgements.
The Singularity Institute (SI) is a charity that GiveWell has been repeatedly asked to evaluate. In the past, SI has been outside our scope (as we were focused on specific areas such as international aid). With GiveWell Labs we are open to any giving opportunity, no matter what form and what sector, but we still do not currently plan to recommend SI; given the amount of interest some of our audience has expressed, I feel it is important to explain why. Our views, of course, remain open to change. (Note: I am posting this only to Less Wrong, not to the GiveWell Blog, because I believe that everyone who would be interested in this post will see it here.)
I am currently the GiveWell staff member who has put the most time and effort into engaging with and evaluating SI. Other GiveWell staff currently agree with my bottom-line view that we should not recommend SI, but this does not mean they have engaged with each of my specific arguments. Therefore, while the lack of recommendation of SI is something that GiveWell stands behind, the specific arguments in this post should be attributed only to me, not to GiveWell.
Summary of my views
- The argument advanced by SI for why the work it's doing is beneficial and important seems both wrong and poorly argued to me. My sense at the moment is that the arguments SI is making would, if accepted, increase rather than decrease the risk of an AI-related catastrophe.
- SI has, or has had, multiple properties that I associate with ineffective organizations, and I do not see any specific evidence that its personnel/organization are well-suited to the tasks it has set for itself.
- A common argument for giving to SI is that "even an infinitesimal chance that it is right" would be sufficient given the stakes. I have written previously about why I reject this reasoning; in addition, prominent SI representatives seem to reject this particular argument as well (i.e., they believe that one should support SI only if one believes it is a strong organization making strong arguments).
- My sense is that at this point, given SI's current financial state, withholding funds from SI is likely better for its mission than donating to it. (I would not take this view to the furthest extreme; the argument that SI should have some funding seems stronger to me than the argument that it should have as much as it currently has.)
- I find existential risk reduction to be a fairly promising area for philanthropy, and plan to investigate it further.
- There are many things that could happen that would cause me to revise my view on SI. However, I do not plan to respond to all comment responses to this post. (Given the volume of responses we may receive, I may not be able to even read all the comments on this post.) I do not believe these two statements are inconsistent, and I lay out paths for getting me to change my mind that are likely to work better than posting comments. (Of course I encourage people to post comments; I'm just noting in advance that this action, alone, doesn't guarantee that I will consider your argument.)
Intent of this post
I did not write this post with the purpose of "hurting" SI. Rather, I wrote it in the hopes that one of these three things (or some combination) will happen:
- New arguments are raised that cause me to change my mind and recognize SI as an outstanding giving opportunity. If this happens I will likely attempt to raise more money for SI (most likely by discussing it with other GiveWell staff and collectively considering a GiveWell Labs recommendation).
- SI concedes that my objections are valid and increases its determination to address them. A few years from now, SI is a better organization and more effective in its mission.
- SI can't or won't make changes, and SI's supporters feel my objections are valid, so SI loses some support, freeing up resources for other approaches to doing good.
Which one of these occurs will hopefully be driven primarily by the merits of the different arguments raised. Because of this, I think that whatever happens as a result of my post will be positive for SI's mission, whether or not it is positive for SI as an organization. I believe that most of SI's supporters and advocates care more about the former than about the latter, and that this attitude is far too rare in the nonprofit world.
A fungibility theorem
Restatement of: If you don't know the name of the game, just tell me what I mean to you. Alternative to: Why you must maximize expected utility. Related to: Harsanyi's Social Aggregation Theorem.
Summary: This article describes a theorem, previously described by Stuart Armstrong, that tells you to maximize the expectation of a linear aggregation of your values. Unlike the von Neumann-Morgenstern theorem, this theorem gives you a reason to behave rationally.
Self-skepticism: the first principle of rationality
When Richard Feynman started investigating irrationality in the 1970s, he quickly began to realize the problem wasn't limited to the obvious irrationalists.
Uri Geller claimed he could bend keys with his mind. But was he really any different from the academics who insisted their special techniques could teach children to read? Both failed the crucial scientific test of skeptical experiment: Geller's keys failed to bend in Feynman's hands; outside tests showed the new techniques only caused reading scores to go down.
What mattered was not how smart the people were, or whether they wore lab coats or used long words, but whether they followed what he concluded was the crucial principle of truly scientific thought: "a kind of utter honesty--a kind of leaning over backwards" to prove yourself wrong. In a word: self-skepticism.
As Feynman wrote, "The first principle is that you must not fool yourself -- and you are the easiest person to fool." Our beliefs always seem correct to us -- after all, that's why they're our beliefs -- so we have to work extra-hard to try to prove them wrong. This means constantly looking for ways to test them against reality and to think of reasons our tests might be insufficient.
When I think of the most rational people I know, it's this quality of theirs that's most pronounced. They are constantly trying to prove themselves wrong -- they attack their beliefs with everything they can find and when they run out of weapons they go out and search for more. The result is that by the time I come around, they not only acknowledge all my criticisms but propose several more I hadn't even thought of.
And when I think of the least rational people I know, what's striking is how they do the exact opposite: instead of viciously attacking their beliefs, they try desperately to defend them. They too have responses to all my critiques, but instead of acknowledging and agreeing, they viciously attack my critique so it never touches their precious belief.
Since these two can be hard to distinguish, it's best to look at some examples. The Cochrane Collaboration argues that support from hospital nurses may be helpful in getting people to quit smoking. How do they know that? you might ask. Well, they found this was the result from doing a meta-analysis of 31 different studies. But maybe they chose a biased selection of studies? Well, they systematically searched "MEDLINE, EMBASE and PsycINFO [along with] hand searching of specialist journals, conference proceedings, and reference lists of previous trials and overviews." But did the studies they picked suffer from selection bias? Well, they searched for that -- along with three other kinds of systematic bias. And so on. But even after all this careful work, they are still only confident enough to conclude "the results…support a modest but positive effect…with caution … these meta-analysis findings need to be interpreted carefully in light of the methodological limitations".
Compare this to the Heritage Foundation's argument for the bipartisan Wyden–Ryan premium support plan. Their report also discusses lots of objections to the proposal, but confidently knocks down each one: "this analysis relies on two highly implausible assumptions ... All these predictions were dead wrong. ... this perspective completely ignores the history of Medicare." Their conclusion is similarly confident: "The arguments used by opponents of premium support are weak and flawed." Apparently there's just not a single reason to be cautious about their enormous government policy proposal!
Now, of course, the Cochrane authors might be secretly quite confident and the Heritage Foundation might be wringing their hands with self-skepticism behind-the-scenes. But let's imagine for a moment that these aren't just reports intended to persuade others of a belief and instead accurate portrayals of how these two different groups approached the question. Now ask: which style of thinking is more likely to lead the authors to the right answer? Which attitude seems more like Richard Feynman? Which seems more like Uri Geller?
Cult impressions of Less Wrong/Singularity Institute
I have several questions related to this:
- Did anyone reading this initially get the impression that Less Wrong was cultish when they first discovered it?
- If so, can you suggest any easy steps we could take?
- Is it possible that there are aspects of the atmosphere here that are driving away intelligent, rationally inclined people who might otherwise be interested in Less Wrong?
- Do you know anyone who might fall into this category, i.e. someone who was exposed to Less Wrong but failed to become an enthusiast, potentially due to atmosphere issues?
- Is it possible that our culture might be different if these folks were hanging around and contributing? Presumably they are disproportionately represented among certain personality types.
If you visit any Less Wrong page for the first time in a cookies-free browsing mode, you'll see this message for new users:
Less Wrong is a community blog devoted to refining the art of human rationality. Please visit our About page for more information.
Here are the worst violators I see on that about page:
Some people consider the Sequences the most important work they have ever read.
Generally, if your comment or post is on-topic, thoughtful, and shows that you're familiar with the Sequences, your comment or post will be upvoted.
Many of us believe in the importance of developing qualities described in Twelve Virtues of Rationality: [insert mystical sounding description of how to be rational here]
And on the sequences page:
If you don't read the sequences on Mysterious Answers to Mysterious Questions and Reductionism, little else on Less Wrong will make much sense.
This seems obviously false to me.
These may not seem like cultish statements to you, but keep in mind that you are one of the ones who decided to stick around. The typical mind fallacy may be at work. Clearly there is some population that thinks Less Wrong seems cultish, as evidenced by Google's autocomplete, and these look like good candidates for things that make them think this.
We can fix this stuff easily, since they're both wiki pages, but I thought they were examples worth discussing.
In general, I think we could stand more community effort being put into improving our about page, which you can do now here. It's not that visible to veteran users, but it is very visible to newcomers. Note that it looks as though you'll have to click the little "Force reload from wiki" button on the about page itself for your changes to be published.
How to Be Oversurprised
Followup to: How to Disentangle the Past and the Future
Some agents are memoryless, reacting to each new observation as it happens, without generating a persisting internal structure. When an LED observes voltage, it emits light, regardless of whether it did so a second earlier.
Other agents have very persistent memories. The internal structure of an amber nugget can remain unchanged by external conditions for millions of years.
Neither of these levels of memory persistence makes for very intelligent agents, because neither allows them to be good separators of the past and the future. Memoryless agents only have access to the most recent input of their sensors, which leaves them oblivious to the hidden internal structures of other things around them, and to the existence of things not around them. Unchanging agents, on the other hand, fail to entangle themselves with new evidence, which prevents them from keeping up to date with a changing world.
Intelligence requires observations. An intelligent agent needs to strike a delicate balance between the persistence of its internal structure and its susceptibility to new evidence. The optimal balance, a Bayesian update, has been explained many times before, and was shown to be optimal in keeping information about the world. This post highlights yet another aspect.
Suppose I predicted that a roll of a die will give 4, and then it did. How surprised would you be?
You may intuitively realize that the degree of your surprise should be a decreasing function of the probability that you assign to the event. Predicting a 4 on a fair die is more surprising than on one that is loaded in favor of 4. You may also want the measure of surprise to be extensive: if I repeated the feat a second time, you would be twice as surprised.
In that case, there's essentially one consistent notion of surprise (also called: self-information, surprisal). The amount of surprise that a random variable X has value x is
S(x) = -log Pr(X=x).
This is the negative logarithm of the probability of the event X=x.
This is a very useful concept in information theory. For example, the entropy of the random variable X is the surprise we expect to have upon seeing its value. The mutual information between X and another random variable Y is the difference between how much we expect x to surprise us now, and how much we expect it to surprise us after we first look at y.
The surprise of seeing 4 when rolling a fair die is
S(1/6) = -log(1/6) ≈ 2.585 bit.
That's also the surprise of any other result, so that's also the entropy of a die.
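The definitions above are easy to check numerically. The following is a minimal sketch, assuming base-2 logarithms (so the unit is the bit); the helper names `surprisal_bits` and `entropy_bits` are illustrative and not from the original post.

```python
import math

def surprisal_bits(p: float) -> float:
    """Surprise, in bits, of an event that has probability p."""
    return -math.log2(p)

def entropy_bits(probs) -> float:
    """Expected surprise of a distribution given as a list of probabilities."""
    return sum(p * surprisal_bits(p) for p in probs if p > 0)

fair_die = [1 / 6] * 6

print(round(surprisal_bits(1 / 6), 3))   # 2.585
print(round(entropy_bits(fair_die), 3))  # 2.585, since every face is equally surprising

# Extensivity: two independent correct predictions (probability 1/36)
# are twice as surprising as one.
print(math.isclose(surprisal_bits(1 / 36), 2 * surprisal_bits(1 / 6)))  # True
```

The extensivity check is the reason a logarithm appears at all: probabilities of independent events multiply, so surprise has to turn products into sums.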
By the way, the objective surprise of a specific result is the same whether or not I actually announce it as a prediction. The reason you don't "feel surprised" when no prediction is made, is that your intuition evolved to successfully avoid hindsight bias. As we'll see in a moment, not all surprise is useful evidence.
The Bayesian update is optimal in that it gives the agent the largest possible gain in information about the world. Exactly how much information is gained?
We might ask this question when we choose between an agent that doesn't update at all and one that updates perfectly: this is what we stand to gain, and we can compare it with the cost (in energy, computation power, etc.) of actually performing the update.
We can also ask this question when we consider what observations to make. Not updating on new evidence is equivalent, in terms of information, to not gathering it in the first place. So the benefit of gathering some more evidence is, at most, the benefit of subsequently using it in a Bayesian update.
Suppose that at time t the world is in a state W_t, and that the agent may look at it and make an observation O_t. Objectively, the surprise of this observation would be
S_obj = S(O_t|W_t) = -log Pr(O_t|W_t).
However, the agent doesn't know the state of the world. Before seeing the observation at time t, the agent has its own memory state M_{t-1}, which is entangled with the state of the world through past observations, but is not enough for the agent to know everything about the world.
The agent has a subjective surprise upon seeing O_t, which is the result of its own private prior:
S_subj = S(O_t|M_{t-1}) = -log Pr(O_t|M_{t-1}),
and this may be significantly different than the objective surprise.
Interestingly, the amount of information that the agent stands to gain by making the new observation, and perfectly updating on it in a new memory state M_t, is exactly equal to the agent's expected oversurprise upon seeing the evidence. That is, any update from M_{t-1} to M_t gains at most this much information:
I(W_t;M_t) - I(W_t;M_{t-1}) ≤ E[S_subj - S_obj],
and this holds with equality when the update is Bayesian. (The math is below in a comment.)
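One way to see why the bound holds is the following sketch. It is not the derivation from the original comment, and it rests on two assumptions that are labeled here rather than taken from the post: the new memory state M_t is computed only from the old memory M_{t-1} and the observation O_t, and the observation depends on the old memory only through the current world state W_t. H denotes (conditional) entropy, i.e. expected surprise.

```latex
\begin{align*}
% Data processing: M_t is a function of (M_{t-1}, O_t)  [assumption]
I(W_t; M_t) &\le I(W_t; M_{t-1}, O_t) \\
% Chain rule for mutual information
            &= I(W_t; M_{t-1}) + I(W_t; O_t \mid M_{t-1}) \\[4pt]
I(W_t; O_t \mid M_{t-1})
            &= H(O_t \mid M_{t-1}) - H(O_t \mid W_t, M_{t-1}) \\
% O_t is independent of M_{t-1} given W_t  [assumption]
            &= H(O_t \mid M_{t-1}) - H(O_t \mid W_t) \\
            &= E[S_{\mathrm{subj}}] - E[S_{\mathrm{obj}}]
\end{align*}
```

Under these assumptions the first line is an equality exactly when M_t keeps everything that the pair (M_{t-1}, O_t) says about W_t, which is what a perfect Bayesian update does.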
For example, let's say I have two coins, one of them fair and the other turns up heads with probability 0.7. I don't know which coin is which, so I toss one of them at random. I expect it to show heads with probability
0.5 / 2 + 0.7 / 2 = 0.6,
so if it does I'll be surprised S(0.6) ≈ 0.737 bit, and if it doesn't I'll be surprised S(0.4) ≈ 1.322 bit. On average, I expect to be surprised
0.6 * S(0.6) + 0.4 * S(0.4) ≈ 0.971 bit.
The objective surprise depends on how the world really is. If I toss the fair coin, the objective surprise is S(0.5) = 1 bit for each of the two outcomes, and the expectation is 1 bit too. If I toss the loaded coin, the objective surprise is S(0.7) ≈ 0.515 bit for heads and S(0.3) ≈ 1.737 bit for tails, for an average of
0.7 * S(0.7) + 0.3 * S(0.3) ≈ 0.881 bit.
See how I'm a little undersurprised for the fair coin, but oversurprised for the loaded one. On average I'm oversurprised by
0.971 - (1 / 2 + 0.881 / 2) = 0.0305 bit.
By observing which side the coin turns up, I gain 0.971 bit of information, but most of it is just about this specific toss. Only 0.0305 bit of that information, my oversurprise, goes beyond the specific toss to teach me something about the coin itself, if I update perfectly.
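The two-coin arithmetic can be reproduced with a short script. This is a minimal sketch with illustrative names (`expected_surprise`, `info_gain` and so on are not from the original post). It also computes the information gained about the coin's identity directly, as prior entropy minus expected posterior entropy, to check that it matches the oversurprise. With unrounded entropies the figure comes out near 0.0303 bit; the 0.0305 in the text results from subtracting the already-rounded values 0.971 and 0.881.

```python
import math

def surprisal_bits(p):
    """Surprise, in bits, of an event with probability p."""
    return -math.log2(p)

def expected_surprise(p_heads):
    """Expected surprise (entropy) of one toss of a coin with Pr(heads) = p_heads."""
    return (p_heads * surprisal_bits(p_heads)
            + (1 - p_heads) * surprisal_bits(1 - p_heads))

coins = {"fair": 0.5, "loaded": 0.7}   # Pr(heads) for each coin
prior = {"fair": 0.5, "loaded": 0.5}   # the coin is picked at random

# Subjective view: I only know the mixture probability of heads.
p_heads = sum(prior[c] * coins[c] for c in coins)
subjective = expected_surprise(p_heads)

# Objective view: expected surprise for someone who knows which coin it is.
objective = sum(prior[c] * expected_surprise(coins[c]) for c in coins)
oversurprise = subjective - objective

# Direct check: information gained about which coin was tossed,
# i.e. prior entropy minus expected posterior entropy over the coin's identity.
post_fair_heads = prior["fair"] * coins["fair"] / p_heads
post_fair_tails = prior["fair"] * (1 - coins["fair"]) / (1 - p_heads)
info_gain = expected_surprise(prior["fair"]) - (
    p_heads * expected_surprise(post_fair_heads)
    + (1 - p_heads) * expected_surprise(post_fair_tails)
)

print(round(subjective, 3))    # 0.971
print(round(objective, 3))     # 0.941
print(round(oversurprise, 4))  # 0.0303
print(math.isclose(oversurprise, info_gain, abs_tol=1e-12))  # True
```

`expected_surprise` is reused for the belief about the coin's identity because that belief is also a two-outcome distribution.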
We are not interested in evidence for its own sake, only inasmuch as it teaches us about the world. Winning the lottery is very surprising, but nobody gambles for epistemological reasons. When the odds are known in advance, the subjective surprise is exactly matched by the objective surprise - you are not oversurprised, and you learn nothing useful.
The oversurprise is the "spillover" of surprise beyond what is just about the observation, and onto the world. It's the part of the narrowness that originates in the world, not in the observation. And that's how much the observation teaches us about the world.
An interesting corollary is that you can never expect to be undersurprised, i.e. less surprised than you objectively should be. That's the same as saying that a Bayesian update can never lose information.
The next time you find yourself wondering how best to observe the world and gather information, ask this: how much more surprised do you expect to be than someone who already knows what you wish to know?
Continue reading: Update Then Forget
Standard and Nonstandard Numbers
Followup to: Logical Pinpointing
"Oh! Hello. Back again?"
Yes, I've got another question. Earlier you said that you had to use second-order logic to define the numbers. But I'm pretty sure I've heard about something called 'first-order Peano arithmetic' which is also supposed to define the natural numbers. Going by the name, I doubt it has any 'second-order' axioms. Honestly, I'm not sure I understand this second-order business at all.
"Well, let's start by examining the following model:"

"This model has three properties that we would expect to be true of the standard numbers - 'Every number has a successor', 'If two numbers have the same successor they are the same number', and '0 is the only number which is not the successor of any number'. All three of these statements are true in this model, so in that sense it's quite numberlike -"
And yet this model clearly is not the numbers we are looking for, because it's got all these mysterious extra numbers like C and -2*. That C thing even loops around, which I certainly wouldn't expect any number to do. And then there's that infinite-in-both-directions chain which isn't connected to anything else.
"Right, so, the difference between first-order logic and second-order logic is this: In first-order logic, we can get rid of the ABC - make a statement which rules out any model that has a loop of numbers like that. But we can't get rid of the infinite chain underneath it. In second-order logic we can get rid of the extra chain."
Do people think Less Wrong rationality is parochial?
I've spent so much time in the cogsci literature that I know the LW approach to rationality is basically the mainstream cogsci approach to rationality (plus some extra stuff about, e.g., language), but... do other people not know this? Do people one step removed from LessWrong — say, in the 'atheist' and 'skeptic' communities — not know this? If this is causing credibility problems in our broader community, it'd be relatively easy to show people that Less Wrong is not, in fact, a "fringe" approach to rationality.
For example, here's Oaksford & Chater in the second chapter to the (excellent) new Oxford Handbook of Thinking and Reasoning, the one on normative systems of rationality:
Is it meaningful to attempt to develop a general theory of rationality at all? We might tentatively suggest that it is a prima facie sign of irrationality to believe in alien abduction, or to will a sports team to win in order to increase their chance of victory. But these views or actions might be entirely rational, given suitably nonstandard background beliefs about other alien activity and the general efficacy of psychic powers. Irrationality may, though, be ascribed if there is a clash between a particular belief or behavior and such background assumptions. Thus, a thorough-going physicalist may, perhaps, be accused of irrationality if she simultaneously believes in psychic powers. A theory of rationality cannot, therefore, be viewed as clarifying either what people should believe or how people should act—but it can determine whether beliefs and behaviors are compatible. Similarly, a theory of rational choice cannot determine whether it is rational to smoke or to exercise daily; but it might clarify whether a particular choice is compatible with other beliefs and choices.
From this viewpoint, normative theories can be viewed as clarifying conditions of consistency… Logic can be viewed as studying the notion of consistency over beliefs. Probability… studies consistency over degrees of belief. Rational choice theory studies the consistency of beliefs and values with choices.
They go on to clarify that by probability they mean Bayesian probability theory, and by rational choice theory they mean Bayesian decision theory. You'll get the same account in the textbooks on the cogsci of rationality, e.g. Thinking and Deciding or Rational Choice in an Uncertain World.