The basic idea:
Suppose Alice and Bob disagree on whether the world will end a year from tomorrow, with Alice believing it will end. If she is right, then there will be no way to settle the bet, what with the apocalypse and all that. Thus there is no way for her to collect, and so she has no incentive to bet on the apocalypse, no matter how certain she is.
Or so it would seem! The way around the difficulty is simply for Alice to get her money today, and enjoy it for a year. If she turns out to have been right, then she will have been paid properly. If the world doesn't end, then of course she'll have to return the money, plus interest--plus a penalty for being wrong.
The actual terms should depend on the confidence Alice and Bob place in their respective positions.
Personal note: Several people at my place of work have told me that they are worried about the world ending later this year. They saw a movie about it, or something. So far, they have rejected my bet proposals.
Concretizing the abstract is an interesting blog post in that it makes a relatively cogent argument for non-reductionism. While I don't agree with it, I found it useful in that it helped me better understand how intelligent non-reductionists think. It also helped clarify to me an old distinction, that of philosophers versus engineers.
We abstract when we consider some particular aspect of a concrete thing while bracketing off or ignoring the other aspects of the thing. For example, when you consider a dinner bell or the side of a pyramid exclusively as instances of triangularity, you ignore their color, size, function, and metal or stone composition. Or to borrow an example from a recent post, when aircraft engineers determine how many passengers can be carried on a certain plane, they might focus exclusively on their average weight and ignore not only the passengers’ sex, ethnicity, hair color, dinner service preferences, etc., but even the actual weight of any particular passenger. [...]

Abstractions can be very useful, and are of themselves perfectly innocent when we keep in mind that we are abstracting. The trouble comes when we start to think of abstractions as if they were concrete realities themselves -- thereby “reifying” them -- and especially when we think of the abstractions as somehow more real than the concrete realities from which they have been abstracted. [...]

I do not mean to deny that abstractions of the sort in question may have their uses. On the contrary, the mathematical conception of matter is extremely useful, as the astounding technologies that surround us in modern life make obvious. But contrary to what some proponents of scientism suppose, it simply doesn’t follow for a moment that that conception gives us an exhaustive conception of the material world, for reasons I have stated many times (e.g. here). [...]

Then there is social science. When we abstract from concrete human beings their purely economic motivations, ignoring everything else and then reifying this abstraction, the result is homo economicus, a strange creature who, unlike real people, is driven by nothing but the desire to maximize utility. Nietzschean analyses of human motivation in terms of the will to power are less susceptible of mathematical modeling (and thus less “scientific”), but are variations on the same sort of error. Evolutionary psychology often combines abstractions of the natural scientific and social scientific sort. Like the neuroscientist, the evolutionary psychologist often treats parts of human beings as if they were substances independent of the whole from which they have been abstracted (“selfish genes,” “memes”), and adds to this reification the abstractions of the economist (e.g. game theory).

As the neuroscientific and sociobiological examples indicate, the Reification Fallacy is often combined with other fallacies. In these cases, parts of a whole substance are first abstracted from it and treated as if they were substances in their own right (e.g. brain hemispheres, genes); and then a second, “Mereological Fallacy” (as Bennett and Hacker call it) is committed, in which what is intelligibly attributed only to the whole is attributed to the parts (e.g. the left hemisphere of the brain is said to “interpret,” and genes are said to be “selfish”). [...]

The irony is that while New Atheists and others beholden to scientism pride themselves on being “reality based,” that is precisely what they are not. Actual, concrete reality is extremely complicated. There is far more to material systems than what can be captured in the equations of physics, far more to human beings than can be captured in the categories of neuroscience or economics, and far more to religion than can be captured in the ludicrous straw men peddled by New Atheists.
All of these simplifying abstractions (except the last) have their value, but when we treat them as anything more than simplifying abstractions we have left the realm of science and entered that of ideology. The varieties of reductionism, eliminativism, and the “hermeneutics of suspicion” are manifestations of this tendency to replace real things with abstractions. They are all attempts to “conquer the abundance” of reality (as Paul Feyerabend might have put it), to force the world in all its concrete richness into a straightjacket.
I find this interesting in the way that smart people are likely to disagree about the correct interpretation of some of its claims - some would say the post is worshipping the mysterious, while others would say it's just making reasonable cautions about the inherent methodological limitations of a certain approach. One might even think that it's essentially making the same point as Eliezer's warning about floating beliefs, and therefore agrees with the Sequences. The caution of "beware of thinking that your abstractions say everything that there is to be said about something" is a reasonable one, and people do clearly make that mistake sometimes.
I expect that how plausible one finds this argument depends partly on whether one has more of an "engineer's mindset" or a "philosopher's mindset". Somebody with an engineer's mindset will think that "yes, the abstractions we use might be imperfect, but what else do you propose we use? They're still the best tool for accomplishing stuff, and anything else is just philosophical nonsense that isn't grounded in anything". Whereas the philosopher is less interested in using their knowledge to "accomplish stuff", and more interested in the ideas and their implications themselves.
As an aside, this distinction might be part of the reason why we have so many computer or hard science folks on this site. Partially it's because Eliezer used a lot of CS jargon in writing the Sequences, but probably also because the Sequences, while philosophical in nature, are also very focused on practical results and getting empirical predictions out of your beliefs.
Looking at what we could use this distinction for (and thus taking an engineer's mindset), some people here have mentioned getting an "ick" reaction from religious people, just due to those people having strong false beliefs. I think that, combined with properly understanding the emotional basis of religion, an understanding of the philosopher / engineer distinction can help avoid that reaction. Our values determine our beliefs, and there are plenty of religious people who aren't stupid, crazy, or anything like that. They might simply be philosophers instead of engineers, or they might be engineers who are more interested in the instrumental benefits of religion than the rather marginal benefits of x-rationality. (Amusingly, such a "religious engineer" might justifiably consider our obsession with "truth" as just an odd philosophical pursuit.)
Suppose, for a moment, you're a strong proponent of Glim, a fantastic new philosophy of ethics that will maximize truth, happiness, and all things good, just as soon as 51% of the population accepts it as the true way. Careful game-theoretic models show that once it achieves majority status, Glim proponents will be significantly more prosperous and happy than non-proponents (although everybody will benefit on average, according to those models), and the philosophy will take over from there.
Glim has stalled, however; it's stuck at 49% belief, and a new countermovement, antiGlim, has arisen, claiming that Glim is a corrupt moral system with fatal flaws which will destroy the country if it has its way. Belief is starting to creep down, and those who accepted the ideas as plausible but weren't ready to commit are starting to turn away from the movement.
In response, a senior researcher of Glim ethics has written a scathing condemnation of antiGlim as unpatriotic, evil, and determined to keep the populace in a state of perpetual misery to support its own hegemony. He vehemently denies that there are any flaws in the moral system, and refuses to entertain antiGlim in a public debate.
In response to this, belief creeps slightly up, but acceptance goes into a freefall.
You immediately ascertain that the negativity did more damage to the movement than the criticisms themselves; you write a response, and are accused of attacking the tone while ignoring the substance of the arguments. Glim and antiGlim leadership proceed into protracted and nasty arguments, until both are highly marginalized and ignored by the general public. Belief in Glim continues, but when the leaders of antiGlim and Glim finally arrive at a bitterly agreed-upon conclusion - the arguments having centered on an actual error in the original formulation of Glim philosophy - they're unable either to get their remaining supporters to cooperate or to get any of the public to listen. Truth, happiness, and all things good never arise, and things get slightly worse as a result of the error.
Tone arguments are not necessarily logical errors; they may be invoked by those who agree with the substance of an argument who nevertheless may feel that the argument, as posed, is counterproductive to its intended purpose.
I have stopped recommending Dawkins's work to people who are on the fence about religion. The God Delusion utterly destroyed his effectiveness at convincing people against religion. (In a world in which they couldn't do an internet search on his name, it might not matter; we don't live in that world, and I assume other people are as likely to investigate somebody as I am.) It doesn't even matter whether his facts are right or not; the way he presents them will put most people on the intellectual defensive.
If your purpose is to convince people, it's not enough to have good arguments, or good facts; these things can only work if people are receptive to those arguments and those facts. Your first move is your most important - you must try to make that person receptive. And if somebody levels a tone argument at you, your first consideration should not be "Oh! That's DH2, it's a fallacy, I can disregard what this person has to say!" It should be - why are they leveling a tone argument at you to begin with? Are they disagreeing with you on the basis of your tone, or disagreeing with the tone itself?
Or, in short, the categorical assessment of "Responding to Tone" as either a logical fallacy or a poor argument is incorrect, as it starts from an unfounded assumption that the purpose of a tone response is, in fact, to refute the argument. In the few cases where I have seen responses to tone used against an argument, they were in fact ad hominems, of the formulation "This person clearly hates [x], and thus can't be expected to have an unbiased perspective." Note that this is a particularly persuasive ad hominem, especially for somebody who is looking to rationalize their beliefs against an argument - and that this inoculation against argument is precisely the reason you should, in fact, moderate your tone.
I was very interested in the discussions and opinions that grew out of the last time this was played, but find it annoying to dig through 800+ comments for a new game started on the same thread. I also don't want this game ruined by a potential sock puppet (whoever it may be). So here's a non-sockpuppeteered Irrationality Game, if there's still interest. If there isn't, downvote to oblivion!
The original rules:
Please read the post before voting on the comments, as this is a game where voting works differently.
Warning: the comments section of this post will look odd. The most reasonable comments will have lots of negative karma. Do not be alarmed, it's all part of the plan. In order to participate in this game you should disable any viewing threshold for negatively voted comments.
Here's an irrationalist game meant to quickly collect a pool of controversial ideas for people to debate and assess. It kinda relies on people being honest and not being nitpickers, but it might be fun.
Write a comment reply to this post describing a belief you think has a reasonable chance of being true relative to the beliefs of other Less Wrong folk. Jot down a proposition and a rough probability estimate or qualitative description, like 'fairly confident'.
Example (not my true belief): "The U.S. government was directly responsible for financing the September 11th terrorist attacks. Very confident. (~95%)."
If you post a belief, you have to vote on the beliefs of all other comments. Voting works like this: if you basically agree with the comment, vote the comment down. If you basically disagree with the comment, vote the comment up. What 'basically' means here is intuitive; instead of using a precise mathy scoring system, just make a guess. In my view, if their stated probability is 99.9% and your degree of belief is 90%, that merits an upvote: it's a pretty big difference of opinion. If they're at 99.9% and you're at 99.5%, it could go either way. If you're genuinely unsure whether or not you basically agree with them, you can pass on voting (but try not to). Vote up if you think they are either overconfident or underconfident in their belief: any disagreement is valid disagreement.
If the proposition in a comment isn't incredibly precise, use your best interpretation. If you really have to pick nits for whatever reason, say so in a comment reply.
The more upvotes you get, the more irrational Less Wrong perceives your belief to be. Which means that if you have a large amount of Less Wrong karma and can still get lots of upvotes on your crazy beliefs then you will get lots of smart people to take your weird ideas a little more seriously.
Some poor soul is going to come along and post "I believe in God". Don't pick nits and say "Well in a Tegmark multiverse there is definitely a universe exactly like ours where some sort of god rules over us..." and downvote it. That's cheating. You better upvote the guy. For just this post, get over your desire to upvote rationality. For this game, we reward perceived irrationality.
Try to be precise in your propositions. Saying "I believe in God. 99% sure." isn't informative because we don't quite know which God you're talking about. A deist god? The Christian God? Jewish?
Y'all know this already, but just a reminder: preferences ain't beliefs. Downvote preferences disguised as beliefs. Beliefs that include the word "should" are almost always imprecise: avoid them. That means our local theists are probably gonna get a lot of upvotes. Can you beat them with your confident but perceived-by-LW-as-irrational beliefs? It's a challenge!
- Generally, no repeating an altered version of a proposition already in the comments unless it's different in an interesting and important way. Use your judgement.
- If you have comments about the game, please reply to my comment below about meta discussion, not to the post itself. Only propositions to be judged for the game should be direct comments to this post.
- Don't post propositions as comment replies to other comments. That'll make it disorganized.
- You have to actually think your degree of belief is rational. You should already have taken the fact that most people would disagree with you into account and updated on that information. That means that any proposition you make is a proposition that you think you are personally more rational about than the Less Wrong average. This could be good or bad. Lots of upvotes means lots of people disagree with you. That's generally bad. Lots of downvotes means you're probably right. That's good, but this is a game where perceived irrationality wins you karma. The game is only fun if you're trying to be completely honest in your stated beliefs. Don't post something crazy and expect to get karma. Don't exaggerate your beliefs. Play fair.
- Debate and discussion is great, but keep it civil. Linking to the Sequences is barely civil -- summarize arguments from specific LW posts and maybe link, but don't tell someone to go read something. If someone says they believe in God with 100% probability and you don't want to take the time to give a brief but substantive counterargument, don't comment at all. We're inviting people to share beliefs we think are irrational; don't be mean about their responses.
- No propositions that people are unlikely to have an opinion about, like "Yesterday I wore black socks. ~80%" or "Antipope Christopher would have been a good leader in his latter days had he not been dethroned by Pope Sergius III. ~30%." The goal is to be controversial and interesting.
- Multiple propositions are fine, so long as they're moderately interesting.
- You are encouraged to reply to comments with your own probability estimates, but comment voting works normally for comment replies to other comments. That is, upvote for good discussion, not agreement or disagreement.
- In general, just keep within the spirit of the game: we're celebrating LW-contrarian beliefs for a change!
Admitting to being wrong isn't easy, but it's something we want to encourage.
So ... were you convinced by someone's arguments lately? Did you realize a heated disagreement was actually a misunderstanding? Here's the place to talk about it!
(The following is a summary of some of my previous submissions that I originally created for my personal blog.)
...an intelligence explosion may have fair probability, not because it occurs in one particular detailed scenario, but because, like the evolution of eyes or the emergence of markets, it can come about through many different paths and can gather momentum once it gets started. Humans tend to underestimate the likelihood of such “disjunctive” events, because they can result from many different paths (Tversky and Kahneman 1974). We suspect the considerations in this paper may convince you, as they did us, that this particular disjunctive event (intelligence explosion) is worthy of consideration.
It seems to me that all the ways in which we disagree have more to do with philosophy (how to quantify uncertainty; how to deal with conjunctions; how to act in consideration of low probabilities) [...] we are not dealing with well-defined or -quantified probabilities. Any prediction can be rephrased so that it sounds like the product of indefinitely many conjunctions. It seems that I see the “SIAI’s work is useful scenario” as requiring the conjunction of a large number of questionable things [...]
— Holden Karnofsky, 6/24/11 (GiveWell interview with major SIAI donor Jaan Tallinn, PDF)
People associated with the Singularity Institute for Artificial Intelligence (SIAI) like to claim that the case for risks from AI is supported by years' worth of disjunctive lines of reasoning. This basically means that there are many reasons to believe that humanity is likely to be wiped out as a result of artificial general intelligence. More precisely, it means that not all of the arguments supporting that possibility need to be true: even if all but one of them are false, risks from AI are still to be taken seriously.
The idea of disjunctive arguments is formalized by what is called a logical disjunction. Consider two declarative sentences, A and B. The truth of the conclusion (or output) that follows from A and B depends on the truth of A and B. In the case of a logical disjunction, the conclusion is false only if both A and B are false; otherwise it is true. Truth values are usually denoted by 0 for false and 1 for true. A disjunction of declarative sentences is denoted by OR or ∨ as an infix operator. For example, if A is false (0) and B is true (1), then A∨B is still true (1), because statement B alone is sufficient to preserve the truth of the overall conclusion.
Generally there is no problem with disjunctive lines of reasoning as long as the conclusion itself is sound and therefore in principle possible, yet requires at least one of several causative factors to become actual. I don't perceive this to be the case for risks from AI. I agree that there are many ways in which artificial general intelligence (AGI) could be dangerous, but only if I accept several presuppositions regarding AGI that I actually dispute.
By presuppositions I mean requirements that need to be true simultaneously (in conjunction). A logical conjunction is only true if all of its operands are true. In other words, a conclusion might require all of the arguments leading up to it to be true, otherwise it is false. A conjunction is denoted by AND or ∧.
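To make the two notions concrete, here is a minimal sketch (nothing in it is specific to AI risk; it just prints the truth tables described above, using 0 for false and 1 for true):

```python
# Illustrative truth tables for a two-operand disjunction (OR) and
# conjunction (AND). A disjunction is false only when every operand is
# false; a conjunction is true only when every operand is true.
for A in (0, 1):
    for B in (0, 1):
        print(f"A={A} B={B}  A OR B = {A | B}  A AND B = {A & B}")
```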
Now consider the following prediction: <Mary is going to buy one of thousands of products in the supermarket.>
The above prediction can be framed as a disjunction: Mary is going to buy one of thousands of products in the supermarket, 1.) if she is hungry 2.) if she is thirsty 3.) if she needs a new coffee machine. Only one of the 3 given arguments needs to be true for the overall conclusion - that Mary is going shopping - to be true. Or so it seems.
The same prediction can be framed as a conjunction: Mary is going to buy one of thousands of products in the supermarket 1.) if she has money 2.) if she has some needs 3.) if the supermarket is open. All 3 of the given factors need to be true for the overall conclusion to be true.
That a prediction is framed disjunctively does not, in and of itself, speak in favor of the possibility. I agree that it is likely that Mary is going to visit the supermarket if I accept the hidden presuppositions. But a prediction is only at most as probable as its basic requirements. In this particular case I don't even know if Mary is a human or a dog, a factor that can influence the probability of the prediction dramatically.
The same is true for risks from AI. The basic argument in favor of risks from AI is that of an intelligence explosion, that intelligence can be applied to itself in an iterative process leading to ever greater levels of intelligence. In short, artificial general intelligence will undergo explosive recursive self-improvement.
Explosive recursive self-improvement is one of the presuppositions for the possibility of risks from AI. The problem is that this and other presuppositions are largely ignored and left undefined. All of the disjunctive arguments put forth by the SIAI are trying to show that there are many causative factors that could result in the development of unfriendly artificial general intelligence. Only one of those factors needs to be true for us to be wiped out by AGI. But the whole scenario is at most as probable as the assumptions hidden in the terms <artificial general intelligence> and <explosive recursive self-improvement>.
<Artificial General Intelligence> and <Explosive Recursive Self-improvement> might appear to be relatively simple and appealing concepts. But most of this superficial simplicity is a result of the vagueness of natural language descriptions. Reducing the vagueness of those concepts by being more specific, or by coming up with technical definitions of each of the words they are made up of, reveals the hidden complexity that the vagueness of the terms conceals.
If we were going to define those concepts and each of their terms, we would end up with a lot of additional concepts made up of other words or terms. Most of those additional concepts will demand explanations of their own, made up of further speculations. If we are precise, then every declarative sentence (P1, P2, …) used in the final description will have to be true simultaneously (P1∧P2∧…∧Pn). And this reveals the true complexity of all the hidden presuppositions and thereby influences the overall probability: P(risks from AI) = P(P1∧P2∧P3∧P4∧P5∧P6∧…). A conclusion that rests on a lot of statements that can each be false is less likely to be true, since complex arguments can fail in many different ways. You need to support each part of the argument that can be true or false, and you can therefore fail to support one or more of its parts, which in turn will render the overall conclusion false.
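To make the arithmetic concrete, here is a minimal sketch. The presupposition names and all the numbers are invented purely for illustration, and independence is assumed only to keep the calculation simple:

```python
# If a conclusion requires n hidden presuppositions to hold simultaneously,
# its probability is bounded by the probability of their conjunction.
# Assuming independence for illustration, that bound is the product of the
# individual probabilities.
presuppositions = {
    "general intelligence is substrate-independent": 0.9,
    "AGI will actually be built": 0.8,
    "recursive self-improvement is possible": 0.7,
    "self-improvement becomes explosive": 0.6,
    "the resulting AGI is unfriendly by default": 0.7,
}

p_conjunction = 1.0
for claim, p in presuppositions.items():
    p_conjunction *= p

print(f"Upper bound on P(conclusion): {p_conjunction:.3f}")
# Five individually plausible claims (0.6-0.9 each) already push the
# conjunction down to about 0.21.
```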
To summarize: If we tried to pin down a concept like <Explosive Recursive Self-Improvement> we would end up with requirements that are strongly conjunctive.
Making numerical probability estimates
But even if the SIAI was going to thoroughly define those concepts, there is still more to the probability of risks from AI than the underlying presuppositions and causative factors. We also have to account for our uncertainty about the very methods we used to come up with those concepts and definitions, and about our ability to make correct predictions about the future, and fold all of it into our overall probability estimates.
Take for example the following contrived quote:
We have to take over the universe to save it by making the seed of an artificial general intelligence, that is undergoing explosive recursive self-improvement, extrapolate the coherent volition of humanity, while acausally trading with other superhuman intelligences across the multiverse.
Although contrived, the above quote comprises only actual beliefs held by people associated with the SIAI. All of those beliefs might seem like somewhat plausible inferences and logical implications of speculations and of state-of-the-art or bleeding-edge knowledge from various fields. But should we base real-life decisions on those ideas; should we take those ideas seriously? Should we take into account conclusions whose truth value depends on the conjunction of those ideas? And is it wise to make further inferences on top of those speculations?
Let’s take a closer look at the necessary top-level presuppositions to take the above quote seriously:
- The many-worlds interpretation
- Belief in the Implied Invisible
- Timeless Decision theory
- Intelligence explosion
1: Within the lesswrong/SIAI community the many-worlds interpretation of quantum mechanics is proclaimed to be the rational choice among all available interpretations. How to arrive at this conclusion is supposedly also a good exercise in refining the art of rationality.
2: If P(Y|X) ≈ 1, then P(X∧Y) ≈ P(X) (a one-line derivation is spelled out below).
In other words, logical implications do not have to pay rent in future anticipations.
3: “Decision theory is the study of principles and algorithms for making correct decisions—that is, decisions that allow an agent to achieve better outcomes with respect to its goals.”
4: “Intelligence explosion is the idea of a positive feedback loop in which an intelligence is making itself smarter, thus getting better at making itself even smarter. A strong version of this idea suggests that once the positive feedback starts to play a role, it will lead to a dramatic leap in capability very quickly.”
To be able to take the above quote seriously you have to assign a non-negligible probability to the truth of the conjunction of #1, #2, #3, and #4: 1∧2∧3∧4. Here the question is not only whether our results are sound but whether the very methods we used to come up with those results are sufficiently trustworthy. Because any extraordinary conclusions that are implied by the conjunction of various beliefs might outweigh the benefit of each belief if the overall conclusion is just slightly wrong.
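Presupposition #2 above is just the chain rule of probability: when Y follows from X with near-certainty, conjoining Y costs almost nothing in probability.

```latex
P(X \wedge Y) \;=\; P(X)\, P(Y \mid X) \;\approx\; P(X) \cdot 1 \;=\; P(X).
```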
Not enough empirical evidence
Don’t get me wrong, I think that there sure are convincing arguments in favor of risks from AI. But do arguments suffice? Nobody is an expert when it comes to intelligence. My problem is that I fear that some convincing blog posts written in natural language are simply not enough.
Just imagine that all there was to climate change was someone who never studied the climate but instead wrote some essays about how it might be physically possible for humans to cause global warming. If the same person then went on to make further inferences based on the implications of those speculations, am I going to tell everyone to stop emitting CO2 because of that? Hardly!
Or imagine that all there was to the possibility of asteroid strikes was someone who argued that there might be big chunks of rock out there which might fall down on our heads and kill us all, inductively based on the fact that the Earth and the moon are also big rocks. Would I be willing to launch a billion-dollar asteroid deflection program solely based on such speculations? I don't think so.
Luckily, in both cases, we got a lot more than some convincing arguments in support of those risks.
Another example: If there were no studies about the safety of high energy physics experiments then I might assign a 20% chance of a powerful particle accelerator destroying the universe based on some convincing arguments put forth on a blog by someone who never studied high energy physics. We know that such an estimate would be wrong by many orders of magnitude. Yet the reason for being wrong would largely be a result of my inability to make correct probability estimates, the result of vagueness or a failure of the methods I employed to come up with those estimates. The reason for being wrong by many orders of magnitude would have nothing to do with the arguments in favor of the risks, as they might very well be sound given my epistemic state and the prevalent uncertainty.
I believe that mere arguments in favor of one risk do not suffice to justify neglecting other risks that are supported by other kinds of evidence. I believe that logical implications of sound arguments should not reach out indefinitely and thereby outweigh other risks whose implications are fortified by empirical evidence. Sound arguments, predictions, speculations and their logical implications are enough to demand further attention and research, but not much more.
Artificial general intelligence is already an inference made from what we currently believe to be true; going a step further and drawing further inferences from previous speculations (e.g. explosive recursive self-improvement) is, in my opinion, a very shaky business.
What would happen if we were going to let logical implications of vast utilities outweigh other concrete near-term problems that are based on empirical evidence? Insignificant inferences might exhibit hyperbolic growth in utility: 1.) There is no minimum amount of empirical evidence necessary to extrapolate the expected utility of an outcome. 2.) The extrapolation of counterfactual alternatives is unbounded, logical implications can reach out indefinitely without ever requiring new empirical evidence.
All of the above hints at a general problem, which is the reason why I think that discussions between people associated with the SIAI, its critics, and those who try to evaluate the SIAI won't lead anywhere. Those discussions miss the underlying reason for most of the superficial disagreement about risks from AI, namely that there is no disagreement about risks from AI in and of itself.
There are a few people who disagree about the possibility of AGI in general, but I don't want to touch on that subject in this post. I am trying to highlight the disagreement between the SIAI and people who accept the notion of artificial general intelligence. With regard to those who are not skeptical of AGI, the problem becomes more obvious when you turn your attention to people like John Baez or organisations like GiveWell. Most people would sooner question their grasp of "rationality" than give five dollars to a charity that tries to mitigate risks from AI because their calculations claim it was "rational" (those who have read Eliezer Yudkowsky's article on 'Pascal's Mugging' will recognize that I took a statement from that post and slightly rephrased it). The disagreement all comes down to a general aversion to options that have a low probability of being correct, even when the stakes are high.
Nobody has so far been able to defeat arguments that bear a resemblance to Pascal's Mugging - at least not by showing that it is irrational to give in from the perspective of a utility maximizer. One can only reject them based on a strong gut feeling that something is wrong. And I think that is what many people are unknowingly doing when they argue against the SIAI or risks from AI: they are signaling that they are unable to take such risks into account. When people doubt the reputation of those who claim that risks from AI need to be taken seriously, or who say that AGI might be far off, what they actually mean is that risks from AI are too vague to be taken into account at this point, that nobody knows enough to make predictions about the topic right now.
When GiveWell, a charity evaluation service, interviewed the SIAI (PDF), they hinted at the possibility that one could consider the SIAI to be a sort of Pascal’s Mugging:
GiveWell: OK. Well that’s where I stand – I accept a lot of the controversial premises of your mission, but I’m a pretty long way from sold that you have the right team or the right approach. Now some have argued to me that I don’t need to be sold – that even at an infinitesimal probability of success, your project is worthwhile. I see that as a Pascal’s Mugging and don’t accept it; I wouldn’t endorse your project unless it passed the basic hurdles of credibility and workable approach as well as potentially astronomically beneficial goal.
This shows that a lot of people do not doubt the possibility of risks from AI but are simply not sure if they should really concentrate their efforts on such vague possibilities.
Technically, from the standpoint of maximizing expected utility, given the absence of other existential risks, the answer might very well be yes. But even though we believe we understand this technical viewpoint of rationality very well in principle, it also leads to problems such as Pascal's Mugging. And it doesn't take a true Pascal's Mugging scenario to make people feel deeply uncomfortable with what Bayes' Theorem, the expected utility formula, and Solomonoff induction seem to suggest one should do.
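A toy calculation illustrates the source of the discomfort. All of the numbers below are invented for illustration; they are not anyone's actual estimates:

```python
# A tiny probability attached to an astronomically large payoff dominates a
# naive expected-utility comparison, which is exactly the
# Pascal's-Mugging-style discomfort described above.
options = {
    "well-understood intervention": {"p": 0.95, "utility": 1_000},
    "speculative existential risk": {"p": 1e-7, "utility": 1e15},
}

for name, o in options.items():
    print(f"{name}: expected utility = {o['p'] * o['utility']:,.0f}")
# well-understood intervention: 950
# speculative existential risk: 100,000,000
```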
Again, we currently have no rational way to reject arguments that are framed as predictions of worst-case scenarios that need to be taken seriously, even given a low probability of their occurrence, due to the scale of the negative consequences associated with them. Many people are nonetheless reluctant to accept this line of reasoning without further evidence supporting the strong claims and requests for money made by organisations such as the SIAI.
Here is what mathematician and climate activist John Baez has to say:
Of course, anyone associated with Less Wrong would ask if I’m really maximizing expected utility. Couldn’t a contribution to some place like the Singularity Institute of Artificial Intelligence, despite a lower chance of doing good, actually have a chance to do so much more good that it’d pay to send the cash there instead?
And I’d have to say:
1) Yes, there probably are such places, but it would take me a while to find the one that I trusted, and I haven’t put in the work. When you’re risk-averse and limited in the time you have to make decisions, you tend to put off weighing options that have a very low chance of success but a very high return if they succeed. This is sensible so I don’t feel bad about it.
2) Just to amplify point 1) a bit: you shouldn’t always maximize expected utility if you only live once. Expected values — in other words, averages — are very important when you make the same small bet over and over again. When the stakes get higher and you aren’t in a position to repeat the bet over and over, it may be wise to be risk averse.
3) If you let me put the $100,000 into my retirement account instead of a charity, that’s what I’d do, and I wouldn’t even feel guilty about it. I actually think that the increased security would free me up to do more risky but potentially very good things!
All this shows that there seems to be a fundamental problem with the formalized version of rationality. The problem might be human nature itself: some people are unable to accept what they should do if they want to maximize their expected utility. Or we are missing something else and our theories are flawed. Either way, to solve this problem we need to research those issues and thereby increase our confidence in the very methods used to decide what to do about risks from AI, or increase our confidence in risks from AI directly, enough to make mitigating them look like a sensible option, a concrete and discernible problem that needs to be solved.
Many people perceive the whole world to be at stake, whether due to climate change, war or engineered pathogens. Telling them about something like risks from AI, even though nobody seems to have any idea about the nature of intelligence, let alone general intelligence or the possibility of recursive self-improvement, seems like just another problem, one that is too vague to outweigh all the other risks. Most people already feel as if they have a gun pointed at their heads; telling them about superhuman monsters that might turn them into paperclips therefore requires some really good arguments to outweigh the combined risk of all the other problems.
But there are many other problems with risks from AI. To give a hint at just one example: if there was a risk that might kill us with a probability of .7 and another with a probability of .1, while our chance of solving the first one was .0001 and the second one .1, which one should we focus on? In other words, our decision to mitigate a certain risk should not only be based on the probability of its occurrence but also on the probability of success in solving it. But as I have written above, I believe that the most pressing issue is to increase our confidence in making decisions under extreme uncertainty, or to reduce the uncertainty itself.
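Spelling out that example with the numbers from the paragraph above:

```python
# What matters is not P(risk) alone but P(risk) * P(we can actually solve it).
risks = {
    "risk A": {"p_occurs": 0.7, "p_solvable": 0.0001},
    "risk B": {"p_occurs": 0.1, "p_solvable": 0.1},
}

for name, r in risks.items():
    print(f"{name}: expected risk reduction = {r['p_occurs'] * r['p_solvable']:.5f}")
# risk A: 0.00007 -- very probable, but we almost certainly cannot solve it
# risk B: 0.01000 -- less probable, yet the far better target for our effort
```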
As you may recall, Rolf Nelson disagrees with me about Amanda Knox -- rather sharply. Of course, the same can be said of lots of other people (if not so much here on Less Wrong). But Rolf isn't your average "guilter". Indeed, considering that he speaks fluent Bayesian, is one of the Singularity Institute's largest donors, and is also (as I understand it) signed up for cryonics, it's hard to imagine an "opponent" more "worthy". The Amanda Knox case may not be in the same category of importance as many other issues where Rolf and I probably agree; but my opinion on it is very confident, and it's the opposite of his. If we're both aspiring rationalists, at least one of us is doing something wrong.
As it turns out, Rolf is interested in having a debate with me on the subject, to see if one of us can help to change the other's mind. I'm setting this post up as an experiment, to see if LW can serve as a suitable venue for such an exercise. I hope it can: Less Wrong is almost unique in the extent to which the social norms governing discussion reflect and coincide with the requirements of personal epistemic rationality. (For example: "Do not believe you do others a favor if you accept their arguments; the favor is to you.") But I don't think we've yet tried an organized one-on-one debate -- so we'll see how it goes. If it proves too unwieldy or inappropriate for some other reason, we can always move to another venue.
Although the primary purpose of this post is a one-on-one debate between Rolf Nelson and myself, this is a LW Discussion post like any other, and it goes without saying that others are welcome and encouraged to comment. Just be aware that we, the main protagonists, will try to keep our discussion focused on each other's arguments. (Also, since our subject is an issue where there is already a strong LW consensus, one would prefer to avoid a sort of "gangup effect" where lots of people "pounce" on the person taking the contrarian position.)
With that, here we go...
This is a combination news-announcement and begging for someone with academic subscriptions to maybe jailbreak a PDF for us.
"Many philosophers appeal to intuitions to support some philosophical views. However, there is reason to be concerned about this practice as scientific evidence has documented systematic bias in philosophically relevant intuitions as a function of seemingly irrelevant features (e.g., personality). One popular defense used to insulate philosophers from these concerns holds that philosophical expertise eliminates the influence of these extraneous factors. Here, we test this assumption. We present data suggesting that verifiable philosophical expertise in the free will debate—as measured by a reliable and validated test of expert knowledge—does not eliminate the influence of one important extraneous feature (i.e., the heritable personality trait extraversion) on judgments concerning freedom and moral responsibility. These results suggest that, in at least some important cases, the expertise defense fails. Implications for the practice of philosophy, experimental philosophy, and applied ethics are discussed."
"For example, our research suggests that heritable personality traits predict bias in some fundamental philosophically relevant intuitions (Feltz & Cokely 2008, 2009; Cokely & Feltz, 2009; Feltz, Perez, & Harris, in press; Feltz, Harris, & Perez, 2010). In response to these findings, “philosophical expertise” has been used to shield some parts of standard philosophical practice from the worries presented by experimental philosophers (e.g., Ludwig, 2007; Kauppinen 2007; Horvarth, 2010; Sosa, 2010; Williamson, 2007, 2011). One important part of the “Expertise Defense” is that philosophers are assumed to be relevantly different from the folk (e.g., as a result of their years of training) and consequently philosophers' intuitions shouldn’t display the same (or similar) biases.
But more recently, there have been serious concerns raised by experimental philosophers about the Expertise Defense. Some have used indirect strategies suggesting that philosophical expertise is unlike expertise in areas known to result in the relevant differences (e.g., in chess) (Weinberg, Gonnerman, Buckner, & Alexander, 2010 see related discussion here). Others have opted for direct strategies showing that for many important everyday behaviors (e.g., voting, returning library books, showing common courtesy) philosophers often display the same (or similar) biases as the folk (Schwitzgebel 2009; Schwitzgebel & Rust, 2010, 2009; Schwitzgebel & Cushman, in press). In a new paper (Schulz, Cokely, & Feltz, in press), we also adopt the direct strategy and present the first evidence that personality predicts persistent bias in verifiable expert intuitions about free will and moral responsibility. These results suggest that, in at least some important fundamental philosophical debates, the Expertise Defense fails"
Kathryn Schulz is a self-identified "Wrongologist" (in fact, @wrongologist is her user name on Twitter). She has written a popular book ("Being Wrong: Adventures in the Margin of Error", web site) and also writes the Slate column 'The Wrong Stuff'. Her TED talk covers the problem of disagreement, the nature of belief, overconfidence bias and how to actually change your mind. She maintains that most folks actively avoid the unpleasant feeling of "being wrong", which is an important point I have not seen before (but see The Importance of Saying 'Oops' and Crisis of Faith). Unfortunately, she does not discuss reasoning about uncertainty, so her arguments against 'the feeling of right' end up seeming rather shallow.
Discuss her TED talk here. (Her broader work is also obviously on topic.)
Even though this was written by a current Less Wrong poster (hi, pdf23ds!), I don't think it has been posted here: Why and how to debate charitably (pg. 2, comments). (Edit: The original pdf23ds.net site has sadly been lost to entropy – Less Wrong poster MichaelBishop found a repost on commonsenseatheism.com. He also provides this summary version.)
I was linked to this article from a webcomic forum which had a low-key flamewar smouldering in the "Serious Business" section. (I will not link to it here; if you can tell from the description which forum it is, I would thank you not to link it either.) Three things struck me about it:
- I have been operating under similar rules for years, with great success.
- The participants in the flamewar on the forum where it was posted were not operating under these rules.
- Less Wrong posters generally do operate under these rules, at least here.
The list of rules is on pg. 2 - a good example is the rule titled "You cannot read minds":
As soon as you find someone espousing seemingly contradictory positions, you should immediately suspect yourself of being mistaken as to their intent. Even if it seems obvious to you that the person has a certain intent in their message, if you want to engage them, you must respond being open to the possibility that where you see contradictions (or, for that matter, insults), none were intended. While you keep in mind what the person’s contradictory position seems to be, raise your standards some, and ask questions so that the person must state the position more explicitly—this way, you can make sure whether they actually hold it. If you still have problems, keep raising your standards, and asking more specific questions, until the person starts making sense to you.
If part of their position is unclear or ambiguous to you, say that explicitly. Being willing to show uncertainty is an excellent way to defuse the person’s, and your own, defensiveness. It also helps them to more easily understand which aspects of their position they are not making clear enough.
The less their position makes sense to you, the more you should rely on interrogative phrasing and the less on declarative. Questions defuse defensiveness and are much more pointed and communicative than statements, because they force you to think more about the person’s arguments, and to really articulate what it is about their position you most need clarification on. They help to keep the discussion moving, and help you to stop arguing past each other. Phrase the questions sincerely, and use as much of the person’s own reasoning (putting it in the best light) as you can. This requires that you have a pretty good grasp on what the person is arguing—try to understand their position as well as you can. If it’s simply not coherent enough, the case may be hopeless.
By Aumann's agreement theorem (and some related/inferred ideas), two rationalists can't agree to disagree.
However, there's a vast number of cases I know of that are not sufficiently important to spend significant time discussing (for the sake of narrowing the topic, I propose to concentrate on cases where it's doubtfully valuable to spend more than just a few minutes discussing them).
More generally, it's also a question of optimizing the amount of time (or mental resources) spent on updating one's own probabilities regarding some particular topic (choosing the option with the maximal expected value of information divided by the value of time, AFAIU).
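A minimal sketch of that selection rule, with all topics and numbers invented purely for illustration:

```python
# Among candidate topics, spend the next few minutes on the one with the
# highest expected value of information per unit of time.
topics = {
    "where to eat tonight":            {"voi": 0.5,  "minutes": 5},
    "whether to sign up for cryonics": {"voi": 40.0, "minutes": 90},
    "which route to take to work":     {"voi": 3.0,  "minutes": 2},
}

best = max(topics, key=lambda t: topics[t]["voi"] / topics[t]["minutes"])
print("Discuss first:", best)
# "which route to take to work" wins at 1.5 VOI/minute, even though the
# cryonics question has the largest absolute value of information.
```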
It is rather obvious that, after discussing some topic for a few minutes, it is suboptimal not to update one's probabilities about the discussed topic (and to update only one's knowledge of the disagreement with the other person).
But the question is, what probability update is most appropriate in such a situation? Or, put slightly differently, what would be an instrumentally optimal course of action given a disagreement of two rationalists on some particular topic - given certain expectations, such as one's own probability that the other person is an honest rationalist, and a time limit?
In more detail: What initial expectations regarding the other person's knowledge can be easily updated? It is rather simple to state some approximation of the size of the evidence; it is possible, but harder, to state reasons to update expectations regarding one's own evidence being biased or unbiased. Also, would it be more correct to shift the probability distribution towards the other person's beliefs or towards uncertainty (a uniform distribution of probabilities, AFAIU)? And how much probability can be shifted in such a short time (given a sufficiently complex topic) - would that amount be tiny?
Related references are http://wiki.lesswrong.com/wiki/Likelihood_ratio and most links 2 or 3 levels deep from there. And post tags, ofc. Related but doubtfully important chatlog (with an example of the situation) is here.
Related to Your Rationality is My Business
Among religious believers in the developed world, there is something of a hierarchy in terms of social tolerability. Near the top are the liberal, nonjudgmental, frequently nondenominational believers, of whom it is highly unpopular to express disapproval. At the bottom you find people who picket funerals or bomb abortion clinics, the sort with whom even most vocally devout individuals are quick to deny association.
Slightly above these, but still very close to the bottom of the heap, are proselytizers and door to door evangelists. They may not be hateful about their beliefs, indeed many find that their local Jehovah’s Witnesses are exceptionally nice people, but they’re simply so annoying. How can they go around pressing their beliefs on others and judging people that way?
I have never known another person to criticize evangelists for not trying hard enough to change others’ beliefs.
Sam Harris has a new book, The Moral Landscape, in which he makes a very simple argument, at least when you express it in the terms we tend to use on LW: he says that a reasonable definition of moral behavior can (theoretically) be derived from our utility functions. Essentially, he's promoting the idea of coherent extrapolated volition, but without all the talk of strong AI.
He also argues that, while there are all sorts of tricky corner cases where we disagree about what we want, those are less common than they seem. Human utility functions are actually pretty similar; the disagreements seem bigger because we think about them more. When France passes laws against wearing a burqa in public, it's news. When people form an orderly line at the grocery store, nobody notices how neatly our goals and behavior have aligned. No newspaper will publish headlines about how many people are enjoying the pleasant weather. We take it for granted that human utility functions mostly agree with each other.
What surprises me, though, is how much flak Sam Harris has drawn for just saying this. There are people who say that there can not, in principle, be any right answer to moral questions. There are heavily religious people who say that there's only one right answer to moral questions, and it's all laid out in their holy book of choice. What I haven't heard, yet, are any well-reasoned objections that address what Harris is actually saying.
So, what do you think? I'll post some links so you can see what the author himself says about it:
"The Science of Good and Evil": An article arguing briefly for the book's main thesis.
Frequently asked questions: Definitely helps clarify some things.
TED talk about his book: I think he devotes most of this talk to telling us what he's not claiming.