Under-acknowledged Value Differences

47 Wei_Dai 12 September 2012 10:02PM

I've been reading a lot of the recent LW discussions on politics and gender, and noticed that people rarely bring up or explicitly acknowledge that different people affected by some political or gender issue have different values/preferences, and therefore solving the problem involves a strong element of bargaining and is not just a matter of straightforward optimization. Instead, we tend to talk as if there is some way to solve the problem that's best for everyone, and that rational discussion will bring us closer to finding that one best solution.

For example, when discussing gender-related problems, one solution may be generally better for men, while another solution may be generally better for women. If people are selfish, then they will each prefer the solution that's individually best for them, even if they can agree on all of the facts. (It's unclear whether people should be selfish, but it seems best to assume that most are, for practical purposes.)

Unfortunately, in bargaining situations, epistemic rationality is not necessarily instrumentally rational. In general, convincing others of a falsehood can be useful for moving the negotiated outcome closer to one's own preferences and away from others', and this may be done more easily if one honestly believes the falsehood. (One of these falsehoods may be, for example, "My preferred solution is best for everyone.") Given these (subconsciously or evolutionarily processed) incentives, it seems reasonable to think that the more solving a problem resembles bargaining, the more likely we are to be epistemically irrational when thinking and talking about it.

If we do not acknowledge and keep in mind that we are in a bargaining situation, then we are less likely to detect such failures of epistemic rationality, especially in ourselves. We're also less likely to see that there's an element of Prisoner's Dilemma in participating in such debates: your effort to convince people to adopt your preferred solution is costly (in time and in your and LW's overall sanity level) but may achieve little because someone else is making an opposite argument. Both of you may be better off if neither engaged in the debate.

The noncentral fallacy - the worst argument in the world?

157 Yvain 27 August 2012 03:36AM

Related to: Leaky Generalizations, Replace the Symbol With The Substance, Sneaking In Connotations

David Stove once ran a contest to find the Worst Argument In The World, but he awarded the prize to his own entry, and one that shored up his politics to boot. It hardly seems like an objective process.

If he can unilaterally declare a Worst Argument, then so can I. I declare the Worst Argument In The World to be this: "X is in a category whose archetypal member gives us a certain emotional reaction. Therefore, we should apply that emotional reaction to X, even though it is not a central category member."

Call it the Noncentral Fallacy. It sounds dumb when you put it like that. Who even does that, anyway?

It sounds dumb only because we are talking soberly of categories and features. As soon as the argument gets framed in terms of words, it becomes so powerful that somewhere between many and most of the bad arguments in politics, philosophy and culture take some form of the noncentral fallacy. Before we get to those, let's look at a simpler example.

Suppose someone wants to build a statue honoring Martin Luther King Jr. for his nonviolent resistance to racism. An opponent of the statue objects: "But Martin Luther King was a criminal!"

Any historian can confirm this is correct. A criminal is technically someone who breaks the law, and King knowingly broke a law against peaceful anti-segregation protest - hence his famous Letter from Birmingham Jail.

But in this case calling Martin Luther King a criminal is the noncentral fallacy. The archetypal criminal is a mugger or bank robber. He is driven only by greed, preys on the innocent, and weakens the fabric of society. Since we don't like these things, calling someone a "criminal" naturally lowers our opinion of them.

The opponent is saying "Because you don't like criminals, and Martin Luther King is a criminal, you should stop liking Martin Luther King." But King doesn't share the important criminal features of being driven by greed, preying on the innocent, or weakening the fabric of society that made us dislike criminals in the first place. Therefore, even though he is a criminal, there is no reason to dislike King.

This all seems so nice and logical when it's presented in this format. Unfortunately, it's also one hundred percent contrary to instinct: the urge is to respond "Martin Luther King? A criminal? No he wasn't! You take that back!" As soon as you do that you've fallen into their trap: your argument is no longer about whether you should build a statue, it's about whether King was a criminal. Since he was, you have now lost the argument. This is why the noncentral fallacy is so successful.

Ideally, you should just be able to say "Well, King was the good kind of criminal." But that seems pretty tough as a debating maneuver, and it may be even harder in some of the cases where the noncentral fallacy is commonly used.

continue reading »

Can we teach Rationality at the University of Reddit?

23 Lightwave 20 August 2012 09:59PM

This was posted a few hours ago. Basically, the reddit admins have decided to promote the so-called "University of Reddit", a subreddit where people can offer and teach any course they'd like, or just attend a course. This is the official website for all courses. At the time of this posting the subreddit had ~38,400 subscribers, but I expect it will grow significantly.

Given the reddit demographics, a Rationality course has the potential to become extremely popular. The exposure can popularize CFAR and LessWrong, and can be used to recruit fresh minds for rationality-related causes. It could also be a place to run experiments with curricula or methods of teaching rationality.

What do you guys think?

Solve Psy-Kosh's non-anthropic problem

34 cousin_it 20 December 2010 09:24PM

The source is here. I'll restate the problem in simpler terms:

You are one of a group of 10 people who care about saving African kids. You will all be put in separate rooms, then I will flip a coin. If the coin comes up heads, a random one of you will be designated as the "decider". If it comes up tails, nine of you will be designated as "deciders". Next, I will tell everyone their status, without telling the status of others. Each decider will be asked to say "yea" or "nay". If the coin came up tails and all nine deciders say "yea", I donate $1000 to VillageReach. If the coin came up heads and the sole decider says "yea", I donate only $100. If all deciders say "nay", I donate $700 regardless of the result of the coin toss. If the deciders disagree, I don't donate anything.

First let's work out what joint strategy you should coordinate on beforehand. If everyone pledges to answer "yea" in case they end up as deciders, you get 0.5*1000 + 0.5*100 = 550 expected donation. Pledging to say "nay" gives 700 for sure, so it's the better strategy.
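The two pledges can be checked with a couple of lines of arithmetic (a quick sketch of the expected values exactly as the problem states them):

```python
# Ex-ante comparison of the two joint strategies.
# Coin: heads -> 1 decider ($100 donated on "yea"), tails -> 9 deciders ($1000 on "yea").
p_heads = p_tails = 0.5

ev_yea = p_heads * 100 + p_tails * 1000   # everyone pledges "yea"
ev_nay = 700                              # everyone pledges "nay": $700 regardless

print(ev_yea, ev_nay)  # 550.0 700
```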

But consider what happens when you're already in your room, and I tell you that you're a decider, and you don't know how many other deciders there are. This gives you new information you didn't know before - no anthropic funny business, just your regular kind of information - so you should do a Bayesian update: the coin is 90% likely to have come up tails. So saying "yea" gives 0.9*1000 + 0.1*100 = 910 expected donation. This looks more attractive than the 700 for "nay", so you decide to go with "yea" after all.
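Both calculations are arithmetically correct as stated; here is the update done with exact fractions (the tension lies in the decision theory, not in the numbers):

```python
from fractions import Fraction as F

# Prior over the coin, and the chance of being told "you are a decider"
# under each outcome (10 people total).
p_heads = p_tails = F(1, 2)
p_decider_given_heads = F(1, 10)   # one decider picked from ten
p_decider_given_tails = F(9, 10)   # nine deciders picked from ten

# Bayes: P(tails | I am a decider)
p_decider = p_heads * p_decider_given_heads + p_tails * p_decider_given_tails
p_tails_given_decider = p_tails * p_decider_given_tails / p_decider
print(p_tails_given_decider)   # 9/10

# Post-update expected donation if all deciders say "yea"
ev_yea_updated = p_tails_given_decider * 1000 + (1 - p_tails_given_decider) * 100
print(ev_yea_updated)          # 910
```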

Only one answer can be correct. Which is it and why?

(No points for saying that UDT or reflective consistency forces the first solution. If that's your answer, you must also find the error in the second one.)

Is Politics the Mindkiller? An Inconclusive Test

14 OrphanWilde 27 July 2012 05:45PM

Or is the convention against discussing politics here silly?

I propose a test.  I'm going to try to lay down some rules on voting on comments for the test here (not that I can force anybody to abide by them):

1.) Top-level comments should introduce arguments (or ridicule me and/or this test); responses should be responses to those arguments.

2.) Upvote and downvote based on whether or not you find an argument convincing in the context in which it was raised.  This means if it's a good argument against the argument it is responding to, not whether or not there's a good/obvious counterargument to it; if you have a good counterargument, raise it.  If it's a convincing argument, and the counterargument is also convincing, upvote both.  If both arguments are unconvincing, downvote both.

3.) Try not to downvote particular comments excessively, if they're legitimate lines of argument.  A faulty line of argument provides opportunity for rebuttal, and so for our test has value even then; that is, I want some faulty lines of argument here.  If you disagree, please downvote me, instead of the faulty comments, because this post is what you want less of, not those comments.  This necessarily implies, for balance, that we not excessively upvote comments.  I'd suggest fairly arbitrary limits of 3/-3?

Edit: 4.) A single argument per comment would be ideal; as MixedNuts points out here, it's otherwise hard to distinguish between one good and one bad argument, which makes the upvoting/downvoting difficult to evaluate.  (My apologies about missing this, folks.)

I'm going to try really hard not to get personally involved, except to lay down a leading comment posing an argument against abortion - a position I don't hold, for the record. The core of the argument isn't disingenuous; I hold that it is true, it just doesn't lead me to oppose abortion, because I don't hold the moral axiom by which the basic argument gets extended into an argument against abortion. I'm playing the devil's advocate to try to keep myself from getting sucked into the argument while providing an initial point of discussion.

Which leads me to the next point: If you see a hole in an argument, even if it's an argument for a perspective you agree with, poke through it.  The goal is to see whether we can have a constructive political argument here.

The fact that this is a test, and known to be a test, means this isn't a blind study.  Uh, try to act as if you're not being tested?

After it's gone on a little while, if this post hasn't been hopelessly downvoted and ridiculed (and thus the premise and test discarded as undesirable to begin with), we can put up a poll to see whether people found the political debates helpful, not helpful, and so on.

What Is Signaling, Really?

74 Yvain 12 July 2012 05:43PM

The most commonly used introduction to signaling, promoted both by Robin Hanson and in The Art of Strategy, starts with college degrees. Suppose there are two kinds of people, smart people and stupid people; and suppose, with wild starry-eyed optimism, that the populace is split 50-50 between them. Smart people would add enough value to a company to be worth a $100,000 salary each year, but stupid people would only be worth $40,000. And employers, no matter how hard they try to come up with silly lateral-thinking interview questions like “How many ping-pong balls could fit in the Sistine Chapel?”, can't tell the difference between them.

Now suppose a certain college course, which costs $50,000, passes all smart people but flunks half the stupid people. A strategic employer might declare a policy of hiring (for a one year job; let's keep this model simple) graduates at $100,000 and non-graduates at $40,000.

Why? Consider the thought process of a smart person when deciding whether or not to take the course. She thinks “I am smart, so if I take the course, I will certainly pass. Then I will make an extra $60,000 at this job. So my costs are $50,000, and my benefits are $60,000. Sounds like a good deal.”

The stupid person, on the other hand, thinks: “As a stupid person, if I take the course, I have a 50% chance of passing and making $60,000 extra, and a 50% chance of failing and making $0 extra. My expected benefit is $30,000, but my expected cost is $50,000. I'll stay out of school and take the $40,000 salary for non-graduates.”

...assuming that stupid people all know they're stupid, and that they're all perfectly rational experts at game theory, to name two of several dubious premises here. Yet despite its flaws, this model does give some interesting results. For example, it suggests that rational employers will base decisions upon - and rational employees enroll in - college courses, even if those courses teach nothing of any value. So an investment bank might reject someone who had no college education, even while hiring someone who studied Art History, not known for its relevance to derivative trading.
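The cost-benefit reasoning of the two hypothetical employees reduces to one line of arithmetic apiece (the helper function is my own framing, not the model's):

```python
# Toy model from the text: course costs $50k; graduates are hired at $100k,
# non-graduates at $40k; smart people always pass, stupid people pass half the time.
COURSE_COST = 50_000
WAGE_GAP = 100_000 - 40_000   # extra salary a graduate earns

def expected_gain(p_pass):
    """Expected benefit of taking the course, net of its cost."""
    return p_pass * WAGE_GAP - COURSE_COST

print(expected_gain(1.0))   # smart person:  10000.0 -> worth enrolling
print(expected_gain(0.5))   # stupid person: -20000.0 -> better to stay out
```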

We'll return to the specific example of education later, but for now it is more important to focus on the general definition that X signals Y if X is more likely to be true when Y is true than when Y is false. Amoral self-interested agents after the $60,000 salary bonus for intelligence, whether they are smart or stupid, will always say “Yes, I'm smart” if you ask them. So saying “I am smart” is not a signal of intelligence. Having a college degree is a signal of intelligence, because a smart person is more likely to get one than a stupid person.
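That definition translates directly into code; the conditional probabilities below are the toy model's, while the 50-50 prior and the Bayes step at the end are my own addition:

```python
from fractions import Fraction as F

# X signals Y iff P(X | Y) > P(X | not Y).
p_degree_given_smart = F(1)       # smart people who enroll always pass
p_degree_given_stupid = F(1, 2)   # stupid people pass half the time

print(p_degree_given_smart > p_degree_given_stupid)   # True: a degree signals smarts

# "I am smart" is uttered by everyone chasing the bonus, so P(X|Y) == P(X|not Y)
# and the bare assertion signals nothing.

# With a 50-50 prior, an employer who sees a degree updates to:
prior_smart = F(1, 2)
p_smart_given_degree = (prior_smart * p_degree_given_smart) / (
    prior_smart * p_degree_given_smart
    + (1 - prior_smart) * p_degree_given_stupid
)
print(p_smart_given_degree)   # 2/3
```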

Life frequently throws us into situations where we want to convince other people of something. If we are employees, we want to convince bosses we are skillful, honest, and hard-working. If we run the company, we want to convince customers we have superior products. If we are on the dating scene, we want to show potential mates that we are charming, funny, wealthy, interesting, you name it.

In some of these cases, mere assertion goes a long way. If I tell my employer at a job interview that I speak fluent Spanish, I'll probably get asked to talk to a Spanish-speaker at my job, will either succeed or fail, and if I fail will have a lot of questions to answer and probably get fired - or at the very least be in more trouble than if I'd just admitted I didn't speak Spanish to begin with. Here society and its system of reputational penalties help turn mere assertion into a credible signal: asserting I speak Spanish is costlier if I don't speak Spanish than if I do, and so is believable.

In other cases, mere assertion doesn't work. If I'm at a seedy bar looking for a one-night stand, I can tell a girl I'm totally a multimillionaire and feel relatively sure I won't be found out until after that one night - and so she would be naive to believe me, unless I did something only a real multimillionaire could, like give her an expensive diamond necklace.

How expensive a diamond necklace, exactly?  To absolutely prove I am a millionaire, only a million dollars worth of diamonds will do; $10,000 worth of diamonds could in theory come from anyone with at least $10,000. But in practice, people only care so much about impressing a girl at a seedy bar; if everyone cares about the same amount, the amount they'll spend on the signal depends mostly on their marginal utility of money, which in turn depends mostly on how much they have. Both a millionaire and a tenthousandaire can afford to buy $10,000 worth of diamonds, but only the millionaire can afford to buy $10,000 worth of diamonds on a whim. If in general people are only willing to spend 1% of their money on an impulse gift, then $10,000 is sufficient evidence that I am a millionaire.
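The 1%-of-wealth rule in this paragraph can be inverted to read off the wealth a gift implies; a minimal sketch under that assumption:

```python
SPEND_PERCENT = 1   # assumed cap: an impulse gift is at most 1% of one's wealth

def min_wealth_implied(gift):
    # wealth * SPEND_PERCENT / 100 >= gift  =>  wealth >= gift * 100 / SPEND_PERCENT
    return gift * 100 // SPEND_PERCENT

print(min_wealth_implied(10_000))   # 1000000: a $10k whim implies a millionaire
```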

But when the stakes are high, signals can get prohibitively costly. If a dozen millionaires are wooing Helen of Troy, the most beautiful woman in the world, and willing to spend arbitrarily much money on her - and if they all believe Helen will choose the richest among them - then if I only spend $10,000 on her I'll be outshone by a millionaire who spends the full million. Thus, if I want any chance with her at all, then even if I am genuinely the richest man around I might have to squander my entire fortune on diamonds.

This raises an important point: signaling can be really horrible. What if none of us are entirely sure how much Helen's other suitors have? It might be rational for all of us to spend everything we have on diamonds for her. Then twelve millionaires lose their fortunes, eleven of them for nothing. And this isn't some kind of wealth transfer - for all we know, Helen might not even like diamonds; maybe she locks them in her jewelry box after the wedding and never thinks about them again. It's about as economically productive as digging a big hole and throwing money into it.

If all twelve millionaires could get together beforehand and compare their wealth, and agree that only the wealthiest one would woo Helen, then they could all save their fortunes and the result would be exactly the same: Helen marries the wealthiest. If all twelve millionaires are remarkably trustworthy, maybe they can pull it off. But if any of them believe the others might lie about their wealth, or that one of the poorer men might covertly break their pact and woo Helen with gifts, then they've got to go through with the whole awful “everyone wastes everything they have on shiny rocks” ordeal.

Examples of destructive signaling are not limited to hypotheticals. Even if one does not believe Jared Diamond's hypothesis that Easter Island civilization collapsed after chieftains expended all of their resources trying to out-signal each other by building larger and larger stone heads, one can look at Nikolai Roussanov's study on how the dynamics of signaling games in US minority communities encourage conspicuous consumption and prevent members of those communities from investing in education and other important goods.

The Art of Strategy even advances the surprising hypothesis that corporate advertising can be a form of signaling. When a company advertises during the Super Bowl or some other high-visibility event, it costs a lot of money. To be able to afford the commercial, the company must be pretty wealthy; which in turn means it probably sells popular products and isn't going to collapse and leave its customers in the lurch. And to want to afford the commercial, the company must be pretty confident in its product: advertising that you should shop at Wal-Mart is more profitable if you shop at Wal-Mart, love it, and keep coming back than if you're likely to go to Wal-Mart, hate it, and leave without buying anything. This signaling, too, can become destructive: if every other company in your industry is buying Super Bowl commercials, then none of them have a comparative advantage and they're in exactly the same relative position as if none of them bought Super Bowl commercials - throwing money away just as in the diamond example.

Most of us cannot afford a Super Bowl commercial or a diamond necklace, and less people may build giant stone heads than during Easter Island's golden age, but a surprising amount of everyday life can be explained by signaling. For example, why did about 50% of readers get a mental flinch and an overpowering urge to correct me when I used “less” instead of “fewer” in the sentence above? According to Paul Fussell's “Guide Through The American Class System” (ht SIAI mailing list), nitpicky attention to good grammar, even when a sentence is perfectly clear without it, can be a way to signal education, and hence intelligence and probably social class. I would not dare to summarize Fussell's guide here, but it shattered my illusion that I mostly avoid thinking about class signals, and instead convinced me that pretty much everything I do from waking up in the morning to going to bed at night is a class signal. On flowers:

Anyone imagining that just any sort of flowers can be presented in the front of a house without status jeopardy would be wrong. Upper-middle-class flowers are rhododendrons, tiger lilies, amaryllis, columbine, clematis, and roses, except for bright-red ones. One way to learn which flowers are vulgar is to notice the varieties favored on Sunday-morning TV religious programs like Rex Humbard's or Robert Schuller's. There you will see primarily geraniums (red are lower than pink), poinsettias, and chrysanthemums, and you will know instantly, without even attending to the quality of the discourse, that you are looking at a high-prole setup. Other prole flowers include anything too vividly red, like red tulips. Declassed also are phlox, zinnias, salvia, gladioli, begonias, dahlias, fuchsias, and petunias. Members of the middle class will sometimes hope to mitigate the vulgarity of bright-red flowers by planting them in a rotting wheelbarrow or rowboat displayed on the front lawn, but seldom with success.

Seriously, read the essay.

In conclusion, a signal is a method of conveying information among not-necessarily-trustworthy parties by performing an action which is more likely or less costly if the information is true than if it is not true. Because signals are often costly, they can sometimes lead to a depressing waste of resources, but in other cases they may be the only way to believably convey important information.

Hope Function

22 gwern 01 July 2012 03:40PM

Yesterday I finished transcribing "The Ups and Downs of the Hope Function In a Fruitless Search". This is a statistics & psychology paper describing a simple probabilistic search problem and the sheer difficulty subjects have in producing the correct Bayesian answer. Besides providing a great but simple illustration of the mind projection fallacy in action, the simple search problem maps onto a number of forecasting problems: the problem may be looking in a desk for a letter that may not be there, but we could also look at a problem in which we check every year for the creation of AI and ask how our beliefs change over time - which turns out to defuse a common scoffing criticism of past technological forecasting. (This last problem was why I went back and used it, after I first read of it.)

The math is all simple - arithmetic and one application of Bayes's law - so I think all LWers can enjoy it, and it has amusing examples to analyze. I have also taken the trouble to annotate it with Wikipedia links, relevant materials, and many PDF links (some jailbroken just for this transcript). I hope everyone finds it as interesting as I did.

I thank John Salvatier for doing the ILL request which got me a scan of this book chapter.

When is Winning not Winning?

13 Eneasz 22 May 2012 04:25PM

Lately I'd gotten jaded enough that I simply accepted that different rules apply to the elite class. As Hanson would say, most rules are there specifically to curtail those who don't have the ability to avoid them and to be side-stepped by those who do - it's why we evolved such big, manipulative brains. So when this video recently made the rounds it shocked me to realize how far my values had drifted over the past several years.

(the video is not about politics, it is about status. My politics are far from those of Penn)
http://www.youtube.com/watch?v=wWWOJGYZYpk&feature=sharek

It's good we have people like Penn around to remind us what it was like to be teenagers and still expect the world to be fair, so our brains can be used for more productive things.

By the measure our society currently uses, Obama was winning. Penn was not. Yet Penn’s approach is the winning strategy for society. Brain power is wasted on status games and social manipulation when it could be used for actually making things better. The machinations of the elite class are a huge drain of resources that could be better used in almost any other pursuit. And yet the elites are admired high-status individuals who are viewed as “winning” at life. They sit atop huge piles of utility. Idealists like Penn are regarded as immature for insisting on things as low-status as “the rules should be fair and apply identically to everyone, from the inner-city crack-dealer to the Harvard post-grad.”

The “Rationalists Should Win” meme is a good one, but it risks corrupting our goals. If we focus too much on “Rationalists Should Win” we risk going for near-term gains that benefit us. Status, wealth, power, sex. Basically hedonism – things that feel good because we’ve evolved to feel good when we get them. Thus we feel we are winning, and we’re even told we are winning by our peers and by society. But these things aren’t of any use to society. A society of such “rationalists” would make only feeble and halting progress toward grasping the dream of defeating death and colonizing the stars.

It is important to not let one’s concept of “winning” be corrupted by Azathoth.

ADDED 5/23:

It seems the majority of comments on this post are from people who disagree on the basis that rationality is a tool for achieving ends, but not for telling you what ends are worth achieving.

I disagree. As is written "The Choice between Good and Bad is not a matter of saying 'Good!'  It is about deciding which is which." And rationality can help to decide which is which. In fact without rationality you are much more likely to be partially or fully mistaken when you decide.

Work harder on tabooing "Friendly AI"

18 ChrisHallquist 20 May 2012 08:51AM

This is an outgrowth of a comment I left on Luke's dialog with Pei Wang, and I'll start by quoting that comment in full:

Luke, what do you mean here when you say, "Friendly AI may be incoherent and impossible"? 

The Singularity Institute's page "What is Friendly AI?" defines the term thus: "A 'Friendly AI' is an AI that takes actions that are, on the whole, beneficial to humans and humanity." Surely you don't mean to say, "The idea of an AI that takes actions that are, on the whole, beneficial to humans and humanity may be incoherent or impossible"?

Eliezer's paper "Artificial Intelligence as a Positive and Negative Factor in Global Risk" talks about "an AI created with specified motivations." But it's pretty clear that that's not the only thing you and he have in mind, because part of the problem is making sure the motivations we give an AI are the ones we really want to give it.

If you meant neither of those things, what did you mean? "Provably friendly"? "One whose motivations express an ideal extrapolation of our values"? (It seems a flawed extrapolation could still give results that are on the whole beneficial, so this is different than the first definition suggested above.) Or something else?

Since writing that comment, I've managed to find two other definitions of "Friendly AI." One is from Armstrong, Sandberg, and Bostrom's paper on Oracle AI, which describes Friendly AI as: "AI systems designed to be of low risk." This definition is very similar to the definition from the Singularity Institute's "What is Friendly AI?" page, except that it incorporates the concept of risk. The second definition is from Luke's paper with Anna Salamon, which describes Friendly AI as "an AI with a stable, desirable utility function." This definition has the important feature of restricting "Friendly AI" to designs that have a utility function. Luke's comments about "rationally shaped" AI in this essay seem relevant here.

Neither of those papers seems to use the initial definition they give of "Friendly AI" consistently. Armstrong, Sandberg, and Bostrom's paper has a section on creating Oracle AI by giving it a "friendly utility function," which states, "if a friendly OAI could be designed, then it is most likely that a friendly AI could also be designed, obviating the need to restrict to an Oracle design in the first place."

This is a non-sequitur if "friendly" merely means "low risk," but it makes sense if they are actually defining Friendly AI in terms of a safe utility function: what they're saying then is if we can create an AI that stays boxed because of its utility function, we can probably create an AI that doesn't need to be boxed to be safe.

In the case of Luke's paper with Anna Salamon, the discussion on page 17 seems to imply that "Nanny AI" and "Oracle AI" are not types of Friendly AI. This is strange under their official definition of "Friendly AI." Why couldn't Nanny AI or Oracle AI have a stable, desirable utility function? I'm inclined to think the best way to make sense of that part of the paper is if "Friendly AI" is interpreted to mean "an AI whose utility function is an ideal extrapolation of our values (or at least comes close)."

I'm being very nitpicky here, but I think the issue of how to define "Friendly AI" is important for a couple of reasons. First, it's obviously important for clear communication. If we aren't clear on what we mean by "Friendly AI," we won't understand each other when we try to talk about it. But another very important worry is that confusion about the meaning of "Friendly AI" may be spawning sloppy thinking about it. Equivocating between narrower and broader definitions of "Friendly AI" may end up taking the place of an argument that the approach specified by the more narrow definition is the way to go. This seems like an excellent example of the benefits of tabooing your words.

I see on Luke's website that he has a forthcoming peer-reviewed article with Nick Bostrom titled "Why We Need Friendly AI." On the whole, I've been impressed with the drafts of the two peer-reviewed articles Luke has posted so far, so I'm moderately optimistic that that article will resolve these issues.

Holden Karnofsky's Singularity Institute critique: Is SI the kind of organization we want to bet on?

13 ciphergoth 11 May 2012 07:25AM

The sheer length of GiveWell co-founder and co-executive director Holden Karnofsky's excellent critique of the Singularity Institute means that it's hard to keep track of the resulting discussion.  I propose to break out each of his objections into a separate Discussion post so that each receives the attention it deserves.

Is SI the kind of organization we want to bet on?

This part of the post has some risks. For most of GiveWell's history, sticking to our standard criteria - and putting more energy into recommended than non-recommended organizations - has enabled us to share our honest thoughts about charities without appearing to get personal. But when evaluating a group such as SI, I can't avoid placing a heavy weight on (my read on) the general competence, capability and "intangibles" of the people and organization, because SI's mission is not about repeating activities that have worked in the past. Sharing my views on these issues could strike some as personal or mean-spirited and could lead to the misimpression that GiveWell is hostile toward SI. But it is simply necessary in order to be fully transparent about why I hold the views that I hold.

Fortunately, SI is an ideal organization for our first discussion of this type. I believe the staff and supporters of SI would overwhelmingly rather hear the whole truth about my thoughts - so that they can directly engage them and, if warranted, make changes - than have me sugar-coat what I think in order to spare their feelings. People who know me and my attitude toward being honest vs. sparing feelings know that this, itself, is high praise for SI.

One more comment before I continue: our policy is that non-public information provided to us by a charity will not be published or discussed without that charity's prior consent. However, none of the content of this post is based on private information; all of it is based on information that SI has made available to the public.

There are several reasons that I currently have a negative impression of SI's general competence, capability and "intangibles." My mind remains open and I include specifics on how it could be changed.

  • Weak arguments. SI has produced enormous quantities of public argumentation, and I have examined a very large proportion of this information. Yet I have never seen a clear response to any of the three basic objections I listed in the previous section. One of SI's major goals is to raise awareness of AI-related risks; given this, the fact that it has not advanced clear/concise/compelling arguments speaks, in my view, to its general competence.
  • Lack of impressive endorsements. I discussed this issue in my 2011 interview with SI representatives and I still feel the same way on the matter. I feel that given the enormous implications of SI's claims, if it argued them well it ought to be able to get more impressive endorsements than it has.

    I have been pointed to Peter Thiel and Ray Kurzweil as examples of impressive SI supporters, but I have not seen any on-record statements from either of these people that show agreement with SI's specific views, and in fact (based on watching them speak at Singularity Summits) my impression is that they disagree. Peter Thiel seems to believe that speeding the pace of general innovation is a good thing; this would seem to be in tension with SI's view that AGI will be catastrophic by default and that no one other than SI is paying sufficient attention to "Friendliness" issues. Ray Kurzweil seems to believe that "safety" is a matter of transparency, strong institutions, etc. rather than of "Friendliness." I am personally in agreement with the things I have seen both of them say on these topics. I find it possible that they support SI because of the Singularity Summit or to increase general interest in ambitious technology, rather than because they find "Friendliness theory" to be as important as SI does.

    Clear, on-record statements from these two supporters, specifically endorsing SI's arguments and the importance of developing Friendliness theory, would shift my views somewhat on this point.

  • Resistance to feedback loops. I discussed this issue in my 2011 interview with SI representatives and I still feel the same way on the matter. SI seems to have passed up opportunities to test itself and its own rationality by e.g. aiming for objectively impressive accomplishments. This is a problem because of (a) its extremely ambitious goals (among other things, it seeks to develop artificial intelligence and "Friendliness theory" before anyone else can develop artificial intelligence); and (b) its view of its staff/supporters as having unusual insight into rationality, which I discuss in a later bullet point.

    SI's list of achievements is not, in my view, up to where it needs to be given (a) and (b). Yet I have seen no declaration that SI has fallen short to date and explanation of what will be changed to deal with it. SI's recent release of a strategic plan and monthly updates are improvements from a transparency perspective, but they still leave me feeling as though there are no clear metrics or goals by which SI is committing to be measured (aside from very basic organizational goals such as "design a new website" and very vague goals such as "publish more papers") and as though SI places a low priority on engaging people who are critical of its views (or at least not yet on board), as opposed to people who are naturally drawn to it.

    I believe that one of the primary obstacles to being impactful as a nonprofit is the lack of the sort of helpful feedback loops that lead to success in other domains. I like to see groups that are making as much effort as they can to create meaningful feedback loops for themselves. I perceive SI as falling well short on this front. Pursuing more impressive endorsements and developing benign but objectively recognizable innovations (particularly commercially viable ones) are two possible ways to impose more demanding feedback loops. (I discussed both of these in my interview linked above.)

  • Apparent poorly grounded belief in SI's superior general rationality. Many of the things that SI and its supporters and advocates say imply a belief that they have special insights into the nature of general rationality, and/or have superior general rationality, relative to the rest of the population. (Examples here, here and here). My understanding is that SI is in the process of spinning off a group dedicated to training people on how to have higher general rationality.

    Yet I'm not aware of any of what I consider compelling evidence that SI staff/supporters/advocates have any special insight into the nature of general rationality or that they have especially high general rationality.

    I have been pointed to the Sequences on this point. The Sequences (which I have read the vast majority of) do not seem to me to be a demonstration or evidence of general rationality. They are about rationality; I find them very enjoyable to read; and there is very little they say that I disagree with (or would have disagreed with before I read them). However, they do not seem to demonstrate rationality on the part of the writer, any more than a series of enjoyable, not-obviously-inaccurate essays on the qualities of a good basketball player would demonstrate basketball prowess. I sometimes get the impression that fans of the Sequences are willing to ascribe superior rationality to the writer simply because the content seems smart and insightful to them, without making a critical effort to determine the extent to which the content is novel, actionable and important. 

    I endorse Eliezer Yudkowsky's statement, "Be careful … any time you find yourself defining the [rationalist] as someone other than the agent who is currently smiling from on top of a giant heap of utility." To me, the best evidence of superior general rationality (or of insight into it) would be objectively impressive achievements (successful commercial ventures, highly prestigious awards, clear innovations, etc.) and/or accumulation of wealth and power. As mentioned above, SI staff/supporters/advocates do not seem particularly impressive on these fronts, at least not as much as I would expect for people who have the sort of insight into rationality that makes it sensible for them to train others in it. I am open to other evidence that SI staff/supporters/advocates have superior general rationality, but I have not seen it.

    Why is it a problem if SI staff/supporters/advocates believe themselves, without good evidence, to have superior general rationality? First off, it strikes me as a belief based on wishful thinking rather than rational inference. Secondly, I would expect a series of problems to accompany overconfidence in one's general rationality, and several of these problems seem to be actually occurring in SI's case:

    • Insufficient self-skepticism given how strong its claims are and how little support its claims have won. Rather than endorsing "Others have not accepted our arguments, so we will sharpen and/or reexamine our arguments," SI seems often to endorse something more like "Others have not accepted their arguments because they have inferior general rationality," a stance less likely to lead to improvement on SI's part.
    • Being too selective (in terms of looking for people who share its preconceptions) when determining whom to hire and whose feedback to take seriously.
    • Paying insufficient attention to the limitations of the confidence one can have in one's untested theories, in line with my Objection 1.
  • Overall disconnect between SI's goals and its activities. SI seeks to build FAI and/or to develop and promote "Friendliness theory" that can be useful to others in building FAI. Yet it seems that most of its time goes to activities other than developing AI or theory. Its per-person output in terms of publications seems low. Its core staff seem more focused on Less Wrong posts, "rationality training" and other activities that don't seem connected to the core goals; Eliezer Yudkowsky, in particular, appears (from the strategic plan) to be focused on writing books for popular consumption. These activities seem neither to be advancing the state of FAI-related theory nor to be engaging the sort of people most likely to be crucial for building AGI.

    A possible justification for these activities is that SI is seeking to promote greater general rationality, which over time will lead to more and better support for its mission. But if this is SI's core activity, it becomes even more important to test the hypothesis that SI's views are in fact rooted in superior general rationality - and these tests don't seem to be happening, as discussed above.

  • Theft. I am bothered by the 2009 theft of $118,803.00 (as against a $541,080.00 budget for the year). In an organization as small as SI, it really seems as though theft that large relative to the budget shouldn't occur and that it represents a major failure of hiring and/or internal controls.

    In addition, I have seen no public SI-authorized discussion of the matter that I consider to be satisfactory in terms of explaining what happened and what the current status of the case is on an ongoing basis. Some details may have to be omitted, but a clear SI-authorized statement on this point with as much information as can reasonably be provided would be helpful.

A couple of positive observations to add context here:

  • I see significant positive qualities in many of the people associated with SI. I especially like what I perceive as their sincere wish to do whatever they can to help the world as much as possible, and the high value they place on being right as opposed to being conventional or polite. I have not interacted with Eliezer Yudkowsky but I greatly enjoy his writings.
  • I'm aware that SI has relatively new leadership that is attempting to address the issues behind some of my complaints. I have a generally positive impression of the new leadership; I believe the Executive Director and Development Director, in particular, to represent a step forward in terms of being interested in transparency and in testing their own general rationality. So I will not be surprised if there is some improvement in the coming years, particularly regarding the last couple of statements listed above. That said, SI is an organization and it seems reasonable to judge it by its organizational track record, especially when its new leadership is so new that I have little basis on which to judge these staff.

Wrapup

While SI has produced a lot of content that I find interesting and enjoyable, it has not produced what I consider evidence of superior general rationality or of its suitability for the tasks it has set for itself. I see no qualifications or achievements that specifically seem to indicate that SI staff are well-suited to the challenge of understanding the key AI-related issues and/or coordinating the construction of an FAI. And I see specific reasons to be pessimistic about its suitability and general competence.

When estimating the expected value of an endeavor, it is natural to have an implicit "survivorship bias" - to use organizations whose accomplishments one is familiar with (which tend to be relatively effective organizations) as a reference class. Because of this, I would be extremely wary of investing in an organization with apparently poor general competence/suitability to its tasks, even if I bought fully into its mission (which I do not) and saw no other groups working on a comparable mission.
