All of utilitymonster's Comments + Replies

I prefer this briefer formalization, since it avoids some of the vagueness of "adequate preparations" and makes premise (6) clearer.

  1. At some point in the development of AI, there will be a very swift increase in the optimization power of the most powerful AI, moving from a non-dangerous level to a level of superintelligence. (Fast take-off)
  2. This AI will maximize a goal function.
  3. Given fast take-off and the maximization of a goal function, the superintelligent AI will have a decisive advantage unless adequate controls are used.
  4. Adequate controls will not
... (read more)

IMO, the "rapid takeoff" idea should probably be seen as a fundraising ploy. It's big, scary, and it could conceivably happen - just the kind of thing for stimulating donations.

It seems that SIAI would have more effective methods for fundraising, e.g. simply capitalizing on "Rah Singularity!". I therefore find this objection somewhat implausible.

-5timtyler

Cool. Glad this turned out to be helpful.

A recent study by folks at the Oxford Centre for Neuroethics suggests that Greene et al.'s results are better explained by appeal to differences in how intuitive/counterintuitive a moral judgment is, rather than differences in how utilitarian/deontological it is. I had a look at the study, and it seems reasonably legit, but I don't have any expertise in neuroscience. As I understand it, their findings suggest that the "more cognitive" part of the brain gets recruited more when making a counterintuitive moral judgment, whether utilitarian or de... (read more)

1lukeprog
Update: Greene's reply to Kahane et al. is here.
0lukeprog
I've now added a paragraph at the end after discussing the Kahane paper with Greene.
2lukeprog
Is this page broken for anyone else? When trying to load it, I just get a "Less Wrong broke!" message. I can still see the preview of it here, and I can even hit the 'edit' button from there and successfully update the post, and I can post new comments by replying to comments, but I can't actually load the page that contains this post! Is that happening for anyone else? It's been like this for me for more than an hour now.
9lukeprog
PDF of paper. Well done! This is a much better counter-argument to Greene's position than the ones presented in 2007. I shall update the original post accordingly.

A simple explanation is that using phrases like "brain scans indicate" and including brain scan images signals scientific eliteness, and halo effect/ordinary reasoning causes them to increase their estimate of the quality of the reasoning they see.

Do you know about Giving What We Can? You may be interested in getting to know people in that community. Basically, it's a group of people that pledges to give 10% of their earnings to the most effective charities in the developing world. Feel free to PM me or reply if you want to know more.

1juliawise
I'm familiar with it. Thanks for checking!

Usually, average utilitarians are interested in maximizing the average well-being of all the people who ever exist; they are not fundamentally interested in the average well-being of the people alive at particular points in time. Since some people have already existed, this is only a technical problem for average utilitarianism (and a problem that could not even possibly affect anyone's decision).

Incidentally, not distinguishing between averages over all the people that ever exist and all the people that exist at some time leads some people to wrongly conclude that average utilitarianism favors killing off people who are happy, but less happy than average.
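
A toy numerical illustration of that last point (the welfare figures are invented): under the all-time average, "killing off" a happy but below-average person shrinks total welfare without shrinking the number of people who ever exist, so the average falls rather than rises.

```python
# Hypothetical lifetime welfare of the three people who will ever exist.
welfare = {"A": 10.0, "B": 6.0, "C": 8.0}   # B is happy, but below the average

def all_time_average(w):
    """Average well-being over everyone who ever exists."""
    return sum(w.values()) / len(w)

print(all_time_average(welfare))        # 8.0

# Killing B does not remove B from the set of people who ever exist; it only
# truncates B's life, lowering B's lifetime welfare (say, from 6.0 to 2.0).
welfare_after = {"A": 10.0, "B": 2.0, "C": 8.0}
print(all_time_average(welfare_after))  # ≈ 6.67 -- the all-time average goes down
```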

7CarlShulman
A related commonly missed distinction is between maximizing welfare divided by lives, versus maximizing welfare divided by life-years. The second is more prone to endorsing euthanasia hypotheticals.
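
To make both distinctions explicit (the notation is mine, not from either comment): write $w_i$ for person $i$'s lifetime welfare and $y_i$ for their life-years, with the sums ranging over everyone who ever lives. The two maximands Carl contrasts are

$$\bar{w}_{\text{life}} = \frac{\sum_i w_i}{N}, \qquad \bar{w}_{\text{year}} = \frac{\sum_i w_i}{\sum_i y_i}.$$

Ending a life whose remaining years would be worth living but below the average welfare per life-year removes those years from both the numerator and the denominator of $\bar{w}_{\text{year}}$, which can raise it, whereas it only lowers the numerator of $\bar{w}_{\text{life}}$.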

Gustaf Arrhenius is the main person to look at on this topic. His website is here. Check out ch. 10-11 of his dissertation Future Generations: A Challenge for Moral Theory (though he has a forthcoming book that will make that obsolete). You may find more papers on his website. Look at the papers that contain the words "impossibility theorem" in the title.

2steven0461
Do average utilitarians have a standard answer to the question of what is the average welfare of zero people? The theory seems consistent with any such answer. If you're maximizing the average welfare of the people alive at some future point in time, and there's a nonzero chance of causing or preventing extinction, then the answer matters, too.

Both you and Eliezer seem to be replying to this argument:

  • People only intrinsically desire pleasure.

  • An FAI should maximize whatever people intrinsically desire.

  • Therefore, an FAI should maximize pleasure.

I am convinced that this argument fails for the reasons you cite. But who is making that argument? Is this supposed to be the best argument for hedonistic utilitarianism?

We act not for the sake of pleasure alone. We cannot solve the Friendly AI problem just by programming an AI to maximize pleasure.

IAWYC, but would like to hear more about why you think the last sentence is supported by the previous sentence. I don't see an easy argument from "X is a terminal value for many people" to "X should be promoted by the FAI." Are you supposing a sort of idealized desire fulfilment view about value? That's fine--it's a sensible enough view. I just wouldn't have thought it so obvious that it would be a good idea to go around invisibly assuming it.

Second the need for a list of the most important problems.

How do you record your findings for future use, and how do you make sure you don't forget the important parts?

1Pablo
Summarizing the main ideas of the material you read in your own words, as Luke advises here and elsewhere, does appear to increase retention. After experimenting with several different note-taking software programs, I've reached the conclusion that WikidPad is the best option available. I also try to write one-sentence summaries of every article and book I read, for future reference. Here a citation-management application might help. Of the ones I tried, citeulike strikes me as the most useful.
7lukeprog
My own brain understands what I've learned best when I write up something that looks similar to a post intended for Less Wrong. Other people's brains may work differently. At the very least, I write down sources where all the most important experiments and concepts are explained.

Can you explain why this analysis renders directing away from the five and toward the one permissible?

The switch example is more difficult to analyze in terms of the intuitions it evokes. I would guess that the principle of double effect captures an important aspect of what's going on, though I'm not sure how exactly. I don't claim to have anything close to a complete theory of human moral intuitions.

In any case, the fact that someone who flipped the switch appears much less (if at all) bad compared to someone who pushed the fat man does suggest strongly that there is some important game-theoretic issue involved, or otherwise we probably wouldn't have evo... (read more)

I actually don't think this is quite right. Last time I asked a philosopher about this, they pointed to an article by someone (I.J. Good, I think) about how to choose the most valuable experiment (given your goals), using decision theory.

Is there data on its influence and projected influence?

Yes. They posted a bunch of self-evaluation stats. It is a start toward the information you seek.

1SilasBarta
Refer to the linked discussion thread, which links to accounts of actual layshadows -- they describe what fields they did this for in detail. It's as you'd expect: they could pull it off for everything except engineering and the hard sciences.

Hard to be confident about these things, but I don't see the problem with external reasons/oughts. Some people seem to have some kind of metaphysical worry...harder to reduce or something. I don't see it.

R is a categorical reason for S to do A iff R counts in favor of S doing A, and would so count for other agents in a similar situation, regardless of their preferences. If it were true that we always have reasons to benefit others, regardless of what we care about, that would be a categorical reason. I don't use the term "categorical reason" any differently than "external reason".

S categorically ought to do A just when S ought to do A, regardless of what S cares about, and it would still be true that S ought to do A in similar situat... (read more)

1lukeprog
Agreed. And I'm skeptical of both. You?

So are categorical reasons any worse off than categorical oughts?

1lukeprog
Categorical oughts and reasons have always confused me. What do you see as the difference, and which type of each are you thinking of? The types of categorical oughts or reasons with which I'm most familiar are Kant's and Korsgaard's.

I can see that you might question the usefulness of the notion of a "reason for action" as something over and above the notion of "ought", but I don't see a better case for thinking that "reason for action" is confused.

The main worry here seems to have to do with categorical reasons for action. Diagnostic question: are these more troubling/confused than categorical "ought" statements? If so, why?

Perhaps I should note that philosophers talking this way make a distinction between "motivating reasons" and ... (read more)

1lukeprog
utilitymonster, For the record, as a good old Humean I'm currently an internalist about reasons, which leaves me unable (I think) to endorse any form of utilitarianism, where utilitarianism is the view that we ought to maximize X. Why? Because internal reasons don't always support maximizing X (and perhaps rarely do), and I don't think external reasons for maximizing X exist. For example, I don't think X has intrinsic value (in Korsgaard's sense of "intrinsic value"). Thanks for the link to that paper on rational choice theories and decision theories!

I'm sort of surprised by how people are taking the notion of "reason for action". Isn't this a familiar process when making a decision?

  1. For all courses of action you're thinking of taking, identify the features (consequences, if that's how you think about things) that count in favor of taking that course of action and those that count against it.

  2. Consider how those considerations weigh against each other. (Do the pros outweigh the cons, by how much, etc.)

  3. Then choose the thing that does best in this weighing process.

The same thing can

... (read more)
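
A minimal toy sketch of the weighing procedure in the numbered list above; the options, considerations, and weights are invented placeholders, just to make the "pros outweigh the cons" talk concrete.

```python
# Step 1: for each option, list the considerations that count for or against it.
# Positive weights count in favor, negative weights count against.
options = {
    "take the job": [("higher salary", 3), ("long commute", -2), ("better team", 2)],
    "stay put":     [("stability", 2), ("stagnation", -1)],
}

def score(considerations):
    """Step 2: weigh the pros and cons against each other."""
    return sum(weight for _, weight in considerations)

# Step 3: choose the option that does best in the weighing.
best = max(options, key=lambda option: score(options[option]))
print(best, score(options[best]))  # take the job 3
```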

Even if we grant that one's meta-ethical position will determine one's normative theory (which is very contentious), one would like some evidence that it would be easier to find the correct meta-ethical view than it would be to find the correct (or appropriate, or whatever) normative ethical view. Otherwise, why not just do normative ethics?

4lukeprog
My own thought is that doing meta-ethics may illuminate normative theory, but I could be wrong about that. For example, I think doing meta-ethics right seals the deal for consequentialism, but not utilitarianism.

Yes, this is what I thought EY's theory was. EY? Is this your view?

On the symbolic action point, you can try making the symbolic action into a public commitment. Research suggests this will increase the strength of the effect you're talking about. Of course, this could also make you overcommit, so this strategy should be used carefully.

Especially if WBE comes late (so there is a big hardware overhang), you wouldn't need much calendar time to spend loads of subjective years designing FAI. A small lead time could be enough. Of course, you'd have to be first and have significant influence on the project.

Edited for spelling.

Don't forget about the ridiculous levels of teaching you're responsible for in that situation. Lots worse than at an elite institution.

3Jordan
Not necessarily. I'm not referring to no-research universities, which do have much higher teaching loads (although still not ridiculous. Teaching 3 or 4 classes a semester is hardly strenuous). I'm referring to research universities that aren't in the top 100, but which still push out graduate students. My undergrad Alma Mater, Kansas University, for instance. Professors teach 1 or 2 classes a semester, with TA support (really, when you have TAs, teaching is not real work). They are still expected to do research, but the pressure is much less than at a top 50 school.

I thought this was really, really good.

Enjoyed most of this, some worries about how far you're getting with point 8 (on giving now rather than later).

Give now (rather than later) - I’ve seen fascinating arguments that it might be possible to do more good by investing your money in the stock market for a long period of time and then giving all the proceeds to charity later. It’s an interesting strategy but it has a number of limitations. To name just two: 1) Not contributing to charity each year prevents you from taking advantage of the best tax planning strategy available to you. That tax-

... (read more)
2Louie
I wrote much more about this point but decided to cut it down substantially since it was already disproportionately large compared to its value to my overall rhetorical goals. But here are some other things I wrote that didn't make it into the final draft of #8:

"I do agree that this helps you donate more dollars that you can credibly say came from you. But does it reliably increase total impact? It seems unlikely. For instance, imagine donating to a highly rated GiveWell charity that is vaccinating people against a communicable disease in Africa. The vaccines will be cheaper in the future and if you invest well, your money should be worth more in the future too. More money, cheaper vaccines -- impact properly increased, right? But preventing the spread of that disease earlier with less money could easily have prevented more total occurrences of that disease.

Most problems like disease, lack of education, poverty, environmental damage, or existential risk compound quickly while you sit on the sidelines. Does the particular disease or other problem you want to combat really spread slowly enough that you can overtake it with the power of compounding interest? You should do the calculation yourself, but most of the problems I'm aware of become harder to solve faster than that. And this is definitely a bad strategy if the charity you're supporting is actually working on long-term solutions to the problems they're combating and not just producing a series of (noble but ultimately endless) band-aid outcomes. Solving the problem is entirely different than managing outcomes indefinitely and can drastically shift the balance in favor of giving less sooner rather than more later."

I also wrote a lot of poorly phrased notes (that I wasn't entirely happy with) to the effect that if you still thought this was a great idea... so much so that you actually planned to do it, you should definitely not execute it silently without communicating your plan to the non-profit you're expecti
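
A rough version of the calculation Louie suggests doing, with invented growth rates (substitute your own estimates):

```python
# Does investing-then-giving-later beat giving now, if the problem itself
# compounds while you wait? All rates and amounts below are hypothetical.
def donation_after_investing(donation, market_return, years):
    """Value of the donation after investing it for `years`."""
    return donation * (1 + market_return) ** years

def cost_to_solve_later(initial_cost, problem_growth, years):
    """Cost of fixing the problem after it has compounded for `years`."""
    return initial_cost * (1 + problem_growth) ** years

years = 20
print(donation_after_investing(10_000, 0.05, years))  # ≈ 26,533 (5% real return)
print(cost_to_solve_later(10_000, 0.08, years))       # ≈ 46,610 (problem grows 8%/yr)
# Here the problem outgrows the investment, so giving now wins.
```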

Giving What We Can does not accept donations. Just give it all to Deworm the World.

0Mass_Driver
Okiedoke.

Some wisdom on warm fuzzies: http://www.pbfcomics.com/?cid=PBF162-Executive_Decision.jpg

[Not a quote, but doesn't seem suitable for a discussion article.]

1NancyLebovitz
Might this imply that we still want open threads?

My reaction is that moral philosophy just isn't science. Sure, if you're a utilitarian you can use empirical evidence to figure out what maximizes aggregate welfare, relative to your account of well-being, but you can't use science to discover that utilitarianism is true. This is because utilitarianism, like any other first-order normative theory and many meta-ethical theories, doesn't lead you to expect any experiences over any other experiences.

Thanks for writing this, Carl. I'm going to post a link in the GWWC forum.

Here are some papers you should add to your bibliography, if you haven't already:

  • What is the Probability Your Vote Will Make a Difference?

  • Voting as a Rational Choice

In the first paper, the probability estimate is 1 in 60 million on average for a voter in a US presidential election, and 1 in 10 million in the best cases (New Mexico, Virginia, New Hampshire, and Colorado).

If you focused on the best case, that could mean roughly an order of magnitude more expected impact for you.
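
To see what those probabilities buy you, here is a toy expected-impact calculation; the $1B figure for the value of the better outcome is purely illustrative.

```python
# Expected impact of one vote = P(vote is decisive) * value of swinging the outcome.
def expected_impact(p_decisive, outcome_value):
    return p_decisive * outcome_value

outcome_value = 1e9  # hypothetical: the better candidate winning is worth $1B of social value

print(expected_impact(1 / 60e6, outcome_value))  # ≈ $17 -- average US voter
print(expected_impact(1 / 10e6, outcome_value))  # $100  -- best-case (swing-state) voter
```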

3CarlShulman
Thanks utilitymonster. As it happens, they're already in the draft for the next post following this one.

On this point, it is noteworthy that international health aid eradicated smallpox. According to Toby Ord, it is estimated that this has prevented over 100 million deaths, which is more than the total number of people who died in all wars in the 20th century. If you assumed that all of the rest of international health aid achieved nothing at all, this single effort alone would make the average cost per DALY of international health aid better than what the British Government achieves.
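
The structure of that argument, made explicit (the wording of the bound is mine): even on the pessimistic assumption that every other health-aid dollar achieved nothing,

$$\text{average cost per DALY of health aid} \;\le\; \frac{\text{total spending on international health aid}}{\text{DALYs averted by smallpox eradication alone}},$$

and the claim is that even this upper bound is better (lower) than the cost per DALY the British Government achieves.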

0taw
You can always pick a reference class which supports any conclusion you want. You could just as plausibly claim that international aid mostly propped up various third-world dictators and fueled local wars (no matter what "aid money" was for, the government could always shift money it would otherwise need to spend on that area into buying weapons or beating up dissidents instead), leading to economic and civilizational stagnation, and over 100 million deaths which would otherwise not have happened. Or you could categorize reality into reference classes any other way.

Still don't get it. Let's say cards are being put in front of my face, and all I'm getting is their color. I can reliably distinguish the colors listed here: http://www.webspresso.com/color.htm. How do I associate a sequence of cards with a string? It doesn't seem like there is any canonical way of doing this. Maybe it won't matter that much in the end, but are there better and worse ways of starting?

2timtyler
Just so: the exact representation used is usually not that critical. If as you say you are using Solomonoff induction, the next step is to compress it - so any fancy encoding scheme you use will probably be stripped right off again.
0gwern
If you really can only distinguish those 255 colors, then you could associate each color with a single unique byte, and a sequence of n cards becomes a single bitstring with n*8 bits in it. For additional flavor, add some sort of compression. This is so elementary that I must be misunderstanding you somehow.
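
A minimal sketch of the encoding gwern describes; the palette below is a stand-in for whichever colors you can actually distinguish.

```python
# Give each distinguishable color a unique byte; a sequence of n cards then
# becomes an n-byte (n*8 bit) string.
colors = ["red", "green", "blue", "yellow"]      # ... up to 256 entries
code = {c: i for i, c in enumerate(colors)}      # color -> unique byte value

def encode(cards):
    """Turn a sequence of observed card colors into a byte string."""
    return bytes(code[c] for c in cards)

print(encode(["red", "blue", "blue", "yellow"]).hex())  # '00020203'
# Solomonoff induction is then run over these bitstrings; as timtyler notes,
# the particular encoding mostly washes out once programs compress the data.
```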
0timtyler
A stream of sense data is essentially equivalent to a binary stream - the associated programs are the ones that output that stream.

Question about Solomonoff induction: does anyone have anything good to say about how to associate programs with basic events/propositions/possible worlds?

0timtyler
Don't do that - instead associate programs with sensory input streams.
0khafra
Good question. Unfortunately, I don't think it's possible to create a universal shortcut for "run each one, and see if you get the possible world you were aiming for," other than the well-known alternatives like AIXI-tl and MC-AIXI.

This doesn't sound like a bad idea. Could someone give reasons to think that donations to SIAI now would be better than this?

4CarlShulman
In your specific case, given what you have said about your epistemic state, I would think that you subjectively-ought to do something like this (a commitment mechanism, but not necessarily with a commitment to reducing existential risk given your normative uncertainty). I'll have more to say about the general analysis in 48 hours or more, following a long flight from Australia.
0[anonymous]
Does "this" mean DAF, or signalling through waste?

I think the page makes a case that it is worth doing something about AI risk, and that SIAI is doing something. The page gives no one any reason to think that SIAI is doing better than anything else you could do about x-risk (there could be reasons elsewhere).

In this respect, the page is similar to other non-profit pages: (i) argue that there is a problem, (ii) argue that you're doing something to solve the problem, but don't (iii) try to show that you're solving the problem better than others. Maybe that's reasonable, since (iii) rubs some donors the wrong way and it is hard to establish that you're the best; but it doesn't advance our discussion about the best way to reduce x-risk.

So, you're really interested in this question: what is the best decision algorithm? And then you're interested, in a subsidiary way, in what you ought to do. You think the "action" sense is silly, since you can't run one algorithm and make some other choice.

Your answer to my objection involving the parody argument is that you ought to do something else (not go with loss aversion) because there is some better decision algorithm (that you could, in some sense of "could", use?) that tells you to do something else.

What do you do with case... (read more)

tl;dr Philosophers have been writing about what probabilities reduce to for a while. As far as I know, the only major reductionist view is David Lewis's "best system" account of laws (of nature) and chances. You can look for "best system" in this article for an intro. Barry Loewer has developed this view in this paper.

1cousin_it
From what I understand of Lewis's view, it's not a "reduction" in my sense of the word which (I think) also coincides with common LW usage. I generally try to reduce phenomena to programs that can be implemented on computers; the first two scenarios in the post are of this kind, and the third one can probably be implemented as well, once I understand it a little better.

I agree that this fact [you can't have a one-boxing disposition and then two box] could appear as premise in an argument, together with an alternative proposed decision theory, for the conclusion that one-boxing is a bad idea. If that was the implicit argument, then I now understand the point.

To be clear: I have not been trying to argue that you ought to take two boxes in Newcomb's problem.

But I thought this fact [you can't have a one-boxing disposition and then two box] was supposed to be a part of an argument that did not use a decision theory as a prem... (read more)

3CarlShulman
Not irrational by their own lights. "Take the action such that an unanticipated local miracle causing me to perform that action would be at least as good news as local miracles causing me to perform any of the alternative actions" is a coherent normative principle, even though such miracles do not occur. Other principles with different miracles are coherent too. Arguments for one decision theory or another only make sense for humans because we aren't clean implementations of any of these theories, and can be swayed by considerations like "agents following this rule regularly get rich."

What is false is that you ought to have disposition a and do B.

OK. So the argument is this one:

  1. According to two-boxers, you ought to (i) have the disposition to one-box, and (ii) take two boxes.
  2. It is impossible to do (i) and (ii).
  3. Ought implies can.
  4. So two-boxers are wrong.

But, on your use of "disposition", two-boxers reject 1. They do not believe that you should have a FAWS-disposition to one-box, since having a FAWS-disposition to one-box just means "actually taking one box, where this is not a result of randomness". Two-bo... (read more)

0FAWS
From the original post: Richard is probably using disposition in a different sense (possibly the model someone has of someone's disposition, in my sense), but I believe Eliezer's usage was closer to mine, and either way disposition in my sense is what she would need to actually get the million dollars.

Everyone agrees about what the best disposition to have is. The disagreement is about what to do. I have uniformly meant "ought" in the action sense, not the dispositional sense. (FYI: this is always the sense in which philosophers (incl. Richard) mean "ought", unless otherwise specified.)

BTW: I still don't understand the relevance of the fact that it is impossible for people with one-boxing dispositions to two-box. If you don't like the arguments that I formalized for you, could you tell me what other premises you are using to reach your conclusion?

4komponisto
That sense is entirely uninteresting, as I explained in my first comment in this thread. It's the sense in which one "ought" to two-box after having been predicted by Omega to one-box -- a stipulated impossibility.

Philosophers who, after having considered the distinction, remain concerned with the "action" sense, would tend to be -- shall we say -- vehemently suspected of non-reductionist thinking; of forgetting that actions are completely determined by dispositions (i.e. the algorithms running in the mind of the agent).

Having said that, if one does use "ought" in the action sense, then there should be no difficulty in saying that one "ought" to two-box in the situation where Omega has predicted you will one-box. That's just a restatement of the assumption that the outcome of (one-box predicted, two-box) is higher in the preference ordering than that of (one-box predicted, one-box).

Normally, the two meanings of "ought" coincide, because outcomes normally depend on actions that happen to be determined by dispositions, not directly on dispositions themselves. Hence it's easy to be deceived into thinking that the action sense is the appropriate sense of "ought". But this breaks down in situations of the Newcomb type. There, the dispositional sense is clearly the right one, because that's the sense in which you ought to one-box; since the dispositional sense also gives the same answers as the action sense for "normal" situations, we may as well say that the dispositional sense is what we mean by "ought" in general.
0CarlShulman
Just take causal decision theory and then crank it with an account of counterfactuals whereby there is probably a counterfactual dependency between your box-choice and your early disposition. Arntzenius called something like this "counterfactual decision theory" in 2002. The counterfactual decision theorist would assign high probability to the dependency hypotheses "if I were to one-box now then my past disposition was one-boxing" and "if I were to two-box now then my past disposition was two-boxing." She would assign much lower probability to the dependency hypotheses on which her current action is independent of her past disposition (these would be the cognitive glitch/spasm sorts of cases).
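
A toy expected-value calculation for the dependency hypotheses Carl describes, with a made-up 0.99 probability that your current choice matches your past disposition:

```python
# p: probability of "if I were to one-box (two-box) now, then my past disposition
# was one-boxing (two-boxing)"; 1 - p covers the glitch/spasm cases where choice
# and past disposition come apart.
p = 0.99
MILLION, THOUSAND = 1_000_000, 1_000

# Omega fills the opaque box iff the predicted (past) disposition was one-boxing.
eu_one_box = p * MILLION + (1 - p) * 0                      # opaque box only
eu_two_box = p * THOUSAND + (1 - p) * (MILLION + THOUSAND)  # both boxes

print(eu_one_box, eu_two_box)  # ≈ 990,000 vs ≈ 11,000 -- this theory one-boxes
```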

Whatever you actually do (modulo randomness) at time t, that's your one and only disposition vs X at time t.

Okay, I understand how you use the word "disposition" now. This is not the way I was using the word, but I don't think that is relevant to our disagreement. I hereby resolve to use the phrase "disposition to A" in the same way as you for the rest of our conversation.

I still don't understand how this point suggests that people with one-boxing dispositions ought not to two-box. I can only understand it in one way: as in the ar... (read more)

0FAWS
No, when you have disposition a and do A, it may be the case that you ought to have disposition b and do B; perhaps disposition a was formed by habit and disposition b would counterfactually have resulted if the disposition had formed on the basis of likely effects and your preferences. What is false is that you ought to have disposition a and do B.