It doesn't seem to affect my intuitions at all: my intuition in both cases is to ignore the "3^^^3 people" case and go for sublimity/lack-of-torture.
It also doesn't seem to affect my ability to do the math.
That said, there's also a status-quo bias in effect here: I feel a stronger impulse to "restore the default" by protecting someone from being tortured than I do to "improve on the default" by giving them sublimity. If I and my peers were ourselves living a sublime life, I presumably wouldn't feel that way.
Perhaps the question could be worded as:
A young girl is going to grow up and have an amazingly sublime life. You have the opportunity to cause her to instead lead a mediocre life, which will include a moment when she records her cat doing something similar, uploads it to youtube, and provides a bajillion people who would otherwise be having a boring afternoon with a few seconds of mildly funny youtube content.
Perhaps.
And, yes, if I phrase the question in such a way that emphasizes my ability to remove sublimity rather than my ability to grant it, then my intuitions shift around (among the reasons I don't trust my intuitions here, since I see no reason to endorse giving different answers depending on how I phrase the question).
The additional causal link between her mediocre life and the youtube content doesn't seem to affect my intuitions at all, though.
The causal link wasn't really meant to change anything; it just made the question more sensible. Why would affecting one person's sublimity create a brief youtube moment? Because she didn't have many friends, so she bought a cat!
Also, I'm trying to remember -- has anyone attempted to defend the claim that there is no such thing as 3^^^3 sufficiently-different humans?
The number of possible humans is probably plenty large enough for the purposes of the thought experiment.
The number of possible humans is probably plenty large enough for the purposes of the thought experiment.
That is the most important question regarding all those scope sensitivity thought experiments. What are the upper and lower bounds for scope sensitivity? Disregard the happiness of a galactic civilization if there is a tiny probability of making many more beings happy by trying to leave the universe? Don't marry your girlfriend because you make two other girls and your mother happy if you don't?
Something seems wrong with this line of reasoning.
Something seems wrong with this line of reasoning.
What do you think is wrong?
To me it seems these are perfectly valid scenarios if you accept the premise that you are choosing based on a formally specified, well-behaving utility function. (Which is trivially untrue of humans, but there are reasons why we might want to emulate such behavior.)
To me it seems these are perfectly valid scenarios if you accept the premise that you are choosing based on a formally specified, well-behaving utility function. (Which is trivially untrue of humans, but there are reasons why we might want to emulate such behavior.)
I don't think that would be a good idea because you'd never stop and always hunt for an even larger payoff, irrespective of astronomically low probabilities. If that is stupid, then at what point does it become stupid? Why would I choose the present over the future ever? Yudkowsky seems to argue against discount rates:
But a 5%-per-year discount rate, compounded exponentially, implies that it is worth saving a single person from torture today, at the cost of 168 people being tortured a century later, or a googol persons being tortured 4,490 years later.
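The quoted figures can be checked with a few lines of arithmetic. This is only a sketch, and it assumes that "5%-per-year discount" means each year of delay keeps 95% of the value (a yearly factor of 0.95); the function names are made up for illustration. Under that reading, one person today weighs as much as roughly 168 people a century out, and a googol people about 4,490 years out.

```python
import math

# Assumption: a "5% discount" multiplies value by 0.95 for each year of delay.
ANNUAL_FACTOR = 0.95


def people_equivalent(years: float, factor: float = ANNUAL_FACTOR) -> float:
    """Number of people `years` from now who weigh the same as one person today."""
    return factor ** -years


def years_until(log10_people: float, factor: float = ANNUAL_FACTOR) -> float:
    """Years of delay after which 10**log10_people future people equal one today."""
    return log10_people * math.log(10) / -math.log(factor)


century = people_equivalent(100)   # people a century later
googol_years = years_until(100)    # a googol is 10**100

print(f"{century:.1f} people after 100 years")
print(f"{googol_years:.0f} years for a googol people")
```

If the discount were instead read as dividing by 1.05 per year, the numbers would come out smaller, so the quoted figures seem to fit the 0.95 reading.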
Yet without discounting the inverse applies: it is worth torturing a single person today to save many later. Can that be more right?
Yet without discounting the inverse applies: it is worth torturing a single person today to save many later. Can that be more right?
I think so, yes.
At least, if you gave me a button and convinced me that pressing it would save many people from suffering a century from now, but would allow one person to suffer today, and not-pressing it would save one person from suffering today, but allow many people to suffer a century from now, I would feel like I ought to press the button.
Would you not? Why not?
There seems to be a persistent meme to the effect that initiating harm to prevent others from future harm is never okay, or at best is a dirty business (that is, immoral in its entirety, rather than containing negatively moral components which are overwhelmed by the positively moral ones) that ought only be performed if we're pretty sure no options exist that don't involve hurting people.
From most consequentialist perspectives this is trivially false, of course; it'd probably be simplest to interpret the impulse as straight-up negative utilitarianism, but I don't think that flies when you look at the socially accepted responses to other moral dilemmas. Instead, I think we're looking at a deontological adaptation to the dilemma posed by power corrupting; despite its falsity, the meme could easily end up being adaptive in an environment which otherwise rewarded self-deception in order to knock off your political and social rivals.
In Tolkien: Author of the Century, Shippey says that Acton seems to have been the first to say that power corrupts. We're very used to the idea, but it isn't a human universal.
True enough; but I suspect a lot of that comes from the patterns of behavior we'd now label "corruption" being taken for granted by a lot of pre-Enlightenment thinkers.
I haven't read enough pre-Enlightenment political theory to claim a universal, but the familial model of power (with the leader as a father and subordinates as children; sometimes augmented with an executive officer figure as mother) seems to have been fairly ubiquitous. Seems to me that a lot of actions we'd view as blatant abuse of power might under this model be accepted as the rights of leadership, so long as noblesse oblige is upheld.
Sure, agreed; that's one reason I couched the example in terms of preventing harm and allowing harm, rather than causing harm. The cost-benefit equation stays the same, but it doesn't trip quite so many deontological alarms.
Hmm, this is the first time I have read up on what 'deontological' actually means. I actually think that there is something wrong with the whole business of probability/utility calculations. I have known about this kind of problem for a long time, but only a few days ago did I come across the official terminology here. I didn't know about Pascal's Mugging or discount rates etc., and I was deeply surprised that the problem is unsolved and widely ignored. As I wrote in another comment, I don't see how those methods are still referred to as 'laws' here when you have to cherry-pick which results to ignore.
I'm afraid I'm not following you here... that is, I'm not sure what 'laws' you are talking about, what it is I have to cherry-pick, what one has to do with the other, what either of them have to do with Pascal's Mugging or discount rates, or in what sense either of those are unsolved problems.
Sorry.
...I'm not sure what 'laws' you are talking about...
In the latest interview featuring Eliezer Yudkowsky he said that "there are parts of rationality that we do understand very well in principle.", namely "Bayes’ Theorem, the expected utility formula, and Solomonoff induction". He also often refers to 'laws' when talking about the basic principles that are being taught on LessWrong. For example in 'When (Not) To Use Probabilities' he writes "The laws of probability are laws, not suggestions,".
...what it is I have to cherry-pick, what one has to do with the other, what either of them have to do with Pascal's Mugging or discount rates, or in what sense either of those are unsolved problems.
In the post 'Pascal's Mugging: Tiny Probabilities of Vast Utilities' Eliezer Yudkowsky writes, "I don't feel I have a satisfactory resolution as yet". This might sound like yet another circumstantial problem to be solved, with no relevance for rationality in general or the bulk of friendly AI research. But it is actually one of the most important problems because it does undermine the basis of rational choice.
Why do you do what you do? Most people just do what they feel is the right thing to do; they base their decisions on gut feeling. But to make decision making maximally efficient, people started questioning whether our evolutionary prior is still adequate for modern-day decision making. And indeed, advances in probability theory allowed us to make a lot of progress and are still the best choice when dealing with an endless number of problems. This wasn't enough, however, when dealing with terminal goals; we had to add some kind of monetary value to be able to choose between goals of different probability. After all, probability itself is not a sufficient measure to discern desirable goals from goals that are not worthwhile. Doing so, we were now able to formalize how probable a certain outcome was and how much we desired that outcome. Together it seemed that we could now discern how worthwhile it would be to pursue a certain outcome regardless of our instincts. Yet a minor problem arose, sometimes bothersome, sometimes welcome. Our probabilities and utility calculations were often based solely on gut feeling, because it was often the only available evidence, which we called our prior probability. That was unsatisfactory, as we were still relying on our evolutionary prior, something we had tried to overcome after all. So we came up with the Solomonoff Prior to finally obliterate instinct, and it was very good. Only after a while did we notice that our new heuristics often told us to seek outcomes that seemed not only undesirable but plainly wrong. We were able to outweigh any probability by expecting additional utility and disregard any undesirable action by switching from experience-utility to decision-utility as we pleased.
Those who didn't switch were susceptible to taking any chance, either because other agents told them that a certain outcome had an amount of utility that could outweigh their low credulity or because the utility of a certain decision was able to outweigh its low probability. People wouldn't decide not to marry their girlfriend because that decision would make two other girls and their mother equally happy and therefore outweigh their own happiness and that of their girlfriend; they just assigned even more utility to the decision to marry their girlfriend. Others would seek extreme risks and give all their money to a charity that was trying to take over the Matrix, the almost infinite utility associated with a success outweighing its astronomically low probability. This was unbearable, so people decided that something must be wrong with their heuristics and that they would rather doubt their grasp of "rationality" than act according to it. But it couldn't be completely wrong; after all, their heuristics had been very successful on a number of problems. So people decided to just ignore certain extremes and only use their heuristics when they felt they would deliver reasonable results. Consequently, in the end we were still where we started, using our gut feelings to decide what to do. But how do we program that into an AI? Several solutions have been proposed, using discount rates to disregard extremes or measuring the running time or space requirements of computations, but all had their flaws. It all seemed to have something to do with empirical evidence, but we were already so committed to the laws of probability as the ultimate guidance that we missed the possibility that those 'laws' might actually be helpful tools, not prescriptions of optimal and obligatory decisions.
OK... I think I understand.
And, you're right, I'm committed to the idea that it's best to take the course of action with the highest expected utility.
That said, I do agree with you that if my calculations of expected utility lead to wildly counterintuitive results, sometimes that means my calculations are wrong.
Then again, sometimes it means my intuitions are wrong.
So I have to decide how much I trust my intuitions, and how much I trust my calculations.
This situation isn't unique to probability or calculations of expected utility. It applies to, say, ballistics just as readily.
That is, I have certain evolved habits of ballistic calculation that allow me to do things like hit dartboards with darts and catch baseballs with my hand and estimate how far my car is from other cars. By diligent practice I can improve those skills.
But I'm never going to be able to judge the distance of the moon by eyeballing it -- that's too far outside the range of what my instincts evolved to handle.
Fortunately, we have formalized some laws that help me compute distances and trajectories and firing solutions.
Unfortunately, because air resistance is important and highly variable, it turns out that we can't fully compute those things within an atmosphere. Our missiles don't always hit their targets.
Of course, one important difference is that with ballistics it turns out to be relatively easy to develop a formal system that is strictly better than our brains at hitting targets. Which is just another way of saying that our evolved capability for ballistics turns out to be not particularly good, compared to the quality of formal system that we know how to develop.
Whereas with other kinds of judgments, like figuring out what to do next, our evolved capability is much much better than the formal systems that we know how to develop.
Of course, two hundred years ago that was also true of ballistics. It turns out that we're pretty good at improving the quality of our formal systems.
I am still leaning towards the general resentment that I voiced in my first submission here, although I have learnt a lot since then. I perceive the tendency to base decisions solely on logical implications to be questionable. I do not doubt that everything we know hints at the possibility that artificial intelligence could undergo explosive self-improvement and reach superhuman capabilities, but I do doubt that we know enough to justify the kind of commitment given within this community. I am not able to formalize my doubts right now or explain what exactly is wrong, but I feel the need to voice my skepticism nonetheless. And I feel that my recent discovery of some of the admitted problems reinforces my skepticism.
Take for example the risks from asteroids. There is a lot of empirical evidence that asteroids have caused damage before and that they pose an existential risk in future. One can use probability theory, the expected utility formula and other heuristics to determine how reasonable it would be to support the mitigation of that risk. Then there are risks like those posed by particle accelerators. There is some evidence that high-energy physics might pose an existential risk but more evidence that it does not. The risks and benefits are still not solely based on sheer speculation. Yet I don't think that we could just factor in the expected utility of the whole Earth and our possible future as a galactic civilization to conclude that we shouldn't do high-energy physics. In the case of risks from AI we have a scenario that is solely based on extrapolation, on sheer speculation. The only reason to currently care strongly about risks from AI is the expected utility of success, or respectively the disutility of a negative outcome. And that is where I become very skeptical: there can be no empirical criticism. Such scenarios, then, are those that I compare to 'Pascal's Mugging', because they are susceptible to the same problem of expected utility outweighing any amount of reasonable doubt. I feel that such scenarios can lead us astray by removing themselves from other kinds of evidence and empirical criticism, and therefore are solely justifying themselves by making up huge numbers when the available data and understanding don't allow any such conclusions. Take for example MWI, itself a logical implication of the available data and current understanding. But is it enough to commit quantum suicide? I don't feel that such a conclusion is reasonable. I believe that logical implications and extrapolations are not enough to outweigh any risk. If there is no empirical evidence, if there can be no empirical criticism then all bets are off.
As you mentioned, there are many uses, like ballistic calculation, where mere extrapolation works and is the best we can do. But since there are problems like 'Pascal's Mugging' that we perceive to be undesirable and that lead to an infinite hunt for ever larger expected utility, I think it is reasonable to ask for some upper and lower bounds regarding the use and scope of certain heuristics. We agree that we are not going to stop pursuing whatever terminal goal we have chosen just because someone promises us even more utility if we do what that person wants. We also agree that we are not going to stop loving our girlfriend just because there are many people who do not approve of our relationship and who would be happy if we broke up. Therefore we have already informally established some upper and lower bounds. But when do we start to take our heuristics seriously and do whatever they show to be the optimal decision? That is an important question, and I think that one of the answers, as mentioned above, is that we shouldn't trust our heuristics without enough empirical evidence as their fuel.
If there is no empirical evidence, if there can be no empirical criticism then all bets are off.
My usual example of the IT path being important is Microsoft. IT improvements have been responsible for much of the recent progress of humans. For many years, Microsoft played the role of the evil emperor of IT, with nasty business practices and shoddy, insecure software. They screwed humanity, and it was a nightmare - a serious setback for the whole planet. Machine intelligence could be like that - but worse.
(nods) I endorse drawing inferences from evidence, and being skeptical about the application of heuristics that were developed against one reference class to a different reference class.
In the post 'Pascal's Mugging: Tiny Probabilities of Vast Utilities' Eliezer Yudkowsky writes, "I don't feel I have a satisfactory resolution as yet". This might sound like yet another circumstantial problem to be solved, with no relevance for rationality in general or the bulk of friendly AI research. But it is actually one of the most important problems because it does undermine the basis of rational choice.
The evolved/experienced heuristic is probably to ignore those. In most cases where such things crop up, it is an attempted mugging - like it was with Pascal and God. So, most people just learn to tune such things out.
There seems to be a persistent meme to the effect that initiating harm to prevent others from future harm is never okay, or at best is a dirty business (that is, immoral in its entirety, rather than containing negatively moral components which are overwhelmed by the positively moral ones) that ought only be performed if we're pretty sure no options exist that don't involve hurting people.
The problem is that I am Omega and that I predict that iff you punch Yudkowsky in the face then as a result 3^^^^3 beings will be maximally happy for 1 day.
Would you not? Why not?
Yeah, why not? Once when I asked if the SIAI would consider the possibility of paying AGI researchers not to do AGI research, or killing an AGI researcher who is just days away from launching an uFAI, Yudkowsky said something along the lines that it is OK to just downvote me to -10 rather than -10000. Talk about taking ideas seriously?
Never mind the above, I can't tell you why it would be wrong but I have a feeling that it is. It would lead to all kinds of bad behavior based on probabilities and expected utility calculations. I don't feel like taking that route right now...
...if you gave me a button and convinced me that pressing it would save many people from suffering a century from now...
Can I conclude that you would give in to a Pascal's Mugging scenario? If not, where do you draw the line and why? If an important part of your calculation, the part that sets the upper and lower bounds, is necessarily based on 'instinct' then why don't you disregard those calculations completely and do what you feel is right and don't harm anyone?
To answer your questions: No, I don't think you can fairly conclude that I'm subject to Pascal's Mugging, and I draw the line based on what calculations I can do and what calculations I can't do.
That is, my inability to come up with reliable estimates of the probability that Pascal's Mugger really can (and will) kill 3^^^3 people is not a good reason for me to disregard my ability to come up with reliable estimates of the probability that dropping poison in a well will kill people; I can reasonably refuse to do the latter (regardless of what I feel) on the grounds that I don't choose to kill people, regardless of what I say or don't say about Pascal's Mugging.
And if there's a connection between any of this and the initial question I asked, I don't see it.
What do you think is wrong?
If I knew I would be smarter than Yudkowsky, as he writes:
It doesn't feel to me like 3^^^^3 lives are really at stake, even at very tiny probability. I'd sooner question my grasp of "rationality" than give five dollars to a Pascal's Mugger because I thought it was "rational".
Something seems to be fundamentally wrong with using Bayes’ Theorem, the expected utility formula, and Solomonoff induction to determine how to choose given unbounded utility scenarios. If you just admit that it is wrong but less wrong, then I think it is valid to scrutinize your upper and lower bounds. Yudkowsky clearly sets some upper bound, but what is it and how does he determine it if not by 'gut feeling'? And if it all comes down to 'instinct' on when to disregard any expected utility, then how can one still refer to those heuristics as 'laws'?
Or even 3^^^3 sufficiently-different sentient beings. The first claim actually seems obviously true, and the second does not seem obviously false.
Agreed. Most of the aliens we're talking about have brains larger than our galaxy, and I don't know how I feel about the moral significance of their experience.
I'm not sure the Youtube example is a good thing, and if not a good thing it is certainly a bad thing. It could very well be worse than a dust speck in the eye.
Actually, thinking about this possibility, it seems by far the most likely that it depends strongly on the specifics of the person watching it: some like this kind of thing and others don't.
Yeah, this looks the same, and the answer is still that 3^^^3 is so large the question doesn't even seem interesting.
Or you can choose "Youtube", and 3^^^3 people who weren't doing much with some one-second period of their lives instead get to spend that second watching a brief, grainy, yet droll recording of a cat jumping into a box, which they find mildly entertaining.
This feels like negative utility compared to the normal course of existence. I think that many people (extrapolating from myself, with a qualitative discount) would rather be doing other things with that one-second interval of their lives than watching such a YouTube video, so that course of action actually has a cost if whatever else they would have been doing right then would have been more fun.
I also notice that if I'm feeling somewhat euphoric at the time, such a video might cause my mood to drop significantly, thus creating a rather large negative deltafun compared to, say, just thinking.
Whether the Youtube video is going to be plus or minus fun is going to depend on the context, so, uh, the answer is going to depend on psychology and lots of general information about how different people live. However, I would pick Sublimity because of the potential for non-negligible deltafuns occurring --- for the same reason, I would pick ~Sublimity (as in, take no action) over Youtube.
I have the same question as I have about the torture vs. dust speck question:
Is there an obvious reason why the utility to me of N people watching youtube has to be a function of N that increases without bound? (I grant, for the sake of argument, that it is an increasing function.)
Is there a slam dunk argument for why I can't consider 3^^^3 people watching youtube to be only a tiny sliver-of-utility better than 3^^3 people watching youtube? And why can't the sum of such slivers converge to a finite value as the number of "^"s goes to infinity?
Is such a utility function contrary to some obvious utility-function desideratum? Or is it just contrary to what many people here arrive at using intuition+reflection?
ETA: [Forgot the punchline.] And if the utility of arbitrarily many youtube video-watchers is bounded, is there an obvious reason why it can't be less than one sublime life?
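To make the question concrete, here is one toy utility function of the kind being asked about. It is purely an illustration, not anything proposed in the thread: the cap, the scale, and the `youtube_utility` name are all assumptions. It increases with every extra viewer, yet never reaches a bound that sits below one sublime life. Exact fractions are used because floating point would round the large cases to the cap.

```python
from fractions import Fraction

# All three constants are hypothetical, chosen only for illustration.
SUBLIME_LIFE = Fraction(1)   # unit: utility of one sublime life
CAP = Fraction(1, 2)         # assumed upper bound on total Youtube utility
SCALE = 1000                 # assumed audience size that yields half the cap


def youtube_utility(viewers: int) -> Fraction:
    """Strictly increasing in `viewers`, but always below CAP."""
    return CAP * Fraction(viewers, viewers + SCALE)


# 3^^^3 itself is far too large to write down; 3, 3^3, 3^27 (= 3^^3)
# and a googol stand in for "more and more viewers".
audiences = [3, 3**3, 3**27, 10**100]
utilities = [youtube_utility(n) for n in audiences]

for n, u in zip(audiences, utilities):
    print(f"{float(u):.6f} of a sublime life for an audience with {len(str(n))} digits")
```

Whether such a function violates some standard desideratum (for instance, additivity across people) is exactly the open question above; the sketch only shows that "increasing" and "bounded below one sublime life" are compatible.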
Because I am on the utilitarian ethics bandwagon?
I think that (3^^^3 * (change in happiness when watching kitten for 1 s)) > (1 * (average change in happiness in awesome life over 3.15*10^9 s))
(assuming the subject lives 100 years = 3.15*10^9 seconds). In fact, I think that the change in happiness from seeing a kitten, per second, is probably comparable to the average change in happiness of a sublime life, per second, so I could take the video even for around 3.15*10^9 viewers. (But I would be tempted to do more research first.) With 3^^^3 viewers, it's not much of a decision.
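The break-even arithmetic here can be sketched as follows. The per-second happiness rates are placeholders, not measured quantities, and `break_even_viewers` is a made-up helper; the only hard number is the length of a 100-year life in seconds, taking a year as 365.25 days.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600      # Julian year
LIFE_SECONDS = 100 * SECONDS_PER_YEAR      # the comment's assumed 100-year life


def break_even_viewers(kitten_rate: float, sublime_rate: float,
                       life_seconds: float = LIFE_SECONDS) -> float:
    """Viewers needed for one second of video each to match one sublime life.

    kitten_rate and sublime_rate are hypothetical happiness gains per second.
    """
    return sublime_rate * life_seconds / kitten_rate


# If both rates are equal, as the comment guesses, the break-even audience
# is just the number of seconds in the life.
equal_rates = break_even_viewers(kitten_rate=1.0, sublime_rate=1.0)
print(f"{LIFE_SECONDS:.3e} seconds in 100 years")
print(f"{equal_rates:.3e} viewers to break even at equal rates")
```

If the sublime life were, say, a thousand times more intense per second than the video, the break-even audience would grow by the same factor, which is still nothing next to 3^^^3.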
Can you clarify how you think this maps to the original torture vs dust specks? I personally see two ways to be consistent:
A) Qualitative Interpretation - both torture and sublimity cause strong reactions (both in the recipient and in the reader), while both dust specks and YouTube cause almost no reaction, so it is consistent to pick dust specks & sublimity to maximize good.
B) Quantitative Interpretation - both dust specks and YouTube destroy/create the most utilons, incomparably more so in the aggregate than the torture/sublimity option. So it is consistent to pick torture & youtube to maximize good.
Is the exercise to see if anyone would pick torture & sublimity (or dust specks & youtube) to find inconsistencies in thinking?
Between your positive vs. positive and original negative vs. negative, there is also a positive vs. negative variant by cousin_it: Torturing people for fun.
What exactly are we trying to learn from this thought experiment that we cannot already learn from the torture/dust-speck experiment?
"I can't remember anyone suggesting the reversal, one where the arguments taken by the hypothetical are positive and not negative. I'm curious about how it affects people's intuitions."
The YouTube is pure happiness. The sublimity is some happiness and some value. Therefore I choose the sublimity, but if it was "Wireheading vs. Youtube", or "Sublimity vs. seeing a motivational quote", I would choose the YouTube or the motivational quote, because I intrinsically value fairness.
I don't think it's actually possible for sustained sublimity to be as good as sustained torture is bad. Even if that were somehow compensated for, I'd still choose Youtube, just like I would choose the torture/avoiding the dust specks (in both cases because otherwise I would either be inconsistent or feel I was making silly excuses).
Eliezer's solution to Torture vs. Dust relied on the assumptions that dust is at least somewhat bad and torture is clearly bad, and the reasoning that 3^^^3 is so big that if we choose to consider some aggregation of the dust speck negative utilities, 3^^^3 will outweigh any reasonable discounting of aggregation.
Now, I read in the OP that the Youtube existence is assumed to be zero value. I could go on actually imagining it and trying to assign some value to it on a gut level. However, people differ in their notions of "zero value", "a tiny bit good" and "a tiny bit bad" scenarios, unlike in the case of torture, which is unanimously believed very bad, and probably also unlike in the case of Sublimity.
I think a great part of the debate on Torture vs. Dust stemmed from uncertain value calculus of utilities very close to zero. The way Sublimity vs. Youtube is presented, I think it is going to head in the same direction. However, if I blank out the specifics, mentioning only that the first type of existence is value-zero and the second is definitely good, the problem gets reduced to 3^^^3*0 < any positive number.
Nevertheless it may not be a bad thing to debate and communicate our notions of close-to-zero value scenarios.
I actually find this reasonably instructive as to my own values. I will pick dust specks every time, but am inclined towards Youtube here. I think it's because my utility function for pain isn't an average, but maybe something like a maximin, but I'm actually more content with averaging out happiness.
Youtube is the large-scale, small-effect answer, while Sublimity is the small-scale, large-effect answer. Similarly, specks is the large-scale, small-effect answer, while torture is the small-scale, large-effect answer. Thus, I don't find the meaning of your comment to be obvious -- to estimate it, I would be forced to reconstruct a large-enough chain of reasoning that I would likely misunderstand something. Could you explain?
Edit: Request retracted -- I understand it now.
Because one is good but the other is bad! So the 3^^^3 of a small bad thing is much worse than the one-person big bad thing, and 3^^^3 of a small good thing is much better than the one-person big good thing.
I think Bongo read "pick torture" as choose torture over the dust specks, i.e. prioritize avoiding the dust specks over avoiding the torture. Which is actually the most straightforward reading of just those two words if you ignore the rest of thakil's post.
The torture vs. dust specks quandary is a canonical one to LW. Off the top of my head, I can't remember anyone suggesting the reversal, one where the arguments taken by the hypothetical are positive and not negative. I'm curious about how it affects people's intuitions. I call it - as the title indicates - "Sublimity vs. Youtube" [1].
Suppose the impending existence of some person who is going to live to be fifty years old whatever you do [2]. She is liable to live a life that zeroes out on a utility scale: mediocre ups and less than shattering downs, overall an unremarkable span. But if you choose "sublimity", she's instead going to live a life that is truly sublime. She will have a warm and happy childhood enriched by loving relationships, full of learning and wonder and growth; she will mature into a merrily successful adult, pursuing meaningful projects and having varied, challenging fun. (For the sake of argument, suppose that the ripple effects of her sublime life as it affects others still lead to the math tallying up as +(1 sublime life), instead of +(1 sublime life)+(various lovely consequences).)
Or you can choose "Youtube", and 3^^^3 people who weren't doing much with some one-second period of their lives instead get to spend that second watching a brief, grainy, yet droll recording of a cat jumping into a box, which they find mildly entertaining.
Sublimity or Youtube?
[1] The choice in my variant scenario of "watching a Youtube video" rather than some small-but-romanticized pleasure ("having a butterfly land on your finger, then fly away", for instance) is deliberate. Dust specks are really tiny, and there's not much automatic tendency to emotionally inflate them. Hopefully Youtube videos are the reverse of that.
[2] I'm choosing to make it an alteration of a person who will exist either way to avoid questions about the utility of creating people, and for greater isomorphism with the "torture" option in the original.