Perplexed comments on The Aliens have Landed! - Less Wrong
Comments (156)
It seems you can set up a similar scenario without simulations: have the aliens physically kidnap 90% of the human race, then try to blackmail the other 10% into killing themselves. That would make for an interesting moral dilemma too.
And you can set up a scenario without dragging in torture and extinction. Aliens from Ganymede are about to ink a contract to trade us tons of Niobium in exchange for tons of Cobalt. But then the aliens reveal that they have billions of cloned humans working as an indentured proletariat in the mines of the Trojan asteroids. These humans are generally well treated, but the aliens offer to treat them even better - feed them ice cream - if we send the Cobalt without requiring payment in Niobium.
The central problem in all of these thought experiments is the crazy notion that we should give a shit about the welfare of other minds simply because they exist and experience things analogously to the way we experience things.
Is there a standard name for the logical fallacy where you attempt a reductio ad absurdum but fail to notice that you're deriving the absurdity from more than one assumption? Why conclude that it's the caring about far-away strangers that is crazy, as opposed to the decision algorithm that says you should give in to extortions like this?
I'm not sure words like "crazy" and "absurd" are even meaningful in this context. It's pretty easy to come up with internally consistent arguments generating both results, and the scenario's outlandish enough that it's not clear which one has more practical vulnerabilities; essentially we're dealing with dueling intuitions.
Good catch. Yes, I was deriving the absurdity from more than one assumption.
Maybe with the right decision algorithm you wouldn't give in to extortions like this. However, this extortion attempt cost the aliens approximately nothing, so unless correctly inferring our decision algorithm cost them less than approximately nothing, the rational step for the aliens is to try the extortion regardless. Thus having a different decision algorithm probably wouldn't prevent the extortion attempt.
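To make that expected-value argument concrete, here is a minimal sketch. The function name and all the numbers are invented for illustration; nothing in the story pins down actual probabilities or payoffs.

```python
def attempt_is_rational(p_give_in: float, gain_if_success: float,
                        cost_of_attempt: float) -> bool:
    """Return True when the expected gain from trying the extortion
    exceeds what it costs the extortionist to try."""
    return p_give_in * gain_if_success > cost_of_attempt


# Hypothetical numbers: a one-in-a-thousand chance that the victims give in,
# a large prize (a vacant solar system), and a nearly free attempt.
nearly_free = attempt_is_rational(p_give_in=0.001,
                                  gain_if_success=1_000_000.0,
                                  cost_of_attempt=0.01)

# Same odds and prize, but the attempt now carries a real cost
# (say, the risk of retaliation).
costly = attempt_is_rational(p_give_in=0.001,
                             gain_if_success=1_000_000.0,
                             cost_of_attempt=2_000.0)
```

Under these assumed numbers the nearly free attempt is worth making and the costly one is not, which is the sense in which lowering the chance of success does little unless the attempt itself has a price.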
But then changing your values to not care about simulated torture won't prevent the extortion attempt either (since the aliens will think there's a small chance you haven't actually changed your values and it costs them nothing to try). Unless you already really just don't care about simulated torture, it seems like you'd want to have a decision algorithm that makes you go to war against such extortionists (and not just ignore them).
That 'costs them nothing' part makes a potentially big difference. That the aliens must pay to make their attempt is what gives your decision leverage. The war that you suggest is another way of ensuring that there is a cost, even though you may actually lose the war and be exterminated.
(Obviously there are whole other scenarios where becoming a 'protectorate' and tithing rather than going to war constitutes mutually beneficial cooperation: for instance, when their BATNA is just to wipe you out but it is slightly better for them to let you pay them.)
Wait, is this a variant on Newcomb's problem?
(Am I just slow today? Nobody else seems to have mentioned it outright, at least.)
This sort of thing is really the motivating example behind Newcomb's problem.
I'm not seeing the analogy. Can you explain?
The extortion attempt cost the aliens almost nothing, and would have given them a vacant solar system to move into if someone like Fred was in power, so it's rational for them to make the attempt almost regardless of the odds of succeeding. Nobody is reading anybody else's mind here, except the idiots who read their own minds and uploaded them to the Internet, and they don't seem to be making any of the choices.
This case looks most like the 'transparent boxes' version of the problem, which I haven't read much about.
In Newcomb's problem, Omega offers a larger amount of utility if you will predictably do something that intuitively would give a smaller amount of utility.
In this situation, being less open to blackmail probably gives you less disutility in the long run (fewer instances of people trying to blackmail you) than acceding to the blackmail, even though acceding intuitively gives you less disutility.
The other interesting part of this particular scenario is how to define 'blackmail' and differentiate it from, say, someone accidentally doing something that's harmful to you and asking you to help fix it. We've approached that issue, too, but I'm not sure if it's been given a thorough treatment yet.
They had other choices though. It would have been similarly inexpensive to offer to simulate happy people.
Even limiting the spheres to a single proof-of-concept would have been a start.
I really don't care about simulated torture, certainly not enough to prefer war over self-modification if simulated torture becomes an issue. War is very expensive and caring about simulated torture appears to be cost without benefit.
The story is consistent with this. Fred has problems because he cares about simulated torture, and Thud doesn't care and doesn't have problems.
Hmm, perhaps we agree that the story has only one source of absurdity now? No big deal either way.
(UDT is still worth my time to understand. I owe you that, and I didn't get to it yet.)
Err, the point of having a decision theory that makes you go to war against extortionists is not to have war, but to have no extortionists. Of course you only want to do that against potential extortionists who can be "dissuaded". Suffice it to say that the problem is not entirely solved, but the point is that it's too early to say "let's not care about simulated torture because otherwise we'll have to give in to extortion" given that we seem to have decision theory approaches that still show promise of solving such problems without having to change our values.
If Fred cared about the aliens exterminating China, and Thud didn't care; then if the aliens instead threatened to exterminate China, Fred would again have problems and Thud again wouldn't have.
A rock doesn't care about anything, and therefore it has no problems at all.
This topic isn't really about simulation, it's about the fact that caring about anything permits you to possibly sacrifice something else for it. Anything that isn't our highest value may end up traded away, sure.
You can travel from here to China and back. Therefore, caring about China has at least a potential instrumental consequence on the rest of my life. You can't travel from here to the aliens' simulation and back, so caring about what happens there imposes costs on the rest of my life but no benefits. The analogy is not valid.
Now, if the black spheres had decent I/O capabilities and you could outsource human intellectual labor tasks to the simulations, I suppose it would make sense to care about what happens there. People can't do useful work while they're being tortured, so that wasn't the scenario in the story.
That's the only sane reason you believe can exist for caring about distant people at all? That you can potentially travel to them?
So say you're a paraplegic who doesn't want to travel anywhere, can't travel anywhere, and knows you'll die in two weeks anyway. You get a choice to push a button or not push it. If you push it you get 1 dollar right now, but 1 billion Chinese people will die horrible deaths in two weeks, after your own death.
Are you saying that the ONLY "sane" choice is to push the button, because you can use the dollar to buy bubblegum or something, while there'll never be a consequence on you for having a billion Chinese die horrible deaths after your own death?
If so, your definition of sanity isn't the definition most people have. You're talking about the concept commonly called "selfishness", not "sanity".
Fine. Explain to me why Fred shouldn't exterminate his species, or tell me that he should.
The extortion aspect isn't essential. Fred could have been manipulated by true claims about making simulated people super happy.
ETA: At one point this comment had downvotes but no reply, but when I complained that that wasn't a rational discussion, someone actually replied. LessWrong is doing what it's supposed to do. Thanks people for making it and participating in it.
Um, I never tried to define sanity. What are you responding to?
Generally the benefit of caring about any bad thing is that if you care about it there will be less of it because you will work to stop it.
Well, Fred cared, and his reaction was to propose exterminating humanity. I assume you think his is a wrong decision. Can you say why?
If you care about simulated torture (or simulated pleasure), and you're willing to shut up and multiply, then anybody with a big enough computer can get you to do anything even when that computer has no inputs or outputs and makes absolutely no difference to the real world. I think it's better to adjust oneself so one does not care. It's not like it's a well-tested human value that my ancestors on the savannah acted upon repeatedly.
Do your calculations and preferred choices change if instead of "simulations", we're talking about trillions of flesh-and-blood copies of human beings who are endlessly tortured to death and then revived to be tortured again? Even if they're locked in rooms without entrances or exits, and it makes absolutely no difference to the outside world?
If you care about them, then anybody with a big enough copier-of-humans, and enough torture chambers "can get you to do anything", as you say. So it's not really an issue that depends on caring for simulations. I wish the concept of "simulations" hadn't been needlessly added where it isn't needed.
General Thud would possibly not care if it was the whole real-life population of China that got collected by the aliens, in exchange for a single village of Thud's own nation.
The issue of how-to-deal-with-extortion is a hard one, but it's just made fuzzier by adding the concept of simulations into the mix.
I agree that it's a fuzzy mix, but not the one you have in mind. I intended to talk about the practical issues around simulations, not about extortion.
Given that the aliens' extortion attempt cost them almost nothing, there's not much hope of gaming things to prevent it. Properly constructed, the black spheres would not have an audit trail leading back to the aliens' home, so a competent extortionist could prevent any counterattack. Extortion is not an interesting part of this situation.
Right. It's an issue about caring about things that are provably irrelevant to your day-to-day activities.
That sounds like a flaw in the decision theory. What kind of broken decision theory achieves its values better by optimizing for different values?
What do you mean by "the real world"? Why does it matter if it's "real"?
The real world generally doesn't get turned off. Simulations generally do. That's why it matters.
If there were a simulation that one might reasonably expect to run forever, it might make sense to debate the issue.
To a degree, arguing about extortion is arguing about definitions. In the context of the heuristic "don't give in to extortion", we would like to know exactly what the heuristic shouldn't give in to, though, and why.
In my opinion, the main problem is that the extortionist is making a no-downside trade: the thing it is trading is "not torturing simulated humans" or "not killing hostages" or whatever, which probably wasn't worth anything to the extortionist anyway.
A lot of no-downside trades are obviously unfair, so a useful heuristic is not to agree to no-downside trades in general. In fact, extremely unfair trades in general are metaphorically labeled "extortion" (for instance, I'm sure I've heard the term applied to the price of a diamond ring).
We can see cases besides straightforward extortion where people apply the no-downside heuristic. For instance, buying music from iTunes is a no-downside trade for iTunes at first glance: iTunes doesn't lose anything and gains 99 cents. In fact, iTunes has already spent money buying the rights to the music in expectation you'll download it, so this is something of an acausal trade: much like Omega, iTunes is very good at predicting what people will want, and if enough people aren't going to download a track, iTunes won't offer it. Acausal trades are counterintuitive, though, so it makes sense that some people are repelled by this offer and torrent the music instead.
How is offering to supply ice cream characterized as "extortion"?
In any case, I was not using the scenario as a reductio against universal unreciprocated altruism. That notion fails under its own weight, due to complete absence of support.
Sorry, I misread your comment and thought it was an extortion scenario similar to the OP. Now that I've read it more carefully, it's not clear to me that we shouldn't give up the Niobium in order to provide those human workers with ice cream. (ETA: why did you characterize those humans as indentured workers? It would have worked as well if they were just ordinary salaried workers.)
Altruists certainly claim to have support for their stated preferences. Or one could argue that preferences don't need to have support. What kind of support do you have for liking ice cream, for example?
Your reading wasn't far off: "in all of these thought experiments" makes your reply remain relevant.
True enough. My main objection to calling my ice cream negotiating tactic 'extortion' is that I really don't like the "just say 'No' to extortion" heuristic. I see no way of definitionally distinguishing extortion from other, less objectionable negotiating stances. Nash's 1953 cooperative game theory model suggests that it is rational to yield to credible threats. I.e. saying 'no' to extortion doesn't win! An AI that begins with the "just say no" heuristic will self-modify to one that dispenses with that heuristic.
I don't think anybody is suggesting building an explicit "just say 'No' to extortion" heuristic into an AI. (I agree we do not have a good definition of "extortion" so when I use the word I use it in an intuitive sense.) We're trying to find a general decision theory that naturally ends up saying no to extortion (when it makes sense to).
Here's an argument that "saying 'no' to extortion doesn't win" can't be the full picture. Some people are more credibly resistant to extortion than others and as a result are less likely to be extorted. We want an AI that is credibly resistant to extortion, if such credibility is possible. Now if other players in the picture are intelligent enough, to the extent of being able to deduce our AI's decision algorithm, then isn't being "credibly resistant to extortion" the same as having a decision algorithm that actually says no to extortion?
ETA: Of course the concept of "credibility" breaks down a bit when all agents are reasoning this way. Which is why the problem is still unsolved!
It does what? How so?
That is pretty incoherent. If you are trying to come up with a general decision theory that wins and also says no to extortion, then you have overdetermined the problem (or will overdetermine it once you supply your definition). If you are predicting that a decision theory that wins will say no to extortion, then it is a rather pointless claim until you supply a definition. Perhaps what you really intend to do is to define 'extortion' as 'that which a winning decision theory says no to'. In which case, Nash has defined 'extortion' for you - as a threat which is not credible, in his technical sense.
Why do you say the problem is still unsolved? What issues do you feel were not addressed by Nash in 1953? Where is the flaw in his argument?
Part of the difficulty of discussing this here is that you have now started to use the word "credible" informally, when it also has a technical meaning in this context.
"Commit to just saying 'no' and proving that when just committing to just saying 'no' and proving that wins."
Perhaps something like that.
Well you don't want to signal that you give in to extortion. That would just increase the chances of people attempting extortion against you. Better to signal that you are on a vendetta to stamp out extortion - at your personal expense!!!
There is an idea, surprisingly prevalent on a rationality website, that costless signaling is an effective way to influence the behavior of rational agents. Or in other words, that it is rational to take signalling at face value. I personally doubt that this idea is correct. In any case, I reiterate that I suggest yielding only to credible threats. My own announcements do not change the credibility of any threats available to agents seeking to exploit me.
Perhaps what is really being expressed is the belief that social costs are real, and that mere pseudonymous posting has costs.
They influence the likelihood of them being made in the first place - by influencing the attacker's expected payoffs. Especially if it appears as though you were being sincere. Your comment didn't look much like signalling. I mean, it doesn't seem terribly likely that someone would deliberately publicly signal that they are more likely than unnamed others to capitulate if threatened with an attempt at extortion.
Credibly signalling resistance to extortion is non-trivial. Most compelling would be some kind of authenticated public track record of active resistance.
Well, a simple way would be to use the legal definition of extortion. That should at least help prevent house fires, kidnapping, broken windows and violence.
...but a better definition should not be too difficult - for instance: the set of "offers" which you would rather not be presented with.
My objection to calling the ice cream negotiation tactic 'extortion' is it just totally isn't. It's an offer of a trade.
Then it's a good thing we've made developments in our models in the last six decades!
Cute. But perhaps you should provide a link to what you think is the relevant development.
Well, the key concept underlying strong resistance to extortion is reputation management. Once you understand the long-term costs of becoming identified as a vulnerable "mark" by those in the criminal underground, giving in to extortion can start to look a lot less attractive.
To elaborate a bit:
I'll give you utility if you give me utility is a trade.
I won't cause you disutility if you give me utility is extortion.
I don't think that's exactly the right distinction. Let's say you go to your neighbour because he's being noisy.
Scenario A: He says "I didn't mean to disturb you, I just love my music loud. But give me 10 dollars, and sure, I'll turn the volume down." I'd call that a trade, though it's still about him not giving you disutility.
Scenario B: He says "Yeah, I do that on purpose, so that I can make people pay me to turn the volume down. It'll be 10 bucks. " I'd call that an extortion.
The difference isn't between the results of the offer if you accept or reject -- the outcomes and their utility for you are the same in each (loud music, silence - 10 dollars).
The difference is that in Scenario B, you wish the other person had never decided to make this offer. It's not the utility of your options that is to be compared with each other, but the utility of the timeline where the trade can be made vs the utility of the timeline where the trade can't be made...
In the Trade scenarios, if you can't make a trade with the person, he's still being noisy, and your utility is at its lowest. In the Extortion scenarios, if you can't make a trade with the person, he has no reason to be noisy, and your utility is at its highest.
I'll probably let someone else transform the above description into equations containing utility functions.
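One possible way to write that comparison down, as a rough sketch rather than a settled definition: compare the best outcome available once the offer exists against the outcome in the timeline where the offer could never have been made. The `Offer` class, the `classify` function, and the utility numbers below are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    accept: float    # your utility if you accept the offer
    reject: float    # your utility if you reject the offer
    no_offer: float  # your utility in the timeline where no such offer can be made


def classify(offer: Offer) -> str:
    """Label an offer 'extortion' when even your best response leaves you
    worse off than if the offer had never been possible, else 'trade'."""
    best_with_offer = max(offer.accept, offer.reject)
    return "extortion" if best_with_offer < offer.no_offer else "trade"


# Assumed utilities: loud music is bad, quiet is neutral, the fee is 10.
LOUD, QUIET, PRICE = -50.0, 0.0, 10.0

# Scenario A: he likes loud music, so without the offer it stays loud.
scenario_a = Offer(accept=QUIET - PRICE, reject=LOUD, no_offer=LOUD)

# Scenario B: he is loud only to get paid, so without the offer it is quiet.
scenario_b = Offer(accept=QUIET - PRICE, reject=LOUD, no_offer=QUIET)

label_a = classify(scenario_a)
label_b = classify(scenario_b)
```

Note that `accept` and `reject` are identical in both scenarios; only `no_offer` differs, which is the point of the distinction above. This sketch says nothing about how you would learn the counterfactual value in practice.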
And what is the distinction between giving utility and not giving disutility? As consequentialists, I thought we were committed to the understanding that they are the same thing.
None at all. But then I don't claim that it is a universal moral imperative that will be revealed to be 'my own imperative' once my brain is scanned, the results of the scan are extrapolated, and the results are weighted in accordance with how "muddled" my preferences are judged to be.
I see, so you're saying that universal unreciprocated altruism fails as a universal moral imperative, not necessarily as a morality that some people might have. Given that you used the word "crazy" earlier I thought you were claiming that nobody should have that morality.
I think it is easily possible to imagine naturalists describing some kinds of maladaptive behaviour as being "crazy". The implication would be that the behaviour was being caused by some kind of psychological problem interfering with their brain's normal operation.
I do claim that. In two flavors.
Someone operating under that moral maxim will tend to dispense with that maxim as they approach reflective equilibrium.
Someone operating under that 'moral' maxim is acting immorally - this operationally means that good people should (i.e. are under a moral obligation to) shun such a moral idiot and make no agreements with him (since he proclaims that he cannot be trusted to keep his commitments).
Part of the confusion between us is that you seem to want the word 'morality' to encompass all preferences - whether a preference for chocolate over vanilla, or a preference for telling the truth over lying, or a preference for altruism over selfishness. It is the primary business of metaethics to make the distinction between moral opinions (i.e. opinions about moral issues) and mere personal preferences.
No, I don't want that. In fact I do not currently have a metaethical position beyond finding all existing metaethical theories (that I'm aware of) to be inadequate. In my earlier comment I offered two possible lines of defense for altruism, because I didn't know which metaethics you prefer:
In your reply to that comment you chose to respond to only the second sentence, hence the "confusion".
Anyway, why don't you make a post detailing your metaethics, as well as your arguments against "universal unreciprocated altruism"? It's not clear to me what you're trying to accomplish by calling people who believe such things (many of whom are very smart and have already seriously reflected on these issues) "crazy" without backing up your claims.
I'm not sure why you think I have called anyone crazy. What I said above is that a particular moral notion is crazy.
Perhaps you instead meant to complain that (in the grandparent) I had referred to the persons in question as "moral idiots". I'm afraid I must plead guilty to that bit of hyperbole.
I am gradually coming to think that there is little agreement here as to what the word metaethics even means. My current understanding is that metaethics is what you do to prepare the linguistic ground so that people operating under different ethical theories and doctrines can talk to each other. Meta-ethics strives to be neutral and non-normative. There are no meta-ethical facts about the world - only definitions that permit discourse and disputation about the facts.
Given this interpretation of "meta-ethics", it would seem that what you mean to suggest is that I make a post detailing my normative ethics, which would include an argument against "universal unreciprocated altruism" (which I take to be a competing theory of normative ethics).
Luke and/or Eliezer and/or any trained philosopher here: I would appreciate feedback as to whether I finally have the correct understanding of the scope and purpose of meta-ethics.
Hidden assumptions play a role similar to the auxiliary hypotheses which undermine naive Popperianism. The fallacy of ignoring auxiliary assumptions seems like a special case of the fallacy of presenting an argument from ignorance.
No, I think the central "problem" is that having preferences that others can thwart with little effort is risky because it makes you more vulnerable to extortion.
For example, if you have a preference against non-prime heaps of pebbles existing, the aliens can try to extort you by building huge numbers of non-prime heaps on their home planet and sending you pictures of them, and therefore, the argument goes, it's crazy and stupid to care about non-prime heaps.
The argument also yields a heuristic that the farther away a thing is from you, the more stupid and crazy it is to care about it.
Right. What you are saying is related to the notion of "credible threats". If other agents can give you disutility with little disutility for themselves, then they have a credible threat against you. And unless you either change your utility function, or find a way of making it much more difficult and costly for them to harm you, the rational course is to give in to the extortion.
One way to make it costly for others to harm you is to join a large coalition which threatens massive retaliation against anyone practicing extortion against coalition members. But notice that if you join such a coalition, you must be willing to bear your share of the burden should such retaliation be necessary.
The alternative I suggested in the grandparent was to change your utility function so as to make you less vulnerable - only care about things you have control over. Unfortunately, this is advice that may be impossible to carry out. Preferences, as several commentators here have pointed out, tend to be incorrigible.
I took the obvious solution to that difficulty. I self-modified to an agent that behaves exactly as if he had self-modified to be an agent with preferences that make him less vulnerable. This is a coherent configuration for my atoms to be in terms of physics and is also one that benefits me.
Your variation is better than mine! Not sure about your solution though, it looks a little hurried.
However it is a different problem. An interesting problem in its own right but one for which many people's coherent preferences will produce a different answer for slightly different reasons.
Well, I see the central problem in the notion that we should care about something that happens to other people if we're not the ones doing it to them. Clearly, the aliens are sentient; they are morally responsible for what happens to these humans. While we certainly should pursue possible avenues to end the suffering, we shouldn't act as if we were.
Interesting. Though in the scenario I suggested there is no suffering. Only an opportunity to deploy pleasure (ice cream).
I'm curious as to your reasons why you hold the aliens morally responsible for the human clones - I can imagine several reasons, but wonder what yours are. Also, I am curious as to whether you think that the existence of someone with greater moral responsibility than our own acts to decrease or eliminate the small amount of moral responsibility that we Earthlings have in this case.
Why would I not hold them responsible? They are the ones who are trying to make us responsible by giving us an opportunity to act, but their opportunities are much more direct - after all, they created the situation that exerts the pressure on us. This line of thought is mainly meant to be argued in Fred's terms, who has a problem with feeling responsible for this suffering (or non-pleasure) - it offers him an out of the conundrum without relinquishing his compassion for humanity (i.e. I feel the ending as written is illogical, and I certainly think "Michael" is acting very unprofessionally for a psychoanalyst). ["Relinquish the compassion" is also the conclusion you seem to have drawn, thus my response here.]
Of course, the alien strategy might not be directed at our sense of responsibility, but at some sort of game theoretic utility function that proposes the greater good for the greater number - these utility functions are always sort of arbitrary (most of them on lesswrong center around money, with no indication why money should be valuable), and the arbitrariness in this case consists in including the alien simulations, but not the aliens themselves. If the aliens are "rational agents", then not rewarding their behaviour will make them stop it if it has a cost, while rewarding it will make them continue. (Haven't you ever wondered how many non-rational entities are trying to pose conundrums to rational agents on here? ;)
I don't have a theory of quantifiable responsibility, and I don't have a definite answer for you. Let's just say there is only a limited amount of stuff we can do in the time that we have, so we have to make choices about what to do with our lives. I hope that Fred comes to feel that he can accomplish more with his life than to indirectly die for a tortured simulation that serves alien interests.