Unpacking the Concept of "Blackmail"

Vladimir_Nesov

Keep in mind: Controlling Constant Programs, Notion of Preference in Ambient Control.

There is a reasonable game-theoretic heuristic, "don't respond to blackmail" or "don't negotiate with terrorists". But what is actually meant by the word "blackmail" here? Does it have a place as a fundamental decision-theoretic concept, or is it merely an affective category, a class of situations activating a certain psychological adaptation that expresses disapproval of certain decisions and on net protects (benefits) you, like those adaptations that respond to "being rude" or "offense"?

We, as humans, have a concept of a "default", a "do nothing" strategy. Other plans can be compared against the moral value of the default: doing harm is something worse than the default, doing good something better than the default.

Blackmail is then a situation where by decision of another agent ("blackmailer"), you are presented with two options, both of which are harmful to you (worse than the default), and one of which is better for the blackmailer. The alternative (if the blackmailer decides not to blackmail) is the default.

Compare this with the same scenario, but with the "default" action of the other agent being worse for you than the given options. This would be called normal bargaining, as in trade, where both parties benefit from exchange of goods, but to a different extent depending on which cost is set.
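The two definitions above can be put side by side in a small sketch. The function name `classify`, the "unclear" fallback and the numeric payoffs are illustrative assumptions, not part of the argument; the only thing encoded is the comparison of your payoff in each offered option against your payoff under the other agent's default.

```python
from typing import Sequence


def classify(default_payoff: float, option_payoffs: Sequence[float]) -> str:
    """Label an offer by comparing your payoff in every offered option
    with your payoff under the other agent's default action."""
    if all(p < default_payoff for p in option_payoffs):
        return "blackmail"  # every option is worse for you than the default
    if all(p > default_payoff for p in option_payoffs):
        return "trade"      # every option is better for you than the default
    return "unclear"        # mixed case, not covered by either definition


# Hypothetical payoffs to you for the two options on the table.
offers = [-10.0, -20.0]

print(classify(0.0, offers))    # default is better than both options
print(classify(-30.0, offers))  # default is worse than both options
```

Note that the same pair of options gets a different label depending only on the value assigned to the default, which is never actually realized once the offer has been made.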

Why is the "default" special here? If bargaining or blackmail did happen, we know that "default" is impossible. How can we tell two situations apart then, from their payoffs (or models of uncertainty about the outcomes) alone? It's necessary to tell these situations apart to manage not responding to threats, but at the same time cooperating in trade (instead of making things as bad as you can for the trade partner, no matter what it costs you). Otherwise, abstaining from doing harm looks exactly like doing good. A charitable gift of not blowing up your car and so on.

My hypothesis is that "blackmail" is what the suggestion of your mind to not cooperate feels like from the inside, the answer to a difficult problem computed by cognitive algorithms you don't understand, and not a simple property of the decision problem itself. By saying "don't respond to blackmail", you are pushing most of the hard work into intuitive categorization of decision problems into "blackmail" and "trade", with only correct interpretation of the results of that categorization left as an explicit exercise.

(A possible direction for formalizing these concepts involves introducing some kind of notion of resources, maybe amount of control, and instrumental vs. terminal spending, so that the "default" corresponds to less instrumental spending of controlled resources, but I don't see it clearly.)

(Let's keep on topic and not refer to powerful AIs or FAI in this thread, only discuss the concept of blackmail in itself, in decision-theoretic context.)

Hmm. Bear with me.

Consider this decision tree

The decision that maximises your gain, as Red, is for Blue to pick 5, -10. The decision that minimises your loss, as Blue, is to pick 5, -10. And so it is that extortionism is rediscovered by all agents like Red. But Blues sometimes pick -1, -20, more often than ‘some people are purely irrational’ would predict.

In a society that deeply understood game theory, they could recognise that this extortion scenario would be iterated. In an iterated series of extortion games, if a lunatic Blue completely ignored their own disutility to punish Red for extorting them, they could reduce Red’s utility below 0. At this point, Red would prefer not to have extorted the lunatic. If a Blue can convincingly signal their lunacy to potential extorter Reds, Reds will predictably leave said lunatic Blues alone. Therefore, the true decision tree, for both agents, looks like this.
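The deterrence argument can be checked with a little arithmetic. This is a minimal sketch that reads the pairs quoted earlier as (Red, Blue) payoffs, so Red gets 5 when Blue capitulates and -1 when Blue refuses. The payoff of 0 to Red for not extorting at all, and the names below, are assumptions made for illustration.

```python
from fractions import Fraction

RED_IF_CAPITULATE = 5    # Red's payoff from the "5, -10" outcome
RED_IF_REFUSE = -1       # Red's payoff from the "-1, -20" outcome
RED_IF_NO_EXTORTION = 0  # assumption: leaving Blue alone is worth 0 to Red


def red_expected_payoff(p_refuse: Fraction) -> Fraction:
    """Red's expected payoff per extortion attempt when a fraction
    p_refuse of Blues refuse regardless of their own loss."""
    return (1 - p_refuse) * RED_IF_CAPITULATE + p_refuse * RED_IF_REFUSE


def break_even_refusal_rate() -> Fraction:
    """Refusal rate at which extorting is worth exactly as much to Red
    as not extorting: solve (1 - p) * C + p * R = N for p."""
    return Fraction(RED_IF_CAPITULATE - RED_IF_NO_EXTORTION,
                    RED_IF_CAPITULATE - RED_IF_REFUSE)


p_star = break_even_refusal_rate()
print(p_star, red_expected_payoff(p_star))
print(red_expected_payoff(Fraction(1)))  # Red facing a certain refuser
```

Under these numbers, extortion stops paying for Red only when refusal is very likely, which is why the signal of lunacy has to be convincing.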

Note that this is restricted to the long view of an iterated extortion game; understanding that precommitting to lunacy would have stopped Red will not magically undo the current situation, so if Blue knew that this case was unlikely to be part of an iterated series, they would simply capitulate. (Incidentally, this is why I do not extort people. Anyone worth extorting would be smart enough to figure out this line of reasoning, capitulate to me, and then expend some effort ensuring I was unable to iterate the series. Such as having me arrested or murdered.)

Whence the moral condemnation of extortion, then? Our morals and intuitions don’t exactly suggest all these options to us. Well, to a naive outsider looking in, the outcome of a game-theory society is that they usually refuse to cooperate with extortionists, they occasionally capitulate, and extortionists are hated. The naive outsider doesn’t see that extortionists are hated for trying to instantiate iterated negative-sum games, or the reasons why refusal is common and capitulation rare. All they see is that the game-theory society isn’t troubled much by extortion. So they imitate the game-theory society’s actions, not its reasons, get diminished but still impressive benefits, and expend far less energy working out these problems and far more energy copulating (a not entirely undesirable path).

There may not have been a game-theory society. In that case, producing animals that irrationally have rough concepts of extortion and act in roughly the right way was an earlier solution than producing game-theory animals.

Extortion, then, is the label for events that roughly match the characteristics of negative-outcome trade from the perspective of the victim - that is, the victim would prefer the situation not to have taken place to either capitulating or refusing. The event only has to pattern-match this situation for people to think extortion.

If you would prefer the situation to not have happened, and you have reason to believe that there is a long enough iteration, and you can cause disutility for the agent that had a choice in causing the situation to happen, then suffering disutility to change the causal agent’s decision is the rational choice. Our intuition only shows us the individual slice, though, and some individual slices look like patently bizarre behaviour.
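A rough sketch of that condition from Blue's side, using the Blue payoffs quoted earlier (-10 for capitulating, -20 for refusing). The assumptions here are mine and deliberately crude: Red stops extorting after a fixed number of refusals, being left alone is worth 0 to Blue, and a capitulating Blue is extorted in every round.

```python
BLUE_IF_CAPITULATE = -10  # Blue's payoff from the "5, -10" outcome
BLUE_IF_REFUSE = -20      # Blue's payoff from the "-1, -20" outcome
BLUE_IF_LEFT_ALONE = 0    # assumption: no extortion is worth 0 to Blue


def blue_total(rounds: int, refuse: bool, refusals_to_deter: int = 1) -> int:
    """Blue's total payoff over `rounds` potential extortion attempts.

    A capitulating Blue is assumed to be extorted every round. A refusing
    Blue is assumed to be left alone once Red has met `refusals_to_deter`
    refusals."""
    if not refuse:
        return rounds * BLUE_IF_CAPITULATE
    punished = min(rounds, refusals_to_deter)
    return punished * BLUE_IF_REFUSE + (rounds - punished) * BLUE_IF_LEFT_ALONE


for n in (1, 2, 3, 10):
    print(n, blue_total(n, refuse=False), blue_total(n, refuse=True))
```

With these numbers capitulating is better in a single round, the two policies tie at two rounds, and refusing wins from three rounds on, which matches the point that the individual slice looks bizarre while the long series does not.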

Of course, we would immensely prefer that this entire scenario of iterated extortion games resulting in disutility for both sides is simply a counterfactual in the mind of the potential extortionist.
