Now I'm really confused. Are you trying to define words, or trying to understand (and manipulate) behaviors? I'm hearing you say something like "I don't know what blackmail is, but I want to make sure an AI doesn't do it". This must be a misunderstanding on my part.
I guess you might be trying to understand WHY some people don't like blackmail, so you can decide whether you want to guard against it, but even that seems pretty backward.
You make it sound like those two things are mutually exclusive. They aren't. We are trying to define words so that we can understand and manipulate behavior.
"I don't know what blackmail is, but I want to make sure an AI doesn't do it." Yes, exactly, as long as you interpret it in the way I explained it above.* What's wrong with that? Isn't that exactly what the AI safety project is, in general? "I don't know what bad behaviors are, but I want to make sure the AI doesn't do them."
*"In other words there are a cluster of behaviors th...
To illustrate an old point - that it's hard to distinguish between extortion and trade negotiations - here's a schematic diagram of extortion, alternating actions by player B (blackmailer/extorter//blue) and V (victim//violet):
The extorter can let the default Def happen, or can instead make a threat, ending up at point T. Then the victim can resist (Res) or surrender (Sur). If the victim resists, the extorter has the option of carrying out their threat (C) or not doing so (¬C).
You need a few conditions to make this into an extortion situation:
A mere threat doesn't make this into extortion. Indeed, trade negotiations are a series of repeated threats from both sides, making offers with the implicit threat that they will walk away from the deal entirely if that offer is not accepted.
But if C is worse than Def for V, then this seems a true extortion: V will end up worse off if they resist an extorter who carries out their threats, and they would have much preferred that the extorter not be able to make credible threats in the first place. And the only reason the extorter made the threats was to force the victim to surrender.
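As a rough sketch of the comparison above, here is the diagram written out as a tiny payoff table in Python. Everything in it is an assumption made for illustration: the numbers, the names `Payoff`, `threat_harms_victim` and `victim_reply`, and the reading of a trade negotiation as a game where carrying out the threat (walking away) simply returns both players to the default. It only checks the one condition stated in the text, that C is worse than Def for V.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Payoff:
    b: float  # utility to the extorter B
    v: float  # utility to the victim V


# Hypothetical payoffs for the four leaves of the diagram (illustrative only).
extortion = {
    "Def": Payoff(b=0, v=0),    # B lets the default happen
    "Sur": Payoff(b=2, v=-1),   # V surrenders after the threat
    "C": Payoff(b=-1, v=-3),    # V resists, B carries out the threat
    "notC": Payoff(b=0, v=0),   # V resists, B backs down
}

# Assumed reading of a trade negotiation: the "threat" is walking away,
# so carrying it out just puts both players back at the default.
trade = {
    "Def": Payoff(b=0, v=0),
    "Sur": Payoff(b=2, v=1),    # V accepts the offer
    "C": Payoff(b=0, v=0),      # B walks away
    "notC": Payoff(b=0, v=0),
}


def threat_harms_victim(game):
    """True if carrying out the threat leaves V worse off than the default."""
    return game["C"].v < game["Def"].v


def victim_reply(game, b_carries_out):
    """V's best reply at T, given what B would do after resistance."""
    after_resisting = game["C"] if b_carries_out else game["notC"]
    return "Res" if after_resisting.v >= game["Sur"].v else "Sur"
```

With these made-up numbers, `threat_harms_victim(extortion)` holds while `threat_harms_victim(trade)` does not, and V surrenders in the extortion game only when B would actually carry out the threat. That is one way to see why V would prefer that B be unable to make credible threats at all.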
Fairness and equity
Note that this is not about fairness or niceness. It's perfectly possible to extort someone into giving you fair treatment (depending on how you see the default point, many of the boycotts during the civil rights movement would count as extortion).
The all-important default
This model seems clear; so why is it so hard to identify extortion in real life? One key issue is disagreement over the default point. Consider the following situations:
These examples should be sufficient to illustrate the degree of the problem, and also show how defaults are often forged by implicit and explicit norms, so extortions are clearest cut when they also include norm violation, trust violation, or other dubious behaviours. In a sense, picking the default is the important thing; picking exactly the right default is less important, since once the default is known, people can adjust their expectations and behaviour in consequence.