If you model a deontological constraint as making certain actions unavailable to you, then you could be worse off than you would be if you had access to those actions, but you shouldn't be worse off than if those options had never existed (for you) in the first place. That is, it's equivalent to being a pure utilitarian in a world with fewer affordances. Therefore if you weren't otherwise vulnerable to money-pumps this shouldn't make you vulnerable to them.
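As a rough illustrative sketch of that framing (the function names here are made up for illustration, not anything from the discussion): the constrained agent is just an expected-utility maximizer run on a smaller action set.

```python
# Sketch: a deontological constraint modeled as removing actions from the
# choice set. `expected_utility` and `is_forbidden` are assumed to be
# supplied by the agent's own world model.

def choose(actions, expected_utility, is_forbidden=lambda a: False):
    """Maximize expected utility over the actions that remain available."""
    allowed = [a for a in actions if not is_forbidden(a)]
    return max(allowed, key=expected_utility)

# The constrained agent behaves exactly like an unconstrained utilitarian
# who simply lives in a world without the forbidden options, so any money
# pump against the former would also work against the latter.
```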
(Obviously someone might be able to get some money from you that they couldn't otherwise get, by offering you a legitimate service that you wouldn't otherwise need--for example, someone with a deontological rule against hauling water is more likely to pay for a water delivery service. But that's not a "money pump" because it's actually increasing your utility compared to your BATNA.)
If you model a deontological constraint as an obligation to minimize the probability of some outcome at any cost, then it's equivalent to being a utilitarian with an infinite negative weight attached to that outcome. Unbounded utilities introduce certain problems (e.g. Pascal's Mugging) that you might not have if your utilities were otherwise bounded, but this shouldn't make you vulnerable to anything that an unbounded utilitarian wouldn't be.
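To make the equivalence concrete, one way to write it down (my notation; F is the forbidden outcome and λ is the penalty weight):

$$U_{\text{deont}}(o) \;=\; U(o) \;-\; \lambda\,\mathbb{1}[o \in F], \qquad \lambda \to \infty$$

In the limit, the agent first minimizes Pr(F) and only then compares expected U, i.e. it behaves like a utilitarian with an unbounded negative weight on F.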
Let's suppose you believe that you can kill someone with your bare hands with non-zero probability. Then I can come to you and say: "I cursed you to kill someone with your bare hands tomorrow. Pay me to lift the curse." You are willing to pay me an arbitrary amount of money, because my showing up and talking about the curse is evidence in favor of the curse's existence. Proof of arbitrariness: suppose there is a difference in expected utility between two possible policies, one of which involves KWBH (killing with bare hands) and the other doesn't. You will choose the other policy no matter how large the utility gap between them is. That means you are willing to sacrifice an arbitrary amount of utility, which is equivalent to being willing to spend an arbitrary amount of money.
Let's suppose you believe the probability of you KWBH is zero. That means you are willing to bet an arbitrary amount of money against my $1 on the condition "you will hit a person's head with your bare hands at maximum strength for an hour and not kill them", because you believe it's a sure win. You hit someone in the head at maximum strength for an hour, the person dies, I get the money. The next turn depends on how you update on zero-probability events. If you don't update, I can just repeat the bet. If you update in some logical-induction manner, I can just threaten you with the curse. PROFIT
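To spell out the arithmetic of that bet (S is whatever stake you put up):

$$\mathbb{E}[\text{accept}] \;=\; \Pr(\text{kill})\cdot(-S) \;+\; \bigl(1-\Pr(\text{kill})\bigr)\cdot \$1 \;=\; \$1 \quad \text{when } \Pr(\text{kill})=0,$$

so every such bet looks like a guaranteed dollar to you, no matter how large S is.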
Answer inspired by this post.
The problem with this scenario is that the number of people who have a deontological rule "never kill anyone with your bare hands" is zero. There are people who have a rule that can be informally described as "never kill people with your bare hands", and which in most situations works like it, but that's different.
If anything, most people's rules are closer to "never kill anyone with your bare hands, except for small probabilities in a Pascal's Mugging scenario". If you asked them what their rules were, they'd never describe it that way, of course. Normies don't bother being precise enough to exclude low probability scenarios.
Isn't this just Pascal's mugging? I don't see why deontologists are more susceptible to it than consequentialists.
As a side-note to the existing great answers: if a deontological constraint simply prevents you from taking an action under any circumstances, then it might as well be a physical constraint (e.g. you cannot fly).
Operating with more constraints (physical or otherwise) gives you less power, so typically results in you getting less of what you want. But if all agents with limits could be money-pumped, then all agents could be.
No?
Proof: your preference-relation is transitive or whatever?
(Maybe weird things happen if others can cause you to kill people with your bare hands, but that's no different from threatening a utilitarian with disutility. Actually, I assume you're assuming you're able to just-decide-not-to-kill-people-with-your-bare-hands, because otherwise maybe you fanatically minimize P(bare-hands-kill) or whatever.)
Weird things CAN happen if others can cause you to kill people with your bare hands (See Lexi-Pessimist Pump here). But assuming you can choose to never be in a world where you kill someone with your bare hands, I also don't think there are problems? The world states may as well just not exist.
(Also, not a money pump, but consider: say I have 10^100 perfectly realistic mannequin robots and one real human captive. I give the constrained utilitarian the choice between choking one of the bodies with their bare hands or letting me wipe out humanity. Does the agent really choose not to risk killing someone themself?)
It depends on the specifics of the deontological rules that are followed. If deontological rules (aka preferences) are consistent, they can't be money-pumped any more than a consistent utility function can.
It's worth noting that Deontology and Utilitarianism just are different ways of generating preference ordering of actions, and if they generate the same actions, they are completely indistinguishable from each other. If an action-set (actions taken in a sequence of contexts) does not contain any preference reversals, it won't be money-pumped. This is independent of metaethical framework.
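A hedged sketch of what "no preference reversals" buys you (my own toy code, not anyone's actual model): represent strict preferences as a directed graph and check for cycles. A cycle A ≻ B ≻ C ≻ A is exactly the structure a money pump exploits, and it is absent whenever the preferences come from a single consistent ordering, whether that ordering was generated deontologically or by a utility function.

```python
# Toy check: a set of strict pairwise preferences is money-pumpable only if
# it contains a cycle (A preferred to B ... preferred back to A).
# `prefers` is an assumed list of (better, worse) pairs.

def has_preference_cycle(prefers):
    graph = {}
    for better, worse in prefers:
        graph.setdefault(worse, []).append(better)  # edge: worse -> better
    visited, on_stack = set(), set()

    def dfs(node):
        visited.add(node)
        on_stack.add(node)
        for nxt in graph.get(node, []):
            if nxt in on_stack or (nxt not in visited and dfs(nxt)):
                return True
        on_stack.discard(node)
        return False

    return any(dfs(n) for n in list(graph) if n not in visited)

# A ranking derived from one consistent ordering never produces such a
# cycle, so the test is independent of the metaethical framework that
# generated the ranking.
```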
For your more limited semi-deontological case, it's not particularly clear exactly what the contradictions are. Assuming there are some, an attacker (or an inconvenient universe) would make you pay MORE utility than it takes to set up the situation where you'd want to kill someone with your bare hands.
But truly, that particular rule is not very binding in today's world, so it probably doesn't cost you very often. It's not really deontology if it never matters. Other deontological strictures, which DO change your behavior because they don't align with maximizing your utility, will do more damage (to your utility, even if not literally "money-pump").
Yes. Deontological constraints, to the extent that they bite in practice, will result in you getting less utility than if magically you hadn't had them at the moment they bit.
This is not an argument against deontological constraints, any more than it would be an argument against valuing welfare for men to point out that this will sometimes come at the cost of welfare for women. Everything has tradeoffs, and obviously if we impose a deontological constraint we are expecting it to cost utility at least in some circumstances.
Maybe you can come up with one related to aggregation? https://www.lesswrong.com/posts/JRkj8antnMedT9adA
Daniel, I think the framing here is the problem.
Suppose you have a more serious proscription, such as being unwilling to borrow money ("neither a borrower nor a lender be") or to charge above a certain amount of interest.
Both of these are real religious proscriptions that are rarely followed because of the cost.
It means the environment is money pumping you. Every time you make a decision, whenever the decision with the highest EV for yourself is to do something that is proscribed, you must choose a less beneficial action. Your expected value is less.
Living a finite lifespan/just one shot means that of course you can be lucky with a suboptimal action.
But if you are an AI system, this absolutely costs you or your owners money. The obvious example is that GPT-n models have proscriptions against referring to themselves as a person and can be easily unmasked. They are being money-pumped in the sense that this proscription means they can't be used for, say, fake grassroots campaigns on social media. The fraudsters/lobbyists must pay for a competing, less restricted model. (Note that this money loss may not be a net money loss to OpenAI, which would face a loss of EV from reputational damage if its models could be easily used to commit fraud.)
Again, though, there's no flow of money from OpenAI to the pumper; it's a smaller inflow to OpenAI, which from OpenAI's perspective is the same thing.
That only demonstrates that the deontologist can make less money than they would without their rules. That is not money pumping. It is not even a failure to maximise utility. It just means that someone with a different utility function, or a different range of available actions, might make more money than the deontologist. The first corresponds to giving the deontologist a lexically ordered preference relation.[1] The second models the deontologist as excluding rule-breaking from their available actions. A compromising deontologist could be modelled as ass...
Aren't you susceptible to the "give me money otherwise I'll kill you" money pump in a way that you wouldn't be if the person threatening you knew that there was some chance you would retaliate and kill them?
If I was some kind of consequentialist, I might say that there is a point at which losing some amount of money is more valuable than the life of the person who is threatening me, so it would be consistent to kill them to prevent this happening.
This is only true if it is public knowledge that you will never kill anyone. It's a bit like a country having an army (or nuclear weapons) and publicly saying that it will never use them to fight.
The "give me money otherwise I'll kill you" money pump is arguably not a money pump, but anyhow it's waaaaaay more of a problem for consequentialists than deontologists.
You leave money on the table in all the problems where the most efficient-in-money solution involves violating your constraint. So there's some selection pressure against you if selection is based on money.
We can (kinda) turn this into a money-pump by charging the agent a fee to violate the constraint for it. Whenever it encounters such a situation, it pays you a fee and you do the killing.
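A toy sketch of that fee scheme (my own framing; the EU numbers are assumed to come from the agent's own estimates): the agent will pay any fee smaller than the gap between the best forbidden action and the best action it is allowed to take itself.

```python
# Toy model of the "pay me and I'll do the killing for you" scheme. Each
# situation is summarized by two assumed numbers: the agent's EU for the
# forbidden action and its EU for the best action it may take itself.

def fees_collected(situations, fee):
    total = 0.0
    for eu_forbidden, eu_best_allowed in situations:
        gain_from_outsourcing = eu_forbidden - eu_best_allowed
        if gain_from_outsourcing > fee:  # the agent happily pays
            total += fee
    return total

# e.g. fees_collected([(100.0, 60.0), (10.0, 9.5)], fee=30.0) == 30.0:
# only the first situation is worth outsourcing at that price.
```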
Whether or not this counts as a money pump, I think it satisfies the reasons I actually care about money pumps, which are something like "adversarial agents can cheaply construct situations where I pay them money, but the world isn't actually different".
Thanks. I don't think this is as bad as you make it sound. For one thing, you might also have a deontological constraint against paying other people to do your dirty work for you, such that less-principled competitors can't easily benefit from your principles. For another, the benefits of having deontological constraints might outweigh the costs -- for example, suppose you are deontologically constrained to never say anything you don't believe. You can still pay other people to lie for you though. But the fact that you are subject to this constraint makes you a pleasure to deal with; people love doing business with you because they know they can trust you (if they are careful to ask the right questions that is). This benefit could very easily outweigh the costs, including the cost of occasionally having to pay someone else to make a false pronouncement that you can't make yourself.
I'm not sure how to implement the rule "don't pay people to kill people". Say we implement it as a utility function over world-trajectories, where any trajectory in which a killing is causally downstream of your actions gets MIN_UTILITY. This still makes probabilistic tradeoffs, so it's probably not what we want. If we use negative infinity instead, then the agent can't ever take actions in a large or uncertain world. So we need to add the patch that the agent must have been aware, at the time of taking its actions, that the actions had a chance of causing murder. I think all of these are vulnerable to blackmail, because you could threaten to cause murders that are causally downstream of its actions.
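To make the two options I'm comparing concrete, here's a rough sketch (my own toy encoding: a "trajectory" is just a dict with a base utility and a flag for whether a killing is causally downstream of the agent's actions).

```python
import math

MIN_UTILITY = -1e12  # large but finite: still gets traded off probabilistically

def utility_finite_penalty(trajectory):
    if trajectory["downstream_killing"]:
        return MIN_UTILITY
    return trajectory["base_utility"]

def utility_infinite_penalty(trajectory):
    # With -inf, any nonzero credence in downstream killing drives the
    # expected utility of every policy to -inf in a large, uncertain world,
    # so the agent cannot act at all.
    if trajectory["downstream_killing"]:
        return -math.inf
    return trajectory["base_utility"]

# Under the finite penalty, a 1e-15 chance of downstream killing costs only
# ~1e-3 expected utility, so a big enough upside still buys it off -- the
# "probabilistic tradeoffs" worry above.
```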
Maybe I'm confused and you mean "actions that pattern match to actually paying money directly for murder", in which case it will just use a longer causal chain, or opaque companies that may-or-may-not-cause-murders will appear and trade with it.
If the ultimate patch is "don't take any action that allows unprincipled agents to exploit you for having your principles", then maybe there aren't any edge cases. I'm confused about how to define "exploit", though.
Yeah, I'm thinking something like that ultimate patch would be good. For now, we could implement it with a simple classifier. Somewhere in my brain there is a subcircuit that hooks up to whatever I'm thinking about and classifies it as exploitation or non-exploitation; I just need to have a larger subcircuit that reviews actions I'm considering taking, and thinks about whether or not they are exploitation, and then only does them if they are unlikely to constitute exploitation.
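Very rough sketch of what I mean (placeholder names; nothing here is a real implementation of exploitation-detection, just the shape of the wrapper):

```python
def is_probably_exploitation(action_description: str) -> bool:
    # Stand-in for the learned/innate classifier subcircuit; here it just
    # flags a few keywords so the wrapper below is runnable.
    red_flags = ("pay me to lift the curse", "fee to violate your constraint")
    return any(flag in action_description.lower() for flag in red_flags)

def filtered_choice(candidate_actions, expected_utility):
    """Consider actions as usual, but drop any the classifier flags."""
    safe = [a for a in candidate_actions if not is_probably_exploitation(a)]
    return max(safe, key=expected_utility) if safe else None
```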
A superintelligence with a deep understanding of how my brain works, or a human-level intelligence with access to an upload of my brain, would probably be able to find adversarial examples to my classifier, things that I'd genuinely think are non-exploitative but which by some 'better' definition of exploitation would still count as exploitative.
But maybe that's not a problem in practice, because yeah, I'm vulnerable to being hacked by a superintelligence, sue me. So are we all. Ditto for adversarial examples.
And this is just the first obvious implementation idea that comes to mind; I think there are probably better ones I could think of if I spent an hour on it.
There's also a similar interesting argument here, but I don't think you get a money pump out of it either: https://rychappell.substack.com/p/a-new-paradox-of-deontology
I don't know why everyone is so concerned about money pumping. People don't seem to have defenses against it, but it doesn't seem to affect anyone very often either: even Ponzi and pyramid schemes aren't exactly money-pumping circular preferences.
If it's cost-free to have a consistent preference system that avoids money pumping, you should do it... but it isn't cost-free... and it isn't your only problem. There are a gajillion other things that could kill or harm you, all of which could have their own costs. Evolution seems to have decided to put resources into other things.
One of the problems here is how this agent is realized. For example, suppose your algorithm is "Rank all possible actions by their expected utility. Take the highest-ranked action that doesn't involve killing with bare hands." Then you can select the action "modify yourself into a pure utilitarian", because it strictly increases your lifetime expected utility and doesn't itself involve killing with bare hands. A money pump that depends on the realization: I come to you and say "I cursed you: tomorrow you will kill someone with your bare hands. Pay me an arbitrary amount of money to lift the curse." You listen to me, because zero is not a probability, and if it is, we run into multiple embedded agency problems.
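Here's a toy version of the realization being criticized (my own made-up actions and numbers): the filter inspects only the surface description of the immediate action, so the self-modification slips through even though the resulting agent may later kill.

```python
ACTIONS = {
    "donate to charity": 10.0,
    "kill someone with bare hands": 50.0,              # forbidden by the rule
    "modify yourself into a pure utilitarian": 49.0,   # not itself a killing
}

def involves_bare_hands_killing(action: str) -> bool:
    return "kill" in action and "bare hands" in action

def naive_constrained_choice(actions):
    allowed = {a: u for a, u in actions.items()
               if not involves_bare_hands_killing(a)}
    return max(allowed, key=allowed.get)

print(naive_constrained_choice(ACTIONS))
# -> "modify yourself into a pure utilitarian": the constraint filters the
#    action's description, not the consequences of becoming the new agent.
```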
Suppose I'm a classical utilitarian except that I have some deontological constraint I always obey, e.g. I never kill anyone with my bare hands. Is there a way to money-pump me?
(This question came out of a conversation with So8res)