There is a distinction people often fail to make, both in analyses of fictional characters' actions and in analyses of real people's: the distinction between behaving irrationally and having extreme preferences.
If we look at actions and preferences the way decision theorists do, it is clear that preferences cannot be irrational. Indeed, rationality is defined as the tendency to satisfy one's preferences. To say preferences are irrational is to say that someone's tastes can be objectively wrong.
Example: Voldemort is famously stubborn in JKR's Harry Potter. He could easily have arranged for a minion to kill Harry, but he didn't, and this is decried as irrational. Even more to the point, he could have been immortal if only he had hidden in a cave somewhere and not bothered anyone.
But that is ignoring Voldemort's revealed preference relation and just treating survival as his chief end. What is the point of selling your soul to become the most powerful lich of all time so you can live as a hermit? That would be irrational, as it would neglect Voldemort's preferences.
It's possible to behave in a way you don't endorse on reflection, or even merely wouldn't endorse, especially under modes of reflection you lack the skill or knowledge for. Calling this condition "revealed preferences" errs in the opposite direction from the one you point out: it's an overly behaviorist view that ignores the less observable process of arriving at preferences. There is also something to be said about lacking preferences in areas where it would be healthier to have them, even absent their spontaneous emergence.
I've read a good chunk of Eliezer's paper on TDT, and it's in that context that I am interpreting reflection. Forgive me if I misunderstand some of it; it's new to me.
TDT is motivated by requiring a decision rule that is consistent under reflection. It doesn't seem to pass judgment on preferences themselves, only on how actions ought to be chosen given preferences. Am I mistaken here?
Perhaps I should have been clearer with Voldemort's "revealed" preferences. JKR writes him as a fairly simple character and I did take for granted that what we saw was what we got. I agree that in general actions aren't indicative of beliefs.
EDIT: Ah, there is an exception. Eliezer is quite critical in the paper of preferring a decision rule for its own sake.
preferences cannot be irrational.
Preferences CAN be inconsistent (over time, or across different modes of preference-identification), which is definitely irrational. And sufficiently extreme preferences, even if consistent and consistently pursued, can be judged evil, insane, or harmful to many others.
Revealed preference is a very useful obfuscation of the fact that humans aren't actually rational in the VNM sense. Not one of us has a consistent ordering over potential world-states, and we're full of contradictions between how we think about our wants (which varies across different modes and frames of asking the question) and what we actually do. Revealed preference is quite a bit stronger predictively, especially averaged over a population, than polling or discussion. But it's not the only valid conception of preferences, and not what most people mean when talking about preferences.
Voldemort can more easily be understood (and predicted) if you model his preference as immortality being instrumental to his quest for being powerful and feared (respected, in his worldview). But remember two things:
I would argue that inconsistency of preferences isn't necessarily a sign of irrationality. Come to think of it, it may hinge greatly on how you frame the preference.
Consider changing tastes. As a child, I preferred some sweets to savory items, and those preferences reversed as I aged. Is that irrational? No and, indeed, you needn't even view it as a preference reversal. The preference "I prefer to eat what tastes good to me" has remained unchanged, after all. Is my sense of taste itself a preference? It seems like this would devolve into semantics quickly.
My reluctance to characterize preferences as rational or irrational is that I see these as prescriptive terms. But you can't prescribe preferences. You either have them or you don't. Only decision rules are chosen.
When I first read about Newcomb's Problem, I will admit that it struck me as artificial. But not unfamiliar! Similar dilemmas seem common in film and television.
For example, consider Disney's Hercules. Hercules spends the entire movie trying to regain his status as a god. He is told that he must become a hero to do so, so of course he sets out doing what seem to be heroic things. In the end he succeeds by jumping into the Styx to save Meg despite being told he would die. Heroism in his world evidently requires irrationality!
While it isn't identical to Newcomb's Problem, there is the same theme of "just ignore the apparent causal chain and it will all work out". There are countless other instances: let the girl go and you'll end up with the girl, or admit to cheating in that contest and you'll win the prize, etc. The message is that denying ourselves is virtuous, creating Newcomblike scenarios.
It occurred to me today that VNM utility functions model preferences concerning income rather than wealth in general.
Consider the continuity axiom, for example. This axiom seems to imply that a rational agent would be willing to gamble their entire life savings for an extra dollar provided that the probability of losing is small enough. Barring the possibility of charity, going broke is tantamount to death, since it costs money to make money. It seems reasonable to me that a rational agent would treat their own death as infinitely bad. Under this assumption no probability of losing is small enough.
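To make this concrete, here is a minimal numeric sketch (the utility numbers are my own illustrative choices, not from any particular formalization): as long as the utility of going broke is finite, some sufficiently small probability of ruin makes the gamble look worthwhile.

```python
def expected_utility(p_ruin, u_win, u_broke):
    """Expected utility of a gamble that wins with probability 1 - p_ruin."""
    return (1 - p_ruin) * u_win + p_ruin * u_broke

u_status_quo = 100.0   # hypothetical utils for keeping your savings
u_win = 100.001        # savings plus the extra dollar
u_broke = -1e6         # going broke: very bad, but finite

# Shrink the ruin probability until the gamble beats the status quo.
p = 1e-3
while expected_utility(p, u_win, u_broke) <= u_status_quo:
    p /= 10

print(p)  # a tiny but positive probability of ruin is acceptable
# If u_broke were negative infinity, no positive p would ever work --
# which is exactly what treating ruin as infinitely bad does to continuity.
```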
This criticism doesn't apply if lotteries are only allowed positive payouts, of course, but no such assumption is ever made. This is what I mean when I say that the axioms describe preferred income streams rather than wealth levels. The obvious fix is to add a parameter for current wealth, but I'm unsure if a result will follow that is analogous to the VNM Utility Theorem.
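One way the wealth parameter might enter (a sketch of my own, assuming log utility over final wealth, not any established result): evaluate lotteries over final wealth w + x rather than over the payout x alone, so the same lottery can be acceptable at one wealth level and unacceptable near ruin.

```python
import math

def u(wealth):
    # Hypothetical log utility over final wealth: ruin (wealth -> 0)
    # has utility -infinity, so it can never be gambled away cheaply.
    return math.log(wealth)

def expected_utility(wealth, lottery):
    """Lottery = [(probability, payout), ...], evaluated at final wealth."""
    return sum(p * u(wealth + x) for p, x in lottery)

# Risk $800 to win $1000 at even odds:
lottery = [(0.5, 1000), (0.5, -800)]
print(expected_utility(10_000, lottery) > u(10_000))  # True: comfortable
print(expected_utility(1_000, lottery) > u(1_000))    # False: near ruin
```

Whether an analogue of the VNM theorem goes through for such wealth-indexed preferences is exactly the open question in the text.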
I've always had philosophical leanings, so I find myself asking often what decision theory sets out to do, even as I grapple with a concrete mathematical application. This seems important to me if I want a realistic model of an actual decision an agent may face. My concerns keep returning to utility and what it represents.
Utility is used as a measure for many things: wealth, usefulness, satisfaction, happiness, scoring in games, etc. Our treatment of it suggests that what it represents doesn't matter: the default aim of a decision theory should be to maximize an objective function, and we just call that function the utility function. It doesn't seem obvious to me that this is always so.
One may object that the VNM Utility Theorem assures this is so. But VNM (at least the version with which I am familiar) covers only simple decision problems. It would be faster to list scenarios it can handle, but let's summarize what it doesn't:
- It has nothing to say when there are infinitely many outcomes.
- It has nothing to say when the timing of utility gains matters; auxiliary discounting functions must be introduced, with varied motivations.
- It doesn't address apples-to-oranges comparisons, because everything must in the end be convertible to utils.
- It offers no insight into how the weights guaranteed by the continuity axiom can be calculated for an agent, so you can't construct their utility function without already knowing all of their preferences (which is what the utility function was supposed to let you infer!).
- It doesn't enable comparisons between agents, so it isn't a basis for a social choice theory.
The result is that utility ends up being the drawer of miscellaneous items: we cram it with whatever doesn't yet have a proper place, and treat it as a black box that produces the result we want in any given context. The limitations are mostly ignored. We use unbounded utility functions defined on unbounded domains whose growth rates are chosen for convenience, we discount future values as if we will live forever, and we pretend every combination of assets we may acquire will fit into their domains.
As an example, we may try to investigate why people are inclined to play the lottery despite being risk-averse in other respects. A common explanation is that people overweight small probabilities with extreme outcomes. But it is also observed that some people simply enjoy the thrill of taking a risk. So do you use a weighting function or do you use a convex utility function? It isn't clear at all, partly because the utility of having money and the utility of having a thrill don't seem to be comparable. They certainly don't feel comparable.
To conclude, the mathematical convenience of turning decision-making into optimization shouldn't seduce us into being lazy like this.
aim of a decision theory [...] to maximize an objective function
Pragmatically, this is also the wrong thing to do if pursued too methodically, because a legible objective function is always only a proxy for what is actually valuable, and optimizing for a proxy ruins it. A better ethos might be to always remain on the lookout for improving the proxies, while only making careful use of currently available ones, perhaps in a way that pays attention to how unusual a given situation is.
Mortality and Discounting
Many are probably aware of how discounting works, but I'll give a brief summary first:
Humans have time preferences, which is to say that we prefer to have money (or any item of utility) sooner rather than later, all else equal. One way of capturing this is to convert a future cash amount to an equivalent present value with a discount function. Studies show that humans tend to use a hyperbolic discount function, which steeply discounts gains in the near future and only mildly discounts gains in the distant future, and this leads to preference reversals. For example, committing to begin saving money a year from now seems tempting, but committing to begin right away seems daunting. Suppose you commit to the former. After a year passes, you'll regret the commitment, since your situation will now match the latter.
There are exactly two ways to avoid these preference reversals: not discounting at all (which requires a different treatment of time preference), or discounting exponentially. Exponential discounting gives you a constant per-period discount factor d. So that seems to settle it. Exponential discounting is the way to go, right?
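The reversal, and exponential discounting's immunity to it, can be checked with a few lines of arithmetic (the dollar amounts, the k = 1 hyperbola, and d = 0.9 are illustrative choices of my own):

```python
def hyperbolic(t, k=1.0):
    # Hyperbolic discount function: steep near t = 0, shallow far out.
    return 1.0 / (1.0 + k * t)

def exponential(t, d=0.9):
    # Exponential discount function: constant per-period factor d.
    return d ** t

def prefers_smaller_sooner(discount, delay):
    """True if $100 at `delay` beats $110 at `delay + 1`."""
    return 100 * discount(delay) > 110 * discount(delay + 1)

# Hyperbolic: the choice reverses as the pair of dates draws near.
print(prefers_smaller_sooner(hyperbolic, 10))  # False: wait for $110
print(prefers_smaller_sooner(hyperbolic, 0))   # True: grab $100 now
# Exponential: discount(t+1)/discount(t) is always the constant d,
# so the choice never depends on how far away the dates are.
print(prefers_smaller_sooner(exponential, 10)) # True
print(prefers_smaller_sooner(exponential, 0))  # True
```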
I argue that it has a serious shortcoming in that it doesn't account for our mortality. To be fair, neither does hyperbolic discounting! But using exponential discounting as the example, I'll show how you can be turned into a money pump. Let d be your annual discount factor, and suppose we know that you will live for at most t years. According to exponential discounting, you value one dollar now the same as d^(-t) dollars in t years. I will generously offer you d^(-t) + 1 dollars in t years if you lend me one dollar now. Who cares that you'll be dead by the time you would collect it?
You might protest that your remaining lifespan is a random variable, but the argument holds so long as it's bounded: just let t be large enough that the universe will have succumbed to heat death. You might also protest that the (zero) probability of living until the collection date ought to enter the calculation somehow. But uncertainty of payment is one of the motivations for discounting in the first place! The whole point is that the discounting method fails to capture it here.
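A quick check of the arithmetic (with a hypothetical discount factor and lifespan bound of my own choosing):

```python
d = 0.95    # hypothetical annual discount factor
t = 200     # years: safely past any human lifespan

# Exponential discounting says $1 today is worth d^(-t) dollars in t years.
fair_future_value = (1 / d) ** t
offer = fair_future_value + 1          # the "generous" offer from the text

# Discounting the offer back to the present:
present_value_of_offer = offer * d ** t
print(present_value_of_offer)  # slightly more than 1 dollar
# So exponential discounting says: lend the dollar. But the probability
# of being alive to collect is 0, making the deal worth exactly -$1.
```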
Mortality requires the discount function to reach 0 in finite time. As no exponential function does this, any discounting method must either neglect mortality or allow preference reversals. My tentative conclusion is that discounting is perhaps not the "ideal" way to express time preferences, but I am open to suggestions.