I've written before about the difficulty of distinguishing values from errors, from algorithms, and from context. Now I have to add to that list: How can we distinguish our utility function from the parameters we use to apply it?
In my recent discussion post, "Rationalists don't care about the future", I showed that exponential time-discounting, plus some assumptions about physics and knowledge, leads to not caring about the future. Many people responded by saying that, if I care about the future, this shows that my utility function does not use exponential time-discounting.
This response assumes that the shape of my time-discounting function is part of my utility function. In other words, the way you time-discount is one of your values.
By contrast, Eliezer wrote an earlier post saying that we should use human values, but without time-discounting. Eliezer is aware that humans appear to use time discounting. Therefore, this implicitly claims that the time-discounting function is not one of our values. It's a parameter for how we implement them.
(Some of the arguments Eliezer used were value-based arguments, suggesting that we can use our values to set the parameters that we use to implement our values... I suspect this recursive approach could introduce bogus solutions, like multiplying both sides of an equation by a variable, or worse; but that would take a longer post to address. I will note that some recursive equations do have unique solutions.)
The program of CEV assumes that a transhuman can use some extrapolated version of values currently used by some humans. If that transhuman has a life expectancy of a billion years, it will likely view time discounting differently. Eliezer's post against time discounting suggests, to me, a God-like view of the universe, in which we eliminate time discounting in the same way (and for the same reasons) that many people want to eliminate space-discounting (not caring about far-away people) in contemporary ethics. This takes an ethical code that evolved agents have, one constructed to promote the propagation of those agents' genes, and applies it without reference to any particular set of genes. This is also pretty much what folk morality says a social moral code is. So the idea that you can apply the same utility function in a radically different context is inherent in CEV, and is common to much public discourse on ethics, which assumes that you can construct a social morality based on the morality we find in individual agents.
On the other hand, I have argued that assuming that social ethics and individual ethics are the same is either merely sloppy thinking or an evolved (or deliberately constructed) lie. People who accept that argument would probably subscribe to a social-contract theory of ethics. (This view also has problems, beyond the scope of this post.)
I have one heuristic that I think is pretty good for telling when something is not a value: If it's mathematically wrong, it's an error, not a value. So my inclination is to point out that exponential time-discounting is correct. All other forms of time-discounting lead to inconsistencies: your preference between two future outcomes can reverse merely because time has passed. You can time-discount exponentially; or you can not time-discount at all, as Eliezer suggested; or you can be in error.
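The inconsistency claim can be illustrated with a minimal sketch. Everything specific here is an assumption chosen for illustration: the two rewards, their delivery times, the discount rates, and the hyperbolic form 1 / (1 + k t) as the stand-in for "other forms" of discounting.

```python
import math

def exponential(delay, rate=0.1):
    """Exponential discount factor exp(-rate * delay). The rate is an assumed value."""
    return math.exp(-rate * delay)

def hyperbolic(delay, k=1.0):
    """Hyperbolic discount factor 1 / (1 + k * delay), one common non-exponential form."""
    return 1.0 / (1.0 + k * delay)

def prefers_later(discount, now, small=(10.0, 10.0), large=(15.0, 11.0)):
    """True if, judged at time `now`, the larger-later reward beats the smaller-sooner one.

    Each reward is an (amount, delivery_time) pair; both are hypothetical.
    """
    (a_small, t_small), (a_large, t_large) = small, large
    return a_large * discount(t_large - now) > a_small * discount(t_small - now)

def reverses(discount, times=(0.0, 5.0, 9.9)):
    """True if the choice between the two rewards changes as `now` advances."""
    return len({prefers_later(discount, now) for now in times}) > 1

exp_reverses = reverses(exponential)
hyp_reverses = reverses(hyperbolic)
print("exponential reverses:", exp_reverses)  # False
print("hyperbolic reverses:", hyp_reverses)   # True
```

Under exponential discounting the ratio of the two discounted values does not depend on `now`, so the choice never flips. Under the hyperbolic form the agent prefers the larger reward from far away and switches to the smaller one as it gets close.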
But my purpose in this post is not to continue the arguments from that other post. It's to point out this additional challenge in isolating what values are. Is your time-discounting function a value, or a value parameter?
I can't understand the proof, but I can construct some counterexamples. Either these disprove the theorem, or they will show where I misunderstand the proof.
The first class relies on the fact that the proof places no restrictions on the probability distribution. Define U(n) = n. This is bounded below in absolute value by the unbounded computable function U(n) = n. Define p:N->R such that p(1) = 1, and p(n) = 0 for all other n. Now EU(h(k)) = 1, which is finite.
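As a sanity check on the arithmetic, here is a small sketch of that point-mass case. The helper name `expected_utility` and the cutoff of 10,000 outcomes are assumptions for illustration. The cutoff loses nothing here, because every term beyond n = 1 is zero.

```python
def expected_utility(p, U, outcomes):
    """Sum of p(n) * U(n) over the given outcomes."""
    return sum(p(n) * U(n) for n in outcomes)

def U(n):
    """Unbounded utility U(n) = n."""
    return n

def p(n):
    """Point mass: all probability on n = 1."""
    return 1.0 if n == 1 else 0.0

eu = expected_utility(p, U, range(1, 10_001))
print(eu)  # 1.0
```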
Notice that I am defining p as a function from the integers into the reals, whereas Peter defined it as a function from the reals into the reals. Peter's definition must be incorrect, since in equation 2, the term should not be p(f)U(f(k)), as Peter has defined it; it must be either p(f)U(f) or p(f(k))U(f(k)). p and U must always be operating on the same thing within one term, so p should be defined over the same domain as U.
(Peter has disagreed with me in email on this, but I can't agree. In an expected utility calculation, each term must be the utility of something, times the probability of that same something.)
The second class of counterexamples relies on the fact that the utility function is not constrained to be positive. The proof says, "To establish that this series does not converge, we will show that infinitely many of its terms have absolute value >= 1." The consequent does not follow from the antecedent if the terms can be negative; so we already know the proof is flawed in that case. (And if it isn't the case, why does the proof mention absolute value?)
I define U(n) = n if n is even, -n if n is odd. Now I define p(n) = 1 / 2^n. p(n) is a valid probability distribution, as it sums to one over n >= 1; and the infinite sum of p(n)U(n) converges. (Its limit is -2/9, from the identity sum n x^n = x / (1 - x)^2 evaluated at x = -1/2.)
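The convergence can be checked numerically. This is a minimal sketch using exact fractions; the cutoff N = 60 is an arbitrary choice. The partial sums settle at the closed-form value x / (1 - x)^2 at x = -1/2, which is -2/9. What matters for the argument is only that the series converges at all.

```python
from fractions import Fraction

def U(n):
    """Alternating utility: n for even n, -n for odd n."""
    return n if n % 2 == 0 else -n

def p(n):
    """Probability 1 / 2^n, as an exact fraction."""
    return Fraction(1, 2 ** n)

def partial_sum(N):
    """Exact partial sum of p(n) * U(n) for n = 1..N."""
    return sum(p(n) * U(n) for n in range(1, N + 1))

closed_form = Fraction(-2, 9)
gap = abs(partial_sum(60) - closed_form)
print(float(partial_sum(60)), float(gap))
```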
The third and most important class of counterexamples builds a reasonable positive utility function and probability distribution for which the theorem fails. Again use p(n) = 1 / 2^n, which sums to 1. My utility function will be
Now U(n) is bounded below in absolute value by itself, and U(n) is unbounded; and yet the infinite sum p(n)U(n) converges to 1. (Intuitive proof: The sum of all the n terms from (2^n)+1 to 2^(n+1) is n+1; their average is thus 1, and they all have very similar probabilities. Therefore, the sum of the infinite series is going to be the same as if it used the constant U(n) = 1, which would make the entire sum p(n)U(n) sum to 1.)
Some retractions, clarified by Peter in email:
Peter does place a limit on the probability distribution: It must never be zero. (I read a '<' as '<='.) This removes counterexamples 1 and 3. However, I am not sure it's possible to build a probability distribution satisfying his requirements (one over a set of cardinality 2^N, with no zero terms, that sums to one). This link says it is not possible.
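For what it's worth, the standard argument for that impossibility is short. The sketch below assumes the domain X really is uncountable (cardinality 2^N, as stated above) and that the sum means the supremum over finite partial sums; the symbols X and A_k are my own labels, not Peter's.

```latex
\text{Suppose } p : X \to (0, 1] \text{ with } \sum_{x \in X} p(x) = 1.
\text{ For each integer } k \ge 1 \text{ let }
A_k = \{\, x \in X : p(x) > 1/k \,\}.

\text{If } A_k \text{ had } k \text{ or more elements, those elements alone would contribute more than }
k \cdot \tfrac{1}{k} = 1 \text{ to the sum, so } |A_k| < k.

\text{Since } p(x) > 0 \text{ for every } x, \quad
X = \bigcup_{k \ge 1} A_k ,

\text{a countable union of finite sets, hence countable. So no such } p \text{ exists on an uncountable } X.
```

If the domain is actually countable, this argument does not apply, and a distribution such as p(n) = 1 / 2^n already has no zero terms.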
The reason Peter's expected value calculation is not of the form p(x)U(x) is because he is summing over one possible action. p(h) is the probability that a par