Utility - LessWrong

In the context of value alignment theory, 'Utility' always refers to a goal held by an artificial agent. It further implies that the agent is a consequentialist; that the agent has probabilistic beliefs about the consequences of its actions; that the agent has a quantitative notion of "how much better" one outcome is than another and the relative size of different intervals of betterness; and that the agent can therefore, e.g., trade off large probabilities of a small utility gain against small probabilities of a large utility loss.
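As a minimal sketch of the trade-off just described, the following computes expected utility for two options. The outcome names, utility numbers, and probabilities are illustrative assumptions, not taken from any particular agent design.

```python
from typing import Dict, List, Tuple

# A lottery is a list of (probability, outcome) pairs.
# All names and numbers below are hypothetical, chosen only for illustration.
Lottery = List[Tuple[float, str]]


def expected_utility(lottery: Lottery, utility: Dict[str, float]) -> float:
    """Combine quantitative degrees of belief with quantitative utilities."""
    total_probability = sum(p for p, _ in lottery)
    if abs(total_probability - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(p * utility[outcome] for p, outcome in lottery)


# Utilities encode "how much better" one outcome is than another.
utility = {"status_quo": 0.0, "small_gain": 1.0, "large_loss": -50.0}

# Option 1: do nothing. Option 2: a likely small gain with a rare large loss.
safe = [(1.0, "status_quo")]
risky = [(0.95, "small_gain"), (0.05, "large_loss")]

eu_safe = expected_utility(safe, utility)
eu_risky = expected_utility(risky, utility)

# The agent picks whichever option has the higher expected utility.
choice = "safe" if eu_safe >= eu_risky else "risky"
print(eu_safe, eu_risky, choice)
```

With these assumed numbers, the 5% chance of a loss of 50 outweighs the 95% chance of a gain of 1, so the agent prefers the safe option. Changing only the size of the loss to -10 would reverse the choice, which is the sense in which the relative sizes of the intervals matter.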

True coherence in the sense of a von Neumann-Morgenstern utility function may be out of reach for bounded agents, but the term 'utility' may also be used for the bounded analogues of such decision-making, provided that quantitative relative intervals of preferability are being combined with quantitative degrees of belief to yield decisions.

Utility is explicitly not assumed to be normative. E.g., if speaking of a paperclip maximizer, we will say that an outcome has higher utility iff it contains more paperclips.

Humans should not be said (without further justification) to have utilities over complicated outcomes. On the mainstream view from psychology, humans are inconsistent enough that it would take additional assumptions to translate our psychology into a coherent utility function. E.g., we may value the interval between two outcomes differently depending on whether the interval is framed as a 'gain' or a 'loss'. For the things humans do or should want, see the special use of the word 'value'. For a general disambiguation page on words used to talk about human and AI wants, see Linguistic conventions in value alignment.
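The gain/loss framing effect can be illustrated with a toy reference-dependent valuation. This is only a sketch: the loss-aversion multiplier of 2.0 and the function `framed_value` are hypothetical, chosen to make the inconsistency visible, and are not a claim about measured human psychology.

```python
# Hypothetical multiplier: losses are weighted twice as heavily as gains.
LOSS_AVERSION = 2.0


def framed_value(outcome: float, reference: float) -> float:
    """Value of an outcome as felt relative to a reference point (the 'frame')."""
    change = outcome - reference
    return change if change >= 0 else LOSS_AVERSION * change


low, high = 100.0, 150.0

# Frame 1: the reference point is the lower outcome, so `high` is a gain.
gain_frame_interval = framed_value(high, low) - framed_value(low, low)

# Frame 2: the reference point is the higher outcome, so `low` is a loss.
loss_frame_interval = framed_value(high, high) - framed_value(low, high)

print(gain_frame_interval, loss_frame_interval)
```

The same pair of outcomes is separated by an interval of 50 under one framing and 100 under the other. A single utility function over outcomes would assign that interval one size, so extracting a utility function from such an evaluator requires an extra assumption about which framing counts.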

Some construals of value, e.g. reflective equilibrium, may imply that the true values form a coherent utility function. Nonetheless, by convention, we will not speak of value as a utility unless it has been spelled out that, e.g., the value in question has been assumed to be a reflective equilibrium.

Multiple agents with different utility functions should not be said (without further exposition) to have a collective utility function over outcomes, since at present, there is no accepted canonical way to aggregate utility functions.
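One reason aggregation is not straightforward can be sketched as follows. A utility function in the von Neumann-Morgenstern sense represents the same preferences after any positive rescaling, yet a naive sum of two agents' utilities is sensitive to that rescaling. The agents, outcomes, and numbers here are hypothetical, and summation is just one candidate aggregation rule used for illustration.

```python
from typing import Dict, List

Utility = Dict[str, float]

outcomes = ["A", "B"]

# Hypothetical agents: Alice prefers A, Bob prefers B.
alice: Utility = {"A": 1.0, "B": 0.0}
bob: Utility = {"A": 0.0, "B": 0.6}


def rescale(u: Utility, scale: float, shift: float = 0.0) -> Utility:
    """Positive affine transformation; leaves the agent's own choices unchanged."""
    if scale <= 0:
        raise ValueError("scale must be positive")
    return {outcome: scale * value + shift for outcome, value in u.items()}


def best_by_sum(utilities: List[Utility]) -> str:
    """Naive aggregation: pick the outcome with the largest summed utility."""
    return max(outcomes, key=lambda o: sum(u[o] for u in utilities))


before = best_by_sum([alice, bob])
after = best_by_sum([alice, rescale(bob, 10.0)])
print(before, after)
```

Bob's preferences are identical before and after rescaling, but the summed ranking flips from A to B. Any aggregation rule therefore has to bring in some further assumption about how to compare the agents' scales.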