Goal and utility are central ideas in the rational agent approach to AI, in which intelligence means achieving goals or, more explicitly, maximizing expected utility.
What goal or utility function should an AI choose? This question is meaningless in the rational agent framework. It's like asking what program a computer should run: the answer is that a universal computer can run any well-written program.
However, the rational agent is an idealization of real-world intelligence. The ACI model argues that pursuing a goal or maximizing utility is an imprecise description of intelligent behavior, which actually tries to follow the precedent. The rational agent model is a quasistatic approximation of the ACI model.
Building on this claim, we try to derive goals and utility functions from the principles of ACI.
(The previous version had many errors, so I have rewritten this chapter.)
Goals = Futures that resemble the precedent
The right thing for an ACI agent is to follow the precedent, while the right thing for a rational agent is to achieve goals. Since a goal is a desired future, we can speculate that the best goal-directed approximation to the ACI model is a desired future that follows the precedent.
Consider an agent G that keeps doing the right things. We call its record of doing the right things the precedent. It is reasonable to assume that if G continues to behave in the same way, it is likely to continue to do the right things in the environments it has experienced.
If G is goal-directed, its goal should be a future that resembles the precedent as closely as possible. If the precedent is a sequence made of observations and actions, the goals should be the best possible continuation of that sequence. This brings us to the conclusion:
Setting goals for an agent is the same process as predicting the sequence of the precedent.
A formal description is given in the appendix at the end of the post.
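The idea that setting a goal amounts to predicting the continuation of the precedent can be sketched with a toy predictor. This is only an illustration: it swaps the ideal predictor for simple successor counts, extends the sequence greedily, and the names `fit_successors` and `predict_goal` are hypothetical.

```python
from collections import Counter, defaultdict

def fit_successors(precedent):
    """Count how often each symbol follows each other symbol in the precedent."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(precedent, precedent[1:]):
        counts[prev][nxt] += 1
    return counts

def predict_goal(precedent, steps):
    """Greedily extend the precedent with the most frequent successor.

    Assumption: a first-order frequency model stands in for the ideal
    sequence predictor, which is not computable.
    """
    counts = fit_successors(precedent)
    future = []
    last = precedent[-1]
    for _ in range(steps):
        if not counts[last]:
            break  # no evidence about what follows this symbol
        last = counts[last].most_common(1)[0][0]
        future.append(last)
    return future

# Toy precedent: perception "x" and action "y" alternate.
precedent = list("xyxyxyxy")
goal = predict_goal(precedent, 4)
print(goal)
```

In this toy setting the goal is just the predicted continuation; a richer predictor would change the prediction, not the principle.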
A goal should have the following properties:
An agent can have multiple, possibly infinitely many, goals, because there is a goal for every future moment. Compromising between multiple goals can be difficult.
A goal may not always represent the right future even if it can be achieved. It has the highest probability of being the right future, but it can also turn out to be wrong.
When a goal is achieved, the agent may or may not be told whether things are actually right. Information about right and wrong can arrive in any form, including real-time feedback and delayed notification. For example, video game players may not learn whether they have won or lost until the end of a round. As a universal model of intelligence, ACI determines what is right without relying on any particular mechanism, be it natural selection or artificial control.
That's why an agent can't act directly on goals. It's more conventional to use expected utility to describe an agent's behavior.
Utility = The probability of doing the right things
People prefer thinking in goals, but working with utilities.
Since a goal should be assigned the highest expected utility among all possible worlds, and a goal is defined as a future world that has the highest probability of becoming a precedent/doing the right things, it is reasonable to define the expected utility in terms of the probability of becoming a precedent/doing the right things.
In other words, the utility of a future world is its probability of being the continuation of the precedent sequence.
It is easy to prove that ACI's definition of utility obeys the four axioms of VNM-rationality: completeness, transitivity, continuity, and independence. We can also define the total expected utility as a value.
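As a rough sketch of utility as the probability of being the continuation of the precedent, the snippet below scores every candidate future of length two. It assumes a two-symbol alphabet and a first-order model with add-one smoothing; both are illustrative choices rather than part of ACI, and the function names are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import product

ALPHABET = ("a", "b")  # assumed toy alphabet

def fit_successors(precedent):
    """Count symbol-to-symbol transitions observed in the precedent."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(precedent, precedent[1:]):
        counts[prev][nxt] += 1
    return counts

def utility(precedent, future):
    """Probability that `future` continues `precedent` under the toy model."""
    counts = fit_successors(precedent)
    prob = 1.0
    last = precedent[-1]
    for sym in future:
        total = sum(counts[last].values())
        # Add-one smoothing keeps unseen transitions at non-zero probability.
        prob *= (counts[last][sym] + 1) / (total + len(ALPHABET))
        last = sym
    return prob

precedent = list("abababab")
futures = list(product(ALPHABET, repeat=2))
scores = {f: utility(precedent, f) for f in futures}
best = max(scores, key=scores.get)  # the goal: highest-utility future
print(best, scores[best])
```

Because the scores are probabilities over all futures of the same length, they sum to one, and the goal is simply the future with the highest score.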
FAQs
Q: OK, following the precedent might be right, but what if the agent lives in a carefree scenario, where doing everything is right?
A: If doing everything is right, the agent is more likely to follow simple policies than complex ones, so the precedent is most likely to be a simple sequence, such as repeating one action or simply reacting reflexively to the environment. On the other hand, if we can find rather complicated structures in the precedent, it is highly unlikely that the agent is in a carefree situation.
Q: With well-defined utility functions, should ACI maximize the expected utility like a rational agent or AIXI?
A: Not really. In relatively stable environments, rational agents can serve as acceptable approximations of ACI agents. However, they are likely to encounter the alignment problem when faced with unanticipated scenarios:
As soon as the precedent receives new data points, the utility function changes, making it unsuitable for straightforward optimization.
Up to this point, we have been discussing ideal ACI agents that have unlimited computing power and memory and can achieve any possible future goal. However, real-world agents cannot perform Solomonoff Induction, because it is uncomputable. Only a constrained version of ACI, known as ACItl, can be implemented on practical computers. Whenever an ACItl agent's performance level improves, its approximation of the utility function changes.
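The first point above, that the utility function shifts whenever the precedent receives new data points, can be illustrated with a toy estimate. The sketch assumes utilities are estimated by add-one smoothed action frequencies, which is a stand-in chosen for brevity and not how ACI defines them.

```python
from collections import Counter

ACTIONS = ("left", "right")  # assumed toy action set

def utilities(precedent_actions, actions=ACTIONS):
    """Estimate each action's utility from its frequency in the precedent."""
    counts = Counter(precedent_actions)
    n = len(precedent_actions)
    return {a: (counts[a] + 1) / (n + len(actions)) for a in actions}

before = utilities(["left", "left", "right"])
# Two new data points join the precedent.
after = utilities(["left", "left", "right", "right", "right"])

print(max(before, key=before.get))
print(max(after, key=after.get))
```

An optimizer that froze `before` as its objective would keep preferring an action that the updated precedent no longer favors.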
Appendix:
Define History, World, and Precedent
We begin with formal definitions of history, world, and doing the right things.
An agent G interacts with an unknown environment in time cycles $k = 1, 2, 3, \ldots, t$. In cycle $k$, $x_k$ is the perception (input) from the environment, and $y_k$ is the action (output) of the agent.
Define the agent's interaction history as $h_{<t} \equiv x_1 y_1 x_2 y_2 \ldots x_{t-1} y_{t-1}$, and the possible worlds with history $h_{<t}$ as $w_{<n} \equiv x_1 y_1 x_2 y_2 \ldots x_{t-1} y_{t-1} \ldots x_{n-1} y_{n-1}$. Worlds are stratified by histories.
Let $H$ be the set of histories and $W$ the set of worlds. For any $h \in H$, there is a subset $W_h \subset W$ consisting of all worlds with history $h$ (Armstrong 2018).
Define the judgment function as a function from a world or a history to 1 or 0:
$$J: W \cup H \to \{0, 1\}$$
A history of doing the right things should have $J(h) = 1$.
We can define a precedent as a history of doing the right things:
Definition 1 (Precedent). A precedent is a history $h^*$ such that
$$\forall h^*_{<k} \subseteq h^*: \; J(h^*_{<k}) \equiv 1$$
Any sub-history $h^*_{<k}$ of a precedent is therefore also a precedent.
For an ACI agent, the precedent contains all the information we have about what is right, so it sets a standard for right things. A right future world, one that will become a precedent, should meet this standard.
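Definition 1 can be sketched in code as a check over all prefixes of a history. The judgment function `toy_judge` below is purely hypothetical, since the text deliberately leaves the mechanism behind J open; the traffic-light perceptions and actions are likewise invented for illustration.

```python
from typing import Callable, Sequence, Tuple

Step = Tuple[str, str]      # (perception x_k, action y_k)
History = Sequence[Step]

def is_precedent(history: History, judge: Callable[[History], int]) -> bool:
    """True if every prefix of the history is judged right (J = 1)."""
    return all(judge(history[:k]) == 1 for k in range(1, len(history) + 1))

def toy_judge(history: History) -> int:
    """Hypothetical J: wrong as soon as the agent answers 'red' with 'go'."""
    return int(all(not (x == "red" and y == "go") for x, y in history))

good = [("green", "go"), ("red", "stop"), ("green", "go")]
bad = good + [("red", "go")]

print(is_precedent(good, toy_judge))
print(is_precedent(bad, toy_judge))
```

Checking every prefix mirrors the property that any sub-history of a precedent is itself a precedent.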
Define Goals
A goal can be defined as a future world that has the highest likelihood of doing the right things or becoming a precedent.
Definition 2 (Goal). At time $m$, given a precedent $h^*_{<t}$, the goal for time $n$ ($n > m \geq t$) is a world $w^*_{<n} \in W_{h_{<m}} \subseteq W_{h^*_{<t}}$ such that
$$\forall w_{<n} \in W_{h_{<m}}: \; P(J(w^*_{<n}) = 1 \mid w^*_{<n}) \geq P(J(w_{<n}) = 1 \mid w_{<n})$$
There is a simple and intuitive theorem about goals:
Theorem 1: An agent's goal given a precedent $h^*$ equals the most probable continuation of the precedent sequence.
The proof is given at the end of the post. With this theorem, the goal calculation problem turns out to be a sequence prediction task. Following Hutter's AIXI, ACI uses Solomonoff Induction as an all-purpose sequence prediction tool. Solomonoff Induction considers all possible hypotheses about a sequence, and continuously updates the estimate of the probability of each hypothesis.
Define Utility Function, Values, and Reward
The utility function is defined as a function from worlds to real numbers:
$$U: W \to \mathbb{R}$$
Definition 3 (Expected Utility). The expected utility of any possible world $w_{<n} \in W_{h^*_{<t}}$ is its probability of doing the right thing:
$$U_{h^*_{<t}}(w_{<n}) \equiv P(J(w_{<n}) = 1 \mid w_{<n})$$
In other words, the utility of a world is its probability of doing the right thing, given that a known precedent was doing the right thing.
We will calculate the utility function using Solomonoff Induction in the last part of this article.
We can also define the total expected utility as the value:
Definition 4 (Value). The total expected utility, or value, for a policy $\pi$, history $h_{<n}$, and precedent $h^*_{<t} \subseteq h_{<n}$ is
$$V(h^*_{<t}, \pi, h_{<n}) = E^{\pi}_{h^*_{<t}}(h_{<n}) = \int_{w \in W_{h^*_{<t}}} U_{h^*_{<t}}(w) \, P(w \mid h_{<n})$$
where a policy $\pi$ for an agent is a map from histories to a probability distribution over actions, $\pi: H \to \Delta A$.
We then define the reward function as the difference between two total expected utilities (Armstrong 2018):
Definition 5 (Reward). The reward between two histories $h_{<m} \subset h_{<n}$ for a policy $\pi$ and precedent $h^*_{<t} \subseteq h_{<n}$ is
$$R(h^*_{<t}, \pi, h_{<n}, h_{<m}) = V(h^*_{<t}, \pi, h_{<n}) - V(h^*_{<t}, \pi, h_{<m})$$
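Definitions 4 and 5 can be sketched over a finite set of worlds, where the integral becomes a sum. The world labels, utilities, and probabilities below are made-up numbers for illustration, and the policy is left implicit in the world probabilities.

```python
def value(utility, world_probs):
    """Discrete stand-in for Definition 4: sum over worlds of U(w) * P(w | h)."""
    return sum(utility[w] * p for w, p in world_probs.items())

def reward(utility, probs_later, probs_earlier):
    """Definition 5: value after the longer history minus value after the shorter."""
    return value(utility, probs_later) - value(utility, probs_earlier)

# Assumed toy numbers: utilities given the precedent, and the probability
# of each world given the earlier and the later history.
U = {"w1": 0.9, "w2": 0.2, "w3": 0.5}
P_EARLIER = {"w1": 0.3, "w2": 0.3, "w3": 0.4}
P_LATER = {"w1": 0.8, "w2": 0.0, "w3": 0.2}

print(value(U, P_EARLIER), value(U, P_LATER), reward(U, P_LATER, P_EARLIER))
```

Here the later history makes the high-utility world more likely, so the reward for moving from the earlier history to the later one is positive.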
Proof of Theorem 1
According to Solomonoff Induction, the probability that $w$ is the future of the precedent sequence $h^*$, taking all hypotheses into account, is
$$M(w_{<n} = h^*_{<n} \mid h^*_{<t}) = \frac{M(h^*_{<n})}{M(h^*_{<t})}$$
where $M$ is the prior distribution over all possible worlds obtained by taking all the hypotheses into account:
$$M(x) \equiv \sum_{\mu \in \mathcal{M}_R} Q^{-H(\mu)} \mu(x)$$
Here $\mu$ is a semi-measure that assigns probabilities to sequences $x$, $\mathcal{M}_R$ is the set of all recursive semi-measures, $Q$ is the number of symbols in the sequences' alphabet, and $H(\mu)$ is the length of the shortest program that computes $\mu$ (Legg 1996).
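The mixture above is not computable, but its shape can be illustrated with a finite hypothesis class. In the sketch below, the three Bernoulli hypotheses and their description lengths are invented for illustration; only the weighting by $Q^{-H(\mu)}$ and the ratio used for prediction follow the formulas in the text.

```python
Q = 2  # alphabet size: binary sequences

def bernoulli(p):
    """Return a measure giving the probability of a bit string with P('1') = p."""
    def mu(x):
        prob = 1.0
        for bit in x:
            prob *= p if bit == "1" else 1.0 - p
        return prob
    return mu

# (assumed description length, measure) pairs; the lengths are made up.
HYPOTHESES = [(1, bernoulli(0.5)), (2, bernoulli(0.9)), (3, bernoulli(0.1))]

def M(x):
    """Finite stand-in for the mixture: sum of Q^(-length) * mu(x)."""
    return sum(Q ** (-length) * mu(x) for length, mu in HYPOTHESES)

def predict(continuation, observed):
    """Probability of `continuation` given `observed`: M(observed + continuation) / M(observed)."""
    return M(observed + continuation) / M(observed)

p_one = predict("1", "1111")
p_zero = predict("0", "1111")
print(p_one, p_zero)
```

After four ones, most of the weight has moved to the hypothesis that favors ones, so the mixture predicts another one with high probability.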
We cannot use this equation directly to predict the future precedent, because an agent may have more than one right choice, whereas a sequence has only one continuation.
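Let's consider a sequence $J^+$, in which a judgment variable $j_k = J(h_{<k+1})$ is inserted into a history or world sequence after each cycle $k$. For example:
$$J^+(h^*_{<t}) \equiv x_1 y_1 1 \, x_2 y_2 1 \ldots x_{t-1} y_{t-1} 1$$
$$J^+(h_{<t}) \equiv x_1 y_1 j_1 \, x_2 y_2 j_2 \ldots x_{t-1} y_{t-1} j_{t-1}$$
and, for $w_{<n} \in W_{h^*_{<t}}$,
$$J^+(w_{<n}) \equiv x_1 y_1 1 \, x_2 y_2 1 \ldots x_{t-1} y_{t-1} 1 \, x_t y_t j_t \ldots x_{n-1} y_{n-1} j_{n-1}$$
If $j_{n-1} = J(w_{<n}) = 1$ (in which case every $j$ from $j_t$ to $j_{n-1}$ equals 1), $w_{<n}$ is a world of doing the right thing. The problem of utility thus becomes a problem of sequence prediction: the utility of $w_{<n}$ is the probability that $j_{n-1} = 1$:
$$U_{h^*_{<t}}(w_{<n}) = \frac{P(J^+(w_{<n}) \cap J(w_{<n}) = 1)}{P(J^+(w_{<n}))} = P(J(w_{<n}) = 1 \mid J^+(w_{<n}))$$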
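The interleaved sequence J+ described above can be built mechanically from the steps and their judgments. The helper name `j_plus` and the string labels for perceptions and actions are hypothetical.

```python
def j_plus(steps, judgments):
    """Interleave each (perception, action) pair with its judgment bit."""
    if len(steps) != len(judgments):
        raise ValueError("need exactly one judgment per step")
    seq = []
    for (x, y), j in zip(steps, judgments):
        seq.extend([x, y, j])
    return seq

# The precedent part is judged right (1) at every step; the future part
# carries judgments that are not known in advance (here a hypothetical 0).
precedent_steps = [("x1", "y1"), ("x2", "y2")]
future_steps = [("x3", "y3")]
seq = j_plus(precedent_steps + future_steps, [1, 1] + [0])
print(seq)
```

Predicting the last judgment symbol of such a sequence is then an ordinary sequence prediction problem.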
We can now prove that a goal given a precedent $h^*$ equals the most probable continuation of the precedent sequence.
Let $w'_{<n}$ be one of the $w_{<n} \in W_{h^*_{<t}}$ with the highest probability of being the continuation of the precedent sequence, which means:
$$\forall w_{<n} \in W_{h^*_{<t}}: \; P(w'_{<n} \mid h^*_{<t}) \geq P(w_{<n} \mid h^*_{<t})$$
and, because $w_{<n} \in W_{h^*_{<t}}$,
$$P(w'_{<n}) \geq P(w_{<n})$$
We also know that the $j$s in $J^+(w'_{<n})$ and $J^+(w_{<n})$, all equal to 1, could be the output of a program of fixed length and so have a fixed effect on the prior probability of a sequence. Then:
$$P(J^+(w_{<n}) \cap J(w_{<n}) = 1) = P(w_{<n}) - C_1$$
$$P(J^+(w_{<n})) = P(w_{<n}) - C_2$$
with $C_1 > C_2$. It follows that, for all $w_{<n} \in W_{h^*_{<t}}$,
$$\frac{P(J^+(w'_{<n}) \cap J(w'_{<n}) = 1)}{P(J^+(w'_{<n}))} \geq \frac{P(J^+(w_{<n}) \cap J(w_{<n}) = 1)}{P(J^+(w_{<n}))}$$
which is equivalent to
$$U_{h^*_{<t}}(w'_{<n}) \geq U_{h^*_{<t}}(w_{<n})$$
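The step from the two expressions with C1 and C2 to the final inequality can be checked with a short calculation. Write p for the prior probability of a world, and assume p is greater than C2 so that the denominators are positive (this condition is an added assumption, not stated in the proof):

```latex
f(p) = \frac{p - C_1}{p - C_2},
\qquad
f'(p) = \frac{(p - C_2) - (p - C_1)}{(p - C_2)^2}
      = \frac{C_1 - C_2}{(p - C_2)^2} > 0
\quad \text{for } C_1 > C_2,\; p > C_2 .
```

Since f is increasing on this range, a world with a higher prior probability also has a higher ratio, which is the inequality between the two utilities.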