Context-independence. We'll say that a consequentialist optimiser ψ=λu:X→R.{x∈X|u(x)∈q(u)} is context-independent if Image(u)=Image(u′)⟹q(u)=q(u′).
Context-independence is a stronger condition than consequentialism — this condition says that, once we know which payoffs are achievable in the agent's task, then the only thing relevant to the optimality of a particular option is its payoff. For example, argmaxX is context-independent, but better-than-averageπ is not.
is not context-independent according to the given definition.
Consider tasks , with and . Then , despite .
I guess the correct definition says instead of .
Wow! Happy to see this exposition. I've been a fan of this framework for a while now. Happy to see it is finally percolating to the alignment community!
Yes!
In a subsequent post, everything will be internalised to an arbitrary category with enough structure to define everything. The words set and function will be replaced by object and morphism. When we do this, P will be replaced by an arbitrary commutative monad M.
In particular, we can internalise everything to the category Top. That is, we assume the option space X and the payoff space R are equipped with topologies, the tasks will be continuous functions u:X→R, and optimisers will be continuous functions ψ:(X→R)→M(X), where (X→R) is the function space equipped with the pointwise topology and M is a monad on Top.
In the literature, everything is done with galaxy-brained category theory, but I decided to postpone that in the sequence for pedagogical reasons.
(2) Moral antirealism.
The subjective account (again, read too literally) suggests that values are out there in the world, that the environment contains entities called utilities which all rational agents seek, that all conflict is disagreement, that correctness is a property of pebble heaps, that microeconomics is normative, and (most concerning of all) that the primary obstacle to building a safe superintelligence is writing down a utility function.
The objective account, I think, is more moral antirealist. It says, "The world contains only paperclips and happy humans, never utilities! The world contains only paperclip-maximisers and happy-human-maximisers, never utility-maximisers!"
Re the issue of the subjective account causing us to focus too much on moral realism and the idea that values are out there in the world, a few responses.
Thus, whenever we say that something is a utility maximizer, we don't say which specific utility it is maximizing; but we can concretize the utility function by saying that it is a paperclip maximizer, and then say much more about it.
I do think there's too much focus on utility functions as a general class, but that's due to practical concerns, like it being hard to prove what an agent is going to do, and people on LW often assuming their results go further than they do without naming the specific utility function being maximized.
Exercise 1:
The empty set is the only one. For any nonempty set X, you could pick a constant function, e.g. u(x)=0, as a counterexample: then argmaxX(u)=X≠∅.
Exercise 2:
The agent will choose an option which scores better than the threshold α.
It's a generalization of satisficers: the latter are thresholders (with α=u(s)) whose output is always nonempty.
Exercise 3:
Exercise 4:
I have discovered a truly marvelous-but-infinite solution of this, which this finite comment is too narrow to contain.
Exercise 5:
The generalisable optimisers are the following:
i.e. argmin will choose the minimal points.
i.e. satisfice will choose an option which dominates some fixed anchor point s. Note that since R is only equipped with a preorder, the optimiser might be more selective (if ≤ is not total, it's harder to find an option better than the anchor). More interestingly, if there are options indifferent to the anchor (some x with u(x)≤u(s) and u(s)≤u(x)), it could choose one of them rather than the anchor even though there is no gain in doing so. This degree of freedom might not be desirable.
Exercise 6:
Interesting problem.
First of all, is there a way to generalize the trick?
The first idea would be to try to find some equivalent of the detested element ⊥. For context-dependent optimisers such as better-than-average, there isn't one.
A more general way to try to generalize the trick would be the following question:
For some fixed ψ, Xlegal⊆X, and u:X→R, could we find some other u′:X→R such that u′|Xlegal=u|Xlegal and ψ(u′)⊆Xlegal?
i.e. is there a general way to replace values outside of Xlegal without modifying the result of the constrained optimisation?
Answer: no. Counter-example: a certain optimiser on some infinite set X with finite nonempty sets Xlegal and R.
So it seems there is no general trick here. But why bother? We should refocus on what we mean by constrained optimisation in general, and it has nothing to do with looking for some u′. What we mean is that values outside Xlegal are totally irrelevant.
How? For any of the current examples we have here, what we actually want is not ψ(u′), but ψXlegal(u|Xlegal): we only apply the optimiser on the legal set.
Problem: in the current formalism, an optimiser has type (X→R)→P(X), so I don't see an obvious way to define the "same" optimiser on a different X. argmaxX and the others here are implicitly parametrized by X, so it's not that much of a problem, but we have to specify this. This suggests looking for categories (e.g. for argmax...).
I'd like to read your solution to exercise 6, could you add math formatting? I have a hard time reading latex code directly.
You can do that with the visual editor mode by selecting the math and using the contextual menu that appears automatically, or with $ in the Markdown editor.
There are $ in your comment, so I guess you inadvertently typed in visual mode using the Markdown syntax.
I think this part uses an unfair comparison:
Suppose that X={x1,…,xn} and W+={w1,…,wm} are small finite sets. A task f:X→W+ can be implemented as a dictionary whose keys lie in X and whose values lie in W+, which uses nlogm bits. The functional ψ∈JP(X,W+) can be implemented as a program which receives input of type Dict[X,W] and returns output of type List[X]. Easy!
In the subjective account, by contrast, the task f:X→R requires infinite bits to specify, and the functional ψ∈JP(X,R) must somehow accept a representation of an arbitrary function f:X→R. Oh no! This is especially troubling for embedded agency, where the agent's decision theory must run on a physical substrate.
If X and W+ are small finite sets, then any behavior can be described with a utility function requiring only a finite number of bits to specify. You only need to use R as the payoff space when W+ is infinite, such as when outcomes are continuous, in which case the dictionaries require infinite bits to specify too.
I think this is representative of an unease I have with the framing of this sequence. It seems to be saying that the more general formulation allows for agents that behave in ways that utility maximizers cannot, but most of these behaviors exist for maximizers of certain utility functions. I'm still waiting for the punchline of what AI safety relevant aspect requires higher order game theory rather than just maximizing agents, particularly if you allow for informational constraints.
Likewise, higher-order game theory promises a nominalisation of game theory.
I don't know why, but this smells correct to me.
I have the same intuition - I agree that this smells correct. I suspect that the answer has something to do with the thing where when you take measurements (for whatever that should mean) of a system, what that actually looks like is some spectacularly non-injective map from [systems of interest in the same grouping as the system we're looking at] to R, which inevitably destroys some information, by noninjectivity.
So I agree that it smells right to quantify over the actual objects of interest and only then apply information-destroying maps that take you outside spacetime (into math), rather than applying your quantifiers outside of spacetime after you've already applied your map and crossing your fingers that the map is as regular/well-behaved as it needs to be.
Exercise 6:
I repeat the construction using each element of R in turn in place of ⊥.
For each such candidate, ψ′ spits out a set. I pick the candidate producing the set which is closest to Xlegal.
For finite sets, I could look at the size of the intersection; more generally I'll need some metric on P(X) to say what it means to be close to Xlegal.
Does that make sense?
Even with finite sets, it doesn't work, because the idea of looking for the set "closest to Xlegal" is not what we're looking for.
Let X={Alice,Bob,Charlie} be a class of students, scoring within some real range R. Let ψ be the (uniform) better-than-average optimizer, standing for the professor picking any student who scores better than the mean.
Let Xlegal={Alice,Bob} (the professor despises Charlie and ignores him totally).
If u(Alice) = 5 and u(Bob) = 4, their average is 4.5 so only Alice should be picked by the constrained optimisation.
However, with your proposal, you run into trouble with u′ defined as u′(Alice)=u(Alice), u′(Bob)=u(Bob), and u′(Charlie)=0.
The average value for this u′ is (5+4+0)/3=3, and both Alice and Bob score better than 3: ψ(u′)={Alice,Bob}. The size of the intersection with Xlegal is then maximal, so your proposal suggests picking this set as the result. But the actual result should be {Alice}, because the score of Charlie is irrelevant to the constrained optimisation.
Ok, take 2.
If I understand correctly, what you want must be more like "restrict the domain of the task before plugging it into the optimiser," and less like "restrict the output of the optimiser."
I don't know how to do that agnostically, however, because optimisers in general have the domain of the task baked in. Indeed the expression for a class of optimisers is JP(X,R), with X in it.
Considering better-than-average optimisers from your example, they are a class with a natural notion of "domain of the task" to tweak, so I can naturally map any initial optimiser ψ∈JP(X,R) to a new one ψ′∈JP(Xlegal,R) with a restricted task domain, by taking the mean over Xlegal.
But given an otherwise unspecified ψ∈JP(X,R), I don't see a natural way to define a corresponding ψ′∈JP(Xlegal,R).
Assuming there's no more elegant answer than filtering for that, the question must be: is there another minimally restrictive class of optimisers with such a natural notion, which is not the one with the "detested element" already proposed by the OP?
Try 1: consequentialist optimisers, plus the assumption u(Xlegal)=u(X), i.e., the legal moves do not restrict the possible payoffs. Then, since the optimiser picks options only through their payoffs, for each payoff r I can delete illegal options from the preimage without creating new broken outputs. However, this turns out to be just filtering, so it's not an interesting case.
Try 2: the minimal distillation of try 1 is that the output either is empty or already contains legal moves, and then I filter, so yeah, not an interesting idea.
Try 3: invariance under permutation of something? A task invariant under permutation of X is just a constant task. An optimiser "invariant under permutation of X" does not even mean anything.
Try 4: consider a generic map ψ:(X→R)→P(X). This does not say anything; it's the baseline.
Try 5: analyse the structure of a specific example. The better-than-average class of optimisers is better-than-averageπ=λu:X→R.{x∈X|u(x)≥Eπ[u]}. It is consequentialist but not context-independent. I can't see how to generalize something this specific here.
Time out.
Written during the SERI MATS program under the joint mentorship of John Wentworth, Nicholas Kees, and Janus.
Preface
In classical game theory, we characterise agents by a utility function and assume that agents choose options which cause maximal utility. This is a pretty good model, but it has some conceptual and empirical limitations which are particularly troublesome for AI safety.
Higher-order game theory (HOGT) is an attempt to rebuild game theory without appealing to either utility functions or maximisation. I think Higher-Order Game Theory is cool so I'm writing a sequence on it.
I'll try to summarise the relevant bits of the literature, present my own minor extensions, and apply HOGT to problems in AI safety.
You're reading the first post! Let's get into it.
The role of argmax
For each set X, let argmaxX:(X→R)→P(X) be the familiar function which receives a function u:X→R and produces the set of elements which maximise u. A function like argmaxX is sometimes called a higher-order function or functional, because it receives another function as input.
Explicitly, argmaxX=λu:X→R.{x∈X|∀x′∈X,u(x)≥u(x′)}.[1]
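Here's a minimal Python sketch of argmaxX as a higher-order function, assuming a small finite X (the values below are illustrative):

```python
# Minimal sketch: argmax as a higher-order function (X -> R) -> P(X).
# X is a small finite set; u is an ordinary Python function standing in for a task.

def argmax(X, u):
    """Return the set of elements of X which maximise u."""
    return {x for x in X if all(u(x) >= u(y) for y in X)}

X = {1, 2, 3, 4}
u = lambda x: -(x - 3) ** 2   # peaks at x = 3
print(argmax(X, u))           # {3}
```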
As you all surely know, argmax plays a central role in classical game theory. Typically we interpret the set X as the agent's options,[2] and the function u:X→R as the agent's task, which assigns a payoff u(x)∈R to each option x∈X. We say an option x∈X is optimal to the agent for the task u:X→R whenever x∈argmaxX(u). Classical game theory is governed by the assumption that agents choose optimal options in whatever task they face, where optimality strictly means utility-maximisation.
Due to the presence of the powerset operator P in argmaxX:(X→R)→P(X), this model of the agent is possibilistic — for each task u:X→R, our model says which options are possibly chosen by the agent. The model doesn't say which options are probably chosen by the agent — for that we'd need a function (X→R)→Δ(X). Nor does the model say which options are actually chosen by the agent — for that we'd need a function (X→R)→X.[3]
Exercise 1: Find a set X such that argmaxX(u)=∅ for every function u:X→R.
Generalising the functional
The function argmaxX is a particular way to turn tasks into sets of options, i.e. it has the type-signature (X→R)→P(X). But there are many functions with the same type-signature (see the table below), so a natural question to ask is... What if we replace argmaxX in classical game theory with an arbitrary functional ψ:(X→R)→P(X)?
What we get is higher-order game theory.[4] Surprisingly, we can recover many game-theoretic concepts in this more general setting. We can typically recover the original classical concepts from the more general higher-order concepts by restricting our attention to either ψ=argmaxX or ψ=argminX.
So let's revise our definition —
In higher-order game theory, we model the agent's options with a set X and model their task with a function u:X→R. But (unlike in classical game theory) we're free to model the agent's optimisation with any functional ψ:(X→R)→P(X). I hope to persuade you that this additional degree of freedom is actually quite handy.[5]
Higher-order game theory is governed by the central assumption that agents choose ψ-optimal options in whatever ψ-tasks they face, where ψ is our model of the agent's optimisation. If we observe the agent choosing an option x∈ψ(u) then that would be consistent with our model, and any observation of a choice x∉ψ(u) would falsify our model.[6]
Anyway, here is a table of some functionals and their game-theoretic interpretation —
This agent will choose an option x∈X which dominates some fixed option s∈X. The option s is called the anchor point. It might represent the "default" option, or the "do nothing" option, or the "human-approved" option.
According to Herbert Simon, satisficing is an accurate model of human and institutional decision-making.
Utility-satisficers are a kind of mild optimiser, which might be a safer way to build AI than a full-blown utility-maximiser.
This agent will choose an option in a fixed subset S⊆X, regardless of the task, e.g. DefectBot and CooperateBot.[7]
Using rockS, we can model a non-agent as a special (degenerate) case of an agent.
This agent will choose an option x∈X in the top ϵ quantile, given that the option space is equipped with a distribution π∈Δ(X).
This is the possibilistic version of Jessica Taylor's quantiliser. Her original probabilistic version is a function (X→R)→ΔX, and so wouldn't count as an optimiser according to Definition 2.[8]
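For concreteness, here's a rough Python sketch of two of these functionals, satisfices and rockS, over a small finite option space (the details are illustrative):

```python
# Rough sketches of two functionals of type (X -> R) -> P(X).

def satisfice(X, s):
    """The optimiser which accepts any option scoring at least as well as the anchor s."""
    def psi(u):
        return {x for x in X if u(x) >= u(s)}
    return psi

def rock(X, S):
    """The optimiser which accepts exactly the options in S, regardless of the task."""
    def psi(u):
        return set(S)
    return psi

X = {1, 2, 3, 4}
u = lambda x: x               # higher options score better
print(satisfice(X, 2)(u))     # {2, 3, 4}
print(rock(X, {1})(u))        # {1}
```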
Generalising the payoff space
Now let's generalise the payoff space to an arbitrary set R, not only the real numbers. We will think of the elements of R as payoffs in a general sense, relaxing the assumption that the payoffs assigned to the options are real numbers. The function u:X→R describes which payoff u(x)∈R would result from the agent choosing the option x∈X.
This is significantly more expressive! When we are tasked with modelling a game-theoretic situation, we can pick any set R to model the agent's payoffs![9]
I'll use the notation JP(X,R) to denote the set of functionals (X→R)→P(X), e.g. argmaxZ∈JP(Z,R).
Anyway, here is a table of some functionals and their game-theoretic interpretation —
An optimiser ψ∈JP(X,R∪{−∞,+∞}) must be well-defined for infinitary utility functions u:X→R∪{−∞,+∞}.
For example, u(x) might be the expected total winnings of a gambler employing the gambling strategy x∈X. The gambler themself is modelled by an optimiser ψ:(X→R∪{−∞,+∞})→P(X). This functional ψ is characterised by its attitude towards infinite expected profit/loss.
The Levi-Civita field contains infinitesimals like ϵ,ϵ2,2ϵ+ϵ2,√ϵ , as well as infinite values like ϵ−1,ϵ−2,ϵ1/3+ϵ−1/3+2. In infinite ethics, we encounter tasks u:X→Levi-Civita Field, and we can model the infinitary ethicist by an optimiser ψ∈JP(X,Levi-Civita Field).
Exercise 4: Solve infinite ethics.
An optimiser ψ∈JP(X,Rn) can model different multi-objective optimisers. For example, there's an optimiser which, given multi-objective task u:X→Rn, returns those options which are maximal according to the lexicographic ordering, and there's another optimiser which uses the product ordering instead.[10]
Later in this post, we'll encounter an optimiser in JP(X1×⋯×Xn,R) which returns the nash equilibria of n-player games, where a task for this optimiser is an n-player payoff function u:X1×⋯×Xn→R.
The field of cooperative bargaining is concerned with different optimisers JP(X+1,Rn). A bargaining task f:(X+1)→Rn parameterises both the feasibility set F={f(x)∈Rn:x∈X} and the disagreement point d=f(⋆)∈Rn where 1={⋆} is the singleton set.
Suppose that R is any set equipped with a preorder ≤.[11] Then a function u:X→R will induce a preorder ≤u on X via x≤ux′⟺u(x)≤u(x′).
Let argmaxX∈JP(X,R) be the optimiser which chooses the maximal points of ≤u, i.e. options which aren't strictly dominated by any other options.
Explicitly λu:X→R.{x∈X:∀x′∈X.u(x)≤u(x′)⟹u(x′)≤u(x)}
If ≤ isn't total (i.e. the agent has incomplete preferences over the payoffs) then the resulting optimiser argmaxX is less selective (i.e. more options are optimal). In the extreme case, where no options are comparable, then argmaxX might choose any option.[12]
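Here's a rough Python sketch of this generalised argmaxX (the maximal-point optimiser), assuming the payoff space is pairs of numbers under the product (pareto) preorder:

```python
# Rough sketch: argmax over a preordered payoff space.
# Here payoffs are pairs of numbers with the product (pareto) preorder.

def leq(r1, r2):
    """Product preorder on pairs: r1 <= r2 iff both coordinates are <=."""
    return r1[0] <= r2[0] and r1[1] <= r2[1]

def argmax_preorder(X, u, leq):
    """Options whose payoff isn't strictly dominated by any other option's payoff."""
    return {x for x in X
            if all(leq(u(y), u(x)) for y in X if leq(u(x), u(y)))}

X = {"a", "b", "c"}
u = {"a": (1, 0), "b": (0, 1), "c": (0, 0)}.__getitem__
print(argmax_preorder(X, u, leq))   # {'a', 'b'}, the pareto frontier
```

Since the payoffs of 'a' and 'b' are incomparable, both count as maximal, exactly the "less selective" behaviour described above.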
Exercise 5: Which optimisers ψ∈JP(X,R) defined in the previous table can be generalised to any preorder (R,≤)?
Occasionally the same set X will serve as both the option space and the payoff space. In this case, a task u:X→X represents some transformation of the underlying option space.
There's an optimiser fix∈JP(X,X), which chooses options which are fixed-points of u:X→X. That is, fix=λu:X→X.{x∈X|u(x)=x}. We can use fix to model a conservative agent who chooses options which remain untransformed by the task. Note that this optimiser is not consequentialist, because the optimality of an option is not determined by its payoff alone. For example, 0∈fixR(sin) but π∉fixR(sin), despite the fact that sin(0)=sin(π).
Subjective vs objective optimisers
It's standard practice, when modelling agents and their environments, to use payoff spaces like R, Rn, Δ(R), etc, but I think this can be misleading.
Consider the following situation —
Now, let p:W+→R be the function which counts the number of paperclips in a light-cone, and let h:W+→R be the function which counts the number of happy humans.
Here's what classical game theory says about your predicament —
I call this a subjective account, because the robot's task depends on the robot's preferences. Were the robot to have different preferences, then they would've faced a different task, and because you don't know the robot's preferences you don't know their task.
However, by exploiting the expressivity of higher-order game theory, we can offer an objective account which rivals the subjective account. In the objective account, the task that the robot faces doesn't depend on the robot's preferences —
Notice that both accounts yield the same solution! Nonetheless, I think the objective account is nicer for four reasons. (Feel free to skip if you're convinced.)
Disclaimer: Admittedly, the distinction between subjective accounts — where payoff spaces are stuff like R, Rn, R∪{−∞,∞}, Z, Δ([0,1]2), etc. — and objective accounts — where payoff spaces are stuff like future light-cones, or brain states, or pixel configurations, etc. — is an informal (and somewhat metaphysical) distinction, but hopefully you can see what I'm pointing at.
(1) Carve nature at its joints.
The objective account, where R=W+, bears a closer structural resemblance to the physical reality. The physical robot corresponds to the functional ψ∈JP(X,W+) and the physical environment corresponds to the function u:X→W+. Notably, all the information about the robot's idiosyncratic preferences is bundled up inside the functional ψ.
In contrast, in the subjective account, where R=R, the functional ψ∈JP(X,R) contains almost no substantial information about the agent itself. It suggests (if read too literally) that all agents are basically indiscernible, and they behave differently because they face different environments.
(2) Moral antirealism.
The subjective account (again, read too literally) suggests that values are out there in the world, that the environment contains entities called utilities which all rational agents seek, that all conflict is disagreement, that correctness is a property of pebble heaps, that microeconomics is normative, and (most concerning of all) that the primary obstacle to building a safe superintelligence is writing down a utility function.
The objective account, I think, is more moral antirealist. It says, "The world contains only paperclips and happy humans, never utilities! The world contains only paperclip-maximisers and happy-human-maximisers, never utility-maximisers!"
(3) Experimental independence
In the objective account, the task f:X→W+ and the optimiser ψ:(X→W+)→P(X) have independent semantic meaning. At least in principle, I know how to find f:X→W+ independently of ψ — namely by inspecting the physical dynamics of the robot's environment or inspecting the robot's world-model. And I know how to find ψ:(X→W+)→P(X) independently of f — namely by placing the robot in different physical environments and observing their choices.
By contrast, in the subjective account, the task f:X→R and the optimiser ψ:(X→R)→P(X) have no independent meaning — they merely exist to compress the optimality condition ψ(f)⊆X. What would it even mean for the robot to possess the utility function u:X→R without the presumption that they maximise utility? I've honestly no clue. And without the task u:X→R, how would I determine the robot's optimiser ψ:(X→R)→P(X) experimentally? Presumably I should vary the task u:X→R, however I can't do this experimentally because u:X→R contains the robot's preferences, which are a variable outside my control.
Granted, for most historic applications of classical game theory, we do know the preferences of the agent — we already know that White wants to checkmate Black, and the consumer wants cheaper goods, and the statistician wants to make accurate predictions, etc. — so it doesn't matter whether one sticks those preferences in the task or the optimiser. But in AI safety, a big chunk of our perplexity comes from the preferences of the agents. So it matters more that we stick those preferences in the right part of our model.
(4) No spooky reals.
The subjective account seems to rely on the elements of a mysterious set called "R" which is extraneous to the phenomenon under consideration. By contrast, the objective account refers only to the sets X and W+, where the elements of X and W+ are physical stuff intrinsic to the situation being modelled. Hence, higher-order game theory promises to dispense with R from game theory, along with argmax and utility functions, ensuring the weirdness of R doesn't contaminate our game theory.[13]
This has a computational upshot as well.
Suppose that X={x1,…,xn} and W+={w1,…,wm} are small finite sets. A task f:X→W+ can be implemented as a dictionary whose keys lie in X and whose values lie in W+, which uses nlogm bits. The functional ψ∈JP(X,W+) can be implemented as a program which receives input of type Dict[X,W] and returns output of type List[X]. Easy!
In the subjective account, by contrast, the task f:X→R requires infinite bits to specify, and the functional ψ:JP(X,R) must somehow accept a representation of an arbitrary function f:X→R. Oh no! This is especially troubling for embedded agency, where the agent's decision theory must run on a physical substrate.
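To illustrate, here's a hedged sketch of the objective account as code when X and W+ are small finite sets (the option names and world-states below are made-up placeholders):

```python
# Sketch of the objective account for small finite sets.
# The option names and world-states are illustrative placeholders.

from typing import Dict, List

X = ["press_button", "do_nothing", "self_destruct"]
W = ["lightcone_with_paperclips", "lightcone_with_happy_humans", "empty_lightcone"]

# A task is a finite dictionary X -> W (roughly n log m bits).
task: Dict[str, str] = {
    "press_button": "lightcone_with_paperclips",
    "do_nothing": "empty_lightcone",
    "self_destruct": "empty_lightcone",
}

# An optimiser is a program Dict[X, W] -> List[X]. This one bundles the robot's
# idiosyncratic preferences inside itself: it accepts exactly the options whose
# payoff is a paperclip-filled lightcone.
def paperclip_maximiser(u: Dict[str, str]) -> List[str]:
    return [x for x, w in u.items() if w == "lightcone_with_paperclips"]

print(paperclip_maximiser(task))   # ['press_button']
```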
Recovering utility functions
According to the objective account, what is fundamental about an agent is the functional ψ:(X→W+)→P(X) where W+ is some objective payoff, and the claim that the agent has a utility function v:W+→R is understood as the claim that ψ can be approximately decomposed into argmaxX(v∘−). Hence, the existence of a utility-decomposition of ψ is an additional fact about the agent to be discovered, rather than an assumption that should be baked into the formalism itself.
Utility functions are an emergent property of the underlying functional.
One clue that utility functions are emergent properties is that they aren't unique! It's well-known that a utility function v:W+→R for an agent is only well-defined modulo positive-affine transformation, i.e. there is no meaningful distinction between v:W+→R and v′:W+→R whenever v′=α⋅v+β for some α∈R+,β∈R. This fact falls out immediately from the objective-first view, because argmaxX(v∘−) and argmaxX(v′∘−) are equal functionals whenever v′=α⋅v+β.
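To spell out the check, using only the definitions above: if v′=α⋅v+β with α∈R+, then for every task f:X→W+ and options x,x′∈X we have v′(f(x))≥v′(f(x′)) ⟺ α⋅v(f(x))+β≥α⋅v(f(x′))+β ⟺ v(f(x))≥v(f(x′)), and hence argmaxX(v∘f)=argmaxX(v′∘f).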
Now, if we were dealing with argmax-ϵ-slack or threshα — instead of argmax — then there would be a meaningful difference between some utility functions which are equivalent modulo positive-affine transformation.
Let's make this notion precise —
Typically, ψ is some objective optimiser and Φ is some subjective optimiser. When Φ=argmaxX then we obtain the classical utility functions of an objective optimiser ψ, and we may obtain non-classical utility functions of the same optimiser ψ by considering (e.g.) Φ=satisfices∈JP(X,R) or Φ=better-than-averageπ∈JP(X,R) or whatever.
Classical game theory is the study of optimisers with classical utility functions. There are some theoretical and empirical arguments for restricting only to such optimisers but these arguments are probably overrated. In any case, I suspect that unifying deep learning and classical game theory will require studying non-classical agents. Here's why — in the deep learning paradigm, we build agents by training a large neural network with stochastic gradient descent on tasks which fortify agentic-like behaviour. At initialisation, these neural networks aren't classical agents, and classicality emerges incrementally, probably after passing through phases of nonclassical agency. Therefore, if we want to account for the emergence of agency (classical or otherwise), then we need to account for the loss gradient over the entire space of optimisers JP(X,W+), not merely over the subspace of JP(X,W+) corresponding to classical optimisers.
Some properties of optimisers
We can formalise various properties and operations of optimisation using an arbitrary functional ψ∈JP(X,R).
I've included the list of examples below for illustrative purposes only —
We'll say that an optimiser ψ1 is (weakly) more selective than ψ2 if and only if ∀u:X→R.ψ1(u)⊆ψ2(u). This relation defines a partial order on JP(X,R). For example, satisficeA is more selective than satisficeB whenever B⊊A⊆X.
Mild optimization, an approach for mitigating Goodhart's law, replaces argmax with less selective optimisers.
We'll say that an optimiser ψ∈JP(X,R) is consequentialist if ψ=λu:X→R.{x∈X|u(x)∈q(u)} for some q:(X→R)→P(R).[14]
In other words, for any task u:X→R, if u(x)=u(x′) then x∈ψ(u)⟺x′∈ψ(u).
This condition says that, once we know the agent's task, then the only thing relevant to the optimality of a particular choice is its payoff. For example, argmaxX and better-than-averageπ are consequentialist, but rockS and fix are not.
This function q says, for each task u:X→R, which payoffs would be acceptable to the agent, so q=max is the quantifier for ψ=argmaxX.
We'll say that a consequentialist optimiser ψ=λu:X→R.{x∈X|u(x)∈q(u)} is context-independent if Image(u)=Image(u′)⟹q(u)=q(u′).
Context-independence is a stronger condition than consequentialism — this condition says that, once we know which payoffs are achievable in the agent's task, then the only thing relevant to the optimality of a particular option is its payoff. For example, argmaxX is context-independent, but better-than-averageπ is not.
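For instance, take X={x1,x2,x3} with π uniform, and tasks u,u′ with (u(x1),u(x2),u(x3))=(0,0,1) and (u′(x1),u′(x2),u′(x3))=(0,1,1). Then Image(u)=Image(u′)={0,1}, but the π-averages are 1/3 and 2/3, so q(u)≠q(u′): the achievable payoffs coincide while the acceptable payoffs differ, which is exactly how better-than-averageπ fails context-independence.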
Suppose that ψ∈JP(X,R) is an optimiser, and Xlegal⊆X is a subset of the options which are safe/valid/legal. Then we can define another optimiser ψ′=λu:X→R.Xlegal∩ψ(u)∈JP(X,R) who always chooses options in Xlegal.
This operation captures the notion of filtering options after the agent has applied the optimisation. For example, if X=R and ψ=argmaxR then ψ(cos)={2πk:k∈Z}, so if Xlegal=[0,4π] then ψ′(cos)={0,2π,4π}.
The filtering operation doesn't capture what we typically mean by optimisation within side-constraints. For example, if we change Xlegal to [π/4,π/2], then we would like the optimiser to choose π/4 as this maximises cos subject to the constraint Xlegal. The filtered optimisation would produce the empty set, as none of the options optimal to argmaxR are legal.
Let's define constrained optimisation for an optimiser ψ∈JP(X,R). Suppose that there is an element ⊥∈R such that ⊥∈u(ψ(u))⟹Im(u)⊆{⊥}. Informally, this means that ψ detests the payoff ⊥∈R and would never choose an option resulting in ⊥ unless it must. For example, argmaxX∈JP(X,R∪{−∞,+∞}) detests the payoff −∞.
We can define constrained optimisation as follows: ψ′=λu:X→R.ψ(u′), where u′:X→R sends x↦u(x) if x∈Xlegal and x↦⊥ otherwise.
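Here's a hedged Python sketch of filtering versus constraining, using argmax over R∪{−∞} with ⊥=−∞ as the detested payoff (the sets and task are illustrative):

```python
# Sketch: filtering vs constraining an argmax optimiser, with bottom = -inf
# playing the role of the detested payoff.
import math

def argmax(X, u):
    return {x for x in X if all(u(x) >= u(y) for y in X)}

def filtered(X, X_legal, u):
    """Optimise first, then throw away the illegal options."""
    return X_legal & argmax(X, u)

def constrained(X, X_legal, u, bottom=-math.inf):
    """Replace illegal payoffs with the detested payoff, then optimise."""
    u_prime = lambda x: u(x) if x in X_legal else bottom
    return argmax(X, u_prime)

X = {0, 1, 2, 3}
u = lambda x: x                      # unconstrained argmax picks {3}
X_legal = {0, 1}
print(filtered(X, X_legal, u))       # set(): the best options aren't legal
print(constrained(X, X_legal, u))    # {1}: the best *legal* option
```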
Exercise 6: How might we define constrained optimisation if there is no such ⊥∈R?
Maybe the set of legal options Xlegal∈P(X) depends on the task u:X→R itself. This dependency itself corresponds to an optimiser ψlegal∈JP(X,R).
We can define the filtered optimiser as ψ′=λu:X→R.ψlegal(u)∩ψ(u).
We can define the constrained optimiser as ψ′=λu:X→R.ψ(u′), where u′:X→R sends x↦u(x) if x∈ψlegal(u) and x↦⊥ otherwise.
The optimiser ψ′ behaves like one agent ψ being supervised by another agent ψlegal, where the mode of supervision is filtering or constraining respectively.
For a collection of optimisers Z⊆JP(X,R), we can define a single optimiser λu:X→R.⋂{ψ(u):ψ∈Z} who will only choose an option if each constituent optimiser would also choose the option, forming a unanimous coalition of Z.
Dually, we can define the optimiser λu:X→R.⋃{ψ(u):ψ∈Z} forming a unilateral coalition.
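These coalition operations are one-liners in code; here's a rough sketch (continuing the earlier Python examples, with made-up constituent optimisers):

```python
# Rough sketch: unanimous and unilateral coalitions of optimisers.
from functools import reduce

def unanimous(optimisers):
    """Accept an option only if every optimiser in the coalition accepts it."""
    return lambda u: reduce(set.__and__, (psi(u) for psi in optimisers))

def unilateral(optimisers):
    """Accept an option if at least one optimiser in the coalition accepts it."""
    return lambda u: reduce(set.__or__, (psi(u) for psi in optimisers))

rock1 = lambda u: {"a", "b"}
rock2 = lambda u: {"b", "c"}
u = lambda x: 0
print(unanimous([rock1, rock2])(u))   # {'b'}
print(unilateral([rock1, rock2])(u))  # {'a', 'b', 'c'}
```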
Suppose that f:(X→R)→JP(X,R) assigns an optimiser f(u)∈JP(X,R) to each task u:X→R.
In terms of f, we can define the optimiser ψ=λu:X→R.f(u)(u)∈JP(X,R) by diagonalising — i.e. ψ will match the optimiser f(u) in each task u:X→R.
This operation captures (somewhat) the notion of shards, or context-activated optimisation.
For example, imagine an agent ψ∈JP(X,R) who "plays it safe" by satisficing unless they can achieve sufficiently high payoff, i.e. f(u)=argmaxX if max(u)≥100, and f(u)=satisfices otherwise.
Or imagine an agent who desires cheese whenever they are sufficiently close to cheese, but otherwise desires to run around aimlessly.
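Here's a hedged Python sketch of the "play it safe" agent above, built by diagonalising a task-dependent choice of optimiser (the threshold and option names are illustrative):

```python
# Sketch: a context-activated optimiser psi(u) = f(u)(u).

def argmax(X, u):
    return {x for x in X if all(u(x) >= u(y) for y in X)}

def satisfice(X, s):
    return lambda u: {x for x in X if u(x) >= u(s)}

def play_it_safe(X, s, threshold=100):
    """Maximise when a big payoff is on the table; otherwise just satisfice."""
    def f(u):
        if max(u(x) for x in X) >= threshold:
            return lambda v: argmax(X, v)
        return satisfice(X, s)
    return lambda u: f(u)(u)          # diagonalise

X = {"a", "b", "c"}
psi = play_it_safe(X, s="a")
print(psi(lambda x: {"a": 1, "b": 2, "c": 3}[x]))    # {'a', 'b', 'c'}: low stakes, satisfice
print(psi(lambda x: {"a": 1, "b": 2, "c": 300}[x]))  # {'c'}: high stakes, maximise
```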
Recap
Next time...
The next post will answer the age-old question, "What happens when two optimisers ψA∈JP(A,R) and ψB∈JP(B,R) play the simultaneous game g:A×B→R?" We know what happens when ψA and ψB are both utility-maximisers — the possible option-profiles are pairs (a,b)∈A×B in nash equilibrium.
Can we really generalise the nash equilibrium to any pair of optimisers?
Further reading
for Game Theory (2016)
functions (2019)
The little λ is from (simply-typed) lambda notation for introducing functions. If sin:R→R is the familiar sine function, then λx∈R.sin(x/2) is the function R→R which sends each x∈R to sin(x/2)∈R, so (λx∈R.sin(x/2))(π/3)=sin(π/6). We can also use lambda notation to define higher-order functions, e.g. if doubleX=λu:X→X.u∘u then doubleR(sin)(π+1)=sin(sin(π+1)).
Synonyms: actions, alternatives, behaviours, configurations, coordinates, hypotheses, moves, options, outputs, parameters, plays, policies, positions, strategies — or whatever else is being chosen due to the payoffs assigned to it.
Scott Garrabrant has called (X→R)→X the type-signature of agency, and you can interpret this sequence as an expansion on his remark.
In response to Garrabrant, MrMind raises the objection that no map ψ:(X→R)→X can be parametric in both X and R because there is no map (0→0)→0. Fortunately, this objection doesn't apply here, because we're looking for functions ψ:(X→R)→P(X), and there is a function ψ:(∅→∅)→P(∅).
The term higher-order game theory is coined in [Hedges 2015], presumably because functions of the form f:(A→B)→C are often called higher-order functions. Throughout the article, I will only use the term in this sense.
Warning: The term higher-order game theory is also used in [Nisan 2021] to refer to "the study of agents thinking about agents thinking about agents..." — and [Diffractor 2021] calls this higher-level game theory. I will never use the term in this sense.
Spoilers: As you'll see later, this is the definition of optimisation you need to ensure that the nash equilibrium between two optimisers is also an optimiser.
Of course, if ψ(u)=∅, then this model of the agent and their task would be falsified by any observed option, and can be eliminated a priori.
We can also interpret ψ(u)=∅ as the statement that the computer program implementing the agent's decision theory would throw an exception, or fail to halt, when given the input u:X→R.
Discussed in Program Equilibrium in the Prisoner's Dilemma via Löb's Theorem and elsewhere.
In tomorrow's post, P will be generalised to an arbitrary commutative monad M. When we let M be the probability monad Δ, then we get optimisers of type (X→R)→Δ(X) which would include Taylor's original probabilistic quantiliser. But don't worry about that now!
... and any set X to model the agent's options, any function u:X→R to model the agent's task, and any functional ψ:(X→R)→P(X) to model the agent's optimisation.
When Rn is equipped with the product ordering, the optimiser argmaxX∈JP(X,Rn) is called the pareto optimiser, which maps each multi-objective task u:X→Rn to the pareto-frontier. This is the least selective optimiser which cares about each objective πi∘u:X→R.
A partial order ≤ is a relation on X satisfying three rules:
(1) Reflexivity: x≤x.
(2) Transitivity: If x≤y and y≤z then x≤z.
(3) Anti-symmetry: If x≤y and y≤x then x=y.
A preorder ≤ is a relation satisfying only reflexivity and transitivity. A preorder is the non-evil version of a partial order, because it's agnostic about "identity" between equivalent objects.
Now, we say an agent's preferences are incomplete when neither x≤y nor y≤x, and an agent's preferences are indifferent when both x≤y and y≤x but not x=y. So a partial order represents preferences which might be incomplete but never indifferent, while a preorder represents preferences which might be both incomplete and indifferent. Agents with incomplete preferences have recently been proposed as a solution to the Shutdown Problem.
Warning: There's another optimiser in JP(X,R) who chooses the maximum points of ≤u, i.e. options which weakly dominate all other options. Explicitly, λu:X→R.{x∈X:∀x′∈X.u(x′)≤u(x)}. This optimiser is qualitatively different, being more selective when its preferences are incomplete.
In Science without Numbers (1980), Hartry Field attempted to reformulate Newtonian physics without quantification over mathematical objects. Instead, the formulae of his theory would quantify only over physical objects and regions of spacetime. He called this the nominalisation of Newtonian physics.
Likewise, higher-order game theory promises a nominalisation of game theory.
I don't know why, but this smells correct to me.
This property is called closedness in [Hedges 2015] because the set ψ(u)⊆X is closed under the equivalence relation x∼x′⟺u(x)=u(x′) for every u:X→R.
A function q:(X→R)→P(R) is called a quantifier in the HOGT literature.