Suppose players are playing in a correlated equilibrium using a reflective oracle distribution. How does the equilibrium they play in vary as a function of the parameters of the game, or of the players' policies? It turns out that the set of equilibria is a Kakutani map of the parameters to the game. This is a lot like it being a continuous map.
This might make it possible for players to reason about the effects of their policy on the equilibrium that they play (since the equilibrium is now a Kakutani map of the players' policies).
Definitions
Let k be some natural number. We will consider reflective oracle distributions whose queries are parameterized on some vector in θ∈Rk.
To do this, let the Turing machines M, instead of outputting a raw query, output a continuous function from θ∈Rk to the query. (The details of representing continuous functions don't seem that important). The reflectivity condition on oracle distributions is now relative to θ (since the queries depend on θ).
Define the map
ParamsToDistrs(θ):={D∈D|D is reflective relative to θ}
which maps the parameters θ to the set of reflective oracle distributions for θ.
Theorem:ParamsToDistrs is a Kakutani map.
Proof:
From the previous post, we have:
For each θ, ParamsToDistrs(θ) is nonempty (Theorem 1)
For each θ, ParamsToDistrs(θ) is convex (Theorem 2)
So it is sufficient to show that ParamsToDistrs has a closed graph.
Let θ1,θ2,... be an infinite sequence of Rk values converging to θ∞. Let D1,D2,... be an infinite sequence of oracle distributions in D converging to D∞. Assume that for each natural n, Dn is reflective relative to θn. We will show that D∞ is reflective relative to θ∞; this is sufficient to show that ParamsToDistrs has a closed graph.
Let M∈M. Let a be such that D∞(O(M)=a)>0. Then D∞(O(M)=a)>ϵ for some ϵ>0. By convergence, there is some N such that for all n≥N, Dn(O(M)=a)>ϵ. By reflectivity of each Dn relative to θi, we have, for each n≥N,
a∈Eval(M)(θn)(Condition(Dn,M,a))
Let qθ:=Eval(M)(θ). Rewriting the above statement:
a∈argmaxi∈{1,...,l(q)}EDn[qθni(O)]
Consider the set {(θ,D)|a∈argmaxi∈{1,...,l(q)}ED[qθi(O)]}. This is the intersection of a finite number of sets of the form {(θ,D)|ED[qθa(O)−qθi(O)]≥0}. Each of these sets is closed because (θ,D)↦ED[qθa(O)−qθi(O)] is continuous. Therefore the set {(θ,D)|a∈argmaxi∈{1,...,l(q)}ED[qθi(O)]} is closed.
In total, this is sufficient to show a∈argmaxi∈{1,...,l(q)}ED∞[qθ∞i(O)], as desired.
□
An immediate consequence of this theorem is that the set of correlated equilibria of a normal-form game is a Kakutani map of the parameters of the game.
(follow-up to: A correlated analogue to reflective oracles)
Motivation
Suppose players are playing in a correlated equilibrium using a reflective oracle distribution. How does the equilibrium they play in vary as a function of the parameters of the game, or of the players' policies? It turns out that the set of equilibria is a Kakutani map of the parameters to the game. This is a lot like it being a continuous map.
This might make it possible for players to reason about the effects of their policy on the equilibrium that they play (since the equilibrium is now a Kakutani map of the players' policies).
Definitions
Let k be some natural number. We will consider reflective oracle distributions whose queries are parameterized on some vector in θ∈Rk.
To do this, let the Turing machines M, instead of outputting a raw query, output a continuous function from θ∈Rk to the query. (The details of representing continuous functions don't seem that important). The reflectivity condition on oracle distributions is now relative to θ (since the queries depend on θ).
Define the map
ParamsToDistrs(θ):={D∈D|D is reflective relative to θ }
which maps the parameters θ to the set of reflective oracle distributions for θ.
Theorem: ParamsToDistrs is a Kakutani map.
Proof:
From the previous post, we have:
So it is sufficient to show that ParamsToDistrs has a closed graph.
Let θ1,θ2,... be an infinite sequence of Rk values converging to θ∞. Let D1,D2,... be an infinite sequence of oracle distributions in D converging to D∞. Assume that for each natural n, Dn is reflective relative to θn. We will show that D∞ is reflective relative to θ∞; this is sufficient to show that ParamsToDistrs has a closed graph.
Let M∈M. Let a be such that D∞(O(M)=a)>0. Then D∞(O(M)=a)>ϵ for some ϵ>0. By convergence, there is some N such that for all n≥N, Dn(O(M)=a)>ϵ. By reflectivity of each Dn relative to θi, we have, for each n≥N,
a∈Eval(M)(θn)(Condition(Dn,M,a))
Let qθ:=Eval(M)(θ). Rewriting the above statement:
a∈argmaxi∈{1,...,l(q)}EDn[qθni(O)]
Consider the set {(θ,D)|a∈argmaxi∈{1,...,l(q)}ED[qθi(O)]}. This is the intersection of a finite number of sets of the form {(θ,D)|ED[qθa(O)−qθi(O)]≥0}. Each of these sets is closed because (θ,D)↦ED[qθa(O)−qθi(O)] is continuous. Therefore the set {(θ,D)|a∈argmaxi∈{1,...,l(q)}ED[qθi(O)]} is closed.
In total, this is sufficient to show a∈argmaxi∈{1,...,l(q)}ED∞[qθ∞i(O)], as desired.
□
An immediate consequence of this theorem is that the set of correlated equilibria of a normal-form game is a Kakutani map of the parameters of the game.