Minimax and dynamic (in)consistency

Vanessa Kosoy

The asymptotic optimality theorem for minimax we proved previously was formulated in terms of the "optimal value" $v (x)$ . Compared to the Bayesian case, $v (x)$ has the strange property that its definition depends on rewards other than those received after observing $x$ . This is a direct consequence of the dynamic inconsistency of minimax. In an attempt to get around this, we show that the minimax decision rule is dynamically consistent in a certain special case and demonstrate that this special case can occur often for a natural class of models. However, this doesn't immediately solve the problem: doing so requires replacing the minimax decision rule by a stronger condition that ensures subgame perfection in some sense. The formalization of the latter idea is left for future posts.

Appendix A contains the proofs of all results.

##Notation

Given $X$ a topological space, $P (X)$ will denote the space of Borel probability measures on $X$. We regard it as a topological space using the weak-$*$ topology. Given $x \in X$, $δ_{x} \in P (X)$ is defined by $δ_{x} (A) := [[x \in A]]$. Given $μ, ν \in P (X)$, $d_{\mathrm{tv}} (μ, ν)$ stands for the total variation distance between $μ$ and $ν$. We denote by $P_{C} (X)$ the set of non-empty convex compact subsets of $P (X)$.

Given $X$ a finite set, $X^{*}$ denotes the set of finite strings in alphabet $X$, i.e. $X^{*} := ⨆_{n \in N} X^{n}$. $X^{ω}$ denotes the set of infinite strings in alphabet $X$. Given $x \in X^{*} ⊔ X^{ω}$, $| x | \in N ⊔ \{\infty\}$ is the length of $x$. Given $0 \leq n < | x |$, $x_{n} \in X$ is the $n$-th symbol in $x$. Given $0 \leq n \leq | x |$, $x_{< n}$ is the prefix of $x$ of length $n$. Given $x, y \in X^{*} ⊔ X^{ω}$, the notations $x ⊏ y$ and $x ⊑ y$ mean "$x$ is a proper prefix of $y$" and "$x$ is a prefix of $y$" respectively, while $x ⊏/ y$ and $x ⋢ y$ denote their negations. $λ_{X} \in X^{*}$ is the empty string and $X^{+} := X^{*} ∖ \{λ_{X}\}$.
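For readers who prefer code, here is a minimal sketch of the string notation restricted to finite strings, assuming strings are modelled as Python `str` values. The function names are illustrative and not part of the original notation.

```python
def is_prefix(x: str, y: str) -> bool:
    """x ⊑ y: x is a (not necessarily proper) prefix of y."""
    return y.startswith(x)


def is_proper_prefix(x: str, y: str) -> bool:
    """x ⊏ y: x is a prefix of y and x differs from y."""
    return y.startswith(x) and len(x) < len(y)


def prefix(x: str, n: int) -> str:
    """x_{<n}: the prefix of x of length n (requires 0 <= n <= |x|)."""
    if not 0 <= n <= len(x):
        raise ValueError("n must satisfy 0 <= n <= len(x)")
    return x[:n]
```

The empty string plays the role of $λ_{X}$: it is a prefix of every string and a proper prefix of every non-empty one.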

##Factorization

We recall the result of the previous post regarding subagents of minimax agents:

#Proposition 0

$π_{2}^{*} \in \operatorname*{arg\,max}_{π_{2} : T \to P (S_{2})} \min_{μ \in Φ} \left( E_{π_{1}^{*} \times μ} [u_{1}] + μ (A) \, E_{t \sim α_{*} π_{1}^{*}} \left[ E_{π_{2} (t) \times μ \mid A} [u_{2} (t)] \right] \right)$

In general, $π_{2}^{*}$ is not a minimax policy for any natural model, i.e. the minimax decision rule is not "dynamically consistent". However, if we assume a certain factorization condition, it is. Specifically, assume $E_{1}, E_{2}$ are compact Polish, $f : E_{1} \times E_{2} \to E$ is Borel measurable, and $\bar{u}_{1} : S_{1} \times E_{1} \to R$ and $\bar{u}_{2} : S_{2} \times T \times E_{2} \to R$ are continuous s.t.

$u_{1} (s_{1}, f (e_{1}, e_{2})) = \bar{u}_{1} (s_{1}, e_{1})$

$u_{2} (s_{2}, t, f (e_{1}, e_{2})) = \bar{u}_{2} (s_{2}, t, e_{2})$

Further assume that $A_{1} \subseteq E_{1}$ is Borel s.t. $f^{- 1} (A) = A_{1} \times E_{2}$, and that $Φ_{1} \in P_{C} (E_{1})$ and $Φ_{2} \in P_{C} (E_{2})$ are s.t. $Φ$ is the closure of the convex hull of $f_{*} (Φ_{1} \times Φ_{2})$.

#Proposition 1

Assume that $\min_{μ_{1} \in Φ_{1}} μ_{1} (A_{1}) > 0$. Then,

$π_{2}^{*} \in \operatorname*{arg\,max}_{π_{2} : T \to P (S_{2})} \min_{μ \in Φ_{2}} E_{t \sim α_{*} π_{1}^{*}} \left[ E_{π_{2} (t) \times μ} [\bar{u}_{2} (t)] \right]$

Moreover, define $\bar{S}_{2} := T \to S_{2}$ equipped with the product topology, and define $\bar{π}_{2}^{*} \in P (\bar{S}_{2})$ by

$\bar{π}_{2}^{*} := \prod_{t \in T} π_{2}^{*} (t)$

Define $\bar{E}_{2} := T \times E_{2}$ and define $\bar{Φ}_{2} \in P_{C} (\bar{E}_{2})$ by

$\bar{Φ}_{2} := \{α_{*} π_{1}^{*} \times μ \mid μ \in Φ_{2}\}$

Finally, define $\hat{u}_{2} : \bar{S}_{2} \times \bar{E}_{2} \to R$ by

$\hat{u}_{2} (s, t, e) := \bar{u}_{2} (s (t), t, e)$

Then, $\bar{π}_{2}^{*}$ is a minimax policy for $\bar{Φ}_{2}$ w.r.t. the utility function $\hat{u}_{2}$.

##Factorization everywhere

We now formulate a condition that ensures that factorization happens "often" in a certain sense.

#Definition 1

Consider $Φ \in P_{C} (O^{ω})$ and $x \in O^{*}$ . Define $f^{x} : O^{ω} \times O^{ω} \to O^{ω}$ by

$f^{x} (e_{1}, e_{2}) := \begin{cases} e_{1} & \text{if } x ⊏/ e_{1} \\ x e_{2} & \text{if } x ⊏ e_{1} \end{cases}$

$Φ$ is said to factorize at $x$ when there are $Φ_{1}, Φ_{2} \subseteq P (O^{ω})$ s.t. $Φ$ is the closure of the convex hull of $f_{*}^{x} (Φ_{1} \times Φ_{2})$ .
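To make the map $f^{x}$ concrete, the following is a minimal sketch that works on finite truncations of the infinite strings. This is an assumption made purely for illustration: the definition itself concerns elements of $O^{ω}$, and the name `f_x` is hypothetical.

```python
def f_x(x: str, e1: str, e2: str) -> str:
    """Finite-horizon stand-in for f^x(e1, e2).

    Returns e1 if x is not a prefix of e1, and x followed by e2 otherwise.
    Assumes len(e1) >= len(x), so that the prefix test is already decided
    by the truncation of e1.
    """
    if len(e1) < len(x):
        raise ValueError("e1 must be at least as long as x")
    return x + e2 if e1.startswith(x) else e1
```

Intuitively, the first argument describes the environment until $x$ is observed, and the second argument describes what happens afterwards. For instance, `f_x("01", "0110", "111")` discards the continuation `"10"` of the first argument and returns `"01111"`, while `f_x("01", "0010", "111")` returns `"0010"` unchanged.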

Given $e \in O^{ω}$ , $Φ$ is said to factorize over $e$ when there are infinitely many $n \in N$ s.t. $Φ$ factorizes at $e_{< n}$ .

Given a time discount function $γ$, $Φ$ is said to factorize frequently over $e$ (w.r.t. $γ$) when there is an increasing sequence $\{n_{k} \in N\}_{k \in N}$ s.t.

i. $Φ$ factorizes at $e_{< n_{k}}$ for all $k$ .

ii. There is $ϵ > 0$ s.t. for all $k$

$\frac{\sum_{n = n_{k + 1} - 1}^{\infty} γ (n)}{\sum_{n = n_{k}}^{\infty} γ (n)} > ϵ$

For example, for geometric discount ($γ (n) = β^{n}$), the ratio in condition ii equals $β^{n_{k + 1} - n_{k} - 1}$, so condition ii means that $n_{k + 1} - n_{k}$ is bounded.
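The geometric case can be checked with exact arithmetic. The sketch below assumes $γ (n) = β^{n}$ with $0 < β < 1$ and uses the closed form of the geometric tail sum; the helper names are illustrative.

```python
from fractions import Fraction


def geometric_tail(beta: Fraction, m: int) -> Fraction:
    """Closed form of sum_{n >= m} beta**n for 0 < beta < 1."""
    return beta**m / (1 - beta)


def tail_ratio(beta: Fraction, n_k: int, n_k_next: int) -> Fraction:
    """The ratio appearing in condition ii for gamma(n) = beta**n."""
    return geometric_tail(beta, n_k_next - 1) / geometric_tail(beta, n_k)


beta = Fraction(1, 2)
# Gaps n_{k+1} - n_k of 1, 3 and 10 give ratios beta**0, beta**2 and beta**9.
ratios = [tail_ratio(beta, 5, 5 + gap) for gap in (1, 3, 10)]
```

The ratio depends only on the gap $n_{k + 1} - n_{k}$ and decays geometrically in it, so a uniform lower bound $ϵ$ exists exactly when the gaps are bounded.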

$Φ$ is said to factorize everywhere when

$\forall μ \in Φ : \Pr_{e \sim μ} [Φ \text{ factorizes over } e] = 1$

$Φ$ is said to factorize frequently everywhere (w.r.t. $γ$) when

$\forall μ \in Φ : \Pr_{e \sim μ} [Φ \text{ factorizes frequently over } e] = 1$

Note that any singleton $\{μ\}$ factorizes frequently everywhere w.r.t. any time discount function. More generally, we can give the following explicit description of models that factorize everywhere:

#Construction

Consider some $F : O^{*} \to P_{C} (O^{+})$ ($O^{+}$ is considered to be a discrete space). Assume that for any $x \in O^{*}$, there is a prefix-free $A_{x} \subseteq O^{+}$ s.t. any $μ \in F (x)$ is supported on $A_{x}$.
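The prefix-free requirement on $A_{x}$ is easy to test for a finite set of strings. A minimal sketch, assuming the candidate set is given as a Python `set` of non-empty strings (the function name is illustrative):

```python
from itertools import permutations


def is_prefix_free(strings: set[str]) -> bool:
    """True if no element of the set is a proper prefix of another element."""
    # permutations(..., 2) enumerates ordered pairs of distinct elements.
    return not any(b.startswith(a) for a, b in permutations(strings, 2))
```

For example, `{"0", "10", "11"}` is prefix-free, whereas `{"0", "01"}` is not, since `"0"` is a proper prefix of `"01"`.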

We define $X_{F} \subseteq O^{*}$ as the minimal set with the following properties:

i. $λ_{O} \in X_{F}$

ii. If $x \in X_{F}$ , $μ \in F (x)$ and $y \in O^{+}$ is s.t. $μ (y) > 0$ , then $x y \in X_{F}$ .
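The two closure properties above translate directly into a search procedure. The following sketch enumerates the elements of $X_{F}$ up to a length bound, under the assumption that each $F (x)$ is represented by a finite list of finitely supported measures (for instance the extreme points of the convex set; only the supports matter here). The names `reachable_prefixes` and `toy_F` are hypothetical.

```python
from typing import Callable, Dict, List, Set

Measure = Dict[str, float]  # finitely supported measure on O^+


def reachable_prefixes(F: Callable[[str], List[Measure]], max_len: int) -> Set[str]:
    """Elements of X_F of length <= max_len."""
    found = {""}  # property i: the empty string is in X_F
    frontier = [""]
    while frontier:
        x = frontier.pop()
        for mu in F(x):
            for y, prob in mu.items():
                xy = x + y
                # property ii: extend x by any y of positive probability
                if prob > 0 and len(xy) <= max_len and xy not in found:
                    found.add(xy)
                    frontier.append(xy)
    return found


def toy_F(x: str) -> List[Measure]:
    # The same two measures at every x; the union of supports
    # {"0", "10", "11"} is prefix-free.
    return [{"0": 1.0}, {"10": 0.5, "11": 0.5}]


print(sorted(reachable_prefixes(toy_F, 2)))
```

Since every extension appends a non-empty string, truncating the search at `max_len` does not miss any shorter element. In the toy example, `"1"` and `"01"` are not in $X_{F}$: these are histories at which no factorization is guaranteed by the construction.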

We define $Φ_{F} \subseteq P (O^{ω})$ as follows. $μ \in Φ_{F}$ iff for any $x \in X_{F}$ , there is $μ_{x} \in F (x)$ s.t. for any $y \in O^{+}$ , if $μ_{x} (y) > 0$ then

${Pr}_{e \sim μ} [x y ⊏ e] = {Pr}_{e \sim μ} [x ⊏ e] μ_{x} (y)$

#Proposition 2

For any $F$ as above, $Φ_{F} \in P_{C} (O^{ω})$ and factorizes everywhere.

It is also easy to use the above construction to get $Φ$ that factorizes frequently everywhere w.r.t. a given $γ$ (for example, if we require that for every $x \in X_{F}$, $F (x)$ is supported on strings of uniformly bounded length, $Φ_{F}$ will factorize frequently everywhere w.r.t. geometric discount; this condition on $F$ is sufficient but not necessary).

We now formulate the optimality condition that we would like to hold (but for which minimax is unfortunately insufficient). Consider again $Φ$, $Ψ$ and $π^{*}$ as before. Define $u_{> n} : (O^{*} \to A) \times O^{ω} \to R$ to be the normalized sum of rewards from time $n$, i.e.

$u_{> n} (s, e) := \frac{\sum_{m > n} γ (m) r (e_{< m}^{s})}{\sum_{m > n} γ (m)}$

Define $w : O^{*} \to R$ by

$w (x) := \max_{ρ : A^{| x |} \to P (O^{*} \to A)} \min_{μ \in Φ} E_{\substack{s \sim π^{*} \otimes_{x} ρ \\ e \sim μ}} [u_{> | x |} (s, e) \mid x ⊏ e]$

#Hypothesis

If $π^{*}$ satisfies the (yet unformulated) subgame perfection condition, we should have the following:

Assume $Φ$ factorizes everywhere. Then,

$\forall μ \in Φ : \Pr_{e \sim μ} \left[ \liminf_{n \to \infty} \max \left( w (e_{< n}) - E_{\substack{s \sim π^{*} \\ e' \sim μ}} [u_{> n} (s, e') \mid e_{< n} ⊏ e'], 0 \right) = 0 \right] = 1$

Moreover, if $Φ$ factorizes frequently everywhere, then

$\forall μ \in Φ : \Pr_{e \sim μ} \left[ \lim_{n \to \infty} \max \left( w (e_{< n}) - E_{\substack{s \sim π^{*} \\ e' \sim μ}} [u_{> n} (s, e') \mid e_{< n} ⊏ e'], 0 \right) = 0 \right] = 1$

##Appendix A

#Proof of Proposition 1

Recalling that minimization over $Φ$ can be replaced by minimization over any set s.t. $Φ$ is the closure of its convex hull, and using Proposition 0, we have

$π_{2}^{*} \in \operatorname*{arg\,max}_{π_{2} : T \to P (S_{2})} \min_{\substack{μ_{1} \in Φ_{1} \\ μ_{2} \in Φ_{2}}} \left( E_{π_{1}^{*} \times f_{*} (μ_{1} \times μ_{2})} [u_{1}] + (μ_{1} \times μ_{2}) (f^{- 1} (A)) \, E_{t \sim α_{*} π_{1}^{*}} \left[ E_{π_{2} (t) \times f_{*} (μ_{1} \times μ_{2}) \mid A} [u_{2} (t)] \right] \right)$

$π_{2}^{*} \in \operatorname*{arg\,max}_{π_{2} : T \to P (S_{2})} \min_{\substack{μ_{1} \in Φ_{1} \\ μ_{2} \in Φ_{2}}} \left( E_{π_{1}^{*} \times μ_{1}} [\bar{u}_{1}] + μ_{1} (A_{1}) \, E_{t \sim α_{*} π_{1}^{*}} \left[ E_{π_{2} (t) \times μ_{2}} [\bar{u}_{2} (t)] \right] \right)$

$π_{2}^{*} \in \operatorname*{arg\,max}_{π_{2} : T \to P (S_{2})} \min_{μ_{1} \in Φ_{1}} \left( E_{π_{1}^{*} \times μ_{1}} [\bar{u}_{1}] + μ_{1} (A_{1}) \min_{μ_{2} \in Φ_{2}} E_{t \sim α_{*} π_{1}^{*}} \left[ E_{π_{2} (t) \times μ_{2}} [\bar{u}_{2} (t)] \right] \right)$

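Using the assumption $\min_{μ_{1} \in Φ_{1}} μ_{1} (A_{1}) > 0$, the expression being maximized is a strictly increasing function of the inner minimum over $μ_{2}$ (the first term does not depend on $π_{2}$), and we get the desired result.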
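The monotonicity argument can be sanity-checked on a small finite instance. The sketch below is a toy illustration only: it assumes three pure policies and two measures in each of $Φ_{1}$ and $Φ_{2}$, with made-up numbers, whereas the proposition concerns general policies $π_{2} : T \to P (S_{2})$ and convex compact sets of measures.

```python
# Hypothetical toy instance. Each entry of phi_1 is a pair (a, b), where a
# stands for the first-stage expected utility and b > 0 for mu_1(A_1).
phi_1 = [(1.0, 0.3), (0.2, 0.8)]
# g[p][j]: expected second-stage utility of policy p under the j-th measure of Phi_2.
g = [[0.2, 0.9], [0.5, 0.6], [0.4, 0.4]]


def joint_value(p: int) -> float:
    """min over (mu_1, mu_2) of a + b * g: the objective before factoring."""
    return min(a + b * g_pj for a, b in phi_1 for g_pj in g[p])


def second_stage_value(p: int) -> float:
    """min over mu_2 of g: the objective appearing in Proposition 1."""
    return min(g[p])


policies = range(len(g))
best_joint = max(policies, key=joint_value)
best_second_stage = max(policies, key=second_stage_value)
```

In this instance both criteria select the policy with index 1. If some `b` were allowed to be zero, the joint objective could become insensitive to the second stage, which is why the positivity assumption is needed.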

Now consider any $\bar{π}_{2} \in P (\bar{S}_{2})$. Define $π_{2} : T \to P (S_{2})$ by $(π_{2} (t)) (B) := \Pr_{σ \sim \bar{π}_{2}} [σ (t) \in B]$. We have

$E_{t \sim α_{*} π_{1}^{*}} \left[ E_{\substack{s \sim π_{2} (t) \\ e \sim μ_{2}}} [\bar{u}_{2} (s, t, e)] \right] = E_{t \sim α_{*} π_{1}^{*}} \left[ E_{\substack{σ \sim \bar{π}_{2} \\ e \sim μ_{2}}} [\bar{u}_{2} (σ (t), t, e)] \right]$

$E_{t \sim α_{*} π_{1}^{*}} \left[ E_{\substack{s \sim π_{2} (t) \\ e \sim μ_{2}}} [\bar{u}_{2} (s, t, e)] \right] = E_{\bar{π}_{2} \times α_{*} π_{1}^{*} \times μ_{2}} [\hat{u}_{2}]$

For the special case $\bar{π}_{2} = \bar{π}_{2}^{*} = \prod_{t \in T} π_{2}^{*} (t)$, we get $π_{2} = π_{2}^{*}$ and therefore

$E_{t \sim α_{*} π_{1}^{*}} \left[ E_{\substack{s \sim π_{2}^{*} (t) \\ e \sim μ_{2}}} [\bar{u}_{2} (s, t, e)] \right] = E_{\bar{π}_{2}^{*} \times α_{*} π_{1}^{*} \times μ_{2}} [\hat{u}_{2}]$

Using the property of $π_{2}^{*}$, it follows that

$\forall μ \in Φ_{2} : E_{\bar{π}_{2} \times α_{*} π_{1}^{*} \times μ} [\hat{u}_{2}] \leq E_{\bar{π}_{2}^{*} \times α_{*} π_{1}^{*} \times μ} [\hat{u}_{2}]$

$\forall \bar{μ} \in \bar{Φ}_{2} : E_{\bar{π}_{2} \times \bar{μ}} [\hat{u}_{2}] \leq E_{\bar{π}_{2}^{*} \times \bar{μ}} [\hat{u}_{2}]$

#Proposition A.0

In the setting of the Construction, consider any $μ \in Φ_{F}$ . Then:

$\forall x \in X_{F} \exists μ_{x} \in F (x) \forall y \in A_{x} : μ (x y O^{ω}) = μ (x O^{ω}) μ_{x} (y)$

#Proof of Proposition A.0

Given $x \in X_{F}$, we know there is $μ_{x} \in F (x)$ s.t. the identity holds whenever $μ_{x} (y) > 0$. Summing these identities over $y$, we get

$\sum_{y : μ_{x} (y) > 0} μ (x y O^{ω}) = \sum_{y : μ_{x} (y) > 0} μ (x O^{ω}) μ_{x} (y)$

$\sum_{y : μ_{x} (y) > 0} μ (x y O^{ω}) = μ (x O^{ω})$

Since $A_{x}$ is prefix-free, for any distinct $y, y^{'} \in A_{x}$ we have $x y O^{ω} \cap x y^{'} O^{ω} = \emptyset$. Also, obviously, $x y O^{ω} \subseteq x O^{ω}$. It follows that

$\sum_{y \in A_{x}} μ (x y O^{ω}) = μ \left( \bigcup_{y \in A_{x}} x y O^{ω} \right) \leq μ (x O^{ω}) = \sum_{y : μ_{x} (y) > 0} μ (x y O^{ω})$

Since $μ_{x} (y) > 0$ implies $y \in A_{x}$ , we conclude that for any $y \in A_{x}$ , if $μ_{x} (y) = 0$ then $μ (x y O^{ω}) = 0$ . This, together with the property of $μ_{x}$ , gives the desired result.

#Proposition A.1

In the setting of the Construction, $Φ_{F}$ is convex.

#Proof of Proposition A.1

Consider $μ, ν \in Φ_{F}$ and $p \in (0, 1)$ and define $ξ = p μ + (1 - p) ν$ . Consider some $x \in X_{F}$ , and let $A_{x} \subseteq O^{+}$ be as in the Construction. If $ξ (x O^{ω}) = 0$ , we can choose any $ξ_{x} \in F (x)$ to satisfy the desired condition. Therefore, we can assume $ξ (x O^{ω}) > 0$ . By Proposition A.0, there are $μ_{x}, ν_{x} \in F (x)$ s.t. for all $y \in A_{x}$ :

$μ (x y O^{ω}) = μ (x O^{ω}) μ_{x} (y)$

$ν (x y O^{ω}) = ν (x O^{ω}) ν_{x} (y)$

Taking a convex linear combination of these equations, we get

$ξ (x y O^{ω}) = p μ (x O^{ω}) μ_{x} (y) + (1 - p) ν (x O^{ω}) ν_{x} (y)$

$ξ (x y O^{ω}) = ξ (x O^{ω}) \frac{p μ (x O^{ω}) μ_{x} (y) + (1 - p) ν (x O^{ω}) ν_{x} (y)}{ξ (x O^{ω})}$

Define $ξ_{x} \in P (O^{+})$ by

$ξ_{x} := p \frac{μ (x O^{ω})}{ξ (x O^{ω})} μ_{x} + (1 - p) \frac{ν (x O^{ω})}{ξ (x O^{ω})} ν_{x}$

$ξ_{x}$ is a convex linear combination of $μ_{x}$ and $ν_{x}$ , therefore $ξ_{x} \in F (x)$ . We get

$ξ (x y O^{ω}) = ξ (x O^{ω}) ξ_{x} (y)$

Therefore, $ξ \in Φ_{F}$ .

#Proposition A.2

In the setting of the Construction, $Φ_{F}$ is closed.

#Proof of Proposition A.2

Consider any sequence $\{μ_{n} \in Φ_{F}\}_{n \in N}$ s.t. $μ_{n} \to μ$ for some $μ \in P (O^{ω})$. Consider some $x \in X_{F}$, and let $A_{x} \subseteq O^{+}$ be as in the Construction. By Proposition A.0, for any $n \in N$, there is $μ_{n x} \in F (x)$ s.t. for any $y \in A_{x}$:

$μ_{n} (x y O^{ω}) = μ_{n} (x O^{ω}) μ_{n x} (y)$

Note that $P (O^{+})$ is not compact but is still Polish (since $O^{+}$ is such), and therefore $F (x)$ is compact and Polish. Let $\{n_{k} \in N\}_{k \in N}$ be an increasing sequence s.t. $μ_{n_{k} x} \to μ_{x}$ for some $μ_{x} \in F (x)$. For any $z \in O^{*}$, $χ_{z O^{ω}} : O^{ω} \to \{0, 1\}$ is a continuous function, and therefore $μ_{n} (z O^{ω}) \to μ (z O^{ω})$. For any $y \in O^{+}$, $χ_{\{y\}} : O^{+} \to \{0, 1\}$ is also continuous, and therefore $μ_{n_{k} x} (y) \to μ_{x} (y)$. Putting these together, we get

$μ (x y O^{ω}) = μ (x O^{ω}) μ_{x} (y)$

Therefore $μ \in Φ_{F}$ .

#Proposition A.3

In the setting of the Construction, consider any $x, z \in X_{F}$ , $μ \in F (z)$ and $y \in O^{+}$ s.t. $μ (y) > 0$ . Assume that $x ⊏ z y$ . Then, $x ⊑ z$ .

#Proof of Proposition A.3

We prove by induction on $| x |$. If $| x | = 0$, then obviously $x ⊑ z$. If $| x | > 0$, then, by definition of $X_{F}$, there are $x^{'} \in X_{F}$, $y^{'} \in O^{+}$ and $ν \in F (x^{'})$ s.t. $ν (y^{'}) > 0$ and $x = x^{'} y^{'}$. Since $x^{'} ⊏ x ⊏ z y$, the induction hypothesis gives $x^{'} ⊑ z$. Assume to the contrary that $x ⋢ z$. Then, since $x ⊏ z y$, we have $z ⊏ x = x^{'} y^{'}$. By the induction hypothesis, this implies $z ⊑ x^{'}$. Hence $z = x^{'}$. It follows that $z y^{'} = x ⊏ z y$, and therefore $y^{'} ⊏ y$. However, $y, y^{'} \in A_{z}$ and $A_{z}$ is prefix-free, which is a contradiction.

#Proposition A.4

In the setting of the Construction, $Φ_{F}$ is non-empty.

#Proof of Proposition A.4

Choose $μ_{x} \in F (x)$ for each $x \in X_{F}$ . Given any $z \in O^{+}$ , define $p (z) \in O^{*}$ as the longest proper prefix of $z$ s.t. $p (z) \in X_{F}$ . Define $θ : O^{*} \to [0, 1]$ recursively by

$θ (λ_{O}) := 1$

$\forall z \in O^{+} : θ (z) := θ (p (z)) {Pr}_{y \sim μ_{p (z)}} [z ⊑ p (z) y]$

Consider any $z \notin X_{F}$ . For any $o \in O$ , $p (z o) = p (z)$ , therefore:

$θ (z o) = θ (p (z)) {Pr}_{y \sim μ_{p (z)}} [z o ⊑ p (z) y]$

$\sum_{o \in O} θ (z o) = θ (p (z)) \sum_{o \in O} \Pr_{y \sim μ_{p (z)}} [z o ⊑ p (z) y]$

$\sum_{o \in O} θ (z o) = θ (p (z)) \Pr_{y \sim μ_{p (z)}} [z ⊏ p (z) y]$

For any $y$ s.t. $μ_{p (z)} (y) > 0$, $p (z) y \in X_{F}$ and therefore $z \neq p (z) y$. It follows that $z ⊏ p (z) y$ iff $z ⊑ p (z) y$, and we get

$\sum_{o \in O} θ (z o) = θ (z)$

Now, consider $z \in X_{F}$ . For any $o \in O$ , $p (z o) = z$ , therefore:

$θ (z o) = θ (z) {Pr}_{y \sim μ_{z}} [z o ⊑ z y]$

$\sum_{o \in O} θ (z o) = θ (z) \sum_{o \in O} \Pr_{y \sim μ_{z}} [z o ⊑ z y]$

Again we get

$\sum_{o \in O} θ (z o) = θ (z)$

It follows that there is $μ \in P (O^{ω})$ s.t. for any $z \in O^{*}$ , $μ (z O^{ω}) = θ (z)$ .
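The consistency property of $θ$ derived above can be verified mechanically on a toy example. The sketch below assumes $O = \{0, 1\}$ and, purely for illustration, the same choice of $μ_{x}$ at every $x$; the names `MU`, `in_X`, `p` and `theta` are hypothetical counterparts of $μ_{x}$, membership in $X_{F}$, $p$ and $θ$.

```python
from fractions import Fraction
from functools import lru_cache
from itertools import product

# Hypothetical toy choice: the same measure mu_x at every x,
# supported on the prefix-free set {"0", "10", "11"}.
MU = {"0": Fraction(1, 2), "10": Fraction(1, 4), "11": Fraction(1, 4)}


@lru_cache(maxsize=None)
def in_X(z: str) -> bool:
    """z is in X_F: z is empty, or z = x y with x in X_F and mu_x(y) > 0."""
    return z == "" or any(
        in_X(z[:i]) and MU.get(z[i:], 0) > 0 for i in range(len(z))
    )


def p(z: str) -> str:
    """Longest proper prefix of a non-empty z that lies in X_F."""
    return next(z[:i] for i in range(len(z) - 1, -1, -1) if in_X(z[:i]))


@lru_cache(maxsize=None)
def theta(z: str) -> Fraction:
    if z == "":
        return Fraction(1)
    x = p(z)
    # Pr_{y ~ mu_x}[z is a prefix of x y]
    prob = sum((w for y, w in MU.items() if (x + y).startswith(z)), Fraction(0))
    return theta(x) * prob


# Check theta(z0) + theta(z1) == theta(z) for all strings z of length <= 5.
consistent = all(
    theta(z + "0") + theta(z + "1") == theta(z)
    for n in range(6)
    for z in map("".join, product("01", repeat=n))
)
```

Exact rational arithmetic is used so that the additivity check is an equality rather than a floating-point approximation. The same functions also confirm the product identity on the toy example, e.g. `theta("010") == theta("0") * MU["10"]`.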

Consider any $x \in X_{F}$ and $y_{0} \in O^{+}$ s.t. $μ_{x} (y_{0}) > 0$. We have $p (x y_{0}) ⊏ x y_{0}$ and, by Proposition A.3, this implies $p (x y_{0}) ⊑ x$. By definition of $p$, it follows that $p (x y_{0}) = x$. We get

$θ (x y_{0}) = θ (p (x y_{0})) {Pr}_{y \sim μ_{p (x y_{0})}} [x y_{0} ⊑ p (x y_{0}) y]$

$θ (x y_{0}) = θ (x) {Pr}_{y \sim μ_{x}} [x y_{0} ⊑ x y]$

$θ (x y_{0}) = θ (x) {Pr}_{y \sim μ_{x}} [y_{0} ⊑ y]$

$y_{0} \in A_{x}$ and any $y \in O^{+}$ s.t. $μ_{x} (y) > 0$ is also in $A_{x}$ . $A_{x}$ is prefix-free, so $y_{0} ⊑ y$ iff $y_{0} = y$ . We get

$θ (x y_{0}) = θ (x) μ_{x} (y_{0})$

$μ (x y_{0} O^{ω}) = μ (x O^{ω}) μ_{x} (y_{0})$

Therefore, $μ \in Φ_{F}$ .

#Proposition A.5

Consider $x \in O^{*}$ and let $f^{x} : O^{ω} \times O^{ω} \to O^{ω}$ be defined as in Definition 1. Then, for any $y \in O^{*}$ :

$(f^{x})^{- 1} (x y O^{ω}) = x O^{ω} \times y O^{ω}$

#Proof of Proposition A.5

Given any $e_{1}, e_{2} \in O^{ω}$ , $f^{x} (x e_{1}, y e_{2}) = x y e_{2}$ . On the other hand, if $e_{1}, e_{2}, e_{3} \in O^{ω}$ are s.t. $f^{x} (e_{1}, e_{2}) = x y e_{3}$ then $x ⊏ e_{1}$ and $e_{2} = y e_{3}$ .

#Proposition A.6

Consider $x \in O^{*}$ and let $f^{x} : O^{ω} \times O^{ω} \to O^{ω}$ be defined as in Definition 1. Then, for any $z \in O^{*}$ , if $x ⊏/ z$ , then:

$(f^{x})^{- 1} (z O^{ω}) = z O^{ω} \times O^{ω}$

#Proof of Proposition A.6

Assume $x = z w$ for some $w \in O^{*}$ . For any $e_{1}, e_{2} \in O^{ω}$ , $f^{x} (z e_{1}, e_{2})$ is either $z e_{1}$ (if $w ⊏/ e_{1}$ ) or $x e_{2} = z w e_{2}$ (if $w ⊏ e_{1}$ ). Either way, $z ⊏ f^{x} (z e_{1}, e_{2})$ . On the other hand, if $e_{1}, e_{2}, e_{3} \in O^{ω}$ are s.t. $f^{x} (e_{1}, e_{2}) = z e_{3}$ then either $x ⊏/ e_{1}$ and $e_{1} = z e_{3}$ or $z w = x ⊏ e_{1}$ . Either way, $z ⊏ e_{1}$ .

Now, assume $x ⋣ z$ . For any $e_{1}, e_{2} \in O^{ω}$ , $f^{x} (z e_{1}, e_{2}) = z e_{1}$ (since $x ⊏/ z e_{1}$ ). On the other hand, if $e_{1}, e_{2}, e_{3} \in O^{ω}$ are s.t. $f^{x} (e_{1}, e_{2}) = z e_{3}$ then $e_{1} = z e_{3}$ (since $x ⊏/ z e_{3}$ ).

#Proposition A.7

In the setting of the Construction, consider some $x \in X_{F}$ and define $G : O^{*} \to P_{C} (O^{+})$ by $G (z) := F (x z)$ . Assume that $w \in O^{*}$ is s.t. $x w \in X_{F}$ . Then, $w \in X_{G}$ .

#Proof of Proposition A.7

We prove by induction on $| w |$ . For $| w | = 0$ , the proposition is obvious. Assume $| w | > 0$ . $x w \in X_{F}$ , therefore $x w = x^{'} y$ for some $x^{'} \in X_{F}$ , $μ \in F (x^{'})$ and $y \in O^{+}$ s.t. $μ (y) > 0$ . $x ⊏ x^{'} y$ , therefore, by Proposition A.3, $x ⊑ x^{'}$ . Let $z \in O^{*}$ be s.t. $x^{'} = x z$ . $x w = x^{'} y = x z y$ , therefore $w = z y$ and we can use the induction hypothesis to show that $z \in X_{G}$ . $μ \in F (x^{'}) = F (x z) = G (z)$ , implying that $w \in X_{G}$ .

#Proposition A.8

In the setting of the Construction, consider some $x \in X_{F}$ and define $G : O^{*} \to P_{C} (O^{+})$ by $G (z) := F (x z)$ . Consider any $w \in X_{G}$ . Then, $x w \in X_{F}$ .

#Proof of Proposition A.8

We prove by induction on $| w |$ . For $| w | = 0$ , the proposition is obvious. Assume that $| w | > 0$ . Then, there is some $w^{'} \in X_{G}$ , $μ \in G (w^{'}) = F (x w^{'})$ and $y \in O^{+}$ s.t. $μ (y) > 0$ and $w = w^{'} y$ . By the induction hypothesis, $x w^{'} \in X_{F}$ . Therefore, $x w = x w^{'} y$ is also in $X_{F}$ .

#Proposition A.9

In the setting of the Construction, $Φ_{F}$ factorizes at $x$ for any $x \in X_{F}$ .

#Proof of Proposition A.9

Fix $x \in X_{F}$ . Let $f^{x}$ be as in Definition 1. Define $G : O^{*} \to P_{C} (O^{+})$ by $G (z) := F (x z)$ . We claim that $Φ_{F} = f_{*}^{x} (Φ_{F} \times Φ_{G})$ .

Consider any $μ \in Φ_{F}$, $ν \in Φ_{G}$ and denote $ξ := f_{*}^{x} (μ \times ν)$. Consider some $z \in X_{F}$ and the corresponding $A_{z} \subseteq O^{+}$.

Assume $z = x w$ for some $w \in O^{*}$ . By Proposition A.7, $w \in X_{G}$ . Note that for any $ν_{0} \in G (w) = F (z)$ , $ν_{0} (y) > 0$ implies $y \in A_{z}$ . Let $ν_{w} \in P (O^{+})$ be as in Proposition A.0. For any $y \in A_{z}$ , we have

$ν (w y O^{ω}) = ν (w O^{ω}) ν_{w} (y)$

Multiplying both sides by $μ (x O^{ω})$ :

$μ (x O^{ω}) ν (w y O^{ω}) = μ (x O^{ω}) ν (w O^{ω}) ν_{w} (y)$

Using the definition of $ξ$ and Proposition A.5, we get

$ξ (z O^{ω}) = μ (x O^{ω}) ν (w O^{ω})$

$ξ (z y O^{ω}) = μ (x O^{ω}) ν (w y O^{ω})$

Combining, we get

$ξ (z y O^{ω}) = ξ (z O^{ω}) ν_{w} (y)$

Now, assume $x ⋢ z$ . Let $μ_{z} \in P (O^{+})$ be as in Proposition A.0. For any $y \in O^{+}$ s.t. $μ_{z} (y) > 0$ , we have

$μ (z y O^{ω}) = μ (z O^{ω}) μ_{z} (y)$

Using the definition of $ξ$ and Proposition A.6, we get

$ξ (z O^{ω}) = μ (z O^{ω})$

By Proposition A.3, $x ⊏/ z y$ . Therefore, we can use Proposition A.6 again to conclude

$ξ (z y O^{ω}) = μ (z y O^{ω})$

Combining, we get

$ξ (z y O^{ω}) = ξ (z O^{ω}) μ_{z} (y)$

We proved that $ξ \in Φ_{F}$ . It remains to show that, conversely, $μ \in f_{*}^{x} (Φ_{F} \times Φ_{G})$ .

Define $μ^{x} \in P (O^{ω})$ as follows. If $μ (x O^{ω}) = 0$ , choose arbitrary $μ^{x} \in Φ_{G}$ . If $μ (x O^{ω}) > 0$ , define $μ^{x}$ by

$μ^{x} (z O^{ω}) := \frac{μ (x z O^{ω})}{μ (x O^{ω})}$

Consider any $w \in X_{G}$ . By Proposition A.8, $x w \in X_{F}$ . By Proposition A.0, there is $μ_{x w} \in F (x w)$ s.t. for any $y \in A_{x w}$ :

$μ (x w y O^{ω}) = μ (x w O^{ω}) μ_{x w} (y)$

If $μ (x O^{ω}) > 0$ , we can divide both sides by $μ (x O^{ω})$ and get

$μ^{x} (w y O^{ω}) = μ^{x} (w O^{ω}) μ_{x w} (y)$

It follows that, in either case, $μ^{x} \in Φ_{G}$ and

$μ (x O^{ω}) μ^{x} (z O^{ω}) = μ (x z O^{ω})$

Denote $ξ := f_{*}^{x} (μ \times μ^{x})$ . Consider any $z \in O^{*}$ .

Assume $z = x w$ for some $w \in O^{*}$ . By Proposition A.5

$ξ (z O^{ω}) = μ (x O^{ω}) μ^{x} (w O^{ω}) = μ (z O^{ω})$

Now, assume $x ⋢ z$ . By Proposition A.6

$ξ (z O^{ω}) = μ (z O^{ω})$

We conclude that $μ = ξ$ .

#Proposition A.10

In the setting of the Construction, consider any $μ \in Φ_{F}$ and $e \in O^{ω}$ s.t. for any $n \in N$, $μ (e_{< n} O^{ω}) > 0$. Then, there are infinitely many $x \in X_{F}$ s.t. $x ⊏ e$.

#Proof of Proposition A.10

We prove by induction on $n$ that there are at least $n$ prefixes of $e$ in $X_{F}$ . For $n = 0$ the claim is vacuously true. Consider any $n > 0$ . By the induction hypothesis, there are $x_{0} ⊏ x_{1} ⊏ \dots ⊏ x_{n - 1} ⊏ e$ s.t. $x_{k} \in X_{F}$ for all $k < n$ . Without loss of generality, we can assume $A_{x_{n - 1}}$ is s.t.

$\bigcup_{y \in A_{x_{n - 1}}} y O^{ω} = O^{ω}$

Indeed, if this condition is false we can make it true by adding more elements to $A_{x_{n - 1}}$ . By Proposition A.0, there is $ν \in F (x_{n - 1})$ s.t. for any $y \in A_{x_{n - 1}}$

$μ (x_{n - 1} y O^{ω}) = μ (x_{n - 1} O^{ω}) ν (y)$

By the assumption on $A_{x_{n - 1}}$ , there is $y_{0} \in A_{x_{n - 1}}$ s.t. $x_{n - 1} y_{0} ⊏ e$ . Denote $x_{n} := x_{n - 1} y_{0}$ . We get

$μ (x_{n} O^{ω}) = μ (x_{n - 1} O^{ω}) ν (y_{0})$

The left hand side is positive since $x_{n} ⊏ e$, therefore $ν (y_{0}) > 0$. It follows that $x_{n} \in X_{F}$.

#Proof of Proposition 2

By Propositions A.1, A.2 and A.4, $Φ_{F} \in P_{C} (O^{ω})$ . It remains to show that $Φ_{F}$ factorizes everywhere.

Consider any $μ \in Φ_{F}$ . Denote

$\mathcal{N} := \{e \in O^{ω} \mid \exists n \in N : μ (e_{< n} O^{ω}) = 0\}$

We have

$\mathcal{N} = \bigcup_{\substack{x \in O^{*} \\ μ (x O^{ω}) = 0}} x O^{ω}$

This is a countable union of $μ$-null sets. Therefore, $μ (\mathcal{N}) = 0$ and $μ (O^{ω} ∖ \mathcal{N}) = 1$. By Proposition A.10, for any $e \notin \mathcal{N}$, there are infinitely many $x ⊏ e$ s.t. $x \in X_{F}$. By Proposition A.9, it follows that $Φ_{F}$ factorizes over $e$. We get

$\Pr_{e \sim μ} [Φ_{F} \text{ factorizes over } e] \geq \Pr_{e \sim μ} [e \notin \mathcal{N}] = 1$