Stability of optimal predictor schemes under a broader class of reductions

Vanessa Kosoy

We define a new class of reductions which preserve optimal predictor schemes, generalizing the previous notion of pseudo-invertible reductions. These are reductions that preserve the estimation problem on average but allow for large variance.

Definition 1

Fix an error space $Δ$ of rank 2. Consider distributional estimation problems $(f, μ)$, $(g, ν)$ and a $\{0, 1\}^{*}$-valued $(\mathrm{poly}, \mathrm{log})$-bischeme $\hat{ζ} = (ζ, r_{ζ}, a_{ζ})$. $\hat{ζ}$ is called a $Δ$-pseudo-invertible weak reduction of $(f, μ)$ to $(g, ν)$ when there is a polynomial $p : \mathbb{N} \to \mathbb{N}$ s.t. the following conditions hold:

(i) $E_{μ^{k}} [(E_{U^{r_{ζ}(k, j)}} [g(\hat{ζ}^{kj}(x))] - f(x))^{2}] \in Δ$

(ii) $\Pr_{μ^{k} \times U^{r_{ζ}(k, j)}} [ν^{p(k)}(\hat{ζ}^{kj}(x)) = 0] \in Δ$

(iii) There is $M > 0$ and a $\mathbb{Q} \cap [0, M]$-valued $(\mathrm{poly}, \mathrm{log})$-bischeme $\hat{R} = (R, r_{R}, a_{R})$ s.t.

$E_{ν^{p(k)} \times U^{r_{R}(k, j)}} [(\hat{R}^{kj}(y) - \frac{\Pr_{μ^{k} \times U^{r_{ζ}(k, j)}} [\hat{ζ}^{kj}(x) = y]}{ν^{p(k)}(y)})^{2}] \in Δ$

(iv) There is a $\{0, 1\}^{*}$-valued $(\mathrm{poly}, \mathrm{log})$-scheme $\hat{ξ} = (ξ, r_{ξ}, a_{ξ})$ s.t.

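$E_{μ^{k} \times U^{r_{ζ}(k, j)}} [\sum_{x' \in \{0, 1\}^{*}} | \Pr_{U^{r_{ξ}(k, j)}} [\hat{ξ}^{kj}(\hat{ζ}^{kj}(x, z), w) = x'] - \Pr_{μ^{k} \times U^{r_{ζ}(k, j)}} [x'' = x' \mid \hat{ζ}^{kj}(x'', z') = \hat{ζ}^{kj}(x, z)] |] \in Δ$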

Such $\hat{ξ}$ is called a $Δ$-pseudo-inverse of $\hat{ζ}$.

A $\{0, 1\}^{*}$-valued $(\mathrm{poly}, \mathrm{log})$-scheme $\hat{ζ} = (ζ, r_{ζ}, a_{ζ})$ is called a $Δ$-pseudo-invertible weak reduction of $(f, μ)$ to $(g, ν)$ when it becomes one after adding trivial dependence on $j$.

Definition 2

An error space $Δ$ of rank 2 is called stable when for any non-constant polynomial $p : \mathbb{N} \to \mathbb{N}$ and $δ \in Δ$, the function $δ'(k, j) := δ(p(k), j)$ is in $Δ$.

Theorem

Fix $Δ$ a stable error space of rank 2. Suppose there is a polynomial $h : \mathbb{N}^{2} \to \mathbb{N}$ s.t. $h^{-1} \in Δ$. Consider $(f, μ)$, $(g, ν)$ distributional estimation problems, $\hat{ζ}$ a $Δ$-pseudo-invertible weak reduction of $(f, μ)$ to $(g, ν)$ and $\hat{P}_{g}$ a $Δ(\mathrm{poly}, \mathrm{log})$-optimal predictor scheme for $(g, ν)$. Define $\hat{P}_{f}$ by

$\hat{P}_{f}^{kj}(x, y_{1} y_{2} \dots y_{h(k, j)} z_{1} z_{2} \dots z_{h(k, j)}) := \frac{1}{h(k, j)} \sum_{i=1}^{h(k, j)} \hat{P}_{g}^{p(k), j}(\hat{ζ}^{kj}(x, y_{i}), z_{i})$

Here, we assume the lengths of the $y_{i}$ and $z_{i}$ are compatible with $\hat{ζ}$ and $\hat{P}_{g}$ respectively. Then, $\hat{P}_{f}$ is a $Δ(\mathrm{poly}, \mathrm{log})$-optimal predictor scheme for $(f, μ)$.
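One plausible reading of why the theorem averages over $h(k, j)$ independent samples is that the average concentrates around its expectation, with variance shrinking like $1/h(k, j)$, which ties in with the hypothesis that $h^{-1}$ lies in the error space. The sketch below checks this scaling exactly on a toy instance. The functions `zeta` and `p_g` are hypothetical stand-ins invented for illustration, each consuming a single random bit, and `n` plays the role of $h(k, j)$.

```python
from fractions import Fraction
from itertools import product

def zeta(x, y):
    # Hypothetical randomized reduction: flips the input bit when y == 1.
    return x ^ y

def p_g(u, z):
    # Hypothetical randomized predictor for the target problem.
    return Fraction(1, 4) + Fraction(u, 2) + Fraction(z, 4)

def p_f(x, ys, zs):
    # Average of len(ys) independent evaluations of p_g on reduced instances,
    # mirroring the shape of the construction in the theorem.
    return sum(p_g(zeta(x, y), z) for y, z in zip(ys, zs)) / len(ys)

def mean_and_variance(x, n):
    # Exact moments of p_f(x, .) over uniform bits y_1..y_n, z_1..z_n.
    outcomes = [p_f(x, bits[:n], bits[n:])
                for bits in product((0, 1), repeat=2 * n)]
    m = sum(outcomes) / len(outcomes)
    v = sum((o - m) ** 2 for o in outcomes) / len(outcomes)
    return m, v

m1, v1 = mean_and_variance(0, 1)
m4, v4 = mean_and_variance(0, 4)
```

In this toy case the mean is unchanged by averaging while the variance with four samples is exactly a quarter of the single-sample variance.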

Proposition 1

Consider $(f, μ)$, $(g, ν)$ distributional estimation problems. Suppose $\hat{ζ} = (ζ, r_{ζ}, a_{ζ})$ is a $Δ$-pseudo-invertible weak reduction of $(f, μ)$ to $(g, ν)$ and $\hat{ξ} = (ξ, r_{ξ}, a_{ξ})$ is its $Δ$-pseudo-inverse. Then there is $δ \in Δ$ s.t. for any bounded function $h : (\{0, 1\}^{*})^{2} \to \mathbb{R}$

$| E_{μ^{k} \times U^{r_{ζ}(k, j)} \times U^{r_{ξ}(k, j)}} [h(\hat{ξ}^{kj}(\hat{ζ}^{kj}(x, z), w), \hat{ζ}^{kj}(x, z))] - E_{μ^{k} \times U^{r_{ζ}(k, j)}} [h(x, \hat{ζ}^{kj}(x, z))] | \leq (\sup | h |) δ(k, j)$

Proposition 1 is proved exactly as Proposition 2 for unidistributional estimation problems.
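To make the statement of Proposition 1 concrete, here is a minimal sketch on a finite toy instance with exact rational arithmetic. Everything in it is an assumption made for illustration: the measure `mu`, the reduction `zeta`, the test function `h`, and an imperfect pseudo-inverse `xi` obtained by mixing the exact posterior with a uniform guess. It computes the expected $L^1$ distance from condition (iv) and compares it with the gap bounded in the proposition.

```python
from fractions import Fraction

# Hypothetical toy instance: mu on X = {0, 1, 2}, one uniform random bit z.
X = (0, 1, 2)
mu = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(1, 4)}

def zeta(x, z):
    # Hypothetical randomized reduction into Y = {0, 1, 2}.
    return (x + z) % 3

# Joint law of (x, zeta(x, z)) under mu x U, and the marginal law of y.
joint = {}
for x in X:
    for z in (0, 1):
        key = (x, zeta(x, z))
        joint[key] = joint.get(key, Fraction(0)) + mu[x] / 2
marginal = {}
for (x, y), p in joint.items():
    marginal[y] = marginal.get(y, Fraction(0)) + p

def posterior(x, y):
    # Pr[x'' = x | zeta(x'', z') = y]
    return joint.get((x, y), Fraction(0)) / marginal[y]

ETA = Fraction(1, 10)

def xi(x, y):
    # Output law of an imperfect pseudo-inverse: the exact posterior mixed
    # with a uniform guess over X (chosen only for illustration).
    return (1 - ETA) * posterior(x, y) + ETA * Fraction(1, len(X))

def h(x, y):
    # An arbitrary bounded test function on X x Y.
    return Fraction(3 * x - y * y)

sup_h = max(abs(h(x, y)) for x in X for y in marginal)

# First expectation in Proposition 1: resample x from xi given y.
resampled = sum(marginal[y] * xi(x, y) * h(x, y) for y in marginal for x in X)
# Second expectation: keep the original x.
original = sum(p * h(x, y) for (x, y), p in joint.items())
# Quantity controlled by condition (iv): expected L1 distance to the posterior.
delta = sum(marginal[y] * abs(xi(x, y) - posterior(x, y))
            for y in marginal for x in X)

gap = abs(resampled - original)
```

In this finite setting the inequality `gap <= sup_h * delta` follows directly from the triangle inequality, which is the shape of the bound the proposition asserts.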

Proposition 2

Consider $X$ a finite set, $μ$ a probability measure on $X$, $h : X \to \mathbb{R}$ and $s, t \in \mathbb{R}$. Then

$E_{μ} [(h (x) - s)^{2} - (h (x) - t)^{2}] = (E_{μ} [h (x)] - s)^{2} - (E_{μ} [h (x)] - t)^{2}$

Proof of Proposition 2

$E_{μ} [(h (x) - s)^{2}] = E_{μ} [h (x)^{2}] - 2 E_{μ} [h (x)] s + s^{2}$

$E_{μ} [(h (x) - t)^{2}] = E_{μ} [h (x)^{2}] - 2 E_{μ} [h (x)] t + t^{2}$

$E_{μ} [(h (x) - s)^{2} - (h (x) - t)^{2}] = 2 E_{μ} [h (x)] (t - s) + s^{2} - t^{2}$

$(E_{μ} [h (x)] - s)^{2} = E_{μ} [h (x)]^{2} - 2 E_{μ} [h (x)] s + s^{2}$

$(E_{μ} [h (x)] - t)^{2} = E_{μ} [h (x)]^{2} - 2 E_{μ} [h (x)] t + t^{2}$

$(E_{μ} [h (x)] - s)^{2} - (E_{μ} [h (x)] - t)^{2} = 2 E_{μ} [h (x)] (t - s) + s^{2} - t^{2}$
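The identity of Proposition 2 can also be checked mechanically. The following sketch uses exact rational arithmetic on a hypothetical three-point set; the measure `mu`, the function `h`, and the numbers `s`, `t` are arbitrary values picked for illustration.

```python
from fractions import Fraction

def expect(mu, fn):
    """Expectation of fn under a finite distribution mu (point -> probability)."""
    return sum(p * fn(x) for x, p in mu.items())

# Hypothetical toy data: a three-point set X, a measure mu, and a function h.
mu = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(1, 6)}
h = {"a": Fraction(3), "b": Fraction(-1), "c": Fraction(5, 2)}
s, t = Fraction(2, 7), Fraction(-4, 3)

mean_h = expect(mu, lambda x: h[x])

# Left side: expected difference of squared deviations from s and t.
lhs = expect(mu, lambda x: (h[x] - s) ** 2 - (h[x] - t) ** 2)
# Right side: the same difference with h replaced by its mean.
rhs = (mean_h - s) ** 2 - (mean_h - t) ** 2
# Common closed form obtained in the proof.
common = 2 * mean_h * (t - s) + s ** 2 - t ** 2
```

The second moment of `h` cancels in the difference, which is why the variance of `h` plays no role.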

Proof of Theorem

Consider $\hat{Q}_{f} = (Q_{f}, s_{f}, b_{f})$ a $(\mathrm{poly}, \mathrm{log})$-predictor scheme. Let $\hat{Q}_{g} = (Q_{g}, s_{g}, b_{g})$ be the $(\mathrm{poly}, \mathrm{log})$-predictor scheme defined by

$\hat{Q}_{g}^{kj}(x, y_{1} y_{2} y_{3} y_{4}) := \hat{P}_{g}^{kj}(x, y_{1}) + \hat{Q}_{f}^{kj}(\hat{ξ}^{kj}(x, y_{2}), y_{3}) - \hat{P}_{f}^{kj}(\hat{ξ}^{kj}(x, y_{2}), y_{4})$

We have

$E_{ν^{p(k)} \times U^{r_{R}(k, j)} \times U^{r_{g}(p(k), j)}} [\hat{R}^{kj}(y) (\hat{P}_{g}^{p(k), j}(y) - g(y))^{2}] \leq E_{ν^{p(k)} \times U^{r_{R}(k, j)} \times U^{s_{g}(p(k), j)}} [\hat{R}^{kj}(y) (\hat{Q}_{g}^{p(k), j}(y) - g(y))^{2}] + δ_{0}(p(k), j)$

where $δ_{0} \in Δ$ .

Applying the defining property (iii) of $\hat{R}$ to the left-hand side we get

$E_{ν^{p(k)} \times U^{r_{R}(k, j)} \times U^{r_{g}(p(k), j)}} [\hat{R}^{kj}(y) (\hat{P}_{g}^{p(k), j}(y) - g(y))^{2}] = E_{ν^{p(k)} \times U^{r_{g}(p(k), j)}} [\frac{\Pr_{μ^{k} \times U^{r_{ζ}(k, j)}} [\hat{ζ}^{kj}(x) = y]}{ν^{p(k)}(y)} (\hat{P}_{g}^{p(k), j}(y) - g(y))^{2}] + γ_{R}(k, j)$

where $| γ_{R} | \in Δ$ . Using property (ii) of pseudo-invertible reductions, we get

$E_{ν^{p(k)} \times U^{r_{R}(k, j)} \times U^{r_{g}(p(k), j)}} [\hat{R}^{kj}(y) (\hat{P}_{g}^{p(k), j}(y) - g(y))^{2}] = E_{μ^{k} \times U^{r_{g}(p(k), j)} \times U^{r_{ζ}(k, j)}} [(\hat{P}_{g}^{p(k), j}(\hat{ζ}^{kj}(x)) - g(\hat{ζ}^{kj}(x)))^{2}] + γ_{R}(k, j)$
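The change of measure behind this step can be illustrated on a finite toy instance. In the sketch below the density ratio from condition (iii) is used exactly rather than through an approximating bischeme, and `nu` is positive on the whole range of the reduction, so condition (ii) holds with zero error. The distributions `mu` and `nu`, the reduction `zeta`, and the function `loss` (standing in for the squared prediction error) are all hypothetical.

```python
from fractions import Fraction

# Hypothetical source and target distributions on {0, 1, 2}.
mu = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(1, 4)}
nu = {0: Fraction(1, 5), 1: Fraction(2, 5), 2: Fraction(2, 5)}

def zeta(x, z):
    # Hypothetical randomized reduction using one uniform bit z.
    return (x + z) % 3

def loss(y):
    # Stand-in for a squared prediction error evaluated at y.
    return Fraction(y * y + 1, 7)

# Pushforward of mu x U under zeta: Pr[zeta(x, z) = y].
push = {}
for x, px in mu.items():
    for z in (0, 1):
        y = zeta(x, z)
        push[y] = push.get(y, Fraction(0)) + px / 2

# Exact density ratio that the bischeme in condition (iii) approximates.
ratio = {y: push.get(y, Fraction(0)) / nu[y] for y in nu}

# Expectation over nu, reweighted by the ratio.
weighted = sum(nu[y] * ratio[y] * loss(y) for y in nu)
# Expectation over mu x U of the loss at the reduced instance.
direct = sum(px * Fraction(1, 2) * loss(zeta(x, z))
             for x, px in mu.items() for z in (0, 1))
```

Here `weighted` and `direct` coincide exactly; in the proof the two sides differ only by error terms controlled by conditions (ii) and (iii).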

Using the defining property (iii) of $\hat{R}$ and property (ii) of pseudo-invertible reductions on the right-hand side, we get

$E [\hat{R}^{kj}(y) (\hat{Q}_{g}^{kj}(y) - g(y))^{2}] = E [(\hat{P}_{g}^{p(k), j}(\hat{ζ}^{kj}(x)) + \hat{Q}_{f}^{kj}(\hat{ξ}^{kj}(\hat{ζ}^{kj}(x))) - \hat{P}_{f}^{kj}(\hat{ξ}^{kj}(\hat{ζ}^{kj}(x))) - g(\hat{ζ}^{kj}(x)))^{2}] + γ_{R}'(k, j)$

where $| γ_{R}' | \in Δ$. Applying Proposition 1

$E [\hat{R}^{kj}(y) (\hat{Q}_{g}^{kj}(y) - g(y))^{2}] = E [(\hat{P}_{g}^{p(k), j}(\hat{ζ}^{kj}(x)) + \hat{Q}_{f}^{kj}(x) - \hat{P}_{f}^{kj}(x) - g(\hat{ζ}^{kj}(x)))^{2}] + γ_{ξ}(k, j) + γ_{R}'(k, j)$

where $| γ_{ξ} | \in Δ$ . Applying the stability of $Δ$ to $δ_{0}$ and putting everything together

$E [(\hat{P}_{g}^{p(k), j}(\hat{ζ}^{kj}(x)) - g(\hat{ζ}^{kj}(x)))^{2}] \leq E [(\hat{P}_{g}^{p(k), j}(\hat{ζ}^{kj}(x)) + \hat{Q}_{f}^{kj}(x) - \hat{P}_{f}^{kj}(x) - g(\hat{ζ}^{kj}(x)))^{2}] + δ_{1}(k, j)$

for $δ_{1} \in Δ$ . Applying Proposition 2

$E_{μ^{k}} [E [\hat{P}_{g}^{p(k), j}(\hat{ζ}^{kj}(x)) - g(\hat{ζ}^{kj}(x))]^{2}] \leq E_{μ^{k} \times U^{s_{f}(k, j) + r_{f}(k, j)}} [(E [\hat{P}_{g}^{p(k), j}(\hat{ζ}^{kj}(x)) - g(\hat{ζ}^{kj}(x))] + \hat{Q}_{f}^{kj}(x) - \hat{P}_{f}^{kj}(x))^{2}] + δ_{1}(k, j)$

Applying property (i) of pseudo-invertible reductions

$E_{μ^{k}} [(E [\hat{P}_{g}^{p(k), j}(\hat{ζ}^{kj}(x))] - f(x))^{2}] \leq E_{μ^{k} \times U^{s_{f}(k, j) + r_{f}(k, j)}} [(E [\hat{P}_{g}^{p(k), j}(\hat{ζ}^{kj}(x))] - f(x) + \hat{Q}_{f}^{kj}(x) - \hat{P}_{f}^{kj}(x))^{2}] + δ_{2}(k, j)$

for $δ_{2} \in Δ$. Using the definition of $\hat{P}_{f}$ we get

$E [(\hat{P}_{f}^{kj}(x) - f(x))^{2}] \leq E [(\hat{P}_{f}^{kj}(x) - f(x) + \hat{Q}_{f}^{kj}(x) - \hat{P}_{f}^{kj}(x))^{2}] + δ_{3}(k, j)$

for $δ_{3} \in Δ$. Finally we get

$E [(\hat{P}_{f}^{kj}(x) - f(x))^{2}] \leq E [(\hat{Q}_{f}^{kj}(x) - f(x))^{2}] + δ_{3}(k, j)$