Relax and Randomize: From Value to Algorithms
A PROOFS
Proof of Proposition 1. By definition,
\[
\sum_{t=1}^{T} \mathbb{E}_{f_t \sim q_t} \ell(f_t, x_t) - \inf_{f \in \mathcal{F}} \sum_{t=1}^{T} \ell(f, x_t)
\le \sum_{t=1}^{T} \mathbb{E}_{f_t \sim q_t} \ell(f_t, x_t) + \mathbf{Rel}_T(\mathcal{F} \mid x_1, \ldots, x_T) .
\]
Peeling off the $T$-th expected loss, we have
\[
\begin{aligned}
\sum_{t=1}^{T} \mathbb{E}_{f_t \sim q_t} \ell(f_t, x_t) + \mathbf{Rel}_T(\mathcal{F} \mid x_1, \ldots, x_T)
&\le \sum_{t=1}^{T-1} \mathbb{E}_{f_t \sim q_t} \ell(f_t, x_t) + \Big\{ \mathbb{E}_{f_T \sim q_T} \ell(f_T, x_T) + \mathbf{Rel}_T(\mathcal{F} \mid x_1, \ldots, x_T) \Big\} \\
&\le \sum_{t=1}^{T-1} \mathbb{E}_{f_t \sim q_t} \ell(f_t, x_t) + \mathbf{Rel}_T(\mathcal{F} \mid x_1, \ldots, x_{T-1})
\end{aligned}
\]
where we used the fact that $q_T$ is an admissible algorithm for this relaxation, and thus the last inequality holds for any choice $x_T$ of the opponent. Repeating the process, we obtain
\[
\sum_{t=1}^{T} \mathbb{E}_{f_t \sim q_t} \ell(f_t, x_t) - \inf_{f \in \mathcal{F}} \sum_{t=1}^{T} \ell(f, x_t) \le \mathbf{Rel}_T(\mathcal{F}) .
\]
We remark that the left-hand side of this inequality is random, while the right-hand side is not. Since the inequality holds for any realization of the process, it also holds in expectation. The inequality
\[
\mathcal{V}_T(\mathcal{F}) \le \mathbf{Rel}_T(\mathcal{F})
\]
holds by unwinding the value recursively and using admissibility of the relaxation. The high-probability bound is an immediate consequence of (6) and the Hoeffding-Azuma inequality for bounded martingales. The last statement is immediate.
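The phrase "repeating the process" can be spelled out as a telescoping argument. The sketch below assumes that the admissibility condition used above for round $T$ holds in the same form at every round $t$ for the strategy $q_t$; it restates the steps of the proof and adds no new assumptions beyond that.

```latex
% Admissibility of q_t at round t, valid for every move x_t of the opponent:
\[
\mathbb{E}_{f_t \sim q_t} \ell(f_t, x_t) + \mathbf{Rel}_T(\mathcal{F} \mid x_1, \ldots, x_t)
\le \mathbf{Rel}_T(\mathcal{F} \mid x_1, \ldots, x_{t-1}),
\qquad t = 1, \ldots, T .
\]
% Applying this for t = T, T-1, ..., 1 removes one expected loss at a time:
\[
\sum_{t=1}^{T} \mathbb{E}_{f_t \sim q_t} \ell(f_t, x_t) + \mathbf{Rel}_T(\mathcal{F} \mid x_1, \ldots, x_T)
\le \sum_{t=1}^{T-1} \mathbb{E}_{f_t \sim q_t} \ell(f_t, x_t) + \mathbf{Rel}_T(\mathcal{F} \mid x_1, \ldots, x_{T-1})
\le \cdots \le \mathbf{Rel}_T(\mathcal{F}) .
\]
```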
Proof of Proposition 2. Denote $L_t(f) = \sum_{s=1}^{t} \ell(f, x_s)$. The first step of the proof is an application of the minimax theorem (we assume the necessary conditions hold):
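\[
\begin{aligned}
&\inf_{q_t \in \Delta(\mathcal{F})} \sup_{x_t \in \mathcal{X}} \left\{ \mathbb{E}_{f_t \sim q_t} [\ell(f_t, x_t)] + \sup_{\mathbf{x}} \mathbb{E}_{\epsilon_{t+1:T}} \sup_{f \in \mathcal{F}} \left[ 2 \sum_{s=t+1}^{T} \epsilon_s \ell(f, \mathbf{x}_{s-t}(\epsilon_{t+1:s-1})) - L_t(f) \right] \right\} \\
&= \sup_{p_t \in \Delta(\mathcal{X})} \inf_{f_t \in \mathcal{F}} \left\{ \mathbb{E}_{x_t \sim p_t} [\ell(f_t, x_t)] + \mathbb{E}_{x_t \sim p_t} \sup_{\mathbf{x}} \mathbb{E}_{\epsilon_{t+1:T}} \sup_{f \in \mathcal{F}} \left[ 2 \sum_{s=t+1}^{T} \epsilon_s \ell(f, \mathbf{x}_{s-t}(\epsilon_{t+1:s-1})) - L_t(f) \right] \right\}
\end{aligned}
\]
For any $p_t \in \Delta(\mathcal{X})$, the infimum over $f_t$ of the above expression is equal to
\[
\begin{aligned}
&\mathbb{E}_{x_t \sim p_t} \sup_{\mathbf{x}} \mathbb{E}_{\epsilon_{t+1:T}} \sup_{f \in \mathcal{F}} \left[ 2 \sum_{s=t+1}^{T} \epsilon_s \ell(f, \mathbf{x}_{s-t}(\epsilon_{t+1:s-1})) - L_{t-1}(f) + \inf_{f_t \in \mathcal{F}} \mathbb{E}_{x_t \sim p_t} [\ell(f_t, x_t)] - \ell(f, x_t) \right] \\
&\le \mathbb{E}_{x_t \sim p_t} \sup_{\mathbf{x}} \mathbb{E}_{\epsilon_{t+1:T}} \sup_{f \in \mathcal{F}} \left[ 2 \sum_{s=t+1}^{T} \epsilon_s \ell(f, \mathbf{x}_{s-t}(\epsilon_{t+1:s-1})) - L_{t-1}(f) + \mathbb{E}_{x_t \sim p_t} [\ell(f, x_t)] - \ell(f, x_t) \right] \\
&\le \mathbb{E}_{x_t, x'_t \sim p_t} \sup_{\mathbf{x}} \mathbb{E}_{\epsilon_{t+1:T}} \sup_{f \in \mathcal{F}} \left[ 2 \sum_{s=t+1}^{T} \epsilon_s \ell(f, \mathbf{x}_{s-t}(\epsilon_{t+1:s-1})) - L_{t-1}(f) + \ell(f, x'_t) - \ell(f, x_t) \right]
\end{aligned}
\]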
Since $x_t$ and $x'_t$ are independent and have the same distribution $p_t$, we can introduce a random sign $\epsilon_t$. The above expression then equals
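\[
\begin{aligned}
&\mathbb{E}_{x_t, x'_t \sim p_t} \mathbb{E}_{\epsilon_t} \sup_{\mathbf{x}} \mathbb{E}_{\epsilon_{t+1:T}} \sup_{f \in \mathcal{F}} \left[ 2 \sum_{s=t+1}^{T} \epsilon_s \ell(f, \mathbf{x}_{s-t}(\epsilon_{t+1:s-1})) - L_{t-1}(f) + \epsilon_t \big( \ell(f, x'_t) - \ell(f, x_t) \big) \right] \\
&\le \sup_{x_t, x'_t \in \mathcal{X}} \mathbb{E}_{\epsilon_t} \sup_{\mathbf{x}} \mathbb{E}_{\epsilon_{t+1:T}} \sup_{f \in \mathcal{F}} \left[ 2 \sum_{s=t+1}^{T} \epsilon_s \ell(f, \mathbf{x}_{s-t}(\epsilon_{t+1:s-1})) - L_{t-1}(f) + \epsilon_t \big( \ell(f, x'_t) - \ell(f, x_t) \big) \right]
\end{aligned}
\]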
where we upper bounded the expectation by the supremum. Splitting the resulting expression into two parts, we arrive at the upper bound of
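\[
2 \sup_{x_t \in \mathcal{X}} \mathbb{E}_{\epsilon_t} \sup_{\mathbf{x}} \mathbb{E}_{\epsilon_{t+1:T}} \sup_{f \in \mathcal{F}} \left[ \sum_{s=t+1}^{T} \epsilon_s \ell(f, \mathbf{x}_{s-t}(\epsilon_{t+1:s-1})) - \frac{1}{2} L_{t-1}(f) + \epsilon_t \ell(f, x_t) \right] = \mathcal{R}_T(\mathcal{F} \mid x_1, \ldots, x_{t-1}) .
\]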
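The chain of inequalities in this proof can be checked numerically on a small finite example. The sketch below is a simplification and not the construction from the proof: it assumes a finite class and a finite outcome set, and it replaces the term $2 \sum_{s} \epsilon_s \ell(f, \mathbf{x}_{s-t}(\cdot)) - L_{t-1}(f)$ (together with the supremum over trees and the expectation over future signs) by a fixed, hypothetical offset `offset[f]`. The function name `symmetrization_chain` and its arguments are illustrative only. All expectations are computed by exact enumeration.

```python
import random


def symmetrization_chain(loss, p, offset):
    """Evaluate each stage of the symmetrization argument exactly.

    loss[f][x] : loss of hypothesis f on outcome x (finite class, finite X)
    p[x]       : probability of outcome x under p_t
    offset[f]  : hypothetical stand-in for the future-sign and past-loss terms
    """
    n_f, n_x = len(loss), len(p)
    xs = range(n_x)

    def sup_f(term):
        # Supremum over the finite class of offset[f] + term(f).
        return max(offset[f] + term(f) for f in range(n_f))

    def signed_sup(x, x2):
        # Expectation over a Rademacher sign eps of
        # sup_f [offset[f] + eps * (loss(f, x') - loss(f, x))].
        return 0.5 * sum(
            sup_f(lambda f: eps * (loss[f][x2] - loss[f][x])) for eps in (-1, 1)
        )

    mean_loss = [sum(p[x] * loss[f][x] for x in xs) for f in range(n_f)]

    # Expected loss inside the supremum (before introducing x').
    jensen = sum(p[x] * sup_f(lambda f: mean_loss[f] - loss[f][x]) for x in xs)
    # Expectation over an independent copy x' pulled outside the supremum.
    tangent = sum(
        p[x] * p[x2] * sup_f(lambda f: loss[f][x2] - loss[f][x])
        for x in xs
        for x2 in xs
    )
    # Random sign introduced; equal to `tangent` because x, x' are exchangeable.
    signed = sum(p[x] * p[x2] * signed_sup(x, x2) for x in xs for x2 in xs)
    # Expectation over (x, x') replaced by the supremum.
    worst = max(signed_sup(x, x2) for x in xs for x2 in xs)
    # Split into two symmetric halves, each carrying half of the offset.
    split = 2 * max(
        0.5 * sum(
            max(offset[f] / 2 + eps * loss[f][x] for f in range(n_f))
            for eps in (-1, 1)
        )
        for x in xs
    )
    return {
        "jensen": jensen,
        "tangent": tangent,
        "signed": signed,
        "worst": worst,
        "split": split,
    }


if __name__ == "__main__":
    rng = random.Random(0)
    loss = [[rng.random() for _ in range(3)] for _ in range(4)]
    weights = [rng.random() for _ in range(3)]
    p = [w / sum(weights) for w in weights]
    offset = [rng.uniform(-1.0, 1.0) for _ in range(4)]
    for name, value in symmetrization_chain(loss, p, offset).items():
        print(f"{name:8s} {value:.6f}")
```

On any such instance the printed values should be nondecreasing from `jensen` to `split`, with `tangent` and `signed` agreeing up to floating-point error; this mirrors the order of the steps in the proof, with the halved offset playing the role of $-\frac{1}{2} L_{t-1}(f)$ in the final expression.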