Learning binary relations using weighted majority voting


Furthermore, for y ≥ 2,

$$\frac{4y^2 - 7y + 2}{6y^2 - 10y + 3} \;\ge\; \frac{4}{7} \;\ge\; \frac{1}{2},$$

and thus it suffices to have x ≤ ym/2, which is the case.

Finally, it can be verified that

$$f_{xx}f_{yy} - (f_{xy})^2 \;\ge\; 0$$

for y ≥ 2: written as a single fraction, this determinant has the positive denominator $16 b^2 (\ln 2)^2 (y-1)^2\, m\, (f(x,y))^4$, and its numerator can be verified to be nonnegative for y ≥ 2.

This completes the proof that f(x, y) is concave over the desired interval.
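The concavity argument above rests on the standard second-order conditions for a twice-differentiable function of two variables: f is concave on a region when f_xx ≤ 0 and f_xx f_yy - (f_xy)² ≥ 0 hold throughout it. The following is a minimal sketch of checking these conditions numerically; the stand-in function g and the test point are invented for illustration and are not the paper's f(x, y).

```python
import math

def concave_at(f, x, y, h=1e-3):
    """Check the second-order concavity conditions at (x, y) using
    central finite differences: f_xx <= 0 and f_xx*f_yy - f_xy**2 >= 0."""
    fxx = (f(x + h, y) - 2 * f(x, y) + f(x - h, y)) / h**2
    fyy = (f(x, y + h) - 2 * f(x, y) + f(x, y - h)) / h**2
    fxy = (f(x + h, y + h) - f(x + h, y - h)
           - f(x - h, y + h) + f(x - h, y - h)) / (4 * h**2)
    return fxx <= 0 and fxx * fyy - fxy**2 >= 0

# Stand-in concave function (not the paper's f): sqrt(x) + ln(y) for x, y > 0.
g = lambda x, y: math.sqrt(x) + math.log(y)
print(concave_at(g, 3.0, 2.5))   # expected: True
```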

Notes

1. The adversary, who tries to maximize the learner's mistakes, knows the learner's algorithm and has unlimited computing power.

2. The halving algorithm is described in more detail in Section 2.

3. Throughout this paper we let lg denote the base-2 logarithm and ln the natural logarithm. Euler's constant is denoted by e.

4. The same result holds if one performs the updates after each trial.

5. A version of this second algorithm is described in detail in Figure 3. For the sake of simplicity it is parameterized by an update factor γ = 2β/(1 + β) instead of β. See Section 5 for a discussion of this algorithm, called Learn-Relation(γ).

6. We can obtain a bound of half of (1) by either letting the algorithm predict probabilistically in {0, 1} or deterministically in the interval [0, 1] (Cesa-Bianchi, Freund, Helmbold, Haussler & Schapire, 1993; Kivinen & Warmuth, 1994). In the case of probabilistic predictions this quantity is an upper bound on the expected number of mistakes. In the case of deterministic predictions in [0, 1], this quantity is an upper bound on the loss of the algorithm, where the loss is just the sum over all trials of the absolute difference between the prediction and the correct value (see the sketch following these notes).

7. Recall that g(z) = 1/(1 + 2z + 2z²) and g(∞) = 0.

8. Again, we can obtain improvements of a factor of 2 by either letting the algorithm predict probabilistically in {0, 1} or deterministically in the interval [0, 1] (Cesa-Bianchi, Freund, Helmbold, Haussler, Schapire & Warmuth, 1993; Kivinen & Warmuth, 1994).
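To make the absolute loss in note 6 concrete, here is a minimal sketch, assuming deterministic predictions in [0, 1] scored against binary outcomes; the prediction values are invented purely for illustration.

```python
def absolute_loss(predictions, outcomes):
    """Sum over all trials of |prediction - correct value| (note 6)."""
    return sum(abs(p - o) for p, o in zip(predictions, outcomes))

preds = [0.9, 0.2, 0.5, 1.0]   # deterministic predictions in [0, 1]
truth = [1, 0, 1, 1]           # correct binary values
print(round(absolute_loss(preds, truth), 2))   # 0.1 + 0.2 + 0.5 + 0.0 = 0.8
```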

References

Angluin, D. (1988). Queries and concept learning. Machine Learning, 2(4): pp. 319-342.

Barzdin, J. & Freivald, R. (1972). On the prediction of general recursive functions. Soviet Mathematics Doklady, 13: pp. 1224-1228.

Cesa-Bianchi, N., Freund, Y., Helmbold, D., Haussler, D., Schapire, R. & Warmuth, M. (1993). How to use expert advice. Proceedings of the Twenty-Fifth Annual ACM Symposium on Theory of Computing, pp. 382-391.

Chen, W. (1992). Personal communication.

Goldman, S., Rivest, R. & Schapire, R. (1993). Learning binary relations and total orders. SIAM Journal on Computing, 22(5): pp. 1006-1034.

