
Table 3.3: The size of the induced trees on various datasets. The numbers represent the average and standard deviation over the individual runs.

| Dataset | ID3 | C4.5 | ID3-k (k = 2) | LSID3 (r = 5) | LSID3-P (r = 5) |
|---|---|---|---|---|---|
| Autos Make | 53.6 ±4.2 | 26.6 ±1.4 | 56.5 ±2.4 | 36.6 ±2.0 | 37.1 ±2.0 |
| Autos Sym. | 46.7 ±1.8 | 20.1 ±4.1 | 47.0 ±1.8 | 26.8 ±1.7 | 26.5 ±2.1 |
| Balance | 353.9 ±6.9 | 34.8 ±5.5 | 351.6 ±7.3 | 347.7 ±7.7 | 40.1 ±6.5 |
| Br. Cancer | 129.6 ±5.9 | 6.1 ±4.0 | 124.0 ±4.9 | 100.3 ±4.4 | 7.7 ±7.1 |
| Connect-4 | 18507 ±139 | 3329 ±64 | 16143 ±44 | 14531 ±168 | 6614 ±173 |
| Corral | 9.2 ±2.1 | 5.3 ±1.2 | 7.0 ±0.6 | 6.7 ±0.5 | 6.3 ±0.8 |
| Glass | 38.7 ±2.3 | 23.9 ±2.6 | 32.3 ±2.3 | 34.2 ±2.1 | 35.5 ±2.4 |
| Iris | 8.5 ±1.0 | 4.7 ±0.5 | 9.1 ±0.9 | 7.6 ±0.8 | 6.6 ±1.6 |
| Monks-1 | 62.0 ±0.0 | 11.0 ±0.0 | 27.0 ±0.0 | 27.0 ±0.0 | 21.0 ±0.0 |
| Monks-2 | 109.0 ±0.0 | 20.0 ±0.0 | 105.0 ±0.0 | 92.4 ±3.4 | 18.3 ±5.0 |
| Monks-3 | 31.0 ±0.0 | 9.0 ±0.0 | 34.0 ±0.0 | 26.8 ±1.6 | 9.9 ±1.4 |
| Mushroom | 24.0 ±0.0 | 19.0 ±0.2 | 24.9 ±0.2 | 16.2 ±0.9 | 16.2 ±0.9 |
| Solar-Flare | 68.9 ±2.9 | 2.0 ±1.0 | 68.4 ±3.2 | 63.2 ±2.9 | 1.2 ±0.8 |
| Tic-tac-toe | 189.0 ±13.6 | 83.4 ±7.8 | 176.7 ±9.0 | 151.7 ±5.6 | 112.6 ±8.9 |
| Voting | 13.6 ±2.2 | 2.8 ±1.5 | 12.3 ±1.6 | 13.0 ±2.0 | 3.5 ±2.1 |
| Wine | 7.9 ±1.0 | 5.3 ±0.7 | 7.3 ±0.9 | 6.2 ±0.7 | 7.4 ±1.3 |
| Zoo | 13.8 ±0.4 | 8.3 ±0.8 | 13.8 ±0.5 | 9.9 ±0.8 | 9.8 ±0.9 |
| Numeric XOR-3D | 43.0 ±5.1 | 1.0 ±0.1 | 15.6 ±6.8 | 9.2 ±1.0 | 11.7 ±1.7 |
| Numeric XOR-4D | 104.7 ±4.5 | 2.7 ±1.8 | 75.5 ±14.0 | 26.8 ±4.7 | 30.9 ±5.9 |
| Multiplexer-20 | 142.8 ±8.3 | 66.1 ±6.5 | 67.1 ±29.0 | 46.6 ±20.0 | 41.2 ±16.8 |
| Multiplex-XOR | 84.0 ±5.6 | 26.0 ±5.2 | 70.2 ±5.3 | 43.9 ±5.7 | 31.6 ±4.8 |
| XOR-5 | 92.3 ±7.8 | 21.9 ±5.3 | 82.1 ±9.3 | 32.0 ±0.0 | 32.0 ±0.0 |
| XOR-5 Noise | 93.6 ±6.0 | 23.2 ±5.9 | 82.4 ±7.3 | 58.2 ±6.1 | 40.4 ±5.0 |
| XOR-10 | 3901 ±34 | 1367 ±39 | 3287 ±37 | 2004 ±585 | 1641 ±524 |

In the case of the synthetic datasets, the optimal tree size can be determined in theory.¹¹ For instance, the tree that perfectly describes the n-XOR concept is of size $2^n$. The results show that, in this sense, the trees induced by LSID3 were almost optimal.
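To make the $2^n$ figure concrete, here is a minimal Python sketch (an illustration, not code from the thesis; the helper `count_leaves` is hypothetical). It grows a tree over all attribute assignments of the n-XOR concept, splitting on one attribute at a time, and counts the resulting leaves. Because every proper subset of the attributes leaves the parity labels mixed, the recursion cannot stop early, so the count comes out to exactly $2^n$; for n = 5 this gives 32, matching the LSID3 entry for XOR-5 in Table 3.3.

```python
from itertools import product

def count_leaves(examples, attrs):
    """Grow a tree on `examples`, splitting on `attrs` in order,
    and return its number of leaves. Stops when labels are pure."""
    labels = {y for _, y in examples}
    if len(labels) <= 1 or not attrs:
        return 1  # pure node (or no attributes left): a single leaf
    a, rest = attrs[0], attrs[1:]
    zeros = [(x, y) for x, y in examples if x[a] == 0]
    ones = [(x, y) for x, y in examples if x[a] == 1]
    return count_leaves(zeros, rest) + count_leaves(ones, rest)

n = 5
# All 2**n assignments, labeled by parity (the n-XOR concept).
data = [(bits, sum(bits) % 2) for bits in product((0, 1), repeat=n)]
print(count_leaves(data, list(range(n))))  # -> 32 == 2**5
```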

Reducing the tree size is usually beneficial only if the associated accuracy is not reduced. Analyzing the accuracy of the produced trees shows that LSID3 significantly outperforms ID3 for most datasets. For the other datasets, the t-test values indicate that the algorithms are not significantly different. The average absolute improvement in the accuracy of LSID3 over ID3 is 11%. The Wilcoxon test (Demsar, 2006), which compares classifiers over multiple datasets, indicates that the advantage of LSID3 over ID3 is significant, with α = 0.05.
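For readers who want to run this kind of comparison themselves, a minimal sketch using SciPy is given below. The accuracy values are placeholders, not the thesis's actual per-dataset results; `scipy.stats.ttest_rel` and `scipy.stats.wilcoxon` are the standard SciPy implementations of the two tests mentioned above.

```python
from scipy.stats import ttest_rel, wilcoxon

# Hypothetical per-dataset mean accuracies for two classifiers
# (placeholders -- NOT the actual numbers from the thesis).
acc_id3   = [0.71, 0.64, 0.80, 0.93, 0.55, 0.88]
acc_lsid3 = [0.79, 0.75, 0.81, 0.97, 0.69, 0.90]

# Paired t-test on the matched accuracy values.
t_stat, t_p = ttest_rel(acc_lsid3, acc_id3)

# Wilcoxon signed-rank test: the nonparametric multi-dataset
# comparison recommended by Demsar (2006).
w_stat, w_p = wilcoxon(acc_lsid3, acc_id3)

print(f"paired t-test: p = {t_p:.4f}")
print(f"Wilcoxon test: p = {w_p:.4f}")
# Reject the null hypothesis of equal performance when p < 0.05.
```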

The accuracy achieved by ID3-k, as shown in Figure 3.21(b), is better than

¹¹ Note that a theoretically optimal tree is not necessarily obtainable from a given training set.

