anytime algorithms for learning anytime classifiers saher ... - Technion
anytime algorithms for learning anytime classifiers saher ... - Technion
anytime algorithms for learning anytime classifiers saher ... - Technion
You also want an ePaper? Increase the reach of your titles
YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.
<strong>Technion</strong> - Computer Science Department - Ph.D. Thesis PHD-2008-12 - 2008<br />
significant with α = 0.05. Note that binary splits come at a price: the number<br />
of candidate splits increases and the runtime becomes significantly longer. When<br />
allocated the same time budget, LSID3-p can af<strong>for</strong>d larger samples than BLSID3p.<br />
The advantage of the latter, however, is kept.<br />
The second task is a variant on XOR-2, where there are 3 attributes a1, a2, a3,<br />
each of which can take one of the values A, C, G, T. The target concept is a ∗ 1 ⊕a∗ 2 .<br />
The values of a ∗ i are obtained from ai by mapping each A and C to 0 and each<br />
G and T to 1. The dataset, referred to as Categorial XOR-2, consists of 16<br />
randomly drawn examples. With multiway splits, both LSID3-p and C4.5 could<br />
not learn Categorial XOR-2 and their accuracy was about 50%. C4.5 failed also<br />
with binary splits. BLSID3-p, on the contrary, was 92% accurate.<br />
We also examined the bias of LSID3 toward binary attributes. For this purpose<br />
we used the example described in Section 3.5.2. An artificial dataset with<br />
all possible values was created. LSID3 with multiway and with binary splits were<br />
run 10000 times. Although a4 is as important to the class as a1 and a2, LSID3<br />
with multiway splits never chose it at the root. LSID3 with binary splits, however,<br />
split the root on a4 35% of the time. These results were similar to a1 (33%)<br />
and a2 (32%). They indicate that <strong>for</strong>cing binary splits removes the LSID3 bias.<br />
3.7.3 Anytime Behavior of the Contract Algorithms<br />
Both LSID3 and ID3-k make use of additional resources <strong>for</strong> generating better<br />
trees. However, in order to serve as good <strong>anytime</strong> <strong>algorithms</strong>, the quality of their<br />
output should improve with the increase in their allocated resources. For a typical<br />
<strong>anytime</strong> algorithm, this improvement is greater at the beginning and diminishes<br />
over time. To test the <strong>anytime</strong> behavior of LSID3 and ID3-k, we invoked them<br />
with successive values of r and k respectively. In the first set of experiments we<br />
focused on domains with nominal attributes, while in the second set we tested<br />
the <strong>anytime</strong> behavior <strong>for</strong> domains with continuous attributes.<br />
Nominal attributes<br />
Figures 3.23, 3.24 and 3.25 show the average results over 10 runs of 10-fold crossvalidation<br />
experiments <strong>for</strong> the Multiplex-XOR, XOR-10 and Tic-tac-toe datasets<br />
respectively. The x-axis represents the run time in seconds. 13 ID3 and C4.5,<br />
which are constant time <strong>algorithms</strong>, terminate quickly and do not improve with<br />
time. Since ID3-k with k = 1 and LSID3 with r = 0 are defined to be identical<br />
to ID3, the point at which ID3 yields a result serves also as the starting point of<br />
these <strong>anytime</strong> <strong>algorithms</strong>.<br />
13 The <strong>algorithms</strong> were implemented in C++, compiled with GCC, and run on Macintosh G5<br />
2.5 GHz.<br />
54