
anytime algorithms for learning anytime classifiers saher ... - Technion


Technion - Computer Science Department - Ph.D. Thesis PHD-2008-12 - 2008

17. TAE: The Teaching Assistant Evaluation dataset, taken from the UCI Repository, consists of evaluations of teaching performance; scores are "low", "medium", or "high".

18. Tic-Tac-Toe: The problem, taken from the UCI Repository, deals with the classification of legal tic-tac-toe end games, as wins or non-wins for the x player. Each example is represented by 9 nominal attributes, which represent the slot values.

19. Thyroid: In this dataset, taken from the UCI Repository, the class is diagnostic (normal or hyperthyroid). The features represent medical information and basic characteristics of the patient.

20. Voting: This dataset, taken from the UCI Repository, includes votes for each member of the U.S. House of Representatives on the 16 key votes identified by the CQA. The class of each record is Democrat or Republican.

21. Wine: This problem, taken from the UCI Repository, deals with the classification of wines into 3 class types. Each example is represented by 13 continuous attributes, which represent measures of chemical elements in the wine.

22. Zoo: In this domain, taken from the UCI Repository, the goal is to determine the type of the animal from several attributes.

23. Multiplexer: The multiplexer task was used by several researchers for evaluating classifiers, e.g., Quinlan (1993). An instance is a series of bits of length a + 2^a, where a is a positive integer. The first a bits represent an index into the remaining bits and the label of the instance is the value of the indexed bit. In our experiments we considered the 20-Multiplexer (a = 4). The dataset contains 500 randomly drawn instances.

24. Boolean XOR: Parity-like functions are known to be problematic for many learning algorithms. However, they naturally arise in real-world data, such as the Drosophila survival concept (Page & Ray, 2003). We considered XOR of five and ten variables with additional irrelevant attributes.

25. Numeric XOR: An XOR-based numeric dataset that has been used to evaluate learning algorithms, e.g., Baram, El-Yaniv, and Luz (2003). Each example consists of values for x and y coordinates. The example is labeled 1 if the product of x and y is positive, and −1 otherwise. We generalized this domain to three and four dimensions and added irrelevant variables to make the concept harder.
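
The three synthetic concepts described in items 23 to 25 can be illustrated with a short Python sketch. This is not the code used in the experiments; it only follows the definitions given in the text. Several details are assumptions made for illustration: the address bits of the multiplexer are read most significant bit first, Boolean XOR is taken to be the parity of the relevant bits, the higher-dimensional Numeric XOR is taken to be the sign of the product of the relevant coordinates, and all function names are hypothetical.

```python
import random


def multiplexer_label(bits, a):
    """Label of an (a + 2**a)-bit instance: the value of the indexed bit."""
    assert len(bits) == a + 2 ** a
    # Assumption: the first a bits form a binary index, most significant bit first.
    index = 0
    for b in bits[:a]:
        index = 2 * index + b
    return bits[a + index]


def make_multiplexer(n, a, rng):
    """Draw n random instances; n = 500 and a = 4 give the 20-Multiplexer setup."""
    data = []
    for _ in range(n):
        bits = [rng.randint(0, 1) for _ in range(a + 2 ** a)]
        data.append((bits, multiplexer_label(bits, a)))
    return data


def boolean_xor_label(bits, k):
    """Assumption: XOR of the first k bits (parity); the other bits are irrelevant."""
    return sum(bits[:k]) % 2


def numeric_xor_label(values, d):
    """1 if the product of the first d coordinates is positive, -1 otherwise.

    For d = 2 this is the definition in the text; for d = 3 or 4 it is an
    assumed generalization. Coordinates beyond the first d are irrelevant.
    """
    product = 1.0
    for v in values[:d]:
        product *= v
    return 1 if product > 0 else -1


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed, chosen arbitrarily for repeatability
    dataset = make_multiplexer(500, 4, rng)
    print(len(dataset), len(dataset[0][0]))
```

In each generator the irrelevant attributes are simply the positions past the first k (or d) values, so the number of irrelevant attributes is set by the length of the instance; the text does not state how many were used.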
