
Technion - Computer Science Department - Ph.D. Thesis PHD-2008-12 - 2008

Gogan, 2004). If we chose to update the portfolio at specific time points,
we would like the learner to exploit the time between these updates.

Furthermore, researchers in the field can benefit from the automatic method
for cost assignments we have developed. Only a few UCI datasets have assigned
costs. In this work we designed a semi-randomized method for assigning costs
to existing datasets. We applied this method to 25 datasets and established a
repository, available at: http://www.cs.technion.ac.il/~esaher/cost.
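The thesis does not spell out the cost-assignment procedure here, but a semi-randomized scheme of the kind described could look like the following minimal sketch. The function name, the uniform-range assumption, and the default cost range are ours, not the thesis's:

```python
import random

def assign_costs(attributes, low=1.0, high=100.0, seed=None):
    """Hypothetical semi-randomized cost assignment: draw each
    attribute's test cost uniformly from [low, high].  A fixed seed
    makes the assignment reproducible, so a repository of datasets
    with shared costs can be rebuilt deterministically."""
    rng = random.Random(seed)
    return {a: round(rng.uniform(low, high), 2) for a in attributes}

# Example: assign costs to three (made-up) medical attributes.
costs = assign_costs(["age", "blood-pressure", "ecg"], seed=7)
```

The seeded generator is the "semi" part of semi-randomized under this reading: the draws are random, but fixing the seed makes the resulting cost scheme a stable, shareable artifact.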

This research can be extended in several directions. We intend to apply
monitoring techniques for optimal scheduling of the anytime learners. We also
plan to use different measures for tree quality and compare their utility. While
tree size and expected error were generally successful, in a few cases our
sampling approach did not yield a significant improvement. Using other measures
may improve the performance in these cases.

We also intend to test the performance of our framework on other cost schemes
that involve other types of cost. We believe that the generality of our framework
will allow excellent results to be obtained under other setups as well. To reduce
the runtime of our anytime algorithms, we plan to cache some of the lookahead
trees and use them, rather than resampling at each node. If a split is chosen,
the sample of already available subtrees can be used to evaluate its descendants
as well. Finally, an important advantage of our method is that it can be easily
parallelized. Assume, for example, that we decided on samples of size r. Then, r
different machines can independently form the sample and speed up the induction
process by a factor of r. We intend to consider this direction in the future.
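The parallelization argument above can be sketched as follows. The function `evaluate_candidate_tree` is a hypothetical stand-in for sampling and scoring one lookahead tree, and the worker pool plays the role of the r machines (threads are used here only for illustration; the setting described in the text would distribute the work across separate machines or processes):

```python
import random
from concurrent.futures import ThreadPoolExecutor

def evaluate_candidate_tree(seed):
    # Placeholder for sampling one random lookahead tree and returning
    # its quality score; a distinct seed per worker keeps the r sample
    # members independent, as the text requires.
    rng = random.Random(seed)
    return rng.random()

def best_of_sample(r):
    # Each of the r workers forms one member of the sample on its own,
    # so the wall-clock time of forming the whole sample shrinks by
    # roughly a factor of r.
    with ThreadPoolExecutor(max_workers=r) as pool:
        scores = list(pool.map(evaluate_candidate_tree, range(r)))
    return max(scores)
```

Because the sample members never communicate, the only sequential step left is collecting the r scores and picking the best split, which is what makes the factor-of-r speedup plausible.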

