Performance evaluation of learning algorithms - Mohak Shah

“Evaluation is the key to making real progress in data mining” [Witten & Frank, 2005], p. 143
Yet evaluation gives us a lot of information, some of it contradictory. How do we make sense of it all? Evaluation standards differ from person to person and from discipline to discipline. How do we decide which standards are right for us? Evaluation gives us support for one theory over another but rarely, if ever, certainty. So where does that leave us?
Page 1: Nathalie Japkowicz Mohak Shah Unive
Page 5 and 6: Example I: Which Classifier is bette
Page 7 and 8: Example III: What do our results me
Page 9 and 10: Book on which the tutorial is based
Page 11 and 12: What these steps depend on These s
Page 14 and 15: Performance Measures Outline Ontol
Page 16 and 17: Confusion Matrix-Based Perform
Page 18 and 19: Pairs of Measures and Compounded Me
Page 20 and 21: Some issues with performance measur
Page 22 and 23: Some issues with performance measur
Page 24 and 25: Graphical Measures ROC Analysis, AU
Page 26 and 27: AUC
Page 28 and 29: Recent Developments I: Smooth ROC C
Page 30 and 31: Recent Developments II: The H Measu
Page 32 and 33: Probabilistic Measures I: RMSE The
Page 34 and 35: Other Measures I: A Multi-Crite
Page 36 and 37: True class Pos Neg Yes 82 17 No 12
Page 38 and 39: Illustration on a Multiclass domain
Page 40 and 41: Multiple Annotations
Page 42 and 43: Such measurements are also desired
Page 44 and 45: General Agreement Statistic
Page 46: Multiple raters over multiple classes
Page 49 and 50: Hold-out approach Set aside a
Page 51 and 52: A Tighter bound Based on binomial
Page 53 and 54: Binomial vs. Gaussian assumptions B
Page 55 and 56: Hold-out sample size bound Th
Page 57 and 58: What implicitly guides re-samp
Page 59 and 60: An ontology of error estimation techn
Page 61 and 62: Simple Resampling: Some variations o
Page 63 and 64: Observations Leave-One-Ou
Page 65 and 66: Multiple Resampling: Bootstrapping
Page 67 and 68: Multiple Resampling: ε0 Bootstrappi
Page 69 and 70: Discussion Bootstrap can be useful
Page 71: What to watch out when selecting err
Page 74 and 75: Statistical Significance Testing Stat
Page 76 and 77: Choosing a Statistical Test There ar
Page 78 and 79: Issues with hypothesis testing: But
Page 80 and 81: Parametric vs. Non-parametric
Page 82 and 83: Comparing 2 algorithms on a single
Page 84 and 85:
Page 86 and 87: Effect size A typical interpretati
Page 88 and 89:
Page 90 and 91:
Page 92 and 93:
Page 94 and 95: Comparing 2 algorithms on a multiple
Page 96 and 97: Comparing 2 algorithms on a multiple
Page 98 and 99: Comparing multiple algorithms on a m
Page 100 and 101: Illustration of the Friedman test Do
Page 102 and 103: Nemenyi Test: Illustration Computin
Page 104 and 105: Considerations to keep in mind while
Page 106 and 107: Pros and Cons of Repository Data P
Page 108 and 109: Pros and Cons of Web-Based Exc
Page 110 and 111: Evaluation Space Mapping Artificial
Page 112 and 113:
Page 114 and 115: Where to look for evaluation metrics
Page 116 and 117: Where to look for Statistical Tests?
Page 118 and 119: Some Concluding Remarks
Page 120: References Too many to put down he