- Page 1 and 2: Nathalie Japkowicz Mohak Shah Unive
- Page 3 and 4: Yet…. Evaluation gives us a lot
- Page 5 and 6: Example I: Which Classifier is bette
- Page 7 and 8: Example III: What do our results me
- Page 9 and 10: Book on which the tutorial is based
- Page 11: What these steps depend on These s
- Page 15 and 16: Overview of Performance Measures 15
- Page 17 and 18: Aliases and other measures Accurac
- Page 19 and 20: Skew and Cost considerations Skew-
- Page 21 and 22: Some issues with performance measur
- Page 23 and 24: Some issues with performance measur
- Page 25 and 26: ROC Analysis ROC Analysis is appli
- Page 27 and 28: Cost Curves ROC Curves only tell us
- Page 29 and 30: Recent Developments II: The H Measu
- Page 31 and 32: Other Curves Other research commun
- Page 33 and 34: Probabilistic Measures II: Informatio
- Page 35 and 36: Other Measures II: Visualization-
- Page 37 and 38: Illustration on Multiple domains: Bre
- Page 39 and 40: Other Measures III: Accounting for c
- Page 41 and 42: 2 Questions of interest Inter-expert
- Page 43 and 44: Coincidental concordances (Natural
- Page 45 and 46: General Agreement Statistic Maximum A
- Page 48 and 49: So we decided on a performance meas
- Page 50 and 51: Hold-out approach Confidence
- Page 52 and 53: Binomial vs. Gaussian assumptions B
- Page 54 and 55: Hold-out sample size requireme
- Page 56 and 57: The need for re-sampling Too
- Page 58 and 59: What implicitly guides re-samp
- Page 60 and 61: Fold 1: Fold 2: Simple Resampling:
- Page 62 and 63: Observations k-fold CV is argu
- Page 64 and 65: Limitations of Simple Resampling Do
- Page 66 and 67: Multiple Resampling: ε0 Bootstrappi
- Page 68 and 69: Multiple Resampling: e632 Bootstrapp
- Page 70 and 71: Multiple Resampling: other approache
- Page 73 and 74: Statistical Significance Testing Erro
- Page 75 and 76: NHST State a null hypothesis Usua
- Page 77 and 78: Issues with hypothesis testing NHST
- Page 79 and 80: An overview of representative tests
- Page 81 and 82: Tests covered in this tutorial 81
- Page 83 and 84: Comparing 2 algorithms on a single
- Page 85 and 86: Effect size t-test measured t
- Page 87 and 88: Comparing 2 algorithms on a single
- Page 89 and 90: Comparing 2 algorithms on a single
- Page 91 and 92: Comparing 2 algorithms on a single
- Page 93 and 94: Comparing 2 algorithms on a multiple
- Page 95 and 96: Comparing 2 algorithms on a multiple
- Page 97 and 98: Comparing 2 algorithms on a multiple
- Page 99 and 100: Friedman’s Test Algorithms are r
- Page 101 and 102: Nemenyi Test The omnibus test reje
- Page 103 and 104: 103
- Page 105 and 106: Where can we get our data from? Re
- Page 107 and 108: Pros and Cons of Artificial Data Pr
- Page 109 and 110: A Technique for Characterizing our
- Page 111 and 112: Results The graph below shows the h
- Page 113 and 114: What help is available for conductin
- Page 115 and 116: WEKA even performs ROC Analysis and
- Page 117 and 118: Where to look for Re-sampling
- Page 119 and 120: If you need help, advice, etc… P