Performance evaluation of learning algorithms - Mohak Shah

Outline of the tutorial: Performance measures; Error Estimation/Resampling; Statistical Significance Testing; Data Set Selection and Evaluation Benchmark Design; Available resources
Page 1 and 2: Nathalie Japkowicz Mohak Shah Unive
Page 3 and 4: Yet…. Evaluation gives us a lot
Page 5 and 6: Example I: Which Classifier is bette
Page 7 and 8: Example III: What do our results me
Page 9 and 10: Book on which the tutorial is based
Page 11: What these steps depend on These s
Page 15 and 16: Overview of Performance Measures
Page 17 and 18: Aliases and other measures Accurac
Page 19 and 20: Skew and Cost considerations Skew-
Page 21 and 22: Some issues with performance measur
Page 23 and 24: Some issues with performance measur
Page 25 and 26: ROC Analysis ROC Analysis is appli
Page 27 and 28: Cost Curves ROC Curves only tell us
Page 29 and 30: Recent Developments II: The H Measu
Page 31 and 32: Other Curves Other research commun
Page 33 and 34: Probabilistic Measures II: Informatio
Page 35 and 36: Other Measures II: Visualization-
Page 37 and 38: Illustration on Multiple domains: Bre
Page 39 and 40: Other Measures III: Accounting for c
Page 41 and 42: 2 Questions of interest Inter-expert
Page 43 and 44: Coincidental concordances (Natural
Page 45 and 46: General Agreement Statistic Maximum A
Page 48 and 49: So we decided on a performance meas
Page 50 and 51: Hold-out approach Confidence
Page 52 and 53: Binomial vs. Gaussian assumptions B
Page 54 and 55: Hold-out sample size requireme
Page 56 and 57: The need for re-sampling Too
Page 58 and 59: What implicitly guides re-samp
Page 60 and 61: Fold 1: Fold 2: Simple Resampling:
Page 62 and 63: Observations k-fold CV is argu
Page 64 and 65: Limitations of Simple Resampling Do
Page 66 and 67: Multiple Resampling: ε0 Bootstrappi
Page 68 and 69: Multiple Resampling: e632 Bootstrapp
Page 70 and 71: Multiple Resampling: other approache
Page 73 and 74: Statistical Significance Testing Erro
Page 75 and 76: NHST State a null hypothesis Usua
Page 77 and 78: Issues with hypothesis testing NHST
Page 79 and 80: An overview of representative tests
Page 81 and 82: Tests covered in this tutorial
Page 83 and 84: Comparing 2 algorithms on a single
Page 85 and 86: Effect size t-test measured t
Page 87 and 88:
Page 89 and 90:
Page 91 and 92:
Page 93 and 94: Comparing 2 algorithms on a multiple
Page 95 and 96:
Page 97 and 98:
Page 99 and 100: Friedman’s Test Algorithms are r
Page 101 and 102: Nemenyi Test The omnibus test reje
Page 103 and 104:
Page 105 and 106: Where can we get our data from? Re
Page 107 and 108: Pros and Cons of Artificial Data Pr
Page 109 and 110: A Technique for Characterizing our
Page 111 and 112: Results The graph below shows the h
Page 113 and 114: What help is available for conductin
Page 115 and 116: WEKA even performs ROC Analysis and
Page 117 and 118: Where to look for Re-sampling
Page 119 and 120: If you need help, advice, etc… P