13.07.2015 Views

Data Mining: Practical Machine Learning Tools and ... - LIDeCC




INDEX 521

sigmoid kernel, 219
Simple CLI, 371, 449, 450
SimpleKMeans, 418–419
simple linear regression, 326
SimpleLinearRegression, 409
SimpleLogistic, 410
simplest-first ordering, 34
simplicity-first methodology, 83, 183
single-attribute evaluators in Weka, 421, 422–423
single-consequent rules, 118
single holdout procedure, 150
sister-of-relation, 46–47
SMO, 410
smoothing
  locally weighted linear regression, 252
  model tree, 244, 251
SMOreg, 410
software programs. See Weka workbench
sorting, avoiding repeated, 190
soybean data, 18–22
spam, 356–357
sparse data, 55–56
sparse instance in Weka, 401
SparseToNonSparse, 401
specificity, 173
specific-to-general search bias, 34
splitData(), 480
splitter nodes, 329
splitting
  clustering, 254–255, 257
  decision tree, 62–63
  entropy-based discretization, 301
  massive datasets, 347
  model tree, 245, 247
  subexperiments, 447
  surrogate, 247
SpreadSubsample, 403
squared-error loss function, 227
squared error measures, 177–179
stacked generalization, 332
stacking, 332–334
Stacking, 417
StackingC, 417
stale data, 60
standard deviation reduction (SDR), 245
standard deviations from the mean, 148
Standardize, 398
standardizing, 56
statistical modeling, 88–97
  document classification, 94–96
  missing values, 92–94
  normal-distribution assumption, 92
  numeric attributes, 92–94
statistics, 29–30
Status box, 380
step function, 227, 228
stochastic algorithms, 348
stochastic backpropagation, 232
stopping criterion, 293, 300, 326
stopwords, 310, 352
stratification, 149, 151
stratified holdout, 149
StratifiedRemoveFolds, 403
stratified cross-validation, 149
StreamableFilter, 456
string attributes, 54–55
string conversion in Weka, 399
string table, 55
StringToNominal, 399
StringToWordVector, 396, 399, 401, 462
StripChart, 431
structural patterns, 6
structure learning by conditional independence tests, 280
student’s distribution with k–1 degrees of freedom, 155
student’s t-test, 154, 184
subexperiments, 447
subsampling in Weka, 400
subset evaluators in Weka, 421, 422
subtree raising, 193, 197
subtree replacement, 192–193, 197
success rate, 173
supervised attribute filters in Weka, 402–403
supervised discretization, 297, 298
supervised filters in Weka, 401–403
supervised instance filters in Weka, 402, 403
supervised learning, 43
support, 69, 113
