13.07.2015 Views

Data Mining: Practical Machine Learning Tools and ... - LIDeCC

purchases? Should the manager move the most expensive, most profitable diapers near the beer, increasing sales to harried fathers of a high-margin item and add further luxury baby products nearby?

Of course, anyone who uses advanced technologies should consider the wisdom of what they are doing. If data is characterized as recorded facts, then information is the set of patterns, or expectations, that underlie the data. You could go on to define knowledge as the accumulation of your set of expectations and wisdom as the value attached to knowledge. Although we will not pursue it further here, this issue is worth pondering.

As we saw at the very beginning of this chapter, the techniques described in this book may be called upon to help make some of the most profound and intimate decisions that life presents. Data mining is a technology that we need to take seriously.

1.7 Further reading

To avoid breaking up the flow of the main text, all references are collected in a section at the end of each chapter. This first Further reading section describes papers, books, and other resources relevant to the material covered in Chapter 1. The human in vitro fertilization research mentioned in the opening to this chapter was undertaken by the Oxford University Computing Laboratory, and the research on cow culling was performed in the Computer Science Department at the University of Waikato, New Zealand.

The example of the weather problem is from Quinlan (1986) and has been widely used to explain machine learning schemes. The corpus of example problems mentioned in the introduction to Section 1.2 is available from Blake et al. (1998). The contact lens example is from Cendrowska (1998), who introduced the PRISM rule-learning algorithm that we will encounter in Chapter 4. The iris dataset was described in a classic early paper on statistical inference (Fisher 1936). The labor negotiations data is from the Collective bargaining review, a publication of Labour Canada issued by the Industrial Relations Information Service (BLI 1988), and the soybean problem was first described by Michalski and Chilausky (1980).

Some of the applications in Section 1.3 are covered in an excellent paper that gives plenty of other applications of machine learning and rule induction (Langley and Simon 1995); another source of fielded applications is a special issue of the Machine Learning Journal (Kohavi and Provost 1998). The loan company application is described in more detail by Michie (1989), the oil slick detector is from Kubat et al. (1998), the electric load forecasting work is by Jabbour et al. (1988), and the application to preventative maintenance of electromechanical devices is from Saitta and Neri (1998). Fuller descriptions
