13.07.2015 Views

Data Mining: Practical Machine Learning Tools and ... - LIDeCC

26 CHAPTER 1 | WHAT’S IT ALL ABOUT?

whether or not a fault existed, but to diagnose the kind of fault, given that one was there. Thus there was no need to include fault-free cases in the training set. The measured attributes were rather low level and had to be augmented by intermediate concepts, that is, functions of basic attributes, which were defined in consultation with the expert and embodied some causal domain knowledge. The derived attributes were run through an induction algorithm to produce a set of diagnostic rules. Initially, the expert was not satisfied with the rules because he could not relate them to his own knowledge and experience. For him, mere statistical evidence was not, by itself, an adequate explanation. Further background knowledge had to be used before satisfactory rules were generated. Although the resulting rules were quite complex, the expert liked them because he could justify them in light of his mechanical knowledge. He was pleased that a third of the rules coincided with ones he used himself and was delighted to gain new insight from some of the others.

Performance tests indicated that the learned rules were slightly superior to the handcrafted ones that had previously been elicited from the expert, and this result was confirmed by subsequent use in the chemical factory. It is interesting to note, however, that the system was put into use not because of its good performance but because the domain expert approved of the rules that had been learned.

Marketing and sales

Some of the most active applications of data mining have been in the area of marketing and sales. These are domains in which companies possess massive volumes of precisely recorded data, data which, it has only recently been realized, is potentially extremely valuable. In these applications, predictions themselves are the chief interest: the structure of how decisions are made is often completely irrelevant.

We have already mentioned the problem of fickle customer loyalty and the challenge of detecting customers who are likely to defect so that they can be wooed back into the fold by giving them special treatment. Banks were early adopters of data mining technology because of their successes in the use of machine learning for credit assessment. Data mining is now being used to reduce customer attrition by detecting changes in individual banking patterns that may herald a change of bank or even life changes, such as a move to another city, that could result in a different bank being chosen. It may reveal, for example, a group of customers with above-average attrition rate who do most of their banking by phone after hours when telephone response is slow. Data mining may determine groups for whom new services are appropriate, such as a cluster of profitable, reliable customers who rarely get cash advances from their credit card except in November and December, when they are pre-
