Real-time feature extraction from video stream data for stream ...

3.5. Tools and Frameworks

Standard Process for Data Mining (CRISP-DM), which addresses this fact by defining

a process model for carrying out data mining projects [Shearer, 2000]. Afterwards, two

software frameworks for solving machine learning tasks are presented: RapidMiner and

the streams framework.

3.5.1. CRISP-DM

CRISP-DM was designed in 1996 by a European Union project led by four companies:

SPSS (an IBM company), Teradata, Daimler AG and OHRA, a Dutch insurance company.

According to CRISP-DM, a data mining process consists of six phases, which are

shown in Figure 3.6:

Figure 3.6.: CRoss Industry Standard Process for Data Mining (CRISP-DM). Figure

taken from [Wirth, 2000]

Business Understanding This initial phase deals with understanding the objectives

and requirements of the problem from a business perspective. As a result, the

concrete learning task of the data mining problem is defined and a rough project

schedule is developed.

Data Understanding In this phase the given data is collected and analyzed in order

to verify the data quality and gain first insights. This might include statistical

analysis or subset detection. At the end, hypotheses are formulated. These hypotheses

then have to be confirmed or refuted later in the process.

Data Preparation Here the final dataset for learning is constructed. Common tasks

are data cleaning, feature extraction, the creation of new attributes, and the selection

of features.

Modeling In this phase, a selection of modeling techniques and learning algorithms is

applied to the data. Model parameters are assessed and methods for evaluating

the resulting models are devised.
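
To make the phases described so far more concrete, the following minimal Python sketch walks a toy dataset through Data Understanding, Data Preparation and Modeling. It is an illustration under stated assumptions and not part of CRISP-DM itself: the records, the attribute names (brightness, motion), the derived attribute and the nearest-centroid classifier are all hypothetical and were chosen only to keep the example short.

```python
from statistics import mean

# Hypothetical toy records: (brightness, motion, label); None marks a missing value.
RAW = [
    (0.9, 0.1, "day"),
    (0.8, None, "day"),
    (0.2, 0.7, "night"),
    (0.1, 0.9, "night"),
    (0.7, 0.2, "day"),
]


def understand(rows):
    """Data Understanding: count incomplete rows and summarise one attribute."""
    missing = sum(1 for r in rows if None in r)
    brightness = [r[0] for r in rows if r[0] is not None]
    return {"rows": len(rows), "missing": missing, "mean_brightness": mean(brightness)}


def prepare(rows):
    """Data Preparation: clean the data, derive a new attribute, select features."""
    clean = [r for r in rows if None not in r]  # data cleaning: drop incomplete rows
    # New attribute (assumed for illustration): brightness minus motion.
    # Feature selection: only the derived attribute is kept.
    return [((b - m,), label) for b, m, label in clean]


def fit_centroids(examples):
    """Modeling: a nearest-centroid classifier over the single prepared feature."""
    by_label = {}
    for features, label in examples:
        by_label.setdefault(label, []).append(features[0])
    return {label: mean(values) for label, values in by_label.items()}


def predict(centroids, features):
    """Return the label whose centroid is closest to the given feature value."""
    return min(centroids, key=lambda label: abs(centroids[label] - features[0]))


report = understand(RAW)
model = fit_centroids(prepare(RAW))
```

Calling predict(model, (0.6,)) classifies a new, already prepared example. In a real project each function would be replaced by the tooling chosen for that phase.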


