Real-time feature extraction from video stream data for stream ...

3. Machine Learning

Evaluation This phase is all about evaluating the results: The data mining results are

assessed with regard to business success criteria.

Deployment The last step of the process covers the deployment of the models. This

includes the presentation of the results in a final report and their presentation to

the user. In most cases the user, and not the data analyst, will then be the one to

implement the model.

Especially in the third to fifth phases (Data Preparation, Modeling and Evaluation),

certain tasks recur frequently in practice. It therefore makes sense to use a tool that

simplifies these tasks of the data mining process. I have chosen two different frameworks

for this: RapidMiner (see 3.5.2) and the streams framework (see 3.5.3). RapidMiner was

ranked first in a 2010 poll by KDnuggets³, a data mining newspaper. Furthermore,

Rapid-I, the company that maintains and enhances RapidMiner, is one of the

project partners in the ViSTA-TV project. The streams framework was first developed

as a stream data mining plug-in for RapidMiner [Bockermann and Blom, 2012a], but is

available as a standalone tool as well. Both tools are described in further detail in the

next two sections.

3.5.2. RapidMiner

RapidMiner, formerly YALE (Yet Another Learning Environment) [Mierswa et al., 2006],

is an open-source machine learning environment. It was originally developed by Ralf

Klinkenberg, Ingo Mierswa and Simon Fischer at the Artificial Intelligence Group at TU

Dortmund University. It is now maintained and extended by Rapid-I.

RapidMiner offers a construction kit for machine learning tasks. It provides the user with

various operators. Each operator receives, processes and dispatches data. By combining

different operators, complex data mining processes can be assembled and executed.
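
The operator idea can be illustrated with a minimal Python sketch. This is not RapidMiner's actual API: the names chain, drop_missing and scale_feature, as well as the "brightness" attribute, are hypothetical and chosen only for illustration. Each operator here is assumed to be a function that receives a list of records, processes it and passes the result on; a process is the composition of such operators.

```python
from typing import Callable, Dict, List

# Hypothetical types for illustration; not part of RapidMiner.
Record = Dict[str, object]
Operator = Callable[[List[Record]], List[Record]]


def drop_missing(records: List[Record]) -> List[Record]:
    """Preprocessing operator: discard records that contain a None value."""
    return [r for r in records if None not in r.values()]


def scale_feature(name: str, factor: float) -> Operator:
    """Transformation operator factory: multiply one attribute by a factor."""
    def op(records: List[Record]) -> List[Record]:
        return [{**r, name: r[name] * factor} for r in records]
    return op


def chain(*operators: Operator) -> Operator:
    """Combine operators into one process; each output feeds the next operator."""
    def process(records: List[Record]) -> List[Record]:
        for op in operators:
            records = op(records)
        return records
    return process


# Assemble a two-step process and execute it on a toy data set.
process = chain(drop_missing, scale_feature("brightness", 0.5))
result = process([{"brightness": 10}, {"brightness": None}, {"brightness": 4}])
print(result)
```

In this sketch, assembling a more complex process only means passing further operators to chain, which mirrors how operators are combined in a construction-kit style tool.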

Figure 3.7.: Overview of the steps constituting the KDD process. Figure taken from

[Fayyad et al., 1996]

The operators cover all steps of the data mining process (see Figure 3.7): I/O,

preprocessing and transformation operators are included as well as modeling and

evaluation operators. The number of operators can be increased further by installing

additional plug-ins for certain tasks or by developing your own operators. As we are using



