Real-time feature extraction from video stream data for stream ...

3.2. Supervised Learning

In the literature, the term distance measure is used synonymously with the term dissimilarity measure, as the idea behind such a measure is that objects with a large distance between them are not similar to each other. Popular examples of distance measures are the Euclidean distance, given by

d(x^{(1)}, x^{(2)}) = \sqrt{(x_1^{(1)} - x_1^{(2)})^2 + (x_2^{(1)} - x_2^{(2)})^2 + \dots + (x_j^{(1)} - x_j^{(2)})^2}

or the Manhattan distance, defined as

d(x^{(1)}, x^{(2)}) = |x_1^{(1)} - x_1^{(2)}| + |x_2^{(1)} - x_2^{(2)}| + \dots + |x_j^{(1)} - x_j^{(2)}|

In Figure 3.2 the Euclidean distance was chosen.
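The two measures above can be sketched in a few lines of Python (a minimal illustration; the function names are chosen here for clarity and do not appear in the thesis):

```python
import math

def euclidean_distance(x1, x2):
    # square root of the sum of squared coordinate differences
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x1, x2)))

def manhattan_distance(x1, x2):
    # sum of absolute coordinate differences
    return sum(abs(a - b) for a, b in zip(x1, x2))

# For the points (0, 0) and (3, 4):
# euclidean_distance gives 5.0, manhattan_distance gives 7
```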

3.2.2. Decision trees

Decision trees are an intuitive way to represent a prediction model f̂, and hence decision tree learners are a commonly used method in machine learning. "A decision tree is a flowchart-like tree structure, where each internal node (nonleaf node) denotes a test on an attribute [= feature], each branch represents an outcome of the test, and each leaf node (or terminal node) holds a class label." [Han and Kamber, 2006]. An example of such a decision tree is shown in Figure 3.3.

[Figure 3.3: a decision tree with root node age (branches youth, middle_aged, senior); the youth branch leads to a test on student, the middle_aged branch directly to the class label yes, and the senior branch to a test on credit_rating (branches fair, excellent); the leaf nodes hold the class labels yes and no.]

Figure 3.3.: A decision tree to predict whether a customer at an electronics store is likely to buy a computer or not. Example taken from [Han and Kamber, 2006].
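The decision rules of Figure 3.3 can be written out as a short Python function (a sketch for illustration; the function name is my own, the attribute names and leaf labels follow the figure from [Han and Kamber, 2006]):

```python
def buys_computer(age, student, credit_rating):
    # Root node: test the attribute age
    if age == "youth":
        # youth branch: test the attribute student
        return "yes" if student == "yes" else "no"
    elif age == "middle_aged":
        # middle_aged customers are classified as buyers directly
        return "yes"
    else:  # senior
        # senior branch: test the attribute credit_rating
        return "yes" if credit_rating == "fair" else "no"
```

Each call traces exactly one root-to-leaf path; the returned string is the class label held by the reached leaf node.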

For this thesis, it is sufficient to focus on the top-down induction of decision trees (TDIDT), as top-down approaches are by far the most common method to infer a decision tree from given training examples X_train. TDIDT approaches are based on two
