Lehrstuhl für Informatik VI                              Tuesday, June 21st, 2011
Rheinisch–Westfälische Technische Hochschule Aachen      Simon Wiesler & Patrick Lehnen
Prof. Dr.–Ing. H. Ney

9. Exercise Sheet Pattern Recognition and Neural Networks

The solutions may be submitted in groups of up to three students until the next exercise lesson on Friday, July 1st, 2011, either in the secretariat of the Lehrstuhl für Informatik VI, by email to wiesler@cs.rwth-aachen.de and lehnen@cs.rwth-aachen.de, or at the exercise lesson. The problems marked with (** ...) are optional. Bachelor students do not need to submit this exercise sheet; all points are optional for them.
1. (Repetition) What is ...?

Give a description of each of the terms listed below. Each description should be about one paragraph long.

(a) Bayes decision rule (** 1P)
(b) Bayes error (** 1P)
(c) Maximum Likelihood (** 1P)
2. k-Nearest Neighbour Classification (* 5P)

The k-nearest neighbour method is an extension of the nearest neighbour classifier: the classification decision c(x) is taken by a majority vote among the k training samples closest to the test sample x, whereas the nearest neighbour method looks only at the single closest training sample. For k = 1, the k-nearest neighbour method reduces to the ordinary nearest neighbour classifier.
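The majority vote described above can be sketched as follows. This is an illustrative Python/NumPy sketch, not the Netlab implementation the exercise asks for; all function and variable names are invented:

```python
import numpy as np

def knn_classify(k, train_x, train_c, x):
    """Classify x by majority vote among the k nearest training samples.

    Illustrative sketch: Euclidean distance, ties broken towards the
    smallest class label by np.argmax over the vote counts.
    """
    dists = np.linalg.norm(train_x - x, axis=1)   # distance to every training sample
    nearest = np.argsort(dists)[:k]               # indices of the k closest samples
    votes = np.bincount(train_c[nearest])         # count the class labels among them
    return int(np.argmax(votes))                  # majority class

# Tiny example: two clusters in 2D
train_x = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]])
train_c = np.array([0, 0, 1, 1])
print(knn_classify(3, train_x, train_c, np.array([0.2, 0.1])))  # -> 0
```

For k = 1 this degenerates to the ordinary nearest neighbour decision, as stated above.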
Use the k-nearest neighbour implementation from “Netlab”¹ for classification on the USPS digit recognition task (see Exercise Sheet 6). Tabulate the error rate on the test and training set for k in the range from 1 to 10. Interpret your results: what effect does the parameter k have?

Hints: You will need the functions knn and knnfwd from Netlab, which are described in the help pages “knn.html” and “knnfwd.html” and the demo “demknn1.html”. A Matlab solution framework is provided at pub/PatRec/usps/kNN framework.m on our server wasserstoff.informatik.rwth-aachen.de.
¹ http://www1.aston.ac.uk/eas/research/groups/ncrg/resources/netlab/
3. Gaussian Mixture Distributions and the EM Algorithm

Consider the estimation of the parameters of a Gaussian mixture distribution with a pooled diagonal covariance matrix,

    p(x|k) = Σ_{i=1}^{I_k} c_{k,i} · N(x | µ_{k,i}, Σ),

using the Expectation Maximization (EM) algorithm, applied to the image recognition system for handwritten digits from the US Postal Service (USPS) corpus as discussed in Exercise 6 (cf. directories pub/PatRec/usps/ and pub/PatRec/usps/SOLUTION on our server wasserstoff.informatik.rwth-aachen.de).
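For reference, the mixture density above can be evaluated numerically as sketched below. This is an illustrative Python/NumPy sketch (the exercise itself uses Matlab); all names are invented, and the log-sum-exp trick is used for numerical stability:

```python
import numpy as np

def log_mixture_density(x, c, mu, var):
    """log p(x|k) = log sum_i c_{k,i} N(x | mu_{k,i}, Sigma) for one class k,
    with a shared diagonal covariance Sigma = diag(var).

    x   : (D,)   observation
    c   : (I,)   mixture weights of class k
    mu  : (I, D) mean vectors of class k
    var : (D,)   pooled diagonal variances sigma_d^2
    """
    diff = x[None, :] - mu                                        # (I, D)
    log_gauss = -0.5 * np.sum(diff**2 / var + np.log(2 * np.pi * var), axis=1)
    return np.logaddexp.reduce(np.log(c) + log_gauss)             # log-sum-exp

# Sanity check: for I = 1 this is the usual Gaussian log-density;
# a 2D standard normal evaluated at the origin gives -log(2*pi).
val = log_mixture_density(np.zeros(2), np.array([1.0]), np.zeros((1, 2)), np.ones(2))
print(val)  # -> -1.8378... = -log(2*pi)
```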
(a) Derive the EM estimates of the mean vectors µ_{k,i}, the pooled diagonal covariance matrix Σ with Σ_dd ≡ σ_d², and the mixture weights c_{k,i} of Gaussian mixture distributions, for k = 1, ..., K and i = 1, ..., I_k. (* 5P)

(b) Implement the EM estimation of the above Gaussian mixture distributions for USPS. Use the solution presented in Exercise 6 as a starting point and use the Maximum Likelihood training result from Exercise 6 as initialization, which you find in the correct format at pub/PatRec/usps/usps.mixture.initparam on our server wasserstoff. Your implementation should take the number of EM re-estimation iterations as an additional input parameter.

In order to be able to increase the number I_k of densities per class, double each density by perturbing it by a small amount in opposite directions, µ → {µ − ε, µ + ε}, and distribute the mixture weight equally between both new densities. Make sure that the perturbations ε are proportional to, and smaller than, the corresponding variances.

Assume the training data is given by D-dimensional observation vectors x_n ∈ ℝ^D, n = 1, ..., N. Use the format introduced in pub/PatRec/usps/usps.mixture.README on our server wasserstoff to read the initial and store the resulting parameter sets. Use pub/PatRec/usps/usps.train to produce parameter sets for I_k = 2 and I_k = 4 using 10 EM iterations each. Monitor the average score (log-likelihood) per observation for each iteration and check that it increases. (* 6P)

(c) Generalize the recognizer for the USPS problem from Exercise 6 to the case of mixture densities together with pooled diagonal covariance matrices. Recognize pub/PatRec/usps/usps.test using the parameters obtained above for I_k = 2 and I_k = 4 and compare the results to those obtained in Exercise 6. (* 4P)
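One possible shape of a single EM iteration and of the density splitting for part (b), sketched in Python/NumPy rather than Matlab. All names are invented; the sketch handles the mixture of one class only, with the variances pooled over its densities (with several classes, the squared deviations would be accumulated over all classes before normalizing, so that Σ is pooled globally):

```python
import numpy as np

def em_step(x, c, mu, var):
    """One EM iteration for a single class's Gaussian mixture with a
    diagonal covariance pooled over its densities (illustrative sketch).

    x   : (N, D) training observations of this class
    c   : (I,)   mixture weights
    mu  : (I, D) mean vectors
    var : (D,)   diagonal variances sigma_d^2
    """
    N = x.shape[0]
    # E-step: responsibilities gamma_{n,i} ~ c_i N(x_n | mu_i, var),
    # computed in the log domain for numerical stability
    diff = x[:, None, :] - mu[None, :, :]                         # (N, I, D)
    log_p = np.log(c) - 0.5 * np.sum(diff**2 / var + np.log(2 * np.pi * var), axis=2)
    log_norm = np.logaddexp.reduce(log_p, axis=1, keepdims=True)  # log p(x_n)
    gamma = np.exp(log_p - log_norm)                              # (N, I)
    # M-step: closed-form re-estimates
    Ni = gamma.sum(axis=0)
    c_new = Ni / N                                                # mixture weights
    mu_new = (gamma.T @ x) / Ni[:, None]                          # mean vectors
    diff_new = x[:, None, :] - mu_new[None, :, :]
    var_new = np.sum(gamma[:, :, None] * diff_new**2, axis=(0, 1)) / N
    return c_new, mu_new, var_new, log_norm.mean()  # last value: avg. log-likelihood

def split_densities(c, mu, var, factor=0.1):
    """Double the number of densities: mu -> {mu - eps, mu + eps}, with eps
    proportional to (and smaller than) the corresponding variances, and the
    weight of each density shared equally by its two offspring."""
    eps = factor * var
    return np.concatenate([c / 2, c / 2]), np.concatenate([mu - eps, mu + eps])
```

Monitoring the returned average log-likelihood over the iterations provides the required check that it never decreases.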
Hints: Use the initial parameter set pub/PatRec/usps/usps.mixture.initparam for the case I_k = 1 as a test case both for your implementation of the training (without previous splitting, the parameters for the case I_k = 1 must remain unchanged) and for your classifier (the result must be the same as that of the old implementation using pooled diagonal covariances with one Gaussian per class).
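The generalized recognizer of part (c) reduces to the Bayes decision rule over the per-class mixture scores. A minimal Python/NumPy sketch, assuming uniform class priors and invented names (class_params maps each class label k to its weights and means, with the variances pooled over all classes):

```python
import numpy as np

def log_mixture_density(x, c, mu, var):
    # log p(x|k) for one class's mixture, via log-sum-exp (names invented)
    diff = x[None, :] - mu                                        # (I, D)
    log_gauss = -0.5 * np.sum(diff**2 / var + np.log(2 * np.pi * var), axis=1)
    return np.logaddexp.reduce(np.log(c) + log_gauss)

def classify(x, class_params, var):
    """Bayes decision rule with uniform priors: r(x) = argmax_k log p(x|k).

    class_params : dict mapping class label k -> (c_k, mu_k)
    var          : (D,) diagonal variances pooled over all classes
    """
    scores = {k: log_mixture_density(x, c, mu, var)
              for k, (c, mu) in class_params.items()}
    return max(scores, key=scores.get)
```

With I_k = 1 for every class, this reproduces the single-Gaussian classifier with pooled diagonal covariances from Exercise 6, which is exactly the consistency check the hint above describes.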