1.2. Probability Theory

The log likelihood function can be written in the form

\[
\ln p(\mathbf{x}\,|\,\mu, \sigma^2) = -\frac{1}{2\sigma^2} \sum_{n=1}^{N} (x_n - \mu)^2 - \frac{N}{2} \ln \sigma^2 - \frac{N}{2} \ln(2\pi). \tag{1.54}
\]

Maximizing (1.54) with respect to \(\mu\), we obtain the maximum likelihood solution given by (Exercise 1.11)

\[
\mu_{\mathrm{ML}} = \frac{1}{N} \sum_{n=1}^{N} x_n \tag{1.55}
\]

which is the sample mean, i.e., the mean of the observed values \(\{x_n\}\). Similarly, maximizing (1.54) with respect to \(\sigma^2\), we obtain the maximum likelihood solution for the variance in the form

\[
\sigma^2_{\mathrm{ML}} = \frac{1}{N} \sum_{n=1}^{N} (x_n - \mu_{\mathrm{ML}})^2 \tag{1.56}
\]

which is the sample variance measured with respect to the sample mean \(\mu_{\mathrm{ML}}\). Note that we are performing a joint maximization of (1.54) with respect to \(\mu\) and \(\sigma^2\), but in the case of the Gaussian distribution the solution for \(\mu\) decouples from that for \(\sigma^2\), so that we can first evaluate (1.55) and then subsequently use this result to evaluate (1.56).

Later in this chapter, and also in subsequent chapters, we shall highlight the significant limitations of the maximum likelihood approach. Here we give an indication of the problem in the context of our solutions for the maximum likelihood parameter settings for the univariate Gaussian distribution. In particular, we shall show that the maximum likelihood approach systematically underestimates the variance of the distribution. This is an example of a phenomenon called bias and is related to the problem of over-fitting encountered in the context of polynomial curve fitting (Section 1.1). We first note that the maximum likelihood solutions \(\mu_{\mathrm{ML}}\) and \(\sigma^2_{\mathrm{ML}}\) are functions of the data set values \(x_1, \ldots, x_N\). Consider the expectations of these quantities with respect to the data set values, which themselves come from a Gaussian distribution with parameters \(\mu\) and \(\sigma^2\). It is straightforward to show that (Exercise 1.12)

\[
\mathbb{E}[\mu_{\mathrm{ML}}] = \mu \tag{1.57}
\]
\[
\mathbb{E}[\sigma^2_{\mathrm{ML}}] = \left( \frac{N-1}{N} \right) \sigma^2 \tag{1.58}
\]

so that on average the maximum likelihood estimate will obtain the correct mean but will underestimate the true variance by a factor \((N-1)/N\).
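The closed-form estimates (1.55) and (1.56) are straightforward to compute directly. The following is a minimal NumPy sketch (the data set here is synthetic, drawn for illustration; the variable names are our own, not the book's):

```python
import numpy as np

rng = np.random.default_rng(0)
mu_true, sigma2_true = 2.0, 4.0
# N observations drawn from a Gaussian with the true parameters
x = rng.normal(mu_true, np.sqrt(sigma2_true), size=1000)
N = len(x)

# Equation (1.55): the sample mean
mu_ml = x.sum() / N

# Equation (1.56): the sample variance about mu_ml (note the 1/N factor,
# not 1/(N-1)); mu_ml is evaluated first, then reused here, reflecting
# the decoupling of the two maximizations.
sigma2_ml = ((x - mu_ml) ** 2).sum() / N

print(mu_ml, sigma2_ml)
```

Note that `sigma2_ml` matches NumPy's `np.var(x)` with its default `ddof=0`, which divides by \(N\) rather than \(N-1\).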
The intuition behind this result is given by Figure 1.15. From (1.58) it follows that the following estimate for the variance parameter is unbiased

\[
\widetilde{\sigma}^2 = \frac{N}{N-1} \sigma^2_{\mathrm{ML}} = \frac{1}{N-1} \sum_{n=1}^{N} (x_n - \mu_{\mathrm{ML}})^2. \tag{1.59}
\]
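The bias in (1.58) and the correction (1.59) can be checked empirically by averaging both estimators over many independently drawn data sets; a small Monte Carlo sketch (trial counts and parameter values chosen here for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
mu, sigma2, N, trials = 0.0, 1.0, 5, 200_000

# Draw many data sets of size N and compute both variance estimates for each.
data = rng.normal(mu, np.sqrt(sigma2), size=(trials, N))
mu_ml = data.mean(axis=1)
sigma2_ml = ((data - mu_ml[:, None]) ** 2).mean(axis=1)  # eq. (1.56), 1/N
sigma2_unbiased = sigma2_ml * N / (N - 1)                # eq. (1.59), 1/(N-1)

print(sigma2_ml.mean())        # ≈ (N-1)/N * sigma2 = 0.8
print(sigma2_unbiased.mean())  # ≈ sigma2 = 1.0
```

With \(N = 5\) the underestimation factor \((N-1)/N = 0.8\) is large enough to be visible; as \(N \to \infty\) the two estimators coincide, which is why the bias matters most for small data sets.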
