1 Introduction

Figure 1.2  Plot of a training data set of N = 10 points, shown as blue circles, each comprising an observation of the input variable x along with the corresponding target variable t. The green curve shows the function sin(2πx) used to generate the data. Our goal is to predict the value of t for some new value of x, without knowledge of the green curve.

detailed treatment lies beyond the scope of this book.

Although each of these tasks needs its own tools and techniques, many of the key ideas that underpin them are common to all such problems. One of the main goals of this chapter is to introduce, in a relatively informal way, several of the most important of these concepts and to illustrate them using simple examples. Later in the book we shall see these same ideas re-emerge in the context of more sophisticated models that are applicable to real-world pattern recognition applications. This chapter also provides a self-contained introduction to three important tools that will be used throughout the book, namely probability theory, decision theory, and information theory. Although these might sound like daunting topics, they are in fact straightforward, and a clear understanding of them is essential if machine learning techniques are to be used to best effect in practical applications.

1.1. Example: Polynomial Curve Fitting

We begin by introducing a simple regression problem, which we shall use as a running example throughout this chapter to motivate a number of key concepts. Suppose we observe a real-valued input variable x and we wish to use this observation to predict the value of a real-valued target variable t. For the present purposes, it is instructive to consider an artificial example using synthetically generated data because we then know the precise process that generated the data for comparison against any learned model. The data for this example is generated from the function sin(2πx) with random noise included in the target values, as described in detail in Appendix A.

Now suppose that we are given a training set comprising N observations of x, written x ≡ (x_1, ..., x_N)^T, together with corresponding observations of the values of t, denoted t ≡ (t_1, ..., t_N)^T. Figure 1.2 shows a plot of a training set comprising N = 10 data points. The input data set x in Figure 1.2 was generated by choosing values of x_n, for n = 1, ..., N, spaced uniformly in range [0, 1], and the target data set t was obtained by first computing the corresponding values of the function
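The data-generation process just described can be sketched in a few lines of code: inputs spaced uniformly in [0, 1], targets given by sin(2πx) plus zero-mean Gaussian noise. The noise standard deviation (0.3 here) and the random seed are assumptions for illustration; the excerpt defers the precise noise specification to Appendix A.

```python
import numpy as np

def make_sine_data(N=10, noise_std=0.3, seed=0):
    """Generate a toy training set in the style of Figure 1.2.

    Inputs x_n are spaced uniformly in [0, 1]; targets are
    t_n = sin(2*pi*x_n) plus zero-mean Gaussian noise.
    noise_std=0.3 is an assumed value, not taken from the text.
    """
    rng = np.random.default_rng(seed)
    x = np.linspace(0.0, 1.0, N)          # uniformly spaced inputs
    t = np.sin(2.0 * np.pi * x) + rng.normal(0.0, noise_std, size=N)
    return x, t

x, t = make_sine_data()
print(x.shape, t.shape)  # (10,) (10,)
```

Because we know the generating curve, any model fitted to (x, t) can later be compared against sin(2πx) itself, which is exactly why a synthetic example is instructive here.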
