08.06.2015 Views

Building Machine Learning Systems with Python - Richert, Coelho


Regression – Recommendations

You probably first met regression in a high school mathematics class, where it was likely called ordinary least squares (OLS) regression. This centuries-old technique is fast to run and works well on many real-world problems. In this chapter, we will start by reviewing OLS regression and showing how it is available in both NumPy and scikit-learn.
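
As a first taste of OLS in NumPy, here is a minimal sketch that fits a straight line with `np.linalg.lstsq`. The data is synthetic and made up for illustration (points lying exactly on y = 2x + 1), and the variable names are our own choice rather than anything prescribed by the library.

```python
import numpy as np

# Synthetic data (an assumption for illustration): points exactly on y = 2x + 1
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0

# Append a column of ones so that the model can learn an intercept term
X = np.column_stack([x, np.ones_like(x)])

# lstsq returns the least squares solution, the residual sum of squares,
# the rank of X, and the singular values of X
coef, residuals, rank, singular_values = np.linalg.lstsq(X, y, rcond=None)
slope, intercept = coef

print("slope:", slope)
print("intercept:", intercept)
```

Because the points lie exactly on a line, the fit recovers the slope and intercept up to floating-point error. scikit-learn exposes the same kind of fit through its `LinearRegression` estimator, which handles the intercept for you.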

In many modern problems, we run into the limitations of the classical methods and start to benefit from more advanced ones, which we will see later in this chapter. This is particularly true when we have many features, including when we have more features than examples (a case that ordinary least squares cannot handle correctly). These techniques are much more modern, with major developments happening in the last decade. They go by names such as lasso, ridge, or elastic nets. We will go into these in detail.
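
To make the "more features than examples" problem concrete, the following sketch uses random synthetic data with 5 examples and 10 features. It is only an illustration under stated assumptions: the data, the penalty strength `alpha`, and the omission of an intercept are our own choices, and the ridge fit uses the textbook closed-form solution rather than any particular library routine.

```python
import numpy as np

rng = np.random.default_rng(0)
n_examples, n_features = 5, 10  # more features than examples
X = rng.normal(size=(n_examples, n_features))
y = rng.normal(size=n_examples)

# The matrix X^T X of the normal equations is 10 x 10 but has rank at most 5,
# so it is singular and OLS has no unique solution
gram = X.T @ X
gram_rank = np.linalg.matrix_rank(gram)

# lstsq still returns an answer: the minimum-norm solution, which fits the
# training data exactly (a sign of overfitting, not of a good model)
ols_coef, _, _, _ = np.linalg.lstsq(X, y, rcond=None)

# Ridge adds alpha * I to X^T X, which makes the system uniquely solvable.
# alpha = 1.0 is an arbitrary value chosen for this illustration.
alpha = 1.0
ridge_coef = np.linalg.solve(gram + alpha * np.eye(n_features), X.T @ y)

print("rank of X^T X:", gram_rank, "out of", n_features)
print("OLS coefficient norm:  ", np.linalg.norm(ols_coef))
print("ridge coefficient norm:", np.linalg.norm(ridge_coef))
```

The penalty shrinks the coefficients, so the ridge solution has a smaller norm than the minimum-norm least squares solution and no longer reproduces the training targets exactly.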

Finally, we will start looking at recommendations. This is an important area because good recommendations add significant value to many applications. We will start exploring the topic here and cover it in more detail in the next chapter.

Predicting house prices with regression

Let us start with a simple problem: predicting house prices in Boston. We can use a publicly available dataset. We are given several demographic and geographical attributes, such as the crime rate or the pupil-teacher ratio, and the goal is to predict the median value of a house in a particular area. As usual, we have some training data, where the answer is known to us.
