10.11.2016 Views

Learning Data Mining with Python


Next Steps…

Chapter 3: Predicting Sports Winners with Decision Trees

More on pandas

http://pandas.pydata.org/pandas-docs/stable/tutorials.html

The pandas library is a great package: anything you would normally write yourself to load data is probably already implemented in pandas. You can learn more about it from the pandas tutorials, linked above.
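As a small illustration of how much loading work pandas does for you, the sketch below reads a CSV, parses a date column, and derives a new column in a few lines. The column names and the two rows are invented for this example and do not come from the chapter's dataset.

```python
import io
import pandas as pd

# Invented sample data standing in for a results file on disk.
raw = io.StringIO(
    "Date,Visitor,VisitorPts,Home,HomePts\n"
    "2013-10-29,Team A,90,Team B,101\n"
    "2013-10-30,Team C,104,Team A,99\n"
)

# parse_dates converts the named column to datetime values while loading.
games = pd.read_csv(raw, parse_dates=["Date"])

# Derive a boolean feature: did the home team win?
games["HomeWin"] = games["HomePts"] > games["VisitorPts"]
print(games)
```

In practice you would pass a file path to `pd.read_csv` instead of the in-memory buffer used here.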

There is also a great blog post by Chris Moffitt that walks through common Excel tasks and shows how to do them in pandas: http://pbpython.com/excel-pandas-comp.html

You can also handle large datasets with pandas; see the answer from user Jeff (the top answer at the time of writing) to this StackOverflow question for an extensive overview of the process: http://stackoverflow.com/questions/14262433/large-data-work-flows-using-pandas
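One simple way to work with data that does not fit comfortably in memory is to read it in chunks and combine partial results. The sketch below shows that idea only; it is not necessarily the workflow described in the linked answer, and the function name and sample data are made up for illustration.

```python
import io
import pandas as pd

# Invented stand-in for a large CSV file on disk.
csv_data = io.StringIO(
    "team,points\n"
    "ORL,98\n"
    "MIA,105\n"
    "ORL,110\n"
    "MIA,99\n"
    "ORL,101\n"
)

def total_points_by_team(source, chunksize=2):
    """Sum points per team while holding only `chunksize` rows at a time."""
    partials = [
        # Aggregate each chunk on its own; only the small summary is kept.
        chunk.groupby("team")["points"].sum()
        for chunk in pd.read_csv(source, chunksize=chunksize)
    ]
    # Combine the per-chunk summaries into overall totals.
    return pd.concat(partials).groupby(level=0).sum()

totals = total_points_by_team(csv_data)
print(totals)
```

With a real file, a much larger `chunksize` (tens or hundreds of thousands of rows) is usually more appropriate than the tiny value used here.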

Another great tutorial on pandas is written by Brian Connelly:

http://bconnelly.net/2013/10/summarizing-data-in-python-with-pandas/

More complex features

http://www.basketball-reference.com/teams/ORL/2014_roster_status.html

Sports team lineups change regularly from game to game. What would be an easy win can turn into a difficult game if a couple of the team's best players are injured. You can get team rosters from basketball-reference as well. For example, the roster for the Orlando Magic's 2013-2014 season is available at the above link, and similar data is available for all NBA teams.

Writing code to measure how much a team's roster changes, and using that to add new features, can improve the model significantly. This task will take quite a bit of work, though!
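As a starting point, one possible way to quantify roster change is the Jaccard distance between the sets of players available in consecutive games. This is only a sketch under stated assumptions: the `roster_change` function, the `RosterChange` column name, and the placeholder player names are all hypothetical, and collecting real rosters from basketball-reference is the harder part of the task.

```python
import pandas as pd

def roster_change(previous, current):
    """Jaccard distance between two rosters: 0.0 = identical, 1.0 = no overlap."""
    previous, current = set(previous), set(current)
    union = previous | current
    if not union:
        return 0.0
    return 1 - len(previous & current) / len(union)

# Placeholder rosters for three consecutive games of one team.
rosters = [
    {"Player A", "Player B", "Player C", "Player D", "Player E"},
    {"Player A", "Player B", "Player C", "Player D", "Player E"},
    {"Player A", "Player B", "Player C", "Player F", "Player G"},
]

# The first game has no earlier roster to compare against, so use 0.0.
changes = [0.0] + [roster_change(p, c) for p, c in zip(rosters, rosters[1:])]

games = pd.DataFrame({"Game": [1, 2, 3]})
games["RosterChange"] = changes
print(games)
```

The resulting column could then be added alongside the chapter's other features before training the decision tree; whether it actually helps would need to be checked with cross-validation.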
