Building Machine Learning Systems with Python - Richert, Coelho


Chapter 1

However, the speed comes at a price. Using NumPy arrays, we no longer have the incredible flexibility of Python lists, which can hold basically anything. NumPy arrays always have only one datatype.

>>> import numpy as np
>>> a = np.array([1, 2, 3])
>>> a.dtype
dtype('int64')
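To make the single-datatype restriction concrete, here is a minimal sketch of what happens when a value of another type is written into an integer array. The variable names (`first`, `rejected`, `b`) are only illustrative, and the exact integer width of `a` depends on the platform.

```python
import numpy as np

a = np.array([1, 2, 3])      # integer array; the exact width is platform dependent

# Assigning a float into an integer array silently truncates the value.
a[0] = 2.7
first = a[0]                 # stored as the integer 2

# A value that cannot be converted to the array's dtype is rejected.
try:
    a[1] = "stringy"
    rejected = False
except ValueError:
    rejected = True

# An explicit conversion creates a new array with a different dtype.
b = a.astype(float)

print(a, first, rejected, b.dtype)
```

The array never changes its datatype in place; to hold floats, a new array has to be created, here with `astype`.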

If we try to use elements of different types, NumPy will do its best to coerce them to the most reasonable common datatype:

>>> np.array([1, "stringy"])
array(['1', 'stringy'], dtype='|S8')
>>> np.array([1, "stringy", set([1,2,3])])
array([1, 'stringy', set([1, 2, 3])], dtype=object)
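The session above shows Python 2 output. As a rough sketch of the same coercion rules on Python 3, the snippet below inspects only the dtype kind (`'f'` for float, `'U'` for Unicode string, `'O'` for object) rather than the full dtype string, since string widths differ between versions. The variable names are illustrative.

```python
import numpy as np

mixed_numbers = np.array([1, 2.5])                # int + float -> float
mixed_text = np.array([1, "stringy"])             # int + str   -> string dtype
anything = np.array([1, "stringy", {1, 2, 3}])    # no common type -> object

# dtype.kind is a one-letter code describing the general category of the dtype.
kinds = (mixed_numbers.dtype.kind, mixed_text.dtype.kind, anything.dtype.kind)
print(kinds)
```

Note that once an array falls back to `dtype=object`, it stores references to ordinary Python objects, so much of the speed advantage discussed earlier is lost.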

Learning SciPy

On top of NumPy's efficient data structures, SciPy offers a multitude of algorithms that work on those arrays. Whatever numerically heavy algorithm you take from current books on numerical recipes, you will most likely find support for it in SciPy in one way or another. Whether it is matrix manipulation, linear algebra, optimization, clustering, spatial operations, or even the fast Fourier transform, the toolbox is readily filled. Therefore, it is a good habit to always inspect the scipy module before you start implementing a numerical algorithm.
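One possible way to inspect the scipy module is to list its public subpackages. The sketch below uses the standard library's `pkgutil` for this; the name `subpackages` is only illustrative, and the exact list depends on the installed SciPy version.

```python
import pkgutil

import scipy

# Collect the names of all public subpackages shipped with the installed SciPy.
subpackages = sorted(
    info.name
    for info in pkgutil.iter_modules(scipy.__path__)
    if info.ispkg and not info.name.startswith("_")
)

print(subpackages)
```

From there, `help(scipy.cluster)` or the docstring of any listed subpackage gives a summary of the functions it provides.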

For convenience, the complete namespace of NumPy is also accessible via SciPy. So, from now on, we will use NumPy's machinery via the SciPy namespace. You can check this easily by comparing the function references of any base function; for example:

>>> import scipy, numpy
>>> scipy.version.full_version
'0.11.0'
>>> scipy.dot is numpy.dot
True
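The check above reflects SciPy 0.11.0. Later SciPy releases deprecated these top-level NumPy aliases, so the comparison may not hold on a current installation. The following sketch looks the alias up defensively and reports which case applies; the `status` strings are purely illustrative.

```python
import numpy
import scipy

scipy_version = scipy.__version__

# Look the alias up defensively: it may be missing in newer SciPy releases.
alias = getattr(scipy, "dot", None)

if alias is None:
    status = "scipy.dot is not available; use numpy.dot directly"
elif alias is numpy.dot:
    status = "scipy.dot is the same object as numpy.dot"
else:
    status = "scipy.dot exists but is not the same object as numpy.dot"

print(scipy_version, "->", status)
```

If the alias is missing or differs, importing NumPy directly for array functions is the safer choice.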

The diverse algorithms are grouped into the following toolboxes:

SciPy package    Functionality
cluster          Hierarchical clustering (cluster.hierarchy)
                 Vector quantization / K-Means (cluster.vq)
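As a brief illustration of the two parts of the cluster toolbox listed above, the sketch below assigns four made-up points to two hand-picked centroids with `cluster.vq.vq`, and then groups the same points with `cluster.hierarchy.linkage` and `fcluster`. The data, the centroids, and the choice of average linkage are assumptions made for this example only.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.cluster.vq import vq

# Four toy points forming two obvious groups (made-up data).
points = np.array([[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.2, 4.9]])

# Vector quantization: assign each point to its nearest centroid.
centroids = np.array([[0.0, 0.0], [5.0, 5.0]])
codes, distances = vq(points, centroids)

# Hierarchical clustering: build the linkage tree, then cut it into 2 clusters.
tree = linkage(points, method="average")
labels = fcluster(tree, t=2, criterion="maxclust")

print(codes, labels)
```

Here `codes` holds the index of the nearest centroid for each point, while `labels` holds cluster identifiers whose numbering is chosen by `fcluster`, so only the grouping should be compared, not the numbers themselves.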
