Learning Data Mining with Python

Extracting Features with Transformers

We then create our transformer instance and fit it using this test data:

mean_discrete = MeanDiscrete()
mean_discrete.fit(X_test)

Next, we check whether the internal mean parameter was correctly set by comparing it with our independently verified result:

assert_array_equal(mean_discrete.mean, np.array([13.5, 15.5]))

We then run the transform to create the transformed dataset. We also create an (independently computed) array with the expected values for the output:

X_transformed = mean_discrete.transform(X_test)
X_expected = np.array([[0, 0],
                       [0, 0],
                       [0, 0],
                       [0, 0],
                       [0, 0],
                       [1, 1],
                       [1, 1],
                       [1, 1],
                       [1, 1],
                       [1, 1]])

Finally, we test that our returned result is indeed what we expected:

assert_array_equal(X_transformed, X_expected)

We can run the test by simply running the function itself:

test_meandiscrete()
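
The steps above can be assembled into one self-contained sketch. The MeanDiscrete class and the X_test array are defined outside this excerpt, so the versions here are assumptions: a minimal NumPy-only stand-in for the transformer, and test data chosen so that the column means come out to 13.5 and 15.5, as the assertion expects.

```python
import numpy as np
from numpy.testing import assert_array_equal


class MeanDiscrete:
    """Assumed stand-in transformer: True where a value exceeds its column mean."""

    def fit(self, X, y=None):
        X = np.asarray(X, dtype=float)
        self.mean = X.mean(axis=0)  # one mean per feature (column)
        return self

    def transform(self, X):
        X = np.asarray(X, dtype=float)
        return X > self.mean


def test_meandiscrete():
    # Assumed test data: two columns whose means are 13.5 and 15.5.
    X_test = np.array([[0, 2], [3, 5], [6, 8], [9, 11], [12, 14],
                       [15, 17], [18, 20], [21, 23], [24, 26], [27, 29]])

    mean_discrete = MeanDiscrete()
    mean_discrete.fit(X_test)
    assert_array_equal(mean_discrete.mean, np.array([13.5, 15.5]))

    X_transformed = mean_discrete.transform(X_test)
    # First five rows fall below the means, last five rows above them.
    X_expected = np.array([[0, 0]] * 5 + [[1, 1]] * 5)
    assert_array_equal(X_transformed, X_expected)


test_meandiscrete()  # raises AssertionError on a mismatch, otherwise silent
```

Note that assert_array_equal compares values rather than dtypes, so a Boolean result from transform still matches an expected array written with 0 and 1.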

If no error was raised, then the test ran without an issue! You can verify this by changing some of the expected values in the test to deliberately incorrect ones, and seeing that the test fails. Remember to change them back so that the test passes.
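
As a small illustration of what a failing check looks like, the sketch below compares the verified means against a deliberately wrong expectation. The variable names are hypothetical, and the try/except is only there so that the failure can be observed without stopping the script; in a real test you would let the AssertionError propagate.

```python
import numpy as np
from numpy.testing import assert_array_equal

# The means verified in the test above.
computed_mean = np.array([13.5, 15.5])

# Deliberately wrong expectation: the second value should be 15.5.
wrong_expected = np.array([13.5, 16.5])

try:
    assert_array_equal(computed_mean, wrong_expected)
except AssertionError:
    test_failed = True   # the mismatch was detected
else:
    test_failed = False  # would mean the check is not doing its job

print("Check failed as intended:", test_failed)
```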

If we had multiple tests, it would be worth using a testing framework called nose to run our tests.
