08.06.2015 Views

Building Machine Learning Systems with Python - Richert, Coelho


Getting Started with Python Machine Learning

Answering our initial question

Finally, we have arrived at a model that we think best represents the underlying process. It is now a simple matter of finding out when our infrastructure will reach 100,000 requests per hour: we have to calculate when our model function reaches the value 100,000.

Because the model is a polynomial of degree 2, we could simply compute the inverse of the function and evaluate it at 100,000. However, we would prefer an approach that applies easily to any model function.
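As a brief illustration of the inverse mentioned here, the closed form for a generic degree-2 model can be written out. The symbols a, b, c (the polynomial coefficients) and y (the target value, 100,000 in our case) are notation introduced only for this sketch:

```latex
% Solve a x^2 + b x + c = y for x (assumes a \neq 0 and a non-negative discriminant)
a x^2 + b x + (c - y) = 0
\quad\Longrightarrow\quad
x = \frac{-b \pm \sqrt{b^2 - 4a\,(c - y)}}{2a}
```

For a model that grows over time (a > 0), the larger of the two solutions is the one that lies in the future. This formula only works for degree 2, which is why a general root-finding approach is preferable.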

This can be done by subtracting 100,000 from the polynomial, which results in another polynomial, and finding its root. SciPy's optimize module provides the fsolve function for this; it requires an initial starting position. Let fbt2 be the winning polynomial of degree 2:

>>> print(fbt2)
         2
0.08844 x - 97.31 x + 2.853e+04
>>> print(fbt2-100000)
         2
0.08844 x - 97.31 x - 7.147e+04
>>> from scipy.optimize import fsolve
>>> reached_max = fsolve(fbt2-100000, 800)/(7*24)
>>> print("100,000 hits/hour expected at week %f" % reached_max[0])
100,000 hits/hour expected at week 9.827613
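The same idea can be tried without the original data. The following is a minimal sketch, assuming the rounded coefficients from the printout above stand in for fbt2 (the real fitted polynomial carries more digits, so this will not reproduce the printed week exactly). The names fbt2_approx and hours_until are hypothetical helpers introduced for illustration; wrapping the model in a function shows how the approach carries over to any callable model, and the polynomial's own roots serve as a cross-check:

```python
import numpy as np
from scipy.optimize import fsolve

# Assumption: coefficients copied from the rounded printout, not the
# full-precision fit, so the result is only approximate.
fbt2_approx = np.poly1d([0.08844, -97.31, 2.853e+04])
threshold = 100000
hours_per_week = 7 * 24


def hours_until(model, target, x0):
    """Return the x (in hours) at which model(x) equals target.

    Works for any callable model; x0 is the starting guess for fsolve.
    """
    return fsolve(lambda x: model(x) - target, x0)[0]


hour_fsolve = hours_until(fbt2_approx, threshold, x0=800)

# Cross-check for the polynomial case: the shifted polynomial's roots
# can be read off directly; the larger one lies in the future.
hour_roots = max((fbt2_approx - threshold).roots.real)

print("fsolve: week %f" % (hour_fsolve / hours_per_week))
print("roots:  week %f" % (hour_roots / hours_per_week))
```

Both lines should report the same week. The starting guess matters: fsolve converges to whichever root it reaches from x0, so a guess to the right of the parabola's minimum is used here to land on the future crossing.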

Our model tells us that, given the current user behavior and traction of our startup, it will take another month until we reach our threshold capacity.

Of course, there is some uncertainty in our prediction. To get the real picture, you can draw on more sophisticated statistics to estimate the variance we have to expect when looking farther and farther into the future.

And then there are the user and underlying user behavior dynamics that we cannot model accurately. However, at this point we are fine with the current predictions. After all, we can prepare all the time-consuming actions now. If we then monitor our web traffic closely, we will see in time when we have to allocate new resources.
