
Foundations of Python Network Programming 978-1-4302-3004-5


CHAPTER 9 ■ HTTP

If the connection works properly, then neither your government nor any of the various large and shadowy corporations that track such things should be able to easily determine either the search term you used or the results you viewed.

HTTP Authentication

The HTTP protocol came with a means of authentication that was so poorly thought out and so badly implemented that it seems to have been almost entirely abandoned. When a server was asked for a page to which access was restricted, it was supposed to return a response code:

HTTP/1.1 401 Authorization Required
...
WWW-Authenticate: Basic realm="voetbal"
...

This indicated that the server did not know who was requesting the resource, so it could not decide whether to grant permission. By asking for Basic authentication, the site would induce the web browser to pop up a dialog box asking for a username and password. The information entered would then be sent back in a header as part of a second request for exactly the same resource. The authentication token was generated by doing base64 encoding on the colon-separated username and password:

>>> import base64
>>> print base64.b64encode("guido:vanOranje!")
Z3VpZG86dmFuT3JhbmplIQ==

This, of course, just protects any special characters in the username and password that might have been confused as part of the headers themselves; it does not protect the username and password at all, since they can very simply be decoded again:

>>> print base64.b64decode("Z3VpZG86dmFuT3JhbmplIQ==")
guido:vanOranje!

Anyway, once the encoded value was computed, it could be included in the second request like this:

Authorization: Basic Z3VpZG86dmFuT3JhbmplIQ==
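The interpreter sessions in this section use Python 2 syntax. As a minimal sketch of the same encoding step under Python 3, where strings must be turned into bytes before they are encoded, the header value could be built and taken apart as follows. The function names are hypothetical, and the choice of UTF-8 is an assumption that the text above does not make.

```python
import base64


def basic_auth_header(username, password):
    """Return a value suitable for an Authorization header (Basic scheme)."""
    # Join the two fields with a colon, then base64-encode the bytes.
    # UTF-8 is an assumption; the surrounding text names no encoding.
    raw = "{}:{}".format(username, password).encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def decode_basic_auth(header_value):
    """Recover (username, password) from a Basic Authorization value."""
    scheme, _, token = header_value.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a Basic credential")
    decoded = base64.b64decode(token).decode("utf-8")
    # Only the first colon separates the fields, so a password may
    # itself contain colons.
    username, _, password = decoded.partition(":")
    return username, password


header = basic_auth_header("guido", "vanOranje!")
print(header)
print(decode_basic_auth(header))
```

Because decoding needs no secret, anyone who can read the request can run the second function, which is why Basic authentication offers no confidentiality unless the connection itself is encrypted.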

An incorrect password or unknown user would elicit additional 401 errors from the server, resulting in the pop-up box appearing again and again. Finally, if the user got it right, she would either be shown the resource or, if she in fact did not have permission, be shown a response code like the following:

403 Forbidden
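The difference between the 401 and 403 outcomes can be easier to see from the server's side. The following is only an illustrative sketch in Python 3, not how any particular web server implements it: the user table, the permission set, the second user, and the function names are all hypothetical, and a real server would not store or compare plain-text passwords like this.

```python
import base64

# Hypothetical data, for illustration only.
USERS = {"guido": "vanOranje!", "piet": "hup-holland"}
ALLOWED = {"guido"}          # users permitted to view the resource
REALM = "voetbal"


def check_request(authorization):
    """Return (status_code, extra_headers) for an Authorization value."""
    challenge = {"WWW-Authenticate": 'Basic realm="%s"' % REALM}
    # No credentials, or a scheme we do not understand: ask again.
    if not authorization or not authorization.startswith("Basic "):
        return 401, challenge
    try:
        decoded = base64.b64decode(authorization[6:]).decode("utf-8")
    except ValueError:
        # Covers malformed base64 and undecodable bytes alike.
        return 401, challenge
    username, _, password = decoded.partition(":")
    # Unknown user or wrong password: the server still does not know
    # who is asking, so it repeats the challenge.
    if USERS.get(username) != password:
        return 401, challenge
    # Known user without permission: identity is settled, access is not.
    if username not in ALLOWED:
        return 403, {}
    return 200, {}


def token(user, password):
    """Build a Basic Authorization value for the demonstration below."""
    raw = ("%s:%s" % (user, password)).encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


print(check_request(None)[0])                          # no credentials
print(check_request(token("guido", "wrong"))[0])       # bad password
print(check_request(token("piet", "hup-holland"))[0])  # not permitted
print(check_request(token("guido", "vanOranje!"))[0])  # success
```

The four calls print 401, 401, 403, and 200 in that order: only the 401 responses carry a WWW-Authenticate header, which is what makes the browser prompt again.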

Python supports this kind of authentication through a handler that, as your program uses it, can accumulate a list of passwords. It is very careful to keep straight which passwords go with which web sites, lest it send the wrong one and allow one web site operator to learn your password to another site! It also checks the realm string specified by the server in its WWW-Authenticate header; this allows a single web site to have several separate areas inside that each take their own set of usernames and passwords. The handler can be created and populated with a single password like this:

auth_handler = urllib2.HTTPBasicAuthHandler()
auth_handler.add_password(realm='voetbal', uri='http://www.onsoranje.nl/',
                          user='guido', passwd='vanOranje!')

The resulting handler can be passed into build_opener(), just as we did with our debugging handler early in this chapter.
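For readers on Python 3, where the same handler classes live in urllib.request, the following sketch shows the handler being populated and passed to build_opener(), and then asks its password manager which credentials it would offer for a given realm and URL. The second realm, its credentials, and the /tickets path are hypothetical and exist only to illustrate the per-realm and per-site separation described above. No network request is made.

```python
import urllib.request

# Create the password manager explicitly so it can be queried later.
password_mgr = urllib.request.HTTPPasswordMgr()
auth_handler = urllib.request.HTTPBasicAuthHandler(password_mgr)

auth_handler.add_password(realm='voetbal', uri='http://www.onsoranje.nl/',
                          user='guido', passwd='vanOranje!')
# Hypothetical second realm on the same site, with its own credentials.
auth_handler.add_password(realm='bestuur', uri='http://www.onsoranje.nl/',
                          user='beatrix', passwd='example-only')

# The opener would answer 401 challenges automatically when used as
# opener.open(url); it is only constructed here, never called.
opener = urllib.request.build_opener(auth_handler)

# Same site, different realms: different credentials are offered.
print(password_mgr.find_user_password(
    'voetbal', 'http://www.onsoranje.nl/tickets'))
print(password_mgr.find_user_password(
    'bestuur', 'http://www.onsoranje.nl/tickets'))
# Same realm name on a different site: nothing is offered.
print(password_mgr.find_user_password('voetbal', 'http://example.com/'))
```

The last lookup returns (None, None), which is the safeguard the text describes: a password registered for one site is never offered to another, even if the other site announces the same realm name.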
