Foundations of Python Network Programming 978-1-4302-3004-5

CHAPTER 9 ■ HTTP

But you should know that these other mechanisms exist if you are writing web clients, proxies, or even if you simply browse the Web yourself and are interested in controlling your identity.

HTTP Session Hijacking

A perpetual problem with cookies is that web site designers do not seem to realize that cookies need to be protected as zealously as your username and password. While it is true that well-designed cookies expire and will no longer be accepted as valid by the server, cookies give, for as long as they last, exactly as much access to a web site as a username and password. If someone can make requests to a site with your login cookie, the site will think it is you who has just logged in.

Some sites do not protect cookies at all: they might require HTTPS for your username and password, but then return you to normal HTTP for the rest of your session. And with every HTTP request, your session cookies are transmitted in the clear for anyone to intercept and start using.

Other sites are smart enough to protect subsequent page loads with HTTPS, even after you have left the login page, but they forget that requests for static data from the same domain, like images, decorations, CSS files, and JavaScript source code, will also carry your cookie. The better alternatives are to either send all of that information over HTTPS, or to carefully serve it from a different domain or path that is outside the jurisdiction of the session cookie.
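To make "outside the jurisdiction of the session cookie" concrete, here is a minimal sketch of the path-matching rule that clients apply before attaching a cookie to a request, written from the rule given in RFC 6265. The function name `path_matches` and the paths `/account` and `/static` are hypothetical, chosen only for illustration, and the code uses Python 3 syntax.

```python
def path_matches(cookie_path, request_path):
    """Return True if a cookie scoped to cookie_path applies to request_path.

    A sketch of the RFC 6265 path-match rule: the paths are identical, or
    cookie_path is a prefix of request_path that ends on a "/" boundary.
    """
    if request_path == cookie_path:
        return True
    if request_path.startswith(cookie_path):
        if cookie_path.endswith("/"):
            return True
        # The prefix must stop at a path separator, so that a cookie for
        # "/account" does not leak to "/accounting".
        return request_path[len(cookie_path)] == "/"
    return False


# A session cookie scoped to /account is sent with account pages...
print(path_matches("/account", "/account/settings"))
# ...but not with static files served from a sibling path.
print(path_matches("/account", "/static/logo.png"))
```

A cookie scoped to `/` matches every path on the domain, which is why static files served from the same domain under a site-wide cookie still receive it.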

And despite the fact that this problem has existed for years, at the time of writing it is once again back in the news with the celebrated release of Firesheep. Sites need to learn that session cookies should always be marked as secure, so that browsers will not divulge them over insecure links.
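The effect of the secure flag can be observed without a browser. The following sketch, assuming Python 3's standard library (`http.cookies`, `http.cookiejar`, and `urllib.request`, the successors of the Python 2 modules used in this chapter), first builds a `Set-Cookie` value with the flag set, and then asks a `CookieJar` whether it would attach a cookie to an HTTP or an HTTPS request. The cookie name, its value, the domain `example.com`, and the helper names are all made up for illustration; no network traffic is generated.

```python
from http.cookiejar import Cookie, CookieJar
from http.cookies import SimpleCookie
from urllib.request import Request

# Server side: build a Set-Cookie value that carries the Secure attribute.
# The cookie name and value are invented for this example.
outgoing = SimpleCookie()
outgoing["session"] = "abc123"
outgoing["session"]["secure"] = True
outgoing["session"]["path"] = "/"
set_cookie_value = outgoing["session"].OutputString()


def make_session_cookie(secure):
    """Return a client-side cookie for example.com, optionally marked secure."""
    return Cookie(
        version=0, name="session", value="abc123",
        port=None, port_specified=False,
        domain="example.com", domain_specified=False, domain_initial_dot=False,
        path="/", path_specified=True,
        secure=secure, expires=None, discard=True,
        comment=None, comment_url=None, rest={},
    )


def cookie_is_sent(cookie, url):
    """Report whether a CookieJar holding `cookie` would attach it to `url`."""
    jar = CookieJar()
    jar.set_cookie(cookie)
    request = Request(url)        # built locally, never actually opened
    jar.add_cookie_header(request)
    return request.has_header("Cookie")


print(set_cookie_value)
print(cookie_is_sent(make_session_cookie(True), "https://example.com/"))
print(cookie_is_sent(make_session_cookie(True), "http://example.com/"))
print(cookie_is_sent(make_session_cookie(False), "http://example.com/"))
```

Under the default cookie policy, the secure cookie is attached to the `https://` request only, while the unmarked cookie is also attached to the plain `http://` request, which is exactly the exposure described above.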

Earlier generations of browsers would refuse to cache content that came in over HTTPS, and that might be where some developers got into the habit of not encrypting most of their web site. But modern browsers will happily cache resources fetched over HTTPS (some will even save them on disk if the Cache-control: header is set to public), so there are no longer good reasons not to encrypt everything sent from a web site. Remember: if your users really need privacy, then exposing even which images, decorations, and JavaScript files they are downloading might allow an observer to guess which pages they are visiting and which actions they are taking on your site.
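As a rough illustration of the header mentioned above, here is a minimal WSGI sketch (Python 3) of a handler that serves a static asset and marks it as publicly cacheable. The function names, the CSS body, and the `max-age` value are assumptions made for this example, not recommendations; how a given browser actually caches the response depends on that browser.

```python
def static_asset_app(environ, start_response):
    """Serve a tiny stylesheet that caches are allowed to store."""
    body = b"body { margin: 0; }\n"
    headers = [
        ("Content-Type", "text/css"),
        ("Content-Length", str(len(body))),
        # "public" permits storage by caches; the lifetime is illustrative.
        ("Cache-Control", "public, max-age=86400"),
    ]
    start_response("200 OK", headers)
    return [body]


def call_app(app):
    """Invoke a WSGI app once and return (status, headers dict, body)."""
    captured = {}

    def start_response(status, headers):
        captured["status"] = status
        captured["headers"] = dict(headers)

    body = b"".join(app({"REQUEST_METHOD": "GET"}, start_response))
    return captured["status"], captured["headers"], body


status, headers, body = call_app(static_asset_app)
print(status, headers["Cache-Control"])
```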

Should you happen to observe or capture a Cookie: header from an HTTP request, remember that there is no need to store it in a CookieJar or represent it as a cookielib object at all. Indeed, you could not do that anyway, because the outgoing Cookie: header does not reveal the domain and path rules that the cookie was stored with. Instead, just inject the raw Cookie: header into the requests you make to the web site:

import urllib2

# Replay the captured header verbatim; no CookieJar is involved.
request = urllib2.Request(url)
request.add_header('Cookie', intercepted_value)
info = urllib2.urlopen(request)
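The snippet above targets Python 2, where the module is named urllib2. For readers on Python 3, a roughly equivalent sketch uses `urllib.request`. The helper name `build_request_with_cookie`, the URL, and the cookie string are placeholders invented for this example, and the request is only constructed and inspected here, not sent.

```python
import urllib.request


def build_request_with_cookie(url, cookie_header):
    """Return a Request that carries a raw, pre-formed Cookie header."""
    request = urllib.request.Request(url)
    request.add_header("Cookie", cookie_header)
    return request


# Placeholder values; substitute a URL and header that you are
# authorized to use.
request = build_request_with_cookie(
    "http://example.com/account",
    "sessionid=abc123; theme=dark",
)
print(request.get_header("Cookie"))

# Passing the object to urllib.request.urlopen(request) would actually
# transmit it; that step is left out so the sketch runs offline.
```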

As always, use your powers for good and not evil!<br />

Cross-Site Scripting Attacks

The earliest experiments with scripts that could run in web browsers revealed a problem: all of the HTTP requests made by the browser were done with the authority of the user’s cookies, so pages could cause quite a bit of trouble by attempting to, say, POST to the web site of a popular bank asking that money be transferred to the attacker’s account. Anyone who visited the problem site while logged on to that particular bank in another window could lose money.

To address this, browsers imposed the restriction that scripts in languages like JavaScript can only make connections back to the site that served the web page, and not to other web sites. This is called the “same origin policy.”
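What counts as "the site that served the web page" is usually defined as the combination of scheme, host, and port. The following Python 3 sketch compares two URLs on that basis; it is an illustration of the usual definition rather than of any particular browser's implementation, and the function names, the default-port table, and the example URLs are assumptions made here.

```python
from urllib.parse import urlsplit

# Default ports for the two schemes considered in this sketch.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return the (scheme, host, port) triple that identifies a URL's origin."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port or DEFAULT_PORTS.get(scheme)
    return (scheme, host, port)


def same_origin(url_a, url_b):
    """Return True if both URLs share scheme, host, and port."""
    return origin(url_a) == origin(url_b)


# Same site, with the default HTTPS port spelled out in one URL.
print(same_origin("https://bank.example/transfer", "https://bank.example:443/login"))
# A page on another host is a different origin.
print(same_origin("https://bank.example/transfer", "https://evil.example/page"))
```

Note that a change of scheme alone, from `http` to `https` on the same host, is enough to make two URLs different origins under this definition.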
