Foundations of Python Network Programming 978-1-4302-3004-5
CHAPTER 9 ■ HTTP
But you should know that these other mechanisms exist if you are writing web clients, proxies, or even if
you simply browse the Web yourself and are interested in controlling your identity.
HTTP Session Hijacking
A perpetual problem with cookies is that web site designers do not seem to realize that cookies need to
be protected as zealously as your username and password. While it is true that well-designed cookies
expire and will no longer be accepted as valid by the server, cookies give, for as long as they last, exactly as
much access to a web site as a username and password. If someone can make requests to a site with your
login cookie, the site will treat those requests as coming from you, exactly as if you had just logged in.
Some sites do not protect cookies at all: they might require HTTPS for your username and password,
but then return you to normal HTTP for the rest of your session. And with every HTTP request, your
session cookies are transmitted in the clear for anyone to intercept and start using.
Other sites are smart enough to protect subsequent page loads with HTTPS, even after you have left
the login page, but they forget that static data from the same domain, like images, decorations, CSS files,
and JavaScript source code, will also carry your cookie. The better alternatives are to either send all of
that information over HTTPS, or to carefully serve it from a different domain or path that is outside the
jurisdiction of the session cookie.
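
The idea of a cookie's "jurisdiction" can be sketched with the standard library's cookie jar, which applies the same kind of domain and path matching that a browser does. This is only an illustration under stated assumptions: it uses Python 3 (where cookielib is named http.cookiejar), and the host names, paths, cookie name, and the two helper functions are all hypothetical. No network request is made; the jar is simply asked which Cookie: header it would attach to each URL.

```python
from http.cookiejar import Cookie, CookieJar
from urllib.request import Request

def make_session_cookie(domain, path):
    """Build a version-0 (Netscape-style) cookie scoped to a domain and path."""
    return Cookie(
        version=0, name='session', value='abc123',   # hypothetical name and value
        port=None, port_specified=False,
        domain=domain, domain_specified=False, domain_initial_dot=False,
        path=path, path_specified=True,
        secure=False, expires=None, discard=True,
        comment=None, comment_url=None, rest={},
    )

def cookie_header_for(jar, url):
    """Return the Cookie: header the jar would attach to a request, or None."""
    request = Request(url)          # constructed only; nothing is sent
    jar.add_cookie_header(request)
    return request.get_header('Cookie')

jar = CookieJar()
jar.set_cookie(make_session_cookie('www.example.com', '/app'))

# Inside the cookie's path: the session cookie is attached.
app_header = cookie_header_for(jar, 'https://www.example.com/app/inbox')
# Same host, but a path outside the cookie's scope: no cookie is attached.
static_header = cookie_header_for(jar, 'https://www.example.com/static/logo.png')
# A different domain entirely: no cookie is attached.
other_host_header = cookie_header_for(jar, 'https://static.example.net/logo.png')

print(app_header, static_header, other_host_header)
```

Under these assumptions, only the request under /app carries the session cookie, which is the property a site relies on when it moves static content to a separate path or domain.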
And although this problem has existed for years, at the time of writing it is once again back in
the news with the celebrated release of Firesheep. Sites need to learn that session cookies should always
be marked as secure, so that browsers will not divulge them over insecure links.
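
As a rough sketch of what marking a cookie as secure looks like on the server side, the following builds a Set-Cookie: header with the standard library. It assumes Python 3 (where the Cookie module is named http.cookies); the cookie name and value are hypothetical, and a real web framework would normally offer its own way of setting this flag.

```python
from http.cookies import SimpleCookie

cookie = SimpleCookie()
cookie['session'] = 'abc123'          # hypothetical session identifier
cookie['session']['path'] = '/'
cookie['session']['secure'] = True    # ask browsers to send it over HTTPS only

# The text a server would emit as a response header line.
set_cookie_header = cookie.output()
print(set_cookie_header)
```

The resulting header carries the Secure attribute, which is what tells a browser to withhold the cookie from plain HTTP requests.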
Earlier generations of browsers would refuse to cache content that came in over HTTPS, and that
might be where some developers got into the habit of not encrypting most of their web site. But modern
browsers will happily cache resources fetched over HTTPS (some will even save them on disk if the
Cache-Control: header is set to public), so there are no longer good reasons not to encrypt everything sent
from a web site. Remember: If your users really need privacy, then exposing even what images,
decorations, and JavaScript they are downloading might allow an observer to guess which pages they are
visiting and which actions they are taking on your site.
Should you happen to observe or capture a Cookie: header from an HTTP request,
remember that there is no need to store it in a CookieJar or represent it as a cookielib object at all.
Indeed, you could not do that anyway because the outgoing Cookie: header does not reveal the domain
and path rules that the cookie was stored with. Instead, just inject the Cookie: header raw into the
requests you make to the web site:
import urllib2

# url and intercepted_value are assumed to be defined already
request = urllib2.Request(url)
request.add_header('Cookie', intercepted_value)  # send the captured header verbatim
info = urllib2.urlopen(request)
As always, use your powers for good and not evil!
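
To see why a captured Cookie: header cannot be turned back into a fully specified stored cookie, it may help to parse one. The sketch below assumes Python 3 (where the Cookie module is named http.cookies) and a made-up header value; it shows that the header yields only names and values, with the path and domain attributes left empty.

```python
from http.cookies import SimpleCookie

# Hypothetical value of a Cookie: header as it might appear in a request.
captured = 'session=abc123; theme=dark'

parsed = SimpleCookie()
parsed.load(captured)

names = sorted(parsed.keys())
session_value = parsed['session'].value
# The request header never includes these rules, so they come back empty.
session_path = parsed['session']['path']
session_domain = parsed['session']['domain']

print(names, session_value, repr(session_path), repr(session_domain))
```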
Cross-Site Scripting Attacks
The earliest experiments with scripts that could run in web browsers revealed a problem: all of the HTTP
requests made by the browser were done with the authority of the user’s cookies, so pages could cause
quite a bit of trouble by attempting to, say, POST to the online web site of a popular bank asking that
money be transferred to the attacker’s account. Anyone who visited the problem site while logged on to
that particular bank in another window could lose money.
To address this, browsers imposed the restriction that scripts in languages like JavaScript can only
make connections back to the site that served the web page, and not to other web sites. This is called the
“same origin policy.”
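
The comparison at the heart of this policy can be sketched in a few lines. The version below assumes Python 3 and the usual browser definition of an origin as the combination of scheme, host, and port; it is a simplification for illustration rather than a description of how any particular browser implements the check, and the function names and URLs are hypothetical.

```python
from urllib.parse import urlsplit

# Ports implied by a scheme when the URL does not name one explicitly.
DEFAULT_PORTS = {'http': 80, 'https': 443}

def origin(url):
    """Reduce a URL to its (scheme, host, port) triple."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    port = parts.port or DEFAULT_PORTS.get(scheme)
    return (scheme, parts.hostname, port)

def same_origin(page_url, target_url):
    """Return True if a script on page_url shares an origin with target_url."""
    return origin(page_url) == origin(target_url)

bank_page = 'https://bank.example.com/account'
print(same_origin(bank_page, 'https://bank.example.com:443/transfer'))
print(same_origin('http://evil.example.net/page', 'https://bank.example.com/transfer'))
```

Under this check, a script on the bank's own page shares an origin with the bank's transfer URL, while a script served from another site does not.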