php|architect's Guide to Web Scraping with PHP - Wind Business ...
set, it will persist for the duration of the client session. For normal web browsers, this
session generally ends when all instances of the browser application have been closed.

Redirection

The Location header is used by the server to redirect the client to a URI. In this
scenario, the response will most likely include a 3xx class status code (such as 302
Found), but may also include a 201 code to indicate the creation of a new resource.
See subsection 14.30 of RFC 2616 for more information.

It is hypothetically possible for a malfunctioning application to cause the server to
initiate an infinite series of redirections between itself and the client. For this reason,
client libraries often implement a limit on the number of consecutive redirections they
will process before assuming that the application being accessed is behaving inappropriately
and terminating. Libraries generally implement a default limit, but allow
you to override it with your own.
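The redirect limit described above can be sketched in a few lines of PHP. This is a minimal illustration under stated assumptions, not how any particular client library implements it: `followRedirects()` and its `$fetch` callable are hypothetical names, `$fetch` is assumed to return an array with `status` and `headers` keys, and the Location value is assumed to be an absolute URI.

```php
<?php
declare(strict_types=1);

/**
 * Follow Location headers, giving up after $maxRedirects consecutive redirects.
 *
 * $fetch is a hypothetical callable that takes a URL and returns
 * ['status' => int, 'headers' => array<string, string>].
 */
function followRedirects(callable $fetch, string $url, int $maxRedirects = 5): array
{
    for ($hops = 0; $hops <= $maxRedirects; $hops++) {
        $response = $fetch($url);
        $status = $response['status'];
        $location = $response['headers']['Location'] ?? null;

        // Only 3xx responses that carry a Location header are followed.
        if ($status < 300 || $status >= 400 || $location === null) {
            return ['url' => $url, 'response' => $response, 'redirects' => $hops];
        }

        // Assumption: Location holds an absolute URI.
        $url = $location;
    }

    // The limit was reached and the server is still redirecting.
    throw new RuntimeException("Gave up after {$maxRedirects} redirects");
}
```

When using PHP's cURL extension, the same behavior is available without writing the loop by hand: `CURLOPT_FOLLOWLOCATION` enables redirect handling and `CURLOPT_MAXREDIRS` overrides the limit.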

Referring URLs

It is possible for a requested resource to refer to other resources in some way. When
this happens, clients traditionally include the URL of the referring resource in the
Referer header. Yes, the header name is misspelled, and intentionally so. The
prevalence of that particular misspelling caused it to end up in the official HTTP
specification, thereby becoming the standard industry spelling used when referring
to that particular header.

There are multiple situations in which a referer is specified. A
user may click on a hyperlink in a browser, in which case the full URL of the resource
containing the hyperlink is the referer. When a resource containing markup
with embedded images is requested, subsequent requests for those images will contain
the full URL of the page containing the images as the referer. A referer is also
specified when redirection occurs, as described in the previous section.
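A scraper has to supply the referer itself if it wants to mimic the embedded-image case above. The following is a rough sketch using PHP's HTTP stream context options. The function name `refererContextOptions()` and the example.com URLs are made up for illustration, and the request itself is left commented out so that the example runs offline.

```php
<?php
declare(strict_types=1);

/**
 * Build HTTP stream context options that send a Referer header.
 * $referringUrl is the page that linked to (or embedded) the resource.
 */
function refererContextOptions(string $referringUrl): array
{
    return [
        'http' => [
            'method' => 'GET',
            // Note the header's official (misspelled) name.
            'header' => "Referer: {$referringUrl}\r\n",
        ],
    ];
}

// Hypothetical scenario: an image embedded in a gallery page.
$options = refererContextOptions('http://example.com/gallery.html');
$context = stream_context_create($options);

// The actual request is commented out so the sketch needs no network access:
// $image = file_get_contents('http://example.com/images/photo.jpg', false, $context);
```

With the cURL extension, the equivalent is the `CURLOPT_REFERER` option, and `CURLOPT_AUTOREFERER` asks cURL to set the header automatically when it follows a redirect.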

This is relevant because some applications depend on the value of the
Referer header by design, which is less than ideal for the simple fact that the header
value can be spoofed. In any case, it is important to be aware that some applications
may not function as expected if the provided header value is not consistent with