03.02.2014 Views

php|architect's Guide to Web Scraping with PHP - Wind Business ...



Rolling Your Own

?>

See Section 3.6.1 and Appendix 19.4.6 of RFC 2616 for more information on chunked transfer encoding.

Content Encoding

If the zlib extension is loaded (which can be checked using the extension_loaded function or by executing php -m from the command line), the client can optionally include an Accept-Encoding header with a value of gzip,deflate in its request. If the server supports content compression, it will include a Content-Encoding header in its response with a value indicating which of the two compression schemes it used on the response body before sending it.
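As a minimal sketch of the request side, the client might add the header only when zlib is available. The function name buildRequestHeaders and the extra Host and Connection headers are assumptions made for illustration; they do not come from the text above.

```php
<?php
// Hypothetical helper: returns the header lines for a hand-built request.
// Only the Accept-Encoding logic reflects the paragraph above.
function buildRequestHeaders($host)
{
    $headers = array(
        'Host: ' . $host,
        'Connection: close',
    );

    // Advertise compression support only if the response can be decoded later.
    if (extension_loaded('zlib')) {
        $headers[] = 'Accept-Encoding: gzip,deflate';
    }

    return $headers;
}
```

Calling buildRequestHeaders('example.com') yields a list that can be joined with "\r\n" and written to the socket after the request line.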

The purpose of this is to reduce the amount of data being sent, which lowers bandwidth consumption and increases throughput (assuming that compression and decompression take less time than data transfer, which is generally the case). Upon receiving the response, the client must decompress the body using the same scheme the server used to compress it.
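The decompression step could be sketched roughly as follows. The function name decodeBody and its two parameters are hypothetical: $body is assumed to hold the raw response body, and $encoding the value of the Content-Encoding header. The sketch uses gzinflate for gzip (after skipping a 10-byte header) and gzuncompress for deflate.

```php
<?php
// Hypothetical helper, not a definitive implementation.
// $body:     raw (still compressed) response body
// $encoding: value of the Content-Encoding response header, if any
function decodeBody($body, $encoding)
{
    switch (strtolower(trim($encoding))) {
        case 'gzip':
            // Skip the 10-byte GZIP header, then inflate the raw stream.
            return gzinflate(substr($body, 10));
        case 'deflate':
            // The deflate content coding is zlib-wrapped data,
            // which gzuncompress (not gzinflate) reads.
            return gzuncompress($body);
        default:
            // No recognized encoding: return the body unchanged.
            return $body;
    }
}
```

The fixed 10-byte offset assumes the server sent a GZIP header without optional fields. On PHP 5.4 and later, gzdecode parses the full header and may be a safer choice for the gzip case.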


• Yes, the function names are correct. One would think that gzinflate would be used to decode a body encoded using the deflate encoding scheme. Apparently this is just an oddity in the naming scheme used by the zlib library.

• When the encoding scheme is gzip, a GZIP header is included in the response. gzinflate does not respond well to this. Hence, the header (contained in the first 10 bytes of the body) is stripped before the body is passed to gzinflate.
