03.02.2014 Views

php|architect's Guide to Web Scraping with PHP - Wind Business ...

HTTP Streams Wrapper

code (though it is more noticeable in a high-load environment). On the negative side, you have to either know C or depend on the community to deliver patches in a timely fashion for any issues that may arise. This also applies to extensions written in C that will be covered in subsequent sections.

The streams wrapper is part of the PHP core and as such has no installation requirements beyond that of PHP itself.

Simple Request and Response Handling

Here’s a simple example of the HTTP streams wrapper in action.
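A minimal sketch of such a request, reconstructed from the notes that follow rather than copied from the original listing, might look like this. The helper name `fetchWithImplicitHeaders` and the failure handling are assumptions added for illustration; only `file_get_contents`, `$http_response_header`, and the URL come from the text.

```php
<?php
// Hypothetical helper (not from the text): wraps the call so the sketch
// degrades gracefully when the placeholder host cannot be reached.
function fetchWithImplicitHeaders(string $url): array
{
    // Equivalent to a GET request; requires allow_url_fopen to be enabled.
    // The @ suppresses the warning PHP emits when the request fails.
    $response = @file_get_contents($url);

    // PHP populates $http_response_header in this function's scope only.
    // It stays undefined if no HTTP response was received at all.
    $headers = $http_response_header ?? [];

    return [
        'body'    => $response === false ? null : $response,
        'headers' => $headers,
    ];
}

$result = fetchWithImplicitHeaders('http://localhost.example');
var_dump($result['body']);
print_r($result['headers']);
```

Against a reachable server, `body` holds the response body and `headers` holds the status line followed by the header lines. Against an unreachable host, `body` is `null` and `headers` is empty.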

There are a few things to note.

• The allow_url_fopen PHP configuration setting must be enabled for this to work, which it is in most environments.

• In this example, the file_get_contents function call is equivalent to making a GET request for the specified URL 'http://localhost.example'.

• $response will contain the response body after the call to the file_get_contents function completes.

• $http_response_header is implicitly populated with the HTTP response status line and headers after the file_get_contents call because it uses the HTTP streams wrapper within the current scope.
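Because $http_response_header is a flat list of raw lines (status line first, then one entry per header), it usually needs a little processing before it is useful. The sketch below shows one way to split it into a status code and a header map. The function name `parseResponseHeaders` and the sample values are assumptions made for illustration and do not come from the text.

```php
<?php
// Hypothetical helper (not from the text): splits a list of raw lines, in the
// shape PHP stores in $http_response_header, into a status code and headers.
function parseResponseHeaders(array $lines): array
{
    $parsed = ['status' => null, 'headers' => []];

    foreach ($lines as $line) {
        // A status line such as "HTTP/1.1 200 OK" starts a new response.
        // After a redirect the list holds one status line per response,
        // so only the headers of the final response are kept.
        if (preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})#', $line, $m)) {
            $parsed = ['status' => (int) $m[1], 'headers' => []];
            continue;
        }

        // Header lines have the form "Name: value".
        $parts = explode(':', $line, 2);
        if (count($parts) === 2) {
            // Header names are case-insensitive, so normalize to lowercase.
            // A repeated header name overwrites the earlier value here.
            $parsed['headers'][strtolower(trim($parts[0]))] = trim($parts[1]);
        }
    }

    return $parsed;
}

// Made-up sample data in the same shape as $http_response_header.
$sample = [
    'HTTP/1.1 200 OK',
    'Content-Type: text/html; charset=UTF-8',
    'Content-Length: 1234',
];

$parsed = parseResponseHeaders($sample);
echo $parsed['status'], "\n";                  // prints 200
echo $parsed['headers']['content-type'], "\n"; // prints text/html; charset=UTF-8
```

Keeping the parsing in a separate function means it can be exercised with canned data, without making a request.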

While this example does work, it violates a core principle of good coding practices: no unexpected side effects. The origin of $http_response_header is not entirely obvious because PHP populates it implicitly. Additionally, it’s more restrictive because the variable is only populated within the scope containing the call to file_get_contents. Here’s a better way to get access to the same data from the response headers.
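One way to do this, offered as a sketch under stated assumptions and not necessarily the listing the text goes on to show, is to open the URL as a stream with `fopen` and read the headers from the `wrapper_data` entry returned by `stream_get_meta_data`. The helper name `fetchWithExplicitHeaders` and the error handling are hypothetical additions.

```php
<?php
// Hypothetical helper (not from the text): returns the body and the raw
// header lines explicitly instead of relying on $http_response_header.
function fetchWithExplicitHeaders(string $url): ?array
{
    // The @ suppresses the warning raised when the stream cannot be opened.
    $handle = @fopen($url, 'r');
    if ($handle === false) {
        return null;
    }

    $body = stream_get_contents($handle);

    // For HTTP streams, 'wrapper_data' holds the status line and headers.
    $meta = stream_get_meta_data($handle);
    fclose($handle);

    return [
        'body'    => $body === false ? null : $body,
        'headers' => $meta['wrapper_data'] ?? [],
    ];
}

$result = fetchWithExplicitHeaders('http://localhost.example');
if ($result === null) {
    echo "Request failed\n";
} else {
    print_r($result['headers']);
}
```

Here the headers are tied to a specific stream handle and returned as ordinary data, so they can be passed to any scope without depending on a variable that PHP creates implicitly.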
