php|architect's Guide to Web Scraping with PHP - Wind Business ...
28 HTTP Streams Wrapper
code (though it is more noticeable in a high-load environment). On the negative side, you have to either know C or depend on the community to deliver patches in a timely fashion for any issues that may arise. This also applies to extensions written in C that will be covered in subsequent sections.
The streams wrapper is part of the PHP core and as such has no installation requirements beyond that of PHP itself.
Simple Request and Response Handling
Here’s a simple example of the HTTP streams wrapper in action.
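A minimal sketch of such a request, assuming the placeholder URL `http://localhost.example`. The failure check and the `@` operator are additions for illustration, so that the script still runs when that host does not resolve.

```php
<?php
// Assumption: placeholder host; point this at a server you control
// to see a real response.
$url = 'http://localhost.example';

// Equivalent to a GET request. Returns the response body as a string,
// or false on failure. The @ suppresses the warning PHP would
// otherwise emit when the request fails.
$response = @file_get_contents($url);

if ($response === false) {
    echo "Request failed\n";
} else {
    // Populated implicitly by the HTTP streams wrapper in this scope:
    // the status line followed by one string per response header.
    print_r($http_response_header);
    echo $response;
}
```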
There are a few things to note.
• The allow_url_fopen PHP configuration setting must be enabled for this to work, which it is in most environments.
• In this example, the file_get_contents function call is equivalent to making a GET request for the specified URL 'http://localhost.example'.
• $response will contain the response body after the call to the file_get_contents function completes.
• $http_response_header is implicitly populated with the HTTP response status line and headers after the file_get_contents call because it uses the HTTP streams wrapper within the current scope.
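Because the status line and headers arrive as a flat array of strings, it is often convenient to split them before use. The following is a rough sketch only: `parse_response_header` is a hypothetical helper, not part of PHP, and the sample array uses invented values standing in for a real `$http_response_header`.

```php
<?php
/**
 * Hypothetical helper: split an array shaped like $http_response_header
 * into a status code and a map of lowercased header names to values.
 */
function parse_response_header(array $lines): array
{
    $status = null;
    $headers = [];
    foreach ($lines as $line) {
        // A status line looks like "HTTP/1.1 200 OK". When redirects are
        // followed the array can hold more than one, so the last one wins.
        if (preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})#', $line, $m)) {
            $status = (int) $m[1];
            $headers = []; // discard headers from the earlier response
            continue;
        }
        // Header lines look like "Name: value". A repeated header name
        // overwrites the earlier value in this simple version.
        $parts = explode(':', $line, 2);
        if (count($parts) === 2) {
            $headers[strtolower(trim($parts[0]))] = trim($parts[1]);
        }
    }
    return ['status' => $status, 'headers' => $headers];
}

// Invented sample data in the shape of $http_response_header.
$sample = [
    'HTTP/1.1 200 OK',
    'Content-Type: text/html; charset=UTF-8',
    'Content-Length: 1234',
];
$parsed = parse_response_header($sample);
echo $parsed['status'], "\n";                  // 200
echo $parsed['headers']['content-type'], "\n"; // text/html; charset=UTF-8
```

Lowercasing the names is a design choice made here because HTTP header names are case-insensitive. Code that needs every value of a repeated header such as Set-Cookie would have to collect values into arrays instead.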
While this example does work, it violates a core principle of good coding practices: no unexpected side effects. The origin of $http_response_header is not entirely obvious because PHP populates it implicitly. Additionally, it’s more restrictive because the variable is only populated within the scope containing the call to file_get_contents. Here’s a better way to get access to the same data from the response headers.