php|architect's Guide to Web Scraping with PHP - Wind Business ...
pecl_http PECL Extension
associative array of cookie name-value pairs for the 'cookies' request option. If
your cookie values are already encoded, set the 'encodecookies' request option
to false.
• Also like cURL, pecl_http includes an option to use a file for storing cookie
data. Unlike cURL, pecl_http always uses the same data source for both read
and write operations. That is, it consolidates the CURLOPT_COOKIEFILE and
CURLOPT_COOKIEJAR options into the 'cookiestore' request option.
• Because the procedural API lacks the persistent scope that is a defining characteristic
of the object-oriented API, extracting cookie values for uses beyond
storage and persistence is somewhat involved. http_parse_message is used to
parse the headers and body from a string containing an HTTP response message
into an object for easier access. http_parse_cookie is then applied to
Set-Cookie header values to parse the cookie data from them.
• In HttpRequest, the enableCookies method explicitly sets CURLOPT_COOKIEFILE to
an empty string so that cookie data is persisted in memory. setCookies accepts
an associative array of cookie name-value pairs just like the 'cookies' request
option. addCookies does the same thing, but merges the array contents into
any existing cookie data rather than deleting the latter as setCookies does.
• Once the send method is called on $request, cookie data from the response is
retrieved by calling the getResponseCookies method.
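
The cookie handling described in the points above can be sketched roughly as follows. This is a minimal illustration and not a listing from the book: the helper cookie_request_options, the placeholder URL, and the cookie names and values are all hypothetical, and the snippet assumes the pecl_http 1.x API (http_get, http_parse_message, http_parse_cookie, HttpRequest). The calls that need the extension are guarded so the file still loads where it is not installed.

```php
<?php
// Hypothetical helper, not part of pecl_http: collects the cookie-related
// request options discussed above into one array.
function cookie_request_options(array $cookies, $storePath, $preEncoded = false)
{
    return array(
        'cookies'       => $cookies,      // cookie name => value pairs
        'encodecookies' => !$preEncoded,  // false when values are already encoded
        'cookiestore'   => $storePath,    // single file used for reading and writing
    );
}

$url     = 'http://localhost.example/';   // placeholder URL (assumption)
$options = cookie_request_options(
    array('session' => 'abc123'),
    sys_get_temp_dir() . '/cookies.txt'
);

// Procedural API: runs only when pecl_http 1.x is loaded.
if (function_exists('http_get')) {
    $message = http_parse_message(http_get($url, $options));
    // Assumption: Set-Cookie may hold a single string or a list of strings.
    $setCookie = isset($message->headers['Set-Cookie'])
        ? (array) $message->headers['Set-Cookie']
        : array();
    foreach ($setCookie as $headerValue) {
        $parsed = http_parse_cookie($headerValue);
        print_r($parsed->cookies);
    }
}

// Object-oriented API: same guard.
if (class_exists('HttpRequest')) {
    $request = new HttpRequest($url, HTTP_METH_GET);
    $request->enableCookies();                           // keep cookies in memory
    $request->setCookies(array('session' => 'abc123'));  // replaces existing cookies
    $request->addCookies(array('lang' => 'en'));         // merges into existing cookies
    try {
        $request->send();
        foreach ($request->getResponseCookies() as $cookie) {
            print_r($cookie->cookies);
        }
    } catch (HttpException $e) {
        echo 'Request failed: ', $e->getMessage(), PHP_EOL;
    }
}
```

Building the options array in one place keeps the procedural call short. Passing true as the third argument of the helper corresponds to the already-encoded case, where 'encodecookies' is set to false.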
HTTP Authentication
The 'httpauth' request option is used to set credentials in the format
'username:password'. The type of HTTP authentication to use is specified via
the 'httpauthtype' request option using one of the pecl_http HTTP_AUTH_* constants,
which are similar to those intended for the same purpose in the cURL extension.
Lastly, the 'unrestrictedauth' request option can be set to true if authentication
credentials should be included in requests resulting from redirections pointing to a
different host from the current one.
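
As a rough sketch of how these three options fit together, the snippet below builds an options array and passes it to http_get. The helper auth_request_options, the credentials, and the URL are hypothetical and used only for illustration. HTTP_AUTH_BASIC is assumed to be available from pecl_http 1.x, and a null placeholder is substituted when the extension is not loaded.

```php
<?php
// Hypothetical helper, not part of pecl_http: assembles the authentication
// request options described above.
function auth_request_options($username, $password, $authType, $sendAcrossHosts = false)
{
    return array(
        'httpauth'         => $username . ':' . $password, // 'username:password'
        'httpauthtype'     => $authType,                   // one of the HTTP_AUTH_* constants
        'unrestrictedauth' => $sendAcrossHosts,            // true: keep credentials on cross-host redirects
    );
}

// Use the real constant when pecl_http 1.x is loaded; otherwise fall back to
// a null placeholder so the sketch still loads.
$authType = defined('HTTP_AUTH_BASIC') ? HTTP_AUTH_BASIC : null;
$options  = auth_request_options('username', 'password', $authType);

// Runs only when the extension is available; the URL is a placeholder.
if (function_exists('http_get')) {
    $response = http_get('http://localhost.example/protected', $options);
    echo $response;
}
```

Leaving 'unrestrictedauth' at false is the more cautious default in this sketch, since it avoids handing credentials to a host that a redirect points to.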