03.02.2014 Views

php|architect's Guide to Web Scraping with PHP - Wind Business ...


pecl_http PECL Extension

associative array of cookie name-value pairs for the 'cookies' request option. If your cookie values are already encoded, set the 'encodecookies' request option to false.

• Also like cURL, pecl_http includes an option to use a file for storing cookie data. Unlike cURL, pecl_http always uses the same data source for both read and write operations. That is, it consolidates the CURLOPT_COOKIEFILE and CURLOPT_COOKIEJAR options into the 'cookiestore' request option.

• Because the procedural API lacks the persistent scope that is a defining characteristic of the object-oriented API, extracting cookie values for uses beyond storage and persistence is somewhat involved. http_parse_message is used to parse the headers and body from a string containing an HTTP response message into an object for easier access. http_parse_cookie is then applied to Set-Cookie header values to parse the cookie data from them.

• In HttpRequest, the enableCookies method explicitly sets CURLOPT_COOKIEFILE to an empty string so that cookie data is persisted in memory. setCookies accepts an associative array of cookie name-value pairs, just like the 'cookies' request option. addCookies does the same thing, but merges the array contents into any existing cookie data rather than replacing that data as setCookies does.

• Once the send method is called on $request, cookie data from the response is retrieved by calling the getResponseCookies method.
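The cookie handling described in the list above can be sketched roughly as follows. This is a minimal illustration, not the book's own listing: it assumes the 1.x series of pecl_http is installed, and the helper names (cookie_request_options, fetch_cookies_procedural, fetch_cookies_oo) and the 'extra' cookie are hypothetical. Only the first helper runs without the extension; the other two are defined but must be called with a real URL.

```php
<?php
// Hypothetical helper: builds the request options discussed above.
function cookie_request_options(array $cookies, $storePath, $alreadyEncoded = false)
{
    $options = array(
        'cookies'     => $cookies,   // name => value pairs to send
        'cookiestore' => $storePath, // one file used for both reading and writing
    );
    if ($alreadyEncoded) {
        // Values are already encoded, so ask pecl_http not to encode them again.
        $options['encodecookies'] = false;
    }
    return $options;
}

// Procedural API: parse the raw response, then each Set-Cookie header.
// Requires pecl_http 1.x; not called automatically.
function fetch_cookies_procedural($url, array $options)
{
    $response = http_get($url, $options);
    $message  = http_parse_message($response);

    // Assumption: a single Set-Cookie header arrives as a string and
    // several as an array, so normalize to an array either way.
    $headers = isset($message->headers['Set-Cookie'])
        ? (array) $message->headers['Set-Cookie']
        : array();

    $cookies = array();
    foreach ($headers as $header) {
        $parsed = http_parse_cookie($header);
        if ($parsed) {
            foreach ($parsed->cookies as $name => $value) {
                $cookies[$name] = $value;
            }
        }
    }
    return $cookies;
}

// Object-oriented API: the request object keeps cookie state itself.
// Requires pecl_http 1.x; not called automatically.
function fetch_cookies_oo($url, array $cookies)
{
    $request = new HttpRequest($url, HttpRequest::METH_GET);
    $request->enableCookies();                       // keep cookie data in memory
    $request->setCookies($cookies);                  // replaces any existing cookies
    $request->addCookies(array('extra' => 'value')); // merges into existing cookies
    $request->send();
    return $request->getResponseCookies();
}
```

The procedural function needs two parsing steps because it only receives a response string, whereas the HttpRequest instance retains the response and exposes its cookies directly.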

HTTP Authentication

The 'httpauth' request option is used to set credentials in the format 'username:password'. The type of HTTP authentication to use is specified via the 'httpauthtype' request option using one of the pecl_http HTTP_AUTH_* constants, which are similar to those intended for the same purpose in the cURL extension. Lastly, the 'unrestrictedauth' request option can be set to true if authentication credentials should be included in requests resulting from redirections pointing to a different host from the current one.
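The three authentication options above might be combined along these lines. This is a sketch under stated assumptions rather than a definitive implementation: the helper names (auth_request_options, fetch_protected) are hypothetical, and the second function assumes the 1.x series of pecl_http is loaded so that http_get and HTTP_AUTH_BASIC are available.

```php
<?php
// Hypothetical helper: assembles the authentication request options.
function auth_request_options($username, $password, $authType = null, $unrestricted = false)
{
    // Credentials use the 'username:password' format.
    $options = array('httpauth' => $username . ':' . $password);

    if ($authType !== null) {
        // Expected to be one of the HTTP_AUTH_* constants.
        $options['httpauthtype'] = $authType;
    }
    if ($unrestricted) {
        // Also send credentials when a redirect points to a different host.
        $options['unrestrictedauth'] = true;
    }
    return $options;
}

// Requires pecl_http 1.x; not called automatically.
function fetch_protected($url, $username, $password)
{
    $options = auth_request_options($username, $password, HTTP_AUTH_BASIC);
    return http_get($url, $options);
}
```

Leaving 'unrestrictedauth' off unless it is needed keeps credentials from being forwarded to hosts that a redirect happens to name.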
