03.02.2014 Views

php|architect's Guide to Web Scraping with PHP - Wind Business ...


cURL Extension

• CURLOPT_RETURNTRANSFER is set to true in the curl_setopt_array call even though the return value of curl_exec isn’t captured. This is simply to prevent unwanted output.

• CURLINFO_HEADER_OUT is set to true in the curl_setopt_array call to indicate that the request should be retained because it will be extracted after the request is made.

• CURLINFO_HEADER_OUT is specified in the curl_getinfo call to limit its return value to a string containing the request that was made.
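The three points above can be sketched roughly as follows. This is a minimal illustration, not the book's own listing: the helper names requestCaptureOptions and fetchSentRequest are hypothetical, and any URL passed in is assumed to be reachable.

```php
<?php
// Hypothetical helper: returns the two options discussed in the list above.
function requestCaptureOptions()
{
    return [
        // Make curl_exec return the response instead of printing it.
        CURLOPT_RETURNTRANSFER => true,
        // Ask cURL to retain the outgoing request so it can be read afterwards.
        CURLINFO_HEADER_OUT => true,
    ];
}

// Hypothetical helper: performs a request and returns the request text cURL sent.
function fetchSentRequest($url)
{
    $ch = curl_init($url);
    curl_setopt_array($ch, requestCaptureOptions());
    curl_exec($ch); // Return value deliberately ignored, as in the text.
    // Passing CURLINFO_HEADER_OUT limits the result to the request string.
    return curl_getinfo($ch, CURLINFO_HEADER_OUT);
}
```

Calling fetchSentRequest with a URL and echoing the result should show the request line and headers that cURL sent. If the transfer fails, the value may be empty or false, so real code would want to check it.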

Cookies

Here is a quick list of pertinent points.

• After the first curl_exec call, cURL will have stored the value of the Set-Cookie response header returned by the server in the file referenced by '/path/to/file' on the local filesystem as per the CURLOPT_COOKIEJAR setting. This setting value will persist through the second curl_exec call.

• When the second curl_exec call takes place, the CURLOPT_COOKIEFILE setting will also point to '/path/to/file'. This will cause cURL to read the contents of
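
The cookie settings described in this list can be sketched as below. This is an illustrative reconstruction under stated assumptions, not the book's listing: the helper names cookieSessionOptions and fetchTwiceWithCookies are hypothetical, both options are set up front rather than between calls, and the cookie path is supplied by the caller.

```php
<?php
// Hypothetical helper: points both cookie options at the same file.
function cookieSessionOptions($cookiePath)
{
    return [
        CURLOPT_RETURNTRANSFER => true,
        // File that cURL writes received cookies to.
        CURLOPT_COOKIEJAR => $cookiePath,
        // File that cURL reads previously stored cookies from.
        CURLOPT_COOKIEFILE => $cookiePath,
    ];
}

// Hypothetical helper: makes two requests on one handle so that cookies
// set by the first response can be sent with the second request.
function fetchTwiceWithCookies($firstUrl, $secondUrl, $cookiePath)
{
    $ch = curl_init();
    curl_setopt_array($ch, cookieSessionOptions($cookiePath));

    curl_setopt($ch, CURLOPT_URL, $firstUrl);
    $first = curl_exec($ch);

    // The cookie settings persist on the handle; only the URL changes.
    curl_setopt($ch, CURLOPT_URL, $secondUrl);
    $second = curl_exec($ch);

    return [$first, $second];
}
```

One caveat worth keeping in mind: the libcurl documentation describes the cookie jar file as being written when the handle is closed, and cookies are held in memory on the handle between calls. Code that needs the file on disk for a separate handle or a later script run should therefore release the first handle before relying on the file.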
