php|architect's Guide to Web Scraping with PHP - Wind Business ...
PEAR::HTTP_Client

Using the Client

HTTP_Client persists explicit sets of headers and request parameters across multiple
requests. These are set using the setDefaultHeader and setRequestParameter methods
respectively, and the client constructor also accepts arrays of each. Default headers
and request parameters can be cleared by calling reset on the client instance.
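
As a minimal sketch of the calls just described, the following assumes that the PEAR HTTP_Client package is installed, is on the include path, and is running on a PHP version it supports. The user agent string and the timeout values are illustrative assumptions, not values taken from the package.

```php
<?php
// Assumption: PEAR's HTTP_Client package is installed and on the include path.
require_once 'HTTP/Client.php';

// The constructor accepts an array of request parameters, then an array of
// default headers. Both values here are made up for illustration.
$client = new HTTP_Client(
    array('timeout' => 10),
    array('User-Agent' => 'ExampleScraper/1.0')
);

// Either can also be set after construction; they persist across requests.
$client->setDefaultHeader('Accept-Language', 'en');
$client->setRequestParameter('timeout', 5);

// Clear the default headers and request parameters set above.
$client->reset();
```

Passing the arrays to the constructor and calling the setter methods are interchangeable; which to use is mostly a matter of where the values become known.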

Internally, the client class creates a new instance of HTTP_Request for each request.
The request method is determined by which of the client instance methods is called;
get, head, and post are supported.
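
The three supported operations might be invoked as in the sketch below. The URLs and form fields are hypothetical placeholders, and the example assumes the PEAR HTTP_Client package is installed and that network access is available.

```php
<?php
// Assumption: PEAR's HTTP_Client package is installed and on the include path.
require_once 'HTTP/Client.php';

$client = new HTTP_Client();

// Hypothetical URLs and data, for illustration only.
// GET: an array passed as the second argument is sent as the query string.
$client->get('http://example.com/search', array('q' => 'php'));

// HEAD: fetches the response headers without a body.
$client->head('http://example.com/');

// POST: the second argument holds the form data to submit.
$client->post('http://example.com/login', array('user' => 'demo', 'pass' => 'secret'));
```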

The capabilities of the client described up to this point can all be accomplished
by reusing the same request instance for multiple requests. However, the client also
handles two things that the request class does not: cookies and redirects.

By default, cookies are persisted automatically across requests without any additional
configuration. HTTP_Client_CookieManager is used internally for this. For custom
cookie handling, this class can be extended and an instance of the subclass passed as
the third parameter to the client constructor. If this is done, that instance is used
instead of a default instance of the native cookie manager class.
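
A custom cookie manager could be wired in roughly as follows. LoggingCookieManager is a hypothetical subclass invented for this sketch, and the choice of updateCookies as the method to override is an assumption about which hook suits logging; check the installed package source before relying on its signature.

```php
<?php
// Assumption: PEAR's HTTP_Client package is installed and on the include path.
require_once 'HTTP/Client.php';
require_once 'HTTP/Client/CookieManager.php';

// Hypothetical subclass: logs a message whenever cookies are read from a
// response, then defers to the native behavior.
class LoggingCookieManager extends HTTP_Client_CookieManager
{
    function updateCookies(&$request)
    {
        error_log('Updating cookies from a response');
        parent::updateCookies($request);
    }
}

// The cookie manager is the third constructor parameter; null is passed for
// the request parameters and default headers to leave them unset.
$cookieManager = new LoggingCookieManager();
$client = new HTTP_Client(null, null, $cookieManager);
```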

The maximum number of redirects to process can be set using the setMaxRedirects
method of the client class. Internally, requests are created and sent as needed to
process redirects until a non-redirecting response is received or the maximum
redirect limit is reached. In the former case, the client method being called returns
an integer containing the response code, rather than true as the request class
does. In the latter case, the client method returns an error instance. Note that
the client class processes redirects contained in meta tags of HTML documents in
addition to those performed at the HTTP level.
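
The two outcomes described above can be told apart by checking the return value, as in this sketch. The URL and the limit of 3 are illustrative assumptions, and the example assumes the PEAR HTTP_Client package is installed and that network access is available.

```php
<?php
// Assumption: PEAR and its HTTP_Client package are installed and on the
// include path.
require_once 'PEAR.php';
require_once 'HTTP/Client.php';

$client = new HTTP_Client();

// Follow at most three redirects (an arbitrary limit for illustration).
$client->setMaxRedirects(3);

// Hypothetical URL, for illustration only.
$result = $client->get('http://example.com/');

if (PEAR::isError($result)) {
    // An error instance is returned, for example when the limit is exceeded.
    echo 'Request failed: ' . $result->getMessage() . "\n";
} else {
    // Otherwise the integer response code of the final response is returned.
    echo 'Final response code: ' . $result . "\n";
}
```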

To retrieve information for the last response received, use the currentResponse
method of the client instance. It returns an associative array containing the keys
'code', 'headers', and 'body', with values corresponding to the return values of the
request methods getResponseCode, getResponseHeader, and getResponseBody respectively.
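
Reading the last response might look like the sketch below. The URL is a hypothetical placeholder, and the example assumes the PEAR HTTP_Client package is installed and that network access is available.

```php
<?php
// Assumption: PEAR and its HTTP_Client package are installed and on the
// include path.
require_once 'PEAR.php';
require_once 'HTTP/Client.php';

$client = new HTTP_Client();

// Hypothetical URL, for illustration only.
$result = $client->get('http://example.com/');

if (!PEAR::isError($result)) {
    // Associative array with the keys 'code', 'headers', and 'body'.
    $response = $client->currentResponse();

    echo 'Status: ' . $response['code'] . "\n";
    print_r($response['headers']);
    echo 'Body length: ' . strlen($response['body']) . " bytes\n";
}
```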

By default, all responses are stored and can be accessed individually as shown below.
To disable storage of all responses except the last one, call enableHistory on the
client instance and pass it false.