03.02.2014 Views

php|architect's Guide to Web Scraping with PHP - Wind Business ...



what is sent when the application is used in a browser. See subsection 14.36 of RFC 2616 for more information.

Persistent Connections

The standard operating procedure for an HTTP request is as follows.

• A client connects to a server.
• The client sends a request over the established connection.
• The server returns a response.
• The connection is terminated.
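The four steps above can be sketched with PHP's stream functions. This is a minimal illustration rather than a production client: `fetchOnce()` is a hypothetical helper name, the default port and timeout are assumptions, and the request is sent as HTTP 1.0 so that the server closes the connection once the response is complete.

```php
<?php
/**
 * Perform a single request using one connection per request.
 * fetchOnce() is a hypothetical name used only for this illustration.
 */
function fetchOnce(string $host, string $path = '/', int $port = 80, float $timeout = 5.0): string
{
    // Step 1: the client connects to the server.
    $socket = @fsockopen($host, $port, $errno, $errstr, $timeout);
    if ($socket === false) {
        throw new RuntimeException("Connection failed: $errstr ($errno)");
    }

    // Step 2: the client sends a request over the established connection.
    fwrite($socket, "GET $path HTTP/1.0\r\nHost: $host\r\n\r\n");

    // Step 3: the server returns a response; read until it closes its end.
    $response = stream_get_contents($socket);

    // Step 4: the connection is terminated.
    fclose($socket);

    return $response === false ? '' : $response;
}
```

Calling this function in a loop repeats steps 1 and 4 for every request, which is the overhead that persistent connections are meant to avoid.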

When sending multiple consecutive requests to the same server, however, the first and fourth steps in that process can cause a significant amount of overhead. HTTP 1.0 established no solution for this; one connection per request was normal behavior. Between the releases of the HTTP 1.0 and 1.1 standards, a convention was informally established that involved the client including a Connection header with a value of Keep-Alive in the request to indicate to the server that a persistent connection was desired.
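As a rough illustration of that convention, the sketch below assembles the raw text of an HTTP 1.0 request that asks for a persistent connection. `buildRequest()` and the host `example.com` are assumptions made for this example; the Host header is not required by HTTP 1.0 but is commonly sent anyway.

```php
<?php
/**
 * Assemble the raw text of a GET request.
 * buildRequest() is a hypothetical helper, not a library function.
 */
function buildRequest(string $path, string $host, string $version = '1.1', array $headers = []): string
{
    $lines = ["GET $path HTTP/$version", "Host: $host"];
    foreach ($headers as $name => $value) {
        $lines[] = "$name: $value";
    }

    // A blank line marks the end of the request headers.
    return implode("\r\n", $lines) . "\r\n\r\n";
}

// An HTTP 1.0 request that asks the server to keep the connection open.
$request = buildRequest('/', 'example.com', '1.0', ['Connection' => 'Keep-Alive']);
```

Whether the server honors the header is up to the server; a client should check the response before assuming the connection is still usable.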

Later, HTTP 1.1 was released and changed the default behavior from one connection per request to persistent connections. For a non-persistent connection, the client can include a Connection header with a value of close to indicate that the server should terminate the connection after it sends the response. The difference between 1.0 and 1.1 is an important distinction and should be a point of examination when evaluating both client libraries and servers hosting target applications, so that you are aware of how they will behave with respect to persistent connections. See subsection 8.1 of RFC 2616 for more information.
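The difference in defaults can be summarized in a small decision function. This is a simplified sketch of the rules described above, not a complete implementation of RFC 2616; `isPersistent()` is a hypothetical name, and the function assumes it is handed the protocol version and the raw value of the Connection header, if any.

```php
<?php
/**
 * Decide whether a connection should stay open after a response.
 * isPersistent() is a hypothetical helper for illustration only.
 */
function isPersistent(string $version, ?string $connection = null): bool
{
    // The Connection header holds a comma-separated, case-insensitive token list.
    $tokens = $connection === null
        ? []
        : array_map('trim', explode(',', strtolower($connection)));

    // An explicit "close" always ends the connection.
    if (in_array('close', $tokens, true)) {
        return false;
    }

    // HTTP 1.1 and later: persistent by default.
    if (version_compare($version, '1.1', '>=')) {
        return true;
    }

    // HTTP 1.0: persistent only when Keep-Alive was requested.
    return in_array('keep-alive', $tokens, true);
}
```

A check like this is useful when evaluating how a client library or target server is likely to behave, since the same absence of a Connection header means opposite things under 1.0 and 1.1.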

An alternative implementation, which involves the use of a Keep-Alive header, gained significantly less support in clients and servers. Technical issues with it are discussed in subsection 19.7.1 of RFC 2068, and explicit use of this header should be avoided. It is mentioned here simply to make you aware that it exists and is related to the matter of persistent connections.
