
php|architect's Guide to Web Scraping with PHP - Wind Business ...


Legality of Web Scraping

• “Amazon grants you a limited license to access and make personal use of this site ... This license does not include ... any use of data mining, robots, or similar data gathering and extraction tools.” – Amazon Conditions of Use, LICENSE AND SITE ACCESS section as of 2/14/10

• “You agree that you will not use any robot, spider, scraper or other automated means to access the Sites for any purpose without our express written permission.” – eBay User Agreement, Access and Interference section as of 2/14/10

• “... you agree not to: ... access, monitor or copy any content or information of this Website using any robot, spider, scraper or other automated means or any manual process for any purpose without our express written permission; ...” – Expedia, Inc. Web Site Terms, Conditions, and Notices, PROHIBITED ACTIVITIES section as of 2/14/10

• “The foregoing licenses do not include any rights to: ... use any robot, spider, data miner, scraper or other automated means to access the Barnes & Noble.com Site or its systems, the Content or any portion or derivative thereof for any purpose; ...” – Barnes & Noble Terms of Use, Section I LICENSES AND RESTRICTIONS as of 2/14/10

Determining whether or not the web site in question has a TOS document will be the first step. If you find one, look for clauses using language similar to that of the above examples. Also, look for any broad “blanket” clauses of prohibited activities under which web scraping may fall.

If you find a TOS document and it does not expressly forbid web scraping, the next step is to contact representatives who have authority to speak on behalf of the organization that owns the web site. Some organizations may allow web scraping provided that you secure permission from the appropriate authorities beforehand. When obtaining this permission, it is best to obtain a document in writing and on official letterhead that clearly indicates that it originated from the organization in question. This has the greatest chance of mitigating any legal issues that may arise.

If intellectual property-related allegations are brought against an individual as a result of usage of an automated agent or information acquired by one, assuming the individual did not violate any TOS agreement imposed by its owner or related computer use laws, a court decision will likely boil down to whether or not the usage
