php|architect's Guide to Web Scraping with PHP - Wind Business ...

Tidy Extension

$tidy = new tidy;
$tidy->parseString($string, $config);
$tidy->parseFile($filename, $config);
?>

Configuration

Like the cURL extension, the tidy extension operates largely on the concept of configuration; hence, $config parameters are present in all calls in the above example. Unlike most other extensions, this parameter can actually be one of two things: an associative array of setting-value pairs or the path to an external configuration file.
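The array form of $config can be sketched as follows. This is a minimal illustration, assuming the tidy extension is loaded; the sample markup and the two setting values are assumptions chosen for the example, not values taken from the text above.

```php
<?php
// Illustrative input: deliberately malformed markup for tidy to repair.
$string = '<p>Unclosed paragraph<ul><li>Item';

// $config as an associative array of setting-value pairs.
$config = array(
    'indent' => false,
    'wrap'   => 78,
);

$tidy = new tidy;
$tidy->parseString($string, $config);
$tidy->cleanRepair();

// Print the repaired document.
echo tidy_get_output($tidy);
```

Passing a string in place of the array would instead be treated as the path to a configuration file.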

The configuration file format is somewhat similar to individual style settings in a CSS stylesheet. An example is shown below. It's unlikely that a non-developer would need to edit the configuration settings without also touching the PHP source code that uses tidy. As such, moving settings into an external configuration file is really only useful for keeping them from cluttering the source code. Additionally, because the configuration file is read from disk, it may pose performance concerns under heavy use.

// single-line comment
/* multi-line comment */
indent: false /* setting: value */
wrap: 78
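A file like the one above can be handed to tidy by passing its path as $config. The sketch below is an assumption-laden illustration: it writes the two settings to a temporary file (a hypothetical location picked only so the example is self-contained) and then checks that tidy picked up the wrap value.

```php
<?php
// Write the settings to a temporary file; in practice this file would
// already exist alongside the application.
$configPath = tempnam(sys_get_temp_dir(), 'tidy');
file_put_contents($configPath, "indent: false\nwrap: 78\n");

// $config as a path to an external configuration file.
$tidy = new tidy;
$tidy->parseString('<p>Hello<p>World', $configPath);

// Read back one setting to confirm the file was applied.
echo $tidy->getOpt('wrap'), PHP_EOL;

unlink($configPath);
```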

When using the object-oriented API, an alternative to using configuration files is subclassing the tidy class and overriding its parseString and parseFile methods to automatically include specific configuration setting values. This approach allows for easy reuse of tidy configurations.
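One way the subclassing approach might look is sketched below. It assumes PHP 8.0 or later (the method signatures mirror those documented for that version) with the tidy extension loaded. The class name ConfiguredTidy, the DEFAULTS constant, and the withDefaults helper are hypothetical names introduced for illustration.

```php
<?php
// Hypothetical subclass: applies a fixed set of settings to every parse call.
class ConfiguredTidy extends tidy
{
    // Illustrative defaults; callers can override individual settings.
    protected const DEFAULTS = [
        'indent' => false,
        'wrap'   => 78,
    ];

    public function parseString(string $string, array|string|null $config = null, ?string $encoding = null): bool
    {
        return parent::parseString($string, $this->withDefaults($config), $encoding);
    }

    public function parseFile(string $filename, array|string|null $config = null, ?string $encoding = null, bool $useIncludePath = false): bool
    {
        return parent::parseFile($filename, $this->withDefaults($config), $encoding, $useIncludePath);
    }

    // Merge caller settings over the defaults. A string is a path to a
    // configuration file and is passed through unchanged.
    protected function withDefaults(array|string|null $config): array|string
    {
        if (is_string($config)) {
            return $config;
        }
        return array_merge(static::DEFAULTS, $config ?? []);
    }
}

// Usage: no $config argument is needed at the call site.
$tidy = new ConfiguredTidy();
$tidy->parseString('<p>Hello<p>World');
echo $tidy->getOpt('wrap'), PHP_EOL;
```

Keeping the defaults in a class constant means a further subclass can redefine DEFAULTS to get a different reusable configuration without touching the overridden methods.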
