03.02.2014 Views

php|architect's Guide to Web Scraping with PHP - Wind Business ...



XMLReader Extension<br />

Loading a Document<br />

The focal class of the XMLReader extension is aptly named XMLReader. It doesn’t declare<br />

a constructor, but rather offers two methods for introducing XML data into it.<br />

<br />
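In PHP's XMLReader class, the two loading methods are open(), which takes a URI, and XML(), which takes a string of markup. The following is a minimal sketch of both; the sample document and the temporary file are assumptions made up for illustration, not material from the original text.

```php
<?php
// Hypothetical sample document used only for illustration.
$xml = '<?xml version="1.0" encoding="UTF-8"?><books><book>Scraping</book></books>';

// XML() loads markup that is already held in a string.
$reader = new XMLReader();
$reader->XML($xml);
$reader->read();               // advance to the first node
$fromString = $reader->name;   // name of the root element
$reader->close();

// open() loads markup from a URI; a temporary file stands in for a real source.
$path = tempnam(sys_get_temp_dir(), 'xr');
file_put_contents($path, $xml);

$reader = new XMLReader();
$reader->open($path);
$reader->read();
$fromFile = $reader->name;
$reader->close();
unlink($path);
```

In both cases the reader is positioned before the first node after loading, so read() must be called before any node properties are inspected.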

Both of these methods have two additional parameters.<br />

The second parameter is a string specifying the encoding scheme for the input<br />

document. It is optional and defaults to ’UTF-8’ if unspecified or specified<br />

as null. Valid values for this parameter aren’t included in the PHP manual,<br />

but can be found in the reference for the underlying libxml2 library at<br />

http://www.xmlsoft.org/encoding.html#Default.<br />

The third parameter is an integer value that can be set in bitmask fashion<br />

using constants from the libxml extension. This is the preferred method<br />

to configure the parser over using the deprecated setParserProperty() method.<br />

The specific constants that can be used to form the bitmask (using the bitwise<br />

OR operator |) are listed below. Descriptions for them can be found at<br />

http://php.net/manual/en/libxml.constants.php.<br />

• LIBXML_COMPACT<br />

• LIBXML_DTDATTR<br />

• LIBXML_DTDLOAD<br />

• LIBXML_DTDVALID<br />

• LIBXML_NOBLANKS<br />

• LIBXML_NOCDATA<br />

• LIBXML_NOENT
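As a rough sketch of how constants like these might be combined and passed as the third parameter, the example below builds a bitmask from two of them and hands it to XML(). The sample document and variable names are assumptions for illustration only.

```php
<?php
// Hypothetical document with ignorable whitespace and a CDATA section.
$xml = "<root>\n  <item><![CDATA[a & b]]></item>\n</root>";

// Combine libxml constants with the bitwise OR operator to form the bitmask.
// LIBXML_NOBLANKS asks the parser to remove blank nodes;
// LIBXML_NOCDATA asks it to merge CDATA sections as text nodes.
$options = LIBXML_NOBLANKS | LIBXML_NOCDATA;

$reader = new XMLReader();
// Second parameter: the encoding; third parameter: the option bitmask.
$reader->XML($xml, 'UTF-8', $options);

// Collect the names of the element nodes encountered while reading.
$names = [];
while ($reader->read()) {
    if ($reader->nodeType === XMLReader::ELEMENT) {
        $names[] = $reader->name;
    }
}
$reader->close();
```

The same bitmask can be passed as the third argument to open() when the document is loaded from a URI instead of a string.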
