php|architect's Guide to Web Scraping with PHP - Wind Business ...
XMLReader Extension<br />
Loading a Document<br />
The focal class of the XMLReader extension is aptly named XMLReader. It doesn’t declare<br />
a constructor, but rather offers two methods for introducing XML data into it: open(),<br />
which reads from a file path or URL, and XML(), which reads from a string.<br />
<br />
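As a minimal sketch of the two loading methods, the following assumes a small sample document and a temporary file standing in for a real source; the element names and variable names are hypothetical and chosen only for illustration.

```php
<?php
// Hypothetical sample document used only for illustration.
$xml = '<books><book>Web Scraping</book></books>';

// XML() loads a document from a string already in memory.
$reader = new XMLReader();
$reader->XML($xml);
$reader->read();                 // advance to the first node, the root element
$rootFromString = $reader->name;
$reader->close();

// open() loads a document from a file path or URL.
// A temporary file stands in for a real source here.
$path = tempnam(sys_get_temp_dir(), 'xml');
file_put_contents($path, $xml);

$reader = new XMLReader();
$reader->open($path);
$reader->read();
$rootFromFile = $reader->name;
$reader->close();
unlink($path);
```

Either way, nothing is parsed until read() is called, which is what lets the reader work through large documents incrementally.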
Both of these methods have two additional parameters.<br />
The second parameter is a string specifying the encoding scheme for the input<br />
document. It is optional and defaults to 'UTF-8' if unspecified or specified<br />
as null. Valid values for this parameter aren’t included in the PHP manual,<br />
but can be found in the reference for the underlying libxml2 library at<br />
http://www.xmlsoft.org/encoding.html#Default.<br />
The third parameter is an integer value that can be set in bitmask fashion<br />
using constants from the libxml extension. This is the preferred method<br />
to configure the parser over using the deprecated setParserProperty() method.<br />
The specific constants that can be used to form the bitmask (using the bitwise<br />
OR operator |) are listed below. Descriptions for them can be found at<br />
http://php.net/manual/en/libxml.constants.php.<br />
• LIBXML_COMPACT<br />
• LIBXML_DTDATTR<br />
• LIBXML_DTDLOAD<br />
• LIBXML_DTDVALID<br />
• LIBXML_NOBLANKS<br />
• LIBXML_NOCDATA<br />
• LIBXML_NOENT
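
The encoding and options parameters described above might be used as in the following sketch. The sample document, the choice of LIBXML_NOBLANKS and LIBXML_NOCDATA, and the variable names are assumptions made for illustration, not a recommended configuration.

```php
<?php
// Hypothetical sample document with whitespace-only text between elements.
$xml = "<books>\n  <book>Web Scraping</book>\n</books>";

// Combine libxml constants with the bitwise OR operator to form the bitmask.
$options = LIBXML_NOBLANKS | LIBXML_NOCDATA;

// Second parameter: encoding. Third parameter: parser options bitmask.
$reader = new XMLReader();
$reader->XML($xml, 'UTF-8', $options);

// Collect the names of all element nodes encountered while reading.
$names = [];
while ($reader->read()) {
    if ($reader->nodeType === XMLReader::ELEMENT) {
        $names[] = $reader->name;
    }
}
$reader->close();
```

The same two parameters can be passed to open() after the file path or URL.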