10.07.2015 Views

Beginning Web Development With Perl : From Novice to ... - Nabo



CHAPTER 9 ■ XML PARSING WITH PERL

On my Debian (Sarge) system, the output looks like this:

--> XML::SAX::PurePerl
--> XML::LibXML::SAX::Parser
--> XML::LibXML::SAX
--> XML::SAX::Expat

A parser is chosen through the XML::SAX::ParserFactory interface. In practice, however, programmers frequently leave it up to XML::SAX to decide which parser to use, with the default being decided by the order in which the parsers were installed. Though this may sound blatantly obvious, parsers implement functions to parse XML. The parser methods include parse_uri(), parse_file(), and so on.

It's important to realize the difference, from an XML::SAX standpoint, between a parser and a handler. A parser is usually software in the form of a Perl module that is installed with XML::SAX or can be installed from CPAN. A handler, on the other hand, is software that you write as part of the XML parsing programming task. The parser is created, or instantiated, by XML::SAX::ParserFactory and is passed an argument telling it which handler will be used. The handler then implements interfaces for the events handed to it by the parser. Note that a parser can be passed numerous arguments in addition to the name of the handler to use.

XML::SAX Parser Methods

As previously stated, a parser implements several methods for parsing XML. For most of the methods, you pass the XML as an argument, as well as other options for parsing. The parser methods are as follows:

• parse([options]): This is a generic method that accepts options as a list of name=>value pairs or as a hash.

• parse_uri(uri [, options]): This is a commonly used method to parse the XML denoted by the URI.

• parse_file(filestream [, options]): This method parses a filestream, such as an open filehandle. Do not confuse the argument with a plain file; the method expects a stream.

• parse_string(string [, options]): This method parses the XML contained in the string passed to it.

SAX2 Handler Interfaces

The handler that you create will need to implement code to handle events as they are passed in by the parser. XML::SAX provides access to events to ensure that the SAX2 specification is met. XML::SAX and related parsers also work with namespaces.

Logically, XML::SAX events and handlers can be grouped into categories. Many of the more common handlers fall into the category of content handlers. Content handlers work with the actual content of the document itself, so content handlers are where you'll spend a large amount of time coding. Another important category is the error handlers, which enable you to create custom error handling code. Other handlers include lexical handlers, which work with CDATA sections, comments, DTDs, and entities. The following sections look at content and error event handlers.
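
To tie the parser and handler roles together, here is a minimal sketch of a handler being passed to a parser obtained from XML::SAX::ParserFactory. The MyHandler package name, the sample XML string, and the elements and text hash keys are assumptions made for illustration; they do not come from the text above. The sketch leaves the choice of parser to XML::SAX, as described earlier, and implements only two content-handler events.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical handler class (the name is an assumption for this sketch).
# Inheriting from XML::SAX::Base supplies default behavior for every
# event that is not overridden here.
package MyHandler;
use base qw(XML::SAX::Base);

# Content-handler event: called once for each opening tag.
sub start_element {
    my ($self, $element) = @_;
    push @{ $self->{elements} }, $element->{Name};
}

# Content-handler event: called for character data, possibly in chunks.
sub characters {
    my ($self, $data) = @_;
    $self->{text} .= $data->{Data};
}

package main;
use XML::SAX::ParserFactory;

# The handler is code you write; the parser is whichever installed
# module XML::SAX selects by default.
my $handler = MyHandler->new;
my $parser  = XML::SAX::ParserFactory->parser(Handler => $handler);

# parse_string() takes the XML itself as its argument.
$parser->parse_string('<book><title>Perl</title></book>');

print "Elements seen: ", join(', ', @{ $handler->{elements} }), "\n";
print "Text seen: $handler->{text}\n";
```

Run against the sample string, the script should report the element names in document order followed by the collected character data. Swapping parse_string() for parse_uri() or parse_file() changes only how the XML is supplied; the handler stays the same.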
