10.07.2015 Views

Beginning Web Development With Perl : From Novice to ... - Nabo



CHAPTER 9 ■ XML PARSING WITH PERL

dob = 3/10
email = frank@example.com
vehicle = Volvo S60
vehicle = Honda Accord
required = email
customer =
first_name = Sandy
last_name = Sanbeans
dob = 4/15
email = sandy@example.com
vehicle = McLaren MP4-20
vehicle = Chevrolet S-10

You've now seen how to parse XML using XML::SAX by creating your own handler for XML::SAX parser events. However, the examples here have only scratched the surface of XML parsing with XML::SAX. SAX is a very powerful specification, and XML::SAX is an equally capable Perl package for it. I invite you to spend some time reading the XML::SAX and XML::SAX::Base documentation and experimenting with the code and with more complex examples to parse XML in Perl with this excellent module.

Using Tree-Based Parsing

The chapter began with a look at XML::Simple for parsing simple XML. You then read about parsing XML with XML::SAX, a framework around which very complex XML parsing can be built. Tree-based parsing, or simply tree parsing, is yet another approach to parsing XML. This method delivers the entire XML structure to your program as one logical entity, as opposed to the delivery in chunks that you get with a stream processor. As noted earlier in the chapter, tree parsers are almost always stream-based parsers at heart, but they hold the data until the end of the parsing.

Because a tree parser must hand over the entire structure at once, it almost always has higher memory requirements than its stream-based counterpart. Since XML structures can be quite complex, it's not uncommon to receive an Out of Memory error when using a tree parser on complex or lengthy XML.

Tree parsers include XML::Parser, which can be used both as a tree and a stream parser, XML::Grove, XML::TreeBuilder, XML::Twig, and XML::SimpleObject, just to name a few. The parser that you saw earlier in the chapter, XML::Simple, is yet another tree parser.

Each XML parser has its own features and invariably its own syntax as well. XML::Twig, for example, is interesting in that it can hold only part of the XML tree in memory at a time, thus saving memory. XML::Twig also provides a simple means for converting XML into HTML or into other formats. The following example prints XML using the indented option with XML::Twig:

#!/usr/bin/perl
use strict;
use XML::Twig;
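The listing above breaks off after its use lines, and the rest of the original program is not reproduced here. As a rough sketch of what an indented print with XML::Twig can look like, the following assumes the pretty_print => 'indented' constructor option and a made-up inline XML string modeled on the customer records shown earlier. The $xml and $indented variables are illustrative names, not taken from the book.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical sample input: a single customer record written on one line,
# loosely modeled on the fields printed earlier in the chapter.
my $xml = '<customers><customer><first_name>Sandy</first_name>'
        . '<last_name>Sanbeans</last_name>'
        . '<email>sandy@example.com</email></customer></customers>';

# Ask XML::Twig to indent nested elements whenever the tree is output.
my $twig = XML::Twig->new( pretty_print => 'indented' );

# Build the whole tree in memory from the string.
$twig->parse($xml);

# sprint returns the serialized tree as a string; $twig->print would
# write the same text straight to STDOUT instead.
my $indented = $twig->sprint;
print $indented;
```

Capturing the output with sprint rather than calling print directly is only a convenience for inspecting the result; for XML stored in a file, parsefile would take the place of parse.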
