
Beginning Web Development With Perl : From Novice to ... - Nabo



CHAPTER 9: XML Parsing with Perl

You have some data in XML. Maybe that data is from a SOAP web service, maybe it’s from an RSS feed, or maybe it’s from another source. Now you want to read the XML and extract the data from it. As is the theme with Perl, you have multiple ways to accomplish this task.

XML parsing with Perl has a storied history. Some early modules were quirky, while others were incomplete.

Parsing simple XML with Perl is, well, simple. Parsing complex XML with Perl can be quite difficult. The important thing to remember is that XML is just a way to represent data. That data happens to be in an XML document. The program that you write to parse XML will first need to read the XML, and then use the results as it would any other data input.

This chapter looks at XML parsing with Perl. It first reviews the main parsing methods, and then describes using two modules: XML::Simple and XML::SAX. Finally, it examines tree-based parsing.

XML Parsing Methods

Recall that there’s always more than one way to do the same thing with Perl. XML parsing is no different. And, of course, there’s no rule that says that you must use an XML parser at all. It’s quite possible for you to write your own XML parser, just as it would be possible to write your own module for anything in Perl, rather than using an already existing module.

Primarily, two methods exist for parsing an XML document:

Stream parsing: Stream-based parsers process XML as it is read into the parser. As new elements (called tokens) are encountered, the parser processes them and sends them into your program as a series of events. This means that the program must process each piece of data as the stream-based parser encounters it. Stream-based parsers have lower memory requirements than their tree-based counterparts, simply because they don’t store any data; rather, they send data along into the rest of the program as it is found. Of course, the lower memory requirements are gained at the expense of complexity when compared to tree-based parsers.

Tree parsing: Tree-based parsers load entire XML structures into memory for later processing. This means that the entire document is parsed before your program needs to handle it. In turn, this leads to less complex programs when compared to stream-based parsing. The extra simplicity comes at the cost of higher memory requirements. Naturally, on a modern computer with a small document to parse, the memory required will be minimal.
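The difference between the two methods is easier to see side by side. The following is a minimal sketch rather than a definitive implementation: the sample document, its element names (feed, item, title), and the TitleHandler class are invented for illustration, and the code assumes that the XML::SAX and XML::Simple modules are installed from CPAN. XML::Simple stands in for the tree-style approach here because it reads the whole document into a Perl data structure before returning.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample document; the element names are made up for this sketch.
my $xml = <<'XML';
<feed>
  <item><title>First post</title></item>
  <item><title>Second post</title></item>
</feed>
XML

# --- Stream parsing: events arrive one at a time ----------------------
# TitleHandler is a hypothetical handler class. The parser stores nothing,
# so the handler has to track where it is in the document by itself.
package TitleHandler;
use base 'XML::SAX::Base';

sub start_element {
    my ($self, $el) = @_;
    if ($el->{Name} eq 'title') {
        $self->{in_title} = 1;
        push @{ $self->{titles} }, '';    # start collecting a new title
    }
}

sub characters {
    my ($self, $chars) = @_;
    # Text can arrive in several chunks, so append rather than assign.
    $self->{titles}[-1] .= $chars->{Data} if $self->{in_title};
}

sub end_element {
    my ($self, $el) = @_;
    $self->{in_title} = 0 if $el->{Name} eq 'title';
}

package main;
use XML::SAX::ParserFactory;
use XML::Simple qw(XMLin);

my $handler = TitleHandler->new;
XML::SAX::ParserFactory->parser(Handler => $handler)->parse_string($xml);
my @stream_titles = @{ $handler->{titles} || [] };

# --- Tree-style parsing: the whole document is loaded first -----------
# ForceArray keeps <item> as a list even if only one is present;
# an empty KeyAttr turns off XML::Simple's automatic key folding.
my $tree = XMLin($xml, ForceArray => ['item'], KeyAttr => []);
my @tree_titles = map { $_->{title} } @{ $tree->{item} };

print "stream: ", join(', ', @stream_titles), "\n";
print "tree:   ", join(', ', @tree_titles), "\n";
```

Both halves collect the same two titles. The stream version needs a handler class with three callbacks and a state flag, while the tree version is two lines that walk an ordinary Perl data structure, at the price of holding the entire document in memory.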
