04.08.2014 Views

o_18ufhmfmq19t513t3lgmn5l1qa8a.pdf

Create successful ePaper yourself

Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.

428 CHAPTER 22 ■ PROJECT 3: XML FOR ALL OCCASIONS<br />

Creating HTML Pages<br />

Now you’re ready to make the prototype. For now, let’s ignore the directories and concentrate<br />

on creating HTML pages. You have to create a slightly embellished event handler that does<br />

the following:<br />

• At the start of each page element, opens a new file with the given name, and writes a suitable<br />

HTML header to it, including the given title<br />

• At the end of each page element, writes a suitable HTML footer to the file, and closes it<br />

• While inside the page element, passes through all tags and characters without modifying<br />

them (writes them to the file as they are)<br />

• While not inside a page element, ignores all tags (such as website and directory<br />

Most of this is pretty straightforward (at least if you know a bit about how HTML documents<br />

are constructed). There are two problems, however, which may not be completely obvious.<br />

First, you can’t simply “pass through” tags (write them directly to the HTML file you’re<br />

building) because you are given their names only (and possibly some attributes). You have to<br />

reconstruct the tags (with angle brackets and so forth) yourself.<br />

Second, SAX itself gives you no way of knowing whether you are currently “inside” a page<br />

element. You have to keep track of that sort of thing yourself (as we did in the HeadlineHandler<br />

example). For this project, we’re only interested in whether or not to pass through tags and<br />

characters, so we’ll use a Boolean variable called passthrough, which we’ll update as we enter<br />

and leave the pages.<br />

See Listing 22-2 for the code for the simple program.<br />

Listing 22-2. A Simple Page Maker Script (pagemaker.py)<br />

from xml.sax.handler import ContentHandler<br />

from xml.sax import parse<br />

class PageMaker(ContentHandler):<br />

passthrough = False<br />

def startElement(self, name, attrs):<br />

if name == 'page':<br />

self.passthrough = True<br />

self.out = open(attrs['name'] + '.html', 'w')<br />

self.out.write('\n')<br />

self.out.write('%s\n' % attrs['title'])<br />

self.out.write('\n')<br />

elif self.passthrough:<br />

self.out.write('')

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!