XML Demystified


Chapter 7: XML Parsers and Transformations

Parsing an XML Document

An XML document is basically a text file in which some of the text is information and the rest is the XML tags that describe that information. XML tags give instructions to the program that transforms the information in the XML document into another form. The program that reads and interprets the information is called a parser, and the process of converting the information in an XML document into another form is called transformation.

In its simplest form, a parser extracts and reformats information in an XML document based on the XML tag that describes it. For example, suppose the parser encounters an XML tag. The parser reads the information contained in the tag, and the transformer then lets you convert it to another type of document, such as an HTML document.

In its more complicated form, a parser extracts selected XML tags and reformats the information based on business logic. Suppose account manager Bob Smith is planning a sales trip and wants a list of the customers who are in the same vicinity. The list can be generated by giving the parser specific instructions, such as to search the XML document for customers whose account manager is Bob Smith. For each customer found, the parser determines whether the customer’s zip code is within a specific set of zip codes. If so, selected information about the customer is copied from the XML document into an HTML web page that is displayed on Bob Smith’s computer.

Instructions for the parser are written in Extensible Stylesheet Language Transformations (XSLT) and stored in an Extensible Stylesheet Language (XSL) stylesheet, which you learned about in Chapter 6.

A parser is a program. A number of parsers are available, each of which adheres to one of two XML parsing standards: SAX and DOM.
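The Bob Smith scenario can be sketched in a few lines of Python. This is only an illustration of the selection logic, not the book's method: Python's standard library has no XSLT processor, so the filtering is written directly in Python with `xml.etree.ElementTree` instead of in an XSL stylesheet. The element names (`customer`, `name`, `account_manager`, `zip`), the sample data, and the helper functions `customers_for_trip` and `to_html` are all assumptions made up for this example.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical sample data; the element names are assumptions for illustration.
CUSTOMERS_XML = """<customers>
  <customer>
    <name>Acme Hardware</name>
    <account_manager>Bob Smith</account_manager>
    <zip>07665</zip>
  </customer>
  <customer>
    <name>Bay Supplies</name>
    <account_manager>Mary Jones</account_manager>
    <zip>07665</zip>
  </customer>
  <customer>
    <name>Coastal Traders</name>
    <account_manager>Bob Smith</account_manager>
    <zip>90210</zip>
  </customer>
</customers>"""


def customers_for_trip(xml_text, manager, zip_codes):
    """Return names of the manager's customers whose zip code is in zip_codes."""
    root = ET.fromstring(xml_text)
    names = []
    for customer in root.iter("customer"):
        # Business logic: match the account manager first, then the zip code.
        if (customer.findtext("account_manager") == manager
                and customer.findtext("zip") in zip_codes):
            names.append(customer.findtext("name"))
    return names


def to_html(names):
    """Reformat the selected names as a minimal HTML list."""
    items = "".join(f"<li>{escape(name)}</li>" for name in names)
    return f"<ul>{items}</ul>"


selected = customers_for_trip(CUSTOMERS_XML, "Bob Smith", {"07665", "07666"})
html_page = to_html(selected)
print(html_page)  # <ul><li>Acme Hardware</li></ul>
```

Only one customer passes both tests: the second has a different account manager, and the third is outside the set of zip codes.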
The Simple API for XML (SAX)

The Simple API for XML (SAX) standard was developed by members of the XML-DEV mailing list. It was driven by the need for an open standard that companies and public organizations could implement consistently. SAX is not technically an XML parser; it’s a specification that defines the interface to the parser. Its first release was in May 1998. Of all the implementations of the SAX specification, the Java implementation is probably the most mature and most widely used.

It’s important to understand that SAX is a standard for an application program interface (API). It specifies standards for classes that you use to build a SAX parser. This may sound confusing, especially if you’ve never programmed before. However, you can probably imagine the many steps that are necessary to read and transform an XML document. To build a parser from scratch, you would need to write code for each step, which is a tedious and time-consuming job. You can minimize the tedium and save time by using the classes of an API, which other developers have already written. Think of these classes as already assembled subparts of the parser, which you put together to create a parser. You aren’t expected to write a parser, but you’ll need one in order to transform your XML document.

A SAX parser (a parser that was developed using the SAX API) is designed to read large XML documents because it starts at the beginning of the XML document and reads a group of lines, called a block, at a time until it reaches the end of the document. The entire transformation process occurs in one reading.

As it reads each block, the SAX parser determines whether the block contains an XML tag or information. If it’s an XML tag, the SAX parser compares the XML tag to the XSL and then transforms the information based on the XSL instructions. The SAX parser then reads the next block of the XML document.

A block is discarded once it’s transformed. This frees memory for the next block, which gives the SAX parser an advantage over a DOM parser. A DOM parser loads the entire XML document into memory, which you’ll learn about in “The Document Object Model,” later in this chapter. As a result, a SAX parser needs only a small amount of memory to transform a very large XML document.
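The one-pass reading described above can be seen in a minimal sketch using Python's standard `xml.sax` module, which is one implementation of the SAX interface. The parser calls back into a handler as it meets each start tag, run of text, and end tag, and nothing is kept unless the handler keeps it. The sample document and the `NameCollector` class are assumptions invented for this example.

```python
import xml.sax

# Hypothetical sample document; the element names are assumptions.
SAMPLE = b"""<customers>
  <customer><name>Acme Hardware</name></customer>
  <customer><name>Bay Supplies</name></customer>
</customers>"""


class NameCollector(xml.sax.ContentHandler):
    """Collects the text of each <name> element as the document streams past."""

    def __init__(self):
        super().__init__()
        self.names = []
        self._in_name = False
        self._buffer = []

    def startElement(self, tag, attrs):
        # Called once for every opening tag, in document order.
        if tag == "name":
            self._in_name = True
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several pieces, so accumulate it.
        if self._in_name:
            self._buffer.append(content)

    def endElement(self, tag):
        # Called once for every closing tag; the element is now complete.
        if tag == "name":
            self.names.append("".join(self._buffer))
            self._in_name = False


collector = NameCollector()
xml.sax.parseString(SAMPLE, collector)
print(collector.names)  # ['Acme Hardware', 'Bay Supplies']
```

The handler remembers only the names it chooses to store. Everything else the parser reports is gone once the callback returns, which is why memory use stays small even for a very large document.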
This small memory footprint comes at a cost: a SAX parser cannot reference any block of an XML document other than the one currently in memory. This means that it cannot use the block being read to modify XML information that has already been transformed. A SAX parser gets one chance at reading each XML tag. Sometimes this is all you need, but for a more complex transformation you’ll need to use a DOM parser, which can reference any part of the XML document (see “The Document Object Model,” later in this chapter).

Components of a SAX Parser

There are four components in a SAX parser: the Content Handler, Error Handler, DTD Handler, and Entity Resolver.
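As a rough illustration of how the four components fit together, the sketch below registers one of each on a reader created with Python's standard `xml.sax` module. The setter methods and base classes are part of that module; the `TagCounter` and `CollectingErrors` classes, the `make_reader` helper, and the sample documents are assumptions written for this example.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, DTDHandler, EntityResolver, ErrorHandler


class TagCounter(ContentHandler):
    """Content Handler: receives the tags and text found in the document."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def startElement(self, tag, attrs):
        self.count += 1


class CollectingErrors(ErrorHandler):
    """Error Handler: records fatal errors, then stops the parse."""

    def __init__(self):
        self.fatal = []

    def fatalError(self, exception):
        self.fatal.append(str(exception))
        raise exception


def make_reader(content_handler, error_handler):
    """Build a reader with all four SAX components registered."""
    reader = xml.sax.make_parser()
    reader.setContentHandler(content_handler)
    reader.setErrorHandler(error_handler)
    # The library's default DTD Handler and Entity Resolver are used here
    # only to show where custom ones would be plugged in.
    reader.setDTDHandler(DTDHandler())
    reader.setEntityResolver(EntityResolver())
    return reader


# A well-formed document: the Content Handler sees three opening tags.
counter, errors = TagCounter(), CollectingErrors()
make_reader(counter, errors).parse(
    io.BytesIO(b"<customers><customer/><customer/></customers>"))
print(counter.count)  # 3

# A malformed document: the Error Handler is notified and parsing stops.
bad_errors = CollectingErrors()
try:
    make_reader(TagCounter(), bad_errors).parse(
        io.BytesIO(b"<customers><customer></customers>"))
except xml.sax.SAXParseException:
    pass
print(len(bad_errors.fatal))  # 1
```

In practice most applications write only a Content Handler and an Error Handler and leave the other two at their defaults.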