30.08.2014 Views

url - Universität zu Lübeck

CHAPTER 2. FUNDAMENTALS

2.2 Document Type Definitions and XML Schema

So far, we have placed no restrictions on the elements' labels or their structure. For most applications, however, not every well-formed XML document is understandable and processable. For example, an auction system that expects XMark data will not be able to process an XML-formatted list of publications. Technically, it can read and parse the elements, but semantically the application does not know how to deal with them.

Therefore, we need a mechanism to declare a class or type of documents. This is done with schema languages such as Document Type Definitions and XML Schema. The idea is to predefine the allowed element labels and to declare how they may be nested. Schemas are comparable to grammars for programming languages; however, context-free grammars describe sets of words, whereas we need to describe sets of trees. The term "schema" comes from the database community.

If an XML document satisfies all constraints of a schema, it is valid with respect to that schema. Validity implies well-formedness and is checked by validating parsers.
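The difference between well-formedness and validity can be illustrated with a minimal sketch. It relies on Python's standard `xml.etree.ElementTree` parser, which checks well-formedness only, and approximates validation by testing element labels against a hypothetical set `ALLOWED_LABELS`. The labels and function names are assumptions made for illustration; a real validating parser would also check how the elements are nested.

```python
import xml.etree.ElementTree as ET

# Hypothetical labels that an auction application understands (an assumption).
ALLOWED_LABELS = {"auction", "seller", "payment", "description", "bid"}


def is_well_formed(text):
    """Return True if the text parses as XML at all."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


def uses_only_allowed_labels(text, allowed=ALLOWED_LABELS):
    """Rough stand-in for validation: every element label must be allowed.

    Nesting and ordering are deliberately not checked here.
    """
    root = ET.fromstring(text)
    return all(element.tag in allowed for element in root.iter())


auction_doc = "<auction><seller/><payment/></auction>"
publication_doc = "<publications><paper/></publications>"

# Both documents are well-formed, but only the first one uses known labels.
print(is_well_formed(auction_doc), uses_only_allowed_labels(auction_doc))
print(is_well_formed(publication_doc), uses_only_allowed_labels(publication_doc))
```

The publication list passes the parser but fails the label check, which mirrors the situation of the auction system described above.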

2.2.1 DTD: Document Type Definition

A significant feature that XML inherits from its predecessor SGML is the concept of a Document Type Definition (DTD). A DTD is optional and provides a formal set of rules that define a document structure. It defines the elements that may be used and states where they may occur in relation to each other. The DTD thus defines the document's hierarchy and granularity.

Figure 2.2 presents the DTD for an XMark fragment.

[Figure 2.2: The DTD for an XMark fragment; the listing of 11 numbered lines is missing from this copy]

Line 2 states that the root element is an […] containing a sequence of […], […], […], and […] elements. The + symbol indicates that the <payment> element must appear at least once and may appear more than once. A * symbol states that zero or more elements are allowed. The ? symbol indicates that an element may appear zero times or once. In the example, an item may have a description, but it does not need to have one. If no symbol is attached to an element, it must appear exactly once as a child.
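The occurrence indicators behave like the corresponding regular-expression operators, which the following minimal sketch makes explicit. It translates a flat, comma-separated content model into a regular expression over the sequence of child labels. The content model string, the element names, and both function names are hypothetical and are not taken from the DTD in Figure 2.2; nested groups and choices (`|`) are not handled.

```python
import re


def content_model_to_regex(model):
    """Translate a flat sequence such as 'name, payment+, description?'
    into a regex over child labels, each label followed by one space."""
    parts = []
    for token in model.split(","):
        token = token.strip()
        # A trailing +, * or ? is the occurrence indicator; none means exactly once.
        indicator = token[-1] if token[-1] in "+*?" else ""
        name = token.rstrip("+*?")
        parts.append("(?:%s )%s" % (re.escape(name), indicator))
    return re.compile("".join(parts))


def children_match(model, child_labels):
    """Check whether the ordered child labels satisfy the content model."""
    encoded = "".join(label + " " for label in child_labels)
    return content_model_to_regex(model).fullmatch(encoded) is not None


# Hypothetical content model: one name, one or more payments,
# an optional description, and any number of bids.
MODEL = "name, payment+, description?, bid*"

print(children_match(MODEL, ["name", "payment"]))
print(children_match(MODEL, ["name", "description"]))
```

The first call satisfies the model because the optional and repeatable children may be absent; the second does not, since `payment+` requires at least one `payment` child.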
