EXtensible Markup Language (XML) - Cultural View


PDF generated using the open source mwlib toolkit. See http://code.pediapress.com/ for more information.

PDF generated at: Thu, 17 Jun 2010 01:47:38 UTC


Contents

Articles

Binary XML
Business Process Definition Metamodel
CDATA
CDuce
Character entity reference
CodeSynthesis XSD
D3L
Darwin Information Typing Architecture
DITA Open Toolkit
Document Structure Description
Document-Centric
Document-centric XML processing
Dynamic XML
ECMAScript for XML
Efficient XML Interchange
Embedded RDF
EpiDoc
eXtensible Server Pages
Fast Infoset
Global listings format
GMX
GMX-V
Head-Body Pattern
HyTime
Internationalization Tag Set
Klip
List of XML and HTML character entity references
Log4js
MAREC
Media Object Server
METS
Numeric character reference
Office Open XML
Office Open XML file formats
OIOXML
Open XML Paper Specification
PCDATA
Plain Old XML
Portable Application Description
Publishing Requirements for Industry Standard Metadata
QName
QTI
Resource Description Framework
Resources of a Resource
Reverse Ajax
Root element
Schematron
Simple Outline XML
Simple XML
Streaming XML
Styled Layer Descriptor
Topic (XML)
Unique Particle Attribution
VTD-XML
X-expression
XBRLS
Xdos
XDR Schema
XEE (Starlight)
XEP
XML
XML and MIME
XML appliance
XML Base
XML Catalog
XML Certification Program
XML Configuration Access Protocol
XML Control Protocol
XML data binding
XML database
XML editor
XML Enabled Directory
XML Encryption
XML Events
XML framework
XML Literals
XML namespace
XML Pretty Printer
XML Protocol
XML schema
XML Schema Editor
XML Schema Language Comparison
XML Studio
XML Telemetric and Command Exchange
XML template engine
XML tree
XML validation
XML-Enabled Networking
XML-Retrieval
XMLHttpRequest
XMLSocket
XPath
XPath 2.0
Xs3p
XSQL

References

Article Sources and Contributors
Image Sources, Licenses and Contributors

Article Licenses

License


Binary XML

Binary XML refers to any specification which defines the compact representation of XML (Extensible Markup Language) in a binary format. While there are several competing formats, none has been widely adopted by a standards organization or accepted as a de facto standard. Using a binary XML format generally reduces the verbosity of XML documents and the cost of parsing [1], but hinders the use of ordinary text editors and third-party tools to view and edit the document. Binary XML is typically used in applications where standard XML is not an option due to performance limitations, but the ability to convert the document to and from a form which is easily viewed and edited is valued. Other advantages may include enabling random access and indexing of XML documents.

The major challenge for binary XML is to create a single, widely adopted standard. The International Organization for Standardization (ISO) and the International Telecommunication Union (ITU) published the Fast Infoset standard in 2007 and 2005, respectively. The World Wide Web Consortium (W3C) has produced the first draft of the EXI format specification. Another standard (ISO/IEC 23001-1), known as Binary MPEG format for XML (BiM), was standardized by the ISO in 2001. BiM is used by many ETSI standards for Digital TV and Mobile TV. The Open Geospatial Consortium also provides a Binary XML Encoding Specification (currently a Best Practice Paper) optimized for geo-related data (GML).

Alternatives to binary XML include using traditional file compression methods on XML documents (for example, gzip) or using an existing standard such as ASN.1. Traditional compression methods, however, offer only the advantage of compression, without the advantage of decreased parsing time or random access. ASN.1 is used as the basis of Fast Infoset, which is one binary XML standard. There are also hybrid approaches (e.g., VTD-XML) that attach a small index file to an XML document to eliminate the overhead of parsing [2].
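The trade-off of traditional compression can be illustrated with a minimal Python sketch that uses only the standard library. The sample document and its element names are invented for illustration; the point is that gzip shrinks the bytes, but the consumer must still decompress and run a full text parse.

```python
import gzip
import xml.etree.ElementTree as ET

# Hypothetical sample document: 1000 repetitive <reading> elements.
root = ET.Element("readings")
for i in range(1000):
    ET.SubElement(root, "reading", sensor="s1", value=str(i))
xml_bytes = ET.tostring(root, encoding="utf-8")

# Traditional compression reduces the size of the serialized document...
compressed = gzip.compress(xml_bytes)
print(len(xml_bytes), len(compressed))

# ...but the consumer must decompress and then parse the full text again,
# so parsing time is not reduced and random access is not possible.
restored = ET.fromstring(gzip.decompress(compressed))
```

A binary XML format aims to address the second half of this sketch as well, by making the decoded form cheaper to produce than a text parse.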

Adoption

Projects and file formats which use binary XML include:

• Fast Infoset, a standard published by ISO/IEC and ITU-T
• Efficient XML from AgileDelta, Inc., selected as the basis for the W3C Standard for Binary XML (EXI)
• Extensible Binary Meta Language (EBML) from Matroska
• Wireless Binary XML (WBXML)

Other projects that have functionality related to (or competing with) binary representations include:

• VTD-XML from XimpleWare and VTD-XML project
• BiM Standard, from the ISO, developed by the MPEG working group
• Protocol Buffers from Google
• Data Distribution Service from OMG

References

[1] The performance woe of binary XML (http://webservices.sys-con.com/read/250512.htm)
[2] Index XML documents with VTD-XML (http://xml.sys-con.com/read/453082.htm)


Business Process Definition Metamodel

The Business Process Definition Metamodel (BPDM) is a standard definition of concepts used to express business process models (a metamodel), adopted by the OMG (Object Management Group). Metamodels define concepts, relationships, and semantics for exchange of user models between different modeling tools. The exchange format is defined by XSD (XML Schema) and XMI (XML for Metadata Interchange), a specification for transformation of OMG metamodels to XML. Pursuant to the OMG's policies, the metamodel is the result of an open process involving submissions by member organizations, following a Request for Proposal [1] (RFP) issued in 2003. BPDM was adopted in initial form in July 2007, and finalized in July 2008.

BPDM provides abstract concepts as the basis for consistent interpretation of specialized concepts used by business process modelers. For example, the ordering of many of the graphical elements in a BPMN (Business Process Modeling Notation) diagram is depicted by arrows between those elements, but the specific elements can have a variety of characteristics. Similarly, all BPMN events have some common characteristics, and a variety of specific events are designated by the type of circle and the icon in the circle. The abstract BPDM concepts ensure that implementers of different modeling tools associate the same characteristics and semantics with the modeling elements, so that models are interpreted the same way when moved to a different tool. Users of the modeling tools do not need to be concerned with the abstractions; they only see the specialized elements.

BPDM extends business process modeling beyond the elements defined by BPMN and BPEL to include interactions between otherwise-independent business processes executing in different business units or enterprises (choreography). A choreography can be specified independently of its participants, and used as a requirement for the specification of the orchestration implemented by a participant. BPDM provides for the binding of orchestration to choreography to ensure compatibility. Many current business process models focus on specification of executable business processes that execute within an enterprise (orchestration).

The BPDM specification addresses the objectives of the OMG RFP [1] on which it is based:

• BPDM "will define a set of abstract business process definition elements for specification of executable business processes that execute within an enterprise, and may collaborate between otherwise-independent business processes executing in different business units or enterprises."
• A common metamodel to unify the diverse business process definition notations that exist in the industry, containing semantics compatible with leading business process modeling notations.
• A metamodel that complements existing UML metamodels so that business process specifications can be part of complete system specifications to assure consistency and completeness.
• The ability to integrate process models for workflow management processes, automated business processes, and collaborations between business units.
• Support for the specification of web services choreography, describing the collaboration between participating entities and the ability to reconcile the choreography with supporting internal business processes.
• The ability to exchange business process specifications between modeling tools, and between tools and execution environments using XMI.

The RFP seeks to "improve communication between modelers, including between business and software modelers, provide flexible selection of tools and execution environments, and promote the development of more specialized tools for the analysis and design of processes."

For exchange of business process models, BPDM is an alternative to the existing process interchange format XPDL (XML Process Definition Language) from the WfMC (Workflow Management Coalition). The two specifications are similar in that they can be used by process design tools to exchange business process definitions. They differ in that BPDM provides a specification of semantics integrated in a metamodel, and it includes additional modeling capabilities such as choreography, discussed above. In addition, XPDL has many implementations, though only some support XPDL 2.x, which is needed for interchanging BPMN. BPDM implementations are in preparation, including support for BPMN and translation to XPDL.

External links

• BPDM Tutorial [2]
• Design Rationale [3] (see Section 4, also Sections 7.6 and 7.9).
• Other introductory presentations [4]
• Web pages showing metamodels [5] in UML notation
• Specification documents, in two parts:
  • Common Infrastructure [6] (see Section 4.4.1.1 for an overview of metamodeling).
  • Process Definition [7].

References

[1] http://www.omg.org/cgi-bin/doc?bei/03-01-06
[2] http://doc.omg.org/omg/08-06-32
[3] http://doc.omg.org/bmi/08-09-07
[4] http://www.conradbock.org/#BPDM
[5] ftp://ftp.omg.org/pub/docs/dtc/08-05-11/pages/188c21b53f42002f.htm
[6] http://doc.omg.org/dtc/08-05-07
[7] http://doc.omg.org/dtc/08-05-10

CDATA

The term CDATA, meaning character data, is used for distinct but related purposes in the markup languages SGML and XML. The term indicates that a certain portion of the document is general character data, rather than non-character data or character data with a more specific, limited structure.

CDATA sections in XML

In an XML document or external parsed entity, a CDATA section is a section of element content that is marked for the parser to interpret as only character data, not markup. A CDATA section is merely an alternative syntax for expressing character data; there is no semantic difference between character data that manifests as a CDATA section and character data that manifests in the usual syntax, in which "<" and "&" are represented by "&lt;" and "&amp;", respectively.

A CDATA section starts with the sequence <![CDATA[ and ends with the first occurrence of the sequence ]]>. For example, if an element is written inside a CDATA section like this:

<![CDATA[<sender>John Smith</sender>]]>

then the code is interpreted the same as if it had been written like this:

&lt;sender&gt;John Smith&lt;/sender&gt;

That is, the "sender" tags will have exactly the same status as the "John Smith": they will be treated as text.

Similarly, if the numeric character reference &#240; appears in element content, it will be interpreted as the single Unicode character 00F0 (small letter eth). But if the same reference appears in a CDATA section, it will be parsed as six characters: ampersand, hash mark, digit 2, digit 4, digit 0, semicolon.
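The equivalence described above can be checked with a minimal Python sketch using the standard library's ElementTree parser. The wrapper element names (msg, c) are invented for illustration.

```python
import xml.etree.ElementTree as ET

# The same character data written as a CDATA section and with escapes.
as_cdata = ET.fromstring("<msg><![CDATA[<sender>John Smith</sender>]]></msg>")
as_escaped = ET.fromstring("<msg>&lt;sender&gt;John Smith&lt;/sender&gt;</msg>")

# Both parse to identical text, and neither yields a child element.
same_text = as_cdata.text == as_escaped.text
child_count = len(as_cdata)

# A numeric character reference is expanded in element content
# but left as six literal characters inside a CDATA section.
expanded = ET.fromstring("<c>&#240;</c>").text
literal = ET.fromstring("<c><![CDATA[&#240;]]></c>").text
```

Note that the parser does not report whether the text came from a CDATA section; after parsing, the two forms are indistinguishable.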

Uses of CDATA sections

New authors of XML documents often misunderstand the purpose of a CDATA section, mistakenly believing that its purpose is to "protect" data from being treated as ordinary character data during processing. Some APIs for working with XML documents do offer options for independent access to CDATA sections, but such options exist above and beyond the normal requirements of XML processing systems, and still do not change the implicit meaning of the data. Character data is character data, regardless of whether it is expressed via a CDATA section or ordinary markup.

CDATA sections are useful for writing XML code as text data within an XML document. For example, if one wishes to typeset a book with XSL explaining the use of an XML application, the XML markup to appear in the book itself will be written in the source file in a CDATA section. However, a CDATA section cannot contain the string "]]>", and therefore it is not possible for a CDATA section to contain nested CDATA sections. The preferred approach to encoding text that contains the triad "]]>" is to use multiple CDATA sections, splitting each occurrence of the triad just before the ">". For example, to encode "]]>" one would write:

<![CDATA[]]]]><![CDATA[>]]>

This means that to encode "]]>" in the middle of a CDATA section, replace all occurrences of "]]>" with the following:

]]]]><![CDATA[>

This effectively stops and restarts the CDATA section.
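The splitting rule can be sketched in a few lines of Python. The helper name to_cdata and the sample string are hypothetical; the round trip through the standard library parser shows that the split sections are read back as one piece of text.

```python
import xml.etree.ElementTree as ET

def to_cdata(text: str) -> str:
    """Wrap text in CDATA, splitting every "]]>" just before the ">"."""
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"

# Sample text that happens to contain the forbidden triad.
sample = "if (a[b[0]]>1) { ... }"
wrapped = to_cdata(sample)

# Adjacent CDATA sections are concatenated by the parser.
round_trip = ET.fromstring("<code>" + wrapped + "</code>").text
```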

Use of CDATA in program output

For generating XML "by hand", CDATA sections do not remove the need for escaping. The string ]]> (the CDATA end marker) must be escaped with a string such as ]]]]><![CDATA[>, which breaks the string across separate CDATA sections. An alternative to using CDATA sections, which may be simpler in some circumstances, is to escape the single characters & and < (normally using &amp; or &#38; and &lt; or &#60;). The different approaches produce equally valid XML, and most XML parsers will not preserve the distinctions between them in their output.
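As a minimal illustration in Python, the two approaches can be compared with the standard library. The element name note and the payload string are invented for the example; xml.sax.saxutils.escape handles the & and < characters (and also >).

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

payload = "1 & 2 are < 3"

# Approach 1: escape the markup-significant characters.
escaped_doc = "<note>" + escape(payload) + "</note>"

# Approach 2: wrap in a CDATA section (this payload contains no "]]>").
cdata_doc = "<note><![CDATA[" + payload + "]]></note>"

# A parser reports the same character data for both documents.
text_from_escaped = ET.fromstring(escaped_doc).text
text_from_cdata = ET.fromstring(cdata_doc).text
```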

CDATA sections in XHTML documents are liable to be parsed differently by web browsers if they render the document as HTML, since HTML parsers do not recognise the CDATA start and end markers, nor do they recognise HTML entity references such as &lt; within <script> tags. This can cause rendering problems in web browsers and can lead to cross-site scripting vulnerabilities if used to display data from untrusted sources, since the two kinds of parser will disagree on where the CDATA section ends.

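Since it is useful to be able to use less-than signs (<) and ampersands (&) in inline scripts and stylesheets without escaping them, CDATA markers are commonly placed around the text of inline script and style elements in XHTML documents. So that the document can also be handled by HTML parsers, which do not recognise the CDATA markers, the markers are usually commented out, as in this JavaScript example:

<script>
//<![CDATA[
...
//]]>
</script>

or this CSS example:

<style>
/*<![CDATA[*/
...
/*]]>*/
</style>

This technique is only necessary when using inline scripts and stylesheets, and is language-specific. CSS stylesheets, for example, only support the second style of commenting-out (/* ... */), but CSS also has less need for the < and & characters than JavaScript and so less need for explicit CDATA markers.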

CDATA in DTDs

CDATA-type attribute value

In Document Type Definition (DTD) files for SGML and XML, an attribute value may be designated as being of type CDATA: arbitrary character data. Within a CDATA-type attribute, character and entity reference markup is allowed and will be processed when the document is read.

For example, if an XML DTD contains

<!ATTLIST foo a CDATA #IMPLIED>

it means that elements named foo may optionally have an attribute named "a" which is of type CDATA. In an XML document that is valid according to this DTD, an element like this might appear:

<foo a="1 &amp; 2 are &lt; 3"/>

and an XML parser would interpret the "a" attribute's value as being the character data "1 & 2 are < 3".
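A minimal Python sketch of this behaviour, assuming the declaration `<!ATTLIST foo a CDATA #IMPLIED>` is supplied in an internal DTD subset (the `<!ELEMENT foo EMPTY>` line is added here only to make the subset complete). ElementTree does not validate against the DTD, but it does process the references inside the attribute value.

```python
import xml.etree.ElementTree as ET

doc = """<!DOCTYPE foo [
  <!ELEMENT foo EMPTY>
  <!ATTLIST foo a CDATA #IMPLIED>
]>
<foo a="1 &amp; 2 are &lt; 3"/>"""

# The parser accepts the internal DTD subset and expands the
# entity references in the CDATA-type attribute value.
value = ET.fromstring(doc).get("a")
```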

CDATA-type entity

An SGML or XML DTD may also include entity declarations in which the token CDATA is used to indicate that the entity consists of character data. The character data may appear within the declaration itself or may be available externally, referenced by a URI. In either case, character reference and parameter entity reference markup is allowed in the entity, and will be processed as such when it is read.

CDATA-type element content

An SGML DTD may declare an element's content as being of type CDATA. Within a CDATA-type element, no markup will be processed. It is similar to a CDATA section in XML, but has no special boundary markup, as it applies to the entire element.


External links

• CDATA Confusion [1]
• Character Data and Markup (in XML) [2]

References

[1] http://www.flightlab.com/~joe/sgml/cdata.html
[2] http://www.w3.org/TR/REC-xml/#syntax

CDuce

CDuce is an XML-oriented functional language, which extends XDuce in a few directions. It features XML regular expression types, XML regular expression patterns, and XML iterators. CDuce is not, strictly speaking, an XML transformation language, since it can be used for general-purpose programming.

CDuce conforms to basic standards: Unicode, XML, DTD, and Namespaces are fully supported; XML Schema is partially supported.

Benefits of CDuce

• static verifications (e.g., ensuring that a transformation produces a valid document);
• in particular, smooth and safe composition of XML transformations, and incremental programming;
• static optimizations and an efficient execution model (knowing the type of a document is crucial to extract information efficiently).

Features particular to CDuce

• XML objects can be manipulated as first-class values: elements, sequences, tags, characters and strings, attribute sets; sequences of XML elements can be specified by regular expressions, which also apply to character strings;
• functions themselves are first-class values: they can be manipulated, stored in data structures, returned by functions, and so on;
• a powerful pattern matching operation can perform complex extractions from sequences of XML elements;
• a rich type algebra, with recursive types and arbitrary boolean combinations (union, intersection, complement), allows precise definitions of data structures and XML types; general-purpose types and type constructors are taken seriously (products, extensible records, arbitrary precision integers with interval constraints, Unicode characters);
• polymorphism through a natural notion of subtyping, and overloaded functions with dynamic dispatch;
• a highly effective type-driven compilation schema.

External links

• CDuce [1]

References

[1] http://www.cduce.org


Character entity reference

In the markup languages SGML, HTML, XHTML and XML, a character entity reference is a reference to a particular kind of named entity that has been predefined or explicitly declared in a Document Type Definition (DTD). The "replacement text" of the entity consists of a single character from the Universal Character Set/Unicode. The purpose of a character entity reference is to provide a way to refer to a character that is not universally encodable.

Although in popular usage character references are often called "entity references" or even "entities", this usage is wrong. A character reference is a reference to a character, not to an entity. An entity reference refers to the content of a named entity. An entity declaration is created by using the <!ENTITY name "value"> syntax in a document type definition (DTD) or XML schema. The name defined in the entity declaration is subsequently used in the XML. When used in the XML, it is called an entity reference.

Concepts

XML has two relevant concepts:

Predefined entity

A "predefined entity reference" is a reference to one of the special characters denoted by:

Character coding

entity    character    code (dec)    meaning
&quot;    "            x22 (34)      (double) quotation mark
&amp;     &            x26 (38)      ampersand
&apos;    '            x27 (39)      apostrophe (= apostrophe-quote)
&lt;      <            x3C (60)      less-than sign
&gt;      >            x3E (62)      greater-than sign

Character reference

A "character reference" is a construct such as &#xa0; or, equally, &#160; that refers to a character by means of its numeric Unicode code point. Here, the character code 160 (xA0 in hexadecimal) refers to the non-breaking space, the character that HTML names &nbsp;.
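Both concepts can be observed with a short Python sketch using the standard library parser. The element name t is invented for the example.

```python
import xml.etree.ElementTree as ET

# The five predefined entities need no declaration in XML.
predefined = ET.fromstring("<t>&quot;&amp;&apos;&lt;&gt;</t>").text

# Hexadecimal and decimal character references name the same code point.
hex_ref = ET.fromstring("<t>&#xa0;</t>").text
dec_ref = ET.fromstring("<t>&#160;</t>").text
```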

See also

• SGML entity
• Character encodings in HTML
• Numeric character reference
• List of XML and HTML character entity references

External links

• Entities Table [1]
• A Simple Character Entity Chart [2]
• A character entity chart with images for entities [3]
• A Clear and Quick Reference to HTML Symbol Entities Codes [4]

References

[1] http://www.elizabethcastro.com/html/extras/entities.html
[2] http://www.evolt.org/article/ala/17/21234/
[3] http://www.escapecodes.info/
[4] http://www.entitycode.com/


CodeSynthesis XSD

Written in: C++
Type: library or framework

CodeSynthesis XSD is an XML Data Binding compiler for C++ developed by Code Synthesis and dual-licensed under the GNU GPL and a proprietary license. Given an XML instance specification (XML Schema), it generates C++ classes that represent the given vocabulary as well as parsing and serialization code. It is supported on a large number of platforms, including AIX, GNU/Linux, HP-UX, Mac OS X, Solaris, Windows, HP OpenVMS, and IBM z/OS. Supported C++ compilers include GNU G++, Intel C++, HP aCC, Sun C++, IBM XL C++, and Microsoft Visual C++. A version for mobile and embedded systems, called CodeSynthesis XSD/e, is also available.

One of the unique features of CodeSynthesis XSD is its support for two different XML Schema to C++ mappings: in-memory C++/Tree and stream-oriented C++/Parser. The C++/Tree mapping is a traditional mapping with a tree-like, in-memory data structure. C++/Parser is a new, SAX-like mapping which represents the information stored in XML instance documents as a hierarchy of vocabulary-specific parsing events. In comparison to C++/Tree, the C++/Parser mapping allows one to handle large XML documents that would not fit in memory, perform stream-oriented processing, or use an existing in-memory representation.

CodeSynthesis XSD itself is written in C++ [1].

External links

• CodeSynthesis XSD Home Page [2]
• An Introduction to the C++/Tree Mapping [3]
• An Introduction to the C++/Parser Mapping [4]
• An Introduction to XML Data Binding in C++ [5]

References

[1] Bjarne Stroustrup. C++ applications (http://www.research.att.com/~bs/applications.html), 2007-05-25. Retrieved on 2007-06-18.
[2] http://www.codesynthesis.com/products/xsd/
[3] http://www.codesynthesis.com/projects/xsd/documentation/cxx/tree/guide/
[4] http://www.codesynthesis.com/projects/xsd/documentation/cxx/parser/guide/
[5] http://www.artima.com/cppsource/xml_data_binding.html


D3L

D3L (Data Definition Description Language) is an XML-based message description language that describes the structure that an application's native, non-XML format message (also known as its native view) must follow to communicate. D3L is currently used in Oracle Application Server InterConnect to interact through several transport adapters, including FTP, HTTP(S), MQ Series, and SMTP.

External links

http://download-uk.oracle.com/docs/cd/B10465_01/integrate.904/b10404/appx_d3l.htm#620714

Darwin Information Typing Architecture

The Darwin Information Typing Architecture (DITA) is an XML-based architecture for authoring, producing, and delivering information. Although its main applications have so far been in technical publications, DITA is also used for other types of documents such as policies and procedures.

Origin and name

The DITA architecture and a related DTD and XML Schema were originally developed by IBM. The architecture incorporates ideas in XML architecture, such as modular information architecture, various features for content reuse, and specialization, that had been developed over previous decades. [1] DITA is now an OASIS standard.

The first word in the name "Darwin Information Typing Architecture" is a reference to the naturalist Charles Darwin. The key concept of "specialization" in DITA is in some ways analogous to Darwin's concept of evolutionary adaptation, with a specialized element inheriting the properties of the base element from which it is specialized.

Features and limitations

Topic orientation

DITA content is written as modular topics, as opposed to long "book-oriented" files. A DITA map contains links to topics, organized in the sequence (which may be hierarchical) in which they are intended to appear in finished documents. A DITA map defines the table of contents for deliverables. Relationship tables in DITA maps can also specify which topics link to each other.

Modular topics can be easily reused in different deliverables. However, the strict topic-orientation of DITA makes it an awkward fit for content that contains lengthy narratives that do not lend themselves to being broken into small, standalone chunks. Experts stress the importance of content analysis in the early stages of implementing structured authoring. [2] [3] [4]


Content references

Fragments of content within topics (or less commonly, the topics themselves) can be reused through the use of content references (conref), a transclusion mechanism.
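The idea of transclusion can be sketched in Python. This is a deliberately simplified illustration, not the DITA addressing scheme: real conref values address a file, topic, and element, whereas here a conref is assumed to hold just the id of a target element in the same document. The function name resolve_conrefs and the sample topic are hypothetical.

```python
import copy
import xml.etree.ElementTree as ET

# Simplified sketch: a conref is assumed to hold only the target element's id.
SOURCE = """<topic id="t1">
  <body>
    <p id="warn">Unplug the device before servicing.</p>
    <p conref="warn"/>
  </body>
</topic>"""

def resolve_conrefs(root: ET.Element) -> None:
    """Copy the content of each referenced element into the referencing one."""
    by_id = {el.get("id"): el for el in root.iter() if el.get("id")}
    # Iterate over a snapshot because the tree is modified in the loop.
    for el in list(root.iter()):
        target = by_id.get(el.get("conref"))
        if target is not None:
            el.text = target.text
            el[:] = [copy.deepcopy(child) for child in target]
            del el.attrib["conref"]

topic = ET.fromstring(SOURCE)
resolve_conrefs(topic)
paragraphs = [p.text for p in topic.iter("p")]
```

After resolution, both paragraphs carry the same text, so a change to the source paragraph would propagate to every place that references it.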

Conditional text

Conditional text allows filtering or styling content based on attributes for audience, platform, product, and other properties.
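Attribute-based filtering can be sketched in Python as follows. The function name filter_content and the sample content are hypothetical, and the sketch assumes each attribute holds a single value; it is not a description of how any particular DITA processor implements filtering.

```python
import xml.etree.ElementTree as ET

SOURCE = """<body>
  <p>Common step.</p>
  <p audience="admin">Admin-only step.</p>
  <p platform="linux">Linux-only step.</p>
</body>"""

def filter_content(root, exclude):
    """Drop elements whose attribute value is excluded, e.g. {"audience": {"admin"}}."""
    for parent in list(root.iter()):
        for child in list(parent):
            if any(child.get(attr) in values for attr, values in exclude.items()):
                parent.remove(child)

body = ET.fromstring(SOURCE)
filter_content(body, {"audience": {"admin"}})
remaining = [p.text for p in body.iter("p")]
```

The same source can thus yield different deliverables depending on which attribute values are excluded at publishing time.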

Metadata

DITA includes extensive metadata elements and attributes, which make topics easier to find.

Information typing

DITA specifies three basic topic types: Task, Concept and Reference. Each of the three basic topic types is a specialization of a generic Topic type, which contains a title element, a prolog element for metadata, and a body element. The body element contains paragraph, table, and list elements, similar to HTML.

1. A Task topic is intended for a procedure that describes how to accomplish a task. A Task topic lists a series of steps that users follow to produce an intended outcome. The steps are contained in a taskbody element, which is a specialization of the generic body element. The steps element is a specialization of an ordered list element.
2. A Concept topic is more objective, containing definitions, rules, and guidelines.
3. A Reference topic describes command syntax, programming instructions, and other reference material, and usually contains detailed, factual material.

Specialization

DITA allows adding new elements and attributes through specialization of base DITA elements and attributes. Through specialization, DITA can accommodate new topic types, element types, and attributes as needed for specific industries or companies. Specializations of DITA for specific industries, such as the semiconductor industry, are standardized through OASIS technical committees or subcommittees. A significant percentage of organizations using DITA also develop their own specializations.

The extensibility of DITA permits organizations to specialize DITA by defining specific information structures and still use standard tools to work with them. The ability to define company-specific information architectures enables companies to use DITA to enrich content with metadata that is meaningful to them, and to enforce company-specific rules on document structure.

Compatibility with non-DITA content<br />

The element types and structures in DITA topics are similar to popular languages such as HTML. For example, a<br />

bulleted or numbered list can be copied and pasted directly from HTML to DITA.<br />

DITA maps can include both DITA topics and non-DITA documents (such as HTML files and Microsoft Word<br />

documents) in document hierarchies. However, processors are generally limited in their ability to merge DITA and<br />

non-DITA content into consolidated printed documents.



Creating content in DITA<br />

DITA map and topic documents are <strong>XML</strong> files. As with HTML, any images, video files, or other files which need to<br />

appear in output are inserted via reference. Any <strong>XML</strong> editor can therefore be used to write DITA content, with the<br />

exception of editors that support only a limited set of <strong>XML</strong> schemas (such as XHTML editors). Various editing tools<br />

have been developed that provide specific features to support DITA, such as visualization of conrefs.<br />

Publishing content written in DITA<br />

DITA is conceived as an end-to-end architecture. In addition to indicating what elements, attributes, and rules are<br />

part of the DITA language, the DITA specification [5] includes rules for publishing DITA content in print, HTML,<br />

online Help, and other formats.<br />

For example, the DITA specification indicates that if the conref attribute of element A contains a path to element B,<br />

the contents of element B will be displayed in the location of element A. Rules also exist for processing other rich<br />

features such as conditional text, index markers, and topic-to-topic links. Applications that transform DITA content<br />

into other formats and meet the DITA specification's requirements for interpreting DITA markup, including the<br />

specified conref behaviour, are known as DITA processors.<br />
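The conref behaviour described above can be sketched in a few lines. This is a simplified illustration, not a DITA processor: the sample topic and the `resolve_conrefs` helper are hypothetical, and the conref value is assumed to be a bare element id, whereas real conref values are paths that can point into other files.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical topic; the second paragraph reuses the first via conref.
TOPIC = """
<topic id="t1">
  <body>
    <p id="warning">Unplug the device first.</p>
    <p conref="warning"/>
  </body>
</topic>
"""

def resolve_conrefs(root):
    """Copy the content of the referenced element (B) into each referencing element (A)."""
    by_id = {el.get("id"): el for el in root.iter() if el.get("id")}
    for el in list(root.iter()):
        target = by_id.get(el.get("conref"))
        if target is None:
            continue  # no conref attribute, or an unresolved reference
        el.text = target.text
        for child in target:
            el.append(copy.deepcopy(child))
        del el.attrib["conref"]
    return root

resolved = resolve_conrefs(ET.fromstring(TOPIC))
paragraphs = [p.text for p in resolved.iter("p")]
```

After resolution both paragraphs carry the same text, and the conref attribute is gone from the output.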

DITA Open Toolkit<br />

When DITA was released as a public <strong>XML</strong> standard in 2001, IBM contributed the DITA Open Toolkit (DITA OT)<br />

to the wider community. The DITA OT was therefore the first DITA processor, and continues to be the foundation of<br />

most publishing of DITA content. It is currently an active open-source project, with contributions from several<br />

companies.<br />

Out of the box, the DITA OT handles all valid DITA specializations and produces several output formats, including:<br />

• PDF, through XSL-FO<br />

• XHTML<br />

• Microsoft Compiled HTML Help<br />

• Eclipse Help<br />

• Java Help<br />

• Oracle Help<br />

• Rich Text Format<br />

The DITA OT can also be extended to produce other (arbitrary) output formats. The raw DITA OT can be run from<br />

the command line. Some DITA authoring tools and content management systems now integrate the DITA OT, or<br />

parts of it, into their own publishing workflows. Standalone tools have also been developed to run the DITA OT via<br />

a graphical user interface instead of the command line.<br />

The DITA OT includes customizable stylesheets that control the formatting and layout of human-readable<br />

deliverables.



Brief history<br />

• March 2001 Introduction by IBM<br />

• May 2002 Domain specialization added to topic specialization<br />

• April 2004 OASIS [6] Technical Committee for DITA formed<br />

• February 2005 SourceForge [7] begins DITA Open Toolkit support<br />

• June 2005 DITA v1.0 approved as an OASIS standard<br />

• August 2005 DITA Open Toolkit v1.1 is released<br />

• March 2006 OASIS launches DITA.<strong>XML</strong>.org [8]<br />

• August 2007 DITA V1.1 is approved by OASIS, including Bookmap specialization<br />

See also<br />

• DocBook<br />

• S1000D<br />

• List of document markup languages<br />

• Comparison of document markup languages<br />

References<br />

• IBM's Introduction to DITA [9]<br />

• DITA Architectural Specification, v 1.1 [5]<br />

• DITA <strong>Language</strong> Specification, v 1.1 [10]<br />

Further reading<br />

• Priestley, Michael; Swope, Amber (2008) (PDF). The DITA Maturity Model Whitepaper [11] . IBM Corp and<br />

JustSystems.<br />

• Doyle, Bob (2008) (PDF). DITA Tools from A to Z [12] . Society for Technical Communication.<br />

External links<br />

• DITA <strong>XML</strong>.org community site [8]<br />

• DITA World [13] — Comprehensive list of DITA resources: articles, vendors, user groups and more<br />

• DITA Open Toolkit User Guide and Reference [14]<br />

• Roadmap for DITA Development [15] , OASIS DITA Technical Committee<br />

• DITA News [16] - aggregates DITA bloggers, has extensive resources, and DITA tools listing<br />

• RuDI: Ruby Utilities for DITA processing [17]



References<br />

[1] Doyle, Bob. "History of DITA" (http://dita.xml.org/book/history-of-dita). Retrieved 2009-07-31.<br />

[2] "Implementing DITA versus implementing custom <strong>XML</strong> architecture" (http://www.scriptorium.com/whitepapers/dita_assessment/dita_assessment4.html). Scriptorium Publishing Services, Inc. 2008. Retrieved 2009-07-29.<br />

[3] "Structure, DITA, and content other than technical documentation …" (http://rockley.com/blog/?p=22). The Rockley Group. October 16, 2007. Retrieved 2009-07-29.<br />

[4] "Survey on DITA Challenges" (http://writepoint.com/blog/?p=1011). WritePoint Ltd. January 18, 2010. Retrieved 2010-01-21.<br />

[5] http://docs.oasis-open.org/dita/v1.1/CS01/archspec/archspec.html<br />

[6] http://www.oasis-open.org<br />

[7] http://dita-ot.sourceforge.net<br />

[8] http://dita.xml.org<br />

[9] http://www.ibm.com/developerworks/xml/library/x-dita1/<br />

[10] http://docs.oasis-open.org/dita/v1.1/CS01/langspec/ditaref-type.html<br />

[11] http://na.justsystems.com/files/Whitepaper-DITA_MM.pdf<br />

[12] http://www.ditanews.com/tools/STC_Intercom.pdf<br />

[13] http://www.ditaworld.com<br />

[14] http://dita-ot.sourceforge.net/doc/ot-userguide/xhtml/<br />

[15] http://wiki.oasis-open.org/dita/Roadmap_for_DITA_development<br />

[16] http://www.ditanews.com<br />

[17] http://kenai.com/projects/rudi/pages/Home<br />

DITA Open Toolkit<br />

The DITA Open Toolkit is a free and open-source implementation of the OASIS DITA Technical Committee's<br />

specification for Darwin Information Typing Architecture (DITA) DTDs and Schemas. [1]<br />

The Toolkit transforms DITA content (topics and maps) into deliverable formats like web (XHTML), print (PDF),<br />

and online Help.<br />

The DITA Open Toolkit, or dita-ot for short, is a set of Ant- and Java-based, open source tools that provide a<br />

"reference implementation" for processing DITA maps and topical content to multiple output formats.<br />

It is a demonstration of DITA's capabilities for single source publishing, modularity, structured writing, information<br />

typing, inheritance, specialization, topic-based authoring, conditional processing, component publishing, task<br />

orientation, and content reuse.<br />

Several <strong>XML</strong> editors and <strong>XML</strong> content management systems integrate the DITA Open Toolkit into their products,<br />

including Oxygen <strong>XML</strong> Editor, XMetaL, and Syntext Serna.<br />

See also<br />

Darwin Information Typing Architecture<br />

Further reading<br />

• Linton, Jen and Bruski, Kylene (2006). Introduction to DITA: A Basic User Guide to the Darwin Information<br />

Typing Architecture [2] . Denver, CO: Comtech Services.<br />

External links<br />

• http://dita.xml.org<br />

• SourceForge page on the DITA OT [3]<br />

• Don Day's Resources Page for the DITA OT [4]<br />

• DITA Open Toolkit User Guide [5]



• Download page for the DITA OT [6]<br />

• DITA Users [7] - a member organization with workspace folders and online version of the DITA Open Toolkit<br />

• PHP debugging tools for the DITA OT [8]<br />

• DITA Infocenter [9] - DITA OT User Guide in online help format<br />

References<br />

[1] http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=dita<br />

[2] http://www.comtech-serv.com/dita2.shtml#book<br />

[3] http://sourceforge.net/projects/dita-ot/<br />

[4] http://www.ditaopentoolkit.org/<br />

[5] http://dita-ot.sourceforge.net/SourceForgeFiles/doc/user_guide.html<br />

[6] http://sourceforge.net/project/showfiles.php?group_id=132728<br />

[7] http://www.ditausers.org<br />

[8] http://www.vrcommunications.com/Code/ditaotug131-18042007-tools.zip<br />

[9] http://www.ditainfocenter.com<br />

Document Structure Description<br />

Document Structure Description, or DSD, is a schema language for <strong>XML</strong>, that is, a language for describing valid<br />

<strong>XML</strong> documents. It's an alternative to DTD or the W3C <strong>XML</strong> Schema.<br />

An example of DSD in its simplest form:<br />

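<dsd xmlns="http://www.brics.dk/DSD/2.0"<br />

     xmlns:my="http://example.com"><br />

<br />

  <if><element name="my:foo"/><br />

    <declare><br />

      <attribute name="first"/><br />

      <attribute name="second"/><br />

      <contents><br />

        <element name="my:bar"/><br />

      </contents><br />

    </declare><br />

  </if><br />

<br />

  <if><element name="my:bar"/><br />

    <declare></declare><br />

  </if><br />

</dsd><br />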

This says that an element named "foo" in the <strong>XML</strong> namespace "http://example.com" may have two attributes, named<br />

"first" and "second". A "foo" element may not have any character data. It must contain one subelement, named "bar",<br />

also in the "http://example.com" namespace. A "bar" element is not allowed any attributes, character data or<br />

subelements.



One <strong>XML</strong> document that would be valid according to the above DSD would be:<br />

<foo xmlns="http://example.com" first="1" second="2"><br />

  <bar/><br />

</foo><br />
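The constraints stated in prose for "foo" and "bar" can also be checked by hand, which may help when reading the schema. The sketch below is not a DSD processor: `is_valid_foo` is a hypothetical helper that hard-codes those rules, and it assumes that whitespace-only text does not count as character data.

```python
import xml.etree.ElementTree as ET

NS = "http://example.com"

def is_valid_foo(xml_text):
    """Check the rules described for the example: foo may carry 'first' and
    'second', holds exactly one bar, and neither element has character data."""
    root = ET.fromstring(xml_text)
    if root.tag != f"{{{NS}}}foo":
        return False
    if not set(root.attrib) <= {"first", "second"}:
        return False  # an attribute other than the two declared ones
    children = list(root)
    if len(children) != 1 or children[0].tag != f"{{{NS}}}bar":
        return False
    bar = children[0]
    if bar.attrib or len(bar) or (bar.text or "").strip():
        return False  # bar allows no attributes, subelements, or text
    if (root.text or "").strip() or (bar.tail or "").strip():
        return False  # foo allows no character data
    return True

ok = is_valid_foo('<foo xmlns="http://example.com" first="1"><bar/></foo>')
extra_attribute = is_valid_foo('<foo xmlns="http://example.com" third="3"><bar/></foo>')
stray_text = is_valid_foo('<foo xmlns="http://example.com">text<bar/></foo>')
missing_bar = is_valid_foo('<foo xmlns="http://example.com"/>')
```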

Current Software store<br />

• Prototype Java Processor [1] from BRICS<br />

External links<br />

• DSD home page [2]<br />

• Full DSD specification [3]<br />

• Comparison of DTD, W3C <strong>XML</strong> Schema, and DSD [4]<br />

References<br />

[1] http://www.brics.dk/DSD/dsd2<br />

[2] http://www.brics.dk/DSD/<br />

[3] http://www.brics.dk/DSD/dsd2.html<br />

[4] http://www.brics.dk/~amoeller/<strong>XML</strong>/schemas/<br />

Document-Centric<br />

Document-centric <strong>XML</strong> processing is a notion first introduced in VTD-<strong>XML</strong>. Before VTD-<strong>XML</strong>, traditional<br />

<strong>XML</strong> processing models (e.g. DOM, SAX and JAXB) were designed around the notion of objects. The <strong>XML</strong> text,<br />

merely as the serialization of the objects, is relegated to the status of a second-class citizen. You base your<br />

applications on DOM nodes, strings and various business objects, but rarely on the physical documents. This<br />

object-oriented approach to <strong>XML</strong> processing makes little sense because of the performance hits from virtually<br />

all directions. Not only are object creation and garbage collection inherently memory and CPU inefficient, but your<br />

applications incur the cost of re-serialization with even the smallest changes to the original text.<br />

With document-centric <strong>XML</strong> processing, the <strong>XML</strong> document (the persistent format of data) is the starting point from<br />

which everything else comes about. Whether it is parsing, XPath evaluation, modifying content, or slicing element<br />

fragments, by default you no longer work directly with objects. You only do that when it makes sense. More often<br />

than not, you treat documents purely as syntax, and think in bytes, byte arrays, integers, offsets, lengths, fragments<br />

and namespace-compensated fragments. The first-class citizen in this paradigm is the <strong>XML</strong> text. And the<br />

object-centric notions of <strong>XML</strong> processing, such as serialization and de-serialization (or marshalling and<br />

unmarshalling) are often displaced, if not replaced, by more document-centric notions of parsing and composition.<br />

Increasingly you will find that your <strong>XML</strong> programming experience is getting simpler. And not surprisingly, the<br />

simpler, intuitive way to think about <strong>XML</strong> processing is also the most efficient and powerful.



Document-centric <strong>XML</strong> processing<br />

Document-centric <strong>XML</strong> processing is one of two conceptual approaches to processing <strong>XML</strong> content, along with<br />

data-centric <strong>XML</strong> processing. Although there is no universally accepted definition of the term, the following articles<br />

discuss features typically associated with this approach:<br />

• Data-centric vs Document-centric <strong>XML</strong> [1]<br />

• Text-centric vs data-centric <strong>XML</strong> retrieval [2]<br />

Applications based on Document-centric Approach<br />

VTD-<strong>XML</strong><br />

Before VTD-<strong>XML</strong>, traditional <strong>XML</strong> processing models (e.g. DOM, SAX and JAXB) were designed around the<br />

notion of objects. The <strong>XML</strong> text, merely as the serialization of the objects, is relegated to the status of a second-class<br />

citizen. Applications are based on DOM nodes, strings and various business objects, but rarely on the physical<br />

documents. This object-oriented approach to <strong>XML</strong> processing has serious issues because of the performance hits<br />

from virtually all directions. Not only are object creation and garbage collection inherently memory and CPU<br />

inefficient, but applications incur the cost of re-serialization with even the smallest changes to the original text.<br />

With document-centric <strong>XML</strong> processing, the <strong>XML</strong> document (the persistent format of data) is the starting point from<br />

which everything else comes about. Whether it is parsing, XPath evaluation, modifying content, or slicing element<br />

fragments, by default one no longer works directly with objects; one does so only when it makes sense. More often<br />

than not, one treats documents purely as syntax, and thinks in bytes, byte arrays, integers, offsets, lengths, fragments<br />

and namespace-compensated fragments. The first-class citizen in this paradigm is the <strong>XML</strong> text. And the<br />

object-centric notions of <strong>XML</strong> processing, such as serialization and de-serialization (or marshalling and<br />

unmarshalling) are often displaced, if not replaced, by more document-centric notions of parsing and composition.<br />
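The idea of working with offsets and lengths instead of objects can be illustrated without any XML library. The sketch below is not the VTD-XML API: `text_span` and `splice` are hypothetical helpers, and a regular expression stands in for a real tokenizer, so it only handles simple, non-nested cases. It locates the bytes of one element's text and replaces them in place, leaving the rest of the document untouched, including the comment and the odd spacing that a parse and re-serialize cycle could alter.

```python
import re

def text_span(doc: bytes, tag: bytes):
    """Return (offset, length) of the text of the first <tag>...</tag>, or None."""
    pattern = (
        rb"<" + re.escape(tag) + rb"(?:\s[^>]*)?>"   # start tag, optional attributes
        rb"([^<]*)"                                  # text content (no child elements)
        rb"</" + re.escape(tag) + rb">"
    )
    match = re.search(pattern, doc)
    if match is None:
        return None
    return match.start(1), match.end(1) - match.start(1)

def splice(doc: bytes, offset: int, length: int, replacement: bytes) -> bytes:
    """Replace `length` bytes at `offset`; every other byte is copied as-is."""
    return doc[:offset] + replacement + doc[offset + length:]

# Hypothetical document used only for illustration.
doc = b'<order id="7"><!-- keep me --><qty>3</qty><note>  fragile </note></order>'
offset, length = text_span(doc, b"qty")
updated = splice(doc, offset, length, b"12")
```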

References<br />

[1] http://techessence.info/node/51<br />

[2] http://nlp.stanford.edu/IR-book/html/htmledition/text-centric-vs-data-centric-xml-retrieval-1.html



Dynamic <strong>XML</strong><br />

Dynamic <strong>XML</strong> means dynamic data that is in an <strong>XML</strong> format.<br />

The term is also commonly used for information that is extracted from a database (commonly a relational<br />

database) and placed into <strong>XML</strong> format. This is a different case, as it does not involve any updates to the data,<br />

which is in fact static. In this context the word "dynamic" takes the alternative meaning of "automated", in the<br />

sense that something performed dynamically is carried out without manual effort.<br />
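The second usage, data pulled from a relational database and written out as XML, can be sketched as follows. The table, its columns, and the `rows_to_xml` helper are hypothetical, and an in-memory SQLite database is assumed purely so the example is self-contained.

```python
import sqlite3
import xml.etree.ElementTree as ET

def rows_to_xml(conn, table, row_tag):
    """Export every row of `table` as one <row_tag> element with a child per column."""
    # The table name is interpolated directly, so it must come from trusted code.
    cursor = conn.execute(f"SELECT * FROM {table} ORDER BY rowid")
    columns = [description[0] for description in cursor.description]
    root = ET.Element(table)
    for row in cursor:
        item = ET.SubElement(root, row_tag)
        for name, value in zip(columns, row):
            ET.SubElement(item, name).text = str(value)
    return root

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (name TEXT, price INTEGER)")
conn.executemany("INSERT INTO products VALUES (?, ?)", [("peas", 4), ("carrot", 3)])
snapshot = ET.tostring(rows_to_xml(conn, "products", "product"), encoding="unicode")
```

The resulting `snapshot` is generated automatically, but it is a static picture of the table at export time, which is the distinction the paragraph above draws.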

ECMAScript for <strong>XML</strong><br />

ECMAScript for <strong>XML</strong> (E4X) is a programming language extension that adds native <strong>XML</strong> support to ECMAScript<br />

(which includes ActionScript, DMDScript, JavaScript, and JScript). The goal is to provide an alternative to DOM<br />

interfaces that uses a simpler syntax for accessing <strong>XML</strong> documents. It also offers a new way of making <strong>XML</strong><br />

visible. Before the release of E4X, <strong>XML</strong> was always accessed at an object level. E4X instead treats <strong>XML</strong> as a<br />

primitive (like characters, integers, and booleans). This implies faster access, better support, and acceptance as a<br />

building block (data structure) of a program.<br />

E4X is standardized by Ecma International in the ECMA-357 standard [1] . The first edition was published in June<br />

2004, the second edition in December 2005.<br />

Browser support<br />

E4X is currently supported by Mozilla's Rhino, used in OpenOffice.org and several other projects, and<br />

SpiderMonkey, used in Firefox, Thunderbird, and other XUL-based applications. It is also supported by Tamarin, the<br />

JavaScript engine used in the Flash virtual machine. It is not currently supported by Nitro (Safari), V8 (Google<br />

Chrome), or Internet Explorer.[2]<br />

Example<br />

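var sales = <sales vendor="John"><br />

    <item type="peas" price="4" quantity="6"/><br />

    <item type="carrot" price="3" quantity="10"/><br />

    <item type="chips" price="5" quantity="3"/><br />

  </sales>;<br />

// Filter items by attribute, then read an attribute of the match<br />

alert( sales.item.(@type == "carrot").@quantity );<br />

alert( sales.@vendor );<br />

// Visit every price attribute at any depth<br />

for each( var price in sales..@price ) {<br />

  alert( price );<br />

}<br />

// Remove the first item, append a new one, then set its quantity<br />

delete sales.item[0];<br />

sales.item += <item type="oranges" price="4"/>;<br />

sales.item.(@type == "oranges").@quantity = 4;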
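For readers who want to trace these operations without an E4X-capable engine, the sketch below mirrors them with Python's `xml.etree.ElementTree`. It is an approximation for comparison, not an E4X implementation, and the sample data (vendor, item types, prices, quantities) is assumed for illustration.

```python
import xml.etree.ElementTree as ET

# Assumed sample data for illustration.
sales = ET.fromstring(
    '<sales vendor="John">'
    '<item type="peas" price="4" quantity="6"/>'
    '<item type="carrot" price="3" quantity="10"/>'
    '<item type="chips" price="5" quantity="3"/>'
    '</sales>'
)

# sales.item.(@type == "carrot").@quantity
carrot_quantity = sales.find("item[@type='carrot']").get("quantity")

# sales.@vendor
vendor = sales.get("vendor")

# for each (var price in sales..@price): collect price attributes at any depth
prices = [el.get("price") for el in sales.iter() if el.get("price") is not None]

# delete sales.item[0]
sales.remove(sales.findall("item")[0])

# sales.item += <item type="oranges" price="4"/>
ET.SubElement(sales, "item", type="oranges", price="4")

# sales.item.(@type == "oranges").@quantity = 4
sales.find("item[@type='oranges']").set("quantity", "4")

remaining_types = [item.get("type") for item in sales.findall("item")]
```

The comparison shows what E4X buys: each commented E4X expression is a single native expression, while the tree API needs explicit find, get, and set calls.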



Implementations<br />

The first implementation of E4X was designed by Terry Lucas and John Schneider and appeared in BEA's WebLogic<br />

Workshop 7.0 released in February 2002. BEA's implementation was based on Rhino and released before the<br />

ECMAScript E4X spec was completed in June 2004. John Schneider wrote an article [3] on the <strong>XML</strong> extensions in<br />

BEA's Workshop at the time.<br />

• E4X is implemented in SpiderMonkey (Gecko's JavaScript engine) since version 1.6.0 [4] and in Rhino (Mozilla's<br />

other JavaScript engine written in Java instead of C) since version 1.6R1 [5] .<br />

• As Mozilla Firefox is based on Gecko, it can be used to run scripts using E4X. The specification is supported in<br />

the 1.5 release or later.<br />

• Adobe's ActionScript 3 scripting language fully supports E4X. Early previews of ActionScript 3 were first made<br />

available in late 2005. Adobe officially released the language with Flash Player 9 on June 28, 2006.<br />

• E4X is available in Flash CS3, Adobe AIR and Adobe Flex as they use ActionScript 3 as a scripting language.<br />

• E4X is also available in Adobe Acrobat and Adobe Reader versions 8.0 or higher.<br />

• E4X is also available in Aptana's Jaxer Ajax application server which uses the Mozilla engine server-side.<br />

• Since the release of Alfresco Community Edition 2.9B, E4X is also available in this enterprise document<br />

management system.<br />

External links<br />

• ECMA-357 standard [1]<br />

• E4X at faqts.com [6]<br />

• Slides from 2005 E4X Presentation by Brendan Eich, Mozilla Chief Architect [7]<br />

• E4X at Mozilla Developer Center [8]<br />

• Introducing E4X at xml.com [9] : compares E4X and JSON<br />

• Processing <strong>XML</strong> with E4X [10] at Mozilla Developer Center<br />

• Tutorial from W3 Schools [11]<br />

• E4X: Beginner to Advanced [12] at Yahoo Developer Network<br />

References<br />

[1] http://www.ecma-international.org/publications/standards/Ecma-357.htm<br />

[2] http://code.google.com/p/chromium/issues/detail?id=30975<br />

[3] http://web.archive.org/web/20080403052807/http://dev2dev.bea.com/pub/a/2002/09/JSchneider_<strong>XML</strong>.html<br />

[4] SpiderMonkey 1.6.0 release notes (http://www.mozilla.org/js/spidermonkey/release-notes/JS_160.html)<br />

[5] Rhino 1.6R1 Change log (http://www.mozilla.org/rhino/rhino16R1.html)<br />

[6] http://www.faqts.com/knowledge_base/index.phtml/fid/1762<br />

[7] https://developer.mozilla.org/presentations/xtech2005/e4x/<br />

[8] https://developer.mozilla.org/en/docs/E4X<br />

[9] http://www.xml.com/pub/a/2007/11/28/introducing-e4x.html<br />

[10] https://developer.mozilla.org/index.php?title=En/Core_JavaScript_1.5_Guide/Processing_<strong>XML</strong>_with_E4X<br />

[11] http://www.w3schools.com/e4x/default.asp<br />

[12] http://developer.yahoo.com/flash/articles/e4x-beginner-to-advanced.html



Efficient <strong>XML</strong> Interchange<br />

Efficient <strong>XML</strong> Interchange (EXI) is a proposed data format from the Efficient <strong>XML</strong> Interchange Working Group<br />

of the World Wide Web Consortium (W3C). It is one of the various efforts to encode <strong>XML</strong> documents in a binary<br />

data format, rather than plain text.<br />

Using a binary <strong>XML</strong> format generally reduces the verbosity of <strong>XML</strong> documents, and may reduce the cost of parsing.<br />

Performance of writing (generating) content is usually not similarly improved, although this depends on actual<br />

binary representation used.<br />

The EXI format is derived from the AgileDelta Efficient <strong>XML</strong> format [1] .<br />

See also<br />

• Binary <strong>XML</strong><br />

• Fast Infoset<br />

External links<br />

• Efficient <strong>XML</strong> Interchange Format 1.0 (Candidate Recommendation) [2]<br />

• Efficient <strong>XML</strong> Interchange Working Group home page [3]<br />

• EXIficient - Open Source implementation of the EXI Format 1.0 [4]<br />

• W3C binary <strong>XML</strong> requirements [5]<br />

References<br />

[1] "Lightning-Fast Delivery of <strong>XML</strong> to More Devices in More Locations" (http://www.agiledelta.com/product_efx.html). AgileDelta. 2007-05-08. Retrieved 2007-07-17.<br />

[2] http://www.w3.org/TR/exi/<br />

[3] http://www.w3.org/<strong>XML</strong>/EXI/<br />

[4] http://exificient.sourceforge.net/<br />

[5] http://www.w3.org/TR/2005/NOTE-xbc-characterization-20050331/



Embedded RDF<br />

Embedded RDF (eRDF) is a syntax for writing HTML in such a way that the information in the HTML document<br />

can be extracted (with an eRDF parser or XSLT stylesheet) into Resource Description Framework.<br />

It was invented by Ian Davis in 2005, and partly inspired by microformats, a simplified approach to semantically<br />

annotate data in websites. [1]<br />

See also<br />

• RDFa, W3C's approach to embedding RDF<br />

• GRDDL, a way to extract (annotated) data out of XHTML and <strong>XML</strong> documents and transform it into an RDF<br />

graph<br />

• Microdata (HTML5), a proposed feature of HTML5 that improves on the capabilities of microformats<br />

External links<br />

• eRDF [2]<br />

References<br />

[1] Ian Davis (http://iandavis.com/)<br />

[2] http://research.talis.com/2005/erdf/wiki/Main/RdfInHtml<br />

EpiDoc<br />

The EpiDoc Collaborative [1] , building recommendations for structured markup of epigraphic documents in TEI<br />

<strong>XML</strong>, was originally formed in 2000 by scholars at the University of North Carolina at Chapel Hill: Tom Elliott, the<br />

former director of the Ancient World Mapping Center, with Hugh Cayless and Amy Hawkins. The guidelines have<br />

matured considerably through extensive discussion on the <strong>Markup</strong> list [2] and other discussion fora, at several<br />

conferences, and through the experience of various pilot projects. The first major—but not by any means the<br />

only—epigraphic project to adopt and pilot the EpiDoc recommendations has been the Inscriptions of Aphrodisias,<br />

and the guidelines have reached a degree of stability for the first time during this process.<br />

The EpiDoc schema and guidelines may also be applied, perhaps with some local modification, to related<br />

palaeographical fields including Papyrology (projects in progress), Sigillography, and Numismatics.<br />

Guidelines<br />

The EpiDoc Guidelines are available in two forms:<br />

1. the stable guidelines, released periodically and available at: http://www.stoa.org/epidoc/gl/ (current version: 5 [3])<br />

2. the source code, available in its most up-to-date form in the CVS repository at SourceForge [4] ; the GL source<br />

files are a series of <strong>XML</strong> documents



Tools<br />

Tools developed by and for the EpiDoc community include:<br />

• The EpiDoc webapp, available from the SourceForge [4] CVS repository (the same application is used to deliver<br />

the guidelines).<br />

• The EpiDoc Crosswalker, a tool to transform data in both directions between EpiDoc and other encoding<br />

schemes, markup schemas, and databases. (In progress.)<br />

• CHET-C (the Chapel Hill Electronic Text-Converter), an application originally written in VBA, then as a<br />

free-standing Java app, and now available as a self-contained JavaScript platform written by Hugh Cayless. [5] (A<br />

Python and XSLT version of CHET-C is under construction as part of the IDP project.)<br />

• Transcoder: a Java tool for converting between Beta Code, Unicode NFC, Unicode NFD, and GreekKeys<br />

encoding for Greek script on the fly (download link to follow).<br />
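The difference between the two Unicode normalization forms mentioned for Transcoder can be shown with a single accented Greek letter. This sketch uses Python's standard `unicodedata` module rather than Transcoder itself, and it does not cover Beta Code or GreekKeys.

```python
import unicodedata

# One precomposed code point: GREEK SMALL LETTER ALPHA WITH TONOS
composed = "\u03ac"
# Two code points: GREEK SMALL LETTER ALPHA followed by COMBINING ACUTE ACCENT
decomposed = "\u03b1\u0301"

# NFC composes base letter and combining mark into one code point where possible.
nfc = unicodedata.normalize("NFC", decomposed)
# NFD splits a precomposed letter into base letter plus combining mark.
nfd = unicodedata.normalize("NFD", composed)

same_after_normalizing = (
    unicodedata.normalize("NFC", composed) == unicodedata.normalize("NFC", decomposed)
)
```

Both strings display identically, but they compare equal only after being brought to the same form, which is why text corpora usually settle on one form before searching or collating.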

Projects<br />

• Concordia [6] , King's College London and New York University<br />

• Inscriptions of Aphrodisias [7] , King's College London, UK<br />

• Inscriptions of Roman Cyrenaica [7] , KCL<br />

• Integrating Digital Papyrology (Duke University, Columbia University, Heidelberg University, King's College<br />

London), see now http://papyri.info/<br />

• US Epigraphy Project [8] , Brown University, Providence RI, USA<br />

• Vindolanda Tablets Online [9] , Oxford University, UK<br />

• Etruscan Texts Project [10] , University of Massachusetts Amherst, Amherst MA, USA<br />

Bibliography<br />

• G. Bodard, 'Digital Epigraphy and Lexicographical and Onomastic <strong>Markup</strong>', in (edd. Aitken, Fraser, Thompson)<br />

Ancient Greek Lexicography: Electronic Databanks and the design of new dictionaries, Cardiff: University Press<br />

of Wales, (forthcoming 2007).<br />

• G. Bodard / Ch. Roueché, 'The Epidoc Aphrodisias Pilot Project', Forum Archaeologiae 23/VI/2002, online at<br />

http://farch.net (available: 2006-04-07)<br />

• J. Flanders / C. Roueché, 'Introduction for Epigraphers', online at http://epidoc.sf.net/IntroEpigraphers.shtml<br />

(available: 2006-04-25)<br />

• A. Mahoney, 'Epigraphy', in (edd. Burnard, O'Brian, Unsworth) Electronic Textual Editing (2006), preview online<br />

at http://www.tei-c.org/Activities/ETE/Preview/mahoney.xml (available: 2006-04-07)<br />

See also<br />

• Leiden Conventions<br />

• Epigraphy<br />

• Text Encoding Initiative<br />

• Digital Classicist<br />

References<br />

[1] http://epidoc.sourceforge.net/<br />

[2] http://lsv.uky.edu/archives/markup.html<br />

[3] http://www.stoa.org/epidoc/gl/5/<br />

[4] http://sourceforge.net/projects/epidoc<br />

[5] http://www.stoa.org/projects/epidoc/stable/chetc-js/chetc.html<br />

[6] http://concordia.atlantides.org/



[7] http://insaph.kcl.ac.uk/<br />

[8] http://usepigraphy.brown.edu/<br />

[9] http://vindolanda.csad.ox.ac.uk/<br />

[10] http://etp.classics.umass.edu/<br />

eXtensible Server Pages<br />

eXtensible Server Pages (XSP) is an <strong>XML</strong>-based language that allows dynamically executed Java code to be<br />

embedded in <strong>XML</strong> documents.<br />

It was developed by the Apache Software Foundation for the Web Publishing Framework Cocoon. The focus of XSP<br />

is the separation of content, logic and presentation. The Java program code is placed in its own <strong>XML</strong> element,<br />

xsp:logic, which can occur either inside or outside the root element.<br />

The Java code is compiled on the first request. The XSP directives are replaced by the generated content so that the<br />

resulting, augmented <strong>XML</strong> document can be subject to further processing with XSL Transformations.<br />

XSP pages are transformed into Cocoon producers, typically as Java classes, though any scripting language for<br />

which a Java-based processor exists could also be used.<br />

Directives can be either XSP built-in processing tags or user-defined library tags. XSP built-in tags are used to<br />

embed procedural logic, substitute expressions and dynamically build <strong>XML</strong> nodes. User-defined library tags act as<br />

templates that dictate how program code is generated from information encoded in each dynamic tag.<br />

External links<br />

• Cocoon XSP 2.1 [1]<br />

• XSP 1.x - Working Draft [2]<br />

References<br />

[1] http://cocoon.apache.org/2.1/userdocs/xsp/logicsheet.html<br />

[2] http://cocoon.apache.org/1.x/wd-xsp.html



Fast Infoset<br />

Fast Infoset (or FI) is an international standard that specifies a binary encoding format for the <strong>XML</strong> Information Set<br />

(<strong>XML</strong> Infoset) as an alternative to the <strong>XML</strong> document format. It aims to provide more efficient serialization than the<br />

text-based <strong>XML</strong> format.<br />

One can think of FI as gzip for <strong>XML</strong>, though FI aims to optimize both document size and processing performance,<br />

whereas gzip optimizes only the size. While the original formatting is lost, no information is lost in the conversion<br />

from <strong>XML</strong> to FI and back to <strong>XML</strong>.<br />

The Fast Infoset specification is defined by both the ITU-T and the ISO standards bodies. FI is officially named<br />

ITU-T Rec. X.891 and ISO/IEC 24824-1 (Fast Infoset), respectively. However, it is commonly referred to by the<br />

name Fast Infoset. The standard was published by ITU-T on May 14, 2005, and by ISO on May 4, 2007.<br />

The Fast Infoset standard can be downloaded from the ITU website at [1]. There are no intellectual property<br />

restrictions on its implementation and use.<br />

A common misconception is that FI requires ASN.1 tool support. Although the formal specification uses ASN.1<br />

formalisms, ASN.1 tools are not required by implementations.<br />

Structure<br />

The underlying file format is ASN.1, with tag/length/value blocks. Text values of attributes and elements are therefore<br />

stored with length prefixes rather than end delimiters, so there is no need to escape special characters. There is also<br />

no need for any end tags, and binary data need not be base64 encoded.<br />
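Why length prefixes remove the need for escaping can be seen in a small sketch. The layout used here, a 4-byte big-endian length followed by UTF-8 bytes, is an assumption chosen for simplicity and is not the actual Fast Infoset octet encoding; the helper names are hypothetical.

```python
import struct
from xml.sax.saxutils import escape

def encode_length_prefixed(text: str) -> bytes:
    """Store the byte length first, then the raw UTF-8 bytes, unescaped."""
    data = text.encode("utf-8")
    return struct.pack(">I", len(data)) + data

def decode_length_prefixed(buf: bytes, offset: int = 0):
    """Read one value; return it with the offset of the next value."""
    (length,) = struct.unpack_from(">I", buf, offset)
    start = offset + 4
    return buf[start:start + length].decode("utf-8"), start + length

value = 'a < b && "c" > d'
as_text_xml = escape(value)             # textual XML must escape <, > and &
packed = encode_length_prefixed(value)  # the binary form keeps the characters as they are
decoded, next_offset = decode_length_prefixed(packed)
```

Because the reader knows in advance how many bytes belong to the value, characters such as `<` and `&` can never be mistaken for markup.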

Although ASN.1 is used for storage, Fast Infoset is a higher level protocol built upon it. In particular, element and<br />

attribute names are stored within the octet stream, unlike raw ASN.1. This means that it is possible to recover a<br />

conventional <strong>XML</strong> file from the binary stream without the need to reference any <strong>XML</strong> Schema. It does not attempt<br />

to convert an <strong>XML</strong> Schema directly into an ASN.1 definition. (ASN.1 "Tags" are just type names, e.g. String,<br />

Integer, or complex types.)<br />

An index table is built for most strings, which includes element and attribute names, and their values. This means<br />

that the text of repeated tags and values only appears once per document. The details are complex.<br />
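The effect of such an index table can be sketched as follows. This is a simplified illustration of the general technique, writing a string once and referring to it by index afterwards, and not the table format defined by the Fast Infoset standard; the function names and token stream are hypothetical.

```python
def build_string_table(tokens):
    """Emit each distinct string once as a literal, then refer to it by index."""
    table, index_of, stream = [], {}, []
    for token in tokens:
        if token not in index_of:
            index_of[token] = len(table)
            table.append(token)
            stream.append(("literal", token))
        else:
            stream.append(("index", index_of[token]))
    return table, stream

def expand(stream):
    """Rebuild the original tokens; the table is reconstructed while reading."""
    table, out = [], []
    for kind, payload in stream:
        if kind == "literal":
            table.append(payload)
            out.append(payload)
        else:
            out.append(table[payload])
    return out

# Hypothetical sequence of element and attribute names from a repetitive document.
names = ["item", "type", "price", "item", "type", "price", "item"]
table, stream = build_string_table(names)
restored = expand(stream)
```

In this example only three of the seven names are written out in full; the other four occurrences shrink to small integers.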

Implementations<br />

Reference implementation<br />

A Java implementation [2] of the FI specification is available as part of the GlassFish project. The library is open<br />

source and is distributed under the terms of the Apache License 2.0. Several projects use this implementation,<br />

including the reference implementation for JAX-RPC and JAX-WS used in JWSDP.<br />

Alternative implementations<br />

The OSS Fast Infoset Tools [3] are designed for use with applications written in C or C++.<br />

Liquid Technologies [4] provides both C++ and C# .NET implementations of Fast Infoset with its <strong>XML</strong> Data<br />

Binding product Liquid <strong>XML</strong>.<br />

Applied Informatics [5] provides a C++ implementation [6] of Fast Infoset based on the POCO C++ Libraries.<br />

FastInfoset.NET [7] is a C# implementation for the .NET Framework. It is licensed under a proprietary licence.<br />

The XIOT [8] library has parts of Fast Infoset implemented to read and write compressed binary X3D files. It is<br />

licensed under LGPL.



Performance<br />

Because Fast Infosets are compressed as part of the <strong>XML</strong> generation process, producing them is much faster than applying<br />

Zip-style compression algorithms to an <strong>XML</strong> stream, although the resulting files can be slightly larger.<br />

SAX-type parsing performance of Fast Infoset is also much faster than parsing performance of <strong>XML</strong> 1.0, even<br />

without any Zip-style compression. Typical increases in parsing speed observed for the reference Java<br />

implementation are a factor of 10 compared to Java Xerces, and a factor of 4 compared to the Piccolo driver [9] (one<br />

of the fastest Java-based <strong>XML</strong> parsers). [10] [11] [12]<br />

Typical applications<br />

Portable Devices - Mobile devices typically have access only to low-bandwidth data connections and have slower<br />

CPUs. This can make Fast Infoset a better choice, lowering both data transmission and data processing times.<br />

Persisting Large Volumes of Data - When persisting <strong>XML</strong> either to file or a database, the volume of data your<br />

system produces can often get out of hand. This has a number of detrimental effects; the access times go up as you're<br />

reading more data, CPU load goes up as <strong>XML</strong> data takes more effort to process, and your storage costs go up. By<br />

persisting your <strong>XML</strong> data in Fast Infoset format, it is possible to reduce the data volume by up to 80 percent.<br />

Passing <strong>XML</strong> via the internet - As soon as an application starts passing information over the internet, one of the<br />

main bottlenecks is bandwidth. If you send sizable chunks of data, this bottleneck can seriously degrade the<br />

performance of your client applications and limit your server's ability to process requests. Reducing the amount of<br />

data moving across the internet reduces the time it takes a message to be sent or received, while increasing the<br />

number of transactions a server can process per hour.<br />

See also<br />

• Binary <strong>XML</strong><br />

• EXI<br />

• X3D<br />

External links<br />

• A heavy technical description on Sun [13]<br />

• FastInfoset.NET home page [7]<br />

• FI project home page [14]<br />

• Fast Infoset page at the ASN.1 site [15]<br />

• OSS Fast Infoset Tools page [3]<br />

• Free download of the Fast Infoset standard (ITU-T Rec. X.891) from the ITU Web site [1]<br />

• Free download of the Fast Infoset standard (ISO/IEC 24824-1:2007) from ISO Freely Available Standards [16]



References<br />

[1] http://www.itu.int/rec/T-REC-X.891-200505-I/en<br />

[2] https://fi.dev.java.net/<br />

[3] http://www.oss.com/xml/products/fi.html<br />

[4] http://www.liquid-technologies.com/Product_XmlCompression.aspx<br />

[5] http://www.appinf.com/<br />

[6] http://www.appinf.com/en/products/fis.html<br />

[7] http://www.noemax.com/products/fastinfoset/index.html<br />

[8] http://forge.collaviz.org/community/xiot<br />

[9] http://piccolo.sourceforge.net/<br />

[10] "Fast Infoset performance reports" (https://fi.dev.java.net/performance.html). 2005-10-06. Retrieved 2007-10-11.<br />

[11] "Japex Report: ParsingPerformance" (https://fi.dev.java.net/reports/parsing/report.html). 2005-01-10. Retrieved 2007-10-11.<br />

[12] "Japex Report: SizePerformance" (https://fi.dev.java.net/reports/size/report.html). 2005-01-10. Retrieved 2007-10-11.<br />

[13] http://java.sun.com/developer/technicalArticles/xml/fastinfoset/<br />

[14] http://fi.dev.java.net/<br />

[15] http://asn1.elibel.tm.fr/xml/finf.htm<br />

[16] http://standards.iso.org/ittf/PubliclyAvailableStandards/index.html<br />

Global listings format<br />

Global listings format (GLF) refers to metadata for transferring program guide information and multimedia<br />

information. It is coded in <strong>XML</strong> format.<br />

GMX<br />

GMX [1] (global mail exchange) is also the name of a German company with an international webmail product<br />

GMX Mail.<br />

GMX is a collection of current and proposed standards, primarily targeted at the needs of the translation industry,<br />

although able to be used for other purposes also. They are concerned with measuring quantitatively aspects of a<br />

document, particularly those with relevance to the translation process (e.g. word counts, complexity). The primary<br />

use cases are in quoting, estimating and billing translation work.<br />

GMX-V is the first of the three standards to be completed. Work will commence in 2007 on GMX-Q and GMX-C.<br />

Quality (GMX-Q) will deal with the level of quality required for a task. For example, the quality required for the<br />

translation of a legal document is much higher than that for technical documentation that will have a relatively small<br />

audience. Complexity (GMX-C) will take into consideration the source and format of the original document and its<br />

subject matter. For example, a highly complex document dealing with a specific tight domain is far more complex to<br />

translate than user instructions for a simple consumer device.<br />

GMX-V forms part of the Open Architecture for <strong>XML</strong> Authoring and Localization (OAXAL) reference architecture.<br />

References<br />

[1] http://www.gmx.net/



GMX-V<br />

GMX-V (Global Information Management Metrics eXchange - Volume: Word and Character Count Standard) is a<br />

word and character count standard for electronic documents. GMX-V is developed and maintained by OSCAR [1]<br />

(Open Standards for Container/Content Allowing Re-use), a special interest group of LISA [2] (Localization Industry<br />

Standards Association).<br />

GMX-V is one of a three-part series of standards from the Localization Industry Standards Association (LISA).<br />

GMX-V deals with electronic document metrics.<br />

GMX is made up of the following standards:<br />

• GMX-V - Volume<br />

• GMX-C - Complexity<br />

• GMX-Q - Quality<br />

GMX-V forms part of the Open Architecture for <strong>XML</strong> Authoring and Localization (OAXAL) reference architecture.<br />

Scope and Primary Goal<br />

GMX-V is designed to fulfill two primary roles:<br />

• Establish a verifiable way of calculating the primary word and character counts for a given electronic document.<br />

• Establish a specific <strong>XML</strong> vocabulary that enables the automatic exchange of metric data<br />

Description<br />

GMX-V is itself based on other well established standards:<br />

• Unicode 5.0 normalized form<br />

• Unicode Technical Report 29 – Text Boundaries<br />

• OASIS <strong>XML</strong> Localization Interchange File Format (XLIFF) 1.2<br />

• LISA OSCAR Segmentation Rules Exchange (SRX) 2.0<br />

External links<br />

• GMX-V page on the LISA OSCAR web site [3]<br />

• GMX-V specification [4]<br />

References<br />

[1] OSCAR (http://www.lisa.org/sigs/oscar/) - Open Standards for Container/Content Allowing Re-use<br />

[2] LISA (http://www.lisa.org/index.html) - Localization Industry Standards Association<br />

[3] http://www.lisa.org/Global-information-m.104.0.html<br />

[4] http://www.lisa.org/fileadmin/standards/GMX-V.html



Head-Body Pattern<br />

The Head-Body Pattern is a common <strong>XML</strong> design pattern, used for example in the SOAP protocol. This pattern is<br />

useful when a message, or parcel of data, requires considerable metadata. Mixing the metadata with the data is<br />

possible, but it makes the whole message confusing. In this pattern the metadata is structured as the<br />

header, sometimes known as the envelope. The ordinary data is structured as the body, sometimes<br />

known as the payload. <strong>XML</strong> is employed for both head and body.<br />
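A minimal Python sketch of the pattern is shown below. The element names (`message`, `head`, `body`, `text`) and the metadata fields are hypothetical and chosen only for illustration; SOAP and other real vocabularies define their own names.

```python
import xml.etree.ElementTree as ET

def build_message(metadata, payload_text):
    """Build a message whose metadata lives in <head> and whose data lives in <body>."""
    message = ET.Element("message")
    head = ET.SubElement(message, "head")
    for name, value in metadata.items():
        # One child element per metadata field, kept apart from the payload.
        ET.SubElement(head, name).text = value
    body = ET.SubElement(message, "body")
    ET.SubElement(body, "text").text = payload_text
    return message

msg = build_message({"sender": "alice", "sent": "2010-06-17"}, "Hello")
print(ET.tostring(msg, encoding="unicode"))
```

A consumer that only routes messages can read the `head` element and ignore `body` entirely, which is the main benefit of keeping the two apart.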

HyTime<br />

HyTime (Hypermedia/Time-based Structuring <strong>Language</strong>) is a markup language that is an "application" of SGML.<br />

HyTime defines a set of hypertext-oriented element types that, in effect, supplement SGML and allow SGML<br />

document authors to build hypertext and multimedia presentations in a standardized way.<br />

HyTime is an international standard published by the ISO and IEC. The first edition was published in 1992, and the<br />

second edition was published in 1997.<br />

Legacy<br />

Some of the concepts formalized in HyTime were later incorporated into HTML and <strong>XML</strong>:<br />

• HTML is an application of SGML for hypertext document presentations, that assigns specific semantics and<br />

processing expectations to a fixed set of element types.<br />

• <strong>XML</strong> defines a simplified subset of SGML that focuses on providing an open vocabulary of element types for data<br />

modeling and establishes precise expectations for how the marked-up data is read and subsequently fed to another<br />

software application for further processing, but does not assign semantics to the element types or establish<br />

expectations for how the data is processed.<br />

Standard<br />

The HyTime standard itself is ISO/IEC 10744, first published in 1992 and available from the International<br />

Organization for Standardization. It was developed by ISO/IEC JTC1/SC34 (ISO/IEC Joint Technical Committee 1,<br />

Subcommittee 34 - Document description and processing languages). [1] [2]<br />

Further reading<br />

• Steven DeRose and David Durand, "Making Hypermedia Work: A User's Guide to HyTime," Kluwer Academic<br />

Publishers 1994 (ISBN 0-7923-9432-1).<br />

External links<br />

• ISO/IEC 10744:1992 - Information technology -- Hypermedia/Time-based Structuring <strong>Language</strong> (HyTime) [3]<br />

• Robin Cover's HyTime resource list [4]<br />

• ISO/IEC 10744 Amendment 1 [5] - an amendment to ISO/IEC 10744:1997 Annex A.3<br />

• Standards: HyTime: A standard for structured hypermedia interchange [6] by Charles Goldfarb, from IEEE<br />

Computer magazine, vol. 24, iss. 8 (Aug. 1991), pp. 81–84<br />

• A Brief History of the Development of SMDL and HyTime [7]



References<br />

[1] ISO. "JTC 1/SC 34 - Document description and processing languages" (http://www.iso.org/iso/iso_technical_committee.html?commid=45374). ISO. Retrieved 2009-12-25.<br />

[2] ISO JTC1/SC34. "JTC 1/SC 34 - Document Description and Processing <strong>Language</strong>s" (http://www.itscj.ipsj.or.jp/sc34/). Retrieved 2009-12-25.<br />

[3] http://www.iso.org/iso/iso_catalogue/catalogue_ics/catalogue_detail_ics.htm?csnumber=18834<br />

[4] http://xml.coverpages.org/hytime.html<br />

[5] http://www.y12.doe.gov/sgml/wg8/document/1957.htm<br />

[6] http://ieeexplore.ieee.org/iel1/2/2778/00084880.pdf?tp=&arnumber=84880&isnumber=2778<br />

[7] http://www.sgmlsource.com/history/hthist.htm<br />

Internationalization Tag Set<br />

The Internationalization Tag Set (ITS) [1] is a set of attributes and elements designed to provide<br />

internationalization and localization support in <strong>XML</strong> documents.<br />

The ITS specification identifies concepts (called "ITS data categories") which are important for internationalization<br />

and localization. It also defines implementations of these concepts through a set of elements and attributes grouped<br />

in the ITS namespace. <strong>XML</strong> developers can use this namespace to integrate internationalization features directly into<br />

their own <strong>XML</strong> schemas and documents.<br />

Overview<br />

ITS v1.0 includes seven data categories:<br />

• Translate: Defines what parts of a document are translatable or not.<br />

• Localization Note: Provides alerts, hints, instructions, and other information to help the localizers or the<br />

translators.<br />

• Terminology: Indicates parts of the documents that are terms and optionally pointers to information about these<br />

terms.<br />

• Directionality: Indicates what type of display directionality should be applied to parts of the document.<br />

• Ruby: Indicates what parts of the document should be displayed as ruby text. (Ruby is a short run of text<br />

alongside a base text, typically used in East Asian documents to indicate pronunciation or to provide a brief<br />

annotation).<br />

• <strong>Language</strong> Information: Identifies the language of the different parts of the document.<br />

• Elements Within Text: Indicates how elements should be treated with regard to linguistic segmentation.<br />

The vocabulary is designed to work on two different fronts: First by providing markup usable directly in the <strong>XML</strong><br />

documents. Secondly, by offering a way to indicate if there are parts of a given markup that correspond to some of<br />

the ITS data categories and should be treated as such by ITS processors.<br />

ITS applies to new document types as well as existing ones. It also applies both to markup without any<br />

internationalization features and to the class of documents already supporting some internationalization or<br />

localization-related functions.<br />

ITS can be specified using global rules and local rules.<br />

• The global rules are expressed anywhere in the document (embedded global rules), or even outside the document<br />

(external global rules), using the its:rules element.<br />

• The local rules are expressed by specialized attributes (and sometimes elements) specified inside the document<br />

instance, at the location where they apply.



Examples<br />

Example of ITS markup for the Translate data category:<br />

The elements and attributes with the its prefix are part of the ITS namespace. The its:rules element lists the different<br />

rules to apply to this file. There is one its:translateRule rule that indicates that any content inside the head element<br />

should not be translated.<br />

The its:translate attributes used in some elements override the global rule: here they make the content of title<br />

translatable and the text "faux pas" non-translatable.<br />

Sep-10-2006 v5<br />

Ealasaidh McIan<br />

ealasaidh@hogw.ac.uk<br />

The Origins of Modern Novel<br />

Introduction<br />

It would certainly be quite a faux pas to start a dissertation on the origin of modern novel without<br />

mentioning the Epic of Gilgamesh...<br />
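How a processor might combine a global its:translateRule with local its:translate overrides can be sketched in Python. This is a simplified illustration, not a conforming ITS processor: the sample document is a hypothetical reduction of the example, and only selectors of the form `//name` are handled (an assumption made to stay within what the standard library's ElementTree supports).

```python
import xml.etree.ElementTree as ET

ITS = "{http://www.w3.org/2005/11/its}"

# Hypothetical, reduced document modelled on the description above.
SAMPLE = """<text xmlns:its="http://www.w3.org/2005/11/its">
  <head>
    <author>Ealasaidh McIan</author>
    <title its:translate="yes">The Origins of Modern Novel</title>
    <its:rules version="1.0">
      <its:translateRule translate="no" selector="//head"/>
    </its:rules>
  </head>
  <body>
    <p>It would certainly be quite a <span its:translate="no">faux pas</span> to start...</p>
  </body>
</text>"""

def translate_flags(root):
    """Return {element: bool}: whether each element's text should be translated."""
    # Step 1: elements selected by global rules (only "//name" selectors here).
    selected = {}
    for rule in root.iter(ITS + "translateRule"):
        selector = rule.get("selector")
        if not selector.startswith("//"):
            raise ValueError("this sketch only supports //name selectors")
        for elem in root.iter(selector[2:]):
            selected[elem] = rule.get("translate") == "yes"

    # Step 2: walk the tree; a local attribute beats a global rule,
    # which beats the value inherited from the parent (default: translatable).
    flags = {}
    def walk(elem, inherited):
        value = selected.get(elem, inherited)
        local = elem.get(ITS + "translate")
        if local is not None:
            value = local == "yes"
        flags[elem] = value
        for child in elem:
            walk(child, value)
    walk(root, True)
    return flags

root = ET.fromstring(SAMPLE)
for elem, flag in translate_flags(root).items():
    print(elem.tag, "translate" if flag else "do not translate")
```

In this sketch the title stays translatable despite the rule on head, and the span is excluded even though its paragraph is translatable.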

Example of ITS markup for the Localization Note data category:<br />

The its:locNoteRule element specifies that any node corresponding to the XPath expression "//msg/data" has an<br />

associated note. The location of that note is expressed by the locNotePointer attribute, which holds a relative XPath<br />

expression pointing to the node where the note is, here "../notes".<br />

Note also the use of the its:translate attribute to mark the notes elements as non-translatable.<br />

A division by 0 was going to be computed.<br />

Invalid parameter.<br />
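The pairing of each data node with its note can be sketched in Python. The document below is hypothetical: the element layout follows the XPath expressions quoted above, the note texts are invented placeholders, and the selector "//msg/data" and pointer "../notes" are hard-coded rather than read from the rule, so this is an illustration of the lookup and not an ITS processor.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; note texts are placeholders, not from the original example.
SAMPLE = """<res xmlns:its="http://www.w3.org/2005/11/its">
  <its:rules version="1.0">
    <its:locNoteRule locNoteType="description" selector="//msg/data"
                     locNotePointer="../notes"/>
  </its:rules>
  <msg id="DivByZero">
    <data>A division by 0 was going to be computed.</data>
    <notes its:translate="no">Hypothetical note: shown when a formula divides by zero.</notes>
  </msg>
  <msg id="InvalidParam">
    <data>Invalid parameter.</data>
    <notes its:translate="no">Hypothetical note: shown for a bad argument.</notes>
  </msg>
</res>"""

def notes_for_data(root):
    """Map the text of each //msg/data node to the text of its sibling ../notes node."""
    # ElementTree elements have no parent pointer, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    result = {}
    for data in root.findall(".//msg/data"):
        note = parent_of[data].find("notes")  # resolves the relative pointer "../notes"
        result[data.text] = note.text if note is not None else None
    return result

for text, note in notes_for_data(ET.fromstring(SAMPLE)).items():
    print(repr(text), "->", repr(note))
```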

ITS limitations<br />

ITS does not have a solution to all <strong>XML</strong> internationalization and localization issues.<br />

One reason is that version 1.0 does not have data categories for everything. For example, there is currently no<br />

way to indicate a relation source/target in bilingual files where some parts of a document store the source text and<br />

some other parts the corresponding translation.<br />

The other reason is that many aspects of internationalization cannot be resolved with markup. They have to do with<br />

the design of the DTD or the schema itself. There are best practices, design and authoring guidelines [2] that are<br />

necessary to follow to make sure documents are correctly internationalized and easy to localize. For example, using<br />

attributes to store translatable text is a bad idea for many different reasons, but ITS cannot prevent an <strong>XML</strong><br />

developer from making such a choice.<br />

External links<br />

• Internationalization Tag Set (ITS) Version 1.0 [3]<br />

• W3C Internationalization Home [4]<br />

• Best Practices for <strong>XML</strong> Internationalization (Working Draft) [5]<br />

• List of ITS implementations and articles about ITS [6]<br />

References<br />

[1] http://www.w3.org/TR/its/<br />

[2] http://www.w3.org/TR/xml-i18n-bp/<br />

[3] http://www.w3.org/TR/2007/REC-its-20070403/<br />

[4] http://www.w3.org/International/<br />

[5] http://www.w3.org/TR/2007/WD-xml-i18n-bp-20070427/<br />

[6] http://www.w3.org/International/its/links.html



Klip<br />

Klip is an <strong>XML</strong> file that contains markup, styles and JavaScript that provides the<br />

Klipfolio desktop dashboard platform with rules for the retrieval, interpretation, and<br />

presentation of arbitrary information sources such as web pages, RSS feeds, and<br />

proprietary <strong>XML</strong> back-ends. The Klip file extension is ".klip".<br />

When opened in Klipfolio, a Klip is rendered as a small window that displays text<br />

and image content. The size, position and visibility of the Klip on-screen is managed by the user. Settings particular<br />

to each Klip can be found in a "Klip Setup" dialog.<br />

Klips are generally considered to be widgets, and KlipFolio a widget engine. There are thousands of different Klips<br />

available as free downloads at Klipfolio.com [1]. Klips provide all manner of information such as weather conditions,<br />

news headlines, stock quotes etc. The consumer version of KlipFolio is freeware and can be downloaded, installed,<br />

and used by anyone.<br />

Example usage<br />

This very simple example can be written using a plain text or <strong>XML</strong> editor.<br />

My Klip<br />

Your Description here....<br />

The author of the Klip<br />

15 keywords maximum to upload to KlipFarm the Klip directory<br />

http://mydomain.com/myxml.xml<br />

http://mydomain.com/myicon.jpg<br />

http://mydomain.com/mybanner.gif<br />

15<br />

Saving it as first.klip will allow you to open it using KlipFolio.<br />

Note: "klip" is also the word for "clip" in a number of Eastern European countries (e.g. Czechia, Lithuania, Poland,<br />

Serbia, Slovakia)



See also<br />

• KlipFolio<br />

• Serence<br />

Links<br />

• KlipFolio Homepage [1]<br />

References<br />

[1] http://www.klipfolio.com/<br />


List of <strong>XML</strong> and HTML character entity<br />

references<br />

In SGML, HTML and <strong>XML</strong> documents, the logical constructs known as character data and attribute values consist<br />

of sequences of characters, in which each character can manifest directly (representing itself), or can be represented<br />

by a series of characters called a character reference, of which there are two types: a numeric character reference<br />

and a character entity reference. This article lists the character entity references that are valid in HTML and <strong>XML</strong><br />

documents.<br />

Character reference overview<br />

A numeric character reference refers to a character by its Universal Character Set/Unicode code point, and uses the<br />

format<br />

&#nnnn;<br />

or<br />

&#xhhhh;<br />

where nnnn is the code point in decimal form, and hhhh is the code point in hexadecimal form. The x must be<br />

lowercase in <strong>XML</strong> documents. The nnnn or hhhh may be any number of digits and may include leading zeros. The<br />

hhhh may mix uppercase and lowercase, though uppercase is the usual style.<br />
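The two formats can be exercised with a short Python sketch. The function names are hypothetical; the pattern accepts only a lowercase x, following the XML rule stated above, and no check is made that the code point is a legal character.

```python
import re

# Decimal form &#nnnn; or hexadecimal form &#xhhhh; (lowercase x only, as in XML).
_NCR = re.compile(r"&#(?:x([0-9A-Fa-f]+)|([0-9]+));")

def decode_ncr(text):
    """Replace every numeric character reference with the character it names."""
    def repl(match):
        hex_digits, dec_digits = match.groups()
        code = int(hex_digits, 16) if hex_digits else int(dec_digits)
        return chr(code)
    return _NCR.sub(repl, text)

def encode_ncr(ch, hexadecimal=False):
    """Write one character as a decimal or (uppercase) hexadecimal reference."""
    return "&#x%X;" % ord(ch) if hexadecimal else "&#%d;" % ord(ch)

# Leading zeros and mixed-case hex digits are accepted.
assert decode_ncr("&#38; &#x26; &#x0026; &#0038;") == "& & & &"
assert decode_ncr("&#xe9; &#xE9;") == "\u00e9 \u00e9"
assert encode_ncr("&") == "&#38;"
assert encode_ncr("\u00e9", hexadecimal=True) == "&#xE9;"
```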

In contrast, a character entity reference refers to a character by the name of an entity which has the desired character<br />

as its replacement text. The entity must either be predefined (built-in to the markup language) or explicitly declared<br />

in a Document Type Definition (DTD). The format is the same as for any entity reference:<br />

&name;<br />

where name is the name of the entity. The semicolon is required.



Predefined entities in <strong>XML</strong><br />

The <strong>XML</strong> specification does not use the term "character entity" or "character entity reference". The <strong>XML</strong><br />

specification defines five "predefined entities" representing special characters, and requires that all <strong>XML</strong> processors<br />

honor them. The entities can be explicitly declared in a DTD, as well, but if this is done, the replacement text must<br />

be the same as the built-in definitions. <strong>XML</strong> also allows other named entities of any size to be defined on a<br />

per-document basis.<br />

The table below lists the five <strong>XML</strong> predefined entities. The "Name" column mentions the entity's name. The<br />

"Character" column shows the character, if it is renderable. In order to render the character, the format &name; is<br />

used; for example, &amp; renders as &. The "Unicode code point" column cites the character via standard<br />

UCS/Unicode "U+" notation, which shows the character's code point in hexadecimal. The decimal equivalent of the<br />

code point is then shown in parentheses. The "Standard" column indicates the first version of <strong>XML</strong> that includes the<br />

entity. The "Description" column cites the character via its canonical UCS/Unicode name, in English.<br />

Name Character Unicode code point (decimal) Standard Description<br />

quot " U+0022 (34) <strong>XML</strong> 1.0 (double) quotation mark<br />

amp & U+0026 (38) <strong>XML</strong> 1.0 ampersand<br />

apos ' U+0027 (39) <strong>XML</strong> 1.0 apostrophe (= apostrophe-quote)<br />

lt < U+003C (60) <strong>XML</strong> 1.0 less-than sign<br />

gt > U+003E (62) <strong>XML</strong> 1.0 greater-than sign<br />
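As an illustration, the five predefined entities can be applied and reversed with Python's standard `xml.sax.saxutils` module. Its `escape` and `unescape` functions handle `&`, `<` and `>` by default; the extra mappings for the two quote characters and the wrapper names `escape_all` and `unescape_all` are choices made for this sketch.

```python
from xml.sax.saxutils import escape, unescape

# escape() covers &, < and > on its own; quot and apos are supplied explicitly.
EXTRA = {'"': "&quot;", "'": "&apos;"}
REVERSE = {entity: char for char, entity in EXTRA.items()}

def escape_all(text):
    """Replace all five special characters with their predefined entity references."""
    return escape(text, EXTRA)

def unescape_all(text):
    """Turn the five predefined entity references back into characters."""
    return unescape(text, REVERSE)

original = 'a < b & "c" > \'d\''
escaped = escape_all(original)
print(escaped)
assert unescape_all(escaped) == original
```

The ampersand is replaced first when escaping and last when unescaping, which avoids double-processing text such as `&amp;lt;`.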

Character entity references in HTML<br />

The HTML 4 DTDs define 252 named entities, references to which act as mnemonic aliases for certain Unicode<br />

characters. The HTML 4 specification requires the use of the standard DTDs and does not allow users to define<br />

additional entities.<br />

In the table below, the "Standard" column indicates the first version of the HTML DTD that defines the character<br />

entity reference. HTML 4.01 did not provide any new character references.<br />
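For readers who want to check entries programmatically, Python's standard library exposes the HTML 4 entity names in `html.entities.name2codepoint`, and `html.unescape` resolves references in text. The helper `describe` below is a hypothetical convenience that prints a name in the same "U+hhhh (decimal)" style used in the table.

```python
import html
from html.entities import name2codepoint

def describe(name):
    """Format an HTML entity name with its code point in hex and decimal."""
    code = name2codepoint[name]
    return "%s U+%04X (%d)" % (name, code, code)

print(describe("nbsp"))
print(describe("Agrave"))

# html.unescape resolves named references to the characters themselves.
assert html.unescape("&copy; 2010 &frac12;") == "\u00a9 2010 \u00bd"
```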

Name Character Unicode code point (decimal) Standard DTD Old ISO subset Description<br />

quot " U+0022 (34) HTML 2.0 HTMLspecial ISOnum quotation mark (= APL quote)<br />

amp & U+0026 (38) HTML 2.0 HTMLspecial ISOnum ampersand<br />

apos ' U+0027 (39) XHTML 1.0 HTMLspecial ISOnum apostrophe (= apostrophe-quote); see below<br />

lt < U+003C (60) HTML 2.0 HTMLspecial ISOnum less-than sign<br />

gt > U+003E (62) HTML 2.0 HTMLspecial ISOnum greater-than sign<br />

nbsp U+00A0 (160) HTML 3.2 HTMLlat1 ISOnum no-break space (= non-breaking space) spaces<br />

iexcl ¡ U+00A1 (161) HTML 3.2 HTMLlat1 ISOnum inverted exclamation mark<br />

cent ¢ U+00A2 (162) HTML 3.2 HTMLlat1 ISOnum cent sign<br />

pound £ U+00A3 (163) HTML 3.2 HTMLlat1 ISOnum pound sign<br />

curren ¤ U+00A4 (164) HTML 3.2 HTMLlat1 ISOnum currency sign<br />

yen ¥ U+00A5 (165) HTML 3.2 HTMLlat1 ISOnum yen sign (= yuan sign)<br />

brvbar ¦ U+00A6 (166) HTML 3.2 HTMLlat1 ISOnum broken bar (= broken vertical bar)



sect § U+00A7 (167) HTML 3.2 HTMLlat1 ISOnum section sign<br />

uml ¨ U+00A8 (168) HTML 3.2 HTMLlat1 ISOdia diaeresis (= spacing diaeresis); see German<br />

umlaut<br />

copy © U+00A9 (169) HTML 3.2 HTMLlat1 ISOnum copyright sign<br />

ordf ª U+00AA (170) HTML 3.2 HTMLlat1 ISOnum feminine ordinal indicator<br />

laquo « U+00AB (171) HTML 3.2 HTMLlat1 ISOnum left-pointing double angle quotation mark (= left pointing guillemet)<br />

not ¬ U+00AC (172) HTML 3.2 HTMLlat1 ISOnum not sign<br />

shy U+00AD (173) HTML 3.2 HTMLlat1 ISOnum soft hyphen (= discretionary hyphen)<br />

reg ® U+00AE (174) HTML 3.2 HTMLlat1 ISOnum registered sign ( = registered trade mark sign)<br />

macr ¯ U+00AF (175) HTML 3.2 HTMLlat1 ISOdia macron (= spacing macron = overline = APL<br />

overbar)<br />

deg ° U+00B0 (176) HTML 3.2 HTMLlat1 ISOnum degree sign<br />

plusmn ± U+00B1 (177) HTML 3.2 HTMLlat1 ISOnum plus-minus sign (= plus-or-minus sign)<br />

sup2 ² U+00B2 (178) HTML 3.2 HTMLlat1 ISOnum superscript two (= superscript digit two =<br />

squared)<br />

sup3 ³ U+00B3 (179) HTML 3.2 HTMLlat1 ISOnum superscript three (= superscript digit three =<br />

cubed)<br />

acute ´ U+00B4 (180) HTML 3.2 HTMLlat1 ISOdia acute accent (= spacing acute)<br />

micro µ U+00B5 (181) HTML 3.2 HTMLlat1 ISOnum micro sign<br />

para ¶ U+00B6 (182) HTML 3.2 HTMLlat1 ISOnum pilcrow sign (= paragraph sign)<br />

middot · U+00B7 (183) HTML 3.2 HTMLlat1 ISOnum middle dot (= Georgian comma = Greek middle dot)<br />

cedil ¸ U+00B8 (184) HTML 3.2 HTMLlat1 ISOdia cedilla (= spacing cedilla)<br />

sup1 ¹ U+00B9 (185) HTML 3.2 HTMLlat1 ISOnum superscript one (= superscript digit one)<br />

ordm º U+00BA (186) HTML 3.2 HTMLlat1 ISOnum masculine ordinal indicator<br />

raquo » U+00BB (187) HTML 3.2 HTMLlat1 ISOnum right-pointing double angle quotation mark (= right pointing guillemet)<br />

frac14 ¼ U+00BC (188) HTML 3.2 HTMLlat1 ISOnum vulgar fraction one quarter (= fraction one<br />

quarter)<br />

frac12 ½ U+00BD (189) HTML 3.2 HTMLlat1 ISOnum vulgar fraction one half (= fraction one half)<br />

frac34 ¾ U+00BE (190) HTML 3.2 HTMLlat1 ISOnum vulgar fraction three quarters (= fraction three<br />

quarters)<br />

iquest ¿ U+00BF (191) HTML 3.2 HTMLlat1 ISOnum inverted question mark (= turned question mark)<br />

Agrave À U+00C0 (192) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter A with grave (= Latin capital<br />

letter A grave)<br />

Aacute Á U+00C1 (193) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter A with acute<br />

Acirc  U+00C2 (194) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter A with circumflex<br />

Atilde à U+00C3 (195) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter A with tilde<br />

Auml Ä U+00C4 (196) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter A with diaeresis<br />

Aring Å U+00C5 (197) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter A with ring above (= Latin<br />

capital letter A ring)



AElig Æ U+00C6 (198) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter AE (= Latin capital ligature AE)<br />

Ccedil Ç U+00C7 (199) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter C with cedilla<br />

Egrave È U+00C8 (200) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter E with grave<br />

Eacute É U+00C9 (201) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter E with acute<br />

Ecirc Ê U+00CA (202) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter E with circumflex<br />

Euml Ë U+00CB (203) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter E with diaeresis<br />

Igrave Ì U+00CC (204) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter I with grave<br />

Iacute Í U+00CD (205) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter I with acute<br />

Icirc Î U+00CE (206) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter I with circumflex<br />

Iuml Ï U+00CF (207) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter I with diaeresis<br />

ETH Ð U+00D0 (208) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter ETH<br />

Ntilde Ñ U+00D1 (209) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter N with tilde<br />

Ograve Ò U+00D2 (210) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter O with grave<br />

Oacute Ó U+00D3 (211) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter O with acute<br />

Ocirc Ô U+00D4 (212) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter O with circumflex<br />

Otilde Õ U+00D5 (213) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter O with tilde<br />

Ouml Ö U+00D6 (214) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter O with diaeresis<br />

times × U+00D7 (215) HTML 3.2 HTMLlat1 ISOnum multiplication sign<br />

Oslash Ø U+00D8 (216) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter O with stroke (= Latin capital letter O slash)<br />

Ugrave Ù U+00D9 (217) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter U with grave<br />

Uacute Ú U+00DA (218) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter U with acute<br />

Ucirc Û U+00DB (219) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter U with circumflex<br />

Uuml Ü U+00DC (220) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter U with diaeresis<br />

Yacute Ý U+00DD (221) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter Y with acute<br />

THORN Þ U+00DE (222) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter THORN<br />

szlig ß U+00DF (223) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter sharp s (= ess-zed); see<br />

German Eszett<br />

agrave à U+00E0 (224) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter a with grave<br />

aacute á U+00E1 (225) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter a with acute<br />

acirc â U+00E2 (226) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter a with circumflex<br />

atilde ã U+00E3 (227) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter a with tilde<br />

auml ä U+00E4 (228) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter a with diaeresis<br />

aring å U+00E5 (229) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter a with ring above<br />

aelig æ U+00E6 (230) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter ae (= Latin small ligature ae)<br />

ccedil ç U+00E7 (231) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter c with cedilla<br />

egrave è U+00E8 (232) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter e with grave<br />

eacute é U+00E9 (233) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter e with acute<br />

ecirc ê U+00EA (234) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter e with circumflex



euml ë U+00EB (235) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter e with diaeresis<br />

igrave ì U+00EC (236) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter i with grave<br />

iacute í U+00ED (237) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter i with acute<br />

icirc î U+00EE (238) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter i with circumflex<br />

iuml ï U+00EF (239) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter i with diaeresis<br />

eth ð U+00F0 (240) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter eth<br />

ntilde ñ U+00F1 (241) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter n with tilde<br />

ograve ò U+00F2 (242) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter o with grave<br />

oacute ó U+00F3 (243) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter o with acute<br />

ocirc ô U+00F4 (244) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter o with circumflex<br />

otilde õ U+00F5 (245) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter o with tilde<br />

ouml ö U+00F6 (246) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter o with diaeresis<br />

divide ÷ U+00F7 (247) HTML 3.2 HTMLlat1 ISOnum division sign<br />

oslash ø U+00F8 (248) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter o with stroke (= Latin small letter o slash)<br />

ugrave ù U+00F9 (249) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter u with grave<br />

uacute ú U+00FA (250) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter u with acute<br />

ucirc û U+00FB (251) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter u with circumflex<br />

uuml ü U+00FC (252) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter u with diaeresis<br />

yacute ý U+00FD (253) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter y with acute<br />

thorn þ U+00FE (254) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter thorn<br />

yuml ÿ U+00FF (255) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter y with diaeresis<br />

OElig Œ U+0152 (338) HTML 4.0 HTMLspecial ISOlat2 Latin capital ligature oe ligature<br />

oelig œ U+0153 (339) HTML 4.0 HTMLspecial ISOlat2 Latin small ligature oe ligature<br />

Scaron Š U+0160 (352) HTML 4.0 HTMLspecial ISOlat2 Latin capital letter s with caron<br />

scaron š U+0161 (353) HTML 4.0 HTMLspecial ISOlat2 Latin small letter s with caron<br />

Yuml Ÿ U+0178 (376) HTML 4.0 HTMLspecial ISOlat2 Latin capital letter y with diaeresis<br />

fnof ƒ U+0192 (402) HTML 4.0 HTMLsymbol ISOtech Latin small letter f with hook (= function = florin)<br />

circ ˆ U+02C6 (710) HTML 4.0 HTMLspecial ISOpub modifier letter circumflex accent<br />

tilde ˜ U+02DC (732) HTML 4.0 HTMLspecial ISOdia small tilde<br />

Alpha Α U+0391 (913) HTML 4.0 HTMLsymbol Greek capital letter Alpha<br />

Beta Β U+0392 (914) HTML 4.0 HTMLsymbol Greek capital letter Beta<br />

Gamma Γ U+0393 (915) HTML 4.0 HTMLsymbol ISOgrk3 Greek capital letter Gamma<br />

Delta Δ U+0394 (916) HTML 4.0 HTMLsymbol ISOgrk3 Greek capital letter Delta<br />

Epsilon Ε U+0395 (917) HTML 4.0 HTMLsymbol Greek capital letter Epsilon<br />

Zeta Ζ U+0396 (918) HTML 4.0 HTMLsymbol Greek capital letter Zeta<br />

Eta Η U+0397 (919) HTML 4.0 HTMLsymbol Greek capital letter Eta<br />

Theta Θ U+0398 (920) HTML 4.0 HTMLsymbol ISOgrk3 Greek capital letter Theta<br />

Iota Ι U+0399 (921) HTML 4.0 HTMLsymbol Greek capital letter Iota



Kappa Κ U+039A (922) HTML 4.0 HTMLsymbol Greek capital letter Kappa<br />

Lambda Λ U+039B (923) HTML 4.0 HTMLsymbol ISOgrk3 Greek capital letter Lambda<br />

Mu Μ U+039C (924) HTML 4.0 HTMLsymbol Greek capital letter Mu<br />

Nu Ν U+039D (925) HTML 4.0 HTMLsymbol Greek capital letter Nu<br />

Xi Ξ U+039E (926) HTML 4.0 HTMLsymbol ISOgrk3 Greek capital letter Xi<br />

Omicron Ο U+039F (927) HTML 4.0 HTMLsymbol Greek capital letter Omicron<br />

Pi Π U+03A0 (928) HTML 4.0 HTMLsymbol Greek capital letter Pi<br />

Rho Ρ U+03A1 (929) HTML 4.0 HTMLsymbol Greek capital letter Rho<br />

Sigma Σ U+03A3 (931) HTML 4.0 HTMLsymbol ISOgrk3 Greek capital letter Sigma<br />

Tau Τ U+03A4 (932) HTML 4.0 HTMLsymbol Greek capital letter Tau<br />

Upsilon Υ U+03A5 (933) HTML 4.0 HTMLsymbol ISOgrk3 Greek capital letter Upsilon<br />

Phi Φ U+03A6 (934) HTML 4.0 HTMLsymbol ISOgrk3 Greek capital letter Phi<br />

Chi Χ U+03A7 (935) HTML 4.0 HTMLsymbol Greek capital letter Chi<br />

Psi Ψ U+03A8 (936) HTML 4.0 HTMLsymbol ISOgrk3 Greek capital letter Psi<br />

Omega Ω U+03A9 (937) HTML 4.0 HTMLsymbol ISOgrk3 Greek capital letter Omega<br />

alpha α U+03B1 (945) HTML 4.0 HTMLsymbol ISOgrk3 Greek small letter alpha<br />

beta β U+03B2 (946) HTML 4.0 HTMLsymbol ISOgrk3 Greek small letter beta<br />

gamma γ U+03B3 (947) HTML 4.0 HTMLsymbol ISOgrk3 Greek small letter gamma<br />

delta δ U+03B4 (948) HTML 4.0 HTMLsymbol ISOgrk3 Greek small letter delta<br />

epsilon ε U+03B5 (949) HTML 4.0 HTMLsymbol ISOgrk3 Greek small letter epsilon<br />

zeta ζ U+03B6 (950) HTML 4.0 HTMLsymbol ISOgrk3 Greek small letter zeta<br />

eta η U+03B7 (951) HTML 4.0 HTMLsymbol ISOgrk3 Greek small letter eta<br />

theta θ U+03B8 (952) HTML 4.0 HTMLsymbol ISOgrk3 Greek small letter theta<br />

iota ι U+03B9 (953) HTML 4.0 HTMLsymbol ISOgrk3 Greek small letter iota<br />

kappa κ U+03BA (954) HTML 4.0 HTMLsymbol ISOgrk3 Greek small letter kappa<br />

lambda λ U+03BB (955) HTML 4.0 HTMLsymbol ISOgrk3 Greek small letter lambda<br />

mu μ U+03BC (956) HTML 4.0 HTMLsymbol ISOgrk3 Greek small letter mu<br />

nu ν U+03BD (957) HTML 4.0 HTMLsymbol ISOgrk3 Greek small letter nu<br />

xi ξ U+03BE (958) HTML 4.0 HTMLsymbol ISOgrk3 Greek small letter xi<br />

omicron ο U+03BF (959) HTML 4.0 HTMLsymbol NEW Greek small letter omicron<br />

pi π U+03C0 (960) HTML 4.0 HTMLsymbol ISOgrk3 Greek small letter pi<br />

rho ρ U+03C1 (961) HTML 4.0 HTMLsymbol ISOgrk3 Greek small letter rho<br />

sigmaf ς U+03C2 (962) HTML 4.0 HTMLsymbol ISOgrk3 Greek small letter final sigma<br />

sigma σ U+03C3 (963) HTML 4.0 HTMLsymbol ISOgrk3 Greek small letter sigma<br />

tau τ U+03C4 (964) HTML 4.0 HTMLsymbol ISOgrk3 Greek small letter tau<br />

upsilon υ U+03C5 (965) HTML 4.0 HTMLsymbol ISOgrk3 Greek small letter upsilon<br />

phi φ U+03C6 (966) HTML 4.0 HTMLsymbol ISOgrk3 Greek small letter phi<br />

chi χ U+03C7 (967) HTML 4.0 HTMLsymbol ISOgrk3 Greek small letter chi<br />

psi ψ U+03C8 (968) HTML 4.0 HTMLsymbol ISOgrk3 Greek small letter psi



omega ω U+03C9 (969) HTML 4.0 HTMLsymbol ISOgrk3 Greek small letter omega<br />

thetasym ϑ U+03D1 (977) HTML 4.0 HTMLsymbol NEW Greek theta symbol<br />

upsih ϒ U+03D2 (978) HTML 4.0 HTMLsymbol NEW Greek Upsilon with hook symbol<br />

piv ϖ U+03D6 (982) HTML 4.0 HTMLsymbol ISOgrk3 Greek pi symbol<br />

ensp U+2002 (8194) HTML 4.0 HTMLspecial ISOpub en space spaces<br />

emsp U+2003 (8195) HTML 4.0 HTMLspecial ISOpub em space spaces<br />

thinsp U+2009 (8201) HTML 4.0 HTMLspecial ISOpub thin space spaces<br />

zwnj U+200C (8204) HTML 4.0 HTMLspecial NEW RFC 2070 zero-width non-joiner<br />

zwj U+200D (8205) HTML 4.0 HTMLspecial NEW RFC 2070 zero-width joiner<br />

lrm U+200E (8206) HTML 4.0 HTMLspecial NEW RFC 2070 left-to-right mark<br />

rlm U+200F (8207) HTML 4.0 HTMLspecial NEW RFC 2070 right-to-left mark<br />

ndash – U+2013 (8211) HTML 4.0 HTMLspecial ISOpub en dash<br />

mdash — U+2014 (8212) HTML 4.0 HTMLspecial ISOpub em dash<br />

lsquo ‘ U+2018 (8216) HTML 4.0 HTMLspecial ISOnum left single quotation mark<br />

rsquo ’ U+2019 (8217) HTML 4.0 HTMLspecial ISOnum right single quotation mark<br />

sbquo ‚ U+201A (8218) HTML 4.0 HTMLspecial NEW single low-9 quotation mark<br />

ldquo “ U+201C (8220) HTML 4.0 HTMLspecial ISOnum left double quotation mark<br />

rdquo ” U+201D (8221) HTML 4.0 HTMLspecial ISOnum right double quotation mark<br />

bdquo „ U+201E (8222) HTML 4.0 HTMLspecial NEW double low-9 quotation mark<br />

dagger † U+2020 (8224) HTML 4.0 HTMLspecial ISOpub dagger<br />

Dagger ‡ U+2021 (8225) HTML 4.0 HTMLspecial ISOpub double dagger<br />

bull • U+2022 (8226) HTML 4.0 HTMLspecial ISOpub bullet (= black small circle) black<br />

hellip … U+2026 (8230) HTML 4.0 HTMLsymbol ISOpub horizontal ellipsis (= three dot leader)<br />

permil ‰ U+2030 (8240) HTML 4.0 HTMLspecial ISOtech per mille sign<br />

prime ′ U+2032 (8242) HTML 4.0 HTMLsymbol ISOtech prime (= minutes = feet)<br />

Prime ″ U+2033 (8243) HTML 4.0 HTMLsymbol ISOtech double prime (= seconds = inches)<br />

lsaquo ‹ U+2039 (8249) HTML 4.0 HTMLspecial ISO proposed single left-pointing angle quotation mark proposed<br />

rsaquo › U+203A (8250) HTML 4.0 HTMLspecial ISO proposed single right-pointing angle quotation mark proposed<br />

oline ‾ U+203E (8254) HTML 4.0 HTMLsymbol NEW overline (= spacing overscore)<br />

frasl ⁄ U+2044 (8260) HTML 4.0 HTMLsymbol NEW fraction slash (= solidus)<br />

euro € U+20AC (8364) HTML 4.0 HTMLspecial NEW euro sign<br />

image ℑ U+2111 (8465) HTML 4.0 HTMLsymbol ISOamso black-letter capital I (= imaginary part)<br />

weierp ℘ U+2118 (8472) HTML 4.0 HTMLsymbol ISOamso script capital P (= power set = Weierstrass p)<br />

real ℜ U+211C (8476) HTML 4.0 HTMLsymbol ISOamso black-letter capital R (= real part symbol)<br />

trade ™ U+2122 (8482) HTML 4.0 HTMLsymbol ISOnum trademark sign<br />

alefsym ℵ U+2135 (8501) HTML 4.0 HTMLsymbol NEW alef symbol (= first transfinite cardinal) alefsym<br />

larr ← U+2190 (8592) HTML 4.0 HTMLsymbol ISOnum leftwards arrow<br />

uarr ↑ U+2191 (8593) HTML 4.0 HTMLsymbol ISOnum upwards arrow



rarr → U+2192 (8594) HTML 4.0 HTMLsymbol ISOnum rightwards arrow<br />

darr ↓ U+2193 (8595) HTML 4.0 HTMLsymbol ISOnum downwards arrow<br />

harr ↔ U+2194 (8596) HTML 4.0 HTMLsymbol ISOamsa left right arrow<br />

crarr ↵ U+21B5 (8629) HTML 4.0 HTMLsymbol NEW downwards arrow with corner leftwards (= carriage return)<br />

lArr ⇐ U+21D0 (8656) HTML 4.0 HTMLsymbol ISOtech leftwards double arrow lArr<br />

uArr ⇑ U+21D1 (8657) HTML 4.0 HTMLsymbol ISOamsa upwards double arrow<br />

rArr ⇒ U+21D2 (8658) HTML 4.0 HTMLsymbol ISOnum rightwards double arrow rArr<br />

dArr ⇓ U+21D3 (8659) HTML 4.0 HTMLsymbol ISOamsa downwards double arrow<br />

hArr ⇔ U+21D4 (8660) HTML 4.0 HTMLsymbol ISOamsa left right double arrow<br />

forall ∀ U+2200 (8704) HTML 4.0 HTMLsymbol ISOtech for all<br />

part ∂ U+2202 (8706) HTML 4.0 HTMLsymbol ISOtech partial differential<br />

exist ∃ U+2203 (8707) HTML 4.0 HTMLsymbol ISOtech there exists<br />

empty ∅ U+2205 (8709) HTML 4.0 HTMLsymbol ISOamso empty set (= null set = diameter)<br />

nabla ∇ U+2207 (8711) HTML 4.0 HTMLsymbol ISOtech nabla (= backward difference)<br />

isin ∈ U+2208 (8712) HTML 4.0 HTMLsymbol ISOtech element of<br />

notin ∉ U+2209 (8713) HTML 4.0 HTMLsymbol ISOtech not an element of<br />

ni ∋ U+220B (8715) HTML 4.0 HTMLsymbol ISOtech contains as member<br />

prod ∏ U+220F (8719) HTML 4.0 HTMLsymbol ISOamsb n-ary product (= product sign) prod<br />

sum ∑ U+2211 (8721) HTML 4.0 HTMLsymbol ISOamsb n-ary summation sum<br />

minus − U+2212 (8722) HTML 4.0 HTMLsymbol ISOtech minus sign<br />

lowast ∗ U+2217 (8727) HTML 4.0 HTMLsymbol ISOtech asterisk operator<br />

radic √ U+221A (8730) HTML 4.0 HTMLsymbol ISOtech square root (= radical sign)<br />

prop ∝ U+221D (8733) HTML 4.0 HTMLsymbol ISOtech proportional to<br />

infin ∞ U+221E (8734) HTML 4.0 HTMLsymbol ISOtech infinity<br />

ang ∠ U+2220 (8736) HTML 4.0 HTMLsymbol ISOamso angle<br />

and ∧ U+2227 (8743) HTML 4.0 HTMLsymbol ISOtech logical and (= wedge)<br />

or ∨ U+2228 (8744) HTML 4.0 HTMLsymbol ISOtech logical or (= vee)<br />

cap ∩ U+2229 (8745) HTML 4.0 HTMLsymbol ISOtech intersection (= cap)<br />

cup ∪ U+222A (8746) HTML 4.0 HTMLsymbol ISOtech union (= cup)<br />

int ∫ U+222B (8747) HTML 4.0 HTMLsymbol ISOtech integral<br />

there4 ∴ U+2234 (8756) HTML 4.0 HTMLsymbol ISOtech therefore<br />

sim ∼ U+223C (8764) HTML 4.0 HTMLsymbol ISOtech tilde operator (= varies with = similar to) sim<br />

cong ≅ U+2245 (8773) HTML 4.0 HTMLsymbol ISOtech congruent to<br />

asymp ≈ U+2248 (8776) HTML 4.0 HTMLsymbol ISOamsr almost equal to (= asymptotic to)<br />

ne ≠ U+2260 (8800) HTML 4.0 HTMLsymbol ISOtech not equal to<br />

equiv ≡ U+2261 (8801) HTML 4.0 HTMLsymbol ISOtech identical to; sometimes used for 'equivalent to'<br />

le ≤ U+2264 (8804) HTML 4.0 HTMLsymbol ISOtech less-than or equal to<br />

ge ≥ U+2265 (8805) HTML 4.0 HTMLsymbol ISOtech greater-than or equal to



sub ⊂ U+2282 (8834) HTML 4.0 HTMLsymbol ISOtech subset of<br />

sup ⊃ U+2283 (8835) HTML 4.0 HTMLsymbol ISOtech superset of sup<br />

nsub ⊄ U+2284 (8836) HTML 4.0 HTMLsymbol ISOamsn not a subset of<br />

sube ⊆ U+2286 (8838) HTML 4.0 HTMLsymbol ISOtech subset of or equal to<br />

supe ⊇ U+2287 (8839) HTML 4.0 HTMLsymbol ISOtech superset of or equal to<br />

oplus ⊕ U+2295 (8853) HTML 4.0 HTMLsymbol ISOamsb circled plus (= direct sum)<br />

otimes ⊗ U+2297 (8855) HTML 4.0 HTMLsymbol ISOamsb circled times (= vector product)<br />

perp ⊥ U+22A5 (8869) HTML 4.0 HTMLsymbol ISOtech up tack (= orthogonal to = perpendicular) perp<br />

sdot ⋅ U+22C5 (8901) HTML 4.0 HTMLsymbol ISOamsb dot operator sdot<br />

lceil ⌈ U+2308 (8968) HTML 4.0 HTMLsymbol ISOamsc left ceiling (= APL upstile)<br />

rceil ⌉ U+2309 (8969) HTML 4.0 HTMLsymbol ISOamsc right ceiling<br />

lfloor ⌊ U+230A (8970) HTML 4.0 HTMLsymbol ISOamsc left floor (= APL downstile)<br />

rfloor ⌋ U+230B (8971) HTML 4.0 HTMLsymbol ISOamsc right floor<br />

lang U+2329 (9001) HTML 4.0 HTMLsymbol ISOtech left-pointing angle bracket (= bra) lang<br />

rang U+232A (9002) HTML 4.0 HTMLsymbol ISOtech right-pointing angle bracket (= ket) rang<br />

loz ◊ U+25CA (9674) HTML 4.0 HTMLsymbol ISOpub lozenge<br />

spades ♠ U+2660 (9824) HTML 4.0 HTMLsymbol ISOpub black spade suit black<br />

clubs ♣ U+2663 (9827) HTML 4.0 HTMLsymbol ISOpub black club suit (= shamrock) black<br />

hearts ♥ U+2665 (9829) HTML 4.0 HTMLsymbol ISOpub black heart suit (= valentine) black<br />

diams ♦ U+2666 (9830) HTML 4.0 HTMLsymbol ISOpub black diamond suit black<br />

Notes:<br />

• DTD: the full public DTD name (where the character entity name is defined) is actually mapped from one of the following three defined named entities:<br />

HTMLlat1 maps to:<br />

• PUBLIC "-//W3C//ENTITIES Latin 1//EN//HTML" in HTML (the DTD is implicitly defined, no system URI is needed);<br />

• PUBLIC "-//W3C//ENTITIES Latin 1 for XHTML//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml-lat1.ent" in XHTML 1.0;<br />

HTMLsymbol maps to:<br />

• PUBLIC "-//W3C//ENTITIES Symbols//EN//HTML" in HTML (the DTD is implicitly defined, no system URI is needed);<br />

• PUBLIC "-//W3C//ENTITIES Symbols for XHTML//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml-symbol.ent" in XHTML 1.0;<br />

HTMLspecial maps to:<br />

• PUBLIC "-//W3C//ENTITIES Special//EN//HTML" in HTML (the DTD is implicitly defined, no system URI is needed);<br />

• PUBLIC "-//W3C//ENTITIES Special for XHTML//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml-special.ent" in XHTML 1.0.<br />

• Old ISO subset: these are old (documented) character subsets used in legacy encodings before the unification<br />

within ISO 10646.<br />

• Description: the standard ISO 10646 and Unicode character name is displayed first for each character, with<br />

non-standard but legacy synonyms shown in italics between parentheses after an equal sign.<br />

• spaces: a blue background has been used in order to display each space's width.<br />

• ISO proposed: these characters have been standardized in ISO 10646 after the release of HTML 4.0.<br />

• ligature: this is a standard misnomer as this is a separate character in some languages.<br />

• black: here it seems to mean filled as opposed to hollow.<br />

• alefsym: 'alef symbol' is NOT the same as U+05D0 'Hebrew letter alef', although the same glyph could be used<br />

to depict both characters.<br />

• lArr: ISO 10646 does not say that 'leftwards double arrow' is the same as the 'is implied by' arrow but also does<br />

not have any other character for that function. So lArr can be used for 'is implied by' as ISOtech suggests.<br />

• rArr: ISO 10646 does not say that 'rightwards double arrow' is the 'implies' character but does not have another<br />

character with this function so rArr can be used for 'implies' as ISOtech suggests.<br />

• prod: 'n-ary product' is NOT the same character as U+03A0 'Greek capital letter Pi' though the same glyph might<br />

be used for both.<br />

• sum: 'n-ary summation' is NOT the same character as U+03A3 'Greek capital letter Sigma' though the same glyph<br />

might be used for both.<br />

• sim: 'tilde operator' is NOT the same character as U+007E 'tilde', although the same glyph might be used to<br />

represent both.<br />

• sup: note that nsup, U+2285 'not a superset of', is not covered by the Symbol font encoding and is not included.<br />

Should it be, for symmetry? It is in the ISOamsn subset.<br />

• perp: Unicode only defines U+22A5 as the "up tack". The Unicode symbol for "perpendicular" is U+27C2. The<br />

two symbols look similar, but are separate in Unicode. However, HTML uses U+22A5 as its "perpendicular"<br />

symbol. This is a discrepancy between HTML and Unicode.<br />

• sdot: 'dot operator' is NOT the same character as U+00B7 'middle dot'.<br />

• lang: 'left-pointing angle bracket' is NOT the same character as U+003C 'less than' or U+2039 'single<br />

left-pointing angle quotation mark'.<br />

• rang: 'right-pointing angle bracket' is NOT the same character as U+003E 'greater than' or U+203A 'single<br />

right-pointing angle quotation mark'.<br />
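The distinctions drawn in these notes can be checked mechanically. The following is a minimal sketch, assuming Python's standard html.entities.name2codepoint table (which covers the HTML 4 entity names) and the unicodedata module; the helper name describe is hypothetical and used only for illustration.<br />

```python
import unicodedata
from html.entities import name2codepoint


def describe(entity_name):
    """Return (code point, Unicode character name) for an HTML 4 entity name."""
    code_point = name2codepoint[entity_name]
    return code_point, unicodedata.name(chr(code_point))


# Pairs that may share a glyph but are distinct characters, per the notes above.
look_alikes = [("sum", "Sigma"), ("prod", "Pi"), ("sdot", "middot")]

for operator, letter in look_alikes:
    op_cp, op_name = describe(operator)
    lt_cp, lt_name = describe(letter)
    print(f"&{operator}; U+{op_cp:04X} {op_name}  vs  &{letter}; U+{lt_cp:04X} {lt_name}")

# HTML maps &perp; to U+22A5, which Unicode names "UP TACK".
print(describe("perp"))
```

Comparing code points rather than rendered glyphs is the point of the sketch: two entities are the same character only if they resolve to the same code point.<br />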

Entities representing special characters in XHTML<br />

The XHTML DTDs explicitly declare 253 entities (including the 5 predefined entities of <strong>XML</strong> 1.0) whose expansion<br />

is a single character, which can therefore be informally referred to as "character entities". These (with the exception<br />

of the &apos; entity) have the same names and represent the same characters as the 252 character entities in HTML.<br />

Also, by virtue of being <strong>XML</strong>, XHTML documents may reference the predefined &apos; entity, which is not one of<br />

the 252 character entities in HTML. Additional entities of any size may be defined on a per-document basis.<br />

However, the usability of entity references in XHTML is affected by how the document is being processed:<br />

• If the document is read by a conforming HTML processor, then only the 252 HTML character entities can safely<br />

be used. The use of &apos; or custom entity references may not be supported and may produce unpredictable<br />

results.<br />

• If the document is read by an <strong>XML</strong> parser that does not or cannot read external entities, then only the five built-in<br />

<strong>XML</strong> character entities (see above) can safely be used, although other entities may be used if they are declared in



the internal DTD subset.<br />

• If the document is read by an <strong>XML</strong> parser that does read external entities, then the five built-in <strong>XML</strong> character<br />

entities can safely be used. The other 248 HTML character entities can be used as long as the XHTML DTD is<br />

accessible to the parser at the time the document is read. Other entities may also be used if they are declared in the<br />

internal DTD subset.<br />

Because of the special &apos; case mentioned above, only &quot;, &amp;, &lt;, and &gt; will work in all processing<br />

situations.<br />
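The second case above (an XML parser that does not read external entities) can be illustrated with a short sketch. It assumes Python's standard xml.etree.ElementTree module, which does not fetch an external DTD; the sample documents are made up for illustration.<br />

```python
import xml.etree.ElementTree as ET

# The five predefined XML entities need no declaration.
predefined = ET.fromstring("<p>&amp;&lt;&gt;&quot;&apos;</p>")
print(predefined.text)

# An HTML entity name such as &eacute; is undefined without a DTD.
try:
    ET.fromstring("<p>caf&eacute;</p>")
    undeclared_ok = True
except ET.ParseError:
    undeclared_ok = False
print("undeclared &eacute; accepted:", undeclared_ok)

# Declaring the entity in the internal DTD subset makes it usable.
declared = ET.fromstring(
    '<!DOCTYPE p [<!ENTITY eacute "&#233;">]><p>caf&eacute;</p>'
)
print(declared.text)
```

Other parsers may behave differently, for example by loading the XHTML DTD when it is accessible, which corresponds to the third case.<br />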

See also<br />

• Character encodings in HTML<br />

• HTML decimal character rendering<br />

• SGML entity<br />

References<br />

• Unicode Consortium [1] . See also: Unicode Consortium<br />

• UnicodeData.txt from the Unicode Consortium [2]<br />

• World Wide Web Consortium [3] . See also: World Wide Web Consortium<br />

• <strong>XML</strong> 1.0 spec [4]<br />

• HTML 2.0 spec [5]<br />

• HTML 3.2 spec [6]<br />

• HTML 4.0 spec [7]<br />

• HTML 4.01 spec [8]<br />

• XHTML 1.0 spec [9]<br />

• <strong>XML</strong> Entity Definitions for Characters [10]<br />

• The normative reference to RFC 2070 (still found in DTDs defining the character entities for HTML or XHTML)<br />

is historic; this RFC (along with other RFCs related to different parts of the HTML specification) has been<br />

deprecated in favor of the newer informational RFC 2854, which defines the "text/html" MIME type and<br />

references directly the W3C specifications for the actual HTML content.<br />

• Numerical Reference of Unicode code points at Wikibooks<br />

External links<br />

• Character entity references in HTML 4 [11] at the W3C<br />

• Multilanguage special character entity list [12] - List of special characters, entities and their names.<br />

References<br />

[1] http://www.unicode.org/<br />

[2] http://www.unicode.org/Public/UNIDATA/UnicodeData.txt<br />

[3] http://www.w3.org/<br />

[4] http://www.w3.org/TR/REC-xml/<br />

[5] http://www.w3.org/MarkUp/html-spec/html-spec_toc.html<br />

[6] http://www.w3.org/TR/REC-html32<br />

[7] http://www.w3.org/TR/1998/REC-html40-19980424/<br />

[8] http://www.w3.org/TR/REC-html40/<br />

[9] http://www.w3.org/TR/xhtml1/<br />

[10] http://www.w3.org/TR/xml-entity-names/<br />

[11] http://www.w3.org/TR/html4/sgml/entities.html<br />

[12] http://www.seomister.com/ch


Log4js 44<br />

Log4js<br />

Developer(s) Stephan Strittmatter, Seth Chisamore [1]<br />

Stable release 1.0 / August 4, 2008<br />

Operating system Windows, Linux, Mac OS<br />

Type Framework<br />

License Apache Software Foundation<br />

Website http://log4js.berlios.de [1]<br />

Log4js is a framework written in JavaScript to log application events.<br />

The framework's API closely follows that of Log4j. Like Log4j, it is available under the licence of the Apache Software<br />

Foundation.<br />

Functionality<br />

The basic concept is identical to that of Log4j: the log levels are the same and almost<br />

all methods are identical.<br />

One special feature of Log4js is the ability to log browser events<br />

remotely on the server. Using Ajax, it is possible to send the<br />

logging events in several formats (<strong>XML</strong>, JSON, plain ASCII etc.) to<br />

the server to be evaluated there.<br />

Appender<br />

The following appenders are currently implemented:<br />

AjaxAppender<br />

Sends the logs via XMLHttpRequest (Ajax) to the server to be processed there.<br />

ConsoleAppender<br />

Logs within the HTML page or in a separate window.<br />

FileAppender<br />

Writes to a local file (Internet Explorer and Mozilla supported).<br />

JSConsoleAppender<br />

Appender for the JavaScript Console of Mozilla, Opera and Safari.<br />

MetatagAppender<br />

Adds the log events as meta tags to the DOM of the document.<br />

WindowsEventsAppender<br />

Using Internet Explorer, it is possible to log to Windows System Events.<br />

Layout<br />

The Layout classes provide different formattings of the events:<br />

BasicLayout<br />

Simple textual output of the events.<br />

HtmlLayout<br />

Formats the event as an HTML element.<br />

JSONLayout<br />

Converts the events to JSON objects, which can be read from many other programming languages, such as Perl, PHP<br />

and Java.<br />

<strong>XML</strong>Layout<br />

<strong>XML</strong> formatted output.<br />
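The division of labour between appenders (where an event goes) and layouts (how it is formatted) can be sketched as follows. This is a hypothetical Python illustration of the pattern only; it is not the Log4js API, and the names LogEvent, ListAppender and Logger are assumptions introduced here.<br />

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class LogEvent:
    level: str
    category: str
    message: str


class BasicLayout:
    """Plain-text formatting of an event."""

    def format(self, event):
        return f"{event.level} {event.category} - {event.message}"


class JSONLayout:
    """JSON formatting, readable from many other languages."""

    def format(self, event):
        return json.dumps(asdict(event), sort_keys=True)


class ListAppender:
    """Hypothetical appender that collects formatted events in memory."""

    def __init__(self, layout):
        self.layout = layout
        self.lines = []

    def append(self, event):
        self.lines.append(self.layout.format(event))


class Logger:
    """Hands every event to each attached appender."""

    def __init__(self, category):
        self.category = category
        self.appenders = []

    def add_appender(self, appender):
        self.appenders.append(appender)

    def info(self, message):
        event = LogEvent("INFO", self.category, message)
        for appender in self.appenders:
            appender.append(event)


text_out = ListAppender(BasicLayout())
json_out = ListAppender(JSONLayout())
log = Logger("demo")
log.add_appender(text_out)
log.add_appender(json_out)
log.info("page loaded")
print(text_out.lines[0])
print(json_out.lines[0])
```

Because the layout is passed to the appender, the same event can be delivered to several destinations in different formats without changing the logging call.<br />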

External links<br />

• Log4js Homepage [1]<br />

• Log4js Wiki [2]<br />

• Apache Logging Homepage [3]<br />

References<br />

[1] http://log4js.berlios.de<br />

[2] http://scratchpad.wikia.com/wiki/Log4js<br />

[3] http://logging.apache.org/


MAREC 46<br />

MAREC<br />

The MAtrixware REsearch Collection (MAREC) is a standardised patent data corpus available for research<br />

purposes. MAREC can be described as a corpus that represents patent documents in several languages in order<br />

to answer specific research questions. [1] [2] It consists of 19 million patent documents in different languages,<br />

normalised to a highly specific <strong>XML</strong> schema.<br />

MAREC is intended as raw material for research in areas such as information retrieval, natural language processing<br />

or machine translation, which require large amounts of complex documents. [3] The collection contains documents in<br />

19 languages, the majority being English, German and French, and about half of the documents include full text.<br />

In MAREC, the documents from different countries and sources are normalised to a common <strong>XML</strong> format with a<br />

uniform patent numbering scheme and citation format. The standardised fields include dates, countries, languages,<br />

references, person names, and companies as well as subject classifications such as IPC codes. [4]<br />

MAREC is a comparable corpus, in which many documents are available in similar versions in other languages. A<br />

comparable corpus consists of texts that share similar topics (for example, news texts from the same time<br />

period in different countries), while a parallel corpus is a collection of documents with aligned translations<br />

from the source to the target language. [5] Because the patent documents refer to the same “invention” or “concept of<br />

idea”, each text describes the same invention but need not be a direct translation of the others; text<br />

parts may have been removed or added for clarification.<br />

The 19,386,697 <strong>XML</strong> files measure a total of 621 GB and are hosted by the Information Retrieval Facility. Access<br />

and support are free of charge for research purposes.<br />

External links<br />

• User guide and statistics [6]<br />

• Information Retrieval Facility [7]<br />

• "One week of MAREC" sample [8]<br />

References<br />

[1] Merz C., (2003) A Corpus Query Tool For Syntactically Annotated Corpora Licentiate Thesis, The University of Zurich, Department of<br />

Computation linguistic, Switzerland<br />

[2] Biber D., Conrad S., and Reppen R. (2000) Corpus Linguistics: Investigating <strong>Language</strong> Structure and Use. Cambridge University Press, 2nd<br />

edition<br />

[3] Manning, C. D. and Schütze, H. (2002) Foundations of statistical natural language processing Cambridge, MA, Massachusetts Institute of<br />

Technology (MIT) ISBN 0-262-13360-1.<br />

[4] European Patent Office (2009) Guidelines for examination in the European Patent Office (http://documents.epo.org/projects/babylon/<br />

eponet.nsf/0/1AFC30805E91D074C125758A0051718A/$File/guidelines_2009_complete_en.pdf), Published by European Patent Office,<br />

Germany (April 2009)<br />

[5] Järvelin A., Talvensaari T., Järvelin Anni (2008) Data driven methods for improving mono- and cross-lingual IR performance in noisy<br />

environments, Proceedings of the second workshop on Analytics for noisy unstructured text data, (Singapore)<br />

[6] http://www.matrixware.com/documentation/marec/index.jsp?topic=/com.MxW.MAREC/ch02.html<br />

[7] http://ir-facility.org<br />

[8] http://matrixware.net/tos/marec/


Media Object Server 47<br />

Media Object Server<br />

Media Object Server (MOS) is an <strong>XML</strong>-based protocol for transferring information between newsroom automation<br />

systems and other associated systems such as media servers.<br />

The MOS protocol allows a variety of devices to be controlled from one central device or piece of software. This<br />

limits the need to have operators in multiple locations throughout the studio environment. For example, multiple<br />

character generators can be fired from a single control workstation, without needing an operator at each CG console.<br />

External references<br />

• http://www.mosprotocol.com/<br />

• http://www.codeproject.com/KB/cs/mosprotocol.aspx by Rizwan Qureshi<br />

METS<br />

The Metadata Encoding and Transmission Standard (METS) is a metadata standard for encoding descriptive, administrative,<br />

and structural metadata regarding objects within a digital library, expressed using the <strong>XML</strong> schema language of the<br />

World Wide Web Consortium. The standard is maintained in the Network Development and MARC Standards Office<br />

of the Library of Congress, and is being developed as an initiative of the Digital Library Federation.<br />

Introduction<br />

METS is an <strong>XML</strong> Schema designed for the purpose of:<br />

• Creating <strong>XML</strong> document instances that express the hierarchical structure of digital library objects.<br />

• Recording the names and locations of the files that comprise those objects.<br />

• Recording associated metadata.<br />

METS can therefore be used as a tool for modeling real-world objects, such as<br />

particular document types.<br />

Depending on its use, a METS document could be used in the role of Submission Information Package (SIP),<br />

Archival Information Package (AIP), or Dissemination Information Package (DIP) within the Open Archival<br />

Information System (OAIS) Reference Model.<br />

Digital libraries vs. traditional libraries<br />

Maintaining a library of digital objects requires maintaining metadata about those objects. The metadata necessary<br />

for successful management and use of digital objects is both more extensive than and different from the metadata<br />

used for managing collections of printed works and other physical materials.<br />

• Where a traditional library may record descriptive metadata regarding a book in its collection, the book will not<br />

dissolve into a series of unconnected pages if the library fails to record structural metadata regarding the book's<br />

organization, nor will scholars be unable to evaluate the book's worth if the library fails to note that the book was<br />

produced using a Ryobi offset press.<br />

• The same cannot be said for a digital library. Without structural metadata, the page image or text files<br />

comprising the digital work are of little use, and without technical metadata regarding the digitization process,<br />

scholars may be unsure of how accurate a reflection of the original the digital version provides.<br />

• However, in a digital library it is possible to create an e-book-like file, such as a PDF or TIFF file, which can be viewed as a single<br />

physical book and reflects the integrity of the original.<br />



Characteristics of METS documents<br />

Any METS document has the following features:<br />

• An open standard (non-proprietary)<br />

• Developed by the library community<br />

• Relatively simple<br />

• Extensible<br />

• Modular<br />

Sections of a METS document<br />

A METS document has 7 sections:<br />

• METS header: Contains metadata describing the METS document itself, such as its creator, editor, etc.<br />

• Descriptive Metadata: May contain internally embedded metadata or point to metadata external to the METS<br />

document. Multiple instances of both internal and external descriptive metadata may be included.<br />

• Administrative Metadata: Provides information regarding how files were created and stored, intellectual<br />

property rights, metadata regarding the original source object from which the digital library object derives, and<br />

information regarding the provenance of files comprising the digital library object (such as master/derivative<br />

relationships, migrations, and transformations). As with descriptive metadata, administrative metadata may be<br />

internally encoded or external to the METS document.<br />

• File Section: Lists all files containing content which comprise the electronic versions of the digital object. file<br />

elements may be grouped within fileGrp elements to subdivide files by object version.<br />

• Structural Map: Outlines a hierarchical structure for the digital library object, and links the elements of that<br />

structure to associated content files and metadata.<br />

• Structural Links: Allows METS creators to record the existence of hyperlinks between nodes in the Structural<br />

Map. This is of particular value in using METS to archive Websites.<br />

• Behavioral: Used to associate executable behaviors with content in the METS object. Each behavior has a<br />

mechanism element identifying a module of executable code that implements behaviors defined abstractly by its<br />

interface definition.
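
The seven sections listed above can be sketched as an empty document skeleton. This is a minimal illustration, not a schema-valid METS file: the element names and namespace are those of the METS schema, while the USE and ID attribute values are hypothetical placeholders, and required attributes and content are omitted.<br />

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"

# Top-level METS elements in schema order, paired with the section names used above.
SECTIONS = [
    ("metsHdr", "METS header"),
    ("dmdSec", "Descriptive Metadata"),
    ("amdSec", "Administrative Metadata"),
    ("fileSec", "File Section"),
    ("structMap", "Structural Map"),
    ("structLink", "Structural Links"),
    ("behaviorSec", "Behavioral"),
]


def build_mets_skeleton():
    """Build a bare <mets> element with one empty child per section."""
    ET.register_namespace("mets", METS_NS)
    root = ET.Element(f"{{{METS_NS}}}mets")
    for tag, _label in SECTIONS:
        ET.SubElement(root, f"{{{METS_NS}}}{tag}")
    # file elements may be grouped within fileGrp elements (placeholder values).
    file_sec = root.find(f"{{{METS_NS}}}fileSec")
    group = ET.SubElement(file_sec, f"{{{METS_NS}}}fileGrp", {"USE": "master"})
    ET.SubElement(group, f"{{{METS_NS}}}file", {"ID": "FILE001"})
    return root


def section_names(root):
    """Return the local names of the top-level children, in document order."""
    return [child.tag.split("}", 1)[1] for child in root]


if __name__ == "__main__":
    print(ET.tostring(build_mets_skeleton(), encoding="unicode"))
```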



METS profiles<br />

METS Profiles are intended to describe a class of METS documents in sufficient detail to provide both document<br />

authors and programmers the guidance they require to create and process METS documents conforming with a<br />

particular profile.<br />

A profile is expressed as an <strong>XML</strong> document. There is a schema for this purpose. The profile expresses the<br />

requirements that a METS document must satisfy. A sufficiently explicit METS Profile may be considered a data<br />

standard.<br />

METS Profiles in use<br />

• Musical Score (may be a score, score and parts, or a set of parts only)<br />

• Print Material (books, pamphlets, etc.)<br />

• Music Manuscript (score or sketches)<br />

• Recorded Event (audio or video)<br />

• PDF Document<br />

• Bibliographic Record<br />

• Photograph<br />

• Compact Disc<br />

• Collection<br />

See also<br />

• Digital Item Declaration <strong>Language</strong><br />

• Dublin Core, an ISO metadata standard<br />

• Preservation Metadata: Implementation Strategies (PREMIS)<br />

• Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH)<br />

External links<br />

• Network Development and MARC Standards Office [1]<br />

• Library of Congress [2]<br />

• Digital Library Federation [3]<br />

• METS Official web site [1]<br />

References<br />

[1] http://www.loc.gov/standards/mets/<br />

[2] http://www.loc.gov/index.html<br />

[3] http://www.diglib.org/


Numeric character reference 50<br />

Numeric character reference<br />

A numeric character reference (NCR) is a common markup construct used in SGML and other SGML-related<br />

markup languages such as HTML and <strong>XML</strong>. It consists of a short sequence of characters that, in turn, represent a<br />

single character from the Universal Character Set (UCS) of Unicode. NCRs are typically used in order to represent<br />

characters that are not directly encodable in a particular document. When the document is interpreted by a<br />

markup-aware reader, each NCR is treated as if it were the character it represents.<br />

Example<br />

In SGML, HTML, and <strong>XML</strong>, the following are all valid numeric character references for the Greek capital letter<br />

Sigma ("Σ"):<br />

Numerical character reference of Unicode character Σ<br />

Σ = U+03A3: GREEK CAPITAL LETTER SIGMA (3A3 in base 16 = 931 in base 10)<br />

Unicode character | Numerical base | Numerical reference in markup | Effect<br />

U+03A3 | Decimal | &#931; | Σ<br />

U+03A3 | Decimal | &#0931; | Σ<br />

U+03A3 | Hexadecimal | &#x3A3; | Σ<br />

U+03A3 | Hexadecimal | &#x03A3; | Σ<br />

U+03A3 | Hexadecimal | &#x3a3; | Σ<br />
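
As a quick check of the table above, the following sketch decodes each of the five references with Python's standard html.unescape function, used here only as a convenient markup-aware reader. The names REFERENCES and decode_all are illustrative.<br />

```python
import html

# The five references from the table above, all denoting U+03A3.
REFERENCES = ["&#931;", "&#0931;", "&#x3A3;", "&#x03A3;", "&#x3a3;"]


def decode_all(references):
    """Replace each numeric character reference with the character it denotes."""
    return [html.unescape(ref) for ref in references]


if __name__ == "__main__":
    decoded = decode_all(REFERENCES)
    print(decoded)
    # Code point of the decoded character, in hexadecimal and in decimal.
    print(hex(ord(decoded[0])), ord(decoded[0]))
```

Leading zeros and the case of the hexadecimal digits do not change the character that is referenced.<br />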

Discussion<br />

<strong>Markup</strong> languages are typically defined in terms of UCS or Unicode characters. That is, a document consists, at its<br />

most fundamental level of abstraction, of a sequence of characters, which are abstract units that exist independently<br />

of any encoding.<br />

Ideally, when the characters of a document utilizing a markup language are encoded for storage or transmission over<br />

a network as a sequence of bits, the encoding that is used will be one that supports representing each and every<br />

character in the document, if not in the whole of Unicode, directly as a particular bit sequence.<br />

Sometimes, though, for reasons of convenience or due to technical limitations, documents are encoded with an<br />

encoding that cannot represent some characters directly. For example, the widely used encodings based on ISO 8859<br />

can only represent, at most, 256 unique characters as one 8-bit byte each.<br />

Documents are rarely, in practice, ever allowed to use more than one encoding internally, so the onus is usually on<br />

the markup language to provide a means for document authors to express unencodable characters in terms of<br />

encodable ones. This is generally done through some kind of "escaping" mechanism.<br />

The SGML-based markup languages allow document authors to use special sequences of characters from the ASCII<br />

range (the first 128 code points of Unicode) to represent, or reference, any Unicode character, regardless of whether<br />

the character being represented is directly available in the document's encoding. These special sequences are<br />

character references.<br />
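
The escaping idea described above can be sketched with Python's built-in "xmlcharrefreplace" error handler, which substitutes a decimal numeric character reference for every character the target encoding cannot represent. The helper name encode_with_ncrs is an assumption made for this example.<br />

```python
def encode_with_ncrs(text, encoding):
    """Encode text, writing unencodable characters as decimal NCRs."""
    return text.encode(encoding, errors="xmlcharrefreplace")


if __name__ == "__main__":
    # ISO 8859-1 can represent "é" directly, but not the Greek letter Sigma.
    print(encode_with_ncrs("café Σ", "iso-8859-1"))
    # ASCII can represent neither, so both are written as references.
    print(encode_with_ncrs("café Σ", "ascii"))
```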

Character references that are based on the referenced character's UCS or Unicode "code point" are called numeric<br />

character references. In HTML 4 and in all versions of XHTML and <strong>XML</strong>, the code point can be expressed either as<br />

a decimal (base 10) number or as a hexadecimal (base 16) number. The syntax is as follows:



Character U+0026 (ampersand), followed by character U+0023 (number sign), followed by one of the following<br />

choices:<br />

• one or more decimal digits zero (U+0030) through nine (U+0039); or<br />

• character U+0078 ("x") followed by one or more hexadecimal digits, which are zero (U+0030) through nine<br />

(U+0039), Latin capital letter A (U+0041) through F (U+0046), and Latin small letter a (U+0061) through f<br />

(U+0066);<br />

all followed by character U+003B (semicolon). Older versions of HTML disallowed the hexadecimal syntax.<br />
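
The syntax just described can be sketched as a regular expression. This is an illustration of the grammar only: it does not check whether the resulting code point is permitted by any particular markup language, and the names NCR_PATTERN and parse_ncr are assumptions made for this example.<br />

```python
import re

# "&#", then decimal digits or "x" plus hexadecimal digits, then ";".
NCR_PATTERN = re.compile(r"&#(?:([0-9]+)|x([0-9A-Fa-f]+));")


def parse_ncr(reference):
    """Return the code point named by a numeric character reference, or None."""
    match = NCR_PATTERN.fullmatch(reference)
    if match is None:
        return None
    decimal_digits, hex_digits = match.groups()
    if decimal_digits is not None:
        return int(decimal_digits, 10)
    return int(hex_digits, 16)


if __name__ == "__main__":
    for candidate in ["&#931;", "&#x3a3;", "&#931", "&#x3G;"]:
        print(candidate, parse_ncr(candidate))
```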

The characters that comprise a numeric character reference can be represented in every character encoding used in<br />

computing and telecommunications today, so there is no risk of the reference itself being unencodable.<br />

There is another kind of character reference called a character entity reference, which allows a character to be<br />

referred to by a name instead of a number. (Naming a character creates a character entity.) HTML defines some<br />

character entities, but not many; all other characters can only be included by direct encoding or using NCRs.<br />

Restrictions<br />

The Universal Character Set defined by ISO 10646 is the "document character set" of SGML and HTML 4, so by<br />

default, any character in such a document, and any character referenced in such a document, must be in the UCS.<br />

While the syntax of SGML does not prohibit references to unassigned code points, such as &#xFFFF;,<br />

SGML-derived markup languages such as HTML and <strong>XML</strong> can, and often do, restrict numeric character references<br />

to only those code points that are assigned to characters or that have not been permanently left unassigned.<br />

Restrictions may also apply for other reasons. For example, in HTML 4, &#12;, which is a reference to a<br />

non-printing "form feed" control character, is allowed because a form feed character is allowed. But in <strong>XML</strong>, the<br />

form feed character cannot be used, not even by reference. As another example, &#128;, which is a reference to<br />

another control character, is not allowed to be used or referenced in either HTML or <strong>XML</strong>, but when used in HTML,<br />

it is usually not flagged as an error by web browsers—some of which attempt to interpret it as a reference to the<br />

character represented by code value 128 in the Windows-1252 encoding: "€", which actually should be represented<br />

as &#8364;. As a further example, prior to the publication of <strong>XML</strong> 1.0 Second Edition on October 6, 2000, <strong>XML</strong> 1.0<br />

was based on an older version of ISO 10646 and prohibited using characters above U+FFFD, except in character<br />

data, thus making a reference like &#65536; (U+10000) illegal. In <strong>XML</strong> 1.1 and newer editions of <strong>XML</strong> 1.0, such a<br />

reference is allowed, because the available character repertoire was explicitly extended.<br />
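
Some of these restrictions can be observed with Python's standard library, taken here as one example of an XML parser and one example of a lenient HTML decoder; other tools may behave differently. The helper name xml_accepts is an assumption made for this sketch.<br />

```python
import html
import xml.etree.ElementTree as ET


def xml_accepts(reference):
    """Return True if the XML parser accepts the reference as element content."""
    try:
        ET.fromstring(f"<r>{reference}</r>")
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    # Form feed is not an allowed XML character, even by reference.
    print(xml_accepts("&#12;"))
    # U+10000 lies above U+FFFD and is accepted by current parsers.
    print(xml_accepts("&#65536;"))
    # This HTML decoder reads &#128; the Windows-1252 way, as the euro sign.
    print(html.unescape("&#128;") == html.unescape("&#8364;"))
```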

<strong>Markup</strong> languages also place restrictions on where character references can occur.<br />

See also<br />

• Character entity reference<br />

• List of <strong>XML</strong> and HTML character entity references


Office Open <strong>XML</strong> 52<br />

Office Open <strong>XML</strong><br />


• Office Open <strong>XML</strong> file formats<br />

• Open Packaging Conventions<br />

• Open Specification Promise<br />

• Vector <strong>Markup</strong> <strong>Language</strong><br />

• Office Open <strong>XML</strong> software<br />

• Comparison of Office Open <strong>XML</strong> software<br />

• Office Open <strong>XML</strong> standardization<br />

Filename extension: .docx or .docm<br />

Internet media type: application/vnd.openxmlformats-officedocument.wordprocessingml.document [1]<br />

Developed by: Microsoft, Ecma, ISO/IEC<br />

Type of format: Document file format<br />

Extended from: <strong>XML</strong>, DOC, WordProcessingML<br />

Standard(s): ECMA-376, ISO/IEC 29500<br />

Website: ECMA-376 [2], ISO/IEC 29500:2008 [3]<br />

Filename extension: .pptx or .pptm<br />

Internet media type: application/vnd.openxmlformats-officedocument.presentationml.presentation [1]<br />

Developed by: Microsoft, Ecma, ISO/IEC<br />

Type of format: Presentation<br />

Extended from: <strong>XML</strong>, PPT<br />

Standard(s): ECMA-376, ISO/IEC 29500<br />

Website: ECMA-376 [2], ISO/IEC 29500:2008 [3]<br />

Filename extension: .xlsx or .xlsm<br />

Internet media type: application/vnd.openxmlformats-officedocument.spreadsheetml.sheet [1]<br />

Developed by: Microsoft, Ecma, ISO/IEC<br />

Type of format: Spreadsheet<br />

Extended from: <strong>XML</strong>, XLS, SpreadsheetML<br />

Standard(s): ECMA-376, ISO/IEC 29500<br />

Website: ECMA-376 [2], ISO/IEC 29500:2008 [3]<br />

Office Open <strong>XML</strong> (also informally known as OO<strong>XML</strong> or Open<strong>XML</strong>) is a zipped, <strong>XML</strong>-based file format<br />

developed by Microsoft [4] for representing spreadsheets, charts, presentations and word processing documents. The<br />

Office Open <strong>XML</strong> specification has been standardised both by Ecma and, in a later edition, by ISO and IEC as an<br />

International Standard (ISO/IEC 29500).<br />

Starting with Microsoft Office 2007, the Office Open <strong>XML</strong> file formats (ECMA-376) have become the default [5]<br />

target file format of Microsoft Office, [6] [7] although the Strict variant of the standard is not fully supported. [8]<br />

Background<br />

In 2000, Microsoft released an initial version of an <strong>XML</strong>-based format for Microsoft Excel, which was incorporated<br />

in Office XP. In 2002, a new file format for Microsoft Word followed. [9] The Excel and Word formats—known as<br />

the Microsoft Office <strong>XML</strong> formats—were later incorporated into the 2003 release of Microsoft Office.<br />

Microsoft announced in November 2005 that it would co-sponsor standardization of the new version of their<br />

<strong>XML</strong>-based formats through Ecma International, as "Office Open <strong>XML</strong>". [10]



Standardization process<br />

Microsoft submitted initial material to Ecma International Technical Committee TC45, where it was standardized to<br />

become ECMA-376, approved in December 2006. [11]<br />

This standard was then fast-tracked in the Joint Technical Committee 1 of ISO and IEC.<br />

After initially failing to pass, an amended version of the format received the necessary votes for approval as an<br />

ISO/IEC Standard as the result of a JTC 1 fast tracking standardization process that concluded in April 2008. [12] The<br />

resulting four-part International Standard (designated ISO/IEC 29500:2008) was published in November 2008 [13]<br />

and can be downloaded from the ITTF. [14] A technically equivalent set of texts is published by Ecma as ECMA-376<br />

Office Open <strong>XML</strong> File Formats — 2nd edition (December 2008); they can be downloaded from their web site. [15]<br />

Licensing<br />

Under the Ecma International code of conduct in patent matters, [16] participating and approving member<br />

organisations of ECMA are required to make available their patent rights on a Reasonable and Non Discriminatory<br />

(RAND) basis.<br />

Holders of patents which concern ISO/IEC International Standards may agree to a standardized license governing the<br />

terms under which such patents may be licensed, in accord with the ISO/IEC/ITU common patent policy. [17]<br />

Microsoft, the main contributor to the standard, provided a Covenant Not to Sue [18] for its patent licensing. The<br />

covenant received a mixed reception, with some, like the Groklaw blog, criticizing it, [19] and others, such as Lawrence<br />

Rosen (an attorney and lecturer at Stanford Law School), endorsing it. [20]<br />

Microsoft has added the format to their Open Specification Promise [21] in which<br />

Microsoft irrevocably promises not to assert any Microsoft Necessary Claims against you for making,<br />

using, selling, offering for sale, importing or distributing any implementation to the extent it conforms to<br />

a Covered Specification […]<br />

This is limited to applications which do not deviate from the ISO/IEC 29500:2008 or Ecma-376 standard and to<br />

parties that do not "file, maintain or voluntarily participate in a patent infringement lawsuit against a Microsoft<br />

implementation of such Covered Specification". [22] [23] The Open Specification Promise was included in documents<br />

submitted to ISO/IEC in support of the ECMA-376 fast track submission. [24] Ecma International asserted that, "The<br />

OSP enables both open source and commercial software to implement [the specification]". [25]<br />

Versions<br />

The Office Open <strong>XML</strong> specification exists in a number of versions.<br />

ECMA-376 1st edition (2006)<br />

The ECMA standard is structured in five parts to meet the needs of different audiences. [15]<br />

Part 1. Fundamentals<br />

Vocabulary, notational conventions and abbreviations<br />

Summary of primary and supporting markup languages<br />

Conformance conditions and interoperability guidelines<br />

Constraints within the Open Packaging Conventions that apply to each document type<br />

Part 2. Open Packaging Conventions<br />

Defines the Open Packaging Conventions (OPC), comprising the package model and the physical package, as used by<br />

various document types in various applications from multiple vendors.



It defines core properties, thumbnails, digital signatures, and authorization and encryption capabilities for<br />

part or all of the contents of the package.<br />

<strong>XML</strong> schemas for the OPC are declared as <strong>XML</strong> Schema Definitions (XSD) and (non-normatively) using<br />

RELAX NG (ISO/IEC 19757-2)<br />

Part 3. Primer<br />

Informative (non-normative) introduction to WordprocessingML, SpreadsheetML, PresentationML,<br />

DrawingML, VML and Shared MLs, providing context and illustrating elements through examples and<br />

diagrams<br />

Describes the custom <strong>XML</strong> data storing facility within a package to support integration with business data<br />

Part 4. <strong>Markup</strong> <strong>Language</strong> Reference<br />

Contains the reference material for WordprocessingML, SpreadsheetML, PresentationML, DrawingML,<br />

Shared MLs and Custom <strong>XML</strong> Schema, defining every element and attribute including the element hierarchy<br />

(parent/child relationships)<br />

<strong>XML</strong> schemas for the markup languages are declared as XSD and (non-normatively) using RELAX NG<br />

Defines the custom <strong>XML</strong> data storing facility<br />

Part 5. <strong>Markup</strong> Compatibility and Extensibility<br />

Describes extension facilities of Open<strong>XML</strong> documents and specifies elements and attributes by which<br />

applications with different extensions can interoperate<br />

ISO/IEC 29500:2008<br />

The ISO/IEC standard is structured into four parts. [26] Parts 1, 2 and 3 are independent standards; for example Part 2,<br />

specifying Open Packaging Conventions, is used by other file formats including XPS and Design Web Format. Part<br />

4 is to be read as a modification to Part 1, on which it depends.<br />

A technically equivalent set of texts is also published by Ecma as ECMA-376 2nd edition (2008).<br />

Part 1 (Fundamentals and <strong>Markup</strong> <strong>Language</strong> Reference)<br />

This part has 5560 pages. It contains:<br />

• Conformance definitions<br />

• Reference material for the <strong>XML</strong> document markup languages defined by the Standard<br />

• <strong>XML</strong> schemas for the document markup languages declared using XSD and (non-normatively) RELAX NG<br />

• Definitions of the foreign markup facilities<br />

Part 2 (Open Packaging Conventions)<br />

This part has 129 pages. It contains:<br />

• A description of the Open Packaging Conventions (package model, physical package)<br />

• Core properties, thumbnails and digital signatures<br />

• <strong>XML</strong> schemas for the OPC, declared using XSD and (non-normatively) RELAX NG<br />

Part 3 (<strong>Markup</strong> Compatibility and Extensibility)<br />

This part has 40 pages. It contains:<br />

• A description of extensions: elements and attributes which define mechanisms allowing applications to specify<br />

alternative means of negotiating content<br />

• Extensibility rules are expressed using NVDL<br />

Part 4 (Transitional Migration Features)<br />

This part has 1464 pages. It contains:



• Legacy material such as compatibility settings and the graphics markup language VML<br />

• A list of syntactic differences between this text and ECMA-376 1st edition<br />

The standard specifies two levels of document and application conformance, strict and transitional, for each of<br />

WordprocessingML, PresentationML and SpreadsheetML. The standard also specifies two application descriptions,<br />

base and full.<br />

Compatibility between versions<br />

The intent of the changes from ECMA-376 1st edition to ISO/IEC 29500:2008 was that a valid ECMA-376<br />

document would be a valid ISO 29500 "transitional" document, [27] but at least one change introduced at the Ballot Resolution Meeting (BRM)<br />

(refusing to allow further values for xsd:boolean) had the effect of breaking backwards compatibility for most<br />

documents. [28] A fix for this has been suggested to ISO/IEC JTC1/SC34/WG4, and was approved in June 2009 to go<br />

forward as a recommendation for the first amendment to Office Open <strong>XML</strong>. [29]<br />
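
To illustrate why restricting an attribute to xsd:boolean can invalidate existing documents, the sketch below checks values against the four lexical forms that XML Schema defines for xsd:boolean ("true", "false", "1" and "0"). The text above does not say which further values were affected, so the "on" and "off" strings in the usage lines are purely hypothetical examples.<br />

```python
# The lexical forms XML Schema allows for xsd:boolean.
XSD_BOOLEAN_LEXICAL_VALUES = frozenset({"true", "false", "1", "0"})


def is_xsd_boolean(value):
    """Return True if value is a valid xsd:boolean literal.

    Surrounding XML whitespace is ignored, because xsd:boolean uses
    whitespace collapsing.
    """
    return value.strip(" \t\r\n") in XSD_BOOLEAN_LEXICAL_VALUES


if __name__ == "__main__":
    # "on" and "off" are hypothetical extra values, used only for illustration.
    for candidate in ["true", "0", "on", "off"]:
        print(candidate, is_xsd_boolean(candidate))
```

A document that used any value outside those four forms would fail validation once the attribute type is plain xsd:boolean.<br />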

File formats<br />

The Office Open <strong>XML</strong> file formats are a set of file formats that can be used to represent electronic office documents.<br />

The format defines a set of <strong>XML</strong> markup vocabularies for word processing documents, spreadsheets and<br />

presentations as well as specific <strong>XML</strong> markup vocabularies for material such as mathematical formulae, graphics,<br />

bibliographies etc. The stated goal of the Office Open <strong>XML</strong> standard is to be capable of faithfully representing the<br />

pre-existing corpus of word-processing documents, spreadsheets and presentations that had been produced by the<br />

Microsoft Office applications and to facilitate extensibility and interoperability by enabling implementations by<br />

multiple vendors and on multiple platforms.<br />

An Office Open <strong>XML</strong> file is a ZIP-compatible OPC package containing <strong>XML</strong> documents and other resources. That<br />

is, one can see the insides of a .xlsm file, for example, by renaming it as a .zip file. Then, the file can be opened by any<br />

zip tool and the actual .xml files contained therein can be viewed in a web browser or a plain text editor.<br />
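
The same inspection can be done programmatically. The sketch below uses Python's zipfile module, which reads the package directly with no renaming needed. To stay self-contained it first builds a tiny stand-in package in memory: the part names are typical of a spreadsheet package, but the placeholder contents are assumptions and do not form a valid Office Open XML document.<br />

```python
import io
import zipfile


def list_package_parts(package):
    """Return the names of the parts stored in a ZIP-based package."""
    with zipfile.ZipFile(package) as archive:
        return archive.namelist()


def read_part(package, part_name):
    """Return one part of the package as text."""
    with zipfile.ZipFile(package) as archive:
        return archive.read(part_name).decode("utf-8")


def make_demo_package():
    """Build an in-memory stand-in package (placeholder contents only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("[Content_Types].xml", "<Types/>")
        archive.writestr("xl/workbook.xml", "<workbook/>")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    demo = make_demo_package()
    print(list_package_parts(demo))
    print(read_part(demo, "xl/workbook.xml"))
```

For a real file, a path such as "report.xlsx" (a hypothetical name) can be passed in place of the in-memory buffer.<br />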

Adoption<br />

Several countries have formally announced either adoption, or the evaluation of adoption, of Office Open <strong>XML</strong>,<br />

while others have rejected it completely. What adoption means varies: in some cases the Office Open <strong>XML</strong> standard has a national standard<br />

identifier; in some cases the standard is permitted to be used where national regulation says that<br />

non-proprietary formats must be used; in other cases, some government body has actually decided that<br />

Office Open <strong>XML</strong> will be used in some specific context; and in still other cases, some government body has decided<br />

that it will not use Office Open <strong>XML</strong> at all.<br />

Belgium<br />

Belgium's Federal Public Service for Information and Communication Technology in 2006 was evaluating the<br />

adoption of the Office Open <strong>XML</strong> format. It had already confirmed by then that it would consider all ISO standards to<br />

be open standards, mentioning Office Open <strong>XML</strong> as such a possible future ISO standard. [30]<br />

Denmark<br />

In June 2007, the Danish Ministry of Science, Technology and Innovation recommended that, beginning<br />

January 1, 2008, public authorities must support at least one of the two word processing document formats<br />

Office Open <strong>XML</strong> or Open Document Format in all new IT solutions, where appropriate. [31]<br />

Germany<br />

In Germany the Office Open <strong>XML</strong> standard is currently under observation by the governmental office for<br />

standards in public IT ("Koordinierungs- und Beratungsstelle der Bundesregierung für Informationstechnik in<br />

der Bundesverwaltung", KBSt). The latest release of "SAGA" (Standards and Architectures for<br />

E-Government-Applications) includes Office Open <strong>XML</strong> file formats. The standard may be used to exchange<br />

complex documents when further processing is required. [32]<br />

Japan<br />

On June 29, 2007, the government of Japan published a new interoperability framework which gives<br />

preference to the procurement of products that follow open standards. [33] [34] On July 2, the government<br />

declared its view that formats like Office Open <strong>XML</strong>, which organizations such as Ecma<br />

International and ISO had also approved, were open standards. It also said that whether a format is open is<br />

one of the criteria for choosing which software the government deploys.<br />

Lithuania<br />

The Lithuanian Standards Board has adopted the ISO/IEC 29500:2008 Office Open <strong>XML</strong> format standard as a<br />

Lithuanian National standard. The decision was made by Technical Committee 4 Information Technology on<br />

March 5, 2009. The proposal to adopt the Office Open <strong>XML</strong> format standard was submitted by the Lithuanian<br />

Archives Department under the Government of the Republic of Lithuania. [35]<br />

Norway<br />

Norway's Ministry of Government Administration and Reform is evaluating the adoption of the Office Open<br />

<strong>XML</strong> format. The ministry put the document standard under observation in December 2007. [36]<br />

Sweden<br />

The Kingdom of Sweden has adopted Office Open <strong>XML</strong> as a four-part Swedish National Standard, SS-ISO/IEC<br />

29500:2009. [37] [38] [39] [40]<br />

Switzerland<br />

In July 2007, the Swiss Federal Council announced that adherence to the SAGA.ch e-Government standards is mandatory<br />

for its departments as well as for cantons, cities and municipalities. The latest version of SAGA.ch includes<br />

Office Open <strong>XML</strong> file formats. [41]<br />

United Kingdom<br />

The UK has put out an action plan for use of open standards, which includes ISO/IEC 29500 as one of several<br />

formats to be supported. [42] [43]<br />

United States of America<br />

On April 15, 2009, the ANSI-accredited INCITS organisation voted to adopt ISO/IEC 29500:2008 as an<br />

American National Standard. [44]<br />

The state of Massachusetts has been examining its options for implementing <strong>XML</strong>-based document<br />

processing. In early 2005, Eric Kriss, Secretary of Administration and Finance in Massachusetts, was the first<br />

government official in the United States to publicly connect open formats to a public policy purpose: "It is an<br />

overriding imperative of the American democratic system that we cannot have our public documents locked up<br />

in some kind of proprietary format, perhaps unreadable in the future, or subject to a proprietary system license<br />

that restricts access". [45] Since 2007 Massachusetts has classified Office Open <strong>XML</strong> as "Open Format" and has<br />

amended [46] its approved technical standards list — the Enterprise Technical Reference Model (ETRM) — to<br />

include Office Open <strong>XML</strong>. Massachusetts, under heavy pressure from some vendors, now formally endorses<br />

Office Open <strong>XML</strong> formats for its public records. [47]



Application support<br />

Starting with Microsoft Office 2007, the Office Open <strong>XML</strong> file formats (ECMA-376) have become the default [5] file<br />

format of Microsoft Office. [6] [7] However, due to the changes introduced in a later version, Office 2007 is not<br />

entirely in compliance with ISO/IEC 29500:2008. [48] [49] [50] [51] Microsoft Office 2010 includes support for the<br />

ISO/IEC 29500:2008 compliant version of Office Open <strong>XML</strong>. [49] Office 2010 does not yet support saving<br />

documents that conform to the strict schema of the ISO/IEC 29500:2008 specification, but saves documents that conform to the<br />

transitional schema of the ISO/IEC 29500:2008 specification. [52] [53] The intent of the ISO/IEC is to allow the<br />

removal of the transitional variant from the ISO/IEC 29500 standard. [53]<br />

The SoftMaker Office 2010 Suite claims to be able to reliably read and write .DOCX and .XLSX files in its word<br />

processor and spreadsheet applications.<br />

The OpenOffice.org office suite has been able to import Office Open <strong>XML</strong> files (.docx, .xlsx, .pptx, etc.) since<br />

version 3. [54]<br />

The KOffice office suite has been able to import Office Open <strong>XML</strong> files since version 2.2.<br />

Other mainstream Office products that have started to offer import support for the Office Open <strong>XML</strong> formats are<br />

Apple's TextEdit (included with Mac OS X) and iWork, IBM Lotus Notes, Corel WordPerfect, Kingsoft Office and<br />

Google Apps.<br />

Controversies<br />

The ISO standardization of Office Open <strong>XML</strong> was controversial and embittered. According to InfoWorld:<br />

OO<strong>XML</strong> was opposed by many on grounds it was unneeded, as software makers could use<br />

OpenDocument Format (ODF), a less complicated office software format that was already an<br />

international standard. [55]<br />

The same InfoWorld article reported that IBM (which supports the ODF format) threatened to leave standards bodies<br />

that it said allow dominant corporations like Microsoft to wield undue influence. Microsoft was accused of co-opting<br />

the standardization process by leaning on countries to ensure that it got enough votes at the ISO for Office Open<br />

<strong>XML</strong> to pass. [56]<br />

Richard Stallman of the Free Software Foundation has stated that "Microsoft offers a gratis patent license for<br />

OO<strong>XML</strong> on terms which do not allow free implementations." [57]<br />

See also<br />

• List of document markup languages<br />

• Comparison of document markup languages<br />

• Open Document Format<br />

External links<br />

• ECMA-376 site [2]<br />

• ISO/IEC 29500:2008 [3]<br />

• Open<strong>XML</strong>Developer.org [58], Microsoft's site for developers<br />

• Open <strong>XML</strong> Community site [59], Microsoft's site for customers and partners<br />

• "The WordprocessingML Vocabulary", sample chapter from O'Reilly book Office 2003 <strong>XML</strong> [60] (PDF, 1.22 MB)<br />

• OpenOffice.org [61], "How do I open Microsoft Office 2007 files?", article by OpenOffice.org<br />

• Information technology -- Office Open <strong>XML</strong> file formats [62], ISO Standards, JTC 1 Information technology, SC 34<br />

• FAQs on ISO/IEC 29500 [63], ISO's FAQ site on ISO/IEC 29500



• DOCX reference document [64], contains a file with fairly complex formatting and can be used to quickly check<br />

compatibility of an implementation<br />

• Open<strong>XML</strong> site [65], contains resources, articles and tools for Office Open <strong>XML</strong><br />

• Interoperability study [66] showing an indication of the percentage of support for Office Open <strong>XML</strong> by several<br />

different office suite implementations in August 2008<br />

References<br />

[1] Microsoft. "Register file extensions on third party servers" (http://technet.microsoft.com/en-us/library/cc179224.aspx). microsoft.com. Retrieved 2009-09-04.<br />

[2] http://www.ecma-international.org/publications/standards/Ecma-376.htm<br />

[3] http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_tc_browse.htm?commid=45374<br />

[4] "Q&A: Microsoft Co-Sponsors Submission of Office Open <strong>XML</strong> Document Formats to Ecma International for Standardization" (https://www.microsoft.com/presspass/features/2005/nov05/11-21Ecma.mspx). Microsoft. 2005-11-21.<br />

[5] "Microsoft Expands List of Formats Supported in Microsoft Office" (http://www.microsoft.com/Presspass/press/2008/may08/05-21ExpandedFormatsPR.mspx?rss_fdn=Press Releases). Microsoft. Retrieved 2008-05-21.<br />

[6] "Microsoft's future lies somewhere beyond the Vista by Evansville Courier & Press" (http://www.courierpress.com/news/2008/oct/24/microsofts-future-lies-somewhere-beyond-the/). Courierpress.com. Retrieved 2009-05-19.<br />

[7] "Rivals Set Their Sights on Microsoft Office: Can They Topple the Giant? - Knowledge@Wharton" (http://knowledge.wharton.upenn.edu/article.cfm?articleid=1795). Knowledge.wharton.upenn.edu. Retrieved 2009-05-19.<br />

[8] ISO OO<strong>XML</strong> convener: Microsoft's format "heading for failure" (http://arstechnica.com/microsoft/news/2010/04/iso-ooxml-convener-microsofts-format-heading-for-failure.ars)<br />

[9] Brian Jones (2007-01-25). "History of office <strong>XML</strong> formats (1998–2006)" (http://blogs.msdn.com/brian_jones/archive/2007/01/25/office-xml-formats-1998-2006.aspx). MSDN blogs.<br />

[10] "Microsoft Co-Sponsors Submission of Office Open <strong>XML</strong> Document Formats to Ecma International for Standardization" (http://www.microsoft.com/presspass/features/2005/nov05/11-21Ecma.mspx). Microsoft. 2005-11-21.<br />

[11] "Ecma International approves Office Open <strong>XML</strong> standard" (http://www.ecma-international.org/news/PressReleases/PR_TC45_Dec2006.htm). Ecma International. 2006-12-07.<br />

[12] "ISO/IEC DIS 29500 receives necessary votes for approval as an International Standard" (http://www.iso.org/iso/pressrelease.htm?refid=Ref1123). ISO. 2008-04-02.<br />

[13] ISO/IEC (2008-11-18). "Publication of ISO/IEC 29500:2008, Information technology — Office Open <strong>XML</strong> formats" (http://www.iso.org/iso/pressrelease.htm?refid=Ref1181). ISO. Retrieved 2008-11-19.<br />

[14] "Freely Available Standards" (http://standards.iso.org/ittf/PubliclyAvailableStandards/index.html). ITTF (ISO/IEC). 2008-11-18.<br />

[15] "Standard ECMA-376" (http://www.ecma-international.org/publications/standards/Ecma-376.htm). Ecma-international.org. Retrieved 2009-05-19.<br />

[16] "Code of Conduct in Patent Matters" (http://www.ecma-international.org/memento/codeofconduct.htm). Ecma International.<br />

[17] "ISO/IEC/ITU common patent policy" (http://isotc.iso.org/livelink/livelink/fetch/2000/2122/3770791/Common_Policy.htm).<br />

[18] "Microsoft Covenant Regarding Office 2003 <strong>XML</strong> Reference Schemas" (http://www.microsoft.com/office/xml/covenant.mspx). Microsoft. Retrieved 2006-07-11.<br />

[19] "2 Escape Hatches in MS's Covenant Not to Sue" (http://www.groklaw.net/articlebasic.php?story=20051202135844482). Groklaw. Retrieved 2007-01-29.<br />

[20] Berlind, David (November 28, 2005). "Top open source lawyer blesses new terms on Microsoft's <strong>XML</strong> file format" (http://blogs.zdnet.com/BTL/?p=2192). ZDNet. Retrieved 2007-01-27.<br />

[21] "Microsoft Open Specification Promise" (http://www.microsoft.com/interop/osp/default.mspx). Microsoft. 2006-09-12. Retrieved 2007-04-22.<br />

[22] (http://www.ecma-international.org/publications/index.html). Ecma International. "Ecma Standards and Technical Reports are made available to all interested persons or organizations, free of charge and licensing restrictions"

[23] "Microsoft Open Specification Promise" (http://www.microsoft.com/Interop/osp/default.mspx). Microsoft.com. .<br />

[24] "Licensing conditions that Microsoft offers for Office Open <strong>XML</strong>" (http://www.jtc1sc34.org/repository/0810c.htm). Jtc1sc34.org.<br />

2006-12-20. . Retrieved 2009-05-19.<br />

[25] "Microsoft Word — Responses to Comments and Perceived Contradictions.doc" (http://www.ecma-international.org/news/<br />

TC45_current_work/Ecma responses.pdf) (PDF). . Retrieved 2009-09-16.<br />

[26] "ISO (You searched for "29500" in title and abstract" (http://www.iso.org/iso/search.htm?qt=29500&published=on&<br />

active_tab=standards). International Organization for Standardization. 2009-06-05. .<br />

[27] "Re-introducing on/off-values to ST-OnOff in OO<strong>XML</strong> Part 4" (http://idippedut.dk/post/2009/06/23/<br />

Re-introducing-onoff-values-to-ST-OnOff-in-OO<strong>XML</strong>-Part-4.aspx). . Retrieved 2009-09-29.<br />

[28] "OO<strong>XML</strong> and Office 2007 Conformance: a Smoke Test" (http://www.adjb.net/post/<br />

OO<strong>XML</strong>-and-Office-2007-Conformance-a-Smoke-Test.aspx). . Retrieved 2009-09-29.



[29] "Minutes of the Copenhagen Meeting of ISO/IEC JTC1/SC34/WG4" (http://www.itscj.ipsj.or.jp/sc34/open/1239.pdf). 2009-06-22. .<br />

Retrieved 2009-09-29. page 15<br />

[30] "FED13321-docsPeterStrickx.indd" (http://www.fedict.belgium.be/nl/binaries/Open_Standaarden_NL_V1_tcm167-16667.pdf) (PDF).<br />

. Retrieved 2009-09-16.<br />

[31] "Bilag 8 – Sammenligning af rapporten om "Estimering af omkostningerne ved indførelse af Office Open <strong>XML</strong> (OO<strong>XML</strong>) og Open<br />

Document Format (ODF) i centraladministrationen" i forhold til de spørgsmål, der skal belyses i de økonomiske konsekvensvurderinger, jf.<br />

rapporten om "Anvendelse af åbne standarder i det offentlige"" (http://vtu.dk/nyheder/aktuelle-temaer/2007/aabne-standarder/bilag/<br />

bilag-8.html/). Vtu.dk. . Retrieved 2009-05-19.<br />

[32] "SAGA 4.0" (http://gsb.download.bva.bund.de/KBSt/SAGA/SAGA_v4.0.pdf) (PDF). . Retrieved 2009-09-16.<br />

[33] Gardner, David (2007-07-10). "Office Software Formats Battle Moves To Asia" (http://www.informationweek.com/news/showArticle.<br />

jhtml?articleID=201000546). Information Week. . Retrieved 2007-07-27.<br />

[34] "Interoperability framework for information systems (in Japanese)" (http://www.meti.go.jp/press/20070629014/20070629014.html).<br />

Ministry of Economy, Trade and Industry, Japan. 2007-06-29. . Retrieved 2007-07-27.<br />

[35] "Latest News" (http://www.openxmlcommunity.com/latestnews.aspx). Open <strong>XML</strong> Community. . Retrieved 2009-05-19.<br />

[36] "Referansekatalog for IT-standarder i offentlig sektor" (http://www.regjeringen.no/en/dep/fad/Documents/Rundskriv/2007/<br />

Referansekatalog-for-IT-standarder-i-off.html?id=494951). regjeringen.no. . Retrieved 2009-05-19.<br />

[37] "SS-ISO/IEC 29500-1:2009" (http://www.sis.se/DesktopDefault.aspx?tabName=@DocType_1&Doc_ID=68693&PresID=2&<br />

Desc=SS-ISO/IEC 29500-1:2009). Sis.se. 2009-01-19. . Retrieved 2009-09-16.<br />

[38] "SS-ISO/IEC 29500-2:2009" (http://www.sis.se/DesktopDefault.aspx?tabName=@DocType_1&Doc_ID=68694&PresID=1&<br />

Desc=SS-ISO/IEC 29500-2:2009). Sis.se. . Retrieved 2009-09-16.<br />

[39] "SS-ISO/IEC 29500-3:2009" (http://www.sis.se/DesktopDefault.aspx?tabName=@DocType_1&Doc_ID=68695&PresID=2&<br />

Desc=SS-ISO/IEC 29500-3:2009). Sis.se. . Retrieved 2009-09-16.<br />

[40] "SS-ISO/IEC 29500-4:2009" (http://www.sis.se/DesktopDefault.aspx?tabName=@DocType_1&Doc_ID=68696&PresID=1&<br />

Desc=SS-ISO/IEC 29500-4:2009). Sis.se. . Retrieved 2009-09-16.<br />

[41] "eCH — Downloads | Standards/Normes | eCH-0014 d SAGA.ch" (http://www.ech.ch/index.php?option=com_docman&<br />

task=cat_view&gid=92&lang=en). Ech.ch. . Retrieved 2009-05-19.<br />

[42] "Open Source, Open Standards and Re–Use: Government Action Plan" (http://www.cabinetoffice.gov.uk/government_it/open_source/<br />

action.aspx). UK Government Cabinet Office. 2009-02-24. .<br />

[43] Rick Jelliffe (2009-02-26). "Open standards: the UK gets it, probably" (http://broadcast.oreilly.com/2009/02/<br />

open-standards-the-uk-gets-it.html). .<br />

[44] "INCITS Letter Ballot 3025" (http://ballot.itic.org/itic/archive.taf?function=detail&ballot_id=3025&<br />

_UserReference=9B6726AA59D4BAC249E6E82E). INCITS. 2009-04-15. .<br />

[45] "Informal comments on Open Formats" (http://web.archive.org/web/20061013201242/http://www.mass.gov/eoaf/<br />

open_formats_comments.html). Web.archive.org. . Retrieved 2009-09-16.<br />

[46] http://www.mass.gov/?pageID=itdterminal&L=3&L0=Home&L1=Policies%2c+Standards+%26+Guidance&L2=Drafts+for+<br />

Review&sid=Aitd&b=terminalcontent&f=policies_standards_etrmv4_etrmv4dot0revisions&csid=Aitd<br />

[47] "Cover Pages: Major Revision of Massachusetts Enterprise Technical Reference Model (ETRM)" (http://xml.coverpages.org/<br />

ni2007-07-03-a.html). Xml.coverpages.org. . Retrieved 2009-05-19.<br />

[48] "OO<strong>XML</strong> Implementations: A Community of One" (http://www.odfalliance.org/resources/IssueBriefImplementations.pdf). ODF<br />

Alliance. 2008-02-20. . Retrieved 2009-05-19.<br />

[49] "Microsoft Expands List of Formats Supported in Microsoft Office" (http://www.microsoft.com/Presspass/press/2008/may08/<br />

05-21ExpandedFormatsPR.mspx). Microsoft.com. 2008-05-21. . Retrieved 2009-05-19.<br />

[50] Lai, Eric (2008-05-27). = 141&pageNumber=1 "FAQ: Office 14 and Microsoft's support for ODF" (http://www.computerworld.com/<br />

action/article.do?command=viewArticleBasic&taxonomyName=Protocols+and+Standards&articleId=9089258&taxonomyId).<br />

Computerworld.com. = 141&pageNumber=1. Retrieved 2009-05-19.<br />

[51] Andy Updegrove. "Microsoft Office 2007 to Support ODF — and not OO<strong>XML</strong>" (http://consortiuminfo.org/standardsblog/article.<br />

php?story=20080521092930864). ConsortiumInfo.org. . Retrieved 2009-05-19.<br />

[52]



[59] http://www.openxmlcommunity.org/<br />

[60] http://www.oreilly.com/catalog/officexml/chapter/ch02.pdf<br />

[61] http://wiki.services.openoffice.org/wiki/Documentation/FAQ/General/OpeningMSO2007Files<br />

[62] http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=45515<br />

[63] http://www.iso.org/iso/faqs_isoiec29500<br />

[64] http://katana.oooninja.com/w/reference_sample_documents<br />

[65] http://www.openxml.biz/<br />

[66] http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1201708<br />

Office Open <strong>XML</strong> file formats<br />

Office Open <strong>XML</strong><br />

• Office Open <strong>XML</strong> file formats<br />

• Open Packaging Conventions<br />

• Open Specification Promise<br />

• Vector <strong>Markup</strong> <strong>Language</strong><br />

• Office Open <strong>XML</strong> software<br />

• Comparison of Office Open <strong>XML</strong> software<br />

• Office Open <strong>XML</strong> standardization<br />

Filename extension: .docx or .docm<br />

Internet media type: application/vnd.openxmlformats-officedocument.wordprocessingml.document [1]<br />

Developed by: Microsoft, Ecma, ISO/IEC<br />

Type of format: Document file format<br />

Extended from: <strong>XML</strong>, DOC, WordProcessingML<br />

Standard(s): ECMA-376, ISO/IEC 29500<br />

Website: ECMA-376 [2], ISO/IEC 29500:2008 [3]<br />



Filename extension: .pptx or .pptm<br />

Internet media type: application/vnd.openxmlformats-officedocument.presentationml.presentation [1]<br />

Developed by: Microsoft, Ecma, ISO/IEC<br />

Type of format: Presentation<br />

Extended from: <strong>XML</strong>, PPT<br />

Standard(s): ECMA-376, ISO/IEC 29500<br />

Website: ECMA-376 [2], ISO/IEC 29500:2008 [3]<br />

Filename extension: .xlsx or .xlsm<br />

Internet media type: application/vnd.openxmlformats-officedocument.spreadsheetml.sheet [1]<br />

Developed by: Microsoft, Ecma, ISO/IEC<br />

Type of format: Spreadsheet<br />

Extended from: <strong>XML</strong>, XLS, SpreadsheetML<br />

Standard(s): ECMA-376, ISO/IEC 29500<br />

Website: ECMA-376 [2], ISO/IEC 29500:2008 [3]<br />

The Office Open <strong>XML</strong> file formats are a set of file formats that can be used to represent electronic office<br />

documents. There are formats for word processing documents, spreadsheets and presentations as well as specific<br />

formats for material such as mathematical formulae, graphics, bibliographies etc.<br />

The formats were developed by Microsoft and first appeared in Microsoft Office 2007. They were standardized<br />

between December 2006 and November 2008, first by the Ecma International consortium, where they became



ECMA-376, and subsequently, after a contentious standardization process, by the ISO/IEC's Joint Technical<br />

Committee 1, where they became ISO/IEC 29500:2008.<br />

Container<br />

Office Open <strong>XML</strong> documents are stored in Open Packaging<br />

Convention (OPC) packages, which are ZIP files containing<br />

<strong>XML</strong> and other data files, along with a specification of the<br />

relationships between them. [2] Depending on the type of the<br />

document, the packages have different internal directory<br />

structures and names. An application will use the relationships<br />

files to locate individual sections (files), with each having<br />

accompanying metadata, in particular MIME metadata.<br />

A basic package contains an <strong>XML</strong> file called<br />

[Content_Types].xml at the root, along with three directories:<br />

_rels, docProps, and a directory specific for the document<br />

type (for example, in a .docx word processing package, there<br />

would be a word directory). The word directory contains the<br />

document.xml file which is the core content of the document.<br />
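The package layout described above can be explored with any ZIP library. The following is a minimal sketch in Python, assuming placeholder part contents (they are not valid Office Open XML markup) and the hypothetical helper names build_package, list_parts and top_level_directories.<br />

```python
import io
import zipfile

# Part names are the ones mentioned in the text; the contents are
# placeholders only, not valid Office Open XML markup.
PARTS = {
    "[Content_Types].xml": "<Types/>",
    "_rels/.rels": "<Relationships/>",
    "docProps/core.xml": "<coreProperties/>",
    "word/document.xml": "<document/>",
}


def build_package(parts):
    """Write the given parts into an in-memory ZIP archive and return its bytes."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, content in parts.items():
            archive.writestr(name, content)
    return buffer.getvalue()


def list_parts(package_bytes):
    """Return the sorted names of all entries stored in the package."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as archive:
        return sorted(archive.namelist())


def top_level_directories(package_bytes):
    """Return the sorted top-level directory names used by the package."""
    names = list_parts(package_bytes)
    return sorted({name.split("/", 1)[0] for name in names if "/" in name})


package = build_package(PARTS)
print(list_parts(package))
print(top_level_directories(package))
```

Running list_parts on a real .docx file (read as bytes) should show the same root file and the same three directories, alongside whatever additional parts the producing application wrote.<br />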

Figure: Container structure of Part 2 of the Ecma Office Open <strong>XML</strong> standard, ECMA-376<br />

[Content_Types].xml<br />

This file provides MIME type information for parts of the package, using defaults for certain file extensions and overrides for parts specified by IRI.<br />

_rels<br />

This directory contains relationships for the files within the package. To find the relationships for a specific file, look for the _rels directory that is a sibling of the file, and then for a file that has the original file name with .rels appended to it. For example, if the content types file had any relationships, there would be a file called [Content_Types].xml.rels inside the _rels directory.<br />

_rels/.rels<br />

This file is where the package relationships are located. Applications look here first. Viewed in a text editor, it outlines each relationship for that section. In a minimal document containing only the basic document.xml file, the relationships detailed are metadata and document.xml.<br />

docProps/core.xml<br />

This file contains the core properties for any Office Open <strong>XML</strong> document.<br />

word/document.xml<br />

This file is the main part for any Word document.<br />
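The default-and-override lookup performed through [Content_Types].xml can be sketched as follows. This is an illustration rather than a conforming implementation: the sample file is hypothetical, the function name content_type_for is made up, and the case-insensitive comparison is an assumption about how part names are matched.<br />

```python
import xml.etree.ElementTree as ET

# Namespace of the content types part in the Open Packaging Conventions.
CT_NS = "{http://schemas.openxmlformats.org/package/2006/content-types}"

# Hypothetical [Content_Types].xml for a minimal word processing package.
CONTENT_TYPES_XML = """<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">
  <Default Extension="rels"
           ContentType="application/vnd.openxmlformats-package.relationships+xml"/>
  <Default Extension="xml" ContentType="application/xml"/>
  <Override PartName="/word/document.xml"
            ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"/>
</Types>"""


def content_type_for(part_name, types_xml):
    """Return the MIME type of a part, or None if nothing matches."""
    root = ET.fromstring(types_xml)
    # An Override names one specific part and wins over any Default.
    for override in root.findall(CT_NS + "Override"):
        if override.get("PartName", "").lower() == part_name.lower():
            return override.get("ContentType")
    # Otherwise fall back to the Default registered for the file extension.
    base = part_name.rsplit("/", 1)[-1]
    extension = base.rpartition(".")[2].lower() if "." in base else ""
    for default in root.findall(CT_NS + "Default"):
        if default.get("Extension", "").lower() == extension:
            return default.get("ContentType")
    return None


print(content_type_for("/word/document.xml", CONTENT_TYPES_XML))
print(content_type_for("/_rels/.rels", CONTENT_TYPES_XML))
```

Checking overrides before defaults mirrors the description above: a default covers every part with a given extension, while an override singles out one part.<br />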

Relationships<br />

An example relationship file (word/_rels/document.xml.rels), is:<br />

<br />

<br />



Target="http://en.wikipedia.org/images/wiki-en.png"<br />

TargetMode="External" /><br />

<br />

<br />

As such, images referenced in the document can be found in the relationship file by looking for all relationships that<br />

are of type http://schemas.microsoft.com/office/2006/relationships/image. To change the used image, edit the<br />

relationship.<br />
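Finding the images in a relationships file, as described above, amounts to filtering relationships by type. The sketch below uses a hypothetical relationships file: the Id values and the hyperlink entry are made up, and the image type URI is the one quoted in the text (other versions of the format may use a different URI, so the code only checks how the type ends).<br />

```python
import xml.etree.ElementTree as ET

# Hypothetical contents of word/_rels/document.xml.rels. The Id values and
# the hyperlink entry are invented for illustration.
RELS_XML = """<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
  <Relationship Id="rId1"
      Type="http://schemas.microsoft.com/office/2006/relationships/image"
      Target="http://en.wikipedia.org/images/wiki-en.png"
      TargetMode="External" />
  <Relationship Id="rId2"
      Type="http://schemas.microsoft.com/office/2006/relationships/hyperlink"
      Target="http://www.example.org/"
      TargetMode="External" />
</Relationships>"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def relationships(xml_text):
    """Return the attributes of every Relationship element as dictionaries."""
    root = ET.fromstring(xml_text)
    return [dict(element.attrib) for element in root.iter()
            if local_name(element.tag) == "Relationship"]


def image_targets(xml_text):
    """Return the targets of all relationships whose type denotes an image."""
    return [rel["Target"] for rel in relationships(xml_text)
            if rel.get("Type", "").endswith("/relationships/image")]


print(image_targets(RELS_XML))
```

Changing the image then means rewriting the Target attribute of the matching relationship and saving the file back into the package.<br />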

The following code shows an example of inline markup for a hyperlink:<br />

<br />

In this example, the Uniform Resource Locator (URL) is represented by "rId2". The actual URL is in the<br />

accompanying relationships file, located by the corresponding "rId2" item. Linked images, templates, and other<br />

items are referenced in the same way.<br />

Pictures can be embedded or linked using a tag:<br />

<br />

This is the reference to the image file. All references are managed via relationships. For example, document.xml has a relationship to the image. There is a _rels directory in the same directory as document.xml, and inside _rels is a file called document.xml.rels. This file contains a relationship definition with a type, an ID and a location. The ID is the one referenced in the <strong>XML</strong> document. The type is a reference schema definition for the media type, and the location is either an internal location within the ZIP package or an external location defined with a URL.<br />
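The two lookups described in this section, locating the relationships file for a part and turning a relationship's location into something usable, can be sketched as path arithmetic. The function names and the media/image1.png target are hypothetical, and the handling of targets that start with a slash is an assumption.<br />

```python
import posixpath


def rels_part_for(source_part):
    """Return the relationships file for a part.

    The file lives in a _rels directory that is a sibling of the part and is
    named after the part with .rels appended. An empty string stands for the
    package itself, whose relationships are in _rels/.rels.
    """
    directory, filename = posixpath.split(source_part)
    return posixpath.join(directory, "_rels", filename + ".rels")


def resolve_target(source_part, target, target_mode="Internal"):
    """Resolve a relationship target to a package path or an external URL."""
    if target_mode == "External":
        # External targets are URLs and are used as they are.
        return target
    if target.startswith("/"):
        # Assumption: a leading slash means "relative to the package root".
        return target.lstrip("/")
    # Internal targets are resolved relative to the directory of the source part.
    base = posixpath.dirname(source_part)
    return posixpath.normpath(posixpath.join(base, target))


print(rels_part_for("word/document.xml"))
print(rels_part_for(""))
print(resolve_target("word/document.xml", "media/image1.png"))
print(resolve_target("word/document.xml",
                     "http://en.wikipedia.org/images/wiki-en.png", "External"))
```

An application would look up the ID found in the document markup inside the file returned by rels_part_for, then pass that relationship's location and mode to resolve_target.<br />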

Document properties<br />

Office Open <strong>XML</strong> uses the Dublin Core Metadata Element Set and DCMI Metadata Terms to store document<br />

properties. Dublin Core is a standard for cross-domain information resource description and is defined in ISO 15836:2003. [3]<br />

An example document properties file (docProps/core.xml) that uses Dublin Core metadata, is:<br />

<br />



2008-06-19T20:00:00Z<br />

2008-06-19T20:42:00Z<br />

Document file format<br />

Final<br />

<br />
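Reading such a properties file takes only a namespace-aware XML parser. The sketch below is illustrative: the sample file reuses the four values that appear above, but the title and the assignment of "Document file format" and "Final" to the category and contentStatus elements are assumptions, and attributes that real files carry on the date elements are omitted.<br />

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

NAMESPACES = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}

# Hypothetical docProps/core.xml; which element holds which value is assumed.
CORE_XML = """<cp:coreProperties
    xmlns:cp="http://schemas.openxmlformats.org/package/2006/metadata/core-properties"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:dcterms="http://purl.org/dc/terms/">
  <dc:title>Example title</dc:title>
  <dcterms:created>2008-06-19T20:00:00Z</dcterms:created>
  <dcterms:modified>2008-06-19T20:42:00Z</dcterms:modified>
  <cp:category>Document file format</cp:category>
  <cp:contentStatus>Final</cp:contentStatus>
</cp:coreProperties>"""

FIELDS = {
    "title": "dc:title",
    "created": "dcterms:created",
    "modified": "dcterms:modified",
    "category": "cp:category",
    "status": "cp:contentStatus",
}


def read_core_properties(xml_text):
    """Return the core properties that are present as a plain dictionary."""
    root = ET.fromstring(xml_text)
    properties = {}
    for key, path in FIELDS.items():
        element = root.find(path, NAMESPACES)
        if element is not None and element.text:
            properties[key] = element.text
    return properties


def parse_timestamp(text):
    """Parse a UTC timestamp such as 2008-06-19T20:42:00Z."""
    return datetime.strptime(text, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


props = read_core_properties(CORE_XML)
print(props)
print(parse_timestamp(props["modified"]) - parse_timestamp(props["created"]))
```

Because the title, creation date and modification date are plain Dublin Core terms, a generic metadata tool can read them without knowing anything else about Office Open XML.<br />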

Document markup languages<br />

An Office Open <strong>XML</strong> file may contain several documents encoded in specialized markup languages corresponding<br />

to applications within the Microsoft Office product line. Office Open <strong>XML</strong> defines multiple vocabularies using 27<br />

namespaces and 89 schema modules.<br />

The primary markup languages are:<br />

• WordprocessingML for word-processing<br />

• SpreadsheetML for spreadsheets<br />

• PresentationML for presentations<br />

Shared markup language materials include:<br />

• Office Math <strong>Markup</strong> <strong>Language</strong> (OMML)<br />

• DrawingML used for vector drawing, charts, and for example, text art (additionally, though deprecated, VML is<br />

supported for drawing)<br />

• Extended properties<br />

• Custom properties<br />

• Variant Types<br />

• Custom <strong>XML</strong> data properties<br />

• Bibliography<br />

In addition to the above markup languages custom <strong>XML</strong> schemas can be used to extend Office Open <strong>XML</strong>.<br />

Design approach<br />

Patrick Durusau, the editor of ODF, has viewed the markup style of OO<strong>XML</strong> and ODF as representing two sides of<br />

a debate: the "element side" and the "attribute side". He notes that OO<strong>XML</strong> represents "the element side of this<br />

approach" and singles out the KeepNext element as an example:<br />

<br />

<br />

…<br />

<br />

In contrast, he notes ODF would use the single attribute fo:keep-next, rather than an element, for the same<br />

semantic. [4]<br />

The <strong>XML</strong> Schema of Office Open <strong>XML</strong> emphasizes reducing load time and improving parsing speed. [5] In a test<br />

with applications current in April 2007, <strong>XML</strong>-based office documents were slower to load than binary formats. [6] To<br />

enhance performance, Office Open <strong>XML</strong> uses very short element names for common elements and spreadsheets save<br />

dates as index numbers (starting from 1899 or from 1904). In order to be systematic and generic, Office Open <strong>XML</strong><br />

typically uses separate child elements for data and metadata (element names ending in Pr for properties) rather than<br />

using multiple attributes, which allows structured properties. Office Open <strong>XML</strong> does not use mixed content but uses<br />

elements to put a series of text runs (element name r) into paragraphs (element name p). The result is terse and<br />

highly nested in contrast to HTML, for example, which is fairly flat, designed for humans to write in text editors and<br />

is more congenial for humans to read.
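The nesting of text runs (r) inside paragraphs (p), with properties held in child elements whose names end in Pr, can be seen by extracting the text of a small document. The sample markup below is hypothetical; the t element that holds the text of a run is not named in the passage above and is included here as part of the sketch.<br />

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace in ElementTree's '{uri}' notation.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"

# Hypothetical word/document.xml with two paragraphs and three runs.
DOCUMENT_XML = """<w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
  <w:body>
    <w:p>
      <w:pPr><w:keepNext/></w:pPr>
      <w:r><w:t xml:space="preserve">Hello, </w:t></w:r>
      <w:r><w:rPr><w:b/></w:rPr><w:t>world</w:t></w:r>
    </w:p>
    <w:p>
      <w:r><w:t>Second paragraph.</w:t></w:r>
    </w:p>
  </w:body>
</w:document>"""


def paragraph_texts(xml_text):
    """Return the text of each paragraph, joining the text of its runs."""
    root = ET.fromstring(xml_text)
    paragraphs = []
    for paragraph in root.iter(W + "p"):
        pieces = []
        for run in paragraph.findall(W + "r"):
            for text in run.findall(W + "t"):
                pieces.append(text.text or "")
        paragraphs.append("".join(pieces))
    return paragraphs


print(paragraph_texts(DOCUMENT_XML))
```

Formatting never appears as mixed content: the bold setting of the second run sits in its rPr child, and the paragraph-level keepNext setting sits in pPr.<br />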



The naming of elements and attributes within the text has attracted some criticism. There are three different<br />

syntaxes in OO<strong>XML</strong> (ECMA-376) for specifying the color and alignment of text depending on whether the<br />

document is a text, spreadsheet, or presentation. Rob Weir (an IBM employee and co-chair of the OASIS<br />

OpenDocument Format TC) asks "What is the engineering justification for this horror?". He contrasts with<br />

OpenDocument: "ODF uses the W3C's XSL-FO vocabulary for text styling, and uses this vocabulary<br />

consistently". [7]<br />

Some have argued the design is based too closely on Microsoft applications. In August 2007, the Linux Foundation<br />

published a blog post calling upon ISO National Bodies to vote "No, with comments" during the International<br />

Standardization of OO<strong>XML</strong>. It said, "OO<strong>XML</strong> is a direct port of a single vendor's binary document formats. It<br />

avoids the re-use of relevant existing international standards (e.g. several cryptographic algorithms, VML, etc.).<br />

There are literally hundreds of technical flaws that should be addressed before standardizing OO<strong>XML</strong> including<br />

continued use of binary code tied to platform specific features, propagating bugs in MS-Office into the standard,<br />

proprietary units, references to proprietary/confidential tags, unclear IP and patent rights, and much more". [8]<br />

The version of the standard submitted to JTC 1 was 6546 pages long. The need for, and appropriateness of, such length have been questioned. [9] [10] Google stated that "the ODF standard, which achieves the same goal, is only 867 pages". [9]<br />

WordprocessingML (WML)<br />

Word processing documents use the <strong>XML</strong> vocabulary known as WordprocessingML normatively defined by the<br />

schema wml.xsd which accompanies the standard. This vocabulary is defined in clause 11 of Part 1. [11]<br />

SpreadsheetML (SML)<br />

Spreadsheet documents use the <strong>XML</strong> vocabulary known as SpreadsheetML normatively defined by the schema<br />

sml.xsd which accompanies the standard. This vocabulary is described in clause 12 of Part 1. [11]<br />

Each worksheet in a spreadsheet is represented by an <strong>XML</strong> document with a root element named worksheet in the http://schemas.openxmlformats.org/spreadsheetml/2006/main namespace.<br />

The representation of date and time values in SpreadsheetML has attracted some criticism. ECMA-376 1st edition does not conform to ISO 8601:2004 "Representation of Dates and Times". It requires that implementations replicate a Lotus 1-2-3 [12] bug that treats 1900 as a leap year, which it was not. Products complying with ECMA-376 would therefore be required to implement the WEEKDAY() spreadsheet function so that it assigns incorrect days of the week to some dates, and would also miscalculate the number of days between certain dates. [13] ECMA-376 2nd edition (ISO/IEC 29500) allows the use of ISO 8601:2004 "Representation of Dates and Times" in addition to the Lotus 1-2-3 bug-compatible form. [14] [15]<br />
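The effect of the leap-year bug on date serial numbers can be made concrete with a small conversion routine. This is a sketch under stated assumptions: serial 1 is taken to be 1 January 1900 in the 1900-based system, serial 60 is the nonexistent 29 February 1900, and serial 0 is taken to be 1 January 1904 in the 1904-based system. The function names are hypothetical.<br />

```python
from datetime import date, timedelta


def serial_to_date_1900(serial):
    """Convert a whole-day serial number of the 1900-based system to a date."""
    if serial < 1:
        raise ValueError("serial numbers start at 1 in this sketch")
    if serial == 60:
        # The bug-compatible calendar counts a 29 February 1900 that never existed.
        raise ValueError("serial 60 stands for the nonexistent 1900-02-29")
    if serial < 60:
        return date(1899, 12, 31) + timedelta(days=serial)
    # After the phantom day every serial is one higher than a true day count,
    # so the effective starting point moves back by one day.
    return date(1899, 12, 30) + timedelta(days=serial)


def serial_to_date_1904(serial):
    """Convert a whole-day serial number of the 1904-based system to a date."""
    return date(1904, 1, 1) + timedelta(days=serial)


print(serial_to_date_1900(59))
print(serial_to_date_1900(61))
print(serial_to_date_1900(39448))
print(serial_to_date_1904(37986))
```

Serials 59 and 61 map to consecutive real days, which is why day counts that span the end of February 1900 come out one too high in the bug-compatible form.<br />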


Office MathML (OMML)<br />

Office Math <strong>Markup</strong> <strong>Language</strong> is a mathematical markup language which can be embedded in WordprocessingML,<br />

with intrinsic support for including word processing markup like revision markings, [16] footnotes, comments, images<br />

and elaborate formatting and styles. [17] The OMML format is different from the World Wide Web Consortium<br />

(W3C) MathML recommendation that does not support those office features, but is partially compatible [18] through<br />

XSL Transformations.<br />

The following Office MathML example defines the fraction:<br />

<br />



<br />

π
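As an illustration of how a fraction is expressed in Office MathML, the sketch below builds one programmatically. The numerator π is taken from the example above; the denominator 2 is a hypothetical value chosen for illustration, and the helper names are made up.<br />

```python
import xml.etree.ElementTree as ET

# Namespace of Office Math Markup Language.
M_NS = "http://schemas.openxmlformats.org/officeDocument/2006/math"
ET.register_namespace("m", M_NS)


def m(tag):
    """Qualify a tag name with the OMML namespace."""
    return "{%s}%s" % (M_NS, tag)


def text_run(text):
    """Build a math run (r) holding a single text element (t)."""
    run = ET.Element(m("r"))
    ET.SubElement(run, m("t")).text = text
    return run


def fraction(numerator, denominator):
    """Build an oMath element holding one fraction (f) with num and den parts."""
    math = ET.Element(m("oMath"))
    frac = ET.SubElement(math, m("f"))
    ET.SubElement(frac, m("num")).append(text_run(numerator))
    ET.SubElement(frac, m("den")).append(text_run(denominator))
    return math


# The denominator "2" is an assumption made for this illustration.
print(ET.tostring(fraction("π", "2"), encoding="unicode"))
```

As in WordprocessingML, the content is carried by runs, which is what allows word processing features such as revision marks to attach to parts of a formula.<br />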



Foreign resources<br />

Non-<strong>XML</strong> content<br />

OO<strong>XML</strong> documents are typically composed of other resources in addition to <strong>XML</strong> content (graphics, video, etc.).<br />

Some have criticised the choice of permitted format for such resources: ECMA-376 1st edition specifies "Embedded<br />

Object Alternate Image Requests Types" and "Clipboard Format Types", which refer to Windows Metafiles or<br />

Enhanced Metafiles, both of which are proprietary formats that have hard-coded dependencies on Windows itself.<br />

The critics state the standard should instead have referenced the platform neutral standard ISO/IEC 8632 "Computer<br />

Graphics Metafile". [13]<br />

Foreign markup<br />

The Standard provides three mechanisms to allow foreign markup to be embedded within content for editing<br />

purposes:<br />

• Smart tags<br />

• Custom <strong>XML</strong> markup<br />

• Structured Document Tags<br />

These are defined in clause 17.5 of Part 1.<br />

Compatibility settings<br />

Versions of Office Open <strong>XML</strong> contain what are termed "compatibility settings". These are contained in Part 4<br />

("<strong>Markup</strong> <strong>Language</strong> Reference") of ECMA-376 1st Edition, but during standardization were moved to become a new<br />

part (also called Part 4) of ISO/IEC 29500:2008 ("Transitional Migration Features").<br />

These settings (including elements with names such as autoSpaceLikeWord95, footnoteLayoutLikeWW8,<br />

lineWrapLikeWord6, mwSmallCaps, shapeLayoutLikeWW8, suppressTopSpacingWP, truncateFontHeightsLikeWP6,<br />

uiCompat97To2003, useWord2002TableStyleRules, useWord97LineBreakRules, wpJustification and wpSpaceWidth)<br />

were the focus of some controversy during the standardisation of DIS 29500. [24] As a result, new text was added to<br />

ISO/IEC 29500 to document them. [25]<br />

An article in Free Software Magazine has criticized the markup used for these settings. Office Open <strong>XML</strong> uses<br />

distinctly named elements for each compatibility setting, each of which is declared in the schema. The repertoire of<br />

settings is thus limited — for new compatibility settings to be added, new elements may need to be declared,<br />

"potentially creating thousands of them, each having nothing to do with interoperability". [26]<br />

Extensibility<br />

The standard provides two types of extensibility mechanism: <strong>Markup</strong> Compatibility and Extensibility (MCE), defined in Part 3 (ISO/IEC 29500-3:2008), and Extension Lists, defined in clause 18.2.10 of Part 1.<br />

References<br />

[1] Microsoft. "Register file extensions on third party servers" (http://technet.microsoft.com/en-us/library/cc179224.aspx). microsoft.com. Retrieved 2009-09-04.<br />

[2] Tom Ngo (December 11, 2006). "Office Open <strong>XML</strong> Overview" (http://www.ecma-international.org/news/TC45_current_work/OpenXML White Paper.pdf) (PDF). Ecma International. p. 6. Retrieved 2007-01-23.<br />

[3] http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=37629<br />

[4] Patrick Durusau (21 October 2008). "Old Wine In New Skins" (http://www.durusau.net/publications/old_wine.pdf).<br />

[5] Intellisafe Technologies. "Software Developer uses Office Open <strong>XML</strong> to Minimize File Space, Increase Interoperability" (http://www.openxmlcommunity.org/documents/casestudies/Intellisafe_OpenXML_Final.pdf).<br />

[6] George Ou (2007-04-27). "MS Office 2007 versus Open Office 2.2 shootout" (http://blogs.zdnet.com/Ou/?p=480). ZDnet.com. Retrieved 2007-04-27.<br />

[7] Rob Weir (14 March 2008). "Disharmony of OO<strong>XML</strong>" (http://www.robweir.com/blog/2008/03/disharmony-of-ooxml.html).<br />

[8] John Cherry (14 March 2008). "OO<strong>XML</strong> — vote "No, with comments"" (http://www.linux-foundation.org/weblogs/cherry/2007/08/29/ooxml-vote-no-with-comments/).<br />

[9] "Google's Position on OO<strong>XML</strong> as a Proposed ISO Standard" (http://www.odfalliance.org/resources/Google OOXML Q A.pdf). Google. 2008-02. "If ISO were to give OO<strong>XML</strong> with its 6546 pages the same level of review that other standards have seen, it would take 18 years (6576 days for 6546 pages) to achieve comparable levels of review to the existing ODF standard (871 days for 867 pages) which achieves the same purpose and is thus a good comparison. Considering that OO<strong>XML</strong> has only received about 5.5% of the review that comparable standards have undergone, reports about inconsistencies, contradictions and missing information are hardly surprising"<br />

[10] "OO<strong>XML</strong>: What's the big deal?" (http://www.ibm.com/developerworks/library/x-ooxmlstandard.html). IBM. 2008-02-19.<br />

[11] "ISO/IEC 29500-1:2008" (http://standards.iso.org/ittf/PubliclyAvailableStandards/c051463_ISOIEC 29500-1_2008(E).zip). ISO and IEC. 2008-09.<br />

[12] Kyd, Charley (October 2006). "How to Work With Dates Before 1900 in Excel" (http://www.exceluser.com/explore/earlydates.htm). ExcelUser. Retrieved 2009-09-16.<br />

[13] "The Contradictory Nature of OO<strong>XML</strong>" (http://www.consortiuminfo.org/standardsblog/article.php?story=20070117145745854). ConsortiumInfo.org.<br />

[14] "ECMA-376 2nd edition Part 1 (3. Normative references)" (http://www.ecma-international.org/publications/standards/Ecma-376.htm). Ecma-international.org. Retrieved 2009-09-16.<br />

[15] "New set of proposed dispositions posted, including more positive changes to the Ecma Office Open <strong>XML</strong> formats – Dispositions now proposed for more than half of National Bodies' comments" (http://www.ecma-international.org/news/TC45_current_work/New set of proposed dispositions posted.htm). Ecma-international.org. 2007-12-11. Retrieved 2009-09-16.<br />

[16] Jesper Lund Stocholm (2008-01-29). "Do your math — OO<strong>XML</strong> and OMML" (http://idippedut.dk/post/2008/01/Do-your-math---OOXML-and-OMML.aspx). A Mooh Point blog. Retrieved 2008-02-12.<br />

[17] Murray Sargent (2007-06-05). "Science and Nature have difficulties with Word 2007 mathematics" (http://blogs.msdn.com/murrays/archive/2007/06/05/science-and-nature-have-difficulties-with-word-2007-mathematics.aspx). MSDN blogs. Retrieved 2007-07-31.<br />

[18] David Carlisle (2007-05-09). "XHTML and MathML from Office 2007" (http://dpcarlisle.blogspot.com/2007/04/xhtml-and-mathml-from-office-20007.html). David Carlisle. Retrieved 2007-09-20.<br />

[19] "Microsoft Office dumped by Science and Nature" (http://www.zdnet.com.au/news/software/soa/Microsoft-Office-dumped-by-Science-and-Nature/0,130061733,339278690,00.htm). ZDNet Australia. 18 June 2007.<br />

[20] Wouter Van Vugt (2008-11-01). "Open <strong>XML</strong> Explained e-book" (http://openxmldeveloper.org/articles/1970.aspx). Openxmldeveloper.org. Retrieved 2007-09-14.<br />

[21] Rick Jelliffe in Technical (2007-04-16). "Why EMUs? - O'Reilly <strong>XML</strong> Blog" (http://www.oreillynet.com/xml/blog/2007/04/what_is_an_emu.html). Oreillynet.com. Retrieved 2009-05-19.<br />

[22] "The X Factor" (http://reddevnews.com/features/article.aspx?editorialsid=2356). reddevnews.com. October 2007.<br />

[23] "VML — the Vector <strong>Markup</strong> <strong>Language</strong>" (http://www.w3.org/TR/NOTE-VML). W3.org. 1998-05-13. Retrieved 2009-05-19.<br />

[24] "ODF/OO<strong>XML</strong> technical white paper — A white paper based on a technical comparison between the ODF and OO<strong>XML</strong> formats" (http://www.freesoftwaremagazine.com/articles/odf_ooxml_technical_white_paper?page=0,9). Free Software Magazine.<br />

[25] "ECMA-376 2nd edition Part 4 (paragraph 9.7.3)" (http://www.ecma-international.org/publications/standards/Ecma-376.htm). Ecma-international.org. Retrieved 2009-09-16.<br />

[26] "ODF/OO<strong>XML</strong> technical white paper — A white paper based on a technical comparison between the ODF and OO<strong>XML</strong> formats" (http://www.freesoftwaremagazine.com/articles/odf_ooxml_technical_white_paper?page=0,7). Free Software Magazine. "... OO<strong>XML</strong> chose this route. Rather than create an application-definable configuration tag there is a unique tag for each setting ... Currently, the only application's unique settings that are catered for are the applications that the standard's authors have decided to include, ... For other applications to be added, further tag names would need to be defined in the specification, potentially creating thousands of them, each having nothing to do with interoperability ..."<br />



OIO<strong>XML</strong><br />

OIO<strong>XML</strong> is a project by the Danish government to develop a set of reusable data components that can be serialized in various formats, although <strong>XML</strong> is currently the only serialization available for OIO<strong>XML</strong> data. The project was undertaken to ease communication to, from, and between Danish government bodies. It is part of the Danish government's transition to what it calls eGovernment, in which communication between government bodies, companies, and the public should be paper-free. There has been some confusion about what OIO<strong>XML</strong> is, because its most prominent format, the Danish Efaktura format (a localization of UBL), is itself referred to as OIO<strong>XML</strong> in many government documents. All invoices sent to a Danish government organization are currently required to be in the Efaktura format.<br />

Sources<br />

• The interoperability framework [1]<br />

• OIO - Offentlig Information Online (public information online) - English main page of the site [2]<br />

• Description of OIO<strong>XML</strong> and its reasons [3]<br />

• Reference to the OIO<strong>XML</strong> markup language [4]<br />

• Validator for OIO<strong>XML</strong> [5]<br />

• Examples of OIO<strong>XML</strong> invoices in comparison with regular invoices (Danish) [6]<br />

References<br />

[1] http://standarder.oio.dk/my-home-your-home/view?set_language=en<br />

[2] http://www.oio.dk/?o=a54bd5e3b9e3e94209f94882ac0c9301<br />

[3] http://isb.oio.dk/Info/Standardization/OIO<strong>XML</strong>%20Classes.htm<br />

[4] http://xmltools.oio.dk/oioonlinevalidator/ehandel/0p71/Invoice/<br />

[5] http://xmltools.oio.dk/oioonlinevalidator/<br />

[6] http://www.oio.dk/dataudveksling/ehandel/eFaktura/eksempler



Open <strong>XML</strong> Paper Specification<br />

Filename extension: .oxps, .xps<br />

Internet media type: application/oxps, application/vnd.ms-xpsdocument<br />

Developed by: Microsoft, Ecma International<br />

Initial release: October 2006<br />

Latest release: First Edition / June 16, 2009<br />

Type of format: Page description language / Document file format<br />

Contained by: Open Packaging Conventions<br />

Extended from: ZIP, <strong>XML</strong>, XAML<br />

Standard(s): ECMA-388<br />

Website: [1]<br />

The Open <strong>XML</strong> Paper Specification (also referred to as OpenXPS) is an open specification for a page description<br />

language and a fixed-document format. It was originally developed by Microsoft as the <strong>XML</strong> Paper Specification (XPS) and<br />

was later standardized by Ecma International as international standard ECMA-388. It is an <strong>XML</strong>-based (more<br />

precisely XAML-based) specification, based on a new print path and a color-managed vector-based document format<br />

that supports device independence and resolution independence. OpenXPS was standardized as an open standard<br />

document format on June 16, 2009. [2]<br />

Development of the <strong>XML</strong> Paper Specification<br />

In 2003 Global Graphics was chosen by Microsoft to provide consultancy and proof of concept development<br />

services on XPS and worked with the Windows development teams on the specification and reference architecture<br />

for the new format. [3]<br />

The XPS document format consists of structured <strong>XML</strong> markup that defines the layout of a document and the visual<br />

appearance of each page, along with rendering rules for distributing, archiving, rendering, processing and printing<br />

the documents. Notably, the markup language for XPS is a subset of XAML, allowing it to incorporate<br />

vector-graphic elements in documents, using XAML to mark up the WPF primitives. The elements used are<br />

described in terms of paths and other geometrical primitives.<br />

An XPS file is in fact a ZIP archive using the Open Packaging Conventions, containing the files which make up the<br />

document. These include an <strong>XML</strong> markup file for each page, text, embedded fonts, raster images, 2D vector<br />

graphics, as well as the digital rights management information. The contents of an XPS file can be examined simply<br />

by opening it in an application which supports ZIP files.
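
Because an XPS file is a ZIP archive, its parts can also be listed programmatically. The following is a minimal sketch using Python's standard zipfile module. The part names in the in-memory sample are hypothetical, chosen only to resemble a package layout; a real XPS file would be opened by passing its path to zipfile.ZipFile instead.

```python
import io
import zipfile


def list_package_parts(data: bytes) -> list:
    """Return the names of all parts stored in a ZIP-based package."""
    with zipfile.ZipFile(io.BytesIO(data)) as package:
        return package.namelist()


def build_sample_package() -> bytes:
    """Build a tiny ZIP in memory; the part names are illustrative assumptions."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("[Content_Types].xml", "<Types/>")
        package.writestr("Documents/1/Pages/1.fpage", "<FixedPage/>")
    return buffer.getvalue()


if __name__ == "__main__":
    for name in list_package_parts(build_sample_package()):
        print(name)
```

The same namelist() call works on any ZIP container, which is why generic archive tools can inspect XPS files.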



Features<br />

XPS specifies a set of document layout functionality for paged, printable documents. It also has support for features<br />

such as color gradients, transparencies, CMYK color spaces, printer calibration, multiple-ink systems and print<br />

schemas. XPS supports the Windows Color System color management technology for color conversion precision<br />

across devices and higher dynamic range. It also includes a software raster image processor (RIP) which is<br />

downloadable separately. [4] The print subsystem also has support for named colors, simplifying color definition for<br />

images transmitted to printers supporting those colors.<br />

XPS also supports HD Photo images natively for raster images. [5] The XPS format used in the spool file represents<br />

advanced graphics effects such as 3D images, glow effects, and gradients as Windows Presentation Foundation<br />

primitives, which are processed by the printer drivers without rasterization, preventing rendering artifacts and<br />

reducing computational load.<br />

Similarities with PDF and PostScript<br />

Like Adobe Systems' PDF format, XPS is a fixed-layout document format designed to preserve document<br />

fidelity, [6] providing a device-independent document appearance. PDF is a database of objects, created from<br />

PostScript and also directly generated from many applications, whereas XPS is based on <strong>XML</strong>. The filter pipeline<br />

architecture of XPS is also similar to the one used in printers supporting the PostScript page description language.<br />

PDF includes dynamic capabilities not supported by the XPS format. [7]<br />

<strong>View</strong>ing and creating XPS documents<br />

XPS is supported on several versions of Windows.<br />

Because the printing architecture of Windows Vista uses XPS as the spooler format, [6] it has native support for<br />

generating and reading XPS documents. [8] XPS documents can be created by printing to the virtual XPS printer<br />

driver. The XPS <strong>View</strong>er is installed by default in Windows Vista and Windows 7. The viewer is hosted within<br />

Internet Explorer in Windows Vista, but is a native application in Windows 7. The IE-hosted XPS viewer and the<br />

XPS Document Writer are also available to Windows XP users when they download the .NET Framework 3.0. The<br />

IE-hosted viewer supports digital rights management and digital signatures. Users who do not wish to view XPS<br />

documents in the browser can download the XPS Essentials Pack, [9] which includes a standalone viewer and the XPS<br />

Document Writer. The XPS Essentials Pack also includes providers to enable the IPreview and IFilter capabilities<br />

used by Windows Desktop Search, as well as shell handlers to enable thumbnail views and file properties for XPS<br />

documents in Windows Explorer. [10] The XPS Essentials Pack is available for Windows XP, Windows Server 2003,<br />

and Windows Vista. [10] Installing this pack enables operating systems prior to Windows Vista to use the XPS print<br />

processor, instead of the GDI-based WinPrint, which can produce better quality prints for printers that support XPS<br />

in hardware (directly consume the format). [11] The print spooler format on these operating systems when printing to<br />

older, non-XPS-aware printers, however, remains unchanged.<br />

Windows 7 contains a standalone version of the XPS viewer that supports digital signatures. [12]<br />

Third-party support<br />

Software


Open <strong>XML</strong> Paper Specification 73<br />

Name | Publisher | Platform | Function<br />

GhostXPS | Artifex Software Inc. [13] | Cross platform | The Ghostscript software suite for processing of various page description languages includes an input parser called GhostXPS for XPS. The software may be downloaded in source code form from ghostscript.com [14].<br />

Okular | Okular team [15] | Linux, FreeBSD, Microsoft Windows, Solaris | Okular, the document viewer of the KDE project, can display XPS documents.<br />

STDU <strong>View</strong>er | STDUtility [16] | Microsoft Windows | STDU <strong>View</strong>er can display and organize XPS documents (as well as other electronic document formats).<br />

XPS Annotator | www.xpsdev.com [17] | Microsoft Windows | XPS Annotator can display, digitally sign and annotate XPS documents. In addition, it can convert XPS documents to common picture formats.<br />

Aspose.Words product family | ASPOSE [18] | .NET Framework, Java, Microsoft Sharepoint, SQL Server Reporting Services, JasperReports | Aspose.Words enables application developers to build applications that "generate, modify, convert, render and print" XPS documents as well as some other formats. Aspose.Words is a .NET Framework class library rather than independent computer software; hence it cannot be used by consumers. [19]<br />

Multilizer | Multilizer [20] | Microsoft Windows | Multilizer localization products support the translation of documents through an XPS Scanner plug-in. This plug-in enables users to extract text from an XPS document, translate it, and write a translated XPS document with the same structure.<br />

NiXPS <strong>View</strong> | NiXPS [21] | Microsoft Windows, Mac OS X | NiXPS <strong>View</strong> can display, search and print XPS documents. [22]<br />

NiXPS Edit | NiXPS [21] | Microsoft Windows, Mac OS X | NiXPS Edit can view, edit, search, print and export XPS documents. [23]<br />

NiXPS SDK | NiXPS [21] | Microsoft Windows, Mac OS X | NiXPS SDK enables application developers to develop applications that can view, edit or export XPS documents. [24]<br />

Pagemark Xps<strong>View</strong>er | Pagemark Technology, Inc. [25] | Microsoft Windows, Mac OS, Linux | Pagemark Xps<strong>View</strong>er can display and organize XPS documents as well as convert them to common picture formats. [26]<br />

Pagemark XpsConvert | Pagemark Technology, Inc. [25] | Microsoft Windows, Mac OS, Linux | Pagemark XpsConverter, a command-line interface tool, can convert XPS documents to PDF documents, as well as common picture formats. [26]<br />

Pagemark XpsPlugin | Pagemark Technology, Inc. [25] | Mozilla Firefox, Safari | Pagemark XpsPlugin, an add-on for Mozilla Firefox and Safari web browsers, enables these web browsers to display XPS documents inside the browser window. This commercial product is still not available for purchase, but a demo version is available. [26]<br />

PDFTron XPSConvert | PDFTron [27] | Microsoft Windows, Mac OS X, Linux | PDFTron XPSConvert, a command-line interface tool, can convert XPS documents to PDF format or common picture formats. [28]<br />

PDFTron PDF2XPS | PDFTron [27] | Microsoft Windows, Mac OS X, Linux | PDFTron PDF2XPS, a command-line interface tool, can convert PDF documents into XPS documents. [29]<br />

Software Imaging XPS<strong>View</strong>er | Software Imaging [30] | Microsoft Windows | Software Imaging XPS<strong>View</strong>er, a freeware alternative to Microsoft XPS <strong>View</strong>er, can view and print XPS documents. [31]<br />

NDesk XPS | NDesk [32] | Mono | NDesk XPS can view and convert XPS documents. [33]<br />

Danet Studio | Danetsoft [34] | Microsoft Windows | Danet Studio can create, display, sign, convert and annotate XPS documents. It can split and merge existing XPS documents to create new XPS documents. [35]<br />

xps2pdf.org | [36] | World Wide Web | xps2pdf.org, an online tool, can convert XPS documents to PDF format.<br />

TreasureUP XPS to Image Converter 1.1 | TreasureUP [37] | Microsoft Windows | Converts XPS pages to image file formats: Jpeg, Png and Gif. Supports batch file conversion, and automatically converting files in a specified folder. [38]<br />

Hardware<br />

XPS has the support of printing companies such as Konica Minolta, Sharp, [39] Canon, Epson, Hewlett-Packard, [40]<br />

and Xerox [41] and software and hardware companies such as Software Imaging, [42] Pagemark Technology Inc., [43]<br />

Informative Graphics Corp. (IGC), [44] NiXPS NV, [45] Zoran, [46] and Global Graphics. [47]<br />

Native XPS printers have been introduced by Canon, Konica Minolta, Toshiba, and Xerox. [48]<br />

Since 1 June 2007, devices certified at the Certified for Windows Vista level of Windows Logo conformance have been required to have<br />

XPS drivers for printing. [49]<br />

Licensing<br />

In order to encourage wide use of the format, Microsoft has released XPS under a royalty-free patent license called<br />

the Community Promise for XPS, [50] [51] allowing users to create implementations of the specification that read, write<br />

and render XPS files as long as they include a notice within the source that technologies implemented may be<br />

encumbered by patents held by Microsoft. Microsoft also requires that organizations "engaged in the business of<br />

developing (i) scanners that output XPS Documents; (ii) printers that consume XPS Documents to produce<br />

hard-copy output; or (iii) print driver or raster image software products or components thereof that convert XPS<br />

Documents for the purpose of producing hard-copy output, [...] will not sue Microsoft or any of its licensees under<br />

the <strong>XML</strong> Paper Specification or customers for infringement of any <strong>XML</strong> Paper Specification Derived Patents (as<br />

defined below) on account of any manufacture, use, sale, offer for sale, importation or other disposition or promotion<br />

of any <strong>XML</strong> Paper Specification implementations." The specification itself is released under a royalty-free copyright<br />

license, allowing its free distribution. [52]<br />

Standardization<br />

Microsoft submitted the XPS specification to Ecma International. [53]<br />

In June 2007 Ecma International Technical Committee 46 (TC46) was set up to develop a standard based on the<br />

Open <strong>XML</strong> Paper Specification (OpenXPS). [54]<br />

At the 97th General Assembly held in Budapest, June 16, 2009, Ecma International approved Open <strong>XML</strong> Paper<br />

Specification (OpenXPS) as an Ecma standard (ECMA-388). [2]<br />

TC46's members are:<br />

• Autodesk • Konica Minolta • QualityLogic<br />

• Brother Industries • Lexmark • Ricoh<br />

• Canon • Microsoft • Software Imaging Limited<br />

• Fujifilm • Monotype Imaging • Toshiba<br />

• Fujitsu • Océ Technologies • Xerox<br />

• Global Graphics • Pagemark Technology • Zoran Corporation<br />

• Hewlett Packard • Panasonic/Matsushita<br />

See also<br />

• Comparison of OpenXPS and PDF<br />

• Windows Vista printing technologies<br />

• Functional specification<br />

External links<br />

• <strong>XML</strong> Paper Specification [55]<br />

• Microsoft XPS Development Team Blog [56]<br />

• Standard ECMA-388 Open <strong>XML</strong> Paper Specification [1]<br />

• XPS FAQ and white papers on office and professional printing from a software technology provider [57]<br />

• <strong>View</strong>ing XPS Documents [58]<br />

References<br />

[1] http://www.ecma-international.org/publications/standards/Ecma-388.htm<br />

[2] Steve McGibbon (Microsoft) (2009-06-17). "OpenXPS - Open<strong>XML</strong> Paper Specification" (http://notes2self.net/archive/2009/06/17/<br />

openxps-openxml-paper-specification.aspx). .<br />

[3] "Global Graphics XPS reference" (http://www.redorbit.com/news/technology/665662/<br />

global_graphics_xps_reference_rip_available_from_microsoft/index.html). Redorbit.com. 2006-09-21. . Retrieved 2009-12-10.<br />

[4] "Reference Raster Image Processor (RIP)" (http://www.microsoft.com/whdc/device/print/RRIP.mspx). Microsoft.com. 2007-01-09. .<br />

Retrieved 2009-12-10.<br />

[5] "HD Photo information on Microsoft Photography team blog" (http://blogs.msdn.com/pix/archive/2007/03/12/hd-photo.aspx).<br />

Blogs.msdn.com. 2007-03-12. . Retrieved 2009-12-10.<br />

[6] Foley, Mary Jo (2005-04-25). "Microsoft Readies New Document Printing Specification" (http://www.microsoft-watch.com/content/<br />

operating_systems/microsoft_readies_new_document_printing_specification.html). Microsoft-watch.com. . Retrieved 2009-12-10.<br />

[7] "Comparison of PDF, XPS and ODF by an ISV providing PDF solutions" (http://www.amyuni.com/blog/?p=8). Amyuni.com. . Retrieved<br />

2009-12-10.<br />

[8] "XPS Documents in Windows Vista" (http://www.microsoft.com/windows/products/windowsvista/features/details/xps.mspx).<br />

Microsoft.com. . Retrieved 2009-12-10.<br />

[9] Download details: XPS Essentials Pack Version 1.0 (http://www.microsoft.com/downloads/details.<br />

aspx?FamilyID=b8dcffdd-e3a5-44cc-8021-7649fd37ffee&displaylang=en) Microsoft <strong>XML</strong> Paper Specification Essentials Pack<br />

[10] "<strong>View</strong> and generate XPS" (http://www.microsoft.com/whdc/xps/viewxps.mspx). Microsoft.com. . Retrieved 2009-12-10.<br />

[11] XPSDrv Filter Pipeline: Implementation and Best Practice (http://download.microsoft.com/download/9/c/5/<br />

9c5b2167-8017-4bae-9fde-d599bac8184a/XPSDrv_FilterPipe.doc)<br />

[12] "<strong>View</strong> and Generate XPS" (http://www.microsoft.com/whdc/xps/viewxps.mspx). Microsoft.com. . Retrieved 2009-12-10.<br />

[13] http://www.artifex.com/<br />

[14] http://www.ghostscript.com/GhostPCL.html<br />

[15] http://okular.kde.org/team.php<br />

[16] http://www.stdutility.com<br />

[17] http://www.xpsdev.com<br />

[18] http://www.aspose.com/<br />

[19] "Aspose.Words Product Family" (http://www.aspose.com/categories/product-family-packs/aspose.words-product-family/default.<br />

aspx). Aspose.com. . Retrieved 2010-03-24.<br />

[20] http://www.multilizer.com



[21] http://www.nixps.com<br />

[22] "NiXPS <strong>View</strong>" (http://www.nixps.com/view3/index.html). Nixps.com. . Retrieved 2010-03-24.<br />

[23] "NiXPS Edit" (http://www.nixps.com/nixps_edit_20.html). Nixps.com. . Retrieved 2010-03-24.<br />

[24] "Nixps Sdk" (http://www.nixps.com/library.html). Nixps.com. . Retrieved 2010-03-24.<br />

[25] http://www.pagemarktechnology.com/<br />

[26] "Pagemark: XPS <strong>View</strong>er, XPS Converter and XPS Plug-in" (http://www.pagemarktechnology.com/home/products.html).<br />

Pagemarktechnology.com. . Retrieved 2010-03-24.<br />

[27] http://www.pdftron.com/<br />

[28] "PDFTron XPSConvert" (http://www.pdftron.com/xpsconvert/index.html). Pdftron.com. 2007-04-02. . Retrieved 2010-03-24.<br />

[29] "PDFTron PDF2XPS" (http://www.pdftron.com/pdf2xps/index.html). Pdftron.com. 2007-04-02. . Retrieved 2010-03-24.<br />

[30] http://softwareimaging.com/<br />

[31] http://softwareimaging.com/products-services/XPS<strong>View</strong>er/index.asp<br />

[32] http://www.ndesk.org/<br />

[33] "NDesk XPS" (http://www.ndesk.org/Xps). Ndesk.org. . Retrieved 2010-03-24.<br />

[34] http://www.danetsoft.com/<br />

[35] Danet Studio (http://www.danetsoft.com/product)<br />

[36] http://www.xps2pdf.org<br />

[37] http://www.treasureup.com/page1.aspx<br />

[38] "XPS to Image" (http://download.cnet.com/TreasureUP-XPS-to-Image-Converter/3000-6675_4-10838983.html). download.cnet.com.<br />

2010-04-05. .<br />

[39] "Sharp Open Systems Architecture supports XPS in multi-function printers" (http://www.sharpusa.com/products/<br />

FunctionPressReleaseSingle/0,1080,650-5,00.html#). Sharpusa.com. . Retrieved 2009-12-10.<br />

[40] Monckton, Paul. "''IT Week'' 10 November 2006, Canon, Epson and HP support for XPS" (http://www.itweek.co.uk/<br />

personal-computer-world/features/2167665/photo-printing-under-windows). Itweek.co.uk. . Retrieved 2009-12-10.<br />

[41] "''Fuji Xerox and Microsoft Collaborate in Document Management Solutions Field''" (http://www.fujixerox.co.jp/eng/headline/2006/<br />

1128_withms.html). Fujixerox.co.jp. 2006-11-28. . Retrieved 2009-12-10.<br />

[42] "XPS & Windows Vista" (http://softwareimaging.com/xps). Software Imaging. . Retrieved 2009-12-10.<br />

[43] "Bot generated title ->" (http://www.pagemarktechnology.com). Pagemark Technology



PCDATA<br />

PCDATA, short for "Parsed Character Data", is a term that originated in SGML.<br />

#PCDATA in <strong>XML</strong> DTD<br />

In an <strong>XML</strong> DTD [1], #PCDATA is the keyword that specifies "mixed content", meaning an element can contain character<br />

data and/or child elements in arbitrary order and number of occurrences. For example:<br />

<br />

<br />

In this example, the first element must contain character data only; the second element can contain any mixture of<br />

character data and the child elements named in its declaration.<br />

Although its name and its appearance in DTD suggest so, #PCDATA itself is not a semantic term for character<br />

data; it can only appear as the leading syntactic construct in "mixed content" definition. The following usages are<br />

illegal:<br />

<br />

<br />

<br />

<br />

[1] http://www.w3.org/TR/REC-xml/#sec-mixed-content
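
The mixed-content rule described above can be checked mechanically. The sketch below is an illustration rather than a full DTD parser: it tests whether a content model string has the shape the XML 1.0 grammar allows for mixed content, namely "(#PCDATA)" or "(#PCDATA | name | ...)*" with #PCDATA first. The element names b and i in the usage lines are hypothetical, and the name pattern is simplified to ASCII.

```python
import re

# Simplified, ASCII-only stand-in for the XML Name production (an assumption
# made for brevity; real XML names allow many more characters).
NAME = r"[A-Za-z_:][A-Za-z0-9._:-]*"

# Mixed content is either "(#PCDATA | a | b)*" or plain "(#PCDATA)".
MIXED = re.compile(
    rf"\(\s*#PCDATA(?:\s*\|\s*{NAME})*\s*\)\*"
    rf"|\(\s*#PCDATA\s*\)"
)


def is_mixed_content_model(model: str) -> bool:
    """Return True if the content model is a legal mixed-content declaration."""
    return MIXED.fullmatch(model.strip()) is not None


if __name__ == "__main__":
    for model in ["(#PCDATA)", "(#PCDATA | b | i)*", "(b | #PCDATA)*", "(#PCDATA, b)"]:
        print(model, is_mixed_content_model(model))
```

The last two models are rejected because #PCDATA is not the leading item, or because it is combined with a sequence operator rather than a choice.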



Plain Old <strong>XML</strong><br />

Plain Old <strong>XML</strong> (POX) is a term used to describe basic <strong>XML</strong>, sometimes mixed in with other, blendable<br />

specifications like <strong>XML</strong> Namespaces, Dublin Core, XInclude and XLink. People typically use the term as a contrast<br />

with complicated, multilayered <strong>XML</strong> specifications like those for web services or RDF. The term may have been<br />

derived from or inspired by the expression plain old telephone service (a.k.a. POTS) and, similarly, Plain Old Java<br />

Object.<br />

POX is completely compatible with <strong>XML</strong> Schema.<br />

However, many POX users eschew <strong>XML</strong> Schema to avoid the poor or inconsistent quality of <strong>XML</strong><br />

Schema-to-Java tools.<br />

POX is complementary to REST: REST refers to a communication pattern, while POX refers to an information<br />

format style.<br />

The primary competitors to POX are more strictly-defined <strong>XML</strong>-based information formats such as RDF and SOAP<br />

section 5 encoding, as well as general non-<strong>XML</strong> information formats such as JSON and CSV.<br />

External links<br />

• REST and POX article [1] from the Microsoft Developer Network<br />

• Plain Old <strong>XML</strong> Considered Harmful [2] from Microformats.org<br />

• Support for POX [3] in the Java Spring Framework<br />

• Plain<strong>XML</strong> on SourceForge.net [4]<br />

References<br />

[1] http://msdn.microsoft.com/en-us/library/aa395208.aspx<br />

[2] http://microformats.org/wiki/plain-old-xml-considered-harmful<br />

[3] http://static.springsource.org/spring-ws/sites/1.5/apidocs/org/springframework/ws/pox/package-summary.html<br />

[4] http://sourceforge.net/projects/plainxml/



Portable Application Description<br />

Portable Application Description is a machine-readable document format designed by the Association of<br />

Shareware Professionals.<br />

It allows authors to provide product descriptions and specifications to online sources in a standard way, using a<br />

standard data format, a simplified subset of <strong>XML</strong>, that will allow webmasters and program librarians to automate<br />

program listings. PAD saves time for both authors and webmasters.<br />

Each field in the specification has a regular expression (regex) associated with it. The regex acts as a constraint on<br />

the field: if the regex matches, the field value is legal and if it fails to match, the field and the PAD file as a whole<br />

are out of spec. Only files where all fields in the file pass validation are properly called PAD files.<br />
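
The per-field regex constraint can be sketched in a few lines of Python. The field names and patterns below are invented for illustration; the real ones are defined by the official PAD specification and are not reproduced here. A file counts as valid only when every field fully matches its pattern.

```python
import re

# Hypothetical field names and patterns, invented for this sketch;
# the actual constraints come from the official PAD specification.
FIELD_PATTERNS = {
    "Program_Name": r".{1,40}",
    "Program_Version": r"\d+(\.\d+)*",
}


def invalid_fields(fields: dict) -> list:
    """Return the names of fields whose values do not fully match their regex."""
    return [
        name
        for name, pattern in FIELD_PATTERNS.items()
        if re.fullmatch(pattern, fields.get(name, "")) is None
    ]


def is_valid_pad(fields: dict) -> bool:
    """A file is in spec only when no field fails validation."""
    return not invalid_fields(fields)


if __name__ == "__main__":
    print(is_valid_pad({"Program_Name": "Demo", "Program_Version": "1.2.3"}))
    print(invalid_fields({"Program_Name": "Demo", "Program_Version": "beta"}))
```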

PAD simplifies <strong>XML</strong> primarily by not using name/value pairs in tags: all tags are<br />

attribute-free. This is less expressive than <strong>XML</strong> but easier to parse. The official PAD spec uses unique tags, so<br />

extracting the fields in the official spec does not require descending through the tag path. However, if multiple<br />

languages are represented in a single PAD file, then correct parsing does require descending through the tag path<br />

because leaf tags are duplicated for each language supported.<br />
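
The difference between looking up a unique tag and descending a tag path can be shown with Python's standard ElementTree module. This is a sketch under stated assumptions: the sample document and all of its tag names are hypothetical and are not taken from the PAD specification.

```python
import xml.etree.ElementTree as ET

# Hypothetical PAD-like document: the tag names are illustrative assumptions.
SAMPLE = """<Pad>
  <Program_Info><Program_Name>Demo</Program_Name></Program_Info>
  <Descriptions>
    <English><Summary>A demo program</Summary></English>
    <German><Summary>Ein Demoprogramm</Summary></German>
  </Descriptions>
</Pad>"""


def text_of_unique_tag(root, tag):
    """Look a tag up anywhere in the tree; reliable only while the tag is unique."""
    element = root.find(f".//{tag}")
    return None if element is None else element.text


def text_at_path(root, path):
    """Descend an explicit tag path, needed when leaf tags repeat per language."""
    element = root.find(path)
    return None if element is None else element.text


if __name__ == "__main__":
    root = ET.fromstring(SAMPLE)
    print(text_of_unique_tag(root, "Program_Name"))
    print(text_at_path(root, "Descriptions/German/Summary"))
```

A lookup by bare tag name would return only the first Summary element in document order, which is why the multi-language case needs the full path.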

External links<br />

• Official PAD site [1]<br />

• The Official PAD specification [2]<br />

• The Official PAD validator [3]<br />

• 30 or so free and commercial PAD products, services, and links [4]<br />

• PAD database and graphics updated weekly [5]<br />

• About PAD files (Software Industry Professionals) [6]<br />

• PAD Validation Tool [7]<br />

• Online PAD Generator [8]<br />

• Portable Application Description (in Turkish) [9]<br />

References<br />

[1] http://www.asp-shareware.org/pad/<br />

[2] http://www.asp-shareware.org/pad/spec/spec.php<br />

[3] http://www.asp-shareware.org/pad/spec/validate.php<br />

[4] http://www.asp-shareware.org/pad/padlinks.php<br />

[5] http://paddatacenter.net/<br />

[6] http://www.siprofessionals.org/developers/viewarticle.php?id=si20070802<br />

[7] http://www.sharewarepromotions.com/PAD_Validation.asp<br />

[8] http://www.padbuilder.com/<br />

[9] http://www.tankado.com/pad-portable-application-description/



Publishing Requirements for Industry Standard Metadata<br />

PRISM Metadata Standard<br />

Introduction<br />

The Publishing Requirements for Industry Standard Metadata (PRISM) [1] specification defines a set of <strong>XML</strong><br />

metadata vocabularies for syndicating, aggregating, post-processing and multi-purposing content. PRISM provides a<br />

framework for the interchange and preservation of content and metadata, a collection of elements to describe that<br />

content, and a set of controlled vocabularies listing the values for those elements. PRISM metadata can be expressed as <strong>XML</strong>, RDF/<strong>XML</strong>,<br />

or XMP, and it incorporates Dublin Core elements. PRISM can be thought of as a set of <strong>XML</strong> tags used to contain the<br />

metadata of articles and even tag article content.<br />

PRISM conforms to the World Wide Web standard for Namespaces. PRISM namespaces are PRISM (prism:),<br />

PRISM Usage Rights (pur:), Dublin Core (dc: and dcterms:), PRISM Inline Metadata (pim:), PRISM Rights<br />

<strong>Language</strong> (prl:), PRISM Aggregator Message (pam:), and PRISM Controlled Vocabulary (pcv:). PRISM<br />

incorporated existing industry standards such as Dublin Core and XHTML in order to leverage work that had already<br />

been done in the publishing industry. New elements were created only when required, and were assigned to PRISM<br />

specific namespaces.<br />
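
The following sketch shows how elements from two of these namespaces might sit side by side in one record, using Python's standard ElementTree module. The Dublin Core URI is the well-known elements namespace. The PRISM namespace URI, the publicationName element and the wrapper element named record are assumptions made for illustration and should be checked against the PRISM specification version in use.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
# Assumed URI for the prism: prefix; verify against the PRISM specification.
PRISM_NS = "http://prismstandard.org/namespaces/basic/2.0/"


def build_record(title: str, publication_name: str) -> ET.Element:
    """Build a small record mixing Dublin Core and PRISM-prefixed elements."""
    ET.register_namespace("dc", DC_NS)
    ET.register_namespace("prism", PRISM_NS)
    record = ET.Element("record")  # hypothetical wrapper element
    ET.SubElement(record, f"{{{DC_NS}}}title").text = title
    # publicationName is used here as an assumed example of a PRISM element.
    ET.SubElement(record, f"{{{PRISM_NS}}}publicationName").text = publication_name
    return record


if __name__ == "__main__":
    print(ET.tostring(build_record("Sample article", "Sample Weekly"), encoding="unicode"))
```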

Overview<br />

PRISM consists of three specifications. The PRISM Specification itself defines the overall PRISM<br />

framework. A second specification, the PRISM Aggregator Message (PAM) Schema/DTD, is a standard format for<br />

publishers to use for delivery of content to websites, aggregators, and syndicators. PAM is available as an <strong>XML</strong><br />

DTD and an <strong>XML</strong> schema (XSD). Both PAM formats provide a simple, flexible model for transmitting content and<br />

PRISM metadata. The third, and newest, specification provides an <strong>XML</strong> schema (XSD) for capture of content usage<br />

rights metadata. This Guide to PRISM Usage Rights utilizes the elements found in PRISM’s Usage Rights<br />

Namespace to allow users to comprehensively capture and relay rights metadata for text and media content.<br />

Background<br />

In 1999, IDEAlliance contracted Linda Burman to found the PRISM Working Group to address emerging publisher<br />

requirements for a metadata standard to facilitate “agile” content for search, digital asset management, and content<br />

aggregation. Since that time, individuals from more than 50 IDEAlliance member companies have participated in the<br />

development of the specifications.<br />

PRISM is an IDEAlliance specification but is available free of charge. IDEAlliance (International Digital Enterprise<br />

Alliance) is a not-for-profit membership organization. Its mission is to advance user-driven, cross-industry solutions<br />

for all publishing and content-related processes by developing standards, fostering business alliances, and identifying<br />

best practices.<br />

Many organizations use PRISM because it provides a common metadata standard across platforms, media types and<br />

business units. Organizations that are involved in any type of content creation, categorization, management,<br />

aggregation and distribution, both commercially and within intranet and extranet frameworks, can use the PRISM<br />

standards.<br />

The PRISM Working Group is open to all IDEAlliance members and includes: Adobe Systems, Hachette Filipacchi<br />

Media, Hearst, L.A. Burman Associates, LexisNexis, The McGraw-Hill Companies, Reader’s Digest, Source<br />

Interlink Media Companies, Time Inc., The Nature Publishing Group, and U.S. News and World Report.



Usage and Applications<br />

PRISM can be incorporated into other standards; at this time, the PRISM Working Group is aware only of<br />

its incorporation into RSS 1.0. See RSS 1.0 [2] and the RSS 1.0 PRISM Module for more information.<br />

The PRISM specification defines a set of metadata vocabularies. PRISM metadata may be expressed in a different<br />

syntax depending on the specific use-case scenario. Currently PRISM metadata can be encoded as <strong>XML</strong>, <strong>XML</strong>/RDF, or<br />

XMP. Each of these expressions of PRISM metadata is called a profile.<br />

• Profile 1 is for the expression of PRISM metadata in <strong>XML</strong>. An example is the <strong>XML</strong> PRISM Aggregator Message<br />

(PAM).<br />

• Profile 2 is for the expression of PRISM metadata in <strong>XML</strong>/RDF such as for expressing PRISM metadata in RSS<br />

feeds.<br />

• Profile 3 is for embedding PRISM metadata in media objects such as digital images or PDFs using XMP<br />

technology.<br />

PRISM describes many components of print, online, mobile, and multimedia content including the following:<br />

• Who created, contributed to, and owns the rights to the content<br />

• What locations, organizations, topics, people, and/or events it covers, what media it contains, and under what<br />

conditions it may be reproduced<br />

• When it was published (cover date, post date, volume, number) or withdrawn<br />

• Where it can be republished, and the original platform on which it appeared<br />

• How it can be reused<br />

Common PRISM Usage<br />

• Syndication to partners<br />

• Content aggregation<br />

• Content repurposing<br />

• Resource discovery and search optimization<br />

• Multiple platform and channel distribution<br />

• Content archiving<br />

• Capture of rights usage information<br />

• Creation of feeds, such as RSS<br />

• Standalone services<br />

• Embedded descriptions, such as XMP<br />

• Web publishing<br />

See also<br />

• Dublin Core<br />

• DTD<br />

• Comparison of document markup languages<br />

• Controlled vocabulary<br />

• Interoperability



• Dublin Core Metadata Initiative<br />

• Bibliographic Ontology<br />

Further reading<br />

• IDEAlliance [3]<br />

• PRISM Standard [4]<br />

• PRISM FAQ [5]<br />

• RSS 1.0 PRISM Module [6]<br />

• Using PRISM - The PRISM Cookbook [7] is a systematic guide that demonstrates how to apply PRISM elements<br />

in particular business scenarios. The existing PRISM Cookbook addresses only PRISM Profile 1 (XML).<br />

• W3C – Namespaces in XML [8]<br />

References<br />

[1] PRISM Metadata Standard (http://www.idealliance.org/industry_resources/intelligent_content_informed_workflow/prism)<br />

[2] http://web.resource.org/rss/1.0/spec<br />

[3] http://www.idealliance.org<br />

[4] http://www.prismstandard.org<br />

[5] http://www.prismstandard.org/faq/<br />

[6] http://nurture.nature.com/rss/modules/mod_prism.html<br />

[7] http://www.prismstandard.org/resources/<br />

[8] http://www.w3.org/TR/2006/REC-xml-names11-20060816/<br />

QName<br />

QNames were introduced by XML Namespaces as a concise way to identify a pair of URI reference and local name [1]. QName stands for<br />

"qualified name" and defines a valid identifier for elements and attributes. QNames are generally used to reference<br />

particular elements or attributes within XML documents. [2]<br />

Motivation<br />

Since URI references can be long and may contain characters that are prohibited in element and attribute names, a<br />

QName does not embed the URI itself. Instead, a namespace declaration maps the URI to a short prefix, and the QName<br />

uses that prefix. The mapping abbreviates URIs and so makes XML documents more convenient to write (see Example).<br />

Formal definition<br />

QNames are formally defined by the W3C as [3] :<br />

QName ::= PrefixedName | UnprefixedName<br />

PrefixedName ::= Prefix ':' LocalPart<br />

UnprefixedName ::= LocalPart<br />

Here the Prefix serves as a placeholder for the namespace, and the LocalPart is the local part of the qualified<br />

name. A local part can be an attribute name or an element name.
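
The grammar above can be sketched as a small parser. This is only an illustration: the function name split_qname is hypothetical, and the name pattern is a simplified, ASCII-only approximation of the characters the W3C grammar actually allows.<br />

```python
import re

# Simplified, ASCII-only stand-in for the name characters allowed by the
# W3C grammar (assumption: the real production admits many Unicode characters).
_NAME = r"[A-Za-z_][A-Za-z0-9._-]*"

# QName ::= PrefixedName | UnprefixedName
_QNAME_RE = re.compile(rf"(?:({_NAME}):)?({_NAME})")


def split_qname(qname):
    """Return (prefix, local_part); prefix is None for an UnprefixedName."""
    match = _QNAME_RE.fullmatch(qname)
    if match is None:
        raise ValueError(f"not a valid QName: {qname!r}")
    return match.group(1), match.group(2)


print(split_qname("x:p"))  # prefixed name
print(split_qname("doc"))  # unprefixed name
```

A name with more than one colon, such as "a:b:c", is rejected, because a LocalPart may not itself contain a colon.<br />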



Example<br />

&lt;?xml version='1.0'?&gt;<br />

&lt;doc xmlns:x="http://example.com/ns/foo"&gt;<br />

&lt;x:p/&gt;<br />

&lt;/doc&gt;<br />

In line two the prefix "x" is declared to be associated with the URI "http://example.com/ns/foo". From then on, this<br />

prefix can be used as an abbreviation for the namespace. The tag "x:p" is therefore a valid QName: it uses<br />

"x" as the namespace reference and "p" as the local part. The tag "doc" is also a valid QName, but it consists only of a<br />

local part. [4]<br />
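
How a namespace-aware parser resolves these QNames can be sketched with Python's standard xml.etree.ElementTree module, which reports each tag in the form {namespace URI}local-part. The document string below is an assumed reconstruction of the example described above, and the helper name expanded_names is hypothetical.<br />

```python
import xml.etree.ElementTree as ET

# Assumed example document: "doc" is unprefixed, "x:p" uses the prefix "x".
DOC = """<?xml version='1.0'?>
<doc xmlns:x="http://example.com/ns/foo">
  <x:p/>
</doc>"""


def expanded_names(xml_text):
    """Return the expanded name of every element, in document order."""
    root = ET.fromstring(xml_text)
    return [element.tag for element in root.iter()]


for name in expanded_names(DOC):
    print(name)
```

The prefix itself does not survive parsing: only the URI it was mapped to and the local part identify the element.<br />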

See also<br />

• CURIE<br />

References<br />

[1] Namespaces in XML 1.0 (Second Edition) (http://www.w3.org/TR/REC-xml-names/#dt-qualname)<br />

[2] Using Qualified Names (QNames) as Identifiers in XML Content (http://www.w3.org/2001/tag/doc/qnameids.html#sec-qnames-xml)<br />

[3] Namespaces in XML 1.0 (Second Edition) (http://www.w3.org/TR/REC-xml-names/#NT-QName)<br />

[4] Namespaces in XML 1.0 (Second Edition) (http://www.w3.org/TR/REC-xml-names/#NT-LocalPart)<br />

QTI<br />

The IMS Question and Test Interoperability specification (QTI) defines a standard format for the representation<br />

of assessment content and results, supporting the exchange of this material between authoring and delivery systems,<br />

repositories and other learning management systems. It allows assessment materials to be authored and delivered on<br />

multiple systems interchangeably. It is, therefore, designed to facilitate interoperability between systems [1].<br />

The specification consists of a data model that defines the structure of questions, assessments, and the results of<br />

questions and assessments, together with an XML data binding that essentially defines a language for interchanging<br />

questions and other assessment material. The XML binding is widely used for exchanging questions between<br />

different authoring tools and by publishers. The assessment and results parts of the specification are less widely used.<br />

Background<br />

QTI was produced by the IMS Global Learning Consortium, which is an industry and academic consortium that<br />

develops specifications for interoperable learning technology. QTI was inspired by the need for interoperability in<br />

question design, and by the wish to spare people from losing or re-typing questions when technology changes. Developing and<br />

validating good questions is time-consuming, so it is desirable to be able to create them in a platform- and technology-neutral<br />

format.<br />

QTI version 1.0 was materially based on the proprietary Questions Markup Language (QML) defined by<br />

QuestionMark, but the language has evolved over the years and can now describe almost any reasonable question<br />

that one might want to describe. (QML is still in use by Questionmark and is generated for interoperability by tools<br />

like Adobe Captivate.)<br />

The most widely used version of QTI at the time of writing is version 1.2, which was finalized in 2002. This works<br />

well for exchanging simple question types, and is supported by many tools that allow the creation of questions.<br />

Version 2.0 was released in 2005, with v2.1 due for release in 2008 [2]. Version 2.0 addressed the item (individual question)<br />

level of the specification only, with 2.1 covering assessments and results as well as correcting errors which had



become apparent in 2.0. Version 2.x is a significant improvement on earlier versions, defining a new underlying<br />

interaction model. It is also notable for its significantly greater degree of integration with other specifications (some<br />

of which did not exist during the production of v1): the specification addresses the relationship with IMS Content<br />

Packaging v1.2, IEEE Learning Object Metadata, IMS Learning Design, IMS Simple Sequencing and other<br />

standards such as XHTML. It also provides guidance on representing context-specific usage data and information to<br />

support the migration of content from earlier versions of the specification.<br />

Because v2.0 was limited to items only, and v2.1 has yet to be formally released by IMS (although two public drafts<br />

plus an addendum are currently available), uptake of v2.x has been slow to date. The delay between the release of 2.0<br />

and 2.1 (over three years to date) may have hindered uptake to some extent, with developers reluctant to commit to<br />

v2.0 knowing that v2.1 is in development. The use of a profile of v1.2.1 in the IMS Common Cartridge specification<br />

may exacerbate this. A number of implementations are emerging, however, and uptake may increase once the<br />

specification is finally available in a stable form.<br />

In early 2009, the IMS Global Learning Consortium withdrew QTI 2.1, stating that "Adequate feedback on the<br />

specification has not been received, and therefore, the specification has been put back into the IMS project group<br />

process for further work." [3] The most recent version of QTI that is fully endorsed by IMS GLC is v1.2.1. This<br />

decision met with disapproval on the IMS-QTI mailing list. [4] A further clarification on the QTI 2.1 withdrawal<br />

acknowledged the work done on implementing the QTI 2.1 draft specification, and cited criticism on the lack of<br />

interoperability of IMS specifications as a reason for endorsing only IMS QTI 1.2. [5] A few weeks later IMS GLC<br />

reposted the QTI v2.1 draft specification on their website [6] with a warning that the specification is incomplete:<br />

Caution: The QTIv2.1PD Version 2 specification is incomplete in its current state. The IMS QTI project group<br />

is in the process of evolving this specification based on input from market participants. Suppliers of products<br />

and services are encouraged to participate by contacting Mark McKell at [e-mail address removed]. This<br />

specification will be superseded by an updated release based on the input of the project group participants.<br />

Please note that supplier's claims as to implementation of QTI v2.1 and conformance to it HAVE NOT BEEN<br />

VALIDATED by IMS GLC. While such suppliers are likely well-intentioned, IMS GLC member<br />

organizations have not yet put in place the testing process to validate these claims. IMS GLC currently grants a<br />

conformance mark to the Common Cartridge profile of QTI v1.2.1. [7]<br />

Timeline<br />

Date Version Comments<br />

March 1999 0.5 Internal to IMS<br />

February 2000 1.0 public draft<br />

May 2000 1.0 final release<br />

August 2000 1.01<br />

March 2001 1.1<br />

January 2002 1.2<br />

March 2003 1.2.1 addendum<br />

September 2003 2.0 charter Initiation of working group<br />

January 2005 2.0 final release<br />

January 2006 2.1 public draft<br />

July 2006 2.1 public draft version 2<br />

April 2008 2.1 public draft addendum<br />

early 2009 2.1 removed from website



January 2010 2.1 reinstated on website<br />

Applications with IMS QTI support<br />

Name | QTI version | Type of tool | Comment<br />

ANGEL Learning Management Suite | 2.1 [8] | LMS | also supports IMS Common Cartridge [8]<br />

APIS QTIv2 Assessment Engine | 2.0 draft [9] | Java library & demo application | Incomplete. Author recommends using QTITools instead.<br />

AQuRate | 2.1 [10] | authoring tool | see QTITools<br />

ASDEL | 2.1 [11] | assessment delivery system | see QTITools<br />

ATutor | 1.2, 2.1 [12] | LCMS |<br />

Canvas Learning [13] | 1.2.1 | Authoring tools and SCORM compatible item renderer available as middle-ware solutions | Creators - Can Studios contributed to the development of the QTI specification. A number of LMS systems used the Canvas Learning Player to achieve compatibility with the Becta learning platform conformance regime. The system is currently being distributed to schools in the UK as a result of this integration work.<br />

CCReader | 1.2.1 CC Profile [14] | Common Cartridge Viewer |<br />

Cognero | 1.2 and 2.1 [15] | Assessment authoring and delivery system | Cognero imports QTI 1.2 and exports QTI 1.2 and 2.1 to allow content to work with other systems.<br />

Content-e | 1.2 & 2.0 [16] | Professional authoring tool Content-e | Imports QTI 1.2 and 2.0.<br />

DB Primary | 2.0 [17] [18] | LMS |<br />

Diploma | 1.2, 2.1 [19] | | export QTI 1.2 & 2.1<br />

Dokeos | 1.2 and 2.0 [20] | LMS/LCMS | export QTI 1.2 & 2.0 (1.2 disabled by default but available) (supports SCORM 1.2)<br />

Elques | 2.1 [21] [22] | authoring tool | exports QTI 2.1 and QTI 1.2 (for LMS OLAT only); imports QTI 2.1, Tests from Blackboard and OLAT (kind of QTI 1.2 too)<br />

it's learning | 2.1 [23] | VLE | import and export questions in QTI 2.1 format<br />

ILIAS | not stated [24] | LMS | supports SCORM 1.2 and SCORM 2004<br />

Lectora | not stated [25] | authoring tool | supports SCORM 1.2 and SCORM 2004<br />

Mathqurate | 2.1 [26] | authoring tool | see QTITools. Embedded Gecko engine and support for multiple interactions<br />

Moodle | not stated [27] | LCMS | supports adaptive questions; QTI 2.0 export is still unfinished<br />

Online Learning And Training | QTI 1.2 [28] | LCMS | QTI 2.1 compliance can be achieved with ONYX as plugin<br />

ONYX | 2.1 [29] | modular assessment delivery system | open-source, QTI 2.1 import and export, Report Viewer for graphical visualization of QTI-Result-Files<br />

OWL Testing Software | not stated [30] | test management system | can import IMS QTI<br />

QTITools | 2.1 [31] | collection of tools and libraries | Test authoring tool Spectatus produces QTI 2.1 [32]<br />

QuestionMark Perception | not stated [33] | authoring tool and delivery system | can export IMS QTI, an online tool provides QTI 1.2 import<br />

Question Writer 2.0 Publisher Edition | 1.2 [34] | authoring tool | Exports as QTI 1.2 and SCORM 1.2 [35]<br />

Question Writer 3.5 Professional | 1.2 [36] | authoring tool | Exports as QTI 1.2 and SCORM 1.2 [37] Also specific QTI Export for Pearson VUE [38]<br />

Respondus | 1.2 [39] [40] | authoring tool | QTI export<br />

RM Test Authoring System | 2.1 [41] | authoring tool |<br />

Sakai | 1.2 [42] | LMS |<br />

SToMP (Software Teaching of Modular Physics) | 2.1 [43] | assessment system | mostly unavailable as of July 2008<br />

Studywiz | 1.2 [44] | Virtual Learning Environment Module | An optional module for creating and assigning QTI v1.2 questions to students. Available as of June 2008<br />

Wimba Create | QTI Lite [45] | authoring tool | only export<br />

Other software:<br />

• QTI Migration Tool (University of Cambridge): converts QTI version 1.x data into QTI 2.0 content packages. [46]<br />

External links<br />

• IMS Global Learning Consortium: IMS Question & Test Interoperability Specification [47]<br />

• TOIA (Technologies for Online Interoperable Assessment) [48] - this project ended in 2007 and software is no<br />

longer available.<br />

• QTI Tools [49]<br />

• JISC CETIS Assessment special interest group [50]<br />

• JISC CETIS wiki: Assessment tools, projects and resources [51]<br />

• IMS Question & Test Interoperability mailing list [52]



References<br />

[1] Effective Practice with e-Assessment guide, p.44 (http://www.jisc.ac.uk/media/documents/themes/elearning/effpraceassess.pdf)<br />

[2] QTI Update (http://wiki.cetis.ac.uk/Assessment_and_EC_SIGs_meeting_Feb_2008#QTI_Update)<br />

[3] IMS Global Learning Consortium: IMS Question & Test Interoperability Specification (http://www.imsglobal.org/question/index.html).<br />

Accessed March 29, 2009.<br />

[4] E-mail thread "QTI 2.1 draft specification withdrawn" (http://lists.ucles.org.uk/public/ims-qti/2009-March/001456.html), starting<br />

March 27, 2009.<br />

[5] Rob Abel: Further clarification on the removal of QTI v2.1 from the IMS web site (http://www.imsglobal.org/community/forum/<br />

messageview.cfm?catid=21&threadid=36&enterthread=y), on the IMS Global Learning Consortium's Question and Test Interoperability<br />

Forum, March 30, 2009. Accessed March 29, 2009.<br />

[6] rabel: We are reposting the QTI v2.1 (http://www.imsglobal.org/community/forum/messageview.cfm?catid=21&threadid=41&<br />

enterthread=y). Question and Test Interoperability Forum, April 14, 2009. Accessed April 17, 2009.<br />

[7] IMS Global Learning Consortium: IMS Question & Test Interoperability Specification (http://www.imsglobal.org/question/index.html).<br />

Accessed April 17, 2009.<br />

[8] ANGEL Learning Management Suite: Standards Leadership (http://www.angellearning.com/products/lms/standards.html). Accessed<br />

March 30, 2009.<br />

[9] Sourceforge.net: APIS QTIv2 Assessment Engine (http://sourceforge.net/projects/apis). Accessed March 30, 2009.<br />

[10] AQuRate: A QTI-2.x Authoring Tool (http://aqurate.kingston.ac.uk/). Accessed March 30, 2009.<br />

[11] ASDEL: assessment delivery system for QTIv2 questions (http://www.asdel.ecs.soton.ac.uk/). Accessed March 30, 2009.<br />

[12] ATutor Learning Content Management System: Information (http://www.atutor.ca/atutor/). Accessed March 30, 2009.<br />

[13] Canvas Learning (http://www.canvaslearning.com). Accessed August, 2009.<br />

[14] CCReader project in Sourceforge (http://sourceforge.net/projects/ccreader). Accessed March 30, 2009.<br />

[15] Cognero: Cognero Features (http://www.cognero.com/features.html). Accessed February 19, 2009<br />

[16] Professional authoring tool content-e. (http://eng.content-e.nl/) Accessed July, 2009.<br />

[17] iBoard content available in DB Primary (http://www.e2bn.org/services/120/iboard-content-available-in-db-primary.html). Accessed<br />

March 30, 2009.<br />

[18] DB Primary's own Technical Overview (http://www.getprimary.com/tech_spec.html) does not mention QTI.<br />

[19] Diploma 6 (Windows) Release Notes (6.61 (Build 0087 - 8/8/2008)) (http://www.brownstone.net/support/Dip6-ReleaseNotes.asp).<br />

Accessed March 30, 2009.<br />

[20] Dokeos code (no other reference available) (http://dokeos.svn.sourceforge.net/viewvc/dokeos/trunk/dokeos/main/exercice/export/)<br />

[21] Elques: Elques Features (http://elques.bps-system.de/en/?Features). Accessed March 30, 2009.<br />

[22] Elques: Elques 2.0 (http://elques.bps-system.de/) (in German). Accessed<br />

September 30, 2009.<br />

[23] it's learning: Importing and exporting (https://www.itslearning.com/Ntt/Help/en-GB/Default_Left.htm#StartTopic=Adding). Accessed<br />

June 19, 2009.<br />

[24] ILIAS France (http://ilias-france.info/ilias.htm). Accessed March 30, 2009.<br />

[25] Lectora Supports eLearning Standards (http://www.trivantis.com/products/elearningstandards.html). Accessed March 30, 2009.<br />

[26] Mathqurate: Maths-enabled QTI-2.1 item authoring (http://aqurate.kingston.ac.uk/mathqurate/). Accessed April 3, 2009.<br />

[27] Development:Question engine - MoodleDocs (http://docs.moodle.org/en/Question_engine). Accessed March 30, 2009.<br />

[28] OLAT Feature List and Some Screenshots (http://www.olat.org/website/en/html/about_features.html). Accessed March 30, 2009.<br />

[29] Onyx Feature List and more Infos (http://onyx.bps-system.de/en/?Features). Accessed March 30, 2009.<br />

[30] OWL Test Conversion Service (http://www.owlts.com/test-conversion.html). Accessed March 30, 2009.<br />

[31] SourceForge.net: QTItools (http://sourceforge.net/projects/qtitools/). Accessed March 30, 2009.<br />

[32] Paul Neve: " Spectatus - QTI 2.1 test authoring tool (http://lists.ucles.org.uk/public/ims-qti/2010-February/001571.html)", IMS-QTI<br />

mailing list, February 26, 2010. Accessed April 14, 2010.<br />

[33] Questionmark - Windows Based Authoring - Question Types (http://www.questionmark.com/us/perception/<br />

authoring_windows_qm_qtypes.aspx). Accessed March 30, 2009.<br />

[34] Publisher's Legacy Software Page (http://www.questionwriter.com/pricing/custom-development.html). Accessed March 31, 2009.<br />

[35] Question Writer 2.0 Publisher Edition Manual (http://downloads.centralquestion.com/QuestionWriterManual.pdf). Accessed March 31,<br />

2009.<br />

[36] Question Writer Blog Announcement (http://www.questionwriterblog.com/archives/2009/05/question_writer_34.html). Accessed May<br />

18, 2009.<br />

[37] Question Writer Features Description (http://www.questionwriter.com/features.html). Accessed May 18, 2009.<br />

[38] Question Writer Blog Entry on Feature (http://www.questionwriterblog.com/archives/2009/06/qti_for_pearson_vue.html). Accessed<br />

July 29, 2009.<br />

[39] Respondus Plug-in for Moodle (http://www.respondus.com/update/2007-11-c.shtml). Accessed March 30, 2009.<br />

[40] The Respondus Version 3.5 page (http://www.respondus.com/products/respondus.shtml) does not mention the QTI version.<br />

[41] RM: Test Authoring System (http://www.rm.com/generic.asp?cref=GP1002551). Accessed March 31, 2009.



[42] Sakai: SAMigo/Test and Quizzes (http://bugs.sakaiproject.org/confluence/display/SAM/Home). Accessed March 30, 2009.<br />

[43] SToMP: An Overview (http://www.stomp.ac.uk/). Accessed March 31, 2009.<br />

[44] Studywiz QT Assessment (http://www.europe.studywiz.com/?page_id=72). Accessed April 03, 2009.<br />

[45] Wimba Create Brochure (http://www.wimba.com/assets/resources/wimbaCrBrochure_HE.pdf). Accessed March 30, 2009.<br />

[46] QTI Migration Tool (http://qtitools.caret.cam.ac.uk/index.php?option=com_docman&task=cat_view&gid=18&Itemid=28). Accessed<br />

March 30, 2009.<br />

[47] http://www.imsglobal.org/question<br />

[48] http://www.toia.ac.uk<br />

[49] http://qtitools.caret.cam.ac.uk/<br />

[50] http://jisc.cetis.ac.uk/domain/assessment<br />

[51] http://wiki.cetis.ac.uk/Assessment_tools%2C_projects_and_resources<br />

[52] http://lists.ucles.org.uk/lists/listinfo/ims-qti



Resource Description Framework<br />

Current Status Published<br />

Editors Frank Manola, Eric Miller<br />

Base Standards XML, URI<br />

Related Standards RDFS, OWL<br />

Domain Semantic Web<br />

Abbreviation RDF<br />

Website RDF Primer [1]<br />

The Resource Description Framework (RDF) is a family of World Wide Web Consortium (W3C) specifications<br />

originally designed as a metadata data model. It has come to be used as a general method for conceptual description<br />

or modeling of information that is implemented in web resources, using a variety of syntax formats.<br />

Overview<br />

The RDF data model [2] is similar to classic conceptual modeling approaches such as Entity-Relationship or Class<br />

diagrams, as it is based upon the idea of making statements about resources (in particular Web resources) in the form<br />

of subject-predicate-object expressions. These expressions are known as triples in RDF terminology. The subject<br />

denotes the resource, and the predicate denotes traits or aspects of the resource and expresses a relationship between<br />

the subject and the object. For example, one way to represent the notion "The sky has the color blue" in RDF is as<br />

the triple: a subject denoting "the sky", a predicate denoting "has the color", and an object denoting "blue". RDF is<br />

an abstract model with several serialization formats (i.e., file formats), and so the particular way in which a resource<br />

or triple is encoded varies from format to format.<br />
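
The subject-predicate-object structure can be made concrete with a minimal in-memory sketch. It is independent of any serialization format; the http://example.org/ namespace, the hasColor property, and the helper name objects are hypothetical choices made for illustration.<br />

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF statement: subject, predicate, object."""
    subject: str
    predicate: str
    object: str


EX = "http://example.org/"  # hypothetical namespace for this sketch

# "The sky has the color blue", plus a second statement for contrast.
graph = {
    Triple(EX + "sky", EX + "hasColor", "blue"),
    Triple(EX + "grass", EX + "hasColor", "green"),
}


def objects(graph, subject, predicate):
    """Return every object asserted for the given subject and predicate."""
    return sorted(
        t.object for t in graph
        if t.subject == subject and t.predicate == predicate
    )


print(objects(graph, EX + "sky", EX + "hasColor"))
```

Storing the statements in a set reflects the fact that an RDF graph is a set of triples: asserting the same statement twice adds nothing.<br />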

This mechanism for describing resources is a major component in what is proposed by the W3C's Semantic Web<br />

activity: an evolutionary stage of the World Wide Web in which automated software can store, exchange, and use<br />

machine-readable information distributed throughout the Web, in turn enabling users to deal with the information<br />

with greater efficiency and certainty. RDF's simple data model and ability to model disparate, abstract concepts have<br />

also led to its increasing use in knowledge management applications unrelated to Semantic Web activity.<br />

A collection of RDF statements intrinsically represents a labeled, directed multi-graph. As such, an RDF-based data<br />

model is more naturally suited to certain kinds of knowledge representation than the relational model and other<br />

ontological models. However, in practice, RDF data is often persisted in relational databases or in native stores<br />

called triplestores, or quad stores if the context (i.e. the named graph) is also persisted for each RDF triple. [3] As<br />

RDFS and OWL demonstrate, additional ontology languages can be built upon RDF.<br />

History<br />

There were several ancestors to the W3C's RDF. Technically the closest was MCF, a project initiated by<br />

Ramanathan V. Guha while at Apple Computer and continued, with contributions from Tim Bray, during his tenure<br />

at Netscape Communications Corporation. Ideas from the Dublin Core community, and from PICS, the Platform for<br />

Internet Content Selection (the W3C's early Web content labelling system) were also key in shaping the direction of<br />

the RDF project.<br />

The W3C published a specification of RDF's data model and XML syntax as a Recommendation in 1999. [4] Work<br />

then began on a new version that was published as a set of related specifications in 2004. While there are a few



implementations based on the 1999 Recommendation that have yet to be completely updated, adoption of the<br />

improved specifications has been rapid since they were developed in full public view, unlike some earlier<br />

technologies of the W3C. Most newcomers to RDF are unaware that the older specifications even exist.<br />

RDF Topics<br />

RDF Vocabulary<br />

The vocabulary defined by the RDF specification is:<br />

• rdf:type - a predicate used to state that a resource is an instance of a class<br />

• rdf:XMLLiteral - the class of XML literal values<br />

• rdf:Property - the class of properties<br />

• rdf:Alt, rdf:Bag, rdf:Seq - containers of alternatives, unordered containers, and ordered containers (rdfs:Container<br />

is a super-class of the three)<br />

• rdf:List - the class of RDF Lists<br />

• rdf:nil - an instance of rdf:List representing the empty list<br />

• rdf:Statement, rdf:subject, rdf:predicate, rdf:object – used for reification (see below)<br />

This vocabulary is used as a foundation for RDF Schema where it is extended.<br />

Serialization formats<br />

Two common serialization formats are in use.<br />

The first is an XML format. This format is often called simply RDF because it was introduced among the other W3C<br />

specifications defining RDF. However, it is important to distinguish the XML format from the abstract RDF model<br />

itself. Its MIME media type, application/rdf+xml, was registered by RFC 3870, which recommends that RDF documents<br />

follow the 2004 specifications.<br />

In addition to serializing RDF as XML, the W3C introduced Notation 3 (or N3) as a non-XML serialization of RDF<br />

models designed to be easier to write by hand, and in some cases easier to follow. Because it is based on a tabular<br />

notation, it makes the underlying triples encoded in the documents more easily recognizable compared to the XML<br />

serialization. N3 is closely related to the Turtle and N-Triples formats.<br />

Triples may be stored in a triplestore.<br />

Resource identification<br />

The subject of an RDF statement is either a Uniform Resource Identifier (URI) or a blank node, both of which<br />

denote resources. Resources indicated by blank nodes are called anonymous resources. They are not directly<br />

identifiable from the RDF statement. The predicate is a URI which also indicates a resource, representing a<br />

relationship. The object is a URI, blank node or a Unicode string literal.<br />

In Semantic Web applications, and in relatively popular applications of RDF like RSS and FOAF (Friend of a<br />

Friend), resources tend to be represented by URIs that intentionally denote, and can be used to access, actual data on<br />

the World Wide Web. But RDF, in general, is not limited to the description of Internet-based resources. In fact, the<br />

URI that names a resource does not have to be dereferenceable at all. For example, a URI that begins with "http:"<br />

and is used as the subject of an RDF statement does not necessarily have to represent a resource that is accessible via<br />

HTTP, nor does it need to represent a tangible, network-accessible resource — such a URI could represent<br />

absolutely anything. However, there is broad agreement that a bare URI (without a # symbol) which returns a<br />

200-level coded response when used in an HTTP GET request should be treated as denoting the internet resource that it<br />

succeeds in accessing.



Therefore, producers and consumers of RDF statements must agree on the semantics of resource identifiers. Such<br />

agreement is not inherent to RDF itself, although there are some controlled vocabularies in common use, such as<br />

Dublin Core Metadata, which is partially mapped to a URI space for use in RDF. The intent of publishing<br />

RDF-based ontologies on the Web is often to establish, or circumscribe, the intended meanings of the resource<br />

identifiers used to express data in RDF. For example, the URI<br />

http://www.w3.org/TR/2004/REC-owl-guide-20040210/wine#merlot is intended by its owners to refer to the class of all Merlot red wines, an<br />

intent which is expressed by the OWL ontology — itself an RDF document — in which it occurs. Note that this is<br />

not a 'bare' resource identifier, but is rather a URI reference, containing the '#' character and ending with a fragment<br />

identifier.<br />
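
The distinction between a bare URI and a URI reference with a fragment identifier can be checked with the standard urllib.parse.urldefrag function, as in this short sketch.<br />

```python
from urllib.parse import urldefrag

uri_reference = "http://www.w3.org/TR/2004/REC-owl-guide-20040210/wine#merlot"

# urldefrag splits a URI reference into the bare URI and its fragment identifier.
bare_uri, fragment = urldefrag(uri_reference)

print(bare_uri)
print(fragment)
```

An empty fragment string would indicate that the input was already a bare URI.<br />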

Statement reification and context<br />

The body of knowledge modeled by a collection of statements may be subjected to reification, in which each<br />

statement (that is, each subject-predicate-object triple as a whole) is assigned a URI and treated as a resource about<br />

which additional statements can be made, as in "Jane says that John is the author of document X". Reification is<br />

sometimes important in order to deduce a level of confidence or degree of usefulness for each statement.<br />

In a reified RDF database, each original statement, being a resource itself, most likely has at least three additional<br />

statements made about it: one to assert that its subject is some resource, one to assert that its predicate is some<br />

resource, and one to assert that its object is some resource or literal. More statements about the original statement<br />

may also exist, depending on the application's needs.<br />
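
The additional statements described above can be sketched using the reification vocabulary from the RDF namespace (rdf:Statement, rdf:subject, rdf:predicate, rdf:object). The example.org URIs, the "says" property, and the function name reify are hypothetical; the rdf:type statement is a fourth triple commonly added alongside the three mentioned in the text.<br />

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EX = "http://example.org/"  # hypothetical namespace for this sketch


def reify(statement_uri, subject, predicate, obj):
    """Return the triples that describe one statement as a resource."""
    return [
        (statement_uri, RDF + "type", RDF + "Statement"),
        (statement_uri, RDF + "subject", subject),
        (statement_uri, RDF + "predicate", predicate),
        (statement_uri, RDF + "object", obj),
    ]


# "John is the author of document X", reified as the resource stmt1 ...
triples = reify(EX + "stmt1", EX + "documentX", EX + "author", EX + "john")

# ... so that a further statement can be made about it: "Jane says stmt1".
triples.append((EX + "jane", EX + "says", EX + "stmt1"))

for triple in triples:
    print(triple)
```

Note that the reification triples describe the statement without asserting it: the original triple would have to be added to the graph separately.<br />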

Borrowing from concepts available in logic (and as illustrated in graphical notations such as conceptual graphs and<br />

topic maps), some RDF model implementations acknowledge that it is sometimes useful to group statements<br />

according to different criteria, called situations, contexts, or scopes, as discussed in articles by RDF specification<br />

co-editor Graham Klyne [5] [6]. For example, a statement can be associated with a context, named by a URI, in order<br />

to assert an "is true in" relationship. As another example, it is sometimes convenient to group statements by their<br />

source, which can be identified by a URI, such as the URI of a particular RDF/XML document. Then, when updates<br />

are made to the source, the corresponding statements can be changed in the model as well.<br />

Implementation of scopes does not necessarily require fully reified statements. Some implementations allow a single<br />

scope identifier to be associated with a statement that has not itself been assigned a URI [7] [8]. Likewise, named<br />

graphs, in which a set of triples is named by a URI, can represent context without the need to reify the triples. [9]<br />
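
Grouping statements by source, as described above, can be sketched as a mapping from a graph URI to the set of triples that came from it. This is an illustrative toy store, not the design of any particular implementation; all names and URIs are hypothetical.<br />

```python
from collections import defaultdict

# Graph URI (the context or source) -> set of (subject, predicate, object).
store = defaultdict(set)


def replace_graph(store, graph_uri, triples):
    """Swap in the statements of one source after that source was updated."""
    store[graph_uri] = set(triples)


def graphs_asserting(store, triple):
    """Return the URIs of all named graphs that contain the given triple."""
    return sorted(uri for uri, triples in store.items() if triple in triples)


DOC_A = "http://example.org/a.rdf"
DOC_B = "http://example.org/b.rdf"
fact = ("http://example.org/sky", "http://example.org/hasColor", "blue")

replace_graph(store, DOC_A, [fact])
replace_graph(store, DOC_B, [fact])
print(graphs_asserting(store, fact))

# Document B is updated and no longer makes the statement.
replace_graph(store, DOC_B, [])
print(graphs_asserting(store, fact))
```

Because the graph URI travels with each group of triples, no statement needs a URI of its own.<br />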

Query and inference languages<br />

The predominant query language for RDF graphs is SPARQL. SPARQL is an SQL-like language, and a<br />

recommendation of the W3C as of January 15, 2008.<br />

The following example SPARQL query shows country capitals in Africa, using a fictional ontology:<br />

PREFIX abc: .<br />

SELECT ?capital ?country<br />

WHERE {<br />

}<br />

?x abc:cityname ?capital ;<br />

abc:isCapitalOf ?y.<br />

?y abc:countryname ?country ;<br />

abc:isInContinent abc:Africa.<br />
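To make the pattern matching behind such a query concrete, here is a small Python sketch that evaluates the same four triple patterns over an in-memory list of tuples. It does not use a SPARQL engine; the `match` function is a hypothetical helper, and the `ex:` identifiers and sample data are invented for illustration.

```python
def match(patterns, triples, binding=None):
    """Yield variable bindings that satisfy every triple pattern in `patterns`."""
    binding = binding or {}
    if not patterns:
        yield dict(binding)
        return
    first, rest = patterns[0], patterns[1:]
    for triple in triples:
        candidate = dict(binding)
        for term, value in zip(first, triple):
            if term.startswith("?"):
                # Bind the variable, or check it against an earlier binding.
                if candidate.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            yield from match(rest, triples, candidate)


triples = [
    ("ex:c1", "abc:cityname", "Cairo"),
    ("ex:c1", "abc:isCapitalOf", "ex:eg"),
    ("ex:eg", "abc:countryname", "Egypt"),
    ("ex:eg", "abc:isInContinent", "abc:Africa"),
    ("ex:c2", "abc:cityname", "Paris"),
    ("ex:c2", "abc:isCapitalOf", "ex:fr"),
    ("ex:fr", "abc:countryname", "France"),
    ("ex:fr", "abc:isInContinent", "abc:Europe"),
]

query = [
    ("?x", "abc:cityname", "?capital"),
    ("?x", "abc:isCapitalOf", "?y"),
    ("?y", "abc:countryname", "?country"),
    ("?y", "abc:isInContinent", "abc:Africa"),
]

results = [(b["?capital"], b["?country"]) for b in match(query, triples)]
print(results)
```

Shared variables such as `?x` and `?y` are what join the patterns: a binding survives only if every pattern agrees on it.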

Other ways to query RDF graphs include:<br />

• RDQL, precursor to SPARQL, SQL-like<br />

• Versa, compact syntax (non–SQL-like), solely implemented in 4Suite (Python)


• RQL, one of the first declarative languages for uniformly querying RDF schemas and resource descriptions, implemented in RDFSuite.

• XUL has a template [10] element in which to declare rules for matching data in RDF. XUL uses RDF extensively<br />

for databinding.<br />

Examples<br />

Example 1: RDF Description of a person named Eric Miller [11]<br />

Here is an example taken from the W3C website [11] describing a resource with statements "there is a Person identified by http://www.w3.org/People/EM/contact#me, whose name is Eric Miller, whose email address is em@w3.org, and whose title is Dr.".

Figure: An RDF Graph Describing Eric Miller [11]

The resource "http://www.w3.org/People/EM/contact#me" is the subject. The objects are: (i) "Eric Miller" (with a predicate "whose name is"), (ii) em@w3.org (with a predicate "whose email address is"), and (iii) "Dr." (with a predicate "whose title is"). The subject is a URI. The predicates also have URIs. For example, the URI for the predicate: (i) "whose name is" is http://www.w3.org/2000/10/swap/pim/contact#fullName, (ii) "whose email address is" is http://www.w3.org/2000/10/swap/pim/contact#mailbox, (iii) "whose title is" is http://www.w3.org/2000/10/swap/pim/contact#personalTitle. In addition, the subject has a type (with URI http://www.w3.org/1999/02/22-rdf-syntax-ns#type), which is person (with URI http://www.w3.org/2000/10/swap/pim/contact#Person), and a mailbox (with URI http://www.w3.org/2000/10/swap/pim/contact#mailbox). Therefore, the following "subject, predicate, object" RDF triples can be expressed:

(i) http://www.w3.org/People/EM/contact#me, http://www.w3.org/2000/10/swap/pim/contact#fullName, "Eric Miller"

(ii) http://www.w3.org/People/EM/contact#me, http://www.w3.org/2000/10/swap/pim/contact#personalTitle, "Dr."

(iii) http://www.w3.org/People/EM/contact#me, http://www.w3.org/1999/02/22-rdf-syntax-ns#type, http://www.w3.org/2000/10/swap/pim/contact#Person

(iv) http://www.w3.org/People/EM/contact#me, http://www.w3.org/2000/10/swap/pim/contact#mailbox, em@w3.org
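The four triples can also be held as data and written out one statement per line. The sketch below does this in Python using the N-Triples line layout; `to_ntriples` is a hypothetical helper, only basic string escaping is handled, and the mailbox is kept as the plain string given in the text above, which is an assumption about how it should be typed.

```python
ME = "http://www.w3.org/People/EM/contact#me"
CONTACT = "http://www.w3.org/2000/10/swap/pim/contact#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Each object is tagged as either a URI or a literal string.
triples = [
    (ME, CONTACT + "fullName", ("literal", "Eric Miller")),
    (ME, CONTACT + "personalTitle", ("literal", "Dr.")),
    (ME, RDF_TYPE, ("uri", CONTACT + "Person")),
    (ME, CONTACT + "mailbox", ("literal", "em@w3.org")),
]


def to_ntriples(triples):
    """Serialise (subject, predicate, (kind, value)) tuples, one statement per line."""
    lines = []
    for subject, predicate, (kind, value) in triples:
        if kind == "uri":
            obj = f"<{value}>"
        else:
            escaped = value.replace("\\", "\\\\").replace('"', '\\"')
            obj = f'"{escaped}"'
        lines.append(f"<{subject}> <{predicate}> {obj} .")
    return "\n".join(lines)


print(to_ntriples(triples))
```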

Example 2: The postal abbreviation for New York<br />

Certain concepts in RDF are taken from logic and linguistics, where subject-predicate and subject-predicate-object<br />

structures have meanings similar to, yet distinct from, the uses of those terms in RDF. This example demonstrates:<br />

In the English language statement 'New York has the postal abbreviation NY' , 'New York' would be the subject, 'has<br />

the postal abbreviation' the predicate and 'NY' the object.<br />

Encoded as an RDF triple, the subject and predicate would have to be resources named by URIs. The object could be a resource or literal element. For example, in the Notation 3 form of RDF, the statement might look like:

<urn:x-states:New%20York> <http://purl.org/dc/terms/alternative> "NY" .

In this example, "urn:x-states:New%20York" is the URI for a resource that denotes the U.S. state New York, "http://purl.org/dc/terms/alternative" is the URI for a predicate (whose human-readable definition can be found here [12]), and "NY" is a literal string. Note that the URIs chosen here are not standard, and don't need to be, as long as their meaning is known to whatever is reading them.

Notation 3 is just one of several serialization formats for RDF. The triple above can also be equivalently represented in the standard RDF/XML format as:

<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dcterms="http://purl.org/dc/terms/">
  <rdf:Description rdf:about="urn:x-states:New%20York">
    <dcterms:alternative>NY</dcterms:alternative>
  </rdf:Description>
</rdf:RDF>

However, because of the restrictions on the syntax of QNames (such as dcterms:alternative above), there are some RDF graphs that are not representable with RDF/XML.

Example 3: A Wikipedia article about Tony Benn<br />

In a like manner, given that "http://en.wikipedia.org/wiki/Tony_Benn" identifies a particular resource (regardless of whether that URI could be traversed as a hyperlink, or whether the resource is actually the Wikipedia article about Tony Benn), to say that the title of this resource is "Tony Benn" and its publisher is "Wikipedia" would be two assertions that could be expressed as valid RDF statements. In the N-Triples form of RDF, these statements might look like the following:

<http://en.wikipedia.org/wiki/Tony_Benn> <http://purl.org/dc/elements/1.1/title> "Tony Benn" .
<http://en.wikipedia.org/wiki/Tony_Benn> <http://purl.org/dc/elements/1.1/publisher> "Wikipedia" .

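And these statements might be expressed in RDF/XML as:

<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://en.wikipedia.org/wiki/Tony_Benn">
    <dc:title>Tony Benn</dc:title>
    <dc:publisher>Wikipedia</dc:publisher>
  </rdf:Description>
</rdf:RDF>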

To an English-speaking person, the same information could be represented simply as:<br />

The title of this resource, which is published by Wikipedia, is 'Tony Benn'<br />

However, RDF puts the information in a formal way that a machine can understand. The purpose of RDF is to<br />

provide an encoding and interpretation mechanism so that resources can be described in a way that particular<br />

software can understand it; in other words, so that software can access and use information that it otherwise couldn't<br />

use.<br />

Both versions of the statements above are wordy because one requirement for an RDF resource (as a subject or a<br />

predicate) is that it be unique. The subject resource must be unique in an attempt to pinpoint the exact resource being<br />

described. The predicate needs to be unique in order to reduce the chance that the idea of Title or Publisher will be<br />

ambiguous to software working with the description. If the software recognizes http://purl.org/dc/elements/1.1/title<br />

(a specific definition for the concept of a title established by the Dublin Core Metadata Initiative), it will also know<br />

that this title is different from a land title or an honorary title or just the letters t-i-t-l-e put together.


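The following example shows how such simple claims can be elaborated on, by combining multiple RDF vocabularies. Here, we note that the primary topic of the Wikipedia page is a "Person" whose name is "Tony Benn":

<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://en.wikipedia.org/wiki/Tony_Benn">
    <dc:title>Tony Benn</dc:title>
    <dc:publisher>Wikipedia</dc:publisher>
    <foaf:primaryTopic>
      <foaf:Person>
        <foaf:name>Tony Benn</foaf:name>
      </foaf:Person>
    </foaf:primaryTopic>
  </rdf:Description>
</rdf:RDF>

Applications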

• Sigma [13] - Application from DERI at the National University of Ireland, Galway (NUIG).

• Creative Commons - Uses RDF to embed license information in web pages and mp3 files.<br />

• DOAC (Description of a Career) - supplements FOAF to allow the sharing of résumé information.<br />

• FOAF (Friend of a Friend) - designed to describe people, their interests and interconnections.<br />

• Haystack client - Semantic web browser from MIT CS & AI lab. [14]<br />

• IDEAS Group - developing a formal 4D Ontology for Enterprise Architecture using RDF as the encoding. [15]<br />

• Microsoft shipped a product, Connected Services Framework [16], which provides RDF-based Profile Management capabilities.

• MusicBrainz - Publishes information about Music Albums. [17]<br />

• NEPOMUK - an open-source software specification for a Social Semantic Desktop that uses RDF as a storage format for collected metadata. NEPOMUK is mostly known for its integration into the KDE4 desktop environment.

• RDF Site Summary - one of several "RSS" languages for publishing information about updates made to a web<br />

page; it is often used for disseminating news article summaries and sharing weblog content.<br />

• Simple Knowledge Organization System (SKOS) - a knowledge representation (KR) scheme intended to support vocabulary/thesaurus applications.

• SIOC (Semantically-Interlinked Online Communities) - designed to describe online communities and to create<br />

connections between Internet-based discussions from message boards, weblogs and mailing lists. [18]<br />

• Smart-M3 - provides an infrastructure for using RDF and specifically uses the ontology agnostic nature of RDF to<br />

enable heterogeneous mashing-up of information [19]<br />

• Many other RDF schemas are available by searching SchemaWeb. [20]<br />

Some uses of RDF include research into social networking. Such research could help governments keep track of terrorist cells, help people in business better understand their relationships with members of industries that could be of use for product placement [21], and help scientists understand how people are connected to one another.

RDF is also being used to gain a better understanding of traffic patterns. Information about traffic patterns is spread across different websites, and RDF is used to integrate information from different sources on the web. Previously, the common methodology was keyword searching, but this method is problematic because it does not consider synonyms, which is why ontologies are useful in this situation. One issue that arises when trying to study traffic efficiently is that fully understanding traffic requires a good understanding of concepts related to people, streets, and roads. Since these are human concepts, they require the addition of fuzzy logic: values that are useful when describing roads, such as slipperiness, are not precise concepts and cannot be measured. This implies that the best solution would incorporate both fuzzy logic and ontology. [22]

See also<br />

Notations for RDF<br />

• N3<br />

• N-Triples<br />

• TRiG<br />

• TRiX<br />

• Turtle<br />

• RDF/XML

• RDFa<br />

Ontology/vocabulary languages<br />

• OWL<br />

• SKOS<br />

• RDF schema<br />

Similar concepts<br />

• Entity-attribute-value model<br />

• Graph theory - An RDF model is a labeled, directed multi-graph.<br />

• Website Parse Template<br />

• Tagging<br />

• Topic Maps - Topic Maps is, in some ways, similar to RDF.

• Semantic network<br />

Other (unsorted)<br />

• Associative model of data<br />

• Business Intelligence 2.0 (BI 2.0)<br />

• DataPortability<br />

• Folksonomy<br />

• GRDDL<br />

• Life Science Identifiers<br />

• Meta Content Framework<br />

• Semantic Web<br />

• Swoogle<br />

• Universal Networking Language (UNL)


Further reading<br />

• RDF at W3C [23]: specifications, guides, and resources

• RDF Semantics [24] : specification of semantics, and complete systems of inference rules for both RDF and RDFS<br />

Tutorials and documents<br />

• Quick Intro to RDF [25]<br />

• RDF in Depth [26]<br />

• Introduction to the RDF Model [27]<br />

• What is RDF? [28]<br />

• An introduction to RDF [29]<br />

• RDF and XUL [30] , with examples.<br />

External links<br />

News and resources<br />

• Dave Beckett's RDF Resource Guide [31]<br />

• Resource Description Framework: According to W3C specifications and Mozilla's documentation [30]<br />

• RDF Datasources [32] : RDF datasources in Mozilla<br />

• The Finance Ontology [33] Semantic web application under construction.<br />

RDF software tools<br />

• Raptor RDF Parser Library [34]<br />

• Listing of RDF and OWL tools at W3C wiki [35]<br />

• SemWebCentral [36] Open Source semantic web tools<br />

• Intellidimension [37] Semantic web software and tools for Windows, .NET/C# and SQL Server<br />

• Listing of RDF software at xml.com [38]<br />

• Rhodonite [39] : freeware RDF editor and RDF browser with a drag-and-drop interface<br />

• D2R Server [40] : tool to publish relational databases as an RDF-graph<br />

• Virtuoso Universal Server: a SPARQL compliant platform for RDF data management, SQL-RDF integration, and<br />

RDF based Linked Data deployment<br />

• ROWLEX [41] : .NET library and toolkit built to create and browse RDF documents easily. It abstracts away the<br />

level of RDF triples and elevates the level of the programming work to (OWL) classes and properties.<br />

• AlchemyAPI [42] : web service API / SDK that converts unstructured text into RDF & Linked Data.<br />

• The Sweet Tools [43] listing of 800+ RDF and related tools, most of them open source, sortable by category and language (among other facets).

RDF datasources<br />

• Wikipedia 3 [44] : System One's RDF conversion of the English Wikipedia, updated monthly<br />

• DBpedia: a Linking Open Data Community Project [45] that exposes an ever-increasing collection of RDF-based Linked Data sources

• Semantic Systems Biology [46]


References<br />

[1] http://www.w3.org/TR/rdf-primer/<br />

[2] "Resource Description Framework (RDF) Model and Syntax Specification" (http://www.w3.org/TR/PR-rdf-syntax/)

[3] Optimized Index Structures for Querying RDF from the Web (http://sw.deri.org/2005/02/dexa/yars.pdf) Andreas Harth, Stefan Decker,<br />

3rd Latin American Web Congress, Buenos Aires, Argentina, October 31 to November 2, 2005, pp. 71-80<br />

[4] W3C 1999 specification (http://www.w3.org/TR/rdf-syntax-grammar/)<br />

[5] Contexts for RDF Information Modelling (http://www.ninebynine.org/RDFNotes/RDFContexts.html)<br />

[6] Circumstance, Provenance and Partial Knowledge (http://www.ninebynine.org/RDFNotes/UsingContextsWithRDF.html)<br />

[7] The Concept of 4Suite RDF Scopes (http://uche.ogbuji.net/tech/akara/nodes/2003-01-01/scopes)<br />

[8] Redland RDF Library - Contexts (http://librdf.org/notes/contexts.html)<br />

[9] Named Graphs (http://www.w3.org/2004/03/trix/)<br />

[10] http://developer.mozilla.org/en/docs/XUL:Template_Guide:Introduction<br />

[11] "RDF Primer" (http://www.w3.org/TR/rdf-primer/). W3C. Retrieved 2009-03-13.

[12] http://dublincore.org/documents/library-application-profile/index.shtml#Alternative<br />

[13] http://sig.ma/<br />

[14] Haystack (http://groups.csail.mit.edu/haystack/home.html)<br />

[15] The IDEAS Group Website (http://www.ideasgroup.org)<br />

[16] Connected Services Framework (http://www.microsoft.com/serviceproviders/solutions/connectedservicesframework.mspx)<br />

[17] RDF on MusicBrainz Wiki (http://wiki.musicbrainz.org/RDF)<br />

[18] SIOC (Semantically-Interlinked Online Communities) (http://sioc-project.org/)<br />

[19] Oliver Ian, Honkola Jukka, Ziegler Jurgen (2008). “Dynamic, Localized Space Based Semantic Webs”. IADIS WWW/Internet 2008.<br />

Proceedings, p.426, IADIS Press, ISBN 978-972-8924-68-3<br />

[20] SchemaWeb (http://www.schemaweb.info)<br />

[21] An RDF Approach for Discovering the Relevant Semantic Associations in a Social Network By Thushar A.K, and P. Santhi Thilagam<br />

[22] Traffic Information Retrieval Based on Fuzzy Ontology and RDF on the Semantic Web By Jun Zhai, Yi Yu, Yiduo Liang, and Jiatao Jiang<br />

(2008)<br />

[23] http://www.w3.org/RDF/<br />

[24] http://www.w3.org/TR/2004/REC-rdf-mt-20040210/<br />

[25] http://rdfabout.com/quickintro.xpd<br />

[26] http://rdfabout.com/intro/<br />

[27] http://www.xulplanet.com/tutorials/mozsdk/rdfstart.php<br />

[28] http://www.xml.com/pub/a/2001/01/24/rdf.html<br />

[29] http://www-128.ibm.com/developerworks/library/w-rdf/<br />

[30] http://www.xul.fr/en-xml-rdf.html<br />

[31] http://planetrdf.com/guide/<br />

[32] http://xulplanet.com/tutorials/mozsdk/rdfsources.php<br />

[33] http://www.fadyart.com/ontology.html<br />

[34] http://librdf.org/raptor/<br />

[35] http://esw.w3.org/topic/SemanticWebTools<br />

[36] http://projects.semwebcentral.org/<br />

[37] http://www.intellidimension.com/<br />

[38] http://www.xml.com/pub/rg/RDF_Software<br />

[39] http://rhodonite.angelite.nl<br />

[40] http://sites.wiwiss.fu-berlin.de/suhl/bizer/d2r-server/<br />

[41] http://rowlex.nc3a.nato.int<br />

[42] http://www.alchemyapi.com/api/entity/ldata.html<br />

[43] http://www.mkbergman.com/new-version-sweet-tools-sem-web/<br />

[44] http://labs.systemone.at/wikipedia3<br />

[45] http://esw.w3.org/topic/SweoIG/TaskForces/CommunityProjects/LinkingOpenData<br />

[46] http://www.semantic-systems-biology.org


Resources of a Resource<br />

Resources of a Resource (ROR) is an XML format for describing the content of an internet resource or website in a generic fashion so this content can be better understood by search engines, spiders, web applications, etc. The ROR format provides several pre-defined terms for describing objects like sitemaps, products, events, reviews, jobs, classifieds, etc. The format can be extended with custom terms.

RORweb.com [1] is the official website of ROR; the ROR format was created by AddMe.com [2] as a way to help<br />

search engines better understand content and meaning. Similar concepts, like Google Sitemaps and Google Base,<br />

have also been developed since the introduction of the ROR format.<br />

ROR objects are placed in a ROR feed called ror.xml. This file is typically located in the root directory of the resource or website it describes. When a search engine like Google or Yahoo searches the web to determine how to categorize content, the ROR feed allows the search engine's "spider" to quickly identify all the content and attributes of the website.

This has three main benefits:<br />

1. It allows the spider to correctly categorize the content of the website into its engine.<br />

2. It allows the spider to extract very detailed information about the objects on a website (sitemaps, products,<br />

events, reviews, jobs, classifieds, etc)<br />

3. It allows the website owner to optimize the site for inclusion of its content in search engines.

External links<br />

• RORweb.com [1]<br />

References<br />

[1] http://www.rorweb.com<br />

[2] http://www.AddMe.com


Reverse Ajax<br />

Reverse Ajax refers to an Ajax design pattern that uses long-lived HTTP connections to enable low-latency communication between a web server and a browser. Basically, it is a way of sending data from client to server and a mechanism for pushing server data back to the browser. [1] [2]

This server–client communication takes one of two forms:<br />

• Client polling: the client repeatedly queries (polls) the server and waits for an answer.<br />

• Server pushing: a connection between a server and client is kept open and the server sends data when available.<br />

Reverse Ajax describes the implementation of either of these models, or a combination of both. The design pattern is<br />

also known as Ajax Push, Full Duplex Ajax and Streaming Ajax.<br />

Examples<br />

The following is a simple example. Imagine we have 2 clients and 1 server, and client1 wants to send the message<br />

"hello" to every other client.<br />

With traditional Ajax (polling):<br />

• client1 sends the message "hello"<br />

• server receives the message "hello"<br />

• client2 polls the server<br />

• client2 receives the message "hello"<br />

• client1 polls the server<br />

• client1 receives the message "hello"<br />

With reverse Ajax (pushing):<br />

• client1 sends the message "hello"<br />

• server receives the message "hello"<br />

• server sends the message "hello" to all clients<br />

Less traffic is generated with Reverse Ajax and messages are transferred with less delay (low-latency).<br />
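The difference between the two message flows can be imitated with a small in-memory simulation. This Python sketch involves no HTTP at all; the `Server` class and the two `run_*` functions are hypothetical, the number of idle polling rounds is an arbitrary assumption, and the cost of opening the long-lived connections in the push case is not counted.

```python
class Server:
    """Keeps every message and pushes it to clients that hold an open connection."""

    def __init__(self):
        self.messages = []
        self.subscribers = []  # delivery callbacks, one per connected client

    def receive(self, message):
        self.messages.append(message)
        for deliver in self.subscribers:  # push model: forward immediately
            deliver(message)

    def poll(self, since):
        """Polling model: return the messages a client has not seen yet."""
        return self.messages[since:]


def run_polling(idle_rounds=3):
    server = Server()
    inbox = {"client1": [], "client2": []}
    requests = 0

    def poll_round():
        nonlocal requests
        for received in inbox.values():
            requests += 1  # every poll costs a request, even an empty one
            received.extend(server.poll(len(received)))

    for _ in range(idle_rounds):
        poll_round()
    server.receive("hello")  # client1 sends the message
    requests += 1
    poll_round()
    return inbox, requests


def run_push():
    server = Server()
    inbox = {"client1": [], "client2": []}
    for received in inbox.values():
        server.subscribers.append(received.append)
    server.receive("hello")  # client1 sends the message
    transfers = 1 + len(inbox)  # one send plus one push per client
    return inbox, transfers


polled, requests = run_polling()
pushed, transfers = run_push()
print(requests, transfers)
```

Both runs deliver "hello" to every client; how much traffic polling adds depends entirely on how often clients poll while nothing is happening.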

External links<br />

• The Slow Load Technique/Reverse AJAX - Simulating Server Push in a Standard Web Browser [3]<br />

• Exploring Reverse Ajax [4]<br />

• Reverse Ajax with DWR (a Java Ajax framework) [5]

• Changing the Web Paradigm - Moving from traditional Web applications to Streaming-AJAX [6]<br />

References<br />

[1] Crane, Dave; McCarthy, Phil (July 2008) (in English). Comet and Reverse Ajax: The Next Generation Ajax 2.0. Apress. ISBN 1590599985.<br />

[2] Martin, Katherine (2007-03-22). "Developing Applications using Reverse Ajax" (http://today.java.net/pub/a/today/2007/03/22/<br />

developing-applications-using-reverse-ajax.html). java.net, O'Reilly and CollabNet. .<br />

[3] http://www.obviously.com/tech_tips/slow_load_technique<br />

[4] http://gmapsdotnetcontrol.blogspot.com/2006/08/exploring-reverse-ajax-ajax.html<br />

[5] http://ajaxian.com/archives/reverse-ajax-with-dwr<br />

[6] http://www.lightstreamer.com/Lightstreamer_Paradigm.pdf


Root element<br />

Each XML document has exactly one root element. This element is also known as the document element. It encloses all the other elements and is therefore an ancestor of every other element.

The World Wide Web Consortium defines not only the specifications for XML itself [1], but also the DOM, which is a platform- and language-independent standard object model for representing XML documents. DOM Level 1 defines, for every XML document, an object representation of the document itself and an attribute or property on the document called documentElement. This property provides access to an object of type element which directly represents the root element of the document [2].

<br />

content<br />

<br />

<br />

There can be other XML nodes outside of the root element [3]. In particular, the root element may be preceded by a prolog, which itself may consist of an XML declaration, optional comments, processing instructions and whitespace, followed by an optional DOCTYPE declaration and more optional comments, processing instructions and whitespace. After the document element there may be further optional comments, processing instructions and whitespace within the document [4].

Within the document element, apart from any number of attributes and other elements, there may also be more optional text, comments, processing instructions and whitespace.
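The DOM view described above can be inspected with Python's standard `xml.dom.minidom` module. In this sketch the sample document is made up for illustration: the `attribute`, `childElement` and `example-pi` names are hypothetical, and only the `rootElement` name follows the article's own examples.

```python
from xml.dom import minidom

DOCUMENT = """<?xml version="1.0"?>
<!-- a comment in the prolog -->
<?example-pi some data?>
<rootElement attribute="value">
  <childElement/>
  text
</rootElement>
<!-- a comment after the root element -->
"""

doc = minidom.parseString(DOCUMENT)

# documentElement gives direct access to the single root element.
root = doc.documentElement
print(root.tagName, root.getAttribute("attribute"))

# Comments and processing instructions may sit beside the root element,
# but only one top-level node is an element.
top_level_elements = [n for n in doc.childNodes if n.nodeType == n.ELEMENT_NODE]
top_level_comments = [n for n in doc.childNodes if n.nodeType == n.COMMENT_NODE]
print(len(top_level_elements), len(top_level_comments))
```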

A more expanded example of an XML document follows, demonstrating some of these extra nodes along with a single rootElement element.

<br />

<br />

<br />

<br />

<br />

<br />

text<br />

<br />


References<br />

[1] The current W3C XML 1.0 specification (http://www.w3.org/TR/xml/)

[2] The 'documentElement' definition in the W3C DOM Level 1 specification (http://www.w3.org/TR/1998/REC-DOM-Level-1-19981001/level-one-core.html#i-Document)

[3] The 'well-formed document' section of the W3C XML specification (http://www.w3.org/TR/2006/REC-xml-20060816/#sec-well-formed)

[4] The 'prolog' section of the W3C XML specification (http://www.w3.org/TR/2006/REC-xml-20060816/#NT-prolog)

Schematron<br />

In markup languages, Schematron is a rule-based validation language for making assertions about the presence or absence of patterns in XML trees. It is a structural schema language expressed in XML using a small number of elements and XPath.

In a typical implementation, the Schematron schema XML is processed into normal XSLT code for deployment anywhere that XSLT can be used.

Schematron is capable of expressing constraints in ways that XDR and DTD cannot. For example, it can require that the content of an element be controlled by one of its siblings. It can also request or require that the root element, whatever element that is, have specific attributes. Schematron can also specify required relationships between multiple XML files.

Constraints and content rules may be associated with "plain-English" validation error messages. This may be<br />

preferred by some users who might otherwise have to cross-reference numeric error codes to understand what they<br />

mean.<br />

Uses<br />

Schematron's design of expressing constraints through an XPath-based language that can be deployed as XSLT code makes it practical for applications such as the following:

Adjunct to Structural Validation

By testing for co-occurrence constraints, non-regular constraints, and inter-document constraints, Schematron can extend the validations that can be expressed in languages such as DTDs, RELAX NG or XML Schema.

Lightweight Business Rules Engine

Schematron is not a comprehensive Rete rules engine, but it can be used to express rules about complex structures within an XML document.

XML Editor Syntax Highlighting Rules

XML editors use Schematron rules to conditionally highlight XML files for errors.


Versions<br />

Schematron was invented by Rick Jelliffe at Academia Sinica Computing Centre, Taiwan. He described Schematron<br />

as "a feather duster to reach the parts other schema languages cannot reach".<br />

The most common versions of Schematron are:<br />

• Schematron 1.0 (1999)<br />

• Schematron 1.3 (2000): this version used the namespace http://xml.ascc.net/schematron/. It was supported by an XSLT implementation with a plug-in architecture.

• Schematron 1.5 [1] (2001): this version was widely implemented and can still be found.

• Schematron 1.6 [2] (2002): this version was the base of ISO Schematron and was obsoleted by it.

• ISO Schematron [16] (2006): this version regularizes several features and provides an XML output format, SVRL. It uses the new namespace http://purl.oclc.org/dsdl/schematron.

• ISO Schematron (2010): this proposed version adds support for XSLT2 and arbitrary properties.

Schematron as an ISO Standard<br />

Schematron has been standardized as part of ISO/IEC 19757 - Document Schema Definition Languages (DSDL) - Part 3: Rule-based validation - Schematron.

This standard is available free on the ISO Publicly Available Specifications [16] list. Paper versions may be<br />

purchased from ISO or national standards bodies.<br />

Schemas that use ISO/IEC FDIS 19757-3 should use the following namespace:<br />

http://purl.oclc.org/dsdl/schematron<br />

Sample Rule<br />

Schematron rules can be created using a standard XML editor or an XForms application. The following is a sample schema:

<br />

<br />

Date rules<br />

<br />

ContractDate should be in the pa<br />

are not allowed.<br />

<br />

<br />

<br />

This rule checks that the ContractDate XML element has a date that is before the current date. If the rule fails, validation fails and the error message, which is the body of the assert element, is returned to the user.
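The logic of such a rule can be mirrored in a few lines of Python. This is not Schematron and does not use a Schematron processor; it is only a sketch of the same check, assuming ISO 8601 dates and a hypothetical `Contract` wrapper element. The `check_contract_dates` function and its message text are invented for the example.

```python
import datetime
import xml.etree.ElementTree as ET


def check_contract_dates(xml_text, today):
    """Return one message per ContractDate element that is not before `today`."""
    failures = []
    for element in ET.fromstring(xml_text).iter("ContractDate"):
        # Assumes dates are written as YYYY-MM-DD.
        contract_date = datetime.date.fromisoformat(element.text.strip())
        if not contract_date < today:
            failures.append(f"ContractDate {contract_date} should be in the past.")
    return failures


today = datetime.date(2010, 6, 17)  # fixed date so the result is repeatable
past = "<Contract><ContractDate>2009-01-31</ContractDate></Contract>"
future = "<Contract><ContractDate>2011-01-31</ContractDate></Contract>"
print(check_contract_dates(past, today))
print(check_contract_dates(future, today))
```

As in Schematron, the test is a condition that must hold, and the message is produced only when it does not.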


Implementation<br />

Schematron source files are usually transformed into XSLT files (using XSLT) and placed in an XML Pipeline. This allows workflow process designers to build and maintain rules using standard XML manipulation tools.

For example, an Apache Ant task can be used to convert Schematron rules into XSLT files.

See also<br />

• XML Schema Language Comparison - Comparison to other XML Schema languages.

• Service Modeling Language - Service Modeling Language uses Schematron.

External links<br />

• ISO Schematron Home Page [3]<br />

• Academia Sinica Computing Centre's Schematron Home Page [4]<br />

• Schematron Wiki including Implementer's FAQ [5]<br />

References<br />

[1] http://xml.ascc.net/schematron/<br />

[2] http://xml.ascc.net/resource/schematron/Schematron2000.html<br />

[3] http://www.schematron.com<br />

[4] http://www.ascc.net/xml/resource/schematron/<br />

[5] http://www.eccnet.com/schematron/index.php/Main_Page<br />

Simple Outline XML

Simple Outline XML (SOX) is a compressed way of writing XML.

SOX uses indenting to represent the structure of an XML document, eliminating the need for closing tags.

Example<br />

The following XHTML markup fragment:

<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>Sample page</title>
  </head>
  <body>
    <p>A very brief page</p>
  </body>
</html>

... would appear in SOX as:<br />

html>
  xmlns=http://www.w3.org/1999/xhtml
  head>
    title> Sample page
  body>
    p> A very brief page


SOX can be readily converted to XML.
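A conversion along these lines can be sketched in Python for the small subset of syntax visible in the example: `name>` opens an element, `name> text` gives it inline text, `name=value` sets an attribute on the enclosing element, and indentation decides nesting. This subset is an assumption drawn from the example alone; the full SOX syntax is not covered, and `sox_to_xml` and `render` are hypothetical helpers.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr


def render(node, depth=0):
    """Write one parsed node (and its children) as indented XML."""
    pad = "  " * depth
    attrs = "".join(f" {k}={quoteattr(v)}" for k, v in node["attrs"])
    open_tag = f"{pad}<{node['name']}{attrs}>"
    close_tag = f"</{node['name']}>"
    if not node["children"]:
        return f"{open_tag}{escape(node['text'])}{close_tag}"
    # Simplification: inline text is only written for elements without children.
    inner = "\n".join(render(child, depth + 1) for child in node["children"])
    return f"{open_tag}\n{inner}\n{pad}{close_tag}"


def sox_to_xml(sox):
    """Convert a small SOX-like outline to XML text."""
    root = None
    stack = []  # (indent, node) pairs for the elements that are still open
    for raw in sox.splitlines():
        if not raw.strip():
            continue
        indent = len(raw) - len(raw.lstrip(" "))
        line = raw.strip()
        # Leaving an indentation level closes the elements opened at or below it.
        while stack and stack[-1][0] >= indent:
            stack.pop()
        head = line.split(None, 1)[0]
        if head.endswith(">"):  # element, optionally followed by inline text
            name, _, text = line.partition(">")
            node = {"name": name, "attrs": [], "text": text.strip(), "children": []}
            if stack:
                stack[-1][1]["children"].append(node)
            else:
                root = node
            stack.append((indent, node))
        else:  # name=value attribute of the enclosing element
            name, _, value = line.partition("=")
            stack[-1][1]["attrs"].append((name, value))
    return render(root)


SAMPLE = """\
html>
  xmlns=http://www.w3.org/1999/xhtml
  head>
    title> Sample page
  body>
    p> A very brief page
"""

xml_text = sox_to_xml(SAMPLE)
print(xml_text)
parsed = ET.fromstring(xml_text)  # confirms the output is well-formed XML
```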

See also<br />

• Haml is a meta-XHTML representation that integrates with Ruby on Rails and has a similar mark-up structure.<br />

Sources<br />

• http://www.langdale.com.au/SOX/<br />

• http://www.ibm.com/developerworks/xml/library/x-syntax.html<br />

Simple XML

Simple XML is a variation of XML containing only elements. All attributes are converted into elements. Not having attributes or other XML constructs such as the XML declaration or DTDs allows the use of simple and fast parsers. This format is also compatible with mainstream XML parsers.

Structure<br />

For example:<br />

gardening Watering 6:00<br />

7:00 cooking <br />

12:00 <br />

would represent:<br />

<br />

<br />

Validation<br />

Simple XML uses a simple XPath list for validation. The XML snippet above, for example, would be represented by:

/Agenda/type|(Activity/type|(*/time))<br />

or a bit more human readable as:<br />

/Agenda/type /Agenda/Activity/type /Agenda/Activity/*/time<br />

This allows the XML to be processed as a stream (without creating an object model in memory) with fast validation.
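One way to picture validation against such a path list is the following Python sketch, which walks a document event by event and checks the path of every leaf element against the three paths listed above. Treating the list as the set of allowed leaf paths is an interpretation made for this example, not something the format's description confirms; the element names `rose`, `tulip`, `potato` and `place` in the sample data are hypothetical.

```python
import io
import xml.etree.ElementTree as ET

ALLOWED = ["/Agenda/type", "/Agenda/Activity/type", "/Agenda/Activity/*/time"]


def path_allowed(path, allowed=ALLOWED):
    """Compare a concrete path with each pattern; '*' matches any one step."""
    parts = path.strip("/").split("/")
    for pattern in allowed:
        wanted = pattern.strip("/").split("/")
        if len(wanted) == len(parts) and all(w in ("*", p) for w, p in zip(wanted, parts)):
            return True
    return False


def validate_stream(xml_text):
    """Return the paths of leaf elements that are not on the allowed list."""
    stack, has_child, errors = [], [], []
    source = io.BytesIO(xml_text.encode("utf-8"))
    for event, element in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            if has_child:
                has_child[-1] = True  # the enclosing element is not a leaf
            stack.append(element.tag)
            has_child.append(False)
        else:
            path = "/" + "/".join(stack)
            if not has_child.pop() and not path_allowed(path):
                errors.append(path)
            stack.pop()
            element.clear()  # drop finished content instead of keeping a full tree
    return errors


GOOD = (
    "<Agenda><type>gardening</type>"
    "<Activity><type>Watering</type>"
    "<rose><time>6:00</time></rose><tulip><time>7:00</time></tulip></Activity>"
    "<Activity><type>cooking</type><potato><time>12:00</time></potato></Activity>"
    "</Agenda>"
)
BAD = "<Agenda><type>x</type><Activity><place>garden</place></Activity></Agenda>"

print(validate_stream(GOOD))
print(validate_stream(BAD))
```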

References<br />

1. http://www.w3.org/XML/simple-XML.html


Streaming XML

Streaming XML means dynamic data which is in an XML format.

Another popular use of the term refers to one method of consuming XML data, largely known as the Simple API for XML (SAX). This is via asynchronous events that are generated as the XML data is parsed. In this context, the consumer streams through the XML data one item at a time. This usage has nothing to do with whether the underlying data is being updated by dynamic or static means.
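The event-driven style of consumption can be seen with Python's standard `xml.sax` module. In this sketch the handler simply records each event as it arrives; the `EventRecorder` class and the `feed`/`item` element names are hypothetical.

```python
import xml.sax


class EventRecorder(xml.sax.ContentHandler):
    """Receives parse events one at a time; no document tree is built."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))

    def characters(self, content):
        # A parser may deliver text in several pieces; short text usually arrives whole.
        if content.strip():
            self.events.append(("text", content.strip()))

    def endElement(self, name):
        self.events.append(("end", name))


handler = EventRecorder()
xml.sax.parseString(b"<feed><item>one</item><item>two</item></feed>", handler)
for event in handler.events:
    print(event)
```

The consumer sees each start tag, text run and end tag in document order and can act on it without waiting for the rest of the data.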

Uses<br />

• Extensible Messaging and Presence Protocol (XMPP). This is the protocol used for example in Google Talk.<br />

Styled Layer Descriptor<br />

A Styled Layer Descriptor (SLD) is an <strong>XML</strong> schema specified by the Open Geospatial Consortium (OGC) for<br />

describing the appearance of map layers. It is capable of describing the rendering of vector and raster data. A typical<br />

use of SLDs is to instruct a Web Map Service (WMS) of how to render a specific layer.<br />

In August 2007 the SLD specification has been split up into two new OGC specifications [1] :<br />

• Symbology Encoding Implementation Specification (SE)<br />

• Styled Layer Descriptor<br />

The Styled Layer Descriptor specification now contains only the protocol for communicating with a WMS about how to style a layer. The styling itself is described exclusively in the Symbology Encoding Implementation Specification.<br />

Open source SLD supporting software<br />

Desktop software<br />

• JUMP GIS<br />

• UDig<br />

Server-side software<br />

• GeoServer<br />

• Mapserver<br />

See also<br />

• UDig<br />

• GeoServer



External links<br />

• AtlasStyler SLD Editor [2] is a free-software (LGPL) SLD Editor developed with GeoTools+Java+Swing.<br />


• OpenGIS Styled Layer Descriptor Implementation Specification [3]<br />

• OpenGIS Symbology Encoding Implementation Specification [4]<br />

References<br />

[1] OGC press release about Symbology Encoding and SLD (http://www.opengeospatial.org/press/?page=pressrelease&year=0&prid=306)<br />

[2] http://wald.intevation.org/projects/atlas-framework<br />

[3] http://www.opengeospatial.org/standards/sld<br />

[4] http://www.opengeospatial.org/standards/symbol<br />

Topic (<strong>XML</strong>)<br />

In <strong>XML</strong> terminology, topic can mean<br />

1. A resource that acts as a proxy for some subject; the topic map system's representation of that subject. The<br />

relationship between a topic and its subject is defined to be one of reification. Reification of a subject allows topic<br />

characteristics to be assigned to the topic that reifies it.<br />

2. A short document which is written in such a way that it completely answers a single question. For example, an<br />

online help system typically consists of hundreds of topics, each describing a single procedure or concept. See<br />

topic-based authoring.<br />

3. A topic element, used in many <strong>XML</strong> formats.<br />

See also<br />

• Topic Maps<br />

External links<br />

• Specification in <strong>XML</strong> Topic Maps (XTM) 1.0 (topicmaps.org) [1]<br />

• FAQ: The Topic Architecture of DITA [2]<br />

References<br />

[1] http://www.topicmaps.org/xtm/index.html<br />

[2] http://dita.xml.org/node/1230



Unique Particle Attribution<br />

The Unique Particle Attribution (UPA) rule is <strong>XML</strong> Schema's mechanism to prevent schema ambiguity.<br />

Due to the UPA rule the schema fragment given below is prohibited.<br />

<br />

<br />

<br />

<br />

Given the instance fragment:<br />

<x>42</x><br />

It is not possible to create a Post-Schema-Validation Infoset, because it is ambiguous whether the x element should be associated with the element declaration x or with the wildcard (xsd:any).<br />

The W3C schema workgroup is considering weak wildcards for schema version 1.1. With weak wildcards, the explicit element declaration would always take precedence (the x element would be associated with the element declaration), thus removing the ambiguity.<br />
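The rule can be observed with the schema compiler that ships with the JDK (`javax.xml.validation.SchemaFactory`). The two schemas below are hypothetical stand-ins, not the fragment from this article: in the first, an optional element x is followed by a wildcard that also matches x; in the second, x is mandatory, so the first child can only be attributed to the element declaration. That the JDK's default schema processor reports the violation when the schema is compiled is an assumption of this sketch.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

/** Hypothetical helper: compiles two small schemas to show the effect of the UPA rule. */
class UpaCheck {

    /** Optional x followed by a wildcard: an x child could match either particle. */
    static final String AMBIGUOUS = schema(
            "<xsd:element name='x' type='xsd:int' minOccurs='0'/>"
          + "<xsd:any namespace='##any' processContents='lax'/>");

    /** Mandatory x followed by an optional wildcard: the first child must be the declared x. */
    static final String UNAMBIGUOUS = schema(
            "<xsd:element name='x' type='xsd:int'/>"
          + "<xsd:any namespace='##any' processContents='lax' minOccurs='0'/>");

    private static String schema(String particles) {
        return "<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema'>"
             + "<xsd:complexType name='T'><xsd:sequence>"
             + particles
             + "</xsd:sequence></xsd:complexType></xsd:schema>";
    }

    /** Returns true if the schema compiles, false if the processor rejects it. */
    static boolean compiles(String xsd) {
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        try {
            factory.newSchema(new StreamSource(new StringReader(xsd)));
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("ambiguous compiles:   " + compiles(AMBIGUOUS));
        System.out.println("unambiguous compiles: " + compiles(UNAMBIGUOUS));
    }
}
```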

See also<br />

• W3C <strong>XML</strong> Schema<br />

External links<br />

• Schema Component Constraint: Unique Particle Attribution [1]<br />

• An Approach for Evolving <strong>XML</strong> Vocabularies Using <strong>XML</strong> Schema [2]<br />

• <strong>XML</strong> Schema 1.1 Part 1: Structures [3]<br />

• <strong>XML</strong> Schema 1.1 Part 2: Datatypes [4]<br />

References<br />

[1] http://www.w3.org/TR/xmlschema-1/#cos-nonambig<br />

[2] http://lists.w3.org/Archives/Public/www-tag/2004Aug/att-0010/NRMVersioningProposal.html<br />

[3] http://www.w3.org/TR/xmlschema11-1/<br />

[4] http://www.w3.org/TR/xmlschema11-2/



VTD-<strong>XML</strong><br />

Developer(s): XimpleWare<br />

Stable release: 2.8 / April 12, 2009<br />

Operating system: Portable<br />

Type: <strong>XML</strong> parser/indexer/slicer/editor library<br />

License: GPL and Proprietary License<br />

Website: vtd-xml.sourceforge.net [1], VTD-<strong>XML</strong> blog [2]<br />

Virtual Token Descriptor for eXtensible <strong>Markup</strong> <strong>Language</strong> (VTD-<strong>XML</strong>) refers to a collection of cross-platform <strong>XML</strong> processing technologies centered around a non-extractive, [3] [4] "document-centric" <strong>XML</strong> parsing technique called Virtual Token Descriptor (VTD). Depending on the perspective, VTD-<strong>XML</strong> can be viewed as one of the following:<br />

• A "Document-Centric" <strong>XML</strong> parser [5] [6] [7] [8] [9]<br />

• A native <strong>XML</strong> indexer or a file format that uses binary data to enhance the text <strong>XML</strong> [10] [11] [12]<br />

• An incremental <strong>XML</strong> content modifier<br />

• An <strong>XML</strong> slicer/splitter/assembler [13]<br />

• An <strong>XML</strong> editor/eraser [14] [15] [16]<br />

• A way to port <strong>XML</strong> processing on chip<br />

• A non-blocking, stateless XPath evaluator [17]<br />

VTD-<strong>XML</strong> is developed by XimpleWare and dual-licensed under the GPL and a proprietary license. It was originally written in Java, and is now also available in C [18] and C#. An extended version supporting files of up to 256 GB is also available.<br />

Basic Concept<br />

Non-Extractive, Document-Centric Parsing<br />

Traditionally, a lexical analyzer represents tokens (the small units of indivisible character values) as discrete string<br />

objects. This approach is designated extractive parsing. In contrast, non-extractive tokenization mandates that one<br />

keeps the source text intact, and uses offsets and lengths to describe those tokens.<br />
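The difference between the two styles can be sketched in a few lines of Java. The class `TokenViews`, its nested `Token` type, and the simple scan for the first tag name are illustrative assumptions; they are not part of the VTD-<strong>XML</strong> API.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch: a token described by offset and length instead of a copied string. */
class TokenViews {

    /** Non-extractive token: only two integers, pointing into the untouched source bytes. */
    static final class Token {
        final int offset;
        final int length;

        Token(int offset, int length) {
            this.offset = offset;
            this.length = length;
        }

        /** Materializes the text only when a caller actually asks for it. */
        String text(byte[] source) {
            return new String(source, offset, length, StandardCharsets.UTF_8);
        }
    }

    /** Locates the name of the first start tag without copying any bytes. */
    static Token firstTagName(byte[] source) {
        int i = 0;
        while (i < source.length && source[i] != '<') {
            i++;
        }
        if (i == source.length) {
            throw new IllegalArgumentException("no start tag found");
        }
        int start = i + 1;
        int end = start;
        while (end < source.length && source[end] != '>' && source[end] != '/' && source[end] != ' ') {
            end++;
        }
        return new Token(start, end - start);
    }

    public static void main(String[] args) {
        byte[] source = "<po><item/></po>".getBytes(StandardCharsets.UTF_8);
        Token name = firstTagName(source);
        System.out.println(name.offset + "," + name.length + " -> " + name.text(source));
    }
}
```

An extractive tokenizer would return the string for the tag name directly; here the source array stays intact and the token is just a view onto it.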

Virtual Token Descriptor<br />

Virtual Token Descriptor (VTD) applies the concept of non-extractive, document-centric parsing to <strong>XML</strong><br />

processing. A VTD record uses a 64-bit integer to encode the offset, length, token type and nesting depth of a token<br />

in an <strong>XML</strong> document. Because all VTD records are 64-bit in length, they can be stored efficiently and managed as<br />

an array. [19]
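Packing the four fields into one 64-bit integer can be sketched as follows. The field widths used here (4 bits for the token type, 8 for the nesting depth, 20 for the length, 2 reserved, and 30 for the offset) are assumptions chosen for the illustration and should not be read as the library's documented layout; the class name `VtdRecordSketch` is likewise hypothetical.

```java
/** Hypothetical sketch of a 64-bit token record; the bit layout is an assumption. */
class VtdRecordSketch {

    // Assumed layout, most significant bits first:
    // [type: 4][depth: 8][length: 20][reserved: 2][offset: 30]

    static long pack(int type, int depth, int length, int offset) {
        return ((long) (type & 0xF) << 60)
             | ((long) (depth & 0xFF) << 52)
             | ((long) (length & 0xFFFFF) << 32)
             | (offset & 0x3FFFFFFFL);
    }

    static int type(long record) {
        return (int) (record >>> 60) & 0xF;
    }

    static int depth(long record) {
        return (int) (record >>> 52) & 0xFF;
    }

    static int length(long record) {
        return (int) (record >>> 32) & 0xFFFFF;
    }

    static int offset(long record) {
        return (int) (record & 0x3FFFFFFFL);
    }

    public static void main(String[] args) {
        // Because every record has the same size, a whole document fits in a plain long[].
        long[] records = { pack(0, 0, 2, 1), pack(0, 1, 4, 5) };
        for (long r : records) {
            System.out.println(type(r) + " " + depth(r) + " " + length(r) + " " + offset(r));
        }
    }
}
```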



Location Cache<br />

Location Caches (LC) build on VTD records to provide efficient random access. Organized as tables, with one table<br />

per nesting depth level, LCs contain entries modeling an <strong>XML</strong> document's element hierarchy. An LC entry is a<br />

64-bit integer encoding a pair of 32-bit values. The upper 32 bits identify the VTD record for the corresponding<br />

element. The lower 32 bits identify that element's first child in the LC at the next lower nesting level.<br />
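A location cache entry and a lookup of an element's children might look like the sketch below. Two points are assumptions made for the illustration and are not stated in the text: a first-child value of -1 marks an element without children, and the children of an element end where the next sibling's children begin. The class name `LocationCacheSketch` is hypothetical.

```java
/** Hypothetical sketch of location cache entries; conventions are assumptions. */
class LocationCacheSketch {

    /** Upper 32 bits: index of the element's VTD record. Lower 32 bits: first child in the next table. */
    static long entry(int vtdIndex, int firstChild) {
        return ((long) vtdIndex << 32) | (firstChild & 0xFFFFFFFFL);
    }

    static int vtdIndex(long entry) {
        return (int) (entry >>> 32);
    }

    static int firstChild(long entry) {
        return (int) entry; // -1 is used here to mean "no children" (assumption)
    }

    /** Returns {from, to}: the half-open range of child entries in the next-level table. */
    static int[] childRange(long[] level, long[] nextLevel, int i) {
        int from = firstChild(level[i]);
        if (from < 0) {
            return new int[] {0, 0};
        }
        int to = nextLevel.length;
        for (int j = i + 1; j < level.length; j++) {
            int next = firstChild(level[j]);
            if (next >= 0) {
                to = next; // children end where a later sibling's children start
                break;
            }
        }
        return new int[] {from, to};
    }

    public static void main(String[] args) {
        long[] level1 = { entry(1, 0), entry(7, 2) };
        long[] level2 = { entry(2, -1), entry(4, -1), entry(8, -1) };
        int[] range = childRange(level1, level2, 0);
        System.out.println(range[0] + ".." + range[1]);
    }
}
```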

Benefits<br />

Overview<br />

Virtually all the core benefits of VTD-<strong>XML</strong> are inherent to non-extractive, document-centric parsing which provides<br />

these characteristics:<br />

• The source <strong>XML</strong> text is kept intact in memory without decoding.<br />

• The internal representation of VTD-<strong>XML</strong> is inherently persistent.<br />

• Object-oriented modeling of the hierarchical representation is unnecessary, because VTD-<strong>XML</strong> relies entirely on primitive data types (e.g., 64-bit integers) to represent the <strong>XML</strong> hierarchy, thus reducing object creation cost to nearly zero [20].<br />

Combining those characteristics permits thinking of <strong>XML</strong> purely as syntax (bits, bytes, offsets, lengths, fragments,<br />

namespace-compensated fragments, and document composition) instead of the serialization/deserialization of<br />

objects. This is a powerful way to think about <strong>XML</strong>/SOA applications.<br />

Simplicity<br />

Developers' typical first impression is that, with VTD-<strong>XML</strong>, there are relatively few classes and methods to<br />

remember in order to write applications.<br />

As Parser<br />

When used in parsing mode, VTD-<strong>XML</strong> is a general purpose, extremely high performance [21] <strong>XML</strong> parser which<br />

compares favorably with others:<br />

• VTD-<strong>XML</strong> typically outperforms SAX (with NULL content handler) while still providing full random access and<br />

built-in XPath support.<br />

• VTD-<strong>XML</strong> typically consumes 1.3-1.5 times the <strong>XML</strong> document's size in memory, which is about 1/5 the<br />

memory usage of DOM.<br />

• Applications written in VTD-<strong>XML</strong> are usually much shorter and cleaner than their DOM or SAX versions.<br />

As Indexer<br />

Because of the inherent persistence of VTD-<strong>XML</strong>, developers can write the internal representation of a parsed <strong>XML</strong><br />

document to disk and later reload it to avoid repetitive parsing. To this end, XimpleWare has introduced VTD+<strong>XML</strong><br />

as a binary packaging format combining VTD, LC and the <strong>XML</strong> text. It can typically be viewed in one of the<br />

following two ways:<br />

• A native <strong>XML</strong> index that completely eliminates the parsing cost and also retains all benefits of <strong>XML</strong>. It is a file<br />

format that is human readable and backward compatible with <strong>XML</strong>.<br />

• A binary <strong>XML</strong> format that uses binary data to enhance the processing of the <strong>XML</strong> text.



<strong>XML</strong> Content Modifier<br />

Because VTD-<strong>XML</strong> keeps the <strong>XML</strong> text intact without decoding, when an application intends to modify the content<br />

of <strong>XML</strong> it only needs to modify the portions most relevant to the changes. This is in stark contrast with DOM, SAX,<br />

or StAX parsing, which incur the cost of parsing and re-serialization no matter how small the changes are.<br />

Since VTDs refer to document elements by their offsets, changes to the length of elements occurring earlier in a<br />

document require adjustments to VTDs referring to all later elements. However, those adjustments are integer<br />

additions, albeit to many integers in multiple tables, so they are quick.<br />
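The adjustment amounts to adding one delta to every offset that lies behind the edited region. The sketch below uses a plain `int[]` of offsets rather than packed records to keep the arithmetic visible; the class and method names are hypothetical.

```java
import java.util.Arrays;

/** Hypothetical sketch: shifting token offsets after an in-place edit. */
class OffsetShift {

    /**
     * The bytes [editOffset, editOffset + oldLength) were replaced by newLength bytes.
     * Every token that starts at or after the end of the old region moves by the difference.
     */
    static void shiftAfterEdit(int[] offsets, int editOffset, int oldLength, int newLength) {
        int delta = newLength - oldLength;
        int oldEnd = editOffset + oldLength;
        for (int i = 0; i < offsets.length; i++) {
            if (offsets[i] >= oldEnd) {
                offsets[i] += delta; // one integer addition per later token
            }
        }
    }

    public static void main(String[] args) {
        int[] offsets = {1, 5, 20, 30};
        shiftAfterEdit(offsets, 5, 3, 7); // a 3-byte token at offset 5 grows to 7 bytes
        System.out.println(Arrays.toString(offsets));
    }
}
```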

<strong>XML</strong> Slicer/Splitter/Assembler<br />

An application based on VTD-<strong>XML</strong> can also use offsets and lengths to address tokens, or element fragments. This<br />

allows <strong>XML</strong> documents to be manipulated like arrays of bytes.<br />

• As a slicer, VTD-<strong>XML</strong> can "slice" off a token or an element fragment from an <strong>XML</strong> document, then insert it back<br />

into another location in the same document, or into a different document.<br />

• As a splitter, VTD-<strong>XML</strong> can split sub-elements in an <strong>XML</strong> document and dump each into a separate <strong>XML</strong><br />

document.<br />

• As an assembler, VTD-<strong>XML</strong> can "cut" chunks out of multiple <strong>XML</strong> documents and assemble them into a new<br />

<strong>XML</strong> document.<br />

<strong>XML</strong> Editor/Eraser<br />

Used as an editor/eraser, VTD-<strong>XML</strong> can directly edit/erase the underlying byte content of the <strong>XML</strong> text, provided<br />

that the token is at least as long as the intended new content. An immediate benefit of this approach is that the<br />

application can immediately reuse the original VTD and LC. In contrast, when using VTD-<strong>XML</strong> to incrementally<br />

update an <strong>XML</strong> document, an application needs to reparse the updated document before the application can process<br />

it.<br />

An editor can be made smart enough to track the location of each token, permitting new, longer tokens to replace<br />

existing, shorter tokens by merely addressing the new token in separate memory outside that used to store the<br />

original document. Likewise, when reordering the document, element text does not need to be copied; only the LCs<br />

need to be updated. When a complete, contiguous <strong>XML</strong> document is needed, such as when saving it, the disparate<br />

parts can be reassembled into a new, contiguous document.<br />

Other Benefits<br />

VTD-<strong>XML</strong> also pioneers the non-blocking, stateless XPath evaluation approach.<br />

Weaknesses<br />

VTD-<strong>XML</strong> also exhibits a few noticeable shortcomings:<br />

• As an <strong>XML</strong> parser, it does not support external entities declared in the DTD.<br />

• As a file format, it increases the document size by about 30% to 50%.<br />

• As an API, it is not compatible with DOM or SAX.<br />

• It is difficult to support certain validation techniques, employed by DTD and <strong>XML</strong> Schema (e.g., default<br />

attributes and elements), that require modifications to the <strong>XML</strong> instances being parsed.



Areas of Applications<br />

General-purpose Replacement for DOM or SAX<br />

Because of VTD-<strong>XML</strong>'s performance and memory advantages, it covers a larger portion of <strong>XML</strong> use cases than<br />

either DOM or SAX [22] .<br />

• Compared to DOM, VTD-<strong>XML</strong> processes bigger (3x~5x) <strong>XML</strong> documents for the same amount of physical<br />

memory at about 3 to 10 times the performance.<br />

• Compared to SAX, VTD-<strong>XML</strong> provides random access and XPath support and outperforms SAX by at least 2x.<br />

XPath over Huge <strong>XML</strong> documents<br />

The extended edition of VTD-<strong>XML</strong>, combined with a 64-bit JVM, makes possible XPath-based <strong>XML</strong> processing over<br />

huge <strong>XML</strong> documents (up to 256 GB in size).<br />

For SOA/WS/<strong>XML</strong> Security<br />

The combination of VTD-<strong>XML</strong>'s high performance and incremental-update capability makes it essential to achieve the desired level of Quality of Service for SOA/WS/<strong>XML</strong> security applications. [23] [24] [25]<br />

For SOA/WS/<strong>XML</strong> Intermediary<br />

VTD-<strong>XML</strong> is well suited for SOA intermediary applications such as <strong>XML</strong> routers/switches/gateways, Enterprise<br />

Service Buses, and services aggregation points. All those applications perform the basic "store and forward"<br />

operations for which retaining the original <strong>XML</strong> is critical for minimizing latency. VTD-<strong>XML</strong>'s incremental update<br />

capability also contributes significantly to the forwarding performance.<br />

VTD-<strong>XML</strong>'s random-access capability lends itself well to the XPath-based <strong>XML</strong> routing/switching/filtering common in<br />

AJAX and SOA deployments.<br />

Intelligent SOA/WS/<strong>XML</strong> Load-balancing and Offloading<br />

When an <strong>XML</strong> document travels through several middle-tier SOA components, the first message stop, after finishing<br />

the inspection of the <strong>XML</strong> document, can choose to send the VTD+<strong>XML</strong> file format to the downstream components<br />

to avoid repetitive parsing, thus improving throughput.<br />

By the same token, an intelligent SOA load balancer can choose to generate VTD+<strong>XML</strong> for incoming/outgoing<br />

SOAP messages to offload <strong>XML</strong> parsing from the application servers that receive those messages.<br />

<strong>XML</strong> Persistence Data Store<br />

When viewed from the perspective of native <strong>XML</strong> persistence, VTD-<strong>XML</strong> can be used as a human-readable, easy to<br />

use, general-purpose <strong>XML</strong> index. <strong>XML</strong> documents stored this way can be loaded into memory to be queried,<br />

updated, or edited without the overhead of parsing/re-serialization.<br />

Schemaless <strong>XML</strong> Data Binding<br />

VTD-<strong>XML</strong>'s combination of high performance, low memory usage, and non-blocking XPath evaluation makes<br />

possible a new <strong>XML</strong> data binding approach based entirely on XPath. This approach's biggest benefits are that it no longer<br />

requires an <strong>XML</strong> schema, avoids needless object creation, and takes advantage of <strong>XML</strong>'s inherent loose encoding [26].<br />

It is worth noting that data binding discussed in the article mentioned above needs to be implemented by the<br />

application: VTD-<strong>XML</strong> itself only offers accessors. In this regard VTD-<strong>XML</strong> is not a data binding solution itself<br />

(unlike JiBX, JAXB, <strong>XML</strong>Beans), although it offers extraction functionality for data binding packages, much like<br />

other <strong>XML</strong> parsers (SAX, StAX).
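The idea of binding through XPath alone, with the application doing the mapping, can be illustrated with the JDK's own `javax.xml.xpath` package. This is a stand-in technique, not VTD-<strong>XML</strong>: it builds a DOM first and therefore has none of the performance properties discussed above. The `Item` class and the method `bindFirstItem` are hypothetical; the element and attribute names follow the purchase-order paths used in the code sample of this article.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

/** Hypothetical sketch: schemaless binding driven only by XPath expressions. */
class XPathBinding {

    /** Plain application object; no schema or generated classes are involved. */
    static final class Item {
        final String partNum;
        final double price;

        Item(String partNum, double price) {
            this.partNum = partNum;
            this.price = price;
        }
    }

    static Item bindFirstItem(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            // Each field is bound by its own expression; fields the application ignores cost nothing.
            String partNum = xpath.evaluate("/purchaseOrder/items/item[1]/@partNum", doc);
            Double price = (Double) xpath.evaluate(
                    "/purchaseOrder/items/item[1]/USPrice", doc, XPathConstants.NUMBER);
            return new Item(partNum, price);
        } catch (Exception e) {
            throw new IllegalArgumentException("could not bind purchase order", e);
        }
    }

    public static void main(String[] args) {
        Item item = bindFirstItem("<purchaseOrder><items><item partNum=\"872-AA\">"
                + "<USPrice>148.95</USPrice></item></items></purchaseOrder>");
        System.out.println(item.partNum + " " + item.price);
    }
}
```

Because the mapping depends only on the paths it queries, additional elements in the document do not break the binding, which is the "loose encoding" advantage mentioned above.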



Essential Classes<br />

As of Version 2.6, the Java and C# versions of VTD-<strong>XML</strong> consist of the following classes:<br />

• VTDGen (VTD Generator) is the class that encapsulates the main parsing, index loading and index writing<br />

functions.<br />

• VTDNav (VTD Navigator) is the class that (1) encapsulates <strong>XML</strong>, VTD, and hierarchical info, (2) contains<br />

various navigation methods, (3) performs various comparisons between VTD records and strings, and (4) converts<br />

VTD records to primitive data types.<br />

• AutoPilot is a class containing functions that perform node-level iteration and XPath.<br />

• <strong>XML</strong>Modifier is a class that offers incremental update capability, such as delete, insert and update.<br />

The extended VTD-<strong>XML</strong> consists of the following classes:<br />

• VTDGenHuge (Extended VTD Generator) encapsulates the main parsing.<br />

• <strong>XML</strong>Buffer performs in-memory loading of <strong>XML</strong> documents.<br />

• <strong>XML</strong>MemMappedBuffer performs memory mapped loading of <strong>XML</strong> documents.<br />

• VTDNavHuge (Extended VTD Navigator) (1) encapsulates <strong>XML</strong>, Extended VTD, and hierarchical info, (2)<br />

contains various navigation methods, (3) performs various comparisons between VTD records and strings, and (4)<br />

converts VTD records to primitive data types.<br />

• AutoPilotHuge performs node-level iteration and XPath.<br />

Code Sample<br />

/* This Java program demonstrates how to use XMLModifier to incrementally
 * update a simple XML purchase order. It uses VTDGen's parseFile to
 * simplify programming.
 */
import com.ximpleware.*;

public class Update {
    public static void main(String[] argv) throws Exception {
        // parseFile reads the file content into a byte array and parses it;
        // the second argument turns on namespace awareness.
        VTDGen vg = new VTDGen();
        if (vg.parseFile("oldpo.xml", true)) {
            VTDNav vn = vg.getNav();
            AutoPilot ap = new AutoPilot(vn);
            XMLModifier xm = new XMLModifier(vn);
            // Select every item whose part number is 872-AA.
            ap.selectXPath("/purchaseOrder/items/item[@partNum='872-AA']");
            int i = -1;
            while ((i = ap.evalXPath()) != -1) {
                // Remove the matching element and insert text in its place.
                xm.remove();
                xm.insertBeforeElement("\n");
            }
            // The listing continues with a second query, which is truncated here:
            // ap.selectXPath("/purchaseOrder/items/item/USPrice[.
            // The changes recorded by xm take effect only when the modifier's
            // output is written; that step is not part of the surviving listing.
        }
    }
}

References<br />



X-expression<br />

X-expressions are the unification of S-expressions found in the Lisp programming language with <strong>XML</strong>.<br />

X-expressions unify notions of computation with data sharing.<br />

XBRLS<br />

XBRLS (XBRL Simple Application Profile) is an application profile of XBRL.<br />

XBRLS is designed to be 100% XBRL compliant. The stated goals of XBRLS are "to maximize XBRL's benefits,<br />

reduce costs of implementation, and maximize the functionality and effectiveness of XBRL" [1] . XBRL is a general<br />

purpose specification, based on the idea that no one is likely to use 100% of the components of XBRL in building<br />

any one solution. XBRLS specifies a subset of XBRL that is designed to meet the needs of most business users in<br />

most situations, and offers it as a starting point for others. This approach creates an application profile of XBRL<br />

(equivalent to a database view but concerned with metadata, not data).<br />

XBRLS is intended to enable the non-XBRL expert to create both XBRL metadata and XBRL reports in a simple<br />

and convenient manner. At the same time, it seeks to improve the usability of XBRL, the interoperability among<br />

XBRL-based solutions, the effectiveness of XBRL extensions and to reduce software development costs.<br />

The profile was created by Rene van Egmond and Charlie Hoffman, who was the initial creator of XBRL. It borrows<br />

heavily from the US GAAP Taxonomy Architecture.<br />

XBRLS Architecture<br />

The XBRLS architecture is based on many ideas used by the US GAAP Taxonomy Architecture. The intent of the<br />

XBRLS architecture is to make it easier for business users to make use of XBRL, to make it easier for software<br />

vendors to support XBRL, and to safely use the features of XBRL. XBRLS is a subset of what is allowed by the<br />

complete XBRL Specification. Examples of these limitations placed on XBRL are the following:<br />

• Uses no tuples.<br />

• Only uses the segment element of the instance context and disallows the use of the scenario element.<br />

• Allows only XBRL dimensional information as content for the segment element in the instance context.<br />

Furthermore, it requires that every concept (member, primary item) participates in a hypercube and that all<br />

hypercubes are closed.<br />

• Allows no uses of simple or complex typed members within XBRL Dimensions.<br />

• XBRLS never uses the precision attribute, always uses the decimals attribute.<br />

• Requires that every measure exists in at least one XBRL Dimension.<br />

XBRL Components not used in XBRLS



XBRL Specification | Topic | Explanation<br />

Instance | Context: entity identifier, entity scheme | Although not required when using XBRLS, it is highly encouraged that the entity scheme and identifier be “held static” or synchronized with an explicit member, and that XBRL Dimensions be used instead to articulate entity information, perhaps with an XBRLS “Entity [Axis]” dimension. The “entity identifier” and “entity scheme” portions of a context should not be used. Rather, they are static, constant values (i.e., dummy values in order to pass XBRL validation). The information relating to the entity identifier and entity scheme is moved to an XBRLS-specific taxonomy that makes use of XBRL Dimensions to communicate this information.<br />

Instance | Context: period | Although not required when using XBRLS, it is highly encouraged that the period context be “held static” or synchronized with an explicit member, and that XBRL Dimensions be used instead to articulate this information, perhaps with an XBRLS “Period [Axis]” dimension. Uses XBRL Dimensions to articulate this XBRL quasi dimension.<br />

Instance (sections 4.7.4 and 4.7.3.2) | Context: segments, scenarios | Only uses XBRL Dimensions to articulate the content of segments and scenarios, excluding the use of the <strong>XML</strong> Schema-based contextual information allowed by those sections. Furthermore, mixing <strong>XML</strong> Schema-based contextual information and XBRL Dimensions is technically dangerous.<br />

Instance | Fact Value: precision | Uses only the decimals attribute; precision must not be used.<br />

Taxonomy | Elements: tuples | Tuples are not allowed.<br />

Taxonomies | Weight | The weight attribute value of calculations must be either “1” or “-1”; no decimal value between the two is allowed.<br />

Taxonomies | Annotation, Documentation | Each schema and each linkbase must provide documentation that describes the contents of the file that is readable by a computer application.<br />

Dimensions | Open Hypercubes | Open hypercubes are not allowed; only closed hypercubes are allowed.<br />

Dimensions | notAll | Only “all” has-hypercube arcroles are allowed; “notAll” is not allowed.<br />

Dimensions | Typed Members | Typed members (simple or complex) are not allowed.<br />

External links<br />

• XBRL Business Information Exchange [2]<br />

• XBRLS: how a simpler XBRL can make a better XBRL [3]<br />

• Comprehensive Example [4]<br />

• XBRLS - XBRL Made Easy [5]<br />

• Data Interactive: An Interview with Charlie Hoffman [6]<br />

References<br />

[1] XBRL Business Information Exchange (http://xbrl.squarespace.com/xbrls/) website<br />

[2] http://xbrl.squarespace.com/xbrls/<br />

[3] http://xbrl.squarespace.com/storage/xbrls/XBRLS-How-simpler-can-be-better-2008-03-11.pdf<br />

[4] http://xbrl.squarespace.com//storage/xbrls/XBRLS-ComprehensiveExample-2008-04-18.zip<br />

[5] http://www.ubmatrix.com/company/innovation.htm<br />

[6] http://hitachidatainteractive.com/2008/04/23/an-interview-with-charlie-hoffman



Xdos<br />

XDoS is an acronym for <strong>XML</strong> denial-of-service.<br />

An XDoS attack is a content-borne attack whose purpose is to shut down a web service or the system running that service. A common XDoS attack sends an <strong>XML</strong> message carrying a multitude of digital signatures; a naive parser examines each signature, consuming all available CPU cycles and exhausting system resources. Such attacks are less common than inadvertent XDoS attacks, which occur when a programming error by a trusted customer causes a handshake to go into an infinite loop.<br />
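One conceivable precaution against the signature-flooding case is to count signature elements in a cheap streaming pass and reject the message before any expensive verification starts. The sketch below is an assumption-laden illustration, not a described defense from this article: the class `SignatureGuard`, the limit parameter, and the choice to match on the local name `Signature` (the element name used by XML Signature) are all choices made here.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

/** Hypothetical sketch: refuse messages that carry more signatures than a configured limit. */
class SignatureGuard {

    /** Returns false as soon as the number of Signature elements exceeds maxSignatures. */
    static boolean withinLimit(String xml, int maxSignatures) {
        int count = 0;
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "Signature".equals(reader.getLocalName())) {
                    count++;
                    if (count > maxSignatures) {
                        return false; // stop reading; no signature has been verified yet
                    }
                }
            }
            return true;
        } catch (XMLStreamException e) {
            return false; // malformed messages are rejected as well
        }
    }

    public static void main(String[] args) {
        String message = "<msg><Signature/><Signature/><Signature/></msg>";
        System.out.println(withinLimit(message, 2));
    }
}
```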

XDR Schema<br />

<strong>XML</strong>-Data Reduced (XDR) is a schema language for <strong>XML</strong>, used in the W3C <strong>XML</strong>-Data Note and the Document Content Description (DCD) initiative for <strong>XML</strong>.<br />

MS<strong>XML</strong> provided XDR schema support from version 2.0 up to, but not including, version 6.0 [1].<br />

See also<br />

• <strong>XML</strong> Schema <strong>Language</strong> Comparison - Comparison of other <strong>XML</strong> Schema languages (not XDR).<br />

• List of <strong>XML</strong> Schemas - list of <strong>XML</strong> schemas in use on the Internet sorted by purpose<br />

External links<br />

• XDR Schema Data Types Reference [2]<br />

References<br />

[1] Version and Conformance (http://msdn2.microsoft.com/en-us/library/ms757825(VS.85).aspx)<br />

[2] http://msdn2.microsoft.com/en-us/library/ms256049.aspx



XEE (Starlight)<br />

XEE (<strong>XML</strong> Engineering Environment) is a visual language for data processing and ETL tasks. It is designed for the<br />

Starlight Information Visualization System as a method for producing and processing <strong>XML</strong> data.



XEP<br />

Developer(s): RenderX<br />

Stable release: 4.18 / March 2010<br />

Written in: Java<br />

Operating system: Microsoft Windows, Linux, FreeBSD<br />

Type: Layout engine<br />

Website: [1]<br />

XEP is a commercial XSL-FO layout engine written in Java. XEP is proprietary software by RenderX.<br />

History<br />

Started in 1999 as a working prototype written in Perl and soon completely rewritten in Java, XEP has evolved into a complete engine. XEP runs on any platform where a Java runtime is available, including Windows, Linux, FreeBSD and other server platforms.<br />

Features<br />

XEP accepts XSL-FO as input, as well as <strong>XML</strong>+XSLT. Its output formats are PDF, PostScript, AFP, PPML, XPS, HTML, SVG, and an internal <strong>XML</strong>-based format called XEPOUT.<br />

XEP conforms to the XSL-FO Recommendation v1.0, offers a wide range of extensions, and supports a good subset of XSL 1.1 features. [2]<br />

Available font types, depending on the output format generator, are Type 1, TrueType and OpenType, with support for embedding and subsetting.<br />

Accepted image formats include most flavors of raster graphics, as well as SVG, EPS and PDF.<br />

API<br />

For integration, XEP provides a Java API and examples covering a number of approaches such as SAX, JAXP and DOM. XEP has a flexible configuration, which allows running it concurrently in threads on huge input documents, but also in a small heap in diskless environments such as appservers.<br />

Satellite software<br />

For Windows users there exists a .NET wrapper called XEPWin, and an accompanying .NET development kit with<br />

API in C#, VB and ASP.NET.<br />

Satellite software includes EnMasse - a multiplexer of a grid of XEP engines, with simple networked API and<br />

examples in C, Java, Perl and Python.



External links<br />

• XEP on RenderX site [1]<br />

• Official W3C XSL recommendation formatted by XEP [3]<br />

• How to use XEP with Stylus Studio [4]<br />

References<br />

[1] http://www.renderx.com/tools/xep.html<br />

[2] http://xml.coverpages.org/ni2001-11-08-b.html<br />

[3] http://www.w3.org/TR/2006/REC-xsl11-20061205/xsl11.pdf<br />

[4] http://www.stylusstudio.com/renderx/xep.html<br />

<strong>XML</strong><br />

Filename extension: .xml<br />

Internet media type: application/xml, text/xml (deprecated) [1] [2]<br />

Uniform Type Identifier: public.xml<br />

Developed by: World Wide Web Consortium<br />

Type of format: <strong>Markup</strong> language<br />

Extended from: SGML<br />

Extended to: Numerous, including XHTML, RSS, Atom<br />

Standard(s): 1.0 (Fifth Edition) [3] November 26, 2008; 1.1 (Second Edition) [4] August 16, 2006<br />

Open format? Yes<br />

Current Status: Published<br />

Year Started: 1996<br />

Editors: Tim Bray, Jean Paoli, C. M. Sperberg-McQueen, Eve Maler, François Yergeau, John Cowan<br />

Related Standards: <strong>XML</strong> Schema<br />

Domain: Data Serialization<br />

Abbreviation: <strong>XML</strong><br />

Website: <strong>XML</strong> 1.0 [5]<br />

<strong>XML</strong> (Extensible <strong>Markup</strong> <strong>Language</strong>) is a set of rules for encoding documents in machine-readable form. It is<br />

defined in the <strong>XML</strong> 1.0 Specification [6] produced by the W3C, and several other related specifications, all gratis<br />

open standards. [7]<br />

<strong>XML</strong>'s design goals emphasize simplicity, generality, and usability over the Internet. [8] It is a textual data format,<br />

with strong support via Unicode for the languages of the world. Although <strong>XML</strong>'s design focuses on documents, it is<br />

widely used for the representation of arbitrary data structures, for example in web services.<br />

There are many programming interfaces that software developers may use to access <strong>XML</strong> data, and several schema<br />

systems designed to aid in the definition of <strong>XML</strong>-based languages.<br />

As of 2009, hundreds of <strong>XML</strong>-based languages have been developed, [9] including RSS, Atom, SOAP, and XHTML.<br />

<strong>XML</strong>-based formats have become the default for most office-productivity tools, including Microsoft Office (Office<br />

Open <strong>XML</strong>), OpenOffice.org (OpenDocument), and Apple's iWork. [10]<br />

Key terminology<br />

The material in this section is based on the <strong>XML</strong> Specification. This is not an exhaustive list of all the constructs<br />

which appear in <strong>XML</strong>; it provides an introduction to the key constructs most often encountered in day-to-day use.<br />

(Unicode) Character<br />

By definition, an <strong>XML</strong> document is a string of characters. Almost every legal Unicode character may appear<br />

in an <strong>XML</strong> document.<br />

Processor and Application<br />

The processor analyzes the markup and passes structured information to an application. The specification<br />

places requirements on what an <strong>XML</strong> processor must do and not do, but the application is outside its scope.<br />

The processor (as the specification calls it) is often referred to colloquially as an <strong>XML</strong> parser.<br />

<strong>Markup</strong> and Content<br />

The characters which make up an <strong>XML</strong> document are divided into markup and content. <strong>Markup</strong> and content<br />

may be distinguished by the application of simple syntactic rules. All strings which constitute markup either<br />

begin with the character "<" and end with a ">", or begin with the character "&" and end with a ";". Strings of<br />

characters which are not markup are content.<br />

Tag<br />

A markup construct that begins with "<" and ends with ">". Tags come in three flavors: start-tags, for example<br />

<section>, end-tags, for example </section>, and empty-element tags, for example <line-break/>.<br />

Element<br />

A logical component of a document which either begins with a start-tag and ends with a matching end-tag, or<br />

consists only of an empty-element tag. The characters between the start- and end-tags, if any, are the element's<br />

content, and may contain markup, including other elements, which are called child elements. An example of an<br />

element is <Greeting>Hello, world.</Greeting>. Another is <line-break/>.<br />

Attribute<br />

A markup construct consisting of a name/value pair that exists within a start-tag or empty-element tag. In the<br />

example (below) the element img has two attributes, src and alt:<br />

<img src="madonna.jpg" alt='Foligno Madonna, by Raphael'/>. Another example would be<br />

<step number="3">Connect A to B.</step> where the name of the attribute is "number" and the value is "3".<br />

<strong>XML</strong> Declaration<br />

<strong>XML</strong> documents may begin by declaring some information about themselves, as in the following example.<br />

<?xml version="1.0" encoding="UTF-8" ?><br />

Example<br />

Here is a small, complete <strong>XML</strong> document, which uses all of these constructs and concepts.<br />

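<?xml version="1.0" encoding="UTF-8" ?><br />

<painting><br />

<img src="madonna.jpg" alt='Foligno Madonna, by Raphael'/><br />

<caption>This is Raphael's "Foligno" Madonna, painted in<br />

<date>1511</date>–<date>1512</date>.<br />

</caption><br />

</painting><br />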

There are five elements in this example document: painting, img, caption, and two dates. The date elements are<br />

children of caption, which is a child of the root element painting. img has two attributes, src and alt.<br />
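As a rough illustration, the structure described above can be inspected with Python's standard xml.etree.ElementTree module. The element names come from the text; the attribute values (madonna.jpg and the alt text) and the variable names are assumptions made for this sketch.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: element names follow the description above;
# the attribute values are illustrative assumptions.
DOC = """<?xml version="1.0" encoding="UTF-8" ?>
<painting>
  <img src="madonna.jpg" alt='Foligno Madonna, by Raphael'/>
  <caption>This is Raphael's "Foligno" Madonna, painted in
    <date>1511</date>\u2013<date>1512</date>.
  </caption>
</painting>"""

# Parse from bytes so the parser itself applies the declared encoding.
root = ET.fromstring(DOC.encode("utf-8"))

# iter() walks the root and all descendants in document order.
names = [element.tag for element in root.iter()]

# Attributes of the empty-element tag <img .../>.
img_attrs = dict(root.find("img").attrib)

# The two <date> children of <caption>.
years = [date.text for date in root.find("caption").findall("date")]

print(names)      # ['painting', 'img', 'caption', 'date', 'date']
print(img_attrs)
print(years)
```

The list of tags has five entries, matching the five elements counted in the text.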

Characters and escaping<br />

<strong>XML</strong> documents consist entirely of characters from the Unicode repertoire. Except for a small number of<br />

specifically excluded control characters, any character defined by Unicode may appear within the content of an <strong>XML</strong><br />

document. The selection of characters which may appear within markup is somewhat more limited but still large.<br />

<strong>XML</strong> includes facilities for identifying the encoding of the Unicode characters which make up the document, and for<br />

expressing characters which, for one reason or another, cannot be used directly.<br />

Details on valid characters<br />

Unicode characters in the following code point ranges are valid in <strong>XML</strong> 1.0 documents: [11]<br />

• U+0009<br />

• U+000A<br />

• U+000D<br />

• U+0020–U+D7FF<br />

• U+E000–U+FFFD<br />

• U+10000–U+10FFFF<br />

Unicode characters in the following code point ranges are valid in <strong>XML</strong> 1.1 documents only in certain contexts (in<br />

practice, only when written as numeric character references): [12]<br />

• U+0001–U+0008<br />

• U+000B–U+000C<br />

• U+000E–U+001F<br />

• U+007F–U+0084<br />

• U+0086–U+009F<br />

These restricted code points fall within the following ranges, which together contain every character that is valid in<br />

<strong>XML</strong> 1.1 documents:<br />

• U+0001–U+D7FF<br />

• U+E000–U+FFFD<br />

• U+10000–U+10FFFF<br />
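The ranges listed above translate directly into membership tests. The following is a minimal sketch; the function names are invented for illustration, and the five narrow XML 1.1 ranges are treated as the "restricted" characters of the XML 1.1 specification, which is an interpretation of the lists above rather than something the text states in those words.

```python
def is_xml10_char(cp: int) -> bool:
    """True if the code point may appear in an XML 1.0 document."""
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def is_xml11_char(cp: int) -> bool:
    """True if the code point falls in the broad XML 1.1 ranges."""
    return (
        0x1 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def is_xml11_restricted(cp: int) -> bool:
    """True for the control characters XML 1.1 admits only in certain contexts."""
    return (
        0x1 <= cp <= 0x8
        or 0xB <= cp <= 0xC
        or 0xE <= cp <= 0x1F
        or 0x7F <= cp <= 0x84
        or 0x86 <= cp <= 0x9F
    )


# U+000B (vertical tab): not allowed in XML 1.0, restricted in XML 1.1.
print(is_xml10_char(0xB), is_xml11_char(0xB), is_xml11_restricted(0xB))
```

Note that U+0085 (next line) is inside the XML 1.1 ranges but outside the restricted set, and that U+0000 and the surrogate block U+D800–U+DFFF are excluded by both versions.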

Encoding detection<br />

The Unicode character set can be encoded into bytes for storage or transmission in a variety of different ways, called<br />

"encodings". Unicode itself defines encodings which cover the entire repertoire; well-known ones include UTF-8<br />

and UTF-16. [13] There are many other text encodings which pre-date Unicode, such as ASCII and ISO/IEC 8859;<br />

their character repertoires in almost every case are subsets of the Unicode character set.<br />

<strong>XML</strong> allows the use of any of the Unicode-defined encodings, and any other encodings whose characters also appear<br />

in Unicode. <strong>XML</strong> also provides a mechanism whereby an <strong>XML</strong> processor can reliably, without any prior knowledge,<br />

determine which encoding is being used. [14] Encodings other than UTF-8 and UTF-16 will not necessarily be<br />

recognized by every <strong>XML</strong> parser.<br />
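One way a processor can guess the encoding family without prior knowledge is to look at the first few bytes of the document. The sketch below is a deliberately simplified illustration of that idea: the function name is hypothetical, it covers only UTF-8 and UTF-16, and it ignores UTF-32, EBCDIC, and the encoding declaration itself, all of which a real processor would also consider.

```python
import codecs


def sniff_xml_encoding(data: bytes) -> str:
    """Guess UTF-8 vs. UTF-16 from the first bytes of an XML document."""
    # 1. A byte order mark, if present, settles the question.
    if data.startswith(codecs.BOM_UTF8):
        return "UTF-8"
    if data.startswith(codecs.BOM_UTF16_BE):
        return "UTF-16BE"
    if data.startswith(codecs.BOM_UTF16_LE):
        return "UTF-16LE"
    # 2. Without a BOM, the byte pattern of a leading "<?" reveals the
    #    width and byte order of the code units.
    if data.startswith(b"\x00<\x00?"):
        return "UTF-16BE"
    if data.startswith(b"<\x00?\x00"):
        return "UTF-16LE"
    # 3. Otherwise fall back to UTF-8 (simplifying assumption of this sketch).
    return "UTF-8"


sample = "<?xml version='1.0'?><a/>".encode("utf-16-le")
print(sniff_xml_encoding(sample))  # UTF-16LE
```

The trick works because every document that carries an XML declaration starts with the same characters, so their byte patterns differ predictably between encodings.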

Escaping<br />

There are several reasons why it may be difficult or impossible to include some character directly in an <strong>XML</strong><br />

document.<br />

• The characters "<" and "&" are key syntax markers and may never appear in content outside a CDATA section.<br />
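A minimal sketch of escaping in practice, using helpers from Python's standard library (xml.sax.saxutils). The sample strings are made up for illustration; the last line shows one way to obtain numeric character references for characters that a target encoding cannot represent.

```python
from xml.sax.saxutils import escape, quoteattr, unescape

text = "if a < b && c > d"

# escape() replaces "&", "<" and ">" with predefined entity references.
escaped = escape(text)
print(escaped)  # if a &lt; b &amp;&amp; c &gt; d

# unescape() reverses the substitution.
assert unescape(escaped) == text

# quoteattr() escapes a value and wraps it in quotes for use as an attribute.
print(quoteattr("a<b"))  # "a&lt;b"

# Characters missing from the target encoding can be written as
# numeric character references instead.
print("caf\u00e9".encode("ascii", "xmlcharrefreplace"))  # b'caf&#233;'
```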


Comments<br />

Comments may appear anywhere in a document outside other markup. Comments cannot appear before the <strong>XML</strong><br />

declaration, which, when present, must be the very first thing in the document. The string "--" (double-hyphen) is not<br />

allowed inside comments (as it is used to delimit them), and entity references are not recognized within comments.<br />

An example of a valid comment: "<!-- no need to escape <code> & such in comments -->"<br />

International use<br />

<strong>XML</strong> supports the direct use of almost any Unicode character in element names, attributes, comments, character<br />

data, and processing instructions (other than the ones that have special symbolic meaning in <strong>XML</strong> itself, such as the<br />

open corner bracket, "<").<br />


DTD<br />

The oldest schema language for <strong>XML</strong> is the Document Type Definition (DTD), inherited from SGML.<br />

DTDs have the following benefits:<br />

• DTD support is ubiquitous due to its inclusion in the <strong>XML</strong> 1.0 standard.<br />

• DTDs are terse compared to element-based schema languages and consequently present more information in a<br />

single screen.<br />

• DTDs allow the declaration of standard public entity sets for publishing characters.<br />

• DTDs define a document type rather than the types used by a namespace, thus grouping all constraints for a<br />

document in a single collection.<br />

DTDs have the following limitations:<br />

• They have no explicit support for newer features of <strong>XML</strong>, most importantly namespaces.<br />

• They lack expressiveness. <strong>XML</strong> DTDs are simpler than SGML DTDs and there are certain structures that cannot<br />

be expressed with regular grammars. DTDs only support rudimentary datatypes.<br />

• They lack readability. DTD designers typically make heavy use of parameter entities (which behave essentially as<br />

textual macros), which make it easier to define complex grammars, but at the expense of clarity.<br />

• They use a syntax based on regular expression syntax, inherited from SGML, to describe the schema. Typical<br />

<strong>XML</strong> APIs such as SAX do not attempt to offer applications a structured representation of the syntax, so it is less<br />

accessible to programmers than an element-based syntax may be.<br />

Two peculiar features that distinguish DTDs from other schema types are the syntactic support for embedding a<br />

DTD within <strong>XML</strong> documents and for defining entities, which are arbitrary fragments of text and/or markup that the<br />

<strong>XML</strong> processor inserts in the DTD itself and in the <strong>XML</strong> document wherever they are referenced, like character<br />

escapes.<br />
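The two features just described, a DTD embedded in the document and an entity defined in it, can be seen together in a small sketch. The document, the entity name org and its replacement text are invented for illustration. Python's bundled parser is non-validating, so it reads the internal DTD subset and expands the entity but does not check the document against the element declaration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document with an internal DTD subset that declares one
# element type and one general entity.
DOC = """<!DOCTYPE memo [
  <!ELEMENT memo (#PCDATA)>
  <!ENTITY org "Example Widgets Ltd.">
]>
<memo>Issued by &org;</memo>"""

memo = ET.fromstring(DOC)

# The parser has already replaced the entity reference with its text.
print(memo.text)  # Issued by Example Widgets Ltd.
```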

DTD technology is still used in many applications because of its ubiquity.<br />

<strong>XML</strong> Schema<br />

A newer schema language, described by the W3C as the successor of DTDs, is <strong>XML</strong> Schema, often referred to by<br />

the initialism for <strong>XML</strong> Schema instances, XSD (<strong>XML</strong> Schema Definition). XSDs are far more powerful than DTDs<br />

in describing <strong>XML</strong> languages. They use a rich datatyping system and allow for more detailed constraints on an <strong>XML</strong><br />

document's logical structure. XSDs also use an <strong>XML</strong>-based format, which makes it possible to use ordinary <strong>XML</strong><br />

tools to help process them.<br />

RELAX NG<br />

RELAX NG was initially specified by OASIS and is now also an ISO international standard (as part of DSDL).<br />

RELAX NG schemas may be written in either an <strong>XML</strong>-based syntax or a more compact non-<strong>XML</strong> syntax; the two<br />

syntaxes are isomorphic and James Clark's Trang conversion tool can convert between them without loss of<br />

information. RELAX NG has a simpler definition and validation framework than <strong>XML</strong> Schema, making it easier to<br />

use and implement. It also has the ability to use datatype framework plug-ins; a RELAX NG schema author, for<br />

example, can require values in an <strong>XML</strong> document to conform to definitions in <strong>XML</strong> Schema Datatypes.


Schematron<br />

Schematron is a language for making assertions about the presence or absence of patterns in an <strong>XML</strong> document. It<br />

typically uses XPath expressions.<br />

ISO DSDL and other schema languages<br />

The ISO DSDL (Document Schema Description <strong>Language</strong>s) standard brings together a comprehensive set of small<br />

schema languages, each targeted at specific problems. DSDL includes RELAX NG full and compact syntax,<br />

Schematron assertion language, and languages for defining datatypes, character repertoire constraints, renaming and<br />

entity expansion, and namespace-based routing of document fragments to different validators. DSDL schema<br />

languages do not have the vendor support of <strong>XML</strong> Schemas yet, and are to some extent a grassroots reaction of<br />

industrial publishers to the lack of utility of <strong>XML</strong> Schemas for publishing.<br />

Some schema languages not only describe the structure of a particular <strong>XML</strong> format but also offer limited facilities to<br />

influence processing of individual <strong>XML</strong> files that conform to this format. DTDs and XSDs both have this ability;<br />

they can for instance provide the infoset augmentation facility and attribute defaults. RELAX NG and Schematron<br />

intentionally do not provide these.<br />
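Attribute defaulting is perhaps the easiest of these processing effects to observe. In the sketch below, the element and attribute names are hypothetical; the default is declared in an internal DTD subset, and the expat-based parser bundled with Python reports the attribute even though the start-tag does not contain it.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the DTD declares a default value for "priority".
DOC = """<!DOCTYPE task [
  <!ATTLIST task priority CDATA "normal">
]>
<task>File the report</task>"""

task = ET.fromstring(DOC)

# The attribute is absent from the start-tag, yet the application sees it.
print(task.get("priority"))  # normal
```

An application that relies on such defaults therefore depends on the DTD being available and read, which is one reason some schema languages deliberately avoid the feature.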

Related specifications<br />

A cluster of specifications closely related to <strong>XML</strong> have been developed, starting soon after the initial publication of<br />

<strong>XML</strong> 1.0. It is frequently the case that the term "<strong>XML</strong>" is used to refer to <strong>XML</strong> together with one or more of these<br />

other technologies which have come to be seen as part of the <strong>XML</strong> core.<br />

• <strong>XML</strong> Namespaces enable the same document to contain <strong>XML</strong> elements and attributes taken from different<br />

vocabularies, without any naming collisions occurring. Essentially all software which is advertised as supporting<br />

<strong>XML</strong> also supports <strong>XML</strong> Namespaces.<br />

• <strong>XML</strong> Base defines the xml:base attribute, which may be used to set the base for resolution of relative URI<br />

references within the scope of a single <strong>XML</strong> element.<br />

• The <strong>XML</strong> Information Set or <strong>XML</strong> infoset describes an abstract data model for <strong>XML</strong> documents in terms of<br />

information items. The infoset is commonly used in the specifications of <strong>XML</strong> languages, for convenience in<br />

describing constraints on the <strong>XML</strong> constructs those languages allow.<br />

• xml:id Version 1.0 asserts that an attribute named xml:id functions as an "ID attribute" in the sense used in a<br />

DTD.<br />

• XPath defines a syntax named XPath expressions which identifies one or more of the internal components<br />

(elements, attributes, and so on) included in an <strong>XML</strong> document. XPath is widely used in other core-<strong>XML</strong><br />

specifications and in programming libraries for accessing <strong>XML</strong>-encoded data.<br />

• XSLT is a language with an <strong>XML</strong>-based syntax that is used to transform <strong>XML</strong> documents into other <strong>XML</strong><br />

documents, HTML, or other, unstructured formats such as plain text or RTF. XSLT is very tightly coupled with<br />

XPath, which it uses to address components of the input <strong>XML</strong> document, mainly elements and attributes.<br />

• XSL Formatting Objects, or XSL-FO, is a markup language for <strong>XML</strong> document formatting which is most often<br />

used to generate PDFs.<br />

• XQuery is an <strong>XML</strong>-oriented query language strongly rooted in XPath and <strong>XML</strong> Schema. It provides methods to<br />

access, manipulate and return <strong>XML</strong>.<br />

• <strong>XML</strong> Signature defines syntax and processing rules for creating digital signatures on <strong>XML</strong> content.<br />

• <strong>XML</strong> Encryption defines syntax and processing rules for encrypting <strong>XML</strong> content.<br />

Some other specifications conceived as part of the "<strong>XML</strong> Core" have failed to find wide adoption, including<br />

XInclude, XLink, and XPointer.
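To make the namespace and XPath entries above more concrete, here is a small sketch using Python's standard library, which supports namespaces and a limited subset of XPath. The two namespace URIs and all element and attribute names are invented for the example.

```python
import xml.etree.ElementTree as ET

# Two vocabularies both define a "table" element; namespaces keep them apart.
DOC = """<order xmlns="urn:example:shop" xmlns:f="urn:example:furniture">
  <table>order lines</table>
  <f:table f:legs="4"/>
</order>"""

root = ET.fromstring(DOC)

# ElementTree exposes each name as "{namespace-uri}local-name".
tags = [child.tag for child in root]
print(tags)

# Prefixes used in path expressions are mapped by the caller, independently
# of the prefixes that happen to appear in the document.
ns = {"shop": "urn:example:shop", "furn": "urn:example:furniture"}
furniture = root.find("furn:table", ns)
legs = furniture.get("{urn:example:furniture}legs")
print(legs)  # 4
```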


Use on the Internet<br />

It is common for <strong>XML</strong> to be used in interchanging data over the Internet. RFC 3023 gives rules for the construction<br />

of Internet Media Types for use when sending <strong>XML</strong>. It also defines the types "application/xml" and "text/xml",<br />

which say only that the data is in <strong>XML</strong>, and nothing about its semantics. The use of "text/xml" has been criticized as a<br />

potential source of encoding problems and is now in the process of being deprecated. [18] RFC 3023 also<br />

recommends that <strong>XML</strong>-based languages be given media types beginning in "application/" and ending in "+xml"; for<br />

example "application/svg+xml" for SVG.<br />
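The naming convention above makes it possible to recognize XML payloads from the media type alone. A minimal sketch, with a hypothetical function name; it checks only the two generic types and the "+xml" suffix mentioned in the text and ignores any other types the RFC defines.

```python
def is_xml_media_type(media_type: str) -> bool:
    """Return True if the media type signals XML content (simplified check)."""
    # Drop parameters such as "; charset=utf-8" and normalize case.
    base = media_type.split(";", 1)[0].strip().lower()
    return base in ("application/xml", "text/xml") or base.endswith("+xml")


print(is_xml_media_type("application/svg+xml"))      # True
print(is_xml_media_type("text/xml; charset=utf-8"))  # True
print(is_xml_media_type("application/json"))         # False
```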

Further guidelines for the use of <strong>XML</strong> in a networked context may be found in RFC 3470, also known as IETF BCP<br />

70; this document is very wide-ranging and covers many aspects of designing and deploying an <strong>XML</strong>-based<br />

language.<br />

Programming interfaces<br />

The design goals of <strong>XML</strong> include "It shall be easy to write programs which process <strong>XML</strong> documents." [8] Despite<br />

this fact, the <strong>XML</strong> specification contains almost no information about how programmers might go about doing such<br />

processing. The <strong>XML</strong> Infoset provides a vocabulary to refer to the constructs within an <strong>XML</strong> document, but once<br />

again does not provide any guidance on how to access this information. A variety of APIs for accessing <strong>XML</strong> have<br />

been developed and used, and some have been standardized.<br />

Existing APIs for <strong>XML</strong> processing tend to fall into these categories:<br />

• Stream-oriented APIs accessible from a programming language, for example SAX and StAX.<br />

• Tree-traversal APIs accessible from a programming language, for example DOM.<br />

• <strong>XML</strong> data binding, which provides an automated translation between an <strong>XML</strong> document and<br />

programming-language objects.<br />

• Declarative transformation languages such as XSLT and XQuery.<br />

Stream-oriented facilities require less memory and, for certain tasks which are based on a linear traversal of an <strong>XML</strong><br />

document, are faster and simpler than other alternatives. Tree-traversal and data-binding APIs typically require the<br />

use of much more memory, but are often found more convenient for use by programmers; some include declarative<br />

retrieval of document components via the use of XPath expressions.<br />

XSLT is designed for declarative description of <strong>XML</strong> document transformations, and has been widely implemented<br />

both in server-side packages and Web browsers. XQuery overlaps XSLT in its functionality, but is designed more for<br />

searching of large <strong>XML</strong> databases.<br />

Simple API for <strong>XML</strong> (SAX)<br />

SAX is a lexical, event-driven interface in which a document is read serially and its contents are reported as<br />

callbacks to various methods on a handler object of the user's design. SAX is fast and efficient to implement, but<br />

difficult to use for extracting information at random from the <strong>XML</strong>, since it tends to burden the application author<br />

with keeping track of what part of the document is being processed. It is better suited to situations in which certain<br />

types of information are always handled the same way, no matter where they occur in the document.<br />
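A short sketch of the callback style, using the SAX implementation in Python's standard library. The handler class, its attribute names and the sample document are made up for illustration. Note how the handler must track for itself whether it is currently inside a date element, which is the bookkeeping burden described above.

```python
import xml.sax


class DateCollector(xml.sax.ContentHandler):
    """Collects the text of every <date> element, wherever it occurs."""

    def __init__(self):
        super().__init__()
        self.dates = []
        self._in_date = False
        self._buffer = []

    def startElement(self, name, attrs):
        if name == "date":
            self._in_date = True
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so accumulate it.
        if self._in_date:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "date":
            self.dates.append("".join(self._buffer))
            self._in_date = False


handler = DateCollector()
xml.sax.parseString(
    b"<caption>painted in <date>1511</date>-<date>1512</date>.</caption>",
    handler,
)
print(handler.dates)  # ['1511', '1512']
```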

Pull parsing<br />

Pull parsing [19] treats the document as a series of items which are read in sequence using the Iterator design pattern.<br />

This allows for writing of recursive-descent parsers in which the structure of the code performing the parsing mirrors<br />

the structure of the <strong>XML</strong> being parsed, and intermediate parsed results can be used and accessed as local variables<br />

within the methods performing the parsing, or passed down (as method parameters) into lower-level methods, or<br />

returned (as method return values) to higher-level methods. Examples of pull parsers include StAX in the Java<br />

programming language, <strong>XML</strong>Reader in PHP and System.Xml.XmlReader in the .NET Framework.


A pull parser creates an iterator that sequentially visits the various elements, attributes, and data in an <strong>XML</strong><br />

document. Code which uses this iterator can test the current item (to tell, for example, whether it is a start or end<br />

element, or text), and inspect its attributes (local name, namespace, values of <strong>XML</strong> attributes, value of text, etc.), and<br />

can also move the iterator to the next item. The code can thus extract information from the document as it traverses<br />

it. The recursive-descent approach tends to lend itself to keeping data as typed local variables in the code doing the<br />

parsing, while SAX, for instance, typically requires a parser to manually maintain intermediate data within a stack of<br />

elements which are parent elements of the element being parsed. Pull-parsing code can be more straightforward to<br />

understand and maintain than SAX parsing code.<br />
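For comparison with the callback style, here is a sketch of the pull style using XMLPullParser from Python's standard library. It is only an approximation of the APIs named above: this parser still builds element objects as it goes, but the calling code, not the parser, decides when to ask for the next event. The sample input is invented and is fed in two chunks to show that parsing is incremental.

```python
import xml.etree.ElementTree as ET

parser = ET.XMLPullParser(events=("start", "end"))

# Feed the document in pieces, as if it arrived over a network connection.
parser.feed("<caption>painted in <date>1511</date>")
parser.feed("-<date>1512</date>.</caption>")
parser.close()

dates = []          # intermediate results live in ordinary local variables
element_count = 0

# The application pulls events one at a time and inspects the current item.
for event, element in parser.read_events():
    if event == "start":
        element_count += 1
    elif event == "end" and element.tag == "date":
        dates.append(element.text)

print(dates)          # ['1511', '1512']
print(element_count)  # 3
```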

Document Object Model (DOM)<br />

DOM (Document Object Model) is an interface-oriented Application Programming Interface that allows for<br />

navigation of the entire document as if it were a tree of "Node" objects representing the document's contents. A<br />

DOM document can be created by a parser, or can be generated manually by users (with limitations). Data types in<br />

DOM Nodes are abstract; implementations provide their own programming language-specific bindings. DOM<br />

implementations tend to be memory intensive, as they generally require the entire document to be loaded into<br />

memory and constructed as a tree of objects before access is allowed.<br />
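A brief sketch of tree navigation and manual node creation with xml.dom.minidom, a lightweight DOM implementation in Python's standard library. The sample document and the added note element are assumptions made for the example.

```python
from xml.dom import minidom

doc = minidom.parseString(
    '<painting><img src="madonna.jpg" alt="Foligno Madonna"/>'
    "<caption>Painted in <date>1511</date></caption></painting>"
)
root = doc.documentElement

# Navigate the tree: find an element and read one of its attributes.
img = root.getElementsByTagName("img")[0]
src = img.getAttribute("src")

# Nodes can also be created by hand and attached to the tree.
note = doc.createElement("note")
note.appendChild(doc.createTextNode("added by hand"))
root.appendChild(note)

child_names = [
    node.tagName for node in root.childNodes
    if node.nodeType == node.ELEMENT_NODE
]
print(src)          # madonna.jpg
print(child_names)  # ['img', 'caption', 'note']
```

The whole document is held in memory as node objects before any of these calls can run, which is the cost noted above.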

Data binding<br />

Another form of <strong>XML</strong> processing API is <strong>XML</strong> data binding, where <strong>XML</strong> data is made available as a hierarchy of<br />

custom, strongly typed classes, in contrast to the generic objects created by a Document Object Model parser. This<br />

approach simplifies code development, and in many cases allows problems to be identified at compile time rather<br />

than run-time. Example data binding systems include the Java Architecture for <strong>XML</strong> Binding (JAXB), <strong>XML</strong><br />

Serialization in .NET, and CodeSynthesis XSD for C++. [20] [21] [22]<br />
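The idea behind data binding can be sketched by hand, although the tools named above typically generate such classes automatically from a schema. Everything in this example (the Step class, the bind_step function, and the step element with its number attribute) is hypothetical and serves only to show XML data surfacing as a typed object.

```python
from dataclasses import dataclass
import xml.etree.ElementTree as ET


@dataclass
class Step:
    """Typed view of a hypothetical <step number="..."> element."""
    number: int
    text: str


def bind_step(element: ET.Element) -> Step:
    # The conversion to int fails early if the attribute is not numeric,
    # instead of leaving a string for later code to stumble over.
    return Step(number=int(element.get("number")), text=element.text or "")


step = bind_step(ET.fromstring('<step number="3">Connect A to B.</step>'))
print(step)             # Step(number=3, text='Connect A to B.')
print(step.number + 1)  # 4
```

Application code then works with step.number as an integer rather than querying generic nodes for strings.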

<strong>XML</strong> as data type<br />

<strong>XML</strong> is beginning to appear as a first-class data type in other languages. The ECMAScript for <strong>XML</strong> (E4X)<br />

extension to the ECMAScript/JavaScript language explicitly defines two specific objects (<strong>XML</strong> and <strong>XML</strong>List) for<br />

JavaScript, which support <strong>XML</strong> document nodes and <strong>XML</strong> document lists as distinct objects and use a dot-notation<br />

specifying parent-child relationships. E4X is supported by the Mozilla 2.5+ browsers and Adobe Actionscript, but<br />

has not been adopted more universally. Similar notations are used in Microsoft's LINQ implementation for Microsoft<br />

.NET 3.5 and above, and in Scala (which uses the Java VM). The open-source xmlsh application, which provides a<br />

Linux-like shell with special features for <strong>XML</strong> manipulation, similarly treats <strong>XML</strong> as a data type, using the <[ ]><br />

notation. [23] The Resource Description Framework defines a data type rdf:<strong>XML</strong>Literal to hold wrapped, canonical<br />

<strong>XML</strong>. [24]<br />

History<br />

<strong>XML</strong> is an application profile of SGML (ISO 8879). [25]<br />

The versatility of SGML for dynamic information display was understood by early digital media publishers in the<br />

late 1980s prior to the rise of the Internet. [26] [27] By the mid-1990s some practitioners of SGML had gained<br />

experience with the then-new World Wide Web, and believed that SGML offered solutions to some of the problems<br />

the Web was likely to face as it grew. Dan Connolly added SGML to the list of W3C's activities when he joined the<br />

staff in 1995; work began in mid-1996 when Sun Microsystems engineer Jon Bosak developed a charter and<br />

recruited collaborators. Bosak was well connected in the small community of people who had experience both in<br />

SGML and the Web. [28]<br />

<strong>XML</strong> was compiled by a working group of eleven members, [29] supported by an (approximately) 150-member<br />

Interest Group. Technical debate took place on the Interest Group mailing list and issues were resolved by consensus


or, when that failed, majority vote of the Working Group. A record of design decisions and their rationales was<br />

compiled by Michael Sperberg-McQueen on December 4, 1997. [30] James Clark served as Technical Lead of the<br />

Working Group, notably contributing the empty-element "<empty/>" syntax and the name "<strong>XML</strong>". Other names that<br />

had been put forward for consideration included "MAGMA" (Minimal Architecture for Generalized <strong>Markup</strong><br />

Applications), "SLIM" (Structured <strong>Language</strong> for Internet <strong>Markup</strong>) and "MGML" (Minimal Generalized <strong>Markup</strong><br />

<strong>Language</strong>). The co-editors of the specification were originally Tim Bray and Michael Sperberg-McQueen. Halfway<br />

through the project Bray accepted a consulting engagement with Netscape, provoking vociferous protests from<br />

Microsoft. Bray was temporarily asked to resign the editorship. This led to intense dispute in the Working Group,<br />

eventually solved by the appointment of Microsoft's Jean Paoli as a third co-editor.<br />

The <strong>XML</strong> Working Group never met face-to-face; the design was accomplished using a combination of email and<br />

weekly teleconferences. The major design decisions were reached in twenty weeks of intense work between July and<br />

November 1996, when the first Working Draft of an <strong>XML</strong> specification was published. [31] Further design work<br />

continued through 1997, and <strong>XML</strong> 1.0 became a W3C Recommendation on February 10, 1998.<br />

Sources<br />

<strong>XML</strong> is a profile of the ISO standard SGML, and most of <strong>XML</strong> comes from SGML unchanged. From SGML comes<br />

the separation of logical and physical structures (elements and entities), the availability of grammar-based validation<br />

(DTDs), the separation of data and metadata (elements and attributes), mixed content, the separation of processing<br />

from representation (processing instructions), and the default angle-bracket syntax. Removed was the SGML<br />

Declaration (<strong>XML</strong> has a fixed delimiter set and adopts Unicode as the document character set).<br />

Other sources of technology for <strong>XML</strong> were the Text Encoding Initiative (TEI), which defined a profile of SGML for<br />

use as a 'transfer syntax'; HTML, in which elements were synchronous with their resource, the separation of<br />

document character set from resource encoding, the xml:lang attribute, and the HTTP notion that metadata<br />

accompanied the resource rather than being needed at the declaration of a link. The Extended Reference Concrete<br />

Syntax (ERCS) project of the SPREAD (Standardization Project Regarding East Asian Documents) project of the<br />

ISO-related China/Japan/Korea Document Processing expert group was the basis of <strong>XML</strong> 1.0's naming rules;<br />

SPREAD also introduced hexadecimal numeric character references and the concept of references to make available<br />

all Unicode characters. To support ERCS, <strong>XML</strong> and HTML better, the SGML standard IS 8879 was revised in 1996<br />

and 1998 with WebSGML Adaptations. The <strong>XML</strong> header followed that of ISO HyTime.<br />

Ideas that developed during discussion which were novel in <strong>XML</strong> included the algorithm for encoding detection and<br />

the encoding header, the processing instruction target, the xml:space attribute, and the new close delimiter for<br />

empty-element tags. The notion of well-formedness as opposed to validity (which enables parsing without a schema)<br />

was first formalized in <strong>XML</strong>, although it had been implemented successfully in the Electronic Book Technology<br />

"Dynatext" software [32] ; the software from the University of Waterloo New Oxford English Dictionary Project; the<br />

RISP LISP SGML text processor at Uniscope, Tokyo; the US Army Missile Command IADS hypertext system;<br />

Mentor Graphics Context; Interleaf and Xerox Publishing System.<br />

Versions<br />

There are two current versions of <strong>XML</strong>. The first (<strong>XML</strong> 1.0) was initially defined in 1998. It has undergone minor<br />

revisions since then, without being given a new version number, and is currently in its fifth edition, as published on<br />

November 26, 2008. It is widely implemented and still recommended for general use.<br />

The second (<strong>XML</strong> 1.1) was initially published on February 4, 2004, the same day as <strong>XML</strong> 1.0 Third Edition, [33] and<br />

is currently in its second edition, as published on August 16, 2006. It contains features (some contentious) that are<br />

intended to make <strong>XML</strong> easier to use in certain cases. [34] The main changes are to enable the use of line-ending<br />

characters used on EBCDIC platforms, and the use of scripts and characters absent from Unicode 3.2. <strong>XML</strong> 1.1 is<br />

not very widely implemented and is recommended for use only by those who need its unique features. [35]


Prior to its fifth edition release, <strong>XML</strong> 1.0 differed from <strong>XML</strong> 1.1 in having stricter requirements for characters<br />

available for use in element and attribute names and unique identifiers: in the first four editions of <strong>XML</strong> 1.0 the<br />

characters were exclusively enumerated using a specific version of the Unicode standard (Unicode 2.0 to Unicode<br />

3.2). The fifth edition substitutes the mechanism of <strong>XML</strong> 1.1, which is more future-proof but reduces redundancy.<br />

The approach taken in the fifth edition of <strong>XML</strong> 1.0 and in all editions of <strong>XML</strong> 1.1 is that only certain characters are<br />

forbidden in names, and everything else is allowed, in order to accommodate the use of suitable name characters in<br />

future versions of Unicode. In the fifth edition, <strong>XML</strong> names may contain characters in the Balinese, Cham, or<br />

Phoenician scripts among many others which have been added to Unicode since Unicode 3.2. [36]<br />

Almost any Unicode code point can be used in the character data and attribute values of an <strong>XML</strong> 1.0 or 1.1<br />

document, even if the character corresponding to the code point is not defined in the current version of Unicode. In<br />

character data and attribute values, <strong>XML</strong> 1.1 allows the use of more control characters than <strong>XML</strong> 1.0, but, for<br />

"robustness", most of the control characters introduced in <strong>XML</strong> 1.1 must be expressed as numeric character<br />

references (and #x7F through #x9F, which had been allowed in <strong>XML</strong> 1.0, are in <strong>XML</strong> 1.1 even required to be<br />

expressed as numeric character references [37] ). Among the supported control characters in <strong>XML</strong> 1.1 are two line<br />

break codes that must be treated as whitespace. Whitespace characters are the only control codes that can be written<br />

directly.<br />

There has been discussion of an <strong>XML</strong> 2.0, although no organization has announced plans for work on such a project.<br />

<strong>XML</strong>-SW (SW for skunk works), written by one of the original developers of <strong>XML</strong>, contains some proposals for<br />

what an <strong>XML</strong> 2.0 might look like: elimination of DTDs from syntax, integration of namespaces, <strong>XML</strong> Base and<br />

<strong>XML</strong> Information Set (infoset) into the base standard.<br />

The World Wide Web Consortium also has an <strong>XML</strong> Binary Characterization Working Group doing preliminary<br />

research into use cases and properties for a binary encoding of the <strong>XML</strong> infoset. The working group is not chartered<br />

to produce any official standards. Since <strong>XML</strong> is by definition text-based, ITU-T and ISO are using the name Fast<br />

Infoset for their own binary infoset to avoid confusion (see ITU-T Rec. X.891 | ISO/IEC 24824-1).<br />

See also<br />

• Category:<strong>XML</strong><br />

• Binary <strong>XML</strong><br />

• <strong>XML</strong> Protocol<br />

• List of <strong>XML</strong> markup languages<br />

• Category:<strong>XML</strong>-based standards<br />

• Comparison of layout engines (<strong>XML</strong>)<br />

• Comparison of data serialization formats<br />

• OpenDocument<br />

Further reading<br />

• Annex A of ISO 8879:1986 (SGML)<br />

• Lawrence A. Cunningham (2005). "<strong>Language</strong>, Deals and Standards: The Future of <strong>XML</strong> Contracts". Washington<br />

University Law Review. SSRN 900616. [38]<br />

• Bosak, Jon; Tim Bray (May 1999). "<strong>XML</strong> and the Second-Generation Web". Scientific American. Online at <strong>XML</strong><br />

and the Second-Generation Web [39] .



External links<br />

• W3C <strong>XML</strong> homepage [40]<br />

• <strong>XML</strong> 1.0 Specification [41]<br />

• Introduction to Generalized <strong>Markup</strong> [42] by Charles Goldfarb<br />

• Making Mistakes with <strong>XML</strong> [43] by Sean Kelly<br />

• The Multilingual WWW [44] by Gavin Nicol<br />

• Retrospective on Extended Reference Concrete Syntax [45] by Rick Jelliffe<br />

• <strong>XML</strong>, Java and the Future of the Web [46] by Jon Bosak<br />

• <strong>XML</strong> tutorials in w3schools [47]<br />

• <strong>XML</strong>.gov [48]<br />

• Thinking <strong>XML</strong>: The <strong>XML</strong> decade [49] by Uche Ogbuji<br />

• <strong>XML</strong>: Ten year anniversary [50] by Elliot Kimber<br />

• Five years later, <strong>XML</strong>... [51] by Simon St. Laurent<br />

• 23 <strong>XML</strong> fallacies to watch out for [52] by Sean McGrath<br />

• <strong>XML</strong> Injection [53] - Web Application Security Consortium<br />

• W3C <strong>XML</strong> is Ten! [54] , <strong>XML</strong> 10 years press release<br />

References<br />

[1] "XML Media Types, RFC 3023" (http://tools.ietf.org/html/rfc3023#section-3.2). IETF. 2001-01. pp. 9–11. Retrieved 2010-01-04.

[2] "XML Media Types, RFC 3023" (http://tools.ietf.org/html/rfc3023#section-3.1). IETF. 2001-01. pp. 7–9. Retrieved 2010-01-04.

[3] http://www.w3.org/TR/2008/REC-xml-20081126/<br />

[4] http://www.w3.org/TR/2006/REC-xml11-20060816/<br />

[5] http://www.w3.org/TR/rec-xml<br />

[6] <strong>XML</strong> 1.0 Specification (http://www.w3.org/TR/REC-xml)<br />

[7] "W3C DOCUMENT LICENSE" (http://www.w3.org/Consortium/Legal/2002/copyright-documents-20021231).

[8] "XML 1.0 Origin and Goals" (http://www.w3.org/TR/REC-xml/#sec-origin-goals). Retrieved July 2009.

[9] "XML Applications and Initiatives" (http://xml.coverpages.org/xmlApplications.html).

[10] "Introduction to iWork Programming Guide. Mac OS X Reference Library" (http://developer.apple.com/mac/library/documentation/AppleApplications/Conceptual/iWork2-0_XML/Chapter01/Introduction.html). Apple.

[11] http://www.w3.org/TR/2006/REC-xml-20060816/#charsets

[12] http://www.w3.org/TR/xml11/#charsets

[13] "Characters vs. Bytes" (http://www.tbray.org/ongoing/When/200x/2003/04/26/UTF).

[14] "Autodetection of Character Encodings" (http://www.w3.org/TR/REC-xml/#sec-guessing).

[15] It is allowed, but not recommended, to use "



[27] Edited by Sueann Ambron and Kristina Hooper; foreword by John Sculley (1988). "Publishers, multimedia, and interactivity". Interactive multimedia. Cobb Group. ISBN 1-55615-124-1.

[28] Eliot Kimber (2006). "XML is 10" (http://drmacros-xml-rants.blogspot.com/#116460437782808906).

[29] The working group was originally called the "Editorial Review Board." The original members, and seven who were added before the first edition was complete, are listed at the end of the first edition of the XML Recommendation, at http://www.w3.org/TR/1998/REC-xml-19980210.

[30] "Reports From the W3C SGML ERB to the SGML WG And from the W3C XML ERB to the XML SIG" (http://www.w3.org/XML/9712-reports.html). W3.org. Retrieved 2009-07-31.

[31] "Extensible Markup Language (XML)" (http://www.w3.org/TR/WD-xml-961114.html). W3.org. 1996-11-14. Retrieved 2009-07-31.

[32] Jon Bosak, Sun Microsystems (2006-12-07). "Closing Keynote, XML 2006" (http://2006.xmlconference.org/proceedings/162/presentation.html). 2006.xmlconference.org. Retrieved 2009-07-31.

[33] Extensible Markup Language (XML) 1.0 (Third Edition) (http://www.w3.org/TR/2004/REC-xml-20040204)

[34] "Extensible Markup Language (XML) 1.1 (Second Edition) – Rationale and list of changes for XML 1.1" (http://www.w3.org/TR/xml11/#sec-xml11). W3C. Retrieved 2006-12-21.

[35] Harold, Elliotte Rusty (2004). Effective XML (http://www.cafeconleche.org/books/effectivexml/). Addison-Wesley. pp. 10–19. ISBN 0321150406.

[36] "Extensible Markup Language (XML) 1.1 (Second Edition) – Rationale and list of changes for XML 1.1" (http://www.w3.org/TR/xml11/#dt-name). W3C. Retrieved 2009-12-11.

[37] http://www.w3.org/TR/xml11/#sec-xml11<br />

[38] http://ssrn.com/abstract=900616<br />

[39] http://www.scientificamerican.com/article.cfm?id=xml-and-the-second-genera<br />

[40] http://www.w3.org/XML/

[41] http://www.w3.org/TR/REC-xml<br />

[42] http://www.sgmlsource.com/history/AnnexA.htm<br />

[43] http://www.developer.com/xml/article.php/10929_3583081_1<br />

[44] http://www.mind-to-mind.com/library/papers/multilingual/multilingual-www.html<br />

[45] http://xml.ascc.net/en/utf-8/ercsretro.html<br />

[46] http://www.xml.com/pub/a/w3j/s3.bosak.html<br />

[47] http://www.w3schools.com/xml/default.asp<br />

[48] http://xml.gov/<br />

[49] http://www-128.ibm.com/developerworks/library/x-think38.html<br />

[50] http://drmacros-xml-rants.blogspot.com/2006/11/xml-ten-year-aniversary.html<br />

[51] http://www.oreillynet.com/xml/blog/2003/02/five_years_later_xml.html<br />

[52] http://www.itworld.com/xml-fallacies-nlstipsm-080122<br />

[53] http://projects.webappsec.org/XML-Injection

[54] http://www.w3.org/2008/02/xml10-pressrelease



<strong>XML</strong> and MIME<br />

<strong>XML</strong><br />

An XML document is a text document that consists of an optional XML declaration and a single root element with well-formed content.
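As an informal illustration of this definition, the following Java sketch feeds a small document to the JDK's built-in DOM parser and reports the name of its root element. The class name, method name and sample strings are made up for this example; only the `javax.xml.parsers` and `org.w3c.dom` calls come from the standard library.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class WellFormedCheck {
    /** Returns the root element name, or null if the text is not well-formed XML. */
    static String rootElementName(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getNodeName();
        } catch (Exception e) {
            // Any parser or I/O failure is treated as "not well-formed" in this sketch.
            return null;
        }
    }

    public static void main(String[] args) {
        // Hypothetical sample: an XML declaration followed by one root element.
        System.out.println(rootElementName(
                "<?xml version=\"1.0\" encoding=\"UTF-8\"?><note>Blah</note>"));
        // Missing end tag, so the parser rejects it.
        System.out.println(rootElementName("<note>Blah"));
    }
}
```

The second call returns null because the root element is never closed, which is the kind of error a well-formedness check is meant to catch.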

Example XML Document

Blah

MIME

MIME (Multipurpose Internet Mail Extensions) is an Internet Standard that allows email systems to interpret<br />

complex data. Web browsers also use the MIME type to accurately display information or launch a separate<br />

application to handle the data.<br />

All MIME types (also called Internet media types) consist of two parts, in the form type/subtype.

This information is sent to the browser by a web server. Usually, the server determines the MIME type based on the<br />

document's file extension. For example, the server would interpret an extension of .txt (plain text file) to have a<br />

MIME type of text/plain.<br />
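The extension-based lookup described above might be sketched in Java as follows. The table contents, the class name and the `application/octet-stream` fallback are assumptions chosen for illustration; real servers use much larger, configurable mappings.

```java
import java.util.Locale;
import java.util.Map;

class ExtensionMimeTypes {
    // Hypothetical mapping table, not taken from any particular web server.
    private static final Map<String, String> BY_EXTENSION = Map.of(
            "txt", "text/plain",
            "xml", "application/xml",
            "xhtml", "application/xhtml+xml");

    /** Guesses a MIME type from the file extension alone. */
    static String forFileName(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return "application/octet-stream"; // assumed fallback for unknown data
        }
        String extension = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return BY_EXTENSION.getOrDefault(extension, "application/octet-stream");
    }

    public static void main(String[] args) {
        System.out.println(forFileName("notes.txt"));
        System.out.println(forFileName("feed.XML"));
    }
}
```

Because the guess depends only on the file name, a mislabelled file gets the wrong type; that is a limitation of the approach itself, not of this sketch.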

<strong>XML</strong> Specific MIME Types<br />

There are two MIME assignments for <strong>XML</strong> data. These are:<br />

• application/xml (RFC 3023)<br />

• text/xml (RFC 3023)<br />

Because of the wide variety of documents that can be expressed using an <strong>XML</strong> syntax, additional MIME types are<br />

needed to differentiate between languages. <strong>XML</strong>-based formats add a suffix of +xml to the MIME type.<br />
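A consumer can use the +xml convention to decide whether a payload is XML without knowing every individual type. The Java sketch below is one possible check; the class and method names are hypothetical, and it deliberately ignores media type parameters such as charset.

```java
import java.util.Locale;

class XmlMediaTypes {
    /** True for text/xml, application/xml, or any subtype ending in "+xml". */
    static boolean isXml(String mediaType) {
        // Drop parameters such as "; charset=utf-8" before comparing.
        int semicolon = mediaType.indexOf(';');
        String essence = (semicolon >= 0 ? mediaType.substring(0, semicolon) : mediaType)
                .trim()
                .toLowerCase(Locale.ROOT);
        return essence.equals("text/xml")
                || essence.equals("application/xml")
                || essence.endsWith("+xml");
    }

    public static void main(String[] args) {
        System.out.println(isXml("application/atom+xml"));
        System.out.println(isXml("text/plain"));
    }
}
```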

The following are some examples of common XML media types.

• Registered<br />

• Extensible HyperText <strong>Markup</strong> <strong>Language</strong> (XHTML): application/xhtml+xml (RFC 3236)<br />

• Atom: application/atom+xml (RFC 4287)<br />

• Registration-In-Progress<br />

• Extensible Stylesheet <strong>Language</strong> Transformations (XSLT): application/xslt+xml [1]<br />

• Scalable Vector Graphics (SVG): image/svg+xml [2]<br />

• Unregistered<br />

• Mathematical <strong>Markup</strong> <strong>Language</strong> (MathML): application/mathml+xml<br />

• Really Simple Syndication (RSS 2.0): application/rss+xml



External links<br />

• Official List of MIME Types [3]<br />

• IBM article [4]<br />

References<br />

[1] http://www.w3.org/TR/xslt20/#xslt-mime-definition<br />

[2] http://www.w3.org/TR/SVGMobile12/mimereg.html<br />

[3] http://www.iana.org/assignments/media-types/<br />

[4] http://www-128.ibm.com/developerworks/xml/library/x-mxd2.html<br />

<strong>XML</strong> appliance<br />

An <strong>XML</strong> appliance is a separate computer system with deliberately narrow functionality that exchanges <strong>XML</strong><br />

messages with other computer systems. <strong>XML</strong> appliances secure, accelerate and route <strong>XML</strong> so enterprises can<br />

cost-effectively realize its full potential for messaging and service-oriented architectures (SOAs). They are designed<br />

specifically to be easy to install, configure and manage. While some <strong>XML</strong> appliances must rely on specialized<br />

hardware and software to accelerate the processing of <strong>XML</strong> messages, others accomplish the same tasks using<br />

standards-based hardware and operating systems.<br />

History of <strong>XML</strong> appliances<br />

The first XML appliances were created by DataPower in 1999, followed by Sarvega and Forum Systems in 2001. Early development generally split between two groups of engineers: some focused on large volumes of XML transformations, others on high-speed XML processing and security. The transformation group created specialized software or application-specific integrated circuits (ASICs) that performed transformations up to 100 times faster than basic software-only solutions. Although these systems had some early adopters, use was initially restricted to large e-commerce sites such as Yahoo! and Amazon. The XML processing group created highly optimized appliances that secured and integrated XML across many use cases. Early entrants in XML appliances include vendors such as DataPower (now owned by IBM), Reactivity, Inc. (acquired by Cisco), Forum Systems, Layer 7 Technologies, Vordel, and Sarvega (now owned by Intel).

These two approaches began to converge when a second generation of <strong>XML</strong> appliances started to appear around<br />

2003, when these devices were used to exchange SOAP <strong>XML</strong> messages between computers on public networks.<br />

These messages required advanced security features such as encryption, digital signatures and denial of service<br />

attack prevention. Because the setup and configuration of software-only systems was time-consuming, companies

could save a great deal of money by using appliances that were pre-packaged with WS-Security standards built in.



Common features of <strong>XML</strong> appliances<br />

• They can validate <strong>XML</strong> messages for well-formedness as they enter or exit the appliance<br />

• They include hardware and/or software customized for efficient <strong>XML</strong> parsing and analysis.<br />

• They have built-in support for many <strong>XML</strong> standards such as XSLT, XPath, SOAP and WS-Security<br />

Classification of <strong>XML</strong> appliances<br />

Although the term XML appliance is the most general term to describe these devices, most vendors use alternative terminology that describes more specific functionality. The following are alternative names used for XML appliances:

• XML accelerators: devices that typically use custom hardware, or software built on standards-based hardware, to accelerate XPath processing. They typically increase the number of messages that can be processed per second by a factor of 10 to 100.

• Integration appliances (also known as application routers): devices designed to make the integration of computer systems easier.

• XML security gateways (also known as XML firewalls): devices that support the WS-Security standards. These appliances typically offload encryption and decryption to specialized hardware devices.

• XML Enabled Networking: an abstraction layer that exists alongside the traditional IP network. This layer addresses the security, incompatibility and latency issues encumbering XML messages, web services and service-oriented architectures (SOAs).

Notable <strong>XML</strong> appliance vendors<br />

• Bloombase<br />

• Citrix Systems (through acquisition of QuickTree [1])<br />

• DataPower (now owned by IBM), see IBM WebSphere DataPower SOA Appliances<br />

• F5 Networks<br />

• Radware<br />

• Solace Systems<br />

• Xtradyne<br />

• Cisco<br />

See also<br />

• <strong>XML</strong><br />

• XSLT<br />

• SOAP<br />

• <strong>XML</strong> Enabled Networking<br />

• WS-Security<br />

• Apache Axis<br />

• Integration appliance



References<br />

[1] http://community.citrix.com/display/ocb/2008/11/14/XML+Security+Features+in+Netscaler+9.0

<strong>XML</strong> Base<br />

<strong>XML</strong> Base is a World Wide Web Consortium recommended facility for defining base URIs for parts of <strong>XML</strong><br />

documents.<br />

The XML Base recommendation was adopted on 2001-06-27.

The attribute xml:base may be inserted in <strong>XML</strong> documents to specify a base URI other than the base URI of the<br />

document or external entity. The value of this attribute is interpreted as a URI Reference as defined in RFC 3986<br />

[IETF RFC 3986]. It serves the function described in section 5.1.1 of RFC3986, establishing the base URI (or IRI)<br />

for resolving any relative references found within the effective scope of the xml:base attribute.<br />
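The effect of xml:base on relative references can be approximated with `java.net.URI.resolve`, as in the sketch below. The document URI, the xml:base value and the class name are hypothetical, and the JDK's `URI` class follows an older URI specification than RFC 3986, so edge cases may differ from what a conforming XML Base processor does.

```java
import java.net.URI;

class XmlBaseDemo {
    /** Resolves a (possibly relative) reference against the base URI in scope. */
    static URI resolve(URI baseInScope, String reference) {
        return baseInScope.resolve(reference);
    }

    public static void main(String[] args) {
        // Base URI of the document itself (hypothetical).
        URI documentBase = URI.create("http://example.org/docs/index.xml");

        // An element carries xml:base="/hotpicks/"; the attribute value is itself
        // resolved against the base that was in scope before it.
        URI elementBase = resolve(documentBase, "/hotpicks/");

        // Relative references inside that element now resolve against elementBase.
        System.out.println(resolve(elementBase, "pick1.xml"));
    }
}
```

A relative reference outside the element's scope would still resolve against the document's own base URI.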

In namespace-aware XML processors, the "xml" prefix is bound to the namespace name http://www.w3.org/XML/1998/namespace as described in Namespaces in XML [XML Names]. Note that xml:base can still be used by non-namespace-aware processors.

External links<br />

• <strong>XML</strong> Base W3C Recommendation [1]<br />

References<br />

[1] http://www.w3.org/TR/xmlbase/



<strong>XML</strong> Catalog<br />

<strong>XML</strong> documents typically refer to external entities, for example the public and/or system ID for the Document Type<br />

Definition. These external relationships are expressed using URIs, typically as URLs.<br />

However, if they are absolute URLs, they only work when your network can reach them. Relying on remote<br />

resources makes <strong>XML</strong> processing susceptible to both planned and unplanned network downtime.<br />

Conversely, if they are relative URLs, they're only useful in the context where they were initially created. For<br />

example, the URL "../../xml/dtd/docbookx.xml" will usually only be useful in very limited circumstances.<br />

One way to avoid these problems is to use an entity resolver (a standard part of SAX) or a URI Resolver (a standard<br />

part of JAXP). A resolver can examine the URIs of the resources being requested and determine how best to satisfy<br />

those requests. The <strong>XML</strong> catalog is a document describing a mapping between external entity references and<br />

locally-cached equivalents.<br />

Example Catalog.xml<br />

The following simple catalog shows how one might provide locally-cached DTDs for an XHTML page validation<br />

tool, for example.<br />

<br />

<br />

<br />

<br />

<br />

<br />

This catalog makes it possible to resolve -//W3C//DTD XHTML 1.0 Strict//EN to the local URI<br />

dtd/xhtml1/xhtml1-strict.dtd. Similarly, it provides local URIs for two other public IDs.<br />

Note that the document above includes a DOCTYPE - this may cause the parser to attempt to access the system ID<br />

URL for the DOCTYPE (i.e. http://www.oasis-open.org/committees/entity/release/1.0/catalog.dtd) before the<br />

catalog resolver is fully functioning, which is probably undesirable. To prevent this, simply remove the DOCTYPE<br />

declaration.<br />

The following example shows this, and also shows the equivalent system declarations as an alternative to public declarations.



<br />

<br />

<br />

<br />

<br />

Using a Catalog - Java SAX Example<br />

Catalog resolvers are available for various programming languages. The following example shows how, in Java, a<br />

SAX parser may be created to parse some input source in which the<br />

org.apache.xml.resolver.tools.CatalogResolver is used to resolve external entities to<br />

locally-cached instances. This resolver originates from Apache Xerces but is now included with the Sun Java<br />

runtime.<br />

Simply create a SAXParser in the normal way, using factories. Obtain the <strong>XML</strong> reader and set the entity resolver<br />

to the standard one (CatalogResolver) or another of your own.<br />

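import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.apache.xml.resolver.tools.CatalogResolver;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

final SAXParser saxParser = SAXParserFactory.newInstance().newSAXParser();
final XMLReader reader = saxParser.getXMLReader();

final ContentHandler handler = ...;  // application-specific handler
final InputSource input = ...;       // document to parse

// Resolve external entities through the catalog before parsing.
reader.setEntityResolver( new CatalogResolver() );
reader.setContentHandler( handler );
reader.parse( input );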

It is important to call the parse method on the reader, not on the SAX parser.
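Where the text above mentions supplying a resolver "of your own", a very small map-backed `EntityResolver` could look like the following sketch. It is not a catalog implementation: the class name, the `addPublic` method and the in-memory map are assumptions made for illustration, and it handles only public IDs.

```java
import java.util.HashMap;
import java.util.Map;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

/** Hypothetical resolver that maps public IDs to local URIs held in memory. */
class MapEntityResolver implements EntityResolver {
    private final Map<String, String> publicIdToLocalUri = new HashMap<>();

    void addPublic(String publicId, String localUri) {
        publicIdToLocalUri.put(publicId, localUri);
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        String local = (publicId == null) ? null : publicIdToLocalUri.get(publicId);
        if (local == null) {
            // Returning null asks the parser to open the system ID itself.
            return null;
        }
        InputSource source = new InputSource(local);
        source.setPublicId(publicId);
        return source;
    }
}
```

It would be installed with `reader.setEntityResolver(resolver)` after registering, for instance, `-//W3C//DTD XHTML 1.0 Strict//EN` against `dtd/xhtml1/xhtml1-strict.dtd`.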



See also<br />

• <strong>XML</strong> Catalogs. OASIS Standard, Version 1.1. 07-October-2005. [1]<br />

• <strong>XML</strong> Entity and URI Resolvers [2] , Sun<br />

• <strong>XML</strong> Catalog Manager [3] project on Sourceforge<br />

• <strong>XML</strong> Catalogs for .NET and Mono [4]<br />

References<br />

[1] http://www.oasis-open.org/committees/download.php/14810/xml-catalogs.pdf<br />

[2] http://java.sun.com/webservices/docs/1.6/jaxb/catalog.html<br />

[3] http://xmlcatmgr.sourceforge.net/<br />

[4] http://xmlcatalog.net/<br />

<strong>XML</strong> Certification Program<br />

The XML Certification Program (XML Master) is an IT professional certification for XML and related technologies. There are two levels of certification, XML Master Basic and XML Master Professional, and more than 18,000 examinees have passed the examinations.

Certification paths<br />

<strong>XML</strong> Master Professional Application Developer Certification<br />

• <strong>XML</strong> Master Professional Application Developer is a certification for professionals who have demonstrated the<br />

ability to use technology in developing applications that deal with <strong>XML</strong> data.<br />

<strong>XML</strong> Master Professional Application Developer Certification Requirements<br />

• Pass the <strong>XML</strong> Master Basic exam and the <strong>XML</strong> Master Professional Application Developer certification exam.<br />

<strong>XML</strong> Master Professional Application Developer Certification Exam<br />

• Duration => 90 minutes<br />

• Number of Questions => 45 questions<br />

• Required Passing Score => 70%<br />

<strong>XML</strong> Master Professional Application Developer Certification Exam Topics<br />

• Section 1 - DOM / SAX<br />

• Section 2 - DOM / SAX Programming<br />

• Section 3 - XSLT<br />

• Section 4 - <strong>XML</strong> Schema<br />

• Section 5 - <strong>XML</strong> Processing System Design Technology<br />

• Section 6 - Utilizing <strong>XML</strong>



<strong>XML</strong> Master Professional Database Administrator Certification<br />

• The XML Master Professional Database Administrator is a certification for professionals who have demonstrated the ability to use XQuery and XML database (XMLDB) technology.

<strong>XML</strong> Master Professional Database Administrator Certification Requirements<br />

• Pass the <strong>XML</strong> Master Basic exam and the <strong>XML</strong> Master Professional Database Administrator certification exam.<br />

<strong>XML</strong> Master Professional Database Administrator Certification Exam<br />

• Duration => 90 minutes

• Number of Questions => 30 questions<br />

• Required Passing Score => 80%<br />

<strong>XML</strong> Master Professional Database Administrator Certification Exam Topics<br />

• Section 1 - Overview<br />

• Section 2 - XQuery, XPath<br />

• Section 3 - Manipulating <strong>XML</strong> Data<br />

• Section 4 - Creating <strong>XML</strong> Schema and Other <strong>XML</strong> Database Objects<br />

<strong>XML</strong> Master Basic Certification<br />

• <strong>XML</strong> Master Basic is a certification for professionals who have demonstrated the ability to use <strong>XML</strong> and related<br />

technologies.<br />

<strong>XML</strong> Master Basic Certification Requirements<br />

• Pass the <strong>XML</strong> Master Basic certification exam.<br />

<strong>XML</strong> Master Basic Certification Exam<br />

• Duration => 90 minutes

• Number of Questions => 50 questions<br />

• Minimum Passing Score => 70%<br />

<strong>XML</strong> Master Basic Certification Exam Topics<br />

• Section 1 - <strong>XML</strong> Overview<br />

• Section 2 - Creating <strong>XML</strong> Documents<br />

• Section 3 - DTD<br />

• Section 4 - <strong>XML</strong> Schema<br />

• Section 5 - XSLT, XPath<br />

• Section 6 - Namespace



For Certification Exam Takers<br />

Exam Fee<br />

Each certification exam costs US$125.

Exam Enrollment<br />

The XML Master exams are available daily at Prometric Authorized Testing Centers. To take an exam, schedule a day and time on the Prometric website [1].

External links<br />

<strong>XML</strong> Certification Program (<strong>XML</strong> Master) official website<br />

• Introduction to <strong>XML</strong> Certification Program: <strong>XML</strong> Master [2]<br />

• <strong>XML</strong> Master Certification Practice Exam [3]<br />

• <strong>XML</strong> Master Certification Success Stories [4]<br />

<strong>XML</strong> Master Basic Certification Exam Preparation Links<br />

Section 1 - <strong>XML</strong> Overview<br />

• a. Overview of <strong>XML</strong><br />

• <strong>XML</strong> features [5]<br />

• Purpose of <strong>XML</strong> [6]<br />

• b. Overview of related <strong>XML</strong> technologies<br />

• Names for and overview of XML-related technologies defined by the W3C or other standards organizations: XPath, XLink, XQuery, XPointer, DOM, SAX, SOAP, XHTML etc. [7]

• Names for and overview of applicable <strong>XML</strong> specifications defined according to industry or purpose by the<br />

W3C or other standards organizations [6]<br />

• Purpose of schema definition language defining <strong>XML</strong> structure [8]<br />

• Differences in defined content and functions of <strong>XML</strong> Schema and DTD [8]<br />

Section 2 - Creating <strong>XML</strong> Documents<br />

• a. Syntax<br />

• Naming rules, usable characters defined within an <strong>XML</strong> document [9]<br />

• Methods for coding <strong>XML</strong> documents utilizing tags [10]<br />

• Rules for coding declarations, elements, comments, character references, and processing commands<br />

comprising an <strong>XML</strong> document [10]<br />

• Methods for coding character data and markups (tags, references, comments, etc.) comprising an <strong>XML</strong><br />

document [11]<br />

• The role of an <strong>XML</strong> processor (<strong>XML</strong> parser) [12]<br />

• b. Elements, attributes, entities<br />

• Coding elements that include attributes [13]<br />

• Types of entities [14]<br />

• Handling entities and references using an <strong>XML</strong> processor [15]<br />

• Usage of character references [16]<br />

• Usage of Predefined entities [17]<br />

• Method for referencing entities [17]<br />

• c. Valid <strong>XML</strong> documents, well-formed <strong>XML</strong> documents



• Well-formed <strong>XML</strong> document coding methods [18]<br />

• Coding methods to ensure valid <strong>XML</strong> documents [8]<br />

• Differences between valid <strong>XML</strong> documents and well-formed <strong>XML</strong> documents [19]<br />

• Creating valid <strong>XML</strong> documents for defined DTDs [19]<br />

• Creating valid <strong>XML</strong> documents for defined <strong>XML</strong> Schema [19]<br />

• d. Special characters/ character codes, encoding/ normalizing <strong>XML</strong> documents<br />

• Character references [16]<br />

• <strong>XML</strong> declarations and text declarations [20]<br />

• Handling white spaces [21]<br />

• End-of-line handling in <strong>XML</strong> documents [22]<br />

• Normalizing attribute values [23]<br />

Section 3 - DTD<br />

• a. Basics<br />

• Document type declarations [24]<br />

• Methods for coding DTD internal subsets and external subsets [24]<br />

• Differences between DTD internal subsets and external subsets [24]<br />

• Internal entities and external entities, Parsed entities and unparsed entities [24]<br />

• b. Content model/element type declarations/attribute-list declarations/actual processing/entity declarations<br />

• Element type declarations [25]<br />

• Content model definitions for elements [25]<br />

• Attribute-list declarations [26]<br />

• Attribute types [26]<br />

• Attribute defaults [26]<br />

• Entity declarations [27]<br />

Section 4 - <strong>XML</strong> Schema<br />

• a. Basics<br />

• <strong>XML</strong> Schema document structure [28]<br />

• <strong>XML</strong> Schema Namespace [28]<br />

• Mapping between <strong>XML</strong> documents and <strong>XML</strong> schema documents [29]<br />

• b. Data types/ coding methods/ actual processing<br />

• <strong>XML</strong> Schema embedded data types [30]<br />

• Simple type and complex type [31]<br />

• Type extensions and restrictions [29]<br />

• Element definitions [30]<br />

• Attribute definitions [30]<br />

Section 5 - XSLT, XPath<br />

• a. Basics<br />

• Purpose of XSLT [32]<br />

• Application use of XSLT [33]<br />

• XSLT stylesheet structure [34]<br />

• XSLT Namespace [34]<br />

• b. Elements/ templates/ character encoding/ actual transformation processing<br />

• Coding methods and related functions for well-known XSLT elements [35]<br />

• Template rules and templates [36]



• Pattern coding and matching patterns and nodes [36]<br />

• Output processing using XSLT [37]<br />

• c. Coding XPath expressions within a stylesheet<br />

• Basic operators [36]<br />

• Basic functions [36]<br />

• Basic coding methods for location paths (designating tree structure nodes) [36]<br />

Section 6 - Namespace<br />

• a. <strong>XML</strong> namespaces<br />

• <strong>XML</strong> namespace defined content [38]<br />

• Application use of <strong>XML</strong> namespace [38]<br />

• <strong>XML</strong> namespace coding methods [38]<br />

• <strong>XML</strong> namespace scope (effective scope) [39]<br />

External links<br />

• <strong>XML</strong> Master Trainings [40] (German)<br />

• <strong>XML</strong> Master Basic Training [41] (German)<br />

References<br />

[1] http://securereg3.prometric.com/<br />

[2] http://www.xmlmaster.org/en<br />

[3] http://www.xmlmaster.org/en/practice_exam/<br />

[4] http://www.xmlmaster.org/en/success/index.html<br />

[5] http://www.w3schools.com/xml/xml_whatis.asp<br />

[6] http://www.w3schools.com/xml/xml_usedfor.asp<br />

[7] http://www.xmlmaster.org/en/article/d01/c01/<br />

[8] http://www.w3schools.com/schema/schema_why.asp<br />

[9] http://www.w3schools.com/xml/xml_elements.asp<br />

[10] http://www.w3schools.com/xml/xml_syntax.asp<br />

[11] http://www.w3schools.com/xml/xml_cdata.asp

[12] http://www.w3schools.com/dtd/dtd_validation.asp

[13] http://www.w3schools.com/xml/xml_attributes.asp

[14] http://www.w3.org/TR/2006/REC-xml-20060816/#sec-entity-decl<br />

[15] http://www.w3.org/TR/2006/REC-xml-20060816/#TextEntities<br />

[16] http://www.w3.org/TR/2004/REC-xml-20040204/#sec-entexpand<br />

[17] http://www.w3.org/TR/2006/REC-xml-20060816/#sec-references<br />

[18] http://www.xmlmaster.org/en/article/d01/c02/<br />

[19] http://www.w3schools.com/xml/xml_dtd.asp<br />

[20] http://www.w3schools.com/xml/xml_encoding.asp<br />

[21] http://www.w3.org/TR/2006/REC-xml-20060816/#sec-white-space<br />

[22] http://www.w3.org/TR/2006/REC-xml-20060816/#sec-line-ends<br />

[23] http://www.w3.org/TR/2006/REC-xml-20060816/<br />

[24] http://www.xmlmaster.org/en/article/d01/c03/<br />

[25] http://www.w3schools.com/dtd/dtd_elements.asp<br />

[26] http://www.w3schools.com/dtd/dtd_attributes.asp<br />

[27] http://www.w3schools.com/dtd/dtd_entities.asp<br />

[28] http://www.w3schools.com/schema/schema_schema.asp<br />

[29] http://www.xmlmaster.org/en/article/d01/c06/<br />

[30] http://www.xmlmaster.org/en/article/d01/c04/<br />

[31] http://www.xmlmaster.org/en/article/d01/c05/<br />

[32] http://www.w3schools.com/xsl/xsl_intro.asp<br />

[33] http://www.xmlmaster.org/en/article/d01/c07/<br />

[34] http://www.w3schools.com/xsl/xsl_transformation.asp



[35] http://www.w3schools.com/xsl/xsl_templates.asp<br />

[36] http://www.xmlmaster.org/en/article/d01/c08/<br />

[37] http://www.w3schools.com/xsl/el_output.asp<br />

[38] http://www.w3schools.com/xml/xml_namespaces.asp<br />

[39] http://www.xmlmaster.org/en/article/d01/c10/<br />

[40] http://www.digicomp.ch/xml<br />

[41] http://www.data2type.de/leistungen/schulungen/xmlmaster<br />

<strong>XML</strong> Configuration Access Protocol<br />

The XML Configuration Access Protocol (XCAP) is an application-layer protocol that allows a client to read, write, and modify application configuration data stored in XML format on a server.

Overview<br />

XCAP maps XML document sub-trees and element attributes to HTTP URIs, so that clients can access these components directly over HTTP. XCAP clients use an XCAP server to store data such as buddy lists and presence policy; combined with a SIP Presence server that supports the PUBLISH, SUBSCRIBE and NOTIFY methods, this provides a complete SIP SIMPLE server solution.

Features<br />

The following operations are supported via XCAP in a client-server interaction:

• Retrieve an item<br />

• Delete an item<br />

• Modify an item<br />

• Add an item<br />

The operations above can be executed on the following items:<br />

• Document<br />

• Element<br />

• Attribute<br />

The XCAP addressing mechanism is based on XPath, which provides the ability to navigate around the XML tree.
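To make the addressing idea concrete, the Java sketch below assembles XCAP-style URIs from their parts. The layout used here (XCAP root, auid, a users segment, the document name, then a node selector after a "~~" separator) reflects one reading of RFC 4825 and should be checked against that specification; the host name, user and helper names are hypothetical, and percent-encoding of the selector is left out.

```java
class XcapUris {
    /** Builds the URI of a user's document: root/auid/users/user/document. */
    static String documentUri(String xcapRoot, String auid, String user, String document) {
        String root = xcapRoot.endsWith("/")
                ? xcapRoot.substring(0, xcapRoot.length() - 1)
                : xcapRoot;
        return root + "/" + auid + "/users/" + user + "/" + document;
    }

    /** Appends a node selector, separated from the document part by "~~". */
    static String nodeUri(String documentUri, String nodeSelector) {
        return documentUri + "/~~/" + nodeSelector;
    }

    public static void main(String[] args) {
        String doc = documentUri("http://xcap.example.com", "resource-lists",
                "sip:joe@example.com", "index");
        System.out.println(doc);
        // Selects the list element inside the resource-lists root element.
        System.out.println(nodeUri(doc, "resource-lists/list"));
    }
}
```

A client would then apply an ordinary HTTP request to the resulting URI to retrieve, add, modify or delete the addressed document, element or attribute.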

Application Usages<br />

XCAP provides the following application usages, each identified by a specific auid (Application Unique ID):

• XCAP capabilities (auid = xcap-caps).<br />

• Resource lists (auid = resource-lists). A resource lists application is any application that needs access to a list of<br />

resources, identified by a URI, to which operations, such as subscriptions, can be applied.<br />

• Presence rules (auid = pres-rules, org.openmobilealliance.pres-rules). A Presence Rules application is an<br />

application which uses authorization policies, also known as authorization rules, to specify what presence<br />

information can be given to which watchers, and when.<br />

• RLS services (auid = rls-services). A Resource List Server (RLS) services application is a Session Initiation<br />

Protocol (SIP) application whereby a server receives SIP SUBSCRIBE requests for a resource, and generates<br />

subscriptions towards the resource list.<br />

• PIDF manipulation (auid = pidf-manipulation). The pidf-manipulation application usage defines how XCAP is used<br />

to manipulate the contents of PIDF-based presence documents.<br />



Standards<br />

The XCAP protocol is based on the following IETF standards:<br />

RFC4825 [1] , RFC4826 [2] , RFC4827 [3] , RFC5025 [4]<br />

External links<br />

• XCAP Tutorial [5]<br />

• OpenXCAP [6]<br />

References<br />

[1] RFC4825 (http://www.ietf.org/rfc/rfc4825.txt)<br />

[2] RFC4826 (http://www.ietf.org/rfc/rfc4826.txt)<br />

[3] RFC4827 (http://www.ietf.org/rfc/rfc4827.txt)<br />

[4] RFC5025 (http://www.ietf.org/rfc/rfc5025.txt)<br />

[5] http://www.jdrosen.net/papers/xcap-tutorial.ppt<br />

[6] http://openxcap.org/<br />

<strong>XML</strong> Control Protocol<br />

<strong>XML</strong> Control Protocol, or XCP, was launched as an April Fools' Day joke on April 1, 2004. It was pitched as a<br />

drop-in replacement for TCP with the slogan "Light the Fiber!". The web site put up for the occasion now seems to<br />

be owned by a link farm.<br />

External links<br />

• TCP is So Over by Tim Bray [1]<br />

• Former XCP home page [2]<br />

References<br />

[1] http://www.tbray.org/ongoing/When/200x/2004/04/01/XCP<br />

[2] http://www.x-cp.org/



<strong>XML</strong> data binding<br />

<strong>XML</strong> data binding refers to the process of representing the information in an <strong>XML</strong> document as an object in<br />

computer memory. This allows applications to access the data in the <strong>XML</strong> from the object rather than using the<br />

DOM or SAX to retrieve the data from a direct representation of the <strong>XML</strong> itself.<br />

An <strong>XML</strong> data binder accomplishes this by automatically creating a mapping between elements of the <strong>XML</strong> schema<br />

of the document we wish to bind and members of a class to be represented in memory.<br />

When this process is applied to convert an <strong>XML</strong> document to an object, it is called unmarshalling. The reverse<br />

process, to serialize an object as <strong>XML</strong>, is called marshalling.<br />
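As a rough illustration of the two directions, the sketch below binds a small document to a Python dataclass using only the standard library. The `Book` class, its fields, and the `unmarshal` and `marshal` helpers are hypothetical; a real data binder would normally derive the mapping from an XML schema rather than from the class definition.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, fields
from typing import get_type_hints


@dataclass
class Book:
    title: str
    author: str
    year: int


def unmarshal(xml_text, cls):
    """XML -> object: read one child element per field and convert its text."""
    root = ET.fromstring(xml_text)
    hints = get_type_hints(cls)  # field name -> type used as converter
    return cls(**{name: conv(root.findtext(name)) for name, conv in hints.items()})


def marshal(obj, tag):
    """Object -> XML: write one child element per field."""
    root = ET.Element(tag)
    for field in fields(obj):
        ET.SubElement(root, field.name).text = str(getattr(obj, field.name))
    return ET.tostring(root, encoding="unicode")


SOURCE = "<book><title>XML Basics</title><author>A. Writer</author><year>2010</year></book>"
book = unmarshal(SOURCE, Book)
print(book.year + 1)          # the year is an int, not a string
print(marshal(book, "book"))
```

The application works with `book.title` and `book.year` directly and never touches the DOM or SAX.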

Since <strong>XML</strong> is inherently sequential and objects are (usually) not, <strong>XML</strong> data binding mappings often have difficulty<br />

preserving all the information in an <strong>XML</strong> document. Specifically, information like comments, <strong>XML</strong> entity<br />

references, and sibling order may fail to be preserved in the object representation created by the binding application.<br />

This is not always the case; sufficiently complex data binders are capable of preserving 100% of the information in<br />

an <strong>XML</strong> document.<br />

Similarly, since objects in computer memory are not inherently sequential, and may include links to other objects<br />

(including self-referential links), <strong>XML</strong> data binding mappings often have difficulty preserving all the information<br />

about an object when it is marshalled to <strong>XML</strong>.<br />

An alternative approach to automatic data binding relies instead on hand-crafted XPath expressions that extract the<br />

data from <strong>XML</strong>. This approach has a number of benefits. First, the data binding code only needs proximate<br />

knowledge (e.g., topology, tag names, etc.) of the <strong>XML</strong> tree structure, which developers can determine by looking at<br />

the <strong>XML</strong> data; <strong>XML</strong> schemas are no longer mandatory. Furthermore, XPath allows the application to bind the<br />

relevant data items and filter out everything else, avoiding the unnecessary processing that would be required to<br />

completely unmarshall the entire <strong>XML</strong> document. The drawback of this approach is the lack of automation in<br />

implementing the object model and XPath expressions. Instead the application developers have to create these<br />

artifacts manually.<br />
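A minimal sketch of this hand-crafted approach is shown below. It uses the limited XPath subset supported by Python's `xml.etree.ElementTree`; the order document and the `bind_order` function are made up for illustration. Only the items named in the expressions are extracted, and the rest of the document (including the comment) is ignored.

```python
import xml.etree.ElementTree as ET

ORDER_XML = """
<order id="42">
  <!-- internal note that the binding ignores -->
  <customer><name>Ada</name><email>ada@example.com</email></customer>
  <items>
    <item sku="A1" qty="2"/>
    <item sku="B7" qty="1"/>
  </items>
</order>
"""


def bind_order(xml_text):
    """Extract only the data the application needs, one path per value."""
    root = ET.fromstring(xml_text)
    items = root.findall("./items/item")
    return {
        "customer": root.findtext("./customer/name"),
        "skus": [item.get("sku") for item in items],
        "total_qty": sum(int(item.get("qty")) for item in items),
    }


summary = bind_order(ORDER_XML)
print(summary)
```

No schema is involved: the paths were written by looking at the sample data, which is exactly the manual effort the paragraph above describes as the drawback.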

Data binding in general<br />

One of <strong>XML</strong> data binding's strengths is the ability to serialize and deserialize objects across programs, languages, and platforms.<br />

You can dump a time series of structured objects from a datalogger written in C on an embedded processor, bring it<br />

across the network to process in Perl, and finally visualize it in Mathematica. The structure and the data remain<br />

consistent and coherent throughout the journey, and no custom formats or parsers are required. This is not unique to<br />

<strong>XML</strong>. YAML, for example, is emerging as a powerful data binding alternative to <strong>XML</strong>. JSON (which can be<br />

regarded as a subset of YAML) is often suitable for lightweight or restricted applications.<br />



External links<br />

• <strong>XML</strong> Data Binding Resources [1] , by Ronald Bourret<br />

• <strong>XML</strong> Schema Patterns for Databinding Working Group [2]<br />

See also<br />

• Bound control<br />

• Data structure<br />

• JSON<br />

• Serialization<br />

• YAML<br />

References<br />

[1] http://www.rpbourret.com/xml/XMLDataBinding.htm<br />

[2] http://www.w3.org/2002/ws/databinding<br />

<strong>XML</strong> database<br />

An <strong>XML</strong> database is a data persistence software system that allows data to be stored in <strong>XML</strong> format. This data can<br />

then be queried, exported and serialized into the desired format.<br />

Two major classes of <strong>XML</strong> database exist:<br />

1. <strong>XML</strong>-enabled: these map all <strong>XML</strong> to a traditional database (such as a relational database [1] ), accepting <strong>XML</strong> as<br />

input and rendering <strong>XML</strong> as output. This term implies that the database does the conversion itself (as opposed to<br />

relying on middleware).<br />

2. Native <strong>XML</strong> (NXD): the internal model of such databases depends on <strong>XML</strong> and uses <strong>XML</strong> documents as the<br />

fundamental unit of storage, which are, however, not necessarily stored in the form of text files.<br />
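The first class can be illustrated with a small sketch that accepts XML as input, stores it in relational rows, and renders XML as output. It uses Python's built-in `sqlite3` module as a stand-in for a relational database; the `person` table and the `store` and `render` helpers are assumptions made for this example, not the mechanism of any particular product.

```python
import sqlite3
import xml.etree.ElementTree as ET


def store(conn, xml_text):
    """Shred each <person> element into one relational row."""
    for person in ET.fromstring(xml_text).iter("person"):
        conn.execute(
            "INSERT INTO person (id, name) VALUES (?, ?)",
            (int(person.get("id")), person.findtext("name")),
        )


def render(conn):
    """Rebuild an XML document from the rows."""
    root = ET.Element("people")
    for pid, name in conn.execute("SELECT id, name FROM person ORDER BY id"):
        person = ET.SubElement(root, "person", id=str(pid))
        ET.SubElement(person, "name").text = name
    return ET.tostring(root, encoding="unicode")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT)")
store(conn, '<people><person id="2"><name>Bo</name></person>'
            '<person id="1"><name>Al</name></person></people>')
xml_out = render(conn)
print(xml_out)
```

Note that the output lists the people in key order rather than in the original document order: a mapping like this one keeps the data but not the document, which is the distinction the native class is built around.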

Rationale for <strong>XML</strong> in databases<br />

O'Connell (2005, 9.2) gives one reason for the use of <strong>XML</strong> in databases: the increasingly common use of <strong>XML</strong> for<br />

data transport, which has meant that "data is extracted from databases and put into <strong>XML</strong> documents and vice-versa".<br />

It may prove more efficient (in terms of conversion costs) and easier to store the data in <strong>XML</strong> format.<br />

Native <strong>XML</strong> databases<br />

The term "native <strong>XML</strong> database" (NXD) can lead to confusion. Many NXDs do not function as standalone databases<br />

at all, and do not really store the native (text) form.<br />

The formal definition from the <strong>XML</strong>:DB initiative (which appears to be inactive since 2003 [2] ) states that a native<br />

<strong>XML</strong> database:<br />

• Defines a (logical) model for an <strong>XML</strong> document — as opposed to the data in that document — and stores and<br />

retrieves documents according to that model. At a minimum, the model must include elements, attributes,<br />

PCDATA, and document order. Examples of such models include the XPath data model, the <strong>XML</strong> Infoset, and<br />

the models implied by the DOM and the events in SAX 1.0.<br />

• Has an <strong>XML</strong> document as its fundamental unit of (logical) storage, just as a relational database has a row in a<br />

table as its fundamental unit of (logical) storage.



• Need not have any particular underlying physical storage model. For example, NXDs can use relational,<br />

hierarchical, or object-oriented database structures, or use a proprietary storage format (such as indexed,<br />

compressed files).<br />

Additionally, many <strong>XML</strong> databases provide a logical model of grouping documents, called "collections". Databases<br />

can set up and manage many collections at one time. In some implementations, a hierarchy of collections can exist,<br />

much in the same way that an operating system's directory-structure works.<br />

All <strong>XML</strong> databases now support at least one form of querying syntax. Minimally, just about all of them support<br />

XPath for performing queries against documents or collections of documents. XPath provides a simple pathing<br />

system that allows users to identify nodes that match a particular set of criteria.<br />
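The idea of running one path expression against a collection of documents can be sketched as follows. The in-memory dictionary stands in for a collection, and the `query` helper is hypothetical; it relies on the XPath subset implemented by Python's `xml.etree.ElementTree`, which is far smaller than what an XML database offers.

```python
import xml.etree.ElementTree as ET

# A "collection": document name -> document text (made-up sample data).
collection = {
    "shelf1.xml": "<library><book year='2005'><title>XML in Practice</title></book>"
                  "<book year='2009'><title>Native Stores</title></book></library>",
    "shelf2.xml": "<library><book year='2009'><title>Indexing XML</title></book></library>",
}


def query(docs, path):
    """Evaluate the same path against every document in the collection."""
    results = []
    for name in sorted(docs):
        root = ET.fromstring(docs[name])
        results.extend((name, node.text) for node in root.findall(path))
    return results


hits = query(collection, ".//book[@year='2009']/title")
print(hits)
```

The predicate `[@year='2009']` is the "set of criteria" and the result is the list of matching nodes, reported here together with the document each one came from.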

In addition to XPath, many <strong>XML</strong> databases support XSLT as a method of transforming documents or query-results<br />

retrieved from the database. XSLT provides a declarative language written using an <strong>XML</strong> grammar. It aims to define<br />

a set of XPath filters that can transform documents (in part or in whole) into other formats including plain text,<br />

<strong>XML</strong>, or HTML.<br />

Many <strong>XML</strong> databases also support XQuery to perform querying. XQuery includes XPath as a node-selection<br />

method, but extends XPath to provide transformational capabilities. Users sometimes refer to its syntax as<br />

"FLWOR" (pronounced 'Flower') because the flow may include the following statements: 'For', 'Let', 'Where', 'Order'<br />

and 'Return'. Traditional RDBMS vendors, whose engines previously supported only SQL, now ship hybrid<br />

SQL and XQuery engines. A hybrid SQL/XQuery engine can query <strong>XML</strong> data alongside relational data in<br />

the same query expression, which helps in combining relational and <strong>XML</strong> data.<br />
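To make the FLWOR clauses concrete, the sketch below shows a small XQuery expression in a comment and a Python function that performs the same steps by hand. The Python code is only an analogy for readers unfamiliar with XQuery: it is not an XQuery engine, and the sample data and the `expensive_titles` function are made up.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

BOOKS = ET.fromstring("""
<books>
  <book><title>Cheap Read</title><price>12.50</price></book>
  <book><title>Big Reference</title><price>80.00</price></book>
  <book><title>Mid Guide</title><price>35.00</price></book>
</books>
""")

# XQuery, shown for comparison only (not executed here):
#   for $b in doc("books.xml")//book
#   let $p := xs:decimal($b/price)
#   where $p > 30
#   order by $p
#   return $b/title/text()


def expensive_titles(books, threshold):
    rows = []
    for b in books.iter("book"):                   # for
        p = Decimal(b.findtext("price"))           # let
        if p > threshold:                          # where
            rows.append((p, b.findtext("title")))
    rows.sort(key=lambda row: row[0])              # order by
    return [title for _, title in rows]            # return


titles = expensive_titles(BOOKS, Decimal("30"))
print(titles)
```

The XQuery version states the same query declaratively in five lines, with XPath (`//book`, `$b/price`) doing the node selection.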

Some <strong>XML</strong> databases support an API called the <strong>XML</strong>:DB API (or XAPI) as a form of implementation-independent<br />

access to the <strong>XML</strong> datastore. In <strong>XML</strong> databases, XAPI resembles ODBC and JDBC as used with relational<br />

databases. On 24 June 2009, the Java Community Process released the final version of the XQuery API for<br />

Java specification (XQJ) [3] - "a common API that allows an application to submit queries conforming to the W3C<br />

XQuery 1.0 specification and to process the results of such queries".<br />

Databases known to support XQuery, XQJ, <strong>XML</strong>:DB, or a RESTful API<br />

| <strong>XML</strong> Database | License | <strong>Language</strong> | XQJ API | <strong>XML</strong>:DB API | RESTful API | Transaction Support |<br />
|---|---|---|---|---|---|---|<br />
| Apache XIndice (no longer maintained [4]) | Open source | Java | No | Yes | No | No |<br />
| BaseX | Open source | Java | Yes | Yes | Yes | Yes |<br />
| Gemfire Enterprise | Commercial | Unknown | No | Yes | No | Yes |<br />
| DOMSafe<strong>XML</strong> | Commercial | Unknown | No | Yes | No | Yes |<br />
| eXist | Open source | Java | No | Yes | Yes | No |<br />
| MarkLogic Server | Commercial | C++ | No | No | Yes | Yes |<br />
| MonetDB/XQuery | Open source | C++ | No | Yes | No | No |<br />
| my<strong>XML</strong>DB | Open source | Java | No | Yes | No | Unknown |<br />
| OZONE | Open source | Java | No | Yes | No | Yes |<br />
| Sedna | Open source | C++ | Yes | Yes | No | Yes |<br />
| Tamino | Commercial | Unknown | No | Partial | No | Unknown |<br />
| TeXtML | Commercial | Unknown | Unknown | Unknown | No | Yes |<br />
| Xpriori XMS | Commercial | C++ | No | No | No | Yes |<br />



Implementations<br />

• Apache Xindice [5] (previous name: dbXML)<br />

• BaseX [6] native, open-source <strong>XML</strong> Database developed at the University of Konstanz. Supports XQuery and Full<br />

Text [7] and Update [8] extensions.<br />

• BSn/NONMONOTONIC Lab: IB Search Engine [9] , embeddable <strong>XML</strong>++ search engine using a generic/abstract<br />

model and a mix of polymorphic objects types. Spin-off from the Isearch project.<br />

• Clusterpoint Storage Engine [10] , an <strong>XML</strong> storage engine geared towards high-volume applications and<br />

millisecond query times.<br />

• DB2 9 Express-C [11] , no-charge hybrid relational/<strong>XML</strong> data server with Pure<strong>XML</strong><br />

• EMC Documentum xDB [12] , a commercial native <strong>XML</strong> database including XQuery implementation, embeddable<br />

• eXist-db [13] , open-source native <strong>XML</strong> database, written in Java<br />

• Gemstone System's GemFire Enterprise [14] commercial <strong>XML</strong> database<br />

• MarkLogic Server [15] , a native <strong>XML</strong> database which uses XQuery.<br />

• M/DB:X [16] , a lightweight, REST-interfaced native <strong>XML</strong> database designed for use as a Cloud database.<br />

• MonetDB/XQuery [17] - XQuery processor on top of the MonetDB relational database system. Also supports<br />

W3C XQUF [8] updates. Open source.<br />

• Oracle <strong>XML</strong> DB [18] , <strong>XML</strong>-enabled (known as Oracle XDB as of Oracle 10g). Despite its name, it does not support<br />

the <strong>XML</strong>:DB API.<br />

• Oracle Berkeley DB <strong>XML</strong> [19] , <strong>XML</strong> Enabled, embedded database; built on top of the Berkeley DB (a key-value<br />

database).<br />

• Sedna <strong>XML</strong> Database [20] , Open source <strong>XML</strong> database developed by MODIS [21] team at Institute for System<br />

Programming [22] . Supports XQuery, Updates, XQJ API, Transactions and Triggers<br />

• SQL Server 2005 [23] , free Express Edition with full <strong>XML</strong> features<br />

• Tamino <strong>XML</strong> Server [24] , native <strong>XML</strong> database. Supports XQuery, XQuery Update, transactions and server<br />

extensions.<br />

• TEXTML Server [25] , a native <strong>XML</strong> database combined with a full-text search engine.<br />

• TigerLogic XDMS [26] native <strong>XML</strong> Database<br />

• Timber [27] , a native <strong>XML</strong> database system developed at the University of Michigan<br />

• Qizx 3.0 [28] a native XQuery database engine written in Java (free & open source edition available)<br />

• XStreamDB [29] , native <strong>XML</strong> Database<br />

• Xpriori XMS [30] , a self-constructing native <strong>XML</strong> database.<br />

External references<br />

• <strong>XML</strong> Databases - The Business Case, Charles Foster, June 2008 [31] - Talks about the current state of Databases<br />

and data persistence, how the current Relational Database model is starting to crack at the seams and gives an<br />

insight into a strong alternative for today's requirements.<br />

• An <strong>XML</strong>-based Database of Molecular Pathways (2005-06-02) [32] Speed / Performance comparisons of eXist,<br />

X-Hive, Sedna and Qizx/open<br />

• <strong>XML</strong> Native Database Systems: Review of Sedna, Ozone, NeoCoreXMS [33] 2006<br />

• <strong>XML</strong> Data Stores: Emerging Practices [34]<br />

• Bhargava, P.; Rajamani, H.; Thaker, S.; Agarwal, A. (2005) <strong>XML</strong> Enabled Relational Databases, Texas, The<br />

University of Texas at Austin.<br />

• O'Connell, S. Advanced Databases Course Notes, Southampton, University of Southampton, 2005<br />

• Initiative for <strong>XML</strong> Databases [35]<br />

• <strong>XML</strong> and Databases, Ronald Bourret, September 2005 [36]<br />

• <strong>XML</strong> Database Products, Ronald Bourret, 2000-2009 [37]



• The State of Native <strong>XML</strong> Databases, Elliotte Rusty Harold, August 13, 2007 [38]<br />

• <strong>XML</strong> for DB2 Information Integration [39] , an IBM Redbook that has a chapter on <strong>XML</strong> and databases (1st<br />

chapter).<br />

References<br />

[1] Mustafa Atay and Shiyong Lu, “Storing and Querying <strong>XML</strong>: An Efficient Approach Using Relational Databases”, ISBN 3639115813, VDM<br />

Verlag, 2009.<br />

[2] http://xmldb-org.sourceforge.net/faqs.html<br />

[3] http://jcp.org/en/jsr/detail?id=225<br />

[4] http://www.oreillynet.com/onjava/blog/2006/03/dont_be_misled_xindice_is_dead.html<br />

[5] http://xml.apache.org/xindice/<br />

[6] http://basex.org/<br />

[7] http://www.w3.org/TR/xpath-full-text-10/<br />

[8] http://www.w3.org/TR/xqupdate/<br />

[9] http://www.ibu.de/node/52<br />

[10] http://www.clusterpoint.com/<br />

[11] http://ibm.com/db2/viper/<br />

[12] http://www.emc.com/products/detail/software/documentum-xdb.htm<br />

[13] http://exist.sourceforge.net/<br />

[14] http://www.gemstone.com/products/gemfire/enterprise.php<br />

[15] http://www.marklogic.com/<br />

[16] http://www.mgateway.com/mdbx.html<br />

[17] http://monetdb.cwi.nl/XQuery/<br />

[18] http://www.oracle.com/technology/tech/xml/xmldb/index.html<br />

[19] http://www.oracle.com/database/berkeley-db/xml/index.html<br />

[20] http://modis.ispras.ru/sedna<br />

[21] http://modis.ispras.ru<br />

[22] http://ispras.ru<br />

[23] http://www.microsoft.com/sql/default.mspx<br />

[24] http://www.softwareag.com/corporate/products/wm/tamino/<br />

[25] http://www.ixiasoft.com/textmlserver<br />

[26] http://www.rainingdata.com/products/tl/index.html<br />

[27] http://www.eecs.umich.edu/db/timber/<br />

[28] http://www.xmlmind.com/qizx/<br />

[29] http://bluestream.com/products/xstreamdb32<br />

[30] http://www.xpriori.com<br />

[31] http://www.cfoster.net/articles/xmldb-business-case<br />

[32] http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-3717<br />

[33] http://swing.felk.cvut.cz/index.php?option=com_docman&task=doc_view&gid=5&Itemid=62<br />

[34] http://csdl2.computer.org/persagen/DLAbsToc.jsp?resourcePath=/dl/mags/ic/&toc=comp/mags/ic/2005/02/w2toc.xml&DOI=10.1109/MIC.2005.48<br />

[35] http://xmldb-org.sourceforge.net<br />

[36] http://www.rpbourret.com/xml/XMLAndDatabases.htm<br />

[37] http://www.rpbourret.com/xml/XMLDatabaseProds.htm<br />

[38] http://cafe.elharo.com/xml/the-state-of-native-xml-databases/<br />

[39] http://publib-b.boulder.ibm.com/Redbooks.nsf/RedbookAbstracts/sg246994.html



<strong>XML</strong> editor<br />

An <strong>XML</strong> editor is a markup language editor with added functionality to facilitate the editing of <strong>XML</strong>. <strong>XML</strong> can be<br />

edited in a plain text editor, with all the code visible, but <strong>XML</strong> editors add facilities such as tag completion,<br />

and menus and buttons for tasks that are common in <strong>XML</strong> editing, based on data supplied with a document type<br />

definition (DTD) or the <strong>XML</strong> tree.<br />

There are also graphical <strong>XML</strong> editors that hide the code in the background and present the content to the user in a<br />

more user-friendly format, approximating the rendered version or editing forms. This is helpful for situations where<br />

people who are not fluent in <strong>XML</strong> code need to enter information in <strong>XML</strong> based documents such as time sheets and<br />

expenditure reports. And even if the user is familiar with <strong>XML</strong>, use of such editors, which take care of syntax<br />

details, is often faster and more convenient.<br />

Functionality beyond syntax highlighting<br />

An <strong>XML</strong> editor goes beyond the syntax highlighting offered by many plain text editors and generic source code<br />

editors by validating the <strong>XML</strong> source against an <strong>XML</strong> Schema or DTD; some do so in real time as the document is<br />

edited. Other features of an editor designed specifically for editing <strong>XML</strong> might include element name<br />

completion and automatic insertion of a closing tag whenever an opening tag is entered. These features<br />

help to prevent typing errors in the <strong>XML</strong> code. Some <strong>XML</strong> editors provide the ability to run<br />

an XSLT transform, or series of transforms, over a document. Some of the larger <strong>XML</strong> packages even offer XSLT<br />

debugging features and XSL-FO processors for generation of PDF files from documents.<br />
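The simplest form of such checking, reporting where a document stops being well-formed, can be sketched with Python's standard library. This is only an assumption about how an editor might surface errors; the `check_well_formed` helper is hypothetical, and the standard library parser checks well-formedness only, so validation against a DTD or schema would need an additional tool.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_text):
    """Return None if the text parses, else (line, column, message)."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        line, column = err.position
        return line, column, str(err)
    return None


print(check_well_formed("<a><b></b></a>"))

# The <b> element is never closed, so the parser objects at the </a> tag.
problem = check_well_formed("<a>\n  <b>\n</a>")
print(problem)
```

An editor that runs a check like this on every change can underline the offending line while the author is still typing.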

Textual editors<br />

Text <strong>XML</strong> editors generally provide features for working with element tags. Syntax highlighting is a basic<br />

feature of any <strong>XML</strong> editor: markup is colored differently from regular text. Element and attribute<br />

completion based on a DTD or schema is also available in many text <strong>XML</strong> editors. Displaying line numbers is<br />

also a common and useful feature, as is the ability to reformat a document to conform to a particular style<br />

of indentation.<br />
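Reformatting a document to a given indentation style can be sketched in a few lines. The example assumes Python 3.9 or later, where `xml.etree.ElementTree.indent` is available; the `reformat` wrapper and the sample markup are made up for illustration.

```python
import xml.etree.ElementTree as ET


def reformat(xml_text, space="  "):
    """Re-indent a document, using `space` for each nesting level."""
    root = ET.fromstring(xml_text)
    ET.indent(root, space=space)  # rewrites whitespace-only text between elements
    return ET.tostring(root, encoding="unicode")


pretty = reformat("<table><tr><td>a</td><td>b</td></tr></table>")
print(pretty)
```

Only whitespace between elements is changed; the text inside the `td` cells is left as it is.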

[Figure: an <strong>XML</strong> document being edited in a text editor with syntax coloring]<br />

The advantage of text editors is that they present exactly the information that is stored in the <strong>XML</strong> file. It is the best<br />

way to control the formatting of the file (such as indentations), to do low-level operations (such as a find/replace on<br />

element names) and to edit <strong>XML</strong> files without any schema or configuration file.<br />

Graphical editors<br />

Graphical editors based on GUIs may be easier for some people to use than text editors, and may not require<br />

knowledge of <strong>XML</strong> syntax. These are often called WYSIWYG ("What You See Is What You Get") editors, but not<br />

all of them are WYSIWYG: graphical <strong>XML</strong> editors can be WYSIWYG when they try to display the final rendering<br />

or WYSIWYM ("What You See Is What You Mean") when they try to display the actual meaning of <strong>XML</strong> elements.<br />

When they are not WYSIWYG, they do not display the graphical end result of a document (or one of its possible end results), but<br />

instead focus on conveying the meaning of the text. They use DTDs or <strong>XML</strong> schemas and/or configuration files to



map <strong>XML</strong> elements to graphical components.<br />

These kinds of editors are generally more useful for <strong>XML</strong> languages for data rather than for storing documents.<br />

Documents tend to be fairly freeform in structure, which fits poorly with the generally rigid nature of many graphical<br />

editors.<br />

For example, a graphical editor can use a configuration file to know that the TABLE element represents a table, the<br />

TR element represents a row of the table, and the TD element represents a cell of the table. It uses this<br />

structuring information to display the content as a table, in order to make editing easier.<br />

Schema and configuration file information can also be used to ensure that users do not create invalid documents.<br />

For instance, in a text editor, it is possible to create a row with too many cells in the table, while this would not be<br />

possible with such a graphical user interface.<br />

WYSIWYG editors<br />

WYSIWYG editors let people edit files directly with the tags represented by some form of graphical viewing rather<br />

than bare <strong>XML</strong> code. Often, WYSIWYG editors attempt to emulate the end result of some transform or CSS<br />

stylesheet application. This emulation may or may not be possible, depending on the transformation from <strong>XML</strong> into<br />

the end result.<br />

Naive use of a WYSIWYG editor can lead to the creation of documents that do not have the intrinsic semantics of<br />

the particular <strong>XML</strong> language. This comes about if the user is focused on trying to achieve a certain visual<br />

presentation with the editor, rather than using the WYSIWYG to make editing the document easier. For instance,<br />

someone creating a web page could use an H2 element (meaning: second level title) instead of H1 (meaning: first<br />

level title) because it looks smaller on their current WYSIWYG editor. Such an author is making a choice based on<br />

the apparent visual representation, but a visitor to the author's web page may see a very different rendering in their<br />

browser.<br />

However, as long as the underlying meaning of the document is understood by the author, and the author does not<br />

make decisions based on the exact look in the WYSIWYG editor, such an editor can be of value to the writer. It is<br />

generally much easier to read a document that is being rendered in some fashion than it is to read the raw <strong>XML</strong> code.<br />

Also, editing can be much more intuitive, as the WYSIWYG editor can use tools similar to many word processing<br />

applications. Some WYSIWYG editors even allow the user to use a DTD or Schema and define their own user<br />

interface for editing.<br />

Usually WYSIWYG editors support CSS but not XSLT, because XSLT transformations can be very complex, and<br />

guessing what the user meant when changing the end result can be impossible. The WYSIWYG editors that do<br />

support XSLT, such as Syntext Serna, will therefore apply changes directly to the original <strong>XML</strong>, while updating the<br />

view by running the XSLT for every change.



For example, a stylesheet can be used to color table cells in a particular way: even rows can be given a different<br />

background color from odd rows, in order to make reading easier.<br />

Application domains<br />

• Computer programming<br />

• Technical editing<br />

See also<br />

• List of <strong>XML</strong> editors<br />

• Authoring system<br />

• Editing<br />

• Source code editor<br />

Edited formats<br />

• <strong>XML</strong><br />

• Darwin Information Typing Architecture (DITA)<br />

• DocBook<br />

External links<br />

• <strong>XML</strong> Editors [1] at the Open Directory Project<br />

• List of editors from xml.com [2]<br />

References<br />

[1] http://www.dmoz.org/Computers/Data_Formats/Markup_Languages/XML/Tools/Editors//<br />

[2] http://www.xml.com/pub/pt/3



<strong>XML</strong> Enabled Directory<br />

<strong>XML</strong> Enabled Directory (XED) is a framework for managing objects represented using the Extensible <strong>Markup</strong><br />

<strong>Language</strong> (<strong>XML</strong>). XED builds on X.500 and LDAP directory services technologies.<br />

XED was originally designed in 2003 by Steven Legg of eB2Bcom (formerly of Adacel Technologies) and Daniel<br />

Prager (formerly of Deakin University).<br />

The <strong>XML</strong> Enabled Directory (XED) framework leverages existing Lightweight Directory Access Protocol (LDAP)<br />

and X.500 directory technology to create a directory service that stores, manages and transmits Extensible <strong>Markup</strong><br />

<strong>Language</strong> (<strong>XML</strong>) format data, while maintaining interoperability with LDAP clients, X.500 Directory User Agents<br />

(DUAs), and X.500 Directory System Agents (DSAs).<br />

The main features of XED are:<br />

• semantically equivalent <strong>XML</strong> renditions of existing directory protocols,<br />

• <strong>XML</strong> renditions of directory data,<br />

• the ability to accept, at run time, user-defined attribute syntaxes specified in a variety of <strong>XML</strong> schema languages,<br />

• the ability to perform filter matching on the parts of <strong>XML</strong> format attribute values,<br />

• the flexibility for implementors to develop XED clients using only their favoured <strong>XML</strong> schema language.<br />

The <strong>XML</strong> Enabled Directory allows directory entries to contain <strong>XML</strong> formatted data as attribute values.<br />

Furthermore, the attribute syntax can be specified in any one of a variety of <strong>XML</strong> schema languages that the<br />

directory understands.<br />

The directory server is then able to perform data validation and semantically meaningful matching of <strong>XML</strong><br />

documents, or their parts, on behalf of client applications, making the implementation of <strong>XML</strong>-based applications<br />

easier and faster.<br />

<strong>XML</strong> applications can also exploit the directory's traditional capabilities of cross-application data sharing, data<br />

distribution, data replication, user authentication and user access control, further lowering the cost of building new<br />

<strong>XML</strong> applications.<br />

XED Implementations<br />

eB2Bcom's <strong>View</strong>500 Identity Server provides organisations with a fast, scalable and flexible directory system. It<br />

has been developed in strict adherence to open standards and supports the X.500, LDAP, XED and<br />

ACP133 standards. Being standards compliant, <strong>View</strong>500 can interface with a variety of applications.<br />

External links<br />

• <strong>XML</strong> Enabled Directory [1]<br />

• A work-in-progress XED specification [2]<br />

References<br />

[1] http://www.xmled.info/<br />

[2] http://www.xmled.info/specs.htm



<strong>XML</strong> Encryption<br />

<strong>XML</strong> Encryption, also known as <strong>XML</strong>-Enc, is a specification, governed by a W3C recommendation, that defines<br />

how to encrypt the contents of an <strong>XML</strong> element.<br />

Although <strong>XML</strong> Encryption can be used to encrypt any kind of data, it is nonetheless known as "<strong>XML</strong> Encryption"<br />

because an <strong>XML</strong> element (either an EncryptedData or EncryptedKey element) contains or refers to the cipher text,<br />

keying information, and algorithms.<br />

Both <strong>XML</strong> Signature and <strong>XML</strong> Encryption use the KeyInfo element, which appears as the child of a Signature,<br />

EncryptedData, or EncryptedKey element and provides information to a recipient about what keying material to use<br />

in validating a signature or decrypting encrypted data.<br />

The KeyInfo element is optional: the keying information can be attached to the message, or be delivered through a secure channel.<br />
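The shape of an EncryptedData element can be sketched as follows. The code only builds the XML wrapper, using the namespaces and algorithm identifier from the W3C recommendation; it performs no encryption, and the cipher bytes, key name, and `encrypted_data` helper are placeholders invented for this example.

```python
import base64
import xml.etree.ElementTree as ET

XENC = "http://www.w3.org/2001/04/xmlenc#"   # XML Encryption namespace
DS = "http://www.w3.org/2000/09/xmldsig#"    # XML Signature namespace (KeyInfo)
ET.register_namespace("xenc", XENC)
ET.register_namespace("ds", DS)


def encrypted_data(cipher_bytes, key_name=None):
    """Wrap already-encrypted bytes; KeyInfo is included only if a name is given."""
    ed = ET.Element(f"{{{XENC}}}EncryptedData", Type=XENC + "Element")
    ET.SubElement(ed, f"{{{XENC}}}EncryptionMethod", Algorithm=XENC + "aes128-cbc")
    if key_name is not None:
        key_info = ET.SubElement(ed, f"{{{DS}}}KeyInfo")
        ET.SubElement(key_info, f"{{{DS}}}KeyName").text = key_name
    cipher_data = ET.SubElement(ed, f"{{{XENC}}}CipherData")
    cipher_value = ET.SubElement(cipher_data, f"{{{XENC}}}CipherValue")
    cipher_value.text = base64.b64encode(cipher_bytes).decode("ascii")
    return ed


# Placeholder bytes: a real application would pass the output of a cipher here.
element = encrypted_data(b"not really encrypted", key_name="example-key")
print(ET.tostring(element, encoding="unicode"))
```

Calling `encrypted_data(data)` without a key name produces the same structure minus the KeyInfo child, which corresponds to the case where the key is agreed through another channel.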

External links<br />

• W3C info [1]<br />

References<br />

[1] http://www.w3.org/TR/xmlenc-core/<br />

<strong>XML</strong> Events<br />

In computer science and web development, <strong>XML</strong> Events is a W3C standard [1] for handling events that occur in an<br />

<strong>XML</strong> document. These events are typically caused by users interacting with the web page using a device such as a<br />

web browser on a personal computer or mobile phone.<br />

Formal Definition<br />

An <strong>XML</strong> Event is the representation of some asynchronous occurrence (such as a mouse button click) that gets<br />

associated with a data element in an <strong>XML</strong> document. <strong>XML</strong> Events provides a static, syntactic binding to the DOM<br />

Events interface, allowing the event to be handled.<br />

Motivation<br />

The <strong>XML</strong> Events standard is defined to provide <strong>XML</strong>-based languages with the ability to uniformly integrate event<br />

listeners and associated event handlers with Document Object Model (DOM) Level 2 event interfaces. The result is<br />

to provide a declarative, interoperable way of associating behaviors with <strong>XML</strong>-based documents such as XHTML.<br />

Advantages of <strong>XML</strong> Events<br />

<strong>XML</strong> Events uses a separation of concerns design pattern, and is technology-neutral with regards to handlers. It<br />

gives authors freedom in organizing their code and allows separation of document content from scripting.<br />

Legacy HTML and early SVG versions bind events to presentation elements by encoding the event name in an<br />

attribute name, such that the value of the attribute is the action for that event at that element. For example (with<br />

JavaScript's onclick attribute):<br />

Stay here!



This design has three drawbacks:<br />

1. It hard-wires the events into the language, so that adding new event types requires changes to the language.<br />

2. It forces authors to mix the content of the document with the specifications of the scripting and event handling,<br />

rather than allowing them to keep the two separate.<br />

3. It restricts authors to a single scripting language per document.<br />

Relationship to Other Standards<br />

Unlike DOM Events, which are usually associated with HTML documents, <strong>XML</strong> Events is designed to be<br />

independent of specific devices. <strong>XML</strong> Events is used extensively in XForms and in version 1.2 of the SVG<br />

specification, which as of July 2006 was still a working draft.<br />

Example of <strong>XML</strong> Events using Listener in XForms<br />

The following is an example of how <strong>XML</strong> events are used in the XForms specification:<br />

Do it!<br />

alert("test");<br />

In this example, when the DOMActivate event occurs on the element whose id attribute is myButton, the<br />

handler doit (for example, a JavaScript script element) is executed.<br />
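The binding described above can be sketched in Python by reading a listener element and resolving its observer and handler references. The markup below is an assumed reconstruction: only the event name, the two id values, the label text, and the script body come from the text, while the doc, button, and script element names are hypothetical. The namespace name is the one defined by the XML Events specification.<br />

```python
import xml.etree.ElementTree as ET

EV = "http://www.w3.org/2001/xml-events"  # XML Events namespace name

# Assumed markup; element names other than ev:listener are illustrative.
SAMPLE = """\
<doc xmlns:ev="http://www.w3.org/2001/xml-events">
  <ev:listener event="DOMActivate" observer="myButton" handler="#doit"/>
  <button id="myButton">Do it!</button>
  <script id="doit" type="text/javascript">alert("test");</script>
</doc>
"""


def read_listeners(xml_text):
    """Return one record per ev:listener, with the observer and handler
    references resolved to the tags of the elements carrying those ids."""
    root = ET.fromstring(xml_text)
    by_id = {el.get("id"): el for el in root.iter() if el.get("id")}
    bindings = []
    for listener in root.iter("{%s}listener" % EV):
        handler_id = listener.get("handler").lstrip("#")
        bindings.append({
            "event": listener.get("event"),
            "observer": by_id[listener.get("observer")].tag,
            "handler": by_id[handler_id].tag,
        })
    return bindings


bindings = read_listeners(SAMPLE)
```

The sketch only reads the declarative binding; dispatching the event and running the handler would be the job of the host application.<br />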

See also<br />

• ECMAScript<br />

• DOM Events<br />

• XForms<br />

• XHTML<br />

External links<br />

• W3C <strong>XML</strong> Events Specification [2], which became a W3C Recommendation on 14 October 2003 [3]<br />

• W3C <strong>XML</strong> Events for HTML Authors [4] tutorial



References<br />

[1] "<strong>XML</strong> Events: An Events Syntax for <strong>XML</strong>" (http://www.w3.org/TR/xml-events/). World Wide Web Consortium. 2003-10-14.<br />

Retrieved 2008-11-19.<br />

[2] http://www.w3.org/TR/xml-events<br />

[3] http://www.w3.org/TR/2003/REC-xml-events-20031014/<br />

[4] http://www.w3.org/MarkUp/2004/xmlevents-for-html-authors<br />

<strong>XML</strong> framework<br />

An <strong>XML</strong> framework is a software framework for <strong>XML</strong>. Like other frameworks, it implements several features to aid<br />

the programmer in creating an application, but an <strong>XML</strong> framework differs from other frameworks in that all<br />

data produced is <strong>XML</strong>. The programmer defines and produces pure data in <strong>XML</strong> format, and the framework<br />

transforms the document to any desired format.<br />

From one code base and one <strong>XML</strong> document, several transformations can produce XHTML, SVG, WML, Excel or Word formats, or any<br />

other document type.<br />
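The idea of one data document feeding several output formats can be sketched as follows. This is only an illustration under stated assumptions: a real XML framework would normally apply XSLT stylesheets, whereas the two plain Python functions here stand in for such transformations, and the data document and function names are hypothetical.<br />

```python
import xml.etree.ElementTree as ET

# Hypothetical data document produced by the application.
DATA = "<people><person><first>John</first><last>Doe</last></person></people>"


def to_xhtml_list(xml_text):
    """Render the data as an XHTML unordered list."""
    root = ET.fromstring(xml_text)
    ul = ET.Element("ul")
    for person in root.iter("person"):
        li = ET.SubElement(ul, "li")
        li.text = "%s %s" % (person.findtext("first"), person.findtext("last"))
    return ET.tostring(ul, encoding="unicode")


def to_csv(xml_text):
    """Render the same data as comma-separated text."""
    root = ET.fromstring(xml_text)
    rows = ["first,last"]
    for person in root.iter("person"):
        rows.append("%s,%s" % (person.findtext("first"), person.findtext("last")))
    return "\n".join(rows)


xhtml = to_xhtml_list(DATA)
csv_text = to_csv(DATA)
```

The application code that produced DATA does not change when a new output format is added; only another transformation is written.<br />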

Features in an <strong>XML</strong> framework<br />

• Classes that abstract the use of <strong>XML</strong> documents<br />

• Classes that abstract data access: all data is exposed as <strong>XML</strong>, whatever its source (<strong>XML</strong>, database, or text<br />

files)<br />

• An XSLT cache<br />

• An easy way to create XSLT documents, like code snippets<br />

• Extensibility: the framework must be extensible because <strong>XML</strong> is extensible by definition<br />

Pure <strong>XML</strong> frameworks<br />

• <strong>XML</strong>Nuke



<strong>XML</strong> Literals<br />

In the Microsoft .NET Framework, <strong>XML</strong> literals allow a computer program to include <strong>XML</strong> directly in the code. They are<br />

currently supported only in VB.NET 9.0. When a Visual Basic expression is embedded in an <strong>XML</strong> literal, the<br />

application creates a LINQ to <strong>XML</strong> object for each literal at run time.<br />

<strong>XML</strong> namespace<br />

<strong>XML</strong> namespaces are used for providing uniquely named elements and attributes in an <strong>XML</strong> document. They are<br />

defined in Namespaces in <strong>XML</strong> [1] , a W3C recommendation. An <strong>XML</strong> instance may contain element or attribute<br />

names from more than one <strong>XML</strong> vocabulary. If each vocabulary is given a namespace then the ambiguity between<br />

identically named elements or attributes can be resolved.<br />

A simple example would be to consider an <strong>XML</strong> instance that contained references to a customer and an ordered<br />

product. Both the customer element and the product element could have a child element named id. References to the<br />

id element would therefore be ambiguous; placing them in different namespaces would remove the ambiguity.<br />
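The customer and product case above can be sketched with Python's standard xml.etree.ElementTree. The two namespace names and the id values are hypothetical placeholders chosen for illustration.<br />

```python
import xml.etree.ElementTree as ET

# Hypothetical vocabularies; the namespace names are placeholders.
ORDER = """\
<order xmlns:c="http://example.com/customer"
       xmlns:p="http://example.com/product">
  <c:customer><c:id>C-17</c:id></c:customer>
  <p:product><p:id>P-42</p:id></p:product>
</order>
"""

root = ET.fromstring(ORDER)
ns = {"c": "http://example.com/customer", "p": "http://example.com/product"}

# Each id is addressed through its own namespace, so neither is ambiguous.
customer_id = root.findtext("c:customer/c:id", namespaces=ns)
product_id = root.findtext("p:product/p:id", namespaces=ns)

# ElementTree stores every name in expanded form: {namespace name}local name.
id_tags = sorted(el.tag for el in root.iter() if el.tag.endswith("}id"))
```

Both elements have the local name id, but their expanded names differ, which is what removes the ambiguity.<br />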

Namespace declaration<br />

A namespace is declared using the reserved <strong>XML</strong> attribute xmlns, the value of which must be an Internationalized<br />

Resource Identifier (IRI), usually a Uniform Resource Identifier (URI) reference.<br />

For example:<br />

xmlns="http://www.w3.org/1999/xhtml"<br />

Note, however, that the namespace specification neither requires nor suggests that the namespace URI be used to<br />

retrieve information; it is simply treated by an <strong>XML</strong> parser as a string. For example, the document at<br />

http://www.w3.org/1999/xhtml itself does not contain any code. It simply describes the XHTML namespace to human readers.<br />

Using a URI (such as "http://www.w3.org/1999/xhtml") to identify a namespace, rather than a simple string (such as<br />

"xhtml"), reduces the possibility of different namespaces using duplicate identifiers.<br />

It is also possible to map namespaces to prefixes in namespace declarations. For example:<br />

xmlns:xhtml="http://www.w3.org/1999/xhtml"<br />

In this case, any element or attribute names that start with the prefix "xhtml:" are considered to be in the XHTML<br />

namespace.<br />
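A short sketch, again with xml.etree.ElementTree, of two points made above: the prefix is only a local shorthand, and the namespace name is compared as a plain string. The variable names are illustrative.<br />

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"

# The same element written with a default declaration and two prefixes.
default_form = ET.fromstring('<p xmlns="http://www.w3.org/1999/xhtml">hi</p>')
prefixed_form = ET.fromstring(
    '<xhtml:p xmlns:xhtml="http://www.w3.org/1999/xhtml">hi</xhtml:p>')
other_prefix = ET.fromstring(
    '<h:p xmlns:h="http://www.w3.org/1999/xhtml">hi</h:p>')

# A trailing slash gives a different string, hence a different namespace.
lookalike = ET.fromstring('<p xmlns="http://www.w3.org/1999/xhtml/">hi</p>')

same_name = default_form.tag == prefixed_form.tag == other_prefix.tag
```

Nothing is fetched from the network at any point; the parser never dereferences the namespace name.<br />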

Namespace names<br />

Although the term namespace URI is widespread, the W3C Recommendation refers to it as the namespace name.<br />

The specification is not entirely prescriptive about the precise rules for namespace names (it does not explicitly say<br />

that parsers must reject documents where the namespace name is not a valid Uniform Resource Identifier), and many<br />

<strong>XML</strong> parsers allow any character string to be used. In version 1.1 of the recommendation, the namespace name<br />

becomes an Internationalized Resource Identifier, which licenses the use of non-ASCII characters that in practice<br />

were already accepted by nearly all <strong>XML</strong> software. The term namespace URI persists, however, not only in popular<br />

usage but also in many other specifications from W3C and elsewhere.<br />

Following publication of the Namespaces recommendation, there was an intensive debate about how a relative URI<br />

should be handled, with some arguing that it should simply be treated as a character string, and others that it should<br />

be turned into an absolute URI by resolving it against the base URI of the document [2] . The result of the debate was



a ruling from W3C that relative URIs were deprecated [3] .<br />

The use of URIs taking the form of URLs in the http scheme (such as http://www.w3.org/1999/xhtml) is<br />

common, despite the absence of any formal relationship with the HTTP protocol. The Namespaces specification does<br />

not say what should happen if such a URL is dereferenced (that is, if software attempts to retrieve a document from<br />

this location). One convention adopted by some users is to place an RDDL document at the location [4] . In general,<br />

however, users should assume that the namespace URI is simply a name, not the address of a document on the web.<br />

See also<br />

• Namespace<br />

External links<br />

• Namespaces in <strong>XML</strong> 1.0 (Third Edition) [1]<br />

• Namespaces in <strong>XML</strong> 1.1 (Second Edition) [8]<br />

References<br />

[1] http://www.w3.org/TR/REC-xml-names/<br />

[2] Leigh Dodds (24 May 2000), News from the trenches (http://www.xml.com/pub/a/2000/05/24/deviant/index.html)<br />

[3] Dan Connolly (11 Sep 2000), W3C <strong>XML</strong> Plenary decision on relative URI references in namespace declarations<br />

[4] Elliotte Rusty Harold (20 Feb 2001), RDDL Me This: What Does a Namespace URL Locate? (http://www.oreillynet.com/pub/a/oreilly/xml/news/xmlnut2_0201.html)<br />

<strong>XML</strong> Pretty Printer<br />

<strong>XML</strong> pretty printers are a type of prettyprinter, or code beautifier, that specifically improves the readability of <strong>XML</strong>.<br />

<strong>XML</strong> is designed to be human readable, but machine-generated <strong>XML</strong> is often written compactly, without line breaks or<br />

indentation, and is hence more difficult to read and edit. Running the <strong>XML</strong> file through a pretty printer<br />

improves its readability and editability.<br />
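As a minimal sketch of what such a tool does, the following Python fragment re-indents a compact document with the standard xml.dom.minidom module. It is not one of the tools listed below; the sample document and the pretty_print helper are illustrative.<br />

```python
from xml.dom import minidom

# Compact, machine-style input on a single line (hypothetical data).
compact = ("<order><customer><id>17</id></customer>"
           "<product><id>42</id></product></order>")


def pretty_print(xml_text, indent="  "):
    """Parse the document and serialize it with one element per line."""
    return minidom.parseString(xml_text).toprettyxml(indent=indent)


pretty = pretty_print(compact)
print(pretty)
```

Input that already contains indentation may come out with extra whitespace-only lines, because minidom keeps the existing whitespace text nodes; dedicated pretty printers usually normalize these first.<br />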

Examples of <strong>XML</strong> Pretty Printers<br />

• xmllint (utility in open source library libxml2)<br />

• xmlindent (open source tool; see its homepage [1])<br />

Online:<br />

• <strong>XML</strong> Pretty Printer Online<br />

• DecisionSoft <strong>XML</strong> Pretty Printer<br />

Windows:<br />

• xmlpp (command line)



See also<br />

• Prettyprint<br />

• <strong>XML</strong><br />

External links<br />

• <strong>XML</strong> Pretty Printer Online [2]<br />

• DecisionSoft <strong>XML</strong> Pretty Printer [3]<br />

• xmlpp pretty printer [4]<br />

• <strong>XML</strong> Indent [1] , an <strong>XML</strong> stream reformatter<br />

References<br />

[1] http://xmlindent.sourceforge.net/<br />

[2] http://www.iconv.com/xmllint.htm<br />

[3] http://tools.decisionsoft.com/xmlpp.html<br />

[4] http://www.cheztabor.com/xmlpp/index.htm<br />

<strong>XML</strong> Protocol<br />

The <strong>XML</strong> Protocol ("<strong>XML</strong>P") is a standard being developed by the W3C <strong>XML</strong> Protocol Working Group to the<br />

following guidelines, outlined in the group's charter:<br />

1. An envelope for encapsulating <strong>XML</strong> data to be transferred in an interoperable manner that allows for distributed<br />

extensibility.<br />

2. A convention for the content of the envelope when used for RPC (Remote Procedure Call) applications. The<br />

protocol aspects of this should be coordinated closely with the IETF and make an effort to leverage any work they<br />

are doing, see below for details.<br />

3. A mechanism for serializing data representing non-syntactic data models such as object graphs and directed<br />

labeled graphs, based on the data types of <strong>XML</strong> Schema.<br />

4. A mechanism for using HTTP transport in the context of an <strong>XML</strong> Protocol. This does not mean that HTTP is the<br />

only transport mechanism that can be used for the technologies developed, nor that support for HTTP transport is<br />

mandatory. This component merely addresses the fact that HTTP transport is expected to be widely used, and so<br />

should be addressed by this Working Group. There will be coordination with the Internet Engineering Task Force<br />

(IETF). (See Blocks Extensible Exchange Protocol)<br />

Further, the protocol developed must meet the following requirements, as per the working group's charter:<br />

1. The envelope and the serialization mechanisms developed by the Working Group may not preclude any<br />

programming model nor assume any particular mode of communication between peers.<br />

2. Focus must be put on simplicity and modularity, and the protocol must support the kind of extensibility actually seen on the<br />

Web. In particular, it must support distributed extensibility where the communicating parties do not have a priori<br />

knowledge of each other.



See also<br />

• <strong>XML</strong><br />

• Internet Engineering Task Force<br />

External links<br />

• <strong>XML</strong> Protocol Working Group Charter [1]<br />

• <strong>XML</strong> Protocol Working Group [2]<br />

References<br />

[1] http://www.w3.org/2004/02/XML-Protocol-Charter<br />

[2] http://www.w3.org/2000/xp/Group/<br />

<strong>XML</strong> schema<br />

An <strong>XML</strong> schema is a description of a type of <strong>XML</strong> document, typically expressed in terms of constraints on the<br />

structure and content of documents of that type, above and beyond the basic syntactical constraints imposed by <strong>XML</strong><br />

itself. These constraints are generally expressed using some combination of grammatical rules governing the order of<br />

elements, Boolean predicates that the content must satisfy, data types governing the content of elements and<br />

attributes, and more specialized rules such as uniqueness and referential integrity constraints.<br />

There are languages developed specifically to express <strong>XML</strong> schemas. The Document Type Definition (DTD)<br />

language, which is native to the <strong>XML</strong> specification, is a schema language that is of relatively limited capability, but<br />

that also has other uses in <strong>XML</strong> aside from the expression of schemas. Two more expressive <strong>XML</strong> schema<br />

languages in widespread use are <strong>XML</strong> Schema (with a capital S) and RELAX NG.<br />

The mechanism for associating an <strong>XML</strong> document with a schema varies according to the schema language. The<br />

association may be achieved via markup within the <strong>XML</strong> document itself, or via some external means.<br />

Validation<br />

The process of checking to see if an <strong>XML</strong> document conforms to a schema is called validation, which is separate<br />

from <strong>XML</strong>'s core concept of syntactic well-formedness. All <strong>XML</strong> documents must be well-formed, but it is not<br />

required that a document be valid unless the <strong>XML</strong> parser is "validating," in which case the document is also checked<br />

for conformance with its associated schema. DTD-validating parsers are most common, but some support W3C<br />

<strong>XML</strong> Schema or RELAX NG as well.<br />

Documents are only considered valid if they satisfy the requirements of the schema with which they have been<br />

associated. These requirements typically include such constraints as:<br />

• Elements and attributes that must/may be included, and their permitted structure<br />

• The structure as specified by a regular expression syntax<br />

• How character data is to be interpreted, e.g. as a number, a date, a URL, a Boolean, etc.<br />

Validation of an instance document against a schema can be regarded as a conceptually separate operation from<br />

<strong>XML</strong> parsing. In practice, however, many schema validators are integrated with an <strong>XML</strong> parser.
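The difference between well-formedness and validity can be sketched in Python. The standard library checks well-formedness but ships no schema validator, so the toy_validate function below is a hand-written stand-in for a schema, with hypothetical rules; it is not DTD, W3C XML Schema, or RELAX NG validation.<br />

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """True if the text parses as XML at all (syntax only)."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


def toy_validate(xml_text):
    """Stand-in for a schema: <person> must contain <first> then <last>,
    and an optional age attribute must be a non-negative integer."""
    root = ET.fromstring(xml_text)
    errors = []
    if root.tag != "person":
        errors.append("root element must be <person>")
    if [child.tag for child in root] != ["first", "last"]:
        errors.append("<person> must contain <first> then <last>")
    age = root.get("age")
    if age is not None and not age.isdigit():
        errors.append("age must be a non-negative integer")
    return errors


valid_doc = '<person age="42"><first>John</first><last>Doe</last></person>'
invalid_doc = '<person age="old"><last>Doe</last><first>John</first></person>'
broken_doc = "<person><first>John</last></person>"
```

Here invalid_doc is well-formed but breaks two of the rules, while broken_doc fails at the parsing stage and never reaches validation.<br />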



<strong>XML</strong> schema languages<br />

• Document Content Description facility for <strong>XML</strong>, an RDF framework [1]<br />

• Document Definition <strong>Markup</strong> <strong>Language</strong> (DDML)<br />

• Document Schema Definition <strong>Language</strong>s (DSDL)<br />

• Document Structure Description (DSD)<br />

• Document Type Definition (DTD)<br />

• Namespace Routing <strong>Language</strong> (NRL)<br />

• RELAX NG and its predecessors RELAX and TREX<br />

• SGML<br />

• Schema for Object-Oriented <strong>XML</strong> (SOX)<br />

• Schematron<br />

• <strong>XML</strong>-Data Reduced (XDR)<br />

• <strong>XML</strong> Schema (WXS or XSD)<br />

Capitalization<br />

There is some confusion as to when to use the capitalized spelling "Schema" and when to use the lowercase spelling.<br />

The lowercase form is a generic term and may refer to any type of schema, including DTD, <strong>XML</strong> Schema (aka<br />

XSD), RELAX NG, or others, and should always be written using lowercase except when appearing at the start of a<br />

sentence. The form "Schema" (capitalized) in common use in the <strong>XML</strong> community always refers to W3C <strong>XML</strong><br />

Schema.<br />

See also<br />

• Data structure<br />

• Structuring information<br />

• List of <strong>XML</strong> schemas<br />

• <strong>XML</strong> Information Set<br />

• <strong>XML</strong> Schema <strong>Language</strong> Comparison<br />

• Schema (for other uses of the term)<br />

External links<br />

• Comparing <strong>XML</strong> Schema <strong>Language</strong>s [2] by Eric van der Vlist (2001)<br />

• Comparative Analysis of Six <strong>XML</strong> Schema <strong>Language</strong>s [3] by Dongwon Lee, Wesley W. Chu, In ACM SIGMOD<br />

Record, Vol. 29, No. 3, page 76-87, September 2000<br />

• Taxonomy of <strong>XML</strong> Schema <strong>Language</strong>s using Formal <strong>Language</strong> Theory [4] by Makoto Murata, Dongwon Lee,<br />

Murali Mani, Kohsuke Kawaguchi, In ACM Trans. on Internet Technology (TOIT), Vol. 5, No. 4, page 1-45,<br />

November 2005<br />

• Application of <strong>XML</strong> Schema in Web Services Security [5] by Sridhar Guthula, W3C Schema Experience Report,<br />

May 2005



References<br />

[1] "Document Content Description for <strong>XML</strong>: Submission to the World Wide Web Consortium 31-July-1998" (http://www.w3.org/TR/NOTE-dcd).<br />

[2] http://www.xml.com/pub/a/2001/12/12/schemacompare.html<br />

[3] http://pike.psu.edu/publications/sigmod-record-00.pdf<br />

[4] http://pike.psu.edu/publications/toit05.pdf<br />

[5] http://www.w3.org/2005/05/25-schema/guthula.html<br />

<strong>XML</strong> Schema Editor<br />

The W3C's <strong>XML</strong> Schema Recommendation defines a formal mechanism for describing <strong>XML</strong> documents. The<br />

standard has become very popular and is used by the majority of standards bodies when describing their data. [1]<br />

The standard is very versatile, allowing for programming concepts such as inheritance and type creation. However,<br />

its high complexity is one of its main drawbacks. The standard itself is highly technical and is published in three parts,<br />

making it difficult to understand without committing large amounts of time to it.<br />

<strong>XML</strong> Schema Editor Tools<br />

The problems users face when working with the XSD standard can largely be mitigated with the use of graphical<br />

editing tools. Although any text-based editor can be used to edit an <strong>XML</strong> Schema, a graphical editor offers the<br />

biggest advantages, allowing the structure of the document to be viewed graphically and edited with validation<br />

support, entry helpers and other useful features.<br />

The editors that have been developed so far take several different approaches to the presentation of information:<br />

Text <strong>View</strong><br />

The text view of an <strong>XML</strong> Schema shows the schema in its native form. <strong>XML</strong> Schema Editors generally add to the<br />

text view with features like inline entry helpers and entry helper windows, code completion, line numbering, source<br />

folding, and syntax coloring.<br />

In longer and more complex schema documents, the text view is often difficult for even highly trained content model<br />

architects to work with, which has led software companies to develop new ways for users to<br />

visualize these documents.<br />

Physical <strong>View</strong><br />

A physical view of an <strong>XML</strong> Schema displays a graphical entity for each element within the <strong>XML</strong> Schema. This can<br />

make an XSD document easier to read, but does little to simplify editing. This is largely due to the structure of the<br />

XSD standard: container elements are required, and which ones depend on the base type used and the types contained<br />

within. As a result, small changes to the logical structure can cause changes to ripple through the document.<br />

The structure of the XSD standard also means that entities are referenced from other locations within the document. Some<br />

editors allow these to be expanded and viewed in the location they are referenced from; others do not, which means a lot of<br />

manual cross-referencing.



Logical <strong>View</strong><br />

A logical view shows the structure of the <strong>XML</strong> Schema without showing all the detail of the syntax used to describe<br />

it. This provides a much clearer view of the <strong>XML</strong> Schema, making it easier to understand the structure of the<br />

document, and makes it easier to edit. Because the editor shows the logical structure of the XSD document, there is<br />

no need to show every element, removing much of the complexity and allowing the editor to automatically manage<br />

the syntactical rules.<br />
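How a logical view might be derived from the schema source can be sketched as follows: walk the XSD and report only the declared elements and attributes, skipping container elements such as complexType and sequence. The schema below is hypothetical and is not the schema from the example in this article; the sketch handles only simple inline declarations and is no substitute for a real editor.<br />

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical schema used only for illustration.
XSD = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Person">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="FirstName" type="xs:string"/>
        <xs:element name="LastName" type="xs:string"/>
      </xs:sequence>
      <xs:attribute name="id" type="xs:int"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""


def logical_view(node, depth=0):
    """Return indented lines naming declared elements and attributes,
    hiding the container elements that the XSD syntax requires."""
    lines = []
    for child in node:
        if child.tag == XS + "element":
            name = child.get("name", child.get("ref", "?"))
            lines.append("  " * depth + name)
            lines.extend(logical_view(child, depth + 1))
        elif child.tag == XS + "attribute":
            name = child.get("name", child.get("ref", "?"))
            lines.append("  " * depth + "@" + name)
        else:
            # complexType, sequence, etc.: descend without printing.
            lines.extend(logical_view(child, depth))
    return lines


view = logical_view(ET.fromstring(XSD))
print("\n".join(view))
```

The eleven lines of schema source reduce to four lines of structure, which is the simplification a logical view aims for.<br />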

Example<br />

The following example will show the source XSD, logical and physical views for a simple schema.<br />

A Sample <strong>XML</strong> Document for the schema<br />

John<br />

Doe<br />

Figures: Physical <strong>View</strong>, Logical <strong>View</strong><br />

As you can see the logical view provides more information, but without the syntactical clutter, making it easier to<br />

understand and work with.<br />

<strong>XML</strong> Schema Editors<br />

As the XSD standard has gained support, a host of <strong>XML</strong> Schema editors have been developed.<br />

Application Name | Screenshot | Code Editor | Physical Editor | Logical Editor | Split Code/Diagram <strong>View</strong><br />

Altova <strong>XML</strong>Spy screenshots [2]<br />

Eclipse XSD Editor (eclipse.org [3] ) screenshots [3] Limited Editing<br />

Liquid <strong>XML</strong> Studio screenshots [4]<br />

Oxygen xml screenshots [5] Read only<br />

Stylus Studio screenshots [6] Read only<br />

<strong>XML</strong> Fox - Freeware Edition screenshots [7]<br />

References<br />

[1] http://www.w3.org/TR/xmlschema-0/ (W3C Primer)<br />

[2] http://www.altova.com/features_dtdschema.html<br />

[3] http://wiki.eclipse.org/index.php/Introduction_to_the_XSD_Editor<br />

[4] http://www.liquid-technologies.com/XmlStudio/XmlStudio.aspx<br />

[5] http://www.oxygenxml.com/xml_schema_editor.html<br />

[6] http://www.stylusstudio.com/xml_schema_editor.html<br />

[7] http://www.xmlfox.com/xml_schema_editor.htm<br />



<strong>XML</strong> Schema <strong>Language</strong> Comparison<br />

An <strong>XML</strong> schema is a description of a type of <strong>XML</strong> document, typically expressed in terms of constraints on the<br />

structure and content of documents of that type, above and beyond the basic syntax constraints imposed by <strong>XML</strong><br />

itself. There are several different languages available for specifying an <strong>XML</strong> schema. Each language has its strengths<br />

and weaknesses.<br />

Note: the W3C-defined schema language is called "<strong>XML</strong> Schema". However, this name can be confusing when<br />

discussing several <strong>XML</strong> schema languages. As such, throughout this document, the<br />

term "<strong>XML</strong> schema" refers to any <strong>XML</strong> schema language, while the term<br />

"W3C <strong>XML</strong> Schema" (abbreviated in this article as WXS) is used for the W3C-defined <strong>XML</strong> schema language.<br />

Overview<br />

Though there are a number of schema languages available, the primary three languages are Document Type<br />

Definitions, W3C <strong>XML</strong> Schema, and RELAX NG. Each language has its own advantages and disadvantages.<br />

This article also covers a brief review of other schema languages.<br />

The primary purpose of a schema language is to specify what the structure of an <strong>XML</strong> document can be. This means<br />

which elements can reside in which other elements, which attributes are and are not legal to have on a particular<br />

element, and so forth. A schema is somewhat equivalent to a grammar for a language; a schema defines what the<br />

vocabulary for the language may be and what a valid "sentence" is.<br />

Document Type Definitions<br />

Advantages<br />

Of the primary three languages, DTDs are the only ones that can be defined inline. That is, the DTD can actually be<br />

embedded directly into the document.<br />

A DTD can define more than merely the content model. It can also define entities, named pieces of replacement text that can be used in the document,<br />

much as a C or C++ preprocessor has #define macros that are expanded in the source.<br />
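The macro-like behaviour can be sketched with an inline DTD and Python's xml.etree.ElementTree, whose underlying parser expands entities declared in the internal DTD subset. The memo document and the company entity are invented for illustration.<br />

```python
import xml.etree.ElementTree as ET

# An internal DTD subset that declares one entity, then uses it twice.
DOC = """\
<!DOCTYPE memo [
  <!ENTITY company "Example Corp">
]>
<memo>Welcome to &company;. &company; thanks you.</memo>
"""

memo = ET.fromstring(DOC)
text = memo.text  # entity references are replaced during parsing
```

A processor that ignores the DTD would not know what &company; stands for, which is the portability concern raised under Disadvantages. Entity expansion in documents from untrusted sources can also be abused, so it is often restricted in practice.<br />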

The DTD language is compact and highly readable, though it does require some experience to understand.<br />

Disadvantages<br />

The primary disadvantage of DTDs is their lack of specificity. The content models for DTDs are very basic,<br />

particularly compared to the other two languages.<br />

Overuse of DTD-defined entities may make a document illegible or incomprehensible without the associated DTD.<br />

Additionally, several <strong>XML</strong> processors, typically for ease-of-implementation reasons, do not understand<br />

DTDs. If DTD-defined entities are used, these <strong>XML</strong> processors will not recognize them.<br />

The language that DTDs are written in is not <strong>XML</strong>. Therefore, DTDs cannot use the various frameworks that have<br />

been built around <strong>XML</strong>. <strong>XML</strong> editors that support writing DTDs must do so by parsing an additional language, for<br />

example. Some <strong>XML</strong> processors, typically for economy of implementation or execution, simply ignore DTD<br />

information, including DTD-defined entities.<br />

The DTD concept for <strong>XML</strong> was borrowed from the SGML DTD concept. As such, the construct could not be<br />

changed when <strong>XML</strong> was extended with namespaces, and DTDs are namespace unaware.<br />

There is limited support for defining the type of the contained data. DTDs are primarily structural in nature. They do<br />

not have the ability to specify that an element contains an integral number, real number, a date, or anything of that<br />

nature.



Tool Support<br />

DTDs are perhaps the most widely supported schema language for <strong>XML</strong>, because they are one of the earliest<br />

schema languages for <strong>XML</strong>, defined before <strong>XML</strong> even had namespace support. Internal<br />

DTDs are often supported in <strong>XML</strong> processors; external DTDs are supported only slightly less often. Most large<br />

<strong>XML</strong> parsers, those that support multiple <strong>XML</strong> technologies, provide support for DTDs as well.<br />

W3C <strong>XML</strong> Schema<br />

Advantages over DTDs<br />

Compared to DTDs, W3C <strong>XML</strong> Schemas are exceptionally powerful. They provide much greater specificity than<br />

DTDs could. They are namespace aware, and provide support for types.<br />

W3C <strong>XML</strong> Schema is written in <strong>XML</strong> itself, and therefore has a schema of its own (appropriately, written in W3C<br />

<strong>XML</strong> Schema).<br />

W3C <strong>XML</strong> Schema has a large number of built-in and derived data types. These are specified by the W3C <strong>XML</strong><br />

Schema specification, so all W3C <strong>XML</strong> Schema validators and processors must support them.<br />

Due to the nature of the schema language, after an <strong>XML</strong> document is validated, the entire <strong>XML</strong> document, both<br />

content and structure, can be expressed in terms of the schema itself. This functionality, known as<br />

Post-Schema-Validation Infoset (PSVI), can be used to transform the document into a hierarchy of typed objects that<br />

can be accessed in a programming language through a neutral interface.<br />

Commonality with RELAX NG<br />

Both RELAX NG and W3C <strong>XML</strong> Schema allow for similar mechanisms of specificity. Both allow for a degree of<br />

modularity in their languages, going so far as to being able to split the schema into multiple files. And both of them<br />

are, or can be, defined in an <strong>XML</strong> language.<br />

Advantages over RELAX NG<br />

RELAX NG lacks any analog to PSVI. Unlike W3C <strong>XML</strong> Schema, RELAX NG was not designed with type<br />

assignment and data binding in mind.<br />

W3C <strong>XML</strong> Schema has a formal mechanism for attaching a schema to an <strong>XML</strong> document.<br />

RELAX NG has no ability to apply default attribute data to an element's list of attributes (i.e., changing the <strong>XML</strong><br />

info set), while W3C <strong>XML</strong> Schema does. [1]<br />

W3C <strong>XML</strong> Schema has a rich "simple type" system built in (xs:decimal, xs:date, etc., plus derivation of custom<br />

types), while RELAX NG has an extremely simplistic one because it's meant to use type libraries developed<br />

independently of RELAX NG, rather than grow its own. This is seen by some as a disadvantage. In practice it's<br />

common for a RELAX NG schema to use the predefined "simple types" and "restrictions" (pattern, maxLength, etc.)<br />

of W3C <strong>XML</strong> Schema.<br />

In W3C <strong>XML</strong> Schema, a specific number or range of repetitions of patterns can be expressed more elegantly than<br />

under RELAX NG. For large repetition counts, it is practically impossible to specify them at all in RELAX NG.



Disadvantages<br />

W3C <strong>XML</strong> Schema is complex and hard to learn, although that's partially because it tries to do more than mere<br />

validation (see PSVI).<br />

Although being written in <strong>XML</strong> is an advantage, it is also a disadvantage in some ways. The W3C <strong>XML</strong> Schema<br />

language in particular can be quite verbose, while a DTD can be terse and relatively easily editable.<br />

WXS's formal mechanism for associating a document with a schema can also pose a potential security<br />

problem. A WXS validator that follows a URI to an arbitrary online location may end up reading<br />

malicious content from the remote server. [2]<br />
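
One way to reduce this risk is to inspect the schema hints before any validator acts on them. The sketch below is an assumption-laden illustration, not a feature of any particular validator: it reads the standard xsi:schemaLocation and xsi:noNamespaceSchemaLocation attributes from the root element and reports locations whose host is not on a caller-supplied allowlist. The function names and the example hosts are hypothetical.<br />

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

XSI = "http://www.w3.org/2001/XMLSchema-instance"


def schema_hints(xml_text: str) -> list[str]:
    """Return the schema locations the root element asks a validator to load."""
    root = ET.fromstring(xml_text)
    # xsi:schemaLocation holds whitespace-separated (namespace, location) pairs,
    # so every second token is a location.
    pairs = root.get(f"{{{XSI}}}schemaLocation", "").split()
    locations = pairs[1::2]
    single = root.get(f"{{{XSI}}}noNamespaceSchemaLocation")
    if single:
        locations.append(single)
    return locations


def untrusted_hints(xml_text: str, trusted_hosts: set[str]) -> list[str]:
    """List locations whose host is not explicitly trusted (relative paths are flagged too)."""
    return [loc for loc in schema_hints(xml_text)
            if urlparse(loc).hostname not in trusted_hosts]
```

A caller could refuse to validate, or substitute a local copy of the schema, whenever untrusted_hints returns a non-empty list.<br />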

W3C <strong>XML</strong> Schema does not implement most of the DTD's ability to provide data elements to a document. Although<br />

this is technically a deficiency relative to DTDs, it also avoids the problems that this ability can create, so it can<br />

equally be counted as a strength.<br />

Although W3C <strong>XML</strong> Schema's ability to add default attributes to elements is an advantage, it is a disadvantage in<br />

some ways as well. It means that an <strong>XML</strong> file may not be usable in the absence of its schema, even if the document<br />

would validate against that schema. In effect, all users of such an <strong>XML</strong> document must also implement the W3C<br />

<strong>XML</strong> Schema specification, thus ruling out minimalist or older <strong>XML</strong> parsers. It can also dramatically slow down<br />

processing of the document, as the processor must potentially download and process a second <strong>XML</strong> file (the<br />

schema).<br />

Tool Support<br />

WXS support exists in a number of large <strong>XML</strong> parsing packages. Xerces and the .NET Framework's Base Class<br />

Library both provide support for WXS validation.<br />

RELAX NG<br />

RELAX NG provides for most of the advantages that W3C <strong>XML</strong> Schema does over DTDs.<br />

Advantages over W3C <strong>XML</strong> Schema<br />

While the language of RELAX NG can be written in <strong>XML</strong>, it also has an equivalent form that is much more like a<br />

DTD, but with greater specifying power. This form is known as the compact syntax. Tools can easily convert<br />

between these forms with no loss of features or even commenting. Even arbitrary elements specified between<br />

RELAX NG <strong>XML</strong> elements can be converted into the compact form.<br />

RELAX NG provides very strong support for unordered content. That is, it allows the schema to state that a<br />

sequence of patterns may appear in any order.<br />

RELAX NG also allows for non-deterministic content models. What this means is that RELAX NG allows the<br />

specification of a sequence like the following:<br />

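&lt;zeroOrMore&gt;<br />

  &lt;ref name="odd" /&gt;<br />

  &lt;ref name="even" /&gt;<br />

&lt;/zeroOrMore&gt;<br />

&lt;optional&gt;<br />

  &lt;ref name="odd" /&gt;<br />

&lt;/optional&gt;<br />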

When the validator encounters something that matches the "odd" pattern, it is unknown whether this is the optional<br />

last "odd" reference or simply one in the zeroOrMore sequence without looking ahead at the data. RELAX NG<br />

allows this kind of specification. W3C <strong>XML</strong> Schema requires all of its sequences to be fully deterministic, so


mechanisms like the above must be either specified in a different way or omitted altogether.<br />
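
The lookahead problem can be illustrated with an ordinary regular expression. The sketch below assumes a content model of the form "zero or more (odd, even) pairs, followed by an optional odd", and encodes each child as one letter. Python's re module backtracks, so it accepts such a model without complaint; a strictly deterministic matcher would have to decide at each "odd" which branch it belongs to before seeing the next child. The function name and the letter encoding are illustrative assumptions.<br />

```python
import re

# One letter per matched pattern: "o" for odd, "e" for even.
CONTENT_MODEL = re.compile(r"(?:oe)*o?")


def matches(children: list[str]) -> bool:
    """Check a sequence of child pattern names against the content model."""
    # Unknown names map to "x", which the model never accepts.
    letters = "".join({"odd": "o", "even": "e"}.get(name, "x") for name in children)
    return CONTENT_MODEL.fullmatch(letters) is not None


# The final "odd" is first tried as the start of another pair, then
# re-read as the optional tail once no "even" follows it.
print(matches(["odd", "even", "odd"]))
```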

RELAX NG allows attributes to be treated as elements in content models. In particular, this means that one can<br />

provide the following:<br />

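&lt;element name="some_element"&gt;<br />

  &lt;choice&gt;<br />

    &lt;attribute name="has_name"&gt;<br />

      &lt;value&gt;false&lt;/value&gt;<br />

    &lt;/attribute&gt;<br />

    &lt;group&gt;<br />

      &lt;attribute name="has_name"&gt;<br />

        &lt;value&gt;true&lt;/value&gt;<br />

      &lt;/attribute&gt;<br />

      &lt;element name="name"&gt;&lt;text /&gt;&lt;/element&gt;<br />

    &lt;/group&gt;<br />

  &lt;/choice&gt;<br />

&lt;/element&gt;<br />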

This block states that the element "some_element" must have an attribute named "has_name". This attribute can only<br />

take true or false as values, and if it is true, the first child element of the element must be "name", which stores text.<br />

If "name" did not need to be the first element, then the choice could be wrapped in an "interleave" element along<br />

with other elements. The order of the specification of attributes in RELAX NG has no meaning, so this block need<br />

not be the first block in the element definition.<br />

W3C <strong>XML</strong> Schema cannot specify such a dependency between the content of an attribute and child elements.<br />
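
When such a dependency cannot be stated in the schema, it typically has to be checked in application code. The following sketch shows what a hand-written check of the "has_name" rule might look like using Python's standard ElementTree module. The function name and error messages are assumptions made for illustration; only the element and attribute names come from the description above.<br />

```python
import xml.etree.ElementTree as ET


def check_some_element(elem: ET.Element) -> list[str]:
    """Hand-written check of the has_name / name dependency."""
    errors = []
    has_name = elem.get("has_name")
    if has_name not in ("true", "false"):
        errors.append("has_name must be present and equal to 'true' or 'false'")
    elif has_name == "true":
        children = list(elem)
        if not children or children[0].tag != "name":
            errors.append("has_name='true' requires 'name' as the first child element")
        elif len(children[0]) != 0:
            # 'name' stores text, so it must not contain child elements.
            errors.append("'name' must contain only text")
    return errors


doc = ET.fromstring('<some_element has_name="true"><name>Ada</name></some_element>')
print(check_some_element(doc))
```

RELAX NG lets the schema itself carry this rule, so no such code is needed there.<br />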

RELAX NG's specification only lists two built-in types (string and token), but it allows for the definition of many<br />

more. In theory, the lack of a specific list allows a processor to support data types that are very problem-domain<br />

specific.<br />

Most RELAX NG schemas can be algorithmically converted into W3C <strong>XML</strong> Schemas and even DTDs (except when<br />

using RELAX NG features not supported by those languages, as above). The reverse is not true. As such, RELAX<br />

NG can be used as a normative version of the schema, and the user can convert it to other forms for tools that do not<br />

support RELAX NG.<br />

Disadvantages<br />

Most of RELAX NG's disadvantages are covered under the section on W3C <strong>XML</strong> Schema's advantages over<br />

RELAX NG.<br />

Though RELAX NG's ability to support user-defined data types is useful, it comes at the cost of<br />

having only two data types that the user can rely upon. In theory, this means that using a RELAX NG schema across<br />

multiple validators requires either providing those user-defined data types to each validator or using only the two<br />

basic types. In practice, however, most RELAX NG processors support the W3C <strong>XML</strong> Schema set of data types.<br />


Tool Support<br />

RELAX NG's tool support is significant, but it is less widespread than that of W3C <strong>XML</strong> Schema. The Mono Project's<br />

implementation of the .NET Framework includes a RELAX NG validator. The C library libxml2 provides RELAX<br />

NG support as well. Sun Microsystems's Multiple Schema Validator for Java also provides RELAX NG support.<br />

Schematron<br />

Schematron is an unusual schema language. Unlike the main three, it defines an <strong>XML</strong> file's syntax as a list of<br />

XPath-based rules. If the document passes these rules, then it is valid.<br />

Advantages<br />

Because of its rule-based nature, Schematron's specificity is very strong. It can require that the content of an element<br />

be controlled by one of its siblings. It can also request or require that the root element, regardless of what element<br />

that happens to be, have specific attributes. It can even specify required relationships between multiple <strong>XML</strong> files.<br />
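
The rule-based idea can be sketched without a Schematron processor. The example below is not Schematron and does not use its syntax; it is a minimal imitation in Python, using the limited path language of the standard ElementTree module in place of full XPath. The Rule type, the validate function, and the sample element names (order, item, total, version) are all assumptions chosen for illustration.<br />

```python
import xml.etree.ElementTree as ET
from typing import Callable, NamedTuple


class Rule(NamedTuple):
    context: str                         # path selecting the elements to check
    test: Callable[[ET.Element], bool]   # assertion that must hold for each of them
    message: str                         # reported when the assertion fails


def validate(root: ET.Element, rules: list[Rule]) -> list[str]:
    """Apply every rule to every element its context selects."""
    failures = []
    for rule in rules:
        for node in root.findall(rule.context):
            if not rule.test(node):
                failures.append(rule.message)
    return failures


RULES = [
    # The root element, whatever it is called, must carry a version attribute.
    Rule(".", lambda e: e.get("version") is not None,
         "root needs a version attribute"),
    # The content of <total> is controlled by its sibling <item> elements.
    Rule(".//order",
         lambda e: int(e.findtext("total", "0")) == len(e.findall("item")),
         "total must equal the number of item siblings"),
]

doc = ET.fromstring('<orders><order><item/><total>3</total></order></orders>')
for failure in validate(doc, RULES):
    print(failure)
```

A document is considered valid in this sketch when validate returns an empty list, mirroring the "passes these rules" notion described above.<br />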

Disadvantages<br />

While Schematron is good at relational constructs, using it to specify the basic structure of a document, that is,<br />

which elements can go where, results in a very verbose schema.<br />

The typical way to solve this is to combine Schematron with RELAX NG or W3C <strong>XML</strong> Schema. There are several<br />

schema processors available for both languages that support this combined form. This allows Schematron rules to<br />

specify additional constraints to the structure defined by W3C <strong>XML</strong> Schema or RELAX NG.<br />

Tool Support<br />

Schematron's reference implementation is actually an XSLT transformation that transforms the Schematron<br />

document into an XSLT that validates the <strong>XML</strong> file. As such, Schematron's potential toolset is any XSLT processor,<br />

though libxml2 provides an implementation that does not require XSLT. Sun Microsystems's Multiple Schema<br />

Validator for Java has an add-on that allows it to validate RELAX NG schemas that have embedded Schematron<br />

rules.<br />

Namespace Routing <strong>Language</strong> (NRL)<br />

NRL is not technically a schema language. Its sole purpose is to direct parts of documents to individual schemas<br />

based on the namespace of the encountered elements. An NRL file is merely a list of <strong>XML</strong> namespaces and a path to a<br />

schema that each corresponds to. This allows each schema to be concerned with only its own language definition,<br />

and the NRL file routes the schema validator to the correct schema file based on the namespace of that element.<br />

This <strong>XML</strong> format is schema-language agnostic and works for just about any schema language.
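
The routing idea can be sketched as a simple lookup table. The code below does not read or write NRL syntax; it only imitates the dispatch step, mapping each element's namespace to a schema file and grouping element names accordingly. The function names, namespace URIs and schema file names are hypothetical.<br />

```python
import xml.etree.ElementTree as ET


def namespace_of(elem: ET.Element) -> str:
    """Extract the namespace URI from ElementTree's '{uri}local' tag spelling."""
    return elem.tag[1:].split("}", 1)[0] if elem.tag.startswith("{") else ""


def route(root: ET.Element, routes: dict[str, str]) -> dict[str, list[str]]:
    """Group element names by the schema file their namespace is routed to."""
    plan: dict[str, list[str]] = {}
    for elem in root.iter():
        schema = routes.get(namespace_of(elem))
        if schema is not None:
            local_name = elem.tag.split("}", 1)[-1]
            plan.setdefault(schema, []).append(local_name)
    return plan


doc = ET.fromstring(
    '<x:html xmlns:x="urn:example:xhtml" xmlns:s="urn:example:svg">'
    '<x:body><s:svg><s:circle/></s:svg></x:body></x:html>'
)
print(route(doc, {"urn:example:xhtml": "xhtml.rng", "urn:example:svg": "svg.xsd"}))
```

Because the table holds only file paths, the two schemas in the example can be written in different schema languages, which is the sense in which the approach is schema-language agnostic.<br />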


See also<br />

• Document Type Definition<br />

• Document Structure Description<br />

• W3C <strong>XML</strong> Schema<br />

• RELAX NG<br />

• Schematron<br />

• Namespace Routing <strong>Language</strong><br />

• Namespace-based Validation Dispatching <strong>Language</strong><br />

References<br />

• Comparative Analysis of Six <strong>XML</strong> Schema <strong>Language</strong>s [3] by Dongwon Lee, Wesley W. Chu, In ACM SIGMOD<br />

Record, Vol. 29, No. 3, page 76-87, September 2000<br />

• Taxonomy of <strong>XML</strong> Schema <strong>Language</strong>s using Formal <strong>Language</strong> Theory [4] by Makoto Murata, Dongwon Lee,<br />

Murali Mani, Kohsuke Kawaguchi, In ACM Trans. on Internet Technology (TOIT), Vol. 5, No. 4, page 1-45,<br />

November 2005<br />

[1] While annotations in RELAX NG can support default attribute values, the RELAX NG specification does not mandate that a validator<br />

provide this ability to modify an <strong>XML</strong> infoset as part of validation. The WXS specification does mandate this behavior. An additional<br />

specification associated with RELAX NG does provide this ability. See RELAX NG DTD Compatibility (default value)<br />

(http://www.oasis-open.org/committees/relax-ng/compatibility.html#default-value).<br />

[2] James Clark (co-creator of RELAX NG). RELAX NG and W3C <strong>XML</strong> Schema<br />

(http://www.imc.org/ietf-xml-use/mail-archive/msg00217.html)<br />


<strong>XML</strong> Studio 171<br />

<strong>XML</strong> Studio<br />

[Figure: Editing an <strong>XML</strong> Schema in <strong>XML</strong> Studio]<br />

Developer(s): Liquid Technologies<br />

Operating system: Microsoft Windows<br />

Type: <strong>XML</strong> Editor<br />

License: EULA<br />

Website: [1]<br />

Liquid <strong>XML</strong> Studio is an <strong>XML</strong> Editor and Integrated Development Environment (IDE) from Liquid Technologies.<br />

Liquid <strong>XML</strong> Studio allows developers to create <strong>XML</strong>-based and Web services applications using technologies such<br />

as <strong>XML</strong>, <strong>XML</strong> Schema, XSLT, XPath, WSDL, and SOAP [2]. Liquid <strong>XML</strong> Studio is also available as a plug-in for<br />

Microsoft Visual Studio [3].<br />

Editions<br />

• Starter Edition<br />

• Designer Edition. Adds Visual Studio Integration and an <strong>XML</strong> Differencing tool.<br />

• Developer Edition. Adds Code generation to the features found in the Designer Edition. The <strong>XML</strong> Data Binder<br />

generates code for C++, C#, VB.Net, Java, Silverlight & Visual Basic.<br />

Editing <strong>View</strong>s<br />

• Graphical <strong>XML</strong> Schema Editor (XSD).<br />

• <strong>XML</strong> editor - with syntax highlighting and intellisense<br />

• DTD & CSS Editor - with syntax highlighting and Validation<br />

• XSLT Editor - Test Transform, syntax highlighting, intellisense and Validation<br />

Features<br />

• XPath Expression Builder - shows the results of your queries in realtime<br />

• Web Service Call Composer - allows developers to browse and call web services<br />

• <strong>XML</strong> Sample Generator - generates sample <strong>XML</strong> from an <strong>XML</strong> Schema<br />

• XSD Documentation Generation - creates HTML documentation from an <strong>XML</strong> Schema<br />

• <strong>XML</strong> Differencing tool - visualize the differences between 2 <strong>XML</strong> files<br />

• <strong>XML</strong> Schema Code Generation (<strong>XML</strong> Data Binding) for C++, C#, Java, VB.Net & Visual Basic 6<br />

• XSLT Editor - edits and executes XSL Transforms<br />

• Fast Infoset Support - Load and Save <strong>XML</strong> as Fast InfoSet [4]


See also<br />

• Liquid Technologies<br />

• <strong>XML</strong><br />

• Category:<strong>XML</strong> editors<br />

• IDE<br />

• <strong>XML</strong> Schema<br />

• XSLT<br />

• XPath<br />

• Web services<br />

• Web Services Description <strong>Language</strong><br />

• SOAP<br />

External links<br />

• <strong>XML</strong> Studio product page [1]<br />

References<br />

[1] http://www.liquid-technologies.com/Product_XmlStudio.aspx<br />

[2] Liquid <strong>XML</strong> Studio Product Page (http://www.liquid-technologies.com/Product_XmlStudio.aspx)<br />

[3] Microsoft Visual Studio Gallery (http://visualstudiogallery.com/ExtensionDetails.aspx?ExtensionID=33d43486-e73a-4f64-a342-f32c702abc19)<br />

[4] OSS Nokalva 'Market Wire' (http://www.marketwire.com/press-release/Oss-Nokalva-714198.html)<br />

<strong>XML</strong> Telemetric and Command Exchange<br />

XTCE (for <strong>XML</strong> Telemetric and Command Exchange) is an <strong>XML</strong>-based exchange format for spacecraft telemetry and<br />

command metadata. Using XTCE, the format and content of a space system's command and telemetry links can be readily<br />

exchanged between spacecraft operators and manufacturers. XTCE was originally standardized by the OMG. In April 2007 the<br />

OMG released revision 1.1 of XTCE as an OMG available specification. Version 1.0 of the XTCE specification is a CCSDS<br />

red-book specification and version 1.1 is a candidate CCSDS blue-book specification.<br />

Overview<br />

During the entire ground system development and operation phases of a mission, telemetry and telecommand<br />

definitions may be exchanged between multiple systems and organizations. Without a standard format, databases<br />

need dedicated converters to convert between the various proprietary database formats and editors. Allowing for a<br />

common database exchange format throughout the entire mission lifecycle will significantly reduce the cost of<br />

database conversions that occur in many space projects. XTCE has been developed as part of an international<br />

cooperation involving the National Aeronautics and Space Administration, the Jet Propulsion Laboratory, the<br />

Goddard Space Flight Center, the European Space Agency, the United States Air Force and private industry


including RT Logic, Harris, SciSys, Boeing and Lockheed Martin. The standards development effort has been<br />

coordinated via the Consultative Committee for Space Data Systems and the Object Management Group. The <strong>XML</strong><br />

Telemetric and Command Exchange standard is now in active use as a means to exchange mission databases,<br />

improving interoperability while reducing mission readiness costs.<br />

External links<br />

• XTCE home [1]<br />

References<br />

• AIAA conference - SpaceOps 2006, The XTCE Standardization approach of Telemetry and Command Databases -<br />

The ESA example: http://pdf.aiaa.org/preview/CDReadyMSPOPS06_1317/PV2006_5582.pdf<br />

• AIAA conference - SpaceOps 2006, Exchanging Databases with Dissimilar Systems using CCSDS XTCE:<br />

http://pdf.aiaa.org/preview/CDReadyMSPOPS06_1317/PV2006_5801.pdf<br />

• CCSDS, MOIMS-SMC Working Group: http://cwe.ccsds.org/moims/docs/MOIMS-SMandC<br />

• GSAW conference - 2006, Exchanging Databases with Dissimilar Systems using CCSDS XTCE:<br />

http://sunset.usc.edu/gsaw/gsaw2006/s2/merri.pdf<br />

• Aerospace Conference, 2004, XTCE: a standard <strong>XML</strong>-schema for describing mission operations databases:<br />

http://ieeexplore.ieee.org/Xplore/login.jsp?url=/iel5/9422/29904/01368138.pdf<br />

• AIAA conference - SpaceOps 2006, A Model for a Spacecraft Operations <strong>Language</strong>:<br />

http://www.rheagroup.com/AIAA-2006-5708-129.pdf<br />

References<br />

[1] http://www.omg.org/space/xtce


<strong>XML</strong> template engine 174<br />

<strong>XML</strong> template engine<br />

An <strong>XML</strong> template engine (or <strong>XML</strong> template processor) is a specialized template processor for <strong>XML</strong> input and/or<br />

output, working in an <strong>XML</strong> template system context. There are two main types:<br />

• "<strong>XML</strong>-suite standards" compliant engines:<br />

• XSLT engines, also called XSLT processors<br />

• XQuery engines, also called XQuery processors<br />

• Others, like Web template engines<br />

XSLT processors<br />

XSLT processors may be delivered as standalone applications, or as software components or libraries intended for<br />

use by applications. Many web browsers and web server software have XSLT processor components built into them.<br />

Most current operating systems have an XSLT processor installed. For example, Windows XP comes with the<br />

MS<strong>XML</strong>3 library, which includes an XSLT processor.<br />

Optimizations<br />

Early XSLT processors had very few optimizations; stylesheet documents were read using the Document Object<br />

Model and the processor would act on them directly. XPath engines were also not optimized.<br />

By 2000, however, implementors saw optimization opportunities in both XPath evaluation and template rule<br />

processing. For example, the Java programming language's Transformation API for <strong>XML</strong> (TrAX), later subsumed<br />

into the Java API for <strong>XML</strong> Processing (JAXP), acknowledged one such optimization: before processing, the XSLT<br />

processor could condense the template rules and other stylesheet tree information into a single, compact Templates<br />

object, free from the constraints and bloat of standard DOMs, in an implementation-specific manner. This<br />

intermediate representation of the stylesheet tree allows for more efficient processing by potentially reducing<br />

preparation time and memory overhead. Additionally, the formal API allows for the object to be cached and reused<br />

for multiple transformations, potentially providing higher performance if several input documents are to be<br />

processed with the same XSLT stylesheet. Parallels are often drawn between this optimization and the compilation<br />

of programming language source code to bytecode: the stylesheets are said to be "compiled", even though they don't<br />

usually produce native programming language bytecode; rather, they produce intermediate structures and routines<br />

that are stored and processed internally. [1]<br />

In contrast, Eugene Kuznetsov (DataPower, IBM) and Jacek Ambroziak (Sun Microsystems: XSLT, Ambrosoft:<br />

Gregor/XSLT) independently created the industry's first optimizing compilers that produce executable<br />

binary output. The approach has two major benefits: 1) the transformation executable can be run anywhere: servers,<br />

mobile devices, embedded environments lacking memory for the complete interpreter/compiler system, and 2) the<br />

transformation performance may reach the highest possible levels. Optimized compilation leads to the<br />

fastest transformation execution only when it is complemented by equally careful runtime system design.<br />

XPath evaluation also has room for significant optimizations, and most processor vendors have implemented at least<br />

some of them, for speed. For example, in &lt;xsl:if test="/some/nodes"&gt;, the test will evaluate to true if /some/nodes<br />

identifies any nodes, so evaluation can stop as soon as the first matching node is found; continuing to look for the<br />

entire set of matching nodes would not change the result. Similar optimizations can be undertaken when processing<br />

xsl:when and xsl:value-of, as well as expressions relying on, either implicitly or explicitly, string(), boolean(), or<br />

number(), and those that use numeric and position()/last()-based predicates.
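
The early-exit idea behind an existence test can be sketched in a few lines. This is not how any particular XSLT processor is implemented; it only contrasts a lazy search that stops at the first match with one that collects every match first. It uses Python's standard ElementTree module, whose path language is a small subset of XPath and does not accept absolute paths on an element, so the example uses a relative path. The function names are assumptions.<br />

```python
import xml.etree.ElementTree as ET


def exists(root: ET.Element, path: str) -> bool:
    """Boolean test: pull at most one node from the lazy match iterator."""
    return next(root.iterfind(path), None) is not None


def exists_naive(root: ET.Element, path: str) -> bool:
    """Same answer, but builds the complete list of matches first."""
    return len(root.findall(path)) > 0


root = ET.fromstring("<some><nodes/><nodes/><other/></some>")
print(exists(root, "nodes"), exists_naive(root, "nodes"))
```

Both functions always agree; the first simply avoids visiting the remaining candidates once the answer is known.<br />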


Implementations<br />

Some of these are only libraries for specific programming languages, but some form the basis for command<br />

line or shell script utilities for one or more operating systems. Such utilities are either bundled with the<br />

libraries or independently maintained, and some are incorporated into other applications, such as database<br />

engines and web browsers, in order to add XSLT functionality to them. With the exception of web browsers,<br />

such utilities and applications are not listed here.<br />

Implementations for Java<br />

Xalan: Xalan-Java [2]<br />

SAXON by Michael Kay<br />

Gregor/XSLT [3] optimizing compiler and runtime by Jacek Ambroziak<br />

XT [4] originally by James Clark<br />

Oracle XSLT, in the Oracle XDK [5]<br />

Implementations for the .NET Framework<br />

Saxon .NET SourceForge Project Page [6] , an IKVM.NET-based port of Dr. Michael Kay's and<br />

Saxonica's Saxon Processor provides XSLT 2.0, XPath 2.0, and XQuery 1.0 support on the .NET<br />

platform.<br />

The .NET System.Xml assembly provides a compiled XSLT 1.0 implementation, as well as an<br />

interpreted XSLT 1.0 implementation.<br />

Implementations for C or C++<br />

Xalan: Xalan-C++ [7]<br />

libxslt the XSLT C library for GNOME<br />

Sablotron [8] , which is integrated into PHP4<br />

XJR [9] , with XSLT 2.0, XPath2.0, and JSON support<br />

Implementations for Perl<br />

<strong>XML</strong>::LibXSLT [10] is a Perl interface to the libxslt C library<br />

<strong>XML</strong>::Sablotron [11] is a Perl interface to the Sablotron [8] processor<br />

Implementations for PHP<br />

XSLT [12] is the PHP4 interface to the Sablotron [8] processor<br />

XSL [13] is the new interface to XSL introduced in PHP5. The extension uses the libxslt library.<br />

Implementations for Python<br />

4XSLT, in the 4Suite [14] toolkit by Fourthought, Inc.<br />

lxml [15] is a Pythonic wrapper of the libxslt C library<br />

Implementations for Ruby<br />

Ruby/XSLT [16] is a simple XSLT class based on libxml and libxslt<br />

Sablotron module for Ruby [17] is a Ruby interface to Sablotron<br />

Implementations for Tcl<br />

TclXSLT [18] wraps the libxslt library.<br />

tDOM [19] is a generic <strong>XML</strong> package, based on the expat library, that includes an XSLT<br />

implementation. In 2003, it was deemed "very probably the fastest available open source XSLT<br />

implementation, especially for bigger source files". [20]<br />

Implementations for JavaScript


Google AJAXSLT [21] is an implementation of XSLT in JavaScript, intended for use in Ajax<br />

applications.<br />

Implementations for specific operating systems<br />

Microsoft's MS<strong>XML</strong> library may be used in various Microsoft Windows application development<br />

environments and languages, such as Visual Basic, C, and JScript.<br />

Microsoft offers a new XSLT processor in the System.Xml component of the .NET Framework.<br />

Implementations integrated into web browsers<br />

(Comparison of layout engines (<strong>XML</strong>))<br />

Mozilla has native XSLT support [22] based on TransforMiiX.<br />

Safari 1.2+ has native XSLT support, but Safari 1.2 is unable to perform XSL transformations via<br />

JavaScript [23], a limitation that does not occur in Mozilla, Internet Explorer, or Safari 3. This limits<br />

the capabilities of Ajax applications that would run in Safari 2. Safari's <strong>XML</strong> parser (possibly in all versions) is<br />

also not standards-compliant; it will parse <strong>XML</strong> strings according to HTML rules. Therefore, under<br />

certain circumstances, it will omit data from the DOM tree if it encounters malformed "HTML", even<br />

though it actually encountered valid <strong>XML</strong>. These errors will propagate to XSL-processed DOM trees.<br />

X-Smiles has native XSLT support.<br />

Opera has had partial native XSLT support since Version 9. Notable omissions include the<br />

document() function.<br />

Internet Explorer 6 supports XSLT 1.0 via the MS<strong>XML</strong> library (described above). IE5 and IE5.5 came<br />

with an earlier MS<strong>XML</strong> component that only supported an older, nonrecommended dialect of XSLT. A<br />

newer version of MS<strong>XML</strong> can be downloaded and installed separately to enable IE5 and IE5.5 to<br />

support XSLT 1.0 through scripting, and if certain Windows Registry keys are modified, the newer<br />

library will replace the older version as the default used by IE.<br />

References<br />

[1] Saxon: Anatomy of an XSLT processor (http://www-128.ibm.com/developerworks/xml/library/x-xslt2/) - An article describing the<br />

implementation and optimization details of a popular Java-based XSLT processor.<br />

[2] http://xml.apache.org/xalan-j/<br />

[3] http://ambrosoft.com/<br />

[4] http://www.blnz.com/xt/<br />

[5] http://www.oracle.com/technology/tech/xml/xdkhome.html<br />

[6] http://saxon.sourceforge.net/<br />

[7] http://xml.apache.org/xalan-c/<br />

[8] http://www.gingerall.org/sablotron.html<br />

[9] https://www.p6r.com/software/xjr.html<br />

[10] http://search.cpan.org/~msergeant/XML-LibXSLT-1.57/LibXSLT.pm<br />

[11] http://search.cpan.org/~pavelh/XML-Sablotron-1.01/Sablotron.pm<br />

[12] http://no.php.net/manual/en/ref.xslt.php<br />

[13] http://no.php.net/manual/en/book.xsl.php<br />

[14] http://4suite.org/<br />

[15] http://codespeak.net/lxml/<br />

[16] http://raa.ruby-lang.org/project/ruby-xslt/<br />

[17] http://www.rubycolor.org/sablot/<br />

[18] http://tclxml.sourceforge.net/tclxslt.html<br />

[19] http://www.tdom.org/<br />

[20] Loewer, Jochen; Ade, Rolf. "tDOM manual: tDOM Overview" (http://www.tdom.org/). Retrieved 2009-11-12.<br />

[21] http://goog-ajaxslt.sourceforge.net/<br />

[22] http://www.mozilla.org/projects/xslt/<br />

[23] http://developer.apple.com/internet/safari/faq.html#anchor21


<strong>XML</strong> tree 177<br />

<strong>XML</strong> tree<br />

<strong>XML</strong> documents have a hierarchical structure and can conceptually be interpreted as a tree structure, called an <strong>XML</strong><br />

tree.<br />

Unlike an ordinary tree structure, this tree cannot be described in terms of just a root, nodes and leaves. Although there is no<br />

consensus on the terminology used for <strong>XML</strong> trees, at least two standard terminologies exist:<br />

• The terminology used in the XPath Data Model<br />

• The terminology used in the <strong>XML</strong> Information Set.<br />
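
A short sketch shows why "root, nodes and leaves" is too coarse: even a tiny document contains several kinds of tree items. The example uses Python's standard ElementTree module, whose object model is its own and matches neither the XPath Data Model nor the XML Information Set exactly; the labels "element", "attribute" and "text" and the function name are chosen here only for illustration.<br />

```python
import xml.etree.ElementTree as ET


def outline(elem: ET.Element, depth: int = 0) -> list[str]:
    """List the elements, attributes and text found in a tree, indented by depth."""
    pad = "  " * (depth + 1)
    lines = ["  " * depth + f"element {elem.tag}"]
    for name, value in elem.attrib.items():
        lines.append(pad + f"attribute {name}={value!r}")
    if elem.text and elem.text.strip():
        lines.append(pad + f"text {elem.text.strip()!r}")
    for child in elem:
        lines.extend(outline(child, depth + 1))
        # Text following a child element belongs to the parent's content.
        if child.tail and child.tail.strip():
            lines.append(pad + f"text {child.tail.strip()!r}")
    return lines


tree = ET.fromstring('<book id="1"><title>XML</title></book>')
print("\n".join(outline(tree)))
```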

<strong>XML</strong> validation<br />

<strong>XML</strong> validation is the process of checking a document written in <strong>XML</strong> (eXtensible <strong>Markup</strong> <strong>Language</strong>) to confirm<br />

that it is both "well-formed" and "valid", meaning that it follows a defined structure. A "well-formed" document<br />

follows the basic syntactic rules of <strong>XML</strong>, which are the same for all <strong>XML</strong> documents. [1] A valid document also<br />

respects the rules dictated by a particular DTD or <strong>XML</strong> schema, which reflect application-specific choices. [2]<br />

In addition, extended tools are available, such as the OASIS CAM standard specification, that provide contextual<br />

validation of content and structure that is more flexible than basic schema validation.<br />

xmllint is a command-line <strong>XML</strong> tool that can perform <strong>XML</strong> validation. It can be found in UNIX / Linux<br />

environments. For example, to validate a file called example.xml against the DTD it declares:<br />

xmllint --valid --noout example.xml<br />
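
The distinction between well-formedness and validity can also be seen from code. The sketch below covers only the first half: it uses the parser in Python's standard library, which reports syntax errors but does not validate against a DTD or schema. The function name is an assumption.<br />

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if the text parses as XML; says nothing about validity."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


print(is_well_formed("<a><b/></a>"))   # properly nested
print(is_well_formed("<a><b></a>"))    # <b> is never closed
```

Checking validity in addition requires a validating tool such as xmllint or a schema-aware library.<br />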

External links<br />

Example C program<br />

• Validate <strong>XML</strong> against XSD in C [3] (using libxml)<br />

<strong>XML</strong> toolkit<br />

• The <strong>XML</strong> C parser and toolkit of Gnome [4] – libxml includes xmllint<br />

• Windows port of libxml [5] – maintained by Igor Zlatkovic<br />

Online validators for <strong>XML</strong> files<br />

• http://www.xmlvalidation.com/<br />

• http://www.stg.brown.edu/service/xmlvalid/<br />

• http://www.jcam.org.uk<br />

Articles discussing <strong>XML</strong> validation<br />

• DEVX March, 2009 - Taking <strong>XML</strong> Validation to the Next Level: Introducing CAM [6]


References<br />

[1] "Well-Formed <strong>XML</strong> Documents" (http://www.w3.org/TR/2004/REC-xml11-20040204/#sec-well-formed). Extensible <strong>Markup</strong> <strong>Language</strong><br />

(<strong>XML</strong>) 1.1. W3C. 2004.<br />

[2] "Constraints and Validation Rules" (http://www.w3.org/TR/xmlschema-1/#concepts-schemaConstraints). <strong>XML</strong> Schema Part 1:<br />

Structures Second Edition. W3C. 2004.<br />

[3] http://knol2share.blogspot.com/2009/05/validate-xml-against-xsd-in-c.html<br />

[4] http://xmlsoft.org/xmldtd.html<br />

[5] http://www.zlatkovic.com/libxml.en.html<br />

[6] http://www.devx.com/xml/Article/41066<br />

<strong>XML</strong>-Enabled Networking<br />

<strong>XML</strong> Enabled Networking provides an abstraction layer that exists alongside the traditional Internet Protocol (IP)<br />

network. This layer addresses the security, incompatibility and latency issues encumbering <strong>XML</strong> messages, web<br />

services and service oriented architectures (SOAs).<br />

History of <strong>XML</strong> Enabled Networking<br />

Many organizations have adopted <strong>XML</strong> technologies - often as Web services or service oriented architectures<br />

(SOAs) - as the standard for new application development and integration. Applications based on <strong>XML</strong> and Web<br />

services offer rapid interoperability and seamless service re-use by establishing a standard data format and a standard<br />

interface.<br />

With faster development cycles, less development effort and improved agility, <strong>XML</strong> and Web services enable IT to<br />

deliver more solutions to the business at a substantially lower cost. However, using these technologies also creates<br />

some potential problems:<br />

• Security concerns: <strong>XML</strong> messages are text-based, human readable, verbose, and self-describing. An <strong>XML</strong><br />

message could include descriptions of identities and credentials used to authenticate services, signatures requiring<br />

verification etc. <strong>XML</strong> by itself does not provide an infrastructure for integrating with multiple identity/access<br />

control systems across the organization, ensuring trust and compliance for <strong>XML</strong> message processing, or<br />

protecting the organization from the threats that malicious individuals could introduce into the organization with<br />

<strong>XML</strong>.<br />

• Incompatibilities: Many <strong>XML</strong> standards have emerged. <strong>XML</strong> messages use a variety of security standards,<br />

transport protocols, credential types and data structures. Web service developers need some way to mediate<br />

between these different standards and protocols, especially when they are integrating with business partners who<br />

may employ entirely different standards and protocols.<br />

• Application latency: <strong>XML</strong> messages can consume significant processing resources from application servers,<br />

lowering performance for the <strong>XML</strong>-based service and for other applications that run on the same platform.<br />

<strong>XML</strong> Enabled Networking attempts to address these issues by creating an abstraction layer that exists alongside the<br />

traditional Internet Protocol (IP) network to provide security and access enforcement, accelerated <strong>XML</strong> message<br />

processing, mediation between standards and protocols, policy control and auditing. <strong>XML</strong> Enabled Networks have<br />

typically been sold as network appliances. Initially they required application-specific integrated circuits, but<br />

appliances that run on standards-based hardware and operating systems are now available.



Common Features of <strong>XML</strong> Enabled Networking<br />

• It is powered by hardened network appliances, ready to incorporate into the network with minimal disruption<br />

• <strong>XML</strong> Enabled Networking appliances have software to make the appliances easy to install, configure and manage<br />

• They can validate <strong>XML</strong> messages for well-formedness as they enter or exit the appliance<br />

• They can convert <strong>XML</strong> to any data format<br />

• They have built-in storage capabilities to enable on-device logging for compliance and debugging purposes.<br />

• They have built-in support for many <strong>XML</strong> standards such as XSLT, XPath, SOAP and WS-Security<br />

• They are easily upgradeable<br />

Classification of <strong>XML</strong> Enabled Networking<br />

<strong>XML</strong> Security Gateways or <strong>XML</strong> Firewalls offer comprehensive <strong>XML</strong> security processing. <strong>XML</strong> Security Gateways<br />

include acceleration and integration functionality. Enterprise class <strong>XML</strong> Security Gateways include robust policy<br />

management, correlated event/message/policy logging for visibility and extensibility frameworks.<br />

<strong>XML</strong> Routers deliver robust access control and integration with identity authorities with acceleration and integration<br />

functionality. Enterprise class <strong>XML</strong> Routers include robust policy management, correlated event/message/policy<br />

logging for visibility and extensibility frameworks.<br />

<strong>XML</strong> Accelerators optimize both message throughput and server performance for <strong>XML</strong> operations including schema<br />

validation, encryption/decryption, authentication, signing, data transformation and protocol mediation. Enterprise<br />

class <strong>XML</strong> Accelerators include robust policy management, correlated event/message/policy logging for visibility<br />

and extensibility frameworks.<br />

<strong>XML</strong> Enabled Networking vendors<br />

• Citrix Systems<br />

• DataPower (IBM)<br />

• F5 Networks<br />

• Forum Systems<br />

• Layer 7 Technologies<br />

• Reactivity, Inc. (Cisco [1] )<br />

• Solace Systems<br />

• Sonoa Systems<br />

• Strangeloop Networks<br />

• Vordel<br />

• Zeus Systems<br />

See also<br />

<strong>XML</strong><br />

SOAP<br />

WS-Security<br />

<strong>XML</strong> appliance<br />

References<br />

[1] http://newsroom.cisco.com/dlls/2007/corp_022107.html


<strong>XML</strong>-Retrieval 180<br />

<strong>XML</strong>-Retrieval<br />

<strong>XML</strong> Retrieval, or <strong>XML</strong> Information Retrieval, is the content-based retrieval of documents structured with <strong>XML</strong><br />

(eXtensible <strong>Markup</strong> <strong>Language</strong>). As such it is used for computing relevance of <strong>XML</strong> documents. [1]<br />

Queries<br />

Most <strong>XML</strong> retrieval approaches compute relevance with techniques from information retrieval (IR), e.g. by<br />

computing the similarity between a query consisting of keywords (query terms) and the document. However, in<br />

<strong>XML</strong>-Retrieval the query can also contain structural hints. So-called "content and structure" (CAS) queries enable<br />

users to specify what structure the requested content can or must have.<br />

Exploiting <strong>XML</strong> structure<br />

Taking advantage of the self-describing structure of <strong>XML</strong> documents can improve the search for <strong>XML</strong> documents<br />

significantly. This includes the use of CAS queries, assigning different weights to different <strong>XML</strong> elements, and the<br />

focused retrieval of subdocuments.<br />

Ranking<br />

Ranking in <strong>XML</strong>-Retrieval can incorporate both content relevance and structural similarity, which is the<br />

resemblance between the structure given in the query and the structure of the document. Also, the retrieval units<br />

resulting from an <strong>XML</strong> query may not always be entire documents, but can be any deeply nested <strong>XML</strong> elements, i.e.<br />

dynamic documents. The aim is to find the smallest retrieval unit that is highly relevant. Relevance can be defined<br />

according to the notion of specificity, which is the extent to which a retrieval unit focuses on the topic of request. [2]<br />
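The ranking idea described above can be sketched as follows. This is only an illustration under stated assumptions: the linear weighting with `alpha`, the `threshold`, the `size` measure and the field names `content` and `structure` are all hypothetical and are not taken from INEX or from any particular search engine.

```javascript
// Combine content relevance and structural similarity into one score.
// Both inputs are assumed to be normalized to the range [0, 1];
// alpha (also in [0, 1]) is an assumed weight for content relevance.
function combinedScore(unit, alpha) {
  return alpha * unit.content + (1 - alpha) * unit.structure;
}

// Among the units whose combined score reaches the threshold,
// return the smallest one, or null if none qualifies.
function pickRetrievalUnit(units, alpha, threshold) {
  var best = null;
  for (var i = 0; i < units.length; i++) {
    var unit = units[i];
    if (combinedScore(unit, alpha) < threshold) {
      continue; // not relevant enough
    }
    if (best === null || unit.size < best.size) {
      best = unit; // smaller unit that is still relevant enough
    }
  }
  return best;
}

// Hypothetical candidate units: a whole document, a section and a paragraph.
var units = [
  { path: "/article", size: 5000, content: 0.9, structure: 0.5 },
  { path: "/article/sec[2]", size: 800, content: 0.8, structure: 1.0 },
  { path: "/article/sec[2]/p[1]", size: 60, content: 0.1, structure: 1.0 }
];
var chosen = pickRetrievalUnit(units, 0.5, 0.6);
```

With these made-up numbers the section is preferred over the whole article because it is smaller and still scores above the threshold, while the paragraph is too weak on content to qualify.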

Existing <strong>XML</strong> search engines<br />

An overview of two potential approaches is available. [3] [4] The INitiative for the Evaluation of <strong>XML</strong>-Retrieval<br />

(INEX) was founded in 2002 and provides a platform for evaluating such algorithms. [2] Three different areas<br />

influence <strong>XML</strong>-Retrieval: [5]<br />

Traditional <strong>XML</strong> query languages<br />

Query languages such as the W3C standard XQuery [6] support complex queries, but only look for exact matches.<br />

They therefore need to be extended to allow vague search with relevance computation. Most <strong>XML</strong>-centered<br />

approaches presuppose fairly exact knowledge of the documents' schemas. [7]<br />

Databases<br />

Classic database systems have added the ability to store semi-structured data, [5] which led to the development<br />

of <strong>XML</strong> databases. These are often very formal, concentrate more on searching than on ranking, and are used by<br />

experienced users able to formulate complex queries.<br />

Information retrieval<br />

Classic information retrieval models such as the vector space model provide relevance ranking, but do not include<br />

document structure; only flat queries are supported. Also, they apply a static document concept, so retrieval units<br />

usually are entire documents. [7] They can be extended to consider structural information and dynamic document<br />

retrieval. Some approaches extend the vector space model by using document subtrees<br />

(index terms plus structure) as dimensions of the vector space. [8]
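To make the extended vector space idea more concrete, here is a minimal sketch that treats each combination of element path and index term as one dimension and compares vectors by cosine similarity. The `"element/term"` key encoding and the weights are assumptions chosen for illustration; this is not the indexing scheme of the cited work.

```javascript
function hasOwn(obj, key) {
  return Object.prototype.hasOwnProperty.call(obj, key);
}

// Cosine similarity between two sparse vectors stored as plain objects.
// Keys are dimensions (here: "element/term"), values are weights.
function cosineSimilarity(a, b) {
  var dot = 0, normA = 0, normB = 0, key;
  for (key in a) {
    if (hasOwn(a, key)) {
      normA += a[key] * a[key];
      if (hasOwn(b, key)) {
        dot += a[key] * b[key];
      }
    }
  }
  for (key in b) {
    if (hasOwn(b, key)) {
      normB += b[key] * b[key];
    }
  }
  if (normA === 0 || normB === 0) {
    return 0; // an empty vector matches nothing
  }
  return dot / Math.sqrt(normA * normB);
}

// Hypothetical query: the terms "xml" and "retrieval" inside a title element.
var query = { "title/xml": 1, "title/retrieval": 1 };
// docA has the terms in its title; docB has them only in its body.
var docA = { "title/xml": 2, "title/retrieval": 1, "body/ranking": 3 };
var docB = { "body/xml": 2, "body/retrieval": 1 };
```

Because the structure is part of each dimension, `docB` shares no dimension with the query even though it contains the same terms, which is the effect a flat term vector cannot express.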



See also<br />

• Document retrieval<br />

• Information retrieval applications<br />

References<br />

[1] Winter, Judith; Drobnik, Oswald (November 9, 2007). "An Architecture for <strong>XML</strong> Information Retrieval in a Peer-to-Peer Environment" (ftp://ftp.tm.informatik.uni-frankfurt.de/pub/papers/ir/An%20Architecture%20for%20XML%20Information%20Retrieval%20in%20a%20Peer-to-Peer%20Environment_2007.pdf). ACM. Retrieved 2009-02-10.<br />

[2] Malik, Saadia; Trotman, Andrew; Lalmas, Mounia; Fuhr, Norbert (2007). "Overview of INEX 2006" (http://www.cs.otago.ac.nz/homepages/andrew/2006-10.pdf). Proceedings of the Fifth Workshop of the INitiative for the Evaluation of <strong>XML</strong> Retrieval. Retrieved 2009-02-10.<br />

[3] Amer-Yahia, Sihem; Lalmas, Mounia (2006). "<strong>XML</strong> Search: <strong>Language</strong>s, INEX and Scoring" (http://www.sigmod.org/record/issues/0612/p16-article-yahia.pdf). SIGMOD Rec. Vol. 35, No. 4. Retrieved 2009-02-10.<br />

[4] Pal, Sukomal (June 30, 2006). "<strong>XML</strong> Retrieval: A Survey" (http://66.102.1.104/scholar?q=cache:R6ZYFNoTRrUJ:citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.109.5986&rep=rep1&type=pdf). Technical Report, CVPR. Retrieved 2009-02-10.<br />

[5] Fuhr, Norbert; Gövert, N.; Kazai, Gabriella; Lalmas, Mounia (2003). "INEX: Initiative for the Evaluation of <strong>XML</strong> Retrieval" (http://www.is.informatik.uni-duisburg.de/bib/pdf/ir/Fuhr_etal:02a.pdf). Proceedings of the First INEX Workshop, Dagstuhl, Germany, 2002. ERCIM Workshop Proceedings, France. Retrieved 2009-02-10.<br />

[6] Boag, Scott; Chamberlin, Don; Fernández, Mary F.; Florescu, Daniela; Robie, Jonathan; Siméon, Jérôme (23 January 2007). "XQuery 1.0: An <strong>XML</strong> Query <strong>Language</strong>" (http://www.w3.org/TR/2007/REC-xquery-20070123/). W3C Recommendation. World Wide Web Consortium. Retrieved 2009-02-10.<br />

[7] Schlieder, Torsten; Meuss, Holger (2002). "Querying and Ranking <strong>XML</strong> Documents" (http://209.85.173.132/search?q=cache:KHBo9BRjO7QJ:www.cis.uni-muenchen.de/people/Meuss/Pub/JASIS02.ps.gz). Journal of the American Society for Information Science and Technology, Vol. 53, No. 6. Retrieved 2009-02-10.<br />

[8] Liu, Shaorong; Zou, Qinghua; Chu, Wesley W. (2004). "Configurable Indexing and Ranking for <strong>XML</strong> Information Retrieval" (http://www.cobase.cs.ucla.edu/tech-docs/sliu/SIGIR04.pdf). SIGIR'04. ACM. Retrieved 2009-02-10.


<strong>XML</strong>HttpRequest 182<br />

<strong>XML</strong>HttpRequest<br />


<strong>XML</strong>HttpRequest (XHR) is an API available in web browser scripting languages such as JavaScript. It is used to<br />

send HTTP or HTTPS requests directly to a web server and load the server response data directly back into the<br />

script. [1] The data might be received from the server as <strong>XML</strong> text [2] or as plain text. [3] Data from the response can be<br />

used directly to alter the DOM of the currently active document in the browser window without loading a new web<br />

page document. The response data can also be evaluated by the client-side scripting. For example, if it was formatted<br />

as JSON by the web server, it can easily be converted into a client-side data object for further use.<br />
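As a rough illustration of the JSON case, the sketch below converts a completed response into a client-side object. The helper name `parseJsonResponse` is hypothetical, and the code assumes the environment provides `JSON.parse`; older browsers would need a library for that step. It takes any object exposing the `readyState`, `status` and `responseText` properties of an XMLHttpRequest.

```javascript
// Hypothetical helper: turn the text of a finished response into an object.
function parseJsonResponse(xhr) {
  if (xhr.readyState !== 4) {
    throw new Error("Response has not finished loading.");
  }
  if (xhr.status < 200 || xhr.status >= 300) {
    throw new Error("Request failed with HTTP status " + xhr.status);
  }
  // responseText holds the raw response body as a string.
  return JSON.parse(xhr.responseText);
}
```

The returned object can then be used to update the DOM of the current document without loading a new page.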

<strong>XML</strong>HttpRequest has an important role in the Ajax web development technique. It is currently used by many<br />

websites to implement responsive and dynamic web applications. Examples of these web applications include Gmail,<br />

Google Maps, Facebook, and many others.<br />

History and support<br />

The concept behind the <strong>XML</strong>HttpRequest object was originally created by the developers of Outlook Web Access for<br />

Microsoft Exchange Server 2000. [4] An interface called I<strong>XML</strong>HTTPRequest was developed and implemented into<br />

the second version of the MS<strong>XML</strong> library using this concept. [4] [5] The second version of the MS<strong>XML</strong> library was<br />

shipped with Internet Explorer 5.0 in March 1999, allowing access, via ActiveX, to the I<strong>XML</strong>HTTPRequest interface<br />

using the <strong>XML</strong>HTTP wrapper of the MS<strong>XML</strong> library. [6]<br />

The Mozilla Foundation developed and implemented an interface called nsI<strong>XML</strong>HttpRequest into the Gecko layout<br />

engine. This interface was modelled to work as closely to Microsoft's I<strong>XML</strong>HTTPRequest interface as possible. [7] [8]<br />

Mozilla created a wrapper to use this interface through a JavaScript object which they called <strong>XML</strong>HttpRequest. [9]<br />

The <strong>XML</strong>HttpRequest object was accessible as early as Gecko version 0.6 released on December 6 of 2000, [10] [11]<br />

but it was not completely functional until as late as version 1.0 of Gecko released on June 5, 2002. [10] [11] The<br />

<strong>XML</strong>HttpRequest object became a de facto standard amongst other major user agents, implemented in Safari 1.2<br />

released in February 2004, [12] Konqueror, Opera 8.0 released in April 2005, [13] and iCab 3.0b352 released in<br />

September 2005. [14]<br />

The World Wide Web Consortium published a Working Draft specification for the <strong>XML</strong>HttpRequest object on April<br />

5, 2006, edited by Anne van Kesteren of Opera Software and Dean Jackson of W3C. [15] Its goal is "to document a<br />

minimum set of interoperable features based on existing implementations, allowing Web developers to use these<br />

features without platform-specific code." The last revision to the <strong>XML</strong>HttpRequest object specification was on<br />

November 19 of 2009, being a last call working draft. [16] [17]



Microsoft added the <strong>XML</strong>HttpRequest object identifier to its scripting languages in Internet Explorer 7.0 released in<br />

October 2006. [6]<br />

With the advent of cross-browser JavaScript libraries such as jQuery and the Prototype JavaScript Framework,<br />

developers can invoke <strong>XML</strong>HttpRequest functionality without coding directly to the API. Prototype provides an<br />

asynchronous requester object called Ajax.Request that wraps the browser's underlying implementation and provides<br />

access to it. [18] jQuery objects represent or wrap elements from the current client-side DOM. They all have a .load()<br />

method that takes a URI parameter and makes an <strong>XML</strong>HttpRequest to that URI, then by default places any returned<br />

HTML into the HTML element represented by the jQuery object. [19] [20]<br />

The W3C has since published another Working Draft specification for the <strong>XML</strong>HttpRequest object,<br />

"<strong>XML</strong>HttpRequest Level 2", on February 25 of 2008. [21] Level 2 consists of extended functionality to the<br />

<strong>XML</strong>HttpRequest object, including, but not currently limited to, progress events, support for cross-site requests, and<br />

the handling of byte streams. The latest revision of the <strong>XML</strong>HttpRequest Level 2 specification is that of 20th August<br />

2009, which is still a working draft. [22]<br />

Support in Internet Explorer versions 5, 5.5 and 6<br />

Internet Explorer versions 5 and 6 did not define the <strong>XML</strong>HttpRequest object identifier in their scripting languages<br />

as the <strong>XML</strong>HttpRequest identifier itself was not standard at the time of their releases. [6] Backward compatibility can<br />

be achieved through object detection if the <strong>XML</strong>HttpRequest identifier does not exist.<br />

An example of how to instantiate an <strong>XML</strong>HttpRequest object with support for Internet Explorer versions 5 and 6,<br />

using the JScript ActiveXObject constructor, is shown below. [23]<br />

/*<br />

  Provide the XMLHttpRequest constructor for IE 5.x-6.x:<br />

  Other browsers (including IE 7.x-8.x) do not redefine<br />

  XMLHttpRequest if it already exists.<br />

  This example is based on findings at:<br />

  http://blogs.msdn.com/xmlteam/archive/2006/10/23/using-the-right-version-of-msxml-in-inte<br />

*/<br />

if (typeof XMLHttpRequest == "undefined")<br />

  XMLHttpRequest = function () {<br />

    try { return new ActiveXObject("Msxml2.XMLHTTP.6.0"); }<br />

      catch (e) {}<br />

    try { return new ActiveXObject("Msxml2.XMLHTTP.3.0"); }<br />

      catch (e) {}<br />

    try { return new ActiveXObject("Msxml2.XMLHTTP"); }<br />

      catch (e) {}<br />

    //Microsoft.XMLHTTP points to Msxml2.XMLHTTP.3.0 and is redundant<br />

    throw new Error("This browser does not support XMLHttpRequest.");<br />

  };<br />

Web pages that use <strong>XML</strong>HttpRequest or <strong>XML</strong>HTTP can mitigate the current minor differences in the<br />

implementations either by encapsulating the <strong>XML</strong>HttpRequest object in a JavaScript wrapper, or by using an<br />

existing framework that does so. In either case, the wrapper should detect the abilities of current implementation and<br />

work within its requirements.
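One way such a wrapper might look is sketched below. Unlike the snippet above, it does not define a global; it returns a request object from whatever the environment offers. The function name `createRequest` and the `env` parameter (standing in for the browser's global object, normally `window`) are assumptions made so the logic can be read and exercised on its own.

```javascript
// Hypothetical factory: returns a request object without touching globals.
function createRequest(env) {
  // Prefer the native object where the user agent defines it.
  if (typeof env.XMLHttpRequest !== "undefined") {
    return new env.XMLHttpRequest();
  }
  // Otherwise fall back to the MSXML ActiveX objects, newest first.
  if (typeof env.ActiveXObject !== "undefined") {
    var progIds = ["Msxml2.XMLHTTP.6.0", "Msxml2.XMLHTTP.3.0", "Msxml2.XMLHTTP"];
    for (var i = 0; i < progIds.length; i++) {
      try {
        return new env.ActiveXObject(progIds[i]);
      } catch (e) {
        // This version is not installed; try the next one.
      }
    }
  }
  throw new Error("This browser does not support XMLHttpRequest.");
}
```

In a page this would be called as `var xhr = createRequest(window);`, keeping the detection logic in a single place.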



HTTP request<br />

The following sections demonstrate how a request using the <strong>XML</strong>HttpRequest object functions within a conforming<br />

user agent based on the W3C Working Draft. As the W3C standard for the <strong>XML</strong>HttpRequest object is still a draft,<br />

user agents may not implement every behavior in the W3C definition, and any of the following is subject to<br />

change. Extreme care should be taken when scripting with the <strong>XML</strong>HttpRequest object across<br />

multiple user agents. This article tries to list the inconsistencies between the major user agents.<br />

The open method<br />

The HTTP and HTTPS requests of the <strong>XML</strong>HttpRequest object must be initialized through the open method. This<br />

method must be invoked prior to the actual sending of a request to validate and resolve the request method, URL,<br />

and URI user information to be used for the request. This method does not assure that the URL exists or the user<br />

information is correct. This method can accept up to five parameters, but requires only two, to initialize a request.<br />

The first parameter of the method is a text string indicating the HTTP request method to use. The request methods<br />

that must be supported by a conforming user agent, defined by the W3C draft for the <strong>XML</strong>HttpRequest object, are<br />

currently listed as the following. [24]<br />

• GET (Supported by IE7+, Mozilla 1+)<br />

• POST (Supported by IE7+, Mozilla 1+)<br />

• HEAD (Supported by IE7+)<br />

• PUT<br />

• DELETE<br />

• OPTIONS (Supported by IE7+)<br />

However, request methods are not limited to the ones listed above. The W3C draft states that a browser may support<br />

additional request methods at its own discretion.<br />

The second parameter of the method is another text string, this one indicating the URL of the HTTP request. The<br />

W3C recommends that browsers should raise an error and not allow the request of a URL with either a different port<br />

or ihost URI component from the current document. [25]<br />

The third parameter, a boolean value indicating whether or not the request will be asynchronous, is not a required<br />

parameter by the W3C draft. The default value of this parameter should be assumed to be true by a W3C conforming<br />

user agent if it is not provided. An asynchronous request ("true") will not wait on a server response before continuing<br />

on with the execution of the current script. It will instead invoke the onreadystatechange event listener of the<br />

<strong>XML</strong>HttpRequest object throughout the various stages of the request. A synchronous request ("false") however will<br />

block execution of the current script until the request has been completed, thus not invoking the onreadystatechange<br />

event listener.<br />

The fourth and fifth parameters are the URI user and password, respectively. These parameters are not required and<br />

should default to the current user and password of the document if not supplied, as defined by the W3C draft.<br />
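The parameter order of the open method can be illustrated with a small helper. The helper name `openRequest`, the shape of the `options` object and the example URL are hypothetical; only the call to `open` itself follows the five-parameter form described above.

```javascript
// Hypothetical helper: initialize (but do not send) a request.
// `xhr` is an XMLHttpRequest-like object; `options.url` is required.
function openRequest(xhr, options) {
  var method = options.method || "GET";
  var isAsync = options.async !== false; // asynchronous unless explicitly false
  if (options.user !== undefined) {
    // Five-parameter form: method, URL, async flag, user, password.
    xhr.open(method, options.url, isAsync, options.user, options.password);
  } else {
    // Three-parameter form: credentials are left to the user agent.
    xhr.open(method, options.url, isAsync);
  }
  return xhr;
}
```

A call such as `openRequest(xhr, { url: "/items/42" })` would therefore initialize an asynchronous GET request.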

The setRequestHeader method<br />

Upon successful initialization of a request, the setRequestHeader method of the <strong>XML</strong>HttpRequest object can be<br />

invoked to send HTTP headers with the request. The first parameter of this method is the text string name of the<br />

header. The second parameter is the text string value. This method must be invoked for each header that needs to be<br />

sent with the request. Any headers attached here will be removed the next time the open method is invoked in a W3C<br />

conforming user agent.
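Since setRequestHeader must be called once per header, a loop over a map of names to values is a common convenience. The sketch below assumes a hypothetical helper name `applyHeaders` and must be called after `open` and before `send`.

```javascript
// Hypothetical helper: call setRequestHeader once for each entry.
function applyHeaders(xhr, headers) {
  for (var name in headers) {
    if (Object.prototype.hasOwnProperty.call(headers, name)) {
      xhr.setRequestHeader(name, headers[name]);
    }
  }
}
```

For example, `applyHeaders(xhr, { "Accept": "application/xml" })` attaches a single Accept header to an initialized request.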



The send method<br />

To send an HTTP request, the send method of the <strong>XML</strong>HttpRequest must be invoked. This method accepts a single<br />

parameter containing the content to be sent with the request. This parameter may be omitted if no content needs to be<br />

sent. The W3C draft states that this parameter may be any type available to the scripting language as long as it can be<br />

turned into a text string, with the exception of the DOM document object. If a user agent cannot stringify the<br />

parameter, then the parameter should be ignored.<br />

If the parameter is a DOM document object, a user agent should assure the document is turned into well-formed<br />

<strong>XML</strong> using the encoding indicated by the inputEncoding property of the document object. If the Content-Type<br />

request header was not added through setRequestHeader yet, it should automatically be added by a conforming user<br />

agent as "application/xml;charset=charset," where charset is the encoding used to encode the document.<br />
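For the common case of sending a text string, the following sketch posts URL-encoded form fields. The helper names `encodeForm` and `postForm` are hypothetical, and the choice of the `application/x-www-form-urlencoded` content type is an assumption about what the receiving server expects.

```javascript
// Hypothetical helper: encode an object as "key=value&key=value".
function encodeForm(fields) {
  var pairs = [];
  for (var key in fields) {
    if (Object.prototype.hasOwnProperty.call(fields, key)) {
      pairs.push(encodeURIComponent(key) + "=" + encodeURIComponent(fields[key]));
    }
  }
  return pairs.join("&");
}

// Hypothetical helper: initialize the request, set the header, send the body.
function postForm(xhr, url, fields) {
  xhr.open("POST", url, true);
  xhr.setRequestHeader("Content-Type", "application/x-www-form-urlencoded");
  xhr.send(encodeForm(fields));
}
```

The order matters: `open` comes first, headers are attached next, and `send` is called last with the content as its single parameter.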

The onreadystatechange event listener<br />

If the open method of the <strong>XML</strong>HttpRequest object was invoked with the third parameter set to true for an<br />

asynchronous request, the onreadystatechange event listener will be automatically invoked for each of the<br />

following actions that change the readyState property of the <strong>XML</strong>HttpRequest object.<br />

• After the open method has been invoked successfully, the readyState property of the <strong>XML</strong>HttpRequest object<br />

should be assigned a value of 1.<br />

• After the send method has been invoked and the HTTP response headers have been received, the readyState<br />

property of the <strong>XML</strong>HttpRequest object should be assigned a value of 2.<br />

• Once the HTTP response content begins to load, the readyState property of the <strong>XML</strong>HttpRequest object should<br />

be assigned a value of 3.<br />

• Once the HTTP response content has finished loading, the readyState property of the <strong>XML</strong>HttpRequest object<br />

should be assigned a value of 4.<br />

The major user agents are inconsistent with the handling of the onreadystatechange event listener.<br />
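The readyState values listed above are usually handled by ignoring every state except 4. A minimal sketch, with the hypothetical helper name `whenDone`:

```javascript
// Hypothetical helper: run a callback once the response has finished loading.
function whenDone(xhr, onDone) {
  xhr.onreadystatechange = function () {
    // States 1 to 3 only report progress; 4 means loading has finished.
    if (xhr.readyState === 4) {
      onDone(xhr);
    }
  };
}
```

Because user agents differ in how often they fire the listener for the intermediate states, relying only on state 4 tends to be the most portable choice.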

The HTTP response<br />

After a successful and completed call to the send method of the <strong>XML</strong>HttpRequest, if the server response was valid<br />

<strong>XML</strong> and the Content-Type header sent by the server is understood by the user agent as an Internet media type for<br />

<strong>XML</strong>, the response<strong>XML</strong> property of the <strong>XML</strong>HttpRequest object will contain a DOM document object. Another<br />

property, responseText will contain the response of the server in plain text by a conforming user agent, regardless of<br />

whether or not it was understood as <strong>XML</strong>.<br />
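A script that accepts both kinds of response might check responseXML first and fall back to responseText. The helper name `readResponse` and the shape of its return value are assumptions for illustration.

```javascript
// Hypothetical helper: prefer the parsed document, else use the plain text.
function readResponse(xhr) {
  if (xhr.responseXML) {
    // The user agent recognized the response as XML and parsed it.
    return { kind: "xml", value: xhr.responseXML };
  }
  return { kind: "text", value: xhr.responseText };
}
```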

See also<br />

• Hypertext Transfer Protocol<br />

• Representational State Transfer<br />

• Ajax<br />

External links<br />

• Level 1 specification of the <strong>XML</strong>HttpRequest object from W3C [26]<br />

• Level 2 specification of the <strong>XML</strong>HttpRequest object from W3C [27]<br />

• Specification of the <strong>XML</strong>HttpRequest object for Apple developers [28]<br />

• Specification of the <strong>XML</strong>HttpRequest object for Microsoft developers [29]<br />

• Specification of the <strong>XML</strong>HttpRequest object for Mozilla developers [30]<br />

• Specification of the <strong>XML</strong>HttpRequest object for Opera developers [31]



• "Attacking AJAX Applications" [32] , a presentation given at the Black Hat security conference. Discusses several<br />

issues involving XHR and the future of cross-domain AJAX.<br />

References<br />

[1] "<strong>XML</strong>HttpRequest object explained by the W3C Working Draft" (http://www.w3.org/TR/XMLHttpRequest/). W3.org. Retrieved 2009-07-14.<br />

[2] "The response<strong>XML</strong> attribute of the <strong>XML</strong>HttpRequest object explained by the W3C Working Draft" (http://www.w3.org/TR/XMLHttpRequest/#responsexml). W3.org. Retrieved 2009-07-14.<br />

[3] "The responseText attribute of the <strong>XML</strong>HttpRequest object explained by the W3C Working Draft" (http://www.w3.org/TR/XMLHttpRequest/#responsetext). W3.org. Retrieved 2009-07-14.<br />

[4] "Article on the history of <strong>XML</strong>HTTP by an original developer" (http://www.alexhopmann.com/xmlhttp.htm). Alexhopmann.com. 2007-01-31. Retrieved 2009-07-14.<br />

[5] "Specification of the I<strong>XML</strong>HTTPRequest interface from the Microsoft Developer Network" (http://msdn.microsoft.com/en-us/library/ms759148(VS.85).aspx). Msdn.microsoft.com. Retrieved 2009-07-14.<br />

[6] Dutta, Sunava (2006-01-23). "Native <strong>XML</strong>HTTPRequest object" (http://blogs.msdn.com/ie/archive/2006/01/23/516393.aspx). IEBlog. Microsoft. Retrieved 2006-11-30.<br />

[7] "Specification of the nsI<strong>XML</strong>HttpRequest interface from the Mozilla Developer Center" (https://developer.mozilla.org/en/nsIXMLHttpRequest). Developer.mozilla.org. 2008-05-16. Retrieved 2009-07-14.<br />

[8] "Specification of the nsIJS<strong>XML</strong>HttpRequest interface from the Mozilla Developer Center" (https://developer.mozilla.org/en/NsIJSXMLHttpRequest). Developer.mozilla.org. 2009-05-03. Retrieved 2009-07-14.<br />

[9] "Specification of the <strong>XML</strong>HttpRequest object from the Mozilla Developer Center" (https://developer.mozilla.org/en/XmlHttpRequest). Developer.mozilla.org. 2009-05-03. Retrieved 2009-07-14.<br />

[10] "Version history for the Mozilla Application Suite" (http://www.mozilla.org/releases/history.html). Mozilla.org. Retrieved 2009-07-14.<br />

[11] "Downloadable, archived releases for the Mozilla browser" (http://www-archive.mozilla.org/releases/). Archive.mozilla.org. Retrieved 2009-07-14.<br />

[12] "Archived news from Mozillazine stating the release date of Safari 1.2" (http://weblogs.mozillazine.org/hyatt/archives/2004_02.html). Weblogs.mozillazine.org. Retrieved 2009-07-14.<br />

[13] "Press release stating the release date of Opera 8.0 from the Opera website" (http://www.opera.com/press/releases/2005/06/16/). Opera.com. 2005-04-19. Retrieved 2009-07-14.<br />

[14] Soft-Info.org. "Detailed browser information stating the release date of iCab 3.0b352 from" (http://www.soft-info.org/browsers/icab-10109.html). Soft-Info.com. Retrieved 2009-07-14.<br />

[15] "Specification of the <strong>XML</strong>HttpRequest object from the Level 1 W3C Working Draft released on April 5th, 2006" (http://www.w3.org/TR/2006/WD-XMLHttpRequest-20060405/). W3.org. Retrieved 2009-07-14.<br />

[16] "<strong>XML</strong>HttpRequest W3C Working Draft 19 November 2009" (http://www.w3.org/TR/2009/WD-XMLHttpRequest-20091119/). W3.org. Retrieved 2009-12-17.<br />

[17] "W3C Process Document, Section 7.4.2 Last Call Announcement" (http://www.w3.org/2005/10/Process-20051014/tr#last-call). W3.org. Retrieved 2009-12-17.<br />

[18] Porteneuve, Christophe (2007). "9". in Daniel H Steinberg. Raleigh, North Carolina: Pragmatic Bookshelf. pp. 183. ISBN 1-934356-01-8.<br />

[19] Chaffer, Jonathan; Karl Swedberg (2007). Learning jQuery. Birmingham: Packt Publishing. pp. 107. ISBN 978-1-847192-50-9.<br />

[20] Chaffer, Jonathan; Karl Swedberg (2007). jQuery Reference Guide. Birmingham: Packt Publishing. pp. 156. ISBN 978-1-847193-81-0.<br />

[21] "Specification of the <strong>XML</strong>HttpRequest object from the Level 2 W3C Working Draft released on February 25th, 2008" (http://www.w3.org/TR/2008/WD-XMLHttpRequest2-20080225/). W3.org. Retrieved 2009-07-14.<br />

[22] "<strong>XML</strong>HttpRequest Level 2, W3C Working Draft 20 August 2009" (http://www.w3.org/TR/XMLHttpRequest2/). W3.org. Retrieved 2010-04-08.<br />

[23] "Ajax Reference (<strong>XML</strong>HttpRequest object)" (http://www.javascriptkit.com/jsref/ajax.shtml). JavaScript Kit. 2008-07-22. Retrieved 2009-07-14.<br />

[24] "Dependencies of the <strong>XML</strong>HttpRequest object explained by the W3C Working Draft" (http://www.w3.org/TR/XMLHttpRequest/#dependencies). W3.org. Retrieved 2009-07-14.<br />

[25] "The "open" method of the <strong>XML</strong>HttpRequest object explained by the W3C Working Draft" (http://www.w3.org/TR/XMLHttpRequest/#the-open-method). W3.org. Retrieved 2009-10-13.<br />

[26] http://www.w3.org/TR/XMLHttpRequest/<br />

[27] http://www.w3.org/TR/XMLHttpRequest2/<br />

[28] http://developer.apple.com/internet/webcontent/xmlhttpreq.html<br />

[29] http://msdn.microsoft.com/en-us/library/ms535874(VS.85).aspx<br />

[30] https://developer.mozilla.org/en/XMLHttpRequest<br />

[31] http://www.opera.com/docs/specs/opera9/xhr/<br />

[32] http://www.isecpartners.com/files/iSEC-Attacking_AJAX_Applications.BH2006.pdf


<strong>XML</strong>Socket 187<br />

<strong>XML</strong>Socket<br />

<strong>XML</strong>Socket is a class in ActionScript which allows Adobe Flash content to use socket communication, via TCP<br />

stream sockets. It can be used for plain text, although, as the name implies, it was made for <strong>XML</strong>. It is often used in<br />

chat applications and multiplayer games.<br />

Examples<br />

ActionScript 2.0<br />

For a simple Hello, World! application in ActionScript 2.0, you could use the code below:<br />

var xmlSocket:XMLSocket=new XMLSocket();<br />

xmlSocket.onConnect=function() {<br />

    xmlSocket.send(new XML("Hello, World!"));<br />

}<br />

xmlSocket.onXML=function(myXML) {<br />

    trace(myXML.firstChild.childNodes[0].firstChild.nodeValue);<br />

    xmlSocket.close();<br />

}<br />

xmlSocket.connect("localhost",8463);<br />

This would result in the output window of the Flash IDE opening and displaying "Hello, World!", assuming that a<br />

socket server was running on port 8463 of the local machine, and was echoing everything sent to it. <br />
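The echo server assumed above is not part of the article. One possible sketch, written here in JavaScript for Node.js rather than ActionScript, is shown below. XMLSocket terminates each message with a zero byte, so the server buffers incoming text and echoes back every complete message with the same terminator. The function names `extractMessages` and `createEchoServer` are hypothetical, and the sketch does not handle Flash Player policy file requests.

```javascript
// Split buffered text into complete zero-terminated messages.
// Anything after the last terminator is an incomplete message.
function extractMessages(buffered) {
  var parts = buffered.split("\0");
  var rest = parts.pop(); // text after the last terminator
  return { messages: parts, rest: rest };
}

// Build (but do not start) a TCP server that echoes each message.
function createEchoServer() {
  var net = require("net"); // Node.js standard library
  return net.createServer(function (socket) {
    var buffered = "";
    socket.setEncoding("utf8");
    socket.on("data", function (chunk) {
      var result = extractMessages(buffered + chunk);
      buffered = result.rest;
      for (var i = 0; i < result.messages.length; i++) {
        socket.write(result.messages[i] + "\0"); // echo with terminator
      }
    });
  });
}

// To listen on the port used in the example above:
// createEchoServer().listen(8463);
```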

External links<br />

• <strong>XML</strong> Sockets: the basics of multiplayer games [1] , gotoAndPlay Flash Tutorials<br />

• <strong>XML</strong>Socket Simplified [2] , Heliant Whitepaper for ActionScript<br />

• Utilizing Flash Player <strong>XML</strong>Sockets for JavaScript applications [3]<br />

• Palabre, Simple open source <strong>XML</strong> socket server for Flash written in python [4]<br />

References<br />

[1] http://www.gotoandplay.it/_articles/2003/12/xmlSocket.php<br />

[2] http://www.heliant.net/~stsai/code/<br />

[3] http://www.devpro.it/xmlsocket/<br />

[4] http://palabre.gavroche.net


XPath 188<br />

XPath<br />

Paradigm: Query language<br />

Appeared in: 1999<br />

Developer: W3C<br />

Stable release: 2.0 (January 23, 2007)<br />

Major implementations: JavaScript, C#, Java<br />

Influenced by: XSLT, XPointer<br />

Influenced: <strong>XML</strong> Schema, XForms<br />

XPath, the <strong>XML</strong> Path <strong>Language</strong>, is a query language for selecting nodes from an <strong>XML</strong> document. In addition,<br />

XPath may be used to compute values (e.g., strings, numbers, or Boolean values) from the content of an <strong>XML</strong><br />

document. XPath was defined by the World Wide Web Consortium (W3C).<br />

History<br />

The XPath language is based on a tree representation of the <strong>XML</strong> document, and provides the ability to navigate<br />

around the tree, selecting nodes by a variety of criteria. [1] In popular use (though not in the official specification), an<br />

XPath expression is often referred to simply as an XPath.<br />
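As an illustration of selecting nodes by navigating the document tree, the following sketch uses Python's standard xml.etree.ElementTree module, which supports only a limited subset of XPath. The catalog document and its element names are invented for this example.<br />

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, used only for this illustration.
CATALOG = """
<catalog>
  <book lang="en"><title>Learning XML</title><price>39.95</price></book>
  <book lang="fr"><title>XML en pratique</title><price>29.50</price></book>
  <magazine><title>Markup Monthly</title></magazine>
</catalog>
"""

root = ET.fromstring(CATALOG)

# Child steps: every <title> that is a child of a <book> child of the root.
book_titles = [t.text for t in root.findall("./book/title")]

# Descendant step with an attribute predicate.
english_books = root.findall(".//book[@lang='en']")

# Any <title> anywhere below the root, returned in document order.
all_titles = [t.text for t in root.findall(".//title")]

print(book_titles)
print(all_titles)
```

A full XPath 1.0 or 2.0 processor accepts many more axes, functions and predicates than this module does.<br />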

Originally motivated by a desire to provide a common syntax and behavior model between XPointer and XSLT,<br />

subsets of the XPath query language are used in other W3C specifications such as <strong>XML</strong> Schema and XForms.<br />

Versions<br />

There are currently two versions in use.<br />

• XPath 1.0 became a Recommendation on 16 November 1999 and is widely implemented and used, either on its<br />

own (called via an API from languages such as Java, C# or JavaScript), or embedded in languages such as XSLT<br />

or XForms.<br />

• XPath 2.0 is the current version of the language; it became a Recommendation on 23 January 2007. A number of<br />

implementations exist but are not as widely used as XPath 1.0. The XPath 2.0 language specification is much<br />

larger than XPath 1.0 and changes some of the fundamental concepts of the language such as the type system.<br />

The most notable change is that XPath 2.0 has a much richer type system. [2] Every value is now a sequence (a single<br />

atomic value or node is regarded as a sequence of length one). XPath 1.0 node-sets are replaced by node sequences,<br />

which may be in any order.<br />

To support richer type sets, XPath 2.0 offers a greatly expanded set of functions and operators.<br />

XPath 2.0 is in fact a subset of XQuery 1.0. It offers a for expression which is a cut-down version of the "FLWOR"<br />

expressions in XQuery. It is possible to describe the language by listing the parts of XQuery that it leaves out: the<br />

main examples are the query prolog, element and attribute constructors, the remainder of the "FLWOR" syntax, and<br />

the typeswitch expression.



See also<br />

• XPath 1.0<br />

• XPath 2.0<br />

External links<br />

• XPath syntax [3]<br />

• XPath 1.0 specification [4]<br />

• XPath 2.0 specification [5]<br />

• What's New in XPath 2.0 [6]<br />

References<br />

[1] Article on xpath in techsoftcomputing.com<br />

[2] XPath 2.0 supports atomic types, defined as built-in types in XML Schema, and may also import user-defined types from a schema. (http://www.techsoftcomputing.com)<br />

[3] http://www.w3schools.com/XPath/xpath_syntax.asp<br />

[4] http://www.w3.org/TR/xpath<br />

[5] http://www.w3.org/TR/xpath20/<br />

[6] http://www.xml.com/pub/a/2002/03/20/xpath2.html<br />

XPath 2.0<br />

XPath 2.0 is the current version of the XPath language defined by the World Wide Web Consortium (W3C). It<br />

became a Recommendation on 23 January 2007.<br />

XPath is used primarily for selecting parts of an <strong>XML</strong> document. For this purpose the <strong>XML</strong> document is modelled as<br />

a tree of nodes. XPath allows nodes to be selected by means of a hierarchic navigation path through the document<br />

tree.<br />

The language is significantly larger than its predecessor, XPath 1.0, and some of the basic concepts such as the data<br />

model and type system have changed. The two language versions are therefore described in separate articles.<br />

XPath 2.0 is used as a sublanguage of XSLT 2.0, and it is also a subset of XQuery 1.0. All three languages share the<br />

same data model, type system, and function library, and were developed together and published on the same day.<br />

Data model<br />

Every value in XPath 2.0 is a sequence of items. The items may be nodes or atomic values. An individual node or<br />

atomic value is considered to be a sequence of length one. Sequences may not be nested.<br />
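The flat-sequence rule can be modelled in a few lines of Python. This is only a sketch of the data model, with Python lists standing in for sequences; the sequence helper is a name made up for this example and is not part of any XPath library.<br />

```python
def sequence(*items):
    """Build an XPath-2.0-style sequence: nested sequences are flattened."""
    flat = []
    for item in items:
        if isinstance(item, (list, tuple)):
            flat.extend(sequence(*item))  # sequences never nest
        else:
            flat.append(item)
    return flat


# A single item is the same thing as a sequence of length one.
singleton = sequence(42)

# Models the XPath expression (1, (2, 3), (), ((4))).
combined = sequence(1, sequence(2, 3), sequence(), sequence(sequence(4)))

print(singleton, combined)
```

Empty sequences disappear when combined, which is why the model yields four items for the second expression.<br />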

Nodes are of seven kinds, corresponding to different constructs in the syntax of <strong>XML</strong>: elements, attributes, text<br />

nodes, comments, processing instructions, namespace nodes, and document nodes. (The document node replaces the<br />

root node of XPath 1.0, because the XPath 2.0 model allows trees to be rooted at other kinds of node, notably<br />

elements.)<br />

Nodes may be typed or untyped. A node acquires a type as a result of validation against an <strong>XML</strong> Schema. If an<br />

element or attribute is successfully validated against a particular complex type or simple type defined in a schema,<br />

the name of that type is attached as an annotation to the node, and determines the outcome of operations applied to<br />

that node: for example, when sorting, nodes that are annotated as integers will be sorted as integers.<br />

Atomic values may belong to any of the 19 primitive types defined in the <strong>XML</strong> Schema specification (for example,<br />

string, boolean, double, float, decimal, dateTime, QName, and so on). They may also belong to a type derived from



one of these primitive types: either a built-in derived type such as integer or Name, or a user-defined derived type<br />

defined in a user-written schema.<br />

Type system<br />

The type system of XPath 2.0 is noteworthy for the fact that it mixes strong typing and weak typing within a single<br />

language.<br />

Operations such as arithmetic and boolean comparison require atomic values as their operands. If an operand returns<br />

a node (for example, @price * 1.2), then the node is automatically atomized to extract the atomic value. If the input<br />

document has been validated against a schema, then the node will typically have a type annotation, and this<br />

determines the type of the resulting atomic value (in this example, the price attribute might have the type decimal). If<br />

no schema is in use, the node will be untyped, and the type of the resulting atomic value will be untypedAtomic.<br />

Typed atomic values are checked to ensure that they have an appropriate type for the context where they are used:<br />

for example, it is not possible to multiply a date by a number. Untyped atomic values, by contrast, follow a weak<br />

typing discipline: they are automatically converted to a type appropriate to the operation where they are used: for<br />

example with an arithmetic operation an untyped atomic value is converted to the type double.<br />
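The mix of strong and weak typing described above can be sketched as a small Python model. The UntypedAtomic class and the multiply function are illustrative names only, and the model covers just a handful of atomic types, with Python's Decimal, float and date standing in for the XML Schema types decimal, double and date.<br />

```python
from datetime import date
from decimal import Decimal


class UntypedAtomic(str):
    """Value atomized from a node that has no schema type annotation."""


def to_operand(value):
    """Weak-typing rule: untyped values become doubles for arithmetic."""
    if isinstance(value, UntypedAtomic):
        return float(value)
    return value


def multiply(left, right):
    """Model of the '*' operator for a few atomic types."""
    left, right = to_operand(left), to_operand(right)
    numeric = (int, float, Decimal)
    # Strong-typing rule: typed operands must already be numeric.
    if not (isinstance(left, numeric) and isinstance(right, numeric)):
        raise TypeError(
            f"cannot multiply {type(left).__name__} by {type(right).__name__}"
        )
    if isinstance(left, float) or isinstance(right, float):
        return float(left) * float(right)
    return left * right


# Schema-validated price attribute: the result stays a decimal.
typed_result = multiply(Decimal("10.00"), Decimal("1.2"))

# No schema in use: the untyped value is converted to a double first.
untyped_result = multiply(UntypedAtomic("10.00"), Decimal("1.2"))
```

Passing a date as either operand raises a TypeError in this model, mirroring the rule that a date cannot be multiplied by a number.<br />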

Path expressions<br />

The location paths of XPath 1.0 are referred to in XPath 2.0 as path expressions. Informally, a path expression is a<br />

sequence of steps separated by the "/" operator, for example a/b/c (which is short for child::a/child::b/child::c). More<br />

formally, however, "/" is simply a binary operator that applies the expression on its right-hand side to each item in<br />

turn selected by the expression on the left hand side. So in this example, the expression a selects all the element<br />

children of the context node that are named a; the expression child::b is then applied to each of these nodes,<br />

selecting all the b children of the a elements; and the expression child::c is then applied to each node in this<br />

sequence, which selects all the c children of these b elements.<br />

The "/" operator is generalized in XPath 2.0 to allow any kind of expression to be used as an operand: in XPath 1.0,<br />

the right-hand side was always an axis step. For example, a function call can be used on the right-hand side. The<br />

typing rules for the operator require that the result of the first operand is a sequence of nodes. The right hand operand<br />

can return either nodes or atomic values (but not a mixture). If the result consists of nodes, then duplicates are<br />

eliminated and the nodes are returned in document order, an ordering defined in terms of the relative positions of<br />

the nodes in the original <strong>XML</strong> tree.<br />
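The behaviour of "/" as a binary operator can be approximated with Python's xml.etree.ElementTree. The sketch below is a model rather than an XPath engine: the sample document and the helper names child and slash are assumptions made for this example, and only node results are handled, not atomic values.<br />

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document for the expression a/b/c.
DOC = ET.fromstring(
    "<r><a><b><c>1</c></b><b><c>2</c><c>3</c></b></a><a><b/></a></r>"
)

# Position of every element in document order (iter() walks in that order).
ORDER = {id(node): index for index, node in enumerate(DOC.iter())}


def child(name):
    """Return a step function modelling the axis step child::name."""
    return lambda node: [kid for kid in node if kid.tag == name]


def slash(left_nodes, right):
    """Model of E1/E2: apply `right` to each left-hand node, then
    remove duplicates and sort the results into document order."""
    seen = {}
    for node in left_nodes:
        for result in right(node):
            seen[id(result)] = result
    return sorted(seen.values(), key=lambda n: ORDER[id(n)])


# a/b/c evaluated with the root element <r> as the context node.
a_nodes = child("a")(DOC)
b_nodes = slash(a_nodes, child("b"))
c_nodes = slash(b_nodes, child("c"))

print([c.text for c in c_nodes])
```

Because slash accepts any function as its right-hand operand, the same model also covers the XPath 2.0 generalization in which the right-hand side need not be an axis step.<br />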

In many cases the operands of "/" will be axis steps: these are largely unchanged from XPath 1.0, and are described<br />

in the article on XPath 1.0.<br />

Other operators<br />

Other operators available in XPath 2.0 include the following:



Operators Effect<br />

+, -, *, div, mod, idiv Arithmetic on numbers, dates, and durations<br />

=, !=, <, <=, >, >= General comparison: compare arbitrary sequences. The result is true if any pair of items, one from each sequence, satisfies the comparison<br />

eq, ne, lt, gt, le, ge Value comparison: compare single items<br />

is Compare node identity: true if both operands are the same node<br />

<<, >> Compare node position, based on document order<br />

union, intersect, except Compare sequences of nodes, treating them as sets, returning the set union, intersection, or difference<br />

and, or boolean conjunction and disjunction. Negation is achieved using the not() function.<br />

to defines an integer range, for example 1 to 10<br />

instance of determines whether a value is an instance of a given type<br />

cast as converts a value to a given type<br />

castable as tests whether a value is convertible to a given type<br />
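The difference between general and value comparison, and the behaviour of the to operator, can be sketched in Python with lists standing in for sequences. The function names are invented for this illustration, and None is used here to represent the empty sequence that a value comparison returns when an operand is empty.<br />

```python
import operator


def general_compare(op, left, right):
    """Model of =, !=, <, <=, >, >=: true if any pair of items satisfies op."""
    return any(op(x, y) for x in left for y in right)


def value_compare(op, left, right):
    """Model of eq, ne, lt, gt, le, ge: each operand is at most one item."""
    if not left or not right:
        return None  # stands for the empty sequence
    if len(left) > 1 or len(right) > 1:
        raise TypeError("value comparison requires single items")
    return op(left[0], right[0])


def to_range(start, end):
    """Model of 'start to end': an inclusive integer range."""
    return list(range(start, end + 1))


# (1, 2) = (2, 3) and (1, 2) != (2, 3) are both true under the "any pair" rule.
both_equal = general_compare(operator.eq, [1, 2], [2, 3])
both_unequal = general_compare(operator.ne, [1, 2], [2, 3])
```

One consequence of the "any pair" rule is that = and != are not opposites when an operand holds more than one item, which is why negation is written with the not() function.<br />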

Conditional expressions may be written using the syntax if (A) then B else C.<br />

XPath 2.0 also offers a for expression, which is a small subset of the FLWOR expression from XQuery. The<br />

expression for $x in X return Y evaluates the expression Y for each value in the result of expression X in turn,<br />

referring to that value using the variable reference $x.<br />
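In Python terms, the for expression behaves much like a loop whose per-item results are concatenated into one flat sequence, and the conditional expression corresponds to Python's own conditional expression. The for_expr helper below is a made-up name for a minimal model, not an XPath implementation.<br />

```python
def for_expr(items, body):
    """Model of `for $x in X return Y`: evaluate body once per item
    and concatenate the results into a single flat sequence."""
    result = []
    for x in items:
        value = body(x)
        result.extend(value if isinstance(value, list) else [value])
    return result


# for $x in (1, 2, 3) return $x * 2
doubled = for_expr([1, 2, 3], lambda x: x * 2)

# for $x in (1, 2) return ($x, $x * 10)
pairs = for_expr([1, 2], lambda x: [x, x * 10])

# for $x in (3, 12) return if ($x gt 10) then "big" else "small"
labels = for_expr([3, 12], lambda x: "big" if x > 10 else "small")

print(doubled, pairs, labels)
```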

Function library<br />

The function library in XPath 2.0 is greatly extended from the function library in XPath 1.0.<br />

The functions available include the following:<br />

Purpose Example Functions<br />

General string handling lower-case, upper-case, substring, substring-before, substring-after, translate, starts-with, ends-with, contains, string-length, concat, normalize-space, normalize-unicode<br />

Regular expressions matches, replace, tokenize<br />

Arithmetic count, sum, avg, min, max, round, floor, ceiling, abs<br />

Dates and times adjust-dateTime-to-timezone, current-dateTime, day-from-dateTime, month-from-dateTime, days-from-duration, months-from-duration, etc.<br />

Properties of nodes name, node-name, local-name, namespace-uri, base-uri, nilled<br />

Document handling doc, doc-available, document-uri, collection, id, idref<br />

URIs encode-for-uri, escape-html-uri, iri-to-uri, resolve-uri<br />

QNames QName, namespace-uri-from-QName, prefix-from-QName, resolve-QName<br />

Sequences insert-before, remove, subsequence, index-of, distinct-values, reverse, unordered, empty, exists<br />

Type checking one-or-more, exactly-one, zero-or-one
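To give a feel for a few of these functions, the sketch below pairs them with rough Python analogues. These are approximations under stated assumptions: Python's regular expression syntax differs in details from the XPath flavour, edge cases such as empty input are not reproduced, distinct_values keeps first-occurrence order although the specification leaves the order to the implementation, and subsequence handles only integer arguments. Sequence positions in XPath are counted from 1.<br />

```python
import re


def tokenize(text, pattern):
    """Rough analogue of tokenize: split text wherever pattern matches."""
    return re.split(pattern, text)


def matches(text, pattern):
    """Rough analogue of matches: true if pattern matches anywhere in text."""
    return re.search(pattern, text) is not None


def replace(text, pattern, replacement):
    """Rough analogue of replace: substitute every match of pattern."""
    return re.sub(pattern, replacement, text)


def distinct_values(items):
    """Analogue of distinct-values, keeping the first occurrence of each value."""
    return list(dict.fromkeys(items))


def index_of(items, target):
    """Analogue of index-of: 1-based positions at which target occurs."""
    return [i for i, item in enumerate(items, start=1) if item == target]


def subsequence(items, start, length=None):
    """Analogue of subsequence for integer arguments, with 1-based start."""
    begin = max(start, 1) - 1
    if length is None:
        return items[begin:]
    end = max(start + length - 1, begin)
    return items[begin:end]
```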



Backwards compatibility<br />

Because of the changes in the data model and type system, not all expressions in XPath 2.0 have exactly the same<br />

effect as in 1.0. The main difference is that XPath 1.0 was more relaxed about type conversion, for example<br />

comparing two strings ("4" > "4.0") was quite possible but would do a numeric comparison; in XPath 2.0 this is<br />

defined to compare the two values as strings using a context-defined collating sequence.<br />
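The change in comparison semantics can be modelled in Python as follows. The two functions are illustrative names; the second assumes a collation that orders strings by Unicode code point, which is how Python compares strings, whereas the collation actually used by an XPath 2.0 processor depends on the context.<br />

```python
def xpath1_less_than(left: str, right: str) -> bool:
    """XPath 1.0 rule for '<' on two strings: convert both to numbers first."""
    return float(left) < float(right)


def xpath2_less_than(left: str, right: str) -> bool:
    """XPath 2.0 rule for '<' on two strings: compare them as strings.
    Assumes code point ordering, which is what Python uses."""
    return left < right


# "4" and "4.0" are equal as numbers, but "4" sorts before "4.0" as a string.
numeric_view = xpath1_less_than("4", "4.0")
string_view = xpath2_less_than("4", "4.0")

print(numeric_view, string_view)
```

The same divergence appears for "9" and "10": the numeric rule orders 9 before 10, while the string rule orders "10" before "9".<br />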

To ease transition, XPath 2.0 defines a mode of execution in which the semantics are modified to be as close as<br />

possible to XPath 1.0 behavior. When using XSLT 2.0, this mode is activated by setting version="1.0" as an attribute<br />

on the xsl:stylesheet element. This still doesn't offer 100% compatibility, but any remaining differences are only<br />

likely to be encountered in unusual cases.<br />

Support<br />

Support for XPath 2.0 is still limited.<br />

• For browser support, see Comparison of layout engines (<strong>XML</strong>).<br />

External links<br />

• XPath 2.0 specification [5]<br />

• What's New in XPath 2.0 [6]<br />

Xs3p<br />

xs3p is an XSLT stylesheet that generates XHTML documentation from XML Schema Definition language (XSD)<br />

schemas.<br />

xs3p requires an XSLT processor such as Xalan from the Apache Software Foundation. The results can generally be viewed<br />

with any browser that supports Cascading Style Sheets Level 2 (CSS2) and XHTML 1.0, such as Internet Explorer 5.5,<br />

Mozilla 1.0, Netscape 6 or Opera 5 (or later).<br />

xs3p was developed by Project Titanium [1] , Distributed Systems Technology Centre (DSTC) Pty Ltd. and<br />

distributed under a Mozilla Public License (MPL). xs3p is used by both the Oxygen <strong>XML</strong> Editor and Stylus Studio<br />

to generate schema documentation, and a modified version of the stylesheet is included with this program.[2]<br />

Recently the DSTC website, which was officially hosting the xs3p stylesheet, has become unavailable. A download<br />

of the xs3p stylesheet is available from the FiForms <strong>XML</strong> Definitions [3] project.<br />

References<br />

[1] http://titanium.dstc.edu.au/xml/xs3p/<br />

[2] http://www.oxygenxml.com/forum/ftopic2027.html<br />

[3] http://xml.fiforms.org/xs3p/



XSQL<br />

XSQL combines XML and SQL to provide a language- and database-independent means to store and<br />

retrieve SQL queries and their results.<br />

Description<br />

XSQL is the combination of XML (Extensible Markup Language) and SQL (Structured Query Language) to provide<br />

a language- and database-independent means for storing SQL queries, clauses and query results. XSQL development<br />

is still in its infancy and welcomes suggestions for improvement (especially in the form of patches).<br />

The XSQL project currently has a DTD (Document Type Definition) to define the structure of an XSQL document,<br />

and researchers are working on modifying the XML::Generator::DBI Perl module to be able to parse XSQL<br />

documents and provide a tree- and event-based API (Application Programming Interface) to their elements. These<br />

modifications are being submitted as patches to the module's maintainer, Matt Sergeant. Thus, the source code does<br />

not live at this site.<br />

It is hoped that XSQL will provide an end-to-end solution for handling SQL in Perl (other languages can be<br />

supported if there is interest). Creating XSQL implementations in other languages will allow all databases to support<br />

<strong>XML</strong> without having to alter the database source code in any way. The XSQL implementations can take care of<br />

turning XSQL into SQL and turning results into XSQL.<br />
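The round trip described above (an XML-encoded query turned into SQL, and the results turned back into XML) can be sketched in Python with the standard sqlite3 and xml.etree.ElementTree modules. Everything specific here is an assumption made for illustration: the element names query, select, from, where, results and row are not taken from the XSQL DTD, and the specs table and its data are invented.<br />

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical query document; these element names are NOT from the XSQL DTD.
QUERY_DOC = (
    "<query><select>name, year</select><from>specs</from>"
    "<where>year &gt; 2000</where></query>"
)


def to_sql(query_xml: str) -> str:
    """Turn the illustrative XML query into a SQL string."""
    q = ET.fromstring(query_xml)
    sql = f"SELECT {q.findtext('select')} FROM {q.findtext('from')}"
    where = q.findtext("where")
    return f"{sql} WHERE {where}" if where else sql


def results_to_xml(cursor) -> str:
    """Serialize the rows of an executed cursor as XML, one element per column."""
    columns = [col[0] for col in cursor.description]
    root = ET.Element("results")
    for row in cursor:
        row_el = ET.SubElement(root, "row")
        for name, value in zip(columns, row):
            ET.SubElement(row_el, name).text = str(value)
    return ET.tostring(root, encoding="unicode")


# Invented sample data held in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE specs (name TEXT, year INTEGER)")
conn.executemany(
    "INSERT INTO specs VALUES (?, ?)",
    [("XPath 1.0", 1999), ("XPath 2.0", 2007)],
)

sql = to_sql(QUERY_DOC)
xml_result = results_to_xml(conn.execute(sql))
print(sql)
print(xml_result)
```

The sketch builds SQL by string concatenation for brevity, so it is only suitable for trusted query documents.<br />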

External links<br />

• XSQL project website [1]<br />

References<br />

[1] http://xsql.sourceforge.net/



Article Sources and Contributors<br />

Binary <strong>XML</strong> Source: http://en.wikipedia.org/w/index.php?oldid=353493919 Contributors: Chrisch, Cpl Syx, CyberSkull, Cybercobra, DSosnoski, Hervegirod, Hooperbloob, Joriki,<br />

Jzhang2007, Mac D83, Mipadi, Ordinant, Pengo, Potato32, Qutezuce, Semog, Skrapion, Sneftel, Tbleier, The Anome, Thumperward, 44 anonymous edits<br />

Business Process Definition Metamodel Source: http://en.wikipedia.org/w/index.php?oldid=349128636 Contributors: BPDM, Baudoin1, Diveintobpm, Ehheh, Goflow6206, Jpbowen, Lurp,<br />

Sisyph, Tomdebevoise, 8 anonymous edits<br />

CDATA Source: http://en.wikipedia.org/w/index.php?oldid=365608659 Contributors: Archer3, Barefootliam, CesarB, Ded.morris, Duke33, Ehn, ILikeThings, Luislobo, MC10, Mjb, Npowell,<br />

Phluid61, PoliticalJunkie, Renesis, Rjwilmsi, Thickycat, WakiMiko, Wiml, 49 anonymous edits<br />

CDuce Source: http://en.wikipedia.org/w/index.php?oldid=367963828 Contributors: AndrewGNF, Apokrif, Elonka, Elwikipedista, Frisch, Hans Adler, Jaxhere, Sourada, Stentie, The Thing<br />

That Should Not Be, Trovatore, VoluntarySlave<br />

Character entity reference Source: http://en.wikipedia.org/w/index.php?oldid=365999121 Contributors: ANONYMOUS COWARD0xC0DE, Bitnap, Clixus, DePiep, Derekread, Gazpacho,<br />

Gdr, Jatkins, Koujimachi07, Loadmaster, M7, Martin451, Mhkay, Mjb, Mzajac, Oashi, Svick, Tokek, UU, 12 anonymous edits<br />

CodeSynthesis XSD Source: http://en.wikipedia.org/w/index.php?oldid=333209308 Contributors: Boseko, Bunnyhop11, Csabo, Nicolas1981, Pedram.salehpoor, Soumyasch, 4 anonymous<br />

edits<br />

D3L Source: http://en.wikipedia.org/w/index.php?oldid=344098822 Contributors: Dawynn, Fabrictramp, Jackollie, Malcolma, Squids and Chips, Vgiasolli, 4 anonymous edits<br />

Darwin Information Typing Architecture Source: http://en.wikipedia.org/w/index.php?oldid=357405956 Contributors: AlexSpurling, Andy Dingley, Biker JR, Bobdoyle, Bruce Esrig,<br />

ChrisLott, Clayoquot, Cmsreview, Cschleifstein, Deathphoenix, DeweyQ, Dmccreary, Doug Bell, Elharo, Eslchip, Ghettoblaster, Hgkamath, Infoprosmktg, JDBravo, JamesBWatson,<br />

JosebaAbaitua, Jwalling, Krusch, LCP, LeeHunter, Masiano, MatisseEnzer, Mhedblom, Mythobeast, Ndenison, Nozipedia, Ohnoitsjamie, Roberto999, Ru.spider, Sernauser, Sibersandi,<br />

Skierpage, Terrillja, Toussaint, Tsemii, Walk Up Trees, Who, WissenVeredeln, Yorrose, 78 anonymous edits<br />

DITA Open Toolkit Source: http://en.wikipedia.org/w/index.php?oldid=367972391 Contributors: Andy Dingley, Bobdoyle, Cander0000, Elwikipedista, Ewlyahoocom, Sernauser<br />

Document Structure Description Source: http://en.wikipedia.org/w/index.php?oldid=344099967 Contributors: Amalas, Asser hassanain, Bunnyhop11, Dawynn, Dreftymac, Jerazol,<br />

Kbdank71, Mamling, Minghong, Rene Mas, 8 anonymous edits<br />

Document-Centric Source: http://en.wikipedia.org/w/index.php?oldid=319489730 Contributors: Canis Lupus, Jzhang2007, Malcolma, Oh Snap<br />

Document-centric <strong>XML</strong> processing Source: http://en.wikipedia.org/w/index.php?oldid=363018500 Contributors: Aj00200, Gary King, Jzhang2007, LilHelpa, R'n'B, RJFJR, Victor Lopes, 3<br />

anonymous edits<br />

Dynamic <strong>XML</strong> Source: http://en.wikipedia.org/w/index.php?oldid=302412968 Contributors: Aboriginal Noise, Egpetersen, Filmackay, Malcolma, 1 anonymous edits<br />

ECMAScript for <strong>XML</strong> Source: http://en.wikipedia.org/w/index.php?oldid=368395484 Contributors: AVRS, Aaronbrick, Ale jrb, Asqueella, Bobince, CesarB, Cybit, David Gerard, Deineka,<br />

DonToto, Drdamour, Drukepple, Everyking, Ffangs, Ghettoblaster, Guppie, Herorev, Imroy, Intgr, Jasonglchu, Klondike, Kuteni, Mbini, Mysterd429, Niqueco, Onevalefan, Pfurla, Pointillist,<br />

Schepers, Schristie, Shepard, Simonster, Spankman, Speck-Made, Tabletop, Vishnava, William Graham, WulfTheSaxon, Ysangkok, 83 anonymous edits<br />

Efficient <strong>XML</strong> Interchange Source: http://en.wikipedia.org/w/index.php?oldid=335112487 Contributors: Biscuittin, Cybercobra, Darobin, Erechtheus, Hervegirod, Jeffhos, Pengo, Sdw,<br />

TuukkaH, 10 anonymous edits<br />

Embedded RDF Source: http://en.wikipedia.org/w/index.php?oldid=344100836 Contributors: 4th-otaku, Cander0000, Dawynn, Earle Martin, Iridescent, Keithalexander, Mathiastck, Mdd, O<br />

keyes, Prodoc, Shepard, The Anome, Themfromspace, Ultimatewisdom, 1 anonymous edits<br />

EpiDoc Source: http://en.wikipedia.org/w/index.php?oldid=331916994 Contributors: Bluemoose, Bpiche, El C, Gabrielbodard, Paregorios, Polon11, Tobias Bergemann, XPtr, 3 anonymous<br />

edits<br />

eXtensible Server Pages Source: http://en.wikipedia.org/w/index.php?oldid=171773630 Contributors: Honestcurio, John Vandenberg, Jutta234, 2 anonymous edits<br />

Fast Infoset Source: http://en.wikipedia.org/w/index.php?oldid=363168067 Contributors: Beetstra, Doug Bell, Drano, Dreftymac, Ernstdehaan, Ghettoblaster, Gurch, Hervegirod, Iharjw,<br />

JavaIsGroovy, Johndrinkwater, Jzhang2007, Ksn, Merlin12, Obiltschnig, Pelegri, Precious Roy, Prickus, Torc2, Tuntable, Tycoon de, Warreed, 35 anonymous edits<br />

Global listings format Source: http://en.wikipedia.org/w/index.php?oldid=323960620 Contributors: Alvin Seville, Capnstank, 1 anonymous edits<br />

GMX Source: http://en.wikipedia.org/w/index.php?oldid=297460141 Contributors: Azydron, Canadian, GEn3S!Z, GregorB, Ikar.us, Jared Preston, Malinaccier, Pegship, Radon210, 8<br />

anonymous edits<br />

GMX-V Source: http://en.wikipedia.org/w/index.php?oldid=325214239 Contributors: Azydron, Emeraude, 2 anonymous edits<br />

Head-Body Pattern Source: http://en.wikipedia.org/w/index.php?oldid=332049421 Contributors: Duncharris, Pegship, RedWolf, Robertvan1, Timc, Uthbrian, Ynhockey, 3 anonymous edits<br />

HyTime Source: http://en.wikipedia.org/w/index.php?oldid=334129102 Contributors: Andreas Kaufmann, Klimov, Mjb, Mosca, Onlyemarie, Sderose, Thumperward, 9 anonymous edits<br />

Internationalization Tag Set Source: http://en.wikipedia.org/w/index.php?oldid=247890861 Contributors: Ghettoblaster, Sintaku, Ysavourel, 18 anonymous edits<br />

Klip Source: http://en.wikipedia.org/w/index.php?oldid=359761468 Contributors: Bogrady, Diveloop, Gdrori, Melaen, SDC, Utcursch, Wizard191, Wykis, Xe7al, 16 anonymous edits<br />

List of <strong>XML</strong> and HTML character entity references Source: http://en.wikipedia.org/w/index.php?oldid=365723538 Contributors: Adoniscik, Alerante, Andrew Carlssin, AxSkov, Beland,<br />

BenjaminHare, Cbrunet, Christian75, Clixus, Cy21, DePiep, DmitTrix, ERcheck, Fudo, Gaius Cornelius, George Hernandez, Gerbrant, Happy-melon, Isaac Dupree, J4 james, Jatkins, Joejava,<br />

John Vandenberg, Kf4yfd, Kieff, LiborX, Loadmaster, Mathtinder, Mhkay, Mindmatrix, Mjb, Monedula, NJJ.Rocher, Ohnoitsjamie, Phil Boswell, Psychonaut, Radon210, Reinyday, Reisio,<br />

RetiredUser2, Ringbang, Rjwilmsi, Rwwww, SallyForth123, Sam 1123, Suruena, Tamfang, Tezza2k1, The Thing That Should Not Be, The wub, Thinboy00P, Tokek, TreasuryTag, Wavelength,<br />

Wolf1728, Wwoods, 93 anonymous edits<br />

Log4js Source: http://en.wikipedia.org/w/index.php?oldid=333453341 Contributors: Amux, Euchiasmus, Ian Moody, JLaTondre, Stritti, Wdflake, 5 anonymous edits<br />

MAREC Source: http://en.wikipedia.org/w/index.php?oldid=352689127 Contributors: Hydrox, Mpgarnier, Ofalk, 13 anonymous edits<br />

Media Object Server Source: http://en.wikipedia.org/w/index.php?oldid=282541918 Contributors: Chungkuo, The Anome, Theroachman, Xezbeth, 1 anonymous edits<br />

METS Source: http://en.wikipedia.org/w/index.php?oldid=357913828 Contributors: Buiras, CBM, Charles Brooking, Davissp, DerHexer, Elonka, Grumpycraig, Isnow, Lyc. cooperi,<br />

M4gnum0n, Nicolas1981, Paulerb, Rich Farmbrough, Sallyrenee, SchfiftyThree, Stf, Thryduulf, Trovatore, WilliamDenton, 15 anonymous edits<br />

Numeric character reference Source: http://en.wikipedia.org/w/index.php?oldid=364363130 Contributors: ABCD, ANONYMOUS COWARD0xC0DE, Ahoerstemeier, Ajgorhoe, D99figge,<br />

David H. Flint, DePiep, Gudeldar, Hytri, Indefatigable, Karl Dickman, Kjoonlee, LeoNomis, Million Moments, Mjb, Ringbang, Shlomital, TreasuryTag, Voidvector, 11 anonymous edits<br />

Office Open <strong>XML</strong> Source: http://en.wikipedia.org/w/index.php?oldid=368502841 Contributors: AJRobbins, AVRS, Adi86, Adiel, Agentbla, AlbinoFerret, Ale2006, AlexHudson, Alexbrn,<br />

Alexmaco, AlistairMcMillan, Aljullu, AllTheThings, Alvestrand, Amux, Ancheta Wis, Andrew J. MacDonald, AnonMoos, Ans, Arebenti, Arnieswap, ArnoldReinhold, Artw, Asbjornu, Atchom,<br />

Avenue, BCable, Bbatsell, BeSherman, Beetstra, BenLanghinrichs, Bender235, Bento00, Biztalkguy, Blaisorblade, Blakkandekka, Bobblehead, Bobman52, Boing! said Zebedee, Booyabazooka,<br />

BradC, Brucevdk, Brumle72, Bryan Derksen, Bull Market, Cahill1, Cander0000, Catskul, CattleGirl, CesarB, Cfauck, Charles Esson, Chealer, CheesePlease NL, Cheros, Chowbok,<br />

Chuckhoffmann, Cibumamo, Clicketyclack, Cloud02, CodeNaked, Codyrank, CritterNYC, CyberSkull, D-Notice, DMacks, Damian Yerrick, Danfuzz, Danieldotcom, Dave souza, David Gerard,



DavidJ710, Davidprior, Delafield, Denis.labaye, DennyColt, DerHexer, Dguertin, Diamonddavej, Discospinster, Dockurt2k, Dolda2000, Donho, Dougofborg, Dovi, Downcreate, Dreftymac,<br />

Dwheeler, Długosz, Earthsound, EatMyShortz, Ebyabe, Edschofield, Egandrews, Elagatis, Emurphy42, Etscrivner, Euchiasmus, EvenT, Evice, Existhigh, Feedmecereal, Fingerz, Fjarlq,<br />

Fleminra, Froth, Fulldecent, Gabriella11758, Gabrielzorz, Gagravarr, Gakrivas, GangsterPanda, Garnwraly, Ghettoblaster, Gilliam, Greg L, GregorB, Guyjohnston, H2g2bob, HAl, HPSCHD,<br />

HaeB, Hamish Lawson, Hankwang, HarryHenryGebel, Harumphy, Hebrides, Helpsloose, Herorev, Hervegirod, Herzen, HiDrNick, HorsePunchKid, Hu12, HubertRoksor, Ildefonso Giron, Innv,<br />

Intgr, Iridescent, Ironiridis, Irperez, Isaac Dupree, ItsProgrammable, Iunaw, JLaTondre, Jac16888, Jacob Poon, JanusDC, Jeffmcneill, Jeltz, Jleedev, Jlovick, Joelpt, John Nevard, John of Reading,<br />

John zhu, JohnOwens, Johndrinkwater, Joker1984, Joker2007, Jonathan888, Joshua Issac, Jstaniek, Jtnn, Juliancolton, Justin545, Juventas, Jynus, KAMiKAZOW, Kaern, Karada, Karnesky,<br />

Kayano, Kedar damle, Kegart, Kenb215, Kenyon, Ketil, Khalid hassani, Khukri, KiloByte, Kilz, Klauys, Kneale, Kozuch, Kravietz, Kungfuadam, Latha P Nair, Laughton.andrew, Leandrod,<br />

LeeHunter, Leotohill, Lester, Liftarn, Lisamh, Lulu of the Lotus-Eaters, MBisanz, MZMcBride, Mahanga, Marbux, Mardus, Masterpjz9, Mat macwilliam, Mateo LeFou, Mathias<br />

Schindler, Mauro Bieg, Max Naylor, Mcld, Melomel, Mentaka, Merbenz, Micro01, Midnightcomm, Mipadi, Mitchoyoshitaka, Mmj, MonirTime, Mrand, Mratzloff, Mxn, NJA, Nberardi, Nbibler,<br />

Nealmcb, NeutralPoint, Niemeyerstein en, Nigelj, Nil Einne, Nitesh.dubey, Nmagedman, Noloader, Octahedron80, Odie5533, Odoncaoa, Oggiejnr, Oneiros, Opium, Orrc, Osaeris, Oub, Pairadox,<br />

Palfrey, Pandion auk, ParticleMan, Partyoffive, Paul Foxworthy, Paul1337, Pdfpdf, Peak Freak, Peashy, Perfect Proposal, Phil153, Piano non troppo, Pieterh, Piken, Piperh, Pixelface, PlainHolds,<br />

Plopez339, PokeYourHeadOff, PonThePony, Praetor alpha, Promethean promise, Putt1ck, Quantumelfmage, R3m0t, RS Ren, Rafert, Rainwarrior, Ramdrake, Rasmus.p, Raul654, Rcandelori,<br />

Reedy, RekishiEJ, Remiel, Reuqr, Rick Jelliffe, Rizox, Rjwilmsi, Rlmorgan, Robdurbar, RockMFR, Ronark, RossPatterson, Rursus, Ruud Koot, Ryuch, Régis Décamps, Salimfadhley, Scarian,<br />

Scientus, Scisonic, Scj2315, Sdedeo, SeanDuggan, Segedunum, Seweso, Shd, Shir Khan, Shmget, SigmaEpsilon, Sigmundg, Signalhead, Simosx, Sir Anon, SkyWalker, Sladen, SmartWarthog,<br />

Smartse, Soumyasch, Spartaz, Spitzak, SpuriousQ, Stang99gtv8, Stannered, Stephenchou0722, SteveSims, Stevenfruitsmaak, Stevenj, Subsume, Sumb, Superluser, Superm401, Svdb, Swiftdove,<br />

Syncrosoft, TKD, Ta bu shi da yu, Tabletop, Tackit, TakuyaMurata, Tarmle, Tatoute, Tawker, Tayste, Tgape, The Anome, The Divine Fluffalizer, The Thing That Should Not Be, TheMadGerman,<br />

Thelennonorth, Theonlyedge, Theosch, Thiseye, Thrapper, Thumperward, Tigernike1, Tiptoety, Tmpsantos, Todd Vierling, Tomdobb, Toolnut, Torfason, Towsonu2003, Tprit, Trails, TraxPlayer,<br />

Tregoweth, Trevordevore, Ttiotsw, Tunah, Turlo Lomon, Tvhuang, Tvol, Ultramandk, Utcursch, Veinor, Verbal, Verdy p, Vexorian, Virtualt333, WalterGR, Warren, Webhat, West London<br />

Dweller, Wheelybrook, WhiteCat, WiebeVanDerWorp, Wiki Raja, Wiki1959, WikiLaurent, Witoldp, Wmorein, Womble, Work permit, Wrightbus, WurmWoode, X-Bert, X-dark, Xpclient,<br />

Xx521xx, Yellowdesk, Yesudeep, Yoonkit, Zayani, Zero0w, Zoobab, Zsvedic, 1036 anonymous edits<br />

Office Open <strong>XML</strong> file formats Source: http://en.wikipedia.org/w/index.php?oldid=363884163 Contributors: Alvestrand, CommonsDelinker, Nigelj, Rjwilmsi, Verdy p, 3 anonymous edits<br />

OIO<strong>XML</strong> Source: http://en.wikipedia.org/w/index.php?oldid=232294100 Contributors: Covergaard, JosefAssad, Part Deux, 2 anonymous edits<br />

Open <strong>XML</strong> Paper Specification Source: http://en.wikipedia.org/w/index.php?oldid=368337454 Contributors: A.Ou, Akhristov, Alecamiga, Alexander Abramov, Ambarish, Azakea,<br />

Benhutchings, Bfinn, Blicktek, Bokarevitch, Callidior, Chris Chittleborough, Chris the speller, Chuck Marean, CobraA1, Csiahistorian, Cwolfsheep, Cynical, DBrane, Danglobalgraph, David<br />

Haslam, Dawnseeker2000, Digita, Etienne.navarro, Feedmecereal, Filemon, FleetCommand, Fleminra, Frap, Fritz Saalfeld, Gertyk, Ghettoblaster, Gioto, Gordonf, HAl, Hervegirod, Inarius,<br />

JLaTondre, JanSöderback, Javalenok, Joaopaulo1511, Joelholdsworth, Joker1984, Jonhall, Jutiphan, Kpearce, Lasindi, Lboonsen, Leafnode, Lhammer610, LobStoR, LodesterreLLC, Maerk,<br />

Marasmusine, Marcosw, Mathrick, Morris lin, Mpbailey, Msiebuhr, Mythobeast, Nihiltres, Nil Einne, Nixps, Objectivesea, Oneiros, Orderud, Owen Ambur, Paul A, Paulej, Pelago, Philippe,<br />

PseudoSudo, Psiphiorg, Qef, Quiggles, RedAznor, Rjwilmsi, SURIV, SW2000, Seth Nimbosa, Simaocampos, Snailshoes, Soumyasch, Stephenchou0722, Sterrys, Sugeina, Superm401, Svick,<br />

Thumperward, Todd Vierling, Tooki, TotoBaggins, Toussaint, TreasuryTag, Uzume, Voidxor, Warren, WatchAndObserve, Wikianon, Woohookitty, Wq-man, Xpclient, ZimZalaBim, 159<br />

anonymous edits<br />

PCDATA Source: http://en.wikipedia.org/w/index.php?oldid=360253969 Contributors: Chealer, Fæ, Lobner, Malcolma, Renata3, Winterheat, 4 anonymous edits<br />

Plain Old <strong>XML</strong> Source: http://en.wikipedia.org/w/index.php?oldid=268137155 Contributors: Alynna Kasmira, Arto B, Atifmk, BrokenSegue, Bunnyhop11, Chalisa, Charivari, CondeNasty,<br />

Djmackenzie, Dpm64, Emersoni, Evil Monkey, GermanX, Hoos-foos, Julesd, LittleDan, MarXidad, Mindmatrix, Minghong, Tantek, Thumperward, Toby Woodwark, ZayZayEM, 16 anonymous<br />

edits<br />

Portable Application Description Source: http://en.wikipedia.org/w/index.php?oldid=353334808 Contributors: Aaleksanyants, Bitsmith, Christopher.widdowson, Gesslein, Here, Jll, MER-C,<br />

Pegship, RenegadeMinds, Riki, TheParanoidOne, 22 anonymous edits<br />

Publishing Requirements for Industry Standard Metadata Source: http://en.wikipedia.org/w/index.php?oldid=367751412 Contributors: Malcolma, Mauro Bieg, Prismwg, Rettetast, Rich<br />

Farmbrough<br />

QName Source: http://en.wikipedia.org/w/index.php?oldid=319026346 Contributors: Amire80, Anthony Appleyard, Frap, Gurch, Jnutting512, Motine, Stezton, Zundark, 1 anonymous edits<br />

QTI Source: http://en.wikipedia.org/w/index.php?oldid=361214744 Contributors: Alexcq, Bektur, Benscripps, Carnildo, ChristopheS, Fujnky, Gcm, Gimboid13, Grussak, Hammersmith38,<br />

J04n, Ja6a, JimTittsler, Larham, Lastkaled, Lindsey Kuper, Olak Ksirrin, RobertG, Ruale, Staffordaz, The7thone1188, Ysangkok, 33 anonymous edits<br />

Resource Description Framework Source: http://en.wikipedia.org/w/index.php?oldid=368011899 Contributors: 213.253.39.xxx, A5b, Acaciz, Akinyemi, Alcalazar, Alexius08,<br />

AlistairMcMillan, Amire80, AnAj, Andy Dingley, Angela, Ankitasdeveloper, Anrie Nord, Arto B, Asqueella, Backoftheboat, Barticus88, Bawolff, BeakerK44, BernhardBauer, Blathnaid,<br />

Blue.death, BobKeim, Booles, Broosty, C1932, Caoimhin, Carbuncle, Carlo.Ierna, Carmenutzadd, Cedringen, Chmod007, Cjcollier, Clan-destine, CloCkWeRX, Conversion script, Cygri, DRE,<br />

DanBri, Dancter, Daniele Gallesio, Davemck, Deodar, Dmccreary, Donald Albury, Dpv, Dr Shorthair, Dtcdthingy, Earle Martin, EddyVanderlinden, Emperor, EoGuy, Erick.Antezana, Esprit15d,<br />

Finell, Fleminra, FrankTobia, Fredrik, Funandtrvl, Ghettoblaster, Graham87, GregorB, Gyuri10, Haakon, Harrigan, Hetar, Hu12, Ian Spackman, IanDBailey, Ianalchemy, Jdthood,<br />

JesseChisholm, Jhammerb, Joe Jarvis, John Vandenberg, JonHarder, Jonathan O'Donnell, Jpbowen, Kaihsu, Kbdank71, Khurrad, KimvdLinde, KingsleyIdehen, Kiranoush, Kku, Knavesdied,<br />

Kwan, Langec, Liftarn, Lokatzis, Luk, Lysy, M3wiki1, Maduskis, Mandarax, Mark Renier, Mathiastck, Mauro Bieg, Mav, Mccaffry, Mdd, Mecanismo, Michael Hardy, MichaelBillington,<br />

Michal Nebyla, Midnight Madness, Minghong, Mjb, N2e, N3c, Nicolas1981, Nikevich, Niteowlneils, Nkour, Novum, Nux, Ojw, Onlyemarie, Pagatiponon, PatHayes, Pemboid, Pete142,<br />

Piet Delport, Pointillist, Pvosta, RaymondYee, RedWolf, Roland2, RossPatterson, Rursus, SEWilco, SMcCandlish, SamuelScarano, Sanxiyn, Sapoguapo, Schandi, Sdorrance, Securiger,<br />

ShaunMacPherson, Shepard, Shermanmonroe, Shinkolobwe, Sibersandi, Sina2, Smalljim, Soumyasch, Sstair, StWeasel, SteinbDJ, StephenReed, Stevertigo, Stoni, Stw, TNLNYC, Tezza2k1,<br />

The Anome, TikaKino, Tomlzz1, Toussaint, Triadic2000, Trixter, Turnstep, Ultimatewisdom, Universimmedia, Uriyan, Venullian, Vsddkjn, Wavelength, Wesleyneo, Wiki alf, WojPob, Xezbeth,<br />

Yaron K., Yitzhak, 217 anonymous edits<br />

Resources of a Resource Source: http://en.wikipedia.org/w/index.php?oldid=252504394 Contributors: GregorB, Jjordanpedia, NawlinWiki, Pearle, Robocoder, 8 anonymous edits<br />

Reverse Ajax Source: http://en.wikipedia.org/w/index.php?oldid=354518378 Contributors: Agentscott00, Anaraug, Brest, CarlManaster, CometGuru, Damiens.rf, Fadookie, FatalError, Furrykef, Gregdan, In side the pc, Inquisitus, Jacobolus, Jwoodger, Kalan, Kdknigga, MrOllie, MuffledThud, Pohta ce-am pohtit, Psilya, Sleepyhead81, Sprocketonline, Stefan Hintz, Ødipus sic, 52 anonymous edits<br />

Root element Source: http://en.wikipedia.org/w/index.php?oldid=292478129 Contributors: Ferkelparade, Malcolma, Mike the k, Nigelj, Pegship, RJFJR, Rich Farmbrough, Robertvan1, Sardine, 6 anonymous edits<br />

Schematron Source: http://en.wikipedia.org/w/index.php?oldid=345263915 Contributors: Aqueenan, Bunnyhop11, Canadabear, Chsimps, Dmccreary, Dreftymac, Ghettoblaster, HoodedMan, Hymek, JukoFF, Kbdank71, Korval, Modify, Nickcarr, Pnkrockr, Rjwilmsi, Samdutton, Securiger, Wellithy, Žiedas, 23 anonymous edits<br />

Simple Outline XML Source: http://en.wikipedia.org/w/index.php?oldid=245061146 Contributors: CDV, Dreftymac, KennethJ, Krusch, Nfwu, Qu3a, Stevage, Tadman, Verdatum, 4 anonymous edits<br />

Simple XML Source: http://en.wikipedia.org/w/index.php?oldid=290618000 Contributors: Codebytez, Danlev, Melab-1, Sydius, 8 anonymous edits<br />

Streaming XML Source: http://en.wikipedia.org/w/index.php?oldid=313593995 Contributors: Clq, Deathy, Egpetersen, Fikus, Filmackay, Maustrauser, Neustradamus, Patdreams<br />

Styled Layer Descriptor Source: http://en.wikipedia.org/w/index.php?oldid=345632921 Contributors: Beautyod, Ebyabe, Firsfron, Lars Washington, Lordsatri, Mabdul, Oskosk, SEWilco, SheldonYoung, Vitomeuli, 4 anonymous edits<br />

Topic (XML) Source: http://en.wikipedia.org/w/index.php?oldid=305325594 Contributors: Barticus88, Blathnaid, Clayoquot, Eleusis, Fool, Hbent, Lheuer, Pearle, Quaque, Treborbassett, Walk Up Trees, 8 anonymous edits<br />

Unique Particle Attribution Source: http://en.wikipedia.org/w/index.php?oldid=272335554 Contributors: Bunnyhop11, Frandsen, Politepunk, Rich Farmbrough, 2 anonymous edits<br />

VTD-XML Source: http://en.wikipedia.org/w/index.php?oldid=356464294 Contributors: AnmaFinotera, Beefyt, CamTarn, CambridgeBayWeather, EurekaLott, FayssalF, Greatestrowerever, Hervegirod, Hut 8.5, Jacosi, Jzhang2007, Katieh5584, LilHelpa, Paul8046, Pegship, Raise exception, Rjwilmsi, Rookkey, Switchercat, Toohool, Torc2, UncleDouggie, רנדום, 187 anonymous edits


X-expression Source: http://en.wikipedia.org/w/index.php?oldid=272451914 Contributors: Dragentsheets, Greenrd, JLaTondre, 1 anonymous edit<br />

XBRLS Source: http://en.wikipedia.org/w/index.php?oldid=338054643 Contributors: Blowdart, CharlesHoffman, Glennfcowan, Lancet75, Niente21, Pohta ce-am pohtit, 3 anonymous edits<br />

Xdos Source: http://en.wikipedia.org/w/index.php?oldid=352934824 Contributors: Dawynn, FreeKresge, Malcolma, Pearle, Salad Days, Smthng2sav, Tinucherian, 3 anonymous edits<br />

XDR Schema Source: http://en.wikipedia.org/w/index.php?oldid=335801797 Contributors: Aaaidan, Abelson, Greenrd, Jonnie d smith, Sergey.Radkevich, 2 anonymous edits<br />

XEE (Starlight) Source: http://en.wikipedia.org/w/index.php?oldid=245059907 Contributors: Elblanco, Malcolma, Ratarsed, 1 anonymous edit<br />

XEP Source: http://en.wikipedia.org/w/index.php?oldid=359239404 Contributors: Msulyaev, Odo1982, Toddst1, Zundark<br />

XML Source: http://en.wikipedia.org/w/index.php?oldid=367508169 Contributors: .:Ajvol:., 207.172.11.xxx, 213.253.39.xxx, 24ten, AHMartin, AThing, Aadaam, Actam, AdamCarden, Adeio,<br />

Ahabr, Ahkond, Ahoerstemeier, Aitias, Ajcumming, Aklauss, Aksi great, Alan Liefting, Alansohn, Alexbrn, AlistairMcMillan, Allkeyword, Amire80, AndersFeder, Andrisi, Angeltoribio, Ani td,<br />

Ankitasdeveloper, Anna Lincoln, Anon lynx, AnonMoos, Anti stupidity, Anu-43, Aomarks, Asqueella, Asteiner, Asymmetric, Atanveer9, AzaToth, B4hand, Barek, Barticus88, Bdesham,<br />

Beetstra, Belamp, Bernd in Japan, BertSen, Bevo, Bhadani, Biezl, BigFatBuddha, Bissinger, Bje2089, Blinklmc, Bluemoose, BlurTento, Bobdc, Bobianite, Boehm, Bonbayel, Bonethugnd, Booles,<br />

BorgQueen, Borgdylan, Boseko, BrianCully, Brick Thrower, Brighterorange, Brion VIBBER, Bryan Derksen, Brz7, Bunnyhop11, Burschik, Businessman332211, Bvajet,<br />

C.M.Sperberg-McQueen, CLD, CambridgeBayWeather, Cameltrader, Can't sleep, clown will eat me, CanadianLinuxUser, Caomhin, CapitalSasha, Carewolf, CarlHewitt, Cbdorsett, Cels2, Centrx,<br />

Charivari, Chininazu12, ChongDae, Chowbok, Chris 73, Chris Roy, Chrislk02, Chrisnewell, ChristopheS, Chzz, Cipherynx, Clayoquot, ClementSeveillac, CoSort2007, Coconut99 99, Cody5,<br />

Colonies Chris, Comesuntbob, Contraverse, Conversion script, CptAnonymous, Crosstowns, Cspan64, Cybercobra, D6, DKEdwards, Da monster under your bed, Dan100, DanConnolly,<br />

Daniel Olsen, Daniel.Cardenas, DanielVonEhren, DarkFalls, Darkfred, David spector, Davis685, Dcattell, Dcoetzee, DeadEyeArrow, Delcnsltmd, Deodar, Derek Ross, Derekread, Dicklyon, Dickpenn,<br />

DigitalEnthusiast, Dingbats, Dino72, Dkrms, Dlohcierekim, Dlrohrer2003, Dolcecars, DominiqueHazaelMassieux, Donmay12, DopefishJustin, DoriSmith, DougBarry, Dpattison2007, Dpbsmith,<br />

Dpm64, Dr Headgear, Dreftymac, Dthvt, Dullhunk, Dwheeler, Ebruchez, Edcolins, Edward Z. Yang, Efcavanaugh, Egandrews, Egil, Eisnel, ElBenevolente, Elharo, Ellmist, Elwikipedista,<br />

EngineerScotty, Eranb, Ericjs, Erik Zachte, Erikdw, Eritain, Etu, Evaluist, Ewsers, Fang.zheng, Fantasticfears, FatalError, Feline Hymnic, Ferdinand Pienaar, Figure, Fleminra, FloatingMind,<br />

Fnielsen, Folajimi, Fragglet, Fran Rogers, Francl, Frap, Freyr, Frisket, Fsolda, Furrykef, Fvw, GTBacchus, Gaius Cornelius, Gc9580, Gdrori, Geniac, Gennaro Prota, GentlemanGhost,<br />

GeoffPurchase, Ghettoblaster, Giftlite, Gjlubbertsen, Gjs238, Glass of water, Glenn, Gogo Dodo, Golwengaud, GrEp, GraemeL, Graham, Greg Murray, Ground Zero, Grumpycraig, Gudeldar,<br />

Haakon, Hairy Dude, Hannes Hirzel, Harold f, Hashar, Hervegirod, Hicketyhicketyhack, Highwayman65251, Hirzel, Hogman500, Hu12, Hurricane111, Hypertrek, Hyuri, IMSoP, Ian Moody,<br />

IanBurrell, Iftikhar88hussaini, Ijmorlan, IlanaDavidi, Imars, Imjustmatthew, Int21h, Intgr, Iridescent, Isilanes, Itai, J.delanoy, JForget, JKing, JLaTondre, JPalonus, JRocketeer, Jackacon,<br />

Jacobko, Jacobolus, JakobVoss, JamesBrownJr, Jao, Jargon64, Jauerback, JavaWoman, Jaxad0127, Jaxsam1, Jay, Jeenuv, Jeff G., Jeff3000, Jehzlau, Jerazol, Jesin,<br />

Jhannah, Jibjibjib, Jilplo Haggins, Jimthing, Jmlipton, Joachim Wuttke, Joanjoc, John Vandenberg, JohnSmith777, JohnWhitlock, Johnmarkh, Johnwcowan, Joku, Jonabbey, Jonkerz,<br />

Jonnyamazing, Jor, Jpbowen, Jshadias, Jzhang2007, Kai.Klesatschke, Kaldosh, Kamalakannanprogrammer, Kanags, Kapoing, Karderio, Karl Dickman, Katalaveno, Kbrose, Kc2idf,<br />

Keithgabryelski, Kenmccallum, Kensall, Kevinconroy, Kgaughan, Kha0sK1d, KickAssClown, Kl4m, Klaws, Koavf, Korval, Krauss, Kubigula, Kx1186, LDiracDelta, Lambiam, Larala,<br />

Lazynitwit, Lianmei, Liao, Lifefeed, Liftarn, Ligulem, Ling.Nut, LittleDan, Loveenatayal, Lumi71, Lycurgus, M.franceschet, M4gnum0n, MER-C, MK8, MaBoehm, Madir, Mah159,<br />

Mak Thorpe, Manishtomar, Maoj-wsu-sp, Mark Renier, MarkSweep, Martijn faassen, Martin451, Martinp23, MartynDavies, Mathmo, Matthäus Wander, MaxEnt, Maximaximax, Maximus06,<br />

Mayfare, Mbbradford, Mbell, Mcintyem, Mcorazao, Melab-1, Melon039, Meszigues, Mhkay, Michael Hardy, MichaelJanich, Miguelfms, Minghong, Mion, Miss Dark, Mjb, Mjpieters,<br />

Mola8sses, Montgomery '39, Mp, Mr. Shoeless, Mr.Z-man, MrJones, MrOllie, Mrjmcneil, Ms2ger, Mthibault, Mvulpe, Mwtoews, Mww113, Mxn, NO ACMLM,AND XKEPPER SUCK !,<br />

Nannus, Nanshu, Natasha2006, NawlinWiki, Neckro, Nemo bis, Netsnipe, Nicmila, Nigelj, Nikkimaria, Nile, Ninly, Niteowlneils, Nivaca, Nixeagle, Noldoaran, Nomediga, Norm mit, Nowa,<br />

Nsh, Nwbeeson, Octane, Ogmios, Ohnoitsjamie, Okyea, OliD, OsamaK, Oscar-ja, Osquar F, OverlordQ, Oxblood, P3x984, PTSE, Patrick, Paul Foxworthy, PaulXemdli, Pavel Vozenilek,<br />

Paxsimius, Peashy, Pelle, Pengo, PeteVerdon, Peterl, Pgk, Philip Trueman, Phluid61, Phoenix-forgotten, Phyzome, Pianohacker, Pikiwyn, Pmberry, Poccil, Porges, Pozcircuitboy,<br />

Prakash Nadkarni, Prodoc, Quarl, Quasipalm, Quiddity, Quilokos, Ramesses the Great, Rbonvall, Rbstimers, Rdmsoft, Red660, RedWolf, Redherring, Reinthal, Remy B, RenniePet, Rich Farmbrough,<br />

RichMorin, Richalex2010, Rick Block, Rick Jelliffe, RickBeton, Risi, Ritvikbhatnagar1, Rivecoder, Rje, Rjstott, Rjwilmsi, Rklawton, Robert K S, Robert Merkel, Robinjwest, Robomaeyhem,<br />

Rodney Boyd, Roger costello, Rory096, RoseParks, Rr2bwreain, Rror, Rvmolen, Ryanrs, Sam Hocevar, SamHathaway, SandiCastle, Sandius, Saqib, Saucepan, Sbvb, Schnolle, Scjessey,<br />

Scott MacLean, Scottielad, Sderose, Seanhan, Seidenstud, Semper discens, Sen Mon, ShaneCavanaugh, Shanes, Shibboleth, Shii, Shinkolobwe, Shizhao, Shlomital, SickTwist, Signsofstatic, Simetrical,<br />

SivaKumar, Sj, Sjc, Sleepyhead81, Smyth, Sosinfo, Sound effx, Spankman, Spe88, Spudstud, SqueakBox, Stefan.ciobaca, Stephen Gilbert, Steve R Barnes, SteveRwanda, Stevy76, StewartMH,<br />

Stf, Stijn Vermeeren, Stupiddestyredgasd, Stwalkerster, Superm401, Suruena, Suwayya, Svetovid, Syangtar, Sydius, TPK, Tagith, Taknik, Talktovalentine, TastyPoutine, Technopilgrim, Teddyb,<br />

Terjen, Terrifictriffid, Terrycojones, Thadius856, The Thing That Should Not Be, TheMightyOrb, Thierryc, Think777, Thumperward, Thunderhead, TimBray, TimR, Timc, Timur.shemsedinov,<br />

Tobias Bergemann, Todd Vierling, Tony1, ToonArmy, Topbanana, Toussaint, Trade2tradewell, Trankin, Traroth, Treekids, Trovatore, Trscavo, Tsunaminoai, Turnstep, TwoOneTwo, Twocs,<br />

Typhoonhurricane, Typochimp, UkPaolo, Unforgettableid, Unixxx, Unknown W. Brackets, Vaganyik, Varlaam, Versageek, Vespristiano, Vigilius, Violetriga, Vladkornea, Vojta, Volphy,<br />

WSU-AW-AK, Waskage, Wavelength, Wellithy, Wereon, Whale plane, Whkoh, Wickorama, Wiki alf, Wiki0709, Wikilibrarian, Wmahan, WojPob, Woohookitty, Wrs1864, Wulfila, Ww,<br />

XJamRastafire, Xompanthy, Xpclient, Yaronf, Ygramul, Yonkie, Zhaolei, Zoeb, Zootm, Олександр Кравчук, 1175 anonymous edits<br />

XML and MIME Source: http://en.wikipedia.org/w/index.php?oldid=359215170 Contributors: Crowne, Ellymelly, Hawky, John Vandenberg, Mgungora, O keyes, Roger costello, ShakespeareFan00, SpK, Typhoonhurricane, Wdflake, Wrs1864, 8 anonymous edits<br />

XML appliance Source: http://en.wikipedia.org/w/index.php?oldid=358713658 Contributors: AJR, Abesford, Alfe, Biot, Bunnly, Bunnyhop11, Comindico, CommonsDelinker, Darraghs, Dmccreary, Glace, Haakon, Hoagtim, Hughser, Iamrohit, Irishguy, Isotope23, Jbromhead, JonHarder, Jpbowen, Julesd, Kakarrott64, Kmorozov, L200817s, Layer7, Layer7tech, Lisfire, Lsonne, Martpol, MinorContributor, Ohthelameness, Reedy, Sherool, Sreekesh, Staffwaterboy, Stephen Compall, Tcramer1234, Vikingforties, 30 anonymous edits<br />

XML Base Source: http://en.wikipedia.org/w/index.php?oldid=333780510 Contributors: Anrie Nord, Fullstop, Furrykef, Pegship, Suruena, TimBray, Toussaint, Utcursch, 2 anonymous edits<br />

XML Catalog Source: http://en.wikipedia.org/w/index.php?oldid=350443768 Contributors: Abcoates, Alex.g, Markup854, Nate1481, RickBeton, TubularWorld, 4 anonymous edits<br />

XML Certification Program Source: http://en.wikipedia.org/w/index.php?oldid=365135620 Contributors: Melon039, Michel7789, Sykamoore, WestCity, 26 anonymous edits<br />

XML Configuration Access Protocol Source: http://en.wikipedia.org/w/index.php?oldid=367225868 Contributors: Calment, Kbrose, Mondoblu, R'n'B, 9 anonymous edits<br />

XML Control Protocol Source: http://en.wikipedia.org/w/index.php?oldid=294538733 Contributors: Asbjornu, Malcolma, Melab-1, Mild Bill Hiccup, Salmar<br />

XML data binding Source: http://en.wikipedia.org/w/index.php?oldid=362549516 Contributors: Beetstra, Biehl, Boseko, Cander0000, Coconut99 99, DSosnoski, Doug Bell, Drrngrvy, Dsevilla, Emerks, Eshear, Jnutting512, Khookguy, Liempt, Miami33139, MrOllie, Mrflip, Nskhan84, Objsys, Payxystaxna, Poccil, Precious Roy, RedWolf, Redvers, Robert van Engelen, Sebastian.Dietrich, Simon sprott, SprottS, Squash, Stephen B Streater, Teeks99, Tirkfl, Trident job, Venango, Virgiltrasca, Wavelength, Yourfired101, 86 anonymous edits<br />

XML database Source: http://en.wikipedia.org/w/index.php?oldid=366561321 Contributors: 16x9, AJackl, Abukaspar, Adrianwn, Amirfr, Andionita, Arnabdotorg, Barefootliam, Belovedfreak,<br />

Bernd vdB old, Bohumir Zamecnik, Bradjamesbrown, Brick Thrower, Bunnyhop11, Ccouvrette, ChristianGruen, Colonies Chris, CorcaighAbu, DickieRose, Dilane, Dizzzz, Dmccreary,<br />

Doclabyrinth, DoriSmith, Edward C. Zimmermann, Eedeebee, Enric Naval, Epbr123, EricBloch, GVogeler, Glen Pepicelli, Gpallis, Gregburd, Happygiraffe, Hgkamath, Hobartimus, Joerg84,<br />

John Vandenberg, Johndbritton, Juansempere, Jzhang2007, Klingon, Kmorozov, Kokotero, Lamdk, Libcub, Mdd, Metaperl, Michael Slone, MiddleEarth, Nichtich, Nikkimaria, OlliX, Pearle,<br />

Pedant17, Philip Trueman, Playmobilonhishorse, Radim Baca, Rastgoo, Rayngwf, Rjwilmsi, Rtweed1955, Signalhead, Slakr, Snodnipper, Stevertigo, Sykamoore, TRosenbaum, Tbradford,<br />

Terrifictriffid, Thumperward, Tide rolls, Touko vk, Xmlchamp, Xpriori, Xshezang, Xxanthippe, 216 anonymous edits<br />

XML editor Source: http://en.wikipedia.org/w/index.php?oldid=358875951 Contributors: Alcalazar, Asqueella, Booles, Cedric dlb, Cinnamon42, Clayoquot, Damien1, DirkvdM, Dulciana, Efcavanaugh, Egandrews, Furrykef, GeoffPurchase, Geralds, Icairns, Julesd, Korval, LeeHunter, Mark Richards, Mjb, Mzajac, Nabeth, Owens1, Ownlyanangel, Quasipalm, RedWolf, Remuel, Richardmtl, Saqib, Sernauser, SimonP, Sjoerd visscher, Skreyola, Spankman, Srbauer, Swaq, Thv, Tobias Bergemann, Wrs1864, 72 anonymous edits<br />

XML Enabled Directory Source: http://en.wikipedia.org/w/index.php?oldid=291296534 Contributors: Chowbok, EagleOne, Kdz, Melab-1, MerryMorris, 3 anonymous edits<br />

XML Encryption Source: http://en.wikipedia.org/w/index.php?oldid=354384058 Contributors: Alekseysanin, ArnoldReinhold, AutumnSnow, Cuonghuyto, Gudeldar, Jc3s5h, Mabdul, Ntsimp, Pmerson, Samsara, Sverdrup, Westenra, Wrs1864, 15 anonymous edits<br />

XML Events Source: http://en.wikipedia.org/w/index.php?oldid=328519050 Contributors: Ahoerstemeier, Dmccreary, Dmyersturnbull, Dvunkannon, Ghettoblaster, Groupsixty, Hawky, I already forgot, Lev Matematik, Mathiastck, Pemboid, Reinthal, Risi, Rjwilmsi, Toussaint, Xaje, Zundark, 10 anonymous edits<br />

XML framework Source: http://en.wikipedia.org/w/index.php?oldid=322139018 Contributors: Bunnyhop11, Byjg, Kateshortforbob, Libcub, 2 anonymous edits<br />

XML Literals Source: http://en.wikipedia.org/w/index.php?oldid=297983431 Contributors: Biscuittin, Drilnoth, Highpitch, Maniamin, 1 anonymous edit


XML namespace Source: http://en.wikipedia.org/w/index.php?oldid=347806300 Contributors: Anthony Appleyard, Anwar saadat, AutumnSnow, CardinalDan, Detroit, Dpm64, Dreftymac, Ear1grey, Eh kia, Ehn, Franl, Gagsie, Hairy Dude, I am neuron, Ilyanep, ImperfectlyInformed, Juanpablosoto, Korval, Mabdul, Mhkay, Nigelj, Pitoutom, Reinthal, Robina Fox, Sciurinæ, Sourcejedi, SuperHamster, The.Modificator, TimBray, TubularWorld, 24 anonymous edits<br />

XML Pretty Printer Source: http://en.wikipedia.org/w/index.php?oldid=349425356 Contributors: Ashburnite, BackToThePast, KeithTyler, Malcolma, Oneiros, Tiberiusgrant, 4 anonymous edits<br />

XML Protocol Source: http://en.wikipedia.org/w/index.php?oldid=272452094 Contributors: ClementSeveillac, Imjustmatthew, Longhair, Pegship<br />

XML schema Source: http://en.wikipedia.org/w/index.php?oldid=340715184 Contributors: ABCD, Acdx, Ahoerstemeier, Alik Kirillovich, AutumnSnow, Beetstra, Bunnyhop11, Cbdorsett, Choster, Crystallina, Derekread, Dongwon, Doug Bell, Dreftymac, Ehn, Fryed-peach, Gardenstew, Hervegirod, Hymek, Jamelan, Jaxsam1, Korval, Krauss, Kucing, Mamling, MariahX, Mark Renier, MarkSweep, Mhkay, Minghong, Mjb, Ninly, Pi8ch, Pmerson, Poccil, Pxma, Rich Farmbrough, Runnerupnj, SheepNotGoats, Smyth, Stevage, SteveLoughran, Tobias Bergemann, Vernanimalcula, Vishrave, Wellithy, Xan 213, Þjóðólfr, 51 anonymous edits<br />

XML Schema Editor Source: http://en.wikipedia.org/w/index.php?oldid=364480930 Contributors: Bunnyhop11, Ched Davis, Egandrews, Fabrictramp, Gsgsgsgs, Kostmo, Pjcwikip, Rhubbarb, Rklear, Simon sprott, 12 anonymous edits<br />

XML Schema Language Comparison Source: http://en.wikipedia.org/w/index.php?oldid=349051677 Contributors: Ahoerstemeier, Bunnyhop11, Cfeet77, Crystallina, Decrease789, Dongwon, Dreftymac, Ghettoblaster, Giraffedata, Grumpycraig, Hsivonen, Jlowery, Korval, Penter ghost, Q Chris, Sloop Jon, Sześćsetsześćdziesiątsześć, Tuntable, 31 anonymous edits<br />

XML Studio Source: http://en.wikipedia.org/w/index.php?oldid=345963200 Contributors: Beetstra, Fabrictramp, Simon sprott, 7 anonymous edits<br />

XML Telemetric and Command Exchange Source: http://en.wikipedia.org/w/index.php?oldid=368146406 Contributors: Briangregory2000, BuffaloChip97, Eyreland, GerryInColorado, Iridescent, Jsafranek, Minizinim, Nasa-verve, O keyes, Pan Dan, Rich Farmbrough, SamFCooper, Timmerlj, 4 anonymous edits<br />

XML template engine Source: http://en.wikipedia.org/w/index.php?oldid=362160823 Contributors: Akmg, Crystallina, FatalError, Ishnigarrab, JacekA, Krauss, Markjoseph sc, Mhkay, MichaK, RHaworth, Radiant!, Rjwilmsi, Sanxiyn, Stevage, Stf, Tokek, 13 anonymous edits<br />

XML tree Source: http://en.wikipedia.org/w/index.php?oldid=352933815 Contributors: Booyabazooka, Dawynn, Malcolma, Nagle, Tinucherian, Velle, WereSpielChequers<br />

XML validation Source: http://en.wikipedia.org/w/index.php?oldid=361641348 Contributors: 3nx, Andy Dingley, David Haslam, Dawynn, Dreftymac, Drrwebber, EdJogg, Fnielsen, Hmains, Hymek, Jaxsam1, Korval, Pmerson, Rich Farmbrough, Waacstats, 10 anonymous edits<br />

XML-Enabled Networking Source: http://en.wikipedia.org/w/index.php?oldid=352338066 Contributors: Asparagus, Hybernator, Kakarrott64, Krbabu, Lsonne, MaxDel, Mbenna, Melab-1, MinorContributor, 7 anonymous edits<br />

XML-Retrieval Source: http://en.wikipedia.org/w/index.php?oldid=342675426 Contributors: DoriSmith, JudithWinter, Magioladitis, Nikkimaria<br />

XMLHttpRequest Source: http://en.wikipedia.org/w/index.php?oldid=366820907 Contributors: .:Ajvol:., A3r0, Aditsu, Ahoerstemeier, Alaa.moustafa, Alansohn, Alcalazar, Alex Smotrov,<br />

Alexandre Martins, Algae, Alphachimp, Anirvan, Apv, Arjun G. Menon, Artw, Bezenek, Blackdenimgumby, BobBagwill, Bobo192, Bovineone, CDV, Caged.danimal, CambridgeBayWeather,<br />

CanisRufus, CapitalR, Catamorphism, Chealer, Christopherlin, Cic, Coffeeflower, DJ Rubbie, Damicatz, Dantman, Darklama, Delfuego, Digita, Dionyziz, Dirus, Discospinster, Djkenzie,<br />

Downfromzero, Drano, Dsnell923, EatMyShortz, Ej0c, Eloi.sanmartin, Enyo, Eric B. and Rakim, Eve Teschlemacher, Fabiob, FatalError, Filipvr, Fred Bradstadt, Fromz, Furrykef, Gabrielsroka,<br />

Gerbrant, Gilgamesh, Gilliam, Gimboid13, GraemeL, GregorB, Haza-w, Hondavice, Ignacio Javier Igjav, Isnow, J.delanoy, Jaray, Javalenok, Javawizard, Jaw959, Jdowland, Jeroldan, Jmabel,<br />

John Vandenberg, Jriffel, Keelypavan, Khalid hassani, Kozuch, Krellis, Kugland, Lee J Haywood, LemonairePaides, Liberatus, Lindsay-mclennan, Locos epraix, Lupin, Macaldo, Maian,<br />

Mamund, Manop, Marktmilligan, Marskind, Martin Hampl, Martnym, Masonbarge, Meand, Merc64, Metaeducation, Mindmatrix, Minghong, Mnot, Molily, Mrcs, Nickshanks, Nigelj,<br />

Nightstallion, Niven, Nkour, Norm mit, Oeln, Ohgyun Ahn, Pcj, Pctopp, Ph0t0phobic, Phloopy, Pjakubo86, Pjdonnelly, Proton.mule, Quilokos, Ramu50, RedWolf, Reisio, Remember the dot,<br />

Renku, RidinHood25, Ringbang, Rjwilmsi, Robert p levy, Rohan Jayasekera, Rufous, SalM, Sega381, Shamesspwns, Simon Lieschke, SineSwiper, Skeejay, Slant, Sleepyhead81, Spankman,<br />

Speight, Stephen Morley, Suruena, SvartMan, Taka, TakuyaMurata, Tamlyn, Teiladnam, The Anome, TheJosh, Thedangerouskitchen, Thumperward, Timc, Timeroot, Timwi, Tolmaion, Twsx,<br />

Urkle0, Vberger, VictorAnyakin, Vladogr, Wengier, White 720, WhiteHatLurker, Widgetguy, WikHead, Zippedmartin, Zoef1234, Zvn, Zzuuzz, ~K, 380 anonymous edits<br />

XMLSocket Source: http://en.wikipedia.org/w/index.php?oldid=313909088 Contributors: Icktoofay, O keyes, Tomjenkins52, 1 anonymous edit<br />

XPath Source: http://en.wikipedia.org/w/index.php?oldid=355507321 Contributors: Bitbit, Bunnyhop11, D.c.camero, Girlo2111, Gondooley, JLaTondre, Jasondburkert, Jeffz1, Mabdul, Mathiastck, Mhkay, Ninly, Norro, Pgfearo, RSStockdale, Ringbang, Tibti, Walk Up Trees, 15 anonymous edits<br />

XPath 2.0 Source: http://en.wikipedia.org/w/index.php?oldid=344816885 Contributors: Bunnyhop11, D.c.camero, Fredrik, Girlo2111, Gudeldar, Int19h, Jan.Sievers, K1Bond007, Lar, Mabdul, Mathiastck, Mhkay, Roland Beker, Stevage, TheParanoidOne, Typhoonhurricane, Xiroth, 7 anonymous edits<br />

Xs3p Source: http://en.wikipedia.org/w/index.php?oldid=352203304 Contributors: AriManninen, Ashburnite, Databases, Dawynn, Hysteria18, 4 anonymous edits<br />

XSQL Source: http://en.wikipedia.org/w/index.php?oldid=362589846 Contributors: Bunnyhop11, Cander0000, Fatal!ty, HJWeng, Intgr, Legoktm, Levin, Melab-1, Tabletop, Xezbeth, 4 anonymous edits


Image Sources, Licenses and Contributors<br />

Image:Klip-logo1.png Source: http://en.wikipedia.org/w/index.php?title=File:Klip-logo1.png License: unknown Contributors: User:Awille, User:Cydebot, User:Diveloop<br />

Image:Log4js.png Source: http://en.wikipedia.org/w/index.php?title=File:Log4js.png License: GNU Free Documentation License Contributors: Stritti<br />

Image:Log4JS-UML.jpg Source: http://en.wikipedia.org/w/index.php?title=File:Log4JS-UML.jpg License: GNU Free Documentation License Contributors: Stritti<br />

Image:PARTSangles.jpg Source: http://en.wikipedia.org/w/index.php?title=File:PARTSangles.jpg License: Public Domain Contributors: Buiras<br />

Image:METSdocument.jpg Source: http://en.wikipedia.org/w/index.php?title=File:METSdocument.jpg License: unknown Contributors: Buiras<br />

Image:X-office-document.svg Source: http://en.wikipedia.org/w/index.php?title=File:X-office-document.svg License: unknown Contributors: Bdesham, Rocket000, Sasa Stefanovic<br />

Image:X-office-presentation.svg Source: http://en.wikipedia.org/w/index.php?title=File:X-office-presentation.svg License: unknown Contributors: Linuxerist, Rocket000, Túrelio, 1 anonymous edit<br />

Image:X-office-spreadsheet.svg Source: http://en.wikipedia.org/w/index.php?title=File:X-office-spreadsheet.svg License: unknown Contributors: Bdesham, Rocket000, Sasa Stefanovic<br />

Image:Open Packaging Convention.png Source: http://en.wikipedia.org/w/index.php?title=File:Open_Packaging_Convention.png License: GNU General Public License Contributors: various<br />

Image:DrawingML example.png Source: http://en.wikipedia.org/w/index.php?title=File:DrawingML_example.png License: Public Domain Contributors: Original uploader was Tuanese at en.wikipedia<br />

Image:XPSIcon.png Source: http://en.wikipedia.org/w/index.php?title=File:XPSIcon.png License: unknown Contributors: Athaenara, Cristan, Joelholdsworth, Salavat, Sfan00 IMG, 2 anonymous edits<br />

Image:Rdf graph for Eric Miller.png Source: http://en.wikipedia.org/w/index.php?title=File:Rdf_graph_for_Eric_Miller.png License: Attribution Contributors: W3C<br />

Image:XML.svg Source: http://en.wikipedia.org/w/index.php?title=File:XML.svg License: Creative Commons Attribution-Sharealike 2.5 Contributors: AutumnSnow, Fryed-peach, JeffyP, Jusjih, Karl Dickman, Latics, Platonides, SKvalen, Soeb, Verdy p, 3 anonymous edits<br />

Image:Xml_text_editor.png Source: http://en.wikipedia.org/w/index.php?title=File:Xml_text_editor.png License: Public Domain Contributors: Damien1, 1 anonymous edit<br />

Image:xml_graphical_editor.png Source: http://en.wikipedia.org/w/index.php?title=File:Xml_graphical_editor.png License: Public Domain Contributors: Damien1<br />

Image:xml_wysiwyg_editor.png Source: http://en.wikipedia.org/w/index.php?title=File:Xml_wysiwyg_editor.png License: Public Domain Contributors: Damien1, 1 anonymous edit<br />

Image:SimpleXsd Physical.png Source: http://en.wikipedia.org/w/index.php?title=File:SimpleXsd_Physical.png License: Creative Commons Attribution 3.0 Contributors: User:Simon sprott<br />

Image:SimpleXsd Logical.png Source: http://en.wikipedia.org/w/index.php?title=File:SimpleXsd_Logical.png License: Creative Commons Attribution 3.0 Contributors: User:Simon sprott<br />

Image:Tick-green.png Source: http://en.wikipedia.org/w/index.php?title=File:Tick-green.png License: Public Domain Contributors: Wesley Warren<br />

Image:ScreenShot XsdEditor.png Source: http://en.wikipedia.org/w/index.php?title=File:ScreenShot_XsdEditor.png License: Creative Commons Attribution 3.0 Contributors: Simon sprott (talk). Original uploader was Simon sprott at en.wikipedia<br />

Image:XTCE exchange.gif Source: http://en.wikipedia.org/w/index.php?title=File:XTCE_exchange.gif License: Public Domain Contributors: GerryInColorado


License<br />

Creative Commons Attribution-Share Alike 3.0 Unported<br />

http://creativecommons.org/licenses/by-sa/3.0/
