XML essentials (PDF) - Chagall
This document presents a quick introduction to <strong>XML</strong>, as used for the submission<br />
of structured information into the SigPath information management system (IMS).<br />
Please see the SigPath CD-ROM for electronic versions of the files discussed<br />
here. The CD-ROM also features animated tutorials that explain how to submit<br />
<strong>XML</strong> files to SigPath, and other useful ways to interact with the SigPath IMS to<br />
manage interactions and quantitative data in support of biochemical modeling<br />
efforts.<br />
<strong>XML</strong> <strong>essentials</strong><br />
<strong>XML</strong> stands for Extensible Markup Language. <strong>XML</strong> is a language used to<br />
represent data and structured information. The information management<br />
approach implemented in SigPath uses <strong>XML</strong> extensively to support advanced<br />
information submissions.<br />
An <strong>XML</strong> document contains <strong>XML</strong> markup elements, called elements for short, or tags.<br />
An element is an identifier enclosed in angle brackets (the less-than and greater-than signs).<br />
Document 1, below, uses three elements named root, element and boat.<br />
A valid <strong>XML</strong> document contains one root element, as shown in the following<br />
document:<br />
[The XML listing of Document 1 is not preserved in this copy.]<br />
Document 1.<br />
Note the structure of comments in the previous example, and how elements are<br />
indented to illustrate the parent-child relation among elements. The same<br />
structure is shown below in a graphical format:<br />
root<br />
element<br />
boat<br />
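
The parent-child relation described above can be explored with a short script. What follows is a minimal sketch built on a hypothetical stand-in for Document 1: the element names (root, element, boat) and the attribute names (size, color) come from the text, while the nesting order and the attribute values are assumptions. It uses Python's standard xml.etree.ElementTree module.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for Document 1. Element and attribute names follow
# the text; the nesting and the attribute values are assumptions.
DOCUMENT_1 = """
<!-- This is a comment -->
<root>
    <element>
        <!-- boat is a child of element, which is a child of root -->
        <boat size="large" color="blue"/>
    </element>
</root>
"""


def outline(node, depth=0):
    """Return one indented line per element, children under their parent."""
    lines = ["    " * depth + node.tag]
    for child in node:
        lines.extend(outline(child, depth + 1))
    return lines


root = ET.fromstring(DOCUMENT_1)  # raises ParseError if not well-formed
print("\n".join(outline(root)))
print(root.find("element/boat").attrib)
```

The default parser drops comments, so only the three elements appear in the printed outline.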
Fabien Campagne, October 8th, 2003. For up-to-date information, please visit:<br />
http://www.sigpath.org/<br />
<strong>XML</strong> elements can have attributes (e.g., attribute=”” or size=””). Attributes are<br />
often used to describe properties of the element, such as size and color of the<br />
“boat” element in Document 1.<br />
Together with the ability to create a hierarchy of elements, these features make it<br />
possible to represent a variety of information.<br />
For instance, Document 2, below, illustrates how data and information can be<br />
represented in <strong>XML</strong> and shows that text can be used as a child of an element (see<br />
).<br />
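
To illustrate text used as a child of an element, here is a small sketch. It is not Document 2: the element names fleet and name, and every value, are invented for illustration. It shows how an attribute and a text child are read differently with Python's xml.etree.ElementTree.

```python
import xml.etree.ElementTree as ET

# Hypothetical example: the element names (fleet, boat, name) and all values
# are invented; only the idea of text as a child of an element is from the text.
EXAMPLE = """
<fleet>
    <boat size="small" color="red">
        <name>Example boat</name>
    </boat>
</fleet>
"""

fleet = ET.fromstring(EXAMPLE)
for boat in fleet.findall("boat"):
    # .text holds the text child of <name>; .get() reads an attribute.
    print(boat.find("name").text, boat.get("size"), boat.get("color"))
```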
Many software tools support <strong>XML</strong> schemas. An <strong>XML</strong> schema defines what<br />
constitutes a valid <strong>XML</strong> document. For instance, a schema is used to define the<br />
name of the root element for documents that contain submissions for SigPath.<br />
According to the current SigPath schema, the root element of valid <strong>XML</strong><br />
information exchange documents must be named sigpath-submission.<br />
The SigPath submission schema describes many more constraints on<br />
submission documents. These constraints are useful to remove ambiguity and<br />
assist users in building consistent submissions.<br />
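
One of the simplest schema constraints, the name of the root element, can be checked by hand. The sketch below assumes the root element is named sigpath-submission, the name shown at the top of the hierarchy in Figure 7; the function name is hypothetical. It does not perform schema validation.

```python
import xml.etree.ElementTree as ET

EXPECTED_ROOT = "sigpath-submission"  # assumption: name taken from Figure 7


def has_expected_root(xml_text):
    """True if the root element of the document has the expected name."""
    return ET.fromstring(xml_text).tag == EXPECTED_ROOT


print(has_expected_root("<sigpath-submission/>"))
print(has_expected_root("<root/>"))
```

If a document declares a namespace, ElementTree reports the tag as {namespace-uri}name, so the comparison would have to include the namespace.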
A complete description of the SigPath schema is accessible from the Project web<br />
site: Open http://www.sigpath.org/ and click on the “SigPath <strong>XML</strong> schema” link on<br />
the menu bar (top left of the page).<br />
Figure 1. The SigPath <strong>XML</strong> Schema page is shown. The page presents the<br />
schema documentation, a link to the most recent schema and <strong>XML</strong> submission<br />
examples.<br />
Click on “<strong>XML</strong> schema documentation” to access details about what constitutes a<br />
valid SigPath submission document. You should see a page similar to the one shown in<br />
Figure 2.<br />
Click on this link to see<br />
the root element for<br />
SigPath documents.<br />
Figure 2. Overview of the documentation for the element.<br />
Figure 3. Root element of SigPath submission documents. Elements on the right<br />
are children of the root element, and must occur in sequence, with the given<br />
multiplicity (e.g., 0...∞ for elements that can occur zero or more times).<br />
Beyond the documentation shown in Figures 2 and 3, <strong>XML</strong> Schemas are formally<br />
defined in <strong>XML</strong> Schema files. Such files (generally named with an .xsd<br />
extension) can be used in <strong>XML</strong> editors to dynamically check that a document is<br />
valid, or to provide contextual help or support for element completion while<br />
creating a document manually.<br />
The SigPath <strong>XML</strong> Schema can be downloaded from the second link shown in<br />
Figure 1. The next section will show how to use this schema to create a reaction<br />
submission file.<br />
Zooming in on the molecule element<br />
The schema documentation for the molecule element is shown in Figure 4. This<br />
element makes it possible to uniquely identify molecules in the current<br />
submission file or in SigPath, to include them in reactions, models, or other<br />
entities represented in SigPath.<br />
According to the documentation, the following three constructs are valid. Each of<br />
them can be used to match a molecule in the database or in the current<br />
submission file.<br />
[The XML listings of constructs (a), (b) and (c) are not preserved in this copy.]<br />
(a) Can be used to match a molecule that was previously submitted to<br />
SigPath, for instance a complex that does not exist in another database.<br />
(b) Can be used to match proteins or small molecules that exist in other<br />
databases and have been imported into SigPath (these proteins are part of<br />
what we call background information).<br />
(c) Can be used to reference a molecule that is defined in the current <strong>XML</strong><br />
submission file. The attribute ‘idref’ must match the attribute ‘id’ of one<br />
(and only one) molecule definition element in the SigPath submission file.<br />
A combination of (a), (b) and (c) is allowed by the schema. In this case, priority<br />
rules apply: idref is considered first. When idref does not match any molecule,<br />
the accession code and organism are used. If no match is found, spid is considered<br />
last. When no match can be found at all, an error is reported.<br />
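
The priority rules above can be sketched as a lookup function. This is an illustration of the stated order, not SigPath's actual code. The attribute name idref is given in the text; treating spid as an attribute, and the names ac and organism for the accession code and organism, are assumptions. The lookup tables and their values are placeholders.

```python
import xml.etree.ElementTree as ET


def resolve_molecule(ref, local_by_id, by_accession, by_spid):
    """Match a molecule reference: idref first, then accession code and
    organism, then spid; raise an error when nothing matches."""
    idref = ref.get("idref")
    if idref is not None and idref in local_by_id:
        return local_by_id[idref]
    key = (ref.get("ac"), ref.get("organism"))  # attribute names are assumed
    if key in by_accession:
        return by_accession[key]
    spid = ref.get("spid")
    if spid is not None and spid in by_spid:
        return by_spid[spid]
    raise LookupError("no molecule matches this reference")


# Toy lookup tables standing in for the submission file and the database.
local_by_id = {"m1": "molecule defined in this file"}
by_accession = {("AC-EXAMPLE", "example organism"): "background molecule"}
by_spid = {"SP-EXAMPLE": "previously submitted molecule"}

# idref matches nothing here, so the lookup falls through to spid.
ref = ET.fromstring('<molecule idref="missing" spid="SP-EXAMPLE"/>')
print(resolve_molecule(ref, local_by_id, by_accession, by_spid))
```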
In the context of an enzymatic reaction, the following attributes can also be used:<br />
[The XML listings of constructs (d) and (e) are not preserved in this copy.]<br />
(d) The attribute isSubstrate=”true” indicates that the molecule acts as a<br />
substrate of the enclosing enzymatic reaction.<br />
(e) Similarly, isProduct=”true” indicates that the molecule acts as a product of<br />
the enclosing enzymatic reaction.<br />
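
As a sketch of how these two attributes might be read, the fragment below marks one molecule as a substrate and one as a product. The attribute names isSubstrate and isProduct come from the text; the enclosing element name enzymatic-reaction and the idref values are placeholders, not names confirmed by the schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: "enzymatic-reaction" and the idref values are
# placeholders; isSubstrate and isProduct are the attributes described above.
FRAGMENT = """
<enzymatic-reaction>
    <molecule idref="m1" isSubstrate="true"/>
    <molecule idref="m2" isProduct="true"/>
</enzymatic-reaction>
"""

reaction = ET.fromstring(FRAGMENT)
substrates = [m.get("idref") for m in reaction.iter("molecule")
              if m.get("isSubstrate") == "true"]
products = [m.get("idref") for m in reaction.iter("molecule")
            if m.get("isProduct") == "true"]
print("substrates:", substrates)
print("products:", products)
```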
Figure 4. SigPath schema documentation for the molecule element. The diagram<br />
shows that the molecule element is a sequence of interaction_site, name and<br />
description elements. When browsing the documentation online, clicking on an<br />
element displays a detailed view such as the one shown for molecule. The “used<br />
by” row shows the contexts in which a molecule element can appear in a SigPath<br />
submission. The “attributes” row presents the attributes of the molecule element.<br />
Importing <strong>XML</strong> submission into SigPath<br />
To submit this reaction into SigPath, follow these steps:<br />
1. Navigate to the production instance of SigPath (start from<br />
http://www.sigpath.org/, use the menu on the left to access production).<br />
2. Click on the link “Submit Data via <strong>XML</strong> Upload”.<br />
3. Log in (registration is free and helps identify submitters and reviewers).<br />
4. You should see a page that contains what is shown in Figure 5.<br />
Figure 5. Submitting information in the SigPath <strong>XML</strong> format.<br />
5. Click “Browse” to locate the file that contains the <strong>XML</strong> submission (you<br />
can create this file by copying the content of Document 3 to an empty xml<br />
file, for instance).<br />
6. Press “Submit” when you have located the <strong>XML</strong> submission file.<br />
7. The <strong>XML</strong> submission will be checked for errors and inconsistencies.<br />
Errors will be reported. The page shown in Figure 6 will be displayed. Any<br />
error will be shown highlighted in yellow in the submission file.<br />
8. Correct any error and resubmit (to avoid schema validation errors it is best<br />
to use a validating <strong>XML</strong> schema editor, to check the submission against<br />
the schema).<br />
9. When no errors are detected, the display will change to Figure 6. Press<br />
Confirm Data Submission to save the submission into SigPath.<br />
Figure 6. Confirmation step for <strong>XML</strong> submission.<br />
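
Before uploading, a file can be given a quick local check for well-formedness. The sketch below uses Python's standard xml.etree.ElementTree, which reports the line and column of the first syntax error; the function name is hypothetical. The standard library does not validate against an XML Schema, so this does not replace the schema check made by a validating editor or by SigPath itself.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_text):
    """Return None if the text parses, else a short message with the
    line and column reported by the parser."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        line, column = err.position
        return f"line {line}, column {column}: {err}"
    return None


print(check_well_formed("<a><b/></a>"))  # well-formed
print(check_well_formed("<a><b></a>"))   # <b> is never closed
```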
Submitting a binding reaction<br />
The document needed to submit a binding reaction into SigPath is shown in<br />
Document 3. Follow the steps shown in the previous section to submit this<br />
reaction. The elements shown in Document 3 will be reviewed during the tutorial.<br />
Figure 7 shows the hierarchy of elements in this submission.<br />
[The XML listing of Document 3 is not preserved in this copy. Surviving text content: “Grb2-Sos” and “This is the reaction that creates the complex.”]<br />
Document 3. Grb2 + Sos ↔ Grb2.Sos<br />
sigpath-submission<br />
components<br />
reaction<br />
complex (define)<br />
left<br />
right<br />
molecule<br />
molecule<br />
complex (reference)<br />
Figure 7. Structure of Document 3. Note how a complex is explicitly defined (on<br />
the left, underneath components) before it is used on the right side of the<br />
reaction. Complexes cannot be defined by themselves. They must appear in at<br />
least one reaction of the form A + B ↔ Complex(A,B).<br />
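
The hierarchy in Figure 7 can be sketched as follows. This is a hypothetical reconstruction for illustration, not the original Document 3: the element names come from Figure 7, while the exact nesting, the id and idref values, the spid placeholders, and the use of name and description children are assumptions. The script checks that every complex referenced on the right side of the reaction is defined in the file.

```python
import xml.etree.ElementTree as ET

# Hypothetical sketch of the Figure 7 hierarchy. Element names come from the
# figure; nesting, attribute values and the <name>/<description> children
# are assumptions.
SUBMISSION = """
<sigpath-submission>
    <components>
        <complex id="c1">
            <name>Grb2-Sos</name>
        </complex>
    </components>
    <reaction>
        <description>This is the reaction that creates the complex.</description>
        <left>
            <molecule spid="GRB2-SPID-PLACEHOLDER"/>
            <molecule spid="SOS-SPID-PLACEHOLDER"/>
        </left>
        <right>
            <complex idref="c1"/>
        </right>
    </reaction>
</sigpath-submission>
"""

root = ET.fromstring(SUBMISSION)
# Complex definitions carry an id; references carry an idref.
defined = {c.get("id") for c in root.iter("complex") if c.get("id")}
referenced = [c.get("idref") for c in root.findall("reaction/right/complex")]
unresolved = [r for r in referenced if r not in defined]
print("defined:", sorted(defined))
print("unresolved references:", unresolved)
```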
Submitting an enzymatic reaction<br />
The document needed to submit an enzymatic reaction into SigPath is shown in<br />
Document 4. Follow the steps shown in “Importing <strong>XML</strong> submission into<br />
SigPath” to submit this reaction. The elements shown in Document 4 will be<br />
reviewed during the tutorial.<br />
[The XML listing of Document 4 is not preserved in this copy. Surviving text content, in order: P-CREB_RAT; protein; CREB_RAT; milliM-1.sec-1 and 1.0; micromoles/l and 20.0; sec-1 and 0.1.]<br />
Document 4. CREB_RAT + ATP → P-CREB_RAT + ADP [enzyme: MAPK2]<br />