Download - Academy Publisher
Theorem 4: Case (10), on its own, forms one situation:
"" element contents.
Theorem 5: The three cases (11), (12), (13) can be
classified as one situation: .
Theorem 6: Cases (15) and (10) are the same and form
one judgement event:
element content.
Theorem 7: The three cases (16), (17), (18) form one
judgement event class: .
Theorem 8: Case (19) forms a judgement event on its own:
Using the 7 judgement events summarized in this
article, we can extend the SAX API and then create the
CSTP index.
Based on the analysis above, the algorithm for
building CSTP is as follows:
Input: XML and encoding table<br />
Output: CSTP Compressed index<br />
Begin:<br />
Start with SAX // parse sequentially
new (Node3) // to build the first node
s = new (Node1);
push (S3, s);
if (xml root tag has property value)
{
    p = new (Node3);
    s->attribute = p;
    p->left = property value;
}
while (*PP) // there is another property
{
    q = new (Node3);
    p->right = q;
    p = q;
    q->left = property value;
}
p->right = NULL;
while (xml document is not finished)
{
    Case 1: // read the string form: <
        ... ...
    Case 2: // read the string form: <
        ... ...
    Case 3: // the third case: contents
        ... ...
    Case 4: // similar to Case 3
        ... ...
    Case 5: //
        ... ...
    Case 6: // read the string form:
        ... ...
    Case 7: // read the string form:
            // ( must be Children)
        ... ...
}
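The event-driven construction above can be illustrated with a minimal Python sketch. It is not the CSTP builder itself: the seven judgement events and the encoding table are left out, and the class names `AttrNode`, `ElemNode` and `TreeBuilder` are assumptions introduced only for illustration. It shows only the two ideas visible in the pseudocode, namely a stack of open element nodes and an attribute chain linked through `left` (the value) and `right` (the next link), using the standard `xml.sax` module.

```python
import xml.sax


class AttrNode:
    """One link of an attribute chain (hypothetical stand-in for Node3)."""

    def __init__(self, name, value):
        self.name = name
        self.left = value    # the property value, as in p->left
        self.right = None    # next attribute link, as in p->right


class ElemNode:
    """An element node (hypothetical stand-in for Node1)."""

    def __init__(self, tag):
        self.tag = tag
        self.attribute = None  # head of the attribute chain
        self.children = []
        self.text = ""


class TreeBuilder(xml.sax.ContentHandler):
    """Builds a plain tree from SAX events, using a stack of open elements."""

    def __init__(self):
        super().__init__()
        self.stack = []
        self.root = None

    def startElement(self, name, attrs):
        node = ElemNode(name)
        prev = None
        for key in attrs.getNames():
            link = AttrNode(key, attrs.getValue(key))
            if prev is None:
                node.attribute = link   # first property starts the chain
            else:
                prev.right = link       # later properties extend it
            prev = link
        if self.stack:
            self.stack[-1].children.append(node)
        else:
            self.root = node
        self.stack.append(node)

    def characters(self, content):
        if self.stack:
            self.stack[-1].text += content

    def endElement(self, name):
        node = self.stack.pop()
        node.text = node.text.strip()


def build_tree(xml_text):
    """Parse an XML string sequentially and return the root ElemNode."""
    handler = TreeBuilder()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.root
```

A real CSTP builder would additionally merge same name items and apply the encoding table inside these callbacks; those steps depend on details elided in the pseudocode and are therefore not sketched.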
There is another way to build the index: build nodes
using the contents of the judgement event rather than the
previous one, in which case the 25 original events are
categorized differently. The specific implementation of
that algorithm also differs from the process above; its
details are not given here.
Ⅲ. THE QUERY ALGORITHM BASED ON CSTP<br />
A. Analysis of the Process and Characteristics of Query
The query process of XPath is similar to that of XQuery,
but the latter is more complicated to implement, so only
XPath queries are discussed here. Before discussing the
XPath query algorithm, we analyse the query process and
its characteristics on the CSTP index, as follows:
For a simple path query $I: t_1/t_2/\cdots/t_m$, the node
$t_1$ may or may not be the root element node of the CSTP
tree-graph. If it is not the root element node, for example
if it is a third-layer node of the CSTP tree-graph, then when
the user submits the query expression the computer begins by
visiting the different chain on the second layer of the CSTP
tree-graph. If $t_1$ is not there, it visits the different
chain formed by the children of the second node on the second
layer (the second node may be in the same name item), and so
on until $t_1$ is found, so the computer makes many
comparisons. Suppose there are $N$ element nodes on the second
layer of the CSTP and $M$ element nodes in the different
chain, and each node on the second layer has on average $N_2$
child nodes in its different chain; then the time complexity
of such a query is $O(N_1 + N_2 \cdot N) \approx O(N_2 \cdot N)$.
For massive data, $N$ is a rather large number. Moreover, if
the expression $t_1$ supplied by the user does not exist, the
computer traverses the whole CSTP tree-graph, and the time
cost is obviously very high. Of course, if additional
information such as the start layer can be specified and a new
head node is added on each layer of the CSTP tree-graph, the
query time can be reduced significantly, but this obviously
leads to unnecessary additional work. Therefore, we suppose
that $t_1$ is on the first layer; such a path expression is
called a full path expression. If the path expression
submitted by the user is not full, we can predefine certain
rules and then generate the full path expression
automatically when the user queries.
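The difference between a full and a partial path expression can be made concrete with a small Python sketch. It works on an ordinary tree rather than on the CSTP tree-graph, and the names `Node`, `query_full_path` and `expand_to_full_paths` are assumptions made for illustration. The expansion rule shown (prefix every occurrence of the first step with its ancestor tags) is only one possible predefined rule, not necessarily the one intended here.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    tag: str
    children: list = field(default_factory=list)


def query_full_path(root, path):
    """Evaluate a full path such as 'lib/book/title', layer by layer.

    Because the first step must name the root, no search for t_1 is needed.
    """
    steps = [s for s in path.split("/") if s]
    if not steps or root.tag != steps[0]:
        return []
    frontier = [root]
    for step in steps[1:]:
        frontier = [c for n in frontier for c in n.children if c.tag == step]
        if not frontier:
            break
    return frontier


def expand_to_full_paths(root, partial):
    """Hypothetical rule: turn a partial path into every matching full path."""
    steps = [s for s in partial.split("/") if s]
    found = set()

    def walk(node, prefix):
        here = prefix + [node.tag]
        if node.tag == steps[0]:
            found.add("/".join(here + steps[1:]))
        for child in node.children:
            walk(child, here)

    if steps:
        walk(root, [])
    return sorted(found)
```

With this rule the costly whole-tree walk happens once, when the partial expression is expanded, and each resulting full path is then evaluated by descending one layer per step.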
B. Analysis of Algorithm and Efficiency Based on XPath<br />
Query<br />
According to the analysis above, we can query
efficiently in the compressed structure. At the same time,
using the coding table generated from the DTD, we
can design decompression algorithms that run during the
query process.
Because of the complexity of XML itself and of the CSTP
index, only a qualitative analysis and comparison is
given here.
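As a rough illustration of decompression during a query, the Python sketch below assumes that the coding table simply maps each tag name to a small integer code; the actual table layout is not specified here, and the functions `build_code_table`, `encode_path` and `decode_path` are hypothetical. Under that assumption, a path expression is encoded once before the search, and matched codes are decoded back to tag names only when results are returned.

```python
def build_code_table(tag_names):
    """Assign each distinct tag name an integer code in first-seen order.

    Assumption: tag_names lists the element names as declared in the DTD.
    """
    table = {}
    for name in tag_names:
        if name not in table:
            table[name] = len(table)
    return table


def encode_path(path, table):
    """Translate 'lib/book/title' into a list of codes."""
    return [table[step] for step in path.split("/") if step]


def decode_path(codes, table):
    """Translate a list of codes back into a readable path."""
    reverse = {code: name for name, code in table.items()}
    return "/".join(reverse[code] for code in codes)
```

Comparing integer codes instead of tag strings is what would let the search run directly on the compressed structure under this assumption.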
Based on the high repeatability of XML tag data, this
paper introduces the TP relation to convert the original tree
structure of XML into the CSTP structure. And
the introduction of the same name items and different<br />