Automated Formal Static Analysis and Retrieval of Source Code - JKU
3.2. INTEGRATION OF MINDBREEZE CODE SEARCH INTO MINDBREEZE ENTERPRISE SEARCH
3.2.1.2 Information Structuring
The tagger and the parser work together to retrieve as much information (static and
dynamic) as possible for a hit-type. Because the information retrieved by the parser and the
tagger does not have a uniform representation, we have to unify the representation of their
output data.
This issue can be solved by storing the information in a file with an XML structure. The
benefits gained are:
• we keep the structure of the document separate from the content;
• we obtain a lot of structured information.
Therefore, the CTAGS output file is given an XML structure, with one XML element for each
CTAGS output file entry. The information stored for each element (corresponding to a hit-type)
is enriched by the parser; the only difficulty is adding the information to the right hit-type.
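The conversion step described above can be illustrated with a minimal Python sketch. It is not the implementation used in Mindbreeze Code Search. It assumes the common tab-separated tags-file layout (name, file, search pattern terminated by `;"`, kind) and a pattern that contains no tab character. The `Documents` wrapper element, the child element names, the `parser_info` mapping, the mapping of the tag kind to `categoryclass`, and the data source name `svn-repo` are all hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET


def parse_tags_line(line):
    """Split one tags-file line into name, file, pattern and kind.

    Assumption: tab-separated fields, with the search pattern
    terminated by ;" and containing no tab character itself.
    """
    if line.startswith("!_TAG_"):  # pseudo-tags carry metadata, not symbols
        return None
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 3:
        return None
    name, path, ex_cmd = fields[0], fields[1], fields[2]
    pattern = ex_cmd[:-2] if ex_cmd.endswith(';"') else ex_cmd
    kind = fields[3] if len(fields) > 3 else ""
    return {"name": name, "file": path, "pattern": pattern, "kind": kind}


def tags_to_xml(lines, category, parser_info=None):
    """Build one Document element per tags entry.

    parser_info is a hypothetical mapping from (file, name) to extra
    key/value pairs contributed by the parser.
    """
    parser_info = parser_info or {}
    root = ET.Element("Documents")  # hypothetical wrapper element
    for line in lines:
        entry = parse_tags_line(line)
        if entry is None:
            continue
        doc = ET.SubElement(
            root, "Document", category=category, categoryclass=entry["kind"]
        )
        for key in ("name", "file", "pattern"):
            ET.SubElement(doc, key).text = entry[key]
        # Enrichment: parser data is attached only to the matching hit-type.
        extra = parser_info.get((entry["file"], entry["name"]), {})
        for key, value in extra.items():
            ET.SubElement(doc, key).text = value
    return root


sample = [
    '!_TAG_FILE_FORMAT\t2\t/extended format/\n',
    'x\tmain.c\t/^int x = 6;$/;"\tv\n',
]
root = tags_to_xml(sample, "svn-repo", {("main.c", "x"): {"type": "int"}})
print(ET.tostring(root, encoding="unicode"))
```

Keying the parser data by file and symbol name is one simple way to address the difficulty of attaching information to the right hit-type. It would not distinguish overloaded or shadowed names, so a real system would need a more precise key, such as the line number.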
We came up with the following XML structure for the indexing needs:
Example 3.2.
...
int x = 6;
...
The Document element attributes have the following meaning:
• category – the name of the data source;
• categoryclass – the type of the hit-type analyzed;