Automated Formal Static Analysis and Retrieval of Source Code - JKU
Automated Formal Static Analysis and Retrieval of Source Code - JKU
Automated Formal Static Analysis and Retrieval of Source Code - JKU
You also want an ePaper? Increase the reach of your titles
YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.
Chapter 3<br />
<strong>Code</strong> Search Integration Facility into<br />
Mindbreeze Enterprise Search<br />
3.1 Background<br />
3.1.1 Information <strong>Retrieval</strong><br />
The necessity <strong>of</strong> information storage <strong>and</strong> retrieval earned attention since the amounts <strong>of</strong> information<br />
increased <strong>and</strong> it should be fast <strong>and</strong> accurate retrieved. Therefore, requirements <strong>of</strong> efficient<br />
methods for information retrieval were necessary; important data are then not ignored <strong>and</strong> the<br />
force <strong>of</strong> work <strong>and</strong> effort is not wasted. Moreover, the implementation <strong>of</strong> the information retrieval<br />
methods into computers (retrieval systems) provides effective, intelligent <strong>and</strong> fast search results<br />
within a huge amount <strong>of</strong> data.<br />
The scenario <strong>of</strong> information storage <strong>and</strong> retrieval is as follows ([Rij79]): there exists a repository<br />
<strong>of</strong> documents <strong>and</strong> a person formulates requests (queries). The result <strong>of</strong> the query it is a set <strong>of</strong><br />
documents satisfying the query. The perfect result can be obtained by reading all the documents,<br />
keeping in mind the interest one <strong>and</strong> discarding the others.<br />
Introducing the computers to perform the retrieval <strong>of</strong> relevant documents for one needs, was,<br />
in a first phase, not to efficient because the search algorithms used textual search instead <strong>of</strong> using<br />
the semantics <strong>of</strong> the documents.<br />
Next step was to characterize <strong>and</strong> structure the documents in such a way that it is relevant to a<br />
query <strong>and</strong> more, to be accessible in a reasonable amount <strong>of</strong> time. For the second feature, indexing<br />
came out to be a good technique. Nowadays, the information retrieval systems implement complex<br />
indexing algorithms, but there is still the question <strong>of</strong> the accuracy <strong>of</strong> the data retrieved.<br />
Artificial Intelligence disciplines help in the accuracy problem by combining both syntax<br />
(automatic text categorization [LRS99]) <strong>and</strong> semantics (semantic inference, ontology reasoning<br />
[dB03]) <strong>of</strong> a document trying to overcome the query formulation problem.<br />
Query Formulation Techniques<br />
While most <strong>of</strong> the time was spent on constructing retrieval algorithms because it was thought that<br />
they play the central role in retrieving relevant information, William Frakes <strong>and</strong> Thomas Pole<br />
27