
Automated Formal Static Analysis and Retrieval of Source Code - JKU


Chapter 3

Code Search Integration Facility into Mindbreeze Enterprise Search

3.1 Background

3.1.1 Information Retrieval

The need for information storage and retrieval gained attention as the amount of information grew and had to be retrieved quickly and accurately. Efficient methods for information retrieval therefore became necessary, so that important data is not overlooked and work and effort are not wasted. Moreover, implementing information retrieval methods on computers (retrieval systems) provides effective, intelligent and fast search results within a huge amount of data.

The scenario of information storage and retrieval is as follows ([Rij79]): there exists a repository of documents, and a person formulates requests (queries). The result of a query is a set of documents satisfying the query. The perfect result can be obtained by reading all the documents, keeping the ones of interest in mind and discarding the others.

Introducing computers to retrieve the documents relevant to one's needs was, in a first phase, not very efficient, because the search algorithms used textual search instead of using the semantics of the documents.

The next step was to characterize and structure the documents in such a way that they can be matched to a query and, moreover, accessed in a reasonable amount of time. For the second feature, indexing turned out to be a good technique. Nowadays, information retrieval systems implement complex indexing algorithms, but there is still the question of the accuracy of the retrieved data.
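To illustrate why indexing makes documents accessible in a reasonable amount of time, the following is a minimal sketch of an inverted index, one common indexing technique. It is not taken from the thesis or from any particular retrieval system: the function names `tokenize`, `build_index` and `search`, the whitespace tokenizer, the "all query terms must occur" matching rule and the toy repository are assumptions made only for this example.

```python
from collections import defaultdict


def tokenize(text):
    """Lower-case a text and split it on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def build_index(documents):
    """Map every term to the set of ids of the documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return the ids of the documents that contain every term of the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    # Start from the documents of the first term, then intersect with the rest.
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical toy repository, invented for this illustration.
documents = {
    1: "static analysis of source code",
    2: "retrieval of source code",
    3: "enterprise search",
}
index = build_index(documents)
print(sorted(search(index, "source code")))
```

The index is built once; a query then only touches the entries of its own terms instead of reading every document. The sketch is still purely textual, so it does nothing about the accuracy question raised above.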

Artificial Intelligence disciplines help with the accuracy problem by combining both the syntax (automatic text categorization [LRS99]) and the semantics (semantic inference, ontology reasoning [dB03]) of a document, trying to overcome the query formulation problem.

Query Formulation Techniques

While most of the time was spent on constructing retrieval algorithms because it was thought that they play the central role in retrieving relevant information, William Frakes and Thomas Pole
