Automated Formal Static Analysis and Retrieval of Source Code - JKU
Abstract

In this thesis, two approaches to source code analysis, formal static analysis and retrieval,
are investigated theoretically and implemented in two prototype systems. An integration of the
formal static analysis prototype into the code search prototype is also designed.

The formal static analysis method is based on forward symbolic execution and functional
semantics. It systematically generates the verification conditions that are necessary for program
correctness.

We formalize the notions of syntax, semantics, partial correctness, and termination of imperative
recursive programs in a purely logical manner. The partial correctness and termination principles
are expressed in the underlying theory of programs. The termination property is expressed
as an induction principle that depends on the recursive structure of the program. It plays a
central role in establishing the existence and uniqueness of the function computed by the program;
without existence and uniqueness, the total correctness formula holds trivially because its
assumptions are inconsistent. The method is implemented in a verification condition generator
(FwdVCG), which generates the proof obligations that ensure the correctness of the program. A
formula simplifier is then applied to reduce them to systems of equalities and/or inequalities.

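As an illustration of the idea behind forward symbolic execution, the following is a minimal Python sketch that walks a tiny imperative program from its first statement to each return and emits one verification condition per execution path. It is not FwdVCG: the tuple-based program encoding, the names `subst` and `vcs`, and the string representation of formulas are assumptions made for this example, and recursion and termination conditions are not handled.

```python
import re

def subst(expr, state):
    """Replace each program variable in expr by its current symbolic value.

    Variables that are not in the state are treated as program inputs
    and left unchanged (an assumption of this sketch).
    """
    def repl(match):
        name = match.group(0)
        return state[name] if name in state else name
    return re.sub(r"[A-Za-z_]\w*", repl, expr)

def vcs(stmts, state, path, post):
    """Forward symbolic execution over a list of statements.

    Returns a list of (assumptions, goal) pairs, one per path that
    reaches a return statement.
    """
    if not stmts:
        return []  # a path without a return is ignored in this sketch
    head, rest = stmts[0], stmts[1:]
    kind = head[0]
    if kind == "assign":
        _, var, expr = head
        new_state = dict(state)
        # Store the value parenthesized so later substitutions stay well-formed.
        new_state[var] = "(" + subst(expr, state) + ")"
        return vcs(rest, new_state, path, post)
    if kind == "if":
        _, cond, then_block, else_block = head
        c = subst(cond, state)
        return (vcs(then_block + rest, state, path + [c], post)
                + vcs(else_block + rest, state, path + ["not (" + c + ")"], post))
    if kind == "return":
        _, expr = head
        value = subst(expr, state)
        # The postcondition refers to the inputs and to the name "result".
        return [(path, subst(post, {"result": value}))]
    raise ValueError("unknown statement kind: " + str(kind))

# Hypothetical example program: the maximum of two inputs x and y.
program = [
    ("if", "x >= y",
        [("assign", "m", "x")],
        [("assign", "m", "y")]),
    ("return", "m"),
]
precondition = "True"
postcondition = "result >= x and result >= y"

conditions = vcs(program, {}, [precondition], postcondition)
for assumptions, goal in conditions:
    print(" and ".join(assumptions), "==>", goal)
```

Each printed line is an implication whose premises are the precondition plus the branch conditions collected along one path, and whose conclusion is the postcondition with `result` replaced by the returned symbolic value. In this sketch the conditions are plain strings; a simplifier would then reduce them further.
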
A second contribution of this thesis is the integration of a source code retrieval engine,
Mindbreeze Code Search, into an information retrieval system, Mindbreeze Enterprise Search. The
integration required slight modifications to the architecture and components of the information
retrieval system: a database, a custom source code crawler, and a new data source representing the
source code files category were added. The context interface of the Query Service was enhanced to
provide context items specific to the source code files category.

One of the components of the custom crawler is a tagger, which rapidly extracts programming
language constructs. The crawled source code is made available for retrieval by being structured
and inserted into an index or database. A new data source, representing the source code files,
was integrated into the system by deploying on the server a context provider, a category icon, and
a category descriptor specific to the source code category. This new data source also required
category-specific icons and menus. To this end, we extended the Query Service Context
Interface with icons and menus specific to the source code category.

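The tagging and indexing step described above can be sketched in a few lines of Python. This is an illustrative, ctags-style approximation only: it does not use any Mindbreeze API, and the regular expressions, the function names `tag_source`, `build_index`, and `search`, and the sample files are all assumptions made for this example.

```python
import re
from collections import defaultdict

# Hypothetical patterns: recognise only Python-style class and function headers.
TAG_PATTERNS = {
    "class": re.compile(r"^\s*class\s+([A-Za-z_]\w*)"),
    "function": re.compile(r"^\s*def\s+([A-Za-z_]\w*)"),
}

def tag_source(path, text):
    """Yield (name, kind, path, line_number) for each recognised construct."""
    for lineno, line in enumerate(text.splitlines(), start=1):
        for kind, pattern in TAG_PATTERNS.items():
            match = pattern.match(line)
            if match:
                yield (match.group(1), kind, path, lineno)

def build_index(files):
    """Map each lower-cased construct name to the list of tags carrying it."""
    index = defaultdict(list)
    for path, text in files.items():
        for tag in tag_source(path, text):
            index[tag[0].lower()].append(tag)
    return index

def search(index, name):
    """Return all tags whose construct name matches, ignoring case."""
    return index.get(name.lower(), [])

# Stand-in for crawled files: file name mapped to file content.
sample_files = {
    "geometry.py": "class Point:\n    def norm(self):\n        return 0\n",
    "util.py": "def norm(v):\n    return abs(v)\n",
}
index = build_index(sample_files)
hits = search(index, "norm")
```

A line-based tagger of this kind is fast because it never builds a full parse tree, at the cost of missing constructs that span several lines or depend on context.
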
Keywords: program verification, symbolic execution, forward reasoning, functional semantics,
Theorema, tagging, parsing, crawling, indexing, Mindbreeze Enterprise Search.