13.02.2013 Views

2 Debian Code Search: An Overview


4 Search result quality

Query is a library symbol: When the query matches one of the library’s exported function symbols, the package should get a much higher ranking. This is computationally cheap but works only for C-like programming languages and libraries.

Query is in the package’s description: The package description is a concise summary of what the package does. Library packages often list their main areas or available functions, so it is likely that the user’s search query matches the description, especially if she is searching for an algorithm name.
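The two package-level factors above can be sketched as a simple additive boost. This is a minimal illustration only: the function name `package_boost`, the weights, and the case-insensitive substring test on the description are assumptions made for this example and are not taken from the text.

```python
def package_boost(query, exported_symbols, description,
                  symbol_weight=2.0, description_weight=1.0):
    """Return a hypothetical ranking boost for one package.

    exported_symbols: set of function names exported by the library.
    description: the package description as plain text.
    Both weights are illustrative assumptions, not measured values.
    """
    boost = 0.0
    # Factor 1: the query is exactly one of the exported function symbols.
    if query in exported_symbols:
        boost += symbol_weight
    # Factor 2: the query appears in the package description
    # (case-insensitive substring match, an assumption of this sketch).
    if query.lower() in description.lower():
        boost += description_weight
    return boost
```

The symbol factor is given the larger weight here only to mirror the statement that a symbol match should rank a package "much higher"; real weights would have to be tuned.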

4.2.3 Ranking factors which depend on the actual results<br />

Indentation level of the result: The vast majority of programming languages represent different scopes by indentation (see footnote 3). Scopes represent a hierarchy: higher scopes mean the symbol is more important (e.g. a function definition at the top-level scope in C is more important than a variable declaration in a lower-level scope). To decide quickly and language-independently whether one line is more important than another, the amount of whitespace in front of it is counted.
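Counting leading whitespace could look roughly like the following sketch. The tab width of 8 columns, the cap of 32 columns, and the linear mapping to a score between 0 and 1 are assumptions chosen for illustration; the text only states that the whitespace in front of the line is counted.

```python
def indentation_level(line, tab_width=8):
    """Count the columns of leading whitespace in a line.

    Tabs advance to the next tab stop (tab_width is an assumption).
    """
    width = 0
    for ch in line:
        if ch == " ":
            width += 1
        elif ch == "\t":
            width += tab_width - (width % tab_width)
        else:
            break  # first non-whitespace character ends the indentation
    return width


def indentation_score(line, max_indent=32):
    """Map indentation to a score in [0, 1]; top-level lines score 1.0.

    The cap and the linear scale are hypothetical choices.
    """
    level = min(indentation_level(line), max_indent)
    return 1.0 - level / max_indent
```

With this scoring, a top-level line such as `int main(void) {` ranks above an indented declaration such as `    int x;`, without any knowledge of the language being searched.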

Word boundary match: To prefer more exact results, the regular expression feature \b (match on word boundaries) can be used. As an example, \bXCreateWindow\b will match "Window XCreateWindow(" but not "register XCreateWindowEvent *ev =".

The position of the matching portion of text is also known, so it is used to further prefer earlier matches over later matches. Since programming languages are structured from left to right, earlier matches (such as function definitions) are better than later matches (such as function parameter types).
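Both ideas, the word boundary check and the preference for earlier matches, can be sketched together. In this example the query is treated as a literal string and escaped before being wrapped in \b, and the position is normalized by the line length; the function name `match_score` and both of those choices are assumptions of the sketch, not details given in the text.

```python
import re


def match_score(query, line):
    """Return (word_boundary_match, position_score), or None if no match.

    word_boundary_match: True if \\bquery\\b matches somewhere in the line.
    position_score: 1.0 for a match at column 0, approaching 0.0 for a
    match near the end of the line (a hypothetical normalization).
    """
    literal = re.escape(query)  # assumption: the query is a literal string
    m = re.search(r"\b" + literal + r"\b", line)
    word_boundary = m is not None
    if m is None:
        # Fall back to a plain match that ignores word boundaries.
        m = re.search(literal, line)
    if m is None:
        return None
    position_score = 1.0 - m.start() / max(len(line), 1)
    return (word_boundary, position_score)
```

For the example from the text, `match_score("XCreateWindow", "Window XCreateWindow(")` reports a word boundary match, while `match_score("XCreateWindow", "register XCreateWindowEvent *ev = ")` matches only without word boundaries. A ranking function could then sort results by this pair so that exact, early matches come first.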

3 Of the most popular languages in Debian, the first 10 (C, Perl, C++, Python, Java, Ruby, OCaml, Lisp, Shell, PHP; see section 4.6, page 44) use indentation.
