13.02.2013

2 Debian Code Search: An Overview


Abstract

Since the shutdown of Google Code Search in January 2012, developers of free/libre open source software (FOSS) have lacked a source code search engine covering the large corpus of open source computer program code.

Without such a tool, developers could not easily find example code for poorly documented libraries. Nor could they quickly determine the scope of a problem, for example by figuring out how many packages call a specific library function that contains a bug.

This thesis discusses the design and implementation of Debian Code Search, a search engine based on a re-implementation of the ideas behind Google Code Search, but with Debian’s FOSS archive as a corpus, and using Debian’s clean metadata about software packages to improve search results.

The work presented includes optimizing Russ Cox’s re-implementation to work with a large corpus of data, refactoring it as a scalable web application backend, and enriching it with various ranking factors so that more relevant results are presented first. A detailed analysis of these main contributions is included, along with descriptions of various smaller utilities to update, index and rank Debian packages.

With the completion of this thesis, Debian Code Search is working and accessible to the public at http://codesearch.debian.net/. Debian Code Search can be used to search 129 GiB of source code, typically within one second.
