13.02.2013 Views

2 Debian Code Search: An Overview


4.11 Performance by processing step (profiling)

4.11.2 After optimizing the trigram index

Sections 3.8.4 (“Posting list decoding implementation”) and 3.8.6 (“Posting list query optimization”) explain the different optimizations to the trigram index in depth. After these optimizations, the trigram lookup step is ≈ 4× faster. Since the “ranking” step also retrieves trigram results, it is faster as well.
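The referenced sections are not reproduced here, so the following is only a generic sketch of what a trigram lookup involves, not the implementation described in this work. It assumes an in-memory index that maps each trigram to a sorted posting list of document ids. The function names (`trigrams`, `build_index`, `intersect`, `candidates`) and the choice to intersect the shortest posting lists first are illustrative assumptions.

```python
def trigrams(text):
    """Return the set of all 3-character substrings of text."""
    return {text[i:i + 3] for i in range(len(text) - 2)}


def build_index(documents):
    """Map each trigram to a sorted posting list of document ids."""
    index = {}
    for doc_id, text in enumerate(documents):
        # Document ids are visited in ascending order and each trigram is
        # added once per document, so every posting list stays sorted.
        for trigram in trigrams(text):
            index.setdefault(trigram, []).append(doc_id)
    return index


def intersect(a, b):
    """Intersect two sorted lists of document ids with a linear merge."""
    result = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            result.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return result


def candidates(index, query):
    """Return ids of documents that contain every trigram of query."""
    posting_lists = []
    for trigram in trigrams(query):
        posting_list = index.get(trigram)
        if posting_list is None:
            return []  # an unknown trigram rules out every document
        posting_lists.append(posting_list)
    if not posting_lists:
        raise ValueError("query must contain at least 3 characters")
    # Assumption: starting with the shortest list keeps intermediate
    # results small. This is a common heuristic, not taken from the text.
    posting_lists.sort(key=len)
    result = posting_lists[0]
    for posting_list in posting_lists[1:]:
        result = intersect(result, posting_list)
        if not result:
            break
    return result


docs = ["int main(void)", "void helper(int x)", "static int main_loop"]
index = build_index(docs)
print(candidates(index, "main"))  # [0, 2]
```

The returned ids are only candidates: a document can contain all trigrams of the query without containing the query itself, which is why a separate grep step still has to confirm each match.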

[Bar chart “Response split into steps”: response time (0 ms to 100 ms) for total time, grep, request, sorting, ranking, and n-gram.]

Figure 4.11: Optimizing the trigram lookup cuts the total time in half.

4.11.3 After re-implementing the ranking

It turns out that the long ranking time in figure 4.11 was caused by the implementation's use of regular expressions. Replacing the regular expressions with hand-optimized, equivalent code cut the ranking time in half, as figure 4.12 shows:

[Bar chart “Response split into steps”: response time (0 ms to 100 ms) for total time, grep, request, sorting, ranking, and n-gram.]

Figure 4.12: Replacing regular expression matching with a loop cuts the ranking time in half.
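The kind of replacement described above can be illustrated with a small sketch. The text does not say which regular expressions the ranking used, so the check below (does a search term occur in a file path as a whole word?) is a hypothetical ranking signal, and both function names are assumptions. The point is only that a simple pattern can be expressed as an equivalent hand-written loop. Whether the loop is faster depends on the language and regular expression engine.

```python
import re


def word_match_regex(path, term):
    """Regex version: term must not touch ASCII letters or digits."""
    if not term:
        return False
    pattern = r"(?<![A-Za-z0-9])" + re.escape(term) + r"(?![A-Za-z0-9])"
    return re.search(pattern, path) is not None


def _is_word_char(c):
    # Matches the character class [A-Za-z0-9] used in the regex above.
    return c.isascii() and c.isalnum()


def word_match_loop(path, term):
    """Hand-written equivalent: scan every occurrence, check neighbours."""
    if not term:
        return False
    start = path.find(term)
    while start != -1:
        end = start + len(term)
        before_ok = start == 0 or not _is_word_char(path[start - 1])
        after_ok = end == len(path) or not _is_word_char(path[end])
        if before_ok and after_ok:
            return True
        # Continue the search one character further to the right.
        start = path.find(term, start + 1)
    return False


for path in ["src/i3/main.c", "src/i3bar/main.c", "i3bar/i3"]:
    # Both implementations are expected to agree on every input.
    assert word_match_regex(path, "i3") == word_match_loop(path, "i3")
```

Checking the two versions against each other on sample inputs, as in the last lines, is a cheap way to gain confidence that such a rewrite stays equivalent.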

Intuitively, one would now try to optimize the grep step. However, since that step is rather complicated and not well documented, optimizing it is beyond the scope of this work.
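A per-step breakdown like the one in the figures of this section can be collected with very little code. The sketch below is a minimal illustration rather than the instrumentation used for these measurements: `StepTimer` is a hypothetical helper, and the workloads inside the `with` blocks are placeholders for the real processing steps.

```python
import time
from contextlib import contextmanager


class StepTimer:
    """Accumulates wall-clock time per named processing step."""

    def __init__(self):
        self.durations = {}

    @contextmanager
    def step(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            elapsed = time.perf_counter() - start
            # Repeated steps with the same name are summed up.
            self.durations[name] = self.durations.get(name, 0.0) + elapsed

    def report(self):
        """Return (step, milliseconds) pairs, slowest step first."""
        pairs = [(name, secs * 1000.0) for name, secs in self.durations.items()]
        return sorted(pairs, key=lambda pair: pair[1], reverse=True)


timer = StepTimer()
with timer.step("n-gram"):
    sum(range(10_000))           # placeholder for the trigram lookup
with timer.step("ranking"):
    sorted(range(10_000))        # placeholder for ranking the results
with timer.step("grep"):
    "needle" in "haystack" * 1000  # placeholder for the grep step

for name, millis in timer.report():
    print(f"{name}: {millis:.3f} ms")
```

Sorting the report by duration makes the most expensive step stand out, which is the step one would look at next.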
