4.11 Performance by processing step (profiling)<br />
4.11.2 After optimizing the trigram index<br />
Sections 3.8.4 (“Posting list decoding implementation”) and 3.8.6 (“Posting list query optimization”)<br />
explain the different optimizations to the trigram index in depth.<br />
After these optimizations, the trigram lookup step is ≈ 4× faster. Because the “ranking” step<br />
also retrieves trigram results, it becomes faster as well.<br />
[Plot “Response split into steps”: response time (0 ms to 100 ms) for total time, grep, request, sorting, ranking, n-gram]<br />
Figure 4.11: Optimizing the trigram lookup cuts the total time in half.<br />
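A per-step breakdown like the one plotted above can be collected by wrapping each processing step in a timer. The following Go sketch is only an illustration of that idea, not the instrumentation used in this work: the type `stepTimings` and its methods `timeStep` and `total` are hypothetical names, and the step names are taken from the figure labels.

```go
package main

import "time"

// stepTimings maps a step name (e.g. "n-gram", "ranking") to the
// wall-clock time accumulated in that step. Hypothetical helper type.
type stepTimings map[string]time.Duration

// timeStep runs fn and adds the elapsed time to the entry for name,
// so a step that runs several times per request is summed up.
func (t stepTimings) timeStep(name string, fn func()) {
	start := time.Now()
	fn()
	t[name] += time.Since(start)
}

// total returns the sum of all recorded step durations.
func (t stepTimings) total() time.Duration {
	var sum time.Duration
	for _, d := range t {
		sum += d
	}
	return sum
}
```

A request handler would create one `stepTimings` value per request, call `timeStep("n-gram", ...)`, `timeStep("ranking", ...)` and so on around the respective steps, and log the map at the end. Note that `total` only covers the instrumented steps, so it can be lower than the response time measured from outside.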
4.11.3 After re-implementing the ranking<br />
It turns out that the long ranking time in figure 4.11 was caused by the implementation’s use<br />
of regular expressions. Replacing the regular expressions with equivalent, hand-optimized<br />
code cut the time used for ranking in half, as you can observe in figure<br />
4.12:<br />
[Plot “Response split into steps”: response time (0 ms to 100 ms) for total time, grep, request, sorting, ranking, n-gram]<br />
Figure 4.12: Replacing regular expression matching with a loop cuts the ranking time in half.<br />
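To illustrate the kind of change described in this section, the following Go sketch contrasts a regular expression check with an equivalent hand-written loop. It is not the actual ranking code: the function names `matchRegexp`, `matchLoop` and `isWordByte` are hypothetical, and the check itself (does a search term occur as a whole word in a path?) is an assumed example. The two functions are only equivalent under the assumption that the term is non-empty and consists solely of ASCII word characters.

```go
package main

import (
	"regexp"
	"strings"
)

// isWordByte reports whether c is an ASCII word character
// ([0-9A-Za-z_]), the class Go's regexp package uses for \b.
func isWordByte(c byte) bool {
	return c == '_' ||
		(c >= '0' && c <= '9') ||
		(c >= 'a' && c <= 'z') ||
		(c >= 'A' && c <= 'Z')
}

// matchRegexp is the baseline: it builds a regular expression that
// matches term as a whole word and applies it to path.
func matchRegexp(path, term string) bool {
	re := regexp.MustCompile(`\b` + regexp.QuoteMeta(term) + `\b`)
	return re.MatchString(path)
}

// matchLoop performs the same check without the regexp package.
// Assumption: term is non-empty and contains only word characters.
func matchLoop(path, term string) bool {
	if term == "" {
		return false
	}
	for start := 0; ; {
		i := strings.Index(path[start:], term)
		if i < 0 {
			return false
		}
		begin := start + i
		end := begin + len(term)
		// A whole-word match needs a non-word byte (or the string
		// boundary) on both sides of the occurrence.
		leftOK := begin == 0 || !isWordByte(path[begin-1])
		rightOK := end == len(path) || !isWordByte(path[end])
		if leftOK && rightOK {
			return true
		}
		// Continue searching after the start of this occurrence.
		start = begin + 1
	}
}
```

For example, both functions accept the term `search` in `src/strings/search.go` and reject it in `src/research.go`. Whether the loop is actually faster in a given case has to be measured, for instance with a Go benchmark.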
Intuitively, one would now try to optimize the grep step. However, since that step is rather complicated<br />
and not well documented, optimizing it is beyond the scope of this work.<br />