4 Search result quality<br />
4.10 Overall performance<br />
Of course, outside of theoretical simulations, a search engine is not used by exactly one<br />
person at a time; several people may want to access it simultaneously. This section examines the overall<br />
performance of Debian Code Search by simulating real-world usage in various ways.<br />
It seeks to determine the number of queries per second (qps) that Debian Code<br />
Search can handle. Keep in mind that this number is a worst-case limit: only actual search<br />
queries are measured, while the time which users normally spend looking at the search results<br />
or accessing static pages (the index or help page for example) is neglected. That is, assuming<br />
DCS could handle 5 queries per second, that means that in the worst case, when there are<br />
6 people who want to search in the very same second, one of them has to wait longer than<br />
the others (or everyone has to wait a little bit longer, depending on whether requests to the<br />
backend are artificially limited).<br />
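The arithmetic behind the example above can be sketched in a few lines of shell. This is only a simplified model under stated assumptions: all queries arrive in the same second, the backend serves exactly `capacity` queries per second, and the function names `delayed_queries` and `seconds_to_drain` are hypothetical helpers, not part of DCS.<br />

```shell
#!/bin/sh
# Simplified model (assumption): `capacity` queries are served per second,
# and all `arrivals` queries come in within the very same second.

# Number of queries that cannot be served within the first second.
delayed_queries() {
    arrivals=$1
    capacity=$2
    if [ "$arrivals" -gt "$capacity" ]; then
        echo $(( arrivals - capacity ))
    else
        echo 0
    fi
}

# Whole seconds needed until every query has been served (rounded up).
seconds_to_drain() {
    arrivals=$1
    capacity=$2
    echo $(( (arrivals + capacity - 1) / capacity ))
}

delayed_queries 6 5     # prints 1
seconds_to_drain 6 5    # prints 2
```

With 5 qps and 6 simultaneous searches, one query spills over into the next second, which matches the case described in the text.<br />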
4.10.1 Measurement setup<br />
To measure HTTP requests per second for certain access patterns, there are multiple programs<br />
available:<br />
ApacheBench: Shipping with the Apache HTTP server, this tool has been used for years to<br />
measure the performance of the Apache server and others. While it supports a gnuplot<br />
output format, the gnuplot output is a histogram since it is sorted by total request time,<br />
not sequentially.<br />
siege: A multi-threaded HTTP load testing and benchmarking utility.<br />
weighttp: Developed as part of the lighttpd project, weighttp describes itself as a lightweight<br />
and small benchmarking tool for webservers. It is multi-threaded and event-based.<br />
weighttp’s command-line options are similar to those of ApacheBench, but it does not<br />
support dumping the raw timing measurements.<br />
httperf: A single-threaded but non-blocking HTTP performance measurement tool. Its<br />
strength is its workload generator, with which more realistic scenarios can be tested<br />
rather than just hitting the same URL over and over again.<br />
Unfortunately, none of these tools provides a way of obtaining the raw timing measurements in sequential order.<br />
Therefore, an additional parameter has been implemented in dcs-web which makes dcs-web<br />
measure and save the time spent for processing each request. This measurement file can later<br />
be processed with gnuplot.<br />
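Before plotting, a quick sanity check of such a measurement file can be useful. The following is a minimal sketch using awk rather than gnuplot. It assumes the file contains one numeric duration per line, which is an assumption, since the exact format written by dcs-web is not shown here. The function name `summarize_timings` is hypothetical.<br />

```shell
#!/bin/sh
# Hypothetical helper: print sample count, minimum, mean and maximum of a
# timing file. Assumption: one numeric duration per line, unit unspecified.
summarize_timings() {
    awk '
        NF == 0 { next }                      # skip empty lines
        {
            v = $1 + 0                        # force numeric interpretation
            if (n == 0 || v < min) min = v
            if (n == 0 || v > max) max = v
            sum += v
            n++
        }
        END {
            if (n == 0) { print "no samples"; exit 1 }
            printf "n=%d min=%g mean=%g max=%g\n", n, min, sum / n, max
        }
    ' "$1"
}

# Usage (file name taken from the text, format assumed):
#   summarize_timings dcs-total-1.dat
```

A summary like this only shows whether the values are plausible; the per-request view over time still requires plotting the raw file.<br />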
To verify that these measurements are correct, two lines of the very simple weighttp source<br />
code have been changed to make it print the request duration to stdout. Then, dcs-web was<br />
started and the benchmark was run in this way:<br />
export DCS="http://localhost:28080"<br />
./dcs-web -timing_total_path="dcs-total-1.dat" &<br />
./weighttp -c 1 -n 100 "$DCS/search?q=XCreateWindow" \<br />
| grep 'req duration' > weighttp-vs-internal.raw
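<br />
One way to compare the two resulting files is sketched below. It assumes that both measurements have already been reduced to one number per line, in the same request order and in the same unit. The output format of the modified weighttp is not shown here, so the numeric value would first have to be extracted from each 'req duration' line. The function name `max_abs_diff` is hypothetical.<br />

```shell
#!/bin/sh
# Hypothetical helper: largest absolute difference between two timing files.
# Assumptions: one number per line, same request order, same unit.
max_abs_diff() {
    paste "$1" "$2" | awk '
        NF == 2 {                             # only lines present in both files
            d = $1 - $2
            if (d < 0) d = -d
            if (d > max) max = d
        }
        END { printf "%g\n", max + 0 }
    '
}

# Usage (file names are placeholders for the extracted numeric columns):
#   max_abs_diff weighttp-durations.txt dcs-durations.txt
```

A small maximum difference would suggest that the timings recorded inside dcs-web agree with those observed by the client; a large one would point to overhead outside the measured code path.<br />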