One - The Linux Kernel Archives

222 • Linux Symposium 2004 • Volume One

shown in Figure 4. Here the graph shows that all techniques produce very similar results, with a very slight performance advantage going to epoll-ET after the saturation point is reached.

[Figure 4: µserver performance on SPECweb99-like workload with no idle connections. Axes: Requests/s (x), Replies/s (y).]

5.1 Results With Idle Connections

We now compare the performance of the event mechanisms with 10,000 idle connections. The idle connections are intended to simulate the presence of larger numbers of simultaneous connections (as might occur in a WAN environment). Thus, the event dispatch mechanism has to keep track of a large number of descriptors even though only a very small portion of them are active.

By comparing results in Figures 3 and 5 one can see that the performance of select and poll degrades by up to 79% when the 10,000 idle connections are added. The performance of epoll2 with idle connections suffers similarly to select and poll. In this case, epoll2 suffers from the overheads incurred by making a large number of epoll_wait calls, the vast majority of which return events that are not of current interest to the server. Throughput with level-triggered epoll is slightly reduced with the addition of the idle connections, while edge-triggered epoll is not impacted.

The results for the SPECweb99-like workload with 10,000 idle connections are shown in Figure 6. In this case each of the event mechanisms is affected in much the same way as it is by idle connections in the one-byte workload case.
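The difference between level-triggered and edge-triggered notification in the presence of idle descriptors can be illustrated with a small sketch. This is not the µserver's code: it is a minimal Python example that assumes a Linux system (where `select.epoll` is available), and the function name `count_wakeups` and its parameters are hypothetical. It registers a number of silent socket pairs plus one active pair, writes one byte to the active pair, and then polls several times without ever reading the data.

```python
import select
import socket


def count_wakeups(edge_triggered, idle=100, polls=3):
    """Return the number of events reported by each of `polls` epoll waits.

    `idle` socket pairs never carry data; one extra pair receives a single
    byte that is deliberately left unread. All names here are illustrative.
    """
    ep = select.epoll()
    pairs = [socket.socketpair() for _ in range(idle + 1)]
    flags = select.EPOLLIN | (select.EPOLLET if edge_triggered else 0)
    for reader, _writer in pairs:
        reader.setblocking(False)
        ep.register(reader.fileno(), flags)

    # Make exactly one descriptor readable; the rest stay idle.
    _active_reader, active_writer = pairs[0]
    active_writer.send(b"x")

    reports = []
    for _ in range(polls):
        events = ep.poll(0.1)  # timeout in seconds
        reports.append(len(events))

    ep.close()
    for reader, writer in pairs:
        reader.close()
        writer.close()
    return reports


if __name__ == "__main__":
    print("level-triggered:", count_wakeups(edge_triggered=False))
    print("edge-triggered: ", count_wakeups(edge_triggered=True))
```

In both modes only the single active descriptor is ever reported, regardless of how many idle descriptors are registered. With level-triggered registration the unread byte is reported on every wait, whereas with edge-triggered registration it is reported once, when the descriptor becomes readable, and not again on later waits.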
[Figure 5: µserver performance on one-byte workload and 10,000 idle connections. Axes: Requests/s (x), Replies/s (y).]

[Figure 6: µserver performance on SPECweb99-like workload and 10,000 idle connections. Axes: Requests/s (x), Replies/s (y). Series: select, poll, epoll-ET, epoll-LT, epoll2.]

5.2 Tuning Accept Strategy for epoll

The µserver’s accept strategy has been tuned for use with select. The µserver includes a parameter that controls the number of connections that are accepted consecutively. We call this parameter the accept-limit. Parameter values range from one to infinity (Inf). A value of one limits the server to accepting at most one connection when notified of a pending connection request, while Inf causes the server to consecutively accept all currently pending connections. To this point we have used the accept strategy that was shown to be effective for select by
Brecht et al. [4] (i.e., accept-limit is Inf). In order to verify whether the same strategy performs well with the epoll-based methods we explored their performance under different accept strategies.

Figure 7 examines the performance of level-triggered epoll after the accept-limit has been tuned for the one-byte workload (other values were explored but only the best values are shown). Level-triggered epoll with an accept-limit of 10 shows a marked improvement over the previous accept-limit of Inf, and now matches the performance of select on this workload. The accept-limit of 10 also improves peak throughput for the epoll-ctlv model by 7%. This gap widens to 32% at 21,000 requests/sec. In fact the best accept strategy for epoll-ctlv fares slightly better than the best accept strategy for select.

[Figure 7: µserver performance on one-byte workload with different accept strategies and no idle connections. Axes: Requests/s (x), Replies/s (y). Series: select accept=Inf, epoll-LT accept=Inf, epoll-LT accept=10, epoll-ctlv accept=Inf, epoll-ctlv accept=10.]

Varying the accept-limit did not improve the performance of the edge-triggered epoll technique under this workload and it is not shown in the graph. However, we believe that the effects of the accept strategy on the various epoll techniques warrant further study, as the efficacy of the strategy may be workload dependent.

6 Discussion

In this paper we use a high-performance event-driven HTTP server, the µserver, to compare and evaluate the performance of the select, poll, and epoll event mechanisms. Interestingly, we observe that under some of the workloads examined the throughput obtained using select and poll is as good as or slightly better than that obtained with epoll. While these workloads may not utilize representative numbers of simultaneous connections, they do stress the event mechanisms being tested.
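The accept-limit parameter described in Section 5.2 can be sketched as a bounded accept loop on a non-blocking listening socket. This is an illustration only, not the µserver's implementation: the names `accept_batch` and `demo` are hypothetical, and the value Inf is represented here by `None` as an assumption of the sketch.

```python
import socket
import time


def accept_batch(listener, accept_limit=None):
    """Accept pending connections from a non-blocking listening socket.

    accept_limit=None stands in for "Inf": keep accepting until no
    connection is pending. An integer caps the number of consecutive accepts.
    """
    accepted = []
    while accept_limit is None or len(accepted) < accept_limit:
        try:
            conn, _addr = listener.accept()
        except BlockingIOError:
            break  # nothing left in the pending queue
        accepted.append(conn)
    return accepted


def demo(pending=5, accept_limit=2):
    """Queue `pending` loopback connections, then accept them in two batches."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(16)
    listener.setblocking(False)

    address = listener.getsockname()
    clients = [socket.create_connection(address) for _ in range(pending)]
    time.sleep(0.1)  # give the kernel a moment to queue every connection

    first = accept_batch(listener, accept_limit)  # bounded batch
    rest = accept_batch(listener, None)           # "Inf": drain the queue
    counts = (len(first), len(rest))

    for sock in clients + first + rest + [listener]:
        sock.close()
    return counts


if __name__ == "__main__":
    print(demo())
```

With a finite limit the server returns to its event loop after a bounded number of accepts and services existing connections sooner. With the unbounded setting it drains the whole pending queue on each notification.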
Our results also show that a main source of overhead when using level-triggered epoll is the large number of epoll_ctl calls. We explore techniques which significantly reduce the number of epoll_ctl calls, including the use of edge-triggered events and a system call, epoll_ctlv, which allows the µserver to aggregate large numbers of epoll_ctl calls into a single system call. While these techniques are successful in reducing the number of epoll_ctl calls, they do not appear to provide appreciable improvements in performance.

As expected, the introduction of idle connections results in dramatic performance degradation when using select and poll, while not noticeably impacting the performance when using epoll. Although it is not clear that the use of idle connections to simulate larger numbers of connections is representative of real workloads, we find that the addition of idle connections does not significantly alter the performance of the edge-triggered and level-triggered epoll mechanisms. The edge-triggered epoll mechanism performs best, with the level-triggered epoll mechanism offering performance that is very close to edge-triggered.

In the future we plan to re-evaluate some of
