Contents Telektronikk - Telenor
[Figure 3 (a): segment size [byte] versus time [msec]]<br />
quence, a timer generated acknowledgment<br />
can change the segment flow on the<br />
connection. An example of this effect is<br />
found in Figure 3 (a) which displays a<br />
log (Sparc2, SBA-100/2.2.6) of the size<br />
of the segments on a connection with a<br />
4 kbyte window and a user data size of<br />
4096 bytes. There are three different flow<br />
behaviors on the connection; 4096 byte<br />
segments, fluctuation between 1016 and<br />
3080 byte segments, and fluctuation between<br />
2028 and 2068 byte segments. Initially,<br />
4096 byte segments are transmitted.<br />
After nearly 4 seconds a timer generated<br />
acknowledgment acknowledges all<br />
outstanding bytes and announces a window<br />
size of 3080 bytes. This acknowledgment<br />
is generated before all bytes are<br />
copied out of the socket receive buffer,<br />
and the window is thereby reduced corresponding<br />
to the number of bytes still in<br />
the socket receive buffer. This affects the<br />
size of the following segments which<br />
will fluctuate between 1016 and 3080<br />
bytes. Figure 3 (b) presents the segment<br />
flow before and after this timer generated<br />
acknowledgment which acknowledges<br />
4096 bytes and announces a window of<br />
3080 bytes. A similar chain of events<br />
gets the connection into a fluctuation of<br />
transmitting 2028 and 2068 byte segments.<br />
This shows up as the horizontal<br />
line of the last part of the segment size<br />
graph in Figure 3 (a).<br />
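The alternating segment sizes can be reproduced with a little arithmetic. The sketch below is a simplified model, not the BSD TCP implementation: it assumes every 4096-byte user write is split into one segment that fills the announced window and one segment that carries the remainder. The function name `segment_pattern` is hypothetical.

```python
from typing import List


def segment_pattern(write_size: int, announced_window: int, writes: int) -> List[int]:
    """Return the segment sizes for `writes` user writes.

    Simplifying assumption: each write is sent as one segment that fills the
    announced window, followed by one segment with the remaining bytes.
    """
    if not 0 < announced_window <= write_size:
        raise ValueError("announced_window must be in the range 1..write_size")
    sizes = []
    for _ in range(writes):
        first = announced_window
        rest = write_size - first
        sizes.append(first)
        if rest:
            sizes.append(rest)
    return sizes


# Window of 3080 bytes announced by the timer generated acknowledgment.
print(segment_pattern(4096, 3080, 2))

# The second fluctuation reported in the text also sums to one user write.
print(2028 + 2068 == 4096)
```

Under this assumption a 3080-byte window yields the 3080/1016 alternation described in the text, and both reported pairs add up to the 4096-byte user data size.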
3.3 Host architecture and<br />
network adapters<br />
To establish how the difference in the<br />
Sparc2 and the Sparc10 bus architecture<br />
influences the achievable performance<br />
we measured the time for the send and<br />
receive paths for both architectures.<br />
Using the SBA-100 adapters, the measured<br />
driver times are proportional to the<br />
segment size. The receive times of the<br />
SBA-100 adapter include the time it<br />
[Figure 3 Timer generated acknowledgments. Panel (b): window size and unacknowledged bytes [byte] versus time [msec], around the timer generated acknowledgment]<br />
takes to read the cells from the receive<br />
FIFO on the adapter. The corresponding<br />
send times include the time it takes to<br />
write the cells to the transmit FIFO on<br />
the adapter. Using the SBA-200 adapters,<br />
the measurable driver times are largely<br />
independent of the segment size. The<br />
driver times for the DMA-based SBA-200<br />
adapter do not include the time to<br />
transfer the segment between host memory<br />
and the network adapter memory.<br />
(We do not have an Sbus analyzer.) For<br />
both the Sparc2 and the Sparc10, Figure 4<br />
presents, for different segment sizes, the total<br />
send and receive times and the driver<br />
send and receive times as seen from the<br />
host:<br />
- the total send time is the time from<br />
when the write call is issued until the driver<br />
has finished processing the outgoing segment,<br />
- the driver send time is the time from<br />
when the driver processing starts until it has<br />
finished processing the outgoing segment,<br />
- the total receive time is the time from<br />
when the host starts processing the network<br />
hardware interrupt until the read<br />
system call returns, and<br />
- the driver receive time is the time from<br />
when the host starts processing the network<br />
hardware interrupt until the packet has<br />
been inserted into the IP input queue.<br />
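The four intervals defined above can be written down as differences between timestamps. The sketch below is only an illustration: the class and field names are hypothetical, and the timestamp values are invented to show the arithmetic, not taken from the measurements.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Iterable


@dataclass
class SendSample:
    write_called: int      # write call issued (microseconds)
    driver_started: int    # driver starts processing the segment
    driver_finished: int   # driver has finished processing the segment

    @property
    def total_send(self) -> int:
        return self.driver_finished - self.write_called

    @property
    def driver_send(self) -> int:
        return self.driver_finished - self.driver_started


@dataclass
class ReceiveSample:
    interrupt_started: int   # host starts processing the hardware interrupt
    ip_queue_inserted: int   # packet inserted into the IP input queue
    read_returned: int       # read system call returns

    @property
    def total_receive(self) -> int:
        return self.read_returned - self.interrupt_started

    @property
    def driver_receive(self) -> int:
        return self.ip_queue_inserted - self.interrupt_started


def average(values: Iterable[int]) -> float:
    """Average one interval over all samples of a measurement point."""
    return mean(values)


# Invented timestamps, for illustration only.
sends = [SendSample(0, 300, 500), SendSample(1000, 1320, 1540)]
receives = [ReceiveSample(0, 250, 900), ReceiveSample(2000, 2270, 2940)]
print(average(s.total_send for s in sends), average(s.driver_send for s in sends))
print(average(r.total_receive for r in receives), average(r.driver_receive for r in receives))
```

By construction the driver interval is always contained in the corresponding total interval, so the driver time can never exceed the total time for the same segment.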
Each measurement point is the average<br />
of 1000 samples. A client-server program<br />
was written to control the segment flow<br />
through the sending and receiving end<br />
system. The client issues a request which<br />
is answered by a response from the server.<br />
Both the request segment and the<br />
response segment are of the same size.<br />
The reported send and receive times are<br />
taken as the average of the measured<br />
send and receive times at both the client<br />
and the server. To be able to send single<br />
segments of sizes up to MSS bytes, we<br />
removed the 4096-byte copy limit of the<br />
socket layer (Section 3.1).<br />
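A request-response driver of the kind described above might look like the following sketch. It is not the authors' program: it runs both ends in one process over `socket.socketpair()`, and it times the whole round trip at user level rather than the separate send and receive paths inside the kernel. The helper names `recv_exact`, `serve` and `measure` are hypothetical.

```python
import socket
import threading
import time
from statistics import mean


def recv_exact(sock: socket.socket, size: int) -> bytes:
    """Read exactly `size` bytes, since recv may return fewer."""
    buf = bytearray()
    while len(buf) < size:
        chunk = sock.recv(size - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf.extend(chunk)
    return bytes(buf)


def serve(sock: socket.socket, size: int, rounds: int) -> None:
    """Answer every request with a response of the same size."""
    response = b"\x00" * size
    for _ in range(rounds):
        recv_exact(sock, size)
        sock.sendall(response)


def measure(size: int, rounds: int = 1000) -> float:
    """Return the average request-response time in seconds over `rounds` samples."""
    client, server = socket.socketpair()
    worker = threading.Thread(target=serve, args=(server, size, rounds))
    worker.start()
    request = b"\x01" * size
    samples = []
    try:
        for _ in range(rounds):
            start = time.perf_counter()
            client.sendall(request)
            recv_exact(client, size)
            samples.append(time.perf_counter() - start)
    finally:
        client.close()   # unblocks the server thread if the loop failed
        worker.join()
        server.close()
    return mean(samples)


for size in (1024, 4096, 8192):
    print(size, measure(size))
```

Because only one request is outstanding at a time, each sample covers exactly one segment in each direction, which is the property the measurement setup relies on.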
As expected, the receive operation is the<br />
most time-consuming. The total send and<br />
receive processing times of the Sparc10<br />
are shorter for all segment sizes compared<br />
to the Sparc2. However, using the<br />
SBA-100 adapters, the driver processing<br />
times are in general shorter on the Sparc2.<br />
(The only exception is for small segments.)<br />
This is because the<br />
Sparc10 CPU does not have direct access<br />
to the Sbus, so the latency to access on-adapter<br />
memory is longer. Thus,<br />
a CPU that is 4.5 times as powerful does not<br />
guarantee higher performance with programmed<br />
I/O adapters. As mentioned<br />
above, the SBA-200 driver processing<br />
times do not include the moving of data<br />
between the host memory and the network<br />
adapter. The Sparc10 SBA-200<br />
driver send times are longer than the<br />
driver receive times. For Sparc2 it is the<br />
other way round. On transmit the driver<br />
must dynamically set up a vector of<br />
DMA address-length pairs. On receive,<br />
only a length field and a pointer need to<br />
be updated. The Sparc2 must in addition<br />
invalidate the cache lines mapping<br />
the pages of the receive buffers,<br />
while the Sparc10 runs a single cache<br />
invalidation routine. The Sparc10 SBA-200<br />
driver send time is slightly longer<br />
than the corresponding Sparc2 time, as<br />
the Sparc10 sets up DVMA mappings for<br />
the buffers to be transmitted.<br />
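The observation that a faster CPU can still show longer programmed I/O driver times can be illustrated with a simple linear cost model. This is a sketch under stated assumptions: the driver time is modelled as a fixed per-segment cost plus a per-byte cost for programmed I/O, and as a fixed cost only for DMA. All coefficients below are invented for illustration and are not measurements from the paper.

```python
def pio_driver_time_ns(segment_bytes: int, fixed_ns: int, per_byte_ns: int) -> int:
    """Programmed I/O: the CPU moves every byte, so time grows with segment size."""
    return fixed_ns + per_byte_ns * segment_bytes


def dma_driver_time_ns(segment_bytes: int, fixed_ns: int) -> int:
    """DMA: the measurable driver time excludes the data transfer itself."""
    return fixed_ns


# Hypothetical hosts (invented numbers):
# a slower CPU with direct bus access, and a faster CPU with a longer
# latency to on-adapter memory.
slow_cpu_direct_bus = {"fixed_ns": 40_000, "per_byte_ns": 50}
fast_cpu_indirect_bus = {"fixed_ns": 10_000, "per_byte_ns": 70}

for size in (512, 4096, 8192):
    slow = pio_driver_time_ns(size, **slow_cpu_direct_bus)
    fast = pio_driver_time_ns(size, **fast_cpu_indirect_bus)
    winner = "fast CPU" if fast < slow else "slow CPU"
    print(size, slow, fast, winner)
```

With these invented coefficients the faster CPU wins only for small segments, because its lower fixed cost is outweighed by the higher per-byte access latency once the segment grows.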
The send and receive processing times<br />
reflect the average time to process one<br />
single segment. The processing times of<br />
the receive path do not include the time<br />
from when the network interface raises an<br />
interrupt until the interrupt is served by<br />
the network driver. Neither do the times<br />
include the acknowledgment generation<br />
and reception. The numbers are therefore