20.11.2012 Views

Contents Telektronikk - Telenor


The effect of end system hardware and software on TCP/IP throughput performance over a local ATM network

BY KJERSTI MOLDEKLEV, ESPEN KLOVNING AND ØIVIND KURE

Abstract

High-speed networks reinstate the end system as the bottleneck of the communication path. The Internet TCP/IP protocol suite is the first higher-level protocol stack to be used on ATM-based networks. In this paper we show that the host architecture and the host network interface are crucial for memory-to-memory TCP throughput. In addition, configurable parameters like the TCP maximum window size and the user data size in the write and read system calls influence the segment flow and the throughput performance. We present measurements done between Sparc2 and Sparc10 based machines for both generations of ATM adapters from FORE Systems. The first generation adapters are based on programmed I/O; the second generation adapters on DMA. To explain the variations in the throughput characteristics, we put small optimized probes in the network driver to log the segment flow on the TCP connections.

1 Introduction

The TCP/IP (Transmission Control Protocol/Internet Protocol) stack has shown great durability. It has been adapted to and widely used over a large variety of network technologies, ranging from low-speed point-to-point lines to high-speed networks like FDDI (Fiber Distributed Data Interface) and ATM (Asynchronous Transfer Mode). The latter is based on transmission of small fixed-size cells and aims at offering statistical multiplexing of connections with different traffic characteristics and quality-of-service requirements. For ATM to work as intended, the network depends on a characterization of the data flow on the connection. TCP/IP has no notion of traffic characteristics and quality-of-service requirements, and considers the ATM network as high-bandwidth point-to-point links between routers and/or end systems. Nevertheless, TCP/IP is the first protocol to run on top of ATM.

Several extensions have been suggested to make TCP perform better over networks with a high bandwidth-delay product [1]. At present, these extensions are not widely used. Furthermore, in the measurements presented in this paper the propagation delay is minimal, which makes these extensions of little importance.
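To make the role of the bandwidth-delay product concrete, the following minimal sketch computes how many bytes must be in flight to keep a link busy for one round trip, and compares that with the largest window a 16-bit TCP window field can advertise (65535 bytes). It assumes that a larger window is among the kinds of extension referred to above; the function names and the round-trip times are illustrative assumptions, not values from the measurements.

```python
MAX_UNSCALED_WINDOW = 65535  # largest value of the 16-bit TCP window field


def bandwidth_delay_product_bytes(link_bit_rate, round_trip_s):
    """Bytes that must be in flight to keep the link busy for one round trip."""
    return link_bit_rate * round_trip_s / 8.0


def exceeds_unscaled_window(link_bit_rate, round_trip_s):
    """True if the path needs more outstanding data than a 16-bit window allows."""
    return bandwidth_delay_product_bytes(link_bit_rate, round_trip_s) > MAX_UNSCALED_WINDOW


# Illustrative round-trip times only (assumed): a local network and a long-haul path.
for rtt in (0.001, 0.060):
    bdp = bandwidth_delay_product_bytes(140e6, rtt)
    print(f"rtt={rtt * 1e3:.0f} ms, bdp={bdp:.0f} bytes, "
          f"exceeds 16-bit window: {exceeds_unscaled_window(140e6, rtt)}")
```

With a short local round trip the product stays well below the 16-bit limit, which is consistent with the extensions mattering little in a local setting.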

The TCP/IP protocol stack, and in particular its implementations for BSD UNIX derived operating systems, has repeatedly been the subject of analysis [2], [3], [4], [5], [6], [7]. These analyses consider networks with lower bandwidth or smaller frame transmission units than the cell-based ATM network can offer through the ATM adaptation layers (AALs).

This paper contributes to the TCP analyses along two axes. The first is the actual throughput results of TCP/IP over a high-speed local area ATM network for popular host network interfaces and host architectures. Measurements are done on both Sparc2 and Sparc10 based machines using both generations of ATM network interfaces from FORE Systems: the programmed I/O based SBA-100 adapters with segmentation and reassembly in network driver software, and the more advanced DMA based SBA-200 adapters with on-board segmentation and reassembly. Both the hardware and the software component of the network interface, the network adapter and the network driver, respectively, are upgraded between our different measurements.

The second axis is an analysis of how and why the hardware and software components influence the TCP/IP segment flow and thereby the measured performance. The software parameters with the largest influence are the maximum window size and the user data size. In general, the throughput increases with increasing window and user data sizes up to certain limits. The behavior is not monotonic; the throughput graphs have their peaks and drops.

TCP is byte-stream oriented [8]. The segmentation of the byte stream depends on the user data size, the window flow control, the acknowledgment scheme, Nagle's algorithm for avoiding the transmission of many small segments, and the operating system integration of the TCP implementation. The functionality and speed of the host and network interface also influence the performance; more powerful machines and more advanced interfaces can affect the timing relationships between data segments, window updates and acknowledgments.
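As an illustration of how one of these mechanisms shapes the segment flow, the sketch below captures the core decision of Nagle's algorithm in a deliberately simplified form: a full-sized segment may always be sent, while a small segment is held back as long as earlier data is still unacknowledged. It ignores window limits, timers and push handling, and the function and parameter names are hypothetical, not taken from any particular TCP implementation.

```python
def nagle_allows_send(queued_bytes, mss, unacked_bytes):
    """Simplified Nagle test: may the sender emit a segment right now?

    queued_bytes  -- user data waiting in the send buffer
    mss           -- maximum segment size of the connection
    unacked_bytes -- data sent earlier but not yet acknowledged
    """
    if queued_bytes <= 0:
        return False              # nothing to send
    if queued_bytes >= mss:
        return True               # a full-sized segment can always go
    return unacked_bytes == 0     # a small segment goes only on an idle connection


# A small write on an idle connection goes out at once; a second small
# write is held back until the first one has been acknowledged.
print(nagle_allows_send(100, 1460, 0))
print(nagle_allows_send(100, 1460, 100))
```

Under this rule, user data sizes below the maximum segment size interact with the timing of acknowledgments, which is one way the user data size can influence the segment flow.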

The rest of this paper is outlined as follows: The next section describes the measurement environment and methods. The third section presents in more detail the software and hardware factors influencing the performance: the protocol mechanisms, the system environment factors, and the host architecture. The fourth section contains throughput measurements and segment flow analysis of our reference architecture, a Sparc2 based machine using the programmed I/O based SBA-100 interface. The fifth section presents performance results when upgrading a software component, namely the network driver of the network interface. The sixth section discusses the results when upgrading the hosts to Sparc10 based machines. The throughput results and segment flows using the SBA-200 adapters in Sparc10 machines follow in the seventh section. The paper closes with a summary and conclusions.

2 Measurement environment and methods

The performance measurements in this paper are based on the standard TCP/IP protocol stack in SunOS 4.1.x. We used two Sparc2 based Sun IPX machines and two Sparc10 based Axil 311/5.1 machines. The I/O bus of both machine architectures is the Sbus [9] to which the network adapter is attached. The Sun machines run SunOS 4.1.1, while the Axil machines run SunOS 4.1.3. For our TCP measurements the differences between the two SunOS versions are negligible. Access to the TCP protocol is through the BSD-based socket interface [11]. The workstations have both ATM and Ethernet network connections. Figure 1 illustrates the measurement environment and set-up.
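A memory-to-memory throughput test through the socket interface can be sketched as below. This is not the measurement program used in the paper; it is a minimal illustration that runs over the loopback interface, and `measure_throughput` and its parameters are hypothetical names. It assumes that the socket send and receive buffer sizes bound the TCP window, as in BSD-derived stacks, and uses `sendall` and `recv` in place of the write and read system calls.

```python
import socket
import threading
import time


def measure_throughput(total_bytes, user_data_size, socket_buffer_size):
    """Send total_bytes over a loopback TCP connection.

    Returns (bytes_received, throughput_in_Mbit_per_s).
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Set the receive buffer before listen(); accepted sockets normally inherit it.
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, socket_buffer_size)
    listener.bind(("127.0.0.1", 0))      # port 0: let the OS pick a free port
    listener.listen(1)
    received = [0]

    def sink():
        conn, _ = listener.accept()
        with conn:
            while True:
                chunk = conn.recv(user_data_size)   # one "read" per iteration
                if not chunk:
                    break                           # sender closed the connection
                received[0] += len(chunk)

    reader = threading.Thread(target=sink)
    reader.start()

    sender = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sender.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, socket_buffer_size)
    sender.connect(listener.getsockname())
    payload = b"\0" * user_data_size     # data comes from memory, not from disk

    start = time.perf_counter()
    sent = 0
    while sent < total_bytes:
        n = min(user_data_size, total_bytes - sent)
        sender.sendall(payload[:n])      # one "write" of at most user_data_size bytes
        sent += n
    sender.close()
    reader.join()                        # wait until the receiver has drained everything
    elapsed = time.perf_counter() - start
    listener.close()
    return received[0], received[0] * 8 / elapsed / 1e6


if __name__ == "__main__":
    for size in (1024, 4096, 8192):
        rx, mbit = measure_throughput(4 * 1024 * 1024, size, 32 * 1024)
        print(f"user data size {size:5d} bytes: {mbit:8.1f} Mbit/s ({rx} bytes)")
```

Varying `user_data_size` and `socket_buffer_size` in such a loop corresponds to the two software parameters discussed in this paper; the absolute numbers on a loopback interface say nothing about an ATM network.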

2.1 The local ATM network

The workstations are connected to an ATM switch, ASX-100, from FORE Systems. The ASX-100 is a 2.5 Gbit/s bus-based ATM switch with an internal Sparc2 based switch controller. The ATM physical interface is a 140 Mbit/s TAXI interface [12]. The ATM host network interfaces, SBA-100 and SBA-200, are the first and second generation from FORE.
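The cell format alone puts an upper bound on the payload rate a 140 Mbit/s link can carry. The sketch below works this out under the assumption that the full 140 Mbit/s is available for 53-byte cells, ignoring any physical-layer framing as well as frame-level headers, trailers and padding of the adaptation layers. The per-cell payload sizes are the standard ones (48 bytes per cell, of which AAL3/4 uses 4 bytes for its per-cell header and trailer); the function name is illustrative.

```python
CELL_BYTES = 53           # ATM cell: 5-byte header + 48-byte payload
CELL_PAYLOAD = 48         # payload bytes per cell, all usable for data with AAL5
AAL34_CELL_PAYLOAD = 44   # AAL3/4 spends 4 of the 48 bytes on a per-cell header and trailer


def payload_rate_mbit(link_mbit, payload_bytes_per_cell):
    """Upper bound on the payload rate when every cell on the link is full."""
    return link_mbit * payload_bytes_per_cell / CELL_BYTES


print(round(payload_rate_mbit(140, CELL_PAYLOAD), 1))        # bound with 48 payload bytes per cell
print(round(payload_rate_mbit(140, AAL34_CELL_PAYLOAD), 1))  # bound with 44 payload bytes per cell
```

Any TCP throughput measured over such a link has to stay below these bounds, since the TCP and IP headers are carried inside the cell payload as well.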

The first generation Sbus ATM adapter, SBA-100 [17], [18], is a simple slave-only interface based on programmed I/O. The ATM interface has a 16 kbyte receive FIFO and a 2 kbyte transmit FIFO. The SBA-100 network adapter performs on-board computation of the cell based AAL3/4 CRC, but the segmentation and reassembly between frames and cells are done entirely in software by the network driver. The SBA-100 adapters have no hardware support for AAL5 frame based CRC. Therefore, using the AAL3/4 adaptation layer gives the best performance. The SBA-100 adapters were configured to issue an
