
Magazine - 1000 BiT

Digital Technical Journal

INTERNET PROTOCOL V.6
PRESERVATION OF HISTORICAL COMPUTER SYSTEMS
FORTRAN FOR PARALLEL COMPUTING
SERVER PERFORMANCE EVALUATION AND OPTIMIZATION
INTERNET COLLABORATION SOFTWARE


Foreword

Alan G. Nemeth
Corporate Consultant, UNIX Architecture and Technology


key elements of the protocol, Digital's implementation was the only one running to demonstrate that the design was indeed feasible. But don't believe that we can implement all the pieces of IPv6 as a single company. Therefore we choose to share the implementation experience through this paper to aid others in their efforts to deal with the implementation problems. We also don't claim completeness; the full suite of specifications for IPv6 is evolving, and the software to implement it is large. We fully expect that portions of our ultimate product offerings will be developed by others in the industry.

The long-term evolution of the Internet captured in the IPv6 implementation paper is but one example in this issue of the extent to which computing now has a history that gives us much insight into the future. Certainly the paper by Supnik and Burnet is an explicit trip through computing history. The re-creation, both physical and logical, of computing systems of the past can only help remind us that the artifacts we create have a longer life than we anticipate. As our programmers write new code, or our hardware designers produce new architectural approaches, or our storage designers push the boundaries on new media technologies, do they consider the imponderables of running these systems 25 or more years in the future? The view of archivists trying to preserve this history reminds us of the difficulty of preservation after the fact and of the amazing duration of design decisions.

The paper on the evolution of Fortran is yet another example of the rich history of computing. Here we see clearly the evolution of a key language to accommodate the changing patterns of system architectural designs and parallel program concepts. The computer industry frequently develops commercially important programs by
evolution: the 100,000-line program that 10 years later has become 10 million lines of code in an assortment of languages and computing styles. Here the venerable Fortran (first introduced in 1954!) adds support for some of the latest approaches to fast system interconnect represented by [...]

[...] ubiquitous nature of the Internet permits and encourages tools such as this that utilize computer systems in new ways. This approach builds on the fabric that we emphasized in the IPv6 paper but sees the Internet as a tool and a component of a larger solution and shows how to exploit these capabilities to allow new ways of working. Using imagination and building on the work of others are characteristic of the approach tak[...]


Internet Protocol Version 6 and the Digital UNIX Implementation Experience

Daniel T. Harrington
James P. Bound
John J. McCann
Matt Thomas

In the early 1990s, the Internet community recognized that the current TCP/IP architecture was not capable of sustaining the explosive growth of the Internet. In July 1994, the Internet Protocol next generation (IPng) directorate responded to the problem with the Internet Protocol version 6 (IPv6) as the replacement network layer protocol. Working groups of the Internet Engineering Task Force (IETF) then began to build specifications that would address the needs for an expanded Internet address space, an increase in router table size, and new technology features. As a contributor to these efforts, Digital has implemented IPv6 on the Digital UNIX platform. The primary goal of Digital's efforts has been to evaluate the technical feasibility of the proposed architecture and provide critical feedback to the standards development process in the IETF. The secondary goal has been to evaluate system design alternatives to gain the experience needed to allow Digital to incorporate this new architecture into existing products.

As one of its ongoing advanced development efforts in networking technology, Digital has built an Internet Protocol version 6 (IPv6) prototype for the Digital UNIX operating system. In this paper, we describe the design of the Digital UNIX IPv6 prototype and its history relevant to the Internet Protocol next generation (IPng) effort in the Internet Engineering Task Force (IETF). We also compare its relationship with the existing Transmission Control Protocol/Internet Protocol (TCP/IP) suite. We emphasize techniques and technologies that were developed to accommodate particular aspects of the IPv6 architecture and issues that required further discussion in the IETF. In particular, we discuss the modifications to the transport layer modules to use two distinct network layer protocols, along with the implications to the UNIX socket layer and applications.
In addition, we describe the new IPv6 and Internet Control Message Protocol (ICMP) network layer modules, including their interactions with both the data link layer and the IPv4 protocol. We review the new Neighbor Discovery Protocol and its algorithms and give details of its implementation.

To accommodate the dynamic nature of future networks, IPv6 includes mechanisms to do both stateless and stateful address configuration, as well as router discovery; we explain the design of a user-mode process that implements these functions. The paper includes a discussion of enhancements to well-known IPv4 services, such as dynamic updates to the domain naming service (DNS), as well as general techniques to support the transition of existing applications. The paper concludes with an overview of what we have learned in this project and summarizes our current status and future work, including efforts in nonbroadcast multiple access (NBMA) data link technologies such as asynchronous transfer mode (ATM) and resource reservation protocols.

Digital Technical Journal Vol. 8 No. 3 1996

Internet Protocol Next Generation

In the early 1990s, the members of the Internet community realized that the address space and certain aspects of the current TCP/IP architecture were not capable of sustaining the explosive growth of the


Internet. Within the IETF, several efforts were undertaken to both study and improve the use of the 32-bit Internet Protocol (IPv4) addresses, as well as to identify and replace protocols and services that would limit growth. The 32-bit addressing architecture in the network layer was quickly determined to be the crux of the problem, with both hardware and human limits approaching fundamental boundaries. IPv4 addresses are unevenly allocated in blocks that are often too large or too small; they are also difficult to change within any existing network.

When the IETF called for replacement proposals, Digital participated in this industry-wide effort by submitting white papers outlining issues and by developing and evaluating prototypes of the various proposals. Digital also participated in the IETF working groups and in the IPng directorate, which had the responsibility for making the ultimate decision. In July 1994, the IPng directorate selected the Internet Protocol version 6 (IPv6) as the replacement network layer protocol, and IETF working groups began to build specifications. "The Recommendation for the IP Next Generation Protocol" summarizes the candidates and explains the selection of this protocol.

Digital UNIX Prototype

The current Digital UNIX IPv6 prototype project is Digital's most recent addition to an ongoing effort to develop and evaluate the competing IPng proposals. This began with the Simple Internet Protocol (SIP), which used eight-octet addresses. SIP was later melded with another early proposal and became known as Simple Internet Protocol Plus (SIPP), the direct antecedent of IPv6. The primary goal of Digital's efforts has been to evaluate the technical feasibility of the proposed architecture and provide feedback to the IETF working groups.
This is critical to the standards development process in the IETF, which requires multiple independent and interoperable implementations of a specification before it may become an Internet standard. An additional goal has been to evaluate system design alternatives to gain the experience needed to allow Digital to incorporate this new architecture into existing products. Digital has made the prototype available to researchers within the company as a source code distribution and more recently has begun to supply binary kits for early adopters and evaluators in the Internet community. As the IPv6 protocol and architecture mature, we have begun to focus on how to best integrate the code into the Digital UNIX product.

IPv6 Overview

To understand the system-wide impact of IPv6, we review some of its new features and contrast them with the IPv4 model. IPv6 is both a completely new network layer protocol and a major revision of the Internet architecture. At both levels, it builds upon and incorporates experiences gained with IPv4.

Figure 1 shows the evolution of the packet format into the new IPv6 header. It retains some fields (version, source, and destination address), clarifies the role of others (for example, the Time To Live [TTL] field is renamed the Hop Limit), and introduces new ones (such as Flow ID) with as yet untapped potential. The next header field allows for modular construction of complex packets: different header types can be chained together to provide specialized functionality, including security and source routing.
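The header fields named above can be made concrete with a small C sketch that reads them out of a raw header buffer. The field widths used here (4-bit version, 4-bit priority, 24-bit flow label, 16-bit payload length, 8-bit next header, 8-bit hop limit, two 128-bit addresses) are an assumption taken from the IPv6 specification of the period; the text itself only names the fields. The function names are hypothetical and are not identifiers from the Digital UNIX prototype.

```c
#include <stdint.h>

/* Length of the fixed IPv6 header in octets (assumed from the specification). */
enum { IP6_HDR_LEN = 40 };

/* Every accessor expects a buffer of at least IP6_HDR_LEN octets. */

/* Version: upper 4 bits of octet 0. */
unsigned ip6_version(const uint8_t *h) { return h[0] >> 4; }

/* Priority: lower 4 bits of octet 0. */
unsigned ip6_priority(const uint8_t *h) { return h[0] & 0x0fu; }

/* Flow label: octets 1..3, 24 bits in network byte order. */
uint32_t ip6_flow_label(const uint8_t *h)
{
    return ((uint32_t)h[1] << 16) | ((uint32_t)h[2] << 8) | (uint32_t)h[3];
}

/* Payload length: octets 4..5 in network byte order. */
unsigned ip6_payload_len(const uint8_t *h)
{
    return ((unsigned)h[4] << 8) | h[5];
}

/* Next header: type of the header that follows (octet 6). */
unsigned ip6_next_header(const uint8_t *h) { return h[6]; }

/* Hop limit: the renamed TTL field (octet 7). */
unsigned ip6_hop_limit(const uint8_t *h) { return h[7]; }

/* Source and destination addresses: 16 octets each. */
const uint8_t *ip6_src(const uint8_t *h) { return h + 8; }
const uint8_t *ip6_dst(const uint8_t *h) { return h + 24; }
```

Reading fields byte by byte, rather than overlaying a struct, keeps the sketch independent of host byte order and leaves the packet buffer untouched.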
Finally, all headers are structured to allow 64-bit alignment, which should allow optimal processing both at source and destination systems, as well as in transit.

The most striking departure from IPv4 is the address size: it has increased from 32 bits to 128 bits. The IPv6 addressing architecture is rich, with prefixes for multicast addresses and predefined scopes for both unicast and multicast addresses. One special type of unicast address is the link-local address, which permits communications with only those systems directly connected on the same link. This allows a standard bootstrapping mechanism, so that systems can learn about neighbors and services before a routable address is assigned to an interface. Various address assignment options have been defined, including hierarchical models based upon regional registries and service provider identifiers. In each case, care has been taken to ensure proper route aggregation, which will help yield more efficient backbone router performance.

Figure 1. IPv6 Header (fields: Version, Priority, Flow Label, Payload Length, Next Header, Hop Limit, Source Address, Destination Address)

Multiple means of acquiring addresses have been defined for IPv6 addressing, with the goals of allowing flexibility through different administrative policies


and, perhaps more important, of demanding that network address reassignment be supported throughout the architecture. The two new addressing services are Stateless Address Autoconfiguration and the stateful, transaction-based Dynamic Host Configuration Protocol version 6 (DHCPv6). In the stateless model, address prefixes are learned by listening for router advertisement packets. Addresses are formed by combining the prefix with a link-specific token such as the 48-bit Ethernet hardware address. In the stateful procedure, hosts may request addresses, configuration information, and services from dedicated configuration servers, with routers potentially serving as relay stations during the initial phase. In both cases, the resulting addresses have associated lifetimes, and systems must be prepared to both learn addresses and release expired addresses. Combined with the ability to register updated address information with DNS servers, these mechanisms provide a path toward network renumbering, a goal that has proved difficult to achieve in the IPv4 world.

Finally, the Internet Control Message Protocol version 6 (ICMPv6) was developed. This specification aimed to merge the functions of two distinct IPv4 protocols for reporting errors and status, ICMP for unicast packet transmission and the Internet Group Management Protocol (IGMP) for multicast traffic. The messages defined in this protocol are categorized as either error or informational, with a family of messages in the second group used to provide the Neighbor Discovery Protocol. Neighbor discovery serves multiple purposes with the overall theme of providing a system with topological and environmental hints.
For example, link-layer address resolution, router discovery, destination address redirection, and address autoconfiguration mechanisms are all specified using neighbor discovery packet types.

Although the network layer did experience the largest amount of change, Figure 2 shows that the effects of this work touch nearly all aspects of the Digital UNIX system. We point out examples of decisions made due to our fundamental design philosophy, which is based upon integration with the UNIX system framework, modular and extensible software, support for multiple operational policies, and a desire to take advantage of the Alpha platform without compromising portability. In the following sections, we study these topics in depth, beginning with the network layer, then covering the transport layer modifications and the new neighbor discovery algorithms. After that, we discuss address autoconfiguration mechanisms and their effects upon the system. We conclude with services that will be affected by the transition from IPv4 to IPv6 such as the socket application programming interface (API) and DNS.

Figure 2. Base Platform Changes

Network Layer

In this section, we review the processing requirements of the IPv6 modules, including [...]-style netmasks, from Classless Inter-Domain Routing), which are appropriate to both IPv4 and IPv6. We have also tried to take maximum advantage of the 64-bit Alpha architecture when implementing IPv6, while making certain that this implementation would run on 32-bit CPUs as well.
For example, the checksum routines operate on 32-bit quantities (allowing the carry to overflow into the upper 32 bits of a 64-bit register). The checksum routine is also designed to allow it to be issued to multiple Alpha execution units, which remains a topic for further investigation.

Adaptations to Existing IP and ICMP Routines

The IPv6 and ICMPv6 routines are completely independent of the corresponding IPv4 and ICMPv4 routines, and the processing styles have distinct differences. In IPv6, the incoming packet is treated as being read-only, while the BSD IPv4 code manipulates fields within the IPv4 header. We also avoid unnecessary use of the m_pullup routine (which consolidates chained memory buffers into a single large buffer) because this could cause the packet to be needlessly lost. Finally, instead of passing numerous arguments when calling from function to function, a common data structure is


used to store necessary data and pointers; for most function calls, it is only necessary to pass a pointer to this structure. This reduces the stack overhead and also yields modular and easily extensible subroutines.

IPv6 has a dedicated interrupt processing thread, and received IPv6 packets are placed onto their own interface input queue (ifqueue). When an IPv6 [...] packets are checked to see that the destination matches one of the system's addresses. In the special case of the packet being targeted to a link-local address, only the link-local address for the receiving interface is compared. If there is an exact match, the packet is processed normally; otherwise, it is passed to the unicast packet forwarding routine.

Header Processing

After a pack[...]


message contains the MTU of the constricting link. The source node adjusts its packet size to fit through this link.

Path MTU information is kept on a per-destination basis and is stored in the routing table entry for a given destination. Packets sent on that route will be sized according to the path MTU value. When a PTB message is received, the appropriate route is updated to contain the new path MTU value as reported in the PTB message, and a timer is started. When the timer expires, the path MTU value is increased to the (known) MTU of the first hop link. This allows the node to detect increases in the path MTU.

Switches are provided to disable path MTU discovery system-wide, on a per-destination basis, and on a per-socket basis. When path MTU discovery is disabled, packets are limited to 576 bytes.

Fragmentation

A packet that is larger than the MTU of the path on which it is to be sent must be fragmented. Unlike IPv4, the IPv6 header contains no fields to carry fragmentation information. Instead, this information is carried in a specialized extension header, called the fragment header. As shown in Figure 3, the fields in the fragment header include an offset, in eight-octet units, and an identifier common to all fragments of the original packet. The M (more fragments) flag is used to indicate intermediate fragments; the terminal fragment has the bit cleared. Note that the amount of data in a fragment packet is derived from the total packet length.

Figure 3. Fragment Header (fields: Next Header, Reserved, Fragment Offset, Reserved, M, Identification)

The first step in the fragmentation process is to identify the fragmentable and unfragmentable parts of the original packet (see Figure 4). The unfragmentable part of the packet consists of the IPv6 header and any extension headers that must be processed by each node traversed by the packet (e.g., hop-by-hop header, routing header). The fragment header is appended to the unfragmentable part.
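As a rough illustration of the fragment header fields just described, the following C sketch packs and unpacks the 16-bit field that carries the offset and the M flag, and estimates how much data each fragment can carry. It assumes the layout in the IPv6 specification (a 13-bit offset in eight-octet units, two reserved bits, and a 1-bit M flag, in an 8-octet header); all names are hypothetical and nothing here is taken from the prototype's source.

```c
#include <stdint.h>

/* Size of the fragment header in octets (assumed from the specification). */
enum { FRAG_HDR_LEN = 8 };

/* Build the 16-bit "offset | reserved | M" field, in host byte order.
 * byte_offset is expected to be a multiple of 8; the offset is stored
 * in eight-octet units in the upper 13 bits. */
uint16_t frag_field_pack(unsigned byte_offset, int more)
{
    return (uint16_t)(((byte_offset / 8u) << 3) | (more ? 1u : 0u));
}

/* Recover the byte offset of this fragment within the fragmentable part. */
unsigned frag_field_offset(uint16_t field)
{
    return (unsigned)(field >> 3) * 8u;
}

/* Nonzero for intermediate fragments, zero for the terminal fragment. */
int frag_field_more(uint16_t field)
{
    return (int)(field & 1u);
}

/* Data octets carried by each non-final fragment: what remains of the MTU
 * after the unfragmentable part and the fragment header, rounded down to a
 * multiple of 8 so the next offset is expressible. Returns 0 if nothing fits. */
unsigned frag_chunk_size(unsigned mtu, unsigned unfrag_len)
{
    if (mtu <= unfrag_len + FRAG_HDR_LEN)
        return 0;
    return (mtu - unfrag_len - FRAG_HDR_LEN) & ~7u;
}

/* Number of fragment packets needed for frag_len octets of fragmentable data. */
unsigned frag_count(unsigned frag_len, unsigned chunk)
{
    if (chunk == 0)
        return 0;
    return (frag_len + chunk - 1) / chunk;
}
```

For instance, with a 576-byte limit and a 40-byte unfragmentable part, each fragment carries 528 octets of data, so 1500 octets of fragmentable data need three fragment packets.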
The rest of the packet is divided into fragments, and each fragment is appended to a copy of the unfragmentable part plus fragment header.

When the fragment header is appended to the unfragmentable part, two fields in the unfragmentable part must be updated. First, the payload length field in the IPv6 header must be updated to reflect the length of the fragment packet. Second, the next header field in the last header of the unfragmentable part must be changed to indicate that a fragment header follows.

A copy of the unfragmentable part is created for each fragment packet. As an optimization, Digital UNIX allows portions of a packet to be shared among copies of the packet, to avoid an actual data copy. As with IPv4, care must be taken to ensure that fields being updated are not contained in shared buffers. This is typically accomplished by copying the portions that must be updated into a private memory buffer (mbuf). Unlike IPv4, the unfragmentable part may not fit in a single mbuf, and the IPv6 fragmentation code must be capable of handling this case.

To reduce the possibility of fragment loss at the source node, all the fragment packets are built before any is passed to the data link for transmission.

Figure 4. Fragmentation (the original packet consists of an unfragmentable part and a fragmentable part; each fragment packet consists of the unfragmentable part, a fragment header, and one fragment)

A question that arises here is how big should the fragment packets be? Should they be sized according to the path MTU, or should they be limited to 576 bytes? The former yields the desirable larger


packets, while the latter avoids undesirable fragment loss (due to the fragment packet being too big). The Digital UNIX IPv6 prototype supports either choice on a system-wide, per-destination, or per-socket basis. This is an example of separation of mechanism from policy, a basic guideline being used across this project.

Reassembly

The reassembly process reconstructs the original packet from fragment packets. Fragments belonging to the same packet are identified by a combination of source IP address, next header type (first header of the fragmentable part), and fragment identifier. Individual fragments are queued within the network layer until the original packet can be completely reassembled, at which point it is passed to the appropriate protocol module.

When all fragments have arrived, the original packet can be reassembled. A single copy of the unfragmentable part is kept, and the data from each fragment packet is appended. The payload length field of the IPv6 header is updated to reflect the length of the reassembled pack[...]


Figure 5. Independent Transport Implementation

Figure 6. Integrated Transport Implementation

The ability to maintain, let alone extend, the code base would also suffer. Fortunately, due to the fact that IPv4 addresses are a well-defined subset of the entire IPv6 address space, it is relatively straightforward to implement the transports so that a single set of modules can be used over both network layers. To accomplish this, we increased the storage space allocated for addresses and separated those functions that are dependent upon a particular network layer. We discuss each of these issues in this section.

Storing Large Addresses

Two specific data structures must be modified to accommodate addresses larger than the 32-bit IPv4 type. The first of these is the sockaddr struct, which is used when dealing with the BSD socket layer and passed along to user applications. The second is the Internet Protocol Control Block (PCB) data structure, the in_pcb. In this section, we review the modifications to each structure.

A program that uses a transport does so by means of the BSD sockets interface and passes addressing information in a sockaddr structure. For IPv6, this is a sockaddr_in6. Internally, the structure is defined so that 64-bit alignment is preserved; however, it has the following public definition:

    struct sockaddr_in6 {
        u_char          sin6_len;
        u_char          sin6_family;
        u_short         sin6_port;
        u_int           sin6_flowlabel;
        struct in6_addr sin6_addr;
    };

Although the concept of a sockaddr is generic in the BSD architecture, the flow label and in6_addr members of this structure are unique to IPv6 and would be used only in the AF_INET6 address family. The details of this are specified in Reference 21.

The in_pcb data structure is created for each socket using TCP, UDP, or other clients of the network layer. In addition to storing the source and destination addresses, various other pieces of information required for proper communication are stored here, including the port numbers, options and flags, a pointer to the socket receiving the data, a header template, and a pointer to the routing entry for the given destination. For IPv6, this basic model has been retained, and additional information is stored. This information includes local and remote flow labels and indicators of which address family the application is using and which network layer the transport communication is using. Finally, a partial checksum of the transport pseudoheader is stored here as well; its use is described in the following section.

In addition to the explicit storage of the network layer and address family, the fundamental technique that facilitates the use of a common transport is the storage of IPv4 addresses in an IPv6 format. This is known as an IPv4-mapped address and is described in "IP Version 6 Addressing Architecture." This address format is explicitly reserved to store addresses of systems that are capable of using only the IPv4 protocol, and thus is an appropriate form of storage in the PCB for communications that will be sent using the IPv4 protocol, as opposed to IPv4-compatible addresses, which are sent using IPv6 packets. These mapped addresses are of the following form: [...]

These addresses are manipulated within the IPv4 TCP and UDP protocols by means of macros that allow the IPv4 addresses to be inserted, extracted, or compared while in an IPv6 address structure (in6_addr). As an example, the code fragment in Figure 7 shows an address being extracted for use in evaluating a configurable IPv4 socket option.

Special Transport/Network Layer Interactions

Within the integrated transport layers, the transport protocol is treated independently of the particular network layer being used, and network-layer-specific functions are used to interface to either IPv4 or IPv6. There are two particular instances in which the transport layer has interactions with the IPv6 network layer over and above the exchange of data packets for input or output. These are the notification and update of path MTU, which is required in IPv6, and the potential to refresh the neighbor discovery cache based on forward progress; i.e., if the transport knows that data is reaching its destination, it can validate the


current network layer path. We investigate each of these issues in turn.

    /*
     * Test address for IPv4 characteristic
     */
    if (inp->inp_netlayer == AF_INET) {
        struct in_addr tmp;
        [...]

Figure 7. Code Fragment of an IPv4-mapped Address

Path MTU discovery, as previously described, is triggered by ICMP messages processed in the network layer, with learned information stored in the routing table. In the course of processing a PTB message, the transport layer is notified through its control input (ctlinput) path. This is required because the reception of such an ICMP message indicates that the packet in transit has been discarded; thus the protocol may need to take appropriate action. In the case of TCP, it is necessary to recompute the maximum segment size and retransmit the affected data. Although this is not required for UDP, which is a pure datagram service, this knowledge can be made available to the corresponding socket owner.

The other interaction between an upper layer and the IP layer occurs when the upper layer, specifically the TCP transport, wishes to indicate that communications with a destination host have made forward progress, for the purpose of resetting the timer in the neighbor discovery cache. This positive feedback mechanism is described in the neighbor unreachability detection portion of the "Neighbor Discovery for IP Version 6" specification and prevents unnecessary probing of the current path. When ack[...]


for the IPv6 pseudoheader, which consists of the source and destination addresses, the packet payload length, and the next header value. This partial checksum, with the exception of the payload length (which varies per packet), is then stored in the PCB, to be passed along with the pointer to user data within the memory buffer to the checksum function. The initial checksum calculations are done using 32-bit values in 64-bit registers, and later are collapsed to the final 16-bit sum. This is coded as one large C statement, adding the various pseudoheader components in piecemeal fashion. This allows the compiler to schedule the instructions for optimal performance. The final packet check[...]


Figure 8. Neighbor Discovery Packets (message formats: Router Solicitation, Router Advertisement, Neighbor Solicitation, Neighbor Advertisement, Redirect; option formats: Source/Target Link-Layer Address, Prefix Information, Redirected Header, MTU)


value is not reachable, the mapping will be deleted.

The address resolution process has several implications for the implementation. Outbound packets must be queued pending link-layer address resolution, and link-layer addresses must be stored somewhere. The "Neighbor Discovery for IP Version 6" specification describes a conceptual neighbor cache to hold this information. The Digital UNIX IPv6 prototype uses several data structures to implement the neighbor cache. An nd6_llinfo structure keeps track of each entry in the neighbor cache. This structure contains the queue header for packets awaiting link-layer address resolution. The link-layer address is stored in the routing table, in a host route entry for the destination IPv6 address. The RTF_LLINFO flag in the route entry indicates the presence of link-layer information. Each nd6_llinfo structure contains a pointer to the corresponding routing table entry, and the routing table entry points back to the nd6_llinfo structure.

The use of routing-table entries to hold the link-layer-address information is an optimization. A routing table entry is associated with the majority of packets transmitted for reasons other than address resolution. Storing the link-layer address in the routing table entry avoids the overhead of a separate link-layer-address table. This approach is modeled after the BSD 4.4 system's ARP implementation.

Neighbor Unreachability Detection

Neighbor unreachability detection (NUD) has its roots in the dead gateway detection in IPv4 but has been generalized in IPv6 to include all neighboring nodes (not just gateways). Unlike IPv4, the mechanisms supporting NUD are an integral part of IPv6. IPv6 nodes monitor the reachability of neighboring nodes to which packets are being sent. An IPv6 node relies on reachability confirmations to determine the reachability state of a neighbor.
In the absence of any reachability indications, an IPv6 node will periodically use an NS to actively probe the reachability of a neighbor. An NA sent in response to an NS provides reachability confirmation. The S (solicited) flag in the NA is provided specifically for this purpose. If neither method succeeds within a given period of time, a neighbor is considered unreachable. Figure 9 shows the neighbor unreachability states.

A reachability confirmation may take several different forms. Any packet received from a neighbor can be viewed as a reachability confirmation, provided that the packet could only have been sent by the neighbor in response to a packet sent from the local node. A TCP acknowledgment is one example: receipt of a TCP ACK indicates that a packet sent to the neighbor did in fact reach it. Another example is an ICMPv6 redirect message. Receipt of a redirect message indicates that the neighboring router received a packet from the local node.

In the Digital UNIX IPv6 prototype, the nd6_llinfo structure holds NUD state and retransmit count information. A field in the routing table entry is used for NUD timers. The RTF_LLVALID flag in the route entry is used to indicate that the neighbor is reachable. A new routing message type (RTM_CONFIRM) was defined to pass reachability confirmations to the neighbor cache.
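The state transitions described here can be sketched as a small C transition function. This is a simplified reading of the states named in Figure 9 and of the Neighbor Discovery specification, not the prototype's code: the enum and function names are hypothetical, timers and retransmit counters are reduced to events, and an unsolicited link-layer address is assumed to matter only while no usable entry exists.

```c
/* Neighbor cache entry states, following the names in Figure 9. */
enum nud_state {
    NUD_NONE, NUD_INCOMPLETE, NUD_REACHABLE, NUD_STALE, NUD_DELAY, NUD_PROBE
};

/* Events that drive the state machine (simplified, hypothetical names). */
enum nud_event {
    EV_QUEUE_PACKET,        /* packet queued for an unresolved neighbor        */
    EV_UNSOLICITED_LLADDR,  /* link-layer address learned without asking       */
    EV_CONFIRMATION,        /* solicited NA or upper-layer reachability hint   */
    EV_REACHABLE_TIMEOUT,   /* reachable time exceeded without confirmation    */
    EV_PACKET_SENT,         /* traffic sent to a neighbor whose entry is stale */
    EV_DELAY_TIMEOUT,       /* delay-first-probe time exceeded                 */
    EV_RETRIES_EXCEEDED     /* maximum multicast or unicast solicits exceeded  */
};

/* Return the state that follows state s when event e occurs.
 * Combinations not listed leave the state unchanged. */
enum nud_state nud_next(enum nud_state s, enum nud_event e)
{
    switch (e) {
    case EV_QUEUE_PACKET:
        /* Start address resolution: a multicast NS is sent. */
        return s == NUD_NONE ? NUD_INCOMPLETE : s;
    case EV_UNSOLICITED_LLADDR:
        /* Address is known but reachability is unverified. */
        return (s == NUD_NONE || s == NUD_INCOMPLETE) ? NUD_STALE : s;
    case EV_CONFIRMATION:
        /* Any confirmation on an existing entry marks it reachable. */
        return s == NUD_NONE ? s : NUD_REACHABLE;
    case EV_REACHABLE_TIMEOUT:
        return s == NUD_REACHABLE ? NUD_STALE : s;
    case EV_PACKET_SENT:
        /* Give upper layers a chance to confirm before probing. */
        return s == NUD_STALE ? NUD_DELAY : s;
    case EV_DELAY_TIMEOUT:
        /* No confirmation arrived: begin unicast NS probes. */
        return s == NUD_DELAY ? NUD_PROBE : s;
    case EV_RETRIES_EXCEEDED:
        /* Resolution or probing failed: the entry is discarded. */
        return (s == NUD_INCOMPLETE || s == NUD_PROBE) ? NUD_NONE : s;
    }
    return s;
}
```

In this sketch, EV_CONFIRMATION stands for the kind of input that the RTM_CONFIRM routing message delivers to the neighbor cache.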
This mechanism is used by TCP upon receipt of new acknowledgments.

Figure 9. Neighbor Unreachability States (NONE, INCOMPLETE, REACHABLE, STALE, DELAY, PROBE)

Autoconfiguration

One of the goals of IPv6 is to work properly in a dynamic network environment without the need for manual intervention on each system attached to the


7 .ncn\.orlt. The solution is to allo\\* important picccs oti~iformntion to be Icarncd and the s!-stcln to autocolifigureitself using this data. IPv6 autoconfigurationc~~compasscs the ti)llo\vinb r ~tcnis: 'Router discoveryOn-link pcti s disco\.cr!*Intcrhcc ntrributc configurationStateless ;iddress config~~rationSt~tcf~~l address config~~r.itionTllc mccl.lnnisn~ for dcl.i\~cring this infol-rn'ltioll tothe Ilosts is rlic router advertise~nie~nt (TW) pacltct ofthe Nciglibor l)isco\~cry Protocol. In tlic follo\\,ingsections, \\,c describe the methods \\~c dc\,clopcd toprocc~stlncsc pnckcts and i~pdate the systc~ii.Host Autoconfiguration DaemonTo p~)ccss tlicsc RAs, \\.c designed n host d.icmoncalled ~ldbhostd, \\.hich resides in the application spaceof the lligitnl UNIS operating systc~l~. Wc dctcrmincdt11at a LISCI--I~~O~C daemo~~ \vas the most efficientto implement I Pv6 autoconfig~~ration fc)r the fi)llo\\.-ing reasons:A user-mode dacmon \vould avoid ler~icl bloat.M.lintcnn~lcc and cstensibilin~ \\rould be c,lsier.The hl~iction is not performance critical.Thc autoconfiguratio~i processing is implcmcntcd.IS a single esccutahlc image, as a coliesi\.e scr oftightlycoupled modules. 'l'hc dnerlion currently is dcsigncclas a single-threaded application that i~scs a dispatchmechanism to call cacli specialized fiunction niodulc inturn. W \\.ill csnminc the idca of hn\,ing this dacmonrun as a mulritlircadcd application in the hlturc.'l'lic ndbhosrd dacnnon comnnunicatcs \\.it11 thenen\.ork suhs\~ste~ni in tlnc Iwrnel through multipletcchniqucs. Fiprc 10 sI1on.s the autoco~ifigi~~-nrion~x-occssiing ~inodulcs. Tile Inn. socket interfncc is uscci torccci\.c IMs, and J/O control messages (ioctls) arc usedto mnnipulnte Iterncl data structures. Also, tlic routiugtnblc is updated ns neccssar!; b!, mews of .J. ran. 
socketintcrfiicc to the I'F-ROUTE protocol hmily.W dcsig~led the J1%6 ran- socket's intcrfilcc \\.it11the abilih to pass only specific ICMP\,6 messages ton user and to filter extraneous packets or protocols.1 lie nd6liostd daemon sets n socket option to rccci\pco~~l!. neighbor discovery Rgs. It then crccutcs ;I ciispatchrolltine that polls the ra\v socltct, a\\,aitingpnckcts. Wen data is available on the socltct, the dncmondctcrmincs the characteristics of the message,crc~tes ;I data structure to contain it, 2nd calls the ncccs~nr!~f~nctions to perform autoconfigul-ation. ?'hcdisp;~tcli ~iiotiulc, in addition to polling socltct dcscriptors,S L I ~ P O J'lcccssary ~ ~ S timer Inanngcmcllt filnctionssucl~ as creation, deletion, rind expiration. Figurc 11slio\\,s tlic npplici~tion daemon design center.Kernel Interface Data StructuresIn many \\ays, the data link interface is tlic fi>cus ofI1+6 nutocontiguration support. The kernel clat~ structuresk)r Ih4 interfaces arc not sufficient to jrnplcmc~~tthe ncccssar! IP1.6 tiinctions. We designed and implcmcntcdnc\\. intesfiice data structures that encapsulatedthe existing I h.4 structures. This allo\\-cd us to avoid ;Ircconnpilarion of the existing data link dri\.crs on the13igitnl UNIS operating system. In the filturc, 1j.c \\.illattempt LI dcsig-11 ill \\rhich thc intcrEicc StrLIctiIrcsI1'\,4 and I l'v6 arc completely integrated.As slio\\~n in Figire 12, \\Ie designed an in6-itiictsrrucrurc to support each data linli typc (c.g.,t;,thcrnct, I'PI', loopback) 2nd used the csistiligifiict structures to point to thosc link in[-erf,lccs. Tlicin6-ifiict lias its oun in6-ifaddr structure k)l. cncliII'vO address configured in the dntn structurein6-IocalaJJr. 
We also defined the in6_router structure to support each router available for the implementation. The in6_router structure specifies the interface of the router, neighbor cache route, and the IPv6 address of the router.

Figure 10
Autoconfiguration Processing Modules
(Diagram. A router advertisement feeds five modules: router discovery, on-link prefix discovery, interface attribute processing, stateless address configuration, and stateful address configuration (DHCPv6). Inputs shown from the router advertisement include the IPv6 source address and default router lifetime; prefix options with on-link prefix, prefix length, and valid lifetime; the receiving interface, hop limit, reachable time, retransmit time, and link MTU; prefix options with address prefix, prefix length, valid lifetime, and preferred lifetime; and the managed bit and other-configuration bit.)
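
The relationships among these structures can be sketched with a few record types. This is an illustration only: the class names mirror in6_ifnet, in6_ifaddr, in6_localaddr, and in6_router, but every field name, the `configure` helper, and the example interface name are assumptions; the article does not list the actual members of the kernel structures.

```python
from dataclasses import dataclass, field
from ipaddress import IPv6Address
from typing import List


@dataclass
class In6LocalAddr:
    """Stands in for in6_localaddr: one configured address and its state."""
    address: IPv6Address
    state: str = "preferred"   # hypothetical state label


@dataclass
class In6IfAddr:
    """Stands in for in6_ifaddr: one per IPv6 address on an interface."""
    local: In6LocalAddr


@dataclass
class In6IfNet:
    """Stands in for in6_ifnet: wraps an existing ifnet for one data link."""
    ifnet_name: str            # hypothetical handle to the existing ifnet
    link_type: str             # e.g. "Ethernet", "PPP", "loopback"
    ifaddrs: List[In6IfAddr] = field(default_factory=list)

    def configure(self, text: str) -> In6IfAddr:
        """Attach a new address record to this interface."""
        entry = In6IfAddr(In6LocalAddr(IPv6Address(text)))
        self.ifaddrs.append(entry)
        return entry


@dataclass
class In6Router:
    """Stands in for in6_router: interface, neighbor cache route, address."""
    interface: In6IfNet
    neighbor_route: str        # hypothetical key into the neighbor cache
    address: IPv6Address
```

For example, `In6IfNet("tu0", "Ethernet").configure("fe80::1")` yields one address record hanging off the interface, and an `In6Router` simply points back at the interface on which the router was heard.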


Figure 11
Application Daemon Design Center
(Diagram. In user space, the daemon opens a raw socket to listen for RAs, calls autoconfiguration processing and timers, and keeps autoconfiguration data lists. It reaches the kernel structures, the routing table and interface structures, through the socket and ioctls.)

Figure 12
Autoconfiguration Interface Structures and Relationships
(Diagram. Data link interfaces: Ethernet, PPP, FDDI, ATM, tunnel. in6_ifaddr: address configuration parameters and pointers to IPv4 addresses. in6_localaddr: addresses and address states.)

Interface Attribute Autoconfiguration

To autoconfigure the interfaces for IPv6, we created new ioctl functions to create, delete, update, and access the interfaces. In addition to their use by the nd6hostd daemon, these ioctls may be used by any future modules that need to access or manipulate the interfaces. This might include specialized configuration utilities, Simple Network Management Protocol (SNMP) management functions, security tools, or other services.

The interface module to update and maintain interface structures for nd6hostd serves two purposes: to update the link attributes received in an RA, and to maintain the data structures as a set of linked lists for router discovery, on-link prefixes, and address configuration. Figure 13 shows the interface attribute updates.

Figure 13
Interface Attribute Updates
(Diagram. Autoconfiguration processing in user space issues ioctls to the IPv6 interface control module in kernel space, which holds addresses and state, on-link prefixes, route attributes, and data link attributes. ioctls shown: SIOCIPV6ADDRTR, SIOCIPV6DELRTR, SIOCIPV6IFINIT, SIOCIPV6AIFADDR, SIOCIPV6SIFATTR, SIOCIPV6GIFATTR, SIOCIPV6MIFADDR, SIOCIPV6GIFADDR, SIOCIPV6DIFADDR.)

Router Discovery

An RA packet has mandatory and optional parts. Before a default router is added to the routing table, the following interface attributes must be determined:

1. Receiving interface
2. Current hop limit
3. Reachable and retransmit times for use in NUD

The link-local address from the source link-layer option of the RA is then added to the routing table, and the kernel data structures for router information are updated. The router lifetime field in the RA defines how long this router may be used as a default router. The nd6hostd daemon first updates the interface attributes. A timer is set using the appropriate routine from the dispatch module. When the timer expires, the delete default router routine is called, and the router is deleted from the routing table. The daemon must also be able to delete the router if it receives an RA with a zero lifetime value, which can occur when a node is acting as a router but is reset to be a host.

On-link Prefixes

An on-link prefix in IPv6 defines a subnet and is typically configured on a router for a specific link by the network administrator. The router then advertises this prefix to all nodes connected to that link as a prefix option, appended to an RA. A prefix option defines a single prefix only, but an RA may contain more than one such option. As shown in Figure 8, the prefix option provides the following information:

- Prefix length
- Link- or L-bit, which is set if the prefix is directly reachable on link (i.e., a neighbor)
- Autonomous- or A-bit, which is set if the prefix can be used for stateless address configuration
- The length of time the prefix is valid

The daemon adds the prefix to the routing table. Then a timer routine is called from the dispatch module and is set for the time the prefix is valid. When the dispatch routine calls the delete on-link prefix module, the prefix is deleted from the routing table.
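
The lifetime handling for default routers and on-link prefixes follows one pattern: an advertisement installs or refreshes an entry with an expiry time, a timer later removes it, and a lifetime of zero removes it at once. A minimal sketch of that pattern, assuming a hypothetical `LifetimeTable` class and caller-supplied clock values rather than the daemon's real dispatch-module timers:

```python
class LifetimeTable:
    """Entries (routers or prefixes) that expire after an advertised lifetime."""

    def __init__(self):
        self._expiry = {}  # key -> absolute expiry time in seconds

    def advertise(self, key, lifetime, now):
        """Apply one advertisement received at time `now`."""
        if lifetime == 0:
            # A zero lifetime withdraws the entry and cancels its timer.
            self._expiry.pop(key, None)
        else:
            # A nonzero lifetime installs the entry or restarts its timer.
            self._expiry[key] = now + lifetime

    def expire(self, now):
        """Delete and return every entry whose timer has run out."""
        dead = sorted(k for k, t in self._expiry.items() if t <= now)
        for key in dead:
            del self._expiry[key]
        return dead

    def active(self):
        """Return the keys currently installed, in sorted order."""
        return sorted(self._expiry)
```

One table could hold default routers keyed by link-local address and another on-link prefixes keyed by prefix; the daemon's dispatch loop would then call `expire` each time it wakes.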
A prefix can also be deleted when a new RA presents the prefix with a lifetime of zero. In that case, the on-link prefix module will stop the timer routine and delete the prefix from the routing table.

Address Configuration

Address configuration is one of the new paradigms that must be supported in IPv6. Two configuration methods, stateless and stateful, are provided to autoconfigure addresses for a host. The M-bit flag in an RA message tells a host which method to use. In addition, the other-bit (O-bit) flag is provided to configure other network parameters required for the host's operation on the network when the stateful configuration is used.

Address autoconfiguration in IPv6 supports the ability to dynamically renumber a link or a complete network through the use of lifetimes specified in the RA message. The valid lifetime is the time the address has before expiration. When the timer expires, all connections using that address are dropped by the implementation, and no new connections are permitted. The preferred lifetime is provided to inform an implementation that an address is about to expire; it typically is set to a lower value than the valid lifetime. When this timer expires, the address is said to enter the deprecated state, at which point an implementation is permitted (as a configuration option) to prevent new communications using this address as a source or destination. This model is designed to provide network administrators with control over the use of network addresses without manual intervention on each host on the network. The stateless model is intended for users who do not need tight control over address configuration; stateful mechanisms will be used where [...]


Figure 14
Address Autoconfiguration
(Diagram. In user space, the autoconfiguration daemon either performs stateless autoconfiguration, using on-link prefixes and the interface token, or processes configuration information and starts the DHCPv6 client, which communicates with a DHCPv6 server. In kernel space, address configuration state is administered for the receiving interface.)

[...] deleted, or updated on the interface based on the [...]

[...] to their corresponding names. The type A resource record is used to hold an IPv4 address. Since its size is fixed at 4 bytes, a new resource record type, AAAA, was defined to hold IPv6 addresses. The Digital UNIX [...]

Application Services

Most [...]


[...] passing the open sockets to them. With the advent of the AF_INET6 socket type, inetd was modified to accept a new application configuration option in its configuration file. The keyword inet6 is used to indicate an application that wants to use AF_INET6 sockets. The keyword inet (or the absence of a keyword) indicates use of AF_INET sockets.

Applications

A typical application needs only minor modification to use the AF_INET6 address family. Applications that use addresses as part of their design or protocol, such as the File Transfer Protocol (FTP), require more extensive modification. The Digital UNIX IPv6 prototype includes several basic applications that have been modified to support IPv6, including Telnet and FTP. These programs were modified to use IPv6 sockets, address structures, and library routines. Note that the IPv6 sockets also support communications over IPv4, so that applications need not maintain separate sockets for IPv4 and IPv6, and a single executable image can interoperate with both types of remote system.

Future Work

Future implementation efforts will include security, routing, stateful address configuration, dynamic updates to DNS, IPv6 over PPP and ATM, resource reservation, and service location. In addition, we will review elements of our existing design and implementation architecture to increase performance and to ease the transition from IPv4 to IPv6. We will continue to participate in the IPv6 industry multivendor interoperability events, which are a practical and concentrated effort to debug the specifications and the code base.

IPv6 security supports both the authentication and the encryption of IPv6 packets end-to-end. The module for these functions will reside in the kernel and most likely will be called at the point where the IPv6 network layer packet is processed.
A key management framework is being developed to support both authentication and encryption. To access the key management interface, a sockets API extension will be provided to supply the keying criteria for the security modules.

To test the interoperability and robustness of the IPv6 implementations, a test network known as the 6BONE has been created on the Internet. This nascent test bed is currently being built with statically defined tunnels connecting IPv6 networks. Our next step in IPv6 development will be to implement routing protocols, starting with Routing Information Protocol version 6 [...]


2. S. Bradner and A. Mankin, "The Recommendation for the IP Next Generation Protocol," RFC 1752 (January 1995).
3. R. Hinden, "Simple Internet Protocol Plus White Paper," RFC 1710 (October 1994).
4. S. Deering and R. Hinden, "Internet Protocol, Version 6 (IPv6) Specification," RFC 1883 (January 1996).
5. Y. Rekhter and T. Li, "An Architecture for IPv6 Unicast Address Allocation," RFC 1887 (January 1996).
6. R. Hinden and J. Postel, "IPv6 Testing Address Allocation," RFC 1897 (January 1996).
7. S. Thomson and T. Narten, "IPv6 Stateless Address Autoconfiguration," RFC 1971 (August 1996).
8. J. Bound and C. Perkins, "Dynamic Host Configuration Protocol for IPv6 (DHCPv6)," Work in progress (August 1996).
9. A. Conta and S. Deering, "Internet Control Message Protocol (ICMPv6) for the Internet Protocol Version 6 (IPv6)," RFC 1885 (January 1996).
10. T. Narten, E. Nordmark, and W. Simpson, "Neighbor Discovery for IP Version 6 (IPv6)," RFC 1970 (August 1996).
11. M. McKusick et al., The Design and Implementation of the 4.4 BSD Operating System (Reading, Mass.: Addison-Wesley, ISBN: 0-201-54979-4, 1996).
12. V. Fuller et al., "Classless Inter-Domain Routing (CIDR): An Address Assignment and Aggregation Strategy," RFC 1519 (September 1993).
13. G. Wright and [...]
14. [...] Deering, "Path MTU Discovery," RFC 1191 (November 1990).
15. J. McCann et al., "Path MTU Discovery for IP Version 6," RFC 1981 (August 1996).
16. W. Simpson, "IP in IP Tunneling," RFC 1853 (October 1995).
17. A. Conta and S. Deering, "Generic Packet Tunneling in IPv6 Specification," Work in progress (October 1996).
18. [...]
23. D. Plummer, "An Ethernet Address Resolution Protocol," RFC 826 (November 1982).
24. D. Clark, "Fault Isolation and Recovery," RFC 816 (July 1982).
25. W. Stevens and M. Thomas, "Advanced Sockets API for IPv6," Work in progress (October 1996).
26. P. Mockapetris, "Domain Names: Concepts and Facilities," RFC 1034 (November 1987).
27. S. Thomson and C. Huitema, "DNS Extensions to Support IP Version 6," RFC 1886 (December 1995).
28. R. Atkinson, "Security Architecture for the Internet Protocol," Work in progress (June 1996).
29. "Resource Reservation Protocol (RSVP), Version 1 Functional Specification," Work in progress (August 1996).
30. J. Veizades et al., "Service Location Protocol," Work in progress (June 1996).

General References

S. Bradner and A. Mankin, eds., IPng: Internet Protocol Next Generation (Reading, Mass.: Addison-Wesley, ISBN: 0-201-63395-7, 1996).
S. Thomas, IPng and the TCP/IP Protocols (New York: John Wiley & Sons, Inc., ISBN: 0-471-13088-5, 1996).
R. Braden, "Requirements for Internet Hosts: Communication Layers," RFC 1122 (October 1989).


James P. Bound
Jim Bound is a consulting software engineer and the technical director for IPv6 within the IPv6 Program Office, and is responsible for the overall advanced development architecture and reference Alpha Digital UNIX code base, which verifies that the IPv6 specifications are implementable. He is also Digital's IETF IPv6 technical leader and one of the IPv6 advanced development engineers on Alpha Digital UNIX. In 1993, Jim began his participation in the IETF to work on the IPng and the advanced development IPng prototype. As a member of the IETF's IPng Directorate, Jim helped determine the requirements and [...]

Matt Thomas
Matt Thomas joined Digital in 1983 with Software Services in California. [...] principal software engineer in the UNIX Engineering [...] of the IPv6 project team.

[...] He contributed to the design and implementation of the Digital UNIX IPv6 prototype, including router discovery, autoconfiguration, fragmentation, reassembly, and path MTU [...] participates in several IETF working groups and is a coauthor of Internet RFC 1981, "Path MTU Discovery for IP Version 6." Jack joined Digital in 1983 to become a member of the Distributed Systems Technical Evaluation Group. He also worked in the DEC[...]VMS Engineering Group before taking his current position. He received a B.S. in computer science (magna cum laude) from the University of Lowell in 1983 and an M.S. in computer science from Boston University in 1995.


Preserving Computing's Past: Restoration and Simulation

Maxwell M. Burnet
Robert M. Supnik

Restoration and simulation are two techniques for preserving computing systems of historical interest. In computer restoration, historical systems are returned to working condition through repair of broken electrical and mechanical subsystems, if necessary substituting current parts for the original ones. In computer simulation, historical systems are re-created as software programs on current computer systems. In each case, the operating environment of the original system is presented to a modern user for inspection or analysis. This differs from computer conservation, which preserves historical systems in their current state, usually one of disrepair. The authors argue that an understanding of computing's past is vital to understanding its future, and thus that restoration, rather than just conservation, of historic systems is an important activity for computer technologists.

The Computing Past

The continuous improvements in computing technology cause the rapid obsolescence of computer systems, architectures, media, and devices. Since old computing systems are rarely perceived to have any value, the danger of losing portions of the computing record is significant. When a computing architecture becomes extinct, its software, data, and written and oral records often disappear with it.

Older computer systems embody major investments in software, the value of which may persist long after the systems have lost their technical relevancy. For example, the PDP-11 computer has not been a leading-edge architecture since the introduction of 32-bit systems in the late 1970s and has not received a new hardware implementation since 1984. Nonetheless, PDP-11 systems continue to be used worldwide, particularly in real-time and control applications.
The unavailability of suitable replacements for worn-out original parts is a serious issue for PDP-11 systems still in use.

Another area of potential loss is data. In recent years, archival storage media have undergone rapid technologic evolution, and the industry standards of computing's first 30 years, such as 0.5-inch magnetic tape, are now antiques. Salvaging data from original media is an industry-wide problem and has generated a small cottage industry of specialists in data recovery. This problem will only proliferate, as transitions in media types accelerate. Ten years from now, the large-diameter optical disks used for today's archives will look as quaint as [...]


[...] process of restoring a particular PDP-11 minicomputer. The second section discusses the simulation of old computers on modern systems. It describes a simulation framework called SIM, which has been used to implement simulators for the PDP-8, PDP-11, PDP-4/7/9/15, and Nova minicomputers.

Restoring Old Computers

Since the computer became a mass-produced item in the late 1960s, its typical life cycle has consisted of initial installation, rental or depreciation for about five years, retention and use for a few more years (just in case), and then retirement and a trip to the refuse dump. There is only a brief window of opportunity to collect old computers at the end of their working life. Once that window is closed, the computers are gone forever.

The Australian Museum Collection

In Sydney, Australia, this window of opportunity first became apparent in 1971, when the early PDP systems reached the ends of their life cycles. Digital's Australian subsidiary began collecting systems by a creative program of trade-ins for new equipment. It was especially urgent to obtain examples of the 12-bit, 18-bit, and 36-bit PDP series, as they were relatively few in number. Table 1 lists the percentage of available units that have been collected. The status of each is [...]

Static: can never be made to work for various reasons [...]


Table 2
The Digital Australian Collection (chronological order)

Item         Description                                        Status
138          A/D converter                                      Static
ASR-33       Teletype reader/punch, 110 baud                    Working
KSR-35       Heavy-duty Teletype                                Working
PDP-6        Modules of first Digital computer in Australia     Parts
PDP-5        First minicomputer in Australia                    Working
PDP-7        Third Digital computer in Australia                Static
PDP-8        Classic, table-top model                           Working
PDP-8        Cabinet model                                      Restorable
PDP-8        Typesetting system                                 Static
PDP-8        Cabinet model, first in New Zealand                Restorable
COPE-45      Remote batch (OEM PDP-8)                           Restorable
PDP-9        18-bit computer                                    Static
KA10         Console of PDP-10 mainframe                        Static
Linc-8       Early medical computer                             Working
PDP-8/S      Serial, under $10,000, CPU                         Static
PDP-8/S      Serial computer                                    Static
DF32         Digital's first disk, 1/16 Mb                      Static
PDP-9/L      Last transistor logic, 18-bit                      Static
PDP-8/I      Digital's first IC minicomputer                    Working
PDP-8/L      OEM version of PDP-8/I                             Static
PDP-12       Laboratory computer                                Working
PDP-12       Laboratory computer                                Static
PDP-15       Last of 18-bit family                              Static
KI10         Console of DECsystem-10                            Static
PDP-8/E      Pinnacle of PDP-8 development                      Working
PDP-8/E      Full LAB 8 configuration                           Working
PDP-11/20    The first PDP-11                                   Working
CR11         Card reader, 285 cpm                               Working
PDP-8/F      Small PDP-8/E                                      Working
VT05         Digital's first video terminal                     Working
LA30P        Digital's first hard-copy terminal                 Working
PDP-11/45    Last PDP-11                                        Static
GT40         Graphics workstation                               Broken
PDP-11/10    Small PDP-11                                       Static
PDP-11E10    First packaged system                              Working
PDP-11/35    Mid-range PDP-11                                   Static
PDP-8/A      Last non-chip PDP-8                                Working
PDP-11/40    Mid-range, end-user PDP-11                         Restorable
VT50         Video terminal                                     Working
LA36         DECwriter II printer                               Working
DS310        Desk-based commercial system                       Working
PDP-11/70    Largest PDP-11                                     Restorable
PDP-11/34    Mid-range PDP-11                                   Working
PRS01        Portable paper tape reader                         Working
LS120        DECwriter printer                                  Working
WS78         Word processor, 8-inch floppy disks                Working
LA120        DECwriter III printer, 180 cps                     Working
VAX-11/780   Original unit of 1 VAX-11/780                      Restorable


Table 2 (continued)

Item          Description                                       Status
VT100         Famous video terminal                             Working
MINC          LSI-11 lab unit with RT-11                        Working
VAX-11/750    Mid-range VAX system                              Restorable
PDT-150       Table-top LSI-11 with RX01 drives                 Working
GIGI          Low-cost terminal for schools                     Working
VT125         Video terminal with graphics                      Working
WS278         DECmate I word processor                          Restorable
VAX-11/730    Low-performance VAX system                        Working
LA12          Portable hard-copy terminal                       Static
LQP03         Letter-quality printer                            Working
DECmate II    Word processor on mobile stand                    Working
DECmate II    Word processor                                    Working
Rainbow       Personal computer                                 Working
PRO350        Professional PC                                   Working
VT241         Graphics color terminal                           Working
MicroVAX I    Smallest VAX, .3 VUP                              Working
VAX-11/725    Lowest cabinet VAX, .3 VUP                        Working
LN03          Laser printer                                     Working
MicroVAX II   Famous MicroVAX II                                Working
VAXmate       286-based PC with RX33 drive                      Working
DECmate III   Small word processor                              Working
MicroVAX III  3-VUP MicroVAX II system                          Working
VAX 8250      Dual VAX CPU, BI-based                            Restorable
VAX 9000      Chip set                                          Static
DS3100        Mips UNIX workstation                             Restorable

[...] ribbon cables. After 20 years, it turns into a sticky, gooey mess. It should be removed as soon as possible; otherwise, it falls into the modules and backplane. Replacing it with a modern equivalent can be done but is not essential.)

The first step in restoration is to collect hardware, software, and documentation.

- Collect the hardware, if possible two or ideally three items of each example. This provides a system to work on and a spare, as well as the ability to make comparisons between units.
- Collect diagnostic and operating software on original bootstrap media. Sources are very useful, particularly for diagnostics.
- Collect hardware manuals and schematics.

There is a network of enthusiasts around the world who can help at this stage.

Once the "ingredients" have been collected, the steps needed to restore a 1960s or 1970s vintage machine are as follows:

- Inspect the hardware for physical safety, particularly the heavy drawers and slide mechanisms.
- Physically assemble the hardware, checking module allocations, cabling, etc.
- Carefully inspect the power system; high-voltage sources can kill. Although most of the power wiring material appears to stand the test of time, the early machines often had rather thin coverings on terminals. Safety first is a principal criterion in restoration, since someday nontechnical people may open the back door.
- Assemble a minimal system of CPU, memory, and console switch register for initial tests.
- Power up the computer, checking supply voltages, fans, and front console for signs of life.
- Use simple routines at the switch register to check for elementary operation.
- Fit a serial line unit so that a VT or a Teletype console can be used.
- Get the keyboard echoing to the screen or printer with simple routines.
- If they are available, run the internal tests of the read-only memory (ROM).


Figure 1
PDP-8/E Computer System
(Diagram. PDP-8/E CPU with extended arithmetic element and 16K words of memory; KL8E 2400-baud console; KL8E 2400-baud communication port; DECtape bootstrap; RK05 disk bootstrap; real-time clock; RX01 dual 8-inch floppy diskettes; TD8E accumulator transfer with TU56 dual DECtape system; PC8E 300-cps reader and 50-cps punch paper tape; RK05 removable 2.4-MB cartridge disk; storage rack for 10 DECtapes; H861 power distribution.)

Conventional wisdom would now advise that all the diagnostic routines be run. However, diagnostics were (philosophically) always used to find bugs in a previously good machine; they are too complex when huge chunks of the machine might still be missing. The most practical next step is to get mass storage on-line. Depending on the manufacturer, the target device may be a floppy disk drive, a cartridge hard disk drive, or some form of magnetic tape. With a working mass storage device and a bootstrap routine, it becomes possible to boot a simple operating system (like OS/8 or [...]


Table 3
Goals of the Australian Digital Museum

- To preserve one of each model of Digital's computers
- To keep each major Digital operating system working
- To have a working unit of each Digital terminal, console, and PC
- To provide conversion and archival facilities for old media
- To preserve significant Digital literature and manuals
- To preserve a VAX-11/780 computer as the original unit of 1 VUP
- To disseminate instructive and educational material
- To educate and amuse our staff, our customers, and the public
- To support the DECUS NOP (nostalgic obsolete product) Special Interest Group
- To preserve spares, tools, test gear, and documentation to keep the collection working
- To preserve and protect these treasures for future generations

[...] (for example, a business card), not with a pencil eraser, which leaves residues. Silicon components appear to be very stable and a tribute to the conservative design principles of early computer engineers.

The main components that seem to age are power supply capacitors, fans, and lights. The filter capacitors across the high-voltage sources can short, and reference electrolytic capacitors in power supply regulators can dry out. Although the large capacitors in power supply RC filters have proven to be reliable, some restorers replace them as a matter of course for safety reasons. Small rotary fans may seize if they have logged many hours. Incandescent panel lamps are always failing and can be replaced by modern light-emitting diodes (LEDs) if required. The irony is that the panel lamps are needed only during initial checkout; once the operating system is running, they are rarely used.

Once restored, are old units reliable? Experience proves that they are. A classic PDP-8 system restored in 1988 still turns on happily (untouched) eight years later.
A fully configured PDP-8/E system is still working four years after restoration.

Restoring a Minicomputer: A Case Study

An ongoing project is the restoration of a large, UNIBUS-based PDP-11 system with many UNIBUS peripherals attached to it. The project was started using the original PDP-11/20 CPU. Since many PDP-11 peripherals were designed long after the PDP-11/20 CPU, it could not cope with single-board direct memory access (DMA) devices, metal-oxide semiconductor (MOS) memory, and other later inventions. The project refocused on the mid-range PDP-11/34, which in retrospect has proved wise. The PDP-11/34 supports MOS memory, has an LED and push-button console, and represents a mature implementation of the PDP-11 instruction set. It has an optional cache, battery backup, floating-point operation, and the extended instruction set (EIS).

Table 4
Digital Data Media from 1960 to 1996

- Paper tape
- 80-column punched and mark sense cards
- 7-track, half-inch magnetic tape
- 9-track, half-inch magnetic tape
- DECtape and LINCtape systems
- Audiocassette
- DECtape II cartridge (TU58)
- CompacTape (TK50, etc.)
- Quarter-inch cartridge tape
- Digital audio tape
- 8-inch floppy disk
- 5.25-inch floppy disk
- 3.5-inch floppy disk
- RK05 removable disk
- RK06, RK07 removable disk
- RL01, RL02 removable disk
- RP01 ... RP06 removable disk
- RM03, RM05 removable disk
- RC25 removable disk

The current configuration occupies three large cabinets in what used to be the dining room of [...]


The CPU and memory are relatively easy to check out. Due to the versatility of the UNIBUS, however, check[...] DMA or non-processor request (NPR) tests. Experience shows that tests need to be rerun whenever a peripheral is added.

The system currently runs the RT-11 version 5.04 operating system on a configuration comprising [...]


[...] fidelity, the test of which is that no piece of software running on the simulator should behave differently than it would on the target hardware. In practice, such perfect mimicry is difficult to achieve, as it requires a painstaking re-creation of timing detail (for example, the actual acceleration curve of a DECtape storage system) and access to implementation documentation that has often vanished. Nonetheless, some simulators have achieved results very close to this goal: MIMIC, [...] Applied Data Research, was able to run CPU- and device-specific diagnostics. (As testimony to the vulnerability of computing's past, all machine-readable copies of the MIMIC sources appear to have been lost.)

An instruction simulator steps back from the [...] level and tries to simulate at the functional or the behavioral level. System elements are treated as functions that transform state according to the abstract definitions of the system architecture, rather than as logic blocks that transform state based on implementation equations. Instruction simulators sacrifice absolute fidelity to the idiosyncrasies of a particular implementation and focus on the intentions of the architecture specification. As a result, instruction simulators can usually run systems software and applications but can rarely fool diagnostics.

Finally, a software-specific simulation further abstracts the functions of the target system to only those needed by a particular piece of target system software. For example, the OS/8 operating system on the PDP-8 computer does not use program interrupts; a simulator aimed at running only the OS/8 operating system would not need to implement interrupts or even queued events.
A recent PDP-11 simulator designed to run the 2.9 BSD UNIX operating system abstracted parts of the PDP-11 system's interrupt model and could not run other PDP-11 operating systems.

Simulating Minicomputers: A Case Study

SIM is a portable instruction-level minicomputer simulator implemented in C. Its objectives are to facilitate the study and use of historic computer architectures by making simulated implementations and historic software available to anyone who has a 32-bit computer. It supports several target architectures (listed in Table 8) and has been successfully ported to the VAX VMS, the Alpha OpenVMS, the Digital UNIX, and the Linux architectures. Ports to the Windows NT and the Windows 95 architectures and to an IBM 1401 simulator are under way.

General Design Considerations

The design of an instruction-level simulator is not technically complicated; indeed, simulating a PDP-8 system is a common problem in undergraduate computer science courses. SIM follows the processor-memory-switch (PMS) structure proposed by Bell and Newell and implemented in MIMIC and countless other simulators since. The simulated system is a collection of devices, one of which has special properties (the CPU). Each device has state (registers) and one or more units. Each unit has state and fixed- or variable-sized storage. In the CPU device, the storage is main memory. In an I/O device, the storage is the device media. The CPU is distinguished from other devices by having the master routine for instruction execution. This routine is responsible for the sequential evaluation of instructions and for the state transformations that represent simulated execution.
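The device/unit organization described above can be sketched in C, the language SIM is written in. This is only an illustration under stated assumptions: the type names (Unit, Device), the function names, and the three-opcode toy instruction set are all hypothetical and are not taken from SIM's sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* All names below are illustrative; they are not SIM's actual declarations. */

typedef struct {
    uint16_t *storage;    /* main memory for the CPU, media for an I/O device */
    size_t    capacity;   /* number of 16-bit words in storage */
} Unit;

typedef struct Device {
    const char *name;
    uint16_t   *registers;        /* device state */
    size_t      num_registers;
    Unit       *units;            /* one or more units */
    size_t      num_units;
    /* Master routine for instruction execution; NULL for non-CPU devices. */
    long      (*execute)(struct Device *self, long max_instructions);
} Device;

/* Toy instruction set (invented for this sketch): 4-bit opcode, 12-bit address. */
enum { REG_PC, REG_AC, NUM_CPU_REGS };
enum { OP_HALT = 0, OP_ADD = 1, OP_JMP = 2 };

/* Sequentially evaluates instructions, transforming CPU and memory state.
   Returns the number of instructions executed (including a final HALT). */
long cpu_execute(Device *cpu, long max_instructions)
{
    assert(cpu->num_units > 0 && cpu->units[0].capacity > 0);
    Unit *mem = &cpu->units[0];
    uint16_t *reg = cpu->registers;
    long executed = 0;

    while (executed < max_instructions) {
        uint16_t ir   = mem->storage[reg[REG_PC] % mem->capacity];  /* fetch */
        uint16_t op   = (uint16_t)(ir >> 12);                       /* decode */
        uint16_t addr = (uint16_t)(ir & 0x0FFF);
        reg[REG_PC] = (uint16_t)(reg[REG_PC] + 1);
        executed++;

        if (op == OP_HALT) {
            break;
        } else if (op == OP_ADD) {
            reg[REG_AC] = (uint16_t)(reg[REG_AC] + mem->storage[addr % mem->capacity]);
        } else if (op == OP_JMP) {
            reg[REG_PC] = addr;
        }   /* any other opcode is treated as a no-op in this sketch */
    }
    return executed;
}

/* Builds a one-device system, runs a three-instruction program, returns AC. */
uint16_t run_demo(void)
{
    uint16_t memory[16] = {0};
    uint16_t regs[NUM_CPU_REGS] = {0};
    Unit main_memory = { memory, 16 };
    Device cpu = { "CPU", regs, NUM_CPU_REGS, &main_memory, 1, cpu_execute };

    memory[0] = 0x1005;   /* ADD 5 */
    memory[1] = 0x1005;   /* ADD 5 */
    memory[2] = 0x0000;   /* HALT  */
    memory[5] = 21;       /* operand */

    cpu.execute(&cpu, 100);
    return regs[REG_AC];
}
```

Calling run_demo() adds the operand twice and halts, leaving 42 in the accumulator. An I/O device in this scheme would carry its own registers and units but leave the execute pointer NULL.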


Table 5
Commands Available in SIM

Command              Definition
attach               Associate file with unit's media.
detach | ALL         Disassociate unit's (all units) media from any file.
reset | ALL          Reset device (all devices).
load                 Load binary program from file.
boot                 Reset all devices and bootstrap from unit.
run {}               Reset all devices and resume execution at the current PC {or new PC}.
go {}                Resume execution at the current PC {or new PC}.
cont                 Resume execution at the current PC.
step {}              Execute one instruction {or number instructions}.
examine              Display contents of list of memory locations or registers.
iexamine             Display contents of list of memory locations or registers and allow interactive modification.
deposit              Store value in list of memory locations or registers.
ideposit             Interactively modify list of memory locations or registers.
save                 Save simulator state in file.
restore              Restore simulator state from file.
show queue           Display the simulator's event queue.
show configuration   Display the simulator's configuration.
show time            Display the simulated time counter.
show                 Show device's configuration options.
set                  Set a device configuration option.
help                 Display a terse help message.
exit | quit | bye    Leave the simulator.

[...] earlier. Lastly, the material is [...] differing revisions or versions of the architecture, as well as errors that have crept in during the documentation process.

For Digital's 12-bit and 16-bit minicomputers, the typical hierarchy of documentation was the following:

Processor Handbook. Providing an all-inclusive summary of the instruction set architecture, peripherals, bus interface, and software, these paperback-size books are the most common form of system documentation but also the least accurate.

Subsystem [...] the registers and functions at the hardware implementation level, often including substantial abstracts from the print set.
Because of the level of detail, the maintenance manuals have proven to be the most useful references for simulator implementation.

Design documents. For systems that do not have very large-scale integration (VLSI), the only extant design documents are the logic prints and the binary microcode ROM listings. The prints are essential for RTL simulation: they provide the only documentation of implementation quirks. For VLSI systems, there are chip-level design specifications as well as human-readable microprogram listings.

Folklore. During the useful lifetime of a system, its users exchange information and create an informal record, both written and verbal, of shared experiences


An important consideration is that much of the documentation, all the folklore, and most worl


Table 7
Software for Simulators in SIM

Architecture   Software                          Location
PDP-8          Basic instruction tests 1 and 2   Digital Australia collection
               Memory management test            Digital Australia collection
               FOCAL69                           Digital Australia collection
               OS/8 system disk                  Public archive
PDP-11         RT-11                             Transcribed from real system
               RSX-11M                           Transcribed from real system
               RSTS/E                            Transcribed from real system
               UNIX V5, V6, V7, 2.9 BSD          PDP UNIX Preservation Society (PUPS) archive
               2.11 BSD                          Private collection
Nova           RDOS                              Private collection
18-bit PDP     No software to date

optional PDP-11 instructions), the operating system will be sensitive to every error in implementation. For example, Digital's second-generation PDP-11 systems (the PDP-11/05, 11/40, and 11/45) were debugged with DOS-11 and RSTS after diagnostics failed to detect certain subtle implementation errors. Unfortunately, in an operating system, the distance in time and space between the error and the symptom may be enormous, and the traceable path may be lengthy and complicated. Artifacts in the software can also complicate debug: the OS/8 disk image on the Internet contains a copy of [...]

[...] as preservation of software and data; beyond that, there is an obligation to future generations. In 100 years, the systems from computing's early history will appear to be absolute dinosaurs of the past. Yet their educational and sociological value will be considerable. A computer is a machine with a soul, and it must be kept alive with its operating environment to show its abilities and the contemporary state of the art.

Acknowledgments


Table 8
Architectures Implemented by SIM

                PDP-8                    PDP-11           Nova
CPU             PDP-8/E                  J-11, Q-bus      Nova 820
Options         KE8E EAE,                Integral FP11    Multiply/divide
                KM8E memory extension
Memory          4-32K words              16 KB-4 MB       4-32K words
Terminal        KL8E                     DL11             KSR-33, Dasher
Paper tape      PC8E                     PC11             Yes
Clock           DK8E                     KW11L            Yes
Printer         LE8E                     LP11             Yes
Storage         RX8E/RX01                RX11/RX01        4019
                RK8E/RK05                RK11/RK05        4046/4047, 4048,
                RF08/RS08                RLV11/RL01,2     4057, 4234
Magnetic tape   TM8E/TU10                TM11/TU10        6026

                PDP-4          PDP-7                    PDP-9                      PDP-15
CPU             PDP-4          PDP-7                    PDP-9                      PDP-15/30
Options         -              T177 EAE,                KE09A EAE,                 KE15 EAE,
                               T148 memory extension    KX09A memory protection,   KM15 memory protection,
                                                        KP09A power                KP15 power
Memory          4-8K words     4-32K words              4-32K words                4-128K words
Terminal        KSR-28         KSR-33                   KSR-33                     KSR-35
Paper tape      Integral,      T444 reader,             PC09A reader-punch         PC15 reader-punch
                T75 punch      T75 punch
Clock           Yes            Yes                      Yes                        Yes
Printer         T62            T647                     T647 E                     LP15
Storage         -              T24 drum                 RF09/RS09                  RF15/RS09, RP15/RP02
Magnetic tape   -              -                        -                          TC59/TU10

the hardware. In addition, Bill provided a working OS/8 system disk, and John copied several PDP-11 operating system disks off a working PDP-11/34. Megan Gentry was an important source of PDP-11 folklore, debugged some of the subtlest problems, created the Mal


Table 9
Simulator Performance

Simulator   Simulated Instructions   Real Instructions   Ratio
            per Second               per Second
PDP-8       1,800,000                400,000             4.5:1
PDP-11      440,000                  500,000             .88:1
Nova        1,700,000                750,000             2.26:1

PDP-8 simulator V2.2b
sim> att rk0 os8.dsk
sim> boot rk0
[OS/8 directory listing, from COPYIT.SV through FOO.BN]
95 Files In 980 Blocks - 2212 Free Blocks
Simulation stopped, PC: 01207 (KSF)
sim>

Figure 2
PDP-8 Simulator Running OS/8


ucoder> nova
NOVA simulator V2.2b
sim> att dp0 rdos.dsk
sim> set tti dasher
sim> boot dp0
Filename?
NOVA RDOS Rev 7.50
Date (m/d/y) ? 4 8 96
Time (h:m:s) ? 16 26 0
R
list/e sys-.-
SYS5.LB       17216  D   05/24/77 13:18  05/31/85  [001017]  0
SYS.SV        56320  SD  12/14/95 16:21  12/14/95  [005057]  0
SYS.LB        20240  D   04/30/85 14:49  05/31/85  [000746]  0
SYS.OL        30720  C   12/14/95 16:21  12/14/95  [005272]  0
SYSGEN.SV     23040  SD  05/02/85 22:20  05/31/85  [001401]  0
R
disk
LEFT: 2158 USED: 2706 MAX. CONTIGUOUS: 2054
R
Simulation stopped, PC: 41740 (LDA 1,4,3)
sim>

Figure 3
Nova Simulator Running RDOS

5. A. Ahi, G. Burroughs, A. Gore, S. LaMar, C.-Y. Lin, and A. Wiemann, "Design Verification of the HP 9000 Series 700 PA-RISC Workstations," Hewlett-Packard Journal, vol. 43, no. 4 (1992).
6. W. Anderson, "Logical Verification of the NVAX CPU Chip Design," Digital Technical Journal, vol. 4, no. 3 (1992): 38-46.
7. [...] vol. 2, no. 2 (1990): 64-72.
8. A. Hutchings, "The Evolution of the [...]
14. For information on and pictures of Data General minicomputers, see C. Friend's Web page at http://www.ultranet.com/[...]/index.html.
[...] /collect/index.html.
20. For more information on the PDP-11 UNIX archive, see the PUPS home page at http://minnie.cs.adfa.oz.au/PUPS/index.html.


ucoder> pdp11
PDP-11 simulator V2.2b
sim> att rk0 rtrk.dsk
sim> boot rk0
RT-11SJ (S) V05.04
.da 8-apr-96
.dir
08-Apr-96
[RT-11 directory listing]
49 Files, 1432 Blocks
3330 Free blocks
.sho dev
Device  Status         CSR     Vector(s)
------  ------         ---     ---------
NL      Installed      000000  000
TT      Installed      000000  000
DL      Installed      174400  160
DM      Not installed  177440  210
DP      Not installed  176710  254
DX      Installed      177170  264
RK      Resident       177400  220
LS      -Not installed 176500  470 474 300 304
MT      Installed      172520  224
LP      Installed      177514  200
SP      Installed      000000  110
LD      Installed      000000  000
LC      Installed      177514  200
Simulation stopped, PC: 146506 (ASR R5)
sim>

Figure 4
PDP-11 Simulator Running RT-11


Biographies

Maxwell M. Burnet
Max Burnet has been with Digital in Australia for 29 years. During that time, he has sold, serviced, or marketed all the machines in the collection. He managed the Digital Australia subsidiary for seven years. He was a salesman in Boston during 1971 and managed to replace an IBM 1620 at Tufts University with a PDP-10. He is currently the oldest surviving "techie" in the Sydney office and makes many corporate presentations in Australia. He manages the Australian DECUS Society, the Subsidiary's local content and export obligations with the Australian Government, and the local Product Assurance Group. He has collected a museum of early Digital machines and is known around Sydney as "Museum Max." He received a B.Sc. (honours) from Melbourne University.

Robert M. Supnik


Modern Fortran Revived as the Language of Scientific Parallel Computing

New features of Fortran are changing the way in which scientists are writing and maintaining large analytic codes. Further, a number of these new features make it easier for compilers to generate highly optimized architecture-specific codes. Among the most exciting kinds of architecture-specific optimizations are those having to do with parallelism. This paper describes Fortran 90 and the standardized language extensions for both shared-memory and distributed-memory parallelism. In particular, three case studies are examined, showing how the distributed-memory extensions (High Performance Fortran) are used both for data parallel algorithms and for single-program-multiple-data algorithms.

A Brief History of Fortran

The Fortran (FORmula TRANslating) computer language was the result of a project begun by John Backus at IBM in 1954. The goal of this project was to provide a way for programmers to express mathematical formulas through a formalism that computers could translate into machine instructions. Initially there was a great deal of skepticism about the efficacy of such a scheme. "How," the scientists asked, "would anyone be able to tolerate the inefficiencies that would result from compiled code?"


now that appropriate standards have evolved. Just as early Fortran enabled average scientists and engineers to program the computers of the 1960s, modern Fortran may enable average scientists and engineers to program parallel computers of the next decade.

An Introduction to Fortran 90

Fortran 90 introduces some important capabilities in mathematical expressivity through a wealth of natural constructs for manipulating arrays. In addition, Fortran 90 incorporates modern control constructs and up-to-date features for data abstraction and data hiding. Some of these constructs, for example, DO WHILE, although not part of FORTRAN 77, are already part of the de facto Fortran standard as provided, for example, with DEC Fortran.

Among the key new features of Fortran 90 are the following:

Inclusion of all of FOR


A Brief History of Parallel Fortran: PCF and HPF

During the past ten years, two significant efforts have been under[...] began work on extending Fortran 90 for distributed-memory architectures, with the goal of providing a language suitable for scalable computing. This committee became


High Performance Fortran V1.1 is currently the only language standard for distributed-memory parallel computing. The most significant way in which HPF extends Fortran 90 is through a rich family of data placement directives. There are also library routines and some extensions for control parallelism. HPF is the simplest way of parallelizing data-parallel applications on clusters (also known as "farms") of workstations and servers. Other methods of cluster parallelism, such as message passing, require more bookkeeping and are therefore less easy to express and less easy to maintain. In addition, during the past year, HPF has become widely available and is supported on the platforms of all major vendors.

HPF is often considered to be a data parallel language. That is, it facilitates parallelization of array-based algorithms in which the instruction stream can be described as a sequence of array manipulations, each of which is inherently parallel. What is less well known is that HPF also provides a powerful way of expressing the more general SPMD parallelism mentioned earlier. This kind of parallelism, often expressed with message-passing libraries such as MPI, is one in [...]

The HPF compiler is responsible for generating all of the boundary-element communication code. The compiler is also responsible for determining the most even distribution of arrays. (If, for example, there were 13 processors, some chunks would be bigger than others.)

This simple example is useful not only as an illustration of the power of HPF but also as a way of pointing to one of the hazards of parallel algorithm development. Each of the element-updates involves three floating-point operations: an addition, a subtraction, and a multiplication. So, as an example, on a four-processor system, each processor would operate on 250 elements with 750 floating-point operations. In addition, each processor would be required to communicate one word of data for each of the two chunk boundaries. The time that each of these communications takes is [...] parallelism.

A Three-dimensional Red-Black Poisson Equation Solver

The example of a one-dimensional algorithm in the previous section can be easily generalized to a more realistic three-dimensional algorithm for solving the Poisson equation using a relaxation technique commonly known as the red-black method. The grid is partitioned into two colors, following a two-dimensional checkerboard arrangement. Each red grid element is updated based on the values of neighboring black elements. A similar array assignment can
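The per-processor arithmetic in the one-dimensional example (250 elements and 750 floating-point operations on each of four processors, uneven chunks on 13) can be checked with a short C sketch. It assumes a 1,000-element array, which is implied by the four-processor figures, and a block size of ceil(n/p); the function names are hypothetical, and an actual HPF compiler is free to choose its own mapping.

```c
#include <assert.h>

/* Number of elements owned by processor `rank` (0-based) when n elements
   are dealt out in contiguous blocks of ceil(n / p). Assumed scheme. */
long block_length(long n, long p, long rank)
{
    assert(n >= 0 && p > 0 && rank >= 0 && rank < p);
    long block = (n + p - 1) / p;       /* ceiling division */
    long start = rank * block;
    if (start >= n) {
        return 0;                       /* trailing processors may own nothing */
    }
    long remaining = n - start;
    return remaining < block ? remaining : block;
}

/* Floating-point work for one processor, given the cost of one element update. */
long flops_per_processor(long n, long p, long rank, long flops_per_element)
{
    return block_length(n, p, rank) * flops_per_element;
}
```

With n = 1000 and three operations per update, every one of four processors gets 250 elements and 750 operations, while on 13 processors the first twelve get 77 elements and the last gets 76. Each chunk still exchanges only one word at each of its two boundaries, so the ratio of communication to computation grows as the chunks shrink.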


      CALL CFD(V)    ! DO LOCAL WORK ON THE LOCAL PART OF V

      EXTRINSIC(HPF_LOCAL) SUBROUTINE CFD(VLOCAL)
      REAL*8, DIMENSION(:,:) :: VLOCAL
!HPF$ DISTRIBUTE *(*,BLOCK) :: VLOCAL
      END

Figure 2


5. For example see Proceedings of Supercomputing '93 (IEEE, November 1993): 875-883, and W. Gropp, E. Lusk, and A. Skjellum, [...]

[...] both as a scientist and as a computing consultant. Joining Digital from BBN in 1991, Bill managed the porting of major scientific and engineering applications to the


Performance Measurement of TruCluster Systems under the TPC-C Benchmark

Judith A. Piantedosi
Archana S. Sathaye
D. John Shakshober

Digital Equipment Corporation and Oracle Corporation have announced a new TPC-C performance record in the competitive market for database applications and UNIX servers on the AlphaServer 8400 5/350 four-node TruCluster system. A performance evaluation strategy enabled Digital to achieve record-setting performance for this TruCluster configuration supporting the Oracle Parallel Server database application under the TPC-C workload. The system performance in this environment is a result of tuning the system under test and taking advantage of TruCluster features such as the MEMORY CHANNEL interconnect and Digital's distributed lock manager and distributed raw disk service.

Current industry trends have moved from centralized computing offered by uniprocessors and symmetric multiprocessing (SMP) systems to multinode, highly available and scalable systems, called clusters. The TruCluster multicomputer system for the Digital UNIX environment is the latest cluster product from Digital Equipment Corporation. In this paper, we discuss our test and results on a four-node AlphaServer 8400 5/350 Tru


the system under test to use the very large memory technology and trade off memory for the database cache with memory for DLM locks to improve the throughput. (For a discussion of this technology, see the section Performance Evaluation Methodology.) We measured the maximum throughput, the 90th percentile response time for each transaction type, and the keying and think times. Finally, we compared our measured throughput and price/performance with competitive vendors like Tandem Computers and Hewlett-Packard Company.

The rest of the paper is organized as follows. In the next section, we provide a synopsis of the TruCluster technology and introduce the Oracle Parallel Server, an optional Oracle product that enables the user to use TruCluster technology with the Oracle relational database management system. Following that, we give an overview of the TPC-C benchmark. Next, we describe the system under test and our performance evaluation methodology. Then we discuss our performance measurement results and compare them with competitive vendor results. Finally, we present our concluding remarks and discuss our future work.

TruCluster Clustering Technology

Digital's Tru


shared SCSI buses, thus constructing an Available Server Environment (ASE). A shared S[...] are supported for connecting clients to cluster members. Disks are connected either locally (i.e., nonshared) to a S


[Figure 3 (caption truncated): diagram of the TPC-C tables, including WAREHOUSE, DISTRICT, HISTORY, CUSTOMER, and ORDER-LINE. Key: each box gives the table name, cardinality, approximate row size, and table size as a function of the number of warehouses W; each arc gives the cardinality of the relationship. Note: + implies variations over the measurement interval as rows are deleted or added.]

[...] the quantity of stock for the items ordered by each of the last 20 orders in a district and determines the items that have a stock level below a specified threshold.


The TP


[Figure 5 (caption truncated): configuration diagram showing four 8-CPU, 8-GB AlphaServer 8400 5/350 systems joined by a MEMORY CHANNEL hub. Each system has 6 HSZ40 RAID controllers with 31 RZ28 and 141 RZ29 disk drives. The diagram also shows two DEChub 900 units, a GIGAswitch, and four groups of 4 VAXstation 3100 workstations.]

[...] calls. These transaction requests in each queue are processed in a first in, first out (FIFO) order by the Tuxedo server processes running on the client. We had 44 Tuxedo server processes that were not evenly distributed among the 5 order queues but were distributed so that the number of Tuxedo server processes dedicated to a queue was directly correlated to the percentage of the workload handled by the


[Figure 6: Logical Description of the Network Topology. The diagram shows the clients attached to each cluster node and the nodes joined through the MEMORY CHANNEL hub. Key: MEMORY CHANNEL link cable, FDDI, Ethernet.]

[Figure 7: Communication between an RTE, a Client, and a Server. Annotations in the diagram:]

Each emulated user on the RTE uses a different seed so all clients are not executing the mix in the same order.
There is a one-to-one relationship between emulated users and TPC Client Forms.
For this test, 1,620 users were emulated on each RTE. This number, however, is dependent on the amount of memory on the client.
LAT connections were used from emulated users to TPC Client Forms.
TPC Client Forms send requests to the appropriate order queue.
Each queue represents one transaction type.
For this test, 44 total Tuxedo Servers service requests. Each process services one type of transaction. However, not all transaction types have the same number of server processes.
Communication between the Tuxedo libraries and the server (cluster node) is TCP/IP over the FDDI ring.


queue. In other words, the greater the percentage of the workload on a queue, the greater the number of Tuxedo server processes dedicated to that queue. The number of Tuxedo server processes per client is computed based on the rule of thumb that each queue should have no more than 300 outstanding requests during checkpointing and 15 at other times. These Tuxedo server processes communicate with the server system (cluster node) using the Transmission Control Protocol/Internet Protocol (TCP/IP) over FDDI to execute related database operations.

The industry-accepted method of tuning the TPC-C back end is to add enough disks and disk controllers on the server to eliminate the potential for an I/O bottleneck, thus forcing the CPU to be saturated. Once the engineers are assured that the performance limitation is CPU saturation, the amount of memory is tuned to improve the database hit ratio. Because all vendors submitting TPC-C results use this style of tuning, the performance limitation for TP[...] and DLM services of the TruCluster software to present a contiguous view of the database across the cluster. If both the database and the indexes could have been completely partitioned, we could have achieved close to linear scaling per node. However, since the Oracle Parallel Server does not have horizontal partitioning of the indexes, we could not completely partition the indexes across the cluster. This resulted in 15 percent to 20 percent of internodal access, which means that 15 percent to 20 percent of the new orders were satisfied by remote warehouses, therefore making our TPC[...] the Oracle System Global Area (SGA) and the DLM. Our testing found that using VLM to increase the size of the SGA to 5.0 GB of physical memory yielded optimal performance in a Tru[...].) Consequently, as seen in Figure 8, the tpm on a single-node cluster system running the Oracle Parallel Server (8.4K tpm) is less than a single-node cluster not running the Oracle Parallel Server (11.4K tpm).

In an Oracle Parallel Server environment, we assigned 1 GB of memory to the DLM for the following reasons: The DLM, under the 64-bit Digital UNIX operating system, requires 256 bytes for each loc
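As a rough illustration of what the two figures above imply, the following worked calculation assumes binary units (1 GB = 2^30 bytes) and, purely for the sake of the estimate, that the whole 1 GB holds nothing but 256-byte lock structures. Neither assumption is stated in the text, so the result is an upper bound rather than a measured lock count.

```latex
\[
\frac{1\ \text{GB}}{256\ \text{bytes per lock}}
  = \frac{2^{30}\ \text{bytes}}{2^{8}\ \text{bytes}}
  = 2^{22}
  = 4{,}194{,}304\ \text{locks}
\]
```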


[Figure 8 (caption truncated): bar chart of throughput for a 1-node (8-CPU) system without the Oracle Parallel Server, and for 1-node (8-CPU), 2-node (16-CPU), 3-node (24-CPU), and 4-node (32-CPU) systems running the Oracle Parallel Server.]

Notes:
1. Each node is an 8-CPU AlphaServer 8400 5/350 cluster system.
2. The number preceding the X indicates a multiple of the tpmC measured on a single node running the Oracle Parallel Server.


more to ensure the reproducibility of the maximum measured tpmC. Digital Eq[...] performance.

The performance testing of the TruCluster multicomputer system was time-consuming and expensive. Thus, answering "what if" questions regarding sizing and tuning of varying cluster configurations under different workloads using measurements is an expensive (with respect to money and time) task. To address this problem, we are developing an analytical performance cluster model for capacity planning and tuning. The model will predict the performance of cluster configurations (ranging from two to eight members) with varying workloads and system parameters (for


[Figure 9 (caption truncated): chart comparing tpmC and price/performance for the AlphaServer 8400 5/350 C/S (32-CPU, 4-node) and the HP 9000 Enterprise Server Model EPS30 C/S (48-CPU, 4-node).]


13. Digital Equipment Corporation and Oracle Corporation, "Digital AlphaServer 8400 5/350 32-CPU 4-Node Cluster Using Oracle7, Tuxedo, and Digital UNIX," TPC Benchmark C Full Disclosure Report filed with the Transaction Processing Performance Council, April 1996. Also available from the TPC Web page.
14. Note that these results were not audited; per TPC-C specification, we refer to them as tpm instead of tpmC.
15. Horizontal partitioning of the indexes allows the user to have each node in the cluster store indexes that are mapped only to tables that are local.
16. T. Kawaf, D. Shakshober, and D. Stanley, "Performance Analysis Using Very Large Memory on the 64-bit AlphaServer System," D


Performance Analysis Using Very Large Memory on the 64-bit AlphaServer System

Tareef S. Kawaf
D. John Shakshober
David C. Stanley

Optimization techniques have been used to deploy very large memory (VLM) database technology on Digital's AlphaServer 8400 multiprocessor system. VLM improves the use of hardware and software caches, main memory, the I/O subsystems, and the Alpha 21164 microprocessor itself, which in turn causes fewer processor stalls and provides faster locking. Digital's 64-bit AlphaServer 8400 system running database software from a leading vendor has achieved the highest TPC-C results to date, an increased throughput due to increased database cache size, and an improved scaling with symmetric multiprocessing systems.

Digital's AlphaServer 8400 enterprise-class server combines a 2-gigabyte-per-second (GB/s) multiprocessor bus with the latest Alpha 21164 64-bit microprocessor. Between October and December 1995, an AlphaServer 8400 multiprocessor system running the 64-bit Digital UNIX operating system achieved unprecedented results on the Transaction Processing Performance Council's TPC-C benchmark, surpassing all other single-node results by a factor of nearly 2. As of September 1996, only one other computer vendor has come within 20 percent of the AlphaServer 8400 system's TPC-C results.

A memory size of 2 GB or more, known as very large memory (VLM), was essential to achieving these results. Most 32-bit UNIX systems can use 31 bits for virtual address space, leaving 1 bit to differentiate between system and user space, which creates difficulties when attempting to address more than 2 GB of memory (whether virtual or physical).

In contrast, Digital's Alpha microprocessors and the Digital UNIX operating system have implemented a 64-bit virtual address space that is four billion times larger than 32-bit systems. Today's Alpha chips are capable of addressing 43 bits of physical memory.
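The address-space figures quoted above follow directly from powers of two. The short derivation below assumes byte addressing and binary units (1 GB = 2^30 bytes, 1 TB = 2^40 bytes), which the text does not spell out.

```latex
\begin{align*}
\text{31-bit user space:}        \quad & 2^{31}\ \text{bytes} = 2 \times 2^{30}\ \text{bytes} = 2\ \text{GB} \\
\text{64-bit versus 32-bit:}     \quad & \frac{2^{64}}{2^{32}} = 2^{32} = 4{,}294{,}967{,}296 \approx 4\ \text{billion} \\
\text{43-bit physical address:}  \quad & 2^{43}\ \text{bytes} = 8 \times 2^{40}\ \text{bytes} = 8\ \text{TB}
\end{align*}
```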
The AlphaServer 8400 system supports as many as 8 physical modules, each of which can contain 2 CPUs or as much as 2 GB of memory. Using these limits, database applications tend to achieve peak performance using 8 to 10 CPUs and as much as 8 GB of memory.

The examples in this paper are drawn primarily from the optimization of a state-of-the-art database application on AlphaServer systems; similar technical considerations apply to any database running in an Alpha environment. As of September 1996, three of the foremost database companies have extended their products to exploit Digital's 64-bit Alpha environment, namely Oracle


TPC-C Benchmark

The TPC-C benchmark was designed to mimic complex on-line transaction processing (OLTP) as specified by the Transaction Processing Performance Council. The TPC-C workload depicts the activity of a generic wholesale supplier company. The company consists of a number of distributed sales districts and associated warehouses. Each warehouse has 10 districts. Each district services 3,000 customer requests. Each warehouse maintains a stock of 100,000 items sold by the company. The database is scaled according to throughput (that is, higher transaction rates use larger databases). Customers call the company to place new orders or request the status of an existing order.

Method

The benchmark consists of five complex transactions that access nine different tables. The five transactions are weighted as follows:

1. Forty-three percent: A new-order transaction places an order (an average of 10 lines) from a warehouse through a single database transaction and updates the corresponding stock level for each item. In 99 percent of the new-order transactions, the supplying warehouse is the local warehouse and only 1 percent of the accesses are to a remote warehouse.
2. Forty-three percent: A payment transaction processes a payment for a customer, updates the customer's balance, and reflects the payment in the district and warehouse sales statistics. The customer resident warehouse is the home warehouse 85 percent of the time and is the remote warehouse 15 percent of the time.
3. Four percent: An order-status transaction returns the status of a customer order. The customer order is selected 60 percent of the time by the last name and 40 percent of the time by an identification number.
4. Four percent: A delivery transaction processes orders corresponding to 10 pending orders for each district with 10 items per order. The corresponding entry in the new-order table is also deleted. The delivery transaction is intended to be executed in deferred mode through a queuing mechanism. There is no terminal response for completion.
5. Four percent: A stock
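As a rough illustration of how such a weighted mix drives a workload generator, the following Python sketch draws transaction types in proportion to the percentages listed above. The weights are copied from the list as printed (they sum to 98, so they are treated as relative weights rather than exact percentages). The names `TPCC_MIX` and `sample_transactions` are illustrative and not part of any TPC kit, and the key `stock` is a placeholder because the fifth transaction's name is cut off in the source.

```python
import random
from collections import Counter

# Relative weights as printed in the list above (percent of transactions).
# The fifth transaction's name is truncated in the source, so "stock" is a placeholder.
TPCC_MIX = {
    "new-order": 43,
    "payment": 43,
    "order-status": 4,
    "delivery": 4,
    "stock": 4,
}


def sample_transactions(n, seed=None):
    """Draw n transaction types according to the relative weights in TPCC_MIX."""
    rng = random.Random(seed)
    names = list(TPCC_MIX)
    weights = list(TPCC_MIX.values())
    return rng.choices(names, weights=weights, k=n)


if __name__ == "__main__":
    total = 10_000
    counts = Counter(sample_transactions(total, seed=1))
    for name in TPCC_MIX:
        print(f"{name:13s} {counts[name] / total:.1%}")
```

A real driver would also apply the per-transaction rules described above (for example, choosing a remote warehouse for 1 percent of new-order accesses); this sketch covers only the top-level mix.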


Table 1
TPC-C Results

System | Throughput | Price/Performance | Number of CPUs | Date
AlphaServer 8400 5/350, Oracle Rdb7 V7.0, OpenVMS V7.0 | 14,227 tpmC | $269/tpmC | 10 | May 1996
AlphaServer 8400 5/350, Sybase SQL Server 11.0, Digital UNIX, iTi Tuxedo | 14,176 tpmC | $198/tpmC | 10 | May 1996
AlphaServer 8400 5/350, Informix V7.21, Digital UNIX, iTi Tuxedo | 13,646.17 tpmC | | | March 1996
Sun Ultra Enterprise 5000, Sybase SQL Server V11.0.2 | 11,465.93 tpmC | | | April 1996
AlphaServer 8400 5/350, Oracle7, Digital UNIX, iTi Tuxedo | 11,456.13 tpmC | | | December 1995
AlphaServer 8400 5/300, Sybase SQL Server 11.0, Digital UNIX, iTi Tuxedo | 11,014.10 tpmC | | | December 1995
AlphaServer 8400 5/300, Oracle7, Digital UNIX, iTi Tuxedo | 9,414.06 tpmC | | | October 1995
SGI CHALLENGE XL Server, INFORMIX-OnLine V7.1, IRIX, IMC Tuxedo | 6,313.78 tpmC | | | November 1995
HP 9000 Corporate Business Server, Sybase SQL Server 11, HP-UX, IMC Tuxedo | 5,621.00 tpmC | | | May 1995
HP 9000 Corporate Business Server, Oracle7, HP-UX, IMC Tuxedo | 5,369.68 tpmC | | | May 1995
Sun SPARCcenter 2000E, Oracle7, Solaris, Tuxedo | 5,124.21 tpmC | | | April 1996
Sun SPARCcenter 2000E, INFORMIX-OnLine 7.1, Solaris, Tuxedo | 3,534.20 tpmC | | | July 1995
IBM RS/6000 PowerPC R30, DB2 for AIX, AIX, IMC Tuxedo | | | | June 1995
IBM RS/6000 PowerPC J30, DB2 for AIX, AIX, IMC Tuxedo | | | | June 1995

Table 2
Amount of Memory versus Back-end tpm, Database-cache Miss Rate, and Instructions per Transaction

Database Memory (GB) | Back-end (Normalized tpm) | Relative Database-cache Miss (Percentage) | Relative Instructions per Transaction

Two optimizations generally realized 20 percent gains on Alpha systems. These were

1. Optimization of spinlock primitives supported now by DE


Lock Optimization

Lock


    // TEST_AND_SET implements the Alpha version of a test and set operation using
    // the load-locked .. store-conditional instructions. The purpose of this
    // function is to check the value pointed to by spinlock_address and, if the
    // value is 0, set it to 1 and return success (1) in R0. If either the spinlock
    // value is already 1 or the store-conditional failed, the value of the spinlock
    // remains unchanged and a failure status (0, 2, or 3) is returned in R0.
    //
    // The status returned in R0 is one of the following:
    //   0 - failure (spinlock was clear; still clear, store-conditional failed)
    //   1 - success (spinlock was clear; now set)
    //   2 - failure (spinlock was set; still set, store-conditional failed)
    //   3 - failure (spinlock was set; still set)
    //
    #define TEST_AND_SET(spinlock_address) asm( "ldl_l $0,($16);" \
                                                "or $0,1,$1;" \
                                                "stl_c $1,($16);" \
                                                "sll $0,1,$0;" \
                                                "or $0,$1,$0", (spinlock_address));

    // BASIC_SPINLOCK_ACQUIRE implements the simple case of acquiring a spinlock. If
    // the spinlock is already owned or the store-conditional fails, this function
    // spins until the spinlock is acquired. This function doesn't return until the
    // spinlock is acquired.
    //
    #define BASIC_SPINLOCK_ACQUIRE(spinlock_address) \
    { \
        long status = 0; \
        while (1) \
        { \
            if (*(spinlock_address) == 0) \
            { \
                status = TEST_AND_SET(spinlock_address); \
                if (status == 1) \
                { \
                    MB; \
                    break; \
                } \
            } \
        } \
    }

Figure 1
Code Sequences for Locking Intrinsics

instruction-cache miss rate of 10 to 12 percent can effectively stall the


Figure 2
Database Cache Size versus Throughput (x-axis: database cache size in GB)

processes. For example, an 8-GB system allows 6.6 GB to be used for the database cache.

Performance Analysis

Why does the use of VLM improve performance by a factor of nearly 2? Using statistics within the database, we measured the database-cache hit ratio as memory was added. Figure 3 shows the direct correlation between more memory and decreased database-cache misses: as memory is added, the database-cache miss rate declines from 12 percent to 5 percent. This raises two more questions: (1) Why does the database-cache miss rate remain at 5 percent? and (2) Why does a small change in database-cache miss rates improve the throughput so greatly?

The answer to the first question is that with a database size of more than 100 GB, it is not possible to cache the entire database. The cache improves the transactions that are read-intensive, but it does not entirely eliminate I/O contention.

Figure 3
Cache Miss Rates and Bus Utilization (x-axis: memory in GB; series: bus utilization, B-cache miss rate, I-cache miss rate, database cache miss rate)

To answer the second question, we need to look at the AlphaServer 8400 system's hardware counters that measure instruction-cache (I-cache) miss rate, board-cache (B-cache) miss rate, and the bandwidth used on the multiprocessor bus. With an increase in throughput and memory size, the VLM system is spanning a larger data space, and the bus utilization increases from 24 percent to 32 percent. Intuitively, one might think this would result in less optimal instruction- and data-stream locality, thus increasing both miss rates. As shown in Figure 3, this proved true for instruction stream misses (I-cache miss rate) but not true for the data stream, as represented by the B-cache miss rate.
The instruction stream rarely results in B-cache misses, so B-cache misses can be attributed primarily to the data stream.

Performance analysis requires careful examination of the throughput of the system under test. The apparent paradox just related can be resolved if we normalize the statistics to the throughput achieved. Figure 4 shows that the instruction-cache misses per transaction declined slightly as the memory size was increased from 1 GB to 6 GB and as transaction throughput doubled. Furthermore, the B-cache works substantially better with more memory: misses declined by 2X on a per-transaction basis. Why is this so?

Analysis of the system monitor data for each run indicates that bringing the data into memory helped reduce the I/O per second by 30 percent. If the transaction is forced to wait for I/O operations, it is done asynchronously, and the database causes some other thread to begin executing. Without VLM, 12 percent of transactions miss the database cache and thus stall for I/O activity. With VLM, only 5 percent of the transactions miss the database cache, and the time to perform each transaction is greatly reduced. Thus each thread or process has a shorter transaction latency. The shorter latency contributes to a 15-percent reduction in system context switch rates. We attribute the measured improvement in hardware miss rates per transaction when using VLM to the improvement in context switching.

The performance counters on the Alpha microprocessor were used to collect the number of instructions issued and the number of cycles. In Table 2, the relative instructions per transaction results are the ratios of instructions issued per second divided by the number of new-order transactions.
(In TPC-C, each transaction has a different code path and instruction count; therefore the instructions per transaction amount is not the total number of new-order transactions.) The relative difference between instructions per transaction for 1 GB of database memory versus 6 GB of database memory is the measured effect of eliminating 30 percent of the I/O operations, satisfying more transactions from main memory, reducing context switches, and reducing lock contention.
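The normalization used in this analysis, dividing a per-second hardware count by the transaction rate achieved in the same run, can be sketched as follows. The counter values in the example are invented purely for illustration (the text reports only relative results), and `per_transaction` and `relative_change` are hypothetical helper names.

```python
def per_transaction(events_per_second, transactions_per_second):
    """Normalize a per-second event count to events per transaction."""
    if transactions_per_second <= 0:
        raise ValueError("transaction rate must be positive")
    return events_per_second / transactions_per_second


def relative_change(before, after):
    """Fractional change from before to after (negative means a decline)."""
    return (after - before) / before


if __name__ == "__main__":
    # Hypothetical runs: raw misses per second rise with more memory,
    # but throughput doubles, so misses per transaction fall.
    small = per_transaction(events_per_second=1_000_000, transactions_per_second=100)
    large = per_transaction(events_per_second=1_800_000, transactions_per_second=200)
    print(small, large, relative_change(small, large))
```

This is the sense in which a raw miss rate can rise while the per-transaction cost falls: the denominator (throughput) grows faster than the numerator.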


Figure 4
Normalized Cache Miss Rates and Bus Traffic (x-axis: memory in GB; series: bus traffic, B-cache miss rate, I-cache miss rate)

Figure 5
CPU Scaling versus Memory (x-axis: number of CPUs; series: normalized tpm at 2 GB, normalized tpm at 8 GB)

Improved CPU Scaling - More Efficient Locking

A final benefit of using VLM is improved symmetric multiprocessing (SMP) scaling. Because the TPC-C workload has several transactions with high read content, having the data available in memory, rather than on disk, allows an SMP system to perform more efficiently. More requests can be serviced that are closer in cycles to the CPU. Data found in memory is less than a microsecond away, whereas data found on disk is on the order of milliseconds away.

We have shown how this situation improves the overall system throughput. In addition, it improves SMP scaling. Figure 5 shows the relative scaling between 2 CPUs and 8 CPUs with only 2 GB of system memory (1.5 GB of database cache) compared to the same configurations having 8 GB of system memory (6.6 GB of database cache).

We used the performance counters on the Alpha 21164 microprocessor to monitor the number of cycles spent on the memory barrier instruction. Memory barriers are required for implementing mutual exclusion in the Alpha processor. They are used by all locking primitives in the database and the operating system. With VLM at 8 GB of memory, we measured a 20-percent decline in time spent in the memory barrier instruction. Larger memory implied less contention for critical disk and I/O channel resources and thus less time in the memory barrier instruction.

Conclusions

Open system database vendors are expanding into mainframe markets as open systems acquire greater processing power, larger I/O subsystems, and the ability to deliver higher throughput at reasonable response times.
To this end, Digital's AlphaServer 8400 5/350 system using VLM database technology has demonstrated substantial gains in commercial performance when compared to systems without the capability to use VLM. The use of up to 8 GB of memory helps increase system throughput by a factor of 2, even for databases that span 50 GB to 100 GB in size.

The Digital AlphaServer 8400 5/350 system combined with the Digital UNIX operating system to address greater than 2 GB of memory has made possible improved TPC-C results from several vendors. In this paper, we have shown how VLM

- Increased the throughput by a factor of nearly 2
- Increased the database-cache hit ratios from 88 percent to 95 percent

By using monitor tools designed for the Alpha platform, we have measured the effect of VLM in issuing fewer instructions per transaction on the Alpha 21164 microprocessor. When transactions are satisfied by data that is already in memory, the CPU has fewer hardware cache misses, fewer memory barrier processor stalls, faster locking, and better SMP scaling.

Future Digital AlphaServer systems that will be capable of using more physical memory will be able to further exploit VLM database technology. The results of industry-standard benchmarks such as TP


Engineering Group); Mark Davis and Rich Grove (Compilers Group); Peter Yakutis (I/O Performance Group); and Don Harbert and Pauline Nist (project sponsors).

References and Notes

1. D. Fenwick


Building Collaboration Software for the Internet

Dah Ming Chiu
David M. Griffin

Collaboration software for the Internet's World Wide Web involves the development of shared information systems for network computing. The AltaVista Forum version 2.0 software from Digital contains extensions to World Wide Web technology that facilitate collaboration on the Internet. The extensions consist of a toolkit and a set of collaboration applications. The toolkit components include a built-in database with an indexing and search capability. Generic applications include discussion, document sharing, and calendar applications and administrative functions for managing users, teams, and access control.

The Internet and the World Wide Web (WWW) have changed the scope of network computing. As the Internet user population has grown, so has the demand for better ways to collaborate on the Internet. Some examples include the ability to share and discuss issues of common interest, coauthor documents, and track project status. Although today's WWW is ideal for publishing information, it requires considerable customized programming to support collaboration. The AltaVista Forum version 2.0 product is both a set of collabor


- Hypertext Transfer Protocol (HTTP), a simple client-server protocol to transport information associated with a URL
- Web browser, a program that renders HTML documents, provides URL caching, and supports a directory for URLs
- Web server, a server that responds to requests for information from the Web browsers

Information Access

WWW technology has transformed the way users access information through computer networks. Access to information on the Internet was primarily text-based; with the WWW, users are able to access information in multimedia format. The combination of functionality (information linking, graphical interface, and caching), extensibility (for dealing with new protocols and new information types), ease-of-use, and low cost appealed to a wide range of users in homes, offices, and corporations. In addition, the Mosaic-style of "point-and-click" graphical Internet browser has become the most widely accepted user interface for network computing.

The most popular use of the WWW today is for publishing information, and the process is comparable to the way a newspaper publishes or a television station broadcasts information. The roles of the information provider and the information consumer are clearly defined. The information provider gathers and organizes the pertinent information, converts it to the HTML scripting format, and makes it available on a Web server. The information consumer, after obtaining initial access to the Web server (as one might tune into the correct television station), can then browse and search for various types of information available on that server. The linking capability of URL and HTML allows the references or links to additional information on various servers to be easily published along with the original information.

In contrast, multiple information providers work


Design Philosophy

Our fundamental design philosophy required using the Internet and its infrastructure as building blocks for our collaboration software. After years of experimenting and collaborating to develop an open process, the Internet developers realized that the Internet had reached a state of critical mass. In the case of networks and connectivity, reaching critical mass is a tremendous impetus for agreeing on a common standard. As more and more users access the Internet, the need for software development for the Internet also increases. In addition, the very nature of the Internet demands an open standardization process to ensure the long-term viability of a product.

Our philosophy also included the reuse of existing open software as building blocks whenever possible. In addition to our choice of building upon the Internet and the WWW technology, we selected the Tool Command Language (Tcl) as the primary language for developing most of our application and user interface functions. We also took advantage of the database library in the Berkeley UNIX distribution for built-in database support.

Another objective was to make sure our software would be easy to port to all the relevant operating system platforms. This principle guided our selection of components and helped us isolate a small set of platform-dependent functions into a special library for porting the software.

As stated earlier, we tried to take an object-oriented approach whenever possible. The advantages of our approach became increasingly apparent as more people became involved with the software development. The object-oriented approach made component reuse feasible.

Framework

Our framework organizes the AltaVista Forum software into two layers: toolkit and applications.
The tools required to build the applications overlap each other. We have used them to build generic applications, including a discussion application that supports users discussing a set of related topics, much like newsgroups do; a calendar application that supports users' abilities to schedule events on a specific date and at a particular time; and a newspaper application that provides a personalized news filtering service. We envision that, over time, the framework we have developed will support a number of diverse applications. Figure 1 shows the AltaVista Forum toolkit and application layers.

The toolkit is a combination of both C and Tcl code that creates the following interface components:

- Built-in database. The application uses a built-in database to store its object instances. The database is a very simple relational model with an object hierarchy relationship facility available to those applications that need it. The library also provides inversions on certain attributes to support fast retrieval and sorting based on attribute values.


- Graphical objects such as definitions of buttons, toolbars, various objects that are part of a form (e.g., select boxes, radio buttons, check boxes, text boxes), and icons.
- Database entries, the definitions of their attributes, and default values.
- User interface aggregate objects such as forms, views, dialogs, and error messages.
- Default access control policies, including default groups, access rights, and their mappings, to control who can access individual forums and what actions they can take within them.

This approach encapsulates the details in low-level modules, making the software more readable and maintainable. It also makes it easy for different functions to reuse the objects.

To further facilitate code sharing, the framework


Forum product can support all the users that a Web server can handle since only one repository of users and groups is necessary.

Community, Team, and Personal Vistas

A vista is another term for home page, which is a place for the user to log in to the WWW. Once in the community vista, the user sees a set of public forums and links to perform various tasks, e.g., register oneself, look up teams or join a team, perform AltaVista Forum administrative tasks (if an administrator), and so on. For this reason, the community vista is also called the summit. In much the same way, a team vista keeps track of all the forums and links for a group of users, and a personal vista performs this function for a single user. Both team and personal vistas can own forums that are not visible to the public community vista.

Discussion

Much like a bulletin-board discussion group or Digital's D


select box part of the toolkit procedure, which translates this object into HTML.

Most of the early Web browsers were single-window based. This limitation was especially problematic for us because most of our applications provide some organization to the information content. A much more natural way of browsing for our environment would include at least two windows: one showing the context and the other showing the content of a specific item. For this reason, we introduced multiple navigational methods. For example, the discussion application

- Allows hierarchical navigation (previous, next, up)
- Allows navigation in chronological order (next unseen, what's new)
- Provides a category view that lists topics according to their category
- Supports content-based search or an index-like function

Newer versions of Web browsers support frames, which have multiple window-browsing capabilities (although the standards in this area are still a bit vague). We are updating our applications to take advantage of these new features.

Usability studies guided our decisions as we were designing forms and dialog boxes. It is likely that many potential users of our product are familiar with Windows-style user interface objects. Because the early Web browsers (e.g., Mosaic) were UNIX-based, little attention was given to providing a human-computer interface that resembled the more widely used Windows interface. However, our usability studies indicated that many personal computer (PC) users had difficulty using Web browsers out-of-the-box. For example, a user might expect a dialog box to have certain standard buttons, such as OK


a set of attributes that is similar to a relational database table. The toolkit provides each entry with a set of built-in attributes (such as title, creation and modification dates, and author). The applications can then deliver additional attributes.

The toolkit provides the means to retrieve, modify, and iterate through the collection of entries in a straightforward manner. Because the attributes are part of the application description and are not stored in a separate database, the toolkit can use its knowledge of the attributes to simplify certain common operations. For example, because transferring data from HTML forms to the database and back is a basic operation in collaborative applications, the toolkit can link fields on forms to database attributes, making it possible to store them with a single command. To support a dynamic development environment, the toolkit also upgrades databases in real time as new attributes are added or deleted. This permits the application developer to concentrate on the task at hand rather than worry about database management tasks.

Although the primary organization mechanism is a flat table indexed by document identifiers, the database integrates a hierarchical relationship between entries when necessary. Because hierarchies are common in collaborative applications (e.g., folders/documents and topics/replies), it was important to reflect this in a natural way in the database.

In addition to attributes, the database offers properties. Compared to attributes, which are stored for each entry in the database, properties are stored within each forum. Application designers can use these properties in any way they desire: they are simple key-value relationships. The AltaVista Forum software uses properties to implement a variety of features, from access control policies to the background color of the screen display.

User properties are an extension of standard forum properties.
They act like forum properties except that they are tied to the user who is executing the transaction. User properties keep database locking to a minimum because, in collaborative applications, a user will typically execute only one transaction at a time.

Indexing and Search: The Way of the Future?

One key design decision was to include an indexing and search engine as a basic component of the product. Although the database is often the central piece of a groupware product, an indexing and search engine often plays a similar role for a WWW site. This development is completely consistent with the philosophy of the WWW: information is linked as needed, not necessarily following any structure. Database use is more suitable for information objects that have some uniformity in their definitions.

The basic function of the indexing engine is to map a set of words to a document containing those words. (The term document is used in a generic sense. It can be any logical entity associated with a set of words.) The indexing information must be stored in such a way that subsequent searches based on individual words (and phrases) are efficient and speedy. The indexing engine in the AltaVista Forum toolkit is basically the same indexing engine available on the AltaVista Web site. Designed and implemented at Digital's System Research Center, it is highly scalable and efficient.

The built-in database functions as a repository for entries with a predefined set of attributes. It provides fast retrieval when the entries are identified using either an entry ID or a hierarchical ID, and it provides simple creating, updating, and sorting functions associated with retrieval. The indexing and search engine complements the AltaVista Forum database: it provides a content-based search method and functions at higher speed.
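The word-to-document mapping described above is commonly implemented as an inverted index. The following Python sketch shows the idea under simplifying assumptions (whitespace tokenization, case-insensitive matching, and AND semantics for multi-word queries). It is not the AltaVista indexing engine, and the class, method, and document names are invented for illustration.

```python
from collections import defaultdict


class InvertedIndex:
    """Toy inverted index: maps each word to the set of document IDs containing it."""

    def __init__(self):
        self._postings = defaultdict(set)

    def add(self, doc_id, text):
        """Index a document; a document is any entity with an associated set of words."""
        for word in text.lower().split():
            self._postings[word].add(doc_id)

    def search(self, query):
        """Return the IDs of documents that contain every word in the query."""
        words = query.lower().split()
        if not words:
            return set()
        result = set(self._postings.get(words[0], set()))
        for word in words[1:]:
            result &= self._postings.get(word, set())
        return result


index = InvertedIndex()
index.add("topic-1", "Release schedule for the forum toolkit")
index.add("reply-7", "Toolkit database schema questions")
```

Because lookups touch only the posting sets of the query words, search cost depends on how many documents contain those words rather than on the total number of entries, which is one reason such an engine can complement a table-oriented database.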
Since the search engine is extremely fast and scalable, we also use it to index some of the attribute values in the database. This allows us to use the search engine for certain compute-intensive searches that otherwise would be performed by the database.

Based on our experience, we expect the capabilities of the indexing and search engine to continue to expand. As the popularity of the WWW technology continues to grow, the volume of published information will also increase. Only a small amount of this information can be effectively c


work on it was limited, we divided our efforts between making access control flexible and choosing default options that would promote collaboration.

We defined access control for the whole database (forum), rather than for individual entries and attributes of entries. However, some entry-level access control is necessary. For example, it is preferable to let only the owner (or the creator) of an entry modify and delete that entry. As a result, we allowed the group definition to include entry-specific logical users, rather than provide a general mechanism for entry-level access control. Therefore, a group may contain a member who is the owner of the current entry. During access control checking, the current entry's owner is looked up and matched against the currently logged-in user.

Instead of letting the administrator define access control for each possible incoming access/action, our framework allows the application definition to group accesses together into logical access rights. For example, for the discussion application, we defined the following access rights:

- Read: Includes all read URLs (different views, whether for a single entry or a list of entries)
- Contribute: Includes adding a topic or reply
- Modify: Includes any form of modification or deletion
- Moderate: Includes such functions as creating keywords, polling options, controlling number of levels of replies, and setting certain entries as hidden
- Administrate: Change access control or other kinds of resource consumption policies

By defining these access rights, the administrator only needs to establish who can do these five operations, rather than define numerous other kinds of operations. It is still possible to change and add to this group of access rights by making simple modifications to the application definition.

Our basic strategy for making access control easy to manage is to set up default policies of access control that apply to as many situations as possible, within reason. The default policy is added to the application definition. If the administrator is satisfied with the default policies, then the access control can be used as supplied. For the discussion application, the default policy is the following:

- Read: All users, including anonymous
- Contribute: All users, excluding anonymous
- Modify: Owner (creator) of entry and moderators
- Moderate: Owner of the forum
- Administrate: Owner of the forum

To simplify implementation, we chose not to allow nesting of groups. Our design allows for adding it in the future as long as it makes management of access control policies easier.

Future Directions

To date, we have received encouraging feedback from users. Of the ways that we can continue to improve the AltaVista Forum product, we feel the following deserve the highest priority.

First, we need to provide better ways to help users deal with information overflow. Although we have built ways to filter and search information into our application, further simplification is necessary. We are working on smart agents that bring the relevant information to the user's fingertips.

Second, a number of the functions that we provide can be more easily performed on the client machine. The Java language is the best candidate for providing these functions since it enables us to handle a wide variety of client platforms. Initially, we are looking into using Java to improve certain user interface problems, such as opening additional windows on the client machine to notify users of new information.

Third, synchronous collaboration using video, audio, and whiteboard will soon become feasible and cost effective. It is important for us to help bring users together through both synchronous and asynchronous methods of collaboration. For example, users should be able to use the calendar application to schedule a meeting over the Internet, and windows should be available to the user automatically.

Fourth, as the AltaVista Forum software matures, we hope to add to its performance and increase its scalability.
As its environment evolves, we are looking into ways to bypass the CGI interface and use a compiled language for more of the toolkit implementation. We also hope to add support for large commercial databases.

Finally, we will continue to add innovative applications to our product. We recently built a prototype of a customer-support application that keeps track of problem reporting. We are looking into other applications such as project management, group review, and survey and decision-support systems.

Acknowledgments

We wish to thank the AltaVista Forum development and management teams for their contributions to the product. In particular, we wish to thank Peter Hurley for his leadership in starting the effort; Ralph DeMent, Bob Travis, David Marques, and Rick Frankosky, who have worked with us throughout the lifetime of the product and with whom we have developed a special camaraderie; and Dan I


References and Notes

1. DEC Notes is a discussion application running primarily on VAX systems connected on a D


Further Readings

The Digital Technical Journal is a refereed, quarterly publication of papers that explore the foundations of Digital's products and technologies. Journal content is selected by the Journal Advisory Board, and papers are written by Digital's engineers and engineering partners. Engineers who would like to contribute a paper to the Journal should contact the managing editor, Jane Blake, at jane.blake@ljo.dec.com.

Topics covered in previous issues of the Digital Technical Journal are as follows:

Spiralog Log-structured File System/OpenVMS for 64-bit Addressable Virtual Memory/High-performance Message Passing for Clusters/Speech Recognition Software
Vol. 8, No. 2, 1996, EY-N6992-18

Digital UNIX Clusters/Object Modification Tools/eXcursion for Windows Operating Systems/Network Directory Services
Vol. 8, No. 1, 1996, EY-U025E-TJ

Audio and Video Technologies/UNIX Available Servers/Real-time Debugging Tools
Vol. 7, No. 4, 1995, EY-U002E-TJ

High Performance Fortran in Parallel Environments/Sequoia 2000 Research
Vol. 7, No. 3, 1995, EY-T838E-TJ (Available only on the Internet)

Graphical Software Development/Systems Engineering
Vol. 7, No. 2, 1995, EY-U001E-TJ

Database Integration/Alpha Servers & Workstations/Alpha 21164 CPU
Vol. 7, No. 1, 1995, EY-T135E-TJ (Available only on the Internet)

RAID Array Controllers/Workflow Models/PC LAN and System Management Tools
Vol. 6, No. 4, Fall 1994, EY-T118E-TJ

AlphaServer Multiprocessing Systems/DEC OSF/1 Symmetric Multiprocessing/Scientific Computing Optimization for Alpha
Vol. 6, No. 3, Summer 1994, EY-S799E-TJ

Alpha AXP Partners-Cray, Raytheon, I






Call for Papers
Network Products and Technologies

The Digital Technical Journal seeks technical papers in all areas of networking technology for an issue to be published in the fall of 1997. Digital's engineers and industry partners interested in participating in the special issue should send topics and brief abstracts (100 words) by February 10, 1997, to

Jane Blake, Managing Editor
Digital Technical Journal
Digital Equipment Corporation
50 Nagog Park, AK02-3/B3
Acton, MA 01720-9843
Email: jane.blake@ljo.dec.com
508-486-2544

Notice of the topics accepted will be sent to all authors by February 28, 1997. The manuscript-submission date for accepted topics is May 30, 1997.

For information on topics published in the Journal, the audience, writing guidelines, and the peer-review process, see http://www.digital.com/info/dtj/dtj-guide.html or contact the Managing Editor at the address above.


Printed in U.S.A. EC-N7285-18/96 12 14 21.5 Copyright © Digital Equipment Corporation
