
An Analytical Model for Wormhole Routing with Finite Size Input Buffers

Po-Chi Hu (a) and Leonard Kleinrock (b)

(a) Lucent Technologies, Inc., 200 Schulz Drive, Red Bank, NJ 07701, USA
(b) Department of Computer Science, University of California at Los Angeles, Los Angeles, CA 90095-1596, USA

Abstract

In this paper, we develop a queueing model for wormhole routing with finite size buffers. This model assumes the use of a deadlock-free routing scheme that guarantees no cycle of link dependency (defined in section 3). Several approximation methods for estimating the output link contention delay and buffer queueing delay are proposed. Comparing the analytical results to simulation, we show that the model is pessimistic with regard to network performance and that the difference in network throughput is less than 10 percent.

1 Introduction

1.1 Wormhole Routing

Wormhole routing is a simple, low-cost switching scheme often used for supercomputer interconnections. It has the merits of low latency, low cost, and simple implementation. In addition to its use for supercomputer interconnection, wormhole routing has also been applied to high-speed local area networks (LANs) [1,2,3] to support applications such as cluster computing that demand a very fast, high-data-rate communication medium.

Wormhole routing was developed from the earlier idea of cut-through switching [4], and was first introduced in [5]. A wormhole routing network is composed of several switches which have relatively small input buffers (see figure 1-a). As opposed to store-and-forward switching, a packet is forwarded to the next switch as soon as its header (or its routing information) is received (cut-through). If the outgoing link to the next switch is busy serving another packet, then the packet is blocked and resides in the network (see figure 1-b) until the outgoing link is available. In this case, called blocking, the switch must inform up-stream switches to stop transmission (i.e., it exercises back-pressure flow control) due to the limited size of buffers at each switch. A packet (which is also called a worm) may be buffered along a chain of switching nodes when blocked. Consequently, deadlocks are possible unless a deadlock-free routing strategy is employed. A survey of wormhole routing can be found in [6].

This work was supported by the Advanced Research Projects Agency, ARPA/CSTO, under Contract DABT63-93-C-0055 "The Distributed Supercomputer Supernet - A Multi Service Optical Intelligent Network".

Figure 1: An illustration of wormhole routing. (a) A wormhole routing switch (input port, input buffer, crossbar, output port). (b) An illustration of notation and various delays for a worm spread over links $l_i^p$ to $l_{i+4}^p$.

1.2 Wormhole Routing Analysis

Many performance models for wormhole routing in a multi-processor environment have been proposed and presented in the literature [7,8,9,10,11]. However, they all assumed a negligible size of input buffers. This buffer size must increase in a LAN environment to accommodate transit data that cannot be stopped immediately, because the link propagation delay is longer than in a multiprocessor interconnection application. As an example, a 640 Mbps Myrinet with a link length of 25 meters needs a buffer size of at least 54 bytes [2] per port to prevent data loss due to a buffer overflow, or a transmission break due to the possibility of the buffer being empty before transmission is resumed. A LAN spanning hundreds of meters requires a buffer size larger than hundreds of bytes (a buffer size that could hold more than one packet). These buffers alleviate the blocking problem. Thus, their effects must be captured in the model.

A finite size buffer complicates the analytical model in two ways. Firstly, the commonly used assumption that a worm reaches its destination before its tail leaves its source host is no longer valid. It is now the case that a blocked worm may occupy only a fraction of the links along its path (not all of them). Secondly, a buffer may hold more than one, but not an infinite number, of worms. Buffering delay becomes difficult to estimate because the buffer size is finite (in terms of the amount of data).

To deal with the delay caused by blocking in the succeeding hops, knowledge of the dependency among all links is needed. To estimate the link blocking delay, the length of the link dependency chain must be resolved according to the worm size distribution. Approximations for determining the blocking chain length and the link blocking delay are presented in section 4. The finite size buffer is approximated through equivalent M/G/1/K queues with finite capacity. The structure of the equivalent queue and its solution are described in section 5. The entire modeling procedure is summarized in section 6. Section 7 shows comparison results with simulations. Section 8 concludes this paper.

2 Model Assumptions and Notation

The analysis work presented in this paper assumes the following:

- a wormhole routing network using a deadlock-free routing that guarantees no cycle of link dependency. No cycle of link dependency is a sufficient, but not necessary, condition for deadlock-free routing, as discussed in [12,13].
- only one finite size buffer at each input port of a switch. Also, worms cannot share a link through interleaving (i.e., multiple virtual channels are not allowed).
- source routing. Routing is made by the source host and cannot be changed by switches (i.e., no deflection or adaptive routing).
- infinite size buffers at hosts.
- a Poisson worm arrival process and an arbitrary worm size distribution.

To facilitate the presentation, we measure packet length in flits, where a flit is the amount of data that can be transmitted in one time unit. For example, the 640 Mbps Myrinet [2] has one byte per flit lasting 12.5 ns.
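The flit arithmetic in the Myrinet example can be checked with a short sketch. The function names `flit_duration_ns` and `length_in_flits` are illustrative and do not come from the paper; the only facts used are the 640 Mbps link rate and the one-byte flit stated above.

```python
def flit_duration_ns(link_rate_bps, flit_bits=8):
    """Time needed to transmit one flit, in nanoseconds."""
    return flit_bits * 1e9 / link_rate_bps


def length_in_flits(packet_bytes, flit_bytes=1):
    """Packet length in flits, counting a partial flit as a whole one."""
    return -(-packet_bytes // flit_bytes)  # ceiling division


if __name__ == "__main__":
    # One byte per flit on a 640 Mbps link.
    print(flit_duration_ns(640e6))
    # A 200-byte worm expressed in one-byte flits.
    print(length_in_flits(200))
```

With one time unit equal to one flit duration, worm sizes, buffer sizes, and delays can all be expressed in the same unit, which is the convention the rest of the model relies on.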

The following defines some notation used throughout this paper. The notation is also illustrated in figure 1-b.

$l_{ab}$ = The link that originates at node (a host or a switch) $a$ and ends at node $b$.
$d_p$ = The length (number of hops) of path $p$.
$H_a$ = The set of paths which originate at host $a$.
$L_p$ = The set of links which are traversed along path $p$.
$l_i^p$ = The $i$th link of path $p$; $1 \le i \le d_p$. If the $i$th link of path $p$ originates at node $a$ and ends at node $b$, then $l_i^p \equiv l_{ab}$.
$\tau_{l_i^p}$ = The propagation delay of link $l_i^p$.
$\Lambda$ = The buffer size, in terms of number of flits.
$\lambda_p$ = The arrival rate of worms that traverse along path $p$.
$\lambda_{l_{ab}}$ = The total worm arrival rate at $l_{ab}$.
$\lambda_a$ = The total arrival rate of worms at host $a$; $\lambda_a = \sum_{p:\, p \in H_a} \lambda_p$.
$\ell$ = A random variable that denotes a worm size.
$L(s)$ = The Laplace-Stieltjes transform of the probability density function of $\ell$.
$q_{l_i^p}$ = A random variable that denotes the delay of a worm head to reach the head of the input buffer for link $l_i^p$, after the worm has entered the buffer.
$Q_{l_i^p}(s)$ = The Laplace-Stieltjes transform of the probability density function of $q_{l_i^p}$.
$z_{l_i^p}$ = A random variable that denotes the delay of a worm head to reach the point where the accumulated buffer space is large enough to store the entire worm, after the worm head has entered the buffer for link $l_i^p$ (see figure 1).
$Z_{l_i^p}(s)$ = The Laplace-Stieltjes transform of the probability density function of $z_{l_i^p}$.
$h_{l_i^p}$ = A random variable that denotes the contention delay for link $l_i^p$.
$H_{l_i^p}(s)$ = The Laplace-Stieltjes transform of the probability density function of $h_{l_i^p}$.
$\omega_{l_i^p}$ = A random variable that denotes the one-hop forwarding delay, excluding the link propagation delay, for the worm head to advance to the next hop (buffer head to buffer head) via link $l_i^p$.
$W_{l_i^p}(s)$ = The Laplace-Stieltjes transform of the probability density function of $\omega_{l_i^p}$.
$b_{l_i^p}$ = A random variable that denotes the link occupancy time of link $l_i^p$.
$B_{l_i^p}(s)$ = The Laplace-Stieltjes transform of the probability density function of $b_{l_i^p}$.
$B_{l_{ab}}(s)$ = The Laplace-Stieltjes transform of the probability density function of the link occupancy time at link $l_{ab}$.
$s_{l_i^p}$ = A random variable that denotes the service time of a worm via path $p$ at the buffer for the $i$th link of path $p$.
$S_{l_i^p}(s)$ = The Laplace-Stieltjes transform of the probability density function of $s_{l_i^p}$.
$s_{l_{ab}}$ = A random variable that denotes the service time of a worm at the buffer for link $l_{ab}$.
$s_{l_{ab}}(\cdot)$ = The probability density function of $s_{l_{ab}}$.
$S_{l_{ab}}(s)$ = The Laplace-Stieltjes transform of $s_{l_{ab}}(\cdot)$.
$T_p$ = The average network delay for worms via path $p$.

3 Ordering Links

A wormhole routing network differs from a virtual cut-through network because of its link blocking feature. Blocking occurs due to the small size of the input buffers and results in increased link occupancy time. This occupancy time (defined as the time interval that a served worm holds this link) is not only a function of the worm size, but also a function of the blocking delay in the succeeding hops. As a consequence, it is important to find the dependency among links. The link dependency and the cycle of link dependency are defined as follows:

Definition 1. We say that $l_{ab}$ depends on $l_{cd}$ if $\exists p$ such that $l_{cd}$ is a subsequent link of $l_{ab}$ in path $p$. This dependency is represented as $l_{ab} \succ l_{cd}$. Moreover, if $l_{ab} \succ l_{cd}$ and $l_{cd} \succ l_{ef}$, then we say $l_{ab} \succ l_{ef}$, too (i.e., it is transitive).

Note that, according to the transitive property, it is possible that $l_{ab} \succ l_{ef}$ even though $l_{ef}$ is not a subsequent link of $l_{ab}$ in any path.

Definition 2. We say that there is a cycle of link dependency if $\exists\, l_{ab}, l_{cd}$ such that $l_{ab} \succ l_{cd}$ and $l_{cd} \succ l_{ab}$.

Link dependency provides the relationship between link occupancy time and blocking time. In our earlier paper [11], we developed the relations between their distributions but relied on iterative methods to find the solution. Actually, a computation order, which indicates the sequence of links for blocking delay analysis, can be derived if there is no cycle of link dependency, as illustrated in [14]. The method is simply topological sorting [15]. For example, if $l_{ab} \succ l_{cd} \succ l_{ef}$, we have the computation order $l_{ef} \to l_{cd} \to l_{ab}$. Following the computation order, link occupancy time and blocking time can be evaluated link by link without iterations.
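The computation order described above can be sketched with a standard topological sort. The sketch below uses Kahn's algorithm, which is one common way to do topological sorting and is not necessarily the procedure used in [14]. The representation of a path as a list of links and the name `link_computation_order` are assumptions made for illustration. Only direct "next link" edges are stored, since the dependency relation is their transitive closure.

```python
from collections import defaultdict, deque


def link_computation_order(paths):
    """Order links so that each link comes after every link it depends on.

    paths maps a path name to its list of links in traversal order.
    Raises ValueError if there is a cycle of link dependency.
    """
    depends_on = defaultdict(set)  # link -> links that directly follow it
    links = set()
    for hops in paths.values():
        links.update(hops)
        for here, nxt in zip(hops, hops[1:]):
            depends_on[here].add(nxt)

    # Number of unresolved dependencies per link.
    remaining = {link: len(depends_on[link]) for link in links}
    dependents = defaultdict(set)  # link -> links that depend on it
    for link, successors in list(depends_on.items()):
        for succ in successors:
            dependents[succ].add(link)

    ready = deque(sorted(l for l in links if remaining[l] == 0))
    order = []
    while ready:
        link = ready.popleft()
        order.append(link)
        for dep in sorted(dependents[link]):
            remaining[dep] -= 1
            if remaining[dep] == 0:
                ready.append(dep)

    if len(order) != len(links):
        raise ValueError("cycle of link dependency")
    return order


if __name__ == "__main__":
    example = {"p": [("a", "b"), ("c", "d"), ("e", "f")]}
    print(link_computation_order(example))
```

Links with no subsequent link in any path (the last hops) come first, so their occupancy times are available when upstream links are evaluated.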


4 Link Occupancy Time

To estimate the blocking delay at each switch, it is important to first analyze how the finite size buffer affects link status and worm transmission. When there is no buffer available at switches, the relation between the link occupancy time ($b_{l_i^p}$) and waiting time ($\omega_{l_i^p}$) has been well established in [11]. The Laplace-Stieltjes transform equation is:

$$B_{l_i^p}(s) = L(s) \prod_{j=i+1}^{d_p} W_{l_j^p}(s) \qquad (1)$$

Introducing a finite size buffer on each input port reduces the number of links that a worm can spread over. In other words, the link occupancy time is only affected by a limited number of subsequent links, not all of them. Given a worm size $\ell$, the number of effective subsequent links for a worm at the $i$th link of path $p$ ($l_i^p$), $\delta_i^p(\ell)$, is derived by:

$$\delta_i^p(\ell) = \left\lfloor \frac{\ell}{\Lambda} \right\rfloor \quad \text{if } \ell < \dots$$

As shown in figure 1-b, blocking that occurs after the next $\delta_i^p(\ell)$ links does not affect the link occupancy time, since the accumulated buffer space is large enough to hold the entire worm.
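Equation (1) is a product of transforms, which corresponds to a sum of independent random variables: the worm size plus the forwarding delays of the subsequent hops. Under that independence assumption, means and variances simply add. The sketch below illustrates this for the no-buffer case; the function name and the `(mean, variance)` tuple representation are hypothetical.

```python
def occupancy_moments_no_buffer(worm, waits):
    """Mean and variance of the link occupancy time in the no-buffer case.

    worm  : (mean, variance) of the worm size, in flits.
    waits : list of (mean, variance) of the forwarding delays on the
            subsequent hops i+1 .. d_p, assumed mutually independent.
    """
    mean = worm[0] + sum(m for m, _ in waits)
    variance = worm[1] + sum(v for _, v in waits)
    return mean, variance


if __name__ == "__main__":
    # Exponential worm size with mean 200 flits has variance 200**2.
    print(occupancy_moments_no_buffer((200.0, 40000.0), [(5.0, 25.0), (10.0, 100.0)]))
```

With a finite buffer, only the effective subsequent links would contribute to the sum, so this sketch gives an upper reference point rather than the finite-buffer result.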


Figure 2: An illustration of the link occupancy time. (Position versus time diagram showing the worm head and worm tail trajectories, the buffer-filling and blocking phases, the worm size, and the resulting link occupancy time.)

Since buffers tend to be fully utilized under severe blocking conditions, the left hand side of inequality (6) should be adopted to approximate $b_{l_i^p}$ when the forwarding delay, $x_{l_i^p}$, dominates. Also, the average link occupancy time should be monotonically increasing as the forwarding delay increases, and must be at least as large as the worm size. To satisfy all of the above, the following approximation is proposed for the link occupancy time distribution:

$$B_{l_i^p}(s) = \begin{cases} \left(\bar{Y}_{l_i^p} + \bar{X}_{l_i^p}\right)^{-1} \left[ \left(\bar{Y}_{l_i^p} - \bar{X}_{l_i^p}\right) Y_{l_i^p}(s) + 2\bar{X}_{l_i^p} L(s) \right] & \text{if } \bar{L} > \bar{X}_{l_i^p} \\ \left(\bar{Y}_{l_i^p} + \bar{X}_{l_i^p}\right)^{-1} \left[ \left(\bar{Y}_{l_i^p} - \bar{X}_{l_i^p}\right) Y_{l_i^p}(s) + 2\bar{X}_{l_i^p} X_{l_i^p}(s) \right] & \text{if } \bar{L} \le \bar{X}_{l_i^p} \end{cases} \qquad (7)$$

where $\bar{X}_{l_i^p}$ is the first moment of $X_{l_i^p}(s)$, and similarly for $\bar{Y}_{l_i^p}$ and $\bar{L}$.

It can be shown that equation (7) has the limit values $\lim_{\bar{X}_{l_i^p} \to 0} B_{l_i^p}(s) = Y_{l_i^p}(s)$ and $\lim_{\bar{X}_{l_i^p} \to \infty} B_{l_i^p}(s) = X_{l_i^p}(s)$, since $\bar{Y}_{l_i^p} = \bar{X}_{l_i^p} + \bar{L}$. Moreover, $\bar{B}_{l_i^p}$ derived by equation (7) is monotonically increasing with $\bar{X}_{l_i^p}$, as proven in [14].

The remaining $W_{l_j^p}(s)$, $Z_{l_j^p}(s)$, $H_{l_j^p}(s)$, and $Q_{l_j^p}(s)$ quantities are discussed in section 5.
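The stated properties of equation (7) can be checked numerically at the level of first moments. The sketch below assumes that equation (7) is a two-term mixture with weights $(\bar{Y} - \bar{X})/(\bar{Y} + \bar{X})$ and $2\bar{X}/(\bar{Y} + \bar{X})$, that the second term uses $L(s)$ when $\bar{L} > \bar{X}$ and $X(s)$ otherwise, and that $\bar{Y} = \bar{X} + \bar{L}$. Because a mixture of transforms has the same mixture of means, the mean occupancy time follows directly. The function name is hypothetical.

```python
def mean_link_occupancy(x_bar, l_bar):
    """First moment of the mixture in equation (7), as assumed above.

    x_bar : mean forwarding delay term (first moment of X).
    l_bar : mean worm size (first moment of L), assumed positive.
    """
    y_bar = x_bar + l_bar                      # Y-bar = X-bar + L-bar
    second = l_bar if l_bar > x_bar else x_bar  # branch selection
    return ((y_bar - x_bar) * y_bar + 2.0 * x_bar * second) / (y_bar + x_bar)


if __name__ == "__main__":
    for x in (0.0, 50.0, 200.0, 1000.0):
        print(x, mean_link_occupancy(x, 200.0))
```

Under these assumptions the mean equals the mean worm size when the forwarding delay vanishes, never drops below it, and grows with the forwarding delay, which matches the requirements listed before the equation.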

5 Modeling the Finite Size Buffer

Since buffer capacity is fixed in terms of the number of flits, the nature of the input buffer resembles a finite dam system. A worm flows in the buffer constantly when it is not full. However, the outgoing flow of the buffer may be interrupted due to worm blocking. The queueing model for a finite dam system developed in [16] cannot be applied directly in this case. Furthermore, the status of the buffer is tightly related to its upstream node, and vice versa. To analyze both independently could result in a poor model. For the sake of accuracy and simplicity, we use an alternative approach which treats both link contention and the input buffer as one single queue.

5.1 M/G/1/K Approximation

As shown in figure 3, the delay for a worm to seize its output link and reach the buffer head in the next hop (i.e., the one-hop forwarding delay, $\omega_{l_i^p}$) is exactly the waiting time of an M/G/1 queue with finite capacity (denoted as M/G/1/K, for the case that capacity is $K$). With $\nu$ input ports, the M/G/1/K queue (see figure 3) has the capacity approximately $\nu + \vartheta$, where $\vartheta$ is the number of worms that can be completely held in the buffer.

Figure 3: The forwarding delay is considered as a single queue with finite capacity. The queue includes the input buffer at the end of the link and contention for this link.

To simplify the analysis of this finite size buffer, equivalent queues are used here instead. In general, an equivalent queue size specifies how many worms can be held in the buffer portion of the finite size buffer. Unfortunately, the buffer size is determined as the number of flits, not the number of worms. For variable worm size cases, $\vartheta$ is not deterministic and is associated with a probability. Specifically, the buffer is approximated as a queue of capacity $\nu + k$ with the probability $\vartheta(k)$ that,

$$\vartheta(k) = \text{Prob}\{\ell_1 + \cdots + \ell_k < \dots$$
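As an illustration of why the equivalent queue size is random, the following sketch estimates the distribution of the number of worms that fit completely into a buffer of a given size in flits. It assumes that a worm "fits" when the running total of worm sizes does not exceed the buffer size, and it uses exponentially distributed worm sizes as in section 7. The exact boundary condition of the paper's definition is not reproduced here, and the function names are hypothetical.

```python
import random
from collections import Counter


def worms_held(sizes, capacity):
    """Count how many leading worms fit completely into `capacity` flits."""
    used = 0.0
    count = 0
    for size in sizes:
        if used + size > capacity:
            break
        used += size
        count += 1
    return count


def estimate_worm_count_distribution(capacity, mean_size, trials=20000, seed=1):
    """Monte Carlo estimate of Prob{k worms fit}, exponential worm sizes."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(trials):
        used, k = 0.0, 0
        while True:
            size = rng.expovariate(1.0 / mean_size)
            if used + size > capacity:
                break
            used += size
            k += 1
        counts[k] += 1
    return {k: c / trials for k, c in sorted(counts.items())}


if __name__ == "__main__":
    print(estimate_worm_count_distribution(289.0, 200.0))
```

For exponential worm sizes this count is Poisson distributed with mean equal to the buffer size divided by the mean worm size, which gives a quick sanity check on the estimate.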


Another change is about the integration [18, Chapter 5, equation (1.7)],

$$\int_0^\infty \frac{\left(\lambda'_{l_{ab}} \xi\right)^k}{k!}\, e^{-\lambda'_{l_{ab}} \xi}\, s_{l_{ab}}(\xi)\, d\xi$$

Though $s_{l_{ab}}(\cdot)$ can be recovered by inverting its Laplace-Stieltjes transform, $S_{l_{ab}}(s)$, the inversion is not completely systematic. To ease this difficulty, a two-moment approximation can be exploited. Given the first two moments of $s_{l_{ab}}$, $\bar{S}_{l_{ab}}$ and $\overline{S^2}_{l_{ab}}$, the probability density function of $s_{l_{ab}}$ can be approximated as (figure 4):

$$s_{l_{ab}}(\xi) = \dots$$

Figure 4: The two-stage approximation for a distribution function.


5.2 Buffering Delay and More

The buffering delay, $Q_{l_i^p}(s)$, is derived simply as

$$Q_{l_i^p}(s) = \frac{W_{l_i^p}(s)}{H_{l_i^p}(s)}$$

because $\omega_{l_i^p} = q_{l_i^p} + h_{l_i^p}$, as shown in figure 3. The contention blocking, $H_{l_i^p}(s)$, can also be approximated as an M/G/1/K queue with $K = \nu$, and the queue service time is exactly the link occupancy time, $B_{l_{ab}}(s)$, if $l_i^p \equiv l_{ab}$. However, $B_{l_{ab}}(s)$ is not available until $Q_{l_i^p}(s)$ is known, which requires knowledge of $H_{l_i^p}(s)$ as shown above. Consequently, $B_{l_{ab}}(s)$ must be properly approximated first in order to derive $H_{l_i^p}(s)$ and $Q_{l_i^p}(s)$. A simple approximation is proposed as follows:

$$B_{l_{ab}}(s) = \text{Prob}\{\text{buffer full}\}\, S_{l_{ab}}(s) + \left(1 - \text{Prob}\{\text{buffer full}\}\right) L(s) \qquad (15)$$

This approximation is based on the following observations:

1. When the buffer is full, it simply resembles a data pipe: one flit of data out of the buffer corresponds to one flit of data entering the buffer. Thus, $B_{l_{ab}}(s) = S_{l_{ab}}(s)$ in this case.
2. When there is space left in the buffer, a worm flows in the buffer without interruption. Thus, $B_{l_{ab}}(s) = L(s)$.

The buffer full probability can be closely estimated from the steady-state probability that is derived when we analyze $W_{l_i^p}(s)$, namely, the probability that more than $\vartheta$ worms are in the M/G/1/K queue used to approximate $W_{l_i^p}(s)$. $\vartheta$ is an equivalent queue size of the finite size buffer. Therefore, we have,

$$\text{Prob}\{\text{buffer full}\} = \sum_{j=0}^{\infty} \vartheta(j) \left( \left(1 - P_B^j\right) \sum_{k=j}^{j+\nu-1} \pi_k^j + P_B^j \right) \qquad (16)$$

where $\pi_k^j$, $P_B^j$ denote the $\pi_k$, $P_B$ of the M/G/1/K queue with $K = j + \nu$.

Finally, $Z_{l_i^p}(s)$ is ignored, since it is small and implicitly included in $H_{l_i^p}(s)$ due to the fact that the equivalent queue used to approximate the finite size buffer does not count the buffer space that can only hold part of a worm. This delay is not recounted here. After $W_{l_i^p}(s)$, $H_{l_i^p}(s)$ and $Q_{l_i^p}(s)$ are derived, $B_{l_i^p}(s)$ is given by equation (7).

Now, the service time distribution for a worm through path $p$ at the head of its $i$th hop input buffer is derived as $s_{l_i^p} = h_{l_{i+1}^p} + b_{l_{i+1}^p}$, which gives us:

$$S_{l_i^p}(s) = H_{l_{i+1}^p}(s)\, B_{l_{i+1}^p}(s) \qquad (17)$$

Considering worms from different paths, the service time distribution for a finite size input buffer is:

$$S_{l_{ab}}(s) = \sum_{p:\, l_{ab} \in L_p} \frac{\lambda_p}{\lambda_{l_{ab}}}\, S_{l^p_{\rho_p(l_{ab})}}(s) \qquad (18)$$

where $\rho_p(l_{ab})$ is a function which returns $i$ if link $l_{ab}$ is the $i$th link of path $p$.
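Equation (18) weights each path's service time by its share of the traffic on the link. At the level of first moments this can be sketched as below. The dictionary layout (`paths`, `rates`, and `per_path_mean` keyed by path name and 1-based hop index) and the function names are assumptions made for illustration.

```python
def link_arrival_rates(paths, rates):
    """Total worm arrival rate on every link: sum of the rates of the
    paths that traverse it."""
    totals = {}
    for name, hops in paths.items():
        for link in hops:
            totals[link] = totals.get(link, 0.0) + rates[name]
    return totals


def mean_service_time(link, paths, rates, per_path_mean):
    """Rate-weighted mean service time at `link`, following equation (18).

    per_path_mean[(name, i)] is the mean service time of path `name`
    at its i-th link (1-based), i.e. the first moment of S for that hop.
    """
    total_rate = link_arrival_rates(paths, rates)[link]
    mean = 0.0
    for name, hops in paths.items():
        if link in hops:
            i = hops.index(link) + 1  # position of the link in the path
            mean += rates[name] / total_rate * per_path_mean[(name, i)]
    return mean


if __name__ == "__main__":
    paths = {"p1": ["A", "B"], "p2": ["C", "B"]}
    rates = {"p1": 0.01, "p2": 0.03}
    means = {("p1", 2): 100.0, ("p2", 2): 200.0}
    print(mean_service_time("B", paths, rates, means))
```

The same weights would apply to the second moments, since a mixture of distributions has the corresponding mixture of every moment.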


Once the service time and mean forwarding delay at each hop are derived, the network delay is obtained as (see figures 1-b and 2):

$$T_p = \bar{v}_a + \sum_{i=2}^{d_p} \bar{w}_{l_i^p} + \bar{L} + \sum_{i=1}^{d_p} \tau_{l_i^p} \qquad (19)$$

if path $p$ originates at host $a$. $\bar{v}_a$ is the mean of the queueing delay at host $a$. Note that the buffering delay at the first link is not counted, since it is part of the host queueing delay, which is directly derived from the M/G/1 queue solution [20],

$$\bar{v}_a = \frac{\lambda_a \overline{S^2}_{l_{ab}}}{2\left(1 - \lambda_a \bar{S}_{l_{ab}}\right)}.$$

6 Model Summary

Here, we summarize the full modeling process.

1. Read in the network topology.
2. Read in all paths and their worm arrival rates, $\lambda_p$.
3. Compute the worm arrival rate at each single link (e.g., $\lambda_{l_{ab}}$).
4. With the given worm size distribution, compute $\delta_{l_i^p}(j)$ and $\vartheta(j)$, $\forall j$ and $l_i^p$.
5. Use topological sorting (see [14]) to construct the link computation order.
6. For $k = 1$ to the highest order, compute (in the following order) $S(s)$, $W(s)$, $H(s)$, $Q(s)$, and $B(s)$ for all links belonging to order $k$. The distribution of all of the above may actually be characterized by their first two moments.
7. Compute $T_p$ for all paths $p$.

The entire procedure can be computerized except for step 4. Step 4 involves integration and other operations that require manual effort. However, once they are completed for a given worm size distribution, the rest can be done automatically for any network configuration. Note that the Laplace-Stieltjes transform for each probability density function does not need to be solved explicitly. The transforms are only used for the convenience of presentation. Only the first two moments of each distribution are required.
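The last step of the procedure, equation (19) together with the M/G/1 host queueing delay, reduces to simple arithmetic once the per-hop means are known. The sketch below assumes those means have already been computed by the earlier steps; the function names and argument layout are hypothetical.

```python
def mg1_mean_wait(arrival_rate, service_mean, service_second_moment):
    """Mean queueing delay of an M/G/1 queue (Pollaczek-Khinchine formula)."""
    utilization = arrival_rate * service_mean
    if utilization >= 1.0:
        raise ValueError("queue is unstable: utilization must be below 1")
    return arrival_rate * service_second_moment / (2.0 * (1.0 - utilization))


def network_delay(host_wait, forwarding_means, worm_mean, propagation_delays):
    """Average network delay along one path, following equation (19).

    forwarding_means   : mean forwarding delay for links 1 .. d_p; the
                         first entry is skipped because the delay at the
                         first link is part of the host queueing delay.
    propagation_delays : propagation delay for links 1 .. d_p.
    """
    return (host_wait
            + sum(forwarding_means[1:])
            + worm_mean
            + sum(propagation_delays))


if __name__ == "__main__":
    wait = mg1_mean_wait(0.002, 250.0, 125000.0)
    print(network_delay(wait, [0.0, 20.0, 30.0], 200.0, [10.0, 10.0, 10.0]))
```

All quantities are in time units of one flit duration, so the worm mean in flits can be added directly to the delays.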

7 Comparison of Results

Using a 3×3 torus (in total, 9 switches and 36 hosts) with up/down deadlock-free routing [1] and symmetric traffic load (see [14] for details), the performance results estimated by both the model and simulation are shown in figure 5-a. The results are derived with the assumptions of exponential worm size distribution and Poisson worm arrivals. Figure 5-a indicates a ten percent difference between the network throughput estimated by the model and the simulation, in both small buffer size and large buffer size cases. Also, the analytical results are always pessimistic, compared to the simulation.

… pessimism of the model. First, the finite size buffer may be better approximated by an equivalent queue with higher capacity. The current approximation:

$$\vartheta(j) = \text{Prob}\{\ell_1 + \cdots + \ell_k < \dots$$


Figure 5: Results of the finite size buffer models (worm size = 200 flits, propagation delay = 10 time units). (a) original; (b) modified. Both panels plot delay (time units) against throughput (flits/time unit) for buffer sizes 80, 289, and 2057, with model and simulation curves.

… time of the equivalent queue depends on the queue capacity (in terms of number of worms). A larger capacity clearly implies a smaller average size of worms in the queue, due to the fact that the buffer size is fixed in terms of number of flits. As a result, the service rate must be higher in a high capacity case than in a low capacity one. Without consideration of the above dependency, the worm forwarding delay is overestimated.

In figure 5-b, we show the analytical results of a modified model (see below) with regard to the above discussion. The probability of the number of worms that can be held in the finite size buffer, $\vartheta(j)$, is reformalized by enlarging the buffer size to $\Lambda + \bar{L}/2$. The $\bar{L}/2$ portion counts the buffer space that cannot hold a full worm. Also, the moments of the buffer service time distribution are adjusted with the queue capacity, which gives a new average worm size. Namely, an equivalent queue with capacity $j + \nu$ ($j$ from the finite size buffer, and $\nu$ from the link contention) has a new mean buffer service time:

$$\langle \text{new } \bar{S}_{l_{ab}} \rangle = \frac{\Lambda + \nu \bar{L}}{(\nu + j)\bar{L}}\, \bar{S}_{l_{ab}}$$

and similarly,

$$\langle \text{new } \overline{S^2}_{l_{ab}} \rangle = \left( \frac{\Lambda + \nu \bar{L}}{(\nu + j)\bar{L}} \right)^2 \overline{S^2}_{l_{ab}}.$$

The predicted network performance in figure 5-b is closer to the simulation. However, the model is still pessimistic.

8 Summary

In this paper, a finite size buffer model for wormhole routing is developed. It is shown that this analysis is not trivial and needs many approximations. Further improving these approximations requires intensive study of several sophisticated queueing models. However, the full modeling procedure presented in this paper is systematic and could be implemented as a useful tool.

References

[1] M. D. Schroeder, A. D. Birrell, M. Burrows, H. Murray, et al. "Autonet: A High-speed, Self-configuring Local Area Network Using Point-to-point Links". IEEE Journal on Selected Areas in Communications, 9(8):1318-1335, October 1991.

[2] C. Seitz, D. Cohen, and R. Felderman. "Myrinet: A Gigabit-per-second Local-Area Network". IEEE Micro, 15(1):29-36, February 1995.


[3] L. Kleinrock et al. "The Supercomputer Supernet: A Scalable Distributed Terabit Network". Journal of High Speed Networks: special issue on Optical Networks, 4(4):407-24, 1995.

[4] P. Kermani and L. Kleinrock. "Virtual cut-through: A New Computer Communication Switching Technique". Computer Networks, 3:267-289, 1979.

[5] C. Seitz et al. "The Hypercube Communications Chip". Technical report, Dep. Computer Science, California Inst., March 1985. Display File 5128:DF:85.

[6] L. M. Ni and P. K. McKinley. "A Survey of Wormhole Routing Techniques in Direct Networks". Computer, pages 62-76, February 1993.

[7] W. J. Dally. "Performance Analysis of K-ary n-cube Interconnection Networks". IEEE Trans. on Computers, 39(6), June 1990.

[8] W-J. Guan, W. K. Tsai, and D. Blough. "An Analytical Model for Wormhole Routing in Multicomputer Interconnection Networks". In Proceedings of Seventh International Parallel Processing Symposium, pages 650-654, April 1993.

[9] J. Kim and C. R. Das. "Hypercube Communication Delay with Wormhole Routing". IEEE Trans. on Computers, 43(7), July 1994.

[10] J. T. Draper and J. Ghosh. "A Comprehensive Analytical Model for Wormhole Routing in Multicomputer Systems". Journal of Parallel and Distributed Computing, 23:202-214, November 1994.

[11] Po-Chi Hu and L. Kleinrock. "A Queueing Model for Wormhole Routing with Time-out". In Proceedings of the 4th International Conference on Computer Communications and Networks, pages 584-593, Las Vegas, NV, U.S., September 1995.

[12] J. Duato. "A Necessary and Sufficient Condition for Deadlock-free Adaptive Routing in Wormhole Networks". IEEE Transactions on Parallel and Distributed Systems, 6(10):1055-67, October 1995.

[13] L. Schwiebert and D. N. Jayasimha. "A Necessary and Sufficient Condition for Deadlock-free Wormhole Routing". Journal of Parallel and Distributed Computing, 32(1):103-117, January 1996.

[14] Po-Chi Hu. High-Speed Local Area Networks Using Wormhole Routing: Modeling and Extensions. PhD thesis, University of California, Los Angeles, June 1996.

[15] Ralph P. Grimaldi. Discrete and Combinatorial Mathematics: An Applied Introduction. Addison-Wesley, Reading, Mass., 2nd edition, 1989.

[16] Jacob Willem Cohen. The Single Server Queue. North-Holland Pub. Co., revised edition, 1982.

[17] L. Kleinrock. Communication Nets: Stochastic Message Flow and Delay. McGraw-Hill, New York, 1964. Reprinted by Dover Publications, 1972.

[18] Hideaki Takagi. Queueing Analysis: A Foundation of Performance Evaluation, volume 2. North-Holland, New York, NY, U.S.A., 1993.

[19] William H. Press et al. Numerical Recipes: The Art of Scientific Computing. Cambridge University Press, New York, 1986.

[20] L. Kleinrock. Queueing Systems, Vol. I: Theory. Wiley Interscience, New York, 1975.
