Network Traffic Characteristics of Data Centers in the Wild

Theophilus Benson∗, Aditya Akella∗ and David A. Maltz†
∗University of Wisconsin–Madison
†Microsoft Research–Redmond

ABSTRACT
Although there is tremendous interest in designing improved networks for data centers, very little is known about the network-level traffic characteristics of current data centers. In this paper, we conduct an empirical study of the network traffic in 10 data centers belonging to three different types of organizations, including university, enterprise, and cloud data centers. Our definition of cloud data centers includes not only data centers employed by large online service providers offering Internet-facing applications, but also data centers used to host data-intensive (MapReduce style) applications.

We collect and analyze SNMP statistics, topology, and packet-level traces. We examine the range of applications deployed in these data centers and their placement, the flow-level and packet-level transmission properties of these applications, and their impact on network utilization, link utilization, congestion, and packet drops. We describe the implications of the observed traffic patterns for data center internal traffic engineering as well as for recently proposed architectures for data center networks.
Categories and Subject Descriptors
C.4 [Performance of Systems]: Design studies; Performance attributes

General Terms
Design, Measurement, Performance

Keywords
Data center traffic, characterization
1. INTRODUCTION

A data center (DC) refers to any large, dedicated cluster of computers that is owned and operated by a single organization. Data centers of various sizes are being built and employed for a diverse set of purposes today. On the one hand, large universities and private enterprises are increasingly consolidating their IT services within on-site data centers containing a few hundred to a few thousand servers. On the other hand, large online service providers, such as Google, Microsoft, and Amazon, are rapidly building geographically diverse cloud data centers, often containing more than 10K servers, to offer a variety of cloud-based services such as Email, Web servers, storage, search, gaming, and Instant Messaging. These service providers also employ some of their data centers to run large-scale data-intensive tasks, such as indexing Web pages or analyzing large data-sets, often using variations of the MapReduce paradigm [6].

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
IMC'10, November 1–3, 2010, Melbourne, Australia.
Copyright 2010 ACM 978-1-4503-0057-5/10/11 ...$10.00.
Despite the growing applicability of data centers in a wide variety of scenarios, there are very few systematic measurement studies [19, 3] of data center usage to guide practical issues in data center operations. Crucially, little is known about the key differences between different classes of data centers, specifically university campus data centers, private enterprise data centers, and cloud data centers (both those used for customer-facing applications and those used for large-scale data-intensive tasks).

While several aspects of data centers still need substantial empirical analysis, the specific focus of our work is on issues pertaining to a data center network's operation. We examine the sending/receiving patterns of applications running in data centers and the resulting link-level and network-level performance. A better understanding of these issues can lead to a variety of advancements, including traffic engineering mechanisms tailored to improve available capacity and reduce loss rates within data centers, mechanisms for improved quality-of-service, and even techniques for managing other crucial data center resources, such as energy consumption. Unfortunately, the few recent empirical studies [19, 3] of data center networks are quite limited in their scope, making their observations difficult to generalize and employ in practice.
In this paper, we study data collected from ten data centers to shed light on their network design and usage and to identify properties that can help improve operation of their networking substrate. The data centers we study include three university campus data centers, two private enterprise data centers, and five cloud data centers, three of which run a variety of Internet-facing applications while the remaining two predominantly run MapReduce workloads. Some of the data centers we study have been in operation for over 10 years, while others were commissioned much more recently. Our data includes SNMP link statistics for all data centers, fine-grained packet traces from select switches in four of the data centers, and detailed topology for five data centers. By studying different classes of data centers, we are able to shed light on the question of how similar or different they are in terms of their network usage, whether results taken from one class can be applied to the others, and whether different solutions will be needed for designing and managing the data centers' internal networks.
We perform a top-down analysis of the data centers, starting with the applications run in each data center and then drilling down to the applications' send and receive patterns and their network-level impact. Using packet traces, we first examine the type of applications running in each data center and their relative contribution to network traffic. We then examine the fine-grained sending patterns as captured by data transmission behavior at the packet and flow levels. We examine these patterns both in aggregate and at a per-application level. Finally, we use SNMP traces to examine the network-level impact in terms of link utilization, congestion, and packet drops, and the dependence of these properties on the location of the links in the network topology and on the time of day.

Our key empirical findings are the following:
• We see a wide variety of applications across the data centers, ranging from customer-facing applications, such as Web services, file stores, authentication services, Line-of-Business applications, and custom enterprise applications, to data-intensive applications, such as MapReduce and search indexing. We find that application placement is non-uniform across racks.

• Most flows in the data centers are small in size (≤ 10KB), a significant fraction of which last under a few hundreds of milliseconds, and the number of active flows per second is under 10,000 per rack across all data centers.

• Despite the differences in the size and usage of the data centers, traffic originating from a rack in a data center is ON/OFF in nature, with properties that fit heavy-tailed distributions.

• In the cloud data centers, a majority of traffic originated by servers (80%) stays within the rack. For the university and private enterprise data centers, most of the traffic (40–90%) leaves the rack and traverses the network's interconnect.

• Irrespective of the type, in most data centers, link utilizations are rather low in all layers but the core. In the core, we find that a subset of the core links often experience high utilization. Furthermore, the exact number of highly utilized core links varies over time, but never exceeds 25% of the core links in any data center.

• Losses occur within the data centers; however, losses are not localized to links with persistently high utilization. Instead, losses occur at links with low average utilization, implicating momentary spikes as the primary cause of losses. We observe that the magnitude of losses is greater at the aggregation layer than at the edge or the core layers.

• We observe that link utilizations are subject to time-of-day and day-of-week effects across all data centers. However, in many of the cloud data centers, the variations are nearly an order of magnitude more pronounced at core links than at edge and aggregation links.
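The ON/OFF characterization above is typically derived by thresholding packet inter-arrival gaps: a gap larger than some threshold ends the current ON period and begins an OFF period. The sketch below illustrates that idea; the fixed gap threshold and the helper name are our own choices for illustration, not the methodology used in the paper.

```python
# Sketch: splitting a sorted sequence of packet timestamps (seconds) into
# ON-period and OFF-period durations. Illustrative only -- the gap threshold
# is an assumed parameter, not a value taken from the paper.

def on_off_periods(timestamps, gap_threshold):
    """Return (on_periods, off_periods) in seconds for one traffic source."""
    on_periods, off_periods = [], []
    start = prev = timestamps[0]
    for t in timestamps[1:]:
        if t - prev > gap_threshold:      # large gap: the ON period ends here
            on_periods.append(prev - start)
            off_periods.append(t - prev)
            start = t                     # next packet opens a new ON period
        prev = t
    on_periods.append(prev - start)       # close the final ON period
    return on_periods, off_periods
```

The resulting ON and OFF duration samples can then be compared against candidate heavy-tailed distributions (e.g., lognormal or Weibull fits).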
To highlight the implications of our observations, we conclude the paper with an analysis of two data center network design issues that have received a lot of recent attention, namely, network bisection bandwidth and the use of centralized management techniques.

• Bisection Bandwidth: Recent data center network proposals have argued that data centers need high bisection bandwidth to support demanding applications. Our measurements show that only a fraction of the existing bisection capacity is likely to be utilized within a given time interval in all the data centers, even in the "worst case" where application instances are spread across racks rather than confined within a rack. This is true even for MapReduce data centers that see relatively higher utilization. From this, we conclude that load balancing mechanisms for spreading traffic across the existing links in the network's core can help manage occasional congestion, given the current applications used.

• Centralized Management: A few recent proposals [2, 14] have argued for centrally managing and scheduling network-wide transmissions to more effectively engineer data center traffic. Our measurements show that centralized approaches must employ parallelism and fast route computation heuristics to scale to the size of data centers today while supporting the application traffic patterns we observe in the data centers.

Study                   Type of Data Center           Type of Apps                             # of DCs Measured
Fat-tree [1]            Cloud                         MapReduce                                0
Hedera [2]              Cloud                         MapReduce                                0
Portland [22]           Cloud                         MapReduce                                0
BCube [13]              Cloud                         MapReduce                                0
DCell [16]              Cloud                         MapReduce                                0
VL2 [11]                Cloud                         MapReduce                                1
MicroTE [4]             Cloud                         MapReduce                                1
Flyways [18]            Cloud                         MapReduce                                1
Optical switching [29]  Cloud                         MapReduce                                1
ECMP study 1 [19]       Cloud                         MapReduce                                1
ECMP study 2 [3]        Cloud                         MapReduce, Web Services                  19
ElasticTree [14]        Any                           Web Services                             1
SPAIN [21]              Any                           Any                                      0
Our work                Cloud, Private, Universities  MapReduce, Web services, Distributed FS  10

Table 1: Comparison of prior data center studies, including type of data center and application.
The rest of the paper is structured as follows: we present related work in Section 2 and in Section 3 describe the data centers studied, their high-level design, and typical uses. In Section 4, we describe the applications running in these data centers. In Section 5, we zoom in to the microscopic properties of the various data centers. In Section 6, we examine the flow of traffic within data centers and the utilization of links across the various layers. We discuss the implications of our empirical insights in Section 7, and we summarize our findings in Section 8.
2. RELATED WORK

There is tremendous interest in designing improved networks for data centers [1, 2, 22, 13, 16, 11, 4, 18, 29, 14, 21]; however, such work and its evaluation is driven by only a few studies of data center traffic, and those studies are solely of huge (>10K server) data centers, primarily running data mining, MapReduce jobs, or Web services. Table 1 summarizes the prior studies. From Table 1, we observe that many of the data center architectures are evaluated without empirical data from data centers. For the architectures evaluated with empirical data, we find that these evaluations are performed with traces from cloud data centers. These observations imply that the actual performance of these techniques under the various types of realistic data centers found in the wild (such as enterprise and university data centers) is unknown; this motivates us to conduct a broad study of the characteristics of data centers. Such a study will inform the design and evaluation of current and future data center techniques.
This paper analyzes the network traffic of the broadest set of data centers studied to date, including not only data centers running Web services and MapReduce applications, but also other common enterprise and campus data centers that provide file storage, authentication services, Line-of-Business applications, and other custom-written services. Thus, our work provides the information needed to evaluate data center network architecture proposals under the broad range of data center environments that exist.
Previous studies [19, 3] have focused on traffic patterns at coarse time-scales, reporting flow size distributions, number of concurrent connections, duration of congestion periods, and diurnal patterns. We extend these measures by considering additional issues, such as the applications employed in the different data centers, their transmission patterns at the packet and flow levels, their impact on link and network utilizations, and the prevalence of network hot-spots. This additional information is crucial to evaluating traffic engineering strategies and data center placement/scheduling proposals.
The closest prior works are [19] and [3]; the former focuses on a single MapReduce data center, while the latter considers cloud data centers that host Web services as well as those running MapReduce. Neither study considers non-cloud data centers, such as enterprise and campus data centers, and neither provides as complete a picture of traffic patterns as this study. The key observations from Benson's study [3] are that utilizations are highest in the core but losses are highest at the edge. In our work, we augment these findings by examining the variations in link utilizations over time, the localization of losses to links, and the magnitude of losses over time. From Kandula's study [19], we learned that most traffic in the cloud is restricted to within a rack and that a significant number of hot-spots exist in the network. Our work supplements these results by quantifying the exact fraction of traffic that stays within a rack for a wide range of data centers. In addition, we quantify the number of hot-spots, show that losses are due to the underlying burstiness of traffic, and examine the flow-level properties for university and private enterprise data centers (both classes of data centers ignored in Kandula's study [19]).
Our work complements prior work on measuring Internet traffic [20, 10, 25, 9, 8, 17] by presenting an equivalent study of the flow characteristics of applications and link utilizations within data centers. We find that data center traffic is statistically different from wide area traffic, and that such behavior has serious implications for the design and implementation of techniques for data center networks.
3. DATASETS AND OVERVIEW OF DATA CENTERS

In this paper, we analyze data-sets from 10 data centers, including 5 commercial cloud data centers, 2 private enterprise data centers, and 3 university campus data centers. For each of these data centers, we examine one or more of the following data-sets: network topology, packet traces from select switches, and SNMP polls from the interfaces of network switches. Table 2 summarizes the data collected from each data center, as well as some key properties. Table 2 shows that the data centers vary in size, both in terms of the number of devices and the number of servers. Unsurprisingly, the largest data centers are used for commercial computing needs (all owned by a single entity), with the enterprise and university data centers being an order of magnitude smaller in terms of the number of devices.
The data centers also vary in their proximity to their users. The enterprise and university data centers are located in the western/mid-western U.S. and are hosted on the premises of the organizations to serve local users. In contrast, the commercial data centers are distributed around the world in the U.S., Europe, and South America. Their global placement reflects an inherent requirement for geo-diversity (reducing latency to users), geo-redundancy (avoiding strikes, wars, or fiber cuts in one part of the world), and regulatory constraints (some data cannot be removed from the E.U. or U.S.).

Data Center Name   Number of Locations
EDU1               1
EDU2               1
EDU3               1
PRV2               4

Table 3: The number of packet trace collection locations for the data centers in which we were able to install packet sniffers.
In what follows, we first describe the data we collect. We then outline similarities and differences in key attributes of the data centers, including their usage profiles and physical topology. We found that understanding these aspects is required to analyze the properties that we wish to measure in subsequent sections, such as application behavior and its impact on link-level and network-wide utilizations.
3.1 Data Collection

SNMP polls: For all of the data centers that we studied, we were able to poll the switches' SNMP MIBs for bytes-in and bytes-out at granularities ranging from 1 minute to 30 minutes. For the 5 commercial cloud data centers and the 2 private enterprises, we were able to poll for the number of packet discards as well.

For each data center, we collected SNMP data for at least 10 days. In some cases (e.g., EDU1, EDU2, EDU3, PRV1, PRV2, CLD1, CLD4), our SNMP data spans multiple weeks. The long time-span of our SNMP data allows us to observe time-of-day and day-of-week dependencies in network traffic.
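Converting successive bytes-in/bytes-out counter samples into link utilization is a standard calculation; a minimal sketch follows. The 32-bit counter width is an assumption (standard for IF-MIB ifInOctets/ifOutOctets), not a detail given in the paper, and the modular subtraction tolerates a single counter wraparound between polls.

```python
# Sketch: average link utilization from two successive SNMP octet-counter
# samples. Assumes a 32-bit counter (as with IF-MIB ifInOctets); the modular
# delta remains correct if the counter wrapped at most once between polls.

def link_utilization(octets_prev, octets_curr, interval_s, link_bps,
                     counter_bits=32):
    """Fraction of link capacity used over one polling interval."""
    modulus = 2 ** counter_bits
    delta_bytes = (octets_curr - octets_prev) % modulus  # wraparound-safe
    return (delta_bytes * 8) / (interval_s * link_bps)

# Example: 75 MB observed over a 60 s poll on a 100 Mb/s link -> 10% utilized.
util = link_utilization(0, 75_000_000, 60, 100_000_000)
```

At 30-minute granularities this yields only an interval average, which is one reason momentary spikes (Section 1) do not show up as high measured utilization.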
Network Topology: For the private enterprise and university data centers, we obtained topology via the Cisco Discovery Protocol (CDP), which gives both the network topology as well as the link capacities. When this data is unavailable, as with the 5 cloud data centers, we analyze device configuration to derive properties of the topology, such as the relative capacities of links facing end hosts versus network-internal links versus WAN-facing links.
Packet traces: Finally, we collected packet traces from a few of the private enterprise and university data centers (Table 2). Our packet trace collection spans 12 hours over multiple days. Since it is difficult to instrument an entire data center, we selected a handful of locations at random per data center and installed sniffers on them. In Table 3, we present the number of sniffers per data center. In the smaller data centers (EDU1, EDU2, EDU3), we installed 1 sniffer. For the larger data center (PRV2), we installed 4 sniffers. All traces were captured using a Cisco port span. To account for delay introduced by the packet duplication mechanism and for end host clock skew, we binned results from the spans into 10-microsecond bins.
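The binning step above amounts to aggregating per-packet records by 10-microsecond timestamp bucket. The sketch below illustrates this; the (timestamp, length) record format is our own simplification of the span output, not the actual trace schema.

```python
# Sketch: aggregating packet records into 10-microsecond bins, as in our
# span post-processing. Record format (timestamp_us, length_bytes) is an
# illustrative simplification of real capture records.
from collections import Counter

BIN_US = 10  # bin width in microseconds

def bin_bytes(packets):
    """Map (timestamp_us, length_bytes) records to {bin_index: total_bytes}."""
    bins = Counter()
    for ts_us, length in packets:
        bins[int(ts_us) // BIN_US] += length
    return dict(bins)
```

Binning at this granularity absorbs sub-bin timestamp error from the port-span duplication path and host clock skew while still exposing microsecond-scale burstiness.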
3.2 High-level Usage of the Data Centers

In this section, we outline important high-level similarities and differences among the data centers we studied.
University data centers: These data centers serve the students and administrative staff of the university in question. They provide a variety of services, ranging from system back-ups to hosting distributed file systems, E-mail servers, Web services (administrative sites and web portals for students and faculty), and multicast video streams. We provide the exact application mix in the next section. In talking to the network operators, we found that these data centers "organically" evolved over time, moving from a collection of devices in a storage closet to a dedicated room for servers and network devices. As the data centers reached capacity, the operators re-evaluated their design and architecture. Many operators chose to move to a more structured, two-layer topology and introduced server virtualization to reduce heating and power requirements while controlling data center size.

Role          Name  Location   Age (Curr Ver/Total)  SNMP  Packet Traces  Topology  Devices  Servers  Over-Subscription
Universities  EDU1  US-Mid     10                    Y     Y              Y         22       500      2:1
              EDU2  US-Mid     (7/20)                Y     Y              Y         36       1093     47:1
              EDU3  US-Mid     N/A                   Y     Y              Y         1        147      147:1
Private       PRV1  US-Mid     (5/5)                 Y     X              Y         96       1088     8:3
              PRV2  US-West    > 5                   Y     Y              Y         100      2000     48:10
Commercial    CLD1  US-West    > 5                   Y     X              X         562      10K      20:1
              CLD2  US-West    > 5                   Y     X              X         763      15K      20:1
              CLD3  US-East    > 5                   Y     X              X         612      12K      20:1
              CLD4  S.America  (3/3)                 Y     X              X         427      10K      20:1
              CLD5  S.America  (3/3)                 Y     X              X         427      10K      20:1

Table 2: Summary of the 10 data centers studied, including devices, types of information collected, and the number of servers.
Private enterprises: The private enterprise IT data centers serve corporate users, developers, and a small number of customers. Unlike university data centers, the private enterprise data centers support a significant number of custom applications, in addition to hosting traditional services like Email, storage, and Web services. They often act as development testbeds, as well. These data centers are developed in a ground-up fashion, being designed specifically to support the demands of the enterprise. For instance, to satisfy the need to support administrative services and beta testing of database-dependent products, PRV1 commissioned the development of an in-house data center 5 years ago. PRV2 was designed over 5 years ago, mostly to support custom Line-of-Business applications and to provide login servers for remote users.
Commercial cloud data centers: Unlike the first two classes of data centers, the commercial data centers cater to external users and offer support for a wide range of Internet-facing services, including: Instant Messaging, Webmail, search, indexing, and video. Additionally, the data centers host large internal systems that support the externally visible services, for example data mining, storage, and relational databases (e.g., for buddy lists). These data centers are often purpose-built to support a specific set of applications (e.g., with a particular topology or over-subscription ratio to some target application patterns), but there is also a tension to make them as general as possible so that the application mix can change over time as the usage evolves. CLD1, CLD2, and CLD3 host a variety of applications, ranging from Instant Messaging and Webmail to advertisements and web portals. CLD4 and CLD5 are primarily used for running MapReduce style applications.
3.3 Topology and Composition of the Data Centers
In this section, we examine the differences and similarities in the physical construction of the data centers. Before proceeding to examine the physical topology of the data centers studied, we present a brief overview of the topology of a generic data center. In Figure 1, we present a canonical 3-Tiered data center. The 3 tiers of the data center are the edge tier, which consists of the Top-of-Rack switches that connect the servers to the data center's network fabric; the aggregation tier, which consists of devices that interconnect the ToR switches in the edge layer; and the core tier, which consists of devices that connect the data center to the WAN. In smaller data centers, the core tier and the aggregation tier are collapsed into one tier, resulting in a 2-Tiered data center topology.

Figure 1: Canonical 3-Tier data center topology.
Now, we focus on topological structure and the key physical properties of the constituent devices and links. We find that the topology of the data center is often an accident of history. Some have regular patterns that could be leveraged for traffic engineering strategies like Valiant Load Balancing [11], while most would require either a significant upgrade or more general strategies.
Topology. Of<strong>the</strong>threeuniversitydatacenters,wef<strong>in</strong>dthattwo<br />
(EDU1,EDU2)haveevolved<strong>in</strong>toastructured2-Tierarchitecture.<br />
Thethird(EDU3)usesastar-liketopologywithahigh-capacity<br />
centralswitch<strong>in</strong>terconnect<strong>in</strong>gacollection<strong>of</strong>serverracks–adesignthathasbeenuseds<strong>in</strong>ce<strong>the</strong><strong>in</strong>ception<strong>of</strong>thisdatacenter.As<br />
<strong>of</strong>thiswrit<strong>in</strong>g,<strong>the</strong>datacenterwasmigrat<strong>in</strong>gtoamorestructured<br />
set-upsimilarto<strong>the</strong>o<strong>the</strong>rtwo.<br />
EDU1usesatopologythatissimilartoacanonical2-Tierarchitecture,withonekeydifference:while<strong>the</strong>canonical2-Tierdata<br />
centersuseTop-<strong>of</strong>-Rackswitches,whereeachswitchconnectstoa<br />
rack<strong>of</strong>20-80serversorso,<strong>the</strong>setwodatacentersutilizeMiddle<strong>of</strong>-Rackswitchesthatconnectarow<strong>of</strong>5to6rackswith<strong>the</strong>potentialtoconnectfrom120to180servers.<br />
Wef<strong>in</strong>dthatsimilar<br />
conclusionsholdforEDU2(omittedforbrevity).<br />
The enterprise data centers do not deviate much from textbook-style constructions. In particular, the PRV1 enterprise data center utilizes a canonical 2-Tier Cisco architecture. The PRV2 data center utilizes a canonical 3-Tier Cisco architecture.
Figure 2: Classification of network traffic to application using Bro-Id. Each of the sniffers sees a very different mix of applications, even though the first 4 sniffers are located on different switches in the same data center. [Stacked bar chart: percent of bytes per application (OTHER, HTTP, HTTPS, LDAP, SMB, NCP, AFS) at each of the data center edge switches PRV2-1 through PRV2-4, EDU1, EDU2, and EDU3.]
Note that we do not have the physical topologies from the cloud data centers, although the operators of these data centers tell us that these networks uniformly employ the 3-Tier textbook data center architectures described in [11].
4. APPLICATIONS IN DATA CENTERS
Webeg<strong>in</strong>our“top-down”analysis<strong>of</strong>datacentersbyfirstfocus<strong>in</strong>gon<strong>the</strong>applications<strong>the</strong>yrun.<br />
Inparticular,weaimtoanswer<br />
<strong>the</strong>follow<strong>in</strong>gquestions:(1)Whattype<strong>of</strong>applicationsarerunn<strong>in</strong>g<br />
with<strong>in</strong><strong>the</strong>sedatacenters? and,(2)Whatfraction<strong>of</strong>trafficorig<strong>in</strong>atedbyaswitchiscontributedbyeachapplication?<br />
Weemploypackettracedata<strong>in</strong>thisanalysisanduseBro-Id[26]<br />
toperformapplicationclassification.Recallthatwecollectedpacket<br />
tracedatafor7switchesspann<strong>in</strong>g4datacenters,namely,<strong>the</strong>universitycampusdatacenters,EDU1,EDU2,andEDU3,andaprivateenterprisedatacenter,PRV2.<br />
Tolendfur<strong>the</strong>rweighttoour<br />
observations,wespoketo<strong>the</strong>operators<strong>of</strong>eachdatacenter,<strong>in</strong>clud<strong>in</strong>g<strong>the</strong>6forwhichwedidnothavepackettracedata.Theoperatorsprovideduswithadditional<strong>in</strong>formationabout<strong>the</strong>specificapplicationsrunn<strong>in</strong>g<strong>in</strong><strong>the</strong>irdatacenters.<br />
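The per-switch application mix reported below reduces to a bytes-per-application aggregation over classified packets. A minimal sketch of that tally, assuming a hypothetical record format of (sniffer, application label, byte count) where the label comes from a classifier such as Bro-Id:

```python
from collections import defaultdict

def bytes_per_app(records):
    """records: iterable of (sniffer, app_label, byte_count) tuples, where
    app_label comes from an application classifier such as Bro-Id.
    Returns, per sniffer, the percent of bytes contributed by each app."""
    totals = defaultdict(lambda: defaultdict(int))
    for sniffer, app, nbytes in records:
        totals[sniffer][app] += nbytes          # accumulate bytes per app
    return {
        sniffer: {app: 100.0 * b / sum(apps.values()) for app, b in apps.items()}
        for sniffer, apps in totals.items()     # convert to percentages
    }
```

One dictionary per sniffer makes it easy to compare mixes across switches, as Figure 2 does.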
The types of applications found at each edge switch, along with their relative traffic volumes, are shown in Figure 2. Each bar corresponds to a sniffer in a data center, and the first 4 bars are from the 4 edge switches within the same data center (PRV2). In conversing with the operators, we discovered that this data center hosts a mixture of authentication services (labeled “LDAP”), 3-Tier Line-of-Business Web applications (captured in “HTTP” and “HTTPS”), and custom home-brewed applications (captured in “Others”).

By looking at the composition of the 4 bars for PRV2, we can infer how the services and applications are deployed across racks in the data center. We find that each of the edge switches monitored hosts a portion of the back-end for the custom applications (captured in “Others”). In particular, the rack corresponding to PRV2-4 appears to predominantly host custom applications, which contribute over 90% of the traffic from this switch. At the other switches, these applications make up 50%, 25%, and 10% of the bytes, respectively.
Fur<strong>the</strong>r,wef<strong>in</strong>dthat<strong>the</strong>secureportions<strong>of</strong><strong>the</strong>L<strong>in</strong>e-<strong>of</strong>-Bus<strong>in</strong>ess<br />
Webservices(labeled“HTTPS”)arehosted<strong>in</strong><strong>the</strong>rackcorrespond-<br />
271<br />
<strong>in</strong>gto<strong>the</strong>edgeswitchPRV22,butnot<strong>in</strong><strong>the</strong>o<strong>the</strong>rthreeracks<br />
monitored.Au<strong>the</strong>nticationservices(labeled“LDAP”)aredeployed<br />
across<strong>the</strong>rackscorrespond<strong>in</strong>gtoPRV21andPRV22,whichmakes<br />
upasignificantfraction<strong>of</strong>bytesfrom<strong>the</strong>seswitches(40%<strong>of</strong><strong>the</strong><br />
bytesfromPRV21and25%<strong>of</strong><strong>the</strong>byesfromPRV22). Asmall<br />
amount<strong>of</strong>LDAPtraffic(2%<strong>of</strong>allbytesonaverage)orig<strong>in</strong>ates<br />
from<strong>the</strong>o<strong>the</strong>rtwoswitches,aswell,butthisismostlyrequesttrafficheadedfor<strong>the</strong>au<strong>the</strong>nticationservices<strong>in</strong>PRV21andPRV22.F<strong>in</strong>ally,<strong>the</strong>unsecuredportions<strong>of</strong><strong>the</strong>L<strong>in</strong>e-<strong>of</strong>-Bus<strong>in</strong>ess(consist<strong>in</strong>g<strong>of</strong>helppagesandbasicdocumentation)arelocatedpredom<strong>in</strong>antlyon<strong>the</strong>rackcorrespond<strong>in</strong>gto<strong>the</strong>edgeswitchPRV23—<br />
nearly85%<strong>of</strong><strong>the</strong>trafficorig<strong>in</strong>at<strong>in</strong>gfromthisrackisHTTP.<br />
Wealsoseesomeamount<strong>of</strong>file-systemtraffic(SMB)acrossall<br />
<strong>the</strong>4switches(roughly4%<strong>of</strong><strong>the</strong>bytesonaverage).<br />
Cluster<strong>in</strong>g<strong>of</strong>applicationcomponentswith<strong>in</strong>thisdatacenterleads<br />
ustobelievethatemerg<strong>in</strong>gpatterns<strong>of</strong>virtualizationandconsolidationshavenotyetledtoapplicationsbe<strong>in</strong>gspreadacross<strong>the</strong><br />
switches.<br />
Next,wefocuson<strong>the</strong>last3bars,whichcorrespondtoanedge<br />
switcheach<strong>in</strong><strong>the</strong>3universitydatacenters,EDU1,EDU2and<br />
EDU3.While<strong>the</strong>se3datacentersserve<strong>the</strong>sametypes<strong>of</strong>userswe<br />
observevariationsacross<strong>the</strong>networks.Two<strong>of</strong><strong>the</strong>universitydata<br />
centers,EDU2andEDU3,seemtoprimarilyutilize<strong>the</strong>networkfor<br />
distributedfilesystemstraffic,namelyAFSandNCP—AFSmakes<br />
upnearlyall<strong>the</strong>trafficseenat<strong>the</strong>EDU3switch,whileNCPconstitutesnearly80%<strong>of</strong><strong>the</strong>trafficat<strong>the</strong>EDU2switch.<br />
Thetraffic<br />
at<strong>the</strong>lastdatacenter,EDU1,issplit60/40betweenWebservices<br />
(bothHTTPandHTTPS)ando<strong>the</strong>rapplicationssuchasfileshar<strong>in</strong>g<br />
(SMB).Theoperator<strong>of</strong>thisdatacentertellsusthat<strong>the</strong>datacenter<br />
alsohostspayrollandbenefitsapplications,whicharecaptured<strong>in</strong><br />
“O<strong>the</strong>rs.”<br />
Notethatwef<strong>in</strong>dfilesystemtraffictoconstituteamoresignificantfraction<strong>of</strong><strong>the</strong>switches<strong>in</strong><strong>the</strong>universitydatacenterswemonitoredcomparedto<strong>the</strong>enterprisedatacenter.<br />
The key take-aways from the above observations are that: (1) there is a wide variety of applications observed both within and across data centers, such as “regular” and secure HTTP transactions, authentication services, file-system traffic, and custom applications; and (2) we observe a wide variation in the composition of traffic originated by the switches in a given data center (see the 4 switches corresponding to PRV2). This implies that one cannot assume that applications are placed uniformly at random in data centers.
For<strong>the</strong>rema<strong>in</strong><strong>in</strong>gdatacenters(i.e.,PRV1,CLD1–5),wherewe<br />
didnothaveaccesstopackettraces,weused<strong>in</strong>formationfromoperatorstounderstand<strong>the</strong>applicationmix.<br />
CLD4andCLD5are<br />
utilizedforrunn<strong>in</strong>gMapReducejobs,wi<strong>the</strong>achjob,scheduledto<br />
packasmany<strong>of</strong>itsnodesaspossible<strong>in</strong>to<strong>the</strong>sameracktoreduce<br />
demandon<strong>the</strong>datacenter’score<strong>in</strong>terconnect.Incontrast,CLD1,<br />
CLD2,andCLD3hostavariety<strong>of</strong>applications,rang<strong>in</strong>gfrommessag<strong>in</strong>gandWebmailtoWebportals.Each<strong>of</strong><strong>the</strong>seapplicationsiscomprised<strong>of</strong>multiplecomponentswith<strong>in</strong>tricatedependencies,deployedacross<strong>the</strong>entiredatacenter.Forexample,<strong>the</strong>Webportal<br />
requiresaccesstoanau<strong>the</strong>nticationserviceforverify<strong>in</strong>gusers,and<br />
italsorequiresaccesstoawiderange<strong>of</strong>Webservicesfromwhich<br />
dataisaggregated.InstantMessag<strong>in</strong>gsimilarlyutilizesanau<strong>the</strong>nticationserviceandcomposes<strong>the</strong>user’sbuddylistbyaggregat<strong>in</strong>g<br />
dataspreadacrossdifferentdatastores.Theapplicationmixfound<br />
<strong>in</strong><strong>the</strong>datacentersimpacts<strong>the</strong>trafficresults,whichwelookatnext.
5. APPLICATION COMMUNICATION PATTERNS

In the previous section, we described the set of applications running in each of the 10 data centers and observed that a variety of applications run in the data centers and that their placement is non-uniform. In this section, we analyze the aggregate network transmission behavior of the applications, both at the flow level and at the finer-grained packet level. Specifically, we aim to answer the following questions: (1) What are the aggregate characteristics of flow arrivals, sizes, and durations? and (2) What are the aggregate characteristics of the packet-level inter-arrival process across all applications in a rack, that is, how bursty are the transmission patterns of these applications? These aspects have important implications for the performance of the network and its links. As before, we use the packet traces in our analysis.
5.1 Flow-Level Communication Characteristics
First,weexam<strong>in</strong>e<strong>the</strong>number<strong>of</strong>activeflowsacross<strong>the</strong>4data<br />
centerswherewehavepacket-leveldata,EDU1,EDU2,EDU3,<br />
andPRV2.Toidentifyactiveflows,weusealong<strong>in</strong>activitytimeout<strong>of</strong>60seconds(similartothatused<strong>in</strong>previousmeasurements<br />
studies[19]).<br />
InFigure3(a),wepresent<strong>the</strong>distribution<strong>of</strong><strong>the</strong>number<strong>of</strong>active<br />
flowswith<strong>in</strong>aonesecondb<strong>in</strong>,asseenatsevendifferentswitches<br />
with<strong>in</strong>4datacenters.Wef<strong>in</strong>dthatalthough<strong>the</strong>distributionvaries<br />
across<strong>the</strong>datacenters,<strong>the</strong>number<strong>of</strong>activeflowsatanygiven<br />
<strong>in</strong>tervalislessthan10,000. Basedon<strong>the</strong>distributions,wegroup<br />
<strong>the</strong>7monitoredswitches<strong>in</strong>totwoclasses. In<strong>the</strong>firstclassare<br />
all<strong>of</strong><strong>the</strong>universitydatacenterswitchesEDU1,EDU2andEDU3,<br />
andone<strong>of</strong><strong>the</strong>switchesfromaprivateenterprise,namelyPRV24,<br />
where<strong>the</strong>number<strong>of</strong>activeflowsisbetween10and500<strong>in</strong>90%<strong>of</strong><br />
<strong>the</strong>time<strong>in</strong>tervals.In<strong>the</strong>secondclass,are<strong>the</strong>rema<strong>in</strong><strong>in</strong>gswitches<br />
from<strong>the</strong>enterprise,namely,PRV21,PRV22,andPRV23,where<br />
<strong>the</strong>number<strong>of</strong>activeflowsisbetween1,000and5,000about90%<br />
<strong>of</strong><strong>the</strong>time.<br />
Weexam<strong>in</strong>e<strong>the</strong>flow<strong>in</strong>ter-arrivaltimes<strong>in</strong>Figure3(b).Wef<strong>in</strong>d<br />
that<strong>the</strong>timebetweennewflowsarriv<strong>in</strong>gat<strong>the</strong>monitoredswitchis<br />
lessthan10µsfor2-13%<strong>of</strong><strong>the</strong>flows. Formost<strong>of</strong><strong>the</strong>switches<strong>in</strong><br />
PRV2,80%<strong>of</strong><strong>the</strong>flowshavean<strong>in</strong>ter-arrivaltimeunder1ms.This<br />
observationsupports<strong>the</strong>results<strong>of</strong>apriorstudy[19]<strong>of</strong>aclouddata<br />
center.However,wefoundthatthisobservationdoesnotholdfor<br />
<strong>the</strong>universitydatacenters,wherewesee80%<strong>of</strong><strong>the</strong>flow<strong>in</strong>terarrivaltimeswerebetween4msand40ms,suggest<strong>in</strong>gthat<strong>the</strong>sedatacentershavelesschurnthanPRV2and<strong>the</strong>previouslystudiedclouddatacenter[19].Amongo<strong>the</strong>rissues,flow<strong>in</strong>ter-arrival<br />
timeaffectswhatk<strong>in</strong>ds<strong>of</strong>process<strong>in</strong>gcanbedoneforeachnew<br />
flowand<strong>the</strong>feasibility<strong>of</strong>logicallycentralizedcontrollersforflow<br />
placement.Wereturnto<strong>the</strong>sequestions<strong>in</strong>Section7.<br />
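The inter-arrival fractions quoted above follow directly from the flow start times; a minimal sketch (the helper is our own illustration, not the paper's tooling):

```python
def interarrival_fractions(flow_start_times, thresholds_us=(10, 1000)):
    """flow_start_times: sorted flow arrival timestamps in seconds.
    Returns {threshold_us: fraction of inter-arrival gaps below it},
    e.g. the fraction of gaps under 10 microseconds or under 1ms."""
    gaps_us = [(b - a) * 1e6
               for a, b in zip(flow_start_times, flow_start_times[1:])]
    return {t: sum(g < t for g in gaps_us) / len(gaps_us)
            for t in thresholds_us}
```

Sweeping the threshold over a log-spaced grid yields the CDF shown in Figure 3(b).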
Next,weexam<strong>in</strong>e<strong>the</strong>distributions<strong>of</strong>flowsizesandandlengths<br />
<strong>in</strong>Figure4(a)and(b),respectively.FromFigure4(a),wef<strong>in</strong>dthat<br />
flowsizesareroughlysimilaracrossall<strong>the</strong>studiedswitchesand<br />
datacenters.Across<strong>the</strong>datacenters,wenotethat80%<strong>of</strong><strong>the</strong>flows<br />
aresmallerthan10KB<strong>in</strong>size.Most<strong>of</strong><strong>the</strong>bytesare<strong>in</strong><strong>the</strong>top10%<br />
<strong>of</strong>largeflows.FromFigure4(b),wef<strong>in</strong>dthatformost<strong>of</strong><strong>the</strong>data<br />
centers80%<strong>of</strong><strong>the</strong>flowsarelessthan11secondslong.Theseresultssupport<strong>the</strong>observationsmade<strong>in</strong>priorastudy[19]<strong>of</strong>acloud<br />
datacenter. However,wedonotethat<strong>the</strong>flows<strong>in</strong>EDU2appear<br />
tobegenerallyshorterandsmallerthan<strong>the</strong>flows<strong>in</strong><strong>the</strong>o<strong>the</strong>rdata<br />
centers. Webelievethisisdueto<strong>the</strong>nature<strong>of</strong><strong>the</strong>predom<strong>in</strong>ant<br />
applicationthataccountsforover70%<strong>of</strong><strong>the</strong>bytesat<strong>the</strong>switch.<br />
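The claim that most bytes sit in the top 10% of flows is a simple heavy-hitter computation; a minimal sketch, assuming flow sizes in bytes:

```python
def heavy_hitter_share(flow_sizes, top_fraction=0.10):
    """Fraction of total bytes carried by the largest top_fraction of flows."""
    xs = sorted(flow_sizes, reverse=True)        # largest flows first
    k = max(1, int(len(xs) * top_fraction))      # size of the heavy-hitter set
    return sum(xs[:k]) / sum(xs)
```

A value near 1.0 indicates that a small number of large flows dominates the byte count, as observed here.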
F<strong>in</strong>ally,<strong>in</strong>Figure5,weexam<strong>in</strong>e<strong>the</strong>distribution<strong>of</strong>packetsizes<br />
<strong>in</strong><strong>the</strong>studieddatacenters.Thepacketsizesexhibitabimodalpat-<br />
272<br />
CDF<br />
(a)<br />
CDF<br />
(b)<br />
1<br />
0.8<br />
0.6<br />
0.4<br />
0.2<br />
0<br />
EDU1<br />
EDU2<br />
EDU3<br />
PRV21 PRV22 PRV23 PRV24 10 100 1000 10000 100000<br />
1<br />
0.8<br />
0.6<br />
Number <strong>of</strong> Active Flows<br />
0.4<br />
0.2<br />
0<br />
EDU1<br />
EDU2<br />
EDU3<br />
PRV21 PRV22 PRV23 PRV24 10 100 1000 10000 100000<br />
Flow Interarrival Times (<strong>in</strong> usecs)<br />
Figure3: CDF<strong>of</strong><strong>the</strong>distribution<strong>of</strong><strong>the</strong>number<strong>of</strong>flowsat<br />
<strong>the</strong>edgeswitch(a)and<strong>the</strong>arrivalrateforflows(b)<strong>in</strong>EDU1,<br />
EDU2,EDU3,andPRV2.<br />
tern,withmostpacketsizescluster<strong>in</strong>garoundei<strong>the</strong>r200Bytesand<br />
1400Bytes.Surpris<strong>in</strong>gly,wefoundapplicationkeep-alivepackets<br />
asamajorreasonfor<strong>the</strong>smallpackets,withTCPacknowledgments,asexpected,be<strong>in</strong>g<strong>the</strong>o<strong>the</strong>rmajorcontributor.Uponclose<br />
<strong>in</strong>spection<strong>of</strong><strong>the</strong>packettraces,wefoundthatcerta<strong>in</strong>applications,<br />
<strong>in</strong>clud<strong>in</strong>gMSSQL,HTTP,andSMB,contributedmoresmallpacketsthanlargepackets.Inoneextremecase,wefoundanapplicationproduc<strong>in</strong>g5timesasmanysmallpacketsaslargepackets.<br />
Thisresultspeakstohowcommonlypersistentconnectionsoccur<br />
asadesignfeature<strong>in</strong>datacenterapplications,and<strong>the</strong>importance<br />
<strong>of</strong>cont<strong>in</strong>uallyma<strong>in</strong>ta<strong>in</strong><strong>in</strong>g<strong>the</strong>m.<br />
5.2 Packet-Level Communication Characteristics
Wefirstexam<strong>in</strong>e<strong>the</strong>temporalcharacteristics<strong>of</strong><strong>the</strong>packettraces.<br />
Figure6showsatime-series<strong>of</strong>packetarrivalsobservedatone<strong>of</strong><br />
<strong>the</strong>sniffers<strong>in</strong>PRV2,and<strong>the</strong>packetarrivalsexhibitanON/OFF<br />
patternatboth15msand100msgranularities.Weobservedsimilar<br />
trafficpatternsat<strong>the</strong>rema<strong>in</strong><strong>in</strong>g6switchesaswell.<br />
Per-packet arrival process: Leveraging the observation that traffic is ON/OFF, we use a packet inter-arrival time threshold to identify the ON/OFF periods in the traces. Let arrival95 be the 95th percentile value in the inter-arrival time distribution at a particular switch. We define an ON period as the longest continual period during which all the packet inter-arrival times are smaller than arrival95. Accordingly, an OFF period is a period between two ON periods. To characterize this ON/OFF traffic pattern, we focus on three aspects: (i) the durations of the ON periods, (ii) the durations of the OFF periods, and (iii) the packet inter-arrival times within ON periods.

Figure 7(a) shows the distribution of inter-arrival times within ON periods at one of the switches for PRV2. We bin the inter-arrival times according to the clock granularity of 10µs. Note that the distribution has a positive skew and a heavy tail. We attempted to fit several heavy-tailed distributions and found that the lognormal curve produces the best fit with the least mean error. Figure 7(b) shows the distribution of the durations of ON periods. Similar to the inter-arrival time distribution, this ON period distribution also exhibits a positive skew and fits well with a lognormal curve. The same observation applies to the OFF period distribution as well, as shown in Figure 7(c).

We found qualitatively similar characteristics at the other 6 switches where packet traces were collected. However, in fitting distributions to the packet traces (Table 4), we found that only the OFF periods at the different switches consistently fit the lognormal distribution. For the ON periods and inter-arrival rates, we found that the best distribution was either Weibull or lognormal, varying by data center.

Figure 4: CDF of the distribution of the flow sizes (a) and of flow lengths (b) in PRV2, EDU1, EDU2, and EDU3.

Figure 5: Distribution of packet size in the various networks.

Figure 6: ON/OFF characteristics: time series of data center traffic (number of packets per time bin) at two different timescales, (a) 15ms and (b) 100ms.

Data center   OFF period distribution   ON period distribution   Inter-arrival rate distribution
PRV2-1        Lognormal                 Lognormal                Lognormal
PRV2-2        Lognormal                 Lognormal                Lognormal
PRV2-3        Lognormal                 Lognormal                Lognormal
PRV2-4        Lognormal                 Lognormal                Lognormal
EDU1          Lognormal                 Weibull                  Weibull
EDU2          Lognormal                 Weibull                  Weibull
EDU3          Lognormal                 Weibull                  Weibull

Table 4: The distribution for the parameters of each of the arrival processes at the various switches.
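The arrival95-based segmentation described above can be sketched as follows, assuming a sorted list of packet timestamps from one switch (the helper is our own simplified illustration, using a naive percentile):

```python
def on_off_periods(arrivals):
    """arrivals: sorted packet timestamps (seconds) from one switch.
    Returns (on_durations, off_durations, on_interarrivals), using the
    95th percentile of all inter-arrival times as the ON/OFF threshold,
    mirroring the arrival95 definition in the text."""
    gaps = [b - a for a, b in zip(arrivals, arrivals[1:])]
    arrival95 = sorted(gaps)[int(0.95 * (len(gaps) - 1))]  # naive percentile
    on_durs, off_durs, on_gaps = [], [], []
    start = prev = arrivals[0]
    for ts in arrivals[1:]:
        gap = ts - prev
        if gap <= arrival95:
            on_gaps.append(gap)           # still inside the current ON period
        else:
            on_durs.append(prev - start)  # close the ON period
            off_durs.append(gap)          # the gap itself is the OFF period
            start = ts                    # a new ON period begins
        prev = ts
    on_durs.append(prev - start)          # close the final ON period
    return on_durs, off_durs, on_gaps
```

The three returned lists correspond to the three aspects characterized in Figure 7: ON durations, OFF durations, and inter-arrival times within ON periods.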
Ourf<strong>in</strong>d<strong>in</strong>gs<strong>in</strong>dicatethatcerta<strong>in</strong>positiveskewedandheavytaileddistributionscanmodeldatacenterswitchtraffic.Thishighlightsadifferencebetween<strong>the</strong>datacenterenvironmentand<strong>the</strong>wideareanetwork,where<strong>the</strong>long-tailedParetodistributiontypicallyshows<strong>the</strong>bestfit[27,24].<br />
Thedifferencesbetween<strong>the</strong>se<br />
distributionsshouldbetaken<strong>in</strong>toaccountwhenattempt<strong>in</strong>gtoapply<br />
modelsortechniquesfromwideareanetwork<strong>in</strong>gtodatacenters.<br />
Per-application arrival process: Recall that the data centers in this analysis, namely EDU1, EDU2, EDU3, and PRV2, are dominated by Web and distributed file-system traffic (Figure 2). We now examine the arrival processes for these dominant applications to see if they explain the aggregate arrival process at the corresponding switches. In Table 5, we present the distribution that best fits the arrival process for the dominant application. From this table, we notice that the dominant applications in the universities (EDU1, EDU2, EDU3), which account for 70–100% of the bytes at the respective switches, are indeed characterized by the same heavy-tailed distributions as the aggregate traffic. However, in the case of two of the PRV2 switches (#1 and #3), we find that the dominant application differs slightly from the aggregate behavior. Thus, in the general case, we find that simply relying on the characteristics of the most dominant applications is not sufficient to accurately model the aggregate arrival processes at data center edge switches.
Data center   OFF period distribution   Inter-arrival rate distribution   ON period distribution   Dominant application
PRV2-1        Lognormal                 Weibull                           Exponential              Others
PRV2-2        Weibull                   Lognormal                         Lognormal                LDAP
PRV2-3        Weibull                   Lognormal                         Exponential              HTTP
PRV2-4        Lognormal                 Lognormal                         Weibull                  Others
EDU1          Lognormal                 Lognormal                         Weibull                  HTTP
EDU2          Lognormal                 Weibull                           Weibull                  NCP
EDU3          Lognormal                 Weibull                           Weibull                  AFS

Table 5: The distribution for the parameters of each of the arrival processes of the dominant application on each switch.
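The least-mean-error model selection behind Tables 4 and 5 can be approximated with standard-library tools. A simplified sketch that fits only lognormal (via log-moments) and exponential (via the mean) and compares mean absolute error against the empirical CDF; the paper also fits Weibull and Pareto, and the estimators here are our own stand-ins:

```python
import math

def lognormal_cdf(x, mu, sigma):
    return 0.5 * (1 + math.erf((math.log(x) - mu) / (sigma * math.sqrt(2))))

def exponential_cdf(x, lam):
    return 1 - math.exp(-lam * x)

def best_fit(samples):
    """Return the name of the candidate distribution whose fitted CDF has
    the smaller total absolute error against the empirical CDF of the
    positive samples -- a two-candidate version of least-mean-error fitting."""
    xs = sorted(samples)
    n = len(xs)
    logs = [math.log(x) for x in xs]
    mu = sum(logs) / n                                     # log-moment fit
    sigma = math.sqrt(sum((l - mu) ** 2 for l in logs) / n) or 1e-12
    lam = n / sum(xs)                                      # exponential MLE
    err = {"lognormal": 0.0, "exponential": 0.0}
    for i, x in enumerate(xs):
        ecdf = (i + 1) / n                                 # empirical CDF at x
        err["lognormal"] += abs(lognormal_cdf(x, mu, sigma) - ecdf)
        err["exponential"] += abs(exponential_cdf(x, lam) - ecdf)
    return min(err, key=err.get)
```

A production version would fit all four candidate families (e.g., with scipy.stats) rather than these two.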
Figure 7: CDF of the distribution of the arrival times of packets at 3 of the switches in PRV2. The figure contains best-fit curves for lognormal, Weibull, Pareto, and exponential distributions, as well as the least mean errors for each. [Panels and least mean errors: (a) inter-arrival times within ON periods (in milliseconds): Weibull 0.0138, lognormal 0.0111, exponential 0.0597, Pareto 0.0277; (b) length of ON periods (in microseconds): Weibull 0.0165, lognormal 0.0161, exponential 0.0170, Pareto 0.0323; (c) length of OFF periods (in milliseconds): Weibull 0.0903, lognormal 0.0817, exponential 0.1159, Pareto 0.6691.]
Finally, we compare the observed distributions for HTTP applications in the data center against HTTP applications in the wide area and find that the distribution of ON periods in the data center does match observations made by others [7] in the WAN.
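The fitting procedure behind Table 5 and Figure 7 (fit candidate distributions to the measured samples and keep the one with the least mean error against the empirical CDF) can be sketched in a few lines. This is an illustrative reimplementation on synthetic data, not the paper's actual fitting code; the maximum-likelihood formulas and the 2000-sample lognormal input are assumptions, and only exponential and lognormal candidates are shown.

```python
import bisect
import math
import random

def fit_and_score(samples):
    """Fit exponential and lognormal distributions by maximum likelihood,
    then score each by the mean absolute error between its CDF and the
    empirical CDF (the 'least mean error' criterion of Figure 7)."""
    n = len(samples)
    srt = sorted(samples)
    lam = n / sum(samples)                      # exponential MLE: 1/mean
    logs = [math.log(s) for s in samples]
    mu = sum(logs) / n                          # lognormal MLE on log-scale
    sigma = math.sqrt(sum((v - mu) ** 2 for v in logs) / n)

    def ecdf(x):                                # empirical CDF, via bisection
        return bisect.bisect_right(srt, x) / n

    cdfs = {
        "exp": lambda x: 1.0 - math.exp(-lam * x),
        "logn": lambda x: 0.5 * (1.0 + math.erf(
            (math.log(x) - mu) / (sigma * math.sqrt(2.0)))),
    }
    errors = {name: sum(abs(cdf(x) - ecdf(x)) for x in srt) / n
              for name, cdf in cdfs.items()}
    return min(errors, key=errors.get), errors

rng = random.Random(0)
# Synthetic heavy-tailed "interarrival times" drawn from a lognormal.
data = [rng.lognormvariate(2.0, 1.0) for _ in range(2000)]
best, errs = fit_and_score(data)
print(best, errs)
```

On lognormal input the lognormal fit wins, mirroring how the per-switch winners in Table 5 were selected.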
The takeaways from our observations are that: (1) The number of active flows at a switch in any given second is, at most, 10,000 flows. However, new flows can arrive in rapid succession (10 µs) of each other, resulting in high instantaneous arrival rates; (2) Most flows in the data centers we examined are small in size (≤ 10 KB) and a significant fraction last under a few hundreds of milliseconds; (3) Traffic leaving the edge switches in a data center is bursty in nature and the ON/OFF intervals can be characterized by heavy-tailed distributions; and (4) In some data centers, the predominant application drives the aggregate sending pattern at the edge switch. In the general case, however, simply focusing on dominant applications is insufficient to understand the process driving packet transmission into the data center network.
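As a rough illustration of takeaway (3), the ON/OFF model with the distribution shapes from Table 5 can be used to generate synthetic bursty arrivals. Only the distribution families (Weibull ON periods, lognormal OFF periods and interarrivals, as reported for EDU1) come from the text; the parameter values below are invented placeholders, not fitted to the paper's traces.

```python
import random

def on_off_arrivals(duration_ms, seed=0):
    """Packet arrival timestamps (in ms) from an ON/OFF source: Weibull ON
    periods, lognormal OFF periods, and lognormal interarrivals inside each
    burst. Parameter values are illustrative placeholders."""
    rng = random.Random(seed)
    t, arrivals = 0.0, []
    while t < duration_ms:
        on_end = t + rng.weibullvariate(0.5, 0.8)   # ON period length (ms)
        while t < on_end and t < duration_ms:
            arrivals.append(t)
            t += rng.lognormvariate(-3.0, 1.0)      # gap between packets (ms)
        t = on_end + rng.lognormvariate(1.0, 1.0)   # OFF period (ms)
    return arrivals

pkts = on_off_arrivals(1000.0)
print(len(pkts), "packets in 1 s")
```

Heavy-tailed OFF periods mean occasional long silences punctuated by dense packet trains, which is exactly the burstiness that shows up later as momentary congestion on otherwise lightly loaded links.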
In the next section, we analyze link utilizations at the various layers within the data center to understand how the bursty nature of traffic impacts the utilization and packet loss of the links at each of the layers.
6. NETWORK COMMUNICATION PATTERNS
In the two previous sections, we examined the applications employed in each of the 10 data centers, their placement, and their transmission patterns. In this section, we examine how existing data center applications utilize the interconnect, with the goal of informing data center traffic engineering techniques. In particular, we aim to answer the following questions: (1) To what extent does the current application traffic utilize the data center's interconnect? For example, is most traffic confined to within a rack or not? (2) What is the utilization of links at different layers in a data center? (3) How often are links heavily utilized, and what are the properties of heavily utilized links? For example, how long does heavy utilization persist on these links, and do the highly utilized links experience losses? (4) To what extent do link utilizations vary over time?
6.1 Flow of Traffic
We start by examining the relative proportion of traffic generated by the servers that stays within a rack (Intra-Rack traffic) versus traffic that leaves its rack for either other racks or external destinations (Extra-Rack traffic). Extra-Rack traffic can be directly measured, as it is the amount of traffic on the uplinks of the edge switches (i.e., the "Top-of-Rack" switches). We compute Intra-Rack traffic as the difference between the volume of traffic generated by the servers attached to each edge switch and the traffic exiting edge switches.
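The Intra-Rack computation just described is a simple subtraction; a minimal sketch (function and input names are hypothetical):

```python
def rack_traffic_split(server_tx_bytes, uplink_tx_bytes):
    """Split server-generated traffic at one edge switch into Intra-Rack
    and Extra-Rack shares. Extra-Rack is measured directly on the uplinks;
    Intra-Rack is the difference, as described in the text."""
    intra = server_tx_bytes - uplink_tx_bytes
    return {"intra_pct": 100.0 * intra / server_tx_bytes,
            "extra_pct": 100.0 * uplink_tx_bytes / server_tx_bytes}

# A cloud-style rack: 4 GB generated by servers, 1 GB leaves on the uplinks.
split = rack_traffic_split(4_000_000_000, 1_000_000_000)
print(split)  # {'intra_pct': 75.0, 'extra_pct': 25.0}
```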
In Figure 8, we present a bar graph of the ratio of Extra-Rack to Intra-Rack traffic in the 10 data centers we studied. We note that a predominant portion of server-generated traffic in the cloud data centers CLD1–5 (nearly 75% on average) is confined to within the rack in which it was generated.
Recall from Section 4 that only two of these 5 data centers, CLD4 and CLD5, run MapReduce-style applications, while the other three run a mixture of different customer-facing Web services. Despite this key difference in usage, we observe surprisingly little difference in the relative proportions of Intra-Rack and Extra-Rack traffic. This can be explained by revisiting the nature of applications in these data centers: as stated in Section 4, the services running in CLD1–3 have dependencies spread across many servers in the data center. The administrators of these networks try to colocate applications and dependent components into the same racks to avoid sharing a rack with other applications/services. Low Extra-Rack traffic is a side-effect of this practice. In the case of CLD4 and CLD5, the operators assign MapReduce jobs to co-located servers for similar reasons. However, fault tolerance requires placing redundant components of the application and data storage into different racks, which increases the Extra-Rack communication. Our findings of high Intra-Rack traffic within data centers support observations made by others [19], where the focus was on cloud data centers running MapReduce.
Figure 8: The ratio of Extra-Rack to Intra-Rack traffic in the data centers.
Next, we focus on the enterprise and university data centers. With the exception of EDU1, these appear to be both very different from the cloud data centers and qualitatively similar to each other: at least 50% of the server-originated traffic in these data centers leaves the racks, compared with under 25% for the cloud data centers. These data centers run user-facing applications, such as Web services and file servers. While this application mix is similar to CLD1–3 discussed above, the Intra/Extra-Rack usage patterns are quite different. A possible reason for the difference is that the placement of dependent services in enterprise and campus data centers may not be as optimized as in the cloud data centers.
6.2 Link Utilizations vs. Layer
Next, we examine the impact of the Extra-Rack traffic on the links within the interconnect of the various data centers. We examine link utilization as a function of location in the data center topology. Recall that all 10 data centers employ 2-tiered or 3-tiered tree-like networks.
In performing this study, we selected several hundred 5-minute intervals at random for each data center and examined the link utilizations as reported by SNMP. In Figure 9, we present the utilization for links across different layers in the data centers for one such representative interval.
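Average utilization over a 5-minute SNMP polling interval is conventionally derived from interface octet counters; the sketch below shows the standard calculation (this is a generic reconstruction, not the authors' tooling). The modulo arithmetic tolerates a single wrap of the 32-bit ifInOctets/ifOutOctets counters between polls.

```python
def link_utilization(octets_before, octets_after, interval_s, speed_bps,
                     counter_bits=32):
    """Average utilization (%) of a link over one SNMP polling interval,
    computed from an interface octet counter. The modulo tolerates one
    counter wrap between polls."""
    delta = (octets_after - octets_before) % (1 << counter_bits)
    return 100.0 * delta * 8 / (interval_s * speed_bps)

# 100 Mbps link, 5-minute poll, 750 MB sent -> 20% average utilization.
util = link_utilization(0, 750_000_000, 300, 100_000_000)
print(f"{util:.1f}%")  # 20.0%
```

Note that on faster links 32-bit counters can wrap more than once per 5-minute poll, which is why high-speed interfaces are normally polled via 64-bit counters.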
In general, we find that utilizations within the core/aggregation layers are higher than those at the edge; this observation holds across all classes of data centers. These findings support observations made by others [3], where the focus was on cloud data centers. A key point to note, not raised by prior work [3], is that across the various data centers, there are differences in the tail of the distributions for all layers: in some data centers, such as CLD4, there is a greater prevalence of high-utilization links (i.e., utilization 70% or greater), especially in the core layer, while in others there are no high-utilization links in any layer (e.g., EDU1). Next, we examine these high-utilization links in greater depth.
6.3 Hot-spot Links
In this section, we study the hot-spot links (those with 70% or higher utilization) unearthed in the various data centers, focusing on the persistence and prevalence of hot-spots. More specifically, we aim to answer the following questions: (1) Do some links frequently appear as hot-spots? How does this result vary across layers and data centers? (2) How does the set of hot-spot links in a layer change over time? (3) Do hot-spot links experience high packet loss?
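The 70% hot-spot threshold and the persistence/prevalence analysis can be operationalized as below. Only the 70% threshold and the class names come from the text; the numeric classification cut-offs (50% persistence, 20% prevalence) are illustrative guesses loosely based on the figures reported later for the three classes.

```python
def hotspot_stats(util_series_by_link, threshold=70.0):
    """For each link, the fraction of 5-minute intervals in which it is a
    hot-spot (utilization >= threshold %), per the text's definition."""
    return {link: sum(u >= threshold for u in series) / len(series)
            for link, series in util_series_by_link.items()}

def classify(hot_fracs, persist=0.5):
    """Rough persistence/prevalence label for a set of core links, where
    prevalence = share of links that are persistent hot-spots. Cut-offs
    are illustrative, not taken from the paper."""
    prevalence = sum(f >= persist for f in hot_fracs.values()) / len(hot_fracs)
    if prevalence >= 0.2:
        return "high persistence-high prevalence"   # e.g. CLD4/CLD5
    if prevalence > 0:
        return "high persistence-low prevalence"    # e.g. PRV1/CLD2
    return "low persistence-low prevalence"         # e.g. EDU1-3

# Made-up utilization time series (%) for five core links.
core = {"c1": [80, 90, 75, 85], "c2": [10, 20, 80, 5], "c3": [5, 5, 5, 5],
        "c4": [60, 65, 50, 55], "c5": [30, 75, 72, 90]}
fracs = hotspot_stats(core)
print(fracs["c1"], classify(fracs))
```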
Figure 9: CDF of link utilizations (percentage) in each layer: (a) edge links, (b) aggregation links, (c) core links.
6.3.1 Persistence and Prevalence
In Figure 10, we present the distribution of the percentage of time intervals during which a link is a hot-spot. We note from Figures 10(a) and (b) that very few links in either the edge or aggregation layers are hot-spots, and this observation holds across all data centers and data center types. Specifically, only 3% of the links in these two layers appear as a hot-spot for more than 0.1% of time intervals. When edge links are congested, they tend to be congested continuously, as in CLD2, where a very small fraction of the edge links appear as hot-spots in 90% of the time intervals.
In contrast, we find that the data centers differ significantly in their core layers (Figure 10(c)). Our data centers cluster into 3 hot-spot classes: (1) Low Persistence-Low Prevalence: This class of data centers comprises those where the hot-spots are not localized to any set of links. This includes PRV2, EDU1, EDU2, EDU3, CLD1, and CLD3, where any given core link is a hot-spot for no more than 10% of the time intervals; (2) High Persistence-Low Prevalence: The second group of data centers is characterized by hot-spots being localized to a small number of core links. This includes PRV1 and CLD2, where 3% and 8% of the core links, respectively, each appear as hot-spots in > 50% of the time intervals; and (3) High Persistence-High Prevalence: Finally, in the last group, containing CLD4 and CLD5, a significant fraction of the core links appear persistently as hot-spots. Specifically, roughly 20% of the core links are hot-spots at least 50% of the time each. Note that both CLD4 and CLD5 run MapReduce applications.

Figure 10: A CDF of the fraction of times that links in the various layers are hot-spots: (a) edge, (b) aggregation, (c) core links.
Next, we examine the variation in the fraction of the core links that are hot-spots versus time. In Figure 13, we show our observations for one data center in each of the 3 hot-spot classes just described. From this figure, we observe that each class has a different pattern. In the low persistence-low prevalence data center, CLD1, we find that very few hot-spots occur over the course of the day, and when they do occur, only a small fraction of the core links emerge as hot-spots (less than 0.002%). However, in the high persistence classes, we observe that hot-spots occur throughout the day. Interestingly, with the high persistence-high prevalence data center, CLD5, we observe that the fraction of links that are hot-spots is affected by the time of day. Equally important is that only 25% of the core links in CLD5 are ever hot-spots. This suggests that, depending on the traffic matrix, the remaining 75% of the core links can be utilized to offload some traffic from the hot-spot links.
6.3.2 Hot-spots and Discards
Figure 11: A CDF of the number of bits lost across the various layers: (a) edge, (b) aggregation, (c) core discards.

Finally, we study loss rates across links in the data centers. In particular, we start by examining the discards for the set of hot-spot links.
Surprisingly, we find that none of the hot-spot links experience loss. This implies that in the data centers studied, loss does not correlate with high utilization. To understand where losses are prevalent, we examine Figures 11 and 12, which display the loss rates and link utilization for the links with losses. In the core and aggregation layers, all the links with losses have less than 30% average utilization, whereas at the edge, the links with losses have nearly 60% utilization. The fact that links with relatively low average utilization incur losses indicates that these links experience momentary bursts that do not persist for a long enough period to increase the average utilization. These momentary bursts can be explained by the bursty nature of the traffic (Section 5).
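Back-of-the-envelope arithmetic shows why a loss-inducing burst can hide behind a low 5-minute average: a short burst at line rate can overflow a shallow switch buffer yet barely move the SNMP-reported mean. The link speed, burst length, and background load below are assumed values for illustration only.

```python
# A 1 Gbps link polled every 300 s, with a single 50 ms burst at full
# line rate on top of 2% background load.
LINK_BPS = 1_000_000_000
INTERVAL_S = 300
burst_s = 0.05                      # 50 ms burst at line rate
background_util = 0.02              # 2% load during the rest of the interval

burst_bits = LINK_BPS * burst_s
background_bits = background_util * LINK_BPS * (INTERVAL_S - burst_s)
avg_util = 100 * (burst_bits + background_bits) / (LINK_BPS * INTERVAL_S)
print(f"average utilization: {avg_util:.3f}%")  # 2.016% -- the burst is invisible
```

The burst raises the 5-minute average by less than 0.02 percentage points, so a link that dropped packets during the burst still reports a near-idle average, consistent with the losses observed on low-utilization links.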
6.4 Variations in Utilization
In this section, we examine whether utilizations vary over time and whether or not link utilizations are stable and predictable.
Figure 12: A CDF of the utilization of links with discards: (a) edge, (b) aggregation, (c) core links.

We examined the link utilization over a one-week period and found that diurnal patterns exist in all data centers. As an example, Figure 14 presents the utilization for input and output traffic at a router port in one of the cloud data centers. The 5-day trace shows diurnal and pronounced weekend/weekday variations.
To quantify this variation, we examine the difference between peak and trough utilizations for each link across the studied data centers. In Figure 15, we present the distribution of peak-versus-trough link utilizations across the various data centers. The x-axis is in percentage. We note that edge links in general show very little variation (less than 10% for at least 80% of edge links). The same is true for links in the aggregation layer (where available), although we see slightly greater variability. In particular, links in the aggregation layer of PRV2 show significant variability, whereas those in the other data centers do not (variation is less than 10% for at least 80% of aggregation links). Note that links with a low degree of variation can be run at a slower speed based on expected traffic volumes. This could result in savings in network energy costs [14].
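The max-trough metric and the screening for low-variation links can be sketched as follows. The 10-percentage-point band mirrors the figure quoted in the text; the link names and utilization series are made up.

```python
def max_trough(util_series):
    """Peak-minus-trough utilization (percentage points) for one link."""
    return max(util_series) - min(util_series)

def stable_links(series_by_link, band=10.0):
    """Links whose utilization varies by less than `band` points: per the
    text, candidates for running at a lower speed to save energy [14]."""
    return [link for link, series in series_by_link.items()
            if max_trough(series) < band]

# Made-up utilization samples (%) over a week, per link.
links = {"edge1": [3, 5, 4, 6], "edge2": [2, 2, 3, 2],
         "core1": [20, 55, 30, 70]}
print(stable_links(links))  # ['edge1', 'edge2']
```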
The variation in link utilizations at the edge/aggregation layers is similar across the studied data centers. At the core, however, we are able to distinguish between several of the data centers. While most have low variations (less than 1%), we find that two cloud data centers (CLD4 and CLD5) have significant variations. Recall that unlike the other cloud data centers, these two cloud data centers run primarily MapReduce-style jobs. The large variations reflect differences between the periods when data is being reduced from the worker nodes to the master and other periods.

Figure 13: Time series of the fraction of links that are hot-spots in the core layer for CLD1, CLD2, and CLD5.

Figure 14: Time-of-Day/Day-of-Week traffic patterns.
To summarize, the key take-aways from our analysis of network traffic patterns are as follows: (1) In cloud data centers, a significant fraction of traffic stays inside the rack, while the opposite is true for enterprise and campus data centers; (2) On average, the core of the data center is the most utilized layer, while the data center edge is lightly utilized; (3) The core layers in various data centers do contain hot-spot links. In some of the data centers, the hot-spots appear only occasionally. In some of the cloud data centers, a significant fraction of core links appear as hot-spots a large fraction of the time. At the same time, the number of core links that are hot-spots at any given time is less than 25%; (4) Losses are not correlated with links with persistently high utilizations. We observed that losses do occur on links with low average utilization, indicating that losses are due to momentary bursts; and (5) In general, time-of-day and day-of-week variation exists in many of the data centers. The variation in link utilization is most significant in the core of the data centers and quite moderate in the other layers.
Figure 15: Difference between the peak and trough utilization: (a) edge, (b) aggregation, (c) core links.
7. IMPLICATIONS FOR DATA CENTER DESIGN
7.1 Role of Bisection Bandwidth
Several proposals [1, 22, 11, 2] for new data center network architectures attempt to maximize the network bisection bandwidth. These approaches, while well suited for data centers that run applications that stress the network's fabric with all-to-all traffic, would be unwarranted in data centers where the bisection bandwidth is not taxed by the applications. In this section, we re-evaluate the SNMP and topology data captured from the 10 data centers and examine whether the prevalent traffic patterns are likely to stress the existing bisection bandwidth. We also examine how much of the existing bisection bandwidth is needed at any given time to support the prevalent traffic patterns.
Before explaining how we address these questions, we provide a few definitions. We define the bisection links for a tiered data center to be the set of links at the top-most tier of the data center's tree architecture; in other words, the core links make up the bisection links. The bisection capacity is the aggregate capacity of these links. The full bisection capacity is the capacity that would be required to support servers communicating at full link speeds with arbitrary traffic matrices and no oversubscription. The full bisection capacity can be computed as simply the aggregate capacity of the server NICs.

Figure 16: The first bar is the ratio of aggregate server traffic over bisection capacity, and the second bar is the ratio of aggregate server traffic over full bisection capacity. The y-axis displays utilization as a percentage.
Returning to the questions posed earlier in this section, we use SNMP data to compute the following: (1) the ratio of the current aggregate server-generated traffic to the current bisection capacity, and (2) the ratio of the current traffic to the full bisection capacity. In doing so, we make the assumption that the bisection links can be treated as a single pool of capacity from which all offered traffic can draw. While this may not be true in all current networks, it allows us to determine whether more capacity is needed or whether better use of existing capacity is needed (for example, by improving routing, topology, or the migration of application servers inside the data center).
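To make the two ratios concrete, the sketch below (illustrative Python; the server count, NIC speed, link capacity, and traffic volume are invented, not values from our datasets) computes both:

```python
# Sketch of the two ratios computed from the SNMP data; all input
# figures below are made up for illustration.

def bisection_ratios(server_traffic_gbps, bisection_capacity_gbps,
                     num_servers, nic_gbps=1.0):
    """Percent of the current and of the full bisection capacity that the
    aggregate server-generated traffic would consume."""
    full_bisection_gbps = num_servers * nic_gbps  # aggregate NIC capacity
    current = 100.0 * server_traffic_gbps / bisection_capacity_gbps
    full = 100.0 * server_traffic_gbps / full_bisection_gbps
    return current, full

# e.g., 500 servers with 1 Gbps NICs, 40 Gbps of core bisection links,
# and 8 Gbps of aggregate server-generated traffic:
current, full = bisection_ratios(8.0, 40.0, 500)
print(f"{current:.0f}% of current bisection, {full:.1f}% of full bisection")
```

The full-bisection denominator is just the server count times the NIC speed, which is why the second bar in Figure 16 is so much smaller than the first.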
In Figure 16, we present these two ratios for each of the data centers studied. Recall (from Table 2) that all data centers are oversubscribed, meaning that if all servers sent data as fast as they could and all traffic left the racks, then the bisection links would be fully congested (we would expect to find utilization ratios over 100%). However, we find in Figure 16 that the prevalent traffic patterns are such that, even in the worst case where all server-generated traffic is assumed to leave the rack hosting the server, the aggregate output from servers is smaller than the network's current bisection capacity. This means that even if the applications were moved around and the traffic matrix changed, the current bisection capacity would still be more than sufficient, and no more than 25% of it would be utilized across all data centers, including the MapReduce data centers. Finally, we note that the aggregate output from servers is a negligible fraction of the ideal bisection capacity in all cases. This implies that should these data centers be equipped with a network that provides full bisection bandwidth, at least 95% of this capacity would go unused and be wasted by today's traffic patterns.
Thus, the prevalent traffic patterns in the data centers can be supported by the existing bisection capacity, even if applications were placed in such a way that there was more inter-rack traffic than exists today. This analysis assumes that the aggregate capacity of the bisection links forms a shared resource pool from which all offered traffic can draw. If the topology prevents some offered traffic from reaching some links, then some links can experience high utilization while others see low utilization. Even in this situation, however, the issue is one of changing the topology and selecting a routing algorithm that allows offered traffic to draw effectively from the existing capacity, rather than a question of adding more capacity. Centralized routing, discussed next, could help in constructing the requisite network paths.
7.2 Centralized Controllers in Data Centers
The architectures for several proposals [1, 22, 12, 2, 14, 21, 4, 18, 29] rely in some form or another on a centralized controller for configuring routes or for disseminating routing information to end hosts. A centralized controller is only practical if it is able to scale up to meet the demands of the traffic characteristics within the data centers. In this section, we examine this issue in the context of the flow properties that we analyzed in Section 5.
In particular, we focus on the proposals (Hedera [2], MicroTE [4], and ElasticTree [14]) that rely on OpenFlow and NOX [15, 23]. In an OpenFlow architecture, the first packet of a flow, when encountered at a switch, can be forwarded to a central controller that determines the route that the packet should follow in order to meet some network-wide objective. Alternatively, to eliminate the setup delay, the central controller can precompute a set of network paths that meet network-wide objectives and install them into the network at startup time.
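The two modes can be caricatured as follows (plain Python; `ToySwitch`, `ToyController`, and their methods are invented for illustration and are not OpenFlow or NOX APIs):

```python
# Toy model of reactive vs. proactive route installation; the classes and
# method names are invented and are not real OpenFlow/NOX APIs.

class ToyController:
    def compute_route(self, src, dst):
        # Stand-in for a route computation meeting a network-wide objective.
        return hash((src, dst)) % 4  # pick one of 4 output ports

    def preinstall(self, switch, expected_flows):
        # Proactive mode: push entries at startup, avoiding per-flow setup
        # delay but limiting the controller to precomputed algorithms.
        for src, dst in expected_flows:
            switch.flow_table[(src, dst)] = self.compute_route(src, dst)

class ToySwitch:
    def __init__(self, controller):
        self.flow_table = {}      # (src, dst) -> output port
        self.controller = controller
        self.controller_hits = 0  # flows that paid the setup round-trip

    def forward(self, src, dst):
        if (src, dst) not in self.flow_table:
            # Reactive mode: the first packet of an unknown flow goes to the
            # controller, which computes and installs a rule (adds latency).
            self.controller_hits += 1
            self.flow_table[(src, dst)] = self.controller.compute_route(src, dst)
        return self.flow_table[(src, dst)]

ctl = ToyController()
sw = ToySwitch(ctl)
ctl.preinstall(sw, [("A", "B")])
sw.forward("A", "B")       # hits the preinstalled entry: no round-trip
sw.forward("C", "D")       # reactive: incurs one controller round-trip
print(sw.controller_hits)  # 1
```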
Our empirical observations in Section 5 have important implications for such centralized approaches. First, the fact that the number of active flows is small (see Figure 4(a)) implies that switches enabled with OpenFlow can make do with a small flow table, which is a constrained resource on switches today.

Second, flow inter-arrival times have important implications for the scalability of the controller. As we observed in Section 5, a significant number of new flows (2–20%) can arrive at a given switch within 10 µs of each other. The switch must forward the first packets of these flows to the controller for processing. Even if the data center has as few as 100 edge switches, in the worst case a controller can see 10 new flows per µs, or 10 million flows per second. Depending on the complexity of the objective implemented at the controller, computing a route for each of these flows could be expensive. For example, prior work [5] showed a commodity machine computing simple shortest paths for only 50K flow arrivals per second. Thus, to scale the throughput of a centralized control framework while supporting complex routing objectives, we must employ parallelism (i.e., use multiple CPUs per controller and multiple controllers) and/or use faster but less optimal heuristics to compute routes. Prior work [28] has shown, through parallelism, the ability of a central controller to scale to 20 million flows per second.
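The worst-case load quoted above is simple arithmetic, restated here (Python; the 10 µs inter-arrival and 50K routes/s figures come from the text above, everything else is restatement):

```python
# Worst-case controller load: each of 100 edge switches sees a new flow
# every 10 us, and every first packet is punted to one central controller.
edge_switches = 100
flows_per_sec_per_switch = 1_000_000 // 10          # one new flow per 10 us
controller_load = edge_switches * flows_per_sec_per_switch
print(f"{controller_load:,} flows/s")               # 10,000,000 flows/s

# Against the ~50K shortest-path computations/s of one commodity machine [5],
# a purely sequential controller falls short by a factor of:
shortfall = controller_load // 50_000
print(f"{shortfall}x")                              # 200x
```

The 200x gap is what motivates parallelism or cheaper heuristics in the text above.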
Finally, flow duration and size also have implications for the centralized controller. The lengths of flows determine the relative impact of the latency imposed by a controller on a new flow. Recall that we found that most flows last less than 100 ms. Prior work [5] showed that it takes reactive controllers, which make decisions at flow startup time, approximately 10 ms to install flow entries for new flows. Given our results, this imposes a 10% delay overhead on most flows. Additional processing delay may be acceptable for some traffic, but might be unacceptable for other kinds. For the class of workloads that finds such a delay unacceptable, OpenFlow provides a proactive mechanism that allows the controllers, at switch startup time, to install flow entries in the switches. This proactive mechanism eliminates the 10 ms delay but limits the controller to proactive algorithms.
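The 10% overhead figure follows directly from the two latencies above (Python restatement):

```python
# Reactive rule installation takes ~10 ms [5], while most flows in our
# traces last under ~100 ms, so every new flow pays roughly:
typical_flow_ms = 100
reactive_setup_ms = 10
overhead_pct = 100 * reactive_setup_ms / typical_flow_ms
print(overhead_pct)  # 10.0
```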
In summary, it appears the number and inter-arrival time of data center flows can be handled by a sufficiently parallelized implementation of the centralized controller. However, the overhead of reactively computing flow placements is a reasonable fraction of the length of the typical flow.
8. SUMMARY
In this paper, we conducted an empirical study of the network traffic of 10 data centers spanning three very different categories, namely university campus, private enterprise data centers, and cloud data centers running Web services, customer-facing applications, and intensive Map-Reduce jobs. To the best of our knowledge, this is the broadest-ever large-scale measurement study of data centers.
We started our study by examining the applications run within the various data centers. We found that a variety of applications are deployed and that they are placed non-uniformly across racks. Next, we studied the transmission properties of the applications in terms of the flow and packet arrival processes at the edge switches. We discovered that the arrival process at the edge switches is ON/OFF in nature, where the ON/OFF durations can be characterized by heavy-tailed distributions. In analyzing the flows that constitute these arrival processes, we observed that flows within the data centers studied are generally small in size and several of these flows last only a few milliseconds.
We studied the implications of the deployed data center applications and their transmission properties on the data center network and its links. We found that most of the server-generated traffic in the cloud data centers stays within a rack, while the opposite is true for campus data centers. We found that at the edge and aggregation layers, link utilizations are fairly low and show little variation. In contrast, link utilizations at the core are high with significant variations over the course of a day. In some data centers, a small but significant fraction of core links appear to be persistently congested, but there is enough spare capacity in the core to alleviate congestion. We observed losses on the links that are lightly utilized on average, and argued that these losses can be attributed to the bursty nature of the underlying applications run within the data centers.
On the whole, our empirical observations can help inform data center traffic engineering and QoS approaches, as well as recent techniques for managing other resources, such as data center network energy consumption. To further highlight the implications of our study, we re-examined recent data center proposals and architectures in light of our results. In particular, we determined that full bisection bandwidth is not essential for supporting current applications. We also highlighted practical issues in successfully employing centralized routing mechanisms in data centers. Our empirical study is by no means all-encompassing. We recognize that there may be other data centers in the wild that may or may not share all the properties that we have observed. Our work points out that it is worth closely examining the different design and usage patterns, as there are important differences and commonalities.
9. ACKNOWLEDGMENTS
We would like to thank the operators at the various universities, online service providers, and private enterprises for both the time and data that they provided us. We would also like to thank the anonymous reviewers for their insightful feedback.
This work is supported in part by an NSF FIND grant (CNS-0626889), an NSF CAREER Award (CNS-0746531), an NSF NetSE grant (CNS-0905134), and by grants from the University of Wisconsin-Madison Graduate School. Theophilus Benson is supported by an IBM PhD Fellowship.
10. REFERENCES
[1] M. Al-Fares, A. Loukissas, and A. Vahdat. A scalable, commodity data center network architecture. In SIGCOMM, pages 63–74, 2008.
[2] M. Al-Fares, S. Radhakrishnan, B. Raghavan, N. Huang, and A. Vahdat. Hedera: Dynamic flow scheduling for data center networks. In Proceedings of NSDI 2010, San Jose, CA, USA, April 2010.
[3] T. Benson, A. Anand, A. Akella, and M. Zhang. Understanding Data Center Traffic Characteristics. In Proceedings of Sigcomm Workshop: Research on Enterprise Networks, 2009.
[4] T. Benson, A. Anand, A. Akella, and M. Zhang. The case for fine-grained traffic engineering in data centers. In Proceedings of INM/WREN '10, San Jose, CA, USA, April 2010.
[5] M. Casado, M. J. Freedman, J. Pettit, J. Luo, N. McKeown, and S. Shenker. Ethane: taking control of the enterprise. In SIGCOMM, 2007.
[6] J. Dean and S. Ghemawat. MapReduce: simplified data processing on large clusters. Commun. ACM, 51(1):107–113, 2008.
[7] A. B. Downey. Evidence for long-tailed distributions in the Internet. In Proceedings of the ACM SIGCOMM Internet Measurement Workshop, pages 229–241. ACM Press, 2001.
[8] M. Fomenkov, K. Keys, D. Moore, and K. Claffy. Longitudinal study of Internet traffic in 1998–2003. In WISICT '04: Proceedings of the Winter International Symposium on Information and Communication Technologies, pages 1–6. Trinity College Dublin, 2004.
[9] H. J. Fowler and W. E. Leland. Local area network traffic characteristics, with implications for broadband network congestion management. IEEE Journal on Selected Areas in Communications, 9:1139–1149, 1991.
[10] C. Fraleigh, S. Moon, B. Lyles, C. Cotton, M. Khan, D. Moll, R. Rockell, T. Seely, and C. Diot. Packet-level traffic measurements from the Sprint IP backbone. IEEE Network, 17:6–16, 2003.
[11] A. Greenberg, J. R. Hamilton, N. Jain, S. Kandula, C. Kim, P. Lahiri, D. A. Maltz, P. Patel, and S. Sengupta. VL2: a scalable and flexible data center network. In SIGCOMM, 2009.
[12] A. Greenberg, P. Lahiri, D. A. Maltz, P. Patel, and S. Sengupta. Towards a next generation data center architecture: scalability and commoditization. In PRESTO '08: Proceedings of the ACM Workshop on Programmable Routers for Extensible Services of Tomorrow, pages 57–62, New York, NY, USA, 2008. ACM.
[13] C. Guo, G. Lu, D. Li, H. Wu, X. Zhang, Y. Shi, C. Tian, Y. Zhang, and S. Lu. BCube: A High Performance, Server-centric Network Architecture for Modular Data Centers. In Proceedings of the ACM SIGCOMM 2009 Conference on Data Communication, Barcelona, Spain, August 17–21, 2009.
[14] B. Heller, S. Seetharaman, P. Mahadevan, Y. Yiakoumis, P. Sharma, S. Banerjee, and N. McKeown. ElasticTree: Saving energy in data center networks. In Proceedings of NSDI 2010, April 2010.
[15] NOX: An OpenFlow Controller. http://noxrepo.org/wp/.
[16] C. Guo, H. Wu, K. Tan, L. Shi, Y. Zhang, and S. Lu. DCell: a scalable and fault-tolerant network structure for data centers. In SIGCOMM '08: Proceedings of the ACM SIGCOMM 2008 Conference on Data Communication, pages 75–86, New York, NY, USA, 2008. ACM.
[17] W. John and S. Tafvelin. Analysis of Internet backbone traffic and header anomalies observed. In IMC '07: Proceedings of the 7th ACM SIGCOMM Conference on Internet Measurement, pages 111–116, New York, NY, USA, 2007. ACM.
[18] S. Kandula, J. Padhye, and P. Bahl. Flyways to de-congest data center networks. In Proc. ACM HotNets-VIII, New York City, NY, USA, Oct. 2009.
[19] S. Kandula, S. Sengupta, A. Greenberg, P. Patel, and R. Chaiken. The Nature of Data Center Traffic: Measurements and Analysis. In IMC, 2009.
[20] W. E. Leland, M. S. Taqqu, W. Willinger, and D. V. Wilson. On the self-similar nature of Ethernet traffic. In SIGCOMM '93: Conference Proceedings on Communications Architectures, Protocols and Applications, pages 183–193, New York, NY, USA, 1993. ACM.
[21] J. Mudigonda, P. Yalagandula, M. Al-Fares, and J. C. Mogul. SPAIN: COTS data-center Ethernet for multipathing over arbitrary topologies. In Proceedings of NSDI 2010, San Jose, CA, USA, April 2010.
[22] R. Niranjan Mysore, A. Pamboris, N. Farrington, N. Huang, P. Miri, S. Radhakrishnan, V. Subramanya, and A. Vahdat. PortLand: a scalable fault-tolerant layer 2 data center network fabric. In SIGCOMM, 2009.
[23] The OpenFlow Switch Consortium. http://www.openflowswitch.org/.
[24] V. Paxson. Empirically-Derived Analytic Models of Wide-Area TCP Connections. IEEE/ACM Transactions on Networking, 2(4):316–336, Aug. 1994.
[25] V. Paxson. Measurements and analysis of end-to-end Internet dynamics. Technical report, 1997.
[26] V. Paxson. Bro: a system for detecting network intruders in real-time. In SSYM '98: Proceedings of the 7th Conference on USENIX Security Symposium, pages 3–3, Berkeley, CA, USA, 1998. USENIX Association.
[27] V. Paxson and S. Floyd. Wide area traffic: the failure of Poisson modeling. IEEE/ACM Trans. Netw., 3(3):226–244, 1995.
[28] A. Tavakoli, M. Casado, T. Koponen, and S. Shenker. Applying NOX to the datacenter. In Proc. of Workshop on Hot Topics in Networks (HotNets-VIII), 2009.
[29] G. Wang, D. G. Andersen, M. Kaminsky, M. Kozuch, T. S. E. Ng, K. Papagiannaki, M. Glick, and L. Mummert. Your data center is a router: The case for reconfigurable optical circuit switched paths. In Proc. ACM HotNets-VIII, New York City, NY, USA, Oct. 2009.