13.07.2013 Views

Network Traffic Characteristics of Data Centers in the Wild - Sigcomm


the applications run in each data center and then drilling down to the applications’ send and receive patterns and their network-level impact. Using packet traces, we first examine the type of applications running in each data center and their relative contribution to network traffic. We then examine the fine-grained sending patterns as captured by data transmission behavior at the packet and flow levels. We examine these patterns both in aggregate and at a per-application level. Finally, we use SNMP traces to examine the network-level impact in terms of link utilization, congestion, and packet drops, and the dependence of these properties on the location of the links in the network topology and on the time of day.
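As a rough illustration of the SNMP-based step, average link utilization can be derived from two successive readings of an interface byte counter. The sketch below is a minimal Python example, not the authors' tooling: the function name `link_utilization`, the default 64-bit counter width, and the sample readings are assumptions made for illustration.

```python
def link_utilization(octets_prev, octets_curr, interval_s, link_bps, counter_bits=64):
    """Return the average utilization (0.0 to 1.0) of a link over one polling interval.

    octets_prev / octets_curr: byte-counter readings at the start and end of the interval.
    interval_s: polling interval in seconds.
    link_bps: link capacity in bits per second.
    counter_bits: assumed width of the counter, used to handle a single wrap-around.
    """
    if interval_s <= 0 or link_bps <= 0:
        raise ValueError("interval_s and link_bps must be positive")
    modulus = 1 << counter_bits
    # Modular subtraction yields the right delta even if the counter wrapped once.
    delta_octets = (octets_curr - octets_prev) % modulus
    return (delta_octets * 8) / (interval_s * link_bps)


# Hypothetical readings taken 300 seconds apart on a 1 Gbps link.
util = link_utilization(1_000_000_000, 4_750_000_000, 300, 1_000_000_000)
print(f"{util:.1%}")  # prints 10.0%
```

Because the result is an average over the whole polling interval, short bursts within the interval are invisible at this granularity, which is one reason packet traces are used alongside SNMP data.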

Our key empirical findings are the following:

• We see a wide variety of applications across the data centers, ranging from customer-facing applications, such as Web services, file stores, authentication services, Line-of-Business applications, and custom enterprise applications, to data-intensive applications, such as MapReduce and search indexing. We find that application placement is non-uniform across racks.

• Most flows in the data centers are small in size (≤ 10 KB), a significant fraction of which last under a few hundreds of milliseconds, and the number of active flows per second is under 10,000 per rack across all data centers.

• Despite the differences in the size and usage of the data centers, traffic originating from a rack in a data center is ON/OFF in nature with properties that fit heavy-tailed distributions.

• In the cloud data centers, a majority of traffic originated by servers (80%) stays within the rack. For the university and private enterprise data centers, most of the traffic (40-90%) leaves the rack and traverses the network’s interconnect.

• Irrespective of the type, in most data centers, link utilizations are rather low in all layers but the core. In the core, we find that a subset of the core links often experience high utilization. Furthermore, the exact number of highly utilized core links varies over time, but never exceeds 25% of the core links in any data center.

• Losses occur within the data centers; however, losses are not localized to links with persistently high utilization. Instead, losses occur at links with low average utilization, implicating momentary spikes as the primary cause of losses. We observe that the magnitude of losses is greater at the aggregation layer than at the edge or the core layers.

• We observe that link utilizations are subject to time-of-day and day-of-week effects across all data centers. However, in many of the cloud data centers, the variations are nearly an order of magnitude more pronounced at core links than at edge and aggregation links.
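The ON/OFF finding in the list above can be made concrete with a small sketch that splits a packet arrival trace into ON and OFF periods. This is an illustrative Python example under stated assumptions, not the paper's exact procedure: the function name `on_off_periods`, the fixed gap threshold, and the sample timestamps are all hypothetical.

```python
def on_off_periods(timestamps, gap_threshold):
    """Split sorted packet arrival times into ON and OFF period lengths.

    Consecutive packets whose inter-arrival gap is at most gap_threshold are
    treated as belonging to the same ON period; any larger gap is recorded as
    an OFF period. Returns (on_periods, off_periods) as two lists.
    """
    on_periods, off_periods = [], []
    if not timestamps:
        return on_periods, off_periods
    on_start = prev = timestamps[0]
    for t in timestamps[1:]:
        gap = t - prev
        if gap > gap_threshold:
            on_periods.append(prev - on_start)  # close the current ON period
            off_periods.append(gap)             # the long gap is an OFF period
            on_start = t                        # a new ON period starts here
        prev = t
    on_periods.append(prev - on_start)          # close the final ON period
    return on_periods, off_periods


# Hypothetical arrival times in milliseconds, with an assumed 10 ms threshold.
arrivals_ms = [0, 1, 2, 3, 50, 51, 52, 200, 201]
on, off = on_off_periods(arrivals_ms, gap_threshold=10)
print(on, off)  # prints [3, 2, 1] [47, 148]
```

The resulting period lengths could then be compared against candidate heavy-tailed distributions; the choice of threshold strongly affects the result, so it should be treated as a tunable assumption.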

To highlight the implications of our observations, we conclude the paper with an analysis of two data center network design issues that have received a lot of recent attention, namely, network bisection bandwidth and the use of centralized management techniques.

• Bisection Bandwidth: Recent data center network proposals have argued that data centers need high bisection bandwidth to support demanding applications. Our measurements show that only a fraction of the existing bisection capacity is likely to be utilized within a given time interval in all the data centers, even in the “worst case” where application instances are spread across racks rather than confined within a rack. This is true even for MapReduce data centers that see relatively higher utilization. From this, we conclude that load balancing mechanisms for spreading traffic across the existing links in the network’s core can help manage occasional congestion, given the current applications used.

• Centralized Management: A few recent proposals [2, 14] have argued for centrally managing and scheduling network-wide transmissions to more effectively engineer data center traffic. Our measurements show that centralized approaches must employ parallelism and fast route computation heuristics to scale to the size of data centers today while supporting the application traffic patterns we observe in the data centers.

Data Center Study        Type of Data Center                Type of Apps                               # of DCs Measured
Fat-tree [1]             Cloud                              MapReduce                                  0
Hedera [2]               Cloud                              MapReduce                                  0
PortLand [22]            Cloud                              MapReduce                                  0
BCube [13]               Cloud                              MapReduce                                  0
DCell [16]               Cloud                              MapReduce                                  0
VL2 [11]                 Cloud                              MapReduce                                  1
MicroTE [4]              Cloud                              MapReduce                                  1
Flyways [18]             Cloud                              MapReduce                                  1
Optical switching [29]   Cloud                              MapReduce                                  1
ECMP study 1 [19]        Cloud                              MapReduce                                  1
ECMP study 2 [3]         Cloud                              MapReduce, Web Services                    19
ElasticTree [14]         Any                                Web Services                               1
SPAIN [21]               Any                                Any                                        0
Our work                 Cloud, Private Net, Universities   MapReduce, Web services, Distributed FS    10

Table 1: Comparison of prior data center studies, including type of data center and application.
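The bisection-bandwidth argument can be illustrated with a back-of-the-envelope calculation: compare the traffic that would cross the bisection if all rack traffic left the rack against the capacity available there. The following Python sketch is a simplification under stated assumptions, not the paper's analysis or data; the function name `bisection_utilization` and all numbers are hypothetical.

```python
def bisection_utilization(rack_traffic_bps, bisection_links, link_capacity_bps):
    """Fraction of bisection capacity consumed in a simplified worst case.

    Assumes every bit sent by every rack crosses the bisection, and that the
    bisection consists of `bisection_links` links of equal capacity.
    """
    capacity = bisection_links * link_capacity_bps
    if capacity <= 0:
        raise ValueError("bisection capacity must be positive")
    return sum(rack_traffic_bps) / capacity


# Hypothetical data center: 20 racks each sending 400 Mbps,
# with 4 bisection links of 10 Gbps each.
racks = [400_000_000] * 20
frac = bisection_utilization(racks, bisection_links=4, link_capacity_bps=10_000_000_000)
print(f"{frac:.0%}")  # prints 20%
```

With these made-up inputs the bisection would be only one-fifth utilized even when no traffic stays inside a rack, which is the shape of the argument made in the Bisection Bandwidth item, though the real figures depend on the measured traces.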

The rest of the paper is structured as follows: we present related work in Section 2 and in Section 3 describe the data centers studied, their high-level design, and typical uses. In Section 4, we describe the applications running in these data centers. In Section 5, we zoom into the microscopic properties of the various data centers. In Section 6, we examine the flow of traffic within data centers and the utilization of links across the various layers. We discuss the implications of our empirical insights in Section 7, and we summarize our findings in Section 8.

2. RELATED WORK

There is tremendous interest in designing improved networks for data centers [1, 2, 22, 13, 16, 11, 4, 18, 29, 14, 21]; however, such work and its evaluation are driven by only a few studies of data center traffic, and those studies are solely of huge (>10K server) data centers, primarily running data mining, MapReduce jobs, or Web services. Table 1 summarizes the prior studies. From Table 1, we observe that many of the data center architectures are evaluated without empirical data from data centers. For the architectures evaluated with empirical data, we find that these evaluations are performed with traces from cloud data centers. These observations imply that the actual performance of these techniques under the various types of realistic data centers found in the wild (such as enterprise and university data centers) is unknown. This motivates us to conduct a broad study on the characteristics of data centers. Such a study will inform the design and evaluation of current and future data center techniques.
