Network Traffic Characteristics of Data Centers in the Wild - Sigcomm
the applications run in each data center and then drilling down to
the applications' send and receive patterns and their network-level
impact. Using packet traces, we first examine the type of applications
running in each data center and their relative contribution to network
traffic. We then examine the fine-grained sending patterns as captured
by data transmission behavior at the packet and flow levels. We examine
these patterns both in aggregate and at a per-application level.
Finally, we use SNMP traces to examine the network-level impact in
terms of link utilization, congestion, and packet drops, and the
dependence of these properties on the location of the links in the
network topology and on the time of day.
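As a rough illustration of the SNMP-based step, the sketch below turns two successive polls of an interface byte counter into an average link utilization and then reports the fraction of links that are highly utilized in one interval. It is a minimal sketch under stated assumptions, not the authors' tooling: the function names, the counter width, and the 0.7 "hot link" threshold are all hypothetical choices made for the example.

```python
def link_utilization(prev_octets, curr_octets, interval_s, capacity_bps,
                     counter_bits=64):
    """Average utilization (0..1) of a link between two SNMP polls.

    prev_octets / curr_octets are readings of a byte counter taken
    interval_s seconds apart. counter_bits is an assumption about the
    counter width; the modulo tolerates a single counter wrap.
    """
    modulus = 1 << counter_bits
    delta_octets = (curr_octets - prev_octets) % modulus
    return (delta_octets * 8) / (interval_s * capacity_bps)


def hot_link_fraction(utilizations, threshold=0.7):
    """Fraction of links whose utilization is at or above threshold.

    The 0.7 default is an illustrative assumption, not a value taken
    from the text above.
    """
    if not utilizations:
        return 0.0
    hot = sum(1 for u in utilizations if u >= threshold)
    return hot / len(utilizations)
```

Because each value is an average over the polling interval, a link can look lightly loaded in this view and still drop packets during short bursts, so utilization computed this way is best read together with the drop counters.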
Our key empirical findings are the following:
• We see a wide variety of applications across the data centers,
ranging from customer-facing applications, such as Web services, file
stores, authentication services, Line-of-Business applications, and
custom enterprise applications to data intensive applications, such as
MapReduce and search indexing. We find that application placement is
non-uniform across racks.
• Most flows in the data centers are small in size (≤ 10KB),
a significant fraction of which last under a few hundreds of
milliseconds, and the number of active flows per second is
under 10,000 per rack across all data centers.
• Despite the differences in the size and usage of the data centers,
traffic originating from a rack in a data center is ON/OFF
in nature with properties that fit heavy-tailed distributions.
• In the cloud data centers, a majority of traffic originated by
servers (80%) stays within the rack. For the university and
private enterprise data centers, most of the traffic (40-90%)
leaves the rack and traverses the network's interconnect.
• Irrespective of the type, in most data centers, link utilizations
are rather low in all layers but the core. In the core, we find
that a subset of the core links often experience high utilization.
Furthermore, the exact number of highly utilized core
links varies over time, but never exceeds 25% of the core
links in any data center.
• Losses occur within the data centers; however, losses are not
localized to links with persistently high utilization. Instead,
losses occur at links with low average utilization, implicating
momentary spikes as the primary cause of losses. We
observe that the magnitude of losses is greater at the aggregation
layer than at the edge or the core layers.
• We observe that link utilizations are subject to time-of-day
and day-of-week effects across all data centers. However, in
many of the cloud data centers, the variations are nearly an
order of magnitude more pronounced at core links than at
edge and aggregation links.
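To make the ON/OFF finding above concrete, the sketch below splits a sorted list of packet arrival times into ON periods (bursts of closely spaced packets) and the OFF periods (idle gaps) between them. It is a minimal sketch, not the authors' procedure: the function name and the `gap_threshold` parameter are hypothetical, and how the threshold should be chosen is left as an assumption.

```python
def on_off_periods(timestamps, gap_threshold):
    """Split sorted packet arrival times into ON and OFF durations.

    An inter-arrival gap larger than gap_threshold (an assumed,
    caller-chosen value) ends the current ON period and is recorded
    as an OFF period. Returns (on_durations, off_durations).
    """
    on, off = [], []
    if not timestamps:
        return on, off
    start = prev = timestamps[0]
    for t in timestamps[1:]:
        if t - prev > gap_threshold:
            on.append(prev - start)   # burst that just ended
            off.append(t - prev)      # idle gap before the next burst
            start = t
        prev = t
    on.append(prev - start)           # close the final burst
    return on, off
```

For example, `on_off_periods([0, 1, 2, 10, 11, 30], 5)` yields ON durations `[2, 1, 0]` and OFF durations `[8, 19]`. The resulting duration lists are what one would then compare against candidate heavy-tailed distributions.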
To highlight the implications of our observations, we conclude
the paper with an analysis of two data center network design issues
that have received a lot of recent attention, namely, network
bisection bandwidth and the use of centralized management techniques.
• Bisection Bandwidth: Recent data center network proposals
have argued that data centers need high bisection bandwidth
to support demanding applications. Our measurements show
that only a fraction of the existing bisection capacity is likely
to be utilized within a given time interval in all the data
centers, even in the "worst case" where application instances are
Data Center Study        Type of Data Center   Type of Apps      # of DCs Measured
Fat-tree [1]             Cloud                 MapReduce         0
Hedera [2]               Cloud                 MapReduce         0
Portland [22]            Cloud                 MapReduce         0
BCube [13]               Cloud                 MapReduce         0
DCell [16]               Cloud                 MapReduce         0
VL2 [11]                 Cloud                 MapReduce         1
MicroTE [4]              Cloud                 MapReduce         1
Flyways [18]             Cloud                 MapReduce         1
Optical switching [29]   Cloud                 MapReduce         1
ECMP. study 1 [19]       Cloud                 MapReduce         1
ECMP. study 2 [3]        Cloud                 MapReduce,        19
                                               Web Services
ElasticTree [14]         Any                   Web Services      1
SPAIN [21]               Any                   Any               0
Our work                 Cloud,                MapReduce,        10
                         Private Net,          Web services,
                         Universities          Distributed FS
Table 1: Comparison of prior data center studies, including
type of data center and application.
spread across racks rather than confined within a rack. This
is true even for MapReduce data centers that see relatively
higher utilization. From this, we conclude that load balancing
mechanisms for spreading traffic across the existing links in the
network's core can help manage occasional congestion, given the
current applications used.
• Centralized Management: A few recent proposals [2,
14] have argued for centrally managing and scheduling network-wide
transmissions to more effectively engineer data center
traffic. Our measurements show that centralized approaches
must employ parallelism and fast route computation heuristics
to scale to the size of data centers today while supporting
the application traffic patterns we observe in the data centers.
The rest of the paper is structured as follows: we present related
work in Section 2 and in Section 3 describe the data centers studied,
their high-level design, and typical uses. In Section 4, we describe
the applications running in these data centers. In Section 5, we
zoom into the microscopic properties of the various data centers.
In Section 6, we examine the flow of traffic within data centers and
the utilization of links across the various layers. We discuss the
implications of our empirical insights in Section 7, and we summarize
our findings in Section 8.
2. RELATED WORK
There is tremendous interest in designing improved networks for
data centers [1, 2, 22, 13, 16, 11, 4, 18, 29, 14, 21]; however, such
work and its evaluation are driven by only a few studies of data center
traffic, and those studies are solely of huge (>10K server) data
centers, primarily running data mining, MapReduce jobs, or Web
services. Table 1 summarizes the prior studies. From Table 1, we
observe that many of the data center architectures are evaluated without
empirical data from data centers. For the architectures evaluated
with empirical data, we find that these evaluations are performed
with traces from cloud data centers. These observations imply that
the actual performance of these techniques under various types of
realistic data centers found in the wild (such as enterprise and
university data centers) is unknown. This motivates us
to conduct a broad study of the characteristics of data centers. Such
a study will inform the design and evaluation of current and future
data center techniques.