An Architecture for Dynamic Allocation of Compute Cluster Bandwidth
If the controller can get immediate access to a node by preempting or suspending other jobs, then the queue wait time is no longer relevant. While preemption can be handled with high priorities, and backfilling can allow faster access to the queue for short-running jobs of this nature, it may not provide immediate access. Suspending running jobs can provide immediate access; however, this is a complicated issue and typically requires virtual machines. Since this is not always possible, we use pre-fetching and caching of backend services.

Figure 5: The Effect of Load on Connect Time

We perform the same study using the N-Idle policy and set N=4. Figure 6 shows the results. The connect time for up to four simultaneous clients is low, no matter what the wait time of the queue. As the number of clients connecting exceeds the value of N, the connection time increases similarly to the zero-idle policy, yet remains lower on average due to the immediately available backends servicing up to N requests.

Figure 6: Connection Time for Various Waits

1) Lighter Loads

Many clients connecting at the exact same time constitutes a heavy load on the system. In practice this situation will typically not occur: there will be some spacing of client requests and thus a lower load. In this experiment we measure the effects of the N-idle and zero-idle policies under lighter client loads.

Figure 7: Lighter Loads

Figure 7 shows the results of an experiment in which we run various client loads against both the zero-idle policy and the N=4-idle policy. The graph shows the connection time as a function of the delay between client requests. The zero-idle policy maintains a steady delay that is equal to the backend service start time. The N-idle connection times start off high because the system is overloaded; however, they drop off quickly as the interval between requests increases. Once requests are spaced out enough, each previous request has time to finish before the next starts. Recall that the time it takes to finish a request involves the time through the wait queue as well as the transfer time. With the N-Idle policy, N of the requests can be serviced immediately and thus finish very quickly. Once a request finishes, the backend service's node is returned to the LRM's compute pool. The backend service that was started to replace the consumed backend will not be ready until it works its way through the wait queue.

The point at which the connection time drops significantly in this case is seven seconds between transfers. The zero-idle line shows that the node access time is slightly below 30 seconds. With seven seconds between transfers and four pre-fetched nodes, the system has 7 * 4 = 28 seconds to acquire a node before a new client will be stalled waiting for it. This results in the following formula:

N = queue wait time / time between clients

Network load prediction research can be used here to determine a value for N dynamically as the system runs, based on the expected time between transfer requests.
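To make the relation concrete, the sketch below computes N from a predicted queue wait time and a predicted interval between client requests, rounding up since a fractional backend still means one more client would stall. This is a minimal illustration of the formula above; the function name, its inputs, and the round-up convention are our assumptions, not an interface from the system described here.

```python
import math

def idle_backends_needed(queue_wait_s: float, seconds_between_clients: float) -> int:
    """Estimate N, the number of pre-fetched idle backends needed so that a
    replacement backend clears the LRM wait queue before the idle pool drains.
    Implements: N = queue wait time / time between clients (rounded up)."""
    if seconds_between_clients <= 0:
        raise ValueError("time between client requests must be positive")
    return math.ceil(queue_wait_s / seconds_between_clients)

# Worked example matching the experiment above: a ~28 s queue wait with
# 7 s between transfers gives N = 4 pre-fetched backends.
print(idle_backends_needed(28.0, 7.0))  # -> 4
```

A network load predictor would feed this calculation fresh estimates at runtime, growing or shrinking the idle pool as the expected request rate changes.

Instead of looking at the results to find an ideal value for N, we can alternatively look at the ideal time between transfer requests.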
If client connection requests are intercepted and spaced far enough apart, the overall connection time could improve for short transfer times or for bursts in client transfer requests. This leaves interesting open questions relating to transfer time prediction, which we leave for future work.
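As a rough sketch of what such interception might look like, the class below serializes incoming connection requests and enforces a minimum spacing between them. The class name, the threading approach, and the seven-second interval are illustrative assumptions only; as noted above, choosing the interval well depends on transfer time prediction, which remains open.

```python
import threading
import time

class RequestSpacer:
    """Hypothetical interceptor that delays client connection requests so
    that consecutive requests are at least min_interval_s seconds apart."""

    def __init__(self, min_interval_s: float):
        self.min_interval_s = min_interval_s
        self._lock = threading.Lock()
        self._next_slot = time.monotonic()

    def wait_for_slot(self) -> None:
        # Claim the next free slot under the lock, then sleep (outside the
        # lock) until that slot arrives, so arrivals are evenly spaced.
        with self._lock:
            slot = max(time.monotonic(), self._next_slot)
            self._next_slot = slot + self.min_interval_s
        time.sleep(max(0.0, slot - time.monotonic()))

# Usage: with a ~28 s queue wait and N = 4, spacing requests about 7 s
# apart keeps each new client from stalling on the LRM wait queue.
spacer = RequestSpacer(min_interval_s=7.0)
spacer.wait_for_slot()  # call before forwarding each client connection
```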
