
A simulation model to implement multiple client class server-client software architecture

1. Introduction

In this chapter we introduce the simulation environment that will be used to apply the
nonlinear control methodologies proposed in this thesis. A simulation environment is vital for
evaluating, validating and comparing the existing control methodologies with the proposed
technique under controlled conditions. This is because a multi-client class system deployed on
physical resources (in other words, a case study or test bed) exhibits variable performance across
multiple runs, even under the same settings and inputs. A known limitation of simulation
environments is that they abstract away some of the system's behavior in exchange for this
consistency. Therefore, the validation in this thesis utilizes the strengths of both simulation and
case-study-based evaluations. In the following sections, we describe the characteristics and the
process of a multi-client class system, followed by the architecture and implementation details of
the simulation environment.

2. Characteristics and requirements of a simulation

The main purpose of the simulation design in this thesis is to represent a model of a multi-client
class system that can be used to generate artificial measurements and to draw conclusions
from those measurements. The general architecture of a multi-client class system for performance
control is illustrated in Figure 1. The workloads from the N client classes are sent
to the shared resource environment, where the Classifier component classifies each request
according to its client class id and places it in the queue specific to that client class. The
scheduler accesses the queues and allocates resources depending on the availability of the shared
resources. It also takes into account the resource allocation decisions made by the management system.

Such systems face variable workloads from multiple client classes competing for the available
resources. An incoming request may invoke different functionalities in the system; therefore
the time period for which a resource is reserved is also variable. In addition, due to various other
characteristics of software systems, such as garbage collection, thread scheduling,
compiler (just-in-time) optimization and memory contention between components, the
resource reservation time may vary even for a given request. Further, multi-client class
systems are hybrid systems, which have a mix of continuous/discrete time and discrete-event
dynamics. For instance, a request arrival and a request completion are discrete events in the
system, while the average response times of the requests are continuous/discrete time variables.

The main requirements of the simulation model from the performance management perspective
are as follows:

1. Simulate multiple (1 to N) client classes accurately.
2. Consistent behavior under the same input settings.
3. Ability to validate the correctness of the simulation model.
4. Accurate measurements of the system outputs (e.g. response times) of each client class.
5. Valid implementation of the resource allocation decisions.
6. Accurate average statistics of the required system parameters.
7. Ability to simulate variable workload rates over the period of simulation.
8. Modifiability, extendibility and scalability.
9. Fast and efficient execution.

Figure 1: Conceptual structure of multi-client class system. [Diagram labels: workloads of N client classes; Classifier; Queue 1, Queue 2, ..., Queue N; Scheduler; shared computing resources; resource allocation decisions; performance measurements; shared resource environment.]

Preprint submitted to Chapter 3 September 5, 2011

3. Simulation environment

One of the main tools available for building a simulation environment is discrete event simulation
[1]. Discrete event simulation is widely used to test and analyze new systems and policies
before they are implemented as production systems. Discrete event simulation environments
can be implemented in general-purpose programming languages (e.g., Java, C#.Net) or with
commercial simulation tools. Accordingly, in this work we build a discrete event simulation
model to simulate a multi-client class environment while achieving the requirements
mentioned in Section 2.

3.1. Brief introduction to discrete event simulation

A discrete event simulation (DES) is defined in [1] as the
"modeling of systems in which the state variables change only at a discrete set of points in time".
A DES model consists of entities (e.g., clients, queues, and resources), attributes, events
(e.g., client arrival and departure), and activities (e.g., operation invocations and statistic collection). At
any given time instance, the DES model holds a snapshot of the system, which is updated based on the
events scheduled to happen at that time instance. Hence, a time advance algorithm keeps track,
in chronological order, of the events that are supposed to take place at each time instance.
These events trigger activities in the system, which may in turn produce new events that need to
be executed at a future time instance or update the state variables of the system. After these
events have taken place, the clock is advanced to the next time instance, and the same process
continues until the simulation end condition is reached. Statistics are gathered during the
simulation or at its end in order to analyze the results.

Generally, a DES model can be designed from an event-oriented or a process-oriented point of view.
In the event-oriented technique, the DES model designer takes the events of the system and how they
affect the state variables of the model as the major concerns. The process-oriented point
of view, on the other hand, models the entities, their processes and how inter-process communication
takes place. Event-oriented designs produce simulations that execute faster than
process-oriented designs; however, modularity, extendibility and understandability are traded off.
Either design technique can be used, although process-oriented
design is popular among commercial simulation products [1]. Further, a DES
can be designed with deterministic or stochastic inputs and variables. For instance, a resource
utilization time of precisely 5 seconds for every request is deterministic, whereas a utilization
time drawn from a probability distribution is stochastic.
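The time advance mechanism described above can be sketched in a few lines of Python. This is only an illustrative, event-oriented sketch under stated assumptions, not the implementation developed in this chapter: the function name `run_des`, the single-server setup, the event labels and all parameters are hypothetical. It combines a stochastic input (exponential inter-arrival times) with a deterministic one (a fixed service time).

```python
import heapq
import random
from collections import deque


def run_des(end_time, arrival_rate, service_time, seed=0):
    """Hypothetical single-server DES; returns (arrival, departure) pairs."""
    rng = random.Random(seed)
    future_events = []   # min-heap ordered by event time
    counter = 0          # tie-breaker so equal-time events stay ordered

    def schedule(time, kind):
        nonlocal counter
        heapq.heappush(future_events, (time, counter, kind))
        counter += 1

    waiting = deque()    # arrival times of queued requests
    in_service = None    # arrival time of the request being served
    completed = []

    schedule(rng.expovariate(arrival_rate), "arrival")
    while future_events:
        # Time advance: jump the clock to the next scheduled event.
        clock, _, kind = heapq.heappop(future_events)
        if clock > end_time:          # simulation end condition
            break
        if kind == "arrival":
            waiting.append(clock)
            # An event may produce a new future event.
            schedule(clock + rng.expovariate(arrival_rate), "arrival")
        else:                         # "departure"
            completed.append((in_service, clock))
            in_service = None
        # State update: start serving the next request if the server is idle.
        if in_service is None and waiting:
            in_service = waiting.popleft()
            schedule(clock + service_time, "departure")
    return completed


completed = run_des(end_time=1000.0, arrival_rate=0.1, service_time=5.0, seed=1)
print(len(completed), "requests completed")
```

The heap plays the role of the chronological list of future events, and the clock only ever takes values at which an event occurs.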

3.2. DES model of a multi-client class system

In this section, we provide implementation details of the simulation environment, which was
developed following the guidelines provided in [1]. We have adopted the process-oriented design
technique because it provides modularity and extendibility and is convenient to implement in
general-purpose object-oriented programming languages such as Java and C#.Net. Further, we use stochastic
inputs and variables in this simulation to represent the variability in multi-client class systems.
The DES model has the following entities (components), corresponding to the architecture
in Figure 1.

MasterClock: This component keeps track of the current time instance of the system and
advances the time after all events and activities specific to the current time instance have taken
place. It triggers events on each tick (the smallest time unit) and each major tick (1000 ticks).

Request: This represents a client request flowing through the simulation model. It has the
properties client class id, start time, end time and processing time. The processing time is
determined by a probability distribution specified by the designer.

ClientClassWorkloadGenerator: This component generates the workload of a specific client
class. At initialization it needs a client class id, a workload script and the corresponding queue
instance. At each tick, the component analyzes the workload script
and generates the number of requests that have to be sent to the system. The
requests are then initialized with the class id and the start time and enqueued in the corresponding
queue. Currently it can simulate deterministic time-varying and stochastic (e.g., Poisson process)
workloads.
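As an illustration of what a workload script could look like, the sketch below builds a per-tick list of request counts. The list-of-counts representation and the function names are assumptions made for this example; the text does not specify the script format. The stochastic variant uses the fact that a Poisson process has exponentially distributed inter-arrival times.

```python
import random


def poisson_workload_script(rate_per_tick, ticks, seed=0):
    """Hypothetical script: entry k is the number of requests arriving in tick k."""
    rng = random.Random(seed)
    counts = [0] * ticks
    t = rng.expovariate(rate_per_tick)       # time of the first arrival
    while t < ticks:
        counts[int(t)] += 1                  # bucket the arrival into its tick
        t += rng.expovariate(rate_per_tick)  # exponential inter-arrival time
    return counts


def deterministic_workload_script(requests_per_tick):
    """Deterministic time-varying workload: a fixed request count for each tick."""
    return [int(n) for n in requests_per_tick]


script = poisson_workload_script(rate_per_tick=0.5, ticks=10_000, seed=1)
print(sum(script) / len(script))  # empirical arrival rate, close to 0.5
```

A generator component would then read `script[k]` at tick `k`, create that many requests and enqueue them.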

Queue: In a multi-client class system there is a queue corresponding to each client class
(see Figure 1). The Queue component is used for this purpose. It is a container of the requests
generated by the ClientClassWorkloadGenerator, ordered in a first-in-first-out fashion.
The simulation model needs N Queue instances to represent N queues.

ResourceUnit: The ResourceUnit entity is an abstraction of a resource unit in a multi-client
class system. It simulates the time period for which a resource is reserved, occupied or provisioned to serve a
request of a client class. Its attributes are the request currently being served, the client class being
served and its status (idle or working). At each tick, this entity simulates the processing time
specified on the request it is serving. When the request has utilized the resource for the specified
period of time, it is stamped with the end time and assumed to be sent back to the client.
In this simulation, a copy of the served request is also sent to the statistical analysis component
to compute measurements such as response times.



Scheduler: The scheduler implements the required resource allocation decisions. For instance,
if the decision is to maintain 15 and 5 resource units for client classes A and B respectively,
this component enforces that allocation until the next decision is made. It has
access to the Queue instance of each client class, the resource units and other state variables. At
each tick it executes the following algorithm for each client class. Let $S_i$ and $i_{util}$ be integer
variables representing the resources allocated to the $i$-th client class and the resources currently
utilized by the $i$-th client class, respectively. The number of resources that can be allocated at this
time instance is calculated as $Dif_i = S_i - i_{util}$. Up to $Dif_i$ requests are taken from the Queue
corresponding to the $i$-th client class, and ResourceUnit instances are initialized with these requests.
The $i_{util}$ variable is updated at the same time. Here, we have chosen a
centralized scheduler instead of making each ResourceUnit responsible for scheduling,
because resource utilization is easier to track and validate with a centralized scheduler than with a distributed
algorithm.
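The per-tick allocation step can be sketched as follows. This is a minimal illustration, not the thesis implementation: the function name `schedule_tick`, the dictionary-based state and the `idle_units` guard (which stops allocation when no free resource unit remains) are assumptions introduced for the example.

```python
from collections import deque


def schedule_tick(queues, allocated, utilized, idle_units):
    """One scheduler pass over all client classes (hypothetical helper).

    queues:     class id -> deque of waiting requests (FIFO)
    allocated:  class id -> S_i, resources allocated to the class
    utilized:   class id -> i_util, resources the class currently occupies
    idle_units: number of free resource units (assumed guard)
    Returns the (class id, request) pairs started in this tick and the
    remaining number of idle units.
    """
    started = []
    for class_id, queue in queues.items():
        dif = allocated[class_id] - utilized[class_id]   # Dif_i = S_i - i_util
        # A non-positive Dif_i means the class already holds its share (or more,
        # in the non-preemptive setting), so nothing is started for it.
        while dif > 0 and queue and idle_units > 0:
            started.append((class_id, queue.popleft()))
            utilized[class_id] += 1
            idle_units -= 1
            dif -= 1
    return started, idle_units


queues = {"A": deque(["a1", "a2", "a3"]), "B": deque(["b1"])}
allocated = {"A": 2, "B": 5}
utilized = {"A": 0, "B": 0}
started, idle = schedule_tick(queues, allocated, utilized, idle_units=7)
print(started, idle)
```

In this usage example class A has three queued requests but only two allocated units, so one request stays queued until a later tick.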

StatisticCalulator: This component computes the measurements required to implement
the control systems. In particular, it calculates the average response time, throughput and
resource utilization of each client class over specified time periods. It holds a list of completed
requests for each client class, which is populated by the ResourceUnit instances after they service the
requests. The designer specifies the time interval over which the statistics are calculated, which we call the
sample instance. The generated statistics report is used by external entities for analysis
and for making runtime decisions. Afterwards, the request lists are cleared so that they can accumulate the
requests completed until the next sample instance. The following equations summarize how some of the
statistics are calculated for client class $i$.

Given the completed request list $List_i$ of client class $i$, the throughput of the system is
$TP_i = Count(List_i)$, i.e., the number of items in the list. The response time of the $j$-th request is
$r_{i,j} = r_{i,j}.endtime - r_{i,j}.starttime$. The total response time of all requests in the list is

$$Tot_i = \sum_{j=0}^{Count(List_i)-1} r_{i,j} \qquad (1)$$

The average response time is calculated as $R_i = Tot_i / TP_i$.
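As a small worked example of these statistics, the sketch below computes $TP_i$, $Tot_i$ and $R_i$ for one client class in one sample period. Representing each completed request as a `(starttime, endtime)` pair and returning `None` for an empty list are assumptions made for the illustration.

```python
def sample_statistics(completed):
    """completed: list of (starttime, endtime) pairs for one client class."""
    tp = len(completed)                                  # TP_i = Count(List_i)
    tot = sum(end - start for start, end in completed)   # Tot_i, equation (1)
    avg = tot / tp if tp else None                       # R_i = Tot_i / TP_i
    return tp, tot, avg


# Three completed requests with response times 4, 6 and 1 ticks.
print(sample_statistics([(0, 4), (2, 8), (5, 6)]))
```

Here the throughput is 3, the total response time is 11 ticks, and the average response time is 11/3 ticks.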

MainProgram: The designer can use this class to implement the required simulation.
Queue instances have to be created according to the number of client classes,
and the required workload scripts have to be specified in the client-class-specific workload
generator objects. In addition, the number of resource units available in the system has to
be specified in the scheduler. Further, the probability distributions used to simulate the resource reservation
time, as well as the sample period, have to be given depending on the simulation objectives.

Assumptions

1. Typically, the resource allocation decisions made by the management system are implemented
at each sample instance. However, some of the resource units may be occupied
by requests that are still being processed at that time instance, so some
classes may hold more resources than they are allocated for that time instance. The
resource allocation decisions can therefore be implemented in two different ways:
preemptively or non-preemptively. In the preemptive setting, the over-utilized
resources are forcefully taken away in order to allocate them to the specified
class. This is a complex policy, which causes jittery behavior in system measurements,
inconsistent states in transactions and additional overhead on the shared resource
system at runtime [2, 3]. In contrast, in the non-preemptive
setting, a resource is taken away only once the request being processed has completed. The
non-preemptive setting is a desirable configuration for shared resource environments [2].
Thus, we have implemented the non-preemptive setting in the scheduler process of the
simulation model. The resulting inaccuracy in decision implementation can be reduced by
selecting processing times that are small compared to the sample period. For instance,
if the service time is in the range of ticks, the sample period can be selected in major ticks. This
means that a decision will be implemented before the next decision is made. In addition,
a large number of requests will be processed during a sample period, so that the error due to
requests left incomplete at the end of a sample period becomes insignificant.

2. End-to-end response time is not a consideration.

3. The time taken to reschedule a resource is assumed to be zero.

4. The overhead of the scheduler and the statistic calculator is zero.

The various simulations designed and executed with this implementation indicate that
it can simulate consistent behavior under the same settings for N client classes. It also provides
accurate measurements of the system outputs, correct implementation of resource allocation
decisions and fast execution. The process-oriented design approach taken in this implementation
provides extendibility and, by delegating responsibilities among entities, modularity. Therefore,
this DES model achieves many of the requirements mentioned in Section 2. What
remains is to verify and validate that this DES model is capable of simulating the complex behavior of a
multi-client class system.

4. Validation of the DES model

After building a simulation, the next major step is to verify and validate the implementation.
In this section we provide three forms of validation using queuing theory. It is noteworthy
that, to apply queuing theory, certain assumptions on the system structure, the arrival workloads
and the statistical distributions of processing times should hold. In the following sections, the required
simulation systems are constructed using the DES model proposed in Section 3, adhering to
these assumptions.

4.1. Conformance to Little's law

One of the fundamental results of queuing theory was developed by John Little in the 1960s
and is used as a basic building block in the development of theories of large-scale queuing
systems. Little's law is defined as follows:

For a queuing system in steady state, if the mean time spent in the system is $W = E(T)$
and the mean arrival rate of customers entering the system is $\lambda$, then the mean number of customers
in the system is given by $E(L) = W \times \lambda$. This result applies to any queuing system and even to
systems within a system. However, the system has to be in steady state, meaning that the arrival rate
should be less than the service rate of the system. Therefore, the simulation model presented in
this chapter can be validated using Little's law.

In order to perform the validation, we constructed a queuing simulation using the constructs introduced
in Section 3. We used two client class workload generators and queues, with 5 resource
units for each client class. We used 18 combinations of stochastic arrival
and service rates, drawn from exponential distributions, to simulate the workloads and processing times of
both client classes. All these combinations were selected so as to keep the system in
steady state. The workload scripts generated from the arrival rates were given to the
ClientClassWorkloadGenerator instance of each client class, and the processing times of the ResourceUnits
were generated from the exponential distribution with the chosen service rate in each experiment.
Each experiment was conducted for 50,000 ticks. The StatisticCalulator instance was used to compute
the final statistics of the experiment, including the average response time, average arrival rates and
average number of customers in the system. In these calculations the system was considered as two subsystems,
each providing services to its corresponding client class. The comparison checks that the sum of the
numbers of clients given by Little's law for the two subsystems equals the total measured number of
clients in the system when both subsystems are considered together. These statistics were an exact
match in all of the experiments. Selected experimental results are summarized
in Table 1, which indicates that the measured number in the system is precisely equal to the value
calculated from Little's law. The same results were observed for the experiments conducted with deterministic
arrival and service rates. Hence, the multi-client class simulations implemented from the DES
model described in Section 3 precisely conform to Little's law. This result also indicates that
all the requests input to the system leave the system. Further, it indicates that the implementation of the DES model,
including the statistical calculations, is correct.

Table 1: A comparison based on Little's law

W_1        W_2        N_1    N_2    L measured   L calculated
54.53763   51.49038   93     104    0.20854      0.20854
46.25325   49.35146   999    993    1.90426      1.90426
28.23506   29.22124   1238   1243   1.42554      1.42554
14.12709   14.39264   1676   1658   0.9508       0.9508
17.09113   17.26543   2030   1944   1.36518      1.36518
14.42547   14.40802   2536   2468   1.44284      1.44284

W_1, W_2: average waiting time of class 1 and class 2. N_1, N_2: total number of customers of class 1 and class 2. L measured: measured average number of customers in the system. L calculated: calculation of Little's law, $W_1 \times N_1 + W_2 \times N_2$; the tabulated values equal this sum divided by the 50,000-tick run length.
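The calculation behind Table 1 can be reproduced with a short script. The sketch below assumes that the arrival rate of each class is estimated as its total number of customers divided by the 50,000-tick run length; this normalization is not spelled out in the text, but it reproduces the tabulated values. The names `ROWS` and `littles_law_total` are introduced only for this illustration.

```python
# Rows of Table 1: (W_1, W_2, N_1, N_2, measured average number in the system)
ROWS = [
    (54.53763, 51.49038, 93, 104, 0.20854),
    (46.25325, 49.35146, 999, 993, 1.90426),
    (28.23506, 29.22124, 1238, 1243, 1.42554),
    (14.12709, 14.39264, 1676, 1658, 0.9508),
    (17.09113, 17.26543, 2030, 1944, 1.36518),
    (14.42547, 14.40802, 2536, 2468, 1.44284),
]
RUN_LENGTH_TICKS = 50_000  # duration of each experiment


def littles_law_total(w1, w2, n1, n2, run_length=RUN_LENGTH_TICKS):
    """Sum of E(L) = W x lambda over the two subsystems."""
    lam1 = n1 / run_length   # assumed estimate of the class 1 arrival rate
    lam2 = n2 / run_length   # assumed estimate of the class 2 arrival rate
    return w1 * lam1 + w2 * lam2


for w1, w2, n1, n2, measured in ROWS:
    calculated = littles_law_total(w1, w2, n1, n2)
    print(f"calculated={calculated:.5f}  measured={measured:.5f}")
```

Each calculated value agrees with the measured column to the five decimal places shown in the table.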

4.2. Conformance to single-server queuing system (M/M/1)

In this section a single-server queuing system is developed and simulated, and the measurements
are then compared to the theoretical results for the M/M/1 queuing system from the literature. The
simulation was implemented with a single resource unit and a single queue. The workload script of a
single client class is generated according to a Poisson arrival process. The resource reservation time
(processing time) is generated according to the exponential distribution. All the components
available in the DES model are used in this implementation as well. The 18 experiments were
conducted with the same arrival and processing time combinations utilized in Section 4.1. Because
the measured results are compared with probabilistic theoretical values, each experiment was
run for 200,000 ticks. As the basis of validation, we compared the measured average number of
customers in the system from the simulations with the expected number of customers calculated
from queuing theoretic results. Let $\lambda$ and $\mu$ represent the mean arrival rate and the mean
service rate, respectively. Theoretically, the expected number of customers in the system is calculated as
follows:

$$L_{theoretical} = \frac{\lambda}{\mu - \lambda} \qquad (2)$$

Given the simulated $\lambda$ and $\mu$, the $L_{theoretical}$ calculated from equation (2) should therefore be
approximately equal to $L_{measured}$ from the simulation. In order to quantify the statistical significance of
the difference, we also conducted a Kolmogorov-Smirnov test using the data of the 18 simulations
conducted under different $\lambda$ and $\mu$. The comparison of the results of some of the experiments is
summarized in Table 2.

Table 2: A comparison based on single-server queuing system (M/M/1)

λ       µ       L theoretical   L measured
0.02    0.1     0.247           0.244
0.025   0.067   0.423           0.418
0.04    0.056   2.622           2.591
0.022   0.03    2.799           2.756
0.02    0.027   2.838           2.801
0.02    0.04    0.979           0.968
0.033   0.067   0.976           0.96

The Kolmogorov-Smirnov test produced a D statistic of 0.11 and a P value of 1.
In a nutshell, a P value of less than 0.05 would indicate a significant difference between the data sets.
Since P = 1 in this case, we conclude that the data sets of $L_{measured}$ and $L_{theoretical}$ have no
significant difference. As a consequence, the single-server queuing system (M/M/1) constructed
from the DES model conforms to the queuing theoretic results. This confirms that the M/M/1 queuing
system implemented using the DES model constructs, which include the scheduling of a single
queue and resource, is correct.
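For readers unfamiliar with the test, the D statistic of a two-sample Kolmogorov-Smirnov test is the largest vertical gap between the empirical distribution functions of the two data sets. The sketch below computes only D, using the seven rows shown in Table 2 rather than all 18 experiments, so it is not expected to reproduce the reported value of 0.11. The function name is hypothetical, and the P value computation is omitted.

```python
import bisect


def ks_d_statistic(sample_a, sample_b):
    """Two-sample KS D: maximum gap between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    # Both empirical CDFs are step functions that only change at data points,
    # so it is enough to compare them at every observed value.
    for x in a + b:
        cdf_a = bisect.bisect_right(a, x) / len(a)
        cdf_b = bisect.bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


# The seven rows listed in Table 2 (a subset of the 18 experiments).
l_theoretical = [0.247, 0.423, 2.622, 2.799, 2.838, 0.979, 0.976]
l_measured = [0.244, 0.418, 2.591, 2.756, 2.801, 0.968, 0.96]
print(ks_d_statistic(l_theoretical, l_measured))
```

A small D means the two empirical distributions nearly coincide; the P value then expresses how likely a gap of that size is if both samples come from the same distribution.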

4.3. Conformance to multi-server queuing system (M/M/c)

In this section we construct a multi-server queuing system serving a single queue. For this
system, the same assumptions used in Section 4.2 are maintained. However, c = 5 resource units
are used to represent 5 servers in the system. 18 experiments were conducted under the same settings
as in Section 4.2 in order to gather measurement data. The same measurement, the average
number of customers in the system, was used for the comparison. The theoretical calculation is
done as follows:

$$p_0 = \left( \frac{r^c}{c!\,(1-\rho)} + \sum_{n=0}^{c-1} \frac{r^n}{n!} \right)^{-1} \qquad (3)$$

$$L_{theoretical} = r + \frac{r^c \rho}{c!\,(1-\rho)^2}\, p_0 \qquad (4)$$

where $r = \lambda/\mu$, $\rho = r/c$, and $c$ is the number of servers (5 for this experiment). The results are summarized
in Table 3.



Table 3: A comparison based on multi-server queuing system (M/M/c)

λ       µ       L theoretical   L measured
0.02    0.1     0.2             0.198
0.025   0.067   0.375           0.367
0.04    0.056   0.72            0.728
0.022   0.03    0.734           0.722
0.02    0.027   0.74            0.739
0.02    0.04    0.5             0.492
0.033   0.067   0.5             0.495
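The theoretical values for the M/M/c system can be computed directly from equations (3) and (4). The sketch below is an illustration with a hypothetical function name; it takes the rounded rates from Table 3 as inputs, whereas the tabulated values were computed from the simulated rates, so small differences in the last digit are possible.

```python
from math import factorial


def mmc_expected_customers(lam, mu, c):
    """Expected number of customers in an M/M/c system, equations (3) and (4)."""
    r = lam / mu
    rho = r / c
    if rho >= 1:
        raise ValueError("steady state requires lam < c * mu")
    # Equation (3): probability that the system is empty.
    p0 = 1.0 / (
        r**c / (factorial(c) * (1 - rho))
        + sum(r**n / factorial(n) for n in range(c))
    )
    # Equation (4): customers in service (r) plus customers waiting in the queue.
    return r + (r**c * rho) / (factorial(c) * (1 - rho) ** 2) * p0


print(round(mmc_expected_customers(0.02, 0.04, 5), 3))   # compare with Table 3
print(round(mmc_expected_customers(0.02, 0.1, 1), 3))    # c = 1 case
```

With $c = 1$ the expression reduces to $\lambda / (\mu - \lambda)$, the M/M/1 result used in Section 4.2, which gives a simple consistency check between the two validations.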

The Kolmogorov-Smirnov test computed a D statistic of 0.16 and a P value of 0.95. As in
the earlier case of the single-server queuing system, this indicates that the data sets of $L_{measured}$ and
$L_{theoretical}$ have no significant difference. Thus, the M/M/c queuing system developed for this
case also conforms to the theoretical results.

With the above three validations, we conclude the theoretical validation of the DES model built for use
in this thesis. The results are not exactly equal to the theoretical results because of
slight numerical inaccuracies in the implementations of the probability distributions. In addition,
multi-client class systems fall under multi-server multi-class queuing systems. Well-known
exact theoretical results are not available so far for such systems, so we limited our validation
to multi-server queuing systems. With this result we can justify that the implementation of the
constructs of the DES model, including the scheduling of multiple resource units and queuing, is valid.

5. Simulation settings<br />

Using the above generalized DES model, we set up a simulation system to apply and validate the nonlinear control-theoretic approaches proposed in this thesis.<br />

5.1. Workload profiles<br />

The workloads that a multi-client class system may face cannot be generalized. The workload a system can manage depends on the capacity of its resources, its management requirements and its performance objectives. The workload profile for a system with CPU as the shared resource may differ from that of a system with concurrent threads as the shared resource. In addition, workloads are time-varying rather than constant over the entire period of operation. This characteristic is not limited to software systems; it applies to other physical systems as well. Consequently, control engineering provides a set of well-established input signals for validating the performance of control systems. They are described below, where W_n denotes the nominal workload that the system receives.<br />

Impulse input signal: Formally, W_impulse(k) = 1 when k = 0 and W_impulse(k) = 0 when k ≠ 0; that is, the impulse input signal raises the workload to some value greater than W_n for a single sample period. In a real workload this corresponds to a workload spike of very short duration. However, such short spikes may not drastically affect the performance attributes (e.g., average response time); consequently, the impulse input signal may not be useful for validating the control systems of software systems.<br />

Step input signal: The step input signal models a sudden jump in the workload from W_n to some value W_step, which then persists for more than a single sample period. This is one of the most widely used input signals for validating the performance of control systems in control engineering. In addition, most applications of feedback control in software systems, including multi-client class systems, have used step workload changes to validate performance and resource management capabilities. This is because a sustained workload change of even a single client class in a multi-client system affects the performance attributes (e.g., response time) under control. As a consequence, the control system is forced to redistribute the available resources among the client classes in order to achieve the required performance objectives. A delayed response to such workload variations may cause large transient responses and temporary instabilities in the system. Therefore, this is a significantly difficult load variation to handle [2, 3, 4, 5].<br />

Ramp input signal: The ramp input linearly increases the workload from W_n to W_ramp over some time interval. This signal models a gradual increase of the workload, in contrast to the instantaneous increase of the step input signal.<br />
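The three input signals can be sketched as workload functions of the sample index k. This is a minimal illustration, not the thesis implementation: the function names, the onset index `k0`, and the peak value `w_peak` are assumptions introduced here, and the impulse is written as a spike on top of the nominal workload W_n rather than as the unit impulse of the formal definition.<br />

```python
def impulse_workload(k: int, w_n: float, w_peak: float, k0: int = 0) -> float:
    """Workload equals w_peak for the single sample k0 and w_n otherwise."""
    return w_peak if k == k0 else w_n


def step_workload(k: int, w_n: float, w_step: float, k0: int = 0) -> float:
    """Workload jumps from w_n to w_step at sample k0 and stays there."""
    return w_step if k >= k0 else w_n


def ramp_workload(k: int, w_n: float, w_ramp: float,
                  k_start: int, k_end: int) -> float:
    """Workload rises linearly from w_n to w_ramp between k_start and k_end."""
    if k <= k_start:
        return w_n
    if k >= k_end:
        return w_ramp
    fraction = (k - k_start) / (k_end - k_start)
    return w_n + fraction * (w_ramp - w_n)


if __name__ == "__main__":
    # Hypothetical profile: nominal 30 req/sec, disturbance up to 60 req/sec.
    for k in range(0, 25, 5):
        print(k,
              impulse_workload(k, 30.0, 60.0, k0=5),
              step_workload(k, 30.0, 60.0, k0=5),
              ramp_workload(k, 30.0, 60.0, k_start=10, k_end=20))
```

Time-varying profiles can then be composed by applying several step changes at different onset samples.<br />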

The main advantage of these input signals is that, given a linear model of a system, well-known design and analysis techniques are available from control theory to compute performance specifications and behavior. Consequently, after constructing a linear model of a system, we can investigate or prove which load variations the system can sustain without becoming unstable. However, a linear model is only an approximation of the system's behavior, so these theoretical evaluations may not be fully accurate. Further, this also applies to systems that exhibit nonlinearities, such as the system under investigation in this thesis. As a consequence, combinations of workload input signals (in particular, step input profiles) applied in a time-varying fashion are used as heuristics to validate and compare the performance of the control systems.<br />

5.2. Total resource amount and resource reservation time distribution<br />

The following settings are used as an abstract representation of the multi-client class system in the simulations and remain the same unless otherwise specified. The total amount of simulated resources is S_total = 30. The processing time of each resource unit is drawn from a uniform distribution as follows:<br />

$$ r(x) = \frac{1}{r_{max} - r_{min}} \quad \text{for } r_{min} \le x \le r_{max} \tag{5} $$<br />

$$ r(x) = 0 \quad \text{for } x < r_{min} \text{ or } x > r_{max} \tag{6} $$<br />

Here, r_min = 100 ticks and r_max = 700 ticks. These settings were chosen to keep the resource allocations among client classes tractable under different experimental conditions. The values of r_min and r_max were selected after careful investigation of the system outputs under different workload conditions: when the system is running close to full capacity, the system output should remain within some bounds, in line with theoretical and practical system behavior. Figure 2 shows a comparison in which 30 resource units are allocated to two client classes with a workload of 30 req/sec for each class. With the selected bounds r_min = 100 and r_max = 700 ticks, the system remains in steady state under the applied resource settings. However, under the same settings with bounds r_min = 100 and r_max = 900 ticks, the steady-state behavior is highly variable and unstable. This is because the variability around the average response time leads to large transient responses in the system. To avoid such behavior, the resource capacity and the workload intensity have to be selected according to the bounds. For the workload rates and resources we selected for evaluation, r_min = 100 ticks and r_max = 700 ticks are suitable bounds.<br />
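The effect of the bounds can be made concrete with the density of Equations (5) and (6) and the standard mean and standard deviation of a uniform distribution. The sketch below is illustrative only; the helper names are assumptions and the sampling call merely stands in for whatever generator the simulation uses.<br />

```python
import random


def uniform_pdf(x: float, r_min: float, r_max: float) -> float:
    """Density of the processing time, Equations (5) and (6)."""
    return 1.0 / (r_max - r_min) if r_min <= x <= r_max else 0.0


def uniform_mean(r_min: float, r_max: float) -> float:
    """Mean of a uniform distribution on [r_min, r_max]."""
    return (r_min + r_max) / 2.0


def uniform_std(r_min: float, r_max: float) -> float:
    """Standard deviation of a uniform distribution on [r_min, r_max]."""
    return (r_max - r_min) / 12.0 ** 0.5


def sample_processing_time(rng: random.Random,
                           r_min: float = 100.0, r_max: float = 700.0) -> float:
    """Draw one processing time in ticks."""
    return rng.uniform(r_min, r_max)


if __name__ == "__main__":
    for r_min, r_max in [(100.0, 700.0), (100.0, 900.0)]:
        print(r_min, r_max, uniform_mean(r_min, r_max),
              round(uniform_std(r_min, r_max), 1))
    print(sample_processing_time(random.Random(1)))
```

Widening the upper bound from 700 to 900 ticks raises the mean processing time from 400 to 500 ticks and the standard deviation from about 173 to about 231 ticks, which is consistent with the less stable behavior reported for the wider bounds.<br />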



Figure 2: System behavior under (a) r_min = 100 and r_max = 700 ticks; (b) r_min = 100 and r_max = 900 ticks. Both panels plot the response times R_1 and R_2 against the sample Id.<br />

Further, selecting uniformly distributed processing times means that all operations invoked in the system are equally likely, so that each invocation receives a fair weight. This choice was made because there is neither evidence nor a generalization available to represent the invocation patterns of the operations and the bounds of their system output (e.g., response time). Similar selections are made in [3, 6].<br />

In addition, 2000 ticks was selected as the sampling period of the statistics calculation process. The sampling period has to be chosen carefully in physical systems. For instance, a small sampling period triggers the statistics calculations frequently, adding overhead to the system. Short sampling intervals also increase the variability of the measured average statistics. In contrast, large sampling periods may delay decisions under sudden changes of the workload or other conditions, leading to instabilities. Therefore, there is a trade-off in the selection of the sampling interval. We selected 2000 ticks after analyzing the workload rates and changes that we will apply to the system. Further, this choice reduces the effect of assumption (1), listed in Section 3.<br />
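One way to picture the periodic statistics calculation is to group completed requests into consecutive 2000-tick windows and average the response times in each window. The sketch below is an assumption about how such a step could look; the function name and the `(completion_tick, response_time)` record format are hypothetical.<br />

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def windowed_mean_response(completions: Iterable[Tuple[float, float]],
                           period: int = 2000) -> Dict[int, float]:
    """Average response time per sampling window.

    completions: (completion_tick, response_time) pairs.
    Returns a mapping from window index to mean response time; windows
    without completed requests are omitted.
    """
    sums: Dict[int, float] = defaultdict(float)
    counts: Dict[int, int] = defaultdict(int)
    for tick, response_time in completions:
        window = int(tick // period)
        sums[window] += response_time
        counts[window] += 1
    return {w: sums[w] / counts[w] for w in sorted(counts)}


if __name__ == "__main__":
    events = [(100, 1.0), (1900, 3.0), (2500, 4.0)]
    print(windowed_mean_response(events))
```

A shorter `period` leaves fewer requests in each window, so each average becomes noisier, while a longer `period` delays the point at which a workload change becomes visible in the statistics.<br />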

6. Simulation vs. Physical system behavior<br />

It is also important to examine how the behavior of the simulated multi-client server system corresponds to that of real physical systems. Many existing works on performance management have analyzed the behavior of their case studies (based on physical systems) under different workload conditions. For instance, such analyses can be found for web servers [7, 8], data centers [4, 9], multi-tenant class systems [10, 11], multi-client class systems [12, 13, 14] and so on. A common experiment is to change the available resources in some order (increasing or decreasing in small steps) and to measure and plot the system output (commonly the response time) under a constant deterministic workload. The common characteristic of all these example physical systems is shown in Figure 3. When the resource share is sufficient to handle the incoming workload, the response time remains at a steady value (or shows low variation); that is, the response time is insensitive to the resource share. However, when the resource share is insufficient, the response time increases at a high rate; that is, the response time is highly sensitive to the resource share. This behavior also depends strongly on the workload settings (see [7, 4, 8, 9] for a detailed analysis).<br />
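The two regions can be illustrated with a simple queuing approximation. The sketch below computes the mean response time of an M/M/c queue for a decreasing number of resource units. It is an illustration under stated assumptions, not a model of the simulated system: it assumes exponential rather than uniform processing times, and the arrival rate of 40 req/sec and service rate of 2.5 req/sec per unit are hypothetical values chosen only to show the shape of the curve.<br />

```python
import math


def erlang_c(lam: float, mu: float, c: int) -> float:
    """Probability that an arriving request has to queue in an M/M/c system."""
    r = lam / mu
    rho = r / c
    if rho >= 1.0:
        return 1.0
    busy_term = r**c / (math.factorial(c) * (1.0 - rho))
    partial_sum = sum(r**n / math.factorial(n) for n in range(c))
    return busy_term / (busy_term + partial_sum)


def mean_response_time(lam: float, mu: float, c: int) -> float:
    """Mean response time (waiting plus service) of an M/M/c system."""
    if lam >= c * mu:
        return math.inf  # insufficient resources: the queue grows without bound
    waiting = erlang_c(lam, mu, c) / (c * mu - lam)
    return waiting + 1.0 / mu


if __name__ == "__main__":
    lam, mu = 40.0, 2.5  # hypothetical rates, see the assumptions above
    for units in range(30, 15, -2):
        print(units, round(mean_response_time(lam, mu, units), 3))
```

With ample resource units the response time stays near the bare service time 1/µ, and it rises steeply as the number of units approaches the offered load λ/µ, which mirrors the insensitive and highly sensitive regions described above.<br />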



Figure 3: Abstract response time behavior against the resource share observed in physical systems. The sketch plots response time (seconds) against resource share and marks two regions: response time is highly sensitive due to lack of resources, and response time is insensitive due to excessive resources.<br />

Figure 4: A comparison of the response time behavior of the simulation environment with the resource allocation under different workload conditions. The plot shows response time against resource amount (14 to 30 units) for workloads of 30, 40, 50, 60, 70 and 80 req/sec.<br />

In this section, we conduct such an experiment with a single client class system using the settings described in Section 5. Here, the average response time of the system is observed while decreasing the available resource units from 30 to 14 under different workload conditions. Figure 4 illustrates the behavior of the system output (response time) with respect to the resource allocation. The common observation across workload conditions is that the response time remains at a steady value as long as the incoming workload can be handled by the available resources. For instance, the response time under the 30 req/sec workload is not affected by the reduction, because 14 resource units are adequate to handle that workload. When we increase the workload rate, the response time starts to increase at a high rate at certain resource allocation levels, moving the system into the highly sensitive region (see Figure 4). Therefore, the behavior of this simulation matches the behavior of real physical systems investigated in the literature (see [7, 4, 8] for similar experimental results).<br />

This experiment also indicates that a workload of 80 req/sec is the maximum capacity of the system for 30 resource units. However, the relationship between resources and capacity is not linear. For instance, a linear relationship would suggest that a 40 req/sec workload can be handled with 15 resources, yet the curve for 40 req/sec shows that 15 resources move the system into the highly sensitive region. As a consequence, when tenants are placed in a shared resource environment, the total capacity with the mixed workload of these tenants is less than the capacity when the system is treated as a single class. For instance, if two tenants are given 15 resource units each, the total workload capacity they can handle is approximately 60 req/sec. Such behavior is also pointed out by Kwok et al. in [10].<br />
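The arithmetic behind this observation can be written out using only the figures quoted above (80 req/sec for 30 units, roughly 60 req/sec in total for two tenants with 15 units each). The helper names below are illustrative assumptions.<br />

```python
def linear_capacity_estimate(units: int, ref_units: int = 30,
                             ref_capacity: float = 80.0) -> float:
    """Capacity a purely linear relationship would predict for `units`."""
    return ref_capacity * units / ref_units


def partition_loss(pooled_capacity: float, partitioned_capacity: float) -> float:
    """Fraction of the pooled capacity lost when resources are partitioned."""
    return 1.0 - partitioned_capacity / pooled_capacity


if __name__ == "__main__":
    print(linear_capacity_estimate(15))   # what linear scaling would suggest per tenant
    print(partition_loss(80.0, 60.0))     # loss for two tenants with 15 units each
```

Linear scaling would suggest 40 req/sec per 15-unit tenant, whereas the reported total of about 60 req/sec for two such tenants corresponds to a loss of about a quarter of the single-class capacity.<br />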

7. Summary<br />

This chapter presented the characteristics, requirements and importance of a simulation environment for representing a multi-client class system. Using the popular discrete event simulation mechanism, we presented an appropriate discrete event simulation model to implement multi-client class systems with different settings. The model implementation was then validated using queuing-theoretic principles. The simulation settings that will be used in the rest of the thesis were also presented. Finally, the behavior of the simulation environment was compared to the behavior of physical systems using case studies available in the literature.<br />

References<br />

[1] J. Banks, J. Carson, B. L. Nelson, D. Nicol, Discrete-Event System Simulation, 4th Edition, Prentice Hall, 2004.<br />

[2] C. Lu, Y. Lu, T. F. Abdelzaher, J. A. Stankovic, S. H. Son, Feedback control architecture and design methodology for service delay guarantees in web servers, IEEE Trans. Parallel Distrib. Syst. (2006) 1014–1027.<br />

[3] C. Lu, Feedback control real-time scheduling, Ph.D. thesis, University of Virginia (2001).<br />

[4] P. Padala, Automated management of virtualized data centers, Ph.D. thesis, University of Michigan (2010).<br />

[5] J. L. Hellerstein, Y. Diao, S. Parekh, D. M. Tilbury, Feedback Control of Computing Systems, John Wiley and Sons, 2004.<br />

[6] C. Lu, J. Stankovic, G. Tao, S. Son, Design and evaluation of a feedback control EDF scheduling algorithm, in: Proceedings of the 20th IEEE Real-Time Systems Symposium, 1999, pp. 56–67.<br />

[7] Z. Wang, X. Zhu, S. Singhal, Utilization vs. SLO-based control for dynamic sizing of resource partitions (2006).<br />

[8] X. Zhu, Z. Wang, S. Singhal, Utility-driven workload management using nested control design, no. HPL-2005-193R1, Hewlett Packard Laboratories, 2006, p. 8.<br />

[9] P. Padala, K.-Y. Hou, K. G. Shin, X. Zhu, M. Uysal, Z. Wang, S. Singhal, A. Merchant, Automated control of multiple virtualized resources (2009).<br />

[10] T. Kwok, A. Mohindra, Resource calculations with constraints, and placement of tenants and instances for multi-tenant SaaS applications, in: Proceedings of the 6th International Conference on Service-Oriented Computing, ICSOC ’08, Springer-Verlag, 2008, pp. 633–648.<br />

[11] Z. H. Wang, C. J. Guo, B. Gao, W. Sun, Z. Zhang, W. H. An, A study and performance evaluation of the multi-tenant data tier design patterns for service oriented computing, in: IEEE International Conference on e-Business Engineering, 2008. ICEBE ’08., 2008, pp. 94–101.<br />

[12] Y. Lu, T. Abdelzaher, C. Lu, L. Sha, X. Liu, Feedback control with queueing-theoretic prediction for relative delay guarantees in web servers (2003).<br />

[13] M. Karlsson, X. Zhu, C. Karamanolis, An adaptive optimal controller for non-intrusive performance differentiation in computing services, in: IEEE Conference on Control and Automation (ICCA), 2005.<br />

[14] M. Litoiu, A performance analysis method for autonomic computing systems, ACM Trans. Auton. Adapt. Syst. 2.
