A simulation model to implement multiple client class<br />
server-client software architecture<br />
1. Introduction<br />
In this chapter we introduce the simulation environment that will be used to apply the<br />
nonlinear control methodologies proposed in this thesis. A simulation environment is vital to<br />
evaluate, validate and compare the existing control methodologies with the proposed<br />
technique under controlled conditions, because a multi-client class system deployed on<br />
physical resources (in other words, a case study or test bed) shows variable performance across<br />
multiple runs even under the same settings and inputs. A known limitation of simulation environments<br />
is that they abstract away some of the system behavior in exchange for this consistency.<br />
Therefore, the validation in this thesis combines the strengths of both simulation and case-study<br />
based evaluations. In the following sections, we describe the characteristics and the<br />
process of a multi-client class system, followed by the architecture and implementation details of<br />
the simulation environment.<br />
2. Characteristics and requirements of a simulation<br />
The main purpose of the simulation design in this thesis is to represent a model of a multi-client<br />
class system, which can be used to generate artificial measurements and draw conclusions<br />
from those measurements. The general architecture of a multi-client class system for performance<br />
control is illustrated in Figure 1. The workloads from the N client classes are sent<br />
to the shared resource environment, where the Classifier component classifies each request according to its client class id and<br />
places it in the queue specific to that client class. The scheduler accesses the<br />
queues and allocates resources depending on the availability of the shared resources. It also<br />
takes into account the resource allocation decisions made by the management system.<br />
Such systems face variable workloads from multiple client classes competing for the available<br />
resources. An incoming request may invoke different functionalities in the system; therefore<br />
the time period for which a resource is reserved is also variable. In addition, because of other<br />
characteristics of software systems, such as garbage collection, thread scheduling,<br />
compiler (just-in-time) optimization and memory contention between components, the<br />
resource reservation time may vary even for a given request. Further, multi-client class<br />
systems are hybrid systems, which have a mix of continuous/discrete time and discrete event<br />
based dynamics. For instance, a request arrival and a request completion are discrete events in the<br />
system, while the average response times of the requests are continuous/discrete time variables.<br />
The main requirements of the simulation model from the performance management perspective<br />
are as follows:<br />
1. Simulate multiple (1 to N) client classes accurately.<br />
2. Consistent behavior under the same input settings.<br />
3. Ability to validate the correctness of the simulation model.<br />
4. Accurate measurements of the system outputs (e.g. response times) of each client class.<br />
5. Valid implementation of the resource allocation decisions.<br />
6. Accurate average statistics of the required system parameters.<br />
7. Ability to simulate variable workload rates over the period of simulation.<br />
8. Modifiability, extendibility and scalability.<br />
9. Fast and efficient execution.<br />
Preprint submitted to Chapter 3 September 5, 2011
[Figure 1 labels: Workloads of N client classes; Classifier; Queue 1, Queue 2, ..., Queue N; Scheduler; Resource allocation decisions; Shared computing resources; Performance measurements; Shared resource environment]
Figure 1: Conceptual structure of multi-client class system
3. Simulation environment<br />
One of the main tools available for building a simulation environment is discrete event simulation<br />
[1]. Discrete event simulation is widely used to test and analyze new systems and policies<br />
before they are implemented in a production system. Discrete event simulation environments<br />
can be implemented with general purpose programming languages (e.g., Java, C#.Net) or<br />
commercial simulation tools. In this work we therefore build a discrete event simulation<br />
model of a multi-client class environment that meets the requirements<br />
listed in Section 2.<br />
3.1. Brief introduction to discrete event simulation<br />
A discrete event simulation (DES) is defined in [1] as<br />
“Modeling of systems in which the state variables change only at a discrete set of points in time”.<br />
A DES model consists of entities (e.g., clients, queues, and resources), attributes, events<br />
(e.g., client arrival and departure), and activities (operation invocations, statistic collection). At<br />
a given time instance, the DES model holds a snapshot of the system, which is updated based on the<br />
events that are scheduled to happen at that time instance. A time advance algorithm<br />
keeps track, in chronological order, of the events that are supposed to take place at each time instance.<br />
These events trigger activities in the system, which may in turn update the state variables of the system or<br />
produce new events that need to be executed at a future time instance. After these<br />
events have taken place, the clock is advanced to the next time instance and the same process<br />
continues until the simulation end condition is reached. During the simulation or at its end,<br />
statistics are gathered to analyze the results. Generally, a DES<br />
model can be designed from an event-oriented or a process-oriented point of view. In the event-oriented<br />
technique, the DES model designer treats the events of the system, and how they affect the<br />
state variables of the model, as the major concerns. The process-oriented point<br />
of view, on the other hand, models the entities, their processes and how the inter-process communication<br />
takes place. The event-oriented design produces simulations that execute faster than those of<br />
the process-oriented design, at the cost of modularity, extendibility and understandability of the<br />
system. Both of these design techniques can be used; however, process-oriented<br />
design is popular among the commercial simulation products [1]. Further, a DES model<br />
can be designed with deterministic or stochastic inputs and variables. For instance, a resource<br />
utilization time of precisely 5 seconds for every request is deterministic, whereas a utilization<br />
time drawn from a probability distribution is stochastic.<br />
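The time advance mechanism described above can be sketched as a loop over a priority queue of timestamped events. The following is a minimal event-oriented illustration in Python, not the thesis implementation (which is process-oriented); the function `run_des`, the helper `schedule`, the single-server setup and the deterministic inter-arrival and service times are all assumptions chosen to keep the example small and reproducible.

```python
import heapq
from itertools import count

def run_des(end_time, interarrival=4, service=3):
    """Event-oriented DES of one server with deterministic timing (illustrative)."""
    events = []        # priority queue ordered by (time, sequence number)
    seq = count()      # tie-breaker so simultaneous events keep insertion order
    state = {"clock": 0, "in_system": 0, "busy": False, "served": 0}

    def schedule(time, kind):
        heapq.heappush(events, (time, next(seq), kind))

    schedule(0, "arrival")
    while events:
        time, _, kind = heapq.heappop(events)
        if time > end_time:                # simulation end condition
            break
        state["clock"] = time              # advance the clock to the next event
        if kind == "arrival":
            state["in_system"] += 1
            schedule(time + interarrival, "arrival")    # the next arrival
            if not state["busy"]:
                state["busy"] = True
                schedule(time + service, "departure")
        else:                              # departure
            state["in_system"] -= 1
            state["served"] += 1
            if state["in_system"] > 0:     # start serving the next waiting request
                schedule(time + service, "departure")
            else:
                state["busy"] = False
    return state

result = run_des(end_time=20)
print(result)
```

Each handled event either updates the state variables or schedules a future event, which mirrors the description of the DES cycle in the text.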
3.2. DES model of a multi-client class system<br />
In this section, we provide implementation details of the simulation environment, which was developed<br />
following the guidelines provided in [1]. We have adopted the process-oriented design<br />
technique because it provides modularity and extendibility, and is convenient to design with general<br />
purpose object-oriented programming languages like Java and C#.Net. Further, we use stochastic<br />
inputs and variables in this simulation to represent the variability in multi-client class systems.<br />
The DES model has the following entities (components) in its architecture,<br />
corresponding to the conceptual architecture of Figure 1.<br />
MasterClock: This component keeps track of the current time instance of the system and<br />
advances the time after all events and activities specific to the current time instance have taken<br />
place. It triggers events on each tick (the smallest time unit) and on each major tick (1000 ticks).<br />
Request: This represents a client request flowing through the simulation model. It has the<br />
properties client class id, start time, end time and processing time. The processing time is<br />
drawn from a probability distribution specified by the designer.<br />
ClientClassWorkloadGenerator: This component generates the workload of a specific client<br />
class. At initialization it needs a client class id, a workload script and the corresponding queue instance.<br />
At each tick, the component reads the workload script<br />
and generates the required number of requests to be sent to the system. The<br />
requests are then initialized with the class id and the start time and enqueued in the corresponding<br />
queue. Currently it can simulate deterministic time-varying and stochastic (e.g., Poisson process)<br />
workloads.<br />
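As an illustration of a stochastic workload, the sketch below draws the number of requests generated at each tick from a Poisson distribution. It is a hedged example rather than the thesis code: the representation of a workload script as a list of per-tick mean arrival rates, and the names `poisson_sample` and `generate_workload`, are assumptions. The sampler uses Knuth's multiplication method because the Python standard library has no Poisson generator.

```python
import math
import random

def poisson_sample(rate, rng):
    """Draw a Poisson-distributed count with mean `rate` (Knuth's method)."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def generate_workload(script, rng):
    """Return the number of requests to create at each tick.

    `script` is a list of per-tick mean arrival rates, a hypothetical
    stand-in for the workload script mentioned in the text.
    """
    return [poisson_sample(rate, rng) for rate in script]

rng = random.Random(42)                   # fixed seed: same inputs give the same run
script = [0.5] * 1000 + [2.0] * 1000      # the arrival rate changes halfway through
arrivals = generate_workload(script, rng)
print(sum(arrivals[:1000]) / 1000, sum(arrivals[1000:]) / 1000)
```

Seeding the random generator is one way to obtain consistent behavior under the same input settings, which is the second requirement listed in Section 2.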
Queue: In a multi-client class system there is a queue for each client class<br />
(see Figure 1). The Queue component serves this purpose. It is a container for the requests<br />
generated by the ClientClassWorkloadGenerator, ordered in a first-come-first-served fashion.<br />
The simulation model needs N Queue instances to represent N queues.<br />
ResourceUnit: The ResourceUnit entity is an abstraction of a resource unit in a multi-client<br />
class system. It simulates the time period for which a resource is reserved/occupied/provisioned to serve a<br />
request of a client class. Its attributes are the request currently being served, the client class being served and the status (idle or<br />
working). At each tick, this entity simulates the processing time<br />
specified on the request it is serving. When the request has used the resource for the specified<br />
period of time, its end time is stamped and it is assumed to be sent back to the client. In<br />
this simulation, a copy of the served request is also sent to the statistical analysis component<br />
to compute measurements such as response times.<br />
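The tick-driven behavior of a request and a resource unit can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the thesis classes: the field names, the `assign` and `on_tick` methods, and the plain list standing in for the statistical analysis component are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Request:
    class_id: int
    start_time: int
    processing_time: int              # in ticks; drawn from a distribution in the thesis
    end_time: Optional[int] = None

class ResourceUnit:
    def __init__(self, completed: List[Request]):
        self.current: Optional[Request] = None
        self.remaining = 0
        self.completed = completed    # stands in for the statistics component

    @property
    def idle(self) -> bool:
        return self.current is None

    def assign(self, request: Request) -> None:
        self.current = request
        self.remaining = request.processing_time

    def on_tick(self, now: int) -> None:
        """Consume one tick of processing; stamp and release the request when done."""
        if self.current is None:
            return
        self.remaining -= 1
        if self.remaining <= 0:
            self.current.end_time = now
            self.completed.append(self.current)
            self.current = None

completed: List[Request] = []
unit = ResourceUnit(completed)
unit.assign(Request(class_id=1, start_time=0, processing_time=3))
for tick in range(1, 6):
    unit.on_tick(tick)
print(completed)
```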
Scheduler: The scheduler implements the required resource allocation decisions. For instance,<br />
if the decision is to maintain 15 and 5 resource units for client classes A and B respectively,<br />
this component enforces that allocation until the next decision is made. It has<br />
access to the Queue instance of each client class, the resource units and other state variables. In<br />
each tick it executes the following algorithm for each client class. Let $S_i$ and $i_{util}$ be integer<br />
variables representing the resources allocated to the i-th client class and the resources currently utilized<br />
by the i-th client class, respectively. The number of resources that can be allocated in this<br />
time instance is $Dif_i = S_i - i_{util}$. The scheduler takes $Dif_i$ requests from the Queue corresponding<br />
to the i-th client class and initializes ResourceUnit instances with these requests. The<br />
$i_{util}$ variable is updated at the same time. We have made the design decision to use a<br />
centralized scheduler, instead of giving each ResourceUnit instance the responsibility of scheduling,<br />
because it is easier to track and validate the resource utilization than with a distributed<br />
algorithm.<br />
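The per-tick scheduling step can be sketched as below. This is an illustrative Python version under assumptions: the function name `schedule_tick`, the dictionary-based bookkeeping and the string placeholders for requests are hypothetical, and the handling of a negative difference (leaving running requests untouched) reflects the non-preemptive setting described in the assumptions of this section.

```python
from collections import deque

def schedule_tick(queues, allocated, utilized):
    """One scheduler pass: for each class i, dispatch up to S_i - util_i requests.

    queues    -- dict: class id -> deque of waiting requests (FIFO)
    allocated -- dict: class id -> S_i, resource units granted to the class
    utilized  -- dict: class id -> units currently busy for the class (updated in place)
    Returns the list of (class id, request) pairs dispatched in this tick.
    """
    dispatched = []
    for class_id, queue in queues.items():
        free = allocated[class_id] - utilized[class_id]      # Dif_i
        # free may be negative right after an allocation is reduced;
        # running requests are then left to finish (non-preemptive).
        for _ in range(max(free, 0)):
            if not queue:
                break
            dispatched.append((class_id, queue.popleft()))
            utilized[class_id] += 1
    return dispatched

queues = {"A": deque(["a1", "a2", "a3"]), "B": deque(["b1"])}
allocated = {"A": 15, "B": 5}
utilized = {"A": 13, "B": 5}
started = schedule_tick(queues, allocated, utilized)
print(started, utilized)
```

With 13 of 15 units busy for class A, two requests are dispatched; class B already uses its 5 units, so its request keeps waiting.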
StatisticCalulator: This is the component that computes the measurements required to implement<br />
the control systems. In particular, it calculates the average response time, throughput and<br />
resource utilization of each client class over specified time periods. It keeps a list of completed<br />
requests for each client class, which is populated by the ResourceUnit instances after they have serviced the<br />
requests. The designer specifies the time interval at which the statistics are calculated, which we call the<br />
sample instance. The generated statistics report is used by external entities for analysis<br />
and runtime decision making. Afterwards, the request lists are cleared to accumulate the requests<br />
completed until the next sample instance. The following equations summarize how some of the<br />
statistics are calculated for client class i.<br />
Given the list of completed requests $List_i$ of client class i, the throughput is<br />
$TP_i = Count(List_i)$, i.e., the number of items in the list. The response time of the j-th request is<br />
$r_{i,j} = r_{i,j}.endtime - r_{i,j}.starttime$.<br />
The total response time of all requests in the list is<br />
$Tot_i = \sum_{j=0}^{Count(List_i)-1} r_{i,j}$ (1)<br />
The average response time is calculated by $R_i = Tot_i / TP_i$.<br />
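The statistics above can be computed in a few lines. The sketch below is illustrative only: the function name `sample_statistics` and the representation of a completed request as a (start time, end time) pair are assumptions, and returning 0.0 for an empty list is a choice made here to avoid a division by zero, not something stated in the text.

```python
def sample_statistics(completed):
    """Throughput, total and average response time for one client class.

    `completed` is a list of (start_time, end_time) pairs for the requests
    finished during the sample period.
    """
    throughput = len(completed)                               # TP_i
    total = sum(end - start for start, end in completed)      # Tot_i, equation (1)
    average = total / throughput if throughput else 0.0       # R_i
    return throughput, total, average

tp, tot, avg = sample_statistics([(0, 4), (2, 9), (5, 6)])
print(tp, tot, avg)
```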
MainProgram: The designer can use this class to implement the required simulation depending<br />
on the requirements. Queue instances have to be created according to the number of client classes,<br />
and the required workload scripts have to be specified in the client class specific workload<br />
generator objects. In addition, the number of resource units available in the system has to<br />
be specified in the scheduler. Further, the probability distributions used to simulate the resource reservation<br />
time and the sample period have to be given depending on the simulation objectives.<br />
Assumptions<br />
1. Typically, the resource allocation decisions made by the management system are implemented<br />
at each sample instance. However, some of the resource units may be occupied<br />
by requests that are being processed at that time instance. As a result, some<br />
classes may hold more resources than they are allocated for that time instance. The<br />
resource allocation decisions can therefore be implemented in two different ways:<br />
preemptive and non-preemptive. In the preemptive setting, the over-utilized<br />
resources are forcefully taken away in order to allocate them to the specified<br />
class. This is a complex policy, which causes jittery behavior in system measurements,<br />
inconsistent states in transactions and additional overhead on the shared resource<br />
system at runtime [2, 3]. In contrast, in the non-preemptive<br />
setting, the resource is taken away once the request being processed is completed. The<br />
non-preemptive setting is a desirable configuration for shared resource environments [2].<br />
Thus, we have implemented the non-preemptive setting in the scheduler process of the<br />
simulation model. The resulting inaccuracy in decision implementation can be reduced by<br />
selecting processing times that are small compared with the sample period. For instance,<br />
if the service time varies in the range of ticks, the sample period can be selected in major ticks. This<br />
means that a decision will be implemented before the next decision is made. In addition,<br />
a large number of requests will be processed during a sample period, so that the error due to<br />
incomplete requests during a sample period becomes insignificant.<br />
2. End-to-end response time is not a consideration.<br />
3. Time taken to reschedule the resource is assumed to be zero.<br />
4. Overhead from the scheduler and the statistic calculator is zero.<br />
The simulations designed and executed with this implementation indicate that<br />
it can simulate consistent behavior under the same settings and with N clients. It also provides<br />
accurate measurements of the system outputs, correct implementation of resource allocation<br />
decisions and fast execution. The process-oriented design approach taken in this implementation<br />
provides extendibility, and modularity by delegating responsibilities among entities. Therefore,<br />
this DES model achieves many of the requirements listed in Section 2. What<br />
remains is to verify and validate that this DES model is capable of simulating the complex behavior of a<br />
multi-client class system.<br />
4. Validation of the DES model<br />
After building a simulation, the next major step is to verify and validate the implementation.<br />
In this section we provide three forms of validation using queuing theory. It is noteworthy<br />
that, to apply queuing theory, certain assumptions on the system structure, arrival workloads<br />
and statistical distributions of the processing times should hold. In the following sections, the required<br />
simulation systems are constructed using the DES model proposed in Section 3, adhering to these<br />
assumptions.<br />
4.1. Conformant to Little’s law<br />
One of the fundamental results of queuing theory was developed by John Little in the 1960s<br />
and is used as a basic building block in the development of theories of large scale queuing<br />
systems. Little’s law is defined as follows:<br />
For a queuing system in steady state, if the mean time spent in the system is W = E(T)<br />
and the mean arrival rate of customers entering the system is λ, then the mean number of customers<br />
in the system is given by E(L) = W × λ. This result applies to any queuing system and even to<br />
systems within a system. However, the system has to be in steady state, meaning that the arrival rate<br />
should be less than the service rate of the system. Therefore, the simulation model presented in<br />
this chapter can be validated using Little’s law.<br />
In order to do the validation, we constructed a queuing simulation using the constructs introduced<br />
in Section 3. We used two client class workload generators and queues, with 5 resource<br />
units for each client class, in this validation. We used 18 combinations of stochastic arrival rates<br />
and service rates, drawn from exponential distributions, to simulate the workloads and processing times of<br />
both client classes. All these combinations were selected to keep the system in<br />
steady state. The workload scripts generated from the arrival rate were given to the ClientClassWorkloadGenerator<br />
instance of each client class, and the processing times of the ResourceUnits were generated<br />
from the service rate using an exponential distribution in each experiment. Each experiment was conducted<br />
for 50,000 ticks. The StatisticCalulator instance was used to compute the final statistics of the<br />
experiment, including the average response time, average arrival rates and average number of<br />
customers in the system. In these calculations the system was considered as two subsystems,<br />
each providing services to a corresponding client class. The statistics were compared by checking<br />
that the total number of clients in these two subsystems, calculated with Little’s law, equals the total measured number of<br />
clients in the system when both subsystems are considered together. These statistics were an exact<br />
match for all of these experiments. The selected experimental results summarized<br />
in Table 1 indicate that the measured number in the system is precisely equal to the value calculated with<br />
Little’s law. The same results were observed for the experiments conducted with deterministic<br />
arrival and service rates. Hence, the multi-client class simulations implemented from the DES<br />
model described in Section 3 precisely conform to Little’s law. This result also indicates that<br />
all the requests input to the system leave the system. Further, the implementation of the DES model,<br />
including the statistical calculations, is correct.<br />
Table 1: A comparison based on Little’s law
| Average waiting time, class 1 ($W_1$) | Average waiting time, class 2 ($W_2$) | Total number of customers, class 1 ($N_1$) | Total number of customers, class 2 ($N_2$) | Measured average number of customers in the system | Calculation of Little’s law ($W_1 \times N_1 + W_2 \times N_2$) |
|---|---|---|---|---|---|
| 54.53763 | 51.49038 | 93 | 104 | 0.20854 | 0.20854 |
| 46.25325 | 49.35146 | 999 | 993 | 1.90426 | 1.90426 |
| 28.23506 | 29.22124 | 1238 | 1243 | 1.42554 | 1.42554 |
| 14.12709 | 14.39264 | 1676 | 1658 | 0.9508 | 0.9508 |
| 17.09113 | 17.26543 | 2030 | 1944 | 1.36518 | 1.36518 |
| 14.42547 | 14.40802 | 2536 | 2468 | 1.44284 | 1.44284 |
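The Little’s law column of Table 1 can be recomputed from the other columns. The sketch below assumes that the arrival rate of each class is estimated as $N_i / T$, with $T$ = 50,000 ticks being the run length stated in the text; this division does not appear in the table header and is an assumption made here, although it does reproduce the tabulated values. The names `TABLE_1` and `little_total` are hypothetical.

```python
TICKS = 50_000  # run length of each experiment, as stated in the text

# (W1, W2, N1, N2, measured average number in system), copied from Table 1
TABLE_1 = [
    (54.53763, 51.49038, 93, 104, 0.20854),
    (46.25325, 49.35146, 999, 993, 1.90426),
    (28.23506, 29.22124, 1238, 1243, 1.42554),
    (14.12709, 14.39264, 1676, 1658, 0.9508),
    (17.09113, 17.26543, 2030, 1944, 1.36518),
    (14.42547, 14.40802, 2536, 2468, 1.44284),
]

def little_total(w1, w2, n1, n2, ticks=TICKS):
    """L = lambda_1 * W_1 + lambda_2 * W_2, with lambda_i assumed to be N_i / ticks."""
    return (n1 / ticks) * w1 + (n2 / ticks) * w2

for w1, w2, n1, n2, measured in TABLE_1:
    print(f"Little's law: {little_total(w1, w2, n1, n2):.5f}   measured: {measured}")
```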
4.2. Conformant to single-server queuing system (M/M/1)<br />
In this section a single-server queuing system is developed and simulated, and then the measurements<br />
are compared to the theoretical results for the (M/M/1) queuing system from the literature. The<br />
simulation was implemented with a single resource unit and queue. The workload script of a<br />
single client is generated according to a Poisson arrival process. The resource reservation time<br />
(processing time) is generated according to the exponential distribution. All the components<br />
available from the DES model are used in this implementation as well. The 18 experiments were<br />
conducted with the same arrival and processing time combinations used in Section 4.1. Because<br />
the measured results are compared with probabilistic theoretical values, each experiment was<br />
run for 200,000 ticks. As the basis of validation, we compared the measured average number of<br />
customers in the system from the simulations with the expected number of customers calculated<br />
from queuing theoretic results. Let λ and µ represent the mean arrival rate and mean<br />
service rate respectively. Theoretically, the expected number of customers in the system is calculated as<br />
follows:<br />
$L_{theoretical} = \frac{\lambda}{\mu - \lambda}$ (2)<br />
Given the simulated λ and µ, the $L_{theoretical}$ calculated from equation (2) should be approximately<br />
equal to $L_{measured}$ from the simulation. In order to quantify the statistical significance of<br />
the difference, we also conducted a Kolmogorov-Smirnov test using the data of the 18 simulations<br />
conducted under different λ and µ. The comparison of the results of some of the experiments is<br />
summarized in Table 2.<br />
Table 2: A comparison based on single-server queuing system (M/M/1)
| λ | µ | $L_{theoretical}$ | $L_{measured}$ |
|---|---|---|---|
| 0.02 | 0.1 | 0.247 | 0.244 |
| 0.025 | 0.067 | 0.423 | 0.418 |
| 0.04 | 0.056 | 2.622 | 2.591 |
| 0.022 | 0.03 | 2.799 | 2.756 |
| 0.02 | 0.027 | 2.838 | 2.801 |
| 0.02 | 0.04 | 0.979 | 0.968 |
| 0.033 | 0.067 | 0.976 | 0.96 |
The Kolmogorov-Smirnov test produced a D statistic of 0.11 and a P value of 1.<br />
In a nutshell, a P value of less than 0.05 indicates a significant difference between the data sets.<br />
Since P = 1 in this case, we conclude that the data sets of $L_{measured}$ and $L_{theoretical}$ have no<br />
significant difference. As a consequence, the single-server queuing system (M/M/1) constructed<br />
from the DES model conforms to the queuing theoretic results. This confirms that the M/M/1 queuing<br />
system implemented using the DES model constructs, which include scheduling of a single<br />
queue and resource, is correct.<br />
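The M/M/1 expectation used in this section, λ / (µ − λ), can be evaluated directly. The function below is a small illustrative sketch with a hypothetical name, and the steady-state check is an addition made here. Note that the text computes the theoretical values from the simulated λ and µ, so the rounded rates printed in Table 2 reproduce the tabulated values only approximately.

```python
def mm1_expected_customers(arrival_rate, service_rate):
    """Expected number of customers in an M/M/1 system: lambda / (mu - lambda)."""
    if arrival_rate >= service_rate:
        raise ValueError("steady state requires arrival_rate < service_rate")
    return arrival_rate / (service_rate - arrival_rate)

# Nominal rates of one row of Table 2 (lambda = 0.02, mu = 0.04)
l_value = mm1_expected_customers(0.02, 0.04)
print(l_value)
```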
4.3. Conformant to multi-server queuing system (M/M/c)<br />
In this section we construct a multi-server queuing system serving a single queue. For this<br />
system, the same assumptions used in Section 4.2 are maintained. However, c = 5 resource units<br />
are used to represent 5 servers in the system. 18 experiments were conducted under the same settings<br />
as in Section 4.2 in order to gather measurement data. The same measurement, the average<br />
number of customers in the system, was used for the comparison. The theoretical calculation is<br />
done as follows:<br />
$p_0 = \left( \frac{r^c}{c!(1 - \rho)} + \sum_{n=0}^{c-1} \frac{r^n}{n!} \right)^{-1}$ (3)<br />
$L_{theoretical} = r + \frac{r^c \rho}{c!(1 - \rho)^2} p_0$ (4)<br />
where $r = \lambda / \mu$, $\rho = r / c$, and c is the number of servers (5 for this experiment). The results are summarized<br />
in Table 3.<br />
Table 3: A comparison based on multi-server queuing system (M/M/c)
| λ | µ | $L_{theoretical}$ | $L_{measured}$ |
|---|---|---|---|
| 0.02 | 0.1 | 0.2 | 0.198 |
| 0.025 | 0.067 | 0.375 | 0.367 |
| 0.04 | 0.056 | 0.72 | 0.728 |
| 0.022 | 0.03 | 0.734 | 0.722 |
| 0.02 | 0.027 | 0.74 | 0.739 |
| 0.02 | 0.04 | 0.5 | 0.492 |
| 0.033 | 0.067 | 0.5 | 0.495 |
The Kolmogorov-Smirnov test computed a D statistic of 0.16 and a P value of 0.95. As<br />
in the earlier case of the single-server queuing system, this indicates that the data sets of $L_{measured}$ and<br />
$L_{theoretical}$ have no significant difference. Thus, the (M/M/c) queuing system developed for this<br />
case also conforms to the theoretical results.<br />
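The M/M/c calculation of this section, with r = λ/µ and ρ = r/c, can be sketched as follows. The function name and the steady-state check are assumptions added for illustration. For a single server the result reduces to the M/M/1 value λ / (µ − λ), which serves as a sanity check.

```python
from math import factorial

def mmc_expected_customers(arrival_rate, service_rate, servers):
    """Expected number of customers in an M/M/c system."""
    r = arrival_rate / service_rate        # offered load, r = lambda / mu
    rho = r / servers                      # per-server utilization, rho = r / c
    if rho >= 1:
        raise ValueError("steady state requires rho < 1")
    tail = r**servers / (factorial(servers) * (1 - rho))
    # Probability of an empty system, p_0
    p0 = 1.0 / (tail + sum(r**n / factorial(n) for n in range(servers)))
    # L = r + r^c * rho / (c! * (1 - rho)^2) * p_0
    return r + (r**servers * rho) / (factorial(servers) * (1 - rho) ** 2) * p0

l_mmc = mmc_expected_customers(0.02, 0.04, 5)   # nominal rates of one row of Table 3
l_mm1 = mmc_expected_customers(0.02, 0.04, 1)   # single-server special case
print(round(l_mmc, 3), round(l_mm1, 3))
```

With five servers the queuing term is very small at these loads, so the expected number of customers stays close to r, in line with the theoretical column of Table 3.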
We conclude the theoretical validation of the DES model built for use in this thesis with<br />
the above three validations. The results are not exactly equal to the theoretical results because of the<br />
slight numerical inaccuracies of the implementations of probabilistic distributions. In addition,<br />
multi-client class systems fall under multi-server multi-class queuing systems. Well-known<br />
exact theoretical results are not available so far for such systems, so we limited our validation<br />
to multi-server queuing systems. With this result we can justify that the implementation of the<br />
constructs of the DES model, including the scheduling of multiple resource units and queuing, is valid.<br />
5. Simulation settings<br />
Using the above generalized DES model, we set up a simulation system to apply and validate<br />
the nonlinear control theoretic approaches proposed in this thesis.<br />
5.1. Workload profiles<br />
The workloads that a multi-client class system may face cannot be generalized. The workload<br />
a system can manage depends on the capacity of its resources, management requirements and performance<br />
objectives. The workload profile for a system with CPU as the shared resource may<br />
differ from that of a system with concurrent threads as the shared resource. In addition, workloads are<br />
time-varying rather than constant over the entire period of operation. This characteristic is<br />
not limited to software systems; it applies to other physical systems as well. For this reason,<br />
control engineering provides a set of well-established input signals to validate the performance<br />
of control systems. They are described below, assuming that $W_n$ is the nominal workload that the system<br />
receives.<br />
Impulse input signal: Formally, W_impulse(k) = 1 when k = 0 and W_impulse(k) = 0 when k ≠ 0, i.e., the impulse<br />
input signal increases the workload <strong>to</strong> some value greater than W_n for a single sample period.<br />
In a real workload this can be considered a workload spike lasting a very short time.<br />
However, such short spikes may not drastically affect the performance attributes<br />
(e.g., average response time). Consequently, the impulse input signal may not be useful<br />
for validating the control systems of software systems.<br />
Step input signal: The step input signal <strong>model</strong>s a sudden jump in the workload from W_n <strong>to</strong><br />
some value W_step, which then stays at that value for more than a single sample period. This is one<br />
of the most widely used input signals <strong>to</strong> validate the performance of control systems in control<br />
engineering. In addition, most applications of feedback control in software systems, including<br />
multi-<strong>client</strong> <strong>class</strong> systems, have used step workload changes <strong>to</strong> validate the performance<br />
and resource management capabilities. This is because such a workload change in even a single<br />
<strong>client</strong> <strong>class</strong> of a multi-<strong>client</strong> system, sustained for a long period of time, affects the performance attributes<br />
(e.g., response time) under control. As a consequence, the control system is forced <strong>to</strong> redistribute<br />
the available resources among <strong>client</strong> <strong>class</strong>es in order <strong>to</strong> achieve the required performance objectives.<br />
A delay in responding <strong>to</strong> such workload variations may cause large transient responses and<br />
temporary instabilities in the system. Therefore, this is a significantly difficult load variation <strong>to</strong><br />
handle [2, 3, 4, 5].<br />
Ramp input signal: The ramp input linearly increases the workload from W_n <strong>to</strong> W_ramp over<br />
some time interval. This signal <strong>model</strong>s a gradual increase in workload, in contrast <strong>to</strong> the instantaneous<br />
increase of the step input signal.<br />
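The three input signals described above can be sketched as sampled workload profiles.<br />
This is a minimal illustration, not the workload generator used in the thesis; the function names, the parameter names (w_nominal, w_peak, w_step, w_ramp) and the choice of one list entry per sample period are assumptions.<br />

```python
def impulse_workload(n_samples, w_nominal, w_peak, at=0):
    """Workload that rises to w_peak for a single sample period only."""
    return [w_peak if k == at else w_nominal for k in range(n_samples)]


def step_workload(n_samples, w_nominal, w_step, at=0):
    """Workload that jumps to w_step at sample `at` and stays there."""
    return [w_step if k >= at else w_nominal for k in range(n_samples)]


def ramp_workload(n_samples, w_nominal, w_ramp, start, end):
    """Workload that rises linearly from w_nominal to w_ramp over [start, end]."""
    signal = []
    for k in range(n_samples):
        if k <= start:
            signal.append(w_nominal)
        elif k >= end:
            signal.append(w_ramp)
        else:
            fraction = (k - start) / (end - start)
            signal.append(w_nominal + fraction * (w_ramp - w_nominal))
    return signal
```

For example, step_workload(5, 30, 60, at=2) holds the nominal 30 req/sec for two samples and then stays at 60 req/sec, whereas impulse_workload with the same arguments returns to 30 req/sec after one sample.<br />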
The main advantage of these input signals is that, given a linear <strong>model</strong> of a system, well-known<br />
design and analysis techniques are available from control theory <strong>to</strong> compute performance<br />
specifications and behavior. Consequently, after constructing a linear <strong>model</strong> of a system we can<br />
investigate/prove which load variations the system can sustain without becoming unstable.<br />
However, a linear <strong>model</strong> of a system is an approximation of its behavior (not a fully accurate<br />
representation), so these theoretical evaluations may not be entirely correct. Further, this<br />
also holds for systems that exhibit nonlinearities, such as the system under investigation in this<br />
thesis. As a consequence, we note that combinations of workload input signals<br />
(in particular, step input profiles) applied in a time-varying fashion are used as heuristics <strong>to</strong> validate and<br />
compare the performance of the control systems.<br />
5.2. Total resource amount and resource reservation time distribution<br />
The following settings will be used as an abstract representation of the multi-<strong>client</strong> <strong>class</strong><br />
system in the <strong>simulation</strong>s. The settings remain the same unless otherwise specified. The<br />
<strong>to</strong>tal amount of simulated resources is S_total = 30. The processing time of each resource unit is<br />
drawn from a uniform distribution as follows:<br />
\[ r(x) = \frac{1}{r_{max} - r_{min}} \quad \text{for } r_{min} \le x \le r_{max} \tag{5} \]<br />
\[ r(x) = 0 \quad \text{for } x < r_{min} \text{ or } x > r_{max} \tag{6} \]<br />
where r_min = 100 ticks and r_max = 700 ticks. These settings were selected <strong>to</strong> keep the<br />
resource allocations among <strong>client</strong> <strong>class</strong>es tractable under different experiment<br />
conditions. The values of r_min and r_max were selected after careful investigation of the system outputs under<br />
different workload conditions: when the system is running close <strong>to</strong> full capacity, the<br />
system output should remain within some bounds, in accordance with theoretical and practical system<br />
behavior. Figure 2 shows a comparison when 30 resource units are allocated <strong>to</strong> two <strong>client</strong><br />
<strong>class</strong>es with a 30 req/sec workload for each <strong>class</strong>. The selected bounds, r_min = 100 and r_max =<br />
700 ticks, maintain the system in steady state under the applied resource settings. However, under<br />
the same settings, when the bounds are r_min = 100 and r_max = 900 ticks, the steady-state behavior<br />
is highly variable/unstable. This is because the variability around the average response time leads<br />
<strong>to</strong> a large transient response in the system. To avoid such behavior, the resource capacity and the<br />
workload intensity have <strong>to</strong> be selected depending on the bounds. For the workload rates and the<br />
resources we selected <strong>to</strong> evaluate, r_min = 100 ticks and r_max = 700 ticks are suitable bounds.<br />
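To make the effect of the bounds concrete, the closed-form mean and standard deviation of a uniform distribution can be compared for the two settings discussed above.<br />
This is a small sketch under the assumption that processing times are drawn independently per request; the function names are hypothetical and the sampler uses Python's standard random.uniform rather than the thesis implementation.<br />

```python
import random


def uniform_mean_and_std(r_min, r_max):
    """Closed-form mean and standard deviation of U(r_min, r_max)."""
    mean = (r_min + r_max) / 2.0
    std = (r_max - r_min) / 12 ** 0.5
    return mean, std


def sample_processing_time(r_min=100.0, r_max=700.0, rng=random):
    """Draw one processing time (in ticks) from the uniform distribution."""
    return rng.uniform(r_min, r_max)


# Bounds used in the text versus the wider bounds shown in Figure 2(b).
narrow_mean, narrow_std = uniform_mean_and_std(100, 700)
wide_mean, wide_std = uniform_mean_and_std(100, 900)
```

With the bounds 100-700 the mean processing time is 400 ticks, while the bounds 100-900 raise both the mean (500 ticks) and the spread, so the same 30 resource units are placed under a heavier and more variable load.<br />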
[Figure 2 plots: response time (0 <strong>to</strong> 2) for curves R_1 and R_2 against Sample Id (20 <strong>to</strong> 100); panels (a) 100-700 and (b) 100-900]<br />
Figure 2: System behavior under (a) r_min = 100 and r_max = 700 ticks and (b) r_min = 100 and<br />
r_max = 900 ticks<br />
Further, selecting uniformly distributed processing times means that every operation invoked<br />
in the system is equally likely, so each invocation receives a fair weight. This is<br />
done because neither evidence nor a generalization is available <strong>to</strong> represent the invocation<br />
patterns of the operations and the bounds of their system output (e.g., response time). Similar selections<br />
are made in [3, 6].<br />
In addition, 2000 ticks was selected as the sampling period of the statistic calculation<br />
process. The sampling period has <strong>to</strong> be selected carefully in physical systems. For<br />
instance, a small sampling period invokes the statistic calculations frequently, adding<br />
overhead <strong>to</strong> the system. In addition, short sampling intervals affect the variability of the measured<br />
average statistics. In contrast, large sampling periods may cause decision delays under sudden<br />
changes in the workload or other conditions, leading <strong>to</strong> instabilities. Therefore, there is a tradeoff<br />
in the selection of the sampling period. We selected 2000 ticks after analyzing the workload rates<br />
and the changes we will apply <strong>to</strong> the system. Further, this choice reduces the effect of assumption<br />
(1), listed in Section 3.<br />
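The role of the sampling period can be illustrated with a short sketch that averages response times per 2000-tick window.<br />
How the statistic calculation process is actually implemented is not described in this section; the function name, the input format and the rule that a request is counted in the window in which it completes are all assumptions.<br />

```python
from collections import defaultdict


def windowed_average_response(completions, sample_period=2000):
    """Average response time per sampling window.

    completions: iterable of (completion_tick, response_time_ticks) pairs.
    Each request is attributed to the window containing its completion tick
    (an assumption made for this sketch).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for tick, response in completions:
        window = int(tick // sample_period)
        sums[window] += response
        counts[window] += 1
    return {w: sums[w] / counts[w] for w in sorted(sums)}
```

A shorter sample_period yields more windows with fewer requests in each, so the per-window averages fluctuate more; a longer one smooths the averages but reports a workload change later.<br />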
6. Simulation vs. Physical system behavior<br />
It is also important <strong>to</strong> examine how the behavior of the simulated multi-<strong>client</strong> <strong>server</strong> system corresponds <strong>to</strong> the<br />
behavior of real physical systems. Many existing works on performance management<br />
have analyzed the behavior of their case studies (based on physical systems) under different<br />
workload conditions. For instance, such analyses can be found for web <strong>server</strong>s [7, 8],<br />
data centers [4, 9], multi-tenant <strong>class</strong> systems [10, 11], multi-<strong>client</strong> <strong>class</strong> systems [12, 13, 14]<br />
and so on. A common experiment changes the available resource in<br />
some order (increasing or decreasing in small steps) and measures/plots the system output<br />
(commonly the response time) under a constant deterministic workload. The common characteristic<br />
of all these example physical systems is shown in Figure 3. When the resource share is<br />
sufficient <strong>to</strong> handle the incoming workload, the response time remains at a steady value (or shows<br />
low variation); that is, the response time is insensitive <strong>to</strong> the resource share. However, when<br />
the resource share is insufficient, the response time increases at a high rate; that is, the response time<br />
is highly sensitive <strong>to</strong> the resource share. This behavior also depends heavily on the workload<br />
settings (see [7, 4, 8, 9] for detailed analysis).<br />
[Figure 3 plot: response time (seconds) against resource share, annotated "Response time is highly sensitive due <strong>to</strong> lack of resources" and "Response time is insensitive due <strong>to</strong> excessive resources"]<br />
Figure 3: Abstract response time behavior against the resource share observed in physical systems<br />
[Figure 4 plot: response time (0 <strong>to</strong> 5) against resource amount (14 <strong>to</strong> 30), one curve per workload: 30, 40, 50, 60, 70 and 80 req/sec]<br />
Figure 4: A comparison of response time behavior of the <strong>simulation</strong> environment with the resource<br />
allocation in different workload conditions<br />
In this section, we conduct such an experiment with a single <strong>client</strong> <strong>class</strong> system using the<br />
same settings described in Section 5. Here, the average response time of the system is observed<br />
while decreasing the available resource units from 30 <strong>to</strong> 14 under different workload conditions.<br />
Figure 4 illustrates the behavior of the system output (response time) with respect <strong>to</strong> the<br />
resource allocation. The common observation across the workload conditions is that when the<br />
incoming workload can be handled by the available resources, the response time remains at a<br />
steady value. For instance, under the 30 req/sec workload the response time is not affected, because<br />
even 14 resource units are adequate <strong>to</strong> handle that workload. When we increase the workload rate,<br />
at certain resource allocation levels the response time starts <strong>to</strong> increase at a high rate, moving the<br />
system <strong>to</strong> the highly sensitive region (see Figure 4). Therefore, the behavior of this <strong>simulation</strong> is<br />
consistent with the behavior of real physical systems investigated in the literature (see [7, 4, 8] for<br />
similar experimental results).<br />
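The qualitative shape of this experiment can be reproduced with a very small multi-server queue simulation.<br />
The sketch below is not the DES model of this thesis: it assumes Poisson arrivals, first-come-first-served dispatch to the earliest free resource unit, uniformly distributed processing times between 100 and 700 ticks, and a conversion of 1 second = 1000 ticks that is not stated in this section. The function name and its parameters are hypothetical.<br />

```python
import heapq
import random


def mean_response_time(rate_per_tick, servers, n_requests=20000,
                       r_min=100.0, r_max=700.0, seed=1):
    """Mean response time (ticks) of a FCFS queue with `servers` resource units.

    Assumptions of this sketch: exponential inter-arrival times with rate
    rate_per_tick and processing times drawn from U(r_min, r_max).
    """
    rng = random.Random(seed)
    free_at = [0.0] * servers      # time at which each unit becomes free
    heapq.heapify(free_at)
    now = 0.0
    total_response = 0.0
    for _ in range(n_requests):
        now += rng.expovariate(rate_per_tick)   # next arrival time
        earliest_free = heapq.heappop(free_at)  # unit that frees up first
        start = max(now, earliest_free)         # wait if all units are busy
        finish = start + rng.uniform(r_min, r_max)
        heapq.heappush(free_at, finish)
        total_response += finish - now
    return total_response / n_requests


# 30 and 40 req/sec under the assumed conversion of 1000 ticks per second.
steady = mean_response_time(0.03, 30)
ample = mean_response_time(0.04, 30)
starved = mean_response_time(0.04, 15)
```

Under these assumptions the response time stays near the mean processing time while the resource units are sufficient, and grows sharply once the offered load exceeds the number of units, which is the same insensitive/sensitive pattern described above.<br />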
This experiment also indicates that an 80 req/sec workload is the maximum capacity of the system<br />
with 30 resource units. However, this is not a linear relationship. For instance, a linear relationship<br />
would imply that 15 resources can handle a 40 req/sec workload. However, the graph for 40<br />
req/sec indicates that 15 resources move the system <strong>to</strong> the highly sensitive region. As a consequence,<br />
when tenants are placed in a shared resource environment, the <strong>to</strong>tal capacity under the mixed workload<br />
of these tenants is less than the capacity when the system is treated as a single <strong>class</strong>. For instance,<br />
if two tenants are allocated 15 resource units each, the <strong>to</strong>tal workload capacity the system can handle is approximately<br />
60 req/sec. Such behavior was also pointed out by Kwok et al. in [10].<br />
7. Summary<br />
This chapter presented the characteristics, requirements and importance of a <strong>simulation</strong> environment<br />
<strong>to</strong> represent a multi-<strong>client</strong> <strong>class</strong> system. Using the popular discrete event <strong>simulation</strong> mechanism,<br />
we presented an appropriate discrete event <strong>simulation</strong> <strong>model</strong> <strong>to</strong> <strong>implement</strong> multi-<strong>client</strong><br />
<strong>class</strong> systems with different settings. The <strong>model</strong> <strong>implement</strong>ation was then validated using queuing<br />
theoretic principles. The <strong>simulation</strong> settings that will be used in the rest of the chapter were<br />
also presented. Finally, the behavior of the <strong>simulation</strong> environment was compared <strong>to</strong> the behavior<br />
of physical systems using the case studies available in the literature.<br />
References<br />
[1] J. Banks, J. Carson, B. L. Nelson, D. Nicol, Discrete-Event System Simulation, 4th Edition, Prentice<br />
Hall, 2004.<br />
[2] C. Lu, Y. Lu, T. F. Abdelzaher, J. A. Stankovic, S. H. Son, Feedback control architecture and design methodology<br />
for service delay guarantees in web <strong>server</strong>s, IEEE Trans. Parallel Distrib. Syst. (2006) 1014–1027.<br />
[3] C. Lu, Feedback control real-time scheduling, Ph.D. thesis, University of Virginia (2001).<br />
[4] P. Padala, Au<strong>to</strong>mated management of virtualized data centers, Ph.D. thesis, University of Michigan (2010).<br />
[5] J. L. Hellerstein, Y. Diao, S. Parekh, D. M. Tilbury, Feedback Control of Computing Systems, John Wiley and<br />
Sons, 2004.<br />
[6] C. Lu, J. Stankovic, G. Tao, S. Son, Design and evaluation of a feedback control EDF scheduling algorithm,<br />
in: Real-Time Systems Symposium, 1999. Proceedings. The 20th IEEE, 1999, pp. 56–67.<br />
[7] Z. Wang, X. Zhu, S. Singhal, Utilization vs. SLO-based control for dynamic sizing of<br />
resource partitions (2006).<br />
[8] X. Zhu, Z. Wang, S. Singhal, Utility-driven workload management using nested control design, no. HPL-2005-<br />
193R1, Hewlett Packard Labora<strong>to</strong>ries, 2006, p. 8.<br />
[9] P. Padala, K.-Y. Hou, K. G. Shin, X. Zhu, M. Uysal, Z. Wang, S. Singhal, A. Merchant, Au<strong>to</strong>mated control of<br />
<strong>multiple</strong> virtualized resources (2009).<br />
[10] T. Kwok, A. Mohindra, Resource calculations with constraints, and placement of tenants and instances for multi-tenant<br />
SaaS applications, in: Proceedings of the 6th International Conference on Service-Oriented Computing,<br />
ICSOC ’08, Springer-Verlag, 2008, pp. 633–648.<br />
[11] Z. H. Wang, C. J. Guo, B. Gao, W. Sun, Z. Zhang, W. H. An, A study and performance evaluation of the multi-tenant<br />
data tier design patterns for service oriented computing, in: IEEE International Conference on e-Business<br />
Engineering, 2008. ICEBE ’08., 2008, pp. 94–101.<br />
[12] Y. Lu, T. Abdelzaher, C. Lu, L. Sha, X. Liu, Feedback control with queueing-theoretic prediction for relative delay<br />
guarantees in web <strong>server</strong>s (2003).<br />
[13] M. Karlsson, X. Zhu, C. Karamanolis, An adaptive optimal controller for non-intrusive performance differentiation<br />
in computing services, in: IEEE Conference on Control and Au<strong>to</strong>mation (ICCA), 2005.<br />
[14] M. Li<strong>to</strong>iu, A performance analysis method for au<strong>to</strong>nomic computing systems, ACM Trans. Au<strong>to</strong>n. Adapt. Syst. 2.<br />