Agile Performance Testing - Testing Experience
minimal environment, shown in Figure 1. This environment was adequate for basic functionality and compatibility testing, but it was much, much smaller than the typical customer data center. After we sold the application to one customer with a large, complex network and a large, complex security dataset, that customer found that a transaction they wanted to run overnight, every night, took 25 hours to complete!
While on this topic, let me point out a key warning: For systems that deal with large datasets, don't forget that the data is a key component of the test environment. You could test with representative hardware, operating system configurations, and cohabiting software, but if you forget the data, your results might be meaningless.
Now, here's a case study of building a proper test environment. For a banking application, my associates and I built a test environment that mimicked the production environment, as shown in Figure 2. Tests were run in steps, each step adding 20 users to the load, looking for the beginning of non-linear degradation in performance (i.e., the so-called "knees" or "hockey sticks" in the performance curve). We put realistic loads onto the system from the call center side of the interface.
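The stepped-load approach described above can be sketched in code. The measurements and the `find_knee` helper below are illustrative, not from the article: the idea is simply to add users in increments of 20 and flag the load level where response time stops growing linearly.

```python
# Sketch of stepped-load testing: raise the load in fixed increments and
# look for the "knee" where per-step response-time growth accelerates.

def find_knee(measurements, factor=2.0):
    """Return the load level where the per-step increase in response
    time first exceeds `factor` times the previous step's increase.
    `measurements` is a list of (users, avg_response_time) pairs."""
    for i in range(2, len(measurements)):
        delta_prev = measurements[i - 1][1] - measurements[i - 2][1]
        delta_cur = measurements[i][1] - measurements[i - 1][1]
        if delta_prev > 0 and delta_cur > factor * delta_prev:
            return measurements[i][0]
    return None  # no knee found in the tested range

# Hypothetical data: (users, avg response time in ms). Growth is linear
# up to 100 users, then the hockey stick begins.
steps = [(20, 100), (40, 110), (60, 120), (80, 130), (100, 140), (120, 200)]
print(find_knee(steps))  # -> 120
```

In a real test each pair would come from a load-generation run at that user count, not from a hard-coded list.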
The only differences between the test and production environments were in the bandwidth and throughput limits of the wide-area network that tied the call center to the data center. We used testing and modeling to ensure that these differences would not affect the performance, load, and reliability results in the production environment. In other words, we made sure that, under real-world conditions, the traffic between the call center and the data center would never exceed the traffic we were simulating, thus ensuring that there was no hidden chokepoint or bottleneck there.
This brings us to a key tip: To the extent that you can't completely replicate the actual production environment or environments, be sure that you identify and understand all the differences. Once the differences are known, analyze carefully how they might affect your results. If more than one or two of the identified differences can materially affect your test results, it will become very hard to extrapolate to the real world, and that will call the validity of your test results into question.
Lesson 2: Real-World Loads
Once you have a realistic test environment in place, you are ready to move to the next level of realistic performance, load, and reliability testing: the use of realistic transactions, usage profiles, and loads. These scenarios should include not just realistic usage under typical conditions. They should also include: regular events like backups; time-based peaks and lulls in activity; seasonal events such as holiday shopping and year-end closing; different classes of users, including experienced users, novice users, and special-application users; and allowance for future growth. Finally, don't forget about external factors such as constant and variable LAN, WAN, and Internet load, the load imposed by cohabiting applications, and so forth.
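One way to make such a usage profile concrete is to encode it as data that a load generator can replay. The user classes, multipliers, and event figures below are illustrative assumptions, not numbers from the article; they simply show how user classes, daily peaks and lulls, a nightly backup, and growth headroom can combine into a target load.

```python
# A minimal sketch of a usage profile as replayable data.

BASE_USERS = {"experienced": 200, "novice": 80, "special_app": 20}

# Hour-of-day multipliers: morning and afternoon peaks, overnight lull.
HOURLY_FACTOR = {9: 1.5, 10: 1.6, 14: 1.4, 2: 0.2}

# Regular events: a nightly backup window adds background load,
# expressed here as an equivalent number of users.
NIGHTLY_BACKUP_HOURS = {1, 2, 3}
BACKUP_EQUIVALENT_USERS = 50

ANNUAL_GROWTH = 1.25  # 25% headroom for future growth

def target_load(hour, growth=ANNUAL_GROWTH):
    """Concurrent-user target for a given hour of day."""
    users = sum(BASE_USERS.values()) * HOURLY_FACTOR.get(hour, 1.0)
    if hour in NIGHTLY_BACKUP_HOURS:
        users += BACKUP_EQUIVALENT_USERS
    return round(users * growth)

print(target_load(10))  # mid-morning peak
print(target_load(2))   # overnight lull, backup running
```

The same table-driven structure extends naturally to seasonal factors and per-class transaction mixes.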
This brings us to another key tip: Don't just test to the realistically foreseen limits of load; also try what my associates and I like to call "tip-over tests." These involve increasing load until the system fails. At that point, what you're doing is both checking for graceful degradation and trying to figure out where the bottlenecks are.
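A tip-over test can be sketched as a simple loop. The `run_at_load` callable below is a hypothetical stand-in for a real load-test harness; the point is that you keep raising the load, record how the error rate climbs, and use the recorded history to judge whether degradation was graceful and where it began.

```python
# Sketch of a "tip-over" test: raise load until the system fails,
# keeping a history of (load, error_rate) pairs for later analysis.

def tip_over(run_at_load, start=100, step=100, error_limit=0.05):
    """Increase load until the error rate exceeds error_limit.
    Returns (tip_over_load, history)."""
    history = []
    load = start
    while True:
        error_rate = run_at_load(load)
        history.append((load, error_rate))
        if error_rate > error_limit:
            return load, history
        load += step

# Simulated system that starts failing past 500 concurrent users.
def fake_system(load):
    return 0.0 if load <= 500 else min(1.0, (load - 500) / 1000)

tip, history = tip_over(fake_system)
print(tip)  # -> 600
```

A gradual rise in error rate across the history suggests graceful degradation; a jump straight from near-zero to total failure points at a hard bottleneck worth investigating.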
Here's a case study of testing with unrealistic load. One project we worked on involved the development of an interactive voice response server, the kind you use when you do phone banking. The server was to support over 1,000 simultaneous users. A vendor was developing the software for the telephony subsystem. However, during subsystem testing, the vendor's developers tested the server's performance by generating load using half of the telephony cards on the system under test, as shown in Figure 3.
Figure 3: Improper load generation
Based on a simple inspection of the load-generating software, the system load imposed by the load generators was well below that of the telephony software we were testing. My associates and I warned the vendor and the client that these test results were thus meaningless, but unfortunately our warnings were ignored. So, it was no surprise that the tests, which were not valid tests, gave passing results. As with all false negatives, this gave project participants and stakeholders false confidence in the system.
My associates and I built a load generator that ran on an identical but separate host, as shown in Figure 4. We loaded all the telephony ports on the system under test with representative inputs. The tests failed, revealing project-threatening design problems, which I'll discuss in a subsequent section of this article.
Figure 4: Proper load generation
So, this brings us to a key tip: Prefer non-intrusive load generators for most performance testing. It also brings up a key warning: Performance, load, and reliability testing of subsystems in unrealistic system settings yields misleading results.
Now, let me follow up with another key tip, to clarify the one I just gave: Intrusive or self-generated load can work for some load and reliability tests. Let me give you an example.
I once managed a system test team that was testing a distributed Unix operating system. This operating system would support a cluster of up to 31 systems, which could be a mix of mainframes and PCs running Unix, as shown in Figure 5. For load generators, we built simple Unix/C programs that would consume resources like CPU, memory, and disk space, as well as exercising files, interprocess communication, process migration, and other cross-network activities.
Basically, these load generators, simple though they were, allowed us to create worst-case scenarios in terms of both the amount of resource utilization and the number of simultaneously running programs. No application mix on the cluster could exceed the load created by those simple programs.
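The original generators were Unix/C programs; the sketch below uses Python for readability, and its iteration counts and sizes are illustrative assumptions. It shows the same idea: a simple worker that burns CPU, holds memory, and touches the filesystem, with randomness in how much of each it consumes.

```python
# Sketch of a simple resource-consuming load generator: each worker
# burns CPU, holds a memory buffer, and writes a scratch file, with
# randomized amounts of each.

import os
import random
import tempfile

def load_worker(seed, cpu_iters=50_000, mem_kib=256, file_kib=64):
    rng = random.Random(seed)  # seeded, so a run can be reproduced
    # Burn CPU: a random-length arithmetic loop.
    total = 0
    for i in range(rng.randint(cpu_iters // 2, cpu_iters)):
        total += i * i
    # Hold memory: a random-sized byte buffer.
    buf = bytearray(rng.randint(mem_kib // 2, mem_kib) * 1024)
    # Use the filesystem: write, measure, and remove a scratch file.
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(os.urandom(rng.randint(file_kib // 2, file_kib) * 1024))
        path = f.name
    size = os.path.getsize(path)
    os.remove(path)
    return total, len(buf), size

print(load_worker(seed=42))
```

Running many such workers concurrently, with different seeds, approximates a worst-case application mix; the fixed seed makes a failing load pattern reproducible when a defect needs to be re-created.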
Even better, randomness built into the programs allowed us
68 The Magazine for Professional Testers www.testingexperience.com