Agile Performance Testing - Testing Experience
minimal environment, shown in Figure 1. This environment was adequate for basic functionality and compatibility testing, but it was much, much smaller than the typical customer data center. After we sold the application to one customer with a large, complex network and a large, complex security dataset, that customer found that a transaction they wanted to run overnight, every night, took 25 hours to complete!
While on this topic, let me point out a key warning: For systems that deal with large datasets, don't forget that the data is a key component of the test environment. You could test with representative hardware, operating system configurations, and cohabiting software, but if you forget the data, your results might be meaningless.
Now, here's a case study of building a proper test environment. For a banking application, my associates and I built a test environment that mimicked the production environment, as shown in Figure 2. Tests were run in steps, each step adding 20 users to the load, looking for the beginning of non-linear degradation in performance (i.e., the so-called "knees" or "hockey sticks" in the performance curve). We put realistic loads onto the system from the call center side of the interface.
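The stepped-load approach described above can be sketched in code. The measurements and the `find_knee` helper below are illustrative, not from the article: the idea is simply to add users in increments of 20 and flag the load level where response time stops growing linearly.

```python
# Sketch of stepped-load testing: raise the load in fixed increments and
# look for the "knee" where per-step response-time growth accelerates.

def find_knee(measurements, factor=2.0):
    """Return the load level where the per-step increase in response
    time first exceeds `factor` times the previous step's increase.
    `measurements` is a list of (users, avg_response_time) pairs."""
    for i in range(2, len(measurements)):
        delta_prev = measurements[i - 1][1] - measurements[i - 2][1]
        delta_cur = measurements[i][1] - measurements[i - 1][1]
        if delta_prev > 0 and delta_cur > factor * delta_prev:
            return measurements[i][0]
    return None  # no knee found in the tested range

# Hypothetical data: (users, avg response time in ms). Growth is linear
# up to 100 users, then the hockey stick begins.
steps = [(20, 100), (40, 110), (60, 120), (80, 130), (100, 140), (120, 200)]
print(find_knee(steps))  # -> 120
```

In a real test each pair would come from a load-generation run at that user count, not from a hard-coded list.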
The only differences between the test and production environments were in the bandwidth and throughput limits of the wide-area network that tied the call center to the data center. We used testing and modeling to ensure that these differences would not affect the performance, load, and reliability results in the production environment. In other words, we made sure that, under real-world conditions, the traffic between the call center and the data center would never exceed the traffic we were simulating, thus ensuring that there was no hidden chokepoint or bottleneck there.
This brings us to a key tip: To the extent that you can't completely replicate the actual production environment or environments, be sure that you identify and understand all the differences. Once the differences are known, analyze carefully how they might affect your results. If more than one or two of the identified differences can materially affect your test results, it will become very hard to extrapolate to the real world, and that will call the validity of your test results into question.
Lesson 2: Real-World Loads
Once you have a realistic test environment in place, you are ready to move to the next level of realistic performance, load, and reliability testing: the use of realistic transactions, usage profiles, and loads. These scenarios should include not just realistic usage under typical conditions. They should also include: regular events like backups; time-based peaks and lulls in activity; seasonal events such as holiday shopping and year-end closing; different classes of users, including experienced users, novice users, and special-application users; and allowance for future growth. Finally, don't forget about external factors such as constant and variable LAN, WAN, and Internet load, the load imposed by cohabiting applications, and so forth.
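One way to make such a usage profile concrete is to encode it as data that a load generator can replay. The user classes, multipliers, and event figures below are illustrative assumptions, not numbers from the article; they simply show how user classes, daily peaks and lulls, a nightly backup, and growth headroom can combine into a target load.

```python
# A minimal sketch of a usage profile as replayable data.

BASE_USERS = {"experienced": 200, "novice": 80, "special_app": 20}

# Hour-of-day multipliers: morning and afternoon peaks, overnight lull.
HOURLY_FACTOR = {9: 1.5, 10: 1.6, 14: 1.4, 2: 0.2}

# Regular events: a nightly backup window adds background load,
# expressed here as an equivalent number of users.
NIGHTLY_BACKUP_HOURS = {1, 2, 3}
BACKUP_EQUIVALENT_USERS = 50

ANNUAL_GROWTH = 1.25  # 25% headroom for future growth

def target_load(hour, growth=ANNUAL_GROWTH):
    """Concurrent-user target for a given hour of day."""
    users = sum(BASE_USERS.values()) * HOURLY_FACTOR.get(hour, 1.0)
    if hour in NIGHTLY_BACKUP_HOURS:
        users += BACKUP_EQUIVALENT_USERS
    return round(users * growth)

print(target_load(10))  # mid-morning peak
print(target_load(2))   # overnight lull, backup running
```

The same table-driven structure extends naturally to seasonal factors and per-class transaction mixes.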
This brings us to another key tip: Don't just test to the realistically foreseen limits of load; also try what my associates and I like to call "tip-over tests." These involve increasing load until the system fails. At that point, what you're doing is both checking for graceful degradation and trying to figure out where the bottlenecks are.
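A tip-over test can be sketched as a simple loop. The `run_at_load` callable below is a hypothetical stand-in for a real load-test harness; the point is that you keep raising the load, record how the error rate climbs, and use the recorded history to judge whether degradation was graceful and where it began.

```python
# Sketch of a "tip-over" test: raise load until the system fails,
# keeping a history of (load, error_rate) pairs for later analysis.

def tip_over(run_at_load, start=100, step=100, error_limit=0.05):
    """Increase load until the error rate exceeds error_limit.
    Returns (tip_over_load, history)."""
    history = []
    load = start
    while True:
        error_rate = run_at_load(load)
        history.append((load, error_rate))
        if error_rate > error_limit:
            return load, history
        load += step

# Simulated system that starts failing past 500 concurrent users.
def fake_system(load):
    return 0.0 if load <= 500 else min(1.0, (load - 500) / 1000)

tip, history = tip_over(fake_system)
print(tip)  # -> 600
```

A gradual rise in error rate across the history suggests graceful degradation; a jump straight from near-zero to total failure points at a hard bottleneck worth investigating.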
Here's a case study of testing with unrealistic load. One project we worked on involved the development of an interactive voice response server, the kind you use when you do phone banking. The server was to support over 1,000 simultaneous users. A vendor was developing the software for the telephony subsystem. However, during subsystem testing, the vendor's developers tested the server's performance by generating load using half of the telephony cards on the system under test, as shown in Figure 3.
Figure 3: Improper load generation
Based on a simple inspection of the load-generating software, the system load imposed by the load generators was well below that of the telephony software we were testing. My associates and I warned the vendor and the client that these test results were thus meaningless, but unfortunately our warnings were ignored. So, it was no surprise that the tests, which were not valid tests, gave passing results. As with all false negatives, this gave project participants and stakeholders false confidence in the system.
My associates and I built a load generator that ran on an identical but separate host, as shown in Figure 4. We loaded all the telephony ports on the system under test with representative inputs. The tests failed, revealing project-threatening design problems, which I'll discuss in a subsequent section of this article.
Figure 4: Proper load generation
So, this brings us to a key tip: Prefer non-intrusive load generators for most performance testing. It also brings up a key warning: Performance, load, and reliability testing of subsystems in unrealistic system settings yields misleading results.
Now, let me follow up with another key tip, to clarify the one I just gave: Intrusive or self-generated load can work for some load and reliability tests. Let me give you an example.
I once managed a system test team that was testing a distributed Unix operating system. This operating system would support a cluster of up to 31 systems, which could be a mix of mainframes and PCs running Unix, as shown in Figure 5. For load generators, we built simple Unix/C programs that would consume resources like CPU, memory, and disk space, as well as exercising files, interprocess communication, process migration, and other cross-network activities.
Basically, these load generators, simple though they were, allowed us to create worst-case scenarios in terms of both the amount of resource utilization and the number of simultaneously running programs. No application mix on the cluster could exceed the load created by those simple programs.
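The original generators were Unix/C programs; the sketch below uses Python for readability, and its iteration counts and sizes are illustrative assumptions. It shows the same idea: a simple worker that burns CPU, holds memory, and touches the filesystem, with randomness in how much of each it consumes.

```python
# Sketch of a simple resource-consuming load generator: each worker
# burns CPU, holds a memory buffer, and writes a scratch file, with
# randomized amounts of each.

import os
import random
import tempfile

def load_worker(seed, cpu_iters=50_000, mem_kib=256, file_kib=64):
    rng = random.Random(seed)  # seeded, so a run can be reproduced
    # Burn CPU: a random-length arithmetic loop.
    total = 0
    for i in range(rng.randint(cpu_iters // 2, cpu_iters)):
        total += i * i
    # Hold memory: a random-sized byte buffer.
    buf = bytearray(rng.randint(mem_kib // 2, mem_kib) * 1024)
    # Use the filesystem: write, measure, and remove a scratch file.
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(os.urandom(rng.randint(file_kib // 2, file_kib) * 1024))
        path = f.name
    size = os.path.getsize(path)
    os.remove(path)
    return total, len(buf), size

print(load_worker(seed=42))
```

Running many such workers concurrently, with different seeds, approximates a worst-case application mix; the fixed seed makes a failing load pattern reproducible when a defect needs to be re-created.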
Even better, randomness built into the programs allowed us
68 The Magazine for Professional Testers www.testingexperience.com