Acceptance Test-Driven Development with Keyword ... - Niksula
HELSINKI UNIVERSITY OF TECHNOLOGY
Department of Computer Science and Engineering
Software Business and Engineering Institute

Juha Rantanen

Acceptance Test-Driven Development with Keyword-Driven
Test Automation Framework in an Agile Software Project

Master's Thesis
Espoo, May 18, 2007

Supervisor: Professor Tomi Männistö
Instructor: Harri Töhönen, M.Sc.
HELSINKI UNIVERSITY OF TECHNOLOGY
Department of Computer Science and Engineering
ABSTRACT OF MASTER'S THESIS

Author: Juha Rantanen
Title of thesis: Acceptance Test-Driven Development with Keyword-Driven Test Automation Framework in an Agile Software Project
Date: May 18, 2007
Pages: 102
Professorship: Computer Science
Professorship Code: T-76
Supervisor: Professor Tomi Männistö
Instructor: Harri Töhönen, M.Sc.
Agile software development uses iterative development, allowing periodic changes and updates to the software requirements. In agile software development methods, customer-defined tests have an important role in assuring that the software fulfills the customer's needs. These tests can be defined before implementation to establish a clear goal for the development team. This is called acceptance test-driven development (ATDD).

With ATDD, the acceptance tests are usually automated. Keyword-driven testing is the latest evolution in test automation approaches. In keyword-driven testing, instructions, inputs, and expected outputs are defined in separate test data. A test automation framework tests the software accordingly and reports the results.
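The separation between test data and framework can be illustrated with a hypothetical keyword-driven test (not taken from the thesis): each row names an instruction together with its inputs or expected outputs.

```
Test Case      Keyword          Argument       Argument
Valid Login    Open Login Page
               Input Username   alice
               Input Password   secret
               Submit Login
               Page Should Be   Welcome Page
```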
In this thesis, the use of acceptance test-driven development with a keyword-driven test automation framework is studied in a real-world agile software development project. The study was conducted using action research during a four-month period. The main methods used were observations and interviews.

It was noticed that the keyword-driven test automation framework can be used in acceptance test-driven development. However, some limitations prevented the implementation of all the test cases before the software implementation started. It was also noticed that the test automation framework used to implement the acceptance test cases does not play a crucial role in acceptance test-driven development. The biggest benefits were gained from the detailed planning done before the software implementation at the beginning of the iterations.

Based on the results, acceptance test-driven development improves communication and cooperation, and gives a common understanding of the details of the software's features. These improvements help the development team to implement the wanted features; therefore, the risk of building incomplete software decreases. They also help to implement the features more efficiently, as the features are more likely to be implemented correctly the first time. Remarkable changes to the test engineers' role were also noticed, as the test engineers are more involved in the detailed planning. It seems that the biggest challenge in acceptance test-driven development is creating tests at the right test levels and in the right scope.
Keywords: acceptance test-driven development, keyword-driven testing, agile testing, test automation
HELSINKI UNIVERSITY OF TECHNOLOGY
Department of Computer Science and Engineering
ABSTRACT OF MASTER'S THESIS (IN FINNISH)

Author: Juha Rantanen
Title of thesis: Acceptance Test-Driven Development with a Keyword-Driven Test Automation Framework in an Agile Software Project
Date: May 18, 2007
Pages: 102
Professorship: Software Business and Engineering
Professorship Code: T-76
Supervisor: Professor Tomi Männistö
Instructor: Harri Töhönen, M.Sc.

Agile software development is based on an iterative approach. Iterativeness allows the software requirements to be changed and updated periodically. In agile software development processes, customer-defined tests have an important role in ensuring that the software under development fulfills the customer's needs. These tests can be defined before implementation starts to establish a clear goal for the development team. This is called acceptance test-driven development.

In acceptance test-driven development, the acceptance tests are usually automated. One of the newest test automation approaches is keyword-driven testing. In keyword-driven testing, instructions, inputs, and expected results are defined in separate test data. A test automation framework tests the software according to this data and reports the results.

This master's thesis examines the use of a keyword-driven test automation framework in acceptance test-driven development. The subject of the study was an ongoing agile software development project. The approach used was action research, and the main methods were observations and interviews. The research period was four months.

The study found that a keyword-driven test automation framework can be used in acceptance test-driven development. However, some limitations prevented creating the tests before the software implementation started. It was also noticed that the test automation framework used to create the test cases does not play a crucial role in acceptance test-driven development. The biggest benefits were gained from the detailed planning done before the software implementation at the beginning of each iteration.

Based on the results, acceptance test-driven development promotes communication and cooperation between the different parties and a common understanding of the details of the software's features. This furthers the implementation of the wanted features, so the risk of building non-functioning or incorrectly functioning software decreases. This also contributes to more efficient software development, as the right features are more likely to be produced on the first implementation round. Remarkable changes were also noticed in the testers' role, owing to the testers' increased participation in detailed planning. It seems that the biggest challenges in acceptance test-driven development relate to creating tests at the right test levels and in the right scope.

Keywords: acceptance test-driven development, keyword-driven testing, agile testing, test automation
ACKNOWLEDGEMENTS

This master's thesis has been written for the Finnish software testing consultancy company Qentinel during the years 2006 and 2007. I would like to thank all the Qentinelians who have made this possible.

Big thanks belong to my instructor Harri Töhönen for his interest, valuable feedback, and the time he used for listening to and commenting on my ideas.

I would like to express my gratitude to my supervisor Tomi Männistö, who gave advice and comments when they were needed.

I would like to thank Petri Haapio and Pekka Laukkanen, with whom I have been working and who have given valuable ideas, comments, and feedback. The discussions with these two professionals have improved my know-how about agile software development and test automation. That know-how has been priceless during this work.

I also wish to thank all the members of the project where the research was carried out. It has been very rewarding to work with them.

My good friend Pauli Aho also deserves to be thanked. I am deeply indebted to him for using his time to check the language of this thesis.

Finally, special thanks go to my lovely wife Aino for the help and support I received during this project. I am grateful to her for being so patient.
TABLE OF CONTENTS

TERMS ............................................................................. vii
1 INTRODUCTION .................................................................... 1
1.1 Motivation ..................................................................... 1
1.2 Aim of the Thesis .............................................................. 3
1.3 Structure of the Thesis ........................................................ 3
2 TRADITIONAL TESTING ............................................................. 4
2.1 Purpose of Testing ............................................................. 4
2.2 Dynamic and Static Testing ..................................................... 4
2.3 Functional and Non-Functional Testing .......................................... 4
2.4 White-Box and Black-Box Testing ................................................ 5
2.5 Test Levels .................................................................... 5
3 AGILE AND ITERATIVE SOFTWARE DEVELOPMENT ........................................ 9
3.1 Iterative Development Model .................................................... 9
3.2 Agile Development ............................................................. 10
3.3 Scrum ......................................................................... 11
3.4 Extreme Programming ........................................................... 15
3.5 Scrum and Extreme Programming Together ........................................ 17
3.6 Measuring Progress in Agile Projects .......................................... 17
4 TESTING IN AGILE SOFTWARE DEVELOPMENT .......................................... 19
4.1 Purpose of Testing ............................................................ 19
4.2 Test Levels ................................................................... 19
4.3 Acceptance Test-Driven Development ............................................ 22
5 TEST AUTOMATION APPROACHES ..................................................... 28
5.1 Test Automation ............................................................... 28
5.2 Evolution of Test Automation Frameworks ....................................... 29
5.3 Keyword-Driven Testing ........................................................ 29
6 KEYWORD-DRIVEN TEST AUTOMATION FRAMEWORK ....................................... 32
6.1 Keyword-Driven Test Automation Framework ...................................... 32
6.2 Test Data ..................................................................... 33
6.3 Test Execution ................................................................ 35
6.4 Test Reporting ................................................................ 35
7 EXAMPLE OF ACCEPTANCE TEST-DRIVEN DEVELOPMENT WITH KEYWORD-DRIVEN TEST AUTOMATION FRAMEWORK ... 36
7.1 Test Data between User Stories and System under Test .......................... 36
7.2 User Stories .................................................................. 37
7.3 Defining Acceptance Tests ..................................................... 37
7.4 Implementing Acceptance Tests and Application ................................. 39
8 ELABORATED GOALS OF THE THESIS .................................................. 45
8.1 Scope ......................................................................... 45
8.2 Research Questions ............................................................ 45
9 RESEARCH SUBJECT AND METHOD ..................................................... 47
9.1 Case Project .................................................................. 47
9.2 Research Method ............................................................... 47
9.3 Data Collection ............................................................... 49
10 ACCEPTANCE TEST-DRIVEN DEVELOPMENT WITH KEYWORD-DRIVEN TEST AUTOMATION FRAMEWORK IN THE PROJECT UNDER STUDY ... 51
10.1 Development Model and Development Practices Used in the Project ............. 51
10.2 January Sprint ............................................................... 52
10.3 February Sprint .............................................................. 55
10.4 March Sprint ................................................................. 61
10.5 April Sprint ................................................................. 63
10.6 Interviews ................................................................... 65
11 ANALYSES OF OBSERVATIONS ....................................................... 72
11.1 Suitability of the Keyword-Driven Test Automation Framework with Acceptance Test-Driven Development ... 72
11.2 Use of the Keyword-Driven Test Automation Framework with Acceptance Test-Driven Development ... 76
11.3 Benefits, Challenges and Drawbacks of Acceptance Test-Driven Development with Keyword-Driven Test Automation Framework ... 78
11.4 Good Practices ............................................................... 87
12 DISCUSSION AND CONCLUSIONS ..................................................... 89
12.1 Researcher's Experience ...................................................... 89
12.2 Main Conclusions ............................................................. 89
12.3 Validity ..................................................................... 90
12.4 Evaluation of the Thesis ..................................................... 92
12.5 Further Research Areas ....................................................... 92
BIBLIOGRAPHY ...................................................................... 94
APPENDIX A PRINCIPLES BEHIND THE AGILE MANIFESTO ................................ 101
APPENDIX B INTERVIEW QUESTIONS .................................................. 102
TERMS

Acceptance Criteria
The exit criteria that a component or system must satisfy in order to be accepted by a user, customer, or other authorized entity. (IEEE Std 610.12-1990)

Acceptance Testing
Formal testing with respect to user needs, requirements, and business processes conducted to determine whether or not a system satisfies the acceptance criteria and to enable the user, customers or other authorized entity to determine whether or not to accept the system. (IEEE Std 610.12-1990) See also component testing, integration testing and system testing.

Acceptance Test-Driven Development (ATDD)
A way of developing software where the acceptance test cases are developed, and often automated, before the software is developed to run those test cases. See also test-driven development.

Actual Result
The behavior produced/observed when a component or system is tested. (ISTQB 2006)

Agile Testing
Testing practice for a project using agile methodologies, such as extreme programming (XP), treating development as the customer of testing and emphasizing the test-first design paradigm. (ISTQB 2006) See also test-driven development and acceptance test-driven development.

Base Keyword
A keyword implemented in a test library of a keyword-driven test automation framework. (Laukkanen 2006) See also sentence format keyword and user keyword.

Behavior
The response of a component or system to a set of input values and preconditions. (ISTQB 2006)

Bespoke Software
Software developed specifically for a set of users or customers. The opposite is off-the-shelf software. (ISTQB 2006)

Beta Testing
Operational testing by potential and/or existing users/customers at an external site not otherwise involved with the developers, to determine whether or not a component or system satisfies the user/customer needs and fits within the business processes. Beta testing is often employed as a form of external acceptance testing for off-the-shelf software in order to acquire feedback from the market. (ISTQB 2006)

Black-box Testing
Testing, either functional or non-functional, without reference to the internal structure of the component or system. (ISTQB 2006) See also white-box testing.

Bug
See defect.

Capture/Playback Tool
A type of test execution tool where inputs are recorded during manual testing in order to generate automated test scripts that can be executed later (i.e. replayed). These tools are often used to support automated regression testing. (ISTQB 2006)

Component
A minimal software item that can be tested in isolation. (ISTQB 2006)
Component Testing
The testing of individual software components. (IEEE Std 610.12-1990)

Context-Driven Testing
A testing methodology that underlines the importance of the context where different testing practices are used over the practices themselves. The main message is that there are good practices in a context but there are no general best practices. (Kaner et al. 2001a)

Daily Build
A development activity where a complete system is compiled and linked every day (usually overnight), so that a consistent system is available at any time including all latest changes. (ISTQB 2006)

Data-Driven Testing
A scripting technique that stores test input and expected results in a table or spreadsheet, so that a single control script can execute all of the tests in the table. Data-driven testing is often used to support the application of test execution tools such as capture/playback tools. (Fewster & Graham 1999) See also keyword-driven testing.

Defect
A flaw in a component or system that can cause the component or system to fail to perform its required function, e.g. an incorrect statement or data definition. A defect, if encountered during execution, may cause a failure of the component or system. (ISTQB 2006)

Defined Process
In a defined process every piece of work is well understood. With well-defined input, the defined process can be started and allowed to run until completion, ending with the same results every time. (Schwaber & Beedle 2002) See also empirical process.

Dynamic Testing
Testing that involves the execution of the software of a component or system. (ISTQB 2006) See also static testing.

Empirical Process
In an empirical process the unexpected is expected. An empirical process provides and exercises control through frequent inspection and adaptation in imperfectly defined environments where unpredictable and unrepeatable outputs are generated. (Schwaber & Beedle 2002) See also defined process.

Expected Outcome
See expected result.

Expected Result
The behavior predicted by the specification, or another source, of the component or system under specified conditions. (ISTQB 2006)

Exploratory Testing
An informal test design technique where the tester actively controls the design of the tests as those tests are performed and uses information gained while testing to design new and better tests. (Bach 2003b)

Fail
A test is deemed to fail if its actual result does not match its expected result. (ISTQB 2006)

Failure
Deviation of the component or system from its expected delivery, service or result. (Fenton 1996)

Fault
See defect.
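For illustration, the data-driven scripting technique defined above can be sketched in a few lines of Python; the `add` function here is a hypothetical system under test, not anything from the thesis.

```python
# Illustrative data-driven testing sketch. One control script drives
# every case read from a table of inputs and expected results.

def add(a, b):
    # Stand-in for the system under test
    return a + b

# Test input and expected results kept in a table, separate from the script
CASES = [
    (1, 2, 3),
    (0, 0, 0),
    (-1, 1, 0),
]

def run_cases(cases):
    # A single control script executes every row of the table
    return [add(a, b) == expected for a, b, expected in cases]
```

In practice the table would live in a spreadsheet or data file so that non-programmers can add cases without touching the control script.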
Feature
An attribute of a component or system specified or implied by requirements documentation (for example reliability, usability or design constraints). (IEEE Std 1008-1987)

Feature Creep
On-going requirements increase without corresponding adjustment of approved cost and schedule allowances. As some projects progress, especially through the definition and development phases, requirements tend to change incrementally, causing the project manager to add to the project's mission or objectives without getting a corresponding increase in the time and budget allowances. (Wideman 2002)

Functional Testing
Testing based on an analysis of the specification of the functionality of a component or system. (ISTQB 2006) See also black-box testing.

Functionality
The capability of the software product to provide functions which meet stated and implied needs when the software is used under specified conditions. (ISO/IEC Std 9126-1:2001)

High Level Test Case
A test case without concrete (implementation level) values for input data and expected results. Logical operators are used; instances of the actual values are not yet defined and/or available. (ISTQB 2006) See also low level test case.

Input
A variable (whether stored within a component or outside) that is read by a component. (ISTQB 2006)

Input Value
An instance of an input. (ISTQB 2006) See also input.

Information Radiator
A large display of critical team information that is continuously updated and located in a spot where the team can see it constantly. (Agile Advice 2005)

Integration Testing
Testing performed to expose defects in the interfaces and in the interactions between integrated components or systems. (ISTQB 2006) See also component testing, system testing and acceptance testing.

Iterative Development Model
A development life cycle where a project is broken into a usually large number of iterations. An iteration is a complete development loop resulting in a release (internal or external) of an executable product, a subset of the final product under development, which grows from iteration to iteration to become the final product. (ISTQB 2006)

Keyword
A directive representing a single action in keyword-driven testing. (Laukkanen 2006)

Keyword-Driven Test Automation Framework
A test automation framework using the keyword-driven testing technique.

Keyword-Driven Testing
A scripting technique that uses data files to contain not only test data and expected results, but also keywords related to the application being tested. The keywords are interpreted by special supporting scripts that are called by the control script for the test. (ISTQB 2006) See also data-driven testing.
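As a concrete sketch of this definition, the minimal Python driver below (all names are hypothetical, and this is not the framework studied in the thesis) reads keywords and arguments from separate test data and dispatches them to supporting functions in a test library.

```python
# Minimal keyword-driven testing sketch (illustrative names only).

# Base keywords: supporting functions living in a test library.
def input_text(state, field, value):
    # Store an input value into the (simulated) system state
    state[field] = value

def check_sum(state, expected):
    # Verify an expected result against the system's behavior
    total = state.get("a", 0) + state.get("b", 0)
    assert total == int(expected), f"expected {expected}, got {total}"

# Test library maps keyword names to their implementations
LIBRARY = {"Input Text": input_text, "Check Sum": check_sum}

# Test data: keywords and arguments kept separate from the code
TEST_DATA = [
    ("Input Text", ("a", 2)),
    ("Input Text", ("b", 3)),
    ("Check Sum", ("5",)),
]

def run(test_data, library):
    # Control script: interpret each row and record pass/fail
    state, results = {}, []
    for keyword, args in test_data:
        try:
            library[keyword](state, *args)
            results.append((keyword, "PASS"))
        except AssertionError:
            results.append((keyword, "FAIL"))
    return results
```

A real framework would add test data parsing, logging, and reporting on top of this dispatch loop; the point here is only the separation of test data, test library, and control script.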
Low Level Test Case
A test case with concrete (implementation level) values for input data and expected results. Logical operators from high level test cases are replaced by actual values that correspond to the objectives of the logical operators. (ISTQB 2006) See also high level test case.

Negative Testing
Tests aimed at showing that a component or system does not work. Negative testing is related to the testers' attitude rather than a specific test approach or test design technique, e.g. testing with invalid input values or exceptions. (Beizer 1990)

Non-functional Testing
Testing the attributes of a component or system that do not relate to functionality, e.g. reliability, efficiency, usability, maintainability and portability. (ISTQB 2006)

Off-the-shelf Software
A software product that is developed for the general market, i.e. for a large number of customers, and that is delivered to many customers in identical format. (ISTQB 2006)

Output
A variable (whether stored within a component or outside) that is written by a component. (ISTQB 2006)

Output Value
An instance of an output. (ISTQB 2006) See also output.

Pass
A test is deemed to pass if its actual result matches its expected result. (ISTQB 2006)

Postcondition
Environmental and state conditions that must be fulfilled after the execution of a test or test procedure. (ISTQB 2006)

Precondition
Environmental and state conditions that must be fulfilled before the component or system can be executed with a particular test or test procedure. (ISTQB 2006)

Problem
See defect.

Quality
The degree to which a component, system or process meets specified requirements and/or user/customer needs and expectations. (IEEE Std 610.12-1990)

Quality Assurance
Part of quality management focused on providing confidence that quality requirements will be fulfilled. (ISO Std 9000-2005)

Regression Testing
Testing of a previously tested program following modification to ensure that defects have not been introduced or uncovered in unchanged areas of the software, as a result of the changes made. It is performed when the software or its environment is changed. (ISTQB 2006)

Requirement
A condition or capability needed by a user to solve a problem or achieve an objective that must be met or possessed by a system or system component to satisfy a contract, standard, specification, or other formally imposed document. (IEEE Std 610.12-1990)

Result
The consequence/outcome of the execution of a test. It includes outputs to screens, changes to data, reports, and communication messages sent out. (ISTQB 2006) See also actual result and expected result.
Running Tested Features (RTF): A metric to measure the progress of an agile team. (Jeffries 2004)

Sentence Format Keyword: Term defined in this thesis for keywords whose name is a sentence and which take no arguments. See also base keyword and user keyword.

Software: Computer programs, procedures, and possibly associated documentation and data pertaining to the operation of a computer system. (IEEE Std 610.12-1990)

Software Quality: The totality of functionality and features of a software product that bear on its ability to satisfy stated or implied needs. (ISO/IEC Std 9126-1:2001)

Static Code Analysis: Analysis of source code carried out without execution of that software. (ISTQB 2006)

Static Testing: Testing of a component or system at specification or implementation level without execution of that software, e.g. reviews or static code analysis. (ISTQB 2006) See also dynamic testing.

System: A collection of components organized to accomplish a specific function or set of functions. (IEEE Std 610.12-1990)

System Testing: The process of testing an integrated system to verify that it meets specified requirements. (Burnstein 2003) See also component testing, integration testing and acceptance testing.
System Under Test (SUT): The entire system or product to be tested. (Craig and Jaskiel 2002)

Test: A set of one or more test cases. (IEEE Std 829-1983)
Test Automation: The use of software to perform or support test activities, e.g. test management, test design, test execution and results checking. (ISTQB 2006)

Test Automation Framework: A framework used for test automation. Provides some core functionality (e.g. logging and reporting) and allows its testing capabilities to be extended by adding new test libraries. (Laukkanen 2006)

Test Case: A set of input values, execution preconditions, expected results and execution postconditions, developed for a particular objective or test condition, such as to exercise a particular program path or to verify compliance with a specific requirement. (IEEE Std 610.12-1990)

Test Data: Data that exists (for example, in a database) before a test is executed, and that affects or is affected by the component or system under test. (ISTQB 2006)

Test-Driven Development (TDD): A way of developing software where the test cases are developed, and often automated, before the software is developed to run those test cases. (ISTQB 2006)

Test Execution: The process of running a test on the component or system under test, producing actual result(s). (ISTQB 2006)
Test Execution Automation: The use of software, e.g. capture/playback tools, to control the execution of tests, the comparison of actual results to expected results, the setting up of test preconditions, and other test control and reporting functions. (ISTQB 2006)

Test Engineer: See tester.

Test Input: The data received from an external source by the test object during test execution. The external source can be hardware, software or human. (ISTQB 2006)

Test Level: A group of test activities that are organized and managed together. A test level is linked to the responsibilities in a project. Examples of test levels are component test, integration test, system test and acceptance test. (Pol 2002)

Test Log: A chronological record of relevant details about the execution of tests. (IEEE Std 829-1983)

Test Logging: The process of recording information about tests executed into a test log. (ISTQB 2006)

Test Report: A document summarizing testing activities and results. (IEEE Std 829-1983)

Test Run: Execution of a test on a specific version of the test object. (ISTQB 2006)

Test Runner: A generic driver script capable of executing different kinds of test cases, not only variations with slightly different test data. (Laukkanen 2006)

Test Result: See result.

Test Script: Commonly used to refer to a test procedure specification, especially an automated one. (ISTQB 2006)

Test Set: See test suite.

Test Suite: A set of several test cases for a component or system under test, where the postcondition of one test is often used as the precondition for the next one. (ISTQB 2006)

Testability: The capability of the software product to enable modified software to be tested. (ISO/IEC Std 9126-1:2001)

Tester: A skilled professional who is involved in the testing of a component or system. (ISTQB 2006)

Testing: The process consisting of all life cycle activities, both static and dynamic, concerned with planning, preparation and evaluation of software products and related work products to determine that they satisfy specified requirements, to demonstrate that they are fit for purpose and to detect defects. (ISTQB 2006)

User Keyword: A keyword constructed from base keywords and other user keywords in a test design system. User keywords can be created easily even without programming skills. (Laukkanen 2006) See also base keyword and sentence format keyword.
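The relationship between base keywords, user keywords, and sentence format keywords can be illustrated with a minimal Python sketch. All class and keyword names below are hypothetical illustrations of the glossary terms, not taken from the framework used in this thesis; the sketch only shows how a user keyword is composed of other keywords that a runner resolves by name.

```python
# Minimal sketch of keyword resolution in a keyword-driven framework.
# All names are invented for illustration of the glossary terms above.

class TestLibrary:
    """Test library: provides base keywords as methods."""

    def __init__(self):
        self.log = []  # test log: chronological record of executed actions

    def input_text(self, field, text):
        self.log.append(f"input {text!r} into {field}")

    def click_button(self, name):
        self.log.append(f"click {name}")


class KeywordRunner:
    """Resolves a keyword name either to a user keyword or to a base
    keyword implemented in a test library."""

    def __init__(self, library):
        self.library = library
        self.user_keywords = {}  # name -> list of (keyword name, args)

    def define_user_keyword(self, name, steps):
        # A user keyword is just a named sequence of other keywords.
        self.user_keywords[name] = steps

    def run(self, name, *args):
        if name in self.user_keywords:
            for step_name, step_args in self.user_keywords[name]:
                self.run(step_name, *step_args)
        else:
            method_name = name.lower().replace(" ", "_")
            getattr(self.library, method_name)(*args)


runner = KeywordRunner(TestLibrary())
# "Log In As Guest" is a sentence format keyword: its name is a full
# sentence and it takes no arguments.
runner.define_user_keyword("Log In As Guest", [
    ("Input Text", ("username", "guest")),
    ("Input Text", ("password", "guest")),
    ("Click Button", ("Login",)),
])
runner.run("Log In As Guest")
print(runner.library.log)
```

Note how the user keyword needs no programming: it is only a list of existing keyword names and arguments, while the base keywords live in a test library written in the framework's implementation language.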
Unit Testing: See component testing.

Variable: An element of storage in a computer that is accessible by a software program by referring to it by a name. (ISTQB 2006)

White-Box Testing: Testing based on an analysis of the internal structure of the component or system. (ISTQB 2006) See also black-box testing.
1 INTRODUCTION<br />
1.1 Motivation<br />
Quality is one of the most important aspects of software products. If software does not work, it is worth little. The drawbacks caused by faulty software can far outweigh the advantages gained from using it. Malfunctioning or hard-to-use software complicates daily life, and in life-critical systems faults may even cost human lives. In highly competitive markets, quality may determine which software products succeed and which fail. Low-quality software products damage a firm's reputation and, unquestionably, its sales, and unhappy customers are more willing to switch to other software suppliers. For these reasons, organizations have to invest in the quality of their software products.
Even high-quality software can fail in the market if it does not meet the customers' needs. At the beginning of a software project it is common that the customers' exact needs are unknown. This may lead to guessing which features are wanted and to developing useless features, or in the worst case useless software. This should obviously be avoided.
New feature ideas usually arise once the customer understands the problem domain more thoroughly. This can be quite problematic if strict contractual agreements on the developed features exist. Even when it is contractually possible to add new features to the software, a lot of rework may be needed before the features are ready for use.
Iterative, and especially agile, software processes have been introduced as a solution to changing requirements. The basic idea in iterative processes is to create the software in small steps. When software is developed in this way, the customer can try out the developed software, and based on the customer's feedback the development team can create features that are valuable for the customer. The most valuable features are developed first, allowing the customer to start using the software earlier than with a non-iterative development process.
Iterative software development adds new challenges for software testing. In traditional software projects the main part of the testing is conducted at the end of the development project. With iterative and agile processes, however, the software should be tested in every iteration. If the customer uses the result of the iteration, at least all the major problems should be solved before the product can be delivered. In an ideal situation each iteration outcome would be high-quality software.
In the agile methods the need for testing is understood, and there are development practices that are used to assure the quality of the software. Many of these practices are targeted at developers and used to test that the code works as the developers intended. To also test that the features fulfill the customer's requirements, higher-level testing is needed. This higher-level testing is often called acceptance testing or customer testing. Customer input is needed to define these higher-level test cases to make sure that her requirements are met.
Because the software is developed in an iterative manner and there is continuous change, it would be beneficial to test all the features at least once during each iteration. Repeated testing is needed because the changes may have introduced defects. Manually testing all functionality after every change is not feasible. It may be possible at the beginning, but as the number of features grows, manual regression testing becomes harder and eventually impossible. This leads to a situation in which changes made late in the iteration may have caused faults that go unnoticed in testing. And even if the faults were noticed, the developers might not be able to fix them during the iteration.
Test automation can be used to support the testing effort. Test automation means testing software with other software. When software and computers are used for testing, test execution can be conducted much faster than manually. If the automated tests can be executed daily or even more often, the status of the developed software is continuously known. Problems can therefore be found faster, and the changes causing them can be pinpointed. That is why test automation is an integral part of agile software development.
By automating the customer-defined acceptance tests, the test cases defining how the system should work from the customer's point of view can be executed often. This makes it possible to know the status of the software at any point of the development. In acceptance test-driven development this approach is taken even further: the acceptance tests are used not only for verifying that the system works but also for driving the system development. The customer-defined test cases are created before the implementation starts. The goal of the implementation is then to develop software that passes all the acceptance test cases.
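The cycle can be sketched in Python. The account example and all names here are invented for illustration and are not taken from the case project; the point is only the ordering: the customer-defined test exists first, and the implementation is then written to make it pass.

```python
# Sketch of acceptance test-driven development: the customer-defined
# test is written first, the implementation afterwards to satisfy it.
# All names are illustrative, not from the case project.

def acceptance_test_withdrawal(account_cls):
    """Customer-defined test: withdrawing reduces the balance, and
    overdrawing is rejected."""
    account = account_cls(balance=100)
    account.withdraw(30)
    assert account.balance == 70
    try:
        account.withdraw(1000)
        assert False, "overdraft should be rejected"
    except ValueError:
        pass
    return "pass"

# Implementation written afterwards, with the explicit goal of
# passing the acceptance test above.
class Account:
    def __init__(self, balance):
        self.balance = balance

    def withdraw(self, amount):
        if amount > self.balance:
            raise ValueError("insufficient funds")
        self.balance -= amount

print(acceptance_test_withdrawal(Account))
```

Running the test before `Account` exists would fail, which is exactly the intended starting state in acceptance test-driven development.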
1.2 Aim of the Thesis<br />
The aim of this thesis is to investigate whether acceptance test-driven development can be used with an in-house keyword-driven test automation framework. The research is conducted in a real-life agile software development project, and the suitability of the approach is evaluated in this case project. The pros and cons of the approach are also evaluated. More detailed research questions follow in Chapter 8, after the concepts of acceptance test-driven development and keyword-driven test automation have been clarified. One purpose is to present the framework usage at a level that helps others to try the approach with similar kinds of tools.
1.3 Structure of the Thesis<br />
The structure of this thesis is the following: in Chapter 2, traditional software testing is described to introduce the basic concepts needed in the following chapters. Chapter 3 describes the basis of agile and iterative software development. Testing in agile software development is introduced in Chapter 4, which also covers acceptance test-driven development, the main topic of this thesis. Chapter 5 covers test automation approaches in general and the keyword-driven test automation approach in particular. After the keyword-driven approach is introduced, the keyword-driven test automation framework used in this thesis is explained in Chapter 6, at the level needed to understand the following chapters. Chapter 7 contains a simple, fictitious example of using the presented keyword-driven test automation framework with acceptance test-driven development.
The research questions are defined in Chapter 8. The case project and the product developed in it are described in Chapter 9, which also explains the research method used to conduct this research. Chapter 10 contains all the results from the project. First the development model used in the case project is described. Then the use of acceptance test-driven development with the keyword-driven test automation framework is presented. Chapter 10 also contains results from the interviews conducted at the end of the research. In Chapter 11 the observations gained from the case project are analyzed. Chapter 12 contains the conclusions and a discussion of the results and of the meaning of the analysis in a wider perspective. Further research areas are presented at the end of Chapter 12.
2 TRADITIONAL TESTING<br />
This chapter describes traditional testing terminology and the divisions of different testing aspects. The purpose is to give an overall view of the testing field, making it possible in the following chapters to compare agile testing to traditional testing and to place the research area in a wider context.
2.1 Purpose of Testing
Testing is an integral part of software development. The goal of software testing is to find faults in the developed software and to make sure they get fixed (Kaner et al. 1999, Patton 2000). It is important to find the faults as early as possible, because fixing them is more expensive in the later phases of development (Kaner et al. 1999, Patton 2000). The purpose of testing is also to provide information about the current state of the developed software from the quality perspective (Burnstein 2003). One might argue that software testing should make sure that software works correctly. This is, however, impossible, because even a simple piece of software has millions of paths that would all have to be tested to make sure that it works correctly (Kaner et al. 1999).
2.2 Dynamic and Static Testing
On a high level, software testing can be divided into dynamic and static testing. The division into these two categories is based on whether the software is executed or not. Static testing means testing without executing the code. This can be done with different kinds of reviews; reviewed items can be documents or code. Other static testing methods are static code analysis methods, for example syntax correctness and code complexity analysis. With static testing, faults can be found in an early phase of software development, because the testing can be started before any code is written. (IEEE Std 610.12-1990; Burnstein 2003)

Dynamic testing is the opposite of static testing. The system under test is tested by executing it or parts of it. Dynamic testing can be divided into functional and non-functional testing, which are presented below. (Burnstein 2003)
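Static code analysis of the kind mentioned above can be sketched with Python's standard `ast` module: the analyzed code is parsed but never executed. The source snippet and the simple decision-point count are invented for illustration; real static analysis tools compute far richer metrics.

```python
# Sketch of static testing: the code below is analyzed, never executed.
# Syntax correctness is checked by parsing, and a rough complexity
# measure is taken by counting decision points in the syntax tree.
import ast

source = """
def classify(x):
    if x < 0:
        return "negative"
    for _ in range(3):
        pass
    return "non-negative"
"""

tree = ast.parse(source)  # raises SyntaxError if the code is invalid
decisions = sum(isinstance(node, (ast.If, ast.For, ast.While))
                for node in ast.walk(tree))
print("syntactically valid, decision points:", decisions)
```

Because nothing in `source` runs, faults such as syntax errors or excessive complexity are found before the code is ever executed, which is the point of static testing.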
2.3 Functional and Non-Functional Testing
The purpose of functional testing is to verify that the software corresponds to the requirements defined for the system. The focus in functional testing is on entering inputs to the system under test and verifying the proper output and state. The concept of functional testing is similar for all systems, even though the inputs and outputs differ from system to system.
Non-functional testing means testing the quality aspects of software. Examples of non-functional testing are performance, security, usability, portability, reliability, and memory management testing. Each kind of non-functional testing needs a different approach and different kinds of know-how and resources. The needed non-functional testing is always decided based on the quality attributes of the system and is therefore selected on a case-by-case basis. (Burnstein 2003)
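As a minimal sketch of the performance kind of non-functional testing mentioned above, the following Python fragment checks a timing budget rather than a functional result. The operation and the one-second budget are invented for illustration.

```python
# Sketch of a non-functional (performance) test: unlike a functional
# test, it checks a quality attribute, here response time.
import time

def operation_under_test():
    # Stand-in for the real operation whose speed matters.
    return sum(range(10_000))

start = time.perf_counter()
result = operation_under_test()
elapsed = time.perf_counter() - start

# The one-second budget is an invented example requirement.
assert elapsed < 1.0, f"too slow: {elapsed:.3f}s"
print("performance test passed")
```

The functional answer (`result`) is incidental here; the test passes or fails on the measured quality attribute.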
2.4 White-Box and Black-Box Testing
There are two basic testing strategies, white-box testing and black-box testing. When the white-box strategy is used, the internal structure of the system under test is known. The purpose is to verify the correct behavior of the internal structural elements. This can be done, for example, by exercising all the statements or all conditional branches. Because white-box testing is quite time consuming, it is usually done for small parts of the system at a time. White-box testing methods are useful in finding design, code-based control, logic and sequence defects, initialization defects, and data flow defects. (Burnstein 2003)
In black-box testing the system under test is seen as an opaque box. There is no knowledge of the inner structure of the software; the only knowledge is of how the software should behave. The intention of black-box testing is to provide inputs to the system under test and verify that the system works as defined in the specifications. Because the black-box approach considers only the behavior and functionality of the system under test, it is also called functional testing. The black-box strategy reveals requirement and specification defects, and it can be used at all test levels defined in the following section. (Burnstein 2003)
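The contrast between the two strategies can be shown on a single function. The function and its pricing rules below are invented for illustration: the black-box cases are derived from the stated specification alone, while the white-box cases are chosen by reading the code so that every branch is exercised.

```python
# One function tested with both strategies.

def shipping_cost(weight_kg):
    if weight_kg <= 0:
        raise ValueError("weight must be positive")
    if weight_kg <= 5:
        return 10
    return 10 + (weight_kg - 5) * 2

# Black-box: cases come from the specification only ("5 kg or less
# costs 10; each additional kg costs 2 more"); the code is not read.
assert shipping_cost(3) == 10
assert shipping_cost(7) == 14

# White-box: cases are chosen by reading the code so that every branch
# (error branch, flat rate, per-kilogram rate) is executed at least once.
try:
    shipping_cost(0)
    raised = False
except ValueError:
    raised = True
assert raised
assert shipping_cost(5) == 10   # boundary of the flat-rate branch
assert shipping_cost(6) == 12   # per-kilogram branch

print("all branches exercised")
```

Note that the black-box cases would stay valid if the implementation were rewritten, whereas the white-box cases are tied to the current internal structure.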
2.5 Test Levels
Testing can be performed on multiple levels. Usually software testing is divided into unit testing, integration testing, system testing, and acceptance testing (Dustin et al. 1999; Craig & Jaskiel 2002; Burnstein 2003). The purpose of these different test levels is to investigate and test the software from different perspectives and to find different types of defects (Burnstein 2003). If the division into levels is done from the test automation perspective, the levels can be unit testing, component testing and system testing (Meszaros 2003; Laukkanen 2006). In this thesis, whenever traditional test levels are used, the division into unit, integration, system, and acceptance testing is meant. Figure 1 shows these test levels and their relative order.
Figure 1: Test levels (Burnstein 2003)
UNIT TESTING<br />
The smallest part of software is a unit. A unit is traditionally viewed as a function or a procedure in an (imperative) programming language. In object-oriented systems, methods and classes/objects can be seen as units. A unit can also be a small component or a programming library. The principal goal of unit testing is to detect functional and structural defects in the unit. Sometimes the name component is used instead of unit; in that case this phase is called component testing. (Burnstein 2003)
There are different opinions about who should create unit tests. Unit testing is in most cases best handled by the developers, who know the code under test and the techniques needed (Dustin et al. 1999; Craig & Jaskiel 2002; Mosley & Posey 2002). On the other hand, Burnstein (2003) thinks that an independent tester should plan and execute the unit tests. The latter is the more traditional point of view, holding that nobody should evaluate their own work.
Unit testing can be started in an early phase of software development, as soon as the unit is created. The failures revealed by unit tests are usually easy to locate and repair, since only one unit is under consideration (Burnstein 2003). For these reasons, finding and fixing defects is cheapest at the unit test level.
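A unit test of the kind described above can be sketched with Python's built-in `unittest` module. The unit itself (`parse_version`) is invented for illustration; the sketch shows a single small unit tested in isolation, including its error behavior.

```python
# Sketch of unit testing: one small unit is tested in isolation with
# Python's standard unittest module.
import unittest

def parse_version(text):
    """The unit under test: '1.2.3' -> (1, 2, 3)."""
    return tuple(int(part) for part in text.split("."))

class ParseVersionTest(unittest.TestCase):
    def test_three_part_version(self):
        self.assertEqual(parse_version("1.2.3"), (1, 2, 3))

    def test_non_numeric_part_is_rejected(self):
        with self.assertRaises(ValueError):
            parse_version("1.x.3")

suite = unittest.defaultTestLoader.loadTestsFromTestCase(ParseVersionTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print("tests run:", result.testsRun, "failures:", len(result.failures))
```

Because only one unit is involved, any failure reported by these tests points directly at `parse_version`, which is why defects are cheapest to locate and fix at this level.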
INTEGRATION TESTING<br />
When units are combined, the resulting group of units is called a subsystem or, sometimes in object-oriented software systems, a cluster. The goal of integration testing is to verify that the component/class interfaces work correctly and that the control and data flows between the components work correctly. (Burnstein 2003)
SYSTEM TESTING<br />
When the finished and tested subsystems are combined into the final system, system test execution can be started. System tests evaluate both the functional behavior and the non-functional qualities of the system. The goal is to ensure that the system performs according to its requirements when tested as a whole. After system testing is done and the found faults are corrected, the system is ready for the customer's acceptance testing, alpha testing or beta testing (see the next paragraph). If the customer has defined the acceptance tests, these can be used in the system testing phase to assure the quality of the system from the customer's point of view. (Burnstein 2003)
ACCEPTANCE TESTING<br />
When a software product is custom-made, the customer wants to verify that the developed software meets her requirements. This verification is done in the acceptance testing phase. The acceptance tests are developed in co-operation between the customer and the test planners and executed after the system testing phase. The purpose is to evaluate the software in terms of the customer's expectations and goals. When the acceptance testing phase is passed, the product is ready for production. If the product is targeted at the mass market, it is often not possible to arrange customer-specific acceptance testing. In these cases the acceptance testing is conducted in two phases called alpha and beta testing. In alpha testing, potential customers and members of the development organization test the product on the development organization's premises. After the defects found in alpha testing are fixed, beta testing can be started. The product is sent to a cross-section of users, who use it in a real-world environment and report the defects they find. (Burnstein 2003)
REGRESSION TESTING<br />
The purpose of regression testing is to ensure that old characteristics still work after changes are made to the software and to verify that the changes have not introduced new defects. Regression testing is not a test level as such, and it can be performed at all test levels. The importance of regression testing increases when the system is released multiple times. The functionality provided in the previous version should still work together with all the new functionality, and verifying this is very time consuming. Therefore it is recommended to use automated testing tools to support this task (Burnstein 2003). Kaner et al. (1999) have also noted that it is common to automate acceptance and regression tests to quickly verify the status of the latest build.
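The idea of an automated regression suite can be sketched as follows. The `discount` feature and its recorded expected results are invented for illustration; the point is that the same table of checks is rerun mechanically after every change, so a change that breaks an old feature is caught immediately.

```python
# Sketch of automated regression testing: the same recorded checks are
# rerun after every change to catch defects introduced into old features.

def discount(price, code):
    """A feature that has already shipped in earlier releases."""
    if code == "SAVE10":
        return round(price * 0.9, 2)
    return price

# Regression suite: inputs and expected results recorded when the
# feature was first accepted.
regression_cases = [
    ((100.0, "SAVE10"), 90.0),
    ((100.0, "NONE"), 100.0),
    ((19.99, "SAVE10"), 17.99),
]

failures = [(args, expected, discount(*args))
            for args, expected in regression_cases
            if discount(*args) != expected]
print("regression failures:", failures)
```

Because the suite is software, running it after each build costs almost nothing, which is what makes full regression testing feasible as the feature count grows.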
3 AGILE AND ITERATIVE SOFTWARE DEVELOPMENT<br />
The purpose of this chapter is to explain the iterative development model and agile methods in general, and to illustrate the development models Scrum and Extreme Programming (XP) in more detail because of their relevance to this thesis.
3.1 Iterative <strong>Development</strong> Model<br />
In the iterative development model, software is built in multiple sequential iterations during the whole lifecycle of the software. An iteration can be seen as a mini-project containing requirement analysis, design, development, and testing. The goal of the iteration is to build an iteration release. An iteration release is a partially completed system which is stable, integrated, and tested. Usually most of the iteration releases are internal and not released to external customers. The final iteration release is the complete product, and it is released to the customer or to the market. (Larman 2004)
Usually a partial system grows incrementally with new features, iteration by iteration. This is called incremental development. The concept of a system growing via iterations has been called iterative and incremental development, although iterative development is the more common term. The features to be implemented in an iteration are decided at the beginning of the iteration. The customer selects the features most valuable at that time, so there is no strict predefined plan. This is called adaptive planning. (Larman 2004)
In modern iterative methods, the recommended length of an iteration is between one and six weeks. In most iterative and incremental development methods the length of the iteration is timeboxed. Timeboxing is a practice which sets a fixed end date for the iteration. A fixed end date means that if the iteration scope cannot be met, the features with the lowest priority are dropped from the scope of the iteration. This way the growing software is always in a stable and tested state at the end of the iteration. (Larman 2004)
Evolutionary iterative development implies that requirements, plans, and solutions evolve and are refined during the iterations, instead of following predefined specifications. There is also the term adaptive development. The difference between these two terms is that adaptive development implies that the received feedback guides the development. (Larman 2004)
Iterative and incremental development makes it possible to repeatedly deliver an enhanced product to the market. This is also called incremental delivery. Usually the incremental deliveries are made every three to twelve months. Evolutionary delivery is a refinement of incremental delivery: in evolutionary delivery the goal is to collect feedback and plan the content of the next delivery based on it, whereas in incremental delivery the feedback does not drive the delivery plan. In practice, however, there is always both predefined and feedback-based planning, and therefore these two terms are used interchangeably. (Larman 2004)
3.2 Agile <strong>Development</strong><br />
Iterative and incremental development is the core of all agile methods, including Scrum and XP. Agile methods cannot be captured in a single definition, but all of them apply timeboxed iterative and evolutionary delivery as well as adaptive planning. Agile methods also contain values and practices that support agility, meaning rapid and flexible response to change, and they promote practices and principles like simplicity, lightness, communication, self-directed teams, and programming over documentation. The values and principles that guide the agile methods were written down by a group interested in iterative and agile methods in 2001. (Larman 2004) Those values are stated in the Agile Manifesto (Figure 2). The agile software development principles are listed in Appendix A.
Figure 2: Agile Manifesto (Beck et al. 2001a)<br />
3.3 Scrum<br />
Scrum is an agile, lightweight process that can be used to manage and control software and product development, and it uses iterative and incremental development methods. Scrum emphasizes an empirical process rather than a defined process. Scrum consists of four phases: planning, staging, development, and release. In the planning phase, items like the vision, funding and initial requirements are created. In the staging phase, requirements are defined and prioritized so that there is enough content for the first iteration. In the development phase, the development is done in iterations. The release phase contains product tasks like documentation, training, and deployment. (Schwaber & Beedle 2002; Larman 2004; Schwaber 2004)
When using Scrum, the people involved in software development are divided into three roles: product owner, scrum master, and the team. The product owner's task is to get the funding, collect the project's initial requirements and manage the requirements (see Product Backlog below). The team is responsible for developing the functionality. The teams are self-managing, self-organizing, and cross-functional, and their task is to figure out how to convert items in the product backlog into functionality in iterations. Team members are collectively responsible for the success of the iterations and of the project as a whole, and this is one of the core principles of Scrum. The maximum size of the team is seven members. The scrum master is responsible for the Scrum process and for teaching Scrum to everyone in the project. The scrum master also makes sure that everyone follows the rules and practices of Scrum. (Schwaber 2004)
Scrum consists of several practices: the Product Backlog, Daily Scrum Meetings, the Sprint, Sprint Planning, the Sprint Backlog, the Sprint Review, and the Sprint Retrospective. Figure 3 shows an overview of Scrum.
Figure 3: Overview of Scrum (Control Chaos 2006a)<br />
PRODUCT BACKLOG<br />
The Product Backlog is a list of all the features, functions, technologies, enhancements, and bug fixes that constitute the changes to be made to the product for future releases. The items in the product backlog form a prioritized list which evolves all the time. The idea is to add new items to it whenever there are new features or improvement ideas. (Schwaber & Beedle 2002)
SPRINT<br />
Sprint is the name of the timeboxed iteration in Scrum. The length of a sprint is usually 30 calendar days. Sprint planning takes place in two meetings at the beginning of the sprint. In the first meeting, the product owner and the team select the content for the sprint from the product backlog; usually the items with the highest priority and risks are selected. In the second meeting, the team and the product owner consider how to develop the selected features and create the sprint backlog, which contains all the tasks needed to meet the goals of the sprint. The durations of the tasks are estimated in the meeting and updated during the sprint. (Schwaber & Beedle 2002; Larman 2004; Schwaber 2004)
DAILY SCRUM<br />
The development progress is monitored with daily scrum meetings. A daily scrum of the specified form is held every working day at the same time and place. The meeting should not last more than 15 minutes. The team stands in a circle, and the scrum master asks all the team members the following questions:
1. What have you done since the last daily scrum?
2. What are you going to do between now and the next daily scrum?
3. What is preventing you from doing your work?
If any problems are raised during the daily scrum meeting, it is the responsibility of the team to solve them. If the team cannot deal with the problems, they become the responsibility of the scrum master. If there is a need for a decision, the scrum master has to decide the matter within an hour. Other problems the scrum master should solve within one day, before the next daily scrum. (Schwaber & Beedle 2002; Schwaber 2004)
SPRINT REVIEW
At the end of the sprint, the results are shown in the sprint review hosted by the scrum master. The purpose of the sprint review is to demonstrate the completed functionality to the product owner and the stakeholders. After every presentation, all the participants are allowed to voice comments, observations, improvement ideas, change requests, or missing features regarding the presented functionality. All these items are noted down. At the end of the meeting, the items are reviewed and placed in the product backlog for prioritization. (Schwaber & Beedle 2002; Schwaber 2004)
DEFINITION OF DONE
Because only completed functionality can be shown in the sprint review, there is a need to define what "done" means. Otherwise one person might think that functionality is done when a feature is implemented, while another thinks it is done only when it is properly tested, documented, and ready to be deployed to production. Schwaber (2004) recommends having a definition of done that is written down and agreed on by all members of the team. This way all stakeholders know the condition of the demonstrated functionality.
SPRINT RETROSPECTIVE
The sprint retrospective meeting is used to improve the performance of the scrum team. The sprint retrospective takes place at the end of the sprint, and the participants are the scrum master and the team. Two questions, "What went well during the last sprint?" and "What could be improved in the next sprint?", are asked of all the team members. Improvement ideas are prioritized, and the ideas that should be taken into the next sprint are added as high-priority nonfunctional items to the product backlog. (Schwaber 2004)
RULES IN SCRUM
In addition to the aspects mentioned earlier, there are a few more rules in Scrum. It is forbidden to add any new tasks to the sprint backlog during the sprint, and the scrum master must ensure this. If proposed new tasks are nevertheless more important than the ones in the sprint backlog, the sprint can be abnormally terminated by the scrum master. After the termination, a new sprint can be started with a sprint backlog containing the new tasks. (Schwaber & Beedle 2002; Schwaber 2004)
DAILY BUILD
As mentioned earlier, Scrum is used to manage and control product development, and therefore it imposes no strict rules on the development practices to be used. However, there is a need to know the status of the project on a daily basis, and therefore a daily build practice is needed. The daily build practice means that every day the developed source code is checked into the version control system, built, and tested. Integration problems can thus be noticed on a daily basis rather than at the end of the sprint. The daily build practice can be implemented with continuous integration. Because the daily build is the only development practice that has to be used in Scrum, the team is responsible for selecting the other development practices. This means that many practices from other agile methods can be used by the team. (Schwaber & Beedle 2002)
SCALING SCRUM
It was mentioned that the size of a scrum team is seven people. When Scrum is used in a larger project, the project members can be divided into multiple teams (Schwaber 2004; Larman 2006). When multiple teams are used, the cooperation between the teams can be handled with the scrum of scrums. The scrum of scrums is a daily scrum in which at least one member from every scrum team participates. This mechanism is used to remove obstacles that concern more than one team (Schwaber 2004). In a larger project it is also possible to divide the product owner's responsibilities. Cohn (2007) suggests using a group of product owners with one chief product owner. The product owners work in the teams while the chief product owner manages the whole. Larman (2006) calls product owners working with scrum teams feature champions.
3.4 Extreme Programming
Extreme Programming (XP) is a disciplined and still very agile software development method for small teams of two to twelve members. The purpose of XP is to minimize the risk and the cost of change in software development. XP is based on the experiences and successfully used practices of the father of the method, Kent Beck. Communication, simplicity, feedback, and courage are the values that XP is based on. Simplicity means code that is as simple as possible. No extra functionality is implemented beforehand, even if there might be a need for a more complex solution in the future. Communication means continuous communication between the customer and the developers, and also among the developers. Some of the XP practices also force communication. This enhances the spread of important information inside the project. Continuous testing and communication provide feedback on the state of the system and the development velocity. Courage is needed to make hard decisions such as changing the system heavily when seeking simplicity and better design. Another form of courage is deleting code that is not working at the end of the day. To concretize these values there are twelve development practices on which XP heavily counts. The practices are listed below:
• The Planning Game: Quickly determine the scope of the next release by combining business priorities and technical estimates. As reality overtakes the plan, update the plan.
• Small Releases: Put a simple system into production quickly, and then release new versions on a very short cycle.
• Metaphor: Guide all development with a simple shared story of how the whole system works.
• Simple Design: The system should be designed as simply as possible at any given moment. Extra complexity is removed as soon as it is discovered.
• Testing: Programmers continually write unit tests, which must run flawlessly for development to continue. Customers write tests demonstrating that features are finished.
• Refactoring: Programmers restructure the system without changing its behavior to remove duplication, improve communication, simplify, or add flexibility.
• Pair Programming: All production code is written by two programmers at one machine.
• Collective Ownership: Anyone can change any code anywhere in the system at any time.
• Continuous Integration: Integrate and build the system many times a day, every time a task is completed.
• 40-Hour Week: Work no more than 40 hours a week as a rule. Never work overtime a second week in a row.
• On-Site Customer: Include a real, live user on the team, available full-time to answer questions.
• Coding Standards: Programmers write all code in accordance with rules emphasizing communication through the code.
None of the practices is unique or original. However, the idea in XP is to use all the practices together. When the practices are used together, they complement each other (Figure 4). (Beck 2000)
Figure 4: The practices support each other (Beck 2000)
3.5 Scrum and Extreme Programming Together
It is possible to combine the agile management mechanisms of Scrum with the engineering practices of XP (Control Chaos 2006b). Figure 5 illustrates this approach. Mar and Schwaber (2002) have found that the two methods are complementary; when used together, they can have a significant impact on both the productivity of a team and the quality of its outputs.
Figure 5: XP@Scrum (Control Chaos 2006b)
3.6 Measuring Progress in Agile Projects
Ron Jeffries (2004) recommends using the Running Tested Features (RTF) metric for measuring a team's agility and productivity. He defines the RTF in the following way:
1. The desired software is broken down into named features (requirements, stories) which are part of the system to be delivered.
2. For each named feature, there are one or more automated acceptance tests which, when they pass, show that the feature in question is implemented.
3. The RTF metric shows, at every moment in the project, how many features are passing all their acceptance tests.
RTF is a simple metric, and it measures the most important aspect of software: the amount of working features. The RTF value should start to increase at the beginning of the project and keep increasing until the end of the project. If the curve is not rising, there must be problems in the project. Figure 6 shows what the RTF curve could look like when the project is doing well. (Jeffries 2004)
Figure 6: RTF curve for an agile project (Jeffries 2004)
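Jeffries's definition above can be sketched as a small computation: count the features whose automated acceptance tests all currently pass. The data structure below is an assumption for illustration; Jeffries defines only the metric, not an implementation.

```python
# Minimal sketch of the RTF metric (illustrative; feature names are made up).
# Map each named feature to the latest results of its acceptance tests.
test_results = {
    "login":      [True, True, True],   # all tests pass -> feature counts
    "search":     [True, False],        # one failing test -> does not count
    "export_pdf": [True, True],
}

def running_tested_features(results):
    """RTF = number of features currently passing all their acceptance tests."""
    return sum(1 for teststs in ()
               ) if False else sum(
        1 for tests in results.values() if tests and all(tests))

print(running_tested_features(test_results))  # -> 2
```

Plotted against time, this count gives the rising RTF curve that Figure 6 depicts.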
4 TESTING IN AGILE SOFTWARE DEVELOPMENT
Agile testing is guided by the Agile Manifesto presented in Figure 2. Marick (2001) sees working code and conversing people as the most important guides for agile testing. Communication between the project and the test engineers should not consist of written requirements and design specifications handed over the wall to the testing department, with specifications and defect reports communicated back. Instead, Marick (2001) emphasizes face-to-face conversations and informal discussions as the main channel for getting testing ideas and creating the test plan. Test engineers should work with developers and help test even unfinished features. Marick is one of the people agreeing with the principles of the context-driven testing school (Kaner et al. 2001a), and therefore the principles of agile testing and context-driven testing overlap.
4.1 Purpose of Testing
The purpose of agile testing is to build confidence in the developed software. In Extreme Programming the confidence is built on two test levels. The unit tests created with test-driven development increase the developers' confidence, and the customer's confidence is founded on the acceptance tests (Beck 2000). Unit tests verify that the code works correctly, and acceptance tests make sure the correct code has been implemented. In Scrum the integration and acceptance tests are not described (Abrahamsson et al. 2002), and therefore it is up to the team to define the testing-related issues. Itkonen et al. (2005) state that in agile testing the focus is on constructive quality assurance practices. This is the opposite of destructive quality assurance practices, such as the negative testing used in traditional testing. Itkonen et al. (2005) have doubts about the sufficiency of constructive quality assurance practices, but admit that more research in the area is needed.
4.2 Test Levels
In agile development the different testing activities overlap. This is mainly because the purpose is to deliver working software repeatedly. The levels of agile testing cannot be distinguished from the development phases in the same way as the traditional test levels can. The contents of the different levels also differ between agile and traditional testing. As mentioned in the previous chapter, in XP the confidence is built with the unit and acceptance tests. As was also mentioned, Scrum does not contain guidelines on how testing should be conducted. There are also other opinions in the agile community on how the testing could be divided. Therefore there is no coherent definition of the test levels in agile testing. However, the test levels in XP and some other categorizations are presented below.
UNIT TESTING
Unit testing, sometimes also called developer testing, can be seen as very similar to traditional unit testing. However, unit tests are usually created using test-driven development (TDD). As the name test-driven indicates, unit tests are written before the code (Beck 2003; Astels 2003). When TDD is used, it is obvious that a developer writes the unit tests. Even though TDD is used to create the unit tests, its purpose is not just testing. TDD is an approach to writing and designing maintainable code, and as a nice side effect, a suite of unit tests is produced (Astels 2003).
ACCEPTANCE TESTING IN XP
Acceptance testing in XP has a wider meaning than traditional acceptance testing. Acceptance tests can contain functional, system, end-to-end, performance, load, stress, security, and usability testing, among others (Crispin 2005). Acceptance tests are also called customer tests and functional tests in the XP literature, but in this thesis the term acceptance test is used.
The acceptance tests are written by the customer or by a tester with the customer's help (Beck 2000). In some projects, defining the acceptance tests has been a joint effort of the team (Crispin et al. 2002). The aim of acceptance testing is to show that the product works as the customer wants and to increase her confidence (Beck 2000; Jeffries 1999). The acceptance tests should contain only tests for features that the customer wants. Jeffries (1999) advises investing wisely and picking tests that are meaningful both when passing and when failing. Crispin et al. (2002) also mention that the purpose of the acceptance tests is not to go through all the paths in the system, because the unit tests take care of that. However, Crispin (2005) has noticed that teams doing TDD test only the "happy paths", especially when trying TDD for the first time. Misunderstood requirements and hard-to-find defects may then go undetected. Therefore the acceptance tests keep the teams on track.
The acceptance tests should always be automated, and the automated tests should be simple and created incrementally (Jeffries et al. 2001; Crispin & House 2005). However, in practice, automating all the tests is extremely hard and some trade-offs have to be made (Crispin et al. 2002). Kaner (2003) thinks that automating all acceptance tests is a serious error and that the amount of automated tests should be decided based on the context. Jeffries (2006) admits that automating all the tests is impossible but still phrases it "if we want to be excellent at automated testing, we should set out to automate all tests". When automating the tests, the entire development team should be responsible for the automation tasks (Crispin et al. 2002). The test-first approach can also be used with the acceptance tests. The acceptance test-driven development concept is introduced in Chapter 4.3.
OTHER TESTING PRACTICES IN XP
While unit and acceptance testing are the heart of XP, Beck (2000) admits that there are also other testing practices that make sense from time to time. He lists the parallel test, stress test, and monkey test as examples of these kinds of helpful testing approaches.
OTHER TEST LEVELS IN AGILE TESTING
There are also other test level divisions in the agile testing community in addition to the division in XP. Marick (2004) divides testing into four categories: technology-facing programmer support, business-facing team support, business-facing product critiques, and technology-facing product critiques. In Marick's division, unit testing can be seen as technology-facing programmer support and acceptance testing as business-facing team support. Business-facing product critiques means testing for forgotten, wrongly defined, or otherwise false requirements. Marick (2004) believes that different kinds of exploratory testing practices can be used in this phase. Technology-facing product critiques correspond to non-functional testing.
Hendrickson (2006) divides the agile testing practices into automated acceptance or story tests, automated unit tests, and manual exploratory testing (Figure 7). She thinks exploratory testing provides additional feedback and covers gaps in automation. She also states that exploratory testing is necessary to augment the automated tests. This division is quite similar to Marick's (2004) division from the functional testing point of view.
Figure 7: Agile testing practices (Hendrickson 2006)
4.3 Acceptance Test-Driven Development
The idea of acceptance test-driven development (ATDD) was first introduced by Beck (2003) under the name application test-driven development. However, he had some doubts about how well the acceptance tests can be written before the development. Even before this, acceptance test-driven development had been practiced, although under the name acceptance testing (Miller & Collins 2001). Since then, there have been projects using acceptance test-driven development (Andersson et al. 2003; Reppert 2004; Crispin 2005; Sauvé et al. 2006). The ATDD concept has also been called story test-driven development (Mugridge & Cunningham 2005; Reppert 2004) and customer test-driven development (Crispin 2005).
PROCESS
On a high level the acceptance test-driven development process contains three steps. The first step is to define the requirements for the coming iteration. In agile projects the requirements are usually written in the format of user stories. User stories are short descriptions representing the customer requirements, used for planning and as reminders (Cohn 2004). When the user stories are defined, the acceptance tests for those requirements can be written. As the name acceptance test indicates, the purpose of these tests is to define the acceptable functionality of the system. Therefore, the customer has to take part in defining the acceptance tests. The acceptance tests have to be written in a format the customer understands (Miller & Collins 2001; Mugridge & Cunningham 2005). When the tests have been defined, the development can be started. While the concept is quite simple on a high level, there are multiple possible approaches as to by whom, when, and to what extent the acceptance tests are written and automated.
WHO WRITES THE TESTS
As mentioned above, the customer or some other person with proper knowledge of the domain is needed when writing the tests (Reppert 2004; Crispin 2005). Usually the customer needs some help in writing the tests (Crispin 2005). Crispin (2005) describes a process where the test engineer writes the acceptance tests with the customer. On the other hand, it is also possible for the developers and the customer to define the tests (Andersson et al. 2003). It is also possible that the customer, the developers, and the test engineers write the tests in collaboration (Reppert 2004). As can be seen, there are several alternative ways of writing the acceptance tests, and the choice evidently depends on the available people and their skills.
WHEN TESTS ARE WRITTEN AND AUTOMATED
When ATDD is used, tests are written before the development. This can mean writing the test cases before the iteration planning or after it. Mugridge and Cunningham (2005) describe an example of how to use the acceptance tests to define the user stories on a more detailed level and this way ease the task estimation in the iteration planning session. Watt and Leigh-Fellows (2004) have also used acceptance tests to clarify the user stories before the planning sessions. On the other hand, Crispin (2005) and Sauvé et al. (2006) describe a process where the acceptance tests are developed after the stories have been selected for the iteration.
While working in one software development project, Crispin (2005) noticed that writing too many detailed test cases at the beginning can make it difficult for the developers to understand the big picture. Therefore, in that project the high-level test cases were written at the beginning of the iteration, and the more detailed low-level test cases were developed in parallel with the developers writing the code. This way the risk of having to rework a lot of test cases is lowered. A similar kind of approach has also been used by Andersson et al. (2003) and Miller and Collins (2001). However, Crispin (2005) states that this is not "pure" ATDD because not all the tests are written before the code.
HOW ACCEPTANCE TESTS ARE AUTOMATED
As mentioned in Chapter 4.2, the goal in agile testing is to automate as many tests as possible. The actual work varies depending on the tool used to automate the test cases. In general, there are two tasks. First, the test cases have to be written in a format that can be processed with the test automation framework. Second, in addition to these test cases, some code is needed to move the instructions from the test cases into the system under test. Often this code bypasses the graphical user interface and calls the business logic directly (Reppert 2004; Crispin 2005).
There are several open source tools for automating the test cases. The best known of these tools is FIT (Framework for Integrated Test) (Sauvé et al. 2006). When FIT is used, the test cases consist of steps presented in tabular format. Developers have to implement test code for every different kind of step. Sauvé et al. (2006) see this as the weakness of FIT. Other tools and approaches used to automate the acceptance test cases are not presented here.
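The tabular style described here can be illustrated with a small FIT-like sketch. This is not the real FIT API (FIT itself is Java-based); the example only mimics the core idea that each table row supplies inputs and an expected value, and a developer-written fixture maps the columns onto the system under test.

```python
# FIT-like sketch (illustrative, not the actual FIT framework).
class DivisionFixture:
    """Developer-written fixture: input columns 'numerator' and
    'denominator', computed column 'quotient'."""
    def __init__(self, numerator, denominator):
        self.numerator = numerator
        self.denominator = denominator

    def quotient(self):
        return self.numerator / self.denominator

# Tabular test data: input columns followed by the expected result.
table = [
    (10, 2, 5.0),
    (9, 3, 3.0),
]

def run_table(rows):
    """Run the fixture against every row; True means the row passed."""
    outcomes = []
    for numerator, denominator, expected in rows:
        fixture = DivisionFixture(numerator, denominator)
        outcomes.append(fixture.quotient() == expected)
    return outcomes

print(run_table(table))  # -> [True, True]
```

Sauvé et al.'s criticism is visible even here: every new kind of step (a new fixture) requires developer-written code.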
PROMISES AND CHALLENGES
Table 1 and Table 2 show the promises and challenges of acceptance test-driven development, collected from the different references mentioned in the previous chapters.
PROMISES
The risk of building incorrect software is decreased: The communication gap is reduced because the tests are an effective medium of communication between the customer and the developers (Sauvé et al. 2006). When the collaboration takes place just before the development, there is a clear context for having a conversation and removing misunderstandings (Reppert 2004). Crispin (2005) even thinks that the most important function of the tests is to force the customer, the developers, and the test engineers to communicate and create a common understanding before the development.

The development status is known at any point: When the acceptance tests created in collaboration pass, the feature is done. The readiness of the product can be evaluated based on the results of the suite of automated tests executed daily (Miller & Collins 2001). Knowing which features are ready also makes project tracking easier and better (Reppert 2004).

A clear quality agreement is created: The tests made in collaboration with the customer and the development team serve as a quality agreement between the customer and the development (Sauvé et al. 2006).

Requirements can be defined more cost-effectively: The requirements are described as executable artifacts that can be used to automatically test the software. Misunderstandings are less likely than with requirements defined in textual descriptions or diagrams. (Sauvé et al. 2006)

The requirements and tests are in synchronization: Requirement changes become test updates, and therefore the two are always in synchronization (Sauvé et al. 2006).

The quality of tests can be improved: The errors in the tests are corrected and approved by the customer, and therefore the quality of the tests is improved (Sauvé et al. 2006).

Confidence in the developed software is increased: Without tests the customers cannot have confidence in the software (Miller & Collins 2001). The customers gain confidence because they do not need to just hope that the developers have understood the requirements (Reppert 2004).

A clear goal is set for the developers: The developers have a clear goal in making the customer-defined acceptance tests pass, and this can prevent feature creep (Reppert 2004; Sauvé et al. 2006).

The test engineers are not seen as "bad guys": Because the developers and the test engineers have the same well-defined goal, the developers do not see the test engineers as "bad guys" (Reppert 2004).

Problems can be found earlier: The customer's domain knowledge helps to create meaningful tests. This helps to find problems already in an early phase of the project (Reppert 2004).

The design of the developed system is improved: Joshua Kerievsky has been amazed at how much simpler the code is when ATDD is used (Reppert 2004).

The correctness of refactoring can be verified: The acceptance tests do not rely on the internal design of the software, and therefore they can be used to reliably verify that refactoring has not broken anything (Andersson et al. 2003).

Table 1: Promises of ATDD
CHALLENGES
Automating the tests: Crispin (2005) has noticed that defining and automating tests can be a huge challenge even with light tools like FIT.

Writing the tests before development: It might be hard to find time for writing the tests in advance (Crispin 2005).

Finding the right level of test cases: Crispin (2005) has noticed that when many test cases are written beforehand, the test cases can cause more confusion than help in understanding the requirements. This causes a lot of rework because some of the test cases have to be refactored. Therefore the team Crispin (2005) worked with started with a few high-level test cases and added more test cases during the iteration.

Table 2: Challenges of ATDD
The promises and challenges are revisited at the end of the thesis when the observations are analyzed.
5 TEST AUTOMATION APPROACHES
The purpose of this chapter is to briefly describe the field of test automation and the evolution of test automation frameworks. In addition, the keyword-driven testing approach is explained on a more detailed level.
5.1 Test Automation
The term test automation usually means test execution automation. However, test automation is a much wider term, and it can also cover activities like test generation, reporting the test execution results, and test management (Bach 2003a). All these test automation activities can take place on all the different test levels described in Chapter 2.5. The extent of test automation can also vary. Small-scale test automation can mean tool-aided testing, such as using a small collection of testing tools to ease different kinds of testing tasks (Bach 2003a). On the other hand, large-scale test automation frameworks are used for setting up the environment, executing test cases, and reporting the results (Zallar 2001).
Automating the testing is not an easy task, and there are several issues that have to be taken into account. Fewster and Graham (1999) list the common test automation problems as unrealistic expectations, poor testing practice, an expectation that automated tests will find a lot of new defects, a false sense of security, maintenance, technical problems, and organizational issues. As can be noticed, the list is quite long, and therefore all these issues have to be taken into account when planning test automation usage. Laukkanen (2006) also lists some other test automation issues, such as when to automate, what to automate, what can be automated, and how much to automate.
5.2 Evolution of Test Automation Frameworks
Test automation frameworks have evolved over time (Laukkanen 2006). Kit (1999) divides the evolution into three generations. The first-generation test automation frameworks are unstructured; test cases are separate scripts that also contain the test data, and they are therefore almost non-maintainable. In the second-generation frameworks the test scripts are well designed, modular, and documented. This makes the second-generation frameworks maintainable. The third-generation frameworks are based on the second generation, with the difference that the test data is taken out of the scripts. This makes varying the test data easy, and similar test cases can be created quickly and without coding skills. This concept is called data-driven testing. The limitation of data-driven testing is that one script is needed for every logically different test case (Fewster & Graham 1999; Laukkanen 2006). This can easily increase the amount of needed scripts dramatically. Keyword-driven testing is a logical extension of data-driven testing (Fewster & Graham 1999), and it is described in the following chapter.
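The data-driven idea, a single script whose inputs and expected outputs are varied in external test data, can be sketched as follows. The addition example and the data format are illustrative assumptions, not taken from the cited sources.

```python
# Data-driven testing sketch: one script per *logical* test case,
# with the test data taken out of the script.
def add(a, b):
    """Trivial stand-in for the system under test."""
    return a + b

# Each row is (input1, input2, expected output); adding a row adds a test
# without any new code.
TEST_DATA = [
    (1, 2, 3),
    (0, 0, 0),
    (-1, 1, 0),
]

def run_addition_tests():
    """The single script, executed once per data row."""
    return [add(a, b) == expected for a, b, expected in TEST_DATA]

print(run_addition_tests())  # -> [True, True, True]
```

The limitation noted above is also visible: testing subtraction, a logically different case, would require writing another script.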
5.3 Keyword-Driven Testing
In keyword-driven testing, the keywords controlling the test execution are also taken out of the scripts into the test data (Fewster & Graham 1999; Laukkanen 2006). This makes it possible to create new test cases in the test data without creating a script for every different test case, allowing also test engineers without coding skills to add new test cases (Fewster & Graham 1999; Kaner et al. 2001b). This removes the biggest limitation of the data-driven testing approach. Figure 8 is an example of keyword-driven test data containing two simple test cases for testing a calculator application. The test cases consist of the keywords Input, Push, and Check, and of arguments which are the inputs and the expected outputs of the test cases. As can be seen, it is easy to add logically different test cases without implementing new keywords.
Figure 8: Keyword-driven test data file (Laukkanen 2006)
To be able to execute the tabular-format test cases shown in Figure 8, there has to be a mapping from the keywords to the code interacting with the system under test (SUT). The scripts or code implementing the keywords are called handlers by Laukkanen (2006). The handlers for the keywords used in the test data (Figure 8) can be seen in Figure 9. In addition to the handlers, test execution needs a driver script which parses the test data and calls the keyword handlers according to the parsed data.
Figure 9: Handlers for keywords in Figure 8 (Laukkanen 2006)
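The handler-plus-driver structure described above can be illustrated with a minimal Python sketch. The Calculator class, the handler function names and the in-memory test data layout below are illustrative assumptions for this example, not the actual handlers of Figure 9:

```python
# Minimal sketch of keyword handlers and a driver script for the
# calculator example. Calculator, the handler names and the data
# layout are illustrative assumptions, not Robot's actual code.

class Calculator:
    def __init__(self):
        self.display = ""

    def push(self, button):
        if button == "C":
            self.display = ""
        elif button == "=":
            # Toy evaluation, just for the sketch.
            self.display = str(eval(self.display))
        else:
            self.display += button

calc = Calculator()

# Handlers: one small function per keyword.
def input_handler(value):
    calc.push("C")
    for char in value:
        calc.push(char)

def push_handler(button):
    calc.push(button)

def check_handler(expected):
    assert calc.display == expected, f"{calc.display!r} != {expected!r}"

HANDLERS = {"Input": input_handler, "Push": push_handler, "Check": check_handler}

def run_test(rows):
    """Driver: walk the parsed (keyword, argument) rows and call handlers."""
    for keyword, arg in rows:
        HANDLERS[keyword](arg)

# A test case corresponding to the tabular data: 1 + 2 should equal 3.
run_test([("Input", "1"), ("Push", "+"), ("Push", "2"), ("Push", "="), ("Check", "3")])
```

A real driver would parse the rows from the test data file instead of a hard-coded list, but the division of labor is the same: the driver knows nothing about the SUT, and the handlers know nothing about the test data format.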
If there is a need for creating both high level and low level test cases, keywords of different levels are needed; simple keywords like Input are not enough for high level test cases. According to Laukkanen (2006), there are both simple and more flexible solutions. Higher level keywords can be created inside the framework by combining lower level keywords. The limitation of this approach is that coding skills are needed whenever a new higher level keyword is created. A more flexible solution, proposed by Buwalda et al. (2002), Laukkanen (2006) and Nagle (2007), is to include in the keyword-driven test automation framework a possibility to combine existing keywords. This makes it possible to create higher level keywords by combining existing keywords inside the test data. Laukkanen (2006) calls these combined keywords user keywords, and this term is also used in this thesis.
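The idea of user keywords can be sketched as follows: a user keyword is just a named sequence of lower-level keyword calls that the framework resolves recursively until only base keywords remain. The keyword names and the data layout are illustrative assumptions:

```python
# Sketch of user keyword resolution: base keywords are code, user
# keywords are sequences of steps defined in the test data. All
# names and structures here are illustrative assumptions.

log = []

# Base keywords are implemented in the libraries (here: log a message).
base_keywords = {
    "Input": lambda value: log.append(f"Input {value}"),
    "Push": lambda button: log.append(f"Push {button}"),
    "Check": lambda expected: log.append(f"Check {expected}"),
}

# User keywords are defined in the test data as sequences of steps;
# a step may refer to a base keyword or to another user keyword.
user_keywords = {
    "Add": [("Input", "1"), ("Push", "+"), ("Push", "2"), ("Push", "=")],
    "Add And Check": [("Add",), ("Check", "3")],
}

def execute(keyword, *args):
    """Run a base keyword directly, or expand a user keyword recursively."""
    if keyword in base_keywords:
        base_keywords[keyword](*args)
    else:
        for step in user_keywords[keyword]:
            execute(*step)

execute("Add And Check")
```

Since only the `user_keywords` table has to be edited to add a higher level keyword, no coding skills are needed, which is exactly the flexibility the combined-keyword approach provides.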
6 KEYWORD-DRIVEN TEST AUTOMATION FRAMEWORK
The keyword-driven test automation framework used in this research was developed inside the company where the study took place and was called Robot. The ideas and basic concept of Robot were based on the master's thesis of Laukkanen (2006). In the following chapters, the functionalities of Robot that are interesting from this thesis's point of view are briefly explained.
6.1 Keyword-Driven Test Automation Framework
A keyword-driven test automation framework has three logical parts: the test data, the test automation framework and the test libraries. The test data contains directives telling what to do, with associated inputs and expected outputs. The test automation framework contains the functionality to read the test data, run the handlers in the libraries based on the directives in the test data, and handle errors during test execution. The framework also contains test logging and test reporting functionality. The test libraries are the interface between the framework and the system under test; they can use existing test tools to access the interfaces of the system under test, or connect directly to those interfaces. Figure 10 presents the logical structure of Robot.
Figure 10: Logical structure of Robot
6.2 Test Data
In Robot, the test data is in tabular format and can be stored in HTML or TSV files. The test data is divided into four categories: test cases, keywords, variables and settings. Each of these test data types is defined in its own table in the test data file. Robot recognizes the different tables from the name of the data type in the table's first header cell.
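Recognizing table types from the first header cell can be sketched with a small TSV parser. The exact header names and the TSV layout below are simplified assumptions, not Robot's actual parsing rules:

```python
# Sketch of recognizing test data tables from the first header cell
# of TSV data. The header names and layout are simplified assumptions.
import csv
import io

TSV_DATA = """\
Setting\tValue
Library\tOperatingSystem

Variable\tValue
${GREETING}\tHello

Test Case\tAction\tArgument
My Test\tLog\t${GREETING}
"""

def split_tables(text):
    """Group consecutive non-empty rows into tables, keyed by the
    data type named in each table's first header cell."""
    tables, current = {}, None
    for row in csv.reader(io.StringIO(text), delimiter="\t"):
        if not any(cell.strip() for cell in row):
            current = None  # a blank line ends the current table
        elif current is None:
            header = row[0].strip().lower()
            for name in ("setting", "variable", "test case", "keyword"):
                if header.startswith(name):
                    current = tables.setdefault(name, [])
                    break
        else:
            current.append(row)
    return tables

tables = split_tables(TSV_DATA)
print(sorted(tables))  # → ['setting', 'test case', 'variable']
```

The same dispatch-on-header idea extends to HTML files, where the header cell is the first cell of each `<table>` instead of the first row after a blank line.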
KEYWORDS AND TEST CASES
In Robot, keywords can be divided into base keywords and user keywords. Base keywords are implemented in the libraries. User keywords are defined in the test data by combining base keywords or other user keywords. The ability to create new user keywords in the test data decreases the number of needed base keywords and therefore the amount of programming. User keywords also make it possible to raise the abstraction level of the test cases. In Figure 11, the test cases shown in Figure 8 are modified to use the user keywords Add, Equals and Multiply. The test cases are composed of keywords defined in the second column of the test case table and arguments defined in the following columns. User keywords are defined in a similar way. In the test case and keyword tables the second column is named action. This column name can be defined by the user, as it is not used by Robot; the same applies to the rest of the headers.
Figure 11: Test cases and user keywords (Laukkanen 2006)
VARIABLES AND SETTINGS
It is possible to define variables in the Robot framework. Variables increase the maintainability of the test data, because some changes require only updating the variable values. Variables can also contain test environment specific data, such as hostnames; in these cases they make it easier to use the same test cases in different environments with minimal extra effort. There are two types of variables in Robot. A scalar variable contains one value, which can be anything from a simple string to an object. A list variable contains multiple items. Figure 12 contains a scalar variable ${GREETING} and a list variable @{ITEMS}.
Figure 12: Variable table containing scalar and list variables
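The difference between scalar and list variables shows up when arguments are resolved: a scalar replaces a single argument, while a list variable expands into several. The following sketch uses simplified resolution rules that are assumptions for illustration, not Robot's exact semantics:

```python
# Sketch of scalar (${...}) and list (@{...}) variable substitution.
# The resolution rules here are simplified assumptions.

variables = {
    "${GREETING}": "Hello world",
    "@{ITEMS}": ["one", "two", "three"],
}

def resolve(args):
    """Replace variables in a keyword's argument list. A list variable
    expands into several arguments; a scalar replaces one argument."""
    resolved = []
    for arg in args:
        if arg in variables and arg.startswith("@{"):
            resolved.extend(variables[arg])
        elif arg in variables:
            resolved.append(variables[arg])
        else:
            resolved.append(arg)
    return resolved

print(resolve(["Log", "${GREETING}"]))    # → ['Log', 'Hello world']
print(resolve(["Log Many", "@{ITEMS}"]))  # → ['Log Many', 'one', 'two', 'three']
```

Because resolution happens in the framework, the same test case can run against different environments simply by loading a different variable table.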
The settings table is similar to the variable table: the name of a setting is defined in the first column and its value or values in the following columns. The settings are predefined in Robot. Examples of settings are Library and Resource. The Library setting is used to import a library containing the needed base keywords. The Resource setting is used to import resource files, which are used to define user keywords and variables in one place.
GROUPING TEST CASES
There are two ways of grouping test cases in Robot. First, test cases are grouped hierarchically. A file containing test cases (e.g. the one in Figure 11) is called a test case file, and it forms a test suite. A directory containing one or more test case files, or directories with test case files, also creates a test suite. In other words, the hierarchical grouping is the same as the structure of the test data in the file system.
The other way to group test cases is based on project-specific agreements. In Robot, test cases can be given words that are used for grouping them. These words are called tags. Tags can be used to indicate, for example, which part of the system a test case tests, who has created it, whether it belongs to the regression tests, and whether it takes a long time to execute.
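Tag-based grouping becomes useful when selecting which test cases to run. The sketch below illustrates include/exclude selection over tagged test cases; the data layout and selection rules are illustrative assumptions:

```python
# Sketch of tag-based test case selection, as used when including or
# excluding test cases from a run. Data layout is an assumption.

test_cases = [
    {"name": "Add Registration", "tags": {"regression", "gui"}},
    {"name": "Delete Registration", "tags": {"gui", "slow"}},
    {"name": "Count Registrations", "tags": {"regression"}},
]

def select(cases, include=None, exclude=None):
    """Pick test cases whose tags match the include set and avoid
    the exclude set; None means no filtering on that side."""
    chosen = []
    for case in cases:
        if include and not case["tags"] & set(include):
            continue  # no included tag present: skip
        if exclude and case["tags"] & set(exclude):
            continue  # an excluded tag present: skip
        chosen.append(case["name"])
    return chosen

print(select(test_cases, include=["regression"]))
# → ['Add Registration', 'Count Registrations']
print(select(test_cases, exclude=["slow"]))
# → ['Add Registration', 'Count Registrations']
```

The same selection mechanism supports the other uses of tags mentioned above, such as running only the slow test cases at night.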
6.3 Test Execution
In Robot, the test execution is started from the command line. The scope of the test execution is defined by giving test suite directories or test case files as inputs. Without further options, all the test cases in the given test suites are executed; a single test suite or test case can be executed with command line options. It is also possible to include or exclude test cases from the test run based on the tags (see the previous chapter). Command line execution makes it possible to start the test execution at a predefined time. It also enables starting the test execution from continuous integration systems like Cruise Control (Cruise Control 2006).
The test execution result can be pass or fail. By default, if even a single test case fails, the test execution result is a failure. To allow the test execution to succeed even with failing test cases, Robot contains a feature called critical tests: the test execution result is a failure only if a critical test case fails, so the execution is considered successful even if non-critical test cases fail. The critical test cases are defined when starting the execution from the command line. For example, regression can be defined as a critical tag, and then all the test cases that have the tag regression are handled as critical tests. This functionality makes it possible to keep a failing test case in the test execution without the overall result becoming a failure, which is needed when the test case or the feature it tests is not yet ready; such test cases are simply not marked as critical.
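The critical tests rule can be stated compactly in code: the run fails only if a test case carrying a critical tag fails. The data structures below are illustrative assumptions:

```python
# Sketch of the critical tests rule: the run fails only when a test
# case carrying a critical tag fails. Data layout is an assumption.

results = [
    {"name": "Add Registration", "tags": {"regression"}, "passed": True},
    {"name": "Delete Registration", "tags": {"regression"}, "passed": True},
    {"name": "New Unfinished Feature", "tags": set(), "passed": False},
]

def execution_result(results, critical_tags):
    """Return 'PASS' unless some test case with a critical tag failed."""
    for case in results:
        if case["tags"] & critical_tags and not case["passed"]:
            return "FAIL"
    return "PASS"

# The failing test case is not tagged as critical, so the run passes.
print(execution_result(results, critical_tags={"regression"}))  # → PASS
```

This is why a test case for an unfinished feature can be kept in the run as an early warning without breaking the build.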
6.4 Test Reporting
Robot produces a report, a log and an output from the test execution. The report contains statistics and information based on the executed test suites and tags. It can be used as an information radiator, since its background color shows whether the test execution status was pass or fail. The test log contains more detailed information about the executed keywords as well as information that can be used to solve problems. The output contains the test execution results in XML format; the report and the log are generated from the output.
7 EXAMPLE OF ACCEPTANCE TEST-DRIVEN DEVELOPMENT WITH KEYWORD-DRIVEN TEST AUTOMATION FRAMEWORK
This chapter presents a simple fictitious example of acceptance test-driven development with the Robot framework. Its purpose is to help understand the concept before it is shown in practice; it is a simple theoretical example of how the concept could work. First, however, the relation between user stories, test cases and keywords is briefly explained.
7.1 Test Data between User Stories and System under Test
As was described in Chapter 4.3, user stories are short descriptions representing the customer requirements used for planning. Different levels of test data are needed to map the user stories to the actual code interacting with the system under test. These levels and their interdependence are shown in Figure 13. First, a user story is mapped to one or more test cases. Every test case contains one or more sentence format keywords. A sentence format keyword is a user keyword written in plain text, possibly containing some input or expected output values but taking no arguments. When the test cases contain only sentence format keywords, they can be understood without technical skills. Every sentence format keyword consists of one or more base or user keywords, and a user keyword likewise includes one or more base or user keywords. Finally, the base keywords contain the code which controls the system under test. The examples in the following chapters clarify the use of the different types of keywords presented above.
Figure 13: Mapping from user story to the system under test
7.2 User Stories
The customer in this example is a person who handles registrations to different kinds of events. People usually enroll in the events by email or by phone, and therefore the customer needs an application in which to save the registrations. The customer has requested a desktop application with a graphical user interface and has defined the following user stories:
1. As a registration handler I want to add registrations and see all the registrations so that I can keep count of the registrations and later contact the registered people.
2. As a registration handler I want to delete one or multiple registrations so that I can remove the canceled registration(s).
3. As a registration handler I want to have the count of the registrations so that I can notice when there is no longer room for new registrations.
4. As a registration handler I want to save registrations persistently so that I do not lose the registrations even if my computer crashes.
7.3 Defining Acceptance Tests
Before the stories can be implemented, there is a need to discuss and clarify the hidden assumptions behind them. Details arising from this collaboration can be captured as acceptance tests. As was mentioned in Chapter 4.3, it can vary when this collaboration takes place and who participates in it. Because those issues are more a matter of the process and the people available than of the tool used, they are not taken into account in this example.
The discussion about the user stories between the customer and the development team can lead to the acceptance tests shown in Figure 14. The test cases are in a format that can be used as input for Robot. Test cases can be written directly in this format using empty templates. However, it might be easier to discuss the user stories and write drafts of the test cases on a flip chart during the conversation. After the sketches of the test cases have been made, they can easily be converted to digital format.
Figure 14: Some acceptance test cases for the registration application
While discussing the details of the user stories and the test cases, an outline of the user interface can be drawn. The outline in Figure 15 could be the result of the session where the test cases were created, and it can be used as a starting point for the implementation. Names for the user interface elements are also defined in the picture. These are implementation details that have to be agreed upon if different persons implement the test cases and the application.
Figure 15: Sketch of the registration application
7.4 Implementing Acceptance Tests and Application
After the acceptance tests have been defined, it should be clear to all stakeholders what is going to be implemented. If pure acceptance test-driven development is used, the test cases are implemented on a detailed level before the implementation of the application can be started. In this example the implementation of the test case User Can Add Registrations is described in detail.
CREATING THE TEST CASE "USER CAN ADD REGISTRATIONS"
The User Can Add Registrations test case contains three sentence format keywords, as can be seen in Figure 16. The creation of the test case starts with defining those sentence format keywords. To keep the actual test case file as simple as possible, the sentence format keywords are defined in a separate resource file. The keywords defined in the resource file have to be taken into use by importing the resource file in the settings table. Because the test case starts with a sentence format keyword which launches the application, the application has to be closed at the end of the test case. This can be done in the test case itself or with a Test post condition setting. These two settings are shown in Figure 17.
Figure 16: Test case "User Can Add Registrations"
Figure 17: Settings for all test cases
Figure 18 shows the variables and user keywords defined in the atdd_keyword.html resource file. The list variables @{person1}, @{person2} and @{person3} are defined in the variable table. The comments Name and Email are used to clarify the meaning of the different columns. These variables are used in the sentence format keywords created in the keyword table. The keyword Application is started and there are no registrations in the database contains two user keywords. The first, Clear database, makes sure there are no registrations in the database when the application is started. The second, User launches registration application, launches the registration application. The next two user keywords, User adds three people and all three people should be shown in the application and should exist in the database, repeat the same user keyword with the different person variables defined in the variable table. These user keywords do not use base keywords from the libraries, and therefore the test case does not access the system under test at this level. The user keywords used to create the sentence format keywords can be defined in the same resource file or in other resource files. Here the missing user keywords are defined in the resource file resource.html.
Figure 18: Variables and user keywords for test case "User Can Add Registrations"
Figure 19 shows the user keywords defined in the atdd_resource.html resource file. The base keywords needed by these user keywords are imported from the SwingLibrary and OperatingSystem test libraries in the settings table. The SwingLibrary contains base keywords for handling the graphical user interface of applications made with Java Swing technology. The OperatingSystem library is a part of Robot, and it contains base keywords for, for example, handling files (such as Get file) and environment variables, and for running system commands. If there are no existing libraries for the technologies the system under test is implemented with, or some needed base keywords are missing from an existing library, the missing keywords must naturally be implemented.
Figure 19: User keywords using the base keywords
User launches registration application consists of the Launch base keyword with two arguments: the main method of the application and the title of the application to be opened. Both of these arguments have been defined as scalar variables in the variable table. User Closes Registration Application uses the Close base keyword, which simply closes the launched application. Clear Database consists of the base keyword Remove file, which removes the database file from the file system. The ${DATABASE} variable contains the path to the database.txt file, which the registration application uses as its database. The ${CURDIR} and ${/} variables are Robot's built-in variables: ${CURDIR} is the directory where the resource file is located, and ${/} is a path separator character which is resolved based on the operating system.
The User adds registration keyword takes two arguments, ${name} and ${email}, and consists of the Clear text field, Insert into text field and Push button base keywords. All these keywords take the identifier of the user interface element as their first argument; these identifiers were agreed upon in the discussion and can be seen in Figure 15. The ${name} and ${email} arguments are entered into the corresponding text fields with the Insert into text field keyword. In the Registration should be shown in the application and should exist in the database user keyword, the List value should exist base keyword is used to check that the name and email are in the list shown in the application. The Get file base keyword is used to read the data from the database into the ${data} variable, and the Contains base keyword is used to check that the database contains the name and email pair.
EXECUTING THE TESTS
The team has agreed that all test cases that should pass will be tagged with a regression tag. When the first version of the application is available, the created test cases can be executed. At this stage none of the test cases are tagged with the regression tag. The result of this first test execution can be seen in Figure 20. Four of the eleven acceptance test cases passed, and these passing test cases can now be tagged as regression tests. Figure 21 shows one of the passing test cases tagged with the tag regression. When the test cases are executed the next time, there will be four critical test cases. If any of them fails, the test execution result will be a failure and the report will turn red.
Figure 20: First test execution
Figure 21: Acceptance test case tagged with tag regression
When the application is updated the next time, the test cases are executed again, and all newly passing test cases can be tagged with the regression tag. At some point all the test cases pass, the features are ready, and the following items can be taken under development: new acceptance test cases are defined, and the development starts again. If old functionality is changed, the corresponding test cases have to be updated and their regression tags removed.
8 ELABORATED GOALS OF THE THESIS
In this chapter the aim of this thesis is described on a more detailed level. First the scope is defined, and then the actual research questions are presented.
8.1 Scope
As was seen in the previous chapters, the field of software testing is very wide. The focus of this thesis is on acceptance test-driven development. It is important to distinguish between the traditional acceptance test level and the agile acceptance test level; in the context of this thesis, the term acceptance testing refers to the latter. Other testing areas, namely non-functional testing, static testing, unit testing and integration testing, are excluded from the scope of this master's thesis. Manual acceptance testing as such is also out of the scope, although it may be mentioned in some cases.
The different aspects and generations of test automation were explained in Chapter 6. This thesis concentrates on the large-scale keyword-driven test automation framework called Robot. The following aspects of test automation are included in the scope of this thesis: creating the automated acceptance test cases, executing them, and reporting the test execution results.
8.2 Research Questions
The main aim of this thesis is to study how the keyword-driven test automation technique can be used in acceptance test-driven development. The study was done in a real-life software development project, and therefore another aim is to give an example of how a keyword-driven test automation framework was used in this specific case and to describe the observed benefits and drawbacks. The research questions can be stated as:
1. Can the keyword-driven test automation framework be used in acceptance test-driven development?
2. How is the keyword-driven test automation framework used in acceptance test-driven development in the project under study?
3. Does acceptance test-driven development with the keyword-driven test automation framework provide any benefits? What are the challenges and drawbacks?
The first question can be divided into the following more detailed questions:
1. Is it possible to write the acceptance tests before the implementation with the keyword-driven test automation framework?
2. Is it possible to write the acceptance tests in a format that can be understood without technical competence with the keyword-driven test automation framework?
The second question can be divided into the following parts:
1. How, when and by whom are the acceptance test cases planned?
2. How, when and by whom are the acceptance test cases implemented?
3. How, when and by whom are the acceptance test cases executed?
4. How and by whom are the acceptance test results reported?
The third research question can be evaluated against the promises and challenges of acceptance test-driven development shown in Table 1 and Table 2 in Chapter 4.3.
9 RESEARCH SUBJECT AND METHOD
The purpose of this chapter is to explain where and how this research was done. First, the case project and the product developed in it are described at the level needed to understand the context in which the research took place. Then the research method and the data collection methods used are described.
9.1 Case Project
This research was conducted in a software project at Nokia Siemens Networks, referred to as the Project from now on. The Project was located in Espoo and consisted of two scrum teams, each of approximately ten persons. In addition to the teams, the Project had a product owner, a project manager, a software architect and half a dozen specialists working as feature owners. Feature owner meant the same as feature champion (see Chapter 3.3). There were also several supporting functions, such as a test laboratory team. Several nationalities were represented in the Project.
The software product developed in the Project was a network optimization tool, referred to as the Product from now on. The Product and its predecessors had been developed for almost five years. The Product is bespoke software aimed at mobile network operators. The Project started in June 2006, and its planned end was December 2007. The Product was a desktop application used through a graphical user interface developed with Java Swing technology.
9.2 Research Method
The Project under study had been decided on before the actual research method was chosen. When the role of the researcher became clear, there were two qualitative approaches to select from: case study and action research. It was clear from the beginning that the researcher would be highly involved with the Project under research. This high involvement prevented choosing case study as the research method, and action research was more suitable. Unlike other research methods, where the researcher seeks to study organizational phenomena but not to change them, the action researcher is concerned with creating organizational change and simultaneously studying the process (Babüroglu & Ravn 1992). This describes the situation in this research well: the researcher participated in the work, gave training, and helped to define the actions that would change the existing process.
When the research method was chosen, it was also kept in mind that one purpose of the research was to try out acceptance test-driven development in practice, so a method enabling a practical approach to the problem was needed. Avison et al. (1999) define that action research combines theory and practice (and researchers and practitioners) through change and reflection in an immediate problematic situation within a mutually acceptable ethical framework. This was another reason why action research was chosen as the method for this research.
According to Avison et al. (1999), action research is an iterative process involving researchers and practitioners acting together on a particular cycle of activities, including problem diagnosis, action intervention and reflective learning. The iterative process of action research suited the iterative process of Scrum well. The research iteration length was chosen to be the same as the length of the Scrum iterations. Figure 22 shows how these two processes were synchronized. With this arrangement the research cycle was quite short, but this also helped to concentrate on small steps in changing the process and to prioritize the most important steps.
Figure 22: Action research activities and the Scrum process
A management decision to increase the amount of automated testing had been made before the research project started, and this decision was also the trigger for starting this research. Stringer (1996) mentions that programs and projects begun on the basis of the decisions and definitions of authority figures have a high probability of failure. This was taken into account at the beginning of the research and led to a different starting phase than the one defined by Stringer (1996), where the problems are defined first and the scope and actions are based on that problem definition. Because the goal was already defined, the research started with collecting data about the environment and implementing the new acceptance test-driven development process. Otherwise, the action research method defined by Stringer (1996) was followed.
9.3 Data Collection
The data collection had two purposes. The first was to collect data about the problems and benefits that individual project members encountered and noticed during the Project. The other was to record the agreed implementation of acceptance test-driven development and to observe how this agreement was actually carried out. The latter was even more important, as Avison et al. (1999) mention that in action research the emphasis is more on what practitioners do than on what they say they do.
The data was collected through observations, informal conversations and semi-formal interviews, and by collecting meaningful emails and documents. The data was collected during a four-month period from the beginning of January 2007 to the end of April 2007. The researcher worked in the Project as a test automation engineer, and the observations and informal conversations took place during this work. One continuous method of collecting relevant issues was recording the issues raised in the daily scrum meetings.
The initial information collection was based mainly on informal discussions, but a few informal interviews were also used. Its main purpose was to build an overall understanding of the Project and a deep understanding of the testing in it. This was done by asking questions about the software processes used, the software development and testing practices, and the problems encountered with them. Some interviews also contained questions about the Project's history.
The final interviews were semi-formal, meaning that the main questions were predefined but questions arising from the discussion were also asked. Nine persons were interviewed: two developers, two test engineers, two feature owner/usability specialists, one feature owner, one scrum master and one specification engineer. All of them had participated more or less in developing features with ATDD. The final interviews at the end of the research focused more on the influence of acceptance test-driven development on different aspects of software development. Appendix B contains the questions asked in the final interviews. The questions were asked in the order presented in the appendix, and the objective was to lead the respondents' answers as little as possible. Clarifying questions were asked to get the reasoning behind the answers. The interviews were both written down as notes and tape-recorded.
10 ACCEPTANCE TEST-DRIVEN DEVELOPMENT WITH KEYWORD-DRIVEN TEST AUTOMATION FRAMEWORK IN THE PROJECT UNDER STUDY
This chapter describes what was done in the Project where acceptance test-driven development was tried out. The emphasis is on issues that are relevant from the acceptance test-driven development point of view. First, the development model and practices used in the Project are described; the case project itself was described in Chapter 9.1. Then it is illustrated how the keyword-driven test automation framework was used in the Project, with emphasis on the four areas mentioned in the second research question in Chapter 8.2. At the end of the chapter, the results of the final interviews are presented.
10.1 Development Model and Development Practices Used in the Project
The development process used in the Project was Scrum. Scrum was introduced and taken into use at the beginning of the project, which meant that adjustment to the process was still ongoing at the time of the research. There were also some differences to Scrum as presented in Chapter 3.3. The biggest difference was the format of the product backlog. The main requirement types in the Project were the requirements defined in the requirement specifications and workflows. A workflow contained all the steps that a user could take with the functionality; it was a high-level use case containing multiple related steps. These steps were divided into mandatory and optional steps, and every step in a workflow could be seen as a substitute for an item in the product backlog.
As was mentioned in Chapter 3.3, Scrum does not define development practices other than the daily build. In the Project, continuous integration was used. There were no rules defining which development practices should be used during the Project. Extreme programming practices such as refactoring were used from time to time by the development team. The developers created unit tests, and there were targets for unit testing coverage. However, the unit tests were not written using test-driven development. The main details of the features were written down in feature descriptions, which were short verbal descriptions of a feature. During the research project, the division of testing into automated acceptance tests, automated unit tests and manual exploratory testing was taken into use (see Chapter 4.2).
Test automation with Robot was started in September 2006. At the beginning of the Project, automated test cases were created for the already existing functionality. This automation task was done by a separate test automation team. By the time the research started, automated test cases covered most of the basic functionality. This meant that the library used to access the graphical user interface of the Product had already been under development for some time, and it included base keywords for most of the Java Swing components. At this stage there was a desire to create the automated test cases for new features during the same sprint in which the features were implemented. To make this possible, acceptance test-driven development was taken into use.
10.2 January Sprint
In the first research sprint the goal was to start acceptance test-driven development with a few new features. At first it was problematic to find features to be developed with acceptance test-driven development, as part of the implementation work was a follow-on to the implementation of the previous sprints, and such features were seen as problematic starting points. Some of the new features needed internal models and, while being developed, could not be tested through the user interface. Finally, one new feature, map layer handling, was chosen as the starting point. Map layer handling is used to load backgrounds into the map view of the Product; network elements and information about the network are shown on the map view.
As mentioned, there was a separate team for test automation when the research started. To be able to work better within the scrum teams, the test automation team members started working as members of the scrum teams. This was done at the beginning of the sprint.
PLANNING
The test planning meeting for the map layer handling feature was arranged by a test engineer. It took place in the middle of the sprint, before the developer started implementing the feature. The participants of the meeting were a usability expert/feature owner, a developer, a test engineer and a test automation engineer.
The meeting started with a general discussion about the feature to be implemented, and the developer drew a sketch of the user interface he had in mind. After the initial sketch, the group started to think about the test cases: how the user could use the feature, and which kinds of error situations should be handled. The sketch was updated based on the needs that were noticed. The test engineers wrote down test cases whenever they were agreed on. During the discussions, some important decisions were made about supported file formats and supported graphic types. At the end of the meeting, the agreed test cases were gone through to make sure that all of them had been written down. At this phase the test cases were not yet written in any formal format.
IMPLEMENTATION
The test case implementation started by writing the test cases agreed in the planning meeting into the tabular format. At the same time, the developer started the development. Figure 23 contains some of the initial test cases. The highest level of abstraction was not used in these test cases, and therefore they consist of lower-level user keywords with short names and variables. These test cases resemble the test cases the test automation team had implemented earlier more than the test cases shown in the example in Chapter 7.2 and Figure 13.
Figure 23: Some of the initial acceptance test cases for map layer handling
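As a concrete illustration of how such tabular keyword-driven test data is executed, the following Python sketch shows a minimal runner: each test case is a list of rows, and each row is a keyword name followed by its arguments. The keyword and argument names are invented for illustration and are not the Project’s actual keywords.

```python
# Minimal sketch of a keyword-driven runner (not the Project's actual code).

def add_map_layer(name):        # invented example keyword
    return f"added layer {name}"

def remove_map_layer(name):     # invented example keyword
    return f"removed layer {name}"

# The keyword table maps keyword names in the test data to implementations.
KEYWORDS = {
    "Add Map Layer": add_map_layer,
    "Remove Map Layer": remove_map_layer,
}

def run_test_case(rows):
    """Dispatch each row to the matching keyword and collect the results."""
    return [KEYWORDS[name](*args) for name, *args in rows]

test_case = [
    ("Add Map Layer", "roads.png"),
    ("Remove Map Layer", "roads.png"),
]
print(run_test_case(test_case))  # ['added layer roads.png', 'removed layer roads.png']
```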
The implementation of the user keywords started after the test cases were written. There was a need to implement multiple base keywords even though the library had already been under development for some months. Fortunately, the test automation engineer had time to create these keywords. At this stage, the identifiers needed to select the correct widgets from the user interface were replaced with variables. The variable values were set once the developer had written the identifiers into the code and emailed them to the test engineer.
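The identifier-to-variable arrangement described above can be sketched as follows: the test data refers to widgets through ${...} variables whose values are filled in only after the developer has supplied the identifiers. All variable and widget names here are hypothetical.

```python
# Sketch of widget identifiers deferred behind variables (names invented).

WIDGET_VARIABLES = {
    "${LAYER_DIALOG}": None,  # unknown until the developer supplies the identifier
    "${COLOR_BUTTON}": None,
}

def set_variable(var, identifier):
    """Fill in a variable value once the developer has emailed the identifier."""
    WIDGET_VARIABLES[var] = identifier

def resolve(arg):
    """Replace a ${...} variable with the widget identifier it names."""
    value = WIDGET_VARIABLES.get(arg, arg)
    if value is None:
        raise ValueError(f"identifier for {arg} not yet provided")
    return value

# The value arrives once the developer has written it into the code.
set_variable("${LAYER_DIALOG}", "mapLayerDialog")
print(resolve("${LAYER_DIALOG}"))  # mapLayerDialog
```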
From the beginning it was clear that verifying that the map layers are drawn and shown correctly on the map would be hard to automate. It was not seen as sensible to create a library for verifying the correctness of the map, and therefore a substitutive solution was created: taking screenshots and combining them with instructions defining what should be checked from the picture. This led to manual verification, but doing that from time to time was not seen as a problem.
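A minimal sketch of this substitutive solution is given below, with the actual screen capture stubbed out (in the Project it would have come from the GUI test library); the names and instructions are invented for illustration.

```python
# Sketch of pairing screenshots with manual verification instructions.

manual_checks = []

def capture_screenshot(name):
    """Stand-in for a real screen capture from the GUI test library."""
    return f"{name}.png"

def screenshot_with_instructions(name, instructions):
    """Store a screenshot together with what a human should check from it."""
    path = capture_screenshot(name)
    manual_checks.append({"image": path, "check": instructions})
    return path

screenshot_with_instructions(
    "map_layers",
    "Verify that the road layer is drawn on top of the terrain layer")
print(manual_checks)
```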
One of the tested features was changing the colors of the map layers. A base keyword for changing the color was created, but when it was tried out, it did not work. After the developer and the test automation engineer investigated the problem, the base keyword implementation was found to be incorrect. However, the changes made to the base keyword did not correct the problem, and one more problem was noticed in the application. These were technical test automation problems. The investigations took some time, and the color changing functionality could not be tested by automation in this sprint. Also, some parts of the feature were not fully implemented, and they were moved to the next sprint.
TEST EXECUTION
The test cases were executed on the test engineer’s and the test automation engineer’s workstations during the test case implementation phase. There were problems in getting a working build during the sprint, which slowed down checking that the test cases, and especially the implemented base keywords, were working. During this phase, the problems in the test cases were corrected and defects were reported to the developer.
REPORTING
The project had one dedicated workstation for executing the automated acceptance tests after the continuous integration system had successfully built the application. A web page showing a report of the latest acceptance test run was visible on a monitor situated in the project area. The test cases created during the sprint were added to an automatic regression test set at the end of the sprint. Tests that were passing at the end of the sprint were marked as regression tests with Robot’s tagging functionality (see Chapter 6.2). The ability to define the critical test cases based on tags made it possible to execute all the tests even when some test cases and features were not working.
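This tag-based criticality mechanism can be sketched along the following lines: every test carries free-form tags, and the run is judged only by the tests carrying the critical tag, so one unfinished feature does not invalidate the whole execution. The test names, tags and results below are invented for illustration.

```python
# Sketch of judging a test run by a critical subset selected via tags.

tests = [
    {"name": "Add Map Layer",      "tags": {"regression"},  "passed": True},
    {"name": "Change Layer Color", "tags": {"in-progress"}, "passed": False},
]

def run_status(tests, critical_tag):
    """The run fails only if a test carrying the critical tag fails."""
    critical = [t for t in tests if critical_tag in t["tags"]]
    return all(t["passed"] for t in critical)

# The unfinished color-changing test fails, but the run is still green
# because only "regression"-tagged tests are critical.
print(run_status(tests, "regression"))  # True
```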
10.3 February Sprint
In the second sprint the goal was to finalize the test cases for the map layer functionality and to start ATDD with further functionality. The functionality selected was the visualization of the Abis configuration. The purpose of the feature was to collect data from multiple network elements and show the Abis configuration based on the collected data.
PLANNING
Immediately after the sprint planning, the people involved in developing the visualization of the Abis configuration feature held a meeting about the details of the feature. Present were a feature owner, a specification person, two developers, a test engineer, a test automation engineer, a usability expert and a scrum master. The usability specialist had developed a prototype showing how the functionality should look. Using this prototype as a starting point, the team discussed different aspects of the feature and asked clarifying questions. The test automation engineer made notes during the meeting.
IMPLEMENTATION
Based on the issues agreed in the meeting, the initial test cases were created and sent by email to all the participants. The test cases were created on a high level to make them more understandable; they can be seen in Figure 24.
Figure 24: Initial acceptance test cases for the Abis configuration
After the test cases were described, the needed keywords were implemented. Figure 25 contains the implementation of the sentence format keywords. The variables used in the keywords were defined in the same file as the keywords themselves. As can be seen, the User opens and closes Abis dialog from navigator user keyword was used by multiple keywords; its implementation is shown in Figure 26. The user keywords used to implement it consist of both user keywords and base keywords.
Figure 25: The highest level user keywords used to map the sentences to user keywords and variables
Figure 26: Implementation of the lower level user keyword “User opens and closes Abis dialog from navigator”
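The keyword layering shown in Figures 25 and 26 can be sketched as follows: a sentence format keyword simply delegates to a lower level user keyword, which in turn calls base keywords. The keyword, element and button names are illustrative, not the Project’s.

```python
# Sketch of the three keyword layers: sentence keyword -> user keyword -> base keywords.

def select_from_tree(node):        # base keyword (names invented)
    return [f"selected {node}"]

def push_button(button):           # base keyword
    return [f"pushed {button}"]

def user_opens_and_closes_abis_dialog_from_navigator():
    """Lower level user keyword composed of base keywords (cf. Figure 26)."""
    return (select_from_tree("BCF-1")
            + push_button("Open Abis")
            + push_button("Close"))

def user_can_open_abis_view():
    """Sentence format keyword: maps the test sentence to a user keyword."""
    return user_opens_and_closes_abis_dialog_from_navigator()

print(user_can_open_abis_view())
```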
Again, more base keywords were needed. However, these base keywords were not implemented into the SwingLibrary. There was a need to implement a helper library to handle the data that was checked from the configuration table. The configuration table contained 128 cells, and the content of every cell was to be verified. The tabular test data format allowed describing the expected output in almost the same format as it was seen in the application. However, the expected outcome could not be defined beforehand. The input for the feature was configuration data from a mobile network, and in this context it was hard to create all the needed network data in such a way that the expected outcome would be known and the data would be correct. Therefore, existing test data was used in the test cases, and the configuration view to be tested automatically was selected from the alternatives available in the existing test data.
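A helper of the kind described above can be sketched as a cell-by-cell table comparison that reports every mismatch at once; the small 2×3 table below stands in for the Project’s 128-cell configuration table, and the cell values are invented.

```python
# Sketch of a table-verification helper (invented data, not the Project's).

def verify_table(actual, expected):
    """Compare tables cell by cell and return a list of mismatch messages."""
    mismatches = []
    for r, (actual_row, expected_row) in enumerate(zip(actual, expected)):
        for c, (got, want) in enumerate(zip(actual_row, expected_row)):
            if got != want:
                mismatches.append(f"cell ({r},{c}): expected {want!r}, got {got!r}")
    return mismatches

expected = [["TRX-1", "DAP-1", "64k"],
            ["TRX-2", "DAP-1", "32k"]]
actual   = [["TRX-1", "DAP-1", "64k"],
            ["TRX-2", "DAP-2", "32k"]]
print(verify_table(actual, expected))  # ["cell (1,1): expected 'DAP-1', got 'DAP-2'"]
```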
Soon after the midpoint of the sprint there was a meeting where the status of the visualization of the Abis configuration feature was checked. The feature was used by a few specialists while the scrum team responded to the raised questions and wrote down observations. Based on this meeting and some other informal discussions, some more details were agreed to be done in the sprint. Figure 27 contains some of the test cases which were added and updated after the meeting. The changes were marked with bolded text to highlight them.
Figure 27: Some of the added and updated test cases
As was mentioned earlier, there was no easy way to test the map component automatically. However, one of the acceptance test cases was supposed to test that the Abis view can be opened from the map, and it was not seen as possible to automate this test case with a reasonable effort. The test case was still written down and tagged with the tag manual. The manual tag made it possible to see all the acceptance test cases that had to be executed manually. Another challenge was keeping the test cases in sync with the implementation, because the details were changed a few times.
The test case User Can See The Relation Between TRX And DAP, visible in Figure 27, was one of the test cases added in the middle of the sprint. Its implementation could not be finished during the sprint. The exact implementation of the feature was changed a few times, and the test case was not implemented before the implementation details were final. The feature was ready just before the sprint ended, and there was no time to finalize the test case. This was because the final details were decided so late, and because different people were implementing the feature and the test case.
The problems in the map layer handling feature and base keywords were discussed during the sprint. Some changes to the map layer handling functionality were agreed on. These were mainly functional changes to solve the problems in the feature itself. The acceptance test cases needed updates due to these changes. While the test cases were updated, they were changed to include only sentence format keywords. Some of the new test cases can be seen in Figure 28. The change was quite easy, because most of the keywords were already in place and the mapping from sentence format keywords to user keywords and variables was very straightforward.
Figure 28: Some of the updated acceptance test cases for the map layer handling functionality
The problem with implementing the base keyword in the previous sprint was solved soon after the test cases were updated. There was also a need to implement one more base keyword, and again there were some small technical problems. The problem was again with a custom component, but this time it was solved quite quickly. Some challenges in the implementation, together with implementing functionality of a higher priority, took so much time that the map layer handling functionality was not ready at the end of the sprint, and a few nasty defects remained open.
TEST EXECUTION
The test cases were executed by the test automation engineer while developing them, similarly to the previous sprint. During the sprint there were still problems with the builds. This made it harder to check whether the test cases were working and when some of the features were ready. During this sprint, the Abis configuration test cases found a defect in a feature which had already worked.
REPORTING
The reporting was done in a similar way as in the previous sprint.
10.4 March Sprint
During the previous two sprints it was seen that the test automation team bore too much of the responsibility for the test automation. It was decided that the knowledge should be spread more widely across the whole team, which meant arranging training during the sprint. The purpose was to continue the ATDD research with other new functionality. However, some of the team had to participate in a maintenance project during the sprint, and the sprint content was heavily reduced.
PLANNING
The team had agreed that the details of new functionality should be agreed on a more detailed level in the sprint planning. Therefore, the team and the feature owner discussed on a detailed level what should be implemented in the sprint. Not all the details could be settled in the first planning meeting, and thus a second meeting was arranged. The feature owner, two developers, a usability expert/feature owner, a test engineer and a test automation engineer participated in the second meeting. The functionalities were gone through, and there was discussion about the details. Agreements about the implementation details were made and written down.
IMPLEMENTATION
The test automation engineer was responsible for arranging the training, and therefore the test cases were not implemented at the beginning of the sprint. After the training, a developer and the test automation engineer implemented the test cases which had not been finished during the previous sprint. At this point the contents of the current sprint were reduced: all the functionality that had been planned in the second planning meeting was moved to the following sprint. The initial test cases were still created before the sprint ended, and some of them can be seen in Figure 29.
Figure 29: Initial test cases for the Abis rule
TEST EXECUTION
The test cases implemented by the developer and the test automation engineer were added to the automated test execution system immediately after they were ready. All test cases created during the previous sprints were already there.
10.5 April Sprint
The goal in the April sprint was to continue ATDD with the Abis analysis functionality. There were some big changes at the beginning of the sprint. The Abis analysis workflow was to be ready at the end of the sprint, which led to combining the two teams into one big sprint team. The team that had not worked with the Abis analysis earlier needed an introduction to the functionality. The size of the team made it impossible to go into such detail that the acceptance tests could have been updated during the sprint planning.
PLANNING
As mentioned in the previous chapter, the initial acceptance test cases had been created during the earlier sprint. After the sprint planning, the feature owner, the specification engineer and the test automation engineer went through the initial test cases and updated them. Some of the details still remained open, as the feature owner found them out later in the sprint. After the test cases were updated, they were sent to the whole team.
IMPLEMENTATION
The implementation started immediately after the acceptance test cases were updated. The test automation engineer wrote the test cases. After some of the sentence format keywords had been implemented, one step needed clarification. The test automation engineer invited two usability specialist/feature owners and a specification/test engineer to a meeting where the different options to solve the usability problem were discussed. After all the options had been evaluated, the test automation engineer discussed possible solutions with the developer and the software architect. The changes were agreed to be implemented, and three developers, the usability expert/feature owner and the test automation engineer planned and agreed on the details of the feature. Based on the agreed details, the test automation engineer created the acceptance tests for the new feature. Technically the test cases were created in a similar manner as in the previous sprints.
The acceptance test cases were dependent on each other, because every test case was a step in the Abis analysis workflow. This caused some problems, as the first step, getting the needed data into the application, was ready only on the last day of the sprint. Part of the test cases could not be finalized before this data was available, and it was seen as too laborious to calculate all the needed inputs beforehand. Also, one part of the feature could not be finished during the sprint. Therefore a few test cases were not ready when the sprint ended.
At the end of the sprint, the test engineer and the test automation engineer created some more detailed test cases to test the Abis rule. These test cases tested different variations and checked that the rule result was correct. However, the rule was not working as it was meant to. The developer, the feature owner and the test automation engineer had understood the details differently, which led to a more detailed discussion between these parties. It was even noticed that some of the special cases were not handled correctly. Based on the discussion, the developer and the test automation engineer wrote down all the different situations and mailed them to the feature owner. It was agreed that details of this kind need acceptance test cases in the coming sprints.
TEST EXECUTION
Some of the test cases were verified in the developers’ development environments. One test case was failing, and it was noticed that the feature implementation had to be improved to fulfill the requirements. The developers continued the implementation, and once they thought it was ready, the acceptance test cases were executed again and passed. It was then seen that the feature was ready. Some other test cases were executed on the test automation engineer’s workstation. Some problems and misunderstandings were found, and they were reported to the developers.
REPORTING
The test cases were added to the acceptance test execution environment after they were updated at the beginning of the sprint. The idea was to make the development status visible to all via the acceptance test report. However, all the test cases were failing for most of the sprint, and only a few days before the sprint ended did some of them pass. Even at the end of the sprint, not all of them were passing.
It was also planned to create a running tested features (RTF) diagram from the acceptance test results. However, this idea was discarded because it was seen that it would not give a correct picture of the project’s status. Some of the test cases were not acceptance test cases in the sense that they had been defined by the test engineers, not by the feature owners. This limitation could have been avoided by using an acceptance tag and including only test cases with this tag in the RTF diagram. An even more important reason for dropping the idea was the fact that the whole project’s development was not done in the ATDD manner.
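The discarded RTF metric could have been computed along the following lines, counting per build the passing test cases that carry an acceptance tag; the test names, tags and results are invented for illustration.

```python
# Sketch of a running-tested-features count over one build's results.

def running_tested_features(results):
    """Count passing tests that carry the 'acceptance' tag."""
    return sum(1 for t in results if "acceptance" in t["tags"] and t["passed"])

build_results = [
    {"name": "User Can Add Map Layer",  "tags": {"acceptance"}, "passed": True},
    {"name": "User Can Open Abis View", "tags": {"acceptance"}, "passed": False},
    {"name": "Abis Rule Variation 3",   "tags": {"detailed"},   "passed": True},
]
# Only the first test is both acceptance-tagged and passing.
print(running_tested_features(build_results))  # 1
```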
10.6 Interviews
This chapter collects the experiences of the project members involved in the team which developed features with acceptance test-driven development. The interview methods are described on a more detailed level in Chapter 9.3. Altogether nine persons were interviewed, and in this chapter the results are briefly described. The results of the interviews are analyzed on a more detailed level in Chapter 11.
CHANGES IN THE SOFTWARE DEVELOPMENT
The interviewees thought that the biggest change due to the use of ATDD had been the increased understanding of the details and the workflow in the whole team. One developer thought that ATDD had forced the team to communicate and co-operate. Another developer mentioned that due to ATDD, feedback about the features is obtained faster. The test engineers saw that they were able to influence the developed software more than before.
BENEFITS
The biggest benefit mentioned in the interviews was a better common understanding of the details due to increased communication, cooperation and detailed planning. Four interviewees saw that the requirements and feature descriptions were more accurate than before. One feature owner had noticed missing details in the requirements while defining the acceptance tests. The developers thought that they knew better what was expected of them; three other interviewees agreed. Four interviewees saw that the increased understanding of the details had led to doing the right things already the first time. Two interviewees thought the acceptance test cases had increased the overall understanding of the workflow. One respondent had noticed improvements in teamwork.
The test engineers thought their early involvement was beneficial because they were able to influence the developed software, ask hard questions and create better test cases due to the increased understanding. One test engineer thought that being in the same cycle with the development is very efficient, because then people remember what they have done, and therefore problems can be solved with a smaller effort. One feature owner was of the opinion that the test engineers and developers understand better what to test and how to test it. She also mentioned that the testing now covers a full use case. Three interviewees mentioned that feedback was obtained much faster than earlier. The early involvement of the test engineers and test automation helped to shorten the feedback loop. One developer saw that the automated user interface testing had improved. One interviewee thought the automated acceptance tests keep the quality at a certain level but do not increase it. Another interviewee was of the opinion that test automation helps to reduce the manual regression testing, so the test engineers can concentrate more on complex scenarios and make more use of their domain knowledge.
DRAWBACKS
There were not many drawbacks according to the interviewees. Two interviewees thought that the initial investment in test automation is the biggest disadvantage, and they wondered whether the costs will be covered in the long run. Two interviewees were of the opinion that the extra work needed to rewrite the test cases after possible changes is a problem. One feature owner thought that the time needed to write the initial test cases is also a kind of drawback. Two interviewees speculated that some developers may not like others coming into their territory. Four interviewees could not find any weaknesses of the same magnitude as the benefits.
CHALLENGES
Test data was seen as the biggest challenge, and five respondents mentioned it. Flexible creation of test data and its use in acceptance test cases were considered challenging. Reliable automated algorithm testing was also seen as problematic. One developer mentioned that testing the map component and other visual issues with automated test cases would be troublesome. Three interviewees thought that there may be challenges with change resistance. The test engineers found that it was difficult to find the right working methods. The increased cooperation increases the need to ask the right questions, and that can also be challenging.
INFLUENCE ON THE RISK OF BUILDING INCORRECT SOFTWARE
There were varying views on how ATDD influences the risk of building incorrect software. Some interviewees saw two risks: building software that does not fulfill the end customer’s expectations, and building software that does not fulfill the requirements or the feature owner’s expectations. Two persons saw that ATDD does not affect the risk of building incorrect software from the end user’s point of view. On the other hand, one test engineer thought that the early involvement of testing may even decrease that risk. Seven interviewees saw that the second risk, not creating the software that had been specified and wanted by the internal customer, had decreased compared to earlier. Increased communication, discussion about the details and an increased common understanding before the implementation were seen as the main reasons. One interviewee thought that if the test cases are incorrect and are followed too narrowly, the risk may increase. Another response was that if the application is developed too much from the test automation’s point of view, the actual application development could suffer.
VISIBILITY OF THE DEVELOPMENT STATUS
The visibility of the development status was not seen to have changed much with the use of ATDD. One individual view was that the automated tests will increase it in the future. Another comment was that breaking the tests into smaller parts and arranging a sprint-specific information radiator could help. The developers thought that merging the acceptance test reports into the build reports would improve the situation.
QUALITY AGREEMENT BETWEEN THE DEVELOPMENT AND FEATURE OWNERS
Seven interviewees saw the acceptance test cases as an agreement between the development team and the feature owners, because the test cases were created in cooperation. However, four of them saw the agreement as a functional agreement rather than a quality agreement; quality was seen as a bigger entity than correct functionality. Two interviewees saw that the agreement had not yet formed.
CONFIDENCE IN THE APPLICATION
In general, confidence in the application had increased. One developer saw that ATDD had enhanced his confidence in the software because he knew that he was developing the right features. Three other persons also saw that confidence had grown because there was a common understanding of what should be done. Three other interviewees were of the opinion that test automation had built the confidence, mainly because passing automated test cases indicated that the application was working at a certain level. One interviewee saw that the automated test cases increase confidence because she could trust that something was working after it had been shown to work in the demo. One test engineer saw that the possibility to affect the implementation details had enhanced his confidence in the software.
WHEN PROBLEMS ARE FOUND
Five interviewees thought that problems can be found earlier than without ATDD, and three of them had already experienced this. However, four of them were of the opinion that manual testing and the test engineers’ early involvement were the key issues. Two of them also mentioned that co-operation in the early phase can prevent problems from occurring. Four interviewees had not experienced changes, though one of them hoped that problems could be found faster in the future.
REQUIREMENTS UP-TO-DATENESS
According to the interviewees, the requirements were more up to date than before. Seven of the interviewees had seen improvement in the way the requirement specifications and feature descriptions were updated. One feature owner and one specification engineer mentioned that some missing requirements were noticed while creating the test cases. Increased communication between the different roles was also seen to have helped in updating the specifications. One developer and one test engineer thought that if some of the agreed functionality has to be changed during the development, the change may not get documented. Two interviewees had not seen any change compared to earlier.
CORRESPONDENCE BETWEEN TEST CASES AND REQUIREMENTS<br />
Seven of the interviewees saw that the test cases and requirements were more in sync than before. The reasons mentioned were cooperation in test case creation, increased communication, better understanding of the feature, and agreement about the details. Two persons thought that the test cases correspond better to the requirements at the beginning, when the details are agreed. On the other hand, they thought that changes during the implementation phase may lead to differences between the test cases and the requirements. One feature owner/usability expert saw that ATDD does not ensure that the test cases and requirements stay in sync. He also thought that the test cases cannot replace other specifications; in his opinion, there is not even a need for that.
DEVELOPERS’ GOAL<br />
Both developers thought that ATDD had made it easier to focus on the essential issues. One of them thought the acceptance test cases had also increased his understanding of where his code fits into the bigger context. Five persons other than the developers thought that the developers' focus was more on the right features. One interviewee hoped that the developers' goal had shifted towards features being implemented, tested and documented, not only implemented.
DESIGN OF THE SYSTEM<br />
One developer thought that ATDD had helped in finding the design faster than before. The other developer had not noticed any changes in the design.
REFACTORING CORRECTNESS<br />
The developers found that ATDD had not yet affected the evaluation of refactoring correctness. However, they thought that automated acceptance tests could be used for that later on.
QUALITY OF THE TEST CASES<br />
Most of the interviewees were of the opinion that the quality of the test cases had increased. The following justifications were presented: test cases are created in cooperation, test cases correspond better to the requirements, test cases cover the whole workflow, and test cases are more detailed and executed more often. Some interviewees could not tell if there had been any changes. One developer thought that the acceptance tests run through the graphical user interface had been a huge improvement to the user interface testing. He explained that it had been very troublesome to unit test the user interfaces extensively.
TEST ENGINEERS’ ROLE<br />
In general, it was seen that the test engineers' role had broadened due to the use of ATDD. Most of the interviewees mentioned that being part of the detailed planning had been the biggest change. Other mentioned changes were an increased need to communicate and a bigger role in information sharing. The test engineers themselves thought the change had been huge. The ability to influence the details makes the work more rewarding, and the improved knowledge about the expected details makes it possible to test what should be done instead of what has been done. One feature owner thought that ATDD had eased the test engineers' tasks because the test cases were defined together.

Four interviewees had noticed the old confrontation between the developers and test engineers starting to decrease due to the increased cooperation. One developer had come to understand better the difficulties in testing, which in turn had changed his view of the test engineers. One developer said he was happy that the communication does not happen only through defect reports.
FORMAT OF THE TEST CASES<br />
All the interviewees thought that the test cases are currently in a format that is very easy to understand. The sentence format was seen as very descriptive. However, one developer had noticed some inconsistency between the terminology in the test cases and the requirements specification. A few persons thought that some domain knowledge is still needed to understand the test cases. One test engineer thought the format is much more understandable than that of test cases created with traditional test automation tools.
LEVEL OF THE ACCEPTANCE TESTS<br />
The interviewees found it difficult to define on which level the acceptance test cases should be. One test engineer thought that discussion at the beginning of the sprint might help to write proper acceptance test cases and to avoid duplicating the same tests on the unit testing and acceptance testing levels. Two persons thought that more detailed test cases would need better test data. One of them also mentioned that it will not be possible to test all the combinations, and he doubted the profitability of detailed automated test cases due to the increasing maintenance costs. One specification engineer thought that the acceptance test cases had probably been detailed enough, but that more experience is needed to become convinced. The other interviewees did not have any views on this issue.
EASE OF TEST AUTOMATION
Most of the interviewees did not know whether ATDD had affected the ease of test automation. One test engineer thought that ATDD helps to plan which test cases to automate and which not.
IMPROVEMENT IDEAS<br />
The interviewees did not have any common opinion on the improvement areas. One interviewee thought that increasing routine is the most important thing to concentrate on, because the method had been used only for a short time. One feature owner saw that in some areas there is a need for more detailed acceptance tests. She also mentioned that there could be a checkpoint during the sprint where the acceptance test cases are reviewed.
Both developers thought that reporting could be improved to shorten the feedback loop even more. Adding the acceptance test reports to the build reports was seen as a solution. One of the developers thought that the written acceptance test cases could be communicated so that everyone really knows those test cases exist. One feature owner/usability specialist was of the opinion that splitting the acceptance test cases into smaller parts would help in following the progress inside the sprint. He felt that smaller acceptance tests with sprint-specific reporting could be used to improve visibility to all project members.
One test engineer saw room for improvement in defining and communicating what is tested with manual exploratory tests, automated acceptance tests, and automated unit tests. Two respondents thought that a more specific process description should be created to ease the process adoption if ATDD were taken into wider use. It was also seen that the whole organization's support is needed for the change.
11 ANALYSES OF OBSERVATIONS<br />
In this chapter the observations made during the study, including the interviews, are analyzed against<br />
the research questions presented in Chapter 8.<br />
11.1 Suitability of the Keyword-Driven Test Automation Framework with Acceptance Test-Driven Development
The first research question was: Can the keyword-driven test automation framework be used in acceptance test-driven development? This question was divided into two more specific questions, which are analyzed first. After the specific questions have been covered, the analysis of the actual research question is presented.
IS IT POSSIBLE TO WRITE THE ACCEPTANCE TESTS BEFORE THE IMPLEMENTATION WITH THE KEYWORD-DRIVEN TEST AUTOMATION FRAMEWORK?
In the Project, the test cases were written in two phases. The initial test cases were written based on the information gathered from the planning meetings. Writing the initial test cases took place after the planning, and they were usually ready before the developers started implementing the features. Therefore, it can be said that the initial test cases were written before the implementation started. However, it has to be taken into account that the initial test cases were on a high level and there were only between 10 and 25 of them per sprint. Had there been more test cases, or had the test cases been on a more detailed level, the result might have been different.
The second phase, implementing the keywords needed to map the initial test cases to the system under test, was conducted in parallel with the application development. With some test cases, it was not possible to implement all the keywords before the actual implementation details were decided. There were also difficulties in implementing the test cases whose inputs and outputs depended on the features under development, as well as problems in implementing the base keywords. These issues prevented finalizing some of the test cases during the sprint. Therefore, only some of the acceptance test cases were fully ready before the corresponding feature. In other words, some test cases could be implemented neither before the development started nor before the features were ready. However, the test cases were mainly ready soon after the features.
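The two-phase approach can be illustrated with a small sketch: the initial test cases are plain step lists, and the keywords that map them to the system under test are implemented separately, possibly later. All names here (KeywordLibrary, Open Application, network.xml, and so on) are illustrative assumptions, not the actual keywords or data used in the Project.

```python
# A minimal sketch of the keyword-driven idea: high-level test steps are
# plain phrases that a small framework maps to implementation methods.

class KeywordLibrary:
    """Implements the 'base keywords' that touch the system under test."""

    def __init__(self):
        self.log = []

    def open_application(self):
        self.log.append("application opened")

    def import_network_data(self, filename):
        self.log.append(f"imported {filename}")

    def verify_status(self, expected):
        # A real library would query the application state here.
        assert "application opened" in self.log, "application not running"

def run_test(steps, library):
    """Execute a test case given as (keyword, args) pairs."""
    for keyword, args in steps:
        # 'Open Application' -> library.open_application, and so on.
        method = getattr(library, keyword.lower().replace(" ", "_"))
        method(*args)

# An initial test case can be written like this before the feature exists;
# it only runs once the keywords (and the feature behind them) are done.
test_case = [
    ("Open Application", ()),
    ("Import Network Data", ("network.xml",)),
    ("Verify Status", ("OK",)),
]

library = KeywordLibrary()
run_test(test_case, library)
print(library.log)
```

The sketch shows why the initial test cases can be written early: they are only data, while the keyword implementations can follow the application development.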
The reasons behind the test case implementation problems had to be analyzed. The first problem was that the interface between the test cases and the application was changing. Obviously, it was not possible to implement the test cases before the interface was defined. However, the test cases were not implemented immediately even after the interface was clear. This was because different persons were implementing the test cases and the features. Had the same person implemented both, the test cases could have been created on time. This problem also has something to do with the tool and approach used to automate the test cases. If the interface had been a programmatic interface, the developers would have been forced to create the code needed to map the test cases to the application, and changes in the interface would have been one person's responsibility. Therefore, it can be said that the selected interface made this problem possible. To avoid it, one can either move the test case implementation to the developer or improve the communication between the person implementing the test cases and the person developing the features.
The second problem was defining the inputs and outputs beforehand. The interviewed project members mentioned that test data is the biggest challenge in the domain. In the Project, some expected results were calculated for verification purposes. However, in some test cases more data was needed, and it was not seen sensible to calculate all this data only for the sake of a few test cases. These problems can obviously make it hard or even impossible to implement the test cases before developing the features. On the other hand, these problems were not tool specific. It is even possible that in some other context these kinds of problems do not exist or are at least easier to solve. Where such problems do exist, it has to be decided case by case whether it is worth the extra effort to implement the test cases in a test-first manner.
The problems with creating the base keywords were technical. These kinds of problems occur every now and then, and it was also noticed that it might be hard to implement the system-specific base keywords without trying them out. There was no single reason for the problems, and as the knowledge about the library increased, the number of problems decreased. More importantly, all of the problems were eventually solved.
IS IT POSSIBLE TO WRITE THE ACCEPTANCE TESTS IN A FORMAT THAT CAN BE UNDERSTOOD WITHOUT TECHNICAL COMPETENCE WITH THE KEYWORD-DRIVEN TEST AUTOMATION FRAMEWORK?
The acceptance tests were easy for all the project members to understand. The main reason for this was that the acceptance test cases were written using plain-text sentences, in other words sentence format keywords. However, using the sentence format keywords caused extra cost: one additional abstraction layer was needed in the test cases. Whenever inputs were defined in the test cases, they were given as arguments to the keyword implementing the sentence format keyword, which in some cases led to duplicate data. The sentence format keyword was first converted to a user keyword and its argument or arguments, and then the user keyword was mapped to other keywords. Implementing a sentence format keyword usually took only seconds, so the cost was not significant. This was because the keyword-driven test automation framework supported a flexible way of defining user keywords in the test data. Without this functionality, it may be harder to use the sentence format keywords and the cost may be higher. Overall, the clarity gained with the sentence format keywords in the Project was worth the extra effort.
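The conversion from a sentence format keyword to a user keyword with arguments can be sketched as a simple pattern match. This is a minimal illustration of the idea, not Robot's actual mechanism; the sentences and keyword names are invented for the example.

```python
import re

# Sketch: each plain-text sentence pattern maps to an underlying user
# keyword, with the varying parts captured as arguments.

SENTENCE_PATTERNS = [
    # regular expression with capture groups   -> underlying user keyword
    (re.compile(r"User imports the file (\S+)"), "import_file"),
    (re.compile(r"The report contains (\d+) rows"), "verify_row_count"),
]

def resolve(sentence):
    """Map a sentence format keyword to (user_keyword, arguments)."""
    for pattern, keyword in SENTENCE_PATTERNS:
        match = pattern.fullmatch(sentence)
        if match:
            return keyword, match.groups()
    raise ValueError(f"No keyword implements: {sentence!r}")

print(resolve("User imports the file network.xml"))
# -> ('import_file', ('network.xml',))
```

The extra abstraction layer discussed above is exactly this mapping step: the readable sentence on top, the reusable user keyword underneath.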
However, there are some doubts about the suitability of the sentence format keywords for lower-level test cases, especially if the test cases are created in a data-driven manner and only the inputs and expected outputs vary. In these cases, the overhead caused by the extra abstraction layer may become a burden, and it would probably be better to use descriptive keyword names and add comments and column names to increase the readability of the test cases. This needs further research, because the acceptance test cases created in the Project were mainly on a high level.
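To make the contrast concrete, a data-driven test could look like the following sketch, where a single keyword (here a plain function standing in for one) is run against a table of inputs and expected outputs. The function and data are illustrative assumptions.

```python
# Sketch of the data-driven style: only the rows of data vary, so a
# sentence format keyword would add little readability here.

def calculate_discount(order_total):
    """Behavior under test: 10% discount for orders of 100 or more."""
    return round(order_total * 0.9, 2) if order_total >= 100 else order_total

# Each row is one test case: (input, expected output). A descriptive
# keyword name plus column headers usually documents this well enough.
TEST_DATA = [
    (50, 50),
    (100, 90.0),
    (250, 225.0),
]

for order_total, expected in TEST_DATA:
    actual = calculate_discount(order_total)
    assert actual == expected, f"{order_total}: expected {expected}, got {actual}"
print("all data-driven cases passed")
```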
CAN THE KEYWORD-DRIVEN TEST AUTOMATION FRAMEWORK BE USED IN ACCEPTANCE TEST-DRIVEN DEVELOPMENT?
The answer to this question is ambiguous; it depends on how strictly acceptance test-driven development is defined. It is clear that the acceptance test cases were not implemented before the development. In the Project, it would have been very unprofitable, and probably also impossible, to implement all the test cases in a test-first manner. The strict test-first approach with acceptance test cases may be hard in any environment, and Crispin (2005) has also noticed more problems than benefits with the strict test-first approach. On the other hand, as mentioned earlier, the initial test cases were mainly ready before the development. Therefore, the acceptance test cases were driving the development by giving a direction and a goal for the sprints. One developer's comment, "The acceptance test cases really drove the development!", supports this statement.
However, the test cases created with the keyword-driven test automation framework can be on a very high level due to the ability to create abstraction layers in the test cases. This may lead to a situation where a high-level use case is converted to high-level test cases, and therefore the details are not agreed and the benefits of ATDD are lost. In the Project, some of the test cases were created on such a high level that the problems were noticed only when the test cases were implemented. At least one usability problem was noticed while implementing the test cases; with more detailed test cases it could have been noticed already in the planning phase. On the other hand, the usability problem was solved during the sprint, and without ATDD it would have been noticed and corrected much later. Also some misunderstandings noticed at the end of the April sprint could have been avoided with more detailed test cases.
It was also observed that some of the agreed acceptance test cases were not driving the developers' work as well as they could have. With some features, the test automation engineer found problems that could have been avoided if the developers had followed the test cases more strictly. These problems were not major, but some extra implementation was needed to fix them. These situations were possible because the test automation engineer implemented the test cases instead of the developers. There were two reasons why the test automation engineer was implementing the test cases. First, the keyword-driven test automation framework made it possible to implement the test cases with keywords, without programming. The other reason was the interface used to access the Product from the acceptance test cases. Because there was a test library for accessing the graphical user interface of the Product, it was possible to write the test cases without the developers' continuous involvement. With tools like FIT (Framework for Integrated Test), there is usually a need to implement some feature-specific code between the test cases and the application, and the developers are therefore forced to work closely with the test cases. With the keyword-driven test automation framework, however, this involvement is not forced by the tool.
Overall, it seems that the keyword-driven test automation framework can be used in acceptance test-driven development if the strict test-first approach is not required. However, a few things are good to keep in mind when the keyword-driven test automation framework is used with ATDD. Creating only high-level test cases should be avoided, because they will not drive the discussion to the details, which was mentioned as the biggest benefit of ATDD. If different persons are creating the test cases and implementing the application, the communication between these two parties has to be ensured.
11.2 Use of the Keyword-Driven Test Automation Framework with Acceptance Test-Driven Development
The second research question was: How is the keyword-driven test automation framework used in acceptance test-driven development in the project under study? This question was divided into acceptance test case planning, implementation, execution, and reporting. Chapter 10 already answers these questions, but in this chapter the sprints are summarized and analyzed.
HOW, WHEN AND BY WHOM THE ACCEPTANCE TEST CASES WERE PLANNED<br />
There was no formal procedure for defining the acceptance test cases; rather, the test cases were defined case by case. However, in all the cases the implementation details were discussed in a group containing at least a developer, a feature owner, a usability specialist, and a test engineer, and the discussion was noted down in different sketches and notepads. These discussions usually took place soon after the sprint planning, and always before the implementation. After the meetings, it was mainly the test automation engineer's task to convert the acceptance test cases into the tabular format used with Robot. In the April sprint, the acceptance test cases were updated by a group including a feature owner, a specification engineer and a test automation engineer.
Writing the test cases and details down quickly in the planning meetings was noticed to be a good choice. The discussion was not hindered by someone writing out the test cases; instead, all the participants were really taking part in the conversation. However, there was one drawback with this approach. In a few meetings, some of the details needed to implement the test cases were not discussed, because the issues were not handled systematically. Because these details were later straightened out with individual persons, they were not fully understood by the whole team. It was noticed that emailing and having the test cases in the version control system was not enough. Therefore, it would have been beneficial to have some kind of a meeting after the test cases were written to check and clarify all the details to all the team members. This was also mentioned by two team members in the final interviews. A similar problem was noticed in the April sprint, when the details were updated without the developers.
HOW, WHEN AND BY WHOM THE ACCEPTANCE TEST CASES WERE IMPLEMENTED<br />
The acceptance test cases were implemented using the sentence format keywords from the February sprint onwards, in a manner similar to the example explained in Chapter 7. The test case implementation took place in parallel with the feature implementation. The test cases were implemented mainly by the test automation engineer, but a test engineer and a developer also implemented some of the test cases.
In addition to the challenges presented earlier in this chapter, there were challenges in keeping the test cases up to date in the February sprint. This problem could have been avoided if the details had been agreed on a more detailed level in the planning meeting. On the other hand, some of the changes were made based on the feedback gained from the meeting arranged with the specialist, and these changes would have been very hard to foresee. However, updating the test cases was quite easy because the test cases were created with keywords.
The biggest challenge compared to the simple example presented in Chapter 7 was the increase in test execution time. Starting the application and importing the network data took a considerably long time, and executing those actions in every test case was not desirable, as the total test execution time would have been multiplied by the number of test cases. It was important to keep the execution time short, as it affected the duration of the test case implementation and the feedback time in the acceptance test execution system.
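A common way to keep the execution time down is to run the expensive actions once per test suite instead of once per test case, roughly as in this sketch. The function names and timings are illustrative; the Project's actual setup mechanism may have differed.

```python
import time

# Sketch: expensive actions (starting the application, importing network
# data) are run once as suite setup, not repeated in every test case.

def start_application():
    time.sleep(0.01)  # stands in for a slow application start

def import_network_data():
    time.sleep(0.01)  # stands in for a slow data import

def run_suite(tests, suite_setup):
    for action in suite_setup:   # executed once for the whole suite
        action()
    return [test() for test in tests]

tests = [lambda: "login ok", lambda: "report ok", lambda: "export ok"]
results = run_suite(tests, suite_setup=[start_application, import_network_data])
print(results)
```

The trade-off is that the test cases then share state, which is acceptable here precisely because the setup actions do not depend on any individual test.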
HOW, WHEN AND BY WHOM THE ACCEPTANCE TEST CASES WERE EXECUTED<br />
The acceptance test cases were executed in three ways. During the test case implementation, the test automation engineer executed the test cases on his workstation to verify that the test cases were implemented correctly. With some test cases, this meant that the features were already implemented at this stage. Some of the test cases were executed on the developers' workstations during the development by the test automation engineer and the developers. All the test cases were added to the acceptance test execution environment. At the beginning, the test cases were added to the environment at the end of each sprint; in the last sprint, however, they were added immediately after the initial versions were created. In the acceptance test execution environment, the test cases were automatically executed whenever new builds were available.
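The behavior of such an execution environment can be sketched as a loop that runs the acceptance tests against every build it has not seen before. The build names and functions are invented for the illustration; the Project's actual environment is not described in this detail.

```python
# Sketch: run the acceptance tests once for each previously unseen build.

def run_acceptance_tests(build):
    # A real environment would launch the test runner here.
    return f"tests executed against {build}"

def watch_builds(build_feed, already_tested):
    """Run the tests for every build not seen before."""
    reports = []
    for build in build_feed:
        if build not in already_tested:
            reports.append(run_acceptance_tests(build))
            already_tested.add(build)
    return reports

feed = ["build-41", "build-41", "build-42"]  # build-41 announced twice
reports = watch_builds(feed, already_tested=set())
print(reports)  # only two runs: one per new build
```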
As was already mentioned, the problems in the acceptance test implementation prevented the developers from evaluating whether they were ready by running the acceptance test cases. There were also two other reasons which made it hard for the developers to evaluate the readiness of their work with the automated acceptance test cases. First of all, some of the test cases tested the workflow and were therefore dependent on each other; the test cases in a late phase of the workflow could not be run before the features preceding them were working. Another reason was that single test cases tested multiple developers' work, and therefore a test case would not pass until all the parts it was testing were ready.
Many of the mentioned problems derive from the level of the test cases. When the acceptance test cases are on a high level, it is inevitable that they test multiple features, which in turn leads to the problems mentioned earlier. Avoiding the dependency between steps is hard in the workflow test cases. Even though these problems exist, it is obvious that the end-to-end acceptance test cases are needed. One possible solution is to divide the acceptance test cases more strictly into two categories. The higher-level test cases could be traditional system-level, end-to-end test cases, while the feature-specific test cases could be integration and system-level test cases concentrating on one feature only. The feature-specific test cases could be executed by the developers to evaluate a feature's readiness. Of course, this does not remove the problem that some features cannot be tested before the features they depend on are ready. This split would also make it easier for the developers to implement the acceptance test cases. The higher-level test cases could then still be the testers' responsibility, as was the case in the Project.
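The proposed split could be realized, for example, by tagging the test cases and selecting suites by tag, as in this sketch. The tags and test names are illustrative assumptions.

```python
# Sketch: tag each acceptance test as either feature-specific (run by the
# developers for quick feedback) or end-to-end (the testers' workflow
# tests), then select the suites by tag.

TESTS = [
    {"name": "create customer", "tags": {"feature"}},
    {"name": "edit customer", "tags": {"feature"}},
    {"name": "order-to-invoice workflow", "tags": {"end-to-end"}},
]

def select(tests, tag):
    """Return the names of the test cases carrying the given tag."""
    return [t["name"] for t in tests if tag in t["tags"]]

developer_suite = select(TESTS, "feature")
tester_suite = select(TESTS, "end-to-end")
print(developer_suite, tester_suite)
```

Selecting by tag keeps both categories in one place while letting each group run only the suite relevant to its role.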
HOW AND BY WHOM THE ACCEPTANCE TEST RESULTS WERE REPORTED<br />
The problems found during the test case implementation were reported to the developers. The results of the test case execution in the acceptance test execution environment were visible to all the project members through an information radiator. The problems found in the automated test execution were passed on to the developers by the test automation team members after they had investigated the problems. However, this investigation lengthened the feedback loop, as the testers were not always available. Had the test cases been implemented by the developers, the feedback loop could have been shorter. The developers thought that the feedback loop should be shortened further, even though they had experienced that it had already been cut radically.
11.3 Benefits, Challenges and Drawbacks of Acceptance Test-Driven Development with Keyword-Driven Test Automation Framework
The third research question was: Does acceptance test-driven development with a keyword-driven test automation framework provide any benefits? What are the challenges and drawbacks? Based on the experiences presented in Chapter 10.6 and the expected benefits and challenges presented in Chapter 4.3, the answers to these questions are analyzed.
BENEFITS<br />
The project members noticed many benefits in the use of ATDD. This was notable because the research period lasted only four months. The people who worked closely with the acceptance test cases had noticed many more benefits than those who were less involved in the use of ATDD. The role a person represented had much less influence on the experienced benefits than the degree of involvement. Of course, there were role-based differences in viewpoint on some of the issues, but the main benefits were perceived similarly across the roles. The same benefits were also noticed by the researcher while working in the Project.
While the research was being conducted, there were some changes in the Project, as was mentioned in Chapter 10.1. Not all of them were related to taking ATDD into use. The changes can be categorized into three main changes: taking test automation into use, a change towards agile testing, and taking ATDD into use. The relations and effects of these changes on the experienced benefits had to be analyzed; the analysis is presented next.
The main relations between the different benefits and their underlying reasons are represented in Figure 30. As can be seen in the figure, many relations between the benefits can be found. The figure is only a simplified view of the benefits and their relations, but it is used as the basis of this analysis.
Figure 30: The relations between the changes and benefits
One of the perceived changes was the increased communication. As was mentioned in Chapter 4.1, agile testing emphasizes face-to-face communication. When ATDD is in use, the work needed to create the test cases forces communication. The perceived increase in communication can also depend on the tester, as some people communicate more actively than others. Therefore, it is impossible to say how much of the increased communication was due to the use of ATDD and how much due to the other changes. The test engineers' early involvement can be seen as a consequence of taking agile testing into use. On the other hand, the use of ATDD forced the testers to take part in an earlier phase of the development, as the testers participated in the detailed planning. Therefore, most of the benefits gained from the testers' earlier participation were obtained because of the use of ATDD (see Figure 30). Cooperation in acceptance test case creation is also part of agile testing. However, in the Project it was due to the use of ATDD that the acceptance test cases were created with the feature owners. Therefore, it is hard to say whether these benefits could have been gained without the use of ATDD. In any case, the use of ATDD ensures that the acceptance test cases are created in cooperation, and the benefits related to that are therefore gained.
The only practice that was taken into use purely because of ATDD was the detailed planning done by the feature owners, developers and testers (bolded in Figure 30). This was one of the biggest reasons leading to an improved common understanding about the details, which was seen in the Project as the biggest benefit of the use of ATDD. Crispin (2005) also stated that the cooperation between the groups before development was the biggest benefit of ATDD. The need to create the test cases forces discussion. Of course, the detailed planning could be done without ATDD, and some of the mentioned benefits could still be gained. However, as can be seen in Figure 30, the benefits are sums of multiple factors, and it is hard to say which benefits would be gained if only the detailed planning were used. As mentioned earlier, an increased common understanding and the benefits following from it can be missed if the test cases are on too high a level and the planning is not detailed enough.
Test automation affected only a few of the observed benefits, as can be seen in Figure 30. This suggests that the tool used in ATDD is not decisive, as most of the benefits were gained from well-timed planning done by people working in different roles. However, the role of test automation in providing feedback and helping regression testing should not be undervalued. The benefits of automated regression testing were probably not broadly highlighted in the research because of the short research period. With a longer follow-up period, this benefit could have been greater.
The increased common understanding, the biggest benefit of the use of ATDD, does not provide additional value as such. However, the increased understanding leads to the "real" benefits. The most valuable benefits of the use of ATDD are therefore the decreased risk of building incorrect software and the increased development efficiency, as problems can be solved with a smaller effort and features are done right the first time. The change in the tester's role is also quite remarkable.
The use of ATDD also affects software quality. As the risk of building incorrect software is decreased, it is more likely that the created features will satisfy the end users' needs. A better understanding, improved test cases, and the fact that problems are found earlier should also improve the chances of finding defects with a significant impact. However, this remains to be seen. Test automation as a part of ATDD provides a certain level of quality. As the regression testing is done automatically, the testers hopefully have more time to explore the system and find defects. In the Project, non-functional testing was not taken into account when the acceptance test cases were created. However, it was discussed as one area into which the use of ATDD could be expanded. Therefore, the non-functional qualities were not improved by the use of ATDD.
The benefits mentioned were gained at least partially because of the use of ATDD. If agile testing, test automation, and increased communication are removed from the relations, none of the real benefits disappear. Of course, their removal may influence the magnitude of the benefits.
BENEFITS NOT PERCEIVED<br />
There were also areas where benefits were not noticed even though those areas were mentioned as possible benefit areas in the literature (see Chapter 4.3). Possible reasons why the benefits were not gained are analyzed here.
Development Status Was Not More Visible
There were no changes in the development status visibility, even though the acceptance test report was available to everyone through the information radiator and the web page. At the beginning of the research, the test cases were added to the acceptance test execution environment at the end of each sprint. Therefore, it was clear that the development status could not be followed inside the sprints. In the last sprint of the research period, the acceptance test cases were added to the acceptance test execution environment at the beginning of the sprint. However, this did not help, as the test cases were failing for most of the sprint. There were three reasons for this. First, the test cases were high-level test cases testing multiple parts of the Product in one test case. Therefore, even when the development team finished some single features, the test cases were still failing. Another reason was that the features were ready at a very late phase of the sprint, if even then. Therefore, the test cases were actually describing the development status, even though people did not see failing tests as progress indicators. The third reason was that not all of the acceptance test cases were ready at the same time as the features. The reasons behind this problem were analyzed in Chapter 11.1.
The development status visibility could be improved by dividing the development status follow-up into project level and sprint level progress. The division into higher level and feature level test cases presented in Chapter 11.2 could be exploited. The higher level test cases could be used to indicate which workflows are working, and could therefore provide the project level status. The feature level test cases could be used to follow up the progress inside the sprints.
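The two-level follow-up could be sketched, for example, as a small report script that separates workflow level and feature level test results. The tagging scheme, test names, and data below are invented for illustration; they are not taken from the Project or from Robot.

```python
# Sketch: derive project and sprint level status from test results that
# are tagged by level. The "workflow"/"feature" tags and the sample
# results are hypothetical, used only to illustrate the idea.

def pass_rate(results):
    """Share of passing tests as a percentage (0.0 if there are no tests)."""
    if not results:
        return 0.0
    passed = sum(1 for r in results if r["status"] == "PASS")
    return 100.0 * passed / len(results)

def status_report(results):
    """Workflow tests give the project level status; feature tests give
    the sprint level status."""
    workflow = [r for r in results if r["level"] == "workflow"]
    feature = [r for r in results if r["level"] == "feature"]
    return {
        "project_status": pass_rate(workflow),
        "sprint_status": pass_rate(feature),
    }

if __name__ == "__main__":
    results = [
        {"name": "Order placement workflow", "level": "workflow", "status": "FAIL"},
        {"name": "Add item to cart", "level": "feature", "status": "PASS"},
        {"name": "Remove item from cart", "level": "feature", "status": "PASS"},
        {"name": "Show cart total", "level": "feature", "status": "FAIL"},
    ]
    # The workflow still fails although most single features already pass,
    # which is exactly the situation described above.
    print(status_report(results))
```

Such a split makes it visible that a failing high-level workflow test does not mean that no progress has been made inside the sprint.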
Requirements Were Not Defined More Cost-effectively<br />
The test cases did not substitute for the requirement specifications in the Project. Therefore, the requirements and test cases were not created more cost-effectively. One clear reason was that the Project had been started before ATDD was tried out, and a requirement specification had already been created. Even if ATDD had been started at the beginning of the Project, the requirement specification would probably still have been created. One interviewed person also mentioned that there is no need to replace the requirements with the test cases. On the other hand, keeping duplicate data up-to-date can be seen as a burden.
No Remarkable Changes to System Design<br />
ATDD did not cause remarkable changes to the system design, even though one developer thought that he had found the design faster in some cases. A relatively short research period may be one reason why no changes were noticed. However, there might be other reasons as well. Reppert (2004) reported that remarkable improvements in system design were seen when ATDD was used in one project. It may be that this improvement could not be noticed here because the interface used to access the system from the test cases was different. As was mentioned in Chapter 4.3, acceptance test cases usually bypass the graphical user interface and access the internal structures directly. This was not the case in the Project, as the test cases used the graphical user interface to access the system under test. Therefore, there was no need to create test code that would interact directly with the internal structures. This may be the reason why the developers did not notice a significant change. So it seems that the interface used to access the system under test affects whether the system design is improved or not.
Acceptance Tests Were Not Used To Verify Refactoring Correctness
Developers in the Project thought that the acceptance test cases created with ATDD could be used to evaluate refactoring correctness, even though they had not done so yet. A longer research period is needed to properly assess the acceptance test cases' usefulness for evaluating refactoring correctness. However, it is hard to see any reason why the acceptance test cases created with the keyword-driven test automation framework could not be used to verify refactoring correctness. The coverage and level of the acceptance test cases probably have a bigger influence than the tool used to create them.
CHALLENGES<br />
As was mentioned in Chapter 10.6, the main challenge in the Project's environment was proper test data. This, however, was a domain specific testing problem, although it was seen to affect the creation of automated tests more than manual testing. There were also other challenges in automating the test cases. The base keyword creation problems were described in Chapter 11.1. There were also components in the application that could not be accessed from the automated test cases, as was mentioned in Chapters 10.2 and 10.3. As was already mentioned in Chapter 5.1, automating testing is not an easy task. Test automation was also seen as one of the biggest challenges in the use of ATDD by Crispin (2005) (Chapter 4.3). The test automation challenges presented were mainly general test automation challenges. Some of them relate to the interface selected for accessing the application. However, none of them were specific to keyword-driven test automation. The use of ATDD and agile testing made some of the problems easier to solve than they would have been in a more traditional environment. For example, it was easier to add the needed testability hooks to the Product because the implementations were done in parallel.
As was mentioned, test automation is a part of ATDD, but the biggest benefits can be achieved even if not all of the test cases can be automated. However, this leads to a need to handle manual regression testing. Therefore, it is not advisable to be immediately satisfied with manual tests. The importance of automated regression tests in iterative software development should not be forgotten. Of course, the scale of test automation has to be decided based on the context.
The second challenge mentioned in Chapter 4.3 was writing the tests before development. That was also noticed in the Project, as presented in Chapter 11.1. Crispin (2005) mentioned that the problem was a lack of time to write the test cases before development. In the Project, however, the problems were more test data and context specific. Time could have been a problem if the number of detailed level test cases had been higher.
The third challenge was the right level of test cases. Crispin (2005) noticed that when many test cases are written beforehand, the test cases can cause more confusion than help in understanding the requirements. It was noticed in the Project that there would have been a need for test cases on multiple levels, as was mentioned in Chapter 11.2. Two interviewees saw it as beneficial to include non-functional testing as a part of the acceptance test cases in the future. This would widen the goal of the acceptance test cases even further. This challenge with the right level of test cases probably derives from the wide definition of acceptance testing and the possibility to create test cases on multiple test levels simultaneously.
One more challenge was noticed in the use of the keyword-driven test automation framework. As there was no intelligent development environment for editing the test case files and resource files, the test data management took some time. Also, some developers found it difficult to find all the keywords that were used in the test cases and user keywords, because those were defined in multiple files. These problems with the test data management can be even bigger if there are more people implementing the test cases.
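The difficulty of locating keyword definitions follows from the way a keyword-driven framework merges keywords from several files at execution time. A minimal sketch of this mechanism is shown below; the file names, keywords, and override rule are invented for illustration and do not describe Robot's actual implementation.

```python
# Sketch: a keyword-driven runner resolves each keyword name against
# definitions collected from multiple resource files. Without tool
# support, a reader must search every file to find where a keyword is
# defined. All names and file contents here are hypothetical.

def build_registry(resource_files):
    """Map keyword name -> (defining file, implementation).
    Later files override earlier ones (an assumed, common convention)."""
    registry = {}
    for filename, keywords in resource_files.items():
        for name, action in keywords.items():
            registry[name.lower()] = (filename, action)
    return registry

def run_test(steps, registry):
    """Execute a test case: each step is a keyword name plus arguments.
    The log records which file each keyword came from."""
    log = []
    for name, *args in steps:
        filename, action = registry[name.lower()]
        log.append(f"{name} (from {filename}): {action(*args)}")
    return log

if __name__ == "__main__":
    resources = {
        "login_keywords.txt": {
            "Log In": lambda user: f"logged in as {user}",
        },
        "cart_keywords.txt": {
            "Add Item": lambda item: f"added {item}",
        },
    }
    registry = build_registry(resources)
    for line in run_test([("Log In", "alice"), ("Add Item", "book")], registry):
        print(line)
```

Because the registry is only assembled at run time, nothing in an individual test case file points to the defining resource file, which is why an editing environment that can resolve keywords would ease the test data management.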
DRAWBACKS<br />
Interviewees mentioned only a few drawbacks. One interviewee mentioned that writing the test cases took time, and saw this as a drawback. As more people were defining the test cases, more resources were needed. On the other hand, the first versions of the test cases were written by a test automation engineer, and therefore only the definitions were done with a bigger group. Two interviewees thought that updating the test cases can be seen as rework and therefore as a drawback. This drawback was also noticed in the February sprint. The reason was mainly that the details were not agreed upon well enough. However, the time used to make the changes was not remarkable. In all, it seems that the benefits gained from the use of ATDD clearly exceed the drawbacks.
11.4 Good Practices<br />
Good practices are summarized based on the literature, the observations, and the analysis of the observations; they are shown in Table 3. These practices can be applied when acceptance test-driven development is used.
PRACTICE: Acceptance test cases are also created on a detailed level.
EXPLANATION: If the acceptance test cases are created on too high a level, there is no need to clarify the details, and they remain unclear. However, creating too many detailed test cases at the beginning of the sprint may be confusing.

PRACTICE: Use case/workflow test cases are discussed with the whole team at the beginning of the sprint.
EXPLANATION: It is important that all team members understand the big picture, and high level test cases can be used to clarify it.

PRACTICE: Detailed level test cases are discussed in small groups.
EXPLANATION: It is obviously not productive to plan all the details with the whole team. Therefore, the detailed test cases are created in small groups where the different roles are represented.

PRACTICE: Test cases are written in the formal format after the planning meetings.
EXPLANATION: During the planning meetings, the test cases can be quickly noted down. The purpose of the meetings is to find the needed details and create a common understanding about those details. The test cases can be written in a proper format after the meeting.

PRACTICE: Test cases are checked by the team.
EXPLANATION: Because the test cases are created based on the notes, it is good to check the test cases with the people who planned them. This helps to find ambiguities and to verify that all the people have understood the details in the same way.

PRACTICE: The test-first approach is not mandatory.
EXPLANATION: There can be situations where it is not profitable to implement the test cases in the test-first manner. However, the test cases should be planned and implemented on some level before implementing the feature. Even the test case planning can help to understand the wanted features.

PRACTICE: Initial test cases are added to the test execution environment.
EXPLANATION: When the test cases are executed often and there are detailed level test cases, the development progress can be followed during the sprints. With the high level test cases, the development progress can be followed on the project level.

PRACTICE: Different kinds of acceptance test cases are created.
EXPLANATION: The acceptance test cases should cover the functional and non-functional requirements. Therefore, there is a need to create different types of test cases. Functional test cases can even be on different testing levels.

Table 3: Good practices
12 DISCUSSION AND CONCLUSIONS<br />
This research was conducted through a comprehensive literature review, action research based observations of the use of acceptance test-driven development with the keyword-driven test automation framework in one software development project, and interviews with members of the project in question. The results of the research were analyzed by reflecting them against the relevant literature and earlier studies. Conclusions based on the analysis are covered in this chapter.
12.1 Researcher’s Experience<br />
The researcher's background and experience in the field of software testing are described briefly so that the reader can make some assumptions about the researcher's competence. The researcher had four years of experience in software testing and test automation when the research was started. The researcher was a part of the team that had developed the keyword-driven test automation framework used in the Project, called Robot. The Robot development had lasted over a year when the research started. The researcher had gained a lot of experience with Robot by using it for testing Robot itself.
12.2 Main Conclusions<br />
It can be said that ATDD can provide many benefits, and it is a radical change compared to traditional acceptance testing. ATDD together with agile testing brings testing to the core of the development, as opposed to the traditional way where the main part of the testing insufficiently takes place at the end of the software development. This is a positive feature, which also improves the meaningfulness of the work, as all team members can take part in planning quality software.
According to the results of the study, ATDD also helps to develop more efficiently software that corresponds better to the requirements. This is mainly due to the improved common understanding within the team about the details of the software's features. So it seems that the use of ATDD is truly profitable.
It can be seen that the tool used to automate the test cases in ATDD does not play a crucial role, as the biggest benefits noticed based on the interviews were gained from the process. However, the level on which the acceptance test cases are created has an influence on the gained benefits, and if the test cases are done on too high a level, the noticed benefits disappear.
Of course, ATDD is not a silver bullet, and challenges exist. As acceptance testing should cover both non-functional and functional testing, excluding only unit testing, there is a wide area to test. Finding the right level of tests is unquestionably hard. However, the cooperation between the team and the customer can ease that journey.
It was acknowledged that the benefits were gained even though the acceptance test cases were not created before the development, as pure ATDD requires. This leads to the question whether ATDD should be defined so that there is no strict requirement for the test cases to be created in a test-first manner. The discussion about the test cases drives the development in any case, as the goal of the team is to get the acceptance test cases passing.
Based on this work, ATDD can provide a clear process for arranging testing inside the iterations of iterative development and, in consequence, establish a prerequisite for successful testing. This can be seen as very beneficial, because clear guidance on the process of agile testing, especially in Scrum, is missing. The importance of this process is emphasized in environments where a transition from traditional software development to agile software development is taking place.
12.3 Validity<br />
There is no single clear definition of what validity means in qualitative research (Golafshani 2003, Trochim 2006, Flick 2006). However, Flick (2006) summarizes that validity answers the question whether the researchers see what they think they see. Flick (2006) also suggests using triangulation as a method for evaluating qualitative research. Based on that suggestion, the validity of this research is evaluated using data and investigator triangulation. Theory and methodological triangulation are not used because of the practical nature and predefined scope of the research. Other matters affecting the results of this thesis are also considered.
The validity of the data was ensured by collecting data with the different data collection methods listed in Chapter 9.3. Data was also collected throughout the whole research, increasing its validity. To prevent an unbalanced view, the researcher interviewed and observed people in different roles. Investigator triangulation means using more than one researcher to detect and prevent biases resulting from the researcher as a person. It was not possible to use any other interviewer or observer in this research. From this point of view, the validity of the research is questionable.
The researcher's high involvement in the Project, and especially the help the Project gained from the researcher during the study, affects the validity of this research. Kock (2003) mentions that in action research the researcher's actions may strongly bias the results. The researcher became aware of this possibility at the beginning of the research, and it was kept in mind during the Project and especially during the analysis phase.
In addition, the background, know-how, and opinions of the researcher are possible sources of error. This is mainly due to the fact that this was qualitative research and, for example, interviews were used as a research method. Therefore, the content and form of the interview questions can reflect the researcher's own background, knowledge, and views. As a part of the project team, the researcher cannot be completely objective. However, it can be discussed whether this subjectivity has a negative impact on the research or not.
Interpreting results is not a completely objective activity. Therefore, another researcher with a different background might have interpreted the results in a slightly different way. It must therefore be kept in mind that, for example, the conclusions are always a somewhat subjective view of reality. However, it can be argued that the results gained from the research would have been similar even if the research had been carried out by another researcher.
The fact that there were other changes in the Project, such as the move towards agile testing, may also have caused problems in understanding what actually caused the perceived benefits. However, as was noticed in the analysis of the research results, some of the changes and benefits originate directly from the use of ATDD. To be sure about the benefits, the subject should be studied for a longer period of time than was done in this research. However, the main conclusions could be drawn based on the period of time used in the research. The results of earlier studies and the relevant literature confirm the research results, as they were mainly in line with each other.
It must be kept in mind that the results presented in this thesis are based on only one software development project, and more specifically on one team's work. Every project has its own context specific features. These facts, and of course the structure of the team, have an influence on how ATDD is used and how it is adapted as a part of the development process. Therefore, the results can vary to some extent according to the project in question, but it should be possible to gain the main noticed benefits also in other projects.
The test automation framework Robot used in this research is not open source, which makes it harder to introduce the test automation concept used in this research to other projects. However, there is a possibility that the keyword-driven test automation framework used in the study will be open sourced.
12.4 Evaluation of the Thesis<br />
The first goal of this thesis was to investigate whether the keyword-driven test automation framework could be used with acceptance test-driven development. It can be said that this goal was achieved. The suitability of keyword-driven test automation was analyzed extensively, and based on the analysis the outcome was that it is possible to use the keyword-driven test automation framework with ATDD. It was also noticed that some limitations exist, which may in turn prevent the finalization of the test cases prior to feature implementation.
One aim was to describe the use of the keyword-driven test automation framework with ATDD in a way that enables other projects to experiment with the approach using similar tools. How well this goal is met remains to be seen when the results of this thesis are possibly used in other real-world software development projects. However, the aim was to describe both the fictive example (Chapter 7) and the case study (Chapter 10) in such a way that they would be widely understood.
The last goal was to study the pros and cons of acceptance test-driven development when it is used with the keyword-driven test automation framework. Even though the research lasted only four months, plenty of results were collected. Based on these results it was possible to see clear benefits, some challenges, and a few drawbacks. In this sense, the study was successful.
12.5 Further Research Areas<br />
Because this thesis is one of the first studies focusing on acceptance test-driven development with the keyword-driven test automation framework, there is a need for a more extensive study of this kind of approach in other projects, including projects that use different kinds of iterative processes. A longer research period would also be beneficial, as the changes due to the use of ATDD are wide-ranging, and adapting and adjusting the process takes time. Full-scale use of ATDD would make it possible to better study the effects of the test automation framework and the suitability of the running tested features metric with ATDD.
As was noticed, the level of acceptance tests affects the benefits of ATDD. It was also noticed that there is a need for acceptance test cases on different levels and that it is difficult to create test cases on the right level. At least the following areas need more study to understand which kinds of acceptance tests would be beneficial to create:
• How do the different levels of test cases affect the different aspects of ATDD?
• How do the different levels of acceptance tests affect measuring the project, and how do they affect the use of the running tested features metric?
• How could the lower level acceptance tests created with the keyword-driven test automation framework be defined in a format that can be easily understood?
• What is the relationship between unit testing and lower level acceptance testing?
Further research is also needed to clarify which of the benefits mentioned in this research are actually direct results of ATDD. Therefore, the relationships between the benefits and the source of each benefit should be studied.
One issue that was not studied in this research was the ability to substitute acceptance test cases for the requirement specifications. As one interviewee mentioned, there is no need to replace the requirements with the acceptance test cases. However, some of the details in the requirement specifications could be defined with test cases to avoid maintaining duplicate data. This could lead to linking the high level requirements to the acceptance test cases. This would be an interesting area for further study.
Altogether, it can be said that this thesis is a good opening for discussion in this field of software testing.<br />
BIBLIOGRAPHY<br />
Abrahamsson, Pekka, Outi Salo, Jussi Ronkainen & Juhani Warsta (2002). Agile Software Development Methods: Review and Analysis. VTT Publications 478, VTT, Finland.

Agile Advice (2005). Information Radiators, May 10, 2005. May 14th, 2007

Andersson, Johan, Geoff Bache & Peter Sutton (2003). XP with Acceptance Test-Driven Development: A Rewrite Project for a Resource Optimization System. Lecture Notes in Computer Science, Volume 2675/2003, Extreme Programming and Agile Processes in Software Engineering, 180-188, Springer Berlin/Heidelberg.

Astels, David (2003). Test-Driven Development: A Practical Guide. 562, Prentice Hall PTR, United States of America.

Avison, David, Francis Lau, Michael Myers & Peter Axel Nielsen (1999). Action Research: To make academic research relevant, researchers should try out their theories with practitioners in real situations and real organizations. Communications of the ACM, January 1999/Vol. 42, No. 1, 94-97.

Babüroglu, Oguz N. & Ib Ravn (1992). Normative Action Research. Organization Studies Vol. 13, No. 1, 1992, 19-34.

Bach, James (2003a). Agile test automation. March 31st, 2007

Bach, James (2003b). Exploratory Testing Explained v.1.3 4/16/03. March 31st, 2007

Beck, Kent (2000). Extreme Programming Explained: Embrace Change. Third Print, 190, Addison-Wesley, Reading (MA).
Beck, Kent, Mike Beedle, Arie van Bennekum, Alistair Cockburn, Ward Cunningham, Martin Fowler, James Grenning, Jim Highsmith, Andrew Hunt, Ron Jeffries, Jon Kern, Brian Marick, Robert C. Martin, Steve Mellor, Ken Schwaber, Jeff Sutherland & Dave Thomas (2001a). Manifesto for Agile Software Development. December 5th, 2006

Beck, Kent, Mike Beedle, Arie van Bennekum, Alistair Cockburn, Ward Cunningham, Martin Fowler, James Grenning, Jim Highsmith, Andrew Hunt, Ron Jeffries, Jon Kern, Brian Marick, Robert C. Martin, Steve Mellor, Ken Schwaber, Jeff Sutherland & Dave Thomas (2001b). Principles behind the Agile Manifesto. March 31st, 2007

Beck, Kent (2003). Test-Driven Development By Example. 240, Addison-Wesley.

Beizer, Boris (1990). Software testing techniques. Second Edition, 550, Van Nostrand Reinhold, New York.

Burnstein, Ilene (2003). Practical Software Testing: a process-oriented approach. 709, Springer, New York.

Buwalda, Hans, Dennis Janssen & Iris Pinkster (2002). Integrated Test Design and Automation: Using the TestFrame Method. 242, Addison Wesley, Bibbles Ltd, Guildford and King's Lynn, Great Britain.

Cohn, Mike (2004). User Stories Applied: For Agile Software Development. 268, Addison-Wesley.

Cohn, Mike (2007). User Stories, Agile Planning and Estimating. Internal Seminar, March 24th, 2007.

Control Chaos (2006a). What is Scrum? September 26th, 2006

Control Chaos (2006b). XP@Scrum. September 26th, 2006

Craig, Rick D. & Stefan P. Jaskiel (2002). Systematic Software Testing. 536, Artech House Publishers, Boston.

Crispin, Lisa, Tip House & Carol Wade (2002). The Need for Speed: Automating Acceptance Testing in an eXtreme Programming Environment. Upgrade, The European Online Magazine for the IT Professional Vol III, No. 2, April 2002, 11-17.
Crispin, Lisa & Tip House (2005). <strong>Test</strong>ing Extreme Programming. Second Print, 306, Addison-<br />
Wesley.<br />
Crispin, Lisa (2005). Using Customer Tests to Drive Development. METHODS & TOOLS, Global
knowledge source for software development professionals, Summer 2005, Volume 13, number 2, 12-17.
Cruise Control (2006). Cruise Control, Continuous Integration Toolkit.
September 23rd, 2006
Dustin, Elfriede, Jeff Rashka & John Paul (1999). Automated Software Testing: introduction,
management, and performance. 575, Addison-Wesley.
Fenton, Norman E. (1996). Software metrics: a rigorous and practical approach. Second Edition, 638,
International Thomson Computer Press, London.<br />
Fewster, Mark & Dorothy Graham (1999). Software Test Automation, Effective use of test execution
tools. 574, Addison-Wesley.
Flick, Uwe (2006). An Introduction to Qualitative Research. Third Edition, 443, SAGE, London.<br />
Golafshani, Nahid (2003). Understanding Reliability and Validity in Qualitative Research. The
Qualitative Report Vol. 8, Number 4, December 2003, 597-607.
Hendrickson, Elisabeth (2006). Agile QA/Testing. April 10th, 2007
IEEE Std 829-1983. IEEE Standard for Software Test Documentation. Institute of Electrical and
Electronics Engineers, Inc., 1983.<br />
IEEE Std 1008-1987. IEEE Standard for Software Unit Testing. Institute of Electrical and Electronics
Engineers, Inc., 1987.<br />
IEEE Std 610.12-1990. IEEE Standard Glossary of Software Engineering Terminology. Institute of
Electrical and Electronics Engineers, Inc., 1990.<br />
ISO Std 9000-2005. Quality management systems - Fundamentals and vocabulary. ISO Properties,
Inc., 2005.
ISO/IEC Std 9126-1:2001. Software engineering -- Product quality -- Part 1: Quality model. ISO
Properties, Inc., 2001.
ISTQB (2006). Standard glossary of terms used in Software Testing Version 1.2 (dd. June 4th, 2006).
April 9th, 2007
Itkonen, Juha, Kristian Rautiainen & Casper Lassenius (2005). Toward an Understanding of Quality
Assurance in Agile Software Development. International Journal of Agile Manufacturing, Vol. 8, No.
2, 39-49.<br />
Jeffries, Ronald E. (1999). Extreme Testing, Why aggressive software development calls for radical
testing efforts. Software Testing & Quality Engineering, March/April 1999, 23-26.
Jeffries, Ron, Ann Andersson & Chet Hendrickson (2001). Extreme Programming Installed. 265,<br />
Addison-Wesley, Boston.<br />
Jeffries, Ron (2004). A Metric Leading to Agility, 06/14/2004.
November 18th, 2006
Jeffries, Ron (2006). Automating “All” Tests, 05/25/2006.
April 14th, 2007
Kaner, Cem, Jack Falk & Quoc Nguyen (1999). Testing Computer Software. Second Edition, 480,
Wiley, New York.<br />
Kaner, Cem, James Bach, Bret Pettichord, Brian Marick, Alan Myrvold, Ross Collard, Johanna<br />
Rothman, Christopher Denardis, Marge Farrell, Noel Nyman, Karen Johnson, Jane Stepak, Erick<br />
Griffin, Patricia A. McQuaid, Stale Amland, Sam Guckenheimer, Paul Szymkowiak, Andy Tinkham,<br />
Pat McGee & Alan A. Jorgensen (2001a). The Seven Basic Principles of the Context-Driven School.
December 19th, 2006<br />
Kaner, Cem, James Bach & Bret Pettichord (2001b). Lessons Learned in Software Testing: A
Context-Driven Approach. 286, John Wiley & Sons, Inc., New York.
Kaner, Cem (2003). The Role of Testers in XP.
November 18th, 2006<br />
Kit, Edward (1999). Integrated, effective test design and automation. Software Development, February
1999, 27–41.<br />
Kock, Ned (2003). Action Research: Lessons Learned From a Multi-Iteration Study of Computer-<br />
Mediated Communication in Groups. IEEE Transactions on Professional Communication, Vol. 46, No.<br />
2, June 2003, 105-128.<br />
Larman, Craig (2004). Agile & Iterative Development: A Manager’s Guide. 342, Addison-Wesley.
Larman, Craig (2006). Introduction to Agile & Iterative Development. Internal Seminar, December
14th, 2006.<br />
Laukkanen, Pekka (2006). Data-Driven and Keyword-Driven Test Automation Frameworks. 98,
Master’s Thesis, Software Business and Engineering Institute, Department of Computer Science and<br />
Engineering, Helsinki University of Technology.<br />
Mar, Kane & Ken Schwaber (2002). Scrum with XP.
October 4th, 2006<br />
Marick, Brian (2001). Agile Methods and Agile Testing. November 15th, 2006
Marick, Brian (2004). Agile Testing Directions. November 15th, 2006
Meszaros, Gerard (2003). Agile regression testing using record & playback. Conference on Object<br />
Oriented Programming Systems Languages and Applications, Companion of the 18th annual ACM<br />
SIGPLAN conference on Object-oriented programming, systems, languages, and applications, 353–<br />
360, ACM Press, New York.
Miller, Roy W. & Christopher T. Collins (2001). Acceptance testing. XP Universe, 2001.
April 10th, 2007
Mosley, Daniel J. & Bruce A. Posey (2002). Just Enough Software Test Automation. 260, Prentice Hall
PTR, Upper Saddle River, New Jersey, USA.<br />
Mugridge, Rick & Ward Cunningham (2005). Fit for Developing Software: Framework for Integrated
Tests. 355, Prentice Hall PTR, Westford, Massachusetts.
Nagle, Carl J. (2007). Test Automation Frameworks.
April 14th, 2007<br />
Patton, Ron (2000). Software Testing. 389, SAMS, United States of America.
Pol, Martin (2002). Software testing: a guide to the TMap approach. 564, Addison-Wesley, Harlow.<br />
Reppert, Tracy (2004). Don’t Just Break Software, Make Software: How storytest-driven development
is changing the way QA, customers, and developers work. Better Software, July/August, 2004, 18-23.
Sauvé, Jacques Philippe, Osório Lopes Abath Neto & Walfredo Cirne (2006). EasyAccept: a tool to<br />
easily create, run and drive development with automated acceptance tests. International Conference on
Software Engineering, Proceedings of the 2006 international workshop on Automation of software
test, 111-117, ACM Press, New York.
Schwaber, Ken & Mike Beedle (2002). Agile software development with Scrum. 158, Prentice-Hall,
Upper Saddle River (NJ).<br />
Schwaber, Ken (2004). Agile Project Management with Scrum. 163, Microsoft Press, Redmond,
Washington.<br />
Stringer, Ernest T. (1996). Action Research: A Handbook for Practitioners. 169, SAGE, United States<br />
of America.<br />
Trochim, William M.K (2006). Qualitative Validity.<br />
October 4th, 2006<br />
Watt, Richard J. & David Leigh-Fellows (2004). Acceptance Test-Driven Planning. Lecture Notes in
Computer Science, Volume 3134/2004, Extreme Programming and Agile Methods - XP/Agile Universe
2004, 43-49, Springer, Berlin/Heidelberg.
Wideman, Max R. (2002). Wideman Comparative Glossary of Project Management Terms, March
2002. May 14th, 2007
Zallar, Kerry (2001). Are you ready for the test automation game? Software Testing & Quality
Engineering, November/December 2001, 22–26.
APPENDIX A<br />
PRINCIPLES BEHIND THE AGILE<br />
MANIFESTO<br />
We follow these principles:<br />
Our highest priority is to satisfy the customer<br />
through early and continuous delivery<br />
of valuable software.<br />
Welcome changing requirements, even late in<br />
development. Agile processes harness change for<br />
the customer's competitive advantage.<br />
Deliver working software frequently, from a<br />
couple of weeks to a couple of months, with a
preference to the shorter timescale.<br />
Business people and developers must work<br />
together daily throughout the project.<br />
Build projects around motivated individuals.<br />
Give them the environment and support they need,<br />
and trust them to get the job done.<br />
The most efficient and effective method of<br />
conveying information to and within a development
team is face-to-face conversation.<br />
Working software is the primary measure of progress.<br />
Agile processes promote sustainable development.<br />
The sponsors, developers, and users should be able<br />
to maintain a constant pace indefinitely.<br />
Continuous attention to technical excellence<br />
and good design enhances agility.<br />
Simplicity--the art of maximizing the amount<br />
of work not done--is essential.<br />
The best architectures, requirements, and designs<br />
emerge from self-organizing teams.<br />
At regular intervals, the team reflects on how<br />
to become more effective, then tunes and adjusts<br />
its behavior accordingly. (Beck et al. 2001b)<br />
APPENDIX B<br />
INTERVIEW QUESTIONS<br />
Interview questions asked in the final interviews.<br />
1. How has ATDD affected the software development? Why?
2. What have been the benefits in ATDD? Why?
3. What have been the drawbacks in ATDD? Why?
4. What have been the challenges in ATDD? Why?
5. Has ATDD affected the risk of building incorrect software? How? Why?
6. Has ATDD affected the visibility of the development status? How? Why?
7. Has ATDD established a quality agreement between the development and feature owners?
How? Why?
8. Has ATDD changed your confidence in the software? How? Why?
9. Has ATDD affected when problems are found? How? Why?
10. Has ATDD affected the way requirements are kept up to date? How? Why?
11. Has ATDD affected the way requirements and tests are kept in sync? How? Why?
12. Are the acceptance tests in a format that is easy to understand? Why or why not?
13. Is it easy to write the acceptance tests at the right level? Why or why not?
14. Has ATDD affected the developers’ goal? How? Why?
15. Has ATDD affected the design of the developed system? How? Why?
16. Has ATDD affected the verification of refactoring correctness? How? Why?
17. Has ATDD affected the quality of the test cases? How? Why?
18. Has ATDD had influence on the way people see test engineers? How? Why?
19. Has ATDD had influence on the test engineer's role? How? Why?
20. Has ATDD affected how hard or easy the tests are to automate? How? Why?
21. What could be improved in the current way of doing ATDD? Which changes could give the
biggest benefits?
22. Sum up the biggest benefit and the biggest drawback based on the issues asked in this interview<br />
and state the reasons.<br />