
HELSINKI UNIVERSITY OF TECHNOLOGY
Department of Computer Science and Engineering
Software Business and Engineering Institute

Juha Rantanen

Acceptance Test-Driven Development with Keyword-Driven
Test Automation Framework in an Agile Software Project

Master's Thesis
Espoo, May 18, 2007

Supervisor: Professor Tomi Männistö
Instructor: Harri Töhönen, M.Sc.


HELSINKI UNIVERSITY OF TECHNOLOGY
Department of Computer Science and Engineering
ABSTRACT OF MASTER'S THESIS

Author: Juha Rantanen
Title of thesis: Acceptance Test-Driven Development with Keyword-Driven Test Automation Framework in an Agile Software Project
Professorship: Computer Science
Professorship Code: T-76
Supervisor: Professor Tomi Männistö
Instructor: Harri Töhönen, M.Sc.
Date: May 18, 2007
Pages: 102

Agile software development uses iterative development, allowing periodic changes and updates to the software requirements. In agile software development methods, customer-defined tests have an important role in assuring that the software fulfills the customer's needs. These tests can be defined before implementation to establish a clear goal for the development team. This is called acceptance test-driven development (ATDD).

With ATDD the acceptance tests are usually automated. Keyword-driven testing is the latest evolution in test automation approaches. In keyword-driven testing, instructions, inputs, and expected outputs are defined in separate test data. A test automation framework tests the software accordingly and reports the results.

In this thesis, the use of acceptance test-driven development with a keyword-driven test automation framework is studied in a real-world agile software development project. The study was conducted using action research during a four-month period. The main methods used were observations and interviews.

It was noticed that the keyword-driven test automation framework can be used in acceptance test-driven development. However, there were some limitations preventing the implementation of all the test cases before the software implementation started. It was also noticed that the test automation framework used to implement the acceptance test cases does not play a crucial role in acceptance test-driven development. The biggest benefits were gained from the detailed planning done before the software implementation at the beginning of the iterations.

Based on the results, acceptance test-driven development improves communication and cooperation, and gives a common understanding of the details of the software's features. These improvements help the development team to implement the wanted features. Therefore, the risk of building incomplete software decreases. The improvements also help to implement the features more efficiently, as the features are more likely to be implemented correctly the first time. Remarkable changes to the test engineers' role were also noticed, as the test engineers become more involved in the detailed planning. It seems that the biggest challenge in acceptance test-driven development is creating tests at the right test levels and in the right scope.

Keywords: acceptance test-driven development, keyword-driven testing, agile testing, test automation



HELSINKI UNIVERSITY OF TECHNOLOGY (TEKNILLINEN KORKEAKOULU)
Department of Computer Science and Engineering (Tietotekniikan osasto)
ABSTRACT OF MASTER'S THESIS (DIPLOMITYÖN TIIVISTELMÄ)

Author: Juha Rantanen
Title of thesis: Hyväksymistestauslähtöinen kehitys avainsanaohjatulla testiautomaatiokehyksellä ketterässä ohjelmistoprojektissa (Acceptance test-driven development with a keyword-driven test automation framework in an agile software project)
Professorship: Software Business and Engineering (Ohjelmistoliiketoiminta ja tuotanto)
Professorship Code: T-76
Supervisor: Professor Tomi Männistö
Instructor: Harri Töhönen, M.Sc.
Date: May 18, 2007
Pages: 102

Agile software development is based on an iterative approach. Iteration makes it possible to change and update the software requirements periodically. In agile software development processes, customer-defined tests play an important role in ensuring that the software under development fulfills the customer's needs. These tests can be defined before implementation starts in order to establish a clear goal for the development team. This is called acceptance test-driven development.

In acceptance test-driven development the acceptance tests are often automated. One of the newest test automation approaches is keyword-driven testing. In keyword-driven testing, instructions, inputs, and expected outputs are defined in separate test data. A test automation framework tests the software according to this data and reports the results.

This thesis examines the use of a keyword-driven test automation framework in acceptance test-driven development. The subject of the study was an ongoing agile software development project. The approach used was action research, with observation and interviews as the main methods. The research period lasted four months.

The study found that a keyword-driven test automation framework can be used in acceptance test-driven development. Some limitations, however, prevented the tests from being created before the software implementation started. It was also found that the test automation framework used to create the test cases does not play a decisive role in acceptance test-driven development. The biggest benefits were gained from the detailed planning carried out before software implementation at the beginning of each iteration.

Based on the results, acceptance test-driven development improves communication and cooperation between the parties involved and their shared understanding of the details of the software's features. This helps in implementing the wanted features, and so the risk of producing software that does not work or works incorrectly decreases. It also supports more efficient software development, as the right features are more likely to be produced on the first implementation attempt. Significant changes were also observed in the testers' role, owing to their increased participation in the detailed planning. It seems that the biggest challenges in acceptance test-driven development relate to creating tests at the right test levels and in the right scope.

Keywords: acceptance test-driven development, keyword-driven testing, agile testing, test automation



ACKNOWLEDGEMENTS

This master's thesis was written for Qentinel, a Finnish software testing consultancy, during the years 2006 and 2007. I would like to thank all the Qentinelians who have made this possible.

Big thanks belong to my instructor Harri Töhönen for his interest, valuable feedback, and the time he spent listening to and commenting on my ideas.

I would like to express my gratitude to my supervisor Tomi Männistö, who gave advice and comments when they were needed.

I would like to thank Petri Haapio and Pekka Laukkanen, with whom I have been working and who have given valuable ideas, comments, and feedback. The discussions with these two professionals have improved my know-how about agile software development and test automation. That know-how has been priceless during this work.

I also wish to thank all the members of the project where the research was carried out. It has been very rewarding to work with them.

My good friend Pauli Aho also deserves to be thanked. I am deeply indebted to him for using his time to check the language of this thesis.

Finally, special thanks go to my lovely wife Aino for the help and support I received during this project. I am grateful to her for being so patient.



TABLE OF CONTENTS

TERMS

1 INTRODUCTION
1.1 Motivation
1.2 Aim of the Thesis
1.3 Structure of the Thesis

2 TRADITIONAL TESTING
2.1 Purpose of Testing
2.2 Dynamic and Static Testing
2.3 Functional and Non-Functional Testing
2.4 White-Box and Black-Box Testing
2.5 Test Levels

3 AGILE AND ITERATIVE SOFTWARE DEVELOPMENT
3.1 Iterative Development Model
3.2 Agile Development
3.3 Scrum
3.4 Extreme Programming
3.5 Scrum and Extreme Programming Together
3.6 Measuring Progress in Agile Projects

4 TESTING IN AGILE SOFTWARE DEVELOPMENT
4.1 Purpose of Testing
4.2 Test Levels
4.3 Acceptance Test-Driven Development

5 TEST AUTOMATION APPROACHES
5.1 Test Automation
5.2 Evolution of Test Automation Frameworks
5.3 Keyword-Driven Testing

6 KEYWORD-DRIVEN TEST AUTOMATION FRAMEWORK
6.1 Keyword-Driven Test Automation Framework
6.2 Test Data
6.3 Test Execution
6.4 Test Reporting

7 EXAMPLE OF ACCEPTANCE TEST-DRIVEN DEVELOPMENT WITH KEYWORD-DRIVEN TEST AUTOMATION FRAMEWORK
7.1 Test Data between User Stories and System under Test
7.2 User Stories
7.3 Defining Acceptance Tests
7.4 Implementing Acceptance Tests and Application

8 ELABORATED GOALS OF THE THESIS
8.1 Scope
8.2 Research Questions

9 RESEARCH SUBJECT AND METHOD
9.1 Case Project
9.2 Research Method
9.3 Data Collection



10 ACCEPTANCE TEST-DRIVEN DEVELOPMENT WITH KEYWORD-DRIVEN TEST AUTOMATION FRAMEWORK IN THE PROJECT UNDER STUDY
10.1 Development Model and Development Practices Used in the Project
10.2 January Sprint
10.3 February Sprint
10.4 March Sprint
10.5 April Sprint
10.6 Interviews

11 ANALYSES OF OBSERVATIONS
11.1 Suitability of the Keyword-Driven Test Automation Framework with Acceptance Test-Driven Development
11.2 Use of the Keyword-Driven Test Automation Framework with Acceptance Test-Driven Development
11.3 Benefits, Challenges and Drawbacks of Acceptance Test-Driven Development with Keyword-Driven Test Automation Framework
11.4 Good Practices

12 DISCUSSION AND CONCLUSIONS
12.1 Researcher's Experience
12.2 Main Conclusions
12.3 Validity
12.4 Evaluation of the Thesis
12.5 Further Research Areas

BIBLIOGRAPHY

APPENDIX A PRINCIPLES BEHIND THE AGILE MANIFESTO
APPENDIX B INTERVIEW QUESTIONS



TERMS

Acceptance Criteria: The exit criteria that a component or system must satisfy in order to be accepted by a user, customer, or other authorized entity. (IEEE Std 610.12-1990)
Acceptance Testing: Formal testing with respect to user needs, requirements, and business processes conducted to determine whether or not a system satisfies the acceptance criteria and to enable the user, customers or other authorized entity to determine whether or not to accept the system. (IEEE Std 610.12-1990) See also component testing, integration testing and acceptance testing.
Acceptance Test-Driven Development (ATDD): A way of developing software where the acceptance test cases are developed, and often automated, before the software is developed to run those test cases. See also test-driven development.
Actual Result: The behavior produced/observed when a component or system is tested. (ISTQB 2006)
Agile Testing: Testing practice for a project using agile methodologies, such as extreme programming (XP), treating development as the customer of testing and emphasizing the test-first design paradigm. (ISTQB 2006) See also test-driven development and acceptance test-driven development.
Base Keyword: Keyword implemented in a test library of a keyword-driven test automation framework. (Laukkanen 2006) See also sentence format keyword and user keyword.
Behavior: The response of a component or system to a set of input values and preconditions. (ISTQB 2006)
Bespoke Software: Software developed specifically for a set of users or customers. The opposite is off-the-shelf software. (ISTQB 2006)
Beta Testing: Operational testing by potential and/or existing users/customers at an external site not otherwise involved with the developers, to determine whether or not a component or system satisfies the user/customer needs and fits within the business processes. Beta testing is often employed as a form of external acceptance testing for off-the-shelf software in order to acquire feedback from the market. (ISTQB 2006)
Black-Box Testing: Testing, either functional or non-functional, without reference to the internal structure of the component or system. (ISTQB 2006) See also white-box testing.
Bug: See defect.
Capture/Playback Tool: A type of test execution tool where inputs are recorded during manual testing in order to generate automated test scripts that can be executed later (i.e. replayed). These tools are often used to support automated regression testing. (ISTQB 2006)
Component: A minimal software item that can be tested in isolation. (ISTQB 2006)



Component Testing: The testing of individual software components. (IEEE Std 610.12-1990)
Context-Driven Testing: A testing methodology that underlines the importance of the context where different testing practices are used over the practices themselves. The main message is that there are good practices in a context but there are no general best practices. (Kaner et al. 2001a)
Daily Build: A development activity where a complete system is compiled and linked every day (usually overnight), so that a consistent system is available at any time including all latest changes. (ISTQB 2006)
Data-Driven Testing: A scripting technique that stores test input and expected results in a table or spreadsheet, so that a single control script can execute all of the tests in the table. Data-driven testing is often used to support the application of test execution tools such as capture/playback tools. (Fewster & Graham 1999) See also keyword-driven testing.
Defect: A flaw in a component or system that can cause the component or system to fail to perform its required function, e.g. an incorrect statement or data definition. A defect, if encountered during execution, may cause a failure of the component or system. (ISTQB 2006)
Defined Process: In a defined process every piece of work is well understood. With well defined input, the defined process can be started and allowed to run until completion, ending with the same results every time. (Schwaber & Beedle 2002) See also empirical process.
Dynamic Testing: Testing that involves the execution of the software of a component or system. (ISTQB 2006) See also static testing.
Empirical Process: In an empirical process the unexpected is expected. An empirical process provides and exercises control through frequent inspection and adaptation in imperfectly defined environments where unpredictable and unrepeatable outputs are generated. (Schwaber & Beedle 2002) See also defined process.
Expected Outcome: See expected result.
Expected Result: The behavior predicted by the specification, or another source, of the component or system under specified conditions. (ISTQB 2006)
Exploratory Testing: An informal test design technique where the tester actively controls the design of the tests as those tests are performed and uses information gained while testing to design new and better tests. (Bach 2003b)
Fail: A test is deemed to fail if its actual result does not match its expected result. (ISTQB 2006)
Failure: Deviation of the component or system from its expected delivery, service or result. (Fenton 1996)
Fault: See defect.



Feature: An attribute of a component or system specified or implied by requirements documentation (for example reliability, usability or design constraints). (IEEE Std 1008-1987)
Feature Creep: On-going requirements increase without corresponding adjustment of approved cost and schedule allowances. As some projects progress, especially through the definition and development phases, requirements tend to change incrementally, causing the project manager to add to the project's mission or objectives without getting a corresponding increase in the time and budget allowances. (Wideman 2002)
Functional Testing: Testing based on an analysis of the specification of the functionality of a component or system. (ISTQB 2006) See also black-box testing.
Functionality: The capability of the software product to provide functions which meet stated and implied needs when the software is used under specified conditions. (ISO/IEC Std 9126-1:2001)
High Level Test Case: A test case without concrete (implementation level) values for input data and expected results. Logical operators are used; instances of the actual values are not yet defined and/or available. (ISTQB 2006) See also low level test case.
Input: A variable (whether stored within a component or outside) that is read by a component. (ISTQB 2006)
Input Value: An instance of an input. (ISTQB 2006) See also input.
Information Radiator: A large display of critical team information that is continuously updated and located in a spot where the team can see it constantly. (Agile Advice 2005)
Integration Testing: Testing performed to expose defects in the interfaces and in the interactions between integrated components or systems. (ISTQB 2006) See also component testing, system testing and acceptance testing.
Iterative Development Model: A development life cycle where a project is broken into a usually large number of iterations. An iteration is a complete development loop resulting in a release (internal or external) of an executable product, a subset of the final product under development, which grows from iteration to iteration to become the final product. (ISTQB 2006)
Keyword: A directive representing a single action in keyword-driven testing. (Laukkanen 2006)
Keyword-Driven Test Automation Framework: A test automation framework using the keyword-driven testing technique.
Keyword-Driven Testing: A scripting technique that uses data files to contain not only test data and expected results, but also keywords related to the application being tested. The keywords are interpreted by special supporting scripts that are called by the control script for the test. (ISTQB 2006) See also data-driven testing.



Low Level Test Case: A test case with concrete (implementation level) values for input data and expected results. Logical operators from high level test cases are replaced by actual values that correspond to the objectives of the logical operators. (ISTQB 2006) See also high level test case.
Negative Testing: Tests aimed at showing that a component or system does not work. Negative testing is related to the testers' attitude rather than a specific test approach or test design technique, e.g. testing with invalid input values or exceptions. (Beizer 1990)
Non-Functional Testing: Testing the attributes of a component or system that do not relate to functionality, e.g. reliability, efficiency, usability, maintainability and portability. (ISTQB 2006)
Off-the-Shelf Software: A software product that is developed for the general market, i.e. for a large number of customers, and that is delivered to many customers in identical format. (ISTQB 2006)
Output: A variable (whether stored within a component or outside) that is written by a component. (ISTQB 2006)
Output Value: An instance of an output. (ISTQB 2006) See also output.
Pass: A test is deemed to pass if its actual result matches its expected result. (ISTQB 2006)
Postcondition: Environmental and state conditions that must be fulfilled after the execution of a test or test procedure. (ISTQB 2006)
Precondition: Environmental and state conditions that must be fulfilled before the component or system can be executed with a particular test or test procedure. (ISTQB 2006)
Problem: See defect.
Quality: The degree to which a component, system or process meets specified requirements and/or user/customer needs and expectations. (IEEE Std 610.12-1990)
Quality Assurance: Part of quality management focused on providing confidence that quality requirements will be fulfilled. (ISO Std 9000-2005)
Regression Testing: Testing of a previously tested program following modification to ensure that defects have not been introduced or uncovered in unchanged areas of the software, as a result of the changes made. It is performed when the software or its environment is changed. (ISTQB 2006)
Requirement: A condition or capability needed by a user to solve a problem or achieve an objective that must be met or possessed by a system or system component to satisfy a contract, standard, specification, or other formally imposed document. (IEEE Std 610.12-1990)
Result: The consequence/outcome of the execution of a test. It includes outputs to screens, changes to data, reports, and communication messages sent out. See also actual result, expected result. (ISTQB 2006)



Running Tested Features (RTF): A metric to measure the progress of an agile team. (Jeffries 2004)
Sentence Format Keyword: A term defined in this thesis for a keyword whose name is a sentence and which does not take any arguments. See also base keyword and user keyword.
Software: Computer programs, procedures, and possibly associated documentation and data pertaining to the operation of a computer system. (IEEE Std 610.12-1990)
Software Quality: The totality of functionality and features of a software product that bear on its ability to satisfy stated or implied needs. (ISO/IEC Std 9126-1:2001)
Static Code Analysis: Analysis of source code carried out without execution of that software. (ISTQB 2006)
Static Testing: Testing of a component or system at specification or implementation level without execution of that software, e.g. reviews or static code analysis. (ISTQB 2006) See also dynamic testing.
System: A collection of components organized to accomplish a specific function or set of functions. (IEEE Std 610.12-1990)
System Testing: The process of testing an integrated system to verify that it meets specified requirements. (Burnstein 2003) See also component testing, integration testing and acceptance testing.
System Under Test (SUT): The entire system or product to be tested. (Craig and Jaskiel 2002)
Test: A set of one or more test cases. (IEEE Std 829-1983)
Test Automation: The use of software to perform or support test activities, e.g. test management, test design, test execution and results checking. (ISTQB 2006)
Test Automation Framework: A framework used for test automation. Provides some core functionality (e.g. logging and reporting) and allows its testing capabilities to be extended by adding new test libraries. (Laukkanen 2006)
Test Case: A set of input values, execution preconditions, expected results and execution postconditions, developed for a particular objective or test condition, such as to exercise a particular program path or to verify compliance with a specific requirement. (IEEE Std 610.12-1990)
Test Data: Data that exists (for example, in a database) before a test is executed, and that affects or is affected by the component or system under test. (ISTQB 2006)
Test-Driven Development (TDD): A way of developing software where the test cases are developed, and often automated, before the software is developed to run those test cases. (ISTQB 2006)
Test Execution: The process of running a test on the component or system under test, producing actual result(s). (ISTQB 2006)



Test Execution Automation: The use of software, e.g. capture/playback tools, to control the execution of tests, the comparison of actual results to expected results, the setting up of test preconditions, and other test control and reporting functions. (ISTQB 2006)
Test Engineer: See tester.
Test Input: The data received from an external source by the test object during test execution. The external source can be hardware, software or human. (ISTQB 2006)
Test Level: A group of test activities that are organized and managed together. A test level is linked to the responsibilities in a project. Examples of test levels are component test, integration test, system test and acceptance test. (Pol 2002)
Test Log: A chronological record of relevant details about the execution of tests. (IEEE Std 829-1983)
Test Logging: The process of recording information about tests executed into a test log. (ISTQB 2006)
Test Report: A document summarizing testing activities and results. (IEEE Std 829-1983)
Test Run: Execution of a test on a specific version of the test object. (ISTQB 2006)
Test Runner: A generic driver script capable of executing different kinds of test cases, not only variations with slightly different test data. (Laukkanen 2006)
Test Result: See result.
Test Script: Commonly used to refer to a test procedure specification, especially an automated one. (ISTQB 2006)
Test Set: See test suite.
Test Suite: A set of several test cases for a component or system under test, where the postcondition of one test is often used as the precondition for the next one. (ISTQB 2006)
Testability: The capability of the software product to enable modified software to be tested. (ISO/IEC Std 9126-1:2001)
Tester: A skilled professional who is involved in the testing of a component or system. (ISTQB 2006)
Testing: The process consisting of all life cycle activities, both static and dynamic, concerned with planning, preparation and evaluation of software products and related work products to determine that they satisfy specified requirements, to demonstrate that they are fit for purpose and to detect defects. (ISTQB 2006)
User Keyword: Keyword constructed from base keywords and other user keywords in a test design system. User keywords can be created easily even without programming skills. (Laukkanen 2006) See also base keyword and sentence format keyword.



Unit Testing: See component testing.
Variable: An element of storage in a computer that is accessible by a software program by referring to it by a name. (ISTQB 2006)
White-Box Testing: Testing based on an analysis of the internal structure of the component or system. (ISTQB 2006) See also black-box testing.



1 INTRODUCTION

1.1 Motivation

Quality is one of the most important aspects of software products. If software does not work, it is worth little. The drawbacks caused by faulty software can be much greater than the advantages gained from using it. Malfunctioning or difficult-to-use software can complicate daily life. In life-critical systems, faults may even cause the loss of human lives. In highly competitive markets, quality may determine which software product is going to succeed and which ones are going to fail. Low-quality software products have a negative impact on a firm's reputation and unquestionably also on sales. Unhappy customers are also more willing to change to other software suppliers. For these reasons, organizations have to invest in the quality of their software products.

Even high-quality software can fail in the market if it does not meet the customers' needs. At the beginning of a software project it is common that the customers' exact needs are unknown. This may lead to guessing the wanted features and to developing useless features, and in the worst case useless software. This should obviously be avoided.

New feature ideas usually arise when the customer understands the problem domain more thoroughly. This can be quite problematic if strict contractual agreements on the developed features exist. Even when it is contractually possible to add new features to the software, a lot of rework may be needed before the features are ready for use.

Iterative and especially agile software processes have been introduced as a solution to changing requirements. The basic idea in iterative processes is to create the software in small steps. When software is developed in this way, the customers can try out the developed software, and based on the customer's feedback the development team can create features that are valuable for the customer. The most valuable features are developed first, allowing the customer to start using the software earlier than with software developed in a non-iterative development process.

Iterative software development adds new challenges to software testing. In traditional software projects the main part of the testing is conducted at the end of the development project. With iterative and agile processes, however, the software should be tested in every iteration. If the customer uses the result of the iteration, at least all the major problems should be solved before the product can be delivered. In an ideal situation the outcome of each iteration would be high-quality software.



In the agile methods the need for testing is understood, and there are development practices that are used to assure the quality of the software. Many of these practices are targeted at developers and used to test that the code works as the developers have thought it should. To also test that the features fulfill the customer's requirements, higher-level testing is needed. This higher-level testing is often called acceptance testing or customer testing. Customer input is needed to define these higher-level test cases to make sure that the customer's requirements are met.

Because the software is developed in an iterative manner and there is continuous change, it would be beneficial to test all the features at least once during the iteration. Repeated testing is needed because the changes may have caused defects. Testing all functionality manually after every change is not possible. It may be possible at the beginning, but when the number of features rises, manual regression testing becomes harder and eventually impossible. This leads to a situation in which changes made late in the iteration may have caused faults that cannot be noticed in testing. And even if the faults could be noticed, developers may not be able to fix them during the iteration.

Test automation can be used to help the testing effort. Test automation means testing software with other software. When software and computers are used for testing, the test execution can be conducted much faster than manually. If the automated tests can be executed daily or even more often, the status of the developed software is continuously known. Therefore the problems can be found faster and the changes causing the problems can be pinpointed. That is why test automation is an integral part of agile software development.

By automating the customer-defined acceptance tests, the test cases defining how the system should work from the customer's point of view can be executed often. This makes it possible to know the status of the software at any point of the development. In acceptance test-driven development this approach is taken even further, and the acceptance tests are not only used for verifying that the system works but also for driving the system development. The customer-defined test cases are created before the implementation starts. The goal of the implementation is then to develop software that passes all the acceptance test cases.
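To make the idea concrete, the following sketch shows roughly what an automatable, customer-defined acceptance test can look like when expressed as keyword-driven test data, and how a very small interpreter could drive an application with it. The example is illustrative only: the keyword names, the LoginApp class, and the tabular format are invented for this sketch and do not describe the framework studied in this thesis, which is introduced in Chapters 6 and 7.

    # A hypothetical keyword-driven acceptance test: each row is a keyword
    # (an action or a check) followed by its arguments. Rows like these can
    # be written and reviewed with the customer before implementation starts.
    ACCEPTANCE_TEST = [
        ("Open Login Page",),
        ("Input Username", "alice"),
        ("Input Password", "secret"),
        ("Submit Login",),
        ("Page Should Contain", "Welcome, alice"),
    ]

    class LoginApp:
        """Stand-in for the system under test (invented for this sketch)."""

        def __init__(self):
            self.page = ""
            self.username = ""
            self.password = ""

        def open_login_page(self):
            self.page = "Login"

        def input_username(self, name):
            self.username = name

        def input_password(self, password):
            self.password = password

        def submit_login(self):
            # The implementation goal is to make this behave so the test passes.
            self.page = f"Welcome, {self.username}" if self.password else "Error"

    def check_page_contains(app, text):
        assert text in app.page, f"Expected page to contain {text!r}, got {app.page!r}"

    def run_test(app, rows):
        """Minimal keyword interpreter: map each keyword name to an action."""
        keywords = {
            "Open Login Page": app.open_login_page,
            "Input Username": app.input_username,
            "Input Password": app.input_password,
            "Submit Login": app.submit_login,
            "Page Should Contain": lambda text: check_page_contains(app, text),
        }
        for keyword, *args in rows:
            keywords[keyword](*args)
        return "PASS"

    if __name__ == "__main__":
        print(run_test(LoginApp(), ACCEPTANCE_TEST))

Before the feature is implemented, the test data already exists and the test fails; once the login behavior is in place, the same rows pass and remain available as automated regression tests.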



1.2 Aim of the Thesis

The aim of this thesis is to investigate whether acceptance test-driven development can be used with an in-house-built keyword-driven test automation framework. The research is conducted in a real-life agile software development project, and the suitability of the approach is evaluated in this case project. The pros and cons of the approach are also evaluated. More detailed research questions follow in Chapter 8, after the concepts of acceptance test-driven development and keyword-driven test automation have been clarified. One purpose is to present the framework usage at a level of detail that can help others to try the approach with similar kinds of tools.

1.3 Structure of the Thesis

The structure of this thesis is the following: in Chapter 2, traditional software testing is described to introduce the basic concepts needed in the following chapters. In Chapter 3, the basis of agile and iterative software development is described. Testing in agile software development is introduced in Chapter 4. Chapter 4 also covers acceptance test-driven development, which is the main topic of this thesis. Chapter 5 covers test automation approaches in general and the keyword-driven test automation approach in particular. After the keyword-driven approach is introduced, the keyword-driven test automation framework used in this thesis is explained in Chapter 6 at the level needed to understand the coming chapters. Chapter 7 contains a simple, fictitious example of the usage of the presented keyword-driven test automation framework with acceptance test-driven development.

The research questions are defined in Chapter 8. The case project and the product developed in it are described in Chapter 9. The research method used to conduct this research is also explained in Chapter 9. Chapter 10 contains all the results from the project. First the development model used in the case project is described. Then the use of acceptance test-driven development with the keyword-driven test automation framework is presented. Chapter 10 also contains results from the interviews which were conducted at the end of the research. In Chapter 11 the observations gained from the case project are analyzed. Chapter 12 contains the conclusions and the discussion about the results and the meaning of the analysis in a wider perspective. Further research areas are presented at the end of Chapter 12.



2 TRADITIONAL TESTING

This chapter describes traditional testing terminology and the divisions between different testing aspects. The purpose is to give an overall view of the testing field and to make it possible in the following chapters to compare agile testing to traditional testing and to place the research area in a wider context.

2.1 Purpose of Testing

Testing is an integral part of software development. The goal of software testing is to find faults in the developed software and to make sure they get fixed (Kaner et al. 1999, Patton 2000). It is important to find the faults as early as possible, because fixing them is more expensive in the later phases of development (Kaner et al. 1999, Patton 2000). The purpose of testing is also to provide information about the current state of the developed software from the quality perspective (Burnstein 2003). One might argue that software testing should make sure that the software works correctly. This is, however, impossible, because even a simple piece of software has millions of paths that would all have to be tested to make sure that it works correctly (Kaner et al. 1999).

2.2 Dynamic and Static Testing

On a high level, software testing can be divided into dynamic and static testing. The division into these two categories can be made based on whether the software is executed or not. Static testing means testing without executing the code. This can be done with different kinds of reviews; reviewed items can be documents or code. Other static testing methods are static code analysis methods, for example syntax correctness and code complexity analysis. With static testing, faults can be found in an early phase of software development because the testing can be started before any code is written. (IEEE Std 610.12-1990; Burnstein 2003)

Dynamic testing is the opposite of static testing. The system under test is tested by executing it or parts of it. Dynamic testing can be divided into functional testing and non-functional testing, which are presented below. (Burnstein 2003)

2.3 Functional and Non-Functional Testing

The purpose of functional testing is to verify that the software corresponds to the requirements defined for the system. The focus in functional testing is to enter inputs to the system under test and verify the proper output and state. The concept of functional testing is quite similar for all systems, even though the inputs and outputs differ from system to system.



Non-functional testing means testing the quality aspects of software. Examples of non-functional testing are performance, security, usability, portability, reliability, and memory management testing. Each kind of non-functional testing needs different approaches and different kinds of know-how and resources. The needed non-functional testing is always decided based on the quality attributes of the system and is therefore selected case by case. (Burnstein 2003)

2.4 White-Box and Black-Box Testing

There are two basic testing strategies, white-box testing and black-box testing. When the white-box strategy is used, the internal structure of the system under test is known. The purpose is to verify the correct behavior of internal structural elements. This can be done, for example, by exercising all the statements or all conditional branches. Because white-box testing is quite time consuming, it is usually done for small parts of the system at a time. White-box testing methods are useful in finding design, code-based control, logic and sequence defects, initialization defects, and data flow defects. (Burnstein 2003)

In black-box testing the system under test is seen as an opaque box. There is no knowledge of the inner structure of the software; the only knowledge is of how the software should work. The intention of black-box testing is to provide inputs to the system under test and verify that the system works as defined in the specifications. Because the black-box approach considers only the behavior and functionality of the system under test, it is also called functional testing. With the black-box strategy, requirement and specification defects are revealed. The black-box testing strategy can be used at all test levels defined in the following section. (Burnstein 2003)
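As a small, generic illustration (not taken from the thesis or the case project), consider a function that grants a discount to orders above a threshold. A white-box test is chosen with the branch structure of the code in mind, while a black-box test only checks the specified input/output behavior; the function, values and threshold below are invented for this sketch.

    def discounted_price(total):
        """Apply a 10 % discount to orders of 100 or more (example specification)."""
        if total >= 100:
            return total * 0.9
        return total

    # White-box view: we know the code has two branches, so we pick inputs
    # that exercise both of them, including the boundary value.
    def test_both_branches():
        assert discounted_price(100) == 90.0   # discount branch, boundary value
        assert discounted_price(99) == 99      # no-discount branch

    # Black-box view: we only use the specification "orders of 100 or more
    # get 10 % off" and compare inputs to expected outputs, ignoring the code.
    def test_against_specification():
        for total, expected in [(50, 50), (100, 90.0), (200, 180.0)]:
            assert discounted_price(total) == expected

    if __name__ == "__main__":
        test_both_branches()
        test_against_specification()
        print("All tests passed")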

2.5 Test Levels

Testing can be performed on multiple levels. Usually software testing is divided into unit testing, integration testing, system testing, and acceptance testing (Dustin et al. 1999; Craig & Jaskiel 2002; Burnstein 2003). The purpose of these different test levels is to investigate and test the software from different perspectives and to find different types of defects (Burnstein 2003). If the division into levels is done from the test automation perspective, the levels can be unit testing, component testing and system testing (Meszaros 2003; Laukkanen 2006). In this thesis, whenever traditional test levels are used, the division into unit, integration, system, and acceptance testing is meant. Figure 1 shows these test levels and their relative order.



Figure 1: Test levels (Burnstein 2003)

UNIT TESTING

The smallest part of software is a unit. A unit is traditionally viewed as a function or a procedure in an (imperative) programming language. In object-oriented systems, methods and classes/objects can be seen as units. A unit can also be a small-sized component or a programming library. The principal goal of unit testing is to detect functional and structural defects in the unit. Sometimes the name component is used instead of unit; in that case the name of this phase is component testing. (Burnstein 2003)

There are different opinions about who should create unit tests. Unit testing is in most cases best handled by developers, who know the code under test and the techniques needed (Dustin et al. 1999; Craig & Jaskiel 2002; Mosley & Posey 2002). On the other hand, Burnstein (2003) thinks that an independent tester should plan and execute the unit tests. The latter is the more traditional point of view, following the principle that nobody should evaluate their own work.

Unit testing can be started in an early phase of software development, as soon as the unit is created. The failures revealed by unit tests are usually easy to locate and repair, since only one unit is under consideration (Burnstein 2003). For these reasons, finding and fixing defects is cheapest at the unit test level.
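As a minimal, generic example (not taken from the case project), a unit test exercises a single function in isolation; the sketch below uses Python's standard unittest module, and the word_count function is invented for illustration.

    import unittest

    def word_count(text):
        """The unit under test: count whitespace-separated words."""
        return len(text.split())

    class WordCountTest(unittest.TestCase):
        def test_counts_words(self):
            self.assertEqual(word_count("acceptance test driven development"), 4)

        def test_empty_string_has_no_words(self):
            self.assertEqual(word_count(""), 0)

    if __name__ == "__main__":
        unittest.main()

Because such tests touch only one unit and run in milliseconds, they can be executed after every change, which is what makes defects found at this level the cheapest to locate and fix.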



INTEGRATION TESTING

When units are combined, the resulting group of units is called a subsystem or, sometimes in object-oriented software systems, a cluster. The goal of integration testing is to verify that the component/class interfaces are working correctly and that the control and data flows work correctly between the components. (Burnstein 2003)

SYSTEM TESTING

When ready and tested subsystems are combined into the final system, system test execution can be started. System tests evaluate both the functional behavior and the non-functional qualities of the system. The goal is to ensure that the system performs according to its requirements when tested as a whole. After system testing and corrections based on the found faults are done, the system is ready for the customer's acceptance testing, alpha testing or beta testing (see next paragraph). If the customer has defined the acceptance tests, those can be used in the system testing phase to assure the quality of the system from the customer's point of view. (Burnstein 2003)

ACCEPTANCE TESTING

When a software product is custom-made, the customer wants to verify that the developed software meets her requirements. This verification is done in the acceptance testing phase. The acceptance tests are developed in co-operation between the customer and test planners and executed after the system testing phase. The purpose is to evaluate the software in terms of the customer's expectations and goals. When the acceptance testing phase is passed, the product is ready for production. If the product is targeted for the mass market, it is often not possible to arrange customer-specific acceptance testing. In these cases the acceptance testing is conducted in two phases called alpha and beta testing. In alpha testing, possible customers and members of the development organization test the product on the development organization's premises. After the defects found in alpha testing are fixed, beta testing can be started. The product is sent to a cross-section of users who use it in a real-world environment and report the defects they find. (Burnstein 2003)
report the found defects. (Burnstein 2003)<br />

7


REGRESSION TESTING<br />

The purpose of regression testing is to ensure that old characteristics are working after changes made<br />

to the software and verify that the changes have not introduced new defects. Regression testing is not a<br />

test level as such and it can be performed in all test levels. The importance of the regression testing<br />

increases when the system is released multiple times. The functionality provided in the previous version<br />

should still work <strong>with</strong> all the new functionality and verifying this is very time consuming. Therefore<br />

it is recommended to use automated testing tools to support this task (Burnstein 2003). Also Kaner<br />

et al. (1999) have noticed that it is a common way to automate acceptance and regression tests to<br />

quickly verify the status of the latest build.<br />



3 AGILE AND ITERATIVE SOFTWARE DEVELOPMENT

The purpose of this chapter is to explain the iterative development model and agile methods in general, and to illustrate the development models Scrum and Extreme Programming (XP) on a more detailed level because of their relevance to this thesis.

3.1 Iterative Development Model

In the iterative development model, software is built in multiple sequential iterations during the whole lifecycle of the software. An iteration can be seen as a mini-project containing requirement analysis, design, development, and testing. The goal of the iteration is to build an iteration release. An iteration release is a partially completed system which is stable, integrated, and tested. Usually most of the iteration releases are internal and not released to external customers. The final iteration release is the complete product, and it is released to the customer or to the market. (Larman 2004)

Usually the partial system grows incrementally with new features, iteration by iteration. This is called incremental development. The concept of a system growing via iterations has been called iterative and incremental development, although iterative development is the more common term. The features to be implemented in an iteration are decided at the beginning of the iteration. The customer selects the most valuable features at that time, so there is no strict predefined plan. This is called adaptive planning. (Larman 2004)

In modern iterative methods, the recommended length of an iteration is between one and six weeks. In most iterative and incremental development methods the length of the iteration is timeboxed. Timeboxing is a practice which sets a fixed end date for the iteration. A fixed end date means that if the iteration scope cannot be met, the features with the lowest priority are dropped from the scope of the iteration. This way the growing software is always in a stable and tested state at the end of the iteration. (Larman 2004)

Evolutionary iterative development implies that requirements, plans, and solutions evolve and are refined during the iterations, instead of following predefined specifications. There is also the term adaptive development. The difference between these two terms is that adaptive development implies that the received feedback is guiding the development. (Larman 2004)

9


Iterative and incremental development makes it possible to repeatedly deliver an enhanced product to the markets. This is also called incremental delivery. Usually the incremental deliveries are done every three to twelve months. Evolutionary delivery is a refinement of incremental delivery. In evolutionary delivery the goal is to collect feedback and to plan the content of the next delivery based on it. In incremental delivery the feedback does not drive the delivery plan. However, in practice there is always both predefined and feedback-based planning, and therefore these two terms are used interchangeably. (Larman 2004)

3.2 Agile Development

Iterative and incremental development is the core of all agile methods, including Scrum and XP. Agile methods cannot be captured in a single definition, but all of them apply timeboxed iterative and evolutionary delivery as well as adaptive planning. There are also values and practices in agile methods that support agility, meaning rapid and flexible response to change. Agile methods also promote practices and principles like simplicity, lightness, communication, self-directed teams, and programming over documentation. The values and principles that guide the agile methods were written down by a group interested in iterative and agile methods in 2001 (Larman 2004). Those values are stated in the Agile Manifesto (Figure 2). The agile software development principles are listed in Appendix A.

Figure 2: Agile Manifesto (Beck et al. 2001a)


3.3 Scrum

Scrum is an agile, lightweight process that can be used to manage and control software and product development, and it uses iterative and incremental development methods. Scrum emphasizes an empirical process rather than a defined process. Scrum consists of four phases: planning, staging, development, and release. In the planning phase, items like the vision, funding, and initial requirements are created. In the staging phase, requirements are defined and prioritized so that there is enough content for the first iteration. In the development phase, the development is done in iterations. The release phase contains product tasks like documentation, training, and deployment. (Schwaber & Beedle 2002; Larman 2004; Schwaber 2004)

When using Scrum, the people involved in software development are divided into three different roles: the product owner, the scrum master, and the team. The product owner’s task is to get the funding, collect the project’s initial requirements, and manage the requirements (see the product backlog below). The team is responsible for developing the functionality. The teams are self-managing, self-organizing, and cross-functional, and their task is to figure out how to convert the items in the product backlog into functionality in iterations. Team members are collectively responsible for the success of the iterations and of the project as a whole, and this is one of the core principles of Scrum. The maximum size of the team is seven members. The scrum master is responsible for the Scrum process and for teaching Scrum to everyone in the project. The scrum master also makes sure that everyone follows the rules and practices of Scrum. (Schwaber 2004)

Scrum consists of several practices, which are the Product Backlog, Daily Scrum Meetings, the Sprint, Sprint Planning, the Sprint Backlog, the Sprint Review, and the Sprint Retrospective. Figure 3 shows an overview of Scrum.


Figure 3: Overview of Scrum (Control Chaos 2006a)

PRODUCT BACKLOG

The Product Backlog is a list of all the features, functions, technologies, enhancements, and bug fixes that constitute the changes to be made to the product for future releases. The items in the product backlog form a prioritized list which evolves all the time. The idea is to add new items to it whenever there are new features or improvement ideas. (Schwaber & Beedle 2002)

SPRINT

Sprint is the name of the timeboxed iteration in Scrum. The length of a sprint is usually 30 calendar days. Sprint planning takes place at the beginning of the sprint and consists of two meetings. In the first meeting the product owner and the team select the content for the coming sprint from the product backlog. Usually the items with the highest priority and risks are selected. In the second meeting, the team and the product owner consider how to develop the selected features and create the sprint backlog, which contains all the tasks that are needed to meet the goals of the sprint. The durations of the tasks are estimated in the meeting and updated during the sprint. (Schwaber & Beedle 2002; Larman 2004; Schwaber 2004)


DAILY SCRUM

The development progress is monitored with daily scrum meetings. The daily scrum is held in a specified form every work day at the same time and place. The meeting should not last more than 15 minutes. The team stands in a circle, and the scrum master asks all the team members the following questions:

1. What have you done since the last daily scrum?

2. What are you going to do between now and the next daily scrum?

3. What is preventing you from doing your work?

If any problems are raised during the daily scrum meeting, it is the responsibility of the team to solve them. If the team cannot deal with the problems, it becomes the responsibility of the scrum master. If there is a need for a decision, the scrum master has to decide the matter within an hour. If there are some other problems, the scrum master should solve them within one day, before the next daily scrum. (Schwaber & Beedle 2002; Schwaber 2004)

SPRINT REVIEW

At the end of the sprint, the results are shown in the sprint review hosted by the scrum master. The purpose of the sprint review is to demonstrate the done functionality to the product owner and the stakeholders. After every presentation, all the participants are allowed to voice any comments, observations, improvement ideas, changes, or missing features regarding the presented functionality. All these items are noted down. At the end of the meeting all the items are checked and placed into the product backlog for prioritization. (Schwaber & Beedle 2002; Schwaber 2004)

DEFINITION OF DONE

Because only done functionality can be shown in the sprint review, there is a need to define what that means. Otherwise one person might think that functionality is done when a feature is implemented, while another thinks that it is done when it is properly tested, documented, and ready to be deployed to production. Schwaber (2004) recommends having a definition of done that is written down and agreed on by all members of the team. This way all stakeholders know the condition of the demonstrated functionalities.


SPRINT RETROSPECTIVE

The sprint retrospective meeting is used to improve the performance of the scrum team. The sprint retrospective takes place at the end of the sprint, and the participants are the scrum master and the team. All the team members are asked two questions: “What went well during the last sprint?” and “What could be improved in the next sprint?” The improvement ideas are prioritized, and the ideas that should be taken into the next sprint are added as high-priority non-functional items to the product backlog. (Schwaber 2004)

RULES IN SCRUM

In addition to the aspects mentioned earlier, there are a few more rules in Scrum. It is forbidden to add any new tasks to the sprint backlog during the sprint, and the scrum master must ensure this. However, if the proposed new tasks are more important than the ones in the sprint backlog, the scrum master can abnormally terminate the sprint. After the termination, a new sprint can be started with a sprint backlog containing the new tasks. (Schwaber & Beedle 2002; Schwaber 2004)

DAILY BUILD

As mentioned earlier, Scrum is used to manage and control product development, and therefore there are no strict rules about which development practices should be used. However, there is a need to know the status of the project on a daily basis, and therefore a daily build practice is needed. The daily build practice means that every day the developed source code is checked into the version control system, built, and tested. This means that integration problems can be noticed on a daily basis rather than at the end of the sprint. The daily build practice can be implemented with continuous integration. Because the daily build is the only development practice that has to be used in Scrum, the team is responsible for selecting the other development practices to be used. This means that many practices from other agile methods can be used by the team. (Schwaber & Beedle 2002)


SCALING SCRUM

It was mentioned above that the maximum size of a scrum team is seven members. When Scrum is used in a larger project, the project members can be divided into multiple teams (Schwaber 2004; Larman 2006). When multiple teams are used, the cooperation between the teams can be handled with the scrum of scrums. The scrum of scrums is a daily scrum in which at least one member from every scrum team participates. This mechanism is used to remove obstacles that concern more than one team (Schwaber 2004). In a larger project it is also possible to divide the product owner’s responsibilities. Cohn (2007) suggests using a group of product owners with one chief product owner. The product owners work in the teams while the chief product owner manages the product as a whole. Larman (2006) calls the product owners working with the scrum teams feature champions.

3.4 Extreme Programming

Extreme Programming (XP) is a disciplined yet very agile software development method for small teams of two to twelve members. The purpose of XP is to minimize the risk and the cost of change in software development. XP is based on the experiences gained and the practices successfully used by the father of the method, Kent Beck. Communication, simplicity, feedback, and courage are the values that XP is based on. Simplicity means code that is as simple as possible: no extra functionality is implemented beforehand, even if there might be a need for a more complex solution in the future. Communication means continuous communication between the customer and the developers and also among the developers. Some of the XP practices also force communication, which enhances the spread of important information inside the project. Continuous testing and communication provide feedback on the state of the system and the development velocity. Courage is needed to make hard decisions, like changing the system heavily when seeking simplicity and better design. Another form of courage is deleting code that is not working at the end of the day. To concretize these values, there are twelve development practices on which XP heavily relies. The practices are listed below:

• The Planning Game: Quickly determine the scope of the next release by combining business priorities and technical estimates. As reality takes over the plan, update the plan.

• Small Releases: Put a simple system into production quickly, and then release new versions on a very short cycle.

• Metaphor: Guide all development with a simple shared story of how the whole system works.

• Simple Design: The system should be designed as simply as possible at any given moment. Extra complexity is removed as soon as it is discovered.

• Testing: Programmers continually write unit tests, which must run flawlessly for development to continue. Customers write tests demonstrating that features are finished.

• Refactoring: Programmers restructure the system without changing its behavior to remove duplication, improve communication, simplify, or add flexibility.

• Pair Programming: All production code is written with two programmers at one machine.

• Collective ownership: Anyone can change any code anywhere in the system at any time.

• Continuous integration: Integrate and build the system many times a day, every time a task is completed.

• 40-hour week: Work no more than 40 hours a week as a rule. Never work overtime a second week in a row.

• On-site customer: Include a real, live user on the team, available full-time to answer questions.

• Coding standards: Programmers write all code in accordance with rules emphasizing communication through the code.

None of the practices are unique or original. However, the idea in XP is to use all the practices together, because when they are used together they complement each other (Figure 4). (Beck 2000)

Figure 4: The practices support each other (Beck 2000)


3.5 Scrum and Extreme Programming Together

It is possible to combine the agile management mechanisms of Scrum with the engineering practices of XP (Control Chaos 2006b). Figure 5 illustrates this approach. Mar and Schwaber (2002) have experienced that these two approaches are complementary; when used together, they can have a significant impact on both the productivity of a team and the quality of its outputs.

Figure 5: XP@Scrum (Control Chaos 2006b)

3.6 Measuring Progress in Agile Projects

Ron Jeffries (2004) recommends using the Running Tested Features (RTF) metric for measuring the team’s agility and productivity. He defines the RTF in the following way:

1. The desired software is broken down into named features (requirements, stories) which are part of the system to be delivered.

2. For each named feature, there are one or more automated acceptance tests which, when they work, will show that the feature in question is implemented.

3. The RTF metric shows, at every moment in the project, how many features are passing all their acceptance tests.

The RTF is a simple metric, and it measures the most important aspect of the software well: the number of working features. The RTF value should start to increase at the beginning of the project and keep increasing until the end of the project. If the curve is not rising, there must be some problems in the project. Figure 6 shows what the RTF curve could look like if the project is doing well. (Jeffries 2004)

Figure 6: RTF curve for an agile project (Jeffries 2004)
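As a small illustration of point 3 above, the following Python sketch computes the RTF value from a set of automated acceptance test results. The data structure and the feature names are assumptions made only for this example.

    def running_tested_features(features):
        """features maps a feature name to the list of pass/fail results
        (True = passing) of its automated acceptance tests."""
        return sum(1 for results in features.values() if results and all(results))

    acceptance_results = {
        "add registration": [True, True],
        "delete registration": [True, False],  # one acceptance test still fails
        "count registrations": [],             # no automated tests written yet
    }
    print(running_tested_features(acceptance_results))  # -> 1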

4 TESTING IN AGILE SOFTWARE DEVELOPMENT

Agile testing is guided by the Agile Manifesto presented in Figure 2. Marick (2001) sees working code and conversing people as the most important guides for agile testing. Communication between the project and the test engineers should not be based on written requirements and design specifications handed over the wall to the testing department, and on specifications and defect reports handed back. Instead, Marick (2001) emphasizes face-to-face conversations and informal discussions as the main channel for getting testing ideas and creating the test plan. Test engineers should work with the developers and help test even unfinished features. Marick is one of the people agreeing with the principles of the context-driven testing school (Kaner et al. 2001a), and therefore the principles of agile testing and context-driven testing overlap.

4.1 Purpose of Testing

The purpose of agile testing is to build confidence in the developed software. In Extreme Programming the confidence is built on two test levels. The unit tests created with test-driven development increase the developers’ confidence, and the customer’s confidence is founded on the acceptance tests (Beck 2000). Unit tests verify that the code works correctly, and acceptance tests make sure that the correct code has been implemented. In Scrum the integration and acceptance tests are not described (Abrahamsson et al. 2002), and therefore it is up to the team to define the testing-related issues. Itkonen et al. (2005) state that in agile testing the focus is on constructive quality assurance practices. This is the opposite of destructive quality assurance practices, such as the negative testing used in traditional testing. Itkonen et al. (2005) have doubts about the sufficiency of the constructive quality assurance practices, but admit that more research in that area is needed.

4.2 Test Levels

In agile development the different testing activities overlap. This is mainly because the purpose is to deliver working software repeatedly. The levels of agile testing cannot be distinguished from the development phases in the same way as the traditional test levels can. The contents of the different levels also differ between agile and traditional testing. As was mentioned in the previous chapter, in XP the confidence is built with the unit and acceptance tests, whereas Scrum does not contain guidelines on how testing should be conducted. There are also other opinions in the agile community on how the testing could be divided, and therefore there is no coherent definition of the test levels in agile testing. However, the test levels in XP and some other categorizations are presented below.


UNIT TESTING

Unit testing, sometimes also called developer testing, is very similar to traditional unit testing. However, unit tests are usually written using test-driven development (TDD). As the name test-driven indicates, the unit tests are written before the code (Beck 2003; Astels 2003). When TDD is used, it is obvious that a developer writes the unit tests. Even though TDD is used to create the unit tests, its purpose is not just testing: TDD is an approach to writing and designing maintainable code, and as a nice side effect, a suite of unit tests is produced (Astels 2003).

ACCEPTANCE TESTING IN XP

Acceptance testing in XP has a wider meaning than traditional acceptance testing. Acceptance tests can contain functional, system, end-to-end, performance, load, stress, security, and usability testing, among others (Crispin 2005). Acceptance tests are also called customer and functional tests in the XP literature, but in this thesis the term acceptance test is used.

The acceptance tests are written by the customer or by a tester with the customer’s help (Beck 2000). In some projects defining the acceptance tests has been a joint effort of the team (Crispin et al. 2002). The aim of acceptance testing is to show that the product is working as the customer wants and to increase her confidence (Beck 2000; Jeffries 1999). The acceptance tests should contain only tests for features that the customer wants. Jeffries (1999) advises investing wisely and picking tests that are meaningful when they pass or fail. Crispin et al. (2002) also mention that the purpose of the acceptance tests is not to go through all the paths in the system, because the unit tests take care of that. However, Crispin (2005) has noticed that teams doing TDD test only the “happy paths”, especially when trying TDD for the first time. Misunderstood requirements and hard-to-find defects may go undetected, and therefore the acceptance tests keep the teams on track.

The acceptance tests should always be automated, and the automated tests should be simple and created incrementally (Jeffries et al. 2001; Crispin & House 2005). However, in practice, automating all the tests is extremely hard and some trade-offs have to be made (Crispin et al. 2002). Kaner (2003) thinks that automating all acceptance tests is a serious error and that the amount of automated tests should be decided based on the context. Jeffries (2006) admits that automating all the tests is impossible but still states that “if we want to be excellent at automated testing, we should set out to automate all tests”. When automating the tests, the entire development team should be responsible for the automation tasks (Crispin et al. 2002). The test-first approach can also be used with the acceptance tests. The acceptance test-driven development concept is introduced in Chapter 4.3.


OTHER TESTING PRACTICES IN XP

While unit and acceptance testing are the heart of XP, Beck (2000) admits that there are also other testing practices that make sense from time to time. He lists parallel tests, stress tests, and monkey tests as examples of these kinds of helpful testing approaches.

OTHER TEST LEVELS IN AGILE TESTING

There are also other test level divisions in the agile testing community in addition to the division in XP. Marick (2004) divides testing into four categories: technology-facing programmer support, business-facing team support, business-facing product critiques, and technology-facing product critiques. In Marick’s division, unit testing can be seen as technology-facing programmer support and acceptance testing as business-facing team support. Business-facing product critiques mean testing for forgotten, wrongly defined, or otherwise false requirements. Marick (2004) believes that different kinds of exploratory testing practices can be used in this phase. Technology-facing product critiques correspond to non-functional testing.

Hendrickson (2006) divides the agile testing practices into automated acceptance or story tests, automated unit tests, and manual exploratory testing (Figure 7). She thinks that exploratory testing provides additional feedback and covers gaps in automation, and states that exploratory testing is necessary to augment the automated tests. From the functional testing point of view this division is quite similar to Marick’s (2004) division.

Figure 7: Agile testing practices (Hendrickson 2006)

4.3 Acceptance Test-Driven Development

The idea of acceptance test-driven development (ATDD) was first introduced by Beck (2003) under the name application test-driven development. However, he had some doubts about how well the acceptance tests could be written before the development. Before this, acceptance test-driven development had already been used, although under the name acceptance testing (Miller & Collins 2001). Since then, there have been several projects using acceptance test-driven development (Andersson et al. 2003; Reppert 2004; Crispin 2005; Sauvé et al. 2006). The ATDD concept has also been called story test-driven development (Mudridge & Cunningham 2005; Reppert 2004) and customer test-driven development (Crispin 2005).


PROCESS

On a high level, the acceptance test-driven development process contains three steps. The first step is to define the requirements for the coming iteration. In agile projects the requirements are usually written in the format of user stories. User stories are short descriptions representing the customer requirements, used for planning and as a reminder (Cohn 2004). When the user stories are defined, the acceptance tests for those requirements can be written. As the name acceptance test indicates, the purpose of these tests is to define the acceptable functionality of the system. Therefore, the customer has to take part in defining the acceptance tests. The acceptance tests have to be written in a format the customer understands (Miller & Collins 2001; Mudridge & Cunningham 2005). When the tests have been defined, the development can be started. While the concept is quite simple on a high level, there are multiple possible approaches to by whom, when, and to what extent the acceptance tests are written and automated.

WHO WRITES THE TESTS

As was mentioned above, the customer or some other person with proper knowledge of the domain is needed when writing the tests (Reppert 2004; Crispin 2005). Usually the customer needs some help in writing the tests (Crispin 2005). Crispin (2005) describes a process where the test engineer writes the acceptance tests with the customer. On the other hand, it is also possible for the developers and the customer to define the tests together (Andersson et al. 2003). It is also possible that the customer, the developers, and the test engineers write the tests in collaboration (Reppert 2004). As can be seen, there are several alternative ways of writing the acceptance tests, and the choice evidently depends on the available people and their skills.

WHEN TESTS ARE WRITTEN AND AUTOMATED

When ATDD is used, the tests are written before the development. This can mean writing the test cases before the iteration planning or after it. Mudridge and Cunningham (2005) describe an example of how to use the acceptance tests to define the user stories on a more detailed level and in this way ease the task estimation in the iteration planning session. Watt and Leigh-Fellows (2004) have also used acceptance tests to clarify the user stories before the planning sessions. On the other hand, Crispin (2005) and Sauvé et al. (2006) describe a process where the acceptance tests are developed after the stories have been selected for the iteration.


While working in one software development project, Crispin (2005) noticed that writing too many detailed test cases at the beginning can make it difficult for the developers to understand the big picture. Therefore, in that project the high-level test cases were written at the beginning of the iteration and the more detailed low-level test cases were developed in parallel with the developers writing the code. This way the risk of having to rework a lot of test cases is lowered. A similar approach has also been used by Andersson et al. (2003) and Miller and Collins (2001). However, Crispin (2005) states that this is not “pure” ATDD because all the tests are not written before the code.

HOW ACCEPTANCE TESTS ARE AUTOMATED

As was mentioned in Chapter 4.2, the goal in agile testing is to automate as many tests as possible. The actual work varies depending on the tool used to automate the test cases. In general, there are two tasks. The test cases have to be written in a format that can be processed by the test automation framework. In addition to these test cases, some code is needed to move the instructions from the test cases into the system under test. Often this code bypasses the graphical user interface and calls the business logic directly (Reppert 2004; Crispin 2005).

There are several open source tools used to automate the test cases. The best known of these tools is FIT (Framework for Integrated Test) (Sauvé et al. 2006). When FIT is used, the test cases consist of steps which are presented in a tabular format. The developers have to implement test code for every different kind of step, which Sauvé et al. (2006) see as the weakness of FIT. Other tools and approaches used to automate the acceptance test cases are not presented here.


PROMISES AND CHALLENGES

Table 1 and Table 2 show the promises and challenges of acceptance test-driven development collected from the different references mentioned in the previous chapters.

PROMISES

The risk of building incorrect software is decreased: The communication gap is reduced because the tests are an effective medium of communication between the customer and the development (Sauvé et al. 2006). When the collaboration takes place just before the development, there is a clear context for having a conversation and removing misunderstandings (Reppert 2004). Crispin (2005) even thinks that the most important function of the tests is to force the customer, the developers and the test engineers to communicate and create a common understanding before the development.

The development status is known at any point: When the acceptance tests created in collaboration are passing, the feature is done. The readiness of the product can be evaluated based on the results of the suite of automated tests executed daily (Miller and Collins 2001). Knowing which features are ready also makes project tracking easier and better (Reppert 2004).

A clear quality agreement is created: The tests made in collaboration with the customer and the development team serve as a quality agreement between the customer and the development (Sauvé et al. 2006).

Requirements can be defined more cost-effectively: The requirements are described as executable artifacts that can be used to automatically test the software. Misunderstandings are less likely than with requirements defined in textual descriptions or diagrams. (Sauvé et al. 2006)

The requirements and tests are in synchronization: Requirement changes become test updates, and therefore they are always in synchronization (Sauvé et al. 2006).

The quality of tests can be improved: The errors in the tests are corrected and approved by the customer, and therefore the quality of the tests is improved (Sauvé et al. 2006).

Confidence in the developed software is increased: Without tests the customers cannot have confidence in the software (Miller and Collins 2001). The customers get confidence because they do not need to just hope that the developers have understood the requirements (Reppert 2004).

A clear goal for the developers: The developers have a clear goal in making the customer-defined acceptance tests pass, and that can prevent feature creep (Reppert 2004; Sauvé et al. 2006).

The test engineers are not seen as “bad guys”: Because the developers and the test engineers have the same well-defined goal, the developers do not see the test engineers as “bad guys” (Reppert 2004).

Problems can be found earlier: The customer’s domain knowledge helps to create meaningful tests. This helps to find problems already in an early phase of the project (Reppert 2004).

The design of the developed system is improved: Joshua Kerievsky has been amazed at how much simpler the code is when ATDD is used (Reppert 2004).

The correctness of refactoring can be verified: The acceptance tests do not rely on the internal design of the software, and therefore they can be used to reliably verify that the refactoring has not broken anything (Andersson et al. 2003).

Table 1: Promises of ATDD

CHALLENGES

Automating tests: Crispin (2005) has noticed that defining and automating tests can be a huge challenge even with light tools like FIT.

Writing the tests before development: It might be hard to find time for writing the tests in advance (Crispin 2005).

The right level of test cases: Crispin (2005) has noticed that when many test cases are written beforehand, the test cases can cause more confusion than help in understanding the requirements. This causes a lot of rework because some of the test cases have to be refactored. Therefore the team Crispin (2005) worked with started with a few high-level test cases and added more test cases during the iteration.

Table 2: Challenges of ATDD

The promises and challenges are revisited at the end of the thesis when the observations are analyzed.


5 TEST AUTOMATION APPROACHES

The purpose of this chapter is to briefly describe the field of test automation and the evolution of test automation frameworks. In addition, the keyword-driven testing approach is explained in more detail.

5.1 Test Automation

The term test automation usually means test execution automation. However, test automation is a much wider term, and it can also mean activities like test generation, reporting the test execution results, and test management (Bach 2003a). All these test automation activities can take place at all the different test levels described in Chapter 2.5. The extent of test automation can also vary. Small-scale test automation can mean tool-aided testing, like using a small collection of testing tools to ease different kinds of testing tasks (Bach 2003a). On the other hand, large-scale test automation frameworks are used for setting up the environment, executing test cases, and reporting the results (Zallar 2001).

Automating the testing is not an easy task, and there are several issues that have to be taken into account. Fewster and Graham (1999) list the common test automation problems as unrealistic expectations, poor testing practice, an expectation that automated tests will find a lot of new defects, a false sense of security, maintenance, technical problems, and organizational issues. As can be noticed, the list is quite long, and therefore all these issues have to be taken into account when planning the use of test automation. Laukkanen (2006) also lists some other test automation issues, like when to automate, what to automate, what can be automated, and how much to automate.


5.2 Evolution of Test Automation Frameworks

Test automation frameworks have evolved over time (Laukkanen 2006). Kit (1999) divides the evolution into three generations. The first generation frameworks are unstructured: test cases are separate scripts which also contain the test data, and they are therefore almost non-maintainable. In the second generation frameworks the test scripts are well designed, modular, and documented, which makes them maintainable. The third generation frameworks are based on the second generation, with the difference that the test data is taken out of the scripts. This makes varying the test data easy, and similar test cases can be created quickly and without coding skills. This concept is called data-driven testing. The limitation of data-driven testing is that one script is needed for every logically different test case (Fewster & Graham 1999; Laukkanen 2006), which can easily increase the number of needed scripts dramatically. Keyword-driven testing is a logical extension of data-driven testing (Fewster & Graham 1999), and it is described in the following chapter.
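As a minimal illustration of the data-driven idea described above, the following Python sketch keeps the test logic in one script and takes only the inputs and expected outputs from external test data. The add() function and the data layout are assumptions made for this example.

    # In practice the test data would live in an external file (e.g. a CSV);
    # an in-memory list of rows keeps this sketch self-contained.
    ADDITION_DATA = [
        {"first": "1", "second": "2", "expected": "3"},
        {"first": "2", "second": "5", "expected": "7"},
    ]

    def add(a, b):  # stands in for the system under test
        return a + b

    def run_addition_tests(rows):
        for row in rows:
            result = add(int(row["first"]), int(row["second"]))
            assert result == int(row["expected"]), row

    run_addition_tests(ADDITION_DATA)
    # A logically different test case, such as multiplication, would still
    # need a script of its own, which is the limitation that keyword-driven
    # testing removes.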

5.3 Keyword-Driven Testing

In keyword-driven testing, also the keywords controlling the test execution are taken out of the scripts and into the test data (Fewster & Graham 1999; Laukkanen 2006). This makes it possible to create new test cases in the test data without writing a script for every different test case, which also allows test engineers without coding skills to add new test cases (Fewster & Graham 1999; Kaner et al. 2001b). This removes the biggest limitation of the data-driven testing approach. Figure 8 is an example of keyword-driven test data containing two simple test cases for testing a calculator application. The test cases consist of the keywords Input, Push and Check, and of arguments which are the inputs and expected outputs of the test cases. As can be seen, it is easy to add logically different test cases without implementing new keywords.


Figure 8: Keyword-driven test data file (Laukkanen 2006)

To be able to execute the tabular format test cases shown in Figure 8, there has to be a mapping from the keywords to the code interacting with the system under test (SUT). The scripts or code implementing the keywords are called handlers by Laukkanen (2006). Figure 9 shows the handlers for the keywords used in the test data of Figure 8. In addition to the handlers, the test execution needs a driver script which parses the test data and calls the keyword handlers according to the parsed data.

Figure 9: Handlers for keywords in Figure 8 (Laukkanen 2006)
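The following Python sketch illustrates the same idea of handlers and a driver script for the calculator example. The Calculator class, the handler implementations and the row format are assumptions made for this illustration and not the actual code shown in Figures 8 and 9.

    class Calculator:
        """A stand-in for the system under test."""

        def __init__(self):
            self.display = "0"
            self.pending = None
            self.operator = None

        def input_number(self, value):
            self.display = value

        def push(self, button):
            if button in ("add", "multiply"):
                self.pending, self.operator = int(self.display), button
            elif button == "equals":
                current = int(self.display)
                if self.operator == "add":
                    self.display = str(self.pending + current)
                elif self.operator == "multiply":
                    self.display = str(self.pending * current)

    SUT = Calculator()

    # Keyword handlers: one function per base keyword.
    def input_handler(value):
        SUT.input_number(value)

    def push_handler(button):
        SUT.push(button)

    def check_handler(expected):
        assert SUT.display == expected, "expected %s, got %s" % (expected, SUT.display)

    HANDLERS = {"Input": input_handler, "Push": push_handler, "Check": check_handler}

    # One test case as rows of (keyword, argument), as in a tabular test data file.
    TEST_DATA = [
        ("Input", "1"),
        ("Push", "add"),
        ("Input", "2"),
        ("Push", "equals"),
        ("Check", "3"),
    ]

    def driver(rows):
        """Parses the test data rows and calls the matching keyword handlers."""
        for keyword, argument in rows:
            HANDLERS[keyword](argument)

    driver(TEST_DATA)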

If there is a need to create both high-level and low-level test cases, keywords of different levels are needed. Simple keywords like Input are not enough for high-level test cases. According to Laukkanen (2006), there are both simpler and more flexible solutions for this. Higher-level keywords can be created inside the framework by combining the lower-level keywords. The limitation of this approach is the need for coding skills whenever new higher-level keywords are needed. A more flexible solution, proposed by Buwalda et al. (2002), Laukkanen (2006) and Nagle (2007), is to include in the keyword-driven test automation framework a possibility to combine existing keywords. This makes it possible to create higher-level keywords by combining existing keywords inside the test data. Laukkanen (2006) calls these combined keywords user keywords, and this term is also used in this thesis.
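Continuing the Python sketch above, a driver could support such user keywords by letting the test data define a keyword as a sequence of existing keywords and by expanding it recursively at execution time. The data structures below are assumptions made for this illustration.

    # User keywords combined from existing keywords inside the test data.
    USER_KEYWORDS = {
        "Add One And Two": [
            ("Input", "1"),
            ("Push", "add"),
            ("Input", "2"),
            ("Push", "equals"),
        ],
    }

    def run_keyword(keyword, argument=None):
        if keyword in USER_KEYWORDS:
            # A user keyword expands into its steps, which may themselves
            # be user keywords.
            for sub_keyword, sub_argument in USER_KEYWORDS[keyword]:
                run_keyword(sub_keyword, sub_argument)
        else:
            HANDLERS[keyword](argument)  # a base keyword handled by a library

    run_keyword("Add One And Two")
    run_keyword("Check", "3")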

6 KEYWORD-DRIVEN TEST AUTOMATION FRAMEWORK

The keyword-driven test automation framework used in this research was developed inside the company where the study took place, and it was called Robot. The ideas and the basic concept of Robot were based on the master’s thesis of Laukkanen (2006). In the following chapters, those functionalities of Robot that are interesting from this thesis’s point of view are briefly explained.

6.1 Keyword-Driven Test Automation Framework

In the keyword-driven test automation framework there are three logical parts: the test data, the test automation framework, and the test libraries. The test data contains directives telling what to do with the associated inputs and expected outputs. The test automation framework contains the functionality to read the test data, run the handlers in the libraries based on the directives in the test data, and handle errors during the test execution. The test automation framework also contains test logging and test reporting functionality. The test libraries are the interface between the framework and the system under test. The libraries can use existing test tools to access the interfaces of the system under test or connect directly to the interfaces. Figure 10 presents the logical structure of Robot.

Figure 10: Logical structure of Robot

6.2 Test Data

In Robot, the test data is in a tabular format, and it can be stored in HTML or TSV files. The test data is divided into four different categories: test cases, keywords, variables, and settings. Each of these data types is defined in its own table in the test data file. Robot recognizes the different tables by the name of the data type in the table’s first header cell.

KEYWORDS AND TEST CASES

In Robot, keywords can be divided into base keywords and user keywords. Base keywords are keywords implemented in the libraries. User keywords are keywords that are defined in the test data by combining base keywords or other user keywords. The ability to create new user keywords in the test data decreases the number of needed base keywords and therefore the amount of programming. User keywords also make it possible to increase the abstraction level of the test cases. In Figure 11, the test cases shown in Figure 8 are modified to use the user keywords Add, Equals and Multiply. The test cases are composed of keywords defined in the second column of the test case table and of arguments defined in the following columns. User keywords are defined in a similar way. In the test case and keyword tables the second column is named Action. This column name can be defined by the user, as it is not used by Robot. The same applies to the rest of the headers.

Figure 11: Test cases and user keywords (Laukkanen 2006)
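As a rough plain-text illustration of the tabular format described above, a test case table could look like the following sketch, with the first header cell naming the table type and the second column holding the keyword to execute. The test case names and argument values are invented for this sketch, and it is not a reproduction of Figure 11.

    Test Case          Action     Argument   Argument
    Simple Addition    Add        1          2
                       Equals     3
    Multiplication     Multiply   2          5
                       Equals     10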

VARIABLES AND SETTINGS

It is possible to define variables in the Robot framework. Variables increase the maintainability of the test data, because some changes only require updating the variable values. In some cases variables can contain test environment specific data, like hostnames. In these cases variables make it easier to use the same test cases in different environments with minimal extra effort. There are two types of variables in Robot. A scalar variable contains one value, which can be anything from a simple string to an object. A list variable contains multiple items. Figure 12 contains a scalar variable ${GREETING} and a list variable @{ITEMS}.

Figure 12: Variable table containing scalar and list variables

The settings table is similar to the variable table. The name of the setting is defined in the first column and the value or values in the following columns. The settings are predefined in Robot. Examples of settings are Library and Resource. The Library setting is used to import a library which contains the needed base keywords. The Resource setting is used to import resource files. Resource files are used to define user keywords and variables in one place.
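In the same illustrative plain-text style, a variable table and a settings table could look like the sketch below. The variable values and the imported library name are invented for this sketch.

    Variable       Value         Value
    ${GREETING}    Hello world
    @{ITEMS}       first item    second item

    Setting        Value
    Library        CalculatorLibrary
    Resource       resource.html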

GROUPING TEST CASES

There are two ways of grouping test cases in Robot. First of all, test cases are grouped hierarchically. A file containing test cases (e.g. Figure 11) is called a test case file, and it forms a test suite. A directory containing one or more test case files, or directories with test case files, also creates a test suite. In other words, the hierarchical grouping is the same as the test data structure in the file system.

The other way to group test cases is based on project-specific agreements. In Robot, it is possible to give the test cases words that are used for grouping them. These words are called tags. Tags can be used to define, for example, which part of the system the test case tests, who has created the test case, whether the test case belongs to the regression tests, and whether it takes a long time to execute.


6.3 Test Execution

In Robot, the test execution is started from the command line. The scope of the test execution is defined by giving test suite directories or test case files as inputs. Without parameters, all the test cases in the given test suites are executed. A single test suite or test case can be executed with command line options. It is also possible to include or exclude test cases from the test run based on the tags (see the previous chapter). Command line execution makes it possible to start the test execution at some predefined time. It also enables starting the test execution from continuous integration systems like Cruise Control (Cruise Control 2006).

The test execution result can be pass or fail. By default, if even a single test case fails, the test execution result is a failure. To allow a successful test execution even with failing test cases, Robot contains a feature called critical tests. The test execution result is a failure only if any of the critical test cases fails; in other words, the test execution is considered successful even though non-critical test cases fail. The critical test cases are defined when starting the execution from the command line. For example, regression can be defined as a critical tag, and then all the test cases that contain the tag regression are handled as critical tests. This functionality allows adding test cases to the test execution even when they are still failing, without the overall result becoming a failure. This is needed if the test case or the feature is not yet ready; such test cases are simply not marked as critical.
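The tag-based selection and the critical test logic described above could be sketched in Python roughly as follows. The data structures and the function are assumptions made for this illustration, not Robot’s actual implementation.

    def execution_result(test_cases, include=None, exclude=None, critical_tags=None):
        """test_cases is a list of dicts with 'name', 'tags' and 'passed' keys."""
        selected = [t for t in test_cases
                    if (not include or set(include) & set(t["tags"]))
                    and not (exclude and set(exclude) & set(t["tags"]))]
        # By default every selected test case is critical; otherwise only the
        # test cases carrying one of the critical tags are.
        critical = [t for t in selected
                    if not critical_tags or set(critical_tags) & set(t["tags"])]
        status = "PASS" if all(t["passed"] for t in critical) else "FAIL"
        return status, selected

    tests = [
        {"name": "Old feature", "tags": ["regression"], "passed": True},
        {"name": "New feature", "tags": ["sprint-5"], "passed": False},
    ]
    print(execution_result(tests, critical_tags=["regression"])[0])  # PASS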

6.4 Test Reporting

Robot produces a report, a log, and an output from the test execution. The report contains statistics and information based on the executed test suites and tags. It can be used as an information radiator, since its background color shows whether the test execution status was pass or fail. The test log contains more detailed information about the executed keywords, as well as information that can be used to solve problems. The output contains the test execution results presented in an XML format. The report and the log are generated from the output.


7 EXAMPLE OF ACCEPTANCE TEST-DRIVEN DEVELOPMENT WITH KEYWORD-DRIVEN TEST AUTOMATION FRAMEWORK

In this chapter, a simple fictitious example of acceptance test-driven development with the Robot framework is shown. The purpose of this chapter is to help to understand the concept before it is shown in practice, and it also serves as a simple theoretical example of how the concept could work. However, first the relations between user stories, test cases, and keywords are briefly explained.

7.1 Test Data between User Stories and System under Test

As was described in Chapter 4.3, user stories are short descriptions representing the customer requirements and used for planning. Different levels of test data are needed to map the user stories to the actual code interacting with the system under test. These levels and their interdependence are shown in Figure 13. First of all, a user story is mapped to one or multiple test cases. Every test case contains one or more sentence format keywords. A sentence format keyword is a user keyword written in plain text, possibly containing some input or expected output values but no separate arguments. When the test cases contain only sentence format keywords, they can be understood without technical skills. Every sentence format keyword consists of one or more base or user keywords, and a user keyword in turn includes one or more base or user keywords. Finally, the base keywords contain the code which controls the system under test. The examples in the following chapters clarify the use of the different types of keywords presented above.

Figure 13: Mapping from user story to the system under test


7.2 User Stories

The customer in this example is a person who handles registrations to different kinds of events. People usually enroll in the events by email or by phone, and therefore the customer needs an application in which to save the registrations. The customer has requested a desktop application that has a graphical user interface. The customer has defined the following user stories:

1. As a registration handler I want to add registrations and see all the registrations so that I can keep count of the registrations and later contact the registered people.

2. As a registration handler I want to delete one or multiple registrations so that I can remove the canceled registration(s).

3. As a registration handler I want to have the count of the registrations so that I can notice when there is no longer room for new registrations.

4. As a registration handler I want to save registrations persistently so that I do not lose the registrations even if my computer crashes.

7.3 Defining Acceptance Tests

Before the stories can be implemented, there is a need to discuss and clarify the hidden assumptions behind the stories. The details arising from this collaboration can be captured as acceptance tests. As was mentioned in Chapter 4.3, it varies when this collaboration takes place and who participates in it. Because those issues are more a matter of the process and the people available than of the tool used, they are not taken into account in this example.

The discussion about the user stories between the customer and the development team can lead to the acceptance tests shown in Figure 14. The test cases are in a format that can be used as input for Robot. Test cases can be written directly in this format using empty templates. However, it might be easier to discuss the user stories and write drafts of the test cases on a flip chart during the conversation. After the sketches of the test cases have been made, they can easily be converted to the digital format.


Figure 14: Some acceptance test cases for the registration application

While discussing the details of the user stories and the test cases, an outline of the user interface can be drawn. The outline in Figure 15 could be the result of the session where the test cases were created, and it can be used as a starting point for the implementation. In the picture, names for the user interface elements are also defined. These are implementation details that have to be agreed on if different persons are writing the test cases and implementing the application.


Figure 15: Sketch of the registration application

7.4 Implementing Acceptance Tests and Application

After the acceptance tests are defined, it should be clear to all the stakeholders what is going to be implemented. If pure acceptance test-driven development is used, the test cases are implemented on a detailed level before the implementation of the application can be started. In this example, the implementation of the test case User Can Add Registrations is described on a detailed level.

CREATING THE TEST CASE “USER CAN ADD REGISTRATIONS”

The test case User Can Add Registrations contains three sentence format keywords, as can be seen in Figure 16. The creation of the test case starts with defining those sentence format keywords. To keep the actual test case file as simple as possible, the sentence format keywords are defined in a separate resource file. The keywords defined in the resource file have to be taken into use by importing the resource file in the setting table. Because the test case starts with a sentence format keyword which launches the application, the application has to be closed at the end of the test case. This can be done in the test case or with a Test post condition setting. These two settings are shown in Figure 17.

Figure 16: Test case “User Can Add Registrations”

Figure 17: Settings for all test cases
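Based on the description above and in the following paragraphs, the test case table and the setting table could look roughly like the plain-text sketch below. The layout and exact wording are approximations made for illustration, not a reproduction of Figures 16 and 17.

    Test Case                   Action
    User Can Add Registrations  Application is started and there are no registrations in the database
                                User adds three people
                                All three people should be shown in the application and should exist in the database

    Setting              Value
    Resource             atdd_keyword.html
    Test post condition  User closes registration application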

Figure 18 shows variables and user keywords defined in the atdd_keyword.html resource file. List<br />

variables @{person1}, @{person2} and @{person3} are described in the variable table. The comments<br />

Name and Email are used to clarify the meaning of the different columns. These variables are used in<br />

the sentence format keywords created in the keyword table. Application is started and there are no<br />

registrations in the database keyword contains two user keywords. The first keyword Clear database<br />

makes sure there are not users in the database when the application is started. The second keyword<br />

User launches registration application launches the registration application. The next two user keywords<br />

User adds three people and all three people should be shown in the application and should exist<br />

in the database repeat the same user keyword <strong>with</strong> the different person variables described in the variable<br />

table. These user keywords are not using base keywords from the libraries, and therefore the test<br />

case is not accessing the system under test at this level. The user keywords used to create the sentence<br />

format keywords can be defined in the same resource file or in other resource files. The missing user<br />

keywords are defined in resource file resource.html.<br />

Figure 18: Variables and user keywords for test case “User Can Add Registrations”
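A corresponding sketch of the atdd_keyword.html resource file is given below, again in the plain-text notation. The keyword structure follows the description above; the person names and email addresses are invented placeholders, since the actual values are visible only in the figure.

    *** Variables ***
    # Columns: Name, Email (placeholder values)
    @{person1}    Anna Example     anna.example@example.com
    @{person2}    Bob Example      bob.example@example.com
    @{person3}    Carol Example    carol.example@example.com

    *** Keywords ***
    Application is started and there are no registrations in the database
        Clear database
        User launches registration application

    User adds three people
        User adds registration    @{person1}
        User adds registration    @{person2}
        User adds registration    @{person3}

    All three people should be shown in the application and should exist in the database
        Registration should be shown in the application and should exist in the database    @{person1}
        Registration should be shown in the application and should exist in the database    @{person2}
        Registration should be shown in the application and should exist in the database    @{person3}

Passing a list variable such as @{person1} to a keyword expands it into its items, so User adds registration receives the name and the email as two separate arguments.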

Figure 19 describes the user keywords in the atdd_resource.html resource file. The base keywords needed by these user keywords are imported from the SwingLibrary and OperatingSystem test libraries in the settings table. The SwingLibrary contains base keywords for handling the graphical user interface of applications made with Java Swing technology. The OperatingSystem library is a part of Robot, and it contains base keywords, for example, for handling files (like Get file) and environment variables, and for running system commands. If there are no existing libraries for the technologies the system under test is implemented with, or some needed base keywords are missing from an existing library, the missing keywords must naturally be implemented.

Figure 19: User keywords using the base keywords

User launches registration application uses the Launch base keyword with two arguments: the main method of the application and the title of the application to be opened. Both of these arguments have been defined in the variables table as scalar variables. User Closes Registration Application uses the Close base keyword, which simply closes the launched application. Clear Database consists of the base keyword Remove file, which removes the database file from the file system. The ${DATABASE} variable contains the path to the database.txt file, which is used as a database by the registration application. The ${CURDIR} and ${/} variables are Robot's built-in variables: ${CURDIR} is the directory where the resource file is located, and ${/} is a path separator character which is resolved based on the operating system.



The User adds registration keyword takes two arguments, ${name} and ${email}, and it consists of the Clear text field, Insert into text field and Push button base keywords. All these keywords take the identifier of the element as their first argument. These identifiers were agreed on in the discussion and can be seen in Figure 15. The ${name} and ${email} arguments are entered into the corresponding text fields with the Insert into text field keyword. In the Registration should be shown in the application and should exist in the database user keyword, the List value should exist base keyword is used to check that the name and email are in the list shown in the application. The Get file base keyword is used to read the data from the database into the ${data} variable, and the Contains base keyword is used to check that the database contains the name and email pair.
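The lower-level resource file could then look roughly like the sketch below. The library and base keyword names follow the thesis text, but the widget identifiers (name_field, email_field, add_button, registration_list), the placeholder values of the scalar variables, and the exact argument formats of the base keywords are invented for illustration; the real identifiers were the ones agreed on in Figure 15.

    *** Settings ***
    Library           SwingLibrary
    Library           OperatingSystem

    *** Variables ***
    ${DATABASE}       ${CURDIR}${/}database.txt
    ${MAIN CLASS}     registration.RegistrationApplication    # placeholder
    ${APP TITLE}      Registration Application                # placeholder

    *** Keywords ***
    User launches registration application
        Launch    ${MAIN CLASS}    ${APP TITLE}

    User Closes Registration Application
        Close

    Clear database
        Remove File    ${DATABASE}

    User adds registration
        [Arguments]    ${name}    ${email}
        Clear Text Field          name_field
        Insert Into Text Field    name_field     ${name}
        Clear Text Field          email_field
        Insert Into Text Field    email_field    ${email}
        Push Button               add_button

    Registration should be shown in the application and should exist in the database
        [Arguments]    ${name}    ${email}
        List Value Should Exist    registration_list    ${name} ${email}
        ${data}=    Get File    ${DATABASE}
        Contains    ${data}    ${name} ${email}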

EXECUTING THE TESTS

The team has made an agreement that all test cases that should pass will be tagged with a regression tag. When the first version of the application is available, the created test cases can be executed. At this stage none of the test cases are tagged with the regression tag. The result of this first test execution can be seen in Figure 20. Four of the eleven acceptance test cases passed. The passing test cases can now be tagged as regression tests. Figure 21 shows one of the passing test cases tagged with the tag regression. When the test cases are executed the next time, there will be four critical test cases. If any of those test cases fail, the test execution result will be a failure and the report will turn red.
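The tagging could look like the following sketch, where the [Tags] setting marks a test case that is already passing as part of the regression set.

    *** Test Cases ***
    User Can Add Registrations
        [Tags]    regression
        Application is started and there are no registrations in the database
        User adds three people
        All three people should be shown in the application and should exist in the database

At execution time, only the regression-tagged test cases are then treated as critical. In the later open-source Robot Framework this kind of tag-based selection is done with command-line options such as --include and, in versions before 4.0, --critical; the exact mechanism available in the tool used in the Project is described in Chapter 6.2.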

Figure 20: First test execution

Figure 21: Acceptance test case tagged with tag regression

The next time the application is updated, the test cases are executed. Again, all passing test cases can be tagged with the regression tag. At some point all the test cases will pass, the features are ready, and the following items can be taken into development. New acceptance test cases are defined, and the development can start. In case old functionality is changed, the test cases have to be updated and the regression tags have to be removed.

8 ELABORATED GOALS OF THE THESIS

In this chapter the aim of this thesis is described on a more detailed level. First the scope is defined. Then the actual research questions are presented.

8.1 Scope

As was seen in the previous chapters, the field of software testing is very wide. In this thesis the focus is on acceptance test-driven development. It is important to distinguish between the traditional acceptance test level and the agile acceptance test level; in the context of this thesis the term acceptance testing refers to the latter. Other testing areas are excluded from the scope of this master's thesis. The excluded testing areas are non-functional testing, static testing, unit testing, and integration testing. Manual acceptance testing as such is also out of scope, although it may be mentioned in some cases.

The different aspects and generations of test automation were explained in Chapter 6. This thesis concentrates on the large-scale keyword-driven test automation framework called Robot. The following aspects of test automation are included in the scope of this thesis: creating the automated acceptance test cases, executing the automated acceptance test cases, and reporting the test execution results.

8.2 Research Questions

The main aim of this thesis is to study how the keyword-driven test automation technique can be used in acceptance test-driven development. The study is done in a real-life software development project, and therefore another aim is to give an example of how a keyword-driven test automation framework was used in this specific case and to describe all the noticed benefits and drawbacks. The research questions can be stated as:

1. Can the keyword-driven test automation framework be used in acceptance test-driven development?

2. How is the keyword-driven test automation framework used in acceptance test-driven development in the project under study?

3. Does acceptance test-driven development with the keyword-driven test automation framework provide any benefits? What are the challenges and drawbacks?



The first question can be divided into the following more detailed questions:

1. Is it possible to write the acceptance tests before the implementation with the keyword-driven test automation framework?

2. Is it possible to write the acceptance tests in a format that can be understood without technical competence with the keyword-driven test automation framework?

The second question can be divided into the following parts:

1. How, when, and by whom are the acceptance test cases planned?

2. How, when, and by whom are the acceptance test cases implemented?

3. How, when, and by whom are the acceptance test cases executed?

4. How and by whom are the acceptance test results reported?

The third research question can be evaluated against the promises and challenges of acceptance test-driven development shown in Table 1 and Table 2 in Chapter 4.3.

9 RESEARCH SUBJECT AND METHOD

The purpose of this chapter is to explain where and how this research was done. First, the case project and the product developed in it are described at the level needed to understand the context where the research took place. Then the research method and the data collection methods used are described.

9.1 Case Project

This research was conducted in a software project at Nokia Siemens Networks, referred to as the Project from now on. The Project was located in Espoo. The Project consisted of two scrum teams, each with approximately ten persons. In addition to the teams, the Project had a product owner, a project manager, a software architect and half a dozen specialists working as feature owners. Feature owner meant the same as feature champion (see Chapter 3.3). There were also several supporting functions, such as a test laboratory team. Several nationalities were represented in the Project.

The software product developed in the Project was a network optimization tool, referred to as the Product from now on. The Product and its predecessors had been developed for almost five years. The Product is bespoke software aimed at mobile network operators. The Project was started in June 2006, and the planned end was December 2007. The Product was a desktop application used through a graphical user interface developed with Java Swing technology.

9.2 Research Method

The Project under study had been decided on before the actual research method was chosen. When the role of the researcher became clear, there were two qualitative approaches to select from: case study and action research. It was clear from the beginning that the researcher would be highly involved with the Project under research. This high involvement prevented choosing a case study as the research method; action research was more suitable. Unlike other research methods where the researcher seeks to study organizational phenomena but not to change them, the action researcher is concerned with creating organizational changes and simultaneously studying the process (Babüroglu & Ravn 1992). This describes the situation of this research quite well: the researcher was participating, giving training and helping to define the actions that would change the existing process.

When the research method was chosen, it was also kept in mind that one purpose of the research was to try out acceptance test-driven development in practice. There was a demand for a method that would enable a practical approach to the problem. Avison et al. (1999) define that action research combines theory and practice (and researchers and practitioners) through change and reflection in an immediate problematic situation within a mutually acceptable ethical framework. This was another reason why action research was chosen as the method for this research.
reason why action research was chosen to be the method for this research.<br />

According to Avison et al. (1999), action research is an iterative process involving researchers and practitioners acting together on a particular cycle of activities, including problem diagnosis, action intervention, and reflective learning. The iterative process of action research fit well with the iterative process of Scrum. The research iteration length was chosen to be the same as the length of the Scrum iterations. Figure 22 shows how these two processes were synchronized. With this arrangement the research cycle was quite short, but it also helped to concentrate on small steps in changing the process and to prioritize the most important steps.

Figure 22: Action research activities and the Scrum process

A management decision to increase the amount of automated testing was made before the research project started. This decision was also a trigger for starting this research. Stringer (1996) mentions that programs and projects begun on the basis of the decisions and definitions of authority figures have a high probability of failure. This was taken into account at the beginning of the research and led to a different starting phase than the one defined by Stringer (1996), in which the problems are defined first and the scope and actions are derived from that problem definition. Because the goal was already defined, the research started with collecting data about the environment and implementing the new acceptance test-driven development process. Otherwise, the action research method defined in Stringer (1996) was followed.

9.3 Data Collection

There were two purposes for the data collection. The first purpose was to collect data about problems and benefits that individual project members encountered and noticed during the Project. The other purpose was to record the agreed implementation of acceptance test-driven development and to observe how this agreement was actually implemented. The latter was even more important, as Avison et al. (1999) mention that in action research the emphasis is more on what practitioners do than on what they say they do.

The data was collected through observations, informal conversations, semi-formal interviews, and by collecting meaningful emails and documents. The data was collected during a four-month period from the beginning of January 2007 to the end of April 2007. The researcher worked in the Project as a test automation engineer. The observations and the informal conversations were conducted while working in the Project. One continuous method to collect relevant issues was recording the issues raised in the daily scrum meetings.

The initial information collection was based mainly on informal discussions, but a few informal interviews were also used. The main purpose of the initial information collection was to build an overall understanding of the Project and a deep understanding of the testing in the Project. This was done by asking questions about the software processes used, the software development and testing practices, and the problems encountered with them. Some interviews also contained questions about the Project's history.

The final interviews were semi-formal interviews, meaning that the main questions were pre-defined but questions derived from the discussion were also asked. Nine persons were interviewed. The interviewees consisted of two developers, two test engineers, two feature owners/usability specialists, one feature owner, one scrum master and one specification engineer. All these persons had participated more or less in developing features with ATDD. The final interviews at the end of the research focused more on the influences of acceptance test-driven development on different aspects of software development. Appendix B contains the questions asked in the final interviews. The interview questions were asked in the order presented in the appendix, and the objective was to lead the respondents' answers as little as possible. Clarifying questions were asked to get the reasoning behind the answers. The interviews were both noted down and tape-recorded.

10 ACCEPTANCE TEST-DRIVEN DEVELOPMENT WITH KEYWORD-DRIVEN TEST AUTOMATION FRAMEWORK IN THE PROJECT UNDER STUDY

This chapter describes what was done in the Project where acceptance test-driven development was tried out. The emphasis is on issues that are relevant from the acceptance test-driven development point of view. First the development model and practices used in the Project are described; the case project itself was described in Chapter 9.1. Then it is illustrated how the keyword-driven test automation framework was used in the Project, with emphasis on the four areas mentioned in the second research question in Chapter 8.2. At the end of this chapter the results of the final interviews are presented.

10.1 Development Model and Development Practices Used in the Project

The development process used in the Project was Scrum. Scrum was introduced and taken into use at the beginning of the Project, which meant that the adjustment to the process was still ongoing at the time of the research. There were also some differences compared to Scrum as presented in Chapter 3.3. The biggest difference was the format of the product backlog. The main requirement types in the Project were the requirements defined in the requirement specifications and workflows. A workflow contained all the steps that a user could perform with the functionality. The workflow was a high-level use case containing multiple steps that were related to each other. These steps were divided into mandatory and optional steps. Every step in the workflow could be seen as a substitute for an item in the product backlog.

As was mentioned in Chapter 3.3, Scrum does not define development practices other than the daily build. In the Project continuous integration was used. There were no rules defining which development practices should be used during the Project. Extreme programming practices like refactoring were used from time to time by the development team. The developers created unit tests, and there were targets for the unit testing coverage. However, the unit tests were not done using test-driven development. The main details of the features were written down in feature descriptions, which were short verbal descriptions of the features. During the research project, the division of testing into automated acceptance testing, automated unit testing and manual exploratory testing was taken into use (see Chapter 4.2).

The test automation with Robot was started in September 2006. At the beginning of the Project the automated test cases were created for the already existing functionality. This automation task was done by a separate test automation team. At the time the research was started, automated test cases covered most of the basic functionality. This meant that the library for accessing the graphical user interface of the Product had already been developed for some time, and it included base keywords for most of the Java Swing components. At this stage there was a desire to create the automated test cases for features during the same sprint. To make this possible, acceptance test-driven development was taken into use.

10.2 January Sprint

In the first research sprint the goal was to start acceptance test-driven development with a few new features. At first it was problematic to find features to be developed with acceptance test-driven development, as part of the implementation was a follow-on to the implementation of the previous sprints. Such features were seen as problematic to start with. Some of the new features needed internal models, and while being developed, they could not be tested through the user interface. Finally, one new feature was chosen as the starting point: map layer handling. Map layer handling is used to load backgrounds into the map view of the Product. Network elements and information about the network are shown on the map view.

As mentioned, there was a separate team for test automation when the research started. To be able to work better with the scrum teams, the test automation team members started working as members of the scrum teams. This was done at the beginning of the sprint.

PLANNING

The test planning meeting for the map layer handling feature was arranged by a test engineer. It took place in the middle of the sprint, before the developer started implementing the feature. The participants of the meeting were a usability expert/feature owner, a developer, a test engineer and a test automation engineer.

The meeting started with a general discussion about the feature to be implemented, and the developer drew a sketch of the user interface he had in mind. After the initial sketch, the group started to think about the test cases: how the user could use the feature, and what kind of error situations should be handled. The sketch was updated based on the noticed needs. The test engineers wrote down test cases whenever they were agreed on. During the discussions some important decisions were made about supported file formats and supported graphic types. At the end of the meeting, the agreed test cases were gone through to make sure that all of them had been written down. At this phase the test cases were not written in any formal format.

IMPLEMENTATION

The test case implementation started by writing the test cases agreed on in the planning meeting into the tabular format. At the same time, the developer started the development. Figure 23 contains some of the initial test cases. The highest level of abstraction was not used in these test cases, and therefore they consist of lower-level user keywords with short names and variables. These test cases resemble the test cases the test automation team had implemented earlier more than the test cases shown in the example in Chapter 7.2 and Figure 13.

Figure 23: Some of the initial acceptance test cases for map layer handling
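To illustrate the difference in style, a lower-level test case of this kind might look roughly like the following sketch; every keyword name, variable and value below is an invented placeholder, and the actual test cases are the ones shown in Figure 23.

    *** Test Cases ***
    Add Raster Layer
        Open Layer Dialog
        Add Layer                  ${RASTER FILE}
        Layer Should Be Visible    ${LAYER NAME}
        Close Layer Dialog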

The implementation of the user keywords started after the test cases were written. There was a need to implement multiple base keywords even though the library had been under development for some months. Fortunately, the test automation engineer had time to create these needed keywords. At this stage the identifiers needed to select the correct widgets from the user interface were replaced with variables. The variable values were set when the developer had written the identifiers into the code and emailed them to the test engineer.

From the beginning it was clear that verifying that the map layers are drawn and shown correctly on the map would be hard to automate. It was not seen sensible to create a library for verifying the correctness of the map, and therefore a substitute solution was created. The solution was to take screenshots and combine them with instructions defining what should be checked in the picture. This led to manual verification, but doing that from time to time was not seen as a problem.
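Such a semi-automated check could be expressed along the lines of the sketch below. The keyword name and the instruction text are hypothetical, and the sketch assumes that a screenshot keyword is available (the open-source Robot Framework ships a Screenshot library with a Take Screenshot keyword; the thesis does not state which mechanism the Project actually used).

    *** Settings ***
    Library    Screenshot

    *** Keywords ***
    Map Should Show Layer Correctly
        [Arguments]    ${layer}
        # The screenshot ends up in the test report; a human checks it afterwards.
        Take Screenshot
        Log    MANUAL CHECK: verify from the screenshot that layer '${layer}' is drawn correctly on the map.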

One of the tested features was changing the colors of the map layers. A base keyword for changing this color was created, but when it was tried out, it did not work. After the problem was investigated by the developer and the test automation engineer, the base keyword implementation was found to be incorrect. However, the changes made to the base keyword did not correct the problem, and one more problem was noticed in the application. These problems were technical test automation problems. The investigations took some time, and the color changing functionality could not be tested by automation in this sprint. Some parts of the feature were also not fully implemented, and those were moved to the next sprint.

TEST EXECUTION

The test cases were executed on the test engineer's and the test automation engineer's workstations during the test case implementation phase. There were problems getting a working build during the sprint, which slowed down verifying that the test cases, and especially the implemented base keywords, were working. During this phase the problems in the test cases were corrected and defects were reported to the developer.

REPORTING

The Project had one dedicated workstation for executing the automated acceptance tests after the continuous integration system had successfully built the application. The web page showing a report of the latest acceptance tests was visible on a monitor situated in the project area. The test cases created during the sprint were added to an automatic regression test set at the end of the sprint. Tests that were passing at the end of the sprint were marked as regression tests with Robot's tagging functionality (see Chapter 6.2). The ability to define the critical test cases based on tags made it possible to execute all the tests even when some test cases and features were not working.

10.3 February Sprint

In the second sprint the goal was to finalize the test cases for the map layer functionality and to start ATDD with further functionality. The functionality selected was the visualization of the Abis configuration. The purpose of the feature was to collect data from multiple network elements and show the Abis configuration based on the collected data.

PLANNING

Immediately after the sprint planning, the people involved in developing the visualization of the Abis configuration held a meeting about the details of the feature. The participants were a feature owner, a specification person, two developers, a test engineer, a test automation engineer, a usability expert and a scrum master. The usability specialist had developed a prototype showing how the functionality should look. Using this prototype as a starting point, the team discussed different aspects of the feature and asked clarifying questions. The test automation engineer made notes during the meeting.

IMPLEMENTATION

Based on the issues agreed on in the meeting, the initial test cases were created and sent by email to all the participants. The test cases were created on a high level to make them more understandable; they can be seen in Figure 24.

Figure 24: Initial acceptance test cases for the Abis configuration

After the test cases were described, the needed keywords were implemented. Figure 25 contains the implementation of the sentence format keywords. The variables used in the keywords were defined in the same file as the keywords. As can be seen, the User opens and closes Abis dialog from navigator user keyword was used by multiple keywords; its implementation is shown in Figure 26. The user keywords used to implement the User opens and closes Abis dialog from navigator user keyword consist of both user keywords and base keywords.

Figure 25: The highest level user keywords used to map the sentences to user keywords and variables

Figure 26: Lower level user keywords “User opens and closes Abis dialog from navigator” implementation

Again, more base keywords were needed. However, these base keywords were not implemented in the SwingLibrary. Instead, there was a need to implement a helper library to handle the data that was checked from the configuration table. The configuration table contained 128 cells, and the content of every cell needed to be verified. The tabular test data format made it possible to describe the expected output in almost the same format as it was seen in the application. However, the expected outcome could not be defined beforehand. The input for the feature was configuration data from a mobile network, and in this context it was hard to create all the needed network data in a way that the expected outcome would be known and the data would be correct. In the test cases existing test data was used, and the configuration view to be tested automatically was selected from the available alternatives in the existing test data.
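The idea of writing the expected table content into the test data in roughly the same layout as it appears in the application could look like the sketch below. Both keywords and all cell values are purely illustrative placeholders; in the Project the actual comparison was implemented in the separate helper library.

    *** Test Cases ***
    Abis Configuration Is Shown Correctly
        # Both keywords and all cell values below are illustrative placeholders.
        User opens Abis configuration view
        Configuration Table Should Contain
        ...    row1-col1    row1-col2    row1-col3
        ...    row2-col1    row2-col2    row2-col3
        ...    row3-col1    row3-col2    row3-col3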

Soon after the midpoint of the sprint there was a meeting where the status of the visualization of the Abis configuration feature was checked. The feature was used by a few specialists while the scrum team responded to the questions raised and wrote down observations. Based on this meeting and some other informal discussions, some more details were agreed to be done in the sprint. Figure 27 contains some of the test cases which were added and updated after the meeting. The changes were marked with bold text to highlight them.

Figure 27: Some of the added and updated test cases

As was mentioned earlier, there was no easy way to test the map component automatically. However, one of the acceptance test cases was supposed to test that the Abis view can be opened from the map. It was not seen possible to automate this test case with a reasonable effort. The test case was still written down and tagged with the tag manual. The manual tag made it possible to see all the acceptance test cases that had to be executed manually. Another challenge was keeping the test cases in sync with the implementation, because the details were changed a few times.

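Such a manually executed acceptance test case could be recorded along the lines of the following sketch, using the [Tags] and [Documentation] settings; the test name and steps are invented for illustration.

    *** Test Cases ***
    User Can Open Abis View From Map
        [Tags]             manual
        [Documentation]    Executed manually: open the map view, select a network
        ...                element and verify that the Abis view opens from it.
        No Operation    # placeholder step, the actual checks are done by hand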
The test case User Can See The Relation Between TRX And DAP, visible in Figure 27, was one of the test cases added in the middle of the sprint. The implementation of the test case could not be finished during the sprint. The exact implementation of the feature was changed a few times, and the test case was not implemented before the implementation details were final. The feature was ready just before the sprint ended, and there was no time to finalize the test case. This was because the final details were decided so late and because different people were implementing the feature and the test case.

The problems in the map layer handling feature and its base keywords were discussed during the sprint. Some changes to the map layer handling functionality were agreed on. These changes were mainly functional changes to solve the problems with the feature itself. The acceptance test cases needed updates due to these changes. When the test cases were updated, they were changed to include only sentence format keywords. Some of the new test cases can be seen in Figure 28. The change was quite easy because most of the keywords were already ready and the mapping from sentence format keywords to user keywords and variables was very straightforward.

Figure 28: Some of the updated acceptance test cases for the map layer handling functionality

The problem with implementing the base keyword in the previous sprint was solved soon after the test cases were updated. There was also a need to implement one base keyword, and again there were some small technical problems. The problem was again with a custom component, but this time it was solved quite quickly. Some challenges in the implementation, together with implementing functionality with a higher priority, took so much time that the map layer handling functionality was not ready at the end of the sprint, and a few nasty defects remained open.

TEST EXECUTION

The test cases were executed by the test automation engineer while developing them, similarly to the previous sprint. During the sprint there were still problems with the builds. This made it harder to check whether the test cases were working and when some of the features were ready. During this sprint the Abis configuration test cases found a defect in a feature which had already worked.

REPORTING

The reporting was done in the same way as in the previous sprint.

10.4 March Sprint

During the previous two sprints it was seen that too much of the responsibility for test automation lay with the test automation team. It was decided to spread the knowledge more widely across the whole team, which meant arranging training during the sprint. The purpose was to continue the ATDD research with other new functionality. However, some of the team had to participate in a maintenance project during the sprint, and the sprint content was heavily reduced.

PLANNING

The team had agreed that the details of the new functionality should be agreed on at a more detailed level in the sprint planning. Therefore the team and the feature owner discussed in detail what should be implemented in the sprint. All the details could not be covered in the first planning meeting, and thus a second meeting was arranged. The participants of the second meeting were the feature owner, two developers, a usability expert/feature owner, a test engineer and a test automation engineer. The functionalities were gone through, and the details were discussed. Agreements about the implementation details were made and noted down.

IMPLEMENTATION

The test automation engineer was responsible for arranging the training, and therefore the test cases were not implemented at the beginning of the sprint. After the training, a developer and the test automation engineer implemented the test cases which had not been finished during the previous sprint. At this point the contents of the current sprint were reduced, and all the functionality planned in the second planning meeting was moved to the following sprint. The initial test cases were still created before the sprint ended, and some of them can be seen in Figure 29.

Figure 29: Initial test cases for the Abis rule

TEST EXECUTION

The test cases that were implemented by the developer and test automation engineer were added to the automated test execution system immediately after they were ready. All test cases created during the previous sprints were already there.

10.5 April Sprint

The goal in the April sprint was to continue ATDD with the Abis analysis functionality. There were some big changes at the beginning of the sprint. The Abis analysis workflow was supposed to be ready at the end of the sprint, which led to combining the two teams into one big sprint team. The team that had not worked with the Abis analysis earlier needed an introduction to the functionality. The big team made it impossible to go into enough detail for the acceptance tests to be updated during the sprint planning.

PLANNING

As mentioned in the previous chapter, the initial acceptance test cases had been created during the earlier sprint. After the sprint planning, the feature owner, the specification engineer and the test automation engineer went through the initial test cases and updated them. Some of the details still remained open, as the feature owner worked them out later in the sprint. After the test cases were updated, they were sent to the whole team.

IMPLEMENTATION

The implementation started immediately after the acceptance test cases were updated. The test automation engineer wrote the test cases. After some of the sentence format keywords were implemented, one step needed clarification. The test automation engineer invited two usability specialists/feature owners and a specification/test engineer to a meeting where the different options to solve the usability problem were discussed. After all the options were evaluated, the test automation engineer discussed possible solutions with the developer and the software architect. The changes were agreed to be implemented, and three developers, the usability expert/feature owner and the test automation engineer planned and agreed on the details of the feature. Based on the agreed details the test automation engineer created the acceptance tests for the new feature. Technically the test cases were created in a similar manner as in the previous sprints.

The acceptance test cases were dependent on each other because every test case was a step in the Abis analysis workflow. This caused some problems, as the first step, getting the needed data into the application, was ready only on the last day of the sprint. Part of the test cases could not be finalized before this data was available, and it was seen as too laborious to calculate all the needed inputs beforehand. One part of the feature could also not be finished during the sprint. Therefore a few test cases were not ready when the sprint ended.
when the sprint ended.<br />

At the end of the sprint, the test engineer and the test automation engineer created some more detailed test cases to test the Abis rule. These test cases tested different variations and checked that the rule result was correct. However, the rule was not working as intended. The developer, the feature owner and the test automation engineer had understood the details differently, which led to a more detailed discussion between these parties. It was even noticed that some of the special cases were not handled correctly. Based on the discussion, the developer and the test automation engineer wrote down all the different situations and mailed them to the feature owner. It was agreed that these kinds of details need acceptance test cases in the coming sprints.

TEST EXECUTION

Some of the test cases were verified in the developers' development environments. One test case was failing, and it was noticed that the feature implementation had to be improved to fulfill the requirements. The developers continued the implementation, and after they thought it was ready, the acceptance test cases were executed again and passed. The feature was then seen as ready. Some other test cases were executed on the test automation engineer's workstation. Some problems and misunderstandings were found, and they were reported to the developers.

REPORTING

The test cases were added to the acceptance test execution environment after they were updated at the beginning of the sprint. The idea was to make the development status visible to all via the acceptance test report. However, all the test cases were failing for most of the sprint, and only a few days before the sprint ended did some of them pass. Even at the end of the sprint, not all of them were passing.

It was also planned to create a running tested features (RTF) diagram from the acceptance test results. However, this idea was discarded because it was seen that it would not give a correct picture of the project's status. Some of the test cases were not acceptance test cases in the sense that they were defined by the test engineers, not by the feature owners. This limitation could have been avoided by using an acceptance tag and including only test cases with this tag in the RTF diagram. An even more important reason for dropping the idea was the fact that the whole project's development was not done in the ATDD manner.

10.6 Interviews

This chapter collects the experiences of the project members involved in the team which developed features with acceptance test-driven development. The interview methods are described in more detail in Chapter 9.3. Altogether nine persons were interviewed, and in this chapter the results are briefly described. The results of the interviews are analyzed in more detail in Chapter 11.

CHANGES IN THE SOFTWARE DEVELOPMENT

The interviewees thought that the biggest change due to the use of ATDD had been the increased understanding of the details and workflow in the whole team. One developer thought that ATDD had forced the team to communicate and co-operate. Another developer mentioned that due to ATDD, feedback about the features is obtained faster. The test engineers saw that they were able to influence the developed software more than before.

BENEFITS

The biggest benefit mentioned in the interviews was a better common understanding of the details due to the increased communication, cooperation, and detailed planning. Four interviewees saw that requirements and feature descriptions are more accurate than before. One feature owner had noticed missing details in the requirements while defining the acceptance tests. The developers thought that they knew better what was expected from them, and three other interviewees agreed. Four interviewees saw that the increased understanding of the details had led to doing the right things the first time. Two interviewees thought the acceptance test cases had increased the overall understanding of the workflow. One respondent had noticed improvements in teamwork.

The test engineers thought their early involvement was beneficial because they were able to influence the developed software, ask hard questions and create better test cases due to the increased understanding. One test engineer thought that being in the same cycle with the development is very efficient, because then people remember what they have done and problems can therefore be solved with a smaller effort. One feature owner was of the opinion that the test engineers and developers understand better what to test and how to test it. She also mentioned that the testing now covers a full use case.

Three interviewees mentioned that feedback was obtained much faster than earlier. The early involvement of the test engineers and test automation helped to shorten the feedback loop. One developer saw that the automated user interface testing had improved. One interviewee thought the automated acceptance tests keep the quality at a certain level but do not increase it. Another interviewee was of the opinion that test automation helps to reduce manual regression testing, so that test engineers can concentrate more on complex scenarios and make more use of their domain knowledge.

DRAWBACKS

There were not many drawbacks according to the interviewees. Two interviewees thought that the initial investment in test automation is the biggest disadvantage, and they wondered whether the costs will be covered in the long run. Two interviewees were of the opinion that the extra work needed to rewrite the test cases after possible changes is a problem. One feature owner thought that the time needed to write the initial test cases is also a kind of drawback. Two interviewees speculated that some developers may not like others coming onto their territory. Four interviewees could not find any weaknesses of the same magnitude as the benefits.

CHALLENGES

Test data was seen as the biggest challenge; five respondents mentioned it. Flexible creation of test data and its use in acceptance test cases were considered challenging. Reliable automated testing of algorithms was also seen as problematic. One developer mentioned that testing the map component and other visual issues with automated test cases would be troublesome. Three interviewees thought that there may be challenges with change resistance. The test engineers found that it was difficult to find the right working methods. The increased cooperation increases the need to ask the right questions, and that can also be challenging.

INFLUENCE ON THE RISK OF BUILDING INCORRECT SOFTWARE

There were varying views on how ATDD influences the risk of building incorrect software. Some interviewees saw two risks: the first was building software that does not fulfill the end customer's expectations, and the second was building software that does not fulfill the requirements or the feature owner's expectations. Two persons saw that ATDD does not affect the risk of building incorrect software from the end user's point of view. On the other hand, one test engineer thought that the early involvement of testing may even decrease that risk. Seven interviewees saw that the second risk, not creating the software that has been specified and wanted by the internal customer, had decreased compared to earlier. Increased communication, discussion about the details and an increased common understanding before the implementation were seen as the main reasons. One interviewee thought that if the test cases are incorrect and they are followed too narrowly, the risk may increase. Another response was that if the application is developed too much from the test automation's point of view, the actual application development could suffer.

VISIBILITY OF THE DEVELOPMENT STATUS

The visibility of the development status was not seen to have changed much with the use of ATDD. One individual view was that the automated tests will increase it in the future. Another comment was that breaking the tests into smaller parts and arranging a sprint-specific information radiator could help. The developers thought that merging the acceptance test reports into the build reports would improve the situation.

QUALITY AGREEMENT BETWEEN THE DEVELOPMENT TEAM AND FEATURE OWNERS

Seven interviewees saw the acceptance test cases as an agreement between the development team and the feature owners because the test cases were done in cooperation. However, four of them saw that the agreement is a functional agreement and not a quality agreement; quality was seen as a bigger entity than correct functionality. Two interviewees saw that the agreement had not yet formed.

CONFIDENCE IN THE APPLICATION

In general, the confidence in the application had increased. One developer saw that ATDD had enhanced his confidence in the software because he knew that he was developing the right features. Three other persons also saw that confidence had grown because there was a common understanding of what should be done. Three other interviewees were of the opinion that test automation had built the confidence, mainly because passing automated test cases indicated that the application was working at a certain level. One interviewee saw that the automated test cases increase confidence because she could trust that something was working after it had been shown to be working in the demo. One test engineer saw that the possibility to influence the implementation details had enhanced his confidence in the software.

WHEN PROBLEMS ARE FOUND

Five interviewees thought that problems can be found earlier than without ATDD, and three of them had already experienced that. However, four of them were of the opinion that manual testing and the test engineers' early involvement were the key issues. Two of them also mentioned that co-operation in the early phase can prevent problems from occurring. Four interviewees had not experienced changes, even though one of them hoped that problems could be found faster in the future.

REQUIREMENTS UP-TO-DATENESS

According to the interviewees, the requirements were more up-to-date than before. Seven of the interviewees had seen improvement in the way the requirement specifications and feature descriptions were updated. One feature owner and a specification engineer mentioned that some missing requirements were noticed while creating the test cases. Increased communication between the different roles was also seen to have helped in updating the specifications. One developer and a test engineer thought that if some of the agreed functionality has to be changed during the development, it may not get updated. Two interviewees had not seen any change compared to earlier.

CORRESPONDENCE BETWEEN TEST CASES AND REQUIREMENTS

Seven of the interviewees saw that the test cases and requirements are more in sync than before. The reasons mentioned were cooperation in the test case creation, increased communication, better understanding of the feature, and agreement about the details. Two persons thought that the test cases correspond better to the requirements at the beginning, when the details are agreed on. On the other hand, they thought that changes during the implementation phase may lead to differences between the test cases and requirements. One feature owner/usability expert saw that ATDD does not assure that the test cases and requirements are in sync. He also thought that the test cases cannot replace other specifications; in his opinion, there is not even a need for that.

DEVELOPERS' GOAL

Both developers thought that ATDD had made it easier to focus on the essential issues. One of them thought the acceptance test cases had also increased his understanding of where his code fits into the bigger context. Five persons other than the developers thought that the developers' focus is more on the right features. One interviewee hoped the developers' goal had shifted toward a feature being implemented, tested and documented, not only implemented.

DESIGN OF THE SYSTEM

One developer thought that ATDD had helped in finding the design faster than before. The other developer had not noticed any changes in the design.

REFACTORING CORRECTNESS

The developers found that ATDD had not yet affected the evaluation of refactoring correctness. However, they thought that automated acceptance tests could be used for that later on.

QUALITY OF THE TEST CASES

Most of the interviewees were of the opinion that the quality of the test cases had increased. The following justifications were presented: test cases are created in co-operation, they correspond better to the requirements, they cover the whole workflow, and they are more detailed and executed more often. Some interviewees could not tell if there had been any changes. One developer thought that the acceptance tests done through the graphical user interface had been a huge improvement to the user interface testing. He explained that it had been very troublesome to unit test the user interfaces extensively.

TEST ENGINEERS' ROLE

In general, it was seen that the test engineers' role had broadened due to the use of ATDD. Most of the interviewees mentioned that being a part of the detailed planning had been the biggest change. Other mentioned changes were an increased need to communicate and an increased role in information sharing. The test engineers thought the change had been huge: the ability to influence the details makes the work more rewarding, and the improved knowledge about the expected details makes it possible to test what should be done instead of testing what has been done. One feature owner thought that ATDD had eased the test engineers' tasks because the test cases were defined together.

Four interviewees had noticed the old confrontation between the developers and test engineers starting to decrease due to the increased cooperation. One developer had come to understand better the difficulties in testing, which in turn had changed his view of the test engineers. One developer said he was happy that the communication does not happen only through defect reports.

FORMAT OF THE TEST CASES

All the interviewees thought the test cases are currently in a format which is very easy to understand. The sentence format was seen as very descriptive. However, one developer had noticed some inconsistency between the terminology in the test cases and the requirements specification. A few persons thought that some domain knowledge is still needed to understand the test cases. One test engineer thought the format is much more understandable than the test cases created with traditional test automation tools.

LEVEL OF THE ACCEPTANCE TESTS

The interviewees found it difficult to define at which level the acceptance test cases should be written. One test engineer thought that discussion at the beginning of the sprint may help to write proper acceptance test cases and to avoid duplicating the same tests at the unit testing and acceptance testing levels. Two persons thought that more detailed test cases would need better test data. One of them also mentioned that it will not be possible to test all the combinations, and he doubted the profitability of detailed automated test cases due to the increasing maintenance costs. One specification engineer thought that the acceptance test cases have probably been detailed enough, but more experience is needed to become convinced. The other interviewees did not have any views on this issue.

EASE OF TEST AUTOMATION

Most of the interviewees did not know if ATDD had affected the ease of test automation. One test engineer thought that ATDD helps to plan which test cases to automate and which not to.

IMPROVEMENT IDEAS

The interviewees did not have any common opinion on improvement areas. One interviewee thought that increasing the routine is the most important thing to concentrate on, because the method had been used only for a short time. One feature owner saw that in some areas there is a need for more detailed acceptance tests. She also mentioned that there could be a checkpoint during the sprint where the acceptance test cases are reviewed.

Both developers thought that reporting could be improved to shorten the feedback loop even more. Adding the acceptance test reports to the build reports was seen as a solution. One of the developers thought that the written acceptance test cases could be communicated so that everyone really knows that those test cases exist. One feature owner/usability specialist was of the opinion that splitting the acceptance test cases into smaller parts would help in following the progress within the sprint. He felt that smaller acceptance tests with sprint-specific reporting could be used to improve the visibility for all project members.

One test engineer saw that there is room for improvement in defining and communicating what is tested with manual exploratory tests, automated acceptance tests, and automated unit tests. Two respondents thought that a more specific process description should be created to ease the process adoption if ATDD were taken into wider use. It was also seen that the support of the whole organization is needed for the change.

71


11 ANALYSES OF OBSERVATIONS

In this chapter, the observations made during the study, including the interviews, are analyzed against the research questions presented in Chapter 8.

11.1 Suitability of the Keyword-Driven Test Automation Framework with Acceptance Test-Driven Development

The first research question was: Can the keyword-driven test automation framework be used in acceptance test-driven development? This question was divided into two more specific questions, which are analyzed first. After the specific questions have been covered, the analysis of the actual research question is presented.

IS IT POSSIBLE TO WRITE THE ACCEPTANCE TESTS BEFORE THE IMPLEMENTATION WITH THE KEYWORD-DRIVEN TEST AUTOMATION FRAMEWORK?

In the Project the test cases were written in two phases. The initial test cases were written based on the information gathered from the planning meetings. Writing the initial test cases took place after the planning, and they were usually ready before the developers started implementing the features. Therefore, it can be said that the initial test cases were written before the implementation started. However, it has to be taken into account that the initial test cases were on a high level and there were only between 10 and 25 test cases per sprint. If there had been more test cases, or if the test cases had been on a more detailed level, the result might have been different.

The second phase, implementing the keywords that were needed to map the initial test cases to the system under test, was conducted in parallel with the application development. With some test cases, it was not possible to implement all the keywords before the actual implementation details were decided. There were also difficulties in implementing test cases whose inputs and outputs depended on the features under development, as well as problems in implementing the base keywords. This prevented finalizing some of the test cases during the sprint. Therefore, only some of the acceptance test cases were fully ready before the corresponding feature. It was noticed that the test cases could not always be implemented before the development started, or even before the features were ready. However, the test cases were mainly ready soon after the features.

The reasons behind the test case implementation problems had to be analyzed. The first problem was that the interface between the test cases and the application was changing. It was obviously not possible to implement the test cases before the interface was defined. However, the test cases were not implemented even immediately after the interface was clear. This was because different persons were implementing the test cases and the features. If the same person had implemented both, the test cases could have been created on time. This problem is also partly related to the tool and approach used to automate the test cases. If the interface had been a programmatic interface, the developers would have been forced to create the code needed to map the test cases to the application. In that case, changes in the interface would have been just one person’s responsibility. Therefore, it can be said that the selected interface made this problem possible. To avoid this problem, it is possible to move the test case implementation to the developer or to improve the communication between the person implementing the test cases and the person developing the features.

The second problem was defining the inputs and outputs beforehand. The interviewed project members mentioned that the test data is the biggest challenge in the domain. In the Project, some expected results were calculated for verification purposes. However, in some test cases more data was needed, and it was not considered sensible to calculate all this data only for the sake of a few test cases. These problems can obviously make it hard or even impossible to implement the test cases before developing the features. On the other hand, these problems were not tool specific. It is even possible that in some other context these kinds of problems do not exist or are at least easier to solve. However, if such problems exist, it has to be decided case by case whether the extra effort of implementing the test cases in a test-first manner is worthwhile.

The problems with creating the base keywords were technical. These kinds of problems occur every now and then. It was also noticed that it might be hard to implement the system-specific base keywords without trying them out. There was no specific reason for the problems, and as the knowledge about the library increased, the number of problems decreased. More importantly, all of the problems were eventually solved.

IS IT POSSIBLE TO WRITE THE ACCEPTANCE TESTS IN A FORMAT THAT CAN BE UNDERSTOOD WITHOUT TECHNICAL COMPETENCE WITH THE KEYWORD-DRIVEN TEST AUTOMATION FRAMEWORK?

The acceptance tests were easy for all the project members to understand. The main reason for this was that the acceptance test cases were written using plain-text sentences, in other words sentence format keywords. However, using the sentence format keywords caused an extra cost, as one additional abstraction layer was needed for the test cases. Whenever some inputs were defined in the test cases, they were given as arguments to the keyword implementing the sentence format keyword, which in some cases led to duplicate data. The sentence format keyword was first converted to a user keyword and its argument or arguments, and then the user keyword was mapped to other keywords. Implementing the sentence format keywords usually took only seconds, so the cost was not significant. This was because the keyword-driven test automation framework supported a flexible way of defining user keywords in the test data. Without this functionality, it may be harder to use the sentence format keywords and the cost may be higher. Overall, the clarity gained with the sentence format keywords in the Project was worth the extra effort.
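To illustrate the abstraction layer described above, the following Python sketch shows how a sentence format keyword could be resolved into a user keyword and its arguments, which in turn calls lower-level keywords. It is only an illustration of the idea: the Project's framework defined these mappings in its own test data tables, and the function names and patterns here are hypothetical.

```python
import re

# Lower-level keywords: in the Project these came from the GUI test library;
# here they are simple stand-ins that only print what would be done.
def open_application():
    print("opening application")

def import_network(name):
    print(f"importing network {name!r}")

# A user keyword composed of lower-level keywords. Its argument is supplied
# by the sentence format keyword that maps to it.
def prepare_network(name):
    open_application()
    import_network(name)

# Sentence format keywords: a plain-text sentence is matched against a
# pattern, and the captured words become arguments of the user keyword.
SENTENCE_KEYWORDS = [
    (re.compile(r"The network (\w+) is prepared"), prepare_network),
]

def run_step(sentence):
    """Resolve one sentence format test step and execute it."""
    for pattern, user_keyword in SENTENCE_KEYWORDS:
        match = pattern.fullmatch(sentence)
        if match:
            user_keyword(*match.groups())
            return
    raise ValueError(f"no keyword matches step: {sentence!r}")

run_step("The network Espoo is prepared")
```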

However, there are some doubts about the suitability of sentence format keywords for lower-level test cases, especially if the test cases are created in a data-driven manner and only the inputs and expected outputs vary. In such cases the overhead caused by the extra abstraction layer may become a burden, and it would probably be better to use descriptive keyword names and add comments and column names to increase the readability of the test cases. This is something that needs further research, because the acceptance test cases created in the Project were mainly on a high level.

CAN THE KEYWORD-DRIVEN TEST AUTOMATION FRAMEWORK BE USED IN ACCEPTANCE TEST-DRIVEN DEVELOPMENT?

The answer to this question is ambiguous, and it depends on how strictly acceptance test-driven development is defined. It is clear that the acceptance test cases were not implemented before the development. In the Project it would have been very unprofitable, and probably also impossible, to implement all the test cases in a test-first manner. The strict test-first approach with acceptance test cases may be hard in any environment, and Crispin (2005) has also noticed more problems than benefits with the strict test-first approach. On the other hand, the initial test cases were mainly ready before the development, as was mentioned earlier. Therefore, the acceptance test cases were driving the development by giving a direction and a goal for the sprints. One developer’s comment, “The acceptance test cases really drove the development!”, supports this statement.

However, the test cases created with the keyword-driven test automation framework can be on a very high level due to the ability to create abstraction layers in the test cases. This may lead to a situation where a high-level use case is converted to high-level test cases, and therefore the details are not agreed on and the benefits of ATDD are lost. In the Project some of the test cases were created on such a high level that the problems were noticed only when the test cases were implemented. At least one usability problem was noticed while implementing the test cases. This could have been noticed already in the planning phase with more detailed test cases. On the other hand, the usability problem was solved during the sprint, and without ATDD this problem would have been noticed and corrected much later. Also some misunderstandings noticed at the end of the April sprint could have been avoided with more detailed test cases.

It was also observed that some of the agreed acceptance test cases were not driving the developers’ work as well as they could have. With some features the test automation engineer found problems that could have been avoided if the developers had followed the test cases more strictly. These problems were not major, but some extra implementation was needed to fix them. These situations were possible because the test automation engineer implemented the test cases instead of the developers. There were two reasons why the test automation engineer was implementing the test cases. First, the keyword-driven test automation framework made it possible to implement the test cases with keywords, without programming. The other reason was the interface used to access the Product from the acceptance test cases. Because there was a test library for accessing the graphical user interface of the Product, it was possible to write the test cases without the developers’ continuous involvement. With tools like FIT (Framework for Integrated Test) there is usually a need to implement some feature-specific code between the test cases and the application, so developers are forced to work closely with the test cases. However, with the keyword-driven test automation framework this involvement is not forced by the tool.

Overall, it seems that the keyword-driven test automation framework can be used in acceptance test-driven development if the strict test-first approach is not required. However, there are a few things that are good to keep in mind if the keyword-driven test automation framework is used with ATDD. Creating only high-level test cases should be avoided, because they will not drive the discussion towards the details, which was mentioned as the biggest benefit of ATDD. If different persons are creating the test cases and implementing the application, the communication between these two parties has to be ensured.

11.2 Use of the Keyword-Driven Test Automation Framework with Acceptance Test-Driven Development

The second research question was: How is the keyword-driven test automation framework used in acceptance test-driven development in the project under study? This question was divided into acceptance test case planning, implementation, execution, and reporting. Chapter 10 already answers these questions, but in this chapter the sprints are summarized and analyzed.

HOW, WHEN AND BY WHOM THE ACCEPTANCE TEST CASES WERE PLANNED

There was no formal procedure for defining the acceptance test cases; rather, the test cases were defined on a case-by-case basis. However, in all the cases the implementation details were discussed in a group containing at least a developer, a feature owner, a usability specialist, and a test engineer, and the discussion was noted down in various sketches and notepads. These discussions usually took place soon after the sprint planning and always before the implementation. After the meetings it was mainly the test automation engineer’s task to convert the acceptance test cases to the tabular format used with Robot. In the April sprint the acceptance test cases were updated by a group including a feature owner, a specification engineer, and a test automation engineer.

Quickly noting down the test cases and details in the planning meetings proved to be a good choice. The discussion was not hindered by someone writing out the test cases, and all the participants were really taking part in the conversation. However, there was one drawback with this approach. In a few meetings, some of the details needed to implement the test cases were not discussed, because the issues were not handled systematically. Because these details were later clarified with individual persons, they were not fully understood by the whole team. It was noticed that emailing and having the test cases in the version control system was not enough. Therefore, it would have been beneficial to have some kind of meeting after the test cases were written to check and clarify all the details for all the team members. This was also mentioned by two team members in the final interviews. A similar problem was noticed in the April sprint, when the details were updated without the developers.

HOW, WHEN AND BY WHOM THE ACCEPTANCE TEST CASES WERE IMPLEMENTED

The acceptance test cases were implemented using the sentence format keywords from the February sprint onwards, in a manner similar to the example explained in Chapter 7. The test case implementation took place in parallel with the feature implementation. The test cases were implemented mainly by the test automation engineer, but a test engineer and a developer also implemented some of the test cases.

In addition to the challenges presented earlier in this chapter, there were challenges in keeping the test cases up to date in the February sprint. This problem could have been avoided if the details had been agreed on a more detailed level in the planning meeting. On the other hand, some of the changes were made based on the feedback gained from the meeting arranged with the specialist. These changes would have been very hard to foresee. However, updating the test cases was quite easy because the test cases were created with keywords.

The biggest challenge compared to the simple example presented in Chapter 7 was the increase in test execution time. Starting the application and importing the network data took a considerable amount of time, and it was not desirable to execute those actions in every test case, as the total test execution time would have been multiplied by the number of test cases. It was important to keep the execution time short, as it affected both the duration of the test case implementation and the feedback time in the acceptance test execution system.
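The general pattern for avoiding this kind of repeated cost is to perform the expensive actions once per suite rather than once per test case. The sketch below illustrates the idea with a pytest session-scoped fixture; the Application class, fixture, and test names are invented stand-ins, not the Project's test library or keywords.

```python
# test_acceptance.py -- illustrative only; the Application class and the
# test names are hypothetical stand-ins for the Project's test library.
import pytest

class Application:
    def __init__(self):
        print("starting application (slow)")

    def import_network(self, name):
        print(f"importing network {name!r} (slow)")
        self.nodes = 42

    def node_count(self):
        return self.nodes

@pytest.fixture(scope="session")
def app():
    # Runs once for the whole test run instead of once per test case, so the
    # total execution time is not multiplied by the number of test cases.
    application = Application()
    application.import_network("example-network")
    return application

def test_network_is_loaded(app):
    assert app.node_count() > 0

def test_node_count_is_reported_as_integer(app):
    assert isinstance(app.node_count(), int)
```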

HOW, WHEN AND BY WHOM THE ACCEPTANCE TEST CASES WERE EXECUTED

The acceptance test cases were executed in three ways. During the test case implementation the test automation engineer executed the test cases on his workstation. The purpose was to verify that the test cases were implemented correctly. With some test cases this meant that the features were already implemented at this stage. Some of the test cases were executed on the developers’ workstations during the development by a test automation engineer and the developers. All the test cases were added to the acceptance test execution environment. At the beginning, the test cases were added to the environment at the end of each sprint. However, in the last sprint the test cases were added to the acceptance test execution environment immediately after the initial versions were created. In the acceptance test execution environment the test cases were automatically executed whenever new builds were available.
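As a rough sketch of how such build-triggered execution could look, the following Python loop polls a build directory and runs an acceptance suite for each new build; the directory layout, polling interval, and the pytest runner are assumptions, not a description of the Project's actual execution environment.

```python
# watch_builds.py -- illustrative sketch of an acceptance test execution
# environment; paths and the command-line runner are assumptions.
import subprocess
import time
from pathlib import Path

BUILD_DIR = Path("builds")       # new builds appear here
RESULT_DIR = Path("results")
RESULT_DIR.mkdir(exist_ok=True)
seen = set()

def run_acceptance_tests(build):
    # Any command-line runner could stand in for the keyword-driven framework.
    subprocess.run(
        ["pytest", "acceptance/", f"--junitxml={RESULT_DIR / build.name}.xml"],
        check=False,
    )

while True:
    for build in sorted(BUILD_DIR.glob("build-*")):
        if build.name not in seen:
            seen.add(build.name)
            run_acceptance_tests(build)
    time.sleep(60)   # poll once a minute
```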

As was already mentioned, the problems in the acceptance test implementation prevented the developers from evaluating whether their work was ready by running the acceptance test cases. There were also two other reasons which made it hard for the developers to evaluate the readiness of their work with automated acceptance test cases. First of all, some of the test cases tested a workflow, and those test cases were therefore dependent on each other. That is why the test cases covering a late phase of the workflow could not be executed before the features preceding them were working. Another reason was that single test cases tested multiple developers’ work, and therefore a test case did not pass until all the parts it was testing were ready.

Many of the mentioned problems derive from the level of the test cases. When the acceptance test cases are on a high level, it is inevitable that they test multiple features, which in turn leads to the problems mentioned earlier. Avoiding the dependency between steps is hard in the workflow test cases. Even though these problems exist, it is obvious that the end-to-end acceptance test cases are needed. One possible solution to this problem is to divide the acceptance test cases more strictly into two categories. Higher-level test cases could be traditional system-level test cases containing end-to-end test cases. The feature-specific test cases could then be integration- and system-level test cases concentrating on only one feature. The feature-specific test cases could be executed by the developers to evaluate the features’ readiness. Of course, this does not remove the problem that some features cannot be tested before the features they depend on are ready. This division would also make it easier for the developers to implement the acceptance test cases. The higher-level test cases could then still be the testers’ responsibility, as was the case in the Project.
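One way to make such a division concrete, sketched here with pytest markers, is to tag each test case as either a workflow test or a feature-specific test and let developers run only the latter; the marker names and test names are illustrative, not the Project's.

```python
# Illustrative split of acceptance tests into two categories using pytest
# markers ("workflow" and "feature" would be registered in pytest.ini).
import pytest

@pytest.mark.workflow
def test_plan_and_optimize_whole_network():
    # End-to-end workflow test spanning several features and several
    # developers' work; expected to pass only late in the sprint.
    ...

@pytest.mark.feature
def test_network_import_reports_node_count():
    # Feature-specific test that a single developer can run to judge
    # whether that one feature is ready.
    ...

# Developers could run only their feature-level subset with:
#   pytest -m feature
# while the execution environment runs the full suite, including:
#   pytest -m workflow
```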

HOW AND BY WHOM THE ACCEPTANCE TEST RESULTS WERE REPORTED

The problems found during the test case implementation were reported to the developers. The results of the test case execution in the acceptance test execution environment were visible to all the project members through an information radiator. The problems found in the automated test execution were passed on to the developers by the test automation team members after they had investigated the problems. However, this investigation lengthened the feedback loop, as the testers were not always available. If the test cases had been implemented by the developers, the feedback loop could have been shortened. The developers thought that the feedback loop should be shortened further, even though in their experience it had already been cut radically.

11.3 Benefits, Challenges and Drawbacks of Acceptance Test-Driven Development with Keyword-Driven Test Automation Framework

The third research question was: Does acceptance test-driven development with the keyword-driven test automation framework provide any benefits? What are the challenges and drawbacks? Based on the experiences presented in Chapter 10.6 and the expected benefits and challenges presented in Chapter 4.3, the answers to these questions are analyzed.

BENEFITS

The project members noticed many benefits in the use of ATDD. This was notable because the research period lasted only four months. The people who worked closely with the acceptance test cases noticed many more benefits than those who were less involved in the use of ATDD. The role a person represented had much less influence on the experienced benefits than the degree of involvement. Of course, there were role-based differences in viewpoints on some of the issues, but the main benefits were perceived similarly in the different roles. The same benefits were also noticed by the researcher while working in the Project.

While the research was conducted, there were some changes in the Project, as was mentioned in Chapter 10.1. Not all of them were related to taking ATDD into use. The changes can be categorized into three main changes: taking test automation into use, a change towards agile testing, and of course taking ATDD into use. The relations and effects of these changes on the experienced benefits had to be analyzed. The analysis is presented next.

The main relations between the changes, the benefits, and the reasoning behind them are presented in Figure 30. As can be seen in the figure, many relations between the benefits can be found. The figure is only a simplified view of the benefits and their relations, but it is used as the basis of this analysis.

Figure 30: The relations between the changes and benefits

One of the perceived changes was the increased communication. As was mentioned in Chapter 4.1, agile testing emphasizes face-to-face communication. When ATDD is in use, the work needed to create the test cases forces people to communicate. The perceived increase in communication can also depend on the tester, as some people communicate more actively than others. Therefore, it is impossible to say how much of the increased communication was due to the use of ATDD and how much due to the other changes. The test engineers’ early involvement can be seen as a consequence of taking agile testing into use. On the other hand, the use of ATDD forced the testers to take part in an earlier phase of the development, as the testers participated in the detailed planning. Therefore, most of the benefits gained from the testers’ earlier participation were obtained because of the use of ATDD (see Figure 30). Co-operation in acceptance test case creation is also a part of agile testing. However, in the Project it was due to the use of ATDD that the acceptance test cases were created with the feature owners. Therefore, it is hard to say whether the benefits could have been gained without the use of ATDD. In any case, the use of ATDD ensures that the acceptance test cases are created in co-operation, and therefore the related benefits are gained.

The only practice that was taken into use purely due to the use of ATDD was the detailed planning done by the feature owners, developers, and testers (bolded in Figure 30). This was one of the biggest reasons leading to an improved common understanding about the details, which was seen in the Project as the biggest benefit of the use of ATDD. Crispin (2005) also stated that the cooperation between the groups before development was the biggest benefit of ATDD. The need to create the test cases forces discussion. Of course, the detailed planning could be done without ATDD, and some of the mentioned benefits could still be gained. However, as can be seen in Figure 30, the benefits are sums of multiple factors, and it is hard to say which benefits would be gained if only the detailed planning were used. As mentioned earlier, an increased common understanding, and the benefits following from it, can be missed if the test cases are on too high a level and the planning is not detailed enough.

The test automation affected only a few of the observed benefits, as can be seen in Figure 30. This suggests that the tool used in ATDD is not crucial, as most of the benefits were gained from well-timed planning done by people working in different roles. However, the role of test automation in providing feedback and helping regression testing should not be undervalued. The benefits of automated regression testing were probably not broadly highlighted in the research because of the short research period. With a longer follow-up period, this benefit could have been greater.

The increased common understanding, the biggest benefit of the use of ATDD, does not provide additional value as such; rather, the increased understanding leads to the “real” benefits. The most valuable benefits of the use of ATDD are therefore the decreased risk of building incorrect software and the increased development efficiency, as problems can be solved with less effort and the features are done right the first time. The change in the tester’s role is also quite remarkable.

The use of ATDD also affects software quality. As the risk of building incorrect software is decreased, it is more likely that the created features will satisfy the end user’s needs. A better understanding, improved test cases, and the fact that problems are found earlier should also improve the chances of finding the defects with a significant impact. However, this remains to be seen. Test automation as a part of ATDD ensures a certain level of quality. As the regression testing is done automatically, the testers hopefully have more time to explore the system and find defects. In the Project, non-functional testing was not taken into account when the acceptance test cases were created. However, it was discussed as one area to which the use of ATDD could be expanded. Therefore, the non-functional qualities were not improved by the use of ATDD.

The benefits mentioned were at least partially gained because of the use of ATDD. If agile testing, test automation, and increased communication are removed from the relations, none of the real benefits disappear, although this may influence the magnitude of the benefits.

BENEFITS NOT PERCEIVED

There were also areas where benefits were not noticed, even though those areas were mentioned as possible benefit areas in the literature (see Chapter 4.3). Possible reasons why these benefits were not gained are analyzed here.

Development Status Was Not More Visible

There were no changes in the visibility of the development status, even though the acceptance test report was available to everyone through the information radiator and the web page. At the beginning of the research the test cases were added to the acceptance test execution environment at the end of each sprint. Therefore, it was clear that the development status could not be followed inside the sprints. In the last sprint of the research period the acceptance test cases were added to the acceptance test execution environment at the beginning of the sprint. However, this did not help, as the test cases were failing for most of the sprint. There were three reasons for that. First, the test cases were high-level test cases testing multiple parts of the Product in one test case. Therefore, even though the development team was able to finish some individual features, the test cases were still failing. Another reason was that the features were ready at a very late phase of the sprint, if even then. Therefore, the test cases were actually describing the development status, even though people did not see failing tests as progress indicators. The third reason was that not all of the acceptance test cases were ready at the same time as the features. The reasons behind this problem were analyzed in Chapter 11.1.

The visibility of the development status could be improved by dividing the development status follow-up into project-level and sprint-level progress. The division into higher-level and feature-level test cases presented in Chapter 11.2 could be exploited. The higher-level test cases could be used to indicate which workflows are working, and therefore they could provide the project-level status. The feature-level test cases could be used to follow the progress inside the sprints.

Requirements Were Not Defined More Cost-effectively

The test cases did not substitute for the requirement specifications in the Project. Therefore, the requirements and test cases were not created more cost-effectively. One clear reason was that the Project had been started before ATDD was tried out, and a requirement specification had already been created. Even if ATDD had been started at the beginning of the Project, the requirement specification would probably still have been created. One interviewed person also mentioned that there is no need to replace the requirements with the test cases. On the other hand, keeping duplicate data up to date can be seen as a burden.

No Remarkable Changes to System Design

ATDD did not cause remarkable changes to the system design, even though one developer thought that he had found the design faster in some cases. The relatively short research period may be one reason why no changes were noticed. However, there might be other reasons as well. Reppert (2004) reported that remarkable improvements in system design were seen when ATDD was used in one project. It may be that this improvement could not be noticed because the interface used to access the system from the test cases was different. As was mentioned in Chapter 4.3, the acceptance test cases usually bypass the graphical user interface and use the internal structures directly. This was not the case here, as the test cases used the graphical user interface to access the system under test. Therefore, in the Project, there was no need to create test code that would interact directly with the internal structures. This may have been the reason why the developers did not notice a significant change. So it seems that the interface used to access the system under test affects whether the system design is improved or not.

Acceptance Tests Were Not Used To Verify Refactoring Correctness

Developers in the Project thought that the acceptance test cases created with ATDD could be used to evaluate refactoring correctness, even though they had not done that yet. A longer research period is needed to properly assess the usefulness of the acceptance test cases in evaluating refactoring correctness. However, it is hard to see any reason why the acceptance test cases created with the keyword-driven test automation framework could not be used to verify refactoring correctness. The coverage and level of the acceptance test cases probably have a bigger influence than the tool used to create them.

CHALLENGES

As was mentioned in Chapter 10.6, the main challenge in the Project’s environment was proper test data. This, however, was a domain-specific testing problem, although it was seen to affect the creation of automated tests more than manual testing. There were also other challenges in automating the test cases. The base keyword creation problems were described in Chapter 11.1. There were also components in the application which could not be accessed from the automated test cases, as was mentioned in Chapters 10.2 and 10.3. As was already mentioned in Chapter 5.1, automating testing is not an easy task, and test automation was also seen as one of the biggest challenges in the use of ATDD by Crispin (2005) (Chapter 4.3). The presented test automation challenges were mainly general test automation challenges. Some of these challenges are related to the interface selected for accessing the application. However, none of them were specific to keyword-driven test automation. The use of ATDD and agile testing helped to solve some of the problems more easily than could have been done in a more traditional environment: it was easier to add the needed testability hooks to the Product because the implementations were done in parallel.

As was mentioned, test automation is a part of ATDD, but the biggest benefits can be achieved even if not all of the test cases can be automated. However, this creates a need to handle manual regression testing. Therefore, it is not advisable to settle immediately for manual tests; the importance of automated regression tests in iterative software development should not be forgotten. Of course, the scale of test automation has to be decided based on the context.

The second challenge mentioned in Chapter 4.3 was writing the tests before development. That was also noticed in the Project, as was presented in Chapter 11.1. Crispin (2005) mentioned that the problem was the lack of time to write the test cases before development. In the Project, however, the problems were more related to the test data and the context. Time could have become a problem if the number of detailed-level test cases had been higher.

The third challenge was the right level of test cases. Crispin (2005) noticed that when many test cases are written beforehand, the test cases can cause more confusion than help in understanding the requirements. It was noticed in the Project that there would have been a need for test cases on multiple levels, as was mentioned in Chapter 11.2. Two interviewees also saw it as beneficial to include non-functional testing as a part of the acceptance test cases in the future, which would widen the scope of the acceptance test cases even further. This challenge with the right level of test cases probably derives from the wide definition of acceptance testing and the possibility to create test cases on multiple test levels simultaneously.

One more challenge was noticed in the use of the keyword-driven test automation framework. As there was no intelligent development environment for editing the test case files and resource files, the test data management took some time. Some developers also found it difficult to find all the keywords that were used in the test cases and user keywords, because they were defined in multiple files. These problems with the test data management can be even bigger if more people are implementing the test cases.
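A lightweight way to mitigate the problem of keywords being scattered over multiple files would be a small script that indexes keyword definitions. The sketch below assumes a plain-text test data format in which definitions follow a "*** Keywords ***" section header; this format is an assumption for illustration, not a description of the Project's actual test data files, which would need a different parser.

```python
# find_keywords.py -- lists where user keywords are defined across test data
# files; assumes a plain-text format with a "*** Keywords ***" section header,
# which is an assumption and not necessarily the Project's file format.
import sys
from pathlib import Path

def keyword_definitions(path):
    in_keywords = False
    for lineno, line in enumerate(path.read_text().splitlines(), start=1):
        if line.strip().startswith("***"):
            in_keywords = "keyword" in line.lower()
            continue
        # Definitions start at column 0; indented lines are the keyword's steps.
        if in_keywords and line and not line[0].isspace():
            yield line.strip(), lineno

if __name__ == "__main__":
    root = Path(sys.argv[1]) if len(sys.argv) > 1 else Path(".")
    for path in sorted(root.rglob("*.txt")):
        for name, lineno in keyword_definitions(path):
            print(f"{path}:{lineno}: {name}")
```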

DRAWBACKS

The interviewees mentioned only a few drawbacks. One interviewee mentioned that writing the test cases took time, which was a drawback. As more people were involved in defining the test cases, more resources were consumed. On the other hand, the first versions of the test cases were written by a test automation engineer, and therefore only the definitions were done with a bigger group. Two interviewees thought that updating the test cases can be seen as rework and therefore as a drawback. This drawback was also noticed in the February sprint. The reason was mainly that the details were not agreed on well enough. However, the time used for the changes was not significant. All in all, it seems that the benefits gained from the use of ATDD clearly exceed the drawbacks.

11.4 Good Practices

Good practices are summarized based on the literature, the observations, and the analysis of the observations, and they are shown in Table 3. These practices can be applied when acceptance test-driven development is used.

PRACTICE: Acceptance test cases are created also on a detailed level.
EXPLANATION: If the acceptance test cases are created on too high a level, there is no need to clarify the details, and they remain unclear. However, creating too many detailed test cases at the beginning of the sprint may be confusing.

PRACTICE: Use case/workflow test cases are discussed with the whole team at the beginning of the sprint.
EXPLANATION: It is important that all team members understand the big picture, and high-level test cases can be used to clarify that.

PRACTICE: Detailed level test cases are discussed in small groups.
EXPLANATION: It is obviously not productive to plan all the details with the whole team. Therefore, the detailed test cases are created in small groups where different roles are represented.

PRACTICE: Test cases are written to the formal format after the planning meetings.
EXPLANATION: During the planning meetings, the test cases can be quickly noted down. The purpose of the meetings is to find the needed details and create a common understanding about those details. The test cases can be written in a proper format after the meeting.

PRACTICE: Test cases are checked by the team.
EXPLANATION: Because the test cases are created based on the notes, it is good to check the test cases with the people who planned them. This helps to find ambiguities and to verify that all the people have understood the details similarly.

PRACTICE: The test-first approach is not mandatory.
EXPLANATION: There can be situations where it is not profitable to implement the test cases in the test-first manner. However, the test cases should be planned and implemented on some level before implementing the feature. Even the test case planning can help to understand the wanted features.

PRACTICE: Initial test cases are added to the test execution environment.
EXPLANATION: When the test cases are executed often and there are detailed-level test cases, the development progress can be followed during the sprints. With the high-level test cases the development progress can be followed on the project level.

PRACTICE: Different kinds of acceptance test cases are created.
EXPLANATION: The acceptance test cases should cover the functional and non-functional requirements. Therefore, there is a need to create different types of test cases. Functional test cases can even be on different testing levels.

Table 3: Good practices

12 DISCUSSION AND CONCLUSIONS

This research was conducted through a comprehensive literature review, action-research-based observations of the use of acceptance test-driven development with the keyword-driven test automation framework in one software development project, and interviews with members of the project in question. The results of the research were analyzed by reflecting them against the relevant literature and earlier studies. The conclusions based on the analysis are covered in this chapter.

12.1 Researcher’s Experience

The researcher’s background and experience in the field of software testing are described briefly here, so that the reader can assess the researcher’s competence. The researcher had four years of experience in software testing and test automation when the research was started. The researcher was a part of the team that had developed the keyword-driven test automation framework used in the Project, called Robot. The development of Robot had lasted over a year when the research started. The researcher had gained a lot of experience with Robot by using it for testing Robot itself.

12.2 Main Conclusions

It can be said that ATDD can provide many benefits, and it is a radical change compared to traditional acceptance testing. ATDD together with agile testing brings testing to the core of development, as opposed to the traditional way, where most of the testing takes place, insufficiently, at the end of the software development. This is a positive feature which also improves the meaningfulness of the work, as all team members can take part in planning quality software.

According to the results gained from the study, ATDD also helps to develop software that corresponds better to the requirements, and to do so more efficiently. This is mainly due to the improved common understanding within the team about the details of the software’s features. So it seems that the use of ATDD really is profitable.

It can be seen that the tool used to automate the test cases in ATDD does not play a crucial role, as the biggest benefits noticed in the interviews were gained from the process. However, the level on which the acceptance test cases are created has an influence on the gained benefits, and if the test cases are done on too high a level, the noticed benefits are lost.

Of course, ATDD is not a silver bullet, and challenges exist. As acceptance testing should cover both non-functional and functional testing, excluding only unit testing, there is a wide area to test. Finding the right level of tests is unquestionably hard. However, the cooperation between the team and the customer can ease that journey.

It was acknowledged that the benefits were gained even though the acceptance test cases were not created before the development, as pure ATDD requires. This leads to the question of whether ATDD should be defined so that there is no strict requirement for the test cases to be created in a test-first manner. The discussion about the test cases drives the development in any case, as the goal of the team is to get the acceptance test cases passing.

Based on this work, it appears that ATDD can provide a clear process for arranging the testing inside the iterations of iterative development and, in consequence, establish a prerequisite for successful testing. This can be seen as very beneficial, because clear process guidance for agile testing, especially in Scrum, is missing. The importance of this process is emphasized in environments where a transition from traditional software development to agile software development is taking place.

12.3 Validity

There is no single clear definition of what validity means in qualitative research (Golafshani 2003, Trochim 2006, Flick 2006). However, Flick (2006) summarizes that validity answers the question of whether the researchers see what they think they see. Flick (2006) also suggests using triangulation as a method for evaluating qualitative research. Based on that suggestion, the validity of this research is evaluated using data and investigator triangulation. Theory and methodological triangulation are not used because of the practical nature and predefined scope of the research. Other matters affecting the results of this thesis are also considered.

The validity of the data was ensured by collecting data with the different data collection methods listed in Chapter 9.3. Data was also collected throughout the research, which increases its validity. To prevent an unbalanced view, the researcher interviewed and observed people in different roles. Investigator triangulation means using more than one researcher to detect and prevent biases resulting from the researcher as a person. It was not possible to use any other interviewer or observer in this research, and from this point of view the validity of the research is questionable.

The researcher’s high involvement in the Project, and especially the help the Project gained from the researcher during the study, affect the validity of this research. Kock (2003) mentions that in action research the researcher’s actions may strongly bias the results. The researcher became aware of this possibility at the beginning of the research, and it was kept in mind during the Project and especially during the analysis phase.

In addition, the background, know-how, and opinions of the researcher are possible sources of error. This is mainly because this was a qualitative study in which, for example, interviews were used as a research method. Therefore, the content and form of the interview questions can reflect the researcher’s own background, knowledge, and views. As a part of the project team, the researcher cannot be completely objective. However, it can be debated whether this subjectivity has a negative impact on the research or not.

Interpreting results is not a completely objective activity. It might therefore be that another researcher with a different background would have interpreted the results in a slightly different way. It must thus be kept in mind that the conclusions, for example, are always a somewhat subjective view of reality. However, it can be argued that the results gained from the research would have been similar even if the research had been carried out by another researcher.

The fact that there were other changes in the Project, such as the change towards agile testing, may also have made it harder to understand what actually caused the perceived benefits. However, as was noticed in the analysis of the research results, some of the changes and benefits originate directly from the use of ATDD. To be sure about the benefits, the subject should be studied for a longer period of time than was done in this research. However, the main conclusions could also be drawn based on the period of time used in the research. The results of earlier studies and the relevant literature support the research results, as they were mainly in line with each other.

It must be kept in mind that the results presented in this thesis are based on only one software development project, and more specifically on one team’s work. Every project has its own context-specific features. These facts, and of course the structure of the team, have an influence on how ATDD is used and how it is adapted as a part of the development process. Therefore, the results can vary to some extent according to the project in question, but it should be possible to gain the main benefits noticed here in other projects as well.

The test automation framework Robot used in this research is not open sourced, which makes it harder to introduce the test automation concept used in this research to other projects. However, there is a possibility that the keyword-driven test automation framework used in the study will be open sourced.

12.4 Evaluation of the Thesis

The first goal of this thesis was to investigate whether the keyword-driven test automation framework could be used with acceptance test-driven development. It can be said that this goal was achieved. The suitability of keyword-driven test automation was analyzed extensively, and based on the analysis the outcome was that it is possible to use the keyword-driven test automation framework with ATDD. It was also noticed that some limitations exist, which may prevent finalizing the test cases prior to feature implementation.

One aim was to describe the use of the keyword-driven test automation framework with ATDD in a way that enables other projects to experiment with the approach using similar tools. How well this goal is met remains to be seen if the results of this thesis are used in other real-world software development projects. However, the aim was to describe both the fictional example (Chapter 7) and the case study (Chapter 10) in such a way that they would be widely understood.

The last goal was to study the pros and cons of acceptance test-driven development when it is used with the keyword-driven test automation framework. Even though the research lasted only four months, plenty of results were collected. Based on these results it was possible to see clear benefits, some challenges, and a few drawbacks. In this sense, the study was successful.

12.5 Further Research Areas

Because this thesis is one of the first studies focusing on acceptance test-driven development with the keyword-driven test automation framework, there is a need for more extensive study of this kind of approach in other projects, including projects that use different kinds of iterative processes. A longer research period would also be beneficial, as the changes due to the use of ATDD are wide-ranging, and adapting and adjusting the process takes time. Full-scale use of ATDD would make it possible to better study the effects of the test automation framework, and the suitability of the running tested features metric with ATDD.

As was noticed, the level of the acceptance tests affects the benefits of ATDD. It was also noticed that there is a need for acceptance test cases on different levels and that it is difficult to create test cases on the right level. At least the following areas need more study to understand which kinds of acceptance tests would be beneficial to create:

• How do the different levels of test cases affect the different aspects of ATDD?
• How do the different levels of acceptance tests affect measuring the project, and how do they affect the use of the running tested features metric?
• How could the lower-level acceptance tests created with the keyword-driven test automation framework be defined in a format that can be easily understood?
• What is the relationship between unit testing and lower-level acceptance testing?

Further research is also needed to clarify which of the benefits mentioned in this research are actually direct results of ATDD. Therefore, the relationships between the benefits and the sources of each benefit should be studied.

One issue that was not studied in this research was the possibility of replacing the requirement specifications with the acceptance test cases. As one interviewee mentioned, there is no need to replace the requirements with the acceptance test cases. However, some of the details in the requirement specifications could be defined with test cases to avoid maintaining duplicate data. This could lead to linking the high-level requirements to the acceptance test cases. This would be an interesting area for further study.

Altogether, it can be said that this thesis is a good opening for discussion in this field of software testing.

BIBLIOGRAPHY<br />

Abrahamsson, Pekka, Outi Salo, Jussi Ronkainen & Juhani Warsta (2002). Agile Software<br />

<strong>Development</strong> Methods: Review and Analysis. VTT Publications 478, VTT, Finland.<br />

<br />

Agile Advice (2005). Information Radiators, May 10, 2005.<br />

May 14th, 2007<br />

Andersson, Johan, Geoff Bache & Peter Sutton (2003). XP <strong>with</strong> <strong>Acceptance</strong> <strong>Test</strong>-<strong>Driven</strong><br />

<strong>Development</strong>: A Rewrite Project for a Resource Optimization System. Lecture Notes in Computer<br />

Science, Volume 2675/2003, Extreme Programming and Agile Processes in Software Engineering,<br />

180-188, Springer Berlin/Heidelberg.<br />

<br />

Astels, David (2003). <strong>Test</strong>-<strong>Driven</strong> <strong>Development</strong>: A Practical Guide. 562, Prentice Hall PTR, United<br />

States of America.<br />

Avison, David, Francis Lau, Michael Myers & Peter Axel Nielsen (1999). Action Research: To make<br />

academic research relevant, researchers should try out their theories <strong>with</strong> practitioners in real situations<br />

and real organizations. COMMUNICATIONS OF THE ACM, January 1999/Vol. 42, No. 1, 94-97.<br />

Babüroglu, Oguz N. & Ib Ravn. Normative Action Research (1992). Organization Studies Vol. 13, No.<br />

1, 1992, 19-34.<br />

Bach, James (2003a). Agile test automation. <br />

March 31st, 2007<br />

Bach, James (2003b). Exploratory <strong>Test</strong>ing Explained v.1.3 4/16/03.<br />

March 31st, 2007<br />

Beck, Kent (2000). Extreme Programming Explained: Embrace Change. Third Print, 190, Addison-<br />

Wesley, Reading (MA).<br />

94


Beck, Kent, Mike Beedle, Arie van Bennekum, Alistair Cockburn, Ward Cunningham, Martin Fowler,<br />

James Grenning, Jim Highsmith, Andrew Hunt, Ron Jeffries, Jon Kern, Brian Marick, Robert C.<br />

Martin, Steve Mellor, Ken Schwaber, Jeff Sutherland & Dave Thomas (2001a). Manifesto for Agile<br />

Software <strong>Development</strong>. December 5th, 2006<br />

Beck, Kent, Mike Beedle, Arie van Bennekum, Alistair Cockburn, Ward Cunningham, Martin Fowler,<br />

James Grenning, Jim Highsmith, Andrew Hunt, Ron Jeffries, Jon Kern, Brian Marick, Robert C.<br />

Martin, Steve Mellor, Ken Schwaber, Jeff Sutherland & Dave Thomas (2001b). Principles behind the<br />

Agile Manifesto. March 31st, 2007<br />

Beck, Kent (2003). <strong>Test</strong>-<strong>Driven</strong> <strong>Development</strong> By Example. 240, Addison-Wesley.<br />

Beizer, Boris (1990). Software testing techniques. Second Edition, 550, Van Nostrand Reinhold, New<br />

York.<br />

Burnstein, Ilene (2003). Practical Software <strong>Test</strong>ing: a process-oriented approach. 709, Springer, New<br />

York.<br />

Buwalda, Hans, Dennis Janssen & Iris Pinkster (2002). Integrated <strong>Test</strong> Design and Automation: Using<br />

the <strong>Test</strong>Frame Method. 242, Addison Wesley, Bibbles Ltd, Guildford and King’s Lynn, Great Britain.<br />

Cohn, Mike (2004). User Stories Applied: For Agile Software <strong>Development</strong>. 268, Addison-Wesley.<br />

Cohn, Mike (2007). User Stories, Agile Planning and Estimating. Internal Seminar, March 24th, 2007.<br />

Control Chaos (2006a). What is Scrum September 26th, 2006<br />

Control Chaos (2006b). XP@Scrum. September 26th,<br />

2006<br />

Craig, Rick D. & Stefan P. Jaskiel (2002). Systematic Software <strong>Test</strong>ing. 536, Artech House Publishers,<br />

Boston.<br />

Crispin, Lisa, Tip House & Wade Carol (2002). The Need for Speed: Automating <strong>Acceptance</strong> <strong>Test</strong>ing<br />

in an eXtreme Programming Environment. Upgrade, The European Online Magazine for the IT<br />

Professional Vol III, No. 2, April 2002, 11-17. <br />

95


Crispin, Lisa & Tip House (2005). Testing Extreme Programming. Second Print, 306, Addison-Wesley.

Crispin, Lisa (2005). Using Customer Tests to Drive Development. Methods & Tools, Global knowledge source for software development professionals, Summer 2005, Vol. 13, No. 2, 12-17.

Cruise Control (2006). Cruise Control, Continuous Integration Toolkit. September 23rd, 2006

Dustin, Elfriede, Jeff Rashka & John Paul (1999). Automated Software Testing: introduction, management, and performance. 575, Addison-Wesley.

Fenton, Norman E. (1996). Software metrics: a rigorous and practical approach. Second Edition, 638, International Thomson Computer Press, London.

Fewster, Mark & Dorothy Graham (1999). Software Test Automation: Effective use of test execution tools. 574, Addison-Wesley.

Flick, Uwe (2006). An Introduction to Qualitative Research. Third Edition, 443, SAGE, London.

Golafshani, Nahid (2003). Understanding Reliability and Validity in Qualitative Research. The Qualitative Report, Vol. 8, No. 4, December 2003, 597-607.

Hendrickson, Elisabeth (2006). Agile QA/Testing. April 10th, 2007

IEEE Std 829-1983. IEEE Standard for Software Test Documentation. Institute of Electrical and Electronics Engineers, Inc., 1983.

IEEE Std 1008-1987. IEEE Standard for Software Unit Testing. Institute of Electrical and Electronics Engineers, Inc., 1987.

IEEE Std 610.12-1990. IEEE Standard Glossary of Software Engineering Terminology. Institute of Electrical and Electronics Engineers, Inc., 1990.

ISO Std 9000-2005. Quality management systems - Fundamentals and vocabulary. ISO Properties, Inc., 2005.

ISO/IEC Std 9126-1:2001. Software engineering -- Product quality -- Part 1: Quality model. ISO Properties, Inc., 2001.

ISTQB (2006). Standard glossary of terms used in Software Testing, Version 1.2 (dd. June 4th, 2006). April 9th, 2007

Itkonen, Juha, Kristian Rautiainen & Casper Lassenius (2005). Toward an Understanding of Quality Assurance in Agile Software Development. International Journal of Agile Manufacturing, Vol. 8, No. 2, 39-49.

Jeffries, Ronald E. (1999). Extreme Testing: Why aggressive software development calls for radical testing efforts. Software Testing & Quality Engineering, March/April 1999, 23-26.

Jeffries, Ron, Ann Anderson & Chet Hendrickson (2001). Extreme Programming Installed. 265, Addison-Wesley, Boston.

Jeffries, Ron (2004). A Metric Leading to Agility, 06/14/2004. November 18th, 2006

Jeffries, Ron (2006). Automating “All” Tests, 05/25/2006. April 14th, 2007

Kaner, Cem, Jack Falk & Quoc Nguyen (1999). Testing Computer Software. Second Edition, 480, Wiley, New York.

Kaner, Cem, James Bach, Bret Pettichord, Brian Marick, Alan Myrvold, Ross Collard, Johanna Rothman, Christopher Denardis, Marge Farrell, Noel Nyman, Karen Johnson, Jane Stepak, Erick Griffin, Patricia A. McQuaid, Stale Amland, Sam Guckenheimer, Paul Szymkowiak, Andy Tinkham, Pat McGee & Alan A. Jorgensen (2001a). The Seven Basic Principles of the Context-Driven School. December 19th, 2006

Kaner, Cem, James Bach & Bret Pettichord (2001b). Lessons Learned in Software Testing: A Context-Driven Approach. 286, John Wiley & Sons, Inc., New York.

Kaner, Cem (2003). The Role of Testers in XP. November 18th, 2006

Kit, Edward (1999). Integrated, effective test design and automation. Software Development, February 1999, 27–41.

Kock, Ned (2003). Action Research: Lessons Learned From a Multi-Iteration Study of Computer-Mediated Communication in Groups. IEEE Transactions on Professional Communication, Vol. 46, No. 2, June 2003, 105-128.

Larman, Craig (2004). Agile & Iterative Development: A Manager’s Guide. 342, Addison-Wesley.

Larman, Craig (2006). Introduction to Agile & Iterative Development. Internal Seminar, December 14th, 2006.

Laukkanen, Pekka (2006). Data-Driven and Keyword-Driven Test Automation Frameworks. 98, Master’s Thesis, Software Business and Engineering Institute, Department of Computer Science and Engineering, Helsinki University of Technology.

Mar, Kane & Ken Schwaber (2002). Scrum with XP. October 4th, 2006

Marick, Brian (2001). Agile Methods and Agile Testing. November 15th, 2006

Marick, Brian (2004). Agile Testing Directions. November 15th, 2006

Meszaros, Gerard (2003). Agile regression testing using record & playback. Conference on Object Oriented Programming Systems Languages and Applications, Companion of the 18th annual ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications, 353–360, ACM Press, New York.

Miller, Roy W. & Christopher T. Collins (2001). Acceptance testing. XP Universe, 2001. April 10th, 2007

Mosley, Daniel J. & Bruce A. Posey (2002). Just Enough Software Test Automation. 260, Prentice Hall PTR, Upper Saddle River, New Jersey, USA.

Mugridge, Rick & Ward Cunningham (2005). Fit for Developing Software: Framework for Integrated Tests. 355, Prentice Hall PTR, Westford, Massachusetts.

Nagle, Carl J. (2007). Test Automation Frameworks. April 14th, 2007

Patton, Ron (2000). Software Testing. 389, SAMS, United States of America.

Pol, Martin (2002). Software testing: a guide to the TMap approach. 564, Addison-Wesley, Harlow.

Reppert, Tracy (2004). Don’t Just Break Software, Make Software: How storytest-driven development is changing the way QA, customers, and developers work. Better Software, July/August 2004, 18-23.

Sauvé, Jacques Philippe, Osório Lopes Abath Neto & Walfredo Cirne (2006). EasyAccept: a tool to easily create, run and drive development with automated acceptance tests. International Conference on Software Engineering, Proceedings of the 2006 international workshop on Automation of software test, 111-117, ACM Press, New York.

Schwaber, Ken & Mike Beedle (2002). Agile software development with Scrum. 158, Prentice-Hall, Upper Saddle River (NJ).

Schwaber, Ken (2004). Agile Project Management with Scrum. 163, Microsoft Press, Redmond, Washington.

Stringer, Ernest T. (1996). Action Research: A Handbook for Practitioners. 169, SAGE, United States of America.

Trochim, William M.K. (2006). Qualitative Validity. October 4th, 2006

Watt, Richard J. & David Leigh-Fellows (2004). Acceptance Test-Driven Planning. Lecture Notes in Computer Science, Volume 3134/2004, Extreme Programming and Agile Methods - XP/Agile Universe 2004, 43-49, Springer, Berlin/Heidelberg.

Wideman, Max R. (2002). Wideman Comparative Glossary of Project Management Terms, March 2002. May 14th, 2007

Zallar, Kerry (2001). Are you ready for the test automation game? Software Testing & Quality Engineering, November/December 2001, 22–26.

APPENDIX A

PRINCIPLES BEHIND THE AGILE MANIFESTO

We follow these principles:

Our highest priority is to satisfy the customer through early and continuous delivery of valuable software.

Welcome changing requirements, even late in development. Agile processes harness change for the customer's competitive advantage.

Deliver working software frequently, from a couple of weeks to a couple of months, with a preference to the shorter timescale.

Business people and developers must work together daily throughout the project.

Build projects around motivated individuals. Give them the environment and support they need, and trust them to get the job done.

The most efficient and effective method of conveying information to and within a development team is face-to-face conversation.

Working software is the primary measure of progress.

Agile processes promote sustainable development. The sponsors, developers, and users should be able to maintain a constant pace indefinitely.

Continuous attention to technical excellence and good design enhances agility.

Simplicity--the art of maximizing the amount of work not done--is essential.

The best architectures, requirements, and designs emerge from self-organizing teams.

At regular intervals, the team reflects on how to become more effective, then tunes and adjusts its behavior accordingly. (Beck et al. 2001b)

APPENDIX B

INTERVIEW QUESTIONS

Interview questions asked in the final interviews.

1. How has ATDD affected the software development? Why?

2. What have been the benefits of ATDD? Why?

3. What have been the drawbacks of ATDD? Why?

4. What have been the challenges in ATDD? Why?

5. Has ATDD affected the risk of building incorrect software? How? Why?

6. Has ATDD affected the visibility of the development status? How? Why?

7. Has ATDD established a quality agreement between the development team and the feature owners? How? Why?

8. Has ATDD changed your confidence in the software? How? Why?

9. Has ATDD affected when problems are found? How? Why?

10. Has ATDD affected the way requirements are kept up to date? How? Why?

11. Has ATDD affected the way requirements and tests are kept in sync? How? Why?

12. Are the acceptance tests in a format that is easy to understand? Why or why not?

13. Is it easy to write the acceptance tests at the right level? Why or why not?

14. Has ATDD affected the developers’ goal? How? Why?

15. Has ATDD affected the design of the developed system? How? Why?

16. Has ATDD affected verifying the correctness of refactorings? How? Why?

17. Has ATDD affected the quality of the test cases? How? Why?

18. Has ATDD had an influence on the way people see test engineers? How? Why?

19. Has ATDD had an influence on the test engineer's role? How? Why?

20. Has ATDD affected how hard or easy the tests are to automate? How? Why?

21. What could be improved in the current way of doing ATDD? Which changes could give the biggest benefits?

22. Sum up the biggest benefit and the biggest drawback based on the issues asked in this interview, and state the reasons.
