The Automatic Generation of Software Test Data Using

Genetic Algorithms

by

Harmen-Hinrich Sthamer

A thesis submitted in partial fulfilment of the requirements of the University of Glamorgan / Prifysgol Morgannwg for the degree of

Doctor of Philosophy.

November 1995

University of Glamorgan


Declaration

I declare that this thesis has not been, nor is currently being, submitted for the award of any other degree or similar qualification.

Signed

Harmen-Hinrich Sthamer


Acknowledgements

I wish to thank my director of studies, Dr. Bryan F. Jones, for his thorough guidance and support throughout the research, his whole-hearted efforts in providing the necessary help, and many useful discussions and suggestions for the research project. With gratitude I also acknowledge my second supervisor, David Eyres, especially for his help during the research and for the encouragement provided.

Many thanks go to Professor Dr.-Ing. Walter Lechner from the Fachhochschule Hannover and Professor Darrel C. Ince from the Open University for valuable discussions and suggestions at meetings.

I am especially indebted to my colleagues of the software testing research group, Mrs. Xile Yang and Mr. Steve Holmes, for providing invaluable discussions.

Special thanks go to Dr. David Knibb for providing commercial software programs which were used during the course of the investigation as software to be tested.

In particular thanks go also to Professor Dr.-Ing. Walter Heinecke from the Fachhochschule Braunschweig/Wolfenbüttel for his efforts in establishing the collaboration between the German academic institute and its British counterpart, the University of Glamorgan, which paved the way for German students to study at British Universities.

Last but not least, I would like to thank my parents, my brothers, sisters and sister-in-law, who supported me throughout my stay in Britain with inspiration and understanding, and especially my grandparents, Eija and Ofa, without whom I would not be here.


Abstract

Genetic Algorithms (GAs) have been used successfully to automate the generation of test data for software developed in ADA83. The test data were derived from the program's structure with the aim of traversing every branch in the software. The investigation uses fitness functions based on the Hamming distance between the expressions in the branch predicate and on the reciprocal of the difference between numerical expressions in the predicate. The input variables are represented in Gray code and as an image of the machine memory. The power of using GAs lies in their ability to handle input data which may be of complex structure, and predicates which may be complicated and unknown functions of the input variables. Thus, the problem of test data generation is treated entirely as an optimisation problem.

Random testing is used as a benchmark for the effectiveness of test data generation using GAs, which require up to two orders of magnitude fewer tests than random testing and achieve 100% branch coverage. The advantage of GAs is that, through the search and optimisation process, test sets are improved so that they lie at or close to the input subdomain boundaries. GAs give the greatest improvement over random testing when these subdomains are small. Mutation analysis is used to establish the quality of the generated test data and the strengths and weaknesses of the test data generation strategy.

Various software procedures with different input data structures (integers, characters, arrays and records) and program structures with 'if' conditions and loops are tested: a quadratic equation solver, a triangle classifier program comprising a system of three procedures, linear and binary search procedures, a remainder procedure and a commercially available generic sorting procedure.

Experiments show that GAs in general require less CPU time to reach a global solution than random testing. The greatest advantage occurs when the density of global optima (solutions) is small compared to the entire input search domain.


Table of contents

CHAPTER 1 ... 1-1 - 1-5
1. Introduction ... 1-1
1.1 Objectives and aims of the research project ... 1-2
1.2 Hypotheses ... 1-3
1.3 Testing criteria ... 1-4
1.4 Structure of Thesis ... 1-4

CHAPTER 2 ... 2-1 - 2-13
2. Background of various automatic testing methods ... 2-1
2.1 Testing techniques ... 2-1
2.1.1 Black box testing ... 2-1
2.1.2 White box testing ... 2-1
2.2 Automatic test data generator ... 2-2
2.2.1 Pathwise generators ... 2-5
2.2.2 Data specification generators ... 2-10
2.2.3 Random testing ... 2-10

CHAPTER 3 ... 3-1 - 3-27
3. Genetic Algorithm ... 3-1
3.1 Introduction to Genetic Algorithms ... 3-1
3.2 Application of Genetic Algorithms ... 3-2
3.3 Overview and basics of Genetic Algorithms ... 3-4
3.4 Features of GAs ... 3-8
3.5 Population and generation ... 3-8
3.5.1 Convergence and sub optima solutions ... 3-9
3.6 Seeding ... 3-9
3.7 Representation of chromosomes ... 3-10
3.8 The way from one generation P(t) to the next generation P(t+1) ... 3-10
3.8.1 Fitness ... 3-11
3.8.2 Selection ... 3-12
3.8.2.1 Selection according to fitness ... 3-13
3.8.2.2 Random selection ... 3-13
3.8.3 Recombination operators ... 3-14
3.8.3.1 Crossover operator ... 3-14
3.8.3.1.1 Single crossover ... 3-14
3.8.3.1.2 Double crossover ... 3-15
3.8.3.1.3 Uniform crossover ... 3-16
3.8.3.2 Mutation operator ... 3-16
3.8.3.2.1 Normal mutation ... 3-17
3.8.3.2.2 Weighted mutation ... 3-18
3.8.4 Survive ... 3-18
3.8.4.1 SURVIVE_1 ... 3-19


3.8.4.2 SURVIVE_2 ... 3-20
3.8.4.3 SURVIVE_3 ... 3-21
3.8.4.4 SURVIVE_4 ... 3-21
3.8.4.5 SURVIVE_5 ... 3-21
3.9 Why Do GAs work? ... 3-21
3.10 Structure of Genetic Algorithm ... 3-22
3.11 Performance ... 3-23
3.12 Exploiting the power of genetic algorithms ... 3-23
3.12.1 The role of feedback ... 3-24
3.12.2 The use of domain knowledge ... 3-25
3.13 Interim conclusion ... 3-25

CHAPTER 4 ... 4-1 - 4-24
4. A strategy for applying GAs and random numbers to software testing ... 4-1
4.1 Testing using Random Testing ... 4-1
4.2 Subdomains ... 4-1
4.2.1 Fitness functions adapted from predicates ... 4-2
4.3 Control flow tree and graph ... 4-4
4.4 Effectiveness measure of test sets ... 4-5
4.5 Instrumentation ... 4-5
4.6 Overall structure of testing tool ... 4-6
4.7 Applying the testing strategy on an example ... 4-7
4.8 Method of testing ... 4-13
4.8.1 Testing 'if ... then ... else' conditions ... 4-13
4.8.2 Boundary test data ... 4-15
4.8.3 Testing loops ... 4-17
4.8.3.1 'While ... loop' testing ... 4-17
4.8.3.2 'Loop ... exit' testing ... 4-20
4.8.4 Additional conditions in a loop ... 4-21
4.8.5 Testing of complex data types ... 4-22
4.9 Implementation (ADA, VAX, ALPHA, PC) ... 4-22
4.10 Motivation of experiments ... 4-22
4.11 Interim conclusion ... 4-23

CHAPTER 5 ... 5-1 - 5-36
5. Investigation of GA to test procedures without loop conditions ... 5-1
5.1 Quadratic Equation Solver ... 5-1
5.1.1 Experiments and investigation of GA for branch coverage and ... 5-4
5.1.1.1 The generic Unchecked_Conversion procedure ... 5-4
5.1.2 Different select procedure ... 5-5
5.1.2.1 Results of selection procedures ... 5-5
5.1.3 Different survive procedure ... 5-6
5.1.3.1 Results ... 5-6
5.1.4 Different population size PSZ ... 5-8
5.1.5 Different fitness functions ... 5-10
5.1.5.1 Reciprocal fitness function ... 5-10


5.1.5.2 Gaussian function ... 5-11
5.1.5.3 Hamming distance function ... 5-11
5.1.6 Gray code ... 5-13
5.1.6.1 Results of using Gray code ... 5-16
5.1.7 Zero solution ... 5-17
5.1.8 Comparison of different crossovers ... 5-19
5.1.8.1 Results of crossover ... 5-19
5.1.9 Different mutations ... 5-21
5.1.10 The EQUALITY procedure ... 5-21
5.1.11 Using mutation only ... 5-22
5.1.12 Results of the quadratic equation solver ... 5-23
5.1.13 Random testing ... 5-24
5.1.14 Comparison between GA and Random testing ... 5-25
5.1.15 Interim Conclusion quadratic ... 5-27
5.2 Triangle classification procedure ... 5-30
5.2.1 Description of the triangle procedure ... 5-30
5.2.2 Result for the triangle classifier ... 5-32
5.2.2.1 Different Fitness functions and survive probabilities ... 5-32
5.2.2.2 Different crossover operator ... 5-32
5.2.3 Comparison with random testing and probabilities ... 5-33
5.2.3.1 Results Random Test ... 5-34
5.3 Interim conclusion triangle ... 5-35
5.4 Interim Conclusion ... 5-37

CHAPTER 6 ... 6-1 - 6-30
6. The application of GAs to more complex data structure ... 6-1
6.1 Search procedures ... 6-1
6.1.1 Linear Search_1 procedure ... 6-1
6.1.1.1 Description ... 6-2
6.1.1.2 Different experiments with single loop condition ... 6-3
6.1.1.3 Experiments with full 'while ... loop' testing ... 6-10
6.1.1.4 Random testing of LINEAR_1 ... 6-12
6.1.1.5 Interim conclusion of LINEAR_1 ... 6-13
6.1.2 Linear Search 2 ... 6-15
6.1.2.1 Random testing of LINEAR_2 ... 6-16
6.1.2.2 Interim conclusion LINEAR_2 ... 6-16
6.1.3 Binary Search ... 6-17
6.1.3.1 Results of binary search ... 6-18
6.1.3.2 Full loop testing ... 6-19
6.1.3.3 Random testing of binary search ... 6-20
6.1.4 Using character in LINEAR_1 and BINARY_SEARCH ... 6-21
6.1.4.1 Conclusion of search procedures ... 6-22
6.2 Remainder procedure ... 6-24
6.2.1 Description ... 6-24
6.2.2 Results of Remainder ... 6-25
6.2.3 Random testing using REMAINDER procedure ... 6-27


6.2.4 Interim Conclusion ... 6-28
6.3 Conclusion of procedures with loop conditions ... 6-28

CHAPTER 7 ... 7-1 - 7-15
7. Testing generic procedure ... 7-1
7.1 Generic Sorting Procedure: Direct Sort ... 7-1
7.2 Generic feature ... 7-3
7.3 Test strategy and data structure ... 7-4
7.4 Different Tests ... 7-8
7.5 Results of DIRECT_SORT ... 7-9
7.6 Test results ... 7-10
7.7 Interim Conclusion ... 7-14

CHAPTER 8 ... 8-1 - 8-20
8. Adequate test Data ... 8-1
8.1 Adequacy of Test Data ... 8-1
8.1.1 Subdomain boundary test data ... 8-2
8.2 Mutation Testing and analysis ... 8-3
8.2.1 Overview of mutation analysis ... 8-3
8.2.2 Detection of mutants ... 8-3
8.2.3 Construction of mutants ... 8-5
8.3 Application of mutation testing to measure test data efficiency ... 8-6
8.3.1.1 Different mutants ... 8-7
8.3.1.2 First mutation results ... 8-10
8.3.1.3 Improving the testing tool with regard to mutation testing ... 8-12
8.3.1.4 Test results ... 8-13
8.4 Interim conclusion ... 8-18

CHAPTER 9 ... 9-1 - 9-8
9. Review and Conclusion ... 9-1
9.1 Review of the project ... 9-1
9.2 Summary ... 9-1
9.2.1 Operators ... 9-1
9.2.2 Data types, fitness function and program structures ... 9-3
9.2.3 Representation; Gray vs. binary ... 9-3
9.2.4 Adequacy criteria and mutation testing ... 9-4
9.2.5 Random testing vs. GA testing ... 9-4
9.3 Conclusion ... 9-5
9.4 Contribution to published Literature ... 9-6
9.5 Further studies ... 9-7


REFERENCES ... R-1 - R-8
APPENDIX A: Listing of TRIANGLE classifier function ... A-1 - A-2
APPENDIX B: Results for LINEAR SEARCH procedure ... B-1 - B-1
APPENDIX C: Listing of REMAINDER procedure ... C-1 - C-1
APPENDIX D: Listing of DIRECT_SORT procedure ... D-1 - D-3


Table of Figures

Figure 2.1: Sibling nodes. ... 2-6
Figure 2.2: Control flow tree for the. ... 2-7
Figure 2.3: Example of an input space partitioning structure in the range of -15 to 15. ... 2-8
Figure 3.1: Single crossover with k = 5. ... 3-15
Figure 3.2: Double crossover with k = 2 and n = 5. ... 3-16
Figure 3.3: Uniform crossover with crossing points at 1, 4, 5 and 7. ... 3-16
Figure 3.4: Before and after mutation. ... 3-17
Figure 3.5: Survival method. ... 3-19
Figure 3.6: Block diagram of GA. ... 3-22
Figure 3.7: Pseudo code of GA. ... 3-22
Figure 4.1: Overall structure of testing tool. ... 4-6
Figure 4.2: Software and control flow tree for the example. ... 4-7
Figure 4.3: Example with generated test data for generation G1 to G5. ... 4-12
Figure 4.4: Control flow tree of an 'if...then...else' condition and the corresponding original software code. ... 4-13
Figure 4.5: Control flow tree of a simple 'if...then...else' condition and the corresponding original (bold) software code and instrumentation. ... 4-14
Figure 4.6: Example of original software (displayed in bold) and the instrumentation. ... 4-15
Figure 4.7: Test data for the nodes 2 and 3. ... 4-16
Figure 4.8: Control flow trees of a 'while' loop-statement with the corresponding. ... 4-18
Figure 4.9: Control flow graph of a 'loop ... exit' condition. ... 4-20
Figure 4.10: Loop testing with 'if' statement inside. ... 4-21
Figure 4.11: Example of calculating Hamming distance. ... 4-22
Figure 5.1: Control flow tree for the quadratic procedure. ... 5-2
Figure 5.2: Illustrating global and local optima with regard to D = 0 using integers. ... 5-4
Figure 5.3: Required tests for different survival procedures and probabilities using reciprocal fitness function. ... 5-7
Figure 5.4: Required tests for different survival procedures and probabilities using Hamming fitness function. ... 5-8
Figure 5.5: Binary to Gray conversion. ... 5-15
Figure 5.6: Gray to Binary conversion. ... 5-15
Figure 5.7: Distribution of global solutions using binary coded representation where solutions for A=0 have been filtered out. ... 5-18
Figure 5.8: Distribution of global solutions using Gray coded representation where solutions for A=0 have been filtered out. ... 5-18
Figure 5.9: Distribution of global solutions using random testing where solutions for A=0 have been filtered out. ... 5-19
Figure 5.10: Pseudo code of EQUALITY procedure. ... 5-22
Figure 5.11: CPU time over various input ranges for using random numbers and GA. ... 5-25
Figure 5.12: Individual fitness over a test run using GAs and Random Testing. ... 5-26
Figure 5.13: Histogram of successful test runs. ... 5-27
Figure 5.14: The complete triangle control flow tree. ... 5-31
Figure 5.15: Executed branch distribution in log for random testing. ... 5-34
Figure 5.16: CPU time over various input ranges for using random numbers and GA. ... 5-35


Figure 5.17: Fitness distribution using Genetic Algorithms. ... 5-36
Figure 6.1: Control flow graph of LINEAR_1 search procedure. ... 6-2
Figure 6.2: Converging process towards global optimum using weighted Hamming. ... 6-5
Figure 6.3: Converging process towards global optimum using unweighted Hamming. ... 6-5
Figure 6.4: Converging process towards global optimum using reciprocal fitness. ... 6-6
Figure 6.5: Typical fitness distribution of linear search with full loop testing. ... 6-11
Figure 6.6: Comparison of random testing and Genetic Algorithm testing. ... 6-13
Figure 6.7: Control flow graph of LINEAR_2 procedure. ... 6-15
Figure 6.8: Control flow graph of the binary search function. ... 6-18
Figure 6.9: Required tests using reciprocal and Hamming fitness function. ... 6-19
Figure 6.10: Control flow tree of the REMAINDER. ... 6-24
Figure 6.11: Distribution of off-line performance for the remainder procedure using. ... 6-27
Figure 7.1: Control flow graph of DIRECT_SORT. ... 7-2
Figure 7.2: Control flow graph of procedure PARTITION. ... 7-2
Figure 7.3: Control flow graph of procedure INSERT. ... 7-2
Figure 7.4: Control flow graph of procedure SWAP. ... 7-2
Figure 7.5: Data structure of the new chromosome. ... 7-6
Figure 7.6: Usage of different array sizes as defined by index_type. ... 7-6
Figure 7.7: Example of a chromosome using different data types. ... 7-8
Figure 8.1: Original and mutated statement causing shifting of subdomain boundary. ... 8-2
Figure 8.2: Definition of sub-expression for the quadratic equation solver problem. ... 8-8

Table of Listings

Listing 4.1: Listing of CHECK_BRANCH. ... 4-14
Listing 4.2: Listing of LOOKING_BRANCH. ... 4-15
Listing 4.3: Listing of procedure calls for boundary testing approach. ... 4-16
Listing 4.4: Fitness function for different nodes. ... 4-17
Listing 4.5: Software listings for the instrumented procedures for a loop condition. ... 4-19
Listing 4.6: 'while' loop condition with instrumented code. ... 4-20
Listing 4.7: Instrumented procedures for a 'loop ... exit' condition. ... 4-21
Listing 5.1: Software code for the quadratic procedure. ... 5-2
Listing 6.1: Software code of linear search procedure. ... 6-2
Listing 6.2: Software code of LINEAR_2 search procedure. ... 6-15
Listing 6.3: Software code of the binary search function. ... 6-18
Listing 6.4: Listing of a part of the REMAINDER. ... 6-24
Listing 7.1: Type declaration of the chromosome. ... 7-6
Listing 7.2: Generic type declaration. ... 7-7


Table of Tables

Table 5.1: Difference between binary code and two's complement representation. ... 5-5
Table 5.2: Results using different population sizes. ... 5-9
Table 5.3: Different Hamming fitness functions. ... 5-12
Table 5.4: Results from using different Hamming fitness functions. ... 5-12
Table 5.5: Comparison of binary and Gray code. ... 5-14
Table 5.6: Results of using different crossover operators. ... 5-20
Table 5.7: Results of using different mutation probability Pm. ... 5-21
Table 5.8: Results of using different values for PE. ... 5-22
Table 5.9: Possible set-ups. ... 4-23
Table 5.10: Results using random and GA testing. ... 5-25
Table 5.11: Summary table of most significant results. ... 5-28
Table 5.12: Different crossover for the triangle procedure. ... 5-32
Table 5.13: Results using random and GA testing. ... 5-35
Table 5.14: Results of triangle using GA and random testing. ... 5-37
Table 6.1: Results of linear procedure for A(1) = 1. ... 6-3
Table 6.2: Results of generating A(1) randomly for each test run in the range ±20000. ... 6-8
Table 6.3: Results of linear procedure for zero and more than zero iterations where the elements of the array are also changed by the GAs. ... 6-9
Table 6.4: Results of LINEAR_1 procedure for full loop testing. ... 6-10
Table 6.5: Results using random and GA testing for full loop testing. ... 6-13
Table 6.6: Results of LINEAR_2 procedure for full loop testing using binary coding. ... 6-16
Table 6.7: Results of binary search procedure for full loop testing. ... 6-20
Table 6.8: Results of random testing with predefined number of iterations NITS = 1. ... 6-20
Table 6.9: Results for full loop testing using REMAINDER. ... 6-25
Table 6.10: Results of remainder procedure using different crossover operator. ... 6-26
Table 6.11: Results of random and GA testing. ... 6-28
Table 7.1: Results of testing DIRECT_SORT using integer data type. ... 7-11
Table 7.2: Results using integer data type without CONTROL_PARENTS. ... 7-12
Table 7.3: Results of testing DIRECT_SORT using record type of integer and character. ... 7-13
Table 7.4: Results of DIRECT_SORT using record type of character and integer. ... 7-14
Table 8.1: First level of mutation analysis: Statement Analysis. ... 8-7
Table 8.2: Second level of mutation analysis: Predicate Analysis. ... 8-8
Table 8.3: Third level of mutation testing: Domain Analysis. ... 8-9
Table 8.4: Fourth level of mutation testing: Coincidental Correctness Analysis. ... 8-9
Table 8.5: Results for statement analysis. ... 8-14
Table 8.6: Results of predicate analysis. ... 8-14
Table 8.7: Results of domain analysis. ... 8-15
Table 8.8: Results of coincidental correctness analysis. ... 8-17


List of Abbreviations

PC    Crossover probability
PS    Survive probability
Pm    Mutation probability
PSZ   Population size
PE    Equality probability
Nits  Number of iterations
NoRT  Number of Random Tests
GA    Genetic Algorithm
Pm_w  Weighted mutation probability
S     Chromosome length
Hx    Hamming fitness function x
C     Chromosome
LCSAJ Linear Code Sequence And Jump
CPU   Central Processor Unit
MSB   Most Significant Bit
LSB   Least Significant Bit
F     Fitness value
M_G   MAX_GENERATION
ES    Evolution strategy


CHAPTER 1

Introduction

Between 40% and 50% of the software development cost is expended in software testing, Tai [1980], Ince [1987] and Graham [1992]. Testing consumes resources and adds nothing to the product in terms of functionality. Therefore, much effort has been spent on the development of automatic software testing tools in order to reduce significantly the cost of developing software. A test data generator is a tool which supports the program tester by producing test data for software.

Ideally, testing software would guarantee the absence of errors, but in reality it only reveals the presence of software errors and never guarantees their absence, Dijkstra [1972]. Even systematic testing cannot prove absolutely the absence of errors, which are detected by discovering their effects, Clarke and Richardson [1983]. One objective of software testing is to find errors and program structure faults. However, a problem is to decide when to stop testing the software, e.g. how long does one keep looking if no errors are found, or if several errors are found, Morell [1990].

Software testing is one of the main feasible methods of increasing confidence in the correctness and reliability of software, Deason [1991]. Sometimes programs which are poorly tested perform correctly for months and even years before some input sets reveal the presence of serious errors, Miller [1978]. Incorrect software released to market without being fully tested can result in customer dissatisfaction; moreover, it is vitally important that software in critical applications is free of faults which might lead to heavy financial loss or even endanger lives, Hamlet [1987]. In the past decades, systematic approaches to software testing procedures and tools have been developed to avoid many difficulties which existed in ad hoc techniques. Nevertheless, software testing remains the most usual technique for error detection in today's software industry. The main goal of software testing is to increase one's confidence in the correctness of the program being tested.

In order to test software, test data have to be generated and some test data are better at finding errors than others. Therefore, a systematic testing system has to differentiate


good (suitable) test data from bad (unsuitable) test data, and so it should be able to detect good test data when they are generated. Nowadays testing tools can automatically generate test data which satisfy certain criteria, such as branch testing, path testing, etc. However, these tools have problems when complicated software is tested.

A testing tool should be general, robust and able to generate the right test data corresponding to the testing criteria for use in the real world of software testing, Korel [1992]. Therefore, a search algorithm must decide where the best values (test data) lie and concentrate its search there. It can be difficult to find correct test data because conditions or predicates in the software restrict the input domain, which is the set of valid data.

Test data which are good for one program are not necessarily appropriate for another, even if both programs have the same functionality. Therefore, a testing tool which is adaptive to the software under test is necessary. Adaptive means that the tool monitors the effectiveness of the test data in their environment in order to produce new solutions which attempt to maximise the test effectiveness.

1.1 Objectives and aims of the research project

The overall aim of this research project is to investigate the effectiveness of Genetic Algorithms (GAs) with regard to random testing and to automatically generate test data to traverse all branches of software. The objectives of the research activity can be defined as follows:

• The furtherance of basic knowledge required to develop new techniques for automatic testing;

• To assess the feasibility of using GAs to automatically generate test data for a variety of data type variables and complex data structures for software testing;

• To analyse the performance of GAs under various circumstances, e.g. large systems;

• Comparison of the effectiveness of GAs with pure random testing for software developed in ADA;

• The automatic testing of complex software procedures;

• Analysis of the test data adequacy using mutation testing.

The performance of GAs in automatically generating test data for small procedures is assessed and analysed. A library of GAs is developed and then applied to larger systems.


The efficiency of GAs in generating test data is compared to random testing with regard to the number of test data sets generated and the CPU time required.

This research project presents a system for the generation of test data for software written in ADA83. The problem of test data generation is formulated and solved entirely as a numerical optimisation problem using Genetic Algorithms and structural testing techniques.

Software testing is about searching for and generating certain test data in a domain to satisfy the test criteria. Since GAs are an established search and optimisation technique, the basic aim of this project is to generate test sets which will traverse all branches in a given procedure under test.
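To make this optimisation view concrete, the following is a minimal sketch of a GA generating a test datum to traverse one branch. The thesis's tool is written in Ada83; this Python version, with invented names such as generate_test_datum and an invented target predicate "X = 42", only illustrates the idea of reciprocal fitness, selection according to fitness, single crossover and bitwise mutation.

```python
import random

TARGET = 42          # hypothetical branch predicate "if X = 42" (invented for illustration)
BITS = 8             # the input variable is treated as an 8-bit chromosome
POP_SIZE = 20
P_MUTATE = 0.05      # per-bit mutation probability

def fitness(x):
    # Reciprocal fitness: grows as x approaches satisfying the predicate.
    return 1.0 / (abs(x - TARGET) + 0.01)

def select(pop):
    # Selection according to fitness (roulette wheel): fitter inputs breed more often.
    return random.choices(pop, weights=[fitness(x) for x in pop], k=2)

def crossover(a, b):
    # Single crossover: the child takes the high bits of one parent, low bits of the other.
    k = random.randint(1, BITS - 1)
    mask = (1 << k) - 1
    return (a & ~mask) | (b & mask)

def mutate(x):
    # Flip each bit of the chromosome with probability P_MUTATE.
    for bit in range(BITS):
        if random.random() < P_MUTATE:
            x ^= 1 << bit
    return x

def generate_test_datum(max_generations=2000):
    pop = [random.getrandbits(BITS) for _ in range(POP_SIZE)]
    for _ in range(max_generations):
        if TARGET in pop:                    # the branch has been traversed: stop
            return TARGET
        pop = [mutate(crossover(*select(pop))) for _ in range(POP_SIZE)]
    return max(pop, key=fitness)             # best effort if the budget runs out

random.seed(0)                               # fixed seed for a repeatable run
print(generate_test_datum())
```

The same loop structure underlies the experiments of chapters 5 to 7, where the chromosomes encode integers, characters, arrays and records rather than a single 8-bit value.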

1.2 Hypotheses

In order to make the objectives clear, several hypotheses are suggested and justified in the following chapters by experiments which are described in section 4.6.

1. Hypothesis:

Genetic Algorithms are more efficient than random testing in generating test data, see section 2.2.3. The efficiency will be measured as the number of tests required to obtain full branch coverage.

2. Hypothesis:

A standard set of parameters for Genetic Algorithms can be established which will apply to a variety of procedures with different input data types. In particular, the following will be investigated:

2_1 which of the following bit patterns is most appropriate for representing the input test-set: two's complement, binary with sign bit or Gray code, see section 3.7;

2_2 which of the following reproduction strategies is most efficient: selection at random or according to fitness, see section 3.8.2;

2_3 which crossover strategy is most efficient: single, double or uniform crossover, see section 3.8.3.1;

2_4 what size of population gives the best result, see section 3.5.1;
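The Gray-code representation of 2_1 (detailed in section 5.1.6) can be sketched as follows; this is an illustrative Python version of the standard conversions, not the thesis's Ada code. Neighbouring integers differ in exactly one Gray-code bit, which is the property that makes the representation attractive for a GA.

```python
def binary_to_gray(b):
    # Standard reflected binary (Gray) encoding: g = b XOR (b >> 1).
    return b ^ (b >> 1)

def gray_to_binary(g):
    # Invert the encoding by XOR-folding the shifted value back in.
    b = 0
    while g:
        b ^= g
        g >>= 1
    return b

# A small change in the integer needs only a single-bit change in the chromosome:
print(binary_to_gray(5), binary_to_gray(6))   # 0b101 -> 0b111 (7), 0b110 -> 0b101 (5)
```

With plain binary coding, moving from 7 (0111) to 8 (1000) requires four simultaneous bit flips; in Gray code the same step is a single flip, which smooths the fitness landscape the GA searches.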


These investigations are described in detail in chapter 3. A standard set is determined in chapter 5 and confirmed in chapters 6 and 7.

3. Hypothesis:

Test cases can be generated for loops with zero, one, two and more than two iterations, see section 4.7.2. The confirmation is in chapters 6 and 7.

4. Hypothesis:

Genetic Algorithms generate adequate test data in terms of mutation testing, and generating test data for the original (unmutated) software is the better approach. A detailed description is in section 2.2.1; this is confirmed in chapter 8.

These hypotheses will be under close investigation throughout the chapters and will be discussed in more detail in chapters 3 and 5, where the different operators and parameters are introduced.

1.3 Testing criteria

The criterion of testing in this thesis is branch testing, see section 2.1.2. Our aim is to develop a test system which exercises every branch of the software under test. In order to generate the required test data for branch testing, Genetic Algorithms and random testing are used. These two testing techniques will be compared by means of the percentage of coverage which each of them can achieve and by the number of test data which have to be generated before full branch coverage is attained.

1.4 Structure of Thesis

Following this introductory chapter, Chapter 2 reviews various testing methods and applications for software testing. The advantages and disadvantages of these techniques are examined and explained.

Chapter 3 describes the overall idea of GAs. An introduction to GAs is given and how and why they work is explained using an example. Various operators and procedures are explained which are used within a GA. Important and necessary issues of GAs are described.

Chapter 4 describes the technique which has been applied to test software. A detailed


description of the application of GAs to software testing is given and illustrated with brief examples. The instrumentation of the software under test is explained.

The validity of any technique can only be ascertained by means of experimental verification. A significant part of the work reported in this thesis is the conduct of experiments whose results can be compared to the method of random testing.

Chapter 5 describes these experiments for a quadratic equation solver procedure and a triangle classifier procedure using GAs with various settings. These experiments are conducted in order to investigate the effectiveness of using GAs for software testing. Both procedures under test handle integer variables involved in complex predicates, which makes the search for test data difficult. In addition, the triangle procedure comprises various nested procedure declarations.

In Chapter 6, various linear search procedures, a binary search and a remainder procedure are tested. These procedures have 'loop' conditions as well as 'if' conditions. In contrast to the previous chapter, they involve more complex data types such as characters, strings and arrays.

Chapter 7 uses a commercially available generic sort procedure, DIRECT_SORT, which has nested procedure declarations and complex data structures such as records of integer and character arrays, where the arrays have to be of variable length.

Chapter 8 describes an error-based testing method, also called mutation testing. The goal is to construct test data that reveal the presence or absence of specific errors and to measure the adequacy of the test data sets, and so of the testing tool.

Chapter 9 gives an overall conclusion of the project. One main conclusion is that the proposed technique represents a significant improvement over random testing with regard to the required number of tests. The technique required up to two orders of magnitude fewer tests and less CPU time.


CHAPTER 2

Background of various automatic testing methods

Software testing is widely used in many different applications using various testing strategies. This chapter explains and gives an overview of the fundamental differences between several approaches to software testing.

2.1 Testing techniques

There are two different testing techniques: black box and white box testing.

2.1.1 Black box testing

In black box testing, the internal structure and behaviour of the program under test is not considered. The objective is to find out solely when the input-output behaviour of the program does not agree with its specification. In this approach, test data for software are constructed from its specification, Beizer [1990], Ince [1987] and Frankl and Weiss [1993]. The strength of black box testing is that tests can be derived early in the development cycle. This can detect missing logic faults mentioned by Hamlet [1987].

The software is treated as a black box and its functionality is tested by providing it with various combinations of input test data. Black box testing is also called functional or specification based testing. In contrast to this is white box testing.

2.1.2 White box testing

In white box testing, the internal structure and behaviour of the program under test is considered. The structure of the software is examined by execution of the code. Test data are derived from the program's logic. This is also called program-based or structural testing, Roper [1994]. This method gives feedback e.g. on coverage of the software.


There are several white box (structural) testing criteria:

Statement Testing: Every statement in the software under test has to be executed at least once during testing. A more extensive and stronger strategy is branch testing.

Branch testing: Branch coverage is a stronger criterion than statement coverage. It requires every possible outcome of all decisions to be exercised at least once, Huang [1975], i.e. each possible transfer of control in the program is exercised, Jin [1995]. It includes statement coverage, since every statement is executed if every branch in a program is exercised once. However, some errors can only be detected if the statements and branches are executed in a certain order, which leads to path testing.
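The mechanics of measuring branch coverage can be sketched with a small instrumented procedure. The thesis instruments Ada83 code with probes such as CHECK_BRANCH (Listing 4.1); the Python analogue below, with an invented procedure classify, only illustrates how probes record which branch outcomes a test set has exercised.

```python
executed = set()                  # branch outcomes reached so far

def check_branch(node):
    # Instrumentation probe inserted into the software under test.
    executed.add(node)

def classify(x):
    # Hypothetical procedure under test, instrumented on both branch outcomes.
    if x >= 0:
        check_branch("1-true")
        return "non-negative"
    check_branch("1-false")
    return "negative"

for datum in (5, -3):             # a candidate test set
    classify(datum)

branches = {"1-true", "1-false"}  # all branch outcomes of the procedure
coverage = len(executed & branches) / len(branches)
print(f"branch coverage: {coverage:.0%}")   # prints: branch coverage: 100%
```

A test generator keeps producing data until the set of executed outcomes equals the set of all branch outcomes, i.e. until the coverage ratio reaches 100%.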

Path testing: In path testing, every possible path in the software under test is executed; this increases the probability of error detection and is a stronger method than both statement and branch testing. A path through software can be described as the conjunction of predicates, in relation to the software's input variables, along that path. However, path testing is generally considered impractical because a program with loop statements can have an infinite number of paths. A path is said to be 'feasible' when there exists an input for which the path is traversed during program execution; otherwise the path is unfeasible.
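The three criteria can be made concrete with a small sketch. The following Python fragment is illustrative only; the instrumented function, its statement labels (s1 to s5) and decision labels (d1, d2) are invented for exposition and do not come from the thesis. It records which statements and branches a given test set exercises:

```python
def classify(a, b, trace):
    """Instrumented example: 'trace' records executed statements and branches."""
    trace["statements"].add("s1")
    if a <= b:                      # decision 1
        trace["branches"].add("d1_true")
        trace["statements"].add("s2")
        if a == b:                  # decision 2
            trace["branches"].add("d2_true")
            trace["statements"].add("s3")
            return "equal"
        trace["branches"].add("d2_false")
        trace["statements"].add("s4")
        return "less"
    trace["branches"].add("d1_false")
    trace["statements"].add("s5")
    return "greater"

def coverage(test_set):
    """Return (statement coverage, branch coverage) achieved by a test set."""
    trace = {"statements": set(), "branches": set()}
    for a, b in test_set:
        classify(a, b, trace)
    all_statements = {"s1", "s2", "s3", "s4", "s5"}
    all_branches = {"d1_true", "d1_false", "d2_true", "d2_false"}
    return (len(trace["statements"]) / len(all_statements),
            len(trace["branches"]) / len(all_branches))
```

The test set {(1, 2), (3, 3), (5, 1)} exercises every branch, and hence every statement; dropping (3, 3) misses the equality branch, illustrating why branch coverage subsumes statement coverage but not vice versa.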

2.2 Automatic test data generator

Extensive testing can only be carried out by automating the test process, as claimed by Staknis [1990]. The benefits are a reduction in time, effort, labour and cost for software testing. Automated testing tools consist in general of an instrumentor, a test harness and a test data generator.

Static analysing tools analyse the software under test without executing the code, either manually or automatically. It is a limited analysis technique for programs containing array references, pointer variables and other dynamic constructs. Experiments with code inspections (visual inspections) have shown that static analysis is very effective, finding 30% to 70% of the logic design and coding errors in typical software, DeMillo [1987]. Symbolic execution and evaluation is a typical static technique for generating test data.


Many automated test data generators are based on symbolic execution, Howden [1977], Ramamoorthy [1976]. Symbolic execution provides a functional representation of a path in a program: it assigns symbolic names to the input values and evaluates the path by interpreting its statements and predicates in terms of these symbolic names, King [1976]. Symbolic execution requires the systematic derivation of these expressions, which can take much computational effort, Fosdick and Osterweil [1976].

The values of all variables are maintained as algebraic expressions in terms of symbolic names. The value of each program variable is determined at every node of a flow graph as a symbolic formula (expression) for which the only unknowns are the program input values. The symbolic expression for a variable carries enough information such that, if numerical values are assigned to the inputs, a numerical value can be obtained for the variable; this is called symbolic evaluation. The characteristics of symbolic execution are:

• Symbolic expressions are generated and show the necessary requirements to execute a certain path or branch, Clarke [1976]. The result of symbolic execution is a set of equality and inequality constraints on the input variables; these constraints may be linear or non-linear and define a subset of the input space that will lead to the execution of the path chosen;

• If the symbolic expression can be solved the test path is feasible and the solution corresponds to a set of input data which will execute the test path. If no solution can be found then the test path is unfeasible;

• Manipulating algebraic expressions is computationally expensive, especially when performed on a large number of paths;

• Common problems are variable-dependent loop conditions, array reference subscripts that depend on input variables (their values may only be known at run time), module calls and pointers, Korel [1990];

• These problems slow down the successful application of symbolic execution, especially if many constraints have to be combined, Coward [1988] and Gallagher [1993].

Some program errors are easily identified by examining the symbolic output of a program if the program is supposed to compute a mathematical formula. In this kind of event, the output just has to be checked against the formula to see if they match.
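As an illustration of these ideas, the following Python sketch (an assumption for exposition, not one of the cited tools) represents each path of a small two-input program by the conjunction of its branch predicates over the symbolic inputs A and B, and performs symbolic evaluation by substituting numeric inputs into the resulting path condition:

```python
# Path conditions for a program with nested comparisons of inputs A and B;
# the node names and predicate strings are illustrative assumptions.
PATHS = {
    "node 3": ["A <= B", "A == B"],   # path code TT
    "node 4": ["A <= B", "A != B"],   # path code TF
    "node 5": ["A > B"],              # path code F
}

def path_condition(node):
    """Return the symbolic path condition as a conjunction of predicates."""
    return " and ".join(PATHS[node])

def satisfies(node, a, b):
    """Symbolic evaluation: substitute numeric inputs into the expression."""
    return eval(path_condition(node), {"A": a, "B": b})
```

Solving such a condition (finding a and b for which it is true) yields test data that execute the path; if no solution exists, the path is unfeasible.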

In contrast to static analysis, dynamic testing tools involve the execution of the software under test and rely upon feedback from the software (achieved by instrumentation) in order to generate test data. Precautions are taken to ensure that these additional instructions have no effect whatever upon the logic of the original software. A representative of this method is described by Gallagher et al. [1993], who used instrumentation to return information to the test data generation system about the state of various variables, path predicates and test coverage. A penalty function evaluates, by means of a constraint value of the branch predicate, how good the current test data is with regard to the branch predicate. There are three types of test data generators: pathwise, data specification and random test data generators.

A test set is run on the software under test, and the output is saved as the actual output of that test case. The program tester has to examine the output and decide whether it is correct by comparing the actual output with the expected output. If the output is incorrect, an error has been discovered; the program must be changed and testing must start again. This leads to regression testing: executing all previous test data to verify that the correction introduced no new errors. BCS SIGIST [1995] defined regression testing as:

Retesting of a previously tested program following modification to ensure that faults have not been introduced or uncovered as a result of the changes made.

To finish testing, the tester will manually examine the output of the test cases to determine whether they are correct.

Deason [1991] investigated the use of rule-based testing methods using integers and real variables. His system uses prior tests as input for the generation of additional tests. The test data generator assigns values directly to input variables in conditions with constants and then increments and decrements them by small amounts to come closer to the boundary. The input values are doubled and halved resulting in much faster movements through the search space. Finally one input variable at a time is set to a random number.

The result of this project is that the rule-based method almost always performed better than random testing. Deason called his method multiple-condition boundary coverage, which requires both condition outcomes (true and false) in every decision. Boundary here means that the test data has to be as close as possible to switching the conditions from true to false. He mentioned that it is not possible to generate test data which causes the execution of all branches in arbitrary software, i.e. no algorithm exists for solving general non-linear predicates. In addition, his approach is restricted to numeric types such as integers. A rule-based approach always has the drawback that a rule must exist for every unforeseen problem, which can be difficult to realise. If such a rule does not exist, the generation of test data can end in a local optimum rather than the global solution which will traverse a branch.

2.2.1 Pathwise generators

Pathwise test data generators are systems that test software using a testing criterion, which can be path coverage, statement coverage, branch coverage, etc. The system automatically generates test data to meet the chosen requirement. A pathwise test generator consists of a program control flow graph constructor, a path selector and a test data generator.

Deterministic heuristics have been used by Korel [1990, 1992]. He used a dynamic test data generation approach, based on a pathwise test data generator, to locate automatically the values of input variables for which a selected path is traversed. His steps are program control flow graph construction and test data generation. The path selection stage is not used because, if unfeasible paths are selected, the result is a waste of computational effort in examining them. Most pathwise methods use symbolic evaluation to generate test data; Korel's method, in contrast, is based on data flow analysis and a function minimisation approach, and relies on executing the software under test. Since the software under test is executed, the values of array indexes and pointers are known at each point in the execution, and this overcomes the problems and limitations of symbolic evaluation.

In Korel's approach a branch function is formed from the branch predicate through which control should flow. These functions depend on the input variables, which can be difficult to represent algebraically, but their values can be determined by program execution. The idea is to minimise the branch function, which has a positive value when the desired branch is not executed. As soon as the value becomes negative, the branch will be traversed. The minimisation of the branch function demands that the path up to the node, where the sibling node should be executed, is retained for the next test data.

A Boolean condition has two branches, a true and a false node, see Figure 2.1. A reference to the sibling node means the other node corresponding to the currently executed node. For example, the sibling node of the true branch is the false branch.

Figure 2.1: Sibling nodes (a Boolean condition leading to a true branch and a false branch).

Korel's method of generating test data is based on the alternating variable method, which consists of minimising the current branch function. Each input variable in turn is increased or decreased by a small amount in an exploratory search, and by larger amounts in a pattern search. The effect of this is one of the following:

• a decrease in the branch function value, so that the direction in which to change the variable is known; the search keeps changing it in this direction, and the new test data and value replace the old ones;

• an increase in the branch function value, which results in reversing the direction in which the variable is manipulated;

• no effect, in which case the next variable is manipulated.

Once one cycle is completed, the method continuously cycles around the input variables until the desired branch is executed or no progress (decrease of the branch function) can be made for any input variable. To reduce the search time significantly, Korel applies a dynamic data flow analysis to determine those input variables which influence the current branch function value for a given program input, in order to reduce the number of unnecessary tries; only these variables need to be manipulated during the minimisation process. The disadvantage is that this is not a general approach to software testing: it can fail depending on the initial test set, because the subdomain of a branch may comprise small and disconnected regions, see e.g. section 5.1 and Figure 5.2. In this case, this local search technique has to be replaced by a global optimisation technique; otherwise only local minima of the branch function may be found, which might not be good enough to traverse the desired branch. This is where Genetic Algorithms gain their advantage and strength as a global optimisation process, because they do not depend on a continuous and connected domain.
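The alternating variable method described above can be sketched as follows. This Python fragment is illustrative only; the unit step size, the doubling pattern moves and the example branch function |a - b| (for a predicate a = b) are assumptions for exposition, not Korel's implementation:

```python
def alternating_variable_search(branch_fn, x, max_cycles=100):
    """Minimise branch_fn over integer inputs x by alternating variables."""
    x = list(x)
    for _ in range(max_cycles):
        improved = False
        for i in range(len(x)):
            for direction in (+1, -1):          # exploratory probes
                step = 1
                trial = list(x)
                trial[i] += direction * step
                # Pattern search: pursue a successful direction with
                # doubling steps while the branch function decreases.
                while branch_fn(trial) < branch_fn(x):
                    x = trial
                    improved = True
                    step *= 2
                    trial = list(x)
                    trial[i] += direction * step
                if branch_fn(x) <= 0:
                    return x                    # desired branch is taken
        if not improved:
            break                               # stuck in a (local) minimum
    return x

# Branch function for a predicate 'a == b': positive while the true
# branch is missed, zero once equality holds.
branch = lambda v: abs(v[0] - v[1])
```

Starting from, say, [100, 3], the search alternately adjusts each variable until the branch function reaches zero and the equality branch would be traversed. On a branch function with small, disconnected subdomains this local descent can stall in a local minimum, which is exactly the failure mode the text attributes to the method.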

The basic idea of domain testing, White and Cohen [1978, 1980], is that each path of the software belongs to a certain subdomain, which consists of those inputs which are necessary to traverse that path.

Examples of these subdomains are shown in Figure 2.3 for the software example in Listing 2.1 and the control flow tree in Figure 2.2.

if A <= B then
   put("node 2");
   if A = B then
      put("node 3");
   else -- A < B
      put("node 4");
   end if;
else -- A > B
   put("node 5");
end if;

Listing 2.1: Example of software.

Figure 2.2: Control flow tree for the example (node 1 branches on A <= B to node 2 and on A > B to node 5; node 2 branches on A = B to node 3 and on A < B to node 4).


Figure 2.3: Example of an input space partitioning structure in the range of -15 to 15 (axes A and B; subdomain of node 3: TT; subdomain of node 4: TF; subdomain of node 5: F; subdomain of node 2: TT or TF, i.e. the subdomains of nodes 3 and 4 together).

The domain is the set of all valid test sets. It is divided into subdomains such that all members of a subdomain cause a particular branch to be exercised; alternatively, a different division into subdomains is formed for path testing, etc. The domain notation may be based upon which branch (true or false) has been taken: a character code specifies the branch (here also the path), e.g. TT, TF, F, and the respective node is also mentioned. The subdomain of node 5 is the dark grey area, the subdomain of node 3 is the diagonal line and the subdomain of node 4 is the light grey area, whereas the subdomain of node 2 includes the light grey area (node 4) plus the diagonal line (node 3).
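The character codes of Figure 2.3 can be sketched directly. The following Python fragment (an illustrative transcription of Listing 2.1, not part of the thesis) classifies an input pair (A, B) by the branch outcomes it takes and the node it reaches:

```python
def subdomain(a, b):
    """Return the branch-outcome code and the node an input (a, b) reaches."""
    if a <= b:                 # first decision: code starts with T
        if a == b:
            return "TT", 3     # the diagonal line A = B
        return "TF", 4         # the area A < B
    return "F", 5              # the area A > B
```

Node 2's subdomain is then the union of those of nodes 3 and 4, i.e. every input whose code starts with T; note how thin the TT subdomain (a single line) is compared with the two areas, which is why equality branches are hard to hit by chance.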

Domain testing tries to check whether the border segments of the subdomains are correctly located, by executing the software with test data to find errors in the flow of control through the software. These test data belong to an input space which is partitioned into a set of subdomains, each belonging to a certain path in the software. The boundary of these domains is given by the predicates in the path condition, where a border segment is the section of the boundary created by a single path condition. Two types of boundary test points are necessary: on and off test points. The on test points are on the border within the domain under test, and the off points are outside the border within an adjacent domain; if the software generates correct results for all these points, then it can be considered that the border locations are correct. White and Cohen proposed having two on points and one off point, where the off point lies between the two on points. Clarke [1982] extended this technique and suggested having two off points, one at each end of the border under test.
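For the border segment A = B of the example above, White and Cohen's selection of two on points and one off point might be sketched as follows. This Python fragment is an illustrative assumption (an integer domain and the specific point placement are not taken from the thesis):

```python
def on_off_points(lo, hi):
    """Two on points at the ends of the border A = B, inside the domain
    under test, and one off point between them in the adjacent domain."""
    on1, on2 = (lo, lo), (hi, hi)        # on the border A = B
    mid = (lo + hi) // 2
    off = (mid, mid + 1)                 # just off the border, where A < B
    return on1, on2, off

# Points for the -15..15 input space of Figure 2.3.
on1, on2, off = on_off_points(-15, 15)
```

If the program produces correct results for all three points, the border's position and relational operator are taken to be correct; Clarke's extension would add off points near both ends of the segment instead of one in the middle.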

Therefore, domain testing strategies concentrate on domain errors which are based on the shifting of a border segment by using a wrong relational operator. This means test data have to be generated for each border segment (predicate) to see whether the relational operator and the position of the border segment are correct, White and Cohen [1980] and see chapter 4 for further detail. Hence points close to the border (called boundary test data) are the most sensitive test data for analysing the path domains and revealing domain errors, Clarke and Richardson [1983].

Domain testing, therefore, is an example of partition testing. It divides the input space into equivalent domains and it is assumed that all test data from one domain are expected to be correct if a selected test data from that domain is shown to be correct.

25% of all errors (bugs) arise out of structural and control flow errors according to Beizer [1990], pp. 463. Two different types of errors were identified by Howden [1976].

A domain error occurs when a specific input takes the wrong path, because an error in the control flow of the program makes it end in a different subdomain. A computational error arises when an input follows the correct path, but an error in some assignment statement causes the wrong output. In complex systems with several input variables it is usually hard to find data points which belong to a small subdomain, because the subdomain does not contain many possible data points. Zeil et al. [1992] and White and Cohen [1978] used domain testing in their strategies, but restricted it to linear predicates over floating point variables; they use a symbolic analysis for the paths and a geometrical approach for test data generation. Chapter 4 contains an example of generating test data using Genetic Algorithms to check the subdomains and boundaries of a small program.

Mutation testing is an implementation of an error-based testing method, DeMillo [1978]. It is based on the introduction of a single syntactically correct fault, e.g. by manipulation of conditions and statements; the new program is called a mutant. Mutation testing is used to show the absence of prespecified faults, Morell [1990]. After checking the output of the original program to be correct (by some oracle), the same test data is input to the mutant; if the output of the mutant differs from the expected output, the mutant is killed by the test data because the error is detected. If the mutant has not been killed, it is said to be still alive, and more test data have to be generated to kill it. If a mutant cannot be killed, then either it is an equivalent mutant (no functional change) or the test data sets are not of sufficient quality, which reveals a weakness in the test data (low quality test data). If test data do not kill the mutants, either the test data generation tool is not good enough and the original program has to be re-examined, or additional test data have to be produced until some threshold is met where it appears impossible to reveal a difference. Checking correctness is a major factor in the high cost of software development; it can be automated by using an oracle, post conditions, Holmes [1993], or an input-output specification. The main task is to generate test data that reveal a difference between the mutant's behaviour and that of the original software. A detailed description and application is given in chapter 8. Mutation testing shows only the absence of certain faults, and it is very time consuming to generate and test a large number of mutant programs. Hypothesis 4 is formulated, which states that GAs are robust in generating test data, and asks whether it is better to generate test data for the original or for the mutant program.
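The kill criterion can be illustrated with a minimal sketch. In the following Python fragment (illustrative only; the programs and the mutated operator are invented, and no oracle is modelled beyond comparing the two programs), a single relational operator is mutated and a test datum kills the mutant if the mutant's output differs from the original's:

```python
def original(a, b):
    return "low" if a < b else "high"

def mutant(a, b):                  # single fault: '<' mutated to '<='
    return "low" if a <= b else "high"

def killed(tests):
    """True if some test reveals a difference between program and mutant."""
    return any(original(a, b) != mutant(a, b) for a, b in tests)
```

The test data (2, 5) and (7, 3) leave this mutant alive, since both programs agree on them; only a boundary case with a = b, e.g. (4, 4), kills it. This is the sense in which surviving mutants expose low quality test data.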

2.2.2 Data specification generators

Deriving test data from a specification belongs to the black box testing method. Such a strategy generates test cases and test data, e.g. from a formal Z specification, Yang [1995]. The test data can then be applied to the software and their effectiveness measured, e.g. using ADATEST (an automatic testing system for Ada software which measures, for example, the percentage of statements executed or branches covered).

A disadvantage is the need for a formal specification of the software, which often does not exist, Gutjahr [1993].

2.2.3 Random testing

Random testing selects test data arbitrarily from the input domain and applies them to the program under test. The automatic production of random test data, drawn from a uniform distribution, should be the default method against which other systems are judged, Ince [1987]. Statistical testing is a test case design technique in which the tests are derived according to the expected usage distribution profile.

Taylor [1989], Ould [1991] and Duran [1981] suggested that the distribution of selected input data should have the same probability distribution as the inputs which will occur in actual use (the operational profile, i.e. the distribution which occurs during real use of the software) in order to estimate the operational reliability.

Hamlet [1987] mentioned that the operational distribution for a problem may not be known, and a uniform distribution may choose points from an unlikely part of the domain, which can lead to inaccurate predictions; however, he still favours this technique. Duran [1981] was of the opinion that an operational profile is not as effective for error detection as a uniform distribution. Taylor [1989] mentioned that when concentrating on a certain feature, partition testing makes it easier and simpler to generate test data than actual user inputs (operational profile), because the tests are focused on a particular feature; yet partition testing was less effective than random testing in detecting faults which caused a printer controller to crash, Taylor [1989]. Random testing is the only standard in reliability estimation, Hamlet and Taylor [1990], because it can use data which resemble the user's operational profile. Partition testing in general cannot supply this information because it focuses on test data in partitions that are more likely to fail, so that the failure rate under partition testing would be higher than that in expected actual use. If it is not known where the faults are likely to be, partition testing is not significantly better than random testing.

Hamlet and Taylor [1990] mentioned that there is not much difference between partition and random testing in terms of finding faults. Hamlet showed that random testing is superior to partition testing with regard to human effort, especially with many partitions and when confidence is required; for a small number of subdomains, partition testing will perform better than random testing.

On the contrary, Deason [1991] commented that random number generators are ineffective in that they rarely provide the necessary coverage of the program. Myers [1979] strengthened this comment and is of the opinion that random testing is probably the poorest testing methodology. However, Duran and Ntafos [1984] and Duran [1981] stated that many errors are easy to find; the problem is to determine whether a test run failed. Therefore, automatic output checking is essential if large numbers of tests are to be performed. They also said that partition testing is more expensive than performing an equivalent number of random tests; random testing is more cost effective because it only requires a random number generator and a small amount of software support.

DeMillo [1978] showed that the adequacy of random data is very dependent on the interval (range) from which the data are generated: data from poorly chosen intervals are much worse than those from well chosen intervals. Duran and Ntafos [1984] agreed that changing the range for random testing has a great effect. They further mentioned a disadvantage of random testing: conditions requiring equality of values are difficult to satisfy with randomly generated data.

Moranda [1978] and Bertolino [1991] commented that the advantage of random testing is normally that it stresses the program under test more than hand-selected test data; on the other hand, random inputs may never exercise both branches of a predicate which tests for equality. Even where random testing is cheaper than partition testing, the slight advantage of partition testing could be compensated for by running more random tests; however, there is no assurance that full coverage can be obtained, e.g. if equality between variables is required, and it may mean examining the output from thousands of tests.

Random testing was especially recommended for the final testing stage of software by Tsoukalas [1993] and Girard and Rault [1973]. Duran and Ntafos [1984] recommended a mixed final testing, starting with random testing, followed by a special value testing method (to handle exceptional cases). Ince [1986] reported that random testing is a relatively cheap method of generating initial test data.

It was decided to use random testing as a benchmark for our Genetic Algorithm testing system, as suggested by Ince [1987]. It offers a good basis of comparison, so that other testing tools can easily be compared indirectly to our system. Whether the comparison of GAs with random testing is appropriate, especially when generating test data for disconnected subdomains, will be examined and discussed in chapters 5, 6, 7 and 8. Therefore, hypothesis 1 is formulated, see section 1.2: is using GAs to generate test data more effective than using random testing?
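A random testing benchmark of this kind can be sketched in a few lines. The following Python fragment (illustrative only; the input ranges and trial counts are assumptions, and the branch map transcribes the Listing 2.1 example rather than the thesis's harness) draws inputs uniformly and records which branches are covered:

```python
import random

def branches_hit(a, b):
    """Branch (node) reached by one input pair of the Listing 2.1 example."""
    if a <= b:
        return {"node 3"} if a == b else {"node 4"}
    return {"node 5"}

def random_test(trials, lo, hi, seed=0):
    """Uniform random testing: return the set of branches covered."""
    rng = random.Random(seed)          # seeded for a repeatable benchmark
    covered = set()
    for _ in range(trials):
        a, b = rng.randint(lo, hi), rng.randint(lo, hi)
        covered |= branches_hit(a, b)
    return covered
```

With a narrow range the equality branch (node 3) is hit fairly quickly, but widening the range makes A = B, and hence full branch coverage, increasingly unlikely; this is the range sensitivity and equality problem of random testing noted above, and the baseline against which the GA is judged.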


The next chapter explains the method of Genetic Algorithms and their features and characteristics. An example shows the method of working of the GAs.


CHAPTER 3

Genetic Algorithm

Optimisation problems arise in almost every field, especially in the engineering world. As a consequence many different optimisation techniques have been developed. However, these techniques quite often have problems with functions which are not continuous or differentiable everywhere, multi-modal (multiple peaks) or noisy. Therefore, more robust optimisation techniques are under development which may be capable of handling such problems. In the past, biological and physical approaches have become of increasing interest for solving optimisation problems, including, for the former, neural networks, genetic algorithms and evolution strategies (ESs), and, for the latter, simulated annealing, Hills and Barlow [1994], Rayward-Smith and Debuse [1994], Bayliss [1994], Golden and Skiscim [1986], Hoffmann et al. [1991], Osborne and Gillett [1991].

Other optimisation techniques are:

• Tabu search, Glover [1989], Reeves et al. [1994], Rayward-Smith and Debuse [1994];

• Simplex method, Box [1965];

• Hooke Jeeves, Hooke and Jeeves [1961];

• Gradient method, Donne et al. [1994].

This chapter explains the features and methods of Genetic Algorithms. An introduction to genetic algorithms is followed by an optimisation example using genetic algorithms.

3.1 Introduction to Genetic Algorithms

Genetic algorithms (GAs) represent a class of adaptive search techniques and procedures based on the processes of natural genetics and Darwin's principle of the survival of the fittest. There is a randomised exchange of structured information among a population of artificial chromosomes. GAs are a computer model of biological evolution. When GAs are used to solve optimisation problems, good results are obtained surprisingly quickly.

In the context of software testing, the basic idea is to search the domain for input variables which satisfy the goal of testing.
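To make the interplay of selection, crossover and mutation concrete, the following Python sketch shows a bare-bones GA loop. It is illustrative only: the bit-string representation, tournament selection, one-point crossover and all parameter values are assumptions for exposition, not the configuration used in this thesis, and the fitness function here is the simple 'ones counting' problem whose optimum is the all-ones chromosome:

```python
import random

def genetic_algorithm(fitness, length=16, pop_size=30, generations=60,
                      p_mut=0.02, seed=1):
    """Evolve a population of bit strings towards high fitness."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(length)]
           for _ in range(pop_size)]
    for _ in range(generations):
        new_pop = []
        while len(new_pop) < pop_size:
            # Selection: the fitter of two random individuals becomes a parent.
            p1 = max(rng.sample(pop, 2), key=fitness)
            p2 = max(rng.sample(pop, 2), key=fitness)
            # Crossover: the child mixes genes from both parents.
            cut = rng.randrange(1, length)
            child = p1[:cut] + p2[cut:]
            # Mutation: occasionally flip a bit.
            for i in range(length):
                if rng.random() < p_mut:
                    child[i] ^= 1
            new_pop.append(child)
        pop = new_pop
    return max(pop, key=fitness)

# 'Ones counting': fitness is the number of 1-bits in the chromosome.
best = genetic_algorithm(sum)
```

In the software testing setting the chromosome would encode the program's input variables and the fitness would reward progress towards the testing goal, e.g. towards traversing a chosen branch.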

3.2 Application of Genetic Algorithms

Living creatures are remarkably good problem solvers. In the range of tasks which they have to cope with (e.g. adaptation to a changing environment), they do better by far than the best computer programs, to the frustration of programmers, because these organisms acquire their abilities through the apparently aimless mechanism of evolution.

GAs have been used in many different applications as an adaptive search method, e.g. by Grefenstette et al. [1985 a], Allen and Lesser [1989], and Jog et al. [1989].

The fundamental concept of GAs is to evolve successive generations of increasingly better combinations of those parameters which significantly affect the overall performance of a design. GAs are closely related to evolution strategies. Perhaps the slowness of biological evolution has led to the fallacy that evolution strategies are in principle time consuming and inefficient.

However, the Berlin professor of engineering science, Rechenberg, independently developed and successfully applied a selection-mutation strategy, named Evolutionsstrategie (ES) and based on the trial-and-error principle, in the early 1960s. Rechenberg [1965] and Schwefel [1989] used ES to develop optimal solid shapes for flow engineering and achieved astonishing shapes for a nozzle, distinguished by their high efficiency; mutation was the key operator in this investigation. Rechenberg also applied ES to determine the optimum configuration of five steel plates, each of which could take fifty-one different positions, corresponding to a search space of some 3.45x10^8 plate configurations. An optimal solution was achieved by using the ES. By selecting an experiment with a known optimal solution, Rechenberg successfully demonstrated the capabilities of an evolutionary algorithm.

This result is very impressive because the experimental method was quite simple while the mathematical approach failed due to the complexity of the problem. Evolution strategies avoid the disadvantage of a pure random method, namely that no step builds upon another: their steps are based on the experience gained from previous trials, Abblay [1987]. Many other projects have used evolutionary strategies, among them the development of a right-angled pipe with minimum flow resistance and other structural elements, Parmee [1992].

A parallel development of evolutionary algorithms has also taken place in the USA. Seminal work concerning the formulation and behaviour of GAs was pioneered by Holland [1975] in his research on adaptation in natural and artificial systems at the University of Michigan. Three basic operators drive GAs: selection, crossover and mutation. The main genetic operator is crossover, which performs recombination (mixing) of different solutions and ensures that the genetic information of a child life form is made up of elements (genes) from each parent. Holland simulated, in computer software models, the methods used when biological systems adapt to their environment, in order to solve optimisation problems. In the former USSR, too, the method of evolution has been an accepted technique for many years, Zhigljavsky [1991].

Evolution avoids one of the most difficult obstacles with which the software designer is confronted: the need to know in advance what to do for every situation which may confront a program. The advantage of GAs is the fact that they are adaptive; GAs have already achieved epoch-making solutions for complex tasks like the construction of an aeroplane turbine. Evolution is under the influence of two fundamental processes: natural selection and recombination. The former determines which individual members of a population are selected, survive and reproduce; the latter ensures that the genes (or entire chromosomes) are mixed to form new ones. Human beings have used a combination of crossbreeding and selection to breed more productive corn, faster racehorses or more beautiful roses for thousands of years, Holland [1992].

Goldberg [1989] mentioned that various engineering projects had applied GAs to solve problems, e.g. to optimise gas pipeline control. This problem is governed by non-linear state transition equations that describe the pressure drop through the pipelines and the pressure rise across compressors.

Other applications describe the use of GAs in the optimisation of a ten-member plane truss, Goldberg and Samtani [1986]. The objective of this problem is to minimise the weight of the structure subject to maximum and minimum stress constraints on each member. As reported, the GA consistently found optimal solutions.

Grefenstette and Fitzpatrick [1985 b] considered a medical imaging system problem which lies in the registration of digital images. The functions to be optimised in image registration are measures of the difference between (in this case) two x-ray images, one taken prior to the injection of dye into the artery and one taken following the injection. The images differ because of motion which has taken place between the two acquisition times. The two images are digitised and subtracted pixel by pixel, with the desired end result being a difference image that clearly outlines the interior of the subject artery. By performing a geometrical transformation which warps one image relative to the other, it is possible to improve the registration of the images so that the difference due to the motion is reduced.

GAs have also been applied to the classical Prisoner's Dilemma problem, studied by Mühlenbein [1991 a], Fujiki and Dickinson [1987] and Wilson [1987]; in its simplest form, each of the two players has a choice of co-operating with the other or defecting. Grefenstette et al. [1985 b] and Oliver et al. [1987] applied GAs to the well known combinatorial optimisation problem, the Travelling Salesman Problem (TSP), which consists of finding the shortest tour (normally in Euclidean distance) through n cities. Shultz et al. [1992, 1993] used GAs to test autonomous vehicle software controllers. The task in this application is to search for combinations of faults that produce abnormal vehicle controller performance, by comparing a controller against a chosen set of fault scenarios within a vehicle simulator. They concluded that this approach using GAs is an effective method compared to manual testing of sophisticated software controllers.

In addition, GAs have been applied to a range of design optimisation problems, for example: the design of the concrete shell of an arch dam (a large-scale hydropower scheme), Parmee [1994]; the design of digital filters, Suckley [1991], Roberts and Wade [1994]; the design of the cooling hole geometry of gas turbine blades, Parmee et al. [1993], Parmee and Purchase [1991]; the design of microwave absorbers (low-profile radar-absorbing materials) which reduce the radar signature of military hardware, Tennant and Chambers [1994]; and the generation of test patterns for VLSI circuits, O'Dare et al. [1994].

3.3 Overview and basics of Genetic Algorithms

GAs offer a robust non-linear search technique that is particularly suited to problems involving large numbers of variables. The GA approaches the optimum solution through the repeated application of selection, crossover and mutation to a population of candidate solutions.
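The generic GA cycle introduced here can be sketched in a few lines. The bit-string encoding, truncation selection, one-point crossover and the OneMax fitness function below are illustrative textbook choices, not the test-data-generation setup developed later in this thesis.

```python
import random

def genetic_algorithm(fitness, n_bits=16, pop_size=30, generations=100):
    """Maximise `fitness` over fixed-length bit strings.

    One generation = evaluate, select the fitter half, then refill the
    population with crossed-over and occasionally mutated offspring.
    """
    pop = [[random.randint(0, 1) for _ in range(n_bits)]
           for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(pop, key=fitness, reverse=True)
        parents = scored[: pop_size // 2]          # truncation selection
        pop = parents[:]
        while len(pop) < pop_size:
            p1, p2 = random.sample(parents, 2)
            cut = random.randrange(1, n_bits)      # one-point crossover
            child = p1[:cut] + p2[cut:]
            if random.random() < 0.05:             # point mutation
                i = random.randrange(n_bits)
                child[i] ^= 1
            pop.append(child)
    return max(pop, key=fitness)

if __name__ == "__main__":
    # "OneMax": fitness is simply the number of ones in the string,
    # so the optimum is the all-ones string.
    random.seed(0)
    best = genetic_algorithm(sum)
    print(sum(best))
```

Even this minimal loop exhibits the behaviour described above: fit partial solutions (runs of ones) are recombined by crossover, while mutation maintains diversity and prevents premature convergence.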
