Work carried out while this author worked at ETH Zurich.
On the number and nature of faults found by random testing
Copyright © 2009 John Wiley & Sons, Ltd.
Software Testing, Verification and Reliability
Special Issue: ICST 2008, the First IEEE International Conference on Software Testing, Verification and Validation
Volume 21, Issue 1, pages 3–28, March 2011
How to Cite
Ciupa, I., Pretschner, A., Oriol, M., Leitner, A. and Meyer, B. (2011), On the number and nature of faults found by random testing. Softw. Test. Verif. Reliab., 21: 3–28. doi: 10.1002/stvr.415
- Issue online: 6 JUL 2009
- Version of Record online: 6 JUL 2009
- Manuscript Accepted: 19 MAY 2009
- Manuscript Revised: 4 MAY 2009
- Manuscript Received: 15 AUG 2008
- Funding: FhG Internal Programs. Grant Number: Attract 692166
Keywords:
- random tests;
- object-oriented software;
- test selection strategies;
- empirical studies
Intuition suggests that random testing should exhibit a considerable difference in the number of faults detected by two different runs of equal duration. As a consequence, random testing would be rather unpredictable. This article first evaluates the variance over time of the number of faults detected by randomly testing object-oriented software that is equipped with contracts. It presents the results of an empirical study based on 1215 h of randomly testing 27 Eiffel classes, each with 30 seeds of the random number generator. The analysis of over 6 million failures triggered during the experiments shows that the relative number of faults detected by random testing over time is predictable, but that different runs of the random test case generator detect different faults. The experiment also suggests that random testing quickly finds faults: the first failure is likely to be triggered within 30 s. The second part of this article evaluates the nature of the faults found by random testing. To this end, it first explains a fault classification scheme, which is also used to compare the faults found through random testing with those found through manual testing and with those found in field use of the software and recorded in user incident reports. The results of the comparisons show that each technique is good at uncovering different kinds of faults. None of the techniques subsumes any of the others; each brings distinct contributions. This supports a more general conclusion on comparisons between testing strategies: the number of detected faults is too coarse a criterion for such comparisons—the nature of faults must also be considered.
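The setup described in the abstract — seeded random testing of a contract-equipped class, where contract violations signal faults — can be illustrated with a minimal sketch. This is not the AutoTest tool used in the study; it is a toy Python harness in which assertions stand in for Eiffel preconditions, and `BoundedStack`, `random_test`, and the call budget are illustrative choices:

```python
import random

class BoundedStack:
    """Toy contract-equipped class: assertions play the role of Eiffel preconditions."""
    def __init__(self, capacity):
        assert capacity > 0, "precondition: capacity must be positive"
        self.capacity = capacity
        self.items = []

    def push(self, x):
        assert len(self.items) < self.capacity, "precondition: stack not full"
        self.items.append(x)

    def pop(self):
        assert self.items, "precondition: stack not empty"
        return self.items.pop()

def random_test(seed, n_calls=200):
    """Run one seeded random-testing session on a fresh BoundedStack.

    Returns the set of distinct contract violations observed (a stand-in
    for 'faults') and the index of the call that triggered the first failure.
    """
    rng = random.Random(seed)       # fixing the seed makes the run reproducible
    stack = BoundedStack(capacity=3)
    failures = set()
    first_failure = None
    for i in range(n_calls):
        op = rng.choice(["push", "pop"])
        try:
            if op == "push":
                stack.push(rng.randint(0, 9))
            else:
                stack.pop()
        except AssertionError as e:
            failures.add(str(e))    # record the violated contract, keep testing
            if first_failure is None:
                first_failure = i
    return failures, first_failure
```

Comparing the `failures` sets returned for different seeds mirrors the study's observation: each seeded run is deterministic and failures tend to appear early, yet distinct runs may expose different subsets of the faults.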