A measure of the difference between test sets for generating controlled random tests
https://doi.org/10.37661/1816-0301-2022-19-4-7-26
Abstract
Objectives . The problem of constructing the characteristics of the difference between test sequences is solved. Its relevance for generating controlled random tests and the complexity of finding measures of difference for symbolic tests are substantiated. The limitations of using the Hamming and Damerau – Levenshtein distances to obtain a measure of the difference between test sets are shown.
Methods . Based on the characteristic of the interval used in the theory of the chain of successive events, a new measure of the difference between two symbolic test sets is determined. As a difference measure, the distance AD(Ti, Tk) between the test sets Ti and Tk is calculated using the interval characteristic, which is based on determining independent pairs of same (identical) symbols belonging to two sets and calculating the intervals between them.
Results. The combinatorial nature of the calculation of the proposed difference measure for symbolic test sets of an arbitrary alphabet and dimension is shown. An example of calculating this measure for various types of test sets, including such as address test sets, is given. Possible modifications are shown and some properties and limitations are determined. The application of the measure of difference is considered for the case of repeated testing of storage devices based on address sequences pA with even p repetition of addresses. For the case p = 2, mathematical relations are given for calculating the intervals and distances AD(Ti, Tk) for address sequences 2A used for controlled random testing of storage devices. The main attention is paid to binary test sets, when the task of calculating given difference metric is reduced to the classical assignment problem using the Hungarian algorithm. The computational complexity of the Hungarian algorithm is estimated by the relation O(n4). As an alternative to the Hungarian algorithm, an algorithm for calculating the considered difference measure is proposed, the complexity of which is much less and has an estimate equal to O(n2). The experimental studies confirm the effectiveness of the proposed algorithm.
Conclusion. The proposed difference measure extends the possibilities of generating test sequences when generating controlled random tests. It is shown that test sets, which are indistinguishable when Hamming distance is used as a measure of difference, have different values of AD(Ti, Tk) that allows to make more accurate classification of randomly generated sets as candidates for test sets.
Keywords
About the Authors
V. N. YarmolikBelarus
Vyacheslav N. Yarmolik, D. Sc. (Eng.), Professor
st. P. Brovki, 6, Minsk, 220013
V. V. Petrovskaya
Belarus
Vita V. Petrovskaya, M. Sc. (Eng.)
st. P. Brovki, 6, Minsk, 220013
I. Mrozek
Poland
Ireneusz Mrozek, D. Sc., Lecture
Wiejska, 45A, 15-351, Białystok
References
1. Anand S., Burke E. K., Chen T. Y., Clark J., Cohen M. B., …, Zhu H. An orchestrated survey on automated software test case generation. Journal of Systems and Software, 2014, vol. C-39, no 4, pp. 582–586.
2. Malaiya Y. K., Yang S. The coverage problem for random testing. Proceedings of the International Test Conference, Philadelphia, PA, USA, October 1984. Philadelphia, 1984, pp. 237–242.
3. Huang R., Sun W., Xu Y., Chen H., Towey D., Xia X. A survey on adaptive random testing. IEEE Transactions on Software Engineering, 2021, vol. 47, no 10, pp. 2052–2083.
4. Chan K. P., Towey D., Chen T. Y., Kuo F.-C., Merkel R. G. Using the information: Incorporating positive feedback information into the testing process. Proceedings of the 7th Annual International Workshop on Software Technology and Engineering Practice (STEP’03), Amsterdam, Netherlands, 19–21 September 2003. Amsterdam, 2003, pp. 71–76.
5. Roslina M. S., Ghani A. A. A., Baharom S., Zulzazil H. A preliminary study of adaptive random testing techniques. International Journal of Information Technology & Computer Science, 2015, vol. 19, no. 1, pp. 116–127.
6. Wu H., Nie C., Petke J., Jia Y., Harman M. An empirical comparison of combinatorial testing, random testing and adaptive random testing. IEEE Transactions on software engineering, 2020, vol. 46, no. 3, pp. 302–320.
7. Yarmolik V. N. Control’ i diagnostika vuchislitel’nuch system. Computer Systems Testing and Diagnoses. Minsk, Bestprint, 2019, 387 p. (In Russ.).
8. Yarmolik V. N., Levantsevich В. А., Mrozek I. Multiple controlled random tests. Informatika [Informatics], 2015, no. 2, pp. 63–76 (In Russ.).
9. Mrozek I., Yarmolik V. N. Multiple controlled random testing. Fundamenta Informaticae, 2016, vol. 144, no. 1, pp. 23–43.
10. Sadovskii M. G. About symbolical sequences comparigion. Vuchislitel’nue technologii [Computational Technologise], 2005, no. 3(10), pp. 106–116 (In Russ.).
11. Yarmolik V. N., Shauchenka M. A., Petrovskaya V. V. Distance measure for controlled random tests. Doklady BGUIR [BSUIR Proceedings], 2022, no. 6(20), pp. 52–60 (In Russ.).
12. Chen J., Kudjo P. K., Zhang Z., Su C., Guo Y., …, Song H. A modified similarity metric for unit testing of object-oriented software based on adaptive random testing. International Journal of Software Engineering and Knowledge Engineering, 2019, vol. 29, no. 4, pp. 577–606.
13. Chen T. Y., Kuo F.-C., Liu H. Adaptive random testing based on distribution metrics. Journal of Systems and Software, 2009, vol. 82, no. 9, pp. 1419–1433.
14. Bard G. V. Spelling-error tolerant, order-independent pass-phrases via the Damerau – Levenshtein stringedit distance metric. Proceedings of the Fifth Australasian Symposium on ACSW Frontiers, Ballarat, Australia, 30 January – 2 February 2007. Ballarat, 2007, pp. 117–124.
15. Tannga M. J., Rahman S., Hasniati. Comparative analysis of Levenshnein distance algorithm and Jaro Winkler for text plagiarism detection application. Journal of Technology Research in Information System and Engineering, 2017, vol. 4, no. 2, pp. 44–54.
16. Gumenyuk A. S., Skiba A. A., Pozdnichenko N. N., Shpynov S. N. About similarity measures of components of naturaly ordered data arrays. Trudy SPIIRAN [SPIIRAS Proceedings], 2019, vol. 18, no. 2, pp. 471–503 (In Russ.).
17. Yarmolik V. N., Yarmolik S. V. Address sequences for multiple march tests. Avtomatika i vuchislitel’naya technika [Automation and Computer Science], 2006, no 5, pp. 59–68 (In Russ.).
18. Savage C. A survey of combinatorial Gray code. SIAM Review, 1997, vol. 3, no. 4, pp. 605–629.
19. Mrozek I., Yarmolik V. N. Transparent memory tests based on the double address sequences. Entropy, 2021, no. 23, pp. 894.
20. Yarmolik V. N., Mrozek I., Levantsevich V. A., Demenkovets D. V. Transparent memory tests with even repeating addresses for storage devices. Informatika [Informatics], 2021, vol. 18, no. 3, pp. 18–35 (In Russ.).
21. Kuhn H. W. Variants of the Hungarian method for assignment problems. Naval Research Logistics Quarterly, 1956, vol. 3, no. 4, pp. 253–258.
22. Munkres J. Algorithms for the assignment and transportation problems. Journal of the Society for Industrial and Applied Mathematics, 1957, vol. 5, no. 1, pp. 32–38.
23. Blum C., Roli A. Metaheuristics in combinatorial optimization: Overview and conceptual comparison. ACM Computing Surveys, 2003, vol. 35, no. 3, pp. 268–308.
Review
For citations:
Yarmolik V.N., Petrovskaya V.V., Mrozek I. A measure of the difference between test sets for generating controlled random tests. Informatics. 2022;19(4):7-26. (In Russ.) https://doi.org/10.37661/1816-0301-2022-19-4-7-26