Abstract:
Fuzzing is an automated testing technique for finding security vulnerabilities in software. It can provide more thorough security testing than manual code review in classic white-box testing or manual attempts to trigger errors in black-box testing. Current methods of generating test data start from an initial set of test data, the so-called corpus. The elements of this set are randomly modified before being fed to the program under test, in order to check whether the application handles them correctly. The number of elements in the corpus and their quality have a significant impact on the effectiveness of the tests performed. To optimize the corpus, a reduction process called distillation is used. At present, popular distillation criteria are test execution time, file size, and the number of edges in the control flow graph that are exercised by the test input (code coverage). The paper proposes an additional criterion: simplified entropy. It is used to determine the probability of drawing two different ASCII characters (bytes) from a file. The main conclusion of the paper is that the proposed index may complement the currently used criteria in multi-criteria distillation.
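
As an illustration of the proposed criterion, the following minimal Python sketch computes the probability that two bytes drawn from a file differ. The abstract does not give the exact formula, so the sketch assumes independent draws with replacement, which gives 1 minus the sum of squared byte frequencies. The function name `simplified_entropy` is hypothetical and is not taken from the paper.

```python
from collections import Counter


def simplified_entropy(data: bytes) -> float:
    """Probability that two bytes drawn from `data` differ.

    Assumption (not stated in the abstract): the two bytes are drawn
    independently and with replacement, so the result is
    1 - sum(p_b ** 2) over the relative frequency p_b of each byte value b.
    """
    n = len(data)
    if n == 0:
        # An empty file has no bytes to draw; treat its diversity as zero.
        return 0.0
    counts = Counter(data)  # iterating over bytes yields integers 0..255
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


if __name__ == "__main__":
    for sample in (b"aaaa", b"abab", bytes(range(256))):
        print(len(sample), simplified_entropy(sample))
```

A file consisting of a single repeated byte scores 0, while a file in which every byte value is equally frequent approaches 1. In a multi-criteria distillation this value could be considered alongside execution time, file size, and code coverage; how the criteria are weighted is not specified here.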