Fuzzing | Annibale Panichella

Generating Class-Level Integration Tests Using Call Site Information

Abstract: Search-based approaches have been used in the literature to automate the process of creating unit test cases. However, related work has shown that generated tests with high code coverage could be ineffective, i.

Pouria Derakhshanfar, Xavier Devroey, Annibale Panichella, Andy Zaidman, Arie van Deursen

Guiding Automated Test Case Generation for Transaction-Reverting Statements in Smart Contracts

Transaction-reverting statements are key constructs within Solidity that are extensively used for authority and validity checks. Current state-of-the-art search-based testing and fuzzing approaches do not explicitly handle these statements and therefore can not effectively detect security vulnerabilities. In this paper, we argue that it is critical to directly handle and test these statements to assess that they correctly protect the contracts against invalid requests. To this aim, we propose a new approach that improves the search guidance for these transaction-reverting statements based on interprocedural control dependency analysis, in addition to the traditional coverage criteria. We assess the benefits of our approach by performing an empirical study on 100 smart contracts w.r.t. transaction-reverting statement coverage and vulnerability detection capability. Our results show that the proposed approach can improve the performance of DynaMOSA, the state-of-the-art algorithm for test case generation. On average, we improve transaction-reverting statement coverage by 14 % (up to 35 %), line coverage by 8 % (up to 32 %), and vulnerability-detection capability by 17 % (up to 50 %).

Mitchell Olsthoorn, Arie van Deursen, Annibale Panichella

SynTest-Solidity: Automated Test Case Generation and Fuzzing for Smart Contracts

Ethereum is the largest and most prominent smart contract platform. One key property of Ethereum is that once a contract is deployed, it can not be updated anymore. This increases the importance of thoroughly testing the behavior and constraints of the smart contract before deployment. Existing approaches in related work either do not scale or are only focused on finding crashing inputs. In this tool demo, we introduce SynTest-Solidity, an automated test case generation and fuzzing framework for Solidity. SynTest-Solidity implements various metaheuristic search algorithms, including random search (traditional fuzzing) and genetic algorithms (i.e., NSGA-II, MOSA, and DynaMOSA). Finally, we performed a preliminary empirical study to assess the effectiveness of SynTest-Solidity in testing Solidity smart contracts.

Mitchell Olsthoorn, Dimitri Stallenberg, Arie van Deursen, Annibale Panichella

What Are We Really Testing in Mutation Testing for Machine Learning? A Critical Reflection

Mutation testing is a well-established technique for assessing a test suite’s effectiveness by injecting artificial faults into production code. In recent years, mutation testing has been extended to machine learning (ML) systems and deep learning (DL) in particular. Researchers have proposed approaches, tools, and statistically sound heuristics to determine whether mutants in DL systems are killed or not. However, as we will argue in this work, questions can be raised to what extent currently used mutation testing techniques in DL are actually in line with the classical interpretation of mutation testing. As we will discuss, in current approaches, the distinction between production and test code is blurry, the realism of mutation operators can be challenged, and generally, the degree to which the hypotheses underlying classical mutation testing (competent programmer hypothesis and coupling effect hypothesis) are followed lacks focus and explicit mappings. In this paper, we observe that ML model development follows a test-driven development (TDD) process, where data points (test data) with labels (implicit assertions) correspond to test cases in traditional software. Based on this perspective, we critically revisit existing mutation operators for ML, the mutation testing paradigm for ML, and its fundamental hypotheses. Based on our observations, we propose several action points for better alignment of mutation testing techniques for ML with paradigms and vocabularies of classical mutation testing.

Annibale Panichella, Cynthia C. S. Liem

Generating Highly-structured Input Data by Combining Search-based Testing and Grammar-based Fuzzing

Software testing is an important and time-consuming task that is often done manually. In the last decades, researchers have come up with techniques to generate input data (e.g., fuzzing) and automate the process of generating test cases (e.g., search-based testing). However, these techniques are known to have their own limitations: search-based testing does not generate highly-structured data; grammar-based fuzzing does not generate test case structures. To address these limitations, we combine these two techniques. By applying grammar-based mutations to the input data gathered by the search-based testing algorithm, it allows us to co-evolve both aspects of test case generation. We evaluate our approach by performing an empirical study on 20 Java classes from the three most popular JSON parsers across multiple search budgets. Our results show that the proposed approach on average improves branch coverage for JSON related classes by 15% (with a maximum increase of 50%) without negatively impacting other classes.

Mitchell Olsthoorn, Arie van Deursen, Annibale Panichella