A practical algorithm for learning disjunctive abstraction heuristics in static program analysis

Donghoon Jeon, Minseok Jeon, Hakjoo Oh

Research output: Contribution to journalArticlepeer-review

Abstract

Context: The precision and cost of static analysis are determined by abstraction heuristics (e.g., strategies for abstracting calling contexts, heap locations, etc.), but manually designing effective abstraction heuristics requires a huge amount of engineering effort and domain knowledge. Recently, data-driven static analysis has emerged to address this challenge by learning such heuristics automatically from a set of training programs. Objective: We present a practical algorithm for learning disjunctive abstraction heuristics in data-driven static analysis. We build on a recently proposed approach that can learn nontrivial program properties by disjunctive boolean functions. However, the existing approach is practically limited as it assumes that the most precise abstraction is cheap for the training programs; the algorithm is inapplicable if the most precise abstraction is not scalable. The objective of this paper is to mitigate this limitation. Method: Our algorithm overcomes the limitation with two new ideas. It systematically decomposes the learning problem into feasible subproblems, and it can search through the abstraction space from the coarse- to fine-grained abstractions. With this approach, our algorithm is able to learn heuristics when static analysis with the most precise abstraction is not scalable over the training programs. Results: We show our approach is effective and generally applicable. We applied our approach to a context-sensitive points-to analysis for Java and a flow-sensitive interval analysis for C. Experimental results show that our algorithm is efficient. For example, our algorithm can learn heuristics for 3-object-sensitive analysis for which the existing learning algorithm is too expensive to learn any useful heuristics. Conclusion: Our algorithm makes a state-of-the-art technique for data-driven static analysis more practical.

Original languageEnglish
Article number106564
JournalInformation and Software Technology
Volume135
DOIs
Publication statusPublished - 2021 Jul

Bibliographical note

Funding Information:
This work was supported by Samsung Research Funding & Incubation Center of Samsung Electronics under Project Number SRFC-IT1701-51. This work was partly supported by Institute of Information & communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 2020-0-01337, (SW STAR LAB) Research on Highly-Practical Automated Software Repair) and was the MSIT (Ministry of Science and ICT), Korea, under the ICT Creative Consilience program (IITP-2021-0-01819) supervised by the IITP (Institute for Information & communications Technology Planning & Evaluation).

Publisher Copyright:
© 2021 Elsevier B.V.

Keywords

  • Data-driven static analysis
  • Learning algorithm

ASJC Scopus subject areas

  • Software
  • Information Systems
  • Computer Science Applications

Fingerprint

Dive into the research topics of 'A practical algorithm for learning disjunctive abstraction heuristics in static program analysis'. Together they form a unique fingerprint.

Cite this