Learning a variable-clustering strategy for octagon from labeled data generated by a static analysis

Kihong Heo, Hakjoo Oh, Hongseok Yang

    Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

    30 Citations (Scopus)

    Abstract

    We present a method for automatically learning an effective strategy for clustering variables for the Octagon analysis from a given codebase. This learned strategy works as a preprocessor of Octagon. Given a program to be analyzed, the strategy is first applied to the program and clusters the variables in it. We then run a partial variant of the Octagon analysis that tracks relationships among variables within the same cluster, but not across different clusters. A notable aspect of our learning method is that, although it is based on supervised learning, it does not require manually labeled data. The method does not ask a human to indicate which pairs of program variables in the given codebase should be tracked. Instead, it uses the impact pre-analysis for Octagon from our previous work and automatically labels variable pairs in the codebase as positive or negative. We implemented our method on top of a static buffer-overflow detector for C programs and tested it against open-source benchmarks. Our experiments show that the partial Octagon analysis with the learned strategy scales up to 100KLOC and is 33x faster than the one with the impact pre-analysis (which itself is significantly faster than the original Octagon analysis), while increasing false alarms by only 2%.
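
The clustering step described in the abstract can be illustrated with a minimal sketch. It assumes that some classifier has already produced a set of variable pairs predicted to be worth tracking, and it forms clusters as the connected components of those pairs. That construction, the function names `cluster_variables` and `relations_tracked`, and the variable names are illustrative assumptions, not the authors' implementation. The sketch only shows which variable pairs a partial Octagon analysis would relate once clusters are fixed.

```python
from itertools import combinations


def cluster_variables(variables, tracked_pairs):
    """Group variables into clusters (hypothetical helper).

    Assumption: clusters are the connected components of the graph whose
    edges are the pairs predicted as worth tracking.
    """
    parent = {v: v for v in variables}

    def find(v):
        # Union-find lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for a, b in tracked_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    groups = {}
    for v in variables:
        groups.setdefault(find(v), set()).add(v)
    # Sort for a deterministic result.
    return sorted(sorted(group) for group in groups.values())


def relations_tracked(clusters):
    """Pairs related by a partial analysis: only pairs inside one cluster."""
    return {pair for cluster in clusters for pair in combinations(cluster, 2)}


# Hypothetical classifier output for a small program.
variables = ["i", "n", "len", "buf_size", "tmp"]
predicted_pairs = [("i", "n"), ("len", "buf_size")]

clusters = cluster_variables(variables, predicted_pairs)
tracked = relations_tracked(clusters)
all_pairs = set(combinations(sorted(variables), 2))

print(clusters)
print(len(tracked), "of", len(all_pairs), "variable pairs tracked")
```

In this toy input, `tmp` ends up in a singleton cluster, so no relation involving it is tracked, while `i`/`n` and `len`/`buf_size` are each tracked together. Relating every pair of variables would cover all ten pairs; the clustered variant covers only two.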

    Original language: English
    Title of host publication: Static Analysis - 23rd International Symposium, SAS 2016, Proceedings
    Editors: Xavier Rival
    Publisher: Springer Verlag
    Pages: 237-256
    Number of pages: 20
    ISBN (Print): 9783662534120
    DOIs
    Publication status: Published - 2016
    Event: 23rd International Symposium on Static Analysis, SAS 2016 - Edinburgh, United Kingdom
    Duration: 2016 Sept 8 to 2016 Sept 10

    Publication series

    Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
    Volume: 9837 LNCS
    ISSN (Print): 0302-9743
    ISSN (Electronic): 1611-3349

    Conference

    Conference: 23rd International Symposium on Static Analysis, SAS 2016
    Country/Territory: United Kingdom
    City: Edinburgh
    Period: 16/9/8 to 16/9/10

    Bibliographical note

    Funding Information:
    We thank the anonymous reviewers for their helpful comments. We also thank Kwangkeun Yi, Chung-Kil Hur, and all members of the SoFA group at Seoul National University for their helpful comments and suggestions. This work was supported by the Samsung Research Funding Center of Samsung Electronics under Project Number SRFC-IT1502-07 and by an Institute for Information & communications Technology Promotion (IITP) grant funded by the Korea government (MSIP) (No. R0190-16-2011, Development of Vulnerability Discovery Technologies for IoT Software Security). This research was also supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT & Future Planning (NRF-2016R1C1B2014062).

    Publisher Copyright:
    © Springer-Verlag GmbH Germany 2016.

    ASJC Scopus subject areas

    • Theoretical Computer Science
    • General Computer Science
