Abstract
Security patches play an important role in detecting and fixing one-day vulnerabilities. However, collecting abundant security patches from diverse data sources is not a simple task. This is because (1) each data source provides vulnerability information in a different way and (2) many security patches cannot be directly collected from Common Vulnerabilities and Exposures (CVE) information (e.g., National Vulnerability Database (NVD) references). In this paper, we propose a high-coverage approach that collects known security patches by tracking multiple data sources. Specifically, we consider the following three data sources: repositories (e.g., GitHub), issue trackers (e.g., Bugzilla), and Q&A sites (e.g., Stack Overflow). From these data sources, we gather even security patches that cannot be collected by considering only CVE information (i.e., previously untracked security patches). In our experiments, we collected 12,432 CVE patches from repositories and issue trackers, and 12,458 insecure posts from Q&A sites. We collected at least four times more CVE patches than existing approaches, which demonstrates the efficacy of our approach. The collected security patches serve as a database on a public website (i.e., IoTcube) for the detection of vulnerable code clones.
Original language | English |
---|---|
Pages (from-to) | 85050-85063 |
Number of pages | 14 |
Journal | IEEE Access |
Volume | 10 |
Publication status | Published - 2022 |
Bibliographical note
Publisher Copyright: © 2013 IEEE.
Keywords
- Open source software
- software security
- vulnerability database
ASJC Scopus subject areas
- General Engineering
- General Materials Science
- General Computer Science