TY - GEN
T1 - Dynamic classification of packing algorithms for inspecting executables using entropy analysis
AU - Bat-Erdene, Munkhbayar
AU - Kim, Taebeom
AU - Li, Hongzhe
AU - Lee, Heejo
PY - 2013
Y1 - 2013
N2 - Packing is widely used to bypass anti-malware systems, and the proportion of packed malware has been growing rapidly, making up over 80% of malware. Few studies on detecting packing algorithms have been conducted during the last two decades. In this paper, we propose a method to classify the packing algorithms of given packed executables. First, we convert the entropy values of packed executables loaded in memory into symbolic representations. Our proposed method uses SAX (Symbolic Aggregate Approximation), which is known to be good at converting large data. Because it simplifies complicated patterns, symbolic representation is commonly used in the bioinformatics and data mining fields. Second, we classify the distribution of symbols using supervised learning classifiers, namely Naive Bayes and Support Vector Machines. Our experiments with a collection of 466 programs and 15 packing algorithms demonstrated that our method can identify the packing algorithms of given executables with a high accuracy of 94.2%, recall of 94.7% and precision of 92.7%. These results confirm that packing algorithms can be identified using entropy analysis, which measures the uncertainty of running executables, without prior knowledge of the executable.
AB - Packing is widely used to bypass anti-malware systems, and the proportion of packed malware has been growing rapidly, making up over 80% of malware. Few studies on detecting packing algorithms have been conducted during the last two decades. In this paper, we propose a method to classify the packing algorithms of given packed executables. First, we convert the entropy values of packed executables loaded in memory into symbolic representations. Our proposed method uses SAX (Symbolic Aggregate Approximation), which is known to be good at converting large data. Because it simplifies complicated patterns, symbolic representation is commonly used in the bioinformatics and data mining fields. Second, we classify the distribution of symbols using supervised learning classifiers, namely Naive Bayes and Support Vector Machines. Our experiments with a collection of 466 programs and 15 packing algorithms demonstrated that our method can identify the packing algorithms of given executables with a high accuracy of 94.2%, recall of 94.7% and precision of 92.7%. These results confirm that packing algorithms can be identified using entropy analysis, which measures the uncertainty of running executables, without prior knowledge of the executable.
KW - Entropy Analysis
KW - Original Entry Point (OEP)
KW - Packing Algorithms
KW - Piecewise Aggregate Approximation (PAA)
KW - Symbolic Aggregate Approximation (SAX)
UR - http://www.scopus.com/inward/record.url?scp=84893810422&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84893810422&partnerID=8YFLogxK
U2 - 10.1109/MALWARE.2013.6703681
DO - 10.1109/MALWARE.2013.6703681
M3 - Conference contribution
AN - SCOPUS:84893810422
SN - 9781479925339
T3 - Proceedings of the 2013 8th International Conference on Malicious and Unwanted Software: "The Americas", MALWARE 2013
SP - 19
EP - 26
BT - Proceedings of the 2013 8th International Conference on Malicious and Unwanted Software
PB - IEEE Computer Society
T2 - 2013 8th International Conference on Malicious and Unwanted Software: "The Americas", MALWARE 2013
Y2 - 22 October 2013 through 24 October 2013
ER -