TY - GEN
T1 - Data mining approaches for packaging yield prediction in the post-fabrication process
AU - Park, Seung Hwan
AU - Park, Cheong Sool
AU - Kim, Jun Seok
AU - Kim, Sung Shick
AU - Baek, Jun Geol
AU - An, Daewoong
PY - 2013
Y1 - 2013
N2 - In the post-fabrication process for semiconductors, predicting the yield is critical. This process consists of a series of electrical and physical tests that follow semiconductor fabrication and generate a significant volume of parametric data. Although past research has investigated yield prediction using parametric test data, most studies have difficulty correctly predicting low and high yield because of the wide range of variables and the large data set. Moreover, packaging yield prediction is inaccurate because this yield does not correlate directly with the parametric test data. This study therefore proposes a framework in which the packaging yield is classified using the parametric test data from the step preceding the packaging test. The framework has three stages. In the first stage, the data are preprocessed because the data set is large: so that the data mining model can be trained on more data, parametric test data generated at the die level are converted to the wafer level. In the second stage, a random forest algorithm selects the significant variables affecting the packaging yield. In the third stage, a nonlinear support vector machine (SVM) classifies the yield as low or high. The study demonstrates that the proposed three-stage approach achieves superior performance.
AB - In the post-fabrication process for semiconductors, predicting the yield is critical. This process consists of a series of electrical and physical tests that follow semiconductor fabrication and generate a significant volume of parametric data. Although past research has investigated yield prediction using parametric test data, most studies have difficulty correctly predicting low and high yield because of the wide range of variables and the large data set. Moreover, packaging yield prediction is inaccurate because this yield does not correlate directly with the parametric test data. This study therefore proposes a framework in which the packaging yield is classified using the parametric test data from the step preceding the packaging test. The framework has three stages. In the first stage, the data are preprocessed because the data set is large: so that the data mining model can be trained on more data, parametric test data generated at the die level are converted to the wafer level. In the second stage, a random forest algorithm selects the significant variables affecting the packaging yield. In the third stage, a nonlinear support vector machine (SVM) classifies the yield as low or high. The study demonstrates that the proposed three-stage approach achieves superior performance.
KW - Ensemble Support Vector Machine
KW - Packaging Yield Classification
KW - Random Forests
KW - Semiconductor Manufacturing Process
UR - http://www.scopus.com/inward/record.url?scp=84885996723&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84885996723&partnerID=8YFLogxK
U2 - 10.1109/BigData.Congress.2013.55
DO - 10.1109/BigData.Congress.2013.55
M3 - Conference contribution
AN - SCOPUS:84885996723
SN - 9780769550060
T3 - Proceedings - 2013 IEEE International Congress on Big Data, BigData 2013
SP - 363
EP - 368
BT - Proceedings - 2013 IEEE International Congress on Big Data, BigData 2013
T2 - 2013 IEEE International Congress on Big Data, BigData 2013
Y2 - 27 June 2013 through 2 July 2013
ER -