Abstract
Classification is the process of identifying the class to which input data belong. One of the most popular ways to do this is to construct a classification model by training a machine learning algorithm on a given dataset. For good classification performance, the dataset should have a balanced distribution of data across classes. If the dataset is imbalanced, that is, one class (the minority class) has far fewer samples than the other (the majority class), the model has little chance to learn about the minority class, and training is biased toward the majority class. As a result, the model tends to assign any input to the majority class and does not handle minority-class data properly. To overcome this data imbalance problem, we propose a novel over-sampling scheme based on Borderline-Conditional Generative Adversarial Networks (BCGAN). Our BCGAN generates data for the minority class, particularly along the borderline between the majority class and the minority class. Through various experiments on real imbalanced datasets, we demonstrate the performance of our scheme.
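The abstract does not say how BCGAN decides which minority samples lie "along the borderline". As a rough illustration only, the sketch below uses the neighbor-counting rule from Borderline-SMOTE as a stand-in: a minority sample counts as borderline when at least half, but not all, of its k nearest neighbors belong to the majority class. The function name `borderline_minority_indices`, its parameters, and the toy data are hypothetical and are not taken from the paper.

```python
import numpy as np

def borderline_minority_indices(X, y, minority_label=1, k=5):
    """Return indices of minority samples flagged as borderline.

    Assumption: uses the Borderline-SMOTE rule, not the BCGAN procedure.
    A minority sample is borderline when k/2 <= (majority neighbors) < k.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    minority_idx = np.flatnonzero(y == minority_label)

    # Euclidean distances from each minority sample to every sample.
    diffs = X[minority_idx][:, None, :] - X[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=2))

    # A sample must not be counted as its own neighbor.
    dists[np.arange(len(minority_idx)), minority_idx] = np.inf

    # Labels of the k nearest neighbors of each minority sample.
    neighbors = np.argsort(dists, axis=1)[:, :k]
    majority_count = (y[neighbors] != minority_label).sum(axis=1)

    # All-majority neighborhoods are treated as noise, not borderline.
    borderline = (majority_count >= k / 2) & (majority_count < k)
    return minority_idx[borderline]

# Toy 1-D data: 12 majority points at -6..5, 6 minority points.
majority = np.arange(-6, 6, dtype=float)
minority = np.array([1.5, 5.6, 5.8, 9.0, 9.5, 10.0])
X_demo = np.concatenate([majority, minority]).reshape(-1, 1)
y_demo = np.array([0] * len(majority) + [1] * len(minority))

# 5.6 and 5.8 sit next to the majority region; 1.5 is surrounded by it.
demo_result = borderline_minority_indices(X_demo, y_demo, minority_label=1, k=3)
print(X_demo[demo_result].ravel())
```

In this toy set, the points at 5.6 and 5.8 are flagged, the isolated point at 1.5 is treated as noise, and the cluster near 9 to 10 is considered safe. An over-sampler that focuses on the borderline would concentrate new minority samples around the flagged points.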
Original language | English |
---|---|
Title of host publication | Proceedings - 2020 IEEE International Conference on Big Data and Smart Computing, BigComp 2020 |
Editors | Wookey Lee, Luonan Chen, Yang-Sae Moon, Julien Bourgeois, Mehdi Bennis, Yu-Feng Li, Young-Guk Ha, Hyuk-Yoon Kwon, Alfredo Cuzzocrea |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 155-160 |
Number of pages | 6 |
ISBN (Electronic) | 9781728160344 |
Publication status | Published - 2020 Feb 1 |
Event | 2020 IEEE International Conference on Big Data and Smart Computing, BigComp 2020 - Busan, Korea, Republic of. Duration: 2020 Feb 19 → 2020 Feb 22 |
Publication series
Name | Proceedings - 2020 IEEE International Conference on Big Data and Smart Computing, BigComp 2020 |
---|---|
Conference
Conference | 2020 IEEE International Conference on Big Data and Smart Computing, BigComp 2020 |
---|---|
Country/Territory | Korea, Republic of |
City | Busan |
Period | 2020 Feb 19 → 2020 Feb 22 |
Bibliographical note
Funding Information: This research was supported in part by the Energy Cloud R&D Program (grant number: 2019M3F2A1073184) through the National Research Foundation of Korea (NRF) funded by the Ministry of Science and ICT, and in part by the Korea Electric Power Corporation (grant number: R18XA05).
Keywords
- BCGAN
- CGAN
- Imbalanced data
- Over-sampling
ASJC Scopus subject areas
- Artificial Intelligence
- Information Systems and Management
- Control and Optimization