Classification is an important task in various areas. In many real-world applications, class imbalance and class overlap have been reported as major obstacles to the application of traditional classification algorithms. An imbalance problem occurs when the training data contain considerably more representatives of one class than of the other classes. Class overlap occurs when a region of the data space contains a similar number of data points from each class. When class overlap occurs in imbalanced data sets, classification becomes even more complicated. Although various approaches have been proposed to deal separately with class imbalance and class overlap, only a few studies have attempted to address both problems simultaneously. In this paper, we propose an overlap-sensitive margin (OSM) classifier based on a modified fuzzy support vector machine and the k-nearest neighbor algorithm to address imbalanced and overlapping data sets. The main idea of the proposed OSM classifier is to separate the data space into soft- and hard-overlap regions using the modified fuzzy support vector machine algorithm. The separated regions are then classified using the decision boundaries of the support vector machine and 1-nearest neighbor algorithms. Furthermore, by separating a data set into soft- and hard-overlap regions, one can determine which part of the data should be examined more closely for classification in real-world situations. Experiments using synthetic and real-world data sets demonstrated that the proposed OSM classifier outperformed existing methods on imbalanced and overlapping data.
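The region-splitting idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the fuzzy support vector machine modification, so a simple centroid-based linear scorer stands in for it here. The assumption that points scoring close to the boundary form the hard-overlap region and are passed to 1-nearest neighbor, as well as the names `fit_linear_scorer`, `predict_osm_like`, and the `threshold` parameter, are hypothetical and chosen only for illustration.

```python
import math

def fit_linear_scorer(X, y):
    """Stand-in for the SVM (assumption): a unit-norm linear scorer through
    the midpoint of the two class centroids. NOT a fuzzy SVM."""
    pos = [x for x, t in zip(X, y) if t == 1]
    neg = [x for x, t in zip(X, y) if t == -1]
    dim = len(X[0])
    mu_pos = [sum(p[d] for p in pos) / len(pos) for d in range(dim)]
    mu_neg = [sum(n[d] for n in neg) / len(neg) for d in range(dim)]
    w = [a - b for a, b in zip(mu_pos, mu_neg)]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]
    mid = [(a + b) / 2 for a, b in zip(mu_pos, mu_neg)]
    b = -sum(wi * mi for wi, mi in zip(w, mid))
    return w, b

def score(model, x):
    """Signed distance of x from the linear boundary."""
    w, b = model
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def one_nn(X, y, x):
    """Label of the single nearest training point (Euclidean distance)."""
    nearest = min(range(len(X)), key=lambda i: math.dist(X[i], x))
    return y[nearest]

def predict_osm_like(model, X, y, x, threshold=1.0):
    """Route x by its distance to the boundary (hypothetical rule):
    near the boundary -> 'hard' overlap, decided by 1-NN;
    otherwise -> 'soft' overlap, decided by the sign of the score."""
    s = score(model, x)
    if abs(s) < threshold:
        return one_nn(X, y, x), "hard"
    return (1 if s > 0 else -1), "soft"

# Toy imbalanced data: six majority points (-1), three minority points (+1),
# with one majority point sitting inside the overlap area.
X_train = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (2.0, 2.0),
           (4, 4), (3, 3), (2.2, 2.4)]
y_train = [-1, -1, -1, -1, -1, -1, 1, 1, 1]

model = fit_linear_scorer(X_train, y_train)
far_majority = predict_osm_like(model, X_train, y_train, (0.2, 0.3))
far_minority = predict_osm_like(model, X_train, y_train, (3.8, 3.9))
in_overlap = predict_osm_like(model, X_train, y_train, (2.1, 2.1))
```

In this toy setup the point `(2.1, 2.1)` falls on the positive side of the linear boundary but is routed to the hard-overlap branch, where its nearest neighbor is a majority-class point. The region tag returned alongside the label reflects the abstract's remark that the split indicates which part of the data deserves closer examination.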
Bibliographical note
Funding Information:
The authors would like to thank the editor and reviewers for their useful comments and suggestions, which were of great help in improving the quality of the paper. This work was supported by Brain Korea PLUS; by the Basic Science Research Program through the National Research Foundation of Korea funded by the Ministry of Science, ICT and Future Planning (NRF-2016R1A2B1008994); and by the Ministry of Trade, Industry & Energy under the Industrial Technology Innovation Program (R1623371).
© 2018 Elsevier Ltd
Keywords
- Imbalanced class
- Overlapping class
- Support vector machine
ASJC Scopus subject areas
- Computer Science Applications
- Artificial Intelligence