TY - JOUR
T1 - Maximizing subset accuracy with recurrent neural networks in multi-label classification
AU - Nam, Jinseok
AU - Mencía, Eneldo Loza
AU - Kim, Hyunwoo J.
AU - Fürnkranz, Johannes
N1 - Funding Information:
The authors would like to thank anonymous reviewers for their thorough feedback. Computations for this research were conducted on the Lichtenberg high performance computer of the Technische Universität Darmstadt. The Titan X used for this research was donated by the NVIDIA Corporation. This work has been supported by the German Institute for Educational Research (DIPF) under the Knowledge Discovery in Scientific Literature (KDSL) program, and the German Research Foundation as part of the Research Training Group Adaptive Preparation of Information from Heterogeneous Sources (AIPHES) under grant No. GRK 1994/1.
Publisher Copyright:
© 2017 Neural information processing systems foundation. All rights reserved.
PY - 2017
Y1 - 2017
N2 - Multi-label classification is the task of predicting a set of labels for a given input instance. Classifier chains are a state-of-the-art method for tackling such problems, which essentially converts this problem into a sequential prediction problem, where the labels are first ordered in an arbitrary fashion, and the task is to predict a sequence of binary values for these labels. In this paper, we replace classifier chains with recurrent neural networks, a sequence-to-sequence prediction algorithm which has recently been successfully applied to sequential prediction tasks in many domains. The key advantage of this approach is that it allows to focus on the prediction of the positive labels only, a much smaller set than the full set of possible labels. Moreover, parameter sharing across all classifiers allows to better exploit information of previous decisions. As both, classifier chains and recurrent neural networks depend on a fixed ordering of the labels, which is typically not part of a multi-label problem specification, we also compare different ways of ordering the label set, and give some recommendations on suitable ordering strategies.
AB - Multi-label classification is the task of predicting a set of labels for a given input instance. Classifier chains are a state-of-the-art method for tackling such problems, which essentially converts this problem into a sequential prediction problem, where the labels are first ordered in an arbitrary fashion, and the task is to predict a sequence of binary values for these labels. In this paper, we replace classifier chains with recurrent neural networks, a sequence-to-sequence prediction algorithm which has recently been successfully applied to sequential prediction tasks in many domains. The key advantage of this approach is that it allows to focus on the prediction of the positive labels only, a much smaller set than the full set of possible labels. Moreover, parameter sharing across all classifiers allows to better exploit information of previous decisions. As both, classifier chains and recurrent neural networks depend on a fixed ordering of the labels, which is typically not part of a multi-label problem specification, we also compare different ways of ordering the label set, and give some recommendations on suitable ordering strategies.
UR - http://www.scopus.com/inward/record.url?scp=85047012313&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:85047012313
SN - 1049-5258
VL - 2017-December
SP - 5414
EP - 5424
JO - Advances in Neural Information Processing Systems
JF - Advances in Neural Information Processing Systems
T2 - 31st Annual Conference on Neural Information Processing Systems, NIPS 2017
Y2 - 4 December 2017 through 9 December 2017
ER -