Abstract
Automatic text categorization is the problem of assigning predefined categories to free-text documents based on the likelihood suggested by a training set of labelled texts. The kNN-based text classifier is a well-known statistical approach with a simple algorithm. While the method has been applied in many systems and has shown relatively good performance, a thorough evaluation of it has rarely been done. Three parameters play important roles in the performance of the method: the decision function, the value of k, and the size of the feature set. This paper focuses on a method for improving a kNN-based Korean text classifier by using heuristic information found experimentally. Our results show that choosing these parameters carefully significantly improves the performance of the kNN method and reduces the size of the feature set.
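To make the three parameters named in the abstract concrete, the following is a minimal sketch of a kNN text classifier in which each one is an explicit argument. It is not the paper's implementation: the abstract does not say which decision functions, similarity measure, or feature selection criterion were evaluated, so the cosine similarity, the two decision rules (`"vote"` and `"weighted"`), the document-frequency ranking, and all function names here are assumptions chosen for illustration.

```python
import math
from collections import Counter, defaultdict


def select_features(docs, size):
    """Keep the `size` terms with the highest document frequency.

    Document frequency is an assumed selection criterion, not the paper's.
    """
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))
    # Sort by descending frequency, then alphabetically so ties are deterministic.
    ranked = sorted(df, key=lambda t: (-df[t], t))
    return set(ranked[:size])


def vectorize(tokens, features):
    """Term-frequency vector restricted to the selected feature set."""
    return Counter(t for t in tokens if t in features)


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_classify(query, train, k, features, decision="weighted"):
    """Label `query` from its k most similar training documents.

    decision="vote":     each neighbour counts once (majority vote).
    decision="weighted": each neighbour counts by its similarity.
    Both rules are illustrative assumptions.
    """
    q = vectorize(query, features)
    scored = sorted(
        ((cosine(q, vectorize(tokens, features)), label) for tokens, label in train),
        key=lambda pair: -pair[0],
    )
    scores = defaultdict(float)
    for sim, label in scored[:k]:
        scores[label] += sim if decision == "weighted" else 1.0
    # Iterate labels alphabetically so score ties resolve deterministically.
    return max(sorted(scores), key=scores.get)


# Toy, already-tokenized data (hypothetical; the paper used Korean documents).
train = [
    (["stock", "market", "price"], "economy"),
    (["market", "trade", "price"], "economy"),
    (["match", "goal", "team"], "sports"),
    (["team", "player", "goal"], "sports"),
]
query = ["price", "market", "goal"]
features = select_features([tokens for tokens, _ in train], size=10)
label = knn_classify(query, train, k=3, features=features, decision="weighted")
```

Varying `k`, `decision`, and `size` on a held-out set is one simple way to run the kind of parameter study the abstract describes. A real Korean pipeline would also need a morphological analysis or tokenization step, which this sketch leaves out.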
Original language | English |
---|---|
Title of host publication | ICONIP 2002 - Proceedings of the 9th International Conference on Neural Information Processing: Computational Intelligence for the E-Age |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 731-735 |
Number of pages | 5 |
Volume | 2 |
ISBN (Print) | 9810475241, 9789810475246 |
Publication status | Published - 2002 |
Externally published | Yes |
Event | 9th International Conference on Neural Information Processing, ICONIP 2002 - Singapore, Singapore; Duration: 2002 Nov 18 → 2002 Nov 22 |
Other
Other | 9th International Conference on Neural Information Processing, ICONIP 2002 |
---|---|
Country/Territory | Singapore |
City | Singapore |
Period | 02/11/18 → 02/11/22 |
ASJC Scopus subject areas
- Computer Networks and Communications
- Information Systems
- Signal Processing