The multinomial naive Bayes model has been widely used for probabilistic text classification. However, the parameter estimation for this model sometimes generates inappropriate probabilities. In this paper, we propose a topic document model for the multinomial naive Bayes text classification, where the parameters are estimated from normalized term frequencies of each training document. Experiments are conducted on Reuters 21578 and 20 Newsgroup collections, and our proposed approach obtained a significant improvement in performance compared to the traditional multinomial naive Bayes.
- Naive Bayes
- Text classification
ASJC Scopus subject areas
- Hardware and Architecture
- Computer Vision and Pattern Recognition
- Electrical and Electronic Engineering
- Artificial Intelligence