Abstract:
This study implements a series of tournament structure ranking techniques to improve the classification accuracy of conventional Bayesian classification, especially in classification tasks with highly similar categories. The Bayesian classification approach has been widely implemented in real-world text categorization applications because of its simplicity, its low-cost training and classification algorithms, and its ability to handle raw text data directly without extensive pre-processing. However, Bayesian classification has been reported to be one of the poorer-performing classification approaches, and its poor performance is especially critical in text classification tasks with multiple highly similar categories. In this study, we introduce a series of tournament-structure-based ranking classification techniques to overcome the low accuracy of conventional Bayesian classification, which uses a flat ranking technique. Experiments conducted in this research show that the proposed Bayesian classifier, embedded with tournament structure ranking techniques, delivers promising performance when dealing with knowledge domains with highly similar categories. This is because the enhanced Bayesian classifier performs its classification through multiple, iterative and isolated binary classifications, and thus guarantees a low-error-rate Bayesian classification. As a result, an enhanced Bayesian classifier that is applicable to domains of varying characteristics is introduced to handle real-world text classification problems effectively and efficiently.
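The contrast between flat ranking and tournament structure ranking can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the bracket layout, the smoothing, or how each binary classification is isolated. The sketch assumes a multinomial naive Bayes model with Laplace smoothing, a single-elimination bracket over alphabetically sorted categories, and binary matches that retrain on only the two competing categories. All function names (`train`, `log_score`, `flat_rank`, `binary_match`, `tournament_rank`) and the toy data are hypothetical.

```python
import math
from collections import Counter


def train(docs, labels):
    """Fit a multinomial naive Bayes model from token lists and their labels."""
    word_counts = {c: Counter() for c in set(labels)}
    doc_counts = Counter(labels)
    for tokens, label in zip(docs, labels):
        word_counts[label].update(tokens)
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, doc_counts, vocab


def log_score(model, tokens, category):
    """Log prior plus Laplace-smoothed log likelihood for one category."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    total_words = sum(word_counts[category].values())
    score = math.log(doc_counts[category] / total_docs)
    for w in tokens:
        if w in vocab:  # words unseen by this model are ignored
            score += math.log(
                (word_counts[category][w] + 1) / (total_words + len(vocab))
            )
    return score


def flat_rank(docs, labels, tokens):
    """Flat ranking: score every category at once and take the best."""
    model = train(docs, labels)
    return max(sorted(set(labels)), key=lambda c: log_score(model, tokens, c))


def binary_match(docs, labels, tokens, a, b):
    """Assumed form of an isolated match: retrain on categories a and b only."""
    pair = [(d, l) for d, l in zip(docs, labels) if l in (a, b)]
    model = train([d for d, _ in pair], [l for _, l in pair])
    # Ties go to the first contender (an arbitrary choice for this sketch).
    return a if log_score(model, tokens, a) >= log_score(model, tokens, b) else b


def tournament_rank(docs, labels, tokens):
    """Single-elimination bracket: match winners advance until one remains."""
    contenders = sorted(set(labels))
    while len(contenders) > 1:
        next_round = []
        for i in range(0, len(contenders) - 1, 2):
            next_round.append(
                binary_match(docs, labels, tokens, contenders[i], contenders[i + 1])
            )
        if len(contenders) % 2 == 1:
            next_round.append(contenders[-1])  # odd one out gets a bye
        contenders = next_round
    return contenders[0]


if __name__ == "__main__":
    # Toy data: "football" and "soccer" stand in for highly similar categories.
    docs = [
        ["goal", "match", "team"],
        ["goal", "league", "team"],
        ["stock", "market", "trade"],
        ["election", "vote", "party"],
    ]
    labels = ["football", "soccer", "finance", "politics"]
    print(flat_rank(docs, labels, ["stock", "market"]))
    print(tournament_rank(docs, labels, ["league", "goal"]))
```

Because each match in this sketch retrains on two categories only, the vocabulary, priors and smoothing terms are specific to that pair, which is one plausible reading of "isolated binary classifications". If every match instead reused the scores of a single model trained on all categories, the bracket would always return the same category as flat ranking.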
L.H. Lee, D. Isa, W.O. Choo and W.Y. Chue, 2010. Tournament Structure Ranking Techniques for Bayesian Text Classification with Highly Similar Categories. Journal of Applied Sciences, 10: 1243-1254.