Research Article

Blocking Distribution Based Hierarchical Reconstruction for Text Categorization

Wen Li, Weili Wang and Ling Chai

As one of the important techniques in large-scale data organization, text categorization has been widely investigated. However, existing hierarchical classification methods often suffer from inter-level error transmission, known as blocking. This paper proposes a blocking-distribution-based topology reconstruction method for the hierarchical text categorization problem. First, a blocking distribution recognition technique is used to identify the high-level classes that suffer from serious misclassification. Subsequently, the original hierarchical structure is reconstructed using the blocking direction information obtained in the first step, which adds paths by which a blocked instance can reach its correct subclass. Experimental studies on the Chinese text classification benchmark TanCorp demonstrate that the proposed algorithm performs better than traditional hierarchical and state-of-the-art flat classification strategies.
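
The two steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the two-level toy hierarchy, the validation data, and the `min_rate` threshold are all assumptions made for illustration. The sketch counts how often instances of each leaf class are routed to the wrong top-level class (the blocking distribution), then adds the leaf as an extra child of any wrong parent that attracts a large share of its instances.

```python
from collections import Counter

def blocking_distribution(true_leaves, predicted_parents, parent_of):
    """Count blocked instances per (true leaf, wrongly predicted parent) pair."""
    blocked = Counter()
    for leaf, predicted in zip(true_leaves, predicted_parents):
        if predicted != parent_of[leaf]:
            blocked[(leaf, predicted)] += 1
    return blocked

def reconstruct(hierarchy, true_leaves, blocked, min_rate=0.5):
    """Add a leaf under a wrong parent when its blocking rate reaches min_rate.

    min_rate is a hypothetical threshold, not a value taken from the paper.
    """
    totals = Counter(true_leaves)
    new_hierarchy = {p: list(children) for p, children in hierarchy.items()}
    for (leaf, wrong_parent), count in blocked.items():
        rate = count / totals[leaf]
        if rate >= min_rate and leaf not in new_hierarchy[wrong_parent]:
            new_hierarchy[wrong_parent].append(leaf)
    return new_hierarchy

# Toy two-level hierarchy (hypothetical categories).
hierarchy = {
    "sports": ["football", "basketball"],
    "finance": ["stocks", "banking"],
}
parent_of = {leaf: p for p, children in hierarchy.items() for leaf in children}

# Hypothetical validation results: true leaf label and the top-level
# class chosen by the high-level classifier for each instance.
true_leaves = ["football", "football", "stocks", "stocks", "stocks", "banking"]
predicted_parents = ["sports", "sports", "sports", "sports", "finance", "finance"]

blocked = blocking_distribution(true_leaves, predicted_parents, parent_of)
new_hierarchy = reconstruct(hierarchy, true_leaves, blocked)
```

In this toy data, two of the three "stocks" instances are blocked at "sports", so the reconstructed hierarchy lists "stocks" under both parents and a blocked instance can still reach its correct subclass.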


  How to cite this article:

Wen Li, Weili Wang and Ling Chai, 2013. Blocking Distribution Based Hierarchical Reconstruction for Text Categorization. Journal of Applied Sciences, 13: 2123-2126.

DOI: 10.3923/jas.2013.2123.2126



