Journal of Applied Sciences

Year: 2014 | Volume: 14 | Issue: 2 | Page No.: 177-182
DOI: 10.3923/jas.2014.177.182
Active Learning Based Semi-automatic Annotation of Event Corpus
Jianfeng Fu, Nianzu Liu and Shuangcheng Wang

Abstract: In Natural Language Processing, building a corpus by hand is a hard and time-consuming task. Active learning promises to reduce the cost of annotating a dataset because the learner is allowed to choose the data from which it learns. This study presents a semi-automatic annotation method based on active learning for labeling events in Chinese text. In particular, it focuses on uncertainty-based sampling and query-by-committee sampling algorithms to identify which instances in the unlabeled dataset are informative and should be labeled by hand. The selected informative instances are labeled manually to obtain a more effective classifier. Experimental results demonstrate not only that active learning improves the accuracy of Chinese event annotation but also that it dramatically reduces the number of labeling actions.
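
The two sampling strategies named in the abstract can be illustrated with a minimal sketch. The abstract does not give the scoring functions used in the study, so this example assumes two common choices: predictive entropy for uncertainty-based sampling and vote entropy for query-by-committee. The function names, the example probabilities, and the labels ("event", "none", "other") are hypothetical and serve only as illustration.

```python
import math
from collections import Counter


def entropy(probs):
    """Shannon entropy (natural log) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def uncertainty_sampling(prob_rows, k):
    """Return indices of the k instances whose predicted class
    distribution has the highest entropy (assumed uncertainty score)."""
    ranked = sorted(range(len(prob_rows)),
                    key=lambda i: entropy(prob_rows[i]),
                    reverse=True)
    return ranked[:k]


def vote_entropy(votes):
    """Entropy of the label distribution voted by committee members."""
    counts = Counter(votes)
    total = len(votes)
    return entropy([c / total for c in counts.values()])


def qbc_sampling(committee_votes, k):
    """Return indices of the k instances on which the committee
    disagrees most (assumed disagreement score: vote entropy)."""
    ranked = sorted(range(len(committee_votes)),
                    key=lambda i: vote_entropy(committee_votes[i]),
                    reverse=True)
    return ranked[:k]


# Hypothetical classifier outputs for three unlabeled instances.
prob_rows = [[0.9, 0.1], [0.5, 0.5], [0.7, 0.3]]
# Hypothetical votes from a three-member committee on the same instances.
committee_votes = [
    ["event", "event", "event"],
    ["event", "none", "event"],
    ["event", "none", "other"],
]

to_label_uncertainty = uncertainty_sampling(prob_rows, 2)
to_label_qbc = qbc_sampling(committee_votes, 2)
```

In either case, the selected indices would be passed to a human annotator, and the newly labeled instances would be added to the training set before the classifier is retrained.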

How to cite this article
Jianfeng Fu, Nianzu Liu and Shuangcheng Wang, 2014. Active Learning Based Semi-automatic Annotation of Event Corpus. Journal of Applied Sciences, 14: 177-182.

© Science Alert. All Rights Reserved