Information Technology Journal
Year: 2007  |  Volume: 6  |  Issue: 8  |  Page No.: 1190 - 1198

Hidden Markov Model Based Part of Speech Tagger for Urdu

Waqas Anwar, Xuan Wang, Lu Li and Xiaolong Wang

Abstract: In this study, we present preliminary results on applying a Hidden Markov Model (HMM) to part-of-speech tagging for the Urdu language. The HMM combines lexical and transition probabilities. An important feature of our tagger is that it pairs the HMM with several well-known smoothing techniques to address the data sparseness problem. The proposed HMM-based Urdu part-of-speech tagger achieves significant performance under the different smoothing methods. We evaluate the tagger's results across smoothing methods and word-level accuracies using Analysis of Variance (ANOVA) and show that the results are significant. We also compile a confusion matrix of the tag pairs that are most frequently confused. The development of this tagger is an important milestone for Urdu language processing and opens new research directions for the field.
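
To illustrate how lexical and transition probabilities combine in an HMM tagger, the following is a minimal sketch and not the authors' implementation. The abstract does not name the smoothing methods used, so add-one (Laplace) smoothing stands in here as an assumption. The function names, the three-tag tagset and the toy English corpus are hypothetical placeholders chosen only to keep the example short; decoding uses the standard Viterbi algorithm in log space.

```python
import math
from collections import defaultdict

START = "<s>"  # hypothetical sentence-start pseudo-tag


def train_hmm(tagged_sentences):
    """Count tag-to-tag transitions and tag-to-word (lexical) emissions."""
    trans = defaultdict(lambda: defaultdict(int))
    emit = defaultdict(lambda: defaultdict(int))
    vocab = set()
    for sentence in tagged_sentences:
        prev = START
        for word, tag in sentence:
            trans[prev][tag] += 1
            emit[tag][word] += 1
            vocab.add(word)
            prev = tag
    return trans, emit, sorted(emit), vocab


def log_prob(counts, context, event, n_outcomes):
    """Add-one smoothed log P(event | context); an assumed stand-in smoother."""
    row = counts.get(context, {})
    total = sum(row.values())
    return math.log((row.get(event, 0) + 1) / (total + n_outcomes))


def viterbi(words, model):
    """Return the most probable tag sequence for a list of words."""
    if not words:
        return []
    trans, emit, tags, vocab = model
    n_tags = len(tags)
    n_words = len(vocab) + 1  # +1 reserves probability mass for unseen words
    # best[t] = (log score of the best path ending in tag t, that path)
    best = {
        t: (log_prob(trans, START, t, n_tags)
            + log_prob(emit, t, words[0], n_words), [t])
        for t in tags
    }
    for word in words[1:]:
        new_best = {}
        for t in tags:
            # Pick the predecessor tag that maximizes path score + transition.
            score, path = max(
                ((best[p][0] + log_prob(trans, p, t, n_tags), best[p][1])
                 for p in tags),
                key=lambda item: item[0],
            )
            new_best[t] = (score + log_prob(emit, t, word, n_words), path + [t])
        best = new_best
    return max(best.values(), key=lambda item: item[0])[1]


# Toy corpus (hypothetical, English placeholders rather than Urdu data).
corpus = [
    [("the", "DT"), ("dog", "NN"), ("barks", "VB")],
    [("the", "DT"), ("cat", "NN"), ("sleeps", "VB")],
    [("a", "DT"), ("dog", "NN"), ("sleeps", "VB")],
]
model = train_hmm(corpus)
print(viterbi(["the", "cat", "barks"], model))
print(viterbi(["a", "bird", "sleeps"], model))  # "bird" is unseen in training
```

Without smoothing, the unseen word in the second call would receive zero lexical probability under every tag and no path could be scored, which is the data sparseness problem the abstract refers to.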
