Cao Hui
Chinese National Institute of Information Technology, Northwest University for Nationalities, Lanzhou, Gansu, China, 730030
Dong Xiaofang
Chinese National Institute of Information Technology, Northwest University for Nationalities, Lanzhou, Gansu, China, 730030
ABSTRACT
Bilingual phrase extraction is one of the key steps in the phrase-based translation model of statistical machine translation, and this study focuses on extracting bilingual phrases accurately and with sufficient coverage. The final phrase translation probability table is obtained by improving the phrase extraction algorithm. In the word alignment matrix, a single Tibetan word is often aligned to several Chinese words. Phrase pairs are first extracted with the Och algorithm; when a candidate does not satisfy the Och condition, information from Tibetan dictionaries is added. The two methods were compared on Tibetan-Chinese parallel corpora, using corpora of the same size from different linguistic sources and sets with different numbers of sentence pairs; the improved method gave better results in the experiments.
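The Och condition mentioned in the abstract is commonly stated as a consistency check on the word alignment: a source span and a target span form a phrase pair only if no word inside either span is aligned to a word outside the other. The following is a minimal sketch of that standard check, not the authors' implementation. The function name `extract_phrases`, the index-based representation, and the `max_len` limit are assumptions made for illustration. The extension over unaligned boundary words and the dictionary-based step are omitted, because the abstract does not specify them.

```python
def extract_phrases(src_len, alignment, max_len=4):
    """Return consistent phrase pairs as ((i1, i2), (j1, j2)) inclusive spans.

    alignment is a set of (source_index, target_index) links.
    This is a simplified, illustrative version of consistency-based extraction.
    """
    pairs = []
    for i1 in range(src_len):
        for i2 in range(i1, min(i1 + max_len, src_len)):
            # Target positions linked to any source word in [i1, i2].
            tgt_points = [j for (i, j) in alignment if i1 <= i <= i2]
            if not tgt_points:
                continue
            j1, j2 = min(tgt_points), max(tgt_points)
            if j2 - j1 + 1 > max_len:
                continue
            # Consistency: every link touching [j1, j2] must start in [i1, i2].
            if all(i1 <= i <= i2 for (i, j) in alignment if j1 <= j <= j2):
                pairs.append(((i1, i2), (j1, j2)))
    return pairs


# Hypothetical example: source word 0 is aligned to target words 0 and 1,
# mirroring the one-to-many case described in the abstract.
example_alignment = {(0, 0), (0, 1), (1, 2)}
example_pairs = extract_phrases(2, example_alignment)
```

In this example the one-to-many link is kept together: source word 0 can only be paired with the target span covering both of its aligned words, never with one of them alone.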
Received: August 07, 2013;
Accepted: November 08, 2013;
Published: November 13, 2013
How to cite this article
Cao Hui and Dong Xiaofang, 2013. Improved of Phrase Extraction Algorithm in Tibetan and Chinese Statistical Machine Translation. Journal of Applied Sciences, 13: 5230-5234.
DOI: 10.3923/jas.2013.5230.5234
URL: https://scialert.net/abstract/?doi=jas.2013.5230.5234