Improved of Phrase Extraction Algorithm in Tibetan and Chinese Statistical Machine Translation
Abstract:
Bilingual phrase extraction is one of the key steps in the phrase-based
model of statistical machine translation. This study focuses on extracting
bilingual phrases accurately and with sufficient coverage. An improved phrase
extraction algorithm is used to produce the final phrase translation
probability table. In the word alignment matrix, a single Tibetan word can be
aligned to several Chinese words. Phrase pairs are first extracted with the
Och algorithm; when a candidate pair does not satisfy the Och condition,
information from Tibetan dictionaries is added. The two methods were compared
on Tibetan-Chinese parallel corpora of equal size, across different corpora
and different sets of sentence pairs, and the improved method gave better
results in the experiments.
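The extraction step described in the abstract can be illustrated with a minimal sketch. It assumes that the "Och condition" is the usual alignment-consistency requirement (no word inside a phrase pair is aligned to a word outside it), and that the dictionary fallback simply accepts a dictionary entry for a source phrase whose aligned span fails that requirement. The abstract does not specify how the dictionary information is used, so the fallback rule, the function names `extract_phrases` and `contains`, the `max_len` limit, and the placeholder tokens `t1`, `c1`, ... are all hypothetical and are not the authors' implementation. Extension over unaligned boundary words is omitted for brevity.

```python
def contains(seq, sub):
    """True if the token list `sub` occurs contiguously inside `seq`."""
    n = len(sub)
    return any(seq[k:k + n] == sub for k in range(len(seq) - n + 1))


def extract_phrases(src, tgt, alignment, dictionary=None, max_len=4):
    """Collect phrase pairs from one word-aligned sentence pair.

    src, tgt   : lists of tokens (e.g. Tibetan and Chinese words)
    alignment  : set of (src_index, tgt_index) links
    dictionary : optional {source phrase: target phrase} map (assumed format)
    """
    dictionary = dictionary or {}
    pairs = set()
    for i1 in range(len(src)):
        for i2 in range(i1, min(i1 + max_len, len(src))):
            # Target positions linked to any source word in [i1, i2].
            linked = [j for (i, j) in alignment if i1 <= i <= i2]
            if not linked:
                continue
            j1, j2 = min(linked), max(linked)
            if j2 - j1 + 1 > max_len:
                continue
            src_phrase = " ".join(src[i1:i2 + 1])
            # Consistency: every link into [j1, j2] must start in [i1, i2].
            consistent = all(i1 <= i <= i2
                             for (i, j) in alignment if j1 <= j <= j2)
            if consistent:
                pairs.add((src_phrase, " ".join(tgt[j1:j2 + 1])))
            elif src_phrase in dictionary:
                # Hypothetical fallback: accept the dictionary entry if it
                # actually occurs inside the aligned target span.
                entry = dictionary[src_phrase]
                if contains(tgt[j1:j2 + 1], entry.split()):
                    pairs.add((src_phrase, entry))
    return pairs


# One source word (t2) aligned to two target words (c2, c3).
src = ["t1", "t2", "t3"]
tgt = ["c1", "c2", "c3", "c4"]
links = {(0, 0), (1, 1), (1, 2), (2, 3)}
for pair in sorted(extract_phrases(src, tgt, links)):
    print(pair)
```

In this sketch a one-to-many link such as `t2` to `c2 c3` still yields a consistent pair. The dictionary is consulted only when the links cross, so that the aligned target span also contains words linked to source words outside the candidate phrase.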
How to cite this article
Cao Hui and Dong Xiaofang, 2013. Improved of Phrase Extraction Algorithm in Tibetan and Chinese Statistical Machine Translation. Journal of Applied Sciences, 13: 5230-5234.