
Information Technology Journal

Year: 2013 | Volume: 12 | Issue: 19 | Page No.: 4895-4900
DOI: 10.3923/itj.2013.4895.4900
Phrase-table Filtering for Phrase-based Machine Translation
Li Bin, Ma Ning and Liang Wuqi

Abstract: Phrase-based machine translation models produce better translations than word-based models, but many of the phrase pairs they extract do not encode any relevant context. To reduce the number of phrase pairs in the phrase table, we describe and compare several approaches for filtering phrase pairs. When the filtered phrase tables were tested in our Machine Translation (MT) system, all of the approaches performed satisfactorily. Among them, the Model-best method gave very good results. Building on Model-best, we propose two combined methods, “CPS and MB” and “LLR and MB”, which combine Model-best with the “Composition” method and with the Log Likelihood Ratio (LLR), respectively. Both combined methods achieve better translation performance than Model-best alone.
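
To make the Log Likelihood Ratio criterion named in the abstract concrete, the following is a minimal sketch of LLR-based phrase-pair filtering, not the authors' implementation. It assumes the usual 2x2 contingency table over sentence-level co-occurrence counts and the G² statistic of Dunning (1993). The function names (`llr`, `filter_by_llr`), the count dictionaries, the positive-association check and the threshold of 10.83 (the chi-square critical value for p = 0.001 with one degree of freedom) are illustrative assumptions; the abstract does not specify how LLR is combined with Model-best.

```python
import math


def _xlogx(k: float) -> float:
    """Return k * ln(k), with the convention 0 * ln(0) = 0."""
    return k * math.log(k) if k > 0 else 0.0


def llr(k11: float, k12: float, k21: float, k22: float) -> float:
    """G^2 log-likelihood ratio for a 2x2 contingency table.

    k11: sentence pairs containing both the source and the target phrase
    k12: source phrase only, k21: target phrase only, k22: neither
    """
    n = k11 + k12 + k21 + k22
    cells = _xlogx(k11) + _xlogx(k12) + _xlogx(k21) + _xlogx(k22)
    rows = _xlogx(k11 + k12) + _xlogx(k21 + k22)
    cols = _xlogx(k11 + k21) + _xlogx(k12 + k22)
    # Clamp tiny negative values caused by floating-point rounding.
    return max(0.0, 2.0 * (cells + _xlogx(n) - rows - cols))


def filter_by_llr(joint, src_count, tgt_count, n, threshold=10.83):
    """Keep phrase pairs whose co-occurrence is significantly above chance.

    joint maps (source, target) to a co-occurrence count; src_count and
    tgt_count map each phrase to its count; n is the number of sentence
    pairs. All of these names are hypothetical, chosen for illustration.
    """
    kept = {}
    for (src, tgt), c_st in joint.items():
        c_s, c_t = src_count[src], tgt_count[tgt]
        score = llr(c_st, c_s - c_st, c_t - c_st, n - c_s - c_t + c_st)
        # LLR is two-sided, so also require a positive association
        # (observed co-occurrence above its expected value): an assumption.
        if score >= threshold and c_st * n > c_s * c_t:
            kept[(src, tgt)] = score
    return kept


if __name__ == "__main__":
    joint = {("maison", "house"): 50, ("la", "house"): 10}
    src_count = {"maison": 50, "la": 20}
    tgt_count = {"house": 50}
    print(filter_by_llr(joint, src_count, tgt_count, n=100))
```

In this toy input, the pair ("la", "house") co-occurs exactly as often as chance predicts, so its score is zero and it is filtered out, while ("maison", "house") is kept.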


How to cite this article
Li Bin, Ma Ning and Liang Wuqi, 2013. Phrase-table Filtering for Phrase-based Machine Translation. Information Technology Journal, 12: 4895-4900.

Keywords: Machine translation, phrase pairs filtering, combination of methods, phrase-table and model-best

