
Phrase extraction based on constraints of word similarities

  • Harbin Institute of Technology

Research output: Contribution to journal › Article › peer-review

Abstract

To address the problem that the traditional phrase extraction method depends strictly on word alignments and is not robust to alignment errors, a loose phrase extraction method that does not strictly depend on word alignments is proposed. In this method, constraints are imposed on alignment points to avoid ill-formed phrase pairs. Three constraint strategies are proposed, each based on a word similarity measure: the Dice coefficient, the Phi-square coefficient, and the log-likelihood ratio. Experiments were carried out on the IWSLT 2004 corpus. Results show that the best loose phrase extraction configuration improves the BLEU score by 15.14% compared with the baseline system Pharaoh.
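The abstract names the three similarity measures but does not give their formulas. As a rough illustration only, the sketch below computes the textbook versions of each from a 2x2 co-occurrence table for a source word s and a target word t over aligned sentence pairs. The function names, the count-based setup, and the use of Dunning's G² statistic for the log-likelihood ratio are assumptions made for this example, not details confirmed by the paper.

```python
import math


def contingency(n_st, n_s, n_t, n):
    """Build 2x2 co-occurrence cells (hypothetical helper, not from the paper).

    n_st: sentence pairs containing both s and t
    n_s:  sentence pairs whose source side contains s
    n_t:  sentence pairs whose target side contains t
    n:    total number of sentence pairs
    """
    a = n_st                    # s present, t present
    b = n_s - n_st              # s present, t absent
    c = n_t - n_st              # s absent,  t present
    d = n - n_s - n_t + n_st    # s absent,  t absent
    return a, b, c, d


def dice(a, b, c, d):
    """Dice coefficient: 2 * joint count / (count of s + count of t)."""
    denom = 2 * a + b + c
    return 2 * a / denom if denom else 0.0


def phi_square(a, b, c, d):
    """Phi-square coefficient: (ad - bc)^2 over the product of the marginals."""
    denom = (a + b) * (a + c) * (b + d) * (c + d)
    return (a * d - b * c) ** 2 / denom if denom else 0.0


def log_likelihood_ratio(a, b, c, d):
    """Dunning-style G^2 statistic: 2 * sum(O * ln(O / E)) over the four cells."""
    n = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    observed = ((a, b), (c, d))
    g = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:  # empty cells contribute nothing to the sum
                e = rows[i] * cols[j] / n  # expected count under independence
                g += o * math.log(o / e)
    return 2.0 * g


if __name__ == "__main__":
    # Made-up counts purely for demonstration.
    cells = contingency(n_st=8, n_s=10, n_t=12, n=100)
    print("Dice:", dice(*cells))
    print("Phi-square:", phi_square(*cells))
    print("LLR:", log_likelihood_ratio(*cells))
```

In a constraint-based extractor, scores like these could be compared against a threshold to decide whether an alignment point is trustworthy; how the paper actually applies them is not stated in the abstract.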

Original language: English
Pages (from-to): 775-778
Number of pages: 4
Journal: Harbin Gongye Daxue Xuebao/Journal of Harbin Institute of Technology
Volume: 42
Issue number: 5
State: Published - May 2010

Keywords

  • Machine translation
  • Phrase extraction
  • Statistical machine translation
  • Word similarity
