
Pivot approach for extracting paraphrase patterns from bilingual corpora

  • Shiqi Zhao*
  • Haifeng Wang
  • Ting Liu
  • Sheng Li
  • *Corresponding author for this work
  • Harbin Institute of Technology
  • Toshiba Corporation

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Paraphrase patterns are useful in paraphrase recognition and generation. In this paper, we present a pivot approach for extracting paraphrase patterns from bilingual parallel corpora, in which English paraphrase patterns are extracted using sentences in a foreign language as pivots. We propose a log-linear model to compute the paraphrase likelihood of two patterns and exploit feature functions based on maximum likelihood estimation (MLE) and lexical weighting (LW). Using the presented method, we extract over 1,000,000 pairs of paraphrase patterns from 2M bilingual sentence pairs, with a precision exceeding 67%. The evaluation results show that: (1) The pivot approach is effective in extracting paraphrase patterns and significantly outperforms the conventional method DIRT; in particular, the log-linear model with the proposed feature functions achieves high performance. (2) The coverage of the extracted paraphrase patterns is high, above 84%. (3) The extracted paraphrase patterns can be classified into 5 types, which are useful in various applications.
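
The abstract does not give the scoring formula, so the following is only a minimal sketch of how a pivot-based log-linear score could be computed, under stated assumptions. It assumes the score of a pattern pair sums, over shared pivot patterns c, the exponential of a weighted sum of log-probability features, and it uses only two MLE features, log p(c|e1) and log p(e2|c); the lexical weighting features are omitted. The function names, the weights, and the toy counts (with placeholder pivot labels `P1`, `P2`) are hypothetical and are not taken from the paper.

```python
import math
from collections import defaultdict


def mle_tables(aligned_counts):
    """Estimate p(c|e) and p(e|c) by relative frequency.

    aligned_counts: iterable of (english_pattern, pivot_pattern, count).
    """
    joint = defaultdict(float)
    e_total = defaultdict(float)
    c_total = defaultdict(float)
    for e, c, n in aligned_counts:
        joint[(e, c)] += n
        e_total[e] += n
        c_total[c] += n
    p_c_given_e = {(e, c): n / e_total[e] for (e, c), n in joint.items()}
    p_e_given_c = {(e, c): n / c_total[c] for (e, c), n in joint.items()}
    return p_c_given_e, p_e_given_c


def paraphrase_score(e1, e2, p_c_given_e, p_e_given_c, weights=(1.0, 1.0)):
    """Sum over pivots c of exp(w1 * log p(c|e1) + w2 * log p(e2|c))."""
    score = 0.0
    for (e, c), p_pivot in p_c_given_e.items():
        if e != e1 or (e2, c) not in p_e_given_c:
            continue  # c is not a pivot shared by e1 and e2
        h1 = math.log(p_pivot)
        h2 = math.log(p_e_given_c[(e2, c)])
        score += math.exp(weights[0] * h1 + weights[1] * h2)
    return score


# Toy alignment counts (hypothetical): English pattern, pivot pattern, count.
toy_counts = [
    ("consider X", "P1", 3),
    ("consider X", "P2", 1),
    ("take X into consideration", "P1", 1),
    ("ignore X", "P2", 1),
]
p_ce, p_ec = mle_tables(toy_counts)
score_take = paraphrase_score("consider X", "take X into consideration", p_ce, p_ec)
score_ignore = paraphrase_score("consider X", "ignore X", p_ce, p_ec)
```

With both weights set to 1, the score reduces to the plain pivot sum of p(c|e1) · p(e2|c); other weights let one feature count more than the other, which is the flexibility a log-linear formulation offers.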

Original language: English
Title of host publication: ACL-08
Subtitle of host publication: HLT - 46th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Conference
Pages: 780-788
Number of pages: 9
State: Published - 2008
Event: 46th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, ACL-08: HLT - Columbus, OH, United States
Duration: 15 Jun 2008 to 20 Jun 2008

Publication series

Name: ACL-08: HLT - 46th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Conference

Conference

Conference: 46th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, ACL-08: HLT
Country/Territory: United States
City: Columbus, OH
Period: 15/06/08 to 20/06/08
