Enhancing word distinction for bilingual lexicon induction with generalized antonym knowledge

  • Harbin Institute of Technology

Research output: Contribution to journal › Article › peer-review

Abstract

Most Bilingual Lexicon Induction (BLI) methods map monolingual word embeddings (WEs) into a shared semantic space and treat nearest cross-lingual neighbors as translation pairs. A common challenge with these techniques is that semantically dissimilar words tend to cluster together in the WE space, which makes it difficult to identify translations accurately. To address this problem, we propose a novel method that leverages antonym knowledge to enhance the separation between words with different semantics in the WE space. The knowledge of generalized antonyms is mined from data commonly used in BLI. Specifically, we jointly use seed lexicons and monolingual WEs to identify semantically different words, which we refer to as “generalized antonyms.” These generalized antonyms share high cosine similarity within the monolingual WE space and cause semantic confusion. The identified “generalized antonyms” then serve as “fixed anchor points” to guide the training of the BLI model. The method requires no additional data and can be applied to any language pair. Comprehensive experiments demonstrate that our proposed method outperforms existing state-of-the-art (SOTA) BLI methods on nearly all of the diverse language pairs evaluated. Further analysis also shows that our method effectively enhances the distinction between words.
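
The mining step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact criterion, so the rule used here is an assumption: two source-language words count as a candidate pair when the seed lexicon gives them no translation in common and their monolingual cosine similarity is at or above a threshold. The function name `mine_generalized_antonyms`, the `threshold` parameter, and the toy vectors are all hypothetical and are not taken from the paper.

```python
import numpy as np


def mine_generalized_antonyms(words, vectors, seed_lexicon, threshold=0.8):
    """Hypothetical sketch: find word pairs that are close in the monolingual
    embedding space but share no translation in the seed lexicon.

    This is an assumed reading of the abstract, not the authors' procedure.
    """
    # Normalize each row so that a dot product equals cosine similarity.
    unit = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    sims = unit @ unit.T

    pairs = []
    for i in range(len(words)):
        for j in range(i + 1, len(words)):
            wi, wj = words[i], words[j]
            # Only words covered by the seed lexicon can be compared.
            if wi not in seed_lexicon or wj not in seed_lexicon:
                continue
            # A shared translation suggests the words are not semantically different.
            if seed_lexicon[wi] & seed_lexicon[wj]:
                continue
            if sims[i, j] >= threshold:
                pairs.append((wi, wj, float(sims[i, j])))
    return pairs


# Toy data (invented for illustration only).
words = ["hot", "cold", "dog"]
vectors = np.array([[1.0, 0.1], [0.9, 0.2], [0.0, 1.0]])
seed_lexicon = {"hot": {"caliente"}, "cold": {"frío"}, "dog": {"perro"}}

pairs = mine_generalized_antonyms(words, vectors, seed_lexicon)
print(pairs)
```

On this toy input only "hot" and "cold" are returned, since they are near neighbors in the embedding space yet translate differently. How such pairs are then used as fixed anchor points during training is not specified in the abstract and is left out of the sketch.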

Original language: English
Article number: 113247
Journal: Knowledge-Based Systems
Volume: 315
State: Published - 22 Apr 2025

Keywords

  • Bilingual lexicon induction
  • Knowledge acquisition
  • Machine translation
  • Word translation
