Point set registration for unsupervised bilingual lexicon induction

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Inspired by the observation that word embeddings exhibit isomorphic structure across languages, we propose a novel method to induce a bilingual lexicon from only two sets of word embeddings, trained on monolingual source and target data respectively. We achieve this by formulating the task as point set registration, a more general problem. We show that a transformation from the source to the target embedding space can be learned automatically without any form of cross-lingual supervision. By adapting a traditional point set registration model to make it suitable for word embeddings, we achieve state-of-the-art performance on the unsupervised bilingual lexicon induction task. Because point set registration is well studied and can be solved by many elegant models, this formulation opens up a new opportunity to capture the universal lexical semantic structure across languages.
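
To make the idea of registration without cross-lingual supervision concrete, here is a minimal sketch. It is not the model from the paper. It assumes toy 2-D points in place of word embeddings, a pure rotation as the transformation, and soft Gaussian correspondences with an annealed width. The words, coordinates, and the function names `register` and `induce_lexicon` are all hypothetical and chosen only for illustration.

```python
import math

def rotate(p, theta):
    """Rotate a 2-D point about the origin by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * p[0] - s * p[1], s * p[0] + c * p[1])

def sq_dist(a, b):
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2

def register(source, target, iters=25, sigma=0.5, decay=0.7):
    """Estimate the rotation aligning source to target with no known pairs."""
    theta = 0.0
    for _ in range(iters):
        num = den = 0.0
        for x in source:
            moved = rotate(x, theta)
            # Soft correspondences: Gaussian weights over all target points.
            d2 = [sq_dist(moved, y) for y in target]
            nearest = min(d2)  # subtracted for numerical stability
            w = [math.exp(-(d - nearest) / (2 * sigma ** 2)) for d in d2]
            z = sum(w)
            for wj, y in zip(w, target):
                p = wj / z
                num += p * (x[0] * y[1] - x[1] * y[0])  # weighted cross product
                den += p * (x[0] * y[0] + x[1] * y[1])  # weighted dot product
        # Closed-form rotation minimizing the weighted squared distances.
        theta = math.atan2(num, den)
        sigma = max(sigma * decay, 0.05)  # anneal toward hard assignments
    return theta

def induce_lexicon(src_words, source, tgt_words, target, theta):
    """Pair each source word with its nearest target word after alignment."""
    lexicon = {}
    for word, x in zip(src_words, source):
        moved = rotate(x, theta)
        j = min(range(len(target)), key=lambda k: sq_dist(moved, target[k]))
        lexicon[word] = tgt_words[j]
    return lexicon

# Hypothetical toy data: the target space is a rotated, shuffled copy of the source.
src_words = ["cat", "dog", "house", "water", "book"]
source = [(1.0, 0.0), (0.0, 2.0), (-3.0, 0.5), (0.5, -1.5), (2.0, 2.0)]
translations = ["gato", "perro", "casa", "agua", "libro"]
true_theta = 0.2
order = [3, 0, 4, 1, 2]  # shuffling hides the correspondence from the algorithm
tgt_words = [translations[i] for i in order]
target = [rotate(source[i], true_theta) for i in order]

theta = register(source, target)
lexicon = induce_lexicon(src_words, source, tgt_words, target, theta)
print(theta, lexicon)
```

The sketch only works here because the toy target is an exact rotation of the source and the initial misalignment is small. Real embedding spaces are high-dimensional and only approximately isomorphic, so a practical model needs a richer transformation and a more robust correspondence estimate.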

Original language: English
Title of host publication: Proceedings of the 27th International Joint Conference on Artificial Intelligence, IJCAI 2018
Editors: Jerome Lang
Publisher: International Joint Conferences on Artificial Intelligence
Pages: 3991-3997
Number of pages: 7
ISBN (Electronic): 9780999241127
State: Published - 2018
Event: 27th International Joint Conference on Artificial Intelligence, IJCAI 2018 - Stockholm, Sweden
Duration: 13 Jul 2018 to 19 Jul 2018

Publication series

Name: IJCAI International Joint Conference on Artificial Intelligence
Volume: 2018-July
ISSN (Print): 1045-0823

Conference

Conference: 27th International Joint Conference on Artificial Intelligence, IJCAI 2018
Country/Territory: Sweden
City: Stockholm
Period: 13/07/18 to 19/07/18
