
Towards better UD parsing: Deep contextualized word embeddings, ensemble, and treebank concatenation

  • Harbin Institute of Technology

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

This paper describes our system (HIT-SCIR) submitted to the CoNLL 2018 shared task on Multilingual Parsing from Raw Text to Universal Dependencies. We base our submission on Stanford's winning system for the CoNLL 2017 shared task and make two effective extensions: 1) incorporating deep contextualized word embeddings into both the part-of-speech tagger and the dependency parser; 2) ensembling parsers trained with different initializations. We also explore different ways of concatenating treebanks for further improvements. Experimental results on the development data show the effectiveness of our methods. In the final evaluation, our system was ranked first according to LAS (75.84%) and outperformed the other systems by a large margin.

Original language: English
Title of host publication: CoNLL 2018 - SIGNLL Conference on Computational Natural Language Learning, Proceedings of the CoNLL 2018 Shared Task
Subtitle of host publication: Multilingual Parsing from Raw Text to Universal Dependencies
Publisher: Association for Computational Linguistics (ACL)
Pages: 55-64
Number of pages: 10
ISBN (Electronic): 9781948087827
State: Published - 2018
Event: 2018 SIGNLL Conference on Computational Natural Language Learning, CoNLL Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, CoNLL 2018 - Brussels, Belgium
Duration: 31 Oct 2018 to 1 Nov 2018

Publication series

Name: CoNLL 2018 - SIGNLL Conference on Computational Natural Language Learning, Proceedings of the CoNLL 2018 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies

Conference

Conference: 2018 SIGNLL Conference on Computational Natural Language Learning, CoNLL Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, CoNLL 2018
Country/Territory: Belgium
City: Brussels
Period: 31/10/18 to 1/11/18
