
Improving dependency parsing on clinical text with syntactic clusters from web text

  • School of Computer Science and Technology, Harbin Institute of Technology

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Treebanks for clinical text are insufficient for supervised dependency parsing in both scale and diversity, which leads to unsatisfactory parsing performance. The large amount of unlabeled text on the web can compensate for this scarcity to some extent. In this paper, we propose to acquire syntactic knowledge from web text in the form of syntactic cluster features to improve dependency parsing on clinical text. We parse the web text and compute a distributed representation of each word based on its contexts in the dependency trees. We then cluster the words according to their distributed representations and use the resulting syntactic cluster features to alleviate the data sparseness problem. Experiments on Genia show that syntactic cluster features improve the LAS (Labeled Attachment Score) of a dependency parser on clinical text by 1.62%. When syntactic clusters are combined with Brown clusters, LAS improves by 1.93%.
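
The pipeline described in the abstract (parse, build representations from dependency contexts, cluster, use cluster IDs as features) can be illustrated with a minimal sketch. This record gives no implementation details, so everything here is an assumption: the toy arcs, the function names, and the feature format are hypothetical, and normalised context-count vectors with plain k-means stand in for whatever distributed representation and clustering method the paper actually uses.

```python
from collections import Counter, defaultdict
import math

# Hypothetical toy stand-in for automatically parsed web text:
# each arc is (dependent, relation, head).
ARCS = [
    ("aspirin", "dobj", "take"), ("ibuprofen", "dobj", "take"),
    ("aspirin", "dobj", "prescribe"), ("ibuprofen", "dobj", "prescribe"),
    ("patients", "nsubj", "take"), ("doctors", "nsubj", "take"),
    ("patients", "nsubj", "prescribe"), ("doctors", "nsubj", "prescribe"),
]

def context_vectors(arcs):
    """Count dependency contexts per word and L2-normalise the counts."""
    counts = defaultdict(Counter)
    for dep, rel, head in arcs:
        counts[dep][(rel, "head", head)] += 1   # context seen from the dependent
        counts[head][(rel, "child", dep)] += 1  # context seen from the head
    dims = sorted({c for ctx in counts.values() for c in ctx})
    vectors = {}
    for word, ctx in counts.items():
        norm = math.sqrt(sum(v * v for v in ctx.values()))
        vectors[word] = [ctx[d] / norm for d in dims]
    return vectors

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(vectors, k, iterations=10):
    """Plain k-means with deterministic farthest-point initialisation."""
    words = sorted(vectors)
    centroids = [vectors[words[0]]]
    while len(centroids) < k:
        far = max(words, key=lambda w: min(sq_dist(vectors[w], c) for c in centroids))
        centroids.append(vectors[far])
    assign = {}
    for _ in range(iterations):
        # Assign every word to its nearest centroid.
        assign = {w: min(range(k), key=lambda i: sq_dist(vectors[w], centroids[i]))
                  for w in words}
        # Move each centroid to the mean of its members.
        for i in range(k):
            members = [vectors[w] for w in words if assign[w] == i]
            if members:
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    return assign

def cluster_feature(word, assign):
    """Feature string a parser could add next to the word form (hypothetical format)."""
    return f"cluster={assign.get(word, 'UNK')}"

vectors = context_vectors(ARCS)
clusters = kmeans(vectors, k=3)
```

In this toy data, words that occur in the same dependency contexts (for example the two drug names) receive the same cluster ID, so a parser that has seen only one of them in its treebank can still share statistics with the other through the cluster feature.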

Original language: English
Title of host publication: Neural Information Processing - 23rd International Conference, ICONIP 2016, Proceedings
Editors: Kenji Doya, Kazushi Ikeda, Minho Lee, Akira Hirose, Seiichi Ozawa, Derong Liu
Publisher: Springer Verlag
Pages: 470-478
Number of pages: 9
ISBN (Print): 9783319466866
DOIs
State: Published - 2016
Externally published: Yes
Event: 23rd International Conference on Neural Information Processing, ICONIP 2016 - Kyoto, Japan
Duration: 16 Oct 2016 → 21 Oct 2016

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 9947 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 23rd International Conference on Neural Information Processing, ICONIP 2016
Country/Territory: Japan
City: Kyoto
Period: 16/10/16 → 21/10/16
