LSTM-CRF for Drug-Named Entity Recognition

Research output: Contribution to journal › Article › peer-review

Abstract

Drug-Named Entity Recognition (DNER) in biomedical literature is a fundamental facilitator of Information Extraction. For this reason, the DDIExtraction2011 (DDI2011) and DDIExtraction2013 (DDI2013) challenges each introduced a task aimed at recognizing drug names. State-of-the-art DNER approaches rely heavily on hand-engineered features and domain-specific knowledge, which are difficult to collect and define. We therefore propose an approach that learns word-level and character-level features automatically: a recurrent neural network using bidirectional long short-term memory (LSTM) with Conditional Random Field decoding (LSTM-CRF). Two kinds of word representation are used in this work: word embeddings, which are trained on a large amount of text, and character-based representations, which can capture the orthographic features of words. Experimental results on the DDI2011 and DDI2013 datasets demonstrate the effectiveness of the proposed LSTM-CRF method. Our method outperforms the best system in the DDI2013 challenge.
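The CRF decoding step mentioned in the abstract can be illustrated with a minimal Viterbi sketch. This is not the authors' implementation: the tag set, the emission scores (standing in for the output of a bidirectional LSTM) and the transition scores are all hypothetical values chosen for illustration. The point it shows is that decoding over the whole sequence can reject a locally attractive but invalid tag, here an I-DRUG directly after O.

```python
# Illustrative Viterbi decoding for a linear-chain CRF layer.
# All tags and scores below are hypothetical, not taken from the paper.

TAGS = ["O", "B-DRUG", "I-DRUG"]

def viterbi_decode(emissions, transitions):
    """Return the highest-scoring tag index path and its score.

    emissions[t][k]   : score of tag k at token t (e.g. from a BiLSTM)
    transitions[i][j] : score of moving from tag i to tag j
    """
    n_tags = len(transitions)
    scores = list(emissions[0])          # best score ending in each tag so far
    backpointers = []                    # best previous tag for each step/tag
    for emit in emissions[1:]:
        new_scores, pointers = [], []
        for j in range(n_tags):
            best_prev = max(range(n_tags),
                            key=lambda i: scores[i] + transitions[i][j])
            new_scores.append(scores[best_prev] + transitions[best_prev][j] + emit[j])
            pointers.append(best_prev)
        scores = new_scores
        backpointers.append(pointers)
    # Follow the back-pointers from the best final tag.
    best_last = max(range(n_tags), key=lambda i: scores[i])
    path = [best_last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, scores[best_last]

# Hypothetical transition scores: O -> I-DRUG is strongly penalised.
transitions = [
    [0.0, 0.0, -100.0],  # from O
    [0.0, 0.0, 0.0],     # from B-DRUG
    [0.0, 0.0, 0.0],     # from I-DRUG
]
# Hypothetical emission scores for a three-token sentence.
emissions = [
    [2.0, 0.5, 0.1],  # token 0 looks like O
    [0.4, 0.5, 0.6],  # token 1: a greedy choice would pick I-DRUG
    [0.1, 0.2, 1.5],  # token 2 looks like I-DRUG
]

path, score = viterbi_decode(emissions, transitions)
decoded_tags = [TAGS[k] for k in path]
greedy_tags = [TAGS[max(range(len(TAGS)), key=lambda k: row[k])] for row in emissions]
```

With these scores, per-token greedy tagging yields O, I-DRUG, I-DRUG, while the sequence-level decoder returns O, B-DRUG, I-DRUG, because the transition penalty makes the greedy path score far lower overall.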

Original language: English
Article number: 283
Journal: Entropy
Volume: 19
Issue number: 6
State: Published - 1 Jun 2017
Externally published: Yes

Keywords

  • Conditional random field
  • Drug name entity recognition
  • Information extraction
  • Long short-term memory
