A bootstrapping approach to symptom entity extraction on Chinese electronic medical records

  • Tianyi Qin*
  • Yi Guan

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

Abstract

Symptom entities are widely distributed in Chinese electronic medical records (EMRs). Previous approaches to symptom entity extraction usually extract continuous strings as symptom entities and require massive human effort for corpus annotation. We describe a symptom entity as a two-tuple of <subject, lesion> and design a soft pattern matching method to locate these tuples in EMR sentences. Our bootstrapping approach requires only a few annotated symptom tuples and allows iterative extraction from massive electronic medical record databases without human supervision. Furthermore, the described method annotates symptom entities in EMRs using the extracted tuples. Starting with 60 annotated entities, our approach reached an F value of 81.40% in the extraction task of 3,150 entities from 992 sets of electronic medical records.
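
As an illustration only, the bootstrapping loop described in the abstract can be sketched in Python as follows. This is not the authors' implementation: the abstract does not specify the pattern representation or the similarity measure, so the sketch assumes a pattern is the token before the subject, the tokens between subject and lesion, and the token after the lesion, and it treats a match as "soft" when the outer tokens agree exactly and the middle tokens are similar according to `difflib.SequenceMatcher`. All function names, the 0.6 threshold, and the English toy sentences (standing in for segmented Chinese EMR text) are hypothetical.

```python
from difflib import SequenceMatcher


def context(tokens, i, j):
    """Context of a candidate pair: (token before i, tokens between i and j, token after j)."""
    left = tokens[i - 1] if i > 0 else "<s>"
    right = tokens[j + 1] if j + 1 < len(tokens) else "</s>"
    return left, tuple(tokens[i + 1:j]), right


def learn_patterns(sentences, tuples):
    """Collect the contexts in which known <subject, lesion> tuples occur."""
    patterns = set()
    for tokens in sentences:
        for subject, lesion in tuples:
            if subject in tokens and lesion in tokens:
                i, j = tokens.index(subject), tokens.index(lesion)
                if i < j:
                    patterns.add(context(tokens, i, j))
    return patterns


def soft_match(pattern, candidate, threshold):
    """Assumed soft match: outer tokens equal, middle tokens similar enough."""
    (p_left, p_mid, p_right), (c_left, c_mid, c_right) = pattern, candidate
    if p_left != c_left or p_right != c_right:
        return False
    return SequenceMatcher(None, p_mid, c_mid).ratio() >= threshold


def apply_patterns(sentences, patterns, threshold):
    """Return every token pair whose context soft-matches some pattern."""
    found = set()
    for tokens in sentences:
        for i in range(len(tokens)):
            for j in range(i + 1, len(tokens)):
                cand = context(tokens, i, j)
                if any(soft_match(p, cand, threshold) for p in patterns):
                    found.add((tokens[i], tokens[j]))
    return found


def bootstrap(sentences, seeds, threshold=0.6, max_iters=5):
    """Alternate pattern learning and tuple extraction until nothing new appears."""
    known = set(seeds)
    for _ in range(max_iters):
        patterns = learn_patterns(sentences, known)
        new = apply_patterns(sentences, patterns, threshold) - known
        if not new:
            break
        known |= new
    return known


# Hypothetical toy corpus; real input would be segmented Chinese EMR sentences.
SENTENCES = [
    ["patient", "reports", "abdomen", "has", "pain", "today"],
    ["patient", "reports", "chest", "has", "tightness", "today"],
    ["patient", "reports", "knee", "has", "mild", "swelling", "today"],
    ["exam", "shows", "chest", "with", "tightness", "noted"],
    ["exam", "shows", "neck", "with", "stiffness", "noted"],
]
SEEDS = {("abdomen", "pain")}

result = bootstrap(SENTENCES, SEEDS)
```

In this toy run, the single seed <abdomen, pain> yields <chest, tightness> and <knee, swelling> on the first pass (the latter only because the middle context is matched softly), and the pattern learned from <chest, tightness> then reaches <neck, stiffness> on the second pass. A practical system would also need to score patterns and candidates to limit the semantic drift that unconstrained bootstrapping is prone to.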

Original language: English
Title of host publication: Chinese Computational Linguistics and Natural Language Processing Based on Naturally Annotated Big Data - 15th China National Conference, CCL 2016 and 4th International Symposium, NLP-NABD 2016, Proceedings
Editors: Maosong Sun, Zhiyuan Liu, Yang Liu, Hongfei Lin, Xuanjing Huang
Publisher: Springer Verlag
Pages: 413-423
Number of pages: 11
ISBN (Print): 9783319476735
State: Published - 2016
Externally published: Yes
Event: 15th China National Conference on Chinese Computational Linguistics, CCL 2016 and 4th International Symposium on Natural Language Processing Based on Naturally Annotated Big Data, NLP-NABD 2016 - Yantai, China
Duration: 15 Oct 2016 to 16 Oct 2016

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 10035 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 15th China National Conference on Chinese Computational Linguistics, CCL 2016 and 4th International Symposium on Natural Language Processing Based on Naturally Annotated Big Data, NLP-NABD 2016
Country/Territory: China
City: Yantai
Period: 15/10/16 to 16/10/16

Keywords

  • Bootstrapping
  • Electronic medical record
  • Named entity extraction
  • Soft matching
