
Multi-document summarization based on lexical chains

  • Yan Min Chen*
  • Xiao Long Wang
  • Bing Quan Liu

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

Abstract

This paper investigates, for the first time, the use of lexical chains as a model of multiple documents written in Chinese to generate an indicative, moderately fluent summary. The algorithm that computes lexical chains from the HowNet knowledge database is modified to improve performance and to suit Chinese summarization. Based on an analysis of semantemes, the algorithm removes redundant similarities while retaining the differences in information content among the documents. The method first pre-processes the text, then constructs lexical chains and identifies the strong chains. Significant sentences are then extracted from each document and ordered, and redundant information is identified and removed. Finally, the summary is generated in chronological order, and anaphora resolution is applied to improve its fluency. Evaluation results show that the presented system clearly outperforms the baseline system and that lexical chains are effective for multi-document summarization.
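
The pipeline described in the abstract (build chains, keep the strong ones, extract sentences, drop redundant ones, order chronologically) can be illustrated with a toy sketch. This is not the authors' implementation. Everything in it is an assumption made for illustration: the `TOY_CONCEPTS` dictionary is a hypothetical stand-in for a HowNet lookup, the input is pre-tokenized English rather than segmented Chinese, a chain counts as "strong" when its occurrence count reaches the mean, redundancy is measured with Jaccard overlap of concept sets, and anaphora resolution is omitted.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical word -> concept map standing in for a HowNet lookup.
TOY_CONCEPTS = {
    "flood": "disaster", "storm": "disaster", "earthquake": "disaster",
    "rescue": "aid", "relief": "aid", "donation": "aid",
    "city": "place", "village": "place",
}


def build_chains(docs):
    """Group word occurrences by concept; each group is one lexical chain.

    docs: list of (iso_date, sentences), each sentence a list of tokens.
    Returns {concept: [(doc_index, sentence_index, word), ...]}.
    """
    chains = defaultdict(list)
    for d, (_, sentences) in enumerate(docs):
        for s, tokens in enumerate(sentences):
            for word in tokens:
                concept = TOY_CONCEPTS.get(word)
                if concept is not None:
                    chains[concept].append((d, s, word))
    return dict(chains)


def strong_chains(chains):
    """Keep chains with at least the mean number of occurrences (assumed criterion)."""
    if not chains:
        return {}
    threshold = mean(len(members) for members in chains.values())
    return {c: m for c, m in chains.items() if len(m) >= threshold}


def concepts_of(tokens):
    """Set of concepts mentioned in one sentence."""
    return {TOY_CONCEPTS[w] for w in tokens if w in TOY_CONCEPTS}


def summarize(docs, max_overlap=0.6):
    """Extract, order, and de-duplicate sentences that touch strong chains."""
    strong = strong_chains(build_chains(docs))
    # For each strong chain, pick the first sentence in each document that mentions it.
    picked = set()
    for members in strong.values():
        seen_docs = set()
        for d, s, _ in members:
            if d not in seen_docs:
                seen_docs.add(d)
                picked.add((d, s))
    # Chronological order: document date first, then position in the document.
    ordered = sorted(picked, key=lambda ds: (docs[ds[0]][0], ds[1]))
    summary, kept = [], []
    for d, s in ordered:
        tokens = docs[d][1][s]
        cs = concepts_of(tokens)
        # A sentence is redundant if its concept set overlaps a kept one too much.
        if any(len(cs & k) / len(cs | k) > max_overlap for k in kept):
            continue
        summary.append(tokens)
        kept.append(cs)
    return summary


docs = [
    ("2005-01-02", [["storm", "reached", "the", "city"],
                    ["rescue", "teams", "arrived"]]),
    ("2005-01-01", [["flood", "and", "storm", "hit", "the", "village"],
                    ["relief", "and", "donation", "followed", "the", "flood"]]),
]

for sentence in summarize(docs):
    print(" ".join(sentence))
```

In this toy run the "place" chain falls below the mean and is not treated as strong, and the sentence about the storm reaching the city is dropped because its concept set duplicates that of an earlier-dated sentence. A real system would replace the dictionary with HowNet-based relatedness and would need Chinese word segmentation before chain construction.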

Original language: English
Title of host publication: 2005 International Conference on Machine Learning and Cybernetics, ICMLC 2005
Pages: 1937-1942
Number of pages: 6
State: Published - 2005
Event: International Conference on Machine Learning and Cybernetics, ICMLC 2005 - Guangzhou, China
Duration: 18 Aug 2005 to 21 Aug 2005

Publication series

Name: 2005 International Conference on Machine Learning and Cybernetics, ICMLC 2005

Conference

Conference: International Conference on Machine Learning and Cybernetics, ICMLC 2005
Country/Territory: China
City: Guangzhou
Period: 18/08/05 to 21/08/05

Keywords

  • Cohesion
  • HowNet
  • Lexical chains
  • Multi-document summarization
  • Semanteme
