
Multi-stage Chinese collocation extraction

  • Rui Feng Xu*
  • Qin Lu

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

Abstract

A collocation is a recurrent, conventional expression in natural language. In this research, Chinese collocations are categorized into four types. Based on a statistical analysis of typical collocations of each type, a multi-stage, window-based collocation extraction system is designed, in which lexical statistics, synonym information, syntactic information, and dependency knowledge are used to extract n-gram collocations and the different types of bi-gram collocations separately. Experimental results show that the system achieves better precision and recall than existing statistical collocation extraction techniques.
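
The abstract does not state which statistics the system computes, so the following is only a rough illustration of what a window-based lexical-statistics stage can look like. It counts co-occurrences inside a fixed window and scores pairs with pointwise mutual information (PMI). The choice of PMI, the window size, the `min_count` threshold, and both function names are assumptions made for this sketch, not the authors' method. The input is assumed to be text that has already been word-segmented into a token list.

```python
import math
from collections import Counter


def window_cooccurrence(tokens, window=5):
    """Count ordered pairs (head, neighbor) found within +/- `window` positions."""
    pair_counts = Counter()
    for i, head in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pair_counts[(head, tokens[j])] += 1
    return pair_counts


def pmi_scores(tokens, window=5, min_count=2):
    """Score window co-occurring pairs by PMI (an assumed, illustrative measure).

    Pairs seen fewer than `min_count` times are dropped, since PMI is
    unreliable for rare events.
    """
    word_counts = Counter(tokens)
    pair_counts = window_cooccurrence(tokens, window)
    n_tokens = len(tokens)
    n_pairs = sum(pair_counts.values())
    scores = {}
    for (w1, w2), count in pair_counts.items():
        if count < min_count:
            continue
        p_pair = count / n_pairs
        p_w1 = word_counts[w1] / n_tokens
        p_w2 = word_counts[w2] / n_tokens
        scores[(w1, w2)] = math.log2(p_pair / (p_w1 * p_w2))
    return scores


if __name__ == "__main__":
    # Toy, pre-segmented input; a real corpus would be far larger.
    toy = ["a", "b", "c", "a", "b", "d"]
    for pair, score in sorted(pmi_scores(toy, window=1).items()):
        print(pair, round(score, 3))
```

A later stage of the kind the abstract describes (synonym, syntactic, or dependency filtering) would take candidate pairs like these as input; none of that is modeled here.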

Original language: English
Title of host publication: 2005 International Conference on Machine Learning and Cybernetics, ICMLC 2005
Pages: 3254-3259
Number of pages: 6
State: Published - 2005
Externally published: Yes
Event: International Conference on Machine Learning and Cybernetics, ICMLC 2005 - Guangzhou, China
Duration: 18 Aug 2005 to 21 Aug 2005

Publication series

Name: 2005 International Conference on Machine Learning and Cybernetics, ICMLC 2005

Conference

Conference: International Conference on Machine Learning and Cybernetics, ICMLC 2005
Country/Territory: China
City: Guangzhou
Period: 18/08/05 to 21/08/05

Keywords

  • Collocation extraction
  • Multi-stage extraction
  • Natural language processing
