
中文专利关键信息语料库的构建研究

Translated title of the contribution: Research on the construction of Chinese patent key information corpus
  • Wenting Zhang
  • Meihan Zhao
  • Yixuan Ma
  • Wenrui Wang
  • Yuzhe Liu
  • Muyun Yang*
  • Yu Deng
  • *Corresponding author for this work
  • Harbin Institute of Technology
  • Harbin Shineip Intellectual Property Corporation

Research output: Contribution to conference, Paper, peer-review

Abstract

As an important type of technical document, the patent is of substantial significance to China's national intellectual property strategy. Existing patent corpora are mostly built for information retrieval and machine translation tasks, leaving fine-grained annotation of patents largely unexplored. To facilitate the development of forthcoming intelligent patent technology, this paper constructs a Patent Key Information Corpus consisting of 313 patents annotated with the issues, methods, and effects described in their texts. State-of-the-art (SOTA) named entity recognition models are then applied to the corpus, and the sharp decrease in their performance indicates that automatically identifying the key information in a patent is a challenging information extraction (IE) task.

Original language: Chinese (Traditional)
Pages: 455-463
Number of pages: 9
State: Published - 2022
Event: 21st Chinese National Conference on Computational Linguistics, CCL 2022 - Nanchang, China
Duration: 14 Oct 2022 - 16 Oct 2022

Conference

Conference: 21st Chinese National Conference on Computational Linguistics, CCL 2022
Country/Territory: China
City: Nanchang
Period: 14/10/22 - 16/10/22
