
CMCOQA: A Chinese Medical Complex Open-Question Answering Benchmark

  • Faculty of Computing, Harbin Institute of Technology
  • Harbin Institute of Technology

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

With the development of Large Language Models (LLMs), many Chinese medical benchmarks have emerged. These benchmarks primarily use multiple-choice questions and open-ended questions as test items. However, our experimental results indicate that multiple-choice questions are poorly suited to testing the capabilities of LLMs, and that relatively simple open-ended questions do not effectively assess an LLM's actual grasp of medical knowledge. We therefore propose the Chinese Medical Complex Open-Question Answering Benchmark (CMCOQA), which is designed to evaluate the true medical proficiency of LLMs more accurately and efficiently by posing complex open-ended questions set in medical scenarios. The benchmark has three evaluation dimensions: Completeness, Depth, and Professionalism. Starting from 100 manually written complex questions as seeds, we expand the set to 1,200 questions using the Self-Instruct method with GPT-4o. GPT-4o then self-checks the questions, and a manual screening step follows to ensure broad coverage and sufficient depth. Both human raters and GPT-4o score answers along the three dimensions, and we also apply automated metrics. To validate the results, we calculate the correlations between these metrics and the human scores. Through this work, CMCOQA can further promote the development of Chinese medical LLMs in terms of medical professionalism.
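The validation step described in the abstract, correlating automated metric scores with human scores, can be illustrated with a minimal sketch. The abstract does not say which correlation coefficient the authors use, so the choice of Pearson and Spearman here is an assumption, and the function names (`pearson`, `average_ranks`, `spearman`) and the score lists are hypothetical placeholders, not data from the paper.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / sqrt(var_x * var_y)


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


if __name__ == "__main__":
    # Hypothetical per-answer scores, for illustration only (not from the paper).
    human_scores = [3.0, 4.5, 2.0, 5.0, 3.5]
    metric_scores = [0.41, 0.62, 0.30, 0.71, 0.39]
    print("Pearson: ", pearson(human_scores, metric_scores))
    print("Spearman:", spearman(human_scores, metric_scores))
```

A rank-based coefficient such as Spearman is often preferred when human ratings are ordinal, since it only asks whether the metric orders answers the same way the raters do.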

Original language: English
Title of host publication: Proceedings - 2024 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2024
Editors: Mario Cannataro, Huiru Zheng, Lin Gao, Jianlin Cheng, Joao Luis de Miranda, Ester Zumpano, Xiaohua Hu, Young-Rae Cho, Taesung Park
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 3402-3407
Number of pages: 6
ISBN (Electronic): 9798350386226
State: Published - 2024
Externally published: Yes
Event: 2024 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2024 - Lisbon, Portugal
Duration: 3 Dec 2024 to 6 Dec 2024

Publication series

Name: Proceedings - 2024 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2024

Conference

Conference: 2024 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2024
Country/Territory: Portugal
City: Lisbon
Period: 3/12/24 to 6/12/24

Keywords

  • Chinese Medical
  • Large Language Model
  • Medical Professionalism
  • Open-Ended Complex Question Answering
