
CSMT: Combining Snoring and Metadata-based Text for Sleep Apnea Severity Classification

  • Heng Li
  • Yukun Qian
  • Yun Lu*
  • Mingjiang Wang
  • *Corresponding author for this work
  • Harbin Institute of Technology Shenzhen
  • Huizhou University

Research output: Contribution to journal › Conference article › peer-review

Abstract

Sleep apnea is a common sleep disorder that, if left untreated, can lead to serious health issues. Snoring is a typical symptom of sleep apnea and can be used to develop a non-contact automatic detection method for sleep apnea severity classification (SASC). However, due to patient heterogeneity, the acoustic characteristics of snoring vary significantly among individuals. To address this issue, we introduced a text-audio multimodal model that leverages patients’ metadata to provide valuable supplementary information for the SASC task. Specifically, we fine-tuned a pretrained text-audio multimodal model using snoring sounds together with text descriptions derived from metadata. The metadata includes patients’ physical indicators such as gender, age, BMI, neck circumference, and blood pressure. We constructed a snoring dataset covering four sleep apnea severity levels, on which our method achieved a classification F-score of 74.34%. We conducted a series of ablation experiments to validate the effectiveness of leveraging both metadata-based text and snoring sounds for improving SASC performance. Additionally, we discussed the model’s performance in scenarios where parts of the metadata are unavailable, a situation that may occur in real-world applications.
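The core idea of deriving text descriptions from patient metadata can be sketched as a simple prompt-building step. This is an illustrative sketch only: the paper does not publish its exact prompt template, so the field names and phrasing below are assumptions. Missing fields are simply skipped, mirroring the scenario the abstract discusses where parts of the metadata are unavailable.

```python
def metadata_to_text(meta):
    """Render a patient-metadata dict as a natural-language prompt.

    Illustrative only: the exact template used by CSMT is not given in
    the abstract; field names and phrasing here are assumptions.
    Missing fields are omitted, so partial metadata still yields a
    usable description.
    """
    parts = []
    if meta.get("gender") is not None:
        parts.append(f"a {meta['gender']} patient")
    if meta.get("age") is not None:
        parts.append(f"aged {meta['age']} years")
    if meta.get("bmi") is not None:
        parts.append(f"with a BMI of {meta['bmi']}")
    if meta.get("neck_cm") is not None:
        parts.append(f"a neck circumference of {meta['neck_cm']} cm")
    if meta.get("blood_pressure") is not None:
        parts.append(f"blood pressure of {meta['blood_pressure']}")
    return "Snoring sound of " + ", ".join(parts) + "."


# Full metadata available:
print(metadata_to_text({
    "gender": "male", "age": 52, "bmi": 29.1,
    "neck_cm": 42, "blood_pressure": "135/88 mmHg",
}))
# Partial metadata (some indicators unavailable):
print(metadata_to_text({"gender": "female", "age": 47}))
```

A prompt of this form would then be paired with a snoring recording when fine-tuning a pretrained text-audio model such as CLAP-style architectures; that pairing step is model-specific and not shown here.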

Keywords

  • Metadata
  • Pretrained model
  • Sleep apnea severity classification
  • Snoring
  • Text-audio multimodal model

