
Novel Multikernel Trick for Predicting Pan-Cancer Distant Metastatic Sites Using a Feature Extraction Strategy

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

Abstract

Distant metastasis is the leading cause of cancer death. Identifying the tendency of a given cancer to metastasize could aid cancer diagnosis and treatment planning. In cancer studies, mRNA gene expression data have been widely used to predict cancer metastasis because they are easy to obtain and represent cancer progression directly and in detail. In these studies, feature extraction followed by a prediction model has been a common approach to predicting pan-cancer prognosis and tumor stage. Limitations of these studies include a lack of comprehensive feature extraction, relatively low accuracy in predicting cancer outcomes, and a lack of precise pan-cancer metastasis site prediction. To address these limitations, we designed an innovative pipeline to determine the heterogeneity of pan-cancer distant metastatic sites using mRNA gene expression data. We used a directed relational graph convolutional network (DR-GCN) for feature extraction and a multikernel support vector machine (SVM) for pan-cancer distant metastasis site prediction. DR-GCN extracted hidden features from gene-cancer, cancer-disease, and gene-gene relational networks, and was demonstrably able to handle complex feature extraction tasks based on prior knowledge. A dynamic-weight multikernel SVM was then applied to predict pan-cancer distant metastasis sites. With this method, the AUROC of the multikernel SVM (0.7542) exceeded that of each single-kernel SVM (polynomial kernel: 0.7346, RBF kernel: 0.72, linear kernel: 0.7255). Finally, we applied our pipeline to an extremely unbalanced, small-sample dataset to predict the prognosis of TCGA glioblastoma (GBM) patients and obtained a higher AUPRC (0.2606) than other semisupervised learning methods (Laplacian SVM: 0.15, TSVM: 0.21, SSL-EM: 0.15, RRLSL: 0.22).
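
The abstract does not say how the kernels are combined or how the dynamic weights are computed. As a rough illustration of the general multikernel idea only, the sketch below builds a Gram matrix from a normalized, non-negative weighted sum of linear, polynomial, and RBF kernels. The function names, the kernel hyperparameters (`degree`, `coef0`, `gamma`), and the fixed weights are all assumptions made for this example and are not taken from the paper.

```python
import math

# Three base kernels; hyperparameter defaults are illustrative assumptions.
def linear_kernel(x, y):
    return sum(a * b for a, b in zip(x, y))

def poly_kernel(x, y, degree=2, coef0=1.0):
    return (linear_kernel(x, y) + coef0) ** degree

def rbf_kernel(x, y, gamma=0.5):
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def combined_gram(X, weights):
    """Gram matrix of a weighted sum of linear, polynomial and RBF kernels.

    Weights must be non-negative so the sum remains a valid (positive
    semidefinite) kernel; they are normalized to sum to 1. Fixed weights
    are used here, not the paper's dynamic weighting scheme.
    """
    if any(w < 0 for w in weights):
        raise ValueError("kernel weights must be non-negative")
    total = sum(weights)
    if total == 0:
        raise ValueError("at least one kernel weight must be positive")
    w_lin, w_poly, w_rbf = (w / total for w in weights)
    return [
        [
            w_lin * linear_kernel(xi, xj)
            + w_poly * poly_kernel(xi, xj)
            + w_rbf * rbf_kernel(xi, xj)
            for xj in X
        ]
        for xi in X
    ]

# Toy feature vectors standing in for extracted features (hypothetical data).
X = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
K = combined_gram(X, weights=(1.0, 1.0, 2.0))
```

A Gram matrix of this kind can be handed to any SVM solver that accepts a precomputed kernel, for example scikit-learn's `SVC(kernel="precomputed")`. A dynamic scheme would additionally adjust `weights` during training.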

Original language: English
Title of host publication: Proceedings - 2021 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2021
Editors: Yufei Huang, Lukasz Kurgan, Feng Luo, Xiaohua Tony Hu, Yidong Chen, Edward Dougherty, Andrzej Kloczkowski, Yaohang Li
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1899-1905
Number of pages: 7
ISBN (Electronic): 9781665401265
State: Published - 2021
Externally published: Yes
Event: 2021 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2021 - Virtual, Online, United States
Duration: 9 Dec 2021 to 12 Dec 2021

Publication series

Name: Proceedings - 2021 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2021

Conference

Conference: 2021 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2021
Country/Territory: United States
City: Virtual, Online
Period: 9/12/21 to 12/12/21

UN SDGs

This output contributes to the following UN Sustainable Development Goals (SDGs)

  1. SDG 3 - Good Health and Well-being

Keywords

  • DR-GCN
  • SVM
  • distant metastasis
  • pan-cancer
  • pan-cancer prognosis
