
MeSiC: A Model-Based Method for Estimating 5 mC Levels at Single-CpG Resolution from MeDIP-seq

  • Yun Xiao
  • Fulong Yu
  • Lin Pang
  • Hongying Zhao
  • Ling Liu
  • Guanxiong Zhang
  • Tingting Liu
  • Hongyi Zhang
  • Huihui Fan
  • Yan Zhang
  • Bo Pang
  • Xia Li*

*Corresponding author for this work

Research output: Contribution to journal, Article, peer-review

Abstract

As the fifth base in the mammalian genome, 5-methylcytosine (5 mC) is essential for many biological processes, including normal development and disease. Methylated DNA immunoprecipitation sequencing (MeDIP-seq), which uses anti-5 mC antibodies to enrich for the methylated fraction of the genome, is widely used to investigate the methylome at a resolution of 100-500 bp. Considering the CpG density-dependent bias and limited resolution of MeDIP-seq, we developed MeSiC, a method based on Random Forest Regression (RFR) models, to estimate DNA methylation levels at single-base resolution. MeSiC integrates the MeDIP-seq signals of CpG sites and their surrounding neighbors, together with genomic features, to construct genomic element-dependent RFR models. In the H1 cell line, a high correlation was observed between MeSiC predictions and actual 5 mC levels. MeSiC was also able to calibrate the CpG density-dependent bias of MeDIP-seq signals. Importantly, we found that MeSiC models constructed in the H1 cell line could be used to accurately predict DNA methylation levels in other cell types. Comparisons with methylCRF and MEDIPS showed that MeSiC achieved comparable or better performance. These results demonstrate that MeSiC can provide accurate estimates of 5 mC levels at single-CpG resolution using MeDIP-seq data alone.
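
The abstract describes combining the MeDIP-seq signal at a CpG site with the signals of its neighboring CpGs and with genomic features. The sketch below shows one way such a feature row might be assembled. It is not the authors' implementation: the function name `build_features`, the neighbor count `k`, the `window` size, the zero padding at chromosome ends, and the use of local CpG count as the only genomic feature are all assumptions made for illustration, and the input values are made up.

```python
from bisect import bisect_left, bisect_right


def build_features(cpg_positions, signals, index, k=2, window=200):
    """Return a feature row for the CpG at `index` (hypothetical layout).

    cpg_positions: sorted coordinates of CpG sites on one chromosome
    signals:       MeDIP-seq signal per CpG, same order as cpg_positions
    k:             neighboring CpGs taken on each side (assumed value)
    window:        half-width in bp for the local CpG count (assumed value)
    """
    n = len(cpg_positions)
    pos = cpg_positions[index]

    # Signal at the CpG itself.
    features = [signals[index]]

    # Signals of the k upstream and k downstream CpGs.
    # Missing neighbors at the ends are padded with 0.0 (an assumption).
    for offset in range(-k, k + 1):
        if offset == 0:
            continue
        j = index + offset
        features.append(signals[j] if 0 <= j < n else 0.0)

    # Local CpG count within +/- window bp, including the site itself,
    # standing in for a CpG-density feature.
    lo = bisect_left(cpg_positions, pos - window)
    hi = bisect_right(cpg_positions, pos + window)
    features.append(hi - lo)

    return features


# Toy data, invented for the example.
positions = [100, 150, 400, 420, 1000]
medip = [3.0, 2.5, 0.5, 0.8, 4.0]
rows = [build_features(positions, medip, i) for i in range(len(positions))]
```

Rows of this kind, paired with reference methylation levels for the same CpGs, could then be passed to a random forest regressor (for example scikit-learn's `RandomForestRegressor`), with a separate model fitted per genomic element type to mirror the element-dependent models mentioned above.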

Original language: English
Article number: 14699
Journal: Scientific Reports
Volume: 5
DOIs
State: Published - 1 Oct 2015
Externally published: Yes
