Hereditas (Beijing) ›› 2013, Vol. 35 ›› Issue (6): 685-694. doi: 10.3724/SP.J.1005.2013.00685

• Review •

Genome-scale sequence data processing and epigenetic analysis of DNA methylation

WANG Ting-Zhang, SHAN Gao, XU Jian-Hong, XUE Qing-Zhong

  1. College of Agriculture and Biotechnology, Zhejiang University, Hangzhou 310058, China
  • Received:2012-10-24 Revised:2012-12-27 Online:2013-06-20 Published:2013-06-25
  • Corresponding author: XUE Qing-Zhong E-mail:xueqingzhong@hotmail.com
  • Supported by:

    the National Basic Research Program of China (973 Program) (No. 2010CB126205) and the National Natural Science Foundation of China (No. 31171165)

Abstract: BS-Seq is a recently developed method for detecting methylated cytosines (mC) and building genome-scale DNA methylation maps; it combines bisulfite conversion of genomic DNA with next-generation sequencing. The method not only reveals differences in genome-scale DNA methylation levels and patterns among organisms, but also shows how methylation sequence context and nucleotide preference are conserved across genomic regions, including genes, exons, and repetitive sequences. This helps to clarify the epigenetic role of cytosine methylation in regulating gene expression and in silencing repetitive sequences such as transposable elements. In this paper, we illustrate with examples the preprocessing steps for DNA methylation data: in the reference sequence, cytosine (C) is converted to thymine (T) and, separately, guanine (G) is converted to adenine (A), while in the reads cytosine (C) is converted to thymine (T). We also review the main components of genome-wide DNA methylation analysis: (1) cytosine methylation in different sequence contexts; (2) the distribution of methylcytosine across the genome; (3) DNA methylation context and nucleotide preference; (4) DNA methylation at DNA-protein interaction sites; (5) the degree of cytosine methylation in different structural elements of genes. DNA methylation analysis provides a powerful tool for studying the epigenomes of humans and other species and gene-environment interactions, and lays a theoretical foundation for the further development of diagnostics and therapies for human disease.

Key words: next-generation sequencing (NGS), DNA methylation, BS-Seq, data processing, epigenetics
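
The in silico conversion described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the toy sequences, and the use of exact string matching in place of a real short-read aligner are all assumptions made for illustration, and only forward-strand reads are handled (reverse-strand reads would be matched against the G-to-A reference).

```python
from typing import List, Tuple


def convert_reference(ref: str) -> Tuple[str, str]:
    """Return the C-to-T and the G-to-A converted copies of a reference."""
    ref = ref.upper()
    return ref.replace("C", "T"), ref.replace("G", "A")


def convert_read(read: str) -> str:
    """Replace every C in a read with T so that it can match the C-to-T reference."""
    return read.upper().replace("C", "T")


def call_methylation(reference: str, read: str, offset: int) -> List[Tuple[int, bool]]:
    """For a forward-strand read placed at `offset`, report each reference C
    it covers as (position, methylated). A C that survived bisulfite
    treatment is read as C (methylated); a converted one is read as T."""
    calls = []
    for i, base in enumerate(read.upper()):
        pos = offset + i
        if reference[pos] == "C" and base in "CT":
            calls.append((pos, base == "C"))
    return calls


# Toy data (hypothetical): the read covers reference positions 3-10.
# The C at position 5 was methylated; the Cs at positions 8 and 9 were not.
reference = "ACGTTCGACCTG"
read = "TTCGATTT"

ref_ct, ref_ga = convert_reference(reference)
# Exact matching stands in for a real aligner (an assumption of this sketch).
offset = ref_ct.find(convert_read(read))
# Methylation is called from the ORIGINAL read and reference, not the converted ones.
calls = call_methylation(reference, read, offset)
print(offset, calls)
```

Converting both the reference and the reads removes the C/T mismatches that bisulfite treatment introduces, so the read can be placed first; the methylation state is then recovered by comparing the unconverted read with the unconverted reference at the mapped position.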