Hereditas (Beijing) ›› 2019, Vol. 41 ›› Issue (3): 234-242. doi: 10.16288/j.yczz.18-279

• Review •

Common cancer genetic analysis methods and application study based on TCGA database

Xin Li, Mengwei Li, Yinan Zhang, Hanmei Xu

  1. Engineering Research Center of Peptide Drug Discovery and Development, China Pharmaceutical University, Nanjing 211198, China
  • Received: 2018-11-20 Revised: 2019-01-27 Online: 2019-02-25 Published: 2019-02-22
  • Corresponding author: Hanmei Xu E-mail: 13913925346@126.com
  • About the author: Xin Li, master's student, specializing in marine pharmacy. E-mail: cpu_lixin@163.com
  • Supported by:
    the National Science and Technology Major Projects of New Drugs (2018ZX09301053-001, 2018ZX09301039-002, 2018ZX09201001-004-001); the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD)

Abstract:

The rapid development of next-generation sequencing (NGS) technology has produced a steadily growing volume of data, and the attention of cancer researchers has gradually shifted from multi-species sequencing to the analysis and comparison of high-throughput sequencing data. Methods for analyzing genetic data continue to multiply, high-throughput omics approaches are being continually optimized and renewed, and genetic data mining and analysis are developing rapidly. The Cancer Genome Atlas (TCGA), a database built around samples from cancer patients, emerged in this context. It provides a comprehensive record of genetic data obtained from clinical tumor samples, including DNA sequences, transcript information, and epigenetic modifications. This review covers three aspects, namely data analysis methods, the TCGA database, and representative applications of the database, to describe recent progress in the in-depth mining of tumor-related genetic data and in bioinformatic analysis methods. It is intended as a reference for researchers who use big data to discover new targets for cancer prevention and treatment.

Key words: gene data analysis, TCGA database, cancer