Hereditas (Beijing) [遗传], 2017, Vol. 39, Issue 8: 707-716. doi: 10.16288/j.yczz.16-419

• Review •

Current status of pathway analysis methods for genome-wide association studies

王钰嫣, 王子兴, 胡耀达, 王蕾, 李宁, 张彪, 韩伟, 姜晶梅

  1. Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences, Beijing 100005, China
  • Received: 2017-03-25 Revised: 2017-04-25 Online: 2017-08-20 Published: 2017-12-25
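  • About the authors: Yuyan Wang (王钰嫣), PhD candidate, specializing in biostatistics. E-mail: wangyyamy@163.com | Jingmei Jiang (姜晶梅), PhD, professor; research interest: applications of statistical methods in medical research. E-mail: jingmeijiang@ibms.pumc.edu.cn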
  • Funding:
    Graduate Innovation Foundation of Peking Union Medical College (10023-1001-1005)

Current status of pathway analysis in genome-wide association study

Yuyan Wang, Zixing Wang, Yaoda Hu, Lei Wang, Ning Li, Biao Zhang, Wei Han, Jingmei Jiang

  1. Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences, Beijing 100005, China
  • Received: 2017-03-25 Revised: 2017-04-25 Online: 2017-08-20 Published: 2017-12-25
  • Supported by:
    the Graduate Innovation Foundation of Peking Union Medical College, China (10023-1001-1005)

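Chinese abstract:

Since the first genome-wide association study (GWAS) was published in 2005, GWAS has steadily advanced our understanding of the genetic mechanisms of disease. Combining systems biology with improved statistical methods is an important route to mining GWAS data in depth. Pathway analysis groups the genetic variants assayed in a GWAS into sets according to their biological meaning and analyzes each set as a whole. This helps to detect variants whose individual effects on disease are small but which are interrelated within a pathway, and it makes biological interpretation easier. Pathway analysis has already been applied fairly widely to GWAS data and has produced preliminary results, while its statistical methods continue to develop. This review introduces existing GWAS pathway analysis algorithms that operate directly on SNPs and divides them into non-kernel and kernel algorithms according to whether a kernel function is used. The non-kernel algorithms mainly comprise gene set enrichment analysis (GSEA) and hierarchical Bayes prioritization (HBP); the kernel algorithms comprise the linear kernel (LIN), the identity-by-state kernel (IBS), and the powered exponential kernel. By describing the computational principles, strengths, and weaknesses of these methods, we aim to offer ideas for constructing new algorithms and a reference for choosing analysis methods in GWAS research.

Keywords: genome-wide association study, pathway analysis, kernel methods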

Abstract:

Since the first genome-wide association study (GWAS) was published in 2005, the GWAS strategy has contributed significantly to our understanding of the genetic mechanisms of human disease. Integrating improved statistical methods with systems biology is an important way to mine GWAS data more deeply. Pathway analysis groups the genetic variants assayed in a GWAS into biologically meaningful sets and analyzes each set as a whole. This helps to identify variants that have small individual effects on disease but are interrelated within a pathway, and it makes the findings easier to interpret biologically. Pathway analysis has been widely applied to GWAS data, with encouraging early results. Meanwhile, analytical methods continue to be developed and adapted to more complex types of data. In this review, we summarize the statistical methods for pathway analysis of GWAS data and divide them into non-kernel and kernel methods. The non-kernel methods include gene set enrichment analysis (GSEA) and hierarchical Bayes prioritization (HBP), while the kernel methods include the linear kernel (LIN), the identity-by-state kernel (IBS), and the powered exponential kernel. We describe the computational principles and the strengths and weaknesses of these methods, both to inform the development of new algorithms and to guide the choice of methods in GWAS research.

Key words: genome-wide association study, pathway analysis, kernel methods