遗传 (Hereditas (Beijing)), 2008, Vol. 30, Issue 5: 543-549. doi: 10.3724/SP.J.1005.2008.00543

• Review •

Progress in genome-wide association studies of complex diseases: genetic statistical analysis

YAN Wei-Li (严卫丽)

  1. School of Public Health, Xinjiang Medical University, Urumqi 830054

  • Received: 2007-09-20 Revised: 2008-01-28 Published: 2008-05-10 Online: 2008-05-10
  • Corresponding author: YAN Wei-Li

Genome-wide association study on complex diseases: genetic statistical issues

YAN Wei-Li

  

  1. School of Public Health, Xinjiang Medical University, Urumqi 830054, China
  • Received:2007-09-20 Revised:2008-01-28 Online:2008-05-10 Published:2008-05-10
  • Contact: YAN Wei-Li

Abstract (translated from the Chinese):

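In 2005, Science published the first genome-wide association study, on human age-related macular degeneration. Genome-wide association studies of a series of complex diseases, including obesity, type 2 diabetes, coronary heart disease, and Alzheimer's disease, have been reported since; this period is known as the first wave of human genome-wide association studies. This article introduces the methods, software, and application examples for the statistical analysis of genome-wide association studies; compares methods for adjusting P values for multiple testing in association analysis, including the Bonferroni correction, the step-down Bonferroni correction, permutation (simulation) methods, and methods that control the false discovery rate; and discusses how and why population stratification can affect the results of association analysis, together with research progress and application examples of methods for controlling population stratification in genome-wide association studies. In the first wave of genome-wide association studies, classical genetic statistical methods identified many gene-phenotype associations and were able to explain them, including associations involving many previously unknown genes and chromosomal regions. However, the further development of genome-wide association studies requires elucidating the roles in complex disease of gene-gene interactions within the genome and of interactions between complex gene-gene networks and environmental factors. Existing statistical methods certainly cannot meet this need, and the development of more advanced statistical methods is imperative. Finally, the article gives website information for statistical analysis software for genome-wide association studies.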

Keywords: multiple testing correction, population stratification, replication, statistical power, genome-wide association study

Abstract:

Since Science published the first genome-wide association study, on human age-related macular degeneration, in 2005, a series of genome-wide association studies have been published on human complex diseases and traits, including obesity, type 2 diabetes, coronary artery disease, and Alzheimer’s disease. This recent, dramatic transition in human genetics has been called “the first wave of genome-wide association studies”. This paper reviews several issues in the statistical analysis of genome-wide association studies: first, statistical analysis guidelines, methods, and examples for different study designs, including case-control studies of unrelated individuals, population-based studies, and family-based association studies; second, correction of P values for multiple testing, including the Bonferroni correction, the step-down Bonferroni correction, permutation-based correction, and correction based on the false discovery rate; third, population stratification and its effect on the inference of genotype-phenotype associations. The False Positive Report Probability has been successfully applied in a recent genome-wide association study of coronary artery disease to control for population stratification. Although genetic statistical methodology for controlling false positive associations caused by multiple testing or population stratification has developed greatly, it is still not sufficient on its own. Replicating genotype-phenotype associations is the only way to identify true associations between genetic markers and common disease traits. The first wave of genome-wide association studies is producing an impressive list of unexpected associations between genes or chromosomal regions and a broad range of diseases. Traditional statistical techniques are adequate for the analysis and interpretation of these results. However, much more sophisticated methods of statistical analysis are likely to be required as we delve further into the genome in search of networks of interacting gene variants, or of interactions between gene-gene networks and environmental factors. Finally, some useful links to statistical software for genome-wide association studies are provided.
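
Three of the P value adjustments named in the abstract can be illustrated with a minimal sketch. The function names below are hypothetical and are not taken from the article or from any software it lists; "step-down Bonferroni" is assumed here to mean Holm's procedure, and the false discovery rate method is assumed to be the Benjamini-Hochberg procedure. Permutation-based correction is omitted because it requires the genotype and phenotype data rather than P values alone.

```python
from typing import List


def bonferroni(pvalues: List[float]) -> List[float]:
    """Single-step Bonferroni: multiply each P value by the number of tests."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]


def holm_step_down(pvalues: List[float]) -> List[float]:
    """Holm's step-down Bonferroni (assumed reading of 'step-down Bonferroni').

    The k-th smallest P value (k counted from 0) is multiplied by (m - k);
    a running maximum keeps the adjusted values monotone.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        running_max = max(running_max, min(1.0, (m - rank) * pvalues[idx]))
        adjusted[idx] = running_max
    return adjusted


def benjamini_hochberg(pvalues: List[float]) -> List[float]:
    """Benjamini-Hochberg adjustment (assumed FDR-controlling method).

    Walking from the largest P value down, the k-th smallest (k counted
    from 1) is scaled by m / k; a running minimum keeps the values monotone.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m - 1, -1, -1):
        idx = order[rank]
        running_min = min(running_min, pvalues[idx] * m / (rank + 1))
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    # Made-up P values for five markers, for illustration only.
    example = [0.001, 0.01, 0.02, 0.04, 0.5]
    print("Bonferroni:", bonferroni(example))
    print("Holm:      ", holm_step_down(example))
    print("BH (FDR):  ", benjamini_hochberg(example))
```

For any input, the Holm-adjusted values never exceed the Bonferroni-adjusted ones, and the Benjamini-Hochberg values never exceed the Holm ones, which reflects the decreasing stringency of the three criteria. In a genome-wide setting the number of tests would be the number of markers analysed rather than the five values used here.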