Hereditas (Beijing), 2018, Vol. 40, Issue 2: 162-169. doi: 10.16288/j.yczz.17-174

• Techniques and Methods •

Comparison of common burden tests for genetic association studies of rare variants

Xinqi Lin1,2, Rong Liang1, Junguo Zhang1, Lucheng Pi1, Sidong Chen1, Li Liu1, Yanhui Gao1

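  1. Department of Epidemiology and Biostatistics, School of Public Health, Guangdong Pharmaceutical University, Guangzhou 510310
  2. Guangdong Province Hospital for Occupational Disease Prevention and Treatment, Guangzhou 510300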
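  • Received: 2017-11-12 Revised: 2017-12-28 Online: 2018-01-18 Published: 2018-01-18
  • About the authors: Xinqi Lin, master's student; research interest: molecular epidemiology. E-mail: linxinki@163.com | Corresponding author: Li Liu, PhD, associate professor; research interest: epidemiology and health statistics. E-mail: pupuliu919@163.com
  • Funding:
    Natural Science Foundation of Guangdong Province (2016A030313809); Public Welfare and Capacity Building Project of the Guangdong Provincial Department of Science and Technology (2014A020212307); National Natural Science Foundation of China, Youth Program (2016A030313809)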

Comparison of common burden tests for genetic association studies of rare variants

Xinqi Lin1,2, Rong Liang1, Junguo Zhang1, Lucheng Pi1, Sidong Chen1, Li Liu1, Yanhui Gao1

  1. Department of Epidemiology and Biostatistics, School of Public Health, Guangdong Pharmaceutical University, Guangzhou 510310, China
  2. Guangdong Province Hospital for Occupational Disease Prevention and Treatment, Guangzhou 510300, China
  • Received: 2017-11-12 Revised: 2017-12-28 Online: 2018-01-18 Published: 2018-01-18
  • Supported by:
    Natural Science Foundation of Guangdong Province, China (2016A030313809); Science and Technology Planning Project of Guangdong Province, China (2014A020212307); National Natural Science Foundation of China (2016A030313809)

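Abstract:

To compare the statistical performance of burden tests commonly used in genetic association studies of rare variants (CMC, WST, SUM and their extensions) under different genetic scenarios, we used computer simulation to generate rare-variant case-control datasets that varied in sample size, linkage disequilibrium (LD) parameter, number of mixed-in non-associated variants, and effects of the associated variants. Each burden test was applied to these datasets, and its type I error and power were calculated. The type I error of every method was close to 0.05. When the rare variants had effects in the same direction, the power of every method except aSUM increased as the LD parameter increased and as the number of mixed-in non-associated variants decreased; when the effect directions differed, the power of all methods dropped markedly. Except under strong LD, methods that account for effect direction had higher power than methods that do not, and power increased with sample size. The statistical performance of burden tests is affected by several factors, including the size and direction of effects, noise variants, and LD. In practice, prior biological information about the genetic variants should be taken into account when choosing a method and when defining the aggregation unit and weights, in order to improve study power.

Key words: rare variant, genetic association studies, burden test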

Abstract:

Common burden tests differ in statistical performance in genetic association studies of rare variants. Here, we compare the statistical performance of burden tests, such as CMC, WST, SUM and their extensions, using computer-simulated rare-variant datasets that vary in sample size, linkage disequilibrium (LD), and number of mixed-in non-associated variants. The simulation results showed that the type I error of all methods was near 0.05. When the rare variants had the same direction of effect, the power of every method except the data-adaptive SUM test (aSUM) increased with stronger LD and with fewer non-associated variants. When the effect directions differed, power was significantly reduced for all methods. Except under strong LD, methods that consider effect direction yielded higher power than methods that do not, and power increased with sample size. The statistical performance of burden tests is affected by a variety of factors, including the sample size, effect direction of variants, non-associated variants, and LD. Therefore, when choosing the method and setting the aggregation unit and weights, prior biological information about the genetic variants should be integrated to improve study power.

Key words: rare variant, genetic association studies, burden test
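
To make the idea of a burden test concrete, the following is a minimal Python sketch and not the authors' implementation. It aggregates the rare variants of a region into one score per subject in two simple ways, a carrier indicator (in the spirit of collapsing methods) and an unweighted allele count (in the spirit of sum-type methods), and compares cases with controls by permutation. The function names, the allele frequencies, the simulation scheme (independent variants, with some made more frequent in cases) and the permutation test are all assumptions chosen for illustration; the published CMC, WST, SUM and aSUM procedures differ in their details.

```python
import random

def simulate_genotypes(n, mafs, rng):
    """Return n genotype vectors; each entry counts minor alleles (0, 1 or 2)."""
    return [[(rng.random() < p) + (rng.random() < p) for p in mafs]
            for _ in range(n)]

def collapse(genotypes):
    """Carrier indicator: 1 if the subject carries any rare allele in the set."""
    return [1 if any(g) else 0 for g in genotypes]

def sum_score(genotypes):
    """Unweighted burden score: total number of rare alleles per subject."""
    return [sum(g) for g in genotypes]

def mean_difference(scores, labels):
    """Mean score in cases (label 1) minus mean score in controls (label 0)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    return sum(cases) / len(cases) - sum(controls) / len(controls)

def permutation_p_value(scores, labels, n_perm, rng):
    """Two-sided permutation p-value for the case-control mean difference."""
    observed = abs(mean_difference(scores, labels))
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(mean_difference(scores, shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Hypothetical scenario: 10 independent rare variants; the first 5 are made
# three times as frequent in cases, the last 5 are non-associated noise.
rng = random.Random(2018)
control_mafs = [0.005] * 10
case_mafs = [0.015] * 5 + [0.005] * 5
genotypes = (simulate_genotypes(300, case_mafs, rng)
             + simulate_genotypes(300, control_mafs, rng))
labels = [1] * 300 + [0] * 300

p_collapse = permutation_p_value(collapse(genotypes), labels, 200, rng)
p_sum = permutation_p_value(sum_score(genotypes), labels, 200, rng)
print("collapsing p-value:", p_collapse)
print("sum-score p-value:", p_sum)
```

Both scores simply add up rare alleles, so risk and protective variants in the same region cancel each other. This is one way to see why power falls when effect directions differ, and why mixing in non-associated variants dilutes the signal.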