Analysis of Population Stratification Using Random SNPs in Genome-wide Association Studies

Hereditas (Beijing), 2010, Vol. 32, Issue 9: 921-928. doi: 10.3724/SP.J.1005.2010.00921

CAO Zong-Fu^{1, 2}, MA Chuan-Xiang^{1, 2}, WANG Lei^{1, 2}, CAI Bin^{1, 2}

1. National Engineering Research Center for Beijing Biochip Technology, Beijing 102206, China; 2. CapitalBio Corporation, Beijing 102206, China

Received: 2009-11-12 Revised: 2010-03-10 Published: 2010-09-20 Online: 2010-09-20

Funding: Supported by the National High Technology Research and Development Program of China (863 Program) (No. 2009AA022708)

Abstract: Population stratification can inflate the false-positive rate in genome-wide association studies (GWAS) of complex diseases, so population genetic structure should be taken into account and stratification controlled. However, how well randomly selected SNPs perform in population stratification analysis remains to be determined. In this study, using genotype data generated on the Affymetrix Genome-Wide Human SNP Array 6.0 from unrelated individuals of HapMap Phase 2, we randomly selected different numbers of SNPs evenly distributed across the whole genome, and also screened Ancestry Informative Markers (AIMs) using the f value and the allelic Fisher exact test. We then used data from unrelated individuals of HapMap Phase 3 to evaluate, by F-statistics and STRUCTURE analysis, how well each selected SNP set distinguished the populations. We found that randomly selected SNPs evenly distributed across the whole genome can be used to identify genetic structure within populations. The study further suggests that, in GWAS, when no AIMs are available for a specific population, more than 3000 randomly selected SNPs evenly distributed across the whole genome can be used to control for population stratification.

Key words: genome-wide association study, population stratification, ancestry informative markers, random SNP, Affymetrix SNP 6.0 array
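
The abstract does not spell out how the SNPs were sampled or which F-statistics estimator was used, so the following is only a minimal sketch under stated assumptions. It assumes "evenly distributed" can be approximated by drawing one random SNP from each equal-width window of a chromosome, and it uses the basic Wright definition of F_ST for two equally weighted populations. The function names, the `(snp_id, chrom, position)` tuple layout, and the `n_bins_per_chrom` parameter are hypothetical and do not come from the paper.

```python
import random
from collections import defaultdict


def select_even_random_snps(snps, n_bins_per_chrom, seed=0):
    """Draw one random SNP from each equal-width window of every chromosome.

    snps: iterable of (snp_id, chrom, position) tuples (hypothetical layout).
    Windows that contain no SNP are skipped, so fewer than
    n_bins_per_chrom SNPs may be returned for a sparse chromosome.
    """
    rng = random.Random(seed)
    by_chrom = defaultdict(list)
    for snp_id, chrom, pos in snps:
        by_chrom[chrom].append((pos, snp_id))

    chosen = []
    for chrom in sorted(by_chrom):
        sites = by_chrom[chrom]
        lo = min(pos for pos, _ in sites)
        hi = max(pos for pos, _ in sites)
        # Fall back to width 1 when all SNPs share a single position.
        width = (hi - lo) / n_bins_per_chrom or 1
        bins = defaultdict(list)
        for pos, snp_id in sites:
            # Clamp so the right-most SNP lands in the last window.
            idx = min(int((pos - lo) / width), n_bins_per_chrom - 1)
            bins[idx].append(snp_id)
        for idx in sorted(bins):
            chosen.append(rng.choice(bins[idx]))
    return chosen


def fst_two_pops(p1, p2):
    """Wright's F_ST for one biallelic SNP, given the allele frequency
    in each of two equally weighted populations (an assumed estimator)."""
    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)  # expected heterozygosity, pooled
    if h_total == 0:
        return 0.0  # monomorphic SNP carries no differentiation signal
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    return (h_total - h_within) / h_total


if __name__ == "__main__":
    # Toy map: 100 SNPs spaced 1 kb apart on one chromosome.
    toy = [(f"rs{i}", "1", i * 1000) for i in range(100)]
    print(select_even_random_snps(toy, n_bins_per_chrom=10))
    print(fst_two_pops(0.2, 0.8))
```

Under these assumptions, a panel of the size suggested in the abstract (more than 3000 SNPs) would be obtained by choosing `n_bins_per_chrom` so that the windows across all chromosomes sum to the target count. Averaging `fst_two_pops` over the selected SNPs gives a rough measure of how well a panel separates two populations.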

CAO Zong-Fu, MA Chuan-Xiang, WANG Lei, CAI Bin. Analysis of Population Stratification Using Random SNPs in Genome-wide Association Studies[J]. Hereditas (Beijing), 2010, 32(9): 921-928.

