
Hereditas (Beijing) ›› 2019, Vol. 41 ›› Issue (7): 644-652. doi: 10.16288/j.yczz.18-319

• Research Article •

Impacts of SNP genotyping call rate and SNP genotyping error rate on imputation accuracy in Holstein cattle

Zhi Li1,2,3, Jun He1,3, Jun Jiang1,4, Richard G. Tait Jr.3, Stewart Bauck3, Wei Guo2, Xiao-Lin Wu1,3,4

    1. College of Animal Science and Technology, Hunan Agricultural University, Changsha 410128, China
    2. Department of Animal Science, University of Wyoming, Laramie WY 82071, USA
    3. Biostatistics and Bioinformatics, Neogen GeneSeek, Lincoln NE 68504, USA
    4. Department of Animal Sciences, University of Wisconsin, Madison WI 53706, USA
  • Received: 2018-11-30 Revised: 2019-04-16 Online: 2019-07-20 Published: 2019-05-28
  • Contact: Guo Wei, Wu Xiao-Lin E-mail: wguo3@uwyo.edu; nwu@neogen.com
  • Supported by:
    Hundred-Talent Project of Hunan Province, Key Research and Development Program of Hunan Province (2018NK2081); Hunan Innovation Center of Animal Safety Production and Key Research and Development Program of Changsha City (kq1801014)

Abstract:

Single nucleotide polymorphism (SNP) chips are widely used in genetic studies and breeding applications in animal and plant species, and the quality of SNP genotypes is of paramount importance. In practice, some genotypes often fail and must be imputed. Ungenotyped loci may also need to be imputed between different chips, or high-density genotypes imputed from low-density genotypes. In these circumstances, the validity and reliability of subsequent data analyses depend on the accuracy of the imputed genotypes. To better understand the factors affecting imputation accuracy, the present study investigated the impacts of SNP genotyping call rate and SNP genotyping error rate on the accuracy of genotype imputation under two scenarios in 20,116 U.S. Holstein cattle, each genotyped with a GGP 50K SNP chip. In scenario 1, the two factors were independent of each other: the simulated genotyping call rate varied from 50% to 100% and the simulated genotyping error rate from 0% to 50%. In scenario 2, the genotyping error rate was correlated with the genotyping call rate, with the relationship established by fitting a linear regression model between the two variables on a real dataset; that is, as the simulated SNP call rate decreased from 100% to 50%, the SNP genotyping error rate increased from 0% to 13.55%. A 5-fold cross-validation was used to assess the resulting imputation accuracy. The results showed that when the original SNP genotyping call rate was independent of the SNP genotyping error rate, imputation accuracy did not change significantly with the original genotyping call rate (P>0.05), but it decreased significantly as the genotyping error rate increased (P<0.01). However, when the original genotyping call rate was negatively correlated with the genotyping error rate, the imputation error increased with elevated original genotyping error rate. In both scenarios, the genotyping call rate needs to be no less than 0.90 to obtain genotype imputation accuracy of 98% or higher. The present results can provide guidance for establishing quality assurance criteria for SNP genotyping in practice.

Key words: SNP chip, genotyping, imputation accuracy, call rate, error rate
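
The simulation design summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code, and several details are assumptions made only for illustration: genotypes are coded 0/1/2, a failed call is coded -1 (the hypothetical constant MISSING), no-calls and errors are drawn independently per genotype, and the scenario-2 link is taken as a straight line through the two endpoints stated in the abstract (call rate 100% with error rate 0%, call rate 50% with error rate 13.55%). The fitted regression coefficients themselves are not given in the abstract. The function names are likewise hypothetical.

```python
import random

MISSING = -1  # hypothetical code for a failed (no-call) genotype


def error_rate_from_call_rate(call_rate):
    """Scenario-2 style link between call rate and error rate.

    Assumption: a straight line through (1.00, 0.0) and (0.50, 0.1355),
    the two endpoints stated in the abstract. The actual fitted
    regression coefficients are not reported there.
    """
    return 0.1355 * (1.0 - call_rate) / 0.5


def simulate_genotypes(genotypes, call_rate, error_rate, rng):
    """Return a copy of `genotypes` (values 0/1/2) with no-calls and errors.

    Each genotype is masked independently with probability 1 - call_rate.
    Each genotype that is still called is replaced by a different value
    with probability error_rate.
    """
    simulated = []
    for g in genotypes:
        if rng.random() >= call_rate:
            simulated.append(MISSING)  # simulated genotyping failure
        elif rng.random() < error_rate:
            wrong = [x for x in (0, 1, 2) if x != g]
            simulated.append(rng.choice(wrong))  # simulated genotyping error
        else:
            simulated.append(g)  # genotype kept as observed
    return simulated


if __name__ == "__main__":
    rng = random.Random(2019)
    truth = [rng.choice((0, 1, 2)) for _ in range(20)]

    # Scenario 1: call rate and error rate are chosen independently.
    print(simulate_genotypes(truth, call_rate=0.90, error_rate=0.05, rng=rng))

    # Scenario 2: the error rate is derived from the call rate.
    cr = 0.90
    print(simulate_genotypes(truth, cr, error_rate_from_call_rate(cr), rng))
```

Under the straight-line assumption, a call rate of 0.90 corresponds to an error rate of about 2.7%. The masked and perturbed genotypes would then be passed to an imputation program and compared with the original genotypes within each cross-validation fold.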