遗传 (Hereditas (Beijing))

• Review •

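Sequence features of forensic core short tandem repeat loci

Lei Miao1,2, Kelai Kang1, Chi Zhang1, Shuang Liu1, Ruilian Jiao1, Li Yuan2, Le Wang1

  1. Key Laboratory of Forensic Genetics of the Ministry of Public Security, Institute of Forensic Science, Ministry of Public Security, Beijing 100038, China

  2. Key Laboratory of Evidence Science (China University of Political Science and Law), Ministry of Education; Collaborative Innovation Center of Judicial Civilization, Beijing 100088, China


  • Received: 2025-03-15 Revised: 2025-05-22 Published: 2025-05-23 Online: 2025-05-23
  • Funding:
    National Key Research and Development Program of China; Ministry of Public Security Basic Work Special Project for Strengthening Police through Science and Technology; Fundamental Research Funds for Central Public Welfare Research Institutes; Fundamental Research Funds for Central Public Welfare Research Institutes; Fundamental Research Funds for Central Public Welfare Research Institutes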

Abstract: Short tandem repeat (STR) markers dominate forensic DNA identification, and DNA databases worldwide, including China's, are built on them. STR markers show both length polymorphism and sequence polymorphism, and sequence polymorphism covers variation in both the repeat region and the flanking regions. Conventional capillary electrophoresis-based STR typing distinguishes only length polymorphism, yet a thorough understanding of the sequence polymorphism of core STR loci is crucial for primer design and DNA identification. First, single nucleotide polymorphisms (SNPs) and insertions/deletions (InDels) in STR primer binding regions may weaken the affinity between primers and the DNA template, causing allele dropout or poor interlocus balance and thereby compromising the accuracy of DNA identification. Second, next-generation sequencing is moving STR typing from length-based to sequence-based genotyping, which significantly increases the detectable polymorphic information of core STR loci and improves their power for individual identification and kinship analysis. Third, different populations exhibit distinct STR sequence characteristics. Over the past decade, next-generation sequencing studies of STR sequence polymorphism have increased, and sequence data from multiple populations have been reported. However, the populations studied and the resulting data are scattered, and repeat region sequences are reported in inconsistent formats, so the sequence polymorphism of core STR loci still lacks a systematic, large-scale summary, which hinders its further application in forensic practice. A comprehensive understanding of the sequence characteristics of core STR loci is important for individual identification from trace samples, deconvolution of mixed samples, and determination of the origin of mutations in paternity testing.
In this review, we focus on 19 autosomal core STRs and integrate population data reported in the literature with variant frequency data for Chinese populations from public databases to systematically review the sequence polymorphisms of these loci. We summarize the types of variation in repeat regions and analyze their patterns, present high-frequency flanking-region variants in the Chinese population, and discuss potential difficulties in STR sequence analysis, with the aim of providing a reference for the interpretation and application of STR sequences, the identification of rare alleles in casework, and the development of STR genotyping kits.

Key words: forensic genetics, autosome, short tandem repeat (STR), sequence polymorphism
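
The distinction between length polymorphism and sequence polymorphism drawn in the abstract can be illustrated with a minimal Python sketch. It assumes a tetranucleotide motif and uses two invented repeat-region sequences (not observed alleles from any locus); the helper names `length_allele` and `bracket_notation` are hypothetical and do not come from the review or from any genotyping software.

```python
from itertools import groupby


def bracket_notation(seq: str, motif_len: int = 4) -> str:
    """Collapse a repeat-region sequence into bracketed form, e.g. [TCTA]3."""
    # Split into fixed-size units; a trailing partial unit is kept as-is.
    units = [seq[i:i + motif_len] for i in range(0, len(seq), motif_len)]
    # Merge runs of identical consecutive units into "[unit]count".
    return " ".join(f"[{unit}]{len(list(run))}" for unit, run in groupby(units))


def length_allele(seq: str, motif_len: int = 4) -> str:
    """Length-based allele call: complete repeats, plus leftover bases if any."""
    full, rest = divmod(len(seq), motif_len)
    return str(full) if rest == 0 else f"{full}.{rest}"


# Hypothetical repeat regions, invented for illustration only.
allele_a = "TCTA" * 10
allele_b = "TCTA" * 4 + "TCTG" + "TCTA" * 5

for name, seq in (("A", allele_a), ("B", allele_b)):
    print(name, length_allele(seq), bracket_notation(seq))
```

Both hypothetical alleles receive the same length-based call, so a length-only method would not separate them, while the bracketed sequence strings differ. Real sequence-based typing also has to handle locus-specific motif structure and flanking-region variants, which this sketch does not attempt.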