Hereditas (Beijing) (遗传) ›› 2012, Vol. 34 ›› Issue (11): 1491-1500. doi: 10.3724/SP.J.1005.2012.01491

• Techniques and Methods •

Methods for sequence analysis and identification of plant LTR retrotransposons

HOU Xiao-Gai1, ZHANG Xi1, GUO Da-Long2

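  1. College of Agriculture, Henan University of Science and Technology, Luoyang 471003
  2. College of Forestry, Henan University of Science and Technology, Luoyang 471003
  • Received: 2012-07-24 Revised: 2012-09-02 Online: 2012-11-20 Published: 2012-11-25
  • Corresponding author: HOU Xiao-Gai E-mail: hkdhxg@126.com
  • Funding:

    Supported by the National Natural Science Foundation of China (No. 31070620), the Program for Science and Technology Innovation Talents in Universities of Henan Province (No. 2010HASTTT002), and the Young Backbone Teachers Program of Higher Education Institutions of Henan Province (No. 2010GGJS-072)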

Identification and analysis methods of plant LTR retrotransposon sequences

HOU Xiao-Gai1, ZHANG Xi1, GUO Da-Long2   

  1. College of Agriculture, Henan University of Science and Technology, Luoyang 471003, China
  2. College of Forestry, Henan University of Science and Technology, Luoyang 471003, China
  • Received: 2012-07-24 Revised: 2012-09-02 Online: 2012-11-20 Published: 2012-11-25
  • Contact: HOU Xiao-Gai E-mail: hkdhxg@126.com

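Abstract (translated from the Chinese): LTR retrotransposons (long terminal repeat retrotransposons) are an important class of transposable elements in eukaryotes. They are widely distributed and highly heterogeneous, play an important role in eukaryotic genome evolution, and are now widely used in plant gene function analysis and genetic diversity studies. Identifying LTR retrotransposon sequences is a prerequisite for these applications, so research on methods for identifying and analyzing these sequences has both theoretical significance and practical value. Bioinformatics software for analyzing LTR retrotransposon sequences falls roughly into two categories by working principle: sequence alignment, and identification of conserved regions in related sequences. Alignment software such as BLAST and DNAstar performs sequence similarity searches: it judges whether an unknown sequence is a retrotransposon from its similarity to known retrotransposon sequences. Such software cannot directly report characteristic features such as the LTRs themselves, and it cannot identify the full length of a retrotransposon sequence. Identification software can be divided by principle into four types: de novo, comparative genomic, homology-based, and structure-based methods. For example, LTR_Finder and other tools based on the de novo method can predict and annotate full-length LTR retrotransposons fairly accurately, whereas RepeatMasker and other homology-based tools detect candidate LTR retrotransposons by comparing a query against sequences in a database. This article compares and analyzes the different prediction methods for LTR retrotransposons and, on that basis, summarizes a workflow for analyzing LTR retrotransposon sequences, intended as a reference for such analyses.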

Keywords (translated from the Chinese): plant LTR retrotransposons, sequence alignment, software analysis, genetic diversity

Abstract: LTR retrotransposons are an important class of eukaryotic transposable elements. They are ubiquitous and highly heterogeneous in plants and play a major role in eukaryotic genome evolution. They are now extensively employed in gene function and genetic diversity analyses. Identification of LTR retrotransposons is the precondition for their application, so the study of methods for identifying and analyzing LTR retrotransposon sequences has both theoretical significance and practical value. Bioinformatic software for sequence analysis can be classified by working principle into roughly two types: sequence alignment, and identification of conserved sequence domains. Alignment software, such as BLAST and DNAstar, judges whether an unknown sequence is a retrotransposon by its similarity to known retrotransposon sequences; however, this kind of software cannot directly report features such as the LTRs and cannot identify full-length retrotransposon sequences. By principle, LTR retrotransposon identification software can be sorted into roughly four types: the de novo repeat discovery method, the comparative genomic method, the homology-based method, and the structure-based method. For example, LTR_Finder, based on the de novo repeat discovery method, can predict and annotate full-length LTR retrotransposons fairly accurately; RepeatMasker, based on the homology-based method, discovers candidate LTR retrotransposons by comparing sequences with known sequences in a database. In this article, different methods for identifying and analyzing LTR retrotransposon sequences are compared, and a workflow for LTR retrotransposon sequence analysis is summarized to provide a reference for such analyses.

Key words: plant LTR retrotransposons, sequence alignment, software analysis, genetic diversity
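
The abstract contrasts similarity searches against known elements with methods that recognize an element from its own sequence features. The second idea can be illustrated with a toy search for two identical direct repeats that flank an internal region, which is the basic layout of an LTR retrotransposon. This is only a minimal sketch under strong assumptions (exact repeats, no target-site duplications, no protein-domain checks); the function name `find_direct_repeats` and its parameters are hypothetical, and it does not reproduce how LTR_Finder, RepeatMasker, or any other tool named above works.

```python
def find_direct_repeats(seq, min_ltr=8, min_gap=10, max_gap=200):
    """Return sorted (left_start, right_start, length) tuples for exact
    direct repeats separated by an internal region of min_gap..max_gap bases.

    Toy illustration only: real LTRs are hundreds of bases long and
    usually differ by a few mutations.
    """
    seq = seq.upper()

    # Index the start positions of every k-mer, with k = min_ltr.
    positions = {}
    for i in range(len(seq) - min_ltr + 1):
        positions.setdefault(seq[i:i + min_ltr], []).append(i)

    hits = []
    for starts in positions.values():
        for idx, a in enumerate(starts):
            for b in starts[idx + 1:]:
                # Skip pairs that could be extended to the left: the pair
                # (a - 1, b - 1) shares a k-mer too and is handled on its own.
                if a > 0 and seq[a - 1] == seq[b - 1]:
                    continue
                # Extend the match to the right without letting the
                # two copies overlap.
                length = min_ltr
                while (b + length < len(seq)
                       and a + length < b
                       and seq[a + length] == seq[b + length]):
                    length += 1
                gap = b - (a + length)  # size of the internal region
                if min_gap <= gap <= max_gap:
                    hits.append((a, b, length))
    return sorted(hits)


# Made-up example: a 12-base repeat flanking a 22-base internal region.
ltr = "TGTTAGGACCAT"
internal = "ACGTTGCAAGCTTACGGATCCA"
example = "TTTT" + ltr + internal + ltr + "CCCC"
print(find_direct_repeats(example))
```

Each reported tuple gives the start of the left copy, the start of the right copy, and the repeat length, so the full candidate element spans from `left_start` to `right_start + length`. A similarity search, by contrast, would need a known element to compare against and would not return these boundaries by itself.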