遗传 (Hereditas, Beijing), 2012, Vol. 34, Issue 8: 1009-1019. doi: 10.3724/SP.J.1005.2012.01009

• Review •

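  • Corresponding author: ZHANG Ze, E-mail: zezhang@swu.edu.cn
  • Supported by:

    Graduate Science and Technology Innovation Fund of Southwest University (Outstanding Doctoral Dissertation Project) (No. kb2010106)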

Computational approaches for identification and classification of transposable elements in eukaryotic genomes

XU Hong-En1, ZHANG Hua-Hao1, HAN Min-Jin1, SHEN Yi-Hong1, HUANG Xian-Zhi1, XIANG Zhong-Huai1, ZHANG Ze1,2

  1. The Institute of Sericulture and Systems Biology, Southwest University, Chongqing 400715, China
  2. The Institute of Agricultural and Life Sciences, Chongqing University, Chongqing 400044, China
  • Received: 2012-02-14 Revised: 2012-04-01 Online: 2012-08-20 Published: 2012-08-25

Abstract: Repetitive sequences (repeats) make up a significant fraction of eukaryotic genomes and can be divided into tandem repeats, segmental duplications, and interspersed repeats on the basis of their sequence characteristics and how they occur in the genome. Most interspersed repeats are derived from transposable elements (TEs). Eukaryotic TEs are subdivided into two major classes, DNA transposons and retrotransposons, according to the intermediate they use to transpose. The transposition and amplification of TEs have a great impact on the evolution of genes and the stability of genomes. However, TEs are more complex and diverse in structure and classification than other types of repeats, which makes them difficult to identify and classify. Here, we briefly introduce the function and classification of TEs and summarize the three steps for identifying, classifying, and annotating TEs in eukaryotic genomes: (1) assembly of a repeat library, (2) repeat correction and classification, and (3) genome annotation. We describe the existing computational approaches for each step and compare their advantages and disadvantages. Accurate genome-wide identification, classification, and annotation of TEs requires a combination of methods. This review provides useful information for biologists who are not familiar with these approaches to find their way through the forest of programs.

Key words: eukaryotic genome, repeats, transposable elements, identification, classification