
HEREDITAS ›› 2011, Vol. 33 ›› Issue (6): 654-660. doi: 10.3724/SP.J.1005.2011.00654


Cyanobacterial genome transposable element mining and analysis based on 454 deep-sequencing data set

XIAO Peng 1,2, LI Ren-Hui 1

  1. Key Laboratory of Aquatic Biodiversity and Conservation Biology, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan 430072, China
  2. Graduate University of Chinese Academy of Sciences, Beijing 100049, China
  • Received: 2010-09-03 Revised: 2010-12-20 Online: 2011-06-20 Published: 2011-06-25
  • Contact: LI Ren-Hui E-mail: reli@ihb.ac.cn

Abstract: Next-generation sequencing (NGS) and comparative genome analysis have recently attracted considerable attention. Analyzing the composition and abundance of transposable elements is an important part of genome studies. Such analyses have generally been based on completely assembled genomes; however, post-processing and assembly of the huge number of short reads produced by the 454 sequencer often run into problems. In particular, the many repeat regions made up of transposable elements are frequently misassembled or lost, which leads to uncertain results. This study aimed to construct a framework that automatically analyzes the abundance and composition of insertion sequences (IS) based on a simulated Roche 454 deep-sequencing data set representing 33-fold coverage of the Microcystis aeruginosa NIES 843 genome. The most reliable results were obtained when the IS element candidates were divided into three classes and separate thresholds were used for the transposase examination. Under this setting, IS elements accounted for 10.38% of the simulated data set and comprised 14 IS families and 66 IS subfamilies. These values did not differ significantly from two previous sets of results based on the assembled M. aeruginosa NIES 843 genome, and a high percentage of the IS element sequences overlapped, indicating that the framework is reliable.

Key words: Cyanobacterial genome, insertion sequence, IS family, transposable element, Roche 454 raw sequencing data