Hereditas (Beijing) ›› 2006, Vol. 28 ›› Issue (10): 1299-1306.

• Research Article •

A Novel Method of the Genome-Wide Prediction for the Target Genes and Its Application

ZHANG Jing-Jing1; FENG Jing2; ZHU Ying-Guo1; LI Yang-Sheng1

  1. Key Laboratory of Ministry of Education for Plant Developmental Biology, College of Life Sciences, Wuhan University, Wuhan 430072, China; 2. Advanced Research Center for Science and Technology, Wuhan University, Wuhan 430072, China
  • Online: 2006-10-01 Published: 2006-10-01

Abstract: Using hidden Markov models (HMMs) and Perl programming, and drawing on the protein databases of several model organisms, this study developed a new method for the genome-wide prediction of target genes. The method offers high throughput, high accuracy, and ease of use, and is particularly advantageous for predicting multi-domain protein families. We applied it to predict the PPR and TPR protein families across the whole genomes of several model organisms. The predictions yielded 536 PPR and 199 TPR proteins in Oryza sativa ssp. japonica (cv. Nipponbare); 519 PPR and 177 TPR proteins in Oryza sativa L. ssp. indica (cv. 9311); 735 PPR and 292 TPR proteins in Arabidopsis thaliana; and 6 PPR and 32 TPR proteins in the red alga Cyanidioschyzon merolae. No PPR proteins were found in the cyanobacterium Synechococcus or in the thermophilic archaebacterium, whereas 10 TPR proteins were found in Synechococcus and 4 in the thermophilic archaebacterium. Further bioinformatics analyses of these results were also carried out.

Keywords: HMM, whole genome, PPR, Perl, gene prediction
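
The counting step implied by the abstract (tallying proteins that carry repeated PPR or TPR motifs) can be illustrated with a minimal sketch. It is written in Python rather than the authors' Perl, and it is not their pipeline: the input format (tuples of protein ID, domain name, E-value), the function name `count_family_members`, the E-value cutoff, and the minimum repeat count are all assumptions made for illustration, and the toy hits are invented.

```python
from collections import defaultdict


def count_family_members(hits, domain, max_evalue=1e-3, min_repeats=2):
    """Return sorted IDs of proteins with enough significant hits to `domain`.

    `hits` is an iterable of (protein_id, domain_name, evalue) tuples, a
    hypothetical simplification of what an HMM search tool might report.
    `max_evalue` and `min_repeats` are illustrative thresholds, not values
    taken from the paper.
    """
    repeats = defaultdict(int)
    for protein_id, hit_domain, evalue in hits:
        # Count only hits to the requested domain that pass the cutoff.
        if hit_domain == domain and evalue <= max_evalue:
            repeats[protein_id] += 1
    return sorted(p for p, n in repeats.items() if n >= min_repeats)


# Invented toy data: protein ID, matched domain model, E-value of the hit.
hits = [
    ("protA", "PPR", 1e-10),
    ("protA", "PPR", 2e-8),
    ("protA", "TPR", 5e-4),
    ("protB", "PPR", 0.5),
    ("protB", "TPR", 1e-6),
    ("protB", "TPR", 3e-5),
    ("protC", "PPR", 1e-7),
]

ppr = count_family_members(hits, "PPR")
tpr = count_family_members(hits, "TPR")
print(len(ppr), "PPR proteins;", len(tpr), "TPR proteins")
```

Requiring several motif hits per protein is one plausible way to handle multi-domain (tandem repeat) families such as PPR and TPR; how the original method actually scores or filters hits is not stated in the abstract.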
