Hereditas (Beijing) ›› 2012, Vol. 34 ›› Issue (6): 765-772. doi: 10.3724/SP.J.1005.2012.00765

• Research Article •

Analysis of the pan-genomic characteristics of 30 Escherichia coli strains

付静 (FU Jing)1,2, 秦启伟 (QIN Qi-Wei)1

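  1. Key Laboratory of Marine Bio-resources Sustainable Utilization, South China Sea Institute of Oceanology, Chinese Academy of Sciences, Guangzhou 510301
  2. Graduate University of the Chinese Academy of Sciences, Beijing 100049
  • Received: 2011-10-11 Revised: 2012-01-13 Online: 2012-06-20 Published: 2012-06-25
  • Corresponding author: QIN Qi-Wei E-mail: qinqw@scsio.ac.cn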
  • Funding:

    Supported by the National Science Fund for Distinguished Young Scholars (No. 30725027)

Pan-genomics analysis of 30 Escherichia coli genomes

FU Jing1, 2, QIN Qi-Wei1   

  1. Key Laboratory of Marine Bio-resources Sustainable Utilization, South China Sea Institute of Oceanology, Chinese Academy of Sciences, Guangzhou 510301, China
  2. Graduate University of the Chinese Academy of Sciences, Beijing 100049, China
  • Received: 2011-10-11 Revised: 2012-01-13 Online: 2012-06-20 Published: 2012-06-25

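Abstract (translated from the Chinese): A pan-genome is the complete set of genes of a species. It comprises the core genome (genes present in every individual of the species) and the dispensable genome (genes present in only some individuals, together with genes unique to a single individual). From a pan-genomic perspective, this study compared the genes, gene composition, and evolutionary characteristics of 30 fully sequenced Escherichia coli strains. Core genes accounted for only about 50% of the total genes in each strain, whereas each strain carried on average 146 strain-specific genes, indicating that new genes will continue to be discovered as more E. coli genomes are sequenced. A comparison of gene conservation across strains with GC content and selection pressure showed that the more conserved a gene is, the narrower its range of GC content and the stronger the selection pressure it has experienced during evolution. These results will help to clarify the evolutionary characteristics of the E. coli genome and the dynamics of its gene composition, provide a theoretical basis for preventing and controlling epidemic diseases caused by pathogenic E. coli, and offer a reference for methods of analyzing large-scale pathogen genome data.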

Keywords (translated from the Chinese): pan-genome, Escherichia coli, GC content, selection pressure

Abstract: A pan-genome describes the full complement of genes in a species. It is the superset of all genes found in all individuals of the species and is composed of a ‘core genome’, containing genes present in every individual, and a ‘dispensable genome’, containing genes present in only some individuals together with individual-specific genes. From a pan-genomic perspective, this study used 30 finished Escherichia coli genomes to analyze their genes, gene composition, and evolutionary characteristics. Core genes accounted for only about 50% of the total number of genes in each strain, while each strain carried on average 146 strain-specific genes. These data suggest that the E. coli pan-genome is vast and that new genes will continue to be identified as more E. coli genomes are sequenced. By analyzing the relationships among gene conservation, GC content, and selection pressure across the strains, we found that more conserved genes have a narrower range of GC content and are under stronger selection pressure. These results will help to clarify the evolutionary characteristics of the E. coli genome and the dynamic changes in its gene composition. The E. coli pan-genome provides a theoretical basis for the prevention and control of diseases caused by pathogenic E. coli and offers a reference for the large-scale analysis of pathogenic bacterial genomes.
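
To make the definitions in the abstract concrete, the following minimal Python sketch partitions gene families into core, dispensable, and strain-specific sets. It is an illustration only, not the authors' pipeline, which the abstract does not describe. The function name `partition_pan_genome`, the strain names, and the gene-family identifiers are hypothetical, and the sketch assumes that genes have already been clustered into families shared across strains.

```python
from collections import Counter


def partition_pan_genome(strain_genes):
    """Split gene families into pan, core, dispensable, and strain-specific sets.

    strain_genes: dict mapping a strain name to the set of gene-family IDs
    found in that strain (hypothetical input format).
    """
    n_strains = len(strain_genes)
    # Count in how many strains each gene family occurs.
    counts = Counter(g for genes in strain_genes.values() for g in genes)
    pan = set(counts)                                        # every family seen
    core = {g for g, c in counts.items() if c == n_strains}  # in all strains
    unique = {g for g, c in counts.items() if c == 1}        # in one strain only
    dispensable = pan - core  # not in all strains; includes strain-specific genes
    return pan, core, dispensable, unique


# Toy data, invented for illustration; these are not the 30 E. coli genomes.
toy = {
    "strain_A": {"g1", "g2", "g3", "g4"},
    "strain_B": {"g1", "g2", "g5"},
    "strain_C": {"g1", "g2", "g3", "g6"},
}
pan, core, dispensable, unique = partition_pan_genome(toy)

# Share of each strain's genes that belong to the core genome.
core_fraction = {s: len(core) / len(genes) for s, genes in toy.items()}
```

Dividing the core count by each strain's gene count gives the per-strain core fraction, which is the kind of quantity the abstract reports as about 50%.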

Key words: Pan-genome, E. coli, GC-content, selection pressure
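
The GC-content comparison mentioned in the abstract can be sketched in the same spirit. The snippet below computes the GC fraction of each sequence and the spread (maximum minus minimum) within a conservation class. The helper names `gc_content` and `gc_range`, the class labels, and the short sequences are assumptions made for illustration; they do not reproduce the paper's data or method.

```python
def gc_content(seq):
    """Fraction of G and C among the unambiguous bases (A, C, G, T) of a sequence."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return gc / total if total else 0.0


def gc_range(seqs):
    """Spread of GC content (max minus min) across a group of sequences."""
    values = [gc_content(s) for s in seqs]
    return max(values) - min(values)


# Toy sequences grouped by a hypothetical conservation class.
toy_groups = {
    "core": ["ATGGCGCTAA", "ATGGCCCTAA"],
    "strain_specific": ["ATGAAATTTA", "ATGGCGGCGC"],
}
ranges = {name: gc_range(seqs) for name, seqs in toy_groups.items()}
```

On real data, a smaller value of `gc_range` for the core class than for the strain-specific class would correspond to the pattern described in the abstract.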