Hereditas (Beijing) ›› 2020, Vol. 42 ›› Issue (8): 799-809. doi: 10.16288/j.yczz.20-080

• Resources and Platforms •

CNGBdb: China National GeneBank DataBase

Fengzhen Chen1, Lijin You1, Fan Yang1, Lina Wang1, Xueqin Guo1, Fei Gao1, Cong Hua1, Cong Tan1, Lin Fang2, Riqiang Shan3, Wenjun Zeng1, Bo Wang1, Ren Wang1, Xun Xu1,2,4, Xiaofeng Wei1

  1. China National GeneBank, Shenzhen 518120, China
  2. BGI-Shenzhen, Shenzhen 518083, China
  3. MGI-Shenzhen, Shenzhen 518083, China
  4. Guangdong Provincial Key Laboratory of Genome Read and Write, Shenzhen 518120, China
  • Received: 2020-03-23 Revised: 2020-05-23 Published: 2020-08-20 Online: 2020-06-01
  • Contact: Ren Wang, Xun Xu, Xiaofeng Wei E-mail: wangren@cngb.org; xuxun@genomics.cn; weixiaofeng@cngb.org
  • About the author: Fengzhen Chen, bachelor's degree; research interest: biological big data. E-mail: chenfengzhen@cngb.org
  • Supported by:
    Guangdong Provincial Key Laboratory of Genome Read and Write (No. 2017B030301011)

Abstract:

China National GeneBank DataBase (CNGBdb) is a data platform dedicated to the archiving and open sharing of multi-omics data in life science. It is the public service portal of the Bioinformatics Data Center within the "Three Banks and Two Platforms" core structure of China National GeneBank (CNGB), and draws on CNGB's rich sample resources, data resources and cooperation projects, as well as its powerful data computation and analysis capabilities. Life science research has entered a big data era built on high-throughput multi-omics data, which urgently calls for closer international cooperation and data sharing. With the development of China's economy and growing investment in life science research projects, a platform for archiving and sharing life science big data is needed to promote the systematic management, open sharing and rational use of the genomic data generated by China's life science research projects. Currently, CNGBdb mainly provides data archiving, knowledge search, data management, data computation and data services for life science research. The data types it archives and shares mainly include projects, samples, experiments, runs, assemblies, variations and sequences. As of May 22, 2020, CNGBdb had accepted 2176 research projects submitted by life science researchers worldwide and archived more than 2221 TB of genomic data. In the future, CNGBdb will continue to promote the open sharing and industrial application of multi-omics data in life science research, improve its genomic data archiving and sharing functions, and strengthen its capacity to serve open data sharing in life science. CNGBdb is available at https://db.cngb.org/.

Key words: China National GeneBank DataBase, data archiving, data sharing, multi-omics data