Hereditas (Beijing), 2020, 42(2): 212-221 doi: 10.16288/j.yczz.20-030

Resources and Platforms

The 2019 novel coronavirus resource

Wenming Zhao1,2,3, Shuhui Song1,2, Meili Chen1,2, Dong Zou1,2, Lina Ma1,2, Yingke Ma1,2, Rujiao Li1,2, Lili Hao1,2, Cuiping Li1,2, Dongmei Tian1,2, Bixia Tang1,2, Yanqing Wang1,2, Junwei Zhu1,2, Huanxin Chen1,2, Zhang Zhang1,2,3, Yongbiao Xue1,3, Yiming Bao1,2,3

1. China National Center for Bioinformation & National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China;

2. CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China

3. University of Chinese Academy of Sciences, Beijing 100049, China

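Corresponding authors:
Yongbiao Xue, PhD, Professor; research interest: molecular genetics. E-mail: ybxue@big.ac.cn
Yiming Bao, PhD, Professor; research interest: bioinformatics. E-mail: baoym@big.ac.cn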

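First authors:

Wenming Zhao, MSc, Professorate Senior Engineer; research interest: bioinformatics. E-mail: zhaowm@big.ac.cn
Shuhui Song, PhD, Associate Professor; research interest: bioinformatics. E-mail: songshh@big.ac.cn
Meili Chen, PhD, Assistant Professor; research interest: bioinformatics. E-mail: chenml@big.ac.cn
Dong Zou, BSc, Engineer; research interest: bioinformatics. E-mail: zoud@big.ac.cn
Lina Ma, PhD, Associate Professor; research interest: bioinformatics. E-mail: malina@big.ac.cn
Wenming Zhao, Shuhui Song, Meili Chen, Dong Zou and Lina Ma contributed equally to this work as co-first authors.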

Editor: Jianping Xie

Received: 2020-01-31   Revised: 2020-02-07   Online: 2020-02-12

Fund supported: the National Key Research & Development Program of China.  2016YFE0206600
the National Key Research & Development Program of China.  2017YFC1201202
13th Five-year Informatization Plan of CAS.  XXH13505-05
Strategic Priority Research Program of the Chinese Academy of Sciences (CAS).  XDA19050302
Capacity building project of genome science data center of Chinese Academy of Sciences.  0202
Key Technology Talent Program of the CAS, The Youth Innovation Promotion Association of Chinese Academy of Sciences.  

Abstract

An outbreak of a novel coronavirus infection that began in Wuhan, China in December 2019 had led to 31,516 infections and 638 deaths across 25 countries as of 16:00 on February 7, 2020. The virus causing this pneumonia was named the 2019 novel coronavirus (2019-nCoV) by the World Health Organization. To promote data sharing and make all relevant information on 2019-nCoV publicly available, we constructed the 2019 Novel Coronavirus Resource (2019nCoVR, https://bigd.big.ac.cn/ncov). 2019nCoVR features comprehensive integration of genomic and proteomic sequences and their metadata from the Global Initiative on Sharing All Influenza Data, National Center for Biotechnology Information, China National GeneBank, National Microbiology Data Center and China National Center for Bioinformation (CNCB)/National Genomics Data Center (NGDC). It also incorporates a wide range of relevant information, including scientific literature, news, and popular-science articles, and provides visualization of genome variation analysis results based on all collected 2019-nCoV strains. Moreover, by linking seamlessly with related databases in CNCB/NGDC, 2019nCoVR offers virus data submission and sharing services for raw sequence reads and assembled sequences. In this report, we describe data deposition, management, release and use in 2019nCoVR, laying an important foundation for studies on virus classification and origin, genome variation and evolution, rapid detection, drug development, and precise prevention and treatment of pneumonia.

Keywords: 2019nCoVR ; 2019 novel coronavirus ; China National Center for Bioinformation (CNCB) ; National Genomics Data Center (NGDC) ; genomic data sharing

Cite this article

Wenming Zhao, Shuhui Song, Meili Chen, Dong Zou, Lina Ma, Yingke Ma, Rujiao Li, Lili Hao, Cuiping Li, Dongmei Tian, Bixia Tang, Yanqing Wang, Junwei Zhu, Huanxin Chen, Zhang Zhang, Yongbiao Xue, Yiming Bao. The 2019 novel coronavirus resource. Hereditas (Beijing)[J], 2020, 42(2): 212-221 doi:10.16288/j.yczz.20-030

Since December 2019, hospitals in Wuhan, Hubei Province, China, have successively reported cases of pneumonia of unknown cause. The disease was later confirmed to be an acute respiratory infection caused by a previously unidentified coronavirus, which the World Health Organization (WHO) named the 2019 novel coronavirus (2019-nCoV)*[1] (*Note: on February 11, 2020, the Coronavirus Study Group (CSG) of the International Committee on Taxonomy of Viruses designated 2019-nCoV as "SARS-CoV-2" (severe acute respiratory syndrome coronavirus 2), and WHO named the disease caused by the virus "COVID-19" (coronavirus disease 2019)). Like Middle East respiratory syndrome-related coronavirus (MERSr-CoV) and severe acute respiratory syndrome-related coronavirus (SARSr-CoV), the virus belongs to the genus Betacoronavirus[2].

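Using rapidly advancing genomic methods and technologies, researchers worldwide have obtained multiple 2019-nCoV genome sequences and carried out a number of related studies[2,3,4,5,6,7]. Collecting and integrating the available 2019-nCoV data into a unified and complete resource, with dynamic release and sharing of the data, is therefore important for epidemic control and for developing treatment regimens for viral pneumonia[8,9]. Between January 5, 2020, when Professor Yongzhen Zhang of Fudan University submitted the first novel coronavirus genome sequence (Acc. No. MN908947) to the GenBank database of the U.S. National Center for Biotechnology Information (NCBI)[10], and February 5, 2020, a total of 86 2019-nCoV sequences were released in databases around the world, mainly the Germany-based Global Initiative on Sharing All Influenza Data (GISAID)[11], NCBI, China National GeneBank (CNGB) in Shenzhen[12], the National Microbiology Data Center (NMDC)[13], and the China National Center for Bioinformation (CNCB)/National Genomics Data Center (NGDC)[14]. However, the 2019-nCoV sequences are scattered across these databases and do not form a complete dataset with a single point of access, which makes it inconvenient for researchers to search, preview, and retrieve the data.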

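To ease the problems caused by multiple data sources, help researchers obtain data conveniently, and provide an efficient system for submitting, releasing, and sharing genome sequences, CNCB/NGDC integrated 2019-nCoV data from around the world to build the 2019 Novel Coronavirus Resource (2019nCoVR, https://bigd.big.ac.cn/ncov), which went online publicly on January 22, 2020. 2019nCoVR dynamically releases genome sequences, metadata, and related news, scientific literature, and popular-science articles; provides variation analysis results and visualization for coronavirus genomes; and supports search and download of nucleotide and protein sequences of known members of the family Coronaviridae. In addition, 2019nCoVR connects seamlessly with two NGDC databases, the Genome Sequence Archive (GSA)[15,16] for raw omics data and the Genome Warehouse (GWH)[14,17,18] for genome assemblies, to provide online submission, management, and sharing of raw sequencing data and assembled sequences of newly sequenced viruses, as well as synchronized release to international databases. This paper describes the data resources of 2019nCoVR, its data submission and review mechanisms, and its rules for data release, management, and use, providing an important data foundation for accelerating research on virus classification and origin tracing, variation and evolution, rapid detection, drug development, and precise prevention and treatment of the novel pneumonia.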

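1 2019nCoVR data resources

1.1 Genome sequence release updates

Following a strict quality-control and curation workflow, 2019nCoVR collects and integrates 2019-nCoV sequences and metadata (virus strain name, accession number, data source, host, sample collection date, sampling location, originating lab, submitting lab, etc.) from multiple data platforms (CNCB/NGDC, CNGB, GISAID, NCBI, and NMDC) and continuously updates the release status of sequences, providing complete and accurate first-hand data for related research. From the outbreak of 2019-nCoV in December 2019 to February 5, 2020, it had collected 86 genome sequences of 81 virus strains from 16 countries/regions (Table S1), of which 67 strains have complete genome sequences (66 isolated from humans and 1 from a bat).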

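With respect to data acquisition and access permissions, 2019nCoVR complies with the data management rules of each data-sharing platform while providing the greatest possible degree of data integration and access. Publicly accessible data, including the relevant genome sequences in NCBI, CNCB/NGDC, NMDC, and CNGB, have been integrated into 2019nCoVR and can be accessed and downloaded by anyone without restriction. Restricted-access data, mainly sequences in the GISAID database, can be accessed and downloaded only after users register and log in to the GISAID system (Fig. 1).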

Fig. 1   Statistics of 2019-nCoV genome meta information

A: data-sharing platforms; B: sampling countries/regions; C: originating labs; D: submitting labs.


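In terms of origin, most of the collected virus strains come from Wuhan, Hubei Province, some come from Guangdong and Zhejiang Provinces, and a small number come from other countries such as the United States, Thailand, and Japan. The samples were collected by 28 medical, public-health, or research institutions in China and abroad, including the University of Hong Kong-Shenzhen Hospital, Guangdong Provincial Center for Disease Control and Prevention (CDC), Guangdong Provincial Institute of Public Health, Wuhan Jinyintan Hospital, and the Institute of Pathogen Biology, Chinese Academy of Medical Sciences. Genome sequencing and data submission were carried out by 30 institutions, mainly the University of Hong Kong-Shenzhen Hospital, China CDC, Guangdong CDC, Hubei CDC, and BGI (Beijing Genomics Institute) (Fig. 1).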

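1.2 Genome sequence integration and information retrieval

Built on the GWH platform of CNCB/NGDC, 2019nCoVR collects and integrates openly accessible coronavirus sequences from public data platforms in China and abroad to form a coronavirus sequence dataset. As of February 5, 2020, it had curated 7566 nucleotide sequences and 29039 protein sequences of the family Coronaviridae, together with their metadata (Fig. 2). On the basis of standardized information integration and release, 2019nCoVR provides multifaceted search, conditional queries, and batch download; users can also access and download the data from the public FTP site (ftp://download.big.ac.cn/Genome/Viruses/Coronaviridae/).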

Fig. 2   Coronaviridae genome sequence information


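1.3 Genome sequence variation analysis and visualization

2019nCoVR uses three reference genomes: two coronaviruses that infect humans, namely SARS (NC_004718) and the first released 2019-nCoV genome sequence (MN908947), and one SARS-like coronavirus isolated from a bat (bat-SL-CoVZC45, MG772933). The complete genome sequences available under "release updates" were compared one by one against these references by whole-genome comparison and multiple sequence alignment with Muscle[19,20]. The comparison showed that the genome sequence of 2019-nCoV is 80%, 88%, and 96% similar to NC_004718, bat-SL-CoVZC45, and a coronavirus detected in a bat (bat/Yunnan/RaTG13/2013), respectively, whereas sequence similarity among different 2019-nCoV strains is about 99.9%. Starting from the complete genome sequences of 65 virus strains isolated from humans, 3 sequences with an abnormal number of variants or with clustered variant sites (5 mutations within a 20-bp region) were removed, and a phylogenetic tree of the remaining 62 sequences was constructed with the distance-based UPGMA method. The tree shows that the strains are genetically very close but have begun to diverge (Fig. 3).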
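As a rough illustration of the distance-based clustering step described above, the following Python sketch computes pairwise p-distances (the fraction of differing sites) from sequences that are already aligned and joins them with UPGMA. It is not the pipeline used for 2019nCoVR: the paper states only that Muscle and UPGMA were used, so the distance measure, the function names `p_distance` and `upgma`, and the toy sequences are assumptions made for this example. Percent identity between two aligned sequences is 1 minus the p-distance.

```python
from itertools import combinations
from typing import Dict


def p_distance(a: str, b: str) -> float:
    """Fraction of differing sites between two aligned sequences (gap columns ignored)."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper()) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def upgma(aligned: Dict[str, str]) -> str:
    """Cluster aligned sequences with UPGMA; return the topology as a Newick string."""
    sizes = {name: 1 for name in aligned}  # cluster label -> number of leaves
    dist = {frozenset((a, b)): p_distance(aligned[a], aligned[b])
            for a, b in combinations(aligned, 2)}
    while len(sizes) > 1:
        # Closest pair of clusters; ties are broken by label for determinism.
        pair = min(dist, key=lambda p: (dist[p], sorted(p)))
        a, b = sorted(pair)
        merged = f"({a},{b})"
        # Distance to the new cluster is the size-weighted mean of the old distances.
        for c in sizes:
            if c in (a, b):
                continue
            d = (dist[frozenset((a, c))] * sizes[a]
                 + dist[frozenset((b, c))] * sizes[b]) / (sizes[a] + sizes[b])
            dist[frozenset((merged, c))] = d
        for key in [k for k in dist if a in k or b in k]:
            del dist[key]
        sizes[merged] = sizes.pop(a) + sizes.pop(b)
    return next(iter(sizes)) + ";"


# Hypothetical toy alignment, not real viral sequences.
toy = {
    "A": "AAAAAAAAAA",
    "B": "AAAAAAAAAT",
    "C": "TTTTTAAAAA",
    "D": "TTTTTAAACC",
}
print(upgma(toy))  # prints ((A,B),(C,D));
```

With real data, the `aligned` dictionary would be filled from the alignment output; branch lengths are omitted here to keep the sketch short.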

Fig. 3   Phylogenetic tree of 2019-nCoV


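The position, type, and details of each variant found in the genome alignments were extracted and loaded into a configured GBrowse genome browser[21] to visualize the variants of each virus isolate relative to the different reference sequences (Fig. 4A). In addition, the total numbers of each variant type, including insertions, deletions, indels, and single-nucleotide polymorphisms (SNPs), were counted, and per-strain variant statistics are available for search and download. Summarizing the variants of all strains shows that SNPs are the main variant type. Compared with the 2019-nCoV reference sequence, 14 strains have no variants, 49 strains carry between 1 and 9 SNPs each (Fig. 4B), and 1 strain has 27 SNPs, which suggests that the genome sequence of that strain (Acc. No. EPI_ISL_406592) may have quality problems. The few deletions detected occur mainly in the 5ʹUTR and 3ʹUTR regions of the genome and may be related to sequencing accuracy or genome assembly. These results preliminarily suggest that the 65 released strains may derive from a single, recently emerged viral source.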
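The variant extraction described above can be sketched in a few lines of Python. The example below walks a pairwise alignment (reference versus one isolate, with `-` for gaps) and reports SNPs, insertions, and deletions in reference coordinates. It is only a minimal sketch under stated assumptions: the function names `call_variants` and `count_by_type`, the tuple layout, and the toy alignment are hypothetical, and the resource's own variant-calling procedure, including how it separates "indel" from insertion and deletion, is not described in the paper.

```python
from collections import Counter


def call_variants(ref_aln: str, query_aln: str):
    """List variants from a pairwise alignment as (ref_pos, type, ref_allele, alt_allele).

    ref_pos is 1-based in ungapped reference coordinates; for an insertion it is
    the reference base immediately before the inserted bases.
    """
    if len(ref_aln) != len(query_aln):
        raise ValueError("aligned sequences must have the same length")
    ref_aln, query_aln = ref_aln.upper(), query_aln.upper()
    variants, ref_pos, i, n = [], 0, 0, len(ref_aln)
    while i < n:
        r, q = ref_aln[i], query_aln[i]
        if r == "-" and q == "-":          # gap-only column, nothing to report
            i += 1
        elif r == "-":                     # bases present only in the query
            j = i
            while j < n and ref_aln[j] == "-" and query_aln[j] != "-":
                j += 1
            variants.append((ref_pos, "insertion", "-", query_aln[i:j]))
            i = j
        elif q == "-":                     # bases missing from the query
            j = i
            while j < n and query_aln[j] == "-" and ref_aln[j] != "-":
                j += 1
            variants.append((ref_pos + 1, "deletion", ref_aln[i:j], "-"))
            ref_pos += j - i
            i = j
        else:                              # aligned base pair
            ref_pos += 1
            if r != q:
                variants.append((ref_pos, "SNP", r, q))
            i += 1
    return variants


def count_by_type(variants):
    """Number of variants of each type."""
    return dict(Counter(v[1] for v in variants))


# Hypothetical toy alignment.
found = call_variants("ACGT-ACGTAC", "ACCTTAC--AC")
print(found)
print(count_by_type(found))
```

Reporting positions in ungapped reference coordinates keeps the output comparable across isolates aligned to the same reference.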

Fig. 4   Snapshot of genome sequence variants on GBrowse as well as SNP statistics and annotations

A: snapshot of the online display of whole-genome sequence variants; B: number of SNPs per virus strain; C: number of SNPs in each annotated gene and UTR region; D: SNP effect statistics.


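The population frequency of each variant site was calculated and the variants were annotated with VEP[22]. The website provides query, browsing, and download of the annotation of all variant sites (base change, codon and amino-acid change, and annotation type). The statistics show that sequence variation within the 2019-nCoV population occurs mainly in five genes: the S gene, which encodes the viral surface glycoprotein; the N gene, which encodes the nucleocapsid phosphoprotein; orf8; orf3a; and orf1ab, the largest gene. The orf1ab gene alone has 39 variant sites (Fig. 4C). About 42% of the variants are non-synonymous (Fig. 4D). Non-synonymous mutations shared by multiple strains occur mainly at position 32 (F→I, c.94Ttc>Atc) and position 49 (H→Y, c.145Cat>Tat) of the S protein and at position 84 (L→S, c.251tTa>tCa) of the ORF8 protein, while the ORF1ab protein carries the largest number of non-synonymous sites (Table S2).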
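The codon notation used above can be checked with a short Python sketch. It parses a change written as `c.<position><ref codon>><alt codon>`, translates both codons with the standard genetic code, and derives the amino-acid position from the coding position. This is not VEP and does not reproduce its output format; the function name `annotate` and the returned tuple are assumptions made for the example.

```python
import re
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (NCBI translation table 1).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

_CHANGE = re.compile(r"^c\.(\d+)([ACGTacgt]{3})>([ACGTacgt]{3})$")


def annotate(change: str):
    """Return (amino-acid position, ref aa, alt aa, effect) for a codon change."""
    match = _CHANGE.match(change)
    if match is None:
        raise ValueError(f"unrecognized codon change: {change!r}")
    cds_pos = int(match.group(1))
    ref_aa = CODON_TABLE[match.group(2).upper()]
    alt_aa = CODON_TABLE[match.group(3).upper()]
    aa_pos = (cds_pos - 1) // 3 + 1  # 1-based codon index within the coding sequence
    effect = "synonymous" if ref_aa == alt_aa else "non-synonymous"
    return aa_pos, ref_aa, alt_aa, effect


for change in ("c.94Ttc>Atc", "c.145Cat>Tat", "c.251tTa>tCa"):
    print(change, annotate(change))
```

Running it on the three changes quoted in the text gives F→I at position 32, H→Y at position 49, and L→S at position 84, in agreement with the reported annotations.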

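1.4 Integration of related information

2019nCoVR integrates related information from public databases and public media, mainly: (1) from NCBI, all Coronaviridae sequences, complete coronavirus genome sequences, complete genome sequences of human-infecting coronaviruses, and 2019-nCoV sequences; (2) coronavirus-related scientific literature in PubMed and the latest reports on 2019-nCoV in Europe PMC; and (3) news reports, interpretations of the virus, and related popular-science material on 2019-nCoV from authoritative bodies such as China CDC and WHO. Together, these provide researchers and the general public worldwide with a one-stop data resource and information portal for conducting research, following scientific progress, and keeping up with news and scientific knowledge.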

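2 Data submission and review mechanisms

Through the GSA system of CNCB/NGDC, 2019nCoVR provides a submission service for raw sequencing data of the novel coronavirus; a submission consists mainly of metadata and sequence files. After submission, the GSA system performs quality control and review of the submitted metadata and sequence files: it verifies file size and content, computes sequence statistics, and assesses data quality, in order to ensure the integrity and reliability of the submitted data. Once the data pass review, the system assigns a unique accession number and notifies the submitter by e-mail. The accession number serves as the identifier for data retrieval and access and can also be cited in publications.

Similarly, through the GWH database of CNCB/NGDC, 2019nCoVR accepts submissions of novel coronavirus genome and protein sequences, consisting mainly of metadata, sequence information, and annotation files.

To strictly control the quality of viral genome data entering the database, GWH has established strict quality-control standards for user submissions and checks the validity and consistency of the data, including sequence validity, completeness of gene structure and information, internal consistency of gene structures, consistency between sequence content and annotation, and the presence of vector, adapter, index, and contaminant sequences. Once the data pass review, GWH assigns a formal accession number to facilitate retrieval, access, and download. As of February 5, 2020, GWH had accepted the complete genome sequences of 11 coronavirus strains submitted by the Institute of Pathogen Biology, Chinese Academy of Medical Sciences, and the Wuhan Institute of Virology, Chinese Academy of Sciences. To further extend the international reach and use of 2019-nCoV genome sequences, CNCB/NGDC has established a data synchronization and sharing mechanism with international bioinformatics databases, and the first batch of 5 complete 2019-nCoV genome sequences has been released in NCBI (Acc. No. MT019529~MT019533).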
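One of the checks listed above, sequence validity, might look like the following Python sketch. The paper does not specify GWH's actual rules, so everything here is an assumption for illustration: the function name `check_nucleotide_sequence`, the use of the IUPAC nucleotide alphabet as the definition of a legal character, and the 5% limit on ambiguous `N` bases.

```python
# IUPAC nucleotide codes (assumed here to define the legal alphabet).
IUPAC_NUCLEOTIDES = set("ACGTURYSWKMBDHVN")


def check_nucleotide_sequence(seq: str, max_n_fraction: float = 0.05) -> list:
    """Return a list of problems found in a nucleotide sequence (empty if none).

    max_n_fraction is a hypothetical threshold, not a documented GWH setting.
    """
    s = seq.upper()
    if not s:
        return ["empty sequence"]
    problems = []
    illegal = sorted(set(s) - IUPAC_NUCLEOTIDES)
    if illegal:
        problems.append("illegal characters: " + "".join(illegal))
    n_fraction = s.count("N") / len(s)
    if n_fraction > max_n_fraction:
        problems.append(f"too many N bases: {n_fraction:.1%}")
    return problems


print(check_nucleotide_sequence("ACGTACGT"))
print(check_nucleotide_sequence("ACGTXZ"))
```

Returning a list of problems rather than a single pass/fail flag lets a submission report every issue in one round of review.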

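3 Rules for data release, management, and use

2019nCoVR follows the data management policies of CNCB/NGDC and the conventions of the field: data are owned by the submitter, and the submitter decides when and how the data are made public. Novel coronavirus data submitted to 2019nCoVR (metadata and the associated sequence data) fall into three access types: public, controlled, and confidential. When submitting data, the submitter chooses one access type according to the data's confidentiality level, embargo period, release conditions, intended audience, and review procedure. Once the submission is complete and has passed review, the system manages the data according to the chosen access type (Table 1). The rules for the three access types are as follows:

(1) Public: both the metadata and the associated sequence data are openly shared; any user can query, access, and download them.

(2) Controlled: the metadata are openly shared, but access to the associated sequence data is controlled. A data requester must apply to the submitter for use of the sequence data, and the submitter grants access to the requester. The submitter can change the access type at any time as circumstances require.

(3) Confidential: neither the metadata nor the associated data are displayed on the platform, and users cannot query, access, or download them. When confidential data meet the conditions for open sharing, for example when the related paper has been published or the agreed release date has been reached, the system notifies the submitter to release the data. The submitter can also manage confidential data dynamically.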
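The three access types can be modeled as a small data structure. The Python sketch below encodes only the rules stated above (who may see metadata and who may download sequences); it is an illustration, not the 2019nCoVR implementation. The class names, the `approved_users` field, the accession strings, and the user names are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Access(Enum):
    PUBLIC = "public"
    CONTROLLED = "controlled"
    CONFIDENTIAL = "confidential"


@dataclass
class Record:
    accession: str                            # hypothetical identifier
    access: Access
    approved_users: frozenset = frozenset()   # users granted access by the submitter


def can_view_metadata(record: Record) -> bool:
    """Metadata are visible for public and controlled records only."""
    return record.access is not Access.CONFIDENTIAL


def can_download_sequence(record: Record, user: str) -> bool:
    """Sequences are open for public records and per-user for controlled records."""
    if record.access is Access.PUBLIC:
        return True
    if record.access is Access.CONTROLLED:
        return user in record.approved_users
    return False


controlled = Record("EXAMPLE0001", Access.CONTROLLED, frozenset({"alice"}))
print(can_view_metadata(controlled))               # True
print(can_download_sequence(controlled, "alice"))  # True
print(can_download_sequence(controlled, "bob"))    # False
```

Because the submitter may change the access type at any time, the access type is stored on the record itself rather than derived from where the record lives.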

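Table 1  Fundamental rules for three types of data access

| Data type | Submitted content | Openness | Open to | Release condition |
| --- | --- | --- | --- | --- |
| Public* | Metadata; associated data | Public | All users | Released as soon as review is passed |
| Controlled | Metadata; associated data | Public (metadata); controlled (associated data) | All users (metadata); applicants (associated data) | Related paper published or agreed release date reached |
| Confidential | Metadata; associated data | Controlled | | Related paper published or agreed release date reached |

*: For viruses such as 2019-nCoV that concern national or global public health security, we call for gene sequence data to be shared openly under the "public" access type as soon as sequencing is completed.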


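4 Conclusions and perspectives

2019nCoVR integrates novel coronavirus data from CNCB/NGDC, CNGB, GISAID, NCBI, and NMDC and connects seamlessly with the related databases of CNCB/NGDC. It provides a public platform for the rapid release and open sharing of novel coronavirus genome data and an important foundation for accelerating research on virus classification and origin tracing, genome evolution, rapid detection, drug development, and precise prevention and treatment of the novel pneumonia. As research on 2019-nCoV progresses, 2019nCoVR will continue to update and release genome sequences and their metadata, providing data and information support for the fight against 2019-nCoV. We also call on researchers and medical workers to speed up the submission, sharing, and release of 2019-nCoV genome data, so as to build a global data community and work together to overcome the epidemic.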

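Acknowledgments

This resource is built and maintained by the China National Center for Bioinformation (CNCB)/National Genomics Data Center (NGDC). We thank Professor Jingchu Luo of Peking University for support and help during its construction. All data in the resource come from direct user submissions or from public data platforms in China and abroad, including GISAID, NCBI/GenBank, NMDC, and CNGB/CNGBdb (Table S1); we thank all institutions and individuals who collected samples and submitted data.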

Appendix

Tables S1 and S2 are available in the online version at www.chinagene.cn.

Table S1   Metadata of virus genomes. Statistics as of February 5, 2020.

| Virus Isolate Name | Accession ID | Data Source | Related ID | Host | Sample Collection Date | Location | Nuc. Completeness | Originating Lab | Submitting Lab |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| BetaCoV/USA/WA1-F6/2020 | EPI_ISL_407215 | GISAID | | Human | 2020-1-25 | USA / Washington | Complete | WA State Department of Health | Pathogen Discovery, Respiratory Viruses Branch, Division of Viral Diseases, Centers for Disease Control and Prevention |
| BetaCoV/Korea/KCDC03/2020 | EPI_ISL_407193 | GISAID | | Human | 2020-1-25 | Korea / Gyeonggi-do | Complete | Korea Centers for Disease Control & Prevention (KCDC) Center for Laboratory Control of Infectious Diseases Division of Viral Diseases | Korea Centers for Disease Control & Prevention (KCDC) Center for Laboratory Control of Infectious Diseases Division of Viral Diseases |
| BetaCoV/Japan/AI/I-004/2020 | EPI_ISL_407084 | GISAID | | Human | 2020-1-25 | Japan / Aichi | Complete | Department of Virology III, National Institute of Infectious Diseases | Pathogen Genomics Center, National Institute of Infectious Diseases |
| BetaCoV/USA/WA1-A12/2020 | EPI_ISL_407214 | GISAID | | Human | 2020-1-25 | USA / Washington | Complete | WA State Department of Health | Pathogen Discovery, Respiratory Viruses Branch, Division of Viral Diseases, Centers for Disease Control and Prevention |
| BetaCoV/Singapore/1/2020 | EPI_ISL_406973 | GISAID | | Human | 2020-1-23 | Singapore | Complete | Singapore General Hospital | National Public Health Laboratory |
| BetaCoV/England/02/2020 | EPI_ISL_407073 | GISAID | | Human | 2020-1-29 | England | Complete | Respiratory Virus Unit, Microbiology Services Colindale, Public Health England | Respiratory Virus Unit, Microbiology Services Colindale, Public Health England |
| BetaCoV/England/01/2020 | EPI_ISL_407071 | GISAID | | Human | 2020-1-29 | England | Complete | Respiratory Virus Unit, Microbiology Services Colindale, Public Health England | Respiratory Virus Unit, Microbiology Services Colindale, Public Health England |
| BetaCoV/Finland/1/2020 | EPI_ISL_407079 | GISAID | | Human | 2020-1-29 | Finland / Lapland | Partial/scaffold level | Lapland Central Hospital | Department of Virology, University of Helsinki and Helsinki University Hospital, Helsinki, Finland |
| BetaCoV/Hangzhou/HZ-1/2020 | EPI_ISL_406970 | GISAID | | Human | 2020-1-20 | China / Zhejiang / Hangzhou | Complete | Hangzhou Center for Disease Control and Prevention, Microbiology Lab | Hangzhou Center for Disease Control and Prevention, Microbiology Lab |
| 2019 nCoV/Italy-INMI1 | MT008022 | GenBank | EPI_ISL_406959 | Human | 2020-01 | Italy / Rome | Partial/gene level | | Virology Laboratory, INMI L. Spallanzani |
| 2019 nCoV/Italy-INMI2 | MT008023 | GenBank | EPI_ISL_406960 | Human | 2020-01 | Italy / Rome | Partial/gene level | | Virology Laboratory, INMI L. Spallanzani |
| BetaCoV/Wuhan/WH19002/2019 | NMDC60013002-05 | NMDC | | Human | 2019-12-30 | China / Hubei Province / Wuhan City | Complete | | China CDC; Shandong First Medical University & Shandong Academy of Medical Sciences; Hubei Provincial Center for Disease Control and Prevention; BGI PathoGenesis Pharmaceutical Technology Co., Ltd |
| BetaCoV/Wuhan/WH19008/2019 | NMDC60013002-06 | NMDC | | Human | 2019-12-30 | China / Hubei Province / Wuhan City | Complete | | China CDC; Shandong First Medical University & Shandong Academy of Medical Sciences; Hubei Provincial Center for Disease Control and Prevention; BGI PathoGenesis Pharmaceutical Technology Co., Ltd |
| BetaCoV/Wuhan/YS8011/2020 | NMDC60013002-07 | NMDC | | Human | 2020-1-7 | China / Hubei Province / Wuhan City | Complete | | China CDC; Shandong First Medical University & Shandong Academy of Medical Sciences; Hubei Provincial Center for Disease Control and Prevention; BGI PathoGenesis Pharmaceutical Technology Co., Ltd |
| BetaCoV/Wuhan/WH19001/2019 | NMDC60013002-08 | NMDC | | Human | 2019-12-30 | China / Hubei Province / Wuhan City | Complete | | China CDC; Shandong First Medical University & Shandong Academy of Medical Sciences; Hubei Provincial Center for Disease Control and Prevention; BGI PathoGenesis Pharmaceutical Technology Co., Ltd |
| BetaCoV/Wuhan/WH19004/2020 | NMDC60013002-09 | NMDC | | Human | 2020-1-1 | China / Hubei Province / Wuhan City | Complete | | China CDC; Shandong First Medical University & Shandong Academy of Medical Sciences; Hubei Provincial Center for Disease Control and Prevention; BGI PathoGenesis Pharmaceutical Technology Co., Ltd |
| BetaCoV/Wuhan/WH19005/2019 | NMDC60013002-10 | NMDC | | Human | 2019-12-30 | China / Hubei Province / Wuhan City | Complete | | China CDC; Shandong First Medical University & Shandong Academy of Medical Sciences; Hubei Provincial Center for Disease Control and Prevention; BGI PathoGenesis Pharmaceutical Technology Co., Ltd |
| BetaCoV/Germany/BavPat1/2020 | EPI_ISL_406862 | GISAID | | Human | 2020-1-28 | Germany / Bavaria / Munich | Complete | Charité Universitätsmedizin Berlin, Institute of Virology; Institut für Mikrobiologie der Bundeswehr, Munich | Charité Universitätsmedizin Berlin, Institute of Virology |
| BetaCoV/Wuhan/WH-01/2019 | NMDC60013002-01 | NMDC | LR757998, EPI_ISL_406798, CNA0007332 | Human | 2019-12-26 | China / Hubei Province / Wuhan City | Complete | General Hospital of Central Theater Command of People's Liberation Army of China | BGI PathoGenesis Pharmaceutical Technology Co., Ltd; China CDC; Shandong First Medical University & Shandong Academy of Medical Sciences; Hubei Provincial Center for Disease Control and Prevention |
| BetaCoV/Wuhan/WH-02/2019 | NMDC60013002-02 | NMDC | LR757997, CNA0007333 | Human | 2019-12-31 | China / Hubei Province / Wuhan City | Partial/scaffold level | | BGI PathoGenesis Pharmaceutical Technology Co., Ltd; China CDC; Shandong First Medical University & Shandong Academy of Medical Sciences; Hubei Provincial Center for Disease Control and Prevention |
| BetaCoV/Wuhan/WH-03/2019 | NMDC60013002-03 | NMDC | LR757996, EPI_ISL_406800, CNA0007334 | Human | 2020-1-1 | China / Hubei Province / Wuhan City | Complete | General Hospital of Central Theater Command of People's Liberation Army of China | BGI PathoGenesis Pharmaceutical Technology Co., Ltd; China CDC; Shandong First Medical University & Shandong Academy of Medical Sciences; Hubei Provincial Center for Disease Control and Prevention |
| BetaCoV/Wuhan/WH-04/2019 | NMDC60013002-04 | NMDC | LR757995, EPI_ISL_406801, CNA0007335 | Human | 2020-1-5 | China / Hubei Province / Wuhan City | Complete | General Hospital of Central Theater Command of People's Liberation Army of China | BGI PathoGenesis Pharmaceutical Technology Co., Ltd; China CDC; Shandong First Medical University & Shandong Academy of Medical Sciences; Hubei Provincial Center for Disease Control and Prevention |
| BetaCoV/Australia/VIC01/2020 | MT007544 | GenBank | EPI_ISL_406844 | Human | 2020-1-25 | Australia / Victoria / Clayton | Complete | Monash Medical Centre | Department of Microbiology and Immunology, The University of Melbourne at The Peter Doherty Institute for Infection and Immunity |
| WIV02 | GWHABKK00000000 | Genome Warehouse | EPI_ISL_402127, MN996527 | Human | 2019-12-30 | China / Hubei / Wuhan | Complete | Wuhan Jinyintan Hospital | CAS Key Laboratory of Special Pathogens and Biosafety and Center for Emerging Infectious Diseases, Wuhan Institute of Virology, Chinese Academy of Sciences |
| WIV04 | GWHABKL00000000 | Genome Warehouse | EPI_ISL_402124, MN996528 | Human | 2019-12-30 | China / Hubei / Wuhan | Complete | Wuhan Jinyintan Hospital | CAS Key Laboratory of Special Pathogens and Biosafety and Center for Emerging Infectious Diseases, Wuhan Institute of Virology, Chinese Academy of Sciences |
| WIV05 | GWHABKM00000000 | Genome Warehouse | EPI_ISL_402128, MN996529 | Human | 2019-12-30 | China / Hubei / Wuhan | Complete | Wuhan Jinyintan Hospital | CAS Key Laboratory of Special Pathogens and Biosafety and Center for Emerging Infectious Diseases, Wuhan Institute of Virology, Chinese Academy of Sciences |
| WIV06 | GWHABKN00000000 | Genome Warehouse | EPI_ISL_402129, MN996530 | Human | 2019-12-30 | China / Hubei / Wuhan | Complete | Wuhan Jinyintan Hospital | CAS Key Laboratory of Special Pathogens and Biosafety and Center for Emerging Infectious Diseases, Wuhan Institute of Virology, Chinese Academy of Sciences |
| WIV07 | GWHABKO00000000 | Genome Warehouse | EPI_ISL_402130, MN996531 | Human | 2019-12-30 | China / Hubei / Wuhan | Complete | Wuhan Jinyintan Hospital | CAS Key Laboratory of Special Pathogens and Biosafety and Center for Emerging Infectious Diseases, Wuhan Institute of Virology, Chinese Academy of Sciences |
| TG13 | GWHABKP00000000 | Genome Warehouse | EPI_ISL_402131, MN996532 | Rhinolophus affinis | 2013-7-24 | China / Yunnan / Pu'er | Complete | Wuhan Institute of Virology | CAS Key Laboratory of Special Pathogens and Biosafety and Center for Emerging Infectious Diseases, Wuhan Institute of Virology, Chinese Academy of Sciences |
| BetaCoV/Guangdong/20SF174/2020 | EPI_ISL_406531 | GISAID | | Human | 2020-1-22 | China / Guangdong Province | Complete | Guangdong Provincial Center for Diseases Control and Prevention; Guangdong Provincial Institute of Public Health | Guangdong Provincial Center for Diseases Control and Prevention |
| BetaCoV/Guangzhou/20SF206/2020 | EPI_ISL_406533 | GISAID | | Human | 2020-1-22 | China / Guangdong Province / Guangzhou City | Complete | Guangdong Provincial Center for Diseases Control and Prevention; Guangdong Provincial Institute of Public Health | Guangdong Provincial Center for Diseases Control and Prevention |
| BetaCoV/Foshan/20SF207/2020 | EPI_ISL_406534 | GISAID | | Human | 2020-1-22 | China / Guangdong Province | Complete | Guangdong Provincial Center for Diseases Control and Prevention; Guangdong Provincial Institute of Public Health | Guangdong Provincial Center for Diseases Control and Prevention |
| BetaCoV/Foshan/20SF210/2020 | EPI_ISL_406535 | GISAID | | Human | 2020-1-22 | China / Guangdong Province | Complete | Guangdong Provincial Center for Diseases Control and Prevention; Guangdong Provincial Institute of Public Health | Guangdong Provincial Center for Diseases Control and Prevention |
| BetaCoV/Foshan/20SF211/2020 | EPI_ISL_406536 | GISAID | | Human | 2020-1-22 | China / Guangdong Province | Complete | Guangdong Provincial Center for Diseases Control and Prevention; Guangdong Provincial Institute of Public Health | Guangdong Provincial Center for Diseases Control and Prevention |
| BetaCoV/Guangdong/20SF201/2020 | EPI_ISL_406538 | GISAID | | Human | 2020-1-23 | China / Guangdong Province | Complete | Guangdong Provincial Center for Diseases Control and Prevention; Guangdong Provincial Institute of Public Health | Guangdong Provincial Center for Diseases Control and Prevention |
| BetaCoV/Shenzhen/SZTH-001/2020 | EPI_ISL_406592 | GISAID | | Human | 2020-1-13 | China / Guangdong Province / Shenzhen City | Complete | Shenzhen Third People's Hospital | Shenzhen Key Laboratory of Pathogen and Immunity, National Clinical Research Center for Infectious Disease, Shenzhen Third People's Hospital |
| BetaCoV/Shenzhen/SZTH-003/2020 | EPI_ISL_406594 | GISAID | | Human | 2020-1-16 | China / Guangdong Province / Shenzhen City | Complete | Shenzhen Key Laboratory of Pathogen and Immunity, National Clinical Research Center for Infectious Disease, Shenzhen Third People's Hospital | Shenzhen Key Laboratory of Pathogen and Immunity, National Clinical Research Center for Infectious Disease, Shenzhen Third People's Hospital |
| BetaCoV/Shenzhen/SZTH-004/2020 | EPI_ISL_406595 | GISAID | | Human | 2020-1-16 | China / Guangdong Province / Shenzhen City | Complete | Shenzhen Key Laboratory of Pathogen and Immunity, National Clinical Research Center for Infectious Disease, Shenzhen Third People's Hospital | Shenzhen Key Laboratory of Pathogen and Immunity, National Clinical Research Center for Infectious Disease, Shenzhen Third People's Hospital |
| BetaCoV/Shenzhen/SZTH-002/2020 | EPI_ISL_406593 | GISAID | | Human | 2020-1-13 | China / Guangdong Province / Shenzhen City | Complete | Shenzhen Key Laboratory of Pathogen and Immunity, National Clinical Research Center for Infectious Disease, Shenzhen Third People's Hospital | Shenzhen Key Laboratory of Pathogen and Immunity, National Clinical Research Center for Infectious Disease, Shenzhen Third People's Hospital |
| BetaCoV/France/IDF0373/2020 | EPI_ISL_406597 | GISAID | | Human | 2020-1-23 | France / Ile-de-France / Paris | Complete | Department of Infectious and Tropical Diseases, Bichat Claude Bernard Hospital, Paris | National Reference Center for Viruses of Respiratory Infections, Institut Pasteur, Paris |
| BetaCoV/France/IDF0372/2020 | EPI_ISL_406596 | GISAID | | Human | 2020-1-23 | France / Ile-de-France / Paris | Complete | Department of Infectious and Tropical Diseases, Bichat Claude Bernard Hospital, Paris | National Reference Center for Viruses of Respiratory Infections, Institut Pasteur, Paris |
| 2019-nCoV_HKU-SZ-001_2020 | MN938387 | GenBank | | Human | 2020-01 | China / Guangdong / Shenzhen | Partial/gene level | University of Hong Kong-Shenzhen Hospital | University of Hong Kong-Shenzhen Hospital |
| 2019-nCoV_HKU-SZ-001_2020 | MN938385 | GenBank | | Human | 2020-01 | China / Guangdong / Shenzhen | Partial/gene level | University of Hong Kong-Shenzhen Hospital | University of Hong Kong-Shenzhen Hospital |
| 2019-nCoV_HKU-SZ-002b_2020 | MN938388 | GenBank | | Human | 2020-01 | China / Guangdong / Shenzhen | Partial/gene level | University of Hong Kong-Shenzhen Hospital | University of Hong Kong-Shenzhen Hospital |
| 2019-nCoV_HKU-SZ-004_2020 | MN938389 | GenBank | | Human | 2020-01 | China / Guangdong / Shenzhen | Partial/gene level | University of Hong Kong-Shenzhen Hospital | University of Hong Kong-Shenzhen Hospital |
| 2019-nCoV_HKU-SZ-004_2020 | MN938386 | GenBank | | Human | 2020-01 | China / Guangdong / Shenzhen | Partial/gene level | University of Hong Kong-Shenzhen Hospital | University of Hong Kong-Shenzhen Hospital |
| 2019-nCoV_HKU-SZ-005_2020 | MN938390 | GenBank | | Human | 2020-01 | China / Guangdong / Shenzhen | Partial/gene level | University of Hong Kong-Shenzhen Hospital | University of Hong Kong-Shenzhen Hospital |
| 2019-nCoV_HKU-SZ-007a_2020 | MN975266 | GenBank | | Human | 2020-01 | China / Guangdong / Shenzhen | Partial/gene level | University of Hong Kong-Shenzhen Hospital | University of Hong Kong-Shenzhen Hospital |
| 2019-nCoV_HKU-SZ-007a_2020 | MN975263 | GenBank | | Human | 2020-01 | China / Guangdong / Shenzhen | Partial/gene level | University of Hong Kong-Shenzhen Hospital | University of Hong Kong-Shenzhen Hospital |
| 2019-nCoV_HKU-SZ-007b_2020 | MN975264 | GenBank | | Human | 2020-01 | China / Guangdong / Shenzhen | Partial/gene level | University of Hong Kong-Shenzhen Hospital | University of Hong Kong-Shenzhen Hospital |
| 2019-nCoV_HKU-SZ-007b_2020 | MN975267 | GenBank | | Human | 2020-01 | China / Guangdong / Shenzhen | Partial/gene level | University of Hong Kong-Shenzhen Hospital | University of Hong Kong-Shenzhen Hospital |
| 2019-nCoV_HKU-SZ-007c_2020 | MN975268 | GenBank | | Human | 2020-01 | China / Guangdong / Shenzhen | Partial/gene level | University of Hong Kong-Shenzhen Hospital | University of Hong Kong-Shenzhen Hospital |
| 2019-nCoV_HKU-SZ-007c_2020 | MN975265 | GenBank | | Human | 2020-01 | China / Guangdong / Shenzhen | Partial/gene level | University of Hong Kong-Shenzhen Hospital | University of Hong Kong-Shenzhen Hospital |
| SI200040-SP | MN970003 | GenBank | | Human | 2020-1-8 | Thailand | Partial/gene level | | Faculty of Medicine, Chulalongkorn University |
| SI200121-SP | MN970004 | GenBank | | Human | 2020-1-13 | Thailand | Partial/gene level | | Faculty of Medicine, Chulalongkorn University |
| BetaCoV/Taiwan/2/2020 | EPI_ISL_406031 | GISAID | | Human | 2020-1-23 | Taiwan / Kaohsiung City | Complete | Centers for Disease Control (Taiwan) | Centers for Disease Control (Taiwan) |
| 2019-nCoV/USA-CA1/2020 | MN994467 | GenBank | EPI_ISL_406034 | Human | 2020-1-23 | USA / California / Los Angeles | Complete | California Department of Public Health | Division of Viral Diseases, Centers for Disease Control and Prevention |
| 2019-nCoV/USA-CA2/2020 | MN994468 | GenBank | EPI_ISL_406036 | Human | 2020-1-22 | USA / California / Orange County | Complete | California Department of Public Health | Division of Viral Diseases, Centers for Disease Control and Prevention |
| 2019-nCoV/USA-AZ1/2020 | MN997409 | GenBank | EPI_ISL_406223 | Human | 2020-1-22 | USA / Arizona / Phoenix | Complete | Arizona Department of Health Services | Division of Viral Diseases, Centers for Disease Control and Prevention |
| 2019-nCoV WHU01 | MN988668 | GenBank | EPI_ISL_406716 | Human | 2020-1-2 | China / Hubei / Wuhan | Complete | State Key Laboratory of Virology, Wuhan University | State Key Laboratory of Virology, Wuhan University |
| 2019-nCoV WHU02 | MN988669 | GenBank | EPI_ISL_406717 | Human | 2020-1-2 | China / Hubei / Wuhan | Complete | State Key Laboratory of Virology, Wuhan University | State Key Laboratory of Virology, Wuhan University |
| 2019-nCoV/USA-WA1/2020 | MN985325 | GenBank | EPI_ISL_404895 | Human | 2020-1-19 | USA / Washington / Snohomish County | Complete | Providence Regional Medical Center | Division of Viral Diseases, Centers for Disease Control and Prevention |
| 2019-nCoV/USA-IL1/2020 | MN988713 | GenBank | EPI_ISL_404253 | Human | 2020-1-21 | USA / Illinois / Chicago | Complete | Pathogen Discovery, Respiratory Viruses Branch, Division of Viral Diseases, Centers for Disease Control and Prevention | IL Department of Public Health Chicago Laboratory |
| BetaCoV/Wuhan/IPBCAMS-WH-01/2019 | GWHABKF00000000 | Genome Warehouse | EPI_ISL_402123 | Human | 2019-12-23 | China / Hubei Province / Wuhan City | Complete | Institute of Pathogen Biology, Chinese Academy of Medical Sciences & Peking Union Medical College | Institute of Pathogen Biology, Chinese Academy of Medical Sciences & Peking Union Medical College; Vision Medicals Co., Ltd |
| BetaCoV/Wuhan/IPBCAMS-WH-02/2019 | GWHABKG00000000 | Genome Warehouse | EPI_ISL_403931 | Human | 2019-12-30 | China / Hubei Province / Wuhan City | Complete | Institute of Pathogen Biology, Chinese Academy of Medical Sciences & Peking Union Medical College | Institute of Pathogen Biology, Chinese Academy of Medical Sciences & Peking Union Medical College |
| BetaCoV/Wuhan/IPBCAMS-WH-03/2019 | GWHABKH00000000 | Genome Warehouse | EPI_ISL_403930 | Human | 2019-12-30 | China / Hubei Province / Wuhan City | Complete | Institute of Pathogen Biology, Chinese Academy of Medical Sciences & Peking Union Medical College | Institute of Pathogen Biology, Chinese Academy of Medical Sciences & Peking Union Medical College |
| BetaCoV/Wuhan/IPBCAMS-WH-04/2019 | GWHABKI00000000 | Genome Warehouse | EPI_ISL_403929 | Human | 2019-12-30 | China / Hubei Province / Wuhan City | Complete | Institute of Pathogen Biology, Chinese Academy of Medical Sciences & Peking Union Medical College | Institute of Pathogen Biology, Chinese Academy of Medical Sciences & Peking Union Medical College |
| BetaCoV/Wuhan/IPBCAMS-WH-05/2020 | GWHABKJ00000000 | Genome Warehouse | EPI_ISL_403928 | Human | 2020-1-1 | China / Hubei Province / Wuhan City | Complete | Institute of Pathogen Biology, Chinese Academy of Medical Sciences & Peking Union Medical College | Institute of Pathogen Biology, Chinese Academy of Medical Sciences & Peking Union Medical College; China National Center for Bioinformation |
| 2019-nCoV_HKU-SZ-002a_2020 | MN938384 | GenBank | EPI_ISL_406030 | Human | 2020-1 | China / Guangdong / Shenzhen | Complete | University of Hong Kong-Shenzhen Hospital | University of Hong Kong-Shenzhen Hospital |
| 2019-nCoV_HKU-SZ-005b_2020 | MN975262 | GenBank | EPI_ISL_405839 | Human | 2020-1 | China / Guangdong / Shenzhen | Complete | University of Hong Kong-Shenzhen Hospital | University of Hong Kong-Shenzhen Hospital |
| BetaCoV/Guangdong/20SF040/2020 | EPI_ISL_403937 | GISAID | | Human | 2020-1-18 | China / Guangdong Province / Zhuhai City | Complete | Guangdong Provincial Center for Diseases Control and Prevention; Guangdong Provincial Institute of Public Health | Department of Microbiology, Guangdong Provincial Center for Diseases Control and Prevention |
| BetaCoV/Guangdong/20SF028/2020 | EPI_ISL_403936 | GISAID | | Human | 2020-1-17 | China / Guangdong Province / Zhuhai City | Complete | Guangdong Provincial Center for Diseases Control and Prevention; Guangdong Provincial Institute of Public Health | Department of Microbiology, Guangdong Provincial Center for Diseases Control and Prevention |
| BetaCoV/Guangdong/20SF025/2020 | EPI_ISL_403935 | GISAID | | Human | 2020-1-15 | China / Guangdong Province / Shenzhen City | Complete | Guangdong Provincial Center for Diseases Control and Prevention; Guangdong Provincial Institute of Public Health | Department of Microbiology, Guangdong Provincial Center for Diseases Control and Prevention |
| BetaCoV/Guangdong/20SF014/2020 | EPI_ISL_403934 | GISAID | | Human | 2020-1-15 | China / Guangdong Province / Shenzhen City | Complete | Guangdong Provincial Center for Diseases Control and Prevention; Guangdong Provincial Institute of Public Health | Department of Microbiology, Guangdong Provincial Center for Diseases Control and Prevention |
| BetaCoV/Guangdong/20SF013/2020 | EPI_ISL_403933 | GISAID | | Human | 2020-1-15 | China / Guangdong Province / Shenzhen City | Complete | Guangdong Provincial Center for Diseases Control and Prevention; Guangdong Provincial Institute of Public Health | Department of Microbiology, Guangdong Provincial Center for Diseases Control and Prevention |
| BetaCoV/Guangdong/20SF012/2020 | EPI_ISL_403932 | GISAID | | Human | 2020-1-14 | China / Guangdong Province / Shenzhen City | Complete | Guangdong Provincial Center for Diseases Control and Prevention; Guangdong Provincial Institute of Public Health | Department of Microbiology, Guangdong Provincial Center for Diseases Control and Prevention |
| BetaCoV/Zhejiang/WZ-01/2020 | EPI_ISL_404227 | GISAID | | Human | 2020-1-16 | China / Zhejiang Province | Complete | Zhejiang Provincial Center for Disease Control and Prevention | Department of Microbiology, Zhejiang Provincial Center for Disease Control and Prevention |
| BetaCoV/Zhejiang/WZ-02/2020 | EPI_ISL_404228 | GISAID | | Human | 2020-1-17 | China / Zhejiang Province | Complete | Zhejiang Provincial Center for Disease Control and Prevention | Department of Microbiology, Zhejiang Provincial Center for Disease Control and Prevention |
| BetaCoV/Wuhan/HBCDC-HB-01/2019 | EPI_ISL_402132 | GISAID | | Human | 2019-12-30 | China / Hubei Province | Complete | Wuhan Jinyintan Hospital | Hubei Provincial Center for Disease Control and Prevention |
| BetaCoV/Nonthaburi/74/2020 | EPI_ISL_403963 | GISAID | | Human | 2020-1-13 | Thailand / Nonthaburi Province | Complete | Bamrasnaradura Hospital | Department of Medical Sciences, Ministry of Public Health, Thailand; Thai Red Cross Emerging Infectious Diseases - Health Science Centre; Department of Disease Control, Ministry of Public Health, Thailand |
| BetaCoV/Nonthaburi/61/2020 | EPI_ISL_403962 | GISAID | | Human | 2020-1-8 | Thailand / Nonthaburi Province | Complete | Bamrasnaradura Hospital | Department of Medical Sciences, Ministry of Public Health, Thailand; Thai Red Cross Emerging Infectious Diseases - Health Science Centre; Department of Disease Control, Ministry of Public Health, Thailand |
| BetaCoV/Wuhan/IVDC-HB-04/2020 | EPI_ISL_402120 | GISAID | | Human | 2020-1-1 | China / Hubei Province / Wuhan City | Complete | National Institute for Viral Disease Control and Prevention, China CDC | National Institute for Viral Disease Control and Prevention, China CDC |
| BetaCoV/Wuhan/IVDC-HB-01/2019 | EPI_ISL_402119 | GISAID | | Human | 2019-12-30 | China / Hubei Province / Wuhan City | Complete | National Institute for Viral Disease Control and Prevention, China CDC | National Institute for Viral Disease Control and Prevention, China CDC |
| BetaCoV/Wuhan/IVDC-HB-05/2019 | EPI_ISL_402121 | GISAID | | Human | 2019-12-30 | China / Hubei Province / Wuhan City | Complete | National Institute for Viral Disease Control and Prevention, China CDC | National Institute for Viral Disease Control and Prevention, China CDC |
| BetaCoV/Kanagawa/1/2020 | EPI_ISL_402126 | GISAID | | Human | 2020-1-14 | Kanagawa Prefecture / Japan | Partial | Dept. of Virology III, National Institute of Infectious Diseases | Dept. of Virology III, National Institute of Infectious Diseases |
| Wuhan-Hu-1 | MN908947 | GenBank | NC_045512 | Human | 2019-12 | China / Hubei Province / Wuhan City | Complete | | Shanghai Public Health Clinical Center & School of Public Health, Fudan University, Shanghai, China |



Table S2  Genome sequence variants and annotations of 2019-nCoV

| Genome position | Gene/region name | Number of variant strains | Base change and strain count | Variant annotation type | Protein name, position, amino acid change | Gene name, CDS position, sequence change | Effect type |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 2019-nCoV_16 | 5'UTR | 1 | C->T:1 | intergenic_variant | - | - | MODIFIER |
| 2019-nCoV_31 | 5'UTR | 1 | A->G:1 | intergenic_variant | - | - | MODIFIER |
| 2019-nCoV_104 | 5'UTR | 1 | T->A:1 | intergenic_variant | - | - | MODIFIER |
| 2019-nCoV_111 | 5'UTR | 1 | T->C:1 | intergenic_variant | - | - | MODIFIER |
| 2019-nCoV_112 | 5'UTR | 1 | T->G:1 | intergenic_variant | - | - | MODIFIER |
| 2019-nCoV_119 | 5'UTR | 1 | C->G:1 | intergenic_variant | - | - | MODIFIER |
| 2019-nCoV_120 | 5'UTR | 1 | T->C:1 | intergenic_variant | - | - | MODIFIER |
| 2019-nCoV_124 | 5'UTR | 1 | G->A:1 | intergenic_variant | - | - | MODIFIER |
| 2019-nCoV_241 | 5'UTR | 1 | C->T:1 | upstream_gene_variant | QHD43415.1 | gene-orf1ab | MODIFIER;DISTANCE=25 |
| 2019-nCoV_358 | gene-orf1ab | 1 | TGGAGACTCCGTGGAGGAGGTCTTA->T:1 | inframe_deletion | QHD43415.1:p.32-39GDSVEEVL>- | gene-orf1ab:c.94-117GGAGACTCCGTGGAGGAGGTCTTA>- | MODERATE |
| 2019-nCoV_490 | gene-orf1ab | 1 | T->W:1 | coding_sequence_variant | QHD43415.1:p.75- | gene-orf1ab:c.225gaT>gaW | MODIFIER |
| 2019-nCoV_583 | gene-orf1ab | 1 | C->T:1 | synonymous_variant | QHD43415.1:p.106V | gene-orf1ab:c.318gtC>gtT | LOW |
| 2019-nCoV_709 | gene-orf1ab | 1 | G->A:1 | synonymous_variant | QHD43415.1:p.148E | gene-orf1ab:c.444gaG>gaA | LOW |
| 2019-nCoV_1548 | gene-orf1ab | 1 | G->A:1 | missense_variant | QHD43415.1:p.428S>N | gene-orf1ab:c.1283aGc>aAc | MODERATE |
| 2019-nCoV_1912 | gene-orf1ab | 1 | C->T:1 | synonymous_variant | QHD43415.1:p.549S | gene-orf1ab:c.1647tcC>tcT | LOW |
| 2019-nCoV_3037 | gene-orf1ab | 1 | C->T:1 | synonymous_variant | QHD43415.1:p.924F | gene-orf1ab:c.2772ttC>ttT | LOW |
| 2019-nCoV_3177 | gene-orf1ab | 1 | C->Y:1 | coding_sequence_variant | QHD43415.1:p.971- | gene-orf1ab:c.2912cCt>cYt | MODIFIER |
| 2019-nCoV_3778 | gene-orf1ab | 1 | A->G:1 | synonymous_variant | QHD43415.1:p.1171T | gene-orf1ab:c.3513acA>acG | LOW |
| 2019-nCoV_4402 | gene-orf1ab | 1 | T->C:1 | synonymous_variant | QHD43415.1:p.1379L | gene-orf1ab:c.4137ctT>ctC | LOW |
| 2019-nCoV_5062 | gene-orf1ab | 1 | G->T:1 | missense_variant | QHD43415.1:p.1599L>F | gene-orf1ab:c.4797ttG>ttT | MODERATE |
| 2019-nCoV_6846 | gene-orf1ab | 1 | T->C:1 | missense_variant | QHD43415.1:p.2194M>T | gene-orf1ab:c.6581aTg>aCg | MODERATE |
| 2019-nCoV_6968 | gene-orf1ab | 1 | C->A:1 | missense_variant | QHD43415.1:p.2235L>I | gene-orf1ab:c.6703Cta>Ata | MODERATE |
| 2019-nCoV_6996 | gene-orf1ab | 1 | T->C:1 | missense_variant | QHD43415.1:p.2244I>T | gene-orf1ab:c.6731aTc>aCc | MODERATE |
| 2019-nCoV_7016 | gene-orf1ab | 1 | G->A:1 | missense_variant | QHD43415.1:p.2251G>S | gene-orf1ab:c.6751Ggt>Agt | MODERATE |
| 2019-nCoV_7866 | gene-orf1ab | 1 | G->T:1 | missense_variant | QHD43415.1:p.2534G>V | gene-orf1ab:c.7601gGt>gTt | MODERATE |
| 2019-nCoV_8001 | gene-orf1ab | 1 | A->C:1 | missense_variant | QHD43415.1:p.2579D>A | gene-orf1ab:c.7736gAt>gCt | MODERATE |
| 2019-nCoV_8388 | gene-orf1ab | 1 | A->G:1 | missense_variant | QHD43415.1:p.2708N>S | gene-orf1ab:c.8123aAc>aGc | MODERATE |
| 2019-nCoV_8782 | gene-orf1ab | 16 | C->T:15;C->Y:1 | synonymous_variant;coding_sequence_variant | QHD43415.1:p.2839S;QHD43415.1:p.2839- | gene-orf1ab:c.8517agC>agT;gene-orf1ab:c.8517agC>agY | LOW;MODIFIER |
| 2019-nCoV_8987 | gene-orf1ab | 1 | T->A:1 | missense_variant | QHD43415.1:p.2908F>I | gene-orf1ab:c.8722Ttt>Att | MODERATE |
| 2019-nCoV_9534 | gene-orf1ab | 1 | C->T:1 | missense_variant | QHD43415.1:p.3090T>I | gene-orf1ab:c.9269aCt>aTt | MODERATE |
| 2019-nCoV_9561 | gene-orf1ab | 1 | C->T:1 | missense_variant | QHD43415.1:p.3099S>L | gene-orf1ab:c.9296tCa>tTa | MODERATE |
| 2019-nCoV_11083 | gene-orf1ab | 1 | G->T:1 | missense_variant | QHD43415.1:p.3606L>F | gene-orf1ab:c.10818ttG>ttT | MODERATE |
| 2019-nCoV_11707 | gene-orf1ab | 1 | A->G:1 | synonymous_variant | QHD43415.1:p.3814L | gene-orf1ab:c.11442ttA>ttG | LOW |
| 2019-nCoV_11764 | gene-orf1ab | 1 | T->A:1 | missense_variant | QHD43415.1:p.3833N>K | gene-orf1ab:c.11499aaT>aaA | MODERATE |
| 2019-nCoV_15324 | gene-orf1ab | 1 | C->T:1 | synonymous_variant | QHD43415.1:p.5020N | gene-orf1ab:c.15060aaC>aaT | LOW |
| 2019-nCoV_15607 | gene-orf1ab | 1 | T->C:1 | synonymous_variant | QHD43415.1:p.5115L | gene-orf1ab:c.15343Tta>Cta | LOW |
| 2019-nCoV_16188 | gene-orf1ab | 1 | G->T:1 | missense_variant | QHD43415.1:p.5308W>C | gene-orf1ab:c.15924tgG>tgT | MODERATE |
| 2019-nCoV_17000 | gene-orf1ab | 1 | C->T:1 | missense_variant | QHD43415.1:p.5579T>I | gene-orf1ab:c.16736aCa>aTa | MODERATE |
| 2019-nCoV_17373 | gene-orf1ab | 2 | C->T:2 | synonymous_variant | QHD43415.1:p.5703A | gene-orf1ab:c.17109gcC>gcT | LOW |
| 2019-nCoV_18060 | gene-orf1ab | 3 | C->T:3 | synonymous_variant | QHD43415.1:p.5932L | gene-orf1ab:c.17796ctC>ctT | LOW |
| 2019-nCoV_18488 | gene-orf1ab | 2 | T->C:2 | missense_variant | QHD43415.1:p.6075I>T | gene-orf1ab:c.18224aTa>aCa | MODERATE |
| 2019-nCoV_18512 | gene-orf1ab | 1 | C->T:1 | missense_variant | QHD43415.1:p.6083P>L | gene-orf1ab:c.18248cCt>cTt | MODERATE |
| 2019-nCoV_19065 | gene-orf1ab | 1 | T->C:1 | synonymous_variant | QHD43415.1:p.6267P | gene-orf1ab:c.18801ccT>ccC | LOW |
| 2019-nCoV_19959 | gene-orf1ab | 1 | A->C:1 | missense_variant | QHD43415.1:p.6565E>D | gene-orf1ab:c.19695gaA>gaC | MODERATE |
| 2019-nCoV_20670 | gene-orf1ab | 2 | G->A:2 | synonymous_variant | QHD43415.1:p.6802A | gene-orf1ab:c.20406gcG>gcA | LOW |
| 2019-nCoV_20679 | gene-orf1ab | 2 | G->A:2 | synonymous_variant | QHD43415.1:p.6805P | gene-orf1ab:c.20415ccG>ccA | LOW |
| 2019-nCoV_21137 | gene-orf1ab | 1 | A->G:1 | missense_variant | QHD43415.1:p.6958K>R | gene-orf1ab:c.20873aAg>aGg | MODERATE |
| 2019-nCoV_21316 | gene-orf1ab | 1 | G->A:1 | missense_variant | QHD43415.1:p.7018D>N | gene-orf1ab:c.21052Gat>Aat | MODERATE |
| 2019-nCoV_21656 | gene-S | 1 | T->A:1 | missense_variant | QHD43416.1:p.32F>I | gene-S:c.94Ttc>Atc | MODERATE |
| 2019-nCoV_21707 | gene-S | 3 | C->T:3 | missense_variant | QHD43416.1:p.49H>Y | gene-S:c.145Cat>Tat | MODERATE |
| 2019-nCoV_22303 | gene-S | 1 | T->G:1 | missense_variant | QHD43416.1:p.247S>R | gene-S:c.741agT>agG | MODERATE |
| 2019-nCoV_22586 | gene-S | 1 | T->Y:1 | coding_sequence_variant | QHD43416.1:p.342- | gene-S:c.1024Ttt>Ytt | MODIFIER |
| 2019-nCoV_22622 | gene-S | 1 | A->G:1 | missense_variant | QHD43416.1:p.354N>D | gene-S:c.1060Aac>Gac | MODERATE |
| 2019-nCoV_22652 | gene-S | 1 | G->T:1 | missense_variant | QHD43416.1:p.364D>Y | gene-S:c.1090Gat>Tat | MODERATE |
| 2019-nCoV_22661 | gene-S | 2 | G->T:2 | missense_variant | QHD43416.1:p.367V>F | gene-S:c.1099Gtc>Ttc | MODERATE |
| 2019-nCoV_23403 | gene-S | 1 | A->G:1 | missense_variant | QHD43416.1:p.614D>G | gene-S:c.1841gAt>gGt | MODERATE |
| 2019-nCoV_23569 | gene-S | 2 | T->C:2 | synonymous_variant | QHD43416.1:p.669G | gene-S:c.2007ggT>ggC | LOW |
| 2019-nCoV_23605 | gene-S | 2 | T->G:2 | synonymous_variant | QHD43416.1:p.681P | gene-S:c.2043ccT>ccG | LOW |
| 2019-nCoV_24034 | gene-S | 2 | C->T:1;C->Y:1 | synonymous_variant;coding_sequence_variant | QHD43416.1:p.824N;QHD43416.1:p.824- | gene-S:c.2472aaC>aaT;gene-S:c.2472aaC>aaY | LOW;MODIFIER |
| 2019-nCoV_24325 | gene-S | 2 | A->G:2 | synonymous_variant | QHD43416.1:p.921K | gene-S:c.2763aaA>aaG | LOW |
| 2019-nCoV_25060 | gene-S | 1 | A->G:1 | synonymous_variant | QHD43416.1:p.1166L | gene-S:c.3498ttA>ttG | LOW |
| 2019-nCoV_25645 | gene-ORF3a | 1 | T->C:1 | synonymous_variant | QHD43417.1:p.85L | gene-ORF3a:c.253Ttg>Ctg | LOW |
| 2019-nCoV_25964 | gene-ORF3a | 1 | A->G:1 | missense_variant | QHD43417.1:p.191E>G | gene-ORF3a:c.572gAa>gGa | MODERATE |
| 2019-nCoV_26144 | gene-ORF3a | 5 | G->T:5 | missense_variant | QHD43417.1:p.251G>V | gene-ORF3a:c.752gGt>gTt | MODERATE |
| 2019-nCoV_26729 | gene-M | 2 | T->C:1;T->Y:1 | synonymous_variant;coding_sequence_variant | QHD43419.1:p.69A;QHD43419.1:p.69- | gene-M:c.207gcT>gcC;gene-M:c.207gcT>gcY | LOW;MODIFIER |
| 2019-nCoV_27493 | gene-ORF7a | 2 | C->T:2 | missense_variant | QHD43421.1:p.34P>S | gene-ORF7a:c.100Cct>Tct | MODERATE |
| 2019-nCoV_27577 | gene-ORF7a | 1 | C->T:1 | stop_gained | QHD43421.1:p.62Q>* | gene-ORF7a:c.184Caa>Taa | HIGH |
| 2019-nCoV_28077 | gene-ORF8 | 2 | G->S:1;G->C:1 | coding_sequence_variant;missense_variant | QHD43422.1:p.62-;QHD43422.1:p.62V>L | gene-ORF8:c.184Gtg>Stg;gene-ORF8:c.184Gtg>Ctg | MODIFIER;MODERATE |
| 2019-nCoV_28144 | gene-ORF8 | 16 | T->C:15;T->Y:1 | missense_variant;coding_sequence_variant | QHD43422.1:p.84L>S;QHD43422.1:p.84- | gene-ORF8:c.251tTa>tCa;gene-ORF8:c.251tTa>tYa | MODERATE;MODIFIER |
| 2019-nCoV_28253 | gene-ORF8 | 2 | C->T:2 | synonymous_variant | QHD43422.1:p.120F | gene-ORF8:c.360ttC>ttT | LOW |
| 2019-nCoV_28291 | gene-N | 1 | C->T:1 | synonymous_variant | QHD43423.2:p.6P | gene-N:c.18ccC>ccT | LOW |
| 2019-nCoV_28716 | gene-N | 1 | C->T:1 | missense_variant | QHD43423.2:p.148T>I | gene-N:c.443aCc>aTc | MODERATE |
| 2019-nCoV_28792 | gene-N | 1 | A->T:1 | synonymous_variant | QHD43423.2:p.173A | gene-N:c.519gcA>gcT | LOW |
| 2019-nCoV_28854 | gene-N | 3 | C->T:2;C->Y:1 | missense_variant;coding_sequence_variant | QHD43423.2:p.194S>L;QHD43423.2:p.194- | gene-N:c.581tCa>tTa;gene-N:c.581tCa>tYa | MODERATE;MODIFIER |
| 2019-nCoV_29095 | gene-N | 7 | C->T:7 | synonymous_variant | QHD43423.2:p.274F | gene-N:c.822ttC>ttT | LOW |
| 2019-nCoV_29303 | gene-N | 1 | C->T:1 | missense_variant | QHD43423.2:p.344P>S | gene-N:c.1030Cca>Tca | MODERATE |
| 2019-nCoV_29596 | gene-ORF10 | 1 | A->G:1 | missense_variant | QHI42199.1:p.13I>M | gene-ORF10:c.39atA>atG | MODERATE |
| 2019-nCoV_29749 | 3'UTR | 1 | ACGATCGAGTG->A:1 | downstream_gene_variant | QHI42199.1 | gene-ORF10 | MODIFIER;DISTANCE=76 |
| 2019-nCoV_29854 | 3'UTR | 1 | C->T:1 | intergenic_variant | - | - | MODIFIER |
| 2019-nCoV_29856 | 3'UTR | 1 | T->A:1 | intergenic_variant | - | - | MODIFIER |
| 2019-nCoV_29868 | 3'UTR | 1 | GA->G:1 | intergenic_variant | - | - | MODIFIER |
| 2019-nCoV_29877 | 3'UTR | 1 | A->T:1 | intergenic_variant | - | - | MODIFIER |
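
The protein-level and CDS-level columns of Table S2 are linked by simple codon arithmetic: CDS position c.N falls in codon ⌊(N−1)/3⌋+1, and translating the reference and alternate codons gives the amino acid change. The following Python sketch reproduces that relationship for single-codon substitutions. It is an illustration only: the function name `protein_change` and the regular expression are assumptions made here, the standard genetic code is assumed, and this is not the annotation pipeline used to build the table.

```python
import re
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Matches entries such as "gene-S:c.1841gAt>gGt" (hypothetical parser for
# the single-codon substitution notation seen in the table).
PATTERN = re.compile(r"c\.(\d+)([A-Za-z]{3})>([A-Za-z]{3})")


def protein_change(cds_change: str) -> str:
    """Derive the protein-level change from a CDS-level codon change."""
    match = PATTERN.search(cds_change)
    if match is None:
        raise ValueError(f"unsupported notation: {cds_change!r}")
    cds_pos = int(match.group(1))
    ref_codon = match.group(2).upper()
    alt_codon = match.group(3).upper()

    aa_pos = (cds_pos - 1) // 3 + 1        # 1-based codon index
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE.get(alt_codon)    # None for ambiguity codes (W, Y, S, ...)

    if alt_aa is None:
        return f"p.{aa_pos}-"              # effect cannot be resolved
    if alt_aa == ref_aa:
        return f"p.{aa_pos}{ref_aa}"       # synonymous
    return f"p.{aa_pos}{ref_aa}>{alt_aa}"  # missense or stop gained


if __name__ == "__main__":
    for entry in ("gene-S:c.1841gAt>gGt", "gene-ORF7a:c.184Caa>Taa",
                  "gene-orf1ab:c.318gtC>gtT", "gene-orf1ab:c.225gaT>gaW"):
        print(entry, "->", protein_change(entry))
```

Multi-base events such as the in-frame deletion at 2019-nCoV_358 use a different notation and are deliberately rejected by this sketch.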



References

WHO. Novel Coronavirus (2019-nCoV). https://www.who.int/emergencies/diseases/novel-coronavirus-2019, 2020.

Xu XT, Chen P, Wang JF, Feng JN, Zhou H, Li X, Zhong W, Hao P. Evolution of the novel coronavirus from the ongoing Wuhan outbreak and modeling of its spike protein for risk of human transmission. Sci China Life Sci, 2020, doi: 10.1007/s11427-020-1637-5. PMID: 32009228.

Zhou P, Yang XL, Wang XG, Hu B, Zhang L, Zhang W, Si HR, Zhu Y, Li B, Huang CL, Chen HD, Chen J, Luo Y, Guo H, Jiang RD, Liu MQ, Chen Y, Shen XR, Wang X, Zheng XS, Zhao K, Chen QJ, Deng F, Liu LL, Yan B, Zhan FX, Wang YY, Xiao GF, Shi ZL. Discovery of a novel coronavirus associated with the recent pneumonia outbreak in humans and its potential bat origin. bioRxiv, 2020, doi: 10.1101/2020.01.22.914952.

Ji W, Wang W, Zhao XF, Zai JJ, Li XG. Homologous recombination within the spike glycoprotein of the newly identified coronavirus may boost cross-species transmission from snake to human. J Med Virol, 2020, doi: 10.1002/jmv.25682.

Dong N, Yang XM, Ye LW, Chen KC, Chan EWC, Yang MS, Chen S. Genomic and protein structure modelling analysis depicts the origin and infectivity of 2019-nCoV, a new coronavirus which caused a pneumonia outbreak in Wuhan, China. bioRxiv, 2020, doi: 10.1101/2020.01.20.913368.

Benvenuto D, Giovanetti M, Ciccozzi A, Spoto S, Angeletti S, Ciccozzi M. The 2019-new Coronavirus epidemic: evidence for virus evolution. J Med Virol, 2020, doi: 10.1101/2020.01.24.915157. PMID: 31994738.

Chen JY, Shi JS, Qiu DA, Liu C, Li X, Zhao Q, Ruan JS, Gao S. Bioinformatics analysis of the Wuhan 2019 human coronavirus genome. Chin J Bio, 2020, doi: 10.12113/202001007. (in Chinese)

Heymann DL. Data sharing and outbreaks: best practice exemplified. Lancet, 2020, doi: 10.1016/S0140-6736(20)30184-7. PMID: 31986258.

Munster VJ, Koopmans M, van Doremalen N, van Riel D, de Wit E. A novel coronavirus emerging in China - key questions for impact assessment. N Engl J Med, 2020, doi: 10.1056/NEJMp2000929.

Sayers EW, Beck J, Brister JR, Bolton EE, Canese K, Comeau DC, Funk K, Ketter A, Kim S, Kimchi A, Kitts PA, Kuznetsov A, Lathrop S, Lu Z, McGarvey K, Madden TL, Murphy TD, O'Leary N, Phan L, Schneider VA, Thibaud-Nissen F, Trawick BW, Pruitt KD, Ostell J. Database resources of the National Center for Biotechnology Information. Nucleic Acids Res, 2020, 48(D1): D9-D16. doi: 10.1093/nar/gkz899. PMID: 31602479.

Shu YL, McCauley J. GISAID: Global initiative on sharing all influenza data-from vision to reality. Euro Surveill, 2017, 22(13): 30494. doi: 10.2807/1560-7917.ES.2017.22.13.30494. PMID: 28382917.

Wang B, Liu F, Zhang EC, Wo CL, Chen J, Qian PY, Lu HR, Zeng WJ, Chen T, Wei JP, Wan Q, Wang R, Xu X. The China National GeneBank: owned by all, completed by all and shared by all. Hereditas (Beijing), 2019, 41(8): 761-772. (in Chinese)

Wu LH, Sun QL, Desmeth P, Sugawara H, Xu ZH, McCluskey K, Smith D, Alexander V, Lima N, Ohkuma M, Robert V, Zhou YG, Li JH, Fan GM, Ingsriswang S, Ozerskaya S, Ma JC. World data centre for microorganisms: an information infrastructure to explore and utilize preserved microbial strains worldwide. Nucleic Acids Res, 2017, 45(D1): D611-D618. doi: 10.1093/nar/gkw903. PMID: 28053166.

Zhang Z, Bao Y, Zhao W, Xiao J, Chen R, Zhang G, Li Y, Zhao G, Pervaiz N, Li R, Gao F, Zhi X, Lu Y, Liu L, He S, Li Q, Yuan C, Ma L, Xiao Y, Wang J, Hao Y, Wang Q, Shang Y, Zhang Y, Yuan N, Song S, Tian F, Sun L, Teng Y, Sun X, Chen H, Xue Y, Zhang Q, Teng X, Huang Z, Wang H, Zhu T, Zhang C, Ma Y, Zhang X, Lin S, Gao Y, Zhou J, Guo J, Liu X, Kang H, Tian D, Gao G, Ling Y, Xu S, Wang P, Zhou H, Niu Y, Ruan C, Lv D, Dong L, Zhu Q, Abbasi AA, Tang Q, Li H, Yao L, Chen M, Gao Q, Cao R, Guo Y, Zhai S, Shi S, Guo AY, Shireen H, Miao YR, Jin JP, Qian Q, Wang Y, Cao J, Duan G, Ning Z, Yu L, Li Z, Du Q, Wu W, Zhou Q, Hu H, Wang G, Wu S, Li CY, Zhao F, Xiong Z, Wang C, Gong Z, Zeng J, Yuan L, Xia X, Sun M, Batool F, Xue H, Sang J, Du Z, Wang X, Lan L, Fang S, Cui Q, Wang Z, Hao L, Liu W, Jiang Z, Zhang H, Raza RZ, Wu Y, Luo H, Zhang YE, Zhu J, Jiang M, Li M, Ying C, Li X, Li C, Zhao Y, Kang Q, Klenk HP, Zheng Y, Yang F, Tang B, Zhang P, Chen X, Zhang L, Zhao L, Tu Y, Chen T, Zou D, Zhang S, Ning W, Niu G, Guo H, Yan J, Shi Y, Sun Y, Pan M, Lu M, Ji P, Peng D, Yuan H. Database resources of the National Genomics Data Center in 2020. Nucleic Acids Res, 2020, 48(D1): D24-D33. doi: 10.1093/nar/gkz913. PMID: 31702008.

Wang YQ, Song FH, Zhu JW, Zhang SS, Yang YD, Chen TT, Tang BX, Dong LL, Ding N, Zhang Q, Bai ZX, Dong XN, Chen HX, Sun MY, Zhai S, Sun YB, Yu L, Lan L, Xiao JF, Fang XD, Lei HX, Zhang Z, Zhao WM. GSA: genome sequence archive. Genomics Proteomics Bioinformatics, 2017, 15(1): 14-18. doi: 10.1016/j.gpb.2017.01.001. PMID: 28387199.

Zhang SS, Chen TT, Zhu JW, Zhou Q, Chen X, Wang YQ, Zhao WM. GSA: genome sequence archive. Hereditas (Beijing), 2018, 40(11): 1044-1047. (in Chinese)

Zhang YS, Xia L, Sang J, Li M, Liu L, Li MW, Niu GY, Cao JB, Teng XF, Zhou Q, Zhang Z. The BIG Data Center’s database resources. Hereditas (Beijing), 2018, 40(11): 1039-1043. (in Chinese)

Ma YK, Bao YM. Prospects for national biological big data centers. Hereditas (Beijing), 2018, 40(11): 938-943. (in Chinese)

Edgar RC. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res, 2004, 32(5): 1792-1797. doi: 10.1093/nar/gkh340. PMID: 15034147.

Bodenhofer U, Bonatesta E, Horejš-Kainrath C, Hochreiter S. msa: an R package for multiple sequence alignment. Bioinformatics, 2015, 31(24): 3997-3999. doi: 10.1093/bioinformatics/btv494. PMID: 26315911.

Donlin MJ. Using the generic genome browser (GBrowse). Curr Protoc Bioinformatics, 2009, 28(1): 9.9.1-9.9.25.

McLaren W, Gil L, Hunt SE, Riat HS, Ritchie GRS, Thormann A, Flicek P, Cunningham F. The Ensembl Variant Effect Predictor. Genome Biol, 2016, 17: 122. doi: 10.1186/s13059-016-0974-4. PMID: 27268795.
