遗传 (Hereditas, Beijing), 2010, Vol. 32, Issue 12: 1296-1303. doi: 10.3724/SP.J.1005.2010.01296

• Techniques and Methods •

Establishment of target genomic DNA capturing system for next generation sequencing

CHEN Dan1, 2, ZHANG Wen2, ZHU Zhi-Dong2, HUANG Yin3, WANG Ping4, ZHOU Bei-Bei2, YANG Xiao-Nan2, XIAO Hua-Sheng2, ZHANG Qing-Hua1, 2

  1. State Key Laboratory of Medical Genomics, Ruijin Hospital, Shanghai Jiaotong University School of Medicine, Shanghai 200025, China;
    2. National Engineering Research Center for Biochip at Shanghai, Shanghai 201203, China;
    3. Chinese National Human Genome Center at Shanghai, Shanghai 201203, China;
    4. Department of Pathology, College of Basic Medical Science, Shanghai Jiaotong University School of Medicine, Shanghai 200025, China
  • Received: 2010-04-23 Revised: 2010-09-30 Online: 2010-12-20 Published: 2010-12-20
  • Corresponding author: ZHANG Qing-Hua E-mail: qinghua_zhang@shbiochip.com
  • Funding:

    Supported by the National Major Scientific Research Project (2006CB910402)

Abstract: This study aimed to establish a system for capturing and enriching target genomic DNA that can be combined with next-generation sequencing for deep sequencing of candidate gene regions. A total of 2 414 977 bp of human genomic sequence, covering 11 824 exons in 1 250 genes, was submitted to the Agilent eArray online platform to design 120-base capture probes (baits), which Agilent manufactured as a SureSelect solution-phase target capture reagent. Two human genomic DNA samples were used to construct sequencing libraries through the following steps: fragmentation by sonication, end blunting and phosphorylation, ligation of SOLiD adaptors, size selection of 150-200 bp fragments, hybridization with the baits, hybrid selection with magnetic beads, and PCR amplification. Before the SOLiD sequencing reaction, the libraries were amplified by emulsion PCR and enriched with P2 enrichment beads. The library samples were then loaded onto the SOLiD system either for Work Flow Analysis (WFA), which evaluates library quality, or for a full sequencing run with default parameters. In total, 46 509 baits were designed and synthesized for 11 147 exon regions and prepared as a SureSelect capture probe reagent. The probes effectively captured and enriched the target fragments from genomic DNA; real-time quantitative PCR showed up to 29-fold enrichment. WFA indicated that the libraries were suitable for SOLiD sequencing. In the sequencing data, 70% of the uniquely mapped sequence tags matched the target regions, and the average coverage of the target regions was above 200-fold. These results demonstrate that the established workflow of SureSelect target capture, enrichment, and library construction is feasible and that the resulting libraries can be sequenced directly on the SOLiD platform.

Key words: target capturing, next generation sequencing, solution phase hybridization, emulsion PCR
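
The figures quoted in the abstract allow a few back-of-the-envelope checks. The Python sketch below is illustrative only and is not part of the published method. It estimates the average bait tiling density from the bait count, bait length, and target size, and it shows how an on-target fraction and a mean coverage relate to read counts. The function names are hypothetical, and the 50-base read length is an assumption (the abstract does not state a read length). The tiling estimate also assumes the baits are spread over the full 2 414 977 bp, although the abstract reports baits for 11 147 regions only.

```python
# Figures taken from the abstract.
TARGET_BP = 2_414_977   # genomic sequence submitted for bait design (bp)
N_BAITS = 46_509        # baits designed and synthesized
BAIT_LEN = 120          # bases per bait
ON_TARGET = 0.70        # fraction of uniquely mapped tags on target

# Assumption for illustration only: not stated in the abstract.
ASSUMED_READ_LEN = 50   # bases per sequence tag


def tiling_density(n_baits: int, bait_len: int, target_bp: int) -> float:
    """Average number of bait bases per target base."""
    return n_baits * bait_len / target_bp


def mean_coverage(on_target_reads: float, read_len: int, target_bp: int) -> float:
    """Mean depth over the target, assuming every read lies fully on target."""
    return on_target_reads * read_len / target_bp


def on_target_reads_needed(depth: float, target_bp: int, read_len: int) -> float:
    """On-target reads required to reach a given mean depth."""
    return depth * target_bp / read_len


if __name__ == "__main__":
    density = tiling_density(N_BAITS, BAIT_LEN, TARGET_BP)
    print(f"average tiling density: {density:.2f} bait bases per target base")

    needed = on_target_reads_needed(200, TARGET_BP, ASSUMED_READ_LEN)
    total = needed / ON_TARGET  # scale up by the reported on-target fraction
    print(f"on-target reads for 200-fold mean depth: {needed:,.0f}")
    print(f"uniquely mapped reads at 70% on target:  {total:,.0f}")
```

Under these assumptions, the read counts are only order-of-magnitude guides; a different read length or uneven capture across exons would change them.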