Application of machine learning in the CRISPR/Cas9 system
Zhang Guishan, Yang Yong, Zhang Lingmin, Dai Xianhua
Table 1 Datasets commonly used by machine learning methods for optimizing the CRISPR/Cas9 system
Tool Year Cas type Dataset| Dataset source Dataset URL Reference
Elevation 2018 Cas9 FC, “Schönig”, “Concordet”, “Eschstruth”, “Shkumatava”| U2OS, HEK293 https://www.ncbi.nlm.nih.gov/sra?term=SRP117146%5BAccession%5D https://github.com/maximilianh/crisporWebsite [13]
DeepCpf1 2018 Cpf1 HT 1-2, HT 2, HT 3, HEK-lenti, HEK-plasmid, HCT-plasmid| HEK293T cells http://www.rgenome.net/cpf1-database [14]
CRISTA 2017 Cas9 GUIDE-seq data, BLESS data, HTGTS data| HEK293, U2OS http://crista.tau.ac.il/ [52]
CRISPRpred 2017 Cas9 FC, RES http://research.microsoft.com/en-us/projects/azimuth/ [60]
sgRNA Designer (Rule Set 2) 2016 Cas9 FC, RES| A375, HL60, KBM7, mouse ESC JM8 https://www.nature.com/articles/nbt.3437#supplementary-information [22]
predictSGRNA 2017 Cas9 ribosomal genes, non-ribosomal genes, essential genes| HL-60, KBM-7, mouse ESC JM8 http://genome.cshlp.org/content/25/8/1147/suppl/DC1 http://www.sciencemag.org/content/343/6166/80/suppl/DC1 http://www.nature.com/nbt/journal/v32/n3/full/nbt.2800.html#supplementary-information [23]
Big Papi 2017 Cas9 A375, 293T, MOLM13 https://github.com/mhegde [16]
2017 Cas9 FC, RES, UniRef100 http://research.microsoft.com/en-us/projects/azimuth [24]
sgRNA Scorer 2.0 2017 Cas9, Cpf1 293T https://pubs.acs.org/doi/abs/10.1021/acssynbio.6b00343 [62]
CRISPR-DO 2016 Cas9 full human (GRCh37/hg19, GRCh38/hg38), mouse (NCBI37/mm9, GRCm38/mm10), zebrafish (danRer7), fly (dm6), worm (ce10)| HL60, 293T, KBM7 https://www.ncbi.nlm.nih.gov/pubmed/?term=CRISPR-DO+for+genome-wide+CRISPR+design+and+optimization [63]
CRISPR MultiTargeter 2015 Cas9 BioMart, zebrafish ohnologs https://github.com/SergeyPry/CRISPR_MultiTargeter [64]
CRISPRscan 2015 Cas9 One-cell-stage zebrafish embryos, germ https://www.nature.com/articles/nmeth.3543 [46]
WU-CRISPR 2015 Cas9 FC| 293T, K562, A549, HepG2, SKNAS, U2OS, PGP1-iPS, HEK293 https://www.ncbi.nlm.nih.gov/sra/ [6]
CRISPR (SSC) 2015 Cas9 HL60, KBM7, ABL1, BCR, 293T, LNCaP-abl, mouse ESC JM8| A promoter-level mammalian expression atlas; Determinants of nucleosome organization in primary human cells; An integrated encyclopedia of DNA elements in the human genome http://fantom.gsc.riken.jp/5/datafiles/phase1.3/extra/TSS_classifier/ https://www.encodeproject.org/files/ENCFF000VNN https://www.encodeproject.org/files/ENCFF000TLU [65]
CRISPRko 2014 Cas9 FC| A375, EL4, AML, MOLM13, NB4, TF1 https://www.nature.com/articles/nbt.3026#supplementary-information [45]
2014 Cas9 HL60, 293T, KBM7, DH5α https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3972032/#SD1 [47]
sgRNA Scorer 1.0 2015 Cas9 DNase-seq (GSM1008573), H3K4-trimethylation (GSM945288)| 293T, K562, A549, HepG2, SKNAS, U2OS, PGP1-iPS, HEK293 http://arep.med.harvard.edu/CasFinder/ [66]
CRoatan 2017 Cas9 FC| A375, K562 http://dx.doi.org/10.1016/j.molcel.2017.06.030 [61]
TKOv3 2017 Cas9 essential genes, nonessential genes| CEG2, KBM7, HL60, RPE1, DLD1, GBM, HAP1, HCT116, RPE1dTP53 http://tko.ccbr.utoronto.ca/ [18]
BAGEL 2016 Cas9 essential genes, nonessential genes| GBM, HCT116, HeLa, RPE1 http://tko.ccbr.utoronto.ca/ [20]
CRISPRiaDesign 2016 Cas9 FANTOM Consortium; ENCODE Consortium (accession no. ENCFF000VNN); ENCODE Consortium (accession no. ENCFF000TLU)| K562, HEK293T; A promoter-level mammalian expression atlas; Determinants of nucleosome organization in primary human cells http://fantom.gsc.riken.jp/5/datafiles/phase1.3/extra/TSS_classifier/ https://www.encodeproject.org/files/ENCFF000VNN https://www.encodeproject.org/files/ENCFF000TLU [8]
CRISPRstrand 2014 Cas REPEATSLange, REPEATSKunin, REPEATSShah http://www.ncbi.nlm.nih.gov/ [19]
H1/H2 library 2018 Cas9 K562, Raji, Jiyoye, KBM7 https://doi.org/10.1093/bioinformatics/bty450 [67]