The database contains a comprehensive set of characteristics for gRNA spacer sequences designed to knock out genes in the genome of C. bursa-pastoris accession PGL0001 (alias 'msu-wt'; GenBank assembly GCA_001974645.2), including calculated on-target (DeepSpCas9 and DeepHF) and off-target (CFD and MIT) activity metrics. The sequences have been prefiltered according to the following criteria:
The spacer's GC content should be between 20% and 80%
The spacer should be located within 5-65% of the length of the gene's CDS
The spacer should not contain polyT sequences (runs of four or more consecutive T bases)
The spacer should not have sequence complementarity with the SpCas9 sgRNA hairpin backbone or with itself
The spacer should not be located at overlapping sites of coding regions of different genes
The spacer cut site should be located in the CDS region of all coding isoforms of the target gene
The DeepSpCas9 and DeepHF on-target scores should both exceed 0.2, and the aggregated CFD off-target score (over all off-targets with at most 3 mismatches) should also exceed 0.2
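The sequence-level and score-level criteria above can be sketched in Python as follows. This is only an illustration, not the pipeline used to build the database: the `Spacer` record and its field names are hypothetical, scores are assumed to be on the same 0-1 scale as the 0.2 cut-offs, the inclusive range boundaries are an assumption, and the complementarity, gene-overlap and isoform checks are omitted because they require the genome annotation.

```python
from dataclasses import dataclass


@dataclass
class Spacer:
    # Hypothetical record layout; the database's real column names may differ.
    sequence: str         # spacer sequence, 5'->3'
    cds_fraction: float   # relative position within the CDS, 0.0-1.0
    deepspcas9: float     # on-target score, assumed scaled to 0-1
    deephf: float         # on-target score, assumed scaled to 0-1
    cfd_aggregate: float  # aggregated CFD off-target score (<= 3 mismatches)


def gc_percent(seq: str) -> float:
    """Return the GC content of seq as a percentage."""
    seq = seq.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def has_poly_t(seq: str, run: int = 4) -> bool:
    """Return True if seq contains `run` or more consecutive T bases."""
    return "T" * run in seq.upper()


def passes_filters(s: Spacer) -> bool:
    """Apply the GC, CDS-position, polyT and score thresholds listed above."""
    return (
        20.0 <= gc_percent(s.sequence) <= 80.0   # boundaries assumed inclusive
        and 0.05 <= s.cds_fraction <= 0.65       # boundaries assumed inclusive
        and not has_poly_t(s.sequence)
        and s.deepspcas9 > 0.2
        and s.deephf > 0.2
        and s.cfd_aggregate > 0.2
    )
```

For example, `passes_filters(Spacer("GACCTGAAGCTAGCTAGGCA", 0.30, 0.5, 0.5, 0.6))` evaluates the made-up spacer against all six thresholds at once; lowering any one score to 0.2 or below makes the same call return `False`.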
Homoeolog pairs were identified with OrthoFinder.
Funding
Ministry of Science and Higher Education, project no. 075-15-2021-1064