assembly_id	genome_id	genome_def	crispr_array_locus_merge	crispr_array_location_merge	crispr_locus_id	crispr_pred_method	array_in_prot	prot_within_array_20000	prot_in_genome	crispr_type_by_cas_prot	consensus_repeat	repeat_length	self-targeting_spacer_number	self-targeting_target_number	spacer_location	protospacer_location	repeat_type	spacer_locus_num	spacer_num	correct_crispr_type	genome_cas_prots	unknown_protein_around_crispr	L10	L10_domain	L9	L9_domain	L8	L8_domain	L7	L7_domain	L6	L6_domain	L5	L5_domain	L4	L4_domain	L3	L3_domain	L2	L2_domain	L1	L1_domain	R1	R1_domain	R2	R2_domain	R3	R3_domain	R4	R4_domain	R5	R5_domain	R6	R6_domain	R7	R7_domain	R8	R8_domain	R9	R9_domain	R10	R10_domain
GCF_000014125.1_ASM1412v1	NC_008593	Clostridium novyi NT, complete sequence	1	152250-152357	1	CRISPRCasFinder	no	cas3	DinG,cas3,csa3,DEDDh,cas2,cas1,cas4,cas5,cas7b,cas8b1,cas6	Unclear	ACCCGAAGGTCGTAGGTTCAAGTCCT	26	0	0	NA	NA	NA	1	1	Unclear	DinG,cas3,csa3,DEDDh,cas2,cas1,cas4,cas5,cas7b,cas8b1,cas6	NA|91aa|up_1|NC_008593.1_151225_151498_+,NA	NA|349aa|up_9|NC_008593.1_144479_145526_+	PRK00059, prsA, peptidylprolyl isomerase; Provisional	NA|184aa|up_8|NC_008593.1_145936_146488_+	TIGR02851, stage_V_sporulation_protein_T, stage V sporulation protein T	NA|510aa|up_7|NC_008593.1_146585_148115_+	cd13124, MATE_SpoVB_like, Stage V sporulation protein B, also known as Stage III sporulation protein F, and related proteins	NA|482aa|up_6|NC_008593.1_148146_149592_+	COG3956, COG3956, Protein containing tetrapyrrole methyltransferase domain and MazG-like (predicted pyrophosphatase) domain [General function prediction only]	NA|93aa|up_5|NC_008593.1_149699_149978_+	cd13831, HU, histone-like DNA-binding protein HU	NA|90aa|up_4|NC_008593.1_150067_150337_+	COG1188, COG1188, Ribosome-associated heat shock protein implicated in the recycling of the 50S subunit (S4 paralog) [Translation, ribosomal structure and biogenesis]	NA|98aa|up_3|NC_008593.1_150452_150746_+	TIGR02892, conserved_hypothetical_protein, sporulation protein YabP	NA|146aa|up_2|NC_008593.1_150751_151189_+	TIGR02893, Spore_protein_YabQ, spore cortex biosynthesis protein YabQ	NA|91aa|up_1|NC_008593.1_151225_151498_+	NA	NA|136aa|up_0|NC_008593.1_151552_151960_+	PRK05807, PRK05807, RNA-binding protein S1	NA|799aa|down_0|NC_008593.1_153013_155410_+	TIGR02865, Stage_II_sporulation_protein_E, stage II sporulation protein E	NA|470aa|down_1|NC_008593.1_155530_156940_+	cd01992, PP-ATPase, N-terminal domain of predicted ATPase of the PP-loop faimly implicated in cell cycle control [Cell division and chromosome partitioning]	NA|180aa|down_2|NC_008593.1_156941_157481_+	COG0634, Hpt, Hypoxanthine-guanine phosphoribosyltransferase [Nucleotide transport and metabolism]	NA|677aa|down_3|NC_008593.1_157551_159582_+	TIGR01241, ATP-dependent_zinc_metalloprotease_FtsH, ATP-dependent metalloprotease FtsH	NA|129aa|down_4|NC_008593.1_159724_160111_+	COG5496, COG5496, Predicted thioesterase [General function prediction only]	NA|557aa|down_5|NC_008593.1_160212_161883_+	pfam01268, FTHFS, Formate--tetrahydrofolate ligase	NA|260aa|down_6|NC_008593.1_162006_162786_+	PRK13318, PRK13318, type III pantothenate kinase	NA|322aa|down_7|NC_008593.1_162794_163760_+	TIGR00737, Probable_tRNA-dihydrouridine_synthase, putative TIM-barrel protein, nifR3 family	NA|160aa|down_8|NC_008593.1_163988_164468_+	PRK00226, greA, transcription elongation factor GreA; Reviewed	NA|496aa|down_9|NC_008593.1_164562_166050_+	PRK00484, lysS, lysyl-tRNA synthetase; Reviewed
GCF_000014125.1_ASM1412v1	NC_008593	Clostridium novyi NT, complete sequence	2	1853852-1855193	2,1,1	CRISPRCasFinder,CRT,PILER-CR	no	cas2,cas1,cas4,cas3,cas5,cas7b,cas8b1,cas6	DinG,cas3,csa3,DEDDh,cas2,cas1,cas4,cas5,cas7b,cas8b1,cas6	Type I-B	ATTTAAATACATCTCATGTTAATGTTCAAC,ATTTAAATACATCTCATGTTAATGTTCAAC,ATTTAAATACATCTCATGTTAATGTTCAAC	30,30,30	2	2	1853882-1853917|1853884-1853919	NC_008593.1_1361923-1361958|NC_008593.1_1361923-1361958	III-B:III-B:III-B	20,20,20	20	TypeI-B	DinG,cas3,csa3,DEDDh,cas2,cas1,cas4,cas5,cas7b,cas8b1,cas6	NA|100aa|up_5|NC_008593.1_1847332_1847632_-,NA|53aa|down_8|NC_008593.1_1864256_1864415_-	NA|376aa|up_9|NC_008593.1_1841881_1843009_-	TIGR02887, Spore_germination_protein_B3, germination protein, Ger(x)C family	NA|371aa|up_8|NC_008593.1_1843014_1844127_-	pfam03845, Spore_permease, Spore germination protein	NA|497aa|up_7|NC_008593.1_1844110_1845601_-	pfam03323, GerA, Bacillus/Clostridium GerA spore germination protein	NA|424aa|up_6|NC_008593.1_1845782_1847054_-	COG2233, UraA, Xanthine/uracil permeases [Nucleotide transport and metabolism]	NA|100aa|up_5|NC_008593.1_1847332_1847632_-	NA	NA|199aa|up_4|NC_008593.1_1847799_1848396_+	COG0605, SodA, Superoxide dismutase [Inorganic ion transport and metabolism]	NA|534aa|up_3|NC_008593.1_1848456_1850058_-	cd01031, EriC, ClC chloride channel EriC	NA|242aa|up_2|NC_008593.1_1850434_1851160_+	cd08563, GDPD_TtGDE_like, Glycerophosphodiester phosphodiesterase domain of Thermoanaerobacter tengcongensis and similar proteins	NA|457aa|up_1|NC_008593.1_1851189_1852560_+	COG1115, AlsT, Na+/alanine symporter [Amino acid transport and metabolism]	NA|318aa|up_0|NC_008593.1_1852647_1853601_-	COG2333, ComEC, Predicted hydrolase (metallo-beta-lactamase superfamily) [General function prediction only]	cas2|96aa|down_0|NC_008593.1_1855377_1855665_-	cd09725, Cas2_I_II_III, CRISPR/Cas system-associated protein Cas2	cas1|333aa|down_1|NC_008593.1_1855664_1856663_-	cd09722, Cas1_I-B, CRISPR/Cas system-associated protein Cas1	cas4|165aa|down_2|NC_008593.1_1856675_1857170_-	pfam01930, Cas_Cas4, Domain of unknown function DUF83	cas3|874aa|down_3|NC_008593.1_1857178_1859800_-	cd09639, Cas3_I, CRISPR/Cas system-associated protein Cas3	cas5|256aa|down_4|NC_008593.1_1859860_1860628_-	TIGR02592, hypothetical_protein_CTC_01466, CRISPR-associated protein Cas5, subtype I-B/HMARI	cas7b|339aa|down_5|NC_008593.1_1860630_1861647_-	pfam05107, Cas_Cas7, CRISPR-associated protein Cas7	cas8b1|595aa|down_6|NC_008593.1_1861639_1863424_-	TIGR02591, cas_Csh1, CRISPR-associated protein Cas8b/Csh1, subtype I-B/HMARI	cas6|231aa|down_7|NC_008593.1_1863436_1864129_-	TIGR01877, CRISPR-associated_endoribonuclease_Cas6_1, CRISPR-associated endoribonuclease Cas6	NA|53aa|down_8|NC_008593.1_1864256_1864415_-	NA	NA|105aa|down_9|NC_008593.1_1866485_1866800_-	pfam10387, DUF2442, Protein of unknown function (DUF2442)
GCF_000014125.1_ASM1412v1	NC_008593	Clostridium novyi NT, complete sequence	3	1864520-1866264	3,2,2	CRISPRCasFinder,CRT,PILER-CR	no	cas2,cas1,cas4,cas3,cas5,cas7b,cas8b1,cas6	DinG,cas3,csa3,DEDDh,cas2,cas1,cas4,cas5,cas7b,cas8b1,cas6	Type I-B	ATTTAAATACATCTCATGTTAATGTTCAAC,ATTTAAATACATCCCATGTTATTGTTCAAC,ATTTAAATACATCCCATGTTATTGTTCAAC	30,30,30	0	0	NA	NA	III-B:III-B:III-B	26,25,24	26	TypeI-B	DinG,cas3,csa3,DEDDh,cas2,cas1,cas4,cas5,cas7b,cas8b1,cas6	NA|53aa|up_0|NC_008593.1_1864256_1864415_-,NA	NA|318aa|up_9|NC_008593.1_1852647_1853601_-	COG2333, ComEC, Predicted hydrolase (metallo-beta-lactamase superfamily) [General function prediction only]	cas2|96aa|up_8|NC_008593.1_1855377_1855665_-	cd09725, Cas2_I_II_III, CRISPR/Cas system-associated protein Cas2	cas1|333aa|up_7|NC_008593.1_1855664_1856663_-	cd09722, Cas1_I-B, CRISPR/Cas system-associated protein Cas1	cas4|165aa|up_6|NC_008593.1_1856675_1857170_-	pfam01930, Cas_Cas4, Domain of unknown function DUF83	cas3|874aa|up_5|NC_008593.1_1857178_1859800_-	cd09639, Cas3_I, CRISPR/Cas system-associated protein Cas3	cas5|256aa|up_4|NC_008593.1_1859860_1860628_-	TIGR02592, hypothetical_protein_CTC_01466, CRISPR-associated protein Cas5, subtype I-B/HMARI	cas7b|339aa|up_3|NC_008593.1_1860630_1861647_-	pfam05107, Cas_Cas7, CRISPR-associated protein Cas7	cas8b1|595aa|up_2|NC_008593.1_1861639_1863424_-	TIGR02591, cas_Csh1, CRISPR-associated protein Cas8b/Csh1, subtype I-B/HMARI	cas6|231aa|up_1|NC_008593.1_1863436_1864129_-	TIGR01877, CRISPR-associated_endoribonuclease_Cas6_1, CRISPR-associated endoribonuclease Cas6	NA|53aa|up_0|NC_008593.1_1864256_1864415_-	NA	NA|105aa|down_0|NC_008593.1_1866485_1866800_-	pfam10387, DUF2442, Protein of unknown function (DUF2442)	NA|87aa|down_1|NC_008593.1_1866811_1867072_-	pfam13711, DUF4160, Domain of unknown function (DUF4160)	NA|266aa|down_2|NC_008593.1_1870714_1871512_-	pfam01841, Transglut_core, Transglutaminase-like superfamily	NA|198aa|down_3|NC_008593.1_1871704_1872298_-	cd06166, Sortase_D_2, Sortase domain found in subfamily 2 of the class D family of sortases	NA|905aa|down_4|NC_008593.1_1872367_1875082_-	pfam17961, Big_8, Bacterial Ig domain	NA|393aa|down_5|NC_008593.1_1875397_1876576_-	pfam07907, YibE_F, YibE/F-like protein	NA|686aa|down_6|NC_008593.1_1876649_1878707_-	PRK09419, PRK09419, multifunctional 2',3'-cyclic-nucleotide 2'-phosphodiesterase/3'-nucleotidase/5'-nucleotidase	NA|277aa|down_7|NC_008593.1_1878856_1879687_-	pfam09370, PEP_hydrolase, Phosphoenolpyruvate hydrolase-like	NA|404aa|down_8|NC_008593.1_1879709_1880921_-	pfam06792, UPF0261, Uncharacterized protein family (UPF0261)	NA|392aa|down_9|NC_008593.1_1881112_1882288_-	pfam09370, PEP_hydrolase, Phosphoenolpyruvate hydrolase-like
GCF_000014125.1_ASM1412v1	NC_008593	Clostridium novyi NT, complete sequence	4	1867121-1870431	3,4,3	PILER-CR,CRISPRCasFinder,CRT	no	cas2,cas1,cas4,cas3,cas5,cas7b,cas8b1,cas6	DinG,cas3,csa3,DEDDh,cas2,cas1,cas4,cas5,cas7b,cas8b1,cas6	Type I-B	ATTTAAATACATCTCATGTTAATGTTCAAC,ATTTAAATACATCTCATGTTAATGTTCAAC,ATTTAAATACATCTCATGTTAATGTTCAAC	30,30,30	0	0	NA	NA	III-B:III-B:III-B	50,50,50	50	TypeI-B	DinG,cas3,csa3,DEDDh,cas2,cas1,cas4,cas5,cas7b,cas8b1,cas6	NA|53aa|up_2|NC_008593.1_1864256_1864415_-,NA	cas1|333aa|up_9|NC_008593.1_1855664_1856663_-	cd09722, Cas1_I-B, CRISPR/Cas system-associated protein Cas1	cas4|165aa|up_8|NC_008593.1_1856675_1857170_-	pfam01930, Cas_Cas4, Domain of unknown function DUF83	cas3|874aa|up_7|NC_008593.1_1857178_1859800_-	cd09639, Cas3_I, CRISPR/Cas system-associated protein Cas3	cas5|256aa|up_6|NC_008593.1_1859860_1860628_-	TIGR02592, hypothetical_protein_CTC_01466, CRISPR-associated protein Cas5, subtype I-B/HMARI	cas7b|339aa|up_5|NC_008593.1_1860630_1861647_-	pfam05107, Cas_Cas7, CRISPR-associated protein Cas7	cas8b1|595aa|up_4|NC_008593.1_1861639_1863424_-	TIGR02591, cas_Csh1, CRISPR-associated protein Cas8b/Csh1, subtype I-B/HMARI	cas6|231aa|up_3|NC_008593.1_1863436_1864129_-	TIGR01877, CRISPR-associated_endoribonuclease_Cas6_1, CRISPR-associated endoribonuclease Cas6	NA|53aa|up_2|NC_008593.1_1864256_1864415_-	NA	NA|105aa|up_1|NC_008593.1_1866485_1866800_-	pfam10387, DUF2442, Protein of unknown function (DUF2442)	NA|87aa|up_0|NC_008593.1_1866811_1867072_-	pfam13711, DUF4160, Domain of unknown function (DUF4160)	NA|266aa|down_0|NC_008593.1_1870714_1871512_-	pfam01841, Transglut_core, Transglutaminase-like superfamily	NA|198aa|down_1|NC_008593.1_1871704_1872298_-	cd06166, Sortase_D_2, Sortase domain found in subfamily 2 of the class D family of sortases	NA|905aa|down_2|NC_008593.1_1872367_1875082_-	pfam17961, Big_8, Bacterial Ig domain	NA|393aa|down_3|NC_008593.1_1875397_1876576_-	pfam07907, YibE_F, YibE/F-like protein	NA|686aa|down_4|NC_008593.1_1876649_1878707_-	PRK09419, PRK09419, multifunctional 2',3'-cyclic-nucleotide 2'-phosphodiesterase/3'-nucleotidase/5'-nucleotidase	NA|277aa|down_5|NC_008593.1_1878856_1879687_-	pfam09370, PEP_hydrolase, Phosphoenolpyruvate hydrolase-like	NA|404aa|down_6|NC_008593.1_1879709_1880921_-	pfam06792, UPF0261, Uncharacterized protein family (UPF0261)	NA|392aa|down_7|NC_008593.1_1881112_1882288_-	pfam09370, PEP_hydrolase, Phosphoenolpyruvate hydrolase-like	NA|333aa|down_8|NC_008593.1_1882488_1883487_-	pfam14057, GGGtGRT, GGGtGRT protein	NA|231aa|down_9|NC_008593.1_1883504_1884197_-	COG0822, IscU, NifU homolog involved in Fe-S cluster formation [Energy production and conversion]
GCF_000014125.1_ASM1412v1	NC_008593	Clostridium novyi NT, complete sequence	5	2514547-2514651	5	CRISPRCasFinder	no		DinG,cas3,csa3,DEDDh,cas2,cas1,cas4,cas5,cas7b,cas8b1,cas6	Orphan	TTAGTACACATTTTTTATCTGTACT	25	0	0	NA	NA	NA	1	1	Orphan	DinG,cas3,csa3,DEDDh,cas2,cas1,cas4,cas5,cas7b,cas8b1,cas6	NA|239aa|up_5|NC_008593.1_2508130_2508847_-,NA|123aa|down_3|NC_008593.1_2517321_2517690_-,NA|103aa|down_5|NC_008593.1_2518260_2518569_-	NA|85aa|up_9|NC_008593.1_2504455_2504710_-	pfam07441, BofA, SigmaK-factor processing regulatory protein BofA	NA|199aa|up_8|NC_008593.1_2504834_2505431_-	PRK00076, recR, recombination protein RecR; Reviewed	NA|115aa|up_7|NC_008593.1_2505446_2505791_-	PRK00153, PRK00153, YbaB/EbfC family nucleoid-associated protein	NA|540aa|up_6|NC_008593.1_2505872_2507492_-	PRK05563, PRK05563, DNA polymerase III subunits gamma and tau; Validated	NA|239aa|up_5|NC_008593.1_2508130_2508847_-	NA	NA|148aa|up_4|NC_008593.1_2508908_2509352_-	COG0590, CumB, Cytosine/adenosine deaminases [Nucleotide transport and metabolism / Translation, ribosomal structure and biogenesis]	NA|245aa|up_3|NC_008593.1_2509639_2510374_+	pfam09529, Intg_mem_TP0381, Integral membrane protein (intg_mem_TP0381)	NA|296aa|up_2|NC_008593.1_2510440_2511328_+	PRK00723, PRK00723, phosphatidylserine decarboxylase; Provisional	NA|244aa|up_1|NC_008593.1_2511322_2512054_-	cd07716, RNaseZ_short-form-like_MBL-fold, uncharacterized bacterial subgroup of Ribonuclease Z, short form; MBL-fold metallo-hydrolase domain	NA|607aa|up_0|NC_008593.1_2512288_2514109_-	cd08579, GDPD_memb_like, Glycerophosphodiester phosphodiesterase domain of uncharacterized bacterial glycerophosphodiester phosphodiesterases	NA|423aa|down_0|NC_008593.1_2514694_2515963_-	COG1783, XtmB, Phage terminase large subunit [General function prediction only]	NA|123aa|down_1|NC_008593.1_2515991_2516360_-	pfam03592, Terminase_2, Terminase small subunit	NA|296aa|down_2|NC_008593.1_2516434_2517322_-	pfam05065, Phage_capsid, Phage capsid family	NA|123aa|down_3|NC_008593.1_2517321_2517690_-	NA	NA|180aa|down_4|NC_008593.1_2517708_2518248_-	pfam02732, ERCC4, ERCC4 domain	NA|103aa|down_5|NC_008593.1_2518260_2518569_-	NA	NA|725aa|down_6|NC_008593.1_2518665_2520840_-	TIGR01613, putative_primase, phage/plasmid primase, P4 family, C-terminal domain	NA|63aa|down_7|NC_008593.1_2520848_2521037_-	pfam12728, HTH_17, Helix-turn-helix domain	NA|148aa|down_8|NC_008593.1_2521272_2521716_+	cd00093, HTH_XRE, Helix-turn-helix XRE-family like proteins	NA|388aa|down_9|NC_008593.1_2521850_2523014_+	cd01189, INT_ICEBs1_C_like, C-terminal catalytic domain of integrases from bacterial phages and conjugate transposons
