assembly_id	genome_id	genome_def	crispr_array_locus_merge	crispr_array_location_merge	crispr_locus_id	crispr_pred_method	array_in_prot	prot_within_array_20000	prot_in_genome	crispr_type_by_cas_prot	consensus_repeat	repeat_length	self-targeting_spacer_number	self-targeting_target_number	spacer_location	protospacer_location	repeat_type	spacer_locus_num	spacer_num	correct_crispr_type	genome_cas_prots	unknown_protein_around_crispr	L10	L10_domain	L9	L9_domain	L8	L8_domain	L7	L7_domain	L6	L6_domain	L5	L5_domain	L4	L4_domain	L3	L3_domain	L2	L2_domain	L1	L1_domain	R1	R1_domain	R2	R2_domain	R3	R3_domain	R4	R4_domain	R5	R5_domain	R6	R6_domain	R7	R7_domain	R8	R8_domain	R9	R9_domain	R10	R10_domain
GCF_000021685.1_ASM2168v1	NC_011959	Thermomicrobium roseum DSM 5159, complete sequence	1	31267-36269	1,1,1	PILER-CR,CRISPRCasFinder,CRT	no	cas6,cas8b2,cas7,cas5,cas3,cas4,cas1,cas2	cas6,cas8b2,cas7,cas5,cas3,cas4,cas1,cas2,csa3,DEDDh,DinG,WYL,Cas9_archaeal	Unclear	GTTTCGACAGTACCTATGAGGGCTTGAAAC,GTTTCGACAGTACCTATGAGGGCTTGAAAC,GTTTCGACAGTACCTATGAGGGCTTGAAAC	30,30,30	0	0	NA	NA	NA:NA:NA	74,75,75	75	Unclear	cas6,cas8b2,cas7,cas5,cas3,cas4,cas1,cas2,csa3,DEDDh,DinG,WYL,Cas9_archaeal,csx1,cmr6gr7,cmr5gr11,cmr4gr7,cmr3gr5,cas10,csm3gr7	NA,NA|84aa|down_5|NC_011959.1_42581_42833_+,NA|123aa|down_6|NC_011959.1_43070_43439_+	cas7|356aa|up_9|NC_011959.1_19364_20432_+	TIGR02585, conserved_protein, CRISPR-associated protein Cas7/Cst2/DevR, subtype I-B/TNEAP	cas5|236aa|up_8|NC_011959.1_20454_21162_+	TIGR01895, conserved_hypothetical_protein, CRISPR-associated protein Cas5, subtype I-B/TNEAP	cas3|774aa|up_7|NC_011959.1_21151_23473_+	TIGR01587, CRISPR-associated_endonuclease/helicase_Cas3, CRISPR-associated helicase Cas3	cas4|196aa|up_6|NC_011959.1_23487_24075_+	pfam01930, Cas_Cas4, Domain of unknown function DUF83	cas1|330aa|up_5|NC_011959.1_24071_25061_+	TIGR03641, cas1_HMARI, CRISPR-associated endonuclease Cas1, subtype I-B/HMARI/TNEAP	cas2|88aa|up_4|NC_011959.1_25125_25389_+	cd09725, Cas2_I_II_III, CRISPR/Cas system-associated protein Cas2	NA|357aa|up_3|NC_011959.1_25385_26456_-	cd06061, PurM-like1, AIR synthase (PurM) related protein, subgroup 1 of unknown function	NA|203aa|up_2|NC_011959.1_26684_27293_-	cd01749, GATase1_PB, Glutamine Amidotransferase (GATase_I) involved in pyridoxine biosynthesis	NA|301aa|up_1|NC_011959.1_27303_28206_-	PRK04180, PRK04180, pyridoxal 5'-phosphate synthase lyase subunit PdxS	NA|501aa|up_0|NC_011959.1_28232_29735_-	COG1167, ARO8, Transcriptional regulators containing a DNA-binding HTH domain and an aminotransferase domain (MocR family) and their eukaryotic orthologs [Transcription / Amino acid transport and metabolism]	NA|478aa|down_0|NC_011959.1_36949_38383_+	COG1207, GlmU, N-acetylglucosamine-1-phosphate uridyltransferase (contains nucleotidyltransferase and I-patch acetyltransferase domains) [Cell envelope biogenesis, outer membrane]	NA|313aa|down_1|NC_011959.1_38387_39326_+	PRK01259, PRK01259, ribose-phosphate diphosphokinase	NA|421aa|down_2|NC_011959.1_39367_40630_+	COG1253, TlyC, Hemolysins and related proteins containing CBS domains [General function prediction only]	NA|184aa|down_3|NC_011959.1_40636_41188_+	cd16442, BPL, biotin protein ligase	NA|380aa|down_4|NC_011959.1_41174_42314_-	pfam00296, Bac_luciferase, Luciferase-like monooxygenase	NA|84aa|down_5|NC_011959.1_42581_42833_+	NA	NA|123aa|down_6|NC_011959.1_43070_43439_+	NA	NA|404aa|down_7|NC_011959.1_43717_44929_+	cd01158, SCAD_SBCAD, Short chain acyl-CoA dehydrogenases and eukaryotic short/branched chain acyl-CoA dehydrogenases	NA|179aa|down_8|NC_011959.1_44930_45467_+	cd04645, LbH_gamma_CA_like, Gamma carbonic anhydrase-like: This family is composed of gamma carbonic anhydrase (CA), Ferripyochelin Binding Protein (FBP), E	NA|258aa|down_9|NC_011959.1_45463_46237_+	PRK05809, PRK05809, short-chain-enoyl-CoA hydratase
GCF_000021685.1_ASM2168v1	NC_011961	Thermomicrobium roseum DSM 5159 plasmid unnamed, complete sequence	1	122024-122151	1	CRISPRCasFinder	no	csx1,cmr6gr7,cmr5gr11,cmr4gr7,cmr3gr5,cas10,csm3gr7	csx1,cmr6gr7,cmr5gr11,cmr4gr7,cmr3gr5,cas10,csm3gr7,cas3,csa3	Type III-B,Type III-D,Type III-C, Type III-C?,Type III-A	ACCGAGCGCATGGAGCGGGTGGA	23	0	0	NA	NA	NA	2	2	TypeIII-B,TypeIII-D,TypeIII-C,TypeIII-C?,TypeIII-A	cas6,cas8b2,cas7,cas5,cas3,cas4,cas1,cas2,csa3,DEDDh,DinG,WYL,Cas9_archaeal,csx1,cmr6gr7,cmr5gr11,cmr4gr7,cmr3gr5,cas10,csm3gr7	NA|447aa|up_2|NC_011961.1_115579_116920_-,NA|141aa|down_0|NC_011961.1_122266_122689_+,NA|128aa|down_1|NC_011961.1_123382_123766_+,NA|137aa|down_2|NC_011961.1_123799_124210_+,NA|200aa|down_9|NC_011961.1_130522_131122_+	cmr5gr11|141aa|up_9|NC_011961.1_108455_108878_-	pfam09701, Cas_Cmr5, CRISPR-associated protein (Cas_Cmr5)	cmr4gr7|308aa|up_8|NC_011961.1_108892_109816_-	TIGR02580, putative_CRISPR-associated_protein, CRISPR type III-B/RAMP module RAMP protein Cmr4	cmr3gr5|357aa|up_7|NC_011961.1_109817_110888_-	cd09748, Cmr3_III-B, CRISPR/Cas system-associated RAMP superfamily protein Cmr3	cas10|605aa|up_6|NC_011961.1_110884_112699_-	cd09679, Cas10_III, CRISPR/Cas system-associated protein Cas10	csm3gr7|385aa|up_5|NC_011961.1_112691_113846_-	TIGR01894, hypothetical_protein, CRISPR type III-B/RAMP module RAMP protein Cmr1	NA|269aa|up_4|NC_011961.1_113861_114668_-	pfam13365, Trypsin_2, Trypsin-like peptidase domain	NA|301aa|up_3|NC_011961.1_114677_115580_-	COG1196, Smc, Chromosome segregation ATPases [Cell division and chromosome partitioning]	NA|447aa|up_2|NC_011961.1_115579_116920_-	NA	NA|465aa|up_1|NC_011961.1_117047_118442_+	cd05680, M20_dipept_like, uncharacterized M20 dipeptidase	NA|571aa|up_0|NC_011961.1_119276_120989_+	pfam08378, NERD, Nuclease-related domain	NA|141aa|down_0|NC_011961.1_122266_122689_+	NA	NA|128aa|down_1|NC_011961.1_123382_123766_+	NA	NA|137aa|down_2|NC_011961.1_123799_124210_+	NA	NA|200aa|down_3|NC_011961.1_125703_126303_-	pfam13668, Ferritin_2, Ferritin-like domain	NA|197aa|down_4|NC_011961.1_126560_127151_+	PRK11924, PRK11924, RNA polymerase sigma factor; Provisional	NA|248aa|down_5|NC_011961.1_127119_127863_+	pfam10099, RskA, Anti-sigma-K factor rskA	NA|247aa|down_6|NC_011961.1_128379_129120_+	cd11374, CE4_u10, Putative catalytic domain of uncharacterized bacterial proteins from the carbohydrate esterase 4 superfamily	NA|238aa|down_7|NC_011961.1_129094_129808_+	cd02522, GT_2_like_a, GT_2_like_a represents a glycosyltransferase family-2 subfamily with unknown function	NA|238aa|down_8|NC_011961.1_129812_130526_+	cd02522, GT_2_like_a, GT_2_like_a represents a glycosyltransferase family-2 subfamily with unknown function	NA|200aa|down_9|NC_011961.1_130522_131122_+	NA
GCF_000021685.1_ASM2168v1	NC_011961	Thermomicrobium roseum DSM 5159 plasmid unnamed, complete sequence	2	405759-405849	2	CRISPRCasFinder	no		csx1,cmr6gr7,cmr5gr11,cmr4gr7,cmr3gr5,cas10,csm3gr7,cas3,csa3	Orphan	ACCCCGACCGTCGTCAAGAAGCCGACGATCG	31	0	0	NA	NA	NA	1	1	Orphan	cas6,cas8b2,cas7,cas5,cas3,cas4,cas1,cas2,csa3,DEDDh,DinG,WYL,Cas9_archaeal,csx1,cmr6gr7,cmr5gr11,cmr4gr7,cmr3gr5,cas10,csm3gr7	NA|170aa|up_6|NC_011961.1_397173_397683_+,NA|57aa|up_4|NC_011961.1_400206_400377_-,NA|62aa|up_0|NC_011961.1_404574_404760_-,NA|92aa|down_4|NC_011961.1_410707_410983_-	NA|414aa|up_9|NC_011961.1_394932_396174_+	PRK09237, PRK09237, amidohydrolase/deacetylase family metallohydrolase	NA|175aa|up_8|NC_011961.1_396174_396699_+	PRK02253, PRK02253, deoxyuridine 5'-triphosphate nucleotidohydrolase; Provisional	NA|136aa|up_7|NC_011961.1_396752_397160_+	cd00338, Ser_Recombinase, Serine Recombinase family, catalytic domain; a DNA binding domain may be present either N- or C-terminal to the catalytic domain	NA|170aa|up_6|NC_011961.1_397173_397683_+	NA	NA|828aa|up_5|NC_011961.1_397726_400210_+	pfam13654, AAA_32, AAA domain	NA|57aa|up_4|NC_011961.1_400206_400377_-	NA	NA|477aa|up_3|NC_011961.1_400730_402161_-	PRK09436, thrA, bifunctional aspartokinase I/homoserine dehydrogenase I; Provisional	NA|437aa|up_2|NC_011961.1_402487_403798_+	COG1595, RpoE, DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog [Transcription]	NA|111aa|up_1|NC_011961.1_404097_404430_-	pfam05642, Sporozoite_P67, Sporozoite P67 surface antigen	NA|62aa|up_0|NC_011961.1_404574_404760_-	NA	NA|397aa|down_0|NC_011961.1_406542_407733_+	PHA03247, PHA03247, large tegument protein UL36; Provisional	NA|61aa|down_1|NC_011961.1_408226_408409_+	pfam10262, Rdx, Rdx family	NA|408aa|down_2|NC_011961.1_408460_409684_+	PRK01565, PRK01565, thiamine biosynthesis protein ThiI; Provisional	NA|281aa|down_3|NC_011961.1_409623_410466_-	pfam13367, PrsW-protease, Protease prsW family	NA|92aa|down_4|NC_011961.1_410707_410983_-	NA	NA|536aa|down_5|NC_011961.1_411195_412803_-	cd03399, SPFH_flotillin, Flotillin or reggie family; SPFH (stomatin, prohibitin, flotillin, and HflK/C) superfamily	NA|185aa|down_6|NC_011961.1_412799_413354_-	pfam01957, NfeD, NfeD-like C-terminal, partner-binding	NA|513aa|down_7|NC_011961.1_413651_415190_-	TIGR00908, putative_ethanolamine_permease, ethanolamine permease	NA|261aa|down_8|NC_011961.1_415220_416003_-	PRK06057, PRK06057, short chain dehydrogenase; Provisional	NA|462aa|down_9|NC_011961.1_416017_417403_-	COG0174, GlnA, Glutamine synthetase [Amino acid transport and metabolism]
GCF_000021685.1_ASM2168v1	NC_011961	Thermomicrobium roseum DSM 5159 plasmid unnamed, complete sequence	3	742512-742609	3	CRISPRCasFinder	no		csx1,cmr6gr7,cmr5gr11,cmr4gr7,cmr3gr5,cas10,csm3gr7,cas3,csa3	Orphan	GCAACTGGACGAATTAAGGAGGCGTGCAGCG	31	0	0	NA	NA	NA	1	1	Orphan	cas6,cas8b2,cas7,cas5,cas3,cas4,cas1,cas2,csa3,DEDDh,DinG,WYL,Cas9_archaeal,csx1,cmr6gr7,cmr5gr11,cmr4gr7,cmr3gr5,cas10,csm3gr7	NA|125aa|up_6|NC_011961.1_736185_736560_+,NA|55aa|up_0|NC_011961.1_742222_742387_+,NA|114aa|down_0|NC_011961.1_742705_743047_+,NA|147aa|down_1|NC_011961.1_743059_743500_+,NA|116aa|down_2|NC_011961.1_743692_744040_+,NA|68aa|down_3|NC_011961.1_744097_744301_+,NA|59aa|down_4|NC_011961.1_744300_744477_+	NA|422aa|up_9|NC_011961.1_732199_733465_+	cd01118, ArsB_permease, Anion permease ArsB	NA|420aa|up_8|NC_011961.1_733469_734729_-	pfam01663, Phosphodiest, Type I phosphodiesterase / nucleotide pyrophosphatase	NA|310aa|up_7|NC_011961.1_735239_736169_+	cd11589, Agmatinase_like_1, Agmatinase and related proteins	NA|125aa|up_6|NC_011961.1_736185_736560_+	NA	NA|195aa|up_5|NC_011961.1_736732_737317_+	COG1522, Lrp, Transcriptional regulators [Transcription]	NA|441aa|up_4|NC_011961.1_737300_738623_+	PRK00062, PRK00062, glutamate-1-semialdehyde 2,1-aminomutase	NA|256aa|up_3|NC_011961.1_738619_739387_-	cd05829, Sortase_F, Sortase domain found in the class F family of sortases	NA|272aa|up_2|NC_011961.1_739478_740294_-	pfam14344, DUF4397, Domain of unknown function (DUF4397)	NA|335aa|up_1|NC_011961.1_740813_741818_+	cd08254, hydroxyacyl_CoA_DH, 6-hydroxycyclohex-1-ene-1-carboxyl-CoA dehydrogenase, N-benzyl-3-pyrrolidinol dehydrogenase, and other MDR family members	NA|55aa|up_0|NC_011961.1_742222_742387_+	NA	NA|114aa|down_0|NC_011961.1_742705_743047_+	NA	NA|147aa|down_1|NC_011961.1_743059_743500_+	NA	NA|116aa|down_2|NC_011961.1_743692_744040_+	NA	NA|68aa|down_3|NC_011961.1_744097_744301_+	NA	NA|59aa|down_4|NC_011961.1_744300_744477_+	NA	NA|261aa|down_5|NC_011961.1_745428_746211_+	COG1192, Soj, ATPases involved in chromosome partitioning [Cell division and chromosome partitioning]	NA|329aa|down_6|NC_011961.1_746203_747190_+	TIGR04285, parB-like_partition_protein, nucleoid occlusion protein	NA|297aa|down_7|NC_011961.1_747414_748305_+	COG2810, COG2810, Predicted type IV restriction endonuclease [Defense mechanisms]	NA|202aa|down_8|NC_011961.1_748485_749091_-	COG1268, BioY, Uncharacterized conserved protein [General function prediction only]	NA|258aa|down_9|NC_011961.1_749140_749914_-	pfam03746, LamB_YcsF, LamB/YcsF family
