assembly_id	genome_id	genome_def	crispr_array_locus_merge	crispr_array_location_merge	crispr_locus_id	crispr_pred_method	array_in_prot	prot_within_array_20000	prot_in_genome	crispr_type_by_cas_prot	consensus_repeat	repeat_length	self-targeting_spacer_number	self-targeting_target_number	spacer_location	protospacer_location	repeat_type	spacer_locus_num	spacer_num	correct_crispr_type	genome_cas_prots	unknown_protein_around_crispr	L10	L10_domain	L9	L9_domain	L8	L8_domain	L7	L7_domain	L6	L6_domain	L5	L5_domain	L4	L4_domain	L3	L3_domain	L2	L2_domain	L1	L1_domain	R1	R1_domain	R2	R2_domain	R3	R3_domain	R4	R4_domain	R5	R5_domain	R6	R6_domain	R7	R7_domain	R8	R8_domain	R9	R9_domain	R10	R10_domain
GCF_000255615.2_ASM25561v3	NZ_CP013828	Hungateiclostridium thermocellum AD2 chromosome, complete genome	1	881812-882001	1,1	PILER-CR,CRISPRCasFinder	no	cas8b1,cas6,csx1,cas10,csm3gr7,csx10gr5,csx19,cas2,cas4	cas3,WYL,cas8b1,cas6,csx1,cas10,csm3gr7,csx10gr5,csx19,cas2,cas4,csa3,DEDDh,DinG,csm2gr11,cas1,cas5,cas7	Type III-A,Type III-B,Type III-D,Type I-B,Type III-C	GTTGAAGAGGTACTTCCAGTAAAACAAGGATTGAAACATA,GTTGAAGAGGTACTTCCAGTAAAACAAGGATTGAAAC	40,37	0	0	NA	NA	?:?	2,2	2	TypeIII-A,TypeIII-B,TypeIII-D,TypeI-B,TypeIII-C	cas3,WYL,cas8b1,cas6,csx1,cas10,csm3gr7,csx10gr5,csx19,cas2,cas4,csa3,DEDDh,DinG,csm2gr11,cas1,cas5,cas7	csx1|329aa|up_9|NZ_CP013828.1_870150_871137_+,csx1|330aa|up_8|NZ_CP013828.1_871126_872116_+,csx19|156aa|up_3|NZ_CP013828.1_877279_877747_+,NA|60aa|up_1|NZ_CP013828.1_879829_880009_+,NA|83aa|down_3|NZ_CP013828.1_885469_885718_-,NA|47aa|down_4|NZ_CP013828.1_885828_885969_-,NA|216aa|down_5|NZ_CP013828.1_886794_887442_+,NA|266aa|down_6|NZ_CP013828.1_887578_888376_+,NA|166aa|down_7|NZ_CP013828.1_888417_888915_+	csx1|329aa|up_9|NZ_CP013828.1_870150_871137_+	NA	csx1|330aa|up_8|NZ_CP013828.1_871126_872116_+	NA	cas10|502aa|up_7|NZ_CP013828.1_872135_873641_+	cd09679, Cas10_III, CRISPR/Cas system-associated protein Cas10	csm3gr7|222aa|up_6|NZ_CP013828.1_873641_874307_+	pfam03787, RAMPs, RAMP superfamily	csx10gr5|534aa|up_5|NZ_CP013828.1_874299_875901_+	cd09700, Csx10, CRISPR/Cas system-associated RAMP superfamily protein Csx10	csm3gr7|461aa|up_4|NZ_CP013828.1_875900_877283_+	cd09726, RAMP_I_III, CRISPR/Cas system-associated RAMP superfamily protein	csx19|156aa|up_3|NZ_CP013828.1_877279_877747_+	NA	csm3gr7|693aa|up_2|NZ_CP013828.1_877751_879830_+	TIGR03986, CRISPR-associated_protein, CRISPR-associated protein	NA|60aa|up_1|NZ_CP013828.1_879829_880009_+	NA	csx1|489aa|up_0|NZ_CP013828.1_880052_881519_+	pfam09670, Cas_Cas02710, CRISPR-associated protein (Cas_Cas02710)	cas2|97aa|down_0|NZ_CP013828.1_883078_883369_+	cd09725, Cas2_I_II_III, CRISPR/Cas system-associated protein Cas2	cas4|210aa|down_1|NZ_CP013828.1_883343_883973_+	cd09637, Cas4_I-A_I-B_I-C_I-D_II-B, CRISPR/Cas system-associated protein Cas4	NA|408aa|down_2|NZ_CP013828.1_884055_885279_+	pfam00872, Transposase_mut, Transposase, Mutator family	NA|83aa|down_3|NZ_CP013828.1_885469_885718_-	NA	NA|47aa|down_4|NZ_CP013828.1_885828_885969_-	NA	NA|216aa|down_5|NZ_CP013828.1_886794_887442_+	NA	NA|266aa|down_6|NZ_CP013828.1_887578_888376_+	NA	NA|166aa|down_7|NZ_CP013828.1_888417_888915_+	NA	NA|287aa|down_8|NZ_CP013828.1_889424_890285_+	cd00200, WD40, WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment	NA|384aa|down_9|NZ_CP013828.1_890528_891681_-	PHA02517, PHA02517, putative transposase OrfB; Reviewed
GCF_000255615.2_ASM25561v3	NZ_CP013828	Hungateiclostridium thermocellum AD2 chromosome, complete genome	2	979303-984692	2,1,2	CRISPRCasFinder,CRT,PILER-CR	no		cas3,WYL,cas8b1,cas6,csx1,cas10,csm3gr7,csx10gr5,csx19,cas2,cas4,csa3,DEDDh,DinG,csm2gr11,cas1,cas5,cas7	Orphan	GTTTCAATTCCTCATAGGTACGATAAAAAC,GTTTCAATTCCTCATAGGTACGATAAAAAC,GTTTCAATTCCTCATAGGTACGATAAAAAC	30,30,30	0	0	NA	NA	NA:NA:NA	80,80,79	80	Orphan	cas3,WYL,cas8b1,cas6,csx1,cas10,csm3gr7,csx10gr5,csx19,cas2,cas4,csa3,DEDDh,DinG,csm2gr11,cas1,cas5,cas7	NA|111aa|up_0|NZ_CP013828.1_978724_979057_-,NA	NA|317aa|up_9|NZ_CP013828.1_968656_969607_+	COG0053, MMT1, Predicted Co/Zn/Cd cation transporters [Inorganic ion transport and metabolism]	NA|490aa|up_8|NZ_CP013828.1_969648_971118_-	COG0642, BaeS, Signal transduction histidine kinase [Signal transduction mechanisms]	NA|233aa|up_7|NZ_CP013828.1_971122_971821_-	COG0745, OmpR, Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain [Signal transduction mechanisms / Transcription]	NA|229aa|up_6|NZ_CP013828.1_971996_972683_+	cd07750, PolyPPase_VTC_like, Polyphosphate(polyP) polymerase domain of yeast vacuolar transport chaperone (VTC) proteins VTC-2, -3 and- 4, and similar proteins	NA|229aa|up_5|NZ_CP013828.1_972712_973399_+	pfam16316, DUF4956, Domain of unknown function (DUF4956)	NA|704aa|up_4|NZ_CP013828.1_973437_975549_+	pfam08757, CotH, CotH kinase protein	NA|717aa|up_3|NZ_CP013828.1_975580_977731_-	COG0370, FeoB, Fe2+ transport system protein B [Inorganic ion transport and metabolism]	NA|80aa|up_2|NZ_CP013828.1_977813_978053_-	pfam04023, FeoA, FeoA domain	NA|160aa|up_1|NZ_CP013828.1_978212_978692_+	PRK03902, PRK03902, transcriptional regulator MntR	NA|111aa|up_0|NZ_CP013828.1_978724_979057_-	NA	NA|416aa|down_0|NZ_CP013828.1_985081_986329_+	pfam07745, Glyco_hydro_53, Glycosyl hydrolase family 53	NA|149aa|down_1|NZ_CP013828.1_986440_986887_+	pfam09719, C_GCAxxG_C_C, Putative redox-active protein (C_GCAxxG_C_C)	NA|843aa|down_2|NZ_CP013828.1_987235_989764_+	cd14256, Dockerin_I, Type I dockerin repeat domain	NA|273aa|down_3|NZ_CP013828.1_989998_990817_+	cd04194, GT8_A4GalT_like, A4GalT_like proteins catalyze the addition of galactose or glucose residues to the lipooligosaccharide (LOS) or lipopolysaccharide (LPS) of the bacterial cell surface	NA|515aa|down_4|NZ_CP013828.1_990856_992401_+	cd09160, PLDc_SMU_988_like_2, Putative catalytic domain, repeat 2, of Streptococcus mutans uncharacterized protein SMU_988 and similar proteins	NA|64aa|down_5|NZ_CP013828.1_992485_992677_+	COG1117, PstB, ABC-type phosphate transport system, ATPase component [Inorganic ion transport and metabolism]	NA|237aa|down_6|NZ_CP013828.1_992740_993451_+	COG0745, OmpR, Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain [Signal transduction mechanisms / Transcription]	NA|590aa|down_7|NZ_CP013828.1_993453_995223_+	NF033092, HK_WalK, cell wall metabolism sensor histidine kinase WalK	NA|512aa|down_8|NZ_CP013828.1_996039_997575_+	PRK00915, PRK00915, 2-isopropylmalate synthase; Validated	NA|311aa|down_9|NZ_CP013828.1_997712_998645_-	pfam12146, Hydrolase_4, Serine aminopeptidase, S33
GCF_000255615.2_ASM25561v3	NZ_CP013828	Hungateiclostridium thermocellum AD2 chromosome, complete genome	3	1918540-1920316	3,2,3,4	CRISPRCasFinder,CRT,PILER-CR,PILER-CR	no	cas3	cas3,WYL,cas8b1,cas6,csx1,cas10,csm3gr7,csx10gr5,csx19,cas2,cas4,csa3,DEDDh,DinG,csm2gr11,cas1,cas5,cas7	Unclear	ATTTCAATTCCTCATAGGTACGATACAAAC,ATTTCAATTCCTCATAGGTACGATACAAAC,ATTTCAATTCCTCATAGGTACGATACAAAC,TTTCAATTCCTCATAGGTACGATACAAAC	30,30,30,29	0	0	NA	NA	NA:NA:NA:NA	26,26,23,23	26	Unclear	cas3,WYL,cas8b1,cas6,csx1,cas10,csm3gr7,csx10gr5,csx19,cas2,cas4,csa3,DEDDh,DinG,csm2gr11,cas1,cas5,cas7	NA|356aa|up_4|NZ_CP013828.1_1909878_1910946_-,NA|111aa|down_2|NZ_CP013828.1_1922119_1922452_-	NA|215aa|up_9|NZ_CP013828.1_1905988_1906633_-	PRK08644, PRK08644, sulfur carrier protein ThiS adenylyltransferase ThiF	NA|370aa|up_8|NZ_CP013828.1_1906709_1907819_-	PRK09240, thiH, 2-iminoacetate synthase ThiH	NA|256aa|up_7|NZ_CP013828.1_1907947_1908715_-	PRK00208, thiG, thiazole synthase; Reviewed	NA|67aa|up_6|NZ_CP013828.1_1908717_1908918_-	cd00565, Ubl_ThiS, ubiquitin-like (Ubl) domain found in sulfur carrier protein ThiS	NA|208aa|up_5|NZ_CP013828.1_1909182_1909806_-	PRK00454, engB, GTP-binding protein YsxC; Reviewed	NA|356aa|up_4|NZ_CP013828.1_1909878_1910946_-	NA	NA|67aa|up_3|NZ_CP013828.1_1911520_1911721_+	PRK10767, PRK10767, chaperone protein DnaJ; Provisional	NA|348aa|up_2|NZ_CP013828.1_1911842_1912886_-	pfam07833, Cu_amine_oxidN1, Copper amine oxidase N-terminal domain	NA|221aa|up_1|NZ_CP013828.1_1913163_1913826_-	pfam13649, Methyltransf_25, Methyltransferase domain	NA|384aa|up_0|NZ_CP013828.1_1915738_1916890_+	PHA02517, PHA02517, putative transposase OrfB; Reviewed	NA|152aa|down_0|NZ_CP013828.1_1920355_1920811_-	PRK00409, PRK00409, recombination and DNA strand exchange inhibitor protein; Reviewed	NA|384aa|down_1|NZ_CP013828.1_1920892_1922044_-	cd00338, Ser_Recombinase, Serine Recombinase family, catalytic domain; a DNA binding domain may be present either N- or C-terminal to the catalytic domain	NA|111aa|down_2|NZ_CP013828.1_1922119_1922452_-	NA	cas3|713aa|down_3|NZ_CP013828.1_1922699_1924838_-	COG1201, Lhr, Lhr-like helicases [General function prediction only]	NA|615aa|down_4|NZ_CP013828.1_1926176_1928021_-	pfam13208, TerB_N, TerB N-terminal domain	NA|223aa|down_5|NZ_CP013828.1_1928181_1928850_-	PRK13413, mpi, master DNA invertase Mpi family serine-type recombinase	NA|259aa|down_6|NZ_CP013828.1_1929133_1929910_-	COG1196, Smc, Chromosome segregation ATPases [Cell division and chromosome partitioning]	NA|54aa|down_7|NZ_CP013828.1_1929934_1930096_-	pfam12728, HTH_17, Helix-turn-helix domain	NA|384aa|down_8|NZ_CP013828.1_1931246_1932399_+	PHA02517, PHA02517, putative transposase OrfB; Reviewed	NA|274aa|down_9|NZ_CP013828.1_1932587_1933409_-	pfam13730, HTH_36, Helix-turn-helix domain
GCF_000255615.2_ASM25561v3	NZ_CP013828	Hungateiclostridium thermocellum AD2 chromosome, complete genome	4	1933649-1936892	4,3,5	CRISPRCasFinder,CRT,PILER-CR	no	cas3	cas3,WYL,cas8b1,cas6,csx1,cas10,csm3gr7,csx10gr5,csx19,cas2,cas4,csa3,DEDDh,DinG,csm2gr11,cas1,cas5,cas7	Unclear	GTTTCAATTCCTCATAGGTACGATACAAAC,GTTTCAATTCCTCATAGGTACGATACAAAC,GTTTCAATTCCTCATAGGTACGATACAAAC	30,30,30	0	0	NA	NA	NA:NA:NA	48,48,48	48	Unclear	cas3,WYL,cas8b1,cas6,csx1,cas10,csm3gr7,csx10gr5,csx19,cas2,cas4,csa3,DEDDh,DinG,csm2gr11,cas1,cas5,cas7	NA|111aa|up_7|NZ_CP013828.1_1922119_1922452_-,NA	NA|152aa|up_9|NZ_CP013828.1_1920355_1920811_-	PRK00409, PRK00409, recombination and DNA strand exchange inhibitor protein; Reviewed	NA|384aa|up_8|NZ_CP013828.1_1920892_1922044_-	cd00338, Ser_Recombinase, Serine Recombinase family, catalytic domain; a DNA binding domain may be present either N- or C-terminal to the catalytic domain	NA|111aa|up_7|NZ_CP013828.1_1922119_1922452_-	NA	cas3|713aa|up_6|NZ_CP013828.1_1922699_1924838_-	COG1201, Lhr, Lhr-like helicases [General function prediction only]	NA|615aa|up_5|NZ_CP013828.1_1926176_1928021_-	pfam13208, TerB_N, TerB N-terminal domain	NA|223aa|up_4|NZ_CP013828.1_1928181_1928850_-	PRK13413, mpi, master DNA invertase Mpi family serine-type recombinase	NA|259aa|up_3|NZ_CP013828.1_1929133_1929910_-	COG1196, Smc, Chromosome segregation ATPases [Cell division and chromosome partitioning]	NA|54aa|up_2|NZ_CP013828.1_1929934_1930096_-	pfam12728, HTH_17, Helix-turn-helix domain	NA|384aa|up_1|NZ_CP013828.1_1931246_1932399_+	PHA02517, PHA02517, putative transposase OrfB; Reviewed	NA|274aa|up_0|NZ_CP013828.1_1932587_1933409_-	pfam13730, HTH_36, Helix-turn-helix domain	NA|97aa|down_0|NZ_CP013828.1_1937632_1937923_-	pfam04456, DUF503, Protein of unknown function (DUF503)	NA|395aa|down_1|NZ_CP013828.1_1938052_1939237_+	pfam07228, SpoIIE, Stage II sporulation protein E (SpoIIE)	NA|588aa|down_2|NZ_CP013828.1_1939264_1941028_+	COG4191, COG4191, Signal transduction histidine kinase regulating C4-dicarboxylate transport system [Signal transduction mechanisms]	NA|589aa|down_3|NZ_CP013828.1_1941284_1943051_+	pfam05833, FbpA, Fibronectin-binding protein A N-terminus (FbpA)	NA|398aa|down_4|NZ_CP013828.1_1943064_1944258_+	PRK06836, PRK06836, pyridoxal phosphate-dependent aminotransferase	NA|174aa|down_5|NZ_CP013828.1_1944324_1944846_-	cd02151, nitroreductase, nitroreductase family protein	NA|737aa|down_6|NZ_CP013828.1_1945399_1947610_+	pfam00759, Glyco_hydro_9, Glycosyl hydrolase family 9	NA|213aa|down_7|NZ_CP013828.1_1947748_1948387_-	cd07995, TPK, Thiamine pyrophosphokinase	NA|221aa|down_8|NZ_CP013828.1_1948399_1949062_-	cd00429, RPE, Ribulose-5-phosphate 3-epimerase (RPE)	NA|295aa|down_9|NZ_CP013828.1_1949220_1950105_-	PRK00098, PRK00098, GTPase RsgA; Reviewed
GCF_000255615.2_ASM25561v3	NZ_CP013828	Hungateiclostridium thermocellum AD2 chromosome, complete genome	5	3473346-3475182	5,4,6	CRISPRCasFinder,CRT,PILER-CR	no	cas2,cas1,cas4,cas3,cas5,cas7,cas8b1,cas6	cas3,WYL,cas8b1,cas6,csx1,cas10,csm3gr7,csx10gr5,csx19,cas2,cas4,csa3,DEDDh,DinG,csm2gr11,cas1,cas5,cas7	Type I-B	GTTTCAATTCCTCATAGGTACGATAAAAAC,GTTTCAATTCCTCATAGGTACGATAAAAAC,GTTTCAATTCCTCATAGGTACGATAAAAAC	30,30,30	0	0	NA	NA	NA:NA:NA	27,27,26	27	TypeI-B	cas3,WYL,cas8b1,cas6,csx1,cas10,csm3gr7,csx10gr5,csx19,cas2,cas4,csa3,DEDDh,DinG,csm2gr11,cas1,cas5,cas7	NA|240aa|up_4|NZ_CP013828.1_3470086_3470806_-,NA|88aa|up_2|NZ_CP013828.1_3471474_3471738_-,NA|153aa|up_1|NZ_CP013828.1_3471782_3472241_-,NA|180aa|up_0|NZ_CP013828.1_3472257_3472797_-,cas8b1|559aa|down_6|NZ_CP013828.1_3481064_3482741_-	NA|703aa|up_9|NZ_CP013828.1_3464121_3466230_-	COG0643, CheA, Chemotaxis protein histidine kinase and related kinases [Cell motility and secretion / Signal transduction mechanisms]	NA|142aa|up_8|NZ_CP013828.1_3466244_3466670_-	COG0835, CheW, Chemotaxis signal transduction protein [Cell motility and secretion / Signal transduction mechanisms]	NA|187aa|up_7|NZ_CP013828.1_3467616_3468177_+	cd01192, INT_C_like_3, Uncharacterized site-specific tyrosine recombinase, C-terminal catalytic domain	NA|311aa|up_6|NZ_CP013828.1_3468477_3469410_+	smart00342, HTH_ARAC, helix_turn_helix, arabinose operon control protein	NA|185aa|up_5|NZ_CP013828.1_3469382_3469937_-	pfam01161, PBP, Phosphatidylethanolamine-binding protein	NA|240aa|up_4|NZ_CP013828.1_3470086_3470806_-	NA	NA|227aa|up_3|NZ_CP013828.1_3470784_3471465_-	COG1131, CcmA, ABC-type multidrug transport system, ATPase component [Defense mechanisms]	NA|88aa|up_2|NZ_CP013828.1_3471474_3471738_-	NA	NA|153aa|up_1|NZ_CP013828.1_3471782_3472241_-	NA	NA|180aa|up_0|NZ_CP013828.1_3472257_3472797_-	NA	cas2|88aa|down_0|NZ_CP013828.1_3475348_3475612_-	cd09725, Cas2_I_II_III, CRISPR/Cas system-associated protein Cas2	cas1|331aa|down_1|NZ_CP013828.1_3475625_3476618_-	TIGR03641, cas1_HMARI, CRISPR-associated endonuclease Cas1, subtype I-B/HMARI/TNEAP	cas4|169aa|down_2|NZ_CP013828.1_3476631_3477138_-	pfam01930, Cas_Cas4, Domain of unknown function DUF83	cas3|751aa|down_3|NZ_CP013828.1_3477156_3479409_-	cd09639, Cas3_I, CRISPR/Cas system-associated protein Cas3	cas5|242aa|down_4|NZ_CP013828.1_3479430_3480156_-	TIGR01895, conserved_hypothetical_protein, CRISPR-associated protein Cas5, subtype I-B/TNEAP	cas7|295aa|down_5|NZ_CP013828.1_3480174_3481059_-	TIGR02585, conserved_protein, CRISPR-associated protein Cas7/Cst2/DevR, subtype I-B/TNEAP	cas8b1|559aa|down_6|NZ_CP013828.1_3481064_3482741_-	NA	cas6|241aa|down_7|NZ_CP013828.1_3482752_3483475_-	COG1583, COG1583, CRISPR system related protein, RAMP superfamily [Defense    mechanisms]	NA|307aa|down_8|NZ_CP013828.1_3483731_3484652_+	cd00537, MTHFR, Methylenetetrahydrofolate reductase (MTHFR)	NA|273aa|down_9|NZ_CP013828.1_3484681_3485500_-	PRK00281, PRK00281, undecaprenyl-diphosphate phosphatase
