assembly_id	genome_id	genome_def	crispr_array_locus_merge	crispr_array_location_merge	crispr_locus_id	crispr_pred_method	array_in_prot	prot_within_array_20000	prot_in_genome	crispr_type_by_cas_prot	consensus_repeat	repeat_length	self-targeting_spacer_number	self-targeting_target_number	spacer_location	protospacer_location	repeat_type	spacer_locus_num	spacer_num	correct_crispr_type	genome_cas_prots	unknown_protein_around_crispr	L10	L10_domain	L9	L9_domain	L8	L8_domain	L7	L7_domain	L6	L6_domain	L5	L5_domain	L4	L4_domain	L3	L3_domain	L2	L2_domain	L1	L1_domain	R1	R1_domain	R2	R2_domain	R3	R3_domain	R4	R4_domain	R5	R5_domain	R6	R6_domain	R7	R7_domain	R8	R8_domain	R9	R9_domain	R10	R10_domain
GCF_000015865.1_ASM1586v1	NC_009012	Hungateiclostridium thermocellum ATCC 27405, complete sequence	1	716798-720026	1,1,1	PILER-CR,CRISPRCasFinder,CRT	no		csa3,DinG,DEDDh,cas3,WYL,csm2gr11,csm3gr7,csx10gr5,cas10,cas6,csx1,cas2,cas1,cas4,cas5,cas7,cas8b1,csx19	Orphan	GTTTGTATCGTACCTATGAGGAATTGAAAC,GTTTGTATCGTACCTATGAGGAATTGAAAC,GTTTGTATCGTACCTATGAGGAATTGAAAC	30,30,30	0	0	NA	NA	NA:NA:NA	47,48,48	48	Orphan	csa3,DinG,DEDDh,cas3,WYL,csm2gr11,csm3gr7,csx10gr5,cas10,cas6,csx1,cas2,cas1,cas4,cas5,cas7,cas8b1,csx19	NA,NA|356aa|down_7|NC_009012.1_731946_733014_+	NA|295aa|up_9|NC_009012.1_703584_704469_+	PRK00098, PRK00098, GTPase RsgA; Reviewed	NA|221aa|up_8|NC_009012.1_704627_705290_+	cd00429, RPE, Ribulose-5-phosphate 3-epimerase (RPE)	NA|213aa|up_7|NC_009012.1_705302_705941_+	cd07995, TPK, Thiamine pyrophosphokinase	NA|737aa|up_6|NC_009012.1_706079_708290_-	pfam00759, Glyco_hydro_9, Glycosyl hydrolase family 9	NA|174aa|up_5|NC_009012.1_708843_709365_+	cd02151, nitroreductase, nitroreductase family protein	NA|398aa|up_4|NC_009012.1_709431_710625_-	PRK06836, PRK06836, pyridoxal phosphate-dependent aminotransferase	NA|589aa|up_3|NC_009012.1_710638_712405_-	pfam05833, FbpA, Fibronectin-binding protein A N-terminus (FbpA)	NA|588aa|up_2|NC_009012.1_712661_714425_-	COG4191, COG4191, Signal transduction histidine kinase regulating C4-dicarboxylate transport system [Signal transduction mechanisms]	NA|395aa|up_1|NC_009012.1_714452_715637_-	pfam07228, SpoIIE, Stage II sporulation protein E (SpoIIE)	NA|97aa|up_0|NC_009012.1_715766_716057_+	pfam04456, DUF503, Protein of unknown function (DUF503)	NA|384aa|down_0|NC_009012.1_720605_721758_-	PHA02517, PHA02517, putative transposase OrfB; Reviewed	NA|407aa|down_1|NC_009012.1_721918_723139_-	pfam00872, Transposase_mut, Transposase, Mutator family	NA|357aa|down_2|NC_009012.1_723749_724820_+	COG2826, Tra8, Transposase and inactivated derivatives, IS30 family [DNA replication, recombination, and repair]	NA|384aa|down_3|NC_009012.1_726132_727285_+	PHA02517, PHA02517, putative transposase OrfB; Reviewed	NA|222aa|down_4|NC_009012.1_727659_728325_+	pfam13649, Methyltransf_25, Methyltransferase domain	NA|67aa|down_5|NC_009012.1_729767_729968_-	PRK10767, PRK10767, chaperone protein DnaJ; Provisional	NA|407aa|down_6|NC_009012.1_730555_731776_+	pfam00872, Transposase_mut, Transposase, Mutator family	NA|356aa|down_7|NC_009012.1_731946_733014_+	NA	NA|208aa|down_8|NC_009012.1_733086_733710_+	PRK00454, engB, GTP-binding protein YsxC; Reviewed	NA|67aa|down_9|NC_009012.1_733974_734175_+	cd00565, Ubl_ThiS, ubiquitin-like (Ubl) domain found in sulfur carrier protein ThiS
GCF_000015865.1_ASM1586v1	NC_009012	Hungateiclostridium thermocellum ATCC 27405, complete sequence	2	1712609-1719089	2,2,2	PILER-CR,CRISPRCasFinder,CRT	no		csa3,DinG,DEDDh,cas3,WYL,csm2gr11,csm3gr7,csx10gr5,cas10,cas6,csx1,cas2,cas1,cas4,cas5,cas7,cas8b1,csx19	Orphan	GTTTTTATCGTACCTATGAGGAATTGAAAC,GTTTTTATCGTACCTATGAGGAATTGAAAC,GTTTTTATCGTACCTATGAGGAATTGAAAC	30,30,30	0	0	NA	NA	NA:NA:NA	95,96,96	96	Orphan	csa3,DinG,DEDDh,cas3,WYL,csm2gr11,csm3gr7,csx10gr5,cas10,cas6,csx1,cas2,cas1,cas4,cas5,cas7,cas8b1,csx19	NA,NA|111aa|down_0|NC_009012.1_1719334_1719667_+	NA|512aa|up_9|NC_009012.1_1698327_1699863_-	PRK00915, PRK00915, 2-isopropylmalate synthase; Validated	NA|357aa|up_8|NC_009012.1_1700857_1701928_-	COG2826, Tra8, Transposase and inactivated derivatives, IS30 family [DNA replication, recombination, and repair]	NA|590aa|up_7|NC_009012.1_1702078_1703848_-	NF033092, HK_WalK, cell wall metabolism sensor histidine kinase WalK	NA|237aa|up_6|NC_009012.1_1703850_1704561_-	COG0745, OmpR, Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain [Signal transduction mechanisms / Transcription]	NA|64aa|up_5|NC_009012.1_1704624_1704816_-	COG1117, PstB, ABC-type phosphate transport system, ATPase component [Inorganic ion transport and metabolism]	NA|515aa|up_4|NC_009012.1_1704900_1706445_-	cd09160, PLDc_SMU_988_like_2, Putative catalytic domain, repeat 2, of Streptococcus mutans uncharacterized protein SMU_988 and similar proteins	NA|273aa|up_3|NC_009012.1_1706484_1707303_-	cd04194, GT8_A4GalT_like, A4GalT_like proteins catalyze the addition of galactose or glucose residues to the lipooligosaccharide (LOS) or lipopolysaccharide (LPS) of the bacterial cell surface	NA|843aa|up_2|NC_009012.1_1707537_1710066_-	cd14256, Dockerin_I, Type I dockerin repeat domain	NA|149aa|up_1|NC_009012.1_1710414_1710861_-	pfam09719, C_GCAxxG_C_C, Putative redox-active protein (C_GCAxxG_C_C)	NA|416aa|up_0|NC_009012.1_1710972_1712220_-	pfam07745, Glyco_hydro_53, Glycosyl hydrolase family 53	NA|111aa|down_0|NC_009012.1_1719334_1719667_+	NA	NA|160aa|down_1|NC_009012.1_1719699_1720179_-	PRK03902, PRK03902, transcriptional regulator MntR	NA|80aa|down_2|NC_009012.1_1720338_1720578_+	pfam04023, FeoA, FeoA domain	NA|717aa|down_3|NC_009012.1_1720660_1722811_+	COG0370, FeoB, Fe2+ transport system protein B [Inorganic ion transport and metabolism]	NA|710aa|down_4|NC_009012.1_1722842_1724972_-	pfam08757, CotH, CotH kinase protein	NA|229aa|down_5|NC_009012.1_1725010_1725697_-	pfam16316, DUF4956, Domain of unknown function (DUF4956)	NA|229aa|down_6|NC_009012.1_1725726_1726413_-	cd07750, PolyPPase_VTC_like, Polyphosphate(polyP) polymerase domain of yeast vacuolar transport chaperone (VTC) proteins VTC-2, -3 and- 4, and similar proteins	NA|233aa|down_7|NC_009012.1_1726588_1727287_+	COG0745, OmpR, Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain [Signal transduction mechanisms / Transcription]	NA|490aa|down_8|NC_009012.1_1727291_1728761_+	COG0642, BaeS, Signal transduction histidine kinase [Signal transduction mechanisms]	NA|317aa|down_9|NC_009012.1_1728802_1729753_-	COG0053, MMT1, Predicted Co/Zn/Cd cation transporters [Inorganic ion transport and metabolism]
GCF_000015865.1_ASM1586v1	NC_009012	Hungateiclostridium thermocellum ATCC 27405, complete sequence	3	2729744-2741081	3,3,3	CRISPRCasFinder,CRT,PILER-CR	no	cas2,cas1,cas4,cas3,cas5,cas7,cas8b1,cas6	csa3,DinG,DEDDh,cas3,WYL,csm2gr11,csm3gr7,csx10gr5,cas10,cas6,csx1,cas2,cas1,cas4,cas5,cas7,cas8b1,csx19	Type I-B	GTTTCAATTCCTCATAGGTACGATAAAAAC,TTTCAATTCCTCATAGGTACGATAAAAAC,GTTTTTATCGTACCTATGAGGAATTGAAAT	30,29,30	0	0	NA	NA	NA:NA:NA	168,168,125	168	TypeI-B	csa3,DinG,DEDDh,cas3,WYL,csm2gr11,csm3gr7,csx10gr5,cas10,cas6,csx1,cas2,cas1,cas4,cas5,cas7,cas8b1,csx19	NA|240aa|up_5|NC_009012.1_2724056_2724776_-,NA|88aa|up_3|NC_009012.1_2725444_2725708_-,NA|96aa|up_2|NC_009012.1_2725601_2725889_+,cas8b1|559aa|down_6|NC_009012.1_2746963_2748640_-	NA|143aa|up_9|NC_009012.1_2720213_2720642_-	COG0835, CheW, Chemotaxis signal transduction protein [Cell motility and secretion / Signal transduction mechanisms]	NA|187aa|up_8|NC_009012.1_2721585_2722146_+	cd01192, INT_C_like_3, Uncharacterized site-specific tyrosine recombinase, C-terminal catalytic domain	NA|311aa|up_7|NC_009012.1_2722447_2723380_+	smart00342, HTH_ARAC, helix_turn_helix, arabinose operon control protein	NA|185aa|up_6|NC_009012.1_2723352_2723907_-	pfam01161, PBP, Phosphatidylethanolamine-binding protein	NA|240aa|up_5|NC_009012.1_2724056_2724776_-	NA	NA|227aa|up_4|NC_009012.1_2724754_2725435_-	COG1131, CcmA, ABC-type multidrug transport system, ATPase component [Defense mechanisms]	NA|88aa|up_3|NC_009012.1_2725444_2725708_-	NA	NA|96aa|up_2|NC_009012.1_2725601_2725889_+	NA	NA|495aa|up_1|NC_009012.1_2726778_2728263_+	COG4584, COG4584, Transposase and inactivated derivatives [DNA replication, recombination, and repair]	NA|253aa|up_0|NC_009012.1_2728262_2729021_+	PRK09183, PRK09183, transposase/IS protein; Provisional	cas2|88aa|down_0|NC_009012.1_2741247_2741511_-	cd09725, Cas2_I_II_III, CRISPR/Cas system-associated protein Cas2	cas1|331aa|down_1|NC_009012.1_2741524_2742517_-	TIGR03641, cas1_HMARI, CRISPR-associated endonuclease Cas1, subtype I-B/HMARI/TNEAP	cas4|169aa|down_2|NC_009012.1_2742530_2743037_-	pfam01930, Cas_Cas4, Domain of unknown function DUF83	cas3|751aa|down_3|NC_009012.1_2743055_2745308_-	cd09639, Cas3_I, CRISPR/Cas system-associated protein Cas3	cas5|242aa|down_4|NC_009012.1_2745329_2746055_-	TIGR01895, conserved_hypothetical_protein, CRISPR-associated protein Cas5, subtype I-B/TNEAP	cas7|295aa|down_5|NC_009012.1_2746073_2746958_-	TIGR02585, conserved_protein, CRISPR-associated protein Cas7/Cst2/DevR, subtype I-B/TNEAP	cas8b1|559aa|down_6|NC_009012.1_2746963_2748640_-	NA	cas6|241aa|down_7|NC_009012.1_2748651_2749374_-	COG1583, COG1583, CRISPR system related protein, RAMP superfamily [Defense mechanisms]	NA|307aa|down_8|NC_009012.1_2749630_2750551_+	cd00537, MTHFR, Methylenetetrahydrofolate reductase (MTHFR)	NA|273aa|down_9|NC_009012.1_2750580_2751399_-	PRK00281, PRK00281, undecaprenyl-diphosphate phosphatase
GCF_000015865.1_ASM1586v1	NC_009012	Hungateiclostridium thermocellum ATCC 27405, complete sequence	4	3785203-3791136	4,4,4	CRISPRCasFinder,CRT,PILER-CR	no	cas8b1,cas7,cas5,cas3,cas6,csm3gr7,csx10gr5,csx19,csx1	csa3,DinG,DEDDh,cas3,WYL,csm2gr11,csm3gr7,csx10gr5,cas10,cas6,csx1,cas2,cas1,cas4,cas5,cas7,cas8b1,csx19	Type I-B	GTTGAAGTGGTACTTCCAGTAAAACAAGGATTGAAAC,GTTGAAGTGGTACTTCCAGTAAAACAAGGATTGAAACNN,GTTGAAGTGGTACTTCCAGTAAAACAAGGATTGAAAC	37,39,37	0	0	NA	NA	?:?:?	78,80,40	80	TypeI-B	csa3,DinG,DEDDh,cas3,WYL,csm2gr11,csm3gr7,csx10gr5,cas10,cas6,csx1,cas2,cas1,cas4,cas5,cas7,cas8b1,csx19	NA|210aa|up_8|NC_009012.1_3775000_3775630_-,NA|142aa|up_5|NC_009012.1_3777694_3778120_+,NA	NA|407aa|up_9|NC_009012.1_3773636_3774857_+	cd17869, TadZ-like, pilus assembly protein TadZ	NA|210aa|up_8|NC_009012.1_3775000_3775630_-	NA	NA|237aa|up_7|NC_009012.1_3775853_3776564_+	cd11009, Zn_dep_PLPC, Zinc dependent phospholipase C (alpha toxin)	NA|217aa|up_6|NC_009012.1_3776836_3777487_-	pfam09580, Spore_YhcN_YlaJ, Sporulation lipoprotein YhcN/YlaJ (Spore_YhcN_YlaJ)	NA|142aa|up_5|NC_009012.1_3777694_3778120_+	NA	NA|454aa|up_4|NC_009012.1_3778202_3779564_-	pfam06245, DUF1015, Protein of unknown function (DUF1015)	NA|185aa|up_3|NC_009012.1_3779816_3780371_+	COG3881, COG3881, PRC-barrel domain containing protein [General function prediction only]	NA|62aa|up_2|NC_009012.1_3780375_3780561_+	pfam12732, YtxH, YtxH-like protein	NA|350aa|up_1|NC_009012.1_3780748_3781798_+	COG0628, yhhT, Predicted permease, member of the PurR regulon [General function prediction only]	NA|881aa|up_0|NC_009012.1_3782068_3784711_+	PRK00252, alaS, alanyl-tRNA synthetase; Reviewed	cas8b1|614aa|down_0|NC_009012.1_3791467_3793309_+	pfam09484, Cas_TM1802, CRISPR-associated protein TM1802 (cas_TM1802)	cas7|306aa|down_1|NC_009012.1_3793308_3794226_+	TIGR02590, hypothetical_protein_MM_0563, CRISPR-associated protein Cas7/Csh2, subtype I-B/HMARI	cas5|238aa|down_2|NC_009012.1_3794240_3794954_+	TIGR02592, hypothetical_protein_CTC_01466, CRISPR-associated protein Cas5, subtype I-B/HMARI	cas3|801aa|down_3|NC_009012.1_3794967_3797370_+	cd17930, DEXHc_cas3, DEXH/Q-box helicase domain of Cas3	cas6|223aa|down_4|NC_009012.1_3797383_3798052_+	pfam17262, DUF5328, Family of unknown function (DUF5328)	NA|408aa|down_5|NC_009012.1_3798674_3799898_+	pfam00872, Transposase_mut, Transposase, Mutator family	NA|357aa|down_6|NC_009012.1_3800182_3801253_-	COG2826, Tra8, Transposase and inactivated derivatives, IS30 family [DNA replication, recombination, and repair]	csm3gr7|225aa|down_7|NC_009012.1_3802517_3803192_+	pfam03787, RAMPs, RAMP superfamily	csx10gr5|534aa|down_8|NC_009012.1_3803184_3804786_+	cd09700, Csx10, CRISPR/Cas system-associated RAMP superfamily protein Csx10	csm3gr7|444aa|down_9|NC_009012.1_3804782_3806114_+	cd09726, RAMP_I_III, CRISPR/Cas system-associated RAMP superfamily protein
GCF_000015865.1_ASM1586v1	NC_009012	Hungateiclostridium thermocellum ATCC 27405, complete sequence	5	3813209-3816348	5,5,5	PILER-CR,CRISPRCasFinder,CRT	no	cas7,cas5,cas3,cas6,csm3gr7,csx10gr5,csx19,csx1,cas1,cas2,cas4	csa3,DinG,DEDDh,cas3,WYL,csm2gr11,csm3gr7,csx10gr5,cas10,cas6,csx1,cas2,cas1,cas4,cas5,cas7,cas8b1,csx19	Unclear	GTTGAAGAGGTACTTCCAGTAAAACAAGGATTGAAAC,GTTGAAGAGGTACTTCCAGTAAAACAAGGATTGAAAC,GTTGAAGAGGTACTTCCAGTAAAACAAGGATTGAAAC	37,37,37	0	0	NA	NA	?:?:?	41,42,42	42	Unclear	csa3,DinG,DEDDh,cas3,WYL,csm2gr11,csm3gr7,csx10gr5,cas10,cas6,csx1,cas2,cas1,cas4,cas5,cas7,cas8b1,csx19	NA,NA|73aa|down_3|NC_009012.1_3818366_3818585_-,NA|47aa|down_4|NC_009012.1_3818695_3818836_-,NA|216aa|down_5|NC_009012.1_3819664_3820312_+,NA|267aa|down_6|NC_009012.1_3820448_3821249_+,NA|166aa|down_7|NC_009012.1_3821296_3821794_+	cas6|223aa|up_9|NC_009012.1_3797383_3798052_+	pfam17262, DUF5328, Family of unknown function (DUF5328)	NA|408aa|up_8|NC_009012.1_3798674_3799898_+	pfam00872, Transposase_mut, Transposase, Mutator family	NA|357aa|up_7|NC_009012.1_3800182_3801253_-	COG2826, Tra8, Transposase and inactivated derivatives, IS30 family [DNA replication, recombination, and repair]	csm3gr7|225aa|up_6|NC_009012.1_3802517_3803192_+	pfam03787, RAMPs, RAMP superfamily	csx10gr5|534aa|up_5|NC_009012.1_3803184_3804786_+	cd09700, Csx10, CRISPR/Cas system-associated RAMP superfamily protein Csx10	csm3gr7|444aa|up_4|NC_009012.1_3804782_3806114_+	cd09726, RAMP_I_III, CRISPR/Cas system-associated RAMP superfamily protein	csx19|122aa|up_3|NC_009012.1_3806120_3806486_+	TIGR03984, hypothetical_protein_FrEUN1fDRAFT_5778, CRISPR-associated protein, TIGR03984 family	csm3gr7|662aa|up_2|NC_009012.1_3806500_3808486_+	TIGR03986, CRISPR-associated_protein, CRISPR-associated protein	csx1|417aa|up_1|NC_009012.1_3808629_3809880_+	cd09732, Csx1_III-U, CRISPR/Cas system-associated protein Csx1	NA|408aa|up_0|NC_009012.1_3810125_3811349_+	pfam00872, Transposase_mut, Transposase, Mutator family	cas1|332aa|down_0|NC_009012.1_3816439_3817435_+	pfam01867, Cas_Cas1, CRISPR associated protein Cas1	cas2|97aa|down_1|NC_009012.1_3817428_3817719_+	cd09725, Cas2_I_II_III, CRISPR/Cas system-associated protein Cas2	cas4|205aa|down_2|NC_009012.1_3817693_3818308_+	cd09637, Cas4_I-A_I-B_I-C_I-D_II-B, CRISPR/Cas system-associated protein Cas4	NA|73aa|down_3|NC_009012.1_3818366_3818585_-	NA	NA|47aa|down_4|NC_009012.1_3818695_3818836_-	NA	NA|216aa|down_5|NC_009012.1_3819664_3820312_+	NA	NA|267aa|down_6|NC_009012.1_3820448_3821249_+	NA	NA|166aa|down_7|NC_009012.1_3821296_3821794_+	NA	NA|287aa|down_8|NC_009012.1_3822303_3823164_+	cd00200, WD40, WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment	NA|171aa|down_9|NC_009012.1_3823878_3824391_+	pfam17117, DUF5104, Domain of unknown function (DUF5104)
