assembly_id	genome_id	genome_def	crispr_array_locus_merge	crispr_array_location_merge	crispr_locus_id	crispr_pred_method	array_in_prot	prot_within_array_20000	prot_in_genome	crispr_type_by_cas_prot	consensus_repeat	repeat_length	self-targeting_spacer_number	self-targeting_target_number	spacer_location	protospacer_location	repeat_type	spacer_locus_num	spacer_num	correct_crispr_type	genome_cas_prots	unknown_protein_around_crispr	L10	L10_domain	L9	L9_domain	L8	L8_domain	L7	L7_domain	L6	L6_domain	L5	L5_domain	L4	L4_domain	L3	L3_domain	L2	L2_domain	L1	L1_domain	R1	R1_domain	R2	R2_domain	R3	R3_domain	R4	R4_domain	R5	R5_domain	R6	R6_domain	R7	R7_domain	R8	R8_domain	R9	R9_domain	R10	R10_domain
GCF_901538265.1_ASM90153826v1	NZ_LR593886	Gemmata massiliana isolate Soil9 chromosome 1	1	1139991-1140079	1	CRISPRCasFinder	no		csa3,RT,cas1,cas2,cas3,cas8u2,cas7,cas5u,DinG,cas8u1,DEDDh,PD-DExK,csb2gr5	Orphan	GACGCGTTCGGGAAGACCACGACGA	25	0	0	NA	NA	NA	1	1	Orphan	csa3,RT,cas1,cas2,cas3,cas8u2,cas7,cas5u,DinG,cas8u1,DEDDh,PD-DExK,csb2gr5	NA|259aa|up_9|NZ_LR593886.1_1111566_1112343_+,NA|135aa|up_5|NZ_LR593886.1_1120904_1121309_-,NA|361aa|down_0|NZ_LR593886.1_1146187_1147270_+,NA|142aa|down_3|NZ_LR593886.1_1150257_1150683_+,NA|81aa|down_6|NZ_LR593886.1_1152895_1153138_+	NA|259aa|up_9|NZ_LR593886.1_1111566_1112343_+	NA	NA|709aa|up_8|NZ_LR593886.1_1112392_1114519_+	cd14014, STKc_PknB_like, Catalytic domain of bacterial Serine/Threonine kinases, PknB and similar proteins	NA|346aa|up_7|NZ_LR593886.1_1114718_1115756_+	TIGR02996, rpt_mate_G_obs, repeat-companion domain TIGR02996	NA|1529aa|up_6|NZ_LR593886.1_1115799_1120386_-	cd01406, SIR2-like, Sir2-like: Prokaryotic group of uncharacterized Sir2-like proteins which lack certain key catalytic residues and conserved zinc binding cysteines; and are members of the SIR2 superfamily of proteins, silent information regulator 2 (Sir2) enzymes which catalyze NAD+-dependent protein/histone deacetylation	NA|135aa|up_5|NZ_LR593886.1_1120904_1121309_-	NA	NA|876aa|up_4|NZ_LR593886.1_1121292_1123920_-	TIGR02760, conjugative_transfer_relaxase_protein_TraI, conjugative transfer relaxase protein TraI	NA|585aa|up_3|NZ_LR593886.1_1124252_1126007_+	pfam01734, Patatin, Patatin-like phospholipase	NA|238aa|up_2|NZ_LR593886.1_1127361_1128075_-	COG0412, COG0412, Dienelactone hydrolase and related enzymes [Secondary metabolites biosynthesis, transport, and catabolism]	NA|476aa|up_1|NZ_LR593886.1_1128295_1129723_-	cd02511, Beta4Glucosyltransferase, UDP-glucose LOS-beta-1,4 glucosyltransferase is required for biosynthesis of lipooligosaccharide	NA|277aa|up_0|NZ_LR593886.1_1133413_1134244_-	TIGR01444, 2-O-methyltransferase_NoeI, methyltransferase, FkbM family	NA|361aa|down_0|NZ_LR593886.1_1146187_1147270_+	NA	NA|618aa|down_1|NZ_LR593886.1_1147300_1149154_+	sd00006, TPR, Tetratricopeptide repeat	NA|274aa|down_2|NZ_LR593886.1_1149207_1150029_+	COG1216, COG1216, Predicted glycosyltransferases [General function prediction only]	NA|142aa|down_3|NZ_LR593886.1_1150257_1150683_+	NA	NA|120aa|down_4|NZ_LR593886.1_1150676_1151036_+	pfam05717, TnpB_IS66, IS66 Orf2 like protein	NA|507aa|down_5|NZ_LR593886.1_1151095_1152616_+	pfam03050, DDE_Tnp_IS66, Transposase IS66 family	NA|81aa|down_6|NZ_LR593886.1_1152895_1153138_+	NA	NA|145aa|down_7|NZ_LR593886.1_1153134_1153569_+	cd09873, PIN_Pae0151-like, VapC-like PIN domain of the Pyrobaculum aerophilum Pae0151 and Pae2754 proteins and homologs	NA|324aa|down_8|NZ_LR593886.1_1153583_1154555_+	cd01196, INT_C_like_6, Uncharacterized site-specific tyrosine recombinase, C-terminal catalytic domain	NA|164aa|down_9|NZ_LR593886.1_1154584_1155076_-	pfam13604, AAA_30, AAA domain
GCF_901538265.1_ASM90153826v1	NZ_LR593886	Gemmata massiliana isolate Soil9 chromosome 1	2	1140255-1140342	2	CRISPRCasFinder	no		csa3,RT,cas1,cas2,cas3,cas8u2,cas7,cas5u,DinG,cas8u1,DEDDh,PD-DExK,csb2gr5	Orphan	GACGCGTTCGGGAAGACCACGACGA	25	0	0	NA	NA	NA	1	1	Orphan	csa3,RT,cas1,cas2,cas3,cas8u2,cas7,cas5u,DinG,cas8u1,DEDDh,PD-DExK,csb2gr5	NA|259aa|up_9|NZ_LR593886.1_1111566_1112343_+,NA|135aa|up_5|NZ_LR593886.1_1120904_1121309_-,NA|361aa|down_0|NZ_LR593886.1_1146187_1147270_+,NA|142aa|down_3|NZ_LR593886.1_1150257_1150683_+,NA|81aa|down_6|NZ_LR593886.1_1152895_1153138_+	NA|259aa|up_9|NZ_LR593886.1_1111566_1112343_+	NA	NA|709aa|up_8|NZ_LR593886.1_1112392_1114519_+	cd14014, STKc_PknB_like, Catalytic domain of bacterial Serine/Threonine kinases, PknB and similar proteins	NA|346aa|up_7|NZ_LR593886.1_1114718_1115756_+	TIGR02996, rpt_mate_G_obs, repeat-companion domain TIGR02996	NA|1529aa|up_6|NZ_LR593886.1_1115799_1120386_-	cd01406, SIR2-like, Sir2-like: Prokaryotic group of uncharacterized Sir2-like proteins which lack certain key catalytic residues and conserved zinc binding cysteines; and are members of the SIR2 superfamily of proteins, silent information regulator 2 (Sir2) enzymes which catalyze NAD+-dependent protein/histone deacetylation	NA|135aa|up_5|NZ_LR593886.1_1120904_1121309_-	NA	NA|876aa|up_4|NZ_LR593886.1_1121292_1123920_-	TIGR02760, conjugative_transfer_relaxase_protein_TraI, conjugative transfer relaxase protein TraI	NA|585aa|up_3|NZ_LR593886.1_1124252_1126007_+	pfam01734, Patatin, Patatin-like phospholipase	NA|238aa|up_2|NZ_LR593886.1_1127361_1128075_-	COG0412, COG0412, Dienelactone hydrolase and related enzymes [Secondary metabolites biosynthesis, transport, and catabolism]	NA|476aa|up_1|NZ_LR593886.1_1128295_1129723_-	cd02511, Beta4Glucosyltransferase, UDP-glucose LOS-beta-1,4 glucosyltransferase is required for biosynthesis of lipooligosaccharide	NA|277aa|up_0|NZ_LR593886.1_1133413_1134244_-	TIGR01444, 2-O-methyltransferase_NoeI, methyltransferase, FkbM family	NA|361aa|down_0|NZ_LR593886.1_1146187_1147270_+	NA	NA|618aa|down_1|NZ_LR593886.1_1147300_1149154_+	sd00006, TPR, Tetratricopeptide repeat	NA|274aa|down_2|NZ_LR593886.1_1149207_1150029_+	COG1216, COG1216, Predicted glycosyltransferases [General function prediction only]	NA|142aa|down_3|NZ_LR593886.1_1150257_1150683_+	NA	NA|120aa|down_4|NZ_LR593886.1_1150676_1151036_+	pfam05717, TnpB_IS66, IS66 Orf2 like protein	NA|507aa|down_5|NZ_LR593886.1_1151095_1152616_+	pfam03050, DDE_Tnp_IS66, Transposase IS66 family	NA|81aa|down_6|NZ_LR593886.1_1152895_1153138_+	NA	NA|145aa|down_7|NZ_LR593886.1_1153134_1153569_+	cd09873, PIN_Pae0151-like, VapC-like PIN domain of the Pyrobaculum aerophilum Pae0151 and Pae2754 proteins and homologs	NA|324aa|down_8|NZ_LR593886.1_1153583_1154555_+	cd01196, INT_C_like_6, Uncharacterized site-specific tyrosine recombinase, C-terminal catalytic domain	NA|164aa|down_9|NZ_LR593886.1_1154584_1155076_-	pfam13604, AAA_30, AAA domain
GCF_901538265.1_ASM90153826v1	NZ_LR593886	Gemmata massiliana isolate Soil9 chromosome 1	3	1373844-1375187	1,3,1	PILER-CR,CRISPRCasFinder,CRT	no	csa3	csa3,RT,cas1,cas2,cas3,cas8u2,cas7,cas5u,DinG,cas8u1,DEDDh,PD-DExK,csb2gr5	Type I-A	GTTCCACTCCCGCCCCGTCTGCGCCCGCGTTGCTAA,GTTCCACTCCCGCCCCGTCTGCGCCCGCGTTGCTAA,GTTCCACTCCCGCCCCGTCTGCGCCCGCGTTGCTAA	36,36,36	0	0	NA	NA	NA:NA:NA	18,18,18	18	Orphan	csa3,RT,cas1,cas2,cas3,cas8u2,cas7,cas5u,DinG,cas8u1,DEDDh,PD-DExK,csb2gr5	NA|112aa|up_7|NZ_LR593886.1_1367975_1368311_+,NA|292aa|up_6|NZ_LR593886.1_1368377_1369253_+,NA|128aa|up_5|NZ_LR593886.1_1369320_1369704_+,NA|446aa|up_4|NZ_LR593886.1_1369763_1371101_+,NA|114aa|up_1|NZ_LR593886.1_1372503_1372845_+,NA|103aa|up_0|NZ_LR593886.1_1372957_1373266_+,NA|143aa|down_0|NZ_LR593886.1_1375330_1375759_-,NA|66aa|down_3|NZ_LR593886.1_1376591_1376789_+,NA|83aa|down_4|NZ_LR593886.1_1376901_1377150_+,NA|93aa|down_5|NZ_LR593886.1_1377194_1377473_+,NA|78aa|down_6|NZ_LR593886.1_1377509_1377743_+,NA|98aa|down_7|NZ_LR593886.1_1378319_1378613_+,NA|130aa|down_9|NZ_LR593886.1_1379851_1380241_+	NA|170aa|up_9|NZ_LR593886.1_1367082_1367592_+	pfam04586, Peptidase_S78, Caudovirus prohead serine protease	NA|79aa|up_8|NZ_LR593886.1_1367728_1367965_+	pfam12728, HTH_17, Helix-turn-helix domain	NA|112aa|up_7|NZ_LR593886.1_1367975_1368311_+	NA	NA|292aa|up_6|NZ_LR593886.1_1368377_1369253_+	NA	NA|128aa|up_5|NZ_LR593886.1_1369320_1369704_+	NA	NA|446aa|up_4|NZ_LR593886.1_1369763_1371101_+	NA	NA|124aa|up_3|NZ_LR593886.1_1371100_1371472_+	pfam16459, Phage_TAC_13, Phage tail assembly chaperone, TAC	NA|174aa|up_2|NZ_LR593886.1_1371602_1372124_+	pfam05119, Terminase_4, Phage terminase, small subunit	NA|114aa|up_1|NZ_LR593886.1_1372503_1372845_+	NA	NA|103aa|up_0|NZ_LR593886.1_1372957_1373266_+	NA	NA|143aa|down_0|NZ_LR593886.1_1375330_1375759_-	NA	csa3|110aa|down_1|NZ_LR593886.1_1375786_1376116_-	cd00090, HTH_ARSR, Arsenical Resistance Operon Repressor and similar prokaryotic, metal regulated homodimeric repressors	NA|82aa|down_2|NZ_LR593886.1_1376146_1376392_-	cd19148, AKR_AKR11B1, Bacillus subtilis aldo-keto reductase YhdN and similar proteins	NA|66aa|down_3|NZ_LR593886.1_1376591_1376789_+	NA	NA|83aa|down_4|NZ_LR593886.1_1376901_1377150_+	NA	NA|93aa|down_5|NZ_LR593886.1_1377194_1377473_+	NA	NA|78aa|down_6|NZ_LR593886.1_1377509_1377743_+	NA	NA|98aa|down_7|NZ_LR593886.1_1378319_1378613_+	NA	NA|94aa|down_8|NZ_LR593886.1_1379024_1379306_+	TIGR01764, Probable_excisionase, DNA binding domain, excisionase family	NA|130aa|down_9|NZ_LR593886.1_1379851_1380241_+	NA
GCF_901538265.1_ASM90153826v1	NZ_LR593886	Gemmata massiliana isolate Soil9 chromosome 1	4	1537875-1537952	4	CRISPRCasFinder	no		csa3,RT,cas1,cas2,cas3,cas8u2,cas7,cas5u,DinG,cas8u1,DEDDh,PD-DExK,csb2gr5	Orphan	GAATGAGCCGCGACCGCGAGGGAGCG	26	0	0	NA	NA	NA	1	1	Orphan	csa3,RT,cas1,cas2,cas3,cas8u2,cas7,cas5u,DinG,cas8u1,DEDDh,PD-DExK,csb2gr5	NA|197aa|up_4|NZ_LR593886.1_1529523_1530114_-,NA|545aa|up_3|NZ_LR593886.1_1530162_1531797_-,NA|93aa|up_1|NZ_LR593886.1_1535801_1536080_-,NA|781aa|down_0|NZ_LR593886.1_1538229_1540572_+,NA|174aa|down_1|NZ_LR593886.1_1540986_1541508_+,NA|212aa|down_2|NZ_LR593886.1_1541808_1542444_+,NA|179aa|down_5|NZ_LR593886.1_1546063_1546600_+	NA|109aa|up_9|NZ_LR593886.1_1521589_1521916_+	smart00886, Dabb, Stress responsive A/B Barrel Domain	NA|232aa|up_8|NZ_LR593886.1_1522095_1522791_-	cd02910, cupin_Yhhw_N, Escherichia coli YhhW and YhaK and related proteins, pirin-like bicupin, N-terminal cupin domain	NA|447aa|up_7|NZ_LR593886.1_1522951_1524292_-	pfam07394, DUF1501, Protein of unknown function (DUF1501)	NA|799aa|up_6|NZ_LR593886.1_1524420_1526817_-	pfam07583, PSCyt2, Protein of unknown function (DUF1549)	NA|811aa|up_5|NZ_LR593886.1_1527027_1529460_-	pfam01804, Penicil_amidase, Penicillin amidase	NA|197aa|up_4|NZ_LR593886.1_1529523_1530114_-	NA	NA|545aa|up_3|NZ_LR593886.1_1530162_1531797_-	NA	NA|768aa|up_2|NZ_LR593886.1_1532784_1535088_-	PRK15048, PRK15048, methyl-accepting chemotaxis protein II; Provisional	NA|93aa|up_1|NZ_LR593886.1_1535801_1536080_-	NA	NA|244aa|up_0|NZ_LR593886.1_1537137_1537869_+	COG0670, COG0670, Integral membrane protein, interacts with FtsH [General function prediction only]	NA|781aa|down_0|NZ_LR593886.1_1538229_1540572_+	NA	NA|174aa|down_1|NZ_LR593886.1_1540986_1541508_+	NA	NA|212aa|down_2|NZ_LR593886.1_1541808_1542444_+	NA	NA|374aa|down_3|NZ_LR593886.1_1542526_1543648_-	TIGR02996, rpt_mate_G_obs, repeat-companion domain TIGR02996	NA|321aa|down_4|NZ_LR593886.1_1544698_1545661_+	pfam07596, SBP_bac_10, Protein of unknown function (DUF1559)	NA|179aa|down_5|NZ_LR593886.1_1546063_1546600_+	NA	NA|272aa|down_6|NZ_LR593886.1_1546740_1547556_+	TIGR01730, COG0845:_Membrane-fusion_protein, RND family efflux transporter, MFP subunit	NA|296aa|down_7|NZ_LR593886.1_1547628_1548516_-	TIGR02800, Protein_TolB, tol-pal system beta propeller repeat protein TolB	NA|452aa|down_8|NZ_LR593886.1_1548571_1549927_-	COG1222, RPT1, ATP-dependent 26S proteasome regulatory subunit [Posttranslational modification, protein turnover, chaperones]	NA|464aa|down_9|NZ_LR593886.1_1550000_1551392_-	pfam07394, DUF1501, Protein of unknown function (DUF1501)
GCF_901538265.1_ASM90153826v1	NZ_LR593886	Gemmata massiliana isolate Soil9 chromosome 1	5	2070914-2071015	5	CRISPRCasFinder	no		csa3,RT,cas1,cas2,cas3,cas8u2,cas7,cas5u,DinG,cas8u1,DEDDh,PD-DExK,csb2gr5	Orphan	GGGTTCGGCTCGCACCCCGAATTT	24	0	0	NA	NA	NA	1	1	Orphan	csa3,RT,cas1,cas2,cas3,cas8u2,cas7,cas5u,DinG,cas8u1,DEDDh,PD-DExK,csb2gr5	NA|100aa|up_9|NZ_LR593886.1_2057512_2057812_-,NA|57aa|up_7|NZ_LR593886.1_2060920_2061091_-,NA|81aa|up_0|NZ_LR593886.1_2069716_2069959_+,NA|219aa|down_1|NZ_LR593886.1_2076243_2076900_-,NA|132aa|down_5|NZ_LR593886.1_2081927_2082323_+,NA|86aa|down_6|NZ_LR593886.1_2082476_2082734_-,NA|70aa|down_7|NZ_LR593886.1_2083460_2083670_+	NA|100aa|up_9|NZ_LR593886.1_2057512_2057812_-	NA	NA|503aa|up_8|NZ_LR593886.1_2058701_2060209_-	pfam10551, MULE, MULE transposase domain	NA|57aa|up_7|NZ_LR593886.1_2060920_2061091_-	NA	NA|319aa|up_6|NZ_LR593886.1_2061404_2062361_+	pfam00561, Abhydrolase_1, alpha/beta hydrolase fold	NA|375aa|up_5|NZ_LR593886.1_2062357_2063482_+	cd02932, OYE_YqiM_FMN, Old yellow enzyme (OYE) YqjM-like FMN binding domain	NA|282aa|up_4|NZ_LR593886.1_2063573_2064419_+	cd16361, VOC_ShValD_like, vicinal oxygen chelate (VOC) family protein similar to Streptomyces hygroscopicus ValD protein	NA|206aa|up_3|NZ_LR593886.1_2064573_2065191_+	pfam06966, DUF1295, Protein of unknown function (DUF1295)	NA|200aa|up_2|NZ_LR593886.1_2065311_2065911_+	TIGR02937, RNA_polymerase_sigma_factor, RNA polymerase sigma factor, sigma-70 family	NA|1113aa|up_1|NZ_LR593886.1_2065968_2069307_+	cd14014, STKc_PknB_like, Catalytic domain of bacterial Serine/Threonine kinases, PknB and similar proteins	NA|81aa|up_0|NZ_LR593886.1_2069716_2069959_+	NA	NA|1484aa|down_0|NZ_LR593886.1_2071075_2075527_-	NF033176, auto_AIDA-I, autotransporter adhesin AIDA-I	NA|219aa|down_1|NZ_LR593886.1_2076243_2076900_-	NA	NA|746aa|down_2|NZ_LR593886.1_2077071_2079309_-	cd14014, STKc_PknB_like, Catalytic domain of bacterial Serine/Threonine kinases, PknB and similar proteins	NA|188aa|down_3|NZ_LR593886.1_2079382_2079946_-	pfam07638, Sigma70_ECF, ECF sigma factor	NA|302aa|down_4|NZ_LR593886.1_2080912_2081818_+	pfam07596, SBP_bac_10, Protein of unknown function (DUF1559)	NA|132aa|down_5|NZ_LR593886.1_2081927_2082323_+	NA	NA|86aa|down_6|NZ_LR593886.1_2082476_2082734_-	NA	NA|70aa|down_7|NZ_LR593886.1_2083460_2083670_+	NA	NA|160aa|down_8|NZ_LR593886.1_2083715_2084195_+	pfam11149, DUF2924, Protein of unknown function (DUF2924)	NA|83aa|down_9|NZ_LR593886.1_2084595_2084844_+	cd04184, GT2_RfbC_Mx_like, Myxococcus xanthus RfbC like proteins are required for O-antigen biosynthesis
GCF_901538265.1_ASM90153826v1	NZ_LR593886	Gemmata massiliana isolate Soil9 chromosome 1	6	2979467-2979563	6	CRISPRCasFinder	no		csa3,RT,cas1,cas2,cas3,cas8u2,cas7,cas5u,DinG,cas8u1,DEDDh,PD-DExK,csb2gr5	Orphan	CGCGAGTTCTTTTGCCCCGGCGTC	24	0	0	NA	NA	NA	1	1	Orphan	csa3,RT,cas1,cas2,cas3,cas8u2,cas7,cas5u,DinG,cas8u1,DEDDh,PD-DExK,csb2gr5	NA|141aa|up_7|NZ_LR593886.1_2973319_2973742_+,NA|98aa|up_6|NZ_LR593886.1_2973843_2974137_+,NA|180aa|up_5|NZ_LR593886.1_2974515_2975055_+,NA|264aa|up_4|NZ_LR593886.1_2975055_2975847_-,NA|101aa|up_3|NZ_LR593886.1_2975966_2976269_-,NA|170aa|down_0|NZ_LR593886.1_2980358_2980868_-,NA|81aa|down_2|NZ_LR593886.1_2982436_2982679_-,NA|270aa|down_3|NZ_LR593886.1_2983374_2984184_+,NA|96aa|down_5|NZ_LR593886.1_2985217_2985505_-,NA|189aa|down_6|NZ_LR593886.1_2986364_2986931_-,NA|82aa|down_7|NZ_LR593886.1_2986982_2987228_-,NA|131aa|down_8|NZ_LR593886.1_2987800_2988193_+	NA|138aa|up_9|NZ_LR593886.1_2970924_2971338_-	COG4775, COG4775, Outer membrane protein/protective antigen OMA87 [Cell envelope biogenesis, outer membrane]	NA|321aa|up_8|NZ_LR593886.1_2972215_2973178_+	pfam07596, SBP_bac_10, Protein of unknown function (DUF1559)	NA|141aa|up_7|NZ_LR593886.1_2973319_2973742_+	NA	NA|98aa|up_6|NZ_LR593886.1_2973843_2974137_+	NA	NA|180aa|up_5|NZ_LR593886.1_2974515_2975055_+	NA	NA|264aa|up_4|NZ_LR593886.1_2975055_2975847_-	NA	NA|101aa|up_3|NZ_LR593886.1_2975966_2976269_-	NA	NA|417aa|up_2|NZ_LR593886.1_2976569_2977820_-	sd00034, LRR_AMN1, leucine-rich repeats, antagonist of mitotic exit network protein 1-like subfamily	NA|221aa|up_1|NZ_LR593886.1_2977968_2978631_-	pfam07589, VPEP, PEP-CTERM motif	NA|126aa|up_0|NZ_LR593886.1_2978899_2979277_-	TIGR03066, Gem_osc_para_1, Gemmata obscuriglobus paralogous family TIGR03066	NA|170aa|down_0|NZ_LR593886.1_2980358_2980868_-	NA	NA|358aa|down_1|NZ_LR593886.1_2980937_2982011_-	cd02966, TlpA_like_family, TlpA-like family; composed of  TlpA, ResA, DsbE and similar proteins	NA|81aa|down_2|NZ_LR593886.1_2982436_2982679_-	NA	NA|270aa|down_3|NZ_LR593886.1_2983374_2984184_+	NA	NA|184aa|down_4|NZ_LR593886.1_2984597_2985149_-	pfam09346, SMI1_KNR4, SMI1 / KNR4 family (SUKH-1)	NA|96aa|down_5|NZ_LR593886.1_2985217_2985505_-	NA	NA|189aa|down_6|NZ_LR593886.1_2986364_2986931_-	NA	NA|82aa|down_7|NZ_LR593886.1_2986982_2987228_-	NA	NA|131aa|down_8|NZ_LR593886.1_2987800_2988193_+	NA	NA|158aa|down_9|NZ_LR593886.1_2988466_2988940_+	smart00871, AraC_E_bind, Bacterial transcription activator, effector binding domain
GCF_901538265.1_ASM90153826v1	NZ_LR593886	Gemmata massiliana isolate Soil9 chromosome 1	7	2979755-2979853	7	CRISPRCasFinder	no		csa3,RT,cas1,cas2,cas3,cas8u2,cas7,cas5u,DinG,cas8u1,DEDDh,PD-DExK,csb2gr5	Orphan	CGCGAGTTCTTTTGCCCCGGCGTC	24	0	0	NA	NA	NA	1	1	Orphan	csa3,RT,cas1,cas2,cas3,cas8u2,cas7,cas5u,DinG,cas8u1,DEDDh,PD-DExK,csb2gr5	NA|141aa|up_7|NZ_LR593886.1_2973319_2973742_+,NA|98aa|up_6|NZ_LR593886.1_2973843_2974137_+,NA|180aa|up_5|NZ_LR593886.1_2974515_2975055_+,NA|264aa|up_4|NZ_LR593886.1_2975055_2975847_-,NA|101aa|up_3|NZ_LR593886.1_2975966_2976269_-,NA|170aa|down_0|NZ_LR593886.1_2980358_2980868_-,NA|81aa|down_2|NZ_LR593886.1_2982436_2982679_-,NA|270aa|down_3|NZ_LR593886.1_2983374_2984184_+,NA|96aa|down_5|NZ_LR593886.1_2985217_2985505_-,NA|189aa|down_6|NZ_LR593886.1_2986364_2986931_-,NA|82aa|down_7|NZ_LR593886.1_2986982_2987228_-,NA|131aa|down_8|NZ_LR593886.1_2987800_2988193_+	NA|138aa|up_9|NZ_LR593886.1_2970924_2971338_-	COG4775, COG4775, Outer membrane protein/protective antigen OMA87 [Cell envelope biogenesis, outer membrane]	NA|321aa|up_8|NZ_LR593886.1_2972215_2973178_+	pfam07596, SBP_bac_10, Protein of unknown function (DUF1559)	NA|141aa|up_7|NZ_LR593886.1_2973319_2973742_+	NA	NA|98aa|up_6|NZ_LR593886.1_2973843_2974137_+	NA	NA|180aa|up_5|NZ_LR593886.1_2974515_2975055_+	NA	NA|264aa|up_4|NZ_LR593886.1_2975055_2975847_-	NA	NA|101aa|up_3|NZ_LR593886.1_2975966_2976269_-	NA	NA|417aa|up_2|NZ_LR593886.1_2976569_2977820_-	sd00034, LRR_AMN1, leucine-rich repeats, antagonist of mitotic exit network protein 1-like subfamily	NA|221aa|up_1|NZ_LR593886.1_2977968_2978631_-	pfam07589, VPEP, PEP-CTERM motif	NA|126aa|up_0|NZ_LR593886.1_2978899_2979277_-	TIGR03066, Gem_osc_para_1, Gemmata obscuriglobus paralogous family TIGR03066	NA|170aa|down_0|NZ_LR593886.1_2980358_2980868_-	NA	NA|358aa|down_1|NZ_LR593886.1_2980937_2982011_-	cd02966, TlpA_like_family, TlpA-like family; composed of  TlpA, ResA, DsbE and similar proteins	NA|81aa|down_2|NZ_LR593886.1_2982436_2982679_-	NA	NA|270aa|down_3|NZ_LR593886.1_2983374_2984184_+	NA	NA|184aa|down_4|NZ_LR593886.1_2984597_2985149_-	pfam09346, SMI1_KNR4, SMI1 / KNR4 family (SUKH-1)	NA|96aa|down_5|NZ_LR593886.1_2985217_2985505_-	NA	NA|189aa|down_6|NZ_LR593886.1_2986364_2986931_-	NA	NA|82aa|down_7|NZ_LR593886.1_2986982_2987228_-	NA	NA|131aa|down_8|NZ_LR593886.1_2987800_2988193_+	NA	NA|158aa|down_9|NZ_LR593886.1_2988466_2988940_+	smart00871, AraC_E_bind, Bacterial transcription activator, effector binding domain
GCF_901538265.1_ASM90153826v1	NZ_LR593886	Gemmata massiliana isolate Soil9 chromosome 1	8	3029932-3030671	2,8,9,10	CRT,CRISPRCasFinder,CRISPRCasFinder,CRISPRCasFinder	no		csa3,RT,cas1,cas2,cas3,cas8u2,cas7,cas5u,DinG,cas8u1,DEDDh,PD-DExK,csb2gr5	Orphan	GGNACGCNGGTGACGGACGCGGG,CACGAAGGTAACAGACGCGGGGCT,CACGAAGGTAACAGACGCGGGGCT,CACGAAGGTAACAGACGCGGGGCT	23,24,24,24	0	0	NA	NA	NA:NA:NA:NA	10,5,5,5	10	Orphan	csa3,RT,cas1,cas2,cas3,cas8u2,cas7,cas5u,DinG,cas8u1,DEDDh,PD-DExK,csb2gr5	NA|79aa|up_8|NZ_LR593886.1_3023885_3024122_-,NA|53aa|up_7|NZ_LR593886.1_3024437_3024596_-,NA|84aa|up_6|NZ_LR593886.1_3024649_3024901_-,NA|203aa|up_4|NZ_LR593886.1_3026106_3026715_+,NA|125aa|up_3|NZ_LR593886.1_3026791_3027166_-,NA|180aa|up_2|NZ_LR593886.1_3027193_3027733_-,NA|88aa|up_1|NZ_LR593886.1_3028259_3028523_+,NA|105aa|up_0|NZ_LR593886.1_3028888_3029203_+,NA|82aa|down_1|NZ_LR593886.1_3032451_3032697_+,NA|82aa|down_2|NZ_LR593886.1_3033194_3033440_+,NA|81aa|down_3|NZ_LR593886.1_3034085_3034328_+,NA|81aa|down_4|NZ_LR593886.1_3034742_3034985_+,NA|81aa|down_5|NZ_LR593886.1_3035540_3035783_+,NA|153aa|down_6|NZ_LR593886.1_3035849_3036308_+,NA|229aa|down_7|NZ_LR593886.1_3036468_3037155_+,NA|79aa|down_8|NZ_LR593886.1_3037735_3037972_+	NA|898aa|up_9|NZ_LR593886.1_3021141_3023835_-	COG3378, COG3378, Phage associated DNA primase [General function prediction only]	NA|79aa|up_8|NZ_LR593886.1_3023885_3024122_-	NA	NA|53aa|up_7|NZ_LR593886.1_3024437_3024596_-	NA	NA|84aa|up_6|NZ_LR593886.1_3024649_3024901_-	NA	NA|204aa|up_5|NZ_LR593886.1_3025092_3025704_+	PRK00215, PRK00215, transcriptional repressor LexA	NA|203aa|up_4|NZ_LR593886.1_3026106_3026715_+	NA	NA|125aa|up_3|NZ_LR593886.1_3026791_3027166_-	NA	NA|180aa|up_2|NZ_LR593886.1_3027193_3027733_-	NA	NA|88aa|up_1|NZ_LR593886.1_3028259_3028523_+	NA	NA|105aa|up_0|NZ_LR593886.1_3028888_3029203_+	NA	NA|347aa|down_0|NZ_LR593886.1_3030828_3031869_+	sd00034, LRR_AMN1, leucine-rich repeats, antagonist of mitotic exit network protein 1-like subfamily	NA|82aa|down_1|NZ_LR593886.1_3032451_3032697_+	NA	NA|82aa|down_2|NZ_LR593886.1_3033194_3033440_+	NA	NA|81aa|down_3|NZ_LR593886.1_3034085_3034328_+	NA	NA|81aa|down_4|NZ_LR593886.1_3034742_3034985_+	NA	NA|81aa|down_5|NZ_LR593886.1_3035540_3035783_+	NA	NA|153aa|down_6|NZ_LR593886.1_3035849_3036308_+	NA	NA|229aa|down_7|NZ_LR593886.1_3036468_3037155_+	NA	NA|79aa|down_8|NZ_LR593886.1_3037735_3037972_+	NA	NA|179aa|down_9|NZ_LR593886.1_3038001_3038538_+	pfam13424, TPR_12, Tetratricopeptide repeat
GCF_901538265.1_ASM90153826v1	NZ_LR593886	Gemmata massiliana isolate Soil9 chromosome 1	9	3031002-3031170	11	CRISPRCasFinder	no		csa3,RT,cas1,cas2,cas3,cas8u2,cas7,cas5u,DinG,cas8u1,DEDDh,PD-DExK,csb2gr5	Orphan	CACGAAGGTAACAGACGCGGGGCT	24	0	0	NA	NA	NA	2	2	Orphan	csa3,RT,cas1,cas2,cas3,cas8u2,cas7,cas5u,DinG,cas8u1,DEDDh,PD-DExK,csb2gr5	NA|79aa|up_8|NZ_LR593886.1_3023885_3024122_-,NA|53aa|up_7|NZ_LR593886.1_3024437_3024596_-,NA|84aa|up_6|NZ_LR593886.1_3024649_3024901_-,NA|203aa|up_4|NZ_LR593886.1_3026106_3026715_+,NA|125aa|up_3|NZ_LR593886.1_3026791_3027166_-,NA|180aa|up_2|NZ_LR593886.1_3027193_3027733_-,NA|88aa|up_1|NZ_LR593886.1_3028259_3028523_+,NA|105aa|up_0|NZ_LR593886.1_3028888_3029203_+,NA|82aa|down_0|NZ_LR593886.1_3032451_3032697_+,NA|82aa|down_1|NZ_LR593886.1_3033194_3033440_+,NA|81aa|down_2|NZ_LR593886.1_3034085_3034328_+,NA|81aa|down_3|NZ_LR593886.1_3034742_3034985_+,NA|81aa|down_4|NZ_LR593886.1_3035540_3035783_+,NA|153aa|down_5|NZ_LR593886.1_3035849_3036308_+,NA|229aa|down_6|NZ_LR593886.1_3036468_3037155_+,NA|79aa|down_7|NZ_LR593886.1_3037735_3037972_+,NA|79aa|down_9|NZ_LR593886.1_3039418_3039655_+	NA|898aa|up_9|NZ_LR593886.1_3021141_3023835_-	COG3378, COG3378, Phage associated DNA primase [General function prediction only]	NA|79aa|up_8|NZ_LR593886.1_3023885_3024122_-	NA	NA|53aa|up_7|NZ_LR593886.1_3024437_3024596_-	NA	NA|84aa|up_6|NZ_LR593886.1_3024649_3024901_-	NA	NA|204aa|up_5|NZ_LR593886.1_3025092_3025704_+	PRK00215, PRK00215, transcriptional repressor LexA	NA|203aa|up_4|NZ_LR593886.1_3026106_3026715_+	NA	NA|125aa|up_3|NZ_LR593886.1_3026791_3027166_-	NA	NA|180aa|up_2|NZ_LR593886.1_3027193_3027733_-	NA	NA|88aa|up_1|NZ_LR593886.1_3028259_3028523_+	NA	NA|105aa|up_0|NZ_LR593886.1_3028888_3029203_+	NA	NA|82aa|down_0|NZ_LR593886.1_3032451_3032697_+	NA	NA|82aa|down_1|NZ_LR593886.1_3033194_3033440_+	NA	NA|81aa|down_2|NZ_LR593886.1_3034085_3034328_+	NA	NA|81aa|down_3|NZ_LR593886.1_3034742_3034985_+	NA	NA|81aa|down_4|NZ_LR593886.1_3035540_3035783_+	NA	NA|153aa|down_5|NZ_LR593886.1_3035849_3036308_+	NA	NA|229aa|down_6|NZ_LR593886.1_3036468_3037155_+	NA	NA|79aa|down_7|NZ_LR593886.1_3037735_3037972_+	NA	NA|179aa|down_8|NZ_LR593886.1_3038001_3038538_+	pfam13424, TPR_12, Tetratricopeptide repeat	NA|79aa|down_9|NZ_LR593886.1_3039418_3039655_+	NA
GCF_901538265.1_ASM90153826v1	NZ_LR593886	Gemmata massiliana isolate Soil9 chromosome 1	10	3031434-3031602	12	CRISPRCasFinder	no		csa3,RT,cas1,cas2,cas3,cas8u2,cas7,cas5u,DinG,cas8u1,DEDDh,PD-DExK,csb2gr5	Orphan	CACGAAGGTAACAGACGCGGGGCT	24	0	0	NA	NA	NA	2	2	Orphan	csa3,RT,cas1,cas2,cas3,cas8u2,cas7,cas5u,DinG,cas8u1,DEDDh,PD-DExK,csb2gr5	NA|79aa|up_8|NZ_LR593886.1_3023885_3024122_-,NA|53aa|up_7|NZ_LR593886.1_3024437_3024596_-,NA|84aa|up_6|NZ_LR593886.1_3024649_3024901_-,NA|203aa|up_4|NZ_LR593886.1_3026106_3026715_+,NA|125aa|up_3|NZ_LR593886.1_3026791_3027166_-,NA|180aa|up_2|NZ_LR593886.1_3027193_3027733_-,NA|88aa|up_1|NZ_LR593886.1_3028259_3028523_+,NA|105aa|up_0|NZ_LR593886.1_3028888_3029203_+,NA|82aa|down_0|NZ_LR593886.1_3032451_3032697_+,NA|82aa|down_1|NZ_LR593886.1_3033194_3033440_+,NA|81aa|down_2|NZ_LR593886.1_3034085_3034328_+,NA|81aa|down_3|NZ_LR593886.1_3034742_3034985_+,NA|81aa|down_4|NZ_LR593886.1_3035540_3035783_+,NA|153aa|down_5|NZ_LR593886.1_3035849_3036308_+,NA|229aa|down_6|NZ_LR593886.1_3036468_3037155_+,NA|79aa|down_7|NZ_LR593886.1_3037735_3037972_+,NA|79aa|down_9|NZ_LR593886.1_3039418_3039655_+	NA|898aa|up_9|NZ_LR593886.1_3021141_3023835_-	COG3378, COG3378, Phage associated DNA primase [General function prediction only]	NA|79aa|up_8|NZ_LR593886.1_3023885_3024122_-	NA	NA|53aa|up_7|NZ_LR593886.1_3024437_3024596_-	NA	NA|84aa|up_6|NZ_LR593886.1_3024649_3024901_-	NA	NA|204aa|up_5|NZ_LR593886.1_3025092_3025704_+	PRK00215, PRK00215, transcriptional repressor LexA	NA|203aa|up_4|NZ_LR593886.1_3026106_3026715_+	NA	NA|125aa|up_3|NZ_LR593886.1_3026791_3027166_-	NA	NA|180aa|up_2|NZ_LR593886.1_3027193_3027733_-	NA	NA|88aa|up_1|NZ_LR593886.1_3028259_3028523_+	NA	NA|105aa|up_0|NZ_LR593886.1_3028888_3029203_+	NA	NA|82aa|down_0|NZ_LR593886.1_3032451_3032697_+	NA	NA|82aa|down_1|NZ_LR593886.1_3033194_3033440_+	NA	NA|81aa|down_2|NZ_LR593886.1_3034085_3034328_+	NA	NA|81aa|down_3|NZ_LR593886.1_3034742_3034985_+	NA	NA|81aa|down_4|NZ_LR593886.1_3035540_3035783_+	NA	NA|153aa|down_5|NZ_LR593886.1_3035849_3036308_+	NA	NA|229aa|down_6|NZ_LR593886.1_3036468_3037155_+	NA	NA|79aa|down_7|NZ_LR593886.1_3037735_3037972_+	NA	NA|179aa|down_8|NZ_LR593886.1_3038001_3038538_+	pfam13424, TPR_12, Tetratricopeptide repeat	NA|79aa|down_9|NZ_LR593886.1_3039418_3039655_+	NA
GCF_901538265.1_ASM90153826v1	NZ_LR593886	Gemmata massiliana isolate Soil9 chromosome 1	11	3031722-3031817	13	CRISPRCasFinder	no		csa3,RT,cas1,cas2,cas3,cas8u2,cas7,cas5u,DinG,cas8u1,DEDDh,PD-DExK,csb2gr5	Orphan	CACGAAGGTAACAGACGCGGGGCT	24	0	0	NA	NA	NA	1	1	Orphan	csa3,RT,cas1,cas2,cas3,cas8u2,cas7,cas5u,DinG,cas8u1,DEDDh,PD-DExK,csb2gr5	NA|79aa|up_8|NZ_LR593886.1_3023885_3024122_-,NA|53aa|up_7|NZ_LR593886.1_3024437_3024596_-,NA|84aa|up_6|NZ_LR593886.1_3024649_3024901_-,NA|203aa|up_4|NZ_LR593886.1_3026106_3026715_+,NA|125aa|up_3|NZ_LR593886.1_3026791_3027166_-,NA|180aa|up_2|NZ_LR593886.1_3027193_3027733_-,NA|88aa|up_1|NZ_LR593886.1_3028259_3028523_+,NA|105aa|up_0|NZ_LR593886.1_3028888_3029203_+,NA|82aa|down_0|NZ_LR593886.1_3032451_3032697_+,NA|82aa|down_1|NZ_LR593886.1_3033194_3033440_+,NA|81aa|down_2|NZ_LR593886.1_3034085_3034328_+,NA|81aa|down_3|NZ_LR593886.1_3034742_3034985_+,NA|81aa|down_4|NZ_LR593886.1_3035540_3035783_+,NA|153aa|down_5|NZ_LR593886.1_3035849_3036308_+,NA|229aa|down_6|NZ_LR593886.1_3036468_3037155_+,NA|79aa|down_7|NZ_LR593886.1_3037735_3037972_+,NA|79aa|down_9|NZ_LR593886.1_3039418_3039655_+	NA|898aa|up_9|NZ_LR593886.1_3021141_3023835_-	COG3378, COG3378, Phage associated DNA primase [General function prediction only]	NA|79aa|up_8|NZ_LR593886.1_3023885_3024122_-	NA	NA|53aa|up_7|NZ_LR593886.1_3024437_3024596_-	NA	NA|84aa|up_6|NZ_LR593886.1_3024649_3024901_-	NA	NA|204aa|up_5|NZ_LR593886.1_3025092_3025704_+	PRK00215, PRK00215, transcriptional repressor LexA	NA|203aa|up_4|NZ_LR593886.1_3026106_3026715_+	NA	NA|125aa|up_3|NZ_LR593886.1_3026791_3027166_-	NA	NA|180aa|up_2|NZ_LR593886.1_3027193_3027733_-	NA	NA|88aa|up_1|NZ_LR593886.1_3028259_3028523_+	NA	NA|105aa|up_0|NZ_LR593886.1_3028888_3029203_+	NA	NA|82aa|down_0|NZ_LR593886.1_3032451_3032697_+	NA	NA|82aa|down_1|NZ_LR593886.1_3033194_3033440_+	NA	NA|81aa|down_2|NZ_LR593886.1_3034085_3034328_+	NA	NA|81aa|down_3|NZ_LR593886.1_3034742_3034985_+	NA	NA|81aa|down_4|NZ_LR593886.1_3035540_3035783_+	NA	NA|153aa|down_5|NZ_LR593886.1_3035849_3036308_+	NA	NA|229aa|down_6|NZ_LR593886.1_3036468_3037155_+	NA	NA|79aa|down_7|NZ_LR593886.1_3037735_3037972_+	NA	NA|179aa|down_8|NZ_LR593886.1_3038001_3038538_+	pfam13424, TPR_12, Tetratricopeptide repeat	NA|79aa|down_9|NZ_LR593886.1_3039418_3039655_+	NA
GCF_901538265.1_ASM90153826v1	NZ_LR593886	Gemmata massiliana isolate Soil9 chromosome 1	12	3209200-3209300	14	CRISPRCasFinder	no		csa3,RT,cas1,cas2,cas3,cas8u2,cas7,cas5u,DinG,cas8u1,DEDDh,PD-DExK,csb2gr5	Orphan	GCACCAGGGGTTGAACCCCCTGGCTATT	28	0	0	NA	NA	NA	1	1	Orphan	csa3,RT,cas1,cas2,cas3,cas8u2,cas7,cas5u,DinG,cas8u1,DEDDh,PD-DExK,csb2gr5	NA|57aa|up_8|NZ_LR593886.1_3195951_3196122_+,NA|102aa|up_7|NZ_LR593886.1_3196187_3196493_-,NA|341aa|up_6|NZ_LR593886.1_3196840_3197863_+,NA|184aa|down_0|NZ_LR593886.1_3209420_3209972_-,NA|422aa|down_1|NZ_LR593886.1_3210337_3211603_+,NA|592aa|down_8|NZ_LR593886.1_3221222_3222998_+,NA|160aa|down_9|NZ_LR593886.1_3223200_3223680_-	NA|567aa|up_9|NZ_LR593886.1_3193771_3195472_+	TIGR02393, RNA_polymerase_sigma_factor_RpoD, RNA polymerase sigma factor RpoD, C-terminal domain	NA|57aa|up_8|NZ_LR593886.1_3195951_3196122_+	NA	NA|102aa|up_7|NZ_LR593886.1_3196187_3196493_-	NA	NA|341aa|up_6|NZ_LR593886.1_3196840_3197863_+	NA	NA|354aa|up_5|NZ_LR593886.1_3198006_3199068_-	cd02194, ThiL, ThiL (Thiamine-monophosphate kinase) plays a dual role in de novo biosynthesis and in salvage of exogenous thiamine	NA|163aa|up_4|NZ_LR593886.1_3199168_3199657_+	pfam12850, Metallophos_2, Calcineurin-like phosphoesterase superfamily domain	NA|467aa|up_3|NZ_LR593886.1_3199717_3201118_+	TIGR02037, Probable_periplasmic_serine_protease_do/HhoA-like, periplasmic serine protease, Do/DeqQ family	NA|1122aa|up_2|NZ_LR593886.1_3201343_3204709_+	cd07562, Peptidase_S41_TRI, Tricorn protease; serine protease family S41	NA|313aa|up_1|NZ_LR593886.1_3204851_3205790_-	COG3386, COG3386, Gluconolactonase [Carbohydrate transport and metabolism]	NA|1029aa|up_0|NZ_LR593886.1_3206011_3209098_-	pfam18582, HZS_alpha, Hydrazine synthase alpha subunit middle domain	NA|184aa|down_0|NZ_LR593886.1_3209420_3209972_-	NA	NA|422aa|down_1|NZ_LR593886.1_3210337_3211603_+	NA	NA|438aa|down_2|NZ_LR593886.1_3211759_3213073_+	TIGR02644, Thymidine_phosphorylase, pyrimidine-nucleoside phosphorylase	NA|134aa|down_3|NZ_LR593886.1_3213069_3213471_+	PRK05578, PRK05578, cytidine deaminase; Validated	NA|210aa|down_4|NZ_LR593886.1_3213480_3214110_+	PRK00129, upp, uracil phosphoribosyltransferase; Reviewed	NA|626aa|down_5|NZ_LR593886.1_3214128_3216006_-	cd11325, AmyAc_GTHase, Alpha amylase catalytic domain found in Glycosyltrehalose trehalohydrolase (also called Maltooligosyl trehalose Trehalohydrolase)	NA|973aa|down_6|NZ_LR593886.1_3216018_3218937_-	PRK14511, PRK14511, malto-oligosyltrehalose synthase	NA|557aa|down_7|NZ_LR593886.1_3219325_3220996_-	PRK14951, PRK14951, DNA polymerase III subunits gamma and tau; Provisional	NA|592aa|down_8|NZ_LR593886.1_3221222_3222998_+	NA	NA|160aa|down_9|NZ_LR593886.1_3223200_3223680_-	NA
GCF_901538265.1_ASM90153826v1	NZ_LR593886	Gemmata massiliana isolate Soil9 chromosome 1	13	3332236-3332346	15	CRISPRCasFinder	no		csa3,RT,cas1,cas2,cas3,cas8u2,cas7,cas5u,DinG,cas8u1,DEDDh,PD-DExK,csb2gr5	Orphan	GCCCATCCCTCCATTCGCCCGTCGAGCAGGCTTTGCATC	39	0	0	NA	NA	NA	1	1	Orphan	csa3,RT,cas1,cas2,cas3,cas8u2,cas7,cas5u,DinG,cas8u1,DEDDh,PD-DExK,csb2gr5	NA|82aa|up_2|NZ_LR593886.1_3329773_3330019_+,NA|208aa|up_0|NZ_LR593886.1_3331335_3331959_+,NA|241aa|down_3|NZ_LR593886.1_3335941_3336664_+,NA|88aa|down_5|NZ_LR593886.1_3337776_3338040_+,NA|143aa|down_8|NZ_LR593886.1_3340106_3340535_-	NA|125aa|up_9|NZ_LR593886.1_3309493_3309868_+	pfam03965, Penicillinase_R, Penicillinase repressor	NA|938aa|up_8|NZ_LR593886.1_3309869_3312683_+	cd07341, M56_BlaR1_MecR1_like, Peptidase M56-like including those in BlaR1 and MecR1, integral membrane metallopeptidase	NA|779aa|up_7|NZ_LR593886.1_3312763_3315100_+	pfam07583, PSCyt2, Protein of unknown function (DUF1549)	NA|430aa|up_6|NZ_LR593886.1_3315117_3316407_+	pfam07394, DUF1501, Protein of unknown function (DUF1501)	NA|481aa|up_5|NZ_LR593886.1_3316780_3318223_-	pfam14871, GHL6, Hypothetical glycosyl hydrolase 6	NA|750aa|up_4|NZ_LR593886.1_3318325_3320575_-	TIGR02937, RNA_polymerase_sigma_factor, RNA polymerase sigma factor, sigma-70 family	NA|2513aa|up_3|NZ_LR593886.1_3321918_3329457_+	TIGR03696, tRNA_nuclease_WapA, RHS repeat-associated core domain	NA|82aa|up_2|NZ_LR593886.1_3329773_3330019_+	NA	NA|380aa|up_1|NZ_LR593886.1_3330156_3331295_+	pfam13358, DDE_3, DDE superfamily endonuclease	NA|208aa|up_0|NZ_LR593886.1_3331335_3331959_+	NA	NA|220aa|down_0|NZ_LR593886.1_3332385_3333045_-	pfam14706, Tnp_DNA_bind, Transposase DNA-binding	NA|70aa|down_1|NZ_LR593886.1_3333226_3333436_-	pfam13817, DDE_Tnp_IS66_C, IS66 C-terminal element	NA|355aa|down_2|NZ_LR593886.1_3334320_3335385_-	pfam00589, Phage_integrase, Phage integrase family	NA|241aa|down_3|NZ_LR593886.1_3335941_3336664_+	NA	NA|253aa|down_4|NZ_LR593886.1_3336781_3337540_-	COG2197, CitB, Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain [Signal transduction mechanisms / Transcription]	NA|88aa|down_5|NZ_LR593886.1_3337776_3338040_+	NA	NA|187aa|down_6|NZ_LR593886.1_3338112_3338673_-	TIGR02937, RNA_polymerase_sigma_factor, RNA polymerase sigma factor, sigma-70 family	NA|435aa|down_7|NZ_LR593886.1_3338711_3340016_-	pfam03999, MAP65_ASE1, Microtubule associated protein (MAP65/ASE1 family)	NA|143aa|down_8|NZ_LR593886.1_3340106_3340535_-	NA	NA|284aa|down_9|NZ_LR593886.1_3340590_3341442_-	pfam04586, Peptidase_S78, Caudovirus prohead serine protease
GCF_901538265.1_ASM90153826v1	NZ_LR593886	Gemmata massiliana isolate Soil9 chromosome 1	14	3351857-3351965	16	CRISPRCasFinder	no	cas1,cas2,cas3,cas8u2,cas7,cas5u	csa3,RT,cas1,cas2,cas3,cas8u2,cas7,cas5u,DinG,cas8u1,DEDDh,PD-DExK,csb2gr5	Unclear	GGTGAAACTCCGAATAACTCTGCTACACTTCTCAAC	36	0	0	NA	NA	NA	1	1	Unclear	csa3,RT,cas1,cas2,cas3,cas8u2,cas7,cas5u,DinG,cas8u1,DEDDh,PD-DExK,csb2gr5	NA|70aa|up_8|NZ_LR593886.1_3345782_3345992_-,NA|149aa|up_6|NZ_LR593886.1_3347716_3348163_-,NA|131aa|up_5|NZ_LR593886.1_3348155_3348548_-,NA|84aa|up_2|NZ_LR593886.1_3349568_3349820_+,NA|119aa|down_1|NZ_LR593886.1_3352645_3353002_+,NA|229aa|down_2|NZ_LR593886.1_3352998_3353685_+,NA|416aa|down_3|NZ_LR593886.1_3353782_3355030_+,NA|73aa|down_8|NZ_LR593886.1_3358621_3358840_+	NA|124aa|up_9|NZ_LR593886.1_3345279_3345651_-	pfam12728, HTH_17, Helix-turn-helix domain	NA|70aa|up_8|NZ_LR593886.1_3345782_3345992_-	NA	NA|441aa|up_7|NZ_LR593886.1_3346351_3347674_-	cd00397, DNA_BRE_C, DNA breaking-rejoining enzymes, C-terminal catalytic domain	NA|149aa|up_6|NZ_LR593886.1_3347716_3348163_-	NA	NA|131aa|up_5|NZ_LR593886.1_3348155_3348548_-	NA	NA|118aa|up_4|NZ_LR593886.1_3348738_3349092_+	cd00093, HTH_XRE, Helix-turn-helix XRE-family like proteins	NA|111aa|up_3|NZ_LR593886.1_3349101_3349434_+	pfam12728, HTH_17, Helix-turn-helix domain	NA|84aa|up_2|NZ_LR593886.1_3349568_3349820_+	NA	NA|132aa|up_1|NZ_LR593886.1_3349833_3350229_-	cd17580, REC_2_DhkD-like, second phosphoacceptor receiver (REC) domain of Dictyostelium discoideum hybrid signal transduction histidine kinase D and similar domains	NA|412aa|up_0|NZ_LR593886.1_3350290_3351526_-	pfam00589, Phage_integrase, Phage integrase family	NA|202aa|down_0|NZ_LR593886.1_3352043_3352649_+	TIGR02937, RNA_polymerase_sigma_factor, RNA polymerase sigma factor, sigma-70 family	NA|119aa|down_1|NZ_LR593886.1_3352645_3353002_+	NA	NA|229aa|down_2|NZ_LR593886.1_3352998_3353685_+	NA	NA|416aa|down_3|NZ_LR593886.1_3353782_3355030_+	NA	NA|653aa|down_4|NZ_LR593886.1_3355092_3357051_+	COG0433, COG0433,  HerA helicase [Replication, recombination, and repair]	NA|120aa|down_5|NZ_LR593886.1_3357055_3357415_-	pfam18480, DUF5615, Domain of unknown function (DUF5615)	NA|79aa|down_6|NZ_LR593886.1_3357411_3357648_-	pfam04255, DUF433, Protein of unknown function (DUF433)	NA|95aa|down_7|NZ_LR593886.1_3357951_3358236_-	cd10434, GIY-YIG_UvrC_Cho, Catalytic GIY-YIG domain of nucleotide excision repair endonucleases UvrC, Cho, and similar proteins	NA|73aa|down_8|NZ_LR593886.1_3358621_3358840_+	NA	cas1|570aa|down_9|NZ_LR593886.1_3358872_3360582_+	pfam01867, Cas_Cas1, CRISPR associated protein Cas1
GCF_901538265.1_ASM90153826v1	NZ_LR593886	Gemmata massiliana isolate Soil9 chromosome 1	15	3368019-3369513	2,17,3,3	PILER-CR,CRISPRCasFinder,CRT,PILER-CR	no	cas1,cas2,cas3,cas8u2,cas7,cas5u	csa3,RT,cas1,cas2,cas3,cas8u2,cas7,cas5u,DinG,cas8u1,DEDDh,PD-DExK,csb2gr5	Unclear	GTCTCTCCCCAGATACATCTGGGGCCGAATTGAAGC,GTCTCTCCCCAGATACATCTGGGGCCGAATTGAAGC,GTCTCTCCCCAGATACATCTGGGGCCGAATTGAAGC,GTCTCTCCCCAGATACATCTGGGGCCGAATTGAAGC	36,36,36,36	0	0	NA	NA	NA:NA:NA:NA	17,19,20,17	20	Unclear	csa3,RT,cas1,cas2,cas3,cas8u2,cas7,cas5u,DinG,cas8u1,DEDDh,PD-DExK,csb2gr5	NA|73aa|up_6|NZ_LR593886.1_3358621_3358840_+,NA|73aa|down_5|NZ_LR593886.1_3380145_3380364_-,NA|89aa|down_8|NZ_LR593886.1_3382231_3382498_-	NA|120aa|up_9|NZ_LR593886.1_3357055_3357415_-	pfam18480, DUF5615, Domain of unknown function (DUF5615)	NA|79aa|up_8|NZ_LR593886.1_3357411_3357648_-	pfam04255, DUF433, Protein of unknown function (DUF433)	NA|95aa|up_7|NZ_LR593886.1_3357951_3358236_-	cd10434, GIY-YIG_UvrC_Cho, Catalytic GIY-YIG domain of nucleotide excision repair endonucleases UvrC, Cho, and similar proteins	NA|73aa|up_6|NZ_LR593886.1_3358621_3358840_+	NA	cas1|570aa|up_5|NZ_LR593886.1_3358872_3360582_+	pfam01867, Cas_Cas1, CRISPR associated protein Cas1	cas2|95aa|up_4|NZ_LR593886.1_3360954_3361239_+	cd09725, Cas2_I_II_III, CRISPR/Cas system-associated protein Cas2	cas3|920aa|up_3|NZ_LR593886.1_3361251_3364011_+	cd09696, Cas3_I, CRISPR/Cas system-associated protein Cas3; Distinct Cas3 family with HD domain fused to C-termus of Helicase domain	cas8u2|216aa|up_2|NZ_LR593886.1_3364190_3364838_+	TIGR04106, hypothetical_protein_GobsU_11505, CRISPR-associated protein GSU0052/csb3, Dpsyc system	cas7|387aa|up_1|NZ_LR593886.1_3364830_3365991_+	cd09678, Csb1_I-U, CRISPR/Cas system-associated protein Csb1	cas5u|550aa|up_0|NZ_LR593886.1_3365990_3367640_+	cd09667, Csb2_I-U, CRISPR/Cas system-associated protein Csb2	NA|372aa|down_0|NZ_LR593886.1_3369568_3370684_+	pfam13808, DDE_Tnp_1_assoc, DDE_Tnp_1-associated	NA|149aa|down_1|NZ_LR593886.1_3377932_3378379_-	cd01057, AAMH_A, Aromatic and Alkene Monooxygenase Hydroxylase, subunit A, ferritin-like diiron-binding domain	NA|203aa|down_2|NZ_LR593886.1_3378424_3379033_-	PRK05327, rpsD, 30S ribosomal protein S4; Validated	NA|89aa|down_3|NZ_LR593886.1_3379068_3379335_-	PRK00391, rpsR, 30S ribosomal protein S18; Reviewed	NA|254aa|down_4|NZ_LR593886.1_3379387_3380149_-	sd00006, TPR, Tetratricopeptide repeat	NA|73aa|down_5|NZ_LR593886.1_3380145_3380364_-	NA	NA|228aa|down_6|NZ_LR593886.1_3380382_3381066_-	pfam10670, DUF4198, Domain of unknown function (DUF4198)	NA|337aa|down_7|NZ_LR593886.1_3381185_3382196_-	sd00038, Kelch, Kelch repeat	NA|89aa|down_8|NZ_LR593886.1_3382231_3382498_-	NA	NA|322aa|down_9|NZ_LR593886.1_3382564_3383530_-	pfam07596, SBP_bac_10, Protein of unknown function (DUF1559)
GCF_901538265.1_ASM90153826v1	NZ_LR593886	Gemmata massiliana isolate Soil9 chromosome 1	16	3370694-3377796	18,4,4,5,19	CRISPRCasFinder,CRT,PILER-CR,PILER-CR,CRISPRCasFinder	no	cas1,cas2,cas3,cas8u2,cas7,cas5u	csa3,RT,cas1,cas2,cas3,cas8u2,cas7,cas5u,DinG,cas8u1,DEDDh,PD-DExK,csb2gr5	Unclear	GTCTCTCCCCAGATACATCTGGGGCCGAATTGAAGC,NGTCTCTCCCCAGATACAATCTGGGGCCGAATTGAAGC,GTCTCTCCCCAGATACATCTGGGGCCGAATTGAAGC,GTCTCTCCCCAGATACATCTGGGGCCGAATTGAAGC,GTCTCTCCCCAGATACATCTGGGGCCGAATTGAAGC	36,38,36,36,36	0	0	NA	NA	NA:NA:NA:NA:NA	95,97,95,95,95	97	Unclear	csa3,RT,cas1,cas2,cas3,cas8u2,cas7,cas5u,DinG,cas8u1,DEDDh,PD-DExK,csb2gr5	NA|73aa|up_7|NZ_LR593886.1_3358621_3358840_+,NA|73aa|down_4|NZ_LR593886.1_3380145_3380364_-,NA|89aa|down_7|NZ_LR593886.1_3382231_3382498_-	NA|79aa|up_9|NZ_LR593886.1_3357411_3357648_-	pfam04255, DUF433, Protein of unknown function (DUF433)	NA|95aa|up_8|NZ_LR593886.1_3357951_3358236_-	cd10434, GIY-YIG_UvrC_Cho, Catalytic GIY-YIG domain of nucleotide excision repair endonucleases UvrC, Cho, and similar proteins	NA|73aa|up_7|NZ_LR593886.1_3358621_3358840_+	NA	cas1|570aa|up_6|NZ_LR593886.1_3358872_3360582_+	pfam01867, Cas_Cas1, CRISPR associated protein Cas1	cas2|95aa|up_5|NZ_LR593886.1_3360954_3361239_+	cd09725, Cas2_I_II_III, CRISPR/Cas system-associated protein Cas2	cas3|920aa|up_4|NZ_LR593886.1_3361251_3364011_+	cd09696, Cas3_I, CRISPR/Cas system-associated protein Cas3; Distinct Cas3 family with HD domain fused to C-termus of Helicase domain	cas8u2|216aa|up_3|NZ_LR593886.1_3364190_3364838_+	TIGR04106, hypothetical_protein_GobsU_11505, CRISPR-associated protein GSU0052/csb3, Dpsyc system	cas7|387aa|up_2|NZ_LR593886.1_3364830_3365991_+	cd09678, Csb1_I-U, CRISPR/Cas system-associated protein Csb1	cas5u|550aa|up_1|NZ_LR593886.1_3365990_3367640_+	cd09667, Csb2_I-U, CRISPR/Cas system-associated protein Csb2	NA|372aa|up_0|NZ_LR593886.1_3369568_3370684_+	pfam13808, DDE_Tnp_1_assoc, DDE_Tnp_1-associated	NA|149aa|down_0|NZ_LR593886.1_3377932_3378379_-	cd01057, AAMH_A, Aromatic and Alkene Monooxygenase Hydroxylase, subunit A, ferritin-like diiron-binding domain	NA|203aa|down_1|NZ_LR593886.1_3378424_3379033_-	PRK05327, rpsD, 30S ribosomal protein S4; Validated	NA|89aa|down_2|NZ_LR593886.1_3379068_3379335_-	PRK00391, rpsR, 30S ribosomal protein S18; Reviewed	NA|254aa|down_3|NZ_LR593886.1_3379387_3380149_-	sd00006, TPR, Tetratricopeptide repeat	NA|73aa|down_4|NZ_LR593886.1_3380145_3380364_-	NA	NA|228aa|down_5|NZ_LR593886.1_3380382_3381066_-	pfam10670, DUF4198, Domain of unknown function (DUF4198)	NA|337aa|down_6|NZ_LR593886.1_3381185_3382196_-	sd00038, Kelch, Kelch repeat	NA|89aa|down_7|NZ_LR593886.1_3382231_3382498_-	NA	NA|322aa|down_8|NZ_LR593886.1_3382564_3383530_-	pfam07596, SBP_bac_10, Protein of unknown function (DUF1559)	NA|141aa|down_9|NZ_LR593886.1_3383883_3384306_+	COG0432, COG0432, Uncharacterized conserved protein [Function unknown]
GCF_901538265.1_ASM90153826v1	NZ_LR593886	Gemmata massiliana isolate Soil9 chromosome 1	17	4923108-4923187	20	CRISPRCasFinder	no		csa3,RT,cas1,cas2,cas3,cas8u2,cas7,cas5u,DinG,cas8u1,DEDDh,PD-DExK,csb2gr5	Orphan	AACTCCAGAGTGTCTTCGTCGAAATC	26	0	0	NA	NA	NA	1	1	Orphan	csa3,RT,cas1,cas2,cas3,cas8u2,cas7,cas5u,DinG,cas8u1,DEDDh,PD-DExK,csb2gr5	NA|211aa|up_9|NZ_LR593886.1_4911231_4911864_+,NA|234aa|up_7|NZ_LR593886.1_4913059_4913761_-,NA|104aa|up_5|NZ_LR593886.1_4915148_4915460_-,NA|157aa|down_6|NZ_LR593886.1_4932544_4933015_-,NA|304aa|down_8|NZ_LR593886.1_4934118_4935030_-	NA|211aa|up_9|NZ_LR593886.1_4911231_4911864_+	NA	NA|329aa|up_8|NZ_LR593886.1_4911910_4912897_+	pfam07596, SBP_bac_10, Protein of unknown function (DUF1559)	NA|234aa|up_7|NZ_LR593886.1_4913059_4913761_-	NA	NA|376aa|up_6|NZ_LR593886.1_4913975_4915103_+	COG0564, RluA, Pseudouridylate synthases, 23S RNA-specific [Translation, ribosomal structure and biogenesis]	NA|104aa|up_5|NZ_LR593886.1_4915148_4915460_-	NA	NA|287aa|up_4|NZ_LR593886.1_4915571_4916432_+	PRK00450, dapF, diaminopimelate epimerase; Provisional	NA|612aa|up_3|NZ_LR593886.1_4916428_4918264_+	pfam09594, GT87, Glycosyltransferase family 87	NA|261aa|up_2|NZ_LR593886.1_4918309_4919092_-	PRK00278, trpC, indole-3-glycerol phosphate synthase TrpC	NA|274aa|up_1|NZ_LR593886.1_4919239_4920061_-	COG1082, IolE, Sugar phosphate isomerases/epimerases [Carbohydrate transport and metabolism]	NA|482aa|up_0|NZ_LR593886.1_4920064_4921510_-	cd14014, STKc_PknB_like, Catalytic domain of bacterial Serine/Threonine kinases, PknB and similar proteins	NA|338aa|down_0|NZ_LR593886.1_4925170_4926184_+	COG2047, COG2047, Uncharacterized protein (ATP-grasp superfamily) [General function prediction only]	NA|363aa|down_1|NZ_LR593886.1_4926226_4927315_+	cd09620, CBM9_like_3, DOMON-like type 9 carbohydrate binding module	NA|185aa|down_2|NZ_LR593886.1_4927370_4927925_-	COG1413, COG1413, FOG: HEAT repeat [Energy production and conversion]	NA|286aa|down_3|NZ_LR593886.1_4927999_4928857_-	cd05483, retropepsin_like_bacteria, Bacterial aspartate proteases, retropepsin-like protease family	NA|292aa|down_4|NZ_LR593886.1_4928859_4929735_-	cd00987, PDZ_serine_protease, PDZ domain of trypsin-like serine proteases, such as DegP/HtrA, which are oligomeric proteins involved in heat-shock response, chaperone function, and apoptosis	NA|643aa|down_5|NZ_LR593886.1_4930447_4932376_-	cd02969, PRX_like1, Peroxiredoxin (PRX)-like 1 family; hypothetical proteins that show sequence similarity to PRXs	NA|157aa|down_6|NZ_LR593886.1_4932544_4933015_-	NA	NA|297aa|down_7|NZ_LR593886.1_4933089_4933980_-	COG0631, PTC1, Serine/threonine protein phosphatase [Signal transduction mechanisms]	NA|304aa|down_8|NZ_LR593886.1_4934118_4935030_-	NA	NA|434aa|down_9|NZ_LR593886.1_4935033_4936335_-	pfam01555, N6_N4_Mtase, DNA methylase
GCF_901538265.1_ASM90153826v1	NZ_LR593886	Gemmata massiliana isolate Soil9 chromosome 1	18	5002323-5002393	21	CRISPRCasFinder	no		csa3,RT,cas1,cas2,cas3,cas8u2,cas7,cas5u,DinG,cas8u1,DEDDh,PD-DExK,csb2gr5	Orphan	GCCCCAATGGACCCGAAGGGCGG	23	0	0	NA	NA	NA	1	1	Orphan	csa3,RT,cas1,cas2,cas3,cas8u2,cas7,cas5u,DinG,cas8u1,DEDDh,PD-DExK,csb2gr5	NA|264aa|up_9|NZ_LR593886.1_4976996_4977788_+,NA|1171aa|up_0|NZ_LR593886.1_4997536_5001049_+,NA|367aa|down_0|NZ_LR593886.1_5003608_5004709_+,NA|491aa|down_8|NZ_LR593886.1_5015622_5017095_-	NA|264aa|up_9|NZ_LR593886.1_4976996_4977788_+	NA	NA|612aa|up_8|NZ_LR593886.1_4977805_4979641_-	pfam05731, TROVE, TROVE domain	NA|530aa|up_7|NZ_LR593886.1_4979974_4981564_+	COG2204, AtoC, Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains [Signal transduction mechanisms]	NA|1661aa|up_6|NZ_LR593886.1_4982289_4987272_+	COG1520, COG1520, FOG: WD40-like repeat [Function unknown]	NA|392aa|up_5|NZ_LR593886.1_4987561_4988737_+	cd00688, ISOPREN_C2_like, This group contains class II terpene cyclases, protein prenyltransferases beta subunit, two broadly specific proteinase inhibitors alpha2-macroglobulin (alpha (2)-M) and pregnancy zone protein (PZP) and, the C3 C4 and C5 components of vertebrate complement	NA|353aa|up_4|NZ_LR593886.1_4988898_4989957_+	COG0714, COG0714, MoxR-like ATPases [General function prediction only]	NA|315aa|up_3|NZ_LR593886.1_4990061_4991006_+	COG1721, COG1721, Uncharacterized conserved protein (some members contain a von Willebrand factor type A (vWA) domain) [General function prediction only]	NA|961aa|up_2|NZ_LR593886.1_4991120_4994003_+	pfam07584, BatA, Aerotolerance regulator N-terminal	NA|1118aa|up_1|NZ_LR593886.1_4994084_4997438_+	cd00198, vWFA, Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF)	NA|1171aa|up_0|NZ_LR593886.1_4997536_5001049_+	NA	NA|367aa|down_0|NZ_LR593886.1_5003608_5004709_+	NA	NA|418aa|down_1|NZ_LR593886.1_5004837_5006091_+	pfam08305, NPCBM, NPCBM/NEW2 domain	NA|608aa|down_2|NZ_LR593886.1_5006194_5008018_+	TIGR02037, Probable_periplasmic_serine_protease_do/HhoA-like, periplasmic serine protease, Do/DeqQ family	NA|395aa|down_3|NZ_LR593886.1_5008092_5009277_+	TIGR02037, Probable_periplasmic_serine_protease_do/HhoA-like, periplasmic serine protease, Do/DeqQ family	NA|719aa|down_4|NZ_LR593886.1_5009356_5011513_+	TIGR02037, Probable_periplasmic_serine_protease_do/HhoA-like, periplasmic serine protease, Do/DeqQ family	NA|591aa|down_5|NZ_LR593886.1_5011643_5013416_-	COG2849, COG2849, Uncharacterized protein conserved in bacteria [Function unknown]	NA|212aa|down_6|NZ_LR593886.1_5013608_5014244_-	pfam14279, HNH_5, HNH endonuclease	NA|385aa|down_7|NZ_LR593886.1_5014471_5015626_-	pfam10117, McrBC, McrBC 5-methylcytosine restriction system component	NA|491aa|down_8|NZ_LR593886.1_5015622_5017095_-	NA	NA|872aa|down_9|NZ_LR593886.1_5017194_5019810_-	PRK11331, PRK11331, 5-methylcytosine-specific restriction enzyme subunit McrB; Provisional
GCF_901538265.1_ASM90153826v1	NZ_LR593886	Gemmata massiliana isolate Soil9 chromosome 1	19	5381674-5381771	22	CRISPRCasFinder	no		csa3,RT,cas1,cas2,cas3,cas8u2,cas7,cas5u,DinG,cas8u1,DEDDh,PD-DExK,csb2gr5	Orphan	CCGGGTGGGTACACGGGAGGCTC	23	0	0	NA	NA	NA	1	1	Orphan	csa3,RT,cas1,cas2,cas3,cas8u2,cas7,cas5u,DinG,cas8u1,DEDDh,PD-DExK,csb2gr5	NA|74aa|up_9|NZ_LR593886.1_5375455_5375677_-,NA|73aa|up_7|NZ_LR593886.1_5376016_5376235_+,NA|84aa|up_6|NZ_LR593886.1_5376237_5376489_-,NA|70aa|up_5|NZ_LR593886.1_5376485_5376695_-,NA|71aa|up_4|NZ_LR593886.1_5376691_5376904_-,NA|124aa|up_3|NZ_LR593886.1_5376900_5377272_-,NA|143aa|up_2|NZ_LR593886.1_5377268_5377697_-,NA|308aa|up_1|NZ_LR593886.1_5377693_5378617_-,NA|480aa|up_0|NZ_LR593886.1_5379955_5381395_+,NA|642aa|down_8|NZ_LR593886.1_5396859_5398785_+	NA|74aa|up_9|NZ_LR593886.1_5375455_5375677_-	NA	NA|69aa|up_8|NZ_LR593886.1_5375676_5375883_-	COG4640, COG4640, Predicted membrane protein [Function unknown]	NA|73aa|up_7|NZ_LR593886.1_5376016_5376235_+	NA	NA|84aa|up_6|NZ_LR593886.1_5376237_5376489_-	NA	NA|70aa|up_5|NZ_LR593886.1_5376485_5376695_-	NA	NA|71aa|up_4|NZ_LR593886.1_5376691_5376904_-	NA	NA|124aa|up_3|NZ_LR593886.1_5376900_5377272_-	NA	NA|143aa|up_2|NZ_LR593886.1_5377268_5377697_-	NA	NA|308aa|up_1|NZ_LR593886.1_5377693_5378617_-	NA	NA|480aa|up_0|NZ_LR593886.1_5379955_5381395_+	NA	NA|350aa|down_0|NZ_LR593886.1_5384146_5385196_-	COG0385, COG0385, Predicted Na+-dependent transporter [General function prediction only]	NA|163aa|down_1|NZ_LR593886.1_5385385_5385874_+	pfam11396, PepSY_like, Putative beta-lactamase-inhibitor-like, PepSY-like	NA|226aa|down_2|NZ_LR593886.1_5385972_5386650_+	COG0745, OmpR, Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain [Signal transduction mechanisms / Transcription]	NA|487aa|down_3|NZ_LR593886.1_5386646_5388107_+	TIGR01386, Probable_sensor_protein_PcoS, heavy metal sensor kinase	NA|193aa|down_4|NZ_LR593886.1_5388103_5388682_+	cd14529, TpbA-like, bacterial protein tyrosine and dual-specificity phosphatases related to Pseudomonas aeruginosa TpbA	NA|322aa|down_5|NZ_LR593886.1_5388920_5389886_+	TIGR02037, Probable_periplasmic_serine_protease_do/HhoA-like, periplasmic serine protease, Do/DeqQ family	NA|1223aa|down_6|NZ_LR593886.1_5390047_5393716_+	TIGR02604, Piru_Ver_Nterm, putative membrane-bound dehydrogenase domain	NA|888aa|down_7|NZ_LR593886.1_5394088_5396752_+	cd00200, WD40, WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment	NA|642aa|down_8|NZ_LR593886.1_5396859_5398785_+	NA	NA|245aa|down_9|NZ_LR593886.1_5398847_5399582_-	COG4221, COG4221, Short-chain alcohol dehydrogenase of unknown specificity [General function prediction only]
GCF_901538265.1_ASM90153826v1	NZ_LR593886	Gemmata massiliana isolate Soil9 chromosome 1	20	7069326-7069492	23	CRISPRCasFinder	no		csa3,RT,cas1,cas2,cas3,cas8u2,cas7,cas5u,DinG,cas8u1,DEDDh,PD-DExK,csb2gr5	Orphan	CTCACAAAACTCACCCACCTCGA	23	0	0	NA	NA	NA	2	2	Orphan	csa3,RT,cas1,cas2,cas3,cas8u2,cas7,cas5u,DinG,cas8u1,DEDDh,PD-DExK,csb2gr5	NA|77aa|up_4|NZ_LR593886.1_7057738_7057969_-,NA|176aa|up_3|NZ_LR593886.1_7058742_7059270_-,NA|252aa|up_1|NZ_LR593886.1_7062428_7063184_+,NA|142aa|down_0|NZ_LR593886.1_7069871_7070297_+,NA|173aa|down_3|NZ_LR593886.1_7073755_7074274_-,NA|170aa|down_5|NZ_LR593886.1_7076403_7076913_-,NA|128aa|down_6|NZ_LR593886.1_7077056_7077440_+	NA|357aa|up_9|NZ_LR593886.1_7050718_7051789_+	pfam07596, SBP_bac_10, Protein of unknown function (DUF1559)	NA|323aa|up_8|NZ_LR593886.1_7051781_7052750_+	pfam07596, SBP_bac_10, Protein of unknown function (DUF1559)	NA|1321aa|up_7|NZ_LR593886.1_7052821_7056784_+	pfam02514, CobN-Mg_chel, CobN/Magnesium Chelatase	NA|204aa|up_6|NZ_LR593886.1_7056818_7057430_+	COG0811, TolQ, Biopolymer transport proteins [Intracellular trafficking and secretion]	NA|107aa|up_5|NZ_LR593886.1_7057401_7057722_+	pfam09919, DUF2149, Uncharacterized conserved protein (DUF2149)	NA|77aa|up_4|NZ_LR593886.1_7057738_7057969_-	NA	NA|176aa|up_3|NZ_LR593886.1_7058742_7059270_-	NA	NA|558aa|up_2|NZ_LR593886.1_7059663_7061337_-	cd17906, CheX, chemotaxis phosphatase CheX	NA|252aa|up_1|NZ_LR593886.1_7062428_7063184_+	NA	NA|1237aa|up_0|NZ_LR593886.1_7063344_7067055_+	COG2319, COG2319, FOG: WD40 repeat [General function prediction only]	NA|142aa|down_0|NZ_LR593886.1_7069871_7070297_+	NA	NA|189aa|down_1|NZ_LR593886.1_7070664_7071231_+	TIGR02984, Sig-70_plancto1, RNA polymerase sigma-70 factor, Planctomycetaceae-specific subfamily 1	NA|692aa|down_2|NZ_LR593886.1_7071534_7073610_+	cd14014, STKc_PknB_like, Catalytic domain of bacterial Serine/Threonine kinases, PknB and similar proteins	NA|173aa|down_3|NZ_LR593886.1_7073755_7074274_-	NA	NA|392aa|down_4|NZ_LR593886.1_7075045_7076221_+	pfam01070, FMN_dh, FMN-dependent dehydrogenase	NA|170aa|down_5|NZ_LR593886.1_7076403_7076913_-	NA	NA|128aa|down_6|NZ_LR593886.1_7077056_7077440_+	NA	NA|351aa|down_7|NZ_LR593886.1_7077554_7078607_-	TIGR02037, Probable_periplasmic_serine_protease_do/HhoA-like, periplasmic serine protease, Do/DeqQ family	NA|363aa|down_8|NZ_LR593886.1_7078750_7079839_-	TIGR02037, Probable_periplasmic_serine_protease_do/HhoA-like, periplasmic serine protease, Do/DeqQ family	NA|213aa|down_9|NZ_LR593886.1_7080418_7081057_+	COG1280, RhtB, Putative threonine efflux protein [Amino acid transport and metabolism]
GCF_901538265.1_ASM90153826v1	NZ_LR593886	Gemmata massiliana isolate Soil9 chromosome 1	21	7283945-7284072	24	CRISPRCasFinder	no		csa3,RT,cas1,cas2,cas3,cas8u2,cas7,cas5u,DinG,cas8u1,DEDDh,PD-DExK,csb2gr5	Orphan	TTTGCTACGCACCGGTGAAGAGCAAACCGGTAGCAAACCGG	41	0	0	NA	NA	NA	1	1	Orphan	csa3,RT,cas1,cas2,cas3,cas8u2,cas7,cas5u,DinG,cas8u1,DEDDh,PD-DExK,csb2gr5	NA|208aa|up_6|NZ_LR593886.1_7279095_7279719_-,NA|331aa|up_5|NZ_LR593886.1_7279977_7280970_-,NA|184aa|up_3|NZ_LR593886.1_7281507_7282059_+,NA|106aa|up_2|NZ_LR593886.1_7282367_7282685_+,NA|71aa|up_1|NZ_LR593886.1_7282742_7282955_+,NA|83aa|down_1|NZ_LR593886.1_7286118_7286367_+,NA|97aa|down_2|NZ_LR593886.1_7286715_7287006_+,NA|81aa|down_3|NZ_LR593886.1_7287498_7287741_+	NA|454aa|up_9|NZ_LR593886.1_7271808_7273170_+	pfam12965, DUF3854, Domain of unknown function (DUF3854)	NA|398aa|up_8|NZ_LR593886.1_7276588_7277782_+	pfam00589, Phage_integrase, Phage integrase family	NA|315aa|up_7|NZ_LR593886.1_7277876_7278821_-	pfam06114, Peptidase_M78, IrrE N-terminal-like domain	NA|208aa|up_6|NZ_LR593886.1_7279095_7279719_-	NA	NA|331aa|up_5|NZ_LR593886.1_7279977_7280970_-	NA	NA|97aa|up_4|NZ_LR593886.1_7280989_7281280_-	cd00093, HTH_XRE, Helix-turn-helix XRE-family like proteins	NA|184aa|up_3|NZ_LR593886.1_7281507_7282059_+	NA	NA|106aa|up_2|NZ_LR593886.1_7282367_7282685_+	NA	NA|71aa|up_1|NZ_LR593886.1_7282742_7282955_+	NA	NA|124aa|up_0|NZ_LR593886.1_7283064_7283436_+	pfam12728, HTH_17, Helix-turn-helix domain	NA|243aa|down_0|NZ_LR593886.1_7284477_7285206_+	COG1484, DnaC, DNA replication protein [DNA replication, recombination, and repair]	NA|83aa|down_1|NZ_LR593886.1_7286118_7286367_+	NA	NA|97aa|down_2|NZ_LR593886.1_7286715_7287006_+	NA	NA|81aa|down_3|NZ_LR593886.1_7287498_7287741_+	NA	NA|252aa|down_4|NZ_LR593886.1_7287832_7288588_-	pfam00254, FKBP_C, FKBP-type peptidyl-prolyl cis-trans isomerase	NA|168aa|down_5|NZ_LR593886.1_7289105_7289609_+	PTZ00144, PTZ00144, dihydrolipoamide succinyltransferase; Provisional	NA|770aa|down_6|NZ_LR593886.1_7289771_7292081_-	PRK15041, PRK15041, methyl-accepting chemotaxis protein	NA|164aa|down_7|NZ_LR593886.1_7292175_7292667_-	COG0835, CheW, Chemotaxis signal transduction protein [Cell motility and secretion / Signal transduction mechanisms]	NA|609aa|down_8|NZ_LR593886.1_7293459_7295286_+	TIGR02937, RNA_polymerase_sigma_factor, RNA polymerase sigma factor, sigma-70 family	NA|662aa|down_9|NZ_LR593886.1_7295791_7297777_-	TIGR02937, RNA_polymerase_sigma_factor, RNA polymerase sigma factor, sigma-70 family
GCF_901538265.1_ASM90153826v1	NZ_LR593886	Gemmata massiliana isolate Soil9 chromosome 1	22	8288223-8288341	25	CRISPRCasFinder	no		csa3,RT,cas1,cas2,cas3,cas8u2,cas7,cas5u,DinG,cas8u1,DEDDh,PD-DExK,csb2gr5	Orphan	GGGGATTCAGTGCTGGTACTGATGCCACCCT	31	0	0	NA	NA	NA	1	1	Orphan	csa3,RT,cas1,cas2,cas3,cas8u2,cas7,cas5u,DinG,cas8u1,DEDDh,PD-DExK,csb2gr5	NA|298aa|up_6|NZ_LR593886.1_8280303_8281197_+,NA|101aa|up_0|NZ_LR593886.1_8287480_8287783_+,NA|94aa|down_4|NZ_LR593886.1_8296589_8296871_-,NA|252aa|down_5|NZ_LR593886.1_8297327_8298083_-,NA|90aa|down_6|NZ_LR593886.1_8298452_8298722_-,NA|79aa|down_7|NZ_LR593886.1_8298848_8299085_+,NA|98aa|down_9|NZ_LR593886.1_8299522_8299816_+	NA|355aa|up_9|NZ_LR593886.1_8276766_8277831_+	pfam07596, SBP_bac_10, Protein of unknown function (DUF1559)	NA|311aa|up_8|NZ_LR593886.1_8278065_8278998_-	cd19163, AKR_galDH, L-galactose dehydrogenase (L-galDH) and similar proteins	NA|327aa|up_7|NZ_LR593886.1_8279170_8280151_-	COG0673, MviM, Predicted dehydrogenases and related proteins [General function prediction only]	NA|298aa|up_6|NZ_LR593886.1_8280303_8281197_+	NA	NA|236aa|up_5|NZ_LR593886.1_8281312_8282020_-	cd06561, AlkD_like, A new structural DNA glycosylase	NA|304aa|up_4|NZ_LR593886.1_8282681_8283593_+	COG1104, NifS, Cysteine sulfinate desulfinase/cysteine desulfurase and related enzymes [Amino acid transport and metabolism]	NA|260aa|up_3|NZ_LR593886.1_8283599_8284379_-	cd04221, MauL, Methylamine utilization protein MauL	NA|541aa|up_2|NZ_LR593886.1_8284412_8286035_-	COG1595, RpoE, DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog [Transcription]	NA|332aa|up_1|NZ_LR593886.1_8286233_8287229_-	cd07010, cupin_PMI_type_I_N_bac, Phosphomannose isomerase in bacteria and archaea, N-terminal cupin domain	NA|101aa|up_0|NZ_LR593886.1_8287480_8287783_+	NA	NA|622aa|down_0|NZ_LR593886.1_8289131_8290997_-	cd14014, STKc_PknB_like, Catalytic domain of bacterial Serine/Threonine kinases, PknB and similar proteins	NA|542aa|down_1|NZ_LR593886.1_8291290_8292916_+	cd00038, CAP_ED, effector domain of the CAP family of transcription factors; members include CAP (or cAMP receptor protein (CRP)), which binds cAMP, FNR (fumarate and nitrate reduction), which uses an iron-sulfur cluster to sense oxygen) and CooA, a heme containing CO sensor	NA|415aa|down_2|NZ_LR593886.1_8293554_8294799_-	cd01312, Met_dep_hydrolase_D, Metallo-dependent hydrolases, subgroup D is part of the superfamily of metallo-dependent hydrolases, a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site	NA|67aa|down_3|NZ_LR593886.1_8295899_8296100_+	pfam00589, Phage_integrase, Phage integrase family	NA|94aa|down_4|NZ_LR593886.1_8296589_8296871_-	NA	NA|252aa|down_5|NZ_LR593886.1_8297327_8298083_-	NA	NA|90aa|down_6|NZ_LR593886.1_8298452_8298722_-	NA	NA|79aa|down_7|NZ_LR593886.1_8298848_8299085_+	NA	NA|108aa|down_8|NZ_LR593886.1_8299087_8299411_+	pfam07618, DUF1580, Protein of unknown function (DUF1580)	NA|98aa|down_9|NZ_LR593886.1_8299522_8299816_+	NA
GCF_901538265.1_ASM90153826v1	NZ_LR593886	Gemmata massiliana isolate Soil9 chromosome 1	23	8419356-8419457	26	CRISPRCasFinder	no		csa3,RT,cas1,cas2,cas3,cas8u2,cas7,cas5u,DinG,cas8u1,DEDDh,PD-DExK,csb2gr5	Orphan	TCCCCCTTCCCTTCAGGGAGGGGG	24	0	0	NA	NA	NA	1	1	Orphan	csa3,RT,cas1,cas2,cas3,cas8u2,cas7,cas5u,DinG,cas8u1,DEDDh,PD-DExK,csb2gr5	NA|203aa|up_5|NZ_LR593886.1_8410043_8410652_+,NA|148aa|up_4|NZ_LR593886.1_8411784_8412228_+,NA|254aa|up_1|NZ_LR593886.1_8416392_8417154_-,NA	NA|394aa|up_9|NZ_LR593886.1_8402690_8403872_-	cd07480, Peptidases_S8_12, Peptidase S8 family domain, uncharacterized subfamily 12	NA|194aa|up_8|NZ_LR593886.1_8404274_8404856_+	TIGR02937, RNA_polymerase_sigma_factor, RNA polymerase sigma factor, sigma-70 family	NA|1139aa|up_7|NZ_LR593886.1_8404914_8408331_+	cd14014, STKc_PknB_like, Catalytic domain of bacterial Serine/Threonine kinases, PknB and similar proteins	NA|341aa|up_6|NZ_LR593886.1_8408752_8409775_-	COG4447, COG4447, Uncharacterized protein related to plant photosystem II stability/assembly factor [General function prediction only]	NA|203aa|up_5|NZ_LR593886.1_8410043_8410652_+	NA	NA|148aa|up_4|NZ_LR593886.1_8411784_8412228_+	NA	NA|481aa|up_3|NZ_LR593886.1_8412247_8413690_-	pfam07394, DUF1501, Protein of unknown function (DUF1501)	NA|725aa|up_2|NZ_LR593886.1_8413795_8415970_-	pfam07583, PSCyt2, Protein of unknown function (DUF1549)	NA|254aa|up_1|NZ_LR593886.1_8416392_8417154_-	NA	NA|330aa|up_0|NZ_LR593886.1_8417468_8418458_-	cd09813, 3b-HSD-NSDHL-like_SDR_e, human NSDHL (NAD(P)H steroid dehydrogenase-like protein)-like, extended (e) SDRs	NA|326aa|down_0|NZ_LR593886.1_8420484_8421462_-	PRK00870, PRK00870, haloalkane dehalogenase; Provisional	NA|386aa|down_1|NZ_LR593886.1_8421535_8422693_-	PRK09258, PRK09258, 3-oxoacyl-(acyl carrier protein) synthase III; Reviewed	NA|2309aa|down_2|NZ_LR593886.1_8422769_8429696_-	cd00833, PKS, polyketide synthases (PKSs) polymerize simple fatty acids into a large variety of different products, called polyketides, by successive decarboxylating Claisen condensations	NA|2272aa|down_3|NZ_LR593886.1_8429792_8436608_-	TIGR02813, omega-3_polyunsaturated_fatty_acid_synthase_PfaA, polyketide-type polyunsaturated fatty acid synthase PfaA	NA|552aa|down_4|NZ_LR593886.1_8436635_8438291_-	cd04742, NPD_FabD, 2-Nitropropane dioxygenase (NPD)-like domain, associated with the (acyl-carrier-protein) S-malonyltransferase  FabD	NA|287aa|down_5|NZ_LR593886.1_8438527_8439388_-	TIGR02996, rpt_mate_G_obs, repeat-companion domain TIGR02996	NA|785aa|down_6|NZ_LR593886.1_8439397_8441752_-	sd00006, TPR, Tetratricopeptide repeat	NA|189aa|down_7|NZ_LR593886.1_8441885_8442452_-	PRK05591, rplQ, 50S ribosomal protein L17; Validated	NA|336aa|down_8|NZ_LR593886.1_8442610_8443618_-	PRK05182, PRK05182, DNA-directed RNA polymerase subunit alpha; Provisional	NA|213aa|down_9|NZ_LR593886.1_8443841_8444480_-	PRK05327, rpsD, 30S ribosomal protein S4; Validated
GCF_901538265.1_ASM90153826v1	NZ_LR593886	Gemmata massiliana isolate Soil9 chromosome 1	24	9109972-9116888	27,5,6	CRISPRCasFinder,CRT,PILER-CR	no	cas3,csb2gr5,cas7	csa3,RT,cas1,cas2,cas3,cas8u2,cas7,cas5u,DinG,cas8u1,DEDDh,PD-DExK,csb2gr5	Unclear	GCTTCAATTCGGCCACGGTTGGTGAACCGTGGAGAC,GCTTCAATTCGGCCACGGTTGGTGAACCGTGGAGAC,GCTTCAATTCGGCCACGGTTGGTGAACCGTGGAGAC	36,36,36	0	0	NA	NA	NA:NA:NA	94,94,94	94	Unclear	csa3,RT,cas1,cas2,cas3,cas8u2,cas7,cas5u,DinG,cas8u1,DEDDh,PD-DExK,csb2gr5	NA|137aa|up_9|NZ_LR593886.1_9092247_9092658_+,NA|178aa|up_7|NZ_LR593886.1_9094386_9094920_-,NA|290aa|up_6|NZ_LR593886.1_9095027_9095897_-,NA|159aa|down_3|NZ_LR593886.1_9124781_9125258_+,NA|471aa|down_8|NZ_LR593886.1_9129920_9131333_+	NA|137aa|up_9|NZ_LR593886.1_9092247_9092658_+	NA	NA|371aa|up_8|NZ_LR593886.1_9092795_9093908_+	pfam13808, DDE_Tnp_1_assoc, DDE_Tnp_1-associated	NA|178aa|up_7|NZ_LR593886.1_9094386_9094920_-	NA	NA|290aa|up_6|NZ_LR593886.1_9095027_9095897_-	NA	NA|231aa|up_5|NZ_LR593886.1_9095889_9096582_-	cd04301, NAT_SF, N-Acyltransferase superfamily: Various enzymes that characteristically catalyze the transfer of an acyl group to a substrate	NA|229aa|up_4|NZ_LR593886.1_9096578_9097265_-	pfam02384, N6_Mtase, N-6 DNA Methylase	NA|87aa|up_3|NZ_LR593886.1_9098725_9098986_-	pfam13384, HTH_23, Homeodomain-like domain	NA|344aa|up_2|NZ_LR593886.1_9099084_9100116_-	sd00033, LRR_RI, leucine-rich repeats, ribonuclease inhibitor (RI)-like subfamily	NA|2547aa|up_1|NZ_LR593886.1_9100416_9108057_-	TIGR03696, tRNA_nuclease_WapA, RHS repeat-associated core domain	NA|435aa|up_0|NZ_LR593886.1_9108500_9109805_-	pfam13578, Methyltransf_24, Methyltransferase domain	cas3|1376aa|down_0|NZ_LR593886.1_9117398_9121526_-	TIGR02621, CRISPR-associated_helicase_Cas3, CRISPR-associated helicase Cas3, subtype Dpsyc	csb2gr5|495aa|down_1|NZ_LR593886.1_9121522_9123007_-	TIGR02165, CRISPR-associated_protein_GSU0054_family, CRISPR-associated protein GSU0054/csb2, Dpsyc system	cas7|446aa|down_2|NZ_LR593886.1_9123010_9124348_-	pfam09617, Cas_GSU0053, CRISPR-associated protein GSU0053 (Cas_GSU0053)	NA|159aa|down_3|NZ_LR593886.1_9124781_9125258_+	NA	NA|588aa|down_4|NZ_LR593886.1_9125643_9127407_+	COG0841, AcrB, Cation/multidrug efflux pump [Defense mechanisms]	NA|234aa|down_5|NZ_LR593886.1_9127895_9128597_+	pfam05988, DUF899, Bacterial protein of unknown function (DUF899)	NA|173aa|down_6|NZ_LR593886.1_9128667_9129186_+	COG0475, KefB, Kef-type K+ transport systems, membrane components [Inorganic ion transport and metabolism]	NA|155aa|down_7|NZ_LR593886.1_9129227_9129692_+	pfam14690, zf-ISL3, zinc-finger of transposase IS204/IS1001/IS1096/IS1165	NA|471aa|down_8|NZ_LR593886.1_9129920_9131333_+	NA	NA|274aa|down_9|NZ_LR593886.1_9131458_9132280_+	sd00006, TPR, Tetratricopeptide repeat
GCF_901538265.1_ASM90153826v1	NZ_LR593886	Gemmata massiliana isolate Soil9 chromosome 1	25	9355287-9355394	28	CRISPRCasFinder	no		csa3,RT,cas1,cas2,cas3,cas8u2,cas7,cas5u,DinG,cas8u1,DEDDh,PD-DExK,csb2gr5	Orphan	GTAGGTCGGGCTGTGCCCGACGCAACATTCACAGCC	36	0	0	NA	NA	NA	1	1	Orphan	csa3,RT,cas1,cas2,cas3,cas8u2,cas7,cas5u,DinG,cas8u1,DEDDh,PD-DExK,csb2gr5	NA|177aa|up_4|NZ_LR593886.1_9345322_9345853_-,NA|633aa|up_3|NZ_LR593886.1_9345985_9347884_+,NA|285aa|up_0|NZ_LR593886.1_9354399_9355254_+,NA|97aa|down_2|NZ_LR593886.1_9357652_9357943_-,NA|114aa|down_3|NZ_LR593886.1_9357946_9358288_-,NA|210aa|down_6|NZ_LR593886.1_9365105_9365735_+,NA|100aa|down_8|NZ_LR593886.1_9366807_9367107_+,NA|185aa|down_9|NZ_LR593886.1_9367190_9367745_+	NA|156aa|up_9|NZ_LR593886.1_9341555_9342023_+	cd07246, VOC_like, uncharacterized subfamily of vicinal oxygen chelate (VOC) family	NA|120aa|up_8|NZ_LR593886.1_9342076_9342436_-	COG5652, COG5652, Predicted integral membrane protein [Function unknown]	NA|356aa|up_7|NZ_LR593886.1_9342672_9343740_-	PRK01059, PRK01059, ATP:guanido phosphotransferase; Provisional	NA|161aa|up_6|NZ_LR593886.1_9343835_9344318_-	COG3880, COG3880, Modulator of heat shock repressor CtsR, McsA [Signal transduction    mechanisms]	NA|157aa|up_5|NZ_LR593886.1_9344366_9344837_-	COG3880, COG3880, Modulator of heat shock repressor CtsR, McsA [Signal transduction    mechanisms]	NA|177aa|up_4|NZ_LR593886.1_9345322_9345853_-	NA	NA|633aa|up_3|NZ_LR593886.1_9345985_9347884_+	NA	NA|666aa|up_2|NZ_LR593886.1_9348246_9350244_+	pfam02119, FlgI, Flagellar P-ring protein	NA|1257aa|up_1|NZ_LR593886.1_9350445_9354216_+	TIGR02168, Chromosome_partition_protein_Smc, chromosome segregation protein SMC, common bacterial type	NA|285aa|up_0|NZ_LR593886.1_9354399_9355254_+	NA	NA|381aa|down_0|NZ_LR593886.1_9355446_9356589_-	TIGR03700, mena_SCO4494, putative menaquinone biosynthesis radical SAM enzyme, SCO4494 family	NA|214aa|down_1|NZ_LR593886.1_9356884_9357526_+	TIGR03000, plancto_dom_1, Planctomycetes uncharacterized domain TIGR03000	NA|97aa|down_2|NZ_LR593886.1_9357652_9357943_-	NA	NA|114aa|down_3|NZ_LR593886.1_9357946_9358288_-	NA	NA|1189aa|down_4|NZ_LR593886.1_9358326_9361893_-	PRK06039, ileS, isoleucyl-tRNA synthetase; Reviewed	NA|872aa|down_5|NZ_LR593886.1_9362224_9364840_+	cd01461, vWA_interalpha_trypsin_inhibitor, vWA_interalpha trypsin inhibitor (ITI): ITI is a glycoprotein composed of three polypeptides- two heavy chains and one light chain (bikunin)	NA|210aa|down_6|NZ_LR593886.1_9365105_9365735_+	NA	NA|254aa|down_7|NZ_LR593886.1_9365845_9366607_-	pfam13649, Methyltransf_25, Methyltransferase domain	NA|100aa|down_8|NZ_LR593886.1_9366807_9367107_+	NA	NA|185aa|down_9|NZ_LR593886.1_9367190_9367745_+	NA
GCF_901538265.1_ASM90153826v1	NZ_LR593886	Gemmata massiliana isolate Soil9 chromosome 1	26	9565915-9566370	29	CRISPRCasFinder	no		csa3,RT,cas1,cas2,cas3,cas8u2,cas7,cas5u,DinG,cas8u1,DEDDh,PD-DExK,csb2gr5	Orphan	ATGGTGAGGCTGGTGAGCCCCTT	23	0	0	NA	NA	NA	6	6	Orphan	csa3,RT,cas1,cas2,cas3,cas8u2,cas7,cas5u,DinG,cas8u1,DEDDh,PD-DExK,csb2gr5	NA|111aa|up_6|NZ_LR593886.1_9558016_9558349_+,NA|80aa|up_4|NZ_LR593886.1_9559683_9559923_-,NA|89aa|up_3|NZ_LR593886.1_9560080_9560347_-,NA|29aa|up_0|NZ_LR593886.1_9564567_9564654_-,NA|77aa|down_0|NZ_LR593886.1_9567711_9567942_+,NA|1308aa|down_7|NZ_LR593886.1_9580729_9584653_+	NA|1512aa|up_9|NZ_LR593886.1_9546825_9551361_-	cd01461, vWA_interalpha_trypsin_inhibitor, vWA_interalpha trypsin inhibitor (ITI): ITI is a glycoprotein composed of three polypeptides- two heavy chains and one light chain (bikunin)	NA|1200aa|up_8|NZ_LR593886.1_9553423_9557023_+	TIGR02082, Methionine_synthase, 5-methyltetrahydrofolate--homocysteine methyltransferase	NA|286aa|up_7|NZ_LR593886.1_9557149_9558007_+	COG1131, CcmA, ABC-type multidrug transport system, ATPase component [Defense mechanisms]	NA|111aa|up_6|NZ_LR593886.1_9558016_9558349_+	NA	NA|409aa|up_5|NZ_LR593886.1_9558380_9559607_+	COG0842, COG0842, ABC-type multidrug transport system, permease component [Defense mechanisms]	NA|80aa|up_4|NZ_LR593886.1_9559683_9559923_-	NA	NA|89aa|up_3|NZ_LR593886.1_9560080_9560347_-	NA	NA|840aa|up_2|NZ_LR593886.1_9560494_9563014_-	cd07473, Peptidases_S8_Subtilisin_like, Peptidase S8 family domain in Subtilisin-like proteins	NA|308aa|up_1|NZ_LR593886.1_9563612_9564536_+	pfam08668, HDOD, HDOD domain	NA|29aa|up_0|NZ_LR593886.1_9564567_9564654_-	NA	NA|77aa|down_0|NZ_LR593886.1_9567711_9567942_+	NA	NA|485aa|down_1|NZ_LR593886.1_9567934_9569389_-	pfam07394, DUF1501, Protein of unknown function (DUF1501)	NA|848aa|down_2|NZ_LR593886.1_9569499_9572043_-	pfam07583, PSCyt2, Protein of unknown function (DUF1549)	NA|1453aa|down_3|NZ_LR593886.1_9572222_9576581_-	TIGR02604, Piru_Ver_Nterm, putative membrane-bound dehydrogenase domain	NA|389aa|down_4|NZ_LR593886.1_9576758_9577925_-	COG1940, NagC, Transcriptional regulator/sugar kinase [Transcription / Carbohydrate transport and metabolism]	NA|194aa|down_5|NZ_LR593886.1_9578288_9578870_+	TIGR03495, Probable_spanin_inner_membrane_subunit, phage lysis regulatory protein, LysB family	NA|454aa|down_6|NZ_LR593886.1_9579088_9580450_-	COG0642, BaeS, Signal transduction histidine kinase [Signal transduction mechanisms]	NA|1308aa|down_7|NZ_LR593886.1_9580729_9584653_+	NA	NA|106aa|down_8|NZ_LR593886.1_9584788_9585106_+	COG3309, VapD, Uncharacterized virulence-associated protein D [Function unknown]	NA|64aa|down_9|NZ_LR593886.1_9585136_9585328_+	COG1598, COG1598, Predicted nuclease of the RNAse H fold, HicB family [General    function prediction only]
GCF_901538265.1_ASM90153826v1	NZ_LR593886	Gemmata massiliana isolate Soil9 chromosome 1	27	9566491-9566657	30	CRISPRCasFinder	no		csa3,RT,cas1,cas2,cas3,cas8u2,cas7,cas5u,DinG,cas8u1,DEDDh,PD-DExK,csb2gr5	Orphan	ATGGTGAGGCTGGTGAGCCCCTT	23	0	0	NA	NA	NA	2	2	Orphan	csa3,RT,cas1,cas2,cas3,cas8u2,cas7,cas5u,DinG,cas8u1,DEDDh,PD-DExK,csb2gr5	NA|111aa|up_6|NZ_LR593886.1_9558016_9558349_+,NA|80aa|up_4|NZ_LR593886.1_9559683_9559923_-,NA|89aa|up_3|NZ_LR593886.1_9560080_9560347_-,NA|29aa|up_0|NZ_LR593886.1_9564567_9564654_-,NA|77aa|down_0|NZ_LR593886.1_9567711_9567942_+,NA|1308aa|down_7|NZ_LR593886.1_9580729_9584653_+	NA|1512aa|up_9|NZ_LR593886.1_9546825_9551361_-	cd01461, vWA_interalpha_trypsin_inhibitor, vWA_interalpha trypsin inhibitor (ITI): ITI is a glycoprotein composed of three polypeptides- two heavy chains and one light chain (bikunin)	NA|1200aa|up_8|NZ_LR593886.1_9553423_9557023_+	TIGR02082, Methionine_synthase, 5-methyltetrahydrofolate--homocysteine methyltransferase	NA|286aa|up_7|NZ_LR593886.1_9557149_9558007_+	COG1131, CcmA, ABC-type multidrug transport system, ATPase component [Defense mechanisms]	NA|111aa|up_6|NZ_LR593886.1_9558016_9558349_+	NA	NA|409aa|up_5|NZ_LR593886.1_9558380_9559607_+	COG0842, COG0842, ABC-type multidrug transport system, permease component [Defense mechanisms]	NA|80aa|up_4|NZ_LR593886.1_9559683_9559923_-	NA	NA|89aa|up_3|NZ_LR593886.1_9560080_9560347_-	NA	NA|840aa|up_2|NZ_LR593886.1_9560494_9563014_-	cd07473, Peptidases_S8_Subtilisin_like, Peptidase S8 family domain in Subtilisin-like proteins	NA|308aa|up_1|NZ_LR593886.1_9563612_9564536_+	pfam08668, HDOD, HDOD domain	NA|29aa|up_0|NZ_LR593886.1_9564567_9564654_-	NA	NA|77aa|down_0|NZ_LR593886.1_9567711_9567942_+	NA	NA|485aa|down_1|NZ_LR593886.1_9567934_9569389_-	pfam07394, DUF1501, Protein of unknown function (DUF1501)	NA|848aa|down_2|NZ_LR593886.1_9569499_9572043_-	pfam07583, PSCyt2, Protein of unknown function (DUF1549)	NA|1453aa|down_3|NZ_LR593886.1_9572222_9576581_-	TIGR02604, Piru_Ver_Nterm, putative membrane-bound dehydrogenase domain	NA|389aa|down_4|NZ_LR593886.1_9576758_9577925_-	COG1940, NagC, Transcriptional regulator/sugar kinase [Transcription / Carbohydrate transport and metabolism]	NA|194aa|down_5|NZ_LR593886.1_9578288_9578870_+	TIGR03495, Probable_spanin_inner_membrane_subunit, phage lysis regulatory protein, LysB family	NA|454aa|down_6|NZ_LR593886.1_9579088_9580450_-	COG0642, BaeS, Signal transduction histidine kinase [Signal transduction mechanisms]	NA|1308aa|down_7|NZ_LR593886.1_9580729_9584653_+	NA	NA|106aa|down_8|NZ_LR593886.1_9584788_9585106_+	COG3309, VapD, Uncharacterized virulence-associated protein D [Function unknown]	NA|64aa|down_9|NZ_LR593886.1_9585136_9585328_+	COG1598, COG1598, Predicted nuclease of the RNAse H fold, HicB family [General    function prediction only]
