assembly_id	genome_id	genome_def	crispr_array_locus_merge	crispr_array_location_merge	crispr_locus_id	crispr_pred_method	array_in_prot	prot_within_array_20000	prot_in_genome	crispr_type_by_cas_prot	consensus_repeat	repeat_length	self-targeting_spacer_number	self-targeting_target_number	spacer_location	protospacer_location	repeat_type	spacer_locus_num	spacer_num	correct_crispr_type	genome_cas_prots	unknown_protein_around_crispr	L10	L10_domain	L9	L9_domain	L8	L8_domain	L7	L7_domain	L6	L6_domain	L5	L5_domain	L4	L4_domain	L3	L3_domain	L2	L2_domain	L1	L1_domain	R1	R1_domain	R2	R2_domain	R3	R3_domain	R4	R4_domain	R5	R5_domain	R6	R6_domain	R7	R7_domain	R8	R8_domain	R9	R9_domain	R10	R10_domain
GCF_000177615.2_ASM17761v2	NC_015656	Candidatus Frankia datiscae, complete sequence	1	303167-303264	1	CRISPRCasFinder	no		RT,csa3,cas3,cas6,cas4,DEDDh,WYL,cas3HD,c2c9_V-U4	Orphan	GTTGGTGTGGCGGCTGGCGCCGCC	24	0	0	NA	NA	NA	1	1	Orphan	RT,csa3,cas3,cas6,cas4,DEDDh,WYL,cas3HD,c2c9_V-U4	NA|98aa|up_9|NC_015656.1_291386_291680_+,NA|56aa|up_7|NC_015656.1_293603_293771_-,NA|200aa|up_1|NC_015656.1_299855_300455_+,NA|235aa|down_1|NC_015656.1_304624_305329_-,NA|187aa|down_4|NC_015656.1_307318_307879_-	NA|98aa|up_9|NC_015656.1_291386_291680_+	NA	NA|479aa|up_8|NC_015656.1_291980_293417_+	pfam03050, DDE_Tnp_IS66, Transposase IS66 family	NA|56aa|up_7|NC_015656.1_293603_293771_-	NA	NA|305aa|up_6|NC_015656.1_294480_295395_+	pfam00583, Acetyltransf_1, Acetyltransferase (GNAT) family	NA|324aa|up_5|NC_015656.1_295603_296575_+	cd00739, DHPS, DHPS subgroup of Pterin binding enzymes	NA|152aa|up_4|NC_015656.1_296571_297027_+	pfam02152, FolB, Dihydroneopterin aldolase	NA|207aa|up_3|NC_015656.1_297023_297644_+	COG0801, FolK, 7,8-dihydro-6-hydroxymethylpterin-pyrophosphokinase [Coenzyme metabolism]	NA|146aa|up_2|NC_015656.1_297640_298078_+	pfam11377, DUF3180, Protein of unknown function (DUF3180)	NA|200aa|up_1|NC_015656.1_299855_300455_+	NA	NA|348aa|up_0|NC_015656.1_300903_301947_-	COG3509, LpqC, Poly(3-hydroxybutyrate) depolymerase [Secondary metabolites biosynthesis, transport, and catabolism]	NA|286aa|down_0|NC_015656.1_303302_304160_-	pfam05552, TM_helix, Conserved TM helix	NA|235aa|down_1|NC_015656.1_304624_305329_-	NA	NA|209aa|down_2|NC_015656.1_305618_306245_+	TIGR03914, SPO1_DNA_polymerase-related_protein, uracil-DNA glycosylase family domain	NA|293aa|down_3|NC_015656.1_306368_307247_-	cd16913, YkuD_like, L,D-transpeptidases/carboxypeptidases similar to Bacillus YkuD	NA|187aa|down_4|NC_015656.1_307318_307879_-	NA	NA|268aa|down_5|NC_015656.1_308298_309102_-	TIGR00558, Pyridoxine/pyridoxamine_5'-phosphate_oxidase, pyridoxamine-phosphate oxidase	NA|365aa|down_6|NC_015656.1_309449_310544_+	PRK12350, PRK12350, citrate synthase 2; Provisional	NA|373aa|down_7|NC_015656.1_310536_311655_+	PRK03080, PRK03080, phosphoserine transaminase	NA|456aa|down_8|NC_015656.1_311846_313214_-	cd06116, CaCS_like, Chloroflexus aurantiacus (Ca) citrate synthase (CS)_like	NA|335aa|down_9|NC_015656.1_313628_314633_+	cd19087, AKR_AKR12A1_B1_C1, AKR12A, AKR12B,  AKR12C families of aldo-keto reductase (AKR)
GCF_000177615.2_ASM17761v2	NC_015656	Candidatus Frankia datiscae, complete sequence	2	610144-610226	2	CRISPRCasFinder	no	csa3	RT,csa3,cas3,cas6,cas4,DEDDh,WYL,cas3HD,c2c9_V-U4	Type I-A	GAGTATCCGCACCCGGCGCGGGG	23	0	0	NA	NA	NA	1	1	Orphan	RT,csa3,cas3,cas6,cas4,DEDDh,WYL,cas3HD,c2c9_V-U4	NA|168aa|up_6|NC_015656.1_601046_601550_-,NA|81aa|up_3|NC_015656.1_605107_605350_+,NA|141aa|up_2|NC_015656.1_605553_605976_-,NA	NA|72aa|up_9|NC_015656.1_599498_599714_-	TIGR04268, FxSxx-COOH, FXSXX-COOH protein	NA|109aa|up_8|NC_015656.1_599821_600148_-	cd02230, cupin_HP0902-like, Helicobacter pylori HP0902 and related proteins, cupin domain	NA|267aa|up_7|NC_015656.1_600261_601062_+	pfam01510, Amidase_2, N-acetylmuramoyl-L-alanine amidase	NA|168aa|up_6|NC_015656.1_601046_601550_-	NA	NA|593aa|up_5|NC_015656.1_601734_603513_-	pfam02026, RyR, RyR domain	NA|314aa|up_4|NC_015656.1_603720_604662_-	cd04741, DHOD_1A_like, Dihydroorotate dehydrogenase (DHOD) class 1A FMN-binding domain	NA|81aa|up_3|NC_015656.1_605107_605350_+	NA	NA|141aa|up_2|NC_015656.1_605553_605976_-	NA	NA|268aa|up_1|NC_015656.1_606833_607637_+	cd04622, CBS_pair_HRP1_like, CBS pair domain found in Hypoxic Response Protein 1 (HRP1) -like proteinds	NA|217aa|up_0|NC_015656.1_607736_608387_+	COG1926, COG1926, Predicted phosphoribosyltransferases [General function prediction only]	NA|732aa|down_0|NC_015656.1_611325_613521_+	cd00200, WD40, WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment	NA|70aa|down_1|NC_015656.1_614779_614989_-	pfam13676, TIR_2, TIR domain	NA|748aa|down_2|NC_015656.1_615245_617489_-	pfam07228, SpoIIE, Stage II sporulation protein E (SpoIIE)	NA|1285aa|down_3|NC_015656.1_617777_621632_+	TIGR02956, sensor_protein_TorS, TMAO reductase sytem sensor TorS	NA|506aa|down_4|NC_015656.1_622040_623558_+	cd13653, PBP2_phosphate_like_1, Substrate binding domain of putative ABC-type phosphate transporter, a member of the type 2 periplasmic binding fold superfamily	csa3|140aa|down_5|NC_015656.1_623654_624074_-	smart00418, HTH_ARSR, helix_turn_helix, Arsenical Resistance Operon Repressor	NA|185aa|down_6|NC_015656.1_624181_624736_+	cd07254, VOC_like, uncharacterized subfamily of vicinal oxygen chelate (VOC) family	NA|277aa|down_7|NC_015656.1_624735_625566_+	COG0580, GlpF, Glycerol uptake facilitator and related permeases (Major Intrinsic Protein Family) [Carbohydrate transport and metabolism]	NA|137aa|down_8|NC_015656.1_625590_626001_+	cd16345, LMWP_ArsC, Arsenate reductase of the LMWP family	NA|383aa|down_9|NC_015656.1_626188_627337_+	COG2197, CitB, Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain [Signal transduction mechanisms / Transcription]
GCF_000177615.2_ASM17761v2	NC_015656	Candidatus Frankia datiscae, complete sequence	3	851605-851688	3	CRISPRCasFinder	no		RT,csa3,cas3,cas6,cas4,DEDDh,WYL,cas3HD,c2c9_V-U4	Orphan	CCCGCCTCCCCGGGGCGCTGGTGG	24	0	0	NA	NA	NA	1	1	Orphan	RT,csa3,cas3,cas6,cas4,DEDDh,WYL,cas3HD,c2c9_V-U4	NA|182aa|up_0|NC_015656.1_850746_851292_+,NA|68aa|down_0|NC_015656.1_851758_851962_-	NA|438aa|up_9|NC_015656.1_830747_832061_-	PRK07811, PRK07811, cystathionine gamma-synthase; Provisional	NA|460aa|up_8|NC_015656.1_832077_833457_-	TIGR01137, Cystathionine_beta-synthase, cystathionine beta-synthase	NA|541aa|up_7|NC_015656.1_833760_835383_-	cd06843, PLPDE_III_PvsE_like, Type III Pyridoxal 5-phosphate (PLP)-Dependent Enzyme PvsE	NA|137aa|up_6|NC_015656.1_835460_835871_-	cd16936, HATPase_RsbW-like, Histidine kinase-like ATPase domain of RsbW, an anti sigma-B factor and serine-protein kinase involved in regulating sigma-B during stress in Bacilli, and related domains	NA|480aa|up_5|NC_015656.1_836081_837521_-	pfam07228, SpoIIE, Stage II sporulation protein E (SpoIIE)	NA|1327aa|up_4|NC_015656.1_838997_842978_+	PRK11107, PRK11107, hybrid sensory histidine kinase BarA; Provisional	NA|231aa|up_3|NC_015656.1_843369_844062_+	cd19920, REC_PA4781-like, phosphoacceptor receiver (REC) domain of cyclic di-GMP phosphodiesterase PA4781 and similar domains	NA|272aa|up_2|NC_015656.1_845681_846497_-	pfam03713, DUF305, Domain of unknown function (DUF305)	NA|596aa|up_1|NC_015656.1_848312_850100_-	COG3387, SGA1, Glucoamylase and related glycosyl hydrolases [Carbohydrate transport and metabolism]	NA|182aa|up_0|NC_015656.1_850746_851292_+	NA	NA|68aa|down_0|NC_015656.1_851758_851962_-	NA	NA|425aa|down_1|NC_015656.1_852775_854050_+	TIGR04182, glyco_TIGR04182, glycosyltransferase, TIGR04182 family	NA|505aa|down_2|NC_015656.1_854411_855926_+	COG1807, ArnT, 4-amino-4-deoxy-L-arabinose transferase and related glycosyltransferases of PMT family [Cell envelope biogenesis, outer membrane]	NA|521aa|down_3|NC_015656.1_855858_857421_-	pfam16192, PMT_4TMC, C-terminal four TMM region of protein-O-mannosyltransferase	NA|303aa|down_4|NC_015656.1_857571_858480_+	COG0313, COG0313, Predicted methyltransferases [General function prediction only]	NA|157aa|down_5|NC_015656.1_858757_859228_-	cd06259, YdcF-like, YdcF-like	NA|368aa|down_6|NC_015656.1_859563_860667_-	cd06338, PBP1_ABC_ligand_binding-like, type 1 periplasmic ligand-binding domain of uncharacterized ABC (ATPase Binding Cassette)-type active transport systems predicted to be involved in transport of amino acids, peptides, or inorganic ions	NA|275aa|down_7|NC_015656.1_860960_861785_+	cd01310, TatD_DNAse, TatD like proteins;  E	NA|320aa|down_8|NC_015656.1_861898_862858_+	PRK00274, ksgA, 16S rRNA (adenine(1518)-N(6)/adenine(1519)-N(6))-dimethyltransferase RsmA	NA|242aa|down_9|NC_015656.1_863193_863919_-	NF033218, anchor_AmaP, alkaline shock response membrane anchor protein AmaP
GCF_000177615.2_ASM17761v2	NC_015656	Candidatus Frankia datiscae, complete sequence	4	1161340-1161423	4	CRISPRCasFinder	no		RT,csa3,cas3,cas6,cas4,DEDDh,WYL,cas3HD,c2c9_V-U4	Orphan	GCCGCGTGGGGAAGGAAAGGGGC	23	0	0	NA	NA	NA	1	1	Orphan	RT,csa3,cas3,cas6,cas4,DEDDh,WYL,cas3HD,c2c9_V-U4	NA|59aa|up_4|NC_015656.1_1150962_1151139_-,NA|310aa|down_0|NC_015656.1_1161428_1162358_-,NA|631aa|down_7|NC_015656.1_1171573_1173466_-	NA|132aa|up_9|NC_015656.1_1146095_1146491_-	pfam09055, Sod_Ni, Nickel-containing superoxide dismutase	NA|92aa|up_8|NC_015656.1_1146690_1146966_+	cd06462, Peptidase_S24_S26, The S24, S26 LexA/signal peptidase superfamily contains LexA-related and type I signal peptidase families	NA|206aa|up_7|NC_015656.1_1146926_1147544_-	pfam00908, dTDP_sugar_isom, dTDP-4-dehydrorhamnose 3,5-epimerase	NA|423aa|up_6|NC_015656.1_1147916_1149185_+	COG0281, SfcA, Malic enzyme [Energy production and conversion]	NA|325aa|up_5|NC_015656.1_1149801_1150776_+	cd08266, Zn_ADH_like1, Alcohol dehydrogenases of the MDR family	NA|59aa|up_4|NC_015656.1_1150962_1151139_-	NA	NA|1252aa|up_3|NC_015656.1_1151192_1154948_-	PRK12270, kgd, multifunctional oxoglutarate decarboxylase/oxoglutarate dehydrogenase thiamine pyrophosphate-binding subunit/dihydrolipoyllysine-residue succinyltransferase subunit	NA|745aa|up_2|NC_015656.1_1155160_1157395_+	COG0145, HyuA, N-methylhydantoinase A/acetone carboxylase, beta subunit [Amino acid transport and metabolism / Secondary metabolites biosynthesis, transport, and catabolism]	NA|499aa|up_1|NC_015656.1_1158076_1159573_+	COG2421, COG2421, Predicted acetamidase/formamidase [Energy production and conversion]	NA|420aa|up_0|NC_015656.1_1159869_1161129_-	pfam13006, Nterm_IS4, Insertion element 4 transposase N-terminal	NA|310aa|down_0|NC_015656.1_1161428_1162358_-	NA	NA|255aa|down_1|NC_015656.1_1162657_1163422_-	PRK05950, sdhB, succinate dehydrogenase iron-sulfur subunit; Reviewed	NA|585aa|down_2|NC_015656.1_1163804_1165559_-	PRK08205, sdhA, succinate dehydrogenase flavoprotein subunit; Reviewed	NA|146aa|down_3|NC_015656.1_1165588_1166026_-	cd03500, SQR_TypeA_SdhD_like, Succinate:quinone oxidoreductase (SQR) Type A subfamily, Succinate dehydrogenase D (SdhD)-like subunit; SQR catalyzes the oxidation of succinate to fumarate coupled to the reduction of quinone to quinol	NA|132aa|down_4|NC_015656.1_1166025_1166421_-	cd03501, SQR_TypeA_SdhC_like, Succinate:quinone oxidoreductase (SQR) Type A subfamily, Succinate dehydrogenase C (SdhC)-like subunit; SQR catalyzes the oxidation of succinate to fumarate coupled to the reduction of quinone to quinol	NA|662aa|down_5|NC_015656.1_1166822_1168808_-	cd14014, STKc_PknB_like, Catalytic domain of bacterial Serine/Threonine kinases, PknB and similar proteins	NA|812aa|down_6|NC_015656.1_1168915_1171351_-	cd14014, STKc_PknB_like, Catalytic domain of bacterial Serine/Threonine kinases, PknB and similar proteins	NA|631aa|down_7|NC_015656.1_1171573_1173466_-	NA	NA|198aa|down_8|NC_015656.1_1173842_1174436_+	PRK05578, PRK05578, cytidine deaminase; Validated	NA|428aa|down_9|NC_015656.1_1174432_1175716_+	PRK05820, deoA, thymidine phosphorylase; Reviewed
GCF_000177615.2_ASM17761v2	NC_015656	Candidatus Frankia datiscae, complete sequence	5	1556731-1556819	5	CRISPRCasFinder	no		RT,csa3,cas3,cas6,cas4,DEDDh,WYL,cas3HD,c2c9_V-U4	Orphan	GCCGGTTCGGACCGCCCGAGCAGGGAGC	28	0	0	NA	NA	NA	1	1	Orphan	RT,csa3,cas3,cas6,cas4,DEDDh,WYL,cas3HD,c2c9_V-U4	NA|438aa|up_0|NC_015656.1_1554499_1555813_-,NA|370aa|down_8|NC_015656.1_1572766_1573876_+	NA|427aa|up_9|NC_015656.1_1537773_1539054_+	cd08953, KR_2_SDR_x, ketoreductase (KR), subgroup 2, complex (x) SDRs	NA|1099aa|up_8|NC_015656.1_1539040_1542337_+	cd00833, PKS, polyketide synthases (PKSs) polymerize simple fatty acids into a large variety of different products, called polyketides, by successive decarboxylating Claisen condensations	NA|725aa|up_7|NC_015656.1_1542792_1544967_+	pfam14765, PS-DH, Polyketide synthase dehydratase	NA|162aa|up_6|NC_015656.1_1544963_1545449_+	PRK05350, PRK05350, acyl carrier protein; Provisional	NA|410aa|up_5|NC_015656.1_1546509_1547739_+	cd03784, GT1_Gtf-like, UDP-glycosyltransferases and similar proteins	NA|576aa|up_4|NC_015656.1_1548646_1550374_+	PRK09284, PRK09284, thiamine biosynthesis protein ThiC; Provisional	NA|190aa|up_3|NC_015656.1_1550470_1551040_-	COG2852, COG2852, Very-short-patch-repair endonuclease [Replication, recombination,    and repair]	NA|340aa|up_2|NC_015656.1_1551641_1552661_-	PRK09599, PRK09599, NADP-dependent phosphogluconate dehydrogenase	NA|309aa|up_1|NC_015656.1_1553315_1554242_-	cd01050, Acyl_ACP_Desat, Acyl ACP desaturase, ferritin-like diiron-binding domain	NA|438aa|up_0|NC_015656.1_1554499_1555813_-	NA	NA|390aa|down_0|NC_015656.1_1558802_1559972_+	cd01152, ACAD_fadE6_17_26, Putative acyl-CoA dehydrogenases similar to fadE6, fadE17, and fadE26	NA|386aa|down_1|NC_015656.1_1559982_1561140_+	TIGR03203, pimD_small, pimeloyl-CoA dehydrogenase, small subunit	NA|571aa|down_2|NC_015656.1_1561340_1563053_-	COG0578, GlpA, Glycerol-3-phosphate dehydrogenase [Energy production and conversion]	NA|577aa|down_3|NC_015656.1_1563070_1564801_+	COG0277, GlcD, FAD/FMN-containing dehydrogenases [Energy production and conversion]	NA|244aa|down_4|NC_015656.1_1564953_1565685_+	cd05373, SDR_c10, classical (c) SDR, subgroup  10	NA|377aa|down_5|NC_015656.1_1565734_1566865_+	pfam11209, DUF2993, Protein of unknown function (DUF2993)	NA|816aa|down_6|NC_015656.1_1567582_1570030_+	COG1131, CcmA, ABC-type multidrug transport system, ATPase component [Defense mechanisms]	NA|688aa|down_7|NC_015656.1_1570336_1572400_-	cd14014, STKc_PknB_like, Catalytic domain of bacterial Serine/Threonine kinases, PknB and similar proteins	NA|370aa|down_8|NC_015656.1_1572766_1573876_+	NA	NA|350aa|down_9|NC_015656.1_1573975_1575025_-	TIGR03558, oxido_grp_1, luciferase family oxidoreductase, group 1
GCF_000177615.2_ASM17761v2	NC_015656	Candidatus Frankia datiscae, complete sequence	6	2008918-2009004	6	CRISPRCasFinder	no		RT,csa3,cas3,cas6,cas4,DEDDh,WYL,cas3HD,c2c9_V-U4	Orphan	TGTCGCGTGCTCGGAGGCCGCTTCCCAGT	29	0	0	NA	NA	NA	1	1	Orphan	RT,csa3,cas3,cas6,cas4,DEDDh,WYL,cas3HD,c2c9_V-U4	NA|74aa|up_8|NC_015656.1_1997399_1997621_+,NA	NA|131aa|up_9|NC_015656.1_1996542_1996935_+	PRK14965, PRK14965, DNA polymerase III subunits gamma and tau; Provisional	NA|74aa|up_8|NC_015656.1_1997399_1997621_+	NA	NA|413aa|up_7|NC_015656.1_1997797_1999036_+	cd03812, GT4_CapH-like, capsular polysaccharide biosynthesis glycosyltransferase CapH and similar proteins	NA|211aa|up_6|NC_015656.1_1998941_1999574_-	cd04683, Nudix_Hydrolase_24, Members of the Nudix hydrolase superfamily catalyze the hydrolysis of NUcleoside DIphosphates linked to other moieties, X	NA|336aa|up_5|NC_015656.1_2001087_2002095_+	COG0057, GapA, Glyceraldehyde-3-phosphate dehydrogenase/erythrose-4-phosphate dehydrogenase [Carbohydrate transport and metabolism]	NA|422aa|up_4|NC_015656.1_2002322_2003588_+	PRK00073, pgk, phosphoglycerate kinase; Provisional	NA|276aa|up_3|NC_015656.1_2003589_2004417_+	PRK00042, tpiA, triosephosphate isomerase; Provisional	NA|591aa|up_2|NC_015656.1_2004614_2006387_+	cd08506, PBP2_clavulanate_OppA2, The substrate-binding domain of an oligopeptide binding protein (OppA2) from the biosynthesis pathway of the beta-lactamase inhibitor clavulanic acid contains the type 2 periplasmic binding fold	NA|77aa|up_1|NC_015656.1_2006757_2006988_+	PRK06870, secG, preprotein translocase subunit SecG; Reviewed	NA|585aa|up_0|NC_015656.1_2007008_2008763_+	cd08506, PBP2_clavulanate_OppA2, The substrate-binding domain of an oligopeptide binding protein (OppA2) from the biosynthesis pathway of the beta-lactamase inhibitor clavulanic acid contains the type 2 periplasmic binding fold	NA|112aa|down_0|NC_015656.1_2009216_2009552_+	pfam13397, RbpA, RNA polymerase-binding protein	NA|283aa|down_1|NC_015656.1_2009784_2010633_-	pfam01182, Glucosamine_iso, Glucosamine-6-phosphate isomerases/6-phosphogluconolactonase	NA|384aa|down_2|NC_015656.1_2010632_2011784_-	pfam10128, OpcA_G6PD_assem, Glucose-6-phosphate dehydrogenase subunit	NA|511aa|down_3|NC_015656.1_2011780_2013313_-	PRK05722, PRK05722, glucose-6-phosphate 1-dehydrogenase; Validated	NA|377aa|down_4|NC_015656.1_2013452_2014583_-	PRK03343, PRK03343, transaldolase; Validated	NA|329aa|down_5|NC_015656.1_2015317_2016304_+	PRK04375, PRK04375, protoheme IX farnesyltransferase; Provisional	NA|351aa|down_6|NC_015656.1_2016617_2017670_-	COG1612, CtaA, Uncharacterized protein required for cytochrome oxidase assembly [Posttranslational modification, protein turnover, chaperones]	NA|283aa|down_7|NC_015656.1_2017737_2018586_-	TIGR00025, Mtu_efflux, ABC transporter efflux protein, DrrB family	NA|314aa|down_8|NC_015656.1_2018582_2019524_-	COG1131, CcmA, ABC-type multidrug transport system, ATPase component [Defense mechanisms]	NA|227aa|down_9|NC_015656.1_2019755_2020436_-	cd16837, BldD_C_like, C-terminal domain of BldD and similar transcription factors
GCF_000177615.2_ASM17761v2	NC_015656	Candidatus Frankia datiscae, complete sequence	7	2068379-2068476	7	CRISPRCasFinder	no		RT,csa3,cas3,cas6,cas4,DEDDh,WYL,cas3HD,c2c9_V-U4	Orphan	CCTCGTCCGGTCGAGCCGCCGCG	23	0	0	NA	NA	NA	1	1	Orphan	RT,csa3,cas3,cas6,cas4,DEDDh,WYL,cas3HD,c2c9_V-U4	NA|501aa|up_5|NC_015656.1_2058142_2059645_-,NA|73aa|up_4|NC_015656.1_2060317_2060536_+,NA|154aa|down_0|NC_015656.1_2070109_2070571_+,NA|232aa|down_1|NC_015656.1_2071562_2072258_+,NA|108aa|down_5|NC_015656.1_2077364_2077688_+	NA|315aa|up_9|NC_015656.1_2052227_2053172_-	pfam04168, Alpha-E, A predicted alpha-helical domain with a conserved ER motif	NA|571aa|up_8|NC_015656.1_2053165_2054878_-	COG2308, COG2308, Uncharacterized conserved protein [Function unknown]	NA|538aa|up_7|NC_015656.1_2055189_2056803_+	COG0488, Uup, ATPase components of ABC transporters with duplicated ATPase domains [General function prediction only]	NA|325aa|up_6|NC_015656.1_2057074_2058049_+	COG0053, MMT1, Predicted Co/Zn/Cd cation transporters [Inorganic ion transport and metabolism]	NA|501aa|up_5|NC_015656.1_2058142_2059645_-	NA	NA|73aa|up_4|NC_015656.1_2060317_2060536_+	NA	NA|281aa|up_3|NC_015656.1_2060685_2061528_+	cd06558, crotonase-like, Crotonase/Enoyl-Coenzyme A (CoA) hydratase superfamily	NA|666aa|up_2|NC_015656.1_2061621_2063619_+	COG1132, MdlB, ABC-type multidrug transport system, ATPase and permease components [Defense mechanisms]	NA|363aa|up_1|NC_015656.1_2064104_2065193_-	pfam13358, DDE_3, DDE superfamily endonuclease	NA|472aa|up_0|NC_015656.1_2065346_2066762_+	PRK06247, PRK06247, pyruvate kinase; Provisional	NA|154aa|down_0|NC_015656.1_2070109_2070571_+	NA	NA|232aa|down_1|NC_015656.1_2071562_2072258_+	NA	NA|428aa|down_2|NC_015656.1_2072298_2073582_-	PRK11728, PRK11728, L-2-hydroxyglutarate oxidase	NA|611aa|down_3|NC_015656.1_2073883_2075716_-	COG3387, SGA1, Glucoamylase and related glycosyl hydrolases [Carbohydrate transport and metabolism]	NA|363aa|down_4|NC_015656.1_2075982_2077071_+	pfam13358, DDE_3, DDE superfamily endonuclease	NA|108aa|down_5|NC_015656.1_2077364_2077688_+	NA	NA|178aa|down_6|NC_015656.1_2078016_2078550_-	pfam09656, PGPGW, Putative transmembrane protein (PGPGW)	NA|161aa|down_7|NC_015656.1_2078950_2079433_+	COG0490, COG0490, Putative regulatory, ligand-binding protein related to C-terminal domains of K+ channels [Inorganic ion transport and metabolism]	NA|400aa|down_8|NC_015656.1_2079500_2080700_+	COG0475, KefB, Kef-type K+ transport systems, membrane components [Inorganic ion transport and metabolism]	NA|410aa|down_9|NC_015656.1_2081027_2082257_+	pfam06965, Na_H_antiport_1, Na+/H+ antiporter 1
GCF_000177615.2_ASM17761v2	NC_015656	Candidatus Frankia datiscae, complete sequence	8	2189089-2189171	8	CRISPRCasFinder	no		RT,csa3,cas3,cas6,cas4,DEDDh,WYL,cas3HD,c2c9_V-U4	Orphan	CCGCGTTCCCCGCGGGCGGGCACG	24	0	0	NA	NA	NA	1	1	Orphan	RT,csa3,cas3,cas6,cas4,DEDDh,WYL,cas3HD,c2c9_V-U4	NA,NA|77aa|down_2|NC_015656.1_2193288_2193519_+,NA|407aa|down_7|NC_015656.1_2197607_2198828_-	NA|421aa|up_9|NC_015656.1_2175207_2176470_-	pfam13006, Nterm_IS4, Insertion element 4 transposase N-terminal	NA|170aa|up_8|NC_015656.1_2176629_2177139_+	pfam10042, DUF2278, Uncharacterized conserved protein (DUF2278)	NA|216aa|up_7|NC_015656.1_2178702_2179350_-	pfam02720, DUF222, Domain of unknown function (DUF222)	NA|260aa|up_6|NC_015656.1_2179979_2180759_-	cd05233, SDR_c, classical (c) SDRs	NA|467aa|up_5|NC_015656.1_2181168_2182569_-	PRK05537, PRK05537, bifunctional sulfate adenylyltransferase/adenylylsulfate kinase	NA|323aa|up_4|NC_015656.1_2183111_2184080_+	cd01638, CysQ, CysQ, a 3'-Phosphoadenosine-5'-phosphosulfate (PAPS) 3'-phosphatase, is a bacterial member of the inositol monophosphatase family	NA|305aa|up_3|NC_015656.1_2184381_2185296_+	PRK05253, PRK05253, sulfate adenylyltransferase subunit CysD	NA|415aa|up_2|NC_015656.1_2185468_2186713_+	PRK05506, PRK05506, bifunctional sulfate adenylyltransferase subunit 1/adenylylsulfate kinase protein; Provisional	NA|268aa|up_1|NC_015656.1_2186744_2187548_-	PRK14059, PRK14059, pyrimidine reductase family protein	NA|378aa|up_0|NC_015656.1_2187831_2188965_+	COG1485, COG1485, Predicted ATPase [General function prediction only]	NA|584aa|down_0|NC_015656.1_2189766_2191518_-	TIGR02402, Malto-oligosyltrehalose_trehalohydrolase, malto-oligosyltrehalose trehalohydrolase	NA|265aa|down_1|NC_015656.1_2191803_2192598_+	pfam00300, His_Phos_1, Histidine phosphatase superfamily (branch 1)	NA|77aa|down_2|NC_015656.1_2193288_2193519_+	NA	NA|257aa|down_3|NC_015656.1_2193497_2194268_+	PRK11365, ssuC, aliphatic sulfonate ABC transporter permease SsuC	NA|353aa|down_4|NC_015656.1_2194516_2195575_+	cd01008, PBP2_NrtA_SsuA_CpmA_like, Substrate binding domain of ABC-type nitrate/sulfonate/bicarbonate transporters, a member of the type 2 periplasmic binding fold superfamily	NA|261aa|down_5|NC_015656.1_2195708_2196491_+	COG1116, TauB, ABC-type nitrate/sulfonate/bicarbonate transport system, ATPase component [Inorganic ion transport and metabolism]	NA|328aa|down_6|NC_015656.1_2196578_2197562_-	cd19086, AKR_AKR11C1, AKR11C family of aldo-keto reductase (AKR)	NA|407aa|down_7|NC_015656.1_2197607_2198828_-	NA	NA|446aa|down_8|NC_015656.1_2198824_2200162_-	cd06828, PLPDE_III_DapDC, Type III Pyridoxal 5-phosphate (PLP)-Dependent Enzyme Diaminopimelate Decarboxylase	NA|414aa|down_9|NC_015656.1_2200250_2201492_-	PRK07206, PRK07206, hypothetical protein; Provisional
GCF_000177615.2_ASM17761v2	NC_015656	Candidatus Frankia datiscae, complete sequence	9	3036316-3036397	9	CRISPRCasFinder	no		RT,csa3,cas3,cas6,cas4,DEDDh,WYL,cas3HD,c2c9_V-U4	Orphan	GGTCATCCGCGCGCGGGACACCC	23	0	0	NA	NA	NA	1	1	Orphan	RT,csa3,cas3,cas6,cas4,DEDDh,WYL,cas3HD,c2c9_V-U4	NA,NA	NA|416aa|up_9|NC_015656.1_3022465_3023713_-	PRK13022, secF, protein translocase subunit SecF	NA|620aa|up_8|NC_015656.1_3023716_3025576_-	PRK05812, secD, preprotein translocase subunit SecD; Reviewed	NA|186aa|up_7|NC_015656.1_3025995_3026553_-	pfam02699, YajC, Preprotein translocase subunit	NA|351aa|up_6|NC_015656.1_3026724_3027777_-	PRK00080, ruvB, Holliday junction branch migration DNA helicase RuvB	NA|265aa|up_5|NC_015656.1_3027787_3028582_-	PRK00116, ruvA, Holliday junction branch migration protein RuvA	NA|178aa|up_4|NC_015656.1_3028578_3029112_-	PRK00039, ruvC, Holliday junction resolvase; Reviewed	NA|252aa|up_3|NC_015656.1_3030701_3031457_-	PRK00110, PRK00110, YebC/PmpR family DNA-binding transcriptional regulator	NA|245aa|up_2|NC_015656.1_3031467_3032202_-	PRK13525, PRK13525, pyridoxal 5'-phosphate synthase glutaminase subunit PdxT	NA|330aa|up_1|NC_015656.1_3032195_3033185_-	PRK04180, PRK04180, pyridoxal 5'-phosphate synthase lyase subunit PdxS	NA|378aa|up_0|NC_015656.1_3034198_3035332_-	cd03801, GT4_PimA-like, phosphatidyl-myo-inositol mannosyltransferase	NA|229aa|down_0|NC_015656.1_3036407_3037094_-	COG0558, PgsA, Phosphatidylglycerophosphate synthase [Lipid metabolism]	NA|739aa|down_1|NC_015656.1_3037543_3039760_+	PRK12740, PRK12740, elongation factor G-like protein EF-G2	NA|154aa|down_2|NC_015656.1_3040017_3040479_-	cd01275, FHIT, FHIT (fragile histidine family): FHIT proteins, related to the HIT family carry a motif HxHxH/Qxx (x, is a hydrophobic amino acid), On the basis of sequence, substrate specificity, structure, evolution and mechanism, HIT proteins are classified into three  branches: the Hint branch, which consists of adenosine 5' -monophosphoramide hydrolases, the Fhit branch, that consists of diadenosine polyphosphate hydrolases, and the GalT branch consisting of specific nucloside monophosphate transferases	NA|547aa|down_3|NC_015656.1_3041366_3043007_+	pfam00067, p450, Cytochrome P450	NA|144aa|down_4|NC_015656.1_3044060_3044492_+	pfam04686, SsgA, Streptomyces sporulation and cell division protein, SsgA	NA|349aa|down_5|NC_015656.1_3044837_3045884_-	TIGR01824, Anthranilate_synthase_component_I-like_protein, aminodeoxychorismate synthase, component I, clade 2	NA|207aa|down_6|NC_015656.1_3047690_3048311_-	COG1309, AcrR, Transcriptional regulator [Transcription]	NA|287aa|down_7|NC_015656.1_3048720_3049581_-	cd02696, MurNAc-LAA, N-acetylmuramoyl-L-alanine amidase or MurNAc-LAA (also known as peptidoglycan aminohydrolase, NAMLA amidase, NAMLAA, Amidase 3, and peptidoglycan amidase; EC 3	NA|300aa|down_8|NC_015656.1_3049631_3050531_-	pfam11296, DUF3097, Protein of unknown function (DUF3097)	NA|201aa|down_9|NC_015656.1_3050816_3051419_+	cd01011, nicotinamidase, Nicotinamidase/pyrazinamidase (PZase)
GCF_000177615.2_ASM17761v2	NC_015656	Candidatus Frankia datiscae, complete sequence	10	3841895-3842167	1,10,1	PILER-CR,CRISPRCasFinder,CRT	no		RT,csa3,cas3,cas6,cas4,DEDDh,WYL,cas3HD,c2c9_V-U4	Orphan	CGGCAGTCCCGCGCGCACGCGCGGCATGCAGCACG,TCCCGCGCGCACGCGCGGCATGC,CCCGCGCGCANGCGCGGCATGC	35,23,22	0	0	NA	NA	NA:NA:NA	2,4,4	4	Orphan	RT,csa3,cas3,cas6,cas4,DEDDh,WYL,cas3HD,c2c9_V-U4	NA|550aa|up_7|NC_015656.1_3829181_3830831_-,NA|61aa|up_6|NC_015656.1_3830923_3831106_-,NA|191aa|up_1|NC_015656.1_3837916_3838489_-,NA|73aa|down_2|NC_015656.1_3844663_3844882_+,NA|46aa|down_4|NC_015656.1_3846021_3846159_-,NA|72aa|down_9|NC_015656.1_3852368_3852584_-	NA|354aa|up_9|NC_015656.1_3824804_3825866_+	PLN02284, PLN02284, glutamine synthetase	NA|1027aa|up_8|NC_015656.1_3825940_3829021_-	PRK14109, PRK14109, bifunctional [glutamine synthetase] adenylyltransferase/[glutamine synthetase]-adenylyl-L-tyrosine phosphorylase	NA|550aa|up_7|NC_015656.1_3829181_3830831_-	NA	NA|61aa|up_6|NC_015656.1_3830923_3831106_-	NA	NA|453aa|up_5|NC_015656.1_3831669_3833028_-	COG0174, GlnA, Glutamine synthetase [Amino acid transport and metabolism]	NA|617aa|up_4|NC_015656.1_3833293_3835144_+	PRK13981, PRK13981, NAD synthetase; Provisional	NA|241aa|up_3|NC_015656.1_3835361_3836084_-	pfam16859, TetR_C_11, Bacterial transcriptional repressor C-terminal	NA|313aa|up_2|NC_015656.1_3836613_3837552_+	PRK00311, panB, 3-methyl-2-oxobutanoate hydroxymethyltransferase; Reviewed	NA|191aa|up_1|NC_015656.1_3837916_3838489_-	NA	NA|393aa|up_0|NC_015656.1_3840427_3841605_+	PHA02517, PHA02517, putative transposase OrfB; Reviewed	NA|267aa|down_0|NC_015656.1_3842465_3843266_-	cd01144, BtuF, Cobalamin binding protein BtuF	NA|228aa|down_1|NC_015656.1_3843393_3844077_-	TIGR01915, F420-dependent_NADP_reductase, NADPH-dependent F420 reductase	NA|73aa|down_2|NC_015656.1_3844663_3844882_+	NA	NA|335aa|down_3|NC_015656.1_3845014_3846019_-	PRK00283, xerD, tyrosine recombinase	NA|46aa|down_4|NC_015656.1_3846021_3846159_-	NA	NA|453aa|down_5|NC_015656.1_3846688_3848047_+	cd00146, PKD, polycystic kidney disease I (PKD) domain; similar to other cell-surface modules, with an IG-like fold; domain probably functions as a ligand binding site in protein-protein or protein-carbohydrate interactions; a single instance of the repeat is presented here	NA|292aa|down_6|NC_015656.1_3848135_3849011_-	cd03424, ADPRase_NUDT5, ADP-ribose pyrophosphatase (ADPRase) catalyzes the hydrolysis of ADP-ribose and a variety of additional ADP-sugar conjugates to AMP and ribose-5-phosphate	NA|585aa|down_7|NC_015656.1_3848992_3850747_-	PRK05380, pyrG, CTP synthetase; Validated	NA|206aa|down_8|NC_015656.1_3851230_3851848_-	PRK05365, PRK05365, malonic semialdehyde reductase; Provisional	NA|72aa|down_9|NC_015656.1_3852368_3852584_-	NA
GCF_000177615.2_ASM17761v2	NC_015656	Candidatus Frankia datiscae, complete sequence	11	4510511-4510601	11	CRISPRCasFinder	no		RT,csa3,cas3,cas6,cas4,DEDDh,WYL,cas3HD,c2c9_V-U4	Orphan	CAGTTTCACTTGTGCCCTGAAGTCACCGGCC	31	0	0	NA	NA	NA	1	1	Orphan	RT,csa3,cas3,cas6,cas4,DEDDh,WYL,cas3HD,c2c9_V-U4	NA|167aa|up_9|NC_015656.1_4500631_4501132_-,NA|206aa|up_6|NC_015656.1_4504337_4504955_-,NA|79aa|up_4|NC_015656.1_4506094_4506331_+,NA|57aa|down_6|NC_015656.1_4516531_4516702_-,NA|147aa|down_7|NC_015656.1_4516797_4517238_-	NA|167aa|up_9|NC_015656.1_4500631_4501132_-	NA	NA|227aa|up_8|NC_015656.1_4501409_4502090_-	TIGR00691, PppGpp_synthetase	NA|679aa|up_7|NC_015656.1_4502242_4504279_-	cd03681, MM_CoA_mutase_MeaA, Coenzyme B12-dependent-methylmalonyl coenzyme A (CoA) mutase (MCM) family, MeaA-like subfamily; contains various methylmalonyl coenzyme A (CoA) mutase (MCM)-like proteins similar to the Streptomyces cinnamonensis MeaA, Methylobacterium extorquens MeaA and Streptomyces collinus B12-dependent mutase	NA|206aa|up_6|NC_015656.1_4504337_4504955_-	NA	NA|115aa|up_5|NC_015656.1_4505103_4505448_-	cd07043, STAS_anti-anti-sigma_factors, Sulphate Transporter and Anti-Sigma factor antagonist) domain of anti-anti-sigma factors, key regulators of anti-sigma factors by phosphorylation	NA|79aa|up_4|NC_015656.1_4506094_4506331_+	NA	NA|489aa|up_3|NC_015656.1_4506435_4507902_-	PRK09369, PRK09369, UDP-N-acetylglucosamine 1-carboxyvinyltransferase; Validated	NA|191aa|up_2|NC_015656.1_4508082_4508655_+	pfam01923, Cob_adeno_trans, Cobalamin adenosyltransferase	NA|90aa|up_1|NC_015656.1_4508748_4509018_-	PRK13442, atpC, F0F1 ATP synthase subunit epsilon; Provisional	NA|481aa|up_0|NC_015656.1_4509059_4510502_-	PRK09280, PRK09280, F0F1 ATP synthase subunit beta; Validated	NA|299aa|down_0|NC_015656.1_4510744_4511641_-	PRK05621, PRK05621, F0F1 ATP synthase subunit gamma; Validated	NA|551aa|down_1|NC_015656.1_4511644_4513297_-	PRK09281, PRK09281, F0F1 ATP synthase subunit alpha; Validated	NA|281aa|down_2|NC_015656.1_4513374_4514217_-	PRK13430, PRK13430, F0F1 ATP synthase subunit delta; Provisional	NA|197aa|down_3|NC_015656.1_4514220_4514811_-	PRK05759, PRK05759, F0F1 ATP synthase subunit B; Validated	NA|82aa|down_4|NC_015656.1_4514826_4515072_-	PRK07874, PRK07874, ATP synthase F0 subunit C	NA|294aa|down_5|NC_015656.1_4515136_4516018_-	PRK05815, PRK05815, F0F1 ATP synthase subunit A; Validated	NA|57aa|down_6|NC_015656.1_4516531_4516702_-	NA	NA|147aa|down_7|NC_015656.1_4516797_4517238_-	NA	NA|370aa|down_8|NC_015656.1_4517234_4518344_-	cd06853, GT_WecA_like, This subfamily contains Escherichia coli WecA, Bacillus subtilis TagO and related proteins	NA|419aa|down_9|NC_015656.1_4518452_4519709_-	PRK00011, glyA, serine hydroxymethyltransferase; Reviewed
