assembly_id	genome_id	genome_def	crispr_array_locus_merge	crispr_array_location_merge	crispr_locus_id	crispr_pred_method	array_in_prot	prot_within_array_20000	prot_in_genome	crispr_type_by_cas_prot	consensus_repeat	repeat_length	self-targeting_spacer_number	self-targeting_target_number	spacer_location	protospacer_location	repeat_type	spacer_locus_num	spacer_num	correct_crispr_type	genome_cas_prots	unknown_protein_around_crispr	L10	L10_domain	L9	L9_domain	L8	L8_domain	L7	L7_domain	L6	L6_domain	L5	L5_domain	L4	L4_domain	L3	L3_domain	L2	L2_domain	L1	L1_domain	R1	R1_domain	R2	R2_domain	R3	R3_domain	R4	R4_domain	R5	R5_domain	R6	R6_domain	R7	R7_domain	R8	R8_domain	R9	R9_domain	R10	R10_domain
GCF_009873495.1_ASM987349v1	NZ_CP046703	Nostoc sp. ATCC 53789 chromosome, complete genome	1	724575-724677	1	CRISPRCasFinder	no		Cas9_archaeal,cas14k,cas14j,csa3,RT,DEDDh,PD-DExK,cas3,DinG,Cas14c_CAS-V-F,WYL,cas10d,csc2gr7,csc1gr5,2OG_CAS,cas6,cas4,cas1,cas2,Cas14u_CAS-V	Orphan	TAAAGCCCCTAAATTTATTTATG	23	0	0	NA	NA	NA	1	1	Orphan	Cas9_archaeal,cas14k,cas14j,csa3,RT,DEDDh,PD-DExK,cas3,DinG,Cas14c_CAS-V-F,WYL,cas10d,csc2gr7,csc1gr5,2OG_CAS,cas6,cas4,cas1,cas2,Cas14u_CAS-V,csx3,csx1,cas10,csm3gr7,csx10gr5,csx19,c2c9_V-U4,cas8a4,cas7,cas5	NA|70aa|up_7|NZ_CP046703.1_714673_714883_-,NA|63aa|up_1|NZ_CP046703.1_721067_721256_-,NA|132aa|down_1|NZ_CP046703.1_728785_729181_-,NA|151aa|down_3|NZ_CP046703.1_730611_731064_-,NA|93aa|down_9|NZ_CP046703.1_741348_741627_+	NA|469aa|up_9|NZ_CP046703.1_711506_712913_-	cd16943, HATPase_AtoS-like, Histidine kinase-like ATPase domain of two-component sensor histidine kinases similar to Escherichia coli K-12 AtoS	NA|555aa|up_8|NZ_CP046703.1_712944_714609_-	cd17595, REC_TrxB, phosphoacceptor receiver (REC) domain a fused response regulator with a thioredoxin reductase output domain	NA|70aa|up_7|NZ_CP046703.1_714673_714883_-	NA	NA|131aa|up_6|NZ_CP046703.1_715551_715944_-	cd17552, REC_RR468-like, phosphoacceptor receiver (REC) domain of Thermotoga maritima response regulator RR468 and similar domains	NA|83aa|up_5|NZ_CP046703.1_716085_716334_-	cd17552, REC_RR468-like, phosphoacceptor receiver (REC) domain of Thermotoga maritima response regulator RR468 and similar domains	NA|173aa|up_4|NZ_CP046703.1_716529_717048_+	cd14768, PC_PEC_beta, Beta subunits of phycoerythrin and phycoerythrocyanin; phycobilisome rod components	NA|162aa|up_3|NZ_CP046703.1_717192_717678_+	cd14770, PC-PEC_alpha, Alpha subunits of phycoerythrin and phycoerythrocyanin; phycobilisome rod components	NA|423aa|up_2|NZ_CP046703.1_717840_719109_-	TIGR02176, pyruvate_flavodoxin/ferrodoxin_oxidoreductase, pyruvate:ferredoxin (flavodoxin) oxidoreductase, homodimeric	NA|63aa|up_1|NZ_CP046703.1_721067_721256_-	NA	NA|126aa|up_0|NZ_CP046703.1_721411_721789_-	cd17552, REC_RR468-like, phosphoacceptor receiver (REC) domain of Thermotoga maritima response regulator RR468 and similar domains	NA|588aa|down_0|NZ_CP046703.1_726609_728373_-	cd00200, WD40, WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment	NA|132aa|down_1|NZ_CP046703.1_728785_729181_-	NA	NA|287aa|down_2|NZ_CP046703.1_729520_730381_-	COG2897, SseA, Rhodanese-related sulfurtransferase [Inorganic ion transport and metabolism]	NA|151aa|down_3|NZ_CP046703.1_730611_731064_-	NA	NA|159aa|down_4|NZ_CP046703.1_731766_732243_+	COG4446, COG4446, Uncharacterized protein conserved in bacteria [Function unknown]	NA|448aa|down_5|NZ_CP046703.1_732402_733746_-	cd19920, REC_PA4781-like, phosphoacceptor receiver (REC) domain of cyclic di-GMP phosphodiesterase PA4781 and similar domains	NA|951aa|down_6|NZ_CP046703.1_733742_736595_-	COG4252, COG4252, Predicted transmembrane sensor domain [Signal transduction mechanisms]	NA|351aa|down_7|NZ_CP046703.1_736818_737871_-	pfam06051, DUF928, Domain of Unknown Function (DUF928)	NA|851aa|down_8|NZ_CP046703.1_738136_740689_-	COG4995, COG4995, Uncharacterized protein conserved in bacteria [Function unknown]	NA|93aa|down_9|NZ_CP046703.1_741348_741627_+	NA
GCF_009873495.1_ASM987349v1	NZ_CP046703	Nostoc sp. ATCC 53789 chromosome, complete genome	2	866872-866971	2	CRISPRCasFinder	no	RT	Cas9_archaeal,cas14k,cas14j,csa3,RT,DEDDh,PD-DExK,cas3,DinG,Cas14c_CAS-V-F,WYL,cas10d,csc2gr7,csc1gr5,2OG_CAS,cas6,cas4,cas1,cas2,Cas14u_CAS-V	Unclear	TCGATTATGAGTCGATTGCTCTACCA	26	0	0	NA	NA	NA	1	1	Orphan	Cas9_archaeal,cas14k,cas14j,csa3,RT,DEDDh,PD-DExK,cas3,DinG,Cas14c_CAS-V-F,WYL,cas10d,csc2gr7,csc1gr5,2OG_CAS,cas6,cas4,cas1,cas2,Cas14u_CAS-V,csx3,csx1,cas10,csm3gr7,csx10gr5,csx19,c2c9_V-U4,cas8a4,cas7,cas5	NA|78aa|up_7|NZ_CP046703.1_855061_855295_-,NA|193aa|down_0|NZ_CP046703.1_867431_868010_-,NA|58aa|down_3|NZ_CP046703.1_870579_870753_+,NA|141aa|down_4|NZ_CP046703.1_871333_871756_+	NA|37aa|up_9|NZ_CP046703.1_854720_854831_-	cd18745, PIN_VapC4-5_FitB-like, uncharacterized subgroup of the PIN_VapC4-5_FitB-like subfamily of the PIN domain superfamily	NA|39aa|up_8|NZ_CP046703.1_854922_855039_-	cd18745, PIN_VapC4-5_FitB-like, uncharacterized subgroup of the PIN_VapC4-5_FitB-like subfamily of the PIN domain superfamily	NA|78aa|up_7|NZ_CP046703.1_855061_855295_-	NA	NA|96aa|up_6|NZ_CP046703.1_855294_855582_-	pfam09907, HigB_toxin, HigB_toxin, RelE-like toxic component of a toxin-antitoxin system	NA|31aa|up_5|NZ_CP046703.1_855812_855905_-	cd18745, PIN_VapC4-5_FitB-like, uncharacterized subgroup of the PIN_VapC4-5_FitB-like subfamily of the PIN domain superfamily	NA|86aa|up_4|NZ_CP046703.1_856217_856475_+	TIGR04149, predicted_protein, natural product precursor, GG-Bacteroidales family	NA|308aa|up_3|NZ_CP046703.1_856784_857708_-	TIGR03707, PPK2_P_aer, polyphosphate kinase 2, PA0141 family	NA|1642aa|up_2|NZ_CP046703.1_857895_862821_-	pfam18741, MTES_1575, REase_MTES_1575	NA|512aa|up_1|NZ_CP046703.1_862892_864428_-	cd13438, SPFH_eoslipins_u2, Uncharacterized prokaryotic subgroup of the stomatin-like proteins (slipins) family; belonging to the SPFH (stomatin, prohibitin, flotillin, and HflK/C) superfamily	RT|365aa|up_0|NZ_CP046703.1_865344_866439_-	cd03487, RT_Bac_retron_II, RT_Bac_retron_II: Reverse transcriptases (RTs) in bacterial retrotransposons or retrons	NA|193aa|down_0|NZ_CP046703.1_867431_868010_-	NA	NA|222aa|down_1|NZ_CP046703.1_868176_868842_+	pfam05857, TraX, TraX protein	NA|319aa|down_2|NZ_CP046703.1_869454_870411_+	pfam09150, Carot_N, Orange carotenoid protein, N-terminal	NA|58aa|down_3|NZ_CP046703.1_870579_870753_+	NA	NA|141aa|down_4|NZ_CP046703.1_871333_871756_+	NA	NA|216aa|down_5|NZ_CP046703.1_872722_873370_+	COG2226, UbiE, Methylase involved in ubiquinone/menaquinone biosynthesis [Coenzyme metabolism]	NA|266aa|down_6|NZ_CP046703.1_873511_874309_-	COG0412, COG0412, Dienelactone hydrolase and related enzymes [Secondary metabolites biosynthesis, transport, and catabolism]	NA|180aa|down_7|NZ_CP046703.1_874604_875144_+	COG5502, COG5502, Uncharacterized conserved protein [Function unknown]	NA|155aa|down_8|NZ_CP046703.1_875378_875843_-	pfam09346, SMI1_KNR4, SMI1 / KNR4 family (SUKH-1)	NA|95aa|down_9|NZ_CP046703.1_876231_876516_-	COG2442, COG2442, Uncharacterized conserved protein [Function unknown]
GCF_009873495.1_ASM987349v1	NZ_CP046703	Nostoc sp. ATCC 53789 chromosome, complete genome	3	1116686-1116783	3	CRISPRCasFinder	no		Cas9_archaeal,cas14k,cas14j,csa3,RT,DEDDh,PD-DExK,cas3,DinG,Cas14c_CAS-V-F,WYL,cas10d,csc2gr7,csc1gr5,2OG_CAS,cas6,cas4,cas1,cas2,Cas14u_CAS-V	Orphan	AGAGCTACGGTGTACACACAAGTC	24	0	0	NA	NA	NA	1	1	Orphan	Cas9_archaeal,cas14k,cas14j,csa3,RT,DEDDh,PD-DExK,cas3,DinG,Cas14c_CAS-V-F,WYL,cas10d,csc2gr7,csc1gr5,2OG_CAS,cas6,cas4,cas1,cas2,Cas14u_CAS-V,csx3,csx1,cas10,csm3gr7,csx10gr5,csx19,c2c9_V-U4,cas8a4,cas7,cas5	NA|456aa|up_7|NZ_CP046703.1_1105381_1106749_-,NA|126aa|up_6|NZ_CP046703.1_1107197_1107575_-,NA|88aa|up_2|NZ_CP046703.1_1112568_1112832_-,NA|146aa|up_1|NZ_CP046703.1_1113008_1113446_+,NA|71aa|down_7|NZ_CP046703.1_1130101_1130314_-	NA|117aa|up_9|NZ_CP046703.1_1096489_1096840_-	PRK12275, PRK12275, hypothetical protein; Reviewed	NA|438aa|up_8|NZ_CP046703.1_1097756_1099070_-	pfam13546, DDE_5, DDE superfamily endonuclease	NA|456aa|up_7|NZ_CP046703.1_1105381_1106749_-	NA	NA|126aa|up_6|NZ_CP046703.1_1107197_1107575_-	NA	NA|374aa|up_5|NZ_CP046703.1_1107578_1108700_-	pfam12770, CHAT, CHAT domain	NA|303aa|up_4|NZ_CP046703.1_1108677_1109586_+	pfam13359, DDE_Tnp_4, DDE superfamily endonuclease	NA|771aa|up_3|NZ_CP046703.1_1109740_1112053_-	sd00006, TPR, Tetratricopeptide repeat	NA|88aa|up_2|NZ_CP046703.1_1112568_1112832_-	NA	NA|146aa|up_1|NZ_CP046703.1_1113008_1113446_+	NA	NA|959aa|up_0|NZ_CP046703.1_1113738_1116615_+	PRK05743, ileS, isoleucyl-tRNA synthetase; Reviewed	NA|443aa|down_0|NZ_CP046703.1_1116921_1118250_+	pfam02281, Dimer_Tnp_Tn5, Transposase Tn5 dimerization domain	NA|385aa|down_1|NZ_CP046703.1_1118617_1119772_+	PRK02769, PRK02769, histidine decarboxylase; Provisional	NA|1128aa|down_2|NZ_CP046703.1_1119846_1123230_-	PRK10060, PRK10060, cyclic di-GMP phosphodiesterase	NA|295aa|down_3|NZ_CP046703.1_1123464_1124349_-	COG1192, Soj, ATPases involved in chromosome partitioning [Cell division and chromosome partitioning]	NA|391aa|down_4|NZ_CP046703.1_1124594_1125767_+	PRK00770, PRK00770, deoxyhypusine synthase	NA|784aa|down_5|NZ_CP046703.1_1125880_1128232_-	pfam00211, Guanylate_cyc, Adenylate and Guanylate cyclase catalytic domain	NA|491aa|down_6|NZ_CP046703.1_1128390_1129863_-	pfam03050, DDE_Tnp_IS66, Transposase IS66 family	NA|71aa|down_7|NZ_CP046703.1_1130101_1130314_-	NA	NA|388aa|down_8|NZ_CP046703.1_1130535_1131699_+	COG3693, XynA, Beta-1,4-xylanase [Carbohydrate transport and metabolism]	NA|275aa|down_9|NZ_CP046703.1_1131886_1132711_-	COG1922, WecG, Teichoic acid biosynthesis proteins [Cell envelope biogenesis, outer membrane]
GCF_009873495.1_ASM987349v1	NZ_CP046703	Nostoc sp. ATCC 53789 chromosome, complete genome	4	1715210-1715307	4	CRISPRCasFinder	no		Cas9_archaeal,cas14k,cas14j,csa3,RT,DEDDh,PD-DExK,cas3,DinG,Cas14c_CAS-V-F,WYL,cas10d,csc2gr7,csc1gr5,2OG_CAS,cas6,cas4,cas1,cas2,Cas14u_CAS-V	Orphan	GACTTGTGTGTACACCGTAGCTTT	24	0	0	NA	NA	NA	1	1	Orphan	Cas9_archaeal,cas14k,cas14j,csa3,RT,DEDDh,PD-DExK,cas3,DinG,Cas14c_CAS-V-F,WYL,cas10d,csc2gr7,csc1gr5,2OG_CAS,cas6,cas4,cas1,cas2,Cas14u_CAS-V,csx3,csx1,cas10,csm3gr7,csx10gr5,csx19,c2c9_V-U4,cas8a4,cas7,cas5	NA|438aa|up_5|NZ_CP046703.1_1705655_1706969_+,NA|54aa|down_0|NZ_CP046703.1_1715409_1715571_-,NA|131aa|down_3|NZ_CP046703.1_1717645_1718038_-,NA|48aa|down_6|NZ_CP046703.1_1721156_1721300_+	NA|402aa|up_9|NZ_CP046703.1_1698896_1700102_-	PRK10535, PRK10535, macrolide ABC transporter ATP-binding protein/permease MacB	NA|439aa|up_8|NZ_CP046703.1_1700200_1701517_-	TIGR01730, COG0845:_Membrane-fusion_protein, RND family efflux transporter, MFP subunit	NA|291aa|up_7|NZ_CP046703.1_1702251_1703124_-	pfam02557, VanY, D-alanyl-D-alanine carboxypeptidase	NA|641aa|up_6|NZ_CP046703.1_1703502_1705425_+	PRK05192, PRK05192, tRNA uridine-5-carboxymethylaminomethyl(34) synthesis enzyme MnmG	NA|438aa|up_5|NZ_CP046703.1_1705655_1706969_+	NA	NA|522aa|up_4|NZ_CP046703.1_1706996_1708562_-	pfam01832, Glucosaminidase, Mannosyl-glycoprotein endo-beta-N-acetylglucosaminidase	NA|254aa|up_3|NZ_CP046703.1_1708651_1709413_-	cd02696, MurNAc-LAA, N-acetylmuramoyl-L-alanine amidase or MurNAc-LAA (also known as peptidoglycan aminohydrolase, NAMLA amidase, NAMLAA, Amidase 3, and peptidoglycan amidase; EC 3	NA|316aa|up_2|NZ_CP046703.1_1709945_1710893_-	PRK07399, PRK07399, DNA polymerase III subunit delta'; Validated	NA|522aa|up_1|NZ_CP046703.1_1711885_1713451_+	COG1233, COG1233, Phytoene dehydrogenase and related proteins [Secondary metabolites biosynthesis, transport, and catabolism]	NA|443aa|up_0|NZ_CP046703.1_1713742_1715071_-	pfam02281, Dimer_Tnp_Tn5, Transposase Tn5 dimerization domain	NA|54aa|down_0|NZ_CP046703.1_1715409_1715571_-	NA	NA|202aa|down_1|NZ_CP046703.1_1715724_1716330_+	COG3019, COG3019, Predicted metal-binding protein [General function prediction only]	NA|316aa|down_2|NZ_CP046703.1_1716448_1717396_-	sd00006, TPR, Tetratricopeptide repeat	NA|131aa|down_3|NZ_CP046703.1_1717645_1718038_-	NA	NA|304aa|down_4|NZ_CP046703.1_1718663_1719575_+	cd01846, fatty_acyltransferase_like, Fatty acyltransferase-like subfamily of the SGNH hydrolases, a diverse family of lipases and esterases	NA|435aa|down_5|NZ_CP046703.1_1719759_1721064_+	TIGR01326, Includes:_O-acetylhomoserine_sulfhydrylase, OAH/OAS sulfhydrylase	NA|48aa|down_6|NZ_CP046703.1_1721156_1721300_+	NA	NA|357aa|down_7|NZ_CP046703.1_1721371_1722442_+	TIGR01392, Homoserine_O-acetyltransferase, homoserine O-acetyltransferase	NA|444aa|down_8|NZ_CP046703.1_1722597_1723929_-	COG2907, COG2907, Predicted NAD/FAD-binding protein [General function prediction only]	NA|521aa|down_9|NZ_CP046703.1_1724094_1725657_-	COG1233, COG1233, Phytoene dehydrogenase and related proteins [Secondary metabolites biosynthesis, transport, and catabolism]
GCF_009873495.1_ASM987349v1	NZ_CP046703	Nostoc sp. ATCC 53789 chromosome, complete genome	5	2316082-2316190	5	CRISPRCasFinder	no	csa3	Cas9_archaeal,cas14k,cas14j,csa3,RT,DEDDh,PD-DExK,cas3,DinG,Cas14c_CAS-V-F,WYL,cas10d,csc2gr7,csc1gr5,2OG_CAS,cas6,cas4,cas1,cas2,Cas14u_CAS-V	Type I-A	GGTTAGTACCGGAAGCAGCAGGTAGAGCCATTTCGCT	37	0	0	NA	NA	NA	1	1	Orphan	Cas9_archaeal,cas14k,cas14j,csa3,RT,DEDDh,PD-DExK,cas3,DinG,Cas14c_CAS-V-F,WYL,cas10d,csc2gr7,csc1gr5,2OG_CAS,cas6,cas4,cas1,cas2,Cas14u_CAS-V,csx3,csx1,cas10,csm3gr7,csx10gr5,csx19,c2c9_V-U4,cas8a4,cas7,cas5	NA|51aa|up_4|NZ_CP046703.1_2311895_2312048_-,NA|78aa|down_3|NZ_CP046703.1_2318708_2318942_-,NA|215aa|down_6|NZ_CP046703.1_2322073_2322718_+,NA|237aa|down_8|NZ_CP046703.1_2323563_2324274_+,NA|147aa|down_9|NZ_CP046703.1_2324341_2324782_+	NA|108aa|up_9|NZ_CP046703.1_2306638_2306962_+	TIGR01068, Thioredoxin-like_protein_slr0233, thioredoxin	NA|265aa|up_8|NZ_CP046703.1_2307126_2307921_+	cd05344, BKR_like_SDR_like, putative beta-ketoacyl acyl carrier protein [ACP] reductase (BKR)-like, SDR	NA|371aa|up_7|NZ_CP046703.1_2308182_2309295_+	cd02933, OYE_like_FMN, Old yellow enzyme (OYE)-like FMN binding domain	NA|222aa|up_6|NZ_CP046703.1_2309641_2310307_+	COG0778, NfnB, Nitroreductase [Energy production and conversion]	NA|433aa|up_5|NZ_CP046703.1_2310589_2311888_+	cd17355, MFS_YcxA_like, MFS-type transporter YcxA and similar proteins of the Major Facilitator Superfamily of transporters	NA|51aa|up_4|NZ_CP046703.1_2311895_2312048_-	NA	NA|230aa|up_3|NZ_CP046703.1_2312214_2312904_+	cd05373, SDR_c10, classical (c) SDR, subgroup  10	NA|256aa|up_2|NZ_CP046703.1_2312938_2313706_+	pfam13649, Methyltransf_25, Methyltransferase domain	NA|322aa|up_1|NZ_CP046703.1_2313796_2314762_-	pfam13578, Methyltransf_24, Methyltransferase domain	NA|233aa|up_0|NZ_CP046703.1_2315096_2315795_+	PRK13972, PRK13972, GSH-dependent disulfide bond oxidoreductase; Provisional	NA|286aa|down_0|NZ_CP046703.1_2316415_2317273_+	cd07987, LPLAT_MGAT-like, Lysophospholipid Acyltransferases (LPLATs) of Glycerophospholipid Biosynthesis: MGAT-like	NA|87aa|down_1|NZ_CP046703.1_2317483_2317744_-	cd00754, Ubl_MoaD, ubiquitin-like (Ubl) domain found in molybdenum cofactor biosynthesis protein D (MoaD) and similar proteins	NA|253aa|down_2|NZ_CP046703.1_2317863_2318622_-	pfam05685, Uma2, Putative restriction endonuclease	NA|78aa|down_3|NZ_CP046703.1_2318708_2318942_-	NA	NA|227aa|down_4|NZ_CP046703.1_2319072_2319753_-	sd00006, TPR, Tetratricopeptide repeat	NA|657aa|down_5|NZ_CP046703.1_2319870_2321841_-	PRK00174, PRK00174, acetyl-CoA synthetase; Provisional	NA|215aa|down_6|NZ_CP046703.1_2322073_2322718_+	NA	NA|250aa|down_7|NZ_CP046703.1_2322740_2323490_+	cd06260, DUF820, Domain of unknown function (DUF820)	NA|237aa|down_8|NZ_CP046703.1_2323563_2324274_+	NA	NA|147aa|down_9|NZ_CP046703.1_2324341_2324782_+	NA
GCF_009873495.1_ASM987349v1	NZ_CP046703	Nostoc sp. ATCC 53789 chromosome, complete genome	6	2342811-2343645	1,6,1	PILER-CR,CRISPRCasFinder,CRT	no		Cas9_archaeal,cas14k,cas14j,csa3,RT,DEDDh,PD-DExK,cas3,DinG,Cas14c_CAS-V-F,WYL,cas10d,csc2gr7,csc1gr5,2OG_CAS,cas6,cas4,cas1,cas2,Cas14u_CAS-V	Orphan	GTTTCAATCCCTAATAGGGATTTTGATGAATTGCAAT,GTTTCAATCCCTAATAGGGATTTTGATGAATTGCAAT,GTTTCAATCCCTAATAGGGATTTTGATGAATTGCAAT	37,37,37	0	0	NA	NA	I-D,II-B:I-D,II-B:I-D,II-B	11,11,11	11	Orphan	Cas9_archaeal,cas14k,cas14j,csa3,RT,DEDDh,PD-DExK,cas3,DinG,Cas14c_CAS-V-F,WYL,cas10d,csc2gr7,csc1gr5,2OG_CAS,cas6,cas4,cas1,cas2,Cas14u_CAS-V,csx3,csx1,cas10,csm3gr7,csx10gr5,csx19,c2c9_V-U4,cas8a4,cas7,cas5	NA|48aa|up_9|NZ_CP046703.1_2335690_2335834_+,NA	NA|48aa|up_9|NZ_CP046703.1_2335690_2335834_+	NA	NA|206aa|up_8|NZ_CP046703.1_2335882_2336500_-	COG1182, AcpD, Acyl carrier protein phosphodiesterase [Lipid metabolism]	NA|295aa|up_7|NZ_CP046703.1_2337141_2338026_-	pfam11353, DUF3153, Protein of unknown function (DUF3153)	NA|181aa|up_6|NZ_CP046703.1_2338052_2338595_-	sd00006, TPR, Tetratricopeptide repeat	NA|299aa|up_5|NZ_CP046703.1_2338872_2339769_-	cd04250, AAK_NAGK-C, AAK_NAGK-C: N-Acetyl-L-glutamate kinase - cyclic (NAGK-C) catalyzes the phosphorylation of the gamma-COOH group of N-acetyl-L-glutamate (NAG) by ATP in the second step of arginine biosynthesis found in some bacteria and photosynthetic organisms using the non-acetylated, cyclic route of ornithine biosynthesis	NA|160aa|up_4|NZ_CP046703.1_2339896_2340376_+	COG1357, COG1357, Pentapeptide repeats containing protein [Function unknown]	NA|208aa|up_3|NZ_CP046703.1_2340559_2341183_+	pfam05685, Uma2, Putative restriction endonuclease	NA|112aa|up_2|NZ_CP046703.1_2341282_2341618_-	pfam08869, XisI, XisI protein	NA|139aa|up_1|NZ_CP046703.1_2341605_2342022_-	pfam08814, XisH, XisH protein	NA|183aa|up_0|NZ_CP046703.1_2342039_2342588_-	PRK00131, aroK, shikimate kinase; Reviewed	NA|341aa|down_0|NZ_CP046703.1_2343955_2344978_-	COG3491, PcbC, Isopenicillin N synthase and related dioxygenases [General function prediction only]	NA|497aa|down_1|NZ_CP046703.1_2345143_2346634_-	COG4775, COG4775, Outer membrane protein/protective antigen OMA87 [Cell envelope biogenesis, outer membrane]	NA|156aa|down_2|NZ_CP046703.1_2347088_2347556_+	COG3296, COG3296, Uncharacterized protein conserved in bacteria [Function unknown]	NA|305aa|down_3|NZ_CP046703.1_2347623_2348538_+	pfam02485, Branch, Core-2/I-Branching enzyme	NA|366aa|down_4|NZ_CP046703.1_2348690_2349788_+	COG0673, MviM, Predicted dehydrogenases and related proteins [General function prediction only]	NA|82aa|down_5|NZ_CP046703.1_2349795_2350041_+	pfam14279, HNH_5, HNH endonuclease	NA|906aa|down_6|NZ_CP046703.1_2350572_2353290_-	cd10797, GH57N_APU_like_1, N-terminal putative catalytic domain of mainly uncharacterized prokaryotic proteins similar to archaeal thermoactive amylopullulanases; glycoside hydrolase family 57 (GH57)	NA|364aa|down_7|NZ_CP046703.1_2353817_2354909_+	TIGR00378, cax, calcium/proton exchanger (cax)	NA|380aa|down_8|NZ_CP046703.1_2355089_2356229_+	cd14014, STKc_PknB_like, Catalytic domain of bacterial Serine/Threonine kinases, PknB and similar proteins	NA|477aa|down_9|NZ_CP046703.1_2356396_2357827_-	PRK09287, PRK09287, NADP-dependent phosphogluconate dehydrogenase
GCF_009873495.1_ASM987349v1	NZ_CP046703	Nostoc sp. ATCC 53789 chromosome, complete genome	7	2456944-2457067	7	CRISPRCasFinder	no		Cas9_archaeal,cas14k,cas14j,csa3,RT,DEDDh,PD-DExK,cas3,DinG,Cas14c_CAS-V-F,WYL,cas10d,csc2gr7,csc1gr5,2OG_CAS,cas6,cas4,cas1,cas2,Cas14u_CAS-V	Orphan	TTCTGTTCATCAACAAACCCATGCGTTTCATCAACAAAAT	40	0	0	NA	NA	NA	1	1	Orphan	Cas9_archaeal,cas14k,cas14j,csa3,RT,DEDDh,PD-DExK,cas3,DinG,Cas14c_CAS-V-F,WYL,cas10d,csc2gr7,csc1gr5,2OG_CAS,cas6,cas4,cas1,cas2,Cas14u_CAS-V,csx3,csx1,cas10,csm3gr7,csx10gr5,csx19,c2c9_V-U4,cas8a4,cas7,cas5	NA|117aa|up_9|NZ_CP046703.1_2444334_2444685_-,NA|52aa|up_8|NZ_CP046703.1_2444805_2444961_-,NA|100aa|up_5|NZ_CP046703.1_2445867_2446167_-,NA|145aa|up_3|NZ_CP046703.1_2449409_2449844_-,NA|53aa|up_0|NZ_CP046703.1_2456417_2456576_+,NA|143aa|down_0|NZ_CP046703.1_2457256_2457685_+,NA|141aa|down_5|NZ_CP046703.1_2466567_2466990_-,NA|142aa|down_8|NZ_CP046703.1_2470866_2471292_+	NA|117aa|up_9|NZ_CP046703.1_2444334_2444685_-	NA	NA|52aa|up_8|NZ_CP046703.1_2444805_2444961_-	NA	NA|83aa|up_7|NZ_CP046703.1_2445126_2445375_+	COG4118, Phd, Antitoxin of toxin-antitoxin stability system [Cell division and chromosome partitioning]	NA|129aa|up_6|NZ_CP046703.1_2445377_2445764_+	cd09872, PIN_Sll0205-like, VapC-like PIN domain of Sll0205 protein and homologs	NA|100aa|up_5|NZ_CP046703.1_2445867_2446167_-	NA	NA|1002aa|up_4|NZ_CP046703.1_2446304_2449310_-	pfam13424, TPR_12, Tetratricopeptide repeat	NA|145aa|up_3|NZ_CP046703.1_2449409_2449844_-	NA	NA|1404aa|up_2|NZ_CP046703.1_2450025_2454237_-	COG0553, HepA, Superfamily II DNA/RNA helicases, SNF2 family [Transcription / DNA replication, recombination, and repair]	NA|589aa|up_1|NZ_CP046703.1_2454364_2456131_-	COG4715, COG4715, Uncharacterized conserved protein [Function unknown]	NA|53aa|up_0|NZ_CP046703.1_2456417_2456576_+	NA	NA|143aa|down_0|NZ_CP046703.1_2457256_2457685_+	NA	NA|887aa|down_1|NZ_CP046703.1_2457743_2460404_-	COG0642, BaeS, Signal transduction histidine kinase [Signal transduction mechanisms]	NA|803aa|down_2|NZ_CP046703.1_2460670_2463079_-	pfam16095, COR, C-terminal of Roc, COR, domain	NA|531aa|down_3|NZ_CP046703.1_2463517_2465110_+	COG0654, UbiH, 2-polyprenyl-6-methoxyphenol hydroxylase and related FAD-dependent oxidoreductases [Coenzyme metabolism / Energy production and conversion]	NA|348aa|down_4|NZ_CP046703.1_2465338_2466382_+	cd11593, Agmatinase-like_2, Agmatinase and related proteins	NA|141aa|down_5|NZ_CP046703.1_2466567_2466990_-	NA	NA|546aa|down_6|NZ_CP046703.1_2467114_2468752_-	PRK05380, pyrG, CTP synthetase; Validated	NA|592aa|down_7|NZ_CP046703.1_2468836_2470612_+	cd02696, MurNAc-LAA, N-acetylmuramoyl-L-alanine amidase or MurNAc-LAA (also known as peptidoglycan aminohydrolase, NAMLA amidase, NAMLAA, Amidase 3, and peptidoglycan amidase; EC 3	NA|142aa|down_8|NZ_CP046703.1_2470866_2471292_+	NA	NA|192aa|down_9|NZ_CP046703.1_2471407_2471983_-	COG1357, COG1357, Pentapeptide repeats containing protein [Function unknown]
GCF_009873495.1_ASM987349v1	NZ_CP046703	Nostoc sp. ATCC 53789 chromosome, complete genome	8	4209520-4209636	8	CRISPRCasFinder	no		Cas9_archaeal,cas14k,cas14j,csa3,RT,DEDDh,PD-DExK,cas3,DinG,Cas14c_CAS-V-F,WYL,cas10d,csc2gr7,csc1gr5,2OG_CAS,cas6,cas4,cas1,cas2,Cas14u_CAS-V	Orphan	ACTTGTGTGTACACCGTAGCCTTGTAAAGGGG	32	0	0	NA	NA	NA	1	1	Orphan	Cas9_archaeal,cas14k,cas14j,csa3,RT,DEDDh,PD-DExK,cas3,DinG,Cas14c_CAS-V-F,WYL,cas10d,csc2gr7,csc1gr5,2OG_CAS,cas6,cas4,cas1,cas2,Cas14u_CAS-V,csx3,csx1,cas10,csm3gr7,csx10gr5,csx19,c2c9_V-U4,cas8a4,cas7,cas5	NA|78aa|up_9|NZ_CP046703.1_4193006_4193240_+,NA|127aa|up_5|NZ_CP046703.1_4198585_4198966_-,NA|327aa|up_1|NZ_CP046703.1_4202876_4203857_-,NA|42aa|down_4|NZ_CP046703.1_4227905_4228031_-,NA|204aa|down_7|NZ_CP046703.1_4231018_4231630_-	NA|78aa|up_9|NZ_CP046703.1_4193006_4193240_+	NA	NA|233aa|up_8|NZ_CP046703.1_4193690_4194389_+	pfam13489, Methyltransf_23, Methyltransferase domain	NA|358aa|up_7|NZ_CP046703.1_4195159_4196233_+	cd03801, GT4_PimA-like, phosphatidyl-myo-inositol mannosyltransferase	NA|262aa|up_6|NZ_CP046703.1_4196245_4197031_+	COG1922, WecG, Teichoic acid biosynthesis proteins [Cell envelope biogenesis, outer membrane]	NA|127aa|up_5|NZ_CP046703.1_4198585_4198966_-	NA	NA|363aa|up_4|NZ_CP046703.1_4199691_4200780_-	PRK13396, PRK13396, 3-deoxy-7-phosphoheptulonate synthase; Provisional	NA|318aa|up_3|NZ_CP046703.1_4201028_4201982_-	cd05229, SDR_a3, atypical (a) SDRs, subgroup 3	NA|235aa|up_2|NZ_CP046703.1_4202038_4202743_-	pfam14329, DUF4386, Domain of unknown function (DUF4386)	NA|327aa|up_1|NZ_CP046703.1_4202876_4203857_-	NA	NA|1648aa|up_0|NZ_CP046703.1_4204033_4208977_-	cd17646, A_NRPS_AB3403-like, Peptide Synthetase	NA|2539aa|down_0|NZ_CP046703.1_4209730_4217347_-	COG3321, COG3321, Polyketide synthase modules and related proteins [Secondary metabolites biosynthesis, transport, and catabolism]	NA|1874aa|down_1|NZ_CP046703.1_4217648_4223270_-	COG3321, COG3321, Polyketide synthase modules and related proteins [Secondary metabolites biosynthesis, transport, and catabolism]	NA|664aa|down_2|NZ_CP046703.1_4223266_4225258_-	cd05936, FC-FACS_FadD_like, Prokaryotic long-chain fatty acid CoA synthetases similar to Escherichia coli FadD	NA|564aa|down_3|NZ_CP046703.1_4225399_4227091_-	pfam00221, Lyase_aromatic, Aromatic amino acid lyase	NA|42aa|down_4|NZ_CP046703.1_4227905_4228031_-	NA	NA|390aa|down_5|NZ_CP046703.1_4228148_4229318_-	cd17477, MFS_YcaD_like, YcaD and similar transporters of the Major Facilitator Superfamily	NA|259aa|down_6|NZ_CP046703.1_4229504_4230281_-	COG3208, GrsT, Predicted thioesterase involved in non-ribosomal peptide biosynthesis [Secondary metabolites biosynthesis, transport, and catabolism]	NA|204aa|down_7|NZ_CP046703.1_4231018_4231630_-	NA	NA|663aa|down_8|NZ_CP046703.1_4232107_4234096_+	cd07498, Peptidases_S8_15, Peptidase S8 family domain, uncharacterized subfamily 15	NA|458aa|down_9|NZ_CP046703.1_4234269_4235643_-	PRK13352, PRK13352, phosphomethylpyrimidine synthase ThiC
GCF_009873495.1_ASM987349v1	NZ_CP046703	Nostoc sp. ATCC 53789 chromosome, complete genome	9	4930753-4931585	2,9,2	PILER-CR,CRISPRCasFinder,CRT	no		Cas9_archaeal,cas14k,cas14j,csa3,RT,DEDDh,PD-DExK,cas3,DinG,Cas14c_CAS-V-F,WYL,cas10d,csc2gr7,csc1gr5,2OG_CAS,cas6,cas4,cas1,cas2,Cas14u_CAS-V	Orphan	ATTGCAATTCATCAAAATCCCTATTAGGG----------ATTGAAAC,ATTGCAATTCATCAAAATCCCTATTAGGGATTGAAAC,ATTGCAATTCATCAAAATCCCTATTAGGGATTGAAAC	47,37,37	0	0	NA	NA	I-D,II-B:I-D,II-B:I-D,II-B	11,11,11	11	Orphan	Cas9_archaeal,cas14k,cas14j,csa3,RT,DEDDh,PD-DExK,cas3,DinG,Cas14c_CAS-V-F,WYL,cas10d,csc2gr7,csc1gr5,2OG_CAS,cas6,cas4,cas1,cas2,Cas14u_CAS-V,csx3,csx1,cas10,csm3gr7,csx10gr5,csx19,c2c9_V-U4,cas8a4,cas7,cas5	NA|500aa|up_6|NZ_CP046703.1_4922766_4924266_-,NA|98aa|up_2|NZ_CP046703.1_4928176_4928470_-,NA	NA|261aa|up_9|NZ_CP046703.1_4917932_4918715_+	pfam01925, TauE, Sulfite exporter TauE/SafE	NA|177aa|up_8|NZ_CP046703.1_4919060_4919591_-	COG2197, CitB, Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain [Signal transduction mechanisms / Transcription]	NA|592aa|up_7|NZ_CP046703.1_4920654_4922430_+	COG0642, BaeS, Signal transduction histidine kinase [Signal transduction mechanisms]	NA|500aa|up_6|NZ_CP046703.1_4922766_4924266_-	NA	NA|258aa|up_5|NZ_CP046703.1_4924325_4925099_-	TIGR03943, TIGR03943, TIGR03943 family protein	NA|352aa|up_4|NZ_CP046703.1_4925152_4926208_-	COG0701, COG0701, Predicted permeases [General function prediction only]	NA|336aa|up_3|NZ_CP046703.1_4926867_4927875_-	cd00200, WD40, WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment	NA|98aa|up_2|NZ_CP046703.1_4928176_4928470_-	NA	NA|57aa|up_1|NZ_CP046703.1_4928796_4928967_-	PLN00014, PLN00014, light-harvesting-like protein 3; Provisional	NA|364aa|up_0|NZ_CP046703.1_4929384_4930476_+	cd05305, L-AlaDH, Alanine dehydrogenase NAD-binding and catalytic domains	NA|352aa|down_0|NZ_CP046703.1_4931656_4932712_-	TIGR00737, Probable_tRNA-dihydrouridine_synthase, putative TIM-barrel protein, nifR3 family	NA|278aa|down_1|NZ_CP046703.1_4932788_4933622_-	pfam11103, DUF2887, Protein of unknown function (DUF2887)	NA|286aa|down_2|NZ_CP046703.1_4933743_4934601_-	COG1398, OLE1, Fatty-acid desaturase [Lipid metabolism]	NA|410aa|down_3|NZ_CP046703.1_4934774_4936004_+	COG1309, AcrR, Transcriptional regulator [Transcription]	NA|634aa|down_4|NZ_CP046703.1_4936085_4937987_-	pfam07602, DUF1565, Protein of unknown function (DUF1565)	NA|372aa|down_5|NZ_CP046703.1_4938483_4939599_+	PRK02615, PRK02615, thiamine phosphate synthase	NA|71aa|down_6|NZ_CP046703.1_4939727_4939940_+	PRK07440, PRK07440, thiamine biosynthesis protein ThiS	NA|294aa|down_7|NZ_CP046703.1_4939936_4940818_-	COG2602, COG2602, Beta-lactamase class D [Defense mechanisms]	NA|411aa|down_8|NZ_CP046703.1_4940938_4942171_-	cd03801, GT4_PimA-like, phosphatidyl-myo-inositol mannosyltransferase	NA|328aa|down_9|NZ_CP046703.1_4943054_4944038_+	COG4371, COG4371, Predicted membrane protein [Function unknown]
GCF_009873495.1_ASM987349v1	NZ_CP046703	Nostoc sp. ATCC 53789 chromosome, complete genome	10	5414739-5414917	10	CRISPRCasFinder	no		Cas9_archaeal,cas14k,cas14j,csa3,RT,DEDDh,PD-DExK,cas3,DinG,Cas14c_CAS-V-F,WYL,cas10d,csc2gr7,csc1gr5,2OG_CAS,cas6,cas4,cas1,cas2,Cas14u_CAS-V	Orphan	GTTTCAATCCCTAATAGGGATTTTATTTGATTGCAAT	37	0	0	NA	NA	I-D,II-B	2	2	Orphan	Cas9_archaeal,cas14k,cas14j,csa3,RT,DEDDh,PD-DExK,cas3,DinG,Cas14c_CAS-V-F,WYL,cas10d,csc2gr7,csc1gr5,2OG_CAS,cas6,cas4,cas1,cas2,Cas14u_CAS-V,csx3,csx1,cas10,csm3gr7,csx10gr5,csx19,c2c9_V-U4,cas8a4,cas7,cas5	NA|87aa|up_5|NZ_CP046703.1_5405094_5405355_-,NA|125aa|down_2|NZ_CP046703.1_5418534_5418909_+,NA|57aa|down_9|NZ_CP046703.1_5424973_5425144_+	NA|317aa|up_9|NZ_CP046703.1_5400504_5401455_+	cd05230, UGD_SDR_e, UDP-glucuronate decarboxylase (UGD) and related proteins, extended (e) SDRs	NA|464aa|up_8|NZ_CP046703.1_5401603_5402995_+	COG1004, Ugd, Predicted UDP-glucose 6-dehydrogenase [Cell envelope biogenesis, outer membrane]	NA|291aa|up_7|NZ_CP046703.1_5403331_5404204_-	pfam11209, DUF2993, Protein of unknown function (DUF2993)	NA|139aa|up_6|NZ_CP046703.1_5404691_5405108_-	cd18696, PIN_MtVapC26-like, VapC-like PIN domain of Mycobacterium tuberculosis VapC26 and related proteins	NA|87aa|up_5|NZ_CP046703.1_5405094_5405355_-	NA	NA|417aa|up_4|NZ_CP046703.1_5405403_5406654_-	PRK00549, PRK00549, competence damage-inducible protein A; Provisional	NA|428aa|up_3|NZ_CP046703.1_5406774_5408058_-	PRK00011, glyA, serine hydroxymethyltransferase; Reviewed	NA|249aa|up_2|NZ_CP046703.1_5408467_5409214_+	pfam18171, LSDAT_prok, SLOG in TRPM, prokaryote	NA|208aa|up_1|NZ_CP046703.1_5409223_5409847_+	pfam14015, DUF4231, Protein of unknown function (DUF4231)	NA|927aa|up_0|NZ_CP046703.1_5410547_5413328_-	COG0612, PqqL, Predicted Zn-dependent peptidases [General function prediction only]	NA|256aa|down_0|NZ_CP046703.1_5415534_5416302_-	cd02978, KaiB_like, KaiB-like family; composed of the circadian clock proteins, KaiB and the N-terminal KaiB-like sensory domain of SasA	NA|428aa|down_1|NZ_CP046703.1_5416521_5417805_-	COG2805, PilT, Tfp pilus assembly protein, pilus retraction ATPase PilT [Cell motility and secretion / Intracellular trafficking and secretion]	NA|125aa|down_2|NZ_CP046703.1_5418534_5418909_+	NA	NA|377aa|down_3|NZ_CP046703.1_5419161_5420292_+	TIGR00236, UDP-N-acetylglucosamine_2-epimerase, UDP-N-acetylglucosamine 2-epimerase	NA|224aa|down_4|NZ_CP046703.1_5420547_5421219_-	TIGR02869, Spore_cortex-lytic_enzyme, spore cortex-lytic enzyme	NA|137aa|down_5|NZ_CP046703.1_5421532_5421943_-	cd17580, REC_2_DhkD-like, second phosphoacceptor receiver (REC) domain of Dictyostelium discoideum hybrid signal transduction histidine kinase D and similar domains	NA|244aa|down_6|NZ_CP046703.1_5422099_5422831_-	COG0664, Crp, cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases [Signal transduction mechanisms]	NA|249aa|down_7|NZ_CP046703.1_5422911_5423658_-	COG0664, Crp, cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases [Signal transduction mechanisms]	NA|242aa|down_8|NZ_CP046703.1_5423859_5424585_-	COG0664, Crp, cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases [Signal transduction mechanisms]	NA|57aa|down_9|NZ_CP046703.1_5424973_5425144_+	NA
GCF_009873495.1_ASM987349v1	NZ_CP046703	Nostoc sp. ATCC 53789 chromosome, complete genome	11	5414918-5415168	3,3	CRT,PILER-CR	no		Cas9_archaeal,cas14k,cas14j,csa3,RT,DEDDh,PD-DExK,cas3,DinG,Cas14c_CAS-V-F,WYL,cas10d,csc2gr7,csc1gr5,2OG_CAS,cas6,cas4,cas1,cas2,Cas14u_CAS-V	Orphan	TGTAGCTACTTACTTATGGATTTTATTTGATTGCAAT,GTAGCTACTTACTTATGGATTTTATTTGATTGCAAT	37,36	0	0	NA	NA	NA:NA	3,3	3	Orphan	Cas9_archaeal,cas14k,cas14j,csa3,RT,DEDDh,PD-DExK,cas3,DinG,Cas14c_CAS-V-F,WYL,cas10d,csc2gr7,csc1gr5,2OG_CAS,cas6,cas4,cas1,cas2,Cas14u_CAS-V,csx3,csx1,cas10,csm3gr7,csx10gr5,csx19,c2c9_V-U4,cas8a4,cas7,cas5	NA|87aa|up_5|NZ_CP046703.1_5405094_5405355_-,NA|125aa|down_2|NZ_CP046703.1_5418534_5418909_+,NA|57aa|down_9|NZ_CP046703.1_5424973_5425144_+	NA|317aa|up_9|NZ_CP046703.1_5400504_5401455_+	cd05230, UGD_SDR_e, UDP-glucuronate decarboxylase (UGD) and related proteins, extended (e) SDRs	NA|464aa|up_8|NZ_CP046703.1_5401603_5402995_+	COG1004, Ugd, Predicted UDP-glucose 6-dehydrogenase [Cell envelope biogenesis, outer membrane]	NA|291aa|up_7|NZ_CP046703.1_5403331_5404204_-	pfam11209, DUF2993, Protein of unknown function (DUF2993)	NA|139aa|up_6|NZ_CP046703.1_5404691_5405108_-	cd18696, PIN_MtVapC26-like, VapC-like PIN domain of Mycobacterium tuberculosis VapC26 and related proteins	NA|87aa|up_5|NZ_CP046703.1_5405094_5405355_-	NA	NA|417aa|up_4|NZ_CP046703.1_5405403_5406654_-	PRK00549, PRK00549, competence damage-inducible protein A; Provisional	NA|428aa|up_3|NZ_CP046703.1_5406774_5408058_-	PRK00011, glyA, serine hydroxymethyltransferase; Reviewed	NA|249aa|up_2|NZ_CP046703.1_5408467_5409214_+	pfam18171, LSDAT_prok, SLOG in TRPM, prokaryote	NA|208aa|up_1|NZ_CP046703.1_5409223_5409847_+	pfam14015, DUF4231, Protein of unknown function (DUF4231)	NA|927aa|up_0|NZ_CP046703.1_5410547_5413328_-	COG0612, PqqL, Predicted Zn-dependent peptidases [General function prediction only]	NA|256aa|down_0|NZ_CP046703.1_5415534_5416302_-	cd02978, KaiB_like, KaiB-like family; composed of the circadian clock proteins, KaiB and the N-terminal KaiB-like sensory domain of SasA	NA|428aa|down_1|NZ_CP046703.1_5416521_5417805_-	COG2805, PilT, Tfp pilus assembly protein, pilus retraction ATPase PilT [Cell motility and secretion / Intracellular trafficking and secretion]	NA|125aa|down_2|NZ_CP046703.1_5418534_5418909_+	NA	NA|377aa|down_3|NZ_CP046703.1_5419161_5420292_+	TIGR00236, UDP-N-acetylglucosamine_2-epimerase, UDP-N-acetylglucosamine 2-epimerase	NA|224aa|down_4|NZ_CP046703.1_5420547_5421219_-	TIGR02869, Spore_cortex-lytic_enzyme, spore cortex-lytic enzyme	NA|137aa|down_5|NZ_CP046703.1_5421532_5421943_-	cd17580, REC_2_DhkD-like, second phosphoacceptor receiver (REC) domain of Dictyostelium discoideum hybrid signal transduction histidine kinase D and similar domains	NA|244aa|down_6|NZ_CP046703.1_5422099_5422831_-	COG0664, Crp, cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases [Signal transduction mechanisms]	NA|249aa|down_7|NZ_CP046703.1_5422911_5423658_-	COG0664, Crp, cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases [Signal transduction mechanisms]	NA|242aa|down_8|NZ_CP046703.1_5423859_5424585_-	COG0664, Crp, cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases [Signal transduction mechanisms]	NA|57aa|down_9|NZ_CP046703.1_5424973_5425144_+	NA
GCF_009873495.1_ASM987349v1	NZ_CP046703	Nostoc sp. ATCC 53789 chromosome, complete genome	12	5418944-5419042	11	CRISPRCasFinder	no		Cas9_archaeal,cas14k,cas14j,csa3,RT,DEDDh,PD-DExK,cas3,DinG,Cas14c_CAS-V-F,WYL,cas10d,csc2gr7,csc1gr5,2OG_CAS,cas6,cas4,cas1,cas2,Cas14u_CAS-V	Orphan	CCCGGTTAAGCCGGGTTTTTTCATGCT	27	0	0	NA	NA	NA	1	1	Orphan	Cas9_archaeal,cas14k,cas14j,csa3,RT,DEDDh,PD-DExK,cas3,DinG,Cas14c_CAS-V-F,WYL,cas10d,csc2gr7,csc1gr5,2OG_CAS,cas6,cas4,cas1,cas2,Cas14u_CAS-V,csx3,csx1,cas10,csm3gr7,csx10gr5,csx19,c2c9_V-U4,cas8a4,cas7,cas5	NA|87aa|up_8|NZ_CP046703.1_5405094_5405355_-,NA|125aa|up_0|NZ_CP046703.1_5418534_5418909_+,NA|57aa|down_6|NZ_CP046703.1_5424973_5425144_+	NA|139aa|up_9|NZ_CP046703.1_5404691_5405108_-	cd18696, PIN_MtVapC26-like, VapC-like PIN domain of Mycobacterium tuberculosis VapC26 and related proteins	NA|87aa|up_8|NZ_CP046703.1_5405094_5405355_-	NA	NA|417aa|up_7|NZ_CP046703.1_5405403_5406654_-	PRK00549, PRK00549, competence damage-inducible protein A; Provisional	NA|428aa|up_6|NZ_CP046703.1_5406774_5408058_-	PRK00011, glyA, serine hydroxymethyltransferase; Reviewed	NA|249aa|up_5|NZ_CP046703.1_5408467_5409214_+	pfam18171, LSDAT_prok, SLOG in TRPM, prokaryote	NA|208aa|up_4|NZ_CP046703.1_5409223_5409847_+	pfam14015, DUF4231, Protein of unknown function (DUF4231)	NA|927aa|up_3|NZ_CP046703.1_5410547_5413328_-	COG0612, PqqL, Predicted Zn-dependent peptidases [General function prediction only]	NA|256aa|up_2|NZ_CP046703.1_5415534_5416302_-	cd02978, KaiB_like, KaiB-like family; composed of the circadian clock proteins, KaiB and the N-terminal KaiB-like sensory domain of SasA	NA|428aa|up_1|NZ_CP046703.1_5416521_5417805_-	COG2805, PilT, Tfp pilus assembly protein, pilus retraction ATPase PilT [Cell motility and secretion / Intracellular trafficking and secretion]	NA|125aa|up_0|NZ_CP046703.1_5418534_5418909_+	NA	NA|377aa|down_0|NZ_CP046703.1_5419161_5420292_+	TIGR00236, UDP-N-acetylglucosamine_2-epimerase, UDP-N-acetylglucosamine 2-epimerase	NA|224aa|down_1|NZ_CP046703.1_5420547_5421219_-	TIGR02869, Spore_cortex-lytic_enzyme, spore cortex-lytic enzyme	NA|137aa|down_2|NZ_CP046703.1_5421532_5421943_-	cd17580, REC_2_DhkD-like, second phosphoacceptor receiver (REC) domain of Dictyostelium discoideum hybrid signal transduction histidine kinase D and similar domains	NA|244aa|down_3|NZ_CP046703.1_5422099_5422831_-	COG0664, Crp, cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases [Signal transduction mechanisms]	NA|249aa|down_4|NZ_CP046703.1_5422911_5423658_-	COG0664, Crp, cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases [Signal transduction mechanisms]	NA|242aa|down_5|NZ_CP046703.1_5423859_5424585_-	COG0664, Crp, cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases [Signal transduction mechanisms]	NA|57aa|down_6|NZ_CP046703.1_5424973_5425144_+	NA	NA|346aa|down_7|NZ_CP046703.1_5425628_5426666_+	cd19094, AKR_Tas-like, Escherichia coli Tas protein and similar proteins	NA|376aa|down_8|NZ_CP046703.1_5426769_5427897_+	pfam05935, Arylsulfotrans, Arylsulfotransferase (ASST)	NA|224aa|down_9|NZ_CP046703.1_5428137_5428809_+	COG0625, Gst, Glutathione S-transferase [Posttranslational modification, protein turnover, chaperones]
GCF_009873495.1_ASM987349v1	NZ_CP046703	Nostoc sp. ATCC 53789 chromosome, complete genome	13	5440397-5440492	12	CRISPRCasFinder	no		Cas9_archaeal,cas14k,cas14j,csa3,RT,DEDDh,PD-DExK,cas3,DinG,Cas14c_CAS-V-F,WYL,cas10d,csc2gr7,csc1gr5,2OG_CAS,cas6,cas4,cas1,cas2,Cas14u_CAS-V	Orphan	TCTTAACTGTCCTTGCTTTTAAAT	24	0	0	NA	NA	NA	1	1	Orphan	Cas9_archaeal,cas14k,cas14j,csa3,RT,DEDDh,PD-DExK,cas3,DinG,Cas14c_CAS-V-F,WYL,cas10d,csc2gr7,csc1gr5,2OG_CAS,cas6,cas4,cas1,cas2,Cas14u_CAS-V,csx3,csx1,cas10,csm3gr7,csx10gr5,csx19,c2c9_V-U4,cas8a4,cas7,cas5	NA|48aa|up_1|NZ_CP046703.1_5437911_5438055_+,NA|74aa|down_2|NZ_CP046703.1_5442417_5442639_+,NA|294aa|down_6|NZ_CP046703.1_5449894_5450776_+	NA|224aa|up_9|NZ_CP046703.1_5428137_5428809_+	COG0625, Gst, Glutathione S-transferase [Posttranslational modification, protein turnover, chaperones]	NA|338aa|up_8|NZ_CP046703.1_5428880_5429894_-	cd08276, MDR7, Medium chain dehydrogenases/reductase (MDR)/zinc-dependent alcohol dehydrogenase-like family	NA|170aa|up_7|NZ_CP046703.1_5433004_5433514_+	COG3153, COG3153, Predicted acetyltransferase [General function prediction only]	NA|462aa|up_6|NZ_CP046703.1_5433603_5434989_-	PRK09201, PRK09201, AtzE family amidohydrolase	NA|209aa|up_5|NZ_CP046703.1_5434998_5435625_-	pfam14219, DUF4328, Domain of unknown function (DUF4328)	NA|63aa|up_4|NZ_CP046703.1_5435714_5435903_-	pfam13318, DUF4089, Protein of unknown function (DUF4089)	NA|94aa|up_3|NZ_CP046703.1_5435976_5436258_+	pfam05768, DUF836, Glutaredoxin-like domain (DUF836)	NA|501aa|up_2|NZ_CP046703.1_5436311_5437814_+	PRK00139, murE, UDP-N-acetylmuramoylalanyl-D-glutamate--2,6-diaminopimelate ligase; Provisional	NA|48aa|up_1|NZ_CP046703.1_5437911_5438055_+	NA	NA|494aa|up_0|NZ_CP046703.1_5438244_5439726_+	COG0642, BaeS, Signal transduction histidine kinase [Signal transduction mechanisms]	NA|198aa|down_0|NZ_CP046703.1_5440584_5441178_+	cd00051, EFh, EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands	NA|139aa|down_1|NZ_CP046703.1_5441560_5441977_+	cd02210, cupin_BLR2406-like, Bradyrhizobium japonicum BLR2406 and related proteins, cupin domain	NA|74aa|down_2|NZ_CP046703.1_5442417_5442639_+	NA	NA|443aa|down_3|NZ_CP046703.1_5442964_5444293_+	pfam02281, Dimer_Tnp_Tn5, Transposase Tn5 dimerization domain	NA|1058aa|down_4|NZ_CP046703.1_5444394_5447568_-	COG0841, AcrB, Cation/multidrug efflux pump [Defense mechanisms]	NA|486aa|down_5|NZ_CP046703.1_5447702_5449160_-	TIGR01730, COG0845:_Membrane-fusion_protein, RND family efflux transporter, MFP subunit	NA|294aa|down_6|NZ_CP046703.1_5449894_5450776_+	NA	NA|372aa|down_7|NZ_CP046703.1_5450882_5451998_+	COG1173, DppC, ABC-type dipeptide/oligopeptide/nickel transport systems, permease components [Amino acid transport and metabolism / Inorganic ion transport and metabolism]	NA|902aa|down_8|NZ_CP046703.1_5452334_5455040_-	pfam09534, Trp_oprn_chp, Tryptophan-associated transmembrane protein (Trp_oprn_chp)	NA|367aa|down_9|NZ_CP046703.1_5455332_5456433_-	pfam01637, ATPase_2, ATPase domain predominantly from Archaea
GCF_009873495.1_ASM987349v1	NZ_CP046703	Nostoc sp. ATCC 53789 chromosome, complete genome	14	5454366-5454731	4	CRT	no		Cas9_archaeal,cas14k,cas14j,csa3,RT,DEDDh,PD-DExK,cas3,DinG,Cas14c_CAS-V-F,WYL,cas10d,csc2gr7,csc1gr5,2OG_CAS,cas6,cas4,cas1,cas2,Cas14u_CAS-V	Orphan	CACGCCTCNCGCCACGCCTCCNGCCACGCC	30	1	1	5454672-5454701	NZ_CP046703.1_5262577-5262606	NA	6	6	Orphan	Cas9_archaeal,cas14k,cas14j,csa3,RT,DEDDh,PD-DExK,cas3,DinG,Cas14c_CAS-V-F,WYL,cas10d,csc2gr7,csc1gr5,2OG_CAS,cas6,cas4,cas1,cas2,Cas14u_CAS-V,csx3,csx1,cas10,csm3gr7,csx10gr5,csx19,c2c9_V-U4,cas8a4,cas7,cas5	NA|48aa|up_9|NZ_CP046703.1_5437911_5438055_+,NA|74aa|up_5|NZ_CP046703.1_5442417_5442639_+,NA|294aa|up_1|NZ_CP046703.1_5449894_5450776_+,NA|114aa|down_1|NZ_CP046703.1_5456631_5456973_+,NA|109aa|down_5|NZ_CP046703.1_5461249_5461576_+	NA|48aa|up_9|NZ_CP046703.1_5437911_5438055_+	NA	NA|494aa|up_8|NZ_CP046703.1_5438244_5439726_+	COG0642, BaeS, Signal transduction histidine kinase [Signal transduction mechanisms]	NA|198aa|up_7|NZ_CP046703.1_5440584_5441178_+	cd00051, EFh, EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands	NA|139aa|up_6|NZ_CP046703.1_5441560_5441977_+	cd02210, cupin_BLR2406-like, Bradyrhizobium japonicum BLR2406 and related proteins, cupin domain	NA|74aa|up_5|NZ_CP046703.1_5442417_5442639_+	NA	NA|443aa|up_4|NZ_CP046703.1_5442964_5444293_+	pfam02281, Dimer_Tnp_Tn5, Transposase Tn5 dimerization domain	NA|1058aa|up_3|NZ_CP046703.1_5444394_5447568_-	COG0841, AcrB, Cation/multidrug efflux pump [Defense mechanisms]	NA|486aa|up_2|NZ_CP046703.1_5447702_5449160_-	TIGR01730, COG0845:_Membrane-fusion_protein, RND family efflux transporter, MFP subunit	NA|294aa|up_1|NZ_CP046703.1_5449894_5450776_+	NA	NA|372aa|up_0|NZ_CP046703.1_5450882_5451998_+	COG1173, DppC, ABC-type dipeptide/oligopeptide/nickel transport systems, permease components [Amino acid transport and metabolism / Inorganic ion transport and metabolism]	NA|367aa|down_0|NZ_CP046703.1_5455332_5456433_-	pfam01637, ATPase_2, ATPase domain predominantly from Archaea	NA|114aa|down_1|NZ_CP046703.1_5456631_5456973_+	NA	NA|342aa|down_2|NZ_CP046703.1_5457122_5458148_-	COG0601, DppB, ABC-type dipeptide/oligopeptide/nickel transport systems, permease components [Amino acid transport and metabolism / Inorganic ion transport and metabolism]	NA|551aa|down_3|NZ_CP046703.1_5458217_5459870_-	cd08519, PBP2_NikA_DppA_OppA_like_20, The substrate-binding component of an uncharacterized ABC-type nickel/dipeptide/oligopeptide-like import system contains the type 2 periplasmic binding fold	NA|289aa|down_4|NZ_CP046703.1_5460083_5460950_-	COG0338, Dam, Site-specific DNA methylase [DNA replication, recombination, and repair]	NA|109aa|down_5|NZ_CP046703.1_5461249_5461576_+	NA	NA|510aa|down_6|NZ_CP046703.1_5462054_5463584_+	CHL00062, psbB, photosystem II 47 kDa protein	NA|36aa|down_7|NZ_CP046703.1_5463695_5463803_+	PRK11875, psbT, photosystem II reaction center protein T; Reviewed	NA|376aa|down_8|NZ_CP046703.1_5464100_5465228_+	PRK07400, PRK07400, 30S ribosomal protein S1; Reviewed	NA|245aa|down_9|NZ_CP046703.1_5465381_5466116_+	COG0546, Gph, Predicted phosphatases [General function prediction only]
GCF_009873495.1_ASM987349v1	NZ_CP046703	Nostoc sp. ATCC 53789 chromosome, complete genome	15	5602292-5602406	13	CRISPRCasFinder	no	cas3	Cas9_archaeal,cas14k,cas14j,csa3,RT,DEDDh,PD-DExK,cas3,DinG,Cas14c_CAS-V-F,WYL,cas10d,csc2gr7,csc1gr5,2OG_CAS,cas6,cas4,cas1,cas2,Cas14u_CAS-V	Unclear	GAAACACCCACGCTTACATACAAAACATCGTGGA	34	0	0	NA	NA	NA	1	1	Unclear	Cas9_archaeal,cas14k,cas14j,csa3,RT,DEDDh,PD-DExK,cas3,DinG,Cas14c_CAS-V-F,WYL,cas10d,csc2gr7,csc1gr5,2OG_CAS,cas6,cas4,cas1,cas2,Cas14u_CAS-V,csx3,csx1,cas10,csm3gr7,csx10gr5,csx19,c2c9_V-U4,cas8a4,cas7,cas5	NA|49aa|up_8|NZ_CP046703.1_5590180_5590327_+,NA|130aa|up_4|NZ_CP046703.1_5596190_5596580_+,NA|90aa|up_2|NZ_CP046703.1_5598760_5599030_+,NA|61aa|down_3|NZ_CP046703.1_5607786_5607969_+,NA|142aa|down_4|NZ_CP046703.1_5608116_5608542_-	NA|596aa|up_9|NZ_CP046703.1_5588269_5590057_-	PRK00476, aspS, aspartyl-tRNA synthetase; Validated	NA|49aa|up_8|NZ_CP046703.1_5590180_5590327_+	NA	NA|703aa|up_7|NZ_CP046703.1_5590761_5592870_+	COG1357, COG1357, Pentapeptide repeats containing protein [Function unknown]	NA|505aa|up_6|NZ_CP046703.1_5593215_5594730_+	TIGR02733, similar_to_to_phytoene_dehydrogenase, C-3',4' desaturase CrtD	NA|369aa|up_5|NZ_CP046703.1_5594848_5595955_+	COG1748, LYS9, Saccharopine dehydrogenase and related proteins [Amino acid transport and metabolism]	NA|130aa|up_4|NZ_CP046703.1_5596190_5596580_+	NA	NA|433aa|up_3|NZ_CP046703.1_5596721_5598020_-	COG0025, NhaP, NhaP-type Na+/H+ and K+/H+ antiporters [Inorganic ion transport and metabolism]	NA|90aa|up_2|NZ_CP046703.1_5598760_5599030_+	NA	NA|380aa|up_1|NZ_CP046703.1_5599230_5600370_-	cd03801, GT4_PimA-like, phosphatidyl-myo-inositol mannosyltransferase	NA|390aa|up_0|NZ_CP046703.1_5600445_5601615_-	cd03801, GT4_PimA-like, phosphatidyl-myo-inositol mannosyltransferase	NA|299aa|down_0|NZ_CP046703.1_5604457_5605354_-	PRK02259, PRK02259, aspartoacylase; Provisional	NA|390aa|down_1|NZ_CP046703.1_5605524_5606694_+	COG2942, COG2942, N-acyl-D-glucosamine 2-epimerase [Carbohydrate transport and metabolism]	NA|118aa|down_2|NZ_CP046703.1_5606951_5607305_-	TIGR01617, Uncharacterized_protein_UU176, transcriptional regulator, Spx/MgsR family	NA|61aa|down_3|NZ_CP046703.1_5607786_5607969_+	NA	NA|142aa|down_4|NZ_CP046703.1_5608116_5608542_-	NA	NA|138aa|down_5|NZ_CP046703.1_5608693_5609107_+	cd11386, MCP_signal, Methyl-accepting chemotaxis protein (MCP), signaling domain	NA|428aa|down_6|NZ_CP046703.1_5609246_5610530_+	PRK00885, PRK00885, phosphoribosylamine--glycine ligase; Provisional	NA|622aa|down_7|NZ_CP046703.1_5610824_5612690_-	cd04084, CBM6_xylanase-like, Carbohydrate Binding Module 6 (CBM6); many are appended to glycoside hydrolase (GH) family 11 and GH43 xylanase domains	NA|189aa|down_8|NZ_CP046703.1_5614013_5614580_-	pfam10989, DUF2808, Protein of unknown function (DUF2808)	NA|146aa|down_9|NZ_CP046703.1_5614602_5615040_-	COG4270, COG4270, Predicted membrane protein [Function unknown]
GCF_009873495.1_ASM987349v1	NZ_CP046703	Nostoc sp. ATCC 53789 chromosome, complete genome	16	5753671-5754282	4,14,5	PILER-CR,CRISPRCasFinder,CRT	no	WYL,cas3,PD-DExK,cas10d,csc2gr7,csc1gr5,2OG_CAS,cas6,cas4,cas1,cas2	Cas9_archaeal,cas14k,cas14j,csa3,RT,DEDDh,PD-DExK,cas3,DinG,Cas14c_CAS-V-F,WYL,cas10d,csc2gr7,csc1gr5,2OG_CAS,cas6,cas4,cas1,cas2,Cas14u_CAS-V	Type I-D	ATTGCAATTCATTAAAATCCCTATCAGGG----------ATTGAAAC,ATTGCAATTCATTAAAATCCCTATCAGGGATTGAAAC,ATTGCAATTCATTAAAATCCCTATCAGGGATTGAAAC	47,37,37	0	0	NA	NA	I-D,II-B:I-D,II-B:I-D,II-B	7,8,8	8	TypeI-D	Cas9_archaeal,cas14k,cas14j,csa3,RT,DEDDh,PD-DExK,cas3,DinG,Cas14c_CAS-V-F,WYL,cas10d,csc2gr7,csc1gr5,2OG_CAS,cas6,cas4,cas1,cas2,Cas14u_CAS-V,csx3,csx1,cas10,csm3gr7,csx10gr5,csx19,c2c9_V-U4,cas8a4,cas7,cas5	NA|137aa|up_5|NZ_CP046703.1_5749215_5749626_-,NA|75aa|down_1|NZ_CP046703.1_5761513_5761738_-	PD-DExK|340aa|up_9|NZ_CP046703.1_5743115_5744135_+	pfam06250, DUF1016, Protein of unknown function (DUF1016)	cas10d|1097aa|up_8|NZ_CP046703.1_5744152_5747443_+	TIGR03174, cas_Csc3, CRISPR type I-D/CYANO-associated protein Csc3/Cas10d	csc2gr7|339aa|up_7|NZ_CP046703.1_5747491_5748508_+	pfam18320, Csc2, Csc2 Crispr	csc1gr5|236aa|up_6|NZ_CP046703.1_5748507_5749215_+	cd09711, Csc1_I-D, CRISPR/Cas system-associated protein Csc1	NA|137aa|up_5|NZ_CP046703.1_5749215_5749626_-	NA	2OG_CAS|207aa|up_4|NZ_CP046703.1_5749735_5750356_+	pfam13640, 2OG-FeII_Oxy_3, 2OG-Fe(II) oxygenase superfamily	cas6|290aa|up_3|NZ_CP046703.1_5750345_5751215_+	COG5551, COG5551, CRISPR system related protein, RAMP superfamily [Defense    mechanisms]	cas4|198aa|up_2|NZ_CP046703.1_5751246_5751840_+	TIGR00372, conserved_hypothetical_protein, CRISPR-associated protein Cas4	cas1|335aa|up_1|NZ_CP046703.1_5752051_5753056_+	TIGR04093, hypothetical_protein_L8106_25395, CRISPR-associated endonuclease Cas1, subtype CYANO	cas2|96aa|up_0|NZ_CP046703.1_5753114_5753402_+	cd09725, Cas2_I_II_III, CRISPR/Cas system-associated protein Cas2	NA|2207aa|down_0|NZ_CP046703.1_5754530_5761151_-	PRK11107, PRK11107, hybrid sensory histidine kinase BarA; Provisional	NA|75aa|down_1|NZ_CP046703.1_5761513_5761738_-	NA	NA|162aa|down_2|NZ_CP046703.1_5761721_5762207_-	cd09874, PIN_MT3492-like, VapC-like PIN domain of the hypothetical protein MT3492 of Mycobacterium tuberculosis CDC1551 and other uncharacterized, annotated PilT protein domain proteins	NA|74aa|down_3|NZ_CP046703.1_5762272_5762494_-	COG1598, COG1598, Predicted nuclease of the RNAse H fold, HicB family [General    function prediction only]	NA|1259aa|down_4|NZ_CP046703.1_5762546_5766323_-	PLN03241, PLN03241, magnesium chelatase subunit H; Provisional	NA|314aa|down_5|NZ_CP046703.1_5767161_5768103_+	cd02696, MurNAc-LAA, N-acetylmuramoyl-L-alanine amidase or MurNAc-LAA (also known as peptidoglycan aminohydrolase, NAMLA amidase, NAMLAA, Amidase 3, and peptidoglycan amidase; EC 3	NA|326aa|down_6|NZ_CP046703.1_5768366_5769344_+	cd09763, DHRS1-like_SDR_c, human dehydrogenase/reductase (SDR family) member 1 (DHRS1) -like, classical (c) SDRs	NA|780aa|down_7|NZ_CP046703.1_5769520_5771860_+	COG0475, KefB, Kef-type K+ transport systems, membrane components [Inorganic ion transport and metabolism]	NA|233aa|down_8|NZ_CP046703.1_5772247_5772946_+	TIGR02595, conserved_hypothetical_protein, PEP-CTERM protein-sorting domain	NA|1851aa|down_9|NZ_CP046703.1_5773124_5778677_-	COG3899, COG3899, Predicted ATPase [General function prediction only]
GCF_009873495.1_ASM987349v1	NZ_CP046703	Nostoc sp. ATCC 53789 chromosome, complete genome	17	6707883-6709219	15,6,5	CRISPRCasFinder,CRT,PILER-CR	no	PD-DExK	Cas9_archaeal,cas14k,cas14j,csa3,RT,DEDDh,PD-DExK,cas3,DinG,Cas14c_CAS-V-F,WYL,cas10d,csc2gr7,csc1gr5,2OG_CAS,cas6,cas4,cas1,cas2,Cas14u_CAS-V	Unclear	GTTTCAATCCCTAATAGGGATTTTGATAAATTGCAAT,GTTTCAATCCCTAATAGGGATTTTGATAAATTGCAAT,GTTTCAATCCCTAATAGGGATTTTGATAAATTGCAAT	37,37,37	0	0	NA	NA	I-D,II-B:I-D,II-B:I-D,II-B	18,18,18	18	Orphan	Cas9_archaeal,cas14k,cas14j,csa3,RT,DEDDh,PD-DExK,cas3,DinG,Cas14c_CAS-V-F,WYL,cas10d,csc2gr7,csc1gr5,2OG_CAS,cas6,cas4,cas1,cas2,Cas14u_CAS-V,csx3,csx1,cas10,csm3gr7,csx10gr5,csx19,c2c9_V-U4,cas8a4,cas7,cas5	NA|46aa|up_9|NZ_CP046703.1_6696937_6697075_+,NA|165aa|up_7|NZ_CP046703.1_6699010_6699505_+,NA|147aa|up_3|NZ_CP046703.1_6703098_6703539_+,NA|85aa|up_1|NZ_CP046703.1_6706572_6706827_-,NA|71aa|down_2|NZ_CP046703.1_6714513_6714726_+,NA|68aa|down_3|NZ_CP046703.1_6714768_6714972_+	NA|46aa|up_9|NZ_CP046703.1_6696937_6697075_+	NA	NA|291aa|up_8|NZ_CP046703.1_6697225_6698098_-	PRK13398, PRK13398, 3-deoxy-7-phosphoheptulonate synthase; Provisional	NA|165aa|up_7|NZ_CP046703.1_6699010_6699505_+	NA	NA|222aa|up_6|NZ_CP046703.1_6699587_6700253_+	COG1357, COG1357, Pentapeptide repeats containing protein [Function unknown]	NA|202aa|up_5|NZ_CP046703.1_6701098_6701704_+	pfam14273, DUF4360, Domain of unknown function (DUF4360)	NA|286aa|up_4|NZ_CP046703.1_6701955_6702813_+	COG1801, COG1801, Uncharacterized conserved protein [Function unknown]	NA|147aa|up_3|NZ_CP046703.1_6703098_6703539_+	NA	NA|437aa|up_2|NZ_CP046703.1_6703725_6705036_-	COG5659, COG5659, FOG: Transposase [DNA replication, recombination, and repair]	NA|85aa|up_1|NZ_CP046703.1_6706572_6706827_-	NA	NA|298aa|up_0|NZ_CP046703.1_6706935_6707829_+	PRK13236, PRK13236, nitrogenase reductase; Reviewed	NA|278aa|down_0|NZ_CP046703.1_6709707_6710541_+	COG0412, COG0412, Dienelactone hydrolase and related enzymes [Secondary metabolites biosynthesis, transport, and catabolism]	NA|1040aa|down_1|NZ_CP046703.1_6710766_6713886_-	COG3641, PfoR, Predicted membrane protein, putative toxin regulator [General function prediction only]	NA|71aa|down_2|NZ_CP046703.1_6714513_6714726_+	NA	NA|68aa|down_3|NZ_CP046703.1_6714768_6714972_+	NA	NA|446aa|down_4|NZ_CP046703.1_6715160_6716498_-	cd01116, P_permease, Permease P (pink-eyed dilution)	NA|206aa|down_5|NZ_CP046703.1_6716611_6717229_-	pfam00582, Usp, Universal stress protein family	NA|287aa|down_6|NZ_CP046703.1_6717624_6718485_-	COG1116, TauB, ABC-type nitrate/sulfonate/bicarbonate transport system, ATPase component [Inorganic ion transport and metabolism]	NA|481aa|down_7|NZ_CP046703.1_6718592_6720035_-	cd13553, PBP2_NrtA_CpmA_like, Substrate binding domain of ABC-type nitrate/bicarbonate transporters, a member of the type 2 periplasmic binding fold superfamily	NA|283aa|down_8|NZ_CP046703.1_6720121_6720970_-	TIGR01183, Nitrate_transport_permease_protein_NrtB, nitrate ABC transporter, permease protein	NA|397aa|down_9|NZ_CP046703.1_6721699_6722890_+	COG0642, BaeS, Signal transduction histidine kinase [Signal transduction mechanisms]
GCF_009873495.1_ASM987349v1	NZ_CP046703	Nostoc sp. ATCC 53789 chromosome, complete genome	18	7120544-7120796	6,16,7	PILER-CR,CRISPRCasFinder,CRT	no		Cas9_archaeal,cas14k,cas14j,csa3,RT,DEDDh,PD-DExK,cas3,DinG,Cas14c_CAS-V-F,WYL,cas10d,csc2gr7,csc1gr5,2OG_CAS,cas6,cas4,cas1,cas2,Cas14u_CAS-V	Orphan	ATTGCAATTATCTTAAATCCCTATTAGGG----------ATTGAAAC,ATTGCAATTATCTTAAATCCCTATTAGGGATTG,ATTGCAATTATCTTAAATCCCTATTAGGGATTGAAACAA	47,33,39	0	0	NA	NA	I-D,II-B:I-D,II-B:I-D,II-B	2,3,3	3	Orphan	Cas9_archaeal,cas14k,cas14j,csa3,RT,DEDDh,PD-DExK,cas3,DinG,Cas14c_CAS-V-F,WYL,cas10d,csc2gr7,csc1gr5,2OG_CAS,cas6,cas4,cas1,cas2,Cas14u_CAS-V,csx3,csx1,cas10,csm3gr7,csx10gr5,csx19,c2c9_V-U4,cas8a4,cas7,cas5	NA|108aa|up_0|NZ_CP046703.1_7119853_7120177_+,NA|184aa|down_1|NZ_CP046703.1_7122692_7123244_-,NA|102aa|down_5|NZ_CP046703.1_7128752_7129058_+,NA|119aa|down_6|NZ_CP046703.1_7129700_7130057_+,NA|59aa|down_9|NZ_CP046703.1_7133569_7133746_-	NA|312aa|up_9|NZ_CP046703.1_7105102_7106038_-	COG1622, CyoA, Heme/copper-type cytochrome/quinol oxidases, subunit 2 [Energy production and conversion]	NA|194aa|up_8|NZ_CP046703.1_7106206_7106788_-	COG4244, COG4244, Predicted membrane protein [Function unknown]	NA|165aa|up_7|NZ_CP046703.1_7106784_7107279_-	COG4244, COG4244, Predicted membrane protein [Function unknown]	NA|271aa|up_6|NZ_CP046703.1_7107904_7108717_+	cd05358, GlcDH_SDR_c, glucose 1 dehydrogenase (GlcDH), classical (c) SDRs	NA|313aa|up_5|NZ_CP046703.1_7108896_7109835_+	COG0053, MMT1, Predicted Co/Zn/Cd cation transporters [Inorganic ion transport and metabolism]	NA|547aa|up_4|NZ_CP046703.1_7112252_7113893_+	COG4188, COG4188, Predicted dienelactone hydrolase [General function prediction only]	NA|1100aa|up_3|NZ_CP046703.1_7113929_7117229_-	TIGR02956, sensor_protein_TorS, TMAO reductase sytem sensor TorS	NA|372aa|up_2|NZ_CP046703.1_7117931_7119047_+	cd19920, REC_PA4781-like, phosphoacceptor receiver (REC) domain of cyclic di-GMP phosphodiesterase PA4781 and similar domains	NA|60aa|up_1|NZ_CP046703.1_7119381_7119561_+	PLN00014, PLN00014, light-harvesting-like protein 3; Provisional	NA|108aa|up_0|NZ_CP046703.1_7119853_7120177_+	NA	NA|530aa|down_0|NZ_CP046703.1_7120876_7122466_-	COG3540, PhoD, Phosphodiesterase/alkaline phosphatase D [Inorganic ion transport and metabolism]	NA|184aa|down_1|NZ_CP046703.1_7122692_7123244_-	NA	NA|427aa|down_2|NZ_CP046703.1_7123786_7125067_-	PRK02427, PRK02427, 3-phosphoshikimate 1-carboxyvinyltransferase; Provisional	NA|277aa|down_3|NZ_CP046703.1_7125416_7126247_+	pfam06051, DUF928, Domain of Unknown Function (DUF928)	NA|743aa|down_4|NZ_CP046703.1_7126252_7128481_+	COG4252, COG4252, Predicted transmembrane sensor domain [Signal transduction mechanisms]	NA|102aa|down_5|NZ_CP046703.1_7128752_7129058_+	NA	NA|119aa|down_6|NZ_CP046703.1_7129700_7130057_+	NA	NA|116aa|down_7|NZ_CP046703.1_7130155_7130503_-	pfam07883, Cupin_2, Cupin domain	NA|965aa|down_8|NZ_CP046703.1_7130499_7133394_-	PRK06241, PRK06241, phosphoenolpyruvate synthase; Validated	NA|59aa|down_9|NZ_CP046703.1_7133569_7133746_-	NA
GCF_009873495.1_ASM987349v1	NZ_CP046704	Nostoc sp. ATCC 53789 plasmid pNsp_a, complete sequence	1	40761-42637	1,1,1	PILER-CR,CRISPRCasFinder,CRT	no	RT,WYL,csx3,csx1,cas10,csm3gr7,csx10gr5,csx19,c2c9_V-U4	RT,WYL,csx3,csx1,cas10,csm3gr7,csx10gr5,csx19,c2c9_V-U4,cas14j,cas3,cas8a4,cas7,cas5,cas6,cas4,cas1,cas2	Type III-D, Type III-D?,Type III-C,Type III-A,Type III-B	GTTTCCAAAGCCTATTACCCCGCAAGGGGACTGAAAC,GTTTCCAAAGCCTATTACCCCGCAAGGGGACTGAAAC,GTTTCCAAAGCCTATTACCCCGCAAGGGGACTGAAAC	37,37,37	0	0	NA	NA	NA:NA:NA	25,25,25	25	TypeIII-D,TypeIII-D?,TypeIII-C,TypeIII-A,TypeIII-B	Cas9_archaeal,cas14k,cas14j,csa3,RT,DEDDh,PD-DExK,cas3,DinG,Cas14c_CAS-V-F,WYL,cas10d,csc2gr7,csc1gr5,2OG_CAS,cas6,cas4,cas1,cas2,Cas14u_CAS-V,csx3,csx1,cas10,csm3gr7,csx10gr5,csx19,c2c9_V-U4,cas8a4,cas7,cas5	NA|281aa|up_7|NZ_CP046704.1_34000_34843_-,NA|68aa|up_6|NZ_CP046704.1_35357_35561_-,NA|109aa|up_5|NZ_CP046704.1_35553_35880_-,NA|219aa|up_4|NZ_CP046704.1_36003_36660_-,NA|74aa|up_3|NZ_CP046704.1_36576_36798_+,NA|323aa|up_0|NZ_CP046704.1_39554_40523_+,NA|145aa|down_0|NZ_CP046704.1_42688_43123_+,csx3|312aa|down_5|NZ_CP046704.1_47800_48736_+	NA|153aa|up_9|NZ_CP046704.1_32837_33296_-	pfam07154, DUF1392, Protein of unknown function (DUF1392)	NA|133aa|up_8|NZ_CP046704.1_33292_33691_-	cd04762, HTH_MerR-trunc, Helix-Turn-Helix DNA binding domain of truncated MerR-like proteins	NA|281aa|up_7|NZ_CP046704.1_34000_34843_-	NA	NA|68aa|up_6|NZ_CP046704.1_35357_35561_-	NA	NA|109aa|up_5|NZ_CP046704.1_35553_35880_-	NA	NA|219aa|up_4|NZ_CP046704.1_36003_36660_-	NA	NA|74aa|up_3|NZ_CP046704.1_36576_36798_+	NA	RT|514aa|up_2|NZ_CP046704.1_36844_38386_-	TIGR04416, hypothetical_protein, group II intron reverse transcriptase/maturase	NA|79aa|up_1|NZ_CP046704.1_39166_39403_+	cd00093, HTH_XRE, Helix-turn-helix XRE-family like proteins	NA|323aa|up_0|NZ_CP046704.1_39554_40523_+	NA	NA|145aa|down_0|NZ_CP046704.1_42688_43123_+	NA	NA|80aa|down_1|NZ_CP046704.1_43132_43372_+	COG2020, STE14, Putative protein-S-isoprenylcysteine methyltransferase [Posttranslational modification, protein turnover, chaperones]	NA|193aa|down_2|NZ_CP046704.1_45367_45946_-	pfam13328, HD_4, HD domain	WYL|465aa|down_3|NZ_CP046704.1_45953_47348_-	TIGR03985, hypothetical_protein_sll7078, CRISPR-associated protein, TIGR03985 family	csx3|102aa|down_4|NZ_CP046704.1_47450_47756_+	cd09740, Csx3_III-U, CRISPR/Cas system-associated protein Csx3	csx3|312aa|down_5|NZ_CP046704.1_47800_48736_+	NA	csx1|421aa|down_6|NZ_CP046704.1_48809_50072_+	pfam09002, DUF1887, Domain of unknown function (DUF1887)	cas10|538aa|down_7|NZ_CP046704.1_50068_51682_+	cd09680, Cas10_III, CRISPR/Cas system-associated protein Cas10	csm3gr7|223aa|down_8|NZ_CP046704.1_51682_52351_+	pfam03787, RAMPs, RAMP superfamily	csx10gr5|538aa|down_9|NZ_CP046704.1_52347_53961_+	TIGR02674, cas_cyan_RAMP_2, CRISPR-associated RAMP protein, Csx10 family
GCF_009873495.1_ASM987349v1	NZ_CP046704	Nostoc sp. ATCC 53789 plasmid pNsp_a, complete sequence	2	43770-45359	2,2,2	PILER-CR,CRISPRCasFinder,CRT	no	RT,WYL,csx3,csx1,cas10,csm3gr7,csx10gr5,csx19,c2c9_V-U4,cas14j,cas3	RT,WYL,csx3,csx1,cas10,csm3gr7,csx10gr5,csx19,c2c9_V-U4,cas14j,cas3,cas8a4,cas7,cas5,cas6,cas4,cas1,cas2	Type III-D,,Type III-C,Type III-A,Type III-B	GTTTCCATTAACCAAATCCCCTCACGGGGACTGAAAC,GTTTCCAAAGCCTATTACCCCGCAAGGGGACTGAAAC,GTTTCCATTAACCAAATCCCCTCACGGGGACTGAAAC	37,37,37	0	0	NA	NA	NA:NA:NA	20,21,21	21	TypeIII-D,,TypeIII-C,TypeIII-A,TypeIII-B,TypeV	Cas9_archaeal,cas14k,cas14j,csa3,RT,DEDDh,PD-DExK,cas3,DinG,Cas14c_CAS-V-F,WYL,cas10d,csc2gr7,csc1gr5,2OG_CAS,cas6,cas4,cas1,cas2,Cas14u_CAS-V,csx3,csx1,cas10,csm3gr7,csx10gr5,csx19,c2c9_V-U4,cas8a4,cas7,cas5	NA|281aa|up_9|NZ_CP046704.1_34000_34843_-,NA|68aa|up_8|NZ_CP046704.1_35357_35561_-,NA|109aa|up_7|NZ_CP046704.1_35553_35880_-,NA|219aa|up_6|NZ_CP046704.1_36003_36660_-,NA|74aa|up_5|NZ_CP046704.1_36576_36798_+,NA|323aa|up_2|NZ_CP046704.1_39554_40523_+,NA|145aa|up_1|NZ_CP046704.1_42688_43123_+,csx3|312aa|down_3|NZ_CP046704.1_47800_48736_+	NA|281aa|up_9|NZ_CP046704.1_34000_34843_-	NA	NA|68aa|up_8|NZ_CP046704.1_35357_35561_-	NA	NA|109aa|up_7|NZ_CP046704.1_35553_35880_-	NA	NA|219aa|up_6|NZ_CP046704.1_36003_36660_-	NA	NA|74aa|up_5|NZ_CP046704.1_36576_36798_+	NA	RT|514aa|up_4|NZ_CP046704.1_36844_38386_-	TIGR04416, hypothetical_protein, group II intron reverse transcriptase/maturase	NA|79aa|up_3|NZ_CP046704.1_39166_39403_+	cd00093, HTH_XRE, Helix-turn-helix XRE-family like proteins	NA|323aa|up_2|NZ_CP046704.1_39554_40523_+	NA	NA|145aa|up_1|NZ_CP046704.1_42688_43123_+	NA	NA|80aa|up_0|NZ_CP046704.1_43132_43372_+	COG2020, STE14, Putative protein-S-isoprenylcysteine methyltransferase [Posttranslational modification, protein turnover, chaperones]	NA|193aa|down_0|NZ_CP046704.1_45367_45946_-	pfam13328, HD_4, HD domain	WYL|465aa|down_1|NZ_CP046704.1_45953_47348_-	TIGR03985, hypothetical_protein_sll7078, CRISPR-associated protein, TIGR03985 family	csx3|102aa|down_2|NZ_CP046704.1_47450_47756_+	cd09740, Csx3_III-U, CRISPR/Cas system-associated protein Csx3	csx3|312aa|down_3|NZ_CP046704.1_47800_48736_+	NA	csx1|421aa|down_4|NZ_CP046704.1_48809_50072_+	pfam09002, DUF1887, Domain of unknown function (DUF1887)	cas10|538aa|down_5|NZ_CP046704.1_50068_51682_+	cd09680, Cas10_III, CRISPR/Cas system-associated protein Cas10	csm3gr7|223aa|down_6|NZ_CP046704.1_51682_52351_+	pfam03787, RAMPs, RAMP superfamily	csx10gr5|538aa|down_7|NZ_CP046704.1_52347_53961_+	TIGR02674, cas_cyan_RAMP_2, CRISPR-associated RAMP protein, Csx10 family	csm3gr7|474aa|down_8|NZ_CP046704.1_53950_55372_+	cd09683, Csm3_III-A, CRISPR/Cas system-associated RAMP superfamily protein Csm3	csx19|188aa|down_9|NZ_CP046704.1_55368_55932_+	TIGR03984, hypothetical_protein_FrEUN1fDRAFT_5778, CRISPR-associated protein, TIGR03984 family
GCF_009873495.1_ASM987349v1	NZ_CP046704	Nostoc sp. ATCC 53789 plasmid pNsp_a, complete sequence	3	73180-75066	3,3,3	PILER-CR,CRISPRCasFinder,CRT	no	csm3gr7,csx19,c2c9_V-U4,cas14j,cas3,cas8a4,cas7,cas5,cas6,cas4,cas1,cas2	RT,WYL,csx3,csx1,cas10,csm3gr7,csx10gr5,csx19,c2c9_V-U4,cas14j,cas3,cas8a4,cas7,cas5,cas6,cas4,cas1,cas2	Unclear	GTTTCCAAAGACTATTACCCCGCAAGGGGACTGAAAC,GTTTCCAAAGACTATTACCCCGCAAGGGGACTGAAAC,GTTTCCAAAGACTATTACCCCGCAAGGGGACTGAAAC	37,37,37	0	0	NA	NA	NA:NA:NA	25,25,25	25	TypeV	Cas9_archaeal,cas14k,cas14j,csa3,RT,DEDDh,PD-DExK,cas3,DinG,Cas14c_CAS-V-F,WYL,cas10d,csc2gr7,csc1gr5,2OG_CAS,cas6,cas4,cas1,cas2,Cas14u_CAS-V,csx3,csx1,cas10,csm3gr7,csx10gr5,csx19,c2c9_V-U4,cas8a4,cas7,cas5	cas8a4|612aa|up_6|NZ_CP046704.1_66408_68244_+,cas5|257aa|up_4|NZ_CP046704.1_69179_69950_+,NA|47aa|down_2|NZ_CP046704.1_80205_80346_-,NA|72aa|down_4|NZ_CP046704.1_82702_82918_-,NA|135aa|down_5|NZ_CP046704.1_83160_83565_+,NA|80aa|down_9|NZ_CP046704.1_88365_88605_+	NA|72aa|up_9|NZ_CP046704.1_63065_63281_-	COG4636, Uma2, Endonuclease, Uma2 family (restriction endonuclease fold) [General function prediction only]	cas3|209aa|up_8|NZ_CP046704.1_63406_64033_+	smart00487, DEXDc, DEAD-like helicases superfamily	cas3|682aa|up_7|NZ_CP046704.1_64013_66059_+	COG1203, COG1203, CRISPR-associated helicase Cas3 [Defense mechanisms]	cas8a4|612aa|up_6|NZ_CP046704.1_66408_68244_+	NA	cas7|321aa|up_5|NZ_CP046704.1_68236_69199_+	pfam01905, DevR, CRISPR-associated negative auto-regulator DevR/Csa2	cas5|257aa|up_4|NZ_CP046704.1_69179_69950_+	NA	cas6|342aa|up_3|NZ_CP046704.1_69927_70953_+	pfam10040, CRISPR_Cas6, CRISPR-associated endoribonuclease Cas6	cas4|200aa|up_2|NZ_CP046704.1_70980_71580_+	TIGR00372, conserved_hypothetical_protein, CRISPR-associated protein Cas4	cas1|326aa|up_1|NZ_CP046704.1_71659_72637_+	TIGR04093, hypothetical_protein_L8106_25395, CRISPR-associated endonuclease Cas1, subtype CYANO	cas2|98aa|up_0|NZ_CP046704.1_72666_72960_+	cd09725, Cas2_I_II_III, CRISPR/Cas system-associated protein Cas2	cas6|384aa|down_0|NZ_CP046704.1_75214_76366_+	COG5551, COG5551, CRISPR system related protein, RAMP superfamily [Defense    mechanisms]	NA|330aa|down_1|NZ_CP046704.1_78936_79926_+	COG1426, COG1426, Predicted transcriptional regulator contains Xre-like HTH domain [Function unknown]	NA|47aa|down_2|NZ_CP046704.1_80205_80346_-	NA	NA|234aa|down_3|NZ_CP046704.1_81884_82586_+	cd14098, STKc_Rad53_Cds1, Catalytic domain of the yeast Serine/Threonine Kinases, Rad53 and Cds1	NA|72aa|down_4|NZ_CP046704.1_82702_82918_-	NA	NA|135aa|down_5|NZ_CP046704.1_83160_83565_+	NA	NA|154aa|down_6|NZ_CP046704.1_83926_84388_-	COG3837, COG3837, Uncharacterized conserved protein, contains double-stranded beta-helix domain [Function unknown]	NA|362aa|down_7|NZ_CP046704.1_84987_86073_+	cd00200, WD40, WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment	NA|632aa|down_8|NZ_CP046704.1_86191_88087_+	pfam11850, DUF3370, Protein of unknown function (DUF3370)	NA|80aa|down_9|NZ_CP046704.1_88365_88605_+	NA
GCF_009873495.1_ASM987349v1	NZ_CP046704	Nostoc sp. ATCC 53789 plasmid pNsp_a, complete sequence	4	76706-78355	4,4	CRISPRCasFinder,CRT	no	c2c9_V-U4,cas14j,cas3,cas8a4,cas7,cas5,cas6,cas4,cas1,cas2	RT,WYL,csx3,csx1,cas10,csm3gr7,csx10gr5,csx19,c2c9_V-U4,cas14j,cas3,cas8a4,cas7,cas5,cas6,cas4,cas1,cas2	Unclear	GTTTCCAAAGACTATTACCCCGCAAGGGGACTGAAAC,GTTTCCACCAACCAAATCCCCTCACGGGGACTGAAAC	37,37	1	1	77475-77509	NZ_CP046703.1_6025422-6025388	NA:NA	22,22	22	TypeV	Cas9_archaeal,cas14k,cas14j,csa3,RT,DEDDh,PD-DExK,cas3,DinG,Cas14c_CAS-V-F,WYL,cas10d,csc2gr7,csc1gr5,2OG_CAS,cas6,cas4,cas1,cas2,Cas14u_CAS-V,csx3,csx1,cas10,csm3gr7,csx10gr5,csx19,c2c9_V-U4,cas8a4,cas7,cas5	cas8a4|612aa|up_7|NZ_CP046704.1_66408_68244_+,cas5|257aa|up_5|NZ_CP046704.1_69179_69950_+,NA|47aa|down_1|NZ_CP046704.1_80205_80346_-,NA|72aa|down_3|NZ_CP046704.1_82702_82918_-,NA|135aa|down_4|NZ_CP046704.1_83160_83565_+,NA|80aa|down_8|NZ_CP046704.1_88365_88605_+	cas3|209aa|up_9|NZ_CP046704.1_63406_64033_+	smart00487, DEXDc, DEAD-like helicases superfamily	cas3|682aa|up_8|NZ_CP046704.1_64013_66059_+	COG1203, COG1203, CRISPR-associated helicase Cas3 [Defense mechanisms]	cas8a4|612aa|up_7|NZ_CP046704.1_66408_68244_+	NA	cas7|321aa|up_6|NZ_CP046704.1_68236_69199_+	pfam01905, DevR, CRISPR-associated negative auto-regulator DevR/Csa2	cas5|257aa|up_5|NZ_CP046704.1_69179_69950_+	NA	cas6|342aa|up_4|NZ_CP046704.1_69927_70953_+	pfam10040, CRISPR_Cas6, CRISPR-associated endoribonuclease Cas6	cas4|200aa|up_3|NZ_CP046704.1_70980_71580_+	TIGR00372, conserved_hypothetical_protein, CRISPR-associated protein Cas4	cas1|326aa|up_2|NZ_CP046704.1_71659_72637_+	TIGR04093, hypothetical_protein_L8106_25395, CRISPR-associated endonuclease Cas1, subtype CYANO	cas2|98aa|up_1|NZ_CP046704.1_72666_72960_+	cd09725, Cas2_I_II_III, CRISPR/Cas system-associated protein Cas2	cas6|384aa|up_0|NZ_CP046704.1_75214_76366_+	COG5551, COG5551, CRISPR system related protein, RAMP superfamily [Defense    mechanisms]	NA|330aa|down_0|NZ_CP046704.1_78936_79926_+	COG1426, COG1426, Predicted transcriptional regulator contains Xre-like HTH domain [Function unknown]	NA|47aa|down_1|NZ_CP046704.1_80205_80346_-	NA	NA|234aa|down_2|NZ_CP046704.1_81884_82586_+	cd14098, STKc_Rad53_Cds1, Catalytic domain of the yeast Serine/Threonine Kinases, Rad53 and Cds1	NA|72aa|down_3|NZ_CP046704.1_82702_82918_-	NA	NA|135aa|down_4|NZ_CP046704.1_83160_83565_+	NA	NA|154aa|down_5|NZ_CP046704.1_83926_84388_-	COG3837, COG3837, Uncharacterized conserved protein, contains double-stranded beta-helix domain [Function unknown]	NA|362aa|down_6|NZ_CP046704.1_84987_86073_+	cd00200, WD40, WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment	NA|632aa|down_7|NZ_CP046704.1_86191_88087_+	pfam11850, DUF3370, Protein of unknown function (DUF3370)	NA|80aa|down_8|NZ_CP046704.1_88365_88605_+	NA	NA|153aa|down_9|NZ_CP046704.1_88601_89060_+	pfam01844, HNH, HNH endonuclease
GCF_009873495.1_ASM987349v1	NZ_CP046705	Nostoc sp. ATCC 53789 plasmid pNsp_b, complete sequence	1	67816-67946	1	CRISPRCasFinder	no		RT,cas14j,cas14k	Orphan	GTTTGCAGACTAAGTGAAATTTT	23	1	1	67894-67923	NZ_CP046704.1_326878-326849	NA	2	2	Orphan	Cas9_archaeal,cas14k,cas14j,csa3,RT,DEDDh,PD-DExK,cas3,DinG,Cas14c_CAS-V-F,WYL,cas10d,csc2gr7,csc1gr5,2OG_CAS,cas6,cas4,cas1,cas2,Cas14u_CAS-V,csx3,csx1,cas10,csm3gr7,csx10gr5,csx19,c2c9_V-U4,cas8a4,cas7,cas5	NA|137aa|up_6|NZ_CP046705.1_54379_54790_+,NA|98aa|up_2|NZ_CP046705.1_64331_64625_+,NA|213aa|up_0|NZ_CP046705.1_66467_67106_+,NA|161aa|down_6|NZ_CP046705.1_75535_76018_-	NA|199aa|up_9|NZ_CP046705.1_52361_52958_-	COG3576, COG3576, Predicted flavin-nucleotide-binding protein structurally related to pyridoxine 5'-phosphate oxidase [General function prediction only]	NA|206aa|up_8|NZ_CP046705.1_52967_53585_-	cd03206, GST_C_7, C-terminal, alpha helical domain of an unknown subfamily 7 of Glutathione S-transferases	NA|190aa|up_7|NZ_CP046705.1_53738_54308_+	COG1309, AcrR, Transcriptional regulator [Transcription]	NA|137aa|up_6|NZ_CP046705.1_54379_54790_+	NA	NA|889aa|up_5|NZ_CP046705.1_56925_59592_-	cd00200, WD40, WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment	NA|623aa|up_4|NZ_CP046705.1_59648_61517_-	pfam00656, Peptidase_C14, Caspase domain	NA|671aa|up_3|NZ_CP046705.1_61653_63666_-	pfam00656, Peptidase_C14, Caspase domain	NA|98aa|up_2|NZ_CP046705.1_64331_64625_+	NA	NA|549aa|up_1|NZ_CP046705.1_64645_66292_-	COG3264, COG3264, Small-conductance mechanosensitive channel [Cell envelope biogenesis, outer membrane]	NA|213aa|up_0|NZ_CP046705.1_66467_67106_+	NA	NA|306aa|down_0|NZ_CP046705.1_68342_69260_-	pfam06527, TniQ, TniQ	NA|372aa|down_1|NZ_CP046705.1_69249_70365_-	pfam13401, AAA_22, AAA domain	NA|594aa|down_2|NZ_CP046705.1_70364_72146_-	pfam00665, rve, Integrase core domain	NA|313aa|down_3|NZ_CP046705.1_72232_73170_-	pfam13384, HTH_23, Homeodomain-like domain	NA|353aa|down_4|NZ_CP046705.1_73198_74257_-	pfam08722, Tn7_Tnp_TnsA_N, TnsA endonuclease N terminal	NA|301aa|down_5|NZ_CP046705.1_74643_75546_-	pfam06527, TniQ, TniQ	NA|161aa|down_6|NZ_CP046705.1_75535_76018_-	NA	NA|280aa|down_7|NZ_CP046705.1_76235_77075_-	TIGR02224, Tyrosine_recombinase_XerC, tyrosine recombinase XerC	NA|549aa|down_8|NZ_CP046705.1_77361_79008_+	pfam00665, rve, Integrase core domain	NA|324aa|down_9|NZ_CP046705.1_78997_79969_+	COG3267, ExeA, Type II secretory pathway, component ExeA (predicted ATPase) [Intracellular trafficking and secretion]
GCF_009873495.1_ASM987349v1	NZ_CP046705	Nostoc sp. ATCC 53789 plasmid pNsp_b, complete sequence	2	152918-153020	2	CRISPRCasFinder	no		RT,cas14j,cas14k	Orphan	ATAGCCCCCACATAGCCACCATG	23	1	1	152941-152997	NZ_CP046704.1_223113-223057	NA	1	1	Orphan	Cas9_archaeal,cas14k,cas14j,csa3,RT,DEDDh,PD-DExK,cas3,DinG,Cas14c_CAS-V-F,WYL,cas10d,csc2gr7,csc1gr5,2OG_CAS,cas6,cas4,cas1,cas2,Cas14u_CAS-V,csx3,csx1,cas10,csm3gr7,csx10gr5,csx19,c2c9_V-U4,cas8a4,cas7,cas5	NA|145aa|up_9|NZ_CP046705.1_141343_141778_+,NA|269aa|up_8|NZ_CP046705.1_141904_142711_+,NA|170aa|up_7|NZ_CP046705.1_142623_143133_+,NA|168aa|up_6|NZ_CP046705.1_143550_144054_-,NA|1596aa|up_5|NZ_CP046705.1_144128_148916_+,NA|87aa|up_4|NZ_CP046705.1_149212_149473_+,NA|97aa|up_1|NZ_CP046705.1_151503_151794_-,NA|89aa|down_1|NZ_CP046705.1_155276_155543_+,NA|57aa|down_2|NZ_CP046705.1_155634_155805_-	NA|145aa|up_9|NZ_CP046705.1_141343_141778_+	NA	NA|269aa|up_8|NZ_CP046705.1_141904_142711_+	NA	NA|170aa|up_7|NZ_CP046705.1_142623_143133_+	NA	NA|168aa|up_6|NZ_CP046705.1_143550_144054_-	NA	NA|1596aa|up_5|NZ_CP046705.1_144128_148916_+	NA	NA|87aa|up_4|NZ_CP046705.1_149212_149473_+	NA	NA|160aa|up_3|NZ_CP046705.1_149739_150219_+	pfam13274, DUF4065, Protein of unknown function (DUF4065)	NA|201aa|up_2|NZ_CP046705.1_150220_150823_+	pfam02452, PemK_toxin, PemK-like, MazF-like toxin of type II toxin-antitoxin system	NA|97aa|up_1|NZ_CP046705.1_151503_151794_-	NA	NA|320aa|up_0|NZ_CP046705.1_151799_152759_-	cd10227, ParM_like, Plasmid segregation protein ParM and similar proteins	NA|367aa|down_0|NZ_CP046705.1_153870_154971_-	pfam14239, RRXRR, RRXRR protein	NA|89aa|down_1|NZ_CP046705.1_155276_155543_+	NA	NA|57aa|down_2|NZ_CP046705.1_155634_155805_-	NA	NA|264aa|down_3|NZ_CP046705.1_156092_156884_+	COG1192, Soj, ATPases involved in chromosome partitioning [Cell division and chromosome partitioning]	NA|373aa|down_4|NZ_CP046705.1_156886_158005_+	TIGR04285, parB-like_partition_protein, nucleoid occlusion protein	NA|653aa|down_5|NZ_CP046705.1_158186_160145_+	TIGR01448, recD_rel, helicase, putative, RecD/TraA family	NA|317aa|down_6|NZ_CP046705.1_160234_161185_+	TIGR02997, RNA_polymerase_sigma_subunit_sigma70/sigma32, RNA polymerase sigma factor, cyanobacterial RpoD-like family	NA|190aa|down_7|NZ_CP046705.1_161263_161833_-	pfam13358, DDE_3, DDE superfamily endonuclease	NA|167aa|down_8|NZ_CP046705.1_161823_162324_-	pfam13565, HTH_32, Homeodomain-like domain	NA|169aa|down_9|NZ_CP046705.1_162414_162921_-	pfam09351, DUF1993, Domain of unknown function (DUF1993)
GCF_009873495.1_ASM987349v1	NZ_CP046705	Nostoc sp. ATCC 53789 plasmid pNsp_b, complete sequence	3	218622-218707	3	CRISPRCasFinder	no		RT,cas14j,cas14k	Orphan	TTTGCAATTAATGTGAGCCAAGTC	24	1	8	218646-218683|218646-218683|218646-218683|218646-218683|218646-218683|218646-218683|218646-218683|218646-218683	NZ_CP046708.1_45930-45893|NZ_CP046708.1_49872-49909|NZ_CP046705.1_76140-76103|NZ_CP046705.1_214704-214667|NZ_CP046705.1_80082-80119|NZ_CP046715.1_208-171|NZ_CP046710.1_35948-35911|NZ_CP046709.1_37651-37614	NA	1	1	Orphan	Cas9_archaeal,cas14k,cas14j,csa3,RT,DEDDh,PD-DExK,cas3,DinG,Cas14c_CAS-V-F,WYL,cas10d,csc2gr7,csc1gr5,2OG_CAS,cas6,cas4,cas1,cas2,Cas14u_CAS-V,csx3,csx1,cas10,csm3gr7,csx10gr5,csx19,c2c9_V-U4,cas8a4,cas7,cas5	NA|175aa|up_7|NZ_CP046705.1_210599_211124_+,NA|87aa|up_5|NZ_CP046705.1_212006_212267_-,NA|139aa|up_3|NZ_CP046705.1_214028_214445_-,NA|51aa|down_0|NZ_CP046705.1_218771_218924_+,NA|64aa|down_4|NZ_CP046705.1_220646_220838_-,NA|47aa|down_5|NZ_CP046705.1_220936_221077_+,NA|83aa|down_8|NZ_CP046705.1_223196_223445_-	NA|201aa|up_9|NZ_CP046705.1_207828_208431_-	pfam13358, DDE_3, DDE superfamily endonuclease	NA|441aa|up_8|NZ_CP046705.1_208974_210297_-	cd13136, MATE_DinF_like, DinF and similar proteins, a subfamily of the multidrug and toxic compound extrusion (MATE)-like proteins	NA|175aa|up_7|NZ_CP046705.1_210599_211124_+	NA	NA|240aa|up_6|NZ_CP046705.1_211251_211971_-	pfam12120, Arr-ms, Rifampin ADP-ribosyl transferase	NA|87aa|up_5|NZ_CP046705.1_212006_212267_-	NA	NA|317aa|up_4|NZ_CP046705.1_212684_213635_-	TIGR02997, RNA_polymerase_sigma_subunit_sigma70/sigma32, RNA polymerase sigma factor, cyanobacterial RpoD-like family	NA|139aa|up_3|NZ_CP046705.1_214028_214445_-	NA	NA|324aa|up_2|NZ_CP046705.1_214816_215788_-	COG3267, ExeA, Type II secretory pathway, component ExeA (predicted ATPase) [Intracellular trafficking and secretion]	NA|549aa|up_1|NZ_CP046705.1_215777_217424_-	pfam00665, rve, Integrase core domain	NA|280aa|up_0|NZ_CP046705.1_217710_218550_+	TIGR02224, Tyrosine_recombinase_XerC, tyrosine recombinase XerC	NA|51aa|down_0|NZ_CP046705.1_218771_218924_+	NA	NA|151aa|down_1|NZ_CP046705.1_218966_219419_-	pfam07154, DUF1392, Protein of unknown function (DUF1392)	NA|163aa|down_2|NZ_CP046705.1_219582_220071_-	pfam18306, LDcluster4, SLOG cluster4 family	NA|92aa|down_3|NZ_CP046705.1_220195_220471_-	pfam02148, zf-UBP, Zn-finger in ubiquitin-hydrolases and other protein	NA|64aa|down_4|NZ_CP046705.1_220646_220838_-	NA	NA|47aa|down_5|NZ_CP046705.1_220936_221077_+	NA	NA|162aa|down_6|NZ_CP046705.1_221088_221574_+	cd07242, VOC_BsYqjT, vicinal oxygen chelate (VOC) family protein similar to Bacillus subtilis YqjT	NA|411aa|down_7|NZ_CP046705.1_221880_223113_-	smart00325, RhoGEF, Guanine nucleotide exchange factor for Rho/Rac/Cdc42-like GTPases	NA|83aa|down_8|NZ_CP046705.1_223196_223445_-	NA	NA|169aa|down_9|NZ_CP046705.1_224034_224541_-	PRK00028, infC, translation initiation factor IF-3; Reviewed
GCF_009873495.1_ASM987349v1	NZ_CP046706	Nostoc sp. ATCC 53789 plasmid pNsp_c, complete sequence	1	125124-125215	1	CRISPRCasFinder	no		c2c9_V-U4	Orphan	CATGGGGGCTATATGGGGGCTGT	23	0	0	NA	NA	NA	1	1	Orphan	Cas9_archaeal,cas14k,cas14j,csa3,RT,DEDDh,PD-DExK,cas3,DinG,Cas14c_CAS-V-F,WYL,cas10d,csc2gr7,csc1gr5,2OG_CAS,cas6,cas4,cas1,cas2,Cas14u_CAS-V,csx3,csx1,cas10,csm3gr7,csx10gr5,csx19,c2c9_V-U4,cas8a4,cas7,cas5	NA|84aa|up_6|NZ_CP046706.1_117548_117800_-,NA|97aa|up_1|NZ_CP046706.1_124118_124409_+,NA|172aa|up_0|NZ_CP046706.1_124458_124974_+,NA|97aa|down_1|NZ_CP046706.1_126351_126642_+,NA|114aa|down_3|NZ_CP046706.1_127684_128026_-,NA|1597aa|down_4|NZ_CP046706.1_128315_133106_-,NA|168aa|down_5|NZ_CP046706.1_133180_133684_+,NA|134aa|down_6|NZ_CP046706.1_133979_134381_-,NA|185aa|down_7|NZ_CP046706.1_134488_135043_-,NA|271aa|down_8|NZ_CP046706.1_134925_135738_-,NA|127aa|down_9|NZ_CP046706.1_135748_136129_-	NA|502aa|up_9|NZ_CP046706.1_113353_114859_+	COG0665, DadA, Glycine/D-amino acid oxidases (deaminating) [Amino acid transport and metabolism]	NA|412aa|up_8|NZ_CP046706.1_114887_116123_-	pfam10592, AIPR, AIPR protein	NA|196aa|up_7|NZ_CP046706.1_116570_117158_+	pfam05685, Uma2, Putative restriction endonuclease	NA|84aa|up_6|NZ_CP046706.1_117548_117800_-	NA	NA|317aa|up_5|NZ_CP046706.1_117892_118843_-	TIGR02997, RNA_polymerase_sigma_subunit_sigma70/sigma32, RNA polymerase sigma factor, cyanobacterial RpoD-like family	NA|749aa|up_4|NZ_CP046706.1_118934_121181_-	TIGR01448, recD_rel, helicase, putative, RecD/TraA family	NA|391aa|up_3|NZ_CP046706.1_121437_122610_-	cd16393, SPO0J_N, Thermus thermophilus stage 0 sporulation protein J-like N-terminal domain, ParB family member	NA|276aa|up_2|NZ_CP046706.1_122609_123437_-	COG1192, Soj, ATPases involved in chromosome partitioning [Cell division and chromosome partitioning]	NA|97aa|up_1|NZ_CP046706.1_124118_124409_+	NA	NA|172aa|up_0|NZ_CP046706.1_124458_124974_+	NA	NA|320aa|down_0|NZ_CP046706.1_125386_126346_+	cd10227, ParM_like, Plasmid segregation protein ParM and similar proteins	NA|97aa|down_1|NZ_CP046706.1_126351_126642_+	NA	NA|228aa|down_2|NZ_CP046706.1_126772_127456_-	PTZ00266, PTZ00266, NIMA-related protein kinase; Provisional	NA|114aa|down_3|NZ_CP046706.1_127684_128026_-	NA	NA|1597aa|down_4|NZ_CP046706.1_128315_133106_-	NA	NA|168aa|down_5|NZ_CP046706.1_133180_133684_+	NA	NA|134aa|down_6|NZ_CP046706.1_133979_134381_-	NA	NA|185aa|down_7|NZ_CP046706.1_134488_135043_-	NA	NA|271aa|down_8|NZ_CP046706.1_134925_135738_-	NA	NA|127aa|down_9|NZ_CP046706.1_135748_136129_-	NA
GCF_009873495.1_ASM987349v1	NZ_CP046707	Nostoc sp. ATCC 53789 plasmid pNsp_d, complete sequence	1	4883-4994	1	CRISPRCasFinder	no			Orphan	GGCTTAGTCTAGGCTTAGTCCAAGCTTAGTTTAGG	35	0	0	NA	NA	NA	1	1	Orphan	Cas9_archaeal,cas14k,cas14j,csa3,RT,DEDDh,PD-DExK,cas3,DinG,Cas14c_CAS-V-F,WYL,cas10d,csc2gr7,csc1gr5,2OG_CAS,cas6,cas4,cas1,cas2,Cas14u_CAS-V,csx3,csx1,cas10,csm3gr7,csx10gr5,csx19,c2c9_V-U4,cas8a4,cas7,cas5	NA|78aa|up_6|NZ_CP046707.1_254_488_+,NA|48aa|up_5|NZ_CP046707.1_720_864_-,NA|54aa|up_4|NZ_CP046707.1_996_1158_-,NA|144aa|up_3|NZ_CP046707.1_1177_1609_-,NA|59aa|up_2|NZ_CP046707.1_2103_2280_+,NA|53aa|up_1|NZ_CP046707.1_2548_2707_-,NA|54aa|up_0|NZ_CP046707.1_4391_4553_+,NA|181aa|down_0|NZ_CP046707.1_5113_5656_-,NA|162aa|down_1|NZ_CP046707.1_5908_6394_+,NA|168aa|down_4|NZ_CP046707.1_7583_8087_-,NA|99aa|down_5|NZ_CP046707.1_8124_8421_-,NA|157aa|down_6|NZ_CP046707.1_8425_8896_-,NA|67aa|down_7|NZ_CP046707.1_8918_9119_-,NA|149aa|down_8|NZ_CP046707.1_9224_9671_-,NA|130aa|down_9|NZ_CP046707.1_9935_10325_-	NA|NA	NA	NA|NA	NA	NA|NA	NA	NA|78aa|up_6|NZ_CP046707.1_254_488_+	NA	NA|48aa|up_5|NZ_CP046707.1_720_864_-	NA	NA|54aa|up_4|NZ_CP046707.1_996_1158_-	NA	NA|144aa|up_3|NZ_CP046707.1_1177_1609_-	NA	NA|59aa|up_2|NZ_CP046707.1_2103_2280_+	NA	NA|53aa|up_1|NZ_CP046707.1_2548_2707_-	NA	NA|54aa|up_0|NZ_CP046707.1_4391_4553_+	NA	NA|181aa|down_0|NZ_CP046707.1_5113_5656_-	NA	NA|162aa|down_1|NZ_CP046707.1_5908_6394_+	NA	NA|68aa|down_2|NZ_CP046707.1_6414_6618_+	pfam01155, HypA, Hydrogenase/urease nickel incorporation, metallochaperone, hypA	NA|151aa|down_3|NZ_CP046707.1_6959_7412_+	COG0071, IbpA, Molecular chaperone (small heat shock protein) [Posttranslational modification, protein turnover, chaperones]	NA|168aa|down_4|NZ_CP046707.1_7583_8087_-	NA	NA|99aa|down_5|NZ_CP046707.1_8124_8421_-	NA	NA|157aa|down_6|NZ_CP046707.1_8425_8896_-	NA	NA|67aa|down_7|NZ_CP046707.1_8918_9119_-	NA	NA|149aa|down_8|NZ_CP046707.1_9224_9671_-	NA	NA|130aa|down_9|NZ_CP046707.1_9935_10325_-	NA
GCF_009873495.1_ASM987349v1	NZ_CP046708	Nostoc sp. ATCC 53789 plasmid pNsp_e, complete sequence	1	26685-26801	1	CRISPRCasFinder	no			Orphan	AGTTGCATTTAATCCTGGCTCAA	23	0	0	NA	NA	NA	2	2	Orphan	Cas9_archaeal,cas14k,cas14j,csa3,RT,DEDDh,PD-DExK,cas3,DinG,Cas14c_CAS-V-F,WYL,cas10d,csc2gr7,csc1gr5,2OG_CAS,cas6,cas4,cas1,cas2,Cas14u_CAS-V,csx3,csx1,cas10,csm3gr7,csx10gr5,csx19,c2c9_V-U4,cas8a4,cas7,cas5	NA|101aa|up_9|NZ_CP046708.1_14894_15197_+,NA|107aa|up_6|NZ_CP046708.1_21866_22187_-,NA|152aa|up_3|NZ_CP046708.1_23884_24340_+,NA|134aa|down_2|NZ_CP046708.1_31650_32052_+,NA|550aa|down_3|NZ_CP046708.1_32235_33885_+,NA|96aa|down_7|NZ_CP046708.1_37353_37641_+,NA|171aa|down_9|NZ_CP046708.1_44003_44516_+	NA|101aa|up_9|NZ_CP046708.1_14894_15197_+	NA	NA|1443aa|up_8|NZ_CP046708.1_15294_19623_-	cd00200, WD40, WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment	NA|739aa|up_7|NZ_CP046708.1_19648_21865_-	pfam14516, AAA_35, AAA-like domain	NA|107aa|up_6|NZ_CP046708.1_21866_22187_-	NA	NA|148aa|up_5|NZ_CP046708.1_22276_22720_-	PRK00236, xerC, site-specific tyrosine recombinase XerC; Reviewed	NA|255aa|up_4|NZ_CP046708.1_23127_23892_+	COG1192, Soj, ATPases involved in chromosome partitioning [Cell division and chromosome partitioning]	NA|152aa|up_3|NZ_CP046708.1_23884_24340_+	NA	NA|105aa|up_2|NZ_CP046708.1_24719_25034_-	COG3668, ParE, Plasmid stabilization system protein [General function prediction only]	NA|95aa|up_1|NZ_CP046708.1_25034_25319_-	COG3609, COG3609, Predicted transcriptional regulators containing the CopG/Arc/MetJ DNA-binding domain [Transcription]	NA|203aa|up_0|NZ_CP046708.1_25532_26141_+	pfam00589, Phage_integrase, Phage integrase family	NA|209aa|down_0|NZ_CP046708.1_27572_28199_-	smart00857, Resolvase, Resolvase, N terminal domain	NA|1002aa|down_1|NZ_CP046708.1_28301_31307_+	pfam01526, DDE_Tnp_Tn3, Tn3 transposase DDE domain	NA|134aa|down_2|NZ_CP046708.1_31650_32052_+	NA	NA|550aa|down_3|NZ_CP046708.1_32235_33885_+	NA	NA|513aa|down_4|NZ_CP046708.1_34167_35706_+	cd00737, lyz_endolysin_autolysin, endolysin and autolysin	NA|179aa|down_5|NZ_CP046708.1_35748_36285_-	COG5433, COG5433, Transposase [DNA replication, recombination, and repair]	NA|186aa|down_6|NZ_CP046708.1_36281_36839_-	pfam13808, DDE_Tnp_1_assoc, DDE_Tnp_1-associated	NA|96aa|down_7|NZ_CP046708.1_37353_37641_+	NA	NA|108aa|down_8|NZ_CP046708.1_43135_43459_-	pfam05713, MobC, Bacterial mobilisation protein (MobC)	NA|171aa|down_9|NZ_CP046708.1_44003_44516_+	NA
GCF_009873495.1_ASM987349v1	NZ_CP046709	Nostoc sp. ATCC 53789 plasmid pNsp_f, complete sequence	1	41569-41654	1	CRISPRCasFinder	no			Orphan	TTTGCAATTAATGTGAGCCAAGTC	24	1	8	41593-41630|41593-41630|41593-41630|41593-41630|41593-41630|41593-41630|41593-41630|41593-41630	NZ_CP046708.1_45930-45893|NZ_CP046708.1_49872-49909|NZ_CP046705.1_76140-76103|NZ_CP046705.1_214704-214667|NZ_CP046705.1_80082-80119|NZ_CP046715.1_208-171|NZ_CP046710.1_35948-35911|NZ_CP046709.1_37651-37614	NA	1	1	Orphan	Cas9_archaeal,cas14k,cas14j,csa3,RT,DEDDh,PD-DExK,cas3,DinG,Cas14c_CAS-V-F,WYL,cas10d,csc2gr7,csc1gr5,2OG_CAS,cas6,cas4,cas1,cas2,Cas14u_CAS-V,csx3,csx1,cas10,csm3gr7,csx10gr5,csx19,c2c9_V-U4,cas8a4,cas7,cas5	NA|92aa|up_9|NZ_CP046709.1_34723_34999_-,NA|74aa|up_8|NZ_CP046709.1_34995_35217_-,NA|96aa|up_7|NZ_CP046709.1_35795_36083_-,NA|85aa|up_6|NZ_CP046709.1_36079_36334_-,NA|123aa|up_4|NZ_CP046709.1_36775_37144_+,NA|106aa|up_3|NZ_CP046709.1_37147_37465_+,NA|112aa|down_0|NZ_CP046709.1_41718_42054_+,NA|66aa|down_1|NZ_CP046709.1_42160_42358_+,NA|63aa|down_2|NZ_CP046709.1_42514_42703_+,NA|222aa|down_3|NZ_CP046709.1_42695_43361_+,NA|533aa|down_5|NZ_CP046709.1_43930_45529_+,NA|54aa|down_7|NZ_CP046709.1_46138_46300_-,NA|322aa|down_9|NZ_CP046709.1_47655_48621_+	NA|92aa|up_9|NZ_CP046709.1_34723_34999_-	NA	NA|74aa|up_8|NZ_CP046709.1_34995_35217_-	NA	NA|96aa|up_7|NZ_CP046709.1_35795_36083_-	NA	NA|85aa|up_6|NZ_CP046709.1_36079_36334_-	NA	NA|66aa|up_5|NZ_CP046709.1_36330_36528_-	pfam07878, RHH_5, CopG-like RHH_1 or ribbon-helix-helix domain, RHH_5	NA|123aa|up_4|NZ_CP046709.1_36775_37144_+	NA	NA|106aa|up_3|NZ_CP046709.1_37147_37465_+	NA	NA|324aa|up_2|NZ_CP046709.1_37763_38735_-	COG3267, ExeA, Type II secretory pathway, component ExeA (predicted ATPase) [Intracellular trafficking and secretion]	NA|549aa|up_1|NZ_CP046709.1_38724_40371_-	pfam00665, rve, Integrase core domain	NA|280aa|up_0|NZ_CP046709.1_40657_41497_+	TIGR02224, Tyrosine_recombinase_XerC, tyrosine recombinase XerC	NA|112aa|down_0|NZ_CP046709.1_41718_42054_+	NA	NA|66aa|down_1|NZ_CP046709.1_42160_42358_+	NA	NA|63aa|down_2|NZ_CP046709.1_42514_42703_+	NA	NA|222aa|down_3|NZ_CP046709.1_42695_43361_+	NA	NA|151aa|down_4|NZ_CP046709.1_43357_43810_+	cd14807, RAP_D2, Domain 2 of receptor-associated protein (RAP)	NA|533aa|down_5|NZ_CP046709.1_43930_45529_+	NA	NA|142aa|down_6|NZ_CP046709.1_45605_46031_+	pfam13455, MUG113, Meiotically up-regulated gene 113	NA|54aa|down_7|NZ_CP046709.1_46138_46300_-	NA	NA|347aa|down_8|NZ_CP046709.1_46532_47573_+	pfam16684, Telomere_res, Telomere resolvase	NA|322aa|down_9|NZ_CP046709.1_47655_48621_+	NA
GCF_009873495.1_ASM987349v1	NZ_CP046710	Nostoc sp. ATCC 53789 plasmid pNsp_g, complete sequence	1	39866-39951	1	CRISPRCasFinder	no			Orphan	TTTGCAATTAATGTGAGCCAAGTC	24	1	8	39890-39927|39890-39927|39890-39927|39890-39927|39890-39927|39890-39927|39890-39927|39890-39927	NZ_CP046708.1_45930-45893|NZ_CP046708.1_49872-49909|NZ_CP046705.1_76140-76103|NZ_CP046705.1_214704-214667|NZ_CP046705.1_80082-80119|NZ_CP046715.1_208-171|NZ_CP046710.1_35948-35911|NZ_CP046709.1_37651-37614	NA	1	1	Orphan	Cas9_archaeal,cas14k,cas14j,csa3,RT,DEDDh,PD-DExK,cas3,DinG,Cas14c_CAS-V-F,WYL,cas10d,csc2gr7,csc1gr5,2OG_CAS,cas6,cas4,cas1,cas2,Cas14u_CAS-V,csx3,csx1,cas10,csm3gr7,csx10gr5,csx19,c2c9_V-U4,cas8a4,cas7,cas5	NA|248aa|up_5|NZ_CP046710.1_32770_33514_+,NA|69aa|up_4|NZ_CP046710.1_33577_33784_-,NA|637aa|up_3|NZ_CP046710.1_33878_35789_-,NA|239aa|down_0|NZ_CP046710.1_40653_41370_-,NA|137aa|down_1|NZ_CP046710.1_41369_41780_-,NA|86aa|down_2|NZ_CP046710.1_41781_42039_-,NA|79aa|down_3|NZ_CP046710.1_42242_42479_-,NA|94aa|down_4|NZ_CP046710.1_42523_42805_-,NA|305aa|down_8|NZ_CP046710.1_48178_49093_+	NA|341aa|up_9|NZ_CP046710.1_27976_28999_-	pfam03432, Relaxase, Relaxase/Mobilisation nuclease domain	NA|151aa|up_8|NZ_CP046710.1_28976_29429_-	pfam05713, MobC, Bacterial mobilisation protein (MobC)	NA|165aa|up_7|NZ_CP046710.1_31474_31969_+	pfam09726, Macoilin, Macoilin family	NA|274aa|up_6|NZ_CP046710.1_31965_32787_+	cd05386, TraL, transfer origin protein TraL	NA|248aa|up_5|NZ_CP046710.1_32770_33514_+	NA	NA|69aa|up_4|NZ_CP046710.1_33577_33784_-	NA	NA|637aa|up_3|NZ_CP046710.1_33878_35789_-	NA	NA|324aa|up_2|NZ_CP046710.1_36060_37032_-	COG3267, ExeA, Type II secretory pathway, component ExeA (predicted ATPase) [Intracellular trafficking and secretion]	NA|549aa|up_1|NZ_CP046710.1_37021_38668_-	pfam00665, rve, Integrase core domain	NA|280aa|up_0|NZ_CP046710.1_38954_39794_+	TIGR02224, Tyrosine_recombinase_XerC, tyrosine recombinase XerC	NA|239aa|down_0|NZ_CP046710.1_40653_41370_-	NA	NA|137aa|down_1|NZ_CP046710.1_41369_41780_-	NA	NA|86aa|down_2|NZ_CP046710.1_41781_42039_-	NA	NA|79aa|down_3|NZ_CP046710.1_42242_42479_-	NA	NA|94aa|down_4|NZ_CP046710.1_42523_42805_-	NA	NA|346aa|down_5|NZ_CP046710.1_42804_43842_-	cd10227, ParM_like, Plasmid segregation protein ParM and similar proteins	NA|50aa|down_6|NZ_CP046710.1_44099_44249_+	pfam09274, ParG, ParG	NA|1037aa|down_7|NZ_CP046710.1_44824_47935_+	pfam08706, D5_N, D5 N terminal like	NA|305aa|down_8|NZ_CP046710.1_48178_49093_+	NA	NA|643aa|down_9|NZ_CP046710.1_51856_53785_+	COG0367, AsnB, Asparagine synthase (glutamine-hydrolyzing) [Amino acid transport and metabolism]
GCF_009873495.1_ASM987349v1	NZ_CP046711	Nostoc sp. ATCC 53789 plasmid pNsp_h, complete sequence	1	17185-17252	1	CRISPRCasFinder	no			Orphan	TTCGCCCATAACGTTATGGGTAG	23	0	0	NA	NA	NA	1	1	Orphan	Cas9_archaeal,cas14k,cas14j,csa3,RT,DEDDh,PD-DExK,cas3,DinG,Cas14c_CAS-V-F,WYL,cas10d,csc2gr7,csc1gr5,2OG_CAS,cas6,cas4,cas1,cas2,Cas14u_CAS-V,csx3,csx1,cas10,csm3gr7,csx10gr5,csx19,c2c9_V-U4,cas8a4,cas7,cas5	NA|58aa|up_2|NZ_CP046711.1_11691_11865_+,NA|154aa|up_1|NZ_CP046711.1_13181_13643_+,NA|287aa|down_1|NZ_CP046711.1_18878_19739_-,NA|59aa|down_4|NZ_CP046711.1_22719_22896_+,NA|216aa|down_5|NZ_CP046711.1_22962_23610_-,NA|80aa|down_6|NZ_CP046711.1_23747_23987_-,NA|81aa|down_7|NZ_CP046711.1_24454_24697_+,NA|245aa|down_9|NZ_CP046711.1_28458_29193_-	NA|606aa|up_9|NZ_CP046711.1_636_2454_+	COG0631, PTC1, Serine/threonine protein phosphatase [Signal transduction mechanisms]	NA|715aa|up_8|NZ_CP046711.1_2496_4641_+	cd14014, STKc_PknB_like, Catalytic domain of bacterial Serine/Threonine kinases, PknB and similar proteins	NA|783aa|up_7|NZ_CP046711.1_4959_7308_-	COG0744, MrcB, Membrane carboxypeptidase (penicillin-binding protein) [Cell envelope biogenesis, outer membrane]	NA|283aa|up_6|NZ_CP046711.1_7703_8552_+	PRK00236, xerC, site-specific tyrosine recombinase XerC; Reviewed	NA|247aa|up_5|NZ_CP046711.1_8823_9564_-	pfam13365, Trypsin_2, Trypsin-like peptidase domain	NA|399aa|up_4|NZ_CP046711.1_9800_10997_-	pfam13365, Trypsin_2, Trypsin-like peptidase domain	NA|184aa|up_3|NZ_CP046711.1_11013_11565_-	pfam14218, COP23, Circadian oscillating protein COP23	NA|58aa|up_2|NZ_CP046711.1_11691_11865_+	NA	NA|154aa|up_1|NZ_CP046711.1_13181_13643_+	NA	NA|957aa|up_0|NZ_CP046711.1_13757_16628_-	cd18808, SF1_C_Upf1, C-terminal helicase domain of Upf1-like family helicases	NA|252aa|down_0|NZ_CP046711.1_17828_18584_-	COG1192, Soj, ATPases involved in chromosome partitioning [Cell division and chromosome partitioning]	NA|287aa|down_1|NZ_CP046711.1_18878_19739_-	NA	NA|511aa|down_2|NZ_CP046711.1_20018_21551_-	pfam03432, Relaxase, Relaxase/Mobilisation nuclease domain	NA|137aa|down_3|NZ_CP046711.1_21525_21936_-	pfam05713, MobC, Bacterial mobilisation protein (MobC)	NA|59aa|down_4|NZ_CP046711.1_22719_22896_+	NA	NA|216aa|down_5|NZ_CP046711.1_22962_23610_-	NA	NA|80aa|down_6|NZ_CP046711.1_23747_23987_-	NA	NA|81aa|down_7|NZ_CP046711.1_24454_24697_+	NA	NA|1182aa|down_8|NZ_CP046711.1_24636_28182_-	pfam08707, PriCT_2, Primase C terminal 2 (PriCT-2)	NA|245aa|down_9|NZ_CP046711.1_28458_29193_-	NA
GCF_009873495.1_ASM987349v1	NZ_CP046715	Nostoc sp. ATCC 53789 plasmid pNsp_l, complete sequence	1	4126-4211	1	CRISPRCasFinder	no			Orphan	TTTGCAATTAATGTGAGCCAAGTC	24	1	8	4150-4187|4150-4187|4150-4187|4150-4187|4150-4187|4150-4187|4150-4187|4150-4187	NZ_CP046708.1_45930-45893|NZ_CP046708.1_49872-49909|NZ_CP046705.1_76140-76103|NZ_CP046705.1_214704-214667|NZ_CP046705.1_80082-80119|NZ_CP046715.1_208-171|NZ_CP046710.1_35948-35911|NZ_CP046709.1_37651-37614	NA	1	1	Orphan	Cas9_archaeal,cas14k,cas14j,csa3,RT,DEDDh,PD-DExK,cas3,DinG,Cas14c_CAS-V-F,WYL,cas10d,csc2gr7,csc1gr5,2OG_CAS,cas6,cas4,cas1,cas2,Cas14u_CAS-V,csx3,csx1,cas10,csm3gr7,csx10gr5,csx19,c2c9_V-U4,cas8a4,cas7,cas5	NA,NA|130aa|down_0|NZ_CP046715.1_4275_4665_+,NA|85aa|down_1|NZ_CP046715.1_4699_4954_+,NA|230aa|down_2|NZ_CP046715.1_5129_5819_-,NA|50aa|down_7|NZ_CP046715.1_9306_9456_-,NA|305aa|down_9|NZ_CP046715.1_11051_11966_-	NA|NA	NA	NA|NA	NA	NA|NA	NA	NA|NA	NA	NA|NA	NA	NA|NA	NA	NA|NA	NA	NA|324aa|up_2|NZ_CP046715.1_320_1292_-	COG3267, ExeA, Type II secretory pathway, component ExeA (predicted ATPase) [Intracellular trafficking and secretion]	NA|549aa|up_1|NZ_CP046715.1_1281_2928_-	pfam00665, rve, Integrase core domain	NA|280aa|up_0|NZ_CP046715.1_3214_4054_+	TIGR02224, Tyrosine_recombinase_XerC, tyrosine recombinase XerC	NA|130aa|down_0|NZ_CP046715.1_4275_4665_+	NA	NA|85aa|down_1|NZ_CP046715.1_4699_4954_+	NA	NA|230aa|down_2|NZ_CP046715.1_5129_5819_-	NA	NA|376aa|down_3|NZ_CP046715.1_5820_6948_-	pfam13304, AAA_21, AAA domain, putative AbiEii toxin, Type IV TA system	NA|201aa|down_4|NZ_CP046715.1_7236_7839_-	cd01197, INT_FimBE_like, FimB and FimE and related proteins, integrase/recombinases	NA|247aa|down_5|NZ_CP046715.1_8140_8881_-	pfam13649, Methyltransf_25, Methyltransferase domain	NA|34aa|down_6|NZ_CP046715.1_9027_9129_-	pfam08846, DUF1816, Domain of unknown function (DUF1816)	NA|50aa|down_7|NZ_CP046715.1_9306_9456_-	NA	NA|361aa|down_8|NZ_CP046715.1_9905_10988_+	TIGR01151, Photosystem_QB_protein, photosystem II, DI subunit (also called Q(B))	NA|305aa|down_9|NZ_CP046715.1_11051_11966_-	NA
