assembly_id	genome_id	genome_def	crispr_array_locus_merge	crispr_array_location_merge	crispr_locus_id	crispr_pred_method	array_in_prot	prot_within_array_20000	prot_in_genome	crispr_type_by_cas_prot	consensus_repeat	repeat_length	self-targeting_spacer_number	self-targeting_target_number	spacer_location	protospacer_location	repeat_type	spacer_locus_num	spacer_num	correct_crispr_type	genome_cas_prots	unknown_protein_around_crispr	L10	L10_domain	L9	L9_domain	L8	L8_domain	L7	L7_domain	L6	L6_domain	L5	L5_domain	L4	L4_domain	L3	L3_domain	L2	L2_domain	L1	L1_domain	R1	R1_domain	R2	R2_domain	R3	R3_domain	R4	R4_domain	R5	R5_domain	R6	R6_domain	R7	R7_domain	R8	R8_domain	R9	R9_domain	R10	R10_domain
GCF_001548115.1_Gm3709_assembly_1.0	NZ_AP014821	Geminocystis sp. NIES-3709	1	225402-225617	1	PILER-CR	no		RT,DinG,csa3,c2c5_V-U5,cas3,WYL,PD-DExK,cas6,cas4,cas1,DEDDh	Orphan	GGGAATAAAGTTGTATAAT	19	0	0	NA	NA	NA	3	3	Orphan	RT,DinG,csa3,c2c5_V-U5,cas3,WYL,PD-DExK,cas6,cas4,cas1,DEDDh	NA,NA	NA|119aa|up_9|NZ_AP014821.1_201877_202234_-	pfam08844, DUF1815, Domain of unknown function (DUF1815)	NA|362aa|up_8|NZ_AP014821.1_202653_203739_-	PRK00772, PRK00772, 3-isopropylmalate dehydrogenase; Provisional	NA|462aa|up_7|NZ_AP014821.1_203846_205232_-	TIGR01981, UPF0051_protein_Rv1462/MT1509, FeS assembly protein SufD	NA|257aa|up_6|NZ_AP014821.1_205314_206085_-	CHL00131, ycf16, sulfate ABC transporter protein; Validated	NA|673aa|up_5|NZ_AP014821.1_206312_208331_-	PRK00208, thiG, thiazole synthase; Reviewed	NA|2714aa|up_4|NZ_AP014821.1_208777_216919_+	NF033203, entero_EhxA, enterohemolysin EhxA	NA|870aa|up_3|NZ_AP014821.1_217315_219925_+	PRK00390, leuS, leucyl-tRNA synthetase; Validated	NA|479aa|up_2|NZ_AP014821.1_219990_221427_-	pfam00990, GGDEF, Diguanylate cyclase, GGDEF domain	NA|928aa|up_1|NZ_AP014821.1_221511_224295_-	TIGR00836, Ammonium_transporter, ammonium transporter	NA|244aa|up_0|NZ_AP014821.1_224360_225092_+	pfam02683, DsbD, Cytochrome C biogenesis protein transmembrane region	NA|280aa|down_0|NZ_AP014821.1_225822_226662_-	pfam13359, DDE_Tnp_4, DDE superfamily endonuclease	NA|626aa|down_1|NZ_AP014821.1_226996_228874_-	PRK00331, PRK00331, isomerizing glutamine--fructose-6-phosphate transaminase	NA|320aa|down_2|NZ_AP014821.1_229127_230087_-	COG1357, COG1357, Pentapeptide repeats containing protein [Function unknown]	NA|104aa|down_3|NZ_AP014821.1_230142_230454_-	TIGR02181, GRX_bact, Glutaredoxin, GrxC family	NA|336aa|down_4|NZ_AP014821.1_230500_231508_-	COG0697, RhaT, Permeases of the drug/metabolite transporter (DMT) superfamily [Carbohydrate transport and metabolism / Amino acid transport and metabolism / General function prediction only]	NA|155aa|down_5|NZ_AP014821.1_231639_232104_+	pfam01668, SmpB, SmpB protein	NA|165aa|down_6|NZ_AP014821.1_232197_232692_+	pfam01641, SelR, SelR domain	NA|196aa|down_7|NZ_AP014821.1_232693_233281_+	pfam01625, PMSR, Peptide methionine sulfoxide reductase	NA|315aa|down_8|NZ_AP014821.1_233660_234605_-	cd04187, DPM1_like_bac, Bacterial DPM1_like enzymes are related to eukaryotic DPM1	NA|162aa|down_9|NZ_AP014821.1_234727_235213_-	COG1399, COG1399, Predicted metal-binding, possibly nucleic acid-binding protein [General function prediction only]
GCF_001548115.1_Gm3709_assembly_1.0	NZ_AP014821	Geminocystis sp. NIES-3709	2	689986-690321	2,1,1	PILER-CR,CRISPRCasFinder,CRT	no	c2c5_V-U5	RT,DinG,csa3,c2c5_V-U5,cas3,WYL,PD-DExK,cas6,cas4,cas1,DEDDh	Type V-U5	CTTTCAATCCCTAGAGAAGGTATTTTCTTATTTCAAC,CTTTCAATCCCTAGAGAAGGTATTTTCTTATTTCAAC,CTTTCAATCCCTAGAGAAGGTATTTTCTTATTTCAAC	37,37,37	0	0	NA	NA	NA:NA:NA	4,4,4	4	TypeV-U5	RT,DinG,csa3,c2c5_V-U5,cas3,WYL,PD-DExK,cas6,cas4,cas1,DEDDh	NA|93aa|up_2|NZ_AP014821.1_688186_688465_-,c2c5_V-U5|617aa|down_0|NZ_AP014821.1_690779_692630_-,NA|58aa|down_2|NZ_AP014821.1_693169_693343_-	NA|104aa|up_9|NZ_AP014821.1_683230_683542_-	pfam12159, DUF3593, Protein of unknown function (DUF3593)	NA|185aa|up_8|NZ_AP014821.1_683765_684320_-	cd03017, PRX_BCP, Peroxiredoxin (PRX) family, Bacterioferritin comigratory protein (BCP) subfamily; composed of  thioredoxin-dependent thiol peroxidases, widely expressed in pathogenic bacteria, that protect cells against toxicity from reactive oxygen species by reducing and detoxifying hydroperoxides	NA|150aa|up_7|NZ_AP014821.1_684391_684841_-	PRK00668, ndk, mulitfunctional nucleoside diphosphate kinase/apyrimidinic endonuclease/3'-; Validated	NA|142aa|up_6|NZ_AP014821.1_684969_685395_-	COG0432, COG0432, Uncharacterized conserved protein [Function unknown]	NA|149aa|up_5|NZ_AP014821.1_685527_685974_-	COG1357, COG1357, Pentapeptide repeats containing protein [Function unknown]	NA|314aa|up_4|NZ_AP014821.1_686194_687136_-	PLN02823, PLN02823, spermine synthase	NA|230aa|up_3|NZ_AP014821.1_687434_688124_+	pfam14014, DUF4230, Protein of unknown function (DUF4230)	NA|93aa|up_2|NZ_AP014821.1_688186_688465_-	NA	NA|68aa|up_1|NZ_AP014821.1_688699_688903_+	pfam05421, DUF751, Protein of unknown function (DUF751)	NA|126aa|up_0|NZ_AP014821.1_688958_689336_+	PRK00521, rbfA, 30S ribosome-binding factor RbfA	c2c5_V-U5|617aa|down_0|NZ_AP014821.1_690779_692630_-	NA	NA|140aa|down_1|NZ_AP014821.1_692693_693113_+	cd01105, HTH_GlnR-like, Helix-Turn-Helix DNA binding domain of GlnR-like transcription regulators	NA|58aa|down_2|NZ_AP014821.1_693169_693343_-	NA	NA|161aa|down_3|NZ_AP014821.1_695885_696368_-	pfam06527, TniQ, TniQ	NA|276aa|down_4|NZ_AP014821.1_696367_697195_-	COG2842, COG2842, Uncharacterized ATPase, putative transposase [General function prediction only]	NA|556aa|down_5|NZ_AP014821.1_697204_698872_-	pfam09299, Mu-transpos_C, Mu transposase, C-terminal	NA|472aa|down_6|NZ_AP014821.1_699562_700978_+	pfam01590, GAF, GAF domain	NA|621aa|down_7|NZ_AP014821.1_701500_703363_+	PRK07418, PRK07418, acetolactate synthase large subunit	NA|390aa|down_8|NZ_AP014821.1_703503_704673_+	COG0276, HemH, Protoheme ferro-lyase (ferrochelatase) [Coenzyme metabolism]	NA|275aa|down_9|NZ_AP014821.1_704783_705608_-	pfam13649, Methyltransf_25, Methyltransferase domain
GCF_001548115.1_Gm3709_assembly_1.0	NZ_AP014821	Geminocystis sp. NIES-3709	3	1556062-1556393	3,2,2	PILER-CR,CRISPRCasFinder,CRT	no	c2c5_V-U5	RT,DinG,csa3,c2c5_V-U5,cas3,WYL,PD-DExK,cas6,cas4,cas1,DEDDh	Type V-U5	GTTGCAGATGAATTTACTTCTCTGTGCGATCGAAAG,GTTGCAGATGAATTTACTTCTCTGTGCGATCGAAAG,GTTGCAGATGAATTTACTTCTCTGTGCGATCGAAAG	36,36,36	0	0	NA	NA	NA:NA:NA	4,4,4	4	TypeV-U5	RT,DinG,csa3,c2c5_V-U5,cas3,WYL,PD-DExK,cas6,cas4,cas1,DEDDh	NA|50aa|up_5|NZ_AP014821.1_1549798_1549948_-,NA|73aa|up_4|NZ_AP014821.1_1550060_1550279_-,c2c5_V-U5|608aa|up_0|NZ_AP014821.1_1553901_1555725_+,NA|57aa|down_4|NZ_AP014821.1_1560691_1560862_+	NA|280aa|up_9|NZ_AP014821.1_1543390_1544230_+	pfam13359, DDE_Tnp_4, DDE superfamily endonuclease	NA|872aa|up_8|NZ_AP014821.1_1544456_1547072_+	pfam07669, Eco57I, Eco57I restriction-modification methylase	NA|441aa|up_7|NZ_AP014821.1_1547131_1548454_-	cd08639, DNA_pol_A_Aquificae_like, Phylum Aquificae Pol A is different from Escherichia coli  Pol A by three signature sequences	NA|59aa|up_6|NZ_AP014821.1_1548426_1548603_-	pfam07592, DDE_Tnp_ISAZ013, Rhodopirellula transposase DDE domain	NA|50aa|up_5|NZ_AP014821.1_1549798_1549948_-	NA	NA|73aa|up_4|NZ_AP014821.1_1550060_1550279_-	NA	NA|113aa|up_3|NZ_AP014821.1_1550632_1550971_-	cd17260, RMtype1_S_EcoEI-TRD1-CR1_like, Type I restriction-modification system specificity (S) subunit Target Recognition Domain-ConseRved domain (TRD-CR), similar to S	NA|791aa|up_2|NZ_AP014821.1_1550967_1553340_-	COG4096, HsdR, Type I site-specific restriction-modification system, R (restriction) subunit and related helicases [Defense mechanisms]	NA|158aa|up_1|NZ_AP014821.1_1553336_1553810_-	cd01105, HTH_GlnR-like, Helix-Turn-Helix DNA binding domain of GlnR-like transcription regulators	c2c5_V-U5|608aa|up_0|NZ_AP014821.1_1553901_1555725_+	NA	NA|211aa|down_0|NZ_AP014821.1_1556829_1557462_-	PRK14419, PRK14419, membrane protein; Provisional	NA|182aa|down_1|NZ_AP014821.1_1557544_1558090_+	cd07503, HAD_HisB-N, histidinol phosphate phosphatase and related phosphatases	NA|310aa|down_2|NZ_AP014821.1_1558274_1559204_+	pfam13354, Beta-lactamase2, Beta-lactamase enzyme family	NA|355aa|down_3|NZ_AP014821.1_1559422_1560487_+	cd13542, PBP2_FutA1_ilke, Substrate binding domain of ferric iron-binding protein, a member of the type 2 periplasmic binding fold superfamily	NA|57aa|down_4|NZ_AP014821.1_1560691_1560862_+	NA	NA|356aa|down_5|NZ_AP014821.1_1561198_1562266_-	TIGR01151, Photosystem_QB_protein, photosystem II, DI subunit (also called Q(B))	NA|231aa|down_6|NZ_AP014821.1_1562509_1563202_-	COG0664, Crp, cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases [Signal transduction mechanisms]	NA|498aa|down_7|NZ_AP014821.1_1563310_1564804_-	cd01949, GGDEF, Diguanylate-cyclase (DGC) or GGDEF domain	NA|216aa|down_8|NZ_AP014821.1_1565010_1565658_-	COG0664, Crp, cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases [Signal transduction mechanisms]	NA|280aa|down_9|NZ_AP014821.1_1565792_1566632_+	pfam13359, DDE_Tnp_4, DDE superfamily endonuclease
GCF_001548115.1_Gm3709_assembly_1.0	NZ_AP014821	Geminocystis sp. NIES-3709	4	1671225-1671329	3	CRISPRCasFinder	no		RT,DinG,csa3,c2c5_V-U5,cas3,WYL,PD-DExK,cas6,cas4,cas1,DEDDh	Orphan	CATCGATCGCAAAGTCGCATTGCT	24	0	0	NA	NA	NA	1	1	Orphan	RT,DinG,csa3,c2c5_V-U5,cas3,WYL,PD-DExK,cas6,cas4,cas1,DEDDh	NA|53aa|up_5|NZ_AP014821.1_1666533_1666692_+,NA|73aa|down_0|NZ_AP014821.1_1671519_1671738_-,NA|77aa|down_2|NZ_AP014821.1_1672990_1673221_-,NA|98aa|down_3|NZ_AP014821.1_1673265_1673559_-,NA|104aa|down_8|NZ_AP014821.1_1679247_1679559_-,NA|80aa|down_9|NZ_AP014821.1_1679683_1679923_+	NA|499aa|up_9|NZ_AP014821.1_1662113_1663610_-	cd06160, S2P-M50_like_2, Uncharacterized homologs of Site-2 protease (S2P), zinc metalloproteases (MEROPS family M50) which cleave transmembrane domains of substrate proteins, regulating intramembrane proteolysis (RIP) of diverse signal transduction mechanisms	NA|231aa|up_8|NZ_AP014821.1_1663743_1664436_+	cd07727, YmaE-like_MBL-fold, uncharacterized subgroup which includes Bacillus subtilis YmaE and related proteins; MBL-fold metallo hydrolase domain	NA|37aa|up_7|NZ_AP014821.1_1664612_1664723_+	pfam08041, PetM, PetM family of cytochrome b6f complex subunit 7	NA|379aa|up_6|NZ_AP014821.1_1665278_1666415_+	COG1104, NifS, Cysteine sulfinate desulfinase/cysteine desulfurase and related enzymes [Amino acid transport and metabolism]	NA|53aa|up_5|NZ_AP014821.1_1666533_1666692_+	NA	NA|182aa|up_4|NZ_AP014821.1_1667086_1667632_+	PRK00889, PRK00889, adenylylsulfate kinase; Provisional	NA|256aa|up_3|NZ_AP014821.1_1667933_1668701_+	pfam01940, DUF92, Integral membrane protein DUF92	NA|228aa|up_2|NZ_AP014821.1_1668713_1669397_+	cd03141, GATase1_Hsp31_like, Type 1 glutamine amidotransferase (GATase1)-like domain found in proteins similar to Escherichia coli Hsp31 protein	NA|228aa|up_1|NZ_AP014821.1_1669849_1670533_+	pfam05419, GUN4, GUN4-like	NA|159aa|up_0|NZ_AP014821.1_1670585_1671062_+	pfam08847, Crr6, Chlororespiratory reduction 6	NA|73aa|down_0|NZ_AP014821.1_1671519_1671738_-	NA	NA|342aa|down_1|NZ_AP014821.1_1671945_1672971_-	pfam11981, DUF3482, Domain of unknown function (DUF3482)	NA|77aa|down_2|NZ_AP014821.1_1672990_1673221_-	NA	NA|98aa|down_3|NZ_AP014821.1_1673265_1673559_-	NA	NA|994aa|down_4|NZ_AP014821.1_1673625_1676607_-	PLN02843, PLN02843, isoleucyl-tRNA synthetase	NA|60aa|down_5|NZ_AP014821.1_1676699_1676879_-	PRK11815, PRK11815, tRNA dihydrouridine(20/20a) synthase DusA	NA|448aa|down_6|NZ_AP014821.1_1677048_1678392_-	COG0642, BaeS, Signal transduction histidine kinase [Signal transduction mechanisms]	NA|232aa|down_7|NZ_AP014821.1_1678517_1679213_-	COG0745, OmpR, Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain [Signal transduction mechanisms / Transcription]	NA|104aa|down_8|NZ_AP014821.1_1679247_1679559_-	NA	NA|80aa|down_9|NZ_AP014821.1_1679683_1679923_+	NA
GCF_001548115.1_Gm3709_assembly_1.0	NZ_AP014821	Geminocystis sp. NIES-3709	5	1923873-1923984	4	CRISPRCasFinder	no		RT,DinG,csa3,c2c5_V-U5,cas3,WYL,PD-DExK,cas6,cas4,cas1,DEDDh	Orphan	AGTCTCAAATGTTTGTATGGACAGGCTATCATAGTAAAC	39	0	0	NA	NA	NA	1	1	Orphan	RT,DinG,csa3,c2c5_V-U5,cas3,WYL,PD-DExK,cas6,cas4,cas1,DEDDh	NA|695aa|up_2|NZ_AP014821.1_1919675_1921760_+,NA|226aa|down_5|NZ_AP014821.1_1932759_1933437_-	NA|583aa|up_9|NZ_AP014821.1_1910038_1911787_-	COG1132, MdlB, ABC-type multidrug transport system, ATPase and permease components [Defense mechanisms]	NA|241aa|up_8|NZ_AP014821.1_1911952_1912675_-	TIGR03022, WbaP_sugtrans, Undecaprenyl-phosphate galactose phosphotransferase, WbaP	NA|343aa|up_7|NZ_AP014821.1_1913361_1914390_+	TIGR00975, precursor_PBP-3_PstS-3_Antigen_Ag88	NA|372aa|up_6|NZ_AP014821.1_1914488_1915604_-	cd12828, TmCorA-like_1, Thermotoga maritima CorA_like subfamily	NA|412aa|up_5|NZ_AP014821.1_1915790_1917026_+	COG2027, DacB, D-alanyl-D-alanine carboxypeptidase (penicillin-binding protein 4) [Cell envelope biogenesis, outer membrane]	NA|399aa|up_4|NZ_AP014821.1_1917111_1918308_-	TIGR03492, TIGR03492, conserved hypothetical protein	NA|365aa|up_3|NZ_AP014821.1_1918499_1919594_+	TIGR00378, cax, calcium/proton exchanger (cax)	NA|695aa|up_2|NZ_AP014821.1_1919675_1921760_+	NA	NA|426aa|up_1|NZ_AP014821.1_1921906_1923184_+	PRK05476, PRK05476, S-adenosyl-L-homocysteine hydrolase; Provisional	NA|167aa|up_0|NZ_AP014821.1_1923312_1923813_+	COG1357, COG1357, Pentapeptide repeats containing protein [Function unknown]	NA|718aa|down_0|NZ_AP014821.1_1925019_1927173_-	PRK11824, PRK11824, polynucleotide phosphorylase/polyadenylase; Provisional	NA|454aa|down_1|NZ_AP014821.1_1927524_1928886_-	PRK00093, PRK00093, GTP-binding protein Der; Reviewed	NA|365aa|down_2|NZ_AP014821.1_1929112_1930207_+	cd02933, OYE_like_FMN, Old yellow enzyme (OYE)-like FMN binding domain	NA|324aa|down_3|NZ_AP014821.1_1930330_1931302_+	COG0435, ECM4, Predicted glutathione S-transferase [Posttranslational modification, protein turnover, chaperones]	NA|468aa|down_4|NZ_AP014821.1_1931349_1932753_+	COG3264, COG3264, Small-conductance mechanosensitive channel [Cell envelope biogenesis, outer membrane]	NA|226aa|down_5|NZ_AP014821.1_1932759_1933437_-	NA	NA|480aa|down_6|NZ_AP014821.1_1933579_1935019_-	PRK11814, PRK11814, cysteine desulfurase activator complex subunit SufB; Provisional	NA|203aa|down_7|NZ_AP014821.1_1935644_1936253_+	CHL00113, rps4, ribosomal protein S4; Reviewed	NA|751aa|down_8|NZ_AP014821.1_1936959_1939212_+	CHL00056, psaA, photosystem I P700 chlorophyll a apoprotein A1	NA|743aa|down_9|NZ_AP014821.1_1939485_1941714_+	PRK13199, psaB, photosystem I P700 chlorophyll a apoprotein A2; Provisional
GCF_001548115.1_Gm3709_assembly_1.0	NZ_AP014821	Geminocystis sp. NIES-3709	6	2815804-2815909	5	CRISPRCasFinder	no		RT,DinG,csa3,c2c5_V-U5,cas3,WYL,PD-DExK,cas6,cas4,cas1,DEDDh	Orphan	AAATTATACCCAAATAAACCCTAAACT	27	0	0	NA	NA	NA	1	1	Orphan	RT,DinG,csa3,c2c5_V-U5,cas3,WYL,PD-DExK,cas6,cas4,cas1,DEDDh	NA|94aa|up_9|NZ_AP014821.1_2801346_2801628_+,NA|83aa|up_8|NZ_AP014821.1_2801734_2801983_-,NA|107aa|down_0|NZ_AP014821.1_2816400_2816721_+,NA|112aa|down_2|NZ_AP014821.1_2818164_2818500_+,NA|96aa|down_6|NZ_AP014821.1_2822819_2823107_+	NA|94aa|up_9|NZ_AP014821.1_2801346_2801628_+	NA	NA|83aa|up_8|NZ_AP014821.1_2801734_2801983_-	NA	NA|393aa|up_7|NZ_AP014821.1_2802239_2803418_+	COG0003, ArsA, Predicted ATPase involved in chromosome partitioning [Cell division and chromosome partitioning]	NA|180aa|up_6|NZ_AP014821.1_2803898_2804438_-	pfam13358, DDE_3, DDE superfamily endonuclease	NA|422aa|up_5|NZ_AP014821.1_2806559_2807825_+	pfam13546, DDE_5, DDE superfamily endonuclease	NA|265aa|up_4|NZ_AP014821.1_2808817_2809612_-	TIGR00726, Laccase_domain_protein, YfiH family protein	NA|235aa|up_3|NZ_AP014821.1_2809635_2810340_-	COG4636, Uma2, Endonuclease, Uma2 family (restriction endonuclease fold) [General function prediction only]	NA|333aa|up_2|NZ_AP014821.1_2811489_2812488_-	pfam13358, DDE_3, DDE superfamily endonuclease	NA|488aa|up_1|NZ_AP014821.1_2813169_2814633_+	COG1649, COG1649, Uncharacterized protein conserved in bacteria [Function unknown]	NA|296aa|up_0|NZ_AP014821.1_2814842_2815730_+	COG1131, CcmA, ABC-type multidrug transport system, ATPase component [Defense mechanisms]	NA|107aa|down_0|NZ_AP014821.1_2816400_2816721_+	NA	NA|289aa|down_1|NZ_AP014821.1_2817055_2817922_-	cd07325, M48_Ste24p_like, M48 Ste24 endopeptidase-like, integral membrane metallopeptidase	NA|112aa|down_2|NZ_AP014821.1_2818164_2818500_+	NA	NA|311aa|down_3|NZ_AP014821.1_2819084_2820017_-	PLN02578, PLN02578, hydrolase	NA|598aa|down_4|NZ_AP014821.1_2820146_2821940_-	COG1217, TypA, Predicted membrane GTPase involved in stress response [Signal transduction mechanisms]	NA|149aa|down_5|NZ_AP014821.1_2822313_2822760_+	COG5637, COG5637, Predicted integral membrane protein [Function unknown]	NA|96aa|down_6|NZ_AP014821.1_2822819_2823107_+	NA	NA|422aa|down_7|NZ_AP014821.1_2823222_2824488_-	pfam13546, DDE_5, DDE superfamily endonuclease	NA|339aa|down_8|NZ_AP014821.1_2825149_2826166_-	COG0057, GapA, Glyceraldehyde-3-phosphate dehydrogenase/erythrose-4-phosphate dehydrogenase [Carbohydrate transport and metabolism]	NA|610aa|down_9|NZ_AP014821.1_2826481_2828311_-	TIGR04096, conserved_hypothetical_protein, DNA phosphorothioation-associated putative methyltransferase
GCF_001548115.1_Gm3709_assembly_1.0	NZ_AP014821	Geminocystis sp. NIES-3709	7	2935667-2935980	4,6,3	PILER-CR,CRISPRCasFinder,CRT	no	WYL,PD-DExK,cas6,cas4,cas1	RT,DinG,csa3,c2c5_V-U5,cas3,WYL,PD-DExK,cas6,cas4,cas1,DEDDh	Unclear	GTCTAAACTATAATAAATACCTACTAG,GTCTAAACTATAATAAATACCTACTAG,GTCTAAACTATAATAAATACCTACTAG	27,27,27	0	0	NA	NA	NA:NA:NA	4,4,4	4	Unclear	RT,DinG,csa3,c2c5_V-U5,cas3,WYL,PD-DExK,cas6,cas4,cas1,DEDDh	NA|185aa|up_6|NZ_AP014821.1_2929999_2930554_+,PD-DExK|202aa|up_5|NZ_AP014821.1_2930578_2931184_+,NA|142aa|down_1|NZ_AP014821.1_2937693_2938119_-	NA|102aa|up_9|NZ_AP014821.1_2927480_2927786_+	PRK13019, clpS, ATP-dependent Clp protease adapter ClpS	NA|96aa|up_8|NZ_AP014821.1_2927782_2928070_+	pfam09876, DUF2103, Predicted metal-binding protein (DUF2103)	WYL|184aa|up_7|NZ_AP014821.1_2928083_2928635_-	pfam13280, WYL, WYL domain	NA|185aa|up_6|NZ_AP014821.1_2929999_2930554_+	NA	PD-DExK|202aa|up_5|NZ_AP014821.1_2930578_2931184_+	NA	NA|190aa|up_4|NZ_AP014821.1_2931211_2931781_+	pfam05685, Uma2, Putative restriction endonuclease	cas6|272aa|up_3|NZ_AP014821.1_2931801_2932617_+	COG5551, COG5551, CRISPR system related protein, RAMP superfamily [Defense    mechanisms]	NA|422aa|up_2|NZ_AP014821.1_2932781_2934047_-	pfam13546, DDE_5, DDE superfamily endonuclease	cas4|195aa|up_1|NZ_AP014821.1_2934495_2935080_+	TIGR00372, conserved_hypothetical_protein, CRISPR-associated protein Cas4	cas1|161aa|up_0|NZ_AP014821.1_2935154_2935637_+	TIGR04093, hypothetical_protein_L8106_25395, CRISPR-associated endonuclease Cas1, subtype CYANO	NA|420aa|down_0|NZ_AP014821.1_2936212_2937472_+	PRK00885, PRK00885, phosphoribosylamine--glycine ligase; Provisional	NA|142aa|down_1|NZ_AP014821.1_2937693_2938119_-	NA	NA|151aa|down_2|NZ_AP014821.1_2938201_2938654_+	pfam02657, SufE, Fe-S metabolism associated domain	NA|602aa|down_3|NZ_AP014821.1_2938908_2940714_+	PRK07431, PRK07431, aspartate kinase; Provisional	NA|356aa|down_4|NZ_AP014821.1_2941201_2942269_-	TIGR01151, Photosystem_QB_protein, photosystem II, DI subunit (also called Q(B))	NA|134aa|down_5|NZ_AP014821.1_2942618_2943020_-	cd02213, cupin_PMI_typeII_C, Phosphomannose isomerase type II, C-terminal cupin domain	NA|133aa|down_6|NZ_AP014821.1_2943050_2943449_-	pfam14271, DUF4359, Domain of unknown function (DUF4359)	NA|418aa|down_7|NZ_AP014821.1_2943591_2944845_-	PHA03100, PHA03100, ankyrin repeat protein; Provisional	NA|83aa|down_8|NZ_AP014821.1_2945354_2945603_+	CHL00005, rps16, ribosomal protein S16	NA|134aa|down_9|NZ_AP014821.1_2945595_2945997_+	COG1837, COG1837, Predicted RNA-binding protein (contains KH domain) [General function prediction only]
GCF_001548115.1_Gm3709_assembly_1.0	NZ_AP014821	Geminocystis sp. NIES-3709	8	2953250-2953335	7	CRISPRCasFinder	no	cas4,cas1,csa3	RT,DinG,csa3,c2c5_V-U5,cas3,WYL,PD-DExK,cas6,cas4,cas1,DEDDh	Type I-A	TGCCATTCTAAAGCATATAAACG	23	0	0	NA	NA	NA	1	1	Unclear	RT,DinG,csa3,c2c5_V-U5,cas3,WYL,PD-DExK,cas6,cas4,cas1,DEDDh	NA|241aa|up_2|NZ_AP014821.1_2950110_2950833_-,NA	NA|356aa|up_9|NZ_AP014821.1_2941201_2942269_-	TIGR01151, Photosystem_QB_protein, photosystem II, DI subunit (also called Q(B))	NA|134aa|up_8|NZ_AP014821.1_2942618_2943020_-	cd02213, cupin_PMI_typeII_C, Phosphomannose isomerase type II, C-terminal cupin domain	NA|133aa|up_7|NZ_AP014821.1_2943050_2943449_-	pfam14271, DUF4359, Domain of unknown function (DUF4359)	NA|418aa|up_6|NZ_AP014821.1_2943591_2944845_-	PHA03100, PHA03100, ankyrin repeat protein; Provisional	NA|83aa|up_5|NZ_AP014821.1_2945354_2945603_+	CHL00005, rps16, ribosomal protein S16	NA|134aa|up_4|NZ_AP014821.1_2945595_2945997_+	COG1837, COG1837, Predicted RNA-binding protein (contains KH domain) [General function prediction only]	NA|86aa|up_3|NZ_AP014821.1_2948643_2948901_-	pfam00534, Glycos_transf_1, Glycosyl transferases group 1	NA|241aa|up_2|NZ_AP014821.1_2950110_2950833_-	NA	NA|220aa|up_1|NZ_AP014821.1_2951016_2951676_-	COG1309, AcrR, Transcriptional regulator [Transcription]	NA|338aa|up_0|NZ_AP014821.1_2951939_2952953_+	PRK07403, PRK07403, type I glyceraldehyde-3-phosphate dehydrogenase	NA|552aa|down_0|NZ_AP014821.1_2954624_2956280_-	cd03801, GT4_PimA-like, phosphatidyl-myo-inositol mannosyltransferase	NA|223aa|down_1|NZ_AP014821.1_2956436_2957105_+	pfam11264, ThylakoidFormat, Thylakoid formation protein	NA|430aa|down_2|NZ_AP014821.1_2957196_2958486_+	PRK05431, PRK05431, seryl-tRNA synthetase; Provisional	NA|434aa|down_3|NZ_AP014821.1_2958804_2960106_+	PLN02482, PLN02482, glutamate-1-semialdehyde 2,1-aminomutase	NA|110aa|down_4|NZ_AP014821.1_2960274_2960604_-	cd07043, STAS_anti-anti-sigma_factors, Sulphate Transporter and Anti-Sigma factor antagonist) domain of anti-anti-sigma factors, key regulators of anti-sigma factors by phosphorylation	NA|195aa|down_5|NZ_AP014821.1_2960967_2961552_+	PRK00277, clpP, ATP-dependent Clp protease proteolytic subunit; Reviewed	NA|174aa|down_6|NZ_AP014821.1_2961624_2962146_+	PRK02603, PRK02603, photosystem I assembly protein Ycf3; Provisional	NA|96aa|down_7|NZ_AP014821.1_2962201_2962489_+	PRK00034, gatC, Asp-tRNA(Asn)/Glu-tRNA(Gln) amidotransferase subunit GatC	NA|528aa|down_8|NZ_AP014821.1_2962799_2964383_+	PRK02546, PRK02546, NAD(P)H-quinone oxidoreductase subunit 4; Provisional	NA|347aa|down_9|NZ_AP014821.1_2964727_2965768_+	PRK12299, obgE, GTPase CgtA; Reviewed
GCF_001548115.1_Gm3709_assembly_1.0	NZ_AP014824	Geminocystis sp. NIES-3709 plasmid pGM3709_03, complete sequence	1	5292-5397	1	CRISPRCasFinder	no	RT,WYL	RT,WYL	Unclear	TCTCTACTCCCTGCTTCCCACTCTCTAA	28	0	0	NA	NA	NA	1	1	Orphan	RT,DinG,csa3,c2c5_V-U5,cas3,WYL,PD-DExK,cas6,cas4,cas1,DEDDh	NA,NA|263aa|down_1|NZ_AP014824.1_8673_9462_+,NA|75aa|down_7|NZ_AP014824.1_12922_13147_-,NA|278aa|down_8|NZ_AP014824.1_13470_14304_+	NA|NA	NA	NA|NA	NA	NA|NA	NA	NA|NA	NA	NA|NA	NA	NA|NA	NA	NA|NA	NA	NA|NA	NA	NA|NA	NA	NA|1335aa|up_0|NZ_AP014824.1_74_4079_+	NF033203, entero_EhxA, enterohemolysin EhxA	RT|378aa|down_0|NZ_AP014824.1_7499_8633_+	cd03487, RT_Bac_retron_II, RT_Bac_retron_II: Reverse transcriptases (RTs) in bacterial retrotransposons or retrons	NA|263aa|down_1|NZ_AP014824.1_8673_9462_+	NA	WYL|684aa|down_2|NZ_AP014824.1_9488_11540_+	COG4639, COG4639, Predicted kinase [General function prediction only]	NA|122aa|down_3|NZ_AP014824.1_11543_11909_-	COG2026, RelE, Cytotoxic translational repressor of toxin-antitoxin stability system [Translation, ribosomal structure and biogenesis / Cell division and chromosome partitioning]	NA|77aa|down_4|NZ_AP014824.1_11898_12129_-	pfam10047, DUF2281, Protein of unknown function (DUF2281)	NA|129aa|down_5|NZ_AP014824.1_12137_12524_-	cd09872, PIN_Sll0205-like, VapC-like PIN domain of Sll0205 protein and homologs	NA|77aa|down_6|NZ_AP014824.1_12520_12751_-	pfam10047, DUF2281, Protein of unknown function (DUF2281)	NA|75aa|down_7|NZ_AP014824.1_12922_13147_-	NA	NA|278aa|down_8|NZ_AP014824.1_13470_14304_+	NA	NA|114aa|down_9|NZ_AP014824.1_14358_14700_+	cd10719, DnaJ_zf, Zinc finger domain of DnaJ and HSP40
GCF_001548115.1_Gm3709_assembly_1.0	NZ_AP014824	Geminocystis sp. NIES-3709 plasmid pGM3709_03, complete sequence	2	42049-42146	2	CRISPRCasFinder	no		RT,WYL	Orphan	ACTATACATTTAATCTCTAGTTTA	24	0	0	NA	NA	NA	1	1	Orphan	RT,DinG,csa3,c2c5_V-U5,cas3,WYL,PD-DExK,cas6,cas4,cas1,DEDDh	NA|99aa|up_8|NZ_AP014824.1_22231_22528_+,NA|100aa|up_6|NZ_AP014824.1_23739_24039_+,NA|62aa|down_0|NZ_AP014824.1_42161_42347_+	NA|93aa|up_9|NZ_AP014824.1_20089_20368_+	COG2442, COG2442, Uncharacterized conserved protein [Function unknown]	NA|99aa|up_8|NZ_AP014824.1_22231_22528_+	NA	NA|280aa|up_7|NZ_AP014824.1_22750_23590_-	pfam13359, DDE_Tnp_4, DDE superfamily endonuclease	NA|100aa|up_6|NZ_AP014824.1_23739_24039_+	NA	NA|318aa|up_5|NZ_AP014824.1_24165_25119_+	TIGR02224, Tyrosine_recombinase_XerC, tyrosine recombinase XerC	NA|251aa|up_4|NZ_AP014824.1_25513_26266_+	pfam13614, AAA_31, AAA domain	NA|121aa|up_3|NZ_AP014824.1_26439_26802_+	cd16377, 23S_rRNA_IVP_like, 23S rRNA-intervening sequence protein and similar proteins	NA|306aa|up_2|NZ_AP014824.1_27322_28240_+	TIGR04285, parB-like_partition_protein, nucleoid occlusion protein	NA|3567aa|up_1|NZ_AP014824.1_28703_39404_+	NF033203, entero_EhxA, enterohemolysin EhxA	NA|699aa|up_0|NZ_AP014824.1_39482_41579_-	PRK00236, xerC, site-specific tyrosine recombinase XerC; Reviewed	NA|62aa|down_0|NZ_AP014824.1_42161_42347_+	NA	NA|736aa|down_1|NZ_AP014824.1_43270_45478_+	pfam00589, Phage_integrase, Phage integrase family	NA|292aa|down_2|NZ_AP014824.1_46028_46904_+	pfam13469, Sulfotransfer_3, Sulfotransferase family	NA|438aa|down_3|NZ_AP014824.1_48297_49611_+	pfam13546, DDE_5, DDE superfamily endonuclease	NA|NA	NA	NA|NA	NA	NA|NA	NA	NA|NA	NA	NA|NA	NA	NA|NA	NA
GCF_001548115.1_Gm3709_assembly_1.0	NZ_AP014825	Geminocystis sp. NIES-3709 plasmid pGM3709_04, complete sequence	1	27874-27971	1	CRISPRCasFinder	no			Orphan	GTTATCGTTAAAACCTTGTATTTTTGCTAA	30	0	0	NA	NA	NA	1	1	Orphan	RT,DinG,csa3,c2c5_V-U5,cas3,WYL,PD-DExK,cas6,cas4,cas1,DEDDh	NA|229aa|up_3|NZ_AP014825.1_25314_26001_-,NA|871aa|down_1|NZ_AP014825.1_32038_34651_-,NA|63aa|down_6|NZ_AP014825.1_38596_38785_+,NA|84aa|down_7|NZ_AP014825.1_38777_39029_+	NA|71aa|up_9|NZ_AP014825.1_16980_17193_+	pfam09957, VapB_antitoxin, Bacterial antitoxin of type II TA system, VapB	NA|132aa|up_8|NZ_AP014825.1_17179_17575_+	cd18761, PIN_MtVapC3-like, uncharacterized subgroup of the VapC3-like nuclease subfamily of the PIN domain superfamily	NA|311aa|up_7|NZ_AP014825.1_18423_19356_-	cd09176, PLDc_unchar6, Putative catalytic domain of uncharacterized hypothetical proteins with one or two copies of the HKD motif	NA|608aa|up_6|NZ_AP014825.1_19515_21339_-	pfam13589, HATPase_c_3, Histidine kinase-, DNA gyrase B-, and HSP90-like ATPase	NA|521aa|up_5|NZ_AP014825.1_21331_22894_-	COG1061, SSL2, DNA or RNA helicases of superfamily II [Transcription / DNA replication, recombination, and repair]	NA|779aa|up_4|NZ_AP014825.1_22975_25312_-	cd16414, dndB_like, DNA-sulfur modification-associated domain	NA|229aa|up_3|NZ_AP014825.1_25314_26001_-	NA	NA|89aa|up_2|NZ_AP014825.1_26139_26406_-	TIGR02116, Hypothetical_protein_Rv3358/MT3466/Mb3393	NA|87aa|up_1|NZ_AP014825.1_26398_26659_-	COG2161, StbD, Antitoxin of toxin-antitoxin stability system [Cell division and chromosome partitioning]	NA|267aa|up_0|NZ_AP014825.1_26690_27491_-	cd01713, PAPS_reductase, This domain is found in phosphoadenosine phosphosulphate (PAPS) reductase enzymes or PAPS sulphotransferase	NA|226aa|down_0|NZ_AP014825.1_29108_29786_-	cd06260, DUF820, Domain of unknown function (DUF820)	NA|871aa|down_1|NZ_AP014825.1_32038_34651_-	NA	NA|99aa|down_2|NZ_AP014825.1_34873_35170_-	pfam08014, DUF1704, Domain of unknown function (DUF1704)	NA|306aa|down_3|NZ_AP014825.1_35332_36250_-	pfam13182, DUF4007, Protein of unknown function (DUF4007)	NA|509aa|down_4|NZ_AP014825.1_36424_37951_-	CHL00195, ycf46, Ycf46; Provisional	NA|191aa|down_5|NZ_AP014825.1_38031_38604_+	COG4185, COG4185, Uncharacterized protein conserved in bacteria [Function unknown]	NA|63aa|down_6|NZ_AP014825.1_38596_38785_+	NA	NA|84aa|down_7|NZ_AP014825.1_38777_39029_+	NA	NA|68aa|down_8|NZ_AP014825.1_39097_39301_-	pfam11211, DUF2997, Protein of unknown function (DUF2997)	NA|103aa|down_9|NZ_AP014825.1_39475_39784_-	CHL00193, ycf35, Ycf35; Provisional
GCF_001548115.1_Gm3709_assembly_1.0	NZ_AP014829	Geminocystis sp. NIES-3709 plasmid pGM3709_08, complete sequence	1	5226-5412	1	CRISPRCasFinder	no			Orphan	AACAGAACCAGTTTTGACAGAATTGTCACAA	31	0	0	NA	NA	NA	2	2	Orphan	RT,DinG,csa3,c2c5_V-U5,cas3,WYL,PD-DExK,cas6,cas4,cas1,DEDDh	NA|146aa|up_1|NZ_AP014829.1_2006_2444_-,NA|307aa|up_0|NZ_AP014829.1_3383_4304_-,NA|202aa|down_0|NZ_AP014829.1_6306_6912_+,NA|136aa|down_1|NZ_AP014829.1_7011_7419_+	NA|NA	NA	NA|NA	NA	NA|NA	NA	NA|NA	NA	NA|NA	NA	NA|NA	NA	NA|NA	NA	NA|345aa|up_2|NZ_AP014829.1_147_1182_-	pfam01446, Rep_1, Replication protein	NA|146aa|up_1|NZ_AP014829.1_2006_2444_-	NA	NA|307aa|up_0|NZ_AP014829.1_3383_4304_-	NA	NA|202aa|down_0|NZ_AP014829.1_6306_6912_+	NA	NA|136aa|down_1|NZ_AP014829.1_7011_7419_+	NA	NA|449aa|down_2|NZ_AP014829.1_7415_8762_+	TIGR00675, Modification_methylase, DNA-methyltransferase (dcm)	NA|230aa|down_3|NZ_AP014829.1_8736_9426_-	pfam09564, RE_NgoBV, NgoBV restriction endonuclease	NA|422aa|down_4|NZ_AP014829.1_9648_10914_+	pfam13546, DDE_5, DDE superfamily endonuclease	NA|NA	NA	NA|NA	NA	NA|NA	NA	NA|NA	NA	NA|NA	NA
