assembly_id	genome_id	genome_def	crispr_array_locus_merge	crispr_array_location_merge	crispr_locus_id	crispr_pred_method	array_in_prot	prot_within_array_20000	prot_in_genome	crispr_type_by_cas_prot	consensus_repeat	repeat_length	self-targeting_spacer_number	self-targeting_target_number	spacer_location	protospacer_location	repeat_type	spacer_locus_num	spacer_num	correct_crispr_type	genome_cas_prots	unknown_protein_around_crispr	L10	L10_domain	L9	L9_domain	L8	L8_domain	L7	L7_domain	L6	L6_domain	L5	L5_domain	L4	L4_domain	L3	L3_domain	L2	L2_domain	L1	L1_domain	R1	R1_domain	R2	R2_domain	R3	R3_domain	R4	R4_domain	R5	R5_domain	R6	R6_domain	R7	R7_domain	R8	R8_domain	R9	R9_domain	R10	R10_domain
GCF_000067045.1_ASM6704v1	NC_010125	Gluconacetobacter diazotrophicus PA1 5, complete genome	1	62935-64899	1,1,1	CRISPRCasFinder,CRT,PILER-CR	no	cas2,cas1,cas4,cas7,cas8c,cas5,cas3	cas2,cas1,cas4,cas7,cas8c,cas5,cas3,DinG,DEDDh,cas9,cas6e,cse2gr11,cas8e,csa3	Type I-C,Type I-U, Type I-U?	GTTTCAATCCACGCTCCCGCACAGGGAGCGAC,GTTTCAATCCACGCTCCCGCACAGGGAGCGAC,GTTTCAATCCACGCTCCCGCACAGGGAGCGAC	32,32,32	0	0	NA	NA	I-C:I-C:I-C	29,29,28	29	TypeI-C,TypeI-U,TypeI-U?	cas2,cas1,cas4,cas7,cas8c,cas5,cas3,DinG,DEDDh,cas9,cas6e,cse2gr11,cas8e,csa3	NA,NA	NA|782aa|up_9|NC_010125.1_47137_49483_-	TIGR02063, Ribonuclease_R, ribonuclease R	NA|925aa|up_8|NC_010125.1_49485_52260_-	PRK07561, PRK07561, DNA topoisomerase I subunit omega; Validated	NA|396aa|up_7|NC_010125.1_52281_53469_-	COG0758, Smf, Predicted Rossmann fold nucleotide-binding protein involved in DNA uptake [DNA replication, recombination, and repair / Intracellular trafficking and secretion]	NA|213aa|up_6|NC_010125.1_53465_54104_-	PRK00220, PRK00220, glycerol-3-phosphate 1-O-acyltransferase PlsY	NA|431aa|up_5|NC_010125.1_54100_55393_-	PRK09357, pyrC, dihydroorotase; Validated	NA|322aa|up_4|NC_010125.1_55389_56355_-	PRK00856, pyrB, aspartate carbamoyltransferase catalytic subunit	NA|467aa|up_3|NC_010125.1_56351_57752_-	PRK01406, gltX, glutamyl-tRNA synthetase; Reviewed	NA|708aa|up_2|NC_010125.1_57934_60058_+	pfam03772, Competence, Competence protein	NA|229aa|up_1|NC_010125.1_60061_60748_-	COG0177, Nth, Predicted EndoIII-related endonuclease [DNA replication, recombination, and repair]	NA|622aa|up_0|NC_010125.1_60971_62837_+	COG1132, MdlB, ABC-type multidrug transport system, ATPase and permease components [Defense mechanisms]	cas2|97aa|down_0|NC_010125.1_65081_65372_-	cd09725, Cas2_I_II_III, CRISPR/Cas system-associated protein Cas2	cas1|347aa|down_1|NC_010125.1_65375_66416_-	TIGR03640, cas1_DVULG, CRISPR-associated endonuclease Cas1, subtype I-C/DVULG	cas4|219aa|down_2|NC_010125.1_66415_67072_-	cd09637, Cas4_I-A_I-B_I-C_I-D_II-B, CRISPR/Cas system-associated protein Cas4	cas7|313aa|down_3|NC_010125.1_67080_68019_-	pfam05107, Cas_Cas7, CRISPR-associated protein Cas7	cas8c|608aa|down_4|NC_010125.1_68011_69835_-	pfam09709, Cas_Csd1, CRISPR-associated protein (Cas_Csd1)	cas5|224aa|down_5|NC_010125.1_69831_70503_-	cd09752, Cas5_I-C, CRISPR/Cas system-associated RAMP superfamily protein Cas5	NA|75aa|down_6|NC_010125.1_70756_70981_+	COG4456, VagC, Virulence-associated protein and related proteins [Function unknown]	NA|137aa|down_7|NC_010125.1_70980_71391_+	cd18745, PIN_VapC4-5_FitB-like, uncharacterized subgroup of the PIN_VapC4-5_FitB-like subfamily of the PIN domain superfamily	cas3|783aa|down_8|NC_010125.1_71551_73900_-	cd17930, DEXHc_cas3, DEXH/Q-box helicase domain of Cas3	NA|127aa|down_9|NC_010125.1_73930_74311_-	cd01276, PKCI_related, Protein Kinase C Interacting protein related (PKCI): PKCI and related proteins belong to the ubiquitous HIT family of hydrolases that act on alpha-phosphates of ribonucleotides
GCF_000067045.1_ASM6704v1	NC_010125	Gluconacetobacter diazotrophicus PA1 5, complete genome	2	2181482-2182177	2,2,2	CRISPRCasFinder,CRT,PILER-CR	no	cas9,cas1,cas2	cas2,cas1,cas4,cas7,cas8c,cas5,cas3,DinG,DEDDh,cas9,cas6e,cse2gr11,cas8e,csa3	Type II-B,Type II-C,Type II-A, or Type II-C?, Type II-B	AGCCTACCATCGGCAAATCGGTAGGGAAACCACGGC,AGCCTACCATCGGCAAATCGGTAGGGAAACCACGGC,AGCCTACCATCGGCAAATCGGTAGGGAAACCACGGC	36,36,36	0	0	NA	NA	NA:NA:NA	10,10,8	10	TypeII-B,TypeII-C,TypeII-A,orTypeII-C?,TypeII-B	cas2,cas1,cas4,cas7,cas8c,cas5,cas3,DinG,DEDDh,cas9,cas6e,cse2gr11,cas8e,csa3	NA,NA	NA|270aa|up_9|NC_010125.1_2171063_2171873_+	COG5375, COG5375, Uncharacterized protein conserved in bacteria [Function unknown]	NA|364aa|up_8|NC_010125.1_2171869_2172961_+	pfam03968, OstA, OstA-like protein	NA|253aa|up_7|NC_010125.1_2172990_2173749_+	COG1137, YhbG, ABC-type (unclassified) transport system, ATPase component [General function prediction only]	NA|459aa|up_6|NC_010125.1_2173759_2175136_+	PRK05932, PRK05932, RNA polymerase factor sigma-54; Reviewed	NA|196aa|up_5|NC_010125.1_2175230_2175818_+	cd00552, RaiA, RaiA ("ribosome-associated inhibitor A", also known as Protein Y (PY), YfiA, and SpotY,  is a stress-response protein that binds the ribosomal subunit interface and arrests translation by interfering with aminoacyl-tRNA binding to the ribosomal A site	NA|88aa|up_4|NC_010125.1_2175888_2176152_-	pfam06620, DUF1150, Protein of unknown function (DUF1150)	NA|163aa|up_3|NC_010125.1_2176265_2176754_-	cd06470, ACD_IbpA-B_like, Alpha-crystallin domain (ACD) found in Escherichia coli inclusion body-associated proteins IbpA and IbpB, and similar proteins	cas9|1047aa|up_2|NC_010125.1_2177094_2180235_+	COG3513, COG3513, Predicted CRISPR-associated nuclease, contains McrA/HNH-nuclease and RuvC-like nuclease domain [Defense mechanisms]	cas1|298aa|up_1|NC_010125.1_2180192_2181086_+	TIGR03639, cas1_NMENI, CRISPR-associated endonuclease Cas1, subtype II/NMENI	cas2|110aa|up_0|NC_010125.1_2181095_2181425_+	COG3512, COG3512, CRISPR-associated protein, Cas2 homolog [Defense mechanisms]	NA|180aa|down_0|NC_010125.1_2182309_2182849_+	PRK00150, def, peptide deformylase; Reviewed	NA|306aa|down_1|NC_010125.1_2182894_2183812_+	PRK00005, fmt, methionyl-tRNA formyltransferase; Reviewed	NA|259aa|down_2|NC_010125.1_2183808_2184585_+	PRK00021, truA, tRNA pseudouridine(38-40) synthase TruA	NA|213aa|down_3|NC_010125.1_2184591_2185230_-	pfam09843, DUF2070, Predicted membrane protein (DUF2070)	NA|386aa|down_4|NC_010125.1_2185226_2186384_-	PRK13009, PRK13009, succinyl-diaminopimelate desuccinylase; Reviewed	NA|282aa|down_5|NC_010125.1_2186380_2187226_-	PRK11830, dapD, 2,3,4,5-tetrahydropyridine-2,6-carboxylate N-succinyltransferase; Provisional	NA|294aa|down_6|NC_010125.1_2187352_2188234_-	PRK00942, PRK00942, acetylglutamate kinase; Provisional	NA|224aa|down_7|NC_010125.1_2188334_2189006_-	PRK00454, engB, GTP-binding protein YsxC; Reviewed	NA|84aa|down_8|NC_010125.1_2190811_2191063_-	pfam01809, Haemolytic, Haemolytic domain	NA|117aa|down_9|NC_010125.1_2191071_2191422_-	pfam00825, Ribonuclease_P, Ribonuclease P
GCF_000067045.1_ASM6704v1	NC_010125	Gluconacetobacter diazotrophicus PA1 5, complete genome	3	2253747-2255116	3,3,3	CRISPRCasFinder,CRT,PILER-CR	no	cas2,cas1,cas6e,cas5,cas7,cse2gr11,cas8e,cas3	cas2,cas1,cas4,cas7,cas8c,cas5,cas3,DinG,DEDDh,cas9,cas6e,cse2gr11,cas8e,csa3	Type I-E	CGGTTCATCCCCGCACGTGCGGGGAACAC,CGGTTCATCCCCGCACGTGCGGGGAACAC,CGGTTCATCCCCGCACGTGCGGGGAACAC	29,29,29	0	0	NA	NA	I-E:I-E:I-E	22,22,20	22	TypeI-E	cas2,cas1,cas4,cas7,cas8c,cas5,cas3,DinG,DEDDh,cas9,cas6e,cse2gr11,cas8e,csa3	NA|46aa|up_7|NC_010125.1_2244706_2244844_-,NA|134aa|down_9|NC_010125.1_2265755_2266157_-	NA|321aa|up_9|NC_010125.1_2243005_2243968_+	cd19087, AKR_AKR12A1_B1_C1, AKR12A, AKR12B,  AKR12C families of aldo-keto reductase (AKR)	NA|238aa|up_8|NC_010125.1_2243978_2244692_+	cd02883, Nudix_Hydrolase, Nudix hydrolase is a superfamily of enzymes found in all three kingdoms of life, and it catalyzes the hydrolysis of NUcleoside DIphosphates linked to other moieties, X	NA|46aa|up_7|NC_010125.1_2244706_2244844_-	NA	NA|199aa|up_6|NC_010125.1_2245052_2245649_+	cd02219, cupin_YjlB-like, Bacillus subtilis YjlB and related proteins, cupin domain	NA|249aa|up_5|NC_010125.1_2245656_2246403_+	COG2085, COG2085, Predicted dinucleotide-binding enzymes [General function prediction only]	NA|184aa|up_4|NC_010125.1_2246469_2247021_-	COG3247, HdeD, Uncharacterized conserved protein [Function unknown]	NA|206aa|up_3|NC_010125.1_2247354_2247972_+	PRK05327, rpsD, 30S ribosomal protein S4; Validated	NA|528aa|up_2|NC_010125.1_2248090_2249674_+	COG4108, PrfC, Peptide chain release factor RF-3 [Translation, ribosomal structure and biogenesis]	NA|467aa|up_1|NC_010125.1_2249645_2251046_-	cd13131, MATE_NorM_like, Subfamily of the multidrug and toxic compound extrusion (MATE)-like proteins similar to Vibrio cholerae NorM	NA|259aa|up_0|NC_010125.1_2252529_2253306_+	COG0565, LasT, rRNA methylase [Translation, ribosomal structure and biogenesis]	cas2|116aa|down_0|NC_010125.1_2255184_2255532_-	PRK11558, PRK11558, putative ssRNA endonuclease; Provisional	cas1|320aa|down_1|NC_010125.1_2255512_2256472_-	TIGR03638, cas1_ECOLI, CRISPR-associated endonuclease Cas1, subtype I-E/ECOLI	cas6e|229aa|down_2|NC_010125.1_2256483_2257170_-	smart01101, CRISPR_assoc, This domain forms an anti-parallel beta strand structure with flanking alpha helical regions	cas5|261aa|down_3|NC_010125.1_2257166_2257949_-	TIGR01868, hypothetical_protein, CRISPR-associated protein Cas5/CasD, subtype I-E/ECOLI	cas7|353aa|down_4|NC_010125.1_2257957_2259016_-	pfam09344, Cas_CT1975, CT1975-like protein	cse2gr11|197aa|down_5|NC_010125.1_2259012_2259603_-	cd09731, Cse2_I-E, CRISPR/Cas system-associated protein Cse2	cas8e|552aa|down_6|NC_010125.1_2259599_2261255_-	cd09729, Cse1_I-E, CRISPR/Cas system-associated protein Cse1	cas3|898aa|down_7|NC_010125.1_2261621_2264315_-	PRK09694, PRK09694, CRISPR-associated helicase/endonuclease Cas3	NA|444aa|down_8|NC_010125.1_2264359_2265691_-	PRK12558, PRK12558, glutamyl-tRNA synthetase; Provisional	NA|134aa|down_9|NC_010125.1_2265755_2266157_-	NA
GCF_000067045.1_ASM6704v1	NC_010125	Gluconacetobacter diazotrophicus PA1 5, complete genome	4	2441486-2441591	4	CRISPRCasFinder	no		cas2,cas1,cas4,cas7,cas8c,cas5,cas3,DinG,DEDDh,cas9,cas6e,cse2gr11,cas8e,csa3	Orphan	AACTTCGCCGCATCCAGCGGTTCCTCCACCAC	32	0	0	NA	NA	NA	1	1	Orphan	cas2,cas1,cas4,cas7,cas8c,cas5,cas3,DinG,DEDDh,cas9,cas6e,cse2gr11,cas8e,csa3	NA,NA	NA|315aa|up_9|NC_010125.1_2425631_2426576_-	pfam13650, Asp_protease_2, Aspartyl protease	NA|321aa|up_8|NC_010125.1_2426601_2427564_-	TIGR01249, Putative_proline_iminopeptidase, proline iminopeptidase, Neisseria-type subfamily	NA|325aa|up_7|NC_010125.1_2427577_2428552_-	PRK00236, xerC, site-specific tyrosine recombinase XerC; Reviewed	NA|751aa|up_6|NC_010125.1_2428787_2431040_+	PRK05580, PRK05580, primosome assembly protein PriA; Validated	NA|719aa|up_5|NC_010125.1_2431046_2433203_-	PRK11249, katE, hydroperoxidase II; Provisional	NA|559aa|up_4|NC_010125.1_2433302_2434979_-	COG2509, COG2509, Uncharacterized FAD-dependent dehydrogenases [General function prediction only]	NA|310aa|up_3|NC_010125.1_2436188_2437118_+	COG0053, MMT1, Predicted Co/Zn/Cd cation transporters [Inorganic ion transport and metabolism]	NA|457aa|up_2|NC_010125.1_2437276_2438647_+	cd01034, EriC_like, ClC chloride channel family	NA|469aa|up_1|NC_010125.1_2438668_2440075_-	pfam00067, p450, Cytochrome P450	NA|333aa|up_0|NC_010125.1_2440178_2441177_-	cd00200, WD40, WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment	NA|370aa|down_0|NC_010125.1_2442399_2443509_-	cd03801, GT4_PimA-like, phosphatidyl-myo-inositol mannosyltransferase	NA|223aa|down_1|NC_010125.1_2443516_2444185_-	cd03194, GST_C_3, C-terminal, alpha helical domain of an unknown subfamily 3 of Glutathione S-transferases	NA|362aa|down_2|NC_010125.1_2444223_2445309_-	PRK00143, mnmA, tRNA-specific 2-thiouridylase MnmA; Reviewed	NA|105aa|down_3|NC_010125.1_2445330_2445645_-	PLN02593, PLN02593, adrenodoxin-like ferredoxin protein	NA|409aa|down_4|NC_010125.1_2445632_2446859_-	PRK14012, PRK14012, IscS subfamily cysteine desulfurase	NA|389aa|down_5|NC_010125.1_2446855_2448022_-	COG1104, NifS, Cysteine sulfinate desulfinase/cysteine desulfurase and related enzymes [Amino acid transport and metabolism]	NA|222aa|down_6|NC_010125.1_2448665_2449331_+	COG2945, COG2945, Predicted hydrolase of the alpha/beta superfamily [General function prediction only]	NA|414aa|down_7|NC_010125.1_2449445_2450687_+	PRK05912, PRK05912, tyrosyl-tRNA synthetase; Validated	NA|137aa|down_8|NC_010125.1_2450700_2451111_+	pfam13811, DUF4186, Domain of unknown function (DUF4186)	NA|119aa|down_9|NC_010125.1_2451064_2451421_-	COG4638, HcaE, Phenylpropionate dioxygenase and related ring-hydroxylating dioxygenases, large terminal subunit [Inorganic ion transport and metabolism / General function prediction only]
