assembly_id	genome_id	genome_def	crispr_array_locus_merge	crispr_array_location_merge	crispr_locus_id	crispr_pred_method	array_in_prot	prot_within_array_20000	prot_in_genome	crispr_type_by_cas_prot	consensus_repeat	repeat_length	self-targeting_spacer_number	self-targeting_target_number	spacer_location	protospacer_location	repeat_type	spacer_locus_num	spacer_num	correct_crispr_type	genome_cas_prots	unknown_protein_around_crispr	L10	L10_domain	L9	L9_domain	L8	L8_domain	L7	L7_domain	L6	L6_domain	L5	L5_domain	L4	L4_domain	L3	L3_domain	L2	L2_domain	L1	L1_domain	R1	R1_domain	R2	R2_domain	R3	R3_domain	R4	R4_domain	R5	R5_domain	R6	R6_domain	R7	R7_domain	R8	R8_domain	R9	R9_domain	R10	R10_domain
GCF_000024165.1_ASM2416v1	NC_013194	Candidatus Accumulibacter phosphatis clade IIA str. UW-1, complete genome	1	1691953-1692132	1	PILER-CR	no		cas3f,DEDDh,RT,csx1,csa3,cas3,WYL,cas2,cas1,cas6,cmr6gr7,cmr5gr11,cmr4gr7,cmr3gr5,cas10,cmr1gr7,cas8e,cse2gr11,cas7,cas5,cas6e,PD-DExK,DinG	Orphan	GAGCGTGTCGTTGCCGGCG	19	3	3	1691972-1692006|1692026-1692060|1692080-1692114	NC_013194.1_544305-544271|NC_013194.1_544251-544217|NC_013194.1_544197-544163	NA	3	3	Orphan	cas3f,DEDDh,RT,csx1,csa3,cas3,WYL,cas2,cas1,cas6,cmr6gr7,cmr5gr11,cmr4gr7,cmr3gr5,cas10,cmr1gr7,cas8e,cse2gr11,cas7,cas5,cas6e,PD-DExK,DinG	NA|248aa|up_7|NC_013194.1_1684027_1684771_+,NA|70aa|down_0|NC_013194.1_1699698_1699908_+,NA|62aa|down_1|NC_013194.1_1701246_1701432_+,NA|290aa|down_2|NC_013194.1_1703479_1704349_-,NA|160aa|down_3|NC_013194.1_1704609_1705089_-,NA|140aa|down_8|NC_013194.1_1708789_1709209_-	NA|168aa|up_9|NC_013194.1_1681667_1682171_-	PRK00901, PRK00901, methylated-DNA--protein-cysteine methyltransferase; Provisional	NA|488aa|up_8|NC_013194.1_1682167_1683631_-	PRK10308, PRK10308, 3-methyl-adenine DNA glycosylase II; Provisional	NA|248aa|up_7|NC_013194.1_1684027_1684771_+	NA	NA|68aa|up_6|NC_013194.1_1684878_1685082_+	cd00371, HMA, Heavy-metal-associated domain (HMA) is a conserved domain of approximately 30 amino acid residues found in a number of proteins that transport or detoxify heavy metals, for example, the CPx-type heavy metal ATPases and copper chaperones	NA|307aa|up_5|NC_013194.1_1685290_1686211_+	cd16896, LT_Slt70-like, uncharacterized lytic transglycosylase subfamily with similarity to Slt70	NA|258aa|up_4|NC_013194.1_1686491_1687265_+	cd13530, PBP2_peptides_like, Peptide-binding protein and related homologs; type 2 periplasmic binding protein fold	NA|253aa|up_3|NC_013194.1_1687280_1688039_+	COG0765, HisM, ABC-type amino acid transport system, permease component [Amino acid transport and metabolism]	NA|251aa|up_2|NC_013194.1_1688052_1688805_+	COG1126, GlnQ, ABC-type polar amino acid transport system, ATPase component [Amino acid transport and metabolism]	NA|309aa|up_1|NC_013194.1_1688836_1689763_-	pfam00892, EamA, EamA-like transporter family	NA|459aa|up_0|NC_013194.1_1689908_1691285_-	TIGR01843, Hemolysin_secretion_protein_D_plasmid, type I secretion membrane fusion protein, HlyD family	NA|70aa|down_0|NC_013194.1_1699698_1699908_+	NA	NA|62aa|down_1|NC_013194.1_1701246_1701432_+	NA	NA|290aa|down_2|NC_013194.1_1703479_1704349_-	NA	NA|160aa|down_3|NC_013194.1_1704609_1705089_-	NA	NA|420aa|down_4|NC_013194.1_1705195_1706455_+	PRK00011, glyA, serine hydroxymethyltransferase; Reviewed	NA|157aa|down_5|NC_013194.1_1706451_1706922_+	PRK00464, nrdR, transcriptional repressor NrdR	NA|357aa|down_6|NC_013194.1_1706957_1708028_+	PRK10786, ribD, bifunctional diaminohydroxyphosphoribosylaminopyrimidine deaminase/5-amino-6-(5-phosphoribosylamino)uracil reductase RibD	NA|253aa|down_7|NC_013194.1_1708020_1708779_-	PRK05690, PRK05690, molybdopterin biosynthesis protein MoeB; Provisional	NA|140aa|down_8|NC_013194.1_1708789_1709209_-	NA	NA|471aa|down_9|NC_013194.1_1709231_1710644_-	COG0793, Prc, Periplasmic protease [Cell envelope biogenesis, outer membrane]
GCF_000024165.1_ASM2416v1	NC_013194	Candidatus Accumulibacter phosphatis clade IIA str. UW-1, complete genome	2	2475437-2479953	1,2,1,3	CRISPRCasFinder,PILER-CR,CRT,PILER-CR	no	cas2,cas1,cas6,csx1,cmr6gr7,cmr5gr11,cmr4gr7,cmr3gr5,cas10,cmr1gr7	cas3f,DEDDh,RT,csx1,csa3,cas3,WYL,cas2,cas1,cas6,cmr6gr7,cmr5gr11,cmr4gr7,cmr3gr5,cas10,cmr1gr7,cas8e,cse2gr11,cas7,cas5,cas6e,PD-DExK,DinG	Type III-A,Type III-C,Type III-D,Type III-B	GTCTCAATCCCTTTGATTTCAGGGCTGGTTACTGAC,GTCTCAATCCCTTTGATTTCAGGGCTGGTTACTGAC,GTCTCAATCCCTTTGATTTCAGGGCTGGTTACTGAC,GTCTCAATCCCTTTGATTTCAGGGCTGGTTACTGAC	36,36,36,36	0	0	NA	NA	NA:NA:NA:NA	64,61,63,61	64	TypeIII-A,TypeIII-C,TypeIII-D,TypeIII-B	cas3f,DEDDh,RT,csx1,csa3,cas3,WYL,cas2,cas1,cas6,cmr6gr7,cmr5gr11,cmr4gr7,cmr3gr5,cas10,cmr1gr7,cas8e,cse2gr11,cas7,cas5,cas6e,PD-DExK,DinG	NA|252aa|up_7|NC_013194.1_2470562_2471318_+,NA|130aa|up_5|NC_013194.1_2471745_2472135_+,NA|156aa|up_0|NC_013194.1_2474671_2475139_-,NA	NA|294aa|up_9|NC_013194.1_2469226_2470108_+	PRK05457, PRK05457, protease HtpX	NA|130aa|up_8|NC_013194.1_2470152_2470542_+	COG3590, PepO, Predicted metalloendopeptidase [Posttranslational modification, protein turnover, chaperones]	NA|252aa|up_7|NC_013194.1_2470562_2471318_+	NA	NA|97aa|up_6|NC_013194.1_2471435_2471726_+	pfam02604, PhdYeFM_antitox, Antitoxin Phd_YefM, type II toxin-antitoxin system	NA|130aa|up_5|NC_013194.1_2471745_2472135_+	NA	cas2|102aa|up_4|NC_013194.1_2472215_2472521_-	pfam09827, CRISPR_Cas2, CRISPR associated protein Cas2	cas1|340aa|up_3|NC_013194.1_2472510_2473530_-	pfam01867, Cas_Cas1, CRISPR associated protein Cas1	cas2|96aa|up_2|NC_013194.1_2473546_2473834_-	cd09725, Cas2_I_II_III, CRISPR/Cas system-associated protein Cas2	NA|178aa|up_1|NC_013194.1_2474002_2474536_-	COG4190, COG4190, Predicted transcriptional regulator [Transcription]	NA|156aa|up_0|NC_013194.1_2474671_2475139_-	NA	cas6|369aa|down_0|NC_013194.1_2479958_2481065_-	pfam10040, CRISPR_Cas6, CRISPR-associated endoribonuclease Cas6	csx1|389aa|down_1|NC_013194.1_2481083_2482250_-	TIGR02221, CRISPR-associated_protein_Csx1_2, CRISPR-associated protein, TM1812 family	cmr6gr7|315aa|down_2|NC_013194.1_2482335_2483280_-	cd09661, Cmr6_III-B, CRISPR/Cas system-associated RAMP superfamily protein Cmr6	cmr5gr11|121aa|down_3|NC_013194.1_2483495_2483858_-	pfam09701, Cas_Cmr5, CRISPR-associated protein (Cas_Cmr5)	cmr4gr7|306aa|down_4|NC_013194.1_2483854_2484772_-	TIGR02580, putative_CRISPR-associated_protein, CRISPR type III-B/RAMP module RAMP protein Cmr4	cmr3gr5|394aa|down_5|NC_013194.1_2484844_2486026_-	cd09748, Cmr3_III-B, CRISPR/Cas system-associated RAMP superfamily protein Cmr3	cas10|966aa|down_6|NC_013194.1_2486025_2488923_-	cd09679, Cas10_III, CRISPR/Cas system-associated protein Cas10	cmr1gr7|370aa|down_7|NC_013194.1_2488919_2490029_-	COG1367, COG1367, CRISPR system related protein, RAMP superfamily [Defense mechanisms]	csx1|401aa|down_8|NC_013194.1_2490144_2491347_-	cd09741, Csx1_III-U, CRISPR/Cas system-associated protein Csx1	NA|751aa|down_9|NC_013194.1_2491617_2493870_-	cd00200, WD40, WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment
GCF_000024165.1_ASM2416v1	NC_013194	Candidatus Accumulibacter phosphatis clade IIA str. UW-1, complete genome	3	2565676-2575157	4,2,2,5	PILER-CR,CRISPRCasFinder,CRT,PILER-CR	no	WYL,cas3,cas8e,cse2gr11,cas7,cas5,cas6e,cas1,cas2	cas3f,DEDDh,RT,csx1,csa3,cas3,WYL,cas2,cas1,cas6,cmr6gr7,cmr5gr11,cmr4gr7,cmr3gr5,cas10,cmr1gr7,cas8e,cse2gr11,cas7,cas5,cas6e,PD-DExK,DinG	Type I-E	GTTTCCCCCGCGTCAGCGGGGATAGGCC,GTTTCCCCCGCGTCAGCGGGGATAGGCCC,GTTTCCCCCGCGTCAGCGGGGATAGGCCC,GTTTCCCCCGCGTCAGCGGGGATAGGCC	28,29,29,28	0	0	NA	NA	NA:NA:NA:NA	90,155,155,90	155	TypeI-E	cas3f,DEDDh,RT,csx1,csa3,cas3,WYL,cas2,cas1,cas6,cmr6gr7,cmr5gr11,cmr4gr7,cmr3gr5,cas10,cmr1gr7,cas8e,cse2gr11,cas7,cas5,cas6e,PD-DExK,DinG	NA,NA|104aa|down_1|NC_013194.1_2575553_2575865_+	NA|559aa|up_9|NC_013194.1_2553845_2555522_-	cd05974, MACS_like_1, Uncharacterized subfamily of medium-chain acyl-CoA synthetase (MACS)	WYL|324aa|up_8|NC_013194.1_2556018_2556990_+	COG2378, COG2378, Predicted transcriptional regulator [Transcription]	cas3|872aa|up_7|NC_013194.1_2556982_2559598_+	PRK09694, PRK09694, CRISPR-associated helicase/endonuclease Cas3	cas8e|526aa|up_6|NC_013194.1_2559618_2561196_+	cd09669, Cse1_I-E, CRISPR/Cas system-associated protein Cse1	cse2gr11|196aa|up_5|NC_013194.1_2561192_2561780_+	pfam09485, CRISPR_Cse2, CRISPR-associated protein Cse2 (CRISPR_cse2)	cas7|401aa|up_4|NC_013194.1_2561801_2563004_+	pfam09344, Cas_CT1975, CT1975-like protein	cas5|238aa|up_3|NC_013194.1_2563005_2563719_+	TIGR01868, hypothetical_protein, CRISPR-associated protein Cas5/CasD, subtype I-E/ECOLI	cas6e|246aa|up_2|NC_013194.1_2563715_2564453_+	smart01101, CRISPR_assoc, This domain forms an anti-parallel beta strand structure with flanking alpha helical regions	cas1|298aa|up_1|NC_013194.1_2564446_2565340_+	cd09719, Cas1_I-E, CRISPR/Cas system-associated protein Cas1	cas2|99aa|up_0|NC_013194.1_2565311_2565608_+	pfam09707, Cas_Cas2CT1978, CRISPR-associated protein (Cas_Cas2CT1978)	NA|81aa|down_0|NC_013194.1_2575272_2575515_-	pfam09720, Unstab_antitox, Putative addiction module component	NA|104aa|down_1|NC_013194.1_2575553_2575865_+	NA	NA|129aa|down_2|NC_013194.1_2576230_2576617_+	PRK00453, rpsF, 30S ribosomal protein S6; Reviewed	NA|115aa|down_3|NC_013194.1_2576616_2576961_+	TIGR04418, PriB_gamma, primosomal replication protein PriB	NA|92aa|down_4|NC_013194.1_2576932_2577208_+	PRK00391, rpsR, 30S ribosomal protein S18; Reviewed	NA|153aa|down_5|NC_013194.1_2577222_2577681_+	PRK00137, rplI, 50S ribosomal protein L9; Reviewed	NA|482aa|down_6|NC_013194.1_2577824_2579270_+	TIGR00665, DnaB, replicative DNA helicase	NA|428aa|down_7|NC_013194.1_2579298_2580582_-	pfam01841, Transglut_core, Transglutaminase-like superfamily	NA|476aa|down_8|NC_013194.1_2580722_2582150_-	COG1875, COG1875, NYN ribonuclease and ATPase of PhoH family domains [General function prediction only]	NA|159aa|down_9|NC_013194.1_2582293_2582770_-	cd03017, PRX_BCP, Peroxiredoxin (PRX) family, Bacterioferritin comigratory protein (BCP) subfamily; composed of thioredoxin-dependent thiol peroxidases, widely expressed in pathogenic bacteria, that protect cells against toxicity from reactive oxygen species by reducing and detoxifying hydroperoxides
GCF_000024165.1_ASM2416v1	NC_013194	Candidatus Accumulibacter phosphatis clade IIA str. UW-1, complete genome	4	4049957-4050053	3	CRISPRCasFinder	no		cas3f,DEDDh,RT,csx1,csa3,cas3,WYL,cas2,cas1,cas6,cmr6gr7,cmr5gr11,cmr4gr7,cmr3gr5,cas10,cmr1gr7,cas8e,cse2gr11,cas7,cas5,cas6e,PD-DExK,DinG	Orphan	ATATATACTGTCCCTAGAACTCC	23	0	0	NA	NA	NA	1	1	Orphan	cas3f,DEDDh,RT,csx1,csa3,cas3,WYL,cas2,cas1,cas6,cmr6gr7,cmr5gr11,cmr4gr7,cmr3gr5,cas10,cmr1gr7,cas8e,cse2gr11,cas7,cas5,cas6e,PD-DExK,DinG	NA|150aa|up_9|NC_013194.1_4040785_4041235_-,NA|331aa|up_8|NC_013194.1_4041308_4042301_-,NA|753aa|up_7|NC_013194.1_4042496_4044755_+,NA|150aa|up_6|NC_013194.1_4045244_4045694_-,NA|147aa|up_4|NC_013194.1_4046817_4047258_-,NA|63aa|up_0|NC_013194.1_4049475_4049664_+,NA|92aa|down_0|NC_013194.1_4050059_4050335_-,NA|139aa|down_2|NC_013194.1_4050994_4051411_-,NA|84aa|down_3|NC_013194.1_4051411_4051663_+,NA|133aa|down_4|NC_013194.1_4052498_4052897_-,NA|144aa|down_7|NC_013194.1_4054249_4054681_-,NA|224aa|down_8|NC_013194.1_4054768_4055440_-	NA|150aa|up_9|NC_013194.1_4040785_4041235_-	NA	NA|331aa|up_8|NC_013194.1_4041308_4042301_-	NA	NA|753aa|up_7|NC_013194.1_4042496_4044755_+	NA	NA|150aa|up_6|NC_013194.1_4045244_4045694_-	NA	NA|85aa|up_5|NC_013194.1_4045882_4046137_-	TIGR04102, SEC-C_motif_domain_protein, SWIM/SEC-C metal-binding motif protein, PBPRA1643 family	NA|147aa|up_4|NC_013194.1_4046817_4047258_-	NA	NA|321aa|up_3|NC_013194.1_4047351_4048314_-	pfam13649, Methyltransf_25, Methyltransferase domain	NA|172aa|up_2|NC_013194.1_4048316_4048832_-	cd07812, SRPBCC, START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC (SRPBCC) ligand-binding domain superfamily	NA|114aa|up_1|NC_013194.1_4048920_4049262_-	COG4551, COG4551, Predicted protein tyrosine phosphatase [General function prediction only]	NA|63aa|up_0|NC_013194.1_4049475_4049664_+	NA	NA|92aa|down_0|NC_013194.1_4050059_4050335_-	NA	NA|152aa|down_1|NC_013194.1_4050439_4050895_-	pfam14081, DUF4262, Domain of unknown function (DUF4262)	NA|139aa|down_2|NC_013194.1_4050994_4051411_-	NA	NA|84aa|down_3|NC_013194.1_4051411_4051663_+	NA	NA|133aa|down_4|NC_013194.1_4052498_4052897_-	NA	NA|115aa|down_5|NC_013194.1_4053063_4053408_-	cd00090, HTH_ARSR, Arsenical Resistance Operon Repressor and similar prokaryotic, metal regulated homodimeric repressors	NA|137aa|down_6|NC_013194.1_4053734_4054145_-	COG3791, COG3791, Uncharacterized conserved protein [Function unknown]	NA|144aa|down_7|NC_013194.1_4054249_4054681_-	NA	NA|224aa|down_8|NC_013194.1_4054768_4055440_-	NA	NA|165aa|down_9|NC_013194.1_4056724_4057219_+	COG0783, Dps, DNA-binding ferritin-like protein (oxidative damage protectant) [Inorganic ion transport and metabolism]
