assembly_id	genome_id	genome_def	crispr_array_locus_merge	crispr_array_location_merge	crispr_locus_id	crispr_pred_method	array_in_prot	prot_within_array_20000	prot_in_genome	crispr_type_by_cas_prot	consensus_repeat	repeat_length	self-targeting_spacer_number	self-targeting_target_number	spacer_location	protospacer_location	repeat_type	spacer_locus_num	spacer_num	correct_crispr_type	genome_cas_prots	unknown_protein_around_crispr	L10	L10_domain	L9	L9_domain	L8	L8_domain	L7	L7_domain	L6	L6_domain	L5	L5_domain	L4	L4_domain	L3	L3_domain	L2	L2_domain	L1	L1_domain	R1	R1_domain	R2	R2_domain	R3	R3_domain	R4	R4_domain	R5	R5_domain	R6	R6_domain	R7	R7_domain	R8	R8_domain	R9	R9_domain	R10	R10_domain
GCF_001190945.1_ASM119094v1	NZ_CP011112	Luteipulveratus mongoliensis strain MN07-A0370 chromosome	1	719001-720790	1,1,1,2	CRT,PILER-CR,CRISPRCasFinder,PILER-CR	no	cas3,cas8e,cse2gr11,cas7,cas5,cas6e	csa3,WYL,cas3,cas8e,cse2gr11,cas7,cas5,cas6e,cas4,DEDDh,DinG	Type I-E	GCGCTCGCGGAGATGAGCC,TCCGCTCCGCGCTCGCGGAGATGAGCCCACG,GTCCGCTCCGCGCTCGCGGAGATGAGCC,CGCTCCGCGCTCGCGGAGATGAGCC	19,31,28,25	0	0	NA	NA	NA:NA:NA:NA	29,19,27,19	29	TypeI-E	csa3,WYL,cas3,cas8e,cse2gr11,cas7,cas5,cas6e,cas4,DEDDh,DinG	NA,NA|477aa|down_0|NZ_CP011112.1_721182_722613_-,NA|177aa|down_1|NZ_CP011112.1_722623_723154_-,NA|188aa|down_4|NZ_CP011112.1_728300_728864_+,NA|148aa|down_5|NZ_CP011112.1_728967_729411_+,NA|168aa|down_7|NZ_CP011112.1_730020_730524_+,NA|61aa|down_9|NZ_CP011112.1_730860_731043_+	NA|259aa|up_9|NZ_CP011112.1_706864_707641_-	pfam02525, Flavodoxin_2, Flavodoxin-like fold	NA|116aa|up_8|NZ_CP011112.1_707717_708065_+	COG1733, COG1733, Predicted transcriptional regulators [Transcription]	NA|451aa|up_7|NZ_CP011112.1_708170_709523_+	COG0277, GlcD, FAD/FMN-containing dehydrogenases [Energy production and conversion]	NA|385aa|up_6|NZ_CP011112.1_709881_711036_+	cd06173, MFS_MefA_like, Macrolide efflux protein A and similar proteins of the Major Facilitator Superfamily of transporters	cas3|921aa|up_5|NZ_CP011112.1_711166_713929_+	PRK09694, PRK09694, CRISPR-associated helicase/endonuclease Cas3	cas8e|549aa|up_4|NZ_CP011112.1_713939_715586_+	pfam09481, CRISPR_Cse1, CRISPR-associated protein Cse1 (CRISPR_cse1)	cse2gr11|219aa|up_3|NZ_CP011112.1_715578_716235_+	pfam09485, CRISPR_Cse2, CRISPR-associated protein Cse2 (CRISPR_cse2)	cas7|377aa|up_2|NZ_CP011112.1_716231_717362_+	pfam09344, Cas_CT1975, CT1975-like protein	cas5|238aa|up_1|NZ_CP011112.1_717358_718072_+	cd09756, Cas5_I-E, CRISPR/Cas system-associated RAMP superfamily protein Cas5	cas6e|240aa|up_0|NZ_CP011112.1_718068_718788_+	pfam08798, CRISPR_assoc, CRISPR associated protein	NA|477aa|down_0|NZ_CP011112.1_721182_722613_-	NA	NA|177aa|down_1|NZ_CP011112.1_722623_723154_-	NA	NA|1167aa|down_2|NZ_CP011112.1_723425_726926_+	cd00200, WD40, WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment	NA|390aa|down_3|NZ_CP011112.1_726922_728092_+	pfam01937, DUF89, Protein of unknown function DUF89	NA|188aa|down_4|NZ_CP011112.1_728300_728864_+	NA	NA|148aa|down_5|NZ_CP011112.1_728967_729411_+	NA	NA|134aa|down_6|NZ_CP011112.1_729503_729905_+	COG0745, OmpR, Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain [Signal transduction mechanisms / Transcription]	NA|168aa|down_7|NZ_CP011112.1_730020_730524_+	NA	NA|91aa|down_8|NZ_CP011112.1_730591_730864_+	pfam02467, Whib, Transcription factor WhiB	NA|61aa|down_9|NZ_CP011112.1_730860_731043_+	NA
GCF_001190945.1_ASM119094v1	NZ_CP011112	Luteipulveratus mongoliensis strain MN07-A0370 chromosome	2	2700067-2700161	2	CRISPRCasFinder	no	csa3	csa3,WYL,cas3,cas8e,cse2gr11,cas7,cas5,cas6e,cas4,DEDDh,DinG	Type I-A	GGTCGAGTAGCCGGAGCGCTAGC	23	0	0	NA	NA	NA	1	1	Orphan	csa3,WYL,cas3,cas8e,cse2gr11,cas7,cas5,cas6e,cas4,DEDDh,DinG	NA,NA|195aa|down_2|NZ_CP011112.1_2702730_2703315_-,NA|258aa|down_3|NZ_CP011112.1_2703406_2704180_+	csa3|273aa|up_9|NZ_CP011112.1_2688247_2689066_-	cd00090, HTH_ARSR, Arsenical Resistance Operon Repressor and similar prokaryotic, metal regulated homodimeric repressors	NA|456aa|up_8|NZ_CP011112.1_2689131_2690499_+	TIGR00711, Uncharacterized_MFS-type_transporter_YhcA, drug resistance transporter, EmrB/QacA subfamily	NA|367aa|up_7|NZ_CP011112.1_2690542_2691643_+	pfam02562, PhoH, PhoH-like protein	NA|165aa|up_6|NZ_CP011112.1_2691639_2692134_+	PRK00016, PRK00016, metal-binding heat shock protein; Provisional	NA|448aa|up_5|NZ_CP011112.1_2692147_2693491_+	COG1253, TlyC, Hemolysins and related proteins containing CBS domains [General function prediction only]	NA|305aa|up_4|NZ_CP011112.1_2693487_2694402_+	PRK00089, era, GTPase Era; Reviewed	NA|245aa|up_3|NZ_CP011112.1_2694402_2695137_+	pfam13649, Methyltransf_25, Methyltransferase domain	NA|206aa|up_2|NZ_CP011112.1_2695142_2695760_-	pfam07077, DUF1345, Protein of unknown function (DUF1345)	NA|267aa|up_1|NZ_CP011112.1_2695771_2696572_-	pfam04454, Linocin_M18, Encapsulating protein for peroxidase	NA|341aa|up_0|NZ_CP011112.1_2696568_2697591_-	pfam04261, Dyp_perox, Dyp-type peroxidase family	NA|387aa|down_0|NZ_CP011112.1_2700454_2701615_-	cd09597, M4_TLP, Peptidase M4 family including thermolysin, protealysin, aureolysin, and neutral protease	NA|337aa|down_1|NZ_CP011112.1_2701713_2702724_+	TIGR04247, nitrous_oxide_maturation_protein_NosD, nitrous oxide reductase family maturation protein NosD	NA|195aa|down_2|NZ_CP011112.1_2702730_2703315_-	NA	NA|258aa|down_3|NZ_CP011112.1_2703406_2704180_+	NA	NA|584aa|down_4|NZ_CP011112.1_2704424_2706176_+	PRK03739, PRK03739, 2-isopropylmalate synthase; Validated	NA|446aa|down_5|NZ_CP011112.1_2706374_2707712_+	cd00322, FNR_like, Ferredoxin reductase (FNR), an FAD and NAD(P) binding protein, was intially identified as a chloroplast reductase activity, catalyzing the electron transfer from reduced iron-sulfur protein ferredoxin to NADP+ as the final step in the electron transport mechanism of photosystem I	NA|840aa|down_6|NZ_CP011112.1_2707716_2710236_-	COG1554, ATH1, Trehalose and maltose hydrolases (possible phosphorylases) [Carbohydrate transport and metabolism]	NA|245aa|down_7|NZ_CP011112.1_2710232_2710967_-	TIGR02009, Hypothetical_protein_Rv3400/MT3508/Mb3433	NA|599aa|down_8|NZ_CP011112.1_2711239_2713036_+	cd03677, MM_CoA_mutase_beta, Coenzyme B12-dependent-methylmalonyl coenzyme A (CoA) mutase (MCM) family, Beta subunit-like subfamily; contains bacterial proteins similar to the beta subunit of MCMs from Propionbacterium shermanni and Streptomyces cinnamonensis, which are alpha/beta heterodimers	NA|744aa|down_9|NZ_CP011112.1_2713032_2715264_+	PRK09426, PRK09426, methylmalonyl-CoA mutase; Reviewed
GCF_001190945.1_ASM119094v1	NZ_CP011112	Luteipulveratus mongoliensis strain MN07-A0370 chromosome	3	4928768-4928844	3	CRISPRCasFinder	no		csa3,WYL,cas3,cas8e,cse2gr11,cas7,cas5,cas6e,cas4,DEDDh,DinG	Orphan	GACCCGGGCCGGTTTCCCTGATACC	25	0	0	NA	NA	NA	1	1	Orphan	csa3,WYL,cas3,cas8e,cse2gr11,cas7,cas5,cas6e,cas4,DEDDh,DinG	NA,NA|287aa|down_5|NZ_CP011112.1_4933871_4934732_-	NA|149aa|up_9|NZ_CP011112.1_4920695_4921142_-	TIGR00026, Hypothetical_protein_Rv1261c/MT1299/Mb1292c	NA|235aa|up_8|NZ_CP011112.1_4921274_4921979_+	pfam00440, TetR_N, Bacterial regulatory proteins, tetR family	NA|438aa|up_7|NZ_CP011112.1_4922119_4923433_+	PRK14853, nhaA, pH-dependent sodium/proton antiporter; Provisional	NA|134aa|up_6|NZ_CP011112.1_4923534_4923936_+	pfam07332, Phage_holin_3_6, Putative Actinobacterial Holin-X, holin superfamily III	NA|308aa|up_5|NZ_CP011112.1_4923967_4924891_+	COG0596, MhpC, Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) [General function prediction only]	NA|218aa|up_4|NZ_CP011112.1_4924880_4925534_-	cd03426, CoAse, Coenzyme A pyrophosphatase (CoAse), a member of the Nudix hydrolase superfamily, functions to catalyze the elimination of oxidized inactive CoA, which can inhibit CoA-utilizing enzymes	NA|411aa|up_3|NZ_CP011112.1_4925562_4926795_-	COG4941, COG4941, Predicted RNA polymerase sigma factor containing a TPR repeat domain [Transcription]	NA|128aa|up_2|NZ_CP011112.1_4926791_4927175_-	COG3795, COG3795, Uncharacterized protein conserved in bacteria [Function unknown]	NA|177aa|up_1|NZ_CP011112.1_4927390_4927921_+	COG0262, FolA, Dihydrofolate reductase [Coenzyme metabolism]	NA|231aa|up_0|NZ_CP011112.1_4927925_4928618_-	COG0177, Nth, Predicted EndoIII-related endonuclease [DNA replication, recombination, and repair]	NA|200aa|down_0|NZ_CP011112.1_4929089_4929689_+	COG0664, Crp, cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases [Signal transduction mechanisms]	NA|372aa|down_1|NZ_CP011112.1_4929775_4930891_-	pfam00665, rve, Integrase core domain	NA|298aa|down_2|NZ_CP011112.1_4931186_4932080_+	pfam13338, AbiEi_4, Transcriptional regulator, AbiEi antitoxin	NA|238aa|down_3|NZ_CP011112.1_4932134_4932848_-	cd16278, metallo-hydrolase-like_MBL-fold, uncharacterized subgroup of the MBL-fold_metallo-hydrolase superfamily; MBL-fold metallo hydrolase domain	NA|273aa|down_4|NZ_CP011112.1_4933046_4933865_+	cd06412, GH25_CH-type, CH-type (Chalaropsis-type) lysozymes represent one of four functionally-defined classes of peptidoglycan hydrolases (also referred to as endo-N-acetylmuramidases) that cleave bacterial cell wall peptidoglycans	NA|287aa|down_5|NZ_CP011112.1_4933871_4934732_-	NA	NA|155aa|down_6|NZ_CP011112.1_4934739_4935204_-	cd02199, YjgF_YER057c_UK114_like_1, This group of proteins belong to a large family of YjgF/YER057c/UK114-like proteins present in bacteria, archaea, and eukaryotes with no definitive function	NA|54aa|down_7|NZ_CP011112.1_4935207_4935369_-	pfam13783, DUF4177, Domain of unknown function (DUF4177)	NA|429aa|down_8|NZ_CP011112.1_4935401_4936688_-	PRK00711, PRK00711, D-amino acid dehydrogenase	NA|319aa|down_9|NZ_CP011112.1_4936870_4937827_+	cd02035, ArsA, Arsenical pump-driving ATPase ArsA
GCF_001190945.1_ASM119094v1	NZ_CP011112	Luteipulveratus mongoliensis strain MN07-A0370 chromosome	4	5250344-5250499	4	CRISPRCasFinder	no		csa3,WYL,cas3,cas8e,cse2gr11,cas7,cas5,cas6e,cas4,DEDDh,DinG	Orphan	GCGCTCCGGCTACTCGACCAGCGAGGTATCGCTGGTCGAGTAGGCGAGG	49	0	0	NA	NA	NA	1	1	Orphan	csa3,WYL,cas3,cas8e,cse2gr11,cas7,cas5,cas6e,cas4,DEDDh,DinG	NA|174aa|up_2|NZ_CP011112.1_5246398_5246920_-,NA	NA|376aa|up_9|NZ_CP011112.1_5242063_5243191_-	COG3268, COG3268, Uncharacterized conserved protein [Function unknown]	NA|265aa|up_8|NZ_CP011112.1_5243200_5243995_-	cd05341, 3beta-17beta-HSD_like_SDR_c, 3beta17beta hydroxysteroid dehydrogenase-like, classical (c) SDRs	NA|209aa|up_7|NZ_CP011112.1_5244149_5244776_+	COG1309, AcrR, Transcriptional regulator [Transcription]	NA|146aa|up_6|NZ_CP011112.1_5244895_5245333_-	PRK05395, PRK05395, type II 3-dehydroquinate dehydratase	NA|84aa|up_5|NZ_CP011112.1_5245419_5245671_+	PRK11409, PRK11409, YoeB-YefM toxin-antitoxin system antitoxin YefM	NA|87aa|up_4|NZ_CP011112.1_5245667_5245928_+	pfam06769, YoeB_toxin, YoeB-like toxin of bacterial type II toxin-antitoxin system	NA|157aa|up_3|NZ_CP011112.1_5245931_5246402_-	cd14775, TrHb2_O-like, Truncated hemoglobins, group 2 (O); uncharacterized subgroup	NA|174aa|up_2|NZ_CP011112.1_5246398_5246920_-	NA	NA|815aa|up_1|NZ_CP011112.1_5247019_5249464_+	PRK05261, PRK05261, phosphoketolase	NA|273aa|up_0|NZ_CP011112.1_5249460_5250279_+	COG1357, COG1357, Pentapeptide repeats containing protein [Function unknown]	NA|403aa|down_0|NZ_CP011112.1_5250571_5251780_-	PRK00180, PRK00180, acetate kinase A/propionate kinase 2; Reviewed	NA|693aa|down_1|NZ_CP011112.1_5251776_5253855_-	PRK05632, PRK05632, phosphate acetyltransferase; Reviewed	NA|120aa|down_2|NZ_CP011112.1_5253947_5254307_+	pfam04248, NTP_transf_9, Domain of unknown function (DUF427)	NA|420aa|down_3|NZ_CP011112.1_5254332_5255592_-	COG0330, HflC, Membrane protease subunits, stomatin/prohibitin homologs [Posttranslational modification, protein turnover, chaperones]	NA|146aa|down_4|NZ_CP011112.1_5255584_5256022_-	COG1585, COG1585, Membrane protein implicated in regulation of membrane protease activity [Posttranslational modification, protein turnover, chaperones / Intracellular trafficking and secretion]	NA|617aa|down_5|NZ_CP011112.1_5256107_5257958_+	cd14951, NHL-2_like, NHL repeat domain of NHL repeat-containing protein 2 and similar proteins	NA|271aa|down_6|NZ_CP011112.1_5258179_5258992_-	cd07583, nitrilase_5, Uncharacterized subgroup of the nitrilase superfamily (putative class 13 nitrilases)	NA|643aa|down_7|NZ_CP011112.1_5258994_5260923_-	pfam09678, Caa3_CtaG, Cytochrome c oxidase caa3 assembly factor (Caa3_CtaG)	NA|287aa|down_8|NZ_CP011112.1_5261092_5261953_+	TIGR03083, TIGR03083, uncharacterized Actinobacterial protein TIGR03083	NA|190aa|down_9|NZ_CP011112.1_5261960_5262530_-	COG0262, FolA, Dihydrofolate reductase [Coenzyme metabolism]
