assembly_id	genome_id	genome_def	crispr_array_locus_merge	crispr_array_location_merge	crispr_locus_id	crispr_pred_method	array_in_prot	prot_within_array_20000	prot_in_genome	crispr_type_by_cas_prot	consensus_repeat	repeat_length	self-targeting_spacer_number	self-targeting_target_number	spacer_location	protospacer_location	repeat_type	spacer_locus_num	spacer_num	correct_crispr_type	genome_cas_prots	unknown_protein_around_crispr	L10	L10_domain	L9	L9_domain	L8	L8_domain	L7	L7_domain	L6	L6_domain	L5	L5_domain	L4	L4_domain	L3	L3_domain	L2	L2_domain	L1	L1_domain	R1	R1_domain	R2	R2_domain	R3	R3_domain	R4	R4_domain	R5	R5_domain	R6	R6_domain	R7	R7_domain	R8	R8_domain	R9	R9_domain	R10	R10_domain
GCF_000292705.1_ASM29270v1	NC_018508	Bacillus thuringiensis HD-789, complete sequence	1	192488-192563	1	CRISPRCasFinder	no	csa3	cas14k,csa3,WYL,c2c9_V-U4,cas14j,DinG,cas3,DEDDh	Type I-A	ATCATCATCATGGAGGACACAATCA	25	0	0	NA	NA	NA	1	1	Orphan	cas14k,csa3,WYL,c2c9_V-U4,cas14j,DinG,cas3,DEDDh,RT,cas4	NA,NA|48aa|down_8|NC_018508.1_205570_205714_+	NA|335aa|up_9|NC_018508.1_181209_182214_+	pfam01032, FecCD, FecCD transport family	NA|353aa|up_8|NC_018508.1_182210_183269_+	pfam01032, FecCD, FecCD transport family	NA|274aa|up_7|NC_018508.1_183281_184103_+	COG1120, FepC, ABC-type cobalamin/Fe3+-siderophores transport systems, ATPase components [Inorganic ion transport and metabolism / Coenzyme metabolism]	NA|244aa|up_6|NC_018508.1_184134_184866_-	pfam13649, Methyltransf_25, Methyltransferase domain	NA|397aa|up_5|NC_018508.1_185079_186270_+	PRK06939, PRK06939, 2-amino-3-ketobutyrate coenzyme A ligase; Provisional	NA|322aa|up_4|NC_018508.1_186314_187280_+	cd05272, TDH_SDR_e, L-threonine dehydrogenase, extended (e) SDRs	NA|141aa|up_3|NC_018508.1_187339_187762_+	cd02883, Nudix_Hydrolase, Nudix hydrolase is a superfamily of enzymes found in all three kingdoms of life, and it catalyzes the hydrolysis of NUcleoside DIphosphates linked to other moieties, X	NA|628aa|up_2|NC_018508.1_187799_189683_-	COG4548, NorD, Nitric oxide reductase activation protein [Inorganic ion transport and metabolism]	NA|298aa|up_1|NC_018508.1_189686_190580_-	COG0714, COG0714, MoxR-like ATPases [General function prediction only]	NA|510aa|up_0|NC_018508.1_190707_192237_-	PRK12452, PRK12452, cardiolipin synthase	NA|568aa|down_0|NC_018508.1_193286_194990_+	PRK10060, PRK10060, cyclic di-GMP phosphodiesterase	NA|466aa|down_1|NC_018508.1_195021_196419_-	TIGR00905, Arginine/ornithine_antiporter, transporter, basic amino acid/polyamine antiporter (APA) family	NA|237aa|down_2|NC_018508.1_196871_197582_+	TIGR02404, Trehalose_operon_transcriptional_repressor, trehalose operon repressor, B	NA|476aa|down_3|NC_018508.1_197724_199152_+	TIGR01992, phosphotransferase_system_trehalose_permease, PTS system, trehalose-specific IIBC component	NA|554aa|down_4|NC_018508.1_199165_200827_+	TIGR02403, Trehalose-6-phosphate_hydrolase, alpha,alpha-phosphotrehalase	NA|369aa|down_5|NC_018508.1_201964_203071_-	pfam03845, Spore_permease, Spore germination protein	NA|501aa|down_6|NC_018508.1_203051_204554_-	pfam03323, GerA, Bacillus/Clostridium GerA spore germination protein	NA|274aa|down_7|NC_018508.1_204742_205564_+	COG2334, COG2334, Putative homoserine kinase type II (protein kinase fold) [General function prediction only]	NA|48aa|down_8|NC_018508.1_205570_205714_+	NA	NA|487aa|down_9|NC_018508.1_205873_207334_+	pfam01235, Na_Ala_symp, Sodium:alanine symporter family
GCF_000292705.1_ASM29270v1	NC_018508	Bacillus thuringiensis HD-789, complete sequence	2	4521837-4521953	2	CRISPRCasFinder	no		cas14k,csa3,WYL,c2c9_V-U4,cas14j,DinG,cas3,DEDDh	Orphan	CTTAAACAAGCGTTTGATTAATTCTCCATTTTTCTT	36	0	0	NA	NA	NA	1	1	Orphan	cas14k,csa3,WYL,c2c9_V-U4,cas14j,DinG,cas3,DEDDh,RT,cas4	NA|115aa|up_4|NC_018508.1_4519086_4519431_-,NA|176aa|down_0|NC_018508.1_4522052_4522580_-,NA|62aa|down_5|NC_018508.1_4525925_4526111_-	NA|262aa|up_9|NC_018508.1_4514251_4515037_-	COG0396, sufC, Cysteine desulfurase activator ATPase [Posttranslational modification, protein turnover, chaperones]	NA|269aa|up_8|NC_018508.1_4515275_4516082_-	COG1464, NlpA, ABC-type metal ion transport system, periplasmic component/surface antigen [Inorganic ion transport and metabolism]	NA|271aa|up_7|NC_018508.1_4516153_4516966_-	COG1464, NlpA, ABC-type metal ion transport system, periplasmic component/surface antigen [Inorganic ion transport and metabolism]	NA|222aa|up_6|NC_018508.1_4516989_4517655_-	COG2011, AbcD, ABC-type metal ion transport system, permease component [Inorganic ion transport and metabolism]	NA|342aa|up_5|NC_018508.1_4517647_4518673_-	COG1135, AbcC, ABC-type metal ion transport system, ATPase component [Inorganic ion transport and metabolism]	NA|115aa|up_4|NC_018508.1_4519086_4519431_-	NA	NA|100aa|up_3|NC_018508.1_4519583_4519883_-	cd02947, TRX_family, TRX family; composed of two groups: Group I, which includes proteins that exclusively encode a TRX domain; and Group II, which are composed of fusion proteins of TRX and additional domains	NA|115aa|up_2|NC_018508.1_4519895_4520240_-	COG1658, COG1658, Small primase-like proteins (Toprim domain) [DNA replication, recombination, and repair]	NA|128aa|up_1|NC_018508.1_4520850_4521234_-	PRK01202, PRK01202, glycine cleavage system protein GcvH	NA|122aa|up_0|NC_018508.1_4521275_4521641_-	cd03036, ArsC_like, Arsenate Reductase (ArsC) family, unknown subfamily; uncharacterized proteins containing a CXXC motif with similarity to thioredoxin (TRX)-fold arsenic reductases, ArsC	NA|176aa|down_0|NC_018508.1_4522052_4522580_-	NA	NA|216aa|down_1|NC_018508.1_4522724_4523372_+	cd03386, PAP2_Aur1_like, PAP2_like proteins, Aur1_like subfamily	NA|338aa|down_2|NC_018508.1_4523431_4524445_-	pfam13303, PTS_EIIC_2, Phosphotransferase system, EIIC	NA|390aa|down_3|NC_018508.1_4524467_4525637_-	cd05291, HicDH_like, L-2-hydroxyisocapronate dehydrogenases and some bacterial L-lactate dehydrogenases	NA|83aa|down_4|NC_018508.1_4525663_4525912_-	pfam07875, Coat_F, Coat F domain	NA|62aa|down_5|NC_018508.1_4525925_4526111_-	NA	NA|240aa|down_6|NC_018508.1_4526224_4526944_-	cd07721, yflN-like_MBL-fold, uncharacterized subgroup which includes Bacillus subtilis yflN; MBL-fold metallo hydrolase domain	NA|601aa|down_7|NC_018508.1_4527059_4528862_-	cd01161, VLCAD, Very long chain acyl-CoA dehydrogenase	NA|391aa|down_8|NC_018508.1_4529232_4530405_-	PRK07661, PRK07661, acetyl-CoA C-acetyltransferase	NA|794aa|down_9|NC_018508.1_4530426_4532808_-	COG1250, FadB, 3-hydroxyacyl-CoA dehydrogenase [Lipid metabolism]
GCF_000292705.1_ASM29270v1	NC_018508	Bacillus thuringiensis HD-789, complete sequence	3	4777128-4777261	3	CRISPRCasFinder	no		cas14k,csa3,WYL,c2c9_V-U4,cas14j,DinG,cas3,DEDDh	Orphan	GTTGATTTCTCTTCTTTTTGAGA	23	0	0	NA	NA	NA	2	2	Orphan	cas14k,csa3,WYL,c2c9_V-U4,cas14j,DinG,cas3,DEDDh,RT,cas4	NA|45aa|up_0|NC_018508.1_4776805_4776940_-,NA	NA|220aa|up_9|NC_018508.1_4768807_4769467_-	TIGR03025, EPS_sugtrans, exopolysaccharide biosynthesis polyprenyl glycosylphosphotransferase	NA|293aa|up_8|NC_018508.1_4769485_4770364_-	COG1210, GalU, UDP-glucose pyrophosphorylase [Cell envelope biogenesis, outer membrane]	NA|256aa|up_7|NC_018508.1_4770603_4771371_-	COG4464, CapC, Capsular polysaccharide biosynthesis protein [Carbohydrate transport and metabolism / Cell envelope biogenesis, outer membrane]	NA|234aa|up_6|NC_018508.1_4771482_4772184_-	cd05387, BY-kinase, bacterial tyrosine-kinase	NA|248aa|up_5|NC_018508.1_4772173_4772917_-	COG3944, COG3944, Capsular polysaccharide biosynthesis protein [Cell envelope biogenesis, outer membrane]	NA|226aa|up_4|NC_018508.1_4773180_4773858_-	cd05387, BY-kinase, bacterial tyrosine-kinase	NA|145aa|up_3|NC_018508.1_4774199_4774634_-	PRK00006, fabZ, 3-hydroxyacyl-ACP dehydratase FabZ	NA|334aa|up_2|NC_018508.1_4775062_4776064_-	PRK13928, PRK13928, rod shape-determining protein Mbl; Provisional	NA|91aa|up_1|NC_018508.1_4776224_4776497_-	pfam12116, SpoIIID, Stage III sporulation protein D	NA|45aa|up_0|NC_018508.1_4776805_4776940_-	NA	NA|236aa|down_0|NC_018508.1_4778149_4778857_-	pfam12698, ABC2_membrane_3, ABC-2 family transporter protein	NA|281aa|down_1|NC_018508.1_4778856_4779699_-	COG1131, CcmA, ABC-type multidrug transport system, ATPase component [Defense mechanisms]	NA|336aa|down_2|NC_018508.1_4779880_4780888_-	COG1131, CcmA, ABC-type multidrug transport system, ATPase component [Defense mechanisms]	NA|340aa|down_3|NC_018508.1_4780986_4782006_-	TIGR02870, Stage_II_sporulation_protein_D, stage II sporulation protein D	NA|435aa|down_4|NC_018508.1_4782214_4783519_-	PRK09369, PRK09369, UDP-N-acetylglucosamine 1-carboxyvinyltransferase; Validated	NA|237aa|down_5|NC_018508.1_4783558_4784269_-	pfam08680, DUF1779, TATA-box binding	NA|79aa|down_6|NC_018508.1_4784314_4784551_-	COG4836, COG4836, Predicted membrane protein [Function unknown]	NA|507aa|down_7|NC_018508.1_4784753_4786274_-	PRK05777, PRK05777, NADH-quinone oxidoreductase subunit NuoN	NA|501aa|down_8|NC_018508.1_4786275_4787778_-	PRK05846, PRK05846, NADH:ubiquinone oxidoreductase subunit M; Reviewed	NA|621aa|down_9|NC_018508.1_4787774_4789637_-	PRK06590, PRK06590, NADH:ubiquinone oxidoreductase subunit L; Reviewed
GCF_000292705.1_ASM29270v1	NC_018516	Bacillus thuringiensis HD-789 plasmid pBTHD789-1, complete sequence	1	267554-267691	1	CRISPRCasFinder	no			Orphan	CTTCTTGCTTTTGAACAGGTTTAG	24	0	0	NA	NA	NA	2	2	Orphan	cas14k,csa3,WYL,c2c9_V-U4,cas14j,DinG,cas3,DEDDh,RT,cas4	NA|108aa|up_9|NC_018516.1_261240_261564_-,NA|86aa|up_8|NC_018516.1_261584_261842_-,NA|237aa|up_7|NC_018516.1_262166_262877_-,NA|120aa|up_6|NC_018516.1_262888_263248_-,NA|127aa|up_5|NC_018516.1_263266_263647_-,NA|139aa|up_4|NC_018516.1_263652_264069_-,NA|247aa|up_3|NC_018516.1_264088_264829_-,NA|204aa|up_2|NC_018516.1_264936_265548_-,NA|133aa|up_1|NC_018516.1_265625_266024_-,NA|139aa|down_2|NC_018516.1_281417_281834_-,NA|194aa|down_3|NC_018516.1_281913_282495_-,NA|137aa|down_9|NC_018516.1_296444_296855_-	NA|108aa|up_9|NC_018516.1_261240_261564_-	NA	NA|86aa|up_8|NC_018516.1_261584_261842_-	NA	NA|237aa|up_7|NC_018516.1_262166_262877_-	NA	NA|120aa|up_6|NC_018516.1_262888_263248_-	NA	NA|127aa|up_5|NC_018516.1_263266_263647_-	NA	NA|139aa|up_4|NC_018516.1_263652_264069_-	NA	NA|247aa|up_3|NC_018516.1_264088_264829_-	NA	NA|204aa|up_2|NC_018516.1_264936_265548_-	NA	NA|133aa|up_1|NC_018516.1_265625_266024_-	NA	NA|283aa|up_0|NC_018516.1_266190_267039_-	cd10446, GIY-YIG_unchar_1, GIY-YIG domain of uncharacterized hypothetical protein found in bacteria	NA|3527aa|down_0|NC_018516.1_268946_279527_-	TIGR04226, Fimbrial_subunit_type_2, fimbrial isopeptide formation D2 domain	NA|353aa|down_1|NC_018516.1_280105_281164_-	pfam16403, DUF5011, Domain of unknown function (DUF5011)	NA|139aa|down_2|NC_018516.1_281417_281834_-	NA	NA|194aa|down_3|NC_018516.1_281913_282495_-	NA	NA|237aa|down_4|NC_018516.1_282514_283225_-	cd06165, Sortase_A, Sortase domain found in class A sortases	NA|471aa|down_5|NC_018516.1_283385_284798_-	COG4990, COG4990, Uncharacterized protein conserved in bacteria [Function unknown]	NA|2561aa|down_6|NC_018516.1_285220_292903_-	COG4932, COG4932, Predicted outer membrane protein [Cell envelope biogenesis, outer membrane]	NA|423aa|down_7|NC_018516.1_293542_294811_-	COG3786, COG3786, Uncharacterized protein conserved in bacteria [Function unknown]	NA|396aa|down_8|NC_018516.1_295020_296208_-	smart00475, 53EXOc, 5'-3' exonuclease	NA|137aa|down_9|NC_018516.1_296444_296855_-	NA
