Align acetohydroxyacid synthase subunit B (EC 2.2.1.6) (characterized)
to candidate WP_007473040.1 CMTB2_RS00750 acetolactate synthase, large subunit, biosynthetic type
Query= metacyc::MONOMER-18810
         (585 letters)

>NCBI__GCF_000170735.1:WP_007473040.1
          Length = 565

 Score =  568 bits (1464), Expect = e-166
 Identities = 294/568 (51%), Positives = 388/568 (68%), Gaps = 6/568 (1%)

Query: 18  EMIGAEILVHALAEEGVEYVWGYPGGAVLYIYDELHKQTKFEHILVRHEQAAVHAADGYA 77
           +M GA I++ +L +E VE V+GYPGGA++ +YDE++KQ KF+HIL +HEQ AVH ADGYA
Sbjct: 2   KMTGARIVIESLIKENVEVVFGYPGGAIMNVYDEIYKQDKFKHILTKHEQGAVHMADGYA 61

Query: 78  RATGKVGVALVTSGPGVTNAVTGIATAYLDSIPMVVITGNVPTHAIGQDAFQECDTVGIT 137
           RATGKVGVA VTSGPG+TNA+TGIATAY DSIPMVVI+G VPT AIG DAFQE D VGIT
Sbjct: 62  RATGKVGVAFVTSGPGITNAITGIATAYTDSIPMVVISGQVPTTAIGTDAFQEVDAVGIT 121

Query: 138 RPIVKHNFLVKDVRDLAATIKKAFFIAATGRPGPVVVDIPKDVSRNACKYEYPKSIDMRS 197
           RPI KHNFLVKDV+DLA  IK+AF++A +GRPGPV +DIPK+V+ +  ++ YP+ ID+++
Sbjct: 122 RPITKHNFLVKDVKDLAYIIKEAFYLAKSGRPGPVHIDIPKNVTADMTEFIYPEKIDLKT 181

Query: 198 YNPVNKGHSGQIRKAVALLQGAERPYIYTGGGVVLANASDELRQLAALTGHPVTNTLMGL 257
           Y P  KG+   I++ V  ++ A++P  Y GGG VL+ AS+ +R++   T  P   TLM
Sbjct: 182 YKPNYKGNKRAIKRTVEAIKNAKKPVFYIGGGSVLSGASNIIREIVKTTQIPAVETLMAR 241

Query: 258 GAFPGTSKQFVGMLGMHGTYEANMAMQNCDVLIAIGARFDDRVIGNPAHFTSQARKIIHI 317
           G         +GM+GMHG+Y ANMAM + D++I++GARFDDRV G    F   A  IIHI
Sbjct: 242 GVLRYDCPYLLGMVGMHGSYAANMAMNDADLIISLGARFDDRVTGKIDEFAKNA-DIIHI 300

Query: 318 DIDPSSISKRVKVDIPIVGNVKDVLQELIAQIKASDIKPKREALAKWWEQIEQWRSVDCL 377
           DIDPS I K V+   PIVG+VK VL+E++  +    I P R    +W E +++++ +  L
Sbjct: 301 DIDPSQIGKVVETKYPIVGDVKLVLEEMMPML-LDGIDPNR--YEEWREILKRYKELYPL 357

Query: 378 KYDRSSEIIKPQYVVEKIWELTKGDAFICSDVGQHQMWAAQFYKFDEPRRWINSGGLGTM 437
            Y  S E+IKPQ+V++K  EL   DA I +DVGQHQMWAAQFY F  P++++ SGGLGTM
Sbjct: 358 TYSDSDEVIKPQWVIQKTGELAPEDAIISTDVGQHQMWAAQFYPFTYPKQFLTSGGLGTM 417

Query: 438 GVGLPYAMGIKKAFPEKEVVTITGEGSIQMCIQELSTCLQYDTPVKICSLNNGYLGMVRQ 497
           G G P A+G K+   +K V+  TG+GSI M IQE+ T  +Y  PV    LNN YLGMVRQ
Sbjct: 418 GFGFPAALGAKEGKKDKVVINFTGDGSIVMNIQEVLTGYKYKLPVINIILNNNYLGMVRQ 477

Query: 498 WQEIEYDNRYSHSYM-DALPDFVKLAEAYGHVGMRVEKTSDVEPALREAFRLKDRTVFLD 556
           WQ + Y++R S + + D  PDF+KLAE+ G  G   +   + E    EA    +   F+D
Sbjct: 478 WQTMFYEDRLSETDLSDVQPDFIKLAESMGGRGFTAKTKEEFEKVFNEALN-SNAVCFID 536

Query: 557 FQTDPTENVWPMVQAGKGISEMLLGAED 584
            Q D  E+V PMV     +  ML+  ED
Sbjct: 537 VQVDRREDVLPMVPPNSPLKNMLVFKED 564

Lambda     K      H
   0.319    0.135    0.407

Gapped
Lambda     K      H
   0.267   0.0410    0.140

Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 878
Number of extensions: 33
Number of successful extensions: 5
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 585
Length of database: 565
Length adjustment: 36
Effective length of query: 549
Effective length of database: 529
Effective search space: 290421
Effective search space used: 290421
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.8 bits)
S2: 53 (25.0 bits)
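The summary numbers in the BLAST report above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not part of GapMind itself: the variable names are illustrative, and "coverage" is taken here to mean the fraction of a sequence spanned by the alignment, which is an assumption, since this page does not spell out how coverage is computed.

```python
# Values copied from the BLAST report above; names are illustrative only.
identities, aligned_columns = 294, 568
query_start, query_end, query_len = 18, 584, 585
sbjct_start, sbjct_end, sbjct_len = 2, 564, 565
length_adjustment = 36


def span_fraction(start, end, total):
    """Fraction of a sequence spanned by the alignment (assumed definition of coverage)."""
    return (end - start + 1) / total


percent_identity = 100 * identities / aligned_columns
query_coverage = 100 * span_fraction(query_start, query_end, query_len)
sbjct_coverage = 100 * span_fraction(sbjct_start, sbjct_end, sbjct_len)

# Effective lengths are the raw lengths minus the reported length adjustment;
# their product is the effective search space.
effective_search_space = (query_len - length_adjustment) * (sbjct_len - length_adjustment)

print(f"identity: {percent_identity:.1f}%")
print(f"query coverage: {query_coverage:.1f}%")
print(f"subject coverage: {sbjct_coverage:.1f}%")
print(f"effective search space: {effective_search_space}")
```

The product of the two effective lengths (549 and 529) reproduces the reported effective search space of 290421.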
Align candidate WP_007473040.1 CMTB2_RS00750 (acetolactate synthase, large subunit, biosynthetic type)
to HMM TIGR00118 (ilvB: acetolactate synthase, large subunit, biosynthetic type (EC 2.2.1.6))
# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.aa/TIGR00118.hmm
# target sequence database:        /tmp/gapView.11879.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR00118  [M=557]
Accession:   TIGR00118
Description: acolac_lg: acetolactate synthase, large subunit, biosynthetic type
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                                  Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                                  -----------
   2.9e-245  800.9   2.2   3.4e-245  800.6   2.2    1.0  1  lcl|NCBI__GCF_000170735.1:WP_007473040.1  CMTB2_RS00750 acetolactate synth

Domain annotation for each sequence (and alignments):
>> lcl|NCBI__GCF_000170735.1:WP_007473040.1  CMTB2_RS00750 acetolactate synthase, large subunit, biosynthetic type
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  800.6   2.2  3.4e-245  3.4e-245       1     555 [.       3     560 ..       3     562 .. 0.99

  Alignments for each domain:
  == domain 1  score: 800.6 bits;  conditional E-value: 3.4e-245
                                 TIGR00118   1 lkgaeilveslkkegvetvfGyPGGavlpiydaly.dselehilvrheqaaahaadGyarasGkvGvvl 68
                                               ++ga+i++esl ke+ve+vfGyPGGa++++yd++y +++++hil++heq+a+h+adGyara+GkvGv++
  lcl|NCBI__GCF_000170735.1:WP_007473040.1   3 MTGARIVIESLIKENVEVVFGYPGGAIMNVYDEIYkQDKFKHILTKHEQGAVHMADGYARATGKVGVAF 71
                                               79*********************************999******************************* PP

                                 TIGR00118  69 atsGPGatnlvtgiatayldsvPlvvltGqvatsliGsdafqeidilGitlpvtkhsflvkkaedlpei 137
                                               +tsGPG tn++tgiatay+ds+P+vv++Gqv+t++iG+dafqe+d +Git+p+tkh+flvk+++dl+ i
  lcl|NCBI__GCF_000170735.1:WP_007473040.1  72 VTSGPGITNAITGIATAYTDSIPMVVISGQVPTTAIGTDAFQEVDAVGITRPITKHNFLVKDVKDLAYI 140
                                               ********************************************************************* PP

                                 TIGR00118 138 lkeafeiastGrPGPvlvdlPkdvteaeieleveekvelpgykptvkghklqikkaleliekakkPvll 206
                                               +keaf++a++GrPGPv++d+Pk+vt+  +e+ ++ek++l++ykp+ kg+k+ ik+++e+i++akkPv +
  lcl|NCBI__GCF_000170735.1:WP_007473040.1 141 IKEAFYLAKSGRPGPVHIDIPKNVTADMTEFIYPEKIDLKTYKPNYKGNKRAIKRTVEAIKNAKKPVFY 209
                                               ********************************************************************* PP

                                 TIGR00118 207 vGgGviiaeaseelkelaerlkipvtttllGlGafpedhplalgmlGmhGtkeanlavseadlliavGa 275
                                               +GgG + ++as+ ++e++++++ip + tl+  G +  d p  lgm+GmhG+++an+a+++adl+i++Ga
  lcl|NCBI__GCF_000170735.1:WP_007473040.1 210 IGGGSVLSGASNIIREIVKTTQIPAVETLMARGVLRYDCPYLLGMVGMHGSYAANMAMNDADLIISLGA 278
                                               ********************************************************************* PP

                                 TIGR00118 276 rfddrvtgnlakfapeakiihididPaeigknvkvdipivGdakkvleellkklkee..ekkekeWlek 342
                                               rfddrvtg++++fa++a iihididP++igk+v+++ pivGd+k vlee++  l +    ++ +eW e
  lcl|NCBI__GCF_000170735.1:WP_007473040.1 279 RFDDRVTGKIDEFAKNADIIHIDIDPSQIGKVVETKYPIVGDVKLVLEEMMPMLLDGIDPNRYEEWREI 347
                                               ***************************************************99988765445556**** PP

                                 TIGR00118 343 ieewkkeyilkldeeeesikPqkvikelskllkdeaivttdvGqhqmwaaqfyktkkprkfitsgGlGt 411
                                               ++++k+ y+l+++ ++e ikPq vi++  +l++++ai++tdvGqhqmwaaqfy++++p++f+tsgGlGt
  lcl|NCBI__GCF_000170735.1:WP_007473040.1 348 LKRYKELYPLTYSDSDEVIKPQWVIQKTGELAPEDAIISTDVGQHQMWAAQFYPFTYPKQFLTSGGLGT 416
                                               ********************************************************************* PP

                                 TIGR00118 412 mGfGlPaalGakvakpeetvvavtGdgsfqmnlqelstiveydipvkivilnnellGmvkqWqelfyee 480
                                               mGfG+PaalGak +k+++ v++ tGdgs+ mn+qe+ t  +y++pv+ +ilnn++lGmv+qWq +fye+
  lcl|NCBI__GCF_000170735.1:WP_007473040.1 417 MGFGFPAALGAKEGKKDKVVINFTGDGSIVMNIQEVLTGYKYKLPVINIILNNNYLGMVRQWQTMFYED 485
                                               ********************************************************************* PP

                                 TIGR00118 481 rysetklaselpdfvklaeayGvkgiriekpeeleeklkealeskepvlldvevdkeeevlPmvapGag 549
                                               r set l+  +pdf+klae++G +g + +++ee e+ ++eal+s++  ++dv+vd++e+vlPmv+p +
  lcl|NCBI__GCF_000170735.1:WP_007473040.1 486 RLSETDLSDVQPDFIKLAESMGGRGFTAKTKEEFEKVFNEALNSNAVCFIDVQVDRREDVLPMVPPNSP 554
                                               ********************************************************************* PP

                                 TIGR00118 550 ldelve 555
                                               l++++
  lcl|NCBI__GCF_000170735.1:WP_007473040.1 555 LKNMLV 560
                                               **9986 PP

Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (557 nodes)
Target sequences:                          1  (565 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.02u 0.01s 00:00:00.03 Elapsed: 00:00:00.02
# Mc/sec: 10.50
//
[ok]
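The domain row of the hmmsearch report above shows how much of the model and of the candidate the alignment spans. The following is a minimal sketch that reads that row by splitting on whitespace. The field positions follow the column headers shown in the report, the variable names are illustrative, and treating "coverage" as aligned span divided by total length is an assumption rather than a definition taken from this page.

```python
# Domain row copied from the hmmsearch report above (whitespace-separated).
domain_row = "1 ! 800.6 2.2 3.4e-245 3.4e-245 1 555 [. 3 560 .. 3 562 .. 0.99"
model_length = 557    # from "Query: TIGR00118 [M=557]"
target_length = 565   # from "Target sequences: 1 (565 residues searched)"

fields = domain_row.split()
# Column order: # ! score bias c-Evalue i-Evalue hmmfrom hmmto [. alifrom alito .. envfrom envto .. acc
score = float(fields[2])
hmm_from, hmm_to = int(fields[6]), int(fields[7])
ali_from, ali_to = int(fields[9]), int(fields[10])

# Assumed definition: aligned span divided by total length.
model_coverage = (hmm_to - hmm_from + 1) / model_length
target_coverage = (ali_to - ali_from + 1) / target_length

print(f"score: {score} bits")
print(f"model coverage: {model_coverage:.1%}")
print(f"target coverage: {target_coverage:.1%}")
```

Under this definition the alignment spans 555 of the model's 557 nodes and 558 of the candidate's 565 residues.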
This GapMind analysis is from Apr 10 2024. The underlying query database was built on Apr 09 2024.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other blast hits with at least 50% coverage are "low confidence."
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
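The gap rule stated earlier (a step with no high- or medium-confidence candidates may be considered a gap) can be sketched as a small filter. This is an illustration only: the step names, the input layout, and the function name `find_gaps` are hypothetical and do not reflect GapMind's actual code or data format.

```python
def find_gaps(candidates_by_step):
    """Return steps that have no high- or medium-confidence candidate.

    candidates_by_step maps a step name to the confidence labels of its
    candidates. Both the names and the layout are hypothetical.
    """
    return [
        step
        for step, levels in candidates_by_step.items()
        if not any(level in ("high", "medium") for level in levels)
    ]


# Hypothetical example input: four steps with made-up candidate labels.
example = {
    "stepA": ["high", "low"],
    "stepB": ["low"],
    "stepC": [],
    "stepD": ["medium"],
}
gaps = find_gaps(example)
print(gaps)
```

In this example, stepB (only a low-confidence candidate) and stepC (no candidates at all) are reported as gaps.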
For more information, see:
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory