Align acetohydroxyacid synthase subunit B (EC 2.2.1.6) (characterized)
to candidate WP_012673975.1 SULAZ_RS00405 biosynthetic-type acetolactate synthase large subunit
Query= metacyc::MONOMER-18810
         (585 letters)

>NCBI__GCF_000021545.1:WP_012673975.1
          Length = 581

 Score =  621 bits (1602), Expect = 0.0
 Identities = 310/564 (54%), Positives = 407/564 (72%), Gaps = 7/564 (1%)

Query: 21  GAEILVHALAEEGVEYVWGYPGGAVLYIYDELHKQTKFEHILVRHEQAAVHAADGYARAT 80
           GA+I+V  L  EGV+ ++G PGGA++ +YD L     F+++L RHEQAA H ADGYARAT
Sbjct: 6   GADIVVDVLLHEGVDTIFGLPGGAIMEVYDALF-DAPFKNVLTRHEQAACHMADGYARAT 64

Query: 81  GKVGVALVTSGPGVTNAVTGIATAYLDSIPMVVITGNVPTHAIGQDAFQECDTVGITRPI 140
           GKVGV + TSGPG TN VTG+ATAY+DSIP+V ITG VP H IG DAFQE D +GITRPI
Sbjct: 65  GKVGVVIATSGPGATNLVTGLATAYMDSIPLVAITGQVPRHYIGTDAFQEADVIGITRPI 124

Query: 141 VKHNFLVKDVRDLAATIKKAFFIAATGRPGPVVVDIPKDVSRNACKYEYPKSIDMRS--- 197
            KHNFLV D++DL   +++AF+IA TGRPGPV+VDIPKD+++   +Y  P   +++
Sbjct: 125 TKHNFLVTDIKDLPLILRQAFYIARTGRPGPVLVDIPKDITQQVSEYYIPSDEEVKESLP 184

Query: 198 -YNPVNKGHSGQIRKAVALLQGAERPYIYTGGGVVLANASDELRQLAALTGHPVTNTLMG 256
            YNP  +G+  QI+KA  L++ A RP +Y GGG +LA+A++E+ +LA LT  PVT T MG
Sbjct: 185 GYNPHVEGNPVQIKKAAELIRKATRPVLYVGGGAILADAAEEVTKLARLTKIPVTTTNMG 244

Query: 257 LGAFPGTSKQFVGMLGMHGTYEANMAMQNCDVLIAIGARFDDRVIGNPAHFTSQARKIIH 316
            GAFP T    + MLGMHGTY ANMA+ + D+LIA+GARFDDRV G  + F  +A KIIH
Sbjct: 245 KGAFPETDPLSLHMLGMHGTYYANMAVYHSDLLIAVGARFDDRVTGKISEFAPEA-KIIH 303

Query: 317 IDIDPSSISKRVKVDIPIVGNVKDVLQELIAQIKASDIKPKREALAKWWEQIEQWRSVDC 376
           IDIDP+SISK + VD+PIVG+VK+VLQ+LI +++   ++    A   W +QI++W+
Sbjct: 304 IDIDPASISKTITVDVPIVGDVKNVLQKLIKELEEKPVEWVA-ARENWLKQIQEWKEKHP 362

Query: 377 LKYDRSSEIIKPQYVVEKIWELTKGDAFICSDVGQHQMWAAQFYKFDEPRRWINSGGLGT 436
           L Y +S +IIKPQYV+E+I+ +T GDA I + VGQHQMWAA FYK+  PR+++NSGGLGT
Sbjct: 363 LSYRKSDKIIKPQYVIEEIYNITNGDAIISAGVGQHQMWAAMFYKYSYPRQFLNSGGLGT 422

Query: 437 MGVGLPYAMGIKKAFPEKEVVTITGEGSIQMCIQELSTCLQYDTPVKICSLNNGYLGMVR 496
           MG G P A+G K   P+K V  I G+GS  M +Q+L+T +QY  PVKI  +NNG+LGMVR
Sbjct: 423 MGFGFPAAVGAKIGRPDKTVFAIEGDGSFIMNVQDLATAVQYRVPVKIAIINNGFLGMVR 482

Query: 497 QWQEIEYDNRYSHSYMDALPDFVKLAEAYGHVGMRVEKTSDVEPALREAFRLKDRTVFLD 556
           QWQ+  YD+RY+   +   PDFVKLAE++G VG+R  K S+V+  L +A  + DR V +D
Sbjct: 483 QWQQFFYDSRYASVCLSVQPDFVKLAESFGAVGLRATKPSEVKEVLAKAMEINDRPVLID 542

Query: 557 FQTDPTENVWPMVQAGKGISEMLL 580
           F  D  ENV PMV AGK   EM+L
Sbjct: 543 FVVDREENVLPMVPAGKSYREMIL 566

Lambda     K      H
   0.319    0.135    0.407

Gapped
Lambda     K      H
   0.267   0.0410    0.140

Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 926
Number of extensions: 38
Number of successful extensions: 5
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 585
Length of database: 581
Length adjustment: 36
Effective length of query: 549
Effective length of database: 545
Effective search space: 299205
Effective search space used: 299205
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.8 bits)
S2: 53 (25.0 bits)
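The summary numbers in the BLAST report above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, assuming the standard Karlin-Altschul relations (bit score S' = (λS − ln K) / ln 2, E = search space × 2^(−S')) and the usual length-adjusted search space; the function names are illustrative and are not part of BLAST or GapMind. Because the report prints Lambda and K rounded, the recomputed bit score is only expected to land close to the reported 621 bits.

```python
import math


def bit_score(raw_score: float, lam: float, k: float) -> float:
    """Convert a raw alignment score to bits: (lambda * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def effective_search_space(query_len: int, db_len: int,
                           n_seqs: int, adjustment: int) -> int:
    """(m - l) * (n - N * l), where l is the reported length adjustment."""
    return (query_len - adjustment) * (db_len - n_seqs * adjustment)


def expect_value(bits: float, search_space: int) -> float:
    """Expected number of chance hits: search_space * 2^(-bits)."""
    return search_space * 2.0 ** (-bits)


# Values copied from the report above (gapped Lambda and K, raw score 1602).
bits = bit_score(1602, lam=0.267, k=0.0410)
space = effective_search_space(585, 581, n_seqs=1, adjustment=36)
evalue = expect_value(bits, space)
identity_pct = 100 * 310 / 564
positive_pct = 100 * 407 / 564

print(f"bit score ~ {bits:.1f}")
print(f"effective search space = {space}")
print(f"E-value ~ {evalue:.1e}")
print(f"identity = {identity_pct:.1f}%, positives = {positive_pct:.1f}%")
```

The search space reproduces the reported 299205 exactly (549 × 545), and the E-value is far below anything BLAST prints, which is why the report shows `Expect = 0.0`. BLAST truncates the percentages, so 310/564 appears as 54%.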
Align candidate WP_012673975.1 SULAZ_RS00405 (biosynthetic-type acetolactate synthase large subunit)
to HMM TIGR00118 (ilvB: acetolactate synthase, large subunit, biosynthetic type (EC 2.2.1.6))
# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.aa/TIGR00118.hmm
# target sequence database:        /tmp/gapView.1346670.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR00118  [M=557]
Accession:   TIGR00118
Description: acolac_lg: acetolactate synthase, large subunit, biosynthetic type
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                             Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                             -----------
   3.7e-264  863.3   0.7   4.4e-264  863.0   0.7    1.0  1  NCBI__GCF_000021545.1:WP_012673975.1

Domain annotation for each sequence (and alignments):
>> NCBI__GCF_000021545.1:WP_012673975.1
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  863.0   0.7  4.4e-264  4.4e-264       2     555 ..       5     566 ..       4     568 .. 0.98

  Alignments for each domain:
  == domain 1  score: 863.0 bits;  conditional E-value: 4.4e-264
                             TIGR00118   2 kgaeilveslkkegvetvfGyPGGavlpiydalydselehilvrheqaaahaadGyarasGkvGvvlatsGPG 74
                                           +ga+i+v+ l +egv+t+fG PGGa++++ydal+d  ++ +l+rheqaa h+adGyara+GkvGvv+atsGPG
  NCBI__GCF_000021545.1:WP_012673975.1   5 RGADIVVDVLLHEGVDTIFGLPGGAIMEVYDALFDAPFKNVLTRHEQAACHMADGYARATGKVGVVIATSGPG 77
                                           79*********************************************************************** PP

                             TIGR00118  75 atnlvtgiatayldsvPlvvltGqvatsliGsdafqeidilGitlpvtkhsflvkkaedlpeilkeafeiast 147
                                           atnlvtg+atay+ds+Plv++tGqv+++ iG+dafqe+d++Git+p+tkh+flv++ +dlp il++af+ia t
  NCBI__GCF_000021545.1:WP_012673975.1  78 ATNLVTGLATAYMDSIPLVAITGQVPRHYIGTDAFQEADVIGITRPITKHNFLVTDIKDLPLILRQAFYIART 150
                                           ************************************************************************* PP

                             TIGR00118 148 GrPGPvlvdlPkdvteaeieleve....ekvelpgykptvkghklqikkaleliekakkPvllvGgGviiaea 216
                                           GrPGPvlvd+Pkd+t++  e+ ++     k +lpgy+p+v+g++ qikka+eli+ka +Pvl+vGgG+i a+a
  NCBI__GCF_000021545.1:WP_012673975.1 151 GRPGPVLVDIPKDITQQVSEYYIPsdeeVKESLPGYNPHVEGNPVQIKKAAELIRKATRPVLYVGGGAILADA 223
                                           ****************9998877666444568***************************************** PP

                             TIGR00118 217 seelkelaerlkipvtttllGlGafpedhplalgmlGmhGtkeanlavseadlliavGarfddrvtgnlakfa 289
                                            ee+++la  +kipvttt +G+Gafpe++pl+l mlGmhGt++an+av ++dlliavGarfddrvtg++++fa
  NCBI__GCF_000021545.1:WP_012673975.1 224 AEEVTKLARLTKIPVTTTNMGKGAFPETDPLSLHMLGMHGTYYANMAVYHSDLLIAVGARFDDRVTGKISEFA 296
                                           ************************************************************************* PP

                             TIGR00118 290 peakiihididPaeigknvkvdipivGdakkvleellkklkee....ekkekeWlekieewkkeyilkldeee 358
                                           peakiihididPa+i+k+++vd+pivGd+k+vl++l+k+l+e+        ++Wl++i+ewk++++l++ +++
  NCBI__GCF_000021545.1:WP_012673975.1 297 PEAKIIHIDIDPASISKTITVDVPIVGDVKNVLQKLIKELEEKpvewVAARENWLKQIQEWKEKHPLSYRKSD 369
                                           *****************************************998876344446******************** PP

                             TIGR00118 359 esikPqkvikelskllkdeaivttdvGqhqmwaaqfyktkkprkfitsgGlGtmGfGlPaalGakvakpeetv 431
                                           + ikPq+vi+e++++++++ai++++vGqhqmwaa fyk+++pr+f++sgGlGtmGfG+Paa+Gak++ p++tv
  NCBI__GCF_000021545.1:WP_012673975.1 370 KIIKPQYVIEEIYNITNGDAIISAGVGQHQMWAAMFYKYSYPRQFLNSGGLGTMGFGFPAAVGAKIGRPDKTV 442
                                           ************************************************************************* PP

                             TIGR00118 432 vavtGdgsfqmnlqelstiveydipvkivilnnellGmvkqWqelfyeerysetklaselpdfvklaeayGvk 504
                                            a+ Gdgsf mn+q+l+t+v+y +pvki i+nn +lGmv+qWq++fy+ ry++++l+  +pdfvklae++G++
  NCBI__GCF_000021545.1:WP_012673975.1 443 FAIEGDGSFIMNVQDLATAVQYRVPVKIAIINNGFLGMVRQWQQFFYDSRYASVCLSV-QPDFVKLAESFGAV 514
                                           *********************************************************6.************** PP

                             TIGR00118 505 giriekpeeleeklkealesk.epvlldvevdkeeevlPmvapGagldelve 555
                                           g+r +kp+e++e l++a+e + +pvl+d++vd+ee+vlPmv+ G++  e++
  NCBI__GCF_000021545.1:WP_012673975.1 515 GLRATKPSEVKEVLAKAMEINdRPVLIDFVVDREENVLPMVPAGKSYREMIL 566
                                           *******************9879***************************96 PP

Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (557 nodes)
Target sequences:                          1  (581 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.01u 0.01s 00:00:00.02 Elapsed: 00:00:00.01
# Mc/sec: 31.24
//
[ok]
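To see how much of the model and of the candidate protein the reported domain covers, the per-domain row of the hmmsearch output can be split on whitespace. This is a minimal sketch: the `DomainHit` class and both helper functions are hypothetical names introduced here, the field positions assume the column layout shown in the domain table above, and the model length (557) and protein length (581) are taken from the same report.

```python
from dataclasses import dataclass


@dataclass
class DomainHit:
    score: float
    i_evalue: float
    hmm_from: int
    hmm_to: int
    ali_from: int
    ali_to: int
    env_from: int
    env_to: int
    acc: float


def parse_domain_row(row: str) -> DomainHit:
    """Parse one row of the per-domain table (whitespace-separated fields)."""
    f = row.split()
    # Field order: '#', '!', score, bias, c-Evalue, i-Evalue,
    # hmmfrom, hmmto, '..', alifrom, alito, '..', envfrom, envto, '..', acc
    return DomainHit(
        score=float(f[2]), i_evalue=float(f[5]),
        hmm_from=int(f[6]), hmm_to=int(f[7]),
        ali_from=int(f[9]), ali_to=int(f[10]),
        env_from=int(f[12]), env_to=int(f[13]),
        acc=float(f[15]),
    )


def coverage(start: int, end: int, total: int) -> float:
    """Fraction of a sequence or model spanned by an inclusive interval."""
    return (end - start + 1) / total


row = "1 ! 863.0 0.7 4.4e-264 4.4e-264 2 555 .. 5 566 .. 4 568 .. 0.98"
hit = parse_domain_row(row)
model_cov = coverage(hit.hmm_from, hit.hmm_to, 557)    # TIGR00118 has M=557
target_cov = coverage(hit.ali_from, hit.ali_to, 581)   # candidate length 581

print(f"model coverage  = {model_cov:.3f}")
print(f"target coverage = {target_cov:.3f}")
```

Splitting on whitespace keeps the parser indifferent to how the columns are padded, which helps when the output has been copied out of a web page.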
This GapMind analysis is from Jul 25 2024. The underlying query database was built on Jul 25 2024.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMER with enzyme models (usually from TIGRFAMs). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other blast hits with at least 50% coverage are "low confidence."
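Once every candidate carries one of the three confidence labels, a step with no high- or medium-confidence candidate counts as a gap. The sketch below shows only that final bookkeeping; it does not reproduce GapMind's criteria for assigning the labels, and the step names other than `ilvB` are hypothetical placeholders.

```python
from typing import Dict, List


def find_gaps(step_candidates: Dict[str, List[str]]) -> List[str]:
    """Return the steps that have no 'high' or 'medium' confidence candidate."""
    gaps = []
    for step, confidences in step_candidates.items():
        if not any(c in ("high", "medium") for c in confidences):
            gaps.append(step)
    return gaps


# Hypothetical input: each step maps to the confidence labels of its candidates.
steps = {
    "ilvB": ["high"],
    "stepA": ["low", "low"],      # only low-confidence hits: a gap
    "stepB": [],                  # no candidates at all: a gap
    "stepC": ["medium", "low"],
}

gaps = find_gaps(steps)
print(gaps)
```

With this input the function reports `stepA` and `stepB`, since a low-confidence hit alone does not fill a step.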
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
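For readers unfamiliar with the term, a six-frame translation reads a DNA sequence in all three offsets on both strands, so that proteins missed by gene prediction can still be found. The following is a generic sketch using the standard genetic code; it is not the code behind Curated BLAST, and the function names and the short example sequence are made up for illustration.

```python
BASES = "TCAG"
# Amino acids of the standard genetic code, in TCAG order for each codon position.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Complement each base, then reverse the string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna: str) -> str:
    """Translate complete codons; codons with unknown bases become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna: str) -> dict:
    """Map frame labels (+1, +2, +3, -1, -2, -3) to protein strings."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


# Made-up 9-base example; '*' marks a stop codon.
for frame, protein in six_frame_translations("ATGGCCTAA").items():
    print(frame, protein)
```

A search against these six strings can recover a protein whose gene was not called, or was called in the wrong frame, in the predicted protein set.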
For more information, see:
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory