Align acetohydroxyacid synthase subunit B (EC 2.2.1.6) (characterized)
to candidate WP_012385382.1 BIND_RS12235 acetolactate synthase 3 large subunit
Query= metacyc::MONOMER-18810 (585 letters)

>NCBI__GCF_000019845.1:WP_012385382.1
          Length = 586

 Score = 621 bits (1602), Expect = 0.0
 Identities = 314/566 (55%), Positives = 410/566 (72%), Gaps = 5/566 (0%)

Query: 18  EMIGAEILVHALAEEGVEYVWGYPGGAVLYIYDELHKQTKFEHILVRHEQAAVHAADGYA 77
           +M GA+++V AL ++GV+ ++GYPGGAVL IYD L  Q   +H+LVRHEQ AVHAA+GYA
Sbjct: 4   KMTGAQMVVTALKDQGVDTIFGYPGGAVLPIYDALFHQNSVKHVLVRHEQGAVHAAEGYA 63

Query: 78  RATGKVGVALVTSGPGVTNAVTGIATAYLDSIPMVVITGNVPTHAIGQDAFQECDTVGIT 137
           R++GK+GV LVTSGPG TNAVTG+  A LDSIP+V ITG VPTH IG DAFQECDTVGIT
Sbjct: 64  RSSGKIGVVLVTSGPGATNAVTGLTDALLDSIPLVCITGQVPTHLIGSDAFQECDTVGIT 123

Query: 138 RPIVKHNFLVKDVRDLAATIKKAFFIAATGRPGPVVVDIPKDVSRNACKYEYPKSIDMRS 197
           R   KHN+LV+ V DL   + +AF++A  GRPGPVV+DIPKDV      Y  P++I+ ++
Sbjct: 124 RHCTKHNYLVRQVEDLPRILHEAFYVAQHGRPGPVVIDIPKDVQFALGLYAGPENIEHKT 183

Query: 198 YNPVNKGHSGQIRKAVALLQGAERPYIYTGGGVVLA--NASDELRQLAALTGHPVTNTLM 255
           Y P  KG  G+I +A++ +  A+RP  YTGGG++ +   AS  LR+L  LTG P+T+TLM
Sbjct: 184 YKPAFKGDLGRIDEALSWIAAAKRPIFYTGGGIINSGPEASALLRELVDLTGAPITSTLM 243

Query: 256 GLGAFPGTSKQFVGMLGMHGTYEANMAMQNCDVLIAIGARFDDRVIGNPAHFTSQARKII 315
           GLGA+P +S +++GMLGMHGTYEAN AM +CD++IAIGARFDDR+ G    F+  +RK I
Sbjct: 244 GLGAYPASSPRWLGMLGMHGTYEANNAMHDCDLMIAIGARFDDRITGRIDAFSPGSRK-I 302

Query: 316 HIDIDPSSISKRVKVDIPIVGNVKDVLQELIAQIKASDIKPKREALAKWWEQIEQWRSVD 375
           HIDIDPSSI+K VKVD+ IVG+   VL++++A+ K   +      L KWW++I+ WR+
Sbjct: 303 HIDIDPSSINKTVKVDLGIVGDCTAVLRDMVARWKERHLACDAAVLQKWWQEIDHWRARK 362

Query: 376 CLKYDRSSEIIKPQYVVEKIWELTKG-DAFICSDVGQHQMWAAQFYKFDEPRRWINSGGL 434
            L Y  S+++IKPQY +++++ELT+G D +I ++VGQHQMWAAQ Y FDEP RW+ SGGL
Sbjct: 363 SLAYRASNKVIKPQYAIQRLFELTRGRDTYISTEVGQHQMWAAQHYHFDEPNRWMTSGGL 422

Query: 435 GTMGVGLPYAMGIKKAFPEKEVVTITGEGSIQMCIQELSTCLQYDTPVKICSLNNGYLGM 494
           GTMG GLP A+G + A P   VV I GE SI M IQELST +Q+  PVKI  LNN Y+GM
Sbjct: 423 GTMGYGLPAAIGAQMAHPGSLVVDIAGEASILMNIQELSTAVQFRLPVKIFILNNQYMGM 482

Query: 495 VRQWQEIEYDNRYSHSYMDALPDFVKLAEAYGHVGMRVEKTSDVEPALREAFRLKDRTVF 554
           VRQWQE+ +  RYS SY +ALPDFVKLAEAYG  G+R    + ++ A+ E      + V
Sbjct: 483 VRQWQELLHGGRYSESYSEALPDFVKLAEAYGCHGIRCADPNTLDAAILEMIE-TPKPVL 541

Query: 555 LDFQTDPTENVWPMVQAGKGISEMLL 580
            D   D  EN  PM+ +GK  +EM+L
Sbjct: 542 FDCIVDREENCLPMIPSGKAHNEMIL 567

Lambda     K      H
   0.319    0.135    0.407

Gapped
Lambda     K      H
   0.267   0.0410    0.140

Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 959
Number of extensions: 27
Number of successful extensions: 5
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 585
Length of database: 586
Length adjustment: 37
Effective length of query: 548
Effective length of database: 549
Effective search space: 300852
Effective search space used: 300852
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.8 bits)
S2: 53 (25.0 bits)
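The summary figures reported with this alignment can be re-derived from the numbers printed above. The following is a minimal Python sketch and not part of GapMind; the function names are hypothetical, and the bit-score conversion (lambda * S - ln K) / ln 2 is the standard Karlin-Altschul formula, assumed here to have been applied with the gapped Lambda and K values.

```python
import math


def effective_search_space(query_len: int, db_len: int, length_adjustment: int) -> int:
    """Product of the effective query length and the effective database length."""
    return (query_len - length_adjustment) * (db_len - length_adjustment)


def bit_score(raw_score: float, lam: float, k: float) -> float:
    """Convert a raw score to bits with the Karlin-Altschul formula (assumed)."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def percent(numerator: int, denominator: int) -> float:
    """Express a ratio as a percentage."""
    return 100.0 * numerator / denominator


# All input values are copied from the report above.
space = effective_search_space(query_len=585, db_len=586, length_adjustment=37)
bits = bit_score(raw_score=1602, lam=0.267, k=0.0410)
identity = percent(314, 566)                # identities / alignment columns
query_coverage = percent(580 - 18 + 1, 585)  # aligned query residues / query length

print(f"effective search space: {space}")
print(f"bit score: {bits:.1f}")
print(f"identity: {identity:.1f}%")
print(f"query coverage: {query_coverage:.1f}%")
```

The computed search space should equal the reported 300852, and the bit score should land within about one bit of the reported 621. The small difference is expected because Lambda and K are printed with limited precision.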
Align candidate WP_012385382.1 BIND_RS12235 (acetolactate synthase 3 large subunit)
to HMM TIGR00118 (ilvB: acetolactate synthase, large subunit, biosynthetic type (EC 2.2.1.6))
# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.aa/TIGR00118.hmm
# target sequence database:        /tmp/gapView.2140208.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR00118  [M=557]
Accession:   TIGR00118
Description: acolac_lg: acetolactate synthase, large subunit, biosynthetic type
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                             Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                             -----------
   2.8e-244  797.6   0.0   3.2e-244  797.4   0.0    1.0  1  NCBI__GCF_000019845.1:WP_012385382.1

Domain annotation for each sequence (and alignments):
>> NCBI__GCF_000019845.1:WP_012385382.1
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  797.4   0.0  3.2e-244  3.2e-244       1     555 [.       5     567 ..       5     569 .. 0.98

  Alignments for each domain:
  == domain 1  score: 797.4 bits;  conditional E-value: 3.2e-244
                             TIGR00118   1 lkgaeilveslkkegvetvfGyPGGavlpiydaly.dselehilvrheqaaahaadGyarasGkvGvvlatsG 72
                                           ++ga+++v +lk++gv+t+fGyPGGavlpiydal+ ++ ++h+lvrheq+a+haa+Gyar+sGk+Gvvl+tsG
  NCBI__GCF_000019845.1:WP_012385382.1   5 MTGAQMVVTALKDQGVDTIFGYPGGAVLPIYDALFhQNSVKHVLVRHEQGAVHAAEGYARSSGKIGVVLVTSG 77
                                           79********************************98999********************************** PP

                             TIGR00118  73 PGatnlvtgiatayldsvPlvvltGqvatsliGsdafqeidilGitlpvtkhsflvkkaedlpeilkeafeia 145
                                           PGatn+vtg+++a lds+Plv +tGqv+t+liGsdafqe+d +Git+ +tkh++lv+++edlp+il+eaf++a
  NCBI__GCF_000019845.1:WP_012385382.1  78 PGATNAVTGLTDALLDSIPLVCITGQVPTHLIGSDAFQECDTVGITRHCTKHNYLVRQVEDLPRILHEAFYVA 150
                                           ************************************************************************* PP

                             TIGR00118 146 stGrPGPvlvdlPkdvteaeieleveekvelpgykptvkghklqikkaleliekakkPvllvGgGviia..ea 216
                                           + GrPGPv++d+Pkdv+ a   +  +e++e ++ykp  kg+  +i +al  i++ak+P+ + GgG+i +  ea
  NCBI__GCF_000019845.1:WP_012385382.1 151 QHGRPGPVVIDIPKDVQFALGLYAGPENIEHKTYKPAFKGDLGRIDEALSWIAAAKRPIFYTGGGIINSgpEA 223
                                           *******************************************************************874469 PP

                             TIGR00118 217 seelkelaerlkipvtttllGlGafpedhplalgmlGmhGtkeanlavseadlliavGarfddrvtgnlakfa 289
                                           s+ l+el++ + +p+t+tl+GlGa+p+  p+ lgmlGmhGt+ean a++++dl+ia+Garfddr+tg ++ f+
  NCBI__GCF_000019845.1:WP_012385382.1 224 SALLRELVDLTGAPITSTLMGLGAYPASSPRWLGMLGMHGTYEANNAMHDCDLMIAIGARFDDRITGRIDAFS 296
                                           99*********************************************************************** PP

                             TIGR00118 290 peakiihididPaeigknvkvdipivGdakkvleellkklkee....ekkeke.Wlekieewkkeyilkldee 357
                                           p ++ ihididP++i+k+vkvd+ ivGd++ vl++++++ ke+    +    + W+++i++w++++ l++  +
  NCBI__GCF_000019845.1:WP_012385382.1 297 PGSRKIHIDIDPSSINKTVKVDLGIVGDCTAVLRDMVARWKERhlacDAAVLQkWWQEIDHWRARKSLAYRAS 369
                                           ***************************************999977764444455******************* PP

                             TIGR00118 358 eesikPqkvikelskllkd.eaivttdvGqhqmwaaqfyktkkprkfitsgGlGtmGfGlPaalGakvakpee 429
                                           ++ ikPq+ i++l +l+++ +++++t+vGqhqmwaaq+y++++p++++tsgGlGtmG+GlPaa+Ga++a+p +
  NCBI__GCF_000019845.1:WP_012385382.1 370 NKVIKPQYAIQRLFELTRGrDTYISTEVGQHQMWAAQHYHFDEPNRWMTSGGLGTMGYGLPAAIGAQMAHPGS 442
                                           *****************9989**************************************************** PP

                             TIGR00118 430 tvvavtGdgsfqmnlqelstiveydipvkivilnnellGmvkqWqelfyeerysetklaselpdfvklaeayG 502
                                            vv+++G++s+ mn+qelst+v++ +pvki ilnn+++Gmv+qWqel++ +ryse++ ++ lpdfvklaeayG
  NCBI__GCF_000019845.1:WP_012385382.1 443 LVVDIAGEASILMNIQELSTAVQFRLPVKIFILNNQYMGMVRQWQELLHGGRYSESYSEA-LPDFVKLAEAYG 514
                                           **********************************************************95.************ PP

                             TIGR00118 503 vkgiriekpeeleeklkealeskepvlldvevdkeeevlPmvapGagldelve 555
                                            +gir ++p+ l++++ e++e+ +pvl+d  vd+ee++lPm+++G++ +e++
  NCBI__GCF_000019845.1:WP_012385382.1 515 CHGIRCADPNTLDAAILEMIETPKPVLFDCIVDREENCLPMIPSGKAHNEMIL 567
                                           ***************************************************96 PP

Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (557 nodes)
Target sequences:                          1  (586 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.01u 0.00s 00:00:00.01 Elapsed: 00:00:00.00
# Mc/sec: 33.13
//
[ok]
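How much of the TIGR00118 model this hit spans can be read off the domain annotation row above. Below is a minimal Python sketch, assuming the whitespace-separated column order shown in that table (#, flag, score, bias, c-Evalue, i-Evalue, hmmfrom, hmm to, bracket, alifrom, ali to, bracket, envfrom, env to, bracket, acc). The `DomainHit` class and both function names are hypothetical helpers, not part of HMMER or GapMind.

```python
from dataclasses import dataclass


@dataclass
class DomainHit:
    score: float
    i_evalue: float
    hmm_from: int
    hmm_to: int
    ali_from: int
    ali_to: int
    acc: float


def parse_domain_row(row: str) -> DomainHit:
    """Parse one per-domain row of hmmsearch's human-readable output."""
    f = row.split()
    return DomainHit(
        score=float(f[2]),
        i_evalue=float(f[5]),
        hmm_from=int(f[6]),
        hmm_to=int(f[7]),
        ali_from=int(f[9]),
        ali_to=int(f[10]),
        acc=float(f[15]),
    )


def model_coverage(hit: DomainHit, model_length: int) -> float:
    """Fraction of the HMM's match states covered by the aligned region."""
    return (hit.hmm_to - hit.hmm_from + 1) / model_length


# Row and model length (M=557) are copied from the report above.
ROW = "1 ! 797.4 0.0 3.2e-244 3.2e-244 1 555 [. 5 567 .. 5 569 .. 0.98"
hit = parse_domain_row(ROW)
print(f"model coverage: {model_coverage(hit, 557):.1%}")
print(f"aligned target residues: {hit.ali_to - hit.ali_from + 1}")
```

For scripted use, hmmsearch's `--domtblout` table is likely a sturdier input than the human-readable report; the sketch parses the row shown here only so that it lines up with the text.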
This GapMind analysis is from Jul 25 2024. The underlying query database was built on Jul 25 2024.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMER with enzyme models (usually from TIGRFAMs). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other blast hits with at least 50% coverage are "low confidence."
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
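The definition of a gap given above (a step with no high- or medium-confidence candidate) and the low-confidence coverage rule can be sketched in a few lines of Python. This is an illustration only and not GapMind's code: the function names and the example step table are hypothetical, and the high- and medium-confidence criteria are not reproduced, so the sketch takes already-assigned confidence labels as input.

```python
from typing import Dict, List, Optional

CONFIDENT_LEVELS = {"high", "medium"}


def fallback_level(coverage: float) -> Optional[str]:
    """Label for a blast hit that met neither the high nor the medium criteria.

    Per the text, such hits count as low confidence at 50% coverage or more.
    """
    return "low" if coverage >= 0.5 else None


def find_gaps(step_candidates: Dict[str, List[str]]) -> List[str]:
    """Return the steps that have no high- or medium-confidence candidate."""
    return [
        step
        for step, levels in step_candidates.items()
        if not CONFIDENT_LEVELS.intersection(levels)
    ]


# Hypothetical input: ilvB is the step shown in this report; stepX and stepY
# are made-up names used only for illustration.
example = {
    "ilvB": ["high"],
    "stepX": ["low", "low"],
    "stepY": [],
}
print(find_gaps(example))
print(fallback_level(0.62), fallback_level(0.31))
```

Keeping the labels as plain strings makes the gap test a simple set intersection; a step whose candidates are all low confidence is reported as a gap just like a step with no candidates.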
For more information, see:
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory