Align acetohydroxyacid synthase subunit B (EC 2.2.1.6) (characterized)
to candidate WP_027459069.1 K420_RS0116805 acetolactate synthase, large subunit, biosynthetic type
Query= metacyc::MONOMER-18810
         (585 letters)

>NCBI__GCF_000519045.1:WP_027459069.1
          Length = 566

 Score =  797 bits (2058), Expect = 0.0
 Identities = 394/566 (69%), Positives = 469/566 (82%), Gaps = 5/566 (0%)

Query: 21  GAEILVHALAEEGVEYVWGYPGGAVLYIYDELHKQTKFEHILVRHEQAAVHAADGYARAT 80
           GAEI++  L EE V+YV+GYPGG+VL+IYD L KQ + +HILVRHEQAAVHAAD Y+R++
Sbjct: 5   GAEIVIRCLQEEKVDYVFGYPGGSVLHIYDALFKQEEVKHILVRHEQAAVHAADAYSRSS 64

Query: 81  GKVGVALVTSGPGVTNAVTGIATAYLDSIPMVVITGNVPTHAIGQDAFQECDTVGITRPI 140
            KVGVALVTSGPGVTN VTGIATAY+DSIPMVV+ G VPT  IGQDAFQECDTVGITRP
Sbjct: 65  QKVGVALVTSGPGVTNTVTGIATAYMDSIPMVVLCGQVPTAYIGQDAFQECDTVGITRPC 124

Query: 141 VKHNFLVKDVRDLAATIKKAFFIAATGRPGPVVVDIPKDVSRNACKYEYPKSIDMRSYNP 200
           VKHNFLVKDV+DLA+TIKKAF IA+TGRPGPVVVDIPKD++   C+++YPKS+ MRSYNP
Sbjct: 125 VKHNFLVKDVKDLASTIKKAFHIASTGRPGPVVVDIPKDITAQVCEFDYPKSVQMRSYNP 184

Query: 201 VNKGHSGQIRKAVALLQGAERPYIYTGGGVVLANASDELRQLAALTGHPVTNTLMGLGAF 260
           V KGH GQI+KAV +LQ A+RP IYTGGGV+L++A+++L QLA     PVTNTLMGLG +
Sbjct: 185 VVKGHLGQIKKAVQILQEAKRPIIYTGGGVILSDAAEKLTQLARKLNFPVTNTLMGLGGY 244

Query: 261 PGTSKQFVGMLGMHGTYEANMAMQNCDVLIAIGARFDDRVIGNPAHFTSQARKIIHIDID 320
           P T KQFVGMLGMHGT+EAN AM   DV++A+GARFDDRVIGNP HF  + R++IHIDID
Sbjct: 245 PATDKQFVGMLGMHGTFEANNAMHYADVILAVGARFDDRVIGNPEHFGEEKRRVIHIDID 304

Query: 321 PSSISKRVKVDIPIVGNVKDVLQELIAQIKASDIKPKREALAKWWEQIEQWRSVDCLKYD 380
           PSSISKRVKVD+PIVGNV DV+ E++ ++     K   E +A WW+QI++WR  + L Y
Sbjct: 305 PSSISKRVKVDVPIVGNVTDVIDEIL-KLLDGGFKADPE-VADWWKQIDEWRGRNSLAY- 361

Query: 381 RSSEIIKPQYVVEKIWELTKGDAFICSDVGQHQMWAAQFYKFDEPRRWINSGGLGTMGVG 440
           R S  I PQYVVEK++E+T GDAFI SDVGQHQM+AAQ+YKFD+PRRWINSGGLGTMGVG
Sbjct: 362 RQSNHIMPQYVVEKLYEVTGGDAFITSDVGQHQMFAAQYYKFDKPRRWINSGGLGTMGVG 421

Query: 441 LPYAMGIKKAFPEKEVVTITGEGSIQMCIQELSTCLQYDTPVKICSLNNGYLGMVRQWQE 500
           LPY MG+  A P  +V  +TGEGSIQMCIQELSTC QY+ P+KI +LNNG LGMVRQWQE
Sbjct: 422 LPYGMGVLLANPGAQVACVTGEGSIQMCIQELSTCKQYELPIKIINLNNGMLGMVRQWQE 481

Query: 501 IEYDNRYSHSYMDALPDFVKLAEAYGHVGMRVEKTSDVEPALREAF-RLKDRTVFLDFQT 559
           + Y  RYS SY+ +LPDFVKLAE+YGHVGMR+EK  DVEPALR+AF   K+  VF+DF
Sbjct: 482 MFYSKRYSQSYVTSLPDFVKLAESYGHVGMRIEKPEDVEPALRKAFTEHKNDLVFMDFII 541

Query: 560 DPTENVWPMVQAGKGISEMLLGAEDL 585
           DP  NV+PMV AGKG++EM+L AED+
Sbjct: 542 DPAANVFPMVAAGKGLTEMIL-AEDI 566

Lambda     K      H
   0.319    0.135    0.407

Gapped
Lambda     K      H
   0.267   0.0410    0.140

Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 1016
Number of extensions: 38
Number of successful extensions: 4
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 585
Length of database: 566
Length adjustment: 36
Effective length of query: 549
Effective length of database: 530
Effective search space: 290970
Effective search space used: 290970
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.8 bits)
S2: 53 (25.0 bits)
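The summary figures in the BLAST report above can be re-derived from its raw counts. The following is a minimal Python sketch, assuming that the displayed percentages are truncated rather than rounded (which is consistent with the values shown) and that the effective search space is the product of the two effective lengths. The function names are illustrative and are not part of BLAST or GapMind.

```python
def percent(numerator: int, denominator: int) -> int:
    """Integer percentage, truncated (assumed to match the report's display)."""
    return numerator * 100 // denominator


def effective_search_space(query_len: int, db_len: int, adjustment: int) -> int:
    """Product of the effective (length-adjusted) query and database lengths."""
    return (query_len - adjustment) * (db_len - adjustment)


# Counts copied from the report above.
identity_pct = percent(394, 566)   # Identities = 394/566
positive_pct = percent(469, 566)   # Positives  = 469/566
gap_pct = percent(5, 566)          # Gaps       = 5/566

# Length of query 585, length of database 566, length adjustment 36.
space = effective_search_space(585, 566, 36)

print(identity_pct, positive_pct, gap_pct)  # 69 82 0
print(space)                                # 290970
```

The results agree with the reported 69%, 82%, 0% and an effective search space of 290970 (549 x 530).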
Align candidate WP_027459069.1 K420_RS0116805 (acetolactate synthase, large subunit, biosynthetic type)
to HMM TIGR00118 (ilvB: acetolactate synthase, large subunit, biosynthetic type (EC 2.2.1.6))
# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.aa/TIGR00118.hmm
# target sequence database:        /tmp/gapView.27568.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR00118  [M=557]
Accession:   TIGR00118
Description: acolac_lg: acetolactate synthase, large subunit, biosynthetic type
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                                 Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                                 -----------
   6.6e-253  826.1   2.1   7.8e-253  825.9   2.1    1.0  1  lcl|NCBI__GCF_000519045.1:WP_027459069.1  K420_RS0116805 acetolactate synt


Domain annotation for each sequence (and alignments):
>> lcl|NCBI__GCF_000519045.1:WP_027459069.1  K420_RS0116805 acetolactate synthase, large subunit, biosynthetic type
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  825.9   2.1  7.8e-253  7.8e-253       2     555 ..       4     562 ..       3     564 .. 0.97

  Alignments for each domain:
  == domain 1  score: 825.9 bits;  conditional E-value: 7.8e-253
                                 TIGR00118   2 kgaeilveslkkegvetvfGyPGGavlpiydaly.dselehilvrheqaaahaadGyarasGkvGvvla 69
                                               +gaei+++ l++e+v++vfGyPGG+vl+iydal+ ++e++hilvrheqaa+haad y+r+s kvGv+l+
  lcl|NCBI__GCF_000519045.1:WP_027459069.1   4 SGAEIVIRCLQEEKVDYVFGYPGGSVLHIYDALFkQEEVKHILVRHEQAAVHAADAYSRSSQKVGVALV 72
                                               89********************************999******************************** PP

                                 TIGR00118  70 tsGPGatnlvtgiatayldsvPlvvltGqvatsliGsdafqeidilGitlpvtkhsflvkkaedlpeil 138
                                               tsGPG tn+vtgiatay+ds+P+vvl Gqv+t+ iG+dafqe+d +Git+p++kh+flvk+++dl++++
  lcl|NCBI__GCF_000519045.1:WP_027459069.1  73 TSGPGVTNTVTGIATAYMDSIPMVVLCGQVPTAYIGQDAFQECDTVGITRPCVKHNFLVKDVKDLASTI 141
                                               ********************************************************************* PP

                                 TIGR00118 139 keafeiastGrPGPvlvdlPkdvteaeieleveekvelpgykptvkghklqikkaleliekakkPvllv 207
                                               k+af+iastGrPGPv+vd+Pkd+t++ +e++++++v++++y+p vkgh  qikka++++++ak+P+++
  lcl|NCBI__GCF_000519045.1:WP_027459069.1 142 KKAFHIASTGRPGPVVVDIPKDITAQVCEFDYPKSVQMRSYNPVVKGHLGQIKKAVQILQEAKRPIIYT 210
                                               ********************************************************************* PP

                                 TIGR00118 208 GgGviiaeaseelkelaerlkipvtttllGlGafpedhplalgmlGmhGtkeanlavseadlliavGar 276
                                               GgGvi ++a e+l++la +l+ pvt tl+GlG++p+++++++gmlGmhGt ean a++ ad+++avGar
  lcl|NCBI__GCF_000519045.1:WP_027459069.1 211 GGGVILSDAAEKLTQLARKLNFPVTNTLMGLGGYPATDKQFVGMLGMHGTFEANNAMHYADVILAVGAR 279
                                               ********************************************************************* PP

                                 TIGR00118 277 fddrvtgnlakfapeak.iihididPaeigknvkvdipivGdakkvleellkklkee...ekkekeWle 341
                                               fddrv gn ++f +e + +ihididP++i+k vkvd+pivG+++ v++e+lk l      + +  +W++
  lcl|NCBI__GCF_000519045.1:WP_027459069.1 280 FDDRVIGNPEHFGEEKRrVIHIDIDPSSISKRVKVDVPIVGNVTDVIDEILKLLDGGfkaDPEVADWWK 348
                                               ************987644*********************************999876442444445*** PP

                                 TIGR00118 342 kieewkkeyilkldeeeesikPqkvikelskllkdeaivttdvGqhqmwaaqfyktkkprkfitsgGlG 410
                                               +i+ew+ ++ l++  ++++i Pq+v+++l++++ ++a++t+dvGqhqm+aaq+yk++kpr++i+sgGlG
  lcl|NCBI__GCF_000519045.1:WP_027459069.1 349 QIDEWRGRNSLAYR-QSNHIMPQYVVEKLYEVTGGDAFITSDVGQHQMFAAQYYKFDKPRRWINSGGLG 416
                                               *********98875.56789************************************************* PP

                                 TIGR00118 411 tmGfGlPaalGakvakpeetvvavtGdgsfqmnlqelstiveydipvkivilnnellGmvkqWqelfye 479
                                               tmG GlP  +G+ +a+p ++v +vtG+gs+qm +qelst+++y++p+ki++lnn +lGmv+qWqe+fy+
  lcl|NCBI__GCF_000519045.1:WP_027459069.1 417 TMGVGLPYGMGVLLANPGAQVACVTGEGSIQMCIQELSTCKQYELPIKIINLNNGMLGMVRQWQEMFYS 485
                                               ********************************************************************* PP

                                 TIGR00118 480 erysetklaselpdfvklaeayGvkgiriekpeeleeklkealesk..epvlldvevdkeeevlPmvap 546
                                               +rys+++++s lpdfvklae+yG++g+riekpe++e +l++a++++  ++v++d+ +d  ++v+Pmva
  lcl|NCBI__GCF_000519045.1:WP_027459069.1 486 KRYSQSYVTS-LPDFVKLAESYGHVGMRIEKPEDVEPALRKAFTEHknDLVFMDFIIDPAANVFPMVAA 553
                                               *********5.******************************998654589******************* PP

                                 TIGR00118 547 Gagldelve 555
                                               G+gl+e++
  lcl|NCBI__GCF_000519045.1:WP_027459069.1 554 GKGLTEMIL 562
                                               *******96 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (557 nodes)
Target sequences:                          1  (566 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.02u 0.02s 00:00:00.04 Elapsed: 00:00:00.03
# Mc/sec: 9.17
//
[ok]
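The domain table in the hmmsearch output above gives the start and end of the alignment on both the model (hmmfrom, hmm to) and the protein (alifrom, ali to), so the fraction of each that is covered follows directly. A small Python sketch of that arithmetic; the helper name is hypothetical, and how GapMind itself computes or thresholds coverage is not shown on this page.

```python
def coverage(start: int, end: int, total_length: int) -> float:
    """Fraction of a model or sequence spanned by an inclusive, 1-based range."""
    return (end - start + 1) / total_length


# Values copied from the domain table and report header above.
model_coverage = coverage(2, 555, 557)    # hmmfrom..hmm to, model length M=557
protein_coverage = coverage(4, 562, 566)  # alifrom..ali to, protein length 566

print(f"model coverage:   {model_coverage:.1%}")    # 99.5%
print(f"protein coverage: {protein_coverage:.1%}")  # 98.8%
```

Both fractions are close to 1, which matches the near full-length alignment visible in the domain table.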
This GapMind analysis is from Apr 10 2024. The underlying query database was built on Apr 09 2024.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other blast hits with at least 50% coverage are "low confidence."
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
For more information, see:
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory