Align Acetolactate synthase large subunit; AHAS; EC 2.2.1.6; Acetohydroxy-acid synthase large subunit; ALS; Vegetative protein 105; VEG105 (uncharacterized)
to candidate WP_014027226.1 PYRFU_RS08320 acetolactate synthase, large subunit, biosynthetic type
Query= curated2:P37251
         (574 letters)

>NCBI__GCF_000223395.1:WP_014027226.1
          Length = 571

 Score =  408 bits (1048), Expect = e-118
 Identities = 223/551 (40%), Positives = 328/551 (59%), Gaps = 14/551 (2%)

Query: 26  ESLKKEKVEMIFGYPGGAVLPIYDKLYNSGLVHI-LPRHEQGAIHAAEGYARVSGKPGVV 84
           ++L++ +FG GG+++ YD + G +I + RHEQGA HAA+ Y RV +P +V
Sbjct: 7   KTLREMGATDVFGVTGGSIMAFYDAMEVVGGFNIYMFRHEQGAAHAADAYGRVKKRPAIV 66

Query: 85  IATSGPGATNLVTGLADAMIDSLPLVVFTGQVATSVIGSDAFQEADILGITMPVTKHSYQ 144
           TSGPGATN+VTG+A+A +DS P + TGQV T+V G DAFQE D++G+ P+TK YQ
Sbjct: 67  AVTSGPGATNIVTGVANAYMDSSPALFITGQVPTTVFGRDAFQETDMVGVVAPITKFVYQ 126

Query: 145 VRQPEDLPRIIKEAFHIATTGRPGPVLIDIPKDVATIEGEFSYDHEMNLPGYQP--TTEP 202
           +R+PE+ IK A+ +A GRPGP L+D P+DV + + D + L Y+ +P
Sbjct: 127 IRRPEEAVPAIKTAYKLAIMGRPGPTLVDFPRDVQLRRCDCTSDGLLPL-NYEKFKAPDP 185

Query: 203 NYLQIRKLVEAVSSAKKPVILAGAGVLHGKASEELKNYAEQQQIPVAHTLLGLGGFPADH 262
           + I + + SA++PVIL G GV A E+ AE+ P+ TL G PADH
Sbjct: 186 DPRLIEEAARLLLSARRPVILVGGGVYWSGAWPEVIEIAERLWAPIVTTLTGKNSVPADH 245

Query: 263 PLFLGMAGMHGTYTANMALHECDLLISIGARFDDRVTGNLKHFARNAKIAHIDIDPAEIG 322
           PL +G AGMHG A+ AL D+++++G RF DR G + + KI HIDIDP+EIG
Sbjct: 246 PLVMGPAGMHGRAEADAALANADVILAVGTRFSDRTVGRFEPELKEKKIIHIDIDPSEIG 305

Query: 323 KIMKTQIPVVGDSKIVLQELIKQDGKQSDSSEWKKQLAEWKEEYPLWYVDNEEE------ 376
           K +K + +V D+K L+ LI+ K + + + +W Y D E+
Sbjct: 306 KNVKPAVGIVADAKKALRMLIE---KLPEVARRDTKFVDWLLYIRRRYEDAMEKLAERMK 362

Query: 377 GFKPQKLIEYIHQFTKGEAIVATDVGQHQMWSAQFYPFQKADKWVTSGGLGTMGFGLPAA 436
           F P K+++ + + + I AT VG HQMWS + W+TS GLGTMGF +PAA
Sbjct: 363 PFAPWKVLKLLRRIVPRDTITATGVGSHQMWSEIHWDVYIPGTWLTSAGLGTMGFCVPAA 422

Query: 437 IGAQLAEKDATVVAVVGDGGFQMTLQELDVIRELNLPVKVVILNNACLGMVRQWQEIFYE 496
           IGA++A + TV+ + GDG FQMT+ L ++R+ NLP VI +N L +V+QWQ YE
Sbjct: 423 IGAKIAAPERTVLCIDGDGSFQMTMNNLALVRDYNLPAIFVIFDNRALMLVKQWQIFLYE 482

Query: 497 ERYSESKFASQPDFVKLSEAYGIKGIRISSEAEAKEKLEEALTSREPVVIDVRVASE-EK 555
           R + F +PDFVK++EAY I+G+R + E ++ + A+ + EP+V+D+ + E +
Sbjct: 483 RRIVATHFTERPDFVKVAEAYDIEGVRPADYQELEKWVRWAVRNNEPLVVDIMIDREMDI 542

Query: 556 VFPMVAPGKGL 566
           V+P V PG+ L
Sbjct: 543 VYPWVKPGEWL 553

Lambda     K      H
   0.317    0.135    0.391

Gapped
Lambda     K      H
   0.267   0.0410    0.140

Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 833
Number of extensions: 37
Number of successful extensions: 4
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 574
Length of database: 571
Length adjustment: 36
Effective length of query: 538
Effective length of database: 535
Effective search space: 287830
Effective search space used: 287830
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.6 bits)
S2: 53 (25.0 bits)
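The summary numbers in the BLAST report above can be cross-checked against each other. The sketch below recomputes the bit score, effective search space, and E-value from the reported raw score and gapped Karlin-Altschul parameters, using the standard relations S' = (lambda*S - ln K) / ln 2 and E = m'n' * 2^(-S'). The function and constant names are illustrative, not part of BLAST or GapMind, and because the report rounds lambda and K, the results agree only approximately. The percentage helper truncates rather than rounds, which is an assumption inferred from the reported gap percentage (14/551 shown as 2%).

```python
import math

# Values copied from the BLAST report above.
RAW_SCORE = 1048
GAPPED_LAMBDA = 0.267
GAPPED_K = 0.0410
QUERY_LEN, DB_LEN, LENGTH_ADJUSTMENT = 574, 571, 36
IDENTITIES, POSITIVES, GAPS, ALIGNMENT_LEN = 223, 328, 14, 551


def bit_score(raw_score, lam, k):
    """Convert a raw alignment score to bits (Karlin-Altschul normalization)."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def effective_search_space(query_len, db_len, adjustment):
    """Product of the effective query and database lengths."""
    return (query_len - adjustment) * (db_len - adjustment)


def expect_value(search_space, bits):
    """Expected number of chance hits scoring at least `bits`."""
    return search_space * 2.0 ** (-bits)


def percent(count, total):
    """Integer percentage, truncated (assumed to match the report's display)."""
    return 100 * count // total


bits = bit_score(RAW_SCORE, GAPPED_LAMBDA, GAPPED_K)
space = effective_search_space(QUERY_LEN, DB_LEN, LENGTH_ADJUSTMENT)
evalue = expect_value(space, bits)

print(f"bits = {bits:.1f}, search space = {space}, E = {evalue:.1e}")
print(percent(IDENTITIES, ALIGNMENT_LEN),
      percent(POSITIVES, ALIGNMENT_LEN),
      percent(GAPS, ALIGNMENT_LEN))
```

The recomputed values should land on the reported 408 bits, search space 287830, and an E-value on the order of 1e-118.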
Align candidate WP_014027226.1 PYRFU_RS08320 (acetolactate synthase, large subunit, biosynthetic type)
to HMM TIGR00118 (ilvB: acetolactate synthase, large subunit, biosynthetic type (EC 2.2.1.6))
# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.aa/TIGR00118.hmm
# target sequence database:        /tmp/gapView.21104.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR00118  [M=557]
Accession:   TIGR00118
Description: acolac_lg: acetolactate synthase, large subunit, biosynthetic type
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                                  Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                                  -----------
   7.7e-182  591.5   0.0   9.9e-182  591.1   0.0    1.0  1  lcl|NCBI__GCF_000223395.1:WP_014027226.1  PYRFU_RS08320 acetolactate synth

Domain annotation for each sequence (and alignments):
>> lcl|NCBI__GCF_000223395.1:WP_014027226.1  PYRFU_RS08320 acetolactate synthase, large subunit, biosynthetic type
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  591.1   0.0  9.9e-182  9.9e-182       7     552 ..       5     555 ..       2     559 .. 0.96

  Alignments for each domain:
  == domain 1  score: 591.1 bits;  conditional E-value: 9.9e-182
                                 TIGR00118   7 lveslkkegvetvfGyPGGavlpiydaly.dselehilvrheqaaahaadGyarasGkvGvvlatsGPG 74
                                               + ++l++ g vfG GG+++ +yda+ ++++ + rheq+aahaad y r+ ++ +v +tsGPG
  lcl|NCBI__GCF_000223395.1:WP_014027226.1   5 VAKTLREMGATDVFGVTGGSIMAFYDAMEvVGGFNIYMFRHEQGAAHAADAYGRVKKRPAIVAVTSGPG 73
                                               77999***********************999************************************** PP

                                 TIGR00118  75 AtnlvtgiatayldsvPlvvltGqvatsliGsdafqeidilGitlpvtkhsflvkkaedlpeilkeafe 143
                                               atn+vtg+a+ay+ds+P + +tGqv+t++ G dafqe d++G+ p+tk+ ++++++e+ +k a++
  lcl|NCBI__GCF_000223395.1:WP_014027226.1  74 ATNIVTGVANAYMDSSPALFITGQVPTTVFGRDAFQETDMVGVVAPITKFVYQIRRPEEAVPAIKTAYK 142
                                               ********************************************************************* PP

                                 TIGR00118 144 iastGrPGPvlvdlPkdvteaeieleveekvelpgykptv.kghklqikkaleliekakkPvllvGgGv 211
                                               +a GrPGP lvd+P+dv+ + ++ + + + l k + ++++ i++a+ l+ +a++Pv+lvGgGv
  lcl|NCBI__GCF_000223395.1:WP_014027226.1 143 LAIMGRPGPTLVDFPRDVQLRRCDCTSDGLLPLNYEKFKApDPDPRLIEEAARLLLSARRPVILVGGGV 211
                                               ****************************9999987777653678999********************** PP

                                 TIGR00118 212 iiaeaseelkelaerlkipvtttllGlGafpedhplalgmlGmhGtkeanlavseadlliavGarfddr 280
                                               ++a e+ e+aerl +p++ttl G+ ++p+dhpl +g +GmhG++ea+ a+ +ad+++avG+rf+dr
  lcl|NCBI__GCF_000223395.1:WP_014027226.1 212 YWSGAWPEVIEIAERLWAPIVTTLTGKNSVPADHPLVMGPAGMHGRAEADAALANADVILAVGTRFSDR 280
                                               ********************************************************************* PP

                                 TIGR00118 281 vtgnlakfapeakiihididPaeigknvkvdipivGdakkvleellkklkeeekkeke...Wlekieew 346
                                               + g + +e kiihididP+eigknvk ++ iv dakk l+ l++kl e+ +++++ Wl i++
  lcl|NCBI__GCF_000223395.1:WP_014027226.1 281 TVGRFEPELKEKKIIHIDIDPSEIGKNVKPAVGIVADAKKALRMLIEKLPEVARRDTKfvdWLLYIRRR 349
                                               ****************************************************88887744577777777 PP

                                 TIGR00118 347 kkeyilkldeeeesikPqkvikelskllkdeaivttdvGqhqmwaaqfyktkkprkfitsgGlGtmGfG 415
                                               ++ + kl e+ ++++P kv+k l ++++ ++i +t+vG hqmw+ ++++ p +++ts+GlGtmGf
  lcl|NCBI__GCF_000223395.1:WP_014027226.1 350 YEDAMEKLAERMKPFAPWKVLKLLRRIVPRDTITATGVGSHQMWSEIHWDVYIPGTWLTSAGLGTMGFC 418
                                               777777788888889999*************************************************** PP

                                 TIGR00118 416 lPaalGakvakpeetvvavtGdgsfqmnlqelstiveydipvkivilnnellGmvkqWqelfyeeryse 484
                                               +Paa+Gak+a pe tv++++Gdgsfqm++++l+ +++y++p + vi++n+ l +vkqWq ++ye+r +
  lcl|NCBI__GCF_000223395.1:WP_014027226.1 419 VPAAIGAKIAAPERTVLCIDGDGSFQMTMNNLALVRDYNLPAIFVIFDNRALMLVKQWQIFLYERRIVA 487
                                               ********************************************************************* PP

                                 TIGR00118 485 tklaselpdfvklaeayGvkgiriekpeeleeklkealeskepvlldvevdkeee.vlPmvapGaglde 552
                                               t+++ e pdfvk+aeay ++g+r ++ +ele+ ++ a++++ep+++d+ +d+e + v+P v pG l++
  lcl|NCBI__GCF_000223395.1:WP_014027226.1 488 THFT-ERPDFVKVAEAYDIEGVRPADYQELEKWVRWAVRNNEPLVVDIMIDREMDiVYPWVKPGEWLTN 555
                                               ****.7*********************************************986538*******98876 PP

Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (557 nodes)
Target sequences:                          1  (571 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.03u 0.01s 00:00:00.04 Elapsed: 00:00:00.03
# Mc/sec: 9.75
//
[ok]
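To read the domain table row in the hmmsearch output above, it helps to turn the coordinates into coverage fractions. The sketch below parses the single domain row and reports what share of the 557-node model and of the 571-residue protein the alignment spans. The `DomainHit` class and the helper names are hypothetical, and whether GapMind computes model coverage in exactly this way is an assumption.

```python
from dataclasses import dataclass


@dataclass
class DomainHit:
    score: float
    hmm_from: int
    hmm_to: int
    ali_from: int
    ali_to: int


def parse_domain_row(row: str) -> DomainHit:
    """Parse one row of the hmmsearch per-domain table.

    Whitespace-separated fields: #, !/?, score, bias, c-Evalue, i-Evalue,
    hmmfrom, hmmto, '..', alifrom, alito, '..', envfrom, envto, '..', acc
    """
    f = row.split()
    return DomainHit(
        score=float(f[2]),
        hmm_from=int(f[6]),
        hmm_to=int(f[7]),
        ali_from=int(f[9]),
        ali_to=int(f[10]),
    )


def span_fraction(start: int, end: int, total_length: int) -> float:
    """Fraction of a sequence or model covered by the inclusive range."""
    return (end - start + 1) / total_length


# Row and lengths copied from the hmmsearch output above.
ROW = "1 ! 591.1 0.0 9.9e-182 9.9e-182 7 552 .. 5 555 .. 2 559 .. 0.96"
MODEL_LENGTH = 557      # [M=557]
PROTEIN_LENGTH = 571    # residues searched

hit = parse_domain_row(ROW)
model_coverage = span_fraction(hit.hmm_from, hit.hmm_to, MODEL_LENGTH)
protein_coverage = span_fraction(hit.ali_from, hit.ali_to, PROTEIN_LENGTH)
print(f"model coverage {model_coverage:.1%}, protein coverage {protein_coverage:.1%}")
```

Here the alignment covers model positions 7 to 552 and protein positions 5 to 555, so both fractions are well above 90%.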
This GapMind analysis is from Apr 10 2024. The underlying query database was built on Apr 09 2024.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other blast hits with at least 50% coverage are "low confidence."
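As a rough illustration of the 50% coverage cutoff for low-confidence hits, the sketch below computes the aligned fraction of the curated protein from the BLAST report above (query positions 26 to 566 of 574 letters) and compares it with that cutoff. The function names are hypothetical, and treating coverage as the aligned fraction of the curated sequence is an assumption about how GapMind measures it. The cutoff shown is only the low-confidence one stated in the text, not the high- or medium-confidence rules.

```python
# Cutoff stated in the text for "low confidence" blast hits.
LOW_CONFIDENCE_MIN_COVERAGE = 0.5


def aligned_coverage(start: int, end: int, length: int) -> float:
    """Fraction of a sequence spanned by the inclusive alignment range."""
    return (end - start + 1) / length


def meets_low_confidence_coverage(coverage: float) -> bool:
    """True if a hit reaches the 50% coverage cutoff."""
    return coverage >= LOW_CONFIDENCE_MIN_COVERAGE


# Coordinates copied from the BLAST alignment of curated2:P37251.
curated_coverage = aligned_coverage(start=26, end=566, length=574)
print(f"{curated_coverage:.1%}", meets_low_confidence_coverage(curated_coverage))
```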
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
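The definition of a gap given above (a step with no high- or medium-confidence candidates) can be sketched in a few lines. The data layout and the step names other than ilvB are hypothetical and chosen only for illustration. GapMind's own data structures are not shown in this page.

```python
def find_gaps(candidates_by_step: dict) -> list:
    """Return the steps that have no high- or medium-confidence candidate.

    `candidates_by_step` maps a step name to the list of confidence labels
    ("high", "medium", or "low") of its candidates.
    """
    return [
        step
        for step, labels in candidates_by_step.items()
        if not any(label in ("high", "medium") for label in labels)
    ]


# Hypothetical example: one step with a strong candidate, one with only a
# low-confidence candidate, and one with no candidates at all.
example = {
    "ilvB": ["high", "low"],
    "stepA": ["low"],
    "stepB": [],
}
gaps = find_gaps(example)
print(gaps)
```

A step with only low-confidence candidates counts as a gap in the same way as a step with no candidates.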
For more information, see:
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory