Align acetohydroxy-acid synthase large subunit (EC 2.2.1.6) (characterized)
to candidate WP_017751807.1 PN53_RS01895 biosynthetic-type acetolactate synthase large subunit
Query= metacyc::MONOMER-11900 (599 letters)

>NCBI__GCF_000816635.1:WP_017751807.1
Length = 559

 Score = 486 bits (1251), Expect = e-141
 Identities = 254/573 (44%), Positives = 366/573 (63%), Gaps = 18/573 (3%)

Query: 1   MNGAEAMIKALEAEKVEILFGYPGGALLPFYDALHH--SDLIHLLTRHEQAAAHAADGYA 58
           M GA+ +++ L    V+ +FGYPGGA+LP YDAL+     + H++T HEQ A+HAADGYA
Sbjct: 1   MKGAKMLLECLVQHGVDTIFGYPGGAVLPIYDALYDMKDKINHIITAHEQGASHAADGYA 60

Query: 59  RASGKVGVCIGTSGPGATNLVTGVATAHSDSSPMVALTGQVPTKLIGNDAFQEIDALGLF 118
           R++GKVGV + TSGPGATN VTG+ATA+ DS P++   GQVP  L+G D+FQE++   +
Sbjct: 61  RSTGKVGVALATSGPGATNTVTGIATAYMDSVPIIVFAGQVPVSLVGKDSFQEVNIRSIT 120

Query: 119 MPIVKHNFQIQKTCQIPEIFRSAFEIAQTGRPGPVHIDLPKDVQELELDIDKHPIPSKVK 178
             I K  F I K   I  +   AF+IA +GR GPV I++PK+VQ  E  I +  + S
Sbjct: 121 GTITKKTFTIDKVENIKSVIDEAFKIATSGRKGPVVIEVPKNVQVSE--ISEMSLNSIEN 178

Query: 179 LIGYNPTTIGHPRQIKKAIKLIASAKRPIILAGGGVLLSGANEELLKLVELLNIPVCTTL 238
            + Y+  +  H   + KAIK+I +++RP++ AGGGV+ SGA +EL K V+ L+ P+  +L
Sbjct: 179 ELIYDLFSNKHENALNKAIKIIENSERPMVYAGGGVVSSGAEDELSKFVDKLDTPISCSL 238

Query: 239 MGKGCISENHPLALGMVGMHGTKPANYCLSESDVLISIGCRFSDRITGDIKSFATNAKII 298
           MG G   ++     GM+GMHGT  +N  +++ D+LI+IG RFSDR+   + +FA NA+II
Sbjct: 239 MGTGAFPQDRDNYTGMLGMHGTHTSNQAINDCDLLIAIGARFSDRVISKVSTFAKNARII 298

Query: 299 HIDIDPAEIGKNVNVDVPIVGDAKLILKEVIKQLDYIINKDSKENNDKENISQWIENVNS 358
           HIDID  E GKN++VD+ I GD K IL++        +N   K  N     S+W+  + S
Sbjct: 299 HIDIDEKEFGKNIDVDLTIKGDIKNILEK--------LNSSIKNQNH----SKWMRKIIS 346

Query: 359 LKKSSIPVMDYDDIPIKPQKIVKELMAVIDDLNINKNTIITTDVGQNQMWMAHYFKTQTP 418
            KK+    ++ +D   K     + ++  + +L    + I+TT+VGQNQ+W A YFK   P
Sbjct: 347 SKKNEELKIENNDSEKKDYVSPRHIIETLYELT-GGDCIVTTEVGQNQIWTAQYFKFLKP 405

Query: 419 RSFLSSGGLGTMGFGFPSAIGAKVAKPDSKVICITGDGGFMMNCQELGTIAEYNIPVVIC 478
           R+FL+SGGLGTMGFG  +AIGA V  PD +VI + GDG F MNC EL TI++Y +PV+
Sbjct: 406 RTFLTSGGLGTMGFGLGAAIGACVGNPDKRVINVAGDGSFKMNCNELATISKYKLPVIQL 465

Query: 479 IFDNRTLGMVYQWQNLFYGKRQCSVNFGGAPDFIKLAESYGIKARRIESPNEINEALKEA 538
           + +N +LGMV+QWQ +FY +R C        D +KL E+YGIK  +IE+ NEI + L+ A
Sbjct: 466 VLNNNSLGMVHQWQEMFYNRRYCFTELTDDVDILKLGEAYGIKTLKIENNNEIEDCLRFA 525

Query: 539 INCDEPYLLDFAID-PSSALSMVPPGAKLTNII 570
           +   EP +++ +ID       +VPPGA +T+ I
Sbjct: 526 LEKREPIIIECSIDINEKVFPIVPPGASITDSI 558

Lambda     K      H
   0.319    0.137    0.405

Gapped
Lambda     K      H
   0.267   0.0410    0.140

Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 855
Number of extensions: 40
Number of successful extensions: 4
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 599
Length of database: 559
Length adjustment: 36
Effective length of query: 563
Effective length of database: 523
Effective search space: 294449
Effective search space used: 294449
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 53 (25.0 bits)
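The reported E-value can be cross-checked from two numbers in the BLAST statistics: the bit score and the effective search space. The sketch below assumes the standard Karlin-Altschul relation E = N * 2^(-S) for bit score S and search space N. The function name evalue_from_bits is illustrative only and is not part of BLAST or GapMind.

```python
def evalue_from_bits(bit_score: float, search_space: float) -> float:
    """Expected number of chance hits: E = search_space * 2 ** (-bit_score)."""
    return search_space * 2.0 ** (-bit_score)


# Values copied from the BLAST report for this alignment.
bit_score = 486            # "Score = 486 bits"
effective_query_len = 563  # "Effective length of query"
effective_db_len = 523     # "Effective length of database"
search_space = effective_query_len * effective_db_len  # reported as 294449

e_value = evalue_from_bits(bit_score, search_space)
print(f"search space = {search_space}, E = {e_value:.1e}")
```

Because the bit score is rounded to an integer in the report, the result is only approximate (on the order of 1e-141), which is in line with the reported Expect = e-141.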
Align candidate WP_017751807.1 PN53_RS01895 (biosynthetic-type acetolactate synthase large subunit)
to HMM TIGR00118 (ilvB: acetolactate synthase, large subunit, biosynthetic type (EC 2.2.1.6))
# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.aa/TIGR00118.hmm
# target sequence database:        /tmp/gapView.2691264.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR00118  [M=557]
Accession:   TIGR00118
Description: acolac_lg: acetolactate synthase, large subunit, biosynthetic type
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                             Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                             -----------
   5.4e-221  720.8   1.4   6.1e-221  720.6   1.4    1.0  1  NCBI__GCF_000816635.1:WP_017751807.1

Domain annotation for each sequence (and alignments):
>> NCBI__GCF_000816635.1:WP_017751807.1
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  720.6   1.4  6.1e-221  6.1e-221       1     553 [.       1     557 [.       1     559 [] 0.97

  Alignments for each domain:
  == domain 1  score: 720.6 bits;  conditional E-value: 6.1e-221
                             TIGR00118   1 lkgaeilveslkkegvetvfGyPGGavlpiydaly..dselehilvrheqaaahaadGyarasGkvGvvlats 71
                                           +kga++l+e l ++gv+t+fGyPGGavlpiydaly +++++hi++ heq+a+haadGyar++GkvGv+lats
  NCBI__GCF_000816635.1:WP_017751807.1   1 MKGAKMLLECLVQHGVDTIFGYPGGAVLPIYDALYdmKDKINHIITAHEQGASHAADGYARSTGKVGVALATS 73
                                           89*********************************9999********************************** PP

                             TIGR00118  72 GPGatnlvtgiatayldsvPlvvltGqvatsliGsdafqeidilGitlpvtkhsflvkkaedlpeilkeafei 144
                                           GPGatn+vtgiatay+dsvP++v++Gqv+ sl+G+d+fqe++i it ++tk +f + k+e++ +++ eaf+i
  NCBI__GCF_000816635.1:WP_017751807.1  74 GPGATNTVTGIATAYMDSVPIIVFAGQVPVSLVGKDSFQEVNIRSITGTITKKTFTIDKVENIKSVIDEAFKI 146
                                           ************************************************************************* PP

                             TIGR00118 145 AstGrPGPvlvdlPkdvteaeieleveekvelp.gykptvkghklqikkaleliekakkPvllvGgGviiaea 216
                                           a++Gr GPv++++Pk+v+ +ei+ +++e y + + h++ ++ka+++ie++++P++++GgGv+ ++a
  NCBI__GCF_000816635.1:WP_017751807.1 147 ATSGRKGPVVIEVPKNVQVSEISEMSLNSIENElIYDLFSNKHENALNKAIKIIENSERPMVYAGGGVVSSGA 219
                                           *********************999888777654279************************************* PP

                             TIGR00118 217 seelkelaerlkipvtttllGlGafpedhplalgmlGmhGtkeanlavseadlliavGarfddrvtgnlakfa 289
                                           ++el +++++l+ p+ ++l+G Gafp+d + gmlGmhGt+++n a++++dllia+Garf+drv +++++fa
  NCBI__GCF_000816635.1:WP_017751807.1 220 EDELSKFVDKLDTPISCSLMGTGAFPQDRDNYTGMLGMHGTHTSNQAINDCDLLIAIGARFSDRVISKVSTFA 292
                                           ************************************************************************* PP

                             TIGR00118 290 peakiihididPaeigknvkvdipivGdakkvleellkklkeeekkekeWlekieewkkeyilkldeeees.. 360
                                           ++a+IIhidid +e gkn+ vd++i Gd k++le+l + +k++++++ W++ki + kk+ lk+++++++
  NCBI__GCF_000816635.1:WP_017751807.1 293 KNARIIHIDIDEKEFGKNIDVDLTIKGDIKNILEKLNSSIKNQNHSK--WMRKIISSKKNEELKIENNDSEkk 363
                                           **************************************999887776..*********999999775432212 PP

                             TIGR00118 361 ..ikPqkvikelskllkdeaivttdvGqhqmwaaqfyktkkprkfitsgGlGtmGfGlPaalGakvakpeetv 431
                                           + P+++i++l++l+ ++ ivtt+vGq+q+w+aq++k+ kpr+f+tsgGlGtmGfGl aa+Ga v++p++ v
  NCBI__GCF_000816635.1:WP_017751807.1 364 dyVSPRHIIETLYELTGGDCIVTTEVGQNQIWTAQYFKFLKPRTFLTSGGLGTMGFGLGAAIGACVGNPDKRV 436
                                           4599********************************************************************* PP

                             TIGR00118 432 vavtGdgsfqmnlqelstiveydipvkivilnnellGmvkqWqelfyeerysetklaselpdfvklaeayGvk 504
                                           ++v+Gdgsf+mn++el+ti++y++pv+ ++lnn+ lGmv+qWqe+fy++ry t+l+ + d+ kl eayG+k
  NCBI__GCF_000816635.1:WP_017751807.1 437 INVAGDGSFKMNCNELATISKYKLPVIQLVLNNNSLGMVHQWQEMFYNRRYCFTELT-DDVDILKLGEAYGIK 508
                                           *********************************************************.699************ PP

                             TIGR00118 505 giriekpeeleeklkealeskepvlldvevdkeeevlPmvapGagldel 553
                                           +++ie+++e+e+ l+ ale++ep++++ ++d +e+v+P+v+pGa++++
  NCBI__GCF_000816635.1:WP_017751807.1 509 TLKIENNNEIEDCLRFALEKREPIIIECSIDINEKVFPIVPPGASITDS 557
                                           ********************************************99875 PP

Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (557 nodes)
Target sequences:                          1  (559 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.00u 0.01s 00:00:00.01 Elapsed: 00:00:00.01
# Mc/sec: 24.51
//
[ok]
This GapMind analysis is from Jul 25 2024. The underlying query database was built on Jul 25 2024.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other blast hits with at least 50% coverage are "low confidence."
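As a rough illustration of the 50% coverage floor, the sketch below computes the coverage of the BLAST alignment shown earlier on this page. It assumes coverage is measured along the characterized (query) protein, which this page does not state. The function names are hypothetical, and the check covers only the coverage floor, not the high- or medium-confidence rules.

```python
def alignment_coverage(ali_from: int, ali_to: int, length: int) -> float:
    """Fraction of a protein spanned by an alignment (inclusive coordinates)."""
    return (ali_to - ali_from + 1) / length


def meets_coverage_floor(coverage: float, min_coverage: float = 0.5) -> bool:
    """True if the hit reaches the minimum coverage for a low-confidence candidate."""
    return coverage >= min_coverage


# Coordinates from the BLAST alignment: query residues 1..570 of 599 letters.
cov = alignment_coverage(1, 570, 599)
print(f"coverage {cov:.1%}, passes floor: {meets_coverage_floor(cov)}")
```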
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
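For readers unfamiliar with the term, a six-frame translation translates a DNA sequence in all three reading frames on each strand, so that proteins missed by gene prediction can still be found. The sketch below is a minimal illustration using the standard codon table and a made-up 12-base sequence. It is not GapMind's or Curated BLAST's implementation, and all names in it are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna: str) -> str:
    """Translate complete codons; unknown codons become 'X', stops become '*'."""
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3))


def six_frame_translation(dna: str) -> dict:
    """Translate frames +1..+3 on the given strand and -1..-3 on the reverse strand."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


# Toy input, not taken from the genome analyzed on this page.
frames = six_frame_translation("ATGAAAGGTGCT")
for name, protein in frames.items():
    print(name, protein)
```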
For more information, see:
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory