Align acetohydroxyacid synthase subunit B (EC 2.2.1.6) (characterized)
to candidate WP_083537097.1 ACG33_RS00740 acetolactate synthase, large subunit, biosynthetic type
Query= metacyc::MONOMER-18810
         (585 letters)

>NCBI__GCF_001579945.1:WP_083537097.1
          Length = 596

 Score = 429 bits (1104), Expect = e-124
 Identities = 254/580 (43%), Positives = 337/580 (58%), Gaps = 21/580 (3%)

Query: 19  MIGAEILVHALAEEGVEYVWGYPGGAVLYIYDELH---------KQTKFEHILVRHEQAA 69
           M GAEI+V LA+EGV+ V+GY GGA+L YD + + + I+ +EQ A
Sbjct: 1   MTGAEIIVQVLADEGVDTVFGYSGGAILPTYDAIFVNNQDCERCDRRQMSLIVPANEQGA 60

Query: 70  VHAADGYARATGKVGVALVTSGPGVTNAVTGIATAYLDSIPMVVITGNVPTHAIGQDAFQ 129
           A GYARATGKVGV +VTSGPG TN VT + DSIP+VVI G VPT AIG DAFQ
Sbjct: 61  GFMAAGYARATGKVGVCIVTSGPGATNTVTPVRDCMADSIPIVVICGQVPTGAIGSDAFQ 120

Query: 130 ECDTVGITRPIVKHNFLVKDVRDLAATIKKAFFIAATGRPGPVVVDIPKDVSRNACKYEY 189
           E I + KH FLV D L ATI+ AF IA +GRPGPVV+D+PKDV K++
Sbjct: 121 EAPVASIMGAVAKHVFLVTDPSKLEATIRTAFEIARSGRPGPVVIDVPKDVQNWQGKFQG 180

Query: 190 PKSIDMRSYNP--VNKGHS----GQIRKAVALLQGAERPYIYTGGGVVLANASDELRQLA 243
           + + Y HS + + +L A RP IY GGGV+ + S L++ A
Sbjct: 181 AGRLPVAGYRQRMTRLTHSVLSDARCAEFFTMLGAARRPLIYAGGGVIHSGGSQALQEFA 240

Query: 244 ALTGHPVTNTLMGLGAFPGTSKQFVGMLGMHGTYEANMAMQNCDVLIAIGARFDDRVIGN 303
           G PV TLMGLGA T + MLGMHG AN A+ +CD L A+GARFDDRV GN
Sbjct: 241 IEYGIPVVTTLMGLGALDTTHPLAMRMLGMHGAAFANYAVDDCDFLFALGARFDDRVAGN 300

Query: 304 PAHFTSQARKIIHIDIDPSSISKRVKVDIPIVGNVKDVLQELIAQIKASDIKPKREALAK 363
           PA F A++I IDID S I+K +V +G + + L+ LI + S +
Sbjct: 301 PAKFAPNAKQIAQIDIDISEINKVKQVHWHHIGLLPEALRGLIDYGRRSAFN---RDWST 357

Query: 364 WWEQIEQWRSVDCLKYDRSSEIIKPQYVVEKIWELTKGDAFICSDVGQHQMWAAQFYKFD 423
           W +Q R + Y+R SE I+P +V+E+I +LT+G+A I + VGQHQMWAAQ++ F
Sbjct: 358 WRTHCDQLRRTYAMNYERDSERIQPYHVIEEINKLTRGEAIITTGVGQHQMWAAQYFDFR 417

Query: 424 EPRRWINSGGLGTMGVGLPYAMGIKKAFPEKEVVTITGEGSIQMCIQELSTCLQYDTPVK 483
           PR W+ SG +GTMG GLP A+G + A P++ V+ I G+ SI+M + EL T Y P+K
Sbjct: 418 SPRLWLTSGSMGTMGFGLPAAIGAQFAQPDRLVIDIDGDSSIRMNLGELETVTTYGLPIK 477

Query: 484 ICSLNNGYLGMVRQWQEIEYDNRYSHSYMDA-LPDFVKLAEAYGH-VGMRVEKTSDVEPA 541
           + LNN GMV+QWQ++ + R + S DF+K AEA G MR+E+ DV
Sbjct: 478 VVVLNNCGDGMVKQWQKLFFKGRLAASDRSLHKKDFLKAAEADGFPYVMRLERPQDVARV 537

Query: 542 LREAFRLKDRTVFLDFQTDPTENVWPMVQAGKGISEMLLG 581
           ++E + FL+ DP V+PMV G+ SEM+ G
Sbjct: 538 VKEFVEFQG-PAFLEVMIDPDAGVYPMVGPGQPYSEMITG 576

Lambda     K      H
   0.319    0.135    0.407

Gapped
Lambda     K      H
   0.267   0.0410    0.140

Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 896
Number of extensions: 51
Number of successful extensions: 5
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 585
Length of database: 596
Length adjustment: 37
Effective length of query: 548
Effective length of database: 559
Effective search space: 306332
Effective search space used: 306332
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.8 bits)
S2: 53 (25.0 bits)
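The summary figures in the BLAST report above can be cross-checked with a little arithmetic. The following Python sketch is illustrative only: the function names are hypothetical, the percentages are assumed to be displayed rounded down (which matches the reported 43%, 58% and 3%), and the effective search space is assumed to be the product of the two effective lengths, each obtained by subtracting the length adjustment.

```python
# Figures transcribed from the BLAST report above.
IDENTITIES, POSITIVES, GAPS, ALIGNMENT_COLUMNS = 254, 337, 21, 580
QUERY_LENGTH, DATABASE_LENGTH, LENGTH_ADJUSTMENT = 585, 596, 37
QUERY_START, QUERY_END = 19, 581  # first and last aligned query positions


def truncated_percent(count: int, total: int) -> int:
    """Whole-number percentage, rounded down (assumed display convention)."""
    return (100 * count) // total


def effective_search_space(query_len: int, db_len: int, adjustment: int,
                           n_sequences: int = 1) -> int:
    """(query - adjustment) * (database - n_sequences * adjustment)."""
    return (query_len - adjustment) * (db_len - n_sequences * adjustment)


def query_coverage(start: int, end: int, query_len: int) -> float:
    """Fraction of the query spanned by the alignment (1-based, inclusive)."""
    return (end - start + 1) / query_len


if __name__ == "__main__":
    print("identity %:", truncated_percent(IDENTITIES, ALIGNMENT_COLUMNS))
    print("positives %:", truncated_percent(POSITIVES, ALIGNMENT_COLUMNS))
    print("gaps %:", truncated_percent(GAPS, ALIGNMENT_COLUMNS))
    print("effective search space:",
          effective_search_space(QUERY_LENGTH, DATABASE_LENGTH, LENGTH_ADJUSTMENT))
    print("query coverage:",
          round(query_coverage(QUERY_START, QUERY_END, QUERY_LENGTH), 3))
```

With the values from the report, the effective lengths come out as 548 and 559, whose product is the reported 306332, and the aligned region spans about 96% of the query.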
Align candidate WP_083537097.1 ACG33_RS00740 (acetolactate synthase, large subunit, biosynthetic type)
to HMM TIGR00118 (ilvB: acetolactate synthase, large subunit, biosynthetic type (EC 2.2.1.6))
# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.aa/TIGR00118.hmm
# target sequence database:        /tmp/gapView.30156.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR00118  [M=557]
Accession:   TIGR00118
Description: acolac_lg: acetolactate synthase, large subunit, biosynthetic type
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                                  Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                                  -----------
   1.1e-199  650.4   0.0   1.3e-199  650.2   0.0    1.0  1  lcl|NCBI__GCF_001579945.1:WP_083537097.1  ACG33_RS00740 acetolactate synth

Domain annotation for each sequence (and alignments):
>> lcl|NCBI__GCF_001579945.1:WP_083537097.1  ACG33_RS00740 acetolactate synthase, large subunit, biosynthetic type
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  650.2   0.0  1.3e-199  1.3e-199       1     555 [.       1     575 [.       1     577 [. 0.95

  Alignments for each domain:
  == domain 1  score: 650.2 bits;  conditional E-value: 1.3e-199
                                 TIGR00118   1 lkgaeilveslkkegvetvfGyPGGavlpiydaly..........dselehilvrheqaaahaadGyar 59
                                               ++gaei+v+ l +egv+tvfGy GGa+lp yda++ +++ i++ eq+a +a Gyar
  lcl|NCBI__GCF_001579945.1:WP_083537097.1   1 MTGAEIIVQVLADEGVDTVFGYSGGAILPTYDAIFvnnqdcercdRRQMSLIVPANEQGAGFMAAGYAR 69
                                               79*********************************8765444433346899****************** PP

                                 TIGR00118  60 asGkvGvvlatsGPGatnlvtgiatayldsvPlvvltGqvatsliGsdafqeidilGitlpvtkhsflv 128
                                               a+GkvGv+++tsGPGatn+vt++ + ++ds+P+vv+ Gqv+t +iGsdafqe+ + i+ +v kh flv
  lcl|NCBI__GCF_001579945.1:WP_083537097.1  70 ATGKVGVCIVTSGPGATNTVTPVRDCMADSIPIVVICGQVPTGAIGSDAFQEAPVASIMGAVAKHVFLV 138
                                               ********************************************************************* PP

                                 TIGR00118 129 kkaedlpeilkeafeiastGrPGPvlvdlPkdvteaeieleveekvelpgykptvkghklq......ik 191
                                               ++++ l ++++ afeia +GrPGPv++d+Pkdv++ + +++ ++ + gy+ +++ ++ +
  lcl|NCBI__GCF_001579945.1:WP_083537097.1 139 TDPSKLEATIRTAFEIARSGRPGPVVIDVPKDVQNWQGKFQGAGRLPVAGYRQRMTRLTHSvlsdarCA 207
                                               ***************************************************987664443322333366 PP

                                 TIGR00118 192 kaleliekakkPvllvGgGviiaeaseelkelaerlkipvtttllGlGafpedhplalgmlGmhGtkea 260
                                               + ++ +a++P++++GgGvi++++s+ l+e+a + ipv+ttl+GlGa++ +hpla+ mlGmhG++ a
  lcl|NCBI__GCF_001579945.1:WP_083537097.1 208 EFFTMLGAARRPLIYAGGGVIHSGGSQALQEFAIEYGIPVVTTLMGLGALDTTHPLAMRMLGMHGAAFA 276
                                               6678899************************************************************** PP

                                 TIGR00118 261 nlavseadlliavGarfddrvtgnlakfapeak.iihididPaeigknvkvdipivGdakkvleellkk 328
                                               n+av+++d l+a+Garfddrv+gn akfap+ak i idid +ei+k+ +v+ +G + l+ l++
  lcl|NCBI__GCF_001579945.1:WP_083537097.1 277 NYAVDDCDFLFALGARFDDRVAGNPAKFAPNAKqIAQIDIDISEINKVKQVHWHHIGLLPEALRGLIDY 345
                                               ********************************97889******************************98 PP

                                 TIGR00118 329 lkee..ekkekeWlekieewkkeyilkldeeeesikPqkvikelskllkdeaivttdvGqhqmwaaqfy 395
                                               ++ +++ + W + ++++++y +++++++e i+P +vi+e+ kl+++eai+tt+vGqhqmwaaq++
  lcl|NCBI__GCF_001579945.1:WP_083537097.1 346 GRRSafNRDWSTWRTHCDQLRRTYAMNYERDSERIQPYHVIEEINKLTRGEAIITTGVGQHQMWAAQYF 414
                                               887764555556********************************************************* PP

                                 TIGR00118 396 ktkkprkfitsgGlGtmGfGlPaalGakvakpeetvvavtGdgsfqmnlqelstiveydipvkivilnn 464
                                               ++++pr ++tsg +GtmGfGlPaa+Ga+ a+p+ v++++Gd+s+ mnl el+t++ y++p+k+v+lnn
  lcl|NCBI__GCF_001579945.1:WP_083537097.1 415 DFRSPRLWLTSGSMGTMGFGLPAAIGAQFAQPDRLVIDIDGDSSIRMNLGELETVTTYGLPIKVVVLNN 483
                                               ********************************************************************* PP

                                 TIGR00118 465 ellGmvkqWqelfyeerysetklaselpdfvklaeayGv.kgiriekpeeleeklkealeskepvlldv 532
                                               GmvkqWq+lf+++r +++ + ++ df k aea G +r+e+p++++ +ke++e ++p++l+v
  lcl|NCBI__GCF_001579945.1:WP_083537097.1 484 CGDGMVKQWQKLFFKGRLAASDRSLHKKDFLKAAEADGFpYVMRLERPQDVARVVKEFVEFQGPAFLEV 552
                                               ************************99************84579************************** PP

                                 TIGR00118 533 evdkeeevlPmvapGagldelve 555
                                               +d ++ v+Pmv pG+ +e+++
  lcl|NCBI__GCF_001579945.1:WP_083537097.1 553 MIDPDAGVYPMVGPGQPYSEMIT 575
                                               ********************996 PP


Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (557 nodes)
Target sequences:                          1  (596 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.03u 0.01s 00:00:00.04 Elapsed: 00:00:00.03
# Mc/sec: 9.37
//
[ok]
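As a rough illustration of how the domain table above can be read, the Python sketch below computes how much of the model and of the candidate protein the single reported domain spans. The class and method names are hypothetical, and treating coverage as aligned span divided by total length is an assumption made here, not necessarily the exact definition GapMind applies.

```python
from dataclasses import dataclass


@dataclass
class DomainHit:
    """One row of the hmmsearch domain table (coordinates are 1-based, inclusive)."""
    model_len: int   # M, the number of match states in the HMM
    hmm_from: int
    hmm_to: int
    target_len: int  # residues in the candidate protein
    ali_from: int
    ali_to: int

    def model_coverage(self) -> float:
        """Fraction of the HMM spanned by the alignment."""
        return (self.hmm_to - self.hmm_from + 1) / self.model_len

    def target_coverage(self) -> float:
        """Fraction of the candidate protein spanned by the alignment."""
        return (self.ali_to - self.ali_from + 1) / self.target_len


# Values transcribed from the report above: TIGR00118 has M=557 and the
# candidate WP_083537097.1 has 596 residues.
hit = DomainHit(model_len=557, hmm_from=1, hmm_to=555,
                target_len=596, ali_from=1, ali_to=575)

if __name__ == "__main__":
    print(f"model coverage:  {hit.model_coverage():.1%}")
    print(f"target coverage: {hit.target_coverage():.1%}")
```

For this hit the alignment covers nearly the whole model (555 of 557 positions) and about 96% of the candidate protein.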
This GapMind analysis is from Apr 10 2024. The underlying query database was built on Apr 09 2024.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other blast hits with at least 50% coverage are "low confidence."
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
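The notion of a gap described above (a step with no high- or medium-confidence candidate) can be sketched in a few lines of Python. This is not GapMind's code: the function name, the data layout, and the step names stepX and stepY are hypothetical, and ilvB is used only because it is the step name attached to TIGR00118 in the report above.

```python
from typing import Dict, List

# Confidence labels that keep a step from being counted as a gap.
CONFIDENT_LEVELS = {"high", "medium"}


def find_gaps(step_candidates: Dict[str, List[str]]) -> List[str]:
    """Return the steps that have no high- or medium-confidence candidate.

    step_candidates maps a step name to the confidence labels of its
    candidates; an empty list means no candidate was found at all.
    """
    return [step for step, levels in step_candidates.items()
            if not CONFIDENT_LEVELS.intersection(levels)]


# Hypothetical pathway: one well-supported step, one step with only
# low-confidence candidates, and one step with no candidates.
example = {
    "ilvB": ["high"],
    "stepX": ["low", "low"],
    "stepY": [],
}

if __name__ == "__main__":
    print(find_gaps(example))
```

A step with only low-confidence candidates is still reported as a gap here, which follows the wording of the definition above.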
For more information, see:
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory