Align 5-methyltetrahydropteroyltriglutamate-homocysteine S-methyltransferase (EC 2.1.1.14) (characterized)
to candidate 17877 b3829 5-methyltetrahydropteroyltriglutamate-- homocysteine methyltransferase (NCBI)
Query= BRENDA::P25665
         (753 letters)

>lcl|FitnessBrowser__Keio:17877 b3829 5-methyltetrahydropteroyltriglutamate-- homocysteine methyltransferase (NCBI)
          Length = 753

 Score = 1527 bits (3953), Expect = 0.0
 Identities = 753/753 (100%), Positives = 753/753 (100%)

Query: 1   MTILNHTLGFPRVGLRRELKKAQESYWAGNSTREELLAVGRELRARHWDQQKQAGIDLLP 60
           MTILNHTLGFPRVGLRRELKKAQESYWAGNSTREELLAVGRELRARHWDQQKQAGIDLLP
Sbjct: 1   MTILNHTLGFPRVGLRRELKKAQESYWAGNSTREELLAVGRELRARHWDQQKQAGIDLLP 60

Query: 61  VGDFAWYDHVLTTSLLLGNVPARHQNKDGSVDIDTLFRIGRGRAPTGEPAAAAEMTKWFN 120
           VGDFAWYDHVLTTSLLLGNVPARHQNKDGSVDIDTLFRIGRGRAPTGEPAAAAEMTKWFN
Sbjct: 61  VGDFAWYDHVLTTSLLLGNVPARHQNKDGSVDIDTLFRIGRGRAPTGEPAAAAEMTKWFN 120

Query: 121 TNYHYMVPEFVKGQQFKLTWTQLLDEVDEALALGHKVKPVLLGPVTWLWLGKVKGEQFDR 180
           TNYHYMVPEFVKGQQFKLTWTQLLDEVDEALALGHKVKPVLLGPVTWLWLGKVKGEQFDR
Sbjct: 121 TNYHYMVPEFVKGQQFKLTWTQLLDEVDEALALGHKVKPVLLGPVTWLWLGKVKGEQFDR 180

Query: 181 LSLLNDILPVYQQVLAELAKRGIEWVQIDEPALVLELPQAWLDAYKPAYDALQGQVKLLL 240
           LSLLNDILPVYQQVLAELAKRGIEWVQIDEPALVLELPQAWLDAYKPAYDALQGQVKLLL
Sbjct: 181 LSLLNDILPVYQQVLAELAKRGIEWVQIDEPALVLELPQAWLDAYKPAYDALQGQVKLLL 240

Query: 241 TTYFEGVTPNLDTITALPVQGLHVDLVHGKDDVAELHKRLPSDWLLSAGLINGRNVWRAD 300
           TTYFEGVTPNLDTITALPVQGLHVDLVHGKDDVAELHKRLPSDWLLSAGLINGRNVWRAD
Sbjct: 241 TTYFEGVTPNLDTITALPVQGLHVDLVHGKDDVAELHKRLPSDWLLSAGLINGRNVWRAD 300

Query: 301 LTEKYAQIKDIVGKRDLWVASSCSLLHSPIDLSVETRLDAEVKSWFAFALQKCHELALLR 360
           LTEKYAQIKDIVGKRDLWVASSCSLLHSPIDLSVETRLDAEVKSWFAFALQKCHELALLR
Sbjct: 301 LTEKYAQIKDIVGKRDLWVASSCSLLHSPIDLSVETRLDAEVKSWFAFALQKCHELALLR 360

Query: 361 DALNSGDTAALAEWSAPIQARRHSTRVHNPAVEKRLAAITAQDSQRANVYEVRAEAQRAR 420
           DALNSGDTAALAEWSAPIQARRHSTRVHNPAVEKRLAAITAQDSQRANVYEVRAEAQRAR
Sbjct: 361 DALNSGDTAALAEWSAPIQARRHSTRVHNPAVEKRLAAITAQDSQRANVYEVRAEAQRAR 420

Query: 421 FKLPAWPTTTIGSFPQTTEIRTLRLDFKKGNLDANNYRTGIAEHIKQAIVEQERLGLDVL 480
           FKLPAWPTTTIGSFPQTTEIRTLRLDFKKGNLDANNYRTGIAEHIKQAIVEQERLGLDVL
Sbjct: 421 FKLPAWPTTTIGSFPQTTEIRTLRLDFKKGNLDANNYRTGIAEHIKQAIVEQERLGLDVL 480

Query: 481 VHGEAERNDMVEYFGEHLDGFVFTQNGWVQSYGSRCVKPPIVIGDISRPAPITVEWAKYA 540
           VHGEAERNDMVEYFGEHLDGFVFTQNGWVQSYGSRCVKPPIVIGDISRPAPITVEWAKYA
Sbjct: 481 VHGEAERNDMVEYFGEHLDGFVFTQNGWVQSYGSRCVKPPIVIGDISRPAPITVEWAKYA 540

Query: 541 QSLTDKPVKGMLTGPVTILCWSFPREDVSRETIAKQIALALRDEVADLEAAGIGIIQIDE 600
           QSLTDKPVKGMLTGPVTILCWSFPREDVSRETIAKQIALALRDEVADLEAAGIGIIQIDE
Sbjct: 541 QSLTDKPVKGMLTGPVTILCWSFPREDVSRETIAKQIALALRDEVADLEAAGIGIIQIDE 600

Query: 601 PALREGLPLRRSDWDAYLQWGVEAFRINAAVAKDDTQIHTHMCYCEFNDIMDSIAALDAD 660
           PALREGLPLRRSDWDAYLQWGVEAFRINAAVAKDDTQIHTHMCYCEFNDIMDSIAALDAD
Sbjct: 601 PALREGLPLRRSDWDAYLQWGVEAFRINAAVAKDDTQIHTHMCYCEFNDIMDSIAALDAD 660

Query: 661 VITIETSRSDMELLESFEEFDYPNEIGPGVYDIHSPNVPSVEWIEALLKKAAKRIPAERL 720
           VITIETSRSDMELLESFEEFDYPNEIGPGVYDIHSPNVPSVEWIEALLKKAAKRIPAERL
Sbjct: 661 VITIETSRSDMELLESFEEFDYPNEIGPGVYDIHSPNVPSVEWIEALLKKAAKRIPAERL 720

Query: 721 WVNPDCGLKTRGWPETRAALANMVQAAQNLRRG 753
           WVNPDCGLKTRGWPETRAALANMVQAAQNLRRG
Sbjct: 721 WVNPDCGLKTRGWPETRAALANMVQAAQNLRRG 753

Lambda     K      H
   0.319    0.135    0.416

Gapped
Lambda     K      H
   0.267   0.0410    0.140

Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 2107
Number of extensions: 89
Number of successful extensions: 3
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 753
Length of database: 753
Length adjustment: 40
Effective length of query: 713
Effective length of database: 713
Effective search space: 508369
Effective search space used: 508369
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.8 bits)
S2: 55 (25.8 bits)
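The statistics at the end of the BLAST output above can be checked by hand. The following is a minimal sketch, assuming the usual Karlin-Altschul conversion from raw score to bit score and the usual subtraction of the length adjustment from the query and from each database sequence. The function names are illustrative and are not part of BLAST or GapMind. Because the reported lambda and K are rounded, the recomputed bit score is only expected to agree with the report to within rounding.

```python
import math


def bit_score(raw_score, lam, k):
    """Convert a raw alignment score to bits: (lambda * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def effective_search_space(query_len, db_len, length_adjustment, num_seqs=1):
    """Return (effective query length, effective database length, search space).

    Assumes the length adjustment is subtracted once from the query and once
    per database sequence, which matches the numbers in the report above.
    """
    eff_query = query_len - length_adjustment
    eff_db = db_len - num_seqs * length_adjustment
    return eff_query, eff_db, eff_query * eff_db


# Values copied from the report above (gapped lambda and K, raw score 3953).
bits = bit_score(3953, lam=0.267, k=0.0410)
eff_query, eff_db, space = effective_search_space(753, 753, 40, num_seqs=1)

print(f"bit score: {bits:.1f}")
print(f"effective lengths: {eff_query} x {eff_db} = {space}")
```

With the values from the report, the effective lengths are 713 and 713 and the search space is 508369, as printed by BLAST; the bit score comes out close to the reported 1527.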
Align candidate 17877 b3829 (5-methyltetrahydropteroyltriglutamate-- homocysteine methyltransferase (NCBI))
to HMM TIGR01371 (metE: 5-methyltetrahydropteroyltriglutamate--homocysteine S-methyltransferase (EC 2.1.1.14))
# hmmsearch :: search profile(s) against a sequence database # HMMER 3.3.1 (Jul 2020); http://hmmer.org/ # Copyright (C) 2020 Howard Hughes Medical Institute. # Freely distributed under the BSD open source license. # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - # query HMM file: ../tmp/path.aa/TIGR01371.hmm # target sequence database: /tmp/gapView.6115.genome.faa # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Query: TIGR01371 [M=754] Accession: TIGR01371 Description: met_syn_B12ind: 5-methyltetrahydropteroyltriglutamate--homocysteine S-methyltransferase Scores for complete sequences (score includes all domains): --- full sequence --- --- best 1 domain --- -#dom- E-value score bias E-value score bias exp N Sequence Description ------- ------ ----- ------- ------ ----- ---- -- -------- ----------- 0 1124.8 0.0 0 1124.6 0.0 1.0 1 lcl|FitnessBrowser__Keio:17877 b3829 5-methyltetrahydropteroylt Domain annotation for each sequence (and alignments): >> lcl|FitnessBrowser__Keio:17877 b3829 5-methyltetrahydropteroyltriglutamate-- homocysteine methyltransferase (NCBI) # score bias c-Evalue i-Evalue hmmfrom hmm to alifrom ali to envfrom env to acc --- ------ ----- --------- --------- ------- ------- ------- ------- ------- ------- ---- 1 ! 1124.6 0.0 0 0 1 753 [. 8 752 .. 8 753 .] 
0.98 Alignments for each domain: == domain 1 score: 1124.6 bits; conditional E-value: 0 TIGR01371 1 lgfPrigekRelkkalekywkgkiskeellkvakdlrkkalkkqkeagvdvipvndfslYDhvLdtavllgaiperfke 79 lgfPr+g +Relkka+e+yw+g++++eell+v ++lr++++++qk+ag+d++pv+df++YDhvL+t++llg++p+r+++ lcl|FitnessBrowser__Keio:17877 8 LGFPRVGLRRELKKAQESYWAGNSTREELLAVGRELRARHWDQQKQAGIDLLPVGDFAWYDHVLTTSLLLGNVPARHQN 86 79***************************************************************************** PP TIGR01371 80 laddesdldtyFaiaRGtek..kdvaalemtkwfntnYhYlvPelskeeefklsknklleeykeakelgvetkPvllGp 156 + d+++d+dt+F+i+RG + + +aa emtkwfntnYhY+vPe+ k ++fkl++++ll+e++ea +lg+++kPvllGp lcl|FitnessBrowser__Keio:17877 87 K-DGSVDIDTLFRIGRGRAPtgEPAAAAEMTKWFNTNYHYMVPEFVKGQQFKLTWTQLLDEVDEALALGHKVKPVLLGP 164 8.8889***********98755668899*************************************************** PP TIGR01371 157 itflkLakakeeeekellellekllpvYkevlkklaeagvewvqidePvlvldlskeelaavkeayeeleeaskelkll 235 +t+l+L+k k +e++++l+ll+++lpvY++vl +la++g+ewvqideP+lvl+l++++l+a+k ay++l+ ++kll lcl|FitnessBrowser__Keio:17877 165 VTWLWLGKVK-GEQFDRLSLLNDILPVYQQVLAELAKRGIEWVQIDEPALVLELPQAWLDAYKPAYDALQG---QVKLL 239 *******999.699********************************************************8...79*** PP TIGR01371 236 lqtYfdsveealeklvslpvealglDlveakeelelakakfeedkvLvaGvidGrniwkadlekslkllkkleakagdk 314 l+tYf+ v+ +l++++ lpv++l++Dlv++k++ + ++++++d++L+aG+i+Grn+w+adl+++ +++k++ k+ lcl|FitnessBrowser__Keio:17877 240 LTTYFEGVTPNLDTITALPVQGLHVDLVHGKDDVAELHKRLPSDWLLSAGLINGRNVWRADLTEKYAQIKDIVGKR--D 316 *********************************9999************************************888..6 PP TIGR01371 315 lvvstscsllhvpvdleleekldkelkellafakekleelkvlkealegeaavaealeaeaaaiaarkkskrvadekvk 393 l+v++scsllh+p+dl+ e++ld+e+k+++afa +k++el++l++al++ +++al++++a+i+ar++s+rv++ +v+ lcl|FitnessBrowser__Keio:17877 317 LWVASSCSLLHSPIDLSVETRLDAEVKSWFAFALQKCHELALLRDALNS--GDTAALAEWSAPIQARRHSTRVHNPAVE 393 ************************************************7..66677888999***************** PP TIGR01371 394 
erlealkekkarressfeeRaeaqekklnlPllPtttiGsfPqtkevRkaRakfrkgeiseeeYekfikeeikkviklq 472 +rl+a+++++++r++ +e Raeaq+++++lP+ PtttiGsfPqt+e+R+ R +f+kg++++++Y + i e+ik++i q lcl|FitnessBrowser__Keio:17877 394 KRLAAITAQDSQRANVYEVRAEAQRARFKLPAWPTTTIGSFPQTTEIRTLRLDFKKGNLDANNYRTGIAEHIKQAIVEQ 472 ******************************************************************************* PP TIGR01371 473 eelglDvLvhGefeRnDmveyFgeklaGfaftqngWvqsYGsRcvkPpiiygdvsrpkpmtvkeskyaqsltskpvkGm 551 e+lglDvLvhGe+eRnDmveyFge+l+Gf+ftqngWvqsYGsRcvkPpi++gd+srp+p+tv++ kyaqslt+kpvkGm lcl|FitnessBrowser__Keio:17877 473 ERLGLDVLVHGEAERNDMVEYFGEHLDGFVFTQNGWVQSYGSRCVKPPIVIGDISRPAPITVEWAKYAQSLTDKPVKGM 551 ******************************************************************************* PP TIGR01371 552 LtGPvtilnWsfvReDlprkeiaeqialalrdevkdLeeagikiiqiDepalReglPlrksdkeeYldwaveaFrlaas 630 LtGPvtil+Wsf+ReD++r++ia+qialalrdev+dLe+agi iiqiDepalReglPlr+sd+++Yl+w veaFr+ a+ lcl|FitnessBrowser__Keio:17877 552 LTGPVTILCWSFPREDVSRETIAKQIALALRDEVADLEAAGIGIIQIDEPALREGLPLRRSDWDAYLQWGVEAFRINAA 630 ******************************************************************************* PP TIGR01371 631 gvkdetqihthmCYsefneiieaiaaldaDvisieasrsdmelldalkeikkyekeiGlGvyDihsprvPskeelaell 709 +kd+tqihthmCY+efn+i+++iaaldaDvi+ie+srsdmell++++e ++y++eiG+GvyDihsp+vPs+e++++ll lcl|FitnessBrowser__Keio:17877 631 VAKDDTQIHTHMCYCEFNDIMDSIAALDADVITIETSRSDMELLESFEE-FDYPNEIGPGVYDIHSPNVPSVEWIEALL 708 *************************************************.77*************************** PP TIGR01371 710 ekalkklpkerlWvnPDCGLktRkweevkaalknlveaakelRe 753 +ka k++p+erlWvnPDCGLktR w+e++aal n+v+aa++lR lcl|FitnessBrowser__Keio:17877 709 KKAAKRIPAERLWVNPDCGLKTRGWPETRAALANMVQAAQNLRR 752 *****************************************995 PP Internal pipeline statistics summary: ------------------------------------- Query model(s): 1 (754 nodes) Target sequences: 1 (753 residues searched) Passed MSV filter: 1 (1); expected 0.0 (0.02) Passed bias filter: 1 (1); expected 0.0 (0.02) Passed Vit 
filter: 1 (1); expected 0.0 (0.001) Passed Fwd filter: 1 (1); expected 0.0 (1e-05) Initial search space (Z): 1 [actual number of targets] Domain search space (domZ): 1 [number of targets reported over threshold] # CPU time: 0.04u 0.01s 00:00:00.05 Elapsed: 00:00:00.06 # Mc/sec: 8.51 // [ok]
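The domain table in the hmmsearch output above reports the aligned range on the model (hmm from 1 to 753, with M=754) and on the protein (ali from 8 to 752, with 753 residues). The following is a minimal sketch of turning those ranges into coverage fractions. The helper name is hypothetical, and treating coverage as the aligned range divided by the total length is an assumption for illustration, not necessarily how GapMind computes it.

```python
def coverage(start, end, total_len):
    """Fraction of a model or sequence spanned by a 1-based inclusive range."""
    if not (1 <= start <= end <= total_len):
        raise ValueError("range must satisfy 1 <= start <= end <= total_len")
    return (end - start + 1) / total_len


# Ranges copied from the domain table above.
hmm_coverage = coverage(1, 753, 754)   # aligned part of the TIGR01371 model
seq_coverage = coverage(8, 752, 753)   # aligned part of the candidate protein

print(f"model coverage:    {hmm_coverage:.3f}")
print(f"sequence coverage: {seq_coverage:.3f}")
```

Both fractions are close to 1, which is consistent with a full-length match between the candidate and the model.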
This GapMind analysis is from Aug 03 2021. The underlying query database was built on Aug 03 2021.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other blast hits with at least 50% coverage are "low confidence."
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
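To illustrate what searching the six-frame translation means, here is a minimal sketch that translates a DNA string in all three reading frames on both strands. It is not GapMind's or Curated BLAST's code. It assumes the standard codon assignments, writes stop codons as `*` and codons containing ambiguous bases as `X`, and uses a short made-up DNA string as input.

```python
from itertools import product

# Standard codon assignments, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons from position 0; unknown codons become X."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translation(dna):
    """Return a dict mapping frame labels (+1..+3, -1..-3) to protein strings."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


# Hypothetical input: a short DNA string chosen for illustration only.
frames = six_frame_translation("ATGACCATTCTGAAC")
for label, protein in frames.items():
    print(label, protein)
```

A protein search against these six translations can find a gene that the genome's annotation missed, which a search restricted to the predicted proteins cannot.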
For more information, see the 2019 paper on GapMind for amino acid biosynthesis, the 2022 paper on GapMind for carbon sources, the source code, or the changes to amino acid biosynthesis since the publication.
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory