GapMind for Amino acid biosynthesis

 

Alignments for a candidate for metE in Clostridium tyrobutyricum FAM22553

Align 5-methyltetrahydropteroyltriglutamate--homocysteine methyltransferase 3, chloroplastic; Cobalamin-independent methionine synthase 3; AtMS3; EC 2.1.1.14 (characterized)
to candidate WP_039652020.1 PN53_RS03355 5-methyltetrahydropteroyltriglutamate--homocysteine S-methyltransferase

Query= SwissProt::Q0WNZ5
         (812 letters)



>NCBI__GCF_000816635.1:WP_039652020.1
          Length = 776

 Score =  644 bits (1660), Expect = 0.0
 Identities = 346/781 (44%), Positives = 476/781 (60%), Gaps = 29/781 (3%)

Query: 51  SHIVGYPRIGPKRELKFALESFWDGKTNVDDLQNVAANLRKSIWKHMAHAGIKYIPSNTF 110
           S IVGYPRIG  RELKFA+ES++ G  + ++L +    LR   W     +G+  IPSN F
Sbjct: 4   STIVGYPRIGVNRELKFAVESYFKGNIDSNELFSTGKKLRDEYWSKQKESGLDIIPSNDF 63

Query: 111 SYYDQMLDTTAMLGAVPSRYGWESGEIGFDVYFSMARG----NASAHAMEMTKWFDTNYH 166
           SYYD MLD   +L  +P +Y  + G    D YF+MARG    +    A+ M KWF+TNYH
Sbjct: 64  SYYDNMLDMAFLLNIIPQKYK-DLGLSPLDTYFAMARGYQNNSKDIRALPMKKWFNTNYH 122

Query: 167 YIVPELGPDVNFSYASHKAVVEFKEAKALGIDTVPVLIGPMTYLLLSKPAKGVEKSFCLL 226
           YIVPE+  +  FS    K    ++E+K L I+T PV+IG  T+L LS          CL 
Sbjct: 123 YIVPEIDQNTIFSINDTKPFDLYRESKELNIETKPVIIGIFTFLKLSNLKGNTTFEHCL- 181

Query: 227 SLIDKILPVYKEVLADLKSAGARWIQFDEPILVMDLDTSQLQAFSDAYSHMESSLAGLNV 286
              +K+  +Y ++L   +  G +++Q DEPILV DL   +++ F + Y  + +       
Sbjct: 182 ---NKLANIYIDILDKFQQEGIKYLQIDEPILVTDLTEYEIELFKNVYDKILNKKYSFKT 238

Query: 287 LIATYFADVPAEAYKTLMSLKCVTGFGFDLVRGLETLDLI-KMNFPRGKLLFAGVVDGRN 345
           L+ TYF D+  + Y+ L +LK   G G D V G + L L+ K  FP  K+L AG+V+G+N
Sbjct: 239 LLQTYFGDI-RDIYENLQNLK-FNGIGLDFVEGKKNLTLLQKYGFPDNKILIAGIVNGKN 296

Query: 346 IWANDLSASLKTLQTLEDIVGKEKVVVSTSCSLLHTAVDLVNEMKLD------------- 392
           IW ND   S++ +  +   +   K+ +STSCSLLH    +  E  +D             
Sbjct: 297 IWRNDYKHSIQLMNNIGKYIDTNKIYISTSCSLLHVPYTVKPEGHIDDKNIGKVDNLINM 356

Query: 393 KELKSWLAFAAQKVVEVNALAKSFSGA--KDEALFSSNSMRQASRRSSPRVTNAAVQQDV 450
            E    L+FA +K+ E++ + +       + ++ +  N      +R +P   N  ++ ++
Sbjct: 357 SEYIESLSFAEEKLNELSEIKELLKCKYYEKQSKYIKNQDILKRKRQNPLCYNKEIRTNI 416

Query: 451 DAVKKSDHHRSTEVSVRLQAQQKKLNLPALPTTTIGSFPQTTDLRRIRREFKAKKISEVD 510
           + +K  D  R   +  R + Q     LP LPTTTIGSFPQT +++++R++ +   I++ +
Sbjct: 417 NNLKPEDFTRKDSLEFRKKVQNNTFKLPLLPTTTIGSFPQTHEIKKLRKDLRGNTITKEN 476

Query: 511 YVQTIKEEYEKVIKLQEELGIDVLVHGEAERNDMVEFFGEQLSGFAFTSNGWVQSYGSRC 570
           Y   I E+ ++V+KLQE++G+DVLVHGE ER DMVE+FG  L GF FT NGWVQSYG+R 
Sbjct: 477 YENQIMEKIKEVLKLQEDIGLDVLVHGEYERADMVEYFGRLLDGFLFTKNGWVQSYGTRA 536

Query: 571 VKPPIIYGDITRPKAMTVFWSSMAQKMTQRPMKGMLTGPVTILNWSFVRNDQPRHETCFQ 630
           VKPPIIYGD+ R   MT+ W   AQ  T +P+KGMLTGP+TILNWSF R D    +  +Q
Sbjct: 537 VKPPIIYGDVKRTAPMTLDWIKFAQDQTDKPVKGMLTGPITILNWSFPREDLDLKQIAYQ 596

Query: 631 IALAIKDEVEDLEKAGVTVIQIDEAALREGLPLR-KSEQKFYLDWAVHAFRITNSGVQDS 689
           I LAI +EV DLE  G+ +IQIDEAALRE LPLR K   K YLDWA+ AFR+TNS V+  
Sbjct: 597 IGLAIGEEVLDLESEGIKIIQIDEAALREKLPLRTKDWHKKYLDWAIPAFRLTNSKVKSE 656

Query: 690 TQIHTHMCYSNFNDIIHSIIDMDADVITIENSRSDEKLLSVFHEGVKYGAGIGPGVYDIH 749
           TQIHTHMCYS F+ I+  I DMDADV +IE +RSD  +L  F +   + + IGPG+YDIH
Sbjct: 657 TQIHTHMCYSEFSSIVQEIKDMDADVYSIEAARSDFSILD-FLKNNNFKSQIGPGIYDIH 715

Query: 750 SPRIPSTEEIAERINKMLAVLDSKVLWVNPDCGLKTRNYSEVKSALSNMVAAAKLIRSQL 809
           SPRIPS EE+ + I  ML  +D   LW+NPDCGLKTR+  EVK +L NMV A K IR +L
Sbjct: 716 SPRIPSIEELEKSIKIMLDKIDCDKLWINPDCGLKTRDIDEVKKSLINMVLATKNIRKKL 775

Query: 810 N 810
           N
Sbjct: 776 N 776


Lambda     K      H
   0.318    0.133    0.387 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 1415
Number of extensions: 73
Number of successful extensions: 8
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 812
Length of database: 776
Length adjustment: 41
Effective length of query: 771
Effective length of database: 735
Effective search space:   566685
Effective search space used:   566685
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 55 (25.8 bits)
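
The summary figures in the BLAST report above can be re-derived from the raw counts it prints. The following is a minimal Python sketch, not part of GapMind; the variable and function names are illustrative. It assumes that BLAST truncates percentages to whole numbers, which is inferred from the values shown (476/781 is about 60.9% but is reported as 60%).

```python
# Raw counts copied from the BLAST report above.
identities, positives, gaps, alignment_columns = 346, 476, 29, 781
query_length, subject_length, length_adjustment = 812, 776, 41


def truncated_percent(count: int, total: int) -> int:
    """Whole-number percentage, truncated (assumed to match BLAST's display)."""
    return count * 100 // total


percent_identity = truncated_percent(identities, alignment_columns)
percent_positive = truncated_percent(positives, alignment_columns)
percent_gaps = truncated_percent(gaps, alignment_columns)

# Effective lengths subtract the length adjustment from each sequence;
# the effective search space is their product.
effective_query = query_length - length_adjustment
effective_subject = subject_length - length_adjustment
effective_search_space = effective_query * effective_subject

print(percent_identity, percent_positive, percent_gaps)  # 44 60 3
print(effective_query, effective_subject, effective_search_space)  # 771 735 566685
```

These values agree with the "Identities", "Positives", "Gaps", "Effective length" and "Effective search space" lines of the report.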

Align candidate WP_039652020.1 PN53_RS03355 (5-methyltetrahydropteroyltriglutamate--homocysteine S-methyltransferase)
to HMM TIGR01371 (metE: 5-methyltetrahydropteroyltriglutamate--homocysteine S-methyltransferase (EC 2.1.1.14))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.aa/TIGR01371.hmm
# target sequence database:        /tmp/gapView.60410.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR01371  [M=754]
Accession:   TIGR01371
Description: met_syn_B12ind: 5-methyltetrahydropteroyltriglutamate--homocysteine S-methyltransferase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                             Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                             -----------
          0 1036.8   3.0          0 1036.5   3.0    1.1  1  NCBI__GCF_000816635.1:WP_039652020.1  


Domain annotation for each sequence (and alignments):
>> NCBI__GCF_000816635.1:WP_039652020.1  
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 ! 1036.5   3.0         0         0       1     754 []       7     774 ..       7     774 .. 0.96

  Alignments for each domain:
  == domain 1  score: 1036.5 bits;  conditional E-value: 0
                             TIGR01371   1 lgfPrigekRelkkalekywkgkiskeellkvakdlrkkalkkqkeagvdvipvndfslYDhvLdtavllgai 73 
                                           +g+Prig +Relk+a+e+y+kg+i+++el ++ k+lr + ++kqke+g+d+ip+ndfs+YD++Ld+a ll++i
  NCBI__GCF_000816635.1:WP_039652020.1   7 VGYPRIGVNRELKFAVESYFKGNIDSNELFSTGKKLRDEYWSKQKESGLDIIPSNDFSYYDNMLDMAFLLNII 79 
                                           69*********************************************************************** PP

                             TIGR01371  74 perfkeladdesdldtyFaiaRGtek..kdvaalemtkwfntnYhYlvPelskeeefklsknklleeykeake 144
                                           p+++k+l  + s ldtyFa+aRG+++  kd+ al m+kwfntnYhY+vPe+++++ f+++ +k+++ y+e ke
  NCBI__GCF_000816635.1:WP_039652020.1  80 PQKYKDL--GLSPLDTYFAMARGYQNnsKDIRALPMKKWFNTNYHYIVPEIDQNTIFSINDTKPFDLYRESKE 150
                                           ******9..4557**********998789******************************************** PP

                             TIGR01371 145 lgvetkPvllGpitflkLakakeeeekellellekllpvYkevlkklaeagvewvqidePvlvldlskeelaa 217
                                           l++etkPv++G++tflkL+  k   +++ ++ l+kl ++Y ++l+k++++g++++qideP+lv+dl++ e+++
  NCBI__GCF_000816635.1:WP_039652020.1 151 LNIETKPVIIGIFTFLKLSNLKG--NTTFEHCLNKLANIYIDILDKFQQEGIKYLQIDEPILVTDLTEYEIEL 221
                                           *******************9886..57899******************************************* PP

                             TIGR01371 218 vkeayeeleeaskelklllqtYfdsveealeklvslpvealglDlveakee.lelakakfeedkvLvaGvidG 289
                                           +k++y+++ +++ + k llqtYf+++++ +e+l++l+++++glD+ve+k++ + l+k++f+++k+L+aG+++G
  NCBI__GCF_000816635.1:WP_039652020.1 222 FKNVYDKILNKKYSFKTLLQTYFGDIRDIYENLQNLKFNGIGLDFVEGKKNlTLLQKYGFPDNKILIAGIVNG 294
                                           *************************************************99777899**************** PP

                             TIGR01371 290 rniwkadlekslkllkkleakag.dklvvstscsllhvpvdleleekldk.............elkellafak 348
                                           +niw++d+++s++l++++ ++ + +k+ +stscsllhvp++++ e ++d+             e+ e l+fa+
  NCBI__GCF_000816635.1:WP_039652020.1 295 KNIWRNDYKHSIQLMNNIGKYIDtNKIYISTSCSLLHVPYTVKPEGHIDDknigkvdnlinmsEYIESLSFAE 367
                                           ********************9999**********************9986333322222222245678***** PP

                             TIGR01371 349 ekleelkvlkealeg.eaavaealeaeaaaiaarkkskrvadekvkerlealkekkarressfeeRaeaqekk 420
                                           ekl+el+ +ke+l+    +++++  ++++ ++++++++   ++++++++++lk ++++r+ s+e R++ q+++
  NCBI__GCF_000816635.1:WP_039652020.1 368 EKLNELSEIKELLKCkYYEKQSKYIKNQDILKRKRQNPLCYNKEIRTNINNLKPEDFTRKDSLEFRKKVQNNT 440
                                           **************9666666677777788888889999********************************** PP

                             TIGR01371 421 lnlPllPtttiGsfPqtkevRkaRakfrkgeiseeeYekfikeeikkviklqeelglDvLvhGefeRnDmvey 493
                                           ++lPllPtttiGsfPqt+e++k R+++r ++i++e+Ye+ i e+ik+v+klqe++glDvLvhGe eR Dmvey
  NCBI__GCF_000816635.1:WP_039652020.1 441 FKLPLLPTTTIGSFPQTHEIKKLRKDLRGNTITKENYENQIMEKIKEVLKLQEDIGLDVLVHGEYERADMVEY 513
                                           ************************************************************************* PP

                             TIGR01371 494 FgeklaGfaftqngWvqsYGsRcvkPpiiygdvsrpkpmtvkeskyaqsltskpvkGmLtGPvtilnWsfvRe 566
                                           Fg  l+Gf+ft+ngWvqsYG+R+vkPpiiygdv+r++pmt++++k+aq  t+kpvkGmLtGP+tilnWsf+Re
  NCBI__GCF_000816635.1:WP_039652020.1 514 FGRLLDGFLFTKNGWVQSYGTRAVKPPIIYGDVKRTAPMTLDWIKFAQDQTDKPVKGMLTGPITILNWSFPRE 586
                                           ************************************************************************* PP

                             TIGR01371 567 DlprkeiaeqialalrdevkdLeeagikiiqiDepalReglPlrksdk.eeYldwaveaFrlaasgvkdetqi 638
                                           Dl++k+ia+qi+la+ +ev dLe++gikiiqiDe+alRe+lPlr++d+ ++Yldwa+ aFrl++s+vk etqi
  NCBI__GCF_000816635.1:WP_039652020.1 587 DLDLKQIAYQIGLAIGEEVLDLESEGIKIIQIDEAALREKLPLRTKDWhKKYLDWAIPAFRLTNSKVKSETQI 659
                                           ***********************************************9679********************** PP

                             TIGR01371 639 hthmCYsefneiieaiaaldaDvisieasrsdmelldalkeikkyekeiGlGvyDihsprvPskeelaellek 711
                                           hthmCYsef+ i+++i+++daDv siea+rsd ++ld lk+ ++++++iG+G+yDihspr+Ps eel+++++ 
  NCBI__GCF_000816635.1:WP_039652020.1 660 HTHMCYSEFSSIVQEIKDMDADVYSIEAARSDFSILDFLKN-NNFKSQIGPGIYDIHSPRIPSIEELEKSIKI 731
                                           *****************************************.77***************************** PP

                             TIGR01371 712 alkklpkerlWvnPDCGLktRkweevkaalknlveaakelRek 754
                                           +l+k++ ++lW+nPDCGLktR+ +evk++l n+v a+k++R+k
  NCBI__GCF_000816635.1:WP_039652020.1 732 MLDKIDCDKLWINPDCGLKTRDIDEVKKSLINMVLATKNIRKK 774
                                           *****************************************85 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (754 nodes)
Target sequences:                          1  (776 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.01u 0.01s 00:00:00.02 Elapsed: 00:00:00.01
# Mc/sec: 31.92
//
[ok]
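
To read the domain table above programmatically, the single domain row can be split on whitespace and the coordinates compared with the model and protein lengths. This is a hedged sketch rather than GapMind's own parser: the field positions are taken from the column header printed by hmmsearch above, and the names `hmm_coverage` and `target_coverage` are illustrative.

```python
# Domain row copied from the hmmsearch output above.
domain_line = (
    "   1 ! 1036.5   3.0         0         0       1     754 []"
    "       7     774 ..       7     774 .. 0.96"
)
model_length = 754   # from "[M=754]"
target_length = 776  # from "776 residues searched"

fields = domain_line.split()
score = float(fields[2])                             # "score" column
hmm_from, hmm_to = int(fields[6]), int(fields[7])    # "hmmfrom", "hmm to"
ali_from, ali_to = int(fields[9]), int(fields[10])   # "alifrom", "ali to"

# Fraction of the model and of the candidate protein covered by the alignment.
hmm_coverage = (hmm_to - hmm_from + 1) / model_length
target_coverage = (ali_to - ali_from + 1) / target_length

print(f"score={score} hmm_coverage={hmm_coverage:.2f} "
      f"target_coverage={target_coverage:.2f}")
# score=1036.5 hmm_coverage=1.00 target_coverage=0.99
```

Here the alignment spans all 754 model positions and 768 of the candidate's 776 residues.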

This GapMind analysis is from Jul 25 2024. The underlying query database was built on Jul 25 2024.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMER with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."
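
The 50% coverage floor can be illustrated with the alignment coordinates from the BLAST report above. This is only a sketch under stated assumptions: it assumes coverage is measured on the characterized (query) protein, the function and constant names are hypothetical, and it checks only the coverage floor, not the high- or medium-confidence criteria.

```python
def alignment_coverage(start: int, end: int, length: int) -> float:
    """Fraction of a sequence spanned by an alignment (1-based, inclusive)."""
    return (end - start + 1) / length


# "at least 50% coverage", from the text above.
LOW_CONFIDENCE_MIN_COVERAGE = 0.5

# Query positions 51..810 of the 812-residue characterized protein (Q0WNZ5).
coverage = alignment_coverage(51, 810, 812)
meets_minimum = coverage >= LOW_CONFIDENCE_MIN_COVERAGE

print(f"{coverage:.3f} {meets_minimum}")  # 0.936 True
```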

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory