GapMind for Amino acid biosynthesis

 

Alignments for a candidate for metH in Caulobacter crescentus NA1000

Align methionine synthase (EC 2.1.1.13) (characterized)
to candidate CCNA_02221 CCNA_02221 methionine synthase I metH

Query= BRENDA::P13009
         (1227 letters)



>FitnessBrowser__Caulo:CCNA_02221
          Length = 899

 Score = 1084 bits (2803), Expect = 0.0
 Identities = 566/885 (63%), Positives = 667/885 (75%), Gaps = 17/885 (1%)

Query: 355  LFVNVGERTNVTGSAKFKRLIKEEKYSEALDVARQQVENGAQIIDINMDEGMLDAEAAMV 414
            +FVN+GERTNVTGSAKFK+LI E  Y EAL VARQQVE GAQ+ID+NMDEG+LD++ AMV
Sbjct: 8    VFVNIGERTNVTGSAKFKKLIVEGNYPEALSVARQQVEAGAQVIDVNMDEGLLDSQQAMV 67

Query: 415  RFLNLIAGEPDIARVPIMIDSSKWDVIEKGLKCIQGKGIVNSISMKEGVDAFIHHAKLLR 474
             FLNL+A EPDIARVP+MIDSSKW+VIE GLKC+QGK IVNSIS+KEG + F+  A L  
Sbjct: 68   TFLNLMAAEPDIARVPVMIDSSKWEVIEAGLKCVQGKAIVNSISLKEGEEKFLEQATLCL 127

Query: 475  RYGAAVVVMAFDEQGQADTRARKIEICRRAYKILTEEVGFPPEDIIFDPNIFAVATGIEE 534
            RYGAAVVVMAFDE GQADT  RK+EIC RAY  L ++VGFPPEDIIFDPNIFAVATGIEE
Sbjct: 128  RYGAAVVVMAFDEVGQADTEKRKVEICTRAYNTLVDKVGFPPEDIIFDPNIFAVATGIEE 187

Query: 535  HNNYAQDFIGACEDIKRELPHALISGGVSNVSFSFRGNDPVREAIHAVFLYYAIRNGMDM 594
            H+NYA DFI A   IK+ LP+A +SGGVSNVSFSFRGN+PVR AIH+VFLY+AI  GMDM
Sbjct: 188  HDNYAVDFIEATRRIKQMLPYARVSGGVSNVSFSFRGNEPVRRAIHSVFLYHAINAGMDM 247

Query: 595  GIVNAGQLAIYDDLPAELRDAVEDVILNR--RD---DGTERLLELAEKYRGSKTDDTANA 649
            GIVNAG L +YDD+   LR+AVEDVILNR  RD     TERL+E+A +Y+G K       
Sbjct: 248  GIVNAGDLPVYDDIDPALREAVEDVILNRPQRDPVMTNTERLVEMAPRYKGEKGQQ--QV 305

Query: 650  QQAEWRSWEVNKRLEYSLVKGITEFIEQDTEEARQQATRPIEVIEGPLMDGMNVVGDLFG 709
               EWR   VN+RL ++LV GITEFIEQDTEEAR  A RP+ VIEGPLMDGMNVVGDLFG
Sbjct: 306  ANLEWRKGTVNERLTHALVHGITEFIEQDTEEARLAAERPLHVIEGPLMDGMNVVGDLFG 365

Query: 710  EGKMFLPQVVKSARVMKQAVAYLEPFIEASKE--QGKTNGKMVIATVKGDVHDIGKNIVG 767
             GKMFLPQVVKSARVMKQAVA+L PF+EA KE  + K  GK+++ATVKGDVHDIGKNIVG
Sbjct: 366  AGKMFLPQVVKSARVMKQAVAWLMPFMEAEKEGQERKAAGKVLMATVKGDVHDIGKNIVG 425

Query: 768  VVLQCNNYEIVDLGVMVPAEKILRTAKEVNADLIGLSGLITPSLDEMVNVAKEMERQGFT 827
            VVLQCNNYE+VDLGVMVPA++IL  AK+   D+IGLSGLITPSLDEMV VA EMERQGF 
Sbjct: 426  VVLQCNNYEVVDLGVMVPADRILDEAKKHKVDMIGLSGLITPSLDEMVFVAAEMERQGFD 485

Query: 828  IPLLIGGATTSKAHTAVKIEQNY-SGPTVYVQNASRTVGVVAALLSDTQRDDFVARTRKE 886
            IPLLIGGATTS+ HTAVKIE  Y  GPT YV +ASR VGVV+ LLS+ +RD  +A TR E
Sbjct: 486  IPLLIGGATTSRTHTAVKIEPAYRRGPTTYVVDASRAVGVVSGLLSEGERDRIIAETRAE 545

Query: 887  YETVRIQHGRKKPRTPPVTLEAARDNDFAFDWQAYTPPVAHRLGVQEVEASIETLRNYID 946
            Y  VR Q+ R +      +++ AR   FA DW+ Y PP    +G +  E S+  L  +ID
Sbjct: 546  YVKVREQYARGQTTKARASIQEARKRAFAIDWKGYAPPKPAFIGTRVFEPSLAELVPFID 605

Query: 947  WTPFFMTWSLAGKYPRILEDEVVGVEAQRLFKDANDMLDKLSAEKTLNPRGVVGLFPANR 1006
            W+PFF +W L G++P+ILED+VVG  A  L++DA  MLDK+  EK    +GV+G +PA  
Sbjct: 606  WSPFFASWELIGRFPQILEDDVVGQAATDLYRDARAMLDKVVEEKWFGAKGVIGFWPAQA 665

Query: 1007 VGDDIEIYRDETRTHVINVSHHLRQQTEK------TGFANYCLADFVAPKLSGKADYIGA 1060
             GDDI +Y DETR    +  H LRQQ +K         AN  L+DFVAP   G ADY+G 
Sbjct: 666  QGDDIVLYTDETRVAEFSRLHTLRQQMDKGADKSGEAKANVALSDFVAPIGQG-ADYVGG 724

Query: 1061 FAVTGGLEEDALADAFEAQHDDYNKIMVKALADRLAEAFAEYLHERVRKVYWGYAPNENL 1120
            FAVT G  ED +   F+A  DDYN IM  ALADRLAEAFAE+LH + R   WGYA +E+ 
Sbjct: 725  FAVTAGHGEDEIVAKFKAAGDDYNAIMASALADRLAEAFAEWLHYKARVELWGYAADEDA 784

Query: 1121 SNEELIRENYQGIRPAPGYPACPEHTEKATIWELLEVEKHTGMKLTESFAMWPGASVSGW 1180
              E LI E YQGIRPAPGYPA P+HTEK T+++LL+ E  TG++LTES+AM PGA+VSG 
Sbjct: 785  DVERLIAEKYQGIRPAPGYPAQPDHTEKGTLFKLLDAEAATGLQLTESYAMTPGAAVSGL 844

Query: 1181 YFSHPDSKYYAVAQIQRDQVEDYARRKGMSVTEVERWLAPNLGYD 1225
            +FSH  + Y+ V +I  DQVEDYARRKG  +   ERWL+P L YD
Sbjct: 845  FFSHRQAHYFGVGKIDADQVEDYARRKGWDMETAERWLSPILNYD 889


Lambda     K      H
   0.318    0.134    0.391 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 2687
Number of extensions: 101
Number of successful extensions: 8
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 2
Number of HSP's successfully gapped: 1
Length of query: 1227
Length of database: 899
Length adjustment: 45
Effective length of query: 1182
Effective length of database: 854
Effective search space:  1009428
Effective search space used:  1009428
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 58 (26.9 bits)
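The statistics in this report can be cross-checked by hand. The sketch below is illustrative only: it assumes the standard Karlin-Altschul conversion from raw score to bits, (lambda * S - ln K) / ln 2, and uses the gapped Lambda and K values printed above. The function names are hypothetical and are not part of BLAST or GapMind.

```python
import math

def effective_search_space(query_len, db_len, length_adjustment):
    """Product of the effective query length and effective database length."""
    return (query_len - length_adjustment) * (db_len - length_adjustment)

def bit_score(raw_score, lam, k):
    """Convert a raw alignment score to bits: (lambda * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)

# Values copied from the report for CCNA_02221.
space = effective_search_space(query_len=1227, db_len=899, length_adjustment=45)
bits = bit_score(raw_score=2803, lam=0.267, k=0.0410)

print(space)           # matches "Effective search space" in the report
print(round(bits, 1))  # close to the reported 1084 bits
```

Because Lambda and K are printed with limited precision, the recomputed bit score agrees with the report only to within rounding.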

Align methionine synthase (EC 2.1.1.13) (characterized)
to candidate CCNA_02222 CCNA_02222 5-methyltetrahydrofolate--homocysteine methyltransferase homocysteine-binding subunit

Query= BRENDA::P13009
         (1227 letters)



>FitnessBrowser__Caulo:CCNA_02222
          Length = 358

 Score =  375 bits (963), Expect = e-108
 Identities = 182/350 (52%), Positives = 243/350 (69%), Gaps = 1/350 (0%)

Query: 2   SSKVEQLRAQLNERILVLDGGMGTMIQSYRLNEADFRGERFADWPCDLKGNNDLLVLSKP 61
           +++V  L+A   ERIL+LDG  G M Q   L EAD+R ERFA +   +KGNND+L L++P
Sbjct: 8   ANRVAALKAAAKERILILDGSWGVMFQKKGLTEADYRAERFAAYNGQMKGNNDILCLTRP 67

Query: 62  EVIAAIHNAYFEAGADIIETNTFNSTTIAMADYQMESLSA-EINFAAAKLARACADEWTA 120
           +++A +H+AYF AGADI ETNTF+ TTIA ADY +      +IN   AK+ R+ AD W A
Sbjct: 68  DLVAELHDAYFSAGADISETNTFSGTTIAQADYHLGEQDVWDINLEGAKIGRSVADRWNA 127

Query: 121 RTPEKPRYVAGVLGPTNRTASISPDVNDPAFRNITFDGLVAAYRESTKALVEGGADLILI 180
           + P++P+++AG +GP N   S+S DVNDP  R +TFD +  AYR+   AL +GG DL LI
Sbjct: 128 QNPDRPKFIAGSMGPLNVMLSMSSDVNDPGARKVTFDQVYEAYRQQVDALYQGGVDLFLI 187

Query: 181 ETVFDTLNAKAAVFAVKTEFEALGVELPIMISGTITDASGRTLSGQTTEAFYNSLRHAEA 240
           ET+ DTLN KAA+ A+    +    ELPI ISGTITD SGRTLSGQT EAF+NS++HA+ 
Sbjct: 188 ETITDTLNCKAAIKAILDWRDEGHEELPIWISGTITDRSGRTLSGQTAEAFWNSVKHAKP 247

Query: 241 LTFGLNCALGPDELRQYVQELSRIAECYVTAHPNAGLPNAFGEYDLDADTMAKQIREWAQ 300
              G NCALG D +R ++ E++RIA+  V A+PNAGLPNA G+YD +       + EWA+
Sbjct: 248 FAVGFNCALGADLMRPHIAEMARIADTLVAAYPNAGLPNAMGQYDEEPHETGHALHEWAK 307

Query: 301 AGFLNIVGGCCGTTPQHIAAMSRAVEGLAPRKLPEIPVACRLSGLEPLNI 350
            G +NI+GGCCGTTP HI  ++  V G+ PR++PE P A RL+GLEP  +
Sbjct: 308 DGLVNILGGCCGTTPDHIRHVADEVRGVTPRQIPERPKAMRLAGLEPFEL 357


Lambda     K      H
   0.318    0.134    0.391 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 1056
Number of extensions: 36
Number of successful extensions: 3
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 1227
Length of database: 358
Length adjustment: 38
Effective length of query: 1189
Effective length of database: 320
Effective search space:   380480
Effective search space used:   380480
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 54 (25.4 bits)
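The "Identities / Positives / Gaps" summary lines in the two BLAST-style reports above can be turned into exact fractions. The following is a minimal sketch, assuming the summary line keeps the layout shown in this page; the regular expression and the function name are illustrative, not part of any BLAST tool.

```python
import re

# Matches e.g. "Identities = 566/885 (63%), Positives = 667/885 (75%), Gaps = 17/885 (1%)"
SUMMARY = re.compile(
    r"Identities = (\d+)/(\d+) \(\d+%\), "
    r"Positives = (\d+)/(\d+) \(\d+%\), "
    r"Gaps = (\d+)/(\d+) \(\d+%\)"
)

def parse_summary(line):
    """Return identity, positives and gap fractions from a summary line."""
    m = SUMMARY.search(line)
    if m is None:
        raise ValueError("not an Identities/Positives/Gaps summary line")
    ident, n_ident, pos, n_pos, gaps, n_gaps = map(int, m.groups())
    return {
        "aligned_columns": n_ident,
        "identity": ident / n_ident,
        "positives": pos / n_pos,
        "gaps": gaps / n_gaps,
    }

hit_02221 = parse_summary(
    " Identities = 566/885 (63%), Positives = 667/885 (75%), Gaps = 17/885 (1%)"
)
hit_02222 = parse_summary(
    " Identities = 182/350 (52%), Positives = 243/350 (69%), Gaps = 1/350 (0%)"
)
print(f"{hit_02221['identity']:.4f} {hit_02222['identity']:.4f}")
```

Note that 566/885 is about 63.95%, while the report prints 63%, so the percentages in the report appear to be truncated rather than rounded.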

Align candidate CCNA_02221 CCNA_02221 (methionine synthase I metH)
to HMM TIGR02082 (metH: methionine synthase (EC 2.1.1.13))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.aa/TIGR02082.hmm
# target sequence database:        /tmp/gapView.22250.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR02082  [M=1182]
Accession:   TIGR02082
Description: metH: methionine synthase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                             Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                             -----------
          0 1216.9   0.1          0 1216.6   0.1    1.0  1  lcl|FitnessBrowser__Caulo:CCNA_02221  CCNA_02221 methionine synthase I


Domain annotation for each sequence (and alignments):
>> lcl|FitnessBrowser__Caulo:CCNA_02221  CCNA_02221 methionine synthase I metH
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 ! 1216.6   0.1         0         0     342    1182 .]       8     856 ..       3     856 .. 0.98

  Alignments for each domain:
  == domain 1  score: 1216.6 bits;  conditional E-value: 0
                             TIGR02082  342 sfvniGeRtnvaGskkfrklikaedyeealkiakqqveeGaqilDinvDevllDgeadmkkllsllasepd 412 
                                             fvniGeRtnv+Gs+kf+kli +++y eal++a+qqve Gaq++D+n+De+llD++++m+++l+l+a+epd
  lcl|FitnessBrowser__Caulo:CCNA_02221    8 VFVNIGERTNVTGSAKFKKLIVEGNYPEALSVARQQVEAGAQVIDVNMDEGLLDSQQAMVTFLNLMAAEPD 78  
                                            69********************************************************************* PP

                             TIGR02082  413 iakvPlmlDssefevleaGLkviqGkaivnsislkdGeerFlekaklikeyGaavvvmafDeeGqartadk 483 
                                            ia+vP+m+Dss++ev+eaGLk++qGkaivnsislk+Gee+Fle+a l  +yGaavvvmafDe Gqa+t ++
  lcl|FitnessBrowser__Caulo:CCNA_02221   79 IARVPVMIDSSKWEVIEAGLKCVQGKAIVNSISLKEGEEKFLEQATLCLRYGAAVVVMAFDEVGQADTEKR 149 
                                            *********************************************************************** PP

                             TIGR02082  484 kieiakRayklltekvgfppediifDpniltiatGieehdryaidfieaireikeelPdakisgGvsnvsF 554 
                                            k+ei++Ray++l++kvgfppediifDpni+++atGieehd+ya+dfiea+r+ik+ lP+a++sgGvsnvsF
  lcl|FitnessBrowser__Caulo:CCNA_02221  150 KVEICTRAYNTLVDKVGFPPEDIIFDPNIFAVATGIEEHDNYAVDFIEATRRIKQMLPYARVSGGVSNVSF 220 
                                            *********************************************************************** PP

                             TIGR02082  555 slrgndavRealhsvFLyeaikaGlDmgivnagklavyddidkelrevvedlildrr.....reatekLle 620 
                                            s+rgn++vR+a+hsvFLy+ai+aG+Dmgivnag l vyddid+ lre+ved+il+r        +te+L+e
  lcl|FitnessBrowser__Caulo:CCNA_02221  221 SFRGNEPVRRAIHSVFLYHAINAGMDMGIVNAGDLPVYDDIDPALREAVEDVILNRPqrdpvMTNTERLVE 291 
                                            *******************************************************9866666678****** PP

                             TIGR02082  621 laelykgtkeksskeaqeaewrnlpveeRLeralvkGeregieedleearkklkapleiiegpLldGmkvv 691 
                                            +a +ykg k +  ++ ++ ewr+  v+eRL++alv+G++e+ie+d+eear  +++pl++iegpL+dGm+vv
  lcl|FitnessBrowser__Caulo:CCNA_02221  292 MAPRYKGEKGQ--QQVANLEWRKGTVNERLTHALVHGITEFIEQDTEEARLAAERPLHVIEGPLMDGMNVV 360 
                                            *********99..667999**************************************************** PP

                             TIGR02082  692 GdLFGsGkmfLPqvvksarvmkkavayLePylekekeed..kskGkivlatvkGDvhDiGknivdvvLscn 760 
                                            GdLFG+GkmfLPqvvksarvmk+ava+L+P++e+eke +  k++Gk+++atvkGDvhDiGkniv+vvL+cn
  lcl|FitnessBrowser__Caulo:CCNA_02221  361 GDLFGAGKMFLPQVVKSARVMKQAVAWLMPFMEAEKEGQerKAAGKVLMATVKGDVHDIGKNIVGVVLQCN 431 
                                            *********************************99986555999*************************** PP

                             TIGR02082  761 gyevvdlGvkvPvekileaakkkkaDviglsGLivksldemvevaeemerrgvkiPlllGGaalskahvav 831 
                                            +yevvdlGv+vP+++il++akk+k D+iglsGLi++sldemv+va emer+g++iPll+GGa++s++h+av
  lcl|FitnessBrowser__Caulo:CCNA_02221  432 NYEVVDLGVMVPADRILDEAKKHKVDMIGLSGLITPSLDEMVFVAAEMERQGFDIPLLIGGATTSRTHTAV 502 
                                            *********************************************************************** PP

                             TIGR02082  832 kiaekYk.gevvyvkdaseavkvvdkllsekkkaeelekikeeyeeirekfgekkeklialsekaarkevf 901 
                                            ki+++Y+ g++ yv das+av vv+ llse +++  ++++++ey ++re++ + ++ +  +s+++ark  f
  lcl|FitnessBrowser__Caulo:CCNA_02221  503 KIEPAYRrGPTTYVVDASRAVGVVSGLLSEGERDRIIAETRAEYVKVREQYARGQTTKARASIQEARKRAF 573 
                                            ******6489************************************************************* PP

                             TIGR02082  902 aldrsedlevpapkflGtkvleasieellkyiDwkalFvqWelrgkypkilkdeleglearklfkdakell 972 
                                            a+d++  + +p+p f+Gt+v+e s++el+++iDw ++F +Wel g++p+il+d+++g+ a+ l++da+++l
  lcl|FitnessBrowser__Caulo:CCNA_02221  574 AIDWK-GYAPPKPAFIGTRVFEPSLAELVPFIDWSPFFASWELIGRFPQILEDDVVGQAATDLYRDARAML 643 
                                            *****.9**************************************************************** PP

                             TIGR02082  973 dklsaekllrargvvGlfPaqsvgddieiytdetvsqetkpiatvrekleqlrqqsdr...ylclaDfias 1040
                                            dk+++ek   a+gv+G++Paq +gddi++ytdet+  e + + t+r+++++   +s++   +++l+Df+a+
  lcl|FitnessBrowser__Caulo:CCNA_02221  644 DKVVEEKWFGAKGVIGFWPAQAQGDDIVLYTDETRVAEFSRLHTLRQQMDKGADKSGEakaNVALSDFVAP 714 
                                            ***********************************99********99999888888878899********* PP

                             TIGR02082 1041 kesGikDylgallvtaglgaeelakkleakeddydsilvkaladrlaealaellhervRkelwgyaeeenl 1111
                                               G +Dy+g ++vtag g++e+  k++a  ddy++i+  aladrlaea+ae+lh + R elwgya++e+ 
  lcl|FitnessBrowser__Caulo:CCNA_02221  715 IGQG-ADYVGGFAVTAGHGEDEIVAKFKAAGDDYNAIMASALADRLAEAFAEWLHYKARVELWGYAADEDA 784 
                                            6666.7***************************************************************** PP

                             TIGR02082 1112 dkedllkerYrGirpafGYpacPdhtekatlleLleaer.iGlklteslalaPeasvsglyfahpeakYfa 1181
                                            d+e l+ e+Y+Girpa+GYpa+Pdhtek tl++Ll+ae  +Gl+ltes+a++P a+vsgl+f+h +a+Yf 
  lcl|FitnessBrowser__Caulo:CCNA_02221  785 DVERLIAEKYQGIRPAPGYPAQPDHTEKGTLFKLLDAEAaTGLQLTESYAMTPGAAVSGLFFSHRQAHYFG 855 
                                            ***************************************9******************************8 PP

                             TIGR02082 1182 v 1182
                                            v
  lcl|FitnessBrowser__Caulo:CCNA_02221  856 V 856 
                                            6 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (1182 nodes)
Target sequences:                          1  (899 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.09u 0.04s 00:00:00.13 Elapsed: 00:00:00.11
# Mc/sec: 8.86
//
[ok]
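The domain table in the hmmsearch report above gives the matched range on the model (hmmfrom, hmm to) and on the protein (alifrom, ali to). The sketch below shows one way to derive coverage from that row. It assumes the whitespace-separated column order shown in this report, and the function names are hypothetical.

```python
def parse_domain_row(row):
    """Extract score and coordinates from one hmmsearch domain-table row."""
    f = row.split()
    # Column order as printed above:
    # 0:#  1:!/?  2:score  3:bias  4:c-Evalue  5:i-Evalue
    # 6:hmmfrom  7:hmmto  8:bounds  9:alifrom  10:alito  11:bounds ...
    return {
        "score": float(f[2]),
        "hmm_from": int(f[6]),
        "hmm_to": int(f[7]),
        "ali_from": int(f[9]),
        "ali_to": int(f[10]),
    }

def span_coverage(start, end, total_len):
    """Fraction of a sequence or model covered by an inclusive range."""
    return (end - start + 1) / total_len

row = ("   1 ! 1216.6   0.1         0         0     342    1182 .]"
       "       8     856 ..       3     856 .. 0.98")
dom = parse_domain_row(row)

# Model length (M=1182) and protein length (899) are taken from the report.
hmm_cov = span_coverage(dom["hmm_from"], dom["hmm_to"], 1182)
seq_cov = span_coverage(dom["ali_from"], dom["ali_to"], 899)
print(f"model coverage {hmm_cov:.3f}, protein coverage {seq_cov:.3f}")
```

For this candidate, the match covers model positions 342 to 1182, roughly 71% of TIGR02082, which is consistent with the BLAST alignment starting at query position 355.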

This GapMind analysis is from Aug 03 2021. The underlying query database was built on Aug 03 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.
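The metH alignments on this page illustrate a split hit: CCNA_02222 aligns to positions 2 to 350 of the characterized protein P13009 and CCNA_02221 aligns to positions 355 to 1225. The sketch below computes the combined coverage of the 1227-residue query from those two ranges. It is only an illustration of the arithmetic; how GapMind itself combines split hits is not described here, and the function name is hypothetical.

```python
def covered_positions(intervals):
    """Count query positions covered by a set of inclusive (start, end) ranges,
    merging any ranges that overlap."""
    total = 0
    cur_start, cur_end = None, None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end:
            # Disjoint from the current run: close it and start a new one.
            if cur_end is not None:
                total += cur_end - cur_start + 1
            cur_start, cur_end = start, end
        else:
            # Overlaps the current run: extend it.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start + 1
    return total

query_len = 1227
hits = [(355, 1225),  # CCNA_02221
        (2, 350)]     # CCNA_02222

combined = covered_positions(hits) / query_len
single = covered_positions(hits[:1]) / query_len
print(f"CCNA_02221 alone: {single:.3f}, both proteins: {combined:.3f}")
```

Taken alone, CCNA_02221 covers about 71% of the query, while the two proteins together cover over 99% of it.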

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).

For more information, see the 2019 paper on GapMind for amino acid biosynthesis, the 2022 paper on GapMind for carbon sources, the source code, or the changes to Amino acid biosynthesis since the publication.

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory