GapMind for Amino acid biosynthesis

 

Alignments for a candidate for metH in Thermithiobacillus tepidarius DSM 3134

Align cobalamin-dependent methionine synthase (EC 2.1.1.13) (characterized)
to candidate WP_081662708.1 G579_RS16870 methionine synthase

Query= metacyc::G18NG-11090-MONOMER
         (1221 letters)



>NCBI__GCF_000423825.1:WP_081662708.1
          Length = 849

 Score =  863 bits (2230), Expect = 0.0
 Identities = 462/873 (52%), Positives = 604/873 (69%), Gaps = 35/873 (4%)

Query: 16  SEFLDALANHVLIGDGAMGTQLQGFDLDVEKDFLDLEGCNEILNDTRPDVLRQIHRAYFE 75
           S F++ L   VLI DG MGT L  FDLD++KD+  LE C+E+L  +RPDV+  +H+++F+
Sbjct: 2   SAFMERLRERVLIADGGMGTSLHTFDLDLDKDYWGLENCSEVLVLSRPDVVAAVHKSFFD 61

Query: 76  AGADLVETNTFGCNLPNLADYDIADRCRELAYKGTAVAREVADEMGPGRNGMRRFVVGSL 135
           AG+D VET+TFG N   LA++ +++R  E+  K   +AR VAD +        RFV+GS+
Sbjct: 62  AGSDCVETDTFGANKVVLAEFGLSERTFEINEKAAQIARGVADALSTPE--WPRFVIGSI 119

Query: 136 GPGTKLPSLGHAPYADLRGHYKEAALGIIDGGGDAFLIETAQDLLQVKAAVHGVQDAMAE 195
           GPGTKLPSLGH  Y  L   Y E A G+I GG D  LIETAQD+LQVKAAV+G + A AE
Sbjct: 120 GPGTKLPSLGHTSYDVLEDSYAEQARGLIAGGVDLLLIETAQDILQVKAAVNGCKIARAE 179

Query: 196 LDTFLPIICHVTVETTGTMLMGSEIGAALTALQPLGIDMIGLNCATGPDEMSEHLRYLSK 255
               +PI   VT+ETTGTML+G++I AA TA+  LG+D +G+NCATGP EMSEH+R+L +
Sbjct: 180 AGMDVPIFAQVTIETTGTMLVGTDIAAAATAIHALGVDGMGMNCATGPAEMSEHVRWLGE 239

Query: 256 HADIPVSVMPNAGLPVLGKNGAEYPLEAEDLAQALAGFVSEYGLSMVGGCCGTTPEHIRA 315
           +    +SVMPNAGLP+L +    YPL   +LA  +  +V E G+++VGGCCGTTPEHIRA
Sbjct: 240 NWPHLISVMPNAGLPMLVEGQTVYPLGPRELADWMLRYVEEDGVNLVGGCCGTTPEHIRA 299

Query: 316 VRDAVVGVPEQETSTLTKIPAGPVEQASREVEKEDSVASLYTSVPLSQETGISMIGERTN 375
           +R+A +G   +                +R  E   +V+SLY+ VPL QE  +  +GER N
Sbjct: 300 LREA-IGFDRR--------------PRARSPEWVPAVSSLYSQVPLRQENAVLAVGERAN 344

Query: 376 SNGSKAFREAMLSGDWEKCVDIAKQQTRDGAHMLDLCVDYVGRDGTADMATLAALLATSS 435
           +NGSK FRE + + DW+  V +A+ Q ++G+H+LD+C  YVGR   ADM  +        
Sbjct: 345 ANGSKKFRELLAAEDWDAMVGVARDQVKEGSHVLDVCTAYVGRPEVADMQEVVNRYRGQV 404

Query: 436 TLPIMIDSTEPEVIRTGLEHLGGRSIVNSVNFEDGDGPESRYQRIMKLVKQHGAAVVALT 495
           T+P+MIDSTE  V+   L+ LGG+SI+NS+NFEDG   E + +R++   +++GAAVVALT
Sbjct: 405 TVPLMIDSTEVPVLEAALKLLGGKSIINSINFEDG---EEKAERVLGFARKYGAAVVALT 461

Query: 496 IDEEGQARTAEHKVRIAKRLIDDITGSYGLDIKDIVVDCLTFPISTGQEETRRDGIETIE 555
           IDEEG A+  E K+ IA RL D     YGL   D++ D LTF I TG EE RR GI T+E
Sbjct: 462 IDEEGMAKEVEQKLAIAHRLYDFAVKRYGLPASDLIYDPLTFTICTGVEEDRRHGINTLE 521

Query: 556 AIRELKKLYPEIHTTLGLSNISFGLNPAARQVLNSVFLNECIEAGLDSAIAHSSKILPMN 615
           AI  +++  PE    LGLSNISFGL PAAR VLNSVFL+   + GL +AI H + I P++
Sbjct: 522 AIERIRRELPECQIMLGLSNISFGLKPAARHVLNSVFLHHAQKRGLTAAIIHVAAIKPLH 581

Query: 616 RIDDRQREVALDMVYDRRTEDYDPLQEFMQLFEGVSAADAKDARAEQLAAMPLFERLAQR 675
           +I     E A D+++DRR + +DPL  F++LF+ V+ A AK A     A   + ERL QR
Sbjct: 582 QIPPEHVEAAEDLIFDRRDKGFDPLLRFVELFKDVTVASAKKA-----APATVEERLTQR 636

Query: 676 IIDGDKNGLEDDLEAGMKEKSPIAIINEDLLNGMKTVGELFGSGQMQLPFVLQSAETMKT 735
           I+DGDK GLEDDL+A +++  P+ IIN  LL+GMK VGELFGSGQMQLPFVLQSAETMK 
Sbjct: 637 IVDGDKQGLEDDLKAALEQYPPLEIINTFLLDGMKVVGELFGSGQMQLPFVLQSAETMKA 696

Query: 736 AVAYLEPFMEEEAEATGSAQAEGKGKIVVATVKGDVHDIGKNLVDIILSNNGYDVVNLGI 795
           AVA+LEPFME+        + E KG +V+ATVKGDVHDIGKNLVDIIL+NNGY VVNLGI
Sbjct: 697 AVAFLEPFMEK-------VEGEEKGILVLATVKGDVHDIGKNLVDIILTNNGYKVVNLGI 749

Query: 796 KQPLSAMLEAAEEHKADVIGMSGLLVKSTVVMKENLEEMNNAGASNYPVILGGAALTRTY 855
           KQP+ ++++AA +  A  IGMSGLLVKSTV+MKENLEEM   G  + PV+LGGAALTR +
Sbjct: 750 KQPIDSIVQAACDCGAHAIGMSGLLVKSTVIMKENLEEMRRRGL-DIPVLLGGAALTRRF 808

Query: 856 VENDLNEVY--TGEVYYARDAFEGLRLMDEVMA 886
           VEND    Y     V+YA+DAFEGL+LM+++MA
Sbjct: 809 VENDCRAAYGHPERVHYAKDAFEGLKLMEQIMA 841


Lambda     K      H
   0.316    0.135    0.386 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 2436
Number of extensions: 112
Number of successful extensions: 9
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 1221
Length of database: 849
Length adjustment: 45
Effective length of query: 1176
Effective length of database: 804
Effective search space:   945504
Effective search space used:   945504
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.6 bits)
S2: 57 (26.6 bits)
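
The statistics above can be cross-checked by hand. The following is a minimal sketch, assuming the standard Karlin-Altschul relations (effective length = length minus the length adjustment, and bit score = (lambda * S - ln K) / ln 2). The function names are illustrative and are not part of BLAST or GapMind.

```python
import math

def effective_search_space(query_len, db_len, length_adjustment, num_seqs=1):
    """Product of the effective query and database lengths."""
    eff_query = query_len - length_adjustment
    eff_db = db_len - num_seqs * length_adjustment
    return eff_query * eff_db

def bit_score(raw_score, lam, k):
    """Convert a raw alignment score to bits using lambda and K."""
    return (lam * raw_score - math.log(k)) / math.log(2)

# Values copied from the report above (query 1221, database 849, adjustment 45).
print(effective_search_space(1221, 849, 45))  # 945504, as reported

# Gapped parameters (lambda = 0.267, K = 0.0410) apply to the final score and S2;
# ungapped parameters (lambda = 0.316, K = 0.135) apply to S1.
print(bit_score(2230, 0.267, 0.0410))  # close to the reported 863 bits
print(bit_score(57, 0.267, 0.0410))    # close to the reported 26.6 bits (S2)
print(bit_score(41, 0.316, 0.135))     # close to the reported 21.6 bits (S1)
```

Small differences from the printed values are expected, because the report shows lambda and K rounded to three significant figures.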

Align cobalamin-dependent methionine synthase (EC 2.1.1.13) (characterized)
to candidate WP_081662709.1 G579_RS18665 hypothetical protein

Query= metacyc::G18NG-11090-MONOMER
         (1221 letters)



>NCBI__GCF_000423825.1:WP_081662709.1
          Length = 289

 Score =  217 bits (552), Expect = 1e-60
 Identities = 122/288 (42%), Positives = 169/288 (58%), Gaps = 14/288 (4%)

Query: 935  ERSD-VSTDTPTAAPPFWGTRIVKGLPLAEFLGNLDERALFMGQWGLKSTRGNEGPSYED 993
            E SD V  D P   PPFWG R+++ + L   +  ++   L+  QWG K+   +    Y+ 
Sbjct: 15   ESSDPVRRDNPIPIPPFWGARVIEHVSLRAIVPYINRNTLYKFQWGFKAPEMSP-QEYKA 73

Query: 994  LVETEGRPRLRYWLDRLKSEGILDHVALVYGYFPAVAEGDDVVILESPDPHAAERMRFSF 1053
               TE  P     +   +++ IL   A VYGYFPA +EGDD+++   P+    ER RF+F
Sbjct: 74   WARTEVDPIFNRLVAASEAQSILQPKA-VYGYFPAQSEGDDLIVYTDPESRQ-ERARFTF 131

Query: 1054 PRQQRGRFLCIADFIRPREQAVKDGQVDVMPFQLVTMGNPIADFANELFAANEYREYLEV 1113
            PRQ+  R  CIADF RP    V  G++DV+ FQLVT+G   AD A ELF  ++Y+EYL  
Sbjct: 132  PRQKTARRRCIADFFRP----VDSGEMDVVAFQLVTVGQHAADHARELFHGDQYQEYLYW 187

Query: 1114 HGIGVQLTEALAEYWHSRVRSELKLNDGGSVADFDPEDKTKFFDLDYRGARFSFGYGSCP 1173
            HG+  +  E LAEY H ++R+EL        A  D  D       +YRG+R+SFGY +CP
Sbjct: 188  HGLNAEGAEGLAEYIHKQIRAEL------GFAREDARDIQAMIKQEYRGSRYSFGYPACP 241

Query: 1174 DLEDRAKLVELLEPGRIGVELSEELQLHPEQSTDAFVLYHPEAKYFNV 1221
            +L D+ K+++LL   RIGV + +E QL PE ST A V +HP+AKYF V
Sbjct: 242  NLHDQHKILDLLGAERIGVVMGDEDQLWPEDSTSAIVAHHPQAKYFGV 289


Lambda     K      H
   0.316    0.135    0.386 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 831
Number of extensions: 40
Number of successful extensions: 5
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 2
Number of HSP's successfully gapped: 1
Length of query: 1221
Length of database: 289
Length adjustment: 36
Effective length of query: 1185
Effective length of database: 253
Effective search space:   299805
Effective search space used:   299805
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.6 bits)
S2: 53 (25.0 bits)
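
The Expect value for this hit can be reproduced approximately from the numbers above. This is a sketch only, assuming the usual relation E = (effective search space) * 2^(-bit score); the helper names are illustrative, and the rounded lambda and K in the report limit the precision.

```python
import math

def bit_score(raw_score, lam, k):
    """Convert a raw alignment score to bits using lambda and K."""
    return (lam * raw_score - math.log(k)) / math.log(2)

def expect_value(search_space, bits):
    """Expected number of chance hits with at least this bit score."""
    return search_space * 2.0 ** (-bits)

# Effective lengths from the report: 1185 (query) and 253 (database).
search_space = 1185 * 253          # 299805, as reported
bits = bit_score(552, 0.267, 0.0410)
e_value = expect_value(search_space, bits)

# Should be on the order of the reported 1e-60.
print(f"bits = {bits:.1f}, E = {e_value:.1e}")
```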

Align candidate WP_081662708.1 G579_RS16870 (methionine synthase)
to HMM TIGR02082 (metH: methionine synthase (EC 2.1.1.13))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.aa/TIGR02082.hmm
# target sequence database:        /tmp/gapView.3374.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR02082  [M=1182]
Accession:   TIGR02082
Description: metH: methionine synthase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                                 Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                                 -----------
          0 1072.6   0.0          0 1072.4   0.0    1.0  1  lcl|NCBI__GCF_000423825.1:WP_081662708.1  G579_RS16870 methionine synthase


Domain annotation for each sequence (and alignments):
>> lcl|NCBI__GCF_000423825.1:WP_081662708.1  G579_RS16870 methionine synthase
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 ! 1072.4   0.0         0         0       2     859 ..       9     841 ..       8     848 .. 0.98

  Alignments for each domain:
  == domain 1  score: 1072.4 bits;  conditional E-value: 0
                                 TIGR02082   2 nkrilvlDGamGtqlqsanLteadFrgeeadlarelkGnndlLnltkPeviaaihrayfeaGaDivetn 70 
                                               ++r+l+ DG+mGt l +++L+ ++  +        l+ + ++L+l++P+v+aa+h+++f+aG+D vet+
  lcl|NCBI__GCF_000423825.1:WP_081662708.1   9 RERVLIADGGMGTSLHTFDLDLDKDYW-------GLENCSEVLVLSRPDVVAAVHKSFFDAGSDCVETD 70 
                                               79*******************986666.......39********************************* PP

                                 TIGR02082  71 tFnsteialadYdledkayelnkkaaklarevadeftltpekkRfvaGslGPtnklatlspdverpefr 139
                                               tF++++++la+++l +++ e+n+kaa++ar vad ++ tpe +Rfv+Gs+GP++kl++l+         
  lcl|NCBI__GCF_000423825.1:WP_081662708.1  71 TFGANKVVLAEFGLSERTFEINEKAAQIARGVADALS-TPEWPRFVIGSIGPGTKLPSLG--------- 129
                                               *************************************.**********************......... PP

                                 TIGR02082 140 nvtydelvdaYkeqvkglldGGvDllLietvfDtlnakaalfaveevfeekgrelPilisgvivdksGr 208
                                               ++ yd l d+Y eq++gl+ GGvDllLiet +D+l++kaa+++ + + +e+g ++Pi+++ v+++++G+
  lcl|NCBI__GCF_000423825.1:WP_081662708.1 130 HTSYDVLEDSYAEQARGLIAGGVDLLLIETAQDILQVKAAVNGCKIARAEAGMDVPIFAQ-VTIETTGT 197
                                               ************************************************************.******** PP

                                 TIGR02082 209 tLsGqtleaflaslehaeililGLnCalGadelrefvkelsetaealvsviPnaGLPnalg...eYdlt 274
                                               +L+G++++a++++++  +++ +G+nCa+G++e++e+v+ l+e+ + l+sv+PnaGLP  +     Y+l 
  lcl|NCBI__GCF_000423825.1:WP_081662708.1 198 MLVGTDIAAAATAIHALGVDGMGMNCATGPAEMSEHVRWLGENWPHLISVMPNAGLPMLVEgqtVYPLG 266
                                               *********************************************************99977889**** PP

                                 TIGR02082 275 peelakalkefaeegllnivGGCCGttPehiraiaeavk.dikprkrqeleeksvlsglealkiaqess 342
                                               p ela  +  ++ee ++n+vGGCCGttPehira+ ea+  d +pr r     + v+s+++++++ qe+ 
  lcl|NCBI__GCF_000423825.1:WP_081662708.1 267 PRELADWMLRYVEEDGVNLVGGCCGTTPEHIRALREAIGfDRRPRARSPEWVPAVSSLYSQVPLRQENA 335
                                               ***********************************9985489999999999****************** PP

                                 TIGR02082 343 fvniGeRtnvaGskkfrklikaedyeealkiakqqveeGaqilDinvDevllDgeadmkkllsllasep 411
                                               ++ +GeR n++Gskkfr+l+ aed+++++ +a++qv+eG+++lD++  +v++  +adm++++++  ++ 
  lcl|NCBI__GCF_000423825.1:WP_081662708.1 336 VLAVGERANANGSKKFRELLAAEDWDAMVGVARDQVKEGSHVLDVCTAYVGRPEVADMQEVVNRYRGQ- 403
                                               *******************************************************************9. PP

                                 TIGR02082 412 diakvPlmlDssefevleaGLkviqGkaivnsislkdGeerFlekaklikeyGaavvvmafDeeGqart 480
                                                + +vPlm+Ds+e+ vlea Lk ++Gk+i+nsi+++dGee+  + + ++++yGaavv++++DeeG+a++
  lcl|NCBI__GCF_000423825.1:WP_081662708.1 404 -V-TVPLMIDSTEVPVLEAALKLLGGKSIINSINFEDGEEKAERVLGFARKYGAAVVALTIDEEGMAKE 470
                                               .6.9***************************************************************** PP

                                 TIGR02082 481 adkkieiakRayklltekvgfppediifDpniltiatGieehdryaidfieaireikeelPdakisgGv 549
                                                ++k+ ia+R+y+ +++++g+p++d+i+Dp+++ti tG+ee++r++i+++eai++i++elP+++i +G+
  lcl|NCBI__GCF_000423825.1:WP_081662708.1 471 VEQKLAIAHRLYDFAVKRYGLPASDLIYDPLTFTICTGVEEDRRHGINTLEAIERIRRELPECQIMLGL 539
                                               ********************************************************************* PP

                                 TIGR02082 550 snvsFslrgndavRealhsvFLyeaikaGlDmgivnagklavyddidkelrevvedlildrrreatekL 618
                                               sn+sF+l+  +a+R++l+svFL++a+k Gl ++i++ + ++++++i++e  e++edli+drr++  ++L
  lcl|NCBI__GCF_000423825.1:WP_081662708.1 540 SNISFGLK--PAARHVLNSVFLHHAQKRGLTAAIIHVAAIKPLHQIPPEHVEAAEDLIFDRRDKGFDPL 606
                                               ********..*********************************************************** PP

                                 TIGR02082 619 lelaelykgtkeksskeaqeaewrnlpveeRLeralvkGeregieedleearkklkapleiiegpLldG 687
                                               l+++el+k+++ +s+k     + +   veeRL++++v+G ++g+e+dl++a+ ++++pleii++ LldG
  lcl|NCBI__GCF_000423825.1:WP_081662708.1 607 LRFVELFKDVTVASAK-----KAAPATVEERLTQRIVDGDKQGLEDDLKAAL-EQYPPLEIINTFLLDG 669
                                               ***********99555.....567789*************************.999************* PP

                                 TIGR02082 688 mkvvGdLFGsGkmfLPqvvksarvmkkavayLePylekekeedkskGkivlatvkGDvhDiGknivdvv 756
                                               mkvvG+LFGsG+m+LP+v++sa++mk+ava+LeP++ek +   + kG +vlatvkGDvhDiGkn+vd++
  lcl|NCBI__GCF_000423825.1:WP_081662708.1 670 MKVVGELFGSGQMQLPFVLQSAETMKAAVAFLEPFMEKVE--GEEKGILVLATVKGDVHDIGKNLVDII 736
                                               ************************************9876..8899*********************** PP

                                 TIGR02082 757 LscngyevvdlGvkvPvekileaakkkkaDviglsGLivksldemvevaeemerrgvkiPlllGGaals 825
                                               L++ngy+vv+lG+k+P++ i++aa +  a  ig+sGL+vks++ m+e++eem+rrg++iP+llGGaal+
  lcl|NCBI__GCF_000423825.1:WP_081662708.1 737 LTNNGYKVVNLGIKQPIDSIVQAACDCGAHAIGMSGLLVKSTVIMKENLEEMRRRGLDIPVLLGGAALT 805
                                               ********************************************************************* PP

                                 TIGR02082 826 kahvavkiaekYkg..evvyvkdaseavkvvdklls 859
                                               + +v++++  +Y    +v+y+kda+e++k++++++ 
  lcl|NCBI__GCF_000423825.1:WP_081662708.1 806 RRFVENDCRAAYGHpeRVHYAKDAFEGLKLMEQIMA 841
                                               ************75226*****************97 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (1182 nodes)
Target sequences:                          1  (849 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.07u 0.02s 00:00:00.09 Elapsed: 00:00:00.09
# Mc/sec: 11.11
//
[ok]
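
To see how much of the TIGR02082 model and of the candidate protein the domain hit spans, the coordinates in the domain table can be turned into coverage fractions. The sketch below assumes the column order shown in the header of that table (hmmfrom, hmm to, alifrom, ali to); `parse_domain_row` is a hypothetical helper, not part of HMMER.

```python
def parse_domain_row(line):
    """Extract HMM and sequence coordinates from one hmmsearch domain row."""
    fields = line.split()
    # Column positions after whitespace splitting:
    # 0:#  1:!  2:score  3:bias  4:c-Evalue  5:i-Evalue
    # 6:hmmfrom  7:hmm to  8:..  9:alifrom  10:ali to  ...
    return {
        "score": float(fields[2]),
        "hmm_from": int(fields[6]),
        "hmm_to": int(fields[7]),
        "ali_from": int(fields[9]),
        "ali_to": int(fields[10]),
    }

row = ("   1 ! 1072.4   0.0         0         0       2     859 .."
       "       9     841 ..       8     848 .. 0.98")
dom = parse_domain_row(row)

model_len = 1182   # M=1182 from the Query line
seq_len = 849      # residues searched

hmm_cov = (dom["hmm_to"] - dom["hmm_from"] + 1) / model_len
seq_cov = (dom["ali_to"] - dom["ali_from"] + 1) / seq_len
print(f"model coverage {hmm_cov:.1%}, sequence coverage {seq_cov:.1%}")
```

With these coordinates, the hit spans nearly all of the 849-residue candidate but only model positions 2 to 859, so the C-terminal part of the 1182-node model is not matched by this protein.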

This GapMind analysis is from Apr 10 2024. The underlying query database was built on Apr 09 2024.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.
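
The two alignments on this page illustrate a split hit: WP_081662708.1 aligns to query positions 16 to 886 and WP_081662709.1 to positions 935 to 1221 of the 1221-residue query. The following sketch computes the combined query coverage by merging intervals. How GapMind itself scores split hits is not shown on this page, so this is only an illustration, and `combined_coverage` is a hypothetical helper.

```python
def combined_coverage(intervals, query_len):
    """Fraction of the query covered by the union of 1-based inclusive intervals."""
    covered = 0
    current_start, current_end = None, None
    for start, end in sorted(intervals):
        if current_end is None or start > current_end + 1:
            # Disjoint from the running interval: bank it and start a new one.
            if current_end is not None:
                covered += current_end - current_start + 1
            current_start, current_end = start, end
        else:
            # Overlapping or adjacent: extend the running interval.
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start + 1
    return covered / query_len

# Query ranges taken from the two BLAST alignments above.
hits = [(16, 886), (935, 1221)]
print(f"{combined_coverage(hits, 1221):.1%}")
```

Taken separately, the first hit covers about 71% of the query and the second about 24%; together they cover about 95%.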

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory