GapMind for Amino acid biosynthesis

 

Alignments for a candidate for metH in Nocardiopsis lucentensis DSM 44048

Align cobalamin-dependent methionine synthase (EC 2.1.1.13) (characterized)
to candidate WP_083924073.1 D471_RS0104310 methionine synthase

Query= metacyc::G18NG-11090-MONOMER
         (1221 letters)



>NCBI__GCF_000341125.1:WP_083924073.1
          Length = 1157

 Score = 1442 bits (3734), Expect = 0.0
 Identities = 745/1205 (61%), Positives = 912/1205 (75%), Gaps = 55/1205 (4%)

Query: 18   FLDALANHVLIGDGAMGTQLQGFDLDVEKDFLDLEGCNEILNDTRPDVLRQIHRAYFEAG 77
            F +AL+  V++ DGAMGT LQ  DLD+++ F   EGCN+ILN TRPD++   H A+   G
Sbjct: 7    FREALSQRVIVADGAMGTMLQAHDLDLDQ-FEGHEGCNDILNITRPDIVHDTHAAFLAVG 65

Query: 78   ADLVETNTFGCNLPNLADYDIADRCRELAYKGTAVAREVADEMGPGRNGMRRFVVGSLGP 137
            +D VETNTF  N   LA+Y I DR  E+A  G  VARE AD      +   R+V+GS+GP
Sbjct: 66   SDCVETNTFSANYGGLAEYGIEDRAYEIAEAGARVAREAADAYSTPDHP--RYVLGSVGP 123

Query: 138  GTKLPSLGHAPYADLRGHYKEAALGIIDGGGDAFLIETAQDLLQVKAAVHGVQDAMAELD 197
            GTKLPSLGHAPYA LR HY++   G+IDGG DA LIET QDLLQVKAAV G Q A   L 
Sbjct: 124  GTKLPSLGHAPYALLRDHYEQCHRGLIDGGADAILIETCQDLLQVKAAVVGAQRARKALG 183

Query: 198  TFLPIICHVTVETTGTMLMGSEIGAALTALQPLGIDMIGLNCATGPDEMSEHLRYLSKHA 257
              +PII  VT+ETTGTMLMGSEIGAALT+L PLGID+IGLNCATGP EMSEHLRYLS H+
Sbjct: 184  RDVPIIAQVTIETTGTMLMGSEIGAALTSLAPLGIDVIGLNCATGPAEMSEHLRYLSHHS 243

Query: 258  DIPVSVMPNAGLPVLGKNGAEYPLEAEDLAQALAGFVSEYGLSMVGGCCGTTPEHIRAVR 317
             IP+S MPNAGLP LG +GA YPL   +LA A   F SE+GLS+VGGCCGTTPEH+R V 
Sbjct: 244  PIPISCMPNAGLPQLGADGAVYPLTPAELADAHDTFTSEFGLSVVGGCCGTTPEHLRQVV 303

Query: 318  DAVVGVPEQETSTLTKIPAGPVEQASREVEKEDSVASLYTSVPLSQETGISMIGERTNSN 377
            + V G   ++   L                 E + +SLY SVP  Q+     +GERTN+N
Sbjct: 304  ERVQGRGIKDRKPLV----------------EAAASSLYQSVPFRQDASYLAVGERTNAN 347

Query: 378  GSKAFREAMLSGDWEKCVDIAKQQTRDGAHMLDLCVDYVGRDGTADMATLAALLATSSTL 437
            GSK FR AML   W+ CV++A+ Q RDGAH+LDL +DYVGRDG  DM  LA+  ATSSTL
Sbjct: 348  GSKKFRTAMLEERWDDCVEMARDQIRDGAHLLDLNIDYVGRDGVRDMRELASRFATSSTL 407

Query: 438  PIMIDSTEPEVIRTGLEHLGGRSIVNSVNFEDGDGPESRYQRIMKLVKQHGAAVVALTID 497
            PIM+DSTEP V+  GLE LGGR+++NSVN+EDGDGP+SR+ RIM+LV +HGAAVV L ID
Sbjct: 408  PIMLDSTEPAVLEAGLEALGGRAVINSVNYEDGDGPDSRFTRIMELVSEHGAAVVGLCID 467

Query: 498  EEGQARTAEHKVRIAKRLIDDITGSYGLDIKDIVVDCLTFPISTGQEETRRDGIETIEAI 557
            EEGQARTAE K+R+A RLI+ ITG +GL   DIV+DCLTFPI+TGQEETRRDGIET++AI
Sbjct: 468  EEGQARTAEWKLRVATRLIEQITGEWGLRTGDIVIDCLTFPITTGQEETRRDGIETLDAI 527

Query: 558  RELKKLYPEIHTTLGLSNISFGLNPAARQVLNSVFLNECIEAGLDSAIAHSSKILPMNRI 617
            RELK+ YP++ TTLGLSN+SFG+NPAAR VLNSVFL+E +EAGLDSAI H+SKI+P+N+I
Sbjct: 528  RELKRRYPDVQTTLGLSNLSFGVNPAARIVLNSVFLHEAVEAGLDSAIVHASKIVPINQI 587

Query: 618  DDRQREVALDMVYDRRTEDYDPLQEFMQLFEGVSAADAKDARAEQLAAMPLFERLAQRII 677
             D QREVALD+VYDRR +DYDPL  F++LFEGV A   + +RAE+LAA+PL+ERL +RI+
Sbjct: 588  PDEQREVALDLVYDRRADDYDPLSRFIELFEGVDAKSMRASRAEELAALPLWERLERRIV 647

Query: 678  DGDKNGLEDDLEAGMKEKSPIAIINEDLLNGMKTVGELFGSGQMQLPFVLQSAETMKTAV 737
            DG+  G+E DLEA + E+  +AI+N+ LL GMKTVGELFGSGQMQLPFVL+SAE MK AV
Sbjct: 648  DGEMTGIEADLEAALAERPALAIVNDTLLEGMKTVGELFGSGQMQLPFVLKSAEVMKGAV 707

Query: 738  AYLEPFMEEEAEATGSAQAEGKGKIVVATVKGDVHDIGKNLVDIILSNNGYDVVNLGIKQ 797
            AYLEP ME+  +       +GKG+IV+ATVKGDVHDIGKNLVDIILSNNGYDVVN+GIKQ
Sbjct: 708  AYLEPHMEKTDD-------DGKGRIVLATVKGDVHDIGKNLVDIILSNNGYDVVNIGIKQ 760

Query: 798  PLSAMLEAAEEHKADVIGMSGLLVKSTVVMKENLEEMNNAGAS-NYPVILGGAALTRTYV 856
            P+SA+LEAAE+ +ADVIGMSGLLVKSTV+MKENLEEMN+ G S  +PV+LGGAALTR+YV
Sbjct: 761  PVSAILEAAEKERADVIGMSGLLVKSTVIMKENLEEMNSRGLSERFPVLLGGAALTRSYV 820

Query: 857  ENDLNEVYTGEVYYARDAFEGLRLMDEVMAEKRGEGLDPNSPEAIEQAKKKAERKARNER 916
            E DL E++ G+V YA+DAFEGLRLMD  MA KRGE          E A+  A R+ R  R
Sbjct: 821  EQDLAEMFEGQVRYAKDAFEGLRLMDSFMAVKRGE----------EGAELPALRQRRVRR 870

Query: 917  SRKIAAERKANAAPVIVPERSDVSTDTPTAAPPFWGTRIVKGLPLAEFLGNLDERALFMG 976
               +         P  +P RSDV+TD     PPFWG RI KG+PLA++   LDERA FMG
Sbjct: 871  GATLKV-----TEPEDMPARSDVATDNRVPVPPFWGDRISKGIPLADYSAFLDERATFMG 925

Query: 977  QWGLKSTRGNEGPSYEDLVETEGRPRLRYWLDRLKSEGILDHVALVYGYFPAVAEGDDVV 1036
            QWGLK++RG  GPSYE+LVETEGRPR+R WLDR++++G+L+  A+V+G+FP  +EGDD+V
Sbjct: 926  QWGLKASRGGNGPSYEELVETEGRPRMRMWLDRIQTDGLLE-AAVVHGHFPCYSEGDDLV 984

Query: 1037 ILESPDPHAAERMRFSFPRQQRGRFLCIADFIRPREQAVKDGQVDVMPFQLVTMGNPIAD 1096
            +L+  +    ER RF+FPRQ+R R LC+AD+ RP+E     G++DV+ FQ+VT+G+ I+ 
Sbjct: 985  VLD--EDGVTERTRFTFPRQRRDRHLCLADYFRPKE----SGELDVVSFQVVTVGSAISR 1038

Query: 1097 FANELFAANEYREYLEVHGIGVQLTEALAEYWHSRVRSELKLNDGGSVADFDPEDKTKFF 1156
               ELF  + YR+YLE+HG+ VQLTEALAEYWH+RVR+EL        A  DP +   FF
Sbjct: 1039 ATAELFERDAYRDYLELHGLSVQLTEALAEYWHTRVRAEL------GFAGEDPAELDAFF 1092

Query: 1157 DLDYRGARFSFGYGSCPDLEDRAKLVELLEPGRIGVELSEELQLHPEQSTDAFVLYHPEA 1216
             L YRGARFS GYG+CP+LEDRAK++ LLEP R+GV LSEE QL PEQ+TDA V++HPEA
Sbjct: 1093 KLGYRGARFSLGYGACPNLEDRAKIMRLLEPERVGVTLSEEFQLVPEQATDAIVIHHPEA 1152

Query: 1217 KYFNV 1221
             YFNV
Sbjct: 1153 TYFNV 1157


Lambda     K      H
   0.316    0.135    0.386 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 3534
Number of extensions: 157
Number of successful extensions: 13
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 1221
Length of database: 1157
Length adjustment: 47
Effective length of query: 1174
Effective length of database: 1110
Effective search space:  1303140
Effective search space used:  1303140
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.6 bits)
S2: 59 (27.3 bits)
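
The summary numbers in the BLAST report above can be rechecked by hand. The following is a minimal Python sketch, not part of GapMind, that copies values from the report and recomputes percent identity, coverage, effective search space, and an approximate bit score. The variable names are illustrative assumptions. The bit-score formula is the standard Karlin-Altschul conversion, and because the report prints rounded Lambda and K values, the recomputed score only approximately matches the reported 1442 bits.

```python
import math

# Values copied from the BLAST report above.
identities, positives, gaps, aln_len = 745, 912, 55, 1205
query_len, subject_len = 1221, 1157
query_start, query_end = 18, 1221
subject_start, subject_end = 7, 1157
length_adjustment = 47
raw_score = 3734
gapped_lambda, gapped_k = 0.267, 0.0410  # rounded values as printed

# Percentages over the alignment length (the report truncates these to integers).
pct_identity = 100 * identities / aln_len
pct_positives = 100 * positives / aln_len
pct_gaps = 100 * gaps / aln_len

# Fraction of each sequence spanned by the alignment.
query_coverage = (query_end - query_start + 1) / query_len
subject_coverage = (subject_end - subject_start + 1) / subject_len

# Effective search space: product of the length-adjusted sequence lengths.
effective_space = (query_len - length_adjustment) * (subject_len - length_adjustment)

# Karlin-Altschul conversion from raw score to bits.
bit_score = (gapped_lambda * raw_score - math.log(gapped_k)) / math.log(2)

print(f"identity {pct_identity:.1f}%, positives {pct_positives:.1f}%, gaps {pct_gaps:.1f}%")
print(f"query coverage {query_coverage:.3f}, subject coverage {subject_coverage:.3f}")
print(f"effective search space {effective_space}, approximate bit score {bit_score:.1f}")
```

The effective search space computed this way agrees with the value printed in the report (1303140).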

Align candidate WP_083924073.1 D471_RS0104310 (methionine synthase)
to HMM TIGR02082 (metH: methionine synthase (EC 2.1.1.13))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.aa/TIGR02082.hmm
# target sequence database:        /tmp/gapView.1261048.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR02082  [M=1182]
Accession:   TIGR02082
Description: metH: methionine synthase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                             Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                             -----------
          0 1424.4   0.0          0 1424.2   0.0    1.0  1  NCBI__GCF_000341125.1:WP_083924073.1  


Domain annotation for each sequence (and alignments):
>> NCBI__GCF_000341125.1:WP_083924073.1  
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 ! 1424.2   0.0         0         0       1    1182 []      11    1157 .]      11    1157 .] 0.97

  Alignments for each domain:
  == domain 1  score: 1424.2 bits;  conditional E-value: 0
                             TIGR02082    1 lnkrilvlDGamGtqlqsanLteadFrgeeadlarelkGnndlLnltkPeviaaihrayfeaGaDivetnt 71  
                                            l++r++v DGamGt+lq+++L+ ++F+g         +G+nd+Ln+t+P+++++ h a++  G+D vetnt
  NCBI__GCF_000341125.1:WP_083924073.1   11 LSQRVIVADGAMGTMLQAHDLDLDQFEG--------HEGCNDILNITRPDIVHDTHAAFLAVGSDCVETNT 73  
                                            589*************************........5********************************** PP

                             TIGR02082   72 FnsteialadYdledkayelnkkaaklarevadeftltpekkRfvaGslGPtnklatlspdverpefrnvt 142 
                                            F+++   la+Y++ed+aye+ +++a++are+ad ++ tp+++R+v+Gs+GP++kl++l+         +  
  NCBI__GCF_000341125.1:WP_083924073.1   74 FSANYGGLAEYGIEDRAYEIAEAGARVAREAADAYS-TPDHPRYVLGSVGPGTKLPSLG---------HAP 134 
                                            ************************************.**********************.........9** PP

                             TIGR02082  143 ydelvdaYkeqvkglldGGvDllLietvfDtlnakaalfaveevfeekgrelPilisgvivdksGrtLsGq 213 
                                            y+ l+d Y++  +gl+dGG+D++Liet++D+l++kaa+++++++ ++ gr++Pi+++ v+++++G++L+G+
  NCBI__GCF_000341125.1:WP_083924073.1  135 YALLRDHYEQCHRGLIDGGADAILIETCQDLLQVKAAVVGAQRARKALGRDVPIIAQ-VTIETTGTMLMGS 204 
                                            ********************************************************9.************* PP

                             TIGR02082  214 tleaflaslehaeililGLnCalGadelrefvkelsetaealvsviPnaGLPnalg...eYdltpeelaka 281 
                                            +++a+l+sl + +i+++GLnCa+G++e++e++++ls+++++++s++PnaGLP+  +    Y+ltp ela a
  NCBI__GCF_000341125.1:WP_083924073.1  205 EIGAALTSLAPLGIDVIGLNCATGPAEMSEHLRYLSHHSPIPISCMPNAGLPQLGAdgaVYPLTPAELADA 275 
                                            *****************************************************9987778*********** PP

                             TIGR02082  282 lkefaeegllnivGGCCGttPehiraiaeavkdikprkrqeleeksvlsglealkiaqessfvniGeRtnv 352 
                                               f +e++l++vGGCCGttPeh+r++ e v++   ++r+ l e+ ++s+++++++ q++s++ +GeRtn+
  NCBI__GCF_000341125.1:WP_083924073.1  276 HDTFTSEFGLSVVGGCCGTTPEHLRQVVERVQGRGIKDRKPLVEAAASSLYQSVPFRQDASYLAVGERTNA 346 
                                            *********************************************************************** PP

                             TIGR02082  353 aGskkfrklikaedyeealkiakqqveeGaqilDinvDevllDgeadmkkllsllasepdiakvPlmlDss 423 
                                            +Gskkfr ++ +e +++++++a++q+++Ga++lD+n+D+v++Dg++dm++l+s+ a++   +++P+mlDs+
  NCBI__GCF_000341125.1:WP_083924073.1  347 NGSKKFRTAMLEERWDDCVEMARDQIRDGAHLLDLNIDYVGRDGVRDMRELASRFATS---STLPIMLDST 414 
                                            **********************************************************...8********* PP

                             TIGR02082  424 efevleaGLkviqGkaivnsislkdG...eerFlekaklikeyGaavvvmafDeeGqartadkkieiakRa 491 
                                            e +vleaGL+ ++G+a++ns++++dG   ++rF + ++l+ e+Gaavv + +DeeGqarta+ k+++a+R+
  NCBI__GCF_000341125.1:WP_083924073.1  415 EPAVLEAGLEALGGRAVINSVNYEDGdgpDSRFTRIMELVSEHGAAVVGLCIDEEGQARTAEWKLRVATRL 485 
                                            **************************88889**************************************** PP

                             TIGR02082  492 yklltekvgfppediifDpniltiatGieehdryaidfieaireikeelPdakisgGvsnvsFslrgndav 562 
                                            ++++t ++g+   di++D ++++i+tG+ee +r++i++++aire+k+++Pd+++++G+sn+sF+++  +a+
  NCBI__GCF_000341125.1:WP_083924073.1  486 IEQITGEWGLRTGDIVIDCLTFPITTGQEETRRDGIETLDAIRELKRRYPDVQTTLGLSNLSFGVN--PAA 554 
                                            ******************************************************************..*** PP

                             TIGR02082  563 RealhsvFLyeaikaGlDmgivnagklavyddidkelrevvedlildrrreatekLlelaelykgtkekss 633 
                                            R +l+svFL+ea++aGlD++iv+a+k+ ++++i++e+rev++dl++drr +++++L +++el++g+ +ks 
  NCBI__GCF_000341125.1:WP_083924073.1  555 RIVLNSVFLHEAVEAGLDSAIVHASKIVPINQIPDEQREVALDLVYDRRADDYDPLSRFIELFEGVDAKSM 625 
                                            *********************************************************************** PP

                             TIGR02082  634 keaqeaewrnlpveeRLeralvkGeregieedleearkklkapleiiegpLldGmkvvGdLFGsGkmfLPq 704 
                                            + ++ +e++ lp+ eRLer++v+Ge  gie+dle+a+   +++l+i++  Ll+Gmk+vG+LFGsG+m+LP+
  NCBI__GCF_000341125.1:WP_083924073.1  626 RASRAEELAALPLWERLERRIVDGEMTGIEADLEAAL-AERPALAIVNDTLLEGMKTVGELFGSGQMQLPF 695 
                                            *************************************.88999**************************** PP

                             TIGR02082  705 vvksarvmkkavayLePylekekeedkskGkivlatvkGDvhDiGknivdvvLscngyevvdlGvkvPvek 775 
                                            v+ksa+vmk avayLeP++ek +  d+ kG+ivlatvkGDvhDiGkn+vd++Ls+ngy+vv++G+k+Pv  
  NCBI__GCF_000341125.1:WP_083924073.1  696 VLKSAEVMKGAVAYLEPHMEKTD--DDGKGRIVLATVKGDVHDIGKNLVDIILSNNGYDVVNIGIKQPVSA 764 
                                            ********************988..889******************************************* PP

                             TIGR02082  776 ileaakkkkaDviglsGLivksldemvevaeemerrgvk..iPlllGGaalskahvavkiaekYkgevvyv 844 
                                            ileaa+k++aDvig+sGL+vks++ m+e++eem+ rg++  +P+llGGaal++++v++++ae ++g+v y+
  NCBI__GCF_000341125.1:WP_083924073.1  765 ILEAAEKERADVIGMSGLLVKSTVIMKENLEEMNSRGLSerFPVLLGGAALTRSYVEQDLAEMFEGQVRYA 835 
                                            **************************************888****************************** PP

                             TIGR02082  845 kdaseavkvvdkllsekkkaeelekikeeyeeirekfgekkeklialsekaarkevfaldrsedlevpapk 915 
                                            kda+e+++++d+++  k+ +e +e    ++  +r+ ++         +  + ++     d   d +vp+p+
  NCBI__GCF_000341125.1:WP_083924073.1  836 KDAFEGLRLMDSFMAVKRGEEGAELPALRQRRVRRGATL--------KVTEPEDMPARSDVATDNRVPVPP 898 
                                            ****************98666655555555555544444........22233333333444556789**** PP

                             TIGR02082  916 flGtkvleas.ieellkyiDwkalFv.qWelrgkypkilkdeleglearklfkdakelldklsaekllrar 984 
                                            f+G ++ + + ++++  ++D++a F+ qW+l+ +++    +++e+l++++ +++++ +ld++++++ll+a+
  NCBI__GCF_000341125.1:WP_083924073.1  899 FWGDRISKGIpLADYSAFLDERATFMgQWGLKASRG-GNGPSYEELVETEGRPRMRMWLDRIQTDGLLEAA 968 
                                            ************************************.99******************************** PP

                             TIGR02082  985 gvvGlfPaqsvgddieiytdetvsqetkpiatvrekleqlrqqsdrylclaDfiaskesGikDylgallvt 1055
                                            +v G fP+ s+gdd++++++++v        t r+++ ++rq +dr+lclaD++++kesG+ D++++++vt
  NCBI__GCF_000341125.1:WP_083924073.1  969 VVHGHFPCYSEGDDLVVLDEDGV--------TERTRFTFPRQRRDRHLCLADYFRPKESGELDVVSFQVVT 1031
                                            *******************9999........778899********************************** PP

                             TIGR02082 1056 aglgaeelakkleakeddydsilvkaladrlaealaellhervRkelwgyaeeenldkedllkerYrGirp 1126
                                            +g  ++  + +l+++++++d++++++l+++l+ealae++h rvR el +   e++ + + ++k+ YrG+r+
  NCBI__GCF_000341125.1:WP_083924073.1 1032 VGSAISRATAELFERDAYRDYLELHGLSVQLTEALAEYWHTRVRAELGFA-GEDPAELDAFFKLGYRGARF 1101
                                            ***********************************************998.899***************** PP

                             TIGR02082 1127 afGYpacPdhtekatlleLleaeriGlklteslalaPeasvsglyfahpeakYfav 1182
                                            ++GY+acP+++++a++++Lle+er+G++l+e+++l Pe+++++++++hpea Yf+v
  NCBI__GCF_000341125.1:WP_083924073.1 1102 SLGYGACPNLEDRAKIMRLLEPERVGVTLSEEFQLVPEQATDAIVIHHPEATYFNV 1157
                                            ******************************************************98 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (1182 nodes)
Target sequences:                          1  (1157 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.04u 0.04s 00:00:00.08 Elapsed: 00:00:00.07
# Mc/sec: 17.57
//
[ok]
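
For readers who want to check how much of the HMM and of the candidate protein the domain hit covers, here is a minimal Python sketch that splits the domain table row shown above on whitespace. It is an illustration only, not GapMind's own parser. The field positions follow the column header of the hmmsearch domain table, and the lengths 1182 and 1157 are taken from the "[M=1182]" line and the "1157 residues searched" line of the report.

```python
# Domain table row copied from the hmmsearch output above.
domain_row = (
    "   1 ! 1424.2   0.0         0         0       1    1182 []"
    "      11    1157 .]      11    1157 .] 0.97"
)
model_len = 1182   # from "Query: TIGR02082 [M=1182]"
target_len = 1157  # from "Target sequences: 1 (1157 residues searched)"

fields = domain_row.split()
# Column order: #, flag, score, bias, c-Evalue, i-Evalue,
# hmmfrom, hmm to, hmm bounds, alifrom, ali to, ali bounds,
# envfrom, env to, env bounds, acc
score = float(fields[2])
hmm_from, hmm_to = int(fields[6]), int(fields[7])
ali_from, ali_to = int(fields[9]), int(fields[10])
accuracy = float(fields[15])

# Fraction of the model and of the protein included in the alignment.
hmm_coverage = (hmm_to - hmm_from + 1) / model_len
seq_coverage = (ali_to - ali_from + 1) / target_len

print(f"score {score} bits, acc {accuracy}")
print(f"HMM coverage {hmm_coverage:.3f}, sequence coverage {seq_coverage:.3f}")
```

Here the alignment spans model positions 1 to 1182, so the whole model is covered, and it spans residues 11 to 1157 of the candidate.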

This GapMind analysis is from Jul 25 2024. The underlying query database was built on Jul 25 2024.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
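
The definition of a gap given above (a step with no high- or medium-confidence candidates) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not GapMind's implementation: the function name, the dictionary layout, and the step names other than metH are hypothetical, and the confidence labels are assumed to have been assigned already.

```python
def find_gaps(candidates_by_step):
    """Return the steps that have no high- or medium-confidence candidate.

    candidates_by_step maps a step name to a list of confidence labels
    ("high", "medium", or "low"), one per candidate protein.
    """
    return [
        step
        for step, confidences in candidates_by_step.items()
        if not any(c in ("high", "medium") for c in confidences)
    ]


# Hypothetical example data; only metH comes from this page.
example = {
    "metH": ["high"],
    "hypothetical_step_a": ["low", "low"],
    "hypothetical_step_b": [],
}
gaps = find_gaps(example)
print(gaps)
```

A step with only low-confidence candidates, or with no candidates at all, is reported as a gap in this sketch.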

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory