GapMind for catabolism of small carbon sources

 

Alignments for a candidate for putA in Methylocapsa acidiphila B2

Align L-glutamate gamma-semialdehyde dehydrogenase (EC 1.2.1.88); Proline dehydrogenase (EC 1.5.5.2) (characterized)
to candidate WP_026607825.1 METAC_RS0116005 bifunctional proline dehydrogenase/L-glutamate gamma-semialdehyde dehydrogenase PutA

Query= reanno::ANA3:7023590
         (1064 letters)



>NCBI__GCF_000427445.1:WP_026607825.1
          Length = 1032

 Score =  877 bits (2267), Expect = 0.0
 Identities = 477/1026 (46%), Positives = 656/1026 (63%), Gaps = 12/1026 (1%)

Query: 33   YIVDEEQYLSELIKLVPSSDEAIERVTRRAHELVNKVRQFDKKGLMVGIDAFLQQYSLET 92
            Y   +E   + L+     SD A  R+   A  L+   R+  +K  + G++  L++YSL +
Sbjct: 14   YAPADESVAAALLAQADLSDAAEARIDSLATRLIKAARE--QKFRIGGVEDLLREYSLSS 71

Query: 93   QEGIILMCLAEALLRIPDAATADALIEDKLSGAKWDEHLSKSDSVLVNASTWGLMLTGKI 152
            +EG+ LM LAEALLR+PD  TAD LIED+L+ A W  H   S+++LV+A+ W L  T K+
Sbjct: 72   EEGLALMTLAEALLRVPDDFTADLLIEDRLASAHWASHADGSEALLVSAAAWALGTTAKV 131

Query: 153  VKLDKKIDGTPSNLLSRLVNRLGEPVIRQAMMAAMKIMGKQFVLGRTMKEALKNSEDKRK 212
            ++  K   G    +++ L  R+G   +R A   A+++MG  FV G T+++ L+ +  +  
Sbjct: 132  MEKGKSAHG----VVAALGRRIGMGALRAAARQAVQLMGAHFVFGETIEQTLRQAASREG 187

Query: 213  LGYTHSYDMLGEAALTRKDAEKYFNDYANAITELGAQSYNENESPRPTISIKLSALHPRY 272
              +  S+DMLGE A T+ DA+ YF+ YA+AI  +GAQ+  +    R  IS+KLSALHPRY
Sbjct: 188  RNWRFSFDMLGEGARTQVDADSYFSAYAHAIEAIGAQAAGDAAPGRLGISVKLSALHPRY 247

Query: 273  EVANEDRVLTELYDTVIRLIKLARGLNIGISIDAEEVDRLELSLKLFQKLFNADATKGWG 332
            E  +  RVL EL   V  L + AR   +  +IDAEE DRLELSL +   L    +  GW 
Sbjct: 248  EPQSRGRVLAELTPRVAELARAARRHGLIFTIDAEEADRLELSLDIVDLLLADRSLAGWH 307

Query: 333  LLGIVVQAYSKRALPVLVWLTRLAKEQGDEIPVRLVKGAYWDSELKWAQQAGEAAYPLYT 392
              G+ VQAY KRA   +  +   A+     + +RLVKGAYWDSE+K AQ+ G A YP++T
Sbjct: 308  GFGLAVQAYQKRAAAAIDHVVATARLCDRRLTLRLVKGAYWDSEIKRAQERGLADYPVFT 367

Query: 393  RKAGTDVSYLACARYLLSDATRGAIYPQFASHNAQTVAAISDMAGDRNHEFQRLHGMGQE 452
            RKA TD++YLACAR LL  A R  I+PQFA+HNA TVA+I +MAGD   EFQRLHGMG+E
Sbjct: 368  RKAMTDLNYLACARRLL--AARDVIFPQFATHNALTVASIVEMAGDAGFEFQRLHGMGEE 425

Query: 453  LYDTILSEAGAKAVRIYAPIGAHKDLLPYLVRRLLENGANTSFVHKLVDPKTPIESLVVH 512
            L++ +  E    + R+YAP+GAH +LL YLVRRLLENGAN+SFV +L DP     +L+V 
Sbjct: 426  LFEALRIERPTLSSRVYAPVGAHANLLAYLVRRLLENGANSSFVARLGDPHVAPRALLVR 485

Query: 513  PLKTLTGYKTLANNKIVLPTDIFGSDRKNSKGLNMNIISEAEPFFAALDKFKSTQWQAGP 572
            P   +         ++ LP D+ G  RK ++G+        +    A+ K  +   +A P
Sbjct: 486  PQAVIGDASRARTARLPLPQDLHGPARKTAQGIEFGDRRAVDALTQAIGKAAALA-EAAP 544

Query: 573  LVNGQTLTGEHKTVVSPFDTTQTVGQVAFADKAAIEQAVASADAAFATWTRTPVEVRASA 632
            +V+G+   G  +++ SP D   T+G V       + +A+  A A F  W++TPVE RAS 
Sbjct: 545  IVSGRRREGVLRSLASPIDGA-TIGVVGETPVRILTEAMREARAGFRLWSQTPVEARASC 603

Query: 633  LQKLADLLEENREELIALCTREAGKSIQDGIDEVREAVDFCRYYAVQAKKLMSKPELLPG 692
            L+++A  LE    + + L   EAGK++ D + E REA+DFCRYYA +A++L +   +LPG
Sbjct: 604  LERVAQDLEGQAPKWLRLLQIEAGKTLDDAVGEWREAIDFCRYYAQEARRLFAASVVLPG 663

Query: 693  PTGELNELFLQGRGVFVCISPWNFPLAIFLGQVSAALAAGNTVVAKPAEQTSIIGYRAVQ 752
            PTGE N L  + RGVF CISPWNFPL+IF GQV+AALAAGN+V+AKPAEQT +I    V 
Sbjct: 664  PTGEDNRLSWRSRGVFACISPWNFPLSIFTGQVAAALAAGNSVLAKPAEQTPLIAAAIVA 723

Query: 753  LAHQAGIPTDVLQYLPGTGATVGNALTADERIGGVCFTGSTGTAKLINRTLANREGAIIP 812
              H+AG P + L  LPG G  +G  L A   + GV FTGS  TA  INR LA ++GAI  
Sbjct: 724  AFHRAGAPVEALHLLPGDG-EIGAGLVALPALAGVAFTGSMETAVRINRALAAKDGAIAT 782

Query: 813  LIAETGGQNAMVVDSTSQPEQVVNDVVSSSFTSAGQRCSALRVLFLQEDIADRVIDVLQG 872
            LIAETGG N M+ D+T+ PEQV +DV++SSF SAGQRCSALR+L +QED AD++I ++ G
Sbjct: 783  LIAETGGVNVMIADATALPEQVADDVLASSFRSAGQRCSALRLLCIQEDAADKIIAMIAG 842

Query: 873  AMDELVIGNPSSVKTDVGPVIDATAKANLDAHIDHIKQVGKLIKQMSLP-AGTENGHFVS 931
            A  EL++G+P  +   VGPVID  AK +L+AHI  +++   +     +P +   NG +V+
Sbjct: 843  AARELLVGDPRDLSVHVGPVIDLAAKTSLEAHIAAMRRRAIVRYAGEIPTSAPRNGFYVA 902

Query: 932  PTAVEIDSIKVLEKEHFGPILHVIRYKASELAHVIDEINSTGFGLTLGIHSRNEGHALEV 991
            P   E+     L++E FGPILHV+RYKA+EL  ++D++ + G+GLTLG+HSR +     +
Sbjct: 903  PHIFELAQPTDLDREVFGPILHVVRYKANELDRLLDQLEAMGYGLTLGVHSRIDATIAHI 962

Query: 992  ADKVNVGNVYINRNQIGAVVGVQPFGGQGLSGTGPKAGGPHYLTRFVTEKTRTNNITAIG 1051
             ++   GN Y+NRN IGAVVG QPFGG GLSGTGPKAGGPHYL RF  E+T T N  A+G
Sbjct: 963  MERRLAGNCYVNRNMIGAVVGTQPFGGFGLSGTGPKAGGPHYLERFCVEQTVTINTAAVG 1022

Query: 1052 GNATLL 1057
            GNA L+
Sbjct: 1023 GNAALV 1028


Lambda     K      H
   0.317    0.133    0.377 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 2328
Number of extensions: 111
Number of successful extensions: 7
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 1064
Length of database: 1032
Length adjustment: 45
Effective length of query: 1019
Effective length of database: 987
Effective search space:  1005753
Effective search space used:  1005753
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.6 bits)
S2: 58 (26.9 bits)
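
The summary numbers in the BLAST report above can be checked by hand. The following is a minimal sketch, assuming the usual Karlin-Altschul conversion from raw score to bits and assuming that the printed percentages are truncated to whole numbers; all variable names are illustrative and are not part of GapMind or BLAST.

```python
import math

# Values copied from the BLAST report above.
identities, positives, gaps, aligned_columns = 477, 656, 12, 1026
query_length, subject_length, length_adjustment = 1064, 1032, 45
raw_score = 2267
gapped_lambda, gapped_k = 0.267, 0.0410  # printed (rounded) gapped parameters


def percent(part, whole):
    """Whole-number percentage, truncated (assumption: this is how the report prints it)."""
    return part * 100 // whole


# Effective lengths are the real lengths minus the length adjustment.
effective_query = query_length - length_adjustment
effective_db = subject_length - length_adjustment
search_space = effective_query * effective_db

# Karlin-Altschul conversion: bits = (lambda * S - ln K) / ln 2
bit_score = (gapped_lambda * raw_score - math.log(gapped_k)) / math.log(2)

print(percent(identities, aligned_columns),
      percent(positives, aligned_columns),
      percent(gaps, aligned_columns))
print(effective_query, effective_db, search_space)
print(round(bit_score, 1))
```

Truncation reproduces the reported 46%, 63% and 1%, and the product of the effective lengths reproduces the reported effective search space of 1005753. The recomputed bit score lands within about one bit of the reported 877, which is expected because Lambda and K are printed with limited precision.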

Align candidate WP_026607825.1 METAC_RS0116005 (bifunctional proline dehydrogenase/L-glutamate gamma-semialdehyde dehydrogenase PutA)
to HMM TIGR01238 (delta-1-pyrroline-5-carboxylate dehydrogenase (EC 1.2.1.88))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR01238.hmm
# target sequence database:        /tmp/gapView.48433.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR01238  [M=500]
Accession:   TIGR01238
Description: D1pyr5carbox3: delta-1-pyrroline-5-carboxylate dehydrogenase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                             Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                             -----------
   4.4e-181  588.7   0.4   7.6e-181  587.9   0.4    1.4  1  NCBI__GCF_000427445.1:WP_026607825.1  


Domain annotation for each sequence (and alignments):
>> NCBI__GCF_000427445.1:WP_026607825.1  
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  587.9   0.4  7.6e-181  7.6e-181       1     496 [.     506    1009 ..     506    1013 .. 0.97

  Alignments for each domain:
  == domain 1  score: 587.9 bits;  conditional E-value: 7.6e-181
                             TIGR01238    1 dlygegrknslGvdlaneselksleeqllkaaakkfqaapivgekakaegeaqpvknpadrkdivGqvsea 71  
                                            dl+g +rk ++G+++   + ++ l + + kaaa   +aapiv+   + eg  + + +p d    +G v e+
  NCBI__GCF_000427445.1:WP_026607825.1  506 DLHGPARKTAQGIEFGDRRAVDALTQAIGKAAAL-AEAAPIVS-GRRREGVLRSLASPIDGA-TIGVVGET 573 
                                            78999**********************9998775.69*****5.567799*********975.68****** PP

                             TIGR01238   72 daaevqeavdsavaafaewsatdakeraailerladlleshmpelvallvreaGktlsnaiaevreavdfl 142 
                                                  ea+  a a f  ws t+ + ra++ler+a+ le + p+ + ll+ eaGktl++a+ e rea+df+
  NCBI__GCF_000427445.1:WP_026607825.1  574 PVRILTEAMREARAGFRLWSQTPVEARASCLERVAQDLEGQAPKWLRLLQIEAGKTLDDAVGEWREAIDFC 644 
                                            *********************************************************************** PP

                             TIGR01238  143 ryyakqvedvldeesaka.............lGavvcispwnfplaiftGqiaaalaaGntviakpaeqts 200 
                                            ryya++++  +    + +             +G++ cispwnfpl+iftGq+aaalaaGn+v+akpaeqt+
  NCBI__GCF_000427445.1:WP_026607825.1  645 RYYAQEARRLFAASVVLPgptgednrlswrsRGVFACISPWNFPLSIFTGQVAAALAAGNSVLAKPAEQTP 715 
                                            **********99998888999************************************************** PP

                             TIGR01238  201 liaaravellqeaGvpagviqllpGrGedvGaaltsderiaGviftGstevarlinkalakredapvplia 271 
                                            liaa  v+ ++ aG p  ++ llpG Ge +Ga l + +++aGv+ftGs e+a +in+ala ++ a ++lia
  NCBI__GCF_000427445.1:WP_026607825.1  716 LIAAAIVAAFHRAGAPVEALHLLPGDGE-IGAGLVALPALAGVAFTGSMETAVRINRALAAKDGAIATLIA 785 
                                            ****************************.****************************************** PP

                             TIGR01238  272 etGGqnamivdstalaeqvvadvlasafdsaGqrcsalrvlcvqedvadrvltlikGamdelkvgkpirlt 342 
                                            etGG n mi d+tal+eqv  dvlas+f saGqrcsalr+lc+qed ad+++ +i Ga  el vg p  l 
  NCBI__GCF_000427445.1:WP_026607825.1  786 ETGGVNVMIADATALPEQVADDVLASSFRSAGQRCSALRLLCIQEDAADKIIAMIAGAARELLVGDPRDLS 856 
                                            *********************************************************************** PP

                             TIGR01238  343 tdvGpvidaeakqnllahiekmkakakkvaqvkleddvesekgtfvaptlfelddldelkkevfGpvlhvv 413 
                                              vGpvid  ak  l+ahi +m+ +a   +        + ++g +vap +fel +  +l++evfGp+lhvv
  NCBI__GCF_000427445.1:WP_026607825.1  857 VHVGPVIDLAAKTSLEAHIAAMRRRAIVRYAGEIPT-SAPRNGFYVAPHIFELAQPTDLDREVFGPILHVV 926 
                                            **********************98876655555544.6799****************************** PP

                             TIGR01238  414 rykadeldkvvdkinakGygltlGvhsrieetvrqiekrakvGnvyvnrnlvGavvGvqpfGGeGlsGtGp 484 
                                            ryka+eld+++d+++a GygltlGvhsri+ t+++i +r  +Gn+yvnrn++GavvG qpfGG GlsGtGp
  NCBI__GCF_000427445.1:WP_026607825.1  927 RYKANELDRLLDQLEAMGYGLTLGVHSRIDATIAHIMERRLAGNCYVNRNMIGAVVGTQPFGGFGLSGTGP 997 
                                            *********************************************************************** PP

                             TIGR01238  485 kaGGplylyrlt 496 
                                            kaGGp+yl r+ 
  NCBI__GCF_000427445.1:WP_026607825.1  998 KAGGPHYLERFC 1009
                                            **********96 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (500 nodes)
Target sequences:                          1  (1032 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.00u 0.01s 00:00:00.01 Elapsed: 00:00:00.00
# Mc/sec: 53.72
//
[ok]
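
How much of the TIGR01238 model the hit covers can be read from the domain table above. The following is a minimal sketch that splits the single domain row on whitespace and computes model coverage; the column positions are taken from the header line of that table, and the variable names are illustrative only.

```python
# Domain row copied from the hmmsearch output above; the model length comes from "[M=500]".
domain_row = ("1 !  587.9   0.4  7.6e-181  7.6e-181       1     496 [.     "
              "506    1009 ..     506    1013 .. 0.97")
model_length = 500

fields = domain_row.split()
# Column order per the table header:
# 0 '#', 1 flag, 2 score, 3 bias, 4 c-Evalue, 5 i-Evalue,
# 6 hmmfrom, 7 hmm to, 8 bounds, 9 alifrom, 10 ali to, 11 bounds,
# 12 envfrom, 13 env to, 14 bounds, 15 acc
score = float(fields[2])
hmm_from, hmm_to = int(fields[6]), int(fields[7])
ali_from, ali_to = int(fields[9]), int(fields[10])

# Fraction of the model's match states spanned by the alignment.
model_coverage = (hmm_to - hmm_from + 1) / model_length
# Number of candidate residues spanned by the alignment.
aligned_residues = ali_to - ali_from + 1

print(score, model_coverage, aligned_residues)
```

For this hit the alignment spans model positions 1 to 496 of 500 and candidate residues 506 to 1009, so the HMM match sits in the C-terminal half of the 1032-residue protein.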

This GapMind analysis is from Apr 09 2024. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMER with enzyme models (usually from TIGRFAMs). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."
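
The 50% coverage floor for low-confidence hits can be illustrated with the BLAST alignment on this page. The following is a minimal sketch, assuming coverage is measured as the aligned fraction of the characterized (query) protein; the function names are hypothetical and are not GapMind's code, and the high- and medium-confidence rules are not modeled here.

```python
def alignment_coverage(start, end, length):
    """Fraction of a sequence spanned by an alignment (1-based, inclusive coordinates)."""
    return (end - start + 1) / length


def meets_low_confidence_coverage(coverage, threshold=0.5):
    """True if coverage reaches the 50% floor quoted above (threshold taken from the text)."""
    return coverage >= threshold


# Query coordinates from the BLAST alignment above: residues 33..1057 of 1064.
coverage = alignment_coverage(33, 1057, 1064)
print(round(coverage, 3), meets_low_confidence_coverage(coverage))
```

This only checks the coverage floor; whether a candidate is rated higher than low confidence depends on the additional identity and score criteria.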

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory