GapMind for catabolism of small carbon sources

 

Alignments for a candidate for putA in Caulobacter crescentus NA1000

Align L-glutamate gamma-semialdehyde dehydrogenase (EC 1.2.1.88); Proline dehydrogenase (EC 1.5.5.2) (characterized)
to candidate CCNA_00846 CCNA_00846 proline dehydrogenase/delta-1-pyrroline-5-carboxylate dehydrogenase

Query= reanno::SB2B:6938573
         (1058 letters)



>FitnessBrowser__Caulo:CCNA_00846
          Length = 1029

 Score =  956 bits (2470), Expect = 0.0
 Identities = 529/1027 (51%), Positives = 673/1027 (65%), Gaps = 14/1027 (1%)

Query: 31   DEEAYLKELIALVPSSDEEIARITSRAHDLVAKVRQYEKKGLMVGIDAFLQQYSLETQEG 90
            DE A + +L+A  P S E+ A + + A  LV   R+  +K  +V  ++FLQ++SL T+EG
Sbjct: 14   DEAAVIADLLAAKPLSSEDRAAVRAEAEALVRGARRSVRKQGVV--ESFLQEFSLGTREG 71

Query: 91   IILMCLAEALLRIPDAETADALIADKLSGAKWDEHMSKSDSVLVNASTWGLMLTGKIVQL 150
            + LMCLAEALLR PD +T D LIA+K+  A W  H+  SDS+ VNASTWGLMLTGKIV+ 
Sbjct: 72   LALMCLAEALLRTPDDDTRDKLIAEKIGSADWASHLGGSDSLFVNASTWGLMLTGKIVEP 131

Query: 151  DKNLDGTPSNLLSRLVNRLGEPVIRQAMYAAMKIMGKQFVLGRTIEEGLKNAAEKRKLGY 210
            D+         + +L  RLGEPVIR A+  A++IMG+QFVLGRTIE  +K AA +   G 
Sbjct: 132  DETARNDMPGFIKKLAGRLGEPVIRAAVGQAIRIMGEQFVLGRTIEAAIKRAAAE---GD 188

Query: 211  THSYDMLGEAALTMKDADKYYRDYANAIQALGTAKFDESEAPRPTISIKLSALHPRYEVA 270
              S+DMLGE A T  DA +Y + YA+AI+ +G             +S+KLSAL PRYE  
Sbjct: 189  MCSFDMLGEGARTAADAARYEKAYADAIETVGKLSNGAGPEAGHGVSVKLSALCPRYEAT 248

Query: 271  NEDRVMTELYATLIKLIEQARSLNVGIQIDAEEVDRLELSLKLFKKLYQSDAAKGWGLLG 330
            +EDRV  ELY   ++L + A   N+   IDAEE DRL LSLKL  KL +      W  LG
Sbjct: 249  HEDRVWEELYPRTLRLAKIAARHNLNFTIDAEEADRLALSLKLLDKLCREPELGDWTGLG 308

Query: 331  IVVQAYSKRALPVLMWLTRLAKEQGDEIPLRLVKGAYWDSELKWAQQAGEAGYPLFTRKA 390
            + VQAY KR   V+  L  L++E G  + +RLVKGAYWDSE+K AQ AG   YP+FT K 
Sbjct: 309  LAVQAYQKRCGEVIARLKALSEETGRRLMVRLVKGAYWDSEIKRAQVAGRPDYPVFTTKP 368

Query: 391  ATDVSYLACARYLLSEATRGVIYPQFASHNAQTVAAITAMVGDR--KFEFQRLHGMGQEL 448
            ATD+SYL  A+ L+  A    +Y QFA+HNA T+AA+  M  +   K E QRLHGMG+ L
Sbjct: 369  ATDLSYLVNAKALIEAAPH--LYAQFATHNAHTLAAVVRMAKNTGVKIEHQRLHGMGEAL 426

Query: 449  YDTVLAEAAVPTVRIYAPIGAHKDLLPYLVRRLLENGANTSFVHKLVDPKTPIESLVTHP 508
            Y          T+R YAP+G H+DLLPYLVRRLLENGANTSFVH L+D + P+E +VT P
Sbjct: 427  YKAADDLYDGITLRAYAPVGGHEDLLPYLVRRLLENGANTSFVHALLDERVPVEKVVTDP 486

Query: 509  LKTLQGYKTLANNKIVKPADIFGAERKNSKGLNMNIISESEPFFAALEKFKDTQWSAGPL 568
            + T++ +    + KI  P +++G  R NS GL++++ ++ E   AA+        SAGPL
Sbjct: 487  IDTVEAHPD-RHAKIPTPINVYGERRVNSAGLDLSVKADRERLSAAVAAQDGVTLSAGPL 545

Query: 569  VNGETLSGEVR-DVVSPYNTTLKVGQVAFANEATIEQAIAGADKAFASWCRTPVETRANA 627
            V G+ ++G     +++P N    VG V+ A  A I++A   A  A  +W R     RA  
Sbjct: 546  VGGKVVAGGAPLPLIAPANDQKTVGVVSEAQSAQIDEAFKLARAAQPAWDRAGGVARAQV 605

Query: 628  LQKLADLLEENREELIALCTREAGKSIQDGIDEVREAVDFCRYYAVQAKKMMSKPELLPG 687
            L+ + D LE N E LIAL +REAGK++ DGI EVREAVDFCRYYA+ A+    + E+L G
Sbjct: 606  LRAMGDALEANIERLIALLSREAGKTLSDGIAEVREAVDFCRYYAMLAEDQFGEAEILKG 665

Query: 688  PTGELNELFLQGRGVFVCISPWNFPLAIFLGQVAAALATGNTVIAKPAEQTCLIGFRAVQ 747
            P GE N L L GRGVFVCISPWNFPLAIF GQ+AAALA GN V+AKPAEQT LI F AV+
Sbjct: 666  PVGETNSLRLAGRGVFVCISPWNFPLAIFTGQIAAALAAGNAVLAKPAEQTPLIAFEAVK 725

Query: 748  LAHEAGIPKDVLQFLPGTGAVVGAKLTSDERIGGVCFTGSTTTAKVINRALAGRDGAIIP 807
            L H AG+   +L  LPG G  VGA LTS E + GV FTG T TA  IN+ LA R G I+P
Sbjct: 726  LYHAAGLDPRLLALLPGRGETVGAALTSHEDLDGVAFTGGTDTAWRINQTLAARQGPIVP 785

Query: 808  LIAETGGQNAMVVDSTSQPEQVVNDVVSSAFTSAGQRCSALRVLYLQEDIAERVLDVLKG 867
             IAETGG N M VD+T+Q EQV++DV+ SAF SAGQRCSALR+L+L  D A+ +++ LKG
Sbjct: 786  FIAETGGLNGMFVDTTAQREQVIDDVIVSAFGSAGQRCSALRLLFLPHDTADHIIEGLKG 845

Query: 868  AMDELTLGNPGSVKTDVGPVIDAAAKANLNAHIDHIKQVGRLIHQLSLPEGTENGHFVAP 927
            AMD L LG+P    TDVGPVIDA AK  L+ H+  +K   +++H L+ P G   G F AP
Sbjct: 846  AMDALVLGDPALAVTDVGPVIDAEAKDALDKHLVRLKSDAKVLHALAAPAG---GTFFAP 902

Query: 928  TAVEIDSIKVLTKENFGPILHVVRYKAAGLQKVIDDINSTGFGLTLGIHSRNEGHALEVA 987
               EI +   L +E FGP+LHVVRYK   L+KV   + +  +GLTLGIHSR E  A +V 
Sbjct: 903  VLAEIPTADFLEREVFGPVLHVVRYKPENLEKVAGALAARRYGLTLGIHSRIESFAADVQ 962

Query: 988  DKVNVGNVYINRNQIGAVVGVQPFGGQGLSGTGPKAGGPHYLTRFVTEKTRTNNITAIGG 1047
              V  GN Y+NR+  GAVVGVQPFGG+GLSGTGPKAGGPH L RF  E+  + NITA GG
Sbjct: 963  RLVPAGNAYVNRSMTGAVVGVQPFGGEGLSGTGPKAGGPHALLRFAVERALSVNITAQGG 1022

Query: 1048 NATLLSL 1054
            +  LL+L
Sbjct: 1023 DPALLNL 1029


Lambda     K      H
   0.317    0.134    0.380 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 2548
Number of extensions: 99
Number of successful extensions: 9
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 1058
Length of database: 1029
Length adjustment: 45
Effective length of query: 1013
Effective length of database: 984
Effective search space:   996792
Effective search space used:   996792
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 58 (26.9 bits)
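
The statistics in the report above are tied together by the standard Karlin-Altschul relations. As a minimal sketch (not part of GapMind itself), the following Python recomputes the bit score and the effective search space from the gapped Lambda and K, the raw score, and the effective lengths printed in the report. The function and constant names are illustrative assumptions.

```python
import math

# Values copied from the BLAST report above (gapped Karlin-Altschul parameters).
LAMBDA_GAPPED = 0.267
K_GAPPED = 0.0410
RAW_SCORE = 2470
EFFECTIVE_QUERY_LENGTH = 1013
EFFECTIVE_DB_LENGTH = 984


def bit_score(raw_score: float, lam: float, k: float) -> float:
    """Convert a raw alignment score to bits: (lambda * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def expect_value(bits: float, search_space: int) -> float:
    """Expected number of chance hits scoring at least `bits`: N * 2^(-bits)."""
    return search_space * 2.0 ** (-bits)


bits = bit_score(RAW_SCORE, LAMBDA_GAPPED, K_GAPPED)
search_space = EFFECTIVE_QUERY_LENGTH * EFFECTIVE_DB_LENGTH
e_value = expect_value(bits, search_space)

print(f"bit score: {bits:.0f}")
print(f"effective search space: {search_space}")
print(f"E-value: {e_value:.1e}")
```

With the rounded parameters shown in the report, the bit score comes out at about 956 and the search space at 996792, matching the reported values. The computed E-value is far below the smallest value the report prints, which is consistent with the reported "Expect = 0.0".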

Align candidate CCNA_00846 CCNA_00846 (proline dehydrogenase/delta-1-pyrroline-5-carboxylate dehydrogenase)
to HMM TIGR01238 (delta-1-pyrroline-5-carboxylate dehydrogenase (EC 1.2.1.88))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR01238.hmm
# target sequence database:        /tmp/gapView.13248.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR01238  [M=500]
Accession:   TIGR01238
Description: D1pyr5carbox3: delta-1-pyrroline-5-carboxylate dehydrogenase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                             Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                             -----------
   1.4e-186  606.8   1.5   1.9e-186  606.4   1.5    1.2  1  lcl|FitnessBrowser__Caulo:CCNA_00846  CCNA_00846 proline dehydrogenase


Domain annotation for each sequence (and alignments):
>> lcl|FitnessBrowser__Caulo:CCNA_00846  CCNA_00846 proline dehydrogenase/delta-1-pyrroline-5-carboxylate dehydrogenase
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  606.4   1.5  1.9e-186  1.9e-186       1     496 [.     505    1008 ..     505    1012 .. 0.99

  Alignments for each domain:
  == domain 1  score: 606.4 bits;  conditional E-value: 1.9e-186
                             TIGR01238    1 dlygegrknslGvdlaneselksleeqllkaaakkfqaapivgekakaegeaqpvknpadrkdivGqvsea 71  
                                            ++yge r ns+G+dl+++   ++l++ + ++   +  a p+vg+k +a g   p+ +pa+ ++ vG vsea
  lcl|FitnessBrowser__Caulo:CCNA_00846  505 NVYGERRVNSAGLDLSVKADRERLSAAVAAQDGVTLSAGPLVGGKVVAGGAPLPLIAPANDQKTVGVVSEA 575 
                                            58********************************************************************* PP

                             TIGR01238   72 daaevqeavdsavaafaewsatdakeraailerladlleshmpelvallvreaGktlsnaiaevreavdfl 142 
                                            + a+++ea + a aa + w       ra +l+++ d le ++  l+all reaGktls+ iaevreavdf+
  lcl|FitnessBrowser__Caulo:CCNA_00846  576 QSAQIDEAFKLARAAQPAWDRAGGVARAQVLRAMGDALEANIERLIALLSREAGKTLSDGIAEVREAVDFC 646 
                                            *********************************************************************** PP

                             TIGR01238  143 ryyakqvedvldeesaka.............lGavvcispwnfplaiftGqiaaalaaGntviakpaeqts 200 
                                            ryya  +ed+++e                  +G++vcispwnfplaiftGqiaaalaaGn+v+akpaeqt+
  lcl|FitnessBrowser__Caulo:CCNA_00846  647 RYYAMLAEDQFGEAEILKgpvgetnslrlagRGVFVCISPWNFPLAIFTGQIAAALAAGNAVLAKPAEQTP 717 
                                            ************98765579*************************************************** PP

                             TIGR01238  201 liaaravellqeaGvpagviqllpGrGedvGaaltsderiaGviftGstevarlinkalakredapvplia 271 
                                            lia +av+l + aG+ +  + llpGrGe+vGaalts+e + Gv+ftG t++a +in++la r+ + vp+ia
  lcl|FitnessBrowser__Caulo:CCNA_00846  718 LIAFEAVKLYHAAGLDPRLLALLPGRGETVGAALTSHEDLDGVAFTGGTDTAWRINQTLAARQGPIVPFIA 788 
                                            *********************************************************************** PP

                             TIGR01238  272 etGGqnamivdstalaeqvvadvlasafdsaGqrcsalrvlcvqedvadrvltlikGamdelkvgkpirlt 342 
                                            etGG n m vd+ta  eqv+ dv+ saf saGqrcsalr+l++ +d ad++++ +kGamd l++g p +  
  lcl|FitnessBrowser__Caulo:CCNA_00846  789 ETGGLNGMFVDTTAQREQVIDDVIVSAFGSAGQRCSALRLLFLPHDTADHIIEGLKGAMDALVLGDPALAV 859 
                                            *********************************************************************** PP

                             TIGR01238  343 tdvGpvidaeakqnllahiekmkakakkvaqvkleddvesekgtfvaptlfelddldelkkevfGpvlhvv 413 
                                            tdvGpvidaeak+ l +h+ ++k+ ak ++ +   +      gtf ap+l e+   d l++evfGpvlhvv
  lcl|FitnessBrowser__Caulo:CCNA_00846  860 TDVGPVIDAEAKDALDKHLVRLKSDAKVLHALAAPA-----GGTFFAPVLAEIPTADFLEREVFGPVLHVV 925 
                                            ****************************99998777.....8***************************** PP

                             TIGR01238  414 rykadeldkvvdkinakGygltlGvhsrieetvrqiekrakvGnvyvnrnlvGavvGvqpfGGeGlsGtGp 484 
                                            ryk ++l+kv   + a+ ygltlG+hsrie+  + +++ + +Gn yvnr + GavvGvqpfGGeGlsGtGp
  lcl|FitnessBrowser__Caulo:CCNA_00846  926 RYKPENLEKVAGALAARRYGLTLGIHSRIESFAADVQRLVPAGNAYVNRSMTGAVVGVQPFGGEGLSGTGP 996 
                                            *********************************************************************** PP

                             TIGR01238  485 kaGGplylyrlt 496 
                                            kaGGp+ l r+ 
  lcl|FitnessBrowser__Caulo:CCNA_00846  997 KAGGPHALLRFA 1008
                                            ********9996 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (500 nodes)
Target sequences:                          1  (1029 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.02u 0.01s 00:00:00.03 Elapsed: 00:00:00.03
# Mc/sec: 16.51
//
[ok]
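
The per-domain row in the hmmsearch output above can be read programmatically to see how much of the model and of the candidate the hit covers. This is a hedged sketch that splits the row on whitespace, assuming the column order shown in the header line of the domain table. The `DomainHit` type and `parse_domain_row` helper are hypothetical names, not part of HMMER or GapMind.

```python
from dataclasses import dataclass

# Domain row copied from the hmmsearch output above.
DOMAIN_ROW = (
    "   1 !  606.4   1.5  1.9e-186  1.9e-186       1     496 [."
    "     505    1008 ..     505    1012 .. 0.99"
)
MODEL_LENGTH = 500       # "[M=500]" in the Query line
CANDIDATE_LENGTH = 1029  # "1029 residues searched"


@dataclass
class DomainHit:
    score: float
    i_evalue: float
    hmm_from: int
    hmm_to: int
    ali_from: int
    ali_to: int


def parse_domain_row(row: str) -> DomainHit:
    """Parse one whitespace-separated row of the hmmsearch domain table."""
    f = row.split()
    # Column order: #, flag, score, bias, c-Evalue, i-Evalue, hmmfrom, hmm to,
    # bounds, alifrom, ali to, bounds, envfrom, env to, bounds, acc
    return DomainHit(
        score=float(f[2]),
        i_evalue=float(f[5]),
        hmm_from=int(f[6]),
        hmm_to=int(f[7]),
        ali_from=int(f[9]),
        ali_to=int(f[10]),
    )


hit = parse_domain_row(DOMAIN_ROW)
hmm_coverage = (hit.hmm_to - hit.hmm_from + 1) / MODEL_LENGTH
candidate_coverage = (hit.ali_to - hit.ali_from + 1) / CANDIDATE_LENGTH

print(f"model coverage: {hmm_coverage:.1%}")
print(f"candidate coverage: {candidate_coverage:.1%}")
```

The hit spans positions 1 to 496 of the 500-node model but only residues 505 to 1008 of the 1029-residue candidate, which fits a bifunctional protein whose C-terminal half carries the dehydrogenase domain matched by TIGR01238.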

This GapMind analysis is from Sep 17 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMER with enzyme models (usually from TIGRFAMs). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
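
The gap definition above (a step with no high- or medium-confidence candidate) can be illustrated with a short sketch. This is not GapMind's own code: the `find_gaps` function, the confidence labels as strings, and the example step and gene names are all hypothetical, and the rules that assign a confidence level to each candidate are not reproduced here.

```python
from typing import Dict, List

# Confidence levels that are enough for a step to count as filled.
SUPPORTING_LEVELS = {"high", "medium"}


def find_gaps(candidates_by_step: Dict[str, List[str]]) -> List[str]:
    """Return the steps that have no high- or medium-confidence candidate.

    `candidates_by_step` maps a step name to the confidence labels of its
    candidates ("high", "medium", or "low"); an empty list means no candidates.
    """
    return [
        step
        for step, levels in candidates_by_step.items()
        if not any(level in SUPPORTING_LEVELS for level in levels)
    ]


# Hypothetical pathway with made-up candidate confidences.
example_pathway = {
    "stepA": ["high", "low"],
    "stepB": ["medium"],
    "stepC": ["low", "low"],
    "stepD": [],
}

gaps = find_gaps(example_pathway)
print(gaps)
```

In this made-up pathway, `stepC` (only low-confidence candidates) and `stepD` (no candidates) would be reported as gaps.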

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory