GapMind for catabolism of small carbon sources

 

Alignments for a candidate for putA in Dinoroseobacter shibae DFL-12

Align L-glutamate gamma-semialdehyde dehydrogenase (EC 1.2.1.88); Proline dehydrogenase (EC 1.5.5.2) (characterized)
to candidate 3608920 Dshi_2311 delta-1-pyrroline-5-carboxylate dehydrogenase (RefSeq)

Query= reanno::Marino:GFF2744
         (1209 letters)



>FitnessBrowser__Dino:3608920
          Length = 1221

 Score = 1263 bits (3267), Expect = 0.0
 Identities = 665/1209 (55%), Positives = 848/1209 (70%), Gaps = 19/1209 (1%)

Query: 15   RQAIRDYYLADEHKVIHEMIAGAQLSQAERDAISARAAELVRSVRKNAKSTIMEKFLAEY 74
            R  +R +Y A+E  ++  + A  +LS  ER+  +A  A  V  VR   + ++ME FLAEY
Sbjct: 16   RAQVRAHYTAEETALLKSLAARIKLSAHEREKAAAAGARYVTRVRNETRPSMMEAFLAEY 75

Query: 75   GLTTKEGVALMCLAEALLRVPDNTTIHELIEDKITSGAWGTHVGKASSGLINTATVALLM 134
            GL+T EGV LMCLAEALLRVPD  TI +LIEDK+    WG H+G +SS L+N +T AL++
Sbjct: 76   GLSTSEGVGLMCLAEALLRVPDADTIDDLIEDKVAPSNWGAHLGHSSSSLVNASTWALML 135

Query: 135  TSNLLKDSERNTVGETLRKLLKRFGEPVIRTVAGQAMKEMGRQFVLGRDIDEAQDEAKEY 194
            T  +L +  R      LR L+KR GEPV+RT  GQ+MK +GRQFVLG+ I+E    A+E 
Sbjct: 136  TGKVLDEDPRGPA-RALRGLVKRLGEPVVRTAVGQSMKVLGRQFVLGQTIEEGLKNAREL 194

Query: 195  MAKGYTYSYDMLGEAARTDDDAKRYYDSYSNAIDSIAKASKGDVRKNPGISVKLSALLAR 254
              KG+TYSYDMLGEAARTD DA+RY+ +Y+ AI +IA+ + GDVR +PGISVKLSAL  R
Sbjct: 195  EKKGFTYSYDMLGEAARTDADARRYHAAYAQAITAIARQATGDVRSSPGISVKLSALHPR 254

Query: 255  YEYGNKERVMNELLPRARELVKKAAAANMGFNIDAEEQDRLDLSLDVIEELVADPELAGW 314
            YEY ++  VM +L+PRA  LVK+AA A +GFN+DAEEQDRLDLSLDVIE +++DP+L GW
Sbjct: 255  YEYTHRHSVMADLVPRAAALVKQAAQAGIGFNVDAEEQDRLDLSLDVIEAMMSDPDLDGW 314

Query: 315  DGFGVVVQAYGKRSSFVLDWLYGLAEKYDRKFMVRLVKGAYWDAEIKRAQVMGLNGFPVF 374
            DGFGVVVQAYG+R+  V++ LY +AE+YDRK MVRLVKGAYWD EIK AQ +G+  FPVF
Sbjct: 315  DGFGVVVQAYGRRAGPVIETLYDMAERYDRKIMVRLVKGAYWDTEIKLAQELGVERFPVF 374

Query: 375  TRKACSDVSFLSCATKLLNMTNRIYPQFATHNAHSVSAILEMAKTKGVDNYEFQRLHGMG 434
            TRK  +DVS+++CA  LL+  +RIYPQFATHNAH+ +A+L+MA     D +EFQRLHGMG
Sbjct: 375  TRKNNTDVSYMACAQMLLDRRDRIYPQFATHNAHTCAAVLQMAGNAR-DCFEFQRLHGMG 433

Query: 435  ESLHNEVLKVSGVPCRIYAPVGPHKDLLAYLVRRLLENGANSSFVNQIVDKRITPEEIAK 494
             SLH  V +  G  CRIYAPVG H+DLLAYLVRRLLENGANSSFVNQIVD  I  E I+ 
Sbjct: 434  ASLHQIVKETEGTRCRIYAPVGAHQDLLAYLVRRLLENGANSSFVNQIVDPDIPAEAISA 493

Query: 495  DPIVSVEEMGNNISSKAIVHPFKLFGDQRRNSKGWDITDPVTVNEIEKGRGAYKDYRWKG 554
            DP+  +E++G+ I + AI  P  LF   RRNS+G+ + +P ++  +   R A+ +  W  
Sbjct: 494  DPVSEMEKLGDQIPNPAIRQPSDLFAPDRRNSRGYRVNEPASILPLMTAREAFAETTWHA 553

Query: 555  GPLIAGEVAGT-EIQVVRNPADPDDLVGHVTQASDADVDTAITSAAAAFESWSAKSAEER 613
             P++AG    T   + V +PAD   LVG V +AS  DV  A+ +A   F  WSA+   ER
Sbjct: 554  RPMLAGGRDPTGPTREVHSPADKTRLVGTVQEASAEDVACALDAAETGFRDWSARPVSER 613

Query: 614  AACVRKVGDLYEENYAELFALTTREAGKSLLDAVAEIREAVDFSQYYANEAIRYK--DSG 671
            A  +RK+ D+YE+N AEL A+TTREAGK++LD +AE+REAVDF ++YANEA R +  D G
Sbjct: 614  ADMLRKLADMYEDNIAELTAITTREAGKTVLDGIAEVREAVDFLRFYANEAERLEEEDPG 673

Query: 672  DARGVMCCISPWNFPLAIFTGQILANLAAGNTVVAKPAEQTSLLAIRAVELMHQAGIPKD 731
              RG+  CISPWNFPLAIFTGQI A L  GN V+AKPAEQT ++A RAV++M   G+P  
Sbjct: 674  RPRGIFVCISPWNFPLAIFTGQIAAALVMGNAVLAKPAEQTPIIAARAVQMMRDCGLPDA 733

Query: 732  AIQLVPGTGATVGAALTSDSRVSGVCFTGSTATAQRINKVMTENMAPDAPLVAETGGLNA 791
            A+QL+PG G  VG  LTSD R++GVCFTGST  A  I+K + +N  P+A LVAETGGLNA
Sbjct: 734  ALQLLPGDGPMVGGPLTSDPRIAGVCFTGSTEVAMIIHKALAKNAGPEAVLVAETGGLNA 793

Query: 792  MIVDSTALPEQVVRDVLASSFQSAGQRCSALRMLYVQRDIADGLLEMLYGAMEELGIGDP 851
            MIVDSTAL EQ VRD+L SSFQSAGQRCSALR+LYVQ D+ D L+EML GA++ L IGD 
Sbjct: 794  MIVDSTALHEQAVRDILISSFQSAGQRCSALRILYVQEDVHDKLMEMLSGALDALVIGDS 853

Query: 852  WLLSTDVGPVIDENARKKIVDHCEKFERNGKLLKKMKVPEKGLFVSPAVLSVSGIEELEE 911
            W L  DV PVID +A+  I+ + ++  + G L+K +  P+ G +V+PA++ V GI ++E 
Sbjct: 854  WNLDVDVSPVIDADAQSDILGYIDQHRKAGTLIKTLAAPDSGTYVTPAIVKVGGIADMER 913

Query: 912  EIFGPVLHVATFEAKNIDKVVDDINAKGYGLTFGIHSRVDRRVERITSRIKVGNTYVNRN 971
            EIFGPVLHVATF+A  ID+VVD INA+ YGLTFG+H+R+D RVE+I  RI+VGN YVNRN
Sbjct: 914  EIFGPVLHVATFKANEIDQVVDAINARRYGLTFGLHTRIDDRVEQIVERIQVGNVYVNRN 973

Query: 972  QIGAIVGSQPFGGEGLSGTGPKAGGPQYVRRFLK-GETVEREADSNARKVDAKQLQKLIG 1030
            QIGAIVGSQPFGGEGLSGTGPKAGGP Y+ RF K G++    A   A  +    L   + 
Sbjct: 974  QIGAIVGSQPFGGEGLSGTGPKAGGPLYLTRFRKVGKSTSHPAPQGA-VLGKAALNTALS 1032

Query: 1031 QLDKLK-ASRPE----ARMDAIRPIFGNVPEPL------DAHVEALPGPTGETNRLSNHA 1079
             LD    A+RP+     RM A+    G V   L      D   + LPGPTGE+NRLS   
Sbjct: 1033 SLDARNWAARPDRVHILRM-ALSGSTGVVRRALSETAAFDMSPQTLPGPTGESNRLSMVP 1091

Query: 1080 RGVVLCLGPDKETALEQAGTALSQGNKVVVIAPGTQDVVDQANKAGLPIVGAQGLLEPEA 1139
            RG VLCLGP  E A+ QA  AL  G  VV+  PG+  +    + AG P+V   G ++   
Sbjct: 1092 RGTVLCLGPTPEIAMAQAVQALGAGCAVVIALPGSTPLSQPLSDAGAPVVTLDGTVDCVT 1151

Query: 1140 LATIDGFEAVVSCGDQPLLKAYREALAKRDGALLPLITEHTLDQRFVIERHLCVDTTAAG 1199
            L  + G E V + G     +  R AL++RDG ++PLI +    +R+V+ERHLC+DTTAAG
Sbjct: 1152 LTELTGIEVVAAAGASDWTRTLRVALSQRDGPIIPLIVDEIAPERYVLERHLCIDTTAAG 1211

Query: 1200 GNASLIAAS 1208
            GNA L+AAS
Sbjct: 1212 GNAKLLAAS 1220


Lambda     K      H
   0.316    0.133    0.378 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 3465
Number of extensions: 168
Number of successful extensions: 6
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 1209
Length of database: 1221
Length adjustment: 47
Effective length of query: 1162
Effective length of database: 1174
Effective search space:  1364188
Effective search space used:  1364188
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.6 bits)
S2: 59 (27.3 bits)
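
The statistics in the report above can be cross-checked by hand. The following sketch is illustrative only and is not part of GapMind or BLAST. It assumes the standard Karlin-Altschul conversion from raw score to bit score, bits = (lambda * S - ln K) / ln 2, applied to the rounded gapped lambda and K values printed above, and it assumes that the effective search space is the product of the two effective lengths. The function names are hypothetical.

```python
import math


def bit_score(raw_score, lam, k):
    """Convert a raw alignment score to bits (Karlin-Altschul form)."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def effective_search_space(query_len, db_len, length_adjustment, num_seqs=1):
    """Product of the effective query and database lengths.

    Assumption: the length adjustment is subtracted once from the query
    and once per database sequence.
    """
    eff_query = query_len - length_adjustment
    eff_db = db_len - num_seqs * length_adjustment
    return eff_query * eff_db


# Values copied from the report above (gapped lambda and K are rounded there).
bits = bit_score(3267, 0.267, 0.0410)
space = effective_search_space(1209, 1221, 47)
print(round(bits), space)
```

With the rounded parameters the bit score comes out within a fraction of a bit of the reported 1263, and the product of the effective lengths (1162 and 1174) matches the reported search space.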

Align candidate 3608920 Dshi_2311 (delta-1-pyrroline-5-carboxylate dehydrogenase (RefSeq))
to HMM TIGR01238 (delta-1-pyrroline-5-carboxylate dehydrogenase (EC 1.2.1.88))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR01238.hmm
# target sequence database:        /tmp/gapView.28029.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR01238  [M=500]
Accession:   TIGR01238
Description: D1pyr5carbox3: delta-1-pyrroline-5-carboxylate dehydrogenase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                         Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                         -----------
   1.2e-194  633.4   3.2   7.9e-194  630.7   0.1    2.1  2  lcl|FitnessBrowser__Dino:3608920  Dshi_2311 delta-1-pyrroline-5-ca


Domain annotation for each sequence (and alignments):
>> lcl|FitnessBrowser__Dino:3608920  Dshi_2311 delta-1-pyrroline-5-carboxylate dehydrogenase (RefSeq)
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  630.7   0.1  7.9e-194  7.9e-194       1     498 [.     516    1008 ..     516    1010 .. 0.99
   2 !    2.9   0.6    0.0017    0.0017     158     272 ..    1089    1190 ..    1075    1198 .. 0.82

  Alignments for each domain:
  == domain 1  score: 630.7 bits;  conditional E-value: 7.9e-194
                         TIGR01238    1 dlygegrknslGvdlaneselksleeqllkaaakkfqaapivgekakaegeaqpvknpadrkdivGqvseadaae 75  
                                        dl++  r+ns G  ++    + +l +  ++ a+ +++a p++++     g ++ v +pad+  +vG+v+ea+a++
  lcl|FitnessBrowser__Dino:3608920  516 DLFAPDRRNSRGYRVNEPASILPLMTAREAFAETTWHARPMLAGGRDPTGPTREVHSPADKTRLVGTVQEASAED 590 
                                        799*9**************************************99999*************************** PP

                         TIGR01238   76 vqeavdsavaafaewsatdakeraailerladlleshmpelvallvreaGktlsnaiaevreavdflryyakqve 150 
                                        v  a+d+a + f  wsa +  era +l++lad+ e ++ el a++ reaGkt+ + iaevreavdflr+ya+++e
  lcl|FitnessBrowser__Dino:3608920  591 VACALDAAETGFRDWSARPVSERADMLRKLADMYEDNIAELTAITTREAGKTVLDGIAEVREAVDFLRFYANEAE 665 
                                        **************************************************************************8 PP

                         TIGR01238  151 dvldeesakalGavvcispwnfplaiftGqiaaalaaGntviakpaeqtsliaaravellqeaGvpagviqllpG 225 
                                           +e+  +++G +vcispwnfplaiftGqiaaal+ Gn+v+akpaeqt++iaarav+++++ G+p +++qllpG
  lcl|FitnessBrowser__Dino:3608920  666 RLEEEDPGRPRGIFVCISPWNFPLAIFTGQIAAALVMGNAVLAKPAEQTPIIAARAVQMMRDCGLPDAALQLLPG 740 
                                        88888999******************************************************************* PP

                         TIGR01238  226 rGedvGaaltsderiaGviftGstevarlinkalakredapvpliaetGGqnamivdstalaeqvvadvlasafd 300 
                                         G  vG  ltsd+riaGv ftGsteva  i+kalak   ++++l+aetGG namivdstal eq v+d+l s+f+
  lcl|FitnessBrowser__Dino:3608920  741 DGPMVGGPLTSDPRIAGVCFTGSTEVAMIIHKALAKNAGPEAVLVAETGGLNAMIVDSTALHEQAVRDILISSFQ 815 
                                        *************************************************************************** PP

                         TIGR01238  301 saGqrcsalrvlcvqedvadrvltlikGamdelkvgkpirlttdvGpvidaeakqnllahiekmkakakkvaqvk 375 
                                        saGqrcsalr+l+vqedv d++++++ Ga+d l++g   +l  dv pvida+a+ ++l +i++ ++ +  ++ + 
  lcl|FitnessBrowser__Dino:3608920  816 SAGQRCSALRILYVQEDVHDKLMEMLSGALDALVIGDSWNLDVDVSPVIDADAQSDILGYIDQHRKAGTLIKTLA 890 
                                        ****************************************************************99999998887 PP

                         TIGR01238  376 leddvesekgtfvaptlfelddldelkkevfGpvlhvvrykadeldkvvdkinakGygltlGvhsrieetvrqie 450 
                                          d      gt+v+p ++++  ++++++e+fGpvlhv  +ka+e+d+vvd ina+ yglt+G+h+ri++ v qi 
  lcl|FitnessBrowser__Dino:3608920  891 APD-----SGTYVTPAIVKVGGIADMEREIFGPVLHVATFKANEIDQVVDAINARRYGLTFGLHTRIDDRVEQIV 960 
                                        777.....8****************************************************************** PP

                         TIGR01238  451 krakvGnvyvnrnlvGavvGvqpfGGeGlsGtGpkaGGplylyrltrv 498 
                                        +r++vGnvyvnrn++Ga+vG qpfGGeGlsGtGpkaGGplyl r+ +v
  lcl|FitnessBrowser__Dino:3608920  961 ERIQVGNVYVNRNQIGAIVGSQPFGGEGLSGTGPKAGGPLYLTRFRKV 1008
                                        ********************************************9886 PP

  == domain 2  score: 2.9 bits;  conditional E-value: 0.0017
                         TIGR01238  158 akalGavvcispwnfplaiftGqiaaalaaGntviakpaeqtsliaaravellqeaGvpagviqllpGrGedvGa 232 
                                        + ++G+v+c+ p      i + q + al aG +v+      t+l      + l +aG p  ++    G  +    
  lcl|FitnessBrowser__Dino:3608920 1089 MVPRGTVLCLGPTP---EIAMAQAVQALGAGCAVVIALPGSTPLS-----QPLSDAGAPVVTL---DGTVD--CV 1150
                                        55688888888853...5778899999999998765555678985.....45889**998775...57777..56 PP

                         TIGR01238  233 altsderiaGviftGstevarlinkalakredapvpliae 272 
                                        +lt+ + i+ v+ +G+++ +r ++ al++r+ + +pli +
  lcl|FitnessBrowser__Dino:3608920 1151 TLTELTGIEVVAAAGASDWTRTLRVALSQRDGPIIPLIVD 1190
                                        8*******************************99999975 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (500 nodes)
Target sequences:                          1  (1221 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.03u 0.01s 00:00:00.04 Elapsed: 00:00:00.03
# Mc/sec: 17.94
//
[ok]
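
To read the domain table above programmatically, a minimal sketch such as the following may help. It is not part of GapMind or HMMER. It assumes each domain row splits on whitespace into the 16 columns shown above, and it defines model coverage as the aligned fraction of the HMM (M=500 for TIGR01238), which is an illustrative choice. The function names are hypothetical.

```python
def parse_domain_row(row):
    """Parse one row of the per-domain table printed by hmmsearch.

    Assumes the whitespace-separated column layout shown in the report.
    """
    f = row.split()
    return {
        "score": float(f[2]),
        "i_evalue": float(f[5]),
        "hmm_from": int(f[6]),
        "hmm_to": int(f[7]),
        "ali_from": int(f[9]),
        "ali_to": int(f[10]),
    }


def model_coverage(dom, model_len):
    """Fraction of the HMM's match states spanned by the domain alignment."""
    return (dom["hmm_to"] - dom["hmm_from"] + 1) / model_len


# The two domain rows, copied from the report above.
rows = [
    "   1 !  630.7   0.1  7.9e-194  7.9e-194       1     498 [.     516    1008 ..     516    1010 .. 0.99",
    "   2 !    2.9   0.6    0.0017    0.0017     158     272 ..    1089    1190 ..    1075    1198 .. 0.82",
]
domains = [parse_domain_row(r) for r in rows]
for d in domains:
    print(d["ali_from"], d["ali_to"], round(model_coverage(d, 500), 3))
```

Under these assumptions, domain 1 spans nearly the whole model, while the weak second domain spans less than a quarter of it.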

This GapMind analysis is from Sep 17 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMER with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."
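
As a worked illustration of the 50% coverage threshold, the sketch below computes coverage and percent identity for the BLAST alignment shown at the top of this page. It is not GapMind's code. It assumes that coverage is measured as the aligned fraction of the characterized (query) protein and that percent identity uses the alignment length reported by BLAST as its denominator. The function names are hypothetical.

```python
def coverage(start, end, length):
    """Fraction of a sequence spanned by an alignment (1-based, inclusive)."""
    return (end - start + 1) / length


def percent_identity(identities, alignment_length):
    """Percent of aligned columns that are identical."""
    return 100.0 * identities / alignment_length


# Values from the alignment above: query residues 15-1208 of 1209,
# with 665 identities over 1209 aligned columns.
query_cov = coverage(15, 1208, 1209)
identity = percent_identity(665, 1209)
meets_coverage_threshold = query_cov >= 0.5

print(round(identity), round(100 * query_cov, 1), meets_coverage_threshold)
```

Under these assumptions the alignment covers well over half of the characterized protein, so it clears the 50% coverage bar. The additional criteria for high or medium confidence are not reproduced in this sketch.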

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory