GapMind for catabolism of small carbon sources

 

Alignments for a candidate for putA in Thiohalomonas denitrificans HLD2

Align L-glutamate gamma-semialdehyde dehydrogenase (EC 1.2.1.88); Proline dehydrogenase (EC 1.5.5.2) (characterized)
to candidate WP_092995715.1 BLP65_RS09020 bifunctional proline dehydrogenase/L-glutamate gamma-semialdehyde dehydrogenase PutA

Query= reanno::SB2B:6938573
         (1058 letters)



>NCBI__GCF_900102855.1:WP_092995715.1
          Length = 1045

 Score =  994 bits (2569), Expect = 0.0
 Identities = 532/1034 (51%), Positives = 692/1034 (66%), Gaps = 9/1034 (0%)

Query: 26   QNYIVDEEAYLKELIALVPSSDEEIARITSRAHDLVAKVRQYEKKGLMVGIDAFLQQYSL 85
            ++++ DE   +++L+     S E   RI +RA  LV  VR+  +  L   +DAFL +Y L
Sbjct: 20   ESHLADEARCIEQLLDAASFSPEVSQRIEARAAALVRTVRKSRRPQLP--LDAFLGEYGL 77

Query: 86   ETQEGIILMCLAEALLRIPDAETADALIADKLSGAKWDEHMSKSDSVLVNASTWGLMLTG 145
            +++EG++LMCLAE LLRIPD  TAD LI DKL  A WD H+ +S S LVNASTWGL+LTG
Sbjct: 78   DSEEGVLLMCLAEGLLRIPDPATADRLIRDKLVPAHWDAHIGRSPSPLVNASTWGLLLTG 137

Query: 146  KIVQLDKNL-DGTPSNLLSRLVNRLGEPVIRQAMYAAMKIMGKQFVLGRTIEEGLKNAAE 204
            ++V L ++  + +P  +LSR++ R  EP++R A+  AM ++ +QFV+GR+I+E L  +AE
Sbjct: 138  RLVALSRDGGEPSPGRVLSRMLARSSEPMLRGALNGAMHLLARQFVMGRSIDEALHRSAE 197

Query: 205  KRKLGYTHSYDMLGEAALTMKDADKYYRDYANAIQALGTAKFDESEAPRPTISIKLSALH 264
                 Y +SYDMLGEAALT  DA  Y   YANAI A+G     E  A RP ISIKLSALH
Sbjct: 198  GDAARYRYSYDMLGEAALTRDDAANYRDAYANAIVAVGGTTSGELLA-RPGISIKLSALH 256

Query: 265  PRYEVANEDRVMTELYATLIKLIEQARSLNVGIQIDAEEVDRLELSLKLFKKLYQSDAAK 324
            PRYE A   R + EL   L +L   A    V + IDAEE  RLELSL++F++        
Sbjct: 257  PRYEYAQRCRTVPELAGVLAELAALAHEHGVPLTIDAEEAHRLELSLEVFQRTLADPRLV 316

Query: 325  GWGLLGIVVQAYSKRALPVLMWLTRLAKEQGDEIPLRLVKGAYWDSELKWAQQAGEAGYP 384
            G   LG+ VQAY KRA  VL WL+ LA+     IP+RLVKGAYWD E+K AQQ G  GYP
Sbjct: 317  GSESLGLAVQAYQKRAPAVLEWLSTLARTLRHRIPVRLVKGAYWDYEIKQAQQLGLPGYP 376

Query: 385  LFTRKAATDVSYLACARYLLSEATRGVIYPQFASHNAQTVAAITAMVGDRKFEFQRLHGM 444
            ++TRKA +DV+YLACAR+L +      +YPQFA+HNA TVAAI  +   R FEFQRLHGM
Sbjct: 377  VYTRKAHSDVAYLACARHLFASDR---LYPQFATHNAHTVAAILYLADGRPFEFQRLHGM 433

Query: 445  GQELYDTVLAEAAVPTVRIYAPIGAHKDLLPYLVRRLLENGANTSFVHKLVDPKTPIESL 504
            G+ LY+ ++ +  V   R+YAP+G+H++LLPYLVRRLLENGANTSFVH++ DP  P+E L
Sbjct: 434  GEALYNDLVRDG-VAACRVYAPVGSHQELLPYLVRRLLENGANTSFVHRIADPDIPVERL 492

Query: 505  VTHPLKTLQGYKTLANNKIVKPADIFGAERKNSKGLNMNIISESEPFFAALEKFKDTQWS 564
               P + +    T  N  I  P  ++GAER+NS+GL ++   +    +AAL       W+
Sbjct: 493  AADPAQRIHTRGTSPNPAIPLPDALYGAERRNSRGLALDNCRDQTALYAALHGQLQHDWN 552

Query: 565  AGPLVNGETLSGEVRDVVSPYNTTLKVGQVAFANEATIEQAIAGADKAFASWCRTPVETR 624
            A P++ GE +SG VR V SP++   +VGQV  A+     +A+  A     +W  TPV  R
Sbjct: 553  AAPMIGGEMISGRVRPVTSPHDRQCRVGQVWEADAQIARRAMESARSGARAWSATPVAER 612

Query: 625  ANALQKLADLLEENREELIALCTREAGKSIQDGIDEVREAVDFCRYYAVQAKKMMSKPEL 684
            + AL++ ADL+E N   LIALC RE G+++ D + +VREAVDF RYYA  A+  ++ P  
Sbjct: 613  SEALRRAADLMESNYPTLIALCIREGGRTLPDALADVREAVDFLRYYATLAEHDLA-PRT 671

Query: 685  LPGPTGELNELFLQGRGVFVCISPWNFPLAIFLGQVAAALATGNTVIAKPAEQTCLIGFR 744
            LPGPTGE N L  +GRGV  CISPWNFP+AIF GQ+AAAL  GN V+AKPA QT L+G +
Sbjct: 672  LPGPTGEENRLLYEGRGVIACISPWNFPVAIFTGQIAAALVAGNAVVAKPAHQTPLVGMQ 731

Query: 745  AVQLAHEAGIPKDVLQFLPGTGAVVGAKLTSDERIGGVCFTGSTTTAKVINRALAGRDGA 804
              +L H+AG+P  VL +LPG  A V   L S   + GV FTGST +A+ I R LA RDG 
Sbjct: 732  ICRLLHQAGVPPAVLHYLPGPSAAVAPTLLSHPALAGVLFTGSTASARHIQRTLAARDGP 791

Query: 805  IIPLIAETGGQNAMVVDSTSQPEQVVNDVVSSAFTSAGQRCSALRVLYLQEDIAERVLDV 864
            ++PLIAETGG NAMVVDS++ PEQVV DV+ SAF SAGQRCSALR+L LQE+ AERVL +
Sbjct: 792  LLPLIAETGGVNAMVVDSSALPEQVVVDVLRSAFNSAGQRCSALRLLCLQEECAERVLGL 851

Query: 865  LKGAMDELTLGNPGSVKTDVGPVIDAAAKANLNAHIDHIKQVGRLIHQLSLPEGTENGHF 924
            L GA+ EL +G+P    TD+GPVID AA+  L+ HI  +++  RL+   +LP     G +
Sbjct: 852  LSGALAELHIGDPLEPDTDIGPVIDEAARDTLDRHIRRMEKQQRLVACAALPPAAAKGCY 911

Query: 925  VAPTAVEIDSIKVLTKENFGPILHVVRYKAAGLQKVIDDINSTGFGLTLGIHSRNEGHAL 984
            VAP   E+D   VL +E FGPILHVVRY+   L+ ++D IN++G GLT GIHSR +    
Sbjct: 912  VAPCIFELDDAGVLREEVFGPILHVVRYRGGDLEALLDTINASGHGLTFGIHSRVDATVR 971

Query: 985  EVADKVNVGNVYINRNQIGAVVGVQPFGGQGLSGTGPKAGGPHYLTRFVTEKTRTNNITA 1044
             VA+++  GNVYINR+ IGA VGVQPFGG+GLSGTGPKAGGP YL   V EK+   N  A
Sbjct: 972  RVAERIGAGNVYINRDIIGATVGVQPFGGRGLSGTGPKAGGPDYLRPLVQEKSIATNTAA 1031

Query: 1045 IGGNATLLSLGDAD 1058
            +GGN  LL+  D +
Sbjct: 1032 VGGNTGLLAGSDEE 1045


Lambda     K      H
   0.317    0.134    0.380 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 2531
Number of extensions: 105
Number of successful extensions: 6
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 1058
Length of database: 1045
Length adjustment: 45
Effective length of query: 1013
Effective length of database: 1000
Effective search space:  1013000
Effective search space used:  1013000
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 58 (26.9 bits)
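
The summary figures in the BLAST report above can be cross-checked by hand. The following is a minimal Python sketch, not part of GapMind, that recomputes the percentages, the bit score, and the effective search space from the numbers printed in the report. It assumes the standard Karlin-Altschul conversion from raw score to bits, using the gapped Lambda and K shown above; the variable and function names are illustrative only.

```python
import math

# Values copied from the BLAST report above.
raw_score = 2569
identities, positives, gaps, aln_len = 532, 692, 9, 1034
gapped_lambda, gapped_k = 0.267, 0.0410
query_len, db_len, length_adjustment = 1058, 1045, 45


def pct(n: int, total: int) -> int:
    """Integer percentage, truncated (692/1034 is 66.9% but is reported as 66%)."""
    return (100 * n) // total


# Assumed conversion: bits = (lambda * S - ln K) / ln 2
bit_score = (gapped_lambda * raw_score - math.log(gapped_k)) / math.log(2)

# Effective lengths subtract the length adjustment from each sequence.
eff_query = query_len - length_adjustment
eff_db = db_len - length_adjustment
search_space = eff_query * eff_db

# Expected number of chance hits with at least this score.
e_value = search_space * 2.0 ** (-bit_score)

print(f"identity {pct(identities, aln_len)}%, "
      f"positives {pct(positives, aln_len)}%, gaps {pct(gaps, aln_len)}%")
print(f"bit score ~ {bit_score:.1f}")
print(f"effective search space = {search_space}")
print(f"E-value ~ {e_value:.1e}")
```

Because Lambda and K are printed to only three significant digits, the recomputed bit score agrees with the reported 994 bits only after rounding. The E-value is far too small to be shown as anything other than 0.0 in the report.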

Align candidate WP_092995715.1 BLP65_RS09020 (bifunctional proline dehydrogenase/L-glutamate gamma-semialdehyde dehydrogenase PutA)
to HMM TIGR01238 (delta-1-pyrroline-5-carboxylate dehydrogenase (EC 1.2.1.88))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR01238.hmm
# target sequence database:        /tmp/gapView.24453.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR01238  [M=500]
Accession:   TIGR01238
Description: D1pyr5carbox3: delta-1-pyrroline-5-carboxylate dehydrogenase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                                 Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                                 -----------
   1.3e-200  653.1   0.0   1.6e-200  652.8   0.0    1.1  1  lcl|NCBI__GCF_900102855.1:WP_092995715.1  BLP65_RS09020 bifunctional proli


Domain annotation for each sequence (and alignments):
>> lcl|NCBI__GCF_900102855.1:WP_092995715.1  BLP65_RS09020 bifunctional proline dehydrogenase/L-glutamate gamma-semialde
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  652.8   0.0  1.6e-200  1.6e-200       2     498 ..     517    1022 ..     516    1024 .. 0.98

  Alignments for each domain:
  == domain 1  score: 652.8 bits;  conditional E-value: 1.6e-200
                                 TIGR01238    2 lygegrknslGvdlaneselksleeqllkaaakkfqaapivgekakaegeaqpvknpadrkdivGqv 68  
                                                lyg  r+ns G+ l+n +    l + l+ +++++++aap++g++   +g ++pv++p dr+  vGqv
  lcl|NCBI__GCF_900102855.1:WP_092995715.1  517 LYGAERRNSRGLALDNCRDQTALYAALHGQLQHDWNAAPMIGGE-MISGRVRPVTSPHDRQCRVGQV 582 
                                                89***************************************765.56899***************** PP

                                 TIGR01238   69 seadaaevqeavdsavaafaewsatdakeraailerladlleshmpelvallvreaGktlsnaiaev 135 
                                                 eada+ +++a++sa + +  wsat+ +er+  l+r+adl+es+ p+l+al++re G+tl +a+a+v
  lcl|NCBI__GCF_900102855.1:WP_092995715.1  583 WEADAQIARRAMESARSGARAWSATPVAERSEALRRAADLMESNYPTLIALCIREGGRTLPDALADV 649 
                                                ******************************************************************* PP

                                 TIGR01238  136 reavdflryyakqvedvldeesaka............lGavvcispwnfplaiftGqiaaalaaGnt 190 
                                                reavdflryya  +e++l   +               +G++ cispwnfp+aiftGqiaaal+aGn+
  lcl|NCBI__GCF_900102855.1:WP_092995715.1  650 REAVDFLRYYATLAEHDLAPRTLPGptgeenrllyegRGVIACISPWNFPVAIFTGQIAAALVAGNA 716 
                                                ******************999877689999************************************* PP

                                 TIGR01238  191 viakpaeqtsliaaravellqeaGvpagviqllpGrGedvGaaltsderiaGviftGstevarlink 257 
                                                v+akpa qt+l+ ++   ll++aGvp++v+  lpG  + v  +l s++++aGv+ftGst+ ar i++
  lcl|NCBI__GCF_900102855.1:WP_092995715.1  717 VVAKPAHQTPLVGMQICRLLHQAGVPPAVLHYLPGPSAAVAPTLLSHPALAGVLFTGSTASARHIQR 783 
                                                ******************************************************************* PP

                                 TIGR01238  258 alakredapvpliaetGGqnamivdstalaeqvvadvlasafdsaGqrcsalrvlcvqedvadrvlt 324 
                                                +la r+ +  pliaetGG nam+vds+al+eqvv+dvl+saf+saGqrcsalr+lc+qe+ a+rvl 
  lcl|NCBI__GCF_900102855.1:WP_092995715.1  784 TLAARDGPLLPLIAETGGVNAMVVDSSALPEQVVVDVLRSAFNSAGQRCSALRLLCLQEECAERVLG 850 
                                                ******************************************************************* PP

                                 TIGR01238  325 likGamdelkvgkpirlttdvGpvidaeakqnllahiekmkakakkvaqvkleddvesekgtfvapt 391 
                                                l+ Ga+ el++g p +  td+Gpvid+ a++ l +hi++m+++++ va + l    ++ kg +vap 
  lcl|NCBI__GCF_900102855.1:WP_092995715.1  851 LLSGALAELHIGDPLEPDTDIGPVIDEAARDTLDRHIRRMEKQQRLVACAALPP--AAAKGCYVAPC 915 
                                                ************************************************999988..899******** PP

                                 TIGR01238  392 lfelddldelkkevfGpvlhvvrykadeldkvvdkinakGygltlGvhsrieetvrqiekrakvGnv 458 
                                                +feldd   l++evfGp+lhvvry+  +l+ ++d ina+G+glt+G+hsr++ tvr++ +r+ +Gnv
  lcl|NCBI__GCF_900102855.1:WP_092995715.1  916 IFELDDAGVLREEVFGPILHVVRYRGGDLEALLDTINASGHGLTFGIHSRVDATVRRVAERIGAGNV 982 
                                                ******************************************************************* PP

                                 TIGR01238  459 yvnrnlvGavvGvqpfGGeGlsGtGpkaGGplylyrltrv 498 
                                                y+nr+++Ga vGvqpfGG+GlsGtGpkaGGp yl  l++ 
  lcl|NCBI__GCF_900102855.1:WP_092995715.1  983 YINRDIIGATVGVQPFGGRGLSGTGPKAGGPDYLRPLVQE 1022
                                                **********************************999875 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (500 nodes)
Target sequences:                          1  (1045 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.02u 0.01s 00:00:00.03 Elapsed: 00:00:00.02
# Mc/sec: 17.94
//
[ok]
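
The domain table in the hmmsearch output above shows how much of the TIGR01238 model the hit covers. The following is a minimal Python sketch, not GapMind's own code, that parses the single domain row and computes the fraction of the model's 500 positions that fall within the aligned region. The row layout is taken from the column header printed above; the `Domain` type and `parse_domain_row` helper are hypothetical names introduced for illustration.

```python
from typing import NamedTuple


class Domain(NamedTuple):
    score: float
    i_evalue: float
    hmm_from: int
    hmm_to: int
    ali_from: int
    ali_to: int
    acc: float


def parse_domain_row(row: str) -> Domain:
    """Parse one row of the per-domain table printed by hmmsearch."""
    f = row.split()
    # Fields: # ! score bias c-Evalue i-Evalue hmmfrom hmmto .. alifrom alito .. envfrom envto .. acc
    return Domain(
        score=float(f[2]),
        i_evalue=float(f[5]),
        hmm_from=int(f[6]),
        hmm_to=int(f[7]),
        ali_from=int(f[9]),
        ali_to=int(f[10]),
        acc=float(f[15]),
    )


# Row and model length (M=500) copied from the output above.
row = ("   1 !  652.8   0.0  1.6e-200  1.6e-200       2     498 .. "
       "    517    1022 ..     516    1024 .. 0.98")
model_len = 500

dom = parse_domain_row(row)
hmm_coverage = (dom.hmm_to - dom.hmm_from + 1) / model_len
target_span = dom.ali_to - dom.ali_from + 1

print(f"model coverage: {hmm_coverage:.1%}")
print(f"aligned target residues: {dom.ali_from}-{dom.ali_to} ({target_span} residues)")
```

The aligned region lies in the C-terminal half of the 1045-residue candidate, which is consistent with the model describing only the dehydrogenase portion of the bifunctional PutA protein.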

This GapMind analysis is from Sep 24 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."
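
The 50% coverage threshold above can be illustrated with a short Python sketch. It is not GapMind's implementation: it assumes that coverage is measured as the fraction of the characterized (query) protein spanned by the alignment, and it checks only the low-confidence threshold, because the high- and medium-confidence criteria are not listed on this page. The function names are hypothetical.

```python
def coverage(aln_start: int, aln_end: int, seq_len: int) -> float:
    """Fraction of a sequence spanned by an alignment (1-based, inclusive)."""
    return (aln_end - aln_start + 1) / seq_len


def meets_low_confidence_coverage(aln_start: int, aln_end: int,
                                  seq_len: int, min_cov: float = 0.5) -> bool:
    """True if the hit covers at least min_cov of the sequence."""
    return coverage(aln_start, aln_end, seq_len) >= min_cov


# Query coordinates from the alignment on this page:
# residues 26-1058 of a 1058-residue characterized protein.
cov = coverage(26, 1058, 1058)
print(f"coverage of characterized protein: {cov:.1%}")
print("meets 50% threshold:", meets_low_confidence_coverage(26, 1058, 1058))
```

For the putA candidate on this page, the alignment spans nearly all of the characterized protein, so it clears the 50% threshold by a wide margin.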

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory