GapMind for catabolism of small carbon sources

Alignments for a candidate for putA in Herbaspirillum autotrophicum IAM 14942

Align L-glutamate gamma-semialdehyde dehydrogenase (EC 1.2.1.88); Proline dehydrogenase (EC 1.5.5.2) (characterized)
to candidate WP_050461642.1 AKL27_RS04800 trifunctional transcriptional regulator/proline dehydrogenase/L-glutamate gamma-semialdehyde dehydrogenase

Query= reanno::HerbieS:HSERO_RS00905
         (1230 letters)



>NCBI__GCF_001189915.1:WP_050461642.1
          Length = 1218

 Score = 1847 bits (4785), Expect = 0.0
 Identities = 956/1230 (77%), Positives = 1030/1230 (83%), Gaps = 12/1230 (0%)

Query: 1    MTHVASAAVAAPFGGFQAELLPTPSPLRAAITAAYRRDEREAVQWLLQQVQEEQPWKDAT 60
            MT  +  + +  F  FQ  LLP P+PL+AAIT+ YRRDE  AVQWLL Q+Q    W DAT
Sbjct: 1    MTATSQISPSPAFEAFQQALLPAPTPLQAAITSVYRRDETAAVQWLLTQIQSNDQWTDAT 60

Query: 61   QQLARKLVQQVREKRTRSSGVDALMHEFSLSSEEGVALMCLAEALLRIPDRQTADRLIAD 120
              LA  LVQ VR KRTR+SGVDALMHEFSLSSEEGVALMCLAEALLRIPD QTADRLIAD
Sbjct: 61   HTLAHTLVQAVRTKRTRASGVDALMHEFSLSSEEGVALMCLAEALLRIPDHQTADRLIAD 120

Query: 121  KISKGDWRKHLGESPSLFVNAATWGLLITGKLVSTSSESGLTQAITRLIGKGGEPLIRKG 180
            KISKGDWR+HLGESPSLFVNAATWGLLITGKLVST+SE+ LT A+TRLI KGGEPLIRKG
Sbjct: 121  KISKGDWRRHLGESPSLFVNAATWGLLITGKLVSTNSENSLTSALTRLINKGGEPLIRKG 180

Query: 181  VDLAMRMLGNQFVTGQTIEEALDNSRENEKRGYRYSYDMLGEAALTMHDADAYYQSYESA 240
            VDLAMRMLGNQFVTGQTI EAL NSRENEKRGYRYSYDMLGEAALT HDAD YY+ YE A
Sbjct: 181  VDLAMRMLGNQFVTGQTIAEALSNSRENEKRGYRYSYDMLGEAALTGHDADYYYRCYEEA 240

Query: 241  IHAIGRASNGRGIKDGPGISVKLSALHPRYSRAQHARVMSELLPRLKQLLLLAKQYDIGL 300
            IHAIGRASNGRGIKDGPGISVKLSALHPRYSRAQHARVM ELLPRLKQLL+LAK Y+IGL
Sbjct: 241  IHAIGRASNGRGIKDGPGISVKLSALHPRYSRAQHARVMGELLPRLKQLLVLAKSYNIGL 300

Query: 301  NIDAEEADRLELSLDMMEVLVADPDLAGFDGLGFVVQGYQKRCPFVIDYLVDLARRNGRR 360
            NIDAEEADRLELSL++MEVL  DPDL GF+G+GFVVQ YQKRCPFVIDYLVDLARR+GR+
Sbjct: 301  NIDAEEADRLELSLNLMEVLAFDPDLDGFEGIGFVVQAYQKRCPFVIDYLVDLARRSGRK 360

Query: 361  LMIRLVKGAYWDSEIKRAQVDGLEGYPVYTRKVHTDLSYLTCAQKLLAATDVIYPQFATH 420
            LM+RLVKGAYWDSEIKRAQVDGL GYPVYTRKVHTDLSYL CAQKLLA+T VIYPQFATH
Sbjct: 361  LMVRLVKGAYWDSEIKRAQVDGLSGYPVYTRKVHTDLSYLVCAQKLLASTAVIYPQFATH 420

Query: 421  NAHTLAAIYHWARQHQIDNYEFQCLHGMGETLYDQVVGPDNLGKACRVYAPVGSHQTLLA 480
            NAHTLAAIY WA+QHQI +YEFQCLHGMGETLYDQVVG D LGKACR+YAPVGSHQTLLA
Sbjct: 421  NAHTLAAIYTWAQQHQITDYEFQCLHGMGETLYDQVVGDDKLGKACRIYAPVGSHQTLLA 480

Query: 481  YLVRRLLENGANSSFVNQIVDEAVPLDRLVGDPIETVRAQGGLPHPAIAVPHRLYGEERK 540
            YLVRRLLENGANSSFVNQIVDEAV +D L+ DP+E VR  GG  H  I +P  LYGEERK
Sbjct: 481  YLVRRLLENGANSSFVNQIVDEAVAIDTLIADPVEAVRQTGGAAHAQIPLPRDLYGEERK 540

Query: 541  NSAGIDLSNEDRLQQLGQLFISMADRQWQAAPLLAADTAAQSAQAAQLVRNPADLREVVG 600
            NS GIDLSNED L+ LG+ F  +A  +WQA PLL     A        + NPAD R+ VG
Sbjct: 541  NSGGIDLSNEDSLRDLGREFAQLAGHEWQAGPLLQGRVTATPDLD---ILNPADHRDRVG 597

Query: 601  QVSEATVADVDTALRAATDYAPQWQSTPATERAAMLERAADLLEEHIAELMALAVREAGK 660
            +V+EA   DV+TAL AATDYAPQWQ+ PATERA MLERAADLLE+H  ELMALAVREAGK
Sbjct: 598  RVAEANGDDVETALAAATDYAPQWQALPATERAQMLERAADLLEQHRVELMALAVREAGK 657

Query: 661  SLPNAIAEVREAVDFLRYYAIASRHDGNVLAWGPVVCISPWNFPLAIFIGEVSAALAAGN 720
            SLPNAIAEVREAVDFLRYYA+ S+ DGNVLAWGPVVCISPWNFPLAIF+GE+SAALAAGN
Sbjct: 658  SLPNAIAEVREAVDFLRYYAVQSQRDGNVLAWGPVVCISPWNFPLAIFVGEISAALAAGN 717

Query: 721  VVLAKPAEQTALIAHRAVQLLHEAGIPRAALQLLPGRGETVGAALTSDVRVKGVIFTGST 780
            VVLAKPAEQT LIAHRAVQLLH AGIP AALQ LPGRGETVGA LTSD RVKGVIFTGST
Sbjct: 718  VVLAKPAEQTPLIAHRAVQLLHAAGIPLAALQFLPGRGETVGARLTSDERVKGVIFTGST 777

Query: 781  EVAQLINRTLAQRQHDDGDGSGEHGEVPLIAETGGQNALIVDSSALAEQVVQDVLSSAFD 840
            EVAQLINRTLA+R  D      EHG++PLIAETGGQNALIVDSSAL EQVVQDVLSSAFD
Sbjct: 778  EVAQLINRTLAKRVKD------EHGDLPLIAETGGQNALIVDSSALPEQVVQDVLSSAFD 831

Query: 841  SAGQRCSALRILCLQEDIADRTLAMLKGAMAELRVGRPDRLSIDIGPVIDAEARQNLLDH 900
            SAGQRCSALR+LCLQ DIAD+T+ MLKGAM ELR+GRPDRL  DIGPVIDAEAR  LL H
Sbjct: 832  SAGQRCSALRVLCLQHDIADKTMTMLKGAMEELRIGRPDRLVTDIGPVIDAEARDQLLAH 891

Query: 901  IERMRASARAVHQLPLGEECQHGTFVAPTVIEIDDLAQLQREVFGPVLHVLRYRRDALPQ 960
            IE+MR     V+QLPL E+  +GTFV PTVIEID LAQLQREVFGPVLHVLRYRRD LPQ
Sbjct: 892  IEQMRGQGCKVYQLPLDEQTGYGTFVPPTVIEIDSLAQLQREVFGPVLHVLRYRRDELPQ 951

Query: 961  LIDAINATGYGLTLGVHSRIDETIEFVAQRAHVGNIYVNRNIVGAVVGVQPFGGEGKSGT 1020
            LIDAINATGYGLTLGVHSRIDETI FVA RAHVGNIYVNRNIVGAVVGVQPFGGEGKSGT
Sbjct: 952  LIDAINATGYGLTLGVHSRIDETINFVAGRAHVGNIYVNRNIVGAVVGVQPFGGEGKSGT 1011

Query: 1021 GPKAGGPLYLKRLQRNAQLHEELTRAQPADVPNALLDSLLDWARTHGHERLAANGQRYHR 1080
            GPKAGGPLYLKRLQR  Q    LT       P+  L  LL+WA THGH +LA  G  Y  
Sbjct: 1012 GPKAGGPLYLKRLQRQPQ---PLTPTVAEAAPSTTLAVLLEWAATHGHAKLATLGNSYVA 1068

Query: 1081 DSLLQRSLVLPGPTGERNTLGFAPRGLVLCAAGSVGTLLNQLAAAFATGNTALVDERSAA 1140
            +SLL  +L LPGPTGERNTL FA RG +LC A SV  LLNQLAA  ATGNTA+VD  +A 
Sbjct: 1069 ESLLDTTLALPGPTGERNTLSFAARGKMLCIATSVPALLNQLAAVLATGNTAVVDAATAQ 1128

Query: 1141 ILPSGLPAPVRAAIRRASQLDAEPLQAALVDSHQAAHWRARLAAREGALVPLILCGEDTT 1200
             LP+GLP  VR  +  + Q     L  ALVD+     WRA+LAAR+GALVPLI+  +D  
Sbjct: 1129 YLPAGLPEMVRQQVIVSDQPGTLALHLALVDAPACVQWRAQLAARDGALVPLIVTTDDAP 1188

Query: 1201 IPLWRLLAERALCINTTAAGGNASLMTISV 1230
            I LWRL+AERALCINTTAAGGNASLMT+ V
Sbjct: 1189 IALWRLVAERALCINTTAAGGNASLMTLEV 1218


Lambda     K      H
   0.319    0.134    0.389 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 3797
Number of extensions: 143
Number of successful extensions: 5
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 1230
Length of database: 1218
Length adjustment: 47
Effective length of query: 1183
Effective length of database: 1171
Effective search space:  1385293
Effective search space used:  1385293
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.8 bits)
S2: 59 (27.3 bits)
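The statistics in the BLAST footer above can be cross-checked by hand. The following is a minimal sketch, assuming the standard Karlin-Altschul bit-score conversion, bits = (lambda * S - ln K) / ln 2, and the usual definition of effective lengths as the raw lengths minus the length adjustment. The function names are illustrative and are not part of BLAST or GapMind.

```python
import math

def bit_score(raw_score, lam, k):
    """Convert a raw alignment score S to bits: (lambda * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)

def effective_search_space(query_len, db_len, length_adjustment, num_seqs=1):
    """Product of the effective query length and the effective database length."""
    eff_query = query_len - length_adjustment
    eff_db = db_len - num_seqs * length_adjustment
    return eff_query * eff_db

# (lambda, K) pairs copied from the footer above.
UNGAPPED = (0.319, 0.134)
GAPPED = (0.267, 0.0410)

space = effective_search_space(1230, 1218, 47)  # (1230 - 47) * (1218 - 47)
s1_bits = bit_score(41, *UNGAPPED)              # footer reports S1: 41 (21.8 bits)
s2_bits = bit_score(59, *GAPPED)                # footer reports S2: 59 (27.3 bits)
hit_bits = bit_score(4785, *GAPPED)             # header reports 1847 bits (4785)

print(space)
print(round(s1_bits, 1), round(s2_bits, 1))
print(round(hit_bits, 1))
```

Because the footer prints lambda and K to only three significant figures, the recomputed score for the full alignment is close to, but not exactly, the reported 1847 bits.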

Align candidate WP_050461642.1 AKL27_RS04800 (trifunctional transcriptional regulator/proline dehydrogenase/L-glutamate gamma-semialdehyde dehydrogenase)
to HMM TIGR01238 (delta-1-pyrroline-5-carboxylate dehydrogenase (EC 1.2.1.88))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR01238.hmm
# target sequence database:        /tmp/gapView.686714.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR01238  [M=500]
Accession:   TIGR01238
Description: D1pyr5carbox3: delta-1-pyrroline-5-carboxylate dehydrogenase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                             Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                             -----------
   4.6e-235  766.7   1.5   5.6e-235  766.4   0.1    1.8  2  NCBI__GCF_001189915.1:WP_050461642.1  


Domain annotation for each sequence (and alignments):
>> NCBI__GCF_001189915.1:WP_050461642.1  
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  766.4   0.1  5.6e-235  5.6e-235       1     498 [.     533    1027 ..     533    1029 .. 0.98
   2 ?   -1.8   0.2     0.045     0.045     159     197 ..    1091    1126 ..    1085    1135 .. 0.74

  Alignments for each domain:
  == domain 1  score: 766.4 bits;  conditional E-value: 5.6e-235
                             TIGR01238    1 dlygegrknslGvdlaneselksleeqllkaaakkfqaapivgekakaegeaqpvknpadrkdivGqvsea 71  
                                            dlyge rkns G+dl+ne+ l++l  ++ + a +++qa p+++++  a  + + + npad++d vG+v ea
  NCBI__GCF_001189915.1:WP_050461642.1  533 DLYGEERKNSGGIDLSNEDSLRDLGREFAQLAGHEWQAGPLLQGRVTATPDLD-ILNPADHRDRVGRVAEA 602 
                                            89****************************************99999988876.99*************** PP

                             TIGR01238   72 daaevqeavdsavaafaewsatdakeraailerladlleshmpelvallvreaGktlsnaiaevreavdfl 142 
                                            + ++v+ a+ +a + +++w+a++a era +ler+adlle+h  el+al+vreaGk+l naiaevreavdfl
  NCBI__GCF_001189915.1:WP_050461642.1  603 NGDDVETALAAATDYAPQWQALPATERAQMLERAADLLEQHRVELMALAVREAGKSLPNAIAEVREAVDFL 673 
                                            *********************************************************************** PP

                             TIGR01238  143 ryyakqvedvldeesakalGavvcispwnfplaiftGqiaaalaaGntviakpaeqtsliaaravellqea 213 
                                            ryya q +    + ++ a G+vvcispwnfplaif+G+i+aalaaGn v+akpaeqt+lia rav+ll+ a
  NCBI__GCF_001189915.1:WP_050461642.1  674 RYYAVQSQR---DGNVLAWGPVVCISPWNFPLAIFVGEISAALAAGNVVLAKPAEQTPLIAHRAVQLLHAA 741 
                                            ****99943...344789***************************************************** PP

                             TIGR01238  214 GvpagviqllpGrGedvGaaltsderiaGviftGstevarlinkalakredap...vpliaetGGqnamiv 281 
                                            G+p +++q+lpGrGe+vGa ltsder++GviftGsteva+lin++lakr +++    pliaetGGqna+iv
  NCBI__GCF_001189915.1:WP_050461642.1  742 GIPLAALQFLPGRGETVGARLTSDERVKGVIFTGSTEVAQLINRTLAKRVKDEhgdLPLIAETGGQNALIV 812 
                                            *************************************************87766779************** PP

                             TIGR01238  282 dstalaeqvvadvlasafdsaGqrcsalrvlcvqedvadrvltlikGamdelkvgkpirlttdvGpvidae 352 
                                            ds+al+eqvv+dvl+safdsaGqrcsalrvlc+q+d+ad+++t++kGam+el++g+p rl td+Gpvidae
  NCBI__GCF_001189915.1:WP_050461642.1  813 DSSALPEQVVQDVLSSAFDSAGQRCSALRVLCLQHDIADKTMTMLKGAMEELRIGRPDRLVTDIGPVIDAE 883 
                                            *********************************************************************** PP

                             TIGR01238  353 akqnllahiekmkakakkvaqvkleddvesekgtfvaptlfelddldelkkevfGpvlhvvrykadeldkv 423 
                                            a+++llahie+m++++ kv+q+ l++  ++  gtfv+pt++e+d+l++l++evfGpvlhv+ry++del ++
  NCBI__GCF_001189915.1:WP_050461642.1  884 ARDQLLAHIEQMRGQGCKVYQLPLDE--QTGYGTFVPPTVIEIDSLAQLQREVFGPVLHVLRYRRDELPQL 952 
                                            *************************9..8999*************************************** PP

                             TIGR01238  424 vdkinakGygltlGvhsrieetvrqiekrakvGnvyvnrnlvGavvGvqpfGGeGlsGtGpkaGGplylyr 494 
                                            +d ina+GygltlGvhsri+et++++ +ra+vGn+yvnrn+vGavvGvqpfGGeG sGtGpkaGGplyl+r
  NCBI__GCF_001189915.1:WP_050461642.1  953 IDAINATGYGLTLGVHSRIDETINFVAGRAHVGNIYVNRNIVGAVVGVQPFGGEGKSGTGPKAGGPLYLKR 1023
                                            *********************************************************************** PP

                             TIGR01238  495 ltrv 498 
                                            l r 
  NCBI__GCF_001189915.1:WP_050461642.1 1024 LQRQ 1027
                                            9875 PP

  == domain 2  score: -1.8 bits;  conditional E-value: 0.045
                             TIGR01238  159 kalGavvcispwnfplaiftGqiaaalaaGntviakpae 197 
                                             a+G ++ci+     +  ++ q+aa la+Gnt +   a 
  NCBI__GCF_001189915.1:WP_050461642.1 1091 AARGKMLCIATS---VPALLNQLAAVLATGNTAVVDAAT 1126
                                            567777787653...3456899**********9887765 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (500 nodes)
Target sequences:                          1  (1218 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.01u 0.01s 00:00:00.02 Elapsed: 00:00:00.01
# Mc/sec: 38.42
//
[ok]
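To see how much of the TIGR01238 model each domain hit covers, the domain table above can be parsed directly. This is a minimal sketch that assumes the 16-column whitespace-separated layout shown in the hmmsearch output above; the two table rows and the model length (M=500) are copied from that output, and the function and field names are illustrative only.

```python
# Domain rows copied from the hmmsearch domain table above.
DOMAIN_TABLE = """\
   1 !  766.4   0.1  5.6e-235  5.6e-235       1     498 [.     533    1027 ..     533    1029 .. 0.98
   2 ?   -1.8   0.2     0.045     0.045     159     197 ..    1091    1126 ..    1085    1135 .. 0.74
"""
MODEL_LENGTH = 500  # from "Query: TIGR01238 [M=500]"

def parse_domains(table, model_length):
    """Return one dict per domain row, including the fraction of the model covered."""
    domains = []
    for line in table.splitlines():
        fields = line.split()
        # Skip anything that is not a 16-column domain row starting with an index.
        if len(fields) != 16 or not fields[0].isdigit():
            continue
        hmm_from, hmm_to = int(fields[6]), int(fields[7])
        domains.append({
            "index": int(fields[0]),
            # "!" marks a domain that passes the inclusion thresholds, "?" one that does not.
            "significant": fields[1] == "!",
            "score": float(fields[2]),
            "hmm_from": hmm_from,
            "hmm_to": hmm_to,
            "ali_from": int(fields[9]),
            "ali_to": int(fields[10]),
            "model_coverage": (hmm_to - hmm_from + 1) / model_length,
        })
    return domains

domains = parse_domains(DOMAIN_TABLE, MODEL_LENGTH)
for d in domains:
    print(d["index"], d["significant"], d["score"], f"{d['model_coverage']:.1%}")
```

Domain 1 spans model positions 1 to 498 of 500, so it covers nearly the whole model, while domain 2 is a short, non-significant fragment.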

This GapMind analysis is from Sep 24 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."
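The confidence levels depend on the identity and coverage of each hit. As a rough illustration, the sketch below computes both quantities for the putA alignment shown earlier on this page and checks only the one threshold stated here (at least 50% coverage). It assumes that coverage means the fraction of the characterized protein that is aligned; the thresholds for high and medium confidence are not given in this text, so they are not modeled. All variable names are illustrative.

```python
def percent(numerator, denominator):
    """Return numerator / denominator as a percentage."""
    return 100.0 * numerator / denominator

# Values copied from the BLAST alignment header for WP_050461642.1.
identities, positives, gaps, alignment_length = 956, 1030, 12, 1230
query_start, query_end, query_length = 1, 1230, 1230

identity_pct = percent(identities, alignment_length)
positive_pct = percent(positives, alignment_length)
gap_pct = percent(gaps, alignment_length)

# Assumption: coverage is the aligned fraction of the characterized (query) protein.
query_coverage = (query_end - query_start + 1) / query_length

# The only threshold stated in the text: at least 50% coverage.
passes_min_coverage = query_coverage >= 0.5

# BLAST prints whole-number percentages truncated toward zero.
print(int(identity_pct), int(positive_pct), int(gap_pct))
print(query_coverage, passes_min_coverage)
```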

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory