GapMind for catabolism of small carbon sources

 

Alignments for a candidate for rocA in Dyella japonica UNC79MFTsu3.2

Align L-glutamate gamma-semialdehyde dehydrogenase (EC 1.2.1.88); Proline dehydrogenase (EC 1.5.5.2) (characterized)
to candidate N515DRAFT_4232 N515DRAFT_4232 L-proline dehydrogenase /delta-1-pyrroline-5-carboxylate dehydrogenase

Query= reanno::HerbieS:HSERO_RS00905
         (1230 letters)



>FitnessBrowser__Dyella79:N515DRAFT_4232
          Length = 1074

 Score = 1081 bits (2795), Expect = 0.0
 Identities = 591/1047 (56%), Positives = 723/1047 (69%), Gaps = 37/1047 (3%)

Query: 21   LPTPS-PLRAAITAAYRRDEREAVQWLLQQVQEEQPWKDATQQLARKLVQQVREKRTRSS 79
            LPT + P RA ITAA+ RDE EAV  LL Q       ++    LA  LV +VR +    S
Sbjct: 33   LPTGAEPARARITAAWLRDETEAVNDLLVQASLPPVEREKVIDLAADLVTRVRARAKDQS 92

Query: 80   GVDALMHEFSLSSEEGVALMCLAEALLRIPDRQTADRLIADKISKGDWRKHLGESPSLFV 139
             V++ M ++ LSSEEGV LMC+AEALLRIPD+ TAD+LI DK+   +W+KHLG+S SLFV
Sbjct: 93   AVESFMRQYDLSSEEGVLLMCVAEALLRIPDKATADKLIRDKLGDANWKKHLGQSESLFV 152

Query: 140  NAATWGLLITGKLVSTSSE--SGLTQAITRLIGKGGEPLIRKGVDLAMRMLGNQFVTGQT 197
            NA+TWGL++TGKLV+ + +     T A+ RL+G+ GEP IR  V  AMR++G+QFV G+T
Sbjct: 153  NASTWGLMLTGKLVNLAGDIRHDFTGALRRLVGRAGEPAIRLAVRQAMRIMGHQFVMGRT 212

Query: 198  IEEALDNSRENEKRGYRYSYDMLGEAALTMHDADAYYQSYESAIHAIGRASNGRGIKDGP 257
            I EALD   + E   YRYSYDMLGE+ALT   A+ Y Q Y +AI AIG         D P
Sbjct: 213  IGEALDRCAQKEYAVYRYSYDMLGESALTSETAERYQQDYRNAIAAIGARGPFANHTDAP 272

Query: 258  GISVKLSALHPRYSRAQHARVMSELLPRLKQLLLLAKQYDIGLNIDAEEADRLELSLDMM 317
             ISVKLSALHPRY  A+      +L  +L +L  LA ++ I L++DAEEADRLELSLD++
Sbjct: 273  SISVKLSALHPRYEVAKRELARRDLTAKLLELSQLAMKHGIALSVDAEEADRLELSLDIL 332

Query: 318  EVLVADPDLAGFDGLGFVVQGYQKRCPFVIDYLVDLARRNGRRLMIRLVKGAYWDSEIKR 377
              + A P LAG++GLG VVQ Y KR PFVID+L++ AR +GRR  +RLVKGAYWD+E+KR
Sbjct: 333  GDVFAHPSLAGWNGLGIVVQAYSKRTPFVIDWLIETARGSGRRWYVRLVKGAYWDAEVKR 392

Query: 378  AQVDGLEGYPVYTRKVHTDLSYLTCAQKLL-AATDVIYPQFATHNAHTLAAIYHWARQHQ 436
            AQ +GL GYPVYTRK +TD+SYL CA+KL  A  ++IYPQFATHNAHT+AA++H A+   
Sbjct: 393  AQENGLPGYPVYTRKPNTDVSYLACARKLFDAGIELIYPQFATHNAHTIAAVHHLAKGRP 452

Query: 437  IDNYEFQCLHGMGETLYDQVVGPDNLGKACRVYAPVGSHQTLLAYLVRRLLENGANSSFV 496
               YE+Q LHGMG  LY +V+G  NL   CRVYAPVG+H+ LL YLVRRLLENGAN+SFV
Sbjct: 453  ---YEYQRLHGMGTDLYAEVIGAQNLNVPCRVYAPVGTHEDLLPYLVRRLLENGANTSFV 509

Query: 497  NQIVDEAVPLDRLVGDPIETVRAQGGLPHPAIAVPHRLYGEERKNSAGIDLSNEDRLQQL 556
            N++VDE++P+  LV DP ETVR+   +PHP I +P  LYGE RKNS G++ SN++ L+ L
Sbjct: 510  NRVVDESLPVRELVADPCETVRSFASIPHPRIPLPVNLYGELRKNSMGVNFSNDNELKAL 569

Query: 557  GQLFISMADRQWQAAPLLAADTAAQSAQAAQLVRNPADLREVVGQVSEATVADVDTALRA 616
             +  ++     W A PL+     A SA A   V NPAD R+VVG    A  A V+ AL  
Sbjct: 570  AET-VNAKSGPWTATPLV---PGATSAGATVQVTNPADRRQVVGSYVSADSATVEKALAN 625

Query: 617  ATDYAPQWQSTPATERAAMLERAADLLEEHIAELMALAVREAGKSLPNAIAEVREAVDFL 676
            A      W   PA  RAA+LE AA+ LE    E +AL VREAGK LP+AIAE+REA DFL
Sbjct: 626  AVAAQHGWDRLPAASRAAILEHAAEQLEARRGEFIALCVREAGKGLPDAIAEIREAADFL 685

Query: 677  RYYAIASRH-------------DGNVL---AWGPVVCISPWNFPLAIFIGEVSAALAAGN 720
            RYYA  +R              + N L     G  VCISPWNFPLAIF+G+V+AALAAGN
Sbjct: 686  RYYATMARRYFGQPEQLPGPTGESNQLFLNGRGVFVCISPWNFPLAIFLGQVAAALAAGN 745

Query: 721  VVLAKPAEQTALIAHRAVQLLHEAGIPRAALQLLPGRGETVGAALTSDVRVKGVIFTGST 780
             V+AKPAEQT+LI H AVQLLHEAG+P   LQ LPG G TVGAALT D RV GV FTGST
Sbjct: 746  SVIAKPAEQTSLIGHAAVQLLHEAGVPADVLQYLPGDGATVGAALTRDPRVAGVAFTGST 805

Query: 781  EVAQLINRTLAQRQHDDGDGSGEHGEVPLIAETGGQNALIVDSSALAEQVVQDVLSSAFD 840
            E A  INR LA R               LIAETGGQNA+I DSSAL EQ+V+D +SSAF 
Sbjct: 806  ETAWAINRALAARNAP---------IAALIAETGGQNAMIADSSALPEQIVKDAVSSAFQ 856

Query: 841  SAGQRCSALRILCLQEDIADRTLAMLKGAMAELRVGRPDRLSIDIGPVIDAEARQNLLDH 900
            SAGQRCSA R+L +QEDIAD+  AML GAMAEL+VG P +LS D+GPVID +AR+ L+DH
Sbjct: 857  SAGQRCSAARVLYVQEDIADKVCAMLAGAMAELKVGDPAQLSTDVGPVIDEDARKILVDH 916

Query: 901  IERMRASARAVHQLPLGEECQ-HGTFVAPTVIEIDDLAQLQREVFGPVLHVLRYRRDALP 959
              RM   A+ + ++ L      +GTF AP   EI  LA L RE+FGPVLHV+R++   L 
Sbjct: 917  AARMDQEAKKIGEVALDPATTGNGTFFAPRAYEIPGLATLTREIFGPVLHVIRWKGSELD 976

Query: 960  QLIDAINATGYGLTLGVHSRIDETIEFVAQRAHVGNIYVNRNIVGAVVGVQPFGGEGKSG 1019
            +++D INATGYGLTLG+HSRID+T+EF+  RA VGN YVNRN +GAVVGVQPFGGEG SG
Sbjct: 977  KVVDEINATGYGLTLGIHSRIDDTVEFIQSRARVGNCYVNRNQIGAVVGVQPFGGEGLSG 1036

Query: 1020 TGPKAGGPLYLKRLQRNAQLHEELTRA 1046
            TGPKAGGP YL R      L    T A
Sbjct: 1037 TGPKAGGPHYLFRFAGERTLTINTTAA 1063



 Score = 40.0 bits (92), Expect = 1e-06
 Identities = 40/126 (31%), Positives = 54/126 (42%), Gaps = 23/126 (18%)

Query: 1090 LPGPTGERNTLGFAPRGLVLCAAG---SVGTLLNQLAAAFATGN---------TALVDER 1137
            LPGPTGE N L    RG+ +C +     +   L Q+AAA A GN         T+L+   
Sbjct: 702  LPGPTGESNQLFLNGRGVFVCISPWNFPLAIFLGQVAAALAAGNSVIAKPAEQTSLIGHA 761

Query: 1138 SAAIL-PSGLPAPV-------RAAIRRASQLDAEPLQAALVDSHQAAHW--RARLAAREG 1187
            +  +L  +G+PA V        A +  A   D      A   S + A W     LAAR  
Sbjct: 762  AVQLLHEAGVPADVLQYLPGDGATVGAALTRDPRVAGVAFTGSTETA-WAINRALAARNA 820

Query: 1188 ALVPLI 1193
             +  LI
Sbjct: 821  PIAALI 826



 Score = 39.7 bits (91), Expect = 1e-06
 Identities = 19/26 (73%), Positives = 21/26 (80%)

Query: 1203 LWRLLAERALCINTTAAGGNASLMTI 1228
            L+R   ER L INTTAAGGNASL+TI
Sbjct: 1047 LFRFAGERTLTINTTAAGGNASLLTI 1072


Lambda     K      H
   0.319    0.134    0.389 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 2993
Number of extensions: 145
Number of successful extensions: 11
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 4
Number of HSP's successfully gapped: 3
Length of query: 1230
Length of database: 1074
Length adjustment: 46
Effective length of query: 1184
Effective length of database: 1028
Effective search space:  1217152
Effective search space used:  1217152
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.8 bits)
S2: 58 (26.9 bits)
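
The search-space figures in the statistics block can be checked by hand. The following is a minimal sketch, not part of the GapMind output: it assumes the usual BLAST convention that the length adjustment is subtracted from both the query and the single database sequence, and that the expectation for a bit score S' is E = m * n * 2^(-S'). The function names are hypothetical.

```python
# Values copied from the BLAST statistics above.
QUERY_LEN = 1230
DB_LEN = 1074
LENGTH_ADJUSTMENT = 46


def effective_search_space(query_len, db_len, adjustment, n_seqs=1):
    """Adjusted query length times adjusted database length (assumed convention)."""
    eff_query = query_len - adjustment
    eff_db = db_len - n_seqs * adjustment
    return eff_query * eff_db


def expect_from_bits(bit_score, search_space):
    """Expected number of chance hits for a bit score: E = m*n * 2^(-S')."""
    return search_space * 2.0 ** (-bit_score)


space = effective_search_space(QUERY_LEN, DB_LEN, LENGTH_ADJUSTMENT)
print(space)  # 1184 * 1028 = 1217152, matching "Effective search space"

# The two short HSPs (40.0 and 39.7 bits) both come out on the order of 1e-06.
for bits in (40.0, 39.7):
    print(bits, f"{expect_from_bits(bits, space):.1e}")
```

Under these assumptions the product reproduces the reported effective search space, and the two short alignments land near the reported Expect of 1e-06.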

Align candidate N515DRAFT_4232 N515DRAFT_4232 (L-proline dehydrogenase /delta-1-pyrroline-5-carboxylate dehydrogenase)
to HMM TIGR01238 (delta-1-pyrroline-5-carboxylate dehydrogenase (EC 1.2.1.88))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR01238.hmm
# target sequence database:        /tmp/gapView.27165.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR01238  [M=500]
Accession:   TIGR01238
Description: D1pyr5carbox3: delta-1-pyrroline-5-carboxylate dehydrogenase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                                    Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                                    -----------
     8e-214  696.7   2.0   1.1e-213  696.3   2.0    1.1  1  lcl|FitnessBrowser__Dyella79:N515DRAFT_4232  N515DRAFT_4232 L-proline dehydro


Domain annotation for each sequence (and alignments):
>> lcl|FitnessBrowser__Dyella79:N515DRAFT_4232  N515DRAFT_4232 L-proline dehydrogenase /delta-1-pyrroline-5-carboxylate 
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  696.3   2.0  1.1e-213  1.1e-213       1     496 [.     546    1051 ..     546    1055 .. 0.98

  Alignments for each domain:
  == domain 1  score: 696.3 bits;  conditional E-value: 1.1e-213
                                    TIGR01238    1 dlygegrknslGvdlaneselksleeqllkaaakkfqaapivgekakaegeaqpvknpadrkdi 64  
                                                   +lyge rkns+Gv+++n++elk l e +++     + a p+v + + + g +  v+npadr+++
  lcl|FitnessBrowser__Dyella79:N515DRAFT_4232  546 NLYGELRKNSMGVNFSNDNELKALAETVNAK-SGPWTATPLVPGAT-SAGATVQVTNPADRRQV 607 
                                                   59***********************999765.679*******6655.5566667********** PP

                                    TIGR01238   65 vGqvseadaaevqeavdsavaafaewsatdakeraailerladlleshmpelvallvreaGktl 128 
                                                   vG    ad a+v++a+  avaa   w  ++a+ raaile++a+ le +  e++al+vreaGk l
  lcl|FitnessBrowser__Dyella79:N515DRAFT_4232  608 VGSYVSADSATVEKALANAVAAQHGWDRLPAASRAAILEHAAEQLEARRGEFIALCVREAGKGL 671 
                                                   **************************************************************** PP

                                    TIGR01238  129 snaiaevreavdflryyakqvedvldeesaka.............lGavvcispwnfplaiftG 179 
                                                    +aiae+rea dflryya  ++  +++  + +             +G++vcispwnfplaif+G
  lcl|FitnessBrowser__Dyella79:N515DRAFT_4232  672 PDAIAEIREAADFLRYYATMARRYFGQPEQLPgptgesnqlflngRGVFVCISPWNFPLAIFLG 735 
                                                   **************************997777999***************************** PP

                                    TIGR01238  180 qiaaalaaGntviakpaeqtsliaaravellqeaGvpagviqllpGrGedvGaaltsderiaGv 243 
                                                   q+aaalaaGn+viakpaeqtsli   av+ll+eaGvpa v+q lpG G++vGaalt d+r+aGv
  lcl|FitnessBrowser__Dyella79:N515DRAFT_4232  736 QVAAALAAGNSVIAKPAEQTSLIGHAAVQLLHEAGVPADVLQYLPGDGATVGAALTRDPRVAGV 799 
                                                   **************************************************************** PP

                                    TIGR01238  244 iftGstevarlinkalakredapvpliaetGGqnamivdstalaeqvvadvlasafdsaGqrcs 307 
                                                   +ftGste+a  in+ala r+++ ++liaetGGqnami ds+al+eq+v+d ++saf+saGqrcs
  lcl|FitnessBrowser__Dyella79:N515DRAFT_4232  800 AFTGSTETAWAINRALAARNAPIAALIAETGGQNAMIADSSALPEQIVKDAVSSAFQSAGQRCS 863 
                                                   **************************************************************** PP

                                    TIGR01238  308 alrvlcvqedvadrvltlikGamdelkvgkpirlttdvGpvidaeakqnllahiekmkakakkv 371 
                                                   a rvl+vqed+ad+v  ++ Gam elkvg p +l tdvGpvid++a++ l+ h  +m + akk+
  lcl|FitnessBrowser__Dyella79:N515DRAFT_4232  864 AARVLYVQEDIADKVCAMLAGAMAELKVGDPAQLSTDVGPVIDEDARKILVDHAARMDQEAKKI 927 
                                                   **************************************************************** PP

                                    TIGR01238  372 aqvkleddvesekgtfvaptlfelddldelkkevfGpvlhvvrykadeldkvvdkinakGyglt 435 
                                                    +v l+    + +gtf ap ++e+  l+ l +e+fGpvlhv+r+k +eldkvvd+ina+Gyglt
  lcl|FitnessBrowser__Dyella79:N515DRAFT_4232  928 GEVALDP-ATTGNGTFFAPRAYEIPGLATLTREIFGPVLHVIRWKGSELDKVVDEINATGYGLT 990 
                                                   *****99.78999*************************************************** PP

                                    TIGR01238  436 lGvhsrieetvrqiekrakvGnvyvnrnlvGavvGvqpfGGeGlsGtGpkaGGplylyrlt 496 
                                                   lG+hsri++tv +i++ra+vGn+yvnrn++GavvGvqpfGGeGlsGtGpkaGGp+yl+r+ 
  lcl|FitnessBrowser__Dyella79:N515DRAFT_4232  991 LGIHSRIDDTVEFIQSRARVGNCYVNRNQIGAVVGVQPFGGEGLSGTGPKAGGPHYLFRFA 1051
                                                   ***********************************************************96 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (500 nodes)
Target sequences:                          1  (1074 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.03u 0.01s 00:00:00.04 Elapsed: 00:00:00.04
# Mc/sec: 13.01
//
[ok]
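
As an illustration of how the domain row above can be read, here is a small sketch that splits the row on whitespace and computes how much of the 500-node model is aligned. The column positions follow the header line of the domain table; the variable names are hypothetical and this is not GapMind's own parser.

```python
# Domain row and model length copied from the hmmsearch output above.
DOMAIN_ROW = (
    "   1 !  696.3   2.0  1.1e-213  1.1e-213       1     496 [."
    "     546    1051 ..     546    1055 .. 0.98"
)
MODEL_LENGTH = 500  # from "Query: TIGR01238 [M=500]"

fields = DOMAIN_ROW.split()

# Columns: #, flag, score, bias, c-Evalue, i-Evalue, hmmfrom, hmm to, bounds,
#          alifrom, ali to, bounds, envfrom, env to, bounds, acc
score = float(fields[2])
i_evalue = float(fields[5])
hmm_from, hmm_to = int(fields[6]), int(fields[7])
ali_from, ali_to = int(fields[9]), int(fields[10])

hmm_coverage = (hmm_to - hmm_from + 1) / MODEL_LENGTH
ali_span = ali_to - ali_from + 1

print(f"score={score} bits, i-Evalue={i_evalue:g}")
print(f"HMM positions {hmm_from}-{hmm_to}: {hmm_coverage:.1%} of the model")
print(f"aligned to candidate residues {ali_from}-{ali_to} ({ali_span} residues)")
```

For this candidate, 496 of the 500 model positions are aligned, to residues 546-1051 of the 1074-residue protein.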

This GapMind analysis is from Sep 17 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMER with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."
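
To make the coverage threshold concrete, here is a minimal sketch that computes coverage and identity for the first BLAST alignment shown above (query residues 21-1046 of 1230, with 591 identities over 1047 aligned columns). It assumes that coverage is measured as the fraction of the characterized (query) protein spanned by the alignment, which this page does not state explicitly. The function name is hypothetical.

```python
def alignment_coverage(start, end, sequence_length):
    """Fraction of a sequence spanned by an alignment (1-based, inclusive ends)."""
    return (end - start + 1) / sequence_length


# First HSP from the BLAST output: "Query: 21 ... 1046" out of 1230 letters.
query_coverage = alignment_coverage(21, 1046, 1230)

# "Identities = 591/1047 (56%)"
identity = 591 / 1047

print(f"coverage of the characterized protein: {query_coverage:.1%}")
print(f"identity over the aligned region: {identity:.1%}")
print("meets the 50% coverage floor:", query_coverage >= 0.5)
```

Under that assumption the alignment covers about 83% of the characterized protein, well above the 50% floor mentioned for low-confidence hits.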

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory