GapMind for catabolism of small carbon sources

 

Alignments for a candidate for put1 in Dyella japonica UNC79MFTsu3.2

Align L-glutamate gamma-semialdehyde dehydrogenase (EC 1.2.1.88); Proline dehydrogenase (EC 1.5.5.2) (characterized)
to candidate N515DRAFT_4232 N515DRAFT_4232 L-proline dehydrogenase /delta-1-pyrroline-5-carboxylate dehydrogenase

Query= reanno::HerbieS:HSERO_RS00905
         (1230 letters)



>FitnessBrowser__Dyella79:N515DRAFT_4232
          Length = 1074

 Score = 1081 bits (2795), Expect = 0.0
 Identities = 591/1047 (56%), Positives = 723/1047 (69%), Gaps = 37/1047 (3%)

Query: 21   LPTPS-PLRAAITAAYRRDEREAVQWLLQQVQEEQPWKDATQQLARKLVQQVREKRTRSS 79
            LPT + P RA ITAA+ RDE EAV  LL Q       ++    LA  LV +VR +    S
Sbjct: 33   LPTGAEPARARITAAWLRDETEAVNDLLVQASLPPVEREKVIDLAADLVTRVRARAKDQS 92

Query: 80   GVDALMHEFSLSSEEGVALMCLAEALLRIPDRQTADRLIADKISKGDWRKHLGESPSLFV 139
             V++ M ++ LSSEEGV LMC+AEALLRIPD+ TAD+LI DK+   +W+KHLG+S SLFV
Sbjct: 93   AVESFMRQYDLSSEEGVLLMCVAEALLRIPDKATADKLIRDKLGDANWKKHLGQSESLFV 152

Query: 140  NAATWGLLITGKLVSTSSE--SGLTQAITRLIGKGGEPLIRKGVDLAMRMLGNQFVTGQT 197
            NA+TWGL++TGKLV+ + +     T A+ RL+G+ GEP IR  V  AMR++G+QFV G+T
Sbjct: 153  NASTWGLMLTGKLVNLAGDIRHDFTGALRRLVGRAGEPAIRLAVRQAMRIMGHQFVMGRT 212

Query: 198  IEEALDNSRENEKRGYRYSYDMLGEAALTMHDADAYYQSYESAIHAIGRASNGRGIKDGP 257
            I EALD   + E   YRYSYDMLGE+ALT   A+ Y Q Y +AI AIG         D P
Sbjct: 213  IGEALDRCAQKEYAVYRYSYDMLGESALTSETAERYQQDYRNAIAAIGARGPFANHTDAP 272

Query: 258  GISVKLSALHPRYSRAQHARVMSELLPRLKQLLLLAKQYDIGLNIDAEEADRLELSLDMM 317
             ISVKLSALHPRY  A+      +L  +L +L  LA ++ I L++DAEEADRLELSLD++
Sbjct: 273  SISVKLSALHPRYEVAKRELARRDLTAKLLELSQLAMKHGIALSVDAEEADRLELSLDIL 332

Query: 318  EVLVADPDLAGFDGLGFVVQGYQKRCPFVIDYLVDLARRNGRRLMIRLVKGAYWDSEIKR 377
              + A P LAG++GLG VVQ Y KR PFVID+L++ AR +GRR  +RLVKGAYWD+E+KR
Sbjct: 333  GDVFAHPSLAGWNGLGIVVQAYSKRTPFVIDWLIETARGSGRRWYVRLVKGAYWDAEVKR 392

Query: 378  AQVDGLEGYPVYTRKVHTDLSYLTCAQKLL-AATDVIYPQFATHNAHTLAAIYHWARQHQ 436
            AQ +GL GYPVYTRK +TD+SYL CA+KL  A  ++IYPQFATHNAHT+AA++H A+   
Sbjct: 393  AQENGLPGYPVYTRKPNTDVSYLACARKLFDAGIELIYPQFATHNAHTIAAVHHLAKGRP 452

Query: 437  IDNYEFQCLHGMGETLYDQVVGPDNLGKACRVYAPVGSHQTLLAYLVRRLLENGANSSFV 496
               YE+Q LHGMG  LY +V+G  NL   CRVYAPVG+H+ LL YLVRRLLENGAN+SFV
Sbjct: 453  ---YEYQRLHGMGTDLYAEVIGAQNLNVPCRVYAPVGTHEDLLPYLVRRLLENGANTSFV 509

Query: 497  NQIVDEAVPLDRLVGDPIETVRAQGGLPHPAIAVPHRLYGEERKNSAGIDLSNEDRLQQL 556
            N++VDE++P+  LV DP ETVR+   +PHP I +P  LYGE RKNS G++ SN++ L+ L
Sbjct: 510  NRVVDESLPVRELVADPCETVRSFASIPHPRIPLPVNLYGELRKNSMGVNFSNDNELKAL 569

Query: 557  GQLFISMADRQWQAAPLLAADTAAQSAQAAQLVRNPADLREVVGQVSEATVADVDTALRA 616
             +  ++     W A PL+     A SA A   V NPAD R+VVG    A  A V+ AL  
Sbjct: 570  AET-VNAKSGPWTATPLV---PGATSAGATVQVTNPADRRQVVGSYVSADSATVEKALAN 625

Query: 617  ATDYAPQWQSTPATERAAMLERAADLLEEHIAELMALAVREAGKSLPNAIAEVREAVDFL 676
            A      W   PA  RAA+LE AA+ LE    E +AL VREAGK LP+AIAE+REA DFL
Sbjct: 626  AVAAQHGWDRLPAASRAAILEHAAEQLEARRGEFIALCVREAGKGLPDAIAEIREAADFL 685

Query: 677  RYYAIASRH-------------DGNVL---AWGPVVCISPWNFPLAIFIGEVSAALAAGN 720
            RYYA  +R              + N L     G  VCISPWNFPLAIF+G+V+AALAAGN
Sbjct: 686  RYYATMARRYFGQPEQLPGPTGESNQLFLNGRGVFVCISPWNFPLAIFLGQVAAALAAGN 745

Query: 721  VVLAKPAEQTALIAHRAVQLLHEAGIPRAALQLLPGRGETVGAALTSDVRVKGVIFTGST 780
             V+AKPAEQT+LI H AVQLLHEAG+P   LQ LPG G TVGAALT D RV GV FTGST
Sbjct: 746  SVIAKPAEQTSLIGHAAVQLLHEAGVPADVLQYLPGDGATVGAALTRDPRVAGVAFTGST 805

Query: 781  EVAQLINRTLAQRQHDDGDGSGEHGEVPLIAETGGQNALIVDSSALAEQVVQDVLSSAFD 840
            E A  INR LA R               LIAETGGQNA+I DSSAL EQ+V+D +SSAF 
Sbjct: 806  ETAWAINRALAARNAP---------IAALIAETGGQNAMIADSSALPEQIVKDAVSSAFQ 856

Query: 841  SAGQRCSALRILCLQEDIADRTLAMLKGAMAELRVGRPDRLSIDIGPVIDAEARQNLLDH 900
            SAGQRCSA R+L +QEDIAD+  AML GAMAEL+VG P +LS D+GPVID +AR+ L+DH
Sbjct: 857  SAGQRCSAARVLYVQEDIADKVCAMLAGAMAELKVGDPAQLSTDVGPVIDEDARKILVDH 916

Query: 901  IERMRASARAVHQLPLGEECQ-HGTFVAPTVIEIDDLAQLQREVFGPVLHVLRYRRDALP 959
              RM   A+ + ++ L      +GTF AP   EI  LA L RE+FGPVLHV+R++   L 
Sbjct: 917  AARMDQEAKKIGEVALDPATTGNGTFFAPRAYEIPGLATLTREIFGPVLHVIRWKGSELD 976

Query: 960  QLIDAINATGYGLTLGVHSRIDETIEFVAQRAHVGNIYVNRNIVGAVVGVQPFGGEGKSG 1019
            +++D INATGYGLTLG+HSRID+T+EF+  RA VGN YVNRN +GAVVGVQPFGGEG SG
Sbjct: 977  KVVDEINATGYGLTLGIHSRIDDTVEFIQSRARVGNCYVNRNQIGAVVGVQPFGGEGLSG 1036

Query: 1020 TGPKAGGPLYLKRLQRNAQLHEELTRA 1046
            TGPKAGGP YL R      L    T A
Sbjct: 1037 TGPKAGGPHYLFRFAGERTLTINTTAA 1063



 Score = 40.0 bits (92), Expect = 1e-06
 Identities = 40/126 (31%), Positives = 54/126 (42%), Gaps = 23/126 (18%)

Query: 1090 LPGPTGERNTLGFAPRGLVLCAAG---SVGTLLNQLAAAFATGN---------TALVDER 1137
            LPGPTGE N L    RG+ +C +     +   L Q+AAA A GN         T+L+   
Sbjct: 702  LPGPTGESNQLFLNGRGVFVCISPWNFPLAIFLGQVAAALAAGNSVIAKPAEQTSLIGHA 761

Query: 1138 SAAIL-PSGLPAPV-------RAAIRRASQLDAEPLQAALVDSHQAAHW--RARLAAREG 1187
            +  +L  +G+PA V        A +  A   D      A   S + A W     LAAR  
Sbjct: 762  AVQLLHEAGVPADVLQYLPGDGATVGAALTRDPRVAGVAFTGSTETA-WAINRALAARNA 820

Query: 1188 ALVPLI 1193
             +  LI
Sbjct: 821  PIAALI 826



 Score = 39.7 bits (91), Expect = 1e-06
 Identities = 19/26 (73%), Positives = 21/26 (80%)

Query: 1203 LWRLLAERALCINTTAAGGNASLMTI 1228
            L+R   ER L INTTAAGGNASL+TI
Sbjct: 1047 LFRFAGERTLTINTTAAGGNASLLTI 1072


Lambda     K      H
   0.319    0.134    0.389 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 2993
Number of extensions: 145
Number of successful extensions: 11
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 4
Number of HSP's successfully gapped: 3
Length of query: 1230
Length of database: 1074
Length adjustment: 46
Effective length of query: 1184
Effective length of database: 1028
Effective search space:  1217152
Effective search space used:  1217152
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.8 bits)
S2: 58 (26.9 bits)
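
The bit scores and E-values reported for the three alignments above can be reproduced from the statistics block. The sketch below assumes the standard Karlin-Altschul relations (bit score S' = (lambda*S - ln K) / ln 2 and E = m'n' * 2^-S', with m' and n' the effective lengths) and uses the gapped Lambda and K values. The function names are illustrative only and are not part of BLAST or GapMind.

```python
import math

# Values copied from the statistics block above (gapped parameters).
LAMBDA_GAPPED = 0.267
K_GAPPED = 0.0410
QUERY_LEN = 1230
DB_LEN = 1074
LENGTH_ADJUSTMENT = 46


def bit_score(raw_score):
    """Convert a raw alignment score to bits (Karlin-Altschul normalization)."""
    return (LAMBDA_GAPPED * raw_score - math.log(K_GAPPED)) / math.log(2)


def effective_search_space():
    """Product of the effective query and database lengths."""
    return (QUERY_LEN - LENGTH_ADJUSTMENT) * (DB_LEN - LENGTH_ADJUSTMENT)


def expect_value(bits):
    """Expected number of chance alignments scoring at least `bits`."""
    return effective_search_space() * 2.0 ** (-bits)


if __name__ == "__main__":
    print("effective search space:", effective_search_space())
    # Raw scores are the values in parentheses on each "Score =" line.
    for raw in (2795, 92, 91):
        bits = bit_score(raw)
        print(f"raw={raw} bits={bits:.1f} E={expect_value(bits):.1g}")
```

The computed search space matches the reported 1217152, and the two short alignments (raw scores 92 and 91) both give E-values close to the reported 1e-06. For the long alignment the E-value is far below what the report can display, so it appears as 0.0.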

This GapMind analysis is from Sep 17 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."
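
As a rough illustration of the identity and coverage figures these rules refer to, the sketch below computes them for the BLAST alignments shown on this page. It assumes that coverage means the fraction of the characterized (query) protein that is aligned. GapMind itself scores ublast hits, so its values may differ from these. The helper name `merged_coverage` and the `HSPS` table are hypothetical, with coordinates copied from the alignments above.

```python
def merged_coverage(ranges, length):
    """Fraction of positions 1..length covered by the union of inclusive ranges."""
    covered = 0
    last_end = 0
    for start, end in sorted(ranges):
        start = max(start, last_end + 1)  # skip positions already counted
        if end >= start:
            covered += end - start + 1
            last_end = end
    return covered / length


QUERY_LEN = 1230  # length of the characterized protein (the query)

# (query_start, query_end, identities, alignment_length) for each alignment
HSPS = [
    (21, 1046, 591, 1047),
    (1090, 1193, 40, 126),
    (1203, 1228, 19, 26),
]

best = HSPS[0]
identity = best[2] / best[3]
coverage_best = merged_coverage([(best[0], best[1])], QUERY_LEN)
coverage_all = merged_coverage([(s, e) for s, e, _, _ in HSPS], QUERY_LEN)

if __name__ == "__main__":
    print(f"identity of best alignment: {identity:.1%}")
    print(f"query coverage, best alignment only: {coverage_best:.1%}")
    print(f"query coverage, all alignments merged: {coverage_all:.1%}")
    print("meets the 50% coverage floor:", coverage_best >= 0.5)
```

Under these assumptions the best alignment alone covers well over half of the characterized protein, so this candidate clears the 50% coverage floor mentioned above. Whether it reaches medium or high confidence depends on the criteria listed on the original page.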

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on the links to Curated BLAST for each step definition (on the per-step page).

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory