Align L-glutamate gamma-semialdehyde dehydrogenase (EC 1.2.1.88); Proline dehydrogenase (EC 1.5.5.2) (characterized)
to candidate WP_011384888.1 AMB_RS12610 bifunctional proline dehydrogenase/L-glutamate gamma-semialdehyde dehydrogenase PutA
Query= reanno::HerbieS:HSERO_RS00905 (1230 letters) >NCBI__GCF_000009985.1:WP_011384888.1 Length = 1039 Score = 926 bits (2394), Expect = 0.0 Identities = 534/1028 (51%), Positives = 668/1028 (64%), Gaps = 41/1028 (3%) Query: 21 LPTPSPLRAAITAAYRRDEREAVQWLLQQVQEEQPWKDATQQLARKLVQQVREKRTRSSG 80 LP P P R AI A E + V L + E + A LV R R R+ G Sbjct: 7 LPAPDPERQAIHRAAGTSEADLVSGLSAGIPLEDEARRRIVNRAVNLVDGARRNR-RTLG 65 Query: 81 VDALMHEFSLSSEEGVALMCLAEALLRIPDRQTADRLIADKISKGDWRKHLGESPSLFVN 140 +D L++E+ LS+ EGV LMCLAEALLRIPD T D LI DKI+ DW HLG SPS+FVN Sbjct: 66 LDGLLNEYRLSTREGVVLMCLAEALLRIPDDHTVDLLIKDKIASADWDGHLGHSPSVFVN 125 Query: 141 AATWGLLITGKLVSTSSESGLTQAITRLIGKGGEPLIRKGVDLAMRMLGNQFVTGQTIEE 200 A+TW L++ +L+ + + R+ G+ GE ++R+ + AM ++G QFV G+TI E Sbjct: 126 ASTWALVLGDRLLHLEEDG--RAVLGRMAGRLGEAVVRRALRHAMGLMGRQFVLGRTIAE 183 Query: 201 ALDNSRENEKRGYRYSYDMLGEAALTMHDADAYYQSYESAIHAIGRASNGRGIKDGPGIS 260 ALDN+R E RGYR+S+DMLGEAA A Y ++Y AI A+GR + G G GPG+S Sbjct: 184 ALDNARAWEARGYRHSFDMLGEAARCEQAAQDYLRAYAGAIEALGRHAKGAGPIAGPGLS 243 Query: 261 VKLSALHPRYSRAQHARVMSELLPRLKQLLLLAKQYDIGLNIDAEEADRLELSLDMMEVL 320 VKLSALHPR+ AQ RV+ EL+PRL+ L A+ IGL IDAEEADRL++SLD+ME Sbjct: 244 VKLSALHPRFEMAQRQRVLGELVPRLRDLCHRARDAGIGLTIDAEEADRLDISLDVMEAA 303 Query: 321 VADPDLAGFDGLGFVVQGYQKRCPFVIDYLVDLARRNGRRLMIRLVKGAYWDSEIKRAQV 380 +ADP L G+DG G VQ YQKR VI + LA R +RLMIRLVKGAYWD E+KRAQ Sbjct: 304 LADPALDGWDGFGMAVQAYQKRARPVIAWAGALAARRQQRLMIRLVKGAYWDGEVKRAQE 363 Query: 381 DGLEGYPVYTRKVHTDLSYLTCAQKLLAATDVIYPQFATHNAHTLAAIYHWARQHQIDNY 440 GL G+PV+T K TD+SYL CA LLA D+ YPQFATHNAHT AA+ ++ Sbjct: 364 RGLGGFPVFTTKEATDVSYLACAADLLARPDLFYPQFATHNAHTAAAVMEMT--GGAGDW 421 Query: 441 EFQCLHGMGETLYDQVVGPDNLGKACRVYAPVGSHQTLLAYLVRRLLENGANSSFVNQIV 500 EFQ LHGMGE LY Q+V P+ CR YAPVGSHQ LL YLVRRLLENGANSSFV+++ Sbjct: 422 EFQRLHGMGEALYAQLV-PE---FPCRTYAPVGSHQELLPYLVRRLLENGANSSFVSRLA 477 Query: 501 DEAVPLDRLVGDPIETVRAQGGLPHPAIAVPHRLYGEERKNSAGIDLSNEDRLQQLGQLF 560 DE +P + DP+ A G + +A P L+G R+NS G+DLS+ L QL Sbjct: 478 
DEEIPAHVVAADPL---AALGRITPQLVAEPSALFGPSRRNSGGLDLSSPAVLAQLDLAL 534 Query: 561 ISMADRQWQAAPLLAADTAAQSAQAAQLVRNPADLREVVGQVSEATVADVDTALRAATDY 620 ++A + ++AP++ D + QAA+ V +PAD R VVG+V +A+ ADV+ AL +A Sbjct: 535 AAVATPE-RSAPIV--DGRERENQAAKPVLDPADHRRVVGEVVDASPADVEAALASARAA 591 Query: 621 APQWQSTPATERAAMLERAADLLEEHIAELMALAVREAGKSLPNAIAEVREAVDFLRYYA 680 P W RA++LERAAD LE A MALA+REAGK++P+A++EVREAVDFLR+YA Sbjct: 592 FPAWDDLGGEARASILERAADRLEADRARFMALAIREAGKTIPDALSEVREAVDFLRFYA 651 Query: 681 --IASRHDGNVLAWGPV--------------VCISPWNFPLAIFIGEVSAALAAGNVVLA 724 +R V GPV CISPWNFPLAIF+G+V+AALAAGN V+A Sbjct: 652 AEARARFSQPVRLPGPVGESNELMLGGRGVFACISPWNFPLAIFVGQVAAALAAGNAVVA 711 Query: 725 KPAEQTALIAHRAVQLLHEAGIPRAALQLLPGRGETVGAALTSDVRVKGVIFTGSTEVAQ 784 KPA QT L+A AV+LLH+AG+P AL L+PG G +G ALT + V + FTGST A+ Sbjct: 712 KPAPQTPLMAAAAVRLLHQAGVPPQALHLVPG-GPAIGEALTVNPLVDAIAFTGSTATAR 770 Query: 785 LINRTLAQRQHDDGDGSGEHGEVPLIAETGGQNALIVDSSALAEQVVQDVLSSAFDSAGQ 844 INR A DG PLIAETGG NA+IVDSSAL EQVV D L SAF SAGQ Sbjct: 771 HINRLRAAM---DGP------LAPLIAETGGLNAMIVDSSALPEQVVADCLESAFRSAGQ 821 Query: 845 RCSALRILCLQEDIADRTLAMLKGAMAELRVGRPDRLSIDIGPVIDAEARQNLLDHIERM 904 RCSALR+ +Q + R +L GAMAEL +G P LS D+GPVID +R+ LL H R+ Sbjct: 822 RCSALRVAFIQREAWTRIQPLLAGAMAELSLGDPALLSTDVGPVIDEASRRRLLAHGGRL 881 Query: 905 RASARAVHQLPLGEECQHGTFVAPTVIEIDDLAQLQREVFGPVLHVLRYRRDALPQLIDA 964 R + R + Q +C+ GTF AP ++D+L LQ EVFGP+LHV+ + L Q++D Sbjct: 882 RHAGRMIGQSACPPDCRVGTFFAPMAHQLDNLDLLQSEVFGPILHVIPWEAGRLEQVLDC 941 Query: 965 INATGYGLTLGVHSRIDETIEFVAQRAHVGNIYVNRNIVGAVVGVQPFGGEGKSGTGPKA 1024 + AT YGLTLG+HSRID TI V RA +GNIYVNR ++GAVVG QPFGG G SGTG KA Sbjct: 942 VAATSYGLTLGIHSRIDATIAQVIARARIGNIYVNRTMIGAVVGSQPFGGLGLSGTGAKA 1001 Query: 1025 GGPLYLKR 1032 GGP L R Sbjct: 1002 GGPNTLIR 1009 Score = 38.9 bits (89), Expect = 2e-06 Identities = 63/234 (26%), Positives = 86/234 (36%), Gaps = 51/234 (21%) Query: 999 NRNIVGAVVGVQPFGGEGKSGTGPKA-------GGPLYLKRLQRNAQLHEE--------L 1043 +R +VG VV P E + A GG L+R A E Sbjct: 566 
HRRVVGEVVDASPADVEAALASARAAFPAWDDLGGEARASILERAADRLEADRARFMALA 625 Query: 1044 TRAQPADVPNALLD--SLLDWARTHGHERLAANGQRYHRDSLLQRSLVLPGPTGERNTLG 1101 R +P+AL + +D+ R + E A Q LPGP GE N L Sbjct: 626 IREAGKTIPDALSEVREAVDFLRFYAAEARARFSQPVR----------LPGPVGESNELM 675 Query: 1102 FAPRGLVLCAAG---SVGTLLNQLAAAFATGN---------TALVDERSAAIL-PSGLP- 1147 RG+ C + + + Q+AAA A GN T L+ + +L +G+P Sbjct: 676 LGGRGVFACISPWNFPLAIFVGQVAAALAAGNAVVAKPAPQTPLMAAAAVRLLHQAGVPP 735 Query: 1148 -----APVRAAIRRASQLDAEPLQAALV---DSHQAAHWRARLAAREGALVPLI 1193 P AI A L PL A+ + A H AA +G L PLI Sbjct: 736 QALHLVPGGPAIGEA--LTVNPLVDAIAFTGSTATARHINRLRAAMDGPLAPLI 787 Score = 27.3 bits (59), Expect = 0.007 Identities = 13/24 (54%), Positives = 16/24 (66%) Query: 1203 LWRLLAERALCINTTAAGGNASLM 1226 L R ER L +NT AAGG+ +LM Sbjct: 1007 LIRYGVERCLSVNTAAAGGDVALM 1030 Lambda K H 0.319 0.134 0.389 Gapped Lambda K H 0.267 0.0410 0.140 Matrix: BLOSUM62 Gap Penalties: Existence: 11, Extension: 1 Number of Sequences: 1 Number of Hits to DB: 3106 Number of extensions: 164 Number of successful extensions: 12 Number of sequences better than 1.0e-02: 1 Number of HSP's gapped: 3 Number of HSP's successfully gapped: 3 Length of query: 1230 Length of database: 1039 Length adjustment: 46 Effective length of query: 1184 Effective length of database: 993 Effective search space: 1175712 Effective search space used: 1175712 Neighboring words threshold: 11 Window for multiple hits: 40 X1: 16 ( 7.4 bits) X2: 38 (14.6 bits) X3: 64 (24.7 bits) S1: 41 (21.8 bits) S2: 58 (26.9 bits)
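The statistics at the end of the BLAST report can be cross-checked against the reported E-values. The sketch below assumes the standard BLAST relation E = m' * n' * 2^(-S'), where m' and n' are the effective query and database lengths and S' is the bit score. The function names are illustrative and are not part of BLAST or GapMind. Reported bit scores and E-values are rounded, so agreement is only approximate.

```python
def effective_search_space(query_len: int, db_len: int, length_adjustment: int) -> int:
    """Effective search space for a single-sequence database:
    both lengths are shortened by the length adjustment, then multiplied."""
    return (query_len - length_adjustment) * (db_len - length_adjustment)


def expect_from_bits(bit_score: float, search_space: int) -> float:
    """Expected number of chance hits with at least this bit score."""
    return search_space * 2.0 ** (-bit_score)


# Values copied from the report above.
space = effective_search_space(query_len=1230, db_len=1039, length_adjustment=46)
print(space)  # 1175712, matching "Effective search space"

# Compare with the reported E-values of 2e-06 and 0.007.
print(expect_from_bits(38.9, space))
print(expect_from_bits(27.3, space))
```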
Align candidate WP_011384888.1 AMB_RS12610 (bifunctional proline dehydrogenase/L-glutamate gamma-semialdehyde dehydrogenase PutA)
to HMM TIGR01238 (delta-1-pyrroline-5-carboxylate dehydrogenase (EC 1.2.1.88))
# hmmsearch :: search profile(s) against a sequence database # HMMER 3.3.1 (Jul 2020); http://hmmer.org/ # Copyright (C) 2020 Howard Hughes Medical Institute. # Freely distributed under the BSD open source license. # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - # query HMM file: ../tmp/path.carbon/TIGR01238.hmm # target sequence database: /tmp/gapView.28597.genome.faa # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Query: TIGR01238 [M=500] Accession: TIGR01238 Description: D1pyr5carbox3: delta-1-pyrroline-5-carboxylate dehydrogenase Scores for complete sequences (score includes all domains): --- full sequence --- --- best 1 domain --- -#dom- E-value score bias E-value score bias exp N Sequence Description ------- ------ ----- ------- ------ ----- ---- -- -------- ----------- 3.2e-184 599.0 0.1 4.9e-184 598.4 0.1 1.2 1 lcl|NCBI__GCF_000009985.1:WP_011384888.1 AMB_RS12610 bifunctional proline Domain annotation for each sequence (and alignments): >> lcl|NCBI__GCF_000009985.1:WP_011384888.1 AMB_RS12610 bifunctional proline dehydrogenase/L-glutamate gamma-semialdehy # score bias c-Evalue i-Evalue hmmfrom hmm to alifrom ali to envfrom env to acc --- ------ ----- --------- --------- ------- ------- ------- ------- ------- ------- ---- 1 ! 598.4 0.1 4.9e-184 4.9e-184 2 495 .. 508 1010 .. 507 1014 .. 
0.98 Alignments for each domain: == domain 1 score: 598.4 bits; conditional E-value: 4.9e-184 TIGR01238 2 lygegrknslGvdlaneselksleeqllkaaakkfqaapivgekakaegeaqpvknpadrkdivGqv 68 l+g +r+ns G+dl+ +l++l+ l ++a+ + ++apiv+++ ++++ a+pv +pad++ +vG+v lcl|NCBI__GCF_000009985.1:WP_011384888.1 508 LFGPSRRNSGGLDLSSPAVLAQLDLALAAVATPE-RSAPIVDGRERENQAAKPVLDPADHRRVVGEV 573 899**********************998887766.789***************************** PP TIGR01238 69 seadaaevqeavdsavaafaewsatdakeraailerladlleshmpelvallvreaGktlsnaiaev 135 a+ a+v++a+ sa aaf+ w + + ra+iler+ad le + ++al++reaGkt+ +a++ev lcl|NCBI__GCF_000009985.1:WP_011384888.1 574 VDASPADVEAALASARAAFPAWDDLGGEARASILERAADRLEADRARFMALAIREAGKTIPDALSEV 640 ******************************************************************* PP TIGR01238 136 reavdflryyakqvedvldeesaka.............lGavvcispwnfplaiftGqiaaalaaGn 189 reavdflr+ya +++ +++ + +G++ cispwnfplaif+Gq+aaalaaGn lcl|NCBI__GCF_000009985.1:WP_011384888.1 641 REAVDFLRFYAAEARARFSQPVRLPgpvgesnelmlggRGVFACISPWNFPLAIFVGQVAAALAAGN 707 ********************9666699**************************************** PP TIGR01238 190 tviakpaeqtsliaaravellqeaGvpagviqllpGrGedvGaaltsderiaGviftGstevarlin 256 +v+akpa qt+l+aa av ll++aGvp+ ++ l+pG + +G alt ++ + ++ftGst++ar in lcl|NCBI__GCF_000009985.1:WP_011384888.1 708 AVVAKPAPQTPLMAAAAVRLLHQAGVPPQALHLVPGGPA-IGEALTVNPLVDAIAFTGSTATARHIN 773 ************************************666.*************************** PP TIGR01238 257 kalakredapvpliaetGGqnamivdstalaeqvvadvlasafdsaGqrcsalrvlcvqedvadrvl 323 + a + + +pliaetGG namivds+al+eqvvad l+saf saGqrcsalrv ++q++ r+ lcl|NCBI__GCF_000009985.1:WP_011384888.1 774 RLRAAMDGPLAPLIAETGGLNAMIVDSSALPEQVVADCLESAFRSAGQRCSALRVAFIQREAWTRIQ 840 **99999999********************************************************* PP TIGR01238 324 tlikGamdelkvgkpirlttdvGpvidaeakqnllahiekmkakakkvaqvkleddvesekgtfvap 390 l+ Gam el +g p +l tdvGpvid+ ++++llah ++++ ++ + q +++ gtf ap lcl|NCBI__GCF_000009985.1:WP_011384888.1 841 
PLLAGAMAELSLGDPALLSTDVGPVIDEASRRRLLAHGGRLRHAGRMIGQSACPP--DCRVGTFFAP 905 **************************************************99988..9********* PP TIGR01238 391 tlfelddldelkkevfGpvlhvvrykadeldkvvdkinakGygltlGvhsrieetvrqiekrakvGn 457 ++ +ld+ld l+ evfGp+lhv+ ++a +l++v+d + a+ ygltlG+hsri+ t++q+ ra++Gn lcl|NCBI__GCF_000009985.1:WP_011384888.1 906 MAHQLDNLDLLQSEVFGPILHVIPWEAGRLEQVLDCVAATSYGLTLGIHSRIDATIAQVIARARIGN 972 ******************************************************************* PP TIGR01238 458 vyvnrnlvGavvGvqpfGGeGlsGtGpkaGGplylyrl 495 +yvnr ++GavvG qpfGG GlsGtG kaGGp l r+ lcl|NCBI__GCF_000009985.1:WP_011384888.1 973 IYVNRTMIGAVVGSQPFGGLGLSGTGAKAGGPNTLIRY 1010 ********************************999886 PP Internal pipeline statistics summary: ------------------------------------- Query model(s): 1 (500 nodes) Target sequences: 1 (1039 residues searched) Passed MSV filter: 1 (1); expected 0.0 (0.02) Passed bias filter: 1 (1); expected 0.0 (0.02) Passed Vit filter: 1 (1); expected 0.0 (0.001) Passed Fwd filter: 1 (1); expected 0.0 (1e-05) Initial search space (Z): 1 [actual number of targets] Domain search space (domZ): 1 [number of targets reported over threshold] # CPU time: 0.01u 0.01s 00:00:00.02 Elapsed: 00:00:00.02 # Mc/sec: 21.89 // [ok]
This GapMind analysis is from Sep 17 2021. The underlying query database was built on Sep 17 2021.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMER with enzyme models (usually from TIGRFAMs). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other blast hits with at least 50% coverage are "low confidence."
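As a rough illustration of the 50% coverage floor, the sketch below computes alignment coverage from 1-based, inclusive coordinates. It assumes that "coverage" means the fraction of the characterized (query) protein spanned by the alignment. It checks only this floor and not the high- or medium-confidence criteria. The function names are hypothetical and are not taken from the GapMind source code.

```python
def coverage(start: int, end: int, length: int) -> float:
    """Fraction of a sequence spanned by a 1-based, inclusive alignment range."""
    return (end - start + 1) / length


def meets_coverage_floor(start: int, end: int, length: int, threshold: float = 0.5) -> bool:
    """True if the alignment spans at least `threshold` of the sequence."""
    return coverage(start, end, length) >= threshold


# Top-scoring alignment in the report above: query positions 21..1032 of 1230.
print(meets_coverage_floor(21, 1032, 1230))    # True
# Lowest-scoring alignment: query positions 1203..1226 of 1230.
print(meets_coverage_floor(1203, 1226, 1230))  # False
```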
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
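For readers unfamiliar with the term, a six-frame translation reads the DNA in three frames on each strand. The following is a minimal sketch of the idea using the standard codon table. It is not the code used by GapMind or Curated BLAST, and the function names are illustrative.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(dna: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna: str) -> str:
    """Translate complete codons; unknown codons (e.g. with N) become 'X'."""
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3))


def six_frame_translation(dna: str) -> dict:
    """Return translations keyed by frame: +1, +2, +3, -1, -2, -3."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


frames = six_frame_translation("ATGGCCTAA")
print(frames["+1"])  # MA*
print(frames["-1"])  # LGH
```

A protein search against these six translations can find genes that were missed or truncated during gene calling, which a search restricted to predicted proteins cannot.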
For more information, see the paper from 2019 on GapMind for amino acid biosynthesis, the paper from 2022 on GapMind for carbon sources, or view the source code.
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory