Align L-glutamate gamma-semialdehyde dehydrogenase (EC 1.2.1.88); Proline dehydrogenase (EC 1.5.5.2) (characterized)
to candidate WP_090216616.1 CV091_RS05315 bifunctional proline dehydrogenase/L-glutamate gamma-semialdehyde dehydrogenase PutA
Query= reanno::Phaeo:GFF1160 (1158 letters) >NCBI__GCF_002796795.1:WP_090216616.1 Length = 1135 Score = 1711 bits (4432), Expect = 0.0 Identities = 869/1136 (76%), Positives = 967/1136 (85%), Gaps = 8/1136 (0%) Query: 24 LRYRIDAGTYVDQAQMRDQLFALANLDATDRSTISANAAALVRDIRGHSSPGLMEVFLAE 83 LR RID TY D Q RD+L A A L A DR I AA LVRDIRGHS+PGLMEVFLAE Sbjct: 7 LRDRIDLQTYADPEQKRDELIATAALSAEDRKAICGQAAGLVRDIRGHSAPGLMEVFLAE 66 Query: 84 YGLSTDEGVALMCLAEALLRVPDADTIDALIEDKIAPSEWGKHLGKSTSSLVNASTWALM 143 YGLSTDEGVALMCLAEALLRVPDA+TIDALIEDKIAPS+WGKHLG S+SSLVNASTWALM Sbjct: 67 YGLSTDEGVALMCLAEALLRVPDAETIDALIEDKIAPSDWGKHLGHSSSSLVNASTWALM 126 Query: 144 LTGKVLDEKRSPVSALRGAMKRLGEPVIRTAVSRAMKEMGRQFVLGETIEGAMKRAAGME 203 LTGKVLDE RSPV ALR A+KRLGEPVIRTAV RAMKEMGRQFVLGETIE AM RA GME Sbjct: 127 LTGKVLDEGRSPVGALRSAIKRLGEPVIRTAVGRAMKEMGRQFVLGETIESAMTRARGME 186 Query: 204 AKGYTYSYDMLGEAARTEADAARYHLAYSRAISAIAAACNSADIRQNPGISVKLSALHPR 263 KGYTYSYDMLGEAARTEADAARYHL+YS+AISAIA AC S DIR+NPGISVKLSALHPR Sbjct: 187 DKGYTYSYDMLGEAARTEADAARYHLSYSKAISAIANACTSDDIRKNPGISVKLSALHPR 246 Query: 264 YELAQETSVKEQLVPRLQALALLAKAAGMGLNVDAEEADRLSLSLEVIEEVISDPALAGW 323 YELAQET V E+LVPRL+ALALLAKAA MGLNVDAEEA+RLSLSLEVIE V+SDPALAGW Sbjct: 247 YELAQETLVMEELVPRLKALALLAKAAKMGLNVDAEEANRLSLSLEVIEAVVSDPALAGW 306 Query: 324 DGFGVVVQAYGPRTGAALDALYDMANRYDRRLMVRLVKGAYWDTEVKRAQVEGVDGFPVF 383 DGFG+VVQAYGPRTG ALDALY+MA+RYDR+ M+RLVKGAYWDTEVK AQVEG+DGFPV+ Sbjct: 307 DGFGIVVQAYGPRTGVALDALYEMADRYDRKFMIRLVKGAYWDTEVKLAQVEGIDGFPVY 366 Query: 384 THKSLTDVSYIANARKLLSITDRIYPQFATHNAHTVSAILHMAKDTDKGAYEFQRLHGMG 443 T+K+LTDVSYIANARKLL++TDRIYPQFATHNAHTVSAI+HMA++ A+EFQRLHGMG Sbjct: 367 TNKALTDVSYIANARKLLNMTDRIYPQFATHNAHTVSAIVHMAQEGQ--AFEFQRLHGMG 424 Query: 444 ETLHNMVLEQNQTHCRIYAPVGAHRDLLAYLVRRLLENGANSSFVNQIVDENVPPELVAA 503 ETLH +VLEQN+T+CRIYAPVGAHRDLLAYLVRRLLENGANSSFVNQIVDE+V PE VA Sbjct: 425 ETLHQLVLEQNKTNCRIYAPVGAHRDLLAYLVRRLLENGANSSFVNQIVDESVAPERVAT 484 Query: 504 DPFAQVEDLTANLRKGPDLFQPERPNSIGFDLGHAPTLAAIDAARAPWKSHSWAAEPLLA 563 DPF Q+ DL + GP+L+ ERPNS 
GFDL HAPTL AID+AR PW++H+W A PLLA Sbjct: 485 DPFDQIGDLKRQIPTGPELYGAERPNSKGFDLAHAPTLTAIDSARTPWRAHNWVARPLLA 544 Query: 564 KAPETATTTDEPVRNPADLTTVGRVQTAGQAEIETALSAATPWNASAETRAEVLNRAADL 623 +T + + V NP+D VG ++E AL+ A W+A A+ RAE+LNRAADL Sbjct: 545 S--DTTGSAPQNVMNPSDHALVGESSECRLEDVEQALNDAARWSAPAQERAEILNRAADL 602 Query: 624 YEANYGELFALLTREAGKTLPDCVAELREAVDFLRYYAARISAEPPVGVFTCISPWNFPL 683 YEA+YGELFALL REAGKTL D VAELREAVDFLRYYAA I A P G+FTCISPWNFPL Sbjct: 603 YEAHYGELFALLHREAGKTLMDAVAELREAVDFLRYYAANIPAADPAGIFTCISPWNFPL 662 Query: 684 AIFSGQIAAALAVGNAVLAKPAEQTPLIAHRAISLLHEAGVPRSALQLLPGAG-AVGGAL 742 AIF+GQIAAALAVGN VLAKPAE T LIAHRA+ LLHEAGVPR+ALQL PG G +G L Sbjct: 663 AIFTGQIAAALAVGNGVLAKPAESTTLIAHRAVQLLHEAGVPRTALQLTPGRGREIGPLL 722 Query: 743 TSDARVGGVAFTGSTATALKIRAAMAEHLRPGAPLIAETGGLNAMIVDSTALPEQAVQSI 802 T D RV GVAFTGSTATAL IR MA+ LRPGAPLIAETGGLNAMIVDSTALPEQAVQ+I Sbjct: 723 TGDPRVSGVAFTGSTATALHIRTEMAKGLRPGAPLIAETGGLNAMIVDSTALPEQAVQAI 782 Query: 803 IESAFQSAGQRCSALRCLYLQEDIADNVLKMLKGAMDALHLGDPWNLSTDSGPVIDETAR 862 IESAFQSAGQRCSALRCLYLQEDIAD VL MLKGAMD LHLGDPWNLSTDSGPVID A+ Sbjct: 783 IESAFQSAGQRCSALRCLYLQEDIADTVLDMLKGAMDCLHLGDPWNLSTDSGPVIDSRAQ 842 Query: 863 AGILAHIDAARAEGRVLKEMTAPQGGTFVAPTLIEITGIQALEQEIFGPVLHVVRFKSQD 922 +GILAHI AR+EGRV+ E+ PQGGTFVAPTLIE++GI AL++EIFGPVLHV RFK++D Sbjct: 843 SGILAHISTARSEGRVMHELHPPQGGTFVAPTLIEVSGIDALKEEIFGPVLHVARFKARD 902 Query: 923 LDQIIRDINATGYGLTFGLHTRIDDRVQYICDRIHAGNLYVNRNQIGAIVGSQPFGGEGL 982 LD++I IN TGYGLTFGLHTRIDDRVQ++CDRI AGN+YVNRNQIGAIVGSQPFGGEGL Sbjct: 903 LDKVIEAINGTGYGLTFGLHTRIDDRVQHVCDRIKAGNIYVNRNQIGAIVGSQPFGGEGL 962 Query: 983 SGTGPKAGGPFYMMRFCAPDRQKSVDSWPSDAPAMTMLPAPTGQPMQEITTSLPGPTGES 1042 SGTGPKAGGP Y+ R+CAPDRQ S +++ + + ++PAPTG Q T +LPGPTGES Sbjct: 963 SGTGPKAGGPLYLSRYCAPDRQTSAETFNN---STRVVPAPTGTAAQPTTQTLPGPTGES 1019 Query: 1043 NRLSQLARPPLLCLGPGPQAVVAQARAVHALGGTAIEATGPLDMRQLLTMEGTSGVIWWG 1102 NRL+ R PLLC+GPG +A QA+AVH+ GG AI+ LD+ QL T++ +GV+WWG Sbjct: 1020 NRLTTAPRLPLLCMGPGKKAAAEQAKAVHSHGGLAIQMADNLDLDQLRTLDAIAGVLWWG 1079 
Query: 1103 DETTAREIESWLARRNGPILPLIPGLPDKARVQAERHVCVDTTAAGGNAALLGGMG 1158 DE TAREIE LA R+G ILPLIPGLPD+ARV AE HVCVDTTAAGGNA+LLGG G Sbjct: 1080 DEQTAREIEQHLAARDGAILPLIPGLPDRARVMAEHHVCVDTTAAGGNASLLGGQG 1135 Lambda K H 0.317 0.132 0.387 Gapped Lambda K H 0.267 0.0410 0.140 Matrix: BLOSUM62 Gap Penalties: Existence: 11, Extension: 1 Number of Sequences: 1 Number of Hits to DB: 3256 Number of extensions: 127 Number of successful extensions: 5 Number of sequences better than 1.0e-02: 1 Number of HSP's gapped: 1 Number of HSP's successfully gapped: 1 Length of query: 1158 Length of database: 1135 Length adjustment: 46 Effective length of query: 1112 Effective length of database: 1089 Effective search space: 1210968 Effective search space used: 1210968 Neighboring words threshold: 11 Window for multiple hits: 40 X1: 16 ( 7.3 bits) X2: 38 (14.6 bits) X3: 64 (24.7 bits) S1: 41 (21.7 bits) S2: 58 (26.9 bits)
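The summary numbers in the BLAST report above can be re-derived with a few lines of arithmetic. The Python sketch below is only an illustration: the constants are copied from the report, the variable and function names are invented here, and "coverage" is assumed to mean the fraction of the characterized query protein that falls inside the alignment.

```python
# Values copied from the BLAST report above.
ALIGNMENT_COLUMNS = 1136            # denominator of Identities/Positives/Gaps
IDENTITIES = 869
POSITIVES = 967
GAPS = 8
QUERY_LENGTH = 1158                 # "Length of query"
SUBJECT_LENGTH = 1135               # "Length of database" (a single sequence)
QUERY_START, QUERY_END = 24, 1158   # first and last aligned query positions
LENGTH_ADJUSTMENT = 46


def percent(numerator: int, denominator: int) -> float:
    """Return numerator/denominator as a percentage."""
    return 100.0 * numerator / denominator


def query_coverage(start: int, end: int, length: int) -> float:
    """Fraction of the query spanned by the alignment (assumed definition)."""
    return (end - start + 1) / length


identity_pct = percent(IDENTITIES, ALIGNMENT_COLUMNS)
positive_pct = percent(POSITIVES, ALIGNMENT_COLUMNS)
gap_pct = percent(GAPS, ALIGNMENT_COLUMNS)
coverage = query_coverage(QUERY_START, QUERY_END, QUERY_LENGTH)

# Effective lengths are the raw lengths minus the length adjustment;
# their product is the effective search space.
effective_query = QUERY_LENGTH - LENGTH_ADJUSTMENT
effective_subject = SUBJECT_LENGTH - LENGTH_ADJUSTMENT
search_space = effective_query * effective_subject

print(f"identity  {identity_pct:.1f}%")
print(f"positives {positive_pct:.1f}%")
print(f"gaps      {gap_pct:.1f}%")
print(f"query coverage {coverage:.1%}")
print(f"effective search space {search_space}")
```

The identity and positives values agree with the 76% and 85% shown in the report, and the product of the effective lengths reproduces the reported effective search space of 1210968.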
Align candidate WP_090216616.1 CV091_RS05315 (bifunctional proline dehydrogenase/L-glutamate gamma-semialdehyde dehydrogenase PutA)
to HMM TIGR01238 (delta-1-pyrroline-5-carboxylate dehydrogenase (EC 1.2.1.88))
# hmmsearch :: search profile(s) against a sequence database # HMMER 3.3.1 (Jul 2020); http://hmmer.org/ # Copyright (C) 2020 Howard Hughes Medical Institute. # Freely distributed under the BSD open source license. # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - # query HMM file: ../tmp/path.carbon/TIGR01238.hmm # target sequence database: /tmp/gapView.1362687.genome.faa # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Query: TIGR01238 [M=500] Accession: TIGR01238 Description: D1pyr5carbox3: delta-1-pyrroline-5-carboxylate dehydrogenase Scores for complete sequences (score includes all domains): --- full sequence --- --- best 1 domain --- -#dom- E-value score bias E-value score bias exp N Sequence Description ------- ------ ----- ------- ------ ----- ---- -- -------- ----------- 2.8e-195 635.5 0.0 1.2e-192 626.8 0.0 2.4 2 NCBI__GCF_002796795.1:WP_090216616.1 Domain annotation for each sequence (and alignments): >> NCBI__GCF_002796795.1:WP_090216616.1 # score bias c-Evalue i-Evalue hmmfrom hmm to alifrom ali to envfrom env to acc --- ------ ----- --------- --------- ------- ------- ------- ------- ------- ------- ---- 1 ! 626.8 0.0 1.2e-192 1.2e-192 1 495 [. 502 978 .. 502 982 .. 0.97 2 ! 6.1 0.0 0.00017 0.00017 237 270 .. 1069 1102 .. 1062 1123 .. 
0.89 Alignments for each domain: == domain 1 score: 626.8 bits; conditional E-value: 1.2e-192 TIGR01238 1 dlygegrknslGvdlaneselksleeqllkaaakkfqaapivgekakaegeaqpvknpadrkdivGqvseada 73 +lyg r ns+G dla +l +++ + + a+++ a p++ + q v+np d+ +vG+ se + NCBI__GCF_002796795.1:WP_090216616.1 502 ELYGAERPNSKGFDLAHAPTLTAIDSARTPWRAHNWVARPLL-ASDTTGSAPQNVMNPSDHA-LVGESSECRL 572 79****************************************.55566667799******85.89*******9 PP TIGR01238 74 aevqeavdsavaafaewsatdakeraailerladlleshmpelvallvreaGktlsnaiaevreavdflryya 146 ++v++a++ +a wsa +a+era il+r+adl e h el all+reaGktl +a+ae+reavdflryya NCBI__GCF_002796795.1:WP_090216616.1 573 EDVEQALN----DAARWSA-PAQERAEILNRAADLYEAHYGELFALLHREAGKTLMDAVAELREAVDFLRYYA 640 99998875....5689*98.9**************************************************** PP TIGR01238 147 kqvedvldeesakalGavvcispwnfplaiftGqiaaalaaGntviakpaeqtsliaaravellqeaGvpagv 219 ++ +a + G + cispwnfplaiftGqiaaala Gn v+akpae t+lia rav+ll+eaGvp ++ NCBI__GCF_002796795.1:WP_090216616.1 641 ANI------PAADPAGIFTCISPWNFPLAIFTGQIAAALAVGNGVLAKPAESTTLIAHRAVQLLHEAGVPRTA 707 *99......45789*********************************************************** PP TIGR01238 220 iqllpGrGedvGaaltsderiaGviftGstevarlinkalakredapvpliaetGGqnamivdstalaeqvva 292 +ql pGrG ++G lt d+r++Gv+ftGst++a i+ ++ak + +pliaetGG namivdstal+eq v+ NCBI__GCF_002796795.1:WP_090216616.1 708 LQLTPGRGREIGPLLTGDPRVSGVAFTGSTATALHIRTEMAKGLRPGAPLIAETGGLNAMIVDSTALPEQAVQ 780 ************************************************************************* PP TIGR01238 293 dvlasafdsaGqrcsalrvlcvqedvadrvltlikGamdelkvgkpirlttdvGpvidaeakqnllahiekmk 365 +++saf+saGqrcsalr l++qed+ad vl+++kGamd l++g p +l td Gpvid++a+ +lahi + NCBI__GCF_002796795.1:WP_090216616.1 781 AIIESAFQSAGQRCSALRCLYLQEDIADTVLDMLKGAMDCLHLGDPWNLSTDSGPVIDSRAQSGILAHISTAR 853 ************************************************************************* PP TIGR01238 366 akakkvaqvkleddvesekgtfvaptlfelddldelkkevfGpvlhvvrykadeldkvvdkinakGygltlGv 438 + ++ ++++ + gtfvaptl+e+ +d+lk+e+fGpvlhv 
r+ka++ldkv++ in +Gyglt+G+ NCBI__GCF_002796795.1:WP_090216616.1 854 SEGRVMHELHPPQ-----GGTFVAPTLIEVSGIDALKEEIFGPVLHVARFKARDLDKVIEAINGTGYGLTFGL 921 *******998765.....9****************************************************** PP TIGR01238 439 hsrieetvrqiekrakvGnvyvnrnlvGavvGvqpfGGeGlsGtGpkaGGplylyrl 495 h+ri++ v+++ +r+k+Gn+yvnrn++Ga+vG qpfGGeGlsGtGpkaGGplyl r+ NCBI__GCF_002796795.1:WP_090216616.1 922 HTRIDDRVQHVCDRIKAGNIYVNRNQIGAIVGSQPFGGEGLSGTGPKAGGPLYLSRY 978 ******************************************************997 PP == domain 2 score: 6.1 bits; conditional E-value: 0.00017 TIGR01238 237 deriaGviftGstevarlinkalakredapvpli 270 ++iaGv++ G ++ar i++ la r+ a pli NCBI__GCF_002796795.1:WP_090216616.1 1069 LDAIAGVLWWGDEQTAREIEQHLAARDGAILPLI 1102 578************************9999998 PP Internal pipeline statistics summary: ------------------------------------- Query model(s): 1 (500 nodes) Target sequences: 1 (1135 residues searched) Passed MSV filter: 1 (1); expected 0.0 (0.02) Passed bias filter: 1 (1); expected 0.0 (0.02) Passed Vit filter: 1 (1); expected 0.0 (0.001) Passed Fwd filter: 1 (1); expected 0.0 (1e-05) Initial search space (Z): 1 [actual number of targets] Domain search space (domZ): 1 [number of targets reported over threshold] # CPU time: 0.01u 0.00s 00:00:00.01 Elapsed: 00:00:00.01 # Mc/sec: 53.51 // [ok]
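To see how much of the TIGR01238 model each reported domain covers, the two rows of the domain table above can be parsed with a short Python sketch. This is an illustration only: the rows are copied from the report with whitespace collapsed, the `Domain` type and the function names are invented here, and the column positions assume the whitespace-separated layout shown in the table header.

```python
from typing import List, NamedTuple


class Domain(NamedTuple):
    score: float
    i_evalue: float
    hmm_from: int
    hmm_to: int
    ali_from: int
    ali_to: int


def parse_domain_row(row: str) -> Domain:
    """Parse one row of the hmmsearch per-domain table.

    Assumed column order after splitting on whitespace:
    #, flag, score, bias, c-Evalue, i-Evalue, hmmfrom, hmmto, bounds,
    alifrom, alito, bounds, envfrom, envto, bounds, acc
    """
    f = row.split()
    return Domain(
        score=float(f[2]),
        i_evalue=float(f[5]),
        hmm_from=int(f[6]),
        hmm_to=int(f[7]),
        ali_from=int(f[9]),
        ali_to=int(f[10]),
    )


def hmm_coverage(domain: Domain, model_length: int) -> float:
    """Fraction of the model's match states spanned by the domain."""
    return (domain.hmm_to - domain.hmm_from + 1) / model_length


MODEL_LENGTH = 500  # "[M=500]" in the report
ROWS: List[str] = [
    "1 ! 626.8 0.0 1.2e-192 1.2e-192 1 495 [. 502 978 .. 502 982 .. 0.97",
    "2 ! 6.1 0.0 0.00017 0.00017 237 270 .. 1069 1102 .. 1062 1123 .. 0.89",
]

domains = [parse_domain_row(r) for r in ROWS]
for d in domains:
    print(f"protein {d.ali_from}-{d.ali_to}: score {d.score}, "
          f"covers {hmm_coverage(d, MODEL_LENGTH):.1%} of the model")
```

Domain 1 spans model positions 1 to 495 of 500, so it covers nearly the whole model, while domain 2 is a short, weak match to positions 237 to 270.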
This GapMind analysis is from Sep 24 2021. The underlying query database was built on Sep 17 2021.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMER with enzyme models (usually from TIGRFAMs). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other blast hits with at least 50% coverage are "low confidence."
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
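The gap rule stated above (a step may be considered a gap when it has no high- or medium-confidence candidate) can be sketched in Python. This is not GapMind's code: the step names and confidence labels below are hypothetical inputs invented for illustration, and the sketch assumes each candidate has already been assigned one of the labels "high", "medium" or "low".

```python
from typing import Dict, List

# Labels that keep a step from being treated as a gap.
SUFFICIENT = ("high", "medium")


def find_gaps(candidates_by_step: Dict[str, List[str]]) -> List[str]:
    """Return the steps that have no high- or medium-confidence candidate."""
    return [
        step
        for step, labels in candidates_by_step.items()
        if not any(label in SUFFICIENT for label in labels)
    ]


# Hypothetical pathway with three steps (names are placeholders).
example = {
    "stepA": ["high", "low"],   # has a high-confidence candidate
    "stepB": ["low"],           # only a low-confidence candidate
    "stepC": [],                # no candidates at all
}

gaps = find_gaps(example)
print(gaps)
```

In this made-up example, `stepB` and `stepC` are reported as gaps: a low-confidence candidate alone does not fill a step.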
For more information, see:
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory