Align L-glutamate gamma-semialdehyde dehydrogenase (EC 1.2.1.88); Proline dehydrogenase (EC 1.5.5.2) (characterized)
to candidate WP_110805920.1 C8J30_RS10945 bifunctional proline dehydrogenase/L-glutamate gamma-semialdehyde dehydrogenase PutA
Query= reanno::Phaeo:GFF1160
         (1158 letters)

>NCBI__GCF_003217355.1:WP_110805920.1
          Length = 1127

 Score = 1274 bits (3297), Expect = 0.0
 Identities = 692/1130 (61%), Positives = 818/1130 (72%), Gaps = 21/1130 (1%)

Query: 33   YVDQAQMRDQLFALANLDATDRSTISANAAALVRDIRGHSSPGLMEVFLAEYGLSTDEGV 92
            + +A++ + L A A L I+A A LV IR + P LME FLA+YGLST EGV
Sbjct: 13   FAPEAEVLEALVAQAALPQAQLDRIAARGADLVARIRAEAKPSLMEHFLAQYGLSTREGV 72

Query: 93   ALMCLAEALLRVPDADTIDALIEDKIAPSEWGKHLGKSTSSLVNASTWALMLTGKVLDEK 152
            ALMCLAEA+LRVPD TIDALIEDKIAPS+WGKHLG + SSLVNASTWALMLTGKVLD+
Sbjct: 73   ALMCLAEAMLRVPDNATIDALIEDKIAPSDWGKHLGTAASSLVNASTWALMLTGKVLDDG 132

Query: 153  RSPVSA-LRGAMKRLGEPVIRTAVSRAMKEMGRQFVLGETIEGAMKRAAGMEAKGYTYSY 211
            ++ LRGAM+RLGEPVIR AV +AM+EMGRQFVLGETIE A++RA E++GYT+SY
Sbjct: 133  AGGLAGTLRGAMRRLGEPVIRAAVGQAMREMGRQFVLGETIEKALERAEKRESEGYTFSY 192

Query: 212  DMLGEAARTEADAARYHLAYSRAISAIAAACNSADIRQNPGISVKLSALHPRYELAQETS 271
            DMLGEAA T ADA RY +AY++AI++IA A I NPGIS+KLSALHPRYE+AQE
Sbjct: 193  DMLGEAALTTADAERYRVAYAQAIASIAKAATKGSIAANPGISIKLSALHPRYEVAQEAR 252

Query: 272  VKEQLVPRLQALALLAKAAGMGLNVDAEEADRLSLSLEVIEEVISDPALAGWDGFGVVVQ 331
            V +LVP ++ LA A AAG+ L++DAEE DRL+LSL VIE V++DP AGW GFG VVQ
Sbjct: 253  VMAELVPVVRDLARAAAAAGIALHIDAEEQDRLALSLRVIEAVMADPETAGWQGFGAVVQ 312

Query: 332  AYGPRTGAALDALYDMANRYDRRLMVRLVKGAYWDTEVKRAQVEGVDGFPVFTHKSLTDV 391
            AYG R GAA+DAL MA RR+ +RLVKGAYWD+E+KRAQVEG GFP+FT K+ TDV
Sbjct: 313  AYGKRAGAAIDALAAMARASGRRINIRLVKGAYWDSEMKRAQVEGHPGFPLFTSKTGTDV 372

Query: 392  SYIANARKLLSITDRIYPQFATHNAHTVSAILHMAKDTDKGAYEFQRLHGMGETLHNMVL 451
            +YI A KL ++D +YPQFATHNAHTV+A+L MA YEFQRLHGMG LH++VL
Sbjct: 373  AYICLAAKLFGLSDCLYPQFATHNAHTVAAVLEMAAGR---PYEFQRLHGMGARLHDIVL 429

Query: 452  EQNQTHCRIYAPVGAHRDLLAYLVRRLLENGANSSFVNQIVDENVPPELVAADPFAQVED 511
            + CRIYAPVGAHRDLLAYLVRRLLENGANSSFVNQIV+E+VPP VAA PFA +
Sbjct: 430  RETGGRCRIYAPVGAHRDLLAYLVRRLLENGANSSFVNQIVNESVPPSEVAACPFAALAT 489

Query: 512  LTA--NLRKGPDLFQPERPNSIGFDLGHAPTLAAIDAARAPWKSHSWAAEPLLAKAPETA 569
            A L LF +R NS GFDL LA I+ AR A P++A P +
Sbjct: 490  ARAPRGLLAPAALFGSDRVNSQGFDLSDPEVLARIETARTVTLPD---AAPIVA-GPVSG 545

Query: 570  TTTDEPVRNPADLTTVGRVQTAGQAEIETALSAATPWNASAETRAEVLNRAADLYEANYG 629
            + D V NPA V RV A A + AL AA W+A A RA VL+RAADLYE N+G
Sbjct: 546  GSRD--VVNPATGAVVARVTEADAATVAVALDAARVWSAPAAERAAVLSRAADLYEENFG 603

Query: 630  ELFALLTREAGKTLPDCVAELREAVDFLRYYAARISAE--PPVGVFTCISPWNFPLAIFS 687
            +FA L REAGKTL D ++ELREAVDFLRYYAA +++ P G ISPWNFPLAIF+
Sbjct: 604  PIFAALAREAGKTLGDAISELREAVDFLRYYAAEGTSDTRAPRGPVVAISPWNFPLAIFT 663

Query: 688  GQIAAALAVGNAVLAKPAEQTPLIAHRAISLLHEAGVPRSALQLLPGAG-AVGGALTSDA 746
            GQIAAAL GNAVLAKPAEQTP+IA A+ LLH+AGVP +ALQLLPG G VG ALT D
Sbjct: 664  GQIAAALMAGNAVLAKPAEQTPVIAALAVRLLHQAGVPATALQLLPGDGPTVGAALTRDP 723

Query: 747  RVGGVAFTGSTATALKIRAAMAEHLRPGAPLIAETGGLNAMIVDSTALPEQAVQSIIESA 806
            R+ GV FTGST TA I AMA H+ PG PLIAETGGLNAM+VDSTALPEQAV+ ++ SA
Sbjct: 724  RIKGVVFTGSTETAQIIARAMAAHMAPGTPLIAETGGLNAMVVDSTALPEQAVRDVVASA 783

Query: 807  FQSAGQRCSALRCLYLQEDIADNVLKMLKGAMDALHLGDPWNLSTDSGPVIDETARAGIL 866
            F+SAGQRCSALRCLY+QEDIA +V+ MLKGAMD L LGDPW LSTD GPVID A+AGI
Sbjct: 784  FRSAGQRCSALRCLYVQEDIAPHVIAMLKGAMDELVLGDPWRLSTDVGPVIDAEAQAGIE 843

Query: 867  AHIDAARAEGRVLKEMTAPQGGTFVAPTLIEITGIQALEQEIFGPVLHVVRFKSQDLDQI 926
            ++ A+ GR+L AP GG FVAP L+++TGI L EIFGPVLH+ F + DL +
Sbjct: 844  TYL--AQNAGRILHRTAAPAGGHFVAPALLKVTGIADLSHEIFGPVLHLATFAADDLPAV 901

Query: 927  IRDINATGYGLTFGLHTRIDDRVQYICDRIHAGNLYVNRNQIGAIVGSQPFGGEGLSGTG 986
            I INA GYGLTFGLH+RID RV+ + + I AGN+YVNRNQIGA+VGSQPFGGEGLSGTG
Sbjct: 902  IAAINARGYGLTFGLHSRIDARVETVAETIRAGNIYVNRNQIGAVVGSQPFGGEGLSGTG 961

Query: 987  PKAGGPFYMMRFCAPDRQKSVDSWPSDAPAMTMLPAPTGQPMQEITTSLPGPTGESNRLS 1046
            PKAGGP Y+ RF AP+ + +W +LP + E+ LPGPTGE NRL+
Sbjct: 962  PKAGGPLYLGRFYAPEPVVATGAWTQ--AVTPVLPEAVETLLGEL--FLPGPTGELNRLT 1017

Query: 1047 QLARPPLLCLGPGPQAVVAQARAVHALGGTAIEATGPLDMRQLLTMEGTSGVIWWGDETT 1106
            + R P+LCLGPGP+A AQA AV LGG A++ATG + + L T E + V+WWG+
Sbjct: 1018 RHVRGPVLCLGPGPEAAAAQAAAVAVLGGQAVQATGAVAAKALGTAEPLAAVLWWGEAEM 1077

Query: 1107 AREIESWLARRNGPILPLIPGLPDKARVQAERHVCVDTTAAGGNAALLGG 1156
            R LA R GP++PLI PD A V ERH+CVDTTAAGGNAALL G
Sbjct: 1078 GRAYAQALAARPGPLVPLITARPDLAHVAHERHLCVDTTAAGGNAALLAG 1127

Lambda     K      H
   0.317    0.132    0.387

Gapped
Lambda     K      H
   0.267   0.0410    0.140

Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 3133
Number of extensions: 128
Number of successful extensions: 9
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 1158
Length of database: 1127
Length adjustment: 46
Effective length of query: 1112
Effective length of database: 1081
Effective search space: 1202072
Effective search space used: 1202072
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 58 (26.9 bits)
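The summary figures in the BLAST report above can be cross-checked with simple arithmetic. The sketch below is only an illustration: the helper names `percent` and `coverage` are hypothetical, every number is copied from the report, and it assumes the aligned range of each sequence is given by the first and last coordinates shown in the alignment.

```python
# Cross-check of the BLAST summary statistics reported above.
# Helper names are illustrative; all numbers are copied from the report.

def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage."""
    return 100.0 * part / whole

def coverage(start: int, end: int, length: int) -> float:
    """Fraction of a sequence spanned by the 1-based inclusive range start..end."""
    return (end - start + 1) / length

identity = percent(692, 1130)    # "Identities = 692/1130"
positives = percent(818, 1130)   # "Positives = 818/1130"

# Aligned ranges: query residues 33..1156 of 1158, subject 13..1127 of 1127.
query_cov = coverage(33, 1156, 1158)
subject_cov = coverage(13, 1127, 1127)

# Effective lengths are the raw lengths minus the reported length adjustment (46).
eff_query = 1158 - 46
eff_db = 1127 - 46
search_space = eff_query * eff_db

print(f"identity {identity:.1f}%, positives {positives:.1f}%")
print(f"query coverage {query_cov:.1%}, subject coverage {subject_cov:.1%}")
print(f"effective search space {search_space}")
```

The product of the two effective lengths reproduces the "Effective search space" value printed in the report.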
Align candidate WP_110805920.1 C8J30_RS10945 (bifunctional proline dehydrogenase/L-glutamate gamma-semialdehyde dehydrogenase PutA)
to HMM TIGR01238 (delta-1-pyrroline-5-carboxylate dehydrogenase (EC 1.2.1.88))
# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR01238.hmm
# target sequence database:        /tmp/gapView.1952976.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR01238  [M=500]
Accession:   TIGR01238
Description: D1pyr5carbox3: delta-1-pyrroline-5-carboxylate dehydrogenase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                             Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                             -----------
   1.6e-184  600.0   3.8   1.6e-184  600.0   3.8    2.2  2  NCBI__GCF_003217355.1:WP_110805920.1

Domain annotation for each sequence (and alignments):
>> NCBI__GCF_003217355.1:WP_110805920.1
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  600.0   3.8  1.6e-184  1.6e-184       2     496 ..     502     974 ..     501     977 .. 0.94
   2 !    2.8   0.5    0.0017    0.0017     221     271 ..    1047    1097 ..    1033    1118 .. 0.86

  Alignments for each domain:
  == domain 1  score: 600.0 bits;  conditional E-value: 1.6e-184
                           TIGR01238    2 lygegrknslGvdlaneselksleeqllkaaakkfqaapivgekakaegeaqpvknpadrkdivGqvseadaa 74
                                          l+g+ r ns+G dl+ ++l+++e+ + ++ +aapiv+ ++ +g + v+npa +v +v+eadaa
NCBI__GCF_003217355.1:WP_110805920.1  502 LFGSDRVNSQGFDLSDPEVLARIETARTVTL---PDAAPIVA-GPV-SGGSRDVVNPATG-AVVARVTEADAA 568
                                          89*******************9997655444...579****4.444.56667899**975.689********* PP

                           TIGR01238   75 evqeavdsavaafaewsatdakeraailerladlleshmpelvallvreaGktlsnaiaevreavdflryyak 147
                                          +v a+d+a +wsa +a+eraa+l+r+adl e++ + a l reaGktl +ai+e+reavdflryya
NCBI__GCF_003217355.1:WP_110805920.1  569 TVAVALDAA----RVWSA-PAAERAAVLSRAADLYEENFGPIFAALAREAGKTLGDAISELREAVDFLRYYAA 636
                                          999888876....68997.999**************************************************9 PP

                           TIGR01238  148 qvedvldeesakalGavvcispwnfplaiftGqiaaalaaGntviakpaeqtsliaaravellqeaGvpagvi 220
                                          + ++ ++G+vv+ispwnfplaiftGqiaaal+aGn+v+akpaeqt++iaa av ll++aGvpa+++
NCBI__GCF_003217355.1:WP_110805920.1  637 EG----TSDTRAPRGPVVAISPWNFPLAIFTGQIAAALMAGNAVLAKPAEQTPVIAALAVRLLHQAGVPATAL 705
                                          99....6778899************************************************************ PP

                           TIGR01238  221 qllpGrGedvGaaltsderiaGviftGstevarlinkalakredapvpliaetGGqnamivdstalaeqvvad 293
                                          qllpG G +vGaalt d+ri+Gv+ftGste+a+ i +a+a ++ +pliaetGG nam+vdstal+eq v+d
NCBI__GCF_003217355.1:WP_110805920.1  706 QLLPGDGPTVGAALTRDPRIKGVVFTGSTETAQIIARAMAAHMAPGTPLIAETGGLNAMVVDSTALPEQAVRD 778
                                          ************************************************************************* PP

                           TIGR01238  294 vlasafdsaGqrcsalrvlcvqedvadrvltlikGamdelkvgkpirlttdvGpvidaeakqnllahiekmka 366
                                          v+asaf saGqrcsalr l+vqed+a +v+ ++kGamdel++g p rl tdvGpvidaea+ +++++ + +
NCBI__GCF_003217355.1:WP_110805920.1  779 VVASAFRSAGQRCSALRCLYVQEDIAPHVIAMLKGAMDELVLGDPWRLSTDVGPVIDAEAQAGIETYLAQ--N 849
                                          *****************************************************************99864..4 PP

                           TIGR01238  367 kakkvaqvkleddvesekgtfvaptlfelddldelkkevfGpvlhvvrykadeldkvvdkinakGygltlGvh 439
                                          ++ +++ g fvap l+++ +++l++e+fGpvlh+ + ad+l v+ ina+Gyglt+G+h
NCBI__GCF_003217355.1:WP_110805920.1  850 AGRILHRTAAP-----AGGHFVAPALLKVTGIADLSHEIFGPVLHLATFAADDLPAVIAAINARGYGLTFGLH 917
                                          55666655444.....589****************************************************** PP

                           TIGR01238  440 srieetvrqiekrakvGnvyvnrnlvGavvGvqpfGGeGlsGtGpkaGGplylyrlt 496
                                          sri+ v + + +++Gn+yvnrn++GavvG qpfGGeGlsGtGpkaGGplyl r+
NCBI__GCF_003217355.1:WP_110805920.1  918 SRIDARVETVAETIRAGNIYVNRNQIGAVVGSQPFGGEGLSGTGPKAGGPLYLGRFY 974
                                          *****************************************************9985 PP

  == domain 2  score: 2.8 bits;  conditional E-value: 0.0017
                           TIGR01238  221 qllpGrGedvGaaltsderiaGviftGstevarlinkalakredapvplia 271
                                          q + G+ al + e +a v++ G +e+ r +ala r + vpli+
NCBI__GCF_003217355.1:WP_110805920.1 1047 QAVQATGAVAAKALGTAEPLAAVLWWGEAEMGRAYAQALAARPGPLVPLIT 1097
                                          677788998889************************************996 PP

Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (500 nodes)
Target sequences:                          1  (1127 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.01u 0.00s 00:00:00.01 Elapsed: 00:00:00.01
# Mc/sec: 54.67
//
[ok]
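To see how much of the 500-node model each reported domain spans, the two domain-table rows above can be parsed and the HMM coverage computed. This is a minimal sketch, assuming whitespace-separated fields in the order shown in the hmmsearch domain table; the `DomainHit` class and `parse_domain_row` function are hypothetical names, not part of HMMER or GapMind.

```python
from dataclasses import dataclass

MODEL_LENGTH = 500  # "[M=500]" in the hmmsearch header above

@dataclass
class DomainHit:
    score: float
    i_evalue: float
    hmm_from: int
    hmm_to: int
    ali_from: int
    ali_to: int

    def hmm_coverage(self, model_length: int = MODEL_LENGTH) -> float:
        """Fraction of the model's nodes spanned by this domain alignment."""
        return (self.hmm_to - self.hmm_from + 1) / model_length

def parse_domain_row(row: str) -> DomainHit:
    """Parse one row of the per-domain table (fields split on whitespace)."""
    f = row.split()
    # f: number, '!', score, bias, c-Evalue, i-Evalue, hmmfrom, hmm to, '..',
    #    alifrom, ali to, '..', envfrom, env to, '..', acc
    return DomainHit(
        score=float(f[2]),
        i_evalue=float(f[5]),
        hmm_from=int(f[6]),
        hmm_to=int(f[7]),
        ali_from=int(f[9]),
        ali_to=int(f[10]),
    )

# Rows copied from the domain table above.
rows = [
    "1 ! 600.0 3.8 1.6e-184 1.6e-184 2 496 .. 502 974 .. 501 977 .. 0.94",
    "2 ! 2.8 0.5 0.0017 0.0017 221 271 .. 1047 1097 .. 1033 1118 .. 0.86",
]
hits = [parse_domain_row(r) for r in rows]
for h in hits:
    print(f"score {h.score}: HMM coverage {h.hmm_coverage():.1%}, "
          f"target residues {h.ali_from}-{h.ali_to}")
```

Domain 1 spans nearly the whole model, while domain 2 covers only a small part of it, which is consistent with its much lower score.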
This GapMind analysis is from Sep 24 2021. The underlying query database was built on Sep 17 2021.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other blast hits with at least 50% coverage are "low confidence."
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
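The definition of a gap given above (a step with no high- or medium-confidence candidate) can be sketched as a small function. This is not GapMind's code: the step names and the input structure are hypothetical, and the sketch assumes each candidate has already been assigned a confidence label.

```python
from typing import Dict, List

# Confidence levels that keep a step from being counted as a gap.
SUFFICIENT = {"high", "medium"}

def find_gaps(candidates_by_step: Dict[str, List[str]]) -> List[str]:
    """Return the steps that have no high- or medium-confidence candidate.

    candidates_by_step maps a step name to the confidence labels
    ("high", "medium", "low") of its candidates; an empty list means
    no candidate was found at all.
    """
    return [
        step
        for step, levels in candidates_by_step.items()
        if not SUFFICIENT.intersection(levels)
    ]

# Hypothetical pathway with three steps.
example = {
    "stepA": ["high", "low"],
    "stepB": ["medium"],
    "stepC": ["low"],   # only a low-confidence candidate, so a gap
    "stepD": [],        # no candidates at all, so also a gap
}
print(find_gaps(example))
```

A step with only low-confidence candidates is treated the same as a step with none, matching the wording "no high- or medium-confidence candidates".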
For more information, see:
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory