Align L-glutamate gamma-semialdehyde dehydrogenase (EC 1.2.1.88); Proline dehydrogenase (EC 1.5.5.2) (characterized)
to candidate WP_084610117.1 A3GO_RS0114220 bifunctional proline dehydrogenase/L-glutamate gamma-semialdehyde dehydrogenase PutA
Query= reanno::ANA3:7023590
         (1064 letters)

>NCBI__GCF_000428045.1:WP_084610117.1
          Length = 1032

 Score = 1240 bits (3208), Expect = 0.0
 Identities = 637/1031 (61%), Positives = 781/1031 (75%), Gaps = 4/1031 (0%)

Query: 33   YIVDEEQYLSELIKLVPSSDEAIERVTRRAHELVNKVRQFDKKGLMVGIDAFLQQYSLET 92
            Y VDE L LI L + E +++ TR + ELV VR D M IDA L +YSL+T
Sbjct: 2    YCVDESSLLETLIPLARPTPELLKKTTRESAELVRAVRARDDAMHM--IDALLLEYSLDT 59

Query: 93   QEGIILMCLAEALLRIPDAATADALIEDKLSGAKWDEHLSKSDSVLVNASTWGLMLTGKI 152
            +EG++LMCLAE+L+RIPD+ATADALI+DKL A W +HL +S S+LVNASTWGL+LTGK+
Sbjct: 60   REGVLLMCLAESLMRIPDSATADALIKDKLGIADWKQHLKQSKSLLVNASTWGLLLTGKV 119

Query: 153  VKLDKKIDGTPSNLLSRLVNRLGEPVIRQAMMAAMKIMGKQFVLGRTMKEALKNSEDKRK 212
            + LD I+ P+++++RLVNRL EPV+RQAM AMKIMG+QFVLGRT+ EAL NS +R+
Sbjct: 120  ITLDAGIEDEPASVINRLVNRLSEPVVRQAMHQAMKIMGRQFVLGRTITEALANSRKQRE 179

Query: 213  LGYTHSYDMLGEAALTRKDAEKYFNDYANAITELGAQSYNENESPRPTISIKLSALHPRY 272
            GY++SYDMLGEAALT +DAE YF Y+ AI +G+ ++ + S R T+SIKLSALHPRY
Sbjct: 180  AGYSYSYDMLGEAALTDRDAEHYFEAYSGAIKAIGSDTFPSDTSFRSTVSIKLSALHPRY 239

Query: 273  EVANEDRVLTELYDTVIRLIKLARGLNIGISIDAEEVDRLELSLKLFQKLFNADATKGWG 332
            E E RV++EL D ++ LIKLAR L++GI+IDAEE DRLELSLKLF++L+ ++GWG
Sbjct: 240  EQQQERRVISELGDRMLALIKLARELDVGITIDAEEADRLELSLKLFERLYRDQISRGWG 299

Query: 333  LLGIVVQAYSKRALPVLVWLTRLAKEQGDEIPVRLVKGAYWDSELKWAQQAGEAAYPLYT 392
            LG+VVQAY KRALPVL WL LA +QGD IP+RL KGAYWDSE+K AQQ G +YP++T
Sbjct: 300  NLGLVVQAYQKRALPVLFWLAALAGKQGDLIPLRLAKGAYWDSEIKHAQQLGLESYPVFT 359

Query: 393  RKAGTDVSYLACARYLLSDATRGAIYPQFASHNAQTVAAISDMAGDRNHEFQRLHGMGQE 452
            RK TDVSYLACA +LLSD +G IYPQFASHNA TVA++ MA +EFQRLHGMG
Sbjct: 360  RKEATDVSYLACASFLLSDQLKGLIYPQFASHNAHTVASVVSMATHDQYEFQRLHGMGDA 419

Query: 453  LYDTILSEAGAKAVRIYAPIGAHKDLLPYLVRRLLENGANTSFVHKLVDPKTPIESLVVH 512
            LY+ ++ E A VRIYAP+G+HKDLLPYLVRRLLENGAN+SFVH+L+D KTP+ESL+ H
Sbjct: 420  LYEAVI-ERYASRVRIYAPVGSHKDLLPYLVRRLLENGANSSFVHRLIDQKTPVESLIEH 478

Query: 513  PLKTLTGYKTLANNKIVLPTDIFGSDRKNSKGLNMNIISEAEPFFAALDKFKSTQWQAGP 572
            PL L Y +L+N +I LP DI+G R NS G+N+N+ + P + F W A P
Sbjct: 479  PLTKLCQYSSLSNERIPLPPDIYGKGRINSCGVNINVADQWLPLQREVQAFFRQSWHAAP 538

Query: 573  LVNGQTL-TGEHKTVVSPFDTTQTVGQVAFADKAAIEQAVASADAAFATWTRTPVEVRAS 631
            L+ G+ + +GE V +P D VG + D+ + QA+A A AFA W TPV R+
Sbjct: 539  LIKGKRIESGEVVQVHAPHDRVIHVGSLIQTDEQTVGQAIAVACEAFAGWNATPVAKRSV 598

Query: 632  ALQKLADLLEENREELIALCTREAGKSIQDGIDEVREAVDFCRYYAVQAKKLMSKPELLP 691
            AL++ ADLLEE+R ELIALC EAGK+IQD IDEVREAVDFCRYYA QA+ +LL
Sbjct: 599  ALERAADLLEEHRAELIALCHLEAGKTIQDAIDEVREAVDFCRYYAWQARLHFDSAQLLD 658

Query: 692  GPTGELNELFLQGRGVFVCISPWNFPLAIFLGQVSAALAAGNTVVAKPAEQTSIIGYRAV 751
            GPTGE NEL+LQGRG+FVCISPWNFPLAIF GQV AALAAGN+V+AKPAEQTS+I RAV
Sbjct: 659  GPTGEANELYLQGRGLFVCISPWNFPLAIFTGQVMAALAAGNSVIAKPAEQTSLIAARAV 718

Query: 752  QLAHQAGIPTDVLQYLPGTGATVGNALTADERIGGVCFTGSTGTAKLINRTLANREGAII 811
            +L HQAGIP VL LPG G+ +G LTADERI GV FTGST TA+LINR+LA R G +
Sbjct: 719  ELFHQAGIPDSVLHLLPGEGSRIGPLLTADERIAGVAFTGSTETARLINRSLAMRNGPVA 778

Query: 812  PLIAETGGQNAMVVDSTSQPEQVVNDVVSSSFTSAGQRCSALRVLFLQEDIADRVIDVLQ 871
            L+AETGGQNAMVVDST+ PEQV+ DV+SS+F SAGQRCSALRVLFLQ+D+A+R++ +L+
Sbjct: 779  TLVAETGGQNAMVVDSTALPEQVIKDVISSAFASAGQRCSALRVLFLQDDVAERILALLK 838

Query: 872  GAMDELVIGNPSSVKTDVGPVIDATAKANLDAHIDHIKQVGKLIKQMSLPAGTENGHFVS 931
            GAM EL +G+P + TDVGPVID A+ L HI +++ LI + LP E G FV+
Sbjct: 839  GAMAELKVGDPCELDTDVGPVIDEEARRTLQRHIAEMREQATLIAETPLPLNAEAGCFVA 898

Query: 932  PTAVEIDSIKVLEKEHFGPILHVIRYKASELAHVIDEINSTGFGLTLGIHSRNEGHALEV 991
            P A EI + L KEHFGP+LHVIR+KAS L VI +IN++G+GLTLGIHSRNE + +
Sbjct: 899  PVAFEIQHMDQLSKEHFGPVLHVIRFKASRLDDVILQINNSGYGLTLGIHSRNETTSRYI 958

Query: 992  ADKVNVGNVYINRNQIGAVVGVQPFGGQGLSGTGPKAGGPHYLTRFVTEKTRTNNITAIG 1051
            +V VGNVYINRNQIGAVVGVQPFGG GLSGTGPKAGGPHYL RF TE+TRT N TAIG
Sbjct: 959  DTRVRVGNVYINRNQIGAVVGVQPFGGLGLSGTGPKAGGPHYLLRFATERTRTTNTTAIG 1018

Query: 1052 GNATLLSLGDS 1062
            GNATLLSL +S
Sbjct: 1019 GNATLLSLENS 1029

Lambda     K      H
   0.317    0.133    0.377

Gapped
Lambda     K      H
   0.267   0.0410    0.140

Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 2422
Number of extensions: 91
Number of successful extensions: 4
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 1064
Length of database: 1032
Length adjustment: 45
Effective length of query: 1019
Effective length of database: 987
Effective search space: 1005753
Effective search space used: 1005753
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.6 bits)
S2: 58 (26.9 bits)
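The summary numbers in the BLAST report above can be cross-checked with a few lines of arithmetic. The sketch below is not part of GapMind; it assumes the standard Karlin-Altschul relations (bit score S' = (lambda * S - ln K) / ln 2 and E = m * n * 2^(-S')), and all function names are illustrative. Because lambda and K are printed to only three significant digits, the recomputed bit score is approximate.

```python
import math

# Numbers copied from the BLAST report above.
QUERY_LEN, SUBJECT_LEN = 1064, 1032
LENGTH_ADJUSTMENT = 45
RAW_SCORE = 3208
GAPPED_LAMBDA, GAPPED_K = 0.267, 0.0410
IDENTITIES, POSITIVES, ALIGNMENT_LEN = 637, 781, 1031
QUERY_START, QUERY_END = 33, 1062


def effective_search_space(query_len, db_len, adjustment):
    """Product of the effective (length-adjusted) query and database lengths."""
    return (query_len - adjustment) * (db_len - adjustment)


def bit_score(raw_score, lam, k):
    """Convert a raw alignment score to bits using the gapped lambda and K."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def log10_expect(bits, search_space):
    """log10 of the E-value; working in logs avoids floating-point underflow."""
    return math.log10(search_space) - bits * math.log10(2)


space = effective_search_space(QUERY_LEN, SUBJECT_LEN, LENGTH_ADJUSTMENT)
bits = bit_score(RAW_SCORE, GAPPED_LAMBDA, GAPPED_K)
log_e = log10_expect(bits, space)

# The report shows 61% and 75%, which matches integer truncation.
identity_pct = 100 * IDENTITIES // ALIGNMENT_LEN
positive_pct = 100 * POSITIVES // ALIGNMENT_LEN
query_coverage = (QUERY_END - QUERY_START + 1) / QUERY_LEN

print(f"effective search space: {space}")
print(f"bit score: {bits:.1f}")
print(f"log10(E): {log_e:.1f}")
print(f"identity: {identity_pct}%  positives: {positive_pct}%")
print(f"query coverage: {query_coverage:.1%}")
```

The E-value is far below the smallest representable double, which is consistent with the report printing "Expect = 0.0".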
Align candidate WP_084610117.1 A3GO_RS0114220 (bifunctional proline dehydrogenase/L-glutamate gamma-semialdehyde dehydrogenase PutA)
to HMM TIGR01238 (delta-1-pyrroline-5-carboxylate dehydrogenase (EC 1.2.1.88))
# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR01238.hmm
# target sequence database:        /tmp/gapView.20165.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR01238  [M=500]
Accession:   TIGR01238
Description: D1pyr5carbox3: delta-1-pyrroline-5-carboxylate dehydrogenase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                                 Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                                 -----------
   1.2e-222  725.8   0.4   1.8e-222  725.1   0.4    1.3  1  lcl|NCBI__GCF_000428045.1:WP_084610117.1 A3GO_RS0114220 bifunctional prol

Domain annotation for each sequence (and alignments):
>> lcl|NCBI__GCF_000428045.1:WP_084610117.1  A3GO_RS0114220 bifunctional proline dehydrogenase/L-glutamate gamma-semiald
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  725.1   0.4  1.8e-222  1.8e-222       1     497 [.     499    1006 ..     499    1009 .. 0.98

  Alignments for each domain:
  == domain 1  score: 725.1 bits;  conditional E-value: 1.8e-222
                                 TIGR01238   1 dlygegrknslGvdlaneselksleeqllkaaakkfqaapivgekakaegeaqpvknpadrkdivGq 67
                                               d+yg+gr ns Gv++++ ++ +l+ ++++ ++++aap++++k ++ge+ v +p dr + vG
  lcl|NCBI__GCF_000428045.1:WP_084610117.1 499 DIYGKGRINSCGVNINVADQWLPLQREVQAFFRQSWHAAPLIKGKRIESGEVVQVHAPHDRVIHVGS 565
                                               89***************************************************************** PP

                                 TIGR01238  68 vseadaaevqeavdsavaafaewsatdakeraailerladlleshmpelvallvreaGktlsnaiae 134
                                               + ++d+++v +a+ a +afa w+at+ ++r+ ler+adlle+h el+al++ eaGkt+++ai+e
  lcl|NCBI__GCF_000428045.1:WP_084610117.1 566 LIQTDEQTVGQAIAVACEAFAGWNATPVAKRSVALERAADLLEEHRAELIALCHLEAGKTIQDAIDE 632
                                               ******************************************************************* PP

                                 TIGR01238 135 vreavdflryyakqvedvldeesaka.............lGavvcispwnfplaiftGqiaaalaaG 188
                                               vreavdf+ryya q++ +d +G +vcispwnfplaiftGq+ aalaaG
  lcl|NCBI__GCF_000428045.1:WP_084610117.1 633 VREAVDFCRYYAWQARLHFDSAQLLDgptgeanelylqgRGLFVCISPWNFPLAIFTGQVMAALAAG 699
                                               ***************99988875555789999*********************************** PP

                                 TIGR01238 189 ntviakpaeqtsliaaravellqeaGvpagviqllpGrGedvGaaltsderiaGviftGstevarli 255
                                               n+viakpaeqtsliaaravel+++aG+p +v+ llpG G +G lt+deriaGv+ftGste+arli
  lcl|NCBI__GCF_000428045.1:WP_084610117.1 700 NSVIAKPAEQTSLIAARAVELFHQAGIPDSVLHLLPGEGSRIGPLLTADERIAGVAFTGSTETARLI 766
                                               ******************************************************************* PP

                                 TIGR01238 256 nkalakredapvpliaetGGqnamivdstalaeqvvadvlasafdsaGqrcsalrvlcvqedvadrv 322
                                               n++la r+ + ++l+aetGGqnam+vdstal+eqv++dv++saf+saGqrcsalrvl++q+dva+r+
  lcl|NCBI__GCF_000428045.1:WP_084610117.1 767 NRSLAMRNGPVATLVAETGGQNAMVVDSTALPEQVIKDVISSAFASAGQRCSALRVLFLQDDVAERI 833
                                               ******************************************************************* PP

                                 TIGR01238 323 ltlikGamdelkvgkpirlttdvGpvidaeakqnllahiekmkakakkvaqvkleddvesekgtfva 389
                                               l l+kGam elkvg p +l tdvGpvid+ea++ l++hi +m+++a +a+ l ++e g fva
  lcl|NCBI__GCF_000428045.1:WP_084610117.1 834 LALLKGAMAELKVGDPCELDTDVGPVIDEEARRTLQRHIAEMREQATLIAETPLPL--NAEAGCFVA 898
                                               *************************************************9998877..8999***** PP

                                 TIGR01238 390 ptlfelddldelkkevfGpvlhvvrykadeldkvvdkinakGygltlGvhsrieetvrqiekrakvG 456
                                               p++fe++++d+l+ke fGpvlhv+r+ka++ld+v+ +in++GygltlG+hsr+e+t r+i++r++vG
  lcl|NCBI__GCF_000428045.1:WP_084610117.1 899 PVAFEIQHMDQLSKEHFGPVLHVIRFKASRLDDVILQINNSGYGLTLGIHSRNETTSRYIDTRVRVG 965
                                               ******************************************************************* PP

                                 TIGR01238 457 nvyvnrnlvGavvGvqpfGGeGlsGtGpkaGGplylyrltr 497
                                               nvy+nrn++GavvGvqpfGG GlsGtGpkaGGp+yl r+
  lcl|NCBI__GCF_000428045.1:WP_084610117.1 966 NVYINRNQIGAVVGVQPFGGLGLSGTGPKAGGPHYLLRFAT 1006
                                               **************************************976 PP

Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (500 nodes)
Target sequences:                          1  (1032 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.01u 0.01s 00:00:00.02 Elapsed: 00:00:00.02
# Mc/sec: 19.75
//
[ok]
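To read the hmmsearch domain row above programmatically, a minimal sketch is shown below. It is not GapMind code: the class and function names are hypothetical, and it assumes the whitespace-separated column order of the domain table in this report (#, flag, score, bias, c-Evalue, i-Evalue, hmmfrom, hmm to, bounds, alifrom, ali to, bounds, envfrom, env to, bounds, acc). It reports how much of the 500-node model and of the 1032-residue candidate the hit spans.

```python
from dataclasses import dataclass


@dataclass
class DomainHit:
    score: float
    i_evalue: float
    hmm_from: int
    hmm_to: int
    ali_from: int
    ali_to: int
    acc: float


def parse_domain_row(row: str) -> DomainHit:
    """Parse one row of the per-domain table, splitting on whitespace."""
    f = row.split()
    return DomainHit(
        score=float(f[2]),
        i_evalue=float(f[5]),
        hmm_from=int(f[6]),
        hmm_to=int(f[7]),
        ali_from=int(f[9]),
        ali_to=int(f[10]),
        acc=float(f[15]),
    )


def model_coverage(hit: DomainHit, model_len: int) -> float:
    """Fraction of the HMM's match states spanned by the alignment."""
    return (hit.hmm_to - hit.hmm_from + 1) / model_len


def target_span(hit: DomainHit) -> int:
    """Number of target residues between alifrom and ali to, inclusive."""
    return hit.ali_to - hit.ali_from + 1


# Row and lengths copied from the report above (M=500, 1032 residues).
ROW = "1 ! 725.1 0.4 1.8e-222 1.8e-222 1 497 [. 499 1006 .. 499 1009 .. 0.98"
MODEL_LEN, TARGET_LEN = 500, 1032

hit = parse_domain_row(ROW)
coverage = model_coverage(hit, MODEL_LEN)
span = target_span(hit)

print(f"model coverage: {coverage:.1%}")
print(f"target residues {hit.ali_from}-{hit.ali_to} ({span} of {TARGET_LEN})")
```

The hit spans nearly the whole model but only residues 499-1006 of the candidate, that is, roughly its C-terminal half, whereas the BLAST alignment covers residues 2-1029.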
This GapMind analysis is from Sep 24 2021. The underlying query database was built on Sep 17 2021.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other blast hits with at least 50% coverage are "low confidence."
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
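The definition of a gap given above (a step with no high- or medium-confidence candidate) can be sketched as follows. This is an illustration only, not GapMind's implementation: the step names, the input layout (a mapping from step name to the confidence labels of its candidates), and the function names are all hypothetical.

```python
# Ordering of the three confidence labels used in the text.
RANK = {"low": 0, "medium": 1, "high": 2}


def best_confidence(candidates):
    """Return the best label among a step's candidates, or None if there are none."""
    return max(candidates, key=RANK.__getitem__, default=None)


def find_gaps(steps):
    """Steps whose best candidate is below medium confidence (or missing)."""
    gaps = []
    for step, candidates in steps.items():
        best = best_confidence(candidates)
        if best is None or RANK[best] < RANK["medium"]:
            gaps.append(step)
    return gaps


def pathway_is_complete(steps, level):
    """True if every step has a candidate at or above the given level."""
    return all(
        (best := best_confidence(c)) is not None and RANK[best] >= RANK[level]
        for c in steps.values()
    )


# Hypothetical pathway with four steps.
example = {
    "step_a": ["high", "low"],
    "step_b": ["medium"],
    "step_c": ["low"],
    "transporter": [],
}

print(find_gaps(example))                      # ['step_c', 'transporter']
print(pathway_is_complete(example, "medium"))  # False
```

Under this reading, a pathway counts as complete at medium confidence only when `find_gaps` returns an empty list.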
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory