Align L-glutamate gamma-semialdehyde dehydrogenase (EC 1.2.1.88); Proline dehydrogenase (EC 1.5.5.2) (characterized)
to candidate WP_028999944.1 H537_RS0124445 trifunctional transcriptional regulator/proline dehydrogenase/L-glutamate gamma-semialdehyde dehydrogenase
Query= reanno::acidovorax_3H11:Ac3H11_2850
         (1261 letters)

>NCBI__GCF_000430725.1:WP_028999944.1
          Length = 1248

 Score = 1696 bits (4392), Expect = 0.0
 Identities = 886/1258 (70%), Positives = 997/1258 (79%), Gaps = 20/1258 (1%)

Query: 6    APFADFAPRTPLANPLRAAITAAITAATRHPEPEALAPLLAQARLPADQAAAAEQLALRI 65
            APFA+F P + LR AIT A R PE EAL L ARLPA A+ LA RI
Sbjct: 9    APFAEFLHPAPYSAELRQAITEA----WRKPEVEALPMLAEMARLPAPLKEQAQALAARI 64

Query: 66   AKALRERKASAGRAGIVQGLLQEFSLSSQEGVALMCLAEALLRIPDKATRDALIRDKISH 125
            A LR+RK SAGRAG+VQGLLQE++LSSQEGVALMCLAEALLRIPD+ TRDALIRDKI+
Sbjct: 65   ATTLRDRKPSAGRAGLVQGLLQEYALSSQEGVALMCLAEALLRIPDRETRDALIRDKIAR 124

Query: 126  GQWDAHLGKSPSLFVNAATWGLLITGKLVATHSEGSLGNSLSRLIGKGGEPLIRKGVDMA 185
            GQW HLG+SPSLFVNAATWGLLITG+L ATHSE L ++L R++ GGEPLIRK VDMA
Sbjct: 125  GQWHTHLGRSPSLFVNAATWGLLITGRLTATHSESGLSSALGRMLAVGGEPLIRKSVDMA 184

Query: 186  MRMMGEQFVTGETIDEALRNARTMEAEGFRYSYDMLGEAALTSEDAKRYYSSYEQAIHAI 245
            MR+MGEQFVTGETID+AL NAR EAEGFRYSYDMLGEAALT++DA+RY +SYE+AIHAI
Sbjct: 185  MRVMGEQFVTGETIDQALANARVREAEGFRYSYDMLGEAALTAQDAQRYLASYERAIHAI 244

Query: 246  GKASAGRGIYEGPGISIKLSALHPRYSRAQFGRVMDELYPLVLRLTALAKQYDIGLNIDA 305
            GKASAGRGIYEGPGISIKLSALHPRYSRAQ RV+DELYP++LRL LAK+YDIGLNIDA
Sbjct: 245  GKASAGRGIYEGPGISIKLSALHPRYSRAQLDRVLDELYPVLLRLALLAKRYDIGLNIDA 304

Query: 306  EETDRLELSLDLLERLCHEPTLAGWNGIGFVIQAYQKRCPFVIDCVVDLARRTQRRLMVR 365
            EE DRLE+SLDLLERLC EP L GWNGIGFVIQAYQKRCP+VID +VDLARR++RRLM+R
Sbjct: 305  EEADRLEISLDLLERLCFEPALKGWNGIGFVIQAYQKRCPYVIDFIVDLARRSERRLMIR 364

Query: 366  LVKGAYWDSEIKRAQVDGLKDYPVYTRKVHTDISYIACAKKLLAAPEAVYPQFATHNAET 425
            LVKGAYWDSEIKRAQ+DG DYPVYTRK +TDI+YIACA+KLLAAP+ VYPQFATHNA T
Sbjct: 365  LVKGAYWDSEIKRAQLDGQLDYPVYTRKPYTDIAYIACARKLLAAPQQVYPQFATHNAHT 424

Query: 426  VATIYQLAG-SNYYAGQYEFQCLHGMGEPLYEQVVGAITAGKLGREIGKGGLGRPCRIYA 484
            +A IY LA + + GQYEFQCLHGMGEPLYEQVVG G G LGRPCR+YA
Sbjct: 425  LAAIYHLADPARWQPGQYEFQCLHGMGEPLYEQVVGT----------GAGKLGRPCRVYA 474

Query: 485  PVGTHETLLAYLVRRLLENGANTSFVNRIADETIALDELVKSPVQVVDQQAATEGTAGLP 544
            PVGTHETLLAYLVRRLLENGANTSFVNRIAD TI++ ELV+ PV V D+ A EG GLP
Sbjct: 475  PVGTHETLLAYLVRRLLENGANTSFVNRIADATISITELVRDPVDVADELARKEGRFGLP 534

Query: 545  HPRIPLPAALYGAHRSNSRGLDLSNENTLTELAATLQATASHAWTAAPLLAADVPAGTTQ 604
            HP IP P LYG R NSRG+DLSNE+ L +L A L+ +A AW+A PLLAA AG +
Sbjct: 535  HPAIPAPRDLYGPARPNSRGIDLSNEHELAKLQAALRQSAGEAWSAEPLLAAGPVAGERE 594

Query: 605  PVRNPADHNDVVGQVQEATTADVDQALVHAQAAATSWAATPPAERAAALLRTADLLEERI 664
            VRNPADH DVVG VQ A+ VD A HA AAA WAATPPA RA L R AD LEE +
Sbjct: 595  AVRNPADHGDVVGFVQNASAEHVDLAFSHAAAAAGRWAATPPAARADMLDRAADRLEEDM 654

Query: 665  QPLMGLLMREAGKSASNAVAEVREAVDFLRYYAAQVQSTFDNATHIPLGPVACISPWNFP 724
            LMGLL+REAGK+A+NA+AEVREAVDFLRYYA QV+ F+ TH+P+GPV CISPWNFP
Sbjct: 655  PRLMGLLIREAGKTAANAIAEVREAVDFLRYYARQVREDFEPDTHVPVGPVVCISPWNFP 714

Query: 725  LAIFMGQVAAALAAGNPVLAKPAEQTPLIAAEAVRLLWQAGVPRAAVQLLPGQGETVGAR 784
            LAIFMGQV+AALAAGNPVLAKPAEQTPLIAAEAVR+LW AGVPR +Q LPG GE VGAR
Sbjct: 715  LAIFMGQVSAALAAGNPVLAKPAEQTPLIAAEAVRVLWAAGVPRDVLQFLPGAGEVVGAR 774

Query: 785  LIGDARVMGVMFTGSTEVARILQRTVAGRLDAAGRPIPLIAETGGQNAMIVDSSALVEQV 844
            L+GDARV GV+FTGSTEVARILQR VAGRLDA GRPIPLIAETGGQNAMIVDSSAL EQV
Sbjct: 775  LVGDARVRGVLFTGSTEVARILQRAVAGRLDADGRPIPLIAETGGQNAMIVDSSALAEQV 834

Query: 845  VGDAVSSAFDSAGQRCSALRVLCVQEEAADRVVEMLQGAMGELRVGNPGELRVDVGPVID 904
            V D ++SAFDSAGQRCSALRVLCVQE+ ADR++EML GAM E R+G+P L VDVGPVID
Sbjct: 835  VTDVLASAFDSAGQRCSALRVLCVQEDVADRLIEMLLGAMAEWRIGSPDRLAVDVGPVID 894

Query: 905  AEAQAGIAQHIEKFKAQGHRVFQHPNHVSAISAPGTFVPPTLIELNHIGELQREVFGPVL 964
            AEA AG+ +H+E +A G RV Q + + GTF+ PT+IEL+ + ELQREVFGPVL
Sbjct: 895  AEALAGLQRHVEGMRASGRRVHQLGACDAGVLGRGTFMLPTVIELDQLSELQREVFGPVL 954

Query: 965  HLVRYARSDLDQLLDQINATGYGLTQGVHTRIDETIARVVNRAHAGNVYVNRNMVGAVVG 1024
            H++RY R DLD LL Q+NATGYGLT G+HTRIDETI+RV+ +HAGNVYVNRN+VGAVVG
Sbjct: 955  HVLRYQREDLDALLAQVNATGYGLTMGLHTRIDETISRVLKASHAGNVYVNRNIVGAVVG 1014

Query: 1025 VQPFGGEGLSGTGPKAGGPLYLLRLLSQRPADALARTFAEADRTSPHDTERRERHLAP-L 1083
            VQPFGGEGLSGTGPKAGGPLYL RLLS+RPAD + R F E D T D + P L
Sbjct: 1015 VQPFGGEGLSGTGPKAGGPLYLYRLLSRRPADVMTRLF-EPDATL--DLGPFAGQMPPAL 1071

Query: 1084 ATLQQWAHNQGNLALAGHCQRFAQETQSGTSRTLPGPTGERNVYTLAPRARVLCLAHSVD 1143
            L WA Q LA C RF Q ++SG S+TLPGPTGE+NVY LA R VLCLA S D
Sbjct: 1072 RALHDWAQQQHQTQLAQVCHRFWQRSRSGASQTLPGPTGEKNVYALAAREAVLCLAGSDD 1131

Query: 1144 DLLVQTAAVLASGGTALWPHAHAGLRAKLPTHVQAQVMLQDNTLSDGSVALDAVLHHGDA 1203
            D LVQ AAVLA GG +WP L KLP VQA V++ + S V+ DA L HG A
Sbjct: 1132 DRLVQLAAVLAVGGRCVWPVECEALMRKLPAAVQASVVIARDWASP-EVSFDAALFHGAA 1190

Query: 1204 PSLQAVCTTLARRPGPIVGVTALQPGAADIPLERLLIERALSVNTAAAGGNASLMTIG 1261
            P LQA+ LA RPGPIVG+ LQPG D+PLERL++E+A S+NTAAAGGNASLMTIG
Sbjct: 1191 PELQAIRQRLAERPGPIVGIERLQPGETDVPLERLVVEKATSINTAAAGGNASLMTIG 1248

Lambda     K      H
   0.318    0.133    0.387

Gapped
Lambda     K      H
   0.267   0.0410    0.140

Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 3901
Number of extensions: 171
Number of successful extensions: 7
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 1261
Length of database: 1248
Length adjustment: 48
Effective length of query: 1213
Effective length of database: 1200
Effective search space: 1455600
Effective search space used: 1455600
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 59 (27.3 bits)
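As a quick cross-check of the BLAST summary above, the following Python sketch recomputes the percentages and the effective search space from the counts printed in the report. The variable names are illustrative only, and the coverage calculation (aligned span divided by full sequence length) is an assumption about how coverage might be measured, not something the report states.

```python
# Counts copied from the BLAST report above.
identities, positives, gaps, aligned_columns = 886, 997, 20, 1258
query_length, subject_length, length_adjustment = 1261, 1248, 48
query_span = (6, 1261)    # first and last aligned query positions
subject_span = (9, 1248)  # first and last aligned subject positions


def percent(part, whole):
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


def span_length(span):
    """Number of residues in an inclusive, 1-based span."""
    return span[1] - span[0] + 1


pct_identity = percent(identities, aligned_columns)
pct_positive = percent(positives, aligned_columns)
pct_gaps = percent(gaps, aligned_columns)

# Effective lengths are the raw lengths minus the reported length adjustment.
effective_query = query_length - length_adjustment
effective_db = subject_length - length_adjustment
effective_search_space = effective_query * effective_db

# Assumption: coverage = aligned span / full sequence length.
query_coverage = percent(span_length(query_span), query_length)
subject_coverage = percent(span_length(subject_span), subject_length)

print(f"identity {pct_identity:.1f}%, positives {pct_positive:.1f}%, gaps {pct_gaps:.1f}%")
print(f"effective search space {effective_search_space}")
print(f"query coverage {query_coverage:.1f}%, subject coverage {subject_coverage:.1f}%")
```

For these particular counts, truncating the computed percentages to whole numbers gives the 70%, 79%, and 1% printed in the report, and the product of the two effective lengths matches the reported effective search space of 1455600.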
Align candidate WP_028999944.1 H537_RS0124445 (trifunctional transcriptional regulator/proline dehydrogenase/L-glutamate gamma-semialdehyde dehydrogenase)
to HMM TIGR01238 (delta-1-pyrroline-5-carboxylate dehydrogenase (EC 1.2.1.88))
# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR01238.hmm
# target sequence database:        /tmp/gapView.4105264.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR01238  [M=500]
Accession:   TIGR01238
Description: D1pyr5carbox3: delta-1-pyrroline-5-carboxylate dehydrogenase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                             Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                             -----------
   2.6e-229  747.7   0.3   3.8e-229  747.2   0.3    1.2  1  NCBI__GCF_000430725.1:WP_028999944.1

Domain annotation for each sequence (and alignments):
>> NCBI__GCF_000430725.1:WP_028999944.1
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  747.2   0.3  3.8e-229  3.8e-229       1     498 [.     543    1042 ..     543    1044 .. 0.99

  Alignments for each domain:
  == domain 1  score: 747.2 bits;  conditional E-value: 3.8e-229
                           TIGR01238    1 dlygegrknslGvdlaneselksleeqllkaaakkfqaapivgekakaegeaqpvknpadrkdivGqvsea 71
                                          dlyg +r ns G+dl+ne+el++l++ l+++a + + a p+++ ++ ge + v+npad+ d+vG v++a
NCBI__GCF_000430725.1:WP_028999944.1  543 DLYGPARPNSRGIDLSNEHELAKLQAALRQSAGEAWSAEPLLAAGPV-AGEREAVRNPADHGDVVGFVQNA 612
                                          89****************************************76666.5899******************* PP

                           TIGR01238   72 daaevqeavdsavaafaewsatdakeraailerladlleshmpelvallvreaGktlsnaiaevreavdfl 142
                                          +a++v+ a + a aa+ w at+++ ra +l+r+ad le++mp l++ll+reaGkt naiaevreavdfl
NCBI__GCF_000430725.1:WP_028999944.1  613 SAEHVDLAFSHAAAAAGRWAATPPAARADMLDRAADRLEEDMPRLMGLLIREAGKTAANAIAEVREAVDFL 683
                                          *********************************************************************** PP

                           TIGR01238  143 ryyakqvedvldeesakalGavvcispwnfplaiftGqiaaalaaGntviakpaeqtsliaaravellqea 213
                                          ryya+qv+++++ +++ ++G+vvcispwnfplaif+Gq++aalaaGn v+akpaeqt+liaa+av +l a
NCBI__GCF_000430725.1:WP_028999944.1  684 RYYARQVREDFEPDTHVPVGPVVCISPWNFPLAIFMGQVSAALAAGNPVLAKPAEQTPLIAAEAVRVLWAA 754
                                          *********************************************************************** PP

                           TIGR01238  214 GvpagviqllpGrGedvGaaltsderiaGviftGstevarlinkalakredap...vpliaetGGqnamiv 281
                                          Gvp v+q+lpG+Ge vGa l d+r++Gv+ftGstevar +++a+a r da+ +pliaetGGqnamiv
NCBI__GCF_000430725.1:WP_028999944.1  755 GVPRDVLQFLPGAGEVVGARLVGDARVRGVLFTGSTEVARILQRAVAGRLDADgrpIPLIAETGGQNAMIV 825
                                          ***************************************************97777*************** PP

                           TIGR01238  282 dstalaeqvvadvlasafdsaGqrcsalrvlcvqedvadrvltlikGamdelkvgkpirlttdvGpvidae 352
                                          ds+alaeqvv+dvlasafdsaGqrcsalrvlcvqedvadr+++++ Gam e ++g p rl dvGpvidae
NCBI__GCF_000430725.1:WP_028999944.1  826 DSSALAEQVVTDVLASAFDSAGQRCSALRVLCVQEDVADRLIEMLLGAMAEWRIGSPDRLAVDVGPVIDAE 896
                                          *********************************************************************** PP

                           TIGR01238  353 akqnllahiekmkakakkvaqvkleddvesekgtfvaptlfelddldelkkevfGpvlhvvrykadeldkv 423
                                          a l++h+e m+a +++v+q+ d +gtf+ pt++eld+l+el++evfGpvlhv+ry++++ld +
NCBI__GCF_000430725.1:WP_028999944.1  897 ALAGLQRHVEGMRASGRRVHQLGACDAGVLGRGTFMLPTVIELDQLSELQREVFGPVLHVLRYQREDLDAL 967
                                          ************************99999****************************************** PP

                           TIGR01238  424 vdkinakGygltlGvhsrieetvrqiekrakvGnvyvnrnlvGavvGvqpfGGeGlsGtGpkaGGplylyr 494
                                          + ++na+Gyglt+G+h+ri+et+ ++ k ++Gnvyvnrn+vGavvGvqpfGGeGlsGtGpkaGGplylyr
NCBI__GCF_000430725.1:WP_028999944.1  968 LAQVNATGYGLTMGLHTRIDETISRVLKASHAGNVYVNRNIVGAVVGVQPFGGEGLSGTGPKAGGPLYLYR 1038
                                          *********************************************************************** PP

                           TIGR01238  495 ltrv 498
                                          l++
NCBI__GCF_000430725.1:WP_028999944.1 1039 LLSR 1042
                                          *986 PP

Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (500 nodes)
Target sequences:                          1  (1248 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.01u 0.00s 00:00:00.01 Elapsed: 00:00:00.01
# Mc/sec: 55.59
//
[ok]
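To put the single domain hit reported above in context, this Python sketch works out how much of the HMM and how much of the candidate protein the alignment spans. All numbers are copied from the hmmsearch output; the variable names are illustrative and not part of HMMER.

```python
# Values copied from the hmmsearch report above.
model_length = 500            # "[M=500]"
target_length = 1248          # "1248 residues searched"
hmm_from, hmm_to = 1, 498     # aligned span on the model
ali_from, ali_to = 543, 1042  # aligned span on the candidate
env_from, env_to = 543, 1044  # envelope span on the candidate

# Spans are 1-based and inclusive.
hmm_residues = hmm_to - hmm_from + 1
ali_residues = ali_to - ali_from + 1
env_residues = env_to - env_from + 1

hmm_coverage = hmm_residues / model_length
target_fraction = ali_residues / target_length

print(f"HMM coverage: {hmm_coverage:.1%} ({hmm_residues} of {model_length} model positions)")
print(f"Aligned region: {ali_residues} residues, {target_fraction:.1%} of the candidate")
print(f"Envelope: {env_residues} residues")
```

The hit covers nearly the whole model but only about 40% of the 1248-residue candidate, which seems consistent with its annotation as a trifunctional protein in which the dehydrogenase is one region among several.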
This GapMind analysis is from Apr 09 2024. The underlying query database was built on Sep 17 2021.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other blast hits with at least 50% coverage are "low confidence."
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
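The definition of a gap given earlier (a step with no high- or medium-confidence candidates) can be sketched in Python as follows. This is a minimal illustration, not GapMind's own code: the function name, the data layout, and the step names are hypothetical.

```python
def find_gaps(step_candidates):
    """Return the steps that have no high- or medium-confidence candidate.

    step_candidates maps a step name to the list of confidence labels
    ("high", "medium", or "low") of its candidates. This layout is an
    assumption made for the example.
    """
    return [
        step
        for step, levels in step_candidates.items()
        if not any(level in ("high", "medium") for level in levels)
    ]


# Hypothetical pathway with four steps.
example_pathway = {
    "stepA": ["high", "low"],
    "stepB": ["low"],        # only a low-confidence candidate: a gap
    "stepC": [],             # no candidates at all: a gap
    "stepD": ["medium"],
}

gaps = find_gaps(example_pathway)
print(gaps)
```

Under this definition a step with only low-confidence candidates is treated the same as a step with no candidates.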
For more information, see:
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory