Align L-glutamate gamma-semialdehyde dehydrogenase (EC 1.2.1.88); Proline dehydrogenase (EC 1.5.5.2) (characterized)
to candidate WP_051243110.1 H537_RS44845 trifunctional transcriptional regulator/proline dehydrogenase/L-glutamate gamma-semialdehyde dehydrogenase
Query= reanno::Cup4G11:RR42_RS20125
         (1333 letters)

>NCBI__GCF_000430725.1:WP_051243110.1
          Length = 1249

 Score = 1447 bits (3745), Expect = 0.0
 Identities = 778/1259 (61%), Positives = 927/1259 (73%), Gaps = 19/1259 (1%)

Query: 81   PFLEFAQSVQPQSVLRAAITAAYRRPESECVPVLLEQARLPHQQAEAALAMARTLATRLR 140
            PFL+FA + PQSVLR+AITAA RRPE E V +LLE ARLP QA + LA+A +LA RLR
Sbjct: 4    PFLDFAGELLPQSVLRSAITAACRRPEPEAVAMLLESARLPPGQAVSVLALAASLARRLR 63

Query: 141  ERKVG--TGREGLVQGLIQEFSLSSQEGVALMCLAEALLRIPDKATRDALIRDKISGANW 198
            ER G GREGLVQGL++EFSL+SQEGVALMCLAEALLRIPD ATRDALIRDK+ +W
Sbjct: 64   ERGAGLTAGREGLVQGLMREFSLASQEGVALMCLAEALLRIPDDATRDALIRDKLGDGDW 123

Query: 199  QSHLGQSPSVFVNAATWGLLFTGKLVATH-TEAGLSKALTRIIGKGGEPLIRKGVDMAMR 257
            +HLG+S S+FVNAATWGLL G++ AT EAGL AL+R++ + GEPL+RKGVD+AMR
Sbjct: 124  GAHLGRSGSLFVNAATWGLLLGGRMAATQQAEAGLGSALSRVLARSGEPLLRKGVDLAMR 183

Query: 258  LMGEQFVTGETISEALANARKYEAEGFRYSYDMLGEAAMTEADAQRYLASYEQAINAIGQ 317
            L+GEQFV G+TI EALA ARK E++GF YS+DMLGEAA+T DAQRYL++YE AI+A+G
Sbjct: 184  LLGEQFVCGQTIGEALARARKRESQGFTYSFDMLGEAALTAEDAQRYLSAYEHAIHALGI 243

Query: 318  ASRGRGIYEGPGISIKLSALHPRYSRAQHERVIGELYGRLKSLTLLARQYDIGINIDAEE 377
            A GR ++ GPGISIKLSALHPRY+R+QH RV+ ELY RL L LLAR + IG++IDAEE
Sbjct: 244  ALAGRDLHAGPGISIKLSALHPRYTRSQHGRVMAELYPRLLQLALLARHHGIGLSIDAEE 303

Query: 378  ADRLEISLDLLERLCFEPELAGWNGIGFVVQGYQKRCPFVIDYLIDLARRSRHRLMIRLV 437
            ADRLE+SLDLL+RLC E LAGW+G+G VQ YQKRCP V+D+ IDLARRSR RLM+RLV
Sbjct: 304  ADRLELSLDLLQRLCGEHMLAGWSGLGLAVQAYQKRCPHVLDFCIDLARRSRRRLMLRLV 363

Query: 438  KGAYWDSEIKRAQVDGLEGYPVYTRKVYTDVSYVACARKLLSVPDVIYPQFATHNAHTLA 497
            KGAYWDSEIKRAQ+DGLE Y VYTRK +TDV+Y+ACAR+LL+ PD +YPQFATHNAHT+A
Sbjct: 364  KGAYWDSEIKRAQIDGLEDYAVYTRKAHTDVAYLACARRLLAAPDAVYPQFATHNAHTVA 423

Query: 498  AIYQIAGHNYYPGQYEFQCLHGMGEPLYDQVVGPLADGKFNRPCRIYAPVGTHETLLAYL 557
            AI +AG + PG+YEFQCLHGMGEPLY+ VV P +G PCRIYAPVGTHETLLAYL
Sbjct: 424  AIQHLAG-AWTPGRYEFQCLHGMGEPLYEMVVAPPIEGGLGLPCRIYAPVGTHETLLAYL 482

Query: 558  VRRLLENGANTSFVNRIADDTISLDELVADPVAVVEQMHADEGALGLPHPRIAQPRTLYG 617
            VRRLLENGANTSFVNRIAD +I ++ LV DPVA VE+ EG LGLPHP I PR L+G
Sbjct: 483  VRRLLENGANTSFVNRIADASIPVEALVEDPVAQVERAARAEGTLGLPHPAIPLPRALFG 542

Query: 618  ESRANSAGIDLSNEHRLASLSSALLAGTSEAVSAVPLLGTEAAAGEDVNQPAPVRNPSDQ 677
            R NSAGIDL+NEHRLASLS+ALL G +A +AVPL+ G+ P PV NP+D+
Sbjct: 543  ALRPNSAGIDLANEHRLASLSAALLHGARQAWTAVPLVAGLPRPGD---LPQPVLNPADR 599

Query: 678  RDVVGHVTEASMAEVEAALQAAVNAAPIWQATPADVRAAALERAAELMEAQMQSLMGIIV 737
            D VG V EA E+E AL AA P W ATP RAA LER A+ +E Q+Q+L+G+IV
Sbjct: 600  SDRVGTVREARADEMEDALSAAAAVQPAWGATPPAERAALLERGADALEDQLQTLVGLIV 659

Query: 738  REAGKTFSNAIAEVREAVDFLRYYAAQVRETFSSDTHRPLGPVVCISPWNFPLAIFTGQV 797
            REAGKT A+ EVREAVD LRY A Q R + + RPLGPV CISPWNFPLAIFTGQ+
Sbjct: 660  REAGKTVPAAVGEVREAVDALRYAALQARTALDAGS-RPLGPVACISPWNFPLAIFTGQL 718

Query: 798  AAALAAGNTVLAKPAEQTPLIAAQAVRLLREAGVPAGAVQLLPGRGETVGAALVGDARVK 857
            AAALAAGN V+AKPA QTPL+AA+AVRLL AGVP +QLLPG GE+VG L DARV+
Sbjct: 719  AAALAAGNAVVAKPARQTPLVAAEAVRLLHAAGVPGAVLQLLPGPGESVGLRLARDARVR 778

Query: 858  GVMFTGSTEVARLLQRSVAGRLDAAGRPVPLIAETGGQNAMIVDSSALAEQVVGDVVNSA 917
            GV+FTGST VAR LQ +A RLDA G P L+AETGG NA+I DSSALAEQ+V DV+ SA
Sbjct: 779  GVLFTGSTAVARRLQAELALRLDARGVPPLLVAETGGLNALIADSSALAEQLVPDVLASA 838

Query: 918  FDSAGQRCSALRVLCLQEEVADRVLEMLKGAMDELTMGNPDRLSTDVGPVIDEEARGNIV 977
            FDSAGQRCSALRVLCLQ+E+A+ VL ML+GA+ EL +G PD L+TDVGPVID+ AR +
Sbjct: 839  FDSAGQRCSALRVLCLQQEIAEPVLRMLQGALAELCVGRPDALATDVGPVIDDTARDTVE 898

Query: 978  RHIDAMRAKGRRVHQADPNGALSAACRNGTFVSPTLIELDSIEELQREVFGPVLHVVRYP 1037
            +H+ M+A G RV + + L +G+FV+PT++EL+S+ +L EVFGPVLHV+R+
Sbjct: 899  KHVLHMQALGLRVTRQPLSEELR---EHGSFVAPTVVELESLSQLPGEVFGPVLHVLRWR 955

Query: 1038 RAGLDTLLAQINGTGYGLTMGIHTRIDETIEHIVERAEVGNLYVNRNIVGAVVGVQPFGG 1097
            R LD LL +I TGY LT+G+HTRIDETI + RA GN YVNRN++GAVVGVQPFGG
Sbjct: 956  RGELDGLLQRIEATGYALTLGLHTRIDETIALVTARARAGNQYVNRNMIGAVVGVQPFGG 1015

Query: 1098 EGLSGTGPKAGGPLYLHRLLSVCPLDAVARVVRASDTVGGADETGPVRRTLTETLATLKE 1157
            EGLSGTGPKAGGPL + RL C A A + + A L L+
Sbjct: 1016 EGLSGTGPKAGGPLLVRRL---CERHAPALLAIGTPVAAVAPSGNGGGEPRLPALRVLRG 1072

Query: 1158 WAQRESAALPGLVAACERFAAASAAGLSVTLPGPTGERNTYTLLPRAAVLCLAQQETDLA 1217
            W Q LVAAC+ A S AGL V LPGPTGERN Y LLPR VLC A DL
Sbjct: 1073 WLQEALPEDAALVAACDALLAQSPAGLDVLLPGPTGERNRYALLPRRRVLCQAGDRGDLL 1132

Query: 1218 VQLAAVLAAGSQAVWVESPMARALFARLPKAVQSRVRLVADWSAGDTGFDAVLHHGDSDQ 1277
            LA VLA G + +W +S ARAL A LP V+ RV+L A+ D D +
Sbjct: 1133 FLLALVLATGGRVLWADSAAARALHAALPAVVRERVKLSANPLGED--IDLAAAQDAPGR 1190

Query: 1278 LRAVCEQLATRPGPIISVQGLAHGEPNIAI---ERLLIERSLSVNTAAAGGNASLMTIG 1333
            + + L+ R GPI+ + A GE + A+ ERL++ERSL VNTAAAGGNA LM +G
Sbjct: 1191 VLELSLALSQRDGPIVPLVACAKGERDPALLPPERLMVERSLCVNTAAAGGNAGLMAMG 1249

Lambda     K      H
   0.318    0.133    0.383

Gapped
Lambda     K      H
   0.267   0.0410    0.140

Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 4010
Number of extensions: 186
Number of successful extensions: 9
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 1333
Length of database: 1249
Length adjustment: 48
Effective length of query: 1285
Effective length of database: 1201
Effective search space: 1543285
Effective search space used: 1543285
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 59 (27.3 bits)
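Several of the summary figures in the BLAST report above can be re-derived from one another, which is a useful sanity check when reading this kind of output. The following is a minimal sketch and not part of GapMind: the variable names are illustrative, "coverage" is assumed to mean aligned span divided by sequence length, and the bit-score line uses the standard Karlin-Altschul conversion with the rounded gapped Lambda and K printed in the report, so it only approximately reproduces the reported 1447 bits.

```python
import math

# Values copied from the BLAST report above.
query_len, subject_len = 1333, 1249
q_start, q_end = 81, 1333          # aligned span on the query
s_start, s_end = 4, 1249           # aligned span on the candidate
identities, positives, gaps, aln_len = 778, 927, 19, 1259
raw_score = 3745
gapped_lambda, gapped_k = 0.267, 0.0410
length_adjustment = 48

# Percentages over alignment columns; the report appears to truncate
# (778/1259 is about 61.8%, shown as 61%).
pct_identity = 100 * identities / aln_len
pct_positive = 100 * positives / aln_len
pct_gaps = 100 * gaps / aln_len

# Coverage (assumed definition): aligned span / full sequence length.
query_coverage = 100 * (q_end - q_start + 1) / query_len
subject_coverage = 100 * (s_end - s_start + 1) / subject_len

# Bit score from raw score: S' = (lambda * S - ln K) / ln 2.
bit_score = (gapped_lambda * raw_score - math.log(gapped_k)) / math.log(2)

# Effective search space = product of the effective lengths.
search_space = (query_len - length_adjustment) * (subject_len - length_adjustment)

print(f"identity {pct_identity:.1f}%, positives {pct_positive:.1f}%, gaps {pct_gaps:.1f}%")
print(f"query coverage {query_coverage:.1f}%, subject coverage {subject_coverage:.1f}%")
print(f"bit score ~{bit_score:.0f}, effective search space {search_space}")
```

The effective search space computed this way matches the value printed in the report exactly, since it depends only on integers.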
Align candidate WP_051243110.1 H537_RS44845 (trifunctional transcriptional regulator/proline dehydrogenase/L-glutamate gamma-semialdehyde dehydrogenase)
to HMM TIGR01238 (delta-1-pyrroline-5-carboxylate dehydrogenase (EC 1.2.1.88))
# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR01238.hmm
# target sequence database:        /tmp/gapView.3919516.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR01238  [M=500]
Accession:   TIGR01238
Description: D1pyr5carbox3: delta-1-pyrroline-5-carboxylate dehydrogenase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                             Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                             -----------
   6.8e-207  673.8   0.2     9e-207  673.4   0.2    1.1  1  NCBI__GCF_000430725.1:WP_051243110.1

Domain annotation for each sequence (and alignments):
>> NCBI__GCF_000430725.1:WP_051243110.1
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  673.4   0.2    9e-207    9e-207       2     497 ..     540    1036 ..     539    1039 .. 0.98

  Alignments for each domain:
  == domain 1  score: 673.4 bits;  conditional E-value: 9e-207
                             TIGR01238    2 lygegrknslGvdlaneselksleeqllkaaakkfqaapivgekakaegeaqpvknpadrkdivGqvsead 72
                                            l+g r ns+G+dlane++l+sl++ ll+ a + + a+p+v++ ++ + qpv npadr d vG+v+ea
  NCBI__GCF_000430725.1:WP_051243110.1  540 LFGALRPNSAGIDLANEHRLASLSAALLHGARQAWTAVPLVAGLPRPGDLPQPVLNPADRSDRVGTVREAR 610
                                            8999******************************************************************* PP

                             TIGR01238   73 aaevqeavdsavaafaewsatdakeraailerladlleshmpelvallvreaGktlsnaiaevreavdflr 143
                                            a+e+++a+++a a + w at+++eraa ler ad le ++ +lv+l+vreaGkt+ a+ evreavd lr
  NCBI__GCF_000430725.1:WP_051243110.1  611 ADEMEDALSAAAAVQPAWGATPPAERAALLERGADALEDQLQTLVGLIVREAGKTVPAAVGEVREAVDALR 681
                                            *********************************************************************** PP

                             TIGR01238  144 yyakqvedvldeesakalGavvcispwnfplaiftGqiaaalaaGntviakpaeqtsliaaravellqeaG 214
                                            y a q++ ld ++lG+v cispwnfplaiftGq+aaalaaGn+v+akpa qt+l+aa+av ll+ aG
  NCBI__GCF_000430725.1:WP_051243110.1  682 YAALQARTALDAG-SRPLGPVACISPWNFPLAIFTGQLAAALAAGNAVVAKPARQTPLVAAEAVRLLHAAG 751
                                            **********998.8******************************************************** PP

                             TIGR01238  215 vpagviqllpGrGedvGaaltsderiaGviftGstevarlinkalakredap...vpliaetGGqnamivd 282
                                            vp +v+qllpG Ge+vG l+ d+r++Gv+ftGst+var+++ +la r da+ l+aetGG na+i d
  NCBI__GCF_000430725.1:WP_051243110.1  752 VPGAVLQLLPGPGESVGLRLARDARVRGVLFTGSTAVARRLQAELALRLDARgvpPLLVAETGGLNALIAD 822
                                            **************************************************98744469************* PP

                             TIGR01238  283 stalaeqvvadvlasafdsaGqrcsalrvlcvqedvadrvltlikGamdelkvgkpirlttdvGpvidaea 353
                                            s+alaeq+v dvlasafdsaGqrcsalrvlc+q+++a+ vl +++Ga+ el vg+p l tdvGpvid+ a
  NCBI__GCF_000430725.1:WP_051243110.1  823 SSALAEQLVPDVLASAFDSAGQRCSALRVLCLQQEIAEPVLRMLQGALAELCVGRPDALATDVGPVIDDTA 893
                                            *********************************************************************** PP

                             TIGR01238  354 kqnllahiekmkakakkvaqvkleddvesekgtfvaptlfelddldelkkevfGpvlhvvrykadeldkvv 424
                                            ++ +++h+ +m+a + +v++ l + e+g+fvapt++el++l++l evfGpvlhv+r+++ eld ++
  NCBI__GCF_000430725.1:WP_051243110.1  894 RDTVEKHVLHMQALGLRVTRQPLSE-ELREHGSFVAPTVVELESLSQLPGEVFGPVLHVLRWRRGELDGLL 963
                                            ***************9999888887.4679***************************************** PP

                             TIGR01238  425 dkinakGygltlGvhsrieetvrqiekrakvGnvyvnrnlvGavvGvqpfGGeGlsGtGpkaGGplylyrl 495
                                            ++i+a+Gy+ltlG+h+ri+et++ ++ ra++Gn yvnrn++GavvGvqpfGGeGlsGtGpkaGGpl + rl
  NCBI__GCF_000430725.1:WP_051243110.1  964 QRIEATGYALTLGLHTRIDETIALVTARARAGNQYVNRNMIGAVVGVQPFGGEGLSGTGPKAGGPLLVRRL 1034
                                            *******************************************************************9999 PP

                             TIGR01238  496 tr 497
                                            +
  NCBI__GCF_000430725.1:WP_051243110.1 1035 CE 1036
                                            75 PP

Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (500 nodes)
Target sequences:                          1  (1249 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.01u 0.00s 00:00:00.01 Elapsed: 00:00:00.01
# Mc/sec: 62.16
//
[ok]
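The per-domain row of the hmmsearch report above is enough to work out how much of the model the hit spans. The sketch below is an illustration only: the function and key names are hypothetical, the row is split on whitespace as it appears in the report, and the model length of 500 is taken from the `[M=500]` field. How GapMind itself uses this fraction is not shown in this output.

```python
# Domain row and model length copied from the hmmsearch report above.
domain_row = "1 ! 673.4 0.2 9e-207 9e-207 2 497 .. 540 1036 .. 539 1039 .. 0.98"
model_length = 500  # from "[M=500]"

def parse_domain_row(row):
    """Split one hmmsearch per-domain row into named fields (hypothetical helper)."""
    f = row.split()
    return {
        "score": float(f[2]),
        "bias": float(f[3]),
        "c_evalue": float(f[4]),
        "i_evalue": float(f[5]),
        "hmm_from": int(f[6]), "hmm_to": int(f[7]),    # span on the model
        "ali_from": int(f[9]), "ali_to": int(f[10]),   # span on the protein
        "env_from": int(f[12]), "env_to": int(f[13]),  # domain envelope
        "acc": float(f[15]),                           # mean posterior probability
    }

domain = parse_domain_row(domain_row)

# Fraction of the 500 model positions covered by the alignment.
model_coverage = (domain["hmm_to"] - domain["hmm_from"] + 1) / model_length

# Number of candidate residues inside the aligned region.
aligned_residues = domain["ali_to"] - domain["ali_from"] + 1

print(f"model coverage {model_coverage:.1%}, aligned residues {aligned_residues}")
```

Here the hit runs from model position 2 to 497, so nearly the whole model is matched, and it falls in the C-terminal part of the 1249-residue candidate (residues 540 to 1036).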
This GapMind analysis is from Apr 09 2024. The underlying query database was built on Sep 17 2021.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other blast hits with at least 50% coverage are "low confidence."
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
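For readers unfamiliar with the term, a six-frame translation reads a DNA sequence in three forward and three reverse-complement reading frames, so that proteins missed by gene prediction can still be found. The sketch below only illustrates the idea and is not how Curated BLAST is implemented: the function names are hypothetical, the codon table is the standard genetic code, and codons containing ambiguous bases are translated to `X` by assumption.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def translate(dna):
    """Translate complete codons from position 0; unknown codons become X."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )

def six_frame(dna):
    """Map frame labels (+1..+3, -1..-3) to translated protein strings."""
    dna = dna.upper()
    rc = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames

frames = six_frame("ATGGCCTAA")
print(frames)
```

Stop codons appear as `*` in the output, so candidate open reading frames are the stretches between them.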
For more information, see:
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory