Align L-glutamate gamma-semialdehyde dehydrogenase (EC 1.2.1.88); Proline dehydrogenase (EC 1.5.5.2) (characterized)
to candidate 3608920 Dshi_2311 delta-1-pyrroline-5-carboxylate dehydrogenase (RefSeq)
Query= reanno::Marino:GFF2744 (1209 letters) >lcl|FitnessBrowser__Dino:3608920 Dshi_2311 delta-1-pyrroline-5-carboxylate dehydrogenase (RefSeq) Length = 1221 Score = 1263 bits (3267), Expect = 0.0 Identities = 665/1209 (55%), Positives = 848/1209 (70%), Gaps = 19/1209 (1%) Query: 15 RQAIRDYYLADEHKVIHEMIAGAQLSQAERDAISARAAELVRSVRKNAKSTIMEKFLAEY 74 R +R +Y A+E ++ + A +LS ER+ +A A V VR + ++ME FLAEY Sbjct: 16 RAQVRAHYTAEETALLKSLAARIKLSAHEREKAAAAGARYVTRVRNETRPSMMEAFLAEY 75 Query: 75 GLTTKEGVALMCLAEALLRVPDNTTIHELIEDKITSGAWGTHVGKASSGLINTATVALLM 134 GL+T EGV LMCLAEALLRVPD TI +LIEDK+ WG H+G +SS L+N +T AL++ Sbjct: 76 GLSTSEGVGLMCLAEALLRVPDADTIDDLIEDKVAPSNWGAHLGHSSSSLVNASTWALML 135 Query: 135 TSNLLKDSERNTVGETLRKLLKRFGEPVIRTVAGQAMKEMGRQFVLGRDIDEAQDEAKEY 194 T +L + R LR L+KR GEPV+RT GQ+MK +GRQFVLG+ I+E A+E Sbjct: 136 TGKVLDEDPRGPA-RALRGLVKRLGEPVVRTAVGQSMKVLGRQFVLGQTIEEGLKNAREL 194 Query: 195 MAKGYTYSYDMLGEAARTDDDAKRYYDSYSNAIDSIAKASKGDVRKNPGISVKLSALLAR 254 KG+TYSYDMLGEAARTD DA+RY+ +Y+ AI +IA+ + GDVR +PGISVKLSAL R Sbjct: 195 EKKGFTYSYDMLGEAARTDADARRYHAAYAQAITAIARQATGDVRSSPGISVKLSALHPR 254 Query: 255 YEYGNKERVMNELLPRARELVKKAAAANMGFNIDAEEQDRLDLSLDVIEELVADPELAGW 314 YEY ++ VM +L+PRA LVK+AA A +GFN+DAEEQDRLDLSLDVIE +++DP+L GW Sbjct: 255 YEYTHRHSVMADLVPRAAALVKQAAQAGIGFNVDAEEQDRLDLSLDVIEAMMSDPDLDGW 314 Query: 315 DGFGVVVQAYGKRSSFVLDWLYGLAEKYDRKFMVRLVKGAYWDAEIKRAQVMGLNGFPVF 374 DGFGVVVQAYG+R+ V++ LY +AE+YDRK MVRLVKGAYWD EIK AQ +G+ FPVF Sbjct: 315 DGFGVVVQAYGRRAGPVIETLYDMAERYDRKIMVRLVKGAYWDTEIKLAQELGVERFPVF 374 Query: 375 TRKACSDVSFLSCATKLLNMTNRIYPQFATHNAHSVSAILEMAKTKGVDNYEFQRLHGMG 434 TRK +DVS+++CA LL+ +RIYPQFATHNAH+ +A+L+MA D +EFQRLHGMG Sbjct: 375 TRKNNTDVSYMACAQMLLDRRDRIYPQFATHNAHTCAAVLQMAGNAR-DCFEFQRLHGMG 433 Query: 435 ESLHNEVLKVSGVPCRIYAPVGPHKDLLAYLVRRLLENGANSSFVNQIVDKRITPEEIAK 494 SLH V + G CRIYAPVG H+DLLAYLVRRLLENGANSSFVNQIVD I E I+ Sbjct: 434 ASLHQIVKETEGTRCRIYAPVGAHQDLLAYLVRRLLENGANSSFVNQIVDPDIPAEAISA 493 Query: 495 DPIVSVEEMGNNISSKAIVHPFKLFGDQRRNSKGWDITDPVTVNEIEKGRGAYKDYRWKG 554 
DP+ +E++G+ I + AI P LF RRNS+G+ + +P ++ + R A+ + W Sbjct: 494 DPVSEMEKLGDQIPNPAIRQPSDLFAPDRRNSRGYRVNEPASILPLMTAREAFAETTWHA 553 Query: 555 GPLIAGEVAGT-EIQVVRNPADPDDLVGHVTQASDADVDTAITSAAAAFESWSAKSAEER 613 P++AG T + V +PAD LVG V +AS DV A+ +A F WSA+ ER Sbjct: 554 RPMLAGGRDPTGPTREVHSPADKTRLVGTVQEASAEDVACALDAAETGFRDWSARPVSER 613 Query: 614 AACVRKVGDLYEENYAELFALTTREAGKSLLDAVAEIREAVDFSQYYANEAIRYK--DSG 671 A +RK+ D+YE+N AEL A+TTREAGK++LD +AE+REAVDF ++YANEA R + D G Sbjct: 614 ADMLRKLADMYEDNIAELTAITTREAGKTVLDGIAEVREAVDFLRFYANEAERLEEEDPG 673 Query: 672 DARGVMCCISPWNFPLAIFTGQILANLAAGNTVVAKPAEQTSLLAIRAVELMHQAGIPKD 731 RG+ CISPWNFPLAIFTGQI A L GN V+AKPAEQT ++A RAV++M G+P Sbjct: 674 RPRGIFVCISPWNFPLAIFTGQIAAALVMGNAVLAKPAEQTPIIAARAVQMMRDCGLPDA 733 Query: 732 AIQLVPGTGATVGAALTSDSRVSGVCFTGSTATAQRINKVMTENMAPDAPLVAETGGLNA 791 A+QL+PG G VG LTSD R++GVCFTGST A I+K + +N P+A LVAETGGLNA Sbjct: 734 ALQLLPGDGPMVGGPLTSDPRIAGVCFTGSTEVAMIIHKALAKNAGPEAVLVAETGGLNA 793 Query: 792 MIVDSTALPEQVVRDVLASSFQSAGQRCSALRMLYVQRDIADGLLEMLYGAMEELGIGDP 851 MIVDSTAL EQ VRD+L SSFQSAGQRCSALR+LYVQ D+ D L+EML GA++ L IGD Sbjct: 794 MIVDSTALHEQAVRDILISSFQSAGQRCSALRILYVQEDVHDKLMEMLSGALDALVIGDS 853 Query: 852 WLLSTDVGPVIDENARKKIVDHCEKFERNGKLLKKMKVPEKGLFVSPAVLSVSGIEELEE 911 W L DV PVID +A+ I+ + ++ + G L+K + P+ G +V+PA++ V GI ++E Sbjct: 854 WNLDVDVSPVIDADAQSDILGYIDQHRKAGTLIKTLAAPDSGTYVTPAIVKVGGIADMER 913 Query: 912 EIFGPVLHVATFEAKNIDKVVDDINAKGYGLTFGIHSRVDRRVERITSRIKVGNTYVNRN 971 EIFGPVLHVATF+A ID+VVD INA+ YGLTFG+H+R+D RVE+I RI+VGN YVNRN Sbjct: 914 EIFGPVLHVATFKANEIDQVVDAINARRYGLTFGLHTRIDDRVEQIVERIQVGNVYVNRN 973 Query: 972 QIGAIVGSQPFGGEGLSGTGPKAGGPQYVRRFLK-GETVEREADSNARKVDAKQLQKLIG 1030 QIGAIVGSQPFGGEGLSGTGPKAGGP Y+ RF K G++ A A + L + Sbjct: 974 QIGAIVGSQPFGGEGLSGTGPKAGGPLYLTRFRKVGKSTSHPAPQGA-VLGKAALNTALS 1032 Query: 1031 QLDKLK-ASRPE----ARMDAIRPIFGNVPEPL------DAHVEALPGPTGETNRLSNHA 1079 LD A+RP+ RM A+ G V L D + LPGPTGE+NRLS Sbjct: 1033 SLDARNWAARPDRVHILRM-ALSGSTGVVRRALSETAAFDMSPQTLPGPTGESNRLSMVP 1091 Query: 1080 
RGVVLCLGPDKETALEQAGTALSQGNKVVVIAPGTQDVVDQANKAGLPIVGAQGLLEPEA 1139 RG VLCLGP E A+ QA AL G VV+ PG+ + + AG P+V G ++ Sbjct: 1092 RGTVLCLGPTPEIAMAQAVQALGAGCAVVIALPGSTPLSQPLSDAGAPVVTLDGTVDCVT 1151 Query: 1140 LATIDGFEAVVSCGDQPLLKAYREALAKRDGALLPLITEHTLDQRFVIERHLCVDTTAAG 1199 L + G E V + G + R AL++RDG ++PLI + +R+V+ERHLC+DTTAAG Sbjct: 1152 LTELTGIEVVAAAGASDWTRTLRVALSQRDGPIIPLIVDEIAPERYVLERHLCIDTTAAG 1211 Query: 1200 GNASLIAAS 1208 GNA L+AAS Sbjct: 1212 GNAKLLAAS 1220 Lambda K H 0.316 0.133 0.378 Gapped Lambda K H 0.267 0.0410 0.140 Matrix: BLOSUM62 Gap Penalties: Existence: 11, Extension: 1 Number of Sequences: 1 Number of Hits to DB: 3465 Number of extensions: 168 Number of successful extensions: 6 Number of sequences better than 1.0e-02: 1 Number of HSP's gapped: 1 Number of HSP's successfully gapped: 1 Length of query: 1209 Length of database: 1221 Length adjustment: 47 Effective length of query: 1162 Effective length of database: 1174 Effective search space: 1364188 Effective search space used: 1364188 Neighboring words threshold: 11 Window for multiple hits: 40 X1: 16 ( 7.3 bits) X2: 38 (14.6 bits) X3: 64 (24.7 bits) S1: 41 (21.6 bits) S2: 59 (27.3 bits)
Align candidate 3608920 Dshi_2311 (delta-1-pyrroline-5-carboxylate dehydrogenase (RefSeq))
to HMM TIGR01238 (delta-1-pyrroline-5-carboxylate dehydrogenase (EC 1.2.1.88))
# hmmsearch :: search profile(s) against a sequence database # HMMER 3.3.1 (Jul 2020); http://hmmer.org/ # Copyright (C) 2020 Howard Hughes Medical Institute. # Freely distributed under the BSD open source license. # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - # query HMM file: ../tmp/path.carbon/TIGR01238.hmm # target sequence database: /tmp/gapView.5403.genome.faa # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Query: TIGR01238 [M=500] Accession: TIGR01238 Description: D1pyr5carbox3: delta-1-pyrroline-5-carboxylate dehydrogenase Scores for complete sequences (score includes all domains): --- full sequence --- --- best 1 domain --- -#dom- E-value score bias E-value score bias exp N Sequence Description ------- ------ ----- ------- ------ ----- ---- -- -------- ----------- 1.2e-194 633.4 3.2 7.9e-194 630.7 0.1 2.1 2 lcl|FitnessBrowser__Dino:3608920 Dshi_2311 delta-1-pyrroline-5-ca Domain annotation for each sequence (and alignments): >> lcl|FitnessBrowser__Dino:3608920 Dshi_2311 delta-1-pyrroline-5-carboxylate dehydrogenase (RefSeq) # score bias c-Evalue i-Evalue hmmfrom hmm to alifrom ali to envfrom env to acc --- ------ ----- --------- --------- ------- ------- ------- ------- ------- ------- ---- 1 ! 630.7 0.1 7.9e-194 7.9e-194 1 498 [. 516 1008 .. 516 1010 .. 0.99 2 ! 2.9 0.6 0.0017 0.0017 158 272 .. 1089 1190 .. 1075 1198 .. 
0.82 Alignments for each domain: == domain 1 score: 630.7 bits; conditional E-value: 7.9e-194 TIGR01238 1 dlygegrknslGvdlaneselksleeqllkaaakkfqaapivgekakaegeaqpvknpadrkdivGqvseadaae 75 dl++ r+ns G ++ + +l + ++ a+ +++a p++++ g ++ v +pad+ +vG+v+ea+a++ lcl|FitnessBrowser__Dino:3608920 516 DLFAPDRRNSRGYRVNEPASILPLMTAREAFAETTWHARPMLAGGRDPTGPTREVHSPADKTRLVGTVQEASAED 590 799*9**************************************99999*************************** PP TIGR01238 76 vqeavdsavaafaewsatdakeraailerladlleshmpelvallvreaGktlsnaiaevreavdflryyakqve 150 v a+d+a + f wsa + era +l++lad+ e ++ el a++ reaGkt+ + iaevreavdflr+ya+++e lcl|FitnessBrowser__Dino:3608920 591 VACALDAAETGFRDWSARPVSERADMLRKLADMYEDNIAELTAITTREAGKTVLDGIAEVREAVDFLRFYANEAE 665 **************************************************************************8 PP TIGR01238 151 dvldeesakalGavvcispwnfplaiftGqiaaalaaGntviakpaeqtsliaaravellqeaGvpagviqllpG 225 +e+ +++G +vcispwnfplaiftGqiaaal+ Gn+v+akpaeqt++iaarav+++++ G+p +++qllpG lcl|FitnessBrowser__Dino:3608920 666 RLEEEDPGRPRGIFVCISPWNFPLAIFTGQIAAALVMGNAVLAKPAEQTPIIAARAVQMMRDCGLPDAALQLLPG 740 88888999******************************************************************* PP TIGR01238 226 rGedvGaaltsderiaGviftGstevarlinkalakredapvpliaetGGqnamivdstalaeqvvadvlasafd 300 G vG ltsd+riaGv ftGsteva i+kalak ++++l+aetGG namivdstal eq v+d+l s+f+ lcl|FitnessBrowser__Dino:3608920 741 DGPMVGGPLTSDPRIAGVCFTGSTEVAMIIHKALAKNAGPEAVLVAETGGLNAMIVDSTALHEQAVRDILISSFQ 815 *************************************************************************** PP TIGR01238 301 saGqrcsalrvlcvqedvadrvltlikGamdelkvgkpirlttdvGpvidaeakqnllahiekmkakakkvaqvk 375 saGqrcsalr+l+vqedv d++++++ Ga+d l++g +l dv pvida+a+ ++l +i++ ++ + ++ + lcl|FitnessBrowser__Dino:3608920 816 SAGQRCSALRILYVQEDVHDKLMEMLSGALDALVIGDSWNLDVDVSPVIDADAQSDILGYIDQHRKAGTLIKTLA 890 ****************************************************************99999998887 PP TIGR01238 376 leddvesekgtfvaptlfelddldelkkevfGpvlhvvrykadeldkvvdkinakGygltlGvhsrieetvrqie 450 d gt+v+p ++++ 
++++++e+fGpvlhv +ka+e+d+vvd ina+ yglt+G+h+ri++ v qi lcl|FitnessBrowser__Dino:3608920 891 APD-----SGTYVTPAIVKVGGIADMEREIFGPVLHVATFKANEIDQVVDAINARRYGLTFGLHTRIDDRVEQIV 960 777.....8****************************************************************** PP TIGR01238 451 krakvGnvyvnrnlvGavvGvqpfGGeGlsGtGpkaGGplylyrltrv 498 +r++vGnvyvnrn++Ga+vG qpfGGeGlsGtGpkaGGplyl r+ +v lcl|FitnessBrowser__Dino:3608920 961 ERIQVGNVYVNRNQIGAIVGSQPFGGEGLSGTGPKAGGPLYLTRFRKV 1008 ********************************************9886 PP == domain 2 score: 2.9 bits; conditional E-value: 0.0017 TIGR01238 158 akalGavvcispwnfplaiftGqiaaalaaGntviakpaeqtsliaaravellqeaGvpagviqllpGrGedvGa 232 + ++G+v+c+ p i + q + al aG +v+ t+l + l +aG p ++ G + lcl|FitnessBrowser__Dino:3608920 1089 MVPRGTVLCLGPTP---EIAMAQAVQALGAGCAVVIALPGSTPLS-----QPLSDAGAPVVTL---DGTVD--CV 1150 55688888888853...5778899999999998765555678985.....45889**998775...57777..56 PP TIGR01238 233 altsderiaGviftGstevarlinkalakredapvpliae 272 +lt+ + i+ v+ +G+++ +r ++ al++r+ + +pli + lcl|FitnessBrowser__Dino:3608920 1151 TLTELTGIEVVAAAGASDWTRTLRVALSQRDGPIIPLIVD 1190 8*******************************99999975 PP Internal pipeline statistics summary: ------------------------------------- Query model(s): 1 (500 nodes) Target sequences: 1 (1221 residues searched) Passed MSV filter: 1 (1); expected 0.0 (0.02) Passed bias filter: 1 (1); expected 0.0 (0.02) Passed Vit filter: 1 (1); expected 0.0 (0.001) Passed Fwd filter: 1 (1); expected 0.0 (1e-05) Initial search space (Z): 1 [actual number of targets] Domain search space (domZ): 1 [number of targets reported over threshold] # CPU time: 0.03u 0.01s 00:00:00.04 Elapsed: 00:00:00.04 # Mc/sec: 14.43 // [ok]
This GapMind analysis is from Sep 17 2021. The underlying query database was built on Sep 17 2021.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other blast hits with at least 50% coverage are "low confidence."
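As a rough illustration of the coverage threshold, the sketch below computes percent identity and query coverage from the numbers BLAST reports for the alignment above (665/1209 identities, query residues 15 to 1208 of a 1209-residue query). The function names, and the definition of coverage as the aligned query span divided by the query length, are assumptions made for illustration; this is not GapMind's own code.

```python
def percent_identity(identities: int, alignment_length: int) -> float:
    """Fraction of alignment columns that are identical residues."""
    return identities / alignment_length


def query_coverage(q_start: int, q_end: int, q_length: int) -> float:
    """Fraction of the query spanned by the alignment (1-based, inclusive)."""
    return (q_end - q_start + 1) / q_length


def meets_low_confidence_coverage(coverage: float, threshold: float = 0.5) -> bool:
    """True if coverage reaches the 50% cutoff quoted in the text."""
    return coverage >= threshold


# Values copied from the BLAST report for Dshi_2311 shown above.
identity = percent_identity(665, 1209)
coverage = query_coverage(15, 1208, 1209)

print(f"identity={identity:.1%} coverage={coverage:.1%}")
print("passes 50% coverage cutoff:", meets_low_confidence_coverage(coverage))
```

Identity and score thresholds for the higher confidence levels are not modeled here, since only the 50% coverage cutoff is stated in the text.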
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on the links to Curated BLAST for each step definition (on the per-step page).
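For readers unfamiliar with the term, a six-frame translation translates a DNA sequence in all three reading frames on both strands, so that proteins missed by gene prediction can still be found. The sketch below is a minimal, self-contained illustration using the standard codon assignments. The function names are hypothetical, and it is not how Curated BLAST is implemented.

```python
from itertools import product

# Standard codon assignments, with codons ordered by the bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna: str) -> str:
    """Translate complete codons from position 0; unknown codons become 'X'."""
    usable = len(dna) - len(dna) % 3
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3)
    )


def six_frame_translation(dna: str) -> dict:
    """Map frame labels (+1..+3, -1..-3) to their translations."""
    dna = dna.upper()
    rc = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


frames = six_frame_translation("ATGGCC")
for label in sorted(frames):
    print(label, frames[label])
```

Stop codons are reported as `*` rather than ending the translation, which keeps every frame the same shape and leaves it to the caller to split on stops if open reading frames are wanted.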
For more information, see the paper from 2019 on GapMind for amino acid biosynthesis, the paper from 2022 on GapMind for carbon sources, or view the source code.
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory