Align L-glutamate gamma-semialdehyde dehydrogenase (EC 1.2.1.88); Proline dehydrogenase (EC 1.5.5.2) (characterized)
to candidate Ga0059261_3926 Ga0059261_3926 L-proline dehydrogenase (EC 1.5.99.8)/delta-1-pyrroline-5-carboxylate dehydrogenase (EC 1.5.1.12)
Query= reanno::azobra:AZOBR_RS23695 (1235 letters) >FitnessBrowser__Korea:Ga0059261_3926 Length = 1199 Score = 1573 bits (4072), Expect = 0.0 Identities = 829/1224 (67%), Positives = 941/1224 (76%), Gaps = 31/1224 (2%) Query: 16 APFADFAPPIRPATELRAAITAAYRRPEPECLPFLFEQASLPPGVITAAAATARKLITAL 75 APFADFAPPIRP T LR+AITAAYRRPEPEC+P L EQASLP G AA TA LITAL Sbjct: 3 APFADFAPPIRPQTPLRSAITAAYRRPEPECVPPLVEQASLPEGTREAARITASTLITAL 62 Query: 76 RAKPRGRGVEGLIHEYSLSSQEGMALMCLAEALLRIPDHATRDALIRDKIAGGDWQAHLG 135 RAK +G GVEGL+ EY+LSSQEG+ALMCLAEALLRIPD ATRDALIRDKIA GDW++H+G Sbjct: 63 RAKHKGTGVEGLVQEYALSSQEGVALMCLAEALLRIPDDATRDALIRDKIADGDWKSHIG 122 Query: 136 KGGSMFVNAATWGLLITGKLTSAGGEQALSSALTRLIARGGEPLIRRGVDFAMRMMGEQF 195 G S+FVNAATWGL++TGKLT + + L +ALTRLIAR GEP+IRRGVD AMRMMGEQF Sbjct: 123 DGRSLFVNAATWGLVVTGKLTGSVNDAGLGAALTRLIARAGEPVIRRGVDMAMRMMGEQF 182 Query: 196 VTGQTIQEALTNARTMEAEGFRYSYDMLGEAALTAEDAARYYADYVNAIHAIGTASAGRG 255 VTG+TI EAL ART+EA GF+YSYDMLGEAA T DA RYY DY NA+ AIG ASAGRG Sbjct: 183 VTGETIAEALKRARTLEARGFQYSYDMLGEAATTMADADRYYRDYENAVRAIGEASAGRG 242 Query: 256 VYEGPGISIKLSAIHPRYSRAQADRVMDELLPRVKALALLAKGYDIGLNIDAEEADRLEL 315 V GPGISIKLSA+HPRY+RAQA RVMDELLP+VKALA+LA+GYDIG NIDAEEADRLEL Sbjct: 243 VVGGPGISIKLSALHPRYARAQAGRVMDELLPKVKALAVLARGYDIGFNIDAEEADRLEL 302 Query: 316 SLDLMESLCFDPDLAGWNGIGFVVQAYGKRCPYVIDFLIDLARRSGHRLMIRLVKGAYWD 375 SLDL+ESL DPDL GW+G+GFVVQAYGKRCP+VID+++DLA+R+ R+M+RLVKGAYWD Sbjct: 303 SLDLLESLALDPDLKGWDGLGFVVQAYGKRCPFVIDWIVDLAQRADRRIMVRLVKGAYWD 362 Query: 376 SEIKRAQLDGLPDFPVYTRKVYTDVSYVACARKLLAAPEAVFPQFATHNAQTLATIYEMA 435 +EIKRAQ+DGLPDFPVYTRK+YTDV+Y+ACARKLLA + +FPQFATHNAQTLATIY+MA Sbjct: 363 AEIKRAQVDGLPDFPVYTRKIYTDVAYIACARKLLANRDRIFPQFATHNAQTLATIYQMA 422 Query: 436 GSDFQVGKYEFQCLHGMGEPLYKEVVG--PLKRPCRIYAPVGTHETLLAYLVRRLLENGA 493 G DF VG YEFQCLHGMGEPLY EVVG L RPCRIYAPVGTHETLLAYLVRRLLENGA Sbjct: 423 GPDFSVGDYEFQCLHGMGEPLYDEVVGATKLNRPCRIYAPVGTHETLLAYLVRRLLENGA 482 Query: 494 NSSFVNRIADPAVPVDELVADPVAVARAIAPTGAPHALIALPRNLYAPERANSAGIDLSD 553 NSSFVNRIADP V + 
ELVADPV R++ GA H LIALP LY R NS G+DLS+ Sbjct: 483 NSSFVNRIADPEVSIAELVADPVDQVRSMDVVGAKHPLIALPTGLYGARR-NSEGLDLSN 541 Query: 554 ETELARLSAALSASAEMTWTAAPLLADGERAGQAQPVRNPADRRDVVGSVTEASEALVAE 613 E LA L+A+L SA W A P +R G ++PV NPAD +DVVG+V E + Sbjct: 542 ENVLAELAASLKVSAAAGWAAEP----ADRIGTSRPVYNPADGKDVVGTVVEVTPEAAQT 597 Query: 614 AFGHAVAAASAWAATPPEERAASLFRAADTMQERMPTLLGLIVREAGKSLPNAIAEVREA 673 A A AAA+ WAA P ERAA L RAAD MQ+RM L+GLI+REAGKS PNAIAEVREA Sbjct: 598 AVATAQAAAADWAAVAPAERAACLDRAADIMQDRMQILMGLIIREAGKSAPNAIAEVREA 657 Query: 674 IDFLRYYGAQVRDRFDNATHRPLGPVVCISPWNFPLAIFSGQIAAALAAGNPVLAKPAEE 733 IDFLRYY Q R A H+PLG V CISPWNFPLAIF+GQ+AAAL AGN VLAKPAEE Sbjct: 658 IDFLRYYAEQARAML-GAAHKPLGAVTCISPWNFPLAIFTGQVAAALVAGNTVLAKPAEE 716 Query: 734 TPLIAAEAVRILHAAGIPAGALQLLPGAGEVGAALVGHEAVRGVMFTGSTEVARLIQRQL 793 TPLIAA+ V ILH AGIPA ALQL+PG G +GAALV + VMFTGSTEVAR+IQ++L Sbjct: 717 TPLIAAQGVSILHEAGIPAAALQLVPGDGRIGAALVAAPGTQAVMFTGSTEVARIIQKEL 776 Query: 794 AGRLLPDGAPIPLIAETGGQNAMIVDSSALAEQVVGDVIASAFDSAGQRCSALRILCLQE 853 A RL G P+P IAETGGQNAMIVDSSALAEQVVGDVIASAFDSAGQRCSALR+LCLQ+ Sbjct: 777 AKRLTDGGDPVPFIAETGGQNAMIVDSSALAEQVVGDVIASAFDSAGQRCSALRVLCLQD 836 Query: 854 DVADRTLAMLKGAMRELRIGNPDRLAVDVGPVISEEARATIAAHIEAMRAKGRNVEFLPL 913 DVADR L MLKGA+ EL IG D L+ D+GPVI+ EA+A I HI M GR VE + L Sbjct: 837 DVADRILVMLKGALHELSIGRTDSLSTDIGPVITAEAKANIEGHIARMLGMGRGVEQIEL 896 Query: 914 PAETADGTFIAPTVIEIGGIHELEREVFGPVLHVVRFHRDDLDALVDSINATGYGLTFGL 973 ETA GTF+ PT+IE+ I +LEREVFGPVLHVVR+ R DLD ++D+INATGYGLTFGL Sbjct: 897 AGETAQGTFVPPTIIELQSIADLEREVFGPVLHVVRYKRRDLDRVIDAINATGYGLTFGL 956 Query: 974 HTRIDATIERVTGRIGAGNVYVNRNTIGAVVGVQPFGGHGLSGTGPKAGGPLYLSRLLSR 1033 HTR+D TI V+ R+ AGN+Y+NRN IGAVVGVQPFGG GLSGTGPKAGGPLYL RL+ Sbjct: 957 HTRLDETIAHVSQRVEAGNLYINRNIIGAVVGVQPFGGRGLSGTGPKAGGPLYLGRLVQT 1016 Query: 1034 R--PKGWLEFRGPDAARAAGLAYGEWLRAKGFTAEASRCAGYVARSAIGGGAELNGPVGE 1091 P G A R L WL +G EA+R AG S +G EL GPVGE Sbjct: 1017 ATVPPGVASDTSDPALRDFAL----WLGEQG---EAARAAG--EASLLGAEVELPGPVGE 1067 Query: 1092 
RNLYELHGRGRVLLLPQTRTGLLLQLGAVLATGNSAAVDAPPDLAELLRGLPPALAARVR 1151 RN+Y LH RG VLLLP+TR GL+ Q+ A LATGN A + E L GLP +AAR+ Sbjct: 1068 RNIYALHPRGSVLLLPKTREGLIAQVSAALATGNRAVIGNTALRGE-LAGLPERVAARLS 1126 Query: 1152 TTADWRDVGPLAAVLVEGDRERVTAINRRVADLPGPILLVQAATAEALAAGRGEGYDLDL 1211 DWR GP L+EGD +A +A LPGPI+L QA T LD Sbjct: 1127 WADDWRKAGPYGGALIEGDAAARSAALAEIAALPGPIVLAQAGTPR-----------LDW 1175 Query: 1212 LLNERSVSVNTAAAGGNASLVAMS 1235 L+ E S SVNT AAGGNASL+A++ Sbjct: 1176 LVEEVSTSVNTTAAGGNASLMAVA 1199 Lambda K H 0.319 0.136 0.396 Gapped Lambda K H 0.267 0.0410 0.140 Matrix: BLOSUM62 Gap Penalties: Existence: 11, Extension: 1 Number of Sequences: 1 Number of Hits to DB: 3787 Number of extensions: 157 Number of successful extensions: 8 Number of sequences better than 1.0e-02: 1 Number of HSP's gapped: 1 Number of HSP's successfully gapped: 1 Length of query: 1235 Length of database: 1199 Length adjustment: 47 Effective length of query: 1188 Effective length of database: 1152 Effective search space: 1368576 Effective search space used: 1368576 Neighboring words threshold: 11 Window for multiple hits: 40 X1: 16 ( 7.4 bits) X2: 38 (14.6 bits) X3: 64 (24.7 bits) S1: 41 (21.8 bits) S2: 59 (27.3 bits)
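The summary figures in the BLAST report above can be checked by hand. The following is a minimal sketch that recomputes them from numbers copied out of the report. The variable and function names are illustrative only, and rounding percentages down is an assumption that happens to match the values shown (67%, 76%, 2%).

```python
# Values copied from the BLAST report above.
query_len, subject_len = 1235, 1199
aligned_columns = 1224
identities, positives, gaps = 829, 941, 31
length_adjustment = 47

def percent_floor(count, total):
    """Percentage rounded down (assumed to match the report's rounding)."""
    return count * 100 // total

pct_identity = percent_floor(identities, aligned_columns)
pct_positive = percent_floor(positives, aligned_columns)
pct_gaps = percent_floor(gaps, aligned_columns)

# Effective lengths subtract the length adjustment from each sequence;
# the effective search space is their product.
effective_query = query_len - length_adjustment
effective_db = subject_len - length_adjustment
search_space = effective_query * effective_db

# Fraction of each sequence spanned by the alignment
# (query residues 16..1235, candidate residues 3..1199).
query_coverage = (1235 - 16 + 1) / query_len
subject_coverage = (1199 - 3 + 1) / subject_len

print(pct_identity, pct_positive, pct_gaps)
print(effective_query, effective_db, search_space)
print(round(query_coverage, 3), round(subject_coverage, 3))
```

The alignment spans nearly the full length of both proteins, which is what one would expect for two bifunctional enzymes of similar architecture.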
Align candidate Ga0059261_3926 Ga0059261_3926 (L-proline dehydrogenase (EC 1.5.99.8)/delta-1-pyrroline-5-carboxylate dehydrogenase (EC 1.5.1.12))
to HMM TIGR01238 (delta-1-pyrroline-5-carboxylate dehydrogenase (EC 1.2.1.88))
# hmmsearch :: search profile(s) against a sequence database # HMMER 3.3.1 (Jul 2020); http://hmmer.org/ # Copyright (C) 2020 Howard Hughes Medical Institute. # Freely distributed under the BSD open source license. # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - # query HMM file: ../tmp/path.carbon/TIGR01238.hmm # target sequence database: /tmp/gapView.1265.genome.faa # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Query: TIGR01238 [M=500] Accession: TIGR01238 Description: D1pyr5carbox3: delta-1-pyrroline-5-carboxylate dehydrogenase Scores for complete sequences (score includes all domains): --- full sequence --- --- best 1 domain --- -#dom- E-value score bias E-value score bias exp N Sequence Description ------- ------ ----- ------- ------ ----- ---- -- -------- ----------- 1.7e-219 715.4 7.7 3.2e-219 714.5 7.7 1.4 1 lcl|FitnessBrowser__Korea:Ga0059261_3926 Ga0059261_3926 L-proline dehydro Domain annotation for each sequence (and alignments): >> lcl|FitnessBrowser__Korea:Ga0059261_3926 Ga0059261_3926 L-proline dehydrogenase (EC 1.5.99.8)/delta-1-pyrroline-5-ca # score bias c-Evalue i-Evalue hmmfrom hmm to alifrom ali to envfrom env to acc --- ------ ----- --------- --------- ------- ------- ------- ------- ------- ------- ---- 1 ! 714.5 7.7 3.2e-219 3.2e-219 2 498 .. 527 1016 .. 526 1018 .. 
0.98 Alignments for each domain: == domain 1 score: 714.5 bits; conditional E-value: 3.2e-219 TIGR01238 2 lygegrknslGvdlaneselksleeqllkaaakkfqaapivgekakaegeaqpvknpadrkdivGqv 68 lyg r+ns G+dl+ne++l++l ++l+ +aa + a p g +pv npad kd+vG+v lcl|FitnessBrowser__Korea:Ga0059261_3926 527 LYGA-RRNSEGLDLSNENVLAELAASLKVSAAAGWAAEPA-----DRIGTSRPVYNPADGKDVVGTV 587 7888.******************************99885.....467999**************** PP TIGR01238 69 seadaaevqeavdsavaafaewsatdakeraailerladlleshmpelvallvreaGktlsnaiaev 135 e + +q av +a+aa+a w a+ ++eraa+l+r+ad+++ +m l++l++reaGk+ naiaev lcl|FitnessBrowser__Korea:Ga0059261_3926 588 VEVTPEAAQTAVATAQAAAADWAAVAPAERAACLDRAADIMQDRMQILMGLIIREAGKSAPNAIAEV 654 ******************************************************************* PP TIGR01238 136 reavdflryyakqvedvldeesakalGavvcispwnfplaiftGqiaaalaaGntviakpaeqtsli 202 rea+dflryya+q++ l+ +k+lGav cispwnfplaiftGq+aaal+aGntv+akpae+t+li lcl|FitnessBrowser__Korea:Ga0059261_3926 655 REAIDFLRYYAEQARAMLGAA-HKPLGAVTCISPWNFPLAIFTGQVAAALVAGNTVLAKPAEETPLI 720 ***************999988.********************************************* PP TIGR01238 203 aaravellqeaGvpagviqllpGrGedvGaaltsderiaGviftGstevarlinkalakredap... 
266 aa++v++l+eaG+pa+++ql+pG G +Gaal + + + v+ftGstevar i+k+lakr + lcl|FitnessBrowser__Korea:Ga0059261_3926 721 AAQGVSILHEAGIPAAALQLVPGDGR-IGAALVAAPGTQAVMFTGSTEVARIIQKELAKRLTDGgdp 786 *************************9.*********************************8753344 PP TIGR01238 267 vpliaetGGqnamivdstalaeqvvadvlasafdsaGqrcsalrvlcvqedvadrvltlikGamdel 333 vp+iaetGGqnamivds+alaeqvv dv+asafdsaGqrcsalrvlc+q+dvadr+l ++kGa++el lcl|FitnessBrowser__Korea:Ga0059261_3926 787 VPFIAETGGQNAMIVDSSALAEQVVGDVIASAFDSAGQRCSALRVLCLQDDVADRILVMLKGALHEL 853 ******************************************************************* PP TIGR01238 334 kvgkpirlttdvGpvidaeakqnllahiekmkakakkvaqvkleddvesekgtfvaptlfelddlde 400 +g+ l td+Gpvi aeak n++ hi +m ++++ v q+ l+ e+ +gtfv+pt++el+++++ lcl|FitnessBrowser__Korea:Ga0059261_3926 854 SIGRTDSLSTDIGPVITAEAKANIEGHIARMLGMGRGVEQIELAG--ETAQGTFVPPTIIELQSIAD 918 *******************************************99..999***************** PP TIGR01238 401 lkkevfGpvlhvvrykadeldkvvdkinakGygltlGvhsrieetvrqiekrakvGnvyvnrnlvGa 467 l++evfGpvlhvvryk+++ld+v+d ina+Gyglt+G+h+r +et++++ +r+++Gn+y+nrn++Ga lcl|FitnessBrowser__Korea:Ga0059261_3926 919 LEREVFGPVLHVVRYKRRDLDRVIDAINATGYGLTFGLHTRLDETIAHVSQRVEAGNLYINRNIIGA 985 ******************************************************************* PP TIGR01238 468 vvGvqpfGGeGlsGtGpkaGGplylyrltrv 498 vvGvqpfGG+GlsGtGpkaGGplyl rl+++ lcl|FitnessBrowser__Korea:Ga0059261_3926 986 VVGVQPFGGRGLSGTGPKAGGPLYLGRLVQT 1016 ****************************986 PP Internal pipeline statistics summary: ------------------------------------- Query model(s): 1 (500 nodes) Target sequences: 1 (1199 residues searched) Passed MSV filter: 1 (1); expected 0.0 (0.02) Passed bias filter: 1 (1); expected 0.0 (0.02) Passed Vit filter: 1 (1); expected 0.0 (0.001) Passed Fwd filter: 1 (1); expected 0.0 (1e-05) Initial search space (Z): 1 [actual number of targets] Domain search space (domZ): 1 [number of targets reported over threshold] # CPU time: 0.02u 0.00s 00:00:00.02 Elapsed: 00:00:00.02 # 
Mc/sec: 24.20 // [ok]
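As a rough illustration of how to read the hmmsearch domain table above, the sketch below computes how much of the TIGR01238 model and how much of the candidate protein the single reported domain covers. The numbers are copied from the output; the helper name `span_length` is hypothetical.

```python
# Values copied from the hmmsearch domain table above.
model_length = 500            # "M=500" for TIGR01238
hmm_from, hmm_to = 2, 498     # aligned positions within the model
ali_from, ali_to = 527, 1016  # aligned positions within the candidate
target_length = 1199          # candidate protein length

def span_length(start, end):
    """Number of positions in an inclusive 1-based range."""
    return end - start + 1

model_coverage = span_length(hmm_from, hmm_to) / model_length
target_coverage = span_length(ali_from, ali_to) / target_length

print(f"model coverage:  {model_coverage:.1%}")
print(f"target coverage: {target_coverage:.1%}")
```

The model is covered almost end to end, while the hit occupies only part of the candidate (residues 527 to 1016). That pattern is consistent with the candidate's annotation as a fusion of proline dehydrogenase and delta-1-pyrroline-5-carboxylate dehydrogenase.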
This GapMind analysis is from Sep 17 2021. The underlying query database was built on Sep 17 2021.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other blast hits with at least 50% coverage are "low confidence."
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
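For readers unfamiliar with the term, a six-frame translation translates a DNA sequence in all three reading frames on both strands. The sketch below is a generic illustration using the standard genetic code; it is not GapMind's or Curated BLAST's implementation, and the function names are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons ordered over the bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return dna.upper().translate(COMPLEMENT)[::-1]

def translate(dna):
    """Translate complete codons from the start of dna; unknown codons become X."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")
        for i in range(0, len(dna) - 2, 3)
    )

def six_frame_translation(dna):
    """Translate both strands in all three frames, keyed as +1..+3 and -1..-3."""
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames

if __name__ == "__main__":
    for name, protein in six_frame_translation("ATGGCCTAA").items():
        print(name, protein)
```

Searching these translations can reveal genes that were missed or truncated during gene prediction, which is why a step may look like a gap even though the genome encodes the enzyme.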
For more information, see:
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory