Align L-glutamate gamma-semialdehyde dehydrogenase (EC 1.2.1.88); Proline dehydrogenase (EC 1.5.5.2) (characterized)
to candidate WP_012402366.1 BPHY_RS15380 trifunctional transcriptional regulator/proline dehydrogenase/L-glutamate gamma-semialdehyde dehydrogenase
Query= reanno::Cup4G11:RR42_RS20125 (1333 letters) >NCBI__GCF_000020045.1:WP_012402366.1 Length = 1320 Score = 1744 bits (4516), Expect = 0.0 Identities = 923/1357 (68%), Positives = 1064/1357 (78%), Gaps = 61/1357 (4%) Query: 1 MATTTLGVKLDDASRERLKRVAQSIDRTPHWLIKQAIFTYLEQVERGNIPHETSAAGTGS 60 MA+TTLGVK+DD R RLK A ++RTPHWLIKQAIF YLE++E G +P E S G Sbjct: 1 MASTTLGVKVDDLLRTRLKDAATRLERTPHWLIKQAIFAYLEKIEHGQLPAELS----GH 56 Query: 61 EGAADGAD-AFDGAASDGAIQPFLEFAQSVQPQSVLRAAITAAYRRPESECVPVLLEQAR 119 GA + AD A D SDG + PFLEFAQSVQPQSVLRAAITAAYRRPE ECVP L+ QAR Sbjct: 57 HGATELADGAADPDESDG-LHPFLEFAQSVQPQSVLRAAITAAYRRPEPECVPFLIGQAR 115 Query: 120 LPHQQAEAALAMARTLATRLRERKVGTGREGLVQGLIQEFSLSSQEGVALMCLAEALLRI 179 LP A AMA L LR + G G V+GLI EFSLSSQEGVALMCLAEALLRI Sbjct: 116 LPANIANDVQAMASKLVEALRSKSTGGG----VEGLIHEFSLSSQEGVALMCLAEALLRI 171 Query: 180 PDKATRDALIRDKISGANWQSHLGQSPSVFVNAATWGLLFTGKLVATHTEAGLSKALTRI 239 PD+ATRDALIRDKIS +W+SH+G +PS+FVNAATWGL+ TGKLV T++E GLS ALTR+ Sbjct: 172 PDRATRDALIRDKISKGDWRSHVGHAPSLFVNAATWGLMITGKLVTTNSETGLSSALTRM 231 Query: 240 IGKGGEPLIRKGVDMAMRLMGEQFVTGETISEALANARKYEAEGFRYSYDMLGEAAMTEA 299 IGKGGEPLIRKGVDMAMRLMGEQFVTGETISEALAN+RK+EA GFRYSYDMLGEAA TEA Sbjct: 232 IGKGGEPLIRKGVDMAMRLMGEQFVTGETISEALANSRKFEARGFRYSYDMLGEAATTEA 291 Query: 300 DAQRYLASYEQAINAIGQASRGRGIYEGPGISIKLSALHPRYSRAQHERVIGELYGRLKS 359 DAQRY ASYEQAI+AIG+A+ GRGIYEGPGISIKLSALHPRYSR+Q ER + EL R+++ Sbjct: 292 DAQRYYASYEQAIHAIGKAAGGRGIYEGPGISIKLSALHPRYSRSQQERTMSELLPRVRA 351 Query: 360 LTLLARQYDIGINIDAEEADRLEISLDLLERLCFEPELAGWNGIGFVVQGYQKRCPFVID 419 L +LAR+YDIG+NIDAEEADRLEISLDLLE LCF+PEL GWNGIGFVVQ YQKRCPFVID Sbjct: 352 LAILARRYDIGLNIDAEEADRLEISLDLLEALCFDPELQGWNGIGFVVQAYQKRCPFVID 411 Query: 420 YLIDLARRSRHRLMIRLVKGAYWDSEIKRAQVDGLEGYPVYTRKVYTDVSYVACARKLLS 479 Y++DLARRSRHR+M+RLVKGAYWD+EIKRAQVDGLEGYPVYTRK+YTDVSY+ACA+KLL Sbjct: 412 YIVDLARRSRHRIMVRLVKGAYWDTEIKRAQVDGLEGYPVYTRKIYTDVSYLACAKKLLG 471 Query: 480 VPDVIYPQFATHNAHTLAAIYQIAGHNYYPGQYEFQCLHGMGEPLYDQVVGPLADGKFNR 539 PD +YPQFATHNAHTL+AIY +AG 
NYYPGQYEFQCLHGMGEPLY++V G K NR Sbjct: 472 APDAVYPQFATHNAHTLSAIYHLAGQNYYPGQYEFQCLHGMGEPLYEEVTG---RDKLNR 528 Query: 540 PCRIYAPVGTHETLLAYLVRRLLENGANTSFVNRIADDTISLDELVADPVAVVEQMHADE 599 PCR+YAPVGTHETLLAYLVRRLLENGANTSFVNRIAD+T+ + +LVADPV DE Sbjct: 529 PCRVYAPVGTHETLLAYLVRRLLENGANTSFVNRIADETVPVQDLVADPV--------DE 580 Query: 600 GA----LGLPHPRIAQPRTLYGESRANSAGIDLSNEHRLASLSSALLAGTSEAVSAVPLL 655 A LG PH +I PR LYG R NS G+DLSNEHRLASLSSALLA + A P+L Sbjct: 581 AAKIVPLGAPHAKIPLPRNLYGAERTNSMGLDLSNEHRLASLSSALLASANHPWRAAPML 640 Query: 656 -GTEAAAGEDVNQPAPVRNPSDQRDVVGHVTEASMAEVEAALQAAVNAAPIWQATPADVR 714 G E A G + VRNPSD RD+VG V EA+ V AAL AV AAPIWQATP + R Sbjct: 641 EGNEIAVG----RARDVRNPSDHRDLVGTVVEATPEHVSAALAHAVAAAPIWQATPVEAR 696 Query: 715 AAALERAAELMEAQMQSLMGIIVREAGKTFSNAIAEVREAVDFLRYYAAQVRETFSSDTH 774 A L RAA+L+EAQM +LMG++VREAGK+ NA+AE+REA+DFLRYY++Q+R+ FS+DTH Sbjct: 697 ADCLARAADLLEAQMHTLMGLVVREAGKSLPNAVAEIREAIDFLRYYSSQIRDEFSNDTH 756 Query: 775 RPLGPVVCISPWNFPLAIFTGQVAAALAAGNTVLAKPAEQTPLIAAQAVRLLREAGVPAG 834 RPLGPVVCISPWNFPLAIF GQVAAALAAGNTVLAKPAEQTPLIAAQAVR+LREAGVPAG Sbjct: 757 RPLGPVVCISPWNFPLAIFMGQVAAALAAGNTVLAKPAEQTPLIAAQAVRILREAGVPAG 816 Query: 835 AVQLLPGRGETVGAALVGDARVKGVMFTGSTEVARLLQRSVAGRLDAAGRPVPLIAETGG 894 AVQLLPG GETVGAALV DAR + VMFTGSTEVARL+ ++++ RLD G+P+PLIAETGG Sbjct: 817 AVQLLPGDGETVGAALVADARTRAVMFTGSTEVARLINKTLSNRLDPDGKPIPLIAETGG 876 Query: 895 QNAMIVDSSALAEQVVGDVVNSAFDSAGQRCSALRVLCLQEEVADRVLEMLKGAMDELTM 954 QNAMIVDSSALAEQVV DV+ S+FDSAGQRCSALRVLCLQ++VADR LEML GAM EL + Sbjct: 877 QNAMIVDSSALAEQVVADVLQSSFDSAGQRCSALRVLCLQDDVADRTLEMLTGAMKELAV 936 Query: 955 GNPDRLSTDVGPVIDEEARGNIVRHIDAMRAKGRRVHQAD-PNGALSAACRNGTFVSPTL 1013 GNPDRLS DVGPVID EA+ I HI +MR KGR+V Q P+G C GTFV PTL Sbjct: 937 GNPDRLSIDVGPVIDAEAKRGIDAHIASMREKGRKVTQMPMPDG-----CAAGTFVPPTL 991 Query: 1014 IELDSIEELQREVFGPVLHVVRYPRAGLDTLLAQINGTGYGLTMGIHTRIDETIEHIVER 1073 IELD+I+EL+REVFGPVLHVVRY R+ LD LL QI TGYGLT+GIHTRIDETI H++ R Sbjct: 992 IELDNIDELKREVFGPVLHVVRYRRSALDKLLEQIRATGYGLTLGIHTRIDETIAHVIGR 1051 Query: 
1074 AEVGNLYVNRNIVGAVVGVQPFGGEGLSGTGPKAGGPLYLHRLLSVCPL---DAVARVVR 1130 A VGN+YVNRN++GAVVGVQPFGGEGLSGTGPKAGG LYL RLL+ P ++AR + Sbjct: 1052 AHVGNIYVNRNVIGAVVGVQPFGGEGLSGTGPKAGGALYLQRLLATRPAGLPKSLARTLM 1111 Query: 1131 ASDTVGGAD-------ETGPVRRTLTETLATLKEW--AQRESAALPGLVAACERFAAASA 1181 + G A+ L TL++W A+RE P L A C+ + + Sbjct: 1112 VDASQGAANGGQSAAQNGNAASDNPAAALTTLRDWLIAERE----PVLAARCDGYLSHIP 1167 Query: 1182 AGLSVTLPGPTGERNTYTLLPRAAVLCLAQQETDLAVQLAAVLAAGSQAVWVESPMARAL 1241 AG + L GPTGERNTYTL R VLC+A + VQ AA LA G++A++ E L Sbjct: 1168 AGATAVLAGPTGERNTYTLGARGTVLCVASTASGARVQFAAALATGNKALF-EGAAGEQL 1226 Query: 1242 FARLPKAVQSRVRLVADWSAGDTGFDAVLHHGDSDQLRAVCEQLATRPGPIISVQGLA-- 1299 +A LP +++ + + + FDA L GDSD+L A+ + +A R GPI+SVQG+A Sbjct: 1227 YAALPPSLKQYASVKKN---AEASFDAALFEGDSDELLALVKDIAKRAGPIVSVQGVAAR 1283 Query: 1300 ---HGEPNIAIERLLIERSLSVNTAAAGGNASLMTIG 1333 G+ + A+ERLL ERS+SVNTAAAGGNA+LMTIG Sbjct: 1284 ALESGDEDYALERLLTERSVSVNTAAAGGNANLMTIG 1320 Lambda K H 0.318 0.133 0.383 Gapped Lambda K H 0.267 0.0410 0.140 Matrix: BLOSUM62 Gap Penalties: Existence: 11, Extension: 1 Number of Sequences: 1 Number of Hits to DB: 4023 Number of extensions: 157 Number of successful extensions: 11 Number of sequences better than 1.0e-02: 1 Number of HSP's gapped: 1 Number of HSP's successfully gapped: 1 Length of query: 1333 Length of database: 1320 Length adjustment: 49 Effective length of query: 1284 Effective length of database: 1271 Effective search space: 1631964 Effective search space used: 1631964 Neighboring words threshold: 11 Window for multiple hits: 40 X1: 16 ( 7.3 bits) X2: 38 (14.6 bits) X3: 64 (24.7 bits) S1: 41 (21.7 bits) S2: 59 (27.3 bits)
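As a quick sanity check on the BLAST summary above, the reported percentages and the effective search space can be recomputed from the raw counts. This is a minimal sketch; the variable names are illustrative, and the truncating percentage is an assumption that happens to reproduce the figures shown.

```python
# Values copied from the BLAST report above.
identities, positives, gaps, aln_len = 923, 1064, 61, 1357
query_len, db_len, length_adjustment = 1333, 1320, 49


def percent(part, whole):
    """Integer percentage, truncated (assumed rounding; matches the report here)."""
    return part * 100 // whole


pct_identity = percent(identities, aln_len)   # reported as 68%
pct_positive = percent(positives, aln_len)    # reported as 78%
pct_gaps = percent(gaps, aln_len)             # reported as 4%

# The effective lengths are the raw lengths minus the length adjustment,
# and the effective search space is their product.
eff_query = query_len - length_adjustment
eff_db = db_len - length_adjustment
search_space = eff_query * eff_db

print(pct_identity, pct_positive, pct_gaps, eff_query, eff_db, search_space)
```

The computed effective lengths (1284 and 1271) and their product agree with the "Effective search space" line of the report.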
Align candidate WP_012402366.1 BPHY_RS15380 (trifunctional transcriptional regulator/proline dehydrogenase/L-glutamate gamma-semialdehyde dehydrogenase)
to HMM TIGR01238 (delta-1-pyrroline-5-carboxylate dehydrogenase (EC 1.2.1.88))
# hmmsearch :: search profile(s) against a sequence database # HMMER 3.3.1 (Jul 2020); http://hmmer.org/ # Copyright (C) 2020 Howard Hughes Medical Institute. # Freely distributed under the BSD open source license. # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - # query HMM file: ../tmp/path.carbon/TIGR01238.hmm # target sequence database: /tmp/gapView.32171.genome.faa # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Query: TIGR01238 [M=500] Accession: TIGR01238 Description: D1pyr5carbox3: delta-1-pyrroline-5-carboxylate dehydrogenase Scores for complete sequences (score includes all domains): --- full sequence --- --- best 1 domain --- -#dom- E-value score bias E-value score bias exp N Sequence Description ------- ------ ----- ------- ------ ----- ---- -- -------- ----------- 3.1e-244 797.0 0.8 5.2e-244 796.2 0.8 1.3 1 lcl|NCBI__GCF_000020045.1:WP_012402366.1 BPHY_RS15380 trifunctional trans Domain annotation for each sequence (and alignments): >> lcl|NCBI__GCF_000020045.1:WP_012402366.1 BPHY_RS15380 trifunctional transcriptional regulator/proline dehydrogenase/ # score bias c-Evalue i-Evalue hmmfrom hmm to alifrom ali to envfrom env to acc --- ------ ----- --------- --------- ------- ------- ------- ------- ------- ------- ---- 1 ! 796.2 0.8 5.2e-244 5.2e-244 1 499 [. 599 1098 .. 599 1099 .. 
0.99 Alignments for each domain: == domain 1 score: 796.2 bits; conditional E-value: 5.2e-244 TIGR01238 1 dlygegrknslGvdlaneselksleeqllkaaakkfqaapivgekakaegeaqpvknpadrkdivGq 67 +lyg r+ns+G+dl+ne++l+sl++ ll++a++ ++aap+++++ a+g a+ v+np d++d+vG+ lcl|NCBI__GCF_000020045.1:WP_012402366.1 599 NLYGAERTNSMGLDLSNEHRLASLSSALLASANHPWRAAPMLEGNEIAVGRARDVRNPSDHRDLVGT 665 59***************************************************************** PP TIGR01238 68 vseadaaevqeavdsavaafaewsatdakeraailerladlleshmpelvallvreaGktlsnaiae 134 v ea ++v++a+ avaa+++w+at+ + ra +l r+adlle +m +l++l+vreaGk+l na+ae lcl|NCBI__GCF_000020045.1:WP_012402366.1 666 VVEATPEHVSAALAHAVAAAPIWQATPVEARADCLARAADLLEAQMHTLMGLVVREAGKSLPNAVAE 732 ******************************************************************* PP TIGR01238 135 vreavdflryyakqvedvldeesakalGavvcispwnfplaiftGqiaaalaaGntviakpaeqtsl 201 +rea+dflryy+ q++d+++++++++lG+vvcispwnfplaif+Gq+aaalaaGntv+akpaeqt+l lcl|NCBI__GCF_000020045.1:WP_012402366.1 733 IREAIDFLRYYSSQIRDEFSNDTHRPLGPVVCISPWNFPLAIFMGQVAAALAAGNTVLAKPAEQTPL 799 ******************************************************************* PP TIGR01238 202 iaaravellqeaGvpagviqllpGrGedvGaaltsderiaGviftGstevarlinkalakredap.. 
266 iaa+av +l+eaGvpag++qllpG Ge+vGaal +d+r + v+ftGstevarlink+l++r d++ lcl|NCBI__GCF_000020045.1:WP_012402366.1 800 IAAQAVRILREAGVPAGAVQLLPGDGETVGAALVADARTRAVMFTGSTEVARLINKTLSNRLDPDgk 866 ***************************************************************8777 PP TIGR01238 267 .vpliaetGGqnamivdstalaeqvvadvlasafdsaGqrcsalrvlcvqedvadrvltlikGamde 332 +pliaetGGqnamivds+alaeqvvadvl+s+fdsaGqrcsalrvlc+q+dvadr+l+++ Gam+e lcl|NCBI__GCF_000020045.1:WP_012402366.1 867 pIPLIAETGGQNAMIVDSSALAEQVVADVLQSSFDSAGQRCSALRVLCLQDDVADRTLEMLTGAMKE 933 7****************************************************************** PP TIGR01238 333 lkvgkpirlttdvGpvidaeakqnllahiekmkakakkvaqvkleddvesekgtfvaptlfelddld 399 l vg+p rl dvGpvidaeak+ + ahi m++k++kv+q+ + d + gtfv+ptl+eld++d lcl|NCBI__GCF_000020045.1:WP_012402366.1 934 LAVGNPDRLSIDVGPVIDAEAKRGIDAHIASMREKGRKVTQMPMPD--GCAAGTFVPPTLIELDNID 998 ********************************************99..9****************** PP TIGR01238 400 elkkevfGpvlhvvrykadeldkvvdkinakGygltlGvhsrieetvrqiekrakvGnvyvnrnlvG 466 elk+evfGpvlhvvry+++ ldk++++i a+GygltlG+h+ri+et++++ +ra+vGn+yvnrn++G lcl|NCBI__GCF_000020045.1:WP_012402366.1 999 ELKREVFGPVLHVVRYRRSALDKLLEQIRATGYGLTLGIHTRIDETIAHVIGRAHVGNIYVNRNVIG 1065 ******************************************************************* PP TIGR01238 467 avvGvqpfGGeGlsGtGpkaGGplylyrltrvr 499 avvGvqpfGGeGlsGtGpkaGG+lyl+rl+ +r lcl|NCBI__GCF_000020045.1:WP_012402366.1 1066 AVVGVQPFGGEGLSGTGPKAGGALYLQRLLATR 1098 *****************************9765 PP Internal pipeline statistics summary: ------------------------------------- Query model(s): 1 (500 nodes) Target sequences: 1 (1320 residues searched) Passed MSV filter: 1 (1); expected 0.0 (0.02) Passed bias filter: 1 (1); expected 0.0 (0.02) Passed Vit filter: 1 (1); expected 0.0 (0.001) Passed Fwd filter: 1 (1); expected 0.0 (1e-05) Initial search space (Z): 1 [actual number of targets] Domain search space (domZ): 1 [number of targets reported over threshold] # CPU time: 0.03u 0.01s 00:00:00.04 Elapsed: 
00:00:00.04 # Mc/sec: 15.57 // [ok]
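The domain table above gives the model and sequence coordinates of the single hit, so the fraction of the HMM that is aligned can be worked out directly. The sketch below uses only numbers from the hmmsearch output; the variable names are illustrative.

```python
# Values copied from the hmmsearch domain table above.
model_len = 500                 # TIGR01238 [M=500]
hmm_from, hmm_to = 1, 499       # aligned region of the model
ali_from, ali_to = 599, 1098    # aligned region of the candidate
seq_len = 1320                  # length of WP_012402366.1

# Fraction of the model covered by the alignment.
model_coverage = (hmm_to - hmm_from + 1) / model_len

# Number and fraction of candidate residues in the aligned region.
aligned_residues = ali_to - ali_from + 1
sequence_fraction = aligned_residues / seq_len

print(f"model coverage: {model_coverage:.1%}")
print(f"aligned residues: {aligned_residues} of {seq_len} ({sequence_fraction:.1%})")
```

The hit spans nearly the whole model but only part of the candidate, which is what one would expect when a dehydrogenase model matches one domain of a trifunctional protein.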
This GapMind analysis is from Sep 24 2021. The underlying query database was built on Sep 17 2021.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other blast hits with at least 50% coverage are "low confidence."
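Once each candidate carries a confidence label, a step can be flagged as a possible gap when none of its candidates is high or medium confidence. The following is a minimal sketch of that rule only. The step names, the dictionary layout, and the function name are hypothetical, and it does not reproduce GapMind's own code or its confidence criteria.

```python
def find_gaps(candidates_by_step):
    """Return the steps that have no high- or medium-confidence candidate.

    candidates_by_step maps a step name to a list of confidence labels
    ("high", "medium", or "low"), one per candidate. Hypothetical layout.
    """
    gaps = []
    for step, labels in candidates_by_step.items():
        if not any(label in ("high", "medium") for label in labels):
            gaps.append(step)
    return gaps


# Hypothetical example data: step names are placeholders, not real GapMind steps.
example = {
    "stepA": ["high", "low"],
    "stepB": ["low"],
    "stepC": [],
    "stepD": ["medium"],
}

gaps = find_gaps(example)
print(gaps)
```

A step with only low-confidence candidates and a step with no candidates at all are treated the same way in this sketch.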
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
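For readers unfamiliar with the term, a six-frame translation translates a DNA sequence in all three reading frames on both strands, so that a protein missed by gene prediction can still be found. The sketch below is a plain-Python illustration and is not how Curated BLAST is implemented. It assumes the standard codon-to-amino-acid mapping and writes unknown or ambiguous codons as "X"; the function names are illustrative.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Reverse complement of a DNA string (A, C, G, T only are complemented)."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons from the start of dna; '*' marks a stop codon."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translation(dna):
    """Return a dict of frame label ('+1'..'+3', '-1'..'-3') to protein string."""
    frames = {}
    for strand, seq in (("+", dna.upper()), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


frames = six_frame_translation("ATGGCCTAA")
for label, protein in frames.items():
    print(label, protein)
```

Each of the six protein strings can then be searched in the same way as a predicted protein.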
For more information, see:
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory