Align L-glutamate gamma-semialdehyde dehydrogenase (EC 1.2.1.88); Proline dehydrogenase (EC 1.5.5.2) (characterized)
to candidate WP_012708559.1 NGR_RS21360 trifunctional transcriptional regulator/proline dehydrogenase/L-glutamate gamma-semialdehyde dehydrogenase
Query= reanno::azobra:AZOBR_RS23695 (1235 letters) >NCBI__GCF_000018545.1:WP_012708559.1 Length = 1230 Score = 1580 bits (4092), Expect = 0.0 Identities = 818/1229 (66%), Positives = 945/1229 (76%), Gaps = 6/1229 (0%) Query: 8 PSAAPGEAAPFADFAPPIRPATELRAAITAAYRRPEPECLPFLFEQASLPPGVITAAAAT 67 P + P A PFA FAPPIR + LR ITAAYRRPE ECLP L E A L A Sbjct: 5 PLSEPDTARPFAAFAPPIRQQSTLRQEITAAYRRPETECLPPLVEAARLTGATKAKVAVI 64 Query: 68 ARKLITALRAKPRGRGVEGLIHEYSLSSQEGMALMCLAEALLRIPDHATRDALIRDKIAG 127 ARKLI ALRAK +G GVEGL+HEYSLSSQEG+ALMCLAEALLRIPD TRDALIRDKIA Sbjct: 65 ARKLIEALRAKHKGTGVEGLVHEYSLSSQEGVALMCLAEALLRIPDTETRDALIRDKIAE 124 Query: 128 GDWQAHLGKGGSMFVNAATWGLLITGKLTSAGGEQALSSALTRLIARGGEPLIRRGVDFA 187 GDW++HLG SMFVNAATWGL++TGKLTS +++L++AL+RLIAR GEP+IRRGVD A Sbjct: 125 GDWKSHLGGTKSMFVNAATWGLVVTGKLTSTVNDRSLAAALSRLIARAGEPVIRRGVDMA 184 Query: 188 MRMMGEQFVTGQTIQEALTNARTMEAEGFRYSYDMLGEAALTAEDAARYYADYVNAIHAI 247 MRMMGEQFVTG+TIQEA+ ++ +EA GFRYSYDMLGEAA TA DA RY+ DY NAIHAI Sbjct: 185 MRMMGEQFVTGETIQEAIKRSKPLEARGFRYSYDMLGEAATTAADATRYFRDYENAIHAI 244 Query: 248 GTASAGRGVYEGPGISIKLSAIHPRYSRAQADRVMDELLPRVKALALLAKGYDIGLNIDA 307 G AS GRGVYEGPGISIKLSA+HPRY+R Q +RV+ ELLP+VK LALLAK YDIGLNIDA Sbjct: 245 GKASGGRGVYEGPGISIKLSALHPRYTRTQTERVVGELLPKVKQLALLAKRYDIGLNIDA 304 Query: 308 EEADRLELSLDLMESLCFDPDLAGWNGIGFVVQAYGKRCPYVIDFLIDLARRSGHRLMIR 367 EEADRLELSLDL+E L D DL GW+G+GFVVQAYGKRCP+V+D++IDLARRSG R+M+R Sbjct: 305 EEADRLELSLDLLEELSLDKDLGGWDGLGFVVQAYGKRCPFVLDYVIDLARRSGRRIMLR 364 Query: 368 LVKGAYWDSEIKRAQLDGLPDFPVYTRKVYTDVSYVACARKLLAAPEAVFPQFATHNAQT 427 LVKGAYWD+EIKRAQLDGL FPVYTRKV+TDV+Y+ACARKLLAAP+A+FPQFATHNAQ+ Sbjct: 365 LVKGAYWDAEIKRAQLDGLEGFPVYTRKVHTDVAYIACARKLLAAPDAIFPQFATHNAQS 424 Query: 428 LATIYEMAGSDFQVGKYEFQCLHGMGEPLYKEVVGPLK--RPCRIYAPVGTHETLLAYLV 485 LATIYE+AG DF+VGKYEFQCLHGMGEPLY EVVG K RPCRIYAPVGTHETLLAYLV Sbjct: 425 LATIYELAGPDFEVGKYEFQCLHGMGEPLYDEVVGKAKLDRPCRIYAPVGTHETLLAYLV 484 Query: 486 RRLLENGANSSFVNRIADPAVPVDELVADPVAVARAIAPTGAPHALIALPRNLYAPERAN 545 RRLLENGANSSFVNRI+D V VDEL+ DPV V 
A+ G PH IALP +LY ERAN Sbjct: 485 RRLLENGANSSFVNRISDENVSVDELITDPVDVVEAMPVVGMPHDQIALPADLYGRERAN 544 Query: 546 SAGIDLSDETELARLSAALSASAEMTWTAAPLLADGERAGQAQPVRNPADRRDVVGSVTE 605 S G+DLS+E LA L++ L A+ W A PLLADG G+ +PV NPAD DVVGSVTE Sbjct: 545 SKGLDLSNEDTLADLASRLQATVTQDWHATPLLADGSLEGKTRPVVNPADHNDVVGSVTE 604 Query: 606 ASEALVAEAFGHAVAAASAWAATPPEERAASLFRAADTMQERMPTLLGLIVREAGKSLPN 665 + + A A+ WAAT P ERA L RAAD MQER+ L+ + +REAGKS N Sbjct: 605 LAVEDASRIARMAAEGAAQWAATSPAERADCLERAADLMQERLEVLMAIAMREAGKSAAN 664 Query: 666 AIAEVREAIDFLRYYGAQVRDRFDNATHRPLGPVVCISPWNFPLAIFSGQIAAALAAGNP 725 A+ EVREAIDFLRYY Q R +H PLG V+CISPWNFPLAIF+GQ+AAAL AGN Sbjct: 665 AVGEVREAIDFLRYYAVQARKTL-GPSHAPLGAVLCISPWNFPLAIFTGQVAAALVAGNS 723 Query: 726 VLAKPAEETPLIAAEAVRILHAAGIPAGALQLLPGAGEVGAALVGHEAVRGVMFTGSTEV 785 V+AKPA TP+IA E+V+ILH AG+P GALQ +PG+G +GAALV GVMFTGSTEV Sbjct: 724 VMAKPAGVTPIIAFESVKILHEAGVPRGALQFIPGSGRLGAALVAAPETAGVMFTGSTEV 783 Query: 786 ARLIQRQLAGRLLPDGAPIPLIAETGGQNAMIVDSSALAEQVVGDVIASAFDSAGQRCSA 845 AR IQ QLA RL G PIPLIAETGGQN MIVDSSALAEQVV DVIASAFDSAGQRCSA Sbjct: 784 ARQIQAQLAERLSASGKPIPLIAETGGQNGMIVDSSALAEQVVFDVIASAFDSAGQRCSA 843 Query: 846 LRILCLQEDVADRTLAMLKGAMRELRIGNPDRLAVDVGPVISEEARATIAAHIEAMRAKG 905 LR+LCLQEDVADRTL MLKGA++EL IG D+L++D+GPVI+ A+ I +HIE MR+ G Sbjct: 844 LRVLCLQEDVADRTLTMLKGALKELTIGRTDKLSIDIGPVINTGAKEEIESHIERMRSLG 903 Query: 906 RNVEFLPLPAETADGTFIAPTVIEIGGIHELEREVFGPVLHVVRFHRDDLDALVDSINAT 965 VE LPLP GTF+APT+IE+ + +L REVFGPVLHV+R+ R+DLD L+D +N + Sbjct: 904 CKVEQLPLPRAAELGTFVAPTIIELKKMSDLTREVFGPVLHVMRYRREDLDRLIDEVNNS 963 Query: 966 GYGLTFGLHTRIDATIERVTGRIGAGNVYVNRNTIGAVVGVQPFGGHGLSGTGPKAGGPL 1025 GYGLTFGLHTR+D TIE VT RI AGN+YVNRN IGAVVGVQPFGG GLSGTGPKAGGPL Sbjct: 964 GYGLTFGLHTRLDETIEHVTSRIKAGNLYVNRNIIGAVVGVQPFGGRGLSGTGPKAGGPL 1023 Query: 1026 YLSRLLSRRPKGWLEFRGPDAARAAGLAYGEWLRAKGFTAEASRCAGYVARSAIGGGAEL 1085 Y+ RL+ R P G A + WL +G A+A SA+G EL Sbjct: 1024 YIGRLVQRAPVP--PRHGSVHTDPALREFASWLGRRGHNADADAARDLADVSALGLDQEL 1081 Query: 1086 
NGPVGERNLYELHGRGRVLLLPQTRTGLLLQLGAVLATGNSAAVDAPPDLAELLRGLPPA 1145 GPVGERNLY LH RGR+LL+P+T+ GL QL A LATGN +D +L L +P + Sbjct: 1082 AGPVGERNLYALHPRGRILLVPKTQRGLFRQLAAALATGNEVVIDKASNLENALNSVPAS 1141 Query: 1146 LAARVRTTADWRDVGPLAAVLVEGDRERVTAINRRVADLPGPILLVQAATAEALAAGRGE 1205 +AAR+ +ADW GP A LVEGD RV +N+R+A L GP++LVQAAT E LA G Sbjct: 1142 IAARLSWSADWNSAGPFAGALVEGDVSRVQEVNKRIAALDGPLVLVQAATPEELAKDAG- 1200 Query: 1206 GYDLDLLLNERSVSVNTAAAGGNASLVAM 1234 Y L+ LL E S S+NTAAAGGNA+L+ + Sbjct: 1201 AYCLNWLLEEVSTSINTAAAGGNANLMTI 1229 Lambda K H 0.319 0.136 0.396 Gapped Lambda K H 0.267 0.0410 0.140 Matrix: BLOSUM62 Gap Penalties: Existence: 11, Extension: 1 Number of Sequences: 1 Number of Hits to DB: 3633 Number of extensions: 147 Number of successful extensions: 6 Number of sequences better than 1.0e-02: 1 Number of HSP's gapped: 1 Number of HSP's successfully gapped: 1 Length of query: 1235 Length of database: 1230 Length adjustment: 47 Effective length of query: 1188 Effective length of database: 1183 Effective search space: 1405404 Effective search space used: 1405404 Neighboring words threshold: 11 Window for multiple hits: 40 X1: 16 ( 7.4 bits) X2: 38 (14.6 bits) X3: 64 (24.7 bits) S1: 41 (21.8 bits) S2: 59 (27.3 bits)
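The summary statistics reported in the BLAST output above can be checked with a few lines of arithmetic. The following is a minimal sketch, not GapMind's or BLAST's own code: the function names are hypothetical, the bit-score formula is the standard Karlin-Altschul conversion, and the numbers are copied from the report. Because Lambda and K are printed in rounded form, the recomputed bit score is only approximate.

```python
import math


def effective_search_space(query_len, db_len, length_adjustment, num_seqs=1):
    """Effective search space after subtracting the length adjustment."""
    eff_query = query_len - length_adjustment
    eff_db = db_len - num_seqs * length_adjustment
    return eff_query * eff_db


def bit_score(raw_score, lam, k):
    """Karlin-Altschul conversion of a raw score to bits."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def percent_identity(identities, alignment_length):
    """Identical positions as a percentage of the alignment length."""
    return 100.0 * identities / alignment_length


# Values from the report: query 1235 letters, database 1230,
# length adjustment 47, one sequence in the database.
space = effective_search_space(1235, 1230, 47)
print(space)  # 1405404, matching "Effective search space"

# 818/1229 is about 66.6%; the report shows 66%, so the percentage
# appears to be truncated rather than rounded.
print(int(percent_identity(818, 1229)))  # 66

# Gapped Lambda = 0.267 and K = 0.0410 as printed (rounded), raw score 4092.
bits = bit_score(4092, 0.267, 0.0410)
print(round(bits, 1))  # close to the reported 1580 bits
```

The effective lengths (1188 and 1183) and their product agree exactly with the report; the bit score agrees only to within about one bit because of the rounded parameters.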
Align candidate WP_012708559.1 NGR_RS21360 (trifunctional transcriptional regulator/proline dehydrogenase/L-glutamate gamma-semialdehyde dehydrogenase)
to HMM TIGR01238 (delta-1-pyrroline-5-carboxylate dehydrogenase (EC 1.2.1.88))
# hmmsearch :: search profile(s) against a sequence database # HMMER 3.3.1 (Jul 2020); http://hmmer.org/ # Copyright (C) 2020 Howard Hughes Medical Institute. # Freely distributed under the BSD open source license. # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - # query HMM file: ../tmp/path.carbon/TIGR01238.hmm # target sequence database: /tmp/gapView.1908210.genome.faa # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Query: TIGR01238 [M=500] Accession: TIGR01238 Description: D1pyr5carbox3: delta-1-pyrroline-5-carboxylate dehydrogenase Scores for complete sequences (score includes all domains): --- full sequence --- --- best 1 domain --- -#dom- E-value score bias E-value score bias exp N Sequence Description ------- ------ ----- ------- ------ ----- ---- -- -------- ----------- 7.5e-216 703.3 0.5 7.5e-216 703.3 0.5 1.5 2 NCBI__GCF_000018545.1:WP_012708559.1 Domain annotation for each sequence (and alignments): >> NCBI__GCF_000018545.1:WP_012708559.1 # score bias c-Evalue i-Evalue hmmfrom hmm to alifrom ali to envfrom env to acc --- ------ ----- --------- --------- ------- ------- ------- ------- ------- ------- ---- 1 ! 703.3 0.5 7.5e-216 7.5e-216 1 497 [. 536 1030 .. 536 1033 .. 0.98 2 ? -3.4 0.1 0.14 0.14 180 196 .. 1112 1128 .. 1110 1131 .. 
0.82 Alignments for each domain: == domain 1 score: 703.3 bits; conditional E-value: 7.5e-216 TIGR01238 1 dlygegrknslGvdlaneselksleeqllkaaakkfqaapivgekakaegeaqpvknpadrkdivGqvsea 71 dlyg+ r ns+G+dl+ne++l++l ++l+++++++++a p++ + eg+++pv+npad++d+vG v+e NCBI__GCF_000018545.1:WP_012708559.1 536 DLYGRERANSKGLDLSNEDTLADLASRLQATVTQDWHATPLL-ADGSLEGKTRPVVNPADHNDVVGSVTEL 605 89****************************************.667789********************** PP TIGR01238 72 daaevqeavdsavaafaewsatdakeraailerladlleshmpelvallvreaGktlsnaiaevreavdfl 142 +++++ a + +a+w at ++era +ler+adl+++++ l+a+++reaGk+ na+ evrea+dfl NCBI__GCF_000018545.1:WP_012708559.1 606 AVEDASRIARMAAEGAAQWAATSPAERADCLERAADLMQERLEVLMAIAMREAGKSAANAVGEVREAIDFL 676 *********************************************************************** PP TIGR01238 143 ryyakqvedvldeesakalGavvcispwnfplaiftGqiaaalaaGntviakpaeqtsliaaravellqea 213 ryya q++++l+ + +lGav+cispwnfplaiftGq+aaal+aGn+v+akpa t++ia ++v++l+ea NCBI__GCF_000018545.1:WP_012708559.1 677 RYYAVQARKTLGPS-HAPLGAVLCISPWNFPLAIFTGQVAAALVAGNSVMAKPAGVTPIIAFESVKILHEA 746 ***********987.9******************************************************* PP TIGR01238 214 GvpagviqllpGrGedvGaaltsderiaGviftGstevarlinkalakredap...vpliaetGGqnamiv 281 Gvp g++q++pG G +Gaal + + aGv+ftGstevar+i+ +la+r a +pliaetGGqn miv NCBI__GCF_000018545.1:WP_012708559.1 747 GVPRGALQFIPGSGR-LGAALVAAPETAGVMFTGSTEVARQIQAQLAERLSASgkpIPLIAETGGQNGMIV 816 **************9.*********************************9886666*************** PP TIGR01238 282 dstalaeqvvadvlasafdsaGqrcsalrvlcvqedvadrvltlikGamdelkvgkpirlttdvGpvidae 352 ds+alaeqvv dv+asafdsaGqrcsalrvlc+qedvadr+lt++kGa++el +g+ +l d+Gpvi+ NCBI__GCF_000018545.1:WP_012708559.1 817 DSSALAEQVVFDVIASAFDSAGQRCSALRVLCLQEDVADRTLTMLKGALKELTIGRTDKLSIDIGPVINTG 887 *********************************************************************** PP TIGR01238 353 akqnllahiekmkakakkvaqvkleddvesekgtfvaptlfelddldelkkevfGpvlhvvrykadeldkv 423 ak+++++hie+m++ + kv q+ l ++e gtfvapt++el+++++l +evfGpvlhv+ry++++ld++ 
NCBI__GCF_000018545.1:WP_012708559.1 888 AKEEIESHIERMRSLGCKVEQLPLPR--AAELGTFVAPTIIELKKMSDLTREVFGPVLHVMRYRREDLDRL 956 **********************9987..999**************************************** PP TIGR01238 424 vdkinakGygltlGvhsrieetvrqiekrakvGnvyvnrnlvGavvGvqpfGGeGlsGtGpkaGGplylyr 494 +d++n++Gyglt+G+h+r +et+ ++++r+k+Gn+yvnrn++GavvGvqpfGG+GlsGtGpkaGGply+ r NCBI__GCF_000018545.1:WP_012708559.1 957 IDEVNNSGYGLTFGLHTRLDETIEHVTSRIKAGNLYVNRNIIGAVVGVQPFGGRGLSGTGPKAGGPLYIGR 1027 *********************************************************************** PP TIGR01238 495 ltr 497 l++ NCBI__GCF_000018545.1:WP_012708559.1 1028 LVQ 1030 997 PP == domain 2 score: -3.4 bits; conditional E-value: 0.14 TIGR01238 180 qiaaalaaGntviakpa 196 q+aaala+Gn v+ a NCBI__GCF_000018545.1:WP_012708559.1 1112 QLAAALATGNEVVIDKA 1128 899*******9986554 PP Internal pipeline statistics summary: ------------------------------------- Query model(s): 1 (500 nodes) Target sequences: 1 (1230 residues searched) Passed MSV filter: 1 (1); expected 0.0 (0.02) Passed bias filter: 1 (1); expected 0.0 (0.02) Passed Vit filter: 1 (1); expected 0.0 (0.001) Passed Fwd filter: 1 (1); expected 0.0 (1e-05) Initial search space (Z): 1 [actual number of targets] Domain search space (domZ): 1 [number of targets reported over threshold] # CPU time: 0.01u 0.01s 00:00:00.02 Elapsed: 00:00:00.01 # Mc/sec: 41.05 // [ok]
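To see how much of the TIGR01238 model each reported domain covers, the per-domain table rows in the hmmsearch output above can be parsed directly. This is a hedged sketch: `DomainHit`, `parse_domain_row`, and `model_coverage` are hypothetical helpers, and the column positions are assumed from the table header shown above (score, bias, c-Evalue, i-Evalue, hmmfrom, hmm to, alifrom, ali to, ...).

```python
from typing import NamedTuple


class DomainHit(NamedTuple):
    score: float
    i_evalue: float
    hmm_from: int
    hmm_to: int
    ali_from: int
    ali_to: int


def parse_domain_row(row: str) -> DomainHit:
    """Parse one whitespace-separated row of the per-domain table."""
    f = row.split()
    # f[0] is the domain number and f[1] the significance flag ("!" or "?").
    return DomainHit(
        score=float(f[2]),
        i_evalue=float(f[5]),
        hmm_from=int(f[6]),
        hmm_to=int(f[7]),
        ali_from=int(f[9]),
        ali_to=int(f[10]),
    )


def model_coverage(hit: DomainHit, model_length: int) -> float:
    """Fraction of the HMM's match states spanned by this domain hit."""
    return (hit.hmm_to - hit.hmm_from + 1) / model_length


# The two domain rows from the report; the model has M=500 nodes.
rows = [
    "1 ! 703.3 0.5 7.5e-216 7.5e-216 1 497 [. 536 1030 .. 536 1033 .. 0.98",
    "2 ? -3.4 0.1 0.14 0.14 180 196 .. 1112 1128 .. 1110 1131 .. 0.82",
]
hits = [parse_domain_row(r) for r in rows]
for hit in hits:
    print(hit.ali_from, hit.ali_to, round(model_coverage(hit, 500), 3))
```

Domain 1 spans model positions 1 to 497 of 500, so it covers nearly the whole model, while domain 2 spans only 17 positions and has a negative score.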
This GapMind analysis is from Apr 09 2024. The underlying query database was built on Sep 17 2021.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMER with enzyme models (usually from TIGRFAMs). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other blast hits with at least 50% coverage are "low confidence."
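The 50% coverage floor for low-confidence hits can be illustrated with the alignment coordinates from the BLAST report above. This is only a sketch under stated assumptions: the function names are hypothetical, coverage is assumed to be measured against the curated (query) protein, and the high- and medium-confidence criteria are not reproduced in this text, so only the 50% floor is checked.

```python
def alignment_coverage(start: int, end: int, length: int) -> float:
    """Fraction of a sequence spanned by an alignment (1-based, inclusive)."""
    return (end - start + 1) / length


def meets_low_confidence_floor(coverage: float, min_coverage: float = 0.5) -> bool:
    """True if a hit reaches the 50% coverage floor stated in the text."""
    return coverage >= min_coverage


# From the report: the alignment spans query positions 8..1234 of 1235
# and candidate (subject) positions 5..1229 of 1230.
query_cov = alignment_coverage(8, 1234, 1235)
subject_cov = alignment_coverage(5, 1229, 1230)

print(round(query_cov, 3), round(subject_cov, 3))
print(meets_low_confidence_floor(query_cov))
```

For this hit the alignment spans almost the full length of both proteins, so it clears the floor comfortably; whether it also qualifies as high or medium confidence depends on criteria not listed here.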
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
For more information, see:
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory