Align phosphogluconate dehydratase (characterized)
to candidate WP_044621646.1 H744_RS07730 phosphogluconate dehydratase
Query= CharProtDB::CH_024239 (603 letters) >NCBI__GCF_000940995.1:WP_044621646.1 Length = 598 Score = 612 bits (1577), Expect = e-179 Identities = 311/599 (51%), Positives = 419/599 (69%), Gaps = 5/599 (0%) Query: 1 MNPQLLRVTNRIIERSRETRSAYLARIEQAKTSTVHRSQLACGNLAHGFAACQPEDKASL 60 ++P + +T RI RS+ RS++ A+I Q R+ L+CGNLAH AA +DK + Sbjct: 2 IHPTITLITERIKARSQAHRSSFEAKITQQAEQGKGRTSLSCGNLAHAVAASCQQDKQQI 61 Query: 61 KSMLRNNIAIITSYNDMLSAHQPYEHYPEIIRKALHEANAVGQVAGGVPAMCDGVTQGQD 120 R N+AI++SYNDMLSAHQ Y++YP+ I+ AL Q+AG VPAMCDGVTQGQ Sbjct: 62 LDFTRANLAIVSSYNDMLSAHQLYKNYPDQIKSALQSLGHTAQIAGCVPAMCDGVTQGQP 121 Query: 121 GMELSLLSREVIAMSAAVGLSHNMFDGALFLGVCDKIVPGLTMAALSFGHLPAVFVPSGP 180 GM++SL SR++IA + A LSHN+FD L LG+CDKI PG M ALS+ HLP F+P+GP Sbjct: 122 GMDMSLFSRDLIAQATAFSLSHNVFDATLLLGICDKIAPGQLMGALSYAHLPTAFIPAGP 181 Query: 181 MASGLPNKEKVRIRQLYAEGKVDRMALLESEAASYHAPGTCTFYGTANTNQMVVEFMGMQ 240 M +G+ N +KV +RQ Y G+V + ALLE E +YH+ GTCTFYGTANTNQ+V E MG+ Sbjct: 182 MPTGITNDKKVAVRQQYVAGEVGKDALLEMECQAYHSGGTCTFYGTANTNQLVFEAMGLM 241 Query: 241 LPGSSFVHPDSPLRDALTAAAARQVTRMTGNGNEWMPIGKMIDEKVVVNGIVALLATGGS 300 LPGS+FV P++ LR ALT A+ + MT + + P+ + + +VNG+VALLA+GGS Sbjct: 242 LPGSAFVAPNTDLRSALTEHASLALASMTADSPNYRPLIDVFTAENLVNGVVALLASGGS 301 Query: 301 TNHTMHLVAMARAAGIQINWDDFSDLSDVVPLMARLYPNGPADINHFQAAGGVPVLVREL 360 TNHT+H++A+ARA G+ + WDD S+LSD+VPL+A++YPNGPADIN F+ AGGVP L++ L Sbjct: 302 TNHTIHMLAIARAGGLVLTWDDISELSDIVPLLAKMYPNGPADINAFEQAGGVPALMKTL 361 Query: 361 LKAGLLHEDVNTVAGFGLSRYTLEPWLNNGELDWREGAEKSLDSNVIASFEQPFSHHGGT 420 + GLL+ DV TV G + T +P + + +L W+ E S ++ VIA+ + FS GGT Sbjct: 362 HERGLLNSDVKTVFGEFADQLT-QPAITDSKLVWQPVGE-SRNAEVIAANGKAFSQTGGT 419 Query: 421 KVLSGNLGRAVMKTSAVPVENQVIEAPAVVFESQHDVMPAFEAGLLDRDCVVVVRHQGPK 480 KVL GNLG+AV+K SAV E+Q IEAPA VF QHDV A++AG DCV+VV H GP Sbjct: 420 KVLKGNLGQAVIKVSAVKQEHQFIEAPAKVFHCQHDVEAAYQAGEFTGDCVIVVSHNGPA 479 Query: 481 ANGMPELHKLMPPLGVLLDRCFKIALVTDGRLSGASGKVPSAIHVTPEAYDGGLLAKVRD 540 ANGMPELHKLMP LG + ++ALVTDGRLSGASGK+P+AIHV+PEA GG + + D Sbjct: 480 
ANGMPELHKLMPILGNIQKAGHQVALVTDGRLSGASGKIPAAIHVSPEALRGGAIGMIND 539 Query: 541 GDIIRVNGQTGELTLLVDEAELAAREPHIPDLSASRVGTGRELFSALREKLSGAEQGAT 599 GDI++++ TG L + VD R+P D AS++ GR++F +RE + A++GA+ Sbjct: 540 GDIVKLDCSTGLLEVAVD---FDQRQPVQLDTEASQITWGRDIFKVMRENVGAADEGAS 595 Lambda K H 0.318 0.134 0.392 Gapped Lambda K H 0.267 0.0410 0.140 Matrix: BLOSUM62 Gap Penalties: Existence: 11, Extension: 1 Number of Sequences: 1 Number of Hits to DB: 966 Number of extensions: 39 Number of successful extensions: 5 Number of sequences better than 1.0e-02: 1 Number of HSP's gapped: 1 Number of HSP's successfully gapped: 1 Length of query: 603 Length of database: 598 Length adjustment: 37 Effective length of query: 566 Effective length of database: 561 Effective search space: 317526 Effective search space used: 317526 Neighboring words threshold: 11 Window for multiple hits: 40 X1: 16 ( 7.3 bits) X2: 38 (14.6 bits) X3: 64 (24.7 bits) S1: 41 (21.7 bits) S2: 53 (25.0 bits)
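The summary figures in the BLAST report above can be cross-checked with a few lines of arithmetic. The sketch below is not part of GapMind. It copies the numbers from the report and applies the standard Karlin-Altschul normalization (bit score = (lambda * S - ln K) / ln 2, E = effective search space * 2^-bits) using the gapped lambda and K values. All variable names are illustrative.

```python
import math

# Numbers copied from the BLAST report above.
raw_score = 1577
identities, positives, gaps, aln_len = 311, 419, 5, 599
query_len, subject_len = 603, 598
query_start, query_end = 1, 599
sbjct_start, sbjct_end = 2, 595
gapped_lambda, gapped_k = 0.267, 0.0410
length_adjustment = 37

# Percentages over the alignment length (BLAST prints these truncated).
pct_identity = 100 * identities / aln_len
pct_positive = 100 * positives / aln_len
pct_gaps = 100 * gaps / aln_len

# Fraction of each sequence spanned by the alignment.
query_cov = (query_end - query_start + 1) / query_len
sbjct_cov = (sbjct_end - sbjct_start + 1) / subject_len

# Karlin-Altschul normalization of the raw score into bits.
bit_score = (gapped_lambda * raw_score - math.log(gapped_k)) / math.log(2)

# Effective search space: both lengths shortened by the length adjustment.
search_space = (query_len - length_adjustment) * (subject_len - length_adjustment)
e_value = search_space * 2 ** (-bit_score)

if __name__ == "__main__":
    print(f"identity {pct_identity:.1f}%, positives {pct_positive:.1f}%")
    print(f"query coverage {query_cov:.3f}, subject coverage {sbjct_cov:.3f}")
    print(f"bit score {bit_score:.1f}, search space {search_space}")
    print(f"E-value {e_value:.1e}")
```

The recomputed values agree with the report: roughly 612 bits, an effective search space of 317526, and an E-value on the order of 1e-179.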
Align candidate WP_044621646.1 H744_RS07730 (phosphogluconate dehydratase)
to HMM TIGR01196 (edd: phosphogluconate dehydratase (EC 4.2.1.12))
# hmmsearch :: search profile(s) against a sequence database # HMMER 3.3.1 (Jul 2020); http://hmmer.org/ # Copyright (C) 2020 Howard Hughes Medical Institute. # Freely distributed under the BSD open source license. # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - # query HMM file: ../tmp/path.carbon/TIGR01196.hmm # target sequence database: /tmp/gapView.1846430.genome.faa # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Query: TIGR01196 [M=601] Accession: TIGR01196 Description: edd: phosphogluconate dehydratase Scores for complete sequences (score includes all domains): --- full sequence --- --- best 1 domain --- -#dom- E-value score bias E-value score bias exp N Sequence Description ------- ------ ----- ------- ------ ----- ---- -- -------- ----------- 2e-249 814.7 2.9 2.3e-249 814.6 2.9 1.0 1 NCBI__GCF_000940995.1:WP_044621646.1 Domain annotation for each sequence (and alignments): >> NCBI__GCF_000940995.1:WP_044621646.1 # score bias c-Evalue i-Evalue hmmfrom hmm to alifrom ali to envfrom env to acc --- ------ ----- --------- --------- ------- ------- ------- ------- ------- ------- ---- 1 ! 814.6 2.9 2.3e-249 2.3e-249 1 600 [. 3 597 .. 3 598 .] 
0.98 Alignments for each domain: == domain 1 score: 814.6 bits; conditional E-value: 2.3e-249 TIGR01196 1 hsrlaeiteriierskktrekylekirsaktkgklrstlgcgnlahgvaalsesekvelksekrknlaiitay 73 h+ ++ iteri +rs++ r+++ +ki++ +++gk r++l+cgnlah+vaa +++k+++ +r+nlai+++y NCBI__GCF_000940995.1:WP_044621646.1 3 HPTITLITERIKARSQAHRSSFEAKITQQAEQGKGRTSLSCGNLAHAVAASCQQDKQQILDFTRANLAIVSSY 75 678999******************************************************************* PP TIGR01196 74 ndmlsahqpfkeypdlikkalqeanavaqvagGvpamcdGvtqGedGmelsllsrdvialstaiglshnmfdg 146 ndmlsahq +k+ypd+ik+alq + +aq+ag vpamcdGvtqG++Gm++sl+srd+ia +ta +lshn+fd+ NCBI__GCF_000940995.1:WP_044621646.1 76 NDMLSAHQLYKNYPDQIKSALQSLGHTAQIAGCVPAMCDGVTQGQPGMDMSLFSRDLIAQATAFSLSHNVFDA 148 ************************************************************************* PP TIGR01196 147 alflGvcdkivpGlliaalsfGhlpavfvpaGpmasGlenkekakvrqlfaeGkvdreellksemasyhapGt 219 +l+lG+cdki pG l++als+ hlp+ f+paGpm++G++n++k++vrq++ G v++++ll+ e ++yh+ Gt NCBI__GCF_000940995.1:WP_044621646.1 149 TLLLGICDKIAPGQLMGALSYAHLPTAFIPAGPMPTGITNDKKVAVRQQYVAGEVGKDALLEMECQAYHSGGT 221 ************************************************************************* PP TIGR01196 220 ctfyGtansnqmlvelmGlhlpgasfvnpntplrdaltreaakrlarltakngevlplaelideksivnalvg 292 ctfyGtan+nq++ e mGl lpg++fv pnt+lr alt++a+ la++ta++ ++ pl ++ + +vn++v+ NCBI__GCF_000940995.1:WP_044621646.1 222 CTFYGTANTNQLVFEAMGLMLPGSAFVAPNTDLRSALTEHASLALASMTADSPNYRPLIDVFTAENLVNGVVA 294 ************************************************************************* PP TIGR01196 293 llatGGstnhtlhlvaiaraaGiilnwddlselsdlvpllarvypnGkadvnhfeaaGGlsflirellkeGll 365 lla+GGstnht+h++aiara G++l+wdd+selsd+vplla++ypnG ad+n+fe aGG++ l++ l ++Gll NCBI__GCF_000940995.1:WP_044621646.1 295 LLASGGSTNHTIHMLAIARAGGLVLTWDDISELSDIVPLLAKMYPNGPADINAFEQAGGVPALMKTLHERGLL 367 ************************************************************************* PP TIGR01196 366 hedvetvagkGlrrytkepfledgkleyreaaeksldedilrkvdkpfsaeGGlkllkGnlGravikvsavke 438 dv+tv g+ + t +p 
++d+kl++++ +s + +++++ k fs++GG+k+lkGnlG+avikvsavk+ NCBI__GCF_000940995.1:WP_044621646.1 368 NSDVKTVFGEFADQLT-QPAITDSKLVWQPVG-ESRNAEVIAANGKAFSQTGGTKVLKGNLGQAVIKVSAVKQ 438 *********9888776.6999*******9865.699************************************* PP TIGR01196 439 esrvieapaivfkdqaellaafkagelerdlvavvrfqGpkanGmpelhklttvlGvlqdrgfkvalvtdGrl 511 e++ ieapa+vf+ q+++ aa++age+ +d+v+vv ++Gp anGmpelhkl++ lG +q g++valvtdGrl NCBI__GCF_000940995.1:WP_044621646.1 439 EHQFIEAPAKVFHCQHDVEAAYQAGEFTGDCVIVVSHNGPAANGMPELHKLMPILGNIQKAGHQVALVTDGRL 511 ************************************************************************* PP TIGR01196 512 sGasGkvpaaihvtpealegGalakirdGdlirldavngelevlvddaelkareleeldlednelGlGrelfa 584 sGasGk+paaihv+peal gGa+ i+dGd+++ld+ +g lev vd ++ r++ +ld e++++ Gr++f NCBI__GCF_000940995.1:WP_044621646.1 512 SGASGKIPAAIHVSPEALRGGAIGMINDGDIVKLDCSTGLLEVAVD---FDQRQPVQLDTEASQITWGRDIFK 581 ***************************************9999875...78999999**************** PP TIGR01196 585 alrekvssaeeGassl 600 +re+v++a+eGas l NCBI__GCF_000940995.1:WP_044621646.1 582 VMRENVGAADEGASFL 597 ************9865 PP Internal pipeline statistics summary: ------------------------------------- Query model(s): 1 (601 nodes) Target sequences: 1 (598 residues searched) Passed MSV filter: 1 (1); expected 0.0 (0.02) Passed bias filter: 1 (1); expected 0.0 (0.02) Passed Vit filter: 1 (1); expected 0.0 (0.001) Passed Fwd filter: 1 (1); expected 0.0 (1e-05) Initial search space (Z): 1 [actual number of targets] Domain search space (domZ): 1 [number of targets reported over threshold] # CPU time: 0.00u 0.01s 00:00:00.01 Elapsed: 00:00:00.00 # Mc/sec: 38.99 // [ok]
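To read the hmmsearch domain table above programmatically, one option is to split the single domain row on whitespace and compute how much of the model and of the protein the alignment covers. This is a minimal sketch, not GapMind's own parser. `DomainHit`, `parse_domain_row`, and `coverage` are hypothetical names, and the row text, model length (M=601), and protein length (598) are copied from the output above.

```python
from dataclasses import dataclass


@dataclass
class DomainHit:
    score: float
    i_evalue: float
    hmm_from: int
    hmm_to: int
    ali_from: int
    ali_to: int


def parse_domain_row(row: str) -> DomainHit:
    """Parse one row of the per-domain table printed by hmmsearch.

    Column order: #, flag, score, bias, c-Evalue, i-Evalue, hmmfrom, hmmto,
    hmm bounds, alifrom, alito, ali bounds, envfrom, envto, env bounds, acc.
    """
    f = row.split()
    return DomainHit(
        score=float(f[2]),
        i_evalue=float(f[5]),
        hmm_from=int(f[6]),
        hmm_to=int(f[7]),
        ali_from=int(f[9]),
        ali_to=int(f[10]),
    )


def coverage(start: int, end: int, length: int) -> float:
    """Fraction of a sequence or model spanned by an inclusive interval."""
    return (end - start + 1) / length


ROW = "1 ! 814.6 2.9 2.3e-249 2.3e-249 1 600 [. 3 597 .. 3 598 .] 0.98"
MODEL_LEN = 601    # from "TIGR01196 [M=601]"
PROTEIN_LEN = 598  # from "598 residues searched"

hit = parse_domain_row(ROW)
model_cov = coverage(hit.hmm_from, hit.hmm_to, MODEL_LEN)
protein_cov = coverage(hit.ali_from, hit.ali_to, PROTEIN_LEN)

if __name__ == "__main__":
    print(f"score {hit.score}, model coverage {model_cov:.3f}, "
          f"protein coverage {protein_cov:.3f}")
```

Here the alignment spans model positions 1 to 600 of 601 and protein residues 3 to 597 of 598, so both coverages are above 99%.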
This GapMind analysis is from Sep 24 2021. The underlying query database was built on Sep 17 2021.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMER with enzyme models (usually from TIGRFAMs). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other blast hits with at least 50% coverage are "low confidence."
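Once each candidate carries one of the three confidence labels, a per-step summary is straightforward. The sketch below is illustrative only and assumes the labels have already been assigned. It does not reproduce the scoring thresholds. It ranks the labels, keeps the best one per step, and lists the steps whose best candidate falls below medium confidence. Apart from `edd`, the step names and the data are hypothetical.

```python
# Ordering of the three confidence labels used in this report.
RANK = {"low": 0, "medium": 1, "high": 2}


def best_confidence(labels):
    """Return the strongest label among a step's candidates, or None if empty."""
    return max(labels, key=RANK.__getitem__) if labels else None


def uncovered_steps(candidates_by_step):
    """List steps that have no high- or medium-confidence candidate."""
    uncovered = []
    for step, labels in candidates_by_step.items():
        best = best_confidence(labels)
        if best is None or RANK[best] < RANK["medium"]:
            uncovered.append(step)
    return uncovered


# Hypothetical input: step name -> confidence labels of its candidates.
example = {
    "edd": ["high"],
    "step_b": ["low", "low"],
    "step_c": [],
}

if __name__ == "__main__":
    for step, labels in example.items():
        print(step, best_confidence(labels))
    print("uncovered:", uncovered_steps(example))
```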
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
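For readers unfamiliar with the term, a six-frame translation translates a DNA sequence in all three reading frames on both strands, so that a protein can be found even where no gene was predicted. The sketch below illustrates the idea only. It is not how Curated BLAST is implemented, it uses the standard codon assignments, and the function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna: str) -> str:
    """Translate complete codons; unknown codons (e.g. with N) become X."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame(dna: str) -> dict:
    """Translate a DNA sequence in frames +1..+3 and -1..-3."""
    dna = dna.upper()
    rc = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


if __name__ == "__main__":
    for frame, protein in six_frame("ATGGCC").items():
        print(frame, protein)
```

Stop codons appear as `*` in the output. A real search would then compare each translated frame against the curated proteins.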
For more information, see:
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory