Align phosphogluconate dehydratase (characterized)
to candidate 6937660 Sama_1810 phosphogluconate dehydratase (RefSeq)
Query= CharProtDB::CH_024239 (603 letters)

>FitnessBrowser__SB2B:6937660
          Length = 608

 Score = 823 bits (2125), Expect = 0.0
 Identities = 405/600 (67%), Positives = 493/600 (82%)

Query: 1   MNPQLLRVTNRIIERSRETRSAYLARIEQAKTSTVHRSQLACGNLAHGFAACQPEDKASL 60
           M+P +  VT+RIIERS+E+RSAYLA +++A++  VHRS L+CGNLAHGFAAC  EDK SL
Sbjct: 1   MHPVVKSVTDRIIERSKESRSAYLAALQEARSGKVHRSALSCGNLAHGFAACGAEDKQSL 60

Query: 61  KSMLRNNIAIITSYNDMLSAHQPYEHYPEIIRKALHEANAVGQVAGGVPAMCDGVTQGQD 120
           + + + NI I+T++NDMLSAHQPYEHYPE+++ A +E  +V QVAGGVPAMCDGVTQGQ
Sbjct: 61  RQLTKVNIGIVTAFNDMLSAHQPYEHYPELLKAACNEVGSVAQVAGGVPAMCDGVTQGQP 120

Query: 121 GMELSLLSREVIAMSAAVGLSHNMFDGALFLGVCDKIVPGLTMAALSFGHLPAVFVPSGP 180
           GMELSLLSREVIAM+ AVGLSHNMFDGAL LGVCDKIVPGL + A+SFGHLP +FVP+GP
Sbjct: 121 GMELSLLSREVIAMATAVGLSHNMFDGALLLGVCDKIVPGLLIGAMSFGHLPMLFVPAGP 180

Query: 181 MASGLPNKEKVRIRQLYAEGKVDRMALLESEAASYHAPGTCTFYGTANTNQMVVEFMGMQ 240
           M SG+PNKEK R+RQ +AEGKVDR ALLE+EA+SYH+ GTCTFYGTAN+NQ+V+E MG+Q
Sbjct: 181 MRSGIPNKEKARVRQKFAEGKVDREALLEAEASSYHSAGTCTFYGTANSNQLVLEVMGLQ 240

Query: 241 LPGSSFVHPDSPLRDALTAAAARQVTRMTGNGNEWMPIGKMIDEKVVVNGIVALLATGGS 300
           LPGSSFV+PD PLR  L+  AA+QV R+T NG ++ PIG++++EK VVNGIVALLATGGS
Sbjct: 241 LPGSSFVNPDDPLRTELSKMAAKQVCRLTENGLQYSPIGEIVNEKSVVNGIVALLATGGS 300

Query: 301 TNHTMHLVAMARAAGIQINWDDFSDLSDVVPLMARLYPNGPADINHFQAAGGVPVLVREL 360
           TN TMH+VA ARAAGI INWDDFS+LSD VPL+AR+YPNG ADINHF AAGG+  L++EL
Sbjct: 301 TNLTMHIVAAARAAGIIINWDDFSELSDAVPLLARVYPNGHADINHFHAAGGMAFLMKEL 360

Query: 361 LKAGLLHEDVNTVAGFGLSRYTLEPWLNNGELDWREGAEKSLDSNVIASFEQPFSHHGGT 420
           L AGL+HEDVNTVAG+GL RYT EP L +G+L W +G   SLD  V+    +PF  +GG
Sbjct: 361 LDAGLIHEDVNTVAGYGLRRYTQEPRLIDGQLTWVDGPVTSLDQEVLRGVAEPFQSNGGL 420

Query: 421 KVLSGNLGRAVMKTSAVPVENQVIEAPAVVFESQHDVMPAFEAGLLDRDCVVVVRHQGPK 480
           K++ GNLGRAV+K SAV  +++++EAPAVV + Q+ +   F+AG LDRDCVVVV+ QGPK
Sbjct: 421 KLMKGNLGRAVIKVSAVQEQHRIVEAPAVVIDDQNKLDALFKAGELDRDCVVVVKGQGPK 480

Query: 481 ANGMPELHKLMPPLGVLLDRCFKIALVTDGRLSGASGKVPSAIHVTPEAYDGGLLAKVRD 540
           ANGMPELHKL P LG L DR FK+AL+TDGR+SGASGKVP+AIH+TPEA DGGL+AKV+D
Sbjct: 481 ANGMPELHKLTPILGTLQDRGFKVALMTDGRMSGASGKVPAAIHLTPEALDGGLIAKVQD 540

Query: 541 GDIIRVNGQTGELTLLVDEAELAAREPHIPDLSASRVGTGRELFSALREKLSGAEQGATC 600
           GD+IR+N  TGEL+LLV   EL +R     +L  SR G GRELF ALR+ LS  E GA C
Sbjct: 541 GDLIRINAITGELSLLVSAPELESRTAAPVELRKSRYGMGRELFGALRQNLSSPETGARC 600

Lambda     K      H
   0.318    0.134    0.392

Gapped
Lambda     K      H
   0.267   0.0410    0.140

Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 1090
Number of extensions: 41
Number of successful extensions: 1
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 603
Length of database: 608
Length adjustment: 37
Effective length of query: 566
Effective length of database: 571
Effective search space: 323186
Effective search space used: 323186
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 53 (25.0 bits)
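The summary numbers in the BLAST report above can be cross-checked with a few lines of arithmetic. The sketch below assumes the standard Karlin-Altschul conversion from raw score to bits and the usual length-adjustment rule for the effective search space; the function names are illustrative and are not part of GapMind or BLAST.

```python
import math

def bit_score(raw_score, lam, k):
    """Convert a raw score to bits: S' = (lambda * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)

def effective_search_space(query_len, db_len, length_adjustment, n_seqs=1):
    """Product of the effective query length and effective database length."""
    return (query_len - length_adjustment) * (db_len - n_seqs * length_adjustment)

# Values copied from the report: raw score 2125, gapped lambda and K.
bits = bit_score(2125, lam=0.267, k=0.0410)
space = effective_search_space(603, 608, 37)

# Expected number of chance hits at this score; far below what BLAST prints as 0.0.
evalue = space * 2 ** (-bits)

# BLAST prints 67%; the unrounded value is 67.5%.
pct_identity = 100 * 405 / 600
```

With the report's values, `bits` rounds to the printed 823 and `space` equals the printed 323186.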
Align candidate 6937660 Sama_1810 (phosphogluconate dehydratase (RefSeq))
to HMM TIGR01196 (edd: phosphogluconate dehydratase (EC 4.2.1.12))
# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR01196.hmm
# target sequence database:        /tmp/gapView.32501.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR01196  [M=601]
Accession:   TIGR01196
Description: edd: phosphogluconate dehydratase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                         Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                         -----------
          0 1017.9   0.7          0 1017.7   0.7    1.0  1  lcl|FitnessBrowser__SB2B:6937660  Sama_1810 phosphogluconate dehyd

Domain annotation for each sequence (and alignments):
>> lcl|FitnessBrowser__SB2B:6937660  Sama_1810 phosphogluconate dehydratase (RefSeq)
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 ! 1017.7   0.7         0         0       1     599 [.       2     600 ..       2     602 ..    1.00

  Alignments for each domain:
  == domain 1  score: 1017.7 bits;  conditional E-value: 0
                         TIGR01196   1 hsrlaeiteriierskktrekylekirsaktkgklrstlgcgnlahgvaalsesekvelksekrknlaiitayndml 77
                                       h+ ++++t+riiersk+ r++yl+ +++a++  ++rs+l+cgnlahg+aa+  ++k++l++ ++ n++i+ta+ndml
  lcl|FitnessBrowser__SB2B:6937660   2 HPVVKSVTDRIIERSKESRSAYLAALQEARSGKVHRSALSCGNLAHGFAACGAEDKQSLRQLTKVNIGIVTAFNDML 78
                                       677899*********************************************************************** PP

                         TIGR01196  78 sahqpfkeypdlikkalqeanavaqvagGvpamcdGvtqGedGmelsllsrdvialstaiglshnmfdgalflGvcd 154
                                       sahqp+++yp+l+k a++e ++vaqvagGvpamcdGvtqG++Gmelsllsr+via++ta+glshnmfdgal+lGvcd
  lcl|FitnessBrowser__SB2B:6937660  79 SAHQPYEHYPELLKAACNEVGSVAQVAGGVPAMCDGVTQGQPGMELSLLSREVIAMATAVGLSHNMFDGALLLGVCD 155
                                       ***************************************************************************** PP

                         TIGR01196 155 kivpGlliaalsfGhlpavfvpaGpmasGlenkekakvrqlfaeGkvdreellksemasyhapGtctfyGtansnqm 231
                                       kivpGlli+a+sfGhlp +fvpaGpm sG++nkeka+vrq faeGkvdre+ll++e++syh++GtctfyGtansnq+
  lcl|FitnessBrowser__SB2B:6937660 156 KIVPGLLIGAMSFGHLPMLFVPAGPMRSGIPNKEKARVRQKFAEGKVDREALLEAEASSYHSAGTCTFYGTANSNQL 232
                                       ***************************************************************************** PP

                         TIGR01196 232 lvelmGlhlpgasfvnpntplrdaltreaakrlarltakngevlplaelideksivnalvgllatGGstnhtlhlva 308
                                       ++e+mGl+lpg+sfvnp+ plr +l++ aak++ rlt ++ ++ p++e+++eks+vn++v+llatGGstn t+h+va
  lcl|FitnessBrowser__SB2B:6937660 233 VLEVMGLQLPGSSFVNPDDPLRTELSKMAAKQVCRLTENGLQYSPIGEIVNEKSVVNGIVALLATGGSTNLTMHIVA 309
                                       ***************************************************************************** PP

                         TIGR01196 309 iaraaGiilnwddlselsdlvpllarvypnGkadvnhfeaaGGlsflirellkeGllhedvetvagkGlrrytkepf 385
                                        araaGii+nwdd+selsd vpllarvypnG+ad+nhf+aaGG++fl++ell++Gl+hedv+tvag Glrryt+ep
  lcl|FitnessBrowser__SB2B:6937660 310 AARAAGIIINWDDFSELSDAVPLLARVYPNGHADINHFHAAGGMAFLMKELLDAGLIHEDVNTVAGYGLRRYTQEPR 386
                                       ***************************************************************************** PP

                         TIGR01196 386 ledgkleyreaaeksldedilrkvdkpfsaeGGlkllkGnlGravikvsavkeesrvieapaivfkdqaellaafka 462
                                       l dg+l++ +++ +sld+++lr v +pf+++GGlkl+kGnlGravikvsav+e++r++eapa+v +dq++l a fka
  lcl|FitnessBrowser__SB2B:6937660 387 LIDGQLTWVDGPVTSLDQEVLRGVAEPFQSNGGLKLMKGNLGRAVIKVSAVQEQHRIVEAPAVVIDDQNKLDALFKA 463
                                       ***************************************************************************** PP

                         TIGR01196 463 gelerdlvavvrfqGpkanGmpelhklttvlGvlqdrgfkvalvtdGrlsGasGkvpaaihvtpealegGalakird 539
                                       gel+rd+v+vv+ qGpkanGmpelhklt+ lG lqdrgfkval+tdGr+sGasGkvpaaih+tpeal+gG +ak++d
  lcl|FitnessBrowser__SB2B:6937660 464 GELDRDCVVVVKGQGPKANGMPELHKLTPILGTLQDRGFKVALMTDGRMSGASGKVPAAIHLTPEALDGGLIAKVQD 540
                                       ***************************************************************************** PP

                         TIGR01196 540 GdlirldavngelevlvddaelkareleeldlednelGlGrelfaalrekvssaeeGass 599
                                       Gdlir++a++gel++lv   el++r+ + ++l ++++G+Grelf alr+++ss e+Ga +
  lcl|FitnessBrowser__SB2B:6937660 541 GDLIRINAITGELSLLVSAPELESRTAAPVELRKSRYGMGRELFGALRQNLSSPETGARC 600
                                       *********************************************************987 PP

Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (601 nodes)
Target sequences:                          1  (608 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.03u 0.01s 00:00:00.04 Elapsed: 00:00:00.03
# Mc/sec: 9.50
//
[ok]
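How much of the model the domain hit covers can be read off the per-domain row of the hmmsearch output above. The following is a minimal sketch that splits that row on whitespace and computes coverage; the helper name and the returned keys are hypothetical, and the field positions assume the 16-column layout shown in this report.

```python
def parse_domain_row(row):
    """Split one row of the hmmsearch per-domain table into named fields."""
    f = row.split()
    return {
        "score": float(f[2]),
        "hmm_from": int(f[6]),
        "hmm_to": int(f[7]),
        "ali_from": int(f[9]),
        "ali_to": int(f[10]),
        "acc": float(f[15]),
    }

# Row copied from the domain table in this report.
dom = parse_domain_row("1 ! 1017.7 0.7 0 0 1 599 [. 2 600 .. 2 602 .. 1.00")

model_length = 601    # from "[M=601]"
target_length = 608   # from "608 residues searched"

# Fraction of the model and of the candidate protein inside the alignment.
model_coverage = (dom["hmm_to"] - dom["hmm_from"] + 1) / model_length
target_coverage = (dom["ali_to"] - dom["ali_from"] + 1) / target_length
```

Here the hit spans 599 of the 601 model positions, so `model_coverage` is just under 1.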
This GapMind analysis is from Sep 17 2021. The underlying query database was built on Sep 17 2021.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMER with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other blast hits with at least 50% coverage are "low confidence."
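The 50% coverage floor for low-confidence hits can be expressed as a simple check. This is only a sketch: the criteria for high and medium confidence are not listed here, so they are not modeled, and it assumes that coverage means the aligned fraction of the characterized (query) protein. The function names are illustrative.

```python
def coverage(aln_start, aln_end, seq_length):
    """Fraction of a sequence covered by an alignment (1-based, inclusive)."""
    return (aln_end - aln_start + 1) / seq_length

def meets_low_confidence_floor(aln_start, aln_end, seq_length, min_coverage=0.5):
    """True if a blast hit reaches the 50% coverage floor described above."""
    return coverage(aln_start, aln_end, seq_length) >= min_coverage

# The BLAST hit in this report aligns query positions 1-600 of 603.
hit_coverage = coverage(1, 600, 603)
passes = meets_low_confidence_floor(1, 600, 603)
```

A hit that clears this floor may still qualify as medium or high confidence under the stricter rules, which this check does not evaluate.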
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
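For readers unfamiliar with the term, a six-frame translation reads the DNA in three offsets on each strand, so that proteins missed by gene prediction can still be found by a protein search. The sketch below is a generic illustration using the standard genetic code; it is not GapMind's or Curated BLAST's implementation, and the function names are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def translate(dna):
    """Translate complete codons; unknown codons (e.g. containing N) become X."""
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))

def six_frame_translation(dna):
    """Return translations for frames +1, +2, +3, then -1, -2, -3."""
    dna = dna.upper()
    reverse_complement = dna.translate(COMPLEMENT)[::-1]
    return [translate(strand[offset:])
            for strand in (dna, reverse_complement)
            for offset in range(3)]

frames = six_frame_translation("ATGGCC")
```

Stop codons appear as `*` in the output, so open reading frames can be recovered by splitting each frame on that character.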
For more information, see:
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory