Align phosphogluconate dehydratase (characterized)
to candidate 201628 SO2487 6-phosphogluconate dehydratase (NCBI ptt file)
Query= CharProtDB::CH_024239
         (603 letters)

>FitnessBrowser__MR1:201628
          Length = 608

 Score =  806 bits (2082), Expect = 0.0
 Identities = 399/598 (66%), Positives = 484/598 (80%)

Query: 1   MNPQLLRVTNRIIERSRETRSAYLARIEQAKTSTVHRSQLACGNLAHGFAACQPEDKASL 60
           M+  +  VT+RII RS+ +R AYLA +  A+   VHRS L+CGNLAHGFAAC P+DK +L
Sbjct: 1   MHSVVQSVTDRIIARSKASREAYLAALNDARNHGVHRSSLSCGNLAHGFAACNPDDKNAL 60

Query: 61  KSMLRNNIAIITSYNDMLSAHQPYEHYPEIIRKALHEANAVGQVAGGVPAMCDGVTQGQD 120
           + + + NI IIT++NDMLSAHQPYE YP++++KA  E  +V QVAGGVPAMCDGVTQGQ
Sbjct: 61  RQLTKANIGIITAFNDMLSAHQPYETYPDLLKKACQEVGSVAQVAGGVPAMCDGVTQGQP 120

Query: 121 GMELSLLSREVIAMSAAVGLSHNMFDGALFLGVCDKIVPGLTMAALSFGHLPAVFVPSGP 180
           GMELSLLSREVIAM+ AVGLSHNMFDGAL LG+CDKIVPGL + ALSFGHLP +FVP+GP
Sbjct: 121 GMELSLLSREVIAMATAVGLSHNMFDGALLLGICDKIVPGLLIGALSFGHLPMLFVPAGP 180

Query: 181 MASGLPNKEKVRIRQLYAEGKVDRMALLESEAASYHAPGTCTFYGTANTNQMVVEFMGMQ 240
           M SG+PNKEK RIRQ +A+GKVDR  LLE+EA SYH+ GTCTFYGTAN+NQ+++E MG+Q
Sbjct: 181 MKSGIPNKEKARIRQQFAQGKVDRAQLLEAEAQSYHSAGTCTFYGTANSNQLMLEVMGLQ 240

Query: 241 LPGSSFVHPDSPLRDALTAAAARQVTRMTGNGNEWMPIGKMIDEKVVVNGIVALLATGGS 300
           LPGSSFV+PD PLR+AL   AA+QV R+T  G ++ PIG++++EK +VNGIVALLATGGS
Sbjct: 241 LPGSSFVNPDDPLREALNKMAAKQVCRLTELGTQYSPIGEVVNEKSIVNGIVALLATGGS 300

Query: 301 TNHTMHLVAMARAAGIQINWDDFSDLSDVVPLMARLYPNGPADINHFQAAGGVPVLVREL 360
           TN TMH+VA ARAAGI +NWDDFS+LSD VPL+AR+YPNG ADINHF AAGG+  L++EL
Sbjct: 301 TNLTMHIVAAARAAGIIVNWDDFSELSDAVPLLARVYPNGHADINHFHAAGGMAFLIKEL 360

Query: 361 LKAGLLHEDVNTVAGFGLSRYTLEPWLNNGELDWREGAEKSLDSNVIASFEQPFSHHGGT 420
           L AGLLHEDVNTVAG+GL RYT EP L +GEL W +G   SLD+ V+ S   PF ++GG
Sbjct: 361 LDAGLLHEDVNTVAGYGLRRYTQEPKLLDGELRWVDGPTVSLDTEVLTSVATPFQNNGGL 420

Query: 421 KVLSGNLGRAVMKTSAVPVENQVIEAPAVVFESQHDVMPAFEAGLLDRDCVVVVRHQGPK 480
           K+L GNLGRAV+K SAV  +++V+EAPAVV + Q+ +   F++G LDRDCVVVV+ QGPK
Sbjct: 421 KLLKGNLGRAVIKVSAVQPQHRVVEAPAVVIDDQNKLDALFKSGALDRDCVVVVKGQGPK 480

Query: 481 ANGMPELHKLMPPLGVLLDRCFKIALVTDGRLSGASGKVPSAIHVTPEAYDGGLLAKVRD 540
           ANGMPELHKL P LG L D+ FK+AL+TDGR+SGASGKVP+AIH+TPEA DGGL+AKV+D
Sbjct: 481 ANGMPELHKLTPLLGSLQDKGFKVALMTDGRMSGASGKVPAAIHLTPEAIDGGLIAKVQD 540

Query: 541 GDIIRVNGQTGELTLLVDEAELAAREPHIPDLSASRVGTGRELFSALREKLSGAEQGA 598
           GD+IRV+  TGEL+LLV + ELA R     DL  SR G GRELF  LR  LS  E GA
Sbjct: 541 GDLIRVDALTGELSLLVSDTELATRTATEIDLRHSRYGMGRELFGVLRSNLSSPETGA 598

Lambda     K      H
   0.318    0.134    0.392

Gapped
Lambda     K      H
   0.267   0.0410    0.140

Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 1061
Number of extensions: 31
Number of successful extensions: 1
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 603
Length of database: 608
Length adjustment: 37
Effective length of query: 566
Effective length of database: 571
Effective search space:   323186
Effective search space used:   323186
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 53 (25.0 bits)
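The summary numbers in the BLAST report can be cross-checked with simple arithmetic. The following is a minimal Python sketch, not part of GapMind or BLAST. It assumes that each effective length is the raw length minus the reported length adjustment, that the effective search space is the product of the two effective lengths, and that the percentages are truncated rather than rounded; all three assumptions are consistent with the values printed in this report. The function names are hypothetical.

```python
def effective_search_space(query_len, db_len, length_adjustment):
    """Return (effective query length, effective database length, search space).

    Assumes a single database sequence, so the adjustment is subtracted once
    from each length.
    """
    eff_query = query_len - length_adjustment
    eff_db = db_len - length_adjustment
    return eff_query, eff_db, eff_query * eff_db


def percent_truncated(matches, aligned_columns):
    """Integer percentage, truncated toward zero (assumed reporting convention)."""
    return (100 * matches) // aligned_columns


# Values taken from the report: query 603, database 608, adjustment 37.
print(effective_search_space(603, 608, 37))   # (566, 571, 323186)
# Identities 399/598 and positives 484/598.
print(percent_truncated(399, 598), percent_truncated(484, 598))   # 66 80
```

The printed values agree with the "Effective length", "Effective search space", "Identities", and "Positives" fields of the report.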
Align candidate 201628 SO2487 (6-phosphogluconate dehydratase (NCBI ptt file))
to HMM TIGR01196 (edd: phosphogluconate dehydratase (EC 4.2.1.12))
# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR01196.hmm
# target sequence database:        /tmp/gapView.23836.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR01196  [M=601]
Accession:   TIGR01196
Description: edd: phosphogluconate dehydratase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                        Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                        -----------
          0 1027.7   0.9          0 1027.5   0.9    1.0  1  lcl|FitnessBrowser__MR1:201628  SO2487 6-phosphogluconate dehydr

Domain annotation for each sequence (and alignments):
>> lcl|FitnessBrowser__MR1:201628  SO2487 6-phosphogluconate dehydratase (NCBI ptt file)
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 ! 1027.5   0.9         0         0       1     599 [.       2     600 ..       2     602 .. 1.00

  Alignments for each domain:
  == domain 1  score: 1027.5 bits;  conditional E-value: 0
                       TIGR01196   1 hsrlaeiteriierskktrekylekirsaktkgklrstlgcgnlahgvaalsesekvelksekrknlaiitayndmlsa 79
                                     hs ++++t+rii+rsk+ re+yl+ +++a+++g++rs+l+cgnlahg+aa+++++k +l++ +++n++iita+ndmlsa
  lcl|FitnessBrowser__MR1:201628   2 HSVVQSVTDRIIARSKASREAYLAALNDARNHGVHRSSLSCGNLAHGFAACNPDDKNALRQLTKANIGIITAFNDMLSA 80
                                     678999************************************************************************* PP

                       TIGR01196  80 hqpfkeypdlikkalqeanavaqvagGvpamcdGvtqGedGmelsllsrdvialstaiglshnmfdgalflGvcdkivp 158
                                     hqp+++ypdl+kka+qe ++vaqvagGvpamcdGvtqG++Gmelsllsr+via++ta+glshnmfdgal+lG+cdkivp
  lcl|FitnessBrowser__MR1:201628  81 HQPYETYPDLLKKACQEVGSVAQVAGGVPAMCDGVTQGQPGMELSLLSREVIAMATAVGLSHNMFDGALLLGICDKIVP 159
                                     ******************************************************************************* PP

                       TIGR01196 159 GlliaalsfGhlpavfvpaGpmasGlenkekakvrqlfaeGkvdreellksemasyhapGtctfyGtansnqmlvelmG 237
                                     Glli+alsfGhlp +fvpaGpm sG++nkeka++rq+fa+Gkvdr +ll++e++syh++GtctfyGtansnq+++e+mG
  lcl|FitnessBrowser__MR1:201628 160 GLLIGALSFGHLPMLFVPAGPMKSGIPNKEKARIRQQFAQGKVDRAQLLEAEAQSYHSAGTCTFYGTANSNQLMLEVMG 238
                                     ******************************************************************************* PP

                       TIGR01196 238 lhlpgasfvnpntplrdaltreaakrlarltakngevlplaelideksivnalvgllatGGstnhtlhlvaiaraaGii 316
                                     l+lpg+sfvnp+ plr+al + aak++ rlt  + ++ p++e+++eksivn++v+llatGGstn t+h+va araaGii
  lcl|FitnessBrowser__MR1:201628 239 LQLPGSSFVNPDDPLREALNKMAAKQVCRLTELGTQYSPIGEVVNEKSIVNGIVALLATGGSTNLTMHIVAAARAAGII 317
                                     ******************************************************************************* PP

                       TIGR01196 317 lnwddlselsdlvpllarvypnGkadvnhfeaaGGlsflirellkeGllhedvetvagkGlrrytkepfledgkleyre 395
                                     +nwdd+selsd vpllarvypnG+ad+nhf+aaGG++fli+ell++Gllhedv+tvag Glrryt+ep+l dg+l++ +
  lcl|FitnessBrowser__MR1:201628 318 VNWDDFSELSDAVPLLARVYPNGHADINHFHAAGGMAFLIKELLDAGLLHEDVNTVAGYGLRRYTQEPKLLDGELRWVD 396
                                     ******************************************************************************* PP

                       TIGR01196 396 aaeksldedilrkvdkpfsaeGGlkllkGnlGravikvsavkeesrvieapaivfkdqaellaafkagelerdlvavvr 474
                                     ++  sld+++l +v +pf+++GGlkllkGnlGravikvsav++++rv+eapa+v +dq++l a fk+g l+rd+v+vv+
  lcl|FitnessBrowser__MR1:201628 397 GPTVSLDTEVLTSVATPFQNNGGLKLLKGNLGRAVIKVSAVQPQHRVVEAPAVVIDDQNKLDALFKSGALDRDCVVVVK 475
                                     ******************************************************************************* PP

                       TIGR01196 475 fqGpkanGmpelhklttvlGvlqdrgfkvalvtdGrlsGasGkvpaaihvtpealegGalakirdGdlirldavngele 553
                                      qGpkanGmpelhklt+ lG lqd+gfkval+tdGr+sGasGkvpaaih+tpea++gG +ak++dGdlir+da +gel+
  lcl|FitnessBrowser__MR1:201628 476 GQGPKANGMPELHKLTPLLGSLQDKGFKVALMTDGRMSGASGKVPAAIHLTPEAIDGGLIAKVQDGDLIRVDALTGELS 554
                                     ******************************************************************************* PP

                       TIGR01196 554 vlvddaelkareleeldlednelGlGrelfaalrekvssaeeGass 599
                                     +lv d+el++r+ +e+dl ++++G+Grelf  lr+++ss e+Ga s
  lcl|FitnessBrowser__MR1:201628 555 LLVSDTELATRTATEIDLRHSRYGMGRELFGVLRSNLSSPETGARS 600
                                     *******************************************986 PP

Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (601 nodes)
Target sequences:                          1  (608 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.03u 0.01s 00:00:00.04 Elapsed: 00:00:00.03
# Mc/sec: 11.69
//
[ok]
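The domain table above reports the aligned span on the model (hmmfrom 1, hmm to 599, with M=601) and on the protein (alifrom 2, ali to 600, of 608 residues). As an illustration only, the sketch below turns those coordinates into coverage fractions. The helper name `span_coverage` is hypothetical, and this is not a statement of how GapMind itself computes or thresholds coverage.

```python
def span_coverage(start, end, total_length):
    """Fraction of a model or sequence covered by the inclusive, 1-based span."""
    if not (1 <= start <= end <= total_length):
        raise ValueError("span must satisfy 1 <= start <= end <= total_length")
    return (end - start + 1) / total_length


# Coordinates copied from the hmmsearch domain table for domain 1.
hmm_coverage = span_coverage(1, 599, 601)      # span on the TIGR01196 model
protein_coverage = span_coverage(2, 600, 608)  # span on candidate 201628

print(f"{hmm_coverage:.3f} {protein_coverage:.3f}")   # 0.997 0.985
```

Both fractions are close to 1, which is consistent with a single domain spanning nearly the full length of both the model and the protein.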
This GapMind analysis is from Sep 17 2021. The underlying query database was built on Sep 17 2021.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMER with enzyme models (usually from TIGRFAMs). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other BLAST hits with at least 50% coverage are "low confidence."
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
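To make the term concrete: a six-frame translation translates the DNA in three reading frames on the forward strand and three on the reverse complement, so that proteins missing from the gene predictions can still be found. The sketch below is a minimal, self-contained Python illustration that assumes the standard genetic code; it is not the code that GapMind or Curated BLAST uses, and all names in it are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons only; codons with unknown bases become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translation(dna):
    """Map frame labels (+1, +2, +3, -1, -2, -3) to protein strings."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


# Toy input for illustration; stop codons are shown as '*'.
for frame, protein in six_frame_translation("ATGGCC").items():
    print(frame, protein)
```

For the toy input `ATGGCC`, frame +1 reads `ATG GCC` and yields `MA`, while frame -1 reads the reverse complement `GGC CAT` and yields `GH`.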
For more information, see:
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory