Align phosphogluconate dehydratase (characterized)
to candidate WP_106709318.1 CU102_RS02320 phosphogluconate dehydratase
Query= CharProtDB::CH_024239
         (603 letters)

>NCBI__GCF_003010955.1:WP_106709318.1
          Length = 607

 Score =  743 bits (1918), Expect = 0.0
 Identities = 371/595 (62%), Positives = 457/595 (76%), Gaps = 3/595 (0%)

Query: 8   VTNRIIERSRETRSAYLARIEQAKTSTVHRSQLACGNLAHGFAACQPEDKASLKSMLRNN 67
           +T+RI E+S+ TR  YL  + +A +    RS LAC NLAHGFAAC P DKA+L   +  N
Sbjct: 10  ITHRICEQSKPTRDVYLDHLREAASRKPKRSALACANLAHGFAACSPSDKAALAGDVVPN 69

Query: 68  IAIITSYNDMLSAHQPYEHYPEIIRKALHEANAVGQVAGGVPAMCDGVTQGQDGMELSLL 127
           + IITSYNDMLSAHQP+E YP++I+ A  EA  V QVAGGVPAMCDGVTQGQ GMELSL
Sbjct: 70  LGIITSYNDMLSAHQPFETYPQLIKAAAKEAGGVAQVAGGVPAMCDGVTQGQPGMELSLF 129

Query: 128 SREVIAMSAAVGLSHNMFDGALFLGVCDKIVPGLTMAALSFGHLPAVFVPSGPMASGLPN 187
           SR+VIAM+ A+GLSH+MFD A++LGVCDKIVPGL + AL+FGHLPAVF+P+GPM +GLPN
Sbjct: 130 SRDVIAMATAIGLSHDMFDAAVYLGVCDKIVPGLVIGALTFGHLPAVFIPAGPMTTGLPN 189

Query: 188 KEKVRIRQLYAEGKVDRMALLESEAASYHAPGTCTFYGTANTNQMVVEFMGMQLPGSSFV 247
            EK + RQLYAEGKV R ALLESE+ SYH PGTCTFYGTAN+NQM++E MG+ +PGSSF+
Sbjct: 190 DEKAKTRQLYAEGKVGREALLESESKSYHGPGTCTFYGTANSNQMLMEIMGLHMPGSSFI 249

Query: 248 HPDSPLRDALTAAAARQVTRMTGNGNEWMPIGKMIDEKVVVNGIVALLATGGSTNHTMHL 307
           +P +PLRDALT  AA++   +T  GNE+ P+G+MIDE+ +VNG+V L ATGGSTNHTMHL
Sbjct: 250 NPGTPLRDALTREAAKRALAITALGNEYTPVGEMIDERSIVNGVVGLHATGGSTNHTMHL 309

Query: 308 VAMARAAGIQINWDDFSDLSDVVPLMARLYPNGPADINHFQAAGGVPVLVRELLKAGLLH 367
           VAMA AAGI++ W D SDLSDVVPL+AR+YPNG AD+NHF AAGG+  ++RELL  GLLH
Sbjct: 310 VAMAAAAGIKLTWQDISDLSDVVPLLARVYPNGLADVNHFHAAGGMGYIIRELLDGGLLH 369

Query: 368 EDVNTV--AGFGLSRYTLEPWL-NNGELDWREGAEKSLDSNVIASFEQPFSHHGGTKVLS 424
           EDV TV   G GL  YT+EP L  NG +     A +S D  V+++ +QPF   GG K+L
Sbjct: 370 EDVKTVWGGGDGLRAYTIEPKLGENGTVVREPVAAESADKKVLSTCKQPFQVTGGLKMLK 429

Query: 425 GNLGRAVMKTSAVPVENQVIEAPAVVFESQHDVMPAFEAGLLDRDCVVVVRHQGPKANGM 484
           GNLG AV+KTSAV  +  +IEAPA+VF+SQ  +  AF+AG LDRD V VVR QGP+ANGM
Sbjct: 430 GNLGTAVIKTSAVKADRHIIEAPAIVFDSQAALQDAFKAGKLDRDFVAVVRFQGPRANGM 489

Query: 485 PELHKLMPPLGVLLDRCFKIALVTDGRLSGASGKVPSAIHVTPEAYDGGLLAKVRDGDII 544
           PELHKL P LGVL DR   +ALVTDGR+SGASGKVP+AIHVTPEA DGG++ K+ DGD++
Sbjct: 490 PELHKLTPALGVLQDRGHMVALVTDGRMSGASGKVPAAIHVTPEALDGGIIGKIHDGDVV 549

Query: 545 RVNGQTGELTLLVDEAELAAREPHIPDLSASRVGTGRELFSALREKLSGAEQGAT 599
           R++ + G L +L D   LAAR     D++ +  G GRELF+A R  +  AE GA+
Sbjct: 550 RLDAEIGTLDVLEDPDVLAARPTPEVDINHNSYGMGRELFAAFRNVVGKAENGAS 604

Lambda     K      H
   0.318    0.134    0.392

Gapped
Lambda     K      H
   0.267   0.0410    0.140

Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 1129
Number of extensions: 49
Number of successful extensions: 2
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 603
Length of database: 607
Length adjustment: 37
Effective length of query: 566
Effective length of database: 570
Effective search space:   322620
Effective search space used:   322620
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 53 (25.0 bits)
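The summary figures in the BLAST report above can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes the standard Karlin-Altschul relation between raw score and bit score, uses the gapped Lambda and K values printed in the report, and assumes the percentages are truncated rather than rounded (which is what the reported 76% for 457/595 suggests). The function names are hypothetical and are not part of BLAST or GapMind.

```python
import math


def bit_score(raw_score, lam, k):
    """Bit score from a raw score: (lambda * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def effective_search_space(query_len, db_len, length_adjustment):
    """Product of the effective query and database lengths."""
    return (query_len - length_adjustment) * (db_len - length_adjustment)


def percent(count, aligned_len):
    """Integer percentage, truncated (an assumption about how the report rounds)."""
    return 100 * count // aligned_len


# Values copied from the report: raw score 1918, gapped Lambda 0.267, gapped K 0.0410.
bits = bit_score(1918, 0.267, 0.0410)
space = effective_search_space(603, 607, 37)
# Expected number of chance hits at this score in this search space.
e_value = space * 2.0 ** (-bits)

print(round(bits), space)
print(percent(371, 595), percent(457, 595), percent(3, 595))
print(e_value)
```

Under these assumptions the computed bit score rounds to the reported 743, the search space equals the reported 322620, and the E-value is so small that it is consistent with the printed "Expect = 0.0".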
Align candidate WP_106709318.1 CU102_RS02320 (phosphogluconate dehydratase)
to HMM TIGR01196 (edd: phosphogluconate dehydratase (EC 4.2.1.12))
# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR01196.hmm
# target sequence database:        /tmp/gapView.1525750.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR01196  [M=601]
Accession:   TIGR01196
Description: edd: phosphogluconate dehydratase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                             Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                             -----------
          0 1038.1   0.9          0 1037.9   0.9    1.0  1  NCBI__GCF_003010955.1:WP_106709318.1

Domain annotation for each sequence (and alignments):
>> NCBI__GCF_003010955.1:WP_106709318.1
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 ! 1037.9   0.9         0         0       3     600 ..       6     606 ..       4     607 .]  0.99

  Alignments for each domain:
  == domain 1  score: 1037.9 bits;  conditional E-value: 0
                             TIGR01196   3 rlaeiteriierskktrekylekirsaktkgklrstlgcgnlahgvaalsesekvelksekrknlaiitaynd 75
                                           r+++it+ri e+sk+tr+ yl+++r+a+++ ++rs+l+c+nlahg+aa+s+s+k++l+ + ++nl+iit+ynd
  NCBI__GCF_003010955.1:WP_106709318.1   6 RVKSITHRICEQSKPTRDVYLDHLREAASRKPKRSALACANLAHGFAACSPSDKAALAGDVVPNLGIITSYND 78
                                           7999********************************************************************* PP

                             TIGR01196  76 mlsahqpfkeypdlikkalqeanavaqvagGvpamcdGvtqGedGmelsllsrdvialstaiglshnmfdgal 148
                                           mlsahqpf++yp+lik a++ea++vaqvagGvpamcdGvtqG++Gmelsl+srdvia++taiglsh+mfd+a+
  NCBI__GCF_003010955.1:WP_106709318.1  79 MLSAHQPFETYPQLIKAAAKEAGGVAQVAGGVPAMCDGVTQGQPGMELSLFSRDVIAMATAIGLSHDMFDAAV 151
                                           ************************************************************************* PP

                             TIGR01196 149 flGvcdkivpGlliaalsfGhlpavfvpaGpmasGlenkekakvrqlfaeGkvdreellksemasyhapGtct 221
                                           +lGvcdkivpGl+i+al+fGhlpavf+paGpm++Gl+n+ekak+rql+aeGkv+re+ll+se++syh+pGtct
  NCBI__GCF_003010955.1:WP_106709318.1 152 YLGVCDKIVPGLVIGALTFGHLPAVFIPAGPMTTGLPNDEKAKTRQLYAEGKVGREALLESESKSYHGPGTCT 224
                                           ************************************************************************* PP

                             TIGR01196 222 fyGtansnqmlvelmGlhlpgasfvnpntplrdaltreaakrlarltakngevlplaelideksivnalvgll 294
                                           fyGtansnqml+e+mGlh+pg+sf+np tplrdaltreaakr+ ++ta ++e++p++e+ide+sivn++vgl+
  NCBI__GCF_003010955.1:WP_106709318.1 225 FYGTANSNQMLMEIMGLHMPGSSFINPGTPLRDALTREAAKRALAITALGNEYTPVGEMIDERSIVNGVVGLH 297
                                           ************************************************************************* PP

                             TIGR01196 295 atGGstnhtlhlvaiaraaGiilnwddlselsdlvpllarvypnGkadvnhfeaaGGlsflirellkeGllhe 367
                                           atGGstnht+hlva+a aaGi l+w+d+s+lsd+vpllarvypnG advnhf+aaGG++++irell+ Gllhe
  NCBI__GCF_003010955.1:WP_106709318.1 298 ATGGSTNHTMHLVAMAAAAGIKLTWQDISDLSDVVPLLARVYPNGLADVNHFHAAGGMGYIIRELLDGGLLHE 370
                                           ************************************************************************* PP

                             TIGR01196 368 dvetvag..kGlrrytkepfled.gkleyreaaeksldedilrkvdkpfsaeGGlkllkGnlGravikvsavk 437
                                           dv+tv g   Glr+yt+ep+l + g +++++ a +s+d+++l + ++pf+ +GGlk+lkGnlG avik+savk
  NCBI__GCF_003010955.1:WP_106709318.1 371 DVKTVWGggDGLRAYTIEPKLGEnGTVVREPVAAESADKKVLSTCKQPFQVTGGLKMLKGNLGTAVIKTSAVK 443
                                           *****987799**********764899999999**************************************** PP

                             TIGR01196 438 eesrvieapaivfkdqaellaafkagelerdlvavvrfqGpkanGmpelhklttvlGvlqdrgfkvalvtdGr 510
                                            + ++ieapaivf++qa l++afkag+l+rd+vavvrfqGp+anGmpelhklt++lGvlqdrg+ valvtdGr
  NCBI__GCF_003010955.1:WP_106709318.1 444 ADRHIIEAPAIVFDSQAALQDAFKAGKLDRDFVAVVRFQGPRANGMPELHKLTPALGVLQDRGHMVALVTDGR 516
                                           ************************************************************************* PP

                             TIGR01196 511 lsGasGkvpaaihvtpealegGalakirdGdlirldavngelevlvddaelkareleeldlednelGlGrelf 583
                                           +sGasGkvpaaihvtpeal+gG + ki+dGd++rlda  g l+vl+d   l+ar + e+d+++n++G+Grelf
  NCBI__GCF_003010955.1:WP_106709318.1 517 MSGASGKVPAAIHVTPEALDGGIIGKIHDGDVVRLDAEIGTLDVLEDPDVLAARPTPEVDINHNSYGMGRELF 589
                                           ************************************************************************* PP

                             TIGR01196 584 aalrekvssaeeGassl 600
                                           aa+r+ v+ ae+Gas++
  NCBI__GCF_003010955.1:WP_106709318.1 590 AAFRNVVGKAENGASVF 606
                                           **************998 PP

Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (601 nodes)
Target sequences:                          1  (607 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.01u 0.00s 00:00:00.01 Elapsed: 00:00:00.01
# Mc/sec: 24.39
//
[ok]
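As a rough illustration of how much of the model and of the candidate the single domain hit spans, the coordinates in the domain table above (hmm from 3 to 600 of M=601, ali from 6 to 606 of 607 residues) can be turned into coverage fractions. This is a minimal sketch; the helper name is hypothetical, and whether GapMind computes coverage from exactly these columns is an assumption.

```python
def span_coverage(start, end, total_length):
    """Fraction of a sequence or model covered by an inclusive 1-based span."""
    return (end - start + 1) / total_length


# Coordinates copied from the hmmsearch domain table.
model_coverage = span_coverage(3, 600, 601)     # hmmfrom, hmm to, model length M
sequence_coverage = span_coverage(6, 606, 607)  # alifrom, ali to, protein length

print(f"model coverage:    {model_coverage:.1%}")
print(f"sequence coverage: {sequence_coverage:.1%}")
```

Both fractions come out close to 1, which agrees with the near full-length alignment shown in the domain annotation.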
This GapMind analysis is from Sep 24 2021. The underlying query database was built on Sep 17 2021.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other blast hits with at least 50% coverage are "low confidence."
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
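For readers unfamiliar with the term, a six-frame translation translates a DNA sequence in all three reading frames on both strands, so that protein matches can be found even where no gene was predicted. The sketch below only illustrates the idea; it is not how GapMind or Curated BLAST implement it. It assumes the standard codon assignments and an input made of A, C, G and T, and all names in it are hypothetical.

```python
from itertools import product

# Standard codon assignments, in the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons; unknown codons become 'X', stops become '*'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translation(dna):
    """Return translations for frames +1, +2, +3, -1, -2 and -3."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


for frame, protein in six_frame_translation("ATGGCC").items():
    print(frame, protein)
```

Searching these six translations instead of the annotated proteins is what lets a frame-shifted or unannotated gene still produce a hit.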
For more information, see:
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory