GapMind for catabolism of small carbon sources

 

Alignments for a candidate for edd in Dinoroseobacter shibae DFL-12

Align phosphogluconate dehydratase (EC 4.2.1.12) (characterized)
to candidate 3608367 Dshi_1769 6-phosphogluconate dehydratase (RefSeq)

Query= BRENDA::Q1PAG1
         (608 letters)



>FitnessBrowser__Dino:3608367
          Length = 601

 Score =  689 bits (1779), Expect = 0.0
 Identities = 348/594 (58%), Positives = 437/594 (73%), Gaps = 4/594 (0%)

Query: 5   VLEVTERLVARSRATREAYLALIRGAASDGPQRGKLQCANFAHGVAGCGSEDKHSLRMMN 64
           + +VT R+ ARS   R  YL  +R AA DGP+R  L C N AH  A  G  DK +L    
Sbjct: 7   ISDVTARIEARSAEARSTYLDRMRRAAEDGPRRAHLSCGNQAHAYAAMGG-DKEALVAER 65

Query: 65  AANVAIVSSYNDMLSAHQPYEHFPEQIKKALREMGSVGQFAGGTPAMCDGVTQGEAGMEL 124
           +AN+ IV++YNDMLSAHQP++ +P++IK+A R  G+  Q AGG PAMCDGVTQG+ GMEL
Sbjct: 66  SANIGIVTAYNDMLSAHQPFKDYPDKIKEAARRAGATAQVAGGVPAMCDGVTQGQVGMEL 125

Query: 125 SLPSREVIALSTAVALSHNMFDAALMLGICDKIVPGLMMGALRFGHLPTIFVPGGPMPSG 184
           SL SR+VIAL+T VALSHN FDAA  LG+CDKIVPGL++ A  FG+LP +FVP GPM SG
Sbjct: 126 SLFSRDVIALATGVALSHNTFDAAAYLGVCDKIVPGLVIAAATFGYLPGVFVPAGPMVSG 185

Query: 185 ISNKEKADVRQRYAEGKATREELLESEMKSYHSPGTCTFYGTANTNQLLMEVMGLHLPGA 244
           + N +KA VRQ++A G+  R++L+E+EM SYH PGTCTFYGTAN+NQ+LME MGLHLPGA
Sbjct: 186 LPNDQKAKVRQQFAAGEIGRDKLMEAEMASYHGPGTCTFYGTANSNQMLMEFMGLHLPGA 245

Query: 245 SFVNPYTPLRDALTHEAAQQVTRLTKQSGNFTPIGEIVDERSLVNSIVALHATGGSTNHT 304
           SFVNP TPLR+ALT  AA+++  +T+    + P+ +I+D ++ VN IV L ATGGSTN  
Sbjct: 246 SFVNPGTPLREALTAAAAERLAAITQLGNEYRPVCDILDAKAFVNGIVGLMATGGSTNLV 305

Query: 305 LHMPAIAQAAGIQLTWQDMADLSEVVPTLSHVYPNGKADINHFQAAGGMAFLIRELLEAG 364
           +H+PA+A+AAG+ L  QD AD+SE  P ++ VYPNG AD+NHF AAGG+A++I ELL  G
Sbjct: 306 IHLPAMARAAGVILDLQDFADISEATPLMARVYPNGLADVNHFHAAGGLAYMIGELLSEG 365

Query: 365 LLHEDVNTVAGRGLSRYTQEPFLDNGKLVWRDGPIESLDENILRPVARAFSPEGGLRVME 424
           LLH D  T+AG GL+ Y +EP L +G L W DGP  SL+  ILRP +  F+P GGL+ ++
Sbjct: 366 LLHPDTKTIAGDGLADYAREPKLIDGVLRWEDGPRRSLNAKILRPASDGFAPSGGLKELK 425

Query: 425 GNLGRGVMKVSAVALQHQIVEAPAVVFQDQQDLADAFKAGELEKDFVAVMRFQGPRSNGM 484
           GNLGRGVMKVSAVA +  ++EA A VF+DQ  + DAFKAGE  +D V ++RFQGP++NGM
Sbjct: 426 GNLGRGVMKVSAVAPERHVIEARARVFEDQGAVKDAFKAGEFTEDTVVIVRFQGPKANGM 485

Query: 485 PELHKMTPFLGVLQDRGFKVALVTDGRMSGASGKIPAAIHVSPEAQVGGALARVRDGDII 544
           PELH +TP L VLQDRG KVALVTDGRMSGASGK+PAAIHV+PEA  GG +A+VR GD++
Sbjct: 486 PELHALTPVLAVLQDRGLKVALVTDGRMSGASGKVPAAIHVAPEALDGGLMAKVRTGDLV 545

Query: 545 RVDGVKGTLELKVDADEFAAREPA-KGLLGNNVGSGRELFGFMRMAFSSAEQGA 597
           RVD V G LE+     E   R PA   L GN+ G GRELF   R     A  GA
Sbjct: 546 RVDAVAGVLEVLEPGVE--DRAPAMPDLSGNSHGIGRELFDVFRTTVGPASTGA 597


Lambda     K      H
   0.318    0.134    0.386 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 1081
Number of extensions: 44
Number of successful extensions: 3
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 608
Length of database: 601
Length adjustment: 37
Effective length of query: 571
Effective length of database: 564
Effective search space:   322044
Effective search space used:   322044
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 53 (25.0 bits)
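
The summary figures in the BLAST report above can be checked by hand. The following is a minimal sketch, assuming a single-sequence database (so the length adjustment is subtracted once from each length); the function names are illustrative and are not part of BLAST or GapMind. It reproduces the effective search space and recomputes the percent identity from the `Identities` line.

```python
import re


def effective_search_space(query_len, db_len, length_adjustment):
    """Return (effective query length, effective db length, search space).

    Assumes the database holds one sequence, so the adjustment is
    subtracted once from each length.
    """
    eff_query = query_len - length_adjustment
    eff_db = db_len - length_adjustment
    return eff_query, eff_db, eff_query * eff_db


def parse_identities(line):
    """Extract (identical, aligned, fraction) from an 'Identities = a/b' line."""
    match = re.search(r"Identities\s*=\s*(\d+)/(\d+)", line)
    if match is None:
        raise ValueError("no Identities field found")
    identical, aligned = int(match.group(1)), int(match.group(2))
    return identical, aligned, identical / aligned


# Values copied from the report above.
eff_q, eff_db, space = effective_search_space(608, 601, 37)
print(eff_q, eff_db, space)  # 571 564 322044

line = " Identities = 348/594 (58%), Positives = 437/594 (73%), Gaps = 4/594 (0%)"
identical, aligned, fraction = parse_identities(line)
print(f"{fraction:.1%}")  # 58.6%
```

The recomputed identity is 58.6%, while the report shows 58%; the percentages in this report are consistent with truncation rather than rounding (437/594 is about 73.6% and is shown as 73%).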

Align candidate 3608367 Dshi_1769 (6-phosphogluconate dehydratase (RefSeq))
to HMM TIGR01196 (edd: phosphogluconate dehydratase (EC 4.2.1.12))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR01196.hmm
# target sequence database:        /tmp/gapView.1135.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR01196  [M=601]
Accession:   TIGR01196
Description: edd: phosphogluconate dehydratase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                         Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                         -----------
   2.8e-287  939.8   1.5   3.1e-287  939.6   1.5    1.0  1  lcl|FitnessBrowser__Dino:3608367  Dshi_1769 6-phosphogluconate deh


Domain annotation for each sequence (and alignments):
>> lcl|FitnessBrowser__Dino:3608367  Dshi_1769 6-phosphogluconate dehydratase (RefSeq)
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  939.6   1.5  3.1e-287  3.1e-287       4     599 ..       7     599 ..       4     601 .] 0.99

  Alignments for each domain:
  == domain 1  score: 939.6 bits;  conditional E-value: 3.1e-287
                         TIGR01196   4 laeiteriierskktrekylekirsaktkgklrstlgcgnlahgvaalsesekvelksekrknlaiitayndmlsah 80 
                                       + ++t+ri +rs++ r++yl+++r a++ g+ r++l+cgn ah++aa+   +k +l  e+ +n++i+tayndmlsah
  lcl|FitnessBrowser__Dino:3608367   7 ISDVTARIEARSAEARSTYLDRMRRAAEDGPRRAHLSCGNQAHAYAAMGG-DKEALVAERSANIGIVTAYNDMLSAH 82 
                                       679*********************************************86.78999********************* PP

                         TIGR01196  81 qpfkeypdlikkalqeanavaqvagGvpamcdGvtqGedGmelsllsrdvialstaiglshnmfdgalflGvcdkiv 157
                                       qpfk+ypd ik+a+++a+a+aqvagGvpamcdGvtqG+ Gmelsl+srdvial+t+++lshn fd+a +lGvcdkiv
  lcl|FitnessBrowser__Dino:3608367  83 QPFKDYPDKIKEAARRAGATAQVAGGVPAMCDGVTQGQVGMELSLFSRDVIALATGVALSHNTFDAAAYLGVCDKIV 159
                                       ***************************************************************************** PP

                         TIGR01196 158 pGlliaalsfGhlpavfvpaGpmasGlenkekakvrqlfaeGkvdreellksemasyhapGtctfyGtansnqmlve 234
                                       pGl+iaa++fG lp+vfvpaGpm+sGl+n++kakvrq+fa G ++r++l+++emasyh+pGtctfyGtansnqml+e
  lcl|FitnessBrowser__Dino:3608367 160 PGLVIAAATFGYLPGVFVPAGPMVSGLPNDQKAKVRQQFAAGEIGRDKLMEAEMASYHGPGTCTFYGTANSNQMLME 236
                                       ***************************************************************************** PP

                         TIGR01196 235 lmGlhlpgasfvnpntplrdaltreaakrlarltakngevlplaelideksivnalvgllatGGstnhtlhlvaiar 311
                                       +mGlhlpgasfvnp tplr+alt++aa+rla++t+ ++e+ p+++++d k++vn++vgl+atGGstn  +hl a+ar
  lcl|FitnessBrowser__Dino:3608367 237 FMGLHLPGASFVNPGTPLREALTAAAAERLAAITQLGNEYRPVCDILDAKAFVNGIVGLMATGGSTNLVIHLPAMAR 313
                                       ***************************************************************************** PP

                         TIGR01196 312 aaGiilnwddlselsdlvpllarvypnGkadvnhfeaaGGlsflirellkeGllhedvetvagkGlrrytkepfled 388
                                       aaG+il+ +d+ ++s+  pl+arvypnG advnhf+aaGGl+++i ell+eGllh d++t+ag Gl  y++ep+l d
  lcl|FitnessBrowser__Dino:3608367 314 AAGVILDLQDFADISEATPLMARVYPNGLADVNHFHAAGGLAYMIGELLSEGLLHPDTKTIAGDGLADYAREPKLID 390
                                       ***************************************************************************** PP

                         TIGR01196 389 gkleyreaaeksldedilrkvdkpfsaeGGlkllkGnlGravikvsavkeesrvieapaivfkdqaellaafkagel 465
                                       g l++++++ +sl+ +ilr++++ f+++GGlk lkGnlGr+v+kvsav++e +viea+a+vf+dq  +++afkage+
  lcl|FitnessBrowser__Dino:3608367 391 GVLRWEDGPRRSLNAKILRPASDGFAPSGGLKELKGNLGRGVMKVSAVAPERHVIEARARVFEDQGAVKDAFKAGEF 467
                                       ***************************************************************************** PP

                         TIGR01196 466 erdlvavvrfqGpkanGmpelhklttvlGvlqdrgfkvalvtdGrlsGasGkvpaaihvtpealegGalakirdGdl 542
                                         d v++vrfqGpkanGmpelh lt+vl vlqdrg kvalvtdGr+sGasGkvpaaihv+peal+gG +ak+r+Gdl
  lcl|FitnessBrowser__Dino:3608367 468 TEDTVVIVRFQGPKANGMPELHALTPVLAVLQDRGLKVALVTDGRMSGASGKVPAAIHVAPEALDGGLMAKVRTGDL 544
                                       ***************************************************************************** PP

                         TIGR01196 543 irldavngelevlvddaelkareleeldlednelGlGrelfaalrekvssaeeGass 599
                                       +r+dav+g levl+  +e   r ++++dl+ n+ G+Grelf  +r++v+ a +Ga +
  lcl|FitnessBrowser__Dino:3608367 545 VRVDAVAGVLEVLEPGVE--DRAPAMPDLSGNSHGIGRELFDVFRTTVGPASTGAGV 599
                                       *************88766..67788*****************************976 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (601 nodes)
Target sequences:                          1  (601 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.03u 0.01s 00:00:00.04 Elapsed: 00:00:00.03
# Mc/sec: 11.57
//
[ok]
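
As a rough illustration of how the domain table above can be read, the sketch below parses the single domain row and computes how much of the HMM and of the candidate sequence the alignment spans. The helper names are hypothetical, and defining coverage as the aligned span divided by the total length is an assumption, not necessarily the exact definition GapMind uses.

```python
def parse_domain_row(row):
    """Parse one row of the hmmsearch per-domain table into a dict.

    Whitespace-separated fields: index, '!', score, bias, c-Evalue,
    i-Evalue, hmmfrom, hmmto, '..', alifrom, alito, '..', envfrom,
    envto, '.]', acc.
    """
    fields = row.split()
    return {
        "score": float(fields[2]),
        "hmm_from": int(fields[6]),
        "hmm_to": int(fields[7]),
        "ali_from": int(fields[9]),
        "ali_to": int(fields[10]),
        "acc": float(fields[15]),
    }


def span_coverage(start, end, total_length):
    """Fraction of total_length covered by the inclusive span start..end."""
    return (end - start + 1) / total_length


# Row copied from the domain table above; the model has M=601 nodes
# and the candidate protein is 601 residues long.
row = "   1 !  939.6   1.5  3.1e-287  3.1e-287       4     599 ..       7     599 ..       4     601 .] 0.99"
domain = parse_domain_row(row)

model_cov = span_coverage(domain["hmm_from"], domain["hmm_to"], 601)
seq_cov = span_coverage(domain["ali_from"], domain["ali_to"], 601)
print(f"model coverage {model_cov:.1%}, sequence coverage {seq_cov:.1%}")
# model coverage 99.2%, sequence coverage 98.7%
```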

This GapMind analysis is from Sep 17 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
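
The definition of a gap given above (a step with no high- or medium-confidence candidates) can be sketched in a few lines. This is only an illustration: the data structure, the function name, and the step names in the example are hypothetical and do not reflect GapMind's actual code or output format.

```python
def find_gaps(step_candidates):
    """Return the steps that have no high- or medium-confidence candidate.

    step_candidates maps a step name to the list of confidence labels
    ("high", "medium", or "low") of its candidates; an empty list means
    no candidate was found at all.
    """
    return [
        step
        for step, labels in step_candidates.items()
        if not any(label in ("high", "medium") for label in labels)
    ]


# Hypothetical pathway with three steps.
example = {
    "stepA": ["high", "low"],  # has a high-confidence candidate
    "stepB": ["low"],          # only low confidence: counts as a gap
    "stepC": [],               # no candidates: counts as a gap
}
print(find_gaps(example))  # ['stepB', 'stepC']
```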

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory