GapMind for catabolism of small carbon sources

 

Alignments for a candidate for edd in Stenotrophomonas chelatiphaga DSM 21508

Align phosphogluconate dehydratase (EC 4.2.1.12) (characterized)
to candidate WP_057508152.1 ABB28_RS08120 phosphogluconate dehydratase

Query= BRENDA::Q1PAG1
         (608 letters)



>NCBI__GCF_001431535.1:WP_057508152.1
          Length = 638

 Score =  703 bits (1814), Expect = 0.0
 Identities = 351/597 (58%), Positives = 443/597 (74%)

Query: 1   MHPRVLEVTERLVARSRATREAYLALIRGAASDGPQRGKLQCANFAHGVAGCGSEDKHSL 60
           +HP++  +TER+V RS A+R AYLA I  A  DGP R +L C N AHG A CG  DK  L
Sbjct: 3   LHPQLHAITERIVRRSAASRAAYLAGIDAALRDGPFRSRLSCGNLAHGFAACGGTDKKRL 62

Query: 61  RMMNAANVAIVSSYNDMLSAHQPYEHFPEQIKKALREMGSVGQFAGGTPAMCDGVTQGEA 120
           R     N+ I++SYNDMLSAHQP+E +PEQI++  RE G+  Q AGG PAMCDGVTQG  
Sbjct: 63  RGGVTPNLGIITSYNDMLSAHQPFETYPEQIREIAREWGATAQVAGGVPAMCDGVTQGRG 122

Query: 121 GMELSLPSREVIALSTAVALSHNMFDAALMLGICDKIVPGLMMGALRFGHLPTIFVPGGP 180
           GMELSL SR+VIA STA+ LSH+MFDAA+ LG+CDKIVPGL++GAL FGHLP++FVP GP
Sbjct: 123 GMELSLFSRDVIAQSTAIGLSHDMFDAAIYLGVCDKIVPGLLIGALAFGHLPSVFVPAGP 182

Query: 181 MPSGISNKEKADVRQRYAEGKATREELLESEMKSYHSPGTCTFYGTANTNQLLMEVMGLH 240
           M  GI NK+KA+VR+RYA G+ATREELLE+E  SYH  GTCTFYGTAN+NQ+L+E MG+ 
Sbjct: 183 MTPGIPNKQKAEVRERYAAGEATREELLEAEAASYHGAGTCTFYGTANSNQVLLEAMGVQ 242

Query: 241 LPGASFVNPYTPLRDALTHEAAQQVTRLTKQSGNFTPIGEIVDERSLVNSIVALHATGGS 300
           LPGASFVNP   LR ALT EA  +   +T    ++ PIG I+DER++VN+IVAL ATGGS
Sbjct: 243 LPGASFVNPEQTLRGALTREATVRALEMTALGDDYRPIGRIIDERAIVNAIVALMATGGS 302

Query: 301 TNHTLHMPAIAQAAGIQLTWQDMADLSEVVPTLSHVYPNGKADINHFQAAGGMAFLIREL 360
           TNHT+H  A+A+AAGI +TW DM  LS+++P L+ VYPNG+AD+N F AAGG AF+  EL
Sbjct: 303 TNHTIHWIAVARAAGIVVTWDDMDQLSQLIPLLTRVYPNGEADVNRFAAAGGPAFVFGEL 362

Query: 361 LEAGLLHEDVNTVAGRGLSRYTQEPFLDNGKLVWRDGPIESLDENILRPVARAFSPEGGL 420
           + AGL+H D+ TVA  G++ Y +EP L +G++ W DG I S DE++ R V   F  +GGL
Sbjct: 363 IRAGLMHGDIVTVARGGMADYAREPRLQDGQVTWVDGIIRSADEDVARGVDNPFESQGGL 422

Query: 421 RVMEGNLGRGVMKVSAVALQHQIVEAPAVVFQDQQDLADAFKAGELEKDFVAVMRFQGPR 480
           R++ GNLGR ++K+SAV  Q++ +EAPAVV    Q L     AG L +DFVAV+R+QGPR
Sbjct: 423 RLLRGNLGRSLIKLSAVKPQYRSIEAPAVVVDAPQVLNKLHAAGLLPQDFVAVVRYQGPR 482

Query: 481 SNGMPELHKMTPFLGVLQDRGFKVALVTDGRMSGASGKIPAAIHVSPEAQVGGALARVRD 540
           +NGMPELH + P LG+LQ++G +VALVTDGR+SGASGKIPAAIH++PEA  GG L ++R+
Sbjct: 483 ANGMPELHSLAPLLGLLQNQGRRVALVTDGRLSGASGKIPAAIHMTPEAARGGPLGKLRE 542

Query: 541 GDIIRVDGVKGTLELKVDADEFAAREPAKGLLGNNVGSGRELFGFMRMAFSSAEQGA 597
           GDIIR+DG  GTLE+ VD  E+AAR  A          GR LF   R+    A+QGA
Sbjct: 543 GDIIRLDGEAGTLEVLVDEAEWAARTHAPNTAPAANDLGRNLFAVNRLVVGPADQGA 599


Lambda     K      H
   0.318    0.134    0.386 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 1180
Number of extensions: 49
Number of successful extensions: 2
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 608
Length of database: 638
Length adjustment: 37
Effective length of query: 571
Effective length of database: 601
Effective search space:   343171
Effective search space used:   343171
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 54 (25.4 bits)
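
The summary numbers in the BLAST report above can be cross-checked with a little arithmetic. The following is a minimal sketch, not part of GapMind: the variable names are illustrative, the values are copied from the report, and the E-value estimate assumes the standard bit-score relation E = m * n * 2^(-S), where m and n are the effective query and database lengths.

```python
import math

# Values copied from the BLAST report above.
query_len, subject_len = 608, 638
length_adjustment = 37
identities, positives, aligned_columns = 351, 443, 597
query_start, query_end = 1, 597
bit_score = 703.0

def percent(numerator, denominator):
    """Return numerator/denominator as a percentage."""
    return 100.0 * numerator / denominator

identity_pct = percent(identities, aligned_columns)   # report shows 58%
positive_pct = percent(positives, aligned_columns)    # report shows 74%
query_coverage_pct = percent(query_end - query_start + 1, query_len)

# Effective search space = effective query length * effective database length.
effective_space = (query_len - length_adjustment) * (subject_len - length_adjustment)

# Assumed relation E = m * n * 2**(-bit_score); computed in log10
# because the value is far too small to be worth printing directly.
log10_evalue = math.log10(effective_space) - bit_score * math.log10(2)

print(f"identity {identity_pct:.1f}%, positives {positive_pct:.1f}%")
print(f"query coverage {query_coverage_pct:.1f}%")
print(f"effective search space {effective_space}")
print(f"log10(E) is about {log10_evalue:.0f}")
```

The product 571 x 601 reproduces the reported effective search space of 343171, and the estimated E-value is vanishingly small, consistent with the reported Expect = 0.0.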

Align candidate WP_057508152.1 ABB28_RS08120 (phosphogluconate dehydratase)
to HMM TIGR01196 (edd: phosphogluconate dehydratase (EC 4.2.1.12))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR01196.hmm
# target sequence database:        /tmp/gapView.28821.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR01196  [M=601]
Accession:   TIGR01196
Description: edd: phosphogluconate dehydratase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                                 Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                                 -----------
   7.2e-281  918.6   0.0   8.2e-281  918.4   0.0    1.0  1  lcl|NCBI__GCF_001431535.1:WP_057508152.1  ABB28_RS08120 phosphogluconate d


Domain annotation for each sequence (and alignments):
>> lcl|NCBI__GCF_001431535.1:WP_057508152.1  ABB28_RS08120 phosphogluconate dehydratase
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  918.4   0.0  8.2e-281  8.2e-281       1     600 [.       4     602 ..       4     603 .. 0.99

  Alignments for each domain:
  == domain 1  score: 918.4 bits;  conditional E-value: 8.2e-281
                                 TIGR01196   1 hsrlaeiteriierskktrekylekirsaktkgklrstlgcgnlahgvaalsesekvelksekrknlai 69 
                                               h++l +iteri+ rs++ r +yl+ i +a + g++rs+l+cgnlahg+aa+  ++k  l+    +nl+i
  lcl|NCBI__GCF_001431535.1:WP_057508152.1   4 HPQLHAITERIVRRSAASRAAYLAGIDAALRDGPFRSRLSCGNLAHGFAACGGTDKKRLRGGVTPNLGI 72 
                                               799****************************************************************** PP

                                 TIGR01196  70 itayndmlsahqpfkeypdlikkalqeanavaqvagGvpamcdGvtqGedGmelsllsrdvialstaig 138
                                               it+yndmlsahqpf++yp++i++ ++e +a+aqvagGvpamcdGvtqG+ Gmelsl+srdvia staig
  lcl|NCBI__GCF_001431535.1:WP_057508152.1  73 ITSYNDMLSAHQPFETYPEQIREIAREWGATAQVAGGVPAMCDGVTQGRGGMELSLFSRDVIAQSTAIG 141
                                               ********************************************************************* PP

                                 TIGR01196 139 lshnmfdgalflGvcdkivpGlliaalsfGhlpavfvpaGpmasGlenkekakvrqlfaeGkvdreell 207
                                               lsh+mfd+a++lGvcdkivpGlli+al+fGhlp+vfvpaGpm+ G++nk+ka+vr+ +a G ++reell
  lcl|NCBI__GCF_001431535.1:WP_057508152.1 142 LSHDMFDAAIYLGVCDKIVPGLLIGALAFGHLPSVFVPAGPMTPGIPNKQKAEVRERYAAGEATREELL 210
                                               ********************************************************************* PP

                                 TIGR01196 208 ksemasyhapGtctfyGtansnqmlvelmGlhlpgasfvnpntplrdaltreaakrlarltakngevlp 276
                                               ++e+asyh++GtctfyGtansnq+l+e mG++lpgasfvnp++ lr altrea+ r+ ++ta ++++ p
  lcl|NCBI__GCF_001431535.1:WP_057508152.1 211 EAEAASYHGAGTCTFYGTANSNQVLLEAMGVQLPGASFVNPEQTLRGALTREATVRALEMTALGDDYRP 279
                                               ********************************************************************* PP

                                 TIGR01196 277 laelideksivnalvgllatGGstnhtlhlvaiaraaGiilnwddlselsdlvpllarvypnGkadvnh 345
                                               ++++ide++ivna+v+l+atGGstnht+h +a+araaGi+++wdd+++ls+l+pll+rvypnG+advn 
  lcl|NCBI__GCF_001431535.1:WP_057508152.1 280 IGRIIDERAIVNAIVALMATGGSTNHTIHWIAVARAAGIVVTWDDMDQLSQLIPLLTRVYPNGEADVNR 348
                                               ********************************************************************* PP

                                 TIGR01196 346 feaaGGlsflirellkeGllhedvetvagkGlrrytkepfledgkleyreaaeksldedilrkvdkpfs 414
                                               f aaGG +f+  el+++Gl+h d+ tva  G+  y++ep l+dg++++ +++ +s+ded+ r vd+pf+
  lcl|NCBI__GCF_001431535.1:WP_057508152.1 349 FAAAGGPAFVFGELIRAGLMHGDIVTVARGGMADYAREPRLQDGQVTWVDGIIRSADEDVARGVDNPFE 417
                                               ********************************************************************* PP

                                 TIGR01196 415 aeGGlkllkGnlGravikvsavkeesrvieapaivfkdqaellaafkagelerdlvavvrfqGpkanGm 483
                                               ++GGl+ll+GnlGr++ik+savk++ r ieapa+v +  + l++   ag l +d+vavvr+qGp+anGm
  lcl|NCBI__GCF_001431535.1:WP_057508152.1 418 SQGGLRLLRGNLGRSLIKLSAVKPQYRSIEAPAVVVDAPQVLNKLHAAGLLPQDFVAVVRYQGPRANGM 486
                                               ********************************************************************* PP

                                 TIGR01196 484 pelhklttvlGvlqdrgfkvalvtdGrlsGasGkvpaaihvtpealegGalakirdGdlirldavngel 552
                                               pelh l + lG+lq++g +valvtdGrlsGasGk+paaih+tpea+ gG+l k+r+Gd+irld+ +g l
  lcl|NCBI__GCF_001431535.1:WP_057508152.1 487 PELHSLAPLLGLLQNQGRRVALVTDGRLSGASGKIPAAIHMTPEAARGGPLGKLREGDIIRLDGEAGTL 555
                                               ********************************************************************* PP

                                 TIGR01196 553 evlvddaelkareleeldlednelGlGrelfaalrekvssaeeGassl 600
                                               evlvd+ae++ar+ + ++   +   lGr+lfa  r  v+ a++Ga+s+
  lcl|NCBI__GCF_001431535.1:WP_057508152.1 556 EVLVDEAEWAART-HAPNTAPAANDLGRNLFAVNRLVVGPADQGAISI 602
                                               ************9.88898889999********************997 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (601 nodes)
Target sequences:                          1  (638 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.03u 0.02s 00:00:00.05 Elapsed: 00:00:00.04
# Mc/sec: 8.37
//
[ok]
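
The per-domain row of the hmmsearch report above can be read programmatically to see how much of the model and of the protein the alignment covers. This is a minimal sketch that splits the row on whitespace. The field names follow the column headers in the report, while the function name and dictionary keys are illustrative and not part of HMMER.

```python
# Domain row, model length (M=601), and protein length (638 residues)
# copied from the hmmsearch report above.
DOMAIN_ROW = ("   1 !  918.4   0.0  8.2e-281  8.2e-281       1     600 [."
              "       4     602 ..       4     603 .. 0.99")
MODEL_LENGTH = 601
SEQUENCE_LENGTH = 638

def parse_domain_row(row):
    """Split one whitespace-delimited domain row into named fields."""
    f = row.split()
    return {
        "score": float(f[2]),
        "bias": float(f[3]),
        "c_evalue": float(f[4]),
        "i_evalue": float(f[5]),
        "hmm_from": int(f[6]),
        "hmm_to": int(f[7]),
        "ali_from": int(f[9]),
        "ali_to": int(f[10]),
        "env_from": int(f[12]),
        "env_to": int(f[13]),
        "acc": float(f[15]),
    }

dom = parse_domain_row(DOMAIN_ROW)

# Fraction of the HMM and of the protein spanned by the alignment.
hmm_coverage = (dom["hmm_to"] - dom["hmm_from"] + 1) / MODEL_LENGTH
seq_coverage = (dom["ali_to"] - dom["ali_from"] + 1) / SEQUENCE_LENGTH

print(f"score {dom['score']} bits")
print(f"HMM coverage {hmm_coverage:.3f}, sequence coverage {seq_coverage:.3f}")
```

Here the alignment spans model positions 1 to 600 of 601 and protein residues 4 to 602 of 638, so nearly the whole model is matched.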

This GapMind analysis is from Sep 24 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMER with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
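
The definition of a gap given above (a step with no high- or medium-confidence candidate) can be sketched in a few lines. This is an illustration only, not GapMind's implementation: the function name, the step names, and the confidence labels in the example data are all hypothetical.

```python
def find_gaps(step_candidates):
    """Return the steps that have no high- or medium-confidence candidate.

    step_candidates maps a step name to a list of confidence labels
    ("high", "medium", or "low"), one label per candidate protein.
    """
    gaps = []
    for step, confidences in step_candidates.items():
        if not any(label in ("high", "medium") for label in confidences):
            gaps.append(step)
    return gaps

# Hypothetical pathway with four steps.
example_pathway = {
    "step_a": ["high"],
    "step_b": ["low", "low"],      # only low-confidence candidates: a gap
    "step_c": [],                  # no candidates at all: a gap
    "step_d": ["medium", "low"],
}

gaps = find_gaps(example_pathway)
print(gaps)
```

Under this sketch, a step with only low-confidence candidates is treated the same as a step with none.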

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory