GapMind for catabolism of small carbon sources

 

Alignments for a candidate for gntK in Herbaspirillum seropedicae SmR1

Align Gluconokinase; EC 2.7.1.12; Gluconate kinase (uncharacterized)
to candidate HSERO_RS05475 HSERO_RS05475 gluconokinase

Query= curated2:P46834
         (513 letters)



>FitnessBrowser__HerbieS:HSERO_RS05475
          Length = 520

 Score =  582 bits (1499), Expect = e-170
 Identities = 279/509 (54%), Positives = 379/509 (74%), Gaps = 3/509 (0%)

Query: 1   MTSYMLGIDIGTTSTKAVLFSEKGDVIQKESIGYALYTPDISTAEQNPDEIFQAVIQSTA 60
           +  YMLG+DIGTTSTK+V+F+  G V+ + +  Y +   +   AEQ+P++I  A + S A
Sbjct: 6   LARYMLGVDIGTTSTKSVVFTLDGKVVAQHAEEYPVLCTEPGMAEQDPEQIVAAALASIA 65

Query: 61  KIMQ--QHPDKQPSFISFSSAMHSVIAMDENDKPLTSCITWADNRSEGWAHKIKEEMNGH 118
             ++  +   ++ + +SFS+AMHSVIA+D  ++ L++ ITW D R+  WA +IK E +G+
Sbjct: 66  GAVKAAKAAPQEIALLSFSAAMHSVIALDAENRLLSNSITWGDIRASVWAERIKHEHDGN 125

Query: 119 NVYKRTGTPIHPMAPLSKITWIVNEHPEIAVKAKKYIGIKEYIFKKLFDQYVVDYSLASA 178
            +Y+RTGTP+HPM+PL K+ W+ +E P++  +A +++GIKEY+  +LF Q+VVD+S+ASA
Sbjct: 126 AIYRRTGTPVHPMSPLCKLMWMRHEKPDVFHRAARFVGIKEYLLYQLFGQWVVDHSIASA 185

Query: 179 MGMMNLKTLAWDEEALAIAGITPDHLSKLVPTTAIFHHCNPELAAMMGIDPQTPFVIGAS 238
            GM NL+ LAWD+ ALA+ G+ PD L   VPTT      + E+A  +G+   TPF+IGA+
Sbjct: 186 TGMFNLRELAWDQGALALLGVRPDQLPTPVPTTHRLPALSSEMAQRLGLSVDTPFIIGAN 245

Query: 239 DGVLSNLGVNAIKKGEIAVTIGTSGAIRPIIDKPQTDEKGRIFCYALTENHWVIGGPVNN 298
           DGVLSNLGVNAI+ G +AVTIGTSGA+R +ID+P+TD +GR+FCYALTE HWV+GGPVNN
Sbjct: 246 DGVLSNLGVNAIEIGHVAVTIGTSGAMRTVIDEPRTDPQGRLFCYALTEKHWVVGGPVNN 305

Query: 299 GGIVLRWIRDEFASSEIETAKRLGIDPYDVLTKIAERVRPGADGLLFHPYLAGERAPLWN 358
           GG + RW+RDE A++E   A+  G+DPY+ LT+IAE+VRPGA+GLLFHP++AGERAPLWN
Sbjct: 306 GGNIFRWVRDELATAEAAAAREEGVDPYEALTRIAEKVRPGAEGLLFHPFMAGERAPLWN 365

Query: 359 PDVPGSFFGLTMSHKKEHMIRAALEGVIYNLYTVFLALTECMDGPVARIQATGGFARSDV 418
            D+ GSFFGL + H K HMIRAALEGVI+NLY++  AL E + GP  R+ ATGGFARS +
Sbjct: 366 ADLRGSFFGLALHHGKHHMIRAALEGVIFNLYSILPALEELV-GPTKRMMATGGFARSAL 424

Query: 419 WRQMMADIFESEVVVPESYESSCLGACILGLYATGKIDSFDVVSDMIGSTHRHAPKEESA 478
           WRQMMADIF  EVVVPES ESSCLGA +LG +A G + S  V+S M+GST+ HAP+ E+ 
Sbjct: 425 WRQMMADIFNREVVVPESVESSCLGAAVLGAWALGLVPSLSVISGMVGSTNHHAPEAEAV 484

Query: 479 KEYRKLMPLFINLSRALENEYTQIANYQR 507
             Y +L P+F  +   LE EY  IA +QR
Sbjct: 485 AVYGRLQPIFAAIPAKLEAEYHAIAAFQR 513


Lambda     K      H
   0.318    0.134    0.401 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 702
Number of extensions: 14
Number of successful extensions: 3
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 513
Length of database: 520
Length adjustment: 35
Effective length of query: 478
Effective length of database: 485
Effective search space:   231830
Effective search space used:   231830
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 52 (24.6 bits)
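
The summary figures in the BLAST report above can be re-derived from its raw counts. The following is a minimal Python sketch, assuming that the reported percentages are truncated rather than rounded (279/509 is shown as 54%) and that the effective search space is the product of the two effective lengths. The function names are illustrative and are not part of BLAST or GapMind.

```python
def percent_truncated(numerator: int, denominator: int) -> int:
    """Integer percentage, truncated toward zero (assumed BLAST report style)."""
    return (100 * numerator) // denominator


def effective_search_space(query_len: int, db_len: int, length_adjustment: int) -> int:
    """Product of the effective query length and the effective database length."""
    return (query_len - length_adjustment) * (db_len - length_adjustment)


# Counts copied from the alignment header and statistics above.
identities = percent_truncated(279, 509)
positives = percent_truncated(379, 509)
gaps = percent_truncated(3, 509)
space = effective_search_space(query_len=513, db_len=520, length_adjustment=35)

print(identities, positives, gaps, space)
```

With the values from this report, the sketch reproduces the 54% identity, 74% positives, 0% gaps, and the effective search space of 231830 (478 x 485).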

Align candidate HSERO_RS05475 HSERO_RS05475 (gluconokinase)
to HMM TIGR01314 (gntK: gluconate kinase (EC 2.7.1.12))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR01314.hmm
# target sequence database:        /tmp/gapView.27905.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR01314  [M=505]
Accession:   TIGR01314
Description: gntK_FGGY: gluconate kinase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                                  Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                                  -----------
   9.2e-223  726.2   0.0     1e-222  726.0   0.0    1.0  1  lcl|FitnessBrowser__HerbieS:HSERO_RS05475  HSERO_RS05475 gluconokinase


Domain annotation for each sequence (and alignments):
>> lcl|FitnessBrowser__HerbieS:HSERO_RS05475  HSERO_RS05475 gluconokinase
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  726.0   0.0    1e-222    1e-222       1     505 []       9     513 ..       9     513 .. 0.99

  Alignments for each domain:
  == domain 1  score: 726.0 bits;  conditional E-value: 1e-222
                                  TIGR01314   1 yligvdigttstkavlfeengkvvakesigyplytddvdvaeenleeifeavlvtikevskele.eek 67 
                                                y++gvdigttstk+v+f  +gkvva++   yp+   + ++ae+++e+i  a l +i+  +k  +   +
  lcl|FitnessBrowser__HerbieS:HSERO_RS05475   9 YMLGVDIGTTSTKSVVFTLDGKVVAQHAEEYPVLCTEPGMAEQDPEQIVAAALASIAGAVKAAKaAPQ 76 
                                                9*******************************************************987766651567 PP

                                  TIGR01314  68 eikfvsfsaqmhslialdendkpltrlitwadnraakyaekikeekngfeiykrtgtpihpmaplski 135
                                                ei ++sfsa+mhs+iald +++ l++ itw d ra  +ae+ik+e++g++iy+rtgtp+hpm+pl+k+
  lcl|FitnessBrowser__HerbieS:HSERO_RS05475  77 EIALLSFSAAMHSVIALDAENRLLSNSITWGDIRASVWAERIKHEHDGNAIYRRTGTPVHPMSPLCKL 144
                                                8******************************************************************* PP

                                  TIGR01314 136 iwlkaerkdifqkaakyleikeyifkrlfdtyvidyslasatgllnlkeldwdkealelagikeeqlp 203
                                                +w+++e++d+f++aa++++ikey++++lf+++v+d+s+asatg++nl+el wd+ al l+g++++qlp
  lcl|FitnessBrowser__HerbieS:HSERO_RS05475 145 MWMRHEKPDVFHRAARFVGIKEYLLYQLFGQWVVDHSIASATGMFNLRELAWDQGALALLGVRPDQLP 212
                                                ******************************************************************** PP

                                  TIGR01314 204 klvetteilknlkeeyakkmgidketkfvigasdgvlsnlgvnaikkgevavtigtsgairtvidkpk 271
                                                  v+tt+ l  l++e+a+++g+  +t+f+iga dgvlsnlgvnai+ g vavtigtsga+rtvid p+
  lcl|FitnessBrowser__HerbieS:HSERO_RS05475 213 TPVPTTHRLPALSSEMAQRLGLSVDTPFIIGANDGVLSNLGVNAIEIGHVAVTIGTSGAMRTVIDEPR 280
                                                ******************************************************************** PP

                                  TIGR01314 272 tdekgrifcyalteehyviggpvnnggvvlrwlrdellaseietakrlgvdpydvltkiakrvkpgad 339
                                                td +gr+fcyalte+h+v+ggpvnngg ++rw+rdel+++e  +a+  gvdpy+ lt+ia++v+pga+
  lcl|FitnessBrowser__HerbieS:HSERO_RS05475 281 TDPQGRLFCYALTEKHWVVGGPVNNGGNIFRWVRDELATAEAAAAREEGVDPYEALTRIAEKVRPGAE 348
                                                ******************************************************************** PP

                                  TIGR01314 340 gllfhpylageraplwnanargsffgltlshkkehmiraalegviynlytvalalvevvdetlkmika 407
                                                gllfhp++ageraplwna+ rgsffgl+l h k+hmiraalegvi+nly +  al e+v+  +k++ a
  lcl|FitnessBrowser__HerbieS:HSERO_RS05475 349 GLLFHPFMAGERAPLWNADLRGSFFGLALHHGKHHMIRAALEGVIFNLYSILPALEELVGP-TKRMMA 415
                                                *********************************************************9987.78889* PP

                                  TIGR01314 408 tggfaksevwrqllsdifesevvvpesyessclgaiilglkavgkiedlsavssmvgaterytpieen 475
                                                tggfa+s +wrq+++dif++evvvpes essclga +lg  a+g + +ls++s mvg+t+++ p  e+
  lcl|FitnessBrowser__HerbieS:HSERO_RS05475 416 TGGFARSALWRQMMADIFNREVVVPESVESSCLGAAVLGAWALGLVPSLSVISGMVGSTNHHAPEAEA 483
                                                ******************************************************************** PP

                                  TIGR01314 476 vkvyreivpifinlsrsleeeyeqiadfqr 505
                                                v vy  + pif  +   le+ey+ ia fqr
  lcl|FitnessBrowser__HerbieS:HSERO_RS05475 484 VAVYGRLQPIFAAIPAKLEAEYHAIAAFQR 513
                                                *****************************9 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (505 nodes)
Target sequences:                          1  (520 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.02u 0.01s 00:00:00.03 Elapsed: 00:00:00.03
# Mc/sec: 8.30
//
[ok]
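
The domain row in the hmmsearch output above shows how much of the model and of the candidate protein the alignment spans. The following Python sketch parses that one row and computes both fractions. It assumes the whitespace-separated column order shown in the domain table header, and the helper name `parse_domain_row` is hypothetical. It is not a general HMMER parser.

```python
def parse_domain_row(row: str) -> dict:
    """Parse one row of the per-domain table from hmmsearch text output.

    Assumed column order: #, flag, score, bias, c-Evalue, i-Evalue,
    hmmfrom, hmmto, hmm bounds, alifrom, alito, ali bounds,
    envfrom, envto, env bounds, acc.
    """
    f = row.split()
    return {
        "score": float(f[2]),
        "i_evalue": float(f[5]),
        "hmm_from": int(f[6]),
        "hmm_to": int(f[7]),
        "ali_from": int(f[9]),
        "ali_to": int(f[10]),
        "acc": float(f[15]),
    }


# Row copied from the domain annotation above.
row = "   1 !  726.0   0.0    1e-222    1e-222       1     505 []       9     513 ..       9     513 .. 0.99"
dom = parse_domain_row(row)

model_length = 505     # [M=505] in the Query line
protein_length = 520   # residues searched

# Coordinates are 1-based and inclusive, hence the +1.
hmm_coverage = (dom["hmm_to"] - dom["hmm_from"] + 1) / model_length
seq_coverage = (dom["ali_to"] - dom["ali_from"] + 1) / protein_length

print(f"HMM coverage: {hmm_coverage:.3f}, protein coverage: {seq_coverage:.3f}")
```

For this candidate the alignment spans the whole model (positions 1 to 505) and residues 9 to 513 of the 520-residue protein.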

This GapMind analysis is from Sep 17 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMER with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."
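
The 50% coverage condition for low-confidence hits can be checked directly from alignment coordinates. The following is a minimal Python sketch, assuming that "coverage" means the fraction of the curated protein spanned by the alignment (the text does not say which sequence is meant). It tests only this one condition, not the high- or medium-confidence rules, and the function names are hypothetical.

```python
def alignment_coverage(start: int, end: int, length: int) -> float:
    """Fraction of a sequence spanned by an alignment (1-based, inclusive)."""
    return (end - start + 1) / length


def meets_low_confidence_coverage(start: int, end: int, length: int,
                                  min_coverage: float = 0.5) -> bool:
    """True if the alignment spans at least min_coverage of the sequence."""
    return alignment_coverage(start, end, length) >= min_coverage


# Coordinates from the BLAST alignment on this page:
# the curated protein P46834 is aligned over positions 1-507 of 513.
coverage = alignment_coverage(1, 507, 513)
passes = meets_low_confidence_coverage(1, 507, 513)

print(f"coverage = {coverage:.3f}, passes 50% threshold: {passes}")
```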

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory