GapMind for catabolism of small carbon sources

 

Alignments for a candidate for glpK in Escherichia coli BW25113

Align glycerol kinase; EC 2.7.1.30 (characterized)
to candidate 17965 b3926 glycerol kinase (NCBI)

Query= CharProtDB::CH_121461
         (502 letters)



>lcl|FitnessBrowser__Keio:17965 b3926 glycerol kinase (NCBI)
          Length = 502

 Score = 1017 bits (2630), Expect = 0.0
 Identities = 502/502 (100%), Positives = 502/502 (100%)

Query: 1   MTEKKYIVALDQGTTSSRAVVMDHDANIISVSQREFEQIYPKPGWVEHDPMEIWATQSST 60
           MTEKKYIVALDQGTTSSRAVVMDHDANIISVSQREFEQIYPKPGWVEHDPMEIWATQSST
Sbjct: 1   MTEKKYIVALDQGTTSSRAVVMDHDANIISVSQREFEQIYPKPGWVEHDPMEIWATQSST 60

Query: 61  LVEVLAKADISSDQIAAIGITNQRETTIVWEKETGKPIYNAIVWQCRRTAEICEHLKRDG 120
           LVEVLAKADISSDQIAAIGITNQRETTIVWEKETGKPIYNAIVWQCRRTAEICEHLKRDG
Sbjct: 61  LVEVLAKADISSDQIAAIGITNQRETTIVWEKETGKPIYNAIVWQCRRTAEICEHLKRDG 120

Query: 121 LEDYIRSNTGLVIDPYFSGTKVKWILDHVEGSRERARRGELLFGTVDTWLIWKMTQGRVH 180
           LEDYIRSNTGLVIDPYFSGTKVKWILDHVEGSRERARRGELLFGTVDTWLIWKMTQGRVH
Sbjct: 121 LEDYIRSNTGLVIDPYFSGTKVKWILDHVEGSRERARRGELLFGTVDTWLIWKMTQGRVH 180

Query: 181 VTDYTNASRTMLFNIHTLDWDDKMLEVLDIPREMLPEVRRSSEVYGQTNIGGKGGTRIPI 240
           VTDYTNASRTMLFNIHTLDWDDKMLEVLDIPREMLPEVRRSSEVYGQTNIGGKGGTRIPI
Sbjct: 181 VTDYTNASRTMLFNIHTLDWDDKMLEVLDIPREMLPEVRRSSEVYGQTNIGGKGGTRIPI 240

Query: 241 SGIAGDQQAALFGQLCVKEGMAKNTYGTGCFMLMNTGEKAVKSENGLLTTIACGPTGEVN 300
           SGIAGDQQAALFGQLCVKEGMAKNTYGTGCFMLMNTGEKAVKSENGLLTTIACGPTGEVN
Sbjct: 241 SGIAGDQQAALFGQLCVKEGMAKNTYGTGCFMLMNTGEKAVKSENGLLTTIACGPTGEVN 300

Query: 301 YALEGAVFMAGASIQWLRDEMKLINDAYDSEYFATKVQNTNGVYVVPAFTGLGAPYWDPY 360
           YALEGAVFMAGASIQWLRDEMKLINDAYDSEYFATKVQNTNGVYVVPAFTGLGAPYWDPY
Sbjct: 301 YALEGAVFMAGASIQWLRDEMKLINDAYDSEYFATKVQNTNGVYVVPAFTGLGAPYWDPY 360

Query: 361 ARGAIFGLTRGVNANHIIRATLESIAYQTRDVLEAMQADSGIRLHALRVDGGAVANNFLM 420
           ARGAIFGLTRGVNANHIIRATLESIAYQTRDVLEAMQADSGIRLHALRVDGGAVANNFLM
Sbjct: 361 ARGAIFGLTRGVNANHIIRATLESIAYQTRDVLEAMQADSGIRLHALRVDGGAVANNFLM 420

Query: 421 QFQSDILGTRVERPEVREVTALGAAYLAGLAVGFWQNLDELQEKAVIEREFRPGIETTER 480
           QFQSDILGTRVERPEVREVTALGAAYLAGLAVGFWQNLDELQEKAVIEREFRPGIETTER
Sbjct: 421 QFQSDILGTRVERPEVREVTALGAAYLAGLAVGFWQNLDELQEKAVIEREFRPGIETTER 480

Query: 481 NYRYAGWKKAVKRAMAWEEHDE 502
           NYRYAGWKKAVKRAMAWEEHDE
Sbjct: 481 NYRYAGWKKAVKRAMAWEEHDE 502


Lambda     K      H
   0.318    0.134    0.404 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 1017
Number of extensions: 23
Number of successful extensions: 1
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 502
Length of database: 502
Length adjustment: 34
Effective length of query: 468
Effective length of database: 468
Effective search space:   219024
Effective search space used:   219024
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 52 (24.6 bits)
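
The statistics in the BLAST report above can be cross-checked with a little arithmetic. The following is a minimal sketch, not part of GapMind: it recomputes the effective lengths and search space from the reported length adjustment, and converts the raw score (2630) to bits using the standard relation bits = (lambda * S - ln K) / ln 2 with the gapped parameters. The variable names are illustrative, and the printed lambda and K are rounded, so the bit score only approximately matches the reported 1017.

```python
import math

# Values copied from the BLAST report above.
query_len = 502
db_len = 502
num_db_sequences = 1
length_adjustment = 34
gapped_lambda = 0.267   # rounded as printed in the report
gapped_k = 0.0410       # rounded as printed in the report
raw_score = 2630

# Effective lengths subtract the length adjustment (once per database sequence).
effective_query = query_len - length_adjustment
effective_db = db_len - num_db_sequences * length_adjustment
search_space = effective_query * effective_db

# Convert the raw score to a bit score using the gapped Karlin-Altschul parameters.
bit_score = (gapped_lambda * raw_score - math.log(gapped_k)) / math.log(2)

print(effective_query, effective_db, search_space)  # 468 468 219024
print(f"approximate bit score: {bit_score:.1f}")
```

The search space matches the reported 219024 exactly; the bit score lands within about one bit of the reported value because of the rounded inputs.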

Align candidate 17965 b3926 (glycerol kinase (NCBI))
to HMM TIGR01311 (glpK: glycerol kinase (EC 2.7.1.30))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR01311.hmm
# target sequence database:        /tmp/gapView.10638.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR01311  [M=496]
Accession:   TIGR01311
Description: glycerol_kin: glycerol kinase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                       Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                       -----------
   7.3e-235  765.8   1.2   8.3e-235  765.6   1.2    1.0  1  lcl|FitnessBrowser__Keio:17965  b3926 glycerol kinase (NCBI)


Domain annotation for each sequence (and alignments):
>> lcl|FitnessBrowser__Keio:17965  b3926 glycerol kinase (NCBI)
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  765.6   1.2  8.3e-235  8.3e-235       1     496 []       5     497 ..       5     497 .. 0.99

  Alignments for each domain:
  == domain 1  score: 765.6 bits;  conditional E-value: 8.3e-235
                       TIGR01311   1 kliaaiDqGttssraivfdkegeevakaqkelsqifpkegwvEhdpleilesvvkvlaealekleikaeeiaaiGitnq 79 
                                     k+i+a+DqGttssra+v+d+++++++ +q+e++qi+pk+gwvEhdp+ei++++ ++l e+l+k++i++++iaaiGitnq
  lcl|FitnessBrowser__Keio:17965   5 KYIVALDQGTTSSRAVVMDHDANIISVSQREFEQIYPKPGWVEHDPMEIWATQSSTLVEVLAKADISSDQIAAIGITNQ 83 
                                     69***************************************************************************** PP

                       TIGR01311  80 REttvvWdketgkplvnaivWqdtrtakiveelkeetkeeelrektGLplstYfsatKlrWlldnveevrkaaeegell 158
                                     REtt+vW+ketgkp++naivWq++rta+i+e+lk+++ e+++r++tGL++++Yfs+tK++W+ld+ve+ r++a++gell
  lcl|FitnessBrowser__Keio:17965  84 RETTIVWEKETGKPIYNAIVWQCRRTAEICEHLKRDGLEDYIRSNTGLVIDPYFSGTKVKWILDHVEGSRERARRGELL 162
                                     ******************************************************************************* PP

                       TIGR01311 159 fGtvdtwliykLtggkvhvtdvtNASRtlllnletlkwdeellelfkipkellPeirsssevygeieekellkeevpit 237
                                     fGtvdtwli+k+t+g+vhvtd+tNASRt+l+n++tl+wd+++le+++ip+e+lPe+r ssevyg+++     ++++pi+
  lcl|FitnessBrowser__Keio:17965 163 FGTVDTWLIWKMTQGRVHVTDYTNASRTMLFNIHTLDWDDKMLEVLDIPREMLPEVRRSSEVYGQTNIGGKGGTRIPIS 241
                                     *********************************************************************9********* PP

                       TIGR01311 238 gvlGdqqaalvgqlclkkgeaKntYgtGcFlllntGekkviskhglLttvayklggkkptkyalEGsvavaGaavqwlr 316
                                     g++Gdqqaal+gqlc+k+g+aKntYgtGcF+l+ntGek+v s++glLtt+a+   g+   +yalEG+v++aGa +qwlr
  lcl|FitnessBrowser__Keio:17965 242 GIAGDQQAALFGQLCVKEGMAKNTYGTGCFMLMNTGEKAVKSENGLLTTIACGPTGEV--NYALEGAVFMAGASIQWLR 318
                                     ****************************************************999877..6****************** PP

                       TIGR01311 317 dnlklikkaeeveklaksvedsegvyfVPafsGLfaPyWdsdArgtivGltrkttkehiaraaleavafqardileame 395
                                     d++kli++a ++e +a++v++++gvy+VPaf+GL+aPyWd+ Arg+i+Gltr+++++hi+ra+le++a+q+rd+leam+
  lcl|FitnessBrowser__Keio:17965 319 DEMKLINDAYDSEYFATKVQNTNGVYVVPAFTGLGAPYWDPYARGAIFGLTRGVNANHIIRATLESIAYQTRDVLEAMQ 397
                                     ******************************************************************************* PP

                       TIGR01311 396 kdagvevkvLkvDGglsknnllmqiqadilgvkverpkvaettalGaAlaaglavgvwkseeeleksaeaeektfepem 474
                                     +d+g+++++L+vDGg+++nn+lmq+q+dilg++verp+v e+talGaA++aglavg+w++++el+++a  e ++f+p +
  lcl|FitnessBrowser__Keio:17965 398 ADSGIRLHALRVDGGAVANNFLMQFQSDILGTRVERPEVREVTALGAAYLAGLAVGFWQNLDELQEKAVIE-REFRPGI 475
                                     *******************************************************************9997.******* PP

                       TIGR01311 475 deeerekkykkwkeaverslkw 496
                                     +++er+ +y+ wk+av+r++ w
  lcl|FitnessBrowser__Keio:17965 476 ETTERNYRYAGWKKAVKRAMAW 497
                                     *******************998 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (496 nodes)
Target sequences:                          1  (502 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.02u 0.01s 00:00:00.03 Elapsed: 00:00:00.02
# Mc/sec: 10.50
//
[ok]
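
To see how much of the HMM and of the candidate protein the domain hit covers, the coordinates in the domain table above can be turned into coverage fractions. This is a rough sketch that splits the single domain row on whitespace; the column positions are taken from the header shown in this report, and the variable names are illustrative rather than anything GapMind defines.

```python
# Domain row copied from the hmmsearch output above (domain 1 of candidate 17965).
domain_row = ("1 !  765.6   1.2  8.3e-235  8.3e-235       1     496 []"
              "       5     497 ..       5     497 .. 0.99")
model_len = 496    # M=496 from the "Query: TIGR01311" line
target_len = 502   # residues searched, from the pipeline summary

fields = domain_row.split()
hmm_from, hmm_to = int(fields[6]), int(fields[7])    # hmmfrom, hmm to
ali_from, ali_to = int(fields[9]), int(fields[10])   # alifrom, ali to

# Coordinates are 1-based and inclusive, hence the +1.
hmm_coverage = (hmm_to - hmm_from + 1) / model_len
target_coverage = (ali_to - ali_from + 1) / target_len

print(f"HMM coverage: {hmm_coverage:.3f}")
print(f"Target coverage: {target_coverage:.3f}")
```

Here the alignment spans every node of the model (1 to 496) and residues 5 to 497 of the 502-residue protein.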

This GapMind analysis is from Sep 17 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMER. Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
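
The definition of a gap given above (a step with no high- or medium-confidence candidate) can be sketched in a few lines. This is only an illustration under assumed inputs: the dictionary layout, the function name `find_gaps`, and the step names other than glpK are hypothetical and do not reflect GapMind's actual data structures or code.

```python
def find_gaps(step_candidates):
    """Return the steps that have no high- or medium-confidence candidate.

    step_candidates maps a step name to the list of confidence labels
    ("high", "medium", or "low") of its candidates (hypothetical layout).
    """
    return [
        step
        for step, labels in step_candidates.items()
        if not any(label in ("high", "medium") for label in labels)
    ]


# Hypothetical example; only glpK comes from the page above.
example = {
    "glpK": ["high"],
    "stepA": ["low", "low"],  # only low-confidence candidates
    "stepB": [],              # no candidates at all
}
print(find_gaps(example))  # ['stepA', 'stepB']
```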

For more information, see the paper from 2019 on GapMind for amino acid biosynthesis, the preprint on GapMind for carbon sources, or view the source code.

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory