GapMind for catabolism of small carbon sources

 

Alignments for a candidate for galK in Vagococcus penaei CD276

Align galactose kinase (characterized)
to candidate WP_126844466.1 BW732_RS08170 galactokinase

Query= CharProtDB::CH_024146
         (382 letters)



>NCBI__GCF_001998885.1:WP_126844466.1
          Length = 389

 Score =  283 bits (723), Expect = 8e-81
 Identities = 159/385 (41%), Positives = 233/385 (60%), Gaps = 12/385 (3%)

Query: 5   EKTQSLFANAFGYPATHTIQAPGRVNLIGEHTDYNDGFVLPCAIDYQTVISCAPRDDRKV 64
           E     F   F         APGR+NLIGEHTDYN G V PCAI Y T      R+DR V
Sbjct: 3   ETLNETFIKIFNEKPESAYFAPGRINLIGEHTDYNGGNVFPCAISYGTYGVVKAREDRLV 62

Query: 65  RVMAADYENQ-LDEFSLDAPIVAHENYQWANYVRGVVKHL-QLRNN--SFGGVDMVISGN 120
           R+ + ++ +  + EFSLD  +   E   WANY +G++ +L  L +N  +  G D+VI G+
Sbjct: 63  RLYSMNFPDLGIKEFSLD-DLTYKEADNWANYPKGMIGYLYDLVDNKEAVTGFDVVIFGD 121

Query: 121 VPQGAGLSSSASLEVAVGTVLQQLYHLPLDGAQIALNGQEAENQFVGCNCGIMDQLISAL 180
           +P GAGLSSSAS+E+  G +++ L+ L LD  ++   GQ+ EN F+G N GIMDQ    +
Sbjct: 122 IPNGAGLSSSASIELLTGVIVEDLWQLTLDRVELVKLGQKVENHFIGVNSGIMDQFAIGM 181

Query: 181 GKKDHALLIDCRSLGTKAVSMP-KGVAVVIINSNFKRTLVGSEYNTRREQCETGARFFQQ 239
           G+KD A+L+DC +L  + V +  +   ++I+N+N +R L  S+YN RR +C+   R  Q 
Sbjct: 182 GQKDQAILLDCHTLNYEMVPVHLENEKILIMNTNKRRELADSKYNERRAECDEALRRLQT 241

Query: 240 ----PALRDVTIEEFNAVAHEL-DPIVAKRVRHILTENARTVEAASALEQGDLKRMGELM 294
                AL ++++ +F+ +   + D  + KR RH + EN RT+EA  ALE  DL+  GEL+
Sbjct: 242 VTDIQALGELSLAQFDELKAVIKDDTLEKRARHAVAENQRTLEAKQALEANDLQTFGELL 301

Query: 295 AESHASMRDDFEITVPQIDTLVEIVKAVIGDKGGVRMTGGGFGGCIVALIPEELVPAVQQ 354
             SH S+RDD+E+T  ++DT+V + +A  G   G RMTG GFGGC +AL+ E+ +P + Q
Sbjct: 302 NASHQSLRDDYEVTGQELDTIVALTQAQNG-VIGARMTGAGFGGCAIALVKEDKLPEIIQ 360

Query: 355 AVAEQYEAKTGIKETFYVCKPSQGA 379
           AV + Y+ K G +  FYV   + GA
Sbjct: 361 AVGKAYQEKIGYEADFYVASIADGA 385


Lambda     K      H
   0.318    0.134    0.389 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 356
Number of extensions: 14
Number of successful extensions: 6
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 382
Length of database: 389
Length adjustment: 30
Effective length of query: 352
Effective length of database: 359
Effective search space:   126368
Effective search space used:   126368
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 50 (23.9 bits)
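
The bit score and E-value reported above can be checked by hand from the gapped Karlin-Altschul parameters and the effective lengths in this statistics block. The sketch below uses the standard formulas S' = (lambda * S - ln K) / ln 2 and E = m' * n' * 2^(-S'). All numbers are copied from the report; the variable names are illustrative only and are not part of BLAST or GapMind.

```python
import math

# Values copied from the BLAST report above.
raw_score = 723            # raw score shown in parentheses after "283 bits"
lambda_gapped = 0.267      # gapped Lambda
k_gapped = 0.0410          # gapped K
query_len = 382
db_len = 389
length_adjustment = 30

# Bit score: S' = (lambda * S - ln K) / ln 2
bit_score = (lambda_gapped * raw_score - math.log(k_gapped)) / math.log(2)

# Effective lengths and search space, as listed in the report.
eff_query = query_len - length_adjustment
eff_db = db_len - length_adjustment
search_space = eff_query * eff_db

# Expected number of chance hits at this score: E = m' * n' * 2^(-S')
e_value = search_space * 2.0 ** (-bit_score)

print(f"bit score ~ {bit_score:.1f}")
print(f"effective search space = {search_space}")
print(f"E-value ~ {e_value:.1e}")
```

Because the published Lambda and K are rounded, the recomputed E-value agrees with the reported 8e-81 only approximately.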

Align candidate WP_126844466.1 BW732_RS08170 (galactokinase)
to HMM TIGR00131 (galK: galactokinase (EC 2.7.1.6))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR00131.hmm
# target sequence database:        /tmp/gapView.3482968.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR00131  [M=388]
Accession:   TIGR00131
Description: gal_kin: galactokinase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                             Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                             -----------
   3.8e-132  426.8   0.1   4.3e-132  426.6   0.1    1.0  1  NCBI__GCF_001998885.1:WP_126844466.1  


Domain annotation for each sequence (and alignments):
>> NCBI__GCF_001998885.1:WP_126844466.1  
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  426.6   0.1  4.3e-132  4.3e-132       2     387 ..       3     387 ..       2     388 .. 0.95

  Alignments for each domain:
  == domain 1  score: 426.6 bits;  conditional E-value: 4.3e-132
                             TIGR00131   2 eevkkiFasaykekpdlvvraPGRvnliGehiDYndgsvlPlaidvdtlvavkerddknvsitlanadn.kla 73 
                                           e+++++F++ ++ekp+   +aPGR+nliGeh+DYn+g v+P+ai ++t+  vk r+d+ v+++++n+ +  ++
  NCBI__GCF_001998885.1:WP_126844466.1   3 ETLNETFIKIFNEKPESAYFAPGRINLIGEHTDYNGGNVFPCAISYGTYGVVKAREDRLVRLYSMNFPDlGIK 75 
                                           567899*************************************************************888*** PP

                             TIGR00131  74 erkldlpldksevsdWanYvkgvlkvlqeRfnsvpl..GldivisgdvPtgaGLsssaalevavaavlknlgk 144
                                           e++ld+  +k+  ++WanY+kg +  l +   + +   G+d+vi gd+P+gaGLsssa++e++ ++++++l++
  NCBI__GCF_001998885.1:WP_126844466.1  76 EFSLDDLTYKEA-DNWANYPKGMIGYLYDLVDNKEAvtGFDVVIFGDIPNGAGLSSSASIELLTGVIVEDLWQ 147
                                           *****9988877.***********99995544433334*********************************** PP

                             TIGR00131 145 leldskeillriqkveehfvGvncGgmDqlasvlGeedhallvefrkLkatpvklpqleialviantnvksnl 217
                                           l+ld  e ++ +qkve+hf+Gvn+G+mDq+a+++G++d a+l+++++L+++ v++   +  ++i+ntn++++l
  NCBI__GCF_001998885.1:WP_126844466.1 148 LTLDRVELVKLGQKVENHFIGVNSGIMDQFAIGMGQKDQAILLDCHTLNYEMVPVHLENEKILIMNTNKRREL 220
                                           *******************************************************999999************ PP

                             TIGR00131 218 apseYnlRrqeveeaakvlakksekgaLrDvkeeefaryearltkllqlvekqRakhvvsenlRvlkavkllk 290
                                           a+s+Yn Rr e++ea++ l++ +++ aL++++ ++f+   +    ++  + ++Ra+h+v+en+R+l+a ++l+
  NCBI__GCF_001998885.1:WP_126844466.1 221 ADSKYNERRAECDEALRRLQTVTDIQALGELSLAQFD---ELKAVIKDDTLEKRARHAVAENQRTLEAKQALE 290
                                           *************************************...6666677777778******************** PP

                             TIGR00131 291 dedlkelGkLmnesqasldddyeitvpeidelvesialvnGsiGsRltGaGfGGCtvalvpnenvekvrkala 363
                                            +dl+++G+L+n+s++sl+ddye+t  e+d+ v + +++nG+iG+R+tGaGfGGC +alv++++++++++a++
  NCBI__GCF_001998885.1:WP_126844466.1 291 ANDLQTFGELLNASHQSLRDDYEVTGQELDTIVALTQAQNGVIGARMTGAGFGGCAIALVKEDKLPEIIQAVG 363
                                           ************************************************************************* PP

                             TIGR00131 364 ekYekktdlklefavivskealge 387
                                           + Y++k + +++f+v+ + +++++
  NCBI__GCF_001998885.1:WP_126844466.1 364 KAYQEKIGYEADFYVASIADGAKK 387
                                           *******************99986 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (388 nodes)
Target sequences:                          1  (389 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.00u 0.00s 00:00:00.00 Elapsed: 00:00:00.01
# Mc/sec: 13.78
//
[ok]
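
How much of the TIGR00131 model this hit covers can be read off the domain table above. The following is a minimal sketch that splits the domain row on whitespace and computes coverage of the model and of the candidate protein. The column positions assume the HMMER 3 domain table layout shown in this report, and the variable names are illustrative.

```python
# Domain row and lengths copied from the hmmsearch output above.
domain_row = ("   1 !  426.6   0.1  4.3e-132  4.3e-132       2     387 .."
              "       3     387 ..       2     388 .. 0.95")
model_len = 388      # from "Query:       TIGR00131  [M=388]"
protein_len = 389    # from "Target sequences: 1 (389 residues searched)"

fields = domain_row.split()
# Assumed layout: #, flag, score, bias, c-Evalue, i-Evalue,
# hmmfrom, hmmto, "..", alifrom, alito, "..", envfrom, envto, "..", acc
hmm_from, hmm_to = int(fields[6]), int(fields[7])
ali_from, ali_to = int(fields[9]), int(fields[10])

# Coordinates are 1-based and inclusive, hence the +1.
model_coverage = (hmm_to - hmm_from + 1) / model_len
protein_coverage = (ali_to - ali_from + 1) / protein_len

print(f"model coverage   = {model_coverage:.1%}")
print(f"protein coverage = {protein_coverage:.1%}")
```

Here the alignment spans nearly the whole model and nearly the whole candidate, consistent with a full-length galactokinase.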

This GapMind analysis is from Sep 24 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."
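
As an illustration of the identity and coverage quantities these rules refer to, the sketch below computes them for the galK alignment shown earlier on this page. It assumes coverage is measured as the aligned fraction of the characterized (query) protein; exactly how GapMind computes coverage is not stated here, so treat that definition, and the variable names, as assumptions. Only the 50% threshold comes from the text above.

```python
# Numbers copied from the BLAST alignment of CH_024146 vs. WP_126844466.1.
identities = 159
alignment_len = 385
query_start, query_end = 5, 379   # first and last aligned query positions
query_len = 382

percent_identity = identities / alignment_len

# Assumption: coverage = aligned fraction of the characterized protein.
query_coverage = (query_end - query_start + 1) / query_len

# The text above puts the minimum coverage for a "low confidence" hit at 50%.
LOW_CONFIDENCE_MIN_COVERAGE = 0.50
passes_coverage = query_coverage >= LOW_CONFIDENCE_MIN_COVERAGE

print(f"identity = {percent_identity:.1%}")
print(f"coverage = {query_coverage:.1%}")
print(f"meets 50% coverage threshold: {passes_coverage}")
```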

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory