GapMind for catabolism of small carbon sources

 

Alignments for a candidate for ackA in Pontimonas salivibrio CL-TW6

Align acetate kinase (EC 2.7.2.1) (characterized)
to candidate WP_104914114.1 C3B54_RS08495 acetate kinase

Query= BRENDA::P0CW05
         (408 letters)



>NCBI__GCF_002950575.1:WP_104914114.1
          Length = 384

 Score =  344 bits (883), Expect = 2e-99
 Identities = 186/401 (46%), Positives = 258/401 (64%), Gaps = 26/401 (6%)

Query: 2   KVLVINAGSSSLKYQLIDMTNESALAIGLCERIGIDNSIITQKRFDGKKLEKQTDLPNHK 61
           +VLV+N+GSSS+KYQ+ID+ +E  +  G  +R+GID                  D   HK
Sbjct: 3   QVLVVNSGSSSVKYQVIDLPSEQVVDKGQRDRVGIDGG----------------DFATHK 46

Query: 62  IALEEVVKALTDSEFGVIKSMDEINAVGHRVVHGGEKFNSSALINEGVEQAIKDCFELAP 121
            A+ ++V +L  +          ++ VGHRVVHGG +F    LI++ V   I+    LAP
Sbjct: 47  EAIADIVASLPANV--------SLDLVGHRVVHGGARFTRPVLIDQDVISGIEKVRALAP 98

Query: 122 LHNPPNMMGISSCQEIMPGVPMVAVFDTAFHHTIPPYAYMYALPYELYEKYGIRKYGFHG 181
           LHNP N+ GI +  E+MP +P VAVFDTAF  T+ P AY YALP +L EKYGIRKYGFHG
Sbjct: 99  LHNPANLEGIHAVTEVMPQLPQVAVFDTAFFSTLSPAAYDYALPRDLVEKYGIRKYGFHG 158

Query: 182 TSHFYVAKRAAAMLGKPEQDVKVITCHLGNGSSITAVKGGKSIETTMGFTPLEGVAMGTR 241
           TSH YV    + ++   E+ V+V++ HLGNGSS+ AV+GG +++T+MG TPL G+ MGTR
Sbjct: 159 TSHDYVTGELSRLMPPSEKPVRVVSFHLGNGSSVAAVRGGVAVDTSMGLTPLAGLVMGTR 218

Query: 242 CGSIDPAVVPFIMEKEGLSTREIDTLMNKKSGVLGVSSLSNDFRDLDEAASKGNQKAELA 301
            G IDP+VV ++  + GLS  ++D ++N+ SG+LG+S  S D RDL E A  G+  A  A
Sbjct: 219 SGDIDPSVVLYLQREAGLSPEQVDEVLNRNSGLLGLSGHS-DMRDLYEHAEHGDPDALHA 277

Query: 302 LEIFAYKIKKVIGEYIAVLNGVDAIVFTAGIGENSASIRKRILADLDGIGIKIDEEKNKI 361
           L+++ ++ K  +G Y+A L G+DAIVFT GIGENS  +R   ++DL+G+GI IDE +NK 
Sbjct: 278 LDVWTWRAKHYLGAYLAQLGGLDAIVFTGGIGENSPELRLSAVSDLEGLGIHIDESRNKA 337

Query: 362 R-GQEIDISTPDATVRVLVIPTNEELTIARDTKEICETEVK 401
             G+   IS   ++V + VIPTNEEL IAR     CE+  K
Sbjct: 338 ESGEPRLISADGSSVSLWVIPTNEELHIARIATRHCESARK 378


Lambda     K      H
   0.317    0.135    0.379 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 385
Number of extensions: 11
Number of successful extensions: 4
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 408
Length of database: 384
Length adjustment: 31
Effective length of query: 377
Effective length of database: 353
Effective search space:   133081
Effective search space used:   133081
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.6 bits)
S2: 50 (23.9 bits)
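
The summary figures in the BLAST report above can be re-derived from one another. The sketch below is not part of GapMind. It copies values from the report and applies the standard Karlin-Altschul formulas. Because the report rounds lambda and K, the bit score and E-value are reproduced only approximately. All variable names are illustrative.

```python
import math

# Values copied from the BLAST report above.
raw_score = 883
gapped_lambda, gapped_k = 0.267, 0.0410   # rounded as printed in the report
query_len, subject_len = 408, 384
length_adjustment = 31
identities, positives, gaps, align_len = 186, 258, 26, 401
query_start, query_end = 2, 401           # first and last aligned query residue
sbjct_start, sbjct_end = 3, 378           # first and last aligned subject residue

# Effective search space: product of the length-adjusted sequence lengths.
search_space = (query_len - length_adjustment) * (subject_len - length_adjustment)

# Bit score: S' = (lambda * S - ln K) / ln 2
bit_score = (gapped_lambda * raw_score - math.log(gapped_k)) / math.log(2)

# E-value: expected number of chance hits, E = search_space * 2^(-S')
evalue = search_space * 2.0 ** (-bit_score)

# Fraction of each full-length sequence covered by the alignment.
query_coverage = (query_end - query_start + 1) / query_len
sbjct_coverage = (sbjct_end - sbjct_start + 1) / subject_len

print(f"effective search space: {search_space}")
print(f"bit score (approx.):    {bit_score:.1f}")
print(f"E-value (approx.):      {evalue:.1e}")
print(f"identity / positives / gaps: "
      f"{identities / align_len:.0%} / {positives / align_len:.0%} / {gaps / align_len:.0%}")
print(f"query coverage: {query_coverage:.1%}, subject coverage: {sbjct_coverage:.1%}")
```

The effective search space comes out exactly as reported (377 x 353), while the bit score and E-value agree with the report only to within the rounding of lambda and K.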

Align candidate WP_104914114.1 C3B54_RS08495 (acetate kinase)
to HMM TIGR00016 (ackA: acetate kinase (EC 2.7.2.1))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR00016.hmm
# target sequence database:        /tmp/gapView.3423799.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR00016  [M=405]
Accession:   TIGR00016
Description: ackA: acetate kinase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                             Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                             -----------
   4.6e-131  423.2   0.0   7.7e-130  419.2   0.0    1.9  1  NCBI__GCF_002950575.1:WP_104914114.1  


Domain annotation for each sequence (and alignments):
>> NCBI__GCF_002950575.1:WP_104914114.1  
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  419.2   0.0  7.7e-130  7.7e-130       4     402 ..       2     371 ..       1     374 [. 0.97

  Alignments for each domain:
  == domain 1  score: 419.2 bits;  conditional E-value: 7.7e-130
                             TIGR00016   4 kkilvlnaGssslkfalldaensekvllsglverikleeariktvedgekkeeeklaiedheeavkkllntlk 76 
                                           +++lv+n+Gsss+k++++d   se+v+ +g  +r+ +++                 +++ h+ea++ ++ +l 
  NCBI__GCF_002950575.1:WP_104914114.1   2 TQVLVVNSGSSSVKYQVIDLP-SEQVVDKGQRDRVGIDGG----------------DFATHKEAIADIVASLP 57 
                                           589******************.799988888888888776................7889************9 PP

                             TIGR00016  77 kdkkilkelseialiGHRvvhGgekftesvivtdevlkkikdiselAPlHnpaelegieavlklkvllkaknv 149
                                           +       + +++l+GHRvvhGg +ft  v+++++v+++i+++  lAPlHnpa+legi+av+  +v+++ ++v
  NCBI__GCF_002950575.1:WP_104914114.1  58 A-------NVSLDLVGHRVVHGGARFTRPVLIDQDVISGIEKVRALAPLHNPANLEGIHAVT--EVMPQLPQV 121
                                           5.......6799**************************************************..999999*** PP

                             TIGR00016 150 avFDtafHqtipeeaylYalPyslykelgvRrYGfHGtshkyvtqraakllnkplddlnlivcHlGnGasvsa 222
                                           avFDtaf  t+   ay YalP++l +++g+R+YGfHGtsh yvt ++++l+  +++ +++++ HlGnG+sv+a
  NCBI__GCF_002950575.1:WP_104914114.1 122 AVFDTAFFSTLSPAAYDYALPRDLVEKYGIRKYGFHGTSHDYVTGELSRLMPPSEKPVRVVSFHLGNGSSVAA 194
                                           ************************************************************************* PP

                             TIGR00016 223 vknGksidtsmGltPLeGlvmGtRsGdiDpaiisylaetlglsldeieetlnkksGllgisglssDlRdildk 295
                                           v+ G ++dtsmGltPL+GlvmGtRsGdiDp+++ yl+ + gls +++ e+ln++sGllg+sg  sD+Rd++++
  NCBI__GCF_002950575.1:WP_104914114.1 195 VRGGVAVDTSMGLTPLAGLVMGTRSGDIDPSVVLYLQREAGLSPEQVDEVLNRNSGLLGLSG-HSDMRDLYEH 266
                                           **************************************************************.99******** PP

                             TIGR00016 296 keegneeaklAlkvyvhRiakyigkyiaslegelDaivFtgGiGenaaevrelvleklevlGlkldlelnnaa 368
                                            e+g+ +a  Al+v++ R ++y+g+y+a+l g lDaivFtgGiGen+ e+r  ++++le lG+++d+ +n+ a
  NCBI__GCF_002950575.1:WP_104914114.1 267 AEHGDPDALHALDVWTWRAKHYLGAYLAQLGG-LDAIVFTGGIGENSPELRLSAVSDLEGLGIHIDESRNK-A 337
                                           ******************************76.*************************************9.9 PP

                             TIGR00016 369 rsgkesvisteeskvkvlviptneelviaeDalr 402
                                           +sg+ ++is + s+v + viptneel ia+ a+r
  NCBI__GCF_002950575.1:WP_104914114.1 338 ESGEPRLISADGSSVSLWVIPTNEELHIARIATR 371
                                           *****************************97776 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (405 nodes)
Target sequences:                          1  (384 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.01u 0.00s 00:00:00.01 Elapsed: 00:00:00.01
# Mc/sec: 14.65
//
[ok]
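
To see how much of the TIGR00016 model and of the candidate protein the reported domain spans, the domain-table row above can be parsed directly. This is a minimal sketch, not GapMind's own code. The column positions assume the whitespace-separated hmmsearch layout shown above, and the variable names are illustrative.

```python
# Domain-table row copied from the hmmsearch report above.
row = ("1 !  419.2   0.0  7.7e-130  7.7e-130       4     402 ..       2     371 .."
       "       1     374 [. 0.97")
model_len = 405    # from "[M=405]" on the Query line
target_len = 384   # from "384 residues searched"

# Columns: # ! score bias c-Evalue i-Evalue hmmfrom hmmto .. alifrom alito .. envfrom envto [. acc
fields = row.split()
score = float(fields[2])
hmm_from, hmm_to = int(fields[6]), int(fields[7])
ali_from, ali_to = int(fields[9]), int(fields[10])

# Fraction of the model and of the protein covered by the aligned region.
model_coverage = (hmm_to - hmm_from + 1) / model_len
target_coverage = (ali_to - ali_from + 1) / target_len

print(f"domain score: {score} bits")
print(f"model coverage:  {model_coverage:.1%} (HMM positions {hmm_from}-{hmm_to} of {model_len})")
print(f"target coverage: {target_coverage:.1%} (residues {ali_from}-{ali_to} of {target_len})")
```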

This GapMind analysis is from Sep 24 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMER with enzyme models (usually from TIGRFAMs). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
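
For readers unfamiliar with the term, a six-frame translation translates a DNA sequence in all three reading frames on both strands, so a gene missed by the gene caller can still be found by protein search. The sketch below only illustrates the idea and is not how GapMind or Curated BLAST is implemented. It assumes the standard codon assignments, and the toy DNA string and function names are hypothetical.

```python
from itertools import product

# Standard codon assignments, listed in NCBI order over the bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons; unknown codons (e.g. containing N) become X."""
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3))


def six_frames(dna):
    """Map frame labels (+1..+3, -1..-3) to their translations."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


# Hypothetical toy sequence, not taken from the genome analyzed here.
toy = "ATGAAAGTTCTGGTTTAA"
for label, protein in six_frames(toy).items():
    print(label, protein)
```

A protein search against these translations can then report hits in regions where no protein was predicted.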

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory