GapMind for catabolism of small carbon sources

 

Alignments for a candidate for ackA in Marivita geojedonensis DPG-138

Align Acetate kinase; Acetokinase; EC 2.7.2.1 (characterized)
to candidate WP_085635099.1 MGEO_RS02500 acetate/propionate family kinase

Query= SwissProt::P37877
         (395 letters)



>NCBI__GCF_002115805.1:WP_085635099.1
          Length = 386

 Score =  266 bits (680), Expect = 7e-76
 Identities = 161/390 (41%), Positives = 234/390 (60%), Gaps = 17/390 (4%)

Query: 4   IIAINAGSSSLKFQLFEMPSETVLTKGLVERIGIADSVFTISVNGEKNTEVTDIPDHAVA 63
           I+ +NAGSSS+KF LF+     +L+ G+ E IG A     + +  EK+  V  + DH  A
Sbjct: 10  ILVLNAGSSSIKFALFDDALNEILS-GMAEAIGGAS---ILRIGDEKHDVV--LIDHRAA 63

Query: 64  VKMLLNKLTEFGIIKDLNEIDGIGHRVVHGGEKFSDSVLLTDETIKEIEDISELAPLHNP 123
           ++ +L  L+E GI  D   +  +GHRVVHGG K +  + +TDE   EI + + LAPLHNP
Sbjct: 64  LEAILKALSERGITPDT--LRAVGHRVVHGGRKLTAPMRVTDEVRAEIANCTPLAPLHNP 121

Query: 124 ANIVGIKAFKEVLPNVPAVAVFDTAFHQTMPEQSYLYSLPYEYYEKFGIRKYGFHGTSHK 183
            N+  ++A   + P++P  A FDT+FH T P  +  Y++P +  E  GIR+YGFHG S+ 
Sbjct: 122 HNLAPMEALSRLAPDLPQFASFDTSFHATNPNVATRYAIP-KMMETKGIRRYGFHGLSYA 180

Query: 184 YVTERAAELLGRPLKDLRLISCHLGNGASIAAVEGGKSIDTSMGFTPLAGVAMGTRSGNI 243
            +  R  E+ G  L   RL++ HLGNGAS+ A+  G+SI T+MG++PL G+ MGTRSG I
Sbjct: 181 SLVRRLPEISGEALPS-RLLAFHLGNGASLCAIHNGQSIATTMGYSPLDGLTMGTRSGGI 239

Query: 244 DPALIPYIMEKTGQTADEVLNTLNKKSGLLGISGFSSDLRDIVEATKEGNERAETALEVF 303
           D   +  ++ + G    + +  LN +SGLLG+SG  SD+R ++    + +  +  A+E F
Sbjct: 240 DANAVLRLVGENGLERTKAI--LNHESGLLGLSGGKSDMRKLM---LDASAESAFAVEHF 294

Query: 304 ASRIHKYIGSYAARMSGVDAIIFTAGIGENSVEVRERVLRGLEFMGVYWDPALNNVRGEE 363
                ++ GS  A + G+DAI FT GIGEN+V VR R+LRGLE+ GV  +P  N+  G  
Sbjct: 295 CYWTLRHAGSLIAALEGLDAIAFTGGIGENAVGVRARILRGLEWAGVRINPDFNHRSGPR 354

Query: 364 AFISYPHSPVKVMIIPTDEEVMIARDVVRL 393
             +    S V V +IP +EE MIA D   L
Sbjct: 355 --LHAESSKVAVWVIPAEEERMIAMDAQAL 382


Lambda     K      H
   0.317    0.136    0.381 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 386
Number of extensions: 21
Number of successful extensions: 8
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 395
Length of database: 386
Length adjustment: 31
Effective length of query: 364
Effective length of database: 355
Effective search space:   129220
Effective search space used:   129220
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.6 bits)
S2: 50 (23.9 bits)
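
The statistics above are enough to re-derive the reported summary values. The sketch below applies the standard Karlin-Altschul formula, E = K * m * n * exp(-lambda * S), using the gapped Lambda and K, the raw score of 680, and the lengths printed in the report. The function names are illustrative and are not part of BLAST or GapMind. Because Lambda and K are printed to only three significant figures, the result should agree with the reported 7e-76 only approximately.

```python
import math

def effective_search_space(query_len, db_len, length_adjustment):
    """Product of the effective query and database lengths."""
    return (query_len - length_adjustment) * (db_len - length_adjustment)

def bit_score(raw_score, lam, k):
    """Convert a raw alignment score to a normalized bit score."""
    return (lam * raw_score - math.log(k)) / math.log(2)

def expect_value(raw_score, lam, k, search_space):
    """Karlin-Altschul expected number of chance hits with this score."""
    return k * search_space * math.exp(-lam * raw_score)

def percent(count, alignment_len):
    """Percentage of alignment columns, rounded to a whole number."""
    return round(100 * count / alignment_len)

# Values copied from the report above (gapped Lambda and K).
space = effective_search_space(395, 386, 31)
bits = bit_score(680, 0.267, 0.0410)
evalue = expect_value(680, 0.267, 0.0410, space)

print("effective search space:", space)
print("bit score: %.1f" % bits)
print("E-value: %.1e" % evalue)
print("identity %d%%, positives %d%%, gaps %d%%"
      % (percent(161, 390), percent(234, 390), percent(17, 390)))
```

The effective search space computed this way equals the 129220 printed in the report, and the percentages match the rounded values on the Identities line.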

Align candidate WP_085635099.1 MGEO_RS02500 (acetate/propionate family kinase)
to HMM TIGR00016 (ackA: acetate kinase (EC 2.7.2.1))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR00016.hmm
# target sequence database:        /tmp/gapView.1827316.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR00016  [M=405]
Accession:   TIGR00016
Description: ackA: acetate kinase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                             Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                             -----------
   7.1e-106  340.2   0.0   8.5e-106  340.0   0.0    1.0  1  NCBI__GCF_002115805.1:WP_085635099.1  


Domain annotation for each sequence (and alignments):
>> NCBI__GCF_002115805.1:WP_085635099.1  
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  340.0   0.0  8.5e-106  8.5e-106       6     403 ..      10     382 ..       6     384 .. 0.91

  Alignments for each domain:
  == domain 1  score: 340.0 bits;  conditional E-value: 8.5e-106
                             TIGR00016   6 ilvlnaGssslkfalldaensekvllsglverikleeariktvedgekkeeeklaiedheeavkkllntlkkd 78 
                                           ilvlnaGsss+kfal+d +  ++ +lsg++e+i  +      +  g++k  + +   dh++a++++l++l  +
  NCBI__GCF_002115805.1:WP_085635099.1  10 ILVLNAGSSSIKFALFDDAL-NE-ILSGMAEAIGGASI----LRIGDEK--HDVVLIDHRAALEAILKALS-E 73 
                                           9****************993.44.59******987766....4456444..4556789*************.5 PP

                             TIGR00016  79 kkilkelseialiGHRvvhGgekftesvivtdevlkkikdiselAPlHnpaelegieavlklkvllkaknvav 151
                                           + i+   + ++++GHRvvhGg k+t  + vtdev ++i++ ++lAPlHnp +l  +ea++     ++ ++ a 
  NCBI__GCF_002115805.1:WP_085635099.1  74 RGIT--PDTLRAVGHRVVHGGRKLTAPMRVTDEVRAEIANCTPLAPLHNPHNLAPMEALS--RLAPDLPQFAS 142
                                           7775..5789**************************************************..7788899**** PP

                             TIGR00016 152 FDtafHqtipeeaylYalPyslykelgvRrYGfHGtshkyvtqraakllnkplddlnlivcHlGnGasvsavk 224
                                           FDt+fH t p+ a  Ya+P ++++ +g+RrYGfHG+s+  + +r+ ++ +  +  ++l+  HlGnGas++a++
  NCBI__GCF_002115805.1:WP_085635099.1 143 FDTSFHATNPNVATRYAIP-KMMETKGIRRYGFHGLSYASLVRRLPEISGE-ALPSRLLAFHLGNGASLCAIH 213
                                           *******************.889999***********************99.56789**************** PP

                             TIGR00016 225 nGksidtsmGltPLeGlvmGtRsGdiDpaiisylaetlglsldeieetlnkksGllgisglssDlRdildkke 297
                                           nG+si t+mG  PL+Gl mGtRsG iD  ++  l  ++g  l+  + +ln +sGllg+sg  sD+R+++ +  
  NCBI__GCF_002115805.1:WP_085635099.1 214 NGQSIATTMGYSPLDGLTMGTRSGGIDANAVLRLVGENG--LERTKAILNHESGLLGLSGGKSDMRKLMLDA- 283
                                           ***************************998888877666..5677899*******************98777. PP

                             TIGR00016 298 egneeaklAlkvyvhRiakyigkyiaslegelDaivFtgGiGenaaevrelvleklevlGlkldlelnnaars 370
                                               e++ A++ +++   ++ g+ ia+leg lDai FtgGiGena+ vr+++l++le  G++++++ n+  rs
  NCBI__GCF_002115805.1:WP_085635099.1 284 --SAESAFAVEHFCYWTLRHAGSLIAALEG-LDAIAFTGGIGENAVGVRARILRGLEWAGVRINPDFNH--RS 351
                                           ..44679*********************99.*************************************9..44 PP

                             TIGR00016 371 gkesvisteeskvkvlviptneelviaeDalrl 403
                                           g    +  e skv v vip +ee +ia Da  l
  NCBI__GCF_002115805.1:WP_085635099.1 352 G--PRLHAESSKVAVWVIPAEEERMIAMDAQAL 382
                                           3..4466789*******************9876 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (405 nodes)
Target sequences:                          1  (386 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.00u 0.00s 00:00:00.00 Elapsed: 00:00:00.00
# Mc/sec: 31.22
//
[ok]
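
The domain table above reports where the alignment falls on the model (hmmfrom, hmm to) and on the candidate protein (alifrom, ali to). A minimal sketch of turning those coordinates into coverage fractions follows. The helper name coverage is hypothetical, and whether GapMind computes model coverage in exactly this way is an assumption.

```python
def coverage(start, end, total_length):
    """Fraction of a model or sequence spanned by 1-based inclusive coordinates."""
    return (end - start + 1) / total_length

# Coordinates copied from the hmmsearch domain table above.
model_cov = coverage(6, 403, 405)      # hmmfrom, hmm to, model length M=405
protein_cov = coverage(10, 382, 386)   # alifrom, ali to, candidate length

print("model coverage: %.3f" % model_cov)
print("protein coverage: %.3f" % protein_cov)
```

Here the alignment spans 398 of the 405 model positions and 373 of the 386 residues in the candidate.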

This GapMind analysis is from Sep 24 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMER with enzyme models (usually from TIGRFAMs). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
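
The definition of a gap given above (a step with no high- or medium-confidence candidate) can be sketched as a simple filter. The data layout, the function name find_gaps, the step names other than ackA, and all confidence labels in the example are hypothetical. This is not GapMind's actual data model.

```python
def find_gaps(step_candidates):
    """Return the steps that have no high- or medium-confidence candidate."""
    gaps = []
    for step, confidences in step_candidates.items():
        if not any(c in ("high", "medium") for c in confidences):
            gaps.append(step)
    return gaps

# Hypothetical confidence levels of the candidates found for each step.
example = {
    "ackA": ["high"],
    "stepB": ["low", "low"],
    "stepC": [],
    "stepD": ["medium", "low"],
}

print(find_gaps(example))
```

In this example, stepB and stepC are reported as gaps because neither has a candidate above low confidence.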

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory