GapMind for catabolism of small carbon sources

 

Alignments for a candidate for ackA in Phaeobacter inhibens BS107

Align propionate kinase (EC 2.7.2.15) (characterized)
to candidate GFF2838 PGA1_c28840 acetate kinase AckA

Query= BRENDA::O06961
         (402 letters)



>FitnessBrowser__Phaeo:GFF2838
          Length = 386

 Score =  242 bits (617), Expect = 2e-68
 Identities = 155/384 (40%), Positives = 218/384 (56%), Gaps = 14/384 (3%)

Query: 7   VLVINCGSSSIKFSVLDVATCDVLMAGIADGMNTENAFLSINGDKPINLAHSNYEDALKA 66
           +L++N GSSSIKF++ D    +  +AG+A+G+ T  + L I  D   +     + +AL A
Sbjct: 9   ILILNAGSSSIKFAIFDT-DLNQRLAGLAEGIGTPQSRLRI-ADTSRDSQLPTHAEALAA 66

Query: 67  IAFELEKRDLTDS-VALIGHRIAHGGELFTQSVIITDEIIDNIRRVSPLAPLHNYANLSG 125
           I   L    L  + +A +GHR+ HGG   T+ V IT EI   I   +PLAPLHN  +L+ 
Sbjct: 67  ILAALPDHGLDPTQLAAVGHRVVHGGRKLTKPVRITPEIRAEIADCTPLAPLHNPHSLAA 126

Query: 126 IDAARHLFPAVRQVAVFDTSFHQTLAPEAYLYGLPWEYFSSLGVRRYGFHGTSHRYVSRR 185
           ID      P + Q A FDTSFH T    A  Y +P     + G+RRYGFHG S+  + RR
Sbjct: 127 IDTMATTAPDLPQFASFDTSFHATNPEVATRYAIP-RVEETKGIRRYGFHGLSYASLVRR 185

Query: 186 AYELLDLDEKDSGLIVAHLGNGASICAVRNGQSVDTSMGMTPLEGLMMGTRSGDVDFGAM 245
             E+       S L+  HLGNGAS+CA+RNGQSV T+MG +PL+GL MGTRSG +D  A+
Sbjct: 186 LPEISGA-ALPSRLLAFHLGNGASLCAIRNGQSVATTMGYSPLDGLTMGTRSGGIDANAV 244

Query: 246 AWIAKETGQTLSDLERVVNKESGLLGISGLSSDLRVLEKAWHEGHERARLAIKTFVHRIA 305
             + ++ G  L   + ++N ESGLLG+SG  SD+R L     +    +  AI+ F +   
Sbjct: 245 LRLVEDNG--LDRTKAILNNESGLLGLSGGKSDMRNL---MLDPSADSAFAIEHFCYWSL 299

Query: 306 RHIAGHAASLHRLDGIIFTGGIGENSVLIRQLVIEHLGVLGLTLDVEMNKQPNSHGERII 365
           RH     A++  LD I FTGGIGEN+V +R  ++  L  +G  +DV+ N    S     +
Sbjct: 300 RHAGSLIAAMEGLDAIAFTGGIGENAVGVRARILRGLEWIGARMDVDANHARKSR----L 355

Query: 366 SANPSQVICAVIPTNEEKMIALDA 389
            A  S+V   V+   EE+ IA+DA
Sbjct: 356 HAGSSKVAIWVVEAEEERQIAMDA 379


Lambda     K      H
   0.320    0.137    0.397 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 380
Number of extensions: 25
Number of successful extensions: 7
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 402
Length of database: 386
Length adjustment: 31
Effective length of query: 371
Effective length of database: 355
Effective search space:   131705
Effective search space used:   131705
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.8 bits)
S2: 50 (23.9 bits)
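
As a rough check on the BLAST report above, its headline numbers can be recomputed from the values it prints. The sketch below assumes the standard Karlin-Altschul relations (bit score = (lambda * raw score - ln K) / ln 2, and Expect = effective search space * 2^(-bit score)) and assumes the length adjustment is subtracted once from the query and once per database sequence. The function names are illustrative and are not part of BLAST or GapMind.

```python
import math


def percent_identity(identical, aligned_columns):
    """Percent of alignment columns with identical residues."""
    return 100.0 * identical / aligned_columns


def query_coverage(q_start, q_end, query_len):
    """Fraction of the query spanned by the alignment (1-based, inclusive)."""
    return (q_end - q_start + 1) / query_len


def bit_score(raw_score, lam, k):
    """Convert a raw score to bits using the gapped Lambda and K values."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def effective_search_space(query_len, db_len, length_adjustment, num_seqs=1):
    """Product of the effective query and database lengths."""
    eff_query = query_len - length_adjustment
    eff_db = db_len - num_seqs * length_adjustment
    return eff_query * eff_db


def expect_value(bits, search_space):
    """Expected number of chance hits scoring at least `bits`."""
    return search_space * 2.0 ** (-bits)


# Values copied from the report above.
identity = percent_identity(155, 384)          # Identities = 155/384
coverage = query_coverage(7, 389, 402)         # Query residues 7..389 of 402
bits = bit_score(617, 0.267, 0.0410)           # raw score 617, gapped Lambda and K
space = effective_search_space(402, 386, 31)   # lengths and length adjustment
evalue = expect_value(bits, space)

print(f"identity={identity:.1f}%  coverage={coverage:.1%}")
print(f"bits={bits:.1f}  search space={space}  E={evalue:.1e}")
```

Because Lambda and K are printed to only three significant figures, the recomputed bit score and Expect value agree with the report only to rounding.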

Align candidate GFF2838 PGA1_c28840 (acetate kinase AckA)
to HMM TIGR00016 (ackA: acetate kinase (EC 2.7.2.1))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR00016.hmm
# target sequence database:        /tmp/gapView.7641.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR00016  [M=405]
Accession:   TIGR00016
Description: ackA: acetate kinase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                          Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                          -----------
   2.8e-107  344.9   0.0   3.2e-107  344.7   0.0    1.0  1  lcl|FitnessBrowser__Phaeo:GFF2838  PGA1_c28840 acetate kinase AckA


Domain annotation for each sequence (and alignments):
>> lcl|FitnessBrowser__Phaeo:GFF2838  PGA1_c28840 acetate kinase AckA
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  344.7   0.0  3.2e-107  3.2e-107       4     403 ..       7     382 ..       4     384 .. 0.92

  Alignments for each domain:
  == domain 1  score: 344.7 bits;  conditional E-value: 3.2e-107
                          TIGR00016   4 kkilvlnaGssslkfalldaensekvllsglverikleeariktvedgekkeeeklaiedheeavkkllntlkkdk 79 
                                          il+lnaGsss+kfa++d+   ++  l gl+e i  +++r+ +  +     +   + + h+ea++++l +l  d+
  lcl|FitnessBrowser__Phaeo:GFF2838   7 DNILILNAGSSSIKFAIFDTDL-NQR-LAGLAEGIGTPQSRLRIADT-----SRDSQLPTHAEALAAILAALP-DH 74 
                                        679****************994.565.69***********7664333.....35678899*************.67 PP

                          TIGR00016  80 kilkelseialiGHRvvhGgekftesvivtdevlkkikdiselAPlHnpaelegieavlklkvllkaknvavFDta 155
                                         +  +  ++a++GHRvvhGg k+t+ v +t e+ ++i+d ++lAPlHnp  l +i+++   ++ ++ ++ a FDt+
  lcl|FitnessBrowser__Phaeo:GFF2838  75 GL--DPTQLAAVGHRVVHGGRKLTKPVRITPEIRAEIADCTPLAPLHNPHSLAAIDTMA--TTAPDLPQFASFDTS 146
                                        65..5679*************************************************99..899999********* PP

                          TIGR00016 156 fHqtipeeaylYalPyslykelgvRrYGfHGtshkyvtqraakllnkplddlnlivcHlGnGasvsavknGksidt 231
                                        fH t pe a  Ya+P ++++ +g+RrYGfHG+s+  + +r+ ++ +  +  ++l+  HlGnGas++a++nG+s+ t
  lcl|FitnessBrowser__Phaeo:GFF2838 147 FHATNPEVATRYAIP-RVEETKGIRRYGFHGLSYASLVRRLPEISGA-ALPSRLLAFHLGNGASLCAIRNGQSVAT 220
                                        ***************.8999************************998.56889*********************** PP

                          TIGR00016 232 smGltPLeGlvmGtRsGdiDpaiisylaetlglsldeieetlnkksGllgisglssDlRdildkkeegneeaklAl 307
                                        +mG  PL+Gl mGtRsG iD  ++  l e +g  ld  + +ln +sGllg+sg  sD+R+++ ++     +++ A+
  lcl|FitnessBrowser__Phaeo:GFF2838 221 TMGYSPLDGLTMGTRSGGIDANAVLRLVEDNG--LDRTKAILNNESGLLGLSGGKSDMRNLMLDP---SADSAFAI 291
                                        *********************99998888766..678899*********************9999...45789*** PP

                          TIGR00016 308 kvyvhRiakyigkyiaslegelDaivFtgGiGenaaevrelvleklevlGlkldlelnnaarsgkesvisteeskv 383
                                        + +++   ++ g+ ia++eg lDai FtgGiGena+ vr+++l++le +G ++d + n+     ++s +    skv
  lcl|FitnessBrowser__Phaeo:GFF2838 292 EHFCYWSLRHAGSLIAAMEG-LDAIAFTGGIGENAVGVRARILRGLEWIGARMDVDANH----ARKSRLHAGSSKV 362
                                        ******************99.************************************99....4555667789*** PP

                          TIGR00016 384 kvlviptneelviaeDalrl 403
                                         + v++ +ee  ia Da  l
  lcl|FitnessBrowser__Phaeo:GFF2838 363 AIWVVEAEEERQIAMDAQTL 382
                                        ****************9877 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (405 nodes)
Target sequences:                          1  (386 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.00u 0.01s 00:00:00.01 Elapsed: 00:00:00.01
# Mc/sec: 11.23
//
[ok]
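
The domain table above also shows how much of the model and of the candidate the alignment covers. The sketch below computes both fractions from the single domain row. It assumes the whitespace-separated column order shown in that table (hmmfrom, hmm to, alifrom, ali to) and takes the model length (M=405) and sequence length (386 residues) from the report. The function name is hypothetical.

```python
def model_and_sequence_coverage(domain_row, model_len, seq_len):
    """Return (model coverage, sequence coverage) for one hmmsearch domain row.

    Columns after splitting on whitespace:
    0 '#', 1 '!', 2 score, 3 bias, 4 c-Evalue, 5 i-Evalue,
    6 hmmfrom, 7 hmm to, 8 '..', 9 alifrom, 10 ali to, ...
    """
    fields = domain_row.split()
    hmm_from, hmm_to = int(fields[6]), int(fields[7])
    ali_from, ali_to = int(fields[9]), int(fields[10])
    model_cov = (hmm_to - hmm_from + 1) / model_len
    seq_cov = (ali_to - ali_from + 1) / seq_len
    return model_cov, seq_cov


# Domain row copied from the report above.
row = ("   1 !  344.7   0.0  3.2e-107  3.2e-107       4     403 .."
       "       7     382 ..       4     384 .. 0.92")
model_cov, seq_cov = model_and_sequence_coverage(row, model_len=405, seq_len=386)
print(f"model coverage={model_cov:.1%}  sequence coverage={seq_cov:.1%}")
```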

This GapMind analysis is from Sep 17 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
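
The definition of a gap given above (a step with no high- or medium-confidence candidate) can be sketched as a small filter. This is an illustration only, not GapMind's code: the step names and confidence labels in the example are hypothetical, and `None` stands for a step with no candidate at all.

```python
def find_gaps(best_confidence_by_step):
    """Return the steps whose best candidate is neither high nor medium confidence."""
    acceptable = {"high", "medium"}
    return sorted(
        step
        for step, confidence in best_confidence_by_step.items()
        if confidence not in acceptable
    )


# Hypothetical pathway: best candidate confidence per step.
pathway = {
    "stepA": "high",
    "stepB": "medium",
    "stepC": "low",   # only a low-confidence candidate, so still a gap
    "stepD": None,    # no candidate found
}
gaps = find_gaps(pathway)
print(gaps)
```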

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory