GapMind for catabolism of small carbon sources

Alignments for a candidate for pta in Kocuria turfanensis HO-9042

Align phosphate acetyltransferase (EC 2.3.1.8) (characterized)
to candidate WP_062735419.1 AYX06_RS08590 phosphate acetyltransferase

Query= BRENDA::C1DQG8
         (712 letters)



>NCBI__GCF_001580365.1:WP_062735419.1
          Length = 697

 Score =  407 bits (1046), Expect = e-118
 Identities = 275/701 (39%), Positives = 390/701 (55%), Gaps = 43/701 (6%)

Query: 14  GLNSISLGLIRALEQAGLKVGFFKPI------AQPFPIDQGRERSCLLVERTLGRSTPEP 67
           G + ++LGL  +L +   ++G+F+P+      A+   +   +    L  ER  G      
Sbjct: 15  GKSLVTLGLADSLFRRTDRLGYFRPVHLGSSPAEDPMVQLMKRNFDLPDERCRGG----- 69

Query: 68  ILLDQVERHLASGETDLLLEEVVSRYQQAAAGKDVVIVEG--MLPTRDSDYSAYLNPLLS 125
           + L +    LA+G+ D L    VS Y + A   DV++V+G  +L    +     +N  ++
Sbjct: 70  LSLARTRELLAAGDHDELDSMAVSVYGEMARHCDVIVVDGTDLLAHNAATAEFDMNARMA 129

Query: 126 KSLNAEVI-LIAVQGGDGLKQLAERIEI-QAQLFGGIKSPKVLGAIINKIDSGDGIPAFV 183
            +L   V+ +I     +  + +   IE+ +A+L        +   I+N+       P +V
Sbjct: 130 NNLGTSVLAVIGADESESPEDVLNAIEVTRAELHQA--RCDIFAVIVNRAR-----PEWV 182

Query: 184 ERLKEYLPSLGSTDFQLFGAIPFAEELNALRTRDVAELIGAQVLSAGEAD---RRRVSKI 240
           ++L     S G     ++       E  A+    VAEL        G +     R V  +
Sbjct: 183 DQLSTNA-SRGVRGLPVY----VIAENAAVAAPTVAELRDRHGFGDGPSAVSLDRDVKGV 237

Query: 241 VLCARAVPSTVPLLQPGVLVVTPGDRDDVILAASLASLNGVKL---AGLLLCSDFMPDPR 297
            + A  V   +  L  G  V+TPGDR D+++A SLAS     L   +G+LL   F  + R
Sbjct: 238 KIAAMTVAHYLEQLADGDFVITPGDRSDIVVA-SLASALSPALPVPSGMLLTGGFGTEGR 296

Query: 298 IMELCKAALDGGLPVMSVTTGSYDTATNLFALNKETPADDIERATRVTDFIAGHLHPEYL 357
           I EL  AA     PV++    +Y  A  +             +        A  +  E L
Sbjct: 297 IQELTAAA---PFPVLTTGLDTYSAARAVSRSRGTLSGAHPRKVAAALGEWARRVDDEEL 353

Query: 358 RSRCSLPRELLLSPPAFRYQLVKSAQEADKRIVLPEGTEPRTIRAAAICQERGIARCVLL 417
            SR  LPR L  +P  F ++LV++A+   K IVLPEG +PR +RAA +   R      +L
Sbjct: 354 VSRLDLPRPLRRTPLRFLHELVEAARTQRKNIVLPEGEDPRILRAAEMIHRRNFCDLTIL 413

Query: 418 ARPEEVRAVAREQGITLP---DGLEILD---PESIRAQYIAPMVKMRQSKGLTPEMADEQ 471
             P ++  + + +GI L    +GL ++D    E++R +Y    V++R  KG+ PE A E+
Sbjct: 414 GDPAKIAGLCQTEGINLDFDDEGLTLIDFEHDEALREKYAGEYVRLRAHKGVQPEAALER 473

Query: 472 LRDTVVLGTMMLALDEVDGLVSGAVHTTANTIRPALQLIKTAPGYNLVSSVFFMLLPDQV 531
           + D    GTMM+ + +VDG+VSGA HTTANTIRPAL+ +KT+ G  +VSSVFFMLL D+V
Sbjct: 474 MLDGSYFGTMMVQMGDVDGMVSGAAHTTANTIRPALEFVKTSEGVRIVSSVFFMLLEDRV 533

Query: 532 LVYGDCAVNPNPSAAELAEIALQSAESAVALGVHPRVAMISYSTGDSGSGAEVDKVREAT 591
           LVYGDCAVNPNP A +LA+IAL SA +A   GV PRVAM+SYSTG SGSG +VDKVR AT
Sbjct: 534 LVYGDCAVNPNPDAQQLADIALASARTARQFGVEPRVAMLSYSTGGSGSGQDVDKVRAAT 593

Query: 592 RIAQERAPGLPIDGPLQYDAASVASVGKQKAPNSPVAGQATVFVFPDLNTGNTTYKAVQR 651
            I +   P L ++GP+QYDAA  AS+   K P S VAG+ATVFVFPDLNTGN TYKAVQ+
Sbjct: 594 EIVRAADPELEVEGPIQYDAAVDASIAASKLPGSTVAGRATVFVFPDLNTGNNTYKAVQQ 653

Query: 652 NANCISVGPMLQGLAKPVNDLSRGALVDDIVFTIALTALQA 692
           +A  ++VGP+LQGL KPVNDLSRG  VDDIV T+A+TA+QA
Sbjct: 654 SAGAVAVGPVLQGLRKPVNDLSRGCTVDDIVNTVAITAIQA 694


Lambda     K      H
   0.318    0.135    0.379 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 1157
Number of extensions: 52
Number of successful extensions: 5
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 712
Length of database: 697
Length adjustment: 39
Effective length of query: 673
Effective length of database: 658
Effective search space:   442834
Effective search space used:   442834
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 54 (25.4 bits)
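
The length and search-space figures in the statistics above follow from simple arithmetic, and the bit score can be approximately recovered from the raw score using the gapped Lambda and K. The following is a minimal sketch, assuming the standard Karlin-Altschul conversion bits = (Lambda * S - ln K) / ln 2. The variable names are illustrative and are not part of the BLAST output.

```python
import math

# Values copied from the BLAST report above.
query_len = 712          # "Length of query"
db_len = 697             # "Length of database"
length_adjustment = 39   # "Length adjustment"
raw_score = 1046         # raw score shown in parentheses after "407 bits"
gapped_lambda = 0.267    # gapped Lambda
gapped_k = 0.0410        # gapped K

# Effective lengths are the real lengths minus the length adjustment.
effective_query = query_len - length_adjustment
effective_db = db_len - length_adjustment

# The effective search space is the product of the two effective lengths.
search_space = effective_query * effective_db

# Assumed conversion from raw score to bits (Karlin-Altschul statistics).
# Lambda and K are rounded in the report, so the result is approximate.
bit_score = (gapped_lambda * raw_score - math.log(gapped_k)) / math.log(2)

print(effective_query, effective_db, search_space)
print(round(bit_score, 1))
```

The computed search space matches the reported 442834, and the bit score lands within one bit of the reported 407.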

Align candidate WP_062735419.1 AYX06_RS08590 (phosphate acetyltransferase)
to HMM TIGR00651 (pta: phosphate acetyltransferase (EC 2.3.1.8))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR00651.hmm
# target sequence database:        /tmp/gapView.2191782.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR00651  [M=304]
Accession:   TIGR00651
Description: pta: phosphate acetyltransferase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                             Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                             -----------
     3e-124  400.4   0.0   4.2e-124  399.9   0.0    1.2  1  NCBI__GCF_001580365.1:WP_062735419.1  


Domain annotation for each sequence (and alignments):
>> NCBI__GCF_001580365.1:WP_062735419.1  
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  399.9   0.0  4.2e-124  4.2e-124       1     304 []     385     691 ..     385     691 .. 0.96

  Alignments for each domain:
  == domain 1  score: 399.9 bits;  conditional E-value: 4.2e-124
                             TIGR00651   1 ivlPEgseervlkAaallaekkiaekvllvnkeeevkn.kakevnlklgkv..vved.pdvskdiekyverly 69 
                                           ivlPEg+++r+l+Aa+++ +++  + ++l++ +++    +++++nl+  +   +++d ++    +eky+ +++
  NCBI__GCF_001580365.1:WP_062735419.1 385 IVLPEGEDPRILRAAEMIHRRNFCDLTILGDPAKIAGLcQTEGINLDFDDEglTLIDfEHDEALREKYAGEYV 457
                                           8**********************************998889999988766411344413444458******** PP

                             TIGR00651  70 ekrkhkGvtekeareqlrDevslaallvelgeadglvsGavsttaktlrpalqiiktlegvklvssvfimeke 142
                                            +r hkGv+ ++a e + D +++++++v++g +dg+vsGa++tta+t+rpal+ +kt egv++vssvf+m +e
  NCBI__GCF_001580365.1:WP_062735419.1 458 RLRAHKGVQPEAALERMLDGSYFGTMMVQMGDVDGMVSGAAHTTANTIRPALEFVKTSEGVRIVSSVFFMLLE 530
                                           ************************************************************************* PP

                             TIGR00651 143 eevlvfaDCavavdPnaeeLAeiAlqsaksakslgeeepkvallsystkgsgkgeevekvkeAvkilkekepd 215
                                           ++vlv++DCav+++P+a++LA+iAl sa +a+++g  ep+va+lsyst+gsg+g++v+kv+ A++i++   p+
  NCBI__GCF_001580365.1:WP_062735419.1 531 DRVLVYGDCAVNPNPDAQQLADIALASARTARQFG-VEPRVAMLSYSTGGSGSGQDVDKVRAATEIVRAADPE 602
                                           ***********************************.************************************* PP

                             TIGR00651 216 llldGelqfDaAlvekvaekkapesevagkanvfvFPdLdaGnigYkivqRladaeaiGPilqGlakPvnDLs 288
                                           l ++G++q+DaA+ +++a++k p s+vag+a+vfvFPdL++Gn++Yk+vq +a+a a+GP+lqGl+kPvnDLs
  NCBI__GCF_001580365.1:WP_062735419.1 603 LEVEGPIQYDAAVDASIAASKLPGSTVAGRATVFVFPDLNTGNNTYKAVQQSAGAVAVGPVLQGLRKPVNDLS 675
                                           ************************************************************************* PP

                             TIGR00651 289 RGasvedivnvviita 304
                                           RG++v+divn+v+ita
  NCBI__GCF_001580365.1:WP_062735419.1 676 RGCTVDDIVNTVAITA 691
                                           **************97 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (304 nodes)
Target sequences:                          1  (697 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.00u 0.00s 00:00:00.00 Elapsed: 00:00:00.00
# Mc/sec: 36.15
//
[ok]
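
The domain table row above carries the coordinates needed to judge how much of the HMM and of the candidate the hit spans. The following is a minimal sketch that splits that row on whitespace and computes both fractions. It assumes that coverage means the aligned span divided by the total length, which may differ from how GapMind computes it internally. The field positions are read off the column header in this report.

```python
# Domain row copied from the hmmsearch output above.
domain_row = ("   1 !  399.9   0.0  4.2e-124  4.2e-124       1     304 []"
              "     385     691 ..     385     691 .. 0.96")
model_len = 304      # from "Query: TIGR00651 [M=304]"
candidate_len = 697  # from "Target sequences: 1 (697 residues searched)"

fields = domain_row.split()
# Columns: #, !, score, bias, c-Evalue, i-Evalue, hmmfrom, hmm to, [],
#          alifrom, ali to, .., envfrom, env to, .., acc
score = float(fields[2])
hmm_from, hmm_to = int(fields[6]), int(fields[7])
ali_from, ali_to = int(fields[9]), int(fields[10])

# Assumed definition: inclusive aligned span divided by total length.
model_coverage = (hmm_to - hmm_from + 1) / model_len
candidate_coverage = (ali_to - ali_from + 1) / candidate_len

print(f"score={score} model coverage={model_coverage:.2f} "
      f"candidate coverage={candidate_coverage:.2f}")
```

Under that definition the hit spans the whole model (nodes 1 to 304) but under half of the 697-residue candidate, which fits the BLAST alignment above, where the region matching the HMM is the C-terminal part of the protein.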

This GapMind analysis is from Sep 24 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."
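
To make the identity and coverage terms concrete, the following is a minimal sketch that computes both for the BLAST alignment shown on this page. It assumes that coverage is the aligned span of the characterized (query) protein divided by its full length, which may not match GapMind's exact definition. The 50% threshold is taken from the sentence above. The sketch does not assign a confidence level to the candidate.

```python
import re

# Values copied from the BLAST alignment on this page.
stats_line = (" Identities = 275/701 (39%), Positives = 390/701 (55%), "
              "Gaps = 43/701 (6%)")
query_len = 712                    # "(712 letters)"
query_start, query_end = 14, 692   # first and last Query coordinates

# Pull the identical-position count and the alignment length.
match = re.search(r"Identities = (\d+)/(\d+)", stats_line)
identical, aligned_columns = int(match.group(1)), int(match.group(2))

identity = identical / aligned_columns
# Assumed definition: inclusive aligned span over full query length.
coverage = (query_end - query_start + 1) / query_len

# Threshold from the text: hits with at least 50% coverage.
meets_coverage_threshold = coverage >= 0.5

print(f"identity={identity:.1%} coverage={coverage:.1%} "
      f"meets 50% coverage: {meets_coverage_threshold}")
```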

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory