GapMind for catabolism of small carbon sources

 

Alignments for a candidate for fadA in Bacillus alkalinitrilicus DSM 22532

Align 3-oxo-acyl CoA thiolase (EC 2.3.1.16) (characterized)
to candidate WP_078428403.1 BK574_RS09250 thiolase family protein

Query= metacyc::G185E-7833-MONOMER
         (386 letters)



>NCBI__GCF_002019605.1:WP_078428403.1
          Length = 383

 Score =  381 bits (978), Expect = e-110
 Identities = 206/393 (52%), Positives = 266/393 (67%), Gaps = 18/393 (4%)

Query: 1   MTEAYVIDAVRTAVGKRGGALAGIHPVDLGALAWRGLLDRTDIDPAAVDDVIAGCVDAIG 60
           M EA V++ VR+ VG+R GAL+   P DL A   + ++ R +I P  ++DVI GCV  +G
Sbjct: 1   MKEAVVVEVVRSPVGRRNGALSNYRPDDLAAEVLKEVVKRANISPKIIEDVIMGCVSQVG 60

Query: 61  GQAGNIARLSWLAAGYPEEVPGVTVDRQCGSSQQAISFGAQAIMSGTADVIVAGGVQNMS 120
            QA +I R++ L AG+P EVPG T+DRQCGSSQQA+ F AQAIMSG  DV++A GV++MS
Sbjct: 61  EQAADIGRVAALIAGFPIEVPGTTIDRQCGSSQQAVHFAAQAIMSGDMDVVIAAGVESMS 120

Query: 121 QIPISSAMTVGEQFGFTSPTNESKQWLHRYGDQEISQFRGSELIAEKWNLSREEMERYSL 180
           ++P+ S +   E          S++   +Y  + I+Q   +E IAEKW +S++E++ ++L
Sbjct: 121 RVPMFSNLQGAEL---------SERLTSKY--EMINQGFSAERIAEKWGISKQELDEFAL 169

Query: 181 TSHERAFAAIRAGHFENEIITVETE-----SGPFRVDEGPR-ESSLEKMAGLQP-LVEGG 233
            SH +A AA   G FE EI+ +E             DEGPR E++LEK+A L+P  VE G
Sbjct: 170 QSHHKAIAAQDEGRFEREIMPLEVTLPDGTKATITADEGPRRETNLEKLASLKPSFVENG 229

Query: 234 RLTAAMASQISDGASAVLLASERAVKDHGLRPRARIHHISARAADPVFMLTGPIPATRYA 293
            +T   ASQISDGA+A+LL S    ++ GL+PR RI   S   +DP  MLTGPIPAT   
Sbjct: 230 TVTPGNASQISDGAAAILLMSREKAEELGLKPRFRIVARSVIGSDPTLMLTGPIPATEKV 289

Query: 294 LDKTGLAIDDIDTVEINEAFAPVVMAWLKEIKADPAKVNPNGGAIALGHPLGATGAKLFT 353
           L K GL+IDDID  E+NEAFA V + WLKE  ADP K+NPNGGAIALGHPLG +GA+L T
Sbjct: 290 LKKAGLSIDDIDIFEVNEAFASVPLVWLKETGADPNKLNPNGGAIALGHPLGGSGARLMT 349

Query: 354 TMLGELERIGGRYGLQTMCEGGGTANVTIIERL 386
           TM+ ELER GGRYGLQTMCEG G AN TIIERL
Sbjct: 350 TMMYELERTGGRYGLQTMCEGHGMANATIIERL 382


Lambda     K      H
   0.317    0.134    0.389 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 395
Number of extensions: 16
Number of successful extensions: 4
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 386
Length of database: 383
Length adjustment: 30
Effective length of query: 356
Effective length of database: 353
Effective search space:   125668
Effective search space used:   125668
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.6 bits)
S2: 50 (23.9 bits)
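
The summary figures in the BLAST report above can be cross-checked by hand. The following is a minimal sketch, not part of GapMind itself. It assumes the standard Karlin-Altschul relation E = m' × n' × 2^(−S'), where S' is the bit score and m' and n' are the effective query and database lengths. All variable names are illustrative.

```python
import math

# Values copied from the BLAST report above.
bit_score = 381.0
identities, positives, gaps, aln_len = 206, 266, 18, 393
eff_query_len, eff_db_len = 356, 353


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


# Effective search space is the product of the two effective lengths.
search_space = eff_query_len * eff_db_len

# Assumption: E = m' * n' * 2**(-S'); computed in log10 so the exponent
# can be compared directly with the reported "e-110".
log10_evalue = math.log10(search_space) - bit_score * math.log10(2)

print(f"identity   {percent(identities, aln_len):.1f}%")
print(f"positives  {percent(positives, aln_len):.1f}%")
print(f"gaps       {percent(gaps, aln_len):.1f}%")
print(f"effective search space  {search_space}")
print(f"log10(E)   {log10_evalue:.2f}")
```

BLAST truncates percentages to whole numbers, so the exact ratios come out slightly above the 52%, 67%, and 4% shown in the report. The bit score is itself rounded, so the computed exponent is only approximate.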

Align candidate WP_078428403.1 BK574_RS09250 (thiolase family protein)
to HMM TIGR01930 (acetyl-CoA C-acyltransferase (EC 2.3.1.16))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR01930.hmm
# target sequence database:        /tmp/gapView.5208.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR01930  [M=385]
Accession:   TIGR01930
Description: AcCoA-C-Actrans: acetyl-CoA C-acyltransferase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                                 Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                                 -----------
   1.7e-144  467.4   0.2     2e-144  467.2   0.2    1.0  1  lcl|NCBI__GCF_002019605.1:WP_078428403.1  BK574_RS09250 thiolase family pr


Domain annotation for each sequence (and alignments):
>> lcl|NCBI__GCF_002019605.1:WP_078428403.1  BK574_RS09250 thiolase family protein
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  467.2   0.2    2e-144    2e-144       1     385 []       6     380 ..       6     380 .. 0.96

  Alignments for each domain:
  == domain 1  score: 467.2 bits;  conditional E-value: 2e-144
                                 TIGR01930   1 ivdavRtpigklggslkelsaedLlaavikelleragldpekidevilGnvlqageq.aniaReaalaa 68 
                                               +v+ vR+p+g+ +g+l++ +++dL+a+v+ke+++ra+++p+ i++vi+G+v q geq a+i+R aal a
  lcl|NCBI__GCF_002019605.1:WP_078428403.1   6 VVEVVRSPVGRRNGALSNYRPDDLAAEVLKEVVKRANISPKIIEDVIMGCVSQVGEQaADIGRVAALIA 74 
                                               799*******88********************************************************* PP

                                 TIGR01930  69 glpesvpaltvnrvCaSglqAvalaaqkikaGeadvvvaGGvEsmSrvpillkaslrreslklgkakle 137
                                               g+p +vp++t++r+C+S+ qAv+ aaq+i++G++dvv+a+GvEsmSrvp++++ +    +++l +  ++
  lcl|NCBI__GCF_002019605.1:WP_078428403.1  75 GFPIEVPGTTIDRQCGSSQQAVHFAAQAIMSGDMDVVIAAGVESMSRVPMFSNLQ----GAELSERLTS 139
                                               **************************************************99875....4444333344 PP

                                 TIGR01930 138 dqllkdlvktklsmgetAenlakkygisReeqDeyalrShqkaakAieegkfkdeivpvevkgk...kk 203
                                               +       ++ +++g +Ae++a+k+gis++e+De+al+Sh+ka +A++eg+f++ei+p+ev      k 
  lcl|NCBI__GCF_002019605.1:WP_078428403.1 140 K-------YEMINQGFSAERIAEKWGISKQELDEFALQSHHKAIAAQDEGRFEREIMPLEVTLPdgtKA 201
                                               4.......56899***********************************************998899999 PP

                                 TIGR01930 204 vvskDegirpnttlekLakLkpafkekkgstvtAgNssqlnDGAaalllmseevakelgltplarivsa 272
                                               +++ Deg+r++t+lekLa+Lkp f e +g tvt gN+sq++DGAaa+llms+e+a+elgl+p  riv+ 
  lcl|NCBI__GCF_002019605.1:WP_078428403.1 202 TITADEGPRRETNLEKLASLKPSFVE-NG-TVTPGNASQISDGAAAILLMSREKAEELGLKPRFRIVAR 268
                                               ************************96.9*.6************************************** PP

                                 TIGR01930 273 avagvdpeemglgpvpAiekaLkkaglsisdidlvEinEAFAaqvlavekelgsldlekvNvnGGAiAl 341
                                               +v+g+dp+ m++gp+pA+ek+Lkkaglsi+did++E+nEAFA++ l++ ke+g  d++k+N nGGAiAl
  lcl|NCBI__GCF_002019605.1:WP_078428403.1 269 SVIGSDPTLMLTGPIPATEKVLKKAGLSIDDIDIFEVNEAFASVPLVWLKETG-ADPNKLNPNGGAIAL 336
                                               *****************************************************.88************* PP

                                 TIGR01930 342 GHPlGasGarivltllkeLkergkkyGlatlCvggGqGaAvile 385
                                               GHPlG sGar+++t+++eL++ g++yGl+t+C g G+++A+i+e
  lcl|NCBI__GCF_002019605.1:WP_078428403.1 337 GHPLGGSGARLMTTMMYELERTGGRYGLQTMCEGHGMANATIIE 380
                                               ******************************************97 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (385 nodes)
Target sequences:                          1  (383 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.02u 0.00s 00:00:00.02 Elapsed: 00:00:00.01
# Mc/sec: 7.56
//
[ok]
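
The per-domain row of the hmmsearch report above gives the model and sequence coordinates needed to work out how much of the HMM and of the candidate is aligned. The following is a minimal sketch that splits that one row on whitespace. It assumes the column order shown in the report header, and the names `DomainHit` and `parse_domain_row` are hypothetical helpers, not part of HMMER or GapMind.

```python
from typing import NamedTuple


class DomainHit(NamedTuple):
    score: float
    i_evalue: float
    hmm_from: int
    hmm_to: int
    ali_from: int
    ali_to: int
    acc: float


def parse_domain_row(row: str) -> DomainHit:
    """Parse one row of the hmmsearch per-domain table.

    Assumed field order: #, flag, score, bias, c-Evalue, i-Evalue,
    hmmfrom, hmmto, hmm bounds, alifrom, alito, ali bounds,
    envfrom, envto, env bounds, acc.
    """
    f = row.split()
    return DomainHit(
        score=float(f[2]),
        i_evalue=float(f[5]),
        hmm_from=int(f[6]),
        hmm_to=int(f[7]),
        ali_from=int(f[9]),
        ali_to=int(f[10]),
        acc=float(f[15]),
    )


# Row copied from the report above.
row = ("   1 !  467.2   0.2    2e-144    2e-144       1     385 []"
       "       6     380 ..       6     380 .. 0.96")
hit = parse_domain_row(row)

model_len = 385  # M=385 from the "Query:" line
seq_len = 383    # residues searched

hmm_coverage = (hit.hmm_to - hit.hmm_from + 1) / model_len
seq_coverage = (hit.ali_to - hit.ali_from + 1) / seq_len

print(f"HMM coverage      {hmm_coverage:.3f}")
print(f"sequence coverage {seq_coverage:.3f}")
```

For scripted use, hmmsearch's own tabular output options are usually a more robust source than the human-readable report parsed here.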

This GapMind analysis is from Sep 24 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMER with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."
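
The 50% coverage cutoff can be illustrated with a small check. This is only a sketch. It assumes that coverage is the fraction of the characterized protein spanned by the alignment, which the text here does not spell out, and it does not attempt the high- or medium-confidence rules. The function names are hypothetical.

```python
def alignment_coverage(aln_start: int, aln_end: int, protein_len: int) -> float:
    """Fraction of a protein spanned by an alignment (1-based, inclusive)."""
    return (aln_end - aln_start + 1) / protein_len


def meets_low_confidence_coverage(aln_start: int, aln_end: int,
                                  protein_len: int,
                                  min_coverage: float = 0.5) -> bool:
    """True if the alignment spans at least min_coverage of the protein.

    Assumption: coverage is measured on the characterized protein.
    """
    return alignment_coverage(aln_start, aln_end, protein_len) >= min_coverage


# The BLAST alignment above spans positions 1-386 of the 386-residue query.
print(alignment_coverage(1, 386, 386))             # full-length alignment
print(meets_low_confidence_coverage(1, 386, 386))
print(meets_low_confidence_coverage(1, 150, 386))  # under half the protein
```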

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory