GapMind for catabolism of small carbon sources

 

Alignments for a candidate for fadA in Dyella japonica UNC79MFTsu3.2

Align 3-ketoacyl-CoA thiolase FadI; ACSs; Acetyl-CoA acyltransferase; Acyl-CoA ligase; Beta-ketothiolase; Fatty acid oxidation complex subunit beta; EC 2.3.1.16 (characterized)
to candidate N515DRAFT_3009 N515DRAFT_3009 acetyl-CoA C-acetyltransferase

Query= SwissProt::P76503
         (436 letters)



>FitnessBrowser__Dyella79:N515DRAFT_3009
          Length = 427

 Score =  284 bits (726), Expect = 4e-81
 Identities = 162/426 (38%), Positives = 242/426 (56%), Gaps = 11/426 (2%)

Query: 14  RIAIVSGLRTPFARQATAFHGIPAVDLGKMVVGELLARSEIPAEVIEQLVFGQVVQMPEA 73
           R+ ++ G+R PF R  TA+  +    +   V+G L+ R  +  E + ++  G V++    
Sbjct: 7   RVGVIGGIRIPFCRNNTAYADVGNFGMSVKVLGALVERFRLHGEELGEVAMGAVIKHSSE 66

Query: 74  PNIAREIVLGTGMNVHTDAYSVSRACATSFQAVANVAESLMAGTIRAGIAGGADSSSVLP 133
            N+ARE VL +G+   T   + +RAC TS      +A  + AG I AGIAGG+D++S +P
Sbjct: 67  WNLAREAVLSSGLAPTTPGITTARACGTSLDNAIIIANKIAAGQIEAGIAGGSDTTSDVP 126

Query: 134 IGVSKKLARVLVDVNKARTMSQRLKLFSR-LRLRDLMPVPPAVAEYSTGLRMGDTAEQMA 192
           I + ++  + L+ +N+A+    ++  F+R   L++L P  P VAE  TG+ MGD  E+MA
Sbjct: 127 IVLGERFRKRLLAINRAKGWQDKMAAFTRGFSLKELKPSFPGVAEPRTGMSMGDHCERMA 186

Query: 193 KTYGITREQQDALAHRSHQRAAQAWSDGKLKEEVMTAFIPPYKQPLVEDNNIRGNSSLAD 252
           K + I RE QD LA  SHQ+ A A+  G  ++ V+     P++  L  D  +R +SS+  
Sbjct: 187 KEWHIGREAQDRLALESHQKLAAAYEAGFFEDLVV-----PFRG-LKRDGFLRADSSMEK 240

Query: 253 YAKLRPAFDR--KHGTVTAANSTPLTDGAAAVILMTESRAKELGLVPLGYLRSYAFTAID 310
              L+PAFD+   HGT+TA NST L+DGAAAV+L ++  A   GL    Y       A+D
Sbjct: 241 LGTLKPAFDKISGHGTLTAGNSTGLSDGAAAVLLGSDEWAARRGLKVQAYFLDAEVAAVD 300

Query: 311 V--WQDMLLGPAWSTPLALERAGLTMSDLTLIDMHEAFAAQTLANIQLLGSERFAREALG 368
               + +L+ P  + P  L R GLT+ D    ++HEAFAAQ L  ++   SE + R  LG
Sbjct: 301 FVHGEGLLMAPTVAVPRMLARHGLTLQDFDFYEIHEAFAAQVLCTLRAWESETYCRNRLG 360

Query: 369 RAHATGEVDDSKFNVLGGSIAYGHPFAATGARMITQTLHELRRRGGGFGLVTACAAGGLG 428
                G +D +K NV G S+A GHPFAATGAR++      L  +G G GL++ C AGG+G
Sbjct: 361 LEQPLGSIDPAKLNVHGSSLAAGHPFAATGARIVATLAKMLEEKGSGRGLISICTAGGMG 420

Query: 429 AAMVLE 434
              +LE
Sbjct: 421 VTAILE 426


Lambda     K      H
   0.319    0.133    0.377 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 426
Number of extensions: 24
Number of successful extensions: 5
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 436
Length of database: 427
Length adjustment: 32
Effective length of query: 404
Effective length of database: 395
Effective search space:   159580
Effective search space used:   159580
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.8 bits)
S2: 51 (24.3 bits)
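The statistics above are related by the standard Karlin-Altschul formulas, so the bit score, effective search space, and E-value can be cross-checked against each other. This is a hedged sketch using the values printed in the report; because Lambda and K are printed to only three significant figures, the result should match the reported numbers only up to rounding.

```python
import math

# Values copied from the report above.
raw_score = 726                  # "Score = 284 bits (726)"
lam, k = 0.267, 0.0410           # gapped Lambda and K
query_len, db_len = 436, 427     # "Length of query" and "Length of database"
length_adjustment = 32           # "Length adjustment"

# Effective lengths subtract the length adjustment from each sequence.
eff_query = query_len - length_adjustment
eff_db = db_len - length_adjustment
search_space = eff_query * eff_db

# Bit score: S' = (lambda * S - ln K) / ln 2
bit_score = (lam * raw_score - math.log(k)) / math.log(2)

# Expected number of chance hits: E = m * n * 2^(-S')
e_value = search_space * 2.0 ** (-bit_score)

print(search_space, round(bit_score), f"{e_value:.0e}")
```

Note that the search space here covers a single database sequence, so the E-value reflects a pairwise comparison rather than a search of a large database.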

Align candidate N515DRAFT_3009 N515DRAFT_3009 (acetyl-CoA C-acetyltransferase)
to HMM TIGR01930 (acetyl-CoA C-acyltransferase (EC 2.3.1.16))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR01930.hmm
# target sequence database:        /tmp/gapView.13657.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR01930  [M=385]
Accession:   TIGR01930
Description: AcCoA-C-Actrans: acetyl-CoA C-acyltransferase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                                    Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                                    -----------
   1.9e-117  378.4   0.1   2.1e-117  378.2   0.1    1.0  1  lcl|FitnessBrowser__Dyella79:N515DRAFT_3009  N515DRAFT_3009 acetyl-CoA C-acet


Domain annotation for each sequence (and alignments):
>> lcl|FitnessBrowser__Dyella79:N515DRAFT_3009  N515DRAFT_3009 acetyl-CoA C-acetyltransferase
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  378.2   0.1  2.1e-117  2.1e-117       1     385 []      10     426 ..      10     426 .. 0.97

  Alignments for each domain:
  == domain 1  score: 378.2 bits;  conditional E-value: 2.1e-117
                                    TIGR01930   1 ivdavRtpigklggslkelsaedLlaavikelleragldpekidevilGnvlqageqaniaReaal 66 
                                                  +++++R+p+++ +++++++ +  ++++v+ +l+er+ l+ e++ ev++G+v++++++ n+aRea+l
  lcl|FitnessBrowser__Dyella79:N515DRAFT_3009  10 VIGGIRIPFCRNNTAYADVGNFGMSVKVLGALVERFRLHGEELGEVAMGAVIKHSSEWNLAREAVL 75 
                                                  789*******88****************************************************** PP

                                    TIGR01930  67 aaglpesvpaltvnrvCaSglqAvalaaqkikaGeadvvvaGGvEsmSrvpillkaslrreslklg 132
                                                  ++gl  ++p++t  r+C+++l+  ++ a+ki+aG++++ +aGG +++S+vpi+l ++ r++ l ++
  lcl|FitnessBrowser__Dyella79:N515DRAFT_3009  76 SSGLAPTTPGITTARACGTSLDNAIIIANKIAAGQIEAGIAGGSDTTSDVPIVLGERFRKRLLAIN 141
                                                  ****************************************************************99 PP

                                    TIGR01930 133 kakledqllkdl..................vktklsmgetAenlakkygisReeqDeyalrShqka 180
                                                  +ak  + ++  +                  ++t++smg + e++ak+++i Re qD++al+Shqk 
  lcl|FitnessBrowser__Dyella79:N515DRAFT_3009 142 RAKGWQDKMAAFtrgfslkelkpsfpgvaePRTGMSMGDHCERMAKEWHIGREAQDRLALESHQKL 207
                                                  999888777766899***************9*********************************** PP

                                    TIGR01930 181 akAieegkfkdeivpvevkgkkkvvskDegirpnttlekLakLkpafke..kkgstvtAgNssqln 244
                                                  a+A+e+g+f+d +vp++       +++D  +r+++++ekL++Lkpaf++   +g t tAgNs++l+
  lcl|FitnessBrowser__Dyella79:N515DRAFT_3009 208 AAAYEAGFFEDLVVPFRG------LKRDGFLRADSSMEKLGTLKPAFDKisGHG-TLTAGNSTGLS 266
                                                  ***************655......679*********************977777.7********** PP

                                    TIGR01930 245 DGAaalllmseevakelgltplarivsaavagvdp...eemglgpvpAiekaLkkaglsisdidlv 307
                                                  DGAaa+ll s+e+a++ gl++ a++ +a+va+vd    e ++++p+ A++++L+++gl+++d+d++
  lcl|FitnessBrowser__Dyella79:N515DRAFT_3009 267 DGAAAVLLGSDEWAARRGLKVQAYFLDAEVAAVDFvhgEGLLMAPTVAVPRMLARHGLTLQDFDFY 332
                                                  **********************************9999**************************** PP

                                    TIGR01930 308 EinEAFAaqvlavekelgs................ldlekvNvnGGAiAlGHPlGasGarivltll 357
                                                  Ei+EAFAaqvl   ++++s                +d++k+Nv+G+++A+GHP++a+Gariv+tl+
  lcl|FitnessBrowser__Dyella79:N515DRAFT_3009 333 EIHEAFAAQVLCTLRAWESetycrnrlgleqplgsIDPAKLNVHGSSLAAGHPFAATGARIVATLA 398
                                                  *********************************8777***************************** PP

                                    TIGR01930 358 keLkergkkyGlatlCvggGqGaAvile 385
                                                  k+L+e+g  +Gl+++C +gG+G+++ile
  lcl|FitnessBrowser__Dyella79:N515DRAFT_3009 399 KMLEEKGSGRGLISICTAGGMGVTAILE 426
                                                  *************************997 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (385 nodes)
Target sequences:                          1  (427 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.02u 0.00s 00:00:00.02 Elapsed: 00:00:00.01
# Mc/sec: 10.82
//
[ok]
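The per-domain table in the hmmsearch output gives the aligned range on both the model and the candidate, from which model coverage follows directly. The sketch below splits that one row on whitespace; the function name `parse_domain_row` and the field names are hypothetical, and the column positions assume the row layout shown above.

```python
# Domain row and lengths copied from the hmmsearch output above.
domain_row = ("   1 !  378.2   0.1  2.1e-117  2.1e-117       1     385 []"
              "      10     426 ..      10     426 .. 0.97")
model_len = 385      # "TIGR01930  [M=385]"
target_len = 427     # "(427 residues searched)"

def parse_domain_row(row):
    """Split one row of the per-domain table into named fields."""
    f = row.split()
    return {
        "score": float(f[2]),
        "i_evalue": float(f[5]),
        "hmm_from": int(f[6]), "hmm_to": int(f[7]),
        "ali_from": int(f[9]), "ali_to": int(f[10]),
        "acc": float(f[15]),
    }

dom = parse_domain_row(domain_row)
# Coverage uses inclusive, 1-based coordinates.
model_cov = (dom["hmm_to"] - dom["hmm_from"] + 1) / model_len
target_cov = (dom["ali_to"] - dom["ali_from"] + 1) / target_len

print(f"score {dom['score']}, model coverage {model_cov:.0%}, "
      f"candidate coverage {target_cov:.0%}")
```

For this hit the alignment spans model positions 1 to 385, that is, the entire TIGR01930 model.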

This GapMind analysis is from Sep 17 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
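To illustrate what a six-frame translation is, the sketch below translates a DNA sequence in all three reading frames on each strand. It is a generic illustration using the standard amino acid assignments, not the implementation used by Curated BLAST; the function names and the toy input sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]

def translate(dna):
    """Translate codon by codon; unknown codons (e.g. containing N) become 'X'."""
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))

def six_frame(dna):
    """Return {frame: protein} for frames +1..+3 and -1..-3."""
    dna = dna.upper()
    rc = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames

# Toy input (hypothetical, not taken from the Dyella genome).
for frame, protein in six_frame("ATGGCCAAGTAA").items():
    print(frame, protein)
```

Searching these translations rather than the predicted proteins can reveal genes that were missed or truncated during gene calling, which is why a gap in the predicted proteins is not necessarily a gap in the genome.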

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory