GapMind for catabolism of small carbon sources

 

Alignments for a candidate for aacS in Moritella dasanensis ArB 0140

Align acetoacetyl-CoA synthase (EC 2.3.1.194) (characterized)
to candidate WP_017219769.1 A923_RS0100990 acetoacetate--CoA ligase

Query= BRENDA::Q9JMI1
         (672 letters)



>NCBI__GCF_000276805.1:WP_017219769.1
          Length = 720

 Score =  486 bits (1250), Expect = e-141
 Identities = 281/726 (38%), Positives = 398/726 (54%), Gaps = 89/726 (12%)

Query: 18  MWEPDSKK--DTQMDRFRAAVGTACGLALGNYDDLYHWSVRSYSDFWAE---FWKFSGIV 72
           +W P   +  ++ + R+   +     L   +Y  L+ WS+ + + FW     F+ F G  
Sbjct: 9   IWTPSQARIDNSNITRYLQFLRLEYSLTFVHYQQLHQWSIDNTAVFWCSLVHFFNFKGNF 68

Query: 73  CSRMYDEVVDTSKGIADVPEWFRGSRLNYAENLL------------------------RH 108
             +     V          +WF  S LN+AENLL                         +
Sbjct: 69  NLKR----VFVKNDCFYHCQWFPDSTLNFAENLLFPTVFSINKKSISINKNHVSTNCSHY 124

Query: 109 KEND-------RVALYVAREGREEIAKVTFEELRQQVALFAAAMRKMGVKKGDRVVGYLP 161
             ND       ++A+   RE  +   ++++++LR +V   AAAMR++G+ KGDRV G LP
Sbjct: 125 SANDQDSVNPDKLAIICCREDGQR-TQLSYQQLRDEVTRVAAAMRELGIVKGDRVAGLLP 183

Query: 162 NSAHAVEAMLAAASIGAIWSSTSPDFGVNGVLDRFSQIQPKLIFSVEAVVYNGKEHGHLE 221
           N + A+ AMLA  SIGAIWSS SPDFG  GVLDRF QI+P+L+F+     Y GK+    E
Sbjct: 184 NCSEAIIAMLATTSIGAIWSSCSPDFGHQGVLDRFIQIRPQLLFACNGYHYAGKQIDIRE 243

Query: 222 KLQRVVKGLPDLQRVVLIPYVLPREKIDISKIPNSMFLDDFLASG----TGAQAPQLEFE 277
           K+  +   LPDL ++V+IPY+     I          LD            A    L FE
Sbjct: 244 KVNAIANVLPDLTQLVIIPYLALDVDIQTQANVTPSILDKTTVCHWRHFCAAIPRSLSFE 303

Query: 278 QLPFSHPLFIMFSSGTTGAPKCMVHSAGGTLIQHLKEHVLHGNMTSSDILLYYTTVGWMM 337
              F+ PL+I++SSGTTG PKC+VHS GGTL+QH KE  LH ++   D + YYTT GWMM
Sbjct: 304 ATAFADPLYILYSSGTTGMPKCIVHSVGGTLLQHAKELALHTDVQIDDRIFYYTTCGWMM 363

Query: 338 WNWMVSALATGASLVLYDGSPLVPTPNVLWDLVDRIGITILGTGAKWLSVLEEKDMKPME 397
           WNW+VS+L+ GA+LVL+DGSP  P   VL++L D   ++I G  AK+ S  ++  ++P E
Sbjct: 364 WNWLVSSLSQGATLVLFDGSPFYPHKQVLFELADTEKVSIFGASAKYYSACDKAKLRPAE 423

Query: 398 THNLHTLHTILSTGSPLKAQSYEYVYRCIKSTVLLGSISGGTDIISCFMGQNSSIPVYKG 457
           T+ L  L T+LSTGS L  +S++Y+Y+ IK  V L SI GGTDIISCFM    ++PVY+G
Sbjct: 424 TYKLSNLKTMLSTGSTLSHESFDYIYQHIKQDVCLSSICGGTDIISCFMLGMPTLPVYRG 483

Query: 458 EIQARNLGMAVEAWDEEGKTVWGASGELVCTKPIPCQPTHFWNDENGSKYRKAYFSKYPG 517
           E+Q   LGM V              G+LVC +P P  PT FW D +  KY  AYFS++  
Sbjct: 484 ELQCIGLGMDVAF----------MQGDLVCRQPFPSMPTGFWQDPDDRKYHDAYFSRFHN 533

Query: 518 VWAHGDYCR-----INP----------------------------KTGGIVMLGRSDGTL 544
           +WAHGDY       INP                            K  G+++ GRSD  L
Sbjct: 534 IWAHGDYGELIYHYINPNPELVKPIENKSESVADSGIKKSTDISIKQIGVIIHGRSDAVL 593

Query: 545 NPNGVRFGSSEIYNIVEAFDEVEDSLCVPQYNRDGEERVVLFLKMASGHTFQPDLVKHIR 604
           NP GVR G++EIY  VE    +++S+ + Q  RD + RV+LF+++++G      L+  I+
Sbjct: 594 NPGGVRIGTAEIYRQVEKLPAIQESIAIGQQWRD-DVRVILFVRLSAGVELDSQLISQIK 652

Query: 605 DAIRLGLSARHVPSLILETQGIPYTINGKKVEVAVKQVIAGKTVEHRGAFSNPESLDLYR 664
             IR   + RHVP+ I+    IP TI+GK VE+AV+ V+ G +V ++ A +NPE+L L+ 
Sbjct: 653 QVIRTNTTPRHVPAKIIAVADIPKTISGKIVELAVRNVVHGISVTNKDALANPEALTLFA 712

Query: 665 DIPELQ 670
            + ELQ
Sbjct: 713 HLAELQ 718


Lambda     K      H
   0.319    0.136    0.417 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 1178
Number of extensions: 40
Number of successful extensions: 7
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 2
Number of HSP's successfully gapped: 2
Length of query: 672
Length of database: 720
Length adjustment: 39
Effective length of query: 633
Effective length of database: 681
Effective search space:   431073
Effective search space used:   431073
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.8 bits)
S2: 54 (25.4 bits)
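
The headline numbers in the report above can be re-derived from its own statistics footer. The following is a minimal sketch, assuming the standard Karlin-Altschul relations (bit score = (lambda * S - ln K) / ln 2, and E = effective search space * 2^(-bit score)) and the gapped Lambda and K values printed in the report. The function and constant names are illustrative and are not part of BLAST or GapMind. Because the report prints rounded parameters, the results should land close to the reported 486 bits and e-141 rather than match them digit for digit.

```python
import math

# Values copied from the BLAST report above (gapped Lambda and K).
LAMBDA_GAPPED = 0.267
K_GAPPED = 0.0410
RAW_SCORE = 1250          # the number in parentheses after "486 bits"
QUERY_LEN = 672
DB_LEN = 720
LENGTH_ADJUSTMENT = 39


def bit_score(raw_score: float, lam: float, k: float) -> float:
    """Convert a raw alignment score to a normalized bit score."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def effective_search_space(query_len: int, db_len: int, adjustment: int) -> int:
    """Product of the query and database lengths after length adjustment."""
    return (query_len - adjustment) * (db_len - adjustment)


def log10_expect(bits: float, search_space: int) -> float:
    """log10 of the E-value; working in log space avoids tiny floats."""
    return math.log10(search_space) - bits * math.log10(2)


bits = bit_score(RAW_SCORE, LAMBDA_GAPPED, K_GAPPED)
space = effective_search_space(QUERY_LEN, DB_LEN, LENGTH_ADJUSTMENT)
log_e = log10_expect(bits, space)

print(f"bit score: {bits:.1f}")
print(f"effective search space: {space}")
print(f"log10(E-value): {log_e:.1f}")
```

The search space computed this way equals the "Effective search space" line of the report (633 x 681).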

Align candidate WP_017219769.1 A923_RS0100990 (acetoacetate--CoA ligase)
to HMM TIGR01217 (acetoacetate-CoA ligase (EC 6.2.1.16))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR01217.hmm
# target sequence database:        /tmp/gapView.283529.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR01217  [M=652]
Accession:   TIGR01217
Description: ac_ac_CoA_syn: acetoacetate-CoA ligase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                             Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                             -----------
   1.8e-238  778.8   8.2   4.6e-167  542.9   0.1    3.0  3  NCBI__GCF_000276805.1:WP_017219769.1  


Domain annotation for each sequence (and alignments):
>> NCBI__GCF_000276805.1:WP_017219769.1  
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !   55.1   0.2     2e-19     2e-19       4      96 ..       7     100 ..       4     125 .. 0.89
   2 !  542.9   0.1  4.6e-167  4.6e-167     102     509 ..     138     543 ..     107     547 .. 0.92
   3 !  184.7   0.5   1.2e-58   1.2e-58     513     651 ..     581     718 ..     573     719 .. 0.98

  Alignments for each domain:
  == domain 1  score: 55.1 bits;  conditional E-value: 2e-19
                             TIGR01217   4 qvlwepdaervkdarlarfraavgerfGaalgdydalyrwsvdeldafwkavwefsdvvfssaekev.vddsk 75 
                                           +++w+p + r+++++++r+ ++   ++ +++ +y++l++ws+d+   fw ++++f++  ++ + k+v v ++ 
  NCBI__GCF_000276805.1:WP_017219769.1   7 DPIWTPSQARIDNSNITRYLQFLRLEYSLTFVHYQQLHQWSIDNTAVFWCSLVHFFNFKGNFNLKRVfVKNDC 79 
                                           589***********************************************************99998467777 PP

                             TIGR01217  76 mlaarffpgarlnyaenllrk 96 
                                               ++fp+++ln+aenll  
  NCBI__GCF_000276805.1:WP_017219769.1  80 FYHCQWFPDSTLNFAENLLFP 100
                                           88999*************965 PP

  == domain 2  score: 542.9 bits;  conditional E-value: 4.6e-167
                             TIGR01217 102 allyvdeekesakvtfeelrrqvaslaaalralGvkkGdrvagylpnipeavaallatasvGaiwsscspdfG 174
                                           a++   e+ + +++++++lr  v+++aaa+r+lG+ kGdrvag+lpn +ea+ a+lat+s+GaiwsscspdfG
  NCBI__GCF_000276805.1:WP_017219769.1 138 AIICCREDGQRTQLSYQQLRDEVTRVAAAMRELGIVKGDRVAGLLPNCSEAIIAMLATTSIGAIWSSCSPDFG 210
                                           556667778999************************************************************* PP

                             TIGR01217 175 argvldrfsqiepkllfsvdgyvynGkehdrrekvrevakelpdlravvlipyvg......dreklapkv.e. 239
                                            +gvldrf qi+p+llf+++gy y Gk+ d rekv+++a+ lpdl + v+ipy+        ++++ p + + 
  NCBI__GCF_000276805.1:WP_017219769.1 211 HQGVLDRFIQIRPQLLFACNGYHYAGKQIDIREKVNAIANVLPDLTQLVIIPYLAldvdiqTQANVTPSIlDk 283
                                           *****************************************************86222111223333333221 PP

                             TIGR01217 240 .galtledllaaaqaaelvfeqlpfdhplyilfssGttGvpkaivhsaGGtlvqhlkehvlhcdltdgdrlly 311
                                               ++  + aa     l fe   f  plyil+ssGttG+pk+ivhs GGtl+qh ke++lh+d++  dr++y
  NCBI__GCF_000276805.1:WP_017219769.1 284 tTVCHWRHFCAA-IPRSLSFEATAFADPLYILYSSGTTGMPKCIVHSVGGTLLQHAKELALHTDVQIDDRIFY 355
                                           123457778877.56789******************************************************* PP

                             TIGR01217 312 yttvGwmmwnflvsglatGatlvlydGsplvpatnvlfdlaeregitvlGtsakyvsavrkkglkparthdls 384
                                           ytt+Gwmmwn+lvs+l  Gatlvl+dGsp+ p+ +vlf+la+ e+++++G+saky sa+ k+ l+pa+t++ls
  NCBI__GCF_000276805.1:WP_017219769.1 356 YTTCGWMMWNWLVSSLSQGATLVLFDGSPFYPHKQVLFELADTEKVSIFGASAKYYSACDKAKLRPAETYKLS 428
                                           ************************************************************************* PP

                             TIGR01217 385 alrlvastGsplkpegfeyvyeeikadvllasisGGtdivscfvganpslpvykGeiqapglGlaveawdeeG 457
                                            l++++stGs l+ e+f+y+y+ ik dv l+si GGtdi+scf+++ p+lpvy+Ge+q+ glG++v       
  NCBI__GCF_000276805.1:WP_017219769.1 429 NLKTMLSTGSTLSHESFDYIYQHIKQDVCLSSICGGTDIISCFMLGMPTLPVYRGELQCIGLGMDVAF----- 496
                                           ****************************************************************9965..... PP

                             TIGR01217 458 kpvtgekGelvvtkplpsmpvrfwndedGskyrkayfdkypgvwahGdyiel 509
                                                 +G+lv+ +p+psmp+ fw d+d  ky++ayf+++ ++wahGdy el
  NCBI__GCF_000276805.1:WP_017219769.1 497 -----MQGDLVCRQPFPSMPTGFWQDPDDRKYHDAYFSRFHNIWAHGDYGEL 543
                                           .....579*****************************************876 PP

  == domain 3  score: 184.7 bits;  conditional E-value: 1.2e-58
                             TIGR01217 513 GgivihGrsdatlnpnGvrlGsaeiynaverldeveeslvigqeqedgeervvlfvklasGatldealvkeik 585
                                            g++ihGrsda lnp+Gvr+G+aeiy +ve+l+ ++es+ igq+++d ++rv+lfv+l++G++ld +l+ +ik
  NCBI__GCF_000276805.1:WP_017219769.1 581 IGVIIHGRSDAVLNPGGVRIGTAEIYRQVEKLPAIQESIAIGQQWRD-DVRVILFVRLSAGVELDSQLISQIK 652
                                           589********************************************.************************* PP

                             TIGR01217 586 dairaglsprhvpskiievagiprtlsGkkvevavkdvvaGkpvenkgalsnpealdlyeeleelk 651
                                           + ir++ +prhvp+kii+va+ip+t+sGk+ve+av++vv+G +v nk+al+npeal l+++l+el+
  NCBI__GCF_000276805.1:WP_017219769.1 653 QVIRTNTTPRHVPAKIIAVADIPKTISGKIVELAVRNVVHGISVTNKDALANPEALTLFAHLAELQ 718
                                           ***************************************************************987 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (652 nodes)
Target sequences:                          1  (720 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.03u 0.00s 00:00:00.03 Elapsed: 00:00:00.03
# Mc/sec: 12.55
//
[ok]
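
The three domain hits reported above can be combined to see how much of the TIGR01217 model and of the candidate protein they span. The following is a minimal sketch, assuming coverage is measured as the union of the reported "hmmfrom/hmm to" and "alifrom/ali to" ranges. This is one plausible summary and not necessarily how GapMind computes coverage. The names `DOMAINS` and `covered_positions` are illustrative.

```python
# (hmm_from, hmm_to, ali_from, ali_to) copied from the domain table above.
DOMAINS = [
    (4, 96, 7, 100),
    (102, 509, 138, 543),
    (513, 651, 581, 718),
]
HMM_LEN = 652   # "[M=652]" in the hmmsearch header
SEQ_LEN = 720   # length of WP_017219769.1


def covered_positions(intervals):
    """Count positions in the union of 1-based, inclusive intervals."""
    covered = 0
    last_end = 0
    for start, end in sorted(intervals):
        # Skip any part already counted by an earlier interval.
        start = max(start, last_end + 1)
        if end >= start:
            covered += end - start + 1
            last_end = end
    return covered


hmm_covered = covered_positions([(d[0], d[1]) for d in DOMAINS])
seq_covered = covered_positions([(d[2], d[3]) for d in DOMAINS])

print(f"HMM coverage: {hmm_covered}/{HMM_LEN} = {hmm_covered / HMM_LEN:.1%}")
print(f"Sequence coverage: {seq_covered}/{SEQ_LEN} = {seq_covered / SEQ_LEN:.1%}")
```

Taking the union rather than the sum of interval lengths keeps the count correct if domain envelopes ever overlap, which they do not in this report.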

This GapMind analysis is from Sep 24 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMER with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory