GapMind for catabolism of small carbon sources

 

Alignments for a candidate for aacS in Cupriavidus basilensis 4G11

Align acetoacetate-CoA ligase (EC 6.2.1.16) (characterized)
to candidate RR42_RS10085 RR42_RS10085 acetoacetyl-CoA synthase

Query= BRENDA::Q9Z3R3
         (650 letters)



>FitnessBrowser__Cup4G11:RR42_RS10085
          Length = 1020

 Score =  441 bits (1134), Expect = e-128
 Identities = 267/652 (40%), Positives = 346/652 (53%), Gaps = 25/652 (3%)

Query: 6   PLWVPDREIVERSPMAEFIDWCGERFGRSFADYDAFHDWSVSERGAFWTAV--WEHCKVI 63
           P++    E    S M  F        G+ F DYD  HD+SV E   FW     W H    
Sbjct: 10  PIYGSTSERAAASQMTAFTTALQAYTGQVFGDYDTLHDFSVREYRTFWQCFVQWSHGLAW 69

Query: 64  GESGEKALVDGDRMLDARFFPEARLNFAENLLRKTGSGD---ALIFRGEDKVSYRLTWDE 120
             S E   V GD    ARFFP+ +LN+A+NLL +  +     AL     D    RLT  E
Sbjct: 70  SGSTEPVCV-GDECEHARFFPQVQLNYADNLLGQAVAAPDTPALTACHADGRRVRLTRGE 128

Query: 121 LRALVSRLQQALRAQGIGAGDRVAAMMPNMPETIALMLATASVGAIWSSCSPDFGEQGVL 180
           LR  V+RL  AL   G+  GDRV  +M N  + +   LA  ++GA  S+ + +   + +L
Sbjct: 129 LRNRVARLAHALSELGLRDGDRVVGVMRNDADAVVAALAVTALGATLSTAAAEMSVETIL 188

Query: 181 DRFGQIAPKLFIVCDGYWYNGKRQDVDSKVRAVAKSLGAPTVIVPYAGDSAALAPTVEGG 240
           DRF  +AP+L               +   V  +A +L +   +V    D   L  TV+  
Sbjct: 189 DRFAPLAPRLLFAHVAEREFDTGMSLADNVADLAAALPSLQGVVRL--DDGTLPGTVKQR 246

Query: 241 V-TLADFIAGFQAGPLVFERLPFGHPLYILFSSGTTGVPKCIVHSAGGTLLQHLKEHRFH 299
           + +L + I    AG  V+ R PF HPL+I+FSSGTTG PKCIVH AGG+LL+HLKEHR H
Sbjct: 247 IYSLGELIDSGDAGSFVWRRFPFNHPLFIMFSSGTTGKPKCIVHGAGGSLLEHLKEHRLH 306

Query: 300 CGLRDGERLFYFTTCGWMMWNWLASGLAVGATLCLYDGSPFCPDGNVLFDYAAAERFAVF 359
           C LR G+RL++ TTC WMMWNW  S LA GA +  YDG     D   L+   A ER  VF
Sbjct: 307 CDLRPGDRLYFHTTCAWMMWNWQLSALASGAEIVTYDGPISTVD--ALWRLVADERVTVF 364

Query: 360 GTSAKYIDAVRKGGFTPARTHDLSSLRLMTSTGSPLSPEGFSFVYEGIKPDVQLASISGG 419
           GTS  Y+      G  P +  DL +LR M STG+ L    F +V + +KP + L SISGG
Sbjct: 365 GTSPAYLKMCEDAGLVPGQQFDLGALRAMMSTGAVLFDAQFEWVRDHVKP-LPLQSISGG 423

Query: 420 TDIVSCFVLGNPLKPVWRGEIQGPGLGLAVDVWNDEGKPVRGEKGELVCTRAFPSMPVMF 479
           TDI+ CFVLGNP  P++ GE Q   L L V  W       R   GELVC   FPS P+ F
Sbjct: 424 TDILGCFVLGNPNLPIYAGEAQCKSLALDVQAWEQGAHTSR--IGELVCANPFPSRPLGF 481

Query: 480 WNDPDGAKYRAAYFDRFDNVWCHGDFAEWTPHGGIVIHGRSDATLNPGGVRIGTAEIYNQ 539
           + D DG  + AAYF R   VW HGD  E++P G   +HGRSD  LN  G+ +G  EIY  
Sbjct: 482 FGDMDGKGFHAAYFTRNPGVWTHGDRIEFSPEGTARLHGRSDGILNVRGINVGPGEIYRV 541

Query: 540 VEQMDEVAEALCIGQ-----------DWEDDVRVVLFVRLARGVELTEALTREIKNRIRS 588
           +  + ++ EAL + Q               D R+VL + L  GV LT AL   ++  +  
Sbjct: 542 LSDIRDIREALVVEQRSCAAPPDRTHAERYDQRIVLLLVLQDGVALTGALATRVRRDLAR 601

Query: 589 GASPRHVPAKIIAVADIPRTKSGKIVELAVRDVVHGRPVKNKEALANPEALD 640
            ASP HVP  IIAV ++P T +GK+ E A R+ ++G PV N  AL NP  LD
Sbjct: 602 RASPAHVPDLIIAVDELPVTHNGKLSEAAARNAINGLPVGNAAALRNPGCLD 653


Lambda     K      H
   0.322    0.139    0.441 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 1868
Number of extensions: 106
Number of successful extensions: 8
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 650
Length of database: 1020
Length adjustment: 41
Effective length of query: 609
Effective length of database: 979
Effective search space:   596211
Effective search space used:   596211
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.9 bits)
S2: 56 (26.2 bits)
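
The summary figures in the BLAST report above can be rechecked by hand. The following is a minimal Python sketch, not part of GapMind or BLAST. The function names are hypothetical, and the formulas are the standard Karlin-Altschul relations (bits = (Lambda * S - ln K) / ln 2, and E = search space * 2^-bits), assumed here to apply with the gapped Lambda and K printed in the report.

```python
import math

# Values copied from the BLAST report above.
identities, positives, gaps, align_len = 267, 346, 25, 652
query_len, db_len, length_adjustment = 650, 1020, 41
raw_score = 1134
gapped_lambda, gapped_k = 0.267, 0.0410  # "Gapped" Lambda and K


def truncated_percent(count, total):
    """Integer percentage, rounded down (consistent with the 40% shown for 267/652)."""
    return (100 * count) // total


def effective_search_space(m, n, adjustment):
    """Product of query and database lengths after subtracting the length adjustment."""
    return (m - adjustment) * (n - adjustment)


def bit_score(raw, lam, k):
    """Convert a raw alignment score to bits (assumed Karlin-Altschul form)."""
    return (lam * raw - math.log(k)) / math.log(2)


def expect_value(space, bits):
    """Expected number of chance hits with at least this bit score."""
    return space * 2.0 ** (-bits)


space = effective_search_space(query_len, db_len, length_adjustment)
bits = bit_score(raw_score, gapped_lambda, gapped_k)

print("identities %:", truncated_percent(identities, align_len))
print("positives  %:", truncated_percent(positives, align_len))
print("gaps       %:", truncated_percent(gaps, align_len))
print("effective search space:", space)
print("bit score: %.1f" % bits)
print("E-value: %.1e" % expect_value(space, bits))
```

With the values above, the sketch reproduces 40%, 53% and 3% for identities, positives and gaps, an effective search space of 596211, and about 441 bits. The computed E-value is a little under 1e-127, which agrees with the reported e-128 only approximately, since Lambda and K are printed to three significant digits.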

Align candidate RR42_RS10085 RR42_RS10085 (acetoacetyl-CoA synthase)
to HMM TIGR01217 (acetoacetate-CoA ligase (EC 6.2.1.16))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR01217.hmm
# target sequence database:        /tmp/gapView.3095.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR01217  [M=652]
Accession:   TIGR01217
Description: ac_ac_CoA_syn: acetoacetate-CoA ligase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                                 Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                                 -----------
   3.4e-182  592.9   0.0   4.6e-182  592.4   0.0    1.0  1  lcl|FitnessBrowser__Cup4G11:RR42_RS10085  RR42_RS10085 acetoacetyl-CoA syn


Domain annotation for each sequence (and alignments):
>> lcl|FitnessBrowser__Cup4G11:RR42_RS10085  RR42_RS10085 acetoacetyl-CoA synthase
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  592.4   0.0  4.6e-182  4.6e-182       6     645 ..      11     656 ..       7     662 .. 0.93

  Alignments for each domain:
  == domain 1  score: 592.4 bits;  conditional E-value: 4.6e-182
                                 TIGR01217   6 lwepdaervkdarlarfraavgerfGaalgdydalyrwsvdeldafwkavwefsd.vvfssaekevvdd 73 
                                               ++   +er + ++++ f  a    +G+ +gdyd l+++sv+e+ +fw+ ++++s+ +  s +++ v  +
  lcl|FitnessBrowser__Cup4G11:RR42_RS10085  11 IYGSTSERAAASQMTAFTTALQAYTGQVFGDYDTLHDFSVREYRTFWQCFVQWSHgLAWSGSTEPVCVG 79 
                                               566678999********************************************8626778888889999 PP

                                 TIGR01217  74 skmlaarffpgarlnyaenllrkkgs...edallyvdeekesakvtfeelrrqvaslaaalralGvkkG 139
                                               ++   arffp+ +lnya+nll ++ +    +al     +    ++t  elr++va+la al +lG++ G
  lcl|FitnessBrowser__Cup4G11:RR42_RS10085  80 DECEHARFFPQVQLNYADNLLGQAVAapdTPALTACHADGRRVRLTRGELRNRVARLAHALSELGLRDG 148
                                               *********************99877665677888888999**************************** PP

                                 TIGR01217 140 drvagylpnipeavaallatasvGaiwsscspdfGargvldrfsqiepkllfsvdgyvynGkehdrrek 208
                                               drv+g++ n ++av+a la +++Ga  s+++ ++ ++++ldrf+ ++p+llf+  + +         ++
  lcl|FitnessBrowser__Cup4G11:RR42_RS10085 149 DRVVGVMRNDADAVVAALAVTALGATLSTAAAEMSVETILDRFAPLAPRLLFAHVAEREFDTGMSLADN 217
                                               ****************************************************99888888888999*** PP

                                 TIGR01217 209 vrevakelpdlravvlipyvgdreklapkvegal.tledllaaaqaaelvfeqlpfdhplyilfssGtt 276
                                               v+++a  lp+l+ vv +       +l  +v++ +  l +l+ + +a+  v+ + pf+hpl+i+fssGtt
  lcl|FitnessBrowser__Cup4G11:RR42_RS10085 218 VADLAAALPSLQGVVRLD----DGTLPGTVKQRIySLGELIDSGDAGSFVWRRFPFNHPLFIMFSSGTT 282
                                               **************9875....4455556665542799******************************* PP

                                 TIGR01217 277 GvpkaivhsaGGtlvqhlkehvlhcdltdgdrllyyttvGwmmwnflvsglatGatlvlydGsplvpat 345
                                               G pk+ivh aGG l++hlkeh+lhcdl++gdrl+++tt+ wmmwn+  s+la Ga +v ydG   + + 
  lcl|FitnessBrowser__Cup4G11:RR42_RS10085 283 GKPKCIVHGAGGSLLEHLKEHRLHCDLRPGDRLYFHTTCAWMMWNWQLSALASGAEIVTYDGP--ISTV 349
                                               **************************************************************5..6799 PP

                                 TIGR01217 346 nvlfdlaeregitvlGtsakyvsavrkkglkparthdlsalrlvastGsplkpegfeyvyeeikadvll 414
                                               ++l++l++ e++tv+Gts +y+++++++gl p +++dl alr+++stG+ l    fe+v + +k+ + l
  lcl|FitnessBrowser__Cup4G11:RR42_RS10085 350 DALWRLVADERVTVFGTSPAYLKMCEDAGLVPGQQFDLGALRAMMSTGAVLFDAQFEWVRDHVKP-LPL 417
                                               ***************************************************************98.899 PP

                                 TIGR01217 415 asisGGtdivscfvganpslpvykGeiqapglGlaveawdeeGkpvtgekGelvvtkplpsmpvrfwnd 483
                                                sisGGtdi+ cfv++np lp+y Ge q+++l l+v+aw++  ++     Gelv+++p+ps p+ f+ d
  lcl|FitnessBrowser__Cup4G11:RR42_RS10085 418 QSISGGTDILGCFVLGNPNLPIYAGEAQCKSLALDVQAWEQGAHT--SRIGELVCANPFPSRPLGFFGD 484
                                               ***************************************987665..679******************* PP

                                 TIGR01217 484 edGskyrkayfdkypgvwahGdyieltprGgivihGrsdatlnpnGvrlGsaeiynaverldeveeslv 552
                                                dG  +++ayf + pgvw+hGd ie++p+G+  +hGrsd+ ln  G+ +G  eiy ++  + +++e+lv
  lcl|FitnessBrowser__Cup4G11:RR42_RS10085 485 MDGKGFHAAYFTRNPGVWTHGDRIEFSPEGTARLHGRSDGILNVRGINVGPGEIYRVLSDIRDIREALV 553
                                               ********************************************************************* PP

                                 TIGR01217 553 igqeq..........edgeervvlfvklasGatldealvkeikdairaglsprhvpskiievagiprtl 611
                                               + q            e  ++r+vl++ l  G+ l+ al  ++++ +   +sp hvp+ ii+v+++p t 
  lcl|FitnessBrowser__Cup4G11:RR42_RS10085 554 VEQRScaappdrthaERYDQRIVLLLVLQDGVALTGALATRVRRDLARRASPAHVPDLIIAVDELPVTH 622
                                               *9864222211111134589************************************************* PP

                                 TIGR01217 612 sGkkvevavkdvvaGkpvenkgalsnpealdlye 645
                                               +Gk  e a ++ ++G pv n  al np  ld  +
  lcl|FitnessBrowser__Cup4G11:RR42_RS10085 623 NGKLSEAAARNAINGLPVGNAAALRNPGCLDSIQ 656
                                               ****************************999765 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (652 nodes)
Target sequences:                          1  (1020 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.03u 0.01s 00:00:00.04 Elapsed: 00:00:00.04
# Mc/sec: 14.27
//
[ok]
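
The domain table in the hmmsearch output above gives the aligned range on both the model (hmmfrom, hmm to) and the candidate (alifrom, ali to). As a rough illustration, the Python sketch below parses that one line and computes how much of the model and of the candidate the alignment spans. The names `DomainHit`, `parse_domain_line` and `span_coverage` are hypothetical, the field positions assume the whitespace-separated layout shown above, and defining coverage as aligned span divided by total length is an assumption rather than GapMind's documented calculation.

```python
from dataclasses import dataclass


@dataclass
class DomainHit:
    score: float
    hmm_from: int
    hmm_to: int
    ali_from: int
    ali_to: int


def parse_domain_line(line: str) -> DomainHit:
    """Parse one row of the per-domain table (columns as in the output above)."""
    fields = line.split()
    # fields: #, !, score, bias, c-Evalue, i-Evalue, hmmfrom, hmmto, '..',
    #         alifrom, alito, '..', envfrom, envto, '..', acc
    return DomainHit(
        score=float(fields[2]),
        hmm_from=int(fields[6]),
        hmm_to=int(fields[7]),
        ali_from=int(fields[9]),
        ali_to=int(fields[10]),
    )


def span_coverage(start: int, end: int, total_len: int) -> float:
    """Fraction of a sequence or model covered by the inclusive range start..end."""
    return (end - start + 1) / total_len


# Domain row, model length ([M=652]) and target length (1020 residues)
# copied from the hmmsearch output above.
DOMAIN_LINE = ("   1 !  592.4   0.0  4.6e-182  4.6e-182       6     645 .."
               "      11     656 ..       7     662 .. 0.93")
MODEL_LENGTH = 652
SEQ_LENGTH = 1020

hit = parse_domain_line(DOMAIN_LINE)
model_cov = span_coverage(hit.hmm_from, hit.hmm_to, MODEL_LENGTH)
seq_cov = span_coverage(hit.ali_from, hit.ali_to, SEQ_LENGTH)

print("model coverage: %.3f" % model_cov)
print("candidate coverage: %.3f" % seq_cov)
```

Under these assumptions the alignment spans about 98% of TIGR01217 but only about 63% of the 1020-residue candidate, which matches the BLAST alignment above ending near residue 653 of the candidate.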

This GapMind analysis is from Sep 17 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory