GapMind for catabolism of small carbon sources

 

Alignments for a candidate for aacS in Azoarcus sp. BH72

Align acetoacetate-CoA ligase (EC 6.2.1.16) (characterized)
to candidate WP_011765880.1 AZO_RS10865 acetoacetate--CoA ligase

Query= BRENDA::Q9Z3R3
         (650 letters)



>NCBI__GCF_000061505.1:WP_011765880.1
          Length = 661

 Score =  823 bits (2126), Expect = 0.0
 Identities = 411/658 (62%), Positives = 483/658 (73%), Gaps = 16/658 (2%)

Query: 2   QAERPLWVPDREIVERSPMAEFIDWCGERFGRSFADYDAFHDWSVSERGAFWTAVWEHCK 61
           Q + PLW P  E +  + M  F      R+     DY A H WSV     FW +VWE   
Sbjct: 7   QPDTPLWTPSAERIAAANMTAFRHTAERRWNIELPDYAALHAWSVDHPEQFWVSVWEDGD 66

Query: 62  VIGESGEKALVDGDRMLDARFFPEARLNFAENLLRKTGSGDALIFRGEDKVSYRLTWDEL 121
           V+G+ G + LVDGDRM  A++FP+ARLNFA+NLLR+    DA++F GEDKV  RLT  EL
Sbjct: 67  VVGKRGVRVLVDGDRMPGAQWFPDARLNFAQNLLRRRDGDDAVVFWGEDKVRDRLTHGEL 126

Query: 122 RALVSRLQQALRAQGIGAGDRVAAMMPNMPETIALMLATASVGAIWSSCSPDFGEQGVLD 181
              V++   ALR QG+G GDRVAA MPNMPET+  MLA AS+GAI++S SPDFG QGVLD
Sbjct: 127 YRRVAQFSAALREQGVGKGDRVAAYMPNMPETLIAMLAAASIGAIFTSASPDFGVQGVLD 186

Query: 182 RFGQIAPKLFIVCDGYWYNGKRQDVDSKVRAVAKSLGAP--TVIVPYAGDSAALAPTVEG 239
           RFGQ  PK+ + CDGY+YNGK  D  +K+  +   L +    VIVPY     AL     G
Sbjct: 187 RFGQTEPKVLLACDGYYYNGKMVDCLAKLGEIVPQLPSVERVVIVPYVHRDHAL-----G 241

Query: 240 GVT----LADFIAGFQAGPLV-FERLPFGHPLYILFSSGTTGVPKCIVHSAGGTLLQHLK 294
           G+      ADF+A   A   + FE LPF HPLY+++SSGTTGVPKCIVHSAGG LLQHLK
Sbjct: 242 GIPHARMYADFVAPHHAATEIGFEALPFSHPLYVMYSSGTTGVPKCIVHSAGGALLQHLK 301

Query: 295 EHRFHCGLRDGERLFYFTTCGWMMWNWLASGLAVGATLCLYDGSPFCPDGNVLFDYAAAE 354
           EHR HC ++ G+R+FYFTTCGWMMWNWL SGLA GAT+ LYDGSPF  D  +LFDYA AE
Sbjct: 302 EHRLHCDVKPGDRVFYFTTCGWMMWNWLVSGLAAGATILLYDGSPFAADNRILFDYADAE 361

Query: 355 RFAVFGTSAKYIDAVRKGGFTPARTHDLSSLRLMTSTGSPLSPEGFSFVYEGIKPDVQLA 414
           R   FGTSAK+IDA  K G  P  TH L+++R M STGSPL PEGF +VY  IK D+QL+
Sbjct: 362 RMTHFGTSAKFIDAAAKFGLKPRETHSLATVRAMMSTGSPLVPEGFDYVYRDIKADLQLS 421

Query: 415 SISGGTDIVSCFVLGNPLKPVWRGEIQGPGLGLAVDVWNDEGKPVRGEKGELVCTRAFPS 474
           SISGGTDI+SCFVLGNP+ PVWRGEIQ  GLGLAVDVW+DEG+PVRGEKGELVC R FP+
Sbjct: 422 SISGGTDILSCFVLGNPVLPVWRGEIQCRGLGLAVDVWDDEGRPVRGEKGELVCARPFPA 481

Query: 475 MPVMFWNDPDGAKYRAAYFDRFDNVWCHGDFAEWTPHGGIVIHGRSDATLNPGGVRIGTA 534
           MPV FW D DG+KYRAAYF+RFDNVWCHGDF E T HGG++I+GRSDATLNPGGVRIGTA
Sbjct: 482 MPVGFWRDEDGSKYRAAYFERFDNVWCHGDFCEITAHGGLIIYGRSDATLNPGGVRIGTA 541

Query: 535 EIYNQVEQMDEVAEALCIGQDW----EDDVRVVLFVRLARGVELTEALTREIKNRIRSGA 590
           EIY QVE++ EV E++ IGQDW     +DVRVVLFV+L  G+ L + L   IK  IR   
Sbjct: 542 EIYRQVEKLHEVVESIVIGQDWPPQNPNDVRVVLFVKLRDGMTLDDTLADRIKRTIRDNT 601

Query: 591 SPRHVPAKIIAVADIPRTKSGKIVELAVRDVVHGRPVKNKEALANPEALDLFAGLEEL 648
           +PRHVPAK++ VADIPRTKSGKIVELAVR+VVHGR VKN+EALANPEAL  F   +EL
Sbjct: 602 TPRHVPAKVLQVADIPRTKSGKIVELAVRNVVHGRAVKNQEALANPEALAHFRDRDEL 659


Lambda     K      H
   0.322    0.139    0.441 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 1483
Number of extensions: 59
Number of successful extensions: 3
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 650
Length of database: 661
Length adjustment: 38
Effective length of query: 612
Effective length of database: 623
Effective search space:   381276
Effective search space used:   381276
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.9 bits)
S2: 54 (25.4 bits)
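
The summary values in the BLAST report above can be re-derived from the counts it prints. The following Python sketch is illustrative only: it recomputes the percentages on the Identities line, the effective search space, and an approximate bit score using the standard Karlin-Altschul conversion with the gapped Lambda and K shown in the report. The function names are hypothetical, and because the report prints Lambda and K in rounded form, the recomputed bit score may differ slightly from the reported 823.

```python
import math

def percent(count, aligned_length):
    """Return count as a percentage of the alignment length."""
    return 100.0 * count / aligned_length

def effective_search_space(query_len, db_len, length_adjustment):
    """Product of the query and database lengths after the length adjustment."""
    return (query_len - length_adjustment) * (db_len - length_adjustment)

def bit_score(raw_score, lam, k):
    """Karlin-Altschul conversion: S' = (lambda * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)

# Values copied from the report above.
identity_pct = percent(411, 658)
positives_pct = percent(483, 658)
gaps_pct = percent(16, 658)
search_space = effective_search_space(650, 661, 38)
approx_bits = bit_score(2126, 0.267, 0.0410)  # gapped Lambda and K

print(f"identity {identity_pct:.1f}%, positives {positives_pct:.1f}%, gaps {gaps_pct:.1f}%")
print(f"effective search space {search_space}")
print(f"approximate bit score {approx_bits:.1f}")
```

The effective lengths (612 and 623) are the query and database lengths minus the length adjustment of 38, and their product matches the "Effective search space" line.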

Align candidate WP_011765880.1 AZO_RS10865 (acetoacetate--CoA ligase)
to HMM TIGR01217 (acetoacetate-CoA ligase (EC 6.2.1.16))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR01217.hmm
# target sequence database:        /tmp/gapView.29420.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR01217  [M=652]
Accession:   TIGR01217
Description: ac_ac_CoA_syn: acetoacetate-CoA ligase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                                 Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                                 -----------
   1.4e-297  974.1   0.0   1.7e-297  973.9   0.0    1.0  1  lcl|NCBI__GCF_000061505.1:WP_011765880.1  AZO_RS10865 acetoacetate--CoA li


Domain annotation for each sequence (and alignments):
>> lcl|NCBI__GCF_000061505.1:WP_011765880.1  AZO_RS10865 acetoacetate--CoA ligase
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  973.9   0.0  1.7e-297  1.7e-297       4     650 ..      10     659 ..       7     661 .] 0.98

  Alignments for each domain:
  == domain 1  score: 973.9 bits;  conditional E-value: 1.7e-297
                                 TIGR01217   4 qvlwepdaervkdarlarfraavgerfGaalgdydalyrwsvdeldafwkavwefsdvvfssaekevvd 72 
                                                +lw+p aer++ a+++ fr  +  r+   l dy+al+ wsvd+ ++fw +vwe  dvv+++++  +vd
  lcl|NCBI__GCF_000061505.1:WP_011765880.1  10 TPLWTPSAERIAAANMTAFRHTAERRWNIELPDYAALHAWSVDHPEQFWVSVWEDGDVVGKRGVRVLVD 78 
                                               58******************************************************************* PP

                                 TIGR01217  73 dskmlaarffpgarlnyaenllrkkgsedallyvdeekesakvtfeelrrqvaslaaalralGvkkGdr 141
                                               +++m++a++fp+arln+a+nllr+++ +da+++ +e+k+  ++t+ el r+va+++aalr++Gv+kGdr
  lcl|NCBI__GCF_000061505.1:WP_011765880.1  79 GDRMPGAQWFPDARLNFAQNLLRRRDGDDAVVFWGEDKVRDRLTHGELYRRVAQFSAALREQGVGKGDR 147
                                               ********************************************************************* PP

                                 TIGR01217 142 vagylpnipeavaallatasvGaiwsscspdfGargvldrfsqiepkllfsvdgyvynGkehdrrekvr 210
                                               va+y+pn+pe++ a+la+as+Gai++s+spdfG++gvldrf+q epk+l+++dgy+ynGk  d   k+ 
  lcl|NCBI__GCF_000061505.1:WP_011765880.1 148 VAAYMPNMPETLIAMLAAASIGAIFTSASPDFGVQGVLDRFGQTEPKVLLACDGYYYNGKMVDCLAKLG 216
                                               ********************************************************************* PP

                                 TIGR01217 211 evakelpdlravvlipyvgdreklapkvegaltledllaa.aqaaelvfeqlpfdhplyilfssGttGv 278
                                               e++ +lp++++vv++pyv   ++l   ++ a +++d++a    a+e+ fe lpf+hply+++ssGttGv
  lcl|NCBI__GCF_000061505.1:WP_011765880.1 217 EIVPQLPSVERVVIVPYVHRDHALGG-IPHARMYADFVAPhHAATEIGFEALPFSHPLYVMYSSGTTGV 284
                                               *******************6666655.************945567899********************* PP

                                 TIGR01217 279 pkaivhsaGGtlvqhlkehvlhcdltdgdrllyyttvGwmmwnflvsglatGatlvlydGsplvpatnv 347
                                               pk+ivhsaGG l+qhlkeh+lhcd+++gdr++y+tt+Gwmmwn+lvsgla+Gat+ lydGsp+  ++ +
  lcl|NCBI__GCF_000061505.1:WP_011765880.1 285 PKCIVHSAGGALLQHLKEHRLHCDVKPGDRVFYFTTCGWMMWNWLVSGLAAGATILLYDGSPFAADNRI 353
                                               ********************************************************************* PP

                                 TIGR01217 348 lfdlaeregitvlGtsakyvsavrkkglkparthdlsalrlvastGsplkpegfeyvyeeikadvllas 416
                                               lfd+a++e++t +Gtsak+++a  k glkp +th l ++r+++stGspl pegf+yvy+ ikad++l+s
  lcl|NCBI__GCF_000061505.1:WP_011765880.1 354 LFDYADAERMTHFGTSAKFIDAAAKFGLKPRETHSLATVRAMMSTGSPLVPEGFDYVYRDIKADLQLSS 422
                                               ********************************************************************* PP

                                 TIGR01217 417 isGGtdivscfvganpslpvykGeiqapglGlaveawdeeGkpvtgekGelvvtkplpsmpvrfwnded 485
                                               isGGtdi+scfv++np+lpv++Geiq++glGlav++wd+eG+pv+gekGelv+++p+p+mpv fw ded
  lcl|NCBI__GCF_000061505.1:WP_011765880.1 423 ISGGTDILSCFVLGNPVLPVWRGEIQCRGLGLAVDVWDDEGRPVRGEKGELVCARPFPAMPVGFWRDED 491
                                               ********************************************************************* PP

                                 TIGR01217 486 GskyrkayfdkypgvwahGdyieltprGgivihGrsdatlnpnGvrlGsaeiynaverldeveeslvig 554
                                               Gskyr+ayf+++++vw+hGd++e+t++Gg++i+Grsdatlnp+Gvr+G+aeiy +ve+l ev es+vig
  lcl|NCBI__GCF_000061505.1:WP_011765880.1 492 GSKYRAAYFERFDNVWCHGDFCEITAHGGLIIYGRSDATLNPGGVRIGTAEIYRQVEKLHEVVESIVIG 560
                                               ********************************************************************* PP

                                 TIGR01217 555 qeqe...dgeervvlfvklasGatldealvkeikdairaglsprhvpskiievagiprtlsGkkvevav 620
                                               q+++     ++rvvlfvkl  G+tld+ l ++ik++ir + +prhvp+k+++va+iprt+sGk+ve+av
  lcl|NCBI__GCF_000061505.1:WP_011765880.1 561 QDWPpqnPNDVRVVLFVKLRDGMTLDDTLADRIKRTIRDNTTPRHVPAKVLQVADIPRTKSGKIVELAV 629
                                               **96222469*********************************************************** PP

                                 TIGR01217 621 kdvvaGkpvenkgalsnpealdlyeeleel 650
                                               ++vv+G++v+n++al+npeal  +++ +el
  lcl|NCBI__GCF_000061505.1:WP_011765880.1 630 RNVVHGRAVKNQEALANPEALAHFRDRDEL 659
                                               ************************997775 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (652 nodes)
Target sequences:                          1  (661 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.03u 0.02s 00:00:00.05 Elapsed: 00:00:00.04
# Mc/sec: 10.23
//
[ok]
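
The domain annotation row in the hmmsearch output above carries the coordinates needed to judge how much of the model and of the candidate is aligned. The sketch below is a minimal, hypothetical parser for that single row (it is not part of HMMER or GapMind). It assumes the whitespace-separated column order shown in the table header and takes the model length (652) and sequence length (661) from the report.

```python
def parse_domain_row(row):
    """Split one hmmsearch domain-table row into named fields.

    Assumes the column order: # flag score bias c-Evalue i-Evalue
    hmmfrom hmmto (..) alifrom alito (..) envfrom envto (..) acc.
    """
    f = row.split()
    return {
        "score": float(f[2]),
        "i_evalue": float(f[5]),
        "hmm_from": int(f[6]),
        "hmm_to": int(f[7]),
        "ali_from": int(f[9]),
        "ali_to": int(f[10]),
        "acc": float(f[15]),
    }

def coverage(start, end, total_length):
    """Fraction of total_length spanned by the inclusive range start..end."""
    return (end - start + 1) / total_length

# Row copied from the domain annotation table above.
row = "   1 !  973.9   0.0  1.7e-297  1.7e-297       4     650 ..      10     659 ..       7     661 .] 0.98"
dom = parse_domain_row(row)

hmm_cov = coverage(dom["hmm_from"], dom["hmm_to"], 652)   # model length M=652
seq_cov = coverage(dom["ali_from"], dom["ali_to"], 661)   # candidate length

print(f"score {dom['score']}, HMM coverage {hmm_cov:.3f}, sequence coverage {seq_cov:.3f}")
```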

This GapMind analysis is from Sep 24 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMER with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."
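
The 50% coverage floor for low-confidence hits can be sketched as a simple check. This is an illustration, not GapMind's code: the function names are hypothetical, and it assumes that coverage is measured as the aligned fraction of the characterized (query) protein. It covers only the coverage floor stated above and does not attempt the high- or medium-confidence rules.

```python
LOW_CONFIDENCE_MIN_COVERAGE = 0.50  # "at least 50% coverage", from the text

def alignment_coverage(align_from, align_to, protein_length):
    """Fraction of the protein spanned by the inclusive aligned range."""
    if not 1 <= align_from <= align_to <= protein_length:
        raise ValueError("alignment coordinates fall outside the protein")
    return (align_to - align_from + 1) / protein_length

def passes_low_confidence_floor(align_from, align_to, protein_length):
    """True if the hit covers at least 50% of the protein."""
    cov = alignment_coverage(align_from, align_to, protein_length)
    return cov >= LOW_CONFIDENCE_MIN_COVERAGE

# Query coordinates from the BLAST alignment above: residues 2..648 of 650.
cov = alignment_coverage(2, 648, 650)
print(f"coverage {cov:.3f}, passes floor: {passes_low_confidence_floor(2, 648, 650)}")
```

Passing this floor says nothing about whether a hit qualifies for a higher confidence level; it only separates reportable hits from those that are too partial to consider.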

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory