GapMind for catabolism of small carbon sources

 

Alignments for a candidate for acs in Williamsia sterculiae CPCC 203464

Align Acetyl-coenzyme A synthetase; AcCoA synthetase; Acs; Acetate--CoA ligase; Acyl-activating enzyme; EC 6.2.1.1 (characterized)
to candidate WP_076478630.1 BW971_RS08810 acetate--CoA ligase

Query= SwissProt::P9WQD1
         (651 letters)



>NCBI__GCF_900156495.1:WP_076478630.1
          Length = 657

 Score =  910 bits (2352), Expect = 0.0
 Identities = 447/647 (69%), Positives = 512/647 (79%), Gaps = 12/647 (1%)

Query: 7   EVSSSYPPPAHFAEHANARAELYREAEEDRLAFWAKQANRLSWTTPFTEVLDWSGAPFAK 66
           E   S+PP   FA+ ANA AELY  A+ DRL FWA+QA  L W T FT+VLDWS APFAK
Sbjct: 14  EAGQSFPPSRDFADQANAGAELYTRADADRLEFWAEQARALDWDTDFTDVLDWSNAPFAK 73

Query: 67  WFVGGELNVAYNCVDRHVEAGHGDRVAIHWEGEPVGDRRTLTYSDLLAEVSKAANALTDL 126
           WFVGG+LNVA NCVDRHV AG+GDRVAIHW GEP GD R +TYS LL +VS+AAN    +
Sbjct: 74  WFVGGKLNVAVNCVDRHVAAGNGDRVAIHWVGEP-GDHRDITYSQLLGDVSRAANYFDSI 132

Query: 127 GLVAGDRVAIYLPLIPEAVIAMLACARLGIMHSVVFGGFTAAALQARIVDAQAKLLITAD 186
           GL  GDRVAIY+P++PEA++ MLACARLG+ HSVVF GF++ AL++R+ DA+AKL++T D
Sbjct: 133 GLRPGDRVAIYMPMVPEAIVTMLACARLGLTHSVVFAGFSSTALRSRVDDAEAKLVVTTD 192

Query: 187 GQFRRGKPSPLKAAADEALAAIPDC--SVEHVLVVRRTGI--EMAWSEGRDLWWHHVVGS 242
           GQ+RRGKP+PLKA  DEAL    D   SV+ VLVVRRT    E+ W  GRD+WW   V  
Sbjct: 193 GQYRRGKPAPLKANVDEALGTGDDAAVSVQTVLVVRRTNHDPELNWVSGRDVWWDDTVAE 252

Query: 243 ASPAHTPEPFDSEHPLFLLYTSGTTGKPKGIMHTSGGYLTQCCYTMRTIFDVKPDSDVFW 302
           AS  H P+ FDSE PLFLLYTSGTTGKPKGI+HTSGGYLTQ  YT   +FD K   DVFW
Sbjct: 253 ASDHHEPDAFDSEQPLFLLYTSGTTGKPKGILHTSGGYLTQVRYTFHNVFDHKEGRDVFW 312

Query: 303 CTADIGWVTGHTYGVYGPLCNGVTEVLYEGTPDTPDRHRHFQIIEKYGVTIYYTAPTLIR 362
           C ADIGWVTGH+Y VYGPL NG TEV+YEGTP++P+ HRHF+IIE+YGV+IYY APTLIR
Sbjct: 313 CGADIGWVTGHSYLVYGPLSNGATEVVYEGTPNSPNEHRHFEIIERYGVSIYYIAPTLIR 372

Query: 363 MFMKWGREIPDSHDLSSLRLLGSVGEPINPEAWRWYRDVIGGGRTPLVDTWWQTETGSAM 422
            FMKWGREIPD+HDLSSLRLLGSVGEPINPEAWRWYR+VIG    P+VDTWWQTETG+ M
Sbjct: 373 TFMKWGREIPDAHDLSSLRLLGSVGEPINPEAWRWYREVIGHDSCPIVDTWWQTETGAIM 432

Query: 423 ISPLPGIAAAKPGSAMTPLPGISAKIVDDHGDPLPPHTEGAQHVTGYLVLDQPWPSMLRG 482
           ISPLPG+  AKPGSAM PLPGISA IVDD   P+      A    GYLVLD+PWPSMLRG
Sbjct: 433 ISPLPGVTDAKPGSAMRPLPGISATIVDDDAAPV------AAGEQGYLVLDKPWPSMLRG 486

Query: 483 IWGDPARYWHSYWSKFSDKGYYFAGDGARIDPDGAIWVLGRIDDVMNVSGHRISTAEVES 542
           IWGD  R+  +YWS+F+D+G+YFAGDGAR D D A+WVLGRIDDVMNVSGHRIST+EVES
Sbjct: 487 IWGDDQRFIDTYWSRFADQGWYFAGDGARYDDDHALWVLGRIDDVMNVSGHRISTSEVES 546

Query: 543 ALVAHSGVAEAAVVGVTDETTTQAICAFVVLRANYA-PHDRTAEELRTEVARVISPIARP 601
           ALV H  VAEAAV+G  D TT Q I AFV+LR       D    ELR +VA+ ISPIA+P
Sbjct: 547 ALVNHHSVAEAAVIGAADATTGQGIVAFVILRGGVEDTGDTLIAELRDQVAKDISPIAKP 606

Query: 602 RDVHVVPELPKTRSGKIMRRLLRDVAENRELGDTSTLLDPTVFDAIR 648
           R+V VVPELPKTRSGKIMRRLL+DVAE RELGDTSTL+DPTVF+ IR
Sbjct: 607 REVLVVPELPKTRSGKIMRRLLKDVAEGRELGDTSTLVDPTVFEEIR 653


Lambda     K      H
   0.319    0.136    0.433 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 1537
Number of extensions: 93
Number of successful extensions: 6
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 651
Length of database: 657
Length adjustment: 38
Effective length of query: 613
Effective length of database: 619
Effective search space:   379447
Effective search space used:   379447
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.8 bits)
S2: 54 (25.4 bits)
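
The statistics in the BLAST footer above can be cross-checked against the reported score. The sketch below is not part of GapMind; it applies the standard Karlin-Altschul conversion from raw score to bit score using the gapped Lambda and K printed in the report, and recomputes the effective search space from the two lengths and the length adjustment. All variable and function names are illustrative.

```python
import math

# Values copied from the BLAST report above.
raw_score = 2352        # raw score, shown in parentheses on the Score line
gapped_lambda = 0.267   # Lambda from the "Gapped" block
gapped_k = 0.0410       # K from the "Gapped" block
query_len = 651
db_len = 657
length_adjustment = 38


def bit_score(raw, lam, k):
    """Karlin-Altschul normalization: S' = (lambda * S - ln K) / ln 2."""
    return (lam * raw - math.log(k)) / math.log(2)


# Effective lengths are the actual lengths minus the length adjustment.
eff_query_len = query_len - length_adjustment
eff_db_len = db_len - length_adjustment
search_space = eff_query_len * eff_db_len

bits = bit_score(raw_score, gapped_lambda, gapped_k)

# Expected number of chance hits at this score: E = m * n * 2^(-S').
expect = search_space * 2.0 ** (-bits)

print(f"bit score ~ {bits:.1f}")
print(f"effective search space = {search_space}")
print(f"expect ~ {expect:.1e}")
```

Because Lambda and K are printed to only three significant figures, the recomputed bit score agrees with the reported 910 bits only to within about one bit. The expect value is far too small to display at the report's precision, consistent with the `Expect = 0.0` shown above.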

Align candidate WP_076478630.1 BW971_RS08810 (acetate--CoA ligase)
to HMM TIGR02188 (acs: acetate--CoA ligase (EC 6.2.1.1))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR02188.hmm
# target sequence database:        /tmp/gapView.1355379.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR02188  [M=629]
Accession:   TIGR02188
Description: Ac_CoA_lig_AcsA: acetate--CoA ligase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                             Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                             -----------
   3.9e-285  932.9   0.0   4.5e-285  932.7   0.0    1.0  1  NCBI__GCF_900156495.1:WP_076478630.1  


Domain annotation for each sequence (and alignments):
>> NCBI__GCF_900156495.1:WP_076478630.1  
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  932.7   0.0  4.5e-285  4.5e-285       4     627 ..      29     653 ..      26     655 .. 0.97

  Alignments for each domain:
  == domain 1  score: 932.7 bits;  conditional E-value: 4.5e-285
                             TIGR02188   4 leeykelyeeaiedpekfwaklakeelewlkpfekvldeslepkvkWfedgelnvsyncvdrhvek.rkdkva 75 
                                           +++ +ely++a +d+ +fwa++a+  l+w+++f+ vld+s++p++kWf++g+lnv++ncvdrhv++ + d+va
  NCBI__GCF_900156495.1:WP_076478630.1  29 ANAGAELYTRADADRLEFWAEQAR-ALDWDTDFTDVLDWSNAPFAKWFVGGKLNVAVNCVDRHVAAgNGDRVA 100
                                           5667899*****************.6****************************************9****** PP

                             TIGR02188  76 iiwegdeegedsrkltYaellrevcrlanvlkelGvkkgdrvaiYlpmipeaviamlacaRiGavhsvvfaGf 148
                                           i+w g+ +g d+r +tY++ll +v+r+an + ++G++ gdrvaiY+pm+pea+++mlacaR+G +hsvvfaGf
  NCBI__GCF_900156495.1:WP_076478630.1 101 IHWVGE-PG-DHRDITYSQLLGDVSRAANYFDSIGLRPGDRVAIYMPMVPEAIVTMLACARLGLTHSVVFAGF 171
                                           *****9.77.59************************************************************* PP

                             TIGR02188 149 saealaeRivdaeaklvitadeglRggkvielkkivdealekaee...svekvlvvkrtgee.vaewkegrDv 217
                                           s+ al++R+ daeaklv+t+d+++R+gk  +lk++vdeal + ++   sv++vlvv+rt+++   +w++grDv
  NCBI__GCF_900156495.1:WP_076478630.1 172 SSTALRSRVDDAEAKLVVTTDGQYRRGKPAPLKANVDEALGTGDDaavSVQTVLVVRRTNHDpELNWVSGRDV 244
                                           ****************************************987766679**********9752568******* PP

                             TIGR02188 218 wweelvekeasaecepekldsedplfiLYtsGstGkPkGvlhttgGylllaaltvkyvfdikd.edifwCtaD 289
                                           ww+++v++ as+++ep+++dse+plf+LYtsG+tGkPkG+lht+gGyl+++ +t++ vfd+k+  d+fwC aD
  NCBI__GCF_900156495.1:WP_076478630.1 245 WWDDTVAE-ASDHHEPDAFDSEQPLFLLYTSGTTGKPKGILHTSGGYLTQVRYTFHNVFDHKEgRDVFWCGAD 316
                                           *******6.*****************************************************636******** PP

                             TIGR02188 290 vGWvtGhsYivygPLanGattllfegvptypdasrfweviekykvtifYtaPtaiRalmklgeelvkkhdlss 362
                                           +GWvtGhsY+vygPL+nGat++++eg+p+ p+++r++e+ie+y+v+i+Y aPt+iR++mk+g+e++++hdlss
  NCBI__GCF_900156495.1:WP_076478630.1 317 IGWVTGHSYLVYGPLSNGATEVVYEGTPNSPNEHRHFEIIERYGVSIYYIAPTLIRTFMKWGREIPDAHDLSS 389
                                           ************************************************************************* PP

                             TIGR02188 363 lrvlgsvGepinpeaweWyyevvGkekcpivdtwWqtetGgilitplpgvatelkpgsatlPlfGieaevvde 435
                                           lr+lgsvGepinpeaw+Wy+ev+G+++cpivdtwWqtetG+i+i+plpg +t++kpgsa++Pl+Gi+a++vd+
  NCBI__GCF_900156495.1:WP_076478630.1 390 LRLLGSVGEPINPEAWRWYREVIGHDSCPIVDTWWQTETGAIMISPLPG-VTDAKPGSAMRPLPGISATIVDD 461
                                           *************************************************.5********************** PP

                             TIGR02188 436 egkeveeeeeggvLvikkpwPsmlrtiygdeerfvetYfkklkg..lyftGDgarrdkdGyiwilGRvDdvin 506
                                           ++ +v ++e+ g+Lv++kpwPsmlr+i+gd++rf++tY++++ +   yf+GDgar d+d  +w+lGR+Ddv+n
  NCBI__GCF_900156495.1:WP_076478630.1 462 DAAPVAAGEQ-GYLVLDKPWPSMLRGIWGDDQRFIDTYWSRFADqgWYFAGDGARYDDDHALWVLGRIDDVMN 533
                                           *******999.8******************************87779************************** PP

                             TIGR02188 507 vsGhrlgtaeiesalvsheavaeaavvgvpdeikgeaivafvvlkegveedeeelekelkklvrkeigpiakp 579
                                           vsGhr++t+e+esalv+h++vaeaav+g++d ++g+ ivafv+l+ gve + ++l +el+++v+k i+piakp
  NCBI__GCF_900156495.1:WP_076478630.1 534 VSGHRISTSEVESALVNHHSVAEAAVIGAADATTGQGIVAFVILRGGVEDTGDTLIAELRDQVAKDISPIAKP 606
                                           *************************************************9999******************** PP

                             TIGR02188 580 dkilvveelPktRsGkimRRllrkiaegeellgdvstledpsvveelk 627
                                           +++lvv+elPktRsGkimRRll+++aeg+el gd+stl dp+v+ee++
  NCBI__GCF_900156495.1:WP_076478630.1 607 REVLVVPELPKTRSGKIMRRLLKDVAEGREL-GDTSTLVDPTVFEEIR 653
                                           ***************************9875.6************996 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (629 nodes)
Target sequences:                          1  (657 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.01u 0.01s 00:00:00.02 Elapsed: 00:00:00.02
# Mc/sec: 19.44
//
[ok]
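
To see how much of the model and of the candidate the HMM alignment spans, the domain row above can be parsed directly. This is a minimal sketch, not GapMind's own code: it assumes the whitespace-separated column order shown in the hmmsearch domain table header (score, bias, c-Evalue, i-Evalue, hmmfrom, hmm to, alifrom, ali to, ...), and the names `parse_domain_row`, `MODEL_LENGTH`, and `TARGET_LENGTH` are hypothetical.

```python
# Domain row copied from the hmmsearch output above.
domain_row = (
    "1 !  932.7   0.0  4.5e-285  4.5e-285       4     627 ..      29"
    "     653 ..      26     655 .. 0.97"
)

MODEL_LENGTH = 629    # from "Query: TIGR02188 [M=629]"
TARGET_LENGTH = 657   # from "Target sequences: 1 (657 residues searched)"


def parse_domain_row(row):
    """Split one domain row into the fields needed for coverage."""
    f = row.split()
    return {
        "score": float(f[2]),
        "i_evalue": float(f[5]),
        "hmm_from": int(f[6]),
        "hmm_to": int(f[7]),
        "ali_from": int(f[9]),
        "ali_to": int(f[10]),
    }


dom = parse_domain_row(domain_row)

# Coordinates are 1-based and inclusive, hence the +1.
hmm_cov = (dom["hmm_to"] - dom["hmm_from"] + 1) / MODEL_LENGTH
target_cov = (dom["ali_to"] - dom["ali_from"] + 1) / TARGET_LENGTH

print(f"model coverage:  {hmm_cov:.1%}")
print(f"target coverage: {target_cov:.1%}")
```

For this candidate the alignment covers nearly the whole model (positions 4 to 627 of 629) and most of the protein (residues 29 to 653 of 657).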

This GapMind analysis is from Sep 24 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified either by running ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by running HMMER with enzyme models (usually from TIGRFAMs). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."
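
As an illustration of the coverage threshold mentioned above, the following sketch computes percent identity and query coverage for the acs alignment shown earlier on this page. It assumes that "coverage" means the fraction of the characterized (query) protein spanned by the alignment, and it checks only the 50% floor stated above; it does not attempt to reproduce the high- or medium-confidence criteria. The constant `LOW_CONFIDENCE_MIN_COVERAGE` and the other names are hypothetical.

```python
import re

# Lines copied from the BLAST report on this page.
identities_line = (
    "Identities = 447/647 (69%), Positives = 512/647 (79%), "
    "Gaps = 12/647 (1%)"
)
query_start, query_end = 7, 648   # first and last aligned query positions
query_length = 651                # "(651 letters)"

# Assumed reading of the 50% coverage floor described above.
LOW_CONFIDENCE_MIN_COVERAGE = 0.50

# Extract each "Name = count/total" pair from the summary line.
counts = {
    name: (int(num), int(den))
    for name, num, den in re.findall(r"(\w+) = (\d+)/(\d+)", identities_line)
}

ident_num, ident_den = counts["Identities"]
identity = ident_num / ident_den

# Query coordinates are 1-based and inclusive.
query_cov = (query_end - query_start + 1) / query_length

meets_floor = query_cov >= LOW_CONFIDENCE_MIN_COVERAGE

print(f"identity:       {identity:.1%}")
print(f"query coverage: {query_cov:.1%}")
print(f"meets 50% coverage floor: {meets_floor}")
```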

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory