GapMind for catabolism of small carbon sources

 

Alignments for a candidate for acs in Archaeoglobus veneficus SNP6

Align acetate-CoA ligase (EC 6.2.1.1) (characterized)
to candidate WP_013683570.1 ARCVE_RS04375 acetate--CoA ligase

Query= BRENDA::A0B8F1
         (659 letters)



>NCBI__GCF_000194625.1:WP_013683570.1
          Length = 647

 Score =  709 bits (1831), Expect = 0.0
 Identities = 341/642 (53%), Positives = 461/642 (71%), Gaps = 20/642 (3%)

Query: 27  SNSYQWMKKKGFKTEKEMREWC----------AQNYLDFWDEMAQTYADWFKPYTQILE- 75
           SNS ++   + ++  +E R++           AQ+YL FWDE+A+   +WF+PY  +L+ 
Sbjct: 2   SNSKEFSPGEIYEPSEETRKYAWVNNERIYEMAQDYLTFWDEVAKNDVEWFEPYEDVLDD 61

Query: 76  WNPPYAKWFLGGKCNVAHNAVDRHAKSWRRNKVAYYFVGEPVGDTKTITYYQLYQAVNKM 135
            N P+ +WF+GGK N+ HN +DRH K  + +K A  + GEP  + + ITY++LYQ V + 
Sbjct: 62  SNAPFYRWFVGGKINITHNCLDRHIKL-KGDKTAIIWQGEPENEKEKITYHELYQRVCRF 120

Query: 136 ANGLKSLGVKKG-DRVSIYLPMIPELPITMLACAKIGAIHSVVFSGFSAGGLQSRVTDAE 194
           AN L++LGV++  D V+IY+ M+PELPI MLACA+IGAIHSVVF GFS+  L+ R+ DAE
Sbjct: 121 ANALRTLGVEEEEDVVTIYMGMVPELPIAMLACARIGAIHSVVFGGFSSKALRDRINDAE 180

Query: 195 AKVVVTSDGFYRRGKPLPLKPNVDEAVQNAPSVEKVVVVKRVGLDVPMKEGRDIWYHDLV 254
           +KVVVT DG+YRRGK +  K  VDEA++NAPSVE VVV++R G DV M EGRD+W+H+L 
Sbjct: 181 SKVVVTMDGYYRRGKVVETKRIVDEALENAPSVESVVVLERTGNDVNMVEGRDVWWHELE 240

Query: 255 KDQPAECYTEELDPEDRLFILYTSGTTGKPKGIEHAHGGFCVGPAYTTAWALDVHEEDVY 314
           +  P +C    LD E  LFILYTSGTTGKPKG+ H HGG+ VG   TT W  D+ + D++
Sbjct: 241 EGLPDKCECRPLDSEHTLFILYTSGTTGKPKGVLHVHGGYNVGTHITTKWVFDLKDRDIF 300

Query: 315 WCTADCGWITGHSYVVYGPLCLGATSILYEGAPDYPDIGRWWSIIEEYGVSVFYTAPTAI 374
           WCTAD GWITGHSYVVYGPL +GAT ++YEGAPD+P   RWWSIIE+YGV++ YTAPTAI
Sbjct: 301 WCTADIGWITGHSYVVYGPLSVGATVLMYEGAPDHPQPDRWWSIIEQYGVTILYTAPTAI 360

Query: 375 RMFMKAGDQWPKKYNLKSIRILASVGEPLNPEAYVWFRNNIGGGQAPIIDTWWQTETGCH 434
           R FMK G++WPKK++L S+R+L +VGE ++P+A+ W+  +IG  + PI+DTWWQTETG  
Sbjct: 361 RYFMKLGEEWPKKHDLSSLRLLGTVGETIDPKAWKWYYKHIGNERCPIVDTWWQTETGMI 420

Query: 435 VIAPLP-MTPEKPGSVAFPLPGFNTDIYDEDGNSVPLGYGGNIVQKTPWPSMLRAFFRDP 493
           +I PLP +T  KPGS   P PG   DI DE G S      G +V   PWP+M R  + +P
Sbjct: 421 MITPLPGITKLKPGSATLPFPGIEVDIRDEKGYSAD---SGELVITNPWPAMFRTLWGEP 477

Query: 494 ERYMKEYWQMYWDIKPGTYLAGDKATRDKDGYWWIQGRIDDVLKVAGHRISNAEVESAAV 553
           ER+ K+YW+    +    Y  GD A +D++GY+WI GRID+VLKV+GHR+ +AE+E A +
Sbjct: 478 ERFAKQYWRSNGGL---IYYTGDGARKDENGYYWIIGRIDEVLKVSGHRLGSAEIEGALI 534

Query: 554 SHPAVAEAAVIGKPDEVKGEVIVAFIILKEGVQESEDLKKDIAKHVRSVLGPVAYPEIVY 613
           SH AV+EAAV+GKP E+KGE  VAF++LK G + S +L++D+ KHVR+ +G +A PE +Y
Sbjct: 535 SHEAVSEAAVVGKPHEIKGETPVAFVVLKTGYEPSVELEEDLKKHVRNEIGAIAVPEGIY 594

Query: 614 FVKDVPKTRSGKIMRRVIKAKALGKPVGDISALANPESVENI 655
           FV+ +PKTRSGKIMRR++ A   G+ VGDI+ L +   VE I
Sbjct: 595 FVEQLPKTRSGKIMRRILLAIEKGEEVGDITTLEDVTVVEKI 636


Lambda     K      H
   0.318    0.137    0.438 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 1375
Number of extensions: 73
Number of successful extensions: 7
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 659
Length of database: 647
Length adjustment: 38
Effective length of query: 621
Effective length of database: 609
Effective search space:   378189
Effective search space used:   378189
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 54 (25.4 bits)
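
The summary figures in the BLAST report above can be cross-checked by hand. The Python sketch below recomputes the effective search space from the reported lengths and approximates the bit score from the raw score with the standard Karlin-Altschul conversion, S' = (lambda * S - ln K) / ln 2. The function names are illustrative and not part of BLAST or GapMind. Because the report prints lambda and K in rounded form, the recomputed bit score is expected to agree only approximately with the printed value.

```python
import math

def effective_search_space(query_len, db_len, length_adjustment, n_seqs=1):
    """Effective search space: both lengths reduced by the length adjustment.

    Assumes the adjustment is subtracted once from the query and once per
    database sequence, which matches the figures in the report above.
    """
    eff_query = query_len - length_adjustment
    eff_db = db_len - n_seqs * length_adjustment
    return eff_query * eff_db

def bit_score(raw_score, lam, k):
    """Convert a raw alignment score to bits using Karlin-Altschul parameters."""
    return (lam * raw_score - math.log(k)) / math.log(2)

# Values copied from the report: query 659, database 647, adjustment 38.
space = effective_search_space(659, 647, 38)

# Gapped parameters from the report (rounded): lambda = 0.267, K = 0.0410.
bits = bit_score(1831, 0.267, 0.0410)

print(space)           # compare with "Effective search space"
print(round(bits, 1))  # compare with the reported "709 bits"
```

The gapped lambda and K are used here because the reported score comes from a gapped alignment.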

Align candidate WP_013683570.1 ARCVE_RS04375 (acetate--CoA ligase)
to HMM TIGR02188 (acs: acetate--CoA ligase (EC 6.2.1.1))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR02188.hmm
# target sequence database:        /tmp/gapView.2308577.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR02188  [M=629]
Accession:   TIGR02188
Description: Ac_CoA_lig_AcsA: acetate--CoA ligase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                             Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                             -----------
   1.4e-292  957.5   1.4   1.8e-292  957.2   1.4    1.0  1  NCBI__GCF_000194625.1:WP_013683570.1  


Domain annotation for each sequence (and alignments):
>> NCBI__GCF_000194625.1:WP_013683570.1  
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  957.2   1.4  1.8e-292  1.8e-292      12     628 ..      31     638 ..      21     639 .. 0.97

  Alignments for each domain:
  == domain 1  score: 957.2 bits;  conditional E-value: 1.8e-292
                             TIGR02188  12 eeaiedpekfwaklakeelewlkpfekvldeslepkvkWfedgelnvsyncvdrhvekrkdkvaiiwegdeeg 84 
                                            e ++d+ +fw++ ak+ +ew++p+e vld+s++p+++Wf++g++n++ nc+drh++ + dk+aiiw g+ e+
  NCBI__GCF_000194625.1:WP_013683570.1  31 YEMAQDYLTFWDEVAKNDVEWFEPYEDVLDDSNAPFYRWFVGGKINITHNCLDRHIKLKGDKTAIIWQGEPEN 103
                                           456789****************************************************************555 PP

                             TIGR02188  85 edsrkltYaellrevcrlanvlkelGvkk.gdrvaiYlpmipeaviamlacaRiGavhsvvfaGfsaealaeR 156
                                            +++k+tY+el+++vcr+an+l++lGv++  d v+iY+ m+pe+ iamlacaRiGa+hsvvf+Gfs++al++R
  NCBI__GCF_000194625.1:WP_013683570.1 104 -EKEKITYHELYQRVCRFANALRTLGVEEeEDVVTIYMGMVPELPIAMLACARIGAIHSVVFGGFSSKALRDR 175
                                           .69************************86268899************************************** PP

                             TIGR02188 157 ivdaeaklvitadeglRggkvielkkivdealekaeesvekvlvvkrtgeevaewkegrDvwweelvekeasa 229
                                           i+dae+k+v+t d+ +R+gkv+e+k+ivdeale+a+ sve+v+v++rtg++v+ ++egrDvww+el e + ++
  NCBI__GCF_000194625.1:WP_013683570.1 176 INDAESKVVVTMDGYYRRGKVVETKRIVDEALENAP-SVESVVVLERTGNDVN-MVEGRDVWWHELEE-GLPD 245
                                           ***********************************9.7*************66.************99.6*** PP

                             TIGR02188 230 ecepekldsedplfiLYtsGstGkPkGvlhttgGylllaaltvkyvfdikdedifwCtaDvGWvtGhsYivyg 302
                                           +ce+++ldse+ lfiLYtsG+tGkPkGvlh  gGy + +++t+k+vfd+kd difwCtaD+GW+tGhsY+vyg
  NCBI__GCF_000194625.1:WP_013683570.1 246 KCECRPLDSEHTLFILYTSGTTGKPKGVLHVHGGYNVGTHITTKWVFDLKDRDIFWCTADIGWITGHSYVVYG 318
                                           ************************************************************************* PP

                             TIGR02188 303 PLanGattllfegvptypdasrfweviekykvtifYtaPtaiRalmklgeelvkkhdlsslrvlgsvGepinp 375
                                           PL++Gat l++eg+p++p+++r+w++ie+y+vti+YtaPtaiR +mklgee++kkhdlsslr+lg+vGe i+p
  NCBI__GCF_000194625.1:WP_013683570.1 319 PLSVGATVLMYEGAPDHPQPDRWWSIIEQYGVTILYTAPTAIRYFMKLGEEWPKKHDLSSLRLLGTVGETIDP 391
                                           ************************************************************************* PP

                             TIGR02188 376 eaweWyyevvGkekcpivdtwWqtetGgilitplpgvatelkpgsatlPlfGieaevvdeegkeveeeeeggv 448
                                           +aw+Wyy+++G+e+cpivdtwWqtetG i+itplpg +t+lkpgsatlP++Gie+++ de+g ++++    g 
  NCBI__GCF_000194625.1:WP_013683570.1 392 KAWKWYYKHIGNERCPIVDTWWQTETGMIMITPLPG-ITKLKPGSATLPFPGIEVDIRDEKGYSADS----GE 459
                                           ************************************.5************************98765....67 PP

                             TIGR02188 449 LvikkpwPsmlrtiygdeerfvetYfkklkg.lyftGDgarrdkdGyiwilGRvDdvinvsGhrlgtaeiesa 520
                                           Lvi++pwP+m+rt++g++erf ++Y+++  g +y+tGDgar+d++Gy+wi+GR+D+v++vsGhrlg+aeie a
  NCBI__GCF_000194625.1:WP_013683570.1 460 LVITNPWPAMFRTLWGEPERFAKQYWRSNGGlIYYTGDGARKDENGYYWIIGRIDEVLKVSGHRLGSAEIEGA 532
                                           9*************************9888789**************************************** PP

                             TIGR02188 521 lvsheavaeaavvgvpdeikgeaivafvvlkegveedeeelekelkklvrkeigpiakpdkilvveelPktRs 593
                                           l+sheav+eaavvg+p+eikge+ vafvvlk+g+e++ e le++lkk+vr+eig+ia p+ i++ve+lPktRs
  NCBI__GCF_000194625.1:WP_013683570.1 533 LISHEAVSEAAVVGKPHEIKGETPVAFVVLKTGYEPSVE-LEEDLKKHVRNEIGAIAVPEGIYFVEQLPKTRS 604
                                           **************************************5.********************************* PP

                             TIGR02188 594 GkimRRllrkiaegeellgdvstledpsvveelke 628
                                           GkimRR+l +i +gee+ gd++tled +vve++ke
  NCBI__GCF_000194625.1:WP_013683570.1 605 GKIMRRILLAIEKGEEV-GDITTLEDVTVVEKIKE 638
                                           ***********998776.5*************986 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (629 nodes)
Target sequences:                          1  (647 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.01u 0.01s 00:00:00.02 Elapsed: 00:00:00.02
# Mc/sec: 16.72
//
[ok]
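
The per-domain row in the hmmsearch output above gives the aligned coordinates on both the model and the candidate, so the fraction of each that is covered can be derived from it. The following is a minimal sketch that assumes the whitespace-separated column layout shown in this report; `parse_domain_row` and `span_coverage` are hypothetical helper names, not HMMER functions.

```python
def parse_domain_row(row):
    """Parse one row of the hmmsearch per-domain table into a dict.

    Column positions follow the header shown in the report:
    #, flag, score, bias, c-Evalue, i-Evalue, hmmfrom, hmm to, '..',
    alifrom, ali to, '..', envfrom, env to, '..', acc.
    """
    f = row.split()
    return {
        "score": float(f[2]),
        "i_evalue": float(f[5]),
        "hmm_from": int(f[6]),
        "hmm_to": int(f[7]),
        "ali_from": int(f[9]),
        "ali_to": int(f[10]),
        "acc": float(f[15]),
    }

def span_coverage(start, end, total_len):
    """Fraction of a sequence or model covered by the inclusive span."""
    return (end - start + 1) / total_len

row = ("   1 !  957.2   1.4  1.8e-292  1.8e-292      12     628 .."
       "      31     638 ..      21     639 .. 0.97")
dom = parse_domain_row(row)

# Model length (M=629) and candidate length (647) are taken from the report.
hmm_cov = span_coverage(dom["hmm_from"], dom["hmm_to"], 629)
seq_cov = span_coverage(dom["ali_from"], dom["ali_to"], 647)

print(f"model coverage: {hmm_cov:.2%}, candidate coverage: {seq_cov:.2%}")
```

For scripted use, hmmsearch can also write a machine-readable per-domain table with its `--domtblout` option, which avoids parsing the human-readable report.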

This GapMind analysis is from Apr 09 2024. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMER with enzyme models (usually from TIGRFAMs). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."
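
As a rough illustration of the coverage criterion, the sketch below computes percent identity and coverage from the figures in the BLAST alignment on this page. It assumes that "coverage" means the fraction of the characterized (query) protein that is aligned, which this page does not state explicitly, and it checks only the 50% threshold. It does not reproduce the high- or medium-confidence rules, which are not listed here. All names are illustrative.

```python
MIN_COVERAGE = 0.5  # the "at least 50% coverage" threshold quoted above

def percent_identity(identities, alignment_len):
    """Identical positions as a percentage of the alignment length."""
    return 100.0 * identities / alignment_len

def query_coverage(q_start, q_end, q_len):
    """Fraction of the query covered by the inclusive aligned span.

    Assumption: coverage is measured on the characterized (query) protein.
    """
    return (q_end - q_start + 1) / q_len

# Figures from the alignment above: 341 identities over 642 columns,
# query residues 27 to 655 of a 659-residue characterized protein.
identity = percent_identity(341, 642)
coverage = query_coverage(27, 655, 659)
meets_threshold = coverage >= MIN_COVERAGE

print(f"identity {identity:.0f}%, coverage {coverage:.0%}, "
      f"meets 50% coverage: {meets_threshold}")
```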

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory