GapMind for catabolism of small carbon sources

 

Alignments for a candidate for acs in Nocardioides daejeonensis MJ31

Align Acetyl-coenzyme A synthetase; AcCoA synthetase; Acs; Acetate--CoA ligase; Acyl-activating enzyme; EC 6.2.1.1 (characterized)
to candidate WP_110207905.1 DNK54_RS14895 acetate--CoA ligase

Query= SwissProt::P9WQD1
         (651 letters)



>NCBI__GCF_003194585.1:WP_110207905.1
          Length = 644

 Score =  835 bits (2156), Expect = 0.0
 Identities = 403/638 (63%), Positives = 486/638 (76%), Gaps = 10/638 (1%)

Query: 12  YPPPAHFAEHANARAELYREAEEDRLAFWAKQANRLSWTTPFTEVLDWSGAPFAKWFVGG 71
           +PPP   A  AN   E Y  AE DR  FWA+QA RL W   +  VLDW   PFAKWF GG
Sbjct: 7   FPPPEALAAAANVTEEAYARAEADREGFWAEQAERLDWGQKWDRVLDWDNPPFAKWFTGG 66

Query: 72  ELNVAYNCVDRHVEAGHGDRVAIHWEGEPVGDRRTLTYSDLLAEVSKAANALTDLGLVAG 131
            +N A NCVDRHV AG G++VAIHW GEP  D R +TY+ L  EV++AANALT+LG+  G
Sbjct: 67  TINAAVNCVDRHVTAGRGEKVAIHWVGEPEDDTRDITYAQLQDEVNRAANALTELGVAKG 126

Query: 132 DRVAIYLPLIPEAVIAMLACARLGIMHSVVFGGFTAAALQARIVDAQAKLLITADGQFRR 191
           DRVAIYLP+IPEAV+ MLACARLG  H+VVFGGF+A AL +RIVD  A++++TADG +RR
Sbjct: 127 DRVAIYLPMIPEAVVTMLACARLGAPHTVVFGGFSADALASRIVDCGARVVVTADGGYRR 186

Query: 192 GKPSPLKAAADEALAAIPDCSVEHVLVVRRTGIEMAWSEGRDLWWHHVVGSASPAHTPEP 251
           G PS LK A DEA+A   D  VEHV+VV+RTG E+A+ E RDLWWH ++G  SP H  E 
Sbjct: 187 GAPSALKPAVDEAVAKA-DGLVEHVVVVQRTGQEVAFDETRDLWWHELMGRQSPQHDAEL 245

Query: 252 FDSEHPLFLLYTSGTTGKPKGIMHTSGGYLTQCCYTMRTIFDVKPDSDVFWCTADIGWVT 311
            D+EHPL+++YTSGTTGKPKGI+HT+GGYL    Y+   +FD+K D+DV+WCTADIGWVT
Sbjct: 246 HDAEHPLYVMYTSGTTGKPKGILHTTGGYLVGTAYSHWAVFDLKADTDVYWCTADIGWVT 305

Query: 312 GHTYGVYGPLCNGVTEVLYEGTPDTPDRHRHFQIIEKYGVTIYYTAPTLIRMFMKWGREI 371
           GH+Y VYGPL NG T+VLYEGTPD+P + R ++IIEKYGVTI+YTAPT IR FMKWG +I
Sbjct: 306 GHSYLVYGPLANGATQVLYEGTPDSPHKGRWWEIIEKYGVTIFYTAPTAIRTFMKWGNDI 365

Query: 372 PDSHDLSSLRLLGSVGEPINPEAWRWYRDVIGGGRTPLVDTWWQTETGSAMISPLPGIAA 431
           P   DLSS+R+LGSVGEPINPEA+ WYR  IGG R P+VDTWWQTETG  MI+P+PG+  
Sbjct: 366 PAKFDLSSIRVLGSVGEPINPEAYVWYRSTIGGDRAPVVDTWWQTETGQIMITPMPGVTH 425

Query: 432 AKPGSAMTPLPGISAKIVDDHGDPLPPHTEGAQHVTGYLVLDQPWPSMLRGIWGDPARYW 491
           AKPGSAM PLPGI A +V+D G+ +P  +       GYL++ +PWP+MLR +WGD  RY 
Sbjct: 426 AKPGSAMRPLPGIVADVVNDEGESVPDGS------GGYLIIREPWPAMLRTVWGDDERYK 479

Query: 492 HSYWSKFSDKGYYFAGDGARIDPDGAIWVLGRIDDVMNVSGHRISTAEVESALVAHSGVA 551
            +YWS++   G YFAGDGA+ D DG IWVLGR+DDVMNVSGHR+ST E+ESALV+H  VA
Sbjct: 480 DTYWSRW--PGVYFAGDGAKKDEDGDIWVLGRVDDVMNVSGHRLSTTEIESALVSHPKVA 537

Query: 552 EAAVVGVTDETTTQAICAFVVLRANYAP-HDRTAEELRTEVARVISPIARPRDVHVVPEL 610
           EAAVVG TDETT QA+CAFV+LR +     D    ELR  VA+ I  IA+PR V +VPEL
Sbjct: 538 EAAVVGATDETTGQAVCAFVILRESAGDGGDEIVAELRNHVAKEIGAIAKPRQVMIVPEL 597

Query: 611 PKTRSGKIMRRLLRDVAENRELGDTSTLLDPTVFDAIR 648
           PKTRSGKIMRRLLRDVAENRE+GD +TL D +V   I+
Sbjct: 598 PKTRSGKIMRRLLRDVAENREIGDVTTLADSSVMSLIQ 635


Lambda     K      H
   0.319    0.136    0.433 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 1506
Number of extensions: 65
Number of successful extensions: 5
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 651
Length of database: 644
Length adjustment: 38
Effective length of query: 613
Effective length of database: 606
Effective search space:   371478
Effective search space used:   371478
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.8 bits)
S2: 54 (25.4 bits)
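
The summary figures in the BLAST report above can be re-derived from the numbers it prints. The following is a minimal Python sketch, not part of GapMind: the function name `parse_fraction_fields` is hypothetical, and the input values are copied from this report. It recomputes the identity, positive, and gap percentages, and checks that the effective search space equals the product of the two effective lengths.

```python
import re


def parse_fraction_fields(line):
    """Return {name: (count, total)} for fields such as 'Identities = 403/638 (63%)'.

    Hypothetical helper for illustration; not part of BLAST or GapMind.
    """
    pattern = re.compile(r"(\w+) = (\d+)/(\d+)")
    return {name: (int(count), int(total))
            for name, count, total in pattern.findall(line)}


# Summary line copied from the report above.
summary = " Identities = 403/638 (63%), Positives = 486/638 (76%), Gaps = 10/638 (1%)"
fields = parse_fraction_fields(summary)

# Percentages over the alignment length (638 columns here).
percent = {name: 100.0 * count / total for name, (count, total) in fields.items()}

# Effective lengths are the raw lengths minus the reported length adjustment.
query_len, db_len, adjustment = 651, 644, 38
effective_query = query_len - adjustment
effective_db = db_len - adjustment
effective_space = effective_query * effective_db

for name, value in percent.items():
    print(f"{name}: {value:.2f}%")
print("Effective search space:", effective_space)
```

The whole-number percentages in the report match these values rounded down, and the product of the effective lengths (613 and 606) reproduces the reported effective search space of 371478.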

Align candidate WP_110207905.1 DNK54_RS14895 (acetate--CoA ligase)
to HMM TIGR02188 (acs: acetate--CoA ligase (EC 6.2.1.1))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR02188.hmm
# target sequence database:        /tmp/gapView.3711669.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR02188  [M=629]
Accession:   TIGR02188
Description: Ac_CoA_lig_AcsA: acetate--CoA ligase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                             Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                             -----------
   2.3e-296  970.0   1.5   2.6e-296  969.8   1.5    1.0  1  NCBI__GCF_003194585.1:WP_110207905.1  


Domain annotation for each sequence (and alignments):
>> NCBI__GCF_003194585.1:WP_110207905.1  
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  969.8   1.5  2.6e-296  2.6e-296       5     627 ..      18     635 ..      14     637 .. 0.98

  Alignments for each domain:
  == domain 1  score: 969.8 bits;  conditional E-value: 2.6e-296
                             TIGR02188   5 eeykelyeeaiedpekfwaklakeelewlkpfekvldeslepkvkWfedgelnvsyncvdrhvek.rkdkvai 76 
                                           +  +e y++a++d+e fwa++a+ +l+w +++++vld++++p++kWf++g++n+++ncvdrhv++ r +kvai
  NCBI__GCF_003194585.1:WP_110207905.1  18 NVTEEAYARAEADREGFWAEQAE-RLDWGQKWDRVLDWDNPPFAKWFTGGTINAAVNCVDRHVTAgRGEKVAI 89 
                                           5678999***************9.5****************************************9******* PP

                             TIGR02188  77 iwegdeegedsrkltYaellrevcrlanvlkelGvkkgdrvaiYlpmipeaviamlacaRiGavhsvvfaGfs 149
                                           +w g+ +++d+r +tYa+l++ev+r+an+l+elGv kgdrvaiYlpmipeav++mlacaR+Ga h+vvf+Gfs
  NCBI__GCF_003194585.1:WP_110207905.1  90 HWVGE-PEDDTRDITYAQLQDEVNRAANALTELGVAKGDRVAIYLPMIPEAVVTMLACARLGAPHTVVFGGFS 161
                                           *****.5568*************************************************************** PP

                             TIGR02188 150 aealaeRivdaeaklvitadeglRggkvielkkivdealekaeesvekvlvvkrtgeevaewkegrDvwweel 222
                                           a+ala+Rivd+ a++v+tad+g+R+g   +lk +vdea++ka+  ve+v+vv+rtg+eva ++e rD+ww+el
  NCBI__GCF_003194585.1:WP_110207905.1 162 ADALASRIVDCGARVVVTADGGYRRGAPSALKPAVDEAVAKADGLVEHVVVVQRTGQEVA-FDETRDLWWHEL 233
                                           **********************************************************77.************ PP

                             TIGR02188 223 vekeasaecepekldsedplfiLYtsGstGkPkGvlhttgGylllaaltvkyvfdik.dedifwCtaDvGWvt 294
                                           + + +s ++++e  d+e+pl+++YtsG+tGkPkG+lhttgGyl+ +a+++  vfd+k d+d++wCtaD+GWvt
  NCBI__GCF_003194585.1:WP_110207905.1 234 MGR-QSPQHDAELHDAEHPLYVMYTSGTTGKPKGILHTTGGYLVGTAYSHWAVFDLKaDTDVYWCTADIGWVT 305
                                           **6.****************************************************9899************* PP

                             TIGR02188 295 GhsYivygPLanGattllfegvptypdasrfweviekykvtifYtaPtaiRalmklgeelvkkhdlsslrvlg 367
                                           GhsY+vygPLanGat++l+eg+p+ p+++r+we+ieky+vtifYtaPtaiR++mk+g+++++k dlss+rvlg
  NCBI__GCF_003194585.1:WP_110207905.1 306 GHSYLVYGPLANGATQVLYEGTPDSPHKGRWWEIIEKYGVTIFYTAPTAIRTFMKWGNDIPAKFDLSSIRVLG 378
                                           ************************************************************************* PP

                             TIGR02188 368 svGepinpeaweWyyevvGkekcpivdtwWqtetGgilitplpgvatelkpgsatlPlfGieaevvdeegkev 440
                                           svGepinpea+ Wy++++G +++p+vdtwWqtetG i+itp+pg +t++kpgsa++Pl+Gi a+vv++eg++v
  NCBI__GCF_003194585.1:WP_110207905.1 379 SVGEPINPEAYVWYRSTIGGDRAPVVDTWWQTETGQIMITPMPG-VTHAKPGSAMRPLPGIVADVVNDEGESV 450
                                           ********************************************.5*************************** PP

                             TIGR02188 441 eeeeeggvLvikkpwPsmlrtiygdeerfvetYfkklkglyftGDgarrdkdGyiwilGRvDdvinvsGhrlg 513
                                            ++++ g+L+i++pwP+mlrt++gd+er+ +tY+++ +g+yf+GDga++d+dG+iw+lGRvDdv+nvsGhrl+
  NCBI__GCF_003194585.1:WP_110207905.1 451 PDGSG-GYLIIREPWPAMLRTVWGDDERYKDTYWSRWPGVYFAGDGAKKDEDGDIWVLGRVDDVMNVSGHRLS 522
                                           *9999.8****************************************************************** PP

                             TIGR02188 514 taeiesalvsheavaeaavvgvpdeikgeaivafvvlkegveedeeelekelkklvrkeigpiakpdkilvve 586
                                           t+eiesalvsh++vaeaavvg++de++g+a++afv+l+e++    +e+ +el+++v+keig+iakp+++++v+
  NCBI__GCF_003194585.1:WP_110207905.1 523 TTEIESALVSHPKVAEAAVVGATDETTGQAVCAFVILRESAGDGGDEIVAELRNHVAKEIGAIAKPRQVMIVP 595
                                           ******************************************99999************************** PP

                             TIGR02188 587 elPktRsGkimRRllrkiaegeellgdvstledpsvveelk 627
                                           elPktRsGkimRRllr++ae++e+ gdv+tl+d+sv+  ++
  NCBI__GCF_003194585.1:WP_110207905.1 596 ELPKTRSGKIMRRLLRDVAENREI-GDVTTLADSSVMSLIQ 635
                                           ********************8876.5**********98776 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (629 nodes)
Target sequences:                          1  (644 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.01u 0.01s 00:00:00.02 Elapsed: 00:00:00.01
# Mc/sec: 26.49
//
[ok]
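
The domain table in the hmmsearch report above gives the aligned range on both the model and the candidate protein, so the fraction of each that is covered can be computed directly. This is a minimal Python sketch, assuming the whitespace-separated column order shown in the table header above; `parse_domain_row` is a hypothetical helper, not an HMMER or GapMind function, and the lengths (629 model nodes, 644 residues) are taken from this report.

```python
def parse_domain_row(row):
    """Parse one row of the per-domain table printed by hmmsearch.

    Assumes the column order shown in the report above:
    #, flag, score, bias, c-Evalue, i-Evalue, hmmfrom, hmm to, '..',
    alifrom, ali to, '..', envfrom, env to, '..', acc.
    """
    tokens = row.split()
    return {
        "score": float(tokens[2]),
        "hmm_from": int(tokens[6]),
        "hmm_to": int(tokens[7]),
        "ali_from": int(tokens[9]),
        "ali_to": int(tokens[10]),
        "acc": float(tokens[15]),
    }


# Row copied from the domain table above.
row = ("   1 !  969.8   1.5  2.6e-296  2.6e-296       5     627 .."
       "      18     635 ..      14     637 .. 0.98")
dom = parse_domain_row(row)

model_len = 629      # M=629 in the report
sequence_len = 644   # residues searched in the report

# Coordinates are 1-based and inclusive, hence the +1.
model_cov = (dom["hmm_to"] - dom["hmm_from"] + 1) / model_len
seq_cov = (dom["ali_to"] - dom["ali_from"] + 1) / sequence_len

print(f"Model coverage: {model_cov:.3f}")
print(f"Sequence coverage: {seq_cov:.3f}")
```

For this candidate, the single domain spans 623 of the 629 model positions and 618 of the 644 residues.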

This GapMind analysis is from Sep 24 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMER with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."
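
As an illustration of the 50% coverage threshold, the sketch below computes alignment coverage from 1-based inclusive coordinates and compares it to that cutoff. It is a minimal Python sketch, not GapMind's implementation: the function names are hypothetical, and it assumes coverage is measured over the characterized protein, which this page does not state. It covers only the coverage test and does not attempt the high- or medium-confidence rules.

```python
LOW_CONFIDENCE_MIN_COVERAGE = 0.5  # "at least 50% coverage" from the text above


def alignment_coverage(start, end, length):
    """Fraction of a sequence covered by an alignment (1-based, inclusive)."""
    if not 1 <= start <= end <= length:
        raise ValueError("expected 1 <= start <= end <= length")
    return (end - start + 1) / length


def passes_coverage_threshold(start, end, length,
                              min_coverage=LOW_CONFIDENCE_MIN_COVERAGE):
    """True if the alignment covers at least min_coverage of the sequence."""
    return alignment_coverage(start, end, length) >= min_coverage


# Coordinates from the BLAST alignment on this page: the characterized
# protein P9WQD1 (651 residues) is aligned from position 12 to 648.
coverage = alignment_coverage(12, 648, 651)
print(f"Coverage of characterized protein: {coverage:.3f}")
print("Meets 50% coverage:", passes_coverage_threshold(12, 648, 651))
```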

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory