GapMind for catabolism of small carbon sources

 

Alignments for a candidate for acs in Nocardioides dokdonensis FR1436

Align Acetyl-coenzyme A synthetase; AcCoA synthetase; Acs; Acetate--CoA ligase; Acyl-activating enzyme; EC 6.2.1.1 (characterized)
to candidate WP_068107934.1 I601_RS07695 acetate--CoA ligase

Query= SwissProt::P9WQD1
         (651 letters)



>NCBI__GCF_001653335.1:WP_068107934.1
          Length = 661

 Score =  842 bits (2175), Expect = 0.0
 Identities = 415/642 (64%), Positives = 491/642 (76%), Gaps = 13/642 (2%)

Query: 12  YPPPAHFAEHANARAELYREAEEDRLAFWAKQANRLSWTTPFTEVLDWSGAPFAKWFVGG 71
           + PPA  A +AN  AE Y  A  D  AFWA+QA RL+W TP+  VL+W   PFAKWF GG
Sbjct: 16  FEPPAELAANANVTAEAYDAAAADPEAFWAEQAGRLTWATPWDRVLNWDDPPFAKWFEGG 75

Query: 72  ELNVAYNCVDRHVEAGHGDRVAIHWEGEPVGDRRTLTYSDLLAEVSKAANALTDLGLVAG 131
            LN AYNCVDRHVEAG GD+VA H+ GEPV D R +TY++L   V +AAN L DLG+  G
Sbjct: 76  RLNAAYNCVDRHVEAGRGDKVAFHFVGEPVDDTRDITYAELQDLVCQAANTLIDLGVQTG 135

Query: 132 DRVAIYLPLIPEAVIAMLACARLGIMHSVVFGGFTAAALQARIVDAQAKLLITADGQFRR 191
           DRVAIY+P+IPEAV+AMLACAR+G  H+VVFGGF++ AL +R+ D QAK+++TADG +RR
Sbjct: 136 DRVAIYMPMIPEAVVAMLACARIGAPHTVVFGGFSSDALASRLEDCQAKVVVTADGGYRR 195

Query: 192 GKPSPLKAAADEAL--AAIPDCSVEHVLVVRRTGIEM---AWSEGRDLWWHHVVGSASPA 246
           G PS LK A DEA   AA    +VE VLVVRRTG ++   +W +  D+WWH  V  AS  
Sbjct: 196 GAPSALKPAVDEARVKAAGLGHTVEKVLVVRRTGQDLGADSWDDAVDVWWHESVDRASTE 255

Query: 247 HTPEPFDSEHPLFLLYTSGTTGKPKGIMHTSGGYLTQCCYTMRTIFDVKPDSDVFWCTAD 306
           H  E FD+EHPL+++YTSGTTGKPKGI+HT+GGYLT   YT   +FD+KPD DVFWCTAD
Sbjct: 256 HAHEAFDAEHPLYVMYTSGTTGKPKGILHTTGGYLTGAAYTHWAVFDIKPD-DVFWCTAD 314

Query: 307 IGWVTGHTYGVYGPLCNGVTEVLYEGTPDTPDRHRHFQIIEKYGVTIYYTAPTLIRMFMK 366
           IGWVTGH+Y VYGPL NGVT+VLYEGTPD+P++ R +QIIE+ GVTI YTAPT IR FMK
Sbjct: 315 IGWVTGHSYIVYGPLANGVTQVLYEGTPDSPEKGRWWQIIEQQGVTILYTAPTAIRSFMK 374

Query: 367 WGREIPDSHDLSSLRLLGSVGEPINPEAWRWYRDVIGGGRTPLVDTWWQTETGSAMISPL 426
            GREIPD HDLSSLRLLGSVGEPINPEA+ WYR VIGG RTP+VDTWWQTETG  +ISPL
Sbjct: 375 QGREIPDRHDLSSLRLLGSVGEPINPEAYVWYRHVIGGDRTPVVDTWWQTETGQILISPL 434

Query: 427 PGIAAAKPGSAMTPLPGISAKIVDDHGDPLPPHTEGAQHVTGYLVLDQPWPSMLRGIWGD 486
           PG+ A KPGSAM P+PGISA +VD+ G       E A+   GYLVL +PWP+MLR IWGD
Sbjct: 435 PGVTAGKPGSAMVPIPGISAAVVDEEG------REVAKGGGGYLVLTKPWPAMLRTIWGD 488

Query: 487 PARYWHSYWSKFSDKGYYFAGDGARIDPDGAIWVLGRIDDVMNVSGHRISTAEVESALVA 546
             RY  +YW++F   GYYFAGDGA++D DG +WVLGR+DDVMNVSGHR+ST E+ESALV+
Sbjct: 489 DERYKETYWARFEKLGYYFAGDGAKLDDDGDLWVLGRVDDVMNVSGHRLSTTEIESALVS 548

Query: 547 HSGVAEAAVVGVTDETTTQAICAFVVLRANYAPHDR-TAEELRTEVARVISPIARPRDVH 605
           H  VAEAAVVG  DE T QA+CAFV+LR           EELR  V + I PIA+PR + 
Sbjct: 549 HPKVAEAAVVGAQDEDTGQAVCAFVILRDEAGDGGADIVEELRAHVRKEIGPIAKPRQIM 608

Query: 606 VVPELPKTRSGKIMRRLLRDVAENRELGDTSTLLDPTVFDAI 647
           +VPELPKTRSGKIMRRLLRDVAENRE+GD +TL D +V D I
Sbjct: 609 IVPELPKTRSGKIMRRLLRDVAENREVGDVTTLADSSVMDLI 650


Lambda     K      H
   0.319    0.136    0.433 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 1562
Number of extensions: 83
Number of successful extensions: 6
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 651
Length of database: 661
Length adjustment: 38
Effective length of query: 613
Effective length of database: 623
Effective search space:   381899
Effective search space used:   381899
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.8 bits)
S2: 54 (25.4 bits)
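
The bit scores in this report can be related to the raw scores through the gapped Lambda and K values printed above. The following is a minimal sketch, assuming the standard Karlin-Altschul conversion (bit score = (Lambda * raw - ln K) / ln 2, and E = search space * 2^-bits); the function names are illustrative and are not part of BLAST or GapMind.

```python
import math

# Gapped Karlin-Altschul parameters, copied from the report above.
GAPPED_LAMBDA = 0.267
GAPPED_K = 0.0410


def bit_score(raw_score, lam=GAPPED_LAMBDA, k=GAPPED_K):
    """Convert a raw alignment score to a bit score (hypothetical helper)."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def expect_value(bits, search_space):
    """Expected number of chance hits scoring at least `bits` (hypothetical helper)."""
    return search_space * 2.0 ** (-bits)


# Effective search space = effective query length * effective database length,
# each being the actual length minus the reported length adjustment (38).
search_space = (651 - 38) * (661 - 38)

hit_bits = bit_score(2175)      # raw score of the reported alignment
cutoff_bits = bit_score(54)     # the S2 cutoff, reported as 25.4 bits
evalue = expect_value(hit_bits, search_space)

print(search_space, round(hit_bits), round(cutoff_bits, 1), evalue)
```

Under these assumptions the computed search space matches the reported 381899, and the E-value is many orders of magnitude below anything BLAST would print, which is consistent with the reported `Expect = 0.0`.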

Align candidate WP_068107934.1 I601_RS07695 (acetate--CoA ligase)
to HMM TIGR02188 (acs: acetate--CoA ligase (EC 6.2.1.1))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR02188.hmm
# target sequence database:        /tmp/gapView.1392951.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR02188  [M=629]
Accession:   TIGR02188
Description: Ac_CoA_lig_AcsA: acetate--CoA ligase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                             Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                             -----------
   3.6e-291  952.9   0.1   4.1e-291  952.7   0.1    1.0  1  NCBI__GCF_001653335.1:WP_068107934.1  


Domain annotation for each sequence (and alignments):
>> NCBI__GCF_001653335.1:WP_068107934.1  
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  952.7   0.1  4.1e-291  4.1e-291       4     627 ..      26     651 ..      23     653 .. 0.97

  Alignments for each domain:
  == domain 1  score: 952.7 bits;  conditional E-value: 4.1e-291
                             TIGR02188   4 leeykelyeeaiedpekfwaklakeelewlkpfekvldeslepkvkWfedgelnvsyncvdrhvek.rkdkva 75 
                                           ++  +e y+ a++dpe+fwa++a  +l+w++p+++vl+++ +p++kWfe+g+ln++yncvdrhve+ r dkva
  NCBI__GCF_001653335.1:WP_068107934.1  26 ANVTAEAYDAAAADPEAFWAEQAG-RLTWATPWDRVLNWDDPPFAKWFEGGRLNAAYNCVDRHVEAgRGDKVA 97 
                                           56678999****************.5*********************************************** PP

                             TIGR02188  76 iiwegdeegedsrkltYaellrevcrlanvlkelGvkkgdrvaiYlpmipeaviamlacaRiGavhsvvfaGf 148
                                            ++ g+ + +d+r +tYael++ vc++an+l +lGv+ gdrvaiY+pmipeav+amlacaRiGa h+vvf+Gf
  NCBI__GCF_001653335.1:WP_068107934.1  98 FHFVGE-PVDDTRDITYAELQDLVCQAANTLIDLGVQTGDRVAIYMPMIPEAVVAMLACARIGAPHTVVFGGF 169
                                           ******.5568************************************************************** PP

                             TIGR02188 149 saealaeRivdaeaklvitadeglRggkvielkkivdealekaee...svekvlvvkrtgeev..aewkegrD 216
                                           s++ala+R++d++ak+v+tad+g+R+g   +lk +vdea  ka     +vekvlvv+rtg+++   +w++  D
  NCBI__GCF_001653335.1:WP_068107934.1 170 SSDALASRLEDCQAKVVVTADGGYRRGAPSALKPAVDEARVKAAGlghTVEKVLVVRRTGQDLgaDSWDDAVD 242
                                           ***************************************766654467************9863368****** PP

                             TIGR02188 217 vwweelvekeasaecepekldsedplfiLYtsGstGkPkGvlhttgGylllaaltvkyvfdikdedifwCtaD 289
                                           vww+e v++ as+e++ e++d+e+pl+++YtsG+tGkPkG+lhttgGyl+ aa+t+  vfdik++d+fwCtaD
  NCBI__GCF_001653335.1:WP_068107934.1 243 VWWHESVDR-ASTEHAHEAFDAEHPLYVMYTSGTTGKPKGILHTTGGYLTGAAYTHWAVFDIKPDDVFWCTAD 314
                                           ********6.*************************************************************** PP

                             TIGR02188 290 vGWvtGhsYivygPLanGattllfegvptypdasrfweviekykvtifYtaPtaiRalmklgeelvkkhdlss 362
                                           +GWvtGhsYivygPLanG+t++l+eg+p+ p+++r+w++ie+ +vti+YtaPtaiR++mk+g+e++++hdlss
  NCBI__GCF_001653335.1:WP_068107934.1 315 IGWVTGHSYIVYGPLANGVTQVLYEGTPDSPEKGRWWQIIEQQGVTILYTAPTAIRSFMKQGREIPDRHDLSS 387
                                           ************************************************************************* PP

                             TIGR02188 363 lrvlgsvGepinpeaweWyyevvGkekcpivdtwWqtetGgilitplpgvatelkpgsatlPlfGieaevvde 435
                                           lr+lgsvGepinpea+ Wy++v+G +++p+vdtwWqtetG ili+plpg +t+ kpgsa++P++Gi+a+vvde
  NCBI__GCF_001653335.1:WP_068107934.1 388 LRLLGSVGEPINPEAYVWYRHVIGGDRTPVVDTWWQTETGQILISPLPG-VTAGKPGSAMVPIPGISAAVVDE 459
                                           *************************************************.5********************** PP

                             TIGR02188 436 egkeveeeeeggvLvikkpwPsmlrtiygdeerfvetYfkklkg..lyftGDgarrdkdGyiwilGRvDdvin 506
                                           eg+ev ++ + g+Lv++kpwP+mlrti+gd+er+ etY+ ++++  +yf+GDga+ d+dG++w+lGRvDdv+n
  NCBI__GCF_001653335.1:WP_068107934.1 460 EGREVAKGGG-GYLVLTKPWPAMLRTIWGDDERYKETYWARFEKlgYYFAGDGAKLDDDGDLWVLGRVDDVMN 531
                                           ******9988.8*******************************999*************************** PP

                             TIGR02188 507 vsGhrlgtaeiesalvsheavaeaavvgvpdeikgeaivafvvlkegveedeeelekelkklvrkeigpiakp 579
                                           vsGhrl+t+eiesalvsh++vaeaavvg++de +g+a++afv+l++++     ++ +el+++vrkeigpiakp
  NCBI__GCF_001653335.1:WP_068107934.1 532 VSGHRLSTTEIESALVSHPKVAEAAVVGAQDEDTGQAVCAFVILRDEAGDGGADIVEELRAHVRKEIGPIAKP 604
                                           ************************************************999999******************* PP

                             TIGR02188 580 dkilvveelPktRsGkimRRllrkiaegeellgdvstledpsvveelk 627
                                           ++i++v+elPktRsGkimRRllr++ae++e+ gdv+tl+d+sv++ ++
  NCBI__GCF_001653335.1:WP_068107934.1 605 RQIMIVPELPKTRSGKIMRRLLRDVAENREV-GDVTTLADSSVMDLIS 651
                                           ***************************8876.5**********99875 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (629 nodes)
Target sequences:                          1  (661 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.01u 0.01s 00:00:00.02 Elapsed: 00:00:00.02
# Mc/sec: 18.29
//
[ok]
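
How much of the HMM and of the candidate the single domain hit spans can be read off the domain annotation row above. The sketch below simply splits that row on whitespace; the field positions are an assumption based on the column headers shown in this report, and the variable names are illustrative.

```python
# Domain row copied from the hmmsearch output above.
domain_row = ("   1 !  952.7   0.1  4.1e-291  4.1e-291       4     627 .."
              "      26     651 ..      23     653 .. 0.97")

MODEL_LENGTH = 629      # from "Query: TIGR02188 [M=629]"
SEQUENCE_LENGTH = 661   # from "(661 residues searched)"

fields = domain_row.split()
# Assumed layout: 0 '#', 1 '!', 2 score, 3 bias, 4 c-Evalue, 5 i-Evalue,
# 6 hmmfrom, 7 hmm to, 8 '..', 9 alifrom, 10 ali to, 11 '..', ...
score = float(fields[2])
hmm_from, hmm_to = int(fields[6]), int(fields[7])
ali_from, ali_to = int(fields[9]), int(fields[10])

# Coordinates are 1-based and inclusive, hence the +1.
hmm_coverage = (hmm_to - hmm_from + 1) / MODEL_LENGTH
seq_coverage = (ali_to - ali_from + 1) / SEQUENCE_LENGTH

print(f"score {score} bits, HMM coverage {hmm_coverage:.1%}, "
      f"candidate coverage {seq_coverage:.1%}")
```

For this hit the alignment spans nearly the whole model (positions 4 to 627 of 629), so the candidate matches TIGR02188 essentially end to end.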

This GapMind analysis is from Sep 24 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMER with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."
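
As an illustration of the coverage threshold, the sketch below computes percent identity and coverage for the BLAST alignment shown at the top of this page. It assumes that coverage means the fraction of the characterized (query) protein spanned by the alignment; that definition, the helper name, and the `MIN_COVERAGE` constant are assumptions for illustration, not GapMind's actual code.

```python
def span_coverage(start, end, total_length):
    """Fraction of a sequence covered by a 1-based, inclusive span."""
    return (end - start + 1) / total_length


# Values copied from the BLAST alignment on this page.
identities, alignment_columns = 415, 642
query_cov = span_coverage(12, 647, 651)      # characterized protein P9WQD1
candidate_cov = span_coverage(16, 650, 661)  # candidate WP_068107934.1

identity = identities / alignment_columns

MIN_COVERAGE = 0.50  # the "at least 50% coverage" threshold quoted above
meets_low_confidence_coverage = query_cov >= MIN_COVERAGE

print(f"identity {identity:.1%}, query coverage {query_cov:.1%}, "
      f"candidate coverage {candidate_cov:.1%}, "
      f"meets 50% coverage: {meets_low_confidence_coverage}")
```

This alignment clears the 50% coverage bar by a wide margin; whether a hit counts as high or medium confidence depends on the additional criteria that GapMind applies.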

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory