GapMind for catabolism of small carbon sources

 

Alignments for a candidate for acs in Sulfurihydrogenibium subterraneum DSM 15120

Align acetate-CoA ligase (EC 6.2.1.1) (characterized)
to candidate WP_028950423.1 Q385_RS0103975 acetate--CoA ligase

Query= BRENDA::Q2XNL6
         (634 letters)



>NCBI__GCF_000619805.1:WP_028950423.1
          Length = 632

 Score =  727 bits (1876), Expect = 0.0
 Identities = 354/629 (56%), Positives = 467/629 (74%), Gaps = 4/629 (0%)

Query: 7   VLLEEKRVFKPHYTVVEEAHIKNWEAELEKGKDH-ENYWAEKAERLEWFRKWDRVLDESN 65
           V L+ K +F P   + +E  ++++E+  +K  ++ E +W+E A +L WF+KWD+VL E N
Sbjct: 7   VHLQVKEIFYPPEEIKKECIVQDYESMYKKSIENPEEFWSEIASQLHWFQKWDKVL-EWN 65

Query: 66  RPFYRWFVNGKINMTYNAVDRWLDTDKRNQVAILYVNERGDERKLTYYELYREVSRTANA 125
            P+ +WFVNGK N+TYN +D  ++  +RN+VA + V+E G+E+K+TY EL  +V+R AN 
Sbjct: 66  FPYAKWFVNGKTNITYNCLDANIEKGRRNKVAYISVDEEGNEKKITYGELLEQVNRFANG 125

Query: 126 LKSLGIKKGDAVALYLPMCPELVVSMLACAKIGAVHSVIYSGLSVGALVERLNDARAKII 185
           LKSLG+ KGD V++Y+P   E V++MLACA+IGA+HSV+++G S GAL  R+NDA+AK++
Sbjct: 126 LKSLGVSKGDRVSIYMPNTVEAVIAMLACARIGAIHSVVFAGFSEGALKLRINDAKAKVV 185

Query: 186 ITADGTYRRGGVIKLKPIVDEAILQCPTIETTVVVKHTDIDIEMSDISGREMLFDKLIEG 245
           ITA  T RRG  I L PIV EAI     ++  V+V   D ++E S+ S + +  ++LI+ 
Sbjct: 186 ITATYTKRRGKKIPLLPIVKEAIKDLDFVKH-VIVWDRDKELEESEFSKKFVSLEELIKS 244

Query: 246 EGDRCDAEEMDAEDPLFILYTSGSTGKPKGVLHTTGGYMVGVASTLEMTFDIHNGDLWWC 305
               C+ E MDAEDPLFILYTSG+TGKPKGVLHT GGYMV    T ++ F+I    ++WC
Sbjct: 245 SSPNCNPEVMDAEDPLFILYTSGTTGKPKGVLHTVGGYMVNTYLTTKVVFNIKEDTVYWC 304

Query: 306 TADIGWITGHSYVVYGPLLLGTTTLLYEGAPDYPDPGVWWSIVEKYGVTKFYTAPTAIRH 365
           TADIGWITGHSY+VYGPLL GTT ++ EG P YP PG+WW  V+KY V  FYTAPTAIR 
Sbjct: 305 TADIGWITGHSYIVYGPLLNGTTIVMMEGVPVYPHPGIWWEYVDKYRVNVFYTAPTAIRM 364

Query: 366 LMRFGDKHPKRYNLESLKILGTVGEPINPEAWMWYYRNIGREKCPIIDTWWQTETGMHLI 425
           LMRFGD+ P +Y+L SLK+LG+VGEPINPEAW+WYY+NIGR +  ++DTWWQTETG H+I
Sbjct: 365 LMRFGDEIPAKYDLSSLKVLGSVGEPINPEAWLWYYKNIGRGRAVVVDTWWQTETGAHMI 424

Query: 426 APLPVTPLKPGSVTKPLPGIEADVVDENGDPVPLGKGGFLVIRKPWPAMFRTLFNDEQRY 485
             LP  P KPG   KPL G+  DVVD+ G+ +P    G LVI++PWP+M RT + + +RY
Sbjct: 425 TTLPCYPAKPGKAGKPLFGVIPDVVDKEGNSLPPNTIGHLVIKQPWPSMLRTCWGEPERY 484

Query: 486 IDVYWKQIPGGVYTAGDMARKDEDGYFWIQGRSDDVLNIAGHRIGTAEVESVFVAHPAVA 545
            + YWK+IPG VY+AGD+A  DEDGY  I GR+DDVL++AGHRIG AEVES  + HPAVA
Sbjct: 485 -EKYWKEIPGNVYSAGDLATIDEDGYIMILGRADDVLSVAGHRIGNAEVESAIIEHPAVA 543

Query: 546 EAAVIGKADPIKGEVIKAFLILKKGHKLNAALIEELKRHLRHELGPVAVVGEMVQVDSLP 605
           EAAVIGK + IKGE IKAF++LK+G+  +  LIEE+K  ++  LG +AV  E+  VD LP
Sbjct: 544 EAAVIGKPNEIKGESIKAFVVLKEGYSPSIELIEEIKETVKEILGAIAVPDEVEFVDKLP 603

Query: 606 KTRSGKIMRRILRAREEGEDLGDTSTLEE 634
           KTRSGKIMRR+LRARE G+DLGD STLE+
Sbjct: 604 KTRSGKIMRRVLRARELGQDLGDISTLED 632


Lambda     K      H
   0.319    0.138    0.428 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 1209
Number of extensions: 50
Number of successful extensions: 4
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 634
Length of database: 632
Length adjustment: 38
Effective length of query: 596
Effective length of database: 594
Effective search space:   354024
Effective search space used:   354024
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 54 (25.4 bits)
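
The percentages in the BLAST summary above, and the alignment coverage that GapMind's confidence rules refer to, can be re-derived from the reported counts. The following is a minimal sketch, not GapMind's own code: the variable and function names are illustrative, and truncating the percentages to whole numbers is an assumption that happens to reproduce the three values printed above (56%, 74%, 0%).

```python
# Numbers copied from the BLAST summary above; all names are illustrative.
IDENTITIES, POSITIVES, GAPS, ALIGNED_COLUMNS = 354, 467, 4, 629
QUERY_START, QUERY_END, QUERY_LENGTH = 7, 634, 634
SUBJECT_START, SUBJECT_END, SUBJECT_LENGTH = 7, 632, 632


def percent(count: int, total: int) -> int:
    """Whole-number percentage, truncated (assumed rounding rule)."""
    return 100 * count // total


def span(start: int, end: int) -> int:
    """Number of residues in a 1-based, inclusive range."""
    return end - start + 1


def coverage(start: int, end: int, length: int) -> float:
    """Fraction of a sequence covered by the aligned range."""
    return span(start, end) / length


# Alignment columns not occupied by a residue are gap characters.
query_gaps = ALIGNED_COLUMNS - span(QUERY_START, QUERY_END)
subject_gaps = ALIGNED_COLUMNS - span(SUBJECT_START, SUBJECT_END)

print("identity %:", percent(IDENTITIES, ALIGNED_COLUMNS))
print("positives %:", percent(POSITIVES, ALIGNED_COLUMNS))
print("gaps %:", percent(GAPS, ALIGNED_COLUMNS))
print("query coverage:", round(coverage(QUERY_START, QUERY_END, QUERY_LENGTH), 4))
print("subject coverage:", round(coverage(SUBJECT_START, SUBJECT_END, SUBJECT_LENGTH), 4))
print("gap characters (query + subject):", query_gaps + subject_gaps)
```

As a consistency check, the gap characters implied by the coordinates (one in the query row, three in the subject row) add up to the 4 gaps reported in the summary.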

Align candidate WP_028950423.1 Q385_RS0103975 (acetate--CoA ligase)
to HMM TIGR02188 (acs: acetate--CoA ligase (EC 6.2.1.1))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR02188.hmm
# target sequence database:        /tmp/gapView.14065.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR02188  [M=629]
Accession:   TIGR02188
Description: Ac_CoA_lig_AcsA: acetate--CoA ligase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                                 Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                                 -----------
   2.5e-267  874.1   0.4   3.1e-267  873.8   0.4    1.0  1  lcl|NCBI__GCF_000619805.1:WP_028950423.1  Q385_RS0103975 acetate--CoA liga


Domain annotation for each sequence (and alignments):
>> lcl|NCBI__GCF_000619805.1:WP_028950423.1  Q385_RS0103975 acetate--CoA ligase
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  873.8   0.4  3.1e-267  3.1e-267       5     619 ..      28     632 .]      25     632 .] 0.97

  Alignments for each domain:
  == domain 1  score: 873.8 bits;  conditional E-value: 3.1e-267
                                 TIGR02188   5 eeykelyeeaiedpekfwaklakeelewlkpfekvldeslepkvkWfedgelnvsyncvdrhvek.rkd 72 
                                               ++y+++y+++ie+pe+fw++ a+ +l+w+++++kvl+++++  +kWf++g++n++ync+d ++ek r++
  lcl|NCBI__GCF_000619805.1:WP_028950423.1  28 QDYESMYKKSIENPEEFWSEIAS-QLHWFQKWDKVLEWNFP-YAKWFVNGKTNITYNCLDANIEKgRRN 94 
                                               79*********************.5*************975.89************************* PP

                                 TIGR02188  73 kvaiiwegdeegedsrkltYaellrevcrlanvlkelGvkkgdrvaiYlpmipeaviamlacaRiGavh 141
                                               kva i   +e +  ++k+tY ell++v+r+an lk+lGv kgdrv+iY+p ++eaviamlacaRiGa+h
  lcl|NCBI__GCF_000619805.1:WP_028950423.1  95 KVAYISVDEEGN--EKKITYGELLEQVNRFANGLKSLGVSKGDRVSIYMPNTVEAVIAMLACARIGAIH 161
                                               ***997766444..7****************************************************** PP

                                 TIGR02188 142 svvfaGfsaealaeRivdaeaklvitadeglRggkvielkkivdealekaeesvekvlvvkrtgeevae 210
                                               svvfaGfs  al+ Ri+da+ak+vita  + R+gk+i+l  iv ea+++ +  v++v+v  r +e  ++
  lcl|NCBI__GCF_000619805.1:WP_028950423.1 162 SVVFAGFSEGALKLRINDAKAKVVITATYTKRRGKKIPLLPIVKEAIKDLD-FVKHVIVWDRDKELEES 229
                                               **************************************************8.7********99887666 PP

                                 TIGR02188 211 wkegrDvwweelvekeasaecepekldsedplfiLYtsGstGkPkGvlhttgGylllaaltvkyvfdik 279
                                                 +++ v +eel+++ +s +c+pe +d+edplfiLYtsG+tGkPkGvlht+gGy++ ++lt+k+vf+ik
  lcl|NCBI__GCF_000619805.1:WP_028950423.1 230 EFSKKFVSLEELIKS-SSPNCNPEVMDAEDPLFILYTSGTTGKPKGVLHTVGGYMVNTYLTTKVVFNIK 297
                                               677888999999995.***************************************************** PP

                                 TIGR02188 280 dedifwCtaDvGWvtGhsYivygPLanGattllfegvptypdasrfweviekykvtifYtaPtaiRalm 348
                                               ++ ++wCtaD+GW+tGhsYivygPL+nG t ++ egvp yp+++ +we ++ky+v++fYtaPtaiR+lm
  lcl|NCBI__GCF_000619805.1:WP_028950423.1 298 EDTVYWCTADIGWITGHSYIVYGPLLNGTTIVMMEGVPVYPHPGIWWEYVDKYRVNVFYTAPTAIRMLM 366
                                               ********************************************************************* PP

                                 TIGR02188 349 klgeelvkkhdlsslrvlgsvGepinpeaweWyyevvGkekcpivdtwWqtetGgilitplpgvatelk 417
                                               + g+e+++k+dlssl+vlgsvGepinpeaw Wyy+++G++++ +vdtwWqtetG+++it+lp    ++k
  lcl|NCBI__GCF_000619805.1:WP_028950423.1 367 RFGDEIPAKYDLSSLKVLGSVGEPINPEAWLWYYKNIGRGRAVVVDTWWQTETGAHMITTLPC--YPAK 433
                                               **************************************************************9..6*** PP

                                 TIGR02188 418 pgsatlPlfGieaevvdeegkeveeeeeggvLvikkpwPsmlrtiygdeerfvetYfkklkg.lyftGD 485
                                               pg a +PlfG+ ++vvd+eg+++ +++  g Lvik+pwPsmlrt +g++er+  +Y+k+++g +y +GD
  lcl|NCBI__GCF_000619805.1:WP_028950423.1 434 PGKAGKPLFGVIPDVVDKEGNSLPPNTI-GHLVIKQPWPSMLRTCWGEPERYE-KYWKEIPGnVYSAGD 500
                                               ************************9999.8*********************95.699*999889***** PP

                                 TIGR02188 486 garrdkdGyiwilGRvDdvinvsGhrlgtaeiesalvsheavaeaavvgvpdeikgeaivafvvlkegv 554
                                                a+ d+dGyi+ilGR+Ddv+ v+Ghr+g ae+esa+++h+avaeaav+g+p+eikge+i afvvlkeg+
  lcl|NCBI__GCF_000619805.1:WP_028950423.1 501 LATIDEDGYIMILGRADDVLSVAGHRIGNAEVESAIIEHPAVAEAAVIGKPNEIKGESIKAFVVLKEGY 569
                                               ********************************************************************* PP

                                 TIGR02188 555 eedeeelekelkklvrkeigpiakpdkilvveelPktRsGkimRRllrkiaegeellgdvstled 619
                                               +++ e l +e+k++v++ +g+ia pd++++v++lPktRsGkimRR+lr+   g + lgd+stled
  lcl|NCBI__GCF_000619805.1:WP_028950423.1 570 SPSIE-LIEEIKETVKEILGAIAVPDEVEFVDKLPKTRSGKIMRRVLRARELG-QDLGDISTLED 632
                                               ****5.******************************************87665.5677*****98 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (629 nodes)
Target sequences:                          1  (632 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.03u 0.01s 00:00:00.04 Elapsed: 00:00:00.03
# Mc/sec: 11.81
//
[ok]
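
How much of the model and of the candidate the hmmsearch domain covers can be read off the domain table row above. The sketch below parses that row by whitespace and computes both coverages. The class and function names are hypothetical, and the column positions are assumed from the header line of the table shown in this report (score, bias, c-Evalue, i-Evalue, hmmfrom, hmm to, alifrom, ali to, envfrom, env to, acc).

```python
from typing import NamedTuple


class DomainHit(NamedTuple):
    """Selected fields from one row of the hmmsearch domain table."""
    score: float
    i_evalue: float
    hmm_from: int
    hmm_to: int
    ali_from: int
    ali_to: int


def parse_domain_row(row: str) -> DomainHit:
    """Split a domain row on whitespace and pick out the numeric columns."""
    f = row.split()
    # f[0] is the domain index and f[1] the '!' or '?' flag;
    # f[8] and f[11] are the '..' / '.]' boundary markers.
    return DomainHit(
        score=float(f[2]),
        i_evalue=float(f[5]),
        hmm_from=int(f[6]),
        hmm_to=int(f[7]),
        ali_from=int(f[9]),
        ali_to=int(f[10]),
    )


def covered_fraction(start: int, end: int, length: int) -> float:
    """Fraction of a 1-based, inclusive range relative to a total length."""
    return (end - start + 1) / length


# Row copied from the report above; M=629 and 632 residues are also from the report.
ROW = ("   1 !  873.8   0.4  3.1e-267  3.1e-267       5     619 .."
       "      28     632 .]      25     632 .] 0.97")
MODEL_LENGTH = 629
CANDIDATE_LENGTH = 632

hit = parse_domain_row(ROW)
print("model coverage:", round(covered_fraction(hit.hmm_from, hit.hmm_to, MODEL_LENGTH), 3))
print("candidate coverage:", round(covered_fraction(hit.ali_from, hit.ali_to, CANDIDATE_LENGTH), 3))
```

For routine use, hmmsearch's tabular domain output is easier to parse than the human-readable report; the whitespace split here is only meant to make the arithmetic explicit.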

This GapMind analysis is from Sep 24 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMER with enzyme models (usually from TIGRFAMs). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
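
To make the term concrete, a six-frame translation translates a DNA sequence in all three reading frames on both strands, so that genes missed by the gene caller can still be found by protein search. The following is a minimal sketch, not the Curated BLAST implementation: the function names are hypothetical, it assumes the standard codon-to-amino-acid assignments, and it writes any codon containing a non-ACGT character as "X".

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna: str) -> str:
    """Translate complete codons from the start of dna; '*' marks a stop."""
    codons = (dna[i:i + 3] for i in range(0, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def six_frame_translation(dna: str) -> dict[str, str]:
    """Translate both strands in all three frames, keyed '+1'..'+3', '-1'..'-3'."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


if __name__ == "__main__":
    for name, protein in six_frame_translation("ATGGCCTAA").items():
        print(name, protein)
```

A protein search against these six translations can recover a candidate that is absent from the predicted proteins, which is the case the paragraph above describes.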


If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory