GapMind for catabolism of small carbon sources

 

Alignments for a candidate for acs in Methanospirillum stamsii Pt1

Align acetate-CoA ligase (EC 6.2.1.1) (characterized)
to candidate WP_109941954.1 DLD82_RS15135 acetate--CoA ligase

Query= BRENDA::Q2XNL6
         (634 letters)



>NCBI__GCF_003173335.1:WP_109941954.1
          Length = 629

 Score =  684 bits (1765), Expect = 0.0
 Identities = 328/635 (51%), Positives = 455/635 (71%), Gaps = 7/635 (1%)

Query: 1   MSKDTSVLLEEKRVFKPHYTVVEEAHIKNWEAELEKG-KDHENYWAEKAERLEWFRKWDR 59
           MS++  V ++ K  + P     E++ + +++A  ++  +D + +W+  A+ +EW + W +
Sbjct: 1   MSENFEVKVDSKS-YIPDPAYREQSFLGDYKAVYKEFIQDPDEFWSRMAKEIEWIKPWTK 59

Query: 60  VLDESNRPFYRWFVNGKINMTYNAVDRWLDTDKRNQVAILYVNERGDERKLTYYELYREV 119
           VL E N P+ RWF  G +N+T + +DR +   +RN++A+++  E G+ER  TY +L+R+V
Sbjct: 60  VL-EWNHPYARWFTGGTLNITTSCLDRHVKEGRRNKLALIWRGEDGEERVYTYRQLHRDV 118

Query: 120 SRTANALKSLGIKKGDAVALYLPMCPELVVSMLACAKIGAVHSVIYSGLSVGALVERLND 179
            R ANALK +G++KGD +  Y+P+ PE ++++LACA+IGA+HS++Y+G    AL  R+ D
Sbjct: 119 MRFANALKKIGVQKGDRICFYMPLVPEHIIALLACARIGAIHSIVYAGFGAEALHSRIRD 178

Query: 180 ARAKIIITADGTYRRGGVIKLKPIVDEAILQCPTIETTVVVKHTDIDIEMSDISGREMLF 239
           A AK++ITAD   RRG VI L+ IVD+A+   P++E  +V+      +E+   S  E  F
Sbjct: 179 AEAKVVITADVGKRRGKVISLRSIVDDAVRNAPSVEKVIVLCREKCSVEL--YSEMEEDF 236

Query: 240 DKLIEGEGDRCDAEEMDAEDPLFILYTSGSTGKPKGVLHTTGGYMVGVASTLEMTFDIHN 299
             L+EG    C  EEMDAE+PLFILYTSG+TG PKG++H+ GGY  GV  T +  FD+  
Sbjct: 237 YSLLEGVSADCSPEEMDAEEPLFILYTSGTTGMPKGIVHSCGGYATGVHYTAKYLFDLKE 296

Query: 300 GDLWWCTADIGWITGHSYVVYGPLLLGTTTLLYEGAPDYPDPGVWWSIVEKYGVTKFYTA 359
            D+ WCTAD GWITGHSY+VYGPL +G T ++ E  PD+PDPG+WWSI+E+ GVT FYTA
Sbjct: 297 NDVIWCTADTGWITGHSYIVYGPLSVGATVVITETTPDWPDPGIWWSIIEELGVTLFYTA 356

Query: 360 PTAIRHLMRFGDKHPKRYNLESLKILGTVGEPINPEAWMWYYRNIGREKCPIIDTWWQTE 419
           PTAIR  MR G++ P +YNL+SL+I+G+VGEP+NPEA+ WYYR IG+ +CPI+DTWWQTE
Sbjct: 357 PTAIRMFMRVGEEWPNKYNLDSLRIIGSVGEPLNPEAFEWYYRVIGKMRCPILDTWWQTE 416

Query: 420 TGMHLIAPLPVTPLKPGSVTKPLPGIEADVVDENGDPVPLGKGGFLVIRKPWPAMFRTLF 479
           TGMH+I      P+KPG    P+PGI ADVVDE G+ +P G+GG LVI+KPWP+M RT++
Sbjct: 417 TGMHMITTPLGEPMKPGFAGVPIPGIIADVVDEEGNSLPPGQGGLLVIKKPWPSMMRTVY 476

Query: 480 NDEQRYIDVYWKQIPGGVYTAGDMARKDEDGYFWIQGRSDDVLNIAGHRIGTAEVESVFV 539
            +++RY   YW QI    Y AGD+A KDEDGYF I GR+DD++ +AGH +GTAEVES  V
Sbjct: 477 RNDERY-KKYWNQIK-DYYAAGDLAVKDEDGYFMILGRADDIIIVAGHNLGTAEVESALV 534

Query: 540 AHPAVAEAAVIGKADPIKGEVIKAFLILKKGHKLNAALIEELKRHLRHELGPVAVVGEMV 599
            H AVAEAAVIG  D IKG+ +KAF+IL KG+  +  L+ EL  H+R  +GP+A+   + 
Sbjct: 535 EHEAVAEAAVIGVPDEIKGQAVKAFVILVKGYTPSQKLVSELTYHVRMSIGPIAMPAAIE 594

Query: 600 QVDSLPKTRSGKIMRRILRAREEGEDLGDTSTLEE 634
            VD+LP+TRSGKIMRRIL+A+E   DLGDTSTLEE
Sbjct: 595 FVDTLPRTRSGKIMRRILKAKEMNMDLGDTSTLEE 629


Lambda     K      H
   0.319    0.138    0.428 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 1240
Number of extensions: 59
Number of successful extensions: 4
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 634
Length of database: 629
Length adjustment: 38
Effective length of query: 596
Effective length of database: 591
Effective search space:   352236
Effective search space used:   352236
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 54 (25.4 bits)
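
The summary figures in the BLAST report above can be cross-checked against one another. The following is a minimal sketch, not part of GapMind. It assumes the standard Karlin-Altschul relations (bit score = (lambda * S - ln K) / ln 2, and E = search space * 2^(-bit score)) and assumes that the report truncates percentages to whole numbers. The function names are illustrative only.

```python
import math

# Values copied from the BLAST report above.
identities, positives, gaps, aligned_columns = 328, 455, 7, 635
raw_score = 1765
gapped_lambda, gapped_k = 0.267, 0.0410   # "Gapped" Lambda and K
effective_query_len, effective_db_len = 596, 591


def truncated_percent(count, total):
    """Whole-number percentage, truncated (assumption: matches the report)."""
    return count * 100 // total


def bit_score(raw, lam, k):
    """Convert a raw alignment score to bits (Karlin-Altschul normalization)."""
    return (lam * raw - math.log(k)) / math.log(2)


def expect_value(search_space, bits):
    """Expected number of chance hits scoring at least `bits`."""
    return search_space * 2.0 ** (-bits)


search_space = effective_query_len * effective_db_len
bits = bit_score(raw_score, gapped_lambda, gapped_k)
evalue = expect_value(search_space, bits)

print(truncated_percent(identities, aligned_columns))  # 51
print(truncated_percent(positives, aligned_columns))   # 71
print(search_space)                                    # 352236
print(round(bits, 1), evalue)
```

Because Lambda and K are printed to only three significant digits, the recomputed bit score agrees with the reported 684 bits only approximately. The recomputed E-value is far below the smallest value BLAST prints, which is consistent with the reported `Expect = 0.0`.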

Align candidate WP_109941954.1 DLD82_RS15135 (acetate--CoA ligase)
to HMM TIGR02188 (acs: acetate--CoA ligase (EC 6.2.1.1))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR02188.hmm
# target sequence database:        /tmp/gapView.2495692.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR02188  [M=629]
Accession:   TIGR02188
Description: Ac_CoA_lig_AcsA: acetate--CoA ligase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                             Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                             -----------
   2.9e-268  877.2   0.2   3.6e-268  876.9   0.2    1.0  1  NCBI__GCF_003173335.1:WP_109941954.1  


Domain annotation for each sequence (and alignments):
>> NCBI__GCF_003173335.1:WP_109941954.1  
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  876.9   0.2  3.6e-268  3.6e-268       6     619 ..      28     629 .]      24     629 .] 0.97

  Alignments for each domain:
  == domain 1  score: 876.9 bits;  conditional E-value: 3.6e-268
                             TIGR02188   6 eykelyeeaiedpekfwaklakeelewlkpfekvldeslepkvkWfedgelnvsyncvdrhvek.rkdkvaii 77 
                                           +yk++y+e i+dp++fw+++ake +ew+kp++kvl+++ +p ++Wf++g+ln+++ c+drhv++ r++k+a+i
  NCBI__GCF_003173335.1:WP_109941954.1  28 DYKAVYKEFIQDPDEFWSRMAKE-IEWIKPWTKVLEWN-HPYARWFTGGTLNITTSCLDRHVKEgRRNKLALI 98 
                                           89********************5.*************9.789******************************* PP

                             TIGR02188  78 wegdeegedsrkltYaellrevcrlanvlkelGvkkgdrvaiYlpmipeaviamlacaRiGavhsvvfaGfsa 150
                                           w g++ +  +r +tY++l+r+v r+an+lk++Gv+kgdr+++Y+p++pe +ia+lacaRiGa+hs+v+aGf a
  NCBI__GCF_003173335.1:WP_109941954.1  99 WRGEDGE--ERVYTYRQLHRDVMRFANALKKIGVQKGDRICFYMPLVPEHIIALLACARIGAIHSIVYAGFGA 169
                                           ****664..7*************************************************************** PP

                             TIGR02188 151 ealaeRivdaeaklvitadeglRggkvielkkivdealekaeesvekvlvvkrtgeevaewkegrDvwweelv 223
                                           eal++Ri daeak+vitad g R+gkvi+l++ivd+a+++a+ svekv+v+ r + +v+ + ++++  + +l+
  NCBI__GCF_003173335.1:WP_109941954.1 170 EALHSRIRDAEAKVVITADVGKRRGKVISLRSIVDDAVRNAP-SVEKVIVLCREKCSVE-LYSEMEEDFYSLL 240
                                           *****************************************9.7*************77.6667778888899 PP

                             TIGR02188 224 ekeasaecepekldsedplfiLYtsGstGkPkGvlhttgGylllaaltvkyvfdikdedifwCtaDvGWvtGh 296
                                           e ++sa+c+pe++d+e+plfiLYtsG+tG+PkG++h+ gGy++ +++t ky+fd+k++d+ wCtaD GW+tGh
  NCBI__GCF_003173335.1:WP_109941954.1 241 E-GVSADCSPEEMDAEEPLFILYTSGTTGMPKGIVHSCGGYATGVHYTAKYLFDLKENDVIWCTADTGWITGH 312
                                           9.59********************************************************************* PP

                             TIGR02188 297 sYivygPLanGattllfegvptypdasrfweviekykvtifYtaPtaiRalmklgeelvkkhdlsslrvlgsv 369
                                           sYivygPL++Gat ++ e++p++pd++ +w++ie+ +vt fYtaPtaiR++m++gee+++k++l+slr++gsv
  NCBI__GCF_003173335.1:WP_109941954.1 313 SYIVYGPLSVGATVVITETTPDWPDPGIWWSIIEELGVTLFYTAPTAIRMFMRVGEEWPNKYNLDSLRIIGSV 385
                                           ************************************************************************* PP

                             TIGR02188 370 GepinpeaweWyyevvGkekcpivdtwWqtetGgilitplpgvatelkpgsatlPlfGieaevvdeegkevee 442
                                           Gep+npea+eWyy+v+Gk +cpi dtwWqtetG ++it+  g   ++kpg a +P++Gi a+vvdeeg+++ +
  NCBI__GCF_003173335.1:WP_109941954.1 386 GEPLNPEAFEWYYRVIGKMRCPILDTWWQTETGMHMITTPLG--EPMKPGFAGVPIPGIIADVVDEEGNSLPP 456
                                           *************************************99888..5**************************** PP

                             TIGR02188 443 eeeggvLvikkpwPsmlrtiygdeerfvetYfkklkglyftGDgarrdkdGyiwilGRvDdvinvsGhrlgta 515
                                           +++ g+LvikkpwPsm+rt+y+++er+  +Y++++k++y +GD a++d+dGy++ilGR+Dd+i v+Gh+lgta
  NCBI__GCF_003173335.1:WP_109941954.1 457 GQG-GLLVIKKPWPSMMRTVYRNDERYK-KYWNQIKDYYAAGDLAVKDEDGYFMILGRADDIIIVAGHNLGTA 527
                                           999.8*********************96.6******************************************* PP

                             TIGR02188 516 eiesalvsheavaeaavvgvpdeikgeaivafvvlkegveedeeelekelkklvrkeigpiakpdkilvveel 588
                                           e+esalv+heavaeaav+gvpdeikg+a+ afv+l +g+++++ +l +el+ +vr +igpia p+ i++v++l
  NCBI__GCF_003173335.1:WP_109941954.1 528 EVESALVEHEAVAEAAVIGVPDEIKGQAVKAFVILVKGYTPSQ-KLVSELTYHVRMSIGPIAMPAAIEFVDTL 599
                                           *******************************************.5**************************** PP

                             TIGR02188 589 PktRsGkimRRllrkiaegeellgdvstled 619
                                           P+tRsGkimRR+l++    +  lgd+stle+
  NCBI__GCF_003173335.1:WP_109941954.1 600 PRTRSGKIMRRILKAKEM-NMDLGDTSTLEE 629
                                           *************97544.57788*****85 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (629 nodes)
Target sequences:                          1  (629 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.02u 0.00s 00:00:00.02 Elapsed: 00:00:00.01
# Mc/sec: 20.08
//
[ok]
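
The domain annotation row in the hmmsearch output above carries the coordinates needed to judge how much of the TIGR02188 model is covered by the alignment. The following is a minimal sketch that splits that row on whitespace and computes model coverage from `hmmfrom`, `hmm to`, and the model length `M=629`. The helper name and the dictionary keys are hypothetical, and the column positions are assumed from the header printed in this report.

```python
# Domain row and model length copied from the hmmsearch output above.
domain_row = ("   1 !  876.9   0.2  3.6e-268  3.6e-268       6     619 .."
              "      28     629 .]      24     629 .] 0.97")
model_length = 629  # from "Query: TIGR02188 [M=629]"


def parse_domain_row(row):
    """Split one per-domain row into named fields (hypothetical helper)."""
    f = row.split()
    return {
        "score": float(f[2]),
        "bias": float(f[3]),
        "c_evalue": float(f[4]),
        "i_evalue": float(f[5]),
        "hmm_from": int(f[6]),
        "hmm_to": int(f[7]),
        "ali_from": int(f[9]),
        "ali_to": int(f[10]),
        "env_from": int(f[12]),
        "env_to": int(f[13]),
        "acc": float(f[15]),
    }


domain = parse_domain_row(domain_row)

# Fraction of model positions spanned by the alignment (inclusive coordinates).
model_coverage = (domain["hmm_to"] - domain["hmm_from"] + 1) / model_length

print(domain["hmm_from"], domain["hmm_to"])  # 6 619
print(f"{model_coverage:.3f}")               # 0.976
```

For scripted use, hmmsearch can also write a machine-readable per-domain table with `--domtblout`, which avoids parsing the human-readable layout.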

This GapMind analysis is from Sep 24 2021. The underlying query database was built on Sep 17 2021.


About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."
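
The 50% coverage cutoff for low-confidence hits can be sketched as a simple check. This is an illustration, not GapMind's code. It assumes that coverage means the fraction of the characterized protein spanned by the alignment, which this page does not define. The function names and the second example hit are hypothetical.

```python
def coverage(start, end, length):
    """Fraction of a sequence spanned by an alignment (inclusive coordinates)."""
    return (end - start + 1) / length


def passes_low_confidence_coverage(start, end, length, min_coverage=0.5):
    """True if the alignment spans at least `min_coverage` of the sequence."""
    return coverage(start, end, length) >= min_coverage


# The BLAST alignment on this page spans positions 1-634 of the
# 634-residue characterized protein.
print(coverage(1, 634, 634))                          # 1.0
print(passes_low_confidence_coverage(1, 634, 634))    # True

# Hypothetical partial hit covering positions 100-300 of the same protein.
print(passes_low_confidence_coverage(100, 300, 634))  # False
```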

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory