GapMind for catabolism of small carbon sources

 

Alignments for a candidate for acs in Methanospirillum lacunae Ki8-1

Align acetate-CoA ligase (EC 6.2.1.1) (characterized)
to candidate WP_109969112.1 DK846_RS11570 acetate--CoA ligase

Query= BRENDA::Q2XNL6
         (634 letters)



>NCBI__GCF_003173355.1:WP_109969112.1
          Length = 629

 Score =  689 bits (1778), Expect = 0.0
 Identities = 338/636 (53%), Positives = 452/636 (71%), Gaps = 9/636 (1%)

Query: 1   MSKDTSVLLEEKRVFK-PHYTVVEEAHIKNWEAELEK-GKDHENYWAEKAERLEWFRKWD 58
           M++D  V LE+K     PHY  +  + + N++    K   D + +W EKA  L+W R W+
Sbjct: 1   MAEDFEVKLEKKAYIPAPHY--LANSALGNYKEAYNKFTSDPDGFWDEKARELKWMRPWE 58

Query: 59  RVLDESNRPFYRWFVNGKINMTYNAVDRWLDTDKRNQVAILYVNERGDERKLTYYELYRE 118
           +V  E N P+ RWF + K+N+T N +DR ++  +RN++AI++  E G E  LTY +LYR 
Sbjct: 59  KVR-EWNHPYARWFTSAKLNITENCLDRHVNNGRRNKLAIIWRGEDGREEVLTYRQLYRS 117

Query: 119 VSRTANALKSLGIKKGDAVALYLPMCPELVVSMLACAKIGAVHSVIYSGLSVGALVERLN 178
           V R ANALKSLG++KGD +  Y+P  PE VV++LACA+IGA+HS++Y+G    AL  R+ 
Sbjct: 118 VMRFANALKSLGVQKGDRICFYMPFVPEHVVAILACARIGAIHSIVYAGFGAEALHSRIR 177

Query: 179 DARAKIIITADGTYRRGGVIKLKPIVDEAILQCPTIETTVVVKHTDIDIEMSDISGREML 238
           DA AKI+ITAD   RRG  I LK IVD+A+   P++E  +V+      +E+   S  E+ 
Sbjct: 178 DANAKIVITADVGKRRGKTIPLKSIVDDAVRNAPSVEKVIVLCREKCPLEL--YSELEVD 235

Query: 239 FDKLIEGEGDRCDAEEMDAEDPLFILYTSGSTGKPKGVLHTTGGYMVGVASTLEMTFDIH 298
           F  + EG  D C AEEMDAEDPLFILYTSG+TG  KG++H  GGYMVG   T +  FDI 
Sbjct: 236 FYGIQEGMSDECPAEEMDAEDPLFILYTSGTTGSAKGIVHACGGYMVGTHYTCKYIFDIK 295

Query: 299 NGDLWWCTADIGWITGHSYVVYGPLLLGTTTLLYEGAPDYPDPGVWWSIVEKYGVTKFYT 358
             D++WC+AD GWITGHSY+VYGPL +G T ++ E  PDYPD GVWWSI+E++GV+ FYT
Sbjct: 296 ENDVYWCSADPGWITGHSYIVYGPLSVGATVVITETTPDYPDYGVWWSIIEEFGVSIFYT 355

Query: 359 APTAIRHLMRFGDKHPKRYNLESLKILGTVGEPINPEAWMWYYRNIGREKCPIIDTWWQT 418
           APTAIR  MR G++ P +Y+L SL+I+G+VGEP+NPEA+ WYYR IG+ +CPI+DTWWQT
Sbjct: 356 APTAIRMFMRVGEEWPNKYDLSSLRIIGSVGEPLNPEAFEWYYRVIGKNRCPILDTWWQT 415

Query: 419 ETGMHLIAPLPVTPLKPGSVTKPLPGIEADVVDENGDPVPLGKGGFLVIRKPWPAMFRTL 478
           ETGMH+I      P+KPG    P+PG+ ADVVD++G+PVP G+GG LVI+ PWP+M RT+
Sbjct: 416 ETGMHMITTPLGMPMKPGFAGVPIPGVFADVVDKDGNPVPAGQGGLLVIKGPWPSMMRTV 475

Query: 479 FNDEQRYIDVYWKQIPGGVYTAGDMARKDEDGYFWIQGRSDDVLNIAGHRIGTAEVESVF 538
           +N+++RY   YW QI    YT GD+A KD+DGY  I GRSDD++ +AGH +GTAEVES  
Sbjct: 476 YNNDERY-RKYWTQIK-DYYTVGDLAVKDDDGYIMILGRSDDIIIVAGHNLGTAEVESAL 533

Query: 539 VAHPAVAEAAVIGKADPIKGEVIKAFLILKKGHKLNAALIEELKRHLRHELGPVAVVGEM 598
           V H AVAEAAVIG  D IKG+ +KAF+ L +G++ +  L+ EL  H+R  +GP+A+   +
Sbjct: 534 VEHEAVAEAAVIGVPDDIKGQAVKAFVTLVQGYEPSQKLVSELTYHVRMSIGPIAMPNAI 593

Query: 599 VQVDSLPKTRSGKIMRRILRAREEGEDLGDTSTLEE 634
             +D LPKTRSGKIMRR+L+A+E G D GD STLEE
Sbjct: 594 EFMDKLPKTRSGKIMRRLLKAKEMGIDPGDISTLEE 629


Lambda     K      H
   0.319    0.138    0.428 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 1227
Number of extensions: 58
Number of successful extensions: 4
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 634
Length of database: 629
Length adjustment: 38
Effective length of query: 596
Effective length of database: 591
Effective search space:   352236
Effective search space used:   352236
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 54 (25.4 bits)
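
The statistics in this report can be cross-checked by hand. The following is a minimal sketch, not part of GapMind, that assumes the standard Karlin-Altschul conversion from raw score to bits and recomputes two of the reported numbers from the values printed above. The constant and function names are illustrative. Because the report prints Lambda and K rounded, the recomputed bit score only approximately matches the reported 689 bits.

```python
import math

# Values copied from the BLAST report above.
GAPPED_LAMBDA = 0.267
GAPPED_K = 0.0410
RAW_SCORE = 1778          # the number in parentheses after "689 bits"
QUERY_LEN = 634
DB_LEN = 629              # a single database sequence
LENGTH_ADJUSTMENT = 38


def bit_score(raw_score, lam, k):
    """Convert a raw alignment score to bits: (lambda * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def effective_search_space(query_len, db_len, adjustment):
    """Product of the length-adjusted query and database lengths."""
    return (query_len - adjustment) * (db_len - adjustment)


bits = bit_score(RAW_SCORE, GAPPED_LAMBDA, GAPPED_K)
space = effective_search_space(QUERY_LEN, DB_LEN, LENGTH_ADJUSTMENT)

print(f"recomputed bit score: {bits:.1f}")
print(f"effective search space: {space}")
```

The recomputed search space, 596 x 591, equals the reported 352236, and the effective lengths match the "Effective length of query" and "Effective length of database" lines.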

Align candidate WP_109969112.1 DK846_RS11570 (acetate--CoA ligase)
to HMM TIGR02188 (acs: acetate--CoA ligase (EC 6.2.1.1))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR02188.hmm
# target sequence database:        /tmp/gapView.3624775.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR02188  [M=629]
Accession:   TIGR02188
Description: Ac_CoA_lig_AcsA: acetate--CoA ligase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                             Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                             -----------
   4.7e-267  873.2   0.1   5.5e-267  873.0   0.1    1.0  1  NCBI__GCF_003173355.1:WP_109969112.1  


Domain annotation for each sequence (and alignments):
>> NCBI__GCF_003173355.1:WP_109969112.1  
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  873.0   0.1  5.5e-267  5.5e-267       4     619 ..      26     629 .]      23     629 .] 0.97

  Alignments for each domain:
  == domain 1  score: 873.0 bits;  conditional E-value: 5.5e-267
                             TIGR02188   4 leeykelyeeaiedpekfwaklakeelewlkpfekvldeslepkvkWfedgelnvsyncvdrhvek.rkdkva 75 
                                           l +yke y++ ++dp+ fw+++a+e l+w++p+ekv++++ +p ++Wf+ ++ln++ nc+drhv++ r++k+a
  NCBI__GCF_003173355.1:WP_109969112.1  26 LGNYKEAYNKFTSDPDGFWDEKARE-LKWMRPWEKVREWN-HPYARWFTSAKLNITENCLDRHVNNgRRNKLA 96 
                                           568*********************5.*************9.789***************************** PP

                             TIGR02188  76 iiwegdeegedsrkltYaellrevcrlanvlkelGvkkgdrvaiYlpmipeaviamlacaRiGavhsvvfaGf 148
                                           iiw g++    ++ ltY++l+r+v r+an+lk+lGv+kgdr+++Y+p++pe v+a+lacaRiGa+hs+v+aGf
  NCBI__GCF_003173355.1:WP_109969112.1  97 IIWRGEDGR--EEVLTYRQLYRSVMRFANALKSLGVQKGDRICFYMPFVPEHVVAILACARIGAIHSIVYAGF 167
                                           ******776..599*********************************************************** PP

                             TIGR02188 149 saealaeRivdaeaklvitadeglRggkvielkkivdealekaeesvekvlvvkrtgeevaewkegrDvwwee 221
                                            aeal++Ri da+ak+vitad g R+gk+i+lk+ivd+a+++a+ svekv+v+ r + +++ + ++ +v +  
  NCBI__GCF_003173355.1:WP_109969112.1 168 GAEALHSRIRDANAKIVITADVGKRRGKTIPLKSIVDDAVRNAP-SVEKVIVLCREKCPLE-LYSELEVDFYG 238
                                           *******************************************9.7*************76.88899999999 PP

                             TIGR02188 222 lvekeasaecepekldsedplfiLYtsGstGkPkGvlhttgGylllaaltvkyvfdikdedifwCtaDvGWvt 294
                                           + e + s+ec++e++d+edplfiLYtsG+tG  kG++h+ gGy++ +++t+ky+fdik++d++wC+aD GW+t
  NCBI__GCF_003173355.1:WP_109969112.1 239 IQE-GMSDECPAEEMDAEDPLFILYTSGTTGSAKGIVHACGGYMVGTHYTCKYIFDIKENDVYWCSADPGWIT 310
                                           999.6******************************************************************** PP

                             TIGR02188 295 GhsYivygPLanGattllfegvptypdasrfweviekykvtifYtaPtaiRalmklgeelvkkhdlsslrvlg 367
                                           GhsYivygPL++Gat ++ e++p+ypd + +w++ie+++v+ifYtaPtaiR++m++gee+++k+dlsslr++g
  NCBI__GCF_003173355.1:WP_109969112.1 311 GHSYIVYGPLSVGATVVITETTPDYPDYGVWWSIIEEFGVSIFYTAPTAIRMFMRVGEEWPNKYDLSSLRIIG 383
                                           ************************************************************************* PP

                             TIGR02188 368 svGepinpeaweWyyevvGkekcpivdtwWqtetGgilitplpgvatelkpgsatlPlfGieaevvdeegkev 440
                                           svGep+npea+eWyy+v+Gk++cpi dtwWqtetG ++it+  g  +++kpg a +P++G+ a+vvd++g++v
  NCBI__GCF_003173355.1:WP_109969112.1 384 SVGEPLNPEAFEWYYRVIGKNRCPILDTWWQTETGMHMITTPLG--MPMKPGFAGVPIPGVFADVVDKDGNPV 454
                                           ****************************************9999..6************************** PP

                             TIGR02188 441 eeeeeggvLvikkpwPsmlrtiygdeerfvetYfkklkglyftGDgarrdkdGyiwilGRvDdvinvsGhrlg 513
                                            ++++ g+Lvik pwPsm+rt+y+++er+  +Y++++k++y  GD a++d+dGyi+ilGR Dd+i v+Gh+lg
  NCBI__GCF_003173355.1:WP_109969112.1 455 PAGQG-GLLVIKGPWPSMMRTVYNNDERYR-KYWTQIKDYYTVGDLAVKDDDGYIMILGRSDDIIIVAGHNLG 525
                                           **999.8**********************7.599*************************************** PP

                             TIGR02188 514 taeiesalvsheavaeaavvgvpdeikgeaivafvvlkegveedeeelekelkklvrkeigpiakpdkilvve 586
                                           tae+esalv+heavaeaav+gvpd+ikg+a+ afv+l +g+e+++ +l +el+ +vr +igpia p+ i++++
  NCBI__GCF_003173355.1:WP_109969112.1 526 TAEVESALVEHEAVAEAAVIGVPDDIKGQAVKAFVTLVQGYEPSQ-KLVSELTYHVRMSIGPIAMPNAIEFMD 597
                                           *********************************************.5************************** PP

                             TIGR02188 587 elPktRsGkimRRllrkiaegeellgdvstled 619
                                           +lPktRsGkimRRll++   g    gd+stle+
  NCBI__GCF_003173355.1:WP_109969112.1 598 KLPKTRSGKIMRRLLKAKEMGI-DPGDISTLEE 629
                                           ***************9866655.556****985 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (629 nodes)
Target sequences:                          1  (629 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.01u 0.01s 00:00:00.02 Elapsed: 00:00:00.01
# Mc/sec: 32.50
//
[ok]
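
To judge how much of the model and of the candidate the HMM alignment covers, the domain row above can be parsed directly. This is a rough sketch that assumes the whitespace-separated column layout shown in the domain table of this report. The function name and dictionary keys are illustrative and not part of HMMER or GapMind.

```python
# Domain row copied from the hmmsearch output above.
DOMAIN_ROW = (
    "   1 !  873.0   0.1  5.5e-267  5.5e-267       4     619 .."
    "      26     629 .]      23     629 .] 0.97"
)
MODEL_LEN = 629   # from "Query: TIGR02188 [M=629]"
SEQ_LEN = 629     # from "Target sequences: 1 (629 residues searched)"


def parse_domain_row(row):
    """Split one domain row into named fields (layout assumed from this report)."""
    f = row.split()
    return {
        "score": float(f[2]),
        "i_evalue": float(f[5]),
        "hmm_from": int(f[6]),
        "hmm_to": int(f[7]),
        "ali_from": int(f[9]),
        "ali_to": int(f[10]),
        "acc": float(f[-1]),
    }


def span_fraction(start, end, total_len):
    """Fraction of a sequence or model covered by an inclusive span."""
    return (end - start + 1) / total_len


dom = parse_domain_row(DOMAIN_ROW)
model_cov = span_fraction(dom["hmm_from"], dom["hmm_to"], MODEL_LEN)
seq_cov = span_fraction(dom["ali_from"], dom["ali_to"], SEQ_LEN)

print(f"model coverage: {model_cov:.1%}")
print(f"candidate coverage: {seq_cov:.1%}")
```

For this candidate the alignment spans model positions 4 to 619 of 629 and candidate residues 26 to 629 of 629, so both fractions are above 95%.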

This GapMind analysis is from Sep 24 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMER with enzyme models (usually from TIGRFAMs). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."
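
As a rough illustration of the 50% coverage floor, the sketch below computes percent identity and coverage for the BLAST alignment shown earlier on this page. It assumes that coverage is measured against the characterized protein, which this page does not state explicitly. It does not attempt the high- or medium-confidence criteria. The names used are illustrative and not GapMind's own code.

```python
import re

# Summary line copied from the BLAST alignment earlier on this page.
IDENTITY_LINE = (
    " Identities = 338/636 (53%), Positives = 452/636 (71%), Gaps = 9/636 (1%)"
)

# The only threshold stated in the text: hits need at least 50% coverage.
MIN_COVERAGE = 0.5


def parse_identities(line):
    """Return (identical, aligned_columns) from a BLAST 'Identities' line."""
    match = re.search(r"Identities = (\d+)/(\d+)", line)
    if match is None:
        raise ValueError("no Identities field found")
    return int(match.group(1)), int(match.group(2))


def coverage(start, end, length):
    """Fraction of a protein covered by an inclusive alignment span."""
    return (end - start + 1) / length


identical, aligned = parse_identities(IDENTITY_LINE)
pct_identity = 100 * identical / aligned

# The alignment spans residues 1 to 634 of the 634-residue characterized protein.
query_cov = coverage(1, 634, 634)
meets_floor = query_cov >= MIN_COVERAGE

print(f"identity: {pct_identity:.1f}%")
print(f"coverage of characterized protein: {query_cov:.0%}")
print(f"meets 50% coverage floor: {meets_floor}")
```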

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
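
The definition of a gap given above (a step with no high- or medium-confidence candidate) can be sketched in a few lines. The step names other than acs and all of the confidence labels below are made up for illustration. They are not results from this analysis, and the function is not part of GapMind.

```python
def find_gaps(step_candidates):
    """Return the steps that have no high- or medium-confidence candidate."""
    return [
        step
        for step, levels in step_candidates.items()
        if not any(level in ("high", "medium") for level in levels)
    ]


# Hypothetical input: each step maps to the confidence labels of its candidates.
example_steps = {
    "acs": ["high"],             # illustrative label only
    "stepB": ["low", "low"],     # only low-confidence candidates: a gap
    "stepC": [],                 # no candidates at all: a gap
    "stepD": ["medium", "low"],
}

gaps = find_gaps(example_steps)
print(gaps)
```

With this input the steps reported as gaps are stepB and stepC. Low-confidence candidates do not close a gap.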

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory