GapMind for catabolism of small carbon sources

 

Alignments for a candidate for acs in Thiohalospira halophila HL 3

Align Acetyl-coenzyme A synthetase (EC 6.2.1.1) (characterized)
to candidate WP_093426890.1 BM272_RS00970 acetate--CoA ligase

Query= reanno::pseudo5_N2C3_1:AO356_18695
         (651 letters)



>NCBI__GCF_900112605.1:WP_093426890.1
          Length = 646

 Score =  861 bits (2225), Expect = 0.0
 Identities = 418/648 (64%), Positives = 498/648 (76%), Gaps = 4/648 (0%)

Query: 1   MSAASLYPVRPEVAASTLTDEATYKAMYQQSVVNPDGFWREQAKR-LDWIKPFTTVKQTS 59
           MS   +YPV   VAA     +A Y+AMY++SV +P+GFW EQA R LDW + + T    S
Sbjct: 1   MSEHKVYPVPESVAARAHVTQAEYEAMYRRSVDDPEGFWAEQADRFLDWFEKWDTTLDWS 60

Query: 60  FDDHHVDIKWFADGTLNVSYNCLDRHLAERGDQIAIIWEGDDPSESRNITYRELHEEVCK 119
           F+   V ++WF  G LNV++NC+DRHLAE  ++ A+IWEGD+P + ++IT+RELHE+V +
Sbjct: 61  FEGD-VHVEWFKGGKLNVAHNCIDRHLAEHAEKTALIWEGDEPDQDQHITFRELHEQVSR 119

Query: 120 FANALRGQDVHRGDVVTIYMPMIPEAVVAMLACTRIGAIHSVVFGGFSPEALAGRIIDCK 179
             N L+ + V +GD V +YMPMIPEAV AMLAC RIGA+HSVVFGGFSPEAL  RI D +
Sbjct: 120 LGNVLKERGVSKGDRVCLYMPMIPEAVYAMLACARIGAVHSVVFGGFSPEALKDRIQDAE 179

Query: 180 SKVVITADEGVRAGKKIPLKANVDDALTNPETSSIQKVIVCKRTAGNIKWNQHRDIWYED 239
           + VVITADEGVR G+K+ LKAN D A+   +  S+  V+  +RT G+I WN  RD+WY +
Sbjct: 180 ASVVITADEGVRGGRKVGLKANTDKAVD--QCPSVHTVLTVRRTGGDIGWNDSRDVWYHE 237

Query: 240 LMKVAGTVCAPKEMGAEEALFILYTSGSTGKPKGVQHTTAGYLLYAALTHERVFDYKPGE 299
            ++ A   C  + M AE+ LFILYTSGSTGKPKGV HTT GYLLYAA+T    FD +P +
Sbjct: 238 AVEAASADCPAEPMDAEDPLFILYTSGSTGKPKGVLHTTGGYLLYAAITTWYTFDLQPDD 297

Query: 300 VYWCTADVGWVTGHSYIVYGPLANGATTLLFEGVPNYPDITRVAKVIDKHKVSILYTAPT 359
           VYWCTADVGW+TGHSY+VYGPLAN  T+++FEGVP+YPD +R  +V+DKH+VS+ YTAPT
Sbjct: 298 VYWCTADVGWITGHSYLVYGPLANATTSVVFEGVPSYPDASRFWEVVDKHQVSVFYTAPT 357

Query: 360 AIRAMMASGTAAVEGADGSSLRLLGSVGEPINPEAWDWYYKNVGKERCPIVDTWWQTETG 419
           AIRA+M  G   V   D SSLR+LG+VGEPINPEAW+WYY  VGKE+CPIVDTWWQTETG
Sbjct: 358 AIRALMREGDEPVTKTDRSSLRILGTVGEPINPEAWEWYYHVVGKEQCPIVDTWWQTETG 417

Query: 420 GVLISPLPGATALKPGSATRPFFGVVPALVDNLGNLIEGAAEGNLVILDSWPGQARTLYG 479
           G LI+PLPGAT LKPGSATRPFFG+VPAL+D  G+++EG  EG LVI   WPGQ RT+YG
Sbjct: 418 GHLITPLPGATELKPGSATRPFFGIVPALMDTDGHVVEGEGEGALVITRPWPGQMRTIYG 477

Query: 480 DHDRFVDTYFKTFSGMYFTGDGARRDEDGYYWITGRVDDVLNVSGHRMGTAEIESAMVAH 539
           +H RFV+TYF  F G YF+GDGARRD +G YWITGR+DDVLNVSGHRMGTAEIESA+V H
Sbjct: 478 NHQRFVETYFSAFKGCYFSGDGARRDGNGDYWITGRMDDVLNVSGHRMGTAEIESALVLH 537

Query: 540 PKVAEAAVVGVPHDIKGQGIYVYVTLNAGEETSEALRLELKNWVRKEIGPIASPDVIQWA 599
             VAEAAVVG PHD+KGQGIY YV L    E S+ALR EL N VR EIGPIASPD IQWA
Sbjct: 538 DAVAEAAVVGYPHDVKGQGIYAYVILVKDAEPSDALRKELVNLVRSEIGPIASPDAIQWA 597

Query: 600 PGLPKTRSGKIMRRILRKIATAEYDGLGDISTLADPGVVAHLIETHKT 647
           PGLPKTRSGKIMRRILRKIA  E D LGD STLADP VV  LI    T
Sbjct: 598 PGLPKTRSGKIMRRILRKIAANELDSLGDTSTLADPAVVDELINNRVT 645


Lambda     K      H
   0.318    0.135    0.418 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 1445
Number of extensions: 73
Number of successful extensions: 3
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 651
Length of database: 646
Length adjustment: 38
Effective length of query: 613
Effective length of database: 608
Effective search space:   372704
Effective search space used:   372704
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 54 (25.4 bits)
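
The summary figures in the BLAST report above can be cross-checked by hand. The following is a minimal sketch that uses only numbers printed in this report. The function names are illustrative and are not part of BLAST or GapMind, and the database-length formula is assumed to subtract the length adjustment once per database sequence (there is a single sequence here).

```python
def effective_search_space(query_len, db_len, length_adjustment, num_seqs=1):
    """Product of the length-adjusted query and database lengths.

    Assumption: the adjustment is subtracted once from the query and
    once per database sequence from the database length.
    """
    eff_query = query_len - length_adjustment
    eff_db = db_len - num_seqs * length_adjustment
    return eff_query * eff_db


def percent(numerator, denominator):
    """Plain percentage, left unrounded so the caller decides how to display it."""
    return 100.0 * numerator / denominator


# Values copied from the report: query 651 letters, database 646 letters,
# length adjustment 38, 418 identities and 498 positives over 648 columns.
space = effective_search_space(651, 646, 38)
identity = percent(418, 648)
positives = percent(498, 648)

print(space)                # effective search space
print(round(identity, 2))   # percent identity before display formatting
print(round(positives, 2))  # percent positives before display formatting
```

The report's 64% and 76% are consistent with truncating these percentages to whole numbers rather than rounding them.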

Align candidate WP_093426890.1 BM272_RS00970 (acetate--CoA ligase)
to HMM TIGR02188 (acs: acetate--CoA ligase (EC 6.2.1.1))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR02188.hmm
# target sequence database:        /tmp/gapView.362720.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR02188  [M=629]
Accession:   TIGR02188
Description: Ac_CoA_lig_AcsA: acetate--CoA ligase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                             Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                             -----------
          0 1026.7   0.0          0 1026.5   0.0    1.0  1  NCBI__GCF_900112605.1:WP_093426890.1  


Domain annotation for each sequence (and alignments):
>> NCBI__GCF_900112605.1:WP_093426890.1  
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 ! 1026.5   0.0         0         0       4     628 ..      21     641 ..      18     642 .. 0.98

  Alignments for each domain:
  == domain 1  score: 1026.5 bits;  conditional E-value: 0
                             TIGR02188   4 leeykelyeeaiedpekfwaklakeelewlkpfekvldeslep..kvkWfedgelnvsyncvdrhvekrkdkv 74 
                                           + ey+++y+++++dpe fwa++a + l+w+++++++ld+s+e   +v+Wf++g+lnv+ nc+drh++++++k+
  NCBI__GCF_900112605.1:WP_093426890.1  21 QAEYEAMYRRSVDDPEGFWAEQADRFLDWFEKWDTTLDWSFEGdvHVEWFKGGKLNVAHNCIDRHLAEHAEKT 93 
                                           679**************************************987899************************** PP

                             TIGR02188  75 aiiwegdeegedsrkltYaellrevcrlanvlkelGvkkgdrvaiYlpmipeaviamlacaRiGavhsvvfaG 147
                                           a+iwegde+++  +++t++el+++v+rl nvlke Gv kgdrv++Y+pmipeav+amlacaRiGavhsvvf+G
  NCBI__GCF_900112605.1:WP_093426890.1  94 ALIWEGDEPDQ-DQHITFRELHEQVSRLGNVLKERGVSKGDRVCLYMPMIPEAVYAMLACARIGAVHSVVFGG 165
                                           *********97.99*********************************************************** PP

                             TIGR02188 148 fsaealaeRivdaeaklvitadeglRggkvielkkivdealekaeesvekvlvvkrtgeevaewkegrDvwwe 220
                                           fs+eal++Ri+daea +vitadeg+Rgg+++ lk+++d+a+++++ sv++vl v+rtg ++  w+++rDvw++
  NCBI__GCF_900112605.1:WP_093426890.1 166 FSPEALKDRIQDAEASVVITADEGVRGGRKVGLKANTDKAVDQCP-SVHTVLTVRRTGGDIG-WNDSRDVWYH 236
                                           ********************************************9.7*************77.********** PP

                             TIGR02188 221 elvekeasaecepekldsedplfiLYtsGstGkPkGvlhttgGylllaaltvkyvfdikdedifwCtaDvGWv 293
                                           e+ve  asa+c++e++d+edplfiLYtsGstGkPkGvlhttgGyll+aa+t+ y+fd++++d++wCtaDvGW+
  NCBI__GCF_900112605.1:WP_093426890.1 237 EAVEA-ASADCPAEPMDAEDPLFILYTSGSTGKPKGVLHTTGGYLLYAAITTWYTFDLQPDDVYWCTADVGWI 308
                                           ****5.******************************************************************* PP

                             TIGR02188 294 tGhsYivygPLanGattllfegvptypdasrfweviekykvtifYtaPtaiRalmklgeelvkkhdlsslrvl 366
                                           tGhsY+vygPLan  t+++fegvp+ypdasrfwev++k++v++fYtaPtaiRalm++g+e v+k+d+sslr+l
  NCBI__GCF_900112605.1:WP_093426890.1 309 TGHSYLVYGPLANATTSVVFEGVPSYPDASRFWEVVDKHQVSVFYTAPTAIRALMREGDEPVTKTDRSSLRIL 381
                                           ************************************************************************* PP

                             TIGR02188 367 gsvGepinpeaweWyyevvGkekcpivdtwWqtetGgilitplpgvatelkpgsatlPlfGieaevvdeegke 439
                                           g+vGepinpeaweWyy+vvGke+cpivdtwWqtetGg+litplpg atelkpgsat+P+fGi ++++d++g+ 
  NCBI__GCF_900112605.1:WP_093426890.1 382 GTVGEPINPEAWEWYYHVVGKEQCPIVDTWWQTETGGHLITPLPG-ATELKPGSATRPFFGIVPALMDTDGHV 453
                                           *********************************************.6************************** PP

                             TIGR02188 440 veeeeeggvLvikkpwPsmlrtiygdeerfvetYfkklkglyftGDgarrdkdGyiwilGRvDdvinvsGhrl 512
                                           ve e e g+Lvi++pwP+++rtiyg+++rfvetYf+++kg+yf+GDgarrd +G++wi+GR+Ddv+nvsGhr+
  NCBI__GCF_900112605.1:WP_093426890.1 454 VEGEGE-GALVITRPWPGQMRTIYGNHQRFVETYFSAFKGCYFSGDGARRDGNGDYWITGRMDDVLNVSGHRM 525
                                           *98777.8***************************************************************** PP

                             TIGR02188 513 gtaeiesalvsheavaeaavvgvpdeikgeaivafvvlkegveedeeelekelkklvrkeigpiakpdkilvv 585
                                           gtaeiesalv h+avaeaavvg+p+++kg+ i+a+v+l + +e+++ +l+kel +lvr+eigpia+pd i+++
  NCBI__GCF_900112605.1:WP_093426890.1 526 GTAEIESALVLHDAVAEAAVVGYPHDVKGQGIYAYVILVKDAEPSD-ALRKELVNLVRSEIGPIASPDAIQWA 597
                                           **********************************************.5************************* PP

                             TIGR02188 586 eelPktRsGkimRRllrkiaege.ellgdvstledpsvveelke 628
                                           + lPktRsGkimRR+lrkia++e ++lgd+stl+dp+vv+el++
  NCBI__GCF_900112605.1:WP_093426890.1 598 PGLPKTRSGKIMRRILRKIAANElDSLGDTSTLADPAVVDELIN 641
                                           *****************************************986 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (629 nodes)
Target sequences:                          1  (646 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.01u 0.00s 00:00:00.01 Elapsed: 00:00:00.01
# Mc/sec: 22.26
//
[ok]
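
The domain row in the hmmsearch output above gives the aligned ranges on both the model and the candidate, which is enough to estimate how much of each is covered. This is a rough sketch, assuming the whitespace-separated column layout shown in this report; `parse_domain_row` is a hypothetical helper, not a HMMER function.

```python
# The single domain row from the "Domain annotation" table, copied verbatim.
domain_line = ("   1 ! 1026.5   0.0         0         0       4     628 .."
               "      21     641 ..      18     642 .. 0.98")


def parse_domain_row(line):
    """Split one domain row into named fields (positions as in this report)."""
    f = line.split()
    return {
        "score": float(f[2]),
        "hmm_from": int(f[6]),
        "hmm_to": int(f[7]),
        "ali_from": int(f[9]),
        "ali_to": int(f[10]),
        "env_from": int(f[12]),
        "env_to": int(f[13]),
        "acc": float(f[15]),
    }


row = parse_domain_row(domain_line)

MODEL_LENGTH = 629      # from "Query: TIGR02188 [M=629]"
SEQUENCE_LENGTH = 646   # from "Target sequences: 1 (646 residues searched)"

# Fraction of the model and of the candidate spanned by the alignment.
model_coverage = (row["hmm_to"] - row["hmm_from"] + 1) / MODEL_LENGTH
sequence_coverage = (row["ali_to"] - row["ali_from"] + 1) / SEQUENCE_LENGTH

print(round(model_coverage, 3), round(sequence_coverage, 3))
```

Here the alignment spans model positions 4 to 628 of 629 and candidate positions 21 to 641 of 646, so nearly the whole model is matched.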

This GapMind analysis is from Sep 24 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMER with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."
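
The 50% coverage cutoff for low-confidence hits can be illustrated with a short sketch. It assumes coverage means the fraction of the characterized protein's length spanned by the alignment; how GapMind computes coverage internally is not stated here, and the function names are hypothetical.

```python
def alignment_coverage(start, end, full_length):
    """Fraction of a protein spanned by an alignment (1-based, inclusive)."""
    return (end - start + 1) / full_length


def meets_low_confidence_coverage(start, end, full_length, min_coverage=0.5):
    """True if the alignment reaches the 50% coverage cutoff quoted above."""
    return alignment_coverage(start, end, full_length) >= min_coverage


# The BLAST alignment in this report spans query positions 1 to 647
# of a 651-letter characterized protein.
coverage = alignment_coverage(1, 647, 651)
print(round(coverage, 3), meets_low_confidence_coverage(1, 647, 651))
```

This check covers only the coverage cutoff; it does not decide between high, medium, and low confidence.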

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
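
The definition of a gap given earlier (a step with no high- or medium-confidence candidates) can be expressed as a small filter. This is a sketch only: the data layout and the step names other than `acs` are made up for illustration and do not reflect GapMind's own data structures.

```python
def find_gaps(step_candidates):
    """Return steps that have no high- or medium-confidence candidate.

    step_candidates maps a step name to the list of confidence labels
    ("high", "medium", or "low") of its candidates.
    """
    return [
        step
        for step, labels in step_candidates.items()
        if not any(label in ("high", "medium") for label in labels)
    ]


# Hypothetical pathway: "step_a" and "step_b" are placeholder step names.
example = {
    "acs": ["high"],
    "step_a": ["low"],
    "step_b": [],
}

print(find_gaps(example))
```

A step with only low-confidence candidates and a step with no candidates at all are both reported as gaps.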

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory