GapMind for catabolism of small carbon sources

 

Alignments for a candidate for acs in Algoriphagus machipongonensis PR1

Align Acetyl-coenzyme A synthetase (EC 6.2.1.1) (characterized)
to candidate WP_008198284.1 ALPR1_RS02860 acetate--CoA ligase

Query= reanno::pseudo5_N2C3_1:AO356_18695
         (651 letters)



>NCBI__GCF_000166275.1:WP_008198284.1
          Length = 630

 Score =  722 bits (1864), Expect = 0.0
 Identities = 358/625 (57%), Positives = 443/625 (70%), Gaps = 7/625 (1%)

Query: 24  YKAMYQQSVVNPDGFWREQAKRLDWIKPFTTVKQTSFDDHHVDIKWFADGTLNVSYNCLD 83
           Y   YQ+SV  P+ FW   A    W K +    + +F+    D+KWF +G LN++ N  +
Sbjct: 11  YLHEYQKSVAQPEEFWARIADSFHWRKRWDKTLKWNFEGP--DVKWFLNGKLNITENIFE 68

Query: 84  RHLAERGDQIAIIWEGDDPSES-RNITYRELHEEVCKFANALRGQDVHRGDVVTIYMPMI 142
           R+L   GD+ AIIWE +DP+E+ R +TYREL EEV +F+NAL+ + + +GD V IYMPM+
Sbjct: 69  RYLFTIGDRPAIIWEPNDPNEAGRTLTYRELFEEVSRFSNALKSKGIGKGDKVIIYMPMV 128

Query: 143 PEAVVAMLACTRIGAIHSVVFGGFSPEALAGRIIDCKSKVVITADEGVRAGKKIPLKANV 202
           PEA VAMLAC RIGAIHSVVF GFS  ALA RI DC++K V+T+D   R  KKI +KA V
Sbjct: 129 PEAAVAMLACARIGAIHSVVFAGFSSNALADRINDCEAKAVLTSDGNFRGSKKIAVKAVV 188

Query: 203 DDALTNPETSSIQKVIVCKRTAGNIKWNQHRDIWYEDLMKVAGTVCAPKEMGAEEALFIL 262
           D+ALT    SS++ VIV +RT  ++     RD W+ D++      C  +EM +E+ LFIL
Sbjct: 189 DEALTK---SSVETVIVYQRTHQDVTMQDGRDYWWHDVVADESKDCPAEEMDSEDMLFIL 245

Query: 263 YTSGSTGKPKGVQHTTAGYLLYAALTHERVFDYKPGEVYWCTADVGWVTGHSYIVYGPLA 322
           YTSGSTGKPKGV HTT GY++Y+  + E VF Y PG+VYWCTADVGW+TGHSYIVYGPL 
Sbjct: 246 YTSGSTGKPKGVVHTTGGYMVYSKYSFENVFQYSPGDVYWCTADVGWITGHSYIVYGPLL 305

Query: 323 NGATTLLFEGVPNYPDITRVAKVIDKHKVSILYTAPTAIRAMMASGTAAVEGADGSSLRL 382
            GATT++FEGVP +PD  R   +++K+KV+  YTAPTAIRA+ A GT  +E  D SSL++
Sbjct: 306 AGATTIMFEGVPTFPDCGRFWAIVEKYKVNQFYTAPTAIRALQAYGTVEIEKYDLSSLKV 365

Query: 383 LGSVGEPINPEAWDWYYKNVGKERCPIVDTWWQTETGGVLISPLPGATALKPGSATRPFF 442
           LGSVGEPIN EAW WY+ ++GK +CPIVDTWWQTETGG+++SP+ G T  KP  AT P  
Sbjct: 366 LGSVGEPINEEAWHWYHTHIGKNKCPIVDTWWQTETGGIMVSPIAGITPTKPAYATLPLP 425

Query: 443 GVVPALVDNLGNLIEG-AAEGNLVILDSWPGQARTLYGDHDRFVDTYFKTFSGMYFTGDG 501
           GV   +VD  GN + G + EGNL I   WP   RT YGDHDR   TYF T+ GMYFTGDG
Sbjct: 426 GVQLCIVDPEGNELTGNSVEGNLCIKFPWPSMIRTTYGDHDRCKQTYFSTYKGMYFTGDG 485

Query: 502 ARRDEDGYYWITGRVDDVLNVSGHRMGTAEIESAMVAHPKVAEAAVVGVPHDIKGQGIYV 561
            +RD DGYY I GRVDDV+NVSGHRMGTAE+E+A+  HPKV E+AVVG PHD+KGQGIY 
Sbjct: 486 VKRDHDGYYRILGRVDDVINVSGHRMGTAEVENAINEHPKVIESAVVGYPHDVKGQGIYA 545

Query: 562 YVTLNAGEETSEALRLELKNWVRKEIGPIASPDVIQWAPGLPKTRSGKIMRRILRKIATA 621
           YV  +    T + L  E+K  V K IGPIA PD IQ  PGLPKTRSGKIMRRILRK+A  
Sbjct: 546 YVICDLTNRTEDNLVNEIKEMVSKIIGPIAKPDKIQLVPGLPKTRSGKIMRRILRKVAEN 605

Query: 622 EYDGLGDISTLADPGVVAHLIETHK 646
             D +GD STL DP VV  +I+  K
Sbjct: 606 NLDNMGDTSTLLDPDVVEKIIDGRK 630


Lambda     K      H
   0.318    0.135    0.418 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 1272
Number of extensions: 68
Number of successful extensions: 5
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 651
Length of database: 630
Length adjustment: 38
Effective length of query: 613
Effective length of database: 592
Effective search space:   362896
Effective search space used:   362896
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 54 (25.4 bits)
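
The summary figures in the BLAST report above can be cross-checked by hand. The Python sketch below is illustrative only and is not part of GapMind. It recomputes the effective search space from the reported lengths and length adjustment, then converts the raw score (1864) to bits using the standard Karlin-Altschul relation with the gapped Lambda and K printed above. The variable names are hypothetical. Small differences from the printed values are expected because Lambda and K are rounded in the report.

```python
import math

# Values copied from the BLAST report above.
query_len = 651          # "Length of query"
subject_len = 630        # "Length of database" (a single sequence here)
length_adjustment = 38   # "Length adjustment"
raw_score = 1864         # raw score shown in parentheses after the bit score
gapped_lambda = 0.267    # gapped Lambda
gapped_k = 0.0410        # gapped K

# Effective lengths and search space.
eff_query = query_len - length_adjustment
eff_db = subject_len - length_adjustment
search_space = eff_query * eff_db

# Karlin-Altschul conversion from raw score to bit score.
bit_score = (gapped_lambda * raw_score - math.log(gapped_k)) / math.log(2)

# Expected number of chance hits with at least this score.
e_value = search_space * 2.0 ** (-bit_score)

# Percent identity and positives over the 625 aligned columns,
# truncated to whole percent (443/625 is 70.88%, reported as 70%).
identity_pct = int(100 * 358 / 625)
positive_pct = int(100 * 443 / 625)

print(eff_query, eff_db, search_space)
print(round(bit_score, 1), e_value)
print(identity_pct, positive_pct)
```

The recomputed search space matches the reported 362896, and the bit score lands within about one bit of the reported 722. The E-value is far below anything the report prints with an exponent, which fits the displayed "Expect = 0.0".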

Align candidate WP_008198284.1 ALPR1_RS02860 (acetate--CoA ligase)
to HMM TIGR02188 (acs: acetate--CoA ligase (EC 6.2.1.1))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR02188.hmm
# target sequence database:        /tmp/gapView.2137935.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR02188  [M=629]
Accession:   TIGR02188
Description: Ac_CoA_lig_AcsA: acetate--CoA ligase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                             Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                             -----------
   1.7e-294  963.8   0.8     2e-294  963.6   0.8    1.0  1  NCBI__GCF_000166275.1:WP_008198284.1  


Domain annotation for each sequence (and alignments):
>> NCBI__GCF_000166275.1:WP_008198284.1  
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  963.6   0.8    2e-294    2e-294       4     627 ..       8     626 ..       5     628 .. 0.98

  Alignments for each domain:
  == domain 1  score: 963.6 bits;  conditional E-value: 2e-294
                             TIGR02188   4 leeykelyeeaiedpekfwaklakeelewlkpfekvldeslep.kvkWfedgelnvsyncvdrhvekrkdkva 75 
                                           l+ y + y++++++pe+fwa+ a  +++w k ++k+l++++e  +vkWf +g+ln++ n  +r++ +  d+ a
  NCBI__GCF_000166275.1:WP_008198284.1   8 LSGYLHEYQKSVAQPEEFWARIAD-SFHWRKRWDKTLKWNFEGpDVKWFLNGKLNITENIFERYLFTIGDRPA 79 
                                           6678899****************9.6***************988***************************** PP

                             TIGR02188  76 iiwegdeegedsrkltYaellrevcrlanvlkelGvkkgdrvaiYlpmipeaviamlacaRiGavhsvvfaGf 148
                                           iiwe ++++e  r+ltY+el++ev+r++n+lk+ G+ kgd+v+iY+pm+pea++amlacaRiGa+hsvvfaGf
  NCBI__GCF_000166275.1:WP_008198284.1  80 IIWEPNDPNEAGRTLTYRELFEEVSRFSNALKSKGIGKGDKVIIYMPMVPEAAVAMLACARIGAIHSVVFAGF 152
                                           ************************************************************************* PP

                             TIGR02188 149 saealaeRivdaeaklvitadeglRggkvielkkivdealekaeesvekvlvvkrtgeevaewkegrDvwwee 221
                                           s++ala+Ri+d+eak v+t+d+++Rg k+i++k++vdeal+k+  sve+v+v++rt ++v+ +++grD+ww++
  NCBI__GCF_000166275.1:WP_008198284.1 153 SSNALADRINDCEAKAVLTSDGNFRGSKKIAVKAVVDEALTKS--SVETVIVYQRTHQDVT-MQDGRDYWWHD 222
                                           *****************************************98..5*************76.*********** PP

                             TIGR02188 222 lvekeasaecepekldsedplfiLYtsGstGkPkGvlhttgGylllaaltvkyvfdikdedifwCtaDvGWvt 294
                                           +v++ +s++c++e++dsed+lfiLYtsGstGkPkGv+httgGy+++ +++++ vf++ ++d++wCtaDvGW+t
  NCBI__GCF_000166275.1:WP_008198284.1 223 VVAD-ESKDCPAEEMDSEDMLFILYTSGSTGKPKGVVHTTGGYMVYSKYSFENVFQYSPGDVYWCTADVGWIT 294
                                           ***7.******************************************************************** PP

                             TIGR02188 295 GhsYivygPLanGattllfegvptypdasrfweviekykvtifYtaPtaiRalmklgeelvkkhdlsslrvlg 367
                                           GhsYivygPL++Gatt++fegvpt+pd +rfw+++ekykv++fYtaPtaiRal+++g+  ++k+dlssl+vlg
  NCBI__GCF_000166275.1:WP_008198284.1 295 GHSYIVYGPLLAGATTIMFEGVPTFPDCGRFWAIVEKYKVNQFYTAPTAIRALQAYGTVEIEKYDLSSLKVLG 367
                                           ************************************************************************* PP

                             TIGR02188 368 svGepinpeaweWyyevvGkekcpivdtwWqtetGgilitplpgvatelkpgsatlPlfGieaevvdeegkev 440
                                           svGepin eaw+Wy++++Gk+kcpivdtwWqtetGgi+++p++g +t++kp+ atlPl+G++  +vd eg+e+
  NCBI__GCF_000166275.1:WP_008198284.1 368 SVGEPINEEAWHWYHTHIGKNKCPIVDTWWQTETGGIMVSPIAG-ITPTKPAYATLPLPGVQLCIVDPEGNEL 439
                                           ********************************************.5*************************** PP

                             TIGR02188 441 eeeeeggvLvikkpwPsmlrtiygdeerfvetYfkklkglyftGDgarrdkdGyiwilGRvDdvinvsGhrlg 513
                                           + ++ +g L+ik pwPsm+rt ygd++r  +tYf+++kg+yftGDg++rd+dGy+ ilGRvDdvinvsGhr+g
  NCBI__GCF_000166275.1:WP_008198284.1 440 TGNSVEGNLCIKFPWPSMIRTTYGDHDRCKQTYFSTYKGMYFTGDGVKRDHDGYYRILGRVDDVINVSGHRMG 512
                                           *777779****************************************************************** PP

                             TIGR02188 514 taeiesalvsheavaeaavvgvpdeikgeaivafvvlkegveedeeelekelkklvrkeigpiakpdkilvve 586
                                           tae+e+a+ +h++v e+avvg+p+++kg+ i+a+v+   ++++++ +l +e+k++v+k igpiakpdki++v+
  NCBI__GCF_000166275.1:WP_008198284.1 513 TAEVENAINEHPKVIESAVVGYPHDVKGQGIYAYVICDLTNRTED-NLVNEIKEMVSKIIGPIAKPDKIQLVP 584
                                           *************************************99999888.5************************** PP

                             TIGR02188 587 elPktRsGkimRRllrkiaege.ellgdvstledpsvveelk 627
                                            lPktRsGkimRR+lrk+ae++ +++gd+stl dp vve+++
  NCBI__GCF_000166275.1:WP_008198284.1 585 GLPKTRSGKIMRRILRKVAENNlDNMGDTSTLLDPDVVEKII 626
                                           ****************************************97 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (629 nodes)
Target sequences:                          1  (630 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.01u 0.01s 00:00:00.02 Elapsed: 00:00:00.01
# Mc/sec: 20.10
//
[ok]
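
How much of the model and of the candidate the single domain hit spans can be read off the domain table above. The following Python sketch is a hedged illustration and not GapMind's own code. It splits the domain row on whitespace, assuming the column order shown in the hmmsearch header (score, bias, c-Evalue, i-Evalue, hmmfrom, hmm to, alifrom, ali to, envfrom, env to, acc). The function names are hypothetical.

```python
# Domain row copied from the hmmsearch output above.
domain_row = ("   1 !  963.6   0.8    2e-294    2e-294       4     627 .."
              "       8     626 ..       5     628 .. 0.98")
model_len = 629   # "[M=629]" on the Query line
seq_len = 630     # "630 residues searched"


def parse_domain_row(row):
    """Split one hmmsearch domain row into named fields."""
    f = row.split()
    return {
        "score": float(f[2]),
        "i_evalue": float(f[5]),
        "hmm_from": int(f[6]),
        "hmm_to": int(f[7]),
        "ali_from": int(f[9]),
        "ali_to": int(f[10]),
        "acc": float(f[15]),
    }


def coverage(start, end, total):
    """Fraction of `total` positions spanned by the inclusive range start..end."""
    return (end - start + 1) / total


dom = parse_domain_row(domain_row)
hmm_cov = coverage(dom["hmm_from"], dom["hmm_to"], model_len)
seq_cov = coverage(dom["ali_from"], dom["ali_to"], seq_len)
print(f"model coverage: {hmm_cov:.3f}, sequence coverage: {seq_cov:.3f}")
```

For this candidate the alignment spans model positions 4 to 627 of 629 and sequence positions 8 to 626 of 630, so both coverages are close to 1.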

This GapMind analysis is from Sep 24 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
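
To make the term concrete, a six-frame translation reads the DNA in three offsets on each strand, which can reveal proteins that gene prediction missed. The Python sketch below is a minimal illustration under stated assumptions. It is not the code used by GapMind or Curated BLAST, it uses the standard genetic code, and the function names are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons; unknown codons become 'X', stops become '*'."""
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))


def six_frames(dna):
    """Return the six reading-frame translations keyed '+1'..'+3', '-1'..'-3'."""
    dna = dna.upper()
    rc = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


# Tiny usage example on a 9-base sequence.
frames = six_frames("ATGGCCTAA")
for name in sorted(frames):
    print(name, frames[name])
```

A real search would then compare each translated frame against the curated proteins, which is the part that Curated BLAST handles.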

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory