GapMind for catabolism of small carbon sources

 

Alignments for a candidate for acs in Pedobacter arcticus A12

Align Acetyl-coenzyme A synthetase (EC 6.2.1.1) (characterized)
to candidate WP_017257106.1 B176_RS0102045 acetate--CoA ligase

Query= reanno::pseudo5_N2C3_1:AO356_18695
         (651 letters)



>NCBI__GCF_000302595.1:WP_017257106.1
          Length = 635

 Score =  711 bits (1835), Expect = 0.0
 Identities = 359/626 (57%), Positives = 442/626 (70%), Gaps = 10/626 (1%)

Query: 24  YKAMYQQSVVNPDGFWREQAKRLDWIKPFTTVKQTSFDDHHVDIKWFADGTLNVSYNCLD 83
           YK  YQ+SV  P+ FW + A    W K +  V   +F +  ++  WF    LN++ NC+D
Sbjct: 11  YKEAYQRSVEQPEAFWADIADNFQWKKKWDKVLDWNFKEPKIE--WFKGAKLNITENCID 68

Query: 84  RHLAERGDQIAIIWEGDDPSES-RNITYRELHEEVCKFANALRGQDVHRGDVVTIYMPMI 142
           RHLA++GDQ AIIWE +DP+E  R +TY++LHE+VC FAN L+  DV +GD V IYMPMI
Sbjct: 69  RHLADKGDQPAIIWEANDPNEHHRVLTYKQLHEKVCLFANVLKNNDVKKGDRVCIYMPMI 128

Query: 143 PEAVVAMLACTRIGAIHSVVFGGFSPEALAGRIIDCKSKVVITADEGVRAGKKIPLKANV 202
           PE  +A+LAC RIGAIHSVVFGGFS +++A RI D + ++VITAD G R  K IPLK  +
Sbjct: 129 PELAIAVLACARIGAIHSVVFGGFSAQSIADRINDAQCEMVITADGGFRGPKDIPLKNVI 188

Query: 203 DDALTNPETSSIQKVIVCKRTAGNIKWNQHRDIWYEDLMKVAGTV----CAPKEMGAEEA 258
           DDAL   +  S++ VIV  RT   I   + RD W++D +    T+    C  +EM AE+ 
Sbjct: 189 DDALV--QCPSVKTVIVLTRTRTPISMIKGRDKWWQDEIHKVETLGMIDCPAEEMDAEDP 246

Query: 259 LFILYTSGSTGKPKGVQHTTAGYLLYAALTHERVFDYKPGEVYWCTADVGWVTGHSYIVY 318
           LFILYTSGSTGKPKGV HTTAGY++Y A T + VF Y+P +VY+CTAD+GW+TGHSYI+Y
Sbjct: 247 LFILYTSGSTGKPKGVVHTTAGYMIYTAYTFQNVFQYQPQDVYFCTADIGWITGHSYIIY 306

Query: 319 GPLANGATTLLFEGVPNYPDITRVAKVIDKHKVSILYTAPTAIRAMMASGTAAVEGADGS 378
           GPLA GATTL+FEGVP YPD  R   ++DK KV+ LYTAPTAIR++M SG   V+  D S
Sbjct: 307 GPLAQGATTLMFEGVPTYPDAGRFWDIVDKFKVNTLYTAPTAIRSLMQSGLDYVKDKDLS 366

Query: 379 SLRLLGSVGEPINPEAWDWYYKNVGKERCPIVDTWWQTETGGVLISPLPGATALKPGSAT 438
           SL++LGSVGEPIN EAW WY  N+GK +CPIVDTWWQTE GG+LISP+   T  KP  AT
Sbjct: 367 SLKVLGSVGEPINEEAWHWYNDNIGKGKCPIVDTWWQTENGGILISPIANVTPTKPCYAT 426

Query: 439 RPFFGVVPALVDNLGNLIEG-AAEGNLVILDSWPGQARTLYGDHDRFVDTYFKTFSGMYF 497
            P  GV P LVD  G +IEG    GNL I   WPG  RT YGDH+R   TYF T+  MYF
Sbjct: 427 LPLPGVQPVLVDENGAVIEGNGVSGNLCIKFPWPGMLRTTYGDHERCKLTYFSTYEDMYF 486

Query: 498 TGDGARRDEDGYYWITGRVDDVLNVSGHRMGTAEIESAMVAHPKVAEAAVVGVPHDIKGQ 557
           TGDG  RDEDGYY ITGRVDDV+NVSGHR+GTAE+E+A+     V E+AVVG PH+IKGQ
Sbjct: 487 TGDGCLRDEDGYYRITGRVDDVINVSGHRIGTAEVENAINMFTDVVESAVVGYPHEIKGQ 546

Query: 558 GIYVYVTLNAGEETSEALRLELKNWVRKEIGPIASPDVIQWAPGLPKTRSGKIMRRILRK 617
           GIY YV L+   E  E  + ++   V + IG IA PD IQ+  GLPKTRSGKIMRRILRK
Sbjct: 547 GIYAYVILDKESEDVELTKKDIAMTVSRIIGAIARPDKIQFVTGLPKTRSGKIMRRILRK 606

Query: 618 IATAEYDGLGDISTLADPGVVAHLIE 643
           IA  +   +GD+STL DP VV  +I+
Sbjct: 607 IAEGDMKNVGDVSTLLDPAVVEEIIK 632


Lambda     K      H
   0.318    0.135    0.418 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 1279
Number of extensions: 62
Number of successful extensions: 6
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 651
Length of database: 635
Length adjustment: 38
Effective length of query: 613
Effective length of database: 597
Effective search space:   365961
Effective search space used:   365961
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 54 (25.4 bits)
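
The summary figures in the BLAST report above can be re-derived with a short script. This is a minimal sketch, not part of GapMind: the helper name `parse_summary` is hypothetical, the input values are copied by hand from the report, and the search-space formula (effective query length times effective database length) is assumed to apply because the database here holds a single sequence.

```python
import re

# Summary line copied from the BLAST report above.
SUMMARY = "Identities = 359/626 (57%), Positives = 442/626 (70%), Gaps = 10/626 (1%)"


def parse_summary(line):
    """Return {field name: (count, alignment length)} for each 'Name = a/b' field."""
    return {name: (int(a), int(b))
            for name, a, b in re.findall(r"(\w+) = (\d+)/(\d+)", line)}


stats = parse_summary(SUMMARY)
identical, aln_len = stats["Identities"]
percent_identity = 100.0 * identical / aln_len

# Query coverage: the alignment spans query positions 24..643 of 651 letters.
query_start, query_end, query_len = 24, 643, 651
query_coverage = (query_end - query_start + 1) / query_len

# Effective lengths are the raw lengths minus the reported length adjustment (38).
# Assumption: with one database sequence, the effective search space is their product.
length_adjustment = 38
effective_space = (651 - length_adjustment) * (635 - length_adjustment)

print(f"identity: {percent_identity:.1f}%")
print(f"query coverage: {query_coverage:.3f}")
print(f"effective search space: {effective_space}")
```

The computed search space agrees with the value printed in the report (365961), and the identity rounds to the 57% shown on the Identities line.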

Align candidate WP_017257106.1 B176_RS0102045 (acetate--CoA ligase)
to HMM TIGR02188 (acs: acetate--CoA ligase (EC 6.2.1.1))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR02188.hmm
# target sequence database:        /tmp/gapView.956997.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR02188  [M=629]
Accession:   TIGR02188
Description: Ac_CoA_lig_AcsA: acetate--CoA ligase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                             Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                             -----------
   1.1e-293  961.2   0.3   1.2e-293  961.0   0.3    1.0  1  NCBI__GCF_000302595.1:WP_017257106.1  


Domain annotation for each sequence (and alignments):
>> NCBI__GCF_000302595.1:WP_017257106.1  
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  961.0   0.3  1.2e-293  1.2e-293       2     628 ..       6     632 ..       5     633 .. 0.97

  Alignments for each domain:
  == domain 1  score: 961.0 bits;  conditional E-value: 1.2e-293
                             TIGR02188   2 aeleeykelyeeaiedpekfwaklakeelewlkpfekvldeslep.kvkWfedgelnvsyncvdrhvekrkdk 73 
                                           ++ +eyke y++++e+pe+fwa+ a  +++w+k+++kvld+++++ k++Wf++++ln++ nc+drh++++ d+
  NCBI__GCF_000302595.1:WP_017257106.1   6 SSFDEYKEAYQRSVEQPEAFWADIAD-NFQWKKKWDKVLDWNFKEpKIEWFKGAKLNITENCIDRHLADKGDQ 77 
                                           6789*********************9.6**************9988*************************** PP

                             TIGR02188  74 vaiiwegdeegedsrkltYaellrevcrlanvlkelGvkkgdrvaiYlpmipeaviamlacaRiGavhsvvfa 146
                                            aiiwe+++++e++r ltY++l+++vc +anvlk+  vkkgdrv+iY+pmipe++ia+lacaRiGa+hsvvf+
  NCBI__GCF_000302595.1:WP_017257106.1  78 PAIIWEANDPNEHHRVLTYKQLHEKVCLFANVLKNNDVKKGDRVCIYMPMIPELAIAVLACARIGAIHSVVFG 150
                                           ************************************************************************* PP

                             TIGR02188 147 GfsaealaeRivdaeaklvitadeglRggkvielkkivdealekaeesvekvlvvkrtgeevaewkegrDvww 219
                                           Gfsa+++a+Ri+da++++vitad+g+Rg k i+lk+++d+al +++ sv++v+v+ rt ++++ + +grD+ww
  NCBI__GCF_000302595.1:WP_017257106.1 151 GFSAQSIADRINDAQCEMVITADGGFRGPKDIPLKNVIDDALVQCP-SVKTVIVLTRTRTPIS-MIKGRDKWW 221
                                           *********************************************9.7*************76.********* PP

                             TIGR02188 220 eelvek...easaecepekldsedplfiLYtsGstGkPkGvlhttgGylllaaltvkyvfdikdedifwCtaD 289
                                           +++++k       +c++e++d+edplfiLYtsGstGkPkGv+htt+Gy++++a+t++ vf+++++d+++CtaD
  NCBI__GCF_000302595.1:WP_017257106.1 222 QDEIHKvetLGMIDCPAEEMDAEDPLFILYTSGSTGKPKGVVHTTAGYMIYTAYTFQNVFQYQPQDVYFCTAD 294
                                           **999832123468*********************************************************** PP

                             TIGR02188 290 vGWvtGhsYivygPLanGattllfegvptypdasrfweviekykvtifYtaPtaiRalmklgeelvkkhdlss 362
                                           +GW+tGhsYi+ygPLa+Gattl+fegvptypda+rfw++++k+kv+++YtaPtaiR+lm+ g + vk +dlss
  NCBI__GCF_000302595.1:WP_017257106.1 295 IGWITGHSYIIYGPLAQGATTLMFEGVPTYPDAGRFWDIVDKFKVNTLYTAPTAIRSLMQSGLDYVKDKDLSS 367
                                           ************************************************************************* PP

                             TIGR02188 363 lrvlgsvGepinpeaweWyyevvGkekcpivdtwWqtetGgilitplpgvatelkpgsatlPlfGieaevvde 435
                                           l+vlgsvGepin eaw+Wy++++Gk+kcpivdtwWqte Ggili+p++  +t++kp  atlPl+G+++++vde
  NCBI__GCF_000302595.1:WP_017257106.1 368 LKVLGSVGEPINEEAWHWYNDNIGKGKCPIVDTWWQTENGGILISPIAN-VTPTKPCYATLPLPGVQPVLVDE 439
                                           *************************************************.6********************** PP

                             TIGR02188 436 egkeveeeeeggvLvikkpwPsmlrtiygdeerfvetYfkklkglyftGDgarrdkdGyiwilGRvDdvinvs 508
                                           +g  +e +  +g L+ik pwP+mlrt ygd+er   tYf++++++yftGDg+ rd+dGy+ i+GRvDdvinvs
  NCBI__GCF_000302595.1:WP_017257106.1 440 NGAVIEGNGVSGNLCIKFPWPGMLRTTYGDHERCKLTYFSTYEDMYFTGDGCLRDEDGYYRITGRVDDVINVS 512
                                           ******555559************************************************************* PP

                             TIGR02188 509 Ghrlgtaeiesalvsheavaeaavvgvpdeikgeaivafvvlkegveedeeelekelkklvrkeigpiakpdk 581
                                           Ghr+gtae+e+a+   ++v e+avvg+p+eikg+ i+a+v+l +++e  e  ++k++ ++v++ ig+ia+pdk
  NCBI__GCF_000302595.1:WP_017257106.1 513 GHRIGTAEVENAINMFTDVVESAVVGYPHEIKGQGIYAYVILDKESEDVE-LTKKDIAMTVSRIIGAIARPDK 584
                                           ******************************************99888666.69******************** PP

                             TIGR02188 582 ilvveelPktRsGkimRRllrkiaege.ellgdvstledpsvveelke 628
                                           i++v+ lPktRsGkimRR+lrkiaeg+ +++gdvstl dp+vvee+++
  NCBI__GCF_000302595.1:WP_017257106.1 585 IQFVTGLPKTRSGKIMRRILRKIAEGDmKNVGDVSTLLDPAVVEEIIK 632
                                           ***************************9*****************986 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (629 nodes)
Target sequences:                          1  (635 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.01u 0.01s 00:00:00.02 Elapsed: 00:00:00.01
# Mc/sec: 23.67
//
[ok]
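
The domain table row in the hmmsearch output above gives the coordinates needed to work out how much of the model and of the candidate the alignment covers. The following is a minimal sketch under stated assumptions: `parse_domain_row` is a hypothetical helper, the row is pasted in by hand, and the column order is taken from the header printed in this particular report.

```python
# Domain row copied from the hmmsearch report above.
DOMAIN_ROW = ("   1 !  961.0   0.3  1.2e-293  1.2e-293       2     628 .."
              "       6     632 ..       5     633 .. 0.97")


def parse_domain_row(row):
    """Split one whitespace-delimited domain row into named fields."""
    f = row.split()
    return {
        "score": float(f[2]),
        "i_evalue": float(f[5]),
        "hmm_from": int(f[6]), "hmm_to": int(f[7]),
        "ali_from": int(f[9]), "ali_to": int(f[10]),
        "env_from": int(f[12]), "env_to": int(f[13]),
        "acc": float(f[15]),
    }


dom = parse_domain_row(DOMAIN_ROW)

model_length = 629      # from "Query: TIGR02188 [M=629]"
sequence_length = 635   # from "Target sequences: 1 (635 residues searched)"

# Fraction of the HMM and of the candidate protein covered by the alignment.
model_coverage = (dom["hmm_to"] - dom["hmm_from"] + 1) / model_length
sequence_coverage = (dom["ali_to"] - dom["ali_from"] + 1) / sequence_length

print(f"model coverage: {model_coverage:.3f}")
print(f"sequence coverage: {sequence_coverage:.3f}")
```

For production parsing, hmmsearch's tabular output (`--domtblout`) would be a more robust input than the human-readable report.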

This GapMind analysis is from Sep 24 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMER with enzyme models (usually from TIGRFAMs). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
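
The definition of a gap given above (a step with no high- or medium-confidence candidate) can be sketched in a few lines. This is an illustration only, not GapMind's code: the function `find_gaps`, the step names prefixed with `hypothetical_`, and the confidence labels assigned to each step are all made up for the example.

```python
def find_gaps(candidates_by_step):
    """Return the steps that have no high- or medium-confidence candidate.

    candidates_by_step maps a step name to the list of confidence labels
    ("high", "medium", or "low") of its candidates.
    """
    return [step
            for step, levels in candidates_by_step.items()
            if not any(level in ("high", "medium") for level in levels)]


# Illustrative input only; these labels are not taken from a real analysis.
pathway = {
    "acs": ["high"],
    "hypothetical_transporter": ["low", "low"],
    "hypothetical_step": [],
}

gaps = find_gaps(pathway)
print(gaps)
```

A step with only low-confidence candidates and a step with no candidates at all are both reported as gaps, which matches the wording of the definition.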

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory