GapMind for catabolism of small carbon sources

 

Alignments for a candidate for acs in Phaeobacter inhibens BS107

Align Acetyl-coenzyme A synthetase (EC 6.2.1.1) (characterized)
to candidate GFF1279 PGA1_c12950 acetyl-coenzyme A synthetase Acs

Query= reanno::pseudo3_N2E3:AO353_03060
         (651 letters)



>FitnessBrowser__Phaeo:GFF1279
          Length = 695

 Score =  833 bits (2153), Expect = 0.0
 Identities = 411/644 (63%), Positives = 477/644 (74%), Gaps = 5/644 (0%)

Query: 2   SAASLYPVRPEVAANTLTDEATYKAMYQQSVVNPDGFWREQAKRLDWIKPFTTVKQTSFD 61
           S +  Y    E  A    + A Y  MY  S+ +P+GFWR+QA+R+DWIKPFT VK   F 
Sbjct: 47  SDSKAYAPSEETVARAHVNAAQYDEMYAASMQDPEGFWRQQAERIDWIKPFTQVKDVDFT 106

Query: 62  DHHVDIKWFADGTLNVSYNCLDRHLAERGDQIAIIWEGDDPAES-RNITYRELHEQVCKF 120
             +V I W+ADGTLNV+ NC+DRHL  RGDQ AIIWE D P E+ ++I+Y++LH +VC+ 
Sbjct: 107 LGNVSINWYADGTLNVAANCIDRHLDTRGDQTAIIWEPDSPDEAAQHISYKQLHTRVCRM 166

Query: 121 ANALRGQDVHRGDVVTIYMPMIPEAVVAMLACTRIGAIHSVVFGGFSPEALAGRIIDCKS 180
           AN L    V +GD V IY+PMIPEA  AMLAC RIGAIHS+VF GFSP+ALA R+  C +
Sbjct: 167 ANVLETMGVRKGDRVVIYLPMIPEAAYAMLACARIGAIHSIVFAGFSPDALAARVNGCDA 226

Query: 181 KVVITADEGLRAGKKISLKANVDDALTNPETSSIQKVIVCKRTGGNIKWNQHRDIWYEDL 240
           KV+ITADE  R G+K  LK+N D AL +  T    K +V KRTGG   W   RD  Y ++
Sbjct: 227 KVLITADEAPRGGRKTPLKSNADAALLH--TKDTVKCLVVKRTGGQTTWIDGRDYDYNEM 284

Query: 241 MKVAGTVCAPKEMGAEEALFILYTSGSTGKPKGVQHTTGGYLLYAALTHERVFDYRPGEI 300
              A     P EM AE+ LFILYTSGSTG+PKGV HTTGGYL YAA+THE  FDY  G+I
Sbjct: 285 ALEADDYSKPAEMNAEDPLFILYTSGSTGQPKGVVHTTGGYLTYAAMTHEITFDYHDGDI 344

Query: 301 YWCTADVGWVTGHTYIVYGPLANGATTLLFEGVPNYPDITRVAKIIDKHKVNILYTAPTA 360
           YWCTADVGWVTGH+YIVYGPLANGATTL+FEGVP YPD +R  ++ DKHKV   YTAPTA
Sbjct: 345 YWCTADVGWVTGHSYIVYGPLANGATTLMFEGVPTYPDASRFWQVCDKHKVTQFYTAPTA 404

Query: 361 IRAMMAQGTAAVEGADGSSLRLLGSVGEPINPEAWEWYYKNVGQSRCPIVDTWWQTETGA 420
           +RA+M QG   VE  D SSLR LG+VGEPINPEAW WY   VG+ +CPIVDTWWQTETG 
Sbjct: 405 LRALMGQGNEWVEKCDLSSLRTLGTVGEPINPEAWNWYNDIVGKGKCPIVDTWWQTETGG 464

Query: 421 TLMSPLPGAHGLKPGSAARPFFGVVPALVD-NLGNIIEG-VAEGNLVILDSWPGQARTLY 478
            LM+PLPGAH  KPG+A +PFFGV P ++D   G  I G   EG L I DSWPGQ RT++
Sbjct: 465 HLMTPLPGAHATKPGAAMKPFFGVAPVVLDPQSGEEITGNGVEGVLCIKDSWPGQMRTVW 524

Query: 479 GDHDRFVDTYFKTFRGMYFTGDGARRDADGYWWITGRVDDVLNVSGHRMGTAEIESAMVA 538
           GDH+RF  TYF  ++G YFTGDG RRD DG +WITGRVDDV+NVSGHRMGTAE+ESA+VA
Sbjct: 525 GDHERFEKTYFSDYKGYYFTGDGCRRDEDGDYWITGRVDDVINVSGHRMGTAEVESALVA 584

Query: 539 HPKVAEAAVVGVPHDIKGQGIYVYVTLNGGEEPSEALRLELKNWVRKEIGPIASPDVIQW 598
           H  VAEAAVVG PH+IKGQGIY YVTL    EPS+ L  EL+ WVR EIGPIASPDVIQW
Sbjct: 585 HAAVAEAAVVGYPHEIKGQGIYCYVTLMNDREPSDELVKELRTWVRTEIGPIASPDVIQW 644

Query: 599 APGLPKTRSGKIMRRILRKIATGEYDGLGDISTLADPGVVQHLI 642
           APGLPKTRSGKIMRRILRKIA  ++  LGD STLADP VV  LI
Sbjct: 645 APGLPKTRSGKIMRRILRKIAENDFGSLGDTSTLADPSVVDDLI 688


Lambda     K      H
   0.319    0.136    0.425 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 1580
Number of extensions: 86
Number of successful extensions: 4
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 651
Length of database: 695
Length adjustment: 39
Effective length of query: 612
Effective length of database: 656
Effective search space:   401472
Effective search space used:   401472
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 54 (25.4 bits)
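
The summary statistics above are tied together by simple arithmetic. As a rough check, the sketch below recomputes the effective lengths, the effective search space, and the bit score from the values printed in this report. It assumes the standard Karlin-Altschul conversion from raw score to bits and uses the rounded gapped Lambda and K shown above, so the bit score should land within about one bit of the reported 833. The function names are illustrative and are not part of BLAST or GapMind.

```python
import math

def effective_search_space(query_len, db_len, length_adjustment, n_seqs=1):
    """Subtract the length adjustment from the query and from each database sequence."""
    eff_query = query_len - length_adjustment
    eff_db = db_len - n_seqs * length_adjustment
    return eff_query, eff_db, eff_query * eff_db

def bit_score(raw_score, lam, k):
    """Karlin-Altschul conversion: bits = (lambda * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)

# Values copied from the report above (query 651, database 695, adjustment 39).
eff_q, eff_db, space = effective_search_space(651, 695, 39)
print(eff_q, eff_db, space)  # 612 656 401472

# Raw score 2153 with the gapped parameters Lambda = 0.267, K = 0.0410.
bits = bit_score(2153, 0.267, 0.0410)

# Expected number of chance hits at this score: E = search_space * 2^(-bits).
e_value = space * 2.0 ** (-bits)
print(round(bits, 1), e_value)
```

The E-value computed this way is vanishingly small, which is consistent with the report printing Expect = 0.0.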

Align candidate GFF1279 PGA1_c12950 (acetyl-coenzyme A synthetase Acs)
to HMM TIGR02188 (acs: acetate--CoA ligase (EC 6.2.1.1))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR02188.hmm
# target sequence database:        /tmp/gapView.27319.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR02188  [M=629]
Accession:   TIGR02188
Description: Ac_CoA_lig_AcsA: acetate--CoA ligase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                          Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                          -----------
          0 1013.5   0.2          0 1013.2   0.2    1.0  1  lcl|FitnessBrowser__Phaeo:GFF1279  PGA1_c12950 acetyl-coenzyme A sy


Domain annotation for each sequence (and alignments):
>> lcl|FitnessBrowser__Phaeo:GFF1279  PGA1_c12950 acetyl-coenzyme A synthetase Acs
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 ! 1013.2   0.2         0         0       3     628 ..      65     689 ..      63     690 .. 0.98

  Alignments for each domain:
  == domain 1  score: 1013.2 bits;  conditional E-value: 0
                          TIGR02188   3 eleeykelyeeaiedpekfwaklakeelewlkpfekvldeslep...kvkWfedgelnvsyncvdrhvekrkdkva 75 
                                        ++ +y e+y+ +++dpe fw+++a+ +++w+kpf++v+d +++    +++W++dg+lnv++nc+drh+ +r d++a
  lcl|FitnessBrowser__Phaeo:GFF1279  65 NAAQYDEMYAASMQDPEGFWRQQAE-RIDWIKPFTQVKDVDFTLgnvSINWYADGTLNVAANCIDRHLDTRGDQTA 139
                                        6789********************9.5**************9877789**************************** PP

                          TIGR02188  76 iiwegdeegedsrkltYaellrevcrlanvlkelGvkkgdrvaiYlpmipeaviamlacaRiGavhsvvfaGfsae 151
                                        iiwe d+++e +++++Y++l+++vcr+anvl+++Gv+kgdrv+iYlpmipea++amlacaRiGa+hs+vfaGfs++
  lcl|FitnessBrowser__Phaeo:GFF1279 140 IIWEPDSPDEAAQHISYKQLHTRVCRMANVLETMGVRKGDRVVIYLPMIPEAAYAMLACARIGAIHSIVFAGFSPD 215
                                        **************************************************************************** PP

                          TIGR02188 152 alaeRivdaeaklvitadeglRggkvielkkivdealekaeesvekvlvvkrtgeevaewkegrDvwweelvekea 227
                                        ala R++ ++ak++itade+ Rgg++++lk+++d+al +++++v+ +lvvkrtg + ++w +grD+ ++e+  + a
  lcl|FitnessBrowser__Phaeo:GFF1279 216 ALAARVNGCDAKVLITADEAPRGGRKTPLKSNADAALLHTKDTVK-CLVVKRTGGQ-TTWIDGRDYDYNEMALE-A 288
                                        ****************************************98776.**********.56***********9995.* PP

                          TIGR02188 228 saecepekldsedplfiLYtsGstGkPkGvlhttgGylllaaltvkyvfdikdedifwCtaDvGWvtGhsYivygP 303
                                        +++ +p+++++edplfiLYtsGstG+PkGv+httgGyl++aa+t++++fd++d+di+wCtaDvGWvtGhsYivygP
  lcl|FitnessBrowser__Phaeo:GFF1279 289 DDYSKPAEMNAEDPLFILYTSGSTGQPKGVVHTTGGYLTYAAMTHEITFDYHDGDIYWCTADVGWVTGHSYIVYGP 364
                                        **************************************************************************** PP

                          TIGR02188 304 LanGattllfegvptypdasrfweviekykvtifYtaPtaiRalmklgeelvkkhdlsslrvlgsvGepinpeawe 379
                                        LanGattl+fegvptypdasrfw+v++k+kvt+fYtaPta+Ralm +g+e+v+k dlsslr lg+vGepinpeaw+
  lcl|FitnessBrowser__Phaeo:GFF1279 365 LANGATTLMFEGVPTYPDASRFWQVCDKHKVTQFYTAPTALRALMGQGNEWVEKCDLSSLRTLGTVGEPINPEAWN 440
                                        **************************************************************************** PP

                          TIGR02188 380 WyyevvGkekcpivdtwWqtetGgilitplpgvatelkpgsatlPlfGieaevvd.eegkeveeeeeggvLvikkp 454
                                        Wy++ vGk+kcpivdtwWqtetGg+l+tplpg a ++kpg+a++P+fG+ ++v+d ++g+e++ +  +gvL+ik++
  lcl|FitnessBrowser__Phaeo:GFF1279 441 WYNDIVGKGKCPIVDTWWQTETGGHLMTPLPG-AHATKPGAAMKPFFGVAPVVLDpQSGEEITGNGVEGVLCIKDS 515
                                        ********************************.6*********************999*****555559******* PP

                          TIGR02188 455 wPsmlrtiygdeerfvetYfkklkglyftGDgarrdkdGyiwilGRvDdvinvsGhrlgtaeiesalvsheavaea 530
                                        wP+++rt++gd+erf +tYf+++kg+yftGDg+rrd+dG++wi+GRvDdvinvsGhr+gtae+esalv+h avaea
  lcl|FitnessBrowser__Phaeo:GFF1279 516 WPGQMRTVWGDHERFEKTYFSDYKGYYFTGDGCRRDEDGDYWITGRVDDVINVSGHRMGTAEVESALVAHAAVAEA 591
                                        **************************************************************************** PP

                          TIGR02188 531 avvgvpdeikgeaivafvvlkegveedeeelekelkklvrkeigpiakpdkilvveelPktRsGkimRRllrkiae 606
                                        avvg+p+eikg+ i+++v+l++ +e+++e l kel+++vr+eigpia+pd i++++ lPktRsGkimRR+lrkiae
  lcl|FitnessBrowser__Phaeo:GFF1279 592 AVVGYPHEIKGQGIYCYVTLMNDREPSDE-LVKELRTWVRTEIGPIASPDVIQWAPGLPKTRSGKIMRRILRKIAE 666
                                        ****************************5.********************************************** PP

                          TIGR02188 607 ge.ellgdvstledpsvveelke 628
                                        ++  +lgd+stl+dpsvv++l++
  lcl|FitnessBrowser__Phaeo:GFF1279 667 NDfGSLGDTSTLADPSVVDDLIA 689
                                        *******************9975 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (629 nodes)
Target sequences:                          1  (695 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.03u 0.01s 00:00:00.04 Elapsed: 00:00:00.03
# Mc/sec: 13.31
//
[ok]
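
The domain table in the hmmsearch output above gives the model and sequence coordinates of the hit. As a minimal sketch, the code below splits that single table row on whitespace and computes what fraction of the 629-position model (from the `[M=629]` line) and of the 695-residue protein the alignment spans. The helper name and the column indices are assumptions based on the row layout shown here, not an official HMMER parser; for real use, hmmsearch's tabular output options would be a more robust input.

```python
def parse_domain_row(line):
    """Parse one row of the hmmsearch per-domain table (layout as printed above)."""
    f = line.split()
    # f[8], f[11], f[14] are the '..' / '[]' boundary markers and are skipped.
    return {
        "score": float(f[2]),
        "hmm_from": int(f[6]),
        "hmm_to": int(f[7]),
        "ali_from": int(f[9]),
        "ali_to": int(f[10]),
        "env_from": int(f[12]),
        "env_to": int(f[13]),
        "acc": float(f[15]),
    }

row = ("   1 ! 1013.2   0.2         0         0       3     628 .."
       "      65     689 ..      63     690 .. 0.98")
dom = parse_domain_row(row)

model_len = 629   # from "Query: TIGR02188 [M=629]"
seq_len = 695     # from "Length = 695"

# Fraction of the model and of the protein covered by the aligned region.
hmm_coverage = (dom["hmm_to"] - dom["hmm_from"] + 1) / model_len
seq_coverage = (dom["ali_to"] - dom["ali_from"] + 1) / seq_len
print(f"model coverage {hmm_coverage:.3f}, sequence coverage {seq_coverage:.3f}")
```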

This GapMind analysis is from Sep 17 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMER with enzyme models (usually from TIGRFAMs). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."
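
Percent identity and coverage for a hit can be read off a pairwise alignment such as the one shown earlier on this page. The sketch below is only an illustration: it assumes coverage means the aligned span of the characterized (query) protein divided by its full length, which may differ from GapMind's exact definition, and the function names are hypothetical. The 50% figure is the coverage cutoff quoted in the text above.

```python
import re

def identity_fraction(identities_line):
    """Extract identical/aligned counts from a BLAST 'Identities = a/b' line."""
    m = re.search(r"Identities = (\d+)/(\d+)", identities_line)
    if m is None:
        raise ValueError("no 'Identities = a/b' field found")
    return int(m.group(1)) / int(m.group(2))

def query_coverage(q_start, q_end, q_len):
    """Assumed definition: aligned query span divided by query length."""
    return (q_end - q_start + 1) / q_len

# Values copied from the alignment on this page.
line = " Identities = 411/644 (63%), Positives = 477/644 (74%), Gaps = 5/644 (0%)"
ident = identity_fraction(line)
cov = query_coverage(2, 642, 651)   # query residues 2..642 of 651

# Check against the 50% coverage cutoff mentioned for low-confidence hits.
passes_coverage_cutoff = cov >= 0.5
print(f"identity {ident:.3f}, coverage {cov:.3f}, passes cutoff: {passes_coverage_cutoff}")
```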

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory