GapMind for catabolism of small carbon sources

 

Alignments for a candidate for acs in Algiphilus aromaticivorans DG1253

Align Acetyl-coenzyme A synthetase (EC 6.2.1.1) (characterized)
to candidate WP_043768008.1 U743_RS10390 acetate--CoA ligase

Query= reanno::pseudo3_N2E3:AO353_03060
         (651 letters)



>NCBI__GCF_000733765.1:WP_043768008.1
          Length = 649

 Score =  947 bits (2448), Expect = 0.0
 Identities = 452/644 (70%), Positives = 527/644 (81%)

Query: 3   AASLYPVRPEVAANTLTDEATYKAMYQQSVVNPDGFWREQAKRLDWIKPFTTVKQTSFDD 62
           +++++PV+PE       D A Y+ MY  SV +P+ FW E  KR+DWIKPF+ VK  +FD 
Sbjct: 2   SSNIHPVKPEWRQGAKVDAAGYEKMYADSVRDPETFWAEHGKRIDWIKPFSKVKDVNFDA 61

Query: 63  HHVDIKWFADGTLNVSYNCLDRHLAERGDQIAIIWEGDDPAESRNITYRELHEQVCKFAN 122
             + I+WF DGTLNV+ NCLDRHLAERGDQ AIIWEGD+P +   I+YRELHE+VC+  N
Sbjct: 62  SDLHIRWFYDGTLNVAANCLDRHLAERGDQTAIIWEGDEPDQDARISYRELHEKVCRLGN 121

Query: 123 ALRGQDVHRGDVVTIYMPMIPEAVVAMLACTRIGAIHSVVFGGFSPEALAGRIIDCKSKV 182
           ALR   V RGDVVTIYMPMIPEA VAMLAC RIGA+HSVVFGGFSPEALAGRI DC SK 
Sbjct: 122 ALRDMGVARGDVVTIYMPMIPEAAVAMLACARIGAVHSVVFGGFSPEALAGRIEDCGSKW 181

Query: 183 VITADEGLRAGKKISLKANVDDALTNPETSSIQKVIVCKRTGGNIKWNQHRDIWYEDLMK 242
           VITADEG+R GKKI LKANVD AL  P  +S++K +V +RTGG++ W++ RD WY+D+ +
Sbjct: 182 VITADEGVRGGKKIPLKANVDTALGLPGGASVKKTVVVRRTGGDVDWSEGRDHWYQDITE 241

Query: 243 VAGTVCAPKEMGAEEALFILYTSGSTGKPKGVQHTTGGYLLYAALTHERVFDYRPGEIYW 302
            A + C P+EMGAE+ LFILYTSGSTGKPKGV HT GGYL++A++THE  FDYR GEIYW
Sbjct: 242 AASSDCPPEEMGAEDPLFILYTSGSTGKPKGVLHTQGGYLVHASMTHEYCFDYRDGEIYW 301

Query: 303 CTADVGWVTGHTYIVYGPLANGATTLLFEGVPNYPDITRVAKIIDKHKVNILYTAPTAIR 362
           CTADVGWVTGH+YIVYGPLANGATTL+FEGVPNYPD+ R  ++ DKH VNI YTAPTAIR
Sbjct: 302 CTADVGWVTGHSYIVYGPLANGATTLMFEGVPNYPDMGRFWQVCDKHGVNIFYTAPTAIR 361

Query: 363 AMMAQGTAAVEGADGSSLRLLGSVGEPINPEAWEWYYKNVGQSRCPIVDTWWQTETGATL 422
           A+M +G   V+     +LR+LG+VGEPINPEAWEWY++ VG SRCP++DTWWQTETG  L
Sbjct: 362 ALMREGEDPVKKHRRDTLRVLGTVGEPINPEAWEWYFRVVGDSRCPVIDTWWQTETGGIL 421

Query: 423 MSPLPGAHGLKPGSAARPFFGVVPALVDNLGNIIEGVAEGNLVILDSWPGQARTLYGDHD 482
           ++PLPGA  LKPGSA  PFFGV PALVDN G+I+EG A+GNLVILDSWPGQ RT+YGDH+
Sbjct: 422 IAPLPGATDLKPGSATTPFFGVQPALVDNEGHILEGEADGNLVILDSWPGQMRTVYGDHE 481

Query: 483 RFVDTYFKTFRGMYFTGDGARRDADGYWWITGRVDDVLNVSGHRMGTAEIESAMVAHPKV 542
           RFV TYF T+ G YF+GDGARRDADGY+WITGRVDDVLN+SGHRMGTAEIESA+VAH +V
Sbjct: 482 RFVQTYFSTYPGRYFSGDGARRDADGYFWITGRVDDVLNISGHRMGTAEIESALVAHSQV 541

Query: 543 AEAAVVGVPHDIKGQGIYVYVTLNGGEEPSEALRLELKNWVRKEIGPIASPDVIQWAPGL 602
           AEAAVVG PHD+KGQGIY YV+L  G EPSEAL+ EL   VRKEIG IASPD+I +APGL
Sbjct: 542 AEAAVVGYPHDLKGQGIYCYVSLVVGAEPSEALKKELVQHVRKEIGAIASPDIIHFAPGL 601

Query: 603 PKTRSGKIMRRILRKIATGEYDGLGDISTLADPGVVQHLIDTHK 646
           PKTRSGKIMRRILRKIA GEY  LGD STLADPGVV  L+   K
Sbjct: 602 PKTRSGKIMRRILRKIAEGEYSSLGDTSTLADPGVVDTLVTEAK 645


Lambda     K      H
   0.319    0.136    0.425 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 1549
Number of extensions: 64
Number of successful extensions: 1
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 651
Length of database: 649
Length adjustment: 38
Effective length of query: 613
Effective length of database: 611
Effective search space:   374543
Effective search space used:   374543
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 54 (25.4 bits)
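
The summary figures in the BLAST report above can be re-derived by hand. The following Python sketch is illustrative only: the function names are made up for this example, the percentages are assumed to be truncated rather than rounded (which matches the 81% shown for 527/644), and the bit score uses the standard Karlin-Altschul conversion with the rounded gapped Lambda and K printed above, so it reproduces the reported 947 bits only approximately.

```python
import math


def percent(matches: int, aligned: int) -> int:
    """Integer percentage, truncated (assumption: mirrors how the report prints it)."""
    return 100 * matches // aligned


def bit_score(raw_score: int, lam: float, k: float) -> float:
    """Karlin-Altschul conversion from a raw score to bits."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def effective_search_space(query_len: int, db_len: int,
                           adjustment: int, n_seqs: int = 1) -> int:
    """Product of the effective query length and the effective database length."""
    return (query_len - adjustment) * (db_len - n_seqs * adjustment)


# Values copied from the report above.
identity = percent(452, 644)                   # Identities = 452/644
positives = percent(527, 644)                  # Positives  = 527/644
space = effective_search_space(651, 649, 38)   # (651 - 38) * (649 - 38)
bits = bit_score(2448, 0.267, 0.0410)          # gapped Lambda and K

print(identity, positives, space, round(bits, 1))
```

The effective lengths (613 and 611) are simply the query and database lengths minus the length adjustment of 38.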

Align candidate WP_043768008.1 U743_RS10390 (acetate--CoA ligase)
to HMM TIGR02188 (acs: acetate--CoA ligase (EC 6.2.1.1))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR02188.hmm
# target sequence database:        /tmp/gapView.1022325.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR02188  [M=629]
Accession:   TIGR02188
Description: Ac_CoA_lig_AcsA: acetate--CoA ligase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                             Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                             -----------
          0 1021.4   0.0          0 1021.1   0.0    1.0  1  NCBI__GCF_000733765.1:WP_043768008.1  


Domain annotation for each sequence (and alignments):
>> NCBI__GCF_000733765.1:WP_043768008.1  
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 ! 1021.1   0.0         0         0       5     627 ..      21     641 ..      17     643 .. 0.98

  Alignments for each domain:
  == domain 1  score: 1021.1 bits;  conditional E-value: 0
                             TIGR02188   5 eeykelyeeaiedpekfwaklakeelewlkpfekvldeslep...kvkWfedgelnvsyncvdrhvekrkdkv 74 
                                             y+++y+++++dpe+fwa+++k +++w+kpf+kv+d ++++   +++Wf dg+lnv++nc+drh+++r d++
  NCBI__GCF_000733765.1:WP_043768008.1  21 AGYEKMYADSVRDPETFWAEHGK-RIDWIKPFSKVKDVNFDAsdlHIRWFYDGTLNVAANCLDRHLAERGDQT 92 
                                           679********************.5************999887789*************************** PP

                             TIGR02188  75 aiiwegdeegedsrkltYaellrevcrlanvlkelGvkkgdrvaiYlpmipeaviamlacaRiGavhsvvfaG 147
                                           aiiwegde+++   +++Y+el+++vcrl n+l+++Gv +gd v+iY+pmipea++amlacaRiGavhsvvf+G
  NCBI__GCF_000733765.1:WP_043768008.1  93 AIIWEGDEPDQ-DARISYRELHEKVCRLGNALRDMGVARGDVVTIYMPMIPEAAVAMLACARIGAVHSVVFGG 164
                                           *********97.899********************************************************** PP

                             TIGR02188 148 fsaealaeRivdaeaklvitadeglRggkvielkkivdealekaee.svekvlvvkrtgeevaewkegrDvww 219
                                           fs+eala Ri+d+ +k vitadeg+Rggk+i+lk++vd al      sv+k++vv+rtg +v  w+egrD+w+
  NCBI__GCF_000733765.1:WP_043768008.1 165 FSPEALAGRIEDCGSKWVITADEGVRGGKKIPLKANVDTALGLPGGaSVKKTVVVRRTGGDVD-WSEGRDHWY 236
                                           *****************************************988777**************66.********* PP

                             TIGR02188 220 eelvekeasaecepekldsedplfiLYtsGstGkPkGvlhttgGylllaaltvkyvfdikdedifwCtaDvGW 292
                                           ++++e  as++c+pe++++edplfiLYtsGstGkPkGvlht+gGyl++a++t++y fd++d++i+wCtaDvGW
  NCBI__GCF_000733765.1:WP_043768008.1 237 QDITEA-ASSDCPPEEMGAEDPLFILYTSGSTGKPKGVLHTQGGYLVHASMTHEYCFDYRDGEIYWCTADVGW 308
                                           *****5.****************************************************************** PP

                             TIGR02188 293 vtGhsYivygPLanGattllfegvptypdasrfweviekykvtifYtaPtaiRalmklgeelvkkhdlsslrv 365
                                           vtGhsYivygPLanGattl+fegvp+ypd++rfw+v++k++v+ifYtaPtaiRalm++ge+ vkkh +++lrv
  NCBI__GCF_000733765.1:WP_043768008.1 309 VTGHSYIVYGPLANGATTLMFEGVPNYPDMGRFWQVCDKHGVNIFYTAPTAIRALMREGEDPVKKHRRDTLRV 381
                                           ************************************************************************* PP

                             TIGR02188 366 lgsvGepinpeaweWyyevvGkekcpivdtwWqtetGgilitplpgvatelkpgsatlPlfGieaevvdeegk 438
                                           lg+vGepinpeaweWy++vvG+++cp++dtwWqtetGgili+plpg at+lkpgsat+P+fG+++++vd+eg+
  NCBI__GCF_000733765.1:WP_043768008.1 382 LGTVGEPINPEAWEWYFRVVGDSRCPVIDTWWQTETGGILIAPLPG-ATDLKPGSATTPFFGVQPALVDNEGH 453
                                           **********************************************.6************************* PP

                             TIGR02188 439 eveeeeeggvLvikkpwPsmlrtiygdeerfvetYfkklkglyftGDgarrdkdGyiwilGRvDdvinvsGhr 511
                                            +e e++ g Lvi ++wP+++rt+ygd+erfv+tYf++++g yf+GDgarrd+dGy+wi+GRvDdv+n+sGhr
  NCBI__GCF_000733765.1:WP_043768008.1 454 ILEGEAD-GNLVILDSWPGQMRTVYGDHERFVQTYFSTYPGRYFSGDGARRDADGYFWITGRVDDVLNISGHR 525
                                           **98777.89*************************************************************** PP

                             TIGR02188 512 lgtaeiesalvsheavaeaavvgvpdeikgeaivafvvlkegveedeeelekelkklvrkeigpiakpdkilv 584
                                           +gtaeiesalv+h++vaeaavvg+p+++kg+ i+++v l  g+e++e +l+kel ++vrkeig+ia+pd i++
  NCBI__GCF_000733765.1:WP_043768008.1 526 MGTAEIESALVAHSQVAEAAVVGYPHDLKGQGIYCYVSLVVGAEPSE-ALKKELVQHVRKEIGAIASPDIIHF 597
                                           **********************************************9.5************************ PP

                             TIGR02188 585 veelPktRsGkimRRllrkiaege.ellgdvstledpsvveelk 627
                                           ++ lPktRsGkimRR+lrkiaege ++lgd+stl+dp vv++l 
  NCBI__GCF_000733765.1:WP_043768008.1 598 APGLPKTRSGKIMRRILRKIAEGEySSLGDTSTLADPGVVDTLV 641
                                           *****************************************986 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (629 nodes)
Target sequences:                          1  (649 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.01u 0.01s 00:00:00.02 Elapsed: 00:00:00.01
# Mc/sec: 21.20
//
[ok]
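
As a rough aid for reading the hmmsearch domain table above, the Python sketch below splits the single domain row into named fields and computes how much of the model and of the candidate the alignment spans. The helper names are hypothetical, and whether GapMind uses exactly these coverage quantities in its scoring is an assumption, not something stated on this page.

```python
def parse_domain_row(row: str) -> dict:
    """Split one row of the hmmsearch per-domain table into named fields."""
    f = row.split()
    return {
        "score": float(f[2]),
        "hmm_from": int(f[6]), "hmm_to": int(f[7]),
        "ali_from": int(f[9]), "ali_to": int(f[10]),
        "env_from": int(f[12]), "env_to": int(f[13]),
        "acc": float(f[15]),
    }


def span_fraction(start: int, end: int, total: int) -> float:
    """Fraction of `total` positions covered by the inclusive range start..end."""
    return (end - start + 1) / total


# Domain row copied from the output above; M=629 and 649 residues as reported.
row = ("   1 ! 1021.1   0.0         0         0       5     627 .."
       "      21     641 ..      17     643 .. 0.98")
dom = parse_domain_row(row)
model_cov = span_fraction(dom["hmm_from"], dom["hmm_to"], 629)
seq_cov = span_fraction(dom["ali_from"], dom["ali_to"], 649)

print(f"model coverage {model_cov:.3f}, sequence coverage {seq_cov:.3f}")
```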

This GapMind analysis is from Sep 24 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMER with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
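
The definition of a gap given above (a step with no high- or medium-confidence candidate) can be sketched in Python as follows. This is a minimal illustration, not GapMind's implementation: the data layout, the function name, and every step name other than acs are invented for the example, and the confidence labels are assumed to have been assigned already.

```python
from typing import Dict, List


def find_gaps(candidates: Dict[str, List[str]]) -> List[str]:
    """Return the steps that have no high- or medium-confidence candidate.

    `candidates` maps a step name to the confidence labels ("high",
    "medium" or "low") of the candidates found for that step.
    """
    return [
        step
        for step, labels in candidates.items()
        if not any(label in ("high", "medium") for label in labels)
    ]


# Hypothetical pathway: only "acs" comes from this page.
example = {
    "acs": ["high"],
    "stepB": ["low", "low"],
    "stepC": [],
    "stepD": ["medium", "low"],
}
gaps = find_gaps(example)
print(gaps)
```

A step with only low-confidence candidates, or with no candidates at all, is reported as a gap under this reading.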

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory