Align Acetyl-coenzyme A synthetase (EC 6.2.1.1) (characterized)
to candidate WP_017257106.1 B176_RS0102045 acetate--CoA ligase
Query= reanno::pseudo5_N2C3_1:AO356_18695
         (651 letters)

>NCBI__GCF_000302595.1:WP_017257106.1
          Length = 635

 Score =  711 bits (1835), Expect = 0.0
 Identities = 359/626 (57%), Positives = 442/626 (70%), Gaps = 10/626 (1%)

Query: 24  YKAMYQQSVVNPDGFWREQAKRLDWIKPFTTVKQTSFDDHHVDIKWFADGTLNVSYNCLD 83
           YK  YQ+SV  P+ FW + A    W K +  V   +F +  ++  WF    LN++ NC+D
Sbjct: 11  YKEAYQRSVEQPEAFWADIADNFQWKKKWDKVLDWNFKEPKIE--WFKGAKLNITENCID 68

Query: 84  RHLAERGDQIAIIWEGDDPSES-RNITYRELHEEVCKFANALRGQDVHRGDVVTIYMPMI 142
           RHLA++GDQ AIIWE +DP+E  R +TY++LHE+VC FAN L+  DV +GD V IYMPMI
Sbjct: 69  RHLADKGDQPAIIWEANDPNEHHRVLTYKQLHEKVCLFANVLKNNDVKKGDRVCIYMPMI 128

Query: 143 PEAVVAMLACTRIGAIHSVVFGGFSPEALAGRIIDCKSKVVITADEGVRAGKKIPLKANV 202
           PE  +A+LAC RIGAIHSVVFGGFS +++A RI D + ++VITAD G R  K IPLK  +
Sbjct: 129 PELAIAVLACARIGAIHSVVFGGFSAQSIADRINDAQCEMVITADGGFRGPKDIPLKNVI 188

Query: 203 DDALTNPETSSIQKVIVCKRTAGNIKWNQHRDIWYEDLMKVAGTV----CAPKEMGAEEA 258
           DDAL   +  S++ VIV  RT   I   + RD W++D +    T+    C  +EM AE+
Sbjct: 189 DDALV--QCPSVKTVIVLTRTRTPISMIKGRDKWWQDEIHKVETLGMIDCPAEEMDAEDP 246

Query: 259 LFILYTSGSTGKPKGVQHTTAGYLLYAALTHERVFDYKPGEVYWCTADVGWVTGHSYIVY 318
           LFILYTSGSTGKPKGV HTTAGY++Y A T + VF Y+P +VY+CTAD+GW+TGHSYI+Y
Sbjct: 247 LFILYTSGSTGKPKGVVHTTAGYMIYTAYTFQNVFQYQPQDVYFCTADIGWITGHSYIIY 306

Query: 319 GPLANGATTLLFEGVPNYPDITRVAKVIDKHKVSILYTAPTAIRAMMASGTAAVEGADGS 378
           GPLA GATTL+FEGVP YPD  R   ++DK KV+ LYTAPTAIR++M SG   V+  D S
Sbjct: 307 GPLAQGATTLMFEGVPTYPDAGRFWDIVDKFKVNTLYTAPTAIRSLMQSGLDYVKDKDLS 366

Query: 379 SLRLLGSVGEPINPEAWDWYYKNVGKERCPIVDTWWQTETGGVLISPLPGATALKPGSAT 438
           SL++LGSVGEPIN EAW WY  N+GK +CPIVDTWWQTE GG+LISP+   T  KP  AT
Sbjct: 367 SLKVLGSVGEPINEEAWHWYNDNIGKGKCPIVDTWWQTENGGILISPIANVTPTKPCYAT 426

Query: 439 RPFFGVVPALVDNLGNLIEG-AAEGNLVILDSWPGQARTLYGDHDRFVDTYFKTFSGMYF 497
            P  GV P LVD  G +IEG    GNL I   WPG  RT YGDH+R   TYF T+  MYF
Sbjct: 427 LPLPGVQPVLVDENGAVIEGNGVSGNLCIKFPWPGMLRTTYGDHERCKLTYFSTYEDMYF 486

Query: 498 TGDGARRDEDGYYWITGRVDDVLNVSGHRMGTAEIESAMVAHPKVAEAAVVGVPHDIKGQ 557
           TGDG  RDEDGYY ITGRVDDV+NVSGHR+GTAE+E+A+     V E+AVVG PH+IKGQ
Sbjct: 487 TGDGCLRDEDGYYRITGRVDDVINVSGHRIGTAEVENAINMFTDVVESAVVGYPHEIKGQ 546

Query: 558 GIYVYVTLNAGEETSEALRLELKNWVRKEIGPIASPDVIQWAPGLPKTRSGKIMRRILRK 617
           GIY YV L+   E  E  + ++   V + IG IA PD IQ+  GLPKTRSGKIMRRILRK
Sbjct: 547 GIYAYVILDKESEDVELTKKDIAMTVSRIIGAIARPDKIQFVTGLPKTRSGKIMRRILRK 606

Query: 618 IATAEYDGLGDISTLADPGVVAHLIE 643
           IA  +   +GD+STL DP VV  +I+
Sbjct: 607 IAEGDMKNVGDVSTLLDPAVVEEIIK 632

Lambda     K      H
   0.318    0.135    0.418

Gapped
Lambda     K      H
   0.267   0.0410    0.140

Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 1279
Number of extensions: 62
Number of successful extensions: 6
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 651
Length of database: 635
Length adjustment: 38
Effective length of query: 613
Effective length of database: 597
Effective search space: 365961
Effective search space used: 365961
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 54 (25.4 bits)
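The summary numbers in the BLAST report above can be cross-checked with a little arithmetic. The sketch below is only an illustration: every constant is copied from the report, the function names are made up for this example, and the bit-score formula is the standard Karlin-Altschul conversion, which is assumed (not confirmed by this page) to be what produced the reported 711 bits. Truncating percentages to whole numbers is likewise an assumption that happens to reproduce the 57%, 70% and 1% shown.

```python
import math
from typing import Tuple

# Values copied from the BLAST report above.
QUERY_LEN, SUBJECT_LEN = 651, 635
IDENTITIES, POSITIVES, GAPS, ALN_LEN = 359, 442, 10, 626
QUERY_SPAN = (24, 643)      # first and last aligned query residue
SUBJECT_SPAN = (11, 632)    # first and last aligned subject residue
LENGTH_ADJUSTMENT = 38
RAW_SCORE = 1835
GAPPED_LAMBDA, GAPPED_K = 0.267, 0.0410


def whole_percent(part: int, total: int) -> int:
    """Percentage truncated to a whole number (assumed rounding rule)."""
    return 100 * part // total


def coverage(span: Tuple[int, int], length: int) -> float:
    """Fraction of a sequence that lies inside the aligned span."""
    start, end = span
    return (end - start + 1) / length


def bit_score(raw: int, lam: float, k: float) -> float:
    """Karlin-Altschul conversion from a raw score to bits."""
    return (lam * raw - math.log(k)) / math.log(2)


identity_pct = whole_percent(IDENTITIES, ALN_LEN)
positive_pct = whole_percent(POSITIVES, ALN_LEN)
gap_pct = whole_percent(GAPS, ALN_LEN)
query_cov = coverage(QUERY_SPAN, QUERY_LEN)
subject_cov = coverage(SUBJECT_SPAN, SUBJECT_LEN)
search_space = (QUERY_LEN - LENGTH_ADJUSTMENT) * (SUBJECT_LEN - LENGTH_ADJUSTMENT)
bits = bit_score(RAW_SCORE, GAPPED_LAMBDA, GAPPED_K)

print(f"identity {identity_pct}%, positives {positive_pct}%, gaps {gap_pct}%")
print(f"query coverage {query_cov:.1%}, subject coverage {subject_cov:.1%}")
print(f"effective search space {search_space}, bit score {bits:.1f}")
```

The alignment spans 620 of the 651 query residues and 622 of the 635 candidate residues, so both proteins are covered almost end to end.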
Align candidate WP_017257106.1 B176_RS0102045 (acetate--CoA ligase)
to HMM TIGR02188 (acs: acetate--CoA ligase (EC 6.2.1.1))
# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR02188.hmm
# target sequence database:        /tmp/gapView.520129.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR02188  [M=629]
Accession:   TIGR02188
Description: Ac_CoA_lig_AcsA: acetate--CoA ligase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                              Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                              -----------
   1.1e-293  961.2   0.3   1.2e-293  961.0   0.3    1.0  1  NCBI__GCF_000302595.1:WP_017257106.1

Domain annotation for each sequence (and alignments):
>> NCBI__GCF_000302595.1:WP_017257106.1
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  961.0   0.3  1.2e-293  1.2e-293       2     628 ..       6     632 ..       5     633 .. 0.97

  Alignments for each domain:
  == domain 1  score: 961.0 bits;  conditional E-value: 1.2e-293
                             TIGR02188   2 aeleeykelyeeaiedpekfwaklakeelewlkpfekvldeslep.kvkWfedgelnvsyncvdrhvekrkdk 73
                                           ++ +eyke y++++e+pe+fwa+ a  +++w+k+++kvld+++++ k++Wf++++ln++ nc+drh++++ d+
  NCBI__GCF_000302595.1:WP_017257106.1   6 SSFDEYKEAYQRSVEQPEAFWADIAD-NFQWKKKWDKVLDWNFKEpKIEWFKGAKLNITENCIDRHLADKGDQ 77
                                           6789*********************9.6**************9988*************************** PP

                             TIGR02188  74 vaiiwegdeegedsrkltYaellrevcrlanvlkelGvkkgdrvaiYlpmipeaviamlacaRiGavhsvvfa 146
                                            aiiwe+++++e++r ltY++l+++vc +anvlk+  vkkgdrv+iY+pmipe++ia+lacaRiGa+hsvvf+
  NCBI__GCF_000302595.1:WP_017257106.1  78 PAIIWEANDPNEHHRVLTYKQLHEKVCLFANVLKNNDVKKGDRVCIYMPMIPELAIAVLACARIGAIHSVVFG 150
                                           ************************************************************************* PP

                             TIGR02188 147 GfsaealaeRivdaeaklvitadeglRggkvielkkivdealekaeesvekvlvvkrtgeevaewkegrDvww 219
                                           Gfsa+++a+Ri+da++++vitad+g+Rg k i+lk+++d+al +++ sv++v+v+ rt ++++ + +grD+ww
  NCBI__GCF_000302595.1:WP_017257106.1 151 GFSAQSIADRINDAQCEMVITADGGFRGPKDIPLKNVIDDALVQCP-SVKTVIVLTRTRTPIS-MIKGRDKWW 221
                                           *********************************************9.7*************76.********* PP

                             TIGR02188 220 eelvek...easaecepekldsedplfiLYtsGstGkPkGvlhttgGylllaaltvkyvfdikdedifwCtaD 289
                                           +++++k       +c++e++d+edplfiLYtsGstGkPkGv+htt+Gy++++a+t++ vf+++++d+++CtaD
  NCBI__GCF_000302595.1:WP_017257106.1 222 QDEIHKvetLGMIDCPAEEMDAEDPLFILYTSGSTGKPKGVVHTTAGYMIYTAYTFQNVFQYQPQDVYFCTAD 294
                                           **999832123468*********************************************************** PP

                             TIGR02188 290 vGWvtGhsYivygPLanGattllfegvptypdasrfweviekykvtifYtaPtaiRalmklgeelvkkhdlss 362
                                           +GW+tGhsYi+ygPLa+Gattl+fegvptypda+rfw++++k+kv+++YtaPtaiR+lm+ g + vk +dlss
  NCBI__GCF_000302595.1:WP_017257106.1 295 IGWITGHSYIIYGPLAQGATTLMFEGVPTYPDAGRFWDIVDKFKVNTLYTAPTAIRSLMQSGLDYVKDKDLSS 367
                                           ************************************************************************* PP

                             TIGR02188 363 lrvlgsvGepinpeaweWyyevvGkekcpivdtwWqtetGgilitplpgvatelkpgsatlPlfGieaevvde 435
                                           l+vlgsvGepin eaw+Wy++++Gk+kcpivdtwWqte Ggili+p++  +t++kp  atlPl+G+++++vde
  NCBI__GCF_000302595.1:WP_017257106.1 368 LKVLGSVGEPINEEAWHWYNDNIGKGKCPIVDTWWQTENGGILISPIAN-VTPTKPCYATLPLPGVQPVLVDE 439
                                           *************************************************.6********************** PP

                             TIGR02188 436 egkeveeeeeggvLvikkpwPsmlrtiygdeerfvetYfkklkglyftGDgarrdkdGyiwilGRvDdvinvs 508
                                           +g  +e +  +g L+ik pwP+mlrt ygd+er   tYf++++++yftGDg+ rd+dGy+ i+GRvDdvinvs
  NCBI__GCF_000302595.1:WP_017257106.1 440 NGAVIEGNGVSGNLCIKFPWPGMLRTTYGDHERCKLTYFSTYEDMYFTGDGCLRDEDGYYRITGRVDDVINVS 512
                                           ******555559************************************************************* PP

                             TIGR02188 509 Ghrlgtaeiesalvsheavaeaavvgvpdeikgeaivafvvlkegveedeeelekelkklvrkeigpiakpdk 581
                                           Ghr+gtae+e+a+   ++v e+avvg+p+eikg+ i+a+v+l +++e  e  ++k++ ++v++ ig+ia+pdk
  NCBI__GCF_000302595.1:WP_017257106.1 513 GHRIGTAEVENAINMFTDVVESAVVGYPHEIKGQGIYAYVILDKESEDVE-LTKKDIAMTVSRIIGAIARPDK 584
                                           ******************************************99888666.69******************** PP

                             TIGR02188 582 ilvveelPktRsGkimRRllrkiaege.ellgdvstledpsvveelke 628
                                           i++v+ lPktRsGkimRR+lrkiaeg+ +++gdvstl dp+vvee+++
  NCBI__GCF_000302595.1:WP_017257106.1 585 IQFVTGLPKTRSGKIMRRILRKIAEGDmKNVGDVSTLLDPAVVEEIIK 632
                                           ***************************9*****************986 PP

Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (629 nodes)
Target sequences:                          1  (635 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.01u 0.00s 00:00:00.01 Elapsed: 00:00:00.01
# Mc/sec: 37.84
//
[ok]
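The single domain row in the hmmsearch report above carries enough information to see how much of the model and of the candidate the match covers. The following is a minimal sketch, assuming the row is split on whitespace into the 16 columns shown in the report's header (number, flag, score, bias, c-Evalue, i-Evalue, hmm from/to and bounds, ali from/to and bounds, env from/to and bounds, acc). `DomainHit` and `parse_domain_row` are hypothetical names used only for illustration and are not part of HMMER or GapMind.

```python
from typing import NamedTuple


class DomainHit(NamedTuple):
    score: float
    i_evalue: float
    hmm_from: int
    hmm_to: int
    ali_from: int
    ali_to: int
    acc: float


def parse_domain_row(row: str) -> DomainHit:
    """Parse one whitespace-separated row of the per-domain table."""
    f = row.split()
    if len(f) != 16:
        raise ValueError(f"expected 16 columns, got {len(f)}")
    return DomainHit(
        score=float(f[2]),
        i_evalue=float(f[5]),
        hmm_from=int(f[6]),
        hmm_to=int(f[7]),
        ali_from=int(f[9]),
        ali_to=int(f[10]),
        acc=float(f[15]),
    )


# Row and lengths copied from the report above.
ROW = "1 ! 961.0 0.3 1.2e-293 1.2e-293 2 628 .. 6 632 .. 5 633 .. 0.97"
MODEL_LEN = 629    # [M=629]
TARGET_LEN = 635   # residues searched

hit = parse_domain_row(ROW)
model_cov = (hit.hmm_to - hit.hmm_from + 1) / MODEL_LEN
target_cov = (hit.ali_to - hit.ali_from + 1) / TARGET_LEN

print(f"score {hit.score} bits, i-Evalue {hit.i_evalue:g}")
print(f"model coverage {model_cov:.1%}, target coverage {target_cov:.1%}")
```

Here 627 of the 629 model positions and 627 of the 635 candidate residues fall inside the alignment.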
This GapMind analysis is from Sep 24 2021. The underlying query database was built on Sep 17 2021.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMER with enzyme models (usually from TIGRFAMs). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other blast hits with at least 50% coverage are "low confidence."
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
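The definition of a gap given above (a step with no high- or medium-confidence candidate) can be expressed in a few lines. This is only a sketch of that one rule under assumed inputs: the dictionary layout, the function name `find_gaps`, and the example step names other than `acs` are hypothetical and do not reflect GapMind's actual data structures or code.

```python
from typing import Dict, List

# Confidence labels that keep a step from being counted as a gap.
SUPPORTED = {"high", "medium"}


def find_gaps(candidates_by_step: Dict[str, List[str]]) -> List[str]:
    """Return the steps that have no high- or medium-confidence candidate.

    candidates_by_step maps a step name to the confidence labels of its
    candidates ("high", "medium" or "low"); an empty list means no candidate.
    """
    return [
        step
        for step, labels in candidates_by_step.items()
        if not any(label in SUPPORTED for label in labels)
    ]


# Hypothetical pathway: only "acs" corresponds to the step on this page.
example = {
    "acs": ["high"],
    "transporter": ["low", "low"],
    "other_step": [],
    "another_step": ["medium", "low"],
}

print(find_gaps(example))
```

With this example input, the steps that have only low-confidence candidates or no candidates at all are the ones returned as gaps.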
For more information, see:
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory