Align Benzoyl-CoA-dihydrodiol lyase; EC 4.1.2.44 (characterized)
to candidate WP_037375998.1 A3GO_RS0118595 2,3-epoxybenzoyl-CoA dihydrolase
Query= SwissProt::Q84HH6
         (555 letters)

>NCBI__GCF_000428045.1:WP_037375998.1
          Length = 549

 Score = 616 bits (1588), Expect = 0.0
 Identities = 302/549 (55%), Positives = 392/549 (71%), Gaps = 5/549 (0%)

Query: 12  LVDYRTEPSKYRHWSLATDGEIATLTLNIDEDGGIRPGYKLKLNSYDLGVDIELHDALQR 71
           ++ ++T P+ Y+HW LA +G +A L +N+ EDGG+ PGY+LKLNSYDLGVDIEL+DA+QR
Sbjct: 1   MISFQTNPAAYQHWQLAFEGPVARLRMNVKEDGGLMPGYELKLNSYDLGVDIELYDAVQR 60

Query: 72  VRFEHPEVRTVVVTSGKPKIFCSGANIYMLGLSTHAWKVNFCKFTNETRNGIEDSSQYSG 131
           +RFEHPEV++V++ S K ++FC+GANI MLG STH  KVNFCKFTNETRN IED+S++SG
Sbjct: 61  LRFEHPEVKSVILESAKERVFCAGANIRMLGGSTHVHKVNFCKFTNETRNSIEDASEFSG 120

Query: 132 LKFLAACNGTTAGGGYELALACDEIVLVDDRNSSVSLPEVPLLGVLPGTGGLTRVTDKRR 191
            ++L A  GT AGGGYELALA D I+LVDD N+SVSLPEVPLL VLPGTGGLTR+ DKR+
Sbjct: 121 QRYLCAITGTAAGGGYELALAADHIMLVDDGNASVSLPEVPLLAVLPGTGGLTRLVDKRK 180

Query: 192 VRRDHADIFCTISEGVRGQRAKDWRLVDDVVKQQQFAEHIQARAKALAQTSDRPAGAKGV 251
           +RRD AD+FC+I EGV+GQRA DWRLVD+VV   +F E +  RA+A+A  SDRP    G+
Sbjct: 181 MRRDLADVFCSIEEGVKGQRAVDWRLVDEVVVSSKFQEAVDERAQAIAGGSDRPDNESGI 240

Query: 252 KLTTLERTVDEKGYHYEFVDATIDADGRTVTLTVRAPAAVTAKTAAEIEAQGIKWWPLQM 311
           +L  LE   DE   HY  +D ++D   R  TL ++ P A      A  + QG  +WPL +
Sbjct: 241 ELEPLEIGGDETHLHYPHLDLSVDRATRVATLILKGPTASVPDDLAAAKVQGCHFWPLAL 300

Query: 312 ARELDDAILNLRTNHLDVGLWQLRTEGDAQVVLDIDATIDANRDNWFVRETIGMLRRTLA 371
            RELD A+L+LRTN  + G    +T+GDA+ V   +  +  +RD+WF+RE +  L+RTL
Sbjct: 301 IRELDSALLHLRTNEPETGTLIFKTQGDAEQVAAYERFLLQHRDDWFIREILLYLKRTLK 360

Query: 372 RIDVSSRSLYALIEPGSCFAGTLLEIALAADRSYMLDAA-----EAKNVVGLSAMNFGTF 426
           RID +SRS +A +EPGSCF+G L +I  A DRSYML+            + LS +NFG
Sbjct: 361 RIDYTSRSTFAFVEPGSCFSGFLADILFAVDRSYMLEGQFEGDDRPAPTIRLSELNFGPL 420

Query: 427 PMVNGLSRIDARFYQEEAPVAAVKAKQGSLLSPAEAMELGLVTAIPDDLDWAEEVRIAIE 486
           PM NGL+R++ RF  E   +   ++  G  LS   A + GLVT   DD+DW +++R  +E
Sbjct: 421 PMGNGLTRLETRFLGEPDTLEKARSLIGESLSAEAAADAGLVTFALDDIDWEDDIRFLLE 480

Query: 487 ERAALSPDALTGLEANLRFGPVETMNTRIFGRLSAWQNWIFNRPNAVGENGALKLFGSGK 546
           ERA+ SPDALTG+EANLRF   ETM T+IFGRLSAWQNWIF RPNAVGE GALK FG+G+
Sbjct: 481 ERASYSPDALTGMEANLRFAGPETMETKIFGRLSAWQNWIFQRPNAVGEQGALKCFGTGE 540

Query: 547 KAQFDWNRV 555
           +  +D  RV
Sbjct: 541 RPHYDQKRV 549

Lambda     K      H
   0.318    0.134    0.397

Gapped
Lambda     K      H
   0.267   0.0410    0.140

Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 743
Number of extensions: 25
Number of successful extensions: 2
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 555
Length of database: 549
Length adjustment: 36
Effective length of query: 519
Effective length of database: 513
Effective search space: 266247
Effective search space used: 266247
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 53 (25.0 bits)
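As a quick sanity check on the BLAST summary figures, the percentages, alignment coverage, and effective search space can be recomputed from the reported counts. The following is a minimal sketch, not part of BLAST or GapMind: the helper names `percent` and `coverage` are hypothetical, and the integer division is an assumption chosen because it reproduces the truncated percentages shown in the report (for example, gaps of 5/549 reported as 0%).

```python
def percent(numerator: int, denominator: int) -> int:
    """Whole-number percentage, truncated (assumed to match the BLAST summary line)."""
    return 100 * numerator // denominator


def coverage(start: int, end: int, length: int) -> float:
    """Fraction of a sequence spanned by an aligned range (1-based, inclusive)."""
    return (end - start + 1) / length


# Values copied from the BLAST report for Q84HH6 vs. WP_037375998.1.
alignment_length = 549
identity_pct = percent(302, alignment_length)
positive_pct = percent(392, alignment_length)
gap_pct = percent(5, alignment_length)

# Query residues 12..555 of 555; subject residues 1..549 of 549.
query_cov = coverage(12, 555, 555)
subject_cov = coverage(1, 549, 549)

# Effective lengths are the raw lengths minus the reported length adjustment (36).
effective_search_space = (555 - 36) * (549 - 36)

print(identity_pct, positive_pct, gap_pct)
print(round(query_cov, 3), round(subject_cov, 3))
print(effective_search_space)
```

The product of the two effective lengths (519 and 513) matches the "Effective search space" line in the report.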
Align candidate WP_037375998.1 A3GO_RS0118595 (2,3-epoxybenzoyl-CoA dihydrolase)
to HMM TIGR03222 (boxC: benzoyl-CoA-dihydrodiol lyase (EC 4.1.2.44))
# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR03222.hmm
# target sequence database:        /tmp/gapView.12506.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR03222  [M=548]
Accession:   TIGR03222
Description: benzo_boxC: benzoyl-CoA-dihydrodiol lyase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                                 Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                                 -----------
   9.5e-283  924.8   0.0   1.1e-282  924.6   0.0    1.0  1  lcl|NCBI__GCF_000428045.1:WP_037375998.1 A3GO_RS0118595 2,3-epoxybenzoyl-


Domain annotation for each sequence (and alignments):
>> lcl|NCBI__GCF_000428045.1:WP_037375998.1  A3GO_RS0118595 2,3-epoxybenzoyl-CoA dihydrolase
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  924.6   0.0  1.1e-282  1.1e-282       1     548 []       2     549 .]       2     549 .] 1.00

  Alignments for each domain:
  == domain 1  score: 924.6 bits;  conditional E-value: 1.1e-282
                                 TIGR03222   1 vdfrtepskyrhwkltfdGpvatltldvdedgglrdGyklklnsydlGvdieladalqrlrfehpevrv 69
                                               + f+t+p  y+hw+l+f+Gpva l ++v+edggl++Gy+lklnsydlGvdiel+da+qrlrfehpev++
  lcl|NCBI__GCF_000428045.1:WP_037375998.1   2 ISFQTNPAAYQHWQLAFEGPVARLRMNVKEDGGLMPGYELKLNSYDLGVDIELYDAVQRLRFEHPEVKS 70
                                               68******************************************************************* PP

                                 TIGR03222  70 vvltsakdkvfcaGanikmlglsthahkvnfckftnetrngiedaseesglkflaavnGtaaGGGyela 138
                                               v+l+sak++vfcaGani+mlg+sth+hkvnfckftnetrn+iedase sg+++l+a++GtaaGGGyela
  lcl|NCBI__GCF_000428045.1:WP_037375998.1  71 VILESAKERVFCAGANIRMLGGSTHVHKVNFCKFTNETRNSIEDASEFSGQRYLCAITGTAAGGGYELA 139
                                               ********************************************************************* PP

                                 TIGR03222 139 lacdeivlvddrssavslpevpllavlpGtGGltrvtdkrrvrrdladifctieeGvkGkrakewrlvd 207
                                               la+d+i+lvdd++++vslpevpllavlpGtGGltr++dkr++rrdlad+fc+ieeGvkG+ra++wrlvd
  lcl|NCBI__GCF_000428045.1:WP_037375998.1 140 LAADHIMLVDDGNASVSLPEVPLLAVLPGTGGLTRLVDKRKMRRDLADVFCSIEEGVKGQRAVDWRLVD 208
                                               ********************************************************************* PP

                                 TIGR03222 208 evvksskfdaavaeraaelaaksdrpadakGveltklertieedgvryetvdvaidraartatitvkgp 276
                                               evv sskf++av era+++a  sdrp ++ G+el +le   +e+ ++y ++d+++dra+r+at+ +kgp
  lcl|NCBI__GCF_000428045.1:WP_037375998.1 209 EVVVSSKFQEAVDERAQAIAGGSDRPDNESGIELEPLEIGGDETHLHYPHLDLSVDRATRVATLILKGP 277
                                               ********************************************************************* PP

                                 TIGR03222 277 eaaapadlaaikaqGaefyplklarelddailhlrlneldiglwvlrteGdaelvlaadalleakedhw 345
                                                a++p dlaa k qG +f+pl+l reld a+lhlr+ne + g+ +++t+Gdae+v+a++  l +++d+w
  lcl|NCBI__GCF_000428045.1:WP_037375998.1 278 TASVPDDLAAAKVQGCHFWPLALIRELDSALLHLRTNEPETGTLIFKTQGDAEQVAAYERFLLQHRDDW 346
                                               ********************************************************************* PP

                                 TIGR03222 346 lvreilgllkrtlkrldvssrslfalvepgscfaGtlaelvfaadrsymlegeleddedeeaaitlsel 414
                                               ++reil +lkrtlkr+d +srs fa+vepgscf+G+la+++fa+drsymleg++e+d++++++i+lsel
  lcl|NCBI__GCF_000428045.1:WP_037375998.1 347 FIREILLYLKRTLKRIDYTSRSTFAFVEPGSCFSGFLADILFAVDRSYMLEGQFEGDDRPAPTIRLSEL 415
                                               ********************************************************************* PP

                                 TIGR03222 415 nfgayplsnglsrlaarflaeeaaveavrdkiGealdaaeaeklglvtaalddidwedeirilleeras 483
                                               nfg +p++ngl+rl++rfl+e++++e++r+ iGe+l+a++a+  glvt+alddidwed+ir lleeras
  lcl|NCBI__GCF_000428045.1:WP_037375998.1 416 NFGPLPMGNGLTRLETRFLGEPDTLEKARSLIGESLSAEAAADAGLVTFALDDIDWEDDIRFLLEERAS 484
                                               ********************************************************************* PP

                                 TIGR03222 484 lspdaltGleanlrfagpetmetrifgrltawqnwifnrpnavGekGalklyGsGkkaqfdlerv 548
                                               +spdaltG+eanlrfagpetmet+ifgrl+awqnwif+rpnavGe+Galk +G+G+++++d++rv
  lcl|NCBI__GCF_000428045.1:WP_037375998.1 485 YSPDALTGMEANLRFAGPETMETKIFGRLSAWQNWIFQRPNAVGEQGALKCFGTGERPHYDQKRV 549
                                               ***************************************************************98 PP


Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (548 nodes)
Target sequences:                          1  (549 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.02u 0.01s 00:00:00.03 Elapsed: 00:00:00.02
# Mc/sec: 12.48
//
[ok]
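The domain row of the hmmsearch output gives the model and target coordinates of the hit, from which alignment coverage follows directly. The sketch below is illustrative only: it splits the single domain row from this report on whitespace (an assumption that holds for this row, not a general hmmsearch parser), and the helper name `span_fraction` is hypothetical.

```python
def span_fraction(start: int, end: int, length: int) -> float:
    """Fraction of a model or sequence covered by a 1-based inclusive range."""
    return (end - start + 1) / length


# Domain row copied from the hmmsearch output for TIGR03222 vs. WP_037375998.1.
row = "1 ! 924.6 0.0 1.1e-282 1.1e-282 1 548 [] 2 549 .] 2 549 .] 1.00"
fields = row.split()

score = float(fields[2])                             # domain bit score
hmm_from, hmm_to = int(fields[6]), int(fields[7])    # model coordinates
ali_from, ali_to = int(fields[9]), int(fields[10])   # target coordinates

model_length = 548    # [M=548] from the Query line
target_length = 549   # residues searched

model_cov = span_fraction(hmm_from, hmm_to, model_length)
target_cov = span_fraction(ali_from, ali_to, target_length)

print(score, model_cov, round(target_cov, 4))
```

Here the alignment spans every node of the model (1 to 548) and all but the first residue of the candidate protein.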
This GapMind analysis is from Sep 24 2021. The underlying query database was built on Sep 17 2021.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other blast hits with at least 50% coverage are "low confidence."
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
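The gap rule stated above (a step with no high- or medium-confidence candidate may be considered a gap) can be expressed as a short check. This is a minimal sketch under stated assumptions, not GapMind's own code: the function `find_gaps`, the dictionary layout, and the step names `stepX` and `stepY` are hypothetical; only `boxC` comes from this report.

```python
from typing import Dict, List


def find_gaps(candidates: Dict[str, List[str]]) -> List[str]:
    """Return the steps that have no 'high' or 'medium' confidence candidate."""
    return [
        step
        for step, levels in candidates.items()
        if not any(level in ("high", "medium") for level in levels)
    ]


# Hypothetical input: confidence labels of the candidates found for each step.
example = {
    "boxC": ["high"],          # step with a high-confidence candidate
    "stepX": ["low", "low"],   # only low-confidence candidates, so a gap
    "stepY": [],               # no candidates at all, so a gap
}

gaps = find_gaps(example)
print(gaps)
```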
For more information, see:
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory