Align acetoacetate-CoA ligase (EC 6.2.1.16) (characterized)
to candidate RR42_RS15595 RR42_RS15595 acetoacetyl-CoA synthetase
Query= BRENDA::D6EQU8
         (658 letters)

>FitnessBrowser__Cup4G11:RR42_RS15595
          Length = 678

 Score = 640 bits (1652), Expect = 0.0
 Identities = 315/666 (47%), Positives = 426/666 (63%), Gaps = 17/666 (2%)

Query: 4   ENPQPLWQPDAQRIAQARITRFQAWAAEHHGAPAEGGYAALHRWSVDELDTFWKAVTEWF 63
           E Q W P +RI F W G A Y AL +WSV +L+ FW A+ +F
Sbjct: 5   EEGQLRWTPSQAFRDGSRIAHFMRWLESERGL-AFADYGALWQWSVTDLEAFWDAIRAYF 63

Query: 64  DVRFSTPYARVLGDRTMPGAQWFPGATLNYAEHALRAAGTRP--DEPALLYVDETHEPAP 121
           D+RF TP RVLG MPGA+WF GA+LNY + R AG+ + A+ Y E+
Sbjct: 64  DLRFDTPAERVLGSAQMPGARWFEGASLNYVQQVFRHAGSGAARERTAIRYAGESIAVTD 123

Query: 122 VTWAELRRQVASLAAELRALGVRPGDRVSGYLPNIPQAVVALLATAAVGGVWTSCAPDFG 181
           ++W L +QVASLA LR +GV GDRV+GYLPNIP VVA LATA++G +W+ CAPD G
Sbjct: 124 LSWDVLEQQVASLAHALRGMGVARGDRVAGYLPNIPATVVAFLATASLGAIWSGCAPDMG 183

Query: 182 ARSVLDRFQQVEPVVLFTVDGYRYGGKEHDRRDTVAELRRELPTLRAVIHIPLLGTEA-- 239
           +V DRF+Q+EP VL VDGYRYGGK +DR +A+L LP+L ++ +P T+A
Sbjct: 184 QVAVSDRFRQIEPKVLIAVDGYRYGGKAYDRAPVLADLAAALPSLTDLVLVPGEHTDAHA 243

Query: 240 -PDGTL--------DWETLTAADAEPVYEQVPFDHPLWVLYSSGTTGLPKAIVQSQGGIL 290
           P+ W+ + A E VPFDHPLW++YSSGTTG+PK IV GGI+
Sbjct: 244 APNAIALPANVRRHAWQAVLAHRVPLAIESVPFDHPLWIVYSSGTTGMPKPIVHGHGGIV 303

Query: 291 VEHLKQLGLHCDLGPGDRFFWYTSTGWMMWNFLVSGLLTGTTIVLYDGSPGFPATDAQWR 350
           +E LK + H +LGP D F WY+S+GW+MWN ++GLL G+TI LYDG+P +P WR
Sbjct: 304 IEQLKLMAFHNNLGPDDVFHWYSSSGWIMWNAQIAGLLLGSTIALYDGNPAWPDAGVLWR 363

Query: 351 IAERTGATLFGTSAAYVMACRKAGVHPARDLDLSAIQCVATTGSPLPPDGFRWLHDEFAA 410
           + T FG AA+ A KAG+ PAR DLS ++ + +TGSPLP + + W++ +A
Sbjct: 364 FVDAARVTAFGAGAAFFTAGMKAGIEPARVADLSRLRALGSTGSPLPAEAYDWIYRHVSA 423

Query: 411 GGADLWIASVSGGTDVCSCFAGAVPTLPVHIGELQAPGLGTDLQSWDPSGDPLTDEVGEL 470
           D+W+A +SGGTD F P LPV+ GE+Q LG ++++D +G+ L EVGEL
Sbjct: 424 ---DIWLAPMSGGTDFAGSFVAGCPILPVYSGEMQCRCLGAKVEAFDEAGNALVGEVGEL 480

Query: 471 VVTNPMPSMPIRFWNDPDGSRYHDSYFDTYPGVWRHGDWITLTSRGSVVIHGRSDSTLNR 530
           V T PMPSMP+ W D +G RY DSYFDTYPGVWRHGDWI +T+ G +I+GRSD+T+NR
Sbjct: 481 VCTAPMPSMPLFLWGDTNGQRYRDSYFDTYPGVWRHGDWIKITAHGGAIIYGRSDATINR 540

Query: 531 QGVRMGSADIYEAVERLPEIRESLVIGIEQPDGGYWMPLFVHLAPGATLDDALLDRIKRT 590
           G+RMG++++Y VE LPE+ +S+V+ +E +MPLFV L G LDDAL D ++
Sbjct: 541 HGIRMGTSELYRVVEDLPEVLDSMVVDLEYLGRDSYMPLFVVLREGMVLDDALRDTMRAR 600

Query: 591 IRVNLSPRHVPDEVIEVPGIPHTLTGKRIEVPVKRLLQGTPLDKAVNPGSIDNLDLLHFY 650
           IR LS RHVP+E+++VPG+P TL+GK++EVP+K+LL G DK N ++ N + L +Y
Sbjct: 601 IRSALSSRHVPNEILQVPGVPRTLSGKKMEVPIKKLLLGHAPDKIANRDAMANPETLDWY 660

Query: 651 EELARK 656
           A +
Sbjct: 661 FAYAER 666

Lambda     K      H
   0.320    0.138    0.442

Gapped
Lambda     K      H
   0.267   0.0410    0.140

Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 1482
Number of extensions: 71
Number of successful extensions: 4
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 658
Length of database: 678
Length adjustment: 38
Effective length of query: 620
Effective length of database: 640
Effective search space: 396800
Effective search space used: 396800
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.8 bits)
S2: 54 (25.4 bits)
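As a rough check of the alignment above, the percent identity and the fraction of each sequence spanned by the alignment can be recomputed from the reported numbers. The Python sketch below is illustrative only: the helper names (`parse_fractions`, `coverage`) are hypothetical, the values are copied by hand from the BLAST report, and coverage is taken here simply as the aligned span divided by the sequence length, which may differ from how GapMind computes coverage internally.

```python
import re

# Values copied by hand from the BLAST report above.
summary = "Identities = 315/666 (47%), Positives = 426/666 (63%), Gaps = 17/666 (2%)"
query_len, query_start, query_end = 658, 4, 656
sbjct_len, sbjct_start, sbjct_end = 678, 5, 666


def parse_fractions(line):
    """Return e.g. {'Identities': (315, 666), ...} from a BLAST summary line."""
    return {
        name: (int(num), int(den))
        for name, num, den in re.findall(r"(\w+) = (\d+)/(\d+)", line)
    }


def coverage(start, end, length):
    """Fraction of a sequence spanned by the alignment (1-based, inclusive)."""
    return (end - start + 1) / length


fractions = parse_fractions(summary)
identity = fractions["Identities"][0] / fractions["Identities"][1]
query_cov = coverage(query_start, query_end, query_len)
sbjct_cov = coverage(sbjct_start, sbjct_end, sbjct_len)

print(f"identity {identity:.1%}, "
      f"query coverage {query_cov:.1%}, "
      f"subject coverage {sbjct_cov:.1%}")
```

For this hit the sketch reports about 47.3% identity, with the alignment spanning about 99.2% of the query and 97.6% of the candidate.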
Align candidate RR42_RS15595 RR42_RS15595 (acetoacetyl-CoA synthetase)
to HMM TIGR01217 (acetoacetate-CoA ligase (EC 6.2.1.16))
# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR01217.hmm
# target sequence database:        /tmp/gapView.29045.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR01217  [M=652]
Accession:   TIGR01217
Description: ac_ac_CoA_syn: acetoacetate-CoA ligase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                                  Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                                  -----------
   9.8e-240  783.0   0.0   1.2e-239  782.7   0.0    1.0  1  lcl|FitnessBrowser__Cup4G11:RR42_RS15595  RR42_RS15595 acetoacetyl-CoA syn

Domain annotation for each sequence (and alignments):
>> lcl|FitnessBrowser__Cup4G11:RR42_RS15595  RR42_RS15595 acetoacetyl-CoA synthetase
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  782.7   0.0  1.2e-239  1.2e-239       2     647 ..       6     663 ..       5     667 .. 0.96

  Alignments for each domain:
  == domain 1  score: 782.7 bits;  conditional E-value: 1.2e-239
                               TIGR01217   2 reqvlwepdaervkdarlarfraavgerfGaalgdydalyrwsvdeldafwkavwefsdvvfssaekev 70
                                             + q w+p + + +r+a+f + + G+a++dy al++wsv++l+afw a+ ++d++f+++ ++v
lcl|FitnessBrowser__Cup4G11:RR42_RS15595   6 EGQLRWTPSQAFRDGSRIAHFMRWLESERGLAFADYGALWQWSVTDLEAFWDAIRAYFDLRFDTPAERV 74
                                             56888**************************************************************** PP

                               TIGR01217  71 vddskmlaarffpgarlnyaenllrkkgs.....edallyvdeekesakvtfeelrrqvaslaaalral 134
                                             +++ +m++ar+f ga lny ++++r++gs +a+ y +e ++ ++++ l +qvasla alr +
lcl|FitnessBrowser__Cup4G11:RR42_RS15595  75 LGSAQMPGARWFEGASLNYVQQVFRHAGSgaareRTAIRYAGESIAVTDLSWDVLEQQVASLAHALRGM 143
                                             ****************************988884445669999999*********************** PP

                               TIGR01217 135 GvkkGdrvagylpnipeavaallatasvGaiwsscspdfGargvldrfsqiepkllfsvdgyvynGkeh 203
                                             Gv +Gdrvagylpnip +v+a+latas+Gaiws c+pd+G+ +v drf qiepk+l++vdgy+y+Gk
lcl|FitnessBrowser__Cup4G11:RR42_RS15595 144 GVARGDRVAGYLPNIPATVVAFLATASLGAIWSGCAPDMGQVAVSDRFRQIEPKVLIAVDGYRYGGKAY 212
                                             ********************************************************************* PP

                               TIGR01217 204 drrekvrevakelpdlravvlipyvgdreklap...kvegaltledllaa.aqaaelvfeqlpfdhply 268
                                             dr ++++a lp+l vl+p ++ ap +++ + + + a a+ +l e +pfdhpl+
lcl|FitnessBrowser__Cup4G11:RR42_RS15595 213 DRAPVLADLAAALPSLTDLVLVPGEHTDAHAAPnaiALPANVRRHAWQAVlAHRVPLAIESVPFDHPLW 281
                                             ***********************887666655532267888877777776566899************* PP

                               TIGR01217 269 ilfssGttGvpkaivhsaGGtlvqhlkehvlhcdltdgdrllyyttvGwmmwnflvsglatGatlvlyd 337
                                             i++ssGttG+pk ivh GG++++ lk +++h +l++ d++ +y++ Gw+mwn ++gl+ G+t+ lyd
lcl|FitnessBrowser__Cup4G11:RR42_RS15595 282 IVYSSGTTGMPKPIVHGHGGIVIEQLKLMAFHNNLGPDDVFHWYSSSGWIMWNAQIAGLLLGSTIALYD 350
                                             ********************************************************************* PP

                               TIGR01217 338 GsplvpatnvlfdlaeregitvlGtsakyvsavrkkglkparthdlsalrlvastGsplkpegfeyvye 406
                                             G p p++ vl++++++ ++t +G++a++ +a k+g++par dls lr++ stGspl++e+++++y+
lcl|FitnessBrowser__Cup4G11:RR42_RS15595 351 GNPAWPDAGVLWRFVDAARVTAFGAGAAFFTAGMKAGIEPARVADLSRLRALGSTGSPLPAEAYDWIYR 419
                                             ********************************************************************* PP

                               TIGR01217 407 eikadvllasisGGtdivscfvganpslpvykGeiqapglGlaveawdeeGkpvtgekGelvvtkplps 475
                                             + ad++la +sGGtd fv + p lpvy Ge+q++ lG +vea+de G+++ ge+Gelv+t p+ps
lcl|FitnessBrowser__Cup4G11:RR42_RS15595 420 HVSADIWLAPMSGGTDFAGSFVAGCPILPVYSGEMQCRCLGAKVEAFDEAGNALVGEVGELVCTAPMPS 488
                                             ********************************************************************* PP

                               TIGR01217 476 mpvrfwndedGskyrkayfdkypgvwahGdyieltprGgivihGrsdatlnpnGvrlGsaeiynaverl 544
                                             mp+ +w d +G +yr++yfd+ypgvw+hGd+i++t++Gg +i+Grsdat+n++G+r+G++e+y +ve l
lcl|FitnessBrowser__Cup4G11:RR42_RS15595 489 MPLFLWGDTNGQRYRDSYFDTYPGVWRHGDWIKITAHGGAIIYGRSDATINRHGIRMGTSELYRVVEDL 557
                                             ********************************************************************* PP

                               TIGR01217 545 deveeslvigqeqedgeervvlfvklasGatldealvkeikdairaglsprhvpskiievagiprtlsG 613
                                             +ev +s+v+ e + +++lfv l +G+ ld+al + ++ +ir++ls+rhvp++i++v+g+prtlsG
lcl|FitnessBrowser__Cup4G11:RR42_RS15595 558 PEVLDSMVVDLEYLGRDSYMPLFVVLREGMVLDDALRDTMRARIRSALSSRHVPNEILQVPGVPRTLSG 626
                                             ********************************************************************* PP

                               TIGR01217 614 kkvevavkdvvaGkpve...nkgalsnpealdlyeel 647
                                             kk+ev++k+++ G++ + n++a++npe+ld y
lcl|FitnessBrowser__Cup4G11:RR42_RS15595 627 KKMEVPIKKLLLGHAPDkiaNRDAMANPETLDWYFAY 663
                                             *************9766666************99665 PP

Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (652 nodes)
Target sequences:                          1  (678 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.04u 0.02s 00:00:00.06 Elapsed: 00:00:00.05
# Mc/sec: 7.87
//
[ok]
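The domain row in the hmmsearch output above can be turned into coverage figures for the model and for the candidate protein. The Python sketch below is a minimal illustration under stated assumptions: `parse_domain_row` is a hypothetical helper, the row is copied by hand with its whitespace collapsed, and the field positions follow the column header shown in the output (score, bias, c-Evalue, i-Evalue, hmmfrom, hmm to, alifrom, ali to, envfrom, env to, acc). For real pipelines, hmmsearch's tabular output options are likely a more robust source than this text row.

```python
# Domain row copied by hand from the hmmsearch report above.
domain_row = "1 ! 782.7 0.0 1.2e-239 1.2e-239 2 647 .. 6 663 .. 5 667 .. 0.96"
model_length = 652    # from "Query: TIGR01217 [M=652]"
target_length = 678   # from "678 residues searched"


def parse_domain_row(row):
    """Split one per-domain row into named fields (hypothetical helper)."""
    f = row.split()
    return {
        "score": float(f[2]),
        "i_evalue": float(f[5]),
        "hmm_from": int(f[6]),
        "hmm_to": int(f[7]),
        "ali_from": int(f[9]),
        "ali_to": int(f[10]),
        "acc": float(f[15]),
    }


domain = parse_domain_row(domain_row)

# Coverage as the aligned span divided by total length (1-based, inclusive).
hmm_coverage = (domain["hmm_to"] - domain["hmm_from"] + 1) / model_length
target_coverage = (domain["ali_to"] - domain["ali_from"] + 1) / target_length

print(f"score {domain['score']} bits, "
      f"model coverage {hmm_coverage:.1%}, "
      f"target coverage {target_coverage:.1%}")
```

Here the alignment covers model positions 2 to 647 of 652, or about 99.1% of the HMM.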
This GapMind analysis is from Sep 17 2021. The underlying query database was built on Sep 17 2021.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other blast hits with at least 50% coverage are "low confidence."
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
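For readers unfamiliar with the term, a six-frame translation translates a DNA sequence in all three reading frames on each strand, so that proteins missing from the gene predictions can still be found. The Python sketch below illustrates the idea only; it is not how GapMind or Curated BLAST implements the search. It assumes the standard genetic code (whose amino acid assignments are the same as those of the bacterial and archaeal code), and the function names are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate from the first base; unrecognized codons become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")
        for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translation(dna):
    """Return translations keyed by frame: '+1', '+2', '+3', '-1', '-2', '-3'."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


for frame, protein in six_frame_translation("ATGGCCTAA").items():
    print(frame, protein)
```

Stop codons appear as `*` in the output, so a search tool would typically look for matches within the stretches between them.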
For more information, see:
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory