Align Beta-ketoadipyl CoA thiolase (EC 2.3.1.-) (characterized)
to candidate WP_093394716.1 BM091_RS07565 acetyl-CoA C-acetyltransferase
Query= reanno::Marino:GFF2751
         (415 letters)

>NCBI__GCF_900114975.1:WP_093394716.1
          Length = 392

 Score =  350 bits (898), Expect = e-101
 Identities = 194/402 (48%), Positives = 268/402 (66%), Gaps = 14/402 (3%)

Query: 10  AYIVDAIRTPIGRYGGALSAVRADDLGAIPIKALAERYPDLDWSKIDDVLYGCANQAGED 69
           A I  A+RTP+G +G  L++V A DLG + +K    R  +L    +D+V+ G   QAG+
Sbjct: 4   AVIATAVRTPVGSFGKTLASVSAVDLGVVALKEALRRI-NLTPEMVDEVILGNVLQAGQ- 61

Query: 70  NRDVARMSLLLAGLPVDVPGSTINRLCGSGMDAVGSAARAIRTGETQLMIAGGVESMSRA 129
            ++ AR   + +G+P +VP  T+N++C SG+ +V  AA+AI  GE ++++AGG+E+MS+A
Sbjct: 62  GQNPARQVAVKSGIPYEVPAFTVNKVCASGLKSVILAAQAIMVGEAEIVVAGGIENMSQA 121

Query: 130 PFVMGKADSAFSRKAEIFDTTIGWRFVNPVLKKQYGIDSMPETAENVAADFGISREDQDA 189
           P+ + KA         + D ++    +   L   +    M  TAENVA  FGISRE+QD
Sbjct: 122 PYAVPKARWGH----RMGDGSLVDLMIFDGLWDIFNGYHMGITAENVAERFGISREEQDR 177

Query: 190 FALRSQQRTAAAQKEGRLAAEITPVTIPRRKQDPLVVDTDEHPR-ETSLEKLASLPTPFR 248
           FALRSQQ+  AA KEG+   EI PVT+P+RK DP++ DTDEHPR  T+LE L+ LP  F+
Sbjct: 178 FALRSQQKAEAAIKEGKFREEIVPVTVPQRKGDPIIFDTDEHPRFGTTLEALSKLPPAFK 237

Query: 249 ENGTVTAGNASGVNDGACALLLAGADALKQYNLKPRARVVAMATAGVEPRIMGFGPAPAT 308
           + GTVTAGNASG+NDGA  +++       +  ++P AR+V+ A+AGV+P IMG GP PA+
Sbjct: 238 KEGTVTAGNASGINDGAAVVIVMSEKKASELGIEPMARIVSYASAGVDPAIMGTGPIPAS 297

Query: 309 RKVLATAGLELADMDVIELNEAFAAQALAVTRDLGLPDDAEHVNPNGGAIALGHPLGMSG 368
           RK L  AG  + D+D+IE NEAFAAQA+AV R++G   D E VN NGGAIALGHP+G SG
Sbjct: 298 RKALEKAGWSVDDLDLIEANEAFAAQAIAVNREMGW--DVEKVNVNGGAIALGHPIGASG 355

Query: 369 ARLVTTALNELERRHAAGQKARYALCTMCIGVGQGIALIIER 410
           AR++TT L E++RR      AR  L T+CIG GQG AL +ER
Sbjct: 356 ARILTTLLYEMKRR-----SARRGLATLCIGGGQGCALTVER 392

Lambda     K      H
   0.318    0.133    0.382

Gapped
Lambda     K      H
   0.267   0.0410    0.140

Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 414
Number of extensions: 20
Number of successful extensions: 6
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 415
Length of database: 392
Length adjustment: 31
Effective length of query: 384
Effective length of database: 361
Effective search space: 138624
Effective search space used: 138624
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 50 (23.9 bits)
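The statistics in the report above can be cross-checked with a few lines of arithmetic. The sketch below is not part of GapMind or BLAST. It assumes the standard Karlin-Altschul relations (bit score S' = (lambda * S - ln K) / ln 2 and E = search space * 2^(-S')) and takes every number from the report; the function names are illustrative. Because the printed lambda and K are rounded, the recomputed bit score only approximately matches the reported 350 bits.

```python
import math

# Values copied from the report above.
QUERY_LEN = 415
DB_LEN = 392
LENGTH_ADJUSTMENT = 31
RAW_SCORE = 898
REPORTED_BITS = 350
GAPPED_LAMBDA = 0.267   # rounded in the report
GAPPED_K = 0.0410       # rounded in the report


def effective_search_space(query_len, db_len, adjustment, n_seqs=1):
    """Product of the length-adjusted query and database lengths."""
    eff_query = query_len - adjustment
    eff_db = db_len - n_seqs * adjustment
    return eff_query * eff_db


def bit_score(raw_score, lam, k):
    """Normalized score: S' = (lambda * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def expect_value(bits, search_space):
    """Expected number of chance hits: E = search_space * 2 ** (-S')."""
    return search_space * 2.0 ** (-bits)


space = effective_search_space(QUERY_LEN, DB_LEN, LENGTH_ADJUSTMENT)
bits = bit_score(RAW_SCORE, GAPPED_LAMBDA, GAPPED_K)
e_value = expect_value(REPORTED_BITS, space)

print("effective search space:", space)
print("recomputed bit score:", round(bits, 1))
print("E-value from reported bits:", f"{e_value:.1e}")
```

The search space reproduces the reported 138624 (384 x 361), and the E-value computed from the reported 350 bits falls in the 1e-101 range, consistent with "Expect = e-101".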
This GapMind analysis is from Sep 24 2021. The underlying query database was built on Sep 17 2021.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other blast hits with at least 50% coverage are "low confidence."
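As a rough illustration of the 50% coverage cutoff, the sketch below computes coverage and percent identity for the alignment shown on this page. It is a minimal sketch, not GapMind's code: the helper names are hypothetical, and treating coverage as the aligned fraction of the curated (query) protein is an assumption. The high- and medium-confidence rules are not reproduced here, so only the low-confidence coverage threshold is checked.

```python
LOW_CONFIDENCE_MIN_COVERAGE = 0.5  # "at least 50% coverage"


def alignment_coverage(start, end, total_length):
    """Fraction of a sequence spanned by an alignment (1-based, inclusive)."""
    return (end - start + 1) / total_length


def percent_identity(identities, alignment_length):
    """Identical positions as a percentage of alignment columns."""
    return 100.0 * identities / alignment_length


# Coordinates and counts taken from the alignment on this page.
query_cov = alignment_coverage(10, 410, 415)   # curated protein (assumed basis)
sbjct_cov = alignment_coverage(4, 392, 392)    # candidate protein
identity = percent_identity(194, 402)
meets_low = query_cov >= LOW_CONFIDENCE_MIN_COVERAGE

print(f"query coverage: {query_cov:.1%}")
print(f"candidate coverage: {sbjct_cov:.1%}")
print(f"identity: {identity:.1f}%")
print("meets the 50% coverage cutoff:", meets_low)
```

The alignment spans 401 of the 415 query residues, so it clears the 50% cutoff comfortably; whether it also qualifies as high or medium confidence depends on rules not captured in this sketch.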
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
For more information, see:
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory