Align β-glucosidase (H1_GH3) (EC 3.2.1.21) (characterized)
to candidate WP_103233888.1 C1634_RS08970 beta-glucosidase BglX
Query= CAZy::AEW47970.1 (758 letters)
>NCBI__GCF_002899945.2:WP_103233888.1
Length = 740
Score = 704 bits (1818), Expect = 0.0
Identities = 360/722 (49%), Positives = 490/722 (67%), Gaps = 21/722 (2%)

Query: 30  IRTKVDALLSEMTLDEKIGQLNQYTSRWEMTGPAPQGKGEQELLEMIRKGQVGSMLNVNG 89
           I KV LLS+MTL+EK+GQ+ QY+ TGP Q +LE I+KG+VGSMLNV G
Sbjct: 23  IDQKVAELLSKMTLEEKVGQMVQYSGFEYATGP--QQSNSAVVLEEIKKGKVGSMLNVAG 80

Query: 90  AIATRNAQELAVKNSRLGIPLIFGYDVIHGYKTMFPIPLATAASWDPSAAELSARTAATE 149
           + T+ Q+LA++ SR+ IPL+FG DVIHGY+T FP+ L AASWD E S R AATE
Sbjct: 81  SEETKAFQKLAMQ-SRMKIPLLFGQDVIHGYRTTFPVNLGQAASWDLGMIEKSERIAATE 139

Query: 150 TAASGVHWTFAPMVDIARDARWGRIMEGAGEDPYLGAQMAAAQVKGFQGNDLSAENTIAA 209
           +A G+HWTFAPMVDIARD RWGR+MEG+GED YLG ++ A++KGFQG L + + + A
Sbjct: 140 ASAYGIHWTFAPMVDIARDPRWGRVMEGSGEDTYLGTKIGLARIKGFQGRGLGSLDAVMA 199

Query: 210 CAKHFAAYGFAEAGRDYNTVEITENTLRNVVLPPFKACADAGVATFMNAFNEIGGVTATA 269
           CAKHFAAYG A GRDYN+V+++ L LPPFKA A+AGVATFMN+FN+I G+ ATA
Sbjct: 200 CAKHFAAYGAAVGGRDYNSVDMSLRQLNETYLPPFKAAAEAGVATFMNSFNDINGIPATA 259

Query: 270 NKHLVRDILKGEWGFSGYVVSDWNSIGEIYEHGMTPDKKEAAFLAIKAGSDMDMEGNAYI 329
           N+++ R++LKG+W + +VVSDW SIGE+ HG D EAA AI+ GSDMDME Y+
Sbjct: 260 NQYIQRNLLKGKWNYKDFVVSDWGSIGEMIPHGYAKDASEAAEKAIQGGSDMDMESRVYM 319

Query: 330 AHLKELVEEGRVDESMIDDAVRRILTLKFELGLFDDPFRYSDPGKEKILL-SEEHLKAAR 388
           A L +LV+EG+VD ++DDA RILT KFE+GLFDDP+R+S ++K ++E+ K R
Sbjct: 320 AELPKLVKEGKVDPKLVDDATARILTKKFEMGLFDDPYRFSSEKRQKEQTDNQENRKFGR 379

Query: 389 DVAKKSIVLLKNEKQLLPLKKSGQKIALIGDLADDKDSPLGSWRAQAVAGS--AVSLLDG 446
           + KSIVLLKN+ +LPL K+ + +ALIG + + G W + VS DG
Sbjct: 380 EFGSKSIVLLKNQGNILPLSKTTKTVALIGPFGKETVANHGFWSIAFKDDNQRIVSQFDG 439

Query: 447 MKNAIQDQRSLTFEQGPVFVTSTPQFTQHLQFNEKDLTGIDQAVELAEKSDVVVLALGEN 506
           +KN + +L + +G +++D T +A+E A+K+DVV++ LGE
Sbjct: 440 IKNQLDKNSTLLYAKG-------------CNVDDQDKTQFAEAIETAKKADVVIMTLGEG 486

Query: 507 CFQTGEGRSQTEIGLKGVQQQLLEAVYAANKNMVVVLMNGRPLVIDWMAERVPAIVEAWH 566
           +GE +S++ IG GVQ+ LL+ + K +++++ GRPL+ +W ++ +P I W
Sbjct: 487 HAMSGEAKSRSNIGFTGVQEDLLKEIAKTGKPIILMINAGRPLIFNWASDNIPTIAYTWW 546

Query: 567 LGSEAGNAIADVLFGDYNPSGKLPVSFPRSVGQCPIYYNHKNTGRPI--DTGTVFWSHYT 624
           LG+EAGN+IADVLFG NP GKLP+SFPR+ GQ P+YYNH NTGRP +T + S Y
Sbjct: 547 LGTEAGNSIADVLFGTVNPGGKLPMSFPRTEGQIPVYYNHYNTGRPAKNNTDRNYVSAYI 606

Query: 625 DQSNEPLFPFGYGLSYTTFEYADLKLSSSEIRPGEKLKISVNLKNTGKLSGAEVVQLYIR 684
           D N+P +PFGYGLSYT F+Y+D+ L+S+ + + L ISV + NTGK G EVVQLYIR
Sbjct: 607 DLDNDPKYPFGYGLSYTDFKYSDMVLNSASLTGNQTLNISVTVSNTGKYDGEEVVQLYIR 666

Query: 685 DLYGSVTRPVKELKGFKKISLNPGESRVVEFEISVRDLAFYTADGEWKAEPGHFHLWVGT 744
           DL+G V RPVKELKGF+K+ + GES+ VEF+++ DL F+ D + E G F + +GT
Sbjct: 667 DLFGKVVRPVKELKGFQKVWIKKGESKKVEFKLTPEDLKFFDDDLNFDWEGGEFDIMIGT 726

Query: 745 NS 746
           +S
Sbjct: 727 DS 728

Lambda K H
0.317 0.134 0.392

Gapped
Lambda K H
0.267 0.0410 0.140

Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 1474
Number of extensions: 60
Number of successful extensions: 7
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 758
Length of database: 740
Length adjustment: 40
Effective length of query: 718
Effective length of database: 700
Effective search space: 502600
Effective search space used: 502600
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.6 bits)
S2: 55 (25.8 bits)
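The summary figures in the BLAST report above can be cross-checked with a few lines of arithmetic. The following is a minimal Python sketch, not part of GapMind itself: all variable and function names are illustrative, and the input values are copied by hand from the report. It assumes the standard Karlin-Altschul conversion from raw score to bit score, bits = (lambda * S - ln K) / ln 2, using the gapped Lambda and K printed in the report. Because those printed parameters are rounded, the recomputed bit score only approximately matches the reported 704 bits.

```python
import math

# Values copied by hand from the BLAST report (illustrative names).
raw_score = 1818
identities, positives, gaps, align_len = 360, 490, 21, 722
query_len, subject_len = 758, 740
query_span = (30, 746)      # first and last aligned query positions
subject_span = (23, 728)    # first and last aligned subject positions
length_adjustment = 40
gapped_lambda, gapped_k = 0.267, 0.0410


def floor_percent(part, whole):
    """Integer percentage, rounded down (the report's 360/722 is shown as 49%)."""
    return part * 100 // whole


def span_coverage(span, length):
    """Percent of a sequence covered by an inclusive alignment span."""
    start, end = span
    return 100.0 * (end - start + 1) / length


def bit_score(raw, lam, k):
    """Karlin-Altschul conversion of a raw alignment score to bits."""
    return (lam * raw - math.log(k)) / math.log(2)


# Effective lengths are the sequence lengths minus the length adjustment.
effective_query = query_len - length_adjustment      # 758 - 40 = 718
effective_subject = subject_len - length_adjustment  # 740 - 40 = 700
search_space = effective_query * effective_subject   # 718 * 700 = 502600

bits = bit_score(raw_score, gapped_lambda, gapped_k)
e_value = search_space * 2.0 ** (-bits)  # far below any printable threshold

query_cov = span_coverage(query_span, query_len)
subject_cov = span_coverage(subject_span, subject_len)

print(f"identity {floor_percent(identities, align_len)}%, "
      f"positives {floor_percent(positives, align_len)}%, "
      f"gaps {floor_percent(gaps, align_len)}%")
print(f"query coverage {query_cov:.1f}%, subject coverage {subject_cov:.1f}%")
print(f"search space {search_space}, bits {bits:.1f}, E-value {e_value:.3g}")
```

The alignment spans residues 30 to 746 of the 758-residue query and residues 23 to 728 of the 740-residue candidate, so it covers roughly 95% of each protein.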
This GapMind analysis is from Sep 24 2021. The underlying query database was built on Sep 17 2021.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other blast hits with at least 50% coverage are "low confidence."
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
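To illustrate what a six-frame translation is, here is a minimal Python sketch. It is not GapMind's or Curated BLAST's implementation: the function names are hypothetical, and the codon table is the standard genetic code, whose codon-to-amino-acid assignments are shared by the bacterial and archaeal code. The sequence is translated in three reading frames on the forward strand and three on the reverse complement, so a protein can be found even when no gene was predicted at that location.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons; unknown codons (e.g. containing N) become X."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")
        for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translation(dna):
    """Map frame labels (+1..+3, -1..-3) to translated protein strings."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


if __name__ == "__main__":
    for label, protein in six_frame_translation("ATGGCCTAA").items():
        print(label, protein)
```

Stop codons appear as `*` in the output, so open reading frames are the stretches between them.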
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory