Align α-glucosidase (YgjK;EcYgjK;b3080) (EC 3.2.1.20|3.2.1.84) (characterized)
to candidate 5211333 Shew_3745 glycoside hydrolase family protein (RefSeq)
Query= CAZy::AAA57881.1 (783 letters)
>FitnessBrowser__PV4:5211333 Length = 738
Score = 447 bits (1150), Expect = e-130
Identities = 282/787 (35%), Positives = 409/787 (51%), Gaps = 84/787 (10%)

Query: 10 VTCALLISF-SAHAANADNYKNVINRTGAPQYMKDYDYDDHQRFNP-FFDLGAWHGHLLP 67
++ +LL S S A A Y++++N G P M+ D + + D GAWHG LP
Sbjct: 15 LSLSLLCSLPSVSVAQASEYRDLLNYRGTPSNMEQRDPQGNLTIPAVYMDQGAWHGFHLP 74

Query: 68 DGPNTMGGFPGVALLTEEYINFMASNFDRLTVW--QDGKKVDFT----LEAYSIPGALVQ 121
D P GGF G + +EY ++ + +L ++ + G+ VD +E YS P LVQ
Sbjct: 75 DSPAYYGGFTGPLFIAQEYSLHLSDSLQKLQLFSGESGQSVDLAKAEQVEIYSEPWGLVQ 134

Query: 122 KLTAKDVQVEMTLRFATPRTSLLETKIT--SNKPLD--LVWDGELLEKLEAKEGKPLSDK 177
+ KD+++ TL ++ RT+++ T++ S+KP L W G + L DK
Sbjct: 135 RFRFKDLELSTTLEYSDNRTAIVSTRLINLSDKPGSWRLSWSGSPFATHPKLKQYRLVDK 194

Query: 178 TIAGEYPDYQRKISATRDGLKVTFGKVRATWDLLTSGESEYQVHKSLPVQTEINGNRFTS 237
+ E D + F + TW + ++ Y++ V + G +
Sbjct: 195 RLLSE------------DAVTWQFTPIDQTWQMQLD-DASYRLAFEQAVTLSVEGPQ--G 239

Query: 238 KAHINGSTTLYTTYSHLL-TAQEVSKEQMQIRDILARPAFYLTA----SQQRWEEYLKKG 292
+G TL + +L A + Q++ R +TA ++ RW++ L K
Sbjct: 240 YRADSGLLTLAPGEATVLRAAHQYFHTQVEARHAAKLDWPTVTAKLAVNRSRWQQRLDK- 298

Query: 293 LTNPDATPEQTRVAVKAIETLNGNWRSPGGAVKFNTVTPSVTGRWFSGNQTWPWDTWKQA 352
L P + R+A K++ TL NWRSP GA+ + VTPSVT +WF+G W WD+WKQA
Sbjct: 299 LVRGGELPAR-RLAAKSMMTLLHNWRSPAGALLHDAVTPSVTYKWFNG--VWAWDSWKQA 355

Query: 353 FAMAHFNPDIAKENIRAVFSWQIQPGDSVRPQDVGFVPDLIAWNLSPERGGDGGNWNERN 412
A+A F+ +A+ N+ A+F +Q D +RP+D G +PD I +N RGG GGNWNERN
Sbjct: 356 VALAQFDVALAELNVLAMFDYQFDAKDPLRPEDAGNLPDAIFYNPDASRGGKGGNWNERN 415

Query: 413 TKPSLAAWSVMEVYNVTQDKTWVAEMYPKLVAYHDWWLRNRDHNGNGVPEYGATRDKAHN 472
KP LAAW+V ++Y +QDK + +YPKLVAYH+WW RNRDHN NG+ EYGA R H
Sbjct: 416 GKPPLAAWAVWQIYQQSQDKALITRLYPKLVAYHEWWYRNRDHNHNGLAEYGANRHPRHG 475

Query: 473 TESGEMLFTVKKGDKEETQSGLNNYARVVEKGQYDSLEIPAQVAASWESGRDDAAVFGFI 532
E G E + G + V+E AA+WESG D+A F
Sbjct: 476 -EPG-----------EPGEPGEPDREAVIE-------------AAAWESGMDNAPRFDMG 510

Query: 533 DKEQLDKYVANGGKRSDWTVKFAENRSQDGTLLGYSLLQESVDQASYMYSDNHYLAEMAT 592
D+ Q+ EN G LLGYS+ QESVD SY+Y++ YLA+MA
Sbjct: 511 DELQV-----------------LENYDAKGRLLGYSISQESVDLNSYLYAEKGYLAQMAE 553

Query: 593 ILGKPEEAKRYRQLAQQLADYINTCMFDPTTQFYYDVRIEDKPLANGCAGKPIVERGKGP 652
+L EA +R+ A +L I + FD + F+YD R+ G + ++ GKG
Sbjct: 554 LLDLDSEAAVWREQAARLGQLIRSEFFDEESGFFYDRRLA------GERSRLMIAEGKGV 607

Query: 653 EGWSPLFNGAATQANADAVVKVMLDPKEFNTFVPLGTAALTNPAFGADIYWRGRVWVDQF 712
EGW PL+ GAA+QA A+ ++ L+ F T +P T + N AF YWRG VW+DQ
Sbjct: 608 EGWLPLWAGAASQAQAEQMIATQLNASNFGTKLPFPTVSADNSAFAPRRYWRGPVWLDQA 667

Query: 713 WFGLKGMERYGYRDDALKLADTFFRHAKGLTADGPIQENYNPLTGAQQGAPNFSWSAAHL 772
FGL+G+ RYG+ + A +LA A G+ DGPI+ENY+PLTG NFSWSA+ L
Sbjct: 668 LFGLQGVSRYGHDELARRLATRLVNEADGVLGDGPIRENYDPLTGDGLHCTNFSWSASVL 727

Query: 773 YMLYNDF 779
++Y +
Sbjct: 728 LLIYRSW 734

Lambda K H
0.316 0.133 0.413
Gapped
Lambda K H
0.267 0.0410 0.140
Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 1767
Number of extensions: 89
Number of successful extensions: 7
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 783
Length of database: 738
Length adjustment: 40
Effective length of query: 743
Effective length of database: 698
Effective search space: 518614
Effective search space used: 518614
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.6 bits)
S2: 55 (25.8 bits)
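The summary figures in the alignment above can be rechecked by hand. The following is a minimal Python sketch, not part of GapMind itself: it parses the Identities/Positives/Gaps fractions from the summary line, derives coverage from the first and last residue numbers shown in the alignment (query 10 to 779, subject 15 to 734), and recomputes the effective search space from the reported lengths and length adjustment. The function names `parse_fractions` and `percent` are illustrative only.

```python
import re

# Summary line copied from the BLAST output above.
summary = (
    "Score = 447 bits (1150), Expect = e-130 "
    "Identities = 282/787 (35%), Positives = 409/787 (51%), "
    "Gaps = 84/787 (10%)"
)


def parse_fractions(text):
    """Return {name: (numerator, denominator)} for Identities, Positives, Gaps."""
    pattern = r"(Identities|Positives|Gaps) = (\d+)/(\d+)"
    return {name: (int(n), int(d)) for name, n, d in re.findall(pattern, text)}


def percent(numerator, denominator):
    """Plain percentage; BLAST itself prints the truncated integer part."""
    return 100.0 * numerator / denominator


fractions = parse_fractions(summary)
identity_pct = percent(*fractions["Identities"])

# Coverage = aligned span / full sequence length, using the residue
# numbers at the start and end of the alignment.
query_coverage = percent(779 - 10 + 1, 783)
subject_coverage = percent(734 - 15 + 1, 738)

# Effective search space = (query length - adjustment) * (database length - adjustment).
length_adjustment = 40
effective_search_space = (783 - length_adjustment) * (738 - length_adjustment)

print(f"identity: {identity_pct:.1f}%")
print(f"query coverage: {query_coverage:.1f}%")
print(f"subject coverage: {subject_coverage:.1f}%")
print(f"effective search space: {effective_search_space}")
```

The recomputed effective search space matches the value reported in the statistics block (518614), and the identity percentage truncates to the 35% shown in the summary line.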
This GapMind analysis is from Sep 17 2021. The underlying query database was built on Sep 17 2021.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other blast hits with at least 50% coverage are "low confidence."
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
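The definition of a "gap" given above (a step with no high- or medium-confidence candidate) can be sketched in a few lines of Python. This is an illustration only, not GapMind's own code: the step names and the `find_gaps` helper are hypothetical, and the confidence labels are assumed to have been assigned already.

```python
def find_gaps(step_candidates):
    """Return the steps that have no high- or medium-confidence candidate.

    step_candidates maps a step name to the list of confidence labels
    ("high", "medium", or "low") of its candidates.
    """
    return [
        step
        for step, levels in step_candidates.items()
        if not any(level in ("high", "medium") for level in levels)
    ]


# Hypothetical pathway: step names and labels are made up for illustration.
example_pathway = {
    "stepA": ["high", "low"],
    "stepB": ["low"],      # only a low-confidence candidate: counts as a gap
    "stepC": [],           # no candidates at all: counts as a gap
    "stepD": ["medium"],
}

gaps = find_gaps(example_pathway)
print(gaps)
```

A step with only low-confidence candidates is treated the same as a step with none, which follows the wording of the definition.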
For more information, see:
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory