Align α,α-trehalase (MSMEG_4535;MSMEG4528) (EC 3.2.1.28) (characterized)
to candidate WP_047005936.1 AAW01_RS03580 glycoside hydrolase family 15 protein
Query= CAZy::ABK72415.1
         (668 letters)

>NCBI__GCF_001010925.1:WP_047005936.1
          Length = 599

 Score = 225 bits (573), Expect = 5e-63
 Identities = 180/610 (29%), Positives = 277/610 (45%), Gaps = 50/610 (8%)

Query: 48  LSDCETTCLISSAGSVEWLCVPRPDSPSVFGAIL-----DRGAGHFRLGPYGVSVPAARR 102
           + +C+ + L+   G++ W CVPR D    F A+L     D G   F L     S    +
Sbjct: 13  IGNCQVSGLVDKRGAIVWGCVPRVDGDPTFCALLNGASQDVGVWRFELEGQTAS---HQE 69

Query: 103 YLPGSLILETTWQTHTGWLIVRDALVMGPWHDIDTRSRTHRRTPMDWDAEHILLRTVRCV 162
           Y+  + IL T  +   G  +  + L   P    +   R +R             R VR V
Sbjct: 70  YIRNTPILVTRLEAADGSAV--EVLDFCP--RFEGSGRMYRPVAF--------ARIVRPV 117

Query: 163 SGTVELVMSCEPAFDYHRVSATWEYSGPAYGEAIARASRNPDSHPTLRLTTNLRIG--IE 220
           +G   + +  +P  D+ + +A   +             R   S  ++RL+T+  +G  +E
Sbjct: 118 AGNPRIRVVLKPMRDWGQAAAETTHG--------TNHIRYLMSGQSMRLSTDAAVGYILE 169

Query: 221 GREARARTRLTEGDNVFVALSWSKHPAPQTYEEAADKMWK-TSEAWRQWINVGDFPDHPW 279
           GR  R    + E  + F+       P      E   +M + T   W+ W      P   W
Sbjct: 170 GRTFR----IEEDTHFFLG---PDEPFVGNLREQVRRMEQSTRRYWQLWARSLATP-FEW 221

Query: 280 RAYLQRSALTLKGLTYSPTGALLAAPTTSLPETPQGERNWDYRYSWIRDSTFALWGLYTL 339
           +  + R+A+TLK   +  TGA++AA TTS+PE P  ERNWDYRY WIRDS + +  L  L
Sbjct: 222 QQEVIRAAITLKLCQHEETGAIVAALTTSIPEAPGSERNWDYRYCWIRDSYYTVQALNRL 281

Query: 340 GLDREADDFFSFIADVSGANNGERHPLQVMYGVGGERSLVEEELHHLSGYDNSRPVRIGN 399
           G     + +  F+ ++   +N +   +Q +Y V G   L E     L+GY    PVR+GN
Sbjct: 282 GALDVLEKYLGFLRNL--VDNAKGGQIQPLYSVMGVAELTENTAGSLAGYRGMGPVRVGN 339

Query: 400 GAYNQRQHDIWG-TMLDSV--YLHAKSREQIPDALWPVLKNQVEEAIKHWKEPDRGIWEV 456
            AY Q QHD +G  +L +V  +L  +      D  +  L+   E A     +PD G+WE
Sbjct: 340 AAYKQVQHDAYGQIVLPTVQGFLDRRLLRMADDRDFESLEEVGEMAWSMHDQPDAGLWEF 399

Query: 457 RGEPQHFTSSKIMCWVALDRGSKLAELQGEKSYAQQWRAIAEEIKADVLARGVDKRGV-- 514
           R   +  T S +M W A DR +  A   G++  A  W   A+ I+  + AR   + G
Sbjct: 400 RTRQEVHTYSAVMSWAACDRLATAAHYLGKQDRAHFWSDRADTIRDTIEARAWKENGEGG 459

Query: 515 -LTQRYGDDALDASLLLAVLTRFLPADDPRIRATVLAIADELTEDGLVLRYRVEETDDGL 573
                +  D LDASLL  +  R++  DD R   T   +  +L     +LRY  E   D
Sbjct: 460 HYGASFESDYLDASLLQLLELRYVTPDDERFEQTFAMVERDLRRGEHMLRYAAE---DDF 516

Query: 574 AGEEGTFTICSFWLVSALVEIGEISRAKHLCERLLSFASPLHLYAEEIEPRTGRHLGNFP 633
              E  F IC+FWL+ AL  +G    A+ L   +L+  +   L +E+++  TG   GNFP
Sbjct: 517 GAPETAFNICTFWLIEALALMGRKDEARELFCTMLAHRTGSGLLSEDMDFETGELWGNFP 576

Query: 634 QAFTHLALIN 643
           Q ++ + +IN
Sbjct: 577 QTYSLVGIIN 586

Lambda     K      H
   0.319    0.135    0.426

Gapped
Lambda     K      H
   0.267   0.0410    0.140

Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 1063
Number of extensions: 60
Number of successful extensions: 6
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 668
Length of database: 599
Length adjustment: 38
Effective length of query: 630
Effective length of database: 561
Effective search space: 353430
Effective search space used: 353430
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 54 (25.4 bits)
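The summary statistics reported with this alignment can be cross-checked from the other numbers in the report. The sketch below is a minimal illustration, not part of GapMind: it assumes the standard Karlin-Altschul relations (bit score = (lambda * S - ln K) / ln 2 and E = effective search space * 2^(-bit score)), applied with the gapped Lambda and K values shown above. The function names are hypothetical, and the percentages are truncated to whole numbers, which is an assumption that happens to reproduce the figures printed in the report.

```python
import math

# Values copied from the BLAST report above.
RAW_SCORE = 573            # the number in parentheses after "225 bits"
GAPPED_LAMBDA = 0.267
GAPPED_K = 0.0410
QUERY_LEN = 668
SUBJECT_LEN = 599
LENGTH_ADJUSTMENT = 38
IDENTITIES, POSITIVES, GAPS, ALIGN_LEN = 180, 277, 50, 610


def bit_score(raw_score, lam, k):
    """Convert a raw alignment score to bits (Karlin-Altschul normalization)."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def effective_search_space(query_len, subject_len, adjustment):
    """Product of the two sequence lengths after subtracting the length adjustment."""
    return (query_len - adjustment) * (subject_len - adjustment)


def expect_value(bits, search_space):
    """Expected number of chance alignments scoring at least `bits`."""
    return search_space * 2.0 ** (-bits)


def percent(count, total):
    """Whole-number percentage, truncated (assumed to match the report's rounding)."""
    return 100 * count // total


bits = bit_score(RAW_SCORE, GAPPED_LAMBDA, GAPPED_K)
space = effective_search_space(QUERY_LEN, SUBJECT_LEN, LENGTH_ADJUSTMENT)
evalue = expect_value(bits, space)

print(f"bit score: {bits:.1f}")
print(f"effective search space: {space}")
print(f"E-value: {evalue:.1e}")
print(f"identity {percent(IDENTITIES, ALIGN_LEN)}%, "
      f"positives {percent(POSITIVES, ALIGN_LEN)}%, "
      f"gaps {percent(GAPS, ALIGN_LEN)}%")
```

Under these assumptions the recomputed values agree with the report: an effective search space of 353430, a bit score that rounds to 225, and an E-value of the same order as 5e-63.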
This GapMind analysis is from Sep 24 2021. The underlying query database was built on Sep 17 2021.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other blast hits with at least 50% coverage are "low confidence."
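As a rough illustration of the 50% coverage cutoff, the sketch below computes coverage for the alignment on this page. It assumes coverage means the fraction of the characterized (query) protein spanned by the alignment; the function name and that definition are assumptions for illustration, not GapMind's code. The stricter high- and medium-confidence rules are not modeled here.

```python
def query_coverage(align_start, align_end, query_length):
    """Fraction of the query spanned by the alignment (1-based, inclusive coordinates)."""
    return (align_end - align_start + 1) / query_length


# Query residues 48..643 of 668 are aligned in the report on this page.
coverage = query_coverage(48, 643, 668)

# Assumed reading of the rule: a hit needs at least 50% coverage to count as low confidence.
meets_low_confidence_coverage = coverage >= 0.50

print(f"coverage: {coverage:.1%}")
print(f"meets 50% coverage cutoff: {meets_low_confidence_coverage}")
```

By this measure the hit covers roughly 89% of the query, so it clears the coverage cutoff; whether it rates higher than low confidence depends on criteria not shown in this sketch.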
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
For more information, see:
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory