Align β-galactosidase Z (LacZ;TmLac;TM1193) (EC 3.2.1.23) (characterized)
to candidate 352820 BT3293 beta-galactosidase (NCBI ptt file)
Query= CAZy::AAD36268.1
         (1087 letters)

>FitnessBrowser__Btheta:352820
          Length = 1421

 Score = 681 bits (1758), Expect = 0.0
 Identities = 395/1035 (38%), Positives = 582/1035 (56%), Gaps = 81/1035 (7%)

Query: 7    EWENPQLVSEGTEKSHASFIPYLDPFSGEWEYPEE---FISLNGNWRFLFAKNPFEVPED 63
            EW N  +     E++    IP+ D         EE   + +LNG W+F +  +P + P+D
Sbjct: 29   EWSNELVSGVNKEEAVQIAIPFTDEQQAMNLTIEESPYYKTLNGIWKFHWVADPKDRPQD 88

Query: 64   FFSEKFDDSNWDEIEVPSNWEM------KGYGKPIYTNVVYPF--------------EPN 103
            F   ++D S WD I+VP+ W++      K + KP+Y NV+YPF              +P
Sbjct: 89   FCKPEYDVSQWDNIKVPATWQIEAVRHNKNWDKPLYCNVIYPFCEWDWKKIQWPNVIQPR 148

Query: 104  PP--FVPKDDNPTGVYRRWIEIPEDWFKKEIFLHFEGVRSFFYLWVNGKKIGFSKDSCTP 161
            P         NP G YRR   +P+ W  ++IF+ F GV + FY+WVNGKK+G+S+DS  P
Sbjct: 149  PSNYTFATMPNPVGSYRREFILPDSWKGRDIFIRFNGVEAGFYIWVNGKKVGYSEDSYLP 208

Query: 162  AEFRLTDVLRPGKNLITVEVLKWSDGSYLEDQDMWWFAGIYRDVYLYALPKFHIRDVFVR 221
            AEF LT  L+ GKN++ VEV +++DGS+LE QD W F+GI+RDV+L++ PK  IRD F R
Sbjct: 209  AEFNLTPYLKAGKNVLAVEVYRFTDGSFLECQDFWRFSGIFRDVFLWSAPKTQIRDFFFR 268

Query: 222  TDLDENYRNGKIFLDVEMRNLGEEEEKDLEVTLITPDGDEKTLVKETVKPEDRVLSFAFD 281
            TDLD+ Y+N  + LD+++       E  ++VT    D + K +  +  +         F+
Sbjct: 269  TDLDKEYKNASVSLDIDITGKRSNNEIQVKVT----DQNGKEIATQNARAVTGTNKLQFE 324

Query: 282  VKDPKKWSAETPHLYVLKLKLGE-----DEKKVNFGFRKIEI-KDGTLLFNGKPLYIKGV 335
            V +P KW+AETP+LY L + L +     D + V  GFRKIE+ +DG LL NGK    KGV
Sbjct: 325  VVNPLKWTAETPNLYNLTILLKQKGKTVDIRSVKVGFRKIELAQDGRLLINGKSTLFKGV 384

Query: 336  NRHEFDPDRGHAVTVERMIQDIKLMKQHNINTVRTSHYPNQTKWYDLCDYFGLYVIDEAN 395
            +RH+   + G  V+ E M +D++LMK  NIN VRTSHYPN   +YDLCD +G+YV+ EAN
Sbjct: 385  DRHDHSSENGRTVSKEEMEKDVQLMKSLNINAVRTSHYPNNPYFYDLCDRYGIYVLSEAN 444

Query: 396  IESHGIDWDPEVTLANRWEWEKAHFDRIKRMVERDKNHPSIIFWSLGNEAGDGVNFEKAA 455
            +E HG+     + L++   W KA  +R + MV R KNH SI+ WSLGNE+G+G+NF+ AA
Sbjct: 445  VECHGL-----MALSSEPSWVKAFTERSENMVRRYKNHASIVMWSLGNESGNGINFKSAA 499

Query: 456  LWIKKRDNTRLIHYEGTTRRGESYYVDVFSLMYPKM--------DILLEYASKKREKPFI 507
              +KK D+TR  HYE     G S Y DV S MYP +        + L ++ + +  KP +
Sbjct: 500  EAVKKLDDTRPTHYE-----GNSSYCDVTSSMYPDVQWLESVGKERLQKFQNGETVKPHV 554

Query: 508  MCEYAHAMGNSVGNLKDYWDVIEKYPYLHGGCIWDWVDQGIRKKDENGREFW-AYGGDFG 566
            +CEYAHAMGNS+GN K+YW+  E+YP L GG IWDWVDQ I+    +G  ++ A+GGDFG
Sbjct: 555  VCEYAHAMGNSIGNFKEYWETYERYPALVGGFIWDWVDQSIKMPAPDGSGYYMAFGGDFG 614

Query: 567  DTPNDGNFCINGVVLPDRTPEPELYEVKKVYQNVKIRQVSKDTYEVENRYLFTNLEMFDG 626
            DTPNDGNFC NGV+  DRT   + YEVKK++Q V +  +   TY++ N+     L+   G
Sbjct: 615  DTPNDGNFCTNGVIFSDRTYSAKAYEVKKIHQPVWVEAMGNGTYKLTNKRFHAGLDDLYG 674

Query: 627  AWKIRKDGEVIEEKTF-KIFAEPGEKRLLKI---PLPEMDDSEYFLEISFSLSEDTPWAE 682
             ++I +DG+V+      ++     + +++ I    + ++  +EYF++  F   +DT W +
Sbjct: 675  RYEIEEDGKVVFSANLEELSLNAQDSKVITIADNQINKIPGAEYFIKFRFCQKQDTEWEK 734

Query: 683  KGHVVAWEQFLLKAPAFEKKSISDG-VSLREDGKHLTVEAKDTVYVFSKLTGLLEQILHR 741
             G+ VA EQF L   A       +G + L E      V+       FSK  G +
Sbjct: 735  AGYEVASEQFKLSDSAKPVFKAGEGSIDLIETDDAYLVKGSQFEASFSKQQGTISSYTLN 794

Query: 742  RKKILKSPVVPNFWRVPTDNDIGNRMPQRLAIW-KRASKERKLFKMHW--KKEENRVSVH 798
               ++   +  N +R PTDND      Q    W ++   +  L   HW  +KE+N+V++
Sbjct: 795  ELPMISKGLELNAFRAPTDND-----KQVDGDWYQKGLYQMTLEPGHWNVRKEDNKVTLQ 849

Query: 799  SVFQLPGNS-WVYTT---YTVFGNGDVLVDLSLIPAEDVPEIPRIGFQFTVPEEFGTVEW 854
                  G + + Y T   YTV  +G +LV+ ++IP+     IPRIG++  +PE F  + W
Sbjct: 850  IENLYRGKTGFDYRTNIEYTVAADGSILVNSTIIPSTKGVIIPRIGYRMELPEGFERMRW 909

Query: 855  YGRGPHETYWDRKESGLFARYRKAVGEMMHRYVRPQETGNRSDVRWFALS--DGETKLFV 912
            YGRGP E Y DRK++     Y + V +    YVR QE GNR D+RW +++  DG   +F+
Sbjct: 910  YGRGPLENYVDRKDATYVGVYDELVSDQWVNYVRAQEMGNREDLRWISITNPDGIGFVFI 969

Query: 913  SGMPQIDFSVWPFSMEDL------ERVQHISELPERDFVTVNVDFRQMGLGGDDSWGAMP 966
            +G  ++  S    + +D+       R+ H  E+P R    + +D  Q  L G+ S G  P
Sbjct: 970  AG-DKMSASALHATAQDMVDPANHRRLLHKYEVPMRKETVLCLDANQRPL-GNASCGPGP 1027

Query: 967  HLEYRLLPKPYRFSF 981
              +Y L  +P  FSF
Sbjct: 1028 PMQKYELRSQPTVFSF 1042

Lambda     K      H
   0.319    0.139    0.438

Gapped
Lambda     K      H
   0.267   0.0410    0.140

Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 4611
Number of extensions: 256
Number of successful extensions: 13
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 1087
Length of database: 1421
Length adjustment: 48
Effective length of query: 1039
Effective length of database: 1373
Effective search space: 1426547
Effective search space used: 1426547
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 59 (27.3 bits)
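The summary numbers in the BLAST report can be cross-checked with a little arithmetic. The sketch below is illustrative only and is not part of GapMind or BLAST. It assumes the standard Karlin-Altschul relations (bit score S' = (lambda * S - ln K) / ln 2 and E = m * n * 2^(-S')), and all function names are hypothetical. Because the report prints lambda and K rounded, the recomputed bit score should only agree with the reported 681 bits to within about a bit.

```python
import math

# Values copied from the BLAST report above.
QUERY_LEN, SUBJECT_LEN, LENGTH_ADJUSTMENT = 1087, 1421, 48
RAW_SCORE = 1758
GAPPED_LAMBDA, GAPPED_K = 0.267, 0.0410  # rounded as printed in the report
IDENTITIES, POSITIVES, GAPS, ALIGNMENT_LEN = 395, 582, 81, 1035


def effective_search_space(query_len, subject_len, adjustment):
    """Product of the two effective lengths (length minus the adjustment)."""
    return (query_len - adjustment) * (subject_len - adjustment)


def bit_score(raw_score, lam, k):
    """Karlin-Altschul normalization of a raw score into bits."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def expect_value(search_space, bits):
    """Expected number of chance hits scoring at least `bits`."""
    return search_space * 2.0 ** (-bits)


def percent(count, total):
    """Integer percentage, rounded down."""
    return count * 100 // total


space = effective_search_space(QUERY_LEN, SUBJECT_LEN, LENGTH_ADJUSTMENT)
bits = bit_score(RAW_SCORE, GAPPED_LAMBDA, GAPPED_K)
expect = expect_value(space, bits)

print("effective search space:", space)
print("bit score (approx.):", round(bits, 1))
print("E-value (approx.):", expect)
print("identity / positives / gaps (%):",
      percent(IDENTITIES, ALIGNMENT_LEN),
      percent(POSITIVES, ALIGNMENT_LEN),
      percent(GAPS, ALIGNMENT_LEN))
```

The effective search space reproduces the reported 1426547 exactly, and the recomputed E-value is vanishingly small, which is consistent with the reported Expect = 0.0. Rounding the percentages down reproduces the report's 38%, 56% and 7%.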
This GapMind analysis is from Sep 17 2021. The underlying query database was built on Sep 17 2021.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other blast hits with at least 50% coverage are "low confidence."
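As a rough illustration of the 50% coverage cutoff, the sketch below computes how much of each protein the alignment shown above spans. It assumes that coverage means the fraction of a protein's length that falls inside the aligned range; this page does not say which protein GapMind measures coverage against, so both values are shown. The function names are hypothetical, and the high- and medium-confidence rules are not implemented because their criteria are not listed here.

```python
LOW_CONFIDENCE_MIN_COVERAGE = 0.5  # "at least 50% coverage"


def alignment_coverage(start, end, length):
    """Fraction of a protein of `length` residues spanned by positions start..end (1-based, inclusive)."""
    return (end - start + 1) / length


def passes_low_confidence_coverage(coverage, threshold=LOW_CONFIDENCE_MIN_COVERAGE):
    """True if a hit meets the coverage cutoff for a low-confidence candidate."""
    return coverage >= threshold


# Coordinates taken from the alignment above.
query_coverage = alignment_coverage(7, 981, 1087)        # characterized protein
candidate_coverage = alignment_coverage(29, 1042, 1421)  # candidate 352820

print("query coverage:", round(query_coverage, 3))
print("candidate coverage:", round(candidate_coverage, 3))
print("meets 50% cutoff:", passes_low_confidence_coverage(query_coverage))
```

Under this assumption the alignment covers about 90% of the characterized protein and about 71% of the candidate, so it clears the 50% cutoff either way.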
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
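For readers unfamiliar with the term, a six-frame translation translates a DNA sequence in all three reading frames on both strands, so that protein-coding regions missed by gene prediction can still be searched. The sketch below is a generic illustration, not the code used by GapMind or Curated BLAST. It assumes the standard genetic code, marks stop codons with `*` and codons containing ambiguous bases with `X`, and uses hypothetical function names.

```python
from itertools import product

# Standard genetic code in NCBI order: first base varies slowest, third fastest.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons from position 0; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")
        for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translation(dna):
    """Return a dict mapping frame names (+1..+3, -1..-3) to protein strings."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


frames = six_frame_translation("ATGGCCTAA")
for name, protein in frames.items():
    print(name, protein)
```

Each translated frame can then be searched with a protein query in the same way as a predicted protein.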
For more information, see:
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory