Align β-galactosidase (Gal4214-1) (EC 3.2.1.23) (characterized)
to candidate CA265_RS04025 CA265_RS04025 glycoside hydrolase family 2
Query= CAZy::AAX48919.1
         (1046 letters)

>FitnessBrowser__Pedo557:CA265_RS04025
          Length = 1054

 Score = 805 bits (2079), Expect = 0.0
 Identities = 441/1075 (41%), Positives = 630/1075 (58%), Gaps = 53/1075 (4%)

Query: 3    MKKRTILTSIFAFISIIVFAQEKPSRNDWENPEVFQINREPARAAFLPFADEASAIADDY 62
            MK++ + FI+ + AQ+ PS + + PEV +NR P RA+ F ++ A
Sbjct: 1    MKRKLTILFTSLFIAGQISAQDLPS--ELQTPEVVSVNRLPMRASAFAFENQDLATKRAK 58

Query: 63   TRSPWYMSLDGKWKFNWSPTPDERPKDFFNTDFNTTTWKEIGVPSNWELVGYGIPIYTNI 122
            +S +++SL+G WKFNW P +RP DF+ DF+ W VP+NWE GYG PIY N
Sbjct: 59   EKSEYFLSLNGTWKFNWVKDPRKRPTDFYKLDFDDKGWDNFKVPANWETNGYGTPIYVNQ 118

Query: 123  TYPFV--------KNPPFIDHADN-PVGSYRRTFELPENWDGRRVYLHFEGGTSAMYVWI 173
            Y F NPPF ADN PVGSYR+ +P NW GR+V++ SA Y+W+
Sbjct: 119  PYEFAGRQLTGARMNPPFDIPADNNPVGSYRKKINIPANWSGRQVFISLGAVKSAFYIWV 178

Query: 174  NGEKVGYSQNTKSPTEFDITKYVKVGKNQVAVEVYRWSDGSYLEDQDFWRLSGIDRSVYL 233
            NG+KVGYS+++K EFDITKYVK G+N +A++VYRWSDG+YLE QD WR+SGI+R VYL
Sbjct: 179  NGKKVGYSEDSKLAAEFDITKYVKPGENTIALQVYRWSDGTYLECQDMWRISGIEREVYL 238

Query: 234  YSTANTRIADFFARPDLDTSYKNGSLSVDIKLKNANSVAKNNQT------VEAKLVDAAG 287
            YST I DF +LD +Y NG L+VD+ ++N + N + V LVDA G
Sbjct: 239  YSTPKLDIRDFKVIGNLDATYTNGLLNVDLAVENYKIDQRTNHSRPDSFYVALDLVDAKG 298

Query: 288  KEVF--IKTIKINLGANTVSSTTFEQMVKSPKLWNNETPNLYTLVLTLKDENGKFVETVA 345
            V+ TI+ LG N + +F+ + + K W+ E P LYTL +TLKD+N K +E +
Sbjct: 299  NNVWKDATTIQKVLG-NYKTDLSFKTQISNVKNWSAEIPYLYTLYITLKDKNNKIIEVIP 357

Query: 346  TSIGFRKVELKNGQLLVNGIRIMVHGVNIHEHNPKTGHYQDEATMMKDIKLMKQLNINAV 405
            +GFR VE+K LLVNG R+ + GVN HEHN GH A M KD+++MK+LN+NAV
Sbjct: 358  QRVGFRSVEIKGSDLLVNGKRVFLKGVNRHEHNATQGHTLTHADMEKDMEMMKKLNVNAV 417

Query: 406  RCSHYPNNLLWVKLCNKYGLFLVDEANIETHGMGAELQGSFDKTKHPAYLPEWKAAHMDR 465
            R SHYP + W++LC++YGL+++DEANIE+HG L+ +F K +W+ H++R
Sbjct: 418  RHSHYPPDPYWMELCDEYGLYVIDEANIESHGRYYSLETTFANDK------QWRIPHLER 471

Query: 466  IYSLVERDKNQPSIILWSLGNECGNGPVFHEAYNWIKNRDKTRLVQFEQAGEQENTDVVC 525
            I + ERDKN S+I WSLGNE GNG F+EAY W+K +D R VQ+E+A NTD++
Sbjct: 472  ITRMYERDKNHASVITWSLGNEAGNGVNFYEAYQWLKGKD-FRPVQYERAESDFNTDMIV 530

Query: 526  PMYPSMEYMKEYANRKDVKRPFIMCEYSHAMGNSNGNFQEYWDIIHSSTNMQGGFIWDWV 585
            P YPS Y+ Y+ + RPFIM EY+H MGNS GNF+EYWD I ++ +QGGF+W+W+
Sbjct: 531  PQYPSPNYLPRYSKQDKETRPFIMSEYAHIMGNSLGNFKEYWDAIENNPKLQGGFVWEWI 590

Query: 586  DQGFEETDEAGRKYWAYGGDMG-----GQNYTNDQNFCHNGLVWPDRTPHPGAFEVKKVY 640
            DQ +T + G++ AYGGD +N+ +D +FC G+V R P A E+KKV+
Sbjct: 591  DQAI-DTVKNGKRIMAYGGDFPLSGPVDENF-SDNDFCVKGVVTAYRGMTPMAVELKKVH 648

Query: 641  QDI--LFKGVNLDKGIIEVENGFGYTNLDKYLFKFEVLKNGLVIKSGVI-NIRLAPQSKK 697
            Q I F G N I V N + + ++ +E++++G VI++GV+ N+ + + +
Sbjct: 649  QYIKTTFNGNNQ----INVNNSYFFKDISNVQLNWELVEDGKVIENGVVSNLNVGARQTQ 704

Query: 698  QIQIELPKLTTEDGVEYLLNVFAYTKEGTELLPQNFEIAREQFSIG---ESNYFVKVAKA 754
            + + K G EY LNV K L + +E+A EQ ++ ++N + KA
Sbjct: 705  MLSLPF-KTNYAAGKEYFLNVHYRLKTAEPFLEKGYEVAYEQIALAGTPKANVYNSNKKA 763

Query: 755  STNPIVKDSQDAITLSANGVEVTINKKTGLMQKYTSGEENYFNQMPVPNFWRAPTDNDFG 814
            V+ + + + + +T + G + Y S + P P F+RAPTDND G
Sbjct: 764  LK---VEQTAEKAVVKGSDFTITFDLIKGTLASYVSKGQELLASGPQPGFYRAPTDNDIG 820

Query: 815  NYMQVNSNVWRTVGRFSSLDSIEVKEVSTQ---TTVVAHLFLKDIASTYTITYSMDADGS 871
            + +WR V + ++ +I+ ST V LK A T T +++ ADG+
Sbjct: 821  AGLNTKLRMWRNVYQDNNTSNIKSTINSTADGFILTVKSSLLKGDAET-TQEFNVSADGT 879

Query: 872  LTLQNSFKAGEMALSEMPRFGMLFSLKKELDNFSYYGRGPWENYQDRNTSSLKGIYESKV 931
            + + N FKA + R G LK + N +YGRGP ENY DR T+SL G Y+S V
Sbjct: 880  VKVNNQFKAVTGNYKSLMRIGNDLQLKNDYSNIQWYGRGPGENYVDRKTASLIGTYKSTV 939

Query: 932  ADQYVPYTRPQENGYKTDIRWITLTNSSGNGIEI-LGLQPLGVSALNNYPEDFDPGLTKK 990
            +DQY PY RPQE+G KTD+RW+T TN +G G+ Q L +AL ED DP KK
Sbjct: 940  SDQYFPYARPQESGNKTDVRWVTFTNKAGKGLRFEFADQLLNFNALPYSVEDLDPEAEKK 999

Query: 991  QQHTNDITPRDEVIICVDLAQRGLGGDNSWGAMPHEQYQLRNKAYSYGFVIKPIK 1045
            Q H+ ++ R+++ + +D+ Q G+ G +SWG+MP QYQ+ K Y Y + IKPIK
Sbjct: 1000 QYHSGELVKRNQIYVHMDMQQLGVQGIDSWGSMPLIQYQIPFKDYQYSYYIKPIK 1054

Lambda     K      H
   0.316    0.134    0.410

Gapped
Lambda     K      H
   0.267   0.0410    0.140

Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 3279
Number of extensions: 193
Number of successful extensions: 11
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 1046
Length of database: 1054
Length adjustment: 45
Effective length of query: 1001
Effective length of database: 1009
Effective search space: 1010009
Effective search space used: 1010009
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.6 bits)
S2: 58 (26.9 bits)
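The summary figures in the alignment report can be cross-checked with a little arithmetic. The Python sketch below assumes the standard Karlin-Altschul conversion from raw score to bit score, S' = (lambda * S - ln K) / ln 2, applied with the gapped parameters, and assumes that the reported percentages are truncated rather than rounded. All variable and function names are illustrative.

```python
import math

# Values copied from the alignment report.
raw_score = 2079
gapped_lambda, gapped_k = 0.267, 0.0410
query_len, subject_len, length_adjustment = 1046, 1054, 45
identities, positives, gaps, aligned_columns = 441, 630, 53, 1075
query_start, query_end = 3, 1045  # first and last aligned query positions


def bit_score(raw, lam, k):
    """Convert a raw score to bits: S' = (lambda * S - ln K) / ln 2."""
    return (lam * raw - math.log(k)) / math.log(2)


def percent(count, total):
    """Integer percentage, truncated (assumed to match the report's rounding)."""
    return count * 100 // total


# Effective lengths subtract the length adjustment from each sequence.
effective_query = query_len - length_adjustment      # 1001
effective_subject = subject_len - length_adjustment  # 1009
search_space = effective_query * effective_subject   # 1010009

bits = bit_score(raw_score, gapped_lambda, gapped_k)

# E = m * n * 2^(-S'); vanishingly small for a score this large,
# which is consistent with the report showing "Expect = 0.0".
expect = search_space * 2.0 ** (-bits)

# Fraction of the query sequence covered by the alignment.
query_coverage = (query_end - query_start + 1) / query_len

print(f"bit score ~ {bits:.1f}")
print(f"identity {percent(identities, aligned_columns)}%, "
      f"positives {percent(positives, aligned_columns)}%, "
      f"gaps {percent(gaps, aligned_columns)}%")
print(f"effective search space = {search_space}")
print(f"query coverage = {query_coverage:.3f}")
```

Because lambda and K are printed to only three significant figures, the recomputed bit score agrees with the reported 805 bits to within about a bit rather than exactly.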
This GapMind analysis is from Sep 17 2021. The underlying query database was built on Sep 17 2021.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other blast hits with at least 50% coverage are "low confidence."
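The confidence tiers form a first-match cascade: high is tested first, then medium, then low. The Python sketch below shows only that control flow. The specific high- and medium-confidence criteria are not listed in this text, so they are passed in as caller-supplied predicates (the hypothetical `high_rules` and `medium_rules`). Only the 50% coverage floor for low confidence comes from the text. The `Candidate` fields are illustrative and are not GapMind's actual data model, and reading "coverage" as a fraction of the curated protein is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Candidate:
    """Illustrative record for one blast hit (not GapMind's data model)."""
    locus: str
    identity: float  # fraction of identical positions, 0..1
    coverage: float  # assumed: fraction of the curated protein covered, 0..1


Rule = Callable[[Candidate], bool]


def classify(candidate: Candidate,
             high_rules: List[Rule],
             medium_rules: List[Rule]) -> Optional[str]:
    """Return 'high', 'medium', 'low', or None for a candidate.

    high_rules and medium_rules are hypothetical placeholders for the
    criteria that this text does not spell out.
    """
    if any(rule(candidate) for rule in high_rules):
        return "high"
    if any(rule(candidate) for rule in medium_rules):
        return "medium"
    # From the text: other blast hits with at least 50% coverage are low confidence.
    if candidate.coverage >= 0.5:
        return "low"
    return None


# The hit from the alignment report: 41% identity, query positions 3-1045 of 1046.
hit = Candidate("CA265_RS04025", identity=0.41, coverage=1043 / 1046)

# With no high or medium rules supplied, only the coverage floor applies.
print(classify(hit, high_rules=[], medium_rules=[]))
```

Passing the rules as predicates keeps the cascade separate from the thresholds, so the actual criteria can be plugged in without changing the control flow.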
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
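For readers unfamiliar with the term, a six-frame translation translates a DNA sequence in all three reading frames on each strand, so that a protein-coding region missed by gene prediction can still be found by a protein search. The Python sketch below is a minimal illustration using the standard genetic code. It is not how GapMind or Curated BLAST implement the search, and the function names are made up for this example.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons; codons with ambiguous bases become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")
        for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translation(dna):
    """Map frame names ('+1'..'+3', '-1'..'-3') to translated sequences."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


for name, protein in six_frame_translation("ATGGCC").items():
    print(name, protein)
```

Stop codons appear as `*` in the output, so open reading frames can be recovered by splitting each translated frame on that character.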
For more information, see:
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory