Align β-galactosidase (BgalB) (EC 3.2.1.23) (characterized)
to candidate WP_066328166.1 BLR17_RS02220 DUF4981 domain-containing protein
Query= CAZy::AAC24219.1 (1085 letters)

>NCBI__GCF_900100165.1:WP_066328166.1
Length = 1101

Score = 671 bits (1730), Expect = 0.0
Identities = 393/1043 (37%), Positives = 563/1043 (53%), Gaps = 79/1043 (7%)

Query: 5    WENPQLVSEGTEKPHASFIPYL---NPFTGEWEYPDDFILLNGNWKFFFAKNPFEVPENF 61
WENP + S E A+ Y + G +LNG+W F FAKN E P++F
Sbjct: 49   WENPMITSMNREPARATAYSYATVADALEGN-RNKSRIKVLNGDWDFKFAKNFQEAPQDF 107

Query: 62   FLEGFDDTNWDEIEVPSNWEMKGYGKPIYTNVVYPFEP-NPPFVPKDDNPTGIYRRWVEV 120
+ G WD+I VPSNWEMKGYG IY + VY F P NPPFVP+DDNP G Y+R V
Sbjct: 108  YKNGVK--GWDKIAVPSNWEMKGYGMAIYKSAVYSFRPVNPPFVPQDDNPVGSYQRKFTV 165

Query: 121  PEEWFEKEIFLHFEGVRSFFYLWVNGKRMGFSKDSCTPAEFRVTDVLKPGKNLICVEVLK 180
PE W + LHF GV S F +W+NG+ +G+ +DSC +EF +T LK G+N++ V V +
Sbjct: 166  PENWDGMTVSLHFGGVSSAFQVWINGEFVGYGEDSCNSSEFNITPYLKKGENILSVRVFR 225

Query: 181  WSDGSYLEDQDMWWFAGIYRDVYLYALSKFHVRDIFVRTDLDEDYRDGKIFLDVELRNLG 240
+SDGSYLEDQD W +GI R+VY+ A K + D F +T LD+ Y+D L L N
Sbjct: 226  YSDGSYLEDQDHWRLSGIQREVYIMAEPKLRIADFFYQTKLDKKYQDAVFQLRPRLDNQT 285

Query: 241  EE--KEKDLIITLTDPQGKEM----------TLVEERVGPKNETLSFVF---EVKDPKKW 285
+ K + L D Q K + ++ E V P+ + + F F +K+P KW
Sbjct: 286  GDTIKNHTFEVQLYDAQNKPLFEKDLSRKAIDIISE-VSPRLDKVRFGFFQETIKNPLKW 344

Query: 286  SAETPHLYVLKVELGEDE------KKVNFGFKKVEV--KDGRLLFNGKPLYIKGVNRHEF 337
SAE P+LY L + L + K GF+ +E ++G++L NGK Y+ GVN H
Sbjct: 345  SAEKPNLYTLVMILKNPDGSISEVKTCKLGFRSIEFSKENGKMLINGKETYVYGVNHHGH 404

Query: 338  DPDRGHAVTVERMIQDIKLMKQHNINTVRTSHYPNQTKWYDLCDYYGLYVIDEANIESHG 397
P RG AV + + +D+KLMK++N N VRTSH+P +Y+LCD YGL V+DEAN+E+HG
Sbjct: 405  HPSRGEAVNHQDLEEDVKLMKKYNFNFVRTSHFPEDPYFYELCDKYGLMVMDEANLETHG 464

Query: 398  IGEAPEVTLANRPEWEKAHLDRIKRMVERDKNHPSIIFWSLGNEAGDGMNFEKAALWIKE 457
+G L+N +W A+L+R+ RMV RDKNHPS++ WSLGNEAG G N A W +
Sbjct: 465  LGG----FLSNDQQWTHAYLERMTRMVHRDKNHPSVVMWSLGNEAGKGPNHAAMAGWTHD 520

Query: 458  RDNTRLVHYE--------------------GTTRRGESY-------YVDVFSLMYPKIDV 490
D TR VHYE T Y YVDV S YP +
Sbjct: 521  YDITRPVHYEPAQGNPRLDGYIDPLDPRYPKTVDHSHRYENPQDEPYVDVVSRFYPGVFT 580

Query: 491  LLEYASRKRE-KPFIMCEYAHAMGNSVGNLKDYWDVIEKYPYLHGGCIWDWVDQGIRKKD 549
+K++ +P + EY+H+MGNS GNLK+ WD P + GGCIWD DQG+ +KD
Sbjct: 581  PKFLVDQKKDTRPILFVEYSHSMGNSTGNLKELWDEFRNTPRVIGGCIWDLKDQGLLRKD 640

Query: 550  -ENGKEFWAYGGDFGDEPNDKNFCCNGVVLPDRTPEPELYEVKKFYQNIKVRQIAKDTYE 608
+ G+E+ YGGDFG++ +D NF NG+V D + +YE K YQ
Sbjct: 641  AKTGQEYLGYGGDFGEKKHDGNFNINGMVSADGRAKAAMYENKWIYQPAVSSLSTDFKLN 700

Query: 609  VENGYLFTDLEMFDGTWRIRKDGEVVREERFK-LSARPGEKKILKIP--LP-EMEDSEYF 664
+ N + +L F +I ++G ++++ + + G +L + LP +DSEYF
Sbjct: 701  IHNRQVVQNLRDFIPVIQILENGNLIKQFVLQPIDIPAGGDYVLDVKKYLPNRKKDSEYF 760

Query: 665  LEICFSLSEDTLWAKKGHVVAWEQFLI-KPPSFEKTVVRESVDLSEDGRHLFVRSKDTEL 723
L I F L++D WA KG+ VA +QFL+ K ++ +L E + +D +
Sbjct: 761  LNIEFQLAKDEFWASKGYAVATDQFLLQKKTEVVLAEAKQKSNLQETKEAYAINGRDFSI 820

Query: 724  VFSKFTGLLKRIVYRGRNILTGSIVPNFWRVPTDND-VGNKMPERLSIWKRASKERKLFK 782
+K G L VY+G+ + ++PNF R TDND G K + L W A K K
Sbjct: 821  KINKANGALSSYVYKGKEQIKADLLPNFTRALTDNDRKGWKPHKLLKQWYLAKPMLKNIK 880

Query: 783  MFFWKKEENSVSVQSVYQ-VPGNSWVYLTYTIFGNGDILVDLSLIPAEGVPEIPRIGLQF 841
M ++ + + S Y+ + ++ V + Y I NG + V+ L +P IPRIG+Q
Sbjct: 881  M---EQSAGDIKISSDYEIIKDSAAVKVVYNINANGIVKVNYQLNANPKLPNIPRIGMQL 937

Query: 842  AVPGDFRFVEWYGRGPHETYWDRKESGLFARYRRTVQDMIHRYVRPQETGNRSDVRWFAL 901
+ DF + WYG+G E Y DR RY + I YV PQE NR++VRW A
Sbjct: 938  GINQDFGQISWYGKGELENYCDRSFGFTVGRYSLPIDKFIEPYVMPQENANRTEVRWMAF 997

Query: 902  SDGRVN---LFVSGMPVVDFSVWPFSMEDLEKADHVNELPERDFVTVNVDYRQMGLGGDD 958
++ + + V+ V++ S WP++ ++L+ A H +L F+TVNVD +QMG+GG+D
Sbjct: 998  TNPTKSSGLIVVNEDKVLNMSAWPYTQQNLDDAKHTYDLVNPGFLTVNVDLKQMGVGGND 1057

Query: 959  SWG--AQPHLEYRLLPEPYRFSF 979
SW + P +Y++ Y++SF
Sbjct: 1058 SWSPVSAPMEKYQIPSGDYQYSF 1080

Lambda     K      H
   0.320    0.140    0.440

Gapped
Lambda     K      H
   0.267   0.0410    0.140

Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 3546
Number of extensions: 219
Number of successful extensions: 13
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 1085
Length of database: 1101
Length adjustment: 46
Effective length of query: 1039
Effective length of database: 1055
Effective search space: 1096145
Effective search space used: 1096145
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.8 bits)
S2: 58 (26.9 bits)
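The summary statistics in the report above can be cross-checked by hand. The following is a minimal sketch, assuming the standard Karlin-Altschul conversion from raw score to bit score, bits = (λS − ln K) / ln 2, using the gapped λ and K values printed in the report, and assuming the expected number of chance hits is the effective search space times 2^(−bits). The variable names are illustrative and are not part of GapMind or BLAST.

```python
import math

# Values copied from the BLAST report above.
raw_score = 1730
gapped_lambda = 0.267
gapped_k = 0.0410
query_len, db_len, length_adjustment = 1085, 1101, 46

# Assumed Karlin-Altschul conversion from raw score to bit score.
bit_score = (gapped_lambda * raw_score - math.log(gapped_k)) / math.log(2)

# Effective lengths are the raw lengths minus the length adjustment.
eff_query = query_len - length_adjustment
eff_db = db_len - length_adjustment
search_space = eff_query * eff_db

# Expected number of chance hits at this score (assumed formula).
e_value = search_space * 2.0 ** (-bit_score)

print(round(bit_score), eff_query, eff_db, search_space)
# -> 671 1039 1055 1096145
print(e_value)
```

Under these assumptions the recomputed values match the reported 671 bits and effective search space of 1096145, and the E-value is vanishingly small, which is consistent with the reported Expect = 0.0.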
This GapMind analysis is from Sep 24 2021. The underlying query database was built on Sep 17 2021.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other blast hits with at least 50% coverage are "low confidence."
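As an illustration of the coverage criterion, the sketch below computes alignment coverage and percent identity for the alignment shown on this page. It only checks the 50% coverage floor for "low confidence"; it does not implement the high- or medium-confidence rules. Whether coverage is measured against the characterized (query) protein or the candidate is an assumption here, so both are computed, and the function and constant names are hypothetical.

```python
def coverage(start, end, length):
    """Fraction of a sequence spanned by an alignment (1-based, inclusive)."""
    return (end - start + 1) / length

# Coordinates and lengths taken from the alignment on this page.
query_cov = coverage(5, 979, 1085)      # characterized protein AAC24219.1
subject_cov = coverage(49, 1080, 1101)  # candidate WP_066328166.1
identity = 393 / 1043                   # identities / alignment length

# Hypothetical constant name for the 50% coverage floor stated above.
LOW_CONFIDENCE_MIN_COVERAGE = 0.50
meets_floor = query_cov >= LOW_CONFIDENCE_MIN_COVERAGE

print(f"{query_cov:.1%} {subject_cov:.1%} {identity:.1%} {meets_floor}")
# -> 89.9% 93.7% 37.7% True
```

The computed identity of 37.7% corresponds to the 37% shown in the report, which appears to truncate rather than round.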
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
For more information, see:
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory