Align β-galactosidase (Gal4214-1) (EC 3.2.1.23) (characterized)
to candidate WP_066328293.1 BLR17_RS01685 DUF4981 domain-containing protein
Query= CAZy::AAX48919.1 (1046 letters) >NCBI__GCF_900100165.1:WP_066328293.1 Length = 1039 Score = 1031 bits (2665), Expect = 0.0 Identities = 506/1041 (48%), Positives = 694/1041 (66%), Gaps = 24/1041 (2%) Query: 15 FISIIVFAQEK---PSRNDWENPEVFQINREPARAAFLPFADEASAIADDYTRSPWYMSL 71 F+++++F + NDWENP++ + R++FL +++ A +D +S Y SL Sbjct: 5 FLALVLFLGAQINFAQNNDWENPQLLDRGKIEGRSSFLLYSNHAELKRNDPRKSVLYQSL 64 Query: 72 DGKWKFNWSPTPDERPKDFFNTDFNTTTWKEIGVPSNWELVGYGIPIYTNITYPFVKNPP 131 +G WKFN P +RP D+++ + + + W I VPSNWE+ GY IPIYTNITYPF KNPP Sbjct: 65 NGDWKFNIVKNPIQRPLDYYSENLDDSKWNVIKVPSNWEMQGYDIPIYTNITYPFPKNPP 124 Query: 132 FIDHADNPVGSYRRTFELPENWDGRRVYLHFEGGTSAMYVWINGEKVGYSQNTKSPTEFD 191 FI NPV SYRRTF + + W + + LHF T V++NG++VG ++ +K+P EF+ Sbjct: 125 FIGGDYNPVASYRRTFTIADFWKDKEIILHFGSITGYAKVFLNGKEVGMTKASKTPAEFN 184 Query: 192 ITKYVKVGKNQVAVEVYRWSDGSYLEDQDFWRLSGIDRSVYLYSTANTRIADFFARPDLD 251 IT ++K G N +AV+V+RW DGSYLEDQDFWRLSGI+R VYL + T + D+F + DLD Sbjct: 185 ITSFLKKGDNLIAVQVFRWHDGSYLEDQDFWRLSGIERDVYLQAMPKTTVWDYFVKSDLD 244 Query: 252 TSYKNGSLSVDIKLKNANSVAKNNQTVEAKLVDAAGKEVFIKTIKINLGANTVSSTTFEQ 311 YKNG ++D+ LK+ + N V+ +L D GK VF ++ K+N +S F++ Sbjct: 245 YQYKNGIFNLDVTLKSFENNKIKNPAVKVELFDKDGKVVFSESKKVNSKEPKIS---FQK 301 Query: 312 MVKSPKLWNNETPNLYTLVLTLKDENGKFVETVATSIGFRKVELKNGQLLVNGIRIMVHG 371 +++ K WN ETPNLY +TL D G E ++ GFRKVE+KN QLLVNG ++V G Sbjct: 302 TIENVKQWNAETPNLYRYTITLLDSKGNVFEVISKKTGFRKVEIKNAQLLVNGKAVLVKG 361 Query: 372 VNIHEHNPKTGHYQDEATMMKDIKLMKQLNINAVRCSHYPNNLLWVKLCNKYGLFLVDEA 431 VNIHEH+ GH ++ M KD++LMK+ NIN++R HYP++ + LC++YG ++VDEA Sbjct: 362 VNIHEHDDVNGHVPNKDLMKKDLQLMKEFNINSIRMCHYPHDTQFYDLCDEYGFYVVDEA 421 Query: 432 NIETHGMGAELQGSFDKTKHPAYLPEWKAAHMDRIYSLVERDKNQPSIILWSLGNECGNG 491 NIETHGMGAE Q FD+ KHPAYLPEW AH+DRI + DKN PSII+WSLGNECGNG Sbjct: 422 NIETHGMGAEWQNWFDQKKHPAYLPEWAPAHLDRIKRMFAFDKNHPSIIIWSLGNECGNG 481 Query: 492 PVFHEAYNWIKNRDKTRLVQFEQAGEQENTDVVCPMYPSMEYMKEYANRKDVKRPFIMCE 551 PVF++AYNW+K D TRLVQFEQAGE +NTD+VCPMYPS++ MK YA+ K RP+IMCE Sbjct: 482 
PVFYDAYNWLKQADSTRLVQFEQAGENKNTDIVCPMYPSIKSMKTYADAKK-DRPYIMCE 540 Query: 552 YSHAMGNSNGNFQEYWDIIHSSTNMQGGFIWDWVDQGFEETDEAGRKYWAYGGDMGGQNY 611 YSHAMGNS+GNFQEYWDIIH+S +MQGGFIWDWVDQG + +E G ++WAYGGD+G + Sbjct: 541 YSHAMGNSSGNFQEYWDIIHASKHMQGGFIWDWVDQGMKTKNEKGVEFWAYGGDLGSAHL 600 Query: 612 TNDQNFCHNGLVWPDRTPHPGAFEVKKVYQDILFKGVNLDKGIIEVENGFGYTNLDKYLF 671 ND+NFC NGLV +R PHPG FEVKKVYQDI F+ N D ++ V+N F +TNL Y F Sbjct: 601 HNDENFCSNGLVSANRIPHPGLFEVKKVYQDIQFQLKN-DTDLL-VKNYFNFTNLSNYTF 658 Query: 672 KFEVLKNGLVIKSGVINIRLAPQSKKQIQIELPKLTTEDGVEYLLNVFAYTKEGTELLPQ 731 K+E++KNGL + SG N+ + P+ K+I++ T + G EY LNVFA +K T L+ + Sbjct: 659 KWELIKNGLKVNSGEFNLDVNPEETKEIKLNYG--TLDAGAEYFLNVFAVSKYDTPLVVK 716 Query: 732 NFEIAREQFSIGESNYFVKVAKASTNP-----IVKDSQDAITLSANGVEVTINKKTGLMQ 786 E AREQF+IG+ +YF AS +P VK + ++ + + K G + Sbjct: 717 GHEFAREQFAIGQGDYFKNNVLASKSPSKFKYAVKG--NVLSFETENIMGEFDLKNGELV 774 Query: 787 KY--TSGEENYFNQMPVPNFWRAPTDNDFGNYMQVNSNVWRTVGRFSSLDSIEVKEVSTQ 844 KY + + P P FWRAPTDNDFG+ MQ +WR + ++ S+ + + S++ Sbjct: 775 KYGLKNDPSTIISGFPTPYFWRAPTDNDFGSGMQNKLVIWREAHKNPTVVSVNLDKKSSE 834 Query: 845 TTVVAHLF-LKDIASTYTITYSMDADGSLTLQNSFKAGEMALSEMPRFGMLFSLKKELDN 903 +V ++ L + Y++ Y + +GS+ + S L E+PRFGM L DN Sbjct: 835 GVLVKVVYKLAQVEVPYSVDYLIQNNGSVKITASVDMNGKELPELPRFGMRMKLNGAYDN 894 Query: 904 FSYYGRGPWENYQDRNTSSLKGIYESKVADQYV-PYTRPQENGYKTDIRWITLTNSSGNG 962 SYYGRGPWENY DRNT+S G+Y KV +Q+ Y RPQE+GYKTD+RW+ L +S G G Sbjct: 895 LSYYGRGPWENYADRNTASFMGVYSDKVINQFTRNYIRPQESGYKTDVRWLVLNDSKGKG 954 Query: 963 IEILGLQPLGVSALNNYPEDFDPGLTKKQQHTN--DITPRDEVIICVDLAQRGLGGDNSW 1020 ++I G+QP+G S LN ED DPG K Q+H + D+ ++ V + +D QRG+GGD+SW Sbjct: 955 LKIEGVQPIGFSTLNIPTEDLDPGKRKSQRHPSDLDLDSKEVVYLHLDYKQRGVGGDDSW 1014 Query: 1021 GAMPHEQYQLRNKAYSYGFVI 1041 G++PH+ Y+L +K YSY +VI Sbjct: 1015 GSLPHDSYRLLDKKYSYSYVI 1035 Lambda K H 0.316 0.134 0.410 Gapped Lambda K H 0.267 0.0410 0.140 Matrix: BLOSUM62 Gap Penalties: Existence: 11, Extension: 1 Number of Sequences: 1 Number of Hits to DB: 3448 Number of extensions: 180 Number 
of successful extensions: 9
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 1046
Length of database: 1039
Length adjustment: 45
Effective length of query: 1001
Effective length of database: 994
Effective search space: 994994
Effective search space used: 994994
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.6 bits)
S2: 58 (26.9 bits)
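The summary numbers in the report above can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only and is not part of GapMind or BLAST. It assumes the standard Karlin-Altschul relations (bit score S' = (λS − ln K) / ln 2 and E = m·n·2^(−S')) with the gapped λ and K printed in the report, and it assumes the reported percentages are truncated rather than rounded, which is consistent with the values shown. All function and constant names are hypothetical.

```python
import math

# Values copied from the BLAST report above.
QUERY_LEN = 1046
SUBJECT_LEN = 1039
LENGTH_ADJUSTMENT = 45
RAW_SCORE = 2665
GAPPED_LAMBDA = 0.267
GAPPED_K = 0.0410
IDENTITIES, POSITIVES, GAPS, ALIGNED_COLUMNS = 506, 694, 24, 1041


def percent(count, total):
    """Integer percentage, truncated (assumed to match the report's style)."""
    return count * 100 // total


def bit_score(raw_score, lam, k):
    """Karlin-Altschul normalization of a raw alignment score."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def log10_evalue(search_space, bits):
    """log10 of E = search_space * 2**(-bits); kept in log space to avoid underflow."""
    return math.log10(search_space) - bits * math.log10(2)


effective_query = QUERY_LEN - LENGTH_ADJUSTMENT
effective_subject = SUBJECT_LEN - LENGTH_ADJUSTMENT
search_space = effective_query * effective_subject
bits = bit_score(RAW_SCORE, GAPPED_LAMBDA, GAPPED_K)

print("effective lengths:", effective_query, effective_subject)
print("effective search space:", search_space)
print("bit score (rounded):", round(bits))
print("identity / positives / gaps %:",
      percent(IDENTITIES, ALIGNED_COLUMNS),
      percent(POSITIVES, ALIGNED_COLUMNS),
      percent(GAPS, ALIGNED_COLUMNS))
print("log10(E-value):", round(log10_evalue(search_space, bits), 1))
```

Under these assumptions the recomputed effective lengths, search space, bit score, and percentages agree with the report, and the E-value is far below anything BLAST would print as non-zero, which fits "Expect = 0.0".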
This GapMind analysis is from Sep 24 2021. The underlying query database was built on Sep 17 2021.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other blast hits with at least 50% coverage are "low confidence."
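As a rough illustration of the coverage criterion, the sketch below computes how much of each sequence the alignment on this page spans and compares it with the 50% floor stated for "low confidence" hits. It is a minimal sketch, not GapMind's code: the helper name is hypothetical, treating coverage as the aligned span divided by sequence length is an assumption, and so is the choice of which sequence (the curated query or the candidate) the rule is measured on. The high- and medium-confidence criteria are not reproduced here.

```python
def coverage(start, end, length):
    """Fraction of a sequence spanned by an alignment (1-based, inclusive coordinates)."""
    return (end - start + 1) / length


# Alignment coordinates taken from the BLAST output on this page.
query_cov = coverage(15, 1041, 1046)    # curated protein CAZy::AAX48919.1
subject_cov = coverage(5, 1035, 1039)   # candidate WP_066328293.1

# The only threshold stated in the text: at least 50% coverage for "low confidence".
LOW_CONFIDENCE_MIN_COVERAGE = 0.50
meets_low_confidence_floor = query_cov >= LOW_CONFIDENCE_MIN_COVERAGE

print(f"query coverage:   {query_cov:.1%}")
print(f"subject coverage: {subject_cov:.1%}")
print("meets 50% coverage floor:", meets_low_confidence_floor)
```

For this hit both sequences are covered almost end to end, so the alignment clears the 50% floor by a wide margin.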
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
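For readers unfamiliar with the term, a six-frame translation translates a DNA sequence in all three reading frames on both strands, so that proteins missed by gene prediction can still be searched. The sketch below shows the idea only. It is not how Curated BLAST is implemented; it assumes the standard genetic code, ignores start codons, and marks codons containing ambiguous bases as "X". All names are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons from position 0; '*' is stop, 'X' is unknown."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3))


def six_frame_translation(dna):
    """Map frame labels (+1..+3, -1..-3) to their translated protein strings."""
    dna = dna.upper()
    rc = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


frames = six_frame_translation("ATGGCC")
for label in sorted(frames):
    print(label, frames[label])
```

A real search would then split each frame at stop codons and align the resulting open reading frames against the curated proteins.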
For more information, see:
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory