Align β-galactosidase (BgaM) (EC 3.2.1.23) (characterized)
to candidate Echvi_1698 Echvi_1698 Beta-galactosidase/beta-glucuronidase
Query= CAZy::CAA04267.1 (1034 letters)

>FitnessBrowser__Cola:Echvi_1698
Length = 1080

Score = 672 bits (1733), Expect = 0.0
Identities = 384/1036 (37%), Positives = 556/1036 (53%), Gaps = 65/1036 (6%)

Query: 23   NPEIFQLNRSKAHALLMPYQTVEEALKNDRKSSVYYQSLNGSWYFHFAENADGRVKNFFA 82
            +P I LNR A Y+ E A DR+ S Q LNG W FHFA N +F+
Sbjct: 44   DPLITSLNRMPARTTAYSYKDAETAKIGDREES-RIQLLNGDWDFHFAMNMKEAPSDFYR 102

Query: 83   PEFSYEKWDSISVPSHWQLQGYDYPQYTNVTYPWVENEELEPPFAPTKYNPVGQYVRTFT 142
            + WD I VPS+W+L+GYD P Y + YP+ + PP+ P YN VG Y RTF
Sbjct: 103  SRVT--GWDKIEVPSNWELKGYDKPIYKSAVYPF---RPINPPYVPEDYNGVGSYQRTFE 157

Query: 143  PKSEWKDQPVYISFQGVESAFYVWINGEFVGYSEDSFTPAEFDITSYLQEGENTIAVEVY 202
            + W+D + + F V SAF VW+NGEFVGY EDSF P+EF+IT YL+ GEN ++V+V
Sbjct: 158  LEENWEDMNITLHFGAVSSAFKVWLNGEFVGYGEDSFLPSEFNITPYLRSGENVLSVQVL 217

Query: 203  RWSDASWLEDQDFWRMSGIFRDVYLYSTPQVHIYDFSVRSSLDNNYEDGELSVSADILNY 262
            RWSD S+LEDQD WR+SGI R+V+L + P++ +YDF +++L +Y + S+ + N
Sbjct: 218  RWSDGSYLEDQDHWRLSGIQREVFLMAEPKLRVYDFHWQATLAEDYTNATFSLRPKVENL 277

Query: 263  FEHDTQDLTFEVMLYDANAQEVLQAPLQTNLSVSDQRTVS-----------LRTHIKSPA 311
            D L+DA + V PL+ ++V D S L +++P
Sbjct: 278  TGERVPDSKLTAQLFDAEGKPVFATPLE--MAVEDILNESYPRLDNVKFGLLEATVENPH 335

Query: 312  KWSAESPNLYTLVLSLKNAAGSIIETESCKVGFRT--FEIKNGLMTINGKRIVLRGVNRH 369
            WS E P LYTLV+ L+ A G ++E +SCKVGFR F+ + + INGK + GVNRH
Sbjct: 336  LWSDEHPYLYTLVIGLEGAKGQLLEAKSCKVGFRDIRFDPETSKLLINGKETYIYGVNRH 395

Query: 370  EFDSVKGRAGITREDMIHDILLMKQHNINAVRTSHYPNDSVWYELCNEYGLYVIDETNLE 429
            + V+G+A +TR+D+ D+ +KQ N N +RTSHYPND +YELC+EYG+ VIDE N E
Sbjct: 396  DHHPVRGKA-LTRQDIEEDVKTIKQFNFNTIRTSHYPNDPYFYELCDEYGILVIDEANHE 454

Query: 430  THGTWTYLQEGEQKAVPGSKPEWKENVLDRCRSMYERDKNHPSIIIWSLGNESFGGENFQ 489
            THG L Q W ++R M +RDKNHPSII WSLGNE+ G N
Sbjct: 455  THGIGGKLSNDTQ---------WTHAYMERVSRMVQRDKNHPSIIFWSLGNEAGRGPNHA 505

Query: 490  HMYTFFKEKDSTRLVHYEGIF-----------HHRDY------------DASDIESTMYV 526
            M + + D TR VHYE +H DY D ++
Sbjct: 506  AMAAWVHDVDITRPVHYEPAQGNHRAEGYIPPNHPDYPKDHAHRIQVPTDQPYVDMVSRF 565

Query: 527  KPADVERYALMNPK---KPYILCEYSHAMGNSCGNLYKYWELFDQYPILQGGFIWDWKDQ 583
            P L+N +P + EYSH+MGNS GN+ + W+ F P + GG IWD+KDQ
Sbjct: 566  YPGIFTPDLLVNQHADHRPIVFIEYSHSMGNSTGNMKELWDKFRSLPQVIGGCIWDFKDQ 625

Query: 584  ALQATAEDGTSYLAYGGDFGDTPNDGNFCGNGLIFADGTASPKIAEVKKCYQPVKWTAVD 643
            L +DG ++ AYGGDF + +DGNFC NG++ +DG + E K YQPV+ T D
Sbjct: 626  GLLKQTDDGEAFYAYGGDFDEERHDGNFCINGIVASDGRPKAAMYECKWVYQPVEMTWED 685

Query: 644  PAKGKFAVQNKHLFTNLNAYDFVWTVEKNGELVEKHASLLNVA-PDGTDELTLSYPLYEQ 702
            + + N+H +L Y F ++ +NGE V + L N+A G D + P
Sbjct: 686  STEMTVRIHNRHADKSLEDYLFELSLLQNGERVNRR-DLPNLALAAGEDTVINLKPYLPD 744

Query: 703  ENETDEFVLTLSLRLSKDTAWASAGYEVAYEQFVLPAKAAMPSVKAAHPALTVDQNEQTL 762
            DE++ L+ LS++ WA G+EVA +QF + K P A A V+++ +
Sbjct: 745  LQPGDEYLAHLTFSLSEEELWAGKGHEVAQQQFQV-QKGNSPEFPAPRQAFEVEESVTNI 803

Query: 763  TVTGTNFTAIFDKRKGQFISYNYERTELLASGFRPNFWRAVTDND-LGNKLHERCQTWRQ 821
            V G F F K G SY E ++ +F R +TDND G K HE+ + W +
Sbjct: 804  LVKGEGFQVAFGKSTGALESYQLAGEEQISQPMALSFSRPLTDNDRKGWKPHEKLKVWYE 863

Query: 822  ASLEQHVKKVTVQPQVDFVI-ISVELALDNSLASCYVTYTLYNDGEMKIEQSLAPSETMP 880
            A+ + ++ + D I ++ + AL + A V YT+ G +K++ +L P + +P
Sbjct: 864  AT--PKLSDMSSSKEEDGSIEVTSKYALIDGKAEATVVYTVLAGGVVKVDYTLIPLDDLP 921

Query: 881  EIPEIGMLFTMNAAFDSLTWYGRGPHENYWDRKTGAKLALHKGSVKEQVTPYLRPQECGN 940
            +P++GM + +D + WYG+GP ENY D+ G +++ + + + PY+ PQE GN
Sbjct: 922  NLPKVGMHLGIRREYDQIRWYGKGPVENYIDKNHGFMAGIYQQPIDQFMEPYVMPQENGN 981

Query: 941  KTDVRWATITNDQG-RGFLIKGLPTVELNALPYSPFELEAYDHFYKLPASDSVTVRVNYK 999
            +TDVRW +T+ G G I + ++A P++ + A +H Y+L + +TV ++
Sbjct: 982  RTDVRWMELTDKSGENGLNITADSLLSMSAWPFTAENINAAEHTYELDDAGFITVNIDLA 1041

Query: 1000 QMGVGGDDSWQAKTHP 1015
            QMGVGG+DSW P
Sbjct: 1042 QMGVGGNDSWSDVAQP 1057

Lambda     K      H
   0.316    0.133    0.412

Gapped
Lambda     K      H
   0.267   0.0410    0.140

Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 3437
Number of extensions: 210
Number of successful extensions: 13
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 2
Number of HSP's successfully gapped: 1
Length of query: 1034
Length of database: 1080
Length adjustment: 45
Effective length of query: 989
Effective length of database: 1035
Effective search space: 1023615
Effective search space used: 1023615
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.6 bits)
S2: 58 (26.9 bits)
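The summary figures in the report above can be re-derived from its raw counts. The following is a minimal Python sketch, not part of GapMind or BLAST: the helper names `floor_percent` and `effective_search_space` are hypothetical, and rounding the percentages down is an assumption made because it reproduces the values the report prints.

```python
# Figures copied from the BLAST report above.
QUERY_LENGTH = 1034        # "Length of query"
DATABASE_LENGTH = 1080     # "Length of database" (the single candidate sequence)
LENGTH_ADJUSTMENT = 45     # "Length adjustment"
ALIGNMENT_LENGTH = 1036    # denominator of Identities / Positives / Gaps


def floor_percent(count: int, total: int) -> int:
    """Whole-number percentage, rounded down (assumed to match the report)."""
    return 100 * count // total


def effective_search_space(query_len: int, db_len: int, adjustment: int) -> int:
    """Product of the two lengths after subtracting the length adjustment."""
    return (query_len - adjustment) * (db_len - adjustment)


identity_pct = floor_percent(384, ALIGNMENT_LENGTH)   # Identities = 384/1036
positive_pct = floor_percent(556, ALIGNMENT_LENGTH)   # Positives = 556/1036
gap_pct = floor_percent(65, ALIGNMENT_LENGTH)         # Gaps = 65/1036
search_space = effective_search_space(
    QUERY_LENGTH, DATABASE_LENGTH, LENGTH_ADJUSTMENT
)

print(identity_pct, positive_pct, gap_pct, search_space)
```

With the numbers from this report, the effective lengths come out as 989 and 1035, and their product matches the reported effective search space of 1023615.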
This GapMind analysis is from Sep 17 2021. The underlying query database was built on Sep 17 2021.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other blast hits with at least 50% coverage are "low confidence."
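As an illustration of the 50% coverage floor, the sketch below computes coverage from the first and last aligned positions shown in the alignment on this page (query 23 to 1015 of 1034; candidate 44 to 1057 of 1080). It is a hedged example only: the function names are hypothetical, which sequence GapMind measures coverage against is an assumption here, and the high- and medium-confidence criteria are not modeled.

```python
LOW_CONFIDENCE_MIN_COVERAGE = 0.5  # "at least 50% coverage"


def coverage(first: int, last: int, length: int) -> float:
    """Fraction of a sequence spanned by the alignment (1-based, inclusive)."""
    return (last - first + 1) / length


def passes_coverage_floor(
    value: float, minimum: float = LOW_CONFIDENCE_MIN_COVERAGE
) -> bool:
    """True if the hit reaches the minimum coverage for a low-confidence call."""
    return value >= minimum


# Coordinates read from the first and last alignment blocks on this page.
query_coverage = coverage(23, 1015, 1034)       # characterized protein
candidate_coverage = coverage(44, 1057, 1080)   # Echvi_1698

print(f"query coverage:     {query_coverage:.1%}")
print(f"candidate coverage: {candidate_coverage:.1%}")
print("meets 50% floor:", passes_coverage_floor(query_coverage))
```

Both coverage values for this hit are well above the floor, so coverage alone would not exclude it; its final confidence level depends on the other criteria.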
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
For more information, see:
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory