Align Beta-galactosidase BoGH2A; Beta-gal; Glycosyl hydrolase family protein 2A; BoGH2A; EC 3.2.1.23 (characterized)
to candidate WP_066328198.1 BLR17_RS02090 glycoside hydrolase family 2 protein
Query= SwissProt::A7LXS9 (851 letters)

>NCBI__GCF_900100165.1:WP_066328198.1
Length = 812

Score = 822 bits (2124), Expect = 0.0
Identities = 409/816 (50%), Positives = 534/816 (65%), Gaps = 37/816 (4%)

Query: 27  LMLLGACSSSSLVSPRERSDFNADWRFHLGDGLQAAQPGFADNDWRVLDLPHDWAIEGDF 86
L+ L AC+S+ S R +DFN DW F LGD A Q F NDWR LDLPHDW+IEG F
Sbjct: 18  LLFLVACASTKKES-RIVADFNPDWNFKLGDYPTAIQADFNANDWRALDLPHDWSIEGTF 76

Query: 87  SQENPSGTGGGALPGGVGWYRKTFSVDKADAGKIFRIEFDGVYMNSEVFINGVSLGVRPY 146
+++ + G LP G GWYRKTF++ + A K +EFDGV+ NSEVFING SLG+RP
Sbjct: 77  DKDSKTKQAQGFLPAGKGWYRKTFTLPENLANKSISVEFDGVFKNSEVFINGHSLGMRPN 136

Query: 147 GYISFSYDLTPYLKWD-EPNVLAVRVDNAEQPNSRWYSGCGIYRNVWLSKTGPIHVGGWG 205
GYISF+Y+LTPYL + + N++AV+VDN QPNSRWY+G GIYRNV L + +HV WG
Sbjct: 137 GYISFAYELTPYLHFGTQKNIIAVKVDNDAQPNSRWYTGSGIYRNVRLVASEKLHVAQWG 196

Query: 206 TYVTTSSVDEKQAVLNLATTLVNESDTNENVTVCSSLQDAEGREVAETRSSGEAEAGKEV 265
TYVTT + +++A++++ + N N+ + S++ D EVA+ S G A +
Sbjct: 197 TYVTTRGITKEKAIVDIDVDVKNGLGINKLFKLVSTILDKNNVEVAKAISDGNIPANSIL 256

Query: 266 VFTQQLTVKQPQLWDIDTPYLYTLVTKVMRNEECMDRYTTPVGIRTFSLDARKGFTLNGR 325
Q ++ P LW+ + PYLY +VTKV +D Y TP+G+R F+ DA KGF+LNG+
Sbjct: 257 QVKQNTKIENPILWNTENPYLYKIVTKVYDGSTVVDTYETPLGVRYFNFDAEKGFSLNGK 316

Query: 326 QTKINGVCMHHDLGCLGAAVNTRAIERHLQILKEMGCNGIRCSHNPPAPELLDLCDRMGF 385
TKI GVC+HHD G LGA N AI R L +LKEMG N IR SHNP + E++ LCD MGF
Sbjct: 317 PTKILGVCLHHDNGALGAVENIHAIRRKLTLLKEMGTNAIRMSHNPHSLEMMKLCDEMGF 376

Query: 386 IVMDEAFDMWRKKKTAHDYARYFNEWHERDLNDFILRDRNHPSVFMWSIGNEVLEQWSDA 445
IV DE D+W+KKK +DY + ++ WH++DL DFI RDRNHPSV MWSIGNE+ EQ+
Sbjct: 377 IVQDEFTDVWKKKKVTNDYHKDWDAWHKQDLEDFIKRDRNHPSVMMWSIGNEIREQF--- 433

Query: 446 KADTLSLEEANLILNFGHSSEMLAKEGEESVNSLLTKKLVSFVKGLDPTRPVTAGC--NE 503
+S +T++L VK LD TRPVT+ NE
Sbjct: 434 ----------------------------DSTGVRITRELAQIVKSLDKTRPVTSALTENE 465

Query: 504 PNSGNHLFRSGVLDVIGYNYHNKDIPNVPANFPDKPFIITESNSALMTRGYYRMPSDRMF 563
P N +++SG LD++G+NY + D P F + + +ES SA TRG+Y MP+D +
Sbjct: 466 PQK-NFIYQSGALDLLGFNYKHADYATFPERFKGQKIVASESVSAYATRGHYDMPTDEIR 524

Query: 564 IWPKRWDKSF-ADSTFACSSYENCHVPWGNTHEESLKLVRDNDFISGQYVWTGFDYIGEP 622
WPK++ ++F +S ++Y+N WG THEE+ K + DFI+G +VWTGFDYIGEP
Sbjct: 525 FWPKKYGETFDGNSDLTVTAYDNIASYWGTTHEENWKAAKKYDFIAGTFVWTGFDYIGEP 584

Query: 623 TPYGWPARSSYFGIVDLAGFPKDVYYLYQSEWTDKQVLHLFPHWNWTPGQEIDMWCYYNQ 682
PY +PARSSYFGIVDLAGFPKDVYY+YQSEW+DK VLHL PHWNW GQ ID+W YYN
Sbjct: 585 DPYPYPARSSYFGIVDLAGFPKDVYYMYQSEWSDKNVLHLLPHWNWKVGQLIDVWAYYNN 644

Query: 683 ADEVELFVNGKSQGVKRKDLDNLHVAWRVKFEPGTVKVIARESGKVVAEKEICTAGKPAE 742
ADEVELF+NGKS G K K D LH+AW+V FE GT+K ++R++GK+V E EI TAG+ A+
Sbjct: 645 ADEVELFLNGKSLGSKAKQGDELHIAWKVPFEAGTLKAVSRKAGKIVKETEIHTAGEAAK 704

Query: 743 IRLTPDRSILTADGKDLCFVTVEVLDEKGNLCPDADNLVNFTVQGNGFIAGVDNGNPVSM 802
I L D++ + DG L +VTV + D+ GN P ADNL+NF V G I GVDNG S+
Sbjct: 705 INLQADKTAIKNDGYHLAYVTVTLQDKDGNALPKADNLINFKVSGGAKIVGVDNGYQASL 764

Query: 803 ERFKDEKRKAFYGKCLVVIQNDGKPGKAKLTATSEG 838
E FK RK + GKCLV++Q++ K L AT+ G
Sbjct: 765 EPFKANYRKLYNGKCLVILQSNKKAENITLEATTAG 800

Lambda K H
0.319 0.136 0.431

Gapped
Lambda K H
0.267 0.0410 0.140

Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 1931
Number of extensions: 103
Number of successful extensions: 4
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 3
Number of HSP's successfully gapped: 1
Length of query: 851
Length of database: 812
Length adjustment: 42
Effective length of query: 809
Effective length of database: 770
Effective search space: 622930
Effective search space used: 622930
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 56 (26.2 bits)
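The summary figures in the report can be checked by hand. The following is a minimal sketch, not part of GapMind, that recomputes the percentages, the effective search space, and the bit score from the numbers printed above. It assumes the standard Karlin-Altschul conversion, bits = (lambda * S - ln K) / ln 2, applied with the gapped parameters, and it assumes the percentages are truncated rather than rounded (which is what reproduces 4% for 37/816). Because the report prints lambda and K to only a few digits, the recomputed bit score is approximate.

```python
import math

# Values copied from the BLAST report above.
identities, positives, gaps, aln_len = 409, 534, 37, 816
query_len, subject_len, length_adjustment = 851, 812, 42
raw_score = 2124
gapped_lambda, gapped_k = 0.267, 0.0410  # rounded, as printed in the report


def percent(count: int, total: int) -> int:
    """Integer percentage, truncated (assumption: the report truncates)."""
    return count * 100 // total


pct_identity = percent(identities, aln_len)
pct_positive = percent(positives, aln_len)
pct_gaps = percent(gaps, aln_len)

# Effective lengths subtract the length adjustment from each sequence.
effective_query = query_len - length_adjustment
effective_subject = subject_len - length_adjustment
effective_space = effective_query * effective_subject

# Karlin-Altschul conversion from raw score to bits (approximate here,
# because lambda and K are rounded in the printed report).
bit_score = (gapped_lambda * raw_score - math.log(gapped_k)) / math.log(2)

print(pct_identity, pct_positive, pct_gaps)
print(effective_query, effective_subject, effective_space)
print(round(bit_score, 1))
```

The recomputed effective search space matches the reported 622930 exactly, and the bit score lands within about one bit of the reported 822.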
This GapMind analysis is from Sep 24 2021. The underlying query database was built on Sep 17 2021.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other blast hits with at least 50% coverage are "low confidence."
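As an illustration of the 50% coverage threshold, the sketch below computes coverage for the alignment shown on this page. It assumes that "coverage" means the fraction of a sequence spanned by the aligned region, which this page does not define, and the function and constant names are hypothetical. Only the low-confidence threshold is encoded, since that is the only one stated here.

```python
def coverage(start: int, end: int, length: int) -> float:
    """Fraction of a sequence spanned by an alignment (1-based, inclusive)."""
    return (end - start + 1) / length


# Alignment coordinates taken from the BLAST report on this page.
query_cov = coverage(27, 838, 851)    # characterized protein A7LXS9
subject_cov = coverage(18, 800, 812)  # candidate WP_066328198.1

# Threshold stated in the text for low-confidence blast hits.
MIN_LOW_CONF_COVERAGE = 0.5  # hypothetical constant name

meets_low_conf_coverage = query_cov >= MIN_LOW_CONF_COVERAGE

print(f"query coverage:   {query_cov:.1%}")
print(f"subject coverage: {subject_cov:.1%}")
print("meets 50% coverage:", meets_low_conf_coverage)
```

Under this assumed definition, the hit spans well over half of both sequences, so it clears the minimum coverage for a low-confidence candidate. Whether it qualifies as high or medium confidence depends on criteria not reproduced here.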
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
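The definition of a gap given above (a step with no high- or medium-confidence candidates) can be sketched in a few lines. This is an illustrative sketch only, not GapMind's code; the function name, the data layout, and the step names are hypothetical.

```python
from typing import Dict, List

# Confidence levels that keep a step from being counted as a gap.
SUPPORTING_LEVELS = {"high", "medium"}


def find_gaps(step_candidates: Dict[str, List[str]]) -> List[str]:
    """Return steps with no high- or medium-confidence candidate.

    step_candidates maps a step name to the confidence levels
    ("high", "medium", "low") of its candidate proteins.
    """
    return [
        step
        for step, levels in step_candidates.items()
        if not any(level in SUPPORTING_LEVELS for level in levels)
    ]


# Hypothetical pathway with four steps.
example_steps = {
    "transport": ["low"],
    "hydrolysis": ["high", "low"],
    "kinase": [],
    "isomerase": ["medium"],
}

example_gaps = find_gaps(example_steps)
print(example_gaps)
```

In this example, the steps with only a low-confidence candidate or with no candidates at all are reported as gaps.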
For more information, see:
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory