GapMind for catabolism of small carbon sources

Alignments for a candidate for lacZ in Flavobacterium glycines Gm-149

Align Beta-galactosidase BoGH2A; Beta-gal; Glycosyl hydrolase family protein 2A; BoGH2A; EC 3.2.1.23 (characterized)
to candidate WP_066328198.1 BLR17_RS02090 glycoside hydrolase family 2 protein

Query= SwissProt::A7LXS9
         (851 letters)



>NCBI__GCF_900100165.1:WP_066328198.1
          Length = 812

 Score =  822 bits (2124), Expect = 0.0
 Identities = 409/816 (50%), Positives = 534/816 (65%), Gaps = 37/816 (4%)

Query: 27  LMLLGACSSSSLVSPRERSDFNADWRFHLGDGLQAAQPGFADNDWRVLDLPHDWAIEGDF 86
           L+ L AC+S+   S R  +DFN DW F LGD   A Q  F  NDWR LDLPHDW+IEG F
Sbjct: 18  LLFLVACASTKKES-RIVADFNPDWNFKLGDYPTAIQADFNANDWRALDLPHDWSIEGTF 76

Query: 87  SQENPSGTGGGALPGGVGWYRKTFSVDKADAGKIFRIEFDGVYMNSEVFINGVSLGVRPY 146
            +++ +    G LP G GWYRKTF++ +  A K   +EFDGV+ NSEVFING SLG+RP 
Sbjct: 77  DKDSKTKQAQGFLPAGKGWYRKTFTLPENLANKSISVEFDGVFKNSEVFINGHSLGMRPN 136

Query: 147 GYISFSYDLTPYLKWD-EPNVLAVRVDNAEQPNSRWYSGCGIYRNVWLSKTGPIHVGGWG 205
           GYISF+Y+LTPYL +  + N++AV+VDN  QPNSRWY+G GIYRNV L  +  +HV  WG
Sbjct: 137 GYISFAYELTPYLHFGTQKNIIAVKVDNDAQPNSRWYTGSGIYRNVRLVASEKLHVAQWG 196

Query: 206 TYVTTSSVDEKQAVLNLATTLVNESDTNENVTVCSSLQDAEGREVAETRSSGEAEAGKEV 265
           TYVTT  + +++A++++   + N    N+   + S++ D    EVA+  S G   A   +
Sbjct: 197 TYVTTRGITKEKAIVDIDVDVKNGLGINKLFKLVSTILDKNNVEVAKAISDGNIPANSIL 256

Query: 266 VFTQQLTVKQPQLWDIDTPYLYTLVTKVMRNEECMDRYTTPVGIRTFSLDARKGFTLNGR 325
              Q   ++ P LW+ + PYLY +VTKV      +D Y TP+G+R F+ DA KGF+LNG+
Sbjct: 257 QVKQNTKIENPILWNTENPYLYKIVTKVYDGSTVVDTYETPLGVRYFNFDAEKGFSLNGK 316

Query: 326 QTKINGVCMHHDLGCLGAAVNTRAIERHLQILKEMGCNGIRCSHNPPAPELLDLCDRMGF 385
            TKI GVC+HHD G LGA  N  AI R L +LKEMG N IR SHNP + E++ LCD MGF
Sbjct: 317 PTKILGVCLHHDNGALGAVENIHAIRRKLTLLKEMGTNAIRMSHNPHSLEMMKLCDEMGF 376

Query: 386 IVMDEAFDMWRKKKTAHDYARYFNEWHERDLNDFILRDRNHPSVFMWSIGNEVLEQWSDA 445
           IV DE  D+W+KKK  +DY + ++ WH++DL DFI RDRNHPSV MWSIGNE+ EQ+   
Sbjct: 377 IVQDEFTDVWKKKKVTNDYHKDWDAWHKQDLEDFIKRDRNHPSVMMWSIGNEIREQF--- 433

Query: 446 KADTLSLEEANLILNFGHSSEMLAKEGEESVNSLLTKKLVSFVKGLDPTRPVTAGC--NE 503
                                       +S    +T++L   VK LD TRPVT+    NE
Sbjct: 434 ----------------------------DSTGVRITRELAQIVKSLDKTRPVTSALTENE 465

Query: 504 PNSGNHLFRSGVLDVIGYNYHNKDIPNVPANFPDKPFIITESNSALMTRGYYRMPSDRMF 563
           P   N +++SG LD++G+NY + D    P  F  +  + +ES SA  TRG+Y MP+D + 
Sbjct: 466 PQK-NFIYQSGALDLLGFNYKHADYATFPERFKGQKIVASESVSAYATRGHYDMPTDEIR 524

Query: 564 IWPKRWDKSF-ADSTFACSSYENCHVPWGNTHEESLKLVRDNDFISGQYVWTGFDYIGEP 622
            WPK++ ++F  +S    ++Y+N    WG THEE+ K  +  DFI+G +VWTGFDYIGEP
Sbjct: 525 FWPKKYGETFDGNSDLTVTAYDNIASYWGTTHEENWKAAKKYDFIAGTFVWTGFDYIGEP 584

Query: 623 TPYGWPARSSYFGIVDLAGFPKDVYYLYQSEWTDKQVLHLFPHWNWTPGQEIDMWCYYNQ 682
            PY +PARSSYFGIVDLAGFPKDVYY+YQSEW+DK VLHL PHWNW  GQ ID+W YYN 
Sbjct: 585 DPYPYPARSSYFGIVDLAGFPKDVYYMYQSEWSDKNVLHLLPHWNWKVGQLIDVWAYYNN 644

Query: 683 ADEVELFVNGKSQGVKRKDLDNLHVAWRVKFEPGTVKVIARESGKVVAEKEICTAGKPAE 742
           ADEVELF+NGKS G K K  D LH+AW+V FE GT+K ++R++GK+V E EI TAG+ A+
Sbjct: 645 ADEVELFLNGKSLGSKAKQGDELHIAWKVPFEAGTLKAVSRKAGKIVKETEIHTAGEAAK 704

Query: 743 IRLTPDRSILTADGKDLCFVTVEVLDEKGNLCPDADNLVNFTVQGNGFIAGVDNGNPVSM 802
           I L  D++ +  DG  L +VTV + D+ GN  P ADNL+NF V G   I GVDNG   S+
Sbjct: 705 INLQADKTAIKNDGYHLAYVTVTLQDKDGNALPKADNLINFKVSGGAKIVGVDNGYQASL 764

Query: 803 ERFKDEKRKAFYGKCLVVIQNDGKPGKAKLTATSEG 838
           E FK   RK + GKCLV++Q++ K     L AT+ G
Sbjct: 765 EPFKANYRKLYNGKCLVILQSNKKAENITLEATTAG 800


Lambda     K      H
   0.319    0.136    0.431 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 1931
Number of extensions: 103
Number of successful extensions: 4
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 3
Number of HSP's successfully gapped: 1
Length of query: 851
Length of database: 812
Length adjustment: 42
Effective length of query: 809
Effective length of database: 770
Effective search space:   622930
Effective search space used:   622930
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 56 (26.2 bits)
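
The summary figures above are tied together by the standard Karlin-Altschul relations, and they can be roughly re-derived from the reported values. The Python sketch below recomputes the effective search space, the bit score, and the E-value. It assumes that the gapped Lambda and K apply to this gapped alignment and that the length adjustment is subtracted once from the query and once per database sequence. The function names are illustrative and are not part of BLAST or GapMind.

```python
import math

def effective_search_space(query_len, db_len, length_adjustment, num_seqs=1):
    """Effective query length times effective database length."""
    eff_query = query_len - length_adjustment
    eff_db = db_len - num_seqs * length_adjustment
    return eff_query * eff_db

def bit_score(raw_score, lam, k):
    """Normalize a raw score: S' = (lambda * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)

def expect_value(search_space, bits):
    """Expected number of chance hits: E = search_space * 2^(-S')."""
    return search_space * 2.0 ** (-bits)

# Values copied from the report above (gapped Lambda and K).
space = effective_search_space(query_len=851, db_len=812, length_adjustment=42)
bits = bit_score(raw_score=2124, lam=0.267, k=0.0410)
evalue = expect_value(space, bits)

print(space)           # effective search space
print(round(bits, 1))  # bit score, approximate because Lambda and K are rounded
print(evalue)          # tiny value that BLAST displays as 0.0
```

With these inputs the search space matches the reported 622930, and the bit score comes out close to the reported 822 bits; the small difference is presumably due to the rounded Lambda and K shown in the report.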

This GapMind analysis is from Sep 24 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMER with enzyme models (usually from TIGRFAMs). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."
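
To make the identity and coverage terms concrete, here is a small Python sketch applied to the alignment shown above. It assumes that coverage means the aligned span of the characterized (query) protein divided by its full length; this page does not spell out the exact definition GapMind uses, so treat it as an illustration only. The function names are hypothetical.

```python
def percent_identity(identical, alignment_len):
    """Identical positions as a percentage of alignment columns."""
    return 100.0 * identical / alignment_len

def query_coverage(q_start, q_end, query_len):
    """Fraction of the query covered by the alignment (1-based, inclusive)."""
    return (q_end - q_start + 1) / query_len

# Numbers taken from the alignment above: 409/816 identities,
# query positions 27..838 of an 851-residue characterized protein.
ident = percent_identity(409, 816)
cov = query_coverage(27, 838, 851)

# The only threshold stated in this section: at least 50% coverage.
meets_low_confidence_coverage = cov >= 0.50

print(f"identity = {ident:.1f}%, coverage = {cov:.1%}")
```

Under this assumed definition, the alignment covers well over half of the characterized protein, so it clears the 50% coverage bar mentioned above.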

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
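
For readers unfamiliar with the term, a six-frame translation translates a DNA sequence in all three reading frames on both strands, so that proteins missing from the gene models can still be found. The following Python sketch illustrates the idea. It is not how GapMind or Curated BLAST implement it; it assumes the standard codon assignments, and all names are hypothetical.

```python
from itertools import product

# Standard codon assignments, with bases ordered T, C, A, G (assumption:
# the genome uses these assignments; unknown codons are reported as "X").
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]

def translate(dna):
    """Translate complete codons from position 0; '*' marks a stop codon."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )

def six_frame_translation(dna):
    """Map frame labels (+1..+3, -1..-3) to translated protein strings."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames

frames = six_frame_translation("ATGGCC")
for label, protein in frames.items():
    print(label, protein)
```

A real search would then split each frame at stop codons and compare the resulting open reading frames against the curated proteins.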

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory