GapMind for catabolism of small carbon sources

 

Alignments for a candidate for bgl in Herbaspirillum seropedicae SmR1

Align β-glucosidase (BglX;STM2166) (EC 3.2.1.21) (characterized)
to candidate HSERO_RS23930 HSERO_RS23930 beta-D-glucoside glucohydrolase

Query= CAZy::AAL21070.1
         (765 letters)



>FitnessBrowser__HerbieS:HSERO_RS23930
          Length = 784

 Score =  927 bits (2397), Expect = 0.0
 Identities = 465/765 (60%), Positives = 586/765 (76%), Gaps = 9/765 (1%)

Query: 4   LCSVGVAVSLAMQPALAENLFGNHPLTPEARDAFVTDLLKKMTVDEKIGQLRLISVGPDN 63
           LC++  +  L    A A+     +P     + AF+ DLL++MT++EKIGQLRLIS+GP+ 
Sbjct: 24  LCALACSAVLTGPTAHAQA----NPALLGDKTAFIDDLLRQMTLEEKIGQLRLISIGPEM 79

Query: 64  PKEAIREMIKDGQVGAIFNTVTRQDIRQMQDQVMALSRLKIPLFFAYDVVHGQRTVFPIS 123
           P + + E +  G+VG  FN+VTR + R +QD  M  SRLKIP+FFAYDV+HG RT FPI 
Sbjct: 80  PAKKLAEELAAGRVGGTFNSVTRPENRPLQDGAMR-SRLKIPMFFAYDVIHGHRTTFPIG 138

Query: 124 LGLASSFNLDAVRTVGRVSAYEAADDGLNMTWAPMVDVSRDPRWGRASEGFGEDTYLTSI 183
           LGLASS+++D V    RVSA EAA D ++MT+APMVD+SRDPRWGR SEGFGED YL S 
Sbjct: 139 LGLASSWDMDVVARAMRVSAEEAAADSIDMTFAPMVDISRDPRWGRTSEGFGEDPYLVSR 198

Query: 184 MGETMVKAMQGKS-PADRYSVMTSVKHFAAYGAVEGGKEYNTVDMSSQRLFNDYMPPYKA 242
           + E  V+A+QG + P     VM SVKHFA YGAVEGG++YN V+M  QR++NDY+PPY+A
Sbjct: 199 IAEVSVRALQGDTKPIAANRVMASVKHFALYGAVEGGRDYNVVNMDPQRMYNDYLPPYRA 258

Query: 243 GLDAGSGAVMVALNSLNGTPATSDSWLLKDVLRDEWGFKGITVSDHGAIKELIKHGTAAD 302
            +DAG+GAVMVALNS+NG PATS++WLL+D+LR +WGFKG+TVSDHGAI EL+ HG A +
Sbjct: 259 AIDAGAGAVMVALNSINGAPATSNTWLLQDLLRRDWGFKGLTVSDHGAITELVNHGVAQN 318

Query: 303 PEDAVRVALKAGVDMSMADEYYSKYLPGLIKSGKVTMAELDDATRHVLNVKYDMGLFNDP 362
             +A R+++KAG DMSMAD+ Y K LP L++SGKV+  ELD+A R +L  KYD+GLF DP
Sbjct: 319 DSEAARLSMKAGTDMSMADQVYIKQLPELVRSGKVSQQELDNAVRDILGAKYDLGLFKDP 378

Query: 363 YSHLGPKESDPVDTNAESRLHRKEAREVARESVVLLKNRLETLPLKKSGTIAVVGPLADS 422
           Y  +G    DP D  A+SRLHR++AREVA++S+VLL+NR   LPLKK+  IA+VGPLADS
Sbjct: 379 YVRIGRAADDPPDVYADSRLHRRDAREVAQQSMVLLENRNAALPLKKNARIALVGPLADS 438

Query: 423 QRDVMGSWSAAGVANQSVTVLAGIQNAVGDGAKILYAKGANITNDKGIVDFLNLY---EE 479
             D++GSWSAAG   Q++T+  G+Q A+G   K++YA+GANIT DK IVD+LN     + 
Sbjct: 439 HIDMLGSWSAAGKDKQTITLRQGLQAALGGQGKLVYARGANITEDKHIVDYLNFLNWDDP 498

Query: 480 AVKIDPRSPQAMIDEAVQAAKQADVVVAVVGESQGMAHEASSRTNITIPQSQRDLITALK 539
            V  D RSP+AMIDEAV+AA+ ADV+VA VGES+GM+HE+SSRT++++PQSQ DL+ ALK
Sbjct: 499 EVVQDKRSPKAMIDEAVKAARHADVIVAAVGESRGMSHESSSRTSLSLPQSQLDLLKALK 558

Query: 540 ATGKPLVLVLMNGRPLALVKEDQQADAILETWFAGTEGGNAIADVLFGDYNPSGKLPISF 599
           ATGKPLVLVLMNGRPL L    + A AILETW+ GTEGGNAIAD+LFGD NPSGKLPI+F
Sbjct: 559 ATGKPLVLVLMNGRPLDLNWARENASAILETWYTGTEGGNAIADILFGDVNPSGKLPITF 618

Query: 600 PRSVGQIPVYYSHLNTGRPYNPEKPNKYTSRYFDEANGPLYPFGYGLSYTTFTVSDVTLS 659
           PRSVGQIP YY+H   GRPY   KP  YTS+YFDE NGPLYPFGYGLSYT F +S+V+LS
Sbjct: 619 PRSVGQIPSYYNHPRVGRPYTEGKPGNYTSQYFDEPNGPLYPFGYGLSYTEFKLSEVSLS 678

Query: 660 SPTMQRDGKVTASVEVTNTGKREGATVIQMYLQDVTASMSRPVKQLKGFEKITLKPGERK 719
            P+M  DGKV ASV V N G+R GATV+Q+YL+DV AS+ RPVK+LK F K+ L+PGE K
Sbjct: 679 QPSMSADGKVEASVTVKNVGRRAGATVVQLYLRDVAASVVRPVKELKDFRKVMLQPGEEK 738

Query: 720 TVSFPIDIEALKFWNQQMKYDAEPGKFNVFIGVDSARVKQGSFEL 764
            V F ID +AL F+N +++Y AEPG+F V IG+DS  VK  SF L
Sbjct: 739 QVQFSIDRKALSFYNAKLEYVAEPGEFQVQIGLDSKEVKTASFNL 783


Lambda     K      H
   0.316    0.132    0.380 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 1538
Number of extensions: 51
Number of successful extensions: 4
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 765
Length of database: 784
Length adjustment: 41
Effective length of query: 724
Effective length of database: 743
Effective search space:   537932
Effective search space used:   537932
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.6 bits)
S2: 55 (25.8 bits)
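
The effective lengths and search space in the statistics above follow from simple arithmetic on the reported lengths and the length adjustment. The sketch below reproduces them, and also converts the raw alignment score to bits using the standard Karlin-Altschul formula with the gapped Lambda and K from the report. The function names are illustrative and not part of BLAST or GapMind. Because the printed parameters are rounded, the computed bit score may differ from the reported one in the last digit.

```python
import math

# Values copied from the BLAST report above.
QUERY_LEN = 765
DB_LEN = 784
LENGTH_ADJUSTMENT = 41
NUM_SEQUENCES = 1
GAPPED_LAMBDA = 0.267   # rounded as printed in the report
GAPPED_K = 0.0410       # rounded as printed in the report
RAW_SCORE = 2397


def effective_search_space(query_len, db_len, adjustment, num_seqs=1):
    """Return (effective query length, effective database length, search space)."""
    eff_query = query_len - adjustment
    eff_db = db_len - num_seqs * adjustment
    return eff_query, eff_db, eff_query * eff_db


def bit_score(raw_score, lam, k):
    """Convert a raw score to bits: (lambda * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


eff_query, eff_db, search_space = effective_search_space(
    QUERY_LEN, DB_LEN, LENGTH_ADJUSTMENT, NUM_SEQUENCES
)
print(eff_query, eff_db, search_space)  # 724 743 537932

bits = bit_score(RAW_SCORE, GAPPED_LAMBDA, GAPPED_K)
print(f"approximate bit score: {bits:.1f}")
```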

This GapMind analysis is from Sep 17 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMER with enzyme models (usually from TIGRFAMs). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."
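
As a rough illustration of the coverage criterion, the sketch below computes percent identity and coverage for the alignment shown on this page. It assumes that coverage means the fraction of the characterized (query) protein spanned by the alignment. That definition and the function names are assumptions made for illustration, not GapMind's actual code.

```python
def percent_identity(identical, aligned_columns):
    """Percent of alignment columns with identical residues."""
    return 100.0 * identical / aligned_columns


def query_coverage(query_start, query_end, query_len):
    """Fraction of the query spanned by the alignment (1-based, inclusive)."""
    return (query_end - query_start + 1) / query_len


# Values from the alignment above: 465 identities over 765 columns,
# query positions 4 to 764 of a 765-residue characterized protein.
identity = percent_identity(465, 765)
coverage = query_coverage(4, 764, 765)

# Assumed reading of the "at least 50% coverage" criterion.
meets_coverage_threshold = coverage >= 0.5

print(f"identity: {identity:.1f}%")
print(f"coverage: {coverage:.1%}")
print(f"at least 50% coverage: {meets_coverage_threshold}")
```

BLAST reports the identity as 60% because it drops the fractional part of the percentage.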

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory