GapMind for catabolism of small carbon sources

 

Alignments for a candidate for lacZ in Bacteroides thetaiotaomicron VPI-5482

Align β-galactosidase Z (LacZ;TmLac;TM1193) (EC 3.2.1.23) (characterized)
to candidate 352820 BT3293 beta-galactosidase (NCBI ptt file)

Query= CAZy::AAD36268.1
         (1087 letters)



>FitnessBrowser__Btheta:352820
          Length = 1421

 Score =  681 bits (1758), Expect = 0.0
 Identities = 395/1035 (38%), Positives = 582/1035 (56%), Gaps = 81/1035 (7%)

Query: 7    EWENPQLVSEGTEKSHASFIPYLDPFSGEWEYPEE---FISLNGNWRFLFAKNPFEVPED 63
            EW N  +     E++    IP+ D         EE   + +LNG W+F +  +P + P+D
Sbjct: 29   EWSNELVSGVNKEEAVQIAIPFTDEQQAMNLTIEESPYYKTLNGIWKFHWVADPKDRPQD 88

Query: 64   FFSEKFDDSNWDEIEVPSNWEM------KGYGKPIYTNVVYPF--------------EPN 103
            F   ++D S WD I+VP+ W++      K + KP+Y NV+YPF              +P 
Sbjct: 89   FCKPEYDVSQWDNIKVPATWQIEAVRHNKNWDKPLYCNVIYPFCEWDWKKIQWPNVIQPR 148

Query: 104  PP--FVPKDDNPTGVYRRWIEIPEDWFKKEIFLHFEGVRSFFYLWVNGKKIGFSKDSCTP 161
            P         NP G YRR   +P+ W  ++IF+ F GV + FY+WVNGKK+G+S+DS  P
Sbjct: 149  PSNYTFATMPNPVGSYRREFILPDSWKGRDIFIRFNGVEAGFYIWVNGKKVGYSEDSYLP 208

Query: 162  AEFRLTDVLRPGKNLITVEVLKWSDGSYLEDQDMWWFAGIYRDVYLYALPKFHIRDVFVR 221
            AEF LT  L+ GKN++ VEV +++DGS+LE QD W F+GI+RDV+L++ PK  IRD F R
Sbjct: 209  AEFNLTPYLKAGKNVLAVEVYRFTDGSFLECQDFWRFSGIFRDVFLWSAPKTQIRDFFFR 268

Query: 222  TDLDENYRNGKIFLDVEMRNLGEEEEKDLEVTLITPDGDEKTLVKETVKPEDRVLSFAFD 281
            TDLD+ Y+N  + LD+++       E  ++VT    D + K +  +  +         F+
Sbjct: 269  TDLDKEYKNASVSLDIDITGKRSNNEIQVKVT----DQNGKEIATQNARAVTGTNKLQFE 324

Query: 282  VKDPKKWSAETPHLYVLKLKLGE-----DEKKVNFGFRKIEI-KDGTLLFNGKPLYIKGV 335
            V +P KW+AETP+LY L + L +     D + V  GFRKIE+ +DG LL NGK    KGV
Sbjct: 325  VVNPLKWTAETPNLYNLTILLKQKGKTVDIRSVKVGFRKIELAQDGRLLINGKSTLFKGV 384

Query: 336  NRHEFDPDRGHAVTVERMIQDIKLMKQHNINTVRTSHYPNQTKWYDLCDYFGLYVIDEAN 395
            +RH+   + G  V+ E M +D++LMK  NIN VRTSHYPN   +YDLCD +G+YV+ EAN
Sbjct: 385  DRHDHSSENGRTVSKEEMEKDVQLMKSLNINAVRTSHYPNNPYFYDLCDRYGIYVLSEAN 444

Query: 396  IESHGIDWDPEVTLANRWEWEKAHFDRIKRMVERDKNHPSIIFWSLGNEAGDGVNFEKAA 455
            +E HG+     + L++   W KA  +R + MV R KNH SI+ WSLGNE+G+G+NF+ AA
Sbjct: 445  VECHGL-----MALSSEPSWVKAFTERSENMVRRYKNHASIVMWSLGNESGNGINFKSAA 499

Query: 456  LWIKKRDNTRLIHYEGTTRRGESYYVDVFSLMYPKM--------DILLEYASKKREKPFI 507
              +KK D+TR  HYE     G S Y DV S MYP +        + L ++ + +  KP +
Sbjct: 500  EAVKKLDDTRPTHYE-----GNSSYCDVTSSMYPDVQWLESVGKERLQKFQNGETVKPHV 554

Query: 508  MCEYAHAMGNSVGNLKDYWDVIEKYPYLHGGCIWDWVDQGIRKKDENGREFW-AYGGDFG 566
            +CEYAHAMGNS+GN K+YW+  E+YP L GG IWDWVDQ I+    +G  ++ A+GGDFG
Sbjct: 555  VCEYAHAMGNSIGNFKEYWETYERYPALVGGFIWDWVDQSIKMPAPDGSGYYMAFGGDFG 614

Query: 567  DTPNDGNFCINGVVLPDRTPEPELYEVKKVYQNVKIRQVSKDTYEVENRYLFTNLEMFDG 626
            DTPNDGNFC NGV+  DRT   + YEVKK++Q V +  +   TY++ N+     L+   G
Sbjct: 615  DTPNDGNFCTNGVIFSDRTYSAKAYEVKKIHQPVWVEAMGNGTYKLTNKRFHAGLDDLYG 674

Query: 627  AWKIRKDGEVIEEKTF-KIFAEPGEKRLLKI---PLPEMDDSEYFLEISFSLSEDTPWAE 682
             ++I +DG+V+      ++     + +++ I    + ++  +EYF++  F   +DT W +
Sbjct: 675  RYEIEEDGKVVFSANLEELSLNAQDSKVITIADNQINKIPGAEYFIKFRFCQKQDTEWEK 734

Query: 683  KGHVVAWEQFLLKAPAFEKKSISDG-VSLREDGKHLTVEAKDTVYVFSKLTGLLEQILHR 741
             G+ VA EQF L   A       +G + L E      V+       FSK  G +      
Sbjct: 735  AGYEVASEQFKLSDSAKPVFKAGEGSIDLIETDDAYLVKGSQFEASFSKQQGTISSYTLN 794

Query: 742  RKKILKSPVVPNFWRVPTDNDIGNRMPQRLAIW-KRASKERKLFKMHW--KKEENRVSVH 798
               ++   +  N +R PTDND      Q    W ++   +  L   HW  +KE+N+V++ 
Sbjct: 795  ELPMISKGLELNAFRAPTDND-----KQVDGDWYQKGLYQMTLEPGHWNVRKEDNKVTLQ 849

Query: 799  SVFQLPGNS-WVYTT---YTVFGNGDVLVDLSLIPAEDVPEIPRIGFQFTVPEEFGTVEW 854
                  G + + Y T   YTV  +G +LV+ ++IP+     IPRIG++  +PE F  + W
Sbjct: 850  IENLYRGKTGFDYRTNIEYTVAADGSILVNSTIIPSTKGVIIPRIGYRMELPEGFERMRW 909

Query: 855  YGRGPHETYWDRKESGLFARYRKAVGEMMHRYVRPQETGNRSDVRWFALS--DGETKLFV 912
            YGRGP E Y DRK++     Y + V +    YVR QE GNR D+RW +++  DG   +F+
Sbjct: 910  YGRGPLENYVDRKDATYVGVYDELVSDQWVNYVRAQEMGNREDLRWISITNPDGIGFVFI 969

Query: 913  SGMPQIDFSVWPFSMEDL------ERVQHISELPERDFVTVNVDFRQMGLGGDDSWGAMP 966
            +G  ++  S    + +D+       R+ H  E+P R    + +D  Q  L G+ S G  P
Sbjct: 970  AG-DKMSASALHATAQDMVDPANHRRLLHKYEVPMRKETVLCLDANQRPL-GNASCGPGP 1027

Query: 967  HLEYRLLPKPYRFSF 981
              +Y L  +P  FSF
Sbjct: 1028 MQKYELRSQPTVFSF 1042


Lambda     K      H
   0.319    0.139    0.438 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 4611
Number of extensions: 256
Number of successful extensions: 13
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 1087
Length of database: 1421
Length adjustment: 48
Effective length of query: 1039
Effective length of database: 1373
Effective search space:  1426547
Effective search space used:  1426547
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 59 (27.3 bits)
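
The statistics above can be cross-checked with the standard Karlin-Altschul relations. The following is a minimal sketch, not part of GapMind or BLAST; the function names are hypothetical, and it assumes the gapped Lambda and K values printed above apply to the reported raw score of 1758.

```python
import math

# Values copied from the BLAST report above.
RAW_SCORE = 1758          # raw score shown in parentheses after the bit score
GAPPED_LAMBDA = 0.267     # "Gapped Lambda"
GAPPED_K = 0.0410         # "Gapped K"
QUERY_LEN = 1087
DB_LEN = 1421
LENGTH_ADJUSTMENT = 48
NUM_SEQUENCES = 1


def bit_score(raw, lam, k):
    """Convert a raw alignment score to bits: (lambda*S - ln K) / ln 2."""
    return (lam * raw - math.log(k)) / math.log(2)


def effective_search_space(query_len, db_len, adjustment, num_sequences=1):
    """Product of the effective query and database lengths."""
    return (query_len - adjustment) * (db_len - num_sequences * adjustment)


def expect_value(search_space, bits):
    """Expected number of chance hits scoring at least this many bits."""
    return search_space * 2.0 ** (-bits)


bits = bit_score(RAW_SCORE, GAPPED_LAMBDA, GAPPED_K)
space = effective_search_space(QUERY_LEN, DB_LEN, LENGTH_ADJUSTMENT, NUM_SEQUENCES)
evalue = expect_value(space, bits)

print(f"bit score ~ {bits:.1f}")
print(f"effective search space = {space}")
print(f"E-value ~ {evalue:.3g}")
```

The computed search space matches the reported 1426547, and the bit score falls between 681 and 682, consistent with the reported 681 bits given that Lambda and K are printed to only three significant figures. The resulting E-value is far too small to display, which is why the report shows 0.0.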

This GapMind analysis is from Sep 17 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."
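
As an illustration of the identity and coverage figures these rules depend on, here is a minimal sketch using the numbers from the alignment on this page. The function names are hypothetical, and treating coverage as the fraction of the characterized (query) protein spanned by the alignment is an assumption, not a statement of how GapMind computes it.

```python
def percent_identity(identical, aligned_columns):
    """Percent of alignment columns that are identical residues."""
    return 100.0 * identical / aligned_columns


def query_coverage(query_start, query_end, query_len):
    """Fraction of the query spanned by the alignment (1-based, inclusive)."""
    return (query_end - query_start + 1) / query_len


# Numbers taken from the BLAST report above:
# Identities = 395/1035, query residues 7..981 of 1087.
identity = percent_identity(395, 1035)
coverage = query_coverage(7, 981, 1087)

# Assumption: the 50% threshold is compared against coverage defined as above.
meets_low_confidence_coverage = coverage >= 0.50

print(f"identity = {identity:.1f}%")
print(f"coverage = {coverage:.1%}")
print(f"coverage >= 50%: {meets_low_confidence_coverage}")
```

Under these assumptions, the BT3293 candidate aligns at about 38% identity over roughly 90% of the characterized LacZ sequence.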

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
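
For readers unfamiliar with the term, a six-frame translation translates a DNA sequence in three reading frames on each strand. The sketch below is illustrative only and is not code from GapMind or Curated BLAST; the function names are hypothetical, it uses the standard codon assignments, and it writes any codon containing a non-ACGT base as X.

```python
# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO))

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons from position 0; unknown codons become X."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame(dna):
    """Translate all three frames on both strands; '*' marks a stop codon."""
    dna = dna.upper()
    rc = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


frames = six_frame("ATGGCCTAA")
for name in sorted(frames):
    print(name, frames[name])
```

Searching these translations rather than the predicted proteins can reveal genes that the genome annotation missed or truncated.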

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory