GapMind for catabolism of small carbon sources

 

Alignments for a candidate for lacZ in Echinicola vietnamensis KMM 6221, DSM 17526

Align β-galactosidase (BgaM) (EC 3.2.1.23) (characterized)
to candidate Echvi_1698 Echvi_1698 Beta-galactosidase/beta-glucuronidase

Query= CAZy::CAA04267.1
         (1034 letters)



>FitnessBrowser__Cola:Echvi_1698
          Length = 1080

 Score =  672 bits (1733), Expect = 0.0
 Identities = 384/1036 (37%), Positives = 556/1036 (53%), Gaps = 65/1036 (6%)

Query: 23   NPEIFQLNRSKAHALLMPYQTVEEALKNDRKSSVYYQSLNGSWYFHFAENADGRVKNFFA 82
            +P I  LNR  A      Y+  E A   DR+ S   Q LNG W FHFA N      +F+ 
Sbjct: 44   DPLITSLNRMPARTTAYSYKDAETAKIGDREES-RIQLLNGDWDFHFAMNMKEAPSDFYR 102

Query: 83   PEFSYEKWDSISVPSHWQLQGYDYPQYTNVTYPWVENEELEPPFAPTKYNPVGQYVRTFT 142
               +   WD I VPS+W+L+GYD P Y +  YP+     + PP+ P  YN VG Y RTF 
Sbjct: 103  SRVT--GWDKIEVPSNWELKGYDKPIYKSAVYPF---RPINPPYVPEDYNGVGSYQRTFE 157

Query: 143  PKSEWKDQPVYISFQGVESAFYVWINGEFVGYSEDSFTPAEFDITSYLQEGENTIAVEVY 202
             +  W+D  + + F  V SAF VW+NGEFVGY EDSF P+EF+IT YL+ GEN ++V+V 
Sbjct: 158  LEENWEDMNITLHFGAVSSAFKVWLNGEFVGYGEDSFLPSEFNITPYLRSGENVLSVQVL 217

Query: 203  RWSDASWLEDQDFWRMSGIFRDVYLYSTPQVHIYDFSVRSSLDNNYEDGELSVSADILNY 262
            RWSD S+LEDQD WR+SGI R+V+L + P++ +YDF  +++L  +Y +   S+   + N 
Sbjct: 218  RWSDGSYLEDQDHWRLSGIQREVFLMAEPKLRVYDFHWQATLAEDYTNATFSLRPKVENL 277

Query: 263  FEHDTQDLTFEVMLYDANAQEVLQAPLQTNLSVSDQRTVS-----------LRTHIKSPA 311
                  D      L+DA  + V   PL+  ++V D    S           L   +++P 
Sbjct: 278  TGERVPDSKLTAQLFDAEGKPVFATPLE--MAVEDILNESYPRLDNVKFGLLEATVENPH 335

Query: 312  KWSAESPNLYTLVLSLKNAAGSIIETESCKVGFRT--FEIKNGLMTINGKRIVLRGVNRH 369
             WS E P LYTLV+ L+ A G ++E +SCKVGFR   F+ +   + INGK   + GVNRH
Sbjct: 336  LWSDEHPYLYTLVIGLEGAKGQLLEAKSCKVGFRDIRFDPETSKLLINGKETYIYGVNRH 395

Query: 370  EFDSVKGRAGITREDMIHDILLMKQHNINAVRTSHYPNDSVWYELCNEYGLYVIDETNLE 429
            +   V+G+A +TR+D+  D+  +KQ N N +RTSHYPND  +YELC+EYG+ VIDE N E
Sbjct: 396  DHHPVRGKA-LTRQDIEEDVKTIKQFNFNTIRTSHYPNDPYFYELCDEYGILVIDEANHE 454

Query: 430  THGTWTYLQEGEQKAVPGSKPEWKENVLDRCRSMYERDKNHPSIIIWSLGNESFGGENFQ 489
            THG    L    Q         W    ++R   M +RDKNHPSII WSLGNE+  G N  
Sbjct: 455  THGIGGKLSNDTQ---------WTHAYMERVSRMVQRDKNHPSIIFWSLGNEAGRGPNHA 505

Query: 490  HMYTFFKEKDSTRLVHYEGIF-----------HHRDY------------DASDIESTMYV 526
             M  +  + D TR VHYE              +H DY            D   ++     
Sbjct: 506  AMAAWVHDVDITRPVHYEPAQGNHRAEGYIPPNHPDYPKDHAHRIQVPTDQPYVDMVSRF 565

Query: 527  KPADVERYALMNPK---KPYILCEYSHAMGNSCGNLYKYWELFDQYPILQGGFIWDWKDQ 583
             P       L+N     +P +  EYSH+MGNS GN+ + W+ F   P + GG IWD+KDQ
Sbjct: 566  YPGIFTPDLLVNQHADHRPIVFIEYSHSMGNSTGNMKELWDKFRSLPQVIGGCIWDFKDQ 625

Query: 584  ALQATAEDGTSYLAYGGDFGDTPNDGNFCGNGLIFADGTASPKIAEVKKCYQPVKWTAVD 643
             L    +DG ++ AYGGDF +  +DGNFC NG++ +DG     + E K  YQPV+ T  D
Sbjct: 626  GLLKQTDDGEAFYAYGGDFDEERHDGNFCINGIVASDGRPKAAMYECKWVYQPVEMTWED 685

Query: 644  PAKGKFAVQNKHLFTNLNAYDFVWTVEKNGELVEKHASLLNVA-PDGTDELTLSYPLYEQ 702
              +    + N+H   +L  Y F  ++ +NGE V +   L N+A   G D +    P    
Sbjct: 686  STEMTVRIHNRHADKSLEDYLFELSLLQNGERVNRR-DLPNLALAAGEDTVINLKPYLPD 744

Query: 703  ENETDEFVLTLSLRLSKDTAWASAGYEVAYEQFVLPAKAAMPSVKAAHPALTVDQNEQTL 762
                DE++  L+  LS++  WA  G+EVA +QF +  K   P   A   A  V+++   +
Sbjct: 745  LQPGDEYLAHLTFSLSEEELWAGKGHEVAQQQFQV-QKGNSPEFPAPRQAFEVEESVTNI 803

Query: 763  TVTGTNFTAIFDKRKGQFISYNYERTELLASGFRPNFWRAVTDND-LGNKLHERCQTWRQ 821
             V G  F   F K  G   SY     E ++     +F R +TDND  G K HE+ + W +
Sbjct: 804  LVKGEGFQVAFGKSTGALESYQLAGEEQISQPMALSFSRPLTDNDRKGWKPHEKLKVWYE 863

Query: 822  ASLEQHVKKVTVQPQVDFVI-ISVELALDNSLASCYVTYTLYNDGEMKIEQSLAPSETMP 880
            A+    +  ++   + D  I ++ + AL +  A   V YT+   G +K++ +L P + +P
Sbjct: 864  AT--PKLSDMSSSKEEDGSIEVTSKYALIDGKAEATVVYTVLAGGVVKVDYTLIPLDDLP 921

Query: 881  EIPEIGMLFTMNAAFDSLTWYGRGPHENYWDRKTGAKLALHKGSVKEQVTPYLRPQECGN 940
             +P++GM   +   +D + WYG+GP ENY D+  G    +++  + + + PY+ PQE GN
Sbjct: 922  NLPKVGMHLGIRREYDQIRWYGKGPVENYIDKNHGFMAGIYQQPIDQFMEPYVMPQENGN 981

Query: 941  KTDVRWATITNDQG-RGFLIKGLPTVELNALPYSPFELEAYDHFYKLPASDSVTVRVNYK 999
            +TDVRW  +T+  G  G  I     + ++A P++   + A +H Y+L  +  +TV ++  
Sbjct: 982  RTDVRWMELTDKSGENGLNITADSLLSMSAWPFTAENINAAEHTYELDDAGFITVNIDLA 1041

Query: 1000 QMGVGGDDSWQAKTHP 1015
            QMGVGG+DSW     P
Sbjct: 1042 QMGVGGNDSWSDVAQP 1057
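
The summary line of this alignment can be reproduced from its counts and coordinates. The Python sketch below is illustrative only and is not part of GapMind; the class and function names are hypothetical. It assumes that percentages are truncated to whole numbers (consistent with 556/1036 being reported as 53%) and that coverage means the aligned span divided by the full sequence length.

```python
from dataclasses import dataclass


@dataclass
class Hsp:
    """Counts and coordinates copied from the alignment shown above."""
    identities: int
    positives: int
    gaps: int
    align_len: int
    q_start: int
    q_end: int
    q_len: int
    s_start: int
    s_end: int
    s_len: int


def whole_percent(count: int, total: int) -> int:
    # Assumption: the report truncates percentages instead of rounding them.
    return 100 * count // total


def coverage(start: int, end: int, length: int) -> float:
    # Fraction of the full sequence spanned by the alignment
    # (coordinates are 1-based and inclusive).
    return (end - start + 1) / length


hsp = Hsp(identities=384, positives=556, gaps=65, align_len=1036,
          q_start=23, q_end=1015, q_len=1034,
          s_start=44, s_end=1057, s_len=1080)

print("identity %:", whole_percent(hsp.identities, hsp.align_len))
print("positive %:", whole_percent(hsp.positives, hsp.align_len))
print("gap %:", whole_percent(hsp.gaps, hsp.align_len))
print("query coverage:", round(coverage(hsp.q_start, hsp.q_end, hsp.q_len), 3))
print("candidate coverage:", round(coverage(hsp.s_start, hsp.s_end, hsp.s_len), 3))
```

Under these assumptions the alignment spans 993 of the 1034 query residues and 1014 of the 1080 candidate residues, so it covers most of both proteins.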


Lambda     K      H
   0.316    0.133    0.412 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 3437
Number of extensions: 210
Number of successful extensions: 13
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 2
Number of HSP's successfully gapped: 1
Length of query: 1034
Length of database: 1080
Length adjustment: 45
Effective length of query: 989
Effective length of database: 1035
Effective search space:  1023615
Effective search space used:  1023615
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.6 bits)
S2: 58 (26.9 bits)
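
The statistics above are related by the standard BLAST formulas, and a short check can confirm that the reported numbers are mutually consistent. The sketch below is a minimal illustration, not GapMind code; the function names are hypothetical. It assumes the usual conversion from raw score S to bit score, (Lambda * S - ln K) / ln 2, and an effective search space equal to the product of the query and database lengths after subtracting the length adjustment (once per database sequence).

```python
import math


def bit_score(raw_score: float, lam: float, k: float) -> float:
    # Standard Karlin-Altschul conversion from raw score to bits.
    return (lam * raw_score - math.log(k)) / math.log(2)


def effective_search_space(query_len: int, db_len: int,
                           length_adjustment: int, n_seqs: int = 1) -> int:
    # Each sequence is shortened by the length adjustment before multiplying.
    eff_query = query_len - length_adjustment
    eff_db = db_len - n_seqs * length_adjustment
    return eff_query * eff_db


# Gapped parameters (Lambda = 0.267, K = 0.0410) with the raw score 1733.
print(round(bit_score(1733, 0.267, 0.0410)))

# Ungapped parameters (Lambda = 0.316, K = 0.133) with the S1 threshold of 41.
print(round(bit_score(41, 0.316, 0.133), 1))

# Query of 1034 letters, database of 1080 letters, length adjustment 45.
print(effective_search_space(1034, 1080, 45))
```

Because Lambda and K are printed with limited precision, the recomputed bit scores agree with the report only to within rounding.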

This GapMind analysis is from Sep 17 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMER with enzyme models (usually from TIGRFAMs). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
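
The definition of a gap given above (a step with no high- or medium-confidence candidate) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not GapMind's implementation: the function name, the input layout (a mapping from step name to the confidence labels of its candidates), and the step names in the example are all hypothetical.

```python
from typing import Dict, List


def find_gaps(candidates_by_step: Dict[str, List[str]]) -> List[str]:
    """Return the steps that have no high- or medium-confidence candidate."""
    gaps = []
    for step, confidences in candidates_by_step.items():
        # A step with only low-confidence candidates, or none, is a gap.
        if not any(c in ("high", "medium") for c in confidences):
            gaps.append(step)
    return gaps


# Hypothetical pathway with three steps.
example = {
    "step_a": ["high", "low"],
    "step_b": ["low"],
    "step_c": [],
}
print(find_gaps(example))
```

In this example, step_b and step_c are reported as gaps, while step_a is not because it has a high-confidence candidate.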

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory