GapMind for catabolism of small carbon sources

 

Alignments for a candidate for lacZ in Echinicola vietnamensis KMM 6221, DSM 17526

Align β-galactosidase (Gal4214-1) (EC 3.2.1.23) (characterized)
to candidate Echvi_0485 Echvi_0485 Beta-galactosidase/beta-glucuronidase

Query= CAZy::AAX48919.1
         (1046 letters)



>FitnessBrowser__Cola:Echvi_0485
          Length = 1038

 Score = 1048 bits (2709), Expect = 0.0
 Identities = 515/1049 (49%), Positives = 690/1049 (65%), Gaps = 15/1049 (1%)

Query: 1    MNMKKRTILTSIFAFISIIVFAQEKPSRNDWENPEVFQINREPARAAFLPFADEASAIAD 60
            M  KK  +  ++ A +  ++ AQ   S+N+WE+P     N+E ARA F+ +  E  A+  
Sbjct: 1    MQFKKLWMTGALVAALGGLLHAQ---SQNEWEDPTAVDRNKEAARAYFITYPSEEKALLG 57

Query: 61   DYTRSPWYMSLDGKWKFNWSPTPDERPKDFFNTDFNTTTWKEIGVPSNWELVGYGIPIYT 120
            + T +  + +LDG WKF+    P +RP DFF   F    W +I VPSNWEL GY +P+YT
Sbjct: 58   NRTTNESFKTLDGLWKFSLVKRPQDRPTDFFEPTFKDEDWDDITVPSNWELEGYDMPVYT 117

Query: 121  NITYPFVKNPPFIDHADNPVGSYRRTFELPENWDGRRVYLHFEGGTSAMYVWINGEKVGY 180
            N+ YPF  +PP +D+  NPVG+YRRTF +P  WD + V LHF   +    V++NGE+VG 
Sbjct: 118  NVAYPFPADPPLVDNQYNPVGTYRRTFSIPSQWDNQEVILHFGSISGYATVYVNGEEVGM 177

Query: 181  SQNTKSPTEFDITKYVKVGKNQVAVEVYRWSDGSYLEDQDFWRLSGIDRSVYLYSTANTR 240
            ++  K+P EF IT Y+K G+N +AV+V+RW DGSYLEDQDFWRLSGI+RSV+L +     
Sbjct: 178  TKAAKTPAEFVITDYLKTGENTLAVQVFRWHDGSYLEDQDFWRLSGIERSVFLQAVPKLT 237

Query: 241  IADFFARPDLDTSYKNGSLSVDIKLKNANSVAKNNQTVEAKLVDAAGKEVFIKTIKINLG 300
            I DFF +  LD  YKNG L   I+L+           +  +L D  GK+V+  T  ++ G
Sbjct: 238  IWDFFVKSGLDDRYKNGVLEAAIQLRAFEGSDVQGGELSFELQDEDGKQVYSDTKAVSNG 297

Query: 301  ANTVSSTTFEQMVKSPKLWNNETPNLYTLVLTLKDENGKFVETVATSIGFRKVELKNGQL 360
               V    F + + +   W+ E P LY   ++LKD  G+ +  V+   GFRKVE+K+ QL
Sbjct: 298  DQEVK---FSKTIGNVNKWSAEEPYLYQYTISLKDSRGRTLAAVSKKTGFRKVEIKDAQL 354

Query: 361  LVNGIRIMVHGVNIHEHNPKTGHYQDEATMMKDIKLMKQLNINAVRCSHYPNNLLWVKLC 420
            +VNG  ++V GVN HEH+   GH  DE  M++DI+LMKQ NINAVR SHYP++  W +LC
Sbjct: 355  MVNGQSVLVKGVNRHEHHGVKGHVPDEEIMLRDIQLMKQNNINAVRMSHYPHSPRWYELC 414

Query: 421  NKYGLFLVDEANIETHGMGAELQGSFDKTKHPAYLPEWKAAHMDRIYSLVERDKNQPSII 480
            ++YGL++VDEANIETHGMGAE QG F K +HPAYL  W  AH+DRI+ LVERDKN PSII
Sbjct: 415  DEYGLYVVDEANIETHGMGAEWQGRFKKDRHPAYLEAWAPAHLDRIHRLVERDKNHPSII 474

Query: 481  LWSLGNECGNGPVFHEAYNWIKNRDKTRLVQFEQAGEQENTDVVCPMYPSMEYMKEYANR 540
            +WS+GNECGNGPVF+EAYNW+K RD +RLVQFEQAGE E+TD+VCPMYPS+ +M+EYA+ 
Sbjct: 475  IWSMGNECGNGPVFYEAYNWMKERDDSRLVQFEQAGENEDTDIVCPMYPSIRHMQEYADA 534

Query: 541  KDVKRPFIMCEYSHAMGNSNGNFQEYWDIIHSSTNMQGGFIWDWVDQGFEETDEAGRKYW 600
             D  RPFIMCEY+H+MGNS GNFQEYWDII  S +MQGGFIWDWVDQG    D+ G+++W
Sbjct: 535  TDKTRPFIMCEYAHSMGNSTGNFQEYWDIILDSPHMQGGFIWDWVDQGLLAKDDNGKEFW 594

Query: 601  AYGGDMGGQNYTNDQNFCHNGLVWPDRTPHPGAFEVKKVYQDILFKGVNLDKGIIEVENG 660
            AYGGD+GG  + ND+NFC NGLV  DR PHP   EVKKVYQDILF   + +KG + V+N 
Sbjct: 595  AYGGDLGGYFFQNDENFCANGLVTADRKPHPALHEVKKVYQDILF-DYSPEKG-LHVQNL 652

Query: 661  FGYTNLDKYLFKFEVLKNGLVIKSGVINIRLAPQSKKQIQIELPKLTTEDGVEYLLNVFA 720
            F +TNLD+Y FK+E ++ G V+K+G  ++ L+   +K +Q+ LP +      E  LNV+A
Sbjct: 653  FDFTNLDQYAFKWEWVEEGEVVKTGDFDVDLSADEEKYVQLNLPSV---GDAETFLNVYA 709

Query: 721  YTKEGTELLPQNFEIAREQFSIGESNYFVKVAKASTNPIVKDSQDAITLSANGVEVTINK 780
            YTK    L+P   E+AREQF++ E  YF  +   + N  V+ ++D +T + + V    + 
Sbjct: 710  YTKNTEALVPAGHEVAREQFALNEGYYFDHLEAVTGNLQVEQTEDLLTFATDKVTGAFDL 769

Query: 781  KTGLMQKYT--SGEENYFNQMPVPNFWRAPTDNDFGNYMQVNSNVWRTVGRFSSLDSIEV 838
            K G  +KYT   GE      +P P FWRAP DNDFGN+M     VWR+      +  ++V
Sbjct: 770  KRGNFRKYTLKDGEPWMVRSLPSPYFWRAPIDNDFGNHMPSRLGVWRSAHLGQKVLDVQV 829

Query: 839  KEVSTQ-TTVVAHLFLKDIASTYTITYSMDADGSLTLQNSFKAGEMALSEMPRFGMLFSL 897
             E S +   +  +  L +I   YT+TY + +DG++ +  +       L E+PR+GM   L
Sbjct: 830  GEKSDEGIQITVNYELTNINVPYTVTYQIQSDGAVKVTAAMDLEGRDLPELPRYGMRMEL 889

Query: 898  KKELDNFSYYGRGPWENYQDRNTSSLKGIYESKVADQ-YVPYTRPQENGYKTDIRWITLT 956
              +  N +YYGRGPWENY DR  SS  G Y  +V +Q Y  Y RPQE+G KTD+RW+TL 
Sbjct: 890  PGQYGNLAYYGRGPWENYSDRKHSSFIGQYNDQVENQFYWDYVRPQESGNKTDVRWLTLR 949

Query: 957  NSSGNGIEILGLQPLGVSALNNYPEDFDPGLTKKQQHTNDITPRDEVIICVDLAQRGLGG 1016
            N  G GI+I G+QPL  SAL+   ED DPGLTKKQQH  DI P++ V + +D  QRGLGG
Sbjct: 950  NDKGQGIQIQGIQPLSFSALDVSVEDLDPGLTKKQQHPTDIKPKNTVYLHIDWKQRGLGG 1009

Query: 1017 DNSWGAMPHEQYQLRNKAYSYGFVIKPIK 1045
            D SWGA PH+ Y+L +  Y Y +VI+ ++
Sbjct: 1010 DTSWGAYPHKPYRLEDDHYEYSYVIRLVE 1038
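
The summary line of the alignment above can be re-derived from its counts. The following is a minimal sketch, not part of GapMind: the variable and function names are illustrative assumptions, and the numbers are copied from the report. It also computes how much of each sequence the alignment covers, which the report does not print directly.

```python
# Values copied from the BLAST report above; all names here are illustrative.
QUERY_LEN = 1046         # "(1046 letters)"
SUBJECT_LEN = 1038       # "Length = 1038"
IDENTITIES = 515
POSITIVES = 690
GAPS = 15
ALIGNED_COLUMNS = 1049   # denominator on the Identities line
QUERY_RANGE = (1, 1045)  # first and last query positions in the alignment
SUBJECT_RANGE = (1, 1038)


def truncated_percent(count, total):
    """Percentage with the fractional part dropped (not rounded)."""
    return int(100 * count / total)


def coverage(aligned_range, length):
    """Fraction of a sequence spanned by the alignment (1-based, inclusive)."""
    start, end = aligned_range
    return (end - start + 1) / length


identity_pct = truncated_percent(IDENTITIES, ALIGNED_COLUMNS)
positive_pct = truncated_percent(POSITIVES, ALIGNED_COLUMNS)
gap_pct = truncated_percent(GAPS, ALIGNED_COLUMNS)
query_cov = coverage(QUERY_RANGE, QUERY_LEN)
subject_cov = coverage(SUBJECT_RANGE, SUBJECT_LEN)

print(f"identity {identity_pct}%, positives {positive_pct}%, gaps {gap_pct}%")
print(f"query coverage {query_cov:.3f}, subject coverage {subject_cov:.3f}")
```

Dropping the fractional part rather than rounding reproduces the printed 65% for positives (690/1049 is about 65.8%). The alignment spans all of the candidate and all but the last residue of the characterized query.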


Lambda     K      H
   0.316    0.134    0.410 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 3375
Number of extensions: 184
Number of successful extensions: 8
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 1046
Length of database: 1038
Length adjustment: 45
Effective length of query: 1001
Effective length of database: 993
Effective search space:   993993
Effective search space used:   993993
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.6 bits)
S2: 58 (26.9 bits)
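
The statistics above are enough to re-derive the reported bit score and effective search space with the standard Karlin-Altschul relations. This is a minimal sketch under the assumption that the gapped Lambda and K apply to this gapped alignment; the function and variable names are illustrative and not part of BLAST or GapMind.

```python
import math

# Values copied from the report; names are illustrative.
GAPPED_LAMBDA = 0.267
GAPPED_K = 0.0410
RAW_SCORE = 2709          # the number in parentheses after "Score = 1048 bits"
QUERY_LEN = 1046
DATABASE_LEN = 1038
LENGTH_ADJUSTMENT = 45


def bit_score(raw_score, lam, k):
    """Karlin-Altschul bit score: (lambda * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


# Effective lengths are the raw lengths minus the length adjustment.
effective_query = QUERY_LEN - LENGTH_ADJUSTMENT
effective_database = DATABASE_LEN - LENGTH_ADJUSTMENT
search_space = effective_query * effective_database

bits = bit_score(RAW_SCORE, GAPPED_LAMBDA, GAPPED_K)

# E = search_space * 2**(-bits); work in log10 to avoid floating-point underflow.
log10_expect = math.log10(search_space) - bits * math.log10(2)

print(f"bit score ~ {bits:.1f}")
print(f"effective search space = {search_space}")
print(f"log10(E) ~ {log10_expect:.1f}")
```

The bit score rounds to the reported 1048, the search space equals the reported 993993, and the expect value is far too small to display, which is consistent with the report printing it as 0.0.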

This GapMind analysis is from Sep 17 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMER with enzyme models (usually from TIGRFAMs). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
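
The definition of a gap given above (a step with no high- or medium-confidence candidate) can be sketched as a simple filter. This is an illustration only, not GapMind's implementation: the data layout, the confidence labels as strings, and the step names other than lacZ are hypothetical.

```python
def find_gaps(candidates_by_step):
    """Return the steps that have no high- or medium-confidence candidate.

    candidates_by_step maps a step name to the confidence labels of its
    candidates ("high", "medium" or "low"); this layout is an assumption.
    """
    gaps = []
    for step, confidences in candidates_by_step.items():
        if not any(level in ("high", "medium") for level in confidences):
            gaps.append(step)
    return gaps


# Hypothetical pathway: "stepA" and "stepB" are made-up step names.
example = {
    "lacZ": ["high", "low"],
    "stepA": ["low"],
    "stepB": [],
}

print(find_gaps(example))
```

A step with only low-confidence candidates is treated the same as a step with no candidates at all, which follows the wording of the definition.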

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory