GapMind for catabolism of small carbon sources

Alignments for a candidate for lacZ in Echinicola vietnamensis KMM 6221, DSM 17526

Align Beta-galactosidase BoGH2A; Beta-gal; Glycosyl hydrolase family protein 2A; BoGH2A; EC 3.2.1.23 (characterized)
to candidate Echvi_1669 Echvi_1669 Beta-galactosidase/beta-glucuronidase

Query= SwissProt::A7LXS9
         (851 letters)



>FitnessBrowser__Cola:Echvi_1669
          Length = 828

 Score =  877 bits (2265), Expect = 0.0
 Identities = 431/810 (53%), Positives = 547/810 (67%), Gaps = 35/810 (4%)

Query: 35  SSSLVSPRERSDFNADWRFHLGDGLQAAQPGFADNDWRVLDLPHDWAIEGDFSQENPSGT 94
           +SS    R+  DFN  WRF LGD   A    F  ++WR L+LPHDW+IEGDFS+++P+  
Sbjct: 40  ASSEDKERQVIDFNYGWRFQLGDHPNAISEDFDVSNWRELNLPHDWSIEGDFSEDHPTKP 99

Query: 95  GGGALPGGVGWYRKTFSVDKADAGKIFRIEFDGVYMNSEVFINGVSLGVRPYGYISFSYD 154
            GGALP G+GWYRK F +      +   IEFDGVY N EV+ING  LG RP GY SF YD
Sbjct: 100 EGGALPAGIGWYRKAFKLPTEAKEQSIWIEFDGVYRNGEVWINGHRLGKRPNGYSSFKYD 159

Query: 155 LTPYLKW-DEPNVLAVRVDNAEQPNSRWYSGCGIYRNVWLSKTGPIHVGGWGTYVTTSSV 213
           L  +L + D+ NVLAVRVDN+EQPNSRWY+G GIYRNV L +TG +HV  WGTYVTT  +
Sbjct: 160 LGEHLNYGDKVNVLAVRVDNSEQPNSRWYTGSGIYRNVRLIRTGKVHVEHWGTYVTTPEI 219

Query: 214 DEKQAVLNLATTLVNESDTNENVTVCSSLQDAEGREVAETRSSGEAEAGKEVVFTQQLTV 273
            +  AV+NL   + N+      +TV S++ DA+G  V E         G+    TQ+ TV
Sbjct: 220 TDSSAVVNLEVMVKNDGFNERKLTVRSTILDADGEAVTEEEQPLVLGKGESTDVTQRFTV 279

Query: 274 KQPQLWDIDTPYLYTLVTKVMRNEECMDRYTTPVGIRTFSLDARKGFTLNGRQTKINGVC 333
             P+LW  D PYLY +VT+V    + MD Y TP+GIR F+ DA+KGF+LNG++ KI GVC
Sbjct: 280 PSPKLWSTDEPYLYQVVTQVYAGMQLMDDYVTPLGIRYFNFDAQKGFSLNGKRMKILGVC 339

Query: 334 MHHDLGCLGAAVNTRAIERHLQILKEMGCNGIRCSHNPPAPELLDLCDRMGFIVMDEAFD 393
            HHDLG LGAAVN RAIER L+ILKEMG N IR +HNPPAPELL LCD MGFIV DEAFD
Sbjct: 340 NHHDLGALGAAVNKRAIERRLEILKEMGVNAIRTAHNPPAPELLQLCDEMGFIVQDEAFD 399

Query: 394 MWRKKKTAHDYARYFNEWHERDLNDFILRDRNHPSVFMWSIGNEVLEQWSDAKADTLSLE 453
           +W+KKK   D   ++++WH RDL D ILRDRNHPS+ MWSIGNE+ EQ+           
Sbjct: 400 VWKKKKVDADSHLFWDQWHRRDLEDLILRDRNHPSIMMWSIGNEIREQF----------- 448

Query: 454 EANLILNFGHSSEMLAKEGEESVNSLLTKKLVSFVKGLDPTRPVTAGCNEP-NSGNHLFR 512
                               +S    +TK+LV  VK LD TR VT    E   S N +++
Sbjct: 449 --------------------DSTGISITKELVRIVKELDTTRVVTCALTENIPSKNFIYQ 488

Query: 513 SGVLDVIGYNYHNKDIPNVPANFPDKPFIITESNSALMTRGYYRMPSDRMFIWPKRWDKS 572
           S  LD++G+NY +KD  N P  +P +  I TE+ SAL TRG+Y +PSD +  WP+  DK 
Sbjct: 489 SKALDLLGFNYKHKDHKNFPKWYPGEKLIATENMSALATRGHYDLPSDTIMRWPQAHDKP 548

Query: 573 F--ADSTFACSSYENCHVPWGNTHEESLKLVRDNDFISGQYVWTGFDYIGEPTPYGWPAR 630
               +     S+Y+     WG+THEE+ K ++D DF++G +VWTGFDY+GEP PY +PAR
Sbjct: 549 LETGNEDLTVSAYDQVSAYWGSTHEETWKSIKDQDFMAGLFVWTGFDYLGEPIPYPYPAR 608

Query: 631 SSYFGIVDLAGFPKDVYYLYQSEWTDKQVLHLFPHWNWTPGQEIDMWCYYNQADEVELFV 690
           SSYFGIVDLAGFPKD YY+YQSEWT   VLH+FPHWNW  GQE+D+W YYNQADEVELF+
Sbjct: 609 SSYFGIVDLAGFPKDAYYMYQSEWTADTVLHVFPHWNWEAGQEVDVWAYYNQADEVELFL 668

Query: 691 NGKSQGVKRKDLDNLHVAWRVKFEPGTVKVIARESGKVVAEKEICTAGKPAEIRLTPDRS 750
           NG+S G+K+K+ D+LHV WR  FEPGT+K +AR+ GK VAEK++ TAG   ++ L+PDR 
Sbjct: 669 NGESLGIKQKEGDDLHVMWRTPFEPGTLKAVARKDGKKVAEKKVTTAGDAQKVTLSPDRK 728

Query: 751 ILTADGKDLCFVTVEVLDEKGNLCPDADNLVNFTVQGNGFIAGVDNGNPVSMERFKDEKR 810
            + ADGKDL F+TV + D  GN+ P+ADN+VNF +QG G I GVDNG   S+E FK   R
Sbjct: 729 TIKADGKDLSFITVSICDMDGNIVPNADNMVNFEIQGEGKIMGVDNGYQASLEPFKANYR 788

Query: 811 KAFYGKCLVVIQNDGKPGKAKLTATSEGLR 840
           KAF GKCL+++Q+  + G+  + ATSE L+
Sbjct: 789 KAFKGKCLLIVQSAREAGEISIRATSEHLQ 818


Lambda     K      H
   0.319    0.136    0.431 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 2049
Number of extensions: 97
Number of successful extensions: 5
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 851
Length of database: 828
Length adjustment: 42
Effective length of query: 809
Effective length of database: 786
Effective search space:   635874
Effective search space used:   635874
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 56 (26.2 bits)
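
The summary statistics above can be cross-checked with a few lines of arithmetic. The sketch below is not part of GapMind or BLAST. It copies the numbers from the report and applies the standard Karlin-Altschul relations, S' = (lambda * S - ln K) / ln 2 for the bit score and E = m * n * 2^(-S') for the expected number of chance hits. It assumes the gapped Lambda and K values are the ones that apply to this gapped alignment. The variable names are illustrative only.

```python
import math

# Values copied from the BLAST report above.
query_len = 851           # Length of query
db_len = 828              # Length of database (a single sequence here)
length_adjustment = 42    # Length adjustment
gapped_lambda = 0.267     # Gapped Lambda (rounded in the report)
gapped_k = 0.0410         # Gapped K (rounded in the report)
raw_score = 2265          # Raw score shown in parentheses after the bit score

# Effective lengths are the raw lengths minus the length adjustment.
eff_query = query_len - length_adjustment
eff_db = db_len - length_adjustment
search_space = eff_query * eff_db

# Bit score from the raw score (Karlin-Altschul normalization).
bit_score = (gapped_lambda * raw_score - math.log(gapped_k)) / math.log(2)

# E = search_space * 2**(-bit_score); work in log10 to avoid underflow.
log10_evalue = math.log10(search_space) - bit_score * math.log10(2)

print(f"effective lengths: {eff_query} x {eff_db} = {search_space}")
print(f"bit score: {bit_score:.1f}")
print(f"log10(E): {log10_evalue:.0f}")
```

Because Lambda and K are printed to only three significant figures, the recomputed bit score can differ from the reported one by a bit or two. An E-value this small is consistent with the report printing "Expect = 0.0".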

This GapMind analysis is from Sep 17 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."
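
As a rough illustration of the coverage criterion, the sketch below computes identity and coverage for the alignment shown on this page, using the coordinates printed in the BLAST output. It is not GapMind's code. It assumes that coverage means the fraction of a sequence spanned by the alignment, and it checks only the 50% threshold stated above, since the high- and medium-confidence thresholds are not listed here. The function name `coverage` is hypothetical.

```python
def coverage(start: int, end: int, length: int) -> float:
    """Fraction of a sequence spanned by an alignment (1-based, inclusive)."""
    return (end - start + 1) / length

# Coordinates and lengths copied from the alignment on this page.
query_cov = coverage(35, 840, 851)     # characterized protein A7LXS9
subject_cov = coverage(40, 818, 828)   # candidate Echvi_1669

# Identity as reported: 431 identical positions over 810 alignment columns.
identity = 431 / 810

# Assumption: the 50% threshold is applied to the characterized (query) protein.
meets_low_confidence_coverage = query_cov >= 0.5

print(f"query coverage:   {query_cov:.1%}")
print(f"subject coverage: {subject_cov:.1%}")
print(f"identity:         {identity:.1%}")
print(f"at least 50% coverage: {meets_low_confidence_coverage}")
```

For this candidate, the alignment spans most of both proteins, so it clears the 50% coverage cutoff by a wide margin.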

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
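
To make "six-frame translation" concrete, here is a minimal sketch that translates a DNA string in all three forward frames and all three frames of the reverse complement, using the standard codon assignments. It is not how GapMind or Curated BLAST is implemented, and the function names are hypothetical. It assumes an uppercase sequence, and codons containing anything other than A, C, G, or T are translated as "X".

```python
from itertools import product

# Standard codon assignments, with bases ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]

def translate(dna: str) -> str:
    """Translate complete codons from the start of dna; '*' marks a stop."""
    usable = len(dna) - len(dna) % 3
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3)
    )

def six_frame(dna: str) -> dict:
    """Map frame labels (+1..+3, -1..-3) to translated protein strings."""
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames

frames = six_frame("ATGAAATAG")
for label, protein in frames.items():
    print(label, protein)
```

Searching all six frames in this way can reveal a coding region that the gene caller missed, which is why a gap in the predicted proteins does not always mean the gene is absent from the genome.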

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory