GapMind for catabolism of small carbon sources

 

Alignments for a candidate for treF in Erythrobacter gangjinensis K7-2

Align α,α-trehalase (MSMEG_4535;MSMEG4528) (EC 3.2.1.28) (characterized)
to candidate WP_047005936.1 AAW01_RS03580 glycoside hydrolase family 15 protein

Query= CAZy::ABK72415.1
         (668 letters)



>NCBI__GCF_001010925.1:WP_047005936.1
          Length = 599

 Score =  225 bits (573), Expect = 5e-63
 Identities = 180/610 (29%), Positives = 277/610 (45%), Gaps = 50/610 (8%)

Query: 48  LSDCETTCLISSAGSVEWLCVPRPDSPSVFGAIL-----DRGAGHFRLGPYGVSVPAARR 102
           + +C+ + L+   G++ W CVPR D    F A+L     D G   F L     S    + 
Sbjct: 13  IGNCQVSGLVDKRGAIVWGCVPRVDGDPTFCALLNGASQDVGVWRFELEGQTAS---HQE 69

Query: 103 YLPGSLILETTWQTHTGWLIVRDALVMGPWHDIDTRSRTHRRTPMDWDAEHILLRTVRCV 162
           Y+  + IL T  +   G  +  + L   P    +   R +R             R VR V
Sbjct: 70  YIRNTPILVTRLEAADGSAV--EVLDFCP--RFEGSGRMYRPVAF--------ARIVRPV 117

Query: 163 SGTVELVMSCEPAFDYHRVSATWEYSGPAYGEAIARASRNPDSHPTLRLTTNLRIG--IE 220
           +G   + +  +P  D+ + +A   +             R   S  ++RL+T+  +G  +E
Sbjct: 118 AGNPRIRVVLKPMRDWGQAAAETTHG--------TNHIRYLMSGQSMRLSTDAAVGYILE 169

Query: 221 GREARARTRLTEGDNVFVALSWSKHPAPQTYEEAADKMWK-TSEAWRQWINVGDFPDHPW 279
           GR  R    + E  + F+       P      E   +M + T   W+ W      P   W
Sbjct: 170 GRTFR----IEEDTHFFLG---PDEPFVGNLREQVRRMEQSTRRYWQLWARSLATP-FEW 221

Query: 280 RAYLQRSALTLKGLTYSPTGALLAAPTTSLPETPQGERNWDYRYSWIRDSTFALWGLYTL 339
           +  + R+A+TLK   +  TGA++AA TTS+PE P  ERNWDYRY WIRDS + +  L  L
Sbjct: 222 QQEVIRAAITLKLCQHEETGAIVAALTTSIPEAPGSERNWDYRYCWIRDSYYTVQALNRL 281

Query: 340 GLDREADDFFSFIADVSGANNGERHPLQVMYGVGGERSLVEEELHHLSGYDNSRPVRIGN 399
           G     + +  F+ ++   +N +   +Q +Y V G   L E     L+GY    PVR+GN
Sbjct: 282 GALDVLEKYLGFLRNL--VDNAKGGQIQPLYSVMGVAELTENTAGSLAGYRGMGPVRVGN 339

Query: 400 GAYNQRQHDIWG-TMLDSV--YLHAKSREQIPDALWPVLKNQVEEAIKHWKEPDRGIWEV 456
            AY Q QHD +G  +L +V  +L  +      D  +  L+   E A     +PD G+WE 
Sbjct: 340 AAYKQVQHDAYGQIVLPTVQGFLDRRLLRMADDRDFESLEEVGEMAWSMHDQPDAGLWEF 399

Query: 457 RGEPQHFTSSKIMCWVALDRGSKLAELQGEKSYAQQWRAIAEEIKADVLARGVDKRGV-- 514
           R   +  T S +M W A DR +  A   G++  A  W   A+ I+  + AR   + G   
Sbjct: 400 RTRQEVHTYSAVMSWAACDRLATAAHYLGKQDRAHFWSDRADTIRDTIEARAWKENGEGG 459

Query: 515 -LTQRYGDDALDASLLLAVLTRFLPADDPRIRATVLAIADELTEDGLVLRYRVEETDDGL 573
                +  D LDASLL  +  R++  DD R   T   +  +L     +LRY  E   D  
Sbjct: 460 HYGASFESDYLDASLLQLLELRYVTPDDERFEQTFAMVERDLRRGEHMLRYAAE---DDF 516

Query: 574 AGEEGTFTICSFWLVSALVEIGEISRAKHLCERLLSFASPLHLYAEEIEPRTGRHLGNFP 633
              E  F IC+FWL+ AL  +G    A+ L   +L+  +   L +E+++  TG   GNFP
Sbjct: 517 GAPETAFNICTFWLIEALALMGRKDEARELFCTMLAHRTGSGLLSEDMDFETGELWGNFP 576

Query: 634 QAFTHLALIN 643
           Q ++ + +IN
Sbjct: 577 QTYSLVGIIN 586


Lambda     K      H
   0.319    0.135    0.426 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 1063
Number of extensions: 60
Number of successful extensions: 6
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 668
Length of database: 599
Length adjustment: 38
Effective length of query: 630
Effective length of database: 561
Effective search space:   353430
Effective search space used:   353430
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 54 (25.4 bits)
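
The reported bit score, E-value, and effective search space can be rechecked from the parameters printed in the report. The sketch below assumes the standard Karlin-Altschul relations (bit score = (lambda * S - ln K) / ln 2, and E = effective search space * 2^(-bit score)) and uses the gapped Lambda and K values, since the alignment is gapped. The function names are illustrative and are not part of GapMind or BLAST. Because the report prints rounded parameters, the recomputed values agree with the report only approximately.

```python
import math

# Values copied from the BLAST report above.
RAW_SCORE = 573            # raw score shown in parentheses after "225 bits"
GAPPED_LAMBDA = 0.267      # "Gapped" Lambda
GAPPED_K = 0.0410          # "Gapped" K
QUERY_LEN = 668            # Length of query
DB_LEN = 599               # Length of database (a single sequence here)
LENGTH_ADJUSTMENT = 38     # Length adjustment


def effective_search_space(query_len, db_len, adjustment):
    """Product of the effective query and database lengths."""
    return (query_len - adjustment) * (db_len - adjustment)


def bit_score(raw_score, lam, k):
    """Convert a raw alignment score to a normalized bit score."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def expect_value(search_space, bits):
    """Expected number of chance hits scoring at least `bits`."""
    return search_space * 2.0 ** (-bits)


space = effective_search_space(QUERY_LEN, DB_LEN, LENGTH_ADJUSTMENT)
bits = bit_score(RAW_SCORE, GAPPED_LAMBDA, GAPPED_K)
evalue = expect_value(space, bits)

print(f"effective search space: {space}")
print(f"bit score: {bits:.1f}")
print(f"E-value: {evalue:.1e}")
```

With these inputs the search space reproduces the 353430 shown in the report, and the bit score and E-value land close to the reported 225 bits and 5e-63.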

This GapMind analysis is from Sep 24 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."
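
As a worked example of the identity and coverage figures these rules depend on, the sketch below recomputes them for the alignment shown above. It assumes coverage means the fraction of a sequence's residues spanned by the alignment, and that the 50% floor is checked against the characterized protein (the query here). Both are assumptions made for illustration, and the variable and function names are hypothetical. It deliberately does not assign a confidence level.

```python
# Values copied from the alignment above.
IDENTITIES = 180
ALIGNMENT_LEN = 610
GAPS = 50
QUERY_START, QUERY_END, QUERY_LEN = 48, 643, 668   # characterized treF protein
SBJCT_START, SBJCT_END, SBJCT_LEN = 13, 586, 599   # candidate WP_047005936.1


def aligned_residues(start, end):
    """Number of residues spanned by an alignment range (inclusive)."""
    return end - start + 1


def coverage(start, end, length):
    """Fraction of a sequence covered by the alignment range."""
    return aligned_residues(start, end) / length


percent_identity = 100.0 * IDENTITIES / ALIGNMENT_LEN
query_coverage = coverage(QUERY_START, QUERY_END, QUERY_LEN)
sbjct_coverage = coverage(SBJCT_START, SBJCT_END, SBJCT_LEN)

# Sanity check: gap columns are alignment columns where one sequence has no
# residue, so both aligned spans plus the gaps fill two rows of the alignment.
gap_columns = 2 * ALIGNMENT_LEN - (
    aligned_residues(QUERY_START, QUERY_END)
    + aligned_residues(SBJCT_START, SBJCT_END)
)

# Assumed reading of the 50% coverage floor stated in the text above.
meets_coverage_floor = query_coverage >= 0.5

print(f"identity: {percent_identity:.1f}%")
print(f"query coverage: {query_coverage:.1%}")
print(f"subject coverage: {sbjct_coverage:.1%}")
print(f"gap columns: {gap_columns} (report says {GAPS})")
print(f"meets 50% coverage floor: {meets_coverage_floor}")
```

The gap count recovered from the coordinates matches the 50 gaps in the report, which is a quick way to confirm the start and end positions were read correctly.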

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory