GapMind for catabolism of small carbon sources

 

Alignments for a candidate for cbp in Pandoraea thiooxydans ATSB16

Align cellobiose phosphorylase (EC 2.4.1.20) (characterized)
to candidate WP_083566737.1 PATSB16_RS08730 phosphorylase

Query= BRENDA::Q9X2G3
         (813 letters)



>NCBI__GCF_001931675.1:WP_083566737.1
          Length = 2916

 Score =  327 bits (838), Expect = 6e-93
 Identities = 239/840 (28%), Positives = 387/840 (46%), Gaps = 75/840 (8%)

Query: 4    GYFDDVNREYVITT---PQTPYPWINYLGTEDFFSIISHMAGGYCFYKDARLRRITRFRY 60
            G F    REYV       +TP PWIN +    F   +S    GY + ++++  ++T +  
Sbjct: 2117 GGFASQGREYVTVLGPDQRTPAPWINVIANPAFGFQVSESGAGYTWSENSQANQLTPWS- 2175

Query: 61   NNVPTDAGGRYFYIRE-ENGDFWTPTWMPVRKDLSFFEARHGLGYTKITGERNGLRATIT 119
            N+   D  G  FY+R+ E G+ W PT +P+R + + + ARHG GY++     +G+ + + 
Sbjct: 2176 NDPVCDTPGEVFYLRDDETGELWAPTALPIRIENTRYIARHGHGYSRFQHSSHGVMSELL 2235

Query: 120  YFVPRHFTGEVHYLVLENKAEKPRKIKLFSFIEFCLWNALDDMTNFQRNYSTGEVEIEGS 179
             FV      ++  L LE+ + +PRK+ +  + E+ L  +      F       E++ +  
Sbjct: 2236 QFVSWSDPVKISVLTLESLSSRPRKLSVTGYAEWVLGTSRAASAPF----IVTEMDAQTG 2291

Query: 180  VIYHKTEYRER-RNHYAFYSVNQPIDGFDTDRESFIGLYSGFEAPQAVVEGKP-RNSVAS 237
             ++    +        AF         +  DR  FIG     E P A+       N   +
Sbjct: 2292 AMFAANPWNAGFGKRIAFVDWVGRKTSWTGDRTEFIGRNGTLEQPAALAGSAGLSNRTGA 2351

Query: 238  GWAPIASHYLEIELAPSEKKELIFILGYVENPEEEKWEKPGVINKKRAKEMIEKFKTGED 297
            G+ P ++    +ELAP ++ +L F+LG   +             +  A+++I +++  + 
Sbjct: 2352 GFDPCSALQTAVELAPGQRVQLTFLLGQTAD-------------RPAARQLIGRYRALDP 2398

Query: 298  VEHALKELREYWDDLLGRIQVETHDEKLNRMVNIWNQYQCMVTFNISRSASYFESGISRG 357
             E  L E+   WD +L ++QVET D   + M+N W  YQ +     +R+A Y  SG    
Sbjct: 2399 TE-VLAEVNAQWDRILTKVQVETPDRATDLMLNGWLLYQVLACRMWARAAFYQASG---A 2454

Query: 358  IGFRDSNQDILGFVHMIPEKARQRILDLASIQFEDGSTYHQFQPLTKKGNNEIGGGFNDD 417
             GFRD  QD +      P+ AR  +LD A+ QF +G   H + P   +G   +    +DD
Sbjct: 2455 YGFRDQLQDCMALNMARPDLARAHLLDAAARQFVEGDVQHWWHPQQGRG---VRTHISDD 2511

Query: 418  PLWLILSTSAYIKETGDWSILGEEVPFDNDPNKKA----------------SLFEHLKRS 461
             LWL    + Y+  T D ++L E +PF   P   A                +LFEH  R+
Sbjct: 2512 RLWLPYVVAQYVSVTADAAVLDEGLPFLEGPAVPAEHEDAYYAPKVSAQTGTLFEHCVRA 2571

Query: 462  FYFTVNNLGPHGLPLIGRADWNDCLNLNCFSKNPDESFQTTVNALDGRVAESVFIAGLFV 521
               ++ N G HGLPL+G  DWND +N                        ESV++     
Sbjct: 2572 IDCSLAN-GAHGLPLMGGGDWNDGMN----------------RVGQAGKGESVWLGWFLY 2614

Query: 522  LAGKEFVEICKRRGLEEEAREAEKHVNKMIETTLKYGWDGEWFLRAYDAFGRKVGSKECE 581
                +F  +   RG         KH   +     K  WDG W+ RAY   G  +GS    
Sbjct: 2615 STIAKFSVLAAARGEHACVERWHKHAAALRAALKKDAWDGAWYRRAYFDDGTPLGSSANA 2674

Query: 582  EGKIFIEPQGMCVMAGIGVDNGYAEKALDSVKKYLDTPYG--LVLQQPAYSRYYIELGEI 639
            E +I    Q   V++G   +     +A+ SV++YL  P    ++L  P + +   + G I
Sbjct: 2675 ECRIDSLAQSWSVISG-AAELARQRRAMASVEQYLIRPGDDLVLLLAPPFDKTPYDPGYI 2733

Query: 640  SSYPPGYKENAGIFCHNNPWVAIAETVIGRGDRAFEIYRKITPA-YLEDISEIH--RTEP 696
              Y PG +EN G + H   W  IA  ++G GDRA ++ + + P  +    + +H  + EP
Sbjct: 2734 KGYLPGVRENGGQYTHAAAWCMIAYAMLGDGDRAGDLLKMLNPVNHASTRAGVHAYKVEP 2793

Query: 697  YVYAQMVAGKDAPRHGEAKN-SWLTGTAAWSFVAITQHILGIRPTYDSLVVDPCIPKEWE 755
            YV A  +    AP H      +W TG+A W +    + +LG++   ++L VDPCIP++W 
Sbjct: 2794 YVVAGDIYA--APAHVRRGGWTWYTGSAGWLYRGGLESVLGLQKHGENLTVDPCIPRDWR 2851

Query: 756  GFRITRKFRGSIYDITVKNPSHVSKGVKEIIVDGKKIEGQV--LPVFEDGKVHRVEVVMG 813
             FR+  +   + Y ITV NP  VSKGV  + +DG  +      + + +DG++HRV V++G
Sbjct: 2852 SFRLDYRHGATHYLITVDNPQGVSKGVARVELDGVPLPSNTHSVALVDDGQLHRVLVMLG 2911



 Score = 55.8 bits (133), Expect = 3e-11
 Identities = 51/217 (23%), Positives = 91/217 (41%), Gaps = 11/217 (5%)

Query: 55   ITRFRYNNVPTDAGGRYFYIREE-NGDFWTPTWMPVRKDLSFFEARHGLGYTKITGERNG 113
            +TR+R  +V  DA G Y ++R+  +GD W+ T+ P+  +   +E        +       
Sbjct: 1658 VTRWR-EDVTCDAWGSYIFLRDTASGDVWSATYQPLGLEPDRYEVVFSEDRARFVRHDGT 1716

Query: 114  LRATITYFVPRHFTGEVHYLVLENKAEKPRKIKLFSFIEFCLWNALDDMTN--FQRNYST 171
            L   +   V      E+  L L N   +  +I+L S+ E  L     D T+  F   +  
Sbjct: 1717 LSTCLEIIVSPEDNAEIRRLTLTNSGAQAVEIELTSYAEVVLAPMAADATHPAFSNLFIH 1776

Query: 172  GEV--EIEGSVIYHK----TEYRERRNHYAFYSVNQPIDGFDTDRESFIGLYSGFEAPQA 225
             E   E+ G +   +    T+      H           G++TDR  F+G       P A
Sbjct: 1777 TEYLPEVHGLLAMRRPHSATDAPVWAAHILAGGSRSENVGYETDRARFLGRGHPIRDPVA 1836

Query: 226  VVEGKP-RNSVASGWAPIASHYLEIELAPSEKKELIF 261
            +++G+P  N+V S   P+ S    +++A     +++F
Sbjct: 1837 IMDGRPLSNTVGSVLDPVLSLRTRVQVAAGATADILF 1873


Lambda     K      H
   0.320    0.139    0.431 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 5509
Number of extensions: 265
Number of successful extensions: 12
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 3
Number of HSP's successfully gapped: 3
Length of query: 813
Length of database: 2916
Length adjustment: 51
Effective length of query: 762
Effective length of database: 2865
Effective search space:  2183130
Effective search space used:  2183130
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.8 bits)
S2: 60 (27.7 bits)
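
The bit scores and E-values in the report can be cross-checked against the statistical parameters listed above. The sketch below uses the standard Karlin-Altschul relations (bit score = (lambda * S - ln K) / ln 2, and E = K * m * n * exp(-lambda * S), with m * n the effective search space). It assumes the gapped Lambda and K values are the ones that apply to these alignments; the function names are illustrative and are not part of GapMind or BLAST.

```python
import math

# Gapped Karlin-Altschul parameters copied from the report above
# (assumption: these, not the ungapped values, apply to the reported HSPs).
LAMBDA = 0.267
K = 0.0410


def effective_search_space(query_len, db_len, length_adjustment):
    """Product of the effective query and database lengths."""
    return (query_len - length_adjustment) * (db_len - length_adjustment)


def bit_score(raw_score):
    """Convert a raw alignment score to a normalized bit score."""
    return (LAMBDA * raw_score - math.log(K)) / math.log(2)


def expect_value(raw_score, search_space):
    """Expected number of chance alignments scoring at least raw_score."""
    return K * search_space * math.exp(-LAMBDA * raw_score)


# Query length, database length, and length adjustment from the report.
space = effective_search_space(813, 2916, 51)
print("effective search space:", space)

# Raw scores of the two alignments shown above (838 and 133).
for raw in (838, 133):
    print(f"raw={raw} bits={bit_score(raw):.1f} E={expect_value(raw, space):.0e}")
```

Because Lambda and K are printed to only three significant figures, the recomputed values should agree with the reported ones only to within rounding.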

This GapMind analysis is from Sep 24 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMER with enzyme models (usually from TIGRFAMs). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."
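
As a rough illustration of the 50% coverage condition, the sketch below computes percent identity and query coverage for the alignments shown earlier on this page. It assumes that "coverage" means the fraction of the characterized query protein that is aligned; the text here does not define the term, and the helper names are hypothetical rather than GapMind's own code.

```python
def query_coverage(hsp_ranges, query_len):
    """Fraction of query positions covered by at least one alignment.

    hsp_ranges holds 1-based, inclusive (start, end) query coordinates.
    Overlapping alignments are counted only once.
    """
    covered = set()
    for start, end in hsp_ranges:
        covered.update(range(start, end + 1))
    return len(covered) / query_len


def percent_identity(identities, alignment_len):
    """Identical positions as a percentage of the alignment length."""
    return 100.0 * identities / alignment_len


# Query coordinates of the two alignments above: 4-813 and 55-261.
hsps = [(4, 813), (55, 261)]
cov = query_coverage(hsps, 813)
ident = percent_identity(239, 840)  # "Identities = 239/840" for the first alignment

print(f"coverage={cov:.3f} identity={ident:.1f}%")
print("meets 50% coverage condition:", cov >= 0.5)
```

Under this assumption the candidate clears the 50% coverage condition. Whether it also qualifies as medium or high confidence depends on criteria that are not listed in the text above.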

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory