GapMind for catabolism of small carbon sources


Definition of D-cellobiose catabolism


# MetaCyc does not list any pathways for cellobiose utilization, but the major catabolic enzymes are
# believed to be intracellular cellobiase, periplasmic cellobiase,
# cellobiose-6-phosphate hydrolase, or
# cellobiose phosphorylase (PMID:28535986).
# These pathways all lead to glucose-6-phosphate, which is a central metabolic intermediate.
# There also may be a 3-ketoglucoside pathway in some Bacteroidetes, but this is not characterized.

# ABC transporters:

# 5-member system from Thermotoga maritima (TM0027:TM0031)
TM0031	cellobiose ABC transporter, substrate-binding component	curated:TCDB::Q9WXN8
TM0030	cellobiose ABC transporter, permease component 1	curated:TCDB::Q9WXN7	ignore:TCDB::Q9X0U9
TM0029	cellobiose ABC transporter, permease component 2	curated:TCDB::Q9WXN6	ignore:TCDB::Q9X0U8
TM0028	cellobiose ABC transporter, ATPase component 1	curated:TCDB::Q9WXN5	ignore:TCDB::Q9X0U7
TM0027	cellobiose ABC transporter, ATPase component 2	curated:TCDB::Q9WXN4	ignore:TCDB::Q9X0U6

# Transporters and PTS systems were identified using
# query: transporter:cellobiose:D-cellobiose:CPD-15976:beta-glucoside:beta-glucosides.
# But, some of the beta-glucoside transporters might not be cellobiose transporters.
# Omitted TCDB 4.A.1.2.15 = LGAS_1755 = ART98417 (see curated BLAST for PTS vs. GCF_000014425.1), "PTS 19CBA" in PMC2848229 (Francl et al 2010);
# this was not actually shown to be a cellobiose transporter
# (and, the main one in this organism is TC 4.A.1.2.14 = LGAS_1669).
# Omitted TM1219:TM1223, listed as a probable cellobiose porter:
#    the SBP TM1223 is reported to bind beta,1-4-mannobiose only (PMC1392961).
cellobiose-transport: TM0031 TM0030 TM0029 TM0028 TM0027

# cbtABCDF from Sulfolobus solfataricus
cbtA	cellobiose ABC transporter, substrate-binding component CbtA	curated:TCDB::Q97VF7
cbtB	cellobiose ABC transporter, permease component 1 (CbtB)	curated:TCDB::Q97VF8
cbtC	cellobiose ABC transporter, permease component 2 (CbtC)	curated:TCDB::Q97VF6
cbtD	cellobiose ABC transporter, ATPase component 1 (CbtD)	curated:TCDB::Q97VF5
cbtF	cellobiose ABC transporter, ATPase component 2 (CbtF)	curated:TCDB::Q97VF4
cellobiose-transport: cbtA cbtB cbtC cbtD cbtF

# cebEFG-msiK from Streptomyces reticuli
cebE	cellobiose ABC transporter, substrate-binding component CebE	curated:TCDB::Q9X9R7
cebF	cellobiose ABC transporter, permease component 1 (CebF)	curated:TCDB::Q9X9R6
cebG	cellobiose ABC transporter, permease component 2 (CebG)	curated:TCDB::Q9X9R5
# Ignore msiK from S. coelicolor (msiK is shared across ABC transporters)
msiK	cellobiose ABC transporter, ATPase component	curated:TCDB::P96483	ignore:SwissProt::Q9L0Q1
cellobiose-transport: cebE cebF cebG msiK

# 4-member transporter from Sinorhizobium meliloti (SMc04256:SMc04259)
SMc04256	cellobiose ABC transporter, ATPase component	curated:reanno::Smeli:SMc04256
SMc04257	cellobiose ABC transporter, permease component 1	curated:reanno::Smeli:SMc04257
SMc04258	cellobiose ABC transporter, permease component 2	curated:reanno::Smeli:SMc04258
SMc04259	cellobiose ABC transporter, substrate-binding protein	curated:reanno::Smeli:SMc04259
cellobiose-transport: SMc04256 SMc04257 SMc04258 SMc04259

# Clostridium thermocellum has two ABC transporters with cellobiose-binding SBPs,
# see PMID:18952792 (Nataf et al 2009).
# cbpB = gi 125973535 = WP_011837947.1 = A3DE73 = Cthe_1020
# cbpC = gi 125974617 = WP_003516700.1 = Cthe_2128 = A3DHA5
# The ATPase component of these systems is not clear, but
# cbpB is close to permease components Cthe_1019=msdB1=A3DE72 and Cthe_1018=msdB2=A3DE71,
# cbpC is close to permease components Cthe_2126=A3DHA3 and Cthe_2125=A3DHA2
# (the naming of those components by Nataf et al is ambiguous)
cbpB	cellobiose ABC transporter, substrate-binding component CbpB	uniprot:A3DE73
msdB1	cellobiose ABC transporter, permease component 1 (MsdB1)	uniprot:A3DE72
msdB2	cellobiose ABC transporter, permease component 2 (MsdB2)	uniprot:A3DE71
cellobiose-transport: cbpB msdB1 msdB2

cbpC	cellobiose ABC transporter, substrate-binding component CbpC	uniprot:A3DHA5
msdC1	cellobiose ABC transporter, permease component 1 (MsdC1)	uniprot:A3DHA3
msdC2	cellobiose ABC transporter, permease component 2 (MsdC2)	uniprot:A3DHA2
cellobiose-transport: cbpC msdC1 msdC2

# PTS systems:

# Lactobacillus gasseri has EII-BCA, and E. coli has EII-BCA (bglG).
# E. coli also has a similar system, EII-BC (ascF); the EII-A component is not known.
# A single-a.a. mutation to a EII-BCA from Corynebacterium glutamicum (TC 4.A.1.2.5 / Q8GGK3) allows it to transport cellobiose,
# so that is included as well.
bglG	cellobiose PTS system, EII-BC or EII-BCA components	curated:SwissProt::P24241	curated:TCDB::ART98499	curated:TCDB::P08722	curated:TCDB::Q8GGK3

cellobiose-PTS: bglG

# E. coli also has chbABC (separate EII-A, EII-B and EII-C components), and
# Lactococcus lactis has a similar system (ptcA-ptcB-celB).
# Separate EIIABC proteins are also known in Streptococcus pneumoniae, see PMC3302838.
# (SP0308 = celC = EIIA = A0A0H2UNC0; Sp0305 = celB = EIIB = A0A0H2UNC2; SP0310 = celD = EIIC = A0A0H2UNC6.)
# Ignore a close homolog from L. lactis cremoris (annotated as galactose-specific) and systems from Klebsiella pneumoniae
# and Serratia marcescens that may transport cellobiose.
celEIIA	cellobiose PTS system, EII-A component	curated:SwissProt::P69791	curated:SwissProt::Q9CIE9	uniprot:A0A0H2UNC0	ignore:SwissProt::A2RIE7	ignore:TCDB::C4X746
celEIIB	cellobiose PTS system, EII-B component	curated:BRENDA::P69795	curated:SwissProt::Q9CIF0	uniprot:A0A0H2UNC2	ignore:SwissProt::A2RIE6	ignore:TCDB::C4X744	ignore:TCDB::Q8L3C3
celEIIC	cellobiose PTS system, EII-C component	curated:SwissProt::P17334	curated:SwissProt::Q9CJ32	uniprot:A0A0H2UNC6	ignore:TCDB::C4X745	ignore:BRENDA::Q8L3C2

cellobiose-PTS: celEIIA celEIIB celEIIC

# Homomeric transporters:

# Combine cdt-1 and cdt-2, which are distantly related
cdt	cellobiose transporter cdt-1/cdt-2	curated:TCDB::Q7SCU1	curated:TCDB::Q7SD12
cellobiose-transport: cdt

# Gene name is from Rodionov et al 2010 (PMC2996990)
bglT	cellobiose transporter BglT	curated:reanno::SB2B:6937231
cellobiose-transport: bglT

# Ignore the porin bglH and the efflux transporter setA

# glk is glucokinase
import glucose.steps:glucose-utilization glk

# AAB62870.1 is annotated in CAZy as a beta-glucosidase,
# but I was unable to find any data supporting this annotation.
# In PMID:9811648, the authors of the genbank entry report knocking out this gene
# (which they call bglA), but they did not report a phenotype.
# This protein is 78% identical to BT3567, which
# cleaves beta,1-2 and beta,1-3 linkages and has little activity on cellobiose (see PMID:28343388);
# also, BT3567 is important for growth on laminaribiose but not cellobiose.
bgl	cellobiase	EC:	ignore:CAZy::AAB62870.1

# Cellobiose may be cleaved to two glucose in the periplasm (by bgl).
all: bgl glucose-utilization

# Or, after transport, cellobiose may be cleaved in the cytoplasm and then the glucose is phosphorylated by glk.
all: cellobiose-transport bgl glk

cbp	cellobiose phosphorylase	EC:
import galactose.steps:pgmA

# Or cellobiose is cleaved by cellobiose phosphorylase (cbp),
# into glucose and alpha-glucose-1-phosphate;
# alpha-phosphoglucomutase (pgmA) converts the glucose-1-P to glucose-6-phosphate
# and glucokinase converts glucose to glucose-6-phosphate.
all: cellobiose-transport cbp pgmA glk

# Q9AI65 from Erwinia rhapontici seems to be misannotated in BRENDA, as does
# Q1JK37 from Streptococcus pyogenes.
ascB	6-phosphocellobiose hydrolase	EC:	ignore:BRENDA::Q9AI65	ignore:BRENDA::Q1JK37

# Or, after transport and phosphorylation by a PTS system,
# 6-phosphocellobiose hydrolase forms glucose and glucose-6-phosphate,
# and glucokinase converts the glucose to glucose-6-phosphate.
all: cellobiose-PTS ascB glk
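
The definition above mixes two kinds of lines: tab-delimited step lines (name, description, then attributes such as `curated:` or `ignore:`) and rule lines of the form `name: part part ...`, where repeating a rule name gives alternatives. The following Python is a minimal sketch of how such a file could be read and how the `all` rule could be expanded into its alternative paths. It is not GapMind's own parser; the function names `parse_definitions` and `expand` are hypothetical, and `import` lines are simply skipped, so imported names are treated as leaves.

```python
from itertools import product


def parse_definitions(text):
    """Split a steps file into step records and rules (illustrative only)."""
    steps, rules = {}, {}
    for line in text.splitlines():
        line = line.strip()
        # Assumption: comments, blank lines and imports are skipped in this sketch.
        if not line or line.startswith("#") or line.startswith("import "):
            continue
        if "\t" in line:
            # Step line: name, description, then zero or more attributes.
            name, desc, *attrs = line.split("\t")
            steps[name] = {"desc": desc, "attrs": attrs}
        else:
            # Rule line: repeated names accumulate as alternatives.
            name, _, parts = line.partition(":")
            rules.setdefault(name.strip(), []).append(parts.split())
    return steps, rules


def expand(name, rules):
    """Return every alternative list of steps that satisfies a rule."""
    if name not in rules:
        return [[name]]  # a step (or an imported name) is a leaf
    paths = []
    for alternative in rules[name]:
        # Each part may expand to several alternatives; combine them all.
        for combo in product(*(expand(part, rules) for part in alternative)):
            paths.append([step for sub in combo for step in sub])
    return paths


# Abbreviated excerpt of the definition above.
EXAMPLE = "\n".join([
    "# Homomeric transporters:",
    "cdt\tcellobiose transporter cdt-1/cdt-2\tcurated:TCDB::Q7SCU1\tcurated:TCDB::Q7SD12",
    "cellobiose-transport: cdt",
    "bglT\tcellobiose transporter BglT\tcurated:reanno::SB2B:6937231",
    "cellobiose-transport: bglT",
    "all: bgl glucose-utilization",
    "all: cellobiose-transport bgl glk",
])

steps, rules = parse_definitions(EXAMPLE)
paths = expand("all", rules)
for path in paths:
    print(" ".join(path))
```

On this excerpt the sketch prints three paths: `bgl glucose-utilization`, `cdt bgl glk`, and `bglT bgl glk`. Run on the full definition, the same expansion would yield one path per combination of transporter system and catabolic route.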




About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."
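
The tiers above are checked in order, which can be sketched in Python as follows. This is only an illustration of the ordering, not GapMind's code. The arguments `meets_high` and `meets_medium` are hypothetical placeholders for the high- and medium-confidence criteria, which are not spelled out here; the only threshold taken from the text is the 50% coverage cutoff for low confidence.

```python
def confidence(meets_high, meets_medium, coverage, low_min_coverage=0.5):
    """Assign a confidence tier to one candidate (illustrative sketch).

    meets_high / meets_medium: assumed booleans saying whether the candidate
    satisfies the high- or medium-confidence criteria (not defined here).
    coverage: alignment coverage of the hit as a fraction between 0 and 1.
    """
    if meets_high:
        return "high"
    if meets_medium:
        return "medium"
    if coverage >= low_min_coverage:
        return "low"
    return None  # not reported as a candidate at all


print(confidence(False, True, 0.9))
print(confidence(False, False, 0.6))
print(confidence(False, False, 0.3))
```

The three calls return `medium`, `low`, and `None` respectively.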

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
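
The notion of a gap described above can be made concrete with a small Python sketch. Given the best confidence tier found for each step, a gap is any step in a path with no high- or medium-confidence candidate. The helper names `find_gaps` and `fewest_gaps` and the per-step results are hypothetical, and choosing the path with the fewest gaps is just one simple way to compare alternatives; GapMind's actual scoring may differ.

```python
def find_gaps(path, best_confidence):
    """Steps in a path that lack a high- or medium-confidence candidate."""
    return [step for step in path
            if best_confidence.get(step) not in ("high", "medium")]


def fewest_gaps(paths, best_confidence):
    """Pick the alternative path with the smallest number of gaps."""
    return min(paths, key=lambda p: len(find_gaps(p, best_confidence)))


# Hypothetical best tier per step for one genome (not real data).
best_confidence = {
    "celEIIA": "high", "celEIIB": "medium", "celEIIC": "low",
    "ascB": "high", "bglT": "high", "bgl": "medium",
}

pts_path = ["celEIIA", "celEIIB", "celEIIC", "ascB", "glk"]
bglT_path = ["bglT", "bgl", "glk"]

print(find_gaps(pts_path, best_confidence))
print(find_gaps(bglT_path, best_confidence))
print(fewest_gaps([pts_path, bglT_path], best_confidence))
```

With these made-up inputs, the PTS path has two gaps (`celEIIC`, which has only a low-confidence candidate, and `glk`, which has none) and the `bglT` path has one (`glk`), so the second path is selected.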

For more information, see the paper from 2019 on GapMind for amino acid biosynthesis, the paper from 2022 on GapMind for carbon sources, or view the source code.

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory