GapMind for catabolism of small carbon sources


Definition of D-xylose catabolism


# Xylose degradation in GapMind is based on MetaCyc pathways
# I via D-xylulose (metacyc:XYLCAT-PWY),
# II via xylitol (metacyc:PWY-5516),
# III or V via 2-dehydro-3-deoxy-D-arabinonate (DKDP) dehydratase (metacyc:PWY-6760, metacyc:PWY-8020),
# IV via DKDP aldolase (metacyc:PWY-7294),
# as well as another pathway via DKDP dehydrogenase (PMC6336799).

# monomeric transporters include xylT-like proteins, and also Gal2, GlcP, but not XYLP_LACPE
# TCDB also mentions MA6T (TC 2.A.1.1.10; P15685) but I did not find a reference linking this to xylose uptake

# Fitness data identified CCNA_00857 (CC0814; A0A0H3C6H3) as the xylose transporter in Caulobacter crescentus,
# consistent with a previous report (PMC2168598).
xylT	D-xylose transporter	curated:SwissProt::O52733	curated:CharProtDB::CH_091400	curated:CharProtDB::CH_091493	curated:CharProtDB::CH_109760	curated:SwissProt::P0AE24	curated:SwissProt::P96710	curated:TCDB::C4B4V9	curated:TCDB::Q0WWW9	curated:TCDB::Q2MDH1	curated:TCDB::Q2MEV7	curated:TCDB::Q64L87	curated:TCDB::Q9XIH7	uniprot:A0A0H3C6H3

gal2	galactose/glucose/xylose uniporter	curated:CharProtDB::CH_091029

glcP	glucose/mannose/xylose:H+ symporter	curated:SwissProt::O07563

# Echvi_1871 (L0FZF3) seems to be a xylose transporter as well as a glucose/galactose transporter.
Echvi_1871	sodium/xylose cotransporter	uniprot:L0FZF3

# ABC transporters include xylFGH from several organisms (T. maritima has a diverged SBP),
# another system from T. maritima (xylE_Tm xylF_Tm XylK_Tm),
# and araVUTS from Sulfolobus solfataricus

# T. maritima has a diverged SBP, Tmari_1858 (uniprot:G4FGN5).
# Tmari_1858 is sometimes annotated as gluE, and is glucose induced.
# But the Km for xylose is quite low, so it is considered a xylose transporter as well.
# Ignore P54083 (sbpA from A. brasilense); it is not known whether it transports xylose.
# A close homolog, HSERO_RS05190, is mildly important for fitness during growth on xylose, so it may transport xylose.
xylF	ABC transporter for xylose, substrate binding component xylF	curated:CharProtDB::CH_003787	curated:TCDB::A6LW10	curated:TCDB::P25548	curated:TCDB::G4FGN5	ignore:SwissProt::P54083

xylG	ABC transporter for xylose, ATP-binding component xylG	curated:SwissProt::P37388	curated:TCDB::A6LW11	curated:TCDB::G4FGN3	curated:TCDB::O05176

xylH	ABC transporter for xylose, permease component xylH	curated:TCDB::A6LW12	curated:CharProtDB::CH_024441	curated:TCDB::G4FGN4	curated:TCDB::O05177

# The ABC transporter in T. maritima described above is TC 3.A.1.2.20.

# TC 3.A.1.2.18 describes another ABC transporter for xylose in T. maritima:
# Q9WXW7 = TM0112 = XylF, the permease subunit;
# Q9WXW9 = TM0114 = XylE, the SBP, reported high affinity for xylose (PMC1392961);
# Q9WXX0 = TM0115 = xylK is the ATP-binding component.
# Also TM0113 is a putative xylanase.
#   (Only the SBP seems to be characterized but the clustering of these genes with each other
#    and other xylose-related genes is quite suggestive.)
# TC 3.A.1.2.18 also includes Q9WXW0 = TM0105, for which TCDB cites Nanavati et al (PMC1392961),
#  but that paper does not mention this gene. Probably a curation error.
xylF_Tm	ABC transporter for xylose, permease component xylF	curated:TCDB::Q9WXW7
xylE_Tm	ABC transporter for xylose, substrate binding component xylE	uniprot:Q9WXW9
xylK_Tm	ABC transporter for xylose, ATP binding component xylK	uniprot:Q9WXX0

# AraVUTS (TC 3.A.1.1.14) from Sulfolobus solfataricus
araV	component of Arabinose, fructose, xylose porter	curated:TCDB::Q97UF2
araU	component of Arabinose, fructose, xylose porter	curated:TCDB::Q97UF3
araT	component of Arabinose, fructose, xylose porter	curated:TCDB::Q97UF4
araS	component of Arabinose, fructose, xylose porter	curated:TCDB::Q97UF5

# GtsABCD from P. fluorescens WCS417 (PS417_22130:PS417_22145) is very important for utilization
# of both glucose and xylose.
# Similar systems in other Pseudomonas have been reported as glucose transporters.
# In P. putida, after enzymes for xylose catabolism were introduced and the strain was
# evolved for growth on xylose, GtsABCD were required for xylose utilization, although there were
# two point mutations in GtsA, so it is not certain if the wild-type P. putida GtsA binds xylose
# efficiently (see PMC3340264). The P. putida system is marked ignore.
# GtsA = PS417_22145 = GFF4324
gtsA	xylose ABC transporter, periplasmic substrate-binding component GtsA	curated:reanno::WCS417:GFF4324	ignore:TCDB::Q88P38

# GtsB = PS417_22140 = GFF4323
gtsB	xylose ABC transporter, permease component 1 GtsB	curated:reanno::WCS417:GFF4323	ignore:TCDB::Q88P37

# GtsC = PS417_22135 = GFF4322
gtsC	xylose ABC transporter, permease component 2 GtsC	curated:reanno::WCS417:GFF4322	ignore:TCDB::Q88P36

# GtsD = PS417_22130 = GFF4321
gtsD	xylose ABC transporter, ATPase component GtsD	curated:reanno::WCS417:GFF4321	ignore:TCDB::Q88P35

# Transporters were identified using
# query: transporter:D-xylose:xylose:CPD-15377
xylose-transport: xylT
xylose-transport: gal2
xylose-transport: glcP
xylose-transport: Echvi_1871

xylose-transport: xylF xylG xylH
xylose-transport: xylF_Tm xylE_Tm xylK_Tm
xylose-transport: araV araU araT araS
xylose-transport: gtsA gtsB gtsC gtsD

# There are also reports of xylose uptake by the mannose PTS system in Lactobacillus
# (S. Chaillou et al, J. Bact. 1999) but with poor affinity.

xylA	xylose isomerase	EC:
# Echvi_1875 (L0FZT0) is annotated as xylulose kinase and has its strongest phenotypes on xylose.
# CA_C2612 (Q97FW4) from Clostridium acetobutylicum was proven to be xylulose kinase (PMC2873477).
# BT0792 (Q8A9M3) has its strongest phenotypes on xylose.
xylB	xylulokinase	EC:	uniprot:L0FZT0	uniprot:Q97FW4	uniprot:Q8A9M3
#rpe	ribulose-phosphate epimerase	EC:
#rpi	ribose-5-phosphate isomerase	EC:

xyrA	xylitol reductase	EC:

# L-iditol 2-dehydrogenases (EC: often act on xylitol as well, so are ignored.
# There are also some xylulose reductases that are annotated without an EC number.
xdhA	xylitol dehydrogenase	EC:	ignore_other:	ignore_other:xylulose reductase

# Watanabe et al 2019 (PMC6336799) show that
# C785_RS00860 = WP_034330287.1 = A0A4R8NY47 is D-xylose dehydrogenase (xdh).
# Another issue is that xdh from Haloferax volcanii (HVO_B0028; D4GP29) is reported to form xylono-1,4-lactone,
# (Sutter et al 2017, PMID:28854683), but this is not reflected in the databases,
# and for some (many?) xylose dehydrogenases,
# it is uncertain which lactone they form.
# So, both forms of D-xylose dehydrogenase are included in "xdh."
xdh	D-xylose dehydrogenase	EC:	EC:	uniprot:A0A4R8NY47

# EC: is the 1,5-lactonase; EC: is the 1,4-lactonase;
# both are included in "xylC."
# xylono-1,4-lactonase is sometimes given EC: (L-arabinono-1,4-lactonase) so ignore those;
# indeed HVO_B0030 = metacyc::MONOMER-20630 has both 1,4-lactonase activities (PMID:28854683).
xylC	xylonolactonase	EC:	EC:	ignore_other:

# Watanabe et al (PMC6336799) show that
# C785_RS00855 = WP_039783171.1 = UPI0004007277 is D-xylonate dehydratase (xad);
#   this is 98% identical to D8IWS7_HERSS, so use that identifier.
xad	D-xylonate dehydratase	EC:	uniprot:D8IWS7_HERSS
# Watanabe et al (PMC6336799) show that
# C785_RS13680 = WP_039786859.1 = UPI00041852D2 is DKDP dehydratase (kdaD).
#   It is 98% identical to HSERO_RS19360, which is included via reannotations.
kdaD	2-keto-3-deoxy-D-arabinonate dehydratase	EC:
dopDH	2,5-dioxopentanonate dehydrogenase	EC:

# A number of EC: enzymes are similar but are promiscuous and are likely
# to have this activity as well. This includes uniprot:Q97U28 which is nearly identical to the
# promiscuous uniprot:KDGA_SACSO (O54288), but is not annotated with this activity.
DKDP-aldolase	2-dehydro-3-deoxy-D-arabinonate aldolase	EC:	ignore_other:

# glycolaldehyde oxidoreductase has multiple subunits and no EC number (uniprot:Q97VI4, uniprot:Q97VI7, uniprot:Q97VI6).
# This is an inference from close homologs from S. acidocaldarius, which
# have demonstrated activity on glyceraldehyde-3-phosphate, glyceraldehyde, and acetaldehyde, but not
# on glycolaldehyde itself, so there's no proof that these genes provide the activity.
# Related enzymes in EC: are promiscuous and may well have this activity, so they are ignored.
aldox-large	(glycol)aldehyde oxidoreductase, large subunit	curated:metacyc::MONOMER-18071	curated:SwissProt::Q4J6M3	ignore_other:
aldox-med	(glycol)aldehyde oxidoreductase, medium subunit	curated:metacyc::MONOMER-18072	curated:SwissProt::Q4J6M6	ignore_other:
aldox-small	(glycol)aldehyde oxidoreductase, small subunit	curated:metacyc::MONOMER-18073	curated:SwissProt::Q4J6M5	ignore_other:

# There's also glycolaldehyde dehydrogenase, EC: (aldA), with a single subunit
aldA	(glycol)aldehyde dehydrogenase	EC:

# The NADP based glyoxylate reductase (EC: is probably biased in the wrong direction
# for glycolate oxidation, so do not include, but ignore homology to it.
gyaR	glyoxylate reductase	EC:	ignore_other:

glycolaldehyde-dehydrogenase: aldA
glycolaldehyde-dehydrogenase: aldox-large aldox-med aldox-small

# Besides the standard enzyme, there's an archaeal enzyme that is sometimes annotated as EC:,
# but that only includes the formation of malyl-CoA, not the cleavage to malate.
glcB	malate synthase	EC:	ignore_other:

# C785_RS13675 = WP_039786858.1 = A0A4P7ABK7 is the DKDP 4-dehydrogenase (PMC6336799).
DKDP-dehydrog	D-2-keto-3-deoxypentoate dehydrogenase	uniprot:A0A4P7ABK7

# C785_RS20550 = WP_039788920.1 = A0A2E7P912 is the HDOP hydrolase (PMC6336799).
# This enzyme is also similar to SM_b21112, thought to be L-2,4-diketo-3-deoxyrhamnonate hydrolase
# and 2,4-dioxopentanoate hydrolase; it is plausible that SM_b21112 acts on HDOP as well.
# Similarly for BPHYT_RS34210 (thought to act on 2,4-diketo-3-deoxy-L-fuconate = L-2,4-diketo-3-deoxyrhamnonate).
# Both of these homologs are ignored.
HDOP-hydrol	5-hydroxy-2,4-dioxopentanonate hydrolase	uniprot:A0A2E7P912	ignore:reanno::BFirm:BPHYT_RS34210	ignore:reanno::Smeli:SM_b21112

# In pathway I, isomerase xylA forms D-xylulose and kinase
# xylB forms D-xylulose 5-phosphate, an intermediate in the pentose phosphate pathway.
all: xylose-transport xylA xylB

# In pathway II, the reductase xyrA forms xylitol, the dehydrogenase xdhA forms D-xylulose,
# and the kinase xylB forms D-xylulose 5-phosphate. (This pathway is only reported in fungi.)
all: xylose-transport xyrA xdhA xylB

# In pathway III or V, dehydrogenase xdh forms xylonolactone, lactonase xylC forms D-xylonate,
# dehydratase xad forms 2-dehydro-3-deoxy-D-arabinonate,
# dehydratase kdaD forms 2,5-dioxopentanoate (also known as α-ketoglutarate semialdehyde), and
# dopDH forms 2-oxoglutarate, an intermediate in the TCA cycle.
# (Pathway III has a 1,4-lactone intermediate, while pathway V has a 1,5-lactone intermediate;
#  GapMind does not distinguish these.)
all: xylose-transport xdh xylC xad kdaD dopDH

# In pathway IV, xdh and xylC form D-xylonate, dehydratase xad forms 2-dehydro-3-deoxy-D-arabinonate (DKDP),
# and an aldolase forms pyruvate and glycolaldehyde; glycolaldehyde is oxidized to glycolate and then to glyoxylate,
# and assimilated by malate synthase (glcB).
all: xylose-transport xdh xylC xad DKDP-aldolase glycolaldehyde-dehydrogenase gyaR glcB

# Alternatively, after DKDP is formed,
# a dehydrogenase forms 5-hydroxy-2,4-dioxopentanonate (HDOP),
# and a hydrolase forms pyruvate and glycolate (PMC6336799);
# the glycolate is oxidized to glyoxylate and converted to malate.
all: xylose-transport xdh xylC xad DKDP-dehydrog HDOP-hydrol gyaR glcB
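
The rule lines above can be read as alternatives: each `all:` line is one way to catabolize xylose, and each `xylose-transport:` line is one way to take it up. The following Python sketch expands such rules into flat lists of steps. It is only an illustration, assuming that repeated rule names denote alternatives and that any name without its own rule is a plain step; `parse_rules` and `expand` are hypothetical helper names, not part of GapMind's source code.

```python
from itertools import product

def parse_rules(text):
    """Collect 'name: part part ...' lines; a repeated name adds an alternative."""
    rules = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "\t" in line:
            continue  # skip blanks, comments, and tab-delimited step lines
        name, sep, rhs = line.partition(":")
        if sep:
            rules.setdefault(name.strip(), []).append(rhs.split())
    return rules

def expand(name, rules):
    """Return every alternative list of steps that satisfies the named rule."""
    if name not in rules:
        return [[name]]  # assumption: a name with no rule is a single step
    paths = []
    for alternative in rules[name]:
        options = [expand(part, rules) for part in alternative]
        for combo in product(*options):
            paths.append([step for chunk in combo for step in chunk])
    return paths

# A small excerpt of the rules shown above.
RULES = """
xylose-transport: xylT
xylose-transport: xylF xylG xylH
all: xylose-transport xylA xylB
all: xylose-transport xyrA xdhA xylB
"""

rules = parse_rules(RULES)
paths = expand("all", rules)
for path in paths:
    print(" ".join(path))
```

With two transport alternatives and two `all:` lines, this excerpt expands to four step lists; the full definition above would yield many more.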



About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.
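
As an illustration of how a step relates to the curated proteins, the step lines in the definition above appear to be tab-delimited: a step name, a description, and then attributes such as `curated:`, `uniprot:`, `EC:`, `ignore:` or `ignore_other:`. The sketch below splits one such line into the sources used to find candidates and the entries that are ignored. The field layout is inferred from the lines shown on this page, and `Step` and `parse_step` are hypothetical names rather than GapMind's own code.

```python
from dataclasses import dataclass, field

@dataclass
class Step:
    name: str
    description: str
    sources: list = field(default_factory=list)  # (kind, value) pairs that define the step
    ignored: list = field(default_factory=list)  # (kind, value) pairs marked ignore / ignore_other

def parse_step(line):
    """Split a tab-delimited step line into name, description, and attributes."""
    fields = line.rstrip("\n").split("\t")
    step = Step(name=fields[0], description=fields[1])
    for attr in fields[2:]:
        # Split on the first colon only, so 'TCDB::P25548' stays intact.
        kind, _, value = attr.partition(":")
        target = step.ignored if kind.startswith("ignore") else step.sources
        target.append((kind, value))
    return step

# Shortened version of the xylF line from the definition above.
line = ("xylF\tABC transporter for xylose, substrate binding component xylF"
        "\tcurated:TCDB::P25548\tignore:SwissProt::P54083")
step = parse_step(line)
print(step.name, step.sources, step.ignored)
```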

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
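
The notion of a gap described above (a step with no high- or medium-confidence candidate) can be sketched in a few lines of Python. The confidence labels per step are made-up inputs here, and choosing the path with the fewest gaps is a simplifying assumption for illustration; it is not a description of how GapMind actually scores pathways. `find_gaps` and `fewest_gaps` are hypothetical names.

```python
def find_gaps(path, best_confidence):
    """Steps whose best candidate is neither high nor medium confidence."""
    return [step for step in path
            if best_confidence.get(step, "none") not in ("high", "medium")]

def fewest_gaps(paths, best_confidence):
    """Pick the path with the fewest gaps (an illustrative criterion only)."""
    return min(paths, key=lambda path: len(find_gaps(path, best_confidence)))

# Two of the step lists implied by the xylose rules.
paths = [
    ["xylT", "xylA", "xylB"],
    ["xylT", "xdh", "xylC", "xad", "kdaD", "dopDH"],
]

# Hypothetical best-candidate confidence per step for some genome.
best_confidence = {"xylT": "low", "xylA": "high", "xylB": "medium", "xdh": "high"}

gaps = find_gaps(paths[0], best_confidence)
best = fewest_gaps(paths, best_confidence)
print("gaps in first path:", gaps)
print("path with fewest gaps:", " ".join(best))
```

In this made-up case the transporter is the only gap in the first path, which is the kind of step one might then check with Curated BLAST against the six-frame translation.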

For more information, see the paper from 2019 on GapMind for amino acid biosynthesis, the paper from 2022 on GapMind for carbon sources, or view the source code.

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory