GapMind for Amino acid biosynthesis

 

Definition of L-tryptophan biosynthesis

As text, or see rules and steps

# Tryptophan biosynthesis in GapMind is based on MetaCyc pathway
# L-tryptophan biosynthesis (metacyc:TRPSYN-PWY), from chorismate,
# glutamine, 5-phosphoribose-1-diphosphate, and serine.

# This component of anthranilate synthase is also known as the alpha subunit  or component I.
# BT0532 (uniprot:Q8AAD3_BACTN) and CA265_RS22890 (uniprot:A0A1X9ZB07_9SPHI) are also component I.
# Similarly for DVU0465 (uniprot:Q72EV3_DESVH) and DvMF_1746 (uniprot:B8DM47_DESVM).
# Some related proteins have EC:2.6.1.85 which should perhaps be ignored.
trpE	anthranilate synthase subunit TrpE	hmm:TIGR00564	hmm:TIGR00565	hmm:TIGR01820	term:anthranilate synthase component 1	term:anthranilate synthase component I	term:anthranilate synthase alpha subunit	uniprot:Q8AAD3_BACTN	uniprot:A0A1X9ZB07_9SPHI	uniprot:Q72EV3_DESVH	uniprot:B8DM47_DESVM	ignore_other:EC 4.1.3.27

# This component of anthranilate synthase is also known as the beta subunit or component II.
# Note that E. coli TrpD is this component fused to anthranilate phosphoribosyltransferase
# (which is described separately).
# Some proteins are components of both anthranilate synthase and 4-amino-4-deoxychorismate synthase (EC:2.6.1.85).
# Similarly, TIGR00566 also matches the amidotransferase component of para-aminobenzoate synthase.
# BT0531 (uniprot:Q8AAD4_BACTN) and CA265_RS22895 (uniprot:A0A1X9ZDE4_9SPHI) are also component II
# Similarly for DVU0466 (uniprot:Q72EV2_DESVH) and DvMF_1745 (uniprot:B8DM46_DESVM).
trpD_1	glutamine amidotransferase of anthranilate synthase	hmm:TIGR00566	term:anthranilate synthase component 2	term:anthranilate synthase component II	term:anthranilate synthase beta subunit	uniprot:Q8AAD4_BACTN	uniprot:A0A1X9ZDE4_9SPHI	uniprot:Q72EV2_DESVH	uniprot:B8DM46_DESVM	ignore_other:EC 4.1.3.27	ignore_other:EC 2.6.1.85

# Often fused to anthranilate synthase so ignore that (i.e., P00500 is labeled in BRENDA as
# a subunit of anthranilate synthase; it is probably anthranilate phosphoribosyltransferase as well)
trpD_2	anthranilate phosphoribosyltransferase	EC:2.4.2.18	ignore_other:EC 4.1.3.27

# Some alphaproteobacteria have a single-protein anthranilate synthase
trpED	anthranilate synthase, alpha proteobacterial clade	hmm:TIGR01815	ignore_other:EC 4.1.3.27

# PRAI is sometimes known as trpF or as part of trpC (if fused to IGPS).
# In Bacteroides thetaiotaomicron, BT0528 (uniprot:Q8AAD7_BACTN) has a PRAI domain (pfam:PF00697), is auxotrophic,
# and clusters with trp synthesis genes
# Similarly, annotate CCNA_03659 (uniprot:A0A0H3CC53_CAUVN), SMc02767 (uniprot:TRPF_RHIME),
# and Ga0059261_0237 (uniprot:A0A1L6JC25_9SPHN).
# Some bacteria have a bifunctional phosphoribosylanthranilate isomerase (EC 5.3.1.24)
# and isomerase in histidine biosynthesis (hisA, EC:5.3.1.16), for instance uniprot:P16250.
# (In both reactions a N-modified-1-amino-ribofuranose-5-phosphate rearranges to a
#  linear N-modified-1-amino-ribulose-5-phosphate.)
# So hits to hisA are ignored.
PRAI	phosphoribosylanthranilate isomerase	EC:5.3.1.24	uniprot:Q8AAD7_BACTN	uniprot:A0A0H3CC53_CAUVN	uniprot:TRPF_RHIME	uniprot:A0A1L6JC25_9SPHN	ignore_other:EC 5.3.1.16

# Sometimes fused to anthranilate synthase subunits or known as trpC (if fused to PRAI).
# Ignore BRENDA::P24920 because annotated as phosphoribosylanthranilate isomerase but is also believed to be IGPS.
# And based on presence in Trp clusters and auxotrophic or cofit phenotypes, annotate somewhat diverged IGPS:
# HP15_3291 (uniprot:E4PQZ8_MARAH), DvMF_1743 (uniprot:B8DM44_DESVM), BT0529 (uniprot:TRPC_BACTN),
# AZOBR_RS07450 (uniprot:A0A560BXT3), Pf1N1B4_2549 (uniprot:A0A166NT80_PSEFL), AO353_07210 (uniprot:A0A0N9WG05_PSEFL).
IGPS	indole-3-glycerol phosphate synthase	EC:4.1.1.48	uniprot:E4PQZ8_MARAH	uniprot:B8DM44_DESVM	uniprot:TRPC_BACTN	uniprot:A0A560BXT3	uniprot:A0A166NT80_PSEFL	uniprot:A0A166NT80_PSEFL	ignore_other:EC 4.1.3.27	ignore:BRENDA::P24920

# This reaction is 4.1.2.8 by itself or part of 4.2.1.20.
# The enzyme from Pyrolobus is not matched by the HMM, but is similar to the characterized trpA
# from Thermococcus (uniprot:Q9YGA9, see PMID:11029579)
# APJEIK_06300 from Pedobacter sp. FW305-3-2-15-E-R2A2 (uniprot:A0A3N7DTH5) can complement a trpA- strain of E. coli (Bradley Biggs).
# GAMENC_07330 from Acidovorax sp. FHTAMBA (uniprot:H0BV16) can complement a trpA- strain of E. coli (Bradley Biggs);
# also it is very similar to Ac3H11_1535, which has auxotrophic phenotypes.
# LRK54_RS01680 from Rhodanobacter sp. FW104-10B01 (uniprot:M4NLA4) can complement a trpA- strain of E. coli (Bradley Biggs);
# also it is similar to N515DRAFT_0116, which has
# auxotrophic pheonotypes.
trpA	indoleglycerol phosphate aldolase	hmm:TIGR00262	curated:BRENDA::Q9YGA9	uniprot:A0A3N7DTH5	uniprot:H0BV16	uniprot:M4NLA4	ignore_other:EC 4.2.1.20	ignore_other:EC 4.1.2.8

# Some members of TIGR01415 might also have this activity.
# The family also includes indole-scavenging "trpB-2" proteins, whose presence generally indicates
# tryptophan synthesis (EC 4.2.1.122).
# The enzyme from Pyrolobus is not matched by the HMM, but is similar to the characterized trpB
# from Sulfolobus (uniprot:P50383, see PMID:11029579) or Saccharolobus (uniprot:Q97TX6)
# "TrpB2" proteins are reported to use phosphoserine (an intermediate in serine synthesis) rather than serine
# as a substrate (PMID:25184516); GapMind does not try to distinguish them from TrpB.
trpB	tryptophan synthase	hmm:TIGR00263	uniprot:P50383	curated:BRENDA::Q97TX6	ignore_other:EC 4.2.1.20	ignore_other:EC 4.2.1.122

# Anthranilate synthase usually has two components, as in E. coli, but single-protein systems are also known.
# (In E. coli, trpD is the second component fused to anthranilate phosphoribosyltransferase.) 
anthranilate-synthase: trpE trpD_1
anthranilate-synthase: trpED

all: anthranilate-synthase trpD_2 PRAI IGPS trpA trpB

Links

Downloads

Related tools

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory