As text, or see rules and steps
# Tryptophan biosynthesis in GapMind is based on MetaCyc pathway # L-tryptophan biosynthesis (metacyc:TRPSYN-PWY), from chorismate, # glutamine, 5-phosphoribose-1-diphosphate, and serine. # This component of anthranilate synthase is also known as the alpha subunit or component I. # BT0532 (uniprot:Q8AAD3_BACTN) and CA265_RS22890 (uniprot:A0A1X9ZB07_9SPHI) are also component I. # Similarly for DVU0465 (uniprot:Q72EV3_DESVH) and DvMF_1746 (uniprot:B8DM47_DESVM). # Some related proteins have EC:2.6.1.85 which should perhaps be ignored. trpE anthranilate synthase subunit TrpE hmm:TIGR00564 hmm:TIGR00565 hmm:TIGR01820 term:anthranilate synthase component 1 term:anthranilate synthase component I term:anthranilate synthase alpha subunit uniprot:Q8AAD3_BACTN uniprot:A0A1X9ZB07_9SPHI uniprot:Q72EV3_DESVH uniprot:B8DM47_DESVM ignore_other:EC 4.1.3.27 # This component of anthranilate synthase is also known as the beta subunit or component II. # Note that E. coli TrpD is this component fused to anthranilate phosphoribosyltransferase # (which is described separately). # Some proteins are components of both anthranilate synthase and 4-amino-4-deoxychorismate synthase (EC:2.6.1.85). # Similarly, TIGR00566 also matches the amidotransferase component of para-aminobenzoate synthase. # BT0531 (uniprot:Q8AAD4_BACTN) and CA265_RS22895 (uniprot:A0A1X9ZDE4_9SPHI) are also component II # Similarly for DVU0466 (uniprot:Q72EV2_DESVH) and DvMF_1745 (uniprot:B8DM46_DESVM). trpD_1 glutamine amidotransferase of anthranilate synthase hmm:TIGR00566 term:anthranilate synthase component 2 term:anthranilate synthase component II term:anthranilate synthase beta subunit uniprot:Q8AAD4_BACTN uniprot:A0A1X9ZDE4_9SPHI uniprot:Q72EV2_DESVH uniprot:B8DM46_DESVM ignore_other:EC 4.1.3.27 ignore_other:EC 2.6.1.85 # Often fused to anthranilate synthase so ignore that (i.e., P00500 is labeled in BRENDA as # a subunit of anthranilate synthase; it is probably anthranilate phosphoribosyltransferase as well) trpD_2 anthranilate phosphoribosyltransferase EC:2.4.2.18 ignore_other:EC 4.1.3.27 # Some alphaproteobacteria have a single-protein anthranilate synthase trpED anthranilate synthase, alpha proteobacterial clade hmm:TIGR01815 ignore_other:EC 4.1.3.27 # PRAI is sometimes known as trpF or as part of trpC (if fused to IGPS). # In Bacteroides thetaiotaomicron, BT0528 (uniprot:Q8AAD7_BACTN) has a PRAI domain (pfam:PF00697), is auxotrophic, # and clusters with trp synthesis genes # Similarly, annotate CCNA_03659 (uniprot:A0A0H3CC53_CAUVN), SMc02767 (uniprot:TRPF_RHIME), # and Ga0059261_0237 (uniprot:A0A1L6JC25_9SPHN). # Some bacteria have a bifunctional phosphoribosylanthranilate isomerase (EC 5.3.1.24) # and isomerase in histidine biosynthesis (hisA, EC:5.3.1.16), for instance uniprot:P16250. # (In both reactions a N-modified-1-amino-ribofuranose-5-phosphate rearranges to a # linear N-modified-1-amino-ribulose-5-phosphate.) # So hits to hisA are ignored. PRAI phosphoribosylanthranilate isomerase EC:5.3.1.24 uniprot:Q8AAD7_BACTN uniprot:A0A0H3CC53_CAUVN uniprot:TRPF_RHIME uniprot:A0A1L6JC25_9SPHN ignore_other:EC 5.3.1.16 # Sometimes fused to anthranilate synthase subunits or known as trpC (if fused to PRAI). # Ignore BRENDA::P24920 because annotated as phosphoribosylanthranilate isomerase but is also believed to be IGPS. # And based on presence in Trp clusters and auxotrophic or cofit phenotypes, annotate somewhat diverged IGPS: # HP15_3291 (uniprot:E4PQZ8_MARAH), DvMF_1743 (uniprot:B8DM44_DESVM), BT0529 (uniprot:TRPC_BACTN), # AZOBR_RS07450 (uniprot:A0A560BXT3), Pf1N1B4_2549 (uniprot:A0A166NT80_PSEFL), AO353_07210 (uniprot:A0A0N9WG05_PSEFL). IGPS indole-3-glycerol phosphate synthase EC:4.1.1.48 uniprot:E4PQZ8_MARAH uniprot:B8DM44_DESVM uniprot:TRPC_BACTN uniprot:A0A560BXT3 uniprot:A0A166NT80_PSEFL uniprot:A0A166NT80_PSEFL ignore_other:EC 4.1.3.27 ignore:BRENDA::P24920 # This reaction is 4.1.2.8 by itself or part of 4.2.1.20. # The enzyme from Pyrolobus is not matched by the HMM, but is similar to the characterized trpA # from Thermococcus (uniprot:Q9YGA9, see PMID:11029579) # APJEIK_06300 from Pedobacter sp. FW305-3-2-15-E-R2A2 (uniprot:A0A3N7DTH5) can complement a trpA- strain of E. coli (Bradley Biggs). # GAMENC_07330 from Acidovorax sp. FHTAMBA (uniprot:H0BV16) can complement a trpA- strain of E. coli (Bradley Biggs); # also it is very similar to Ac3H11_1535, which has auxotrophic phenotypes. # LRK54_RS01680 from Rhodanobacter sp. FW104-10B01 (uniprot:M4NLA4) can complement a trpA- strain of E. coli (Bradley Biggs); # also it is similar to N515DRAFT_0116, which has # auxotrophic pheonotypes. trpA indoleglycerol phosphate aldolase hmm:TIGR00262 curated:BRENDA::Q9YGA9 uniprot:A0A3N7DTH5 uniprot:H0BV16 uniprot:M4NLA4 ignore_other:EC 4.2.1.20 ignore_other:EC 4.1.2.8 # Some members of TIGR01415 might also have this activity. # The family also includes indole-scavenging "trpB-2" proteins, whose presence generally indicates # tryptophan synthesis (EC 4.2.1.122). # The enzyme from Pyrolobus is not matched by the HMM, but is similar to the characterized trpB # from Sulfolobus (uniprot:P50383, see PMID:11029579) or Saccharolobus (uniprot:Q97TX6) # "TrpB2" proteins are reported to use phosphoserine (an intermediate in serine synthesis) rather than serine # as a substrate (PMID:25184516); GapMind does not try to distinguish them from TrpB. trpB tryptophan synthase hmm:TIGR00263 uniprot:P50383 curated:BRENDA::Q97TX6 ignore_other:EC 4.2.1.20 ignore_other:EC 4.2.1.122 # Anthranilate synthase usually has two components, as in E. coli, but single-protein systems are also known. # (In E. coli, trpD is the second component fused to anthranilate phosphoribosyltransferase.) anthranilate-synthase: trpE trpD_1 anthranilate-synthase: trpED all: anthranilate-synthase trpD_2 PRAI IGPS trpA trpB
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other blast hits with at least 50% coverage are "low confidence."
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
For more information, see:
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory