Definition of L-lysine biosynthesis
Rules
Overview: Lysine biosynthesis in GapMind is based on MetaCyc pathways L-lysine biosynthesis I via diaminopimelate (DAP) and succinylated intermediates, II with DAP and acetylated intermediates, III with DAP and no blocking group, V via 2-aminoadipate and LysW carrier protein, and VI with DAP aminotransferase. Most of these pathways involve tetrahydrodipicolinate and meso-diaminopimelate, with variations in how the amino group is introduced. Pathway V instead involves L-2-aminoadipate and LysW-attached intermediates. Lysine biosynthesis IV, via 2-aminoadipate and saccharopine, is only reported to occur in eukaryotes and is not described here.
- all:
- meso-DAP and lysA
- or lysW-pathway
- lysW-pathway: hcs, lysT, lysU, hicdh, lysN, lysW, lysX, lysZ, lysY, lysJ and lysK
- Comment: 2-oxoglutarate and acetyl-CoA are converted to homocitrate, homoaconitate and then 2-oxoadipate (by hcs-lysTU-hicdh); an aminotransferase (lysN) forms L-2-aminoadipate; lysX ligates 2-aminoadipate to LysW; lysZYJ convert LysW-aminoadipate to LysW-lysine; and lysK releases lysine.
- meso-DAP:
- aspartate-semialdehyde, dapA, dapB, dapD, dapC, dapE and dapF
- or aspartate-semialdehyde, dapA, dapB, dapH, dapX, dapL and dapF
- or aspartate-semialdehyde, dapA, dapB and ddh
- or aspartate-semialdehyde, dapA, dapB, DAPtransferase and dapF
- Comment: (S)-2,3,4,5-tetrahydrodipicolinate is formed from aspartate semialdehyde by dapAB. In pathway I (dapDCE), it is succinylated, transaminated, and desuccinylated to L,L-DAP, and then the epimerase dapF forms meso-DAP. Pathway II (dapHXL) is similar but with acetylated intermediates. In pathway III, tetrahydrodipicolinate is reductively aminated to meso-DAP in one step, by ddh. In pathway VI, an aminotransferase (DAPtransferase) forms L,L-DAP.
- aspartate-semialdehyde: asp-kinase and asd
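The rules above read as nested AND/OR requirements: each "or" line is an alternative, and every step or sub-rule named on a line must be present. A minimal Python sketch of that reading follows. The names RULES and satisfied are hypothetical, and treating each step as simply present or absent is a simplifying assumption, not necessarily how GapMind chooses among alternatives.

```python
# Rules transcribed from the list above. Each rule maps to a list of
# alternatives (OR); each alternative lists the steps or sub-rules that
# must all be satisfied (AND). RULES and satisfied() are illustrative names.
RULES = {
    "all": [["meso-DAP", "lysA"], ["lysW-pathway"]],
    "lysW-pathway": [["hcs", "lysT", "lysU", "hicdh", "lysN", "lysW",
                      "lysX", "lysZ", "lysY", "lysJ", "lysK"]],
    "meso-DAP": [
        ["aspartate-semialdehyde", "dapA", "dapB", "dapD", "dapC", "dapE", "dapF"],
        ["aspartate-semialdehyde", "dapA", "dapB", "dapH", "dapX", "dapL", "dapF"],
        ["aspartate-semialdehyde", "dapA", "dapB", "ddh"],
        ["aspartate-semialdehyde", "dapA", "dapB", "DAPtransferase", "dapF"],
    ],
    "aspartate-semialdehyde": [["asp-kinase", "asd"]],
}


def satisfied(name, found_steps):
    """Return True if the rule or step `name` is satisfied by `found_steps`."""
    if name not in RULES:
        # A plain step: satisfied only if a candidate was found for it.
        return name in found_steps
    # A rule: any one alternative with all of its parts satisfied is enough.
    return any(
        all(satisfied(part, found_steps) for part in alternative)
        for alternative in RULES[name]
    )


# Example: the steps of pathway III (ddh) plus lysA.
found = {"asp-kinase", "asd", "dapA", "dapB", "ddh", "lysA"}
print(satisfied("all", found))
```

With the example set, the "all" rule is met through the third meso-DAP alternative; removing lysA from the set would leave it unmet.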
Steps
asp-kinase: aspartate kinase
- Curated proteins or TIGRFams with EC 2.7.2.4
- Ignore hits to O63067 when looking for 'other' hits (homoserine dehydrogenase (EC 1.1.1.3))
- Comment: For BRENDA::O63067 -- the paper describes a monofunctional hom but the sequence of O63067 is much longer and has a close homolog of functional aspartate kinase (due to alternative splicing?)
- Total: 3 HMMs and 29 characterized proteins
asd: aspartate semi-aldehyde dehydrogenase
dapA: 4-hydroxy-tetrahydrodipicolinate synthase
dapB: 4-hydroxy-tetrahydrodipicolinate reductase
- Curated proteins or TIGRFams with EC 1.17.1.8
- UniProt sequence L0G028_ECHVK: SubName: Full=Dihydrodipicolinate reductase {ECO:0000313|EMBL:AGA78643.1};
- UniProt sequence A0A1X9Z7Q6_9SPHI: RecName: Full=4-hydroxy-tetrahydrodipicolinate reductase {ECO:0000256|HAMAP-Rule:MF_00102}; Short=HTPA reductase {ECO:0000256|HAMAP-Rule:MF_00102}; EC=1.17.1.8 {ECO:0000256|HAMAP-Rule:MF_00102};
- Comment: Formerly known as dihydrodipicolinate reductase. Echvi_2395 (L0G028_ECHVK) and CA265_RS15670 (A0A1X9Z7Q6_9SPHI) are somewhat diverged, but conserved essentiality confirms they are dapB.
- Total: 2 HMMs and 6 characterized proteins
dapD: tetrahydrodipicolinate succinylase
dapC: N-succinyldiaminopimelate aminotransferase
dapE: succinyl-diaminopimelate desuccinylase
dapF: diaminopimelate epimerase
lysA: diaminopimelate decarboxylase
dapH: tetrahydrodipicolinate acetyltransferase
dapX: acetyl-diaminopimelate aminotransferase
dapL: N-acetyl-diaminopimelate deacetylase
ddh: meso-diaminopimelate D-dehydrogenase
DAPtransferase: L,L-diaminopimelate aminotransferase
hcs: homocitrate synthase
lysT: homoaconitase large subunit
lysU: homoaconitase small subunit
hicdh: homo-isocitrate dehydrogenase
- Curated proteins or TIGRFams with EC 1.1.1.87
- Curated proteins or TIGRFams with EC 1.1.1.286
- Comment: homoisocitrate to 2-oxoadipate. This rule also matches some isocitrate/homoisocitrate dehydrogenases (1.1.1.286) which often have multiple subunits in eukaryotes; this is not represented here.
- Total: 12 characterized proteins
lysN: 2-aminoadipate:2-oxoglutarate aminotransferase
lysW: 2-aminoadipate/glutamate carrier protein
- Curated proteins matching alpha-aminoadipate%carrier
- UniProt sequence Q5JFV9: SubName: Full=Probable lysine biosynthesis protein {ECO:0000313|EMBL:BAD84468.1};
- Comment: LysW is a carrier protein for intermediates in lysine or ornithine biosynthesis. TK0279 (Q5JFV9) from Thermococcus kodakarensis was characterized, see PMC5076833.
- Total: 5 characterized proteins
lysX: 2-aminoadipate-LysW ligase
lysZ: [LysW]-2-aminoadipate 6-kinase / [LysW]-glutamate kinase
lysY: [LysW]-2-aminoadipate 6-phosphate reductase / [LysW]-glutamylphosphate reductase
lysJ: [LysW]-2-aminoadipate semialdehyde transaminase / [LysW]-glutamate semialdehyde transaminase
lysK: [LysW]-lysine hydrolase / [LysW]-ornithine hydrolase
About GapMind
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
- ublast finds a hit to a characterized protein at above 40% identity and 80% coverage, and bits >= other bits+10.
- (Hits to curated proteins without experimental data as to their function are never considered high confidence.)
- HMMer finds a hit with 80% coverage of the model, and either other identity < 40 or other coverage < 0.75.
where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").
Otherwise, a candidate is "medium confidence" if either:
- ublast finds a hit at above 40% identity and 70% coverage (ignoring other bits).
- ublast finds a hit at above 30% identity and 80% coverage, and bits >= other bits.
- HMMer finds a hit (regardless of coverage or other bits).
Other blast hits with at least 50% coverage are "low confidence."
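The confidence thresholds above can be sketched as a single classification function. This is a sketch under stated assumptions, not GapMind's own code: the Candidate record and its field names are hypothetical, "above" is read as strictly greater for identity and as at-least for coverage, identities are in percent while coverages are fractions, and a missing "other" hit is treated as zeros.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Candidate:
    """Hypothetical record of the evidence for one candidate protein."""
    identity: Optional[float] = None      # best ublast hit for this step, percent
    coverage: Optional[float] = None      # coverage of that hit, fraction 0-1
    bits: float = 0.0                     # bit score of that hit
    characterized: bool = False           # hit has experimental evidence
    hmm_coverage: Optional[float] = None  # fraction of the HMM covered, if any hit
    other_identity: float = 0.0           # best hit NOT annotated with this step
    other_coverage: float = 0.0
    other_bits: float = 0.0


def classify(c: Candidate) -> Optional[str]:
    """Return 'high', 'medium', 'low', or None, following the thresholds above."""
    has_blast = c.identity is not None and c.coverage is not None
    has_hmm = c.hmm_coverage is not None

    # High confidence: a strong hit to a characterized protein that clearly
    # beats the best "other" hit, or an HMM hit with a weak "other" hit.
    if (has_blast and c.characterized and c.identity > 40
            and c.coverage >= 0.80 and c.bits >= c.other_bits + 10):
        return "high"
    if (has_hmm and c.hmm_coverage >= 0.80
            and (c.other_identity < 40 or c.other_coverage < 0.75)):
        return "high"

    # Medium confidence.
    if has_blast and c.identity > 40 and c.coverage >= 0.70:
        return "medium"
    if (has_blast and c.identity > 30 and c.coverage >= 0.80
            and c.bits >= c.other_bits):
        return "medium"
    if has_hmm:
        return "medium"

    # Low confidence: any remaining blast hit with at least 50% coverage.
    if has_blast and c.coverage >= 0.50:
        return "low"
    return None
```

For instance, a hit at 55% identity and 90% coverage counts as high confidence here only when the matched protein is characterized and its bit score exceeds the best "other" hit by at least 10; the same hit to an uncharacterized curated protein falls to medium.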
Steps with no high- or medium-confidence candidates may be considered "gaps."
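As a small illustration of this definition, the sketch below lists the gaps on one path. The function find_gaps and the dictionary of per-step best confidence levels are hypothetical names and made-up example values, not GapMind output.

```python
def find_gaps(path, best_confidence):
    """Return the steps on `path` with no high- or medium-confidence candidate.

    `best_confidence` maps a step name to the best level found for it
    ('high', 'medium', 'low'); steps with no candidate at all are absent.
    """
    return [step for step in path
            if best_confidence.get(step) not in ("high", "medium")]


# Illustrative values only: pathway III plus lysA, with a weak ddh candidate
# and no lysA candidate.
path = ["asp-kinase", "asd", "dapA", "dapB", "ddh", "lysA"]
best = {"asp-kinase": "high", "asd": "high", "dapA": "medium",
        "dapB": "high", "ddh": "low"}
print(find_gaps(path, best))
```

In this example both ddh (low confidence only) and lysA (no candidate) are reported as gaps.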
For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways.
For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time.
Gaps may be due to:
- our ignorance of proteins' functions,
- omissions in the gene models,
- frame-shift errors in the genome sequence, or
- the genuine absence of the pathway from the organism.
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory