PaperBLAST
Full List of Papers Linked to P19262
ODO2_YEAST / P19262 Dihydrolipoyllysine-residue succinyltransferase component of 2-oxoglutarate dehydrogenase complex, mitochondrial; DLST; 2-oxoglutarate dehydrogenase complex component E2; OGDC-E2; OGDHC subunit E2; Alpha-ketoglutarate dehydrogenase subunit E2; alpha-KGDHC subunit E2; Dihydrolipoamide succinyltransferase component of 2-oxoglutarate dehydrogenase complex; EC 2.3.1.61 from Saccharomyces cerevisiae (strain ATCC 204508 / S288c) (Baker's yeast) (see 4 papers)
P19262 dihydrolipoyllysine-residue succinyltransferase (EC 2.3.1.61) from Saccharomyces cerevisiae (see paper)
YDR148C Dihydrolipoyl transsuccinylase, component of the mitochondrial alpha-ketoglutarate dehydrogenase complex, which catalyzes the oxidative decarboxylation of alpha-ketoglutarate to succinyl-CoA in the TCA cycle; phosphorylated from Saccharomyces cerevisiae
- function: The 2-oxoglutarate dehydrogenase complex catalyzes the overall conversion of 2-oxoglutarate to succinyl-CoA and CO(2). It contains multiple copies of three enzymatic components: 2-oxoglutarate dehydrogenase (E1), dihydrolipoamide succinyltransferase (E2) and lipoamide dehydrogenase (E3).
catalytic activity: N(6)-[(R)-dihydrolipoyl]-L-lysyl-[2-oxoglutarate dehydrogenase complex component E2] + succinyl-CoA = N(6)-[(R)-S(8)-succinyldihydrolipoyl]-L-lysyl-[2-oxoglutarate dehydrogenase complex component E2] + CoA (RHEA:15213)
cofactor: (R)-lipoate (binds 1 lipoyl cofactor covalently)
subunit: Component of the 2-oxoglutarate dehydrogenase complex (OGDC), also called alpha-ketoglutarate dehydrogenase (KGDH) complex. The complex is composed of the catalytic subunits OGDH (2-oxoglutarate dehydrogenase KGD1; also called E1 subunit), DLST (dihydrolipoamide succinyltransferase KGD2; also called E2 subunit) and DLD (dihydrolipoamide dehydrogenase LPD1; also called E3 subunit), and the assembly factor KGD4.
- Improving Identification of In-organello Protein-Protein Interactions Using an Affinity-enrichable, Isotopically Coded, and Mass Spectrometry-cleavable Chemical Crosslinker
Makepeace, Molecular & cellular proteomics : MCP 2020 - “...13 9 4 33 P07253 CBP6 P21560 CBP3 41 Yes Yes 13 0 27 14 P19262 KGD2 P20967 KGD1 40 Yes Yes 9 19 3 18 P05626 ATP4 P07251 ATP1 38 No Yes 6 2 22 14 P09624 LPD1 P16451 PDX1 35 Yes Yes 10 7...”
- Proteomic analysis reveals a novel function of the kinase Sat4p in Saccharomyces cerevisiae mitochondria
Gey, PloS one 2014 - “...Q12031 Mitochondrial 2-methylisocitrate lyase 0.28 MALDI-TOF-MS 159 20/51 32 65.3 7.21 62 6.2 6 KGD2 P19262 Dihydrolipoyllysine-residue succinyltransferase component of 2-oxoglutarate dehydrogenase complex, mitochondrial 0.33 nanoLC-MS/MS 7132 17 58 50.5 8.88 55 5.4 7 LYS4 P49367 Homoaconitase, mitochondrial 0.39 MALDI-TOF-MS 180 15/25 24 68.0 6.12 74...”
- “...P36112 Formation of crista junctions protein 1 nanoLC-MS/MS 2219 26 58 61.1 6.60 11 KGD2 P19262 Dihydrolipoyllysine-residue succinyltransferase component of 2-oxoglutarate dehydrogenase complex, mitochondrial 3.62 nanoLC-MS/MS 3659 19 59 50.5 8.88 54 5.5 12 KGD2 P19262 Dihydrolipoyllysine-residue succinyltransferase component of 2-oxoglutarate dehydrogenase complex, mitochondrial 3.65 MALDI-TOF-MS...”
- Mitochondrial enzymes are protected from stress-induced aggregation by mitochondrial chaperones and the Pim1/LON protease
Bender, Molecular biology of the cell 2011 - “...75 53 20 8 Kgd1 α-Ketoglutarate dehydrogenase P20967 114 58 19 9 Kgd2 Dihydrolipoyl transsuccinylase P19262 50 60 20 10 Lat1 Dihydrolipoylamide acetyltransferase, subunit E2 P12695 52 46 16 11 Leu4 α-Isopropylmalate synthase P06208 68 83 24 12 Lsc2 Succinyl-CoA ligase, subunit P53312 47 93 25...”
- Plant mitochondrial 2-oxoglutarate dehydrogenase complex: purification and characterization in potato
Millar, The Biochemical journal 1999 (secret)
- Loss of function of Hog1 improves glycerol assimilation in Saccharomyces cerevisiae
Sone, World journal of microbiology & biotechnology 2023 - “...kinase involved in osmoregulation Frameshift mutation corresponding to Lys65 Chr IV 753,916 T A KGD2 (YDR148C) 2-Oxoglutarate dehydrogenase E2 component Asn384 (A A T) Ile (A T T) Chr XII 1,019,484 A T SIR3 (YLR442C) Chromatin-silencing protein Leu923 (C T A) Gln (C A A) Chr...”
- Data integration uncovers the metabolic bases of phenotypic variation in yeast
Petrizzelli, PLoS computational biology 2021 - “...protein complexes with an AND Boolean relationship. Proteins Reactions (YOR136W AND YNL037C) Icit_Akg_m_nad (YIL125W AND YDR148C AND YFL018C ) Akg_Succoa_m ( YGR240C AND YMR205C ) F6p_Fdp (YBR221C AND YER178W AND YFL018C AND YGR193C AND YNL071W ) Pyr_Accoa_m ( YGR244C AND YOR142W ) Succoa_Succ_m ( YGL080W AND...”
- Central Metabolism in Mammals and Plants as a Hub for Controlling Cell Fate
Selinski, Antioxidants & redox signaling 2021 - “...(EC 1.1.2.4) 2-Oxoglutarate dehydrogenase complex OGDH-E1 AT3G55410 (EC 1.2.4.2) HGNC:8124 (EC 1.2.4.2) YIL125W (EC 1.2.4.2) YDR148C (EC 1.2.4.2) YFR049W (EC 1.2.4.2) OGDH-E1 AT5G65750 (EC 1.2.4.2) OGDH-E2 AT4G26910 (EC 1.2.4.2) OGDH-E2 AT5G55070 (EC 1.2.4.2) OGDH-E3 AT3G17240 (EC 1.2.4.2) OGDH-E3 AT1G48030 (EC 1.2.4.2) OGDH-E3 AT3G13930 (EC 1.2.4.2) The...”
- Systematic analysis of nuclear gene function in respiratory growth and expression of the mitochondrial genome in S. cerevisiae
Stenger, Microbial cell (Graz, Austria) 2020 - “...COQ4 YDR204W ISA1 YLL027W SOM1 YEL059C-A COQ5 YML110C ISA2 YPR067W SSQ1 YLR369W COQ6 YGR255C KGD2 YDR148C SUV3 YPL029W COQ9 YLR201C LIP2 YLR239C YTA12 YMR089C DSS1 YMR287C LPD1 YFL018C ETR1 YBR026C MCT1 YOR221C Vacuolar proteins DID4 YKL002W VMA21 YGR105W VMA6 YLR447C VMA1 YDL185W VMA22 YHR060W VMA8 YEL051W...”
- Adjustment of trehalose metabolism in wine Saccharomyces cerevisiae strains to modify ethanol yields
Rossouw, Applied and environmental microbiology 2013 - “...YCR005c YPR001w YLR304c YJL200C YNL037c YOR136w YIL125w YDR148c YOR142w YGR244c Y15376 Y13485 Y12828 Y15212 Y17022 Y15362 Y12392 Y12284 Y13506 Y12398 Y15897...”
- High hydrostatic pressure activates gene expression that leads to ethanol production enhancement in a Saccharomyces cerevisiae distillery strain
Bravim, Applied microbiology and biotechnology 2013 - “...2.39 3.53 4.18 High-affinity glucose transporter YDR342C HXT7 2.72 2.49 3.78 4.52 High-affinity glucose transporter YDR148C KGD2 0.86 0.68 1.55 2.05 Dihydrolipoyl transsuccinylase, which catalyses the oxidative decarboxylation of α-ketoglutarate to succinyl-CoA in the TCA cycle YFR030W MET10 1.76 0.44 0.93 1.02 Subunit alpha of assimilatory...”
- Inaccurately assembled cytochrome c oxidase can lead to oxidative stress-induced growth arrest
Bode, Antioxidants & redox signaling 2013 - “...Strong None Strong None Minor YHR067W HTD2 Poor Poor YDR148C KGD2 None Poor Minor None Minor None YIL125W KGD1 None Poor Minor None Minor None Function Twin...”
- Systematic genetic array analysis links the Saccharomyces cerevisiae SAGA/SLIK and NuA4 component Tra1 to multiple cellular processes
Hoke, BMC genetics 2008 - “...5, 9 ER 19 MDM34 YGL219C S 6 mitochondrial outer membrane E, R 10 KGD2 YDR148C S 6, 5 mitochondrion 0 MDM10 YAL010C s 6, 9 Mdm10/Mdm12/Mmm1 complex, mitochondrial outer membrane T, E 11 BEM1 YBR200W S 7, 3, 1 bud neck E 33 BEM4 YPL161C...”
- Increased respiration in the sch9Delta mutant is required for increasing chronological life span but not replicative life span
Lavoie, Eukaryotic cell 2008 - “...1.79 1.69 1.04E-04 3.12E-05 Tricarboxylic acid cycle YDR148C YLR304C KGD2 ACO1 1.44 1.52 4.89E-03 3.80E-03 Retrograde signaling YNL076W MKS1 1.36 2.24E-05...”
- Transcriptome analysis of a respiratory Saccharomyces cerevisiae strain suggests the expression of its phenotype is glucose insensitive and predominantly controlled by Hap4, Cat8 and Mig1
Bonander, BMC genomics 2008 - “...0.4 Alternative carbon source utilization YML054C CYB2 10.1 5.0 TCA cycle YIL125W KGD1 2.2 2.2 YDR148C KGD2 3.5 4.4 YFL018C LPD1 2.0 2.2 YKL085W MDH1 3.6 2.2 YKL148C SDH1 4.6 4.7 YLL041C SDH2 4.1 3.9 YKL141W SDH3 3.7 2.3 YDR178W SDH4 5.3 3.2 Glyoxylate cycle YIR029W...”
- Identification of coherent patterns in gene expression data using an efficient biclustering algorithm and parallel coordinate visualization
Cheng, BMC bioinformatics 2008 - “...15 protein folding 4.54E-04 1.22E-02 YAL005C, YBR169C, YDR214W, YLR216C 19 tricarboxylic acid cycle 7.49E-05 2.55E-03 YDR148C, YLL041C, YNR001C 24 response to stress 3.45E-04 3.45E-03 YBR072W, YDR258C, YPL240C 25 SRP-dependent cotranslational protein targeting to membrane, translocation 4.27E-04 8.12E-03 YAL005C, YER103W response to stress 6.85E-05 1.30E-03 YAL005C, YDR258C,...”
- Regulation of gluconeogenesis in Saccharomyces cerevisiae is mediated by activator and repressor functions of Rds2
Soontorngun, Molecular and cellular biology 2007 - “...FBP1 VID24 MAE1 HAP4 PDC1 OPI1 ICY1 YOL109W YDR148C ZEO1 KGD2 YLR056W YLR055C ERG3 SPT8 YLR174W YGR044C YLR153C YJR095W YOL136C YDR178W YJL219W YKL203C IDP2...”
- PIPE: a protein-protein interaction prediction engine based on the re-occurring short polypeptide sequences between known interacting protein pairs
Pitre, BMC bioinformatics 2006 - “...YKL013C YMR167W YNL082W YDR074W YMR261C YJL005W YNL138W YMR231W YDL077C YDR074W YML100W YJL008C YOR117W YNL126W YLR212C YDR148C YFR049W YJL124C YBL026W YNR006W YHL002W YDR179C YMR025W Table 2 Negative validation set. The list of the protein pairs that our known not to interact. This list was used to evaluate...”
- Ascospore formation in the yeast Saccharomyces cerevisiae
Neiman, Microbiology and molecular biology reviews : MMBR 2005 - “...YBL080c YBR003w YBR037c YBR126c YCR010c YDL033c YDR116c YDR148c YDR197w YDR350c YDR377w YDR393W YDR511w YDR529c YEL024w YER017c YER058w YER065c YER178w YGL107c...”
- Global protein function annotation through mining genome-scale data in yeast Saccharomyces cerevisiae
Chen, Nucleic acids research 2004 - “...YER045C ACA1 A YHR055C CUP1 YJR091C JSN1 A A YDR148C KGD2 A Figure 11. Global function prediction for yeast YBR100W. All interacting partners of YBR100W are...”
- Large-scale mutagenesis of the yeast genome using a Tn7-derived multipurpose transposon
Kumar, Genome research 2004 - “...TFP1 TN7-56E10 YDL192W ARF1 TN7-1G1 TN7-71D4 YDR074W YDR148C TPS2 KGD2 TN7-37F4 TN7-12E10 YDR149C YDR207C N/A UME6 TN7-17H11 TN7-73H9 YDR225W YDR443C HTA1...”
- Exploring relationships in gene expressions: a partial least squares approach
Datta, Gene expression 2001 - “...YDR285W, YDR374C, YER179w, YIL072W, YJL106W Early II YDR148C, YGL032C, YGL210W, YPR112C, YHR153C, YOL123W, YPR192W, YDR113C Early-Mid YDR118W, YLR045c, YNL013C,...”
- Knowledge-based analysis of microarray gene expression data by using support vector machines
Brown, Proceedings of the National Academy of Sciences of the United States of America 2000 - “...Gene Locus TCA YPR001W YOR142W YLR174W YIL125W YDR148C YBL015W YPR191W YPL271W YPL262W YML120C YKL085W YGR207C YDL067C YPL037C YLR406C YLR075W YDL184C YAL003W...”
- Genetic and biochemical interactions involving tricarboxylic acid cycle (TCA) function using a collection of mutants defective in all TCA cycle genes
Przybyla-Zawislak, Genetics 1999 - “...dehydrogenase complex (KGDC), KGD1 (YIL125w), KGD2 (YDR148c), and LPD1 (YFL018c); succinylCoA ligase (synthetase), LSC1 (YOR142w), and LSC2 (YGR244c);...”
- Pathway alignment: application to the comparative analysis of glycolytic enzymes
Dandekar, The Biochemical journal 1999 - “...274 273 0446 0447 YER178W YBR221C 1232 272 0448 YNL071W YDR148C 1231 271 0449 YFL018C 0871 3138 MT 1100 1850 slr1934 sll1721 HP 0476 0902 0823 0179 0180 0903...”
For advice on how to use these tools together, see
Interactive tools for functional annotation of bacterial genomes.
The PaperBLAST database links 793,807 different protein sequences to 1,259,118 scientific articles. Searches against EuropePMC were last performed on March 13 2025.
PaperBLAST builds a database of protein sequences that are linked
to scientific articles. These links come from automated text searches
against the articles in EuropePMC
and from manually-curated information from GeneRIF, UniProtKB/Swiss-Prot,
BRENDA,
CAZy (as made available by dbCAN),
BioLiP,
CharProtDB,
MetaCyc,
EcoCyc,
TCDB,
REBASE,
the Fitness Browser,
and a subset of the European Nucleotide Archive with the /experiment tag.
Given this database and a protein sequence query,
PaperBLAST uses protein-protein BLAST
to find similar sequences with E < 0.001.
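As a rough illustration of the E-value cutoff, the sketch below filters BLAST hits at E < 0.001. It assumes the hits are in BLAST+ tabular format (`-outfmt 6`) with the default twelve columns, where the subject identifier is column 2 and the E-value is column 11. The function name and the two example rows are made up for illustration and are not taken from PaperBLAST's own code.

```python
# Hypothetical helper, not PaperBLAST's implementation.
# Assumes BLAST+ tabular output (-outfmt 6) with the default 12 columns:
# qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
E_VALUE_CUTOFF = 0.001


def filter_hits(tabular_text, cutoff=E_VALUE_CUTOFF):
    """Return (subject_id, evalue) pairs whose E-value is below the cutoff."""
    hits = []
    for line in tabular_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        subject_id = fields[1]
        evalue = float(fields[10])
        if evalue < cutoff:
            hits.append((subject_id, evalue))
    return hits


# Two made-up rows: one strong hit and one weak hit.
example = (
    "query\thit_A\t98.0\t400\t8\t0\t1\t400\t1\t400\t1e-50\t700\n"
    "query\thit_B\t28.5\t60\t40\t2\t10\t69\t5\t62\t0.5\t30.1\n"
)
kept = filter_hits(example)
```

Only `hit_A` survives the filter here, because 0.5 is not below the cutoff.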
To build the database, we query EuropePMC with locus tags, with RefSeq protein
identifiers, and with UniProt
accessions. We obtain the locus tags from RefSeq or from MicrobesOnline. We use
queries of the form "locus_tag AND genus_name" to try to ensure that
the paper is actually discussing that gene. Because EuropePMC indexes
most recent biomedical papers, even if they are not open access, some
of the links may be to papers that you cannot read or that our
computers cannot read. We query each of these identifiers that
appears in the open access part of EuropePMC, as well as every locus
tag that appears in the 500 most-referenced genomes, so that a gene
may appear in the PaperBLAST results even though none of the papers
that mention it are open access. We also incorporate text-mined links
from EuropePMC that link open access articles to UniProt or RefSeq
identifiers. (This yields some additional links because EuropePMC
uses different heuristics for their text mining than we do.)
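The query form "locus_tag AND genus_name" can be sketched as plain string construction. Both function names below are hypothetical, and taking the genus as the first word of the organism name is an assumption. The sketch does not contact EuropePMC.

```python
# Hypothetical helpers, not PaperBLAST's implementation.

def genus_of(organism_name):
    """Assume the genus is the first word of the organism name."""
    return organism_name.split()[0]


def build_query(locus_tag, genus_name):
    """Build a text query of the form 'locus_tag AND genus_name'."""
    return f"{locus_tag} AND {genus_name}"


query = build_query("YDR148C", genus_of("Saccharomyces cerevisiae"))
print(query)
```

Requiring the genus name alongside the locus tag reduces the chance of matching an unrelated string that happens to look like the tag.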
For every article that mentions a locus tag, a RefSeq protein
identifier, or a UniProt accession, we try to select one or two
snippets of text that refer to the protein. If we cannot get access to
the full text, we try to select a snippet from the abstract, but
unfortunately, unique identifiers such as locus tags are rarely
provided in abstracts.
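One simple way to pick snippets is to take a window of words around each mention of the identifier. The sketch below does this under stated assumptions: the window size, the case-insensitive matching, and the punctuation stripping are illustrative choices, not PaperBLAST's actual heuristics.

```python
# Hypothetical sketch of snippet selection, not PaperBLAST's implementation.
PUNCTUATION = ".,;:()[]\"'"


def select_snippets(text, identifier, window=10, max_snippets=2):
    """Return up to max_snippets word windows centred on the identifier."""
    words = text.split()
    snippets = []
    last_end = -1  # index of the last word already covered by a snippet
    for i, word in enumerate(words):
        if word.strip(PUNCTUATION).lower() != identifier.lower():
            continue
        if i <= last_end:
            continue  # this mention is already inside the previous snippet
        start = max(0, i - window)
        end = min(len(words), i + window + 1)
        snippets.append("..." + " ".join(words[start:end]) + "...")
        last_end = end - 1
        if len(snippets) == max_snippets:
            break
    return snippets


text = "dehydrogenase complex (KGDC), KGD1 (YIL125w), KGD2 (YDR148c), and LPD1 (YFL018c)"
found = select_snippets(text, "YDR148C", window=2)
```

With a window of two words, the single snippet here spans "(YIL125w), KGD2 (YDR148c), and LPD1".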
PaperBLAST also incorporates manually-curated protein functions:
- Proteins from NCBI's RefSeq are included if a
GeneRIF
entry links the gene to an article in
PubMed®.
GeneRIF also provides a short summary of the article's claim about the
protein, which is shown instead of a snippet.
- Proteins from Swiss-Prot (the curated part of UniProt)
are included if the curators
identified experimental evidence for the protein's function (evidence
code ECO:0000269). For these proteins, the fields of the Swiss-Prot entry that
describe the protein's function are shown (with bold headings).
- Proteins from BRENDA,
a curated database of enzymes, are included if they are linked to a paper in PubMed
and their full sequence is known.
- Every protein from the non-redundant subset of
BioLiP,
a database
of ligand-binding sites and catalytic residues in protein structures, is included. Since BioLiP itself
does not include descriptions of the proteins, those are taken from the
Protein Data Bank.
Descriptions from PDB rely on the original submitter of the
structure and cannot be updated by others, so they may be less reliable.
(For SitesBLAST and Sites on a Tree, we use a larger subset of BioLiP so that every
ligand is represented among a group of structures with similar sequences, but for
PaperBLAST, we use the non-redundant set provided by BioLiP.)
- Every protein from EcoCyc, a curated
database of the proteins in Escherichia coli K-12, is included, regardless
of whether they are characterized or not.
- Proteins from the MetaCyc metabolic pathway database
are included if they are linked to a paper in PubMed and their full sequence is known.
- Proteins from the Transport Classification Database (TCDB)
are included if they have known substrate(s), have reference(s),
and are not described as uncharacterized or putative.
(Some of the references are not visible on the PaperBLAST web site.)
- Every protein from CharProtDB,
a database of experimentally characterized protein annotations, is included.
- Proteins from the CAZy database of carbohydrate-active enzymes
are included if they are associated with an Enzyme Classification number.
Even though CAZy does not provide links from individual protein sequences to papers,
these should all be experimentally-characterized proteins.
- Proteins from the REBASE database
of restriction enzymes are included if they have known specificity.
- Every protein with an evidence-based reannotation (based on mutant phenotypes)
in the Fitness Browser is included.
- Sequence-specific transcription factors (including sigma factors and DNA-binding response regulators)
with experimentally-determined DNA binding sites from the
PRODORIC database of gene regulation in prokaryotes are included.
- Putative transcription factors from RegPrecise
are included if they have manually-curated predictions for their binding sites. These predictions are based on
conserved putative regulatory sites across genomes that contain similar transcription factors,
so PaperBLAST clusters the TFs at 70% identity and retains just one member of each cluster.
- Coding sequence (CDS) features from the
European Nucleotide Archive (ENA)
are included if the /experiment tag is set (implying that there is experimental evidence for the annotation),
the nucleotide entry links to paper(s) in PubMed,
and the nucleotide entry is from the STD data class
(implying that these are targeted annotated sequences, not from shotgun sequencing).
Also, to filter out genes whose transcription or translation was detected, but whose function
was not studied, nucleotide entries or papers with more than 25 such proteins are excluded.
Descriptions from ENA rely on the original submitter of the
sequence and cannot be updated by others, so they may be less reliable.
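The ENA inclusion rules in the list above can be sketched as a filter. The record type, its field names, and the function name are hypothetical. The handling of the 25-protein limit is one possible reading: a feature is dropped if its nucleotide entry exceeds the limit, or if every paper it links to exceeds the limit.

```python
# Hypothetical sketch of the ENA filter, not PaperBLAST's implementation.
from collections import Counter
from dataclasses import dataclass

MAX_PROTEINS = 25  # entries or papers with more such proteins are excluded


@dataclass
class CdsFeature:
    protein_id: str
    entry_id: str             # nucleotide entry that carries the CDS
    has_experiment_tag: bool  # whether the /experiment tag is set
    pubmed_ids: tuple         # PubMed papers linked from the nucleotide entry
    data_class: str           # e.g. "STD"


def select_ena_features(features, max_proteins=MAX_PROTEINS):
    """Apply the tag, paper, and data-class rules, then the size limit."""
    candidates = [
        f for f in features
        if f.has_experiment_tag and f.pubmed_ids and f.data_class == "STD"
    ]
    per_entry = Counter(f.entry_id for f in candidates)
    per_paper = Counter(p for f in candidates for p in set(f.pubmed_ids))
    return [
        f for f in candidates
        if per_entry[f.entry_id] <= max_proteins
        and any(per_paper[p] <= max_proteins for p in f.pubmed_ids)
    ]
```

For example, an entry carrying 26 tagged proteins is dropped as a whole, while a single tagged protein in a small STD entry with a PubMed link is kept.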
Except for GeneRIF and ENA,
the curated entries include a short curated
description of the protein's function.
For entries from BioLiP, the protein's function may not be known beyond binding to the ligand.
Many of these entries also link to articles in PubMed.
For more information see the
PaperBLAST paper (mSystems 2017)
or the code.
You can download PaperBLAST's database here.
Changes to PaperBLAST since the paper was written:
- November 2023: incorporated PRODORIC and RegPrecise. Many PRODORIC entries were not linked to a protein sequence (no UniProt identifier), so we added this information.
- February 2023: BioLiP changed their download format. PaperBLAST now includes their non-redundant subset. SitesBLAST and Sites on a Tree use a larger non-redundant subset that ensures that every ligand is represented within each cluster. This should ensure that every binding site is represented.
- June 2022: incorporated some coding sequences from ENA with the /experiment tag.
- March 2022: incorporated BioLiP.
- April 2020: incorporated TCDB.
- April 2019: EuropePMC now returns table entries in their search results. This has expanded PaperBLAST's database, but most of the new entries are of low relevance, and the resulting snippets are often just lists of locus tags with annotations.
- February 2018: the alignment page reports the conservation of the hit's functional sites (if available from Swiss-Prot or UniProt).
- January 2018: incorporated BRENDA.
- December 2017: incorporated MetaCyc, CharProtDB, CAZy, REBASE, and the reannotations from the Fitness Browser.
- September 2017: EuropePMC no longer returns some table entries in their search results. This has shrunk PaperBLAST's database, but has also reduced the number of low-relevance hits.
Many of these changes are described in Interactive tools for functional annotation of bacterial genomes.
PaperBLAST cannot provide snippets for many of the papers that are
published in non-open-access journals. This limitation applies even if
the paper is marked as "free" on the publisher's web site and is
available in PubmedCentral or EuropePMC. If a journal that you publish
in is marked as "secret," please consider publishing elsewhere.
Many important articles are missing from PaperBLAST, either because
the article's full text is not in EuropePMC (as for many older
articles), or because the paper does not mention a protein identifier such as a locus tag, or because of PaperBLAST's heuristics. If you notice an
article that characterizes a protein's function but is missing from
PaperBLAST, please notify the curators at UniProt
or add an entry to GeneRIF.
Entries in either of these databases will eventually be incorporated
into PaperBLAST. Note that to add an entry to UniProt, you will need
to find the UniProt identifier for the protein. If the protein is not
already in UniProt, you can ask them to create an entry. To add an
entry to GeneRIF, you will need an NCBI Gene identifier, but
unfortunately many prokaryotic proteins in RefSeq do not have
corresponding Gene identifiers.
References
PaperBLAST: Text-mining papers for information about homologs.
M. N. Price and A. P. Arkin (2017). mSystems, 10.1128/mSystems.00039-17.
Europe PMC in 2017.
M. Levchenko et al (2017). Nucleic Acids Research, 10.1093/nar/gkx1005.
Gene indexing: characterization and analysis of NLM's GeneRIFs.
J. A. Mitchell et al (2003). AMIA Annu Symp Proc 2003:460-464.
UniProt: the universal protein knowledgebase.
The UniProt Consortium (2016). Nucleic Acids Research, 10.1093/nar/gkw1099.
BRENDA in 2017: new perspectives and new tools in BRENDA.
S. Placzek et al (2017). Nucleic Acids Research, 10.1093/nar/gkw952.
The EcoCyc database: reflecting new knowledge about Escherichia coli K-12.
I. M. Keseler et al (2016). Nucleic Acids Research, 10.1093/nar/gkw1003.
The MetaCyc database of metabolic pathways and enzymes.
R. Caspi et al (2018). Nucleic Acids Research, 10.1093/nar/gkx935.
CharProtDB: a database of experimentally characterized protein annotations.
R. Madupu et al (2012). Nucleic Acids Research, 10.1093/nar/gkr1133.
The carbohydrate-active enzymes database (CAZy) in 2013.
V. Lombard et al (2014). Nucleic Acids Research, 10.1093/nar/gkt1178.
The Transporter Classification Database (TCDB): recent advances.
M. H. Saier, Jr. et al (2016). Nucleic Acids Research, 10.1093/nar/gkv1103.
REBASE - a database for DNA restriction and modification: enzymes, genes and genomes.
R. J. Roberts et al (2015). Nucleic Acids Research, 10.1093/nar/gku1046.
Deep annotation of protein function across diverse bacteria from mutant phenotypes.
M. N. Price et al (2016). bioRxiv, 10.1101/072470.
by Morgan Price,
Arkin group
Lawrence Berkeley National Laboratory