PaperBLAST
Full List of Papers Linked to NP_666329.2
SYAC_MOUSE / Q8BGQ7 Alanine--tRNA ligase, cytoplasmic; Alanyl-tRNA synthetase; AlaRS; Protein lactyltransferase AARS1; Protein sticky; Sti; EC 6.1.1.7; EC 6.-.-.- from Mus musculus (Mouse) (see 6 papers)
NP_666329 alanine--tRNA ligase, cytoplasmic from Mus musculus
- function: Catalyzes the attachment of alanine to tRNA(Ala) in a two-step reaction: alanine is first activated by ATP to form Ala-AMP and then transferred to the acceptor end of tRNA(Ala) (PubMed:16906134, PubMed:20010690, PubMed:25422440, PubMed:27622773). Also edits incorrectly charged tRNA(Ala) via its editing domain (PubMed:16906134, PubMed:20010690, PubMed:25422440, PubMed:29769718). In presence of high levels of lactate, also acts as a protein lactyltransferase that mediates lactylation of lysine residues in target proteins, such as TEAD1, TP53/p53 and YAP1. Protein lactylation takes place in a two-step reaction: lactate is first activated by ATP to form lactate-AMP and then transferred to lysine residues of target proteins. Acts as an inhibitor of TP53/p53 activity by catalyzing lactylation of TP53/p53. Acts as a positive regulator of the Hippo pathway by mediating lactylation of TEAD1 and YAP1.
catalytic activity: tRNA(Ala) + L-alanine + ATP = L-alanyl-tRNA(Ala) + AMP + diphosphate (RHEA:12540)
catalytic activity: (S)-lactate + ATP + H(+) = (S)-lactoyl-AMP + diphosphate (RHEA:80271)
catalytic activity: (S)-lactoyl-AMP + L-lysyl-[protein] = N(6)-[(S)-lactoyl]-L-lysyl-[protein] + AMP + 2 H(+) (RHEA:80275)
cofactor: Zn(2+) (Binds 1 zinc ion per subunit.)
subunit: Monomer (By similarity). Interacts with ANKRD16; the interaction is direct (PubMed:29769718).
- TBX1 is required for normal stria vascularis and semicircular canal development
Tian, Developmental biology 2020 - “...3 NP_081118 582 aa 114241 ATRN Good 2 NP_033860 1428 aa 256490 AARS Moderate 1 NP_666329 968 aa 754913 ABTB2 Moderate 1 NP_849221 1024 aa 256490 COMP Moderate 1 NP_057894 755 aa 5136 CREBZF Moderate 1 NP_660133 358 aa 182358 FBN1 Moderate 1 NP_032019 2873 aa...”
- Proteomic Analysis of Protective Effects of Dl-3-n-Butylphthalide against mpp + -Induced Toxicity via downregulating P53 pathway in N2A Cells
Zhao, Proteome science 2023 - “...cholesterol ester hydrolase 1 0.00098 Q8BHC1 Rab39b Ras-related protein Rab-39B 0.002247 Q8BGZ6 Gla Alpha-galactosidase 0.010911 Q8BGQ7 Aars Alanine--tRNA ligase, cytoplasmic 8.69E-05 Q8BGB5 Limd2 LIM domain-containing protein 2 0.000117 Q8BG16 Slc6a15 Sodium-dependent neutral amino acid transporter B(0)AT2 0.001698 Q8BFW6 Entpd3 Ectonucleoside triphosphate diphosphohydrolase 3 0.0032 Q80ZM5...”
- The amyloid peptide β disrupts intercellular junctions and increases endothelial permeability in a NADPH oxidase 1-dependent manner.
Tarafdar, Redox biology 2022 - “...Q8BP47 Asparagine--tRNA ligase, cytoplasmic NARS1 0.0091134 Q8K0B2 Lysosomal cobalamin transport escort protein LMBD1 Lmbrd1 0.0095729 Q8BGQ7 Alanine--tRNA ligase, cytoplasmic Aars1 0.0102687 Q80X50 Ubiquitin-associated protein 2-like Ubap2l 0.0116174 Q8BJW6 Eukaryotic translation initiation factor 2A Eif2a 0.0116343 Q69ZR2 E3 ubiquitin-protein ligase HECTD1 Hectd1 0.0119589 Q9JK81 MYG1 exonuclease Myg1...”
- Proteomic profiling of the interface between the stomach wall and the pancreas in dystrophinopathy
Dowling, European journal of translational myology 2021 - “...Cpt2 6 0.013181 2.1 P63038 60 kDa heat shock protein, mitochondrial Hspd1 5 0.007267 2.1 Q8BGQ7 Alanine--tRNA ligase, cytoplasmic Aars 3 0.007629 2.1 P38647 Stress-70 protein, mitochondrial Hspa9 7 0.000879 2 P56480 ATP synthase subunit beta, mitochondrial Atp5f1b 3 0.02928 2 P02088 Hemoglobin subunit beta-1 Hbb-b1...”
- Proteomics of autism and Alzheimer's mouse models reveal common alterations in mTOR signaling pathway.
Mencer, Translational psychiatry 2021 - “...1 Q3V1U8 ELMO domain-containing protein 1 Q6PDI5 Proteasome-associated protein ECM29 homolog Q06335 Amyloid-like protein 2 Q8BGQ7 Alanine--tRNA ligase, cytoplasmic Q8C779 Uncharacterized protein CXorf57 homolog Q9QZE7 Translin-associated protein X Q9QZQ1 Afadin Q91Y44 Bromodomain testis-specific protein Q8BUH8 Sentrin-specific protease 7 Q3UQ44 Ras GTPase-activating-like protein IQGAP2 Q8CHY6 Transcriptional repressor...”
- Protein Expression Analysis of an In Vitro Murine Model of Prostate Cancer Progression: Towards Identification of High-Potential Therapeutic Targets
Bahmad, Journal of personalized medicine 2020 - “...shock 70 kDa protein 1B Hspa1b 70,176 -1.3879 0.0209 O55131 Septin-7 Sept7 50,550 1.4336 0.0409 Q8BGQ7 Alanine--tRNA ligase, cytoplasmic Aars 106,909 1.4824 0.0113 Q99K85 Phosphoserine aminotransferase Psat1 40,473 1.4994 0.0014 Q6IRU2 Tropomyosin alpha-4 chain Tpm4 28,468 1.5363 0.0019 P62814 V-type proton ATPase subunit B, brain isoform...”
- The G3-U70-independent tRNA recognition by human mitochondrial alanyl-tRNA synthetase.
Zeng, Nucleic acids research 2019 - “...No. P40825); hAlaRS, human cytoplasmic AlaRS (Uniprot No. P49588); mAlaRS, mouse cytoplasmic AlaRS (Uniprot No. Q8BGQ7); Sc AlaRS, S. cerevisiae cytoplasmic AlaRS (Uniprot No. P40825); Tt AlaRS, Thermus thermophilus AlaRS (Uniprot No. P74941) and Ph AlaRS, Pyrococcus horikoshii AlaRS (Uniprot No. O58035). ( B ) Location...”
- Proteomic Analysis of Secretomes of Oncolytic Herpes Simplex Virus-Infected Squamous Cell Carcinoma Cells.
Tada, Cancers 2018 - “...72 16 11.3 1.41 0.00015 * Q9D8N0 Elongation factor 1-gamma 50 16.7 10.3 1.61 0.18 Q8BGQ7 Alanine--tRNA ligase, cytoplasmic 107 15.7 10.3 1.51 0.21 Q61553 Fascin 55 15 10.3 1.45 0.049 * Q8CIE6 Coatomer subunit alpha 138 14.7 10 1.47 0.025 * P21550 Beta-enolase 47 38...”
- Proteomics Reveals Scope of Mycolactone-mediated Sec61 Blockade and Distinctive Stress Signature.
Morel, Molecular & cellular proteomics : MCP 2018
- The transcriptome of metamorphosing flatfish
Alves, BMC genomics 2016 - “...lcst_c28836 Extracellular matrix protein FRAS1 fras1 Q80T14 Mus musculus 1E-70 lcst_c6455 Alanine-trna ligase, cytoplasmic aars Q8BGQ7 Mus musculus 0 lcst_c50366 Nerve growth factor receptor (TNFR superfamily, member 16) ngfr Q8CFT3 Mus musculus 7E-34 lcst_c1016 Delta(24)-sterol reductase dhcr24 Q8VCH6 Mus musculus 0 lcst_c60747 Long-chain fatty acid transport...”
- Integrated analysis of proteome and transcriptome changes in the mucopolysaccharidosis type VII mouse hippocampus
Parente, Molecular genetics and metabolism 2016 - “...Nuclear Transport Factor 2 (Ntf2); nuclear transport factor 2; P84086 Cplx2 0.015 -2.58 complexin 2 Q8BGQ7 Aars 0.026 -2.63 alanyl-tRNA synthetase Q8VCT3 Rnpep 0.020 -2.65 arginyl aminopeptidase (aminopeptidase B) Q80Y14 Glrx5 0.034 -2.68 glutaredoxin 5 homolog (S. cerevisiae) Q9CQ85 Timm22 0.017 -2.69 translocase of inner mitochondrial...”
- Effect of diets supplemented with different conjugated linoleic acid (CLA) isomers on protein expression in C57/BL6 mice.
Della, Genes & nutrition 2016 - “...10 33.3 6.9 45 P50236 8 Alanine-TRNA ligase, cytoplasmic (SYAC) Aars 1 106.9 5.4 34 Q8BGQ7 9 Peroxisomal acyl-coenzyme A oxidase 1 (ACOX1) Acox1 15 74.7 8.6 177 Q9R0H0 a 2-DE gel image spot number presented in Fig. 1 b Commonly used protein name c Gene...”
- In vivo substrates of the lens molecular chaperones αA-crystallin and αB-crystallin.
Andley, PloS one 2014 - “...( p -value) WT vs. heterozygous WT vs. homozygous Heterozygous/ homozygous 593 Alanyl-tRNA synthetase, cytoplasmic Q8BGQ7 107 12 2.01/0.035 2.28/0.021 1.13/0.43 A-crystallin Q569M7 20 2 T-complex protein 1 subunit theta P42932 60 2 Glutathione synthetase P51855 52 2 884 A3/A1-crystallin Q9QXC6 25 1 1.02/0.83 1.65/0.017 1.62/0.0048...”
- Inhibition of mitochondrial aconitase by succination in fumarate hydratase deficiency.
Ternette, Cell reports 2013 - “...Accession No. Gene Symbol Protein Name Succination Site(s) Source PSMs Sequence Coverage 2SC Peptide Instances Q8BGQ7 Aars alanine-tRNA ligase, cytoplasmic C403 M(c) 302 49.9% 3 Q99KI0 Aco2 aconitate hydratase, mitochondrial C385, C448, C451 M(m) 1,340 66.7% C385(30), C451(3) Q9R0X4 Acot9 acyl-coenzyme A thioesterase 9, mitochondrial C154...”
For advice on how to use these tools together, see
Interactive tools for functional annotation of bacterial genomes.
The PaperBLAST database links 793,807 different protein sequences to 1,259,118 scientific articles. Searches against EuropePMC were last performed on March 13 2025.
PaperBLAST builds a database of protein sequences that are linked
to scientific articles. These links come from automated text searches
against the articles in EuropePMC
and from manually-curated information from GeneRIF, UniProtKB/Swiss-Prot,
BRENDA,
CAZy (as made available by dbCAN),
BioLiP,
CharProtDB,
MetaCyc,
EcoCyc,
TCDB,
REBASE,
the Fitness Browser,
and a subset of the European Nucleotide Archive with the /experiment tag.
Given this database and a protein sequence query,
PaperBLAST uses protein-protein BLAST
to find similar sequences with E < 0.001.
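The E-value cutoff above can be illustrated with a minimal Python sketch. It assumes the BLAST results are available as tab-separated lines in the standard 12-column tabular layout (BLAST+ `-outfmt 6`), where the E-value is the 11th column and the bit score is the 12th. The names `Hit` and `parse_hits` are hypothetical and are not taken from PaperBLAST's own code.

```python
from typing import Iterable, List, NamedTuple

E_VALUE_CUTOFF = 0.001  # threshold stated in the text (E < 0.001)


class Hit(NamedTuple):
    query: str
    subject: str
    percent_identity: float
    evalue: float
    bitscore: float


def parse_hits(lines: Iterable[str], max_evalue: float = E_VALUE_CUTOFF) -> List[Hit]:
    """Keep only the hits whose E-value is strictly below max_evalue.

    Assumes the default 12-column tabular BLAST output:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    """
    hits = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        if len(fields) < 12:
            continue  # not a complete tabular row
        evalue = float(fields[10])
        if evalue < max_evalue:
            hits.append(Hit(fields[0], fields[1], float(fields[2]), evalue, float(fields[11])))
    return hits
```

Calling `parse_hits(open("hits.tsv"))` on such a file (the file name is only an example) returns the hits that pass the cutoff, in their original order.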
To build the database, we query EuropePMC with locus tags, with RefSeq protein
identifiers, and with UniProt
accessions. We obtain the locus tags from RefSeq or from MicrobesOnline. We use
queries of the form "locus_tag AND genus_name" to try to ensure that
the paper is actually discussing that gene. Because EuropePMC indexes
most recent biomedical papers, even if they are not open access, some
of the links may be to papers that you cannot read or that our
computers cannot read. We query each of these identifiers that
appears in the open access part of EuropePMC, as well as every locus
tag that appears in the 500 most-referenced genomes, so that a gene
may appear in the PaperBLAST results even though none of the papers
that mention it are open access. We also incorporate text-mined links
from EuropePMC that link open access articles to UniProt or RefSeq
identifiers. (This yields some additional links because EuropePMC
uses different heuristics for their text mining than we do.)
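As a rough illustration of the query form described above, the following Python sketch builds a "locus_tag AND genus_name" string. The function name `build_query`, the rule of taking the first word of the organism name as the genus, and the example values are assumptions made for illustration; the actual PaperBLAST code may construct its queries differently.

```python
def build_query(locus_tag: str, organism_name: str) -> str:
    """Return a search string of the form 'locus_tag AND genus_name'.

    Assumption: the genus is the first whitespace-separated word of the
    organism name, e.g. 'Escherichia' for 'Escherichia coli K-12'.
    """
    words = organism_name.split()
    if not locus_tag.strip() or not words:
        raise ValueError("both a locus tag and an organism name are required")
    return f"{locus_tag.strip()} AND {words[0]}"
```

For example, `build_query("b0001", "Escherichia coli K-12")` gives `b0001 AND Escherichia`. Requiring the genus alongside the locus tag is what reduces false matches when a locus tag happens to coincide with an unrelated string in a paper.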
For every article that mentions a locus tag, a RefSeq protein
identifier, or a UniProt accession, we try to select one or two
snippets of text that refer to the protein. If we cannot get access to
the full text, we try to select a snippet from the abstract, but
unfortunately, unique identifiers such as locus tags are rarely
provided in abstracts.
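One simple way to picture snippet selection is a fixed window of words around each mention of the identifier. The Python sketch below is only an assumption about how such a step could work: the function name `select_snippets`, the window size, and the rule for skipping overlapping mentions are all hypothetical, and PaperBLAST's real heuristics are not described here.

```python
from typing import List


def select_snippets(text: str, identifier: str,
                    context_words: int = 10, max_snippets: int = 2) -> List[str]:
    """Return up to max_snippets windows of words centred on the identifier."""
    words = text.split()
    snippets: List[str] = []
    last_end = -1  # index of the last word already covered by a snippet
    for i, word in enumerate(words):
        if identifier not in word:
            continue
        if i <= last_end:
            continue  # this mention already falls inside the previous snippet
        start = max(0, i - context_words)
        end = min(len(words), i + context_words + 1)
        snippets.append("..." + " ".join(words[start:end]) + "...")
        last_end = end - 1
        if len(snippets) == max_snippets:
            break
    return snippets
```

The same function can be applied to an abstract when the full text is unavailable; it simply returns an empty list if the identifier never appears, which matches the common case noted above.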
PaperBLAST also incorporates manually-curated protein functions:
- Proteins from NCBI's RefSeq are included if a
GeneRIF
entry links the gene to an article in
PubMed®.
GeneRIF also provides a short summary of the article's claim about the
protein, which is shown instead of a snippet.
- Proteins from Swiss-Prot (the curated part of UniProt)
are included if the curators
identified experimental evidence for the protein's function (evidence
code ECO:0000269). For these proteins, the fields of the Swiss-Prot entry that
describe the protein's function are shown (with bold headings).
- Proteins from BRENDA,
a curated database of enzymes, are included if they are linked to a paper in PubMed
and their full sequence is known.
- Every protein from the non-redundant subset of
BioLiP,
a database
of ligand-binding sites and catalytic residues in protein structures, is included. Since BioLiP itself
does not include descriptions of the proteins, those are taken from the
Protein Data Bank.
Descriptions from PDB rely on the original submitter of the
structure and cannot be updated by others, so they may be less reliable.
(For SitesBLAST and Sites on a Tree, we use a larger subset of BioLiP so that every
ligand is represented among a group of structures with similar sequences, but for
PaperBLAST, we use the non-redundant set provided by BioLiP.)
- Every protein from EcoCyc, a curated
database of the proteins in Escherichia coli K-12, is included, regardless
of whether it is characterized or not.
- Proteins from the MetaCyc metabolic pathway database
are included if they are linked to a paper in PubMed and their full sequence is known.
- Proteins from the Transport Classification Database (TCDB)
are included if they have known substrate(s), have reference(s),
and are not described as uncharacterized or putative.
(Some of the references are not visible on the PaperBLAST web site.)
- Every protein from CharProtDB,
a database of experimentally characterized protein annotations, is included.
- Proteins from the CAZy database of carbohydrate-active enzymes
are included if they are associated with an Enzyme Classification number.
Even though CAZy does not provide links from individual protein sequences to papers,
these should all be experimentally-characterized proteins.
- Proteins from the REBASE database
of restriction enzymes are included if they have known specificity.
- Every protein with an evidence-based reannotation (based on mutant phenotypes)
in the Fitness Browser is included.
- Sequence-specific transcription factors (including sigma factors and DNA-binding response regulators)
with experimentally-determined DNA binding sites from the
PRODORIC database of gene regulation in prokaryotes are included.
- Putative transcription factors from RegPrecise
that have manually-curated predictions for their binding sites are included. These predictions are based on
conserved putative regulatory sites across genomes that contain similar transcription factors,
so PaperBLAST clusters the TFs at 70% identity and retains just one member of each cluster.
- Coding sequence (CDS) features from the
European Nucleotide Archive (ENA)
are included if the /experiment tag is set (implying that there is experimental evidence for the annotation),
the nucleotide entry links to paper(s) in PubMed,
and the nucleotide entry is from the STD data class
(implying that these are targeted annotated sequences, not from shotgun sequencing).
Also, to filter out genes whose transcription or translation was detected, but whose function
was not studied, nucleotide entries or papers with more than 25 such proteins are excluded.
Descriptions from ENA rely on the original submitter of the
sequence and cannot be updated by others, so they may be less reliable.
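The ENA inclusion rules in the last item of the list above can be sketched as a filter in Python. The record type `CdsFeature`, its field names, and the function `filter_cds` are hypothetical. The sketch also makes one interpretive assumption: a feature is dropped if its nucleotide entry has more than 25 qualifying proteins, or if every paper linked to it has more than 25 qualifying proteins.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Tuple

MAX_PROTEINS = 25  # limit stated in the text


@dataclass(frozen=True)
class CdsFeature:
    protein_id: str
    entry_id: str                 # nucleotide entry that carries the CDS
    data_class: str               # e.g. "STD"
    has_experiment_tag: bool      # True if the /experiment tag is set
    pubmed_ids: Tuple[str, ...]   # papers linked from the nucleotide entry


def filter_cds(features: List[CdsFeature]) -> List[CdsFeature]:
    """Apply the three per-feature rules, then the 25-protein limits."""
    candidates = [
        f for f in features
        if f.has_experiment_tag and f.pubmed_ids and f.data_class == "STD"
    ]
    # Count qualifying proteins per nucleotide entry and per linked paper.
    per_entry = Counter(f.entry_id for f in candidates)
    per_paper = Counter(p for f in candidates for p in f.pubmed_ids)
    return [
        f for f in candidates
        if per_entry[f.entry_id] <= MAX_PROTEINS
        and any(per_paper[p] <= MAX_PROTEINS for p in f.pubmed_ids)
    ]
```

Counting is done after the per-feature rules so that the limit applies only to "such proteins", that is, to features that already carry the /experiment tag, a PubMed link, and the STD data class.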
Except for GeneRIF and ENA,
the curated entries include a short curated
description of the protein's function.
For entries from BioLiP, the protein's function may not be known beyond binding to the ligand.
Many of these entries also link to articles in PubMed.
For more information see the
PaperBLAST paper (mSystems 2017)
or the code.
You can download PaperBLAST's database here.
Changes to PaperBLAST since the paper was written:
- November 2023: incorporated PRODORIC and RegPrecise. Many PRODORIC entries were not linked to a protein sequence (no UniProt identifier), so we added this information.
- February 2023: BioLiP changed their download format. PaperBLAST now includes their non-redundant subset. SitesBLAST and Sites on a Tree use a larger non-redundant subset that ensures that every ligand is represented within each cluster. This should ensure that every binding site is represented.
- June 2022: incorporated some coding sequences from ENA with the /experiment tag.
- March 2022: incorporated BioLiP.
- April 2020: incorporated TCDB.
- April 2019: EuropePMC now returns table entries in their search results. This has expanded PaperBLAST's database, but most of the new entries are of low relevance, and the resulting snippets are often just lists of locus tags with annotations.
- February 2018: the alignment page reports the conservation of the hit's functional sites (if available from Swiss-Prot or UniProt).
- January 2018: incorporated BRENDA.
- December 2017: incorporated MetaCyc, CharProtDB, CAZy, REBASE, and the reannotations from the Fitness Browser.
- September 2017: EuropePMC no longer returns some table entries in their search results. This has shrunk PaperBLAST's database, but has also reduced the number of low-relevance hits.
Many of these changes are described in Interactive tools for functional annotation of bacterial genomes.
PaperBLAST cannot provide snippets for many of the papers that are
published in non-open-access journals. This limitation applies even if
the paper is marked as "free" on the publisher's web site and is
available in PubMed Central or EuropePMC. If a journal that you publish
in is marked as "secret," please consider publishing elsewhere.
Many important articles are missing from PaperBLAST, either because
the article's full text is not in EuropePMC (as for many older
articles), or because the paper does not mention a protein identifier such as a locus tag, or because of PaperBLAST's heuristics. If you notice an
article that characterizes a protein's function but is missing from
PaperBLAST, please notify the curators at UniProt
or add an entry to GeneRIF.
Entries in either of these databases will eventually be incorporated
into PaperBLAST. Note that to add an entry to UniProt, you will need
to find the UniProt identifier for the protein. If the protein is not
already in UniProt, you can ask them to create an entry. To add an
entry to GeneRIF, you will need an NCBI Gene identifier, but
unfortunately many prokaryotic proteins in RefSeq do not have
corresponding Gene identifiers.
References
PaperBLAST: Text-mining papers for information about homologs.
M. N. Price and A. P. Arkin (2017). mSystems, 10.1128/mSystems.00039-17.
Europe PMC in 2017.
M. Levchenko et al (2017). Nucleic Acids Research, 10.1093/nar/gkx1005.
Gene indexing: characterization and analysis of NLM's GeneRIFs.
J. A. Mitchell et al (2003). AMIA Annu Symp Proc 2003:460-464.
UniProt: the universal protein knowledgebase.
The UniProt Consortium (2016). Nucleic Acids Research, 10.1093/nar/gkw1099.
BRENDA in 2017: new perspectives and new tools in BRENDA.
S. Placzek et al (2017). Nucleic Acids Research, 10.1093/nar/gkw952.
The EcoCyc database: reflecting new knowledge about Escherichia coli K-12.
I. M. Keseler et al (2016). Nucleic Acids Research, 10.1093/nar/gkw1003.
The MetaCyc database of metabolic pathways and enzymes.
R. Caspi et al (2018). Nucleic Acids Research, 10.1093/nar/gkx935.
CharProtDB: a database of experimentally characterized protein annotations.
R. Madupu et al (2012). Nucleic Acids Research, 10.1093/nar/gkr1133.
The carbohydrate-active enzymes database (CAZy) in 2013.
V. Lombard et al (2014). Nucleic Acids Research, 10.1093/nar/gkt1178.
The Transporter Classification Database (TCDB): recent advances.
M. H. Saier, Jr. et al (2016). Nucleic Acids Research, 10.1093/nar/gkv1103.
REBASE - a database for DNA restriction and modification: enzymes, genes and genomes.
R. J. Roberts et al (2015). Nucleic Acids Research, 10.1093/nar/gku1046.
Deep annotation of protein function across diverse bacteria from mutant phenotypes.
M. N. Price et al (2016). bioRxiv, 10.1101/072470.
by Morgan Price,
Arkin group
Lawrence Berkeley National Laboratory