PaperBLAST – Find papers about a protein or its homologs

 


PaperBLAST Hits for ABZR87_RS02730 (38 a.a., MKVLASVKRI...)

Other sequence analysis tools:

Find functional residues: SitesBLAST

Search for conserved domains

Find the best match in UniProt

Compare to protein structures

Predict transmembrane helices: Phobius

Predict protein localization: PSORTb

Find homologs in fast.genomics


Found 50 similar proteins in the literature:

Bxe_A0336 50S ribosomal protein L36 from Burkholderia xenovorans LB400
Bxe_A0336 50S ribosomal protein L36 from Paraburkholderia xenovorans LB400
95% identity, 100% coverage

bglu_1g02800 50S ribosomal protein L36 from Burkholderia glumae BGR1
95% identity, 83% coverage

PP0475 ribosomal protein L36 from Pseudomonas putida KT2440
PP_0475 50S ribosomal protein L36 from Pseudomonas putida KT2440
82% identity, 100% coverage

YPO0230 50S ribosomal protein L36 from Yersinia pestis CO92
YPTB3677 50S ribosomal protein L36 from Yersinia pseudotuberculosis IP 32953
79% identity, 100% coverage

8cd16 / Q9HWF6 8cd16 (see paper)
PA4242 50S ribosomal protein L36 from Pseudomonas aeruginosa PAO1
Q9HWF6 Large ribosomal subunit protein bL36A from Pseudomonas aeruginosa (strain ATCC 15692 / DSM 22644 / CIP 104116 / JCM 14847 / LMG 12228 / 1C / PRS 101 / PAO1)
79% identity, 100% coverage

SecX / b3299 50S ribosomal subunit protein L36 from Escherichia coli K-12 substr. MG1655 (see 11 papers)
rpmJ / P0A7Q6 50S ribosomal subunit protein L36 from Escherichia coli (strain K12) (see 13 papers)
RL36_ECOLI / P0A7Q6 Large ribosomal subunit protein bL36A; 50S ribosomal protein L36; Ribosomal protein B from Escherichia coli (strain K12) (see 10 papers)
8a3l3 / P0A7Q6 8a3l3 (see paper)
SENTW_3546 50S ribosomal protein L36 from Salmonella enterica subsp. enterica serovar Weltevreden str.
NP_417758 50S ribosomal subunit protein L36 from Escherichia coli str. K-12 substr. MG1655
STM3419 50S ribosomal subunit protein X from Salmonella enterica subsp. enterica serovar Typhimurium str. LT2
NP_462323 50S ribosomal subunit protein X from Salmonella typhimurium LT2
b3299 50S ribosomal protein L36 from Escherichia coli str. K-12 substr. MG1655
SEN3247 50S ribosomal subunit protein L36 from Salmonella enterica subsp. enterica serovar Enteritidis str. P125109
ECs4164 50S ribosomal subunit protein L36 from Escherichia coli O157:H7 str. Sakai
76% identity, 100% coverage

HI0798.1 ribosomal protein L36 (rpL36) from Haemophilus influenzae Rd KW20
79% identity, 100% coverage

7m4v3 / B7IA18 A. baumannii ribosome-eravacycline complex: 50S (see paper)
ACIAD3198 50S ribosomal protein L36 from Acinetobacter sp. ADP1
AbA118F_2920 50S ribosomal protein L36 from Acinetobacter baumannii
74% identity, 100% coverage

VC2575 ribosomal protein L36 from Vibrio cholerae O1 biovar eltor str. N16961
B7C60_RS03515 50S ribosomal protein L36 from Vibrio fujianensis
82% identity, 100% coverage

Alvin_2342 ribosomal protein L36 from Allochromatium vinosum DSM 180
79% identity, 100% coverage

FTT_0346 50S ribosomal protein L36 from Francisella tularensis subsp. tularensis SCHU S4
82% identity, 100% coverage

SO0252 ribosomal protein L36 from Shewanella oneidensis MR-1
79% identity, 100% coverage

LOC18054512 uncharacterized protein LOC18054512 from Citrus x clementina
63% identity, 29% coverage

SP60_05345 50S ribosomal protein L36 from Candidatus Thioglobus autotrophicus
82% identity, 100% coverage

8rd8ep / A0A0M4U3Z5 8rd8ep (see paper)
66% identity, 100% coverage

HP1297 ribosomal protein L36 (rpl36) from Helicobacter pylori 26695
P56058 Large ribosomal subunit protein bL36 from Helicobacter pylori (strain ATCC 700392 / 26695)
74% identity, 100% coverage

AT5G20180 ribosomal protein L36 family protein from Arabidopsis thaliana
63% identity, 37% coverage

4v61B6 4v61B6 (see paper)
63% identity, 100% coverage

HPG27_RS06525 50S ribosomal protein L36 from Helicobacter pylori
71% identity, 100% coverage

LSEI_2480 Ribosomal protein L36 from Lactobacillus casei ATCC 334
61% identity, 100% coverage

NMB0164 50S ribosomal protein L36 from Neisseria meningitidis MC58
NMA0107 50S ribosomal protein L36 from Neisseria meningitidis Z2491
76% identity, 100% coverage

sml0006 50S ribosomal protein L36 from Synechocystis sp. PCC 6803
63% identity, 100% coverage

SPy0076 50S ribosomal protein B from Streptococcus pyogenes M1 GAS
66% identity, 100% coverage

Dde_2235 50S ribosomal protein L36 from Oleidesulfovibrio alaskensis G20
Dde_2235 Ribosomal protein L36 from Desulfovibrio desulfuricans G20
68% identity, 100% coverage

7nhk8 / A0A1B4XKT9 7nhk8 (see paper)
Q839E1 Large ribosomal subunit protein bL36 from Enterococcus faecalis (strain ATCC 700802 / V583)
IUJ47_RS04640 50S ribosomal protein L36 from Enterococcus faecalis
63% identity, 100% coverage

CNAG_01974 large subunit ribosomal protein L36 from Cryptococcus neoformans var. grubii H99
61% identity, 36% coverage

Q04BZ2 Large ribosomal subunit protein bL36 from Lactobacillus delbrueckii subsp. bulgaricus (strain ATCC BAA-365 / Lb-18)
63% identity, 100% coverage

I872_00600 50S ribosomal protein L36 from Streptococcus cristatus AS 1.3089
llmg_2357 50S ribosomal protein L36 from Lactococcus lactis subsp. cremoris MG1363
SSUSC84_0092 50S ribosomal protein L36 from Streptococcus suis SC84
63% identity, 100% coverage

YSS_RS00895 50S ribosomal protein L36 from Campylobacter coli RM4661
63% identity, 100% coverage

7ood2 / P52864 Mycoplasma pneumoniae 50S subunit of ribosomes in chloramphenicol-treated cells (see paper)
71% identity, 100% coverage

EKO22_07880 50S ribosomal protein L36 from Synechococcus elongatus PCC 11802
66% identity, 100% coverage

RL36_THET8 / Q5SHR2 Large ribosomal subunit protein bL36; 50S ribosomal protein L36; Ribosomal protein B from Thermus thermophilus (strain ATCC 27634 / DSM 579 / HB8) (see paper)
RL36_THETH / P80256 Large ribosomal subunit protein bL36; 50S ribosomal protein L36; Ribosomal protein B from Thermus thermophilus (see paper)
68% identity, 100% coverage

A5IV11 Large ribosomal subunit protein bL36 from Staphylococcus aureus (strain JH9)
Q2FW29 Large ribosomal subunit protein bL36 from Staphylococcus aureus (strain NCTC 8325 / PS 47)
SAS078 50S ribosomal protein L36 from Staphylococcus aureus subsp. aureus N315
SAOUHSC_02488 ribosomal protein L36 from Staphylococcus aureus subsp. aureus NCTC 8325
USA300HOU_2218 ribosomal protein L36 from Staphylococcus aureus subsp. aureus USA300_TCH1516
SACOL2216 ribosomal protein L36 from Staphylococcus aureus subsp. aureus COL
ACIV1F_002546, B4602_RS11710, EKM74_RS05515 50S ribosomal protein L36 from Staphylococcus aureus
68% identity, 100% coverage

B3DFA8 Large ribosomal subunit protein bL36 from Microcystis aeruginosa (strain NIES-843 / IAM M-2473)
63% identity, 100% coverage

TP0209 ribosomal protein L36 (rpmJ-1) from Treponema pallidum subsp. pallidum str. Nichols
TPANIC_RS01040 50S ribosomal protein L36 from Treponema pallidum subsp. pallidum str. Nichols
63% identity, 100% coverage

MSMEG_1520 ribosomal protein L36 from Mycobacterium smegmatis str. MC2 155
68% identity, 100% coverage

MT3567.1 50S ribosomal protein L36 from Mycobacterium tuberculosis CDC1551
Rv3461c 50S ribosomal protein L36 from Mycobacterium tuberculosis H37Rv
68% identity, 100% coverage

CAC3108 Ribosomal protein L36 from Clostridium acetobutylicum ATCC 824
CD0094A 50S ribosomal protein L36 from Clostridium difficile 630
66% identity, 100% coverage

9c4g3 / A0A2B7JN22 Cutibacterium acnes 50S ribosomal subunit with clindamycin bound (see paper)
66% identity, 100% coverage

RL36_BACSU / P20278 Large ribosomal subunit protein bL36; 50S ribosomal protein L36; BL38; Ribosomal protein B; Ribosomal protein II from Bacillus subtilis (strain 168) (see paper)
7as8j / P20278 Bacillus subtilis ribosome quality control complex state B. Ribosomal 50S subunit with P-tRNA, RqcH, and RqcP/YabO (see paper)
BSU01400 50S ribosomal protein L36 from Bacillus subtilis subsp. subtilis str. 168
63% identity, 100% coverage

SCO4726 50S ribosomal protein L36 from Streptomyces coelicolor A3(2)
63% identity, 100% coverage

RTC6_NEUCR / Q7S4E7 Large ribosomal subunit protein bL36m from Neurospora crassa (strain ATCC 24698 / 74-OR23-1A / CBS 708.71 / DSM 1257 / FGSC 987) (see paper)
45% identity, 31% coverage

Q71WG9 Large ribosomal subunit protein bL36 from Listeria monocytogenes serotype 4b (strain F2365)
61% identity, 100% coverage

RTC6_SCHPO / O94690 Large ribosomal subunit protein bL36m; 54S ribosomal protein rtc6, mitochondrial from Schizosaccharomyces pombe (strain 972 / ATCC 24843) (Fission yeast) (see paper)
53% identity, 41% coverage

BC0155 LSU ribosomal protein L36P from Bacillus cereus ATCC 14579
Q81VQ6 Large ribosomal subunit protein bL36 from Bacillus anthracis
61% identity, 100% coverage

RL36_DEIRA / Q9RSK0 Large ribosomal subunit protein bL36; 50S ribosomal protein L36 from Deinococcus radiodurans (strain ATCC 13939 / DSM 20539 / JCM 16871 / CCUG 27074 / LMG 4051 / NBRC 15346 / NCIMB 9279 / VKM B-1422 / R1) (see 6 papers)
3cf54 / Q9RSK0 Thiopeptide antibiotic thiostrepton bound to the large ribosomal subunit of Deinococcus radiodurans (see paper)
61% identity, 100% coverage

6ywe0 / Q7S4E7 structure of the mitoribosome from Neurospora crassa in the P/E tRNA bound state (see paper)
45% identity, 83% coverage

3j6b0 / O14464 of the yeast mitochondrial large ribosomal subunit (see paper)
47% identity, 100% coverage

6z1pAG / W7XH61 6z1pAG (see paper)
50% identity, 100% coverage

RTC6_YEAST / O14464 Large ribosomal subunit protein bL36m; 54S ribosomal protein RTC6, mitochondrial; Restriction of telomere capping protein 6; Translation associated element 4 from Saccharomyces cerevisiae (strain ATCC 204508 / S288c) (Baker's yeast) (see 6 papers)
YPL183W-A Homolog of the prokaryotic ribosomal protein L36, likely to be a mitochondrial ribosomal protein coded in the nuclear genome from Saccharomyces cerevisiae
47% identity, 41% coverage


For advice on how to use these tools together, see Interactive tools for functional annotation of bacterial genomes.

Statistics

The PaperBLAST database links 793,807 different protein sequences to 1,259,118 scientific articles. Searches against EuropePMC were last performed on March 13, 2025.

How It Works

PaperBLAST builds a database of protein sequences that are linked to scientific articles. These links come from automated text searches against the articles in EuropePMC and from manually-curated information from GeneRIF, UniProtKB/Swiss-Prot, BRENDA, CAZy (as made available by dbCAN), BioLiP, CharProtDB, MetaCyc, EcoCyc, TCDB, REBASE, the Fitness Browser, and a subset of the European Nucleotide Archive with the /experiment tag. Given this database and a protein sequence query, PaperBLAST uses protein-protein BLAST to find similar sequences with E < 0.001.
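The E-value cutoff described above can be illustrated with a minimal Python sketch. It assumes the BLAST results are available in the standard tabular format (`-outfmt 6`) with the default 12 columns, where the E-value is the 11th column; the names `Hit` and `filter_hits` are hypothetical, and PaperBLAST's own code may handle this differently.

```python
from typing import Iterable, List, NamedTuple

E_VALUE_CUTOFF = 1e-3  # threshold stated in the text (E < 0.001)


class Hit(NamedTuple):
    subject_id: str
    percent_identity: float
    evalue: float


def filter_hits(tabular_lines: Iterable[str],
                cutoff: float = E_VALUE_CUTOFF) -> List[Hit]:
    """Keep hits whose E-value is strictly below the cutoff.

    Assumes BLAST tabular output (-outfmt 6) with the default columns:
    qseqid sseqid pident length mismatch gapopen qstart qend
    sstart send evalue bitscore
    """
    hits = []
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        evalue = float(fields[10])
        if evalue < cutoff:
            hits.append(Hit(fields[1], float(fields[2]), evalue))
    return hits


# Made-up example rows, for illustration only.
example = [
    "query\tsubjA\t95.0\t38\t2\t0\t1\t38\t1\t38\t2e-18\t70.1",
    "query\tsubjB\t40.0\t20\t12\t0\t5\t24\t3\t22\t0.5\t25.0",
]
kept = filter_hits(example)
```

With these made-up rows, only `subjA` passes the filter, because its E-value is below 0.001.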

To build the database, we query EuropePMC with locus tags, with RefSeq protein identifiers, and with UniProt accessions. We obtain the locus tags from RefSeq or from MicrobesOnline. We use queries of the form "locus_tag AND genus_name" to try to ensure that the paper is actually discussing that gene. Because EuropePMC indexes most recent biomedical papers, even if they are not open access, some of the links may be to papers that you cannot read or that our computers cannot read. We query each of these identifiers that appears in the open access part of EuropePMC, as well as every locus tag that appears in the 500 most-referenced genomes, so that a gene may appear in the PaperBLAST results even though none of the papers that mention it are open access. We also incorporate text-mined links from EuropePMC that link open access articles to UniProt or RefSeq identifiers. (This yields some additional links because EuropePMC uses different heuristics for their text mining than we do.)
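As a rough illustration of the "locus_tag AND genus_name" query form, the sketch below builds such a query string from a locus tag and an organism name. Taking the genus from the first word of the organism name, and skipping a leading "Candidatus", are assumptions made for this example; the function names are hypothetical and no request is sent to EuropePMC.

```python
def genus_from_organism(organism: str) -> str:
    """Return the genus from an organism name.

    Assumption for this sketch: the genus is the first word, or the
    second word when the name begins with "Candidatus".
    """
    words = organism.split()
    if not words:
        raise ValueError("organism name is empty")
    if words[0] == "Candidatus" and len(words) > 1:
        return words[1]
    return words[0]


def build_query(locus_tag: str, organism: str) -> str:
    """Build a text query of the form 'locus_tag AND genus_name'."""
    return f"{locus_tag} AND {genus_from_organism(organism)}"


query = build_query("PP_0475", "Pseudomonas putida KT2440")
```

Requiring the genus name alongside the locus tag reduces false matches when the same string appears in papers about unrelated organisms.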

For every article that mentions a locus tag, a RefSeq protein identifier, or a UniProt accession, we try to select one or two snippets of text that refer to the protein. If we cannot get access to the full text, we try to select a snippet from the abstract, but unfortunately, unique identifiers such as locus tags are rarely provided in abstracts.
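One simple way to pick snippets is sketched below in Python. It returns up to two windows of text around whole-token mentions of an identifier. The fixed character window, the rule for skipping overlapping mentions, and the name `select_snippets` are assumptions for illustration, not a description of PaperBLAST's actual heuristics.

```python
import re
from typing import List


def select_snippets(text: str, identifier: str,
                    max_snippets: int = 2, context: int = 60) -> List[str]:
    """Return up to max_snippets text windows around mentions of identifier.

    The identifier must appear as a whole token, so "PP_0475" does not
    match inside "PP_04750".
    """
    pattern = re.compile(
        r"(?<![A-Za-z0-9_])" + re.escape(identifier) + r"(?![A-Za-z0-9_])"
    )
    snippets = []
    last_end = -1
    for match in pattern.finditer(text):
        start = max(0, match.start() - context)
        end = min(len(text), match.end() + context)
        if start < last_end:
            continue  # this mention overlaps the previous snippet
        snippets.append(text[start:end].strip())
        last_end = end
        if len(snippets) == max_snippets:
            break
    return snippets


# Made-up text, for illustration only.
article = (
    "The gene PP_0475 encodes L36. " + "x" * 200
    + " Deleting PP_0475 was lethal. PP_04750 is unrelated."
)
found = select_snippets(article, "PP_0475")
```

The same function can be applied to an abstract when the full text is unavailable; it returns an empty list when the identifier is not mentioned.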

PaperBLAST also incorporates manually-curated protein functions:

Except for GeneRIF and ENA, the curated entries include a short curated description of the protein's function. For entries from BioLiP, the protein's function may not be known beyond binding to the ligand. Many of these entries also link to articles in PubMed.

For more information see the PaperBLAST paper (mSystems 2017) or the code. You can download PaperBLAST's database here.

Changes to PaperBLAST since the paper was written:

Many of these changes are described in Interactive tools for functional annotation of bacterial genomes.

Secrets

PaperBLAST cannot provide snippets for many of the papers that are published in non-open-access journals. This limitation applies even if the paper is marked as "free" on the publisher's web site and is available in PubMed Central or EuropePMC. If a journal that you publish in is marked as "secret," please consider publishing elsewhere.

Omissions from the PaperBLAST Database

Many important articles are missing from PaperBLAST, either because the article's full text is not in EuropePMC (as for many older articles), or because the paper does not mention a protein identifier such as a locus tag, or because of PaperBLAST's heuristics. If you notice an article that characterizes a protein's function but is missing from PaperBLAST, please notify the curators at UniProt or add an entry to GeneRIF. Entries in either of these databases will eventually be incorporated into PaperBLAST. Note that to add an entry to UniProt, you will need to find the UniProt identifier for the protein. If the protein is not already in UniProt, you can ask them to create an entry. To add an entry to GeneRIF, you will need an NCBI Gene identifier, but unfortunately many prokaryotic proteins in RefSeq do not have corresponding Gene identifiers.

References

PaperBLAST: Text-mining papers for information about homologs.
M. N. Price and A. P. Arkin (2017). mSystems, 10.1128/mSystems.00039-17.

Europe PMC in 2017.
M. Levchenko et al (2017). Nucleic Acids Research, 10.1093/nar/gkx1005.

Gene indexing: characterization and analysis of NLM's GeneRIFs.
J. A. Mitchell et al (2003). AMIA Annu Symp Proc 2003:460-464.

UniProt: the universal protein knowledgebase.
The UniProt Consortium (2016). Nucleic Acids Research, 10.1093/nar/gkw1099.

BRENDA in 2017: new perspectives and new tools in BRENDA.
S. Placzek et al (2017). Nucleic Acids Research, 10.1093/nar/gkw952.

The EcoCyc database: reflecting new knowledge about Escherichia coli K-12.
I. M. Keseler et al (2016). Nucleic Acids Research, 10.1093/nar/gkw1003.

The MetaCyc database of metabolic pathways and enzymes.
R. Caspi et al (2018). Nucleic Acids Research, 10.1093/nar/gkx935.

CharProtDB: a database of experimentally characterized protein annotations.
R. Madupu et al (2012). Nucleic Acids Research, 10.1093/nar/gkr1133.

The carbohydrate-active enzymes database (CAZy) in 2013.
V. Lombard et al (2014). Nucleic Acids Research, 10.1093/nar/gkt1178.

The Transporter Classification Database (TCDB): recent advances.
M. H. Saier, Jr. et al (2016). Nucleic Acids Research, 10.1093/nar/gkv1103.

REBASE - a database for DNA restriction and modification: enzymes, genes and genomes.
R. J. Roberts et al (2015). Nucleic Acids Research, 10.1093/nar/gku1046.

Deep annotation of protein function across diverse bacteria from mutant phenotypes.
M. N. Price et al (2016). bioRxiv, 10.1101/072470.

by Morgan Price, Arkin group
Lawrence Berkeley National Laboratory