PaperBLAST – Find papers about a protein or its homologs

 

PaperBLAST Hits for EX31_RS25170 (46 a.a., MKRTFQPSVL...)

Other sequence analysis tools:

Find functional residues: SitesBLAST

Search for conserved domains

Find the best match in UniProt

Compare to protein structures

Predict transmembrane helices: Phobius

Predict protein localization: PSORTb

Find homologs in fast.genomics

Found 70 similar proteins in the literature:

HSM_2021 50S ribosomal protein L34 from Haemophilus somnus 2336
93% identity, 96% coverage

Asuc_2117 ribosomal protein L34 from Actinobacillus succinogenes 130Z
HI0998 ribosomal protein L34 (rpL34) from Haemophilus influenzae Rd KW20
91% identity, 96% coverage

BU013 50S ribosomal protein L34 from Buchnera aphidicola str. APS (Acyrthosiphon pisum)
82% identity, 96% coverage

Bfl015 50s ribosomal protein l34 from Candidatus Blochmannia floridanus
78% identity, 98% coverage

8rd8Ba / A0A0M4TEZ6 8rd8Ba (see paper)
84% identity, 96% coverage

8cd14 / P29436 8cd14 (see paper)
PA5570 50S ribosomal protein L34 from Pseudomonas aeruginosa PAO1
IAU57_33160 50S ribosomal protein L34 from Pseudomonas aeruginosa
84% identity, 96% coverage

B195_022465, FXO12_18260 50S ribosomal protein L34 from Pseudomonas sp. J380
81% identity, 93% coverage

PP0009 ribosomal protein L34 from Pseudomonas putida KT2440
81% identity, 93% coverage

A1S_2984 50S ribosomal protein L34 from Acinetobacter baumannii ATCC 17978
82% identity, 71% coverage

7m4v1 / B7IBH8 A. Baumannii ribosome-eravacycline complex: 50s (see paper)
ACIAD3684 50S ribosomal protein L34 from Acinetobacter sp. ADP1
82% identity, 96% coverage

VDA_003244 50S ribosomal protein L34 from Photobacterium damselae subsp. damselae CIP 102761
83% identity, 91% coverage

VC0007 ribosomal protein L34 from Vibrio cholerae O1 biovar eltor str. N16961
83% identity, 91% coverage

SsaF / b3703 50S ribosomal subunit protein L34 from Escherichia coli K-12 substr. MG1655 (see 31 papers)
rpmH / P0A7P5 50S ribosomal subunit protein L34 from Escherichia coli (strain K12) (see 30 papers)
RL34_ECOLI / P0A7P5 Large ribosomal subunit protein bL34; 50S ribosomal protein L34 from Escherichia coli (strain K12) (see 8 papers)
8a3l1 / P0A7P5 8a3l1 (see paper)
ECs4638 50S ribosomal subunit protein L34 from Escherichia coli O157:H7 str. Sakai
A6TG05 Large ribosomal subunit protein bL34 from Klebsiella pneumoniae subsp. pneumoniae (strain ATCC 700721 / MGH 78578)
B5QUQ1 Large ribosomal subunit protein bL34 from Salmonella enteritidis PT4 (strain P125109)
NP_418158 50S ribosomal subunit protein L34 from Escherichia coli str. K-12 substr. MG1655
P0A7P8 Large ribosomal subunit protein bL34 from Salmonella typhimurium (strain LT2 / SGSC1412 / ATCC 700720)
b3703 50S ribosomal protein L34 from Escherichia coli str. K-12 substr. MG1655
NP_709497 50S ribosomal subunit protein L34 from Shigella flexneri 2a str. 301
SPC_3926 50S ribosomal protein L34 from Salmonella enterica subsp. enterica serovar Paratyphi C strain RKS4594
SEN3656 50s ribosomal protein l34 from Salmonella enterica subsp. enterica serovar Enteritidis str. P125109
Z5194 50S ribosomal subunit protein L34 from Escherichia coli O157:H7 EDL933
93% identity, 100% coverage

BB0440 ribosomal protein L34 (rpmH) from Borrelia burgdorferi B31
69% identity, 88% coverage

SMc04434 PROBABLE 50S RIBOSOMAL PROTEIN L34 from Sinorhizobium meliloti 1021
73% identity, 96% coverage

BF0487 50S ribosomal protein L34 from Bacteroides fragilis YCH46
71% identity, 85% coverage

RL34_DEIRA / Q9RSH2 Large ribosomal subunit protein bL34; 50S ribosomal protein L34 from Deinococcus radiodurans (strain ATCC 13939 / DSM 20539 / JCM 16871 / CCUG 27074 / LMG 4051 / NBRC 15346 / NCIMB 9279 / VKM B-1422 / R1) (see 6 papers)
7a0r2 / Q9RSH2 50s deinococcus radiodurans ribosome bounded with mycinamicin i (see paper)
67% identity, 96% coverage

FTH_0169 ribosomal protein L34 from Francisella tularensis subsp. holarctica OSU18
70% identity, 96% coverage

SUB1659A 50S ribosomal protein L34 from Streptococcus uberis 0140J
70% identity, 96% coverage

PGN_0694 50S ribosomal protein L34 from Porphyromonas gingivalis ATCC 33277
PG0656 ribosomal protein L34 from Porphyromonas gingivalis W83
F452_RS0103440 50S ribosomal protein L34 from Porphyromonas gulae DSM 15663
69% identity, 90% coverage

RL34_BACSU / P05647 Large ribosomal subunit protein bL34; 50S ribosomal protein L34 from Bacillus subtilis (strain 168) (see paper)
Q65CM7 Large ribosomal subunit protein bL34 from Bacillus licheniformis (strain ATCC 14580 / DSM 13 / JCM 2505 / CCUG 7422 / NBRC 12200 / NCIMB 9375 / NCTC 10341 / NRRL NRS-1264 / Gibson 46)
BSU41060 50S ribosomal protein L34 from Bacillus subtilis subsp. subtilis str. 168
70% identity, 96% coverage

stu1808 50S ribosomal protein L34 from Streptococcus thermophilus LMG 18311
68% identity, 96% coverage

SM12261_RS01350 50S ribosomal protein L34 from Streptococcus mitis NCTC 12261
SPD_1790 ribosomal protein L34 from Streptococcus pneumoniae D39
70% identity, 96% coverage

LMRG_02427 ribosomal protein L34 from Listeria monocytogenes 10403S
Q71VQ6 Large ribosomal subunit protein bL34 from Listeria monocytogenes serotype 4b (strain F2365)
lmo2856 ribosomal protein L34 from Listeria monocytogenes EGD-e
70% identity, 96% coverage

NE0390 Ribosomal protein L34 from Nitrosomonas europaea ATCC 19718
64% identity, 96% coverage

AT3G13882 structural constituent of ribosome from Arabidopsis thaliana
67% identity, 21% coverage

TDE2400 ribosomal protein L34 from Treponema denticola ATCC 35405
71% identity, 88% coverage

C6B32_03095 50S ribosomal protein L34 from Campylobacter fetus subsp. testudinum
64% identity, 96% coverage

CFF8240_0551 ribosomal protein L34 from Campylobacter fetus subsp. fetus 82-40
64% identity, 96% coverage

BPSL0075a 50S ribosomal protein L34 from Burkholderia pseudomallei K96243
BRPE64_RS14035 50S ribosomal protein L34 from Burkholderia pseudomallei MSHR146
65% identity, 93% coverage

LOC103870086 uncharacterized protein LOC103870086 from Brassica rapa
67% identity, 29% coverage

FE46_RS03875 50S ribosomal protein L34 from Flavobacterium psychrophilum
68% identity, 83% coverage

5nrg2 / Q2FUQ0 The crystal structure of the large ribosomal subunit of staphylococcus aureus in complex with rb02 (see paper)
6dddP Structure of the 50s ribosomal subunit from methicillin resistant staphylococcus aureus in complex with the oxazolidinone antibiotic lzd-5 (see paper)
66% identity, 96% coverage

NGO2182 hypothetical protein from Neisseria gonorrhoeae FA 1090
66% identity, 96% coverage

YSS_RS04330 50S ribosomal protein L34 from Campylobacter coli RM4661
CJJ81176_0984 ribosomal protein L34 from Campylobacter jejuni subsp. jejuni 81-176
64% identity, 96% coverage

SAS093 50S ribosomal protein L34 from Staphylococcus aureus subsp. aureus N315
SAOUHSC_03055 ribosomal protein L34 from Staphylococcus aureus subsp. aureus NCTC 8325
SERP0001 ribosomal protein L34 from Staphylococcus epidermidis RP62A
B4602_RS14385, EKM74_RS07590 50S ribosomal protein L34 from Staphylococcus aureus
66% identity, 96% coverage

GRMZM2G126603 LIN1 protein from Zea mays
64% identity, 31% coverage

6o8w4 / A0A1B4XSI4 6o8w4 (see paper)
IUJ47_RS03805 50S ribosomal protein L34 from Enterococcus faecalis
EF3333 ribosomal protein L34 from Enterococcus faecalis V583
66% identity, 96% coverage

Q81JG9 Large ribosomal subunit protein bL34 from Bacillus anthracis
66% identity, 96% coverage

MHO_0030 50S ribosomal protein L34 from Mycoplasma hominis
64% identity, 94% coverage

9c4g1 / A0A2B7IDI8 Cutibacterium acnes 50s ribosomal subunit with clindamycin bound (see paper)
66% identity, 96% coverage

8p7x0 / P78006 8p7x0 (see paper)
MPN682 ribosomal protein L34 from Mycoplasma pneumoniae M129
62% identity, 94% coverage

HL033_03600 50S ribosomal protein L34 from Neoehrlichia mikurensis
64% identity, 96% coverage

Kole_0258 50S ribosomal protein L34 from Kosmotoga olearia TBF 19.5.1
Kole_0258 ribosomal protein L34 from Thermotogales bacterium TBF 19.5.1
61% identity, 96% coverage

CD3680 50S ribosomal protein L34 from Clostridium difficile 630
64% identity, 91% coverage

RL34_THET8 / P80340 Large ribosomal subunit protein bL34; 50S ribosomal protein L34 from Thermus thermophilus (strain ATCC 27634 / DSM 579 / HB8) (see paper)
5a9zAd / P80340 of Thermous thermophilus ribosome bound to BipA-GDPCP (see paper)
TTHA0446 ribosomal protein L34 from Thermus thermophilus HB8
62% identity, 92% coverage

GOX1825 LSU ribosomal protein L34P from Gluconobacter oxydans 621H
66% identity, 96% coverage

5myjB6 / A2RHL6 of 70S ribosome from Lactococcus lactis (see paper)
L133770 50S ribosomal protein L34 from Lactococcus lactis subsp. lactis Il1403
64% identity, 96% coverage

D7FBT9 Large ribosomal subunit protein bL34 from Helicobacter pylori (strain B8)
HP1447 ribosomal protein L34 (rpl34) from Helicobacter pylori 26695
61% identity, 96% coverage

8c972 / P0A7P5 Cryo-em captures early ribosome assembly in action (see paper)
100% identity, 59% coverage

ECH_0440 ribosomal protein L34 from Ehrlichia chaffeensis str. Arkansas
64% identity, 96% coverage

BLIJ_2570 50S ribosomal protein L34 from Bifidobacterium longum subsp. infantis ATCC 15697 = JCM 1222 = DSM
64% identity, 96% coverage

BL0642 50S ribosomal protein L34 from Bifidobacterium longum NCC2705
64% identity, 96% coverage

HMPREF0424_0046 ribosomal protein L34 from Gardnerella vaginalis 409-05
64% identity, 96% coverage

SCO3880 50S ribosomal protein L34 from Streptomyces coelicolor A3(2)
63% identity, 93% coverage

AKJ12_RS19625, XFF4834R_chr42980 50S ribosomal protein L34 from Xanthomonas arboricola pv. juglandis
Q05HP6 Large ribosomal subunit protein bL34 from Xanthomonas oryzae pv. oryzae (strain KACC10331 / KXO85)
81% identity, 93% coverage

TTE2802 Ribosomal protein L34 from Thermoanaerobacter tengcongensis MB4
61% identity, 96% coverage

XF_RS12125 50S ribosomal protein L34 from Xylella fastidiosa 9a5c
PD2123 50S ribosomal protein L34 from Xylella fastidiosa Temecula1
77% identity, 93% coverage

CPj0935 50S ribosomal protein L34 from Chlamydia pneumoniae J138
Q9Z6X1 Large ribosomal subunit protein bL34 from Chlamydia pneumoniae
CPj0935 L34 ribosomal protein from Chlamydophila pneumoniae J138
64% identity, 91% coverage

RPA0634 possible ribosomal protein L34 from Rhodopseudomonas palustris CGA009
73% identity, 96% coverage

bsr8096 ribosomal protein L34 from Bradyrhizobium japonicum USDA 110
73% identity, 96% coverage

A8FJG4 Large ribosomal subunit protein bL34 from Bacillus pumilus (strain SAFR-032)
68% identity, 96% coverage

MAB_4955c 50S ribosomal protein L34 from Mycobacterium abscessus ATCC 19977
58% identity, 91% coverage

MSMEG_6946 ribosomal protein L34 from Mycobacterium smegmatis str. MC2 155
MSMEG_6946 50S ribosomal protein L34 from Mycolicibacterium smegmatis MC2 155
58% identity, 91% coverage

Rv3924c 50S ribosomal protein L34 from Mycobacterium tuberculosis H37Rv
MT4041.1 50S ribosomal protein L34 from Mycobacterium tuberculosis CDC1551
58% identity, 91% coverage

ML2713 50S ribosomal protein L34 from Mycobacterium leprae TN
56% identity, 91% coverage

wcw_0805 50S ribosomal protein L34 from Waddlia chondrophila WSU 86-1044
51% identity, 93% coverage

5mrcY / Q04598 of the yeast mitochondrial ribosome - Class A (see paper)
55% identity, 87% coverage

RM34_YEAST / Q04598 Large ribosomal subunit protein bL34m; 54S ribosomal protein L34, mitochondrial; L34mt from Saccharomyces cerevisiae (strain ATCC 204508 / S288c) (Baker's yeast) (see 4 papers)
YDR115W Putative mitochondrial ribosomal protein of the large subunit, has similarity to E. coli L34 ribosomal protein; required for respiratory growth, as are most mitochondrial ribosomal proteins from Saccharomyces cerevisiae
55% identity, 38% coverage

PADG_04085 ribosomal protein L34 from Paracoccidioides brasiliensis Pb18
58% identity, 26% coverage

For advice on how to use these tools together, see Interactive tools for functional annotation of bacterial genomes.

Statistics

The PaperBLAST database links 793,807 different protein sequences to 1,259,118 scientific articles. Searches against EuropePMC were last performed on March 13, 2025.

How It Works

PaperBLAST builds a database of protein sequences that are linked to scientific articles. These links come from automated text searches against the articles in EuropePMC and from manually-curated information from GeneRIF, UniProtKB/Swiss-Prot, BRENDA, CAZy (as made available by dbCAN), BioLiP, CharProtDB, MetaCyc, EcoCyc, TCDB, REBASE, the Fitness Browser, and a subset of the European Nucleotide Archive with the /experiment tag. Given this database and a protein sequence query, PaperBLAST uses protein-protein BLAST to find similar sequences with E < 0.001.
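As a rough illustration of the E < 0.001 cutoff described above, the sketch below filters BLAST hits by E-value. It assumes the hits are available in BLAST's standard 12-column tabular output (where column 11 is the E-value); whether PaperBLAST itself parses this format is not stated here. The function name, the `Hit` type, and the example identifiers are hypothetical.

```python
from typing import Iterable, Iterator, NamedTuple

E_VALUE_CUTOFF = 1e-3  # threshold stated in the text (E < 0.001)


class Hit(NamedTuple):
    subject_id: str
    percent_identity: float
    e_value: float


def significant_hits(tabular_lines: Iterable[str],
                     cutoff: float = E_VALUE_CUTOFF) -> Iterator[Hit]:
    """Yield hits whose E-value is strictly below the cutoff.

    Assumes standard BLAST tabular columns:
    qseqid sseqid pident length mismatch gapopen
    qstart qend sstart send evalue bitscore
    """
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        hit = Hit(subject_id=fields[1],
                  percent_identity=float(fields[2]),
                  e_value=float(fields[10]))
        if hit.e_value < cutoff:
            yield hit


# Made-up example rows; "hitA" and "hitB" are placeholder identifiers.
example = [
    "query\thitA\t93.0\t44\t3\t0\t1\t44\t1\t44\t2e-20\t80.1",
    "query\thitB\t40.0\t20\t12\t0\t5\t24\t3\t22\t0.5\t22.0",
]
kept = [h.subject_id for h in significant_hits(example)]
```

Here only the first row passes the filter, because 0.5 is not below 0.001.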

To build the database, we query EuropePMC with locus tags, with RefSeq protein identifiers, and with UniProt accessions. We obtain the locus tags from RefSeq or from MicrobesOnline. We use queries of the form "locus_tag AND genus_name" to try to ensure that the paper is actually discussing that gene. Because EuropePMC indexes most recent biomedical papers, even if they are not open access, some of the links may be to papers that you cannot read or that our computers cannot read. We query each of these identifiers that appears in the open access part of EuropePMC, as well as every locus tag that appears in the 500 most-referenced genomes, so that a gene may appear in the PaperBLAST results even though none of the papers that mention it are open access. We also incorporate text-mined links from EuropePMC that link open access articles to UniProt or RefSeq identifiers. (This yields some additional links because EuropePMC uses different heuristics for their text mining than we do.)
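The "locus_tag AND genus_name" query form can be sketched as follows. This is only an illustration of the string construction: the helper names are hypothetical, taking the genus as the first word of the organism name (skipping a leading "Candidatus") is an assumption, and the exact query syntax PaperBLAST sends to EuropePMC may differ.

```python
def genus_of(organism: str) -> str:
    """Return the genus, assumed to be the first word of the organism name.

    A leading "Candidatus" is skipped (an assumption made for this sketch).
    """
    words = organism.split()
    if not words:
        raise ValueError("organism name is empty")
    if words[0] == "Candidatus" and len(words) > 1:
        return words[1]
    return words[0]


def build_query(locus_tag: str, organism: str) -> str:
    """Build a text-search query of the form 'locus_tag AND genus_name'."""
    return f"{locus_tag} AND {genus_of(organism)}"


# Examples using entries from the hit list on this page.
q1 = build_query("PA5570", "Pseudomonas aeruginosa PAO1")
q2 = build_query("Bfl015", "Candidatus Blochmannia floridanus")
```

Requiring the genus name alongside the locus tag reduces false matches when a short identifier happens to appear in an unrelated paper.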

For every article that mentions a locus tag, a RefSeq protein identifier, or a UniProt accession, we try to select one or two snippets of text that refer to the protein. If we cannot get access to the full text, we try to select a snippet from the abstract, but unfortunately, unique identifiers such as locus tags are rarely provided in abstracts.
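The snippet selection described above might look something like the sketch below, which pulls up to two windows of words around mentions of an identifier. PaperBLAST's actual heuristics are not given here, so the window size, the whole-identifier matching rule, and the function name are all assumptions.

```python
import re
from typing import List


def find_snippets(text: str, identifier: str,
                  window: int = 10, max_snippets: int = 2) -> List[str]:
    """Return up to max_snippets word windows centered on the identifier.

    Matching is on the whole identifier (so "PA557" does not match
    "PA5570"), and windows are not allowed to overlap.
    """
    words = text.split()
    pattern = re.compile(r"\b" + re.escape(identifier) + r"\b")
    snippets: List[str] = []
    last_end = -1  # index of the last word already used in a snippet
    for i, word in enumerate(words):
        if i <= last_end:
            continue  # this mention is already inside the previous snippet
        if pattern.search(word):
            start = max(0, i - window)
            end = min(len(words), i + window + 1)
            snippets.append(" ".join(words[start:end]))
            last_end = end - 1
            if len(snippets) == max_snippets:
                break
    return snippets


# Made-up sentence for illustration only.
text = "We deleted PA5570 in strain PAO1 and observed slower growth."
snippets = find_snippets(text, "PA5570", window=2)
```

The same function could be applied to an abstract when the full text is unavailable, although, as noted above, identifiers rarely appear there.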

PaperBLAST also incorporates manually-curated protein functions from the curated sources named above.

Except for GeneRIF and ENA, the curated entries include a short curated description of the protein's function. For entries from BioLiP, the protein's function may not be known beyond binding to the ligand. Many of these entries also link to articles in PubMed.

For more information see the PaperBLAST paper (mSystems 2017) or the code. You can download PaperBLAST's database here.

Changes to PaperBLAST since the paper was written:

Many of these changes are described in Interactive tools for functional annotation of bacterial genomes.

Secrets

PaperBLAST cannot provide snippets for many of the papers that are published in non-open-access journals. This limitation applies even if the paper is marked as "free" on the publisher's web site and is available in PubMed Central or EuropePMC. If a journal that you publish in is marked as "secret," please consider publishing elsewhere.

Omissions from the PaperBLAST Database

Many important articles are missing from PaperBLAST, either because the article's full text is not in EuropePMC (as for many older articles), or because the paper does not mention a protein identifier such as a locus tag, or because of PaperBLAST's heuristics. If you notice an article that characterizes a protein's function but is missing from PaperBLAST, please notify the curators at UniProt or add an entry to GeneRIF. Entries in either of these databases will eventually be incorporated into PaperBLAST. Note that to add an entry to UniProt, you will need to find the UniProt identifier for the protein. If the protein is not already in UniProt, you can ask them to create an entry. To add an entry to GeneRIF, you will need an NCBI Gene identifier, but unfortunately many prokaryotic proteins in RefSeq do not have corresponding Gene identifiers.

References

PaperBLAST: Text-mining papers for information about homologs.
M. N. Price and A. P. Arkin (2017). mSystems, 10.1128/mSystems.00039-17.

Europe PMC in 2017.
M. Levchenko et al (2017). Nucleic Acids Research, 10.1093/nar/gkx1005.

Gene indexing: characterization and analysis of NLM's GeneRIFs.
J. A. Mitchell et al (2003). AMIA Annu Symp Proc 2003:460-464.

UniProt: the universal protein knowledgebase.
The UniProt Consortium (2016). Nucleic Acids Research, 10.1093/nar/gkw1099.

BRENDA in 2017: new perspectives and new tools in BRENDA.
S. Placzek et al (2017). Nucleic Acids Research, 10.1093/nar/gkw952.

The EcoCyc database: reflecting new knowledge about Escherichia coli K-12.
I. M. Keseler et al (2016). Nucleic Acids Research, 10.1093/nar/gkw1003.

The MetaCyc database of metabolic pathways and enzymes.
R. Caspi et al (2018). Nucleic Acids Research, 10.1093/nar/gkx935.

CharProtDB: a database of experimentally characterized protein annotations.
R. Madupu et al (2012). Nucleic Acids Research, 10.1093/nar/gkr1133.

The carbohydrate-active enzymes database (CAZy) in 2013.
V. Lombard et al (2014). Nucleic Acids Research, 10.1093/nar/gkt1178.

The Transporter Classification Database (TCDB): recent advances.
M. H. Saier, Jr. et al (2016). Nucleic Acids Research, 10.1093/nar/gkv1103.

REBASE - a database for DNA restriction and modification: enzymes, genes and genomes.
R. J. Roberts et al (2015). Nucleic Acids Research, 10.1093/nar/gku1046.

Deep annotation of protein function across diverse bacteria from mutant phenotypes.
M. N. Price et al (2016). bioRxiv, 10.1101/072470.

by Morgan Price, Arkin group
Lawrence Berkeley National Laboratory