PaperBLAST – Find papers about a protein or its homologs

 


PaperBLAST Hits for ABIE51_RS17405 (63 a.a., MSETAATTTF...)

Other sequence analysis tools:

Find functional residues: SitesBLAST

Search for conserved domains

Find the best match in UniProt

Compare to protein structures

Predict transmembrane helices: Phobius

Predict protein localization: PSORTb

Find homologs in fast.genomics


Found 136 similar proteins in the literature:

PP_5371 rubredoxin/rubredoxin reductase from Pseudomonas putida KT2440
66% identity, 11% coverage

RUBR2_PSEAE / Q9HTK8 Rubredoxin-2; Rdxs from Pseudomonas aeruginosa (strain ATCC 15692 / DSM 22644 / CIP 104116 / JCM 14847 / LMG 12228 / 1C / PRS 101 / PAO1) (see 3 papers)
PA5350 rubredoxin 2 from Pseudomonas aeruginosa PAO1
BWR11_30365 rubredoxin from Pseudomonas aeruginosa
74% identity, 86% coverage

Pnuc_0238 rubredoxin-type Fe(Cys)4 protein from Polynucleobacter sp. QLW-P1DMWA-1
69% identity, 86% coverage

2v3bB / Q9HTK8 Crystal structure of the electron transfer complex rubredoxin - rubredoxin reductase from pseudomonas aeruginosa. (see paper)
77% identity, 83% coverage

Pnuc_1377 rubredoxin-type Fe(Cys)4 protein from Polynucleobacter sp. QLW-P1DMWA-1
67% identity, 86% coverage

RSc0667 PROBABLE RUBREDOXIN PROTEIN from Ralstonia solanacearum GMI1000
63% identity, 86% coverage

E3H47_10395 rubredoxin RubA from Acinetobacter radioresistens
65% identity, 86% coverage

ABO_0163 rubredoxin from Alcanivorax borkumensis SK2
69% identity, 86% coverage

PP_5315 rubredoxin from Pseudomonas putida KT2440
67% identity, 86% coverage

ACP86_07290 rubredoxin from Marinobacter sp. CP1
67% identity, 86% coverage

RUBR_ACIAD / P42453 Rubredoxin; Rdxs from Acinetobacter baylyi (strain ATCC 33305 / BD413 / ADP1) (see paper)
rubA / RF|YP_045776.1 rubredoxin from Acinetobacter sp. ADP1 (see 3 papers)
rubA / CAA86925.1 rubredoxin from Acinetobacter baylyi (see 3 papers)
65% identity, 86% coverage

HWW27_RS03965 rubredoxin from Burkholderia ambifaria
62% identity, 100% coverage

RUBR1_PSEAE / Q9HTK7 Rubredoxin-1; Rdxs from Pseudomonas aeruginosa (strain ATCC 15692 / DSM 22644 / CIP 104116 / JCM 14847 / LMG 12228 / 1C / PRS 101 / PAO1) (see 2 papers)
PA5351 Rubredoxin 1 from Pseudomonas aeruginosa PAO1
PA14_70640 Rubredoxin 1 from Pseudomonas aeruginosa UCBPP-PA14
BWR11_30370, CIA_05306 rubredoxin from Pseudomonas aeruginosa
69% identity, 86% coverage

MSMEG_1841 rubredoxin from Mycolicibacterium smegmatis MC2 155
MSMEG_1841 rubredoxin from Mycobacterium smegmatis str. MC2 155
61% identity, 89% coverage

A1S_0995 rubredoxin from Acinetobacter baumannii ATCC 17978
BUM88_04810 rubredoxin RubA from Acinetobacter nosocomialis
63% identity, 86% coverage

rubA rubredoxin from Acinetobacter sp. M-1 (see paper)
63% identity, 86% coverage

2kn9A / O05893 Solution structure of zinc-substituted rubredoxin b (rv3250c) from mycobacterium tuberculosis. Seattle structural genomics center for infectious disease target mytud.01635.A (see paper)
56% identity, 73% coverage

CFU_RS05585 rubredoxin from Collimonas fungivorans Ter331
60% identity, 95% coverage

rubB / I6YFL7 rubredoxin 1 (EC 1.14.15.3) from Mycobacterium tuberculosis (strain ATCC 25618 / H37Rv) (see paper)
MT3348 rubredoxin from Mycobacterium tuberculosis CDC1551
NP_217767 rubredoxin RubB from Mycobacterium tuberculosis H37Rv
BCG_3279c putative rubredoxin rubB from Mycobacterium bovis BCG str. Pasteur 1173P2
Rv3250c PROBABLE RUBREDOXIN RUBB from Mycobacterium tuberculosis H37Rv
59% identity, 89% coverage

Q0VKZ2 Rubredoxin-2 from Alcanivorax borkumensis (strain ATCC 700651 / DSM 11573 / NCIMB 13689 / SK2)
ABO_2708 rubredoxin from Alcanivorax borkumensis SK2
60% identity, 33% coverage

alkG / CAC38028.1 rubredoxin from Alcanivorax borkumensis (see 2 papers)
60% identity, 33% coverage

NE1426 Rubredoxin:Rubredoxin-type Fe(Cys)4 protein from Nitrosomonas europaea ATCC 19718
57% identity, 95% coverage

Mvan_1744 Rubredoxin-type Fe(Cys)4 protein from Mycobacterium vanbaalenii PYR-1
59% identity, 89% coverage

mdpB / A2SP77 MTBE monooxygenase rubredoxin component from Methylibium petroleiphilum (strain ATCC BAA-1232 / LMG 22953 / PM1) (see 4 papers)
Mpe_B0602 rubredoxin from Methylibium petroleiphilum PM1
60% identity, 87% coverage

FXO12_18785 rubredoxin from Pseudomonas sp. J380
61% identity, 86% coverage

AC1659_RS17120 rubredoxin from Rhodococcus erythropolis
56% identity, 100% coverage

alkG / CAB51049.1 rubredoxin 2 from Pseudomonas putida (see 2 papers)
Q9WWW4 Rubredoxin-1 from Pseudomonas putida
53% identity, 33% coverage

rubA4 / CAC37040.1 rubredoxin 4 from Rhodococcus erythropolis (see 2 papers)
AC1659_RS04740 rubredoxin from Rhodococcus erythropolis
54% identity, 89% coverage

alkG / P00272 rubredoxin 2 (EC 1.14.15.3) from Pseudomonas oleovorans (see 2 papers)
RUBR2_ECTOL / P00272 Rubredoxin-2; Rdxs; Two-iron rubredoxin from Ectopseudomonas oleovorans (Pseudomonas oleovorans) (see 4 papers)
alkG / CAB54052.1 rubredoxin 2 from Pseudomonas putida (see 6 papers)
54% identity, 33% coverage

CLJU_RS09545 flavin reductase from Clostridium ljungdahlii DSM 13528
57% identity, 21% coverage

OLMES_3726 rubredoxin from Oleiphilus messinensis
57% identity, 84% coverage

1s24A / P00272 Rubredoxin domain ii from pseudomonas oleovorans (see paper)
57% identity, 84% coverage

WP_010878381 rubredoxin from Archaeoglobus fulgidus DSM 4304
AF0880 rubredoxin (rd-1) from Archaeoglobus fulgidus DSM 4304
56% identity, 79% coverage

Cbei_0465 rubredoxin-type Fe(Cys)4 protein from Clostridium beijerincki NCIMB 8052
59% identity, 78% coverage

2pvxB / P24297 Nmr and x-ray analysis of structural additivity in metal binding site- swapped hybrids of rubredoxin (see paper)
56% identity, 83% coverage

8f6tA / A0A1I2I8Z9 Cryo-em structure of alkane 1-monooxygenase alkb-alkg complex from fontimonas thermophila (see paper)
55% identity, 12% coverage

NMB0993 rubredoxin from Neisseria meningitidis MC58
60% identity, 79% coverage

PAP_03330 rubredoxin from Palaeococcus pacificus DY20341
58% identity, 79% coverage

Tsac_1153 rubredoxin from Thermoanaerobacterium saccharolyticum JW/SL-YS485
56% identity, 83% coverage

RUBR_CLOAB / Q9AL94 Rubredoxin; Rd; EC 1.-.-.- from Clostridium acetobutylicum (strain ATCC 824 / DSM 792 / JCM 1419 / IAM 19013 / LMG 5710 / NBRC 13948 / NRRL B-527 / VKM B-1787 / 2291 / W) (see 5 papers)
CA_C2778 rubredoxin from Clostridium acetobutylicum ATCC 824
CAC2778 Rubredoxin from Clostridium acetobutylicum ATCC 824
59% identity, 78% coverage

TK0524 rubredoxin from Thermococcus kodakaraensis KOD1
60% identity, 79% coverage

BT_2539 rubredoxin from Bacteroides thetaiotaomicron VPI-5482
61% identity, 78% coverage

alr1174 rubrerythrin from Nostoc sp. PCC 7120
51% identity, 22% coverage

RUBR_HELMO / P56263 Rubredoxin; Rd from Heliobacterium mobile (Heliobacillus mobilis) (see paper)
55% identity, 81% coverage

4xnwC / P00268,P47900 The human p2y1 receptor in complex with mrs2500 (see paper)
51% identity, 14% coverage

6gpsA / P00268,P41597 Crystal structure of ccr2a in complex with mk-0812 (see paper)
51% identity, 15% coverage

5xpdA / Q9FGQ2 Sugar transporter of atsweet13 in inward-facing state with a substrate analog (see paper)
51% identity, 18% coverage

BCAL2458 rubredoxin from Burkholderia cenocepacia J2315
50% identity, 86% coverage

6me8A / P00268,P0ABE7,P49286 Xfel crystal structure of human melatonin receptor mt2 (n86d) in complex with 2-phenylmelatonin (see paper)
51% identity, 11% coverage

HRB_MOOTA / Q9FDN6 High molecular weight rubredoxin; Nitric oxide reductase NADH:FprA oxidoreductase from Moorella thermoacetica (strain ATCC 39073 / JCM 9320) (see paper)
51% identity, 21% coverage

TepiRe1_0396 rubredoxin from Tepidanaerobacter acetatoxydans Re1
54% identity, 79% coverage

5ai2A / P24297 Anomalous neutron phased crystal structure of 113cd-substituted perdeuterated pyrococcus furiosus rubredoxin to 1.75a resolution at 295k (see paper)
PF1282 rubredoxin from Pyrococcus furiosus DSM 3638
P24297 Rubredoxin from Pyrococcus furiosus (strain ATCC 43587 / DSM 3638 / JCM 8422 / Vc1)
WP_011012426 rubredoxin from Pyrococcus furiosus DSM 3638
52% identity, 79% coverage

1yk5A / Q9V099 Pyrococcus abyssi rubredoxin (see paper)
54% identity, 79% coverage

6li2A / P00268,Q9Y2T5 Crystal structure of gpr52 ligand free form with rubredoxin fusion (see paper)
50% identity, 15% coverage

WP_048064374 rubredoxin from Archaeoglobus fulgidus DSM 4304
54% identity, 79% coverage

G3EIL7 alkane 1-monooxygenase (EC 1.14.15.3) from Dietzia sp. DQ12-45-1b (see paper)
47% identity, 12% coverage

AF1349 rubredoxin (rd-2) from Archaeoglobus fulgidus DSM 4304
57% identity, 64% coverage

Fisuc_2091 Rubredoxin-type Fe(Cys)4 protein from Fibrobacter succinogenes subsp. succinogenes S85
62% identity, 75% coverage

Csac_1990 Rubredoxin-type Fe(Cys)4 protein from Caldicellulosiruptor saccharolyticus DSM 8903
60% identity, 75% coverage

5vblB / P00268,P35414 Structure of apelin receptor in complex with agonist peptide (see paper)
49% identity, 16% coverage

2dsxA / P00270 Crystal structure of rubredoxin from desulfovibrio gigas to ultra-high 0.68 a resolution (see paper)
55% identity, 78% coverage

Deba_2049 rubredoxin from Desulfarculus baarsii DSM 2075
51% identity, 81% coverage

6ln2A / P00268,P43220 Crystal structure of full length human glp1 receptor in complex with fab fragment (fab7f38) (see paper)
51% identity, 11% coverage

alkB / CAB51024.2 alkane 1-monooxygenase from Prauserella rugosa (see 3 papers)
59% identity, 9% coverage

TDE1052 rubredoxin from Treponema denticola ATCC 35405
57% identity, 75% coverage

TM_0659 rubredoxin from Thermotoga maritima MSB8
TM0659 rubredoxin from Thermotoga maritima MSB8
53% identity, 78% coverage

6iivA / P00268,P0ABE7,P21731 Crystal structure of the human thromboxane a2 receptor bound to daltroban (see paper)
51% identity, 11% coverage

Cthe_2164 rubredoxin from Acetivibrio thermocellus ATCC 27405
Cthe_2164 Rubredoxin-type Fe(Cys)4 protein from Clostridium thermocellum ATCC 27405
57% identity, 78% coverage

AOP6_0414 rubredoxin from Desulfuromonas sp. AOP6
53% identity, 75% coverage

Dde_3194 rubredoxin from Oleidesulfovibrio alaskensis G20
Dde_3194 rubredoxin from Desulfovibrio desulfuricans G20
52% identity, 79% coverage

rubR2 / OMNI|NTL03CP0780 rubredoxin 2 from Clostridium perfringens (see 2 papers)
CPE0780 rubredoxin from Clostridium perfringens str. 13
P14072 Rubredoxin-2 from Clostridium perfringens (strain 13 / Type A)
52% identity, 76% coverage

6bd4A / P00268,Q9ULV1 Crystal structure of human apo-frizzled4 receptor (see paper)
48% identity, 14% coverage

MCP_2757 rubredoxin from Methanocella paludicola SANAE
50% identity, 79% coverage

DBW_3237 rubredoxin from Desulfuromonas sp. DDH964
50% identity, 79% coverage

AC1659_RS04745 rubredoxin from Rhodococcus erythropolis
56% identity, 81% coverage

Tpen_1457 Rubredoxin-type Fe(Cys)4 protein from Thermofilum pendens Hrk 5
46% identity, 83% coverage

AAA23279.1 rubredoxin from Clostridium pasteurianum (see paper)
P00268 Rubredoxin from Clostridium pasteurianum
51% identity, 78% coverage

7f1tA / P00268,P10147,P51681 Crystal structure of the human chemokine receptor ccr5 in complex with mip-1a (see paper)
49% identity, 12% coverage

SSCH_180038 flavin reductase from Syntrophaceticus schinkii
51% identity, 21% coverage

KSMBR1_2919 rubredoxin from Candidatus Kuenenia stuttgartiensis
46% identity, 83% coverage

BQ4888_RS07685 rubredoxin from Desulfuromonas acetexigens
50% identity, 76% coverage

HMPREF0389_00337 rubredoxin from Filifactor alocis ATCC 35896
50% identity, 79% coverage

DEFDS_0573 rubredoxin from Deferribacter desulfuricans SSM1
48% identity, 76% coverage

Tlet_1612 rubredoxin from Pseudothermotoga lettingae TMO
50% identity, 76% coverage

SYNW2369 Rubrerythrin from Synechococcus sp. WH 8102
52% identity, 19% coverage

OA04_09050 anaerobic nitric oxide reductase flavorubredoxin from Pectobacterium versatile
54% identity, 9% coverage

Dret_0139 Rubredoxin-type Fe(Cys)4 protein from Desulfohalobium retbaense DSM 5692
50% identity, 79% coverage

STM2840 putative flavoprotein from Salmonella typhimurium LT2
52% identity, 10% coverage

AOP6_0415 rubredoxin from Desulfuromonas sp. AOP6
48% identity, 79% coverage

YgaK / b2710 anaerobic nitric oxide reductase flavorubredoxin from Escherichia coli K-12 substr. MG1655 (see 6 papers)
norV / Q46877 anaerobic nitric oxide reductase flavorubredoxin from Escherichia coli (strain K12) (see 24 papers)
NORV_ECODH / B1XCN7 Anaerobic nitric oxide reductase flavorubredoxin; FlRd; FlavoRb from Escherichia coli (strain K12 / DH10B) (see paper)
GB|AAC75752.1 anaerobic nitric oxide reductase flavorubredoxin from Escherichia coli K12 (see 10 papers)
NP_417190 anaerobic nitric oxide reductase flavorubredoxin from Escherichia coli str. K-12 substr. MG1655
Q46877 Anaerobic nitric oxide reductase flavorubredoxin from Escherichia coli (strain K12)
b2710 anaerobic nitric oxide reductase flavorubredoxin from Escherichia coli str. K-12 substr. MG1655
48% identity, 12% coverage

Cspa_c10950 rubredoxin from Clostridium saccharoperbutylacetonicum N1-4(HMT)
47% identity, 81% coverage

GSU0847 rubredoxin from Geobacter sulfurreducens PCA
48% identity, 76% coverage

2pveA / P00268 Nmr and x-ray analysis of structural additivity in metal binding site- swapped hybrids of rubredoxin (see paper)
45% identity, 78% coverage

tca_00140 rubredoxin from Methanothermobacter sp. EMTCatA1
52% identity, 79% coverage

2kkdA / P00269 Nmr structure of ni substitued desulfovibrio vulgaris rubredoxin (see paper)
DVU3184 rubredoxin from Desulfovibrio vulgaris Hildenborough
47% identity, 78% coverage

A0A061F296 Rubredoxin-like superfamily protein from Theobroma cacao
41% identity, 29% coverage

L21SP2_2197 rubredoxin from Salinispira pacifica
49% identity, 78% coverage

U876_23285 anaerobic nitric oxide reductase flavorubredoxin from Aeromonas hydrophila NJ-35
50% identity, 9% coverage

DTF_RS25905 rubredoxin from Desulfuromonas sp. TF
50% identity, 79% coverage

ECs3566 putative flavodoxin from Escherichia coli O157:H7 str. Sakai
Z4018 putative flavodoxin from Escherichia coli O157:H7 EDL933
47% identity, 14% coverage

RUBR1_ECTOL / P12692 Rubredoxin-1; Rdxs from Ectopseudomonas oleovorans (Pseudomonas oleovorans) (see paper)
51% identity, 35% coverage

7e0lA / D9PYV4 Class iii hybrid cluster protein (hcp) from methanothermobacter marburgensis
48% identity, 10% coverage

GM298_06635 anaerobic nitric oxide reductase flavorubredoxin from Enterobacter sp. HSTU-ASh6
52% identity, 10% coverage

GSU3188 rubredoxin from Geobacter sulfurreducens PCA
50% identity, 79% coverage

WP_048196713 rubredoxin from Methanocaldococcus jannaschii DSM 2661
49% identity, 71% coverage

MJ0740 rubredoxin 2 (rd2) from Methanocaldococcus jannaschii DSM 2661
49% identity, 71% coverage

2ms3A / Q46877 The nmr structure of the rubredoxin domain of the no reductase flavorubredoxin from escherichia coli
50% identity, 76% coverage

FN1424 ACYL-COA dehydrogenase, short-chain specific from Fusobacterium nucleatum subsp. nucleatum ATCC 25586
43% identity, 7% coverage

Mvan_1743 Rubredoxin-type Fe(Cys)4 protein from Mycobacterium vanbaalenii PYR-1
53% identity, 86% coverage

FPV33_RS05380 anaerobic nitric oxide reductase flavorubredoxin from Klebsiella aerogenes
44% identity, 10% coverage

KPK_1081 anaerobic nitric oxide reductase flavorubredoxin from Klebsiella pneumoniae 342
44% identity, 10% coverage

AC1659_RS17115 rubredoxin from Rhodococcus erythropolis
44% identity, 81% coverage

Fisuc_1369 Rubredoxin-type Fe(Cys)4 protein from Fibrobacter succinogenes subsp. succinogenes S85
41% identity, 83% coverage

Acfer_1575 acyl-CoA dehydrogenase domain protein from Acidaminococcus fermentans DSM 20731
39% identity, 7% coverage

G3EIL3 alkane 1-monooxygenase (EC 1.14.15.3) from Dietzia sp. DQ12-45-1b (see paper)
45% identity, 10% coverage

MT3349 rubredoxin from Mycobacterium tuberculosis CDC1551
49% identity, 79% coverage

Rv3251c PROBABLE RUBREDOXIN RUBA from Mycobacterium tuberculosis H37Rv
49% identity, 79% coverage

DSOUD_3135 rubredoxin from Desulfuromonas soudanensis
43% identity, 81% coverage

X551_03234 rubredoxin from Methylibium sp. T29
49% identity, 75% coverage

E5GBR8 Rubredoxin from Cucumis melo subsp. melo
41% identity, 30% coverage

Q9SLI4 At1g54500/F20D21_31 from Arabidopsis thaliana
AT1G54500 rubredoxin family protein from Arabidopsis thaliana
37% identity, 30% coverage

LI0697 Rubredoxin 2 (Rd-2) from Lawsonia intracellularis PHE/MN1-00
39% identity, 91% coverage

P00271 Rubredoxin from Megasphaera elsdenii
46% identity, 79% coverage

Q8KPP5 Rubredoxin from Synechococcus elongatus (strain ATCC 33912 / PCC 7942 / FACHB-805)
38% identity, 43% coverage

8itoB / Q726L3 Crystal structure of ferlp from desulfovibrio vulgaris (hildenborough)
45% identity, 66% coverage

DVU3093 rubredoxin-like protein from Desulfovibrio vulgaris Hildenborough
45% identity, 65% coverage

OLMES_3727 rubredoxin from Oleiphilus messinensis
43% identity, 79% coverage

TP0991 rubredoxin from Treponema pallidum subsp. pallidum str. Nichols
TPANIC_0991 rubredoxin from Treponema pallidum subsp. pallidum str. Nichols
47% identity, 75% coverage

RUBR2_DESDA / Q93PP8 Rubredoxin-2; Rd-2 from Desulfovibrio desulfuricans (strain ATCC 27774 / DSM 6949 / MB) (see paper)
rd2 / GI|14326007 rubredoxin 2 from Desulfovibrio desulfuricans subsp. desulfuricans str. ATCC 27774 (see 2 papers)
39% identity, 78% coverage

Dret_0886 Rubredoxin-type Fe(Cys)4 protein from Desulfohalobium retbaense DSM 5692
42% identity, 87% coverage

A0A1V5A688 Rubredoxin from Methanoregulaceae archaeon PtaU1.Bin222
38% identity, 75% coverage

Mbur_0092 ferredoxin-dependent glutamate synthase from Methanococcoides burtonii DSM 6242
43% identity, 10% coverage


For advice on how to use these tools together, see Interactive tools for functional annotation of bacterial genomes.

Statistics

The PaperBLAST database links 793,807 different protein sequences to 1,259,118 scientific articles. Searches against EuropePMC were last performed on March 13, 2025.

How It Works

PaperBLAST builds a database of protein sequences that are linked to scientific articles. These links come from automated text searches against the articles in EuropePMC and from manually-curated information from GeneRIF, UniProtKB/Swiss-Prot, BRENDA, CAZy (as made available by dbCAN), BioLiP, CharProtDB, MetaCyc, EcoCyc, TCDB, REBASE, the Fitness Browser, and a subset of the European Nucleotide Archive with the /experiment tag. Given this database and a protein sequence query, PaperBLAST uses protein-protein BLAST to find similar sequences with E < 0.001.
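As a rough illustration of the E < 0.001 cutoff described above, the sketch below filters BLAST hits given in the standard 12-column tabular format (`-outfmt 6`). It is not PaperBLAST's own code: the function name `parse_hits`, the sort by bit score, and the example rows are assumptions made for illustration only.

```python
# Column order of BLAST's default tabular output (-outfmt 6).
FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore"]

E_CUTOFF = 0.001  # threshold stated in the text above


def parse_hits(tabular_text, e_cutoff=E_CUTOFF):
    """Return hits with E below the cutoff, as dicts, best bit score first.

    Hypothetical helper: sorting by bit score is an assumption, not
    something the PaperBLAST description specifies.
    """
    hits = []
    for line in tabular_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        row = dict(zip(FIELDS, line.split("\t")))
        row["pident"] = float(row["pident"])
        row["evalue"] = float(row["evalue"])
        row["bitscore"] = float(row["bitscore"])
        if row["evalue"] < e_cutoff:
            hits.append(row)
    hits.sort(key=lambda r: r["bitscore"], reverse=True)
    return hits


# Made-up rows with hypothetical identifiers and scores.
example = (
    "query1\tsubjA\t74.0\t54\t14\t0\t5\t58\t1\t54\t2e-25\t88.6\n"
    "query1\tsubjB\t35.0\t40\t26\t0\t10\t49\t3\t42\t0.5\t25.0\n"
    "query1\tsubjC\t66.0\t50\t17\t0\t5\t54\t1\t50\t1e-20\t75.1\n"
)
kept = parse_hits(example)
print([h["sseqid"] for h in kept])
```

The row for `subjB` is dropped because its E-value of 0.5 is above the cutoff.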

To build the database, we query EuropePMC with locus tags, with RefSeq protein identifiers, and with UniProt accessions. We obtain the locus tags from RefSeq or from MicrobesOnline. We use queries of the form "locus_tag AND genus_name" to try to ensure that the paper is actually discussing that gene. Because EuropePMC indexes most recent biomedical papers, even if they are not open access, some of the links may be to papers that you cannot read or that our computers cannot read. We query each of these identifiers that appears in the open access part of EuropePMC, as well as every locus tag that appears in the 500 most-referenced genomes, so that a gene may appear in the PaperBLAST results even though none of the papers that mention it are open access. We also incorporate text-mined links from EuropePMC that link open access articles to UniProt or RefSeq identifiers. (This yields some additional links because EuropePMC uses different heuristics for their text mining than we do.)
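The "locus_tag AND genus_name" query form can be sketched as follows. This only builds a query string and a search URL; it does not contact any server. The helper names, the `page_size` default, and the choice of the public Europe PMC REST search endpoint are assumptions for illustration, and PaperBLAST's actual pipeline may query EuropePMC differently.

```python
from urllib.parse import urlencode

# Public Europe PMC REST search endpoint (assumed here; the text above
# does not say which interface PaperBLAST uses).
SEARCH_URL = "https://www.ebi.ac.uk/europepmc/webservices/rest/search"


def build_query(locus_tag, genus_name):
    """Build a query of the form 'locus_tag AND genus_name'."""
    return f"{locus_tag} AND {genus_name}"


def build_search_url(locus_tag, genus_name, page_size=25):
    """Return a search URL for the query (hypothetical helper)."""
    params = {
        "query": build_query(locus_tag, genus_name),
        "format": "json",
        "pageSize": page_size,
    }
    return SEARCH_URL + "?" + urlencode(params)


# Example with a locus tag and genus taken from the hit list above.
print(build_query("PA5350", "Pseudomonas"))
print(build_search_url("PA5350", "Pseudomonas"))
```

Requiring the genus name alongside the locus tag reduces false matches when the same string is used for unrelated things in other papers.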

For every article that mentions a locus tag, a RefSeq protein identifier, or a UniProt accession, we try to select one or two snippets of text that refer to the protein. If we cannot get access to the full text, we try to select a snippet from the abstract, but unfortunately, unique identifiers such as locus tags are rarely provided in abstracts.
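One simple way to pick "one or two snippets" around an identifier is to take a fixed window of text on either side of each mention. The sketch below does that. It is only an illustration under stated assumptions: the function name, the 60-character window, and the rule for skipping overlapping mentions are all hypothetical, and PaperBLAST's real snippet heuristics are not described here.

```python
import re


def select_snippets(text, identifier, max_snippets=2, window=60):
    """Return up to max_snippets excerpts of text around the identifier.

    Hypothetical helper: window size and overlap rule are assumptions.
    """
    # Match the identifier only when it is not part of a longer token,
    # so that "PA5350" does not match inside "PA53501".
    pattern = re.compile(
        r"(?<![A-Za-z0-9_])" + re.escape(identifier) + r"(?![A-Za-z0-9_])"
    )
    snippets = []
    last_end = -1
    for match in pattern.finditer(text):
        start = max(0, match.start() - window)
        end = min(len(text), match.end() + window)
        if start < last_end:
            continue  # window overlaps the previous snippet; skip it
        snippets.append(text[start:end].strip())
        last_end = end
        if len(snippets) == max_snippets:
            break
    return snippets


# Made-up article text for illustration.
article = (
    "We deleted PA5350 in strain PAO1. "
    + "x" * 200
    + " PA5350 and PA53501 differ."
)
found = select_snippets(article, "PA5350")
print(len(found))
```

When only an abstract is available, the same function could be applied to the abstract text, though as noted above identifiers rarely appear there.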

PaperBLAST also incorporates manually-curated protein functions from the curated databases listed above.

Except for GeneRIF and ENA, the curated entries include a short curated description of the protein's function. For entries from BioLiP, the protein's function may not be known beyond binding to the ligand. Many of these entries also link to articles in PubMed.

For more information see the PaperBLAST paper (mSystems 2017) or the code. You can download PaperBLAST's database here.

PaperBLAST has changed since the paper was written. Many of these changes are described in Interactive tools for functional annotation of bacterial genomes.

Secrets

PaperBLAST cannot provide snippets for many of the papers that are published in non-open-access journals. This limitation applies even if the paper is marked as "free" on the publisher's web site and is available in PubMed Central or EuropePMC. If a journal that you publish in is marked as "secret," please consider publishing elsewhere.

Omissions from the PaperBLAST Database

Many important articles are missing from PaperBLAST, either because the article's full text is not in EuropePMC (as for many older articles), or because the paper does not mention a protein identifier such as a locus tag, or because of PaperBLAST's heuristics. If you notice an article that characterizes a protein's function but is missing from PaperBLAST, please notify the curators at UniProt or add an entry to GeneRIF. Entries in either of these databases will eventually be incorporated into PaperBLAST. Note that to add an entry to UniProt, you will need to find the UniProt identifier for the protein. If the protein is not already in UniProt, you can ask them to create an entry. To add an entry to GeneRIF, you will need an NCBI Gene identifier, but unfortunately many prokaryotic proteins in RefSeq do not have corresponding Gene identifiers.

References

PaperBLAST: Text-mining papers for information about homologs.
M. N. Price and A. P. Arkin (2017). mSystems, 10.1128/mSystems.00039-17.

Europe PMC in 2017.
M. Levchenko et al (2017). Nucleic Acids Research, 10.1093/nar/gkx1005.

Gene indexing: characterization and analysis of NLM's GeneRIFs.
J. A. Mitchell et al (2003). AMIA Annu Symp Proc 2003:460-464.

UniProt: the universal protein knowledgebase.
The UniProt Consortium (2016). Nucleic Acids Research, 10.1093/nar/gkw1099.

BRENDA in 2017: new perspectives and new tools in BRENDA.
S. Placzek et al (2017). Nucleic Acids Research, 10.1093/nar/gkw952.

The EcoCyc database: reflecting new knowledge about Escherichia coli K-12.
I. M. Keseler et al (2016). Nucleic Acids Research, 10.1093/nar/gkw1003.

The MetaCyc database of metabolic pathways and enzymes.
R. Caspi et al (2018). Nucleic Acids Research, 10.1093/nar/gkx935.

CharProtDB: a database of experimentally characterized protein annotations.
R. Madupu et al (2012). Nucleic Acids Research, 10.1093/nar/gkr1133.

The carbohydrate-active enzymes database (CAZy) in 2013.
V. Lombard et al (2014). Nucleic Acids Research, 10.1093/nar/gkt1178.

The Transporter Classification Database (TCDB): recent advances.
M. H. Saier, Jr. et al (2016). Nucleic Acids Research, 10.1093/nar/gkv1103.

REBASE - a database for DNA restriction and modification: enzymes, genes and genomes.
R. J. Roberts et al (2015). Nucleic Acids Research, 10.1093/nar/gku1046.

Deep annotation of protein function across diverse bacteria from mutant phenotypes.
M. N. Price et al (2016). bioRxiv, 10.1101/072470.

by Morgan Price, Arkin group
Lawrence Berkeley National Laboratory