PaperBLAST – Find papers about a protein or its homologs

 

PaperBLAST

PaperBLAST Hits for tr|Q8EGS2|Q8EGS2_SHEON L-lactate permease OS=Shewanella oneidensis (strain ATCC 700550 / JCM 31522 / CIP 106686 / LMG 19005 / NCIMB 14063 / MR-1) OX=211586 GN=SO_1522 PE=3 SV=1 (547 a.a., MTILQLFASL...)

Other sequence analysis tools:

Find functional residues: SitesBLAST

Search for conserved domains

Find the best match in UniProt

Compare to protein structures

Predict transmembrane helices: Phobius

Predict protein localization: PSORTb

Find homologs in fast.genomics

Found 103 similar proteins in the literature:

SO1522 D,L-lactate/pyruvate symporter LctP2 from Shewanella oneidensis MR-1
SO1522 L-lactate permease, putative from Shewanella oneidensis MR-1
SO_1522 L-lactate permease from Shewanella oneidensis MR-1
100% identity, 100% coverage

Shewana3_2904 propionate/L-lactate/D-lactate transporter from Shewanella sp. ANA-3
95% identity, 100% coverage

HU689_06695 L-lactate permease from Shewanella algae
86% identity, 100% coverage

Psest_0955 D,L-lactate:H+ symporter from Pseudomonas stutzeri RCH2
45% identity, 95% coverage

PST_3336 L-lactate permease from Pseudomonas stutzeri A1501
44% identity, 95% coverage

VCA0983 L-lactate permease, putative from Vibrio cholerae O1 biovar eltor str. N16961
44% identity, 93% coverage

Q7UY15 L-lactate permease from Rhodopirellula baltica (strain DSM 10527 / NCIMB 13988 / SH1)
42% identity, 95% coverage

DealDRAFT_1845 L-lactate permease from Dethiobacter alkaliphilus AHT 1
43% identity, 92% coverage

DSY2261 hypothetical protein from Desulfitobacterium hafniense Y51
40% identity, 86% coverage

Dde_3238 L-lactate permease, putative from Desulfovibrio desulfuricans G20
40% identity, 92% coverage

Dret_1039 L-lactate transport from Desulfohalobium retbaense DSM 5692
40% identity, 94% coverage

Q726T0 L-lactate permease from Nitratidesulfovibrio vulgaris (strain ATCC 29579 / DSM 644 / CCUG 34227 / NCIMB 8303 / VKM B-1760 / Hildenborough)
DVU3026 L-lactate permease family protein from Desulfovibrio vulgaris Hildenborough
38% identity, 92% coverage

DVU2451 L-lactate permease family protein from Desulfovibrio vulgaris Hildenborough
38% identity, 91% coverage

UH47_06315 L-lactate permease from Staphylococcus pseudintermedius
38% identity, 100% coverage

SXYL_00250 L-lactate permease from Staphylococcus xylosus
38% identity, 99% coverage

HVO_1696 L-lactate permease from Haloferax volcanii DS2
37% identity, 91% coverage

AF0806 L-lactate permease (lctP) from Archaeoglobus fulgidus DSM 4304
AF_0806 L-lactate permease from Archaeoglobus fulgidus DSM 4304
36% identity, 95% coverage

YP_130418 hypothetical L-lactate permease (lctP) from Photobacterium profundum SS9
29% identity, 96% coverage

HVO_2251 Glycolate permease glcA from Haloferax volcanii DS2
31% identity, 96% coverage

NMB0543 putative L-lactate permease from Neisseria meningitidis MC58
31% identity, 98% coverage

NMC0482 putative transmembrane transport protein from Neisseria meningitidis FAM18
31% identity, 98% coverage

QR722_RS12480 L-lactate permease from Aliiglaciecola sp. LCG003
32% identity, 96% coverage

NGO1449 LctP from Neisseria gonorrhoeae FA 1090
NGO_1449 L-lactate permease from Neisseria gonorrhoeae FA 1090
31% identity, 98% coverage

NGFG_01471 L-lactate permease from Neisseria gonorrhoeae MS11
31% identity, 98% coverage

APL_0447 putative L-lactate permease from Actinobacillus pleuropneumoniae L20
31% identity, 98% coverage

APPSER1_RS02395 L-lactate permease from Actinobacillus pleuropneumoniae serovar 1 str. 4074
31% identity, 98% coverage

NTHI1391 putative L-lactate permease from Haemophilus influenzae 86-028NP
30% identity, 96% coverage

Q57251 Putative L-lactate permease from Haemophilus influenzae (strain ATCC 51907 / DSM 11121 / KW20 / Rd)
HI1218 L-lactate permease (lctP) from Haemophilus influenzae Rd KW20
30% identity, 96% coverage

CPZ25_RS02235 L-lactate permease from Eubacterium maltosivorans
30% identity, 95% coverage

Cbei_2885 L-lactate transport from Clostridium beijerinckii NCIMB 8052
27% identity, 95% coverage

CPE0310 probable lactate permease from Clostridium perfringens str. 13
28% identity, 95% coverage

LBA1768 lactate permease from Lactobacillus acidophilus NCFM
30% identity, 95% coverage

CLJU_RS10610 L-lactate permease from Clostridium ljungdahlii DSM 13528
30% identity, 99% coverage

Hflu103001790 COG1620: L-lactate permease from Haemophilus influenzae R2846
29% identity, 97% coverage

EZN00_RS08605 L-lactate permease from Clostridium tyrobutyricum
29% identity, 99% coverage

AWO_RS04425, Awo_c08740, WP_014355268 L-lactate permease from Acetobacterium woodii DSM 1030
30% identity, 96% coverage

EHLA_0973 L-lactate permease from Anaerobutyricum hallii
27% identity, 98% coverage

jhp0128 L-lactate permease from Helicobacter pylori J99
27% identity, 97% coverage

DVU2285 L-lactate permease family protein from Desulfovibrio vulgaris Hildenborough
Q729R4 L-lactate permease from Nitratidesulfovibrio vulgaris (strain ATCC 29579 / DSM 644 / CCUG 34227 / NCIMB 8303 / VKM B-1760 / Hildenborough)
27% identity, 99% coverage

AB57_0116 L-lactate permease from Acinetobacter baumannii AB0057
27% identity, 97% coverage

HP0140 L-lactate permease (lctP) from Helicobacter pylori 26695
27% identity, 98% coverage

CJJ81176_0113 L-lactate permease from Campylobacter jejuni subsp. jejuni 81-176
Cj0076c L-lactate permease from Campylobacter jejuni subsp. jejuni NCTC 11168
A911_00360 L-lactate permease from Campylobacter jejuni subsp. jejuni PT14
28% identity, 94% coverage

ABZJ_00099 L-lactate permease from Acinetobacter baumannii MDR-ZJ06
27% identity, 96% coverage

A1S_0067 L-lactate permease from Acinetobacter baumannii ATCC 17978
ABUW_3814 L-lactate permease from Acinetobacter baumannii
27% identity, 96% coverage

AEX15_02655, SeKA_A3314 L-lactate permease from Salmonella enterica subsp. enterica serovar Kentucky str. CVM29188
27% identity, 98% coverage

STM3692 LctP transporter, L-lactate permease from Salmonella typhimurium LT2
STM14_4451 L-lactate permease from Salmonella enterica subsp. enterica serovar Typhimurium str. 14028S
27% identity, 98% coverage

HPYLPMSS1_00131 L-lactate permease from Helicobacter pylori PMSS1
27% identity, 98% coverage

ACIAD0106 L-lactate permease from Acinetobacter sp. ADP1
27% identity, 97% coverage

TepiRe1_2531 L-lactate permease from Tepidanaerobacter acetatoxydans Re1
30% identity, 92% coverage

C8J_0069 L-lactate permease from Campylobacter jejuni subsp. jejuni 81116
28% identity, 94% coverage

BAS5077 L-lactate permease from Bacillus anthracis str. Sterne
GBAA5464 L-lactate permease from Bacillus anthracis str. 'Ames Ancestor'
28% identity, 96% coverage

PP4735, PP_4735 L-lactate transporter from Pseudomonas putida KT2440
27% identity, 97% coverage

GlcA / b2975 glycolate/lactate:H+ symporter GlcA from Escherichia coli K-12 substr. MG1655 (see 3 papers)
glcA / Q46839 glycolate/lactate:H+ symporter GlcA from Escherichia coli (strain K12) (see 3 papers)
GLCA_ECOLI / Q46839 Glycolate permease GlcA from Escherichia coli (strain K12) (see 3 papers)
TC 2.A.14.1.2 / Q46839 Glycolate permease, GlcA or YghK (substrates: L-lactate, D-lactate and glycolate) from Escherichia coli (see 5 papers)
b2975 glycolate transporter from Escherichia coli str. K-12 substr. MG1655
29% identity, 96% coverage

BDU_605 L-lactate permease from Borrelia duttonii Ly
28% identity, 94% coverage

SO0827 propionate/L-lactate/D-lactate transporter from Shewanella oneidensis MR-1
SO0827 L-lactate permease from Shewanella oneidensis MR-1
SO_0827 lactate permease LctP family transporter from Shewanella oneidensis MR-1
28% identity, 98% coverage

Dde_1074 Putative permease from Desulfovibrio desulfuricans G20
28% identity, 99% coverage

F6476_01160 lactate permease LctP family transporter from Pseudomonas umsongensis
27% identity, 96% coverage

BC_0612 L-lactate permease from Bacillus cereus ATCC 14579
BC0612 L-lactate permease from Bacillus cereus ATCC 14579
26% identity, 94% coverage

BA0610 L-lactate permease from Bacillus anthracis str. Ames
27% identity, 94% coverage

BB0604 L-lactate permease (lctP) from Borrelia burgdorferi B31
BB_0604 L-lactate permease from Borreliella burgdorferi B31
26% identity, 99% coverage

ESA_RS17695 L-lactate permease from Cronobacter sakazakii ATCC BAA-894
28% identity, 97% coverage

LUTP_BACSU / P71067 L-lactate permease from Bacillus subtilis (strain 168) (see paper)
TC 2.A.14.1.3 / P71067 L-lactate permease from Bacillus subtilis (strain 168) (see 3 papers)
27% identity, 92% coverage

PS417_24105 D-lactate transporter (lctP family) from Pseudomonas simiae WCS417
28% identity, 95% coverage

PFLU5278 glycolate permease from Pseudomonas fluorescens SBW25
27% identity, 95% coverage

PA4770 L-lactate permease from Pseudomonas aeruginosa PAO1
28% identity, 95% coverage

THER_0621 L-lactate permease from Thermodesulfovibrio sp. N1
27% identity, 92% coverage

PTH_2229 L-lactate permease from Pelotomaculum thermopropionicum SI
27% identity, 90% coverage

AO356_07550 L-lactate and D-lactate permease (lctP family) from Pseudomonas fluorescens FW300-N2C3
27% identity, 95% coverage

Avin_43070 L-lactate permease from Azotobacter vinelandii AvOP
28% identity, 96% coverage

RSPO_c02391 L-lactate permease from Ralstonia solanacearum Po82
25% identity, 95% coverage

PG1340 L-lactate permease from Porphyromonas gingivalis W83
28% identity, 97% coverage

ECs4481 L-lactate permease from Escherichia coli O157:H7 str. Sakai
Z_RS23645 L-lactate permease from Escherichia coli O157:H7 str. EDL933
27% identity, 98% coverage

b3603 (R)-lactate / (S)-lactate / glycolate:H+ symporter LldP from Escherichia coli BW25113
Lct / b3603 lactate/glycolate:H+ symporter LldP from Escherichia coli K-12 substr. MG1655 (see 5 papers)
lldP / P33231 lactate/glycolate:H+ symporter LldP from Escherichia coli (strain K12) (see 4 papers)
LLDP_ECOLI / P33231 L-lactate permease from Escherichia coli (strain K12) (see 4 papers)
TC 2.A.14.1.1 / P33231 Lactate permease, LctP or LidP (substrates: L-lactate, D-lactate and glycolate) from Escherichia coli (see 7 papers)
NP_418060 lactate/glycolate:H(+) symporter LldP from Escherichia coli str. K-12 substr. MG1655
b3603 L-lactate permease from Escherichia coli str. K-12 substr. MG1655
ETEC_3846 L-lactate permease from Escherichia coli ETEC H10407
27% identity, 98% coverage

BWI76_RS27185 L-lactate/D-lactate permease from Klebsiella michiganensis M5al
27% identity, 98% coverage

SMLT_RS13840 L-lactate permease from Stenotrophomonas maltophilia K279a
25% identity, 96% coverage

H16_RS19175 lactate permease LctP family transporter from Cupriavidus necator H16
27% identity, 95% coverage

HP0141 L-lactate permease (lctP) from Helicobacter pylori 26695
25% identity, 97% coverage

PFREUD_18660 L-lactate permease from Propionibacterium freudenreichii subsp. shermanii CIRM-BIA1
27% identity, 91% coverage

RM25_RS08745 L-lactate permease from Propionibacterium freudenreichii subsp. freudenreichii
27% identity, 91% coverage

Q725Z0 L-lactate permease from Nitratidesulfovibrio vulgaris (strain ATCC 29579 / DSM 644 / CCUG 34227 / NCIMB 8303 / VKM B-1760 / Hildenborough)
DVU3284 L-lactate permease from Desulfovibrio vulgaris Hildenborough
27% identity, 92% coverage

BSU03060 L-lactate permease from Bacillus subtilis subsp. subtilis str. 168
28% identity, 95% coverage

BB0976 lactate permease family protein from Bordetella bronchiseptica RB50
26% identity, 95% coverage

Q72A87 L-lactate permease from Nitratidesulfovibrio vulgaris (strain ATCC 29579 / DSM 644 / CCUG 34227 / NCIMB 8303 / VKM B-1760 / Hildenborough)
DVU2110 L-lactate permease from Desulfovibrio vulgaris Hildenborough
27% identity, 92% coverage

PPA0166 putative L-lactate permease from Propionibacterium acnes KPA171202
26% identity, 93% coverage

SA0106 hypothetical protein from Staphylococcus aureus subsp. aureus N315
SAR0113 L-lactate permease 1 from Staphylococcus aureus subsp. aureus MRSA252
26% identity, 95% coverage

E3T15_09260 L-lactate permease from Staphylococcus aureus
SAUSA300_0112 L-lactate permease from Staphylococcus aureus subsp. aureus USA300_FPR3757
SACOL0093 L-lactate permease from Staphylococcus aureus subsp. aureus COL
26% identity, 95% coverage

BruAb1_0737 LldP, L-lactate permease from Brucella abortus biovar 1 str. 9-941
BAB1_0738 L-lactate permease from Brucella melitensis biovar Abortus 2308
25% identity, 93% coverage

SXYL_00577 L-lactate permease from Staphylococcus xylosus
29% identity, 58% coverage

RPA1136 putative L-lactate permease from Rhodopseudomonas palustris CGA009
24% identity, 96% coverage

BF29_RS14480 L-lactate permease from Heyndrickxia coagulans DSM 1 = ATCC 7050
31% identity, 57% coverage

MW2287 L-lactate permease lctP homolog~ORFID:MW2287 from Staphylococcus aureus subsp. aureus MW2
29% identity, 59% coverage

GSU1622 L-lactate permease from Geobacter sulfurreducens PCA
28% identity, 49% coverage

SE1945 L-lactate permease lctP-like protein from Staphylococcus epidermidis ATCC 12228
29% identity, 59% coverage

KQ76_12340 L-lactate permease from Staphylococcus aureus
28% identity, 59% coverage

SAFDA_2229 L-lactate permease from Staphylococcus aureus
Q2FVQ4 L-lactate permease from Staphylococcus aureus (strain NCTC 8325 / PS 47)
SAOUHSC_02648 L-lactate permease from Staphylococcus aureus subsp. aureus NCTC 8325
NWMN_2268 L-lactate permease 2 from Staphylococcus aureus subsp. aureus str. Newman
SACOL2363 L-lactate permease from Staphylococcus aureus subsp. aureus COL
28% identity, 59% coverage

SA2156 hypothetical protein from Staphylococcus aureus subsp. aureus N315
28% identity, 59% coverage

BC1240 L-lactate permease from Bacillus cereus ATCC 14579
22% identity, 95% coverage

SAUSA300_RS12780 L-lactate permease from Staphylococcus aureus subsp. aureus USA300_FPR3757
28% identity, 59% coverage

SSO2126 L-lactate permease from Sulfolobus solfataricus P2
26% identity, 96% coverage

DealDRAFT_0239 L-lactate permease from Dethiobacter alkaliphilus AHT 1
22% identity, 98% coverage

DVU2683 L-lactate permease family protein from Desulfovibrio vulgaris Hildenborough
24% identity, 95% coverage

For advice on how to use these tools together, see Interactive tools for functional annotation of bacterial genomes.

Statistics

The PaperBLAST database links 789,361 different protein sequences to 1,256,019 scientific articles. Searches against EuropePMC were last performed on January 10 2025.

How It Works

PaperBLAST builds a database of protein sequences that are linked to scientific articles. These links come from automated text searches against the articles in EuropePMC and from manually-curated information from GeneRIF, UniProtKB/Swiss-Prot, BRENDA, CAZy (as made available by dbCAN), BioLiP, CharProtDB, MetaCyc, EcoCyc, TCDB, REBASE, the Fitness Browser, and a subset of the European Nucleotide Archive with the /experiment tag. Given this database and a protein sequence query, PaperBLAST uses protein-protein BLAST to find similar sequences with E < 0.001.
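The E-value cutoff described above can be illustrated with a minimal sketch. It assumes the BLAST results are available as tab-separated lines in the standard 12-column tabular layout (query, subject, percent identity, alignment length, mismatches, gap opens, query start, query end, subject start, subject end, E-value, bit score). The function name `significant_hits` and the example rows are hypothetical and are not taken from PaperBLAST's code.

```python
from typing import Iterable, List, NamedTuple


class Hit(NamedTuple):
    subject: str
    percent_identity: float
    evalue: float


def significant_hits(lines: Iterable[str], max_evalue: float = 1e-3) -> List[Hit]:
    """Keep hits whose E-value is strictly below max_evalue (E < 0.001 by default)."""
    hits = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        # Columns (0-based): 1 = subject id, 2 = percent identity, 10 = E-value
        hit = Hit(
            subject=fields[1],
            percent_identity=float(fields[2]),
            evalue=float(fields[10]),
        )
        if hit.evalue < max_evalue:
            hits.append(hit)
    return hits


# Made-up example rows, for illustration only.
rows = [
    "query\tprotA\t45.0\t520\t280\t6\t1\t520\t3\t522\t1e-150\t430",
    "query\tprotB\t24.0\t60\t45\t1\t10\t70\t5\t64\t0.5\t30",
]
kept = significant_hits(rows)
```

Here only the first row would be kept, because its E-value is below the threshold.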

To build the database, we query EuropePMC with locus tags, with RefSeq protein identifiers, and with UniProt accessions. We obtain the locus tags from RefSeq or from MicrobesOnline. We use queries of the form "locus_tag AND genus_name" to try to ensure that the paper is actually discussing that gene. Because EuropePMC indexes most recent biomedical papers, even if they are not open access, some of the links may be to papers that you cannot read or that our computers cannot read. We query each of these identifiers that appears in the open access part of EuropePMC, as well as every locus tag that appears in the 500 most-referenced genomes, so that a gene may appear in the PaperBLAST results even though none of the papers that mention it are open access. We also incorporate text-mined links from EuropePMC that link open access articles to UniProt or RefSeq identifiers. (This yields some additional links because EuropePMC uses different heuristics for their text mining than we do.)
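As a rough illustration of the "locus_tag AND genus_name" query form, the sketch below builds such a string from a locus tag and an organism name. It assumes the genus is the first word of the organism name. The function name `build_query` is hypothetical, and the exact quoting or field syntax that PaperBLAST sends to EuropePMC is not specified here.

```python
def build_query(locus_tag: str, organism: str) -> str:
    """Build a text query of the form 'locus_tag AND genus_name'.

    Assumption: the genus is the first whitespace-separated word of the
    organism name (e.g. 'Shewanella' in 'Shewanella oneidensis MR-1').
    """
    words = organism.split()
    if not locus_tag.strip() or not words:
        raise ValueError("both a locus tag and an organism name are required")
    genus = words[0]
    return f"{locus_tag.strip()} AND {genus}"


query = build_query("SO_1522", "Shewanella oneidensis MR-1")
```

Requiring the genus name alongside the locus tag reduces the chance that a short identifier matches an unrelated string in a paper about a different organism.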

For every article that mentions a locus tag, a RefSeq protein identifier, or a UniProt accession, we try to select one or two snippets of text that refer to the protein. If we cannot get access to the full text, we try to select a snippet from the abstract, but unfortunately, unique identifiers such as locus tags are rarely provided in abstracts.
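The snippet selection step could be sketched as follows. This is not PaperBLAST's actual heuristic: it simply assumes a snippet is a fixed-size window of characters around a whole-word mention of the identifier, and it keeps at most two non-overlapping windows. The function name `select_snippets` and the `context` parameter are hypothetical.

```python
import re
from typing import List


def select_snippets(text: str, identifier: str,
                    max_snippets: int = 2, context: int = 80) -> List[str]:
    """Return up to max_snippets windows of text around mentions of identifier."""
    # Match the identifier only as a whole token, so 'SO_1522' does not
    # match inside 'SO_15220'.
    pattern = re.compile(r"(?<!\w)" + re.escape(identifier) + r"(?!\w)")
    snippets: List[str] = []
    last_end = -1
    for match in pattern.finditer(text):
        if match.start() < last_end:
            continue  # this mention is already covered by the previous snippet
        start = max(0, match.start() - context)
        end = min(len(text), match.end() + context)
        snippets.append(text[start:end].strip())
        last_end = end
        if len(snippets) == max_snippets:
            break
    return snippets


# Made-up article text, for illustration only.
article = (
    "The SO_1522 gene encodes a lactate permease. "
    + "filler " * 50
    + "Deletion of SO_1522 abolished growth. SO_15220 is unrelated."
)
found = select_snippets(article, "SO_1522", context=20)
```

A real implementation would more likely cut at sentence boundaries and fall back to the abstract when the full text is unavailable, as the paragraph above describes.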

PaperBLAST also incorporates manually-curated protein functions. Except for GeneRIF and ENA, the curated entries include a short curated description of the protein's function. For entries from BioLiP, the protein's function may not be known beyond binding to the ligand. Many of these entries also link to articles in PubMed.

For more information see the PaperBLAST paper (mSystems 2017) or the code. You can download PaperBLAST's database here.

PaperBLAST has changed since the paper was written. Many of these changes are described in Interactive tools for functional annotation of bacterial genomes.

Secrets

PaperBLAST cannot provide snippets for many of the papers that are published in non-open-access journals. This limitation applies even if the paper is marked as "free" on the publisher's web site and is available in PubMed Central or EuropePMC. If a journal that you publish in is marked as "secret," please consider publishing elsewhere.

Omissions from the PaperBLAST Database

Many important articles are missing from PaperBLAST, either because the article's full text is not in EuropePMC (as for many older articles), or because the paper does not mention a protein identifier such as a locus tag, or because of PaperBLAST's heuristics. If you notice an article that characterizes a protein's function but is missing from PaperBLAST, please notify the curators at UniProt or add an entry to GeneRIF. Entries in either of these databases will eventually be incorporated into PaperBLAST. Note that to add an entry to UniProt, you will need to find the UniProt identifier for the protein. If the protein is not already in UniProt, you can ask them to create an entry. To add an entry to GeneRIF, you will need an NCBI Gene identifier, but unfortunately many prokaryotic proteins in RefSeq do not have corresponding Gene identifiers.

References

PaperBLAST: Text-mining papers for information about homologs.
M. N. Price and A. P. Arkin (2017). mSystems, 10.1128/mSystems.00039-17.

Europe PMC in 2017.
M. Levchenko et al (2017). Nucleic Acids Research, 10.1093/nar/gkx1005.

Gene indexing: characterization and analysis of NLM's GeneRIFs.
J. A. Mitchell et al (2003). AMIA Annu Symp Proc 2003:460-464.

UniProt: the universal protein knowledgebase.
The UniProt Consortium (2016). Nucleic Acids Research, 10.1093/nar/gkw1099.

BRENDA in 2017: new perspectives and new tools in BRENDA.
S. Placzek et al (2017). Nucleic Acids Research, 10.1093/nar/gkw952.

The EcoCyc database: reflecting new knowledge about Escherichia coli K-12.
I. M. Keseler et al (2016). Nucleic Acids Research, 10.1093/nar/gkw1003.

The MetaCyc database of metabolic pathways and enzymes.
R. Caspi et al (2018). Nucleic Acids Research, 10.1093/nar/gkx935.

CharProtDB: a database of experimentally characterized protein annotations.
R. Madupu et al (2012). Nucleic Acids Research, 10.1093/nar/gkr1133.

The carbohydrate-active enzymes database (CAZy) in 2013.
V. Lombard et al (2014). Nucleic Acids Research, 10.1093/nar/gkt1178.

The Transporter Classification Database (TCDB): recent advances.
M. H. Saier, Jr. et al (2016). Nucleic Acids Research, 10.1093/nar/gkv1103.

REBASE - a database for DNA restriction and modification: enzymes, genes and genomes.
R. J. Roberts et al (2015). Nucleic Acids Research, 10.1093/nar/gku1046.

Deep annotation of protein function across diverse bacteria from mutant phenotypes.
M. N. Price et al (2016). bioRxiv, 10.1101/072470.

by Morgan Price, Arkin group
Lawrence Berkeley National Laboratory