PaperBLAST – Find papers about a protein or its homologs

 


PaperBLAST Hits for reanno::Burk376:H281DRAFT_01114 deoxynucleoside transporter, substrate-binding component (Paraburkholderia bryophila 376MFSha3.1) (334 a.a., MKLTRLGAAL...)

Other sequence analysis tools:

Find functional residues: SitesBLAST

Search for conserved domains

Find the best match in UniProt

Compare to protein structures

Predict transmembrane helices: Phobius

Predict protein localization: PSORTb

Find homologs in fast.genomics


Found 136 similar proteins in the literature:

H281DRAFT_01114 deoxynucleoside transporter, substrate-binding component from Paraburkholderia bryophila 376MFSha3.1
100% identity, 100% coverage

SMb20316 putative ABC transporter periplasmic sugar-binding protein from Sinorhizobium meliloti 1021
62% identity, 97% coverage

YPO1813 putative sugar-binding periplasmic protein from Yersinia pestis CO92
56% identity, 97% coverage

APY09_02520 substrate-binding domain-containing protein from Schaalia odontolytica
42% identity, 96% coverage

ATU_RS22740 autoinducer 2 ABC transporter substrate-binding protein from Agrobacterium fabrum str. C58
43% identity, 88% coverage

ROD_24811 ABC transporter, substrate-binding protein from Citrobacter rodentium ICC168
43% identity, 92% coverage

Entcl_1207 autoinducer 2 ABC transporter substrate-binding protein from [Enterobacter] lignolyticus SCF1
42% identity, 92% coverage

YPO3633 putative periplasmic binding protein from Yersinia pestis CO92
39% identity, 91% coverage

G5643_21680 autoinducer 2 ABC transporter substrate-binding protein from Serratia marcescens
44% identity, 84% coverage

YPO3328 putative sugar ABC transporter, periplasmic protein from Yersinia pestis CO92
42% identity, 75% coverage

YPTB0802 putative ABC transporter, periplasmic sugar binding protein from Yersinia pseudotuberculosis IP 32953
42% identity, 82% coverage

pRL80085 putative substrate-binding component of ABC transporter from Rhizobium leguminosarum bv. viciae 3841
39% identity, 87% coverage

YE2751 putative periplasmic binding protein from Yersinia enterocolitica subsp. enterocolitica 8081
38% identity, 92% coverage

C6B607 Putative ABC transporter periplasmic sugar-binding protein from Rhizobium leguminosarum bv. trifolii (strain WSM1325)
42% identity, 89% coverage

ECs0374 putative sugar-binding protein from Escherichia coli O157:H7 str. Sakai
Z0415 putative periplasmic binding protein, probable substrate ribose from Escherichia coli O157:H7 EDL933
37% identity, 97% coverage

SMb20484 putative ABC transporter periplasmic sugar-binding protein from Sinorhizobium meliloti 1021
39% identity, 82% coverage

SMc02324 PUTATIVE PERIPLASMIC BINDING ABC TRANSPORTER PROTEIN from Sinorhizobium meliloti 1021
33% identity, 87% coverage

Gocc_0231 autoinducer 2 ABC transporter substrate-binding protein from Gaiella occulta
34% identity, 79% coverage

Atu3487 ABC transporter, substrate binding protein (sugar) from Agrobacterium tumefaciens str. C58 (Cereon)
32% identity, 86% coverage

TC 3.A.1.2.9 / Q7BSH5 RhaS, component of Rhamnose porter (Richardson et al., 2004) (Transport activity is dependent on rhamnokinase (RhaK; AAQ92412) activity (Richardson and Oresnik, 2007) This could be an example of group translocation!) from Rhizobium leguminosarum (biovar trifolii) (see paper)
32% identity, 87% coverage

pRL110413 putative substrate binding protein involved in competition for nodulation from Rhizobium leguminosarum bv. viciae 3841
31% identity, 81% coverage

KPN_04210 putative LACI-type transcriptional regulator from Klebsiella pneumoniae subsp. pneumoniae MGH 78578
32% identity, 85% coverage

B1G1H7 Periplasmic binding protein/LacI transcriptional regulator from Paraburkholderia graminis (strain ATCC 700544 / DSM 17151 / LMG 18924 / NCIMB 13744 / C4D1M)
32% identity, 82% coverage

Swol_0423 putative sugar ABC transporter, substrate-binding protein from Syntrophomonas wolfei subsp. wolfei str. Goettingen
30% identity, 88% coverage

5hqjA / B1G1H7 Crystal structure of abc transporter solute binding protein b1g1h7 from burkholderia graminis c4d1m, target efi-511179, in complex with d-arabinose
30% identity, 73% coverage

BC2960 Sugar-binding protein from Bacillus cereus ATCC 14579
26% identity, 94% coverage

4wzzA / A9KIX1 Crystal structure of an abc transporter solute binding protein (ipr025997) from clostridium phytofermentas (cphy_0583, target efi-511148) with bound l-rhamnose
26% identity, 87% coverage

Cphy_0583 putative sugar ABC transporter, substrate-binding protein from Clostridium phytofermentans ISDg
26% identity, 79% coverage

C9Z1U7 Putative secreted solute-binding lipoprotein from Streptomyces scabiei (strain 87.22)
27% identity, 80% coverage

C0C300 Periplasmic binding protein domain-containing protein from [Clostridium] hylemonae DSM 15053
26% identity, 88% coverage

4pz0A / A0A6H3AKG3 The crystal structure of a solute binding protein from bacillus anthracis str. Ames in complex with quorum-sensing signal autoinducer-2 (ai-2)
26% identity, 89% coverage

TM0114 sugar ABC transporter, periplasmic sugar-binding protein from Thermotoga maritima MSB8
28% identity, 77% coverage

2h3hA / Q9WXW9 Crystal structure of the liganded form of thermotoga maritima glucose binding protein (see paper)
28% identity, 77% coverage

CTN_0576 Sugar ABC transporter, periplasmic sugar-binding protein from Thermotoga neapolitana DSM 4359
28% identity, 77% coverage

6gt9A / W8QN64 Crystal structure of ganp, a glucose-galactose binding protein from geobacillus stearothermophilus, in complex with galactose
27% identity, 75% coverage

SC3966 putative ABC superfamily (peri_perm), sugar transport protein from Salmonella enterica subsp. enterica serovar Choleraesuis str. SC-B67
26% identity, 98% coverage

LSRB_SALTY / Q8ZKQ1 Autoinducer 2-binding protein LsrB; AI-2-binding protein LsrB from Salmonella typhimurium (strain LT2 / SGSC1412 / ATCC 700720) (see paper)
STM4077 putative ABC superfamily (peri_perm), sugar transport protein from Salmonella typhimurium LT2
26% identity, 98% coverage

4wutA / B9K0B2 Crystal structure of an abc transporter solute binding protein (ipr025997) from agrobacterium vitis (avi_5133, target efi-511220) with bound d-fucose
27% identity, 86% coverage

Q8Z2X8 Autoinducer 2-binding protein LsrB from Salmonella typhi
26% identity, 98% coverage

C289_0603 sugar-binding protein from Anoxybacillus ayderensis
25% identity, 81% coverage

ECs2123 putative LACI-type transcriptional regulator from Escherichia coli O157:H7 str. Sakai
Z2189 putative LACI-type transcriptional regulator from Escherichia coli O157:H7 EDL933
BC33_RS14590 autoinducer 2 ABC transporter substrate-binding protein LsrB from Escherichia coli ATCC 700728
26% identity, 98% coverage

LsrB / b1516 Autoinducer-2 ABC transporter periplasmic binding protein (EC 7.6.2.13) from Escherichia coli K-12 substr. MG1655 (see 3 papers)
LsrB / P76142 Autoinducer-2 ABC transporter periplasmic binding protein (EC 7.6.2.13) from Escherichia coli (strain K12) (see 5 papers)
LSRB_ECOLI / P76142 Autoinducer 2-binding protein LsrB; AI-2-binding protein LsrB from Escherichia coli (strain K12) (see 2 papers)
TC 3.A.1.2.8 / P76142 LsrB(R), component of Autoinducer-2 (AI-2, a furanosyl borate diester: (3aS,6S,6aR)-2,2,6,6a-tetrahydroxy-3a-methyltetrahydrofuro[3,2-d][1,3,2]dioxaborolan-2-uide) uptake porter (Taga et al., 2001, 2003) from Escherichia coli (see 4 papers)
lsrB autoinducer 2 ABC transporter, periplasmic substrate-binding protein LsrB from Escherichia coli K12 (see paper)
NP_416033 Autoinducer-2 ABC transporter periplasmic binding protein from Escherichia coli str. K-12 substr. MG1655
b1516 AI2 transporter from Escherichia coli str. K-12 substr. MG1655
26% identity, 98% coverage

Entcl_0617 autoinducer 2 ABC transporter substrate-binding protein LsrB from [Enterobacter] lignolyticus SCF1
26% identity, 94% coverage

YPO0409 putative periplasmic solute-binding protein from Yersinia pestis CO92
24% identity, 99% coverage

y3772 putative LACI-type transcriptional regulator from Yersinia pestis KIM
24% identity, 89% coverage

A0R67_09330 sugar ABC transporter substrate-binding protein from Pasteurella multocida subsp. multocida
26% identity, 85% coverage

1tjyA / Q8ZKQ1 Crystal structure of salmonella typhimurium ai-2 receptor lsrb in complex with r-thmf (see paper)
24% identity, 92% coverage

PSPTO_2367 ribose ABC transporter, periplasmic ribose-binding protein from Pseudomonas syringae pv. tomato str. DC3000
27% identity, 72% coverage

BBR47_06790 putative ABC transporter substrate binding protein from Brevibacillus brevis NBRC 100599
22% identity, 81% coverage

AlsB / b4088 D-allose ABC transporter periplasmic binding protein (EC 7.5.2.8) from Escherichia coli K-12 substr. MG1655 (see 5 papers)
AlsB / P39265 D-allose ABC transporter periplasmic binding protein (EC 7.5.2.8) from Escherichia coli (strain K12) (see 5 papers)
ALSB_ECOLI / P39265 D-allose-binding periplasmic protein; ALBP from Escherichia coli (strain K12) (see paper)
P39265 ABC-type D-allose transporter (EC 7.5.2.8) from Escherichia coli (see paper)
TC 3.A.1.2.6 / P39265 AlsB aka B4088, component of D-allose porter from Escherichia coli (see 6 papers)
alsB / GB|AAC77049.1 D-allose-binding periplasmic protein; EC 3.6.3.17 from Escherichia coli K12 (see 6 papers)
b4088 D-allose transporter subunit from Escherichia coli str. K-12 substr. MG1655
28% identity, 76% coverage

5dteB / A6VKG5 Crystal structure of an abc transporter periplasmic solute binding protein (ipr025997) from actinobacillus succinogenes 130z(asuc_0081, target efi-511065) with bound d-allose
29% identity, 76% coverage

RSP_3500 ABC sugar transporter, periplasmic binding protein from Rhodobacter sphaeroides 2.4.1
24% identity, 94% coverage

A4XG54 Periplasmic binding protein/LacI transcriptional regulator from Caldicellulosiruptor saccharolyticus (strain ATCC 43494 / DSM 8903 / Tp8T 6331)
Csac_0242 periplasmic binding protein/LacI transcriptional regulator from Caldicellulosiruptor saccharolyticus DSM 8903
26% identity, 63% coverage

Cthe_2446 substrate-binding domain-containing protein from Acetivibrio thermocellus DSM 1313
Cthe_2446 ABC-type sugar transport system periplasmic component-like protein from Clostridium thermocellum ATCC 27405
26% identity, 69% coverage

plu3146 ABC transporter Binding Protein (BP) LsrB from Photorhabdus luminescens subsp. laumondii TTO1
25% identity, 90% coverage

1gudA / P39265 Hinge-bending motion of d-allose binding protein from escherichia coli: three open conformations (see paper)
28% identity, 71% coverage

BCAL1657 putative ribose transport system, substrate-binding protein from Burkholderia cenocepacia J2315
30% identity, 67% coverage

HSERO_RS05260 ABC transporter for L-fucose, substrate-binding component from Herbaspirillum seropedicae SmR1
25% identity, 90% coverage

HSERO_RS11480 D-ribose ABC transporter, substrate-binding component RbsB from Herbaspirillum seropedicae SmR1
29% identity, 73% coverage

Pf1N1B4_6035 D-ribose ABC transporter, substrate-binding component RbsB from Pseudomonas fluorescens FW300-N1B4
25% identity, 79% coverage

RHE_RS22400 substrate-binding domain-containing protein from Rhizobium etli CFN 42
27% identity, 70% coverage

CTC_00907 ABC transporter substrate-binding protein from Clostridium tetani E88
Q896U1 D-ribose-binding periplasmic protein from Clostridium tetani (strain Massachusetts / E88)
23% identity, 90% coverage

3t95A / Q74PW2 Crystal structure of lsrb from yersinia pestis complexed with autoinducer-2 (see paper)
23% identity, 92% coverage

OKIT_0347 substrate-binding domain-containing protein from Oenococcus kitaharae DSM 17330
26% identity, 60% coverage

Atu3063 ABC transporter, nucleotide binding/ATPase protein from Agrobacterium tumefaciens str. C58 (Cereon)
28% identity, 89% coverage

blr1123 ABC transporter sugar-binding protein from Bradyrhizobium japonicum USDA 110
26% identity, 49% coverage

Bdiaspc4_05515 sugar-binding protein from Bradyrhizobium diazoefficiens
26% identity, 49% coverage

RALBP_PSEAE / Q9I2F8 D-ribose/D-allose-binding protein from Pseudomonas aeruginosa (strain ATCC 15692 / DSM 22644 / CIP 104116 / JCM 14847 / LMG 12228 / 1C / PRS 101 / PAO1) (see paper)
PA1946 binding protein component precursor of ABC ribose transporter from Pseudomonas aeruginosa PAO1
27% identity, 72% coverage

Caur_2286 ABC-type sugar transport system periplasmic component-like protein from Chloroflexus aurantiacus J-10-fl
25% identity, 36% coverage

SMb21016 putative sugar ABC transporter periplasmic solute-binding protein precursor from Sinorhizobium meliloti 1021
24% identity, 84% coverage

MSMEG_1374 ribose ABC transporter, periplasmic binding protein from Mycobacterium smegmatis str. MC2 155
26% identity, 74% coverage

RL0518 putative solute-binding component of ABC transporter from Rhizobium leguminosarum bv. viciae 3841
26% identity, 91% coverage

B2904_orf2673 sugar ABC transporter substrate-binding protein from Brachyspira pilosicoli B2904
28% identity, 62% coverage

PS417_18405 D-ribose ABC transporter, substrate-binding component RbsB from Pseudomonas simiae WCS417
29% identity, 64% coverage

llmg_0789 ribose ABC transporter substrate binding protein RbsB from Lactococcus lactis subsp. cremoris MG1363
LLNZ_RS04085 substrate-binding domain-containing protein from Lactococcus cremoris subsp. cremoris NZ9000
23% identity, 69% coverage

CTN_0777 Periplasmic binding protein/LacI transcriptional regulator precursor from Thermotoga neapolitana DSM 4359
TRQ7_RS05225 sugar-binding protein from Thermotoga sp. RQ7
28% identity, 63% coverage

6dspA / U5MRH9 Lsrb from clostridium saccharobutylicum in complex with ai-2 (see paper)
23% identity, 81% coverage

Cthe_0393 sugar ABC transporter (sugar-binding protein) from Clostridium thermocellum ATCC 27405
Cthe_0393 substrate-binding domain-containing protein from Acetivibrio thermocellus ATCC 27405
33% identity, 41% coverage

Avi_5339 ABC transporter substrate binding protein (ribose) from Agrobacterium vitis S4
26% identity, 78% coverage

7x0hA / A3DCF2 Crystal structure of sugar binding protein cbpa complexed with glucose from clostridium thermocellum (see paper)
33% identity, 41% coverage

3ejwA / Q926H7 Crystal structure of the sinorhizobium meliloti ai-2 receptor, smlsrb (see paper)
24% identity, 83% coverage

TC 3.A.1.2.20 / G4FGN5 LacI family transcriptional regulator, component of Glucose porter. Also bind xylose (Boucher and Noll 2011). Induced by glucose (Frock et al. 2012). Directly regulated by glucose-responsive regulator GluR from Thermotoga maritima (strain ATCC 43589 / MSB8 / DSM 3109 / JCM 10099)
Tmari_1858 sugar-binding protein from Thermotoga maritima MSB8
28% identity, 52% coverage

BCAM0766 D-ribose-binding periplasmic protein precursor from Burkholderia cenocepacia J2315
28% identity, 51% coverage

4rxtA / B9JKX8 Crystal structure of carbohydrate transporter solute binding protein arad_9553 from agrobacterium radiobacter, target efi-511541, in complex with d-arabinose
26% identity, 83% coverage

5dkvA / B9K0T2 Crystal structure of an abc transporter solute binding protein from agrobacterium vitis(avis_5339, target efi-511225) bound with alpha-d-tagatopyranose
28% identity, 57% coverage

KP1_1422 putative rhizopine uptake ABC transport system periplasmic solute-binding protein precursor from Klebsiella pneumoniae NTUH-K2044
28% identity, 77% coverage

PP_2454 ribose ABC transporter, periplasmic ribose-binding protein from Pseudomonas putida KT2440
24% identity, 69% coverage

SMb21377 putative sugar uptake ABC transporter periplasmic solute-binding protein precursor from Sinorhizobium meliloti 1021
28% identity, 54% coverage

SMb20931 putative sugar uptake ABC transporter periplasmic solute-binding protein precursor from Sinorhizobium meliloti 1021
25% identity, 49% coverage

gbs0113 Unknown from Streptococcus agalactiae NEM316
22% identity, 76% coverage

TTE0206 Periplasmic sugar-binding proteins from Thermoanaerobacter tengcongensis MB4
25% identity, 86% coverage

2ioyA / Q8RD41 Crystal structure of thermoanaerobacter tengcongensis ribose binding protein (see paper)
25% identity, 79% coverage

RHE_CH00492 probable sugar ABC transporter, substrate-binding protein from Rhizobium etli CFN 42
26% identity, 90% coverage

NGR_RS07515 sugar-binding protein from Sinorhizobium fredii NGR234
25% identity, 58% coverage

Entcl_4175 sugar ABC transporter substrate-binding protein from [Enterobacter] lignolyticus SCF1
31% identity, 31% coverage

SAK_0166 ribose ABC transporter, ribose-binding protein from Streptococcus agalactiae A909
23% identity, 71% coverage

SAN_0145 ribose ABC transporter, periplasmic D-ribose-binding protein from Streptococcus agalactiae COH1
23% identity, 71% coverage

SAG0114 ribose ABC transporter, periplasmic D-ribose-binding protein from Streptococcus agalactiae 2603V/R
24% identity, 75% coverage

7e7mC / Q8E283 Crystal structure analysis of the streptococcus agalactiae ribose binding protein rbsb
24% identity, 75% coverage

RHE_RS27555 sugar-binding protein from Rhizobium etli CFN 42
27% identity, 74% coverage

AOT13_01795 ribose ABC transporter substrate-binding protein RbsB from Parageobacillus thermoglucosidasius
26% identity, 75% coverage

RLV_4716 sugar-binding protein from Rhizobium leguminosarum bv. viciae
27% identity, 74% coverage

TEL01S_RS08670 sugar ABC transporter substrate-binding protein from Pseudothermotoga elfii DSM 9442 = NBRC 107921
30% identity, 42% coverage

GK1896 sugar ABC transporter (sugar-binding protein) from Geobacillus kaustophilus HTA426
24% identity, 75% coverage

SXYL_01518 D-ribose ABC transporter substrate-binding protein from Staphylococcus xylosus
26% identity, 78% coverage

C9ZD81 Putative secreted solute binding protein from Streptomyces scabiei (strain 87.22)
26% identity, 84% coverage

CTN_0240 Sugar binding protein of ABC transporter from Thermotoga neapolitana DSM 4359
25% identity, 64% coverage

CD0300 D-ribose ABC transporter, substrate-binding protein from Clostridium difficile 630
27% identity, 75% coverage

VCA0130 ribose ABC transporter, periplasmic D-ribose-binding protein from Vibrio cholerae O1 biovar eltor str. N16961
22% identity, 87% coverage

CTN_0364 putative periplasmic binding protein from Thermotoga neapolitana DSM 4359
33% identity, 34% coverage

LBA1481 D-ribose-binding protein precursor from Lactobacillus acidophilus NCFM
26% identity, 53% coverage

RHE_RS30060 substrate-binding domain-containing protein from Rhizobium etli CFN 42
27% identity, 56% coverage

MSMEG_3999 ABC transporter periplasmic-binding protein YphF from Mycobacterium smegmatis str. MC2 155
23% identity, 67% coverage

Ac3H11_3035 Fructose ABC transporter, substrate-binding component FrcB from Acidovorax sp. GW101-3H11
34% identity, 33% coverage

SACE_0943 binding protein/LacI transcriptional regulator from Saccharopolyspora erythraea NRRL 2338
26% identity, 74% coverage

PGA1_262p00430 glucose transporter, periplasmic substrate-binding component from Phaeobacter inhibens DSM 17395
34% identity, 30% coverage

4ry8B / A8F7U7 Crystal structure of 5-methylthioribose transporter solute binding protein tlet_1677 from thermotoga lettingae tmo target efi-511109 in complex with 5-methylthioribose
32% identity, 36% coverage

4yo7A / Q9KAG4 Crystal structure of an abc transporter solute binding protein (ipr025997) from bacillus halodurans c-125 (bh2323, target efi-511484) with bound myo-inositol
23% identity, 81% coverage

SPRI_RS32325 sugar ABC transporter substrate-binding protein from Streptomyces pristinaespiralis
23% identity, 76% coverage

B5S52_21960 ribose ABC transporter substrate-binding protein RbsB from Pectobacterium brasiliense
24% identity, 87% coverage

ECA_RS00065 ribose ABC transporter substrate-binding protein RbsB from Pectobacterium atrosepticum SCRI1043
24% identity, 87% coverage

PFLU_2583 sugar ABC transporter substrate-binding protein from Pseudomonas [fluorescens] SBW25
PFLU2583 putative rhizopine-binding ABC transporter protein from Pseudomonas fluorescens SBW25
23% identity, 84% coverage

OG1RF_0175 possible DNA binding protein from Enterococcus faecalis OG1RF
OG1RF_11789 LacI family DNA-binding transcriptional regulator from Enterococcus faecalis OG1RF
24% identity, 49% coverage

THPA_MYCS2 / A0QYB5 D-threitol-binding protein from Mycolicibacterium smegmatis (strain ATCC 700084 / mc(2)155) (Mycobacterium smegmatis) (see paper)
MSMEG_3599 sugar ABC transporter xylitol/D-threitol-binding protein ThpA from Mycolicibacterium smegmatis MC2 155
MSMEG_3599 sugar-binding transcriptional regulator, LacI family protein from Mycobacterium smegmatis str. MC2 155
27% identity, 54% coverage

Q92TD7 D-fucose, pyruvic acid or L-fucose ABC transporter, periplasmic substrate-binding component from Rhizobium meliloti (strain 1021)
SMc02774 PUTATIVE ABC TRANSPORTER PERIPLASMIC BINDING PROTEIN from Sinorhizobium meliloti 1021
23% identity, 87% coverage

RL3840 putative substrate-binding protein component of ABC transporter from Rhizobium leguminosarum bv. viciae 3841
25% identity, 58% coverage

DDA3937_RS00045 ribose ABC transporter substrate-binding protein RbsB from Dickeya dadantii 3937
23% identity, 87% coverage

Q98FL9 Xylose binding protein transport system XylF from Mesorhizobium japonicum (strain LMG 29417 / CECT 9101 / MAFF 303099)
29% identity, 39% coverage

A79_4530 D-ribose-binding periplasmic protein from Vibrio parahaemolyticus AQ3810
VPA1084 ribose ABC transporter, periplasmic D-ribose-binding protein from Vibrio parahaemolyticus RIMD 2210633
24% identity, 70% coverage

RBAM_RS16755 ribose ABC transporter substrate-binding protein RbsB from Bacillus velezensis FZB42
24% identity, 75% coverage

5xssA / A6LW07 Xylfii molecule (see paper)
21% identity, 60% coverage

SMb20856 putative sugar uptake ABC transporter periplasmic solute-binding protein precursor from Sinorhizobium meliloti 1021
25% identity, 62% coverage

RL3624 putative solute-binding component of ABC transporter from Rhizobium leguminosarum bv. viciae 3841
31% identity, 32% coverage

c3070 ABC transporter Periplasmic binding protein yphF precursor from Escherichia coli CFT073
22% identity, 78% coverage

XYPA_MYCS2 / A0QYB3 Xylitol-binding protein from Mycolicibacterium smegmatis (strain ATCC 700084 / mc(2)155) (Mycobacterium smegmatis) (see paper)
MSMEG_3598 periplasmic sugar-binding proteins from Mycobacterium smegmatis str. MC2 155
24% identity, 82% coverage


For advice on how to use these tools together, see Interactive tools for functional annotation of bacterial genomes.

Statistics

The PaperBLAST database links 798,070 different protein sequences to 1,261,478 scientific articles. Searches against EuropePMC were last performed on May 12 2025.

How It Works

PaperBLAST builds a database of protein sequences that are linked to scientific articles. These links come from automated text searches against the articles in EuropePMC and from manually-curated information from GeneRIF, UniProtKB/Swiss-Prot, BRENDA, CAZy (as made available by dbCAN), BioLiP, CharProtDB, MetaCyc, EcoCyc, TCDB, REBASE, the Fitness Browser, and a subset of the European Nucleotide Archive with the /experiment tag. Given this database and a protein sequence query, PaperBLAST uses protein-protein BLAST to find similar sequences with E < 0.001.
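The E-value cutoff described above can be illustrated with a minimal Python sketch. It assumes the BLAST results are available as 12-column tab-separated text (the standard layout produced by BLAST+ with -outfmt 6). The names Hit, parse_tabular_hits and significant_hits are hypothetical, the example rows are invented, and this is not PaperBLAST's own code.

```python
from typing import NamedTuple

E_VALUE_CUTOFF = 1e-3  # the "E < 0.001" threshold stated in the text


class Hit(NamedTuple):
    subject_id: str
    percent_identity: float
    e_value: float


def parse_tabular_hits(text: str) -> list[Hit]:
    """Parse 12-column tab-separated BLAST lines (assumed -outfmt 6 layout)."""
    hits = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        cols = line.split("\t")
        # Standard column order: qseqid sseqid pident length mismatch gapopen
        # qstart qend sstart send evalue bitscore
        hits.append(Hit(cols[1], float(cols[2]), float(cols[10])))
    return hits


def significant_hits(hits: list[Hit], cutoff: float = E_VALUE_CUTOFF) -> list[Hit]:
    """Keep only hits whose E-value is strictly below the cutoff."""
    return [h for h in hits if h.e_value < cutoff]


# Invented example rows, for illustration only.
EXAMPLE = (
    "query\tsubjA\t62.0\t325\t120\t3\t5\t329\t8\t330\t1e-140\t410\n"
    "query\tsubjB\t24.0\t90\t60\t4\t10\t99\t20\t105\t0.5\t30.1\n"
)

if __name__ == "__main__":
    for hit in significant_hits(parse_tabular_hits(EXAMPLE)):
        print(hit.subject_id, hit.percent_identity, hit.e_value)
```

The comparison is strict, so a hit with an E-value of exactly 0.001 is dropped, matching "E < 0.001" as written.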

To build the database, we query EuropePMC with locus tags, with RefSeq protein identifiers, and with UniProt accessions. We obtain the locus tags from RefSeq or from MicrobesOnline. We use queries of the form "locus_tag AND genus_name" to try to ensure that the paper is actually discussing that gene. Because EuropePMC indexes most recent biomedical papers, even if they are not open access, some of the links may be to papers that you cannot read or that our computers cannot read. We query each of these identifiers that appears in the open access part of EuropePMC, as well as every locus tag that appears in the 500 most-referenced genomes, so that a gene may appear in the PaperBLAST results even though none of the papers that mention it are open access. We also incorporate text-mined links from EuropePMC that link open access articles to UniProt or RefSeq identifiers. (This yields some additional links because EuropePMC uses different heuristics for their text mining than we do.)
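A query of the form "locus_tag AND genus_name" might be assembled as in the Python sketch below. The helper names build_query and build_search_url are hypothetical. Taking the genus from the first word of the organism name, and sending the query to Europe PMC's public REST search endpoint, are assumptions made for illustration; the text does not say how PaperBLAST itself submits its queries.

```python
from urllib.parse import urlencode

# Assumption: Europe PMC's public REST search endpoint. The text does not
# state which interface PaperBLAST actually uses.
EUROPEPMC_SEARCH = "https://www.ebi.ac.uk/europepmc/webservices/rest/search"


def build_query(locus_tag: str, organism: str) -> str:
    """Return a query of the form 'locus_tag AND genus_name'."""
    # Assumption: the genus is the first word of the organism name; brackets
    # around provisional genus names such as "[Enterobacter]" are stripped.
    genus = organism.split()[0].strip("[]")
    return f"{locus_tag} AND {genus}"


def build_search_url(locus_tag: str, organism: str) -> str:
    """Return a search URL for the query, without sending any request."""
    params = {"query": build_query(locus_tag, organism), "format": "json"}
    return f"{EUROPEPMC_SEARCH}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_query("YPO1813", "Yersinia pestis CO92"))
    print(build_search_url("YPO1813", "Yersinia pestis CO92"))
```

Requiring the genus name alongside the locus tag is what keeps a short tag that happens to match an unrelated string from linking a paper to the wrong gene.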

For every article that mentions a locus tag, a RefSeq protein identifier, or a UniProt accession, we try to select one or two snippets of text that refer to the protein. If we cannot get access to the full text, we try to select a snippet from the abstract, but unfortunately, unique identifiers such as locus tags are rarely provided in abstracts.
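One simple way to pick "one or two snippets" is sketched below in Python. It is only an illustration: the text does not describe PaperBLAST's actual selection heuristics, and the function name select_snippets, the 60-character window and the overlap rule are all assumptions.

```python
import re


def select_snippets(text: str, identifier: str,
                    max_snippets: int = 2, window: int = 60) -> list[str]:
    """Return up to max_snippets non-overlapping excerpts around identifier."""
    # Match the identifier only as a whole token, so that "b1516" does not
    # match inside a longer identifier such as "b15160".
    pattern = re.compile(r"(?<!\w)" + re.escape(identifier) + r"(?!\w)")
    snippets = []
    last_end = -1
    for match in pattern.finditer(text):
        start = max(0, match.start() - window)
        end = min(len(text), match.end() + window)
        if start < last_end:
            continue  # this excerpt would overlap the previous one
        snippets.append(text[start:end].strip())
        last_end = end
        if len(snippets) == max_snippets:
            break
    return snippets


if __name__ == "__main__":
    # Invented text, for illustration only.
    article = (
        "We deleted b1516 and measured uptake. "
        + "x " * 100
        + "The b1516 mutant failed to import AI-2. Note b15160 is unrelated."
    )
    for snippet in select_snippets(article, "b1516"):
        print(snippet)
```

The same function can be applied to an abstract when the full text is unavailable; it returns an empty list if the identifier never appears, which is the common case noted above.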

PaperBLAST also incorporates manually-curated protein functions:

Except for GeneRIF and ENA, the curated entries include a short curated description of the protein's function. For entries from BioLiP, the protein's function may not be known beyond binding to the ligand. Many of these entries also link to articles in PubMed.

For more information see the PaperBLAST paper (mSystems 2017) or the code. You can download PaperBLAST's database here.

Changes to PaperBLAST since the paper was written:

Many of these changes are described in Interactive tools for functional annotation of bacterial genomes.

Secrets

PaperBLAST cannot provide snippets for many of the papers that are published in non-open-access journals. This limitation applies even if the paper is marked as "free" on the publisher's web site and is available in PubMed Central or EuropePMC. If a journal that you publish in is marked as "secret," please consider publishing elsewhere.

Omissions from the PaperBLAST Database

Many important articles are missing from PaperBLAST, either because the article's full text is not in EuropePMC (as for many older articles), or because the paper does not mention a protein identifier such as a locus tag, or because of PaperBLAST's heuristics. If you notice an article that characterizes a protein's function but is missing from PaperBLAST, please notify the curators at UniProt or add an entry to GeneRIF. Entries in either of these databases will eventually be incorporated into PaperBLAST. Note that to add an entry to UniProt, you will need to find the UniProt identifier for the protein. If the protein is not already in UniProt, you can ask them to create an entry. To add an entry to GeneRIF, you will need an NCBI Gene identifier, but unfortunately many prokaryotic proteins in RefSeq do not have corresponding Gene identifiers.

References

PaperBLAST: Text-mining papers for information about homologs.
M. N. Price and A. P. Arkin (2017). mSystems, 10.1128/mSystems.00039-17.

Europe PMC in 2017.
M. Levchenko et al (2017). Nucleic Acids Research, 10.1093/nar/gkx1005.

Gene indexing: characterization and analysis of NLM's GeneRIFs.
J. A. Mitchell et al (2003). AMIA Annu Symp Proc 2003:460-464.

UniProt: the universal protein knowledgebase.
The UniProt Consortium (2016). Nucleic Acids Research, 10.1093/nar/gkw1099.

BRENDA in 2017: new perspectives and new tools in BRENDA.
S. Placzek et al (2017). Nucleic Acids Research, 10.1093/nar/gkw952.

The EcoCyc database: reflecting new knowledge about Escherichia coli K-12.
I. M. Keseler et al (2016). Nucleic Acids Research, 10.1093/nar/gkw1003.

The MetaCyc database of metabolic pathways and enzymes.
R. Caspi et al (2018). Nucleic Acids Research, 10.1093/nar/gkx935.

CharProtDB: a database of experimentally characterized protein annotations.
R. Madupu et al (2012). Nucleic Acids Research, 10.1093/nar/gkr1133.

The carbohydrate-active enzymes database (CAZy) in 2013.
V. Lombard et al (2014). Nucleic Acids Research, 10.1093/nar/gkt1178.

The Transporter Classification Database (TCDB): recent advances.
M. H. Saier, Jr. et al (2016). Nucleic Acids Research, 10.1093/nar/gkv1103.

REBASE - a database for DNA restriction and modification: enzymes, genes and genomes.
R. J. Roberts et al (2015). Nucleic Acids Research, 10.1093/nar/gku1046.

Deep annotation of protein function across diverse bacteria from mutant phenotypes.
M. N. Price et al (2016). bioRxiv, 10.1101/072470.

by Morgan Price, Arkin group
Lawrence Berkeley National Laboratory