PaperBLAST – Find papers about a protein or its homologs

 

PaperBLAST

PaperBLAST Hits for CharProtDB::CH_123433 glucosamine-6-phosphate deaminase (Candida albicans) (182 a.a., MDEYLGLAPS...)

Other sequence analysis tools:

Find functional residues: SitesBLAST

Search for conserved domains

Find the best match in UniProt

Compare to protein structures

Predict transmembrane helices: Phobius

Predict protein localization: PSORTb

Find homologs in fast.genomics

Found 132 similar proteins in the literature:

NAG1 glucosamine-6-phosphate deaminase from Candida albicans (see 4 papers)
100% identity, 100% coverage

Q04802 Glucosamine-6-phosphate isomerase from Candida albicans (strain SC5314 / ATCC MYA-2876)
99% identity, 73% coverage

SG0858 glucosamine-6-phosphate deaminase from Sodalis glossinidius str. 'morsitans'
52% identity, 63% coverage

BT4127 glucosamine-6-phosphate isomerase from Bacteroides thetaiotaomicron VPI-5482
53% identity, 65% coverage

YPTB1119 putative glucosamine-6-phosphate isomerase from Yersinia pseudotuberculosis IP 32953
A4TNY0 Glucosamine-6-phosphate deaminase from Yersinia pestis (strain Pestoides F)
YPO2627 putative glucosamine-6-phosphate isomerase from Yersinia pestis CO92
YPK_2997 glucosamine-6-phosphate isomerase from Yersinia pseudotuberculosis YPIII
52% identity, 65% coverage

B3MMZ7 Glucosamine-6-phosphate isomerase from Drosophila ananassae
49% identity, 64% coverage

ESA_02661 glucosamine-6-phosphate deaminase from Cronobacter sakazakii ATCC BAA-894
ESA_02661 hypothetical protein from Enterobacter sakazakii ATCC BAA-894
51% identity, 66% coverage

TDE_0337 glucosamine-6-phosphate deaminase from Treponema denticola ATCC 35405
51% identity, 65% coverage

VSAL_I2812 glucosamine-6-phosphate deaminase from Aliivibrio salmonicida LFI1238
VSAL_I2812 glucosamine-6-phosphate deaminase from Vibrio salmonicida LFI1238
51% identity, 66% coverage

GlmD / b0678 glucosamine-6-phosphate deaminase (EC 3.5.99.6) from Escherichia coli K-12 substr. MG1655 (see 28 papers)
nagB / P0A759 glucosamine-6-phosphate deaminase (EC 3.5.99.6) from Escherichia coli (strain K12) (see 27 papers)
NAGB_ECOLI / P0A759 Glucosamine-6-phosphate deaminase; GlcN6P deaminase; GNPDA; Glucosamine-6-phosphate isomerase; EC 3.5.99.6 from Escherichia coli (strain K12) (see 3 papers)
NAGB_ECO57 / P0A760 Glucosamine-6-phosphate deaminase; GlcN6P deaminase; GNPDA; Glucosamine-6-phosphate isomerase; EC 3.5.99.6 from Escherichia coli O157:H7 (see paper)
1deaA / P0A759 Structure and catalytic mechanism of glucosamine 6-phosphate deaminase from escherichia coli at 2.1 angstroms resolution (see paper)
nagB / PDB|1CD5 GlcN6P deaminase; EC 3.5.99.6 from Escherichia coli K12 (see 14 papers)
B7LKT5 Glucosamine-6-phosphate deaminase from Escherichia fergusonii (strain ATCC 35469 / DSM 13698 / CCUG 18766 / IAM 14443 / JCM 21226 / LMG 7866 / NBRC 102419 / NCTC 12128 / CDC 0568-73)
NP_415204 glucosamine-6-phosphate deaminase from Escherichia coli str. K-12 substr. MG1655
b0678 glucosamine-6-phosphate deaminase from Escherichia coli str. K-12 substr. MG1655
51% identity, 65% coverage

HI0141 glucosamine-6-phosphate isomerase (nagB) from Haemophilus influenzae Rd KW20
50% identity, 65% coverage

STM0684 glucosamine-6-phosphate deaminase from Salmonella typhimurium LT2
51% identity, 65% coverage

A6T6C1 Glucosamine-6-phosphate deaminase from Klebsiella pneumoniae subsp. pneumoniae (strain ATCC 700721 / MGH 78578)
51% identity, 65% coverage

An16g09070 uncharacterized protein from Aspergillus niger
50% identity, 50% coverage

VV2_1200 6-phosphogluconolactonase from Vibrio vulnificus CMCP6
52% identity, 65% coverage

Q4QP46 Glucosamine-6-phosphate deaminase from Haemophilus influenzae (strain 86-028NP)
50% identity, 65% coverage

Afu8g04070 glucosamine-6-phosphate deaminase, putative from Aspergillus fumigatus Af293
51% identity, 48% coverage

Afu1g00480 glucosamine-6-phosphate deaminase, putative from Aspergillus fumigatus Af293
51% identity, 48% coverage

Z0825 glucosamine-6-phosphate deaminase from Escherichia coli O157:H7 EDL933
51% identity, 65% coverage

5hj5B / Q9KKS5 Crystal structure of tertiary complex of glucosamine-6-phosphate deaminase from vibrio cholerae with beta-d-glucose-6-phosphate and fructose-6-phosphate
52% identity, 64% coverage

VCA1025 glucosamine-6-phosphate isomerase from Vibrio cholerae O1 biovar eltor str. N16961
52% identity, 65% coverage

VF_2357 glucosamine-6-phosphate deaminase from Vibrio fischeri ES114
VF_2357 glucosamine-6-phosphate deaminase from Aliivibrio fischeri ES114
50% identity, 66% coverage

VPA0038 glucosamine-6-phosphate isomerase from Vibrio parahaemolyticus RIMD 2210633
51% identity, 65% coverage

I1C2J3 Glucosamine-6-phosphate isomerase from Rhizopus delemar (strain RA 99-880 / ATCC MYA-4621 / FGSC 9543 / NRRL 43880)
RO3G_07378 glucosamine-6-phosphate isomerase 1 from Rhizopus delemar RA 99-880
49% identity, 59% coverage

PMI0454 glucosamine-6-phosphate deaminase from Proteus mirabilis HI4320
50% identity, 66% coverage

HCAG_08152 glucosamine-6-phosphate deaminase from Histoplasma mississippiense (nom. inval.)
52% identity, 60% coverage

E7F0E2 Glucosamine-6-phosphate isomerase from Danio rerio
XP_684147 glucosamine-6-phosphate isomerase 2 from Danio rerio
47% identity, 64% coverage

GL50803_8245 Glucosamine-6-phosphate deaminase from Giardia intestinalis
48% identity, 65% coverage

PADG_00401 glucosamine-6-phosphate deaminase from Paracoccidioides brasiliensis Pb18
51% identity, 50% coverage

GPI1 / O97439 glucosamine 6-phosphate isomerase 1 (EC 3.5.99.6) from Giardia intestinalis (see 2 papers)
48% identity, 65% coverage

CH_124073 putative glucosamine-6-phosphate deaminase from Magnaporthe grisea 70-15 (see 2 papers)
47% identity, 47% coverage

CNAG_06098 glucosamine-6-phosphate deaminase from Cryptococcus neoformans var. grubii H99
J9VVN4 Glucosamine-6-phosphate isomerase from Cryptococcus neoformans var. grubii serotype A (strain H99 / ATCC 208821 / CBS 10515 / FGSC 9487)
48% identity, 62% coverage

D0UPW5 Glucosamine-6-phosphate isomerase (Fragment) from Periplaneta americana
47% identity, 94% coverage

CNM01050 Glucosamine-6-phosphate isomerase from Cryptococcus neoformans var. neoformans JEC21
48% identity, 52% coverage

GNPI1_BOVIN / A4FV08 Glucosamine-6-phosphate deaminase 1; GlcN6P deaminase 1; Glucosamine-6-phosphate isomerase 1; Protein oscillin; EC 3.5.99.6 from Bos taurus (Bovine) (see paper)
G1TCQ4 Glucosamine-6-phosphate isomerase from Oryctolagus cuniculus
48% identity, 61% coverage

W5NUG3 Glucosamine-6-phosphate isomerase from Ovis aries
48% identity, 61% coverage

FPSE_09015 hypothetical protein from Fusarium pseudograminearum CS3096
49% identity, 42% coverage

GNPI1_MESAU / Q64422 Glucosamine-6-phosphate deaminase 1; GlcN6P deaminase 1; Glucosamine-6-phosphate isomerase 1; Protein oscillin; EC 3.5.99.6 from Mesocricetus auratus (Golden hamster) (see paper)
48% identity, 61% coverage

GNPI1_MOUSE / O88958 Glucosamine-6-phosphate deaminase 1; GlcN6P deaminase 1; Glucosamine-6-phosphate isomerase 1; Protein oscillin; EC 3.5.99.6 from Mus musculus (Mouse) (see paper)
48% identity, 61% coverage

GNPI1_HUMAN / P46926 Glucosamine-6-phosphate deaminase 1; GlcN6P deaminase 1; Glucosamine-6-phosphate isomerase 1; Protein oscillin; EC 3.5.99.6 from Homo sapiens (Human) (see 3 papers)
NP_005462 glucosamine-6-phosphate isomerase 1 from Homo sapiens
47% identity, 61% coverage

1ne7A / P46926 Human glucosamine-6-phosphate deaminase isomerase at 1.75 a resolution complexed with n-acetyl-glucosamine-6-phosphate and 2-deoxy-2-amino- glucitol-6-phosphate (see paper)
47% identity, 64% coverage

NP_001257810 glucosamine-6-phosphate isomerase 2 isoform 3 from Homo sapiens
47% identity, 85% coverage

W5Q0A8 Glucosamine-6-phosphate isomerase from Ovis aries
Q17QL1 Glucosamine-6-phosphate deaminase 2 from Bos taurus
47% identity, 64% coverage

NP_001099475 glucosamine-6-phosphate isomerase 2 from Rattus norvegicus
47% identity, 64% coverage

GNPI2_HUMAN / Q8TDQ7 Glucosamine-6-phosphate deaminase 2; GlcN6P deaminase 2; Glucosamine-6-phosphate isomerase 2; Glucosamine-6-phosphate isomerase SB52; EC 3.5.99.6 from Homo sapiens (Human) (see 2 papers)
Q8TDQ7 glucosamine-6-phosphate deaminase (EC 3.5.99.6) from Homo sapiens (see 2 papers)
47% identity, 64% coverage

U5LXR4 glucosamine-6-phosphate deaminase (EC 3.5.99.6) from Gallus gallus (see paper)
47% identity, 86% coverage

E1C878 glucosamine-6-phosphate deaminase (EC 3.5.99.6) from Gallus gallus (see paper)
47% identity, 64% coverage

LINJ_32_3460 putative glucosamine-6-phosphate isomerase from Leishmania infantum JPCM5
A4I8F1 Glucosamine-6-phosphate isomerase from Leishmania infantum
47% identity, 65% coverage

BB0152 glucosamine-6-phosphate isomerase (nagB) from Borrelia burgdorferi B31
53% identity, 67% coverage

3hn6A / O30564 Crystal structure of glucosamine-6-phosphate deaminase from borrelia burgdorferi
53% identity, 66% coverage

D0AAS0 Glucosamine-6-phosphate isomerase from Trypanosoma brucei gambiense (strain MHOM/CI/86/DAL972)
46% identity, 65% coverage

Q4Q4U6 Glucosamine-6-phosphate isomerase from Leishmania major
45% identity, 65% coverage

GAMA_BACSU / O31458 Glucosamine-6-phosphate deaminase 2; GlcN6P deaminase 2; GNPDA 2; Glucosamine-6-phosphate isomerase 2; EC 3.5.99.6 from Bacillus subtilis (strain 168) (see 3 papers)
BSU02360 glucosamine-6-phosphate isomerase from Bacillus subtilis subsp. subtilis str. 168
44% identity, 72% coverage

Q4D0F2 Glucosamine-6-phosphate isomerase from Trypanosoma cruzi (strain CL Brener)
XP_807857 glucosamine-6-phosphate isomerase, putative from Trypanosoma cruzi
44% identity, 65% coverage

VDAG_05573 glucosamine-6-phosphate isomerase from Verticillium dahliae VdLs.17
46% identity, 48% coverage

BH0420 N-acetylglucosamine-6-phosphate isomerase from Bacillus halodurans C-125
44% identity, 71% coverage

GJQ69_02440 glucosamine-6-phosphate deaminase from Caproicibacterium lactatifermentans
42% identity, 70% coverage

CD630_10110, CDIF630erm_01147 glucosamine-6-phosphate deaminase from Clostridioides difficile
44% identity, 71% coverage

CPF_2744 glucosamine-6-phosphate isomerase from Clostridium perfringens ATCC 13124
40% identity, 74% coverage

NAGB_BACSU / O35000 Glucosamine-6-phosphate deaminase 1; GlcN6P deaminase 1; GNPDA 1; Glucosamine-6-phosphate isomerase 1; EC 3.5.99.6 from Bacillus subtilis (strain 168) (see 3 papers)
O35000 glucosamine-6-phosphate deaminase (EC 3.5.99.6) from Bacillus subtilis (see paper)
2bkvB / O35000 Structure and kinetics of a monomeric glucosamine-6-phosphate deaminase: missing link of the nagb superfamily (see paper)
NP_391382 glucosamine-6-phosphate isomerase from Bacillus subtilis subsp. subtilis str. 168
42% identity, 74% coverage

LSA0417 Glucosamine-6-phosphate deaminase from Lactobacillus sakei subsp. sakei 23K
44% identity, 77% coverage

LGG_02913 glucosamine-6-phosphate deaminase / glucosamine-6-phosphate isomerase (GNPDA), GlcN6P deaminase from Lactobacillus rhamnosus GG
43% identity, 75% coverage

RBAM_RS16295 glucosamine-6-phosphate deaminase from Bacillus velezensis FZB42
41% identity, 73% coverage

lp_0226 glucosamine-6-phosphate isomerase from Lactobacillus plantarum WCFS1
41% identity, 76% coverage

LSEI_2889 glucosamine-6-phosphate deaminase from Lacticaseibacillus paracasei ATCC 334
LSEI_2889 Glucosamine-6-phosphate isomerase from Lactobacillus casei ATCC 334
43% identity, 77% coverage

HR078_10525 glucosamine-6-phosphate deaminase from Lactobacillus delbrueckii subsp. lactis
42% identity, 76% coverage

A4X018 Glucosamine-6-phosphate deaminase from Cereibacter sphaeroides (strain ATCC 17025 / ATH 2.4.3)
42% identity, 68% coverage

LBCZ_2712 glucosamine-6-phosphate deaminase from Lacticaseibacillus casei DSM 20011 = JCM 1134 = ATCC 393
41% identity, 77% coverage

LMRG_02056 glucosamine-6-phosphate isomerase from Listeria monocytogenes 10403S
OCPFDLNE_01017 glucosamine-6-phosphate deaminase from Listeria monocytogenes
42% identity, 76% coverage

BTF1_18585 glucosamine-6-phosphate deaminase from Bacillus thuringiensis HD-789
38% identity, 69% coverage

Q8Y8E7 Glucosamine-6-phosphate deaminase from Listeria monocytogenes serovar 1/2a (strain ATCC BAA-679 / EGD-e)
lmo0957 similar to glucosamine-6-Phoasphate isomerase (EC 5.3.1.10) from Listeria monocytogenes EGD-e
42% identity, 76% coverage

Fjoh_4557 glucosamine-6-phosphate deaminase-like protein from Flavobacterium johnsoniae UW101
40% identity, 26% coverage

B1745_07135 glucosamine-6-phosphate deaminase from Lactobacillus amylolyticus
41% identity, 74% coverage

Amuc_1822 glucosamine-6-phosphate isomerase from Akkermansia muciniphila ATCC BAA-835
41% identity, 59% coverage

MED152_08485 glucosamine-6-phosphate deaminase from Polaribacter sp. MED152
40% identity, 27% coverage

LMOSA_18470 glucosamine-6-phosphate deaminase from Listeria monocytogenes str. Scott A
40% identity, 76% coverage

NH13_09645 glucosamine-6-phosphate deaminase from Lactobacillus acidophilus
LBA1948 glucosamine-6-phosphate isomerase from Lactobacillus acidophilus NCFM
39% identity, 74% coverage

Shew_0815 Glucosamine-6-phosphate deaminase (EC 3.5.99.6) from Shewanella loihica PV-4
40% identity, 66% coverage

AMUC_RS09725 glucosamine-6-phosphate deaminase from Akkermansia muciniphila ATCC BAA-835
41% identity, 72% coverage

SPSF3K_01753 glucosamine-6-phosphate deaminase from Streptococcus parauberis
40% identity, 76% coverage

A8YZR7 glucosamine-6-phosphate deaminase (EC 3.5.99.6) from Staphylococcus aureus (see paper)
SA0527 probable glucosamine-6-phosphate isomerase from Staphylococcus aureus subsp. aureus N315
USA300HOU_0563 glucosamine-6-phosphate deaminase from Staphylococcus aureus subsp. aureus USA300_TCH1516
40% identity, 69% coverage

L14408 glucosamine-6-P isomerase (EC 5.3.1.10) from Lactococcus lactis subsp. lactis Il1403
40% identity, 77% coverage

NCDO2118_1594 glucosamine-6-phosphate deaminase from Lactococcus lactis subsp. lactis NCDO 2118
40% identity, 77% coverage

LEGAS_1624 glucosamine-6-phosphate deaminase from Leuconostoc gasicomitatum LMG 18811
41% identity, 76% coverage

Q5HRH8 Glucosamine-6-phosphate deaminase from Staphylococcus epidermidis (strain ATCC 35984 / DSM 28319 / BCRC 17069 / CCUG 31568 / BM 3577 / RP62A)
38% identity, 72% coverage

M5005_Spy_1139 glucosamine-6-phosphate isomerase from Streptococcus pyogenes MGAS5005
SPy1399, SPy_1399 putative N-acetylglucosamine-6-phosphate isomerase from Streptococcus pyogenes M1 GAS
40% identity, 75% coverage

Spy49_1115c Glucosamine-6-phosphate deaminase from Streptococcus pyogenes NZ131
40% identity, 75% coverage

Ssal_01624 glucosamine-6-phosphate deaminase from Streptococcus salivarius 57.I
38% identity, 76% coverage

SSU05_0634 6-phosphogluconolactonase/Glucosamine-6- phosphate isomerase/deaminase from Streptococcus suis 05ZYH33
SSU10_RS03040 glucosamine-6-phosphate deaminase from Streptococcus suis
39% identity, 75% coverage

SmuNN2025_1351 putative N-acetylglucosamine-6-phosphate isomerase from Streptococcus mutans NN2025
39% identity, 76% coverage

2ri1A / Q8DV70 Crystal structure of glucosamine 6-phosphate deaminase (nagb) with glcn6p from s. Mutans (see paper)
39% identity, 76% coverage

EF0466 glucosamine-6-phosphate isomerase from Enterococcus faecalis V583
39% identity, 76% coverage

SXYL_00254 glucosamine-6-phosphate deaminase from Staphylococcus xylosus
39% identity, 71% coverage

nagB / A3CLX4 glucosamine-6-phosphate deaminase subunit (EC 3.5.99.6) from Streptococcus sanguinis (strain SK36) (see 2 papers)
39% identity, 76% coverage

EHI_174640 glucosamine-6-phosphate isomerase, putative from Entamoeba histolytica HM-1:IMSS
37% identity, 26% coverage

CA265_RS21925 Glucosamine-6-phosphate deaminase (EC 3.5.99.6) from Pedobacter sp. GW460-11-11-14-LB5
39% identity, 27% coverage

SPCG_RS07255 glucosamine-6-phosphate deaminase from Streptococcus pneumoniae CGSP14
39% identity, 77% coverage

SPD_1246 glucosamine-6-phosphate isomerase from Streptococcus pneumoniae D39
39% identity, 77% coverage

spr1272 N-acetylglucosamine-6-phosphate isomerase from Streptococcus pneumoniae R6
39% identity, 75% coverage

SP_1415 glucosamine-6-phosphate isomerase from Streptococcus pneumoniae TIGR4
39% identity, 76% coverage

SGO_1586 glucosamine-6-phosphate isomerase from Streptococcus gordonii str. Challis substr. CH1
39% identity, 76% coverage

MSMEG_2118 glucosamine-6-phosphate isomerase from Mycobacterium smegmatis str. MC2 155
36% identity, 67% coverage

LMOh7858_1020 glucosamine-6-phosphate isomerase from Listeria monocytogenes str. 4b H7858
41% identity, 77% coverage

NAGB_STRCO / Q9K487 Glucosamine-6-phosphate deaminase; GlcN6P deaminase; GNPDA; Glucosamine-6-phosphate isomerase; EC 3.5.99.6 from Streptomyces coelicolor (strain ATCC BAA-471 / A3(2) / M145) (see paper)
SCO5236 glucosamine-6-phosphate deaminase from Streptomyces coelicolor A3(2)
34% identity, 67% coverage

cg2928 N-acetylglucosamine-6-phosphate isomerase from Corynebacterium glutamicum ATCC 13032
38% identity, 70% coverage

CPA40_RS06275 glucosamine-6-phosphate deaminase from Bifidobacterium callitrichos
35% identity, 65% coverage

BL1343 glucosamine-6-phosphate deaminase from Bifidobacterium longum NCC2705
33% identity, 65% coverage

Bbr_1248 glucosamine-6-phosphate deaminase from Bifidobacterium breve UCC2003
33% identity, 65% coverage

Bbr_0847 glucosamine-6-phosphate deaminase from Bifidobacterium breve UCC2003
34% identity, 65% coverage

Blon_0881 glucosamine-6-phosphate isomerase from Bifidobacterium longum subsp. infantis ATCC 15697
33% identity, 65% coverage

BCAL_RS02710 glucosamine-6-phosphate deaminase from Bifidobacterium callitrichos DSM 23973
34% identity, 65% coverage

BF3116 glucosamine-6-phosphate isomerase from Bacteroides fragilis YCH46
34% identity, 27% coverage

Q8AB53 Putative glucosamine-6-phosphate deaminase-like protein BT_0258 from Bacteroides thetaiotaomicron (strain ATCC 29148 / DSM 2079 / JCM 5827 / CCUG 10774 / NCTC 10582 / VPI-5482 / E50)
BT0258, BT_0258 glucosamine-6-phosphate isomerase from Bacteroides thetaiotaomicron VPI-5482
33% identity, 27% coverage

MSMEG_0501 glucosamine-6-phosphate deaminase 1 from Mycobacterium smegmatis str. MC2 155
36% identity, 72% coverage

Bbr_0169 glucosamine-6-phosphate deaminase from Bifidobacterium breve UCC2003
33% identity, 65% coverage

LMOSA_3160 glucosamine-6-phosphate deaminase from Listeria monocytogenes str. Scott A
31% identity, 73% coverage

OCPFDLNE_02454 glucosamine-6-phosphate deaminase from Listeria monocytogenes
31% identity, 73% coverage

lmo0877 similar to B. subtilis NagB protein (glucosamine-6-phosphate isomerase) from Listeria monocytogenes EGD-e
29% identity, 74% coverage

PGN_0606 glucosamine-6-phosphate isomerase from Porphyromonas gingivalis ATCC 33277
31% identity, 24% coverage

I7C3V5 glucosamine-6-phosphate deaminase (EC 3.5.99.6) from Gallus gallus (see paper)
52% identity, 37% coverage

BT3587 glucosamine-6-phosphate isomerase from Bacteroides thetaiotaomicron VPI-5482
30% identity, 67% coverage

U5LV87 glucosamine-6-phosphate deaminase (EC 3.5.99.6) from Gallus gallus (see paper)
51% identity, 37% coverage

KPN_04126 hypothetical protein from Klebsiella pneumoniae subsp. pneumoniae MGH 78578
30% identity, 65% coverage

VK055_3349 glucosamine-6-phosphate deaminase from Klebsiella pneumoniae subsp. pneumoniae
30% identity, 65% coverage

SSU0206 glucosamine-6-phosphate isomerase from Streptococcus suis P1/7
34% identity, 46% coverage

CTN_RS07115 6-phosphogluconolactonase from Thermotoga neapolitana DSM 4359
28% identity, 59% coverage

EF0451 glucosamine-6-phosphate isomerase, putative from Enterococcus faecalis V583
25% identity, 72% coverage

sll1479 glucose-6-P-dehydrogenase from Synechocystis sp. PCC 6803
23% identity, 60% coverage

YraG / b3141 putative deaminase AgaI from Escherichia coli K-12 substr. MG1655 (see 6 papers)
AGAI_ECOLI / P42912 Putative deaminase AgaI; EC 3.5.99.- from Escherichia coli (strain K12) (see paper)
NP_417610 putative deaminase AgaI from Escherichia coli str. K-12 substr. MG1655
b3141 galactosamine-6-phosphate isomerase from Escherichia coli str. K-12 substr. MG1655
24% identity, 69% coverage

alr1602 glucose-6-P-dehydrogenase from Nostoc sp. PCC 7120
23% identity, 65% coverage

SF3739 orf, conserved hypothetical protein from Shigella flexneri 2a str. 301
29% identity, 62% coverage

YieK / b3718 putative glucosamine-6-phosphate deaminase YieK from Escherichia coli K-12 substr. MG1655 (see 2 papers)
b3718 hypothetical protein from Escherichia coli str. K-12 substr. MG1655
29% identity, 62% coverage

For advice on how to use these tools together, see Interactive tools for functional annotation of bacterial genomes.

Statistics

The PaperBLAST database links 793,807 different protein sequences to 1,259,118 scientific articles. Searches against EuropePMC were last performed on March 13 2025.

How It Works

PaperBLAST builds a database of protein sequences that are linked to scientific articles. These links come from automated text searches against the articles in EuropePMC and from manually-curated information from GeneRIF, UniProtKB/Swiss-Prot, BRENDA, CAZy (as made available by dbCAN), BioLiP, CharProtDB, MetaCyc, EcoCyc, TCDB, REBASE, the Fitness Browser, and a subset of the European Nucleotide Archive with the /experiment tag. Given this database and a protein sequence query, PaperBLAST uses protein-protein BLAST to find similar sequences with E < 0.001.
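The E-value cutoff described above can be illustrated with a minimal sketch. It assumes the BLAST results are available as tab-separated lines in the standard 12-column tabular layout (qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore); the page does not say which output format PaperBLAST itself parses. The function name `filter_hits` and the two example lines are hypothetical and made up for illustration.

```python
from typing import Iterable, List, NamedTuple

E_VALUE_CUTOFF = 0.001  # threshold stated in the text (E < 0.001)


class Hit(NamedTuple):
    subject_id: str
    percent_identity: float
    e_value: float


def filter_hits(tabular_lines: Iterable[str],
                cutoff: float = E_VALUE_CUTOFF) -> List[Hit]:
    """Keep only hits whose E-value is strictly below the cutoff."""
    hits = []
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        # Column 2 = subject id, column 3 = percent identity,
        # column 11 = E-value (standard 12-column tabular layout).
        hit = Hit(fields[1], float(fields[2]), float(fields[10]))
        if hit.e_value < cutoff:
            hits.append(hit)
    return hits


# Made-up example lines, not real PaperBLAST output.
example = [
    "query\tsubjA\t51.0\t120\t55\t2\t1\t118\t3\t122\t1e-30\t110",
    "query\tsubjB\t23.0\t60\t40\t3\t5\t64\t10\t70\t0.5\t28.1",
]
kept = filter_hits(example)
```

In this sketch only `subjA` survives, because `subjB` has an E-value of 0.5, which is above the cutoff.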

To build the database, we query EuropePMC with locus tags, with RefSeq protein identifiers, and with UniProt accessions. We obtain the locus tags from RefSeq or from MicrobesOnline. We use queries of the form "locus_tag AND genus_name" to try to ensure that the paper is actually discussing that gene. Because EuropePMC indexes most recent biomedical papers, even if they are not open access, some of the links may be to papers that you cannot read or that our computers cannot read. We query each of these identifiers that appears in the open access part of EuropePMC, as well as every locus tag that appears in the 500 most-referenced genomes, so that a gene may appear in the PaperBLAST results even though none of the papers that mention it are open access. We also incorporate text-mined links from EuropePMC that link open access articles to UniProt or RefSeq identifiers. (This yields some additional links because EuropePMC uses different heuristics for their text mining than we do.)
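As a rough illustration of the "locus_tag AND genus_name" query form, the sketch below builds such a query string and a search URL. The endpoint and the `query` and `format` parameter names are assumptions about the Europe PMC REST search service; the page does not state how PaperBLAST submits its queries, and the helper names `build_query` and `build_search_url` are hypothetical. No network request is made.

```python
from urllib.parse import urlencode

# Assumed Europe PMC REST search endpoint (not stated in the text).
EUROPEPMC_SEARCH = "https://www.ebi.ac.uk/europepmc/webservices/rest/search"


def build_query(locus_tag: str, genus_name: str) -> str:
    """Combine a locus tag with its genus, as in 'locus_tag AND genus_name'."""
    return f"{locus_tag} AND {genus_name}"


def build_search_url(locus_tag: str, genus_name: str) -> str:
    """Return a search URL for the combined query (assumed parameters)."""
    params = {"query": build_query(locus_tag, genus_name), "format": "json"}
    return f"{EUROPEPMC_SEARCH}?{urlencode(params)}"


url = build_search_url("b0678", "Escherichia")
```

Requiring the genus name alongside the locus tag reduces false matches when a short identifier such as `b0678` happens to appear in an unrelated context.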

For every article that mentions a locus tag, a RefSeq protein identifier, or a UniProt accession, we try to select one or two snippets of text that refer to the protein. If we cannot get access to the full text, we try to select a snippet from the abstract, but unfortunately, unique identifiers such as locus tags are rarely provided in abstracts.
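Snippet selection of this kind might look like the following sketch, which splits text into sentences and keeps the first one or two that mention the identifier as a whole word. This is only an assumed heuristic for illustration; the page does not describe how PaperBLAST actually chooses snippets, and `select_snippets` and the sample text are hypothetical.

```python
import re
from typing import List


def select_snippets(text: str, identifier: str,
                    max_snippets: int = 2) -> List[str]:
    """Return up to max_snippets sentences that mention the identifier."""
    # Naive sentence split on terminal punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    # Match the identifier as a whole word so that b0678 does not match b06781.
    pattern = re.compile(r"\b" + re.escape(identifier) + r"\b")
    matches = [s for s in sentences if pattern.search(s)]
    return matches[:max_snippets]


# Made-up text for illustration only.
sample = (
    "We deleted nagB (b0678) in strain MG1655. "
    "Growth on glucosamine was abolished. "
    "The b0678 mutant was complemented in trans. "
    "Unrelated gene b06781 was unaffected."
)
snippets = select_snippets(sample, "b0678")
```

Applied to an abstract instead of the full text, the same function would often return an empty list, which matches the observation that locus tags are rarely given in abstracts.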

PaperBLAST also incorporates manually-curated protein functions:

Except for GeneRIF and ENA, the curated entries include a short curated description of the protein's function. For entries from BioLiP, the protein's function may not be known beyond binding to the ligand. Many of these entries also link to articles in PubMed.

For more information see the PaperBLAST paper (mSystems 2017) or the code. You can download PaperBLAST's database here.

Changes to PaperBLAST since the paper was written:

Many of these changes are described in Interactive tools for functional annotation of bacterial genomes.

Secrets

PaperBLAST cannot provide snippets for many of the papers that are published in non-open-access journals. This limitation applies even if the paper is marked as "free" on the publisher's web site and is available in PubMed Central or EuropePMC. If a journal that you publish in is marked as "secret," please consider publishing elsewhere.

Omissions from the PaperBLAST Database

Many important articles are missing from PaperBLAST, either because the article's full text is not in EuropePMC (as for many older articles), or because the paper does not mention a protein identifier such as a locus tag, or because of PaperBLAST's heuristics. If you notice an article that characterizes a protein's function but is missing from PaperBLAST, please notify the curators at UniProt or add an entry to GeneRIF. Entries in either of these databases will eventually be incorporated into PaperBLAST. Note that to add an entry to UniProt, you will need to find the UniProt identifier for the protein. If the protein is not already in UniProt, you can ask them to create an entry. To add an entry to GeneRIF, you will need an NCBI Gene identifier, but unfortunately many prokaryotic proteins in RefSeq do not have corresponding Gene identifiers.

References

PaperBLAST: Text-mining papers for information about homologs.
M. N. Price and A. P. Arkin (2017). mSystems, 10.1128/mSystems.00039-17.

Europe PMC in 2017.
M. Levchenko et al (2017). Nucleic Acids Research, 10.1093/nar/gkx1005.

Gene indexing: characterization and analysis of NLM's GeneRIFs.
J. A. Mitchell et al (2003). AMIA Annu Symp Proc 2003:460-464.

UniProt: the universal protein knowledgebase.
The UniProt Consortium (2016). Nucleic Acids Research, 10.1093/nar/gkw1099.

BRENDA in 2017: new perspectives and new tools in BRENDA.
S. Placzek et al (2017). Nucleic Acids Research, 10.1093/nar/gkw952.

The EcoCyc database: reflecting new knowledge about Escherichia coli K-12.
I. M. Keseler et al (2016). Nucleic Acids Research, 10.1093/nar/gkw1003.

The MetaCyc database of metabolic pathways and enzymes.
R. Caspi et al (2018). Nucleic Acids Research, 10.1093/nar/gkx935.

CharProtDB: a database of experimentally characterized protein annotations.
R. Madupu et al (2012). Nucleic Acids Research, 10.1093/nar/gkr1133.

The carbohydrate-active enzymes database (CAZy) in 2013.
V. Lombard et al (2014). Nucleic Acids Research, 10.1093/nar/gkt1178.

The Transporter Classification Database (TCDB): recent advances.
M. H. Saier, Jr. et al (2016). Nucleic Acids Research, 10.1093/nar/gkv1103.

REBASE - a database for DNA restriction and modification: enzymes, genes and genomes.
R. J. Roberts et al (2015). Nucleic Acids Research, 10.1093/nar/gku1046.

Deep annotation of protein function across diverse bacteria from mutant phenotypes.
M. N. Price et al (2016). bioRxiv, 10.1101/072470.

by Morgan Price, Arkin group
Lawrence Berkeley National Laboratory