PaperBLAST – Find papers about a protein or its homologs

PaperBLAST Hits for reanno::HerbieS:HSERO_RS17015 sorbitol dehydrogenase (EC 1.1.1.14); xylitol dehydrogenase (EC 1.1.1.9) (Herbaspirillum seropedicae SmR1) (345 a.a., MQALVLEATR...)

Other sequence analysis tools:

Find functional residues: SitesBLAST

Search for conserved domains

Find the best match in UniProt

Compare to protein structures

Predict transmembrane helices: Phobius

Predict protein localization: PSORTb

Find homologs in fast.genomics

Found 251 similar proteins in the literature:

HSERO_RS17015 sorbitol dehydrogenase (EC 1.1.1.14); xylitol dehydrogenase (EC 1.1.1.9) from Herbaspirillum seropedicae SmR1
100% identity, 100% coverage

Q2K0Q7 D-xylulose reductase (EC 1.1.1.9) from Rhizobium etli (see paper)
61% identity, 99% coverage

PhaeoP88_01862 NAD(P)-dependent alcohol dehydrogenase from Phaeobacter inhibens
60% identity, 98% coverage

A9H073 Putative D-xylulose reductase from Gluconacetobacter diazotrophicus (strain ATCC 49037 / DSM 5601 / CCUG 37298 / CIP 103539 / LMG 7603 / PAl5)
GDI_3142 NAD(P)-dependent alcohol dehydrogenase from Gluconacetobacter diazotrophicus PA1 5
63% identity, 99% coverage

Dshi_0551 D-xylulose reductase (EC 1.1.1.9) from Dinoroseobacter shibae DFL-12
61% identity, 99% coverage

BAD_RS01695 NAD(P)-dependent alcohol dehydrogenase from Bifidobacterium adolescentis ATCC 15703
56% identity, 99% coverage

BKKJ1_0339 NAD(P)-dependent alcohol dehydrogenase from Bifidobacterium catenulatum subsp. kashiwanohense
55% identity, 99% coverage

GFR01_RS14945 NAD(P)-dependent alcohol dehydrogenase from Gluconobacter frateurii
64% identity, 99% coverage

NRBB01_1662 NAD(P)-dependent alcohol dehydrogenase from Bifidobacterium breve
53% identity, 99% coverage

A4I8R5 Putative d-xylulose reductase from Leishmania infantum
LINJ_33_0530 putative d-xylulose reductase from Leishmania infantum JPCM5
52% identity, 99% coverage

Q59545 xylitol dehydrogenase (EC 1.1.1.9) from Morganella morganii (see paper)
Q59545 D-xylulose reductase from Morganella morganii
56% identity, 96% coverage

DHSO_RAT / P27867 Sorbitol dehydrogenase; SDH; L-iditol 2-dehydrogenase; Polyol dehydrogenase; Xylitol dehydrogenase; XDH; EC 1.1.1.-; EC 1.1.1.14; EC 1.1.1.9 from Rattus norvegicus (Rat) (see 5 papers)
41% identity, 95% coverage

XP_017446963 sorbitol dehydrogenase isoform X1 from Rattus norvegicus
41% identity, 96% coverage

DHSO_MOUSE / Q64442 Sorbitol dehydrogenase; SDH; SORD; L-iditol 2-dehydrogenase; Polyol dehydrogenase; Xylitol dehydrogenase; XDH; EC 1.1.1.-; EC 1.1.1.14; EC 1.1.1.9 from Mus musculus (Mouse) (see 3 papers)
NP_666238 sorbitol dehydrogenase from Mus musculus
40% identity, 95% coverage

DHSO_BOVIN / Q58D31 Sorbitol dehydrogenase; SDH; L-iditol 2-dehydrogenase; Polyol dehydrogenase; Xylitol dehydrogenase; XDH; EC 1.1.1.-; EC 1.1.1.14; EC 1.1.1.9 from Bos taurus (Bovine) (see paper)
40% identity, 96% coverage

XP_004010700 sorbitol dehydrogenase from Ovis aries
40% identity, 96% coverage

3qe3A / P07846 Sheep liver sorbitol dehydrogenase (see paper)
40% identity, 97% coverage

NP_001280132 sorbitol dehydrogenase from Gallus gallus
40% identity, 96% coverage

V9HW89 Sorbitol dehydrogenase from Homo sapiens
41% identity, 92% coverage

1pl6A / Q00796 Human sdh/nadh/inhibitor complex (see paper)
41% identity, 93% coverage

DHSO_HUMAN / Q00796 Sorbitol dehydrogenase; SDH; (R,R)-butanediol dehydrogenase; L-iditol 2-dehydrogenase; Polyol dehydrogenase; Ribitol dehydrogenase; RDH; Xylitol dehydrogenase; XDH; EC 1.1.1.-; EC 1.1.1.4; EC 1.1.1.14; EC 1.1.1.56; EC 1.1.1.9 from Homo sapiens (Human) (see 8 papers)
NP_003095 sorbitol dehydrogenase from Homo sapiens
41% identity, 92% coverage

DHSO_CHICK / P0DMQ6 Sorbitol dehydrogenase; SDH; Polyol dehydrogenase; EC 1.1.1.- from Gallus gallus (Chicken) (see paper)
40% identity, 96% coverage

W5QIT0 Sorbitol dehydrogenase from Ovis aries
40% identity, 95% coverage

XYL2_YEAST / Q07993 D-xylulose reductase; Xylitol dehydrogenase; XDH; EC 1.1.1.9 from Saccharomyces cerevisiae (strain ATCC 204508 / S288c) (Baker's yeast) (see paper)
Q07993 D-xylulose reductase (EC 1.1.1.9) from Saccharomyces cerevisiae (see paper)
YLR070C Xyl2p from Saccharomyces cerevisiae
40% identity, 96% coverage

DHSO_SHEEP / P07846 Sorbitol dehydrogenase; SDH; L-iditol 2-dehydrogenase; Polyol dehydrogenase; Xylitol dehydrogenase; XDH; EC 1.1.1.-; EC 1.1.1.14; EC 1.1.1.9 from Ovis aries (Sheep) (see 3 papers)
39% identity, 96% coverage

A0A3S7PMC4 D-xylulose reductase (EC 1.1.1.9) from Torulaspora delbrueckii (see paper)
41% identity, 93% coverage

HAH_5138 NAD(P)-dependent alcohol dehydrogenase from Haloarcula hispanica ATCC 33960
G0I050 Zn-dependent oxidoreductase / NADPH2:quinone reductase from Haloarcula hispanica (strain ATCC 33960 / DSM 4426 / JCM 8911 / NBRC 102182 / NCIMB 2187 / VKM B-1755)
43% identity, 100% coverage

Q5V6U8 NAD(P)-dependent alcohol dehydrogenase from Haloarcula marismortui (strain ATCC 43049 / DSM 3752 / JCM 8966 / VKM B-1809)
43% identity, 100% coverage

LOC103960512 sorbitol dehydrogenase from Pyrus x bretschneideri
42% identity, 86% coverage

LOC103333266 sorbitol dehydrogenase from Prunus mume
42% identity, 89% coverage

NAD-SDH / Q9ZR22 D-sorbitol dehydrogenase (EC 1.1.1.14) from Malus domestica (see paper)
41% identity, 88% coverage

Q9MBD7 NAD-dependent sorbitol dehydrogenase from Prunus persica
LOC18787602 sorbitol dehydrogenase from Prunus persica
41% identity, 89% coverage

Q3C2L6 L-iditol 2-dehydrogenase (EC 1.1.1.14) from Solanum lycopersicum (see paper)
41% identity, 92% coverage

LOC107448906 sorbitol dehydrogenase from Parasteatoda tepidariorum
35% identity, 96% coverage

Q5I6M4 L-iditol 2-dehydrogenase (EC 1.1.1.14) from Malus domestica (see paper)
41% identity, 88% coverage

DHSO_ARATH / Q9FJ95 Sorbitol dehydrogenase; SDH; Polyol dehydrogenase; Ribitol dehydrogenase; RDH; Xylitol dehydrogenase; XDH; EC 1.1.1.-; EC 1.1.1.56; EC 1.1.1.9 from Arabidopsis thaliana (Mouse-ear cress) (see paper)
AT5G51970 sorbitol dehydrogenase, putative / L-iditol 2-dehydrogenase, putative from Arabidopsis thaliana
NP_200010 GroES-like zinc-binding alcohol dehydrogenase family protein from Arabidopsis thaliana
42% identity, 88% coverage

Q0QWI2 Sorbitol dehydrogenase from Zea mays
42% identity, 90% coverage

Q5I6M3 L-iditol 2-dehydrogenase (EC 1.1.1.14) from Malus domestica (see paper)
42% identity, 86% coverage

xdh1 / Q876R2 D-sorbitol dehydrogenase (EC 1.1.1.14; EC 1.1.1.9) from Hypocrea jecorina (see paper)
40% identity, 93% coverage

M0HN94 Zinc-binding dehydrogenase from Haloferax gibbonsii (strain ATCC 33959 / DSM 4427 / JCM 8863 / NBRC 102184 / NCIMB 2188 / Ma 2.38)
40% identity, 98% coverage

O96496 Sorbitol dehydrogenase from Bemisia argentifolii
39% identity, 96% coverage

F2CYT1 Predicted protein from Hordeum vulgare subsp. vulgare
41% identity, 90% coverage

DQ124868 / Q1PSI9 L-idonate 5-dehydrogenase (EC 1.1.1.366) from Vitis vinifera (see paper)
IDND_VITVI / Q1PSI9 L-idonate 5-dehydrogenase; EC 1.1.1.366 from Vitis vinifera (Grape) (see paper)
Q1PSI9 L-idonate 5-dehydrogenase (EC 1.1.1.264); L-idonate 5-dehydrogenase (NAD+) (EC 1.1.1.366) from Vitis vinifera (see 3 papers)
40% identity, 89% coverage

1e3jA / O96496 Ketose reductase (sorbitol dehydrogenase) from silverleaf whitefly (see paper)
39% identity, 97% coverage

Q6ZBH2 Os08g0545200 protein from Oryza sativa subsp. japonica
41% identity, 89% coverage

D7TMY3 Enoyl reductase (ER) domain-containing protein from Vitis vinifera
41% identity, 90% coverage

NCU00891 xylitol dehydrogenase from Neurospora crassa OR74A
39% identity, 86% coverage

B8B9C5 Enoyl reductase (ER) domain-containing protein from Oryza sativa subsp. indica
41% identity, 89% coverage

LOC126998993 sorbitol dehydrogenase-like from Eriocheir sinensis
39% identity, 89% coverage

Q07786 L-iditol 2-dehydrogenase (EC 1.1.1.14) from Saccharomyces cerevisiae (see 2 papers)
YDL246C Protein of unknown function, computational analysis of large-scale protein-protein interaction data suggests a possible role in fructose or mannose metabolism from Saccharomyces cerevisiae
40% identity, 91% coverage

DHSO1_YEAST / P35497 Sorbitol dehydrogenase 1; SDH 1; Polyol dehydrogenase; Xylitol dehydrogenase; EC 1.1.1.-; EC 1.1.1.9 from Saccharomyces cerevisiae (strain ATCC 204508 / S288c) (Baker's yeast) (see paper)
P35497 L-iditol 2-dehydrogenase (EC 1.1.1.14) from Saccharomyces cerevisiae (see 2 papers)
YJR159W Sor1p from Saccharomyces cerevisiae
40% identity, 91% coverage

XP_002318842 L-idonate 5-dehydrogenase from Populus trichocarpa
38% identity, 89% coverage

NP_001149440 sorbitol dehydrogenase homolog 1 from Zea mays
42% identity, 90% coverage

LOC103933319 sorbitol dehydrogenase from Pyrus x bretschneideri
41% identity, 86% coverage

NP_001188510 sorbitol dehydrogenase-2b from Bombyx mori
37% identity, 94% coverage

gutB / Q06004 glucitol dehydrogenase monomer (EC 1.1.1.9; EC 1.1.1.14) from Bacillus subtilis (strain 168) (see paper)
DHSO_BACSU / Q06004 Sorbitol dehydrogenase; SDH; Glucitol dehydrogenase; L-iditol 2-dehydrogenase; Polyol dehydrogenase; Xylitol dehydrogenase; EC 1.1.1.-; EC 1.1.1.14; EC 1.1.1.9 from Bacillus subtilis (strain 168) (see paper)
gutB / GI|304153 L-iditol 2-dehydrogenase; EC 1.1.1.14 from Bacillus subtilis subsp. subtilis str. 168 (see 3 papers)
gutB / AAA22508.1 sorbitol dehydrogenase from Bacillus subtilis (see paper)
BSU06150 glucitol (sorbitol) dehydrogenase from Bacillus subtilis subsp. subtilis str. 168
35% identity, 97% coverage

LOC105669158 sorbitol dehydrogenase from Linepithema humile
36% identity, 97% coverage

LOC408871 sorbitol dehydrogenase from Apis mellifera
38% identity, 94% coverage

S6BFC0 D-xylulose reductase (EC 1.1.1.9) from Rhizomucor pusillus (see paper)
38% identity, 94% coverage

B6TEC1 Sorbitol dehydrogenase from Zea mays
42% identity, 90% coverage

xdhA / Q5GN51 D-xylulose reductase (EC 1.1.1.9) from Aspergillus niger (see paper)
An12g00030 uncharacterized protein from Aspergillus niger
38% identity, 92% coverage

xdhA / Q86ZV0 NAD+-dependent xylitol dehydrogenase (EC 1.1.1.9) from Aspergillus oryzae (strain ATCC 42149 / RIB 40) (see paper)
XYL2_ASPOR / Q86ZV0 D-xylulose reductase A; Xylitol dehydrogenase A; EC 1.1.1.9 from Aspergillus oryzae (strain ATCC 42149 / RIB 40) (Yellow koji mold) (see paper)
Q86ZV0 D-xylulose reductase (EC 1.1.1.9) from Aspergillus oryzae (see 3 papers)
GI|83774265 xylitol dehydrogenase; EC 1.1.1.9 from Aspergillus oryzae (see paper)
xdhA / BAC75870.2 xylitol dehydrogenase from Aspergillus oryzae (see paper)
AO090038000631 No description from Aspergillus oryzae RIB40
38% identity, 92% coverage

A0A3S7PMB5 D-xylulose reductase (EC 1.1.1.9) from Pichia kudriavzevii (see paper)
38% identity, 93% coverage

LOC105679484 sorbitol dehydrogenase-like from Linepithema humile
36% identity, 97% coverage

YdjJ / b1774 putative zinc-binding dehydrogenase YdjJ from Escherichia coli K-12 substr. MG1655 (see 2 papers)
b1774 predicted oxidoreductase, Zn-dependent and NAD(P)-binding from Escherichia coli str. K-12 substr. MG1655
38% identity, 97% coverage

B8B9C4 Enoyl reductase (ER) domain-containing protein from Oryza sativa subsp. indica
40% identity, 91% coverage

5vm2A / P77280 Crystal structure of eck1772, an oxidoreductase/dehydrogenase of unknown specificity involved in membrane biogenesis from escherichia coli
38% identity, 97% coverage

Cthe_2445 Alcohol dehydrogenase GroES-like protein from Clostridium thermocellum ATCC 27405
Clo1313_0076 NAD(P)-dependent alcohol dehydrogenase from Acetivibrio thermocellus DSM 1313
34% identity, 99% coverage

Afu1g11020 L-arabinitol 4-dehydrogenase from Aspergillus fumigatus Af293
35% identity, 93% coverage

NP_524311 sorbitol dehydrogenase 2 from Drosophila melanogaster
37% identity, 94% coverage

NP_477348 sorbitol dehydrogenase 1, isoform A from Drosophila melanogaster
36% identity, 94% coverage

An08g09380 uncharacterized protein from Aspergillus niger
36% identity, 88% coverage

KLMA_70044 sorbitol dehydrogenase 1 from Kluyveromyces marxianus DMKU3-1042
39% identity, 92% coverage

G3AIP8 D-xylulose reductase (EC 1.1.1.9) from Spathaspora passalidarum (see paper)
37% identity, 91% coverage

LAD_ASPOZ / Q763T4 L-arabinitol 4-dehydrogenase; LAD; EC 1.1.1.12 from Aspergillus oryzae (Yellow koji mold) (see paper)
Q763T4 L-arabinitol 4-dehydrogenase (EC 1.1.1.12) from Aspergillus oryzae (see 3 papers)
AO090005001078 No description from Aspergillus oryzae RIB40
34% identity, 86% coverage

ladB / A2R6Z2 D-galactitol dehydrogenase from Aspergillus niger (strain ATCC MYA-4892 / CBS 513.88 / FGSC A1513) (see paper)
An16g01710 uncharacterized protein from Aspergillus niger
A2R6Z2 D-xylulose reductase from Aspergillus niger (strain ATCC MYA-4892 / CBS 513.88 / FGSC A1513)
33% identity, 96% coverage

PPTG_17182 chlorophyll synthesis pathway protein BchC from Phytophthora nicotianae INRA-310
38% identity, 92% coverage

SS1G_05959 hypothetical protein from Sclerotinia sclerotiorum 1980 UF-70
37% identity, 93% coverage

LAD_NEUCR / Q7SI09 L-arabinitol 4-dehydrogenase; LAD; EC 1.1.1.12 from Neurospora crassa (strain ATCC 24698 / 74-OR23-1A / CBS 708.71 / DSM 1257 / FGSC 987) (see 2 papers)
Q7SI09 L-arabinitol 4-dehydrogenase (EC 1.1.1.12) from Neurospora crassa (see paper)
NCU00643 L-arabinitol 4-dehydrogenase from Neurospora crassa OR74A
33% identity, 91% coverage

3m6iA / Q7SI09 L-arabinitol 4-dehydrogenase (see paper)
33% identity, 92% coverage

PITG_04121 sorbitol dehydrogenase, putative from Phytophthora infestans T30-4
38% identity, 92% coverage

PICST_86924, XP_001386982 D-xylulose reductase (Xylitol dehydrogenase) (XDH) from Scheffersomyces stipitis CBS 6054
P22144 D-xylulose reductase from Scheffersomyces stipitis (strain ATCC 58785 / CBS 6054 / NBRC 10063 / NRRL Y-11545)
35% identity, 91% coverage

G3AIB3 D-xylulose reductase (EC 1.1.1.9) from Spathaspora passalidarum (see paper)
37% identity, 91% coverage

7y9pA / P22144 Xylitol dehydrogenase s96c/s99c/y102c mutant(thermostabilized form) from pichia stipitis (see paper)
35% identity, 93% coverage

XYL2 D-xylulose reductase from Candida albicans (see 2 papers)
XP_719434 L-iditol 2-dehydrogenase from Candida albicans SC5314
35% identity, 92% coverage

LAD_PENRW / B6HI95 L-arabinitol 4-dehydrogenase; LAD; EC 1.1.1.12 from Penicillium rubens (strain ATCC 28089 / DSM 1075 / NRRL 1951 / Wisconsin 54-1255) (Penicillium chrysogenum) (see paper)
B6HI95 L-arabinitol 4-dehydrogenase (EC 1.1.1.12) from Penicillium chrysogenum (see paper)
33% identity, 85% coverage

ladA putative L-arabinitol 4-dehydrogenase from Emericella nidulans (see paper)
33% identity, 85% coverage

Q6KAV2 D-xylulose reductase (EC 1.1.1.9) from Blastobotrys adeninivorans (see paper)
39% identity, 90% coverage

lad1 / Q96V44 D-galactitol dehydrogenase (EC 1.1.1.12) from Hypocrea jecorina (see paper)
LAD_HYPJE / Q96V44 L-arabinitol 4-dehydrogenase; LAD; EC 1.1.1.12 from Hypocrea jecorina (Trichoderma reesei) (see 4 papers)
Q96V44 L-arabinitol 4-dehydrogenase (EC 1.1.1.12) from Trichoderma reesei (see 5 papers)
34% identity, 84% coverage

H6WCP4 L-arabinitol 4-dehydrogenase (EC 1.1.1.12) from Aspergillus tubingensis (see paper)
33% identity, 85% coverage

ladA / A2QAC0 L-arabinitol 4-dehydrogenase (EC 1.1.1.9; EC 1.1.1.12) from Aspergillus niger (strain ATCC MYA-4892 / CBS 513.88 / FGSC A1513) (see 2 papers)
LAD_ASPNC / A2QAC0 L-arabinitol 4-dehydrogenase; LAD; EC 1.1.1.12 from Aspergillus niger (strain ATCC MYA-4892 / CBS 513.88 / FGSC A1513) (see 3 papers)
A2QAC0 L-arabinitol 4-dehydrogenase (EC 1.1.1.12) from Aspergillus niger (see 2 papers)
An01g10920 uncharacterized protein from Aspergillus niger
33% identity, 85% coverage

Afu8g02000 sorbitol/xylitol dehydrogenase, putative from Aspergillus fumigatus Af293
34% identity, 90% coverage

LAD_TALEM / C5J3R8 L-arabinitol 4-dehydrogenase; LAD; EC 1.1.1.12 from Talaromyces emersonii (Thermophilic fungus) (Rasamsonia emersonii) (see paper)
C5J3R8 L-arabinitol 4-dehydrogenase (EC 1.1.1.12) from Rasamsonia emersonii (see paper)
34% identity, 85% coverage

CNAG_00115 chlorophyll synthesis pathway protein BchC from Cryptococcus neoformans var. grubii H99
35% identity, 89% coverage

CNJ01090 xylitol dehydrogenase from Cryptococcus neoformans var. neoformans JEC21
35% identity, 89% coverage

sdhA / A2QM95 L-arabinitol dehydrogenase (EC 1.1.1.12; EC 1.1.1.9; EC 1.1.1.14) from Aspergillus niger (strain ATCC MYA-4892 / CBS 513.88 / FGSC A1513) (see 3 papers)
An07g01290 uncharacterized protein from Aspergillus niger
34% identity, 92% coverage

CNA01050 sorbitol dehydrogenase from Cryptococcus neoformans var. neoformans JEC21
35% identity, 89% coverage

Atu1408 sorbitol dehydrogenase from Agrobacterium tumefaciens str. C58 (Cereon)
35% identity, 99% coverage

RHTO_01629 L-iditol 2-dehydrogenase from Rhodotorula toruloides NP11
36% identity, 80% coverage

AO090020000635 No description from Aspergillus oryzae RIB40
35% identity, 89% coverage

B4DKI2 Sorbitol dehydrogenase from Homo sapiens
37% identity, 73% coverage

BCAM1704 2,3-butanediol dehydrogenase from Burkholderia cenocepacia J2315
33% identity, 93% coverage

CCM_06561 sorbitol dehydrogenase from Cordyceps militaris CM01
33% identity, 91% coverage

WP_039351048 2,3-butanediol dehydrogenase from Burkholderia contaminans
32% identity, 97% coverage

DHSO_SCHPO / P36624 Sorbitol dehydrogenase; SDH; Polyol dehydrogenase; Protein tms1; EC 1.1.1.- from Schizosaccharomyces pombe (strain 972 / ATCC 24843) (Fission yeast) (see paper)
tms1 / RF|NP_595120.1 hexitol dehydrogenase (predicted); EC 1.1.1.- from Schizosaccharomyces pombe (see 2 papers)
SPBC1773.05c hexitol dehydrogenase (predicted) from Schizosaccharomyces pombe
32% identity, 94% coverage

A0A1B4XTS0 L-arabinitol 4-dehydrogenase (EC 1.1.1.12); D-xylulose reductase (EC 1.1.1.9) from Meyerozyma caribbica (see paper)
34% identity, 95% coverage

PGUG_01218 uncharacterized protein from Meyerozyma guilliermondii ATCC 6260
32% identity, 91% coverage

PGUG_05726 uncharacterized protein from Meyerozyma guilliermondii ATCC 6260
34% identity, 78% coverage

CTK_RS07810 2,3-butanediol dehydrogenase from Clostridium tyrobutyricum
30% identity, 96% coverage

CIBE_1696 2,3-butanediol dehydrogenase from Clostridium beijerinckii
Cbei_1464 alcohol dehydrogenase from Clostridium beijerinckii NCIMB 8052
30% identity, 95% coverage

BC0668, NP_830481 (R,R)-butanediol dehydrogenase from Bacillus cereus ATCC 14579
32% identity, 97% coverage

WP_039344544 2,3-butanediol dehydrogenase from Burkholderia contaminans
33% identity, 92% coverage

Q93R65 Acetylacetoin reductase from Bacillus cereus
32% identity, 97% coverage

BCV53_03685 2,3-butanediol dehydrogenase from Parageobacillus thermoglucosidasius
31% identity, 98% coverage

SGO_0440 L-iditol 2-dehydrogenase BH3949 from Streptococcus gordonii str. Challis substr. CH1
31% identity, 99% coverage

DMR38_12050 2,3-butanediol dehydrogenase from Clostridium sp. AWRP
29% identity, 96% coverage

AXG94_01200 2,3-butanediol dehydrogenase from Pseudomonas corrugata
31% identity, 96% coverage

Q9HWM8 (R,R)-butanediol dehydrogenase (EC 1.1.1.4) from Pseudomonas aeruginosa (see paper)
PA4153 2,3-butanediol dehydrogenase from Pseudomonas aeruginosa PAO1
31% identity, 93% coverage

CAETHG_0385, CLJU_c23220 2,3-butanediol dehydrogenase from Clostridium autoethanogenum DSM 10061
29% identity, 96% coverage

ELZ14_17085 2,3-butanediol dehydrogenase from Pseudomonas brassicacearum
31% identity, 96% coverage

F8TEL7 (R,R)-butanediol dehydrogenase (EC 1.1.1.4) from Clostridium autoethanogenum (see 2 papers)
29% identity, 96% coverage

BAS0641 alcohol dehydrogenase, zinc-containing from Bacillus anthracis str. Sterne
32% identity, 97% coverage

A0U95_29290 2,3-butanediol dehydrogenase from Pseudomonas brassicacearum
31% identity, 96% coverage

CH_000557 (R,R)-butanediol dehydrogenase; EC 1.1.1.4 from Pseudomonas putida (see paper)
adh / AAB58982.1 2,3-butanediol dehydrogenase from Pseudomonas putida (see paper)
30% identity, 94% coverage

PP0552, PP_0552 2,3-butanediol dehydrogenase from Pseudomonas putida KT2440
30% identity, 94% coverage

PPE_03421 2,3-butanediol dehydrogenase from Paenibacillus polymyxa E681
31% identity, 97% coverage

SSA_0572 Dehydrogenase, putative from Streptococcus sanguinis SK36
30% identity, 99% coverage

MLD56_18150 2,3-butanediol dehydrogenase from Paenibacillus peoriae
31% identity, 97% coverage

A0A3Q8GZQ4 (R,R)-butanediol dehydrogenase (EC 1.1.1.4) from Paenibacillus brasilensis (see paper)
31% identity, 97% coverage

Atu4740 zinc-binding dehydrogenase from Agrobacterium tumefaciens str. C58 (Cereon)
30% identity, 90% coverage

E7EKB8 (R,R)-butanediol dehydrogenase (EC 1.1.1.4) from Paenibacillus polymyxa (see paper)
31% identity, 97% coverage

An09g03900 uncharacterized protein from Aspergillus niger
32% identity, 84% coverage

CNA02580 sorbitol dehydrogenase from Cryptococcus neoformans var. neoformans JEC21
30% identity, 81% coverage

CNAG_00269 sorbitol dehydrogenase from Cryptococcus neoformans var. grubii H99
30% identity, 76% coverage

jpw_02880 2,3-butanediol dehydrogenase from Pseudomonas asiatica
30% identity, 96% coverage

NCU01905 sorbitol dehydrogenase from Neurospora crassa OR74A
30% identity, 79% coverage

SMb20445 putative alcohol dehydrogenase protein from Sinorhizobium meliloti 1021
34% identity, 95% coverage

Pc16g12970 uncharacterized protein from Penicillium rubens
30% identity, 83% coverage

CD0490 putative sugar-phosphate dehydrogenase from Clostridium difficile 630
28% identity, 94% coverage

DDGAH_PSEA6 / Q15SS1 2-dehydro-3-deoxy-L-galactonate 5-dehydrogenase; 2-keto-3-deoxy-L-galactonate 5-dehydrogenase; EC 1.1.1.389 from Pseudoalteromonas atlantica (strain T6c / ATCC BAA-1087)
34% identity, 92% coverage

Pcar_0330 2,3-butanediol dehydrogenase from Pelobacter carbinolicus str. DSM 2380
30% identity, 96% coverage

BH3949 L-iditol 2-dehydrogenase from Bacillus halodurans C-125
30% identity, 91% coverage

SEN1433 putative hexonate dehydrogenase from Salmonella enterica subsp. enterica serovar Enteritidis str. P125109
33% identity, 92% coverage

L2164_19520 L-idonate 5-dehydrogenase from Pectobacterium brasiliense
31% identity, 91% coverage

Ppro_1043 Alcohol dehydrogenase GroES domain protein from Pelobacter propionicus DSM 2379
31% identity, 96% coverage

BMMGA3_RS07345 zinc-binding dehydrogenase from Bacillus methanolicus MGA3
31% identity, 97% coverage

Swol_1727 zinc-binding dehydrogenase from Syntrophomonas wolfei subsp. wolfei str. Goettingen
30% identity, 93% coverage

PA14_10900 putative Zn-dependent alcohol dehydrogenase from Pseudomonas aeruginosa UCBPP-PA14
30% identity, 97% coverage

BSQ49_10255 2,3-butanediol dehydrogenase from Liquorilactobacillus hordei
29% identity, 97% coverage

LMRG_02209 alcohol dehydrogenase, zinc-dependent from Listeria monocytogenes 10403S
29% identity, 97% coverage

lmo0506 similar to polyol (sorbitol) dehydrogenase from Listeria monocytogenes EGD-e
31% identity, 94% coverage

lmo2664 / Q8Y413 pentitolphosphate dehydrogenase Lmo2664 (EC 1.1.1.301) from Listeria monocytogenes serovar 1/2a (strain ATCC BAA-679 / EGD-e) (see 2 papers)
lmo2664 similar to sorbitol dehydrogenase from Listeria monocytogenes EGD-e
29% identity, 97% coverage

PA4097 probable alcohol dehydrogenase (Zn-dependent) from Pseudomonas aeruginosa PAO1
31% identity, 92% coverage

Afu3g01490 alcohol dehydrogenase, putative from Aspergillus fumigatus Af293
38% identity, 66% coverage

SEN4237 l-idonate 5-dehydrogenase (ec 1.1.1.264) from Salmonella enterica subsp. enterica serovar Enteritidis str. P125109
29% identity, 99% coverage

PGA1_c34320 L-threonine 3-dehydrogenase (EC 1.1.1.103) from Phaeobacter inhibens DSM 17395
30% identity, 99% coverage

Clo1313_1833 zinc-binding dehydrogenase from Acetivibrio thermocellus DSM 1313
Cthe_0388 Alcohol dehydrogenase GroES-like protein from Clostridium thermocellum ATCC 27405
28% identity, 97% coverage

ECs4494 threonine dehydrogenase from Escherichia coli O157:H7 str. Sakai
29% identity, 98% coverage

PMI3178 L-threonine 3-dehydrogenase from Proteus mirabilis HI4320
29% identity, 98% coverage

Tdh / b3616 threonine dehydrogenase (EC 1.1.1.103) from Escherichia coli K-12 substr. MG1655 (see 5 papers)
tdh / P07913 threonine dehydrogenase (EC 1.1.1.103) from Escherichia coli (strain K12) (see 10 papers)
TDH_ECOLI / P07913 L-threonine 3-dehydrogenase; TDH; L-threonine dehydrogenase; EC 1.1.1.103 from Escherichia coli (strain K12) (see 2 papers)
P07913 L-threonine 3-dehydrogenase (EC 1.1.1.103) from Escherichia coli (see paper)
NP_418073 threonine dehydrogenase from Escherichia coli str. K-12 substr. MG1655
b3616 L-threonine 3-dehydrogenase from Escherichia coli str. K-12 substr. MG1655
WP_000646007 L-threonine 3-dehydrogenase from Escherichia coli
29% identity, 98% coverage

NMC0547 putative zinc-binding alcohol dehydrogenase from Neisseria meningitidis FAM18
A1KSL2 Zinc-binding alcohol dehydrogenase from Neisseria meningitidis serogroup C / serotype 2a (strain ATCC 700532 / DSM 15464 / FAM18)
29% identity, 92% coverage

NMB0604 alcohol dehydrogenase, zinc-containing from Neisseria meningitidis MC58
29% identity, 92% coverage

Z5043 threonine dehydrogenase from Escherichia coli O157:H7 EDL933
29% identity, 98% coverage

BDH_NEIG1 / Q5FA46 (R,R)-butanediol dehydrogenase; BDH; (2R,3R)-2,3-butanediol dehydrogenase; (2R,3R)-BDH; NgBDH; Acetoin/diacetyl reductase; EC 1.1.1.4 from Neisseria gonorrhoeae (strain ATCC 700825 / FA 1090) (see paper)
NGFG_00324, NGO_0186 2,3-butanediol dehydrogenase from Neisseria gonorrhoeae MS11
NGO0186 putative zinc-binding alcohol dehydrogenase from Neisseria gonorrhoeae FA 1090
29% identity, 92% coverage

YPO0060 threonine 3-dehydrogenase from Yersinia pestis CO92
29% identity, 96% coverage

UTI89_C4162 threonine 3-dehydrogenase from Escherichia coli UTI89
29% identity, 98% coverage

Q83F39 L-threonine 3-dehydrogenase from Coxiella burnetii (strain RSA 493 / Nine Mile phase I)
30% identity, 92% coverage

STM3708 threonine 3-dehydrogenase from Salmonella typhimurium LT2
29% identity, 96% coverage

YPTB0057 threonine 3-dehydrogenase from Yersinia pseudotuberculosis IP 32953
29% identity, 96% coverage

A6TFL2 L-threonine 3-dehydrogenase from Klebsiella pneumoniae subsp. pneumoniae (strain ATCC 700721 / MGH 78578)
29% identity, 96% coverage

SACOL0235 hexitol dehydrogenase from Staphylococcus aureus subsp. aureus COL
SAOUHSC_00219 hypothetical protein from Staphylococcus aureus subsp. aureus NCTC 8325
SAUSA300_0244 oxidoreductase, zinc-binding dehydrogenase family from Staphylococcus aureus subsp. aureus USA300_FPR3757
Newbould305_0793 galactitol-1-phosphate 5-dehydrogenase from Staphylococcus aureus subsp. aureus str. Newbould 305
30% identity, 98% coverage

SAR0245 putative zinc-binding dehydrogenase from Staphylococcus aureus subsp. aureus MRSA252
30% identity, 98% coverage

4cpdA / B2ZRE3 Alcohol dehydrogenase tadh from thermus sp. Atn1
32% identity, 95% coverage

B2ZRE3 alcohol dehydrogenase (EC 1.1.1.1) from Thermus sp. (see paper)
32% identity, 94% coverage

YjgV / b4267 L-idonate 5-dehydrogenase (EC 1.1.1.264) from Escherichia coli K-12 substr. MG1655 (see 5 papers)
idnD / P39346 L-idonate 5-dehydrogenase (EC 1.1.1.264) from Escherichia coli (strain K12) (see 3 papers)
IDND_ECOLI / P39346 L-idonate 5-dehydrogenase (NAD(P)(+)); EC 1.1.1.264 from Escherichia coli (strain K12) (see paper)
idnD / GB|AAC77224.1 L-idonate 5-dehydrogenase; EC 1.1.1.264 from Escherichia coli K12 (see 4 papers)
b4267 L-idonate 5-dehydrogenase, NAD-binding from Escherichia coli str. K-12 substr. MG1655
31% identity, 90% coverage

SA0240 hypothetical protein from Staphylococcus aureus subsp. aureus N315
30% identity, 98% coverage

CH51_RS01175 zinc-binding dehydrogenase from Staphylococcus aureus
30% identity, 95% coverage

6dkhC / P39346 The crystal structure of l-idonate 5-dehydrogenase from escherichia coli str. K-12 substr. Mg1655
31% identity, 90% coverage

STM4484 L-idonate 5-dehydrogenase from Salmonella typhimurium LT2
28% identity, 99% coverage

SA0239 sorbitol dehydrogenase from Staphylococcus aureus subsp. aureus N315
30% identity, 95% coverage

Q8KQG6 mannitol 2-dehydrogenase (EC 1.1.1.67) from Leuconostoc mesenteroides (see paper)
31% identity, 93% coverage

TDH_THET8 / Q5SKS4 L-threonine 3-dehydrogenase; TDH; EC 1.1.1.103 from Thermus thermophilus (strain ATCC 27634 / DSM 579 / HB8)
2dq4A / Q5SKS4 Crystal structure of threonine 3-dehydrogenase
31% identity, 99% coverage

SAOUHSC_00217 sorbitol dehydrogenase, putative from Staphylococcus aureus subsp. aureus NCTC 8325
30% identity, 95% coverage

DJ41_566 2,3-butanediol dehydrogenase from Acinetobacter baumannii ATCC 19606 = CIP 70.34 = JCM 6841
30% identity, 93% coverage

lgnH / BAM68211.1 L-gluconate dehydrogenase lgnH from Paracoccus laeviglucosivorans (see 2 papers)
K7ZKU8 2-desacetyl-2-hydroxyethyl bacteriochlorophyllide A dehydrogenase from Paracoccus laeviglucosivorans
33% identity, 99% coverage

D4GPB2 2-dehydro-3-deoxy-L-rhamnonate dehydrogenase (NAD+) (EC 1.1.1.401) from Haloferax volcanii (see paper)
30% identity, 95% coverage

apdH / Q8KQL2 D-arabitol-phosphate dehydrogenase monomer (EC 1.1.1.301) from Enterococcus avium (see paper)
ARPD_ENTAV / Q8KQL2 D-arabitol-phosphate dehydrogenase; APDH; EC 1.1.1.301 from Enterococcus avium (Streptococcus avium) (see paper)
Q8KQL2 D-arabitol-phosphate dehydrogenase (EC 1.1.1.301) from Enterococcus avium (see paper)
32% identity, 93% coverage

Q83VI5 mannitol 2-dehydrogenase (EC 1.1.1.67) from Leuconostoc pseudomesenteroides (see 3 papers)
31% identity, 94% coverage

MTCOM_21580 zinc-dependent dehydrogenase from Moorella thermoacetica
31% identity, 96% coverage

RSAU_000193 zinc-binding dehydrogenase from Staphylococcus aureus subsp. aureus 6850
30% identity, 95% coverage

Pden_4931 Alcohol dehydrogenase, zinc-binding domain protein from Paracoccus denitrificans PD1222
32% identity, 99% coverage

WP_000642458 2,3-butanediol dehydrogenase from Bacillus pacificus
27% identity, 100% coverage

Q57517 Uncharacterized zinc-type alcohol dehydrogenase-like protein HI_0053 from Haemophilus influenzae (strain ATCC 51907 / DSM 11121 / KW20 / Rd)
HI0053 zinc-type alcohol dehydrogenase from Haemophilus influenzae Rd KW20
29% identity, 97% coverage

CD630_23230 zinc-binding dehydrogenase from Clostridioides difficile 630
32% identity, 90% coverage

WP_135197409 2,3-butanediol dehydrogenase from Leuconostoc sp.
28% identity, 98% coverage

ACIAD1021 putative (R,R)-butanediol dehydrogenase from Acinetobacter sp. ADP1
29% identity, 88% coverage

Cbei_0544 alcohol dehydrogenase from Clostridium beijerinckii NCIMB 8052
30% identity, 92% coverage

eltD / A0QXD8 erythritol/L-threitol dehydrogenase (EC 1.1.1.56; EC 1.1.1.12; EC 1.1.1.9) from Mycolicibacterium smegmatis (strain ATCC 700084 / mc(2)155) (see paper)
ELTD_MYCS2 / A0QXD8 Erythritol/L-threitol dehydrogenase; EC 1.1.1.- from Mycolicibacterium smegmatis (strain ATCC 700084 / mc(2)155) (Mycobacterium smegmatis) (see paper)
MSMEG_3265 arabitol-phosphate dehydrogenase from Mycobacterium smegmatis str. MC2 155
28% identity, 94% coverage

llmg_1642 2,3-butanediol dehydrogenase from Lactococcus lactis subsp. cremoris MG1363
30% identity, 90% coverage

GulDH / E1V4Y1 L-gulonate 5-dehydrogenase (EC 1.1.1.380) from Halomonas elongata (strain ATCC 33173 / DSM 2581 / NBRC 15536 / NCIMB 2198 / 1H9) (see paper)
E1V4Y1 L-gulonate 5-dehydrogenase (EC 1.1.1.380) from Halomonas elongata (see paper)
30% identity, 99% coverage

AFUA_2G15930 alcohol dehydrogenase, zinc-containing from Aspergillus fumigatus Af293
31% identity, 97% coverage

NTHI0063 conserved hypothetical zinc-type alcohol dehydrogenase-like protein from Haemophilus influenzae 86-028NP
29% identity, 97% coverage

ECA0168 L-threonine 3-dehydrogenase from Erwinia carotovora subsp. atroseptica SCRI1043
28% identity, 90% coverage

ETAE_0085 L-threonine 3-dehydrogenase from Edwardsiella tarda EIB202
29% identity, 98% coverage

STM1542 putative zinc-binding dehydrogenase from Salmonella typhimurium LT2
29% identity, 95% coverage

F1T242 (R,R)-butanediol dehydrogenase (EC 1.1.1.4) from Mycobacterium sp. (see paper)
32% identity, 87% coverage

O35045 Uncharacterized zinc-type alcohol dehydrogenase-like protein YjmD from Bacillus subtilis (strain 168)
BSU12330 putative oxidoreductase from Bacillus subtilis subsp. subtilis str. 168
29% identity, 95% coverage

CLJU_c25840 zinc-binding dehydrogenase from Clostridium ljungdahlii DSM 13528
28% identity, 94% coverage

RBAM_006650 YdjL from Bacillus amyloliquefaciens FZB42
29% identity, 99% coverage

A0A0E4A9D6 (R,R)-butanediol dehydrogenase (EC 1.1.1.4) from Rhodococcus erythropolis (see paper)
29% identity, 92% coverage

MS6_A0925 L-threonine 3-dehydrogenase from Vibrio cholerae MS6
VCA0885 threonine 3-dehydrogenase from Vibrio cholerae O1 biovar eltor str. N16961
27% identity, 95% coverage

SO_4673 threonine 3-dehydrogenase from Shewanella oneidensis MR-1
28% identity, 98% coverage

AO090023000523 No description from Aspergillus oryzae RIB40
29% identity, 97% coverage

WP_029946299 (R,R)-butanediol dehydrogenase from Bacillus sp. SN32
31% identity, 92% coverage

6ie0B / O34788 X-ray crystal structure of 2R,3R-butanediol dehydrogenase from Bacillus subtilis
31% identity, 91% coverage

BDHA_BACSU / O34788 (R,R)-butanediol dehydrogenase; Acetoin reductase/2,3-butanediol dehydrogenase; AR/BDH; EC 1.1.1.4 from Bacillus subtilis (strain 168) (see 3 papers)
O34788 (R,R)-butanediol dehydrogenase (EC 1.1.1.4) from Bacillus subtilis (see paper)
BSU06240, NP_388505 acetoin reductase/2,3-butanediol dehydrogenase from Bacillus subtilis subsp. subtilis str. 168
NP_388505 acetoin reductase/2,3-butanediol dehydrogenase from Bacillus subtilis subsp. subtilis str. 168
31% identity, 92% coverage

ZP_02432273 hypothetical protein from Clostridium scindens ATCC 35704
28% identity, 99% coverage

Q1QT88 Alcohol dehydrogenase GroES-like protein from Chromohalobacter salexigens (strain ATCC BAA-138 / DSM 3043 / CIP 106854 / NCIMB 13768 / 1H11)
29% identity, 93% coverage

AS588_RS00220 2,3-butanediol dehydrogenase from Bacillus amyloliquefaciens
29% identity, 92% coverage

A0A075BZ18 (R,R)-butanediol dehydrogenase (EC 1.1.1.4) from Bacillus sp. (in: Bacteria) (see paper)
S6FPW0 (R,R)-butanediol dehydrogenase (EC 1.1.1.4) from Bacillus amyloliquefaciens (see paper)
BVY13_17360 2,3-butanediol dehydrogenase from Bacillus amyloliquefaciens
29% identity, 92% coverage

CD2279 putative sugar dehydrogenase from Clostridium difficile 630
30% identity, 91% coverage

MS141 putative dehydrogenase from Microscilla sp. PRE1
30% identity, 88% coverage

O31776 L-threonine 3-dehydrogenase (EC 1.1.1.103) from Bacillus subtilis (see paper)
30% identity, 92% coverage

BMMGA3_RS07355 galactitol-1-phosphate 5-dehydrogenase from Bacillus methanolicus MGA3
29% identity, 94% coverage

WP_013350835 zinc-binding dehydrogenase from Bacillus amyloliquefaciens
31% identity, 70% coverage

MTKAM_23920, MTMBA_25190 zinc-binding dehydrogenase from Moorella thermoacetica
29% identity, 96% coverage

gutB1 / A7Z0T4 2-amino-2-deoxy-D-mannitol dehydrogenase from Bacillus velezensis (strain DSM 23117 / BGSC 10A6 / LMG 26770 / FZB42) (see 4 papers)
RBAM_002060 GutB1 from Bacillus amyloliquefaciens FZB42
WP_011996208 zinc-binding dehydrogenase from Bacillus velezensis
30% identity, 70% coverage

WP_003156784 zinc-binding dehydrogenase from Bacillus velezensis
30% identity, 70% coverage

CD2324 putative galactitol-1-phosphate 5-dehydrogenase from Clostridium difficile 630
CD630_23240 galactitol-1-phosphate 5-dehydrogenase from Clostridioides difficile 630
26% identity, 97% coverage

ZP_02081557 hypothetical protein from Clostridium leptum DSM 753
29% identity, 93% coverage

swp_5068 Zinc-containing alcohol dehydrogenase superfamily from Shewanella piezotolerans WP3
27% identity, 94% coverage

LMOf2365_2643 alcohol dehydrogenase, zinc-dependent from Listeria monocytogenes str. 4b F2365
30% identity, 96% coverage

CH1034_300308 Zn-dependent oxidoreductase from Klebsiella pneumoniae
28% identity, 99% coverage

WP_020955232 zinc-binding dehydrogenase from Bacillus velezensis
30% identity, 70% coverage

WP_007408027 zinc-binding dehydrogenase from Bacillus sp. 916
30% identity, 70% coverage

VV21485 Threonine dehydrogenase from Vibrio vulnificus CMCP6
27% identity, 98% coverage

Q91_0877 2,3-butanediol dehydrogenase from Cycloclasticus sp. P1
26% identity, 99% coverage

VPA1509 threonine 3-dehydrogenase from Vibrio parahaemolyticus RIMD 2210633
FORC22_4519, WP_005478141 L-threonine 3-dehydrogenase from Vibrio parahaemolyticus
27% identity, 98% coverage

A1S_1705 putative (RR)-butanediol dehydrogenase from Acinetobacter baumannii ATCC 17978
29% identity, 82% coverage

PAAG_04541 alcohol dehydrogenase from Paracoccidioides lutzii Pb01
27% identity, 82% coverage

lmo2663 / Q8Y414 pentitolphosphate dehydrogenase Lmo2663 (EC 1.1.1.301) from Listeria monocytogenes serovar 1/2a (strain ATCC BAA-679 / EGD-e) (see 2 papers)
lmo2663 similar to polyol dehydrogenase from Listeria monocytogenes EGD-e
30% identity, 96% coverage

LMRG_02208 alcohol dehydrogenase, zinc-dependent from Listeria monocytogenes 10403S
30% identity, 95% coverage

4ej6A / Q92PZ3 Crystal structure of a putative zinc-binding dehydrogenase (target psi-012003) from Sinorhizobium meliloti 1021
30% identity, 93% coverage

DSB67_20895 L-threonine 3-dehydrogenase from Vibrio campbellii
26% identity, 98% coverage

SCO1682 zinc-binding alcohol dehydrogenase from Streptomyces coelicolor A3(2)
31% identity, 92% coverage

VIBHAR_05001 threonine 3-dehydrogenase from Vibrio harveyi ATCC BAA-1116
26% identity, 94% coverage

BPSS0006 threonine 3-dehydrogenase from Burkholderia pseudomallei K96243
27% identity, 98% coverage

BDP_0423 Putative dehydrogenase from Bifidobacterium dentium Bd1
29% identity, 93% coverage

CNBG_3919 xylitol dehydrogenase from Cryptococcus deuterogattii R265
36% identity, 39% coverage

gutB1 / E3EJK4 2-amino-2-deoxy-D-mannitol dehydrogenase from Paenibacillus polymyxa (strain SC2) (see 4 papers)
28% identity, 99% coverage

For advice on how to use these tools together, see Interactive tools for functional annotation of bacterial genomes.

Statistics

The PaperBLAST database links 789,361 different protein sequences to 1,256,019 scientific articles. Searches against EuropePMC were last performed on January 10, 2025.

How It Works

PaperBLAST builds a database of protein sequences that are linked to scientific articles. These links come from automated text searches against the articles in EuropePMC and from manually-curated information from GeneRIF, UniProtKB/Swiss-Prot, BRENDA, CAZy (as made available by dbCAN), BioLiP, CharProtDB, MetaCyc, EcoCyc, TCDB, REBASE, the Fitness Browser, and a subset of the European Nucleotide Archive with the /experiment tag. Given this database and a protein sequence query, PaperBLAST uses protein-protein BLAST to find similar sequences with E < 0.001.
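As a rough illustration of the E < 0.001 cutoff described above, the following Python sketch filters a list of hits by E-value. The `Hit` record, the `keep_significant` helper, and the example hits are hypothetical and are not taken from PaperBLAST's code; only the threshold comes from the text.

```python
from typing import List, NamedTuple

E_VALUE_CUTOFF = 0.001  # threshold stated in the text above


class Hit(NamedTuple):
    """Hypothetical record for one BLAST hit (not PaperBLAST's own data model)."""
    subject_id: str
    percent_identity: float
    e_value: float


def keep_significant(hits: List[Hit], cutoff: float = E_VALUE_CUTOFF) -> List[Hit]:
    """Keep hits whose E-value is strictly below the cutoff, best hit first."""
    return sorted((h for h in hits if h.e_value < cutoff), key=lambda h: h.e_value)


# Made-up hits for illustration only.
hits = [
    Hit("example_B", 27.0, 5e-4),
    Hit("example_C", 24.0, 0.02),
    Hit("example_A", 61.0, 1e-120),
]
significant = keep_significant(hits)
for h in significant:
    print(f"{h.subject_id}: {h.percent_identity:.0f}% identity, E = {h.e_value:g}")
```

Here `example_C` is dropped because its E-value is above the cutoff, and the remaining hits are listed from most to least significant.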

To build the database, we query EuropePMC with locus tags, with RefSeq protein identifiers, and with UniProt accessions. We obtain the locus tags from RefSeq or from MicrobesOnline. We use queries of the form "locus_tag AND genus_name" to try to ensure that the paper is actually discussing that gene. Because EuropePMC indexes most recent biomedical papers, even if they are not open access, some of the links may be to papers that you cannot read or that our computers cannot read. We query each of these identifiers that appears in the open access part of EuropePMC, as well as every locus tag that appears in the 500 most-referenced genomes, so that a gene may appear in the PaperBLAST results even though none of the papers that mention it are open access. We also incorporate text-mined links from EuropePMC that link open access articles to UniProt or RefSeq identifiers. (This yields some additional links because EuropePMC uses different heuristics for their text mining than we do.)
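The query form "locus_tag AND genus_name" can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the helper names are hypothetical, taking the genus to be the first word of the organism name is an assumption, and the exact query syntax that PaperBLAST sends to EuropePMC may differ.

```python
def genus_of(organism: str) -> str:
    """Assumption: the genus is the first whitespace-separated word of the organism name."""
    return organism.split()[0]


def build_query(locus_tag: str, organism: str) -> str:
    """Build a query of the form 'locus_tag AND genus_name' (form quoted in the text)."""
    return f"{locus_tag} AND {genus_of(organism)}"


# Locus tag and organism taken from one of the hits listed on this page.
query = build_query("SO_4673", "Shewanella oneidensis MR-1")
print(query)
```

Requiring the genus name alongside the locus tag is what helps exclude papers in which the same string appears with an unrelated meaning.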

For every article that mentions a locus tag, a RefSeq protein identifier, or a UniProt accession, we try to select one or two snippets of text that refer to the protein. If we cannot get access to the full text, we try to select a snippet from the abstract, but unfortunately, unique identifiers such as locus tags are rarely provided in abstracts.
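The snippet selection described above might look roughly like the Python sketch below. PaperBLAST's actual heuristics are not described here, so this is only an assumed approach: split the text into sentences, keep those that mention the identifier as a whole token, and return at most two. The function name and the example text are made up for illustration.

```python
import re
from typing import List


def select_snippets(text: str, identifier: str, max_snippets: int = 2) -> List[str]:
    """Return up to max_snippets sentences that mention the identifier as a whole token."""
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    # Require that the identifier is not embedded in a longer identifier
    # (so SO_4673 does not match SO_46730).
    pattern = re.compile(
        r"(?<![A-Za-z0-9_])" + re.escape(identifier) + r"(?![A-Za-z0-9_])"
    )
    return [s for s in sentences if pattern.search(s)][:max_snippets]


# Made-up article text for illustration only.
article = (
    "We deleted SO_4673 in strain MR-1. Growth on threonine was reduced. "
    "The SO_4673 mutant was complemented. SO_46730 was unaffected. "
    "Finally, SO_4673 was purified."
)
snippets = select_snippets(article, "SO_4673")
for s in snippets:
    print(s)
```

The same function could be applied to an abstract when the full text is unavailable, although, as noted above, abstracts rarely contain locus tags.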

PaperBLAST also incorporates manually-curated protein functions from the curated sources listed above.

Except for GeneRIF and ENA, the curated entries include a short curated description of the protein's function. For entries from BioLiP, the protein's function may not be known beyond binding to the ligand. Many of these entries also link to articles in PubMed.

For more information see the PaperBLAST paper (mSystems 2017) or the code. You can download PaperBLAST's database here.

Many of the changes to PaperBLAST since the paper was written are described in Interactive tools for functional annotation of bacterial genomes.

Secrets

PaperBLAST cannot provide snippets for many of the papers that are published in non-open-access journals. This limitation applies even if the paper is marked as "free" on the publisher's web site and is available in PubMed Central or EuropePMC. If a journal that you publish in is marked as "secret," please consider publishing elsewhere.

Omissions from the PaperBLAST Database

Many important articles are missing from PaperBLAST, either because the article's full text is not in EuropePMC (as for many older articles), or because the paper does not mention a protein identifier such as a locus tag, or because of PaperBLAST's heuristics. If you notice an article that characterizes a protein's function but is missing from PaperBLAST, please notify the curators at UniProt or add an entry to GeneRIF. Entries in either of these databases will eventually be incorporated into PaperBLAST. Note that to add an entry to UniProt, you will need to find the UniProt identifier for the protein. If the protein is not already in UniProt, you can ask them to create an entry. To add an entry to GeneRIF, you will need an NCBI Gene identifier, but unfortunately many prokaryotic proteins in RefSeq do not have corresponding Gene identifiers.

References

PaperBLAST: Text-mining papers for information about homologs.
M. N. Price and A. P. Arkin (2017). mSystems, 10.1128/mSystems.00039-17.

Europe PMC in 2017.
M. Levchenko et al (2017). Nucleic Acids Research, 10.1093/nar/gkx1005.

Gene indexing: characterization and analysis of NLM's GeneRIFs.
J. A. Mitchell et al (2003). AMIA Annu Symp Proc 2003:460-464.

UniProt: the universal protein knowledgebase.
The UniProt Consortium (2016). Nucleic Acids Research, 10.1093/nar/gkw1099.

BRENDA in 2017: new perspectives and new tools in BRENDA.
S. Placzek et al (2017). Nucleic Acids Research, 10.1093/nar/gkw952.

The EcoCyc database: reflecting new knowledge about Escherichia coli K-12.
I. M. Keseler et al (2016). Nucleic Acids Research, 10.1093/nar/gkw1003.

The MetaCyc database of metabolic pathways and enzymes.
R. Caspi et al (2018). Nucleic Acids Research, 10.1093/nar/gkx935.

CharProtDB: a database of experimentally characterized protein annotations.
R. Madupu et al (2012). Nucleic Acids Research, 10.1093/nar/gkr1133.

The carbohydrate-active enzymes database (CAZy) in 2013.
V. Lombard et al (2014). Nucleic Acids Research, 10.1093/nar/gkt1178.

The Transporter Classification Database (TCDB): recent advances.
M. H. Saier, Jr. et al (2016). Nucleic Acids Research, 10.1093/nar/gkv1103.

REBASE - a database for DNA restriction and modification: enzymes, genes and genomes.
R. J. Roberts et al (2015). Nucleic Acids Research, 10.1093/nar/gku1046.

Deep annotation of protein function across diverse bacteria from mutant phenotypes.
M. N. Price et al (2016). bioRxiv, 10.1101/072470.

by Morgan Price, Arkin group
Lawrence Berkeley National Laboratory