PaperBLAST – Find papers about a protein or its homologs

 

PaperBLAST Hits for WP_043847943.1 glycosyltransferase (Amycolatopsis keratiniphila) (408 a.a., MRVLLSTCGS...)

Other sequence analysis tools:

Find functional residues: SitesBLAST

Search for conserved domains

Find the best match in UniProt

Compare to protein structures

Predict transmembrane helices: Phobius

Predict protein localization: PSORTb

Find homologs in fast.genomics

Found 183 similar proteins in the literature:

AAB49299.1 TDP/UDP-Glc: aglycosyl-vancomycin glucosyltransferase (GtfE';Vcm10) (EC 2.4.1.-) (see protein)
AORI_1487 glycosyltransferase from Amycolatopsis keratiniphila
100% identity, 100% coverage

GTFE_AMYOR / Q9AFC6 Glycosyltransferase GtfE; EC 2.4.1.- from Amycolatopsis orientalis (Nocardia orientalis) (see 2 papers)
AAK31353.1 TDP/UDP-Glc: aglycosyl-vancomycin glucosyltransferase (GtfE) (EC 2.4.1.-) (see protein)
90% identity, 100% coverage

GTFB_AMYOR / P96559 Vancomycin aglycone glucosyltransferase; Glycosyltransferase GtfB; EC 2.4.1.310 from Amycolatopsis orientalis (Nocardia orientalis) (see 4 papers)
P96559 vancomycin aglycone glucosyltransferase (EC 2.4.1.310) from Amycolatopsis orientalis (see 2 papers)
AAB49293.1 TDP/UDP-Glc: aglycosyl-vancomycin: glucosyltransferase (GtfB;PCZA361.20) (EC 2.4.1.-) (see protein)
81% identity, 100% coverage

CAA76552.1 UDP-Glc: balhimycin aglycone Hpg-glucosyltransferase B (BgtfB) (EC 2.4.1.-) (see protein)
82% identity, 100% coverage

CAE53364.1 UDP-GlcNAc: teicoplanin aglycone 4Hpg-N-acetylglucosaminyltransferase (Tcp23;Orf10*;GtfB) (EC 2.4.1.-) (see protein)
tcp23 / CAE53364.1 GtfB protein from Actinoplanes teichomyceticus (see paper)
66% identity, 100% coverage

DMB42_RS42780 glycosyltransferase from Nonomuraea sp. WAC 01424
67% identity, 98% coverage

GTFC_AMYOR / P96560 Glycosyltransferase GtfC; EC 2.4.1.- from Amycolatopsis orientalis (Nocardia orientalis) (see 3 papers)
AAB49294.1 UDP-β-L-4-epi-vancosamine: vancomycin-pseudoaglycone vancosaminyltransferase (GtfC;PCZA361.21) (EC 2.4.1.-) (see protein)
63% identity, 99% coverage

AAB49298.1 UDP-β-L-4-epi-vancosamine: vancomycin-pseudoaglycone vancosaminyltransferase (GtfD';Vcm9) (EC 2.4.1.-) (see protein)
AORI_1486 glycosyltransferase from Amycolatopsis keratiniphila
65% identity, 100% coverage

1rrvA / Q9AFC7 X-ray crystal structure of tdp-vancosaminyltransferase gtfd as a complex with tdp and the natural substrate, desvancosaminyl vancomycin. (see paper)
64% identity, 95% coverage

gtfD / Q9AFC7 desvancosaminyl-vancomycin vancosaminetransferase (EC 2.4.1.322) from Amycolatopsis orientalis (see 2 papers)
GTFD_AMYOR / Q9AFC7 Devancosaminyl-vancomycin vancosaminetransferase; Devancosamine-vancomycin TDP-vancosaminyltransferase; Glycosyltransferase GtfD; EC 2.4.1.322 from Amycolatopsis orientalis (Nocardia orientalis) (see 2 papers)
Q9AFC7 devancosaminyl-vancomycin vancosaminetransferase (EC 2.4.1.322) from Amycolatopsis orientalis (see 5 papers)
AAK31352.1 UDP-β-L-4-epi-vancosamine: vancomycin-pseudoaglycone vancosaminyltransferase (GtfD) (EC 2.4.1.-) (see protein)
64% identity, 100% coverage

CAE53349.1 UDP-GlcNAc: teicoplanin aglycone 3-Cl-6-β-Hty-N-acetylglucosaminyltransferase (Tcp8;Orf1;GtfA) (EC 2.4.1.-) (see protein)
tcp8 / CAE53349.1 GtfA protein from Actinoplanes teichomyceticus (see paper)
61% identity, 100% coverage

3h4iA / P96558,Q6ZZJ7 Chimeric glycosyltransferase for the generation of novel natural products (see paper)
60% identity, 98% coverage

GTFA_AMYOR / P96558 dTDP-epi-vancosaminyltransferase; Glycosyltransferase AH1; GtfAH1; Glycosyltransferase GtfA; EC 2.4.1.311 from Amycolatopsis orientalis (Nocardia orientalis) (see 4 papers)
P96558 chloroorienticin B synthase (EC 2.4.1.311) from Amycolatopsis orientalis (see 2 papers)
AAB49292.1 dTDP-β-L-4-epi-epivancosamine: epivancosaminyltransferase (GtfA;PCZA361.19) (EC 2.4.1.-) (see protein)
59% identity, 100% coverage

1pn3A / P96558 Crystal structure of tdp-epi-vancosaminyltransferase gtfa in complexes with tdp and the acceptor substrate dvv. (see paper)
60% identity, 98% coverage

ADN68481.1 UDP-Glc: sorangicin A glucosyltransferase (SorF) (EC 2.4.1.-) (see protein)
42% identity, 94% coverage

XALc_0365 putative glucosyltransferase protein from Xanthomonas albilineans
32% identity, 88% coverage

K4BTE6 sterol 3beta-glucosyltransferase (EC 2.4.1.173) from Solanum lycopersicum (see paper)
29% identity, 63% coverage

SGTL1 / Q2I015 3β-hydroxy sterol glucosyltransferase (EC 2.4.1.173) from Withania somnifera (see 4 papers)
29% identity, 56% coverage

A0A023NFQ4 sterol 3beta-glucosyltransferase (EC 2.4.1.173) from Gossypium hirsutum (see paper)
29% identity, 60% coverage

WP_012855060 glycosyltransferase from Thermomonospora curvata DSM 43183
32% identity, 96% coverage

LOC115709176 sterol 3-beta-glucosyltransferase UGT80B1 from Cannabis sativa
27% identity, 61% coverage

WP_030163570 glycosyltransferase from Spirillospora albida
33% identity, 84% coverage

AT3G07020 UDP-glucose:sterol glucosyltransferase (UGT80A2) from Arabidopsis thaliana
NP_850529 UDP-Glycosyltransferase superfamily protein from Arabidopsis thaliana
28% identity, 62% coverage

AAO95505.1 At3g07020/F17A9.17 (UDP-Glc: sterol glucosyltransferase;Sgt;Ugt80A2) (EC 2.4.1.173) (see protein)
ugt80A2 / CAB06082.1 UDP-glucose:sterol glucosyltransferase from Arabidopsis thaliana (see paper)
28% identity, 62% coverage

UGT80A2 / Q9M8Z7 3β-hydroxy sterol glucosyltransferase (EC 2.4.1.173) from Arabidopsis thaliana (see 3 papers)
U80A2_ARATH / Q9M8Z7 Sterol 3-beta-glucosyltransferase UGT80A2; UDP-glucose:sterol glucosyltransferase 80A2; EC 2.4.1.173 from Arabidopsis thaliana (Mouse-ear cress) (see 3 papers)
Q9M8Z7 sterol 3beta-glucosyltransferase (EC 2.4.1.173) from Arabidopsis thaliana (see 4 papers)
NP_566297 UDP-Glycosyltransferase superfamily protein from Arabidopsis thaliana
28% identity, 62% coverage

Q6U848 Rhamnosyltransferase A (Fragment) from Mycobacterium intracellulare
27% identity, 92% coverage

U80B1_ARATH / Q9XIG1 Sterol 3-beta-glucosyltransferase UGT80B1; Protein TRANSPARENT TESTA 15; UDP-glucose:sterol glucosyltransferase 80B1; EC 2.4.1.173 from Arabidopsis thaliana (Mouse-ear cress) (see 2 papers)
Q9XIG1 sterol 3beta-glucosyltransferase (EC 2.4.1.173) from Arabidopsis thaliana (see 3 papers)
NP_001077674 UDP-Glycosyltransferase superfamily protein from Arabidopsis thaliana
NP_175027 UDP-Glycosyltransferase superfamily protein from Arabidopsis thaliana
AT1G43620 UDP-glucose:sterol glucosyltransferase, putative from Arabidopsis thaliana
27% identity, 64% coverage

BAC22616.1 UDP-Glc: sterol 3-O-glucosyltransferase (PGGT-1) (EC 2.4.1.173) (see protein)
28% identity, 65% coverage

A0A023NGA8 sterol 3beta-glucosyltransferase (EC 2.4.1.173) from Gossypium hirsutum (see paper)
27% identity, 65% coverage

MAP_RS19275 glycosyltransferase from Mycobacterium avium subsp. paratuberculosis K-10
MAP3762c hypothetical protein from Mycobacterium avium subsp. paratuberculosis str. k10
31% identity, 98% coverage

ML2348 putative glycosyl transferase from Mycobacterium leprae TN
29% identity, 96% coverage

Afu7g04880 sterol glucosyltransferase, putative from Aspergillus fumigatus Af293
28% identity, 50% coverage

MSMEG_4740 Glycosyltransferase family protein 28 from Mycobacterium smegmatis str. MC2 155
29% identity, 98% coverage

ABO_1783 glycosyl transferase, putative from Alcanivorax borkumensis SK2
28% identity, 91% coverage

Q6CUV2 sterol 3beta-glucosyltransferase (EC 2.4.1.173) from Kluyveromyces lactis (see paper)
27% identity, 33% coverage

SS1G_05901 hypothetical protein from Sclerotinia sclerotiorum 1980 UF-70
28% identity, 31% coverage

LEN_4536 glycosyltransferase from Lysobacter enzymogenes
28% identity, 86% coverage

O22678 sterol 3beta-glucosyltransferase (EC 2.4.1.173) from Avena sativa (see 2 papers)
CAB06081.1 UDP-Glc: sterol glucosyltransferase (Ugt80A1) (EC 2.4.1.173) (see protein)
ugt80A1 / CAB06081.1 UDP-glucose:sterol glucosyltransferase from Avena sativa (see paper)
26% identity, 64% coverage

AAC71702.1 GPL/6-deoxytalose α-L-1,2-rhamnosyltransferase A (RtfA;Orf2) (EC 2.4.1.-) (see protein)
AAD44209.1 rhamnosyltransferase A (RtfA) (EC 2.4.1.-) (see protein)
25% identity, 92% coverage

BAC22617.1 UDP-Glc: sterol 3-O-glucosyltransferase (PGGT-2) (EC 2.4.1.173) (see protein)
Q8H9B4 UDP-glucose:sterol 3-O-glucosyltransferase from Panax ginseng
28% identity, 66% coverage

MAB_4112c Putative glycosyltransferase GtfA from Mycobacterium abscessus ATCC 19977
26% identity, 90% coverage

K4C3H8 sterol 3beta-glucosyltransferase (EC 2.4.1.173) from Solanum lycopersicum (see paper)
27% identity, 64% coverage

TERG_00990 sterol 3-beta-glucosyltransferase from Trichophyton rubrum CBS 118892
28% identity, 60% coverage

AAN28688.1 GPL/3,4-di-O-Me-rhamnose α-L-1,2-(3-O-methyl-)rhamnosyltransferase (Gtf3;MSMEG_0385) (EC 2.4.1.-) (see protein)
28% identity, 94% coverage

Mb1553c Probable glycosyltransferase from Mycobacterium bovis AF2122/97
Rv1526c Probable glycosyltransferase from Mycobacterium tuberculosis H37Rv
29% identity, 93% coverage

XAC3921 glucosyltransferase from Xanthomonas axonopodis pv. citri str. 306
30% identity, 89% coverage

MSMEG_0385 hypothetical glycosyl transferase from Mycobacterium smegmatis str. MC2 155
28% identity, 94% coverage

A0A2D1N4Z8 sterol 3beta-glucosyltransferase (EC 2.4.1.173) from Solanum lycopersicum (see paper)
26% identity, 67% coverage

LOC115712970 sterol 3-beta-glucosyltransferase UGT80A2 from Cannabis sativa
26% identity, 65% coverage

K4BS77 sterol 3beta-glucosyltransferase (EC 2.4.1.173) from Solanum lycopersicum (see paper)
26% identity, 64% coverage

MMAR_2353 UDP-glycosyltransferase from Mycobacterium marinum M
28% identity, 97% coverage

FVEG_00073 hypothetical protein from Fusarium verticillioides 7600
27% identity, 45% coverage

A5DNB9 sterol 3beta-glucosyltransferase (EC 2.4.1.173) from Meyerozyma guilliermondii (see paper)
28% identity, 25% coverage

MAV_3258 Glycosyltransferase family protein 28 from Mycobacterium avium 104
28% identity, 94% coverage

SS1G_09252 hypothetical protein from Sclerotinia sclerotiorum 1980 UF-70
26% identity, 52% coverage

RHTO_07138 sterol 3-beta-glucosyltransferase, glycosyltransferase family 1 protein from Rhodotorula toruloides NP11
26% identity, 24% coverage

SPSK_05928 glycosyltransferase family 28 protein from Sporothrix schenckii 1099-18
39% identity, 17% coverage

ATG26_CANAL / Q5A950 Sterol 3-beta-glucosyltransferase; Autophagy-related protein 26; UDP-glycosyltransferase 51; EC 2.4.1.-; EC 2.4.1.173 from Candida albicans (strain SC5314 / ATCC MYA-2876) (Yeast) (see paper)
Q5A950 sterol 3beta-glucosyltransferase (EC 2.4.1.173) from Candida albicans (see paper)
27% identity, 24% coverage

Q9WW66 Glycosyltransferase gtfB from Mycobacterium avium
28% identity, 94% coverage

UGT51C1 sterol 3-beta-glucosyltransferase; EC 2.4.1.173 from Candida albicans (see paper)
27% identity, 24% coverage

AAD29571.1 UDP-Glc: sterol glucosyltransferase (UGT51C1) (EC 2.4.1.173) (see protein)
27% identity, 24% coverage

ATG26_KOMPG / Q9Y751 Sterol 3-beta-glucosyltransferase; Autophagy-related protein 26; Peroxisome degradation protein 3; Pexophagy zeocin-resistant mutant protein 4; UDP-glycosyltransferase 51; EC 2.4.1.-; EC 2.4.1.173 from Komagataella phaffii (strain GS115 / ATCC 20864) (Yeast) (Pichia pastoris) (see 5 papers)
Q9Y751 sterol 3beta-glucosyltransferase (EC 2.4.1.173) from Komagataella pastoris (see 3 papers)
CAY71393.1 UDP-Glc: sterol glucosyltransferase (UGT51B1;PAS_chr4_0167) (EC 2.4.1.173) (see protein)
26% identity, 32% coverage

5gl5A / Q06321 Sterol 3-beta-glucosyltransferase (ugt51) from saccharomyces cerevisiae (strain atcc 204508 / s288c): udpg complex (see paper)
27% identity, 80% coverage

ASPSYDRAFT_91437 uncharacterized protein from Aspergillus sydowii CBS 593.65
34% identity, 19% coverage

MAB_4107c Glycosyltransferase GtfA from Mycobacterium abscessus ATCC 19977
25% identity, 94% coverage

U5D0T8 Uncharacterized protein from Amborella trichopoda
27% identity, 71% coverage

XALc_1144 putative glycosyltransferase protein from Xanthomonas albilineans
31% identity, 88% coverage

ATG26_YEAST / Q06321 Sterol 3-beta-glucosyltransferase; Autophagy-related protein 26; UDP-glycosyltransferase 51; EC 2.4.1.-; EC 2.4.1.173 from Saccharomyces cerevisiae (strain ATCC 204508 / S288c) (Baker's yeast) (see 6 papers)
Q06321 sterol 3beta-glucosyltransferase (EC 2.4.1.173) from Saccharomyces cerevisiae (see 5 papers)
AAB67475.1 UDP-Glc: sterol glucosyltransferase (Atg26;YLR189c;UGT51) (EC 2.4.1.173) (see protein)
NP_013290 sterol 3-beta-glucosyltransferase from Saccharomyces cerevisiae S288C
YLR189C Atg26p from Saccharomyces cerevisiae
26% identity, 29% coverage

AAD28546.1 UDP-Glc: sterol glucosyltransferase (Ugt52) (EC 2.4.1.173) (see protein)
24% identity, 39% coverage

Q6BN88 sterol 3beta-glucosyltransferase (EC 2.4.1.173) from Debaryomyces hansenii (see paper)
28% identity, 22% coverage

ATG26_PICAN / A7KAK6 Sterol 3-beta-glucosyltransferase; Autophagy-related protein 26; EC 2.4.1.-; EC 2.4.1.173 from Pichia angusta (Yeast) (Hansenula polymorpha) (see paper)
A7KAK6 sterol 3beta-glucosyltransferase (EC 2.4.1.173) from Ogataea angusta (see paper)
27% identity, 32% coverage

AAN77910.1 UDP-Glc: sterol glucosyltransferase (UGT51D1) (EC 2.4.1.173) (see protein)
28% identity, 58% coverage

XC_3951 glucosyltransferase from Xanthomonas campestris pv. campestris str. 8004
28% identity, 89% coverage

MAV_3994 glycosyltransferase GtfB from Mycobacterium avium 104
36% identity, 41% coverage

XP_003044011 uncharacterized protein from Fusarium vanettenii 77-13-4
37% identity, 13% coverage

MSMEG_0389 hypothetical glycosyl transferase from Mycobacterium smegmatis str. MC2 155
24% identity, 88% coverage

AAC71701.1 GPL/6-deoxytalose α-L-1,2-rhamnosyltransferase A (GtfA) (EC 2.4.1.-) (see protein)
O68999 Glycosyltransferase from Mycobacterium avium
27% identity, 87% coverage

SS1G_04910 hypothetical protein from Sclerotinia sclerotiorum 1980 UF-70
33% identity, 32% coverage

Rv1524 Probable glycosyltransferase from Mycobacterium tuberculosis H37Rv
Mb1551 Probable glycosyltransferase from Mycobacterium bovis AF2122/97
26% identity, 95% coverage

ASPSYDRAFT_86678 uncharacterized protein from Aspergillus sydowii CBS 593.65
28% identity, 22% coverage

tylN / O70023 O-mycaminosyltylonolide 6-deoxyallosyltransferase (EC 2.4.1.317) from Streptomyces fradiae (see paper)
TYLN_STRFR / O70023 O-mycaminosyltylonolide 6-deoxyallosyltransferase; EC 2.4.1.317 from Streptomyces fradiae (Streptomyces roseoflavus) (see paper)
O70023 O-mycaminosyltylonolide 6-deoxyallosyltransferase (EC 2.4.1.317) from Streptomyces fradiae (see paper)
AAD12163.1 deoxyallosyl-transferase (TylN) (EC 2.4.1.-) (see protein)
tylN / CAA06512.2 deoxyallosyl-transferase from Streptomyces fradiae (see paper)
29% identity, 94% coverage

UGT52_DICDI / Q54IL5 UDP-sugar-dependent glycosyltransferase 52; Sterol 3-beta-glucosyltransferase; UDP-glycosyltransferase 52; EC 2.4.1.173 from Dictyostelium discoideum (Social amoeba) (see paper)
Q54IL5 sterol 3beta-glucosyltransferase (EC 2.4.1.173) from Dictyostelium discoideum (see paper)
25% identity, 23% coverage

MSMEG_0392 hypothetical glycosyl transferase from Mycobacterium smegmatis str. MC2 155
26% identity, 79% coverage

MAB_4104 Putative glycosyltransferase GtfB from Mycobacterium abscessus ATCC 19977
35% identity, 42% coverage

MAB_4695c Putative glycosyltransferase/rhamnosyltransferase from Mycobacterium abscessus ATCC 19977
37% identity, 41% coverage

BTH_II1076 rhamnosyltransferase I, subunit B from Burkholderia thailandensis E264
BTH_II1880 rhamnosyltransferase I, subunit B from Burkholderia thailandensis E264
31% identity, 82% coverage

C9YYI9 Putative glycosyltransferase from Streptomyces scabiei (strain 87.22)
33% identity, 82% coverage

FRAAL5400 Putative glycosyltransferase from Frankia alni ACN14a
31% identity, 45% coverage

Afu2g02220 UDP-glucose:sterol glycosyltransferase from Aspergillus fumigatus Af293
26% identity, 29% coverage

MAB_4694c Glycosyltransferase from Mycobacterium abscessus ATCC 19977
25% identity, 89% coverage

AAM81359.1 UDP-Glc: sterol glycosyltransferase (Ugt51;Ugt51E1) (EC 2.4.1.173) (see protein)
27% identity, 24% coverage

ATG26_GIBZE / I1S8Q3 Sterol 3-beta-glucosyltransferase ATG26; Autophagy-related protein 26; EC 2.4.1.-; EC 2.4.1.173 from Gibberella zeae (strain ATCC MYA-4620 / CBS 123657 / FGSC 9075 / NRRL 31084 / PH-1) (Wheat head blight fungus) (Fusarium graminearum) (see paper)
27% identity, 24% coverage

ATG26_ASPOR / Q2U0C3 Sterol 3-beta-glucosyltransferase; Autophagy-related protein 26; EC 2.4.1.-; EC 2.4.1.173 from Aspergillus oryzae (strain ATCC 42149 / RIB 40) (Yellow koji mold) (see paper)
25% identity, 29% coverage

Npun_R3449 glycosyl transferase family protein from Nostoc punctiforme
26% identity, 92% coverage

Q5B4C9 Sterol 3-beta-glucosyltransferase from Emericella nidulans (strain FGSC A4 / ATCC 38163 / CBS 112.46 / NRRL 194 / M139)
26% identity, 28% coverage

AAN77909.1 UDP-Glc: sterol glucosyltransferase (UGT53A1) (EC 2.4.1.173) (see protein)
25% identity, 26% coverage

An07g06610 uncharacterized protein from Aspergillus niger
27% identity, 27% coverage

WQ49_RS07360 glycosyltransferase from Burkholderia cenocepacia
BCAM2338 putative glycosyltransferase from Burkholderia cenocepacia J2315
31% identity, 88% coverage

rhlB / Q51560 RhlB rhamnosyltransferase from Pseudomonas aeruginosa (see 4 papers)
Q51560 Rhamnosyl transferase from Pseudomonas aeruginosa
27% identity, 82% coverage

CCM_01158 UDP-glucose:sterol glycosyltransferase from Cordyceps militaris CM01
28% identity, 24% coverage

D2EDM4 Rhamnosyltransferase-1 from Pseudomonas aeruginosa
27% identity, 83% coverage

AAG06866.1 rhamnosyltransferase (RhlB;PA3478) (EC 2.4.1.-) (see protein)
PA3478 rhamnosyltransferase chain B from Pseudomonas aeruginosa PAO1
NP_252168 rhamnosyltransferase subunit B from Pseudomonas aeruginosa PAO1
Q9HYD1 Rhamnosyltransferase chain B from Pseudomonas aeruginosa (strain ATCC 15692 / DSM 22644 / CIP 104116 / JCM 14847 / LMG 12228 / 1C / PRS 101 / PAO1)
PA14_19110 rhamnosyltransferase chain B from Pseudomonas aeruginosa UCBPP-PA14
BWR11_07885 glycosyltransferase from Pseudomonas aeruginosa
27% identity, 82% coverage

K659_RS0103715 glycosyltransferase from Pseudomonas corrugata CFBP 5454
28% identity, 83% coverage

BLU14_RS07155 glycosyltransferase from Pseudomonas corrugata
28% identity, 83% coverage

ATG26_GLOLA / C4B4E5 Sterol 3-beta-glucosyltransferase; Autophagy-related protein 26; EC 2.4.1.-; EC 2.4.1.173 from Glomerella lagenarium (Anthracnose fungus) (Colletotrichum lagenarium) (see paper)
26% identity, 24% coverage

SPSK_04821 udp-transferase from Sporothrix schenckii 1099-18
29% identity, 19% coverage

HZ99_RS01145 glycosyltransferase from Pseudomonas fluorescens
27% identity, 92% coverage

ASPSYDRAFT_33013 uncharacterized protein from Aspergillus sydowii CBS 593.65
26% identity, 28% coverage

MGG_03459 Atg26p from Pyricularia oryzae 70-15
26% identity, 22% coverage

MUL_1529 UDP-glycosyltransferase from Mycobacterium ulcerans Agy99
31% identity, 36% coverage

PSPA7_1648 rhamnosyltransferase chain B from Pseudomonas aeruginosa PA7
27% identity, 82% coverage

SS1G_07979 hypothetical protein from Sclerotinia sclerotiorum 1980 UF-70
26% identity, 24% coverage

SPSK_01368 sterol 3beta-glucosyltransferase from Sporothrix schenckii 1099-18
31% identity, 8% coverage

mycD / Q83WF1 mycinamicin VII 6-deoxyallosyltransferase from Micromonospora griseorubida (see 2 papers)
40% identity, 24% coverage

F4KII1 sterol 3beta-glucosyltransferase (EC 2.4.1.173) from Arabidopsis thaliana (see paper)
AT5G24750, NP_568452 UDP-Glycosyltransferase superfamily protein from Arabidopsis thaliana
NP_568452 hypothetical protein from Arabidopsis thaliana
26% identity, 34% coverage

W1PA48 Erythromycin biosynthesis protein CIII-like C-terminal domain-containing protein from Amborella trichopoda
26% identity, 33% coverage

AAF00215.1 (12b -derhodinosyl-)urdamycin G D-olivosyltransferase (UrdGT1b) (EC 2.4.1.-) (see protein)
45% identity, 22% coverage

FQZ25_19840 macrolide family glycosyltransferase from Bacillus thuringiensis
29% identity, 39% coverage

AAF00217.1 aquayamycin / urdamycinone B L-rhodinosyltransferase (UrdGT1c) (EC 2.4.1.-) (see protein)
43% identity, 24% coverage

AAS41737.1 flavonoid β-3(7)-O-glucosyltransferase (BcGT-3;BCE2825) (EC 2.4.1.-) (see protein)
21% identity, 95% coverage

FRAAL4787 putative N-glycosyltransferase from Frankia alni ACN14a
31% identity, 53% coverage

LOC112052352 uncharacterized protein LOC112052352 from Bicyclus anynana
27% identity, 14% coverage

J8Y18_13760 macrolide family glycosyltransferase from Bacillus cereus
28% identity, 39% coverage

FSOA_HUMFU / P9WEH1 Terpene cyclase-glycosyl transferase fusion protein fsoA; Fuscoatroside biosynthesis cluster protein A; EC 5.4.99.-; EC 2.4.1.- from Humicola fuscoatra (see paper)
34% identity, 7% coverage

NDPGT_BACLD / Q65JC2 NDP-glycosyltransferase YjiC; UDP-glucosyltransferase YjiC; EC 2.4.1.384 from Bacillus licheniformis (strain ATCC 14580 / DSM 13 / JCM 2505 / CCUG 7422 / NBRC 12200 / NCIMB 9375 / NCTC 10341 / NRRL NRS-1264 / Gibson 46) (see 8 papers)
Q65JC2 flavone 7-O-beta-glucosyltransferase (EC 2.4.1.81) from Bacillus licheniformis (see paper)
AAU40842.1 UDP-Glc: isoflavonoid β-glucosyltransferase (YjiC;BLi01948;BL00446) (EC 2.4.1.-) (see protein)
32% identity, 39% coverage

NDPGT_BACSU / O34539 NDP-glycosyltransferase YjiC; UDP-glycosyltransferase YjiC; EC 2.4.1.384 from Bacillus subtilis (strain 168) (see 3 papers)
O34539 NDP-glycosyltransferase (EC 2.4.1.384) from Bacillus subtilis (see 2 papers)
NP_389104 putative glycosyltransferase from Bacillus subtilis subsp. subtilis str. 168
34% identity, 26% coverage

Celf_3212 glycosyltransferase from Cellulomonas fimi ATCC 484
41% identity, 21% coverage

AAD13559.1 olivosyltransferase (LanGT3) (EC 2.4.1.-) (see protein)
38% identity, 27% coverage

6kqxA / O34539 Crystal structure of yijc from b. Subtilis in complex with udp (see paper)
34% identity, 26% coverage

WP_003220489 glycosyltransferase from Bacillus spizizenii
31% identity, 32% coverage

7vlbA / A0A289QH46 Crystal structure of ugt109a1 from bacillus
31% identity, 32% coverage

7vlbB / A0A289QH46 Crystal structure of ugt109a1 from bacillus
31% identity, 32% coverage

SS1G_13524 hypothetical protein from Sclerotinia sclerotiorum 1980 UF-70
25% identity, 25% coverage

tylM2 / P95747 tylactone mycaminosyltransferase (EC 2.4.1.316) from Streptomyces fradiae (see paper)
TYLM2_STRFR / P95747 Tylactone mycaminosyltransferase; EC 2.4.1.316 from Streptomyces fradiae (Streptomyces roseoflavus) (see paper)
P95747 tylactone mycaminosyltransferase (EC 2.4.1.316) from Streptomyces fradiae (see paper)
CAA57472.2 TDP-D-mycaminose : tylactone mycaminyltransferase (TylMII;Orf2*) (EC 2.4.1.-) (see protein)
39% identity, 20% coverage

ABO27086.1 SaqAE3 (L-rhodinose) β-1,4A-olivosyltransferase (SaqGT3) (EC 2.4.1.-) (see protein)
39% identity, 17% coverage

eryBV / A4F7N6 dTDP-L-mycarosyl: erythronolide B mycarosyltransferase (EC 2.4.1.328) from Saccharopolyspora erythraea (strain ATCC 11635 / DSM 40517 / JCM 4748 / NBRC 13426 / NCIMB 8594 / NRRL 2338) (see 2 papers)
ERYBV_SACER / O33939 Erythronolide mycarosyltransferase; EC 2.4.1.328 from Saccharopolyspora erythraea (Streptomyces erythraeus) (see paper)
SACE_0719 6-DEB TDP-mycarosyl glycosyltransferase from Saccharopolyspora erythraea NRRL 2338
32% identity, 33% coverage

3wagB / Q76KZ6 Crystal structure of glycosyltransferase vinc in complex with dtdp
36% identity, 27% coverage

BAD08357.1 dTDP-vicenisamine: vicenilactam β-vicenisaminyltransferase (VinC) (EC 2.4.1.-) (see protein)
36% identity, 26% coverage

SACE_4644 putative glycosyltransferase from Saccharopolyspora erythraea NRRL 2338
41% identity, 18% coverage

CAC37820.1 dTDP-D-desosamine: 3-α-mycarosylerythronolide B desosaminyltransferase (MegCIII) (EC 2.4.1.-) (see protein)
36% identity, 22% coverage

ERYC3_SACEN / A4F7P3 3-alpha-mycarosylerythronolide B desosaminyl transferase; Desosaminyl transferase EryCIII; Erythromycin biosynthesis protein CIII; EC 2.4.1.278 from Saccharopolyspora erythraea (strain ATCC 11635 / DSM 40517 / JCM 4748 / NBRC 13426 / NCIMB 8594 / NRRL 2338) (see 3 papers)
CAA74710.1 TDP-desosamine: α-mycarosyl erythronolide B desosaminyltransferase (EryCIII;SACE_0726) (EC 2.4.1.-) (see protein)
YP_001102993 glycosyl transferase, NDP-D-desosamine : 3-L-mycarosyl erythronolide B from Saccharopolyspora erythraea NRRL 2338
33% identity, 23% coverage

KALB_6579 macrolide family glycosyltransferase from Kutzneria albida DSM 43870
34% identity, 27% coverage

LOC725997 UDP-glucuronosyltransferase 2C1 from Apis mellifera
25% identity, 33% coverage

YP_003204087 UDP-glucuronosyl/UDP-glucosyltransferase from Nakamurella multipartita DSM 44233
34% identity, 38% coverage

CAJ42338.1 steffimycin L-rhamnosyltransferase (StfG) (EC 2.4.1.-) (see protein)
stfG / CAJ42338.1 glycosyl transferase from Streptomyces steffisburgensis (see paper)
42% identity, 19% coverage

LOC118279357 UDP-glucosyltransferase 2-like from Spodoptera frugiperda
25% identity, 41% coverage

3otiA / Q8KND7 Crystal structure of calg3, calicheamicin glycostyltransferase, tdp and calicheamicin t0 bound form (see paper)
35% identity, 23% coverage

CHLREDRAFT_154976 uncharacterized protein from Chlamydomonas reinhardtii
34% identity, 15% coverage

T1KUK4 UDP-glycosyltransferase 202A2 from Tetranychus urticae
28% identity, 38% coverage

8sftB / T1KUK4 Crystal structure of tuugt202a2 (tetur22g00270) in complex with kaempferol
28% identity, 39% coverage

YP_138519 polyprotein from Cryphonectria hypovirus 4
33% identity, 3% coverage

mycB / Q83WE1 protomycinolide IV desosaminyltransferase from Micromonospora griseorubida (see 2 papers)
35% identity, 19% coverage

EFUA_HORCR / A0A2Z4HPY4 Enfumafungin synthase efuA; Enfumafungin biosynthesis cluster protein A; Terpene cyclase-glycosyl transferase fusion protein efuA; EC 5.4.99.-; EC 2.4.1.- from Hormonema carpetanum (see paper)
30% identity, 13% coverage

slr1125 zeaxanthin glucosyl transferase from Synechocystis sp. PCC 6803
28% identity, 38% coverage

ABL09968.1 TDP-L-Rha: CBS000020 α-L-rhamnosyltransferase (aranciamycin synthase) (AraGT;Orf21) (EC 2.4.1.-) (see protein)
46% identity, 12% coverage

SACE_3599 antibiotic resistance macrolide glycosyltransferase from Saccharopolyspora erythraea NRRL 2338
32% identity, 27% coverage

AAS20331.1 β-olivosyltransferase (LndGT1) (EC 2.4.1.-) (see protein)
35% identity, 15% coverage

EGT_NPVAC / P18569 Ecdysteroid UDP-glucosyltransferase; EC 2.4.1.- from Autographa californica nuclear polyhedrosis virus (AcMNPV) (see paper)
AAA69845.1 UDP-Glc: ecdysteroid glucosyltransferase (Egt;UGT21A1) (EC 2.4.1.-) (see protein)
32% identity, 19% coverage

cloM / Q8GHC2 L-demethylnoviosyl:clorobiocic acid transferase (EC 2.4.1.302) from Streptomyces roseochromogenus subsp. oscitans (see paper)
32% identity, 25% coverage

AAD13555.1 olivosyltransferase (LanGT1) (EC 2.4.1.-) (see protein)
32% identity, 19% coverage

NP_047420 UDP-Glucosyl Transferase from Bombyx mori nucleopolyhedrovirus
36% identity, 15% coverage

2iyaA / Q3HTL7 The crystal structure of macrolide glycosyltransferases: a blueprint for antibiotic engineering (see paper)
33% identity, 27% coverage

oleI / Q3HTL7 oleandomycin glycosyltransferase from Streptomyces antibioticus (see 3 papers)
ABA42118.2 oleandomycin glycosyltransferase (OleI) (EC 2.4.1.-) (see protein)
33% identity, 26% coverage

LRR80_00495 nucleotide disphospho-sugar-binding domain-containing protein from Streptomyces sp. RO-S4
47% identity, 12% coverage

CAC37814.1 mycarosyltransferase (MegBV) (EC 2.4.1.-) (see protein)
33% identity, 34% coverage

asm25 / Q8KUH5 ansamitocin N-glucosyltransferase from Actinosynnema pretiosum subsp. auranticum (see 4 papers)
AAM54103.1 ansamitocin N-β-glucosyltransferase (Asm25) (EC 2.4.1.-) (see protein)
30% identity, 44% coverage

UTI89_C1122 putative glucosyltransferase from Escherichia coli UTI89
O2ColV53 IroB from Escherichia coli
38% identity, 19% coverage

c1254 Putative glucosyltransferase from Escherichia coli CFT073
49% identity, 12% coverage

NRG857_30008 salmochelin biosynthesis C-glycosyltransferase IroB from Escherichia coli O83:H1 str. NRG 857C
38% identity, 19% coverage

iroB / A0A0H2V630 enterobactin C-glucosyltransferase (EC 2.4.1.369) from Escherichia coli O6:H1 (strain CFT073 / ATCC 700928 / UPEC) (see 3 papers)
IROB_ECOL6 / A0A0H2V630 Enterobactin C-glucosyltransferase; Ent C-glucosyltransferase; EC 2.4.1.369 from Escherichia coli O6:H1 (strain CFT073 / ATCC 700928 / UPEC) (see 3 papers)
A0A0H2V630 enterobactin C-glucosyltransferase (EC 2.4.1.369) from Escherichia coli O6:H1 (see 2 papers)
38% identity, 19% coverage

AAC12648.1 UDP-Glc: oleandomycin β-glucosyltransferase (OleI) (EC 2.4.1.-) (see protein)
33% identity, 26% coverage

novM / Q9L9F5 4-O-demethyl-L-noviosyl transferase (EC 2.4.1.302) from Streptomyces niveus (see paper)
NOVM_STRNV / Q9L9F5 L-demethylnoviosyl transferase; Novobiocin biosynthesis protein M; EC 2.4.1.302 from Streptomyces niveus (Streptomyces spheroides) (see 3 papers)
32% identity, 26% coverage

t2668 putative glycosyl transferase from Salmonella enterica subsp. enterica serovar Typhi Ty2
38% identity, 21% coverage

RJF2_RS26160 salmochelin biosynthesis C-glycosyltransferase IroB from Klebsiella pneumoniae subsp. pneumoniae
40% identity, 19% coverage

ABB52547.1 TDP-D-chalcose: macrolide chalcosyltransferase / TDP-D-desosamine: macrolide D-desosaminyltransferase (GerTII) (EC 2.4.1.-) (see protein)
37% identity, 20% coverage

elmGT / Q9F2F9 8-demethyltetracenomycin C rhamnosyltransferase (EC 2.4.1.331) from Streptomyces olivaceus (see 2 papers)
ELMGT_STROV / Q9F2F9 Elloramycin glycosyltransferase ElmGT; EC 2.4.1.331 from Streptomyces olivaceus (see 2 papers)
Q9F2F9 8-demethyltetracenomycin C L-rhamnosyltransferase (EC 2.4.1.331) from Streptomyces olivaceus (see 5 papers)
CAC16413.2 dTDP-L-Rha: 8-demethyl-tetracenomycin C α-L-rhamnosyltransferase / elloramycin glycosyltransferase (ElmGT) (EC 2.4.1.-) (see protein)
elmgt / CAC16413.2 elloramycin glycosyltransferase from Streptomyces olivaceus (see 2 papers)
44% identity, 15% coverage

STM2773 putative glycosyl transferase, related to UDP-glucuronosyltransferase from Salmonella typhimurium LT2
47% identity, 12% coverage

3tsaB / Q9ALM8 Spinosyn rhamnosyltransferase spng (see paper)
38% identity, 25% coverage

spnG / Q9ALM8 spinosyn rhamnosyltransferase subunit from Saccharopolyspora spinosa (see paper)
AAG23268.1 TDP-β-L-Rha: spinosyn 9-O-α-L-rhamnosyltransferase (SpnG) (EC 2.4.1.-) (see protein)
38% identity, 25% coverage

megDI / Q9F839 dTDP-L-megosamine:erythromycin C L-megosaminyltransferase from Micromonospora megalomicea subsp. nigra (see 2 papers)
CAC37807.1 rhodosaminyltransferase (MegDI) (EC 2.4.1.-) (see protein)
34% identity, 22% coverage

ADU85989.1 tiacumicin 2-O-methyl-d-rhamnosyltransferase (TiaG2) (EC 2.4.1.-) (see protein)
30% identity, 34% coverage

YP_009051683 polyprotein from Phomopsis longicolla hypovirus
25% identity, 4% coverage

For advice on how to use these tools together, see Interactive tools for functional annotation of bacterial genomes.

Statistics

The PaperBLAST database links 798,070 different protein sequences to 1,261,478 scientific articles. Searches against EuropePMC were last performed on May 12 2025.

How It Works

PaperBLAST builds a database of protein sequences that are linked to scientific articles. These links come from automated text searches against the articles in EuropePMC and from manually-curated information from GeneRIF, UniProtKB/Swiss-Prot, BRENDA, CAZy (as made available by dbCAN), BioLiP, CharProtDB, MetaCyc, EcoCyc, TCDB, REBASE, the Fitness Browser, and a subset of the European Nucleotide Archive with the /experiment tag. Given this database and a protein sequence query, PaperBLAST uses protein-protein BLAST to find similar sequences with E < 0.001.
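The E-value cutoff and the identity/coverage figures shown for each hit can be illustrated with a minimal Python sketch. It assumes BLAST's standard 12-column tabular output and defines coverage as the aligned fraction of the query; both are assumptions for illustration, not a description of PaperBLAST's own code. The names Hit, parse_tabular, filter_hits and query_coverage are hypothetical, and the two example rows are invented.

```python
from dataclasses import dataclass

E_VALUE_CUTOFF = 1e-3  # threshold stated in the text: E < 0.001


@dataclass
class Hit:
    subject: str
    identity: float   # percent identity over the alignment
    query_start: int  # 1-based, inclusive
    query_end: int    # 1-based, inclusive
    evalue: float


def parse_tabular(lines):
    """Parse rows of BLAST's standard 12-column tabular output (assumed format)."""
    hits = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        f = line.split("\t")
        hits.append(Hit(f[1], float(f[2]), int(f[6]), int(f[7]), float(f[10])))
    return hits


def filter_hits(hits, cutoff=E_VALUE_CUTOFF):
    """Keep only hits whose E-value is strictly below the cutoff."""
    return [h for h in hits if h.evalue < cutoff]


def query_coverage(hit, query_length):
    """Percent of the query covered by the alignment (assumed definition)."""
    return 100.0 * (hit.query_end - hit.query_start + 1) / query_length


# Invented example rows, not real PaperBLAST output.
example = [
    "query\thitA\t90.0\t408\t41\t0\t1\t408\t1\t408\t1e-200\t700",
    "query\thitB\t25.0\t40\t30\t0\t10\t49\t5\t44\t0.5\t20.1",
]
kept = filter_hits(parse_tabular(example))
for h in kept:
    print(h.subject, f"{h.identity:.0f}% identity, {query_coverage(h, 408):.0f}% coverage")
```

Only the first row passes the cutoff here; the second is dropped because its E-value of 0.5 is not below 0.001.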

To build the database, we query EuropePMC with locus tags, with RefSeq protein identifiers, and with UniProt accessions. We obtain the locus tags from RefSeq or from MicrobesOnline. We use queries of the form "locus_tag AND genus_name" to try to ensure that the paper is actually discussing that gene. Because EuropePMC indexes most recent biomedical papers, even if they are not open access, some of the links may be to papers that you cannot read or that our computers cannot read. We query each of these identifiers that appears in the open access part of EuropePMC, as well as every locus tag that appears in the 500 most-referenced genomes, so that a gene may appear in the PaperBLAST results even though none of the papers that mention it are open access. We also incorporate text-mined links from EuropePMC that link open access articles to UniProt or RefSeq identifiers. (This yields some additional links because EuropePMC uses different heuristics for their text mining than we do.)
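
As a rough illustration of the "locus_tag AND genus_name" query form, the sketch below builds a query string and a search URL without sending any request. The endpoint and the parameter names (`query`, `format`, `pageSize`) are assumptions about Europe PMC's public REST search service and should be checked against its documentation. The function names are hypothetical, and this is not how PaperBLAST itself issues its queries.

```python
from urllib.parse import urlencode

# Assumed base URL of the Europe PMC REST search service.
EUROPEPMC_SEARCH = "https://www.ebi.ac.uk/europepmc/webservices/rest/search"


def build_query(locus_tag: str, genus_name: str) -> str:
    """Combine a locus tag with a genus name so that hits are more likely
    to discuss the gene in the intended organism."""
    return f"{locus_tag} AND {genus_name}"


def build_search_url(locus_tag: str, genus_name: str, page_size: int = 25) -> str:
    """Return a search URL for the query; no network access is performed."""
    params = {
        "query": build_query(locus_tag, genus_name),
        "format": "json",
        "pageSize": page_size,
    }
    return f"{EUROPEPMC_SEARCH}?{urlencode(params)}"


# STM2773 is a Salmonella locus tag that appears in the hit list on this page.
print(build_query("STM2773", "Salmonella"))
print(build_search_url("STM2773", "Salmonella"))
```

Pairing the locus tag with the genus name reduces false matches for tags that coincide with unrelated strings, such as catalogue numbers or strain names, in papers about other organisms.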

For every article that mentions a locus tag, a RefSeq protein identifier, or a UniProt accession, we try to select one or two snippets of text that refer to the protein. If we cannot get access to the full text, we try to select a snippet from the abstract, but unfortunately, unique identifiers such as locus tags are rarely provided in abstracts.
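
The snippet selection described above might look something like the following sketch. The actual heuristics are not given in this text, so everything here is an assumption: the naive sentence splitting, the whole-token matching, the limit of two snippets, and the hypothetical function names `select_snippets` and `snippets_for_article`. The example article text is invented.

```python
import re
from typing import List, Optional


def select_snippets(text: str, identifier: str,
                    max_snippets: int = 2, max_chars: int = 300) -> List[str]:
    """Return up to max_snippets sentences that mention the identifier."""
    # Naive sentence split: break after '.', '!' or '?' followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    # Match the identifier as a whole token, so STM2773 does not match STM27730.
    pattern = re.compile(
        r"(?<![A-Za-z0-9_])" + re.escape(identifier) + r"(?![A-Za-z0-9_])"
    )
    snippets = []
    for sentence in sentences:
        if pattern.search(sentence):
            snippets.append(sentence[:max_chars])
            if len(snippets) == max_snippets:
                break
    return snippets


def snippets_for_article(identifier: str, full_text: Optional[str],
                         abstract: Optional[str]) -> List[str]:
    """Prefer the full text; fall back to the abstract if it is unavailable."""
    source = full_text if full_text else (abstract or "")
    return select_snippets(source, identifier)


# Invented text for illustration only.
example = (
    "We studied glycosyltransferases. Deletion of STM2773 reduced growth. "
    "STM27730 was unaffected. The STM2773 mutant was complemented. "
    "STM2773 is conserved."
)
print(snippets_for_article("STM2773", example, None))
```

When only an abstract is available, this kind of search often returns nothing, which is consistent with the remark above that locus tags are rarely given in abstracts.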

PaperBLAST also incorporates manually-curated protein functions:

Except for GeneRIF and ENA, the curated entries include a short curated description of the protein's function. For entries from BioLiP, the protein's function may not be known beyond binding to the ligand. Many of these entries also link to articles in PubMed.

For more information see the PaperBLAST paper (mSystems 2017) or the code. You can download PaperBLAST's database here.

Changes to PaperBLAST since the paper was written:

Many of these changes are described in Interactive tools for functional annotation of bacterial genomes.

Secrets

PaperBLAST cannot provide snippets for many of the papers that are published in non-open-access journals. This limitation applies even if the paper is marked as "free" on the publisher's web site and is available in PubMed Central or EuropePMC. If a journal that you publish in is marked as "secret," please consider publishing elsewhere.

Omissions from the PaperBLAST Database

Many important articles are missing from PaperBLAST, either because the article's full text is not in EuropePMC (as for many older articles), or because the paper does not mention a protein identifier such as a locus tag, or because of PaperBLAST's heuristics. If you notice an article that characterizes a protein's function but is missing from PaperBLAST, please notify the curators at UniProt or add an entry to GeneRIF. Entries in either of these databases will eventually be incorporated into PaperBLAST. Note that to add an entry to UniProt, you will need to find the UniProt identifier for the protein. If the protein is not already in UniProt, you can ask them to create an entry. To add an entry to GeneRIF, you will need an NCBI Gene identifier, but unfortunately many prokaryotic proteins in RefSeq do not have corresponding Gene identifiers.

References

PaperBLAST: Text-mining papers for information about homologs.
M. N. Price and A. P. Arkin (2017). mSystems, 10.1128/mSystems.00039-17.

Europe PMC in 2017.
M. Levchenko et al (2017). Nucleic Acids Research, 10.1093/nar/gkx1005.

Gene indexing: characterization and analysis of NLM's GeneRIFs.
J. A. Mitchell et al (2003). AMIA Annu Symp Proc 2003:460-464.

UniProt: the universal protein knowledgebase.
The UniProt Consortium (2016). Nucleic Acids Research, 10.1093/nar/gkw1099.

BRENDA in 2017: new perspectives and new tools in BRENDA.
S. Placzek et al (2017). Nucleic Acids Research, 10.1093/nar/gkw952.

The EcoCyc database: reflecting new knowledge about Escherichia coli K-12.
I. M. Keseler et al (2016). Nucleic Acids Research, 10.1093/nar/gkw1003.

The MetaCyc database of metabolic pathways and enzymes.
R. Caspi et al (2018). Nucleic Acids Research, 10.1093/nar/gkx935.

CharProtDB: a database of experimentally characterized protein annotations.
R. Madupu et al (2012). Nucleic Acids Research, 10.1093/nar/gkr1133.

The carbohydrate-active enzymes database (CAZy) in 2013.
V. Lombard et al (2014). Nucleic Acids Research, 10.1093/nar/gkt1178.

The Transporter Classification Database (TCDB): recent advances.
M. H. Saier, Jr. et al (2016). Nucleic Acids Research, 10.1093/nar/gkv1103.

REBASE - a database for DNA restriction and modification: enzymes, genes and genomes.
R. J. Roberts et al (2015). Nucleic Acids Research, 10.1093/nar/gku1046.

Deep annotation of protein function across diverse bacteria from mutant phenotypes.
M. N. Price et al (2016). bioRxiv, 10.1101/072470.

by Morgan Price, Arkin group
Lawrence Berkeley National Laboratory