PaperBLAST – Find papers about a protein or its homologs

PaperBLAST Hits for 93 a.a. (GPRTRKLKKK...)

Other sequence analysis tools:

Find functional residues: SitesBLAST

Search for conserved domains

Find the best match in UniProt

Compare to protein structures

Predict transmembrane helices: Phobius

Predict protein localization: PSORTb

Find homologs in fast.genomics


Found 251 similar proteins in the literature:

HME1_MOUSE / P09065 Homeobox protein engrailed-1; Homeobox protein en-1; Mo-En-1 from Mus musculus (Mouse) (see paper)
NP_034263 homeobox protein engrailed-1 from Mus musculus
100% identity, 23% coverage

HME1_HUMAN / Q05925 Homeobox protein engrailed-1; Homeobox protein en-1; Hu-En-1 from Homo sapiens (Human) (see paper)
NP_001417 homeobox protein engrailed-1 from Homo sapiens
100% identity, 24% coverage

P09015 Homeobox protein engrailed-2a from Danio rerio
NP_571119 homeobox protein engrailed-2a from Danio rerio
87% identity, 35% coverage

NP_001095213 homeobox protein engrailed-2-A from Xenopus laevis
88% identity, 35% coverage

P52730 Homeobox protein engrailed-2-B from Xenopus laevis
87% identity, 35% coverage

NP_571120 homeobox protein engrailed-1a from Danio rerio
83% identity, 40% coverage

NP_034264 homeobox protein engrailed-2 isoform 1 from Mus musculus
P09066 Homeobox protein engrailed-2 from Mus musculus
88% identity, 29% coverage

NP_001418 homeobox protein engrailed-2 from Homo sapiens
P19622 Homeobox protein engrailed-2 from Homo sapiens
88% identity, 28% coverage

Q05917 Homeobox protein engrailed-2 from Gallus gallus
NP_001254648 homeobox protein engrailed-2 from Gallus gallus
87% identity, 32% coverage

Q05916 Homeobox protein engrailed-1 from Gallus gallus
96% identity, 23% coverage

HMEN_DROME / P02836 Segmentation polarity homeobox protein engrailed from Drosophila melanogaster (Fruit fly) (see 2 papers)
NP_725059 engrailed, isoform B from Drosophila melanogaster
71% identity, 17% coverage

P27609 Segmentation polarity homeobox protein engrailed from Bombyx mori
71% identity, 25% coverage

NP_001037550 segmentation polarity homeobox protein engrailed from Bombyx mori
71% identity, 25% coverage

NP_001034511 engrailed from Tribolium castaneum
68% identity, 28% coverage

P27610 Homeobox protein invected from Bombyx mori
67% identity, 20% coverage

Q9U0S0 Homeobox protein engrailed-like from Periplaneta americana
72% identity, 79% coverage

HMIN_DROME / P05527 Homeobox protein invected from Drosophila melanogaster (Fruit fly) (see 2 papers)
NP_725057 invected, isoform D from Drosophila melanogaster
74% identity, 14% coverage

HMEN_LYMST / A9ZPC9 Homeobox protein engrailed; Lsten from Lymnaea stagnalis (Great pond snail) (Helix stagnalis) (see paper)
70% identity, 12% coverage

Smp_145200 putative engrailed from Schistosoma mansoni
63% identity, 31% coverage

HM16_CAEEL / P34326 Homeobox protein engrailed-like ceh-16 from Caenorhabditis elegans (see 2 papers)
62% identity, 41% coverage

6m3dC / P02836 X-ray crystal structure of tandemly connected engrailed homeodomains (ehd) with r53a mutations and DNA complex (see paper)
72% identity, 53% coverage

HM12_CAEEL / P17487 Homeobox protein ceh-12 from Caenorhabditis elegans (see 3 papers)
57% identity, 33% coverage

4qtrC Computational design of co-assembling protein-DNA nanowires (see paper)
61% identity, 61% coverage

NKX11_MOUSE / G3UXB3 NK1 transcription factor-related protein 1; Homeobox protein SAX-2; NKX-1.1 from Mus musculus (Mouse) (see 3 papers)
NP_035450 NK1 transcription factor-related protein 1 from Mus musculus
46% identity, 16% coverage

VAX1_DANRE / Q801E0 Ventral anterior homeobox 1 from Danio rerio (Zebrafish) (Brachydanio rerio) (see paper)
55% identity, 19% coverage

NP_001027781 hox4 protein from Ciona intestinalis
52% identity, 19% coverage

BARH1_DROME / Q24255 Homeobox protein B-H1; Homeobox protein BarH1 from Drosophila melanogaster (Fruit fly) (see 5 papers)
NP_523387 BarH1, isoform A from Drosophila melanogaster
41% identity, 14% coverage

EGR_00014 Homeobox protein Hox-B4a from Echinococcus granulosus
54% identity, 14% coverage

VAX2_DANRE / Q801E1 Ventral anterior homeobox 2 from Danio rerio (Zebrafish) (Brachydanio rerio) (see paper)
NP_919390 ventral anterior homeobox 2 from Danio rerio
56% identity, 19% coverage

XP_015154923 homeobox protein Hox-B4 isoform X1 from Gallus gallus
52% identity, 23% coverage

NP_919391 ventral anterior homeobox 1 from Danio rerio
55% identity, 19% coverage

DFD_DROME / P07548 Homeotic protein deformed from Drosophila melanogaster (Fruit fly) (see 3 papers)
NP_477201 deformed from Drosophila melanogaster
52% identity, 11% coverage

XP_006520524 homeobox protein Hox-C4 isoform X1 from Mus musculus
Q08624 Homeobox protein Hox-C4 from Mus musculus
51% identity, 26% coverage

NP_055435 homeobox protein Hox-C4 from Homo sapiens
P09017 Homeobox protein Hox-C4 from Homo sapiens
51% identity, 26% coverage

HXB4A_DANRE / P22574 Homeobox protein Hox-B4a; Hox-B4; Homeobox protein Zf-13 from Danio rerio (Zebrafish) (Brachydanio rerio) (see 2 papers)
NP_571193 homeobox protein Hox-B4a from Danio rerio
52% identity, 26% coverage

P17483 Homeobox protein Hox-B4 from Homo sapiens
NP_076920 homeobox protein Hox-B4 from Homo sapiens
52% identity, 26% coverage

HXA3_MOUSE / P02831 Homeobox protein Hox-A3; Homeobox protein Hox-1.5; Homeobox protein MO-10 from Mus musculus (Mouse) (see paper)
NP_034582 homeobox protein Hox-A3 from Mus musculus
50% identity, 15% coverage

VAX2A_XENLA / Q9PU20 Ventral anterior homeobox 2a; Xvax2 from Xenopus laevis (African clawed frog) (see 3 papers)
XP_018088768 ventral anterior homeobox 2a isoform X1 from Xenopus laevis
56% identity, 19% coverage

NP_001037341 transcription factor deformed from Bombyx mori
51% identity, 17% coverage

NP_001094257 homeobox protein Hox-B4 from Rattus norvegicus
52% identity, 26% coverage

Q00056 Homeobox protein Hox-A4 from Homo sapiens
NP_002132 homeobox protein Hox-A4 from Homo sapiens
50% identity, 21% coverage

O43365 Homeobox protein Hox-A3 from Homo sapiens
50% identity, 15% coverage

HXC4A_DANRE / Q9PWM3 Homeobox protein Hox-C4a; Hox-C4 from Danio rerio (Zebrafish) (Brachydanio rerio) (see paper)
48% identity, 27% coverage

XP_003640745 homeobox protein Hox-A1 from Gallus gallus
46% identity, 22% coverage

NP_034589 homeobox protein Hox-B4 from Mus musculus
52% identity, 26% coverage

P81192 Homeobox protein Hox-A4 (Fragment) from Lineus sanguineus
51% identity, 73% coverage

NKX11_HUMAN / Q15270 NK1 transcription factor-related protein 1; Homeobox protein 153; HPX-153; Homeobox protein SAX-2; NKX-1.1 from Homo sapiens (Human) (see paper)
46% identity, 16% coverage

XP_006532353 homeobox protein Hox-B3 isoform X1 from Mus musculus
55% identity, 15% coverage

NP_001162367 homeobox protein Hox-A4 from Papio anubis
50% identity, 21% coverage

P14651 Homeobox protein Hox-B3 from Homo sapiens
55% identity, 15% coverage

XP_017452803 homeobox protein Hox-B3 isoform X1 from Rattus norvegicus
55% identity, 15% coverage

P06798 Homeobox protein Hox-A4 from Mus musculus
50% identity, 24% coverage

Q00444 polo kinase (EC 2.7.11.21) from Homo sapiens (see paper)
NP_061826 homeobox protein Hox-C5 from Homo sapiens
50% identity, 32% coverage

HXD4_HUMAN / P09016 Homeobox protein Hox-D4; Homeobox protein HHO.C13; Homeobox protein Hox-4B; Homeobox protein Hox-5.1 from Homo sapiens (Human) (see paper)
52% identity, 25% coverage

NP_032291 homeobox protein Hox-A4 from Mus musculus
50% identity, 24% coverage

HXC5A_DANRE / P09074 Homeobox protein Hox-C5a; Hox-C5; Homeobox protein Hox-3.4; Homeobox protein Zf-25 from Danio rerio (Zebrafish) (Brachydanio rerio) (see paper)
51% identity, 29% coverage

NP_001080293 homeobox A3 S homeolog from Xenopus laevis
50% identity, 16% coverage

NP_034579 homeobox protein Hox-A1 isoform 1 from Mus musculus
46% identity, 21% coverage

HXA1_MOUSE / P09022 Homeobox protein Hox-A1; Early retinoic acid 1; Homeobox protein Hox-1.6; Homeoboxless protein ERA-1-399; Homeotic protein ERA-1-993 from Mus musculus (Mouse) (see 2 papers)
46% identity, 22% coverage

XP_003134892 homeobox protein Hox-A1 isoform X1 from Sus scrofa
46% identity, 21% coverage

HXD4_MOUSE / P10628 Homeobox protein Hox-D4; Homeobox protein Hox-4.2; Homeobox protein Hox-5.1 from Mus musculus (Mouse) (see paper)
NP_034599 homeobox protein Hox-D4 from Mus musculus
52% identity, 26% coverage

NP_001101586 homeobox protein Hox-C5 from Rattus norvegicus
NP_783857 homeobox protein Hox-C5 from Mus musculus
P32043 Homeobox protein Hox-C5 from Mus musculus
50% identity, 32% coverage

F8VXG0 Homeobox B3 from Homo sapiens
55% identity, 22% coverage

GSX2_HUMAN / Q9BZM3 GS homeobox 2; Genetic-screened homeobox 2; Homeobox protein GSH-2 from Homo sapiens (Human) (see paper)
NP_573574 GS homeobox 2 from Homo sapiens
54% identity, 19% coverage

VAX1B_XENLA / Q9DDB0 Ventral anterior homeobox 1b from Xenopus laevis (African clawed frog) (see paper)
55% identity, 21% coverage

LOC107447988 homeobox protein engrailed-like ceh-16 from Parasteatoda tepidariorum
53% identity, 24% coverage

GSX2_MOUSE / P31316 GS homeobox 2; Genetic-screened homeobox 2; Homeobox protein GSH-2 from Mus musculus (Mouse) (see 2 papers)
NP_573555 GS homeobox 2 from Mus musculus
54% identity, 19% coverage

VAX2_HUMAN / Q9UIW0 Ventral anterior homeobox 2 from Homo sapiens (Human) (see paper)
56% identity, 20% coverage

HXA1_HUMAN / P49639 Homeobox protein Hox-A1; Homeobox protein Hox-1F from Homo sapiens (Human) (see 2 papers)
46% identity, 21% coverage

HXD3_MOUSE / P09027 Homeobox protein Hox-D3; Homeobox protein Hox-4.1; Homeobox protein MH-19 from Mus musculus (Mouse) (see paper)
NP_034598 homeobox protein Hox-D3 from Mus musculus
56% identity, 13% coverage

P31249 Homeobox protein Hox-D3 from Homo sapiens
56% identity, 13% coverage

VAX1A_XENLA / O93528 Ventral anterior homeobox 1a from Xenopus laevis (African clawed frog) (see paper)
53% identity, 20% coverage

NP_077365 homeobox protein Hox-A5 from Rattus norvegicus
52% identity, 17% coverage

NP_001123388 homeobox protein vex1 from Xenopus tropicalis
46% identity, 25% coverage

NKX12_MOUSE / P42580 NK1 transcription factor-related protein 2; Homeobox protein SAX-1; NKX-1.1 from Mus musculus (Mouse) (see 3 papers)
NP_033149 NK1 transcription factor-related protein 2 from Mus musculus
56% identity, 19% coverage

LOC107440711 homeobox protein Hox-A1 from Parasteatoda tepidariorum
54% identity, 15% coverage

HXD4A_DANRE / O57374 Homeobox protein Hox-D4a; Hox-D4 from Danio rerio (Zebrafish) (Brachydanio rerio) (see paper)
54% identity, 28% coverage

VAX1_MOUSE / Q2NKI2 Ventral anterior homeobox 1 from Mus musculus (Mouse) (see 4 papers)
55% identity, 18% coverage

VAX1_HUMAN / Q5SQQ9 Ventral anterior homeobox 1 from Homo sapiens (Human) (see paper)
55% identity, 18% coverage

NP_997995 NK1 transcription factor related 2-like,a from Danio rerio
45% identity, 29% coverage

HM19_CAEEL / P26797 Homeobox protein ceh-19 from Caenorhabditis elegans (see paper)
NP_001023142 Homeobox protein ceh-19 from Caenorhabditis elegans
45% identity, 37% coverage

ZEN2_DROME / P09090 Protein zerknuellt 2; ZEN-2 from Drosophila melanogaster (Fruit fly) (see paper)
44% identity, 29% coverage

HXD1_XENLA / Q08820 Homeobox protein Hox-D1; Hox.lab1; Labial protein; Xlab from Xenopus laevis (African clawed frog) (see 2 papers)
54% identity, 18% coverage

XP_004004611 homeobox protein Hox-D1 from Ovis aries
41% identity, 23% coverage

E7FEH1 Homeobox protein MOX-2 from Danio rerio
47% identity, 21% coverage

XP_039290045 uncharacterized protein LOC120352641 from Nilaparvata lugens
46% identity, 13% coverage

NP_571609 homeobox protein Hox-A3a from Danio rerio
56% identity, 14% coverage

LAB_DROME / P10105 Homeotic protein labial; F24; F90-2 from Drosophila melanogaster (Fruit fly) (see 2 papers)
55% identity, 10% coverage

XP_018124725 homeobox protein Hox-A1 from Xenopus laevis
46% identity, 22% coverage

HXA5_MOUSE / P09021 Homeobox protein Hox-A5; Homeobox protein Hox-1.3; Homeobox protein M2 from Mus musculus (Mouse) (see paper)
NP_034583 homeobox protein Hox-A5 from Mus musculus
49% identity, 27% coverage

HXA5_HUMAN / P20719 Homeobox protein Hox-A5; Homeobox protein Hox-1C from Homo sapiens (Human) (see 2 papers)
NP_061975 homeobox protein Hox-A5 from Homo sapiens
47% identity, 29% coverage

NP_476613 labial, isoform A from Drosophila melanogaster
55% identity, 10% coverage

VAX1_CHICK / Q9PVN2 Ventral anterior homeobox 1 from Gallus gallus (Chicken) (see paper)
55% identity, 18% coverage

NP_990130 ventral anterior homeobox 1 from Gallus gallus
55% identity, 18% coverage

Lox1 / AAA19914.1 Lox1 protein from Hirudo medicinalis (see paper)
46% identity, 18% coverage

ROUGH_DROME / P10181 Homeobox protein rough from Drosophila melanogaster (Fruit fly) (see 4 papers)
NP_733173 rough, isoform B from Drosophila melanogaster
57% identity, 15% coverage

XP_006532343 homeobox protein Hox-B1 isoform X1 from Mus musculus
54% identity, 16% coverage

NP_523670 even skipped from Drosophila melanogaster
P06602 Segmentation protein even-skipped from Drosophila melanogaster
53% identity, 15% coverage

P18488 Homeotic protein empty spiracles from Drosophila melanogaster
47% identity, 12% coverage

XP_006630061 GS homeobox 2 from Lepisosteus oculatus
56% identity, 24% coverage

NP_571200 homeobox protein Hox-D3a from Danio rerio
56% identity, 14% coverage

HESX1_MOUSE / Q61658 Homeobox expressed in ES cells 1; Anterior-restricted homeobox protein; Homeobox protein ANF; Rathke pouch homeo box from Mus musculus (Mouse) (see 2 papers)
46% identity, 38% coverage

MEOX2_MOUSE / P32443 Homeobox protein MOX-2; Growth arrest-specific homeobox; Mesenchyme homeobox 2 from Mus musculus (Mouse) (see 6 papers)
NP_032610 homeobox protein MOX-2 from Mus musculus
43% identity, 24% coverage

SCR_DROME / P09077 Homeotic protein Sex combs reduced from Drosophila melanogaster (Fruit fly) (see 2 papers)
NP_524248 Sex combs reduced, isoform A from Drosophila melanogaster
NP_996165 Sex combs reduced, isoform B from Drosophila melanogaster
54% identity, 14% coverage

HXB1B_DANRE / Q90423 Homeobox protein Hox-B1b; Homeobox protein Hox-A1 from Danio rerio (Zebrafish) (Brachydanio rerio) (see paper)
46% identity, 23% coverage

NKX11_DANRE / Q75W95 NK1 transcription factor-related protein 1; Homeodomain protein Sax2; NK1 transcription factor related 2-like,b from Danio rerio (Zebrafish) (Brachydanio rerio) (see paper)
NP_998713 NK1 transcription factor-related protein 1 from Danio rerio
44% identity, 19% coverage

MEOX2_RAT / P39020 Homeobox protein MOX-2; Growth arrest-specific homeobox; Mesenchyme homeobox 2 from Rattus norvegicus (Rat) (see paper)
47% identity, 25% coverage

NP_001005427 homeobox protein MOX-2 from Gallus gallus
43% identity, 25% coverage

NP_034597 homeobox protein Hox-D1 from Mus musculus
42% identity, 22% coverage

NP_571355 homeobox protein EMX2 from Danio rerio
48% identity, 24% coverage

NP_005184 homeobox protein CDX-4 from Homo sapiens
46% identity, 25% coverage

VAX2_MOUSE / Q9WTP9 Ventral anterior homeobox 2; Ventral retina homeodomain protein from Mus musculus (Mouse) (see 5 papers)
Q14B19 Ventral anterior homeobox containing gene 2 from Mus musculus
NP_036042 ventral anterior homeobox 2 from Mus musculus
56% identity, 20% coverage

NP_001027665 Hox 5 from Ciona intestinalis
46% identity, 33% coverage

P23683 Homeobox even-skipped homolog protein 1 from Mus musculus
NP_031992 homeobox even-skipped homolog protein 1 from Mus musculus
51% identity, 14% coverage

NP_001980 homeobox even-skipped homolog protein 1 isoform 1 from Homo sapiens
P49640 Homeobox even-skipped homolog protein 1 from Homo sapiens
51% identity, 14% coverage

NP_001009885 motor neuron and pancreas homeobox protein 1 from Danio rerio
48% identity, 19% coverage

CG34031 uncharacterized protein from Drosophila melanogaster
48% identity, 26% coverage

XP_006526827 pituitary homeobox 3 isoform X1 from Mus musculus
43% identity, 19% coverage

NP_058845 homeobox protein MOX-2 from Rattus norvegicus
43% identity, 24% coverage

NP_731868 empty spiracles from Drosophila melanogaster
47% identity, 12% coverage

NP_001258382 homeobox protein GBX-1 from Rattus norvegicus
49% identity, 14% coverage

XP_017894255 homeobox protein Hox-D1 from Capra hircus
40% identity, 23% coverage

NP_001009886 motor neuron and pancreas homeobox 2a from Danio rerio
45% identity, 21% coverage

NP_001009887 motor neuron and pancreas homeobox 2b isoform 2 from Danio rerio
48% identity, 18% coverage

NP_996087 intermediate neuroblasts defective from Drosophila melanogaster
53% identity, 18% coverage

MEOX2_HUMAN / P50222 Homeobox protein MOX-2; Growth arrest-specific homeobox; Mesenchyme homeobox 2 from Homo sapiens (Human) (see 6 papers)
NP_005915 homeobox protein MOX-2 from Homo sapiens
43% identity, 24% coverage

NP_648164 extra-extra from Drosophila melanogaster
48% identity, 11% coverage

EMX2_HUMAN / Q04743 Homeobox protein EMX2; Empty spiracles homolog 2; Empty spiracles-like protein 2 from Homo sapiens (Human) (see paper)
NP_004089 homeobox protein EMX2 isoform 1 from Homo sapiens
48% identity, 24% coverage

EMX2_MOUSE / Q04744 Homeobox protein EMX2; Empty spiracles homolog 2; Empty spiracles-like protein 2 from Mus musculus (Mouse) (see 4 papers)
48% identity, 24% coverage

XP_025007825 homeobox protein EMX2 isoform X1 from Gallus gallus
48% identity, 24% coverage

MNX1_HUMAN / P50219 Motor neuron and pancreas homeobox protein 1; Homeobox protein HB9 from Homo sapiens (Human) (see 4 papers)
NP_005506 motor neuron and pancreas homeobox protein 1 isoform 1 from Homo sapiens
48% identity, 14% coverage

MNX1_MOUSE / Q9QZW9 Motor neuron and pancreas homeobox protein 1; Homeobox protein HB9 from Mus musculus (Mouse) (see 2 papers)
48% identity, 14% coverage

MNX1_RAT / M0R6D8 Motor neuron and pancreas homeobox 1; Homeobox protein HB9 from Rattus norvegicus (Rat) (see paper)
NP_001258203 motor neuron and pancreas homeobox 1 from Rattus norvegicus
48% identity, 14% coverage

NP_571518 pancreas/duodenum homeobox protein 1 from Danio rerio
57% identity, 24% coverage

XP_793141 homeobox protein Hox-A7 from Strongylocentrotus purpuratus
53% identity, 19% coverage

NP_064328 motor neuron and pancreas homeobox protein 1 from Mus musculus
48% identity, 14% coverage

Smp_134690 putative emx homeobox protein from Schistosoma mansoni
49% identity, 19% coverage

HXB3A_DANRE / O42368 Homeobox protein Hox-B3a; Hox-B3 from Danio rerio (Zebrafish) (Brachydanio rerio) (see paper)
56% identity, 14% coverage

NP_571192 homeobox protein Hox-B3a from Danio rerio
56% identity, 14% coverage

NP_990096 homeobox protein MOX-1 from Gallus gallus
42% identity, 30% coverage

NP_001107793 empty spiracles from Tribolium castaneum
48% identity, 20% coverage

XP_624481 homeotic protein empty spiracles from Apis mellifera
48% identity, 13% coverage

HXA4A_DANRE / Q9PWL5 Homeobox protein Hox-A4a; Homeobox protein Zf-26; Hoxx4 from Danio rerio (Zebrafish) (Brachydanio rerio) (see 2 papers)
49% identity, 26% coverage

HM02_CAEEL / G5ECT8 Homeobox protein ceh-2 from Caenorhabditis elegans (see paper)
48% identity, 29% coverage

XP_001927147 NK1 transcription factor-related protein 2 from Sus scrofa
56% identity, 18% coverage

HXB1_HUMAN / P14653 Homeobox protein Hox-B1; Homeobox protein Hox-2I from Homo sapiens (Human) (see paper)
NP_002135 homeobox protein Hox-B1 from Homo sapiens
54% identity, 18% coverage

BARH1_DROAN / P22544 Homeobox protein B-H1; Homeobox BarH1 protein from Drosophila ananassae (Fruit fly) (see 2 papers)
39% identity, 13% coverage

NP_571611 homeobox protein Hox-A1a from Danio rerio
44% identity, 22% coverage

NP_001102307 homeobox protein MOX-1 from Rattus norvegicus
53% identity, 23% coverage

GBX1_MOUSE / P82976 Homeobox protein GBX-1; Gastrulation and brain-specific homeobox protein 1 from Mus musculus (Mouse) (see paper)
NP_056554 homeobox protein GBX-1 from Mus musculus
49% identity, 14% coverage

NP_001017168 GS homeobox 2 from Xenopus tropicalis
54% identity, 23% coverage

NP_990399 homeobox protein GBX-2 from Gallus gallus
49% identity, 17% coverage

MEOX1_HUMAN / P50221 Homeobox protein MOX-1; Mesenchyme homeobox 1 from Homo sapiens (Human) (see 2 papers)
53% identity, 22% coverage

NP_694496 homeobox protein GBX-2 from Danio rerio
49% identity, 17% coverage

NP_001020683 GS homeobox 2 from Danio rerio
54% identity, 24% coverage

FTZ_DROME / P02835 Segmentation protein fushi tarazu from Drosophila melanogaster (Fruit fly) (see 4 papers)
NP_477498 fushi tarazu from Drosophila melanogaster
55% identity, 15% coverage

NP_001039254 GS homeobox 1 from Xenopus tropicalis
54% identity, 23% coverage

NP_777286 homeobox protein GBX-1 from Danio rerio
49% identity, 19% coverage

NP_078777 homeobox protein Hox-D1 from Homo sapiens
Q9GZZ0 Homeobox protein Hox-D1 from Homo sapiens
52% identity, 16% coverage

EMX1_HUMAN / Q04741 Homeobox protein EMX1; Empty spiracles homolog 1; Empty spiracles-like protein 1 from Homo sapiens (Human) (see 2 papers)
48% identity, 21% coverage

XP_006627824 GS homeobox 1 from Lepisosteus oculatus
52% identity, 25% coverage

NP_989490 homeobox protein DLX-5 from Gallus gallus
43% identity, 28% coverage

MEOX1_MOUSE / P32442 Homeobox protein MOX-1; Mesenchyme homeobox 1 from Mus musculus (Mouse) (see 7 papers)
NP_034921 homeobox protein MOX-1 from Mus musculus
53% identity, 23% coverage

Q14549 Homeobox protein GBX-1 from Homo sapiens
49% identity, 16% coverage

P92067 Transcription factor homolog from Tribolium castaneum
52% identity, 22% coverage

NP_001135456 pancreas/duodenum homeobox protein 1 from Sus scrofa
55% identity, 21% coverage

NP_001153132 homeobox protein DLX-5 from Sus scrofa
43% identity, 27% coverage

NP_492586 Homeobox protein ceh-5 from Caenorhabditis elegans
49% identity, 43% coverage

HXB5B_DANRE / P09013 Homeobox protein Hox-B5b; Homeobox protein Zf-54; Hox-B5-like from Danio rerio (Zebrafish) (Brachydanio rerio) (see paper)
46% identity, 28% coverage

NP_937787 homeobox protein EMX1 from Danio rerio
48% identity, 26% coverage

P09079 Homeobox protein Hox-B5 from Mus musculus
NP_001178854 homeobox protein Hox-B5 from Rattus norvegicus
NP_032294 homeobox protein Hox-B5 from Mus musculus
46% identity, 29% coverage

NP_002138 homeobox protein Hox-B5 from Homo sapiens
46% identity, 29% coverage

HXB5A_DANRE / P09014 Homeobox protein Hox-B5a; Hox-B5; Homeobox protein Zf-21 from Danio rerio (Zebrafish) (Brachydanio rerio) (see 2 papers)
NP_571176 homeobox protein Hox-B5a from Danio rerio
46% identity, 28% coverage

XP_011675355 even-skipped-like protein isoform X1 from Strongylocentrotus purpuratus
49% identity, 13% coverage

DLX5_MOUSE / P70396 Homeobox protein DLX-5 from Mus musculus (Mouse) (see 4 papers)
43% identity, 27% coverage

GSX1_MOUSE / P31315 GS homeobox 1; Homeobox protein GSH-1 from Mus musculus (Mouse) (see paper)
NP_032204 GS homeobox 1 from Mus musculus
52% identity, 23% coverage

NP_001035091 homeobox protein MOX-1 isoform 3 from Homo sapiens
53% identity, 41% coverage

NP_571612 homeobox protein Hox-B5b from Danio rerio
46% identity, 28% coverage

NP_571324 homeobox even-skipped homolog protein 1 from Danio rerio
51% identity, 15% coverage

Q9H4S2 GS homeobox 1 from Homo sapiens
NP_663632 GS homeobox 1 from Homo sapiens
52% identity, 23% coverage

EMX1_MOUSE / Q04742 Homeobox protein EMX1; Empty spiracles homolog 1; Empty spiracles-like protein 1 from Mus musculus (Mouse) (see 4 papers)
NP_034261 homeobox protein EMX1 from Mus musculus
48% identity, 23% coverage

NP_001012251 GS homeobox 1 from Danio rerio
52% identity, 25% coverage

ZEN1_DROME / P09089 Protein zerknuellt 1; ZEN-1 from Drosophila melanogaster (Fruit fly) (see paper)
NP_476793 zerknullt from Drosophila melanogaster
58% identity, 16% coverage

UNPG_DROME / Q4V5A3 Homeobox protein unplugged from Drosophila melanogaster (Fruit fly) (see 2 papers)
NP_477146 unplugged from Drosophila melanogaster
50% identity, 13% coverage

NP_001098303 GS homeobox 1 from Oryzias latipes
47% identity, 28% coverage

NP_001122333 homeobox transcription factor Hox1 from Ciona intestinalis
56% identity, 15% coverage

NP_999815 homeobox protein Splox from Strongylocentrotus purpuratus
55% identity, 15% coverage

MEOX1_DANRE / F1Q4R9 Homeobox protein MOX-1; Mesenchyme homeobox 1; Protein choker from Danio rerio (Zebrafish) (Brachydanio rerio) (see paper)
NP_001002450 homeobox protein MOX-1 from Danio rerio
53% identity, 23% coverage

PITX3_MOUSE / O35160 Pituitary homeobox 3; Homeobox protein PITX3; Paired-like homeodomain transcription factor 3 from Mus musculus (Mouse) (see 6 papers)
NP_032878 pituitary homeobox 3 from Mus musculus
43% identity, 24% coverage

PITX3_RAT / P81062 Pituitary homeobox 3; Homeobox protein PITX3; Paired-like homeodomain transcription factor 3 from Rattus norvegicus (Rat) (see paper)
XP_006231540 pituitary homeobox 3 isoform X1 from Rattus norvegicus
43% identity, 24% coverage

NP_996161 proboscipedia, isoform D from Drosophila melanogaster
41% identity, 9% coverage

XP_010816072 homeobox expressed in ES cells 1 isoform X4 from Bos taurus
46% identity, 38% coverage

NP_001271400 pancreas/duodenum homeobox protein 1 from Canis lupus familiaris
55% identity, 21% coverage

D4AEG9 Homeobox expressed in ES cells 1 from Rattus norvegicus
45% identity, 38% coverage

NP_001080931 VENT homeobox 2, gene 2 L homeolog from Xenopus laevis
50% identity, 18% coverage

HXC1A_DANRE / Q98SH9 Homeobox protein Hox-C1a from Danio rerio (Zebrafish) (Brachydanio rerio) (see paper)
51% identity, 22% coverage

XP_015144551 homeobox protein SAX-1 from Gallus gallus
56% identity, 17% coverage

HESX1_HUMAN / Q9UBX0 Homeobox expressed in ES cells 1; Homeobox protein ANF; hAnf from Homo sapiens (Human) (see 6 papers)
XP_005265583 homeobox expressed in ES cells 1 isoform X1 from Homo sapiens
46% identity, 37% coverage

P50476 Homeobox protein XHOX-3 from Xenopus laevis
51% identity, 15% coverage

NP_001476 homeobox protein GBX-2 isoform 1 from Homo sapiens
49% identity, 17% coverage

CG18599 uncharacterized protein from Drosophila melanogaster
48% identity, 13% coverage

NP_037075 homeobox protein DLX-5 from Rattus norvegicus
43% identity, 27% coverage

PITX3_HUMAN / O75364 Pituitary homeobox 3; Homeobox protein PITX3; Paired-like homeodomain transcription factor 3 from Homo sapiens (Human) (see 2 papers)
NP_005020 pituitary homeobox 3 from Homo sapiens
43% identity, 24% coverage

NP_001116381 GS homeobox 2 from Oryzias latipes
54% identity, 24% coverage

XP_011530999 homeobox protein EMX1 isoform X1 from Homo sapiens
51% identity, 48% coverage

DLX5_HUMAN / P56178 Homeobox protein DLX-5 from Homo sapiens (Human) (see 3 papers)
NP_005212 homeobox protein DLX-5 from Homo sapiens
43% identity, 27% coverage

NP_996314 Ptx1, isoform C from Drosophila melanogaster
40% identity, 14% coverage

NP_990819 homeobox protein MSX-1 from Gallus gallus
43% identity, 22% coverage

VENTX_HUMAN / O95231 Homeobox protein VENTX; VENT homeobox homolog; VENT-like homeobox protein 2 from Homo sapiens (Human) (see paper)
NP_055283 homeobox protein VENTX from Homo sapiens
46% identity, 26% coverage

CEH63_CAEEL / A3FPJ2 Homeobox protein ceh-63 from Caenorhabditis elegans (see paper)
46% identity, 44% coverage

LOC577702 homeobox protein EMX1 from Strongylocentrotus purpuratus
47% identity, 20% coverage

Q07424 Homeobox protein CDX-4 from Mus musculus
NP_031700 homeobox protein CDX-4 from Mus musculus
45% identity, 25% coverage

P02833 Homeotic protein antennapedia from Drosophila melanogaster
49% identity, 17% coverage

P48031 Homeobox protein GBX-2 from Mus musculus
NP_034392 homeobox protein GBX-2 from Mus musculus
49% identity, 17% coverage

PDX1_XENLA / P14837 Pancreas/duodenum homeobox protein 1; PDX-1; Homeobox protein 8; XlHbox-8 from Xenopus laevis (African clawed frog) (see 2 papers)
51% identity, 25% coverage

HMSH_DROME / Q03372 Muscle segmentation homeobox; Protein drop; Protein msh from Drosophila melanogaster (Fruit fly) (see 4 papers)
NP_477324 drop from Drosophila melanogaster
41% identity, 12% coverage

PITX_DROME / O18400 Pituitary homeobox homolog Ptx1; D-PTX1 from Drosophila melanogaster (Fruit fly) (see paper)
40% identity, 14% coverage

PNX_DANRE / F1R2J1 Homeobox protein pnx; Posterior neuron-specific homeobox from Danio rerio (Zebrafish) (Brachydanio rerio) (see paper)
50% identity, 32% coverage

NP_001083900 gastrulation brain homeobox 2, gene 1 S homeolog from Xenopus laevis
49% identity, 18% coverage

DLL_DROME / P20009 Homeotic protein distal-less; Protein brista from Drosophila melanogaster (Fruit fly) (see 4 papers)
NP_523857 Distal-less, isoform A from Drosophila melanogaster
42% identity, 20% coverage

NP_476657 slouch, isoform A from Drosophila melanogaster
54% identity, 9% coverage

HM43_CAEEL / Q18273 Homeobox protein ceh-43 from Caenorhabditis elegans (see 2 papers)
46% identity, 23% coverage

XP_521666 homeobox protein VENTX isoform X1 from Pan troglodytes
46% identity, 26% coverage

hox1 / CAD59667.1 putative homeobox protein hox1, partial from Ciona intestinalis (see paper)
56% identity, 42% coverage

XP_006498792 homeobox even-skipped homolog protein 2 isoform X1 from Mus musculus
P49749 Homeobox even-skipped homolog protein 2 from Mus musculus
51% identity, 12% coverage

NP_999816 even-skipped-like protein from Strongylocentrotus purpuratus
49% identity, 19% coverage

2me6A / Q14549 Nmr structure of the homeodomain transcription factor gbx1 from homo sapiens in complex with the DNA sequence cgactaattagtcg
49% identity, 70% coverage

HESXB_XENLA / Q91898 Homeobox expressed in ES cells 1-B; Homeobox protein ANF-1; XANF-1; Xanf1 from Xenopus laevis (African clawed frog) (see 9 papers)
NP_001156042 homeobox expressed in ES cells 1-B from Xenopus laevis
48% identity, 31% coverage

VEX1_XENLA / Q9W769 Homeobox protein vex1; Homeodomain transcription factor vex-1; Ventral homeobox protein; Xvex-1 from Xenopus laevis (African clawed frog) (see 2 papers)
44% identity, 25% coverage

LOC107447678 homeobox protein MSH-B from Parasteatoda tepidariorum
41% identity, 17% coverage

PDX1_MOUSE / P52946 Pancreas/duodenum homeobox protein 1; Insulin promoter factor 1; IPF-1; Islet/duodenum homeobox 1; IDX-1; Somatostatin-transactivating factor 1; STF-1 from Mus musculus (Mouse) (see 6 papers)
NP_032840 pancreas/duodenum homeobox protein 1 from Mus musculus
55% identity, 21% coverage

UBX_DROME / P83949 Homeotic protein ultrabithorax from Drosophila melanogaster (Fruit fly) (see paper)
NP_536752 ultrabithorax, isoform A from Drosophila melanogaster
49% identity, 15% coverage

NP_001107762 labial from Tribolium castaneum
53% identity, 18% coverage

GSBN_DROME / P09083 Protein gooseberry-neuro; BSH4; Protein gooseberry proximal from Drosophila melanogaster (Fruit fly) (see paper)
NP_523862 gooseberry-neuro from Drosophila melanogaster
40% identity, 15% coverage

NP_726486 Distal-less, isoform B from Drosophila melanogaster
42% identity, 21% coverage

HXB2_HUMAN / P14652 Homeobox protein Hox-B2; Homeobox protein Hox-2.8; Homeobox protein Hox-2H; K8 from Homo sapiens (Human) (see paper)
NP_002136 homeobox protein Hox-B2 from Homo sapiens
47% identity, 17% coverage

NOT2_XENLA / Q91770 Homeobox protein not2; Xnot-2; Xnot2 from Xenopus laevis (African clawed frog) (see 3 papers)
51% identity, 24% coverage

NP_074043 pancreas/duodenum homeobox protein 1 from Rattus norvegicus
P52947 Pancreas/duodenum homeobox protein 1 from Rattus norvegicus
55% identity, 21% coverage

P70118 Pancreas/duodenum homeobox protein 1 from Mesocricetus auratus
55% identity, 21% coverage

NP_996173 antennapedia, isoform G from Drosophila melanogaster
47% identity, 19% coverage

XP_027829696 pancreas/duodenum homeobox protein 1 from Ovis aries
55% identity, 21% coverage

A1YG85 Pancreas/duodenum homeobox protein 1 from Pan paniscus
55% identity, 21% coverage

NP_996219 ultrabithorax, isoform F from Drosophila melanogaster
49% identity, 16% coverage

HMX_CAEEL / Q18533 Homeobox protein mls-2; Mesodermal lineage specification protein 2 from Caenorhabditis elegans (see 6 papers)
NP_508815 Homeobox protein mls-2 from Caenorhabditis elegans
47% identity, 17% coverage

PDX1_HUMAN / P52945 Pancreas/duodenum homeobox protein 1; PDX-1; Glucose-sensitive factor; GSF; Insulin promoter factor 1; IPF-1; Insulin upstream factor 1; IUF-1; Islet/duodenum homeobox-1; IDX-1; Somatostatin-transactivating factor 1; STF-1 from Homo sapiens (Human) (see 5 papers)
NP_000200 pancreas/duodenum homeobox protein 1 from Homo sapiens
55% identity, 21% coverage

P31264 Homeotic protein proboscipedia from Drosophila melanogaster
47% identity, 8% coverage

BARH2_DROME / Q24256 Homeobox protein B-H2; Homeobox protein BarH2 from Drosophila melanogaster (Fruit fly) (see 4 papers)
39% identity, 12% coverage

HMX_DROME / Q9VEI9 Homeobox protein Hmx; DHmx from Drosophila melanogaster (Fruit fly) (see paper)
NP_732244 H6-like-homeobox, isoform C from Drosophila melanogaster
44% identity, 10% coverage

XP_022192571 homeobox protein Nkx-2.4 from Nilaparvata lugens
38% identity, 24% coverage

MSX1_HUMAN / P28360 Homeobox protein MSX-1; Homeobox protein Hox-7; Msh homeobox 1-like protein from Homo sapiens (Human) (see 5 papers)
NP_002439 homeobox protein MSX-1 from Homo sapiens
43% identity, 21% coverage

For advice on how to use these tools together, see Interactive tools for functional annotation of bacterial genomes.

Statistics

The PaperBLAST database links 798,070 different protein sequences to 1,261,478 scientific articles. Searches against EuropePMC were last performed on May 12, 2025.

How It Works

PaperBLAST builds a database of protein sequences that are linked to scientific articles. These links come from automated text searches against the articles in EuropePMC and from manually-curated information from GeneRIF, UniProtKB/Swiss-Prot, BRENDA, CAZy (as made available by dbCAN), BioLiP, CharProtDB, MetaCyc, EcoCyc, TCDB, REBASE, the Fitness Browser, and a subset of the European Nucleotide Archive with the /experiment tag. Given this database and a protein sequence query, PaperBLAST uses protein-protein BLAST to find similar sequences with E < 0.001.
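
The E-value cutoff described above can be illustrated with a minimal sketch. It assumes the BLAST hits are available in the standard 12-column tabular format (`-outfmt 6`). The names `Hit` and `filter_hits` and the sample rows are hypothetical; this is not PaperBLAST's actual code.

```python
from typing import Iterable, List, NamedTuple

E_VALUE_CUTOFF = 0.001  # threshold stated in the text above


class Hit(NamedTuple):
    subject: str
    percent_identity: float
    e_value: float


def filter_hits(lines: Iterable[str], cutoff: float = E_VALUE_CUTOFF) -> List[Hit]:
    """Keep hits whose E-value is strictly below the cutoff."""
    hits = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        # Standard -outfmt 6 columns: qseqid sseqid pident length mismatch
        # gapopen qstart qend sstart send evalue bitscore
        hit = Hit(fields[1], float(fields[2]), float(fields[10]))
        if hit.e_value < cutoff:
            hits.append(hit)
    return hits


# Made-up rows for illustration only
sample = [
    "query\tsubjA\t100.0\t90\t0\t0\t1\t90\t300\t389\t1e-50\t180",
    "query\tsubjB\t38.0\t60\t37\t0\t5\t64\t10\t69\t0.5\t30.1",
]
kept = filter_hits(sample)
```

In this sketch only the first row survives, because the second row's E-value of 0.5 is above the cutoff.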

To build the database, we query EuropePMC with locus tags, with RefSeq protein identifiers, and with UniProt accessions. We obtain the locus tags from RefSeq or from MicrobesOnline. We use queries of the form "locus_tag AND genus_name" to try to ensure that the paper is actually discussing that gene. Because EuropePMC indexes most recent biomedical papers, even if they are not open access, some of the links may be to papers that you cannot read or that our computers cannot read. We query each of these identifiers that appears in the open access part of EuropePMC, as well as every locus tag that appears in the 500 most-referenced genomes, so that a gene may appear in the PaperBLAST results even though none of the papers that mention it are open access. We also incorporate text-mined links from EuropePMC that link open access articles to UniProt or RefSeq identifiers. (This yields some additional links because EuropePMC uses different heuristics for their text mining than we do.)
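
As a rough illustration of the "locus_tag AND genus_name" query form, the sketch below builds such a query string and a search URL. The endpoint is assumed to be Europe PMC's public REST search service; the source does not say which interface PaperBLAST uses, and the function names and the example locus tag are purely illustrative.

```python
from urllib.parse import urlencode

# Assumption: Europe PMC's public REST search endpoint. The text does not
# state which interface PaperBLAST actually queries.
EUROPEPMC_SEARCH = "https://www.ebi.ac.uk/europepmc/webservices/rest/search"


def build_query(locus_tag: str, genus_name: str) -> str:
    """Combine a locus tag with its genus, as in 'locus_tag AND genus_name'."""
    return f"{locus_tag} AND {genus_name}"


def build_search_url(locus_tag: str, genus_name: str) -> str:
    """Return a search URL for the combined query (no request is sent)."""
    params = {"query": build_query(locus_tag, genus_name), "format": "json"}
    return EUROPEPMC_SEARCH + "?" + urlencode(params)


# Illustrative identifiers only
example_url = build_search_url("b0002", "Escherichia")
```

Requiring the genus name alongside the locus tag narrows the results to papers that are more likely to discuss that gene, rather than papers that happen to contain the same string.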

For every article that mentions a locus tag, a RefSeq protein identifier, or a UniProt accession, we try to select one or two snippets of text that refer to the protein. If we cannot get access to the full text, we try to select a snippet from the abstract, but unfortunately, unique identifiers such as locus tags are rarely provided in abstracts.
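
One simple way to pick snippets is to take a window of text around each mention of the identifier. The sketch below does this under stated assumptions: the window size, the limit of two snippets, and the name `select_snippets` are hypothetical, and PaperBLAST's real heuristics may differ.

```python
import re
from typing import List


def select_snippets(text: str, identifier: str,
                    window: int = 80, max_snippets: int = 2) -> List[str]:
    """Return up to max_snippets text windows around mentions of identifier."""
    # Match the identifier only as a whole token, so that "b0002" does not
    # match inside "b00021".
    pattern = re.compile(r"(?<!\w)" + re.escape(identifier) + r"(?!\w)")
    snippets: List[str] = []
    last_end = -1
    for match in pattern.finditer(text):
        start = max(0, match.start() - window)
        end = min(len(text), match.end() + window)
        if start < last_end:
            continue  # skip mentions that overlap the previous snippet
        snippets.append(text[start:end].strip())
        last_end = end
        if len(snippets) == max_snippets:
            break
    return snippets


# Made-up text for illustration only
article = ("We deleted b0002 in strain X. " + "x" * 300 +
           " Later b0002 was restored. b00021 is different.")
found = select_snippets(article, "b0002", window=20)
```

The same function could be applied to an abstract when the full text is unavailable, although, as noted above, identifiers such as locus tags rarely appear there.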

PaperBLAST also incorporates manually-curated protein functions from the curated sources listed above.

Except for GeneRIF and ENA, the curated entries include a short curated description of the protein's function. For entries from BioLiP, the protein's function may not be known beyond binding to the ligand. Many of these entries also link to articles in PubMed.

For more information see the PaperBLAST paper (mSystems 2017) or the code. You can download PaperBLAST's database here.

PaperBLAST has changed since the paper was written. Many of these changes are described in Interactive tools for functional annotation of bacterial genomes.

Secrets

PaperBLAST cannot provide snippets for many of the papers that are published in non-open-access journals. This limitation applies even if the paper is marked as "free" on the publisher's web site and is available in PubMed Central or EuropePMC. If a journal that you publish in is marked as "secret," please consider publishing elsewhere.

Omissions from the PaperBLAST Database

Many important articles are missing from PaperBLAST, either because the article's full text is not in EuropePMC (as for many older articles), or because the paper does not mention a protein identifier such as a locus tag, or because of PaperBLAST's heuristics. If you notice an article that characterizes a protein's function but is missing from PaperBLAST, please notify the curators at UniProt or add an entry to GeneRIF. Entries in either of these databases will eventually be incorporated into PaperBLAST. Note that to add an entry to UniProt, you will need to find the UniProt identifier for the protein. If the protein is not already in UniProt, you can ask them to create an entry. To add an entry to GeneRIF, you will need an NCBI Gene identifier, but unfortunately many prokaryotic proteins in RefSeq do not have corresponding Gene identifiers.

References

PaperBLAST: Text-mining papers for information about homologs.
M. N. Price and A. P. Arkin (2017). mSystems, 10.1128/mSystems.00039-17.

Europe PMC in 2017.
M. Levchenko et al (2017). Nucleic Acids Research, 10.1093/nar/gkx1005.

Gene indexing: characterization and analysis of NLM's GeneRIFs.
J. A. Mitchell et al (2003). AMIA Annu Symp Proc 2003:460-464.

UniProt: the universal protein knowledgebase.
The UniProt Consortium (2016). Nucleic Acids Research, 10.1093/nar/gkw1099.

BRENDA in 2017: new perspectives and new tools in BRENDA.
S. Placzek et al (2017). Nucleic Acids Research, 10.1093/nar/gkw952.

The EcoCyc database: reflecting new knowledge about Escherichia coli K-12.
I. M. Keseler et al (2016). Nucleic Acids Research, 10.1093/nar/gkw1003.

The MetaCyc database of metabolic pathways and enzymes.
R. Caspi et al (2018). Nucleic Acids Research, 10.1093/nar/gkx935.

CharProtDB: a database of experimentally characterized protein annotations.
R. Madupu et al (2012). Nucleic Acids Research, 10.1093/nar/gkr1133.

The carbohydrate-active enzymes database (CAZy) in 2013.
V. Lombard et al (2014). Nucleic Acids Research, 10.1093/nar/gkt1178.

The Transporter Classification Database (TCDB): recent advances.
M. H. Saier, Jr. et al (2016). Nucleic Acids Research, 10.1093/nar/gkv1103.

REBASE - a database for DNA restriction and modification: enzymes, genes and genomes.
R. J. Roberts et al (2015). Nucleic Acids Research, 10.1093/nar/gku1046.

Deep annotation of protein function across diverse bacteria from mutant phenotypes.
M. N. Price et al (2016). bioRxiv, 10.1101/072470.

by Morgan Price, Arkin group
Lawrence Berkeley National Laboratory