PaperBLAST – Find papers about a protein or its homologs

 

PaperBLAST Hits for 74 a.a. (DARRKRRNFS...)

Other sequence analysis tools:

Find functional residues: SitesBLAST

Search for conserved domains

Find the best match in UniProt

Compare to protein structures

Predict transmembrane helices: Phobius

Predict protein localization: PSORTb

Find homologs in fast.genomics

Found 250 similar proteins in the literature:

EXD_DROME / P40427 Homeobox protein extradenticle; Dpbx from Drosophila melanogaster (Fruit fly) (see 7 papers)
NP_523360 extradenticle, isoform A from Drosophila melanogaster
100% identity, 20% coverage

NP_001034501 extradenticle from Tribolium castaneum
100% identity, 20% coverage

PBX2_HUMAN / P40425 Pre-B-cell leukemia transcription factor 2; Homeobox protein PBX2; Protein G17 from Homo sapiens (Human) (see 2 papers)
NP_002577 pre-B-cell leukemia transcription factor 2 from Homo sapiens
93% identity, 17% coverage

NP_059491 pre-B-cell leukemia transcription factor 2 from Mus musculus
93% identity, 17% coverage

PBX1_HUMAN / P40424 Pre-B-cell leukemia transcription factor 1; Homeobox protein PBX1; Homeobox protein PRL from Homo sapiens (Human) (see 16 papers)
PBX1_MOUSE / P41778 Pre-B-cell leukemia transcription factor 1; Homeobox protein PBX1 from Mus musculus (Mouse) (see 13 papers)
93% identity, 17% coverage

XP_038946720 pre-B-cell leukemia transcription factor 1 isoform X1 from Rattus norvegicus
93% identity, 18% coverage

XP_011507892 pre-B-cell leukemia transcription factor 1 isoform X2 from Homo sapiens
93% identity, 16% coverage

NP_571522 pre-B-cell leukemia transcription factor 4 from Danio rerio
93% identity, 22% coverage

NP_001340059 pre-B-cell leukemia transcription factor 1 isoform 4 from Homo sapiens
93% identity, 21% coverage

NP_001191892 pre-B-cell leukemia transcription factor 1 isoform 3 from Homo sapiens
93% identity, 18% coverage

NP_032809 pre-B-cell leukemia transcription factor 1 isoform b from Mus musculus
NP_001191890 pre-B-cell leukemia transcription factor 1 isoform 2 from Homo sapiens
NP_001340060 pre-B-cell leukemia transcription factor 1 isoform 2 from Homo sapiens
93% identity, 21% coverage

NP_001277505 pre-B-cell leukemia transcription factor 3 isoform PBX3b from Mus musculus
95% identity, 21% coverage

XP_038961188 pre-B-cell leukemia transcription factor 3 isoform X2 from Rattus norvegicus
95% identity, 29% coverage

XP_017206849 pre-B-cell leukemia transcription factor 2 isoform X1 from Danio rerio
92% identity, 18% coverage

PBX1_XENLA / Q8QGC4 Pre-B-cell leukemia transcription factor 1; Xpbx1; Homeobox protein pbx1; Pre-B-cell leukemia transcription factor 1b; Xpbx1b from Xenopus laevis (African clawed frog) (see 2 papers)
92% identity, 21% coverage

PBX3_HUMAN / P40426 Pre-B-cell leukemia transcription factor 3; Homeobox protein PBX3 from Homo sapiens (Human) (see paper)
NP_006186 pre-B-cell leukemia transcription factor 3 isoform 1 from Homo sapiens
95% identity, 17% coverage

H1A3Y0 PBX homeobox 1 from Taeniopygia guttata
93% identity, 17% coverage

1lfuP / P41778 Nmr solution structure of the extended pbx homeodomain bound to DNA (see paper)
92% identity, 89% coverage

XP_017214451 pre-B-cell leukemia homeobox 1a isoform X1 from Danio rerio
91% identity, 16% coverage

NP_079521 pre-B-cell leukemia transcription factor 4 from Homo sapiens
Q9BYU1 Pre-B-cell leukemia transcription factor 4 from Homo sapiens
86% identity, 20% coverage

PBX4_MOUSE / Q99NE9 Pre-B-cell leukemia transcription factor 4; Homeobox protein PBX4 from Mus musculus (Mouse) (see paper)
88% identity, 20% coverage

HM20_CAEEL / P41779 Homeobox protein ceh-20 from Caenorhabditis elegans (see 5 papers)
NP_001022555 Homeobox protein ceh-20 from Caenorhabditis elegans
86% identity, 22% coverage

XP_018115010 pre-B-cell leukemia transcription factor 1 isoform X3 from Xenopus laevis
90% identity, 17% coverage

HM40_CAEEL / Q19503 Homeobox protein ceh-40 from Caenorhabditis elegans (see 2 papers)
67% identity, 21% coverage

CBG05292 Protein CBG05292 from Caenorhabditis briggsae
66% identity, 22% coverage

XP_006509863 pre-B-cell leukemia transcription factor 4 isoform X1 from Mus musculus
88% identity, 14% coverage

CEH60_CAEEL / Q45EK2 Homeobox protein ceh-60 from Caenorhabditis elegans (see 2 papers)
NP_509101 Homeobox protein ceh-60 from Caenorhabditis elegans
54% identity, 19% coverage

XP_018084194 homeobox protein meis3-A isoform X2 from Xenopus laevis
50% identity, 15% coverage

MEI3A_XENLA / Q5U4X3 Homeobox protein meis3-A; XMeis3 from Xenopus laevis (African clawed frog) (see 5 papers)
50% identity, 12% coverage

Q99687 Homeobox protein Meis3 from Homo sapiens
50% identity, 15% coverage

NP_001009813 homeobox protein Meis3 isoform 2 from Homo sapiens
50% identity, 16% coverage

NP_571853 homeobox protein Meis3 from Danio rerio
45% identity, 14% coverage

XP_008196218 homothorax isoform X4 from Tribolium castaneum
47% identity, 13% coverage

MEIS3_MOUSE / P97368 Homeobox protein Meis3; Meis1-related protein 2 from Mus musculus (Mouse) (see paper)
50% identity, 15% coverage

ATEG_00728 uncharacterized protein from Aspergillus terreus NIH2624
47% identity, 18% coverage

NP_476578 homothorax, isoform A from Drosophila melanogaster
47% identity, 12% coverage

HTH_DROME / O46339 Homeobox protein homothorax; Homeobox protein dorsotonals from Drosophila melanogaster (Fruit fly) (see 5 papers)
NP_476576 homothorax, isoform C from Drosophila melanogaster
47% identity, 12% coverage

NP_571968 homeobox protein Meis1b from Danio rerio
48% identity, 14% coverage

AFUA_4G10110, Afu4g10110 homeobox transcription factor, putative from Aspergillus fumigatus Af293
47% identity, 21% coverage

NP_034919 homeobox protein Meis1 isoform A from Mus musculus
49% identity, 11% coverage

Q3TYM2 Meis homeobox 2 from Mus musculus
49% identity, 13% coverage

XP_015133525 homeobox protein Meis1 isoform X1 from Gallus gallus
49% identity, 11% coverage

MEIS2_MOUSE / P97367 Homeobox protein Meis2; Meis1-related protein 1 from Mus musculus (Mouse) (see 7 papers)
49% identity, 11% coverage

XP_063129699 homeobox protein Meis1 isoform X3 from Rattus norvegicus
49% identity, 11% coverage

XP_042101912 homeobox protein Meis1 isoform X2 from Ovis aries
49% identity, 11% coverage

XP_005662564 homeobox protein Meis1 isoform X4 from Sus scrofa
49% identity, 11% coverage

NP_034955 homeobox protein Meis2 isoform 2 from Mus musculus
49% identity, 11% coverage

MEIS2_HUMAN / O14770 Homeobox protein Meis2; Meis1-related protein 1 from Homo sapiens (Human) (see 10 papers)
49% identity, 11% coverage

NP_002390 homeobox protein Meis2 isoform f from Homo sapiens
49% identity, 13% coverage

4xrmB / O14770 Homodimer of tale type homeobox transcription factor meis1 complexes with specific DNA (see paper)
49% identity, 69% coverage

MEIS1_XENLA / P79937 Homeobox protein Meis1; XMeis1 from Xenopus laevis (African clawed frog) (see 3 papers)
48% identity, 15% coverage

MEIS1_MOUSE / Q60954 Homeobox protein Meis1; Myeloid ecotropic viral integration site 1 from Mus musculus (Mouse) (see 8 papers)
49% identity, 13% coverage

MEIS1_HUMAN / O00470 Homeobox protein Meis1 from Homo sapiens (Human) (see 4 papers)
49% identity, 13% coverage

F2DV82 Predicted protein from Hordeum vulgare subsp. vulgare
46% identity, 9% coverage

YGL096W Homeodomain-containing transcription factor; SBF regulated target gene that in turn regulates expression of genes involved in G1/S phase events such as bud site selection, bud emergence and cell cycle progression; similarity to Cup9p from Saccharomyces cerevisiae
42% identity, 26% coverage

IRX_CAEEL / Q93348 Putative iroquois-class homeodomain protein irx-1 from Caenorhabditis elegans (see 3 papers)
NP_492533 Putative iroquois-class homeodomain protein irx-1 from Caenorhabditis elegans
47% identity, 15% coverage

UV8b_06464 uncharacterized protein from Ustilaginoidea virens
44% identity, 14% coverage

A9UP33 Homeobox domain-containing protein from Monosiga brevicollis
46% identity, 9% coverage

GLRG_00169 homeobox domain-containing protein from Colletotrichum graminicola M1.001
44% identity, 14% coverage

CUP9_YEAST / P41817 Homeobox protein CUP9 from Saccharomyces cerevisiae (strain ATCC 204508 / S288c) (Baker's yeast) (see paper)
NP_015148 Cup9p from Saccharomyces cerevisiae S288C
YPL177C Cup9p from Saccharomyces cerevisiae
47% identity, 18% coverage

CCM_07504 homeobox transcription factor, putative from Cordyceps militaris CM01
42% identity, 15% coverage

NP_001087637 TGFB induced factor homeobox 2 L homeolog from Xenopus laevis
47% identity, 23% coverage

B6SXN6 Homeodomain protein JUBEL1 from Zea mays
40% identity, 8% coverage

AFUA_1G15550 homeobox and C2H2 transcription factor, putative from Aspergillus fumigatus Af293
42% identity, 7% coverage

NP_571966 homeobox protein PKNOX1.1 from Danio rerio
39% identity, 13% coverage

UNC62_CAEEL / Q9N5D6 Homeobox protein unc-62; Uncoordinated protein 62 from Caenorhabditis elegans (see 5 papers)
44% identity, 10% coverage

XP_015138071 homeobox protein AKR isoform X1 from Gallus gallus
44% identity, 21% coverage

XP_005538457 unknown homeobox protein from Cyanidioschyzon merolae strain 10D
42% identity, 16% coverage

Smp_063520 irx-related from Schistosoma mansoni
44% identity, 10% coverage

XP_001455625 uncharacterized protein from Paramecium tetraurelia
39% identity, 12% coverage

BLH4_ARATH / Q94KL5 BEL1-like homeodomain protein 4; BEL1-like protein 4; Protein SAWTOOTH 2 from Arabidopsis thaliana (Mouse-ear cress) (see 3 papers)
NP_850044 BEL1-like homeodomain 4 from Arabidopsis thaliana
AT2G23760 BLH4 (BEL1-LIKE HOMEODOMAIN 4); DNA binding / transcription factor from Arabidopsis thaliana
42% identity, 10% coverage

LOC100257488 BEL1-like homeodomain protein 7 from Vitis vinifera
35% identity, 11% coverage

EIN_475530 homeobox protein knotted-1, putative from Entamoeba invadens IP1
43% identity, 21% coverage

ECU06_0740 MEI2-RELATED PROTEIN from Encephalitozoon cuniculi GB-M1
44% identity, 28% coverage

BLH5_ARATH / Q8S897 BEL1-like homeodomain protein 5; BEL1-like protein 5 from Arabidopsis thaliana (Mouse-ear cress) (see paper)
AT2G27220 BLH5 (BELL1-like homeodomain 5); DNA binding / transcription factor from Arabidopsis thaliana
39% identity, 15% coverage

NP_001157546 homeobox protein TGIF1 isoform c from Mus musculus
42% identity, 23% coverage

Q941S9 Os01g0848400 protein from Oryza sativa subsp. japonica
39% identity, 10% coverage

HBX3_DICDI / Q54F11 Homeobox protein 3; DdHbx-3 from Dictyostelium discoideum (Social amoeba) (see paper)
43% identity, 9% coverage

Pc06g01320 uncharacterized protein from Penicillium rubens
40% identity, 7% coverage

XP_005536034 unknown homeobox protein from Cyanidioschyzon merolae strain 10D
50% identity, 13% coverage

NP_777480 homeobox protein TGIF1 isoform d from Homo sapiens
42% identity, 23% coverage

TGIF1_HUMAN / Q15583 Homeobox protein TGIF1; 5'-TG-3'-interacting factor 1 from Homo sapiens (Human) (see 2 papers)
42% identity, 14% coverage

NP_001015020 homeobox protein TGIF1 from Rattus norvegicus
42% identity, 20% coverage

BLH1_ARATH / Q9SJ56 BEL1-like homeodomain protein 1; BEL1-like protein 1; Protein EMBRYO SAC DEVELOPMENT ARREST 29 from Arabidopsis thaliana (Mouse-ear cress) (see 2 papers)
NP_181138 BEL1-like homeodomain 1 from Arabidopsis thaliana
AT2G35940 BLH1 (BEL1-LIKE HOMEODOMAIN 1); DNA binding / protein heterodimerization/ protein homodimerization/ transcription factor from Arabidopsis thaliana
40% identity, 9% coverage

Pc22g06630 uncharacterized protein from Penicillium rubens
33% identity, 11% coverage

XP_006362847 BEL1-like homeodomain protein 3 from Solanum tuberosum
34% identity, 10% coverage

Sb05g003750 No description from Sorghum bicolor
41% identity, 9% coverage

PITG_01080 uncharacterized protein from Phytophthora infestans T30-4
47% identity, 31% coverage

Q8LLD9 BEL1-related homeotic protein 29 (Fragment) from Solanum tuberosum
40% identity, 11% coverage

XP_006348467 BEL1-like homeodomain protein 1 from Solanum tuberosum
40% identity, 9% coverage

XP_038954473 homeobox protein PKNOX1 isoform X1 from Rattus norvegicus
34% identity, 13% coverage

PKNX1_MOUSE / O70477 Homeobox protein PKNOX1; PBX/knotted homeobox 1 from Mus musculus (Mouse) (see 2 papers)
34% identity, 13% coverage

Q7Y0Z9 Bell-like homeodomain protein 3 (Fragment) from Solanum lycopersicum
41% identity, 11% coverage

F2DUL1 Predicted protein from Hordeum vulgare subsp. vulgare
34% identity, 12% coverage

NP_683752 homeobox protein PKNOX2 isoform 1 from Mus musculus
Q8BG99 Homeobox protein PKNOX2 from Mus musculus
38% identity, 12% coverage

PKNX2_HUMAN / Q96KN3 Homeobox protein PKNOX2; Homeobox protein PREP-2; PBX/knotted homeobox 2 from Homo sapiens (Human) (see paper)
38% identity, 12% coverage

PKNX1_HUMAN / P55347 Homeobox protein PKNOX1; Homeobox protein PREP-1; PBX/knotted homeobox 1 from Homo sapiens (Human) (see 3 papers)
NP_004562 homeobox protein PKNOX1 isoform 1 from Homo sapiens
34% identity, 13% coverage

XP_644890 homeodomain containing protein from Dictyostelium discoideum AX4
41% identity, 8% coverage

TRIATDRAFT_288678 uncharacterized protein from Trichoderma atroviride
42% identity, 6% coverage

PITG_01135 uncharacterized protein from Phytophthora infestans T30-4
41% identity, 24% coverage

TRIATDRAFT_161626 uncharacterized protein from Trichoderma atroviride
42% identity, 5% coverage

NP_001274921 BEL1-related homeotic protein 5 from Solanum tuberosum
40% identity, 9% coverage

XP_308755 uncharacterized protein LOC1270088 isoform X2 from Anopheles gambiae
41% identity, 12% coverage

TGIF2_HUMAN / Q9GZN2 Homeobox protein TGIF2; 5'-TG-3'-interacting factor 2; TGF-beta-induced transcription factor 2; TGFB-induced factor 2 from Homo sapiens (Human) (see paper)
NP_068581 homeobox protein TGIF2 from Homo sapiens
NP_001186443 homeobox protein TGIF2 from Homo sapiens
40% identity, 24% coverage

LOC105059537 homeobox protein BEL1 homolog from Elaeis guineensis
42% identity, 10% coverage

BLH2_ARATH / Q9SW80 BEL1-like homeodomain protein 2; BEL1-like protein 2; Protein SAWTOOTH 1 from Arabidopsis thaliana (Mouse-ear cress) (see 3 papers)
AT4G36870 BLH2 (BEL1-LIKE HOMEODOMAIN 2); DNA binding / transcription factor from Arabidopsis thaliana
NP_001031797 BEL1-like homeodomain 2 from Arabidopsis thaliana
40% identity, 8% coverage

FGSG_07909 hypothetical protein from Fusarium graminearum PH-1
41% identity, 7% coverage

NP_955861 homeobox protein TGIF1 from Danio rerio
40% identity, 21% coverage

XP_006499416 homeobox protein TGIF2 isoform X1 from Mus musculus
NP_775572 homeobox protein TGIF2 isoform a from Mus musculus
40% identity, 24% coverage

BEL1_ARATH / Q38897 Homeobox protein BEL1 homolog from Arabidopsis thaliana (Mouse-ear cress) (see 8 papers)
NP_198957 POX (plant homeobox) family protein from Arabidopsis thaliana
AT5G41410 BEL1 (BELL 1); DNA binding / protein binding / transcription factor from Arabidopsis thaliana
42% identity, 10% coverage

VDAG_00465 uncharacterized protein from Verticillium dahliae VdLs.17
42% identity, 4% coverage

FGSG_07914 hypothetical protein from Fusarium graminearum PH-1
45% identity, 9% coverage

vaamana / CAD58040.1 homeodomain protein vaamana from Arabidopsis thaliana (see paper)
37% identity, 10% coverage

BLH9_ARATH / Q9LZM8 BEL1-like homeodomain protein 9; BEL1-like protein 9; Protein BELLRINGER; Protein LARSON; Protein PENNYWISE; Protein REPLUMLESS; Protein VAAMANA from Arabidopsis thaliana (Mouse-ear cress) (see 9 papers)
NP_195823 POX (plant homeobox) family protein from Arabidopsis thaliana
AT5G02030 RPL (REPLUMLESS); DNA binding / sequence-specific DNA binding / transcription factor from Arabidopsis thaliana
37% identity, 10% coverage

VDAG_04660 uncharacterized protein from Verticillium dahliae VdLs.17
38% identity, 11% coverage

XP_018120525 homeobox protein SIX2 isoform X2 from Xenopus laevis
44% identity, 21% coverage

XP_001653660 uncharacterized protein LOC5571407 isoform X2 from Aedes aegypti
42% identity, 12% coverage

IRX6A_DANRE / Q503Z8 Iroquois homeobox protein 6a from Danio rerio (Zebrafish) (Brachydanio rerio) (see 2 papers)
40% identity, 12% coverage

BLH8_ARATH / Q9SJJ3 BEL1-like homeodomain protein 8; BEL1-like protein 8; Protein POUND-FOOLISH from Arabidopsis thaliana (Mouse-ear cress) (see 2 papers)
AT2G27990 BLH8 (BEL1-LIKE HOMEODOMAIN 8); DNA binding / transcription factor from Arabidopsis thaliana
NP_180366 BEL1-like homeodomain 8 from Arabidopsis thaliana
37% identity, 10% coverage

IRX4_MOUSE / Q9QY61 Iroquois-class homeodomain protein IRX-4; Homeodomain protein IRXA3; Iroquois homeobox protein 4 from Mus musculus (Mouse) (see 2 papers)
XP_006517355 iroquois-class homeodomain protein IRX-4 isoform X1 from Mus musculus
37% identity, 12% coverage

NP_001153689 vismay from Tribolium castaneum
41% identity, 19% coverage

BLH10_ARATH / Q9FXG8 BEL1-like homeodomain protein 10; BEL1-like protein 10 from Arabidopsis thaliana (Mouse-ear cress) (see 2 papers)
AT1G19700 BEL10 (BEL1-LIKE HOMEODOMAIN 10); DNA binding / transcription factor from Arabidopsis thaliana
36% identity, 11% coverage

IRX4_HUMAN / P78413 Iroquois-class homeodomain protein IRX-4; Homeodomain protein IRXA3; Iroquois homeobox protein 4 from Homo sapiens (Human) (see paper)
NP_001265561 iroquois-class homeodomain protein IRX-4 isoform b from Homo sapiens
37% identity, 12% coverage

NP_524047 mirror, isoform A from Drosophila melanogaster
38% identity, 9% coverage

BLH3_ARATH / Q9FWS9 BEL1-like homeodomain protein 3; BEL1-like protein 3 from Arabidopsis thaliana (Mouse-ear cress) (see 2 papers)
AT1G75410 BLH3 (BEL1-LIKE HOMEODOMAIN 3); DNA binding / transcription factor from Arabidopsis thaliana
NP_177674 BEL1-like homeodomain 3 from Arabidopsis thaliana
32% identity, 13% coverage

XP_001122713 iroquois-class homeodomain protein IRX-6 isoform X2 from Apis mellifera
41% identity, 18% coverage

T265_04509 hypothetical protein from Opisthorchis viverrini
47% identity, 22% coverage

ECU10_1480 TRANSCRIPTION FACTOR OF THE TALE/PBX FAMILY from Encephalitozoon cuniculi GB-M1
39% identity, 29% coverage

LOC100862767 uncharacterized protein LOC100862767 from Bombyx mori
40% identity, 16% coverage

SIX1A_DANRE / Q6DHF9 Homeobox protein six1a; Homeobox protein six1b; Sine oculis homeobox homolog 1a; Sine oculis homeobox homolog 1b from Danio rerio (Zebrafish) (Brachydanio rerio) (see 2 papers)
49% identity, 18% coverage

NP_001009904 homeobox protein six1a from Danio rerio
49% identity, 18% coverage

CAUP_DROME / P54269 Homeobox protein caupolican from Drosophila melanogaster (Fruit fly) (see paper)
NP_524046 caupolican from Drosophila melanogaster
35% identity, 10% coverage

XP_002431193 Homeobox protein SIX1, putative from Pediculus humanus corporis
49% identity, 18% coverage

XP_011244637 homeobox protein SIX2 isoform X2 from Mus musculus
49% identity, 24% coverage

XP_005264157 homeobox protein SIX2 isoform X1 from Homo sapiens
49% identity, 17% coverage

XP_005212711 homeobox protein SIX2 isoform X1 from Bos taurus
49% identity, 17% coverage

SIX2_HUMAN / Q9NPC8 Homeobox protein SIX2; Sine oculis homeobox homolog 2 from Homo sapiens (Human) (see 3 papers)
49% identity, 18% coverage

SIX2_MOUSE / Q62232 Homeobox protein SIX2; Sine oculis homeobox homolog 2 from Mus musculus (Mouse) (see 13 papers)
49% identity, 17% coverage

NP_001038160 homeobox protein SIX2 from Gallus gallus
49% identity, 18% coverage

NP_001153690 TGIF-like from Nasonia vitripennis
42% identity, 16% coverage

AT1G75430 BLH11 (BEL1-LIKE HOMEODOMAIN 11); transcription factor from Arabidopsis thaliana
39% identity, 19% coverage

Q24248 Homeobox protein araucan from Drosophila melanogaster
NP_524045 araucan, isoform A from Drosophila melanogaster
35% identity, 10% coverage

XP_641195 homeodomain containing protein from Dictyostelium discoideum AX4
39% identity, 9% coverage

SIX1B_DANRE / Q6NZ04 Homeobox protein six1b; Homeobox protein six1a; Sine oculis homeobox homolog 1a; Sine oculis homeobox homolog 1b from Danio rerio (Zebrafish) (Brachydanio rerio) (see 6 papers)
NP_996978 homeobox protein six1b from Danio rerio
49% identity, 18% coverage

XP_002691065 homeobox protein SIX1 from Bos taurus
44% identity, 21% coverage

NP_001186647 homeobox protein SIX1 from Sus scrofa
49% identity, 18% coverage

SIX1_HUMAN / Q15475 Homeobox protein SIX1; Sine oculis homeobox homolog 1 from Homo sapiens (Human) (see 11 papers)
49% identity, 18% coverage

SIX1_MOUSE / Q62231 Homeobox protein SIX1; Sine oculis homeobox homolog 1 from Mus musculus (Mouse) (see 11 papers)
NP_033215 homeobox protein SIX1 from Mus musculus
NP_446211 homeobox protein SIX1 from Rattus norvegicus
44% identity, 21% coverage

NCU05257 homeobox and C2H2 transcription factor from Neurospora crassa OR74A
33% identity, 5% coverage

NP_071873 iroquois-class homeodomain protein IRX-6 isoform 1 from Mus musculus
38% identity, 13% coverage

IRX6_MOUSE / Q9ER75 Iroquois-class homeodomain protein IRX-6; Homeodomain protein IRXB3; Iroquois homeobox protein 6 from Mus musculus (Mouse) (see 2 papers)
38% identity, 13% coverage

NP_001038150 homeobox protein SIX1 from Gallus gallus
49% identity, 18% coverage

NP_001092203 mohawk homeobox a from Danio rerio
38% identity, 17% coverage

Smp_149230 iroquois homeobox family transcription factor from Schistosoma mansoni
41% identity, 9% coverage

NCU00097 BEAK-1 from Neurospora crassa OR74A
35% identity, 21% coverage

NP_523714 vismay, isoform A from Drosophila melanogaster
42% identity, 13% coverage

XP_005538442 similar to BEL1-related homeotic protein from Cyanidioschyzon merolae strain 10D
38% identity, 18% coverage

IRX4A_XENLA / Q90XW6 Iroquois-class homeodomain protein irx-4-A; Iroquois homeobox protein 4-A from Xenopus laevis (African clawed frog) (see 3 papers)
37% identity, 12% coverage

NP_808263 homeobox protein Mohawk from Mus musculus
38% identity, 17% coverage

MKX_MOUSE / Q8BIA3 Homeobox protein Mohawk from Mus musculus (Mouse) (see paper)
38% identity, 17% coverage

NP_001122206 homeobox protein SIX2b from Danio rerio
44% identity, 21% coverage

IRX4_XENTR / Q688D0 Iroquois-class homeodomain protein irx-4; Iroquois homeobox protein 4 from Xenopus tropicalis (Western clawed frog) (Silurana tropicalis) (see paper)
37% identity, 12% coverage

NP_001229631 homeobox protein Mohawk from Homo sapiens
38% identity, 17% coverage

FWA_ARATH / Q9FVI6 Homeobox-leucine zipper protein HDG6; HD-ZIP protein HDG6; Homeobox protein FWA; Homeodomain GLABRA 2-like protein 6; Homeodomain transcription factor HDG6; Protein HOMEODOMAIN GLABROUS 6 from Arabidopsis thaliana (Mouse-ear cress) (see paper)
AT4G25530, NP_567722 FLOWERING WAGENINGEN from Arabidopsis thaliana
NP_567722 FWA; DNA binding / protein binding / protein homodimerization/ transcription factor from Arabidopsis thaliana
41% identity, 9% coverage

Q9I9C5 Iroquois homologue-1 from Gallus gallus
36% identity, 15% coverage

SO_DROME / Q27350 Protein sine oculis from Drosophila melanogaster (Fruit fly) (see 2 papers)
NP_476733 sine oculis from Drosophila melanogaster
47% identity, 12% coverage

XP_004238268 BEL1-like homeodomain protein 11 from Solanum lycopersicum
36% identity, 14% coverage

NP_725182 achintya, isoform A from Drosophila melanogaster
42% identity, 13% coverage

6fqpA / Q15583 Crystal structure of tale homeobox domain transcription factor tgif1 with its consensus DNA (see paper)
37% identity, 80% coverage

NP_001093693 homeobox protein SIX1 from Xenopus tropicalis
49% identity, 18% coverage

ATH1_ARATH / P48731 Homeobox protein ATH1 from Arabidopsis thaliana (Mouse-ear cress) (see paper)
NP_195024 homeobox protein ATH1 from Arabidopsis thaliana
AT4G32980 ATH1 (ARABIDOPSIS THALIANA HOMEOBOX GENE 1); DNA binding / sequence-specific DNA binding / transcription factor from Arabidopsis thaliana
36% identity, 15% coverage

SIX1_XENLA / Q9I8H0 Homeobox protein six1; Sine oculis homeobox homolog 1 from Xenopus laevis (African clawed frog) (see paper)
NP_001082027 homeobox protein six1 from Xenopus laevis
49% identity, 18% coverage

4egcA / P0AEX9,Q15475 Crystal structure of mbp-fused human six1 bound to human eya2 eya domain (see paper)
47% identity, 10% coverage

KNAT5_ARATH / P48002 Homeobox protein knotted-1-like 5; Homeodomain-containing protein 1; Protein KNAT5 from Arabidopsis thaliana (Mouse-ear cress) (see 3 papers)
AT4G32040 KNAT5 (KNOTTED1-LIKE HOMEOBOX GENE 5); transcription activator/ transcription factor from Arabidopsis thaliana
38% identity, 15% coverage

BLH6_ARATH / O65685 BEL1-like homeodomain protein 6; BEL1-like protein 6 from Arabidopsis thaliana (Mouse-ear cress) (see paper)
AT4G34610 BLH6 (BELL1-LIKE HOMEODOMAIN 6); DNA binding / transcription factor from Arabidopsis thaliana
NP_195187 BEL1-like homeodomain 6 from Arabidopsis thaliana
39% identity, 11% coverage

XP_002425861 Homeobox protein TGIF2LX, putative from Pediculus humanus corporis
40% identity, 17% coverage

IRX5_XENLA / Q90XW5 Iroquois-class homeodomain protein irx-5; Iroquois homeobox protein 5 from Xenopus laevis (African clawed frog) (see 6 papers)
42% identity, 12% coverage

XP_005711328 hypothetical protein from Chondrus crispus
35% identity, 13% coverage

IRX5_MOUSE / Q9JKQ4 Iroquois-class homeodomain protein IRX-5; Homeodomain protein IRXB2; Iroquois homeobox protein 5 from Mus musculus (Mouse) (see 4 papers)
NP_061296 iroquois-class homeodomain protein IRX-5 from Mus musculus
40% identity, 12% coverage

NP_991261 iroquois-class homeodomain protein IRX-4a from Danio rerio
38% identity, 13% coverage

FGSG_09043 hypothetical protein from Fusarium graminearum PH-1
36% identity, 11% coverage

XP_006255259 iroquois-class homeodomain protein IRX-5 isoform X1 from Rattus norvegicus
40% identity, 14% coverage

NP_620410 homeobox protein TGIF2LX from Homo sapiens
37% identity, 24% coverage

IRX5_HUMAN / P78411 Iroquois-class homeodomain protein IRX-5; Homeodomain protein IRX-2A; Homeodomain protein IRXB2; Iroquois homeobox protein 5 from Homo sapiens (Human) (see paper)
NP_005844 iroquois-class homeodomain protein IRX-5 isoform 1 from Homo sapiens
40% identity, 12% coverage

IRX2_MOUSE / P81066 Iroquois-class homeodomain protein IRX-2; Homeodomain protein IRXA2; Iroquois homeobox protein 2; Iroquois-class homeobox protein Irx6 from Mus musculus (Mouse) (see 4 papers)
NP_034704 iroquois-class homeodomain protein IRX-2 from Mus musculus
40% identity, 12% coverage

IRX3_MOUSE / P81067 Iroquois-class homeodomain protein IRX-3; Homeodomain protein IRXB1; Iroquois homeobox protein 3 from Mus musculus (Mouse) (see 8 papers)
NP_032419 iroquois-class homeodomain protein IRX-3 isoform 2 from Mus musculus
33% identity, 13% coverage

IRX2_XENLA / Q6DCQ1 Iroquois-class homeodomain protein irx-2; Irx2-A; Iroquois homeobox protein 2; Xiro2 from Xenopus laevis (African clawed frog) (see 6 papers)
40% identity, 13% coverage

Q9BZI1 Iroquois-class homeodomain protein IRX-2 from Homo sapiens
XP_011512281 iroquois-class homeodomain protein IRX-2 isoform X1 from Homo sapiens
40% identity, 12% coverage

IRX2_XENTR / Q66IK1 Iroquois-class homeodomain protein irx-2; Iroquois homeobox protein 2 from Xenopus tropicalis (Western clawed frog) (Silurana tropicalis) (see paper)
40% identity, 12% coverage

Q9PU52 Iroquois homologue 2 from Gallus gallus
XP_015137852 iroquois-class homeodomain protein IRX-2 isoform X1 from Gallus gallus
40% identity, 12% coverage

AT2G16400 BLH7 (bell1-like homeodomain 7); DNA binding / transcription factor from Arabidopsis thaliana
37% identity, 12% coverage

FOXG_07428 hypothetical protein from Fusarium oxysporum f. sp. lycopersici 4287
43% identity, 6% coverage

CUP9 putative uncharacterized protein CUP9 from Candida albicans (see paper)
XP_721352 Cup9p from Candida albicans SC5314
38% identity, 21% coverage

NP_957351 iroquois-class homeodomain protein IRX-2a from Danio rerio
40% identity, 13% coverage

KNAT4_ARATH / P48001 Homeobox protein knotted-1-like 4; Protein KNAT4 from Arabidopsis thaliana (Mouse-ear cress) (see 2 papers)
AT5G11060 KNAT4 (KNOTTED1-LIKE HOMEOBOX GENE 4); transcription activator/ transcription factor from Arabidopsis thaliana
38% identity, 14% coverage

XP_017456222 homeobox protein Mohawk from Rattus norvegicus
38% identity, 17% coverage

NP_077311 iroquois-class homeodomain protein IRX-6 from Homo sapiens
38% identity, 13% coverage

IRX5A_DANRE / Q1LXU5 Iroquois homeobox protein 5a from Danio rerio (Zebrafish) (Brachydanio rerio) (see 4 papers)
40% identity, 13% coverage

IRX5_XENTR / Q4LDQ3 Iroquois-class homeodomain protein irx-5; Iroquois homeobox protein 5 from Xenopus tropicalis (Western clawed frog) (Silurana tropicalis) (see paper)
40% identity, 12% coverage

P78414 Iroquois-class homeodomain protein IRX-1 from Homo sapiens
38% identity, 12% coverage

IRX1_MOUSE / P81068 Iroquois-class homeodomain protein IRX-1; Homeodomain protein IRXA1; Iroquois homeobox protein 1 from Mus musculus (Mouse) (see 4 papers)
NP_034703 iroquois-class homeodomain protein IRX-1 from Mus musculus
38% identity, 12% coverage

NP_001307054 iroquois-class homeodomain protein IRX-3a from Danio rerio
33% identity, 15% coverage

VDAG_04891 uncharacterized protein from Verticillium dahliae VdLs.17
40% identity, 6% coverage

CPAR2_301340 uncharacterized protein from Candida parapsilosis
32% identity, 23% coverage

NP_649256 Six4, isoform A from Drosophila melanogaster
53% identity, 9% coverage

F4JWP8 Homeobox protein knotted-1-like 3 from Arabidopsis thaliana
38% identity, 13% coverage

NP_571429 SIX homeobox 7 from Danio rerio
37% identity, 23% coverage

O04136 Homeobox protein knotted-1-like 3 from Malus domestica
38% identity, 13% coverage

SS1G_03098 hypothetical protein from Sclerotinia sclerotiorum 1980 UF-70
38% identity, 4% coverage

A0A0B7A551 Uncharacterized protein (Fragment) from Arion vulgaris
33% identity, 15% coverage

NP_571956 iroquois homeobox 7 from Danio rerio
35% identity, 20% coverage

KNOS3_ORYSJ / Q94LW3 Homeobox protein knotted-1-like 3; Homeobox protein HOS66 from Oryza sativa subsp. japonica (Rice) (see 2 papers)
36% identity, 18% coverage

six1-2 / CAD89530.1 six1-2 protein from Dugesia japonica (see paper)
37% identity, 14% coverage

KNAT3_ARATH / P48000 Homeobox protein knotted-1-like 3; Protein KNAT3 from Arabidopsis thaliana (Mouse-ear cress) (see 2 papers)
NP_197904 homeobox protein knotted-1-like 3 from Arabidopsis thaliana
AT5G25220, NP_197904 KNAT3 (KNOTTED1-LIKE HOMEOBOX GENE 3); transcription activator/ transcription factor from Arabidopsis thaliana
38% identity, 13% coverage

R7TKD0 Transcription factor Pax3/7 (Fragment) from Capitella teleta
39% identity, 26% coverage

NP_998265 iroquois-class homeodomain protein IRX-3b from Danio rerio
38% identity, 16% coverage

NP_631960 homeobox protein TGIF2LY from Homo sapiens
Q8IUE0 Homeobox protein TGIF2LY from Homo sapiens
32% identity, 34% coverage

GRMZM2G159431 homeobox protein HD1 from Zea mays
36% identity, 18% coverage

P78415 Iroquois-class homeodomain protein IRX-3 from Homo sapiens
NP_077312 iroquois-class homeodomain protein IRX-3 from Homo sapiens
36% identity, 12% coverage

NP_001073284 mix-type homeobox gene 2 from Danio rerio
39% identity, 21% coverage

NP_001084204 iroquois-class homeodomain protein irx-3 from Xenopus laevis
33% identity, 15% coverage

V4AMZ8 Uncharacterized protein (Fragment) from Lottia gigantea
36% identity, 24% coverage

IRX3_XENLA / O42261 Iroquois-class homeodomain protein irx-3; Iroquois homeobox protein 3; Xiro3 from Xenopus laevis (African clawed frog) (see 6 papers)
33% identity, 15% coverage

CLUG_02322 Homeobox KN domain family protein from Clavispora lusitaniae
39% identity, 28% coverage

NP_001081933 SIX homeobox 6 L homeolog from Xenopus laevis
38% identity, 23% coverage

XP_068068923 homeobox protein SIX6a isoform X1 from Danio rerio
38% identity, 19% coverage

Q9SYT6 Homeobox protein liguleless 3 from Zea mays
42% identity, 17% coverage

XP_005685997 homeobox protein SIX6 isoform X1 from Capra hircus
38% identity, 22% coverage

SIX6_CHICK / O93307 Homeobox protein SIX6; Optic homeobox 2; Sine oculis homeobox homolog 6; Six9 protein from Gallus gallus (Chicken) (see 2 papers)
38% identity, 22% coverage

NP_571635 mix-type homeobox gene 1 from Danio rerio
38% identity, 21% coverage

XP_038529716 homeobox protein SIX6 from Canis lupus familiaris
38% identity, 22% coverage

SIX6_MOUSE / Q9QZ28 Homeobox protein SIX6; Optic homeobox 2; Sine oculis homeobox homolog 6; Six9 protein from Mus musculus (Mouse) (see 3 papers)
NP_035514 homeobox protein SIX6 from Mus musculus
38% identity, 22% coverage

XP_017172856 homeobox protein SIX3 isoform X1 from Mus musculus
38% identity, 16% coverage

SIX6_HUMAN / O95475 Homeobox protein SIX6; Homeodomain protein OPTX2; Optic homeobox 2; Sine oculis homeobox homolog 6 from Homo sapiens (Human) (see paper)
NP_031400 homeobox protein SIX6 from Homo sapiens
38% identity, 22% coverage

XP_063118601 homeobox protein SIX3 isoform X1 from Rattus norvegicus
38% identity, 15% coverage

BC1G_06341 Bchox2 from Botrytis cinerea B05.10
36% identity, 4% coverage

NP_001018421 homeobox protein SIX6b from Danio rerio
38% identity, 22% coverage

Q9PUR3 Iroquois-related homeobox transcription factor IRX3 (Fragment) from Gallus gallus
38% identity, 56% coverage

NP_997067 iroquois-class homeodomain protein IRX-1a isoform 1 from Danio rerio
36% identity, 14% coverage

KNOSD_ORYSJ / Q0J6N4 Homeobox protein knotted-1-like 13; Homeobox protein OSH45 from Oryza sativa subsp. japonica (Rice) (see paper)
38% identity, 15% coverage

XP_015147858 LOW QUALITY PROTEIN: iroquois-class homeodomain protein IRX-3 from Gallus gallus
36% identity, 12% coverage

IRX3_XENTR / Q6NVN3 Iroquois-class homeodomain protein irx-3; Iroquois homeobox protein 3 from Xenopus tropicalis (Western clawed frog) (Silurana tropicalis) (see paper)
33% identity, 15% coverage

KNOS8_ORYSJ / Q10ED2 Homeobox protein knotted-1-like 8; Homeobox protein OSH43 from Oryza sativa subsp. japonica (Rice) (see paper)
38% identity, 18% coverage

LOC107436457 homeobox protein SIX6-like from Parasteatoda tepidariorum
38% identity, 15% coverage

so / CAB89515.1 homeodomain transcription factor from Girardia tigrina (see paper)
37% identity, 14% coverage

SIX3_MOUSE / Q62233 Homeobox protein SIX3; Sine oculis homeobox homolog 3 from Mus musculus (Mouse) (see 18 papers)
38% identity, 17% coverage

PAX7_HUMAN / P23759 Paired box protein Pax-7; HuP1 from Homo sapiens (Human) (see 3 papers)
NP_001128726 paired box protein Pax-7 isoform 3 from Homo sapiens
33% identity, 14% coverage

KNOS1_ORYSJ / Q9FP29 Homeobox protein knotted-1-like 1; Homeobox protein HOS16; Homeobox protein OSH6 from Oryza sativa subsp. japonica (Rice) (see paper)
42% identity, 17% coverage

NP_989600 paired box protein Pax-3 isoform a from Gallus gallus
33% identity, 14% coverage

Smp_147790 putative homothorax homeobox protein from Schistosoma mansoni
58% identity, 3% coverage

For advice on how to use these tools together, see Interactive tools for functional annotation of bacterial genomes.

Statistics

The PaperBLAST database links 798,070 different protein sequences to 1,261,478 scientific articles. Searches against EuropePMC were last performed on May 12 2025.

How It Works

PaperBLAST builds a database of protein sequences that are linked to scientific articles. These links come from automated text searches against the articles in EuropePMC and from manually-curated information from GeneRIF, UniProtKB/Swiss-Prot, BRENDA, CAZy (as made available by dbCAN), BioLiP, CharProtDB, MetaCyc, EcoCyc, TCDB, REBASE, the Fitness Browser, and a subset of the European Nucleotide Archive with the /experiment tag. Given this database and a protein sequence query, PaperBLAST uses protein-protein BLAST to find similar sequences with E < 0.001.
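The E-value cutoff described above can be illustrated with a minimal sketch. It assumes the BLAST results are available as BLAST+ tabular output (-outfmt 6) with the default 12 columns; that output format, the Hit type, and the significant_hits function are assumptions made for illustration and are not taken from PaperBLAST's own code.

```python
from typing import Iterable, List, NamedTuple

E_VALUE_CUTOFF = 1e-3  # the "E < 0.001" threshold stated in the text


class Hit(NamedTuple):
    subject_id: str
    percent_identity: float
    e_value: float


def significant_hits(tabular_lines: Iterable[str],
                     cutoff: float = E_VALUE_CUTOFF) -> List[Hit]:
    """Keep hits whose E-value is strictly below the cutoff.

    Assumes BLAST+ tabular output (-outfmt 6) with the default columns:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send
    evalue bitscore
    """
    hits = []
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        if len(fields) < 12:
            continue  # skip malformed rows
        e_value = float(fields[10])
        if e_value < cutoff:
            hits.append(Hit(fields[1], float(fields[2]), e_value))
    return hits
```

The comparison is strict, so a hit with an E-value of exactly 0.001 is dropped, matching "E < 0.001".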

To build the database, we query EuropePMC with locus tags, with RefSeq protein identifiers, and with UniProt accessions. We obtain the locus tags from RefSeq or from MicrobesOnline. We use queries of the form "locus_tag AND genus_name" to try to ensure that the paper is actually discussing that gene. Because EuropePMC indexes most recent biomedical papers, even if they are not open access, some of the links may be to papers that you cannot read or that our computers cannot read. We query each of these identifiers that appears in the open access part of EuropePMC, as well as every locus tag that appears in the 500 most-referenced genomes, so that a gene may appear in the PaperBLAST results even though none of the papers that mention it are open access. We also incorporate text-mined links from EuropePMC that link open access articles to UniProt or RefSeq identifiers. (This yields some additional links because EuropePMC uses different heuristics for their text mining than we do.)
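As a rough illustration of the "locus_tag AND genus_name" query form, the sketch below builds such a string from a locus tag and an organism name. The function names are hypothetical, and taking the first word of the organism name as the genus is a simplifying assumption (it would fail for names such as "Candidatus ..."). The sketch only builds the query text and does not call EuropePMC.

```python
def genus_of(organism: str) -> str:
    """Return the first word of an organism name, taken here as the genus."""
    parts = organism.split()
    if not parts:
        raise ValueError("organism name is empty")
    return parts[0]


def build_query(locus_tag: str, organism: str) -> str:
    """Combine a locus tag with its genus in the 'locus_tag AND genus_name' form."""
    return f"{locus_tag} AND {genus_of(organism)}"


# Example using a locus tag that appears in the hit list above.
print(build_query("AT5G25220", "Arabidopsis thaliana"))
```

Pairing the identifier with the genus narrows the search to papers that are likely about that organism, since short locus tags can otherwise match unrelated text.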

For every article that mentions a locus tag, a RefSeq protein identifier, or a UniProt accession, we try to select one or two snippets of text that refer to the protein. If we cannot get access to the full text, we try to select a snippet from the abstract, but unfortunately, unique identifiers such as locus tags are rarely provided in abstracts.
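One simple way to pick snippets, sketched below, is to take a window of text around each mention of the identifier and stop after two. This is an illustrative assumption, not PaperBLAST's actual heuristic; the function name, the window size, and the whole-token matching rule are all hypothetical.

```python
import re
from typing import List


def select_snippets(text: str, identifier: str,
                    max_snippets: int = 2, context: int = 80) -> List[str]:
    """Return up to max_snippets non-overlapping windows around the identifier."""
    # Match the identifier only as a whole token, so "b0002" does not match "b00021".
    pattern = re.compile(
        r"(?<![A-Za-z0-9_])" + re.escape(identifier) + r"(?![A-Za-z0-9_])"
    )
    snippets = []
    last_end = 0
    for match in pattern.finditer(text):
        if match.start() < last_end:
            continue  # this mention overlaps the previous snippet
        start = max(match.start() - context, last_end)
        end = min(match.end() + context, len(text))
        snippets.append(text[start:end].strip())
        last_end = end
        if len(snippets) == max_snippets:
            break
    return snippets
```

If the identifier never appears, as is common for abstracts, the function returns an empty list.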

PaperBLAST also incorporates manually-curated protein functions. Except for GeneRIF and ENA, the curated entries include a short curated description of the protein's function. For entries from BioLiP, the protein's function may not be known beyond binding to the ligand. Many of these entries also link to articles in PubMed.

For more information see the PaperBLAST paper (mSystems 2017) or the code. You can download PaperBLAST's database here.

Many of the changes to PaperBLAST since the paper was written are described in Interactive tools for functional annotation of bacterial genomes.

Secrets

PaperBLAST cannot provide snippets for many of the papers that are published in non-open-access journals. This limitation applies even if the paper is marked as "free" on the publisher's web site and is available in PubMed Central or EuropePMC. If a journal that you publish in is marked as "secret," please consider publishing elsewhere.

Omissions from the PaperBLAST Database

Many important articles are missing from PaperBLAST, either because the article's full text is not in EuropePMC (as for many older articles), or because the paper does not mention a protein identifier such as a locus tag, or because of PaperBLAST's heuristics. If you notice an article that characterizes a protein's function but is missing from PaperBLAST, please notify the curators at UniProt or add an entry to GeneRIF. Entries in either of these databases will eventually be incorporated into PaperBLAST. Note that to add an entry to UniProt, you will need to find the UniProt identifier for the protein. If the protein is not already in UniProt, you can ask them to create an entry. To add an entry to GeneRIF, you will need an NCBI Gene identifier, but unfortunately many prokaryotic proteins in RefSeq do not have corresponding Gene identifiers.

References

PaperBLAST: Text-mining papers for information about homologs.
M. N. Price and A. P. Arkin (2017). mSystems, 10.1128/mSystems.00039-17.

Europe PMC in 2017.
M. Levchenko et al (2017). Nucleic Acids Research, 10.1093/nar/gkx1005.

Gene indexing: characterization and analysis of NLM's GeneRIFs.
J. A. Mitchell et al (2003). AMIA Annu Symp Proc 2003:460-464.

UniProt: the universal protein knowledgebase.
The UniProt Consortium (2016). Nucleic Acids Research, 10.1093/nar/gkw1099.

BRENDA in 2017: new perspectives and new tools in BRENDA.
S. Placzek et al (2017). Nucleic Acids Research, 10.1093/nar/gkw952.

The EcoCyc database: reflecting new knowledge about Escherichia coli K-12.
I. M. Keseler et al (2016). Nucleic Acids Research, 10.1093/nar/gkw1003.

The MetaCyc database of metabolic pathways and enzymes.
R. Caspi et al (2018). Nucleic Acids Research, 10.1093/nar/gkx935.

CharProtDB: a database of experimentally characterized protein annotations.
R. Madupu et al (2012). Nucleic Acids Research, 10.1093/nar/gkr1133.

The carbohydrate-active enzymes database (CAZy) in 2013.
V. Lombard et al (2014). Nucleic Acids Research, 10.1093/nar/gkt1178.

The Transporter Classification Database (TCDB): recent advances.
M. H. Saier, Jr. et al (2016). Nucleic Acids Research, 10.1093/nar/gkv1103.

REBASE - a database for DNA restriction and modification: enzymes, genes and genomes.
R. J. Roberts et al (2015). Nucleic Acids Research, 10.1093/nar/gku1046.

Deep annotation of protein function across diverse bacteria from mutant phenotypes.
M. N. Price et al (2016). bioRxiv, 10.1101/072470.

by Morgan Price, Arkin group
Lawrence Berkeley National Laboratory