PaperBLAST
PaperBLAST Hits for VIMSS10109814 hypothetical protein (361 a.a., MSLLLNQPSK...)
>VIMSS10109814 hypothetical protein
MSLLLNQPSKLCFRRPVLARSSPLHSNGFSSSSLQTPPSMMALKRTYFADYAHHFEKKVP
PELVYNDIKDDDDIGTIGSSHGWVVTLKDDGILRLQDDLNPVASETDPKRISLPPLVTLP
HCQTQIVTNVAMSSSSPEDDECVVAVKFLGPQLRFCRPALKNPEWTNIRIQNPCFFSSRV
MFSETHDMFRIPGSGGHLIGSWGLHTPPKIDKLRFQNLPKLTKTKRKLLHSCFTSEHLVE
SRSTGETFLVKWYRRTPAKPIKGMATMITKALHVFKLDKKGNAVYTQDIGDLCIFLSKSE
PFCVPASSFPQGLHSLQAMRSNHVYILDVDEFGLVELVDSSIASVNCTLKVPFYIPPQNI
N
Found 37 similar proteins in the literature:
AT5G53240 hypothetical protein from Arabidopsis thaliana
100% identity, 100% coverage
- Experimentally heat-induced transposition increases drought tolerance in Arabidopsis thaliana
Thieme, The New phytologist 2022 - “...class) family (/) 5 2105023943 Intron AT5G51800 Protein kinase superfamily protein (/) 5 2160203034 Exon AT5G53240 Hypothetical protein (DUF295) (+/) 5 2284643236 Intron AT5G56400 FBD, Fbox, Skp2like and Leucine Rich Repeat domainscontaining protein (+/) Location, description (Araport11) and zygosity (/ homozygous ONSEN , +/ heterozygous) of...”
- Neofunctionalization of Mitochondrial Proteins and Incorporation into Signaling Networks in Plants
Lama, Molecular biology and evolution 2019 - “...AT5G52940 AtDOB5 + + mito + Vir AT5G53230 AtDOB6 + + mito + Ang Eud AT5G53240 AtDOB7 + + mito Brass AT5G54320 AtDOB8 + + mito Ang Eud AT5G54330 AtDOB9 + + mito Ang Eud AT5G54420 AtDOB10 + + pm Land Brass AT5G54450 AtDOB11 + +...”
- BLISTER Regulates Polycomb-Target Genes, Represses Stress-Regulated Genes and Promotes Stress Responses in Arabidopsis thaliana
Kleinmanns, Frontiers in plant science 2017 - “...gene 13 (SAG13) 10.90 Yes 9 AT5G53230 Protein of unknown function (DUF295) 10.73 Yes 10 AT5G53240 Protein of unknown function (DUF295) 9.96 Yes 11 AT1G09180 Secretion-associated RAS super family 1 (SARA1) 8.44 No 12 AT3G57260 beta-1,3-glucanase 2, PATHOGENESIS-RELATED PROTEIN 2, (PR2) 8.00 Yes 13 AT2G38240 2-oxoglutarate...”
- Differential proteomic analysis of Arabidopsis thaliana genotypes exhibiting resistance or susceptibility to the insect herbivore, Plutella xylostella
Collins, PloS one 2010 - “...individual Ril; isocitrate dehydrogenase (pooled Rils=At5g14590; individual Ril=At1g65930), malate dehydrogenase (pooled Rils=At5g58330; individual Ril=At5g53240 and At5g53240 or At3g15020), and glyceraldehyde-3-phosphate dehydrogenase (pooled Rils=At1g42970 and At3g04120; individual Ril=At1g13440). The relevance of these has been discussed above. A number of other protein differences between Rils 57 and 23...”
AT5G55270 hypothetical protein from Arabidopsis thaliana
64% identity, 100% coverage
AT5G52940 hypothetical protein from Arabidopsis thaliana
61% identity, 100% coverage
- Transcriptome Remodeling in Arabidopsis: A Response to Heterologous Poplar MSL-lncRNAs Overexpression
Mao, Plants (Basel, Switzerland) 2024 - “...S2 . A search of the National Center for Biotechnology Information (NCBI) database revealed that AT5G52940 (Chr5: 21472103-21473602), AT5G54450 (Chr5: 22108273-22109379), and AT4G25930 (Chr4: 13168127-13169393) are classified as hypothetical proteins (DUF295) containing the domain of Unknown Function 295. The C-terminal DUF295 domain is often accompanied by...”
- “...leading to the upregulation of Class B gene expression [ 40 ]. In our results, AT5G52940, AT5G54450, and AT4G25930 possess the F-box domain, suggesting that they may be potential genes involved in the co-regulation of stamen development by MSL-lncRNAs. Transcription factors represent a significant proportion of...”
- Neofunctionalization of Mitochondrial Proteins and Incorporation into Signaling Networks in Plants
Lama, Molecular biology and evolution 2019 - “...Ang Eud AT4G13680 AtDOB3 + + mito Brass Eud AT5G52930 AtDOB4 + + mito + AT5G52940 AtDOB5 + + mito + Vir AT5G53230 AtDOB6 + + mito + Ang Eud AT5G53240 AtDOB7 + + mito Brass AT5G54320 AtDOB8 + + mito Ang Eud AT5G54330 AtDOB9 +...”
- “...of the predicted mitochondrial isoforms, we cloned three representatives AtDOA10 ( At4g25930 ), AtDOB5 ( At5g52940 ), and AtDOB12 ( At5g54550 ) into C-terminal GFP-fusion vectors (see below for more information on why these were selected). The localization of the fusion proteins was analyzed by transient...”
AT5G55890 hypothetical protein from Arabidopsis thaliana
AT5G55880 hypothetical protein from Arabidopsis thaliana
63% identity, 99% coverage
- Enhanced recombination empowers the detection and mapping of Quantitative Trait Loci
Capilla-Pérez, Communications biology 2024 - “...to 22kb (Chr5:22608 to 22631). The defined interval contains five protein-coding genes (AT5G55860, AT5G55870, AT5G55880, AT5G55890, AT5G55893) and two transposons (AT5G55875 and ATG555896) (Fig. 6 ). One very tempting candidate is AT5G55860/ TREPH1 , whose mutation leads to modification of flowering time and rosette radius in...”
- Neofunctionalization of Mitochondrial Proteins and Incorporation into Signaling Networks in Plants
Lama, Molecular biology and evolution 2019 - “...Eud AT5G55870 AtDOB15 + + mito Ang Eud AT5G55880 AtDOB16 + + mito Ang Eud AT5G55890 AtDOB17 + + mito Ang Eud Note . A. trich , conserved in Amborella trichopoda ; Mono/Dicot, conserved in monocots and dicots; Brassic-only, only conserved in Brassicaceae; F-box, protein contains...”
- Joint Analysis of Dependent Features within Compound Spectra Can Improve Detection of Differential Features
Trutschel, Frontiers in bioengineering and biotechnology 2015 - “...population (Schneider et al., 2005 ). This particular mutant has an over-expression of the AT5G55880 AT5G55890 genetic region with unknown function. Plants were grown on soil in a growth chamber under controlled conditions as biological replicates. The frozen leaf material of each plant was ground and...”
- Enhanced recombination empowers the detection and mapping of Quantitative Trait Loci
Capilla-Pérez, Communications biology 2024 - “...down to 22kb (Chr5:22608 to 22631). The defined interval contains five protein-coding genes (AT5G55860, AT5G55870, AT5G55880, AT5G55890, AT5G55893) and two transposons (AT5G55875 and ATG555896) (Fig. 6 ). One very tempting candidate is AT5G55860/ TREPH1 , whose mutation leads to modification of flowering time and rosette radius...”
- Neofunctionalization of Mitochondrial Proteins and Incorporation into Signaling Networks in Plants
Lama, Molecular biology and evolution 2019 - “...+ Land AT5G55270 AtDOB14 + + plastid Eud AT5G55870 AtDOB15 + + mito Ang Eud AT5G55880 AtDOB16 + + mito Ang Eud AT5G55890 AtDOB17 + + mito Ang Eud Note . A. trich , conserved in Amborella trichopoda ; Mono/Dicot, conserved in monocots and dicots; Brassic-only,...”
- Joint Analysis of Dependent Features within Compound Spectra Can Improve Detection of Differential Features
Trutschel, Frontiers in bioengineering and biotechnology 2015 - “...TAMARA population (Schneider et al., 2005 ). This particular mutant has an over-expression of the AT5G55880 AT5G55890 genetic region with unknown function. Plants were grown on soil in a growth chamber under controlled conditions as biological replicates. The frozen leaf material of each plant was ground...”
AT5G54330 hypothetical protein from Arabidopsis thaliana
63% identity, 97% coverage
- FIERY1 promotes microRNA accumulation by suppressing rRNA-derived small interfering RNAs in Arabidopsis
You, Nature communications 2019 - “...the location of the mutation to a region consisting of 10 candidate genes (AT5G49770, AT5G52170, AT5G54330, AT5G55330, AT5G57060, AT5G62770, AT5G63450, AT5G63980, AT5G64390, and AT5G64430) that contained SNPs with discordant chastity scores over 0.95. We designed dCAPS primers for all of the candidate genes and analyzed another...”
- Neofunctionalization of Mitochondrial Proteins and Incorporation into Signaling Networks in Plants
Lama, Molecular biology and evolution 2019 - “...Ang Eud AT5G53240 AtDOB7 + + mito Brass AT5G54320 AtDOB8 + + mito Ang Eud AT5G54330 AtDOB9 + + mito Ang Eud AT5G54420 AtDOB10 + + pm Land Brass AT5G54450 AtDOB11 + + mito + Ang Eud AT5G54550 AtDOB12 + + mito + Ang Eud AT5G54560...”
AT5G53230 hypothetical protein from Arabidopsis thaliana
61% identity, 100% coverage
AT5G54320 hypothetical protein from Arabidopsis thaliana
61% identity, 95% coverage
- Neofunctionalization of Mitochondrial Proteins and Incorporation into Signaling Networks in Plants
Lama, Molecular biology and evolution 2019 - “...Vir AT5G53230 AtDOB6 + + mito + Ang Eud AT5G53240 AtDOB7 + + mito Brass AT5G54320 AtDOB8 + + mito Ang Eud AT5G54330 AtDOB9 + + mito Ang Eud AT5G54420 AtDOB10 + + pm Land Brass AT5G54450 AtDOB11 + + mito + Ang Eud AT5G54550 AtDOB12...”
- “...of 34 genes (80%) spread over 13 tandems. One tandem even contained six genes spanning At5g54320 to At5g54560 . The F-box/DUF295 FDB group contained 26 of 38 (68%) tandem duplicated genes spread over 10 tandems. Also here tandems of up to six genes were found (At4g22030...”
AT5G54560 hypothetical protein from Arabidopsis thaliana
59% identity, 100% coverage
- Physiological and Transcriptomic Analysis of Arabidopsis thaliana Responses to Ailanthone, a Potential Bio-Herbicide
Hopson, International journal of molecular sciences 2022 - “...hypothetical protein 6.85 4.55 5.52 UP AT5G55150 F-box SKIP23-like protein (DUF295) 6.65 3.82 3.02 UP AT5G54560 hypothetical protein (DUF295) 6.60 6.05 6.05 UP AT2G20800 NAD(P)H dehydrogenase B4 ( NDB4 ) 6.34 3.76 4.46 UP AT5G54450 hypothetical protein (DUF295) 6.24 2.60 3.30 UP AT5G36130 Cytochrome P450 superfamily...”
- Neofunctionalization of Mitochondrial Proteins and Incorporation into Signaling Networks in Plants
Lama, Molecular biology and evolution 2019 - “...AtDOB11 + + mito + Ang Eud AT5G54550 AtDOB12 + + mito + Ang Eud AT5G54560 AtDOB13 + + mito + Land AT5G55270 AtDOB14 + + plastid Eud AT5G55870 AtDOB15 + + mito Ang Eud AT5G55880 AtDOB16 + + mito Ang Eud AT5G55890 AtDOB17 + +...”
- “...genes (80%) spread over 13 tandems. One tandem even contained six genes spanning At5g54320 to At5g54560 . The F-box/DUF295 FDB group contained 26 of 38 (68%) tandem duplicated genes spread over 10 tandems. Also here tandems of up to six genes were found (At4g22030 to At4g22180)....”
AT5G54550 hypothetical protein from Arabidopsis thaliana
61% identity, 96% coverage
AT5G54450 hypothetical protein from Arabidopsis thaliana
62% identity, 94% coverage
- Transcriptome Remodeling in Arabidopsis: A Response to Heterologous Poplar MSL-lncRNAs Overexpression
Mao, Plants (Basel, Switzerland) 2024 - “...search of the National Center for Biotechnology Information (NCBI) database revealed that AT5G52940 (Chr5: 21472103-21473602), AT5G54450 (Chr5: 22108273-22109379), and AT4G25930 (Chr4: 13168127-13169393) are classified as hypothetical proteins (DUF295) containing the domain of Unknown Function 295. The C-terminal DUF295 domain is often accompanied by an N-terminal F-box...”
- “...to the upregulation of Class B gene expression [ 40 ]. In our results, AT5G52940, AT5G54450, and AT4G25930 possess the F-box domain, suggesting that they may be potential genes involved in the co-regulation of stamen development by MSL-lncRNAs. Transcription factors represent a significant proportion of eukaryotic...”
- Physiological and Transcriptomic Analysis of Arabidopsis thaliana Responses to Ailanthone, a Potential Bio-Herbicide
Hopson, International journal of molecular sciences 2022 - “...6.60 6.05 6.05 UP AT2G20800 NAD(P)H dehydrogenase B4 ( NDB4 ) 6.34 3.76 4.46 UP AT5G54450 hypothetical protein (DUF295) 6.24 2.60 3.30 UP AT5G36130 Cytochrome P450 superfamily protein 2.08 2.99 2.05 DOWN AT4G40020 Myosin heavy chain-related protein 2.24 1.23 1.26 DOWN AT4G31940 cytochrome P450, family 82,...”
- Neofunctionalization of Mitochondrial Proteins and Incorporation into Signaling Networks in Plants
Lama, Molecular biology and evolution 2019 - “...Eud AT5G54330 AtDOB9 + + mito Ang Eud AT5G54420 AtDOB10 + + pm Land Brass AT5G54450 AtDOB11 + + mito + Ang Eud AT5G54550 AtDOB12 + + mito + Ang Eud AT5G54560 AtDOB13 + + mito + Land AT5G55270 AtDOB14 + + plastid Eud AT5G55870 AtDOB15...”
- “...Organellar genes that are apparently regulated by ANAC017, five are present in tandem gene duplications. At5g54450 , At5g54550 , and At5g54560 form a consecutive group of three (from a total of six in close proximity) (see table1 ), whereas At5g52930 and At5g52940 form a consecutive group...”
- The modulation of acetic acid pathway genes in Arabidopsis improves survival under drought stress
Rasheed, Scientific reports 2018 - “...to WT plants. Gene Name Fold change WT_D/WT_C P5_C/WT_C P5_D/WT_D AT4G33070 PDC1 0.90 17.36 31.35 AT5G54450 Hypothetical protein (DUF295) 1.21 31.75 25.97 AT2G03130 Ribosomal protein 0.74 9.13 14.01 AT2G47520 Ethylene Response Factor 71 (ERF71) 0.86 14.32 13.45 AT3G42658 SADHU3-2 1.16 14.53 8.13 AT5G09570 Cox19-like CHCH family...”
- Genome-wide transcriptome analysis of two contrasting Brassica rapa doubled haploid lines under cold-stresses using Br135K oligomeric chip
Jung, PloS one 2014 - “...Bra034094 AT3G10120 Unknown protein 4.9 4.0 3.4 1.1 1.2 1.6 0.8 1.0 2.2 1.6 Bra039899 AT5G54450 Protein of unknown function (DUF295) 4.2 3.9 3.1 0.9 1.2 1.5 1.1 1.3 1.1 0.9 Bra024091 AT4G30230 Unknown protein 4.2 4.2 2.0 1.1 1.0 1.6 1.3 1.9 1.0 0.9 Bra009220...”
AT5G52930 hypothetical protein from Arabidopsis thaliana
59% identity, 91% coverage
- Neofunctionalization of Mitochondrial Proteins and Incorporation into Signaling Networks in Plants
Lama, Molecular biology and evolution 2019 - “...Eud AT3G43170 AtDOB2 + + perox Ang Eud AT4G13680 AtDOB3 + + mito Brass Eud AT5G52930 AtDOB4 + + mito + AT5G52940 AtDOB5 + + mito + Vir AT5G53230 AtDOB6 + + mito + Ang Eud AT5G53240 AtDOB7 + + mito Brass AT5G54320 AtDOB8 + +...”
- “...group of three (from a total of six in close proximity) (see table1 ), whereas At5g52930 and At5g52940 form a consecutive group (two out of two at this locus). This suggests their coregulation is possibly caused by coduplication of regulatory information. This coexpression of neighboring genes...”
AT4G13680 hypothetical protein from Arabidopsis thaliana
56% identity, 92% coverage
AT1G05540 hypothetical protein from Arabidopsis thaliana
40% identity, 74% coverage
AT2G45940 hypothetical protein from Arabidopsis thaliana
38% identity, 77% coverage
AT4G16080 hypothetical protein from Arabidopsis thaliana
35% identity, 86% coverage
- Neofunctionalization of Mitochondrial Proteins and Incorporation into Signaling Networks in Plants
Lama, Molecular biology and evolution 2019 - “...Eud AT2G45940 AtDOA6 + + cyt Land Vir AT4G14260 AtDOA7 + + nu Brass Eud AT4G16080 AtDOA8 + + mito mito AT4G25920 AtDOA9 + + mito Eud AT4G25930 AtDOA10 + + mito + Ang Eud AT5G03390 AtDOA11 + + mito plastid Land AT5G46130 AtDOA12 + +...”
- “...evidence for organellar location of the DOA and DOB proteins was very limited. AtDOA8 ( At4g16080 ) was identified in purified mitochondria by mass spectrometry (MS) ( Senkler etal. 2017 ), whereas AtDOA11 ( AT5G03390 ) was identified by MS in purified chloroplasts ( Zybailov etal....”
LOC106438264 uncharacterized protein LOC106438264 from Brassica napus
40% identity, 67% coverage
AT1G30160 hypothetical protein from Arabidopsis thaliana
39% identity, 65% coverage
- Neofunctionalization of Mitochondrial Proteins and Incorporation into Signaling Networks in Plants
Lama, Molecular biology and evolution 2019 - “...pm Land AT1G05540 AtDOA1 + + mito Land AT1G05550 AtDOA2 + + pm Land Ang AT1G30160 AtDOA3 + + cyt + Eud AT1G30170 AtDOA4 + + mito Ang AT1G68960 AtDOA5 + + mito Ang Eud AT2G45940 AtDOA6 + + cyt Land Vir AT4G14260 AtDOA7 + +...”
- “...(CTTGnnnnnCAAG or similar) was found ( Kulkarni et al. 2018 ). Only for AtDOA3 ( At1g30160 ) no MDM could be found, which is in line with its ANAC017-independent gene expression ( supplementary table 2 , Supplementary Material online). Furthermore, by using DNA affinity purification sequencing...”
AT5G53790 hypothetical protein from Arabidopsis thaliana
32% identity, 77% coverage
AT1G30170 hypothetical protein from Arabidopsis thaliana
33% identity, 98% coverage
- A novel seed plants gene regulates oxidative stress tolerance in Arabidopsis thaliana
Sujeeth, Cellular and molecular life sciences : CMLS 2020 - “...HSFA2, ZAT10 ), chromatin remodelers ( CHR34 ), and unknown or uncharacterized proteins ( AT5G59390, AT1G30170, AT1G21520 ) are elevated in atr7 . This indicates that atr7 is primed for an upcoming oxidative stress via pathways involving genes of unknown functions. Collectively, the data reveal ATR7...”
- “...remodelers ( CHR34 ), and many genes encoding uncharacterized proteins ( ANAC085 , AT5G59390 , AT1G30170 , AT1G21520 ), were constantly upregulated in atr7 in the absence of stress (Table 1 , Supplementary Table3). Among the proteins encoded by the 100 most upregulated genes, 32% are...”
- Neofunctionalization of Mitochondrial Proteins and Incorporation into Signaling Networks in Plants
Lama, Molecular biology and evolution 2019 - “...Land AT1G05550 AtDOA2 + + pm Land Ang AT1G30160 AtDOA3 + + cyt + Eud AT1G30170 AtDOA4 + + mito Ang AT1G68960 AtDOA5 + + mito Ang Eud AT2G45940 AtDOA6 + + cyt Land Vir AT4G14260 AtDOA7 + + nu Brass Eud AT4G16080 AtDOA8 + +...”
AT5G55440 hypothetical protein from Arabidopsis thaliana
33% identity, 82% coverage
- Marker and readout genes for defense priming in Pseudomonas cannabina pv. alisalensis interaction aid understanding systemic immunity in Arabidopsis
Sistenich, Scientific reports 2024 - “...after rechallenge in primed leaves (immunological condition 2 compared to condition 4; Fig. 1 ) AT5G55440 ATDOA16 DUF295 ORGANELLAR A 16, Fbox protein 7.08 1.85 5.16 2.98 AT5G22380 NAC090 NAC domain containing protein 90 6.01 1.19 1.54 3.47 AT3G25010 RLP41 Receptorlike protein 41 5.93 0.15 4.53...”
- “...protein 7.26 2.21 2.50 3.03 AT5G62480 GSTU9 Glutathione Stransferase tau 9 7.12 4.63 3.13 2.32 AT5G55440 ATDOA16 DUF295 ORGANELLAR A 16, Fbox protein 7.08 1.85 5.16 2.98 Top genes expressed because of priming in systemic rechallenge condition (condition 3 compared to condition 4) AT5G24540 BGLU31 Beta...”
- Neofunctionalization of Mitochondrial Proteins and Incorporation into Signaling Networks in Plants
Lama, Molecular biology and evolution 2019 - “...+ cyt AT5G53780 AtDOA14 + + cyt Eud Brass AT5G53790 AtDOA15 + + pm Brass AT5G55440 AtDOA16 + + mito Eud Brass AT5G67040 AtDOA17 + + cyt Brass AT3G25200 AtDOB1 + + cyt Ang Eud AT3G43170 AtDOB2 + + perox Ang Eud AT4G13680 AtDOB3 + +...”
- A Lipid Transfer Protein Increases the Glutathione Content and Enhances Arabidopsis Resistance to a Trichothecene Mycotoxin
McLaughlin, PloS one 2015 - “...Sequence analysis indicated that the T-DNA tag was inserted into the last exon of the At5G55440 gene ( Fig 1B ), which encodes a protein of unknown function. To determine the effect of the insertion on transcription in this region, expression of At5G55440, two upstream genes...”
- “...was little difference in expression of At5G55420 and no expression was detected from At5G55430 or At5G55440 in either trr1 or in wild type. However in trr1 , expression of At5G55450, which is ~220 bp downstream of the T-DNA insert was induced 12-fold and expression of At5G55460,...”
AT5G53780 hypothetical protein from Arabidopsis thaliana
30% identity, 79% coverage
AT4G25920 hypothetical protein from Arabidopsis thaliana
36% identity, 61% coverage
- Neofunctionalization of Mitochondrial Proteins and Incorporation into Signaling Networks in Plants
Lama, Molecular biology and evolution 2019 - “...Land Vir AT4G14260 AtDOA7 + + nu Brass Eud AT4G16080 AtDOA8 + + mito mito AT4G25920 AtDOA9 + + mito Eud AT4G25930 AtDOA10 + + mito + Ang Eud AT5G03390 AtDOA11 + + mito plastid Land AT5G46130 AtDOA12 + + mito AT5G46140 AtDOA13 + + cyt...”
- “...as an annotated domain ( fig.1 ). Yeast two-hybrid interactions have been reported only for At4g25920 ( Arabidopsis Interactome Mapping Consortium 2011 ) ( supplementary table 1 , Supplementary Material online). A fourth group (indicated in orange in fig.1 ), including At1g57790 and At5g55150 , contains...”
- Comparative analysis of plant immune receptor architectures uncovers host proteins likely targeted by pathogens
Sarris, BMC biology 2016 - “...DNA-binding domain HARXL16 AT4G32570 tify tify domain HARXL21 AT1G15750 WD40 WD domain, G-beta repeat HARXL44 AT4G25920 DUF295 Protein of unknown function (DUF295) HARXL44 AT4G16380 HMA Heavy metal-associated domain HARXL45_group AT4G02550 Myb_DNA-bind_3 Myb/SANT-like DNA-binding domain HARXL68 AT1G45145 Thioredoxin Thioredoxin HARXL68 AT5G42980 Thioredoxin Thioredoxin HARXL73 AT4G39050 Kinesin Kinesin...”
- Conserved versatile master regulators in signalling pathways in response to stress in plants
Balderas-Hernández, AoB PLANTS 2013 - “...Sip1p/Sip2p/Gal83p family member); activates glucose-repressed genes, represses glucose-induced genes; role in sporulation and peroxisome biogenesis AT4G25920 828698 Hypothetical protein AT1G07310 837242 Calcium-dependent lipid-binding domain GDU4 817013 Glutamine dumper 4 ATERDJ2A 844334 Translocation protein SEC63 AT1G19450 8838529 Sugar transporter ERD6-like 4 CNGC18 831339 Cyclic nucleotide-gated channel 18...”
- Regulation of ABCB1/PGP1-catalysed auxin transport by linker phosphorylation
Henrichs, The EMBO journal 2012 - “...protein (At3G48830) and an unknown protein (At4G25920). In addition, two protein kinases, the histidinelike kinase AHK5/CYTOKININ INDEPENDENT2 (At5G10720)...”
- “...AT3G48830 61 3.2 Similar to unknown protein 1.1 AT4G25920 38 AHK5 (CYTOKININ INDEPENDENT 2) AT5G10720 34 1.2 PID (PINOID) AT2G34650 32 3.0 Protein phosphatase...”
AT5G55870 hypothetical protein from Arabidopsis thaliana
56% identity, 32% coverage
AT4G14260 hypothetical protein from Arabidopsis thaliana
30% identity, 62% coverage
AT1G68960 hypothetical protein from Arabidopsis thaliana
31% identity, 80% coverage
AT3G43170 hypothetical protein from Arabidopsis thaliana
35% identity, 57% coverage
AT4G25930 hypothetical protein from Arabidopsis thaliana
30% identity, 83% coverage
- Transcriptome Remodeling in Arabidopsis: A Response to Heterologous Poplar MSL-lncRNAs Overexpression
Mao, Plants (Basel, Switzerland) 2024 - “...Center for Biotechnology Information (NCBI) database revealed that AT5G52940 (Chr5: 21472103-21473602), AT5G54450 (Chr5: 22108273-22109379), and AT4G25930 (Chr4: 13168127-13169393) are classified as hypothetical proteins (DUF295) containing the domain of Unknown Function 295. The C-terminal DUF295 domain is often accompanied by an N-terminal F-box domain [ 27 ],...”
- “...upregulation of Class B gene expression [ 40 ]. In our results, AT5G52940, AT5G54450, and AT4G25930 possess the F-box domain, suggesting that they may be potential genes involved in the co-regulation of stamen development by MSL-lncRNAs. Transcription factors represent a significant proportion of eukaryotic genomes, with...”
- Neofunctionalization of Mitochondrial Proteins and Incorporation into Signaling Networks in Plants
Lama, Molecular biology and evolution 2019 - “...nu Brass Eud AT4G16080 AtDOA8 + + mito mito AT4G25920 AtDOA9 + + mito Eud AT4G25930 AtDOA10 + + mito + Ang Eud AT5G03390 AtDOA11 + + mito plastid Land AT5G46130 AtDOA12 + + mito AT5G46140 AtDOA13 + + cyt AT5G53780 AtDOA14 + + cyt Eud...”
- “...were published for any of the predicted mitochondrial isoforms, we cloned three representatives AtDOA10 ( At4g25930 ), AtDOB5 ( At5g52940 ), and AtDOB12 ( At5g54550 ) into C-terminal GFP-fusion vectors (see below for more information on why these were selected). The localization of the fusion proteins...”
AT5G54420 hypothetical protein from Arabidopsis thaliana
54% identity, 27% coverage
AT4G16090 hypothetical protein from Arabidopsis thaliana
35% identity, 48% coverage
- Nitrate Uptake Affects Cell Wall Synthesis and Modeling
Landi, Frontiers in plant science 2017 - “...101.2 RWP-RK 80.7 transporter 94.4 GIA1 60.7 PIP1D 22.2 WAV5 12 RPT3 25.1 RING/U-box 49.5 At4g16090 127.1 F-box 154.6 Oxidoreductase 105.7 Kinase 87.7 Major facilitator 94.9 CYP72A15 64.7 Major Facilitator 23.4 TIP2;1 12.2 Gibberellin-regulated 26.8 Kinase 53.5 Transposable 128.1 Transposable 173.5 RWP-RK 119.8 CRK24 90.3 WR3...”
- “...123.5 Transposable 111.1 COR15B 71.5 ZYK4 26.8 Phosphodiesterases 14.7 GRH1 31.4 Kinase 60.8 At1g53640 141 At4g16090 211.8 Transposable 140.8 PTR3 128.2 MYB2 124.1 SOM 76.7 ATRR4 27 DUF617 16.4 PME3 33.9 TIR-NBS 61.5 Kinase 157 At4g11930 223.8 PLAC8 144.3 zinc finger 132.4 SS3 124.8 RLP33 77.5...”
AT5G46130 hypothetical protein from Arabidopsis thaliana
33% identity, 70% coverage
AT5G67040 hypothetical protein from Arabidopsis thaliana
39% identity, 41% coverage
AT1G05550 hypothetical protein from Arabidopsis thaliana
44% identity, 35% coverage
- Differential root and shoot magnetoresponses in Arabidopsis thaliana
Paponov, Scientific reports 2021 - “...with heme binding, peroxidase activity ( At5g58400 ) and 5 genes on unknown function ( At1g05550 , At1g67865 , At3g49230 , At4g19430 , At5g43240 ). Table 6 Group F, genes characterized by a late upregulation in the shoots. Selected genes showing a fold change>2 and P...”
- “...OGOX4 1.14 1.19 1.20 1.19 1.80 1.14 1.00 1.50 1.01 1.22 1.26 3.33 4.13 3.28 At1g05550 DUF295 ORGANELLAR A 2 1.08 1.07 1.22 1.09 1.31 1.39 1.51 1.11 1.18 1.11 1.20 1.26 2.83 2.92 At1g67865 hypothetical protein 1.04 1.01 1.03 1.06 1.03 1.02 1.04 1.08 1.41...”
- High-throughput single-cell transcriptomics reveals the female germline differentiation trajectory in Arabidopsis thaliana
Hou, Communications biology 2021 - “...subcluster at the AC stage included the known gene AGO9 and unknown genes such as AT1G05550 , LTP6 , PDF1 , and AT4G29030 and were enriched for biological processes such as translation, gene expression, and peptide biosynthetic and metabolic processes (Supplementary Figs. 7 and 8 and...”
- Neofunctionalization of Mitochondrial Proteins and Incorporation into Signaling Networks in Plants
Lama, Molecular biology and evolution 2019 - “...AT5G55150 AtFDR2 + + + Not clear pm Land AT1G05540 AtDOA1 + + mito Land AT1G05550 AtDOA2 + + pm Land Ang AT1G30160 AtDOA3 + + cyt + Eud AT1G30170 AtDOA4 + + mito Ang AT1G68960 AtDOA5 + + mito Ang Eud AT2G45940 AtDOA6 + +...”
- A spatial dissection of the Arabidopsis floral transcriptome by MPSS
Peiffer, BMC plant biology 2008 - “...with ovule-inclusive, gynoecium enrichment as predicted by relaxed MPSS parameters. A gene of unknown function, At1g05550, was identified within the integuments (Figure 5a ). At5g24420 was characterized within the funiculus and both integuments of the ovule, confirming MPSS-based predictions of ovule expression (Figure 5b ). Expression...”
- “...for At2g47470 that shows expression in the carpel walls and stigma (data not shown). (A) At1g05550; GUS is expressed in both integuments and the nucellus, at the chalazal region. (B) At5g24420; GUS is expressed in the funiculus and both integuments throughout the ovule. (C) At5g49180; GUS...”
AT5G03390 hypothetical protein from Arabidopsis thaliana
30% identity, 72% coverage
- Neofunctionalization of Mitochondrial Proteins and Incorporation into Signaling Networks in Plants
Lama, Molecular biology and evolution 2019 - “...mito AT4G25920 AtDOA9 + + mito Eud AT4G25930 AtDOA10 + + mito + Ang Eud AT5G03390 AtDOA11 + + mito plastid Land AT5G46130 AtDOA12 + + mito AT5G46140 AtDOA13 + + cyt AT5G53780 AtDOA14 + + cyt Eud Brass AT5G53790 AtDOA15 + + pm Brass AT5G55440...”
- “...in purified mitochondria by mass spectrometry (MS) ( Senkler etal. 2017 ), whereas AtDOA11 ( AT5G03390 ) was identified by MS in purified chloroplasts ( Zybailov etal. 2008 ). As no green fluorescent protein (GFP) localization data were published for any of the predicted mitochondrial isoforms,...”
AT5G46140 hypothetical protein from Arabidopsis thaliana
35% identity, 51% coverage
AT3G25200 hypothetical protein from Arabidopsis thaliana
61% identity, 20% coverage
AT2G45930 hypothetical protein from Arabidopsis thaliana
30% identity, 48% coverage
AT3G43970 hypothetical protein from Arabidopsis thaliana
48% identity, 24% coverage
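The hit list above follows a regular pattern (a header line, then an identity/coverage line, then optional citations), so it can be tabulated and filtered with a short script. The following is a minimal sketch, not part of PaperBLAST: the function names `parse_hits` and `filter_hits` and the default thresholds are hypothetical, and it assumes that consecutive headers with no statistics line between them (as with AT5G55890 and AT5G55880) share the statistics line that follows.

```python
import re

# A hit header such as "AT5G52940 hypothetical protein from Arabidopsis thaliana":
# an identifier, a description, and a trailing "Genus species" name.
HEADER = re.compile(r"^(\S+) .+ from [A-Z][a-z]+ [a-z]+$")
# A statistics line such as "61% identity, 100% coverage".
STATS = re.compile(r"^(\d+)% identity, (\d+)% coverage$")


def parse_hits(lines):
    """Return (identifier, identity, coverage) tuples from PaperBLAST-style text."""
    hits = []
    pending = []  # headers seen since the last statistics line
    for raw in lines:
        line = raw.strip()
        stats = STATS.match(line)
        if stats:
            identity, coverage = int(stats.group(1)), int(stats.group(2))
            # Assumption: grouped headers share the statistics line that follows.
            hits.extend((name, identity, coverage) for name in pending)
            pending = []
            continue
        header = HEADER.match(line)
        # Article titles start with "-" and are never treated as hit headers.
        if header and not line.startswith("-"):
            pending.append(header.group(1))
    return hits


def filter_hits(hits, min_identity=50, min_coverage=90):
    """Keep hits at or above both thresholds (the defaults are arbitrary)."""
    return [h for h in hits if h[1] >= min_identity and h[2] >= min_coverage]
```

Raising `min_coverage` is one way to set aside short, partial matches such as the hits with 20% to 35% coverage near the end of the list.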
For advice on how to use these tools together, see Interactive tools for functional annotation of bacterial genomes.
The PaperBLAST database links 793,807 different protein sequences to 1,259,118 scientific articles. Searches against EuropePMC were last performed on March 13 2025.
PaperBLAST builds a database of protein sequences that are linked
to scientific articles. These links come from automated text searches
against the articles in EuropePMC
and from manually-curated information from GeneRIF, UniProtKB/Swiss-Prot,
BRENDA, CAZy (as made available by dbCAN), BioLiP, CharProtDB, MetaCyc,
EcoCyc, TCDB, REBASE, the Fitness Browser,
and a subset of the European Nucleotide Archive with the /experiment tag.
Given this database and a protein sequence query,
PaperBLAST uses protein-protein BLAST
to find similar sequences with E < 0.001.
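The E-value cutoff described above can be illustrated with a small filter over BLAST results. This is only a sketch under stated assumptions: it assumes hits are available in BLAST+ tabular form (`-outfmt 6`) with the default twelve columns, where the E-value is the eleventh field. The function name `filter_by_evalue` is hypothetical, and how PaperBLAST itself applies the threshold is not shown in this page.

```python
def filter_by_evalue(tabular_lines, max_evalue=1e-3):
    """Keep hits whose E-value is strictly below max_evalue.

    Assumes the default BLAST+ tabular columns:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    """
    kept = []
    for line in tabular_lines:
        # Skip blank lines and comment lines (as produced by -outfmt 7).
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        evalue = float(fields[10])  # 11th column holds the E-value
        if evalue < max_evalue:
            # Report subject id, percent identity, and E-value.
            kept.append((fields[1], float(fields[2]), evalue))
    return kept
```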
To build the database, we query EuropePMC with locus tags, with RefSeq protein
identifiers, and with UniProt
accessions. We obtain the locus tags from RefSeq or from MicrobesOnline. We use
queries of the form "locus_tag AND genus_name" to try to ensure that
the paper is actually discussing that gene. Because EuropePMC indexes
most recent biomedical papers, even if they are not open access, some
of the links may be to papers that you cannot read or that our
computers cannot read. We query each of these identifiers that
appears in the open access part of EuropePMC, as well as every locus
tag that appears in the 500 most-referenced genomes, so that a gene
may appear in the PaperBLAST results even though none of the papers
that mention it are open access. We also incorporate text-mined links
from EuropePMC that link open access articles to UniProt or RefSeq
identifiers. (This yields some additional links because EuropePMC
uses different heuristics for their text mining than we do.)
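A query of the form "locus_tag AND genus_name" could be assembled as in the sketch below. The helper names are hypothetical, and the endpoint shown (Europe PMC's public REST search service) is an assumption about how such a query might be submitted, not a description of PaperBLAST's own pipeline. The code only builds the URL and performs no network request.

```python
from urllib.parse import urlencode

# Assumed endpoint: Europe PMC's public REST search service.
EUROPE_PMC_SEARCH = "https://www.ebi.ac.uk/europepmc/webservices/rest/search"


def build_query(locus_tag, genus_name):
    """Combine a locus tag with a genus name so that hits are more likely
    to discuss that gene rather than an unrelated identical string."""
    return f"{locus_tag} AND {genus_name}"


def build_search_url(locus_tag, genus_name):
    """Return a URL-encoded search URL for the combined query."""
    params = {"query": build_query(locus_tag, genus_name), "format": "json"}
    return f"{EUROPE_PMC_SEARCH}?{urlencode(params)}"
```

For the top hit on this page, `build_query("AT5G53240", "Arabidopsis")` gives the string `AT5G53240 AND Arabidopsis`.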
For every article that mentions a locus tag, a RefSeq protein
identifier, or a UniProt accession, we try to select one or two
snippets of text that refer to the protein. If we cannot get access to
the full text, we try to select a snippet from the abstract, but
unfortunately, unique identifiers such as locus tags are rarely
provided in abstracts.
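One simple way to pick snippets like those shown in the results is to take a window of words around each mention of the identifier. The sketch below is an illustration only: the function name `select_snippets`, the window size, and the overlap rule are assumptions, and PaperBLAST's actual selection heuristics are not described here. Matching is case-insensitive because papers write the same locus tag as both AT5G53240 and At5g53240.

```python
def select_snippets(text, identifier, window=12, max_snippets=2):
    """Return up to max_snippets word windows centred on mentions of identifier."""
    words = text.split()
    target = identifier.lower()
    snippets = []
    last_end = -1  # index of the last word already covered by a snippet
    for i, word in enumerate(words):
        if target not in word.lower():
            continue
        start = max(0, i - window)
        if start <= last_end:
            continue  # this window would overlap the previous snippet
        end = min(len(words), i + window + 1)
        snippets.append("..." + " ".join(words[start:end]) + "...")
        last_end = end - 1
        if len(snippets) == max_snippets:
            break
    return snippets
```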
PaperBLAST also incorporates manually-curated protein functions:
- Proteins from NCBI's RefSeq are included if a GeneRIF entry links the gene to an article in PubMed®.
GeneRIF also provides a short summary of the article's claim about the
protein, which is shown instead of a snippet.
- Proteins from Swiss-Prot (the curated part of UniProt)
are included if the curators
identified experimental evidence for the protein's function (evidence
code ECO:0000269). For these proteins, the fields of the Swiss-Prot entry that
describe the protein's function are shown (with bold headings).
- Proteins from BRENDA,
a curated database of enzymes, are included if they are linked to a paper in PubMed
and their full sequence is known.
- Every protein from the non-redundant subset of BioLiP, a database
of ligand-binding sites and catalytic residues in protein structures, is included. Since BioLiP itself
does not include descriptions of the proteins, those are taken from the Protein Data Bank.
Descriptions from PDB rely on the original submitter of the
structure and cannot be updated by others, so they may be less reliable.
(For SitesBLAST and Sites on a Tree, we use a larger subset of BioLiP so that every
ligand is represented among a group of structures with similar sequences, but for
PaperBLAST, we use the non-redundant set provided by BioLiP.)
- Every protein from EcoCyc, a curated
database of the proteins in Escherichia coli K-12, is included, regardless
of whether they are characterized or not.
- Proteins from the MetaCyc metabolic pathway database
are included if they are linked to a paper in PubMed and their full sequence is known.
- Proteins from the Transporter Classification Database (TCDB)
are included if they have known substrate(s), have reference(s),
and are not described as uncharacterized or putative.
(Some of the references are not visible on the PaperBLAST web site.)
- Every protein from CharProtDB,
a database of experimentally characterized protein annotations, is included.
- Proteins from the CAZy database of carbohydrate-active enzymes
are included if they are associated with an Enzyme Commission (EC) number.
Even though CAZy does not provide links from individual protein sequences to papers,
these should all be experimentally-characterized proteins.
- Proteins from the REBASE database
of restriction enzymes are included if they have known specificity.
- Every protein with an evidence-based reannotation (based on mutant phenotypes)
in the Fitness Browser is included.
- Sequence-specific transcription factors (including sigma factors and DNA-binding response regulators)
are included if they have experimentally-determined DNA binding sites in the
PRODORIC database of gene regulation in prokaryotes.
- Putative transcription factors from RegPrecise
are included if they have manually-curated predictions for their binding sites. These predictions are based on
conserved putative regulatory sites across genomes that contain similar transcription factors,
so PaperBLAST clusters the TFs at 70% identity and retains just one member of each cluster.
- Coding sequence (CDS) features from the
European Nucleotide Archive (ENA)
are included if the /experiment tag is set (implying that there is experimental evidence for the annotation),
the nucleotide entry links to paper(s) in PubMed,
and the nucleotide entry is from the STD data class
(implying that these are targeted annotated sequences, not from shotgun sequencing).
Also, to filter out genes whose transcription or translation was detected, but whose function
was not studied, nucleotide entries or papers with more than 25 such proteins are excluded.
Descriptions from ENA rely on the original submitter of the
sequence and cannot be updated by others, so they may be less reliable.
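The ENA inclusion rules in the last item above amount to a filter over CDS features. The following is a minimal sketch under stated assumptions: the `CdsFeature` record, its field names, and the function `filter_ena_features` are hypothetical, and it assumes that a feature is dropped if its nucleotide entry or any one of its linked papers has more than 25 qualifying proteins. Only the three conditions and the threshold of 25 come from the text.

```python
from collections import Counter
from dataclasses import dataclass

MAX_PROTEINS = 25  # threshold stated in the text


@dataclass(frozen=True)
class CdsFeature:
    """Hypothetical record for one CDS feature; field names are assumptions."""
    protein_id: str
    entry_id: str          # nucleotide entry that contains the feature
    data_class: str        # ENA data class, e.g. "STD"
    has_experiment: bool   # True if the /experiment tag is set
    pubmed_ids: tuple = () # papers linked from the nucleotide entry


def filter_ena_features(features):
    """Keep features that satisfy the inclusion rules described above."""
    # Step 1: experimental evidence, a linked paper, and the STD data class.
    candidates = [
        f for f in features
        if f.has_experiment and f.pubmed_ids and f.data_class == "STD"
    ]
    # Step 2: count qualifying proteins per nucleotide entry and per paper.
    per_entry = Counter(f.entry_id for f in candidates)
    per_paper = Counter(p for f in candidates for p in set(f.pubmed_ids))
    # Step 3: drop features from entries or papers with too many proteins,
    # which likely report detection rather than functional study.
    return [
        f for f in candidates
        if per_entry[f.entry_id] <= MAX_PROTEINS
        and all(per_paper[p] <= MAX_PROTEINS for p in f.pubmed_ids)
    ]
```

Counting is done after the first step so that the threshold applies to "such proteins", that is, only to features that already pass the evidence, paper, and data class checks.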
Except for GeneRIF and ENA,
the curated entries include a short curated
description of the protein's function.
For entries from BioLiP, the protein's function may not be known beyond binding to the ligand.
Many of these entries also link to articles in PubMed.
For more information see the
PaperBLAST paper (mSystems 2017)
or the code.
You can download PaperBLAST's database here.
Changes to PaperBLAST since the paper was written:
- November 2023: incorporated PRODORIC and RegPrecise. Many PRODORIC entries were not linked to a protein sequence (no UniProt identifier), so we added this information.
- February 2023: BioLiP changed their download format. PaperBLAST now includes their non-redundant subset. SitesBLAST and Sites on a Tree use a larger non-redundant subset that ensures that every ligand is represented within each cluster. This should ensure that every binding site is represented.
- June 2022: incorporated some coding sequences from ENA with the /experiment tag.
- March 2022: incorporated BioLiP.
- April 2020: incorporated TCDB.
- April 2019: EuropePMC now returns table entries in their search results. This has expanded PaperBLAST's database, but most of the new entries are of low relevance, and the resulting snippets are often just lists of locus tags with annotations.
- February 2018: the alignment page reports the conservation of the hit's functional sites (if available from Swiss-Prot or UniProt).
- January 2018: incorporated BRENDA.
- December 2017: incorporated MetaCyc, CharProtDB, CAZy, REBASE, and the reannotations from the Fitness Browser.
- September 2017: EuropePMC no longer returns some table entries in their search results. This has shrunk PaperBLAST's database, but has also reduced the number of low-relevance hits.
Many of these changes are described in Interactive tools for functional annotation of bacterial genomes.
PaperBLAST cannot provide snippets for many of the papers that are
published in non-open-access journals. This limitation applies even if
the paper is marked as "free" on the publisher's web site and is
available in PubMed Central or EuropePMC. If a journal that you publish
in is marked as "secret," please consider publishing elsewhere.
Many important articles are missing from PaperBLAST, either because
the article's full text is not in EuropePMC (as for many older
articles), or because the paper does not mention a protein identifier such as a locus tag, or because of PaperBLAST's heuristics. If you notice an
article that characterizes a protein's function but is missing from
PaperBLAST, please notify the curators at UniProt
or add an entry to GeneRIF.
Entries in either of these databases will eventually be incorporated
into PaperBLAST. Note that to add an entry to UniProt, you will need
to find the UniProt identifier for the protein. If the protein is not
already in UniProt, you can ask them to create an entry. To add an
entry to GeneRIF, you will need an NCBI Gene identifier, but
unfortunately many prokaryotic proteins in RefSeq do not have
corresponding Gene identifiers.
References
PaperBLAST: Text-mining papers for information about homologs.
M. N. Price and A. P. Arkin (2017). mSystems, 10.1128/mSystems.00039-17.
Europe PMC in 2017.
M. Levchenko et al (2017). Nucleic Acids Research, 10.1093/nar/gkx1005.
Gene indexing: characterization and analysis of NLM's GeneRIFs.
J. A. Mitchell et al (2003). AMIA Annu Symp Proc 2003:460-464.
UniProt: the universal protein knowledgebase.
The UniProt Consortium (2016). Nucleic Acids Research, 10.1093/nar/gkw1099.
BRENDA in 2017: new perspectives and new tools in BRENDA.
S. Placzek et al (2017). Nucleic Acids Research, 10.1093/nar/gkw952.
The EcoCyc database: reflecting new knowledge about Escherichia coli K-12.
I. M. Keseler et al (2016). Nucleic Acids Research, 10.1093/nar/gkw1003.
The MetaCyc database of metabolic pathways and enzymes.
R. Caspi et al (2018). Nucleic Acids Research, 10.1093/nar/gkx935.
CharProtDB: a database of experimentally characterized protein annotations.
R. Madupu et al (2012). Nucleic Acids Research, 10.1093/nar/gkr1133.
The carbohydrate-active enzymes database (CAZy) in 2013.
V. Lombard et al (2014). Nucleic Acids Research, 10.1093/nar/gkt1178.
The Transporter Classification Database (TCDB): recent advances.
M. H. Saier, Jr. et al (2016). Nucleic Acids Research, 10.1093/nar/gkv1103.
REBASE - a database for DNA restriction and modification: enzymes, genes and genomes.
R. J. Roberts et al (2015). Nucleic Acids Research, 10.1093/nar/gku1046.
Deep annotation of protein function across diverse bacteria from mutant phenotypes.
M. N. Price et al (2016). bioRxiv, 10.1101/072470.
by Morgan Price,
Arkin group
Lawrence Berkeley National Laboratory