PaperBLAST
PaperBLAST Hits for 64 a.a. (MASRNKLVVP...)
>64 a.a. (MASRNKLVVP...)
MASRNKLVVPGVEQALDQFKLEVAQEFGVNLGSDTVARANGSVGGEMTKRLVQQAQSQLN
GTTK
Found 30 similar proteins in the literature:
P04833 Small, acid-soluble spore protein D from Bacillus subtilis (strain 168)
100% identity, 100% coverage
BMMGA3_03940 alpha/beta-type small acid-soluble spore protein from Bacillus methanolicus MGA3
74% identity, 85% coverage
BC4646 Small acid-soluble spore protein from Bacillus cereus ATCC 14579
P0A4F4 Small, acid-soluble spore protein 2 from Bacillus cereus (strain ATCC 14579 / DSM 31 / CCUG 7414 / JCM 2152 / NBRC 15305 / NCIMB 9373 / NCTC 2599 / NRRL B-3711)
P0A4F5 Small, acid-soluble spore protein 2 from Bacillus cereus
69% identity, 95% coverage
BAS4544 small, acid-soluble spore protein B from Bacillus anthracis str. Sterne
BA4898 small, acid-soluble spore protein B from Bacillus anthracis str. Ames
69% identity, 95% coverage
- Immunization of mice with formalin-inactivated spores from avirulent Bacillus cereus strains provides significant protection from challenge with Bacillus anthracis Ames
Vergis, Clinical and vaccine immunology : CVI 2013 - “...BA5641 BA5699 BAS2986 BAS3402 BAS3619 BAS3957 BAS4177 BAS4383 BAS4544 BAS5241 BAS5242 BAS5303 a b Protein name Translation elongation factor Tu Alanine racemase...”
- Recombinant Bacillus anthracis spore proteins enhance protection of mice primed with suboptimal amounts of protective antigen
Cybulski, Vaccine 2008 - “...+ [ 44 ] BAS4383 BA4722 ThiJ/PfpI family protein 663 24 + [ 44 ] BAS4544 BA4898 Small, acid-soluble spore protein B 198 6.8 [ 44 ] BAS5241 BA5640 Cell wall hydrolase 423 16.1 [ 44 ] BAS5242 BA5641 YwdL 438 16.2 + [ 44 ]...”
- Whole genome protein microarrays for serum profiling of immunodominant antigens of Bacillus anthracis
Kempsell, Frontiers in microbiology 2015 - “...MKRIGINDKCIGCGAEVDDPECECEWRTCSCCGYPDCF VYEEGRYYHCKNCDHSTDPGHY 0 Belgian and Turkish (IgG) Turkish (IgA) 9.1 10 3 1.64 10 2 BA4898 Small, acid-soluble spore protein ( sspB ) MARSTNKLAVPGAESALDQMKYEIAQEFGVQLGADAT MSRSTNKLAVPGAESALDQMKYEIAQEFGVQLGADAT ARANGSVGGEITKRLVSLAEQQLGGFQK ARANGSVGGEITKRLVSLAEQQLGGFQK 98 Turkish (IgA) ND 4.02 10 2 Table 2 IgG Anti-PA, LF and immunogenic peptide (from Table 1) ELISA...”
- “...NA NA BA0448 NA NA NA NA NA NA NA NA NA 25.221 77.188 0.867 BA4898 NA NA NA NA NA NA NA NA NA NA NA NA EC50 ( 0 ), standard error and R-values are given from three parameter logistic regression analysis . Microarray...”
- Immunization of mice with formalin-inactivated spores from avirulent Bacillus cereus strains provides significant protection from challenge with Bacillus anthracis Ames
Vergis, Clinical and vaccine immunology : CVI 2013 - “...BAS2693 BA3211 BA3668 BA3906 BA4266 BA4499 BA4722 BA4898 BA5640 BA5641 BA5699 BAS2986 BAS3402 BAS3619 BAS3957 BAS4177 BAS4383 BAS4544 BAS5241 BAS5242 BAS5303...”
- Recombinant Bacillus anthracis spore proteins enhance protection of mice primed with suboptimal amounts of protective antigen
Cybulski, Vaccine 2008 - “...[ 44 ] BAS4383 BA4722 ThiJ/PfpI family protein 663 24 + [ 44 ] BAS4544 BA4898 Small, acid-soluble spore protein B 198 6.8 [ 44 ] BAS5241 BA5640 Cell wall hydrolase 423 16.1 [ 44 ] BAS5242 BA5641 YwdL 438 16.2 + [ 44 ] BAS5303...”
- Formation and composition of the Bacillus anthracis endospore
Liu, Journal of bacteriology 2004 - “...five are likely to be highly abundant (BA0858, BA0524, BA4898, BA3127, and BA1987), whereas the other three are present in lower abundance. Three SASPs (BA4898,...”
- “...BA1238 BA1489 BA2045 BA2162 BA2292 BA2554 BA3668 BA4266 BA4722 BA4898 BA5640 BA5641 BA5699 Counta 176 J. BACTERIOL. LIU ET AL. tively few of the gene products...”
NP_388856 small acid-soluble spore protein (beta-type SASP) from Bacillus subtilis subsp. subtilis str. 168
69% identity, 88% coverage
YP_001374031 small acid-soluble spore protein alpha/beta type from Bacillus cereus subsp. cytotoxis NVH 391-98
66% identity, 93% coverage
SAS1_BACIU / P84583 Small, acid-soluble spore protein 1; SASP-1 from Bacillus subtilis (see paper)
66% identity, 84% coverage
- function: SASP are bound to spore DNA. They are double-stranded DNA-binding proteins that cause DNA to change to an A-like conformation. They protect the DNA backbone from chemical and enzymatic cleavage and are thus involved in the dormant spore's high resistance to UV light (By similarity).
YP_001643650 small acid-soluble spore protein alpha/beta type from Bacillus weihenstephanensis KBAB4
63% identity, 94% coverage
W8YUH6 Small, acid-soluble spore protein 1 from Bacillus thuringiensis DB27
Q73CW6 Small acid-soluble spore protein from Bacillus cereus (strain ATCC 10987 / NRS 248)
BA0858 small acid-soluble spore protein from Bacillus anthracis str. Ames
BC0875 Small acid-soluble spore protein from Bacillus cereus ATCC 14579
BAS0815 small acid-soluble spore protein from Bacillus anthracis str. Sterne
65% identity, 93% coverage
- Microwave supported hydrolysis prepares Bacillus spores for proteomic analysis
Chen, International journal of mass spectrometry 2019 - “...EKGNADVEYLNLANHDVKFVANNL P94217 S-layer protein EA1 2291.6 2292.8 TKLDLNVSTTVEYQLSKYTS P94217 S-layer protein EA1 3074.5 3074.5 STARANGSVGGEITKRLVAMAEQSLGGFHK W8YUH6 Small, acid-soluble spore protein 3088.5 3088.2 ATARANGSVGGEITKRLVSLAEQQLGGFQK Q81KU1 Small, acid-soluble spore protein B 6679.5 6680.3 ARSTNKLAVPGAESALDQMKYEIAQEFGVQLGADATARANGSVGGEITKRLVSLAEQQLGGFQK Q81KU1 Small, acid-soluble spore protein B 3610.1 3611.0 ARSTNKLAVPGAESALDQMKYEIAQEFGVQLGAD Q81KU1 Small, acid-soluble spore protein...”
- Identification of Highly Pathogenic Microorganisms by Matrix-Assisted Laser Desorption Ionization-Time of Flight Mass Spectrometry: Results of an Interlaboratory Ring Trial
Lasch, Journal of clinical microbiology 2015 - “...6,835 (-SASP in B. cereus ATCC 10987; UniProt Q73CW6). All mass spectra were smoothed, baseline corrected, and intensity normalized. Black traces are reference...”
- Formation and composition of the Bacillus anthracis endospore
Liu, Journal of bacteriology 2004 - “...indicates that five are likely to be highly abundant (BA0858, BA0524, BA4898, BA3127, and BA1987), whereas the other three are present in lower abundance. Three...”
- SigB modulates expression of novel SigB regulon members via Bc1009 in non-stressed and heat-stressed cells revealing its alternative roles in Bacillus cereus
Yeak, BMC microbiology 2023 - “...regulator, AsnC family -1.6 K 31 BC4448 BC4448 Protein with unknown function -1.6 S 32 BC0875 BC0875 hypothetical protein -1.6 -1.2 S 33 BC1991 BC1991 putative murein endopeptidase -1.4 D 34 BC1520 YpiB hypothetical Cytosolic Protein -1.4 S 35 BC0673 BC0673 Flavin-dependent dehydrogenase -1.4 -1.1 C...”
- Detection of Bacillus anthracis spore germination in vivo by bioluminescence imaging
Sanz, Infection and immunity 2008 - “...acid-soluble protein gene in Bacillus subtilis is locus BAS0815 (hereafter also called sspB). The promoter region of sspB was amplified from genomic B....”
SAS2_BACIU / P84584 Small, acid-soluble spore protein 2; SASP-2 from Bacillus subtilis (see paper)
64% identity, 83% coverage
- function: SASP are bound to spore DNA. They are double-stranded DNA-binding proteins that cause DNA to change to an A-like conformation. They protect the DNA backbone from chemical and enzymatic cleavage and are thus involved in the dormant spore's high resistance to UV light (By similarity).
NP_390835 small acid-soluble spore protein (alpha-type SASP) from Bacillus subtilis subsp. subtilis str. 168
P04831 Small, acid-soluble spore protein A from Bacillus subtilis (strain 168)
BSU29570 small acid-soluble spore protein (alpha-type SASP) from Bacillus subtilis subsp. subtilis str. 168
69% identity, 80% coverage
BCAH187_A3151 small, acid-soluble spore protein, alpha/beta family from Bacillus cereus AH187
BA3127 small, acid-soluble spore protein, alpha/beta family from Bacillus anthracis str. Ames
70% identity, 80% coverage
- Sporulation is dispensable for the vegetable-associated life cycle of the human pathogen Bacillus cereus
Antequera-Gómez, Microbial biotechnology 2021 - “...protein X BCAH187_A3146 CotF; spore coat protein F BCAH187_A3147 YqcI/YcgG family protein a , b BCAH187_A3151 Small acid-soluble spore protein A (major alpha-type SASP) BCAH187_A3165 Spore germination protein GerAC BCAH187_A3166 Spore germination protein A2 BCAH187_A3167 Spore germination protein GerAA a , b BCAH187_A3201 Putative spore germination...”
- Formation and composition of the Bacillus anthracis endospore
Liu, Journal of bacteriology 2004 - “...likely to be highly abundant (BA0858, BA0524, BA4898, BA3127, and BA1987), whereas the other three are present in lower abundance. Three SASPs (BA4898, BA3127,...”
- “...peptides, allowing unique identifications, while the majority of BA3127 peptides were identical to those derived from other SASPs. Also identified was a...”
BC1984 Small acid-soluble spore protein from Bacillus cereus ATCC 14579
BAS1844 Small, acid-soluble spore protein, alpha/beta family from Bacillus anthracis str. Sterne
65% identity, 81% coverage
GBAA1987 Small, acid-soluble spore protein, alpha/beta family from Bacillus anthracis str. 'Ames Ancestor'
BA1987 Small, acid-soluble spore protein, alpha/beta family from Bacillus anthracis str. Ames
AW20_817, BC_1984 alpha/beta-type small acid-soluble spore protein from Bacillus cereus ATCC 14579
65% identity, 84% coverage
- Structure and complexity of a bacterial transcriptome
Passalacqua, Journal of bacteriology 2009 - “...sample (sample 5), while genes GBAA1981-6 and GBAA1987 are differentially expressed (roughly 10-fold down and 100-fold up during sporulation, respectively)....”
- “...operon is 14-fold lower, and expression of the GBAA1987 locus is 140-fold higher during sporulation. This level of agreement was typical across the genome;...”
- The global transcriptional responses of Bacillus anthracis Sterne (34F2) and a Delta sodA1 mutant to paraquat reveal metal ion homeostasis imbalances during endogenous superoxide stress
Passalacqua, Journal of bacteriology 2007 - “...Sporulation and germination GBAA0521 GBAA0767 GBAA1979 GBAA1987 GBAA5528 Transcription factors GBAA2454 Transport and binding proteins GBAA0228 GBAA0314...”
- Formation and composition of the Bacillus anthracis endospore
Liu, Journal of bacteriology 2004 - “...be highly abundant (BA0858, BA0524, BA4898, BA3127, and BA1987), whereas the other three are present in lower abundance. Three SASPs (BA4898, BA3127, and...”
- Beyond the spore, the exosporium sugar anthrose impacts vegetative Bacillus anthracis gene regulation in cis and trans
Norris, Scientific reports 2023 - “...AW20_5232 BAS2997 Hypothetical protein 2.75 AW20_2269 BAS0470 Hypothetical protein 2.85 AW20_360 BAS2292 Hypothetical protein 3.47 AW20_817 BAS1844 Small acid-soluble spore protein, alpha/beta family, SASP_1 5.79 AW20_1924 BAS0767 Polypeptide composition of the spore coat protein CotJB 6.18 AW20_5645 Hypothetical protein DUF4037, between lef and pagR, pXO1 7.07...”
- Stoichiometry, Absolute Abundance, and Localization of Proteins in the Bacillus cereus Spore Coat Insoluble Fraction Determined Using a QconCAT Approach
Stelder, Journal of proteome research 2018 - “...protein a b c BC_p0002 Small acid-soluble protein gamma type sspE Small acid-soluble spore protein BC_1984 Spore coat proteinE c BC_3770 (CotE) Stage IV sporulation proteinB BC_4172 a Indicates proteins previously identified by Abhyankar et al. 14 b Indicates proteins previously identified in exosporium isolates. c...”
- Proteomic evidences for rex regulation of metabolism in toxin-producing Bacillus cereus ATCC 14579
Laouami, PloS one 2014 - “...0,02 - BC_3728 NP_833453 DNA-binding protein HU N N 0,23 0,422 2,5 0,00 Other - BC_1984 NP_831667 Phage protein N N 3,113 0,134 4,77 0,01 - BC_1012 NP_830798 unknown N N 0,81 0,291 3,5 0,00 - BC_5360 NP_835021 unknown Y Y 0,299 0,134 2,38 0,00 -...”
YP_001374814 small acid-soluble spore protein alpha/beta type from Bacillus cereus subsp. cytotoxis NVH 391-98
63% identity, 84% coverage
BSU19950 small acid-soluble spore protein (alpha/beta-type SASP); SPbeta phage protein from Bacillus subtilis subsp. subtilis str. 168
NP_389876 small acid-soluble spore protein (alpha/beta-type SASP); SPbeta phage protein from Bacillus subtilis subsp. subtilis str. 168
58% identity, 79% coverage
- FUNAGE-Pro: comprehensive web server for gene set enrichment analysis of prokaryotes
de, Nucleic acids research 2022 - “...Gene Ontology (GO) class-IDs ClassID Class description Locus-tag gene Protein description GO:0003690 double-stranded DNA binding BSU19950 sspC small acid-soluble spore protein C BSU29570 sspA small acid-soluble spore protein A GO:0042601/GO:0030436 endospore-forming forespore BSU08550 sspK small acid-soluble spore protein K BSU17990 sspO small acid-soluble spore protein O...”
- Roles of the major, small, acid-soluble spore proteins and spore-specific and universal DNA repair mechanisms in resistance of Bacillus subtilis spores to ionizing radiation from X rays and high-energy charged-particle bombardment.
Moeller, Journal of bacteriology 2008 - GeneRIF: The loss of alpha/beta-type SASP leads to a significant radiosensitivity to ionizing radiation, suggesting the essential function of these spore proteins as protectants of spore DNA against ionizing radiation.
- Role of the Nfo and ExoA apurinic/apyrimidinic endonucleases in repair of DNA damage during outgrowth of Bacillus subtilis spores.
Ibarra, Journal of bacteriology 2008 - GeneRIF: Results suggest that alpha/beta-type SASP, Nfo/ExoA, RecA, and NER system all contribute to the repair of and/or protection against oxidative damage of DNA in germinating and outgrowing spores.
- Crystallization and preliminary X-ray analysis of the complex between a Bacillus subtilis alpha/beta-type small acid-soluble spore protein and DNA.
Bumbaca, Acta crystallographica. Section F, Structural biology and crystallization communications 2007 - GeneRIF: An engineered variant of an alpha/beta-type small acid-soluble spore protein (SASP) from Bacillus subtilis has been crystallized in a complex with a ten-base-pair double-stranded DNA.
2z3xA / P02958 Structure of a protein-DNA complex essential for DNA protection in spore of Bacillus species (see paper)
63% identity, 80% coverage
C174_05773 alpha/beta-type small acid-soluble spore protein from Bacillus mycoides FSL H7-687
47% identity, 79% coverage
- Genomic comparison of sporeforming bacilli isolated from milk
Moreno, BMC genomics 2014 - “...B. weihenstephanensis Bacteriocin cerein 7B C174_01754 to C174_01764 87.0% [ B. cereus =CAJ32354.1] Lanthipeptide_class_I 4 C174_05773 to C174_05843 100% [ B. cereus =WP_002128181.1] Lasso_peptide 4 C174_07587 to C174_07617 100% [ B. cereus =WP_002128586.1] R5-213 V. arenosi - - - H8-237 P. odorifer Putative bacteriocins with double-glycine...”
CA_C2365 alpha/beta-type small acid-soluble spore protein from Clostridium acetobutylicum ATCC 824
CAC2365 Small acid-soluble spore protein from Clostridium acetobutylicum ATCC 824
49% identity, 85% coverage
- Clostridium acetobutylicum Biofilm: Advances in Understanding the Basis
Zhang, Frontiers in bioengineering and biotechnology 2021 - “...small, acid-soluble proteins (SASP) that are used to coat DNA in spores (CA_C1487, CA_C1522, and CA_C2365), which were significantly down-regulated by 48200-fold. It is generally believed that the solvent production in C. acetobutylicum is coupled to the formation of spores, but the biofilm shows that C....”
- “...CA_C2859 (spoIIID) Up-regulated at early stages then down-regulated over time Stage III sporulation proteins CA_C1487, CA_C2365 Down-regulated Small acid-soluble spore protein CA_C2908-2910, CA_C1337-1338 Down-regulated Spore coat protein CA_C0117 Up-regulated CheY-like chemotaxis protein CA_C2745, CA_C2419, CA_C2803 Down-regulated Methyl-accepting chemotaxis protein CA_C2205 (fliD) Down-regulated Flagellar hook-associated protein Oxidative...”
- Clostridium acetobutylicum grows vegetatively in a biofilm rich in heteropolysaccharides and cytoplasmic proteins
Liu, Biotechnology for biofuels 2018 - “...CA_C1522) were also significantly down-regulated by 48-fold ( p <0.01; Student t test), and the CA_C2365 was down-regulated by 200-fold. Overall, the decreased expression of sporulation-related genes over time was consistent with the elimination of sporulation in the biofilm. Fig.3 Temporal expression of sporulation genes in...”
- Small acid-soluble spore proteins of Clostridium acetobutylicum are able to protect DNA in vitro and are specifically cleaved by germination protease GPR and spore protease YyaC
Wetzel, Microbiology (Reading, England) 2015 (PubMed) - “...forespore-specific gene expression. SASPs were termed SspA (Cac2365), SspB (Cac1522), SspD (Cac1620), SspF (Cac2372), SspH (Cac1663) and Tlp (Cac1487). Here it...”
- “...putative SASPs of C. acetobutylicum including five genes, cac2365, cac1522, cac1620, cac2372 and cac1487, that have been annotated in the genome (Nolling et...”
- Pleiotropic functions of catabolite control protein CcpA in Butanol-producing Clostridium acetobutylicum
Ren, BMC genomics 2012 - “...CAC2908-2910), yabG (CAC2905) and yhjR (CAC3002); 6 σG-controlled genes spoVAE (CAC2303), spoVAD (CAC2304), sspA (CAC2365), sleB (CAC3081), spoVT (CAC3214) and spoVT homologue (CAC3649). Among these genes, only a putative σG-controlled gene CAC3649 possesses an identifiable CRE site (Additional file 3 ), indicating that the...”
- A proteomic and transcriptional view of acidogenic and solventogenic steady-state cells of Clostridium acetobutylicum in a chemostat culture
Janssen, Applied microbiology and biotechnology 2010 - “...5.2 4.1 2.5 G CAC2342 Predicted membrane protein 4.1 4.3 2.1 3.4 3.5 1.0 R CAC2365 sspA Small acid-soluble spore protein 13.1 13.9 6.1 11.0 4.3 CAC2438 Predicted phosphatase 5.3 4.7 3.4 4.8 4.5 0.8 CAC2601 S -adenosylmethionine decarboxylase 1.9 2.1 5.2 3.2 3.1 1.5 E...”
- “...(CAP0149) have been postulated (Nölling et al. 2001). Among the chromosomal open reading frames, cac2365 showed the highest transcript increase of ~11-fold at pH 5.7. This gene putatively encodes an SspA-like protein, annotated as a small acid-soluble DNA-binding spore protein which might protect the spore genome....”
- The transcriptional program underlying the physiology of clostridial sporulation
Jones, Genome biology 2008 - “...cotJ gene, one cotS gene, the spore maturation protein B, a small acid soluble protein (CAC2365), and two spore lytic enzymes (CAC0686, CAC3244). Though several sporulation-related genes are included in the next (sixth) cluster as well, most, beyond those listed here, are upregulated in mid-stationary phase...”
CDR20291_2576 alpha/beta-type small acid-soluble spore protein from Clostridioides difficile R20291
CD2688 small acid-soluble spore protein A from Clostridium difficile 630
CDR20291_2576 small acid-soluble spore protein A from Clostridium difficile R20291
48% identity, 84% coverage
- Pleiotropic roles of Clostridium difficile sin locus
Girinathan, PLoS pathogens 2018 - “...sporulation protein P 15.1 3.9 SigF 2.67E-11 CDR20291_2363 gpr Spore endopeptidase 28.8 4.8 SigF 1.83E-07 CDR20291_2576 sspA small acid-soluble spore protein A 196.8 7.6 SigG, SigF 5.77E-07 CDR20291_3080 small acid-soluble spore protein 9.0 3.2 SigG, SigF 6.98E-06 CDR20291_3107 sspB small acid-soluble spore protein B 305.5 8.3...”
- Characterization of the Clostridioides difficile 630Δerm putative Pro-Pro endopeptidase CD1597
Claushuis, Access microbiology 2024 - “...tag Gene Description cd1597/WT cd1597/tcdC Remark CD1199 spoIIIAH Stage III sporulation protein AH 6,67 5,35 CD2688 sspA Small, acid-soluble spore protein alpha 4,61 5,15 CD3275 Putative phosphosugar isomerase 4,09 3,92 CD3249 sspB Small, acid-soluble spore protein beta 3,92 4,93 CD1065 cotL Morphogenetic spore coat protein 3,53...”
- Insights into the Structure and Protein Composition of Moorella thermoacetica Spores Formed at Different Temperatures
Malleck, International journal of molecular sciences 2022 - “...moth_0925 Q2RJZ7 Small acid-soluble spore protein, alpha/beta type SspA Protection of spore DNA Core BSU29750 CD2688 moth_0806 Q2RKB5 Small, acid-soluble spore protein, alpha/beta family SspF Protection of the spore DNA Core BSU24210 moth_1875 Q2RHB4 Small, acid-soluble spore protein SASP Small, acid-soluble spore protein Core CPR1870 moth_2056...”
- Deciphering Adaptation Strategies of the Epidemic Clostridium difficile 027 Strain during Infection through In Vivo Transcriptional Analysis
Kansau, PloS one 2016 - “...CDR2532 CD2644 spoIIGA Sporulation sigma factor E processing peptidase 0.29 0.34 1.00 0.34 0.29 CDR2576 CD2688 sspA Small acid-soluble spore protein A 0.02 0.01 1.00 1.00 1.00 CDR2802 CD2967 spoVFB Dipicolinate synthase subunit B 0.33 0.33 1.00 2.50 1.00 CDR2803 CD2968 dpaA Dipicolinate synthase subunit B...”
- “...0.29 1.00 2.59 1.00 CDR3090 CD3230 bclA2 Exosporium glycoprotein 1.00 1.00 1.00 5.96 3.15 CDR3107 CD2688 sspB small acid-soluble spore protein B 0.04 0.03 1.00 1.00 1.00 CDR3193 CD3349 bclA3 Putative exosporium glycoprotein 0.24 0.26 1.00 5.69 1.00 CDR3327 CD3490 spoIIE Stage II sporulation protein E...”
- Conserved oligopeptide permeases modulate sporulation initiation in Clostridium difficile
Edwards, Infection and immunity 2014 - “...sigK qPCR (CD1230) sigK qPCR (CD1230) sspA qPCR (CD2688) sspA qPCR (CD2688) a Source or reference 47 47 Underlining indicates sequence-specific sites within...”
- Pleiotropic role of the RNA chaperone protein Hfq in the human pathogen Clostridium difficile
Boudry, Journal of bacteriology 2014 - “...cotE CD1613* cotA CD2399* CD2400* cotJB2 CD2401* cotD CD2688 sspA CD2967* spoVFB CD3230* bclA2 CD3249 sspA CD3349* bclA3 CD3516 spoVG CD3567 sipL CD3678 oxaA1...”
- Transcriptional analysis of temporal gene expression in germinating Clostridium difficile 630 endospores
Dembek, PloS one 2013 - “...this study represented late-sporulation genes such as those encoding small acid-soluble proteins A and B (CD2688 and CD3249 respectively) a putative spore coat protein (CD0213) and a stage IV sporulation protein (CD0783) ( Table S2 ). Alternatively, transcripts might be stored to equip the spore with...”
- Genome-wide analysis of cell type-specific gene transcription during spore formation in Clostridium difficile
Saujet, PLoS genetics 2013 - “...DPA uptake protein, SpoAD 0.27 0.3 + CD0775 spoVAE DPA uptake protein, SpoAE 0.32 0.28 CD2688 sspA Small, acid-soluble spore protein alpha 0.00 0.02 G + CD3249 sspB Small, acid-soluble spore protein alpha 0.01 0.07 G + CD1290 Putative small acid-soluble spore protein 0.29 0.52 G...”
- “...to the forespore chromosome protecting the DNA from damage [52] . Indeed, the sspA ( CD2688 ) and sspB ( CD3249 ) genes encoding alpha/beta-type SASP and two other genes annotated as SASPs ( CD1290 and CD3220.1 ) were expressed under the direct σG control (...”
- Adaptive strategies and pathogenesis of Clostridium difficile from in vivo transcriptomics
Janoir, Infection and immunity 2013 - “...by a specific protease. The expression of both sspA (CD2688) and sspB (CD3249), coding for orthologs of two major SASP of B. subtilis, as well as expression...”
- Effect of tcdR Mutation on Sporulation in the Epidemic Clostridium difficile Strain R20291
Girinathan, mSphere 2017 - “...dacF , d -alanyl- d -alanine-carboxypeptidase 5.891 SigG CDR20291_1529 sodA , superoxide dismutase 5.714 SigG CDR20291_2576 sspA , small acid-soluble spore protein A 4.500 SigG CDR20291_2802 spoVFB , dipicolinate synthase subunit B 3.914 SigG CDR20291_3080 Small acid-soluble spore protein 4.107 SigG CDR20291_3107 sspB , small acid-soluble...”
CD3249 small acid-soluble spore protein B from Clostridium difficile 630
CDR20291_3107 small acid-soluble spore protein B from Clostridium difficile R20291
45% identity, 84% coverage
- Characterization of the Clostridioides difficile 630Δerm putative Pro-Pro endopeptidase CD1597
Claushuis, Access microbiology 2024 - “...CD2688 sspA Small, acid-soluble spore protein alpha 4,61 5,15 CD3275 Putative phosphosugar isomerase 4,09 3,92 CD3249 sspB Small, acid-soluble spore protein beta 3,92 4,93 CD1065 cotL Morphogenetic spore coat protein 3,53 4,39 CD3567 sipL Spore coat protein 3,14 2,91 CD2960 atpI V-type ATP synthase subunit I...”
- Genome-Wide Transcription Start Site Mapping and Promoter Assignments to a Sigma Factor in the Human Enteropathogen Clostridioides difficile
Soutourina, Frontiers in microbiology 2020 - “...promoters than SigG-dependent promoters. This included genes encoding 3 SASPs ( CD1290 , CD3220.1 , CD3249 ), a catalase ( CD1567 ), a dipicolinate transporter ( spoVAC operon) and the SpoVT regulator. The results obtained for spoVT strongly suggest that this gene is transcribed by both...”
- Pleiotropic role of the RNA chaperone protein Hfq in the human pathogen Clostridium difficile
Boudry, Journal of bacteriology 2014 - “...CD2401* cotD CD2688 sspA CD2967* spoVFB CD3230* bclA2 CD3249 sspA CD3349* bclA3 CD3516 spoVG CD3567 sipL CD3678 oxaA1 CDIP53/CDIP51 expression ratio 4.97 4.31...”
- Transcriptional analysis of temporal gene expression in germinating Clostridium difficile 630 endospores
Dembek, PloS one 2013 - “...represented late-sporulation genes such as those encoding small acid-soluble proteins A and B (CD2688 and CD3249 respectively) a putative spore coat protein (CD0213) and a stage IV sporulation protein (CD0783) ( Table S2 ). Alternatively, transcripts might be stored to equip the spore with proteins that...”
- Genome-wide analysis of cell type-specific gene transcription during spore formation in Clostridium difficile
Saujet, PLoS genetics 2013 - “...protein, SpoAE 0.32 0.28 CD2688 sspA Small, acid-soluble spore protein alpha 0.00 0.02 G + CD3249 sspB Small, acid-soluble spore protein alpha 0.01 0.07 G + CD1290 Putative small acid-soluble spore protein 0.29 0.52 G CD3220.1 Small acid-soluble spore protein 0.28 0.66 G + CD3499 spoVT...”
- “...the DNA from damage [52] . Indeed, the sspA ( CD2688 ) and sspB ( CD3249 ) genes encoding alpha/beta-type SASP and two other genes annotated as SASPs ( CD1290 and CD3220.1 ) were expressed under the direct σG control ( Table 1 ). Moreover, CD0684...”
- Adaptive strategies and pathogenesis of Clostridium difficile from in vivo transcriptomics
Janoir, Infection and immunity 2013 - “...protease. The expression of both sspA (CD2688) and sspB (CD3249), coding for orthologs of two major SASP of B. subtilis, as well as expression of the putative...”
- Pleiotropic roles of Clostridium difficile sin locus
Girinathan, PLoS pathogens 2018 - “...196.8 7.6 SigG, SigF 5.77E-07 CDR20291_3080 small acid-soluble spore protein 9.0 3.2 SigG, SigF 6.98E-06 CDR20291_3107 sspB small acid-soluble spore protein B 305.5 8.3 SigG, SigE, SigF 5.95E-08 CDR20291_3400 sleB Putative spore cortex-lytic enzyme 14.0 3.8 SigF 5.79E-09 CDR20291_2530 sigG RNA polymease sigma-G factor 44.0 5.5...”
- Effect of tcdR Mutation on Sporulation in the Epidemic Clostridium difficile Strain R20291
Girinathan, mSphere 2017 - “...spoVFB , dipicolinate synthase subunit B 3.914 SigG CDR20291_3080 Small acid-soluble spore protein 4.107 SigG CDR20291_3107 sspB , small acid-soluble spore protein B 4.690 SigG CDR20291_0212 Spore coat protein 6.600 SigK CDR20291_0316 Spore coat assembly asparagine-rich protein 6.101 SigK CDR20291_0337 Fragment of putative exosporium glycoprotein 12.666...”
- The LexA regulated genes of the Clostridium difficile
Walter, BMC microbiology 2014 - “...GAAC....GTTA 284 NG NG 1 NG NG NG NG 1 NG NG NG NO NO CDR20291_3107 sspB Small. acid-soluble spore protein beta GAAC....GTTC 34 1 8 2 1 3 2 1 3 3 3 1 1 1 CDR20291_0784 oppC ABC-type transport system. oligopeptide GAACGTTT 285/ -286...”
CDR20291_3107 alpha/beta-type small acid-soluble spore protein from Clostridioides difficile R20291
45% identity, 91% coverage
- Pleiotropic roles of Clostridium difficile sin locus
Girinathan, PLoS pathogens 2018 - “...196.8 7.6 SigG, SigF 5.77E-07 CDR20291_3080 small acid-soluble spore protein 9.0 3.2 SigG, SigF 6.98E-06 CDR20291_3107 sspB small acid-soluble spore protein B 305.5 8.3 SigG, SigE, SigF 5.95E-08 CDR20291_3400 sleB Putative spore cortex-lytic enzyme 14.0 3.8 SigF 5.79E-09 CDR20291_2530 sigG RNA polymease sigma-G factor 44.0 5.5...”
- The LexA regulated genes of the Clostridium difficile
Walter, BMC microbiology 2014 - “...GAAC....GTTA 284 NG NG 1 NG NG NG NG 1 NG NG NG NO NO CDR20291_3107 sspB Small. acid-soluble spore protein beta GAAC....GTTC 34 1 8 2 1 3 2 1 3 3 3 1 1 1 CDR20291_0784 oppC ABC-type transport system. oligopeptide GAACGTTT 285/ -286...”
Cbei_2471 small acid-soluble spore protein, alpha/beta type from Clostridium beijerinckii NCIMB 8052
44% identity, 92% coverage
- Genome-wide dynamic transcriptional profiling in Clostridium beijerinckii NCIMB 8052 using single-nucleotide resolution RNA-Seq
Wang, BMC genomics 2012 - “...those encoding spore coat proteins (Cbei_2069, cotJC and Cbei_2070, cotJB ), small acid-soluble spore proteins (Cbei_2471, _0474, _3275, _3264, _1447, _2345, _3080, _3111 and _3250), AbrB family transcriptional regulator (Cbei_0088, annotated as AbrB family stage V sporulation protein T in C. acetobutylicum ) and sporulation sigma...”
CLAU_0265 alpha/beta-type small acid-soluble spore protein from Clostridium autoethanogenum DSM 10061
42% identity, 83% coverage
- Required Gene Set for Autotrophic Growth of Clostridium autoethanogenum
Woods, Applied and environmental microbiology 2022 - “...medium calls into question several of the annotations in the C. autoethanogenum genome. For instance, CLAU_0265 which is annotated as a small acid-soluble spore protein, is required on rich medium, despite that fact that sporulation should not have been required in the library preparation process. The...”
CBO3048 small acid-soluble spore protein from Clostridium botulinum A str. ATCC 3502
45% identity, 88% coverage
DESHY_RS06920 alpha/beta-type small acid-soluble spore protein from Desulforamulus hydrothermalis Lam5 = DSM 18033
46% identity, 69% coverage
- Identification of novel tail-anchored membrane proteins integrated by the bacterial twin-arginine translocase
Gallego-Parrilla, Microbiology (Reading, England) 2024 - “...DESHY_RS06870 , DESHY_RS06865 and DESHY_RS06855 encode hypothetical proteins, DESHY_RS06925 encodes a phage holin family protein, DESHY_RS06920 an alpha/beta-type small acid-soluble spore protein, DESHY_RS06915 a 4Fe4S binding protein, DESHY_RS06910 a YkvA family protein, DESHY_RS06875 a cache domain-containing protein and DESHY_RS06960 encodes SpoIIR. For Proteus alimentorum strain 08MAS0041,...”
- Identification of novel tail-anchored membrane proteins integrated by the bacterial twin-arginine translocase
Gallego-Parrilla, 2023
CPF_2417 small, acid-soluble spore protein 1 from Clostridium perfringens ATCC 13124
45% identity, 86% coverage
- Biofilm and Spore Formation of Clostridium perfringens and Its Resistance to Disinfectant and Oxidative Stress
Hu, Antibiotics (Basel, Switzerland) 2021 - “...( codY , sigE , sigK , soj , spo0A , spollAA , spollE , CPF_2417, ftsK , minD , and spoVD ) and biofilm formation ( ctrAB , abrB , luxS , sigG , CPF_0368, argG , ribD , ribE , lexA , and sleC...”
- “...22.14 2.09 4.47 3.31 ND spollE anti-sigma F factor antagonist 58.15 1.80 5.07 9.36 ND CPF_2417 small, acid-soluble spore protein 11.64 2.19 3.54 6.70 1 ftsK DNA translocase 5.96 2.40 3.73 1.04 36(7) minD septum site-determining protein 4.89 2.33 3.92 1.99 5 spoVD stage V sporulation...”
CEA_G1634 alpha/beta-type small acid-soluble spore protein from Clostridium acetobutylicum EA 2018
CAC1620 Small acid-soluble spore protein from Clostridium acetobutylicum ATCC 824
37% identity, 92% coverage
- Comparative genomic and transcriptomic analysis revealed genetic characteristics related to solvent formation and xylose utilization in Clostridium acetobutylicum EA 2018
Hu, BMC genomics 2011 - “...-56(T) -56(G) Beta-glucosidase family protein CEA_G1365 CAC1351 -97(T) -97(C) Periplasmic sugar-binding protein Sporulation related genes CEA_G1634 CAC1620 -136(T) -136(G) Small acid-soluble spore protein CEA_G3742 CAC3735 -(8-7)(-) -7(C) Predicted RNA-binding protein Jag, SpoIIIJ-associated Numbers in gene or protein variation sites lines indicated the variation sites in genes;...”
- Small acid-soluble spore proteins of Clostridium acetobutylicum are able to protect DNA in vitro and are specifically cleaved by germination protease GPR and spore protease YyaC
Wetzel, Microbiology (Reading, England) 2015 (PubMed)- “...SASPs were termed SspA (Cac2365), SspB (Cac1522), SspD (Cac1620), SspF (Cac2372), SspH (Cac1663) and Tlp (Cac1487). Here it is shown that with the exception...”
- “...of C. acetobutylicum including five genes, cac2365, cac1522, cac1620, cac2372 and cac1487, that have been annotated in the genome (Nolling et al., 2001),...”
- Comparative genomic and transcriptomic analysis revealed genetic characteristics related to solvent formation and xylose utilization in Clostridium acetobutylicum EA 2018
Hu, BMC genomics 2011 - “...-56(G) Beta-glucosidase family protein CEA_G1365 CAC1351 -97(T) -97(C) Periplasmic sugar-binding protein Sporulation related genes CEA_G1634 CAC1620 -136(T) -136(G) Small acid-soluble spore protein CEA_G3742 CAC3735 -(8-7)(-) -7(C) Predicted RNA-binding protein Jag, SpoIIIJ-associated Numbers in gene or protein variation sites lines indicated the variation sites in genes; the...”
CBO1789 small, acid-soluble spore protein alpha from Clostridium botulinum A str. ATCC 3502
43% identity, 82% coverage
sspC2 / GI|144919 small, acid-soluble spore protein C2 from Clostridium perfringens (see 3 papers)
sspC2 / AAA62758.1 acid-soluble spore protein C2 from Clostridium perfringens (see paper)
CPE1423 small acid-soluble spore protein C2 from Clostridium perfringens str. 13
42% identity, 81% coverage
For advice on how to use these tools together, see
Interactive tools for functional annotation of bacterial genomes.
The PaperBLAST database links 798,070 different protein sequences to 1,261,478 scientific articles. Searches against EuropePMC were last performed on May 12 2025.
PaperBLAST builds a database of protein sequences that are linked
to scientific articles. These links come from automated text searches
against the articles in EuropePMC and from manually-curated information
from GeneRIF, UniProtKB/Swiss-Prot, BRENDA, CAZy (as made available by dbCAN),
BioLiP, CharProtDB, MetaCyc, EcoCyc, TCDB, REBASE, the Fitness Browser,
and a subset of the European Nucleotide Archive with the /experiment tag.
Given this database and a protein sequence query,
PaperBLAST uses protein-protein BLAST
to find similar sequences with E < 0.001.
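The E-value cutoff above can be illustrated with a minimal Python sketch. It assumes the BLAST hits are available in the standard 12-column tabular format (`-outfmt 6`); the `Hit` type, the `parse_hits` function, and the way coverage is computed from the query start and end columns are illustrative assumptions, not PaperBLAST's actual code.

```python
from typing import List, NamedTuple

E_VALUE_CUTOFF = 1e-3  # threshold stated in the text: E < 0.001


class Hit(NamedTuple):
    subject: str
    percent_identity: float
    query_coverage: float  # percent of the query covered by the alignment
    evalue: float


def parse_hits(tabular: str, query_length: int,
               cutoff: float = E_VALUE_CUTOFF) -> List[Hit]:
    """Parse BLAST tabular output (-outfmt 6) and keep hits with E < cutoff."""
    hits = []
    for line in tabular.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        # Columns: qseqid sseqid pident length mismatch gapopen
        #          qstart qend sstart send evalue bitscore
        subject = fields[1]
        pident = float(fields[2])
        qstart, qend = int(fields[6]), int(fields[7])
        evalue = float(fields[10])
        if evalue < cutoff:
            coverage = 100.0 * (qend - qstart + 1) / query_length
            hits.append(Hit(subject, pident, coverage, evalue))
    return hits
```

For a 64-residue query such as the one on this page, `parse_hits(blast_output, 64)` would return only the hits below the cutoff, each with an identity and coverage percentage.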
To build the database, we query EuropePMC with locus tags, with RefSeq protein
identifiers, and with UniProt
accessions. We obtain the locus tags from RefSeq or from MicrobesOnline. We use
queries of the form "locus_tag AND genus_name" to try to ensure that
the paper is actually discussing that gene. Because EuropePMC indexes
most recent biomedical papers, even if they are not open access, some
of the links may be to papers that you cannot read or that our
computers cannot read. We query each of these identifiers that
appears in the open access part of EuropePMC, as well as every locus
tag that appears in the 500 most-referenced genomes, so that a gene
may appear in the PaperBLAST results even though none of the papers
that mention it are open access. We also incorporate text-mined links
from EuropePMC that link open access articles to UniProt or RefSeq
identifiers. (This yields some additional links because EuropePMC
uses different heuristics for their text mining than we do.)
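As a rough illustration of the "locus_tag AND genus_name" query form, the sketch below builds such a query string and a search URL for the EuropePMC REST search service. The helper names, the choice of taking the first word of the organism name as the genus, and the page size are assumptions made for this example; the code only constructs the URL and does not contact the service.

```python
from urllib.parse import urlencode

# Base URL of the EuropePMC REST search service.
EUROPEPMC_SEARCH = "https://www.ebi.ac.uk/europepmc/webservices/rest/search"


def build_query(locus_tag: str, organism: str) -> str:
    """Combine a locus tag with the genus (assumed: first word of the organism name)."""
    genus = organism.split()[0]
    return f"{locus_tag} AND {genus}"


def build_search_url(locus_tag: str, organism: str, page_size: int = 25) -> str:
    """Return a search URL requesting JSON results for the combined query."""
    params = {
        "query": build_query(locus_tag, organism),
        "format": "json",
        "pageSize": page_size,
    }
    return f"{EUROPEPMC_SEARCH}?{urlencode(params)}"
```

For example, `build_query("CPF_2417", "Clostridium perfringens ATCC 13124")` gives `CPF_2417 AND Clostridium`. Requiring the genus name alongside the locus tag is what reduces matches to unrelated uses of the same string.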
For every article that mentions a locus tag, a RefSeq protein
identifier, or a UniProt accession, we try to select one or two
snippets of text that refer to the protein. If we cannot get access to
the full text, we try to select a snippet from the abstract, but
unfortunately, unique identifiers such as locus tags are rarely
provided in abstracts.
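One simple way to pick snippets is to take a window of words around each mention of the identifier. The sketch below does this; the window size, the case-insensitive matching, and the function name `find_snippets` are assumptions for illustration, and PaperBLAST's own heuristics for choosing snippets may differ.

```python
import re
from typing import List


def find_snippets(text: str, identifier: str, window: int = 10,
                  max_snippets: int = 2) -> List[str]:
    """Return up to max_snippets windows of words around mentions of identifier."""
    words = text.split()
    # Match the identifier only when it is not part of a longer token.
    pattern = re.compile(
        r"(?<![A-Za-z0-9_])" + re.escape(identifier) + r"(?![A-Za-z0-9_])",
        re.IGNORECASE,
    )
    snippets = []
    last_end = -1  # index of the last word already covered by a snippet
    for i, word in enumerate(words):
        if i <= last_end or not pattern.search(word):
            continue
        start = max(0, i - window)
        end = min(len(words), i + window + 1)
        snippets.append("..." + " ".join(words[start:end]) + "...")
        last_end = end - 1
        if len(snippets) == max_snippets:
            break
    return snippets
```

Skipping mentions that fall inside a snippet already taken keeps the two snippets from overlapping heavily when an identifier is repeated in one sentence.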
PaperBLAST also incorporates manually-curated protein functions:
- Proteins from NCBI's RefSeq are included if a GeneRIF entry links
the gene to an article in PubMed®.
GeneRIF also provides a short summary of the article's claim about the
protein, which is shown instead of a snippet.
- Proteins from Swiss-Prot (the curated part of UniProt)
are included if the curators
identified experimental evidence for the protein's function (evidence
code ECO:0000269). For these proteins, the fields of the Swiss-Prot entry that
describe the protein's function are shown (with bold headings).
- Proteins from BRENDA,
a curated database of enzymes, are included if they are linked to a paper in PubMed
and their full sequence is known.
- Every protein from the non-redundant subset of BioLiP, a database
of ligand-binding sites and catalytic residues in protein structures, is included. Since BioLiP itself
does not include descriptions of the proteins, those are taken from the Protein Data Bank.
Descriptions from PDB rely on the original submitter of the
structure and cannot be updated by others, so they may be less reliable.
(For SitesBLAST and Sites on a Tree, we use a larger subset of BioLiP so that every
ligand is represented among a group of structures with similar sequences, but for
PaperBLAST, we use the non-redundant set provided by BioLiP.)
- Every protein from EcoCyc, a curated
database of the proteins in Escherichia coli K-12, is included, regardless
of whether it has been characterized.
- Proteins from the MetaCyc metabolic pathway database
are included if they are linked to a paper in PubMed and their full sequence is known.
- Proteins from the Transport Classification Database (TCDB)
are included if they have known substrate(s), have reference(s),
and are not described as uncharacterized or putative.
(Some of the references are not visible on the PaperBLAST web site.)
- Every protein from CharProtDB,
a database of experimentally characterized protein annotations, is included.
- Proteins from the CAZy database of carbohydrate-active enzymes
are included if they are associated with an Enzyme Classification number.
Even though CAZy does not provide links from individual protein sequences to papers,
these should all be experimentally-characterized proteins.
- Proteins from the REBASE database
of restriction enzymes are included if they have known specificity.
- Every protein with an evidence-based reannotation (based on mutant phenotypes)
in the Fitness Browser is included.
- Sequence-specific transcription factors (including sigma factors and DNA-binding response regulators)
with experimentally-determined DNA binding sites are included from the
PRODORIC database of gene regulation in prokaryotes.
- Putative transcription factors from RegPrecise are included
if they have manually-curated predictions for their binding sites. These predictions are based on
conserved putative regulatory sites across genomes that contain similar transcription factors,
so PaperBLAST clusters the TFs at 70% identity and retains just one member of each cluster.
- Coding sequence (CDS) features from the
European Nucleotide Archive (ENA)
are included if the /experiment tag is set (implying that there is experimental evidence for the annotation),
the nucleotide entry links to paper(s) in PubMed,
and the nucleotide entry is from the STD data class
(implying that these are targeted annotated sequences, not from shotgun sequencing).
Also, to filter out genes whose transcription or translation was detected, but whose function
was not studied, nucleotide entries or papers with more than 25 such proteins are excluded.
Descriptions from ENA rely on the original submitter of the
sequence and cannot be updated by others, so they may be less reliable.
Except for GeneRIF and ENA,
the curated entries include a short curated
description of the protein's function.
For entries from BioLiP, the protein's function may not be known beyond binding to the ligand.
Many of these entries also link to articles in PubMed.
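The inclusion rules for ENA coding sequences described in the list above can be sketched as a small filter. The data classes and field names here are hypothetical stand-ins for parsed ENA records, and only the per-entry limit of 25 proteins is shown; the corresponding per-paper limit is omitted for brevity.

```python
from dataclasses import dataclass, field
from typing import List

MAX_PROTEINS_PER_ENTRY = 25  # entries with more tagged proteins are excluded


@dataclass
class CdsFeature:
    protein_id: str
    has_experiment_tag: bool  # True if the /experiment tag is set


@dataclass
class NucleotideEntry:
    accession: str
    data_class: str  # e.g. "STD" for targeted annotated sequences
    pubmed_ids: List[str]
    features: List[CdsFeature] = field(default_factory=list)


def select_cds(entry: NucleotideEntry) -> List[CdsFeature]:
    """Return the CDS features of an entry that pass the stated inclusion rules."""
    # The entry must be from the STD data class and link to at least one paper.
    if entry.data_class != "STD" or not entry.pubmed_ids:
        return []
    tagged = [f for f in entry.features if f.has_experiment_tag]
    # Many tagged proteins suggests expression was detected but function
    # was not studied, so the whole entry is dropped.
    if len(tagged) > MAX_PROTEINS_PER_ENTRY:
        return []
    return tagged
```

Dropping the whole entry, rather than truncating it to 25 proteins, follows the text's reasoning that such entries mostly report detection of transcription or translation rather than functional study.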
For more information see the
PaperBLAST paper (mSystems 2017)
or the code.
You can download PaperBLAST's database here.
Changes to PaperBLAST since the paper was written:
- November 2023: incorporated PRODORIC and RegPrecise. Many PRODORIC entries were not linked to a protein sequence (no UniProt identifier), so we added this information.
- February 2023: BioLiP changed their download format. PaperBLAST now includes their non-redundant subset. SitesBLAST and Sites on a Tree use a larger non-redundant subset that ensures that every ligand is represented within each cluster. This should ensure that every binding site is represented.
- June 2022: incorporated some coding sequences from ENA with the /experiment tag.
- March 2022: incorporated BioLiP.
- April 2020: incorporated TCDB.
- April 2019: EuropePMC now returns table entries in their search results. This has expanded PaperBLAST's database, but most of the new entries are of low relevance, and the resulting snippets are often just lists of locus tags with annotations.
- February 2018: the alignment page reports the conservation of the hit's functional sites (if available from Swiss-Prot or UniProt).
- January 2018: incorporated BRENDA.
- December 2017: incorporated MetaCyc, CharProtDB, CAZy, REBASE, and the reannotations from the Fitness Browser.
- September 2017: EuropePMC no longer returns some table entries in their search results. This has shrunk PaperBLAST's database, but has also reduced the number of low-relevance hits.
Many of these changes are described in Interactive tools for functional annotation of bacterial genomes.
PaperBLAST cannot provide snippets for many of the papers that are
published in non-open-access journals. This limitation applies even if
the paper is marked as "free" on the publisher's web site and is
available in PubMed Central or EuropePMC. If a journal that you publish
in is marked as "secret," please consider publishing elsewhere.
Many important articles are missing from PaperBLAST, either because
the article's full text is not in EuropePMC (as for many older
articles), or because the paper does not mention a protein identifier such as a locus tag, or because of PaperBLAST's heuristics. If you notice an
article that characterizes a protein's function but is missing from
PaperBLAST, please notify the curators at UniProt
or add an entry to GeneRIF.
Entries in either of these databases will eventually be incorporated
into PaperBLAST. Note that to add an entry to UniProt, you will need
to find the UniProt identifier for the protein. If the protein is not
already in UniProt, you can ask them to create an entry. To add an
entry to GeneRIF, you will need an NCBI Gene identifier, but
unfortunately many prokaryotic proteins in RefSeq do not have
corresponding Gene identifiers.
References
PaperBLAST: Text-mining papers for information about homologs.
M. N. Price and A. P. Arkin (2017). mSystems, 10.1128/mSystems.00039-17.
Europe PMC in 2017.
M. Levchenko et al (2017). Nucleic Acids Research, 10.1093/nar/gkx1005.
Gene indexing: characterization and analysis of NLM's GeneRIFs.
J. A. Mitchell et al (2003). AMIA Annu Symp Proc 2003:460-464.
UniProt: the universal protein knowledgebase.
The UniProt Consortium (2016). Nucleic Acids Research, 10.1093/nar/gkw1099.
BRENDA in 2017: new perspectives and new tools in BRENDA.
S. Placzek et al (2017). Nucleic Acids Research, 10.1093/nar/gkw952.
The EcoCyc database: reflecting new knowledge about Escherichia coli K-12.
I. M. Keseler et al (2016). Nucleic Acids Research, 10.1093/nar/gkw1003.
The MetaCyc database of metabolic pathways and enzymes.
R. Caspi et al (2018). Nucleic Acids Research, 10.1093/nar/gkx935.
CharProtDB: a database of experimentally characterized protein annotations.
R. Madupu et al (2012). Nucleic Acids Research, 10.1093/nar/gkr1133.
The carbohydrate-active enzymes database (CAZy) in 2013.
V. Lombard et al (2014). Nucleic Acids Research, 10.1093/nar/gkt1178.
The Transporter Classification Database (TCDB): recent advances.
M. H. Saier, Jr. et al (2016). Nucleic Acids Research, 10.1093/nar/gkv1103.
REBASE - a database for DNA restriction and modification: enzymes, genes and genomes.
R. J. Roberts et al (2015). Nucleic Acids Research, 10.1093/nar/gku1046.
Deep annotation of protein function across diverse bacteria from mutant phenotypes.
M. N. Price et al (2016). bioRxiv, 10.1101/072470.
by Morgan Price,
Arkin group
Lawrence Berkeley National Laboratory