PaperBLAST
PaperBLAST Hits for tr|Q9HT67|Q9HT67_PSEAE HTH rpiR-type domain-containing protein OS=Pseudomonas aeruginosa (strain ATCC 15692 / DSM 22644 / CIP 104116 / JCM 14847 / LMG 12228 / 1C / PRS 101 / PAO1) OX=208964 GN=PA5506 PE=4 SV=1 (285 a.a., MQELKQRLAS...)
>tr|Q9HT67|Q9HT67_PSEAE HTH rpiR-type domain-containing protein OS=Pseudomonas aeruginosa (strain ATCC 15692 / DSM 22644 / CIP 104116 / JCM 14847 / LMG 12228 / 1C / PRS 101 / PAO1) OX=208964 GN=PA5506 PE=4 SV=1
MQELKQRLASPPAELTPAERKVVRALLDDYPRLGLGPMTRLARHAGVSDPTIMRLVKKLG
FAGYGDFQEALLADVDDRLRSPRTLLAERRERMGRDDTWARYLDQAGQSLQQTLGLTRPD
DIQRLADWLLDSRLRVHCHGGRFSRFLAGYLVTHLRLLRPQCRLLDDGALLPDQLYDLGR
QDLLVLFDYRRYQSQAQHVAQAAKARGTRLVLFTDIYASPLREHADLIVSSPVESASPFD
SLVPAMAQVEALVATLVARMGAPLDERLEGIDQLRNAFSSHILEE
Found 62 similar proteins in the literature:
NP_254193 hypothetical protein from Pseudomonas aeruginosa PAO1
PA5506 hypothetical protein from Pseudomonas aeruginosa PAO1
100% identity, 100% coverage
- QapR (PA5506) represses an operon that negatively affects the Pseudomonas quinolone signal in Pseudomonas aeruginosa.
Tipton, Journal of bacteriology 2013 - GeneRIF: Authors characterized the qapR (PA5506) operon to show that it contains genes qapR, PA5507, PA5508, and PA5509 and that QapR directly controls the transcription of these genes in a negative manner.
- QapR (PA5506) represses an operon that negatively affects the Pseudomonas quinolone signal in Pseudomonas aeruginosa
Tipton, Journal of bacteriology 2013 - “...(PA5506) Represses an Operon That Negatively Affects the Pseudomonas Quinolone Signal in Pseudomonas aeruginosa Kyle A. Tipton, James P. Coleman, Everett C. Pesci Department of Microbiology and Immunology, The Brody School of Medicine...”
- “...mutants was found to have a disruption in gene PA5506 (hereafter referred to as qapR for quinolone alteration pathway...”
- Uracil influences quorum sensing and biofilm formation in Pseudomonas aeruginosa and fluorouracil is an antagonist
Ueda, Microbial biotechnology 2009 - “...1.1 7.5 Probable coenzyme A transferase PA14_72180 PA5469 7 1.6 10.6 Conserved hypothetical protein PA14_72650 PA5506 6.5 1.9 4 Hypothetical protein PA14_72700 PA5509 6.1 1.1 5.3 Hypothetical protein PA14_72960 PA5530 4.9 1.1 4.6 Probable MFS dicarboxylate transporter Partial list of differentially expressed genes in biofilm cells...”
- Effect of anaerobiosis and nitrate on gene expression in Pseudomonas aeruginosa
Filiatrault, Infection and immunity 2005 - “...PA5171 PA5172 PA5208 PA5216 PA5217 PA5496 PA5497 PA5504 PA5506 PA5507 PA5508 PA5510 PA5570 Gene 3768 INFECT. IMMUN. NOTES TABLE 2. Differentially expressed...”
- Screening for quorum-sensing inhibitors (QSI) by use of a novel genetic system, the QSI selector
Rasmussen, Journal of bacteriology 2005 - “...PA5352 PA5383 PA5457 PA5460 PA5468 PA5481 PA5482 PA5503 PA5506 PA5517 PA5541 PA5544 Gene 1810 RASMUSSEN ET AL. J. BACTERIOL. TABLE 3. Genes upregulated by...”
- Identification of AlgR-regulated genes in Pseudomonas aeruginosa by use of microarray analysis
Lizewski, Journal of bacteriology 2004 - “...PA4348 PA4352 PA4354 PA4577 PA4613 PA4916 PA5171 PA5261 PA5497 PA5506 PA5507 PA5508 PA5509 hemN hcnB mexE mexF glpK oprG katB arcA algR Fold activationa P...”
PA14_72650 putative transcriptional regulator from Pseudomonas aeruginosa UCBPP-PA14
99% identity, 100% coverage
SPO2602 MurR/RpiR family transcriptional regulator from Ruegeria pomeroyi DSS-3
35% identity, 96% coverage
MED193_10021 hypothetical protein from Roseobacter sp. MED193
34% identity, 90% coverage
- Bacterial catabolism of membrane phospholipids links marine biogeochemical cycles
Westermann, Science advances 2023 - “...ABC-type transporter, ATP-binding component (MED193_07698); ChoX: ABC-type transporter, betaine/carnitine/choline binding protein (MED193_07703); RpiR: RpiR regulator (MED193_10021); GGAH: γ-glutamylglycine amidohydrolase (MED193_10026), EtoV: TRAP transporter, small permease component (MED193_10031); EtoW: TRAP transporter, large permease component (MED193_10036); EtoX: TRAP transporter, ethanolamine binding protein (MED193_10041); ETAGA: ethanolamine γ-glutamylase (MED193_10046); GAADDH:...”
ZMO0190 transcriptional regulator, RpiR family from Zymomonas mobilis subsp. mobilis ZM4
28% identity, 85% coverage
- Transcriptome profiling of Zymomonas mobilis under ethanol stress
He, Biotechnology for biofuels 2012 - “...family transcriptional regulator (ZMO0281, ZMO1547), LysR family transcriptional regulator (ZMO0774) and RpiR family transcriptional regulator (ZMO0190). Two phage shock protein B and C (ZMO1064, pspB and ZMO1065, pspC) were shown to be higher differentially expressed (see Additional file 1: Table S1). Another phage shock...”
BCAM2158 putative DNA-binding protein from Burkholderia cenocepacia J2315
32% identity, 93% coverage
Tsac_1881 MurR/RpiR family transcriptional regulator from Thermoanaerobacterium saccharolyticum JW/SL-YS485
22% identity, 97% coverage
YP_001795747 putative transcriptional regulator, RpiR and Sugar isomerase (SIS) domains from Cupriavidus taiwanensis
30% identity, 84% coverage
- Plant-bacteria association and symbiosis: are there common genomic traits in alphaproteobacteria?
Pini, Genes 2011 - “...pneumoniae 2149 Transcriptional regulator YP_932298 YP_002005188 YP_002007781 YP_001177142 YP_002237096 YP_002237759 YP_001177763 YP_001177947 2248 Transcriptional regulator YP_001795747 YP_001177733 YP_002239091 YP_002240771 YP_001177763 YP_001177947 2654 Adenylate cyclase YP_932132 YP_002008552 2734 ABC transporter YP_001175837 YP_002237478 YP_001177423 YP_002239806 2737 Unknown YP_002005759 2774 Endoribonuclease l-psp YP_931980 YP_002008711 YP_001178228 YP_002237662 YP_002008874 YP_002238064 2791...”
CD2048 RpiR-family transcriptional regulator from Clostridium difficile 630
23% identity, 97% coverage
- Flagellin is essential for initial attachment to mucosal surfaces by Clostridioides difficile
Sidner, Microbiology spectrum 2023 - “...a non-toxigenic clade 1 isolate ( 25 , 26 ), and CD1014 (ribotype 014) and CD2048 (ribotype 053) are toxigenic clade 1 isolates ( 22 ). C. difficile , regardless of strain, was grown in an anaerobic chamber (Coy Laboratory Products) with 5% CO/5% H 2...”
- “...assess swimming motility, C. difficile strains R20291, 630 VPI 10463, CD2015, CD1015, CD37, CD1014, and CD2048 were cultured overnight in BHIS(TA) broth. Overnight broth cultures were used to inoculate 0.5 BHIS with 0.3% agar plates, and measurements were recorded either at 48 hours (Fig. 3) or...”
- The role of trehalose in the global spread of epidemic Clostridium difficile
Collins, Gut microbes 2019 - “...used to examine the metabolic profiles of strains CD2048 (RT053), M68 (RT017) and CD2015 (RT027). Among the carbohydrates tested, strains M68 (RT017) and CD2015...”
- “...in the presence of trehalose over no carbohydrate media. CD2048 (RT053) was unable to utilize trehalose. (n = 2) (B) M68 reaches a similar maximum optical...”
- Dietary trehalose enhances virulence of epidemic Clostridium difficile
Collins, Nature 2018 - “...the minimum level of trehalose required to activate treA expression, we grew CD2015 (RT027) and CD2048 (RT053) and exposed them to increasing amounts of trehalose. We found that the RT027 strain turned on treA expression at 50 μM trehalose, a concentration 500-fold lower than that required...”
- “...to activated treA gene expression in the RT027 strain CD2015 but not in RT053 strain CD2048 (Fig 5a). To test whether we could detect a low dietary amount of trehalose, we gavaged antibiotic treated mice with 100 μl (5 mM) trehalose and measured treA...”
- Epidemic Clostridium difficile strains demonstrate increased competitive fitness compared to nonepidemic isolates
Robinson, Infection and immunity 2014 - “...Strain Toxinotype PFGEa type (NAP status) Ribotype CD1014 CD2015 CD2048 CD3014 CD3017 CD4004 CD4010 CD4015 0 III 0 0 III 0 III III MI-NAP4 MI-NAP1 MI-NAP3...”
- “...for CD3017 plus CD1014 and 1:50 for CD4015 plus CD2048. The mice were observed daily for disease symptoms and morbidity. Fecal samples were collected daily and...”
- Adaptive strategies and pathogenesis of Clostridium difficile from in vivo transcriptomics
Janoir, Infection and immunity 2013 - “...and six orphan transcriptional regulators (CD2640, CD2214, CD0670, CD2048, CD0693, and CD2444), which were upregulated in vivo compared to in vitro (see Table...”
- “...to the MerR and RpiR families (CD0693 and CD2048, respectively). NrdR negatively regulates expression of the ribonucleotide reductases (RNRs), which provide the...”
CAC1850 Transcriptional regulators, RpiR family from Clostridium acetobutylicum ATCC 824
25% identity, 83% coverage
- Pleiotropic Regulator GssR Positively Regulates Autotrophic Growth of Gas-Fermenting Clostridium ljungdahlii
Zhang, Microorganisms 2023 - “...Nanjing, China), yielding the pET28a- gssR plasmid. The pWJ1-CAC1850 plasmid for the disruption of the CAC1850 gene was constructed as follows. In brief, A 350-bp DNA fragment was first obtained through PCR amplification by using the following primers: the EBS universal primer, CAC1850-381,382s-IBS, CAC1850-381,382s-EBS1d, and CAC1850-381,382s-EBS2....”
- “...the importance of these two proteins in their hosts, we inactivated their corresponding genes, i.e., CAC1850 and Cbei1890, by using the Targetron method [ 41 ], yielding the mutant strains Cac-Mu and Cbei-Mu, respectively, to examine their performance in fermentation. The results show that the growth...”
bhn_I1841 MurR/RpiR family transcriptional regulator from Butyrivibrio hungatei
20% identity, 89% coverage
ECA3496 putative transcriptional regulator from Erwinia carotovora subsp. atroseptica SCRI1043
28% identity, 91% coverage
- DsbA plays a critical and multifaceted role in the production of secreted virulence factors by the phytopathogen Erwinia carotovora subsp. atroseptica
Coulthurst, The Journal of biological chemistry 2008 - “...(Nip, PelABZ, PehAX, Pel-3, ECA2553, CelB, ECA2134, ECA3580, and ECA3496), including several that are are detailed in Table 1. In the wild type secretome, many...”
XNC1_2986 MurR/RpiR family transcriptional regulator from Xenorhabdus nematophila ATCC 19061
22% identity, 92% coverage
BP1026B_II1046 sap1 transcriptional regulator SapR from Burkholderia pseudomallei 1026b
26% identity, 86% coverage
- A virulence activator of a surface attachment protein in Burkholderia pseudomallei acts as a global regulator of other membrane-associated virulence factors
Sun, Frontiers in microbiology 2022 - “...regulated at different stages of Bp intracellular lifecycle by unidentified regulator(s). Here, we identified SapR (BP1026B_II1046) as a transcriptional regulator that activates sap1 , using a high-throughput transposon mutagenesis screen in combination with Tn-Seq. Consistent with phenotypes of the sap1 mutant, the sapR activator mutant exhibited...”
- “...transposon mutagenesis screen coupled with Tn-Seq to identify potential regulators of sap1 . We identified BP1026B_II1046 as a transcriptional regulator that activates the sap1 gene and designated it as SapR. Furthermore, we used RNA-Seq to characterize the SapR associated transcriptional network, providing insights into its role...”
PP3592 transcriptional regulator, RpiR family from Pseudomonas putida KT2440
26% identity, 90% coverage
- Identification of the initial steps in D-lysine catabolism in Pseudomonas putida
Revelles, Journal of bacteriology 2007 - “...made up of seven open reading frames (ORFs) encoding PP3592 through PP3597. The dpkA amaC operon was transcribed divergently from the operon ORF3592 to ORF3597....”
- “...complementary to the mRNA of the ORF encoding PP3592 (5′-CCAGGCGTTGCTTGATCAGT-3′). Primers were labeled at their 5′ ends with [γ-32P]ATP and T4 polynucleotide...”
PP_3592 MurR/RpiR family transcriptional regulator from Pseudomonas putida KT2440
26% identity, 96% coverage
RSP_1414 hypothetical protein from Rhodobacter sphaeroides 2.4.1
26% identity, 90% coverage
- Architecture of divergent flagellar promoters controlled by CtrA in Rhodobacter sphaeroides
Rivera-Osorio, BMC microbiology 2018 - “...of the secretion system; fliF2 , the MS-ring protein; fliL2 , a motor control protein; RSP_1414, RSP_1315, conserved hypothetical proteins; motA 2, the stator protein A; RSP_1318, a conserved hypothetical protein; SltF, the flagellar soluble lytic transglycosylase (named before flgJB2) ; flhA2 , fliR2 , flhB2...”
BCAM1768 hypothetical protein from Burkholderia cenocepacia J2315
26% identity, 90% coverage
- NtrC-dependent control of exopolysaccharide synthesis and motility in Burkholderia cenocepacia H111
Liu, PloS one 2017 - “...3.5 I35_4653 BCAM0754 Transcriptional regulator, TetR family -3.6 I35_5551 BCAM1688 Response regulator nasT -8.1 I35_5643 BCAM1768 Transcriptional regulators -6.1 I35_5806 BCAM1975 Ethanolamine operon regulatory protein -4.8 I35_5874 BCAM2039 Two-component response regulator -5.0 I35_6218 BCAM2327 Transcriptional regulator -6.7 Translation I35_2322 BCAL2395 Cytoplasmic axial filament protein CafA and...”
CLJU_c21350 MurR/RpiR family transcriptional regulator from Clostridium ljungdahlii DSM 13528
22% identity, 83% coverage
AWC34_RS13160 MurR/RpiR family transcriptional regulator from Staphylococcus equorum
24% identity, 98% coverage
MAKP3_43790 MurR/RpiR family transcriptional regulator from Klebsiella pneumoniae
24% identity, 83% coverage
YPK_3330 RpiR family transcriptional regulator from Yersinia pseudotuberculosis YPIII
26% identity, 98% coverage
UgtR / VIMSS7512254 UgtR regulator of Sugar utilization (repressor) from Thermotogales bacterium TBF 19.5.1
22% identity, 91% coverage
CC1297 SIS domain protein from Caulobacter crescentus CB15
26% identity, 80% coverage
- Genetic and computational identification of a conserved bacterial metabolic module
Boutte, PLoS genetics 2008 - “...iatP downstream=959,410–960,005. iolR upstream=1,442,941–1,443,408; iolR downstream=1,444,244–1,444,741. All gene deletions were in-frame. The deletion of iolR (CC1297) left the first and last 6 codons intact. The deletion of ibpA (CC0859) left the first 45 and last 38 codons intact. The deletion of iatA (CC0860) left the first...”
- “...myo-Inositol Module Is Regulated by IolR and a Conserved cis-Acting Sequence The gene CC1297 (NP_420110) is annotated as an RpiR-family transcriptional regulator and encodes a putative SIS (Sugar ISomerase; Pfam 01380) domain at its N-terminus. Based on its predicted function as...”
I35_5643 sap1 transcriptional regulator SapR from Burkholderia cenocepacia H111
25% identity, 90% coverage
- NtrC-dependent control of exopolysaccharide synthesis and motility in Burkholderia cenocepacia H111
Liu, PloS one 2017 - “...ethanolamine operon regulatory protein (I35_0068 and I35_7815), I35_1967, ntrB , I35_4176, I35_4535, I35_4653, nasT (I35_5551), I35_5643, I35_5874 and I35_6218. An analysis of the categories associated with the top 150 NtrC-regulated genes, revealed that, beside category E (amino acid metabolism and transport), category N (motility) is over-represented...”
- “...kinase 3.5 I35_4653 BCAM0754 Transcriptional regulator, TetR family -3.6 I35_5551 BCAM1688 Response regulator nasT -8.1 I35_5643 BCAM1768 Transcriptional regulators -6.1 I35_5806 BCAM1975 Ethanolamine operon regulatory protein -4.8 I35_5874 BCAM2039 Two-component response regulator -5.0 I35_6218 BCAM2327 Transcriptional regulator -6.7 Translation I35_2322 BCAL2395 Cytoplasmic axial filament protein CafA...”
CPE0189 conserved hypothetical protein from Clostridium perfringens str. 13
21% identity, 78% coverage
UgtR / VIMSS3676287 UgtR regulator of Sugar utilization (repressor) from Thermotoga lettingae TMO
24% identity, 91% coverage
SiaQ / VIMSS3626386 SiaQ regulator of Sialic acid utilization, effector N-acetylmannosamine-6-phosphate (repressor) from Enterobacter sp. 638
YP_001177947 transcriptional regulator, RpiR family from Enterobacter sp. 638
27% identity, 85% coverage
- Plant-bacteria association and symbiosis: are there common genomic traits in alphaproteobacteria?
Pini, Genes 2011 - “...taiwanensis Enterobacter 638 Klebsiella pneumoniae 2149 Transcriptional regulator YP_932298 YP_002005188 YP_002007781 YP_001177142 YP_002237096 YP_002237759 YP_001177763 YP_001177947 2248 Transcriptional regulator YP_001795747 YP_001177733 YP_002239091 YP_002240771 YP_001177763 YP_001177947 2654 Adenylate cyclase YP_932132 YP_002008552 2734 ABC transporter YP_001175837 YP_002237478 YP_001177423 YP_002239806 2737 Unknown YP_002005759 2774 Endoribonuclease l-psp YP_931980 YP_002008711 YP_001178228...”
GntR1 / VIMSS2182975 GntR1 regulator of Gluconate utilization, effector Gluconate from Oenococcus oeni PSU-1
22% identity, 97% coverage
RSUY_18120, RSUY_RS08860 MurR/RpiR family transcriptional regulator from Ralstonia solanacearum
26% identity, 69% coverage
- Transcriptomes of Ralstonia solanacearum during Root Colonization of Solanum commersonii
Puigvert, Frontiers in plant science 2017 - “...RSc2867 2.45751 dppD1 peptide ABC transporter substrate-binding protein RSUY_RS19000 RSUY_38980 RSp0691 2.63642 hmgA homogentisate 1,2-dioxygenase RSUY_RS08860 RSUY_18120 RSc1377 2.72373 transcriptional regulator RSUY_RS00950 RSUY_01980 RSc3296 2.82715 sdaA2 L-serine ammonia-lyase / L-serine ammonia-lyase RSUY_RS18995 RSUY_38970 RSp0690 2.93699 hmgB fumarylacetoacetase RSUY_RS08895 RSUY_18190 RSc1384 3.10843 D-aminopeptidase RSUY_RS08865 RSUY_18130 RSc1378 3.44342 isoaspartyl...”
- “...RSUY_06080 RSc2867 2.45751 dppD1 peptide ABC transporter substrate-binding protein RSUY_RS19000 RSUY_38980 RSp0691 2.63642 hmgA homogentisate 1,2-dioxygenase RSUY_RS08860 RSUY_18120 RSc1377 2.72373 transcriptional regulator RSUY_RS00950 RSUY_01980 RSc3296 2.82715 sdaA2 L-serine ammonia-lyase / L-serine ammonia-lyase RSUY_RS18995 RSUY_38970 RSp0690 2.93699 hmgB fumarylacetoacetase RSUY_RS08895 RSUY_18190 RSc1384 3.10843 D-aminopeptidase RSUY_RS08865 RSUY_18130 RSc1378 3.44342...”
SMc04363 HYPOTHETICAL PROTEIN from Sinorhizobium meliloti 1021
27% identity, 92% coverage
UgtR / VIMSS10363690 UgtR regulator of Sugar utilization (repressor) from Thermotoga naphthophila RKU-10
23% identity, 81% coverage
NTHI0229 putative HTH-type transcriptional regulator from Haemophilus influenzae 86-028NP
22% identity, 85% coverage
- Molecular Signatures of Non-typeable Haemophilus influenzae Lung Adaptation in Pediatric Chronic Lung Disease
Aziz, Frontiers in microbiology 2019 - “..., glpK , hitA , hsdMS3 , licBC , mao2 , nagAB , nanAEK , NTHI0229 , NTHI0232 , NTHI0235 , NTHI0646 , NTHI0647 , NTHI1183 , NTHI1243 , nrfCD , pckA , pilA , sdaA , siaT , and tnaB ), and the remaining 87...”
P44540 Uncharacterized HTH-type transcriptional regulator HI_0143 from Haemophilus influenzae (strain ATCC 51907 / DSM 11121 / KW20 / Rd)
HI0143 conserved hypothetical protein from Haemophilus influenzae Rd KW20
22% identity, 86% coverage
- Functional annotation of conserved hypothetical proteins from Haemophilus influenzae Rd KW20
Shahbaaz, PloS one 2013 - “...24. HP HI0134 951034 P43952 sugar transporter (AsmA-like C-terminal domain protein) 25. HP HI0143 951052 P44540 HTH-type transcriptional regulator 26. HP HI0146 951056 P44542 sialic acid transporter, TRAP-type C4-dicarboxylate transport system, periplasmic component 27. HP HI0147 951057 P44543 C4-dicarboxylate ABC transporter permease 28. HP HI0149 951059...”
- Genetic snapshots of the Rhizobium species NGR234 genome
Viprey, Genome biology 2000 - “...O07220 956 13a08 P44093 904 25a05 O06235 957 25h09 P44886 905 25b06 P71984 958 01h02 P44540 906 25h05 O53203 959 22c12 P44543 907 27h05 Q10849 960 19a07 Q57184 Agrobacterium sp. 1005 07d10 Q58322 961 03a07 AAB91569 1006 05a02 Q57883 962 05f01 AAB67296 1006a 29e09 overlaps clone...”
- Functional annotation of conserved hypothetical proteins from Haemophilus influenzae Rd KW20
Shahbaaz, PloS one 2013 - “...C permease 24. HP HI0134 951034 P43952 sugar transporter (AsmA-like C-terminal domain protein) 25. HP HI0143 951052 P44540 HTH-type transcriptional regulator 26. HP HI0146 951056 P44542 sialic acid transporter, TRAP-type C4-dicarboxylate transport system, periplasmic component 27. HP HI0147 951057 P44543 C4-dicarboxylate ABC transporter permease 28. HP...”
- Sialic acid mediated transcriptional modulation of a highly conserved sialometabolism gene cluster in Haemophilus influenzae and its effect on virulence
Jenkins, BMC microbiology 2010 - “...), insertion of kan R was achieved following partial digestion with Mfe1 and siaR ( HI0143 ) was inactivated by inserting kan R at an MfeI site. Table 1 Oligonucleotide primers used in this study. primer Sequence (5'-3') primer Sequence (5'-3') 0140for CTGCAATTAAATGGCTGTGG 0140rev GCAATTGTGTCATTCGCATC 0141for...”
- “...), HI0142 ( nanA ), HI0144 ( nanK ), HI0145 ( nanE ) and including HI0143 ( siaR )) and procurement ( HI0146 ( siaP ), HI1047 ( siaQM ), HI0148 ) are transcribed divergently (Figure 1 ). siaR and nanK possess overlapping ORFs whilst three...”
- Convergent pathways for utilization of the amino sugars N-acetylglucosamine, N-acetylmannosamine, and N-acetylneuraminic acid by Escherichia coli
Plumbridge, Journal of bacteriology 1999 - “...(HI0145). The gene order in H. influenzae (yhcJ, yhcI, HI0143, nanA, nagB, nagA) is somewhat different from that in E. coli (Fig. 2) but suggests a remarkable...”
Atu5063 transcriptional regulator, RpiR family from Agrobacterium tumefaciens str. C58 (Cereon)
27% identity, 84% coverage
SMa0520 conserved hypothetical protein from Sinorhizobium meliloti 1021
27% identity, 89% coverage
- Rhizobium etli CFN42 and Sinorhizobium meliloti 1021 bioinformatic transcriptional regulatory networks from culture and symbiosis
Taboada-Castro, Frontiers in bioinformatics 2024 - “...iron ABC transporter substrate-binding protein and gene 5,003 is of the LacI family TF. The SMa0520 regulon contains two LysR family TFs. The SMa0748 regulon contains two GntR family TFs and the 4,457 and 4,458 neighbor genes, where 4,457 gene is a TF and 4,458 gene...”
- Transcriptome profiling of a Sinorhizobium meliloti fadD mutant reveals the role of rhizobactin 1021 biosynthesis and regulation genes in the control of swarming
Nogales, BMC genomics 2010 - “...M value b SS/L 7 h SS/L 14 h SS/S 7 h SS/S 14 h SMa0520 Transcriptional regulator, RpiR family 1,45 1,90 1,73 1,55 SMa0564 Putative dehydrogenase -0,45 -1,12 -0,83 2,78 SMa1052 Conserved hypothetical protein 1,01 1,24 0,51 1,56 SMa1077 ( nex18 ) c Nex18 Symbiotically...”
BMEII0573 TRANSCRIPTIONAL REGULATOR, RPIR FAMILY from Brucella melitensis 16M
27% identity, 70% coverage
- Analyses of Brucella pathogenesis, host immunity, and vaccine targets using systems biology and bioinformatics
He, Frontiers in cellular and infection microbiology 2012 - “...BMEI1913 Mice 16113274 86 lysR18 BMEI1573 Mice 16113274 87 rho BMEI0003 HeLa 11579087 88 RpiR BMEII0573 Mice 14979322 89 rpoA BMEI0781 Mice 14638795 90 vjbR BMEII1116 Mice, macrophages, HeLa 14979322 L: REPLICATION, RECOMBINATION, AND REPAIR 91 alkA BMEI0382 Mice, macrophages, HeLa 14979322 92 BMEI1229 BMEI1229 Mice,...”
- Large scale immune profiling of infected humans and goats reveals differential recognition of Brucella melitensis antigens
Liang, PLoS neglected tropical diseases 2010 - “...BAB1_1009 BR0990 0.842 Rare Lipoprotein A BMEII0338 BAB2_0275 BRA0960 0.958 ABC transporter periplasmic BP, lipoprotein BMEII0573 BAB2_0527 BRA0712 0.917 Transcriptional regulator, RpiR family BMEII0179 BAB2_1078 BRA1120 0.858 Zn binding protein BMEII0859 BAB2_0812 BRA0409 0.883 ABC dipeptide transport system, periplasmic component Validation of serodiagnostic accuracy with immunostrips...”
- Brucella melitensis global gene expression study provides novel information on growth phase-specific gene regulation with potential insights for understanding Brucella:host initial interactions
Rossetti, BMC microbiology 2009 - “...IclR (BMEI1717), LysR (BMEII0902, BMEII1077, BMEII1135), LuxR (BMEI1758), MarR (BMEII0520), MerR (BMEII0372, BMEII0467), and RpiR (BMEII0573) families were differentially expressed in late-log phase B. melitensis cultures compared to stationary phase cultures. Some of these transcription factors are known to be involved in positive regulation of gene...”
STM4417 putative transcriptional regulator from Salmonella typhimurium LT2
STM14_5307 myo-inositol utilization transcriptional regulator IolR from Salmonella enterica subsp. enterica serovar Typhimurium str. 14028S
22% identity, 83% coverage
- Hysteresis in myo-inositol utilization by Salmonella Typhimurium
Hellinckx, MicrobiologyOpen 2017 - “...or literature Bacterial strains 14028 S. enterica serovar Typhimurium strain ATCC14028 ATCC 14028 iolR In-frame iolR (STM4417) deletion mutant This study MvP101 14028 with sseD :: aphT , Kan R ; allelic-exchange mutant (Medina et al., 1999 ) MvP101 iolR In-frame iolR (STM4417) deletion mutant of MvP101 This...”
- High binding affinity of repressor IolR avoids costs of untimely induction of myo-inositol utilization by Salmonella Typhimurium
Hellinckx, Scientific reports 2017 - “...strain ATCC14028 ATCC 14028 iolR ::Kan R Allelic-exchange mutant This study 14028 iolR In-frame iolR (STM4417) deletion mutant This study 14028 dacB ::Kan R Allelic-exchange mutant This study 14028 iolR ::Kan R 44184436 Deletion of iolT1 (STM4418) to iolH (STM4436) in 14028 iolR ::Kan R This...”
- “...with sseD :: aphT , Kan R ; allelic-exchange mutant 37 MvP101 iolR In-frame iolR (STM4417) deletion mutant of MvP101 This study E. coli BL21(DE3) F , ompT, hsd S B (r B m B ), gal, lon, dcm, rne 131, (DE3 [ lacI lac UV5-T7...”
- Deciphering the Regulatory Circuitry That Controls Reversible Lysine Acetylation in Salmonella enterica
Hentchel, mBio 2015 - “...products affected the pat promoter (P pat ). We show that inactivation of iolR ( stm4417 ), encoding an RpiR-like transcriptional repressor, decreased pat expression ( 16 ). RpiR-like regulators are involved in sugar catabolism and can function as activators and repressors ( 17 , 18...”
- “...flanking the Tn 10 d( tet + ) elements located both insertions within iolR ( stm4417 ), the gene encoding the repressor of the myo- inositol utilization ( iol ) genes. No other insertions affecting pat expression were identified. To confirm that the iolR ::Tn 10...”
- Identification of novel factors involved in modulating motility of Salmonella enterica serotype typhimurium
Bogomolnaya, PloS one 2014 - “...pdxK, tatC yliB, STM1635, sfbA, mgtB, sodA, STM0163, STM1546, STM0857 Transcription tctD invF, STM2912, STM3696, STM4417, arcA STM0859, ydiP, torR, STM4315 Replication, recombination and repair STM1005 STM1861 Translation, ribosomal structure and biogenesis valS STM1552 Posttranslational modification, protein turnover, chaperones STM2743, sspA Energy production and conversion STM0762,...”
- SrfJ, a Salmonella type III secretion system effector regulated by PhoP, RcsB, and IolR
Cordero-Alba, Journal of bacteriology 2012 - “...carried the T-POP transposon in the same gene: STM14_5307 (STM4417 in strain LT2), also known as iolR (33). To confirm the results, an iolR null mutant was...”
- Bistability in myo-inositol utilization by Salmonella enterica serovar Typhimurium
Kröger, Journal of bacteriology 2011 - “...14028 wild-type strain ATCC 14028 In-frame iolR (STM4417) deletion mutant In-frame iolI1 (STM4427) deletion mutant In-frame iolG2 (STM4433) deletion mutant...”
- In Salmonella enterica, 2-methylcitrate blocks gluconeogenesis
Rocco, Journal of bacteriology 2010 - “...JE10713 that contained a cat cassette replacing iolR (formerly STM4417) (36) located near the fbp gene. Bacteriophage P22 was concentrated at 39,000 g for 2 h...”
- Characterization of the myo-inositol utilization island of Salmonella enterica serovar Typhimurium
Kröger, Journal of bacteriology 2009 - “...part of a 22.6-kb genomic island which spans STM4417 to STM4436 (genomic island 4417/4436) and is responsible for MI degradation. Genome comparison revealed the...”
- “...to a reduced lag phase of a strain mutated in STM4417 (iolR). Deletion of iolR resulted in stimulation of the iol operons, indicating its negative effect on the...”
- Heat Survival and Phenotype Microarray Profiling of Salmonella Typhimurium Mutants
Dawoud, Current microbiology 2017 (PubMed)- “...of doublestranded DNA breaks through homologous recombination. STM14_5307 is a transcriptional regulator involved in stationary phase growth and inositol...”
- “...function to provide protection against these conditions, RecD, STM14_5307, and aroD are among some of the most important stress response genes besides general...”
- Genome scanning for conditionally essential genes in Salmonella enterica Serotype Typhimurium
Khatiwara, Applied and environmental microbiology 2012 - “...using 4 deletion mutants (pyrD, glnL, recD, and STM14_5307) confirmed the phenotypes predicted by Tn-seq data, validating the utility of this approach in...”
- “...14028 carrying a deletion in pyrD, glnL, recD, and STM14_5307 were constructed with a lambda red recombination system by the method described by Cox et al. (6),...”
- SrfJ, a Salmonella type III secretion system effector regulated by PhoP, RcsB, and IolR
Cordero-Alba, Journal of bacteriology 2012 - “...isolates carried the T-POP transposon in the same gene: STM14_5307 (STM4417 in strain LT2), also known as iolR (33). To confirm the results, an iolR null mutant...”
VC1775 conserved hypothetical protein from Vibrio cholerae O1 biovar eltor str. N16961
24% identity, 90% coverage
BAB2_0527 Sugar isomerase (SIS) from Brucella melitensis biovar Abortus 2308
27% identity, 70% coverage
BRA0712 SIS domain protein from Brucella suis 1330
27% identity, 70% coverage
AlsR / b4089 DNA-binding transcriptional repressor AlsR from Escherichia coli K-12 substr. MG1655 (see 3 papers)
RPIR_ECOLI / P0ACS7 HTH-type transcriptional regulator RpiR; Als operon repressor from Escherichia coli (strain K12) (see paper)
rpiR / RF|NP_418513 Als operon repressor from Escherichia coli K12 (see 5 papers)
rpiR / CAA57687.1 rpiR from Escherichia coli (see paper)
AlsR / VIMSS18117 AlsR regulator of Allose utilization, effector D-allose (repressor) from Escherichia coli str. K-12 substr. MG1655
D1792_18820 D-allose utilization transcriptional regulator AlsR from Escherichia coli
b4089 transcriptional repressor of rpiB expression from Escherichia coli str. K-12 substr. MG1655
23% identity, 74% coverage
- function: Regulatory protein involved in rpiB gene repression. Also involved in als operon repression.
- Effects of PEF on Cell and Transcriptomic of Escherichia coli
Kuang, Microorganisms 2024 - “...gene was up-regulated. As we can observe in Figure 6 E, the expression levels of D1792_18820 (carbohydrate metabolic process), D1792_24705 (Succinate Dehydrogenase activity), D1792_16885 (energy metabolism), D1792_19100 (amino acid metabolism), D1792_09930 (DNA damage response), E4.2.1.2A (Tricarboxylic acid cycle), and rbsB (motility) were down-regulated, while the expression...”
- “...genes. microorganisms-12-01380-t001_Table 1 Table 1 Primer information. Genes Forward (F) Reverse (R) 16SrRNA TGACGTTACCCGCAGAAGAA ATCTCTACGCATTTCACCGCTAC D1792_18820 AGCCGCATCGTGGAGTGG TCGCTTCAGATACCGCCAGAG D1792_24705 CGTAGGTATTCGCCACATGATGATG GAAAGCACGACAGTAATAACAAAGGAG D1792_19850 ACTGCGATCAACTGGTGGTAATG CCGCTTCCACGCTGAATACTG D1792_16885 GGTGATGGCGTACTGGAGATATTG AGGTTGAAACGGCGGATTTGG D1792_02420 CGGCATCAATACTTACGCTCAGG ACATACTTTATTCTCACCCAGCAACAC D1792_19100 TGGCAGCAGAAGCAGGTCAG AGCAAGCGTTGGTCAGAATGTG D1792_09930 TTTGATGAAGTGGATGTAGGGATTAGC ACCTGAGTTGATTCGCCAAGTTG E4.2.1.2A CTCCTGTTCTGCTGACCGTAATATC ACCGCTTCGCCTTCTCCTG rbsB CGTCAGTGCGAATGCGATGG CCACCAGGTTATAGCCAAGTTTATCC microorganisms-12-01380-t002_Table 2 Table 2...”
- Combined, functional genomic-biochemical approach to intermediary metabolism: interaction of acivicin, a glutamine amidotransferase inhibitor, with Escherichia coli K-12
Smulski, Journal of bacteriology 2001 - “...b3300 b2679 b4245 b4244 b3749 b0497 b3041 b1427 b3704 b4089 b3984 b3317 b3320 b3319 b3308 b3305 b4203 b3985 b3983 b3986 b3231 b3310 b3301 b3313 b3304 b2606...”
VVMO6_04102 MurR/RpiR family transcriptional regulator from Vibrio vulnificus MO6-24/O
24% identity, 90% coverage
BF29_RS07850 MurR/RpiR family transcriptional regulator from Heyndrickxia coagulans DSM 1 = ATCC 7050
20% identity, 93% coverage
SiaQ / VIMSS311908 SiaQ regulator of Sialic acid utilization, effector N-acetylmannosamine-6-phosphate (repressor) from Vibrio vulnificus CMCP6
VV2_0729 Transcriptional regulator from Vibrio vulnificus CMCP6
24% identity, 84% coverage
SSU05_1933 Transcriptional regulator from Streptococcus suis 05ZYH33
SSU1727 RpiR family regulatory protein from Streptococcus suis P1/7
23% identity, 77% coverage
- HP0197 contributes to CPS synthesis and the virulence of Streptococcus suis via CcpA
Zhang, PloS one 2012 - “...on the data obtained from a total of 12 genes. From left to right: ssu05_1372, ssu05_1933, ssu05_0167, ssu05_2076, ssu05_2137, ssu05_1039, ssu05_0360, ssu05_1401, ssu05_0812, ssu05_0573 ( cps 2J ), ssu05_0265 and ssu05_0469. ( C ) Functional classification of the differentially expressed genes. J: Translation, ribosomal structure and...”
- Genetic diversity of Streptococcus suis isolates as determined by comparative genome hybridization
de, BMC microbiology 2011 - “...23.640 45 43.4 Two-component regulatory system, tranpsoase, glucosaminidase, hypothetical proteins, alpha-1,2,-mannosidase, eno-beta-N-acetylglucusaminidase RD33 SSU1722 - SSU1727 4.924 30 38.3 Acetyltransferase, hypothetical proteins, PTS IIBC RD34 SSU1763 - SSU1768 6.153 29 47.1 Nicotinamide mononucleotide transporter, transcriptional regulator, hypothetical proteins RD35 SSU1855 - SSU1862 8.479 52 39.9 PTS...”
RSc1377 CONSERVED HYPOTHETICAL PROTEIN from Ralstonia solanacearum GMI1000
25% identity, 68% coverage
- Transcriptomes of Ralstonia solanacearum during Root Colonization of Solanum commersonii
Puigvert, Frontiers in plant science 2017 - “...2.45751 dppD1 peptide ABC transporter substrate-bindingprotein RSUY_RS19000 RSUY_38980 RSp0691 2.63642 hmgA homogentisate 1,2-dioxygenase RSUY_RS08860 RSUY_18120 RSc1377 2.72373 transcriptional regulator RSUY_RS00950 RSUY_01980 RSc3296 2.82715 sdaA2 L-serine ammonia-lyase / L-serine ammonia-lyase RSUY_RS18995 RSUY_38970 RSp0690 2.93699 hmgB fumarylacetoacetase RSUY_RS08895 RSUY_18190 RSc1384 3.10843 D-aminopeptidase RSUY_RS08865 RSUY_18130 RSc1378 3.44342 isoaspartyl peptidase...”
GQR50_20745 SIS domain-containing protein from Aeromonas hydrophila
23% identity, 79% coverage
HexR / VIMSS140465 HexR regulator of Central carbohydrate metabolism, effector 2-keto-3-deoxy-6-phosphogluconate (activator/repressor) from Neisseria meningitidis MC58
NMB1389 RpiR/YebK/YfhH family protein from Neisseria meningitidis MC58
NMA1605 putative transcriptional regulator from Neisseria meningitidis Z2491
23% identity, 90% coverage
HexR / VIMSS515299 HexR regulator of Central carbohydrate metabolism, effector 2-keto-3-deoxy-6-phosphogluconate (activator/repressor) from Chromobacterium violaceum ATCC 12472
23% identity, 90% coverage
BL1137 protein similar to hex regulon repressor from Bifidobacterium longum NCC2705
24% identity, 79% coverage
4ivnA / Q7MD38 The vibrio vulnificus nanr protein complexed with mannac-6p (see paper)
24% identity, 88% coverage
- Ligand: 2-acetamido-2-deoxy-6-o-phosphono-alpha-d-mannopyranose (4ivnA)
RSc0081 CONSERVED HYPOTHETICAL PROTEIN from Ralstonia solanacearum GMI1000
27% identity, 80% coverage
- Changes in DNA methylation contribute to rapid adaptation in bacterial plant pathogen evolution
Gopalan-Nair, PLoS biology 2024 - “...upstream unmethylated Megaplasmid RSp1529 efe 1-aminocyclopropane-1-carboxylate oxidase (Ethylene-forming enzyme) 1916009 1916012 GTTTAC upstream unmethylated Chromosome RSc0081 Transcription regulator, MurR/RpiR family 94117 94120 GTTAAC upstream hemimethylated strand Chromosome RSc0608 ripAA type III effector protein RipAA 655714 655717 GTTAAC upstream hemimethylated strand Chromosome RSc2094/RSc2095 xanR/xdhA Purine salvage pathway...”
- “...b1 b2 c2 d1 e3 a1 a4 b1 b4 c1 c2 d3 d5 e1 e3 RSc0081 GTTAAC upstream Transcriptional regulator, MurR/RpiR family 94117 6mA 6mA 6mA 6mA 6mA 6A 6mA 6mA 6mA 6mA 6mA 6mA 6mA 6A 6mA 6mA 6mA 6mA 6mA 6mA 6A 6mA 6mA...”
HexR / VIMSS7174624 HexR regulator of Central carbohydrate metabolism, effector 2-keto-3-deoxy-6-phosphogluconate (activator/repressor) from Marinomonas sp. MWYL1
22% identity, 95% coverage
HexR / VIMSS2068954 HexR regulator of Central carbohydrate metabolism, effector 2-keto-3-deoxy-6-phosphogluconate (activator/repressor) from Verminephrobacter eiseniae EF01-2
34% identity, 25% coverage
UctR / VIMSS1216 UctR regulator of Sugar utilization (repressor) from Thermotoga maritima MSB8
TM0326 transcriptional regulator, RpiR family from Thermotoga maritima MSB8
22% identity, 96% coverage
RSc1249 CONSERVED HYPOTHETICAL PROTEIN from Ralstonia solanacearum GMI1000
22% identity, 84% coverage
NGO0718 putative RpiR-family transcriptional regulator from Neisseria gonorrhoeae FA 1090
Q5F8P9 Transcriptional regulator from Neisseria gonorrhoeae (strain ATCC 700825 / FA 1090)
23% identity, 90% coverage
- Characterization of an ntrX mutant of Neisseria gonorrhoeae reveals a response regulator that controls expression of respiratory enzymes in oxidase-positive proteobacteria
Atack, Journal of bacteriology 2013 - “...starvation protein found in E. coli; and (iii) NGO0718, which is annotated as encoding an rpiR family transcription factor--most probably hexR, which encodes a...”
- “...Downregulated NGO1276 NGO1275 NGO1769 NGO1371 NGO1024 NGO1064 NGO0718 Nitrite reductase (aniA) Nitric oxide reductase (norB) Cytochrome c peroxidase (ccpR)...”
- Proteomic analysis of Neisseria gonorrhoeae biofilms shows shift to anaerobic respiration and changes in nutrient transport and outermembrane proteins
Phillips, PloS one 2012 - “...oxidase subunit II Electron transport 1.479 (1) 1.930 (1) NGO0375 Pgm; phosphoglucomutase Sugars 1.368 (1) NGO0718 d RpiR family transcriptional regulator Glycolysis/gluconeogenesis 1.217 (1) NGO0214 putative phosphotransacetylase Fermentation 1.177 (2) NGO1328 putative cytochrome Electron transport 1.152 (1) NGO1371 CcoP; cb-type cytochrome c oxidase subunit III Electron...”
- “...role assignment provided (Main role/JCVI sub-role): NGO0094, Cellular processes/DNA transformation, Cellular processes/Pathogenesis; NGO1812, Energy metabolism/Fermentation; NGO0718, Energy metabolism/Sugars. e Protein functional role category tentatively assigned in the present study by orthology to an N. meningitidis protein (see Supplementary Table S5 for details). 10.1371/journal.pone.0038303.t002 Table 2 Downregulated...”
- Transcriptional and functional analysis of the Neisseria gonorrhoeae Fur regulon
Jackson, Journal of bacteriology 2010 - “...(44, 52); MpeR (NGO0025), an AraC family regulator; NGO0718, an RpiR family regulator; and two regulators of PilE, RegF (NGO2130) and RegG (NGO2131) (see...”
- Availability of iron ions impacts physicochemical properties and proteome of outer membrane vesicles released by Neisseria gonorrhoeae
Płaczkiewicz, Scientific reports 2023 - “...Q5F5H5 J: Translation Integration host factor beta subunit Q5F906 K: Transcription Putative RpiR-family transcriptional regulator Q5F8P9 K: Transcription Transcription termination factor rho Q5FA35 K: Transcription Hypothetical protein Q5F6N0 K: Transcription 3-isopropylmalate dehydratase large subunit Q5F8T1 H: Coenzyme metabolism Putative oxygen-independent coproporphyrinogen III oxidase Q5F6H8 H: Coenzyme...”
HexR / VIMSS840087 HexR regulator of Central carbohydrate metabolism, effector 2-keto-3-deoxy-6-phosphogluconate (activator/repressor) from Colwellia psychrerythraea 34H
22% identity, 92% coverage
AL538_RS19380 MurR/RpiR family transcriptional regulator from Vibrio harveyi
23% identity, 83% coverage
VC0204 conserved hypothetical protein from Vibrio cholerae O1 biovar eltor str. N16961
23% identity, 78% coverage
RSp1556 PUTATIVE TRANSCRIPTION REGULATION REPRESSOR HEXR TRANSCRIPTION REGULATOR PROTEIN from Ralstonia solanacearum GMI1000
35% identity, 23% coverage
For advice on how to use these tools together, see Interactive tools for functional annotation of bacterial genomes.
The PaperBLAST database links 793,807 different protein sequences to 1,259,118 scientific articles. Searches against EuropePMC were last performed on March 13 2025.
PaperBLAST builds a database of protein sequences that are linked
to scientific articles. These links come from automated text searches
against the articles in EuropePMC and from manually-curated information
from GeneRIF, UniProtKB/Swiss-Prot, BRENDA, CAZy (as made available by dbCAN),
BioLiP, CharProtDB, MetaCyc, EcoCyc, TCDB, REBASE, the Fitness Browser,
and a subset of the European Nucleotide Archive with the /experiment tag.
Given this database and a protein sequence query,
PaperBLAST uses protein-protein BLAST
to find similar sequences with E < 0.001.
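As a rough illustration of the E < 0.001 cutoff, the following Python sketch filters BLAST+ tabular output (the standard 12-column `-outfmt 6` layout) and reports identity and coverage for each surviving hit. It is not PaperBLAST's own code. The function name `significant_hits`, the example rows, and the definition of coverage as the aligned query span divided by the query length are assumptions made for this sketch.

```python
import csv
import io

# Column order of BLAST+ tabular output (-outfmt 6) with default fields.
FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore"]

E_CUTOFF = 0.001  # threshold stated in the text


def significant_hits(tabular_text, query_length, e_cutoff=E_CUTOFF):
    """Return hits with E < e_cutoff, with percent identity and query coverage.

    Coverage is computed here as aligned query span / query length,
    which is an assumption of this sketch.
    """
    hits = []
    reader = csv.reader(io.StringIO(tabular_text), delimiter="\t")
    for row in reader:
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        rec = dict(zip(FIELDS, row))
        evalue = float(rec["evalue"])
        if evalue >= e_cutoff:
            continue  # not significant enough
        aligned = int(rec["qend"]) - int(rec["qstart"]) + 1
        hits.append({
            "subject": rec["sseqid"],
            "identity_pct": float(rec["pident"]),
            "coverage_pct": round(100.0 * aligned / query_length),
            "evalue": evalue,
        })
    return hits


# Made-up rows for a 285-residue query (illustrative only, not real BLAST output).
example = (
    "query\thitA\t23.0\t211\t150\t5\t40\t250\t35\t240\t2e-10\t60.1\n"
    "query\thitB\t30.0\t50\t35\t0\t1\t50\t1\t50\t0.5\t25.0\n"
)
for hit in significant_hits(example, query_length=285):
    print(hit)
```

Only `hitA` passes the cutoff in this made-up input; `hitB` is dropped because its E-value of 0.5 is above the threshold.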
To build the database, we query EuropePMC with locus tags, with RefSeq protein
identifiers, and with UniProt
accessions. We obtain the locus tags from RefSeq or from MicrobesOnline. We use
queries of the form "locus_tag AND genus_name" to try to ensure that
the paper is actually discussing that gene. Because EuropePMC indexes
most recent biomedical papers, even if they are not open access, some
of the links may be to papers that you cannot read or that our
computers cannot read. We query each of these identifiers that
appears in the open access part of EuropePMC, as well as every locus
tag that appears in the 500 most-referenced genomes, so that a gene
may appear in the PaperBLAST results even though none of the papers
that mention it are open access. We also incorporate text-mined links
from EuropePMC that link open access articles to UniProt or RefSeq
identifiers. (This yields some additional links because EuropePMC
uses different heuristics for their text mining than we do.)
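The "locus_tag AND genus_name" query form described above can be sketched in Python as follows. This is only an illustration: the helper names `locus_tag_query` and `search_url` are hypothetical, and the use of the public Europe PMC REST search endpoint with the `query` and `format` parameters is an assumption about how such a query might be submitted, not a description of PaperBLAST's pipeline.

```python
from urllib.parse import urlencode

# Public Europe PMC REST search endpoint (assumed here for illustration).
EUROPE_PMC_SEARCH = "https://www.ebi.ac.uk/europepmc/webservices/rest/search"


def locus_tag_query(locus_tag, organism):
    """Build a query of the form 'locus_tag AND genus_name'."""
    genus = organism.split()[0]  # first word of the organism name
    return f"{locus_tag} AND {genus}"


def search_url(query):
    """Return a search URL for the query (no network request is made)."""
    return f"{EUROPE_PMC_SEARCH}?{urlencode({'query': query, 'format': 'json'})}"


query = locus_tag_query("PA5506", "Pseudomonas aeruginosa PAO1")
print(query)
print(search_url(query))
```

Requiring the genus name alongside the locus tag reduces false matches when the same string appears in papers about unrelated organisms.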
For every article that mentions a locus tag, a RefSeq protein
identifier, or a UniProt accession, we try to select one or two
snippets of text that refer to the protein. If we cannot get access to
the full text, we try to select a snippet from the abstract, but
unfortunately, unique identifiers such as locus tags are rarely
provided in abstracts.
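One simple way to pick "one or two snippets" is to take a window of words around each mention of the identifier. The Python sketch below does this under stated assumptions: the function name `select_snippets`, the window size, and the case-insensitive matching are all choices made for illustration, and PaperBLAST's actual heuristics may differ.

```python
import re


def select_snippets(text, identifier, max_snippets=2, context_words=10):
    """Return up to max_snippets word windows around mentions of identifier."""
    words = text.split()
    # Match the identifier as a whole token, ignoring case and punctuation.
    pattern = re.compile(r"\b" + re.escape(identifier) + r"\b", re.IGNORECASE)
    snippets = []
    last_end = -1  # index of the last word already included in a snippet
    for i, word in enumerate(words):
        if not pattern.search(word):
            continue
        if i <= last_end:
            continue  # this mention is already covered by the previous snippet
        start = max(0, i - context_words)
        end = min(len(words), i + context_words + 1)
        snippets.append("..." + " ".join(words[start:end]) + "...")
        last_end = end - 1
        if len(snippets) == max_snippets:
            break
    return snippets


text = ("Downregulated genes included NGO0718, an RpiR family regulator, "
        "and several respiratory enzymes.")
for snippet in select_snippets(text, "NGO0718", context_words=4):
    print(snippet)
```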
PaperBLAST also incorporates manually-curated protein functions:
- Proteins from NCBI's RefSeq are included if a GeneRIF entry links
the gene to an article in PubMed®.
GeneRIF also provides a short summary of the article's claim about the
protein, which is shown instead of a snippet.
- Proteins from Swiss-Prot (the curated part of UniProt)
are included if the curators
identified experimental evidence for the protein's function (evidence
code ECO:0000269). For these proteins, the fields of the Swiss-Prot entry that
describe the protein's function are shown (with bold headings).
- Proteins from BRENDA,
a curated database of enzymes, are included if they are linked to a paper in PubMed
and their full sequence is known.
- Every protein from the non-redundant subset of BioLiP, a database
of ligand-binding sites and catalytic residues in protein structures, is included. Since BioLiP itself
does not include descriptions of the proteins, those are taken from the Protein Data Bank.
Descriptions from PDB rely on the original submitter of the
structure and cannot be updated by others, so they may be less reliable.
(For SitesBLAST and Sites on a Tree, we use a larger subset of BioLiP so that every
ligand is represented among a group of structures with similar sequences, but for
PaperBLAST, we use the non-redundant set provided by BioLiP.)
- Every protein from EcoCyc, a curated
database of the proteins in Escherichia coli K-12, is included, regardless
of whether it is characterized.
- Proteins from the MetaCyc metabolic pathway database
are included if they are linked to a paper in PubMed and their full sequence is known.
- Proteins from the Transport Classification Database (TCDB)
are included if they have known substrate(s), have reference(s),
and are not described as uncharacterized or putative.
(Some of the references are not visible on the PaperBLAST web site.)
- Every protein from CharProtDB,
a database of experimentally characterized protein annotations, is included.
- Proteins from the CAZy database of carbohydrate-active enzymes
are included if they are associated with an Enzyme Classification number.
Even though CAZy does not provide links from individual protein sequences to papers,
these should all be experimentally-characterized proteins.
- Proteins from the REBASE database
of restriction enzymes are included if they have known specificity.
- Every protein with an evidence-based reannotation (based on mutant phenotypes)
in the Fitness Browser is included.
- Sequence-specific transcription factors (including sigma factors and DNA-binding response regulators)
with experimentally-determined DNA binding sites from the
PRODORIC database of gene regulation in prokaryotes are included.
- Putative transcription factors from RegPrecise
are included if they have manually-curated predictions for their binding sites. These predictions are based on
conserved putative regulatory sites across genomes that contain similar transcription factors,
so PaperBLAST clusters the TFs at 70% identity and retains just one member of each cluster.
- Coding sequence (CDS) features from the
European Nucleotide Archive (ENA)
are included if the /experiment tag is set (implying that there is experimental evidence for the annotation),
the nucleotide entry links to paper(s) in PubMed,
and the nucleotide entry is from the STD data class
(implying that these are targeted annotated sequences, not from shotgun sequencing).
Also, to filter out genes whose transcription or translation was detected, but whose function
was not studied, nucleotide entries or papers with more than 25 such proteins are excluded.
Descriptions from ENA rely on the original submitter of the
sequence and cannot be updated by others, so they may be less reliable.
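The ENA inclusion rules in the last item of the list above can be sketched in Python as follows. The record layout (dicts with `protein_id`, `entry`, `pmids`, `experiment`, and `data_class` keys) and the function name `filter_ena_features` are hypothetical. The sketch also adopts one possible reading of the 25-protein rule: oversized nucleotide entries are dropped entirely, links to oversized papers are removed, and a protein is kept only if at least one paper link remains.

```python
from collections import Counter

MAX_PROTEINS = 25  # threshold stated in the text


def filter_ena_features(features, max_proteins=MAX_PROTEINS):
    """Apply the ENA inclusion rules to a list of CDS feature records."""
    # Keep only features with the /experiment tag, a PubMed link,
    # and a nucleotide entry in the STD data class.
    candidates = [f for f in features
                  if f["experiment"] and f["pmids"] and f["data_class"] == "STD"]

    # Count candidate proteins per nucleotide entry and per paper.
    per_entry = Counter(f["entry"] for f in candidates)
    per_paper = Counter(pmid for f in candidates for pmid in set(f["pmids"]))

    kept = []
    for f in candidates:
        if per_entry[f["entry"]] > max_proteins:
            continue  # entry lists too many proteins to be a targeted study
        pmids = [p for p in f["pmids"] if per_paper[p] <= max_proteins]
        if pmids:
            kept.append({**f, "pmids": pmids})
    return kept
```

The counts are taken over the candidate set rather than over all features, so that only proteins with the /experiment tag contribute to the 25-protein limit ("such proteins" in the text).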
Except for GeneRIF and ENA,
the curated entries include a short curated
description of the protein's function.
For entries from BioLiP, the protein's function may not be known beyond binding to the ligand.
Many of these entries also link to articles in PubMed.
For more information see the PaperBLAST paper (mSystems 2017) or the code.
You can download PaperBLAST's database here.
Changes to PaperBLAST since the paper was written:
- November 2023: incorporated PRODORIC and RegPrecise. Many PRODORIC entries were not linked to a protein sequence (no UniProt identifier), so we added this information.
- February 2023: BioLiP changed their download format. PaperBLAST now includes their non-redundant subset. SitesBLAST and Sites on a Tree use a larger non-redundant subset that ensures that every ligand is represented within each cluster. This should ensure that every binding site is represented.
- June 2022: incorporated some coding sequences from ENA with the /experiment tag.
- March 2022: incorporated BioLiP.
- April 2020: incorporated TCDB.
- April 2019: EuropePMC now returns table entries in their search results. This has expanded PaperBLAST's database, but most of the new entries are of low relevance, and the resulting snippets are often just lists of locus tags with annotations.
- February 2018: the alignment page reports the conservation of the hit's functional sites (if available from Swiss-Prot or UniProt).
- January 2018: incorporated BRENDA.
- December 2017: incorporated MetaCyc, CharProtDB, CAZy, REBASE, and the reannotations from the Fitness Browser.
- September 2017: EuropePMC no longer returns some table entries in their search results. This has shrunk PaperBLAST's database, but has also reduced the number of low-relevance hits.
Many of these changes are described in Interactive tools for functional annotation of bacterial genomes.
PaperBLAST cannot provide snippets for many of the papers that are
published in non-open-access journals. This limitation applies even if
the paper is marked as "free" on the publisher's web site and is
available in PubmedCentral or EuropePMC. If a journal that you publish
in is marked as "secret," please consider publishing elsewhere.
Many important articles are missing from PaperBLAST, either because
the article's full text is not in EuropePMC (as for many older
articles), or because the paper does not mention a protein identifier such as a locus tag, or because of PaperBLAST's heuristics. If you notice an
article that characterizes a protein's function but is missing from
PaperBLAST, please notify the curators at UniProt
or add an entry to GeneRIF.
Entries in either of these databases will eventually be incorporated
into PaperBLAST. Note that to add an entry to UniProt, you will need
to find the UniProt identifier for the protein. If the protein is not
already in UniProt, you can ask them to create an entry. To add an
entry to GeneRIF, you will need an NCBI Gene identifier, but
unfortunately many prokaryotic proteins in RefSeq do not have
corresponding Gene identifiers.
References
PaperBLAST: Text-mining papers for information about homologs.
M. N. Price and A. P. Arkin (2017). mSystems, 10.1128/mSystems.00039-17.
Europe PMC in 2017.
M. Levchenko et al (2017). Nucleic Acids Research, 10.1093/nar/gkx1005.
Gene indexing: characterization and analysis of NLM's GeneRIFs.
J. A. Mitchell et al (2003). AMIA Annu Symp Proc 2003:460-464.
UniProt: the universal protein knowledgebase.
The UniProt Consortium (2016). Nucleic Acids Research, 10.1093/nar/gkw1099.
BRENDA in 2017: new perspectives and new tools in BRENDA.
S. Placzek et al (2017). Nucleic Acids Research, 10.1093/nar/gkw952.
The EcoCyc database: reflecting new knowledge about Escherichia coli K-12.
I. M. Keseler et al (2016). Nucleic Acids Research, 10.1093/nar/gkw1003.
The MetaCyc database of metabolic pathways and enzymes.
R. Caspi et al (2018). Nucleic Acids Research, 10.1093/nar/gkx935.
CharProtDB: a database of experimentally characterized protein annotations.
R. Madupu et al (2012). Nucleic Acids Research, 10.1093/nar/gkr1133.
The carbohydrate-active enzymes database (CAZy) in 2013.
V. Lombard et al (2014). Nucleic Acids Research, 10.1093/nar/gkt1178.
The Transporter Classification Database (TCDB): recent advances.
M. H. Saier, Jr. et al (2016). Nucleic Acids Research, 10.1093/nar/gkv1103.
REBASE - a database for DNA restriction and modification: enzymes, genes and genomes.
R. J. Roberts et al (2015). Nucleic Acids Research, 10.1093/nar/gku1046.
Deep annotation of protein function across diverse bacteria from mutant phenotypes.
M. N. Price et al (2016). bioRxiv, 10.1101/072470.
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory