PaperBLAST
PaperBLAST Hits for Q02287 T-protein (Enterobacter agglomerans) (373 a.a., MVAELTALRD...)
>Q02287 T-protein (Enterobacter agglomerans)
MVAELTALRDQIDSVDKALLDLLAKRLELVAEVGEVKSRYGLPIYVPEREASMLASRRKE
AEALGVPPDLIEDVLRRVMRESYTSENDKGFKTLCPELRPVVIVGGKGQMGRLFEKMLGL
SGYTVKTLDKEDWPQAETLLSDAGMVIISVPIHLTEQVIAQLPPLPEDCILVDLASVKNR
PLQAMLAAHNGPVLGLHPMFGPDSGSLAKQVVVWCDGRQPEAYQWFLEQIQVWGARLHRI
SAVEHDQNMAFIQALRHFATFAYGLHLAEENVNLDQLLALSSPIYRLELAMVGRLFAQDP
QLYADIIMSSESNLALIKRYYQRFGEAIALLEQGDKQAFIASFNRVEQWFGDHAKRFLVE
SRSLLRSANDSRP
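As a quick sanity check, the FASTA record above can be parsed to confirm that the sequence length matches the 373 a.a. stated in the page header. This is a minimal sketch; the `read_fasta` helper and the `QUERY_FASTA` variable are illustrative names, not part of PaperBLAST.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a dict mapping header -> sequence."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts; drop the leading '>' from the header.
            header = line[1:]
            records[header] = []
        elif header is not None:
            # Sequence lines are concatenated in order.
            records[header].append(line)
    return {h: "".join(parts) for h, parts in records.items()}


# Query record copied from the page above.
QUERY_FASTA = """>Q02287 T-protein (Enterobacter agglomerans)
MVAELTALRDQIDSVDKALLDLLAKRLELVAEVGEVKSRYGLPIYVPEREASMLASRRKE
AEALGVPPDLIEDVLRRVMRESYTSENDKGFKTLCPELRPVVIVGGKGQMGRLFEKMLGL
SGYTVKTLDKEDWPQAETLLSDAGMVIISVPIHLTEQVIAQLPPLPEDCILVDLASVKNR
PLQAMLAAHNGPVLGLHPMFGPDSGSLAKQVVVWCDGRQPEAYQWFLEQIQVWGARLHRI
SAVEHDQNMAFIQALRHFATFAYGLHLAEENVNLDQLLALSSPIYRLELAMVGRLFAQDP
QLYADIIMSSESNLALIKRYYQRFGEAIALLEQGDKQAFIASFNRVEQWFGDHAKRFLVE
SRSLLRSANDSRP
"""

records = read_fasta(QUERY_FASTA)
header, sequence = next(iter(records.items()))
print(header.split()[0], len(sequence))  # prints "Q02287 373"
```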
Found 69 similar proteins in the literature:
Q02287 T-protein from Enterobacter agglomerans
100% identity, 100% coverage
UTI89_C2933 bifunctional chorismate mutase/prephenate dehydratase from Escherichia coli UTI89
88% identity, 100% coverage
- Genetic requirements for uropathogenic E. coli proliferation in the bladder cell infection cycle
Mediati, mSystems 2024 - “...catabolic genes, L-carnitine metabolism) 3.54 8.81E-03 9.40E-02 UTI89_C3096 ygbA Conserved hypothetical protein 3.48 2.45E-02 1.85E-01 UTI89_C2933 tyrA Chorismate mutase/prephenate (aromatic amino acid biosynthesis) 3.29 5.97E-03 4.79E-02 UTI89_C1105 Hypothetical protein 3.28 2.63E-02 2.01E-01 UTI89_C0231 gloB Hydroxyacylglutathione hydrolase (glycerol assimilation via methylglyoxyl) 3.27 2.67E-02 2.25E-01 UTI89_C2747 cysK O-acetylserine...”
A0A140N544 T-protein from Escherichia coli (strain B / BL21-DE3)
88% identity, 100% coverage
ECs3463 chorismate mutase-T / prephenate dehydrogenase from Escherichia coli O157:H7 str. Sakai
88% identity, 100% coverage
TyrA / b2600 fused chorismate mutase/prephenate dehydrogenase (EC 5.4.99.5; EC 1.3.1.12) from Escherichia coli K-12 substr. MG1655 (see 4 papers)
tyrA / P07023 fused chorismate mutase/prephenate dehydrogenase (EC 5.4.99.5; EC 1.3.1.12) from Escherichia coli (strain K12) (see 12 papers)
P07023 T-protein from Escherichia coli (strain K12)
b2600 fused chorismate mutase T/prephenate dehydrogenase from Escherichia coli str. K-12 substr. MG1655
88% identity, 100% coverage
- Identification of enzymes and regulatory proteins in Escherichia coli that are oxidized under nitrogen, carbon, or phosphate starvation
Noda, Proceedings of the National Academy of Sciences of the United States of America 2007 - “...P0A9P1 P0A6M9 P0A6Q0 Q59346 P23847 P0A7A0 P23721 P0AD61 P07023 Protein name Protein MW Protein PI Dihydrolipoyl dehydrogenase, 3 Elongation factor G (EF-G), 2...”
- Analysis of the pmsCEAB gene cluster involved in biosynthesis of salicylic acid and the siderophore pseudomonine in the biocontrol strain Pseudomonas fluorescens WCS374
Mercado-Blanco, Journal of bacteriology 2001 - “...influenzae (P43902), Erwinia herbicola (Q02287), and E. coli (P07023). The homology was located in the N-terminal domain of TyrA. TyrA proteins are larger than...”
- Ferric Citrate Uptake Is a Virulence Factor in Uropathogenic Escherichia coli
Frick-Cheng, mBio 2022 - “...sufE Cysteine desulfuration protein 3.8 b1679 sufS Selenocysteine lyase 4.1 b1680 tyrA Chorismate mutase 3.7 b2600 ybdZ Enterobactin biosynthesis protein 6.1 b4511 ybgS Uncharacterized protein 4.3 b0753 ybiX PKHD-type hydroxylase 3.7 b0804 yciG Uncharacterized protein 4.5 b1259 yddM Putative DNA-binding transcriptional regulator 4.6 b1477 ydiE Uncharacterized...”
- Gap-filling analysis of the iJO1366 Escherichia coli metabolic network reconstruction for discovery of metabolic functions
Orth, BMC systems biology 2012 - “...(b0073) 2.00E-15 R00732 (R) III aroA (b0908) 5.00E-32 murA (b3189) 7.00E-8 R00733 (R) III tyrA (b2600) 2.80E-2 R01393 (R) I global orphan R01618 (R) IV glgP (b3428) 2.10 R01713 (F) I global orphan R01731 (F) IV tyrB (b4054) * R01785 (R) III rhaD (b3902) * R01902...”
- More than just a metabolic regulator--elucidation and validation of new targets of PdhR in Escherichia coli
Göhler, BMC systems biology 2011 - “...0.977 2.124 b0628 lipA * -0.075 3.914 b1596 ynfM 0.927 0.007 b4052 dnaB -0.018 3.864 b2600 tyrA 0.915 2.065 b0085 murE $ 0.167 3.659 b2505 yfgH 0.820 0.425 b0822 ybiV 0.284 3.653 b0333 prpC 0.729 0.021 b2683 ygaH -0.077 3.628 b0331 prpB 0.713 0.024 b0436 tig...”
- Genome-scale analysis to the impact of gene deletion on the metabolism of E. coli: constraint-based simulation approach
Xu, BMC bioinformatics 2009 - “...b1260 b0914 b1098 b1136 b1261, b1262 s0001 b2827 b1263, b1264 b3648 b1693, b2329 s0001 b2599, b2600 b3389 SS AAM FM ACM HM CM genes b0928 b1415 b1415 b2019, b2020 b2750, b2751 b3941 b3608 b2021, b2022 b2752, b2762 b2023, b2024 b2763, b2764 b2025, b2026 b3607 SS IITM...”
- Experimental and computational assessment of conditionally essential genes in Escherichia coli
Joyce, Journal of bacteriology 2006 - “...(b1261) trpC (b1262) trpD (b1263) trpE (b1264) tyrA (b2600) Group 8262 JOYCE ET AL. J. BACTERIOL. two experimentally defined essential genes. We then examined...”
- Interfering with different steps of protein synthesis explored by transcriptional profiling of Escherichia coli K-12
Sabina, Journal of bacteriology 2003 - “...hdeB b3524 b2601 b1261 b1264 b1493 b3517 b3320 b1263 b2600 b1260 b1973 b1262 b3340 b1779 b3321 b3308 b3304 b3616 b3339 b2155 b3829 b3317 b3296 b2913 b0631 b2416...”
- “...b1264 b1493 b3517 b3774 b1289 b2553 b0698 b1004 b0179 b2600 b1260 b3509 b1261 b2342 b2800 b3460 b3458 b1263 b1783 b0907 b0461 b2464 b0812 b0903 b1262 b0126...”
YP_0399 T-protein [includes: chorismate mutase and prephenate dehydrogenase] from Yersinia pestis biovar Medievalis str. 91001
86% identity, 100% coverage
ETAE_2836 bifunctional chorismate mutase/prephenate dehydrogenase from Edwardsiella tarda EIB202
78% identity, 100% coverage
VP0547 chorismate mutase/prephenate dehydrogenase from Vibrio parahaemolyticus RIMD 2210633
64% identity, 99% coverage
P43902 prephenate dehydrogenase (EC 1.3.1.12) from Haemophilus influenzae (see paper)
HI1290 chorismate mutase / prephenate dehydrogenase (tyrA) from Haemophilus influenzae Rd KW20
59% identity, 97% coverage
SO1362 chorismate mutase/prephenate dehydrogenase from Shewanella oneidensis MR-1
58% identity, 98% coverage
- Design and analysis of mismatch probes for long oligonucleotide microarrays
Deng, BMC genomics 2008 - “...Mm Na+/H+ exchanger MMP0926 44921025 Mm Chemotaxis protein cheB MMP1559 45047480 Mm Formatedehydrogenase alpha subunit SO1362 24371479 So Chorismate mutase/prephenate dehydrogenase (tyrA) SO1779 24371479 So Decaheme cytochrome c (omcA) SO2452 24371479 So Alcohol dehydrogenase, zinc-containing *DvH: Desulfovibrio vulgaris str. Hildenborough ; Mm: Methanococcus maripaludis ; So:...”
2pv7B / P43902 Crystal structure of chorismate mutase / prephenate dehydrogenase (tyra) (1574749) from haemophilus influenzae rd at 2.00 a resolution (see paper)
58% identity, 75% coverage
- Ligands: tyrosine; nicotinamide-adenine-dinucleotide (2pv7B)
FTN_0055 prephenate dehydrogenase from Francisella tularensis subsp. novicida U112
39% identity, 72% coverage
- Francisella tularensis metabolism and its relation to virulence
Meibom, Frontiers in microbiology 2010 - “...absent in the Schu S4 strain but present in both LVS (FTL_048) and subsp. novicida (FTN_0055). Bruce Stocker's pioneer work on the genetics of Salmonella enterica serovar Typhimurium ( S. typhimurium ) has demonstrated the crucial importance of the aromatic biosynthetic pathway for bacterial virulence. Mutations...”
- “...pulmonary and systemic infection in mice (Kraemer et al., 2009 ). The gene tyrA ( FTN_0055 ), which encodes the enzyme prephenate dehydrogenase converting chorismate to tyrosine, was also hit in this in vivo screen, as was aroC . Notably, the tyrA gene is absent in...”
- Genome-wide screen in Francisella novicida for genes required for pulmonary and systemic infection in mice
Kraemer, Infection and immunity 2009 - “...were lost in all three organs. These were tyrA (FTN_0055) and folD (FTN_0417), both involved in amino acid metabolism; the intracellular growth locus protein D...”
- “...Berkeley FTN_0008 FTN_0012 FTN_0023 FTN_0028 FTN_0031 FTN_0045 FTN_0055 FTN_0111 FTN_0132 FTN_0133 FTN_0143 Gene name GENOME-WIDE SCREEN IN F. NOVICIDA VOL. 77,...”
- Working toward the future: insights into Francisella tularensis pathogenesis and vaccine development
Pechous, Microbiology and molecular biology reviews : MMBR 2009 - “...FTN_0023 FTN_0028 FTN_0031 FTN_0035 FTN_0036 FTN_0045 FTN_0055 FTN_0090, FTN_1556, FTN_1061, FTN_0954 FTN_0096 FTN_0097 carA tmpT FTN_0028 FTN_0031 pyrF...”
Npun_R1269 prephenate dehydrogenase from Nostoc punctiforme
36% identity, 95% coverage
- Biotechnological Production of the Sunscreen Pigment Scytonemin in Cyanobacteria: Progress and Strategy
Gao, Marine drugs 2021 - “...substrates [ 60 , 66 , 67 ]. Except Npun_R1268 (encoding a DSBA oxidoreductase) and Npun_R1269 (encoding a prephenate dehydrogenase), other genes in Module II have at least two homologous genes. The redundancy of the aromatic amino acid biosynthetic genes should be beneficial for providing more...”
- Mutational Studies of Putative Biosynthetic Genes for the Cyanobacterial Sunscreen Scytonemin in Nostoc punctiforme ATCC 29133
Ferreira, Frontiers in microbiology 2016 - “...oxidative deamination of L -tryptophan to provide indole-3 pyruvic acid ( Figure 1A ). TyrA (Npun_R1269), a putative prephenate dehydrogenase, is thought to be responsible for the oxidation of prephenate to p -hydroxyphenylpyruvic acid ( Gao and Garcia-Pichel, 2011 ). Subsequently ScyA (Npun_R1276), a thiamin-dependent enzyme,...”
COO91_00780 bifunctional chorismate mutase/prephenate dehydrogenase from Nostoc flagelliforme CCNUN1
34% identity, 95% coverage
all0418 chorismate mutase/prephenate dehydrogenase from Nostoc sp. PCC 7120
43% identity, 66% coverage
FTL_0048 prephenate dehydrogenase from Francisella tularensis subsp. holarctica
42% identity, 57% coverage
- Francisella tularensis metabolism and its relation to virulence
Meibom, Frontiers in microbiology 2010 - “...the tyrA gene is absent in the Schu S4 strain but present in LVS ( FTL_0048 ). aroC encodes the enzyme performing the last step of chorismic acid synthesis. The gene aroE1 , which encodes shikimate-5-dehydrogenase, the fourth step of the biosynthetic pathway, was identified in...”
J9XQS6 prephenate dehydrogenase (EC 1.3.1.12) from uncultured bacterium (see paper)
41% identity, 67% coverage
D3S601 Prephenate dehydrogenase from Methanocaldococcus sp. (strain FS406-22)
30% identity, 57% coverage
MMP1514 Prephenate dehydrogenase from Methanococcus maripaludis S2
30% identity, 59% coverage
MM1275 Prephenate dehydrogenase from Methanosarcina mazei Goe1
30% identity, 54% coverage
A0A101IGG2 prephenate dehydrogenase (NADP+) (EC 1.3.1.13) from Methanothrix harundinacea (see paper)
30% identity, 67% coverage
DVU0464 prephenate and/or arogenate dehydrogenase from Desulfovibrio vulgaris Hildenborough JW710
DVU0464 prephenate dehydrogenase from Desulfovibrio vulgaris Hildenborough
29% identity, 69% coverage
- mutant phenotype: Important for fitness in defined media, and rescued by added tyrosine. This is the second or third step in tyrosine synthesis from chorismate via prephenate; it is not clear if dehydrogenation or transamination occurs first.
- Response of Desulfovibrio vulgaris to alkaline stress
Stolyar, Journal of bacteriology 2007 - “...DVU0285 DVU0286 DVU0339 DVU0460 DVU0461 DVU0462 DVU0463 DVU0464 DVU0465 DVU0466 DVU0467 DVU0468 DVU0469 DVU0470 DVU0471 DVU0663 DVU0890 DVU1466 DVU1585 DVU1609...”
CPI83_19940 prephenate dehydrogenase dimerization domain-containing protein from Rhodococcus sp. H-CA8f
29% identity, 71% coverage
A8AAX2 prephenate dehydrogenase (NADP+) (EC 1.3.1.13) from Ignicoccus hospitalis (see 2 papers)
27% identity, 68% coverage
O30012 prephenate dehydrogenase (EC 1.3.1.12); prephenate dehydratase (EC 4.2.1.51); chorismate mutase (EC 5.4.99.5) from Archaeoglobus fulgidus (see paper)
AF0227 chorismate mutase/prephenate dehydratase (pheA) from Archaeoglobus fulgidus DSM 4304
27% identity, 41% coverage
- Characterization of a key trifunctional enzyme for aromatic amino acid biosynthesis in Archaeoglobus fulgidus
Lim, Extremophiles : life under extreme conditions 2009 (PubMed)- “...proteins or fusions combining two activities. Gene locus AF0227 of Archaeoglobus fulgidus is predicted to encode a fusion protein, AroQ, containing all three...”
- “...to be particularly interesting since its gene at locus AF0227 is predicted to encode a protein containing three domains harboring putative CM, PDT, and PDHG...”
- Evolution of gene fusions: horizontal transfer versus independent events
Yanai, Genome biology 2002 - “...of the fused gene (Figure 4a,4b ). The single archaeal fusion, the Arachaeoglobus fulgidus protein AF0227, belongs to one of these clusters and shows a strongly supported affinity with the ortholog from the hyperthermophilic bacterium Thermotoga maritima . (Figure 4a,4b ). Given the broad distribution of...”
- “...(COG1605); (b) prephenate dehydratase (COG0077). Protein designations are as in Figure 2 . The protein AF0227 contains a prephenate dehydrogenase domain in addition to the chorismate mutase and prephenate dehydratase domains. Figure 5 Phylogenetic trees for fusion-linked COGs: and subunits of acetyl-CoA carboxylase. (a) subunit (domain)...”
- Identification of 86 candidates for small non-messenger RNAs from the archaeon Archaeoglobus fulgidus
Tang, Proceedings of the National Academy of Sciences of the United States of America 2002 - “...() AF1049 () AF1987 () AF1017 () AF0790 () AF2390 () AF0227 () AF0896 () AF0701 () AF1489 () AF1277 () AF1444 () AF0208 () AF0595 () AF0592 () AF0597 () AF2236...”
- Prephenate dehydratase from the aphid endosymbiont (Buchnera) displays changes in the regulatory domain that suggest its desensitization to inhibition by phenylalanine
Jiménez, Journal of bacteriology 2000 - “...pombe, O14361), PDTARCFU (Archaeoglobus fulgidus, O30012), PDTAQUAE (Aquifex aeolicus, O67085), PDTXANCA (Xanthomonas campestris, O87954), PDTPSEAER (P....”
Ddes_0334 Prephenate dehydrogenase from Desulfovibrio desulfuricans subsp. desulfuricans str. ATCC 27774
29% identity, 61% coverage
- Coordinated response of the Desulfovibrio desulfuricans 27774 transcriptome to nitrate, nitrite and nitric oxide
Cadby, Scientific reports 2017 - “...Glycosyl transferase family 9 1.75 0.00129 Ddes_0333 Major facilitator family membrane transport protein 1.79 0.00336 Ddes_0334 Prephenate dehydrogenase 1.85 0.00164 Ddes_0335 3-phosphoshikimate 1-carboxyvinyltransferase 2.29 4.16e-5 Ddes_0336 Chorismate mutase 2.47 3.53e-6 Ddes_0337 3-dehydroquinate synthase 1.87 0.00292 Ddes_0525 4Fe-4S ferredoxin family 1.94 0.00030 Ddes_0526 Pyridoxamine 5-phosphate oxidase-related FMN-binding...”
- “...0.00303 H Ddes_0289 SAM-binding methylase 2.41 1.59E-06 R Ddes_0290 alaS Alanyl-tRNA synthetase 1.49 0.0132 J Ddes_0334 Prephenate dehydrogenase 1.52 0.0206 E Ddes_0335 3-phosphoshikimate 1-carboxyvinyltransferase 1.35 0.0377 E Ddes_0336 Chorismate mutase 1.56 0.00875 E Ddes_0337 3-dehydroquinate synthase 1.45 0.0119 E Ddes_0338 Fructose-bisphosphate aldolase 1.34 0.0334 G Ddes_0339...”
PFLU_1770 prephenate dehydrogenase dimerization domain-containing protein from Pseudomonas [fluorescens] SBW25
27% identity, 58% coverage
plu3562 No description from Photorhabdus luminescens subsp. laumondii TTO1
28% identity, 68% coverage
- Photorhabdus luminescens genes induced upon insect infection
Münch, BMC genomics 2008 - “...similar to PapB protein and to chorismate mutase/prephenate dehydrogenase plu3563 similar to p-aminobenzoic acid synthase plu3562 similar to dehydrogenase PapC of Streptomyces pristinaespiralis plu3561 probable transport protein Unknown function plu0801 PrK012399-domain, similar to Plu1012 and Plu1017 2.0-fold ( 0.3) plu1012-1010 10.6-fold ( 4.2) plu1012 PrK012399-domain, similar...”
Dde_3485 Prephenate dehydrogenase from Desulfovibrio desulfuricans G20
Dde_3485 prephenate dehydrogenase/arogenate dehydrogenase family protein from Oleidesulfovibrio alaskensis G20
29% identity, 61% coverage
PAPC_STRPR / P72540 4-amino-4-deoxyprephenate dehydrogenase; EC 1.3.1.121 from Streptomyces pristinaespiralis (see paper)
32% identity, 50% coverage
- function: Involved in pristinamycin I biosynthesis (PubMed:9044253). Probably catalyzes the formation of 3-(4-aminophenyl)pyruvate from 4-amino-4-deoxyprephenate (Probable).
catalytic activity: 4-amino-4-deoxyprephenate + NAD(+) = 3-(4-aminophenyl)pyruvate + CO2 + NADH + H(+) (RHEA:59380)
cmlC / F2RB78 4-amino-4-deoxyprephenate dehydrogenase (EC 1.3.1.121) from Streptomyces venezuelae (strain ATCC 10712 / CBS 650.69 / DSM 40230 / JCM 4526 / NBRC 13096 / PD 04745) (see 2 papers)
CMLC_STRVP / F2RB78 4-amino-4-deoxyprephenate dehydrogenase; EC 1.3.1.121 from Streptomyces venezuelae (strain ATCC 10712 / CBS 650.69 / DSM 40230 / JCM 4526 / NBRC 13096 / PD 04745) (see paper)
32% identity, 48% coverage
- function: Involved in chloramphenicol biosynthesis (PubMed:11577160). Probably catalyzes the formation of 3-(4-aminophenyl)pyruvate from 4-amino-4-deoxyprephenate (Probable).
catalytic activity: 4-amino-4-deoxyprephenate + NAD(+) = 3-(4-aminophenyl)pyruvate + CO2 + NADH + H(+) (RHEA:59380)
disruption phenotype: Disruption of the gene causes severe decrease in chloramphenicol production.
papC / BAD21141.1 4-amino-4-deoxyprephenate dehydrogenase from Streptomyces venezuelae (see paper)
32% identity, 48% coverage
H16_A0792 prephenate dehydratase, Chorismate mutase from Ralstonia eutropha H16
H16_A0792 prephenate dehydratase from Cupriavidus necator H16
57% identity, 12% coverage
TyrAAT1 / Q944B6 arogenate dehydrogenase (EC 1.3.1.78) from Arabidopsis thaliana (see paper)
TYRA1_ARATH / Q944B6 Arogenate dehydrogenase 1, chloroplastic; TYRATC; TyrAAT1; EC 1.3.1.78 from Arabidopsis thaliana (Mouse-ear cress) (see 3 papers)
Q944B6 arogenate dehydrogenase (NADP+) (EC 1.3.1.78) from Arabidopsis thaliana (see 2 papers)
AT5G34930 arogenate dehydrogenase from Arabidopsis thaliana
27% identity, 24% coverage
- function: Involved in the biosynthesis of tyrosine. Has no prephenate dehydrogenase activity.
catalytic activity: L-arogenate + NADP(+) = L-tyrosine + CO2 + NADPH (RHEA:15417)
- Densification of Genetic Map and Stable Quantitative Trait Locus Analysis for Amino Acid Content of Seed in Soybean (Glycine max L.)
Li, Plants (Basel, Switzerland) 2024 - “...AT2G39940 GO:0005515 protein binding Glyma.02g256100 AT3G47570 GO:0016301 kinase activity Glyma.02g256200 AT3G47570 GO:0005515 protein binding Glyma.02g260900 AT5G34930 GO:0000166 nucleotide binding Glyma.02g263500 AT2G40280 GO:0005576 extracellular region Glyma.02g263600 AT2G03430 GO:0005886 plasma membrane Glyma.02g263900 AT5G45840 GO:0005515 protein binding Glyma.02g265400 Null PF07172 glycine-rich protein family Glyma.02g265800 Null PF07172 glycine-rich protein family...”
- An Argon-Ion-Induced Pale Green Mutant of Arabidopsis Exhibiting Rapid Disassembly of Mesophyll Chloroplast Grana
Sanjaya, Plants (Basel, Switzerland) 2021 - “...AtMBD2 (At5g35330) and AtMBD12 (At5g35338). Five genes encoded known chloroplast-targeted proteins, namely TYRAAt1 / TyrA1 (At5g34930), encoding arogenate dehydrogenase [ 53 ]; AtCYP28 (At5g35100), encoding thylakoid lumen-localized cyclophilin [ 54 ]; AMK5 (At5g35170), encoding an envelope- and thylakoid-associated adenylate kinase protein [ 55 ]; PTM (At5g35210),...”
- “...(1.0) ECA1 gametogenesis family protein AT5G34908 Signal peptide (1.0) a ECA1 gametogenesis related family protein AT5G34930 Chloroplast stroma TYRAAt1/TyrA1; arogenate dehydrogenase [ 53 , 60 ] AT5G34940 Signal peptide (1.0) ATGUS3/GUS3; glucuronidase 3 [ 61 ] AT5G35067 Other (1.0) hypothetical protein AT5G35069 Other (0.6), Mitochondrion (0.3)...”
- Imbalance of tyrosine by modulating TyrA arogenate dehydrogenases impacts growth and development of Arabidopsis thaliana
de, The Plant journal : for cell and molecular biology 2019 (PubMed)
- Genetic dissection of vitamin E biosynthesis in tomato
Almeida, Journal of experimental botany 2011 - “...(TyrA, EC 1.3.1.78) At1g15710 ( 358) Chloroplast U567861 (1) Chloroplast SL2.31sc03731 C2_At5g34850 7 (0.4 cM) At5g34930 (640) Chloroplast ( Rippert et al., 2009 ) U570951 (2) Chloroplast SL2.31sc03771 T1212 9 (48 cM) Tyrosine aminotransferase (TAT, EC 2.6.1.5) At5g53970 (414) nd U577103(1) nd SL2.31sc05925 C2_At1g53000 10 (7.5...”
- The Biosynthetic Pathways for Shikimate and Aromatic Amino Acids in Arabidopsis thaliana
Tzin, The arabidopsis book 2010 - “...two genes encoding TyrA enzymes were identified: TyrA1 (At5g34930) and TyrA2 (At1g15710) (Rippert and Matringe, 2002b, a; Rippert et al., 2009). A second...”
- “...3 3 PAT TyrA1, ADS1 TyrA2, ADS2 At5g34930 At1g15710 2.6.1.79 1.3.1.43 1.3.1.43 Prephenate Aminotransferase Arogenate Dehydrogenase Arogenate Dehydrogenase PDH...”
- Coordinations between gene modules control the operation of plant amino acid metabolic networks
Less, BMC systems biology 2009 - “...mutase 32 CM1 AT3G29200 257746_at CM2 AT5G10870 250407_at CM3 AT1G69370 260360_at arogenate dehydrogenase 33 AAT1 AT5G34930 255859_at AAT2 AT1G15710 259486_at tyrosine aminotransferase 34 TAT3 d AT2G24850 263539_at TAT AT5G53970 248207_at prephenate dehydratase 35 PD1 AT2G27820 266257_at PD AT1G08250 261758_at PD AT1G11790 262825_at PD AT3G07630 259254_at PD...”
- Differential regulation of closely related R2R3-MYB transcription factors controls flavonol accumulation in different parts of the Arabidopsis thaliana seedling
Stracke, The Plant journal : for cell and molecular biology 2007 - “...flavanone 3-hydroxylase (TT6) 253195_at At4g35420 94.7 P 32.9 P 2.9 Dihydroflavonol 4-reductase family protein 252215_at At5g34930 21.3 P 7.7 P 2.8 Arogenate dehydrogenase 266851_at At2g26820 24.3 P 8.8 A 2.8 Avirulence-responsive family protein 261773_at At1g76250 28.5 P 10.4 A 2.7 Expressed protein 258241_at At3g27650 52.9 P...”
- “...noteworthy putative PFG target genes are At1g53270 , encoding an ABC transporter family protein, and At5g34930 , encoding an arogenate dehydrogenase. The ABC transporter might be able to facilitate the transport of flavonols across the tonoplast. In the study of Tohge et al. (2005) two transporters...”
- Dynamic evolution at pericentromeres
Hall, Genome research 2006 - “...F-CTTCTTGATTTCCTCCGTCTCC, R-CCTCT TTGGTACGCTGTTAGGC; and At5g34930, F-TCTTCTCCTT CAATACTTACCT, R-TTCCAAACCCGACGACGATACCAAT. BACs identified with the peri-CEN5...”
NP_001331736 arogenate dehydrogenase from Arabidopsis thaliana
27% identity, 23% coverage
Ga0059261_2298 prephenate and/or arogenate dehydrogenase (EC 1.3.1.13) from Sphingomonas koreensis DSMZ 15582
32% identity, 41% coverage
- mutant phenotype: Important for fitness in defined media. 31% identical to arogenate dehydrogenase from Arabidopsis (Q9LMR3). The substrate could be prephenate (dehydrogenation first) or arogenate (transamination followed by dehydrogenation).
E1R5M5 arogenate dehydrogenase [NAD(P)+] (EC 1.3.1.79) from Sediminispirochaeta smaragdinae (see paper)
28% identity, 42% coverage
SCO2019 chorismate mutase from Streptomyces coelicolor A3(2)
38% identity, 21% coverage
ACIAD2222 bifunctional protein [Includes: putative prephenate or cyclohexadienyl dehydrogenase; 3-phosphoshikimate 1-carboxyvinyltransferase (5-enolpyruvylshikimate-3-phosphate synthase) (EPSP synthase) (EPSPS) (AroA)] from Acinetobacter sp. ADP1
26% identity, 27% coverage
Q74NC4 prephenate dehydratase (EC 4.2.1.51) from Nanoarchaeum equitans (see paper)
NEQ192 NEQ192 from Nanoarchaeum equitans Kin4-M
23% identity, 39% coverage
LOC100284089 arogenate dehydrogenase from Zea mays
29% identity, 39% coverage
M271_36305 chorismate mutase from Streptomyces rapamycinicus NRRL 5491
37% identity, 21% coverage
ELZ14_13330 isochorismate lyase from Pseudomonas brassicacearum
40% identity, 20% coverage
SXYL_01513 prephenate dehydrogenase from Staphylococcus xylosus
22% identity, 56% coverage
PFLU_1772 chorismate mutase from Pseudomonas [fluorescens] SBW25
27% identity, 23% coverage
Afu2g10450 prephenate dehydrogenase from Aspergillus fumigatus Af293
21% identity, 53% coverage
- A Multifaceted Role of Tryptophan Metabolism and Indoleamine 2,3-Dioxygenase Activity in Aspergillus fumigatus-Host Interactions
Choera, Frontiers in immunology 2017 - “...Afu6g12110 Isochorismate synthase AroC Aro7 Afu5g13130 Chorismate mutase PheA Pha2 Afu5g05690 Prephenate dehydratase TyrA Tyr1 Afu2g10450 Prephenate dehydrogenase AroH Aro8 Afu2g13630 AAA transaminase AroI Aro9 Afu5g02290 Trp degradation IdoA Bna2 Afu3g14250 Indoleamine 2,3-dioxygenases IDO1 IdoB Bna2 Afu4g09830 IDO2 IdoC Bna2 Afu7g02010 TDO FmdS Bna7 Afu1g09960 Kynurenine...”
- TrpE feedback mutants reveal roadblocks and conduits toward increasing secondary metabolism in Aspergillus fumigatus
Wang, Fungal genetics and biology : FG & B 2016 - “...(Afu6g04820), ADC synthetase (EC:2.6.1.85); PabaB (Afu2g01650), ADC lyase (4.1.3.38); AroC (Afu5g13130), chorismate mutase (EC:5.4.99.5); 2g10450 (Afu2g10450), prephenate dehydrogenase (EC:1.3.1.13); 5g05690 (Afu5g05690), prephenate dehydratase (EC:4.2.1.51); AroH (Afu2g13630), aromatic aminotransferase (EC:2.6.1.27; 2.6.1.57; 2.6.1.5); IcsA (Afu6g12110), isochorismate synthetase (EC:5.4.4.2); IdoA (Afu3g14250), IdoB (Afu4g09830), IdoC (Afu7g02010), indoleamine 2,3-dioxygenase (EC:1.13.11.52). Abbreviations:...”
BT3933 prephenate dehydrogenase (EC 1.3.1.13) from Bacteroides thetaiotaomicron VPI-5482
24% identity, 44% coverage
- mutant phenotype: Important for fitness in defined media. Distantly related (under 25% identity) to the prephenate dehydrogenase portion of E. coli tyrA. This is sometimes annotated as chorismate mutase as well, but it lacks that domain.
O67085 Bifunctional chorismate mutase/prephenate dehydratase from Aquifex aeolicus (strain VF5)
47% identity, 14% coverage
Echvi_0125 prephenate dehydrogenase from Echinicola vietnamensis DSM 17526
23% identity, 48% coverage
- GapMind: Automated Annotation of Amino Acid Biosynthesis
Price, mSystems 2020 - “...some defined media ( Fig.4B ). This bacterium also has a prephenate or arogenate dehydrogenase (Echvi_0125), which is important for growth in some defined media but not others ( Fig.4B ). It is difficult to understand why PAH is important for fitness in defined media unless...”
CE140_03015 isochorismate lyase from Pseudomonas thivervalensis
42% identity, 17% coverage
D820_RS07095, SMU_531 chorismate mutase from Streptococcus mutans ATCC 25175
37% identity, 21% coverage
- Transcriptomic Stress Response in Streptococcus mutans following Treatment with a Sublethal Concentration of Chlorhexidine Digluconate
Muehler, Microorganisms 2022 - “...D820_RS02990 4.00 10 6 Protein modification D820_RS05715, D820_RS02255, D820_RS08010 2.00 10 2 L-phenylalanine biosynthesis D820_RS03740, D820_RS07095 2.00 10 2 microorganisms-10-00561-t005_Table 5 Table 5 Validation of differentially expressed genes using qRT-PCR. Transcript levels of selected genes ( Table 1 ) were corrected to gyrB . Each value...”
- Genome-Wide Identification of Novel sRNAs in Streptococcus mutans
Krieger, Journal of bacteriology 2022
- [Transcriptomic Analysis of csn2 Gene Mutant Strains of Streptococcus mutans CRISPR-Cas9 System]
He, Sichuan da xue xue bao. Yi xue ban = Journal of Sichuan University. Medical science edition 2021
- Remodeling of the Streptococcus mutans proteome in response to LrgAB and external stresses
Ahn, Scientific reports 2017 - “...bottom). Notable protein abundance changes unique to aeration include upregulated accumulation of putative chorismate mutase (SMU_531), cell division protein FtsX, and putative mannose specific EIID component (SMU_1957), which were newly produced only in response to aeration, but not without aeration (Supplemental Table S3 ). It is...”
TK0259 prephenate dehydrogenase from Thermococcus kodakaraensis KOD1
24% identity, 45% coverage
- Proteome profiling of heat, oxidative, and salt stress responses in Thermococcus kodakarensis KOD1
Jia, Frontiers in microbiology 2015 - “...27.97 26.0 13 Metal-dependent phosphohydrolase TK1944 25 2.4 5.76 5.0 30.00 30.8 14 Prephenate dehydrogenase TK0259 38 2.6 5.29 5.9 29.25 31.3 15 Ferredoxin: NADP oxidoreductase TK1685 28 3.0 5.76 5.0 32.50 33.5 16 Protein disulphide oxidoreductase TK1085 39 3.2 4.72 4.0 25.28 24.3 17 Hypothetical...”
- “...stresses. TK0955 and TK1110 in mannose metabolism were only up regulated under heat stress. TK0254, TK0259, TK0268, TK1379, TK1431, TK1447, and TK2217 that were up-regulated by different stresses may participate in amino acids synthesis. Among them, TK1379, TK1431, and TK1447 were increased under both heat and...”
O25931 Prephenate dehydrogenase (TyrA) from Helicobacter pylori (strain ATCC 700392 / 26695)
HP1380 prephenate dehydrogenase (tyrA) from Helicobacter pylori 26695
20% identity, 56% coverage
Q5Z9H5 Os06g0708832 protein from Oryza sativa subsp. japonica
23% identity, 58% coverage
- Physiological and Proteomic Analysis of Various Priming on Rice Seed under Chilling Stress
Zhang, Plants (Basel, Switzerland) 2024 - “...Os07g0681400 Q7XHW4 Probable calcium-binding protein 3.629 3.103 UP Os02g0169900 Q6H6B9 Inositol-1-monophosphatase 0.010 0.010 Down Os06g0708832 Q5Z9H5 Similar to arogenate dehydrogenase. 0.010 0.010 Down Os02g0738900 Q0DXR0 Dynamin GTPase 0.135 0.179 Down Os05g0595100 Q8LNZ3 UDP-glucose 4-epimerase 0.190 0.186 Down Os01g0633100 Q7G065 Glucose-1-phosphate adenylyltransferase large subunit 0.232 6.741 Down...”
Q0PBJ3 Bifunctional chorismate mutase/prephenate dehydratase from Campylobacter jejuni subsp. jejuni serotype O:2 (strain ATCC 700819 / NCTC 11168)
34% identity, 25% coverage
- Novel Drug Targets for Food-Borne Pathogen Campylobacter jejuni: An Integrated Subtractive Genomics and Comparative Metabolic Pathway Study
Mehla, Omics : a journal of integrative biology 2015 - “...Q0PAL6 Q0PAZ1 Q0P9X4 Q9PJ53 Q0PC20 Q9PIK3 Q9PIK2 Q9PIK1 Q0PBJ3 Q0PBA5 Q9PI11 P0C632 P0C630 Q9PNT2 Q9PMV3 Q0P8N9 Q9PM41 Q9PIT2 Q0PAS0 Q0PB07 Q9PIM1 Q9PHU0 Q9PIZ5...”
- “...14 Q9PIK3 15 Q9PIK2 16 Q9PIK1 17 Q0PBJ3 Drug name DB00229, DB00267, DB00301, Cefotiam, Cefmenoxime, Flucloxacillin, Penicillin V, DB00417, DB00456, DB00713,...”
AroDH-1 / B4FY98 arogenate dehydrogenase 1 (EC 1.3.1.43) from Zea mays (see paper)
25% identity, 58% coverage
DET0461 chorismate mutase/prephenate dehydratase from Dehalococcoides ethenogenes 195
36% identity, 20% coverage
Ssal_00456 chorismate mutase from Streptococcus salivarius 57.I
38% identity, 20% coverage
CHMU_METJA / Q57696 Chorismate mutase; CM; Monofunctional chorismate mutase AroQ(f); EC 5.4.99.5 from Methanocaldococcus jannaschii (strain ATCC 43067 / DSM 2661 / JAL-1 / JCM 10045 / NBRC 100440) (Methanococcus jannaschii) (see paper)
39% identity, 16% coverage
- function: Catalyzes the conversion of chorismate into prephenate via a Claisen rearrangement.
catalytic activity: chorismate = prephenate (RHEA:13897)
subunit: Homodimer.
SMc03858 PUTATIVE CHORISMATE MUTASE PROTEIN from Sinorhizobium meliloti 1021
33% identity, 21% coverage
Ddes_1346 chorismate mutase related enzyme from Desulfovibrio desulfuricans subsp. desulfuricans str. ATCC 27774
36% identity, 20% coverage
B488_11240 prephenate/arogenate dehydrogenase family protein from Liberibacter crescens BT-1
23% identity, 59% coverage
Ddes_0336 chorismate mutase from Desulfovibrio desulfuricans subsp. desulfuricans str. ATCC 27774
40% identity, 19% coverage
- Coordinated response of the Desulfovibrio desulfuricans 27774 transcriptome to nitrate, nitrite and nitric oxide
Cadby, Scientific reports 2017 - “...membrane transport protein 1.79 0.00336 Ddes_0334 Prephenate dehydrogenase 1.85 0.00164 Ddes_0335 3-phosphoshikimate 1-carboxyvinyltransferase 2.29 4.16e-5 Ddes_0336 Chorismate mutase 2.47 3.53e-6 Ddes_0337 3-dehydroquinate synthase 1.87 0.00292 Ddes_0525 4Fe-4S ferredoxin family 1.94 0.00030 Ddes_0526 Pyridoxamine 5-phosphate oxidase-related FMN-binding 2.6 1.67e-6 Ddes_0527 Flavodoxin family protein 2.05 9.34e-5 Ddes_0528 CRP-family...”
- “...1.49 0.0132 J Ddes_0334 Prephenate dehydrogenase 1.52 0.0206 E Ddes_0335 3-phosphoshikimate 1-carboxyvinyltransferase 1.35 0.0377 E Ddes_0336 Chorismate mutase 1.56 0.00875 E Ddes_0337 3-dehydroquinate synthase 1.45 0.0119 E Ddes_0338 Fructose-bisphosphate aldolase 1.34 0.0334 G Ddes_0339 Pyridoxal phosphate-dependent D-cysteine desulfhydrase family 1.67 0.0392 E Ddes_0382 cooS Carbon monoxide...”
Fisuc_2558 Chorismate mutase from Fibrobacter succinogenes subsp. succinogenes S85
32% identity, 22% coverage
- Generation and Characterization of Acid Tolerant Fibrobacter succinogenes S85
Wu, Scientific reports 2017 - “...transcription regulators Fisuc_0335, Fisuc_0933 and Fisuc_1186, a diguanylate cyclase Fisuc_2957, and genes Fisuc_0137, Fisuc_0138 and Fisuc_2558, which are involved in tryptophan metabolism. A gene ontology enrichment analysis was performed on differentially expressed genes. Among the up-regulated genes in the pH 5.65 samples, category V: defense mechanisms...”
- “...Fisuc_2091 2.5 0.001303 Rubredoxin-type Fe(Cys)4 protein Fisuc_2123 2.3 0.002766 4Fe-4S ferredoxin iron-sulfur binding domain protein Fisuc_2558 1.9 2.99E-15 Chorismate mutase Fisuc_2559 2.0 1.03E-16 Prephenate dehydrogenase Fisuc_2908 2.1 0.000191 (Sulfur transfer protein involved in) thiamine biosynthesis protein ThiS Acid survival for E . coli knockout mutants To...”
GSU2608 chorismate mutase/prephenate dehydratase from Geobacter sulfurreducens PCA
39% identity, 22% coverage
ZMO0563 chorismate mutase from Zymomonas mobilis subsp. mobilis ZM4
37% identity, 21% coverage
- Model-driven analysis of mutant fitness experiments improves genome-scale metabolic models of Zymomonas mobilis ZM4
Ong, PLoS computational biology 2020 - “...Annotation ZMO0201 Glutamine amidotransferase of 4-amino-4-deoxychorismate synthase (isozyme) MEGS pabA Glutamine amidotransferase of anthranilate synthase ZMO0563 Chorismate-pyruvate lyase MEGS ubiC Chorismate mutase ZMO1008 Erythronate-4-phosphate dehydrogenase MEGS pdxB FAD linked oxidase domain protein ZMO1518 Histidinol phosphatase Bar-Seq Correlation N/A Inositol-monophosphatase ZMO1916 Pimeloyl-ACP methyl ester esterase MEGS bioH...”
- “...ZMO1916 (annotated as a hypothetical protein) likely encodes a pimeloyl-ACP methyl ester esterase. ZMO0562 and ZMO0563 were found together on the plasmid complementing the growth of the ubiC E . coli mutant. These two genes were cloned separately into the E . coli ubiC strain and...”
plu3564 No description from Photorhabdus luminescens subsp. laumondii TTO1
26% identity, 23% coverage
- Photorhabdus luminescens genes induced upon insect infection
Münch, BMC genomics 2008 - “...putative methylase and protoporphyrinogen oxidase plu3565 similar to class II aminotransferase and 5-aminolevulinic acid synthase plu3564 weakly similar to PapB protein and to chorismate mutase/prephenate dehydrogenase plu3563 similar to p-aminobenzoic acid synthase plu3562 similar to dehydrogenase PapC of Streptomyces pristinaespiralis plu3561 probable transport protein Unknown function...”
PGN_1053 putative phospho-2-dehydro-3-deoxyheptonate aldolase/chorismate mutase from Porphyromonas gingivalis ATCC 33277
PG0885 phospho-2-dehydro-3-deoxyheptonate aldolase/chorismate mutase from Porphyromonas gingivalis W83
30% identity, 21% coverage
- Insights into Dynamic Polymicrobial Synergy Revealed by Time-Coursed RNA-Seq
Hendrickson, Frontiers in microbiology 2017 - “...PGN_0972 TPR motif PGN_0569 PGN_2050 PGN_0185 fimE PGN_1903 PGN_0172 PGN_1423 PGN_1673 PGN_0580 PGN_1413 PGN_0380 PGN_1145 PGN_1053 PGN_0180 fimA PGN_1507 PGN_0388 PGN_0972 TPR motif Figure 3 Differential expression of fimA locus genes ( PGN_0180 - PGN_0185 ) along with fimSR ( PGN_0903 and PGN_0904 ) in communities...”
- Regulon controlled by the GppX hybrid two component system in Porphyromonas gingivalis
Hirano, Molecular oral microbiology 2013 - “...0.01 PGN_1054 virulence modulating gene F 0.96 0.02 PGN_0997 putative deoxyuridine 5-triphosphate nucleotidohydrolase 0.88 0.04 PGN_1053 putative phospho-2-dehydro-3-deoxyheptonate aldolase/chorismate mutase 0.88 0.04 PGN_0315 precorrin-6x reductase/cobalamin biosynthetic protein CbiD 0.86 0.01 PGN_0316 precorrin-4 C11-methyltransferase 0.86 0.02 PGN_2023 putative phosphoribosylformylglycinamidine cyclo-ligase 0.81 0.03 PGN_1479 hypothetical protein 0.81 0.01...”
- Role of Acetyltransferase PG1842 in Gingipain Biogenesis in Porphyromonas gingivalis
Mishra, Journal of bacteriology 2018 - “...qRT-PCR. As shown in Figure 1, the vimE, vimF, PG0885 and PG0886 genes were 127 downregulated 3.6, 6.5, 5.6 and 2.8 fold respectively. 128 To further determine...”
- “...gingivalis FLL102, the expression levels of the vimE, vimF, PG0885 and PG0886 137 genes were determined by qRT-PCR. In contrast to FLL92, expression levels of...”
- Deletion of a 77-base-pair inverted repeat element alters the synthesis of surface polysaccharides in Porphyromonas gingivalis
Bainbridge, Journal of bacteriology 2015 - “...(PG1138 to PG1142) and the VimA locus (PG0880 to PG0885) (34); however, the nature of the alterations in LPS structure have not been elucidated. The 77bpIR...”
- “...synthesis, PG1138 to PG1142 (Fig. 6B) and PG0880 to PG0885 (data not shown), were examined in the 77bpIR strain, and expression was found to be unchanged or...”
- Microarray analysis of the transcriptional responses of Porphyromonas gingivalis to polyphosphate
Moon, BMC microbiology 2014 - “...126 R: TCCGGCTCATAGACTTCCAA PG1180 F: CAGTCTGCCACAGTTCACCA 124 R: CCCTACACGGACACTACCGA PG1983 F: GCTCTGTGGTGTGGGCTATC 146 R: GGATAACAGGCAAACCCGAT PG0885 F: CAGATCCAAATCGGGACTGA 156 R: GTAGAGCAAGCCATGCAAGC PG1181 F: GATGAATTCGGGCGGATAAT 184 R: CCTTGAAGTGCTCCAACGAC a Based on the genome annotation provided by TIGR ( http://cmr.jcvi.org/cgi-bin/CMR/GenomePage.cgi?org=gpg ). b Primers were designed using Primer3 program...”
str1594 hypothetical protein from Streptococcus thermophilus CNRZ1066
36% identity, 20% coverage
For advice on how to use these tools together, see
Interactive tools for functional annotation of bacterial genomes.
The PaperBLAST database links 793,807 different protein sequences to 1,259,118 scientific articles. Searches against EuropePMC were last performed on March 13 2025.
PaperBLAST builds a database of protein sequences that are linked
to scientific articles. These links come from automated text searches
against the articles in EuropePMC
and from manually-curated information from GeneRIF, UniProtKB/Swiss-Prot,
BRENDA,
CAZy (as made available by dbCAN),
BioLiP,
CharProtDB,
MetaCyc,
EcoCyc,
TCDB,
REBASE,
the Fitness Browser,
and a subset of the European Nucleotide Archive with the /experiment tag.
Given this database and a protein sequence query,
PaperBLAST uses protein-protein BLAST
to find similar sequences with E < 0.001.
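The E-value cutoff above can be illustrated with a minimal sketch. It assumes the BLAST results are available in the standard 12-column tabular format (BLAST+ `-outfmt 6`), where the subject identifier, percent identity, and E-value are columns 2, 3, and 11. The function name and the example hit identifiers are hypothetical, and this is not PaperBLAST's own code.

```python
E_VALUE_CUTOFF = 1e-3  # threshold stated in the text (E < 0.001)

def filter_hits(tabular_text, cutoff=E_VALUE_CUTOFF):
    """Return (subject_id, percent_identity, evalue) for hits below the cutoff.

    Assumes the default 12-column tabular BLAST layout:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    """
    hits = []
    for line in tabular_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        subject_id = fields[1]
        percent_identity = float(fields[2])
        evalue = float(fields[10])
        if evalue < cutoff:  # strict inequality, as in "E < 0.001"
            hits.append((subject_id, percent_identity, evalue))
    return hits

# Hypothetical example rows (identifiers and values are made up for illustration).
example = "\n".join([
    "query\thitA\t88.0\t373\t45\t0\t1\t373\t1\t373\t1e-150\t520",
    "query\thitB\t25.0\t80\t60\t2\t10\t89\t5\t84\t0.5\t30.1",
])
print(filter_hits(example))
```

Only the first row passes the filter, because the second row's E-value of 0.5 is above the cutoff.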
To build the database, we query EuropePMC with locus tags, with RefSeq protein
identifiers, and with UniProt
accessions. We obtain the locus tags from RefSeq or from MicrobesOnline. We use
queries of the form "locus_tag AND genus_name" to try to ensure that
the paper is actually discussing that gene. Because EuropePMC indexes
most recent biomedical papers, even if they are not open access, some
of the links may be to papers that you cannot read or that our
computers cannot read. We query each of these identifiers that
appears in the open access part of EuropePMC, as well as every locus
tag that appears in the 500 most-referenced genomes, so that a gene
may appear in the PaperBLAST results even though none of the papers
that mention it are open access. We also incorporate text-mined links
from EuropePMC that link open access articles to UniProt or RefSeq
identifiers. (This yields some additional links because EuropePMC
uses different heuristics for their text mining than we do.)
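As a rough illustration of the "locus_tag AND genus_name" query form described above, the sketch below builds such a query string and URL-encodes it. The function names and the `query` parameter name are assumptions made for this example. How PaperBLAST actually submits its queries to EuropePMC is not described here.

```python
from urllib.parse import urlencode

def build_query(locus_tag, genus_name):
    """Combine a locus tag with its genus so that matches are more likely
    to come from papers that actually discuss that gene."""
    return f"{locus_tag} AND {genus_name}"

def encode_query(query):
    """URL-encode the query; the parameter name 'query' is an assumption."""
    return urlencode({"query": query})

# b2600 is the E. coli K-12 locus tag for tyrA, taken from the hit list.
q = build_query("b2600", "Escherichia")
print(q)
print(encode_query(q))
```

Requiring the genus name alongside the locus tag is a cheap way to reduce false matches for short or ambiguous identifiers.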
For every article that mentions a locus tag, a RefSeq protein
identifier, or a UniProt accession, we try to select one or two
snippets of text that refer to the protein. If we cannot get access to
the full text, we try to select a snippet from the abstract, but
unfortunately, unique identifiers such as locus tags are rarely
provided in abstracts.
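The snippet selection described above could look something like the following sketch. It is a simplified stand-in, not PaperBLAST's actual heuristic: it assumes a snippet is a fixed window of words around each mention of the identifier and keeps at most two non-overlapping snippets. The function name and the `window` parameter are hypothetical.

```python
import re

def select_snippets(text, identifier, max_snippets=2, window=10):
    """Return up to max_snippets word windows centred on mentions of identifier."""
    words = text.split()
    # Word boundaries keep Ddes_0336 from matching inside Ddes_03361.
    pattern = re.compile(r"\b" + re.escape(identifier) + r"\b")
    snippets = []
    next_allowed = 0  # first word index not yet covered by a snippet
    for i, word in enumerate(words):
        if i < next_allowed or not pattern.search(word):
            continue
        start = max(0, i - window)
        end = min(len(words), i + window + 1)
        snippets.append(" ".join(words[start:end]))
        next_allowed = end  # skip mentions already inside this snippet
        if len(snippets) == max_snippets:
            break
    return snippets

text = "The genes Ddes_0335 and Ddes_0336 were induced by nitrate ."
print(select_snippets(text, "Ddes_0336", window=2))
```

With full text the same function would be run over the article body. With only an abstract it would often return an empty list, which matches the observation that locus tags are rarely given in abstracts.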
PaperBLAST also incorporates manually-curated protein functions:
- Proteins from NCBI's RefSeq are included if a
GeneRIF
entry links the gene to an article in
PubMed®.
GeneRIF also provides a short summary of the article's claim about the
protein, which is shown instead of a snippet.
- Proteins from Swiss-Prot (the curated part of UniProt)
are included if the curators
identified experimental evidence for the protein's function (evidence
code ECO:0000269). For these proteins, the fields of the Swiss-Prot entry that
describe the protein's function are shown (with bold headings).
- Proteins from BRENDA,
a curated database of enzymes, are included if they are linked to a paper in PubMed
and their full sequence is known.
- Every protein from the non-redundant subset of
BioLiP,
a database
of ligand-binding sites and catalytic residues in protein structures, is included. Since BioLiP itself
does not include descriptions of the proteins, those are taken from the
Protein Data Bank.
Descriptions from PDB rely on the original submitter of the
structure and cannot be updated by others, so they may be less reliable.
(For SitesBLAST and Sites on a Tree, we use a larger subset of BioLiP so that every
ligand is represented among a group of structures with similar sequences, but for
PaperBLAST, we use the non-redundant set provided by BioLiP.)
- Every protein from EcoCyc, a curated
database of the proteins in Escherichia coli K-12, is included, regardless
of whether they are characterized or not.
- Proteins from the MetaCyc metabolic pathway database
are included if they are linked to a paper in PubMed and their full sequence is known.
- Proteins from the Transporter Classification Database (TCDB)
are included if they have known substrate(s), have reference(s),
and are not described as uncharacterized or putative.
(Some of the references are not visible on the PaperBLAST web site.)
- Every protein from CharProtDB,
a database of experimentally characterized protein annotations, is included.
- Proteins from the CAZy database of carbohydrate-active enzymes
are included if they are associated with an Enzyme Classification number.
Even though CAZy does not provide links from individual protein sequences to papers,
these should all be experimentally-characterized proteins.
- Proteins from the REBASE database
of restriction enzymes are included if they have known specificity.
- Every protein with an evidence-based reannotation (based on mutant phenotypes)
in the Fitness Browser is included.
- Sequence-specific transcription factors (including sigma factors and DNA-binding response regulators)
with experimentally-determined DNA binding sites from the
PRODORIC database of gene regulation in prokaryotes are included.
- Putative transcription factors from RegPrecise
that have manually-curated predictions for their binding sites are included. These predictions are based on
conserved putative regulatory sites across genomes that contain similar transcription factors,
so PaperBLAST clusters the TFs at 70% identity and retains just one member of each cluster.
- Coding sequence (CDS) features from the
European Nucleotide Archive (ENA)
are included if the /experiment tag is set (implying that there is experimental evidence for the annotation),
the nucleotide entry links to paper(s) in PubMed,
and the nucleotide entry is from the STD data class
(implying that these are targeted annotated sequences, not from shotgun sequencing).
Also, to filter out genes whose transcription or translation was detected, but whose function
was not studied, nucleotide entries or papers with more than 25 such proteins are excluded.
Descriptions from ENA rely on the original submitter of the
sequence and cannot be updated by others, so they may be less reliable.
Except for GeneRIF and ENA,
the curated entries include a short curated
description of the protein's function.
For entries from BioLiP, the protein's function may not be known beyond binding to the ligand.
Many of these entries also link to articles in PubMed.
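The inclusion rules for ENA coding sequences listed above can be sketched as a small filter. The record type, field names, and function names below are hypothetical. The sketch also assumes that the "more than 25" limit is counted over the CDS features that pass the first three rules, and that a protein is dropped if its nucleotide entry or any of its papers exceeds the limit. The text does not spell out those details.

```python
from collections import Counter
from dataclasses import dataclass

MAX_PROTEINS = 25  # entries or papers with more than this many proteins are excluded

@dataclass(frozen=True)
class CdsFeature:  # hypothetical record type for one ENA CDS feature
    protein_id: str
    entry_id: str             # nucleotide entry that carries the feature
    data_class: str           # ENA data class, e.g. "STD"
    has_experiment_tag: bool  # True if the /experiment tag is set
    pubmed_ids: tuple         # PubMed IDs linked from the nucleotide entry

def passes_basic_rules(cds):
    """/experiment tag set, linked to PubMed, and from the STD data class."""
    return cds.has_experiment_tag and len(cds.pubmed_ids) > 0 and cds.data_class == "STD"

def select_ena_proteins(features):
    """Apply the basic rules, then drop proteins from over-large entries or papers."""
    candidates = [c for c in features if passes_basic_rules(c)]
    per_entry = Counter(c.entry_id for c in candidates)
    per_paper = Counter(p for c in candidates for p in set(c.pubmed_ids))
    return [
        c for c in candidates
        if per_entry[c.entry_id] <= MAX_PROTEINS
        and all(per_paper[p] <= MAX_PROTEINS for p in c.pubmed_ids)
    ]
```

The size limit is what separates targeted functional studies from large-scale detection experiments, where many proteins share one entry or one paper.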
For more information see the
PaperBLAST paper (mSystems 2017)
or the code.
You can download PaperBLAST's database here.
Changes to PaperBLAST since the paper was written:
- November 2023: incorporated PRODORIC and RegPrecise. Many PRODORIC entries were not linked to a protein sequence (no UniProt identifier), so we added this information.
- February 2023: BioLiP changed their download format. PaperBLAST now includes their non-redundant subset. SitesBLAST and Sites on a Tree use a larger non-redundant subset that ensures that every ligand is represented within each cluster. This should ensure that every binding site is represented.
- June 2022: incorporated some coding sequences from ENA with the /experiment tag.
- March 2022: incorporated BioLiP.
- April 2020: incorporated TCDB.
- April 2019: EuropePMC now returns table entries in their search results. This has expanded PaperBLAST's database, but most of the new entries are of low relevance, and the resulting snippets are often just lists of locus tags with annotations.
- February 2018: the alignment page reports the conservation of the hit's functional sites (if available from Swiss-Prot or UniProt).
- January 2018: incorporated BRENDA.
- December 2017: incorporated MetaCyc, CharProtDB, CAZy, REBASE, and the reannotations from the Fitness Browser.
- September 2017: EuropePMC no longer returns some table entries in their search results. This has shrunk PaperBLAST's database, but has also reduced the number of low-relevance hits.
Many of these changes are described in Interactive tools for functional annotation of bacterial genomes.
PaperBLAST cannot provide snippets for many of the papers that are
published in non-open-access journals. This limitation applies even if
the paper is marked as "free" on the publisher's web site and is
available in PubMed Central or EuropePMC. If a journal that you publish
in is marked as "secret," please consider publishing elsewhere.
Many important articles are missing from PaperBLAST, either because
the article's full text is not in EuropePMC (as for many older
articles), or because the paper does not mention a protein identifier such as a locus tag, or because of PaperBLAST's heuristics. If you notice an
article that characterizes a protein's function but is missing from
PaperBLAST, please notify the curators at UniProt
or add an entry to GeneRIF.
Entries in either of these databases will eventually be incorporated
into PaperBLAST. Note that to add an entry to UniProt, you will need
to find the UniProt identifier for the protein. If the protein is not
already in UniProt, you can ask them to create an entry. To add an
entry to GeneRIF, you will need an NCBI Gene identifier, but
unfortunately many prokaryotic proteins in RefSeq do not have
corresponding Gene identifiers.
References
PaperBLAST: Text-mining papers for information about homologs.
M. N. Price and A. P. Arkin (2017). mSystems, 10.1128/mSystems.00039-17.
Europe PMC in 2017.
M. Levchenko et al (2017). Nucleic Acids Research, 10.1093/nar/gkx1005.
Gene indexing: characterization and analysis of NLM's GeneRIFs.
J. A. Mitchell et al (2003). AMIA Annu Symp Proc 2003:460-464.
UniProt: the universal protein knowledgebase.
The UniProt Consortium (2016). Nucleic Acids Research, 10.1093/nar/gkw1099.
BRENDA in 2017: new perspectives and new tools in BRENDA.
S. Placzek et al (2017). Nucleic Acids Research, 10.1093/nar/gkw952.
The EcoCyc database: reflecting new knowledge about Escherichia coli K-12.
I. M. Keseler et al (2016). Nucleic Acids Research, 10.1093/nar/gkw1003.
The MetaCyc database of metabolic pathways and enzymes.
R. Caspi et al (2018). Nucleic Acids Research, 10.1093/nar/gkx935.
CharProtDB: a database of experimentally characterized protein annotations.
R. Madupu et al (2012). Nucleic Acids Research, 10.1093/nar/gkr1133.
The carbohydrate-active enzymes database (CAZy) in 2013.
V. Lombard et al (2014). Nucleic Acids Research, 10.1093/nar/gkt1178.
The Transporter Classification Database (TCDB): recent advances.
M. H. Saier, Jr. et al (2016). Nucleic Acids Research, 10.1093/nar/gkv1103.
REBASE - a database for DNA restriction and modification: enzymes, genes and genomes.
R. J. Roberts et al (2015). Nucleic Acids Research, 10.1093/nar/gku1046.
Deep annotation of protein function across diverse bacteria from mutant phenotypes.
M. N. Price et al (2016). bioRxiv, 10.1101/072470.
by Morgan Price,
Arkin group
Lawrence Berkeley National Laboratory