PaperBLAST
PaperBLAST Hits for VIMSS6823329 ferredoxin (608 a.a., MVQVTFLPGK...)
>VIMSS6823329 ferredoxin
MVQVTFLPGKRAIEVSEGSTVMEAAIAAGVPLESTCGGRGTCGKCKVQVDPTLVDPALDM
GKFLSDSERKAGWVLACRYKVAEDLIVNLSESKDAHQRKTNLSQLEDIDLVPSVKKYELK
LAKPTVHDQTPDWDRLMAALPSPKIHFNRTLAAGLPQILHQSNFHVTAVVDGNALLAVEP
GDTTQKSHGLAIDIGTTTTVVYLVDLLQGKILDSDALTNPQRVFGADVISRITHAAKGPE
QLQQLQTVVVEGLNTIISRLCKRNDLKQEDIYQAVVVGNTTMSHLFLGIDPTYLAPAPFI
PVFRQSVQVKAAELGLNILKTGHVVVVPNVAGYVGADTVGVMIAAKVDQLPGYTLAVDIG
TNGEIILAGGKRILTCSTAAGPAFEGAEIKYGMRAADGAIERVKITDDVELAVIGNAKPI
GICGSGLIDAIAQMAEAGVIHESGRIVNTPEDLAKLPARIQERIRKAEGGFEFVLAWGKD
TGLKEDVVLTQKDIRELQLAKGAILAGIKILMKEMGIGLEQLDRVLLAGAFGNYISKEAA
LRIGLLPDVPLEKIRAIGNAAGDGAKMILLSKEERKRAALLAELAEHLELSTRSDFQEEF
IEALSFEK
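The query above is given in FASTA format: a `>` header line followed by wrapped sequence lines. As a minimal sketch for reusing it, the following Python reads FASTA text and reports each record's length, which for the query can be compared with the 608 a.a. stated in the page title. The function name `read_fasta` and the toy record are illustrative assumptions, not part of PaperBLAST.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a dict of {header: sequence}."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:]  # drop the leading '>'
            records[header] = []
        elif header is not None:
            records[header].append(line)  # wrapped sequence line
    # Join the wrapped lines of each record into one sequence string.
    return {h: "".join(parts) for h, parts in records.items()}


# Toy record (hypothetical, not the real query) to show the call shape.
example = ">toy_protein example\nMVQVTF\nLPGK\n"
for name, seq in read_fasta(example).items():
    print(name, len(seq))
```

Pasting the full record from the page in place of the toy string should work the same way, since the parser only relies on the `>` header convention.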
Found 78 similar proteins in the literature:
Dhaf_3310 ferredoxin from Desulfitobacterium hafniense DCB-2
100% identity, 100% coverage
- Characterization of an O-demethylase of Desulfitobacterium hafniense DCB-2
Studenik, Journal of bacteriology 2012 - “...for the genes Dhaf_1265, Dhaf_2573, Dhaf_2795, Dhaf_3310, Dhaf_3879, Dhaf_4322, Dhaf_4610, Dhaf_4611, and Dhaf_4612 (GenBank accession no. CP001336.1) as...”
- “...CCT TAT TTT TCG AAC TGC GGG TGG C 1 Dhaf_2573 Dhaf_3310 Dhaf_3879 Dhaf_4322 Dhaf_4610 Dhaf_4611 Dhaf_4612 a 2 2 2 2 2 2 2 2 For details, see Materials and...”
SSCH_450007 ASKHA domain-containing protein from Syntrophaceticus schinkii
43% identity, 98% coverage
BP07_RS03235, WP_042685513 ASKHA domain-containing protein from Methermicoccus shengliensis
42% identity, 98% coverage
- Methanogenic archaea use a bacteria-like methyltransferase system to demethoxylate aromatic compounds
Kurth, The ISME journal 2021 - “...and MtoD The gene encoding the corrinoid protein MtoC (BP07_RS03260) and the corrinoid activating enzyme (BP07_RS03235) were amplified from genomic M. shengliensis DNA with primers 3235fw/3235Srev (CTCATATGAGCGTCAGAGTAACGTTCGAGC, CTGCGGCCGCTTATTTTTCGAACTGCGGGTGGCTCCAGCTAGCTGAAGAGAGTTTTTCTCC) and 3260fw/3260Srev (CTCATATGACGGACGTAAGAGAAGAGCTC/CTGCGGCCGCTTATTTTTCGAACTGCGGGTGGCTCCAGCTAGCCTCCACCCCCACCAGAGC) for cloning in expression vector pET-30a inserting an N-terminal Strep tag via the reverse primer....”
- “...plasmid transformation. For production of the corrinoid protein MtoC (BP07_RS03260) and the corrinoid activating enzyme (BP07_RS03235) the plasmids pET-30a_BP07_RS03260 and pET-30a_BP07_RS03235 were used for transformation into E. coli Bl21 (DE3). For protein overexpression, one colony was inoculated in 600ml LB-medium containing 50g/ml kanamycin and incubated at...”
- Several ways one goal-methanogenesis from unconventional substrates
Kurth, Applied microbiology and biotechnology 2020 - “...MtvB O-demethylase BP07_RS03250 WP_042685515 Corrinoid protein BP07_RS03260 WP_042685521 MtrH-like methyltransferase BP07_RS03240 WP_042685937 Corrinoid activation protein BP07_RS03235 WP_042685513 Methanococcoides Tertiary amines ? ? ? Methanolobus vulcani Quaternary amines MtgB methyltransferase FKV42_RS08545 WP_154809802 Corrinoid protein FKV42_RS08550 WP_154809803 Corrinoid activator FKV42_RS10455 WP_154810143 CoM methyltransferase FKV42_RS10480 WP_154810148 For the organisms...”
- “...O-demethylase BP07_RS03250 WP_042685515 Corrinoid protein BP07_RS03260 WP_042685521 MtrH-like methyltransferase BP07_RS03240 WP_042685937 Corrinoid activation protein BP07_RS03235 WP_042685513 Methanococcoides Tertiary amines ? ? ? Methanolobus vulcani Quaternary amines MtgB methyltransferase FKV42_RS08545 WP_154809802 Corrinoid protein FKV42_RS08550 WP_154809803 Corrinoid activator FKV42_RS10455 WP_154810143 CoM methyltransferase FKV42_RS10480 WP_154810148 For the organisms conducting...”
Dhaf_3879 ferredoxin from Desulfitobacterium hafniense DCB-2
41% identity, 98% coverage
- Characterization of an O-demethylase of Desulfitobacterium hafniense DCB-2
Studenik, Journal of bacteriology 2012 - “...for the genes Dhaf_1265, Dhaf_2573, Dhaf_2795, Dhaf_3310, Dhaf_3879, Dhaf_4322, Dhaf_4610, Dhaf_4611, and Dhaf_4612 (GenBank accession no. CP001336.1) as Strep...”
- “...TAT TTT TCG AAC TGC GGG TGG C 1 Dhaf_2573 Dhaf_3310 Dhaf_3879 Dhaf_4322 Dhaf_4610 Dhaf_4611 Dhaf_4612 a 2 2 2 2 2 2 2 2 For details, see Materials and Methods....”
3zyyX / Q3ACS2 Reductive activator for corrinoid,iron-sulfur protein (see paper)
39% identity, 96% coverage
- Ligands: fe2/s2 (inorganic) cluster; (r,r)-2,3-butanediol (3zyyX)
Dhaf_2795 ferredoxin from Desulfitobacterium hafniense DCB-2
DSY1650 ferredoxin from Desulfitobacterium hafniense Y51
39% identity, 94% coverage
- Characterization of an O-demethylase of Desulfitobacterium hafniense DCB-2
Studenik, Journal of bacteriology 2012 - “...cassettes for the genes Dhaf_1265, Dhaf_2573, Dhaf_2795, Dhaf_3310, Dhaf_3879, Dhaf_4322, Dhaf_4610, Dhaf_4611, and Dhaf_4612 (GenBank accession no. CP001336.1)...”
- “...on March 3, 2017 by University of California, Berkeley Dhaf_2795 2 Studenik et al. oriented in the reverse direction in comparison to the orientation in the...”
- Complete genome sequence of the dehalorespiring bacterium Desulfitobacterium hafniense Y51 and comparison with Dehalococcoides ethenogenes 195
Nonaka, Journal of bacteriology 2006 - “...DSY0391 DSY0393 DSY1228 DSY1247 DSY1596 DSY1598 DSY1648 DSY1650 DSY1651 DSY1652 DSY1671 DSY1890 DSY2085 DSY2558 DSY2585 DSY3715 DSY4099 DSY4876 Predicted...”
Dhaf_2573 ferredoxin from Desulfitobacterium hafniense DCB-2
39% identity, 96% coverage
- Characterization of an O-demethylase of Desulfitobacterium hafniense DCB-2
Studenik, Journal of bacteriology 2012 - “...Expression cassettes for the genes Dhaf_1265, Dhaf_2573, Dhaf_2795, Dhaf_3310, Dhaf_3879, Dhaf_4322, Dhaf_4610, Dhaf_4611, and Dhaf_4612 (GenBank accession no....”
- “...GAT CCT TAT TTT TCG AAC TGC GGG TGG C 1 Dhaf_2573 Dhaf_3310 Dhaf_3879 Dhaf_4322 Dhaf_4610 Dhaf_4611 Dhaf_4612 a 2 2 2 2 2 2 2 2 For details, see Materials and...”
Dtox_1273 ferredoxin from Desulfotomaculum acetoxidans DSM 771
38% identity, 96% coverage
Dhaf_4322 ferredoxin from Desulfitobacterium hafniense DCB-2
39% identity, 97% coverage
- Characterization of an O-demethylase of Desulfitobacterium hafniense DCB-2
Studenik, Journal of bacteriology 2012 - “...the genes Dhaf_1265, Dhaf_2573, Dhaf_2795, Dhaf_3310, Dhaf_3879, Dhaf_4322, Dhaf_4610, Dhaf_4611, and Dhaf_4612 (GenBank accession no. CP001336.1) as Strep tag...”
- “...BamHI according to the manufacturer's protocol. For Dhaf_4322, a compatible 3318 jb.asm.org Journal of Bacteriology Downloaded from http://jb.asm.org/ on March...”
D9S251 Ferredoxin from Thermosediminibacter oceani (strain ATCC BAA-1034 / DSM 16646 / JW/IW-1228P)
38% identity, 99% coverage
- Analytical Validation of Loss of Heterozygosity and Mutation Detection in Pancreatic Fine-Needle Aspirates by Capillary Electrophoresis and Sanger Sequencing.
Timmaraju, Diagnostics (Basel, Switzerland) 2024 - “...150 chr5 119101810 119101960 GGTGTCAACAAAGTAATGTAAAG TGGATACATATTGTTTTCTGCTG 5q D5S615 330 chr5 125163290 125163620 GAGATAGGTAGGTAGGTAGG TCCACAGTGGTAAGAACCAG 9p D9S251 390 chr9 30819368 30819758 TGCATGTTTTATGTGCACTAAC CAATACTTTTTAAGGCTTTGTAGG 9p D9S254 120 chr9 126869098 126869218 TGGGTAATAACTGCCGGAGA GAGGATAAACCTGCTTCACTCAA 10q D10S520 180 chr10 96424526 96424706 CAGCCTATGCAACAGAACAAG GTCCTTGTGAGAAACTGGATGC 10q D10S523 150 chr10 87006333 87006483 GGTGGAGGTTGTGGTGA AACTGGGCATTTGTCTTTC...”
- Molecular Clues for Prediction of Hepatocellular Carcinoma Recurrence After Liver Transplantation.
Badwei, Journal of clinical and experimental hepatology 2023
- Role of Allelic Imbalance in Predicting Hepatocellular Carcinoma (HCC) Recurrence Risk After Liver Transplant.
Pagano, Annals of transplantation 2019 - “...Results We report that AI was associated with HCC recurrence in 3 main loci (D3S2303, D9S251, and D9S254). Tumor recurrence was associated only with 2 specific panels with 9 microsatellites previously reported to be associated with high risk for HCC recurrence. Our data show that fractional...”
- “...for D3S2303 (p=0.048) considering the presence of LOH ( Table 3A ), and D1S407 (p=0.006) D9S251 (p=0.02), D1S162 (p=0.005), D5S592 (p=0.005), D9S254 (p=0.002) and D10S520 (p=0.04) considering high-level LOH ( Table 3B ). Evaluation of specific panels and association with HCC recurrence Descriptive analysis of the...”
- The C9ORF72 expansion mutation is a common cause of ALS+/-FTD in Europe and has a single founder.
Smith, European journal of human genetics : EJHG 2013
- Clinical, neuroimaging and neuropathological features of a new chromosome 9p-linked FTD-ALS family.
Boxer, Journal of neurology, neurosurgery, and psychiatry 2011 - “...Genome-wide linkage analysis conclusively linked family VSM-20 to a 28.3 cM region between D9S1808 and D9S251 on chromosome 9p, reducing the published minimal linked region to a 3.7 Mb interval. Genomic sequencing and expression analysis failed to identify mutations in the 10 known and predicted genes...”
- “...GENESCAN and GENOTYPER software (Applied Biosystems) and normalised to the CEPH genotype database, except for D9S251 and D9S304 for which fragment sizes were not available. Mutation analyses In family VSM-20, a genomic DNA (gDNA) sequencing analysis was performed for all 10 candidate genes located within the...”
- Chromosome 9p21 in sporadic amyotrophic lateral sclerosis in the UK and seven other countries: a genome-wide association study.
Shatunov, The Lancet. Neurology 2010 - “...7 this 36 Mb locus is defined across studies by the flanking markers D9S169 and D9S251. The SNPs we have identified lie within this region, with the peak association at 1065 Kb. A GWAS that used pathological subtyping of patients with frontotemporal dementia to increase homogeneity...”
- Liver transplantation for hepatocellular carcinoma: extension of indications based on molecular markers.
Schwartz, Journal of hepatology 2008
- Use of microsatellite marker loss of heterozygosity in accurate diagnosis of pancreaticobiliary malignancy from brush cytology samples.
Khalid, Gut 2004
CAETHG_1606 corrinoid activation/regeneration protein AcsV from Clostridium autoethanogenum DSM 10061
35% identity, 94% coverage
Ccar_18775 corrinoid activation/regeneration protein AcsV from Clostridium carboxidivorans P7
34% identity, 94% coverage
AF_0010 ASKHA domain-containing protein from Archaeoglobus fulgidus DSM 4304
37% identity, 99% coverage
- A novel methoxydotrophic metabolism discovered in the hyperthermophilic archaeon Archaeoglobus fulgidus
Welte, Environmental microbiology 2021 - “.... Genomic and transcriptomic analysis revealed cobalamin binding protein MtoC (AF_0006) and its activator MtoD (AF_0010), Odemethylase MtoB (AF_0007) and methyl transferase MtoA (AF_0009) to be essential for growth of A. fulgidus on methoxylated aromatic compounds. CoM: coenzyme M, H 4 folate: tetrahydrofolate, CO(III): cobalamin binding...”
- “...VhtACDG (AF_137881), ATP synthase AtpAK (AF_115868), cobalamin binding protein MtoC (AF_0006) and its activator MtoD (AF_0010), Odemethylase MtoB (AF_0007) and methyl transferase MtoA (AF_0009), MFS transporters (AF_0008 & AF_0013). H 4 MPT: tetrahydromethanopterin, MQH 2 : reduced menaquinone (MQ), MFR: methanofuran, Fd: ferredoxin, F 420 H...”
B8R2M5 [Co(II) methylated amine-specific corrinoid protein] reductase (EC 1.16.99.1) from Acetobacterium dehalogenans (see paper)
WP_026395886 ASKHA domain-containing protein from Acetobacterium dehalogenans DSM 11527
34% identity, 99% coverage
TepiRe1_0615 corrinoid activation/regeneration protein AcsV from Tepidanaerobacter acetatoxydans Re1
36% identity, 93% coverage
CD0730 putative iron-sulfur protein from Clostridium difficile 630
33% identity, 94% coverage
RAMQ_EUBLI / P0DX10 Corrinoid activation enzyme RamQ from Eubacterium limosum (see 2 papers)
WP_038351871 ASKHA domain-containing protein from Eubacterium limosum
35% identity, 100% coverage
- function: Involved in the degradation of the quaternary amines L-proline betaine and L-carnitine (PubMed:31341018, PubMed:32571881). Component of a corrinoid-dependent methyltransferase system that transfers a methyl group from L-proline betaine or L-carnitine to tetrahydrofolate (THF), forming methyl-THF, a key intermediate in the Wood-Ljungdahl acetogenesis pathway (PubMed:31341018, PubMed:32571881). RamQ is not required for the methyl transfer, but it stimulates reduction of reconstituted MtqC from the Co(II) state to the Co(I) state in vitro (PubMed:31341018). It also stimulates the rate of THF methylation (PubMed:32571881).
cofactor: [2Fe-2S] cluster (Binds 1 2Fe-2S cluster.)
- MtpB, a member of the MttB superfamily from the human intestinal acetogen Eubacterium limosum, catalyzes proline betaine demethylation
Picking, The Journal of biological chemistry 2019
ELI_0370 ASKHA domain-containing protein from Eubacterium callanderi
34% identity, 100% coverage
TepiRe1_0333 ASKHA domain-containing protein from Tepidanaerobacter acetatoxydans Re1
37% identity, 100% coverage
PGA1_c15200 ATP-dependent reduction of co(II)balamin (RamA-like) (EC:2.1.1.13) from Phaeobacter inhibens DSM 17395
PGA1_c15200 ASKHA domain-containing protein from Phaeobacter inhibens DSM 17395
34% identity, 86% coverage
- mutant phenotype: Apparently required for the reactivation of vitamin B12. Distantly related to RamA (see PMID: 19043046) (auxotroph)
- Filling gaps in bacterial amino acid biosynthesis pathways with high-throughput genetics
Price, PLoS genetics 2018 - “...are likely to be involved in B12 reactivation: a protein with ferredoxin and DUF4445 domains (PGA1_c15200) and a DUF1638 protein (PGA1_c13340). As shown in Fig 4B , mutants in these genes are rescued by added methionine. The DUF4445 protein is distantly related to RamA, which uses...”
Awo_c10680 corrinoid activation/regeneration protein AcsV from Acetobacterium woodii DSM 1030
34% identity, 94% coverage
SMc04347 CONSERVED HYPOTHETICAL PROTEIN from Sinorhizobium meliloti 1021
34% identity, 88% coverage
Dred_2206 ferredoxin from Desulfotomaculum reducens MI-1
36% identity, 100% coverage
Dtur_0730 ferredoxin from Dictyoglomus turgidum DSM 6724
35% identity, 98% coverage
DET0670 iron-sulfur cluster binding protein from Dehalococcoides ethenogenes 195
DET0704 iron-sulfur cluster binding protein from Dehalococcoides ethenogenes 195
33% identity, 94% coverage
Dhaf_1265 ferredoxin from Desulfitobacterium hafniense DCB-2
36% identity, 69% coverage
- Characterization of an O-demethylase of Desulfitobacterium hafniense DCB-2
Studenik, Journal of bacteriology 2012 - “...components. Expression cassettes for the genes Dhaf_1265, Dhaf_2573, Dhaf_2795, Dhaf_3310, Dhaf_3879, Dhaf_4322, Dhaf_4610, Dhaf_4611, and Dhaf_4612 (GenBank...”
- “...Desulfitobacterium hafniense DCB-2 into pET11aa Gene Primer sequence PCR step Dhaf_1265 CGC GTT CAT ATG AAT CAT TAT CGG CC CTG CGG GTG GCT CCA AGC GCT GCA GAG...”
RSK20926_19267 iron-sulfur cluster-binding protein from Roseobacter sp. SK209-2-6
38% identity, 38% coverage
RSK20926_19262 iron-sulfur cluster-binding protein from Roseobacter sp. SK209-2-6
29% identity, 42% coverage
DvMF_1398 ATP-dependent reduction of co(II)balamin (RamA-like) from Desulfovibrio vulgaris Miyazaki F
DvMF_1398 iron-sulfur cluster-binding protein, putative from Desulfovibrio vulgaris str. Miyazaki F
25% identity, 84% coverage
- mutant phenotype: Cofit with the B12-dependent methionine synthase (DvMF_0476), which lacks a standard domain for the reactivation of vitamin B12.
- Filling gaps in bacterial amino acid biosynthesis pathways with high-throughput genetics
Price, PLoS genetics 2018 - “...the standard B12 activation domain. This methionine synthase has a very similar fitness pattern as DvMF_1398, which contains two DUF4445 domains (r = 0.92 across 170 experiments; also see Fig 3 ). We infer that DUF4445 proteins perform the reactivation of vitamin B12 in diverse bacteria....”
DVU0908 ATP-dependent reduction of co(II)balamin from Desulfovibrio vulgaris Hildenborough JW710
27% identity, 69% coverage
- mutant phenotype: Important for fitness in most defined media. Semi-automated annotation based on the auxotrophic phenotype and a hit to HMM PF14574.
ramA / B8Y445 [Co(II) methylated amines-specific corrinoid protein] reductase (EC 1.16.99.1) from Methanosarcina barkeri (see 2 papers)
RAMA_METBA / B8Y445 [Co(II) methylated amine-specific corrinoid protein] reductase; Corrinoid activation enzyme RamA; EC 1.16.99.1 from Methanosarcina barkeri (see paper)
B8Y445 [Co(II) methylated amine-specific corrinoid protein] reductase (EC 1.16.99.1) from Methanosarcina barkeri (see paper)
28% identity, 69% coverage
- function: Reductase required for the activation of corrinoid-dependent methylamine methyltransferase reactions during methanogenesis (PubMed:19043046). Mediates the ATP-dependent reduction of corrinoid proteins from the inactive cobalt(II) state to the active cobalt(I) state (PubMed:19043046). Acts on the corrinoid proteins involved in methanogenesis from monomethylamine (MMA), dimethylamine (DMA) and trimethylamine (TMA), namely MtmC, MtbC and MttC, respectively (PubMed:19043046).
catalytic activity: 2 Co(II)-[methylamine-specific corrinoid protein] + AH2 + ATP + H2O = 2 Co(I)-[methylamine-specific corrinoid protein] + A + ADP + phosphate + 3 H(+) (RHEA:65816)
catalytic activity: 2 Co(II)-[dimethylamine-specific corrinoid protein] + AH2 + ATP + H2O = 2 Co(I)-[dimethylamine-specific corrinoid protein] + A + ADP + phosphate + 3 H(+) (RHEA:65832)
catalytic activity: 2 Co(II)-[trimethylamine-specific corrinoid protein] + AH2 + ATP + H2O = 2 Co(I)-[trimethylamine-specific corrinoid protein] + A + ADP + phosphate + 3 H(+) (RHEA:65836)
cofactor: [4Fe-4S] cluster (Binds 2 [4Fe-4S] clusters.)
subunit: Monomer.
Dde_2711 2Fe-2S iron-sulfur cluster binding domains protein from Desulfovibrio desulfuricans G20
27% identity, 71% coverage
MA0849 hypothetical protein (multi-domain) from Methanosarcina acetivorans C2A
27% identity, 69% coverage
Mmah_1683 4Fe-4S ferredoxin iron-sulfur binding domain protein from Methanohalophilus mahii DSM 5219
27% identity, 69% coverage
MM1440 conserved protein from Methanosarcina mazei Goe1
27% identity, 69% coverage
MA0150 methylamine methyltransferase corrinoid activation protein from Methanosarcina acetivorans C2A
27% identity, 69% coverage
FKV42_RS10455, WP_154810143 methylamine methyltransferase corrinoid protein reductive activase from Methanolobus vulcani
27% identity, 69% coverage
- Several ways one goal-methanogenesis from unconventional substrates
Kurth, Applied microbiology and biotechnology 2020 - “...? Methanolobus vulcani Quaternary amines MtgB methyltransferase FKV42_RS08545 WP_154809802 Corrinoid protein FKV42_RS08550 WP_154809803 Corrinoid activator FKV42_RS10455 WP_154810143 CoM methyltransferase FKV42_RS10480 WP_154810148 For the organisms conducting hydrogen-dependent methylotrophic the enzymes important for energy conversion/recycling of reducing equivalents are shown as those play an important role of the...”
- “...proteomic analysis revealed that MtgB, a corrinoid binding protein (FKV42_RS08550), a corrinoid reductive activation enzyme (FKV42_RS10455) and a methylcorrinoid:CoM methyltransferase (FKV42_RS10480) were highly abundant when M. vulcani B1d was grown on betaine relative to growth on trimethylamine. Energy conservation presumably follows what is known for methylamine...”
- “...Methanolobus vulcani Quaternary amines MtgB methyltransferase FKV42_RS08545 WP_154809802 Corrinoid protein FKV42_RS08550 WP_154809803 Corrinoid activator FKV42_RS10455 WP_154810143 CoM methyltransferase FKV42_RS10480 WP_154810148 For the organisms conducting hydrogen-dependent methylotrophic the enzymes important for energy conversion/recycling of reducing equivalents are shown as those play an important role of the special...”
MM0940 putative Flavoprotein from Methanosarcina mazei Goe1
26% identity, 69% coverage
MA3972 conserved hypothetical protein from Methanosarcina acetivorans C2A
27% identity, 69% coverage
Q8PXZ5 Conserved protein from Methanosarcina mazei (strain ATCC BAA-159 / DSM 3647 / Goe1 / Go1 / JCM 11833 / OCM 88)
MM1071 conserved protein from Methanosarcina mazei Goe1
27% identity, 69% coverage
- Mining proteomic data to expose protein modifications in Methanosarcina mazei strain Gö1
Leon, Frontiers in microbiology 2015 - “...Rpl1P 3 62 Q8PY39 MM1025 ThiC 3 58 Q8PXZ6 MM1070 MtaA1 methylcobalamin:CoM methyltransferase 7 374 Q8PXZ5 MM1071 4Fe:4S ferredoxin, hypothetical 2 121 Q8PXZ3 MM1073 MtaC2 methyl corrinoid protein 6 230 Q8PXZ2 MM1074 MtaB2 9 250 Q8PXZ1 MM1075 Putative regulatory protein 2 92 1 Y Q8PXX0 MM1096...”
- Transcriptional profiling of methyltransferase genes during growth of Methanosarcina mazei on trimethylamine
Krätzer, Journal of bacteriology 2009 - “...(C-terminal domain) MM0174 MM0175 MM0312 MM0408 MM0479 MM0924 MM1071 MM1073 MM1074 MM1075 MM1112 MM1271 MM1272 MM1273 MM1274 MM1275 MM1647 MM1648 MM1761 MM1762...”
- RamA, a protein required for reductive activation of corrinoid-dependent methylamine methyltransferase reactions in methanogenic archaea
Ferguson, The Journal of biological chemistry 2009 - “...were found in M. acetivorans (MA4380), M. mazei (mm1071), and M. barkeri (Mbar_A1055). Additionally, other RamA homologs in Methanosarcina spp. were found, but...”
- A subset of the diverse COG0523 family of putative metal chaperones is linked to zinc homeostasis in all kingdoms of life
Haas, BMC genomics 2009 - “...( M. mazei COG0523) is induced to the same extent as its neighboring ramM homolog, MM1071 , during growth in high salt conditions (2.38 and 2.21 fold, respectively) [ 104 ]. Archaeal genomes sequenced to date lack any recognizable homolog of the Fur (Fe) or Zur...”
- Characterization of a novel bifunctional dihydropteroate synthase/dihydropteroate reductase enzyme from Helicobacter pylori
Levin, Journal of bacteriology 2007 - “...MM512 MM612 MM808 MM847 MM851 MM902 MM1059 MM1060 MM1061 MM1071 Genotype or description 4064 LEVIN ET AL. and E. coli Fre. The reaction mixture was incubated...”
MA4380 conserved hypothetical protein from Methanosarcina acetivorans C2A
27% identity, 69% coverage
dsoF / O32433 DMSO monooxygenase reductase component (EC 1.14.13.245) from Acinetobacter sp. (see 4 papers)
O32433 assimilatory dimethylsulfide S-monooxygenase (subunit 1/6) (EC 1.14.13.245) from Acinetobacter sp. (see 2 papers)
36% identity, 20% coverage
DMPP_ACIP2 / Q7WTJ2 Phenol hydroxylase P5 protein; Phenol 2-monooxygenase P5 component; EC 1.14.13.7 from Acinetobacter pittii (strain PHEA-2)
TC 5.B.1.3.2 / Q7WTJ2 Phenol hydroxylase from Acinetobacter calcoaceticus (strain PHEA-2)
35% identity, 20% coverage
- function: Catabolizes phenol, and some of its methylated derivatives. P5 is required for growth on phenol, and for in vitro phenol hydroxylase activity (By similarity).
function: Probable electron transfer from NADPH, via FAD and the 2Fe-2S center, to the oxygenase activity site of the enzyme.
catalytic activity: phenol + NADPH + O2 + H(+) = catechol + NADP(+) + H2O (RHEA:17061)
cofactor: FAD (Binds 1 FAD.)
cofactor: [2Fe-2S] cluster (Binds 1 [2Fe-2S] cluster.)
subunit: The multicomponent enzyme phenol hydroxylase is formed by P0, P1, P2, P3, P4 and P5 polypeptides.
- substrates: Electrons
WP_226348815 NADH:ubiquinone reductase (Na(+)-transporting) subunit F from Alcaligenes sp. 13f
37% identity, 20% coverage
P23_2977 NADH:ubiquinone reductase (Na(+)-transporting) subunit F from Acinetobacter calcoaceticus
36% identity, 20% coverage
phlF / CAA56745.1 subunit of phenolhydroxylase from Pseudomonas putida (see 2 papers)
34% identity, 20% coverage
A2SI47 Phenol hydrolase reductase from Methylibium petroleiphilum (strain ATCC BAA-1232 / LMG 22953 / PM1)
33% identity, 20% coverage
dmpP / P19734 phenol hydroxylase reductase component (EC 1.14.13.244) from Pseudomonas sp. (strain CF600) (see 9 papers)
DMPP_PSEUF / P19734 Phenol 2-monooxygenase, reductase component DmpP; Phenol 2-monooxygenase P5 component; Phenol hydroxylase P5 protein; EC 1.14.13.244 from Pseudomonas sp. (strain CF600) (see 2 papers)
P19734 phenol 2-monooxygenase (NADH) (subunit 5/6) (EC 1.14.13.244) from Pseudomonas sp. CF600 (see paper)
dmpP phenol hydroxylase P5 protein; EC 1.14.13.7 from Pseudomonas sp. CF600 (see 2 papers)
dmpP / AAA25944.1 phenol hydroxylase from Pseudomonas putida (see paper)
35% identity, 20% coverage
- function: Part of a multicomponent enzyme which catalyzes the degradation of phenol and some of its methylated derivatives (PubMed:2254259). DmpP probably transfers electrons from NADH, via FAD and the iron-sulfur center, to the oxygenase component of the complex (PubMed:2254259). Required for growth on phenol and for in vitro phenol hydroxylase activity (PubMed:2254258, PubMed:2254259).
catalytic activity: phenol + NADH + O2 + H(+) = catechol + NAD(+) + H2O (RHEA:57952)
cofactor: FAD (Binds 1 FAD per subunit.)
cofactor: [2Fe-2S] cluster (Binds 1 [2Fe-2S] cluster per subunit.)
subunit: The multicomponent enzyme phenol hydroxylase is formed by DmpL (P1 component), DmpM (P2 component), DmpN (P3 component), DmpO (P4 component) and DmpP (P5 component).
disruption phenotype: Cells lacking this gene cannot grow on phenol.
- Purification and identification of trichloroethylene induced proteins from Stenotrophomonas maltophilia PM102 by immuno-affinity-chromatography and MALDI-TOF Mass spectrometry
Mukherjee, SpringerPlus 2013 - “...6.77 Propane monooxygenase from Rhodococcus sp . Q0SJK9 63222.42 5.56 Phenol hydroxylase from Pseudomonas sp. P19734 38477.58 4.79 Competing interests The authors declare that they have no competing interests regarding any of the research work reported in this paper. Authors contribution PM carried out the biochemical...”
- Proteogenomic elucidation of the initial steps in the benzene degradation pathway of a novel halophile, Arhodomonas sp. strain Rozel, isolated from a hypersaline environment
Dalvi, Applied and environmental microbiology 2012 - “...P19730 66 2e25 Q9RAF7 77 0 Q5KT19 56 4e35 O84962 64 3e161 P19734 44 3e15 A1K6K5 69 e129 Q1LNR9 50 8e139 Q2W7L9 48 4e49 A1K899 59 2e27 Q49KG4 70 0 G6YS35 a Shown...”
- “...component (P19732), and phenol 2-monooxygenase P5 component (P19734). October 2012 Volume 78 Number 20 aem.asm.org 7313 Downloaded from http://aem.asm.org/ on...”
- Epoxyalkane: coenzyme M transferase in the ethene and vinyl chloride biodegradation pathways of mycobacterium strain JS60
Coleman, Journal of bacteriology 2003 - “...41.8 P27353 BAA07115 DmpP Pseudomonas strain CF600 40.4 P19734 GctB CatJ CAA10043 Organism Incomplete ORF. (amoABCD) of Rhodococcus strain B-276. The sequence...”
- Duplicate copies of genes encoding methanesulfonate monooxygenase in Marinosulfonomonas methylotropha strain TR3 and detection of methanesulfonate utilizers in the environment
Baxter, Applied and environmental microbiology 2002 - “...42 42 Pseudomonas sp. strain CF600 P. putida P. putida P19734 Q52126 P23101 OrfX Cyc6 Cyc6 Cytochrome c6 Cytochrome c6 33 32 44 47 E. gracilis M. aeruginosa...”
BT1155 Na+-translocating NADH-quinone reductase subunit from Bacteroides thetaiotaomicron VPI-5482
35% identity, 14% coverage
D8DWB6 Na(+)-translocating NADH-quinone reductase subunit F from Segatella baroniae B14
34% identity, 14% coverage
- Occurrence and Function of the Na+-Translocating NADH:Quinone Oxidoreductase in Prevotella spp.
Deusch, Microorganisms 2019 - “...RnfE D8DXV3 37.36 NuoN D8DX02 19.08 NqrE D8DWB7 RnfA D8DXV2 44.50 NuoL A0A1H9A8K0 16.67 NqrF D8DWB6 RnfB D8DXV7 17.34 NuoCD D8DWN9 17.48 microorganisms-07-00117-t004_Table 4 Table 4 Subunits of the NQR, RNF, NDH-I (Nuo), and other respiratory enzymes identified from membranes solubilized with 1% or 2% (...”
- “...reductase, Fe-S pr. 1272.00 63.89 16 16 Triton 1%B D8DWC1 NqrA 1126.93 60.58 26 27 D8DWB6 NqrF 386.31 18.01 7 7 D8DWB9 NqrC 272.89 40.48 7 8 D8DWC0 NqrB 55.66 11.43 3 3 D8DWB8 NqrD 41.65 4.78 2 2 D8DWN8 NuoH 115.46 9.62 4 4 D8DWN7...”
A0A0D0J042 Na(+)-translocating NADH-quinone reductase subunit F from Prevotella pectinovora
37% identity, 14% coverage
Q84AQ0 phenol 2-monooxygenase (NADH) (subunit 1/5) (EC 1.14.13.244) from Pseudomonas stutzeri (see paper)
34% identity, 20% coverage
Rmet_1326 NADH:ubiquinone reductase (Na(+)-transporting) subunit F from Cupriavidus metallidurans CH34
Rmet_1326 Oxidoreductase FAD-binding region from Ralstonia metallidurans CH34
31% identity, 22% coverage
tomA5 / Q9ANX0 toluene ortho-monooxygenase TomA5 subunit (EC 1.14.13.243) from Burkholderia cepacia (see paper)
Q9ANX0 toluene 2-monooxygenase (subunit 5/6) (EC 1.14.13.243) from Burkholderia cepacia (see 8 papers)
31% identity, 22% coverage
KZ686_09965 NADH:ubiquinone reductase (Na(+)-transporting) subunit F from Cupriavidus cauae
31% identity, 22% coverage
pc1533 probable Na(+)-translocating NADH-quinone reductase, chain F from Parachlamydia sp. UWE25
35% identity, 13% coverage
Slit_1671 ferredoxin from Sideroxydans lithotrophicus ES-1
32% identity, 14% coverage
N6YI82 Phenol 2-monooxygenase from Thauera sp. 63
34% identity, 16% coverage
CT740 Phenolhydrolase/NADH ubiquinone oxidoreductase from Chlamydia trachomatis D/UW-3/CX
29% identity, 15% coverage
TC0116 NADH:ubiquinone oxidoreductase, beta subunit, putative from Chlamydia muridarum Nigg
27% identity, 13% coverage
C6KUI9 Ferredoxin oxidoreductase from bacterium
31% identity, 20% coverage
CTLon_0109 Na(+)-translocating NADH-quinone reductase subunit F from Chlamydia trachomatis L2b/UCH-1/proctitis
28% identity, 15% coverage
Q52574 toluene 2-monooxygenase (subunit 1/6) (EC 1.14.13.243) from Pseudomonas sp. (see paper)
tbmF / AAA88461.1 oxidoreductase from Pseudomonas sp (see paper)
32% identity, 20% coverage
5ogxA / A0A076MZ01 Crystal structure of amycolatopsis cytochrome p450 reductase gcob. (see paper)
31% identity, 19% coverage
- Ligands: flavin-adenine dinucleotide; fe2/s2 (inorganic) cluster (5ogxA)
GCOB_AMYS7 / P0DPQ8 Aromatic O-demethylase, reductase subunit; NADH--hemoprotein reductase; EC 1.6.2.- from Amycolatopsis sp. (strain ATCC 39116 / 75iv2) (see paper)
WP_020419854 2Fe-2S iron-sulfur cluster-binding protein from Amycolatopsis sp. ATCC 39116
31% identity, 19% coverage
- function: Part of a two-component P450 system that efficiently O-demethylates diverse aromatic substrates such as guaiacol and a wide variety of lignin-derived monomers. Is likely involved in lignin degradation, allowing Amycolatopsis sp. ATCC 39116 to catabolize plant biomass. GcoB transfers electrons from NADH to the cytochrome P450 subunit GcoA. Highly prefers NADH over NADPH as the electron donor.
catalytic activity: 2 oxidized [cytochrome P450] + NADH = 2 reduced [cytochrome P450] + NAD(+) + H(+) (RHEA:57420)
cofactor: FAD (Binds 1 FAD per subunit.)
cofactor: [2Fe-2S] cluster (Binds 1 [2Fe-2S] cluster per subunit.)
subunit: Monomer. Forms a heterodimer with GcoA.
- Engineering a Cytochrome P450 for Demethylation of Lignin-Derived Aromatic Aldehydes
Ellis, JACS Au 2021 - “...B, C). 35 , 36 The Amycolatopsis sp. ATCC 39116 GcoAB cytochrome P450 system (WP_020419855, WP_020419854) was therefore engineered for efficient turnover of aromatic aldehydes, specifically targeting p- and o -vanillin: substrates with which GcoAB has little to no native activity. Building from our previous work,...”
CTME_CASD6 / W8X5L3 2Fe-2S ferredoxin CtmE from Castellaniella defragrans (strain DSM 12143 / CCUG 39792 / 65Phen) (Alcaligenes defragrans) (see paper)
39% identity, 14% coverage
- function: Involved in the degradation of the cyclic monoterpene limonene (PubMed:24952578). Probably part of an electron transfer system involved in the oxidation of limonene to perillyl alcohol (Probable).
cofactor: [2Fe-2S] cluster (Binds 1 2Fe-2S cluster.)
disruption phenotype: Mutant cannot grow aerobically or anaerobically on limonene, but it can grow on perillyl alcohol or on acetate.
P95461 p-cymene methyl-monooxygenase (EC 1.14.15.25) from Pseudomonas chlororaphis subsp. aureofaciens (see paper)
36% identity, 14% coverage
cymAb / O33457 NADH-ferredoxin reductase (EC 1.18.1.3) from Pseudomonas putida (see 2 papers)
O33457 p-cymene methyl-monooxygenase (subunit 1/2) (EC 1.14.15.25) from Pseudomonas putida (see 2 papers)
cymAb / AAB62300.1 p-cymene monooxygenase reductase subunit from Pseudomonas putida (see 3 papers)
33% identity, 16% coverage
N8H69_24105 CDP-6-deoxy-delta-3,4-glucoseen reductase from Achromobacter spanius
32% identity, 15% coverage
RHE_RS24270 adenylate/guanylate cyclase domain-containing protein from Rhizobium etli CFN 42
48% identity, 7% coverage
SMc01818 PUTATIVE ADENYLATE CYCLASE TRANSMEMBRANE PROTEIN from Sinorhizobium meliloti 1021
50% identity, 7% coverage
- A signaling complex of adenylate cyclase CyaC of Sinorhizobium meliloti with cAMP and the transcriptional regulators Clr and CycR
Klein, BMC microbiology 2023 - “...a broad significance for the regulation of diverse processes in cell physiology and metabolism. CyaC (SMc01818 protein) of Sinorhizobium (Ensifer) meliloti belongs to the bacterial class III ACs, which are homodimers, are mostly membrane-bound and have a large variation in domain composition [ 3 5 ]....”
- “...proteins of metabolic or signaling pathways are often clustered [ 14 ]. Interestingly, cyaC ( smc01818 ) is preceded by a gene ( smc01819 ) that encodes a putative transcriptional regulator (TR01819) of the TetR family with an N -terminal Helix-Turn-Helix DNA binding motif. TetR regulators...”
3huiA / Q6N2U2 Crystal structure of the mutant a105r of [2fe-2s] ferredoxin in the class i cyp199a2 system from rhodopseudomonas palustris (see paper)
36% identity, 12% coverage
- Ligand: fe2/s2 (inorganic) cluster (3huiA)
Q6N2U2 2Fe-2S iron-sulfur cluster-binding protein from Rhodopseudomonas palustris (strain ATCC BAA-98 / CGA009)
RPA3956 ferredoxin from Rhodopseudomonas palustris CGA009
36% identity, 12% coverage
- Altering glycopeptide antibiotic biosynthesis through mutasynthesis allows incorporation of fluorinated phenylglycine residues.
Voitsekhovskaia, RSC chemical biology 2024 - “...(UniProt protein ID: O52825 and O52816); PuR (UniProt protein ID: Q6N3B2), PuxB (UniProt protein ID: Q6N2U2), Sfp (R4-4 mutant) and M4 and M5 domains 57 of Tcp 11 UniProt protein ID: Q70AZ7) were expressed and purified as previously reported. 13,5760 A-domain characterisation Activation of phenylglycine substrates...”
- Thioredoxin Reductase-Type Ferredoxin: NADP+ Oxidoreductase of Rhodopseudomonas palustris: Potentiometric Characteristics and Reactions with Nonphysiological Oxidants
Lesanavičius, Antioxidants (Basel, Switzerland) 2022 - “...the other hand, Rp FNR has low reactivity toward Fe 2 S 2 -type ferredoxin (RPA3956), whereas its reactivity toward Fe 4 S 4 -type Fds of R. palustris has not been reported [ 15 ]. In order to extend the understanding of the redox properties...”
- Protein recognition in ferredoxin-P450 electron transfer in the class I CYP199A2 system from Rhodopseudomonas palustris
Bell, Journal of biological inorganic chemistry : JBIC : a publication of the Society of Biological Inorganic Chemistry 2010 (PubMed)- “...(PuR). Another [2Fe-2S] ferredoxin, palustrisredoxin B (PuxB; RPA3956) has been identified in the genome. PuxB shares sequence identity and motifs with...”
- “...studies of palustrisredoxin B (PuxB) encoded by the RPA3956 gene. PuxB shares a high degree of sequence identity with vertebrate-type [2Fe-2S] ferredoxins that...”
JJQ59_20660 NADH:ubiquinone reductase (Na(+)-transporting) subunit F from Cupriavidus necator
34% identity, 16% coverage
Q7UWS0 Na(+)-translocating NADH-quinone reductase subunit F from Rhodopirellula baltica (strain DSM 10527 / NCIMB 13988 / SH1)
27% identity, 19% coverage
PA4331 hypothetical protein from Pseudomonas aeruginosa PAO1
34% identity, 13% coverage
- Oxygen-dependent regulation of c-di-GMP synthesis by SadC controls alginate production in Pseudomonas aeruginosa
Schmidt, Environmental microbiology 2016 (PubMed)- “...we found that the gene products of PA4330 and PA4331, located in a predicted operon with sadC, have a major impact on alginate production: deletion of PA4330...”
- “...production defect under anaerobic conditions, whereas a PA4331 (odaI, for oxygendependent alginate synthesis inhibitor) deletion mutant produced alginate also...”
- BIIL 284 reduces neutrophil numbers but increases P. aeruginosa bacteremia and inflammation in mouse lungs
Döring, Journal of cystic fibrosis : official journal of the European Cystic Fibrosis Society 2014 - “...patients with non-CF bronchiectasis and healthy individuals using a qPCR TaqMan assays based on the PA4331 gene, which is ubiquitously present in a collection of 117 isolates from 14 CF patients and therefore a stable marker of P. aeruginosa . A standard curve was used to...”
- “...TaqHotStart DNA polymerase, qPCR Buffer, dNTPs, MgCl 2 and stabilizers) (peQlab, Erlangen, Germany), 0.3M Primer PA4331 forward (5-GTGTTGCAGCCTTTCGATCC3-), 0.3 M Primer PA4331 reverse (5- AACTCCAGCCATGGGTCCTC 3-), 0.3 M qPCR PA4331 Probe (5-FAM GCAGCACCTGCTGCTGTGGA 3-TAM) (Eurofins MWG Operon, Ebersberg, Germany). PCR conditions were 3 min at 95C,...”
For advice on how to use these tools together, see
Interactive tools for functional annotation of bacterial genomes.
The PaperBLAST database links 793,807 different protein sequences to 1,259,118 scientific articles. Searches against EuropePMC were last performed on March 13 2025.
PaperBLAST builds a database of protein sequences that are linked
to scientific articles. These links come from automated text searches
against the articles in EuropePMC
and from manually-curated information from GeneRIF, UniProtKB/Swiss-Prot,
BRENDA,
CAZy (as made available by dbCAN),
BioLiP,
CharProtDB,
MetaCyc,
EcoCyc,
TCDB,
REBASE,
the Fitness Browser,
and a subset of the European Nucleotide Archive with the /experiment tag.
Given this database and a protein sequence query,
PaperBLAST uses protein-protein BLAST
to find similar sequences with E < 0.001.
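The E-value cutoff can be illustrated with a small filter over BLAST tabular output. This is a minimal sketch, not PaperBLAST's actual code: it assumes the standard 12-column tabular format (`-outfmt 6`), and the name `filter_hits` and the sample rows are hypothetical. Coverage is computed here as the aligned fraction of the query, which is an assumption about how the percentages on the results page are defined.

```python
from typing import NamedTuple

class Hit(NamedTuple):
    subject: str
    identity: float  # percent identity over the aligned region
    coverage: float  # percent of the query spanned by the alignment (assumed definition)
    evalue: float

def filter_hits(tabular: str, query_len: int, max_evalue: float = 1e-3) -> list[Hit]:
    """Keep hits with E < max_evalue from 12-column BLAST tabular text."""
    hits = []
    for line in tabular.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        qstart, qend = int(fields[6]), int(fields[7])
        evalue = float(fields[10])
        if evalue < max_evalue:
            coverage = 100.0 * (qend - qstart + 1) / query_len
            hits.append(Hit(fields[1], float(fields[2]), coverage, evalue))
    return hits

# Hypothetical rows for a 608-residue query: one strong hit, one weak hit.
example = (
    "query\thitA\t43.0\t600\t330\t5\t5\t600\t1\t598\t1e-150\t450\n"
    "query\thitB\t30.0\t40\t28\t0\t10\t49\t3\t42\t0.5\t25.0\n"
)
kept = filter_hits(example, query_len=608)
```

With these sample rows, only `hitA` passes the cutoff; `hitB` has E = 0.5 and is dropped.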
To build the database, we query EuropePMC with locus tags, with RefSeq protein
identifiers, and with UniProt
accessions. We obtain the locus tags from RefSeq or from MicrobesOnline. We use
queries of the form "locus_tag AND genus_name" to try to ensure that
the paper is actually discussing that gene. Because EuropePMC indexes
most recent biomedical papers, even if they are not open access, some
of the links may be to papers that you cannot read or that our
computers cannot read. We query each of these identifiers that
appears in the open access part of EuropePMC, as well as every locus
tag that appears in the 500 most-referenced genomes, so that a gene
may appear in the PaperBLAST results even though none of the papers
that mention it are open access. We also incorporate text-mined links
from EuropePMC that link open access articles to UniProt or RefSeq
identifiers. (This yields some additional links because EuropePMC
uses different heuristics for their text mining than we do.)
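The "locus_tag AND genus_name" query form described above can be sketched as follows. This is an illustration under stated assumptions, not PaperBLAST's own code: the helper names are hypothetical, and the URL is assumed to be Europe PMC's public REST search endpoint. The sketch only builds the request URL and makes no network call.

```python
from urllib.parse import urlencode

# Assumed endpoint for Europe PMC's REST search service.
EUROPEPMC_SEARCH = "https://www.ebi.ac.uk/europepmc/webservices/rest/search"

def build_query(locus_tag: str, genus: str) -> str:
    """Pair the locus tag with its genus so unrelated uses of the tag are less likely to match."""
    return f"{locus_tag} AND {genus}"

def build_url(locus_tag: str, genus: str) -> str:
    """Return a search URL for the query, requesting JSON results."""
    params = {"query": build_query(locus_tag, genus), "format": "json"}
    return f"{EUROPEPMC_SEARCH}?{urlencode(params)}"

url = build_url("Dhaf_3310", "Desulfitobacterium")
```

Requiring the genus name alongside the tag is what keeps short or generic identifiers from matching papers about unrelated organisms.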
For every article that mentions a locus tag, a RefSeq protein
identifier, or a UniProt accession, we try to select one or two
snippets of text that refer to the protein. If we cannot get access to
the full text, we try to select a snippet from the abstract, but
unfortunately, unique identifiers such as locus tags are rarely
provided in abstracts.
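One simple way to pick snippets is to take a window of text around each mention of the identifier. The sketch below is only an approximation of that idea; PaperBLAST's real heuristics are not described here, and the function name, window width, and case-insensitive matching are all assumptions.

```python
import re

def select_snippets(text: str, identifier: str, width: int = 80,
                    max_snippets: int = 2) -> list[str]:
    """Return up to max_snippets text windows around whole-word mentions of identifier."""
    pattern = re.compile(r"\b" + re.escape(identifier) + r"\b", re.IGNORECASE)
    snippets = []
    last_end = -1
    for match in pattern.finditer(text):
        start = max(0, match.start() - width)
        end = min(len(text), match.end() + width)
        if start < last_end:
            continue  # this mention falls inside the previous window
        snippets.append("..." + text[start:end].strip() + "...")
        last_end = end
        if len(snippets) == max_snippets:
            break
    return snippets

# Hypothetical article text with two well-separated mentions.
article = ("We deleted PA4331 in strain PAO1. " + "x" * 300 +
           " Later, PA4331 mutants produced alginate.")
found = select_snippets(article, "PA4331", width=20)
```

The word-boundary match keeps `PA4331` from matching inside a longer tag such as `PA43310`.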
PaperBLAST also incorporates manually-curated protein functions:
- Proteins from NCBI's RefSeq are included if a
GeneRIF
entry links the gene to an article in
PubMed®.
GeneRIF also provides a short summary of the article's claim about the
protein, which is shown instead of a snippet.
- Proteins from Swiss-Prot (the curated part of UniProt)
are included if the curators
identified experimental evidence for the protein's function (evidence
code ECO:0000269). For these proteins, the fields of the Swiss-Prot entry that
describe the protein's function are shown (with bold headings).
- Proteins from BRENDA,
a curated database of enzymes, are included if they are linked to a paper in PubMed
and their full sequence is known.
- Every protein from the non-redundant subset of
BioLiP,
a database
of ligand-binding sites and catalytic residues in protein structures, is included. Since BioLiP itself
does not include descriptions of the proteins, those are taken from the
Protein Data Bank.
Descriptions from PDB rely on the original submitter of the
structure and cannot be updated by others, so they may be less reliable.
(For SitesBLAST and Sites on a Tree, we use a larger subset of BioLiP so that every
ligand is represented among a group of structures with similar sequences, but for
PaperBLAST, we use the non-redundant set provided by BioLiP.)
- Every protein from EcoCyc, a curated
database of the proteins in Escherichia coli K-12, is included, regardless
of whether it is characterized.
- Proteins from the MetaCyc metabolic pathway database
are included if they are linked to a paper in PubMed and their full sequence is known.
- Proteins from the Transport Classification Database (TCDB)
are included if they have known substrate(s), have reference(s),
and are not described as uncharacterized or putative.
(Some of the references are not visible on the PaperBLAST web site.)
- Every protein from CharProtDB,
a database of experimentally characterized protein annotations, is included.
- Proteins from the CAZy database of carbohydrate-active enzymes
are included if they are associated with an Enzyme Classification number.
Even though CAZy does not provide links from individual protein sequences to papers,
these should all be experimentally-characterized proteins.
- Proteins from the REBASE database
of restriction enzymes are included if they have known specificity.
- Every protein with an evidence-based reannotation (based on mutant phenotypes)
in the Fitness Browser is included.
- Sequence-specific transcription factors (including sigma factors and DNA-binding response regulators)
with experimentally-determined DNA binding sites from the
PRODORIC database of gene regulation in prokaryotes.
- Putative transcription factors from RegPrecise
that have manually-curated predictions for their binding sites. These predictions are based on
conserved putative regulatory sites across genomes that contain similar transcription factors,
so PaperBLAST clusters the TFs at 70% identity and retains just one member of each cluster.
- Coding sequence (CDS) features from the
European Nucleotide Archive (ENA)
are included if the /experiment tag is set (implying that there is experimental evidence for the annotation),
the nucleotide entry links to paper(s) in PubMed,
and the nucleotide entry is from the STD data class
(implying that these are targeted annotated sequences, not from shotgun sequencing).
Also, to filter out genes whose transcription or translation was detected, but whose function
was not studied, nucleotide entries or papers with more than 25 such proteins are excluded.
Descriptions from ENA rely on the original submitter of the
sequence and cannot be updated by others, so they may be less reliable.
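The size filter for ENA entries (excluding nucleotide entries or papers with more than 25 such proteins) might look like the sketch below. The record layout (`entry`, `pmids`) and the function name are hypothetical, and the sketch assumes that a protein is dropped entirely if any of its papers exceeds the limit, which the description above does not specify.

```python
from collections import Counter

def filter_ena(records: list[dict], max_per_group: int = 25) -> list[dict]:
    """Drop CDS records whose nucleotide entry or any linked paper has too many proteins."""
    per_entry = Counter(r["entry"] for r in records)
    per_paper = Counter(pmid for r in records for pmid in set(r["pmids"]))
    return [
        r for r in records
        if per_entry[r["entry"]] <= max_per_group
        and all(per_paper[p] <= max_per_group for p in r["pmids"])
    ]

# Hypothetical data: a 26-protein entry (likely a survey) and a 2-protein entry.
large = [{"entry": "E1", "pmids": ["1"], "protein": f"p{i}"} for i in range(26)]
small = [{"entry": "E2", "pmids": ["2"], "protein": f"q{i}"} for i in range(2)]
kept_records = filter_ena(large + small)
```

Here only the two proteins from `E2` survive, since `E1` and its paper each have 26 proteins.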
Except for GeneRIF and ENA,
the curated entries include a short curated
description of the protein's function.
For entries from BioLiP, the protein's function may not be known beyond binding to the ligand.
Many of these entries also link to articles in PubMed.
For more information see the
PaperBLAST paper (mSystems 2017)
or the code.
You can download PaperBLAST's database here.
Changes to PaperBLAST since the paper was written:
- November 2023: incorporated PRODORIC and RegPrecise. Many PRODORIC entries were not linked to a protein sequence (no UniProt identifier), so we added this information.
- February 2023: BioLiP changed their download format. PaperBLAST now includes their non-redundant subset. SitesBLAST and Sites on a Tree use a larger non-redundant subset that ensures that every ligand is represented within each cluster. This should ensure that every binding site is represented.
- June 2022: incorporated some coding sequences from ENA with the /experiment tag.
- March 2022: incorporated BioLiP.
- April 2020: incorporated TCDB.
- April 2019: EuropePMC now returns table entries in their search results. This has expanded PaperBLAST's database, but most of the new entries are of low relevance, and the resulting snippets are often just lists of locus tags with annotations.
- February 2018: the alignment page reports the conservation of the hit's functional sites (if available from Swiss-Prot or UniProt).
- January 2018: incorporated BRENDA.
- December 2017: incorporated MetaCyc, CharProtDB, CAZy, REBASE, and the reannotations from the Fitness Browser.
- September 2017: EuropePMC no longer returns some table entries in their search results. This has shrunk PaperBLAST's database, but has also reduced the number of low-relevance hits.
Many of these changes are described in Interactive tools for functional annotation of bacterial genomes.
PaperBLAST cannot provide snippets for many of the papers that are
published in non-open-access journals. This limitation applies even if
the paper is marked as "free" on the publisher's web site and is
available in PubmedCentral or EuropePMC. If a journal that you publish
in is marked as "secret," please consider publishing elsewhere.
Many important articles are missing from PaperBLAST, either because
the article's full text is not in EuropePMC (as for many older
articles), or because the paper does not mention a protein identifier such as a locus tag, or because of PaperBLAST's heuristics. If you notice an
article that characterizes a protein's function but is missing from
PaperBLAST, please notify the curators at UniProt
or add an entry to GeneRIF.
Entries in either of these databases will eventually be incorporated
into PaperBLAST. Note that to add an entry to UniProt, you will need
to find the UniProt identifier for the protein. If the protein is not
already in UniProt, you can ask them to create an entry. To add an
entry to GeneRIF, you will need an NCBI Gene identifier, but
unfortunately many prokaryotic proteins in RefSeq do not have
corresponding Gene identifiers.
References
PaperBLAST: Text-mining papers for information about homologs.
M. N. Price and A. P. Arkin (2017). mSystems, 10.1128/mSystems.00039-17.
Europe PMC in 2017.
M. Levchenko et al (2017). Nucleic Acids Research, 10.1093/nar/gkx1005.
Gene indexing: characterization and analysis of NLM's GeneRIFs.
J. A. Mitchell et al (2003). AMIA Annu Symp Proc 2003:460-464.
UniProt: the universal protein knowledgebase.
The UniProt Consortium (2016). Nucleic Acids Research, 10.1093/nar/gkw1099.
BRENDA in 2017: new perspectives and new tools in BRENDA.
S. Placzek et al (2017). Nucleic Acids Research, 10.1093/nar/gkw952.
The EcoCyc database: reflecting new knowledge about Escherichia coli K-12.
I. M. Keseler et al (2016). Nucleic Acids Research, 10.1093/nar/gkw1003.
The MetaCyc database of metabolic pathways and enzymes.
R. Caspi et al (2018). Nucleic Acids Research, 10.1093/nar/gkx935.
CharProtDB: a database of experimentally characterized protein annotations.
R. Madupu et al (2012). Nucleic Acids Research, 10.1093/nar/gkr1133.
The carbohydrate-active enzymes database (CAZy) in 2013.
V. Lombard et al (2014). Nucleic Acids Research, 10.1093/nar/gkt1178.
The Transporter Classification Database (TCDB): recent advances.
M. H. Saier, Jr. et al (2016). Nucleic Acids Research, 10.1093/nar/gkv1103.
REBASE - a database for DNA restriction and modification: enzymes, genes and genomes.
R. J. Roberts et al (2015). Nucleic Acids Research, 10.1093/nar/gku1046.
Deep annotation of protein function across diverse bacteria from mutant phenotypes.
M. N. Price et al (2016). bioRxiv, 10.1101/072470.
by Morgan Price,
Arkin group
Lawrence Berkeley National Laboratory