PaperBLAST

PaperBLAST – Find papers about a protein or its homologs

PaperBLAST Hits for VIMSS6823329 ferredoxin (608 a.a., MVQVTFLPGK...)

Other sequence analysis tools:

Find functional residues: SitesBLAST

Search for conserved domains

Find the best match in UniProt

Compare to protein structures

Predict transmembrane helices: Phobius

Predict protein localization: PSORTb

Find homologs in fast.genomics

Found 78 similar proteins in the literature:

Dhaf_3310 ferredoxin from Desulfitobacterium hafniense DCB-2
100% identity, 100% coverage

Characterization of an O-demethylase of Desulfitobacterium hafniense DCB-2
Studenik, Journal of bacteriology 2012
- “...for the genes Dhaf_1265, Dhaf_2573, Dhaf_2795, Dhaf_3310, Dhaf_3879, Dhaf_4322, Dhaf_4610, Dhaf_4611, and Dhaf_4612 (GenBank accession no. CP001336.1) as...”
- “...CCT TAT TTT TCG AAC TGC GGG TGG C 1 Dhaf_2573 Dhaf_3310 Dhaf_3879 Dhaf_4322 Dhaf_4610 Dhaf_4611 Dhaf_4612 a 2 2 2 2 2 2 2 2 For details, see Materials and...”

SSCH_450007 ASKHA domain-containing protein from Syntrophaceticus schinkii
43% identity, 98% coverage

Genome-Guided Analysis and Whole Transcriptome Profiling of the Mesophilic Syntrophic Acetate Oxidising Bacterium Syntrophaceticus schinkii
Manzoor, PloS one 2016
- “...Several ferredoxin-encoding genes were found dispersed in the genome of S. schinkii (SSCH_100042, SSCH_450007, SSCH_530010, SSCH_760007, SSCH_1120013) and one putative rubredoxin gene (SSCH_180038). 10.1371/journal.pone.0166520.g006 Fig 6 Comparison of the NADH-dependent [Fe-Fe] hydrogenase gene cluster (SSCH_21000810) predicted for S. schinkii strain Sp3 to the...”

BP07_RS03235, WP_042685513 ASKHA domain-containing protein from Methermicoccus shengliensis
42% identity, 98% coverage

Methanogenic archaea use a bacteria-like methyltransferase system to demethoxylate aromatic compounds
Kurth, The ISME journal 2021
- “...and MtoD The gene encoding the corrinoid protein MtoC (BP07_RS03260) and the corrinoid activating enzyme (BP07_RS03235) were amplified from genomic M. shengliensis DNA with primers 3235fw/3235Srev (CTCATATGAGCGTCAGAGTAACGTTCGAGC, CTGCGGCCGCTTATTTTTCGAACTGCGGGTGGCTCCAGCTAGCTGAAGAGAGTTTTTCTCC) and 3260fw/3260Srev (CTCATATGACGGACGTAAGAGAAGAGCTC/CTGCGGCCGCTTATTTTTCGAACTGCGGGTGGCTCCAGCTAGCCTCCACCCCCACCAGAGC) for cloning in expression vector pET-30a inserting an N-terminal Strep tag via the reverse primer....”
- “...plasmid transformation. For production of the corrinoid protein MtoC (BP07_RS03260) and the corrinoid activating enzyme (BP07_RS03235) the plasmids pET-30a_BP07_RS03260 and pET-30a_BP07_RS03235 were used for transformation into E. coli Bl21 (DE3). For protein overexpression, one colony was inoculated in 600 ml LB-medium containing 50 µg/ml kanamycin and incubated at...”
Several ways one goal-methanogenesis from unconventional substrates
Kurth, Applied microbiology and biotechnology 2020
- “...MtvB O-demethylase BP07_RS03250 WP_042685515 Corrinoid protein BP07_RS03260 WP_042685521 MtrH-like methyltransferase BP07_RS03240 WP_042685937 Corrinoid activation protein BP07_RS03235 WP_042685513 Methanococcoides Tertiary amines ? ? ? Methanolobus vulcani Quaternary amines MtgB methyltransferase FKV42_RS08545 WP_154809802 Corrinoid protein FKV42_RS08550 WP_154809803 Corrinoid activator FKV42_RS10455 WP_154810143 CoM methyltransferase FKV42_RS10480 WP_154810148 For the organisms...”
- “...O-demethylase BP07_RS03250 WP_042685515 Corrinoid protein BP07_RS03260 WP_042685521 MtrH-like methyltransferase BP07_RS03240 WP_042685937 Corrinoid activation protein BP07_RS03235 WP_042685513 Methanococcoides Tertiary amines ? ? ? Methanolobus vulcani Quaternary amines MtgB methyltransferase FKV42_RS08545 WP_154809802 Corrinoid protein FKV42_RS08550 WP_154809803 Corrinoid activator FKV42_RS10455 WP_154810143 CoM methyltransferase FKV42_RS10480 WP_154810148 For the organisms conducting...”

Dhaf_3879 ferredoxin from Desulfitobacterium hafniense DCB-2
41% identity, 98% coverage

Characterization of an O-demethylase of Desulfitobacterium hafniense DCB-2
Studenik, Journal of bacteriology 2012
- “...for the genes Dhaf_1265, Dhaf_2573, Dhaf_2795, Dhaf_3310, Dhaf_3879, Dhaf_4322, Dhaf_4610, Dhaf_4611, and Dhaf_4612 (GenBank accession no. CP001336.1) as Strep...”
- “...TAT TTT TCG AAC TGC GGG TGG C 1 Dhaf_2573 Dhaf_3310 Dhaf_3879 Dhaf_4322 Dhaf_4610 Dhaf_4611 Dhaf_4612 a 2 2 2 2 2 2 2 2 For details, see Materials and Methods....”

3zyyX / Q3ACS2 Reductive activator for corrinoid,iron-sulfur protein (see paper)
39% identity, 96% coverage

Ligands: fe2/s2 (inorganic) cluster; (r,r)-2,3-butanediol (3zyyX)

Dhaf_2795 ferredoxin from Desulfitobacterium hafniense DCB-2
DSY1650 ferredoxin from Desulfitobacterium hafniense Y51
39% identity, 94% coverage

Characterization of an O-demethylase of Desulfitobacterium hafniense DCB-2
Studenik, Journal of bacteriology 2012
- “...cassettes for the genes Dhaf_1265, Dhaf_2573, Dhaf_2795, Dhaf_3310, Dhaf_3879, Dhaf_4322, Dhaf_4610, Dhaf_4611, and Dhaf_4612 (GenBank accession no. CP001336.1)...”
- “...Dhaf_2795 ... oriented in the reverse direction in comparison to the orientation in the...”
Complete genome sequence of the dehalorespiring bacterium Desulfitobacterium hafniense Y51 and comparison with Dehalococcoides ethenogenes 195
Nonaka, Journal of bacteriology 2006
- “...DSY0391 DSY0393 DSY1228 DSY1247 DSY1596 DSY1598 DSY1648 DSY1650 DSY1651 DSY1652 DSY1671 DSY1890 DSY2085 DSY2558 DSY2585 DSY3715 DSY4099 DSY4876 Predicted...”

Dhaf_2573 ferredoxin from Desulfitobacterium hafniense DCB-2
39% identity, 96% coverage

Characterization of an O-demethylase of Desulfitobacterium hafniense DCB-2
Studenik, Journal of bacteriology 2012
- “...Expression cassettes for the genes Dhaf_1265, Dhaf_2573, Dhaf_2795, Dhaf_3310, Dhaf_3879, Dhaf_4322, Dhaf_4610, Dhaf_4611, and Dhaf_4612 (GenBank accession no....”
- “...GAT CCT TAT TTT TCG AAC TGC GGG TGG C 1 Dhaf_2573 Dhaf_3310 Dhaf_3879 Dhaf_4322 Dhaf_4610 Dhaf_4611 Dhaf_4612 a 2 2 2 2 2 2 2 2 For details, see Materials and...”

Dtox_1273 ferredoxin from Desulfotomaculum acetoxidans DSM 771
38% identity, 96% coverage

Genome analysis of Desulfotomaculum kuznetsovii strain 17(T) reveals a physiological similarity with Pelotomaculum thermopropionicum strain SI(T)
Visser, Standards in genomic sciences 2013
- “...(Desku_1493) and acsE (Desku_1487), which in contrast is present in the genome of D. acetoxidans (Dtox_1273). Moreover, three genes similar to heterodisulfide reductase encoding genes (Desku_1486-1484) are located upstream of acsE in D. kuznetsovii , which is not the case in the genome of D. acetoxidans...”

Dhaf_4322 ferredoxin from Desulfitobacterium hafniense DCB-2
39% identity, 97% coverage

Characterization of an O-demethylase of Desulfitobacterium hafniense DCB-2
Studenik, Journal of bacteriology 2012
- “...the genes Dhaf_1265, Dhaf_2573, Dhaf_2795, Dhaf_3310, Dhaf_3879, Dhaf_4322, Dhaf_4610, Dhaf_4611, and Dhaf_4612 (GenBank accession no. CP001336.1) as Strep tag...”
- “...BamHI according to the manufacturer's protocol. For Dhaf_4322, a compatible...”

D9S251 Ferredoxin from Thermosediminibacter oceani (strain ATCC BAA-1034 / DSM 16646 / JW/IW-1228P)
38% identity, 99% coverage

Analytical Validation of Loss of Heterozygosity and Mutation Detection in Pancreatic Fine-Needle Aspirates by Capillary Electrophoresis and Sanger Sequencing.
Timmaraju, Diagnostics (Basel, Switzerland) 2024
- “...150 chr5 119101810 119101960 GGTGTCAACAAAGTAATGTAAAG TGGATACATATTGTTTTCTGCTG 5q D5S615 330 chr5 125163290 125163620 GAGATAGGTAGGTAGGTAGG TCCACAGTGGTAAGAACCAG 9p D9S251 390 chr9 30819368 30819758 TGCATGTTTTATGTGCACTAAC CAATACTTTTTAAGGCTTTGTAGG 9p D9S254 120 chr9 126869098 126869218 TGGGTAATAACTGCCGGAGA GAGGATAAACCTGCTTCACTCAA 10q D10S520 180 chr10 96424526 96424706 CAGCCTATGCAACAGAACAAG GTCCTTGTGAGAAACTGGATGC 10q D10S523 150 chr10 87006333 87006483 GGTGGAGGTTGTGGTGA AACTGGGCATTTGTCTTTC...”
Molecular Clues for Prediction of Hepatocellular Carcinoma Recurrence After Liver Transplantation.
Badwei, Journal of clinical and experimental hepatology 2023
Role of Allelic Imbalance in Predicting Hepatocellular Carcinoma (HCC) Recurrence Risk After Liver Transplant.
Pagano, Annals of transplantation 2019
- “...Results We report that AI was associated with HCC recurrence in 3 main loci (D3S2303, D9S251, and D9S254). Tumor recurrence was associated only with 2 specific panels with 9 microsatellites previously reported to be associated with high risk for HCC recurrence. Our data show that fractional...”
- “...for D3S2303 (p=0.048) considering the presence of LOH ( Table 3A ), and D1S407 (p=0.006) D9S251 (p=0.02), D1S162 (p=0.005), D5S592 (p=0.005), D9S254 (p=0.002) and D10S520 (p=0.04) considering high-level LOH ( Table 3B ). Evaluation of specific panels and association with HCC recurrence Descriptive analysis of the...”
The C9ORF72 expansion mutation is a common cause of ALS+/-FTD in Europe and has a single founder.
Smith, European journal of human genetics : EJHG 2013
Clinical, neuroimaging and neuropathological features of a new chromosome 9p-linked FTD-ALS family.
Boxer, Journal of neurology, neurosurgery, and psychiatry 2011
- “...Genome-wide linkage analysis conclusively linked family VSM-20 to a 28.3 cM region between D9S1808 and D9S251 on chromosome 9p, reducing the published minimal linked region to a 3.7 Mb interval. Genomic sequencing and expression analysis failed to identify mutations in the 10 known and predicted genes...”
- “...GENESCAN and GENOTYPER software (Applied Biosystems) and normalised to the CEPH genotype database, except for D9S251 and D9S304 for which fragment sizes were not available. Mutation analyses In family VSM-20, a genomic DNA (gDNA) sequencing analysis was performed for all 10 candidate genes located within the...”
Chromosome 9p21 in sporadic amyotrophic lateral sclerosis in the UK and seven other countries: a genome-wide association study.
Shatunov, The Lancet. Neurology 2010
- “...7 this 36 Mb locus is defined across studies by the flanking markers D9S169 and D9S251. The SNPs we have identified lie within this region, with the peak association at 1065 Kb. A GWAS that used pathological subtyping of patients with frontotemporal dementia to increase homogeneity...”
Liver transplantation for hepatocellular carcinoma: extension of indications based on molecular markers.
Schwartz, Journal of hepatology 2008
Use of microsatellite marker loss of heterozygosity in accurate diagnosis of pancreaticobiliary malignancy from brush cytology samples.
Khalid, Gut 2004

CAETHG_1606 corrinoid activation/regeneration protein AcsV from Clostridium autoethanogenum DSM 10061
35% identity, 94% coverage

Deletion of genes linked to the C₁-fixing gene cluster affects growth, by-products, and proteome of Clostridium autoethanogenum
Nwaokorie, Frontiers in bioengineering and biotechnology 2023
- “...Clostridium autoethanogenum. The gene cluster contains 16 genes, including five genes with unconfirmed biochemical functions (CAETHG_1606; 1607; 1612; 1613; 1619). Gene names and annotations based on Valgepea et al., 2022 . Figure created using BioRender.com . Determination of gene functionalities in acetogens through genotype-phenotype studies has...”

Ccar_18775 corrinoid activation/regeneration protein AcsV from Clostridium carboxidivorans P7
34% identity, 94% coverage

Combination of Trace Metal to Improve Solventogenesis of Clostridium carboxidivorans P7 in Syngas Fermentation
Han, Frontiers in microbiology 2020
- “...updated and remapped the WLP gene cluster ( Table 2 , and it ranged from Ccar_18775 to _18845 and contained 15 genes ( Pierce et al., 2008 ; Bruant et al., 2010 ). Formate dehydrogenase (FDH), the enzyme responsible for CO 2 reduction or formate oxidization,...”

AF_0010 ASKHA domain-containing protein from Archaeoglobus fulgidus DSM 4304
37% identity, 99% coverage

A novel methoxydotrophic metabolism discovered in the hyperthermophilic archaeon Archaeoglobus fulgidus
Welte, Environmental microbiology 2021
- “.... Genomic and transcriptomic analysis revealed cobalamin binding protein MtoC (AF_0006) and its activator MtoD (AF_0010), O-demethylase MtoB (AF_0007) and methyl transferase MtoA (AF_0009) to be essential for growth of A. fulgidus on methoxylated aromatic compounds. CoM: coenzyme M, H 4 folate: tetrahydrofolate, CO(III): cobalamin binding...”
- “...VhtACDG (AF_137881), ATP synthase AtpAK (AF_115868), cobalamin binding protein MtoC (AF_0006) and its activator MtoD (AF_0010), O-demethylase MtoB (AF_0007) and methyl transferase MtoA (AF_0009), MFS transporters (AF_0008 & AF_0013). H 4 MPT: tetrahydromethanopterin, MQH 2 : reduced menaquinone (MQ), MFR: methanofuran, Fd: ferredoxin, F 420 H...”

B8R2M5 [Co(II) methylated amine-specific corrinoid protein] reductase (EC 1.16.99.1) from Acetobacterium dehalogenans (see paper)
WP_026395886 ASKHA domain-containing protein from Acetobacterium dehalogenans DSM 11527
34% identity, 99% coverage

Redox potential changes during ATP-dependent corrinoid reduction determined by redox titrations with europium(II)-DTPA
Dürichen, Protein science : a publication of the Protein Society 2019
- “...Germany). 4.2 Production and purification of the recombinant proteins The activating enzyme (AE; Accession number: WP_026395886) and corrinoid protein (CP; Accession number: WP_026394334) of the vanillate O-demethylase of A. dehalogenans were heterologously produced as C-terminal Strep Tag fusions in Escherichia coli BL21 (DE3) as described...”

TepiRe1_0615 corrinoid activation/regeneration protein AcsV from Tepidanaerobacter acetatoxydans Re1
36% identity, 93% coverage

Genome-guided analysis of physiological capacities of Tepidanaerobacter acetatoxydans provides insights into environmental adaptations and syntrophic acetate oxidation
Müller, PloS one 2015
- “...genes encoding putative ferredoxins (TepiRe1_0333, 0615, 2026) were found in the genome. Ferredoxin encoding gene TepiRe1_0615 is part of the WL pathway operon; TepiRe1_0333 was found to be reverse-transcribed close to the second fhs cluster (described above). Six enzymatic activities were predicted to use ferredoxin as...”

CD0730 putative iron-sulfur protein from Clostridium difficile 630
33% identity, 94% coverage

Diverse Energy-Conserving Pathways in Clostridium difficile: Growth in the Absence of Amino Acid Stickland Acceptors and the Role of the Wood-Ljungdahl Pathway
Gencic, Journal of bacteriology 2020 (secret)
Vegetative Cell and Spore Proteomes of Clostridioides difficile Show Finite Differences and Reveal Potential Protein Markers
Abhyankar, Journal of proteome research 2019
- “...C. difficile 630, which reinforces the acetogenic nature of C. difficile growth. Of these, CD3405, CD3407, and CD0730 have been detected only in single replicates and thus are not quantified. The other Wood-Ljungdahl pathway proteins have all been quantified, with only three proteins, MetF (CD0722), CD0728, and CD3258, being highly...”

RAMQ_EUBLI / P0DX10 Corrinoid activation enzyme RamQ from Eubacterium limosum (see 2 papers)
WP_038351871 ASKHA domain-containing protein from Eubacterium limosum
35% identity, 100% coverage

function: Involved in the degradation of the quaternary amines L-proline betaine and L-carnitine (PubMed:31341018, PubMed:32571881). Component of a corrinoid-dependent methyltransferase system that transfers a methyl group from L-proline betaine or L-carnitine to tetrahydrofolate (THF), forming methyl-THF, a key intermediate in the Wood-Ljungdahl acetogenesis pathway (PubMed:31341018, PubMed:32571881). RamQ is not required for the methyl transfer, but it stimulates reduction of reconstituted MtqC from the Co(II) state to the Co(I) state in vitro (PubMed:31341018). It also stimulates the rate of THF methylation (PubMed:32571881).
cofactor: [2Fe-2S] cluster (Binds 1 2Fe-2S cluster.)
MtpB, a member of the MttB superfamily from the human intestinal acetogen Eubacterium limosum, catalyzes proline betaine demethylation
Picking, The Journal of biological chemistry 2019 (secret)

ELI_0370 ASKHA domain-containing protein from Eubacterium callanderi
34% identity, 100% coverage

Cloning, expression, and characterization of a four-component O-demethylase from human intestinal bacterium Eubacterium limosum ZL-II
Chen, Applied microbiology and biotechnology 2016 (PubMed)
- “...including ELI_2003 (MT-I), ELI_2004 (CP), ELI_2005 (MT-II), and ELI_0370 (AE), were confirmed to constitute the O-demethylase in E. limosum ZL-II. The complete...”
- “...was oxidized to [CoII]-CP immediately in vitro, and ELI_0370 (AE) was responsible for catalyzing the reduction of [CoII]-CP to its active form [CoI]-CP. The...”

TepiRe1_0333 ASKHA domain-containing protein from Tepidanaerobacter acetatoxydans Re1
37% identity, 100% coverage

Genome-guided analysis of physiological capacities of Tepidanaerobacter acetatoxydans provides insights into environmental adaptations and syntrophic acetate oxidation
Müller, PloS one 2015
- “...proteins belonging to the family [ 57 ]. Energy conservation Three genes encoding putative ferredoxins (TepiRe1_0333, 0615, 2026) were found in the genome. Ferredoxin encoding gene TepiRe1_0615 is part of the WL pathway operon; TepiRe1_0333 was found to be reverse-transcribed close to the second fhs cluster...”

PGA1_c15200 ATP-dependent reduction of co(II)balamin (RamA-like) (EC:2.1.1.13) from Phaeobacter inhibens DSM 17395
PGA1_c15200 ASKHA domain-containing protein from Phaeobacter inhibens DSM 17395
34% identity, 86% coverage

mutant phenotype: Apparently required for the reactivation of vitamin B12. Distantly related to RamA (see PMID: 19043046) (auxotroph)
Filling gaps in bacterial amino acid biosynthesis pathways with high-throughput genetics
Price, PLoS genetics 2018
- “...are likely to be involved in B12 reactivation: a protein with ferredoxin and DUF4445 domains (PGA1_c15200) and a DUF1638 protein (PGA1_c13340). As shown in Fig 4B , mutants in these genes are rescued by added methionine. The DUF4445 protein is distantly related to RamA, which uses...”

Awo_c10680 corrinoid activation/regeneration protein AcsV from Acetobacterium woodii DSM 1030
34% identity, 94% coverage

A new metabolic trait in an acetogen: Mixed acid fermentation of fructose in a methylene-tetrahydrofolate reductase mutant of Acetobacterium woodii
Moon, Environmental microbiology reports 2023
- “...171,346 29,713 2.52 Awo_c10670 CODH Ni 2+ insertion accessory protein CooC1 ACS/CODH 1000 1997 0.99 Awo_c10680 Corrinoid activation/regeneration protein 4126 7952 0.94 Awo_c10690 Hypothetical protein 1356 2178 0.68 Awo_c10700 Hypothetical protein 933 1748 0.9 Awo_c10710 CFeS protein, SSU AcsD 16,808 36,479 1.11 Awo_c10720 CFeS protein, LSU...”

SMc04347 CONSERVED HYPOTHETICAL PROTEIN from Sinorhizobium meliloti 1021
34% identity, 88% coverage

An integrated approach to functional genomics: construction of a novel reporter gene fusion library for Sinorhizobium meliloti
Cowie, Applied and environmental microbiology 2006
- “...SMc04010: Hypothetical protein; SMc04347: Conserved hypothetical protein; SMb20057...”
sinI- and expR-dependent quorum sensing in Sinorhizobium meliloti
Gao, Journal of bacteriology 2005
- “...CoA, coenzyme A. b SMc03864 SMc03930 SMc03969 SMc03972 SMc03983 SMc03983 SMc04040 SMc04330 SMc04347 33 44 29 47 17 19 58 10 15 31 42 43 a 111 56 70 55 112 96 77...”

Dred_2206 ferredoxin from Desulfotomaculum reducens MI-1
36% identity, 100% coverage

The genome of the Gram-positive metal- and sulfate-reducing bacterium Desulfotomaculum reducens strain MI-1
Junier, Environmental microbiology 2010
- “..., formate, CO as well as various reduced soluble electron transport proteins such as ferredoxin (dred_2206) and electron transfer flavoproteins (EtfAB) (dred_1778-9, dred_1538-9, dred_0572-3 and dred_0367-8) ( Fig. 2 ). Formate is oxidized via formate dehydrogenase (dred_1112-19) that includes a ten-TMH transmembrane subunit (dred_1116) annotated as...”

Dtur_0730 ferredoxin from Dictyoglomus turgidum DSM 6724
35% identity, 98% coverage

The Complete Genome Sequence of Hyperthermophile Dictyoglomus turgidum DSM 6724™ Reveals a Specialized Carbohydrate Fermentor
Brumm, Frontiers in microbiology 2016
- “...ubiquinone is scavenged from the environment, or an alternate electron acceptor is utilized, like ferridoxin (Dtur_0730) or ferredoxin-like proteins (Dtur_0076; Dtur_0457; Dtur_0556; Dtur_0730; Dtur_0774, and Dtur_1717). The proton gradient needed for ATP generation is produced by NADH oxidoreductase (Dtur_0558; Dtur_0559; Dtur_0916; Dtur_0919, and Dtur_1091), and succinate...”

DET0670 iron-sulfur cluster binding protein from Dehalococcoides ethenogenes 195
DET0704 iron-sulfur cluster binding protein from Dehalococcoides ethenogenes 195
33% identity, 94% coverage

Comparative genomics of "Dehalococcoides ethenogenes" 195 and an enrichment culture containing unsequenced "Dehalococcoides" strains
West, Applied and environmental microbiology 2008
- “...DET0318 DET0343 DET0551 DET0666 DET0667 DET0668 DET0669 DET0670 DET0671 DET1158 DET1481 DET1483 DET1484 DET1488 DET1494 DET1559 DET1574 DET1630 Trichloroethene...”
Complete genome sequence of the dehalorespiring bacterium Desulfitobacterium hafniense Y51 and comparison with Dehalococcoides ethenogenes 195
Nonaka, Journal of bacteriology 2006
- “...DET1175 DET1173 DET0516 DET0237 DET0109 DET0110 DET0701 DET0704 DET0699 DET0700 DET1371 DET0104 DET0685 DET0698 DET0926 DET1598 DET1387 DET0416 hydrogenase....”

Dhaf_1265 ferredoxin from Desulfitobacterium hafniense DCB-2
36% identity, 69% coverage

Characterization of an O-demethylase of Desulfitobacterium hafniense DCB-2
Studenik, Journal of bacteriology 2012
- “...components. Expression cassettes for the genes Dhaf_1265, Dhaf_2573, Dhaf_2795, Dhaf_3310, Dhaf_3879, Dhaf_4322, Dhaf_4610, Dhaf_4611, and Dhaf_4612 (GenBank...”
- “...Desulfitobacterium hafniense DCB-2 into pET11aa Gene Primer sequence PCR step Dhaf_1265 CGC GTT CAT ATG AAT CAT TAT CGG CC CTG CGG GTG GCT CCA AGC GCT GCA GAG...”

RSK20926_19267 iron-sulfur cluster-binding protein from Roseobacter sp. SK209-2-6
38% identity, 38% coverage

Filling gaps in bacterial amino acid biosynthesis pathways with high-throughput genetics
Price, PLoS genetics 2018
- “...However in Roseobacter sp. SK209-2-6, the RamA-like protein is split into two proteins (RSK20926_19262 and RSK20926_19267), and we manually classified the RamA-like protein as present in this bacterium. Source code Code for analyzing fitness data and for the Fitness Browser is available at https://bitbucket.org/berkeleylab/feba . Supporting...”

RSK20926_19262 iron-sulfur cluster-binding protein from Roseobacter sp. SK209-2-6
29% identity, 42% coverage

Filling gaps in bacterial amino acid biosynthesis pathways with high-throughput genetics
Price, PLoS genetics 2018
- “...(VIMSS 5050244). However in Roseobacter sp. SK209-2-6, the RamA-like protein is split into two proteins (RSK20926_19262 and RSK20926_19267), and we manually classified the RamA-like protein as present in this bacterium. Source code Code for analyzing fitness data and for the Fitness Browser is available at https://bitbucket.org/berkeleylab/feba...”

DvMF_1398 ATP-dependent reduction of co(II)balamin (RamA-like) from Desulfovibrio vulgaris Miyazaki F
DvMF_1398 iron-sulfur cluster-binding protein, putative from Desulfovibrio vulgaris str. Miyazaki F
25% identity, 84% coverage

mutant phenotype: Cofit with the B12-dependent methionine synthase (DvMF_0476), which lacks a standard domain for the reactivation of vitamin B12.
Filling gaps in bacterial amino acid biosynthesis pathways with high-throughput genetics
Price, PLoS genetics 2018
- “...the standard B12 activation domain. This methionine synthase has a very similar fitness pattern as DvMF_1398, which contains two DUF4445 domains (r = 0.92 across 170 experiments; also see Fig 3 ). We infer that DUF4445 proteins perform the reactivation of vitamin B12 in diverse bacteria....”

DVU0908 ATP-dependent reduction of co(II)balamin from Desulfovibrio vulgaris Hildenborough JW710
27% identity, 69% coverage

mutant phenotype: Important for fitness in most defined media. Semi-automated annotation based on the auxotrophic phenotype and a hit to HMM PF14574.

ramA / B8Y445 [Co(II) methylated amines-specific corrinoid protein] reductase (EC 1.16.99.1) from Methanosarcina barkeri (see 2 papers)
RAMA_METBA / B8Y445 [Co(II) methylated amine-specific corrinoid protein] reductase; Corrinoid activation enzyme RamA; EC 1.16.99.1 from Methanosarcina barkeri (see paper)
B8Y445 [Co(II) methylated amine-specific corrinoid protein] reductase (EC 1.16.99.1) from Methanosarcina barkeri (see paper)
28% identity, 69% coverage

function: Reductase required for the activation of corrinoid-dependent methylamine methyltransferase reactions during methanogenesis (PubMed:19043046). Mediates the ATP-dependent reduction of corrinoid proteins from the inactive cobalt(II) state to the active cobalt(I) state (PubMed:19043046). Acts on the corrinoid proteins involved in methanogenesis from monomethylamine (MMA), dimethylamine (DMA) and trimethylamine (TMA), namely MtmC, MtbC and MttC, respectively (PubMed:19043046).
catalytic activity: 2 Co(II)-[methylamine-specific corrinoid protein] + AH2 + ATP + H2O = 2 Co(I)-[methylamine-specific corrinoid protein] + A + ADP + phosphate + 3 H(+) (RHEA:65816)
catalytic activity: 2 Co(II)-[dimethylamine-specific corrinoid protein] + AH2 + ATP + H2O = 2 Co(I)-[dimethylamine-specific corrinoid protein] + A + ADP + phosphate + 3 H(+) (RHEA:65832)
catalytic activity: 2 Co(II)-[trimethylamine-specific corrinoid protein] + AH2 + ATP + H2O = 2 Co(I)-[trimethylamine-specific corrinoid protein] + A + ADP + phosphate + 3 H(+) (RHEA:65836)
cofactor: [4Fe-4S] cluster (Binds 2 [4Fe-4S] clusters.)
subunit: Monomer.

Dde_2711 2Fe-2S iron-sulfur cluster binding domains protein from Desulfovibrio desulfuricans G20
27% identity, 71% coverage

Filling gaps in bacterial amino acid biosynthesis pathways with high-throughput genetics
Price, PLoS genetics 2018
- “...vitamin B12 reactivation, and we previously proposed that in Desulfovibrio alaskensis , a RamA-like protein (Dde_2711) would be involved in B12 reactivation because it is cofit with MetH (r = 0.90; [ 3 ]). We also found evidence that DUF4445 is involved in the reactivation of...”
Functional genomics with a comprehensive library of transposon mutants for the sulfate-reducing bacterium Desulfovibrio alaskensis G20
Kuehl, mBio 2014
- “...supplementation of minimal medium with methionine also rescued the fitness defects of the uncharacterized genes Dde_2711 and Dde_3007 ( Fig. 4B ). The D. alaskensis G20 MetH is missing the N-terminal activation domain [for reducing Co(II) to Co(I)] that is present in E. coli MetH. To identify this...”
- “...G20, we examined the new methionine auxotrophs identified by our fitness assay and found that Dde_2711 encodes a predicted ferredoxin and has homology to this missing activation domain of E. coli MetH. Dde_3007 encodes a conserved protein annotated as domain of unknown function DUF39. To determine...”

MA0849 hypothetical protein (multi-domain) from Methanosarcina acetivorans C2A
27% identity, 69% coverage

Genetic basis for metabolism of methylated sulfur compounds in Methanosarcina species
Fu, Journal of bacteriology 2015
- “...vs MeSH MA4164 MA4165 MA4166 MA4167 MA1617 MA1616 MA0849 MA3860 MA3861 MA3862 MA3863 MA3864 MA3865 MA3300 MA0859 MA4384 MA4558 MA3302 MA3130 MA0685 MA3736...”
- “...scriptional regulation during growth on methylsulfides. The MA0849 locus, which encodes a protein with homology to the methylamine methyltransferase-activating...”
RamA, a protein required for reductive activation of corrinoid-dependent methylamine methyltransferase reactions in methanogenic archaea
Ferguson, The Journal of biological chemistry 2009
- “...and methylcobamide:CoM methyltransferases (47, 48). A RamA homolog (MA0849) was also found in M. acetivorans adjacent to a gene encoding a methylcobamide:CoM...”

Mmah_1683 4Fe-4S ferredoxin iron-sulfur binding domain protein from Methanohalophilus mahii DSM 5219
27% identity, 69% coverage

The genome sequence of Methanohalophilus mahii SLP(T) reveals differences in the energy metabolism among members of the Methanosarcinaceae inhabiting freshwater and saline environments
Spring, Archaea (Vancouver, B.C.) 2010
- “...activation of the corrinoid protein is catalyzed in methylotrophic methanogens by the iron-sulfur protein RamA (Mmah_1683) [ 49 ]. In Methanosarcina species it was demonstrated that the corrinoid protein and the substrate specific methyltransferase form a tight complex and that the corresponding genes are transcribed in...”

MM1440 conserved protein from Methanosarcina mazei Goe1
27% identity, 69% coverage

RamA, a protein required for reductive activation of corrinoid-dependent methylamine methyltransferase reactions in methanogenic archaea
Ferguson, The Journal of biological chemistry 2009
- “...(45) was examined, which led to identification of MM1440 (GenBankTM NP_633464), whose predicted product differed from the N terminus of the isolated protein by...”
- “...1). Genes were identified in M. mazei (MM1440), M. barkeri Fusaro (Mbar_A0840), and Methanosarcina acetivorans (MA0150) genomes whose predicted gene products...”

MA0150 methylamine methyltransferase corrinoid activation protein from Methanosarcina acetivorans C2A
27% identity, 69% coverage

RamA, a protein required for reductive activation of corrinoid-dependent methylamine methyltransferase reactions in methanogenic archaea
Ferguson, The Journal of biological chemistry 2009
- “...M. barkeri Fusaro (Mbar_A0840), and Methanosarcina acetivorans (MA0150) genomes whose predicted gene products are 93-97% similar to that produced by the...”
Cobalamin- and corrinoid-dependent enzymes
Matthews, Metal ions in life sciences 2009
- “.../hydrogenase [ 81 , 91 ] Methylamine:coenzyme M methyltransferases (5-OH-benzimidazolylcobamide) Methanosarcina barkeri Methanosarcina acetovorans RamA (MA0150) ATP/ [ 26 ] Energy-conserving methyltetrahydromethanopterin: coenzyme M methyltransferase (5-OH-benzimidazolylcobamide) Methanobacterium thermoautotrophicum not identified ATP/reduced ferredoxin [ 92 ] Veratrol:H 4 folate and vanillate:H 4 folate O-methyltransferases (cobalamin) Acetobacterium dehalogenans...”

FKV42_RS10455, WP_154810143 methylamine methyltransferase corrinoid protein reductive activase from Methanolobus vulcani
27% identity, 69% coverage

Several ways one goal-methanogenesis from unconventional substrates
Kurth, Applied microbiology and biotechnology 2020
- “...? Methanolobus vulcani Quaternary amines MtgB methyltransferase FKV42_RS08545 WP_154809802 Corrinoid protein FKV42_RS08550 WP_154809803 Corrinoid activator FKV42_RS10455 WP_154810143 CoM methyltransferase FKV42_RS10480 WP_154810148 For the organisms conducting hydrogen-dependent methylotrophic the enzymes important for energy conversion/recycling of reducing equivalents are shown as those play an important role of the...”
- “...proteomic analysis revealed that MtgB, a corrinoid binding protein (FKV42_RS08550), a corrinoid reductive activation enzyme (FKV42_RS10455) and a methylcorrinoid:CoM methyltransferase (FKV42_RS10480) were highly abundant when M. vulcani B1d was grown on betaine relative to growth on trimethylamine. Energy conservation presumably follows what is known for methylamine...”
- “...Methanolobus vulcani Quaternary amines MtgB methyltransferase FKV42_RS08545 WP_154809802 Corrinoid protein FKV42_RS08550 WP_154809803 Corrinoid activator FKV42_RS10455 WP_154810143 CoM methyltransferase FKV42_RS10480 WP_154810148 For the organisms conducting hydrogen-dependent methylotrophic the enzymes important for energy conversion/recycling of reducing equivalents are shown as those play an important role of the special...”

MM0940 putative Flavoprotein from Methanosarcina mazei Goe1
26% identity, 69% coverage

Quantitative proteomic and microarray analysis of the archaeon Methanosarcina acetivorans grown with acetate versus methanol
Li, Journal of proteome research 2007
- “...of 59 kDa 52, the sequence is 90% identical to MM0940 of M. mazei encoding a 59-kDa putative flavoprotein of unknown function that is also up regulated in...”
- “...cells 11. The results suggest that the product of MM0940 and MA3972 is a core flavoprotein with an unknown function in the acetate pathways of freshwater and...”

MA3972 conserved hypothetical protein from Methanosarcina acetivorans C2A
27% identity, 69% coverage

Quantitative proteomic and microarray analysis of the archaeon Methanosarcina acetivorans grown with acetate versus methanol
Li, Journal of proteome research 2007
- “...pathway. NIH-PA Author Manuscript The protein product of MA3972 was in greater abundance, and expression of the gene up regulated in acetate-grown cells (Table...”
- “...The results suggest that the product of MM0940 and MA3972 is a core flavoprotein with an unknown function in the acetate pathways of freshwater and marine...”

Q8PXZ5 Conserved protein from Methanosarcina mazei (strain ATCC BAA-159 / DSM 3647 / Goe1 / Go1 / JCM 11833 / OCM 88)
MM1071 conserved protein from Methanosarcina mazei Goe1
27% identity, 69% coverage

Mining proteomic data to expose protein modifications in Methanosarcina mazei strain Gö1
Leon, Frontiers in microbiology 2015
- “...Rpl1P 3 62 Q8PY39 MM1025 ThiC 3 58 Q8PXZ6 MM1070 MtaA1 methylcobalamin:CoM methyltransferase 7 374 Q8PXZ5 MM1071 4Fe:4S ferredoxin, hypothetical 2 121 Q8PXZ3 MM1073 MtaC2 methyl corrinoid protein 6 230 Q8PXZ2 MM1074 MtaB2 9 250 Q8PXZ1 MM1075 Putative regulatory protein 2 92 1 Y Q8PXX0 MM1096...”
- “...3 62 Q8PY39 MM1025 ThiC 3 58 Q8PXZ6 MM1070 MtaA1 methylcobalamin:CoM methyltransferase 7 374 Q8PXZ5 MM1071 4Fe:4S ferredoxin, hypothetical 2 121 Q8PXZ3 MM1073 MtaC2 methyl corrinoid protein 6 230 Q8PXZ2 MM1074 MtaB2 9 250 Q8PXZ1 MM1075 Putative regulatory protein 2 92 1 Y Q8PXX0 MM1096 Thermosome,...”
Transcriptional profiling of methyltransferase genes during growth of Methanosarcina mazei on trimethylamine
Krätzer, Journal of bacteriology 2009
- “...(C-terminal domain) MM0174 MM0175 MM0312 MM0408 MM0479 MM0924 MM1071 MM1073 MM1074 MM1075 MM1112 MM1271 MM1272 MM1273 MM1274 MM1275 MM1647 MM1648 MM1761 MM1762...”
RamA, a protein required for reductive activation of corrinoid-dependent methylamine methyltransferase reactions in methanogenic archaea
Ferguson, The Journal of biological chemistry 2009
- “...were found in M. acetivorans (MA4380), M. mazei (mm1071), and M. barkeri (Mbar_A1055). Additionally, other RamA homologs in Methanosarcina spp. were found, but...”
A subset of the diverse COG0523 family of putative metal chaperones is linked to zinc homeostasis in all kingdoms of life
Haas, BMC genomics 2009
- “...( M. mazei COG0523) is induced to the same extent as its neighboring ramM homolog, MM1071 , during growth in high salt conditions (2.38 and 2.21 fold, respectively) [ 104 ]. Archaeal genomes sequenced to date lack any recognizable homolog of the Fur (Fe) or Zur...”
Characterization of a novel bifunctional dihydropteroate synthase/dihydropteroate reductase enzyme from Helicobacter pylori
Levin, Journal of bacteriology 2007
- “...MM512 MM612 MM808 MM847 MM851 MM902 MM1059 MM1060 MM1061 MM1071 Genotype or description 4064 LEVIN ET AL. and E. coli Fre. The reaction mixture was incubated...”

MA4380 conserved hypothetical protein from Methanosarcina acetivorans C2A
27% identity, 69% coverage

RamA, a protein required for reductive activation of corrinoid-dependent methylamine methyltransferase reactions in methanogenic archaea
Ferguson, The Journal of biological chemistry 2009
- “...RamM genes were found in M. acetivorans (MA4380), M. mazei (mm1071), and M. barkeri (Mbar_A1055). Additionally, other RamA homologs in Methanosarcina...”

dsoF / O32433 DMSO monooxygenase reductase component (EC 1.14.13.245) from Acinetobacter sp. (see 4 papers)
O32433 assimilatory dimethylsulfide S-monooxygenase (subunit 1/6) (EC 1.14.13.245) from Acinetobacter sp. (see 2 papers)
36% identity, 20% coverage

DMPP_ACIP2 / Q7WTJ2 Phenol hydroxylase P5 protein; Phenol 2-monooxygenase P5 component; EC 1.14.13.7 from Acinetobacter pittii (strain PHEA-2)
TC 5.B.1.3.2 / Q7WTJ2 Phenol hydroxylase from Acinetobacter calcoaceticus (strain PHEA-2)
35% identity, 20% coverage

function: Catabolizes phenol, and some of its methylated derivatives. P5 is required for growth on phenol, and for in vitro phenol hydroxylase activity (By similarity).
function: Probable electron transfer from NADPH, via FAD and the 2Fe-2S center, to the oxygenase activity site of the enzyme.
catalytic activity: phenol + NADPH + O2 + H(+) = catechol + NADP(+) + H2O (RHEA:17061)
cofactor: FAD (Binds 1 FAD.)
cofactor: [2Fe-2S] cluster (Binds 1 [2Fe-2S] cluster.)
subunit: The multicomponent enzyme phenol hydroxylase is formed by P0, P1, P2, P3, P4 and P5 polypeptides.
substrates: Electrons

WP_226348815 NADH:ubiquinone reductase (Na(+)-transporting) subunit F from Alcaligenes sp. 13f
37% identity, 20% coverage

Genome Characterisation of an Isoprene-Degrading Alcaligenes sp. Isolated from a Tropical Restored Forest
Uttarotai, Biology 2022
- “...IsoH - 23.98 24.29 WP_226348814 Ring-hydroxylating dioxygenase ferredoxin reductase family protein IsoF 30.36 28.97 29.75 WP_226348815 2Fe-2S iron-sulfur cluster binding domain-containing protein IsoF 29.08 29.46 29.68 WP_226348817 Aromatic/alkene/methane monooxygenase hydroxylase/oxygenase subunit alpha IsoA 27.77 28.09 28.36 WP_003800437 MmoB/DmpM family protein IsoD 23.94 26.77 22.54 WP_226348818 Aromatic/alkene...”

P23_2977 NADH:ubiquinone reductase (Na(+)-transporting) subunit F from Acinetobacter calcoaceticus
36% identity, 20% coverage

Draft Genome Sequence of Acinetobacter calcoaceticus Strain P23, a Plant Growth-Promoting Bacterium of Duckweed
Sugawara, Genome announcements 2015
- “...bp). The complete set of genes involved in the phenol degradation pathway, mphRKLMNOP (P23_2971 through P23_2977) and catMBCAIJFD (P23_0869 through P23_0876), was predicted to have high homology (>95% identity by BLASTp) to that of the phenol-degrading bacterium A.calcoaceticus PHEA-2 ( 4 ). However, the genes involved...”

phlF / CAA56745.1 subunit of phenolhydroxylase from Pseudomonas putida (see 2 papers)
34% identity, 20% coverage

A2SI47 Phenol hydrolase reductase from Methylibium petroleiphilum (strain ATCC BAA-1232 / LMG 22953 / PM1)
33% identity, 20% coverage

Microbial Consortia and Mixed Plastic Waste: Pangenomic Analysis Reveals Potential for Degradation of Multiple Plastic Types via Previously Identified PET Degrading Bacteria
Edwards, International journal of molecular sciences 2022
- “...4,4-diaponeurosporenoateglycosyltransferase Bacillus enclensis A0A0V8HPX8 44.05 3.17 10 10 60.5 all Phenol hydrolase reductase Methylibium petroleiphilum A2SI47 41.38 4.97 10 11 61.2 all 2-hydroxy-6-oxo-6-(2-carboxyphenyl)-hexa-2,4-dienoate hydrolase Terrabacter sp. strain DBF63 Q83ZF0 38.46 1.06 10 18 82.4 all Tert-butyl alcohol monooxygenase Aquincola tertiaricarbonis G8FRC5 38.18 1.27 10 4 37.7...”

dmpP / P19734 phenol hydroxylase reductase component (EC 1.14.13.244) from Pseudomonas sp. (strain CF600) (see 9 papers)
DMPP_PSEUF / P19734 Phenol 2-monooxygenase, reductase component DmpP; Phenol 2-monooxygenase P5 component; Phenol hydroxylase P5 protein; EC 1.14.13.244 from Pseudomonas sp. (strain CF600) (see 2 papers)
P19734 phenol 2-monooxygenase (NADH) (subunit 5/6) (EC 1.14.13.244) from Pseudomonas sp. CF600 (see paper)
dmpP phenol hydroxylase P5 protein; EC 1.14.13.7 from Pseudomonas sp. CF600 (see 2 papers)
dmpP / AAA25944.1 phenol hydroxylase from Pseudomonas putida (see paper)
35% identity, 20% coverage

function: Part of a multicomponent enzyme which catalyzes the degradation of phenol and some of its methylated derivatives (PubMed:2254259). DmpP probably transfers electrons from NADH, via FAD and the iron-sulfur center, to the oxygenase component of the complex (PubMed:2254259). Required for growth on phenol and for in vitro phenol hydroxylase activity (PubMed:2254258, PubMed:2254259).
catalytic activity: phenol + NADH + O2 + H(+) = catechol + NAD(+) + H2O (RHEA:57952)
cofactor: FAD (Binds 1 FAD per subunit.)
cofactor: [2Fe-2S] cluster (Binds 1 [2Fe-2S] cluster per subunit.)
subunit: The multicomponent enzyme phenol hydroxylase is formed by DmpL (P1 component), DmpM (P2 component), DmpN (P3 component), DmpO (P4 component) and DmpP (P5 component).
disruption phenotype: Cells lacking this gene cannot grow on phenol.
Purification and identification of trichloroethylene induced proteins from Stenotrophomonas maltophilia PM102 by immuno-affinity-chromatography and MALDI-TOF Mass spectrometry
Mukherjee, SpringerPlus 2013
- “...6.77 Propane monooxygenase from Rhodococcus sp . Q0SJK9 63222.42 5.56 Phenol hydroxylase from Pseudomonas sp. P19734 38477.58 4.79 Competing interests The authors declare that they have no competing interests regarding any of the research work reported in this paper. Authors contribution PM carried out the biochemical...”
Proteogenomic elucidation of the initial steps in the benzene degradation pathway of a novel halophile, Arhodomonas sp. strain Rozel, isolated from a hypersaline environment
Dalvi, Applied and environmental microbiology 2012
- “...P19730 66 2e25 Q9RAF7 77 0 Q5KT19 56 4e35 O84962 64 3e161 P19734 44 3e15 A1K6K5 69 e129 Q1LNR9 50 8e139 Q2W7L9 48 4e49 A1K899 59 2e27 Q49KG4 70 0 G6YS35 a Shown...”
- “...component (P19732), and phenol 2-monooxygenase P5 component (P19734). October 2012 Volume 78 Number 20 aem.asm.org 7313 Downloaded from http://aem.asm.org/ on...”
Epoxyalkane: coenzyme M transferase in the ethene and vinyl chloride biodegradation pathways of Mycobacterium strain JS60
Coleman, Journal of bacteriology 2003
- “...41.8 P27353 BAA07115 DmpP Pseudomonas strain CF600 40.4 P19734 GctB CatJ CAA10043 Organism Incomplete ORF. (amoABCD) of Rhodococcus strain B-276. The sequence...”
Duplicate copies of genes encoding methanesulfonate monooxygenase in Marinosulfonomonas methylotropha strain TR3 and detection of methanesulfonate utilizers in the environment
Baxter, Applied and environmental microbiology 2002
- “...42 42 Pseudomonas sp. strain CF600 P. putida P. putida P19734 Q52126 P23101 OrfX Cyc6 Cyc6 Cytochrome c6 Cytochrome c6 33 32 44 47 E. gracilis M. aeruginosa...”

BT1155 Na+-translocating NADH-quinone reductase subunit from Bacteroides thetaiotaomicron VPI-5482
35% identity, 14% coverage

The NQR Complex Regulates the Immunomodulatory Function of Bacteroides thetaiotaomicron
Engelhart, Journal of immunology (Baltimore, Md. : 1950) 2023
- “...by nanodrop. RNA was confirmed free of DNA contamination by running a qPCR for the BT1155 gene of B. theta (see Supplemental Table 1 ) using an in-house qPCR mix (see below) on the purified RNA prior to cDNA synthesis, and were considered DNA-free when a...”

D8DWB6 Na(+)-translocating NADH-quinone reductase subunit F from Segatella baroniae B14
34% identity, 14% coverage

Occurrence and Function of the Na+-Translocating NADH:Quinone Oxidoreductase in Prevotella spp.
Deusch, Microorganisms 2019
- “...RnfE D8DXV3 37.36 NuoN D8DX02 19.08 NqrE D8DWB7 RnfA D8DXV2 44.50 NuoL A0A1H9A8K0 16.67 NqrF D8DWB6 RnfB D8DXV7 17.34 NuoCD D8DWN9 17.48 microorganisms-07-00117-t004_Table 4 Table 4 Subunits of the NQR, RNF, NDH-I (Nuo), and other respiratory enzymes identified from membranes solubilized with 1% or 2% (...”
- “...reductase, Fe-S pr. 1272.00 63.89 16 16 Triton 1%B D8DWC1 NqrA 1126.93 60.58 26 27 D8DWB6 NqrF 386.31 18.01 7 7 D8DWB9 NqrC 272.89 40.48 7 8 D8DWC0 NqrB 55.66 11.43 3 3 D8DWB8 NqrD 41.65 4.78 2 2 D8DWN8 NuoH 115.46 9.62 4 4 D8DWN7...”

A0A0D0J042 Na(+)-translocating NADH-quinone reductase subunit F from Prevotella pectinovora
37% identity, 14% coverage

Occurrence and Function of the Na+-Translocating NADH:Quinone Oxidoreductase in Prevotella spp.
Deusch, Microorganisms 2019
- “...NqrA 11.67 19.60 5 8 Prevotella ruminicola D5ESF9 NqrC 5.05 12.20 3 3 Prevotella ruminicola A0A0D0J042 NqrF 3.99 11.30 2 3 Prevotella sp. P5-119 R5P524 NqrA 2.00 8.30 1 3 Prevotella sp. CAG:1092 D5ESG0 NqrB 1.67 2.60 1 1 Prevotella ruminicola D1VWD5 NqrA -2.00 7.00 2...”

Q84AQ0 phenol 2-monooxygenase (NADH) (subunit 1/5) (EC 1.14.13.244) from Pseudomonas stutzeri (see paper)
34% identity, 20% coverage

Rmet_1326 NADH:ubiquinone reductase (Na(+)-transporting) subunit F from Cupriavidus metallidurans CH34
Rmet_1326 Oxidoreductase FAD-binding region from Ralstonia metallidurans CH34
31% identity, 22% coverage

The complete genome sequence of Cupriavidus metallidurans strain CH34, a master survivalist in harsh and anthropogenic environments
Janssen, PloS one 2010
- “...island and contains the genes for two BMMs: a phenol hydroxylase, encoded by six genes Rmet_1326 to Rmet_1331, and a benzene-toluene monooxygenase, encoded by six genes Rmet_1311 to Rmet_1316. All genes are present for the meta -cleavage of catechol (by an 2,3-dioxygenase encoded by Rmet_1324), and...”

tomA5 / Q9ANX0 toluene ortho-monooxygenase TomA5 subunit (EC 1.14.13.243) from Burkholderia cepacia (see paper)
Q9ANX0 toluene 2-monooxygenase (subunit 5/6) (EC 1.14.13.243) from Burkholderia cepacia (see 8 papers)
31% identity, 22% coverage

KZ686_09965 NADH:ubiquinone reductase (Na(+)-transporting) subunit F from Cupriavidus cauae
31% identity, 22% coverage

Comparative Genomic Analysis and BTEX Degradation Pathways of a Thermotolerant Cupriavidus cauae PHS1
Sathesh-Prabu, Journal of microbiology and biotechnology 2023
- “...519 Phenol hydroxylase subunit () KZ686_09960 btxE 2220586-2220942 + 58.82 118 Phenol hydroxylase subunit () KZ686_09965 btxF 2221021-2222085 + 57.37 354 Phenol hydroxylase subunit KZ686_09970 btxG 2222088-2222444 + 59.38 118 Ferredoxin (Fn) KZ686_09975 btxH 2222462-2223406 + 55.87 314 Catechol 2,3-dioxygenase (C23O) KZ686_09980 btxI 2223428-2223877 + 61.78...”

pc1533 probable Na(+)-translocating NADH-quinone reductase, chain F from Parachlamydia sp. UWE25
35% identity, 13% coverage

The alternative translational profile that underlies the immune-evasive state of persistence in Chlamydiaceae exploits differential tryptophan contents of the protein repertoire
Lo, Microbiology and molecular biology reviews : MMBR 2012
- “...1.66 1.47 2.15 0.22 1.95 PC0301 PC0300 PC0299 PC0298 PC0095 PC1533 1.73 1.14 0.83 1.79 0.19 1.33 Primary H pump genes CT013 CT014 3.54 2.08 PC1630 PC1629 2.66...”

Slit_1671 ferredoxin from Sideroxydans lithotrophicus ES-1
32% identity, 14% coverage

Comparative genomics of freshwater Fe-oxidizing bacteria: implications for physiology, ecology, and systematics
Emerson, Frontiers in microbiology 2013
- “...thiosulfate (Ghosh and Dam, 2009 ). In the same genomic region are 15 contiguous genes (Slit_1671 Slit_1686) that encode for the alpha and beta-subunits of dissimilatory sulfite reductase, the dsrEFHC genes, and other genes that appear to be involved in lithotrophic S-metabolism (Ghosh and Dam, 2009...”

N6YI82 Phenol 2-monooxygenase from Thauera sp. 63
34% identity, 16% coverage

A benzene-degrading nitrate-reducing microbial consortium displays aerobic and anaerobic benzene degradation pathways
Atashgahi, Scientific reports 2018
- “...P4 subunit Q479F9 78 Dechloromonas aromatica strain RCB dmpP contig-100_2834_2 Azoarcus toluclasticus{92003} 100 Phenol 2-monooxygenase N6YI82 79 Thauera sp. 63 dmpB contig-100_1413_1 Candidatus Kuenenia stuttgartiensis 71 Similar to cysteine dioxygenase type I Q1PVP4 94 Candidatus Kuenenia stuttgartiensis dmpC contig-100_1829_1 Candidatus Kuenenia stuttgartiensis 73 Similar to succinate-semialdehyde...”

CT740 Phenolhydrolase/NADH ubiquinone oxidoreductase from Chlamydia trachomatis D/UW-3/CX
29% identity, 15% coverage

Genomic and phenotypic characterization of in vitro-generated Chlamydia trachomatis recombinants
Jeffrey, BMC microbiology 2013
- “...ORFs CT740-749, resulting in a progeny strain that contains only the C. suis homologs of CT740 through CT749. The results demonstrate that these C. suis sequences can complement any required function of the deleted C. trachomatis genes for growth in vitro. Figure 4 Schematic diagram of...”
- “...crossover sites are shown in black. The deletion of the C. trachomatis homologous region of CT740 to CT749 in the RC-J(s)/122 sequence is indicated by the delta symbol. Nucleotide sequence analysis of the recombinant genomes showed that some of these isolates lacked the chlamydial plasmid (Table...”
The alternative translational profile that underlies the immune-evasive state of persistence in Chlamydiaceae exploits differential tryptophan contents of the protein repertoire
Lo, Microbiology and molecular biology reviews : MMBR 2012
- “...Primary Na pump genes CT278 CT279 CT280 CT281 CT634b CT740 2.08 1.66 1.47 2.15 0.22 1.95 PC0301 PC0300 PC0299 PC0298 PC0095 PC1533 1.73 1.14 0.83 1.79 0.19...”
- “...are homologous throughout. On the other hand, CT278 and CT740 encode proteins that are significantly larger than those encoded by their b1630 and b3844 E. coli...”
Chlamydia trachomatis lacks an adaptive response to changes in carbon source availability
Nicholson, Infection and immunity 2004
- “...1.35 0.04 1.34 0.01 Aerobic CT278 CT279 CT280 CT281 CT634 CT714 CT740 nqr2 nqr3 nqr4 nqr5 nqrA gpdA nqr6 K I D B A E Continued on following page Downloaded from...”

TC0116 NADH:ubiquinone oxidoreductase, beta subunit, putative from Chlamydia muridarum Nigg
27% identity, 13% coverage

Identification of immunodominant antigens by probing a whole Chlamydia trachomatis open reading frame proteome microarray using sera from immunized mice
Cruz-Fisher, Infection and immunity 2011
- “...TC0066 TC0077 TC0078 TC0079 TC0084 TC0093 TC0104 TC0114 TC0116 TC0117 TC0133 TC0136 TC0137 TC0140 TC0144 TC0149 TC0151 TC0153 TC0160 TC0163 TC0166 TC0177 TC0178...”

C6KUI9 Ferredoxin oxidoreductase from bacterium
31% identity, 20% coverage

Arhodomonas sp. strain Seminole and its genetic potential to degrade aromatic compounds under high-salinity conditions
Dalvi, Applied and environmental microbiology 2014
- “...A2SI51 66 6e25 Q9RAF7 77 0 Q5KT19 56 8e35 O84962 66 e136 C6KUI9 44 7e15 A1K6K5 69 e128 Q1LNR9 50 2e39 M2ZC26 48 9e49 A1K899 59 9e29 N6YI61 74 0 I7J281 67 e107...”

CTLon_0109 Na(+)-translocating NADH-quinone reductase subunit F from Chlamydia trachomatis L2b/UCH-1/proctitis
28% identity, 15% coverage

Horizontal transfer of tetracycline resistance among Chlamydia spp. in vitro
Suchland, Antimicrobial agents and chemotherapy 2009
- “...along with 10 neighboring C. suis genes (homologs of CTLon_0109 to CTLon_0118), flanked by the two rrn operons. The upstream (L2 parental) rrn operon was not...”

Q52574 toluene 2-monooxygenase (subunit 1/6) (EC 1.14.13.243) from Pseudomonas sp. (see paper)
tbmF / AAA88461.1 oxidoreductase from Pseudomonas sp (see paper)
32% identity, 20% coverage

5ogxA / A0A076MZ01 Crystal structure of Amycolatopsis cytochrome P450 reductase GcoB (see paper)
31% identity, 19% coverage

Ligands: flavin-adenine dinucleotide; fe2/s2 (inorganic) cluster (5ogxA)

GCOB_AMYS7 / P0DPQ8 Aromatic O-demethylase, reductase subunit; NADH--hemoprotein reductase; EC 1.6.2.- from Amycolatopsis sp. (strain ATCC 39116 / 75iv2) (see paper)
WP_020419854 2Fe-2S iron-sulfur cluster-binding protein from Amycolatopsis sp. ATCC 39116
31% identity, 19% coverage

function: Part of a two-component P450 system that efficiently O- demethylates diverse aromatic substrates such as guaiacol and a wide variety of lignin-derived monomers. Is likely involved in lignin degradation, allowing Amycolatopsis sp. ATCC 39116 to catabolize plant biomass. GcoB transfers electrons from NADH to the cytochrome P450 subunit GcoA. Highly prefers NADH over NADPH as the electron donor.
catalytic activity: 2 oxidized [cytochrome P450] + NADH = 2 reduced [cytochrome P450] + NAD(+) + H(+) (RHEA:57420)
cofactor: FAD (Binds 1 FAD per subunit.)
cofactor: [2Fe-2S] cluster (Binds 1 [2Fe-2S] cluster per subunit.)
subunit: Monomer. Forms a heterodimer with GcoA.
Engineering a Cytochrome P450 for Demethylation of Lignin-Derived Aromatic Aldehydes
Ellis, JACS Au 2021
- “...B, C). 35 , 36 The Amycolatopsis sp. ATCC 39116 GcoAB cytochrome P450 system (WP_020419855, WP_020419854) was therefore engineered for efficient turnover of aromatic aldehydes, specifically targeting p- and o -vanillin: substrates with which GcoAB has little to no native activity. Building from our previous work,...”

CTME_CASD6 / W8X5L3 2Fe-2S ferredoxin CtmE from Castellaniella defragrans (strain DSM 12143 / CCUG 39792 / 65Phen) (Alcaligenes defragrans) (see paper)
39% identity, 14% coverage

function: Involved in the degradation of the cyclic monoterpene limonene (PubMed:24952578). Probably part of an electron transfer system involved in the oxidation of limonene to perillyl alcohol (Probable).
cofactor: [2Fe-2S] cluster (Binds 1 2Fe-2S cluster.)
disruption phenotype: Mutant cannot grow aerobically or anaerobically on limonene, but it can grow on perillyl alcohol or on acetate.

P95461 p-cymene methyl-monooxygenase (EC 1.14.15.25) from Pseudomonas chlororaphis subsp. aureofaciens (see paper)
36% identity, 14% coverage

cymAb / O33457 NADH-ferredoxin reductase (EC 1.18.1.3) from Pseudomonas putida (see 2 papers)
O33457 p-cymene methyl-monooxygenase (subunit 1/2) (EC 1.14.15.25) from Pseudomonas putida (see 2 papers)
cymAb / AAB62300.1 p-cymene monooxygenase reductase subunit from Pseudomonas putida (see 3 papers)
33% identity, 16% coverage

N8H69_24105 CDP-6-deoxy-delta-3,4-glucoseen reductase from Achromobacter spanius
32% identity, 15% coverage

Draft Genome Assembly of Stutzerimonas sp. Strain S1 and Achromobacter spanius Strain S4, Two Syringol-Metabolizing Bacteria Isolated from Compost Soil
Brink, Microbiology resource announcements 2023
- “...sequence identity (locus tags in S1, N8H22_08560 and N8H22_17570; locus tags in S4, N8H69_18620 and N8H69_24105). TABLE1 Results of the average nucleotide identity analysis used for taxonomic classification of strains S1 and S4 a Reference assembly (NCBI URL) ANIb identity (%) ANIb alignment coverage (%) ANIm...”

RHE_RS24270 adenylate/guanylate cyclase domain-containing protein from Rhizobium etli CFN 42
48% identity, 7% coverage

Rhizobium etli CFN42 proteomes showed isoenzymes in free-living and symbiosis with a different transcriptional regulation inferred from a transcriptional regulatory network
Taboada-Castro, Frontiers in microbiology 2022
- “...Bacteroid RHE_RS28210 E3.8.1.2; 2-haloacid dehalogenase [EC:3.8.1.2] K01768 MM RHE_RS18990 E4.6.1.1; adenylate cyclase [EC:4.6.1.1] K01768 MM RHE_RS24270 E4.6.1.1; adenylate cyclase [EC:4.6.1.1] K01768 MM RHE_RS18920 E4.6.1.1; adenylate cyclase [EC:4.6.1.1] K01768 Bacteroid RHE_RS11150 E4.6.1.1; adenylate cyclase [EC:4.6.1.1] K01768 Bacteroid RHE_RS12750 E4.6.1.1; adenylate cyclase [EC:4.6.1.1] K01768 Bacteroid RHE_RS13090 E4.6.1.1; adenylate...”

SMc01818 PUTATIVE ADENYLATE CYCLASE TRANSMEMBRANE PROTEIN from Sinorhizobium meliloti 1021
50% identity, 7% coverage

A signaling complex of adenylate cyclase CyaC of Sinorhizobium meliloti with cAMP and the transcriptional regulators Clr and CycR
Klein, BMC microbiology 2023
- “...a broad significance for the regulation of diverse processes in cell physiology and metabolism. CyaC (SMc01818 protein) of Sinorhizobium (Ensifer) meliloti belongs to the bacterial class III ACs, which are homodimers, are mostly membrane-bound and have a large variation in domain composition [ 3 5 ]....”
- “...proteins of metabolic or signaling pathways are often clustered [ 14 ]. Interestingly, cyaC ( smc01818 ) is preceded by a gene ( smc01819 ) that encodes a putative transcriptional regulator (TR01819) of the TetR family with an N -terminal Helix-Turn-Helix DNA binding motif. TetR regulators...”

3huiA / Q6N2U2 Crystal structure of the mutant A105R of [2Fe-2S] ferredoxin in the class I CYP199A2 system from Rhodopseudomonas palustris (see paper)
36% identity, 12% coverage

Ligand: fe2/s2 (inorganic) cluster (3huiA)

Q6N2U2 2Fe-2S iron-sulfur cluster-binding protein from Rhodopseudomonas palustris (strain ATCC BAA-98 / CGA009)
RPA3956 ferredoxin from Rhodopseudomonas palustris CGA009
36% identity, 12% coverage

Altering glycopeptide antibiotic biosynthesis through mutasynthesis allows incorporation of fluorinated phenylglycine residues.
Voitsekhovskaia, RSC chemical biology 2024
- “...(UniProt protein ID: O52825 and O52816); PuR (UniProt protein ID: Q6N3B2), PuxB (UniProt protein ID: Q6N2U2), Sfp (R4-4 mutant) and M4 and M5 domains 57 of Tcp 11 UniProt protein ID: Q70AZ7) were expressed and purified as previously reported. 13,5760 A-domain characterisation Activation of phenylglycine substrates...”
Thioredoxin Reductase-Type Ferredoxin: NADP⁺ Oxidoreductase of Rhodopseudomonas palustris: Potentiometric Characteristics and Reactions with Nonphysiological Oxidants
Lesanavičius, Antioxidants (Basel, Switzerland) 2022
- “...the other hand, Rp FNR has low reactivity toward Fe 2 S 2 -type ferredoxin (RPA3956), whereas its reactivity toward Fe 4 S 4 -type Fds of R. palustris has not been reported [ 15 ]. In order to extend the understanding of the redox properties...”
Protein recognition in ferredoxin-P450 electron transfer in the class I CYP199A2 system from Rhodopseudomonas palustris
Bell, Journal of biological inorganic chemistry : JBIC : a publication of the Society of Biological Inorganic Chemistry 2010 (PubMed)
- “...(PuR). Another [2Fe-2S] ferredoxin, palustrisredoxin B (PuxB; RPA3956) has been identified in the genome. PuxB shares sequence identity and motifs with...”
- “...studies of palustrisredoxin B (PuxB) encoded by the RPA3956 gene. PuxB shares a high degree of sequence identity with vertebrate-type [2Fe-2S] ferredoxins that...”

JJQ59_20660 NADH:ubiquinone reductase (Na(+)-transporting) subunit F from Cupriavidus necator
34% identity, 16% coverage

Whole Genome Sequence Analysis of Cupriavidus necator C39, a Multiple Heavy Metal(loid) and Antibiotic Resistant Bacterium Isolated from a Gold/Copper Mine
Xie, Microorganisms 2023
- “...EC:1.14.13.244 phenol hydroxylase P4 protein JJQ59_20655 (Chr 2) dmpP K16246 EC:1.14.13.244 phenol hydroxylase P5 protein JJQ59_20660 (Chr 2) benzonitrile NA K01501 EC: 3.5.5.1 nitrilase JJQ59_09680 (Chr 1) benzamide amiE K01426 EC:3.5.1.4 amidase JJQ59_09300 (Chr 1)...”

Q7UWS0 Na(+)-translocating NADH-quinone reductase subunit F from Rhodopirellula baltica (strain DSM 10527 / NCIMB 13988 / SH1)
27% identity, 19% coverage

Bioinformatic analyses of integral membrane transport proteins encoded within the genome of the planctomycetes species, Rhodopirellula baltica
Paparoditis, Biochimica et biophysica acta 2014
- “...10 3.D.5.1.1 Q56582 2 cations Na + Q7UWS3 2 3.D.5.1.1 Q56584 2 cations Na + Q7UWS0 2 3.D.5.1.1 Q56589 6 cations Na + Q7UWS1 6 3.D.5.1.1 Q57095 6 cations Na + Q7UWS2 5 4.D Polysaccharide Synthase/Exporters 4.D.1 Putative Vectorial Glycosyl Polymerization (VGP) Family 4.D.1.1.3 P75905 5...”

PA4331 hypothetical protein from Pseudomonas aeruginosa PAO1
34% identity, 13% coverage

Oxygen-dependent regulation of c-di-GMP synthesis by SadC controls alginate production in Pseudomonas aeruginosa
Schmidt, Environmental microbiology 2016 (PubMed)
- “...we found that the gene products of PA4330 and PA4331, located in a predicted operon with sadC, have a major impact on alginate production: deletion of PA4330...”
- “...production defect under anaerobic conditions, whereas a PA4331 (odaI, for oxygendependent alginate synthesis inhibitor) deletion mutant produced alginate also...”
BIIL 284 reduces neutrophil numbers but increases P. aeruginosa bacteremia and inflammation in mouse lungs
Döring, Journal of cystic fibrosis : official journal of the European Cystic Fibrosis Society 2014
- “...patients with non-CF bronchiectasis and healthy individuals using a qPCR TaqMan assays based on the PA4331 gene, which is ubiquitously present in a collection of 117 isolates from 14 CF patients and therefore a stable marker of P. aeruginosa . A standard curve was used to...”
- “...TaqHotStart DNA polymerase, qPCR Buffer, dNTPs, MgCl 2 and stabilizers) (peQlab, Erlangen, Germany), 0.3M Primer PA4331 forward (5-GTGTTGCAGCCTTTCGATCC3-), 0.3 M Primer PA4331 reverse (5- AACTCCAGCCATGGGTCCTC 3-), 0.3 M qPCR PA4331 Probe (5-FAM GCAGCACCTGCTGCTGTGGA 3-TAM) (Eurofins MWG Operon, Ebersberg, Germany). PCR conditions were 3 min at 95C,...”


For advice on how to use these tools together, see Interactive tools for functional annotation of bacterial genomes.

Statistics

The PaperBLAST database links 793,807 different protein sequences to 1,259,118 scientific articles. Searches against EuropePMC were last performed on March 13 2025.

How It Works

PaperBLAST builds a database of protein sequences that are linked to scientific articles. These links come from automated text searches against the articles in EuropePMC and from manually-curated information from GeneRIF, UniProtKB/Swiss-Prot, BRENDA, CAZy (as made available by dbCAN), BioLiP, CharProtDB, MetaCyc, EcoCyc, TCDB, REBASE, the Fitness Browser, and a subset of the European Nucleotide Archive with the /experiment tag. Given this database and a protein sequence query, PaperBLAST uses protein-protein BLAST to find similar sequences with E < 0.001.
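The E-value cutoff mentioned above can be illustrated with a minimal sketch. It assumes the BLAST hits have already been parsed into simple records; the `Hit` type, its field names, and the example values are hypothetical and are not taken from PaperBLAST's own code.

```python
from dataclasses import dataclass

E_VALUE_CUTOFF = 0.001  # threshold stated in the text: E < 0.001


@dataclass
class Hit:
    """Hypothetical record for one BLAST hit (field names are assumptions)."""
    subject_id: str
    evalue: float


def significant_hits(hits, cutoff=E_VALUE_CUTOFF):
    """Keep hits whose E-value is strictly below the cutoff, best hit first."""
    kept = [h for h in hits if h.evalue < cutoff]
    return sorted(kept, key=lambda h: h.evalue)


# Made-up hits for illustration only.
example = [
    Hit("protA", 1e-50),
    Hit("protB", 0.5),
    Hit("protC", 0.001),  # equal to the cutoff, so it is not kept
    Hit("protD", 2e-4),
]
kept_ids = [h.subject_id for h in significant_hits(example)]
```

The comparison is strict, so a hit with E exactly equal to 0.001 is dropped, which matches the "E < 0.001" wording.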

To build the database, we query EuropePMC with locus tags, with RefSeq protein identifiers, and with UniProt accessions. We obtain the locus tags from RefSeq or from MicrobesOnline. We use queries of the form "locus_tag AND genus_name" to try to ensure that the paper is actually discussing that gene. Because EuropePMC indexes most recent biomedical papers, even if they are not open access, some of the links may be to papers that you cannot read or that our computers cannot read. We query each of these identifiers that appears in the open access part of EuropePMC, as well as every locus tag that appears in the 500 most-referenced genomes, so that a gene may appear in the PaperBLAST results even though none of the papers that mention it are open access. We also incorporate text-mined links from EuropePMC that link open access articles to UniProt or RefSeq identifiers. (This yields some additional links because EuropePMC uses different heuristics for their text mining than we do.)
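As a rough illustration of the "locus_tag AND genus_name" query form, the sketch below builds such a string from a locus tag and an organism name. The helper `build_query` is hypothetical, and taking the genus as the first word of the organism name is an assumption; the exact syntax sent to EuropePMC is not described here.

```python
def build_query(locus_tag: str, organism: str) -> str:
    """Combine a locus tag with the genus name, assumed to be the first word."""
    words = organism.split()
    if not words:
        raise ValueError("organism name is empty")
    genus = words[0]
    return f"{locus_tag} AND {genus}"


# Example using a hit listed on this page: PA4331 from Pseudomonas aeruginosa PAO1.
query = build_query("PA4331", "Pseudomonas aeruginosa PAO1")
```

Requiring the genus alongside the locus tag is what helps exclude papers in which the same string appears with an unrelated meaning.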

For every article that mentions a locus tag, a RefSeq protein identifier, or a UniProt accession, we try to select one or two snippets of text that refer to the protein. If we cannot get access to the full text, we try to select a snippet from the abstract, but unfortunately, unique identifiers such as locus tags are rarely provided in abstracts.
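One simple way to pick "one or two snippets" is to take a window of text around each mention of the identifier. The sketch below does that; it is only an illustration, since the actual selection heuristics are not described here. The function name, the window size, and the case-insensitive matching are all assumptions.

```python
import re


def select_snippets(text, identifier, max_snippets=2, context=60):
    """Return up to max_snippets text windows around mentions of identifier."""
    # Case-insensitive, because papers may write e.g. "mm1071" for MM1071.
    pattern = re.compile(r"\b" + re.escape(identifier) + r"\b", re.IGNORECASE)
    snippets = []
    last_end = -1
    for match in pattern.finditer(text):
        start = max(0, match.start() - context)
        end = min(len(text), match.end() + context)
        if start < last_end:
            # This mention is already covered by the previous snippet.
            continue
        snippets.append("..." + text[start:end] + "...")
        last_end = end
        if len(snippets) == max_snippets:
            break
    return snippets


sentence = ("RamM genes were found in M. acetivorans (MA4380), "
            "M. mazei (mm1071), and M. barkeri (Mbar_A1055).")
found = select_snippets(sentence, "MM1071", context=20)
```

If only an abstract is available, the same function can be applied to the abstract text; it returns an empty list when the identifier is never mentioned.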

PaperBLAST also incorporates manually-curated protein functions:

Proteins from NCBI's RefSeq are included if a GeneRIF entry links the gene to an article in PubMed®. GeneRIF also provides a short summary of the article's claim about the protein, which is shown instead of a snippet.
Proteins from Swiss-Prot (the curated part of UniProt) are included if the curators identified experimental evidence for the protein's function (evidence code ECO:0000269). For these proteins, the fields of the Swiss-Prot entry that describe the protein's function are shown (with bold headings).
Proteins from BRENDA, a curated database of enzymes, are included if they are linked to a paper in PubMed and their full sequence is known.
Every protein from the non-redundant subset of BioLiP, a database of ligand-binding sites and catalytic residues in protein structures, is included. Since BioLiP itself does not include descriptions of the proteins, those are taken from the Protein Data Bank. Descriptions from PDB rely on the original submitter of the structure and cannot be updated by others, so they may be less reliable. (For SitesBLAST and Sites on a Tree, we use a larger subset of BioLiP so that every ligand is represented among a group of structures with similar sequences, but for PaperBLAST, we use the non-redundant set provided by BioLiP.)
Every protein from EcoCyc, a curated database of the proteins in Escherichia coli K-12, is included, regardless of whether it is characterized.
Proteins from the MetaCyc metabolic pathway database are included if they are linked to a paper in PubMed and their full sequence is known.
Proteins from the Transport Classification Database (TCDB) are included if they have known substrate(s), have reference(s), and are not described as uncharacterized or putative. (Some of the references are not visible on the PaperBLAST web site.)
Every protein from CharProtDB, a database of experimentally characterized protein annotations, is included.
Proteins from the CAZy database of carbohydrate-active enzymes are included if they are associated with an Enzyme Classification number. Even though CAZy does not provide links from individual protein sequences to papers, these should all be experimentally-characterized proteins.
Proteins from the REBASE database of restriction enzymes are included if they have known specificity.
Every protein with an evidence-based reannotation (based on mutant phenotypes) in the Fitness Browser is included.
Sequence-specific transcription factors (including sigma factors and DNA-binding response regulators) from the PRODORIC database of gene regulation in prokaryotes are included if they have experimentally-determined DNA binding sites.
Putative transcription factors from RegPrecise are included if they have manually-curated predictions for their binding sites. These predictions are based on conserved putative regulatory sites across genomes that contain similar transcription factors, so PaperBLAST clusters the TFs at 70% identity and retains just one member of each cluster.
Coding sequence (CDS) features from the European Nucleotide Archive (ENA) are included if the /experiment tag is set (implying that there is experimental evidence for the annotation), the nucleotide entry links to paper(s) in PubMed, and the nucleotide entry is from the STD data class (implying that these are targeted annotated sequences, not from shotgun sequencing). Also, to filter out genes whose transcription or translation was detected, but whose function was not studied, nucleotide entries or papers with more than 25 such proteins are excluded. Descriptions from ENA rely on the original submitter of the sequence and cannot be updated by others, so they may be less reliable.
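
The ENA inclusion rules in the last item of this list can be read as a two-stage filter. The Python sketch below is one possible reading, not PaperBLAST's implementation: the `CdsFeature` record and its field names are hypothetical, and dropping a protein when any one of its linked papers exceeds the limit is an assumption.

```python
from collections import Counter
from dataclasses import dataclass, field

MAX_PROTEINS = 25  # entries or papers with more such proteins are excluded


@dataclass
class CdsFeature:
    """Hypothetical record for one CDS feature from an ENA nucleotide entry."""
    protein_id: str
    entry_id: str          # accession of the nucleotide entry
    data_class: str        # for example "STD"
    has_experiment: bool   # True if the /experiment tag is set
    pubmed_ids: list = field(default_factory=list)


def passes_basic_rules(cds):
    """Require the /experiment tag, the STD data class, and a PubMed link."""
    return cds.has_experiment and cds.data_class == "STD" and bool(cds.pubmed_ids)


def filter_cds(features):
    """Apply the basic rules, then drop proteins from crowded entries or papers."""
    candidates = [cds for cds in features if passes_basic_rules(cds)]
    per_entry = Counter(cds.entry_id for cds in candidates)
    per_paper = Counter(
        pmid for cds in candidates for pmid in set(cds.pubmed_ids)
    )
    kept = []
    for cds in candidates:
        if per_entry[cds.entry_id] > MAX_PROTEINS:
            continue
        # Assumption: one over-limit paper is enough to exclude the protein.
        if any(per_paper[pmid] > MAX_PROTEINS for pmid in cds.pubmed_ids):
            continue
        kept.append(cds)
    return kept
```

Counting is done only over proteins that already pass the basic rules, which matches the phrase "more than 25 such proteins" in the list item.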

Except for GeneRIF and ENA, the curated entries include a short curated description of the protein's function. For entries from BioLiP, the protein's function may not be known beyond binding to the ligand. Many of these entries also link to articles in PubMed.

For more information see the PaperBLAST paper (mSystems 2017) or the code. You can download PaperBLAST's database here.

Changes to PaperBLAST since the paper was written:

November 2023: incorporated PRODORIC and RegPrecise. Many PRODORIC entries were not linked to a protein sequence (no UniProt identifier), so we added this information.
February 2023: BioLiP changed their download format. PaperBLAST now includes their non-redundant subset. SitesBLAST and Sites on a Tree use a larger non-redundant subset that ensures that every ligand is represented within each cluster. This should ensure that every binding site is represented.
June 2022: incorporated some coding sequences from ENA with the /experiment tag.
March 2022: incorporated BioLiP.
April 2020: incorporated TCDB.
April 2019: EuropePMC now returns table entries in their search results. This has expanded PaperBLAST's database, but most of the new entries are of low relevance, and the resulting snippets are often just lists of locus tags with annotations.
February 2018: the alignment page reports the conservation of the hit's functional sites (if available from Swiss-Prot or UniProt).
January 2018: incorporated BRENDA.
December 2017: incorporated MetaCyc, CharProtDB, CAZy, REBASE, and the reannotations from the Fitness Browser.
September 2017: EuropePMC no longer returns some table entries in their search results. This has shrunk PaperBLAST's database, but has also reduced the number of low-relevance hits.

Many of these changes are described in Interactive tools for functional annotation of bacterial genomes.

Secrets

PaperBLAST cannot provide snippets for many of the papers that are published in non-open-access journals. This limitation applies even if the paper is marked as "free" on the publisher's web site and is available in PubMed Central or EuropePMC. If a journal that you publish in is marked as "secret," please consider publishing elsewhere.

Omissions from the PaperBLAST Database

Many important articles are missing from PaperBLAST, either because the article's full text is not in EuropePMC (as for many older articles), or because the paper does not mention a protein identifier such as a locus tag, or because of PaperBLAST's heuristics. If you notice an article that characterizes a protein's function but is missing from PaperBLAST, please notify the curators at UniProt or add an entry to GeneRIF. Entries in either of these databases will eventually be incorporated into PaperBLAST. Note that to add an entry to UniProt, you will need to find the UniProt identifier for the protein. If the protein is not already in UniProt, you can ask them to create an entry. To add an entry to GeneRIF, you will need an NCBI Gene identifier, but unfortunately many prokaryotic proteins in RefSeq do not have corresponding Gene identifiers.

References

PaperBLAST: Text-mining papers for information about homologs.
M. N. Price and A. P. Arkin (2017). mSystems, 10.1128/mSystems.00039-17.

Europe PMC in 2017.
M. Levchenko et al (2017). Nucleic Acids Research, 10.1093/nar/gkx1005.

Gene indexing: characterization and analysis of NLM's GeneRIFs.
J. A. Mitchell et al (2003). AMIA Annu Symp Proc 2003:460-464.

UniProt: the universal protein knowledgebase.
The UniProt Consortium (2016). Nucleic Acids Research, 10.1093/nar/gkw1099.

BRENDA in 2017: new perspectives and new tools in BRENDA.
S. Placzek et al (2017). Nucleic Acids Research, 10.1093/nar/gkw952.

The EcoCyc database: reflecting new knowledge about Escherichia coli K-12.
I. M. Keseler et al (2016). Nucleic Acids Research, 10.1093/nar/gkw1003.

The MetaCyc database of metabolic pathways and enzymes.
R. Caspi et al (2018). Nucleic Acids Research, 10.1093/nar/gkx935.

CharProtDB: a database of experimentally characterized protein annotations.
R. Madupu et al (2012). Nucleic Acids Research, 10.1093/nar/gkr1133.

The carbohydrate-active enzymes database (CAZy) in 2013.
V. Lombard et al (2014). Nucleic Acids Research, 10.1093/nar/gkt1178.

The Transporter Classification Database (TCDB): recent advances.
M. H. Saier, Jr. et al (2016). Nucleic Acids Research, 10.1093/nar/gkv1103.

REBASE - a database for DNA restriction and modification: enzymes, genes and genomes.
R. J. Roberts et al (2015). Nucleic Acids Research, 10.1093/nar/gku1046.

Deep annotation of protein function across diverse bacteria from mutant phenotypes.
M. N. Price et al (2016). bioRxiv, 10.1101/072470.

by Morgan Price, Arkin group
Lawrence Berkeley National Laboratory