PaperBLAST
PaperBLAST Hits for reanno::DvH:206336 ATP-dependent reduction of co(II)balamin (Desulfovibrio vulgaris Hildenborough JW710) (543 a.a., MQPHPETTPV...)
>reanno::DvH:206336 ATP-dependent reduction of co(II)balamin (Desulfovibrio vulgaris Hildenborough JW710)
MQPHPETTPVTSCTIIDATDRSIRQTLGETSTLARLIWVDAGLASPPLCSGLARCGRCRV
RITEAAPAPHEDDREFFSAEDISAGWRLACRHAPAHGMVVHVPLPVMPHRHASRPKHPGP
FRLAVDLGTTSLQWSLLAPDGTVAAQGSETNPQMGAGSDVMSRIAMARSDKGRGRLRELV
LQALRRIVADVEGTPATADAVPPAGSGDEPTTACGYESRVEELCVAGNTAMTAILADESV
EGLASAPYRLEMRGGTALALPGLPPAWIPPLPAPFVGGDLSAGYLAVLTDHAPAFPFVLA
DLGTNGEFVLALSPERTLVTSVALGPALEGIGLTFGTVAQRGAITSFTLTPGGLVPYVLD
GGEADGISGTGYISLVHALLRAGLLDVDGRFIQSPSSPLAARMARSIVSHRGEPCLPLAR
GLYLAARDIEEILKVKAAFSLAFERLLATAQMPSHALSGIHLAGALGQHALPADLEGLGF
IPPGSGGRTRAVGNTSLRGAELLLTSPPLRDTLNTWREGCTVVDLTAAPDFSAAFLRHMH
FHF
Found 46 similar proteins in the literature:
DVU0908 ATP-dependent reduction of co(II)balamin from Desulfovibrio vulgaris Hildenborough JW710
100% identity, 100% coverage
- mutant phenotype: Important for fitness in most defined media. Semi-automated annotation based on the auxotrophic phenotype and a hit to HMM PF14574.
Dde_2711 2Fe-2S iron-sulfur cluster binding domains protein from Desulfovibrio desulfuricans G20
47% identity, 95% coverage
DvMF_1398 ATP-dependent reduction of co(II)balamin (RamA-like) from Desulfovibrio vulgaris Miyazaki F
DvMF_1398 iron-sulfur cluster-binding protein, putative from Desulfovibrio vulgaris str. Miyazaki F
54% identity, 63% coverage
- mutant phenotype: Cofit with the B12-dependent methionine synthase (DvMF_0476), which lacks a standard domain for the reactivation of vitamin B12.
- Filling gaps in bacterial amino acid biosynthesis pathways with high-throughput genetics
Price, PLoS genetics 2018 - “...the standard B12 activation domain. This methionine synthase has a very similar fitness pattern as DvMF_1398, which contains two DUF4445 domains (r = 0.92 across 170 experiments; also see Fig 3 ). We infer that DUF4445 proteins perform the reactivation of vitamin B12 in diverse bacteria....”
Dred_2206 ferredoxin from Desulfotomaculum reducens MI-1
27% identity, 91% coverage
AF_0010 ASKHA domain-containing protein from Archaeoglobus fulgidus DSM 4304
26% identity, 70% coverage
- A novel methoxydotrophic metabolism discovered in the hyperthermophilic archaeon Archaeoglobus fulgidus
Welte, Environmental microbiology 2021 - “.... Genomic and transcriptomic analysis revealed cobalamin binding protein MtoC (AF_0006) and its activator MtoD (AF_0010), Odemethylase MtoB (AF_0007) and methyl transferase MtoA (AF_0009) to be essential for growth of A. fulgidus on methoxylated aromatic compounds. CoM: coenzyme M, H 4 folate: tetrahydrofolate, CO(III): cobalamin binding...”
- “...VhtACDG (AF_137881), ATP synthase AtpAK (AF_115868), cobalamin binding protein MtoC (AF_0006) and its activator MtoD (AF_0010), Odemethylase MtoB (AF_0007) and methyl transferase MtoA (AF_0009), MFS transporters (AF_0008 & AF_0013). H 4 MPT: tetrahydromethanopterin, MQH 2 : reduced menaquinone (MQ), MFR: methanofuran, Fd: ferredoxin, F 420 H...”
BP07_RS03235, WP_042685513 ASKHA domain-containing protein from Methermicoccus shengliensis
26% identity, 68% coverage
- Methanogenic archaea use a bacteria-like methyltransferase system to demethoxylate aromatic compounds
Kurth, The ISME journal 2021 - “...and MtoD The gene encoding the corrinoid protein MtoC (BP07_RS03260) and the corrinoid activating enzyme (BP07_RS03235) were amplified from genomic M. shengliensis DNA with primers 3235fw/3235Srev (CTCATATGAGCGTCAGAGTAACGTTCGAGC, CTGCGGCCGCTTATTTTTCGAACTGCGGGTGGCTCCAGCTAGCTGAAGAGAGTTTTTCTCC) and 3260fw/3260Srev (CTCATATGACGGACGTAAGAGAAGAGCTC/CTGCGGCCGCTTATTTTTCGAACTGCGGGTGGCTCCAGCTAGCCTCCACCCCCACCAGAGC) for cloning in expression vector pET-30a inserting an N-terminal Strep tag via the reverse primer....”
- “...plasmid transformation. For production of the corrinoid protein MtoC (BP07_RS03260) and the corrinoid activating enzyme (BP07_RS03235) the plasmids pET-30a_BP07_RS03260 and pET-30a_BP07_RS03235 were used for transformation into E. coli Bl21 (DE3). For protein overexpression, one colony was inoculated in 600ml LB-medium containing 50g/ml kanamycin and incubated at...”
- Several ways one goal-methanogenesis from unconventional substrates
Kurth, Applied microbiology and biotechnology 2020 - “...MtvB O-demethylase BP07_RS03250 WP_042685515 Corrinoid protein BP07_RS03260 WP_042685521 MtrH-like methyltransferase BP07_RS03240 WP_042685937 Corrinoid activation protein BP07_RS03235 WP_042685513 Methanococcoides Tertiary amines ? ? ? Methanolobus vulcani Quaternary amines MtgB methyltransferase FKV42_RS08545 WP_154809802 Corrinoid protein FKV42_RS08550 WP_154809803 Corrinoid activator FKV42_RS10455 WP_154810143 CoM methyltransferase FKV42_RS10480 WP_154810148 For the organisms...”
- “...O-demethylase BP07_RS03250 WP_042685515 Corrinoid protein BP07_RS03260 WP_042685521 MtrH-like methyltransferase BP07_RS03240 WP_042685937 Corrinoid activation protein BP07_RS03235 WP_042685513 Methanococcoides Tertiary amines ? ? ? Methanolobus vulcani Quaternary amines MtgB methyltransferase FKV42_RS08545 WP_154809802 Corrinoid protein FKV42_RS08550 WP_154809803 Corrinoid activator FKV42_RS10455 WP_154810143 CoM methyltransferase FKV42_RS10480 WP_154810148 For the organisms conducting...”
Awo_c10680 corrinoid activation/regeneration protein AcsV from Acetobacterium woodii DSM 1030
26% identity, 66% coverage
SSCH_450007 ASKHA domain-containing protein from Syntrophaceticus schinkii
29% identity, 68% coverage
Dhaf_2573 ferredoxin from Desulfitobacterium hafniense DCB-2
28% identity, 68% coverage
- Characterization of an O-demethylase of Desulfitobacterium hafniense DCB-2
Studenik, Journal of bacteriology 2012 - “...Expression cassettes for the genes Dhaf_1265, Dhaf_2573, Dhaf_2795, Dhaf_3310, Dhaf_3879, Dhaf_4322, Dhaf_4610, Dhaf_4611, and Dhaf_4612 (GenBank accession no....”
- “...GAT CCT TAT TTT TCG AAC TGC GGG TGG C 1 Dhaf_2573 Dhaf_3310 Dhaf_3879 Dhaf_4322 Dhaf_4610 Dhaf_4611 Dhaf_4612 a 2 2 2 2 2 2 2 2 For details, see Materials and...”
Dhaf_1265 ferredoxin from Desulfitobacterium hafniense DCB-2
27% identity, 76% coverage
- Characterization of an O-demethylase of Desulfitobacterium hafniense DCB-2
Studenik, Journal of bacteriology 2012 - “...components. Expression cassettes for the genes Dhaf_1265, Dhaf_2573, Dhaf_2795, Dhaf_3310, Dhaf_3879, Dhaf_4322, Dhaf_4610, Dhaf_4611, and Dhaf_4612 (GenBank...”
- “...Desulfitobacterium hafniense DCB-2 into pET11aa Gene Primer sequence PCR step Dhaf_1265 CGC GTT CAT ATG AAT CAT TAT CGG CC CTG CGG GTG GCT CCA AGC GCT GCA GAG...”
D9S251 Ferredoxin from Thermosediminibacter oceani (strain ATCC BAA-1034 / DSM 16646 / JW/IW-1228P)
29% identity, 68% coverage
- Analytical Validation of Loss of Heterozygosity and Mutation Detection in Pancreatic Fine-Needle Aspirates by Capillary Electrophoresis and Sanger Sequencing.
Timmaraju, Diagnostics (Basel, Switzerland) 2024 - “...150 chr5 119101810 119101960 GGTGTCAACAAAGTAATGTAAAG TGGATACATATTGTTTTCTGCTG 5q D5S615 330 chr5 125163290 125163620 GAGATAGGTAGGTAGGTAGG TCCACAGTGGTAAGAACCAG 9p D9S251 390 chr9 30819368 30819758 TGCATGTTTTATGTGCACTAAC CAATACTTTTTAAGGCTTTGTAGG 9p D9S254 120 chr9 126869098 126869218 TGGGTAATAACTGCCGGAGA GAGGATAAACCTGCTTCACTCAA 10q D10S520 180 chr10 96424526 96424706 CAGCCTATGCAACAGAACAAG GTCCTTGTGAGAAACTGGATGC 10q D10S523 150 chr10 87006333 87006483 GGTGGAGGTTGTGGTGA AACTGGGCATTTGTCTTTC...”
- Molecular Clues for Prediction of Hepatocellular Carcinoma Recurrence After Liver Transplantation.
Badwei, Journal of clinical and experimental hepatology 2023
- Role of Allelic Imbalance in Predicting Hepatocellular Carcinoma (HCC) Recurrence Risk After Liver Transplant.
Pagano, Annals of transplantation 2019 - “...Results We report that AI was associated with HCC recurrence in 3 main loci (D3S2303, D9S251, and D9S254). Tumor recurrence was associated only with 2 specific panels with 9 microsatellites previously reported to be associated with high risk for HCC recurrence. Our data show that fractional...”
- “...for D3S2303 (p=0.048) considering the presence of LOH ( Table 3A ), and D1S407 (p=0.006) D9S251 (p=0.02), D1S162 (p=0.005), D5S592 (p=0.005), D9S254 (p=0.002) and D10S520 (p=0.04) considering high-level LOH ( Table 3B ). Evaluation of specific panels and association with HCC recurrence Descriptive analysis of the...”
- The C9ORF72 expansion mutation is a common cause of ALS+/-FTD in Europe and has a single founder.
Smith, European journal of human genetics : EJHG 2013
- Clinical, neuroimaging and neuropathological features of a new chromosome 9p-linked FTD-ALS family.
Boxer, Journal of neurology, neurosurgery, and psychiatry 2011 - “...Genome-wide linkage analysis conclusively linked family VSM-20 to a 28.3 cM region between D9S1808 and D9S251 on chromosome 9p, reducing the published minimal linked region to a 3.7 Mb interval. Genomic sequencing and expression analysis failed to identify mutations in the 10 known and predicted genes...”
- “...GENESCAN and GENOTYPER software (Applied Biosystems) and normalised to the CEPH genotype database, except for D9S251 and D9S304 for which fragment sizes were not available. Mutation analyses In family VSM-20, a genomic DNA (gDNA) sequencing analysis was performed for all 10 candidate genes located within the...”
- Chromosome 9p21 in sporadic amyotrophic lateral sclerosis in the UK and seven other countries: a genome-wide association study.
Shatunov, The Lancet. Neurology 2010 - “...7 this 36 Mb locus is defined across studies by the flanking markers D9S169 and D9S251. The SNPs we have identified lie within this region, with the peak association at 1065 Kb. A GWAS that used pathological subtyping of patients with frontotemporal dementia to increase homogeneity...”
- Liver transplantation for hepatocellular carcinoma: extension of indications based on molecular markers.
Schwartz, Journal of hepatology 2008
- Use of microsatellite marker loss of heterozygosity in accurate diagnosis of pancreaticobiliary malignancy from brush cytology samples.
Khalid, Gut 2004
Dtox_1273 ferredoxin from Desulfotomaculum acetoxidans DSM 771
27% identity, 67% coverage
Ccar_18775 corrinoid activation/regeneration protein AcsV from Clostridium carboxidivorans P7
26% identity, 65% coverage
Dtur_0730 ferredoxin from Dictyoglomus turgidum DSM 6724
25% identity, 79% coverage
TepiRe1_0615 corrinoid activation/regeneration protein AcsV from Tepidanaerobacter acetatoxydans Re1
28% identity, 67% coverage
3zyyX / Q3ACS2 Reductive activator for corrinoid,iron-sulfur protein (see paper)
27% identity, 67% coverage
- Ligands: fe2/s2 (inorganic) cluster; (r,r)-2,3-butanediol (3zyyX)
CAETHG_1606 corrinoid activation/regeneration protein AcsV from Clostridium autoethanogenum DSM 10061
28% identity, 65% coverage
ELI_0370 ASKHA domain-containing protein from Eubacterium callanderi
25% identity, 74% coverage
CD0730 putative iron-sulfur protein from Clostridium difficile 630
25% identity, 67% coverage
DET0670 iron-sulfur cluster binding protein from Dehalococcoides ethenogenes 195
DET0704 iron-sulfur cluster binding protein from Dehalococcoides ethenogenes 195
29% identity, 60% coverage
TepiRe1_0333 ASKHA domain-containing protein from Tepidanaerobacter acetatoxydans Re1
23% identity, 76% coverage
Dhaf_3879 ferredoxin from Desulfitobacterium hafniense DCB-2
26% identity, 68% coverage
- Characterization of an O-demethylase of Desulfitobacterium hafniense DCB-2
Studenik, Journal of bacteriology 2012 - “...for the genes Dhaf_1265, Dhaf_2573, Dhaf_2795, Dhaf_3310, Dhaf_3879, Dhaf_4322, Dhaf_4610, Dhaf_4611, and Dhaf_4612 (GenBank accession no. CP001336.1) as Strep...”
- “...TAT TTT TCG AAC TGC GGG TGG C 1 Dhaf_2573 Dhaf_3310 Dhaf_3879 Dhaf_4322 Dhaf_4610 Dhaf_4611 Dhaf_4612 a 2 2 2 2 2 2 2 2 For details, see Materials and Methods....”
RAMQ_EUBLI / P0DX10 Corrinoid activation enzyme RamQ from Eubacterium limosum (see 2 papers)
WP_038351871 ASKHA domain-containing protein from Eubacterium limosum
26% identity, 72% coverage
- function: Involved in the degradation of the quaternary amines L- proline betaine and L-carnitine (PubMed:31341018, PubMed:32571881). Component of a corrinoid-dependent methyltransferase system that transfers a methyl group from L-proline betaine or L-carnitine to tetrahydrofolate (THF), forming methyl-THF, a key intermediate in the Wood-Ljungdahl acetogenesis pathway (PubMed:31341018, PubMed:32571881). RamQ is not required for the methyl transfer, but it stimulates reduction of reconstituted MtqC from the Co(II) state to the Co(I) state in vitro (PubMed:31341018). It also stimulates the rate of THF methylation (PubMed:32571881).
cofactor: [2Fe-2S] cluster (Binds 1 2Fe-2S cluster.)
- MtpB, a member of the MttB superfamily from the human intestinal acetogen Eubacterium limosum, catalyzes proline betaine demethylation
Picking, The Journal of biological chemistry 2019 (secret)
Dhaf_3310 ferredoxin from Desulfitobacterium hafniense DCB-2
27% identity, 69% coverage
- Characterization of an O-demethylase of Desulfitobacterium hafniense DCB-2
Studenik, Journal of bacteriology 2012 - “...for the genes Dhaf_1265, Dhaf_2573, Dhaf_2795, Dhaf_3310, Dhaf_3879, Dhaf_4322, Dhaf_4610, Dhaf_4611, and Dhaf_4612 (GenBank accession no. CP001336.1) as...”
- “...CCT TAT TTT TCG AAC TGC GGG TGG C 1 Dhaf_2573 Dhaf_3310 Dhaf_3879 Dhaf_4322 Dhaf_4610 Dhaf_4611 Dhaf_4612 a 2 2 2 2 2 2 2 2 For details, see Materials and...”
Dhaf_2795 ferredoxin from Desulfitobacterium hafniense DCB-2
DSY1650 ferredoxin from Desulfitobacterium hafniense Y51
27% identity, 66% coverage
- Characterization of an O-demethylase of Desulfitobacterium hafniense DCB-2
Studenik, Journal of bacteriology 2012 - “...cassettes for the genes Dhaf_1265, Dhaf_2573, Dhaf_2795, Dhaf_3310, Dhaf_3879, Dhaf_4322, Dhaf_4610, Dhaf_4611, and Dhaf_4612 (GenBank accession no. CP001336.1)...”
- “...on March 3, 2017 by University of California, Berkeley Dhaf_2795 2 Studenik et al. oriented in the reverse direction in comparison to the orientation in the...”
- Complete genome sequence of the dehalorespiring bacterium Desulfitobacterium hafniense Y51 and comparison with Dehalococcoides ethenogenes 195
Nonaka, Journal of bacteriology 2006 - “...DSY0391 DSY0393 DSY1228 DSY1247 DSY1596 DSY1598 DSY1648 DSY1650 DSY1651 DSY1652 DSY1671 DSY1890 DSY2085 DSY2558 DSY2585 DSY3715 DSY4099 DSY4876 Predicted...”
B8R2M5 [Co(II) methylated amine-specific corrinoid protein] reductase (EC 1.16.99.1) from Acetobacterium dehalogenans (see paper)
WP_026395886 ASKHA domain-containing protein from Acetobacterium dehalogenans DSM 11527
25% identity, 71% coverage
Dhaf_4322 ferredoxin from Desulfitobacterium hafniense DCB-2
28% identity, 69% coverage
- Characterization of an O-demethylase of Desulfitobacterium hafniense DCB-2
Studenik, Journal of bacteriology 2012 - “...the genes Dhaf_1265, Dhaf_2573, Dhaf_2795, Dhaf_3310, Dhaf_3879, Dhaf_4322, Dhaf_4610, Dhaf_4611, and Dhaf_4612 (GenBank accession no. CP001336.1) as Strep tag...”
- “...BamHI according to the manufacturer's protocol. For Dhaf_4322, a compatible 3318 jb.asm.org Journal of Bacteriology Downloaded from http://jb.asm.org/ on March...”
SMc04347 CONSERVED HYPOTHETICAL PROTEIN from Sinorhizobium meliloti 1021
24% identity, 61% coverage
PGA1_c15200 ATP-dependent reduction of co(II)balamin (RamA-like) (EC:2.1.1.13) from Phaeobacter inhibens DSM 17395
PGA1_c15200 ASKHA domain-containing protein from Phaeobacter inhibens DSM 17395
27% identity, 60% coverage
- mutant phenotype: Apparently required for the reactivation of vitamin B12. Distantly related to RamA (see PMID: 19043046) (auxotroph)
- Filling gaps in bacterial amino acid biosynthesis pathways with high-throughput genetics
Price, PLoS genetics 2018 - “...are likely to be involved in B12 reactivation: a protein with ferredoxin and DUF4445 domains (PGA1_c15200) and a DUF1638 protein (PGA1_c13340). As shown in Fig 4B , mutants in these genes are rescued by added methionine. The DUF4445 protein is distantly related to RamA, which uses...”
CT740 Phenolhydrolase/NADH ubiquinone oxidoreductase from Chlamydia trachomatis D/UW-3/CX
37% identity, 17% coverage
CTLon_0109 Na(+)-translocating NADH-quinone reductase subunit F from Chlamydia trachomatis L2b/UCH-1/proctitis
37% identity, 17% coverage
ramA / B8Y445 [Co(II) methylated amines-specific corrinoid protein] reductase (EC 1.16.99.1) from Methanosarcina barkeri (see 2 papers)
RAMA_METBA / B8Y445 [Co(II) methylated amine-specific corrinoid protein] reductase; Corrinoid activation enzyme RamA; EC 1.16.99.1 from Methanosarcina barkeri (see paper)
B8Y445 [Co(II) methylated amine-specific corrinoid protein] reductase (EC 1.16.99.1) from Methanosarcina barkeri (see paper)
24% identity, 73% coverage
- function: Reductase required for the activation of corrinoid-dependent methylamine methyltransferase reactions during methanogenesis (PubMed:19043046). Mediates the ATP-dependent reduction of corrinoid proteins from the inactive cobalt(II) state to the active cobalt(I) state (PubMed:19043046). Acts on the corrinoid proteins involved in methanogenesis from monomethylamine (MMA), dimethylamine (DMA) and trimethylamine (TMA), namely MtmC, MtbC and MttC, respectively (PubMed:19043046).
catalytic activity: 2 Co(II)-[methylamine-specific corrinoid protein] + AH2 + ATP + H2O = 2 Co(I)-[methylamine-specific corrinoid protein] + A + ADP + phosphate + 3 H(+) (RHEA:65816)
catalytic activity: 2 Co(II)-[dimethylamine-specific corrinoid protein] + AH2 + ATP + H2O = 2 Co(I)-[dimethylamine-specific corrinoid protein] + A + ADP + phosphate + 3 H(+) (RHEA:65832)
catalytic activity: 2 Co(II)-[trimethylamine-specific corrinoid protein] + AH2 + ATP + H2O = 2 Co(I)-[trimethylamine-specific corrinoid protein] + A + ADP + phosphate + 3 H(+) (RHEA:65836)
cofactor: [4Fe-4S] cluster (Binds 2 [4Fe-4S] clusters.)
subunit: Monomer.
MA3972 conserved hypothetical protein from Methanosarcina acetivorans C2A
26% identity, 59% coverage
TC0116 NADH:ubiquinone oxidoreductase, beta subunit, putative from Chlamydia muridarum Nigg
34% identity, 17% coverage
MM1440 conserved protein from Methanosarcina mazei Goe1
27% identity, 55% coverage
MA0849 hypothetical protein (multi-domain) from Methanosarcina acetivorans C2A
24% identity, 71% coverage
MM0940 putative Flavoprotein from Methanosarcina mazei Goe1
25% identity, 60% coverage
Mmah_1683 4Fe-4S ferredoxin iron-sulfur binding domain protein from Methanohalophilus mahii DSM 5219
25% identity, 71% coverage
MA0150 methylamine methyltransferase corrinoid activation protein from Methanosarcina acetivorans C2A
27% identity, 55% coverage
MA4380 conserved hypothetical protein from Methanosarcina acetivorans C2A
26% identity, 60% coverage
Q8PXZ5 Conserved protein from Methanosarcina mazei (strain ATCC BAA-159 / DSM 3647 / Goe1 / Go1 / JCM 11833 / OCM 88)
MM1071 conserved protein from Methanosarcina mazei Goe1
28% identity, 45% coverage
- Mining proteomic data to expose protein modifications in Methanosarcina mazei strain Gö1
Leon, Frontiers in microbiology 2015 - “...Rpl1P 3 62 Q8PY39 MM1025 ThiC 3 58 Q8PXZ6 MM1070 MtaA1 methylcobalamin:CoM methyltransferase 7 374 Q8PXZ5 MM1071 4Fe:4S ferredoxin, hypothetical 2 121 Q8PXZ3 MM1073 MtaC2 methyl corrinoid protein 6 230 Q8PXZ2 MM1074 MtaB2 9 250 Q8PXZ1 MM1075 Putative regulatory protein 2 92 1 Y Q8PXX0 MM1096...”
- Mining proteomic data to expose protein modifications in Methanosarcina mazei strain Gö1
Leon, Frontiers in microbiology 2015 - “...3 62 Q8PY39 MM1025 ThiC 3 58 Q8PXZ6 MM1070 MtaA1 methylcobalamin:CoM methyltransferase 7 374 Q8PXZ5 MM1071 4Fe:4S ferredoxin, hypothetical 2 121 Q8PXZ3 MM1073 MtaC2 methyl corrinoid protein 6 230 Q8PXZ2 MM1074 MtaB2 9 250 Q8PXZ1 MM1075 Putative regulatory protein 2 92 1 Y Q8PXX0 MM1096 Thermosome,...”
- Transcriptional profiling of methyltransferase genes during growth of Methanosarcina mazei on trimethylamine
Krätzer, Journal of bacteriology 2009 - “...(C-terminal domain) MM0174 MM0175 MM0312 MM0408 MM0479 MM0924 MM1071 MM1073 MM1074 MM1075 MM1112 MM1271 MM1272 MM1273 MM1274 MM1275 MM1647 MM1648 MM1761 MM1762...”
- RamA, a protein required for reductive activation of corrinoid-dependent methylamine methyltransferase reactions in methanogenic archaea
Ferguson, The Journal of biological chemistry 2009 - “...were found in M. acetivorans (MA4380), M. mazei (mm1071), and M. barkeri (Mbar_A1055). Additionally, other RamA homologs in Methanosarcina spp. were found, but...”
- A subset of the diverse COG0523 family of putative metal chaperones is linked to zinc homeostasis in all kingdoms of life
Haas, BMC genomics 2009 - “...( M. mazei COG0523) is induced to the same extent as its neighboring ramM homolog, MM1071 , during growth in high salt conditions (2.38 and 2.21 fold, respectively) [ 104 ]. Archaeal genomes sequenced to date lack any recognizable homolog of the Fur (Fe) or Zur...”
- Characterization of a novel bifunctional dihydropteroate synthase/dihydropteroate reductase enzyme from Helicobacter pylori
Levin, Journal of bacteriology 2007 - “...MM512 MM612 MM808 MM847 MM851 MM902 MM1059 MM1060 MM1061 MM1071 Genotype or description 4064 LEVIN ET AL. and E. coli Fre. The reaction mixture was incubated...”
pc1533 probable Na(+)-translocating NADH-quinone reductase, chain F from Parachlamydia sp. UWE25
35% identity, 17% coverage
For advice on how to use these tools together, see
Interactive tools for functional annotation of bacterial genomes.
The PaperBLAST database links 793,807 different protein sequences to 1,259,118 scientific articles. Searches against EuropePMC were last performed on March 13 2025.
PaperBLAST builds a database of protein sequences that are linked
to scientific articles. These links come from automated text searches
against the articles in EuropePMC
and from manually-curated information from GeneRIF, UniProtKB/Swiss-Prot,
BRENDA,
CAZy (as made available by dbCAN),
BioLiP,
CharProtDB,
MetaCyc,
EcoCyc,
TCDB,
REBASE,
the Fitness Browser,
and a subset of the European Nucleotide Archive with the /experiment tag.
Given this database and a protein sequence query,
PaperBLAST uses protein-protein BLAST
to find similar sequences with E < 0.001.
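The E-value cutoff described above can be illustrated with a minimal sketch. It assumes the BLAST hits are available in the standard 12-column tabular format (BLAST+ `-outfmt 6`); the `Hit` type, the `filter_hits` function, and the way coverage is computed (aligned query span divided by query length) are illustrative assumptions, not PaperBLAST's actual code.

```python
from typing import Iterable, List, NamedTuple


class Hit(NamedTuple):
    subject: str
    percent_identity: float
    query_coverage: float  # percent of the query covered by the alignment
    evalue: float


def filter_hits(tabular_lines: Iterable[str], query_length: int,
                max_evalue: float = 1e-3) -> List[Hit]:
    """Keep hits with E < max_evalue from BLAST tabular (outfmt 6) lines.

    Columns assumed: qseqid sseqid pident length mismatch gapopen
    qstart qend sstart send evalue bitscore.
    """
    hits = []
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        subject = fields[1]
        pident = float(fields[2])
        qstart, qend = int(fields[6]), int(fields[7])
        evalue = float(fields[10])
        if evalue < max_evalue:  # strict inequality, as in "E < 0.001"
            # Assumption: coverage is the aligned query span over query length.
            coverage = 100.0 * (qend - qstart + 1) / query_length
            hits.append(Hit(subject, pident, coverage, evalue))
    return hits
```

For the 543-residue query on this page, a hypothetical alignment spanning query positions 1 to 516 would be reported as about 95% coverage under this definition.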
To build the database, we query EuropePMC with locus tags, with RefSeq protein
identifiers, and with UniProt
accessions. We obtain the locus tags from RefSeq or from MicrobesOnline. We use
queries of the form "locus_tag AND genus_name" to try to ensure that
the paper is actually discussing that gene. Because EuropePMC indexes
most recent biomedical papers, even if they are not open access, some
of the links may be to papers that you cannot read or that our
computers cannot read. We query each of these identifiers that
appears in the open access part of EuropePMC, as well as every locus
tag that appears in the 500 most-referenced genomes, so that a gene
may appear in the PaperBLAST results even though none of the papers
that mention it are open access. We also incorporate text-mined links
from EuropePMC that link open access articles to UniProt or RefSeq
identifiers. (This yields some additional links because EuropePMC
uses different heuristics for their text mining than we do.)
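As a rough illustration of the "locus_tag AND genus_name" query form described above, the sketch below builds such a query string. The function names are hypothetical, and taking the genus to be the first word of the organism name is an assumption; the sketch only constructs the string and does not contact EuropePMC.

```python
def genus_of(organism_name: str) -> str:
    """Return the genus, assumed here to be the first word of the organism name."""
    words = organism_name.split()
    if not words:
        raise ValueError("organism name is empty")
    return words[0]


def build_query(locus_tag: str, organism_name: str) -> str:
    """Build a query of the form 'locus_tag AND genus_name'."""
    return f"{locus_tag} AND {genus_of(organism_name)}"
```

For example, `build_query("DVU0908", "Desulfovibrio vulgaris Hildenborough")` gives `DVU0908 AND Desulfovibrio`. Requiring the genus alongside the locus tag reduces matches where the same string appears in an unrelated context.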
For every article that mentions a locus tag, a RefSeq protein
identifier, or a UniProt accession, we try to select one or two
snippets of text that refer to the protein. If we cannot get access to
the full text, we try to select a snippet from the abstract, but
unfortunately, unique identifiers such as locus tags are rarely
provided in abstracts.
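One simple way to pick snippets like those shown in the results is to take a window of words around each mention of the identifier. The sketch below is only an assumption about how this could be done; the window size, the `select_snippets` name, and the rule of skipping overlapping mentions are all hypothetical and are not taken from PaperBLAST's code.

```python
import re
from typing import List


def select_snippets(text: str, identifier: str,
                    window: int = 10, max_snippets: int = 2) -> List[str]:
    """Return up to max_snippets word windows centered on the identifier."""
    # Match the identifier only when it is not embedded in a longer token,
    # so that "MM1071" does not match inside "MM10715".
    pattern = re.compile(
        r"(?<![A-Za-z0-9_])" + re.escape(identifier) + r"(?![A-Za-z0-9_])"
    )
    words = text.split()
    snippets = []
    last_used = -1  # index of the last word already included in a snippet
    for i, word in enumerate(words):
        if i > last_used and pattern.search(word):
            start = max(0, i - window)
            end = min(len(words), i + window + 1)
            snippets.append("..." + " ".join(words[start:end]) + "...")
            last_used = end - 1
            if len(snippets) == max_snippets:
                break
    return snippets
```

If only the abstract is available, the same function can be applied to the abstract text; it returns an empty list when the identifier does not appear there.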
PaperBLAST also incorporates manually-curated protein functions:
- Proteins from NCBI's RefSeq are included if a
GeneRIF
entry links the gene to an article in
PubMed®.
GeneRIF also provides a short summary of the article's claim about the
protein, which is shown instead of a snippet.
- Proteins from Swiss-Prot (the curated part of UniProt)
are included if the curators
identified experimental evidence for the protein's function (evidence
code ECO:0000269). For these proteins, the fields of the Swiss-Prot entry that
describe the protein's function are shown (with bold headings).
- Proteins from BRENDA,
a curated database of enzymes, are included if they are linked to a paper in PubMed
and their full sequence is known.
- Every protein from the non-redundant subset of
BioLiP,
a database
of ligand-binding sites and catalytic residues in protein structures, is included. Since BioLiP itself
does not include descriptions of the proteins, those are taken from the
Protein Data Bank.
Descriptions from PDB rely on the original submitter of the
structure and cannot be updated by others, so they may be less reliable.
(For SitesBLAST and Sites on a Tree, we use a larger subset of BioLiP so that every
ligand is represented among a group of structures with similar sequences, but for
PaperBLAST, we use the non-redundant set provided by BioLiP.)
- Every protein from EcoCyc, a curated
database of the proteins in Escherichia coli K-12, is included, regardless
of whether it is characterized.
- Proteins from the MetaCyc metabolic pathway database
are included if they are linked to a paper in PubMed and their full sequence is known.
- Proteins from the Transport Classification Database (TCDB)
are included if they have known substrate(s), have reference(s),
and are not described as uncharacterized or putative.
(Some of the references are not visible on the PaperBLAST web site.)
- Every protein from CharProtDB,
a database of experimentally characterized protein annotations, is included.
- Proteins from the CAZy database of carbohydrate-active enzymes
are included if they are associated with an Enzyme Classification number.
Even though CAZy does not provide links from individual protein sequences to papers,
these should all be experimentally-characterized proteins.
- Proteins from the REBASE database
of restriction enzymes are included if they have known specificity.
- Every protein with an evidence-based reannotation (based on mutant phenotypes)
in the Fitness Browser is included.
- Sequence-specific transcription factors (including sigma factors and DNA-binding response regulators)
with experimentally-determined DNA binding sites from the
PRODORIC database of gene regulation in prokaryotes are included.
- Putative transcription factors from RegPrecise
are included if they have manually-curated predictions for their binding sites. These predictions are based on
conserved putative regulatory sites across genomes that contain similar transcription factors,
so PaperBLAST clusters the TFs at 70% identity and retains just one member of each cluster.
- Coding sequence (CDS) features from the
European Nucleotide Archive (ENA)
are included if the /experiment tag is set (implying that there is experimental evidence for the annotation),
the nucleotide entry links to paper(s) in PubMed,
and the nucleotide entry is from the STD data class
(implying that these are targeted annotated sequences, not from shotgun sequencing).
Also, to filter out genes whose transcription or translation was detected, but whose function
was not studied, nucleotide entries or papers with more than 25 such proteins are excluded.
Descriptions from ENA rely on the original submitter of the
sequence and cannot be updated by others, so they may be less reliable.
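The ENA rule above, which excludes nucleotide entries or papers linked to more than 25 proteins, can be sketched as follows. The record layout (protein identifier, nucleotide entry identifier, list of PubMed identifiers) and the function name are assumptions made for illustration; the sketch also assumes that a protein is dropped when none of its papers survive the filter.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Record = Tuple[str, str, List[str]]  # (protein_id, entry_id, pubmed_ids)


def filter_cds(records: List[Record], max_proteins: int = 25) -> List[Record]:
    """Drop proteins from entries or papers linked to too many proteins."""
    per_entry: Dict[str, Set[str]] = defaultdict(set)
    per_paper: Dict[str, Set[str]] = defaultdict(set)
    for protein, entry, papers in records:
        per_entry[entry].add(protein)
        for paper in papers:
            per_paper[paper].add(protein)

    kept = []
    for protein, entry, papers in records:
        if len(per_entry[entry]) > max_proteins:
            continue  # the whole nucleotide entry is excluded
        ok_papers = [p for p in papers if len(per_paper[p]) <= max_proteins]
        if ok_papers:  # keep the protein only if some paper link remains
            kept.append((protein, entry, ok_papers))
    return kept
```

The intent of such a threshold is that a single entry or paper annotating dozens of proteins more likely reports detection of expression than a study of each protein's function.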
Except for GeneRIF and ENA,
the curated entries include a short curated
description of the protein's function.
For entries from BioLiP, the protein's function may not be known beyond binding to the ligand.
Many of these entries also link to articles in PubMed.
For more information see the
PaperBLAST paper (mSystems 2017)
or the code.
You can download PaperBLAST's database here.
Changes to PaperBLAST since the paper was written:
- November 2023: incorporated PRODORIC and RegPrecise. Many PRODORIC entries were not linked to a protein sequence (no UniProt identifier), so we added this information.
- February 2023: BioLiP changed their download format. PaperBLAST now includes their non-redundant subset. SitesBLAST and Sites on a Tree use a larger non-redundant subset that ensures that every ligand is represented within each cluster. This should ensure that every binding site is represented.
- June 2022: incorporated some coding sequences from ENA with the /experiment tag.
- March 2022: incorporated BioLiP.
- April 2020: incorporated TCDB.
- April 2019: EuropePMC now returns table entries in their search results. This has expanded PaperBLAST's database, but most of the new entries are of low relevance, and the resulting snippets are often just lists of locus tags with annotations.
- February 2018: the alignment page reports the conservation of the hit's functional sites (if available from Swiss-Prot or UniProt).
- January 2018: incorporated BRENDA.
- December 2017: incorporated MetaCyc, CharProtDB, CAZy, REBASE, and the reannotations from the Fitness Browser.
- September 2017: EuropePMC no longer returns some table entries in their search results. This has shrunk PaperBLAST's database, but has also reduced the number of low-relevance hits.
Many of these changes are described in Interactive tools for functional annotation of bacterial genomes.
PaperBLAST cannot provide snippets for many of the papers that are
published in non-open-access journals. This limitation applies even if
the paper is marked as "free" on the publisher's web site and is
available in PubmedCentral or EuropePMC. If a journal that you publish
in is marked as "secret," please consider publishing elsewhere.
Many important articles are missing from PaperBLAST, either because
the article's full text is not in EuropePMC (as for many older
articles), or because the paper does not mention a protein identifier such as a locus tag, or because of PaperBLAST's heuristics. If you notice an
article that characterizes a protein's function but is missing from
PaperBLAST, please notify the curators at UniProt
or add an entry to GeneRIF.
Entries in either of these databases will eventually be incorporated
into PaperBLAST. Note that to add an entry to UniProt, you will need
to find the UniProt identifier for the protein. If the protein is not
already in UniProt, you can ask them to create an entry. To add an
entry to GeneRIF, you will need an NCBI Gene identifier, but
unfortunately many prokaryotic proteins in RefSeq do not have
corresponding Gene identifiers.
References
PaperBLAST: Text-mining papers for information about homologs.
M. N. Price and A. P. Arkin (2017). mSystems, 10.1128/mSystems.00039-17.
Europe PMC in 2017.
M. Levchenko et al (2017). Nucleic Acids Research, 10.1093/nar/gkx1005.
Gene indexing: characterization and analysis of NLM's GeneRIFs.
J. A. Mitchell et al (2003). AMIA Annu Symp Proc 2003:460-464.
UniProt: the universal protein knowledgebase.
The UniProt Consortium (2016). Nucleic Acids Research, 10.1093/nar/gkw1099.
BRENDA in 2017: new perspectives and new tools in BRENDA.
S. Placzek et al (2017). Nucleic Acids Research, 10.1093/nar/gkw952.
The EcoCyc database: reflecting new knowledge about Escherichia coli K-12.
I. M. Keseler et al (2016). Nucleic Acids Research, 10.1093/nar/gkw1003.
The MetaCyc database of metabolic pathways and enzymes.
R. Caspi et al (2018). Nucleic Acids Research, 10.1093/nar/gkx935.
CharProtDB: a database of experimentally characterized protein annotations.
R. Madupu et al (2012). Nucleic Acids Research, 10.1093/nar/gkr1133.
The carbohydrate-active enzymes database (CAZy) in 2013.
V. Lombard et al (2014). Nucleic Acids Research, 10.1093/nar/gkt1178.
The Transporter Classification Database (TCDB): recent advances.
M. H. Saier, Jr. et al (2016). Nucleic Acids Research, 10.1093/nar/gkv1103.
REBASE - a database for DNA restriction and modification: enzymes, genes and genomes.
R. J. Roberts et al (2015). Nucleic Acids Research, 10.1093/nar/gku1046.
Deep annotation of protein function across diverse bacteria from mutant phenotypes.
M. N. Price et al (2016). bioRxiv, 10.1101/072470.
by Morgan Price,
Arkin group
Lawrence Berkeley National Laboratory