PaperBLAST
PaperBLAST Hits for VIMSS1744426 lipoprotein, putative (1225 a.a., MKRFLHRVKW...)
>VIMSS1744426 lipoprotein, putative
MKRFLHRVKWPLLLSSIAVSLGIVAVACAQPNSRTIENLFRPSSAFTDKNDGSINATLYK
ALENREGLTQYLTMRLAPVLRNFYEENVDDDIKRNLRTFNTDTDNSFVNQEQNLRNQYRG
DYLVRLQTDILDNTGGNQANWKLRDVNNKIVDDFINKLFTKNFVEYVDKSVGVLSTPLKG
LIENQSNWNNIKIQAKFVDKNKRLRINNDAVYAAIQDKLLDQFVTNENPNLVSRVVFTNE
TPNDGFDNYFNPDLIKSPTPSYQFQVFNKYNQQDNSIKGANGFHILASNLQSYVNTNNKT
IDIPNKFSSDSGGKLLLKASDMFDTFDPSFSAAFIQGYLALQKKSKGAEQTEYTKLEKDK
SIIENFFVENNSAKAAMKSASSSSQTTTVHKTDLAKIFKENETTKSTDVFKGEYQKKFSN
ATSTSNSDSSNNSAIVDLKELKKDNNSQPDLILARGKDGIHLMGVDGGGYYLSESGRDVN
KQKQFLLFRALQTKYGLIDTNTTYDFKLFDEVKKHFDKNRVLFLFNALFKLVDSKESNFL
SFPQFKKFSDSIITVKNELKDLVESQYQQIVFNEVATAENKVALKLAERNQPFIDNERNK
QIWMNGLAAVLPYEQDSKTGHYNELGIYYKDIIDKVSSNTSNSNSNSNSSSSDPFSKKII
DKLKENKKKVEAAVKKHVDELKVSVIPSPQYSQIILVDTKLSSDPRNTSLALNLALNAVL
SSDELQNTIRRDYFVNDDQFKQAIDLDKLTFKNWNSLNNENWNIFKYTYLFDLFQKQANP
SIFGNGVNESSTDNKPKINGVLDSLYNSLNLEERLDSNDLINYYSYLYTVQWLLKDNLKN
LKQNLQAKLSRTTNSFLVWSLASDKDRNNTASQAMSVSSSKSVLVKMANNVASQTNQDFT
KQEQQNPNYVFGSSAYNWTNNKTPTVNSAANDISSLYYTKNNGSSSTSLTLMQKSAQQTN
NQQRRFGFHGIVTNTSSNNLPDAVRNRLFTSFVSQSEKSSSNGGQAQLQSTQSSGSNETI
YKGALFSFGSLTKLIETIDNIPTQAEFDALYNHLTSNLNINVTGVDRSKSLQEQKTNLKN
FANSNFNNTQTVQLKQAQSKTNNSNFNDVFSRFEGYIGTNKTSNYSSYNFLQDNQIYHAV
YAKQINLEDVSMLGSDSLNSTDSNNSKRLDLSLEEFLSTVALEALNPNNQTQAINALIAN
AKNGLVRVGDNRLFSAISSQWVRKF
Found 11 similar proteins in the literature:
Y309_MYCGE / P47551 Uncharacterized lipoprotein MG309 from Mycoplasma genitalium (strain ATCC 33530 / DSM 19775 / NCTC 10195 / G37) (Mycoplasmoides genitalium) (see paper)
MG_309 lipoprotein, putative from Mycoplasma genitalium G37
100% identity, 100% coverage
- disruption phenotype: Probably essential, it was not disrupted in a global transposon mutagenesis study.
- Mycoplasma genitalium infection in the female reproductive system: Diseases and treatment
Yu, Frontiers in microbiology 2023 - “...typing is mainly based on analyzing different short tandem repeats of MgpB alleles MG_191 and MG_309 ( Pineiro et al., 2019 ; Laumen et al., 2021 ; Dumke, 2022 ). Mycoplasma genitalium morphologically presents as a bottle or flask shape, with a rod-like structure at the...”
- Molecular Tools for Typing Mycoplasma pneumoniae and Mycoplasma genitalium
Dumke, Frontiers in microbiology 2022 - “...in cases of treatment failure. Furthermore, analysis of different short tandem repeats in the gene MG_309, which codes for a surface-localized lipoprotein, is suitable for typing ( Ma and Martin, 2004 ; Ma et al., 2008 ; McGowin et al., 2009 ). Combining the number of...”
- “...for p1 , MLV, MLS, and SNP typing in M. pneumoniae and for mgpB and MG_309 typing in M. genitalium ( Table 1 ). Using p1 typing, calculated DIs ( Hunter and Gaston, 1988 ) from selected studies ranged between 0.42 and 0.68. To date, 15...”
- Antibiotic Resistance and Genotypes of Mycoplasma genitalium during a Resistance-Guided Treatment Regime in a German University Hospital
Dumke, Antibiotics (Basel, Switzerland) 2021 - “...between ongoing colonization, re-infection or the development of resistance. In the present study, mgpB and MG_309 types as well as mutations associated with macrolide, quinolone and tetracycline resistance of strains in M . genitalium -positive samples accumulated in the years 2019 and 2020 at a university...”
- “...have sex with men, 74.1% HIV-positive) were included. Twenty-three mgpB types (seven new types), nine MG_309 types and thirty-four mgpB /MG_309 types were identified. The prevalence of mutations associated with macrolide, quinolone and tetracycline resistance was 56.9%, 10.3% and 6.8%, respectively. Despite the fact that many...”
- Lower mgpB diversity in macrolide-resistant Mycoplasma genitalium infecting men visiting two sexually transmitted infection clinics in Montpellier, France
Guiraud, The Journal of antimicrobial chemotherapy 2021 (PubMed)- “...gene and number of tandem repeats in the MG_309 gene. Macrolide and fluoroquinolone resistance were determined. Typing results were compared with antibiotic...”
- “...STs, with ST4 being most prevalent. The mgpB/ MG_309 typing method identified 52 genetic profiles, resulting in a discriminatory index of 0.979. Macrolide and...”
- Genotyping of Mycoplasma genitalium Suggests De Novo Acquisition of Antimicrobial Resistance in Queensland, Australia
Sweeney, Journal of clinical microbiology 2020
- MgpB Types among Mycoplasma genitalium Strains from Men Who Have Sex with Men in Berlin, Germany, 2016-2018
Dumke, Pathogens (Basel, Switzerland) 2019 - “...not demonstrated. Investigation of follow-up samples from 35 patients confirmed the same mgpB and, additionally, MG_309 types in 25 cases. In 10 cases, differences between types in subsequent samples indicated an infection with a genetically different strain in the period between samplings. MgpB / MG_309 typing...”
- “...in different populations or at different locations. Combination of mgpB typing and VNTR in gene MG_309 was described as useful for the investigation of sexual networks [ 8 , 9 ]. In the present study, we analyzed the mgpB types of first and follow-up samples of...”
- Transcriptional regulation of MG_149, an osmoinducible lipoprotein gene from Mycoplasma genitalium
Zhang, Molecular microbiology 2011 - “...et al ., 1999 ). For example, two lipoproteins of M. genitalium , MG_149 and MG_309, have been shown to activate TLR1/2 or TLR2/6 leading to the release of pro-inflammatory cytokines ( Shimizu et al ., 2008b , McGowin et al ., 2009a ). Similarly, three...”
MPN444 conserved hypothetical protein from Mycoplasma pneumoniae M129
58% identity, 92% coverage
- Structural characterization of Mpn444, an essential lipoprotein of Mycoplasma pneumoniae
Keles, 2024
- Inter- and intra-strain variability of tandem repeats in Mycoplasma pneumoniae based on next-generation sequencing data
Zhang, Future microbiology 2017 - “...short peptide repeats in five genes (MPN089, MPN141, MPN444, MPN501 and MPN524). Different levels of intrastrain copy number variation To examine repeat copy...”
- “...AC/AG 182793-182813 AGT 2 3 10-17 5-16 Mpn444 538724-53835 TAC 3 3-6 Mpn13 596741-596804 TATTAATAACTATTCT 16 3-4 Mpn14 608651-608755 TGGACAAAATGGAAGTAAAAA 21...”
- P40 and P90 from Mpn142 are Targets of Multiple Processing Events on the Surface of Mycoplasma pneumoniae
Widjaja, Proteomes 2015 - “...in M. pneumoniae including Mpn142 and several uncharacterized lipoproteins (Mpn052, Mpn284, Mpn288, Mpn376, Mpn400, Mpn408, Mpn444, Mpn456, Mpn474 and Mpn491) are processed at multiple sites [ 95 ]. However, in most instances, precise cleavage events have not been mapped. Previous studies have shown that M. pneumoniae...”
- Mycoplasma genitalium-encoded MG309 activates NF-kappaB via Toll-like receptors 2 and 6 to elicit proinflammatory cytokine secretion from human genital epithelial cells
McGowin, Infection and immunity 2009 - “...A homolog of MG309 exists in Mycoplasma pneumoniae (MPN444), but no studies to our knowledge have addressed antigenicity or a role in inflammation. Similar...”
- Differential expression of lipoprotein genes in Mycoplasma pneumoniae after contact with human lung epithelial cells, and under oxidative and acidic stress
Hallamaa, BMC microbiology 2008 - “...AGTTTCCGCTAGTTCGTTGC GTTTTTGCGGCATCTTCAAT MPN408 ATTCCCATTTCCCTTTCCAC CATTTGAGCACCGTTTTCCT MPN200 TTCCGGTCTCTGTTTCGACT TCTTTTTGTGCGCCCTTACT MPN152 CGATTAATGGACCCGTTTTG TCTTTGCACCGAAGTGACAG 3 MPN436 CCCAGTCAAGGGTTAGGTCA TCTTCGGCAAAGAAAGGAAA MPN444 AACCGAAGTCAAAAACCC GAAGTGTCATCAGCAGCC MPN489 GATGGTAGTTACCCCGCT ACTAAAGCGGCAGATCCT 4 MPN456 AGCTGCACAAAGAAGCAC CTTGAGTGCCGTTACCAC 5 MPN011 AAAGGCATTAGCGATGTTTCA AATGTTTGTCACCTTTGTGGA MPN012 AGAGTGCGGAAAAAGGTGAA AAAGGATCATTGCCTGTTGC MPN411 GGTATTGCGGAACTTGCTTT TCAACTTTCCGCTCCATTTT MPN271 TGCGGATTTTGATTTTGACC GATCAACCTTTCGCTCCATC MPN505 TTTGAAAAGGGCGAATTAGG TGATCAACCTTTCGCTCAAA 6 MPN647 ATGGATCCTTCCCGTTTTTC CCGGGATAAGTTTCTGCAAG MPN646...”
- “...MPN152 1.2 0.677 0.2 0.104 0.2 0.020* 3 MPN436 0.9 0.802 0.4 0.094 0.9 0.836 MPN444 1.3 0.549 0.7 0.240 0.8 0.555 MPN489 1.4 0.286 0.6 0.033* 0.7 0.328 4 MPN456 2.2 0.010* 0.8 0.681 0.5 0.082 5 MPN011 1.6 0.044* 0.4 0.299 0.6 0.316 MPN012...”
- Lipoprotein multigene families in Mycoplasma pneumoniae
Hallamaa, Journal of bacteriology 2006 - “...MG260 MG185 MPN489 P02-orf1300 MPN485 P02_orf316 MPN444 H08_orf1325 MPN442 H08_orf150 MPN436-like ORFo MPN440 H08_orf726 MPN439 H08_orf237 MPN438 H08_orf345...”
- “...Of the three full-length ORFs in family 3, MPN444 and MPN436 were transcribed, while MPN489 was not detected by RT-PCR. Truncated ORFs MPN442, MPN436-like...”
- Re-annotating the Mycoplasma pneumoniae genome sequence: adding value, function and reading frames
Dandekar, Nucleic acids research 2000 - “...PID:g1674089 A05_orf395 MPN096 059.0 PID:g1673709 R02_orf264 MPN444 395.0 PID:g1674078 H08_orf289 MPN108 047.0 PID:g1673696 C09_orf404 MPN448 392.0 PID:g1674075...”
- Transcription in Mycoplasma pneumoniae
Weiner, Nucleic acids research 2000 - “...735 H08_orf591 Low Putative lipoprotein family 4 396 (MPN444) 481 119-485 096 H08_orf1325 Medium Putative lipoprotein family 3 438 (MPN401) 539 611-540 093...”
MGA_0319 hypothetical protein from Mycoplasma gallisepticum str. R(low)
29% identity, 95% coverage
- Mycoplasmosis in Poultry: An Evaluation of Diagnostic Schemes and Molecular Analysis of Egyptian Mycoplasma gallisepticum Strains
Al-Baqir, Pathogens (Basel, Switzerland) 2023 - “...(GTS) analysis of MG surface proteins, such as the gapA , mgc2 , pvpA, and MGA_0319 , was developed by Ferguson et al. [ 15 ]. However, there are few studies and publications dealing with the molecular characterization and sequencing of MG isolates from species aside...”
- Epidemiological investigations and multilocus sequence typing of Mycoplasma gallisepticum collected in China
Wei, Poultry science 2023 - “...genotyping methods have been described, such as sequencing variable surface proteins (mgc2, pvpA, gapA and MGA_0319) or variable intergenic spacer region ( IGSR ) between 23S rRNA and 16S rRNA ( Jiang et al., 2009 ; Sprygin et al., 2010 ). However, these methods showed insufficient...”
- Targeted sequencing analysis of Mycoplasma gallisepticum isolates in chicken layer and breeder flocks in Thailand
Limsatanun, Scientific reports 2022 - “...of partial surface proteins of MG, including the gap A, mgc 2, pvp A and MGA_0319 genes. The multilocus sequence typing scheme (MLST) is a technique that many studies have used and is regarded as the gold standard for bacterial typing 6 , 10 , 11...”
- “...generally feasible. The important genes of MG, including gap A, mgc 2, pvp A and MGA_0319 ( lp ), have been investigated in several epidemiological studies 4 , 18 , 19 . In Thailand, Limsatanun et al. 20 classified MG strains with partial mgc 2 gene...”
- Molecular detection and characterization of Mycoplasma gallisepticum and Mycoplasma synoviae strains in backyard poultry in Italy
Felice, Poultry science 2020 - “...gapA and mgc2 ) and an uncharacterized hypothetical surface lipoprotein-encoding gene (designated coding DNA sequence MGA_0319) (Ferguson et al., 2005 ) was performed. In particular, 590, 824, and 332 bp fragments of the MGA_0319, mgc 2, and gap A genes were amplified according to Ferguson et...”
- “...were pr- mgc 2 positive. All the corresponding GTS amplicons were successfully sequenced, except the MGA_0319 gene from flocks 9 and flock 10 samples. The MG strains analyzed with GTS were named as follows: IT/MG675/ck/16 (flock 1), IT/MG690/ck/16 (flock 2), IT/MG704/ck/16 (flock 3), IT/MG705/ck/16 (flock 4),...”
- Core Genome Multilocus Sequence Typing: a Standardized Approach for Molecular Typing of Mycoplasma gallisepticum
Ghanem, Journal of clinical microbiology 2018 - “...pvpA, and gapA) and one predicted surface protein (MGA_0319) to differentiate between 67 different M. gallisepticum strains and isolates. The total number of...”
- “...the currently used GTS scheme (mgc2, pvpA, gapA, and MGA_0319) compared to the newly developed cgMLST scheme, we extracted the 4 gene targets that are currently...”
- The development and application of a Mycoplasma gallisepticum sequence database
Armour, Avian pathology : journal of the W.V.P.A 2013 (PubMed)- “...spacer region (IGSR), M. gallisepticum cytadhesin 2 (mgc2), MGA_0319 and gapA genetic regions. The DNA sequences of these genotypes were distinct from those of...”
- “...for use in South Africa. The IGSR, mgc2 or MGA_0319 sequences of three South African genotypes were identical to those of the ts-11 vaccine strain,...”
- Characterization of in vivo-acquired resistance to macrolides of Mycoplasma gallisepticum strains isolated from poultry
Gerchman, Veterinary research 2011 - “...was performed by modified GTS analysis [ 11 ]. The pvpA, gapA , and lp (MGA_0319) partial gene sequences were amplified using primers pvpA 4F/3R, gapA 3F/4R, and lp 1F/1R described previously [ 11 ]. However, the mgc2 gene was amplified using primers mgc2 2F/2R, previously...”
- “...Israeli reference strain were submitted to GenBank under the following accession numbers: gapA , JN102573-102623; MGA_0319, JN102624-102674; pvpA , JN113291-113341; mgc2 , JN 13342-113392. Results Antimicrobial susceptibility An overview of the MIC values, MIC 50 , and MIC 90 for tylosin, tilmicosin, and enrofloxacin for the...”
- Use of molecular diversity of Mycoplasma gallisepticum by gene-targeted sequencing (GTS) and random amplified polymorphic DNA (RAPD) analysis for epidemiological studies
Ferguson, Microbiology (Reading, England) 2005 (PubMed)- “...gene designated genome coding DNA sequence (CDS) MGA_0319. The regions of the surface-protein-encoding genes targeted in this analysis were found to...”
- “...(RAPD) analysis. GTS analysis of individual genes, gapA, MGA_0319, mgc2 and pvpA, identified 17, 16, 20 and 22 sequence types, respectively. GTS analysis using...”
MPN436 Mollicute specific lipoprotein, MG307 homolog, from M. genitalium from Mycoplasma pneumoniae M129
26% identity, 79% coverage
- Inflammation-inducing Factors of Mycoplasma pneumoniae
Shimizu, Frontiers in microbiology 2016 - “...MPN369 Hypothetical MPN408 Hypothetical MPN411 Hypothetical MPN415 High affinity transport system protein P37 3 a MPN436 Hypothetical MPN439 Pseudo MPN442 Hypothetical MPN456 Hypothetical MPN459 Hypothetical MPN467 Hypothetical MPN489 Hypothetical MPN506 Hypothetical MPN523 Hypothetical MPN582 Hypothetical MPN585 Hypothetical MPN587 Hypothetical MPN588 Hypothetical MPN590 Hypothetical MPN592 Hypothetical MPN602...”
- Differential expression of lipoprotein genes in Mycoplasma pneumoniae after contact with human lung epithelial cells, and under oxidative and acidic stress
Hallamaa, BMC microbiology 2008 - “...CTAATTTGCTTGGTGCGACA 2 MPN199 AGTTTCCGCTAGTTCGTTGC GTTTTTGCGGCATCTTCAAT MPN408 ATTCCCATTTCCCTTTCCAC CATTTGAGCACCGTTTTCCT MPN200 TTCCGGTCTCTGTTTCGACT TCTTTTTGTGCGCCCTTACT MPN152 CGATTAATGGACCCGTTTTG TCTTTGCACCGAAGTGACAG 3 MPN436 CCCAGTCAAGGGTTAGGTCA TCTTCGGCAAAGAAAGGAAA MPN444 AACCGAAGTCAAAAACCC GAAGTGTCATCAGCAGCC MPN489 GATGGTAGTTACCCCGCT ACTAAAGCGGCAGATCCT 4 MPN456 AGCTGCACAAAGAAGCAC CTTGAGTGCCGTTACCAC 5 MPN011 AAAGGCATTAGCGATGTTTCA AATGTTTGTCACCTTTGTGGA MPN012 AGAGTGCGGAAAAAGGTGAA AAAGGATCATTGCCTGTTGC MPN411 GGTATTGCGGAACTTGCTTT TCAACTTTCCGCTCCATTTT MPN271 TGCGGATTTTGATTTTGACC GATCAACCTTTCGCTCCATC MPN505 TTTGAAAAGGGCGAATTAGG TGATCAACCTTTCGCTCAAA 6 MPN647...”
- “...MPN200 3.3 0.006** 0.4 0.195 0.4 0.037* MPN152 1.2 0.677 0.2 0.104 0.2 0.020* 3 MPN436 0.9 0.802 0.4 0.094 0.9 0.836 MPN444 1.3 0.549 0.7 0.240 0.8 0.555 MPN489 1.4 0.286 0.6 0.033* 0.7 0.328 4 MPN456 2.2 0.010* 0.8 0.681 0.5 0.082 5 MPN011...”
- Lipoprotein multigene families in Mycoplasma pneumoniae
Hallamaa, Journal of bacteriology 2006 - “...synoviae (Fig. 2). BlastX analysis of family 3 ORF MPN436 found 29% (E value, 8e-09) identity to a predicted homolog of the permease component of an...”
- “...MPN438 H08_orf345 MPN437 H08_orf572o MPN436 A05_orf1244 c(592398..596300) c(589030..589980)n c(537762..541739) c(536089..536541) c(535005..536168)...”
Y338_MYCGE / P47580 Uncharacterized lipoprotein MG338 from Mycoplasma genitalium (strain ATCC 33530 / DSM 19775 / NCTC 10195 / G37) (Mycoplasmoides genitalium) (see paper)
MG_338 lipoprotein, putative from Mycoplasma genitalium G37
22% identity, 95% coverage
MPN440 Mollicute specific lipoprotein, MG307 homolog, from M. genitalium from Mycoplasma pneumoniae M129
27% identity, 44% coverage
- SURE editing: combining oligo-recombineering and programmable insertion/deletion of selection markers to efficiently edit the Mycoplasma pneumoniae genome
Piñero-Lambea, Nucleic acids research 2022 - “...Remarkably, our initial attempt to find lox72 reads in the sample corresponding to M129-GP35-PtetCre 1kb mpn440 ::lox scar strain failed, so we refined the search for a mutated version of lox72 that was already revealed by Sanger sequencing ( Supplementary Figure S3 ). In addition, genome...”
- “...Table S6 ). Finally, for the samples corresponding to edits in mpn088 , mpn256 , mpn440 and mpn583 we checked if any of the raw sequenced reads (read length=112 bp) mapped to the Vcre or GentaR sequences (1143and 1488 bp, respectively) by using Bowtie 2 alignment...”
- Lipoprotein multigene families in Mycoplasma pneumoniae
Hallamaa, Journal of bacteriology 2006 - “...H08_orf1325 MPN442 H08_orf150 MPN436-like ORFo MPN440 H08_orf726 MPN439 H08_orf237 MPN438 H08_orf345 MPN437 H08_orf572o MPN436 A05_orf1244 c(592398..596300)...”
- “...MPN485 MPN489 MPN436 MPN436 MPN436 MPN436 MPN436 MPN436 MPN440 MPN440 (93%, (95%, (24%, (77%, (57%, (69%, (74%, (44%, (74%, (69%, 4e-121) 0.0) 9e-54) 4e-59)...”
MPN439 Mollicute specific lipoprotein, MG307 homolog, from M. genitalium, : MPN436 from Mycoplasma pneumoniae M129
30% identity, 20% coverage
- Inflammation-inducing Factors of Mycoplasma pneumoniae
Shimizu, Frontiers in microbiology 2016 - “...MPN408 Hypothetical MPN411 Hypothetical MPN415 High affinity transport system protein P37 3 a MPN436 Hypothetical MPN439 Pseudo MPN442 Hypothetical MPN456 Hypothetical MPN459 Hypothetical MPN467 Hypothetical MPN489 Hypothetical MPN506 Hypothetical MPN523 Hypothetical MPN582 Hypothetical MPN585 Hypothetical MPN587 Hypothetical MPN588 Hypothetical MPN590 Hypothetical MPN592 Hypothetical MPN602 atpF F...”
- Comparative genome analysis of Mycoplasma pneumoniae
Xiao, BMC genomics 2015 - “...12 12 12 13 12 MPN503 Unknown 10 10 10 11 10 10 12 10 MPN439 Unknown 10 10 10 10 10 10 10 10 MPN489 Unknown 10 10 10 10 10 10 10 10 MPN370 Unknown 9 11 10 9 10 9 9 9 MPN048...”
- Lipoprotein multigene families in Mycoplasma pneumoniae
Hallamaa, Journal of bacteriology 2006 - “...H08_orf150 MPN436-like ORFo MPN440 H08_orf726 MPN439 H08_orf237 MPN438 H08_orf345 MPN437 H08_orf572o MPN436 A05_orf1244 c(592398..596300) c(589030..589980)n...”
- “...MPN440 were transcribed polycistronically, whereas truncated ORFs MPN439, MPN438, MPN437, and MPN485 did not yield consistent products in RT-PCR, possibly as...”
MPN489 species specific lipoprotein from Mycoplasma pneumoniae M129
24% identity, 37% coverage
- Inflammation-inducing Factors of Mycoplasma pneumoniae
Shimizu, Frontiers in microbiology 2016 - “...P37 3 a MPN436 Hypothetical MPN439 Pseudo MPN442 Hypothetical MPN456 Hypothetical MPN459 Hypothetical MPN467 Hypothetical MPN489 Hypothetical MPN506 Hypothetical MPN523 Hypothetical MPN582 Hypothetical MPN585 Hypothetical MPN587 Hypothetical MPN588 Hypothetical MPN590 Hypothetical MPN592 Hypothetical MPN602 atpF F 0 F 1 ATP synthase subunit b 2, 6 2...”
- Comparative genome analysis of Mycoplasma pneumoniae
Xiao, BMC genomics 2015 - “...of these variants were found in the MPN413 gene and the rest were found in MPN489. These two genes code for proteins of unknown function. To explore the variable and invariable regions of the M. pneumoniae genome, we identified the genes with the most and least...”
- “...11 10 10 12 10 MPN439 Unknown 10 10 10 10 10 10 10 10 MPN489 Unknown 10 10 10 10 10 10 10 10 MPN370 Unknown 9 11 10 9 10 9 9 9 MPN048 Unknown 10 9 9 10 9 9 10 10 Table...”
- Differential expression of lipoprotein genes in Mycoplasma pneumoniae after contact with human lung epithelial cells, and under oxidative and acidic stress
Hallamaa, BMC microbiology 2008 - “...ATTCCCATTTCCCTTTCCAC CATTTGAGCACCGTTTTCCT MPN200 TTCCGGTCTCTGTTTCGACT TCTTTTTGTGCGCCCTTACT MPN152 CGATTAATGGACCCGTTTTG TCTTTGCACCGAAGTGACAG 3 MPN436 CCCAGTCAAGGGTTAGGTCA TCTTCGGCAAAGAAAGGAAA MPN444 AACCGAAGTCAAAAACCC GAAGTGTCATCAGCAGCC MPN489 GATGGTAGTTACCCCGCT ACTAAAGCGGCAGATCCT 4 MPN456 AGCTGCACAAAGAAGCAC CTTGAGTGCCGTTACCAC 5 MPN011 AAAGGCATTAGCGATGTTTCA AATGTTTGTCACCTTTGTGGA MPN012 AGAGTGCGGAAAAAGGTGAA AAAGGATCATTGCCTGTTGC MPN411 GGTATTGCGGAACTTGCTTT TCAACTTTCCGCTCCATTTT MPN271 TGCGGATTTTGATTTTGACC GATCAACCTTTCGCTCCATC MPN505 TTTGAAAAGGGCGAATTAGG TGATCAACCTTTCGCTCAAA 6 MPN647 ATGGATCCTTCCCGTTTTTC CCGGGATAAGTTTCTGCAAG MPN646 TGAACTGGGCGATAAGGAAG AACAAATTTGAAGCAGGTGGA MPN645...”
- “...3 MPN436 0.9 0.802 0.4 0.094 0.9 0.836 MPN444 1.3 0.549 0.7 0.240 0.8 0.555 MPN489 1.4 0.286 0.6 0.033* 0.7 0.328 4 MPN456 2.2 0.010* 0.8 0.681 0.5 0.082 5 MPN011 1.6 0.044* 0.4 0.299 0.6 0.316 MPN012 1.5 0.119 0.3 0.122 0.6 0.257 MPN411...”
- Lipoprotein multigene families in Mycoplasma pneumoniae
Hallamaa, Journal of bacteriology 2006 - “...MG260 MG260 MG260 MG260 MG260 MG260 MG185 MPN489 P02-orf1300 MPN485 P02_orf316 MPN444 H08_orf1325 MPN442 H08_orf150 MPN436-like ORFo MPN440 H08_orf726 MPN439...”
- “...c(531949..532662) c(530856..531893) c(528920..530638) c(524877..528611) MPN485 MPN489 MPN436 MPN436 MPN436 MPN436 MPN436 MPN436 MPN440 MPN440 (93%, (95%,...”
MPN437 Mollicute specific lipoprotein, MG307 homolog, from M. genitalium from Mycoplasma pneumoniae M129
26% identity, 33% coverage
- Lipoprotein multigene families in Mycoplasma pneumoniae
Hallamaa, Journal of bacteriology 2006 - “...MPN439 H08_orf237 MPN438 H08_orf345 MPN437 H08_orf572o MPN436 A05_orf1244 c(592398..596300) c(589030..589980)n c(537762..541739) c(536089..536541)...”
- “...transcribed polycistronically, whereas truncated ORFs MPN439, MPN438, MPN437, and MPN485 did not yield consistent products in RT-PCR, possibly as a consequence...”
MPN442 species specific lipoprotein from Mycoplasma pneumoniae M129
26% identity, 11% coverage
- Molecular Tools for Typing Mycoplasma pneumoniae and Mycoplasma genitalium
Dumke, Frontiers in microbiology 2022 - “...called MLS typing) or for house-keeping proteins (MPN004, MPN050, MPN168, MPN246, and MPN516), hypothetical lipoproteins (MPN442, MPN582), and the P1 adhesin [(MPN141); Touati et al., 2015 ; SNP typing] were selected. Both methods can be used not only for characterization of isolates but also for investigation...”
- Mycoplasma pneumoniae from the Respiratory Tract and Beyond
Waites, Clinical microbiology reviews 2017 - “...gmk, glpK, rpoB, rplB, the P1 gene, MPN582, and MPN442) were selected according to the extensive analysis of the whole-genome sequences of the clinical strains....”
- Comparison of Mycoplasma pneumoniae Genome Sequences from Strains Isolated from Symptomatic and Asymptomatic Patients
Spuesens, Frontiers in microbiology 2016 (no snippet)
- Inflammation-inducing Factors of Mycoplasma pneumoniae
Shimizu, Frontiers in microbiology 2016 - “...MPN411 Hypothetical MPN415 High affinity transport system protein P37 3 a MPN436 Hypothetical MPN439 Pseudo MPN442 Hypothetical MPN456 Hypothetical MPN459 Hypothetical MPN467 Hypothetical MPN489 Hypothetical MPN506 Hypothetical MPN523 Hypothetical MPN582 Hypothetical MPN585 Hypothetical MPN587 Hypothetical MPN588 Hypothetical MPN590 Hypothetical MPN592 Hypothetical MPN602 atpF F 0 F...”
- Molecular Epidemiology of Mycoplasma pneumoniae: Genotyping Using Single Nucleotide Polymorphisms and SNaPshot Technology
Touati, Journal of clinical microbiology 2015 - “...kinase MPN516 rpoB DNA-directed RNA polymerase subunit beta MPN442 -- Hypothetical lipoprotein MPN168 rplB 50S ribosomal protein L2 MPN141 P1 adhesin gene P1...”
- “...SNPs were located in hypothetical lipoprotein genes (MPN582 and MPN442) (Table 5). The eight SNPs were concatenated, resulting in a total of nine distinct SNP...”
- Lipoprotein multigene families in Mycoplasma pneumoniae
Hallamaa, Journal of bacteriology 2006 - “...MPN489 P02-orf1300 MPN485 P02_orf316 MPN444 H08_orf1325 MPN442 H08_orf150 MPN436-like ORFo MPN440 H08_orf726 MPN439 H08_orf237 MPN438 H08_orf345 MPN437...”
- “...was not detected by RT-PCR. Truncated ORFs MPN442, MPN436-like ORF, and MPN440 were transcribed polycistronically, whereas truncated ORFs MPN439, MPN438,...”
For advice on how to use these tools together, see
Interactive tools for functional annotation of bacterial genomes.
The PaperBLAST database links 793,807 different protein sequences to 1,259,118 scientific articles. Searches against EuropePMC were last performed on March 13 2025.
PaperBLAST builds a database of protein sequences that are linked
to scientific articles. These links come from automated text searches
against the articles in EuropePMC
and from manually-curated information from GeneRIF, UniProtKB/Swiss-Prot,
BRENDA,
CAZy (as made available by dbCAN),
BioLiP,
CharProtDB,
MetaCyc,
EcoCyc,
TCDB,
REBASE,
the Fitness Browser,
and a subset of the European Nucleotide Archive with the /experiment tag.
Given this database and a protein sequence query,
PaperBLAST uses protein-protein BLAST
to find similar sequences with E < 0.001.
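The E-value filter described above can be sketched as follows. This is a minimal illustration, not PaperBLAST's own code: it assumes hits arrive in BLAST's default 12-column tabular layout (qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore), and it assumes that "coverage" means the fraction of the query spanned by the alignment. The names `Hit` and `parse_hits` are hypothetical.

```python
from typing import List, NamedTuple

E_VALUE_CUTOFF = 0.001  # threshold stated in the text (E < 0.001)


class Hit(NamedTuple):
    subject: str
    identity: float  # percent identity reported by BLAST
    coverage: float  # percent of the query covered (assumed definition)
    evalue: float


def parse_hits(tabular: str, query_length: int) -> List[Hit]:
    """Keep hits with E < 0.001 from default tab-separated BLAST output."""
    hits = []
    for line in tabular.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        pident = float(fields[2])
        qstart, qend = int(fields[6]), int(fields[7])
        evalue = float(fields[10])
        if evalue < E_VALUE_CUTOFF:
            coverage = 100.0 * (qend - qstart + 1) / query_length
            hits.append(Hit(fields[1], pident, coverage, evalue))
    return hits


# Invented example rows, for illustration only (not real alignments).
sample = (
    "query\tsubjA\t58.0\t1130\t400\t20\t1\t1127\t5\t1140\t1e-150\t900\n"
    "query\tsubjB\t30.0\t50\t35\t0\t10\t59\t1\t50\t0.5\t25\n"
)
significant = parse_hits(sample, query_length=1225)
```

In this invented sample only the first row passes the cutoff; the second has E = 0.5 and is dropped.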
To build the database, we query EuropePMC with locus tags, with RefSeq protein
identifiers, and with UniProt
accessions. We obtain the locus tags from RefSeq or from MicrobesOnline. We use
queries of the form "locus_tag AND genus_name" to try to ensure that
the paper is actually discussing that gene. Because EuropePMC indexes
most recent biomedical papers, even if they are not open access, some
of the links may be to papers that you cannot read or that our
computers cannot read. We query each of these identifiers that
appears in the open access part of EuropePMC, as well as every locus
tag that appears in the 500 most-referenced genomes, so that a gene
may appear in the PaperBLAST results even though none of the papers
that mention it are open access. We also incorporate text-mined links
from EuropePMC that link open access articles to UniProt or RefSeq
identifiers. (This yields some additional links because EuropePMC
uses different heuristics for their text mining than we do.)
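A query of the form "locus_tag AND genus_name" could be assembled roughly as below. This is a hedged sketch rather than PaperBLAST's actual code: the helper names are hypothetical, and the endpoint and the `query`, `format` and `pageSize` parameters are taken from Europe PMC's public REST search service as an assumption about how such a search might be issued. The code only builds the URL and does not contact the server.

```python
from urllib.parse import urlencode

# Europe PMC's public REST search endpoint (assumed here for illustration).
EUROPE_PMC_SEARCH = "https://www.ebi.ac.uk/europepmc/webservices/rest/search"


def build_query(locus_tag: str, genus: str) -> str:
    """Combine a locus tag with the genus name, as described in the text."""
    return f"{locus_tag} AND {genus}"


def build_search_url(locus_tag: str, genus: str, page_size: int = 25) -> str:
    """Return a search URL for the combined query (no request is sent)."""
    params = {
        "query": build_query(locus_tag, genus),
        "format": "json",
        "pageSize": page_size,
    }
    return f"{EUROPE_PMC_SEARCH}?{urlencode(params)}"


url = build_search_url("MG_309", "Mycoplasma")
```

Requiring the genus name alongside the locus tag is what reduces false matches for identifiers that also occur in unrelated contexts.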
For every article that mentions a locus tag, a RefSeq protein
identifier, or a UniProt accession, we try to select one or two
snippets of text that refer to the protein. If we cannot get access to
the full text, we try to select a snippet from the abstract, but
unfortunately, unique identifiers such as locus tags are rarely
provided in abstracts.
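Snippet selection of this kind might look like the sketch below. The window size, the whole-identifier matching rule and the function name `select_snippets` are assumptions made for illustration; PaperBLAST's real heuristics are not described here and may differ.

```python
import re
from typing import List


def select_snippets(text: str, identifier: str,
                    max_snippets: int = 2, context: int = 60) -> List[str]:
    """Return up to max_snippets text windows around whole-identifier matches."""
    # Require that the identifier is not embedded in a longer token,
    # so that "MG_309" does not match inside "MG_3090".
    pattern = re.compile(
        r"(?<![A-Za-z0-9_])" + re.escape(identifier) + r"(?![A-Za-z0-9_])"
    )
    snippets = []
    last_end = -1
    for match in pattern.finditer(text):
        start = max(0, match.start() - context)
        end = min(len(text), match.end() + context)
        if start < last_end:
            continue  # this mention is already inside the previous window
        snippets.append("..." + text[start:end].strip() + "...")
        last_end = end
        if len(snippets) == max_snippets:
            break
    return snippets


# Invented text for illustration only.
article = (
    "Two lipoproteins, MG_149 and MG_309, activate TLR2. "
    + "x" * 200
    + " MG_3090 is unrelated."
)
found = select_snippets(article, "MG_309")
```

When only the abstract is available, the same function would simply be given the abstract text, and it returns an empty list if the identifier never appears.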
PaperBLAST also incorporates manually-curated protein functions:
- Proteins from NCBI's RefSeq are included if a
GeneRIF
entry links the gene to an article in
PubMed®.
GeneRIF also provides a short summary of the article's claim about the
protein, which is shown instead of a snippet.
- Proteins from Swiss-Prot (the curated part of UniProt)
are included if the curators
identified experimental evidence for the protein's function (evidence
code ECO:0000269). For these proteins, the fields of the Swiss-Prot entry that
describe the protein's function are shown (with bold headings).
- Proteins from BRENDA,
a curated database of enzymes, are included if they are linked to a paper in PubMed
and their full sequence is known.
- Every protein from the non-redundant subset of
BioLiP,
a database
of ligand-binding sites and catalytic residues in protein structures, is included. Since BioLiP itself
does not include descriptions of the proteins, those are taken from the
Protein Data Bank.
Descriptions from PDB rely on the original submitter of the
structure and cannot be updated by others, so they may be less reliable.
(For SitesBLAST and Sites on a Tree, we use a larger subset of BioLiP so that every
ligand is represented among a group of structures with similar sequences, but for
PaperBLAST, we use the non-redundant set provided by BioLiP.)
- Every protein from EcoCyc, a curated
database of the proteins in Escherichia coli K-12, is included, regardless
of whether it is characterized or not.
- Proteins from the MetaCyc metabolic pathway database
are included if they are linked to a paper in PubMed and their full sequence is known.
- Proteins from the Transporter Classification Database (TCDB)
are included if they have known substrate(s), have reference(s),
and are not described as uncharacterized or putative.
(Some of the references are not visible on the PaperBLAST web site.)
- Every protein from CharProtDB,
a database of experimentally characterized protein annotations, is included.
- Proteins from the CAZy database of carbohydrate-active enzymes
are included if they are associated with an Enzyme Classification number.
Even though CAZy does not provide links from individual protein sequences to papers,
these should all be experimentally-characterized proteins.
- Proteins from the REBASE database
of restriction enzymes are included if they have known specificity.
- Every protein with an evidence-based reannotation (based on mutant phenotypes)
in the Fitness Browser is included.
- Sequence-specific transcription factors (including sigma factors and DNA-binding response regulators)
with experimentally-determined DNA binding sites from the
PRODORIC database of gene regulation in prokaryotes are included.
- Putative transcription factors from RegPrecise
are included if they have manually-curated predictions for their binding sites. These predictions are based on
conserved putative regulatory sites across genomes that contain similar transcription factors,
so PaperBLAST clusters the TFs at 70% identity and retains just one member of each cluster.
- Coding sequence (CDS) features from the
European Nucleotide Archive (ENA)
are included if the /experiment tag is set (implying that there is experimental evidence for the annotation),
the nucleotide entry links to paper(s) in PubMed,
and the nucleotide entry is from the STD data class
(implying that these are targeted annotated sequences, not from shotgun sequencing).
Also, to filter out genes whose transcription or translation was detected, but whose function
was not studied, nucleotide entries or papers with more than 25 such proteins are excluded.
Descriptions from ENA rely on the original submitter of the
sequence and cannot be updated by others, so they may be less reliable.
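The 25-protein filter in the ENA item above can be read in more than one way. The sketch below shows one possible interpretation, with hypothetical names and a hypothetical record layout: proteins from an over-populated nucleotide entry are dropped, links to over-populated papers are dropped, and a protein is kept only if at least one paper link remains.

```python
from collections import Counter
from typing import List, Tuple

MAX_PROTEINS = 25  # limit stated in the text

# Hypothetical record layout: (protein_id, nucleotide_entry_id, [pubmed_ids])
Record = Tuple[str, str, List[str]]


def filter_ena(records: List[Record]) -> List[Record]:
    """Drop proteins tied to entries or papers with more than 25 proteins."""
    per_entry = Counter(entry for _, entry, _ in records)
    per_paper = Counter(
        pmid for _, _, pmids in records for pmid in set(pmids)
    )
    kept = []
    for protein, entry, pmids in records:
        if per_entry[entry] > MAX_PROTEINS:
            continue  # entry likely reports detection, not functional study
        papers = [p for p in pmids if per_paper[p] <= MAX_PROTEINS]
        if papers:
            kept.append((protein, entry, papers))
    return kept


# Invented records for illustration only.
records = [(f"p{i}", "E1", ["P1"]) for i in range(26)]
records.append(("x", "E2", ["P2"]))
records.append(("y", "E3", ["P1", "P3"]))
filtered = filter_ena(records)
```

Here entry E1 holds 26 proteins and paper P1 is linked to 27, so both are excluded, while proteins x and y survive through papers P2 and P3.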
Except for GeneRIF and ENA,
the curated entries include a short curated
description of the protein's function.
For entries from BioLiP, the protein's function may not be known beyond binding to the ligand.
Many of these entries also link to articles in PubMed.
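The RegPrecise step in the list above (clustering at 70% identity and keeping one member per cluster) can be illustrated with a simple greedy scheme. This is a sketch under stated assumptions, not the tool PaperBLAST uses: the clustering program is not named in the text, the function names are hypothetical, and `toy_identity` is an ungapped stand-in for a real alignment-based identity.

```python
from typing import Callable, Dict, List


def greedy_cluster(seqs: Dict[str, str],
                   identity: Callable[[str, str], float],
                   threshold: float = 0.70) -> Dict[str, List[str]]:
    """Assign each sequence to the first representative it matches."""
    clusters: Dict[str, List[str]] = {}
    # Visit longer sequences first so they become the representatives.
    for name in sorted(seqs, key=lambda n: (-len(seqs[n]), n)):
        for rep in clusters:
            if identity(seqs[rep], seqs[name]) >= threshold:
                clusters[rep].append(name)
                break
        else:
            clusters[name] = [name]  # no match: start a new cluster
    return clusters


def toy_identity(a: str, b: str) -> float:
    """Fraction of matching positions without gaps (toy measure only)."""
    matches = sum(x == y for x, y in zip(a, b))
    return matches / max(len(a), len(b))


# Invented sequences for illustration only.
tfs = {"tf1": "MKRFLHRVKW", "tf2": "MKRFLHRVAA", "tf3": "AAAAAAAAAA"}
clusters = greedy_cluster(tfs, toy_identity)
representatives = list(clusters)
```

Keeping only the dictionary keys yields one retained member per cluster, which is the effect the text describes.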
For more information see the
PaperBLAST paper (mSystems 2017)
or the code.
You can download PaperBLAST's database here.
Changes to PaperBLAST since the paper was written:
- November 2023: incorporated PRODORIC and RegPrecise. Many PRODORIC entries were not linked to a protein sequence (no UniProt identifier), so we added this information.
- February 2023: BioLiP changed their download format. PaperBLAST now includes their non-redundant subset. SitesBLAST and Sites on a Tree use a larger non-redundant subset that ensures that every ligand is represented within each cluster. This should ensure that every binding site is represented.
- June 2022: incorporated some coding sequences from ENA with the /experiment tag.
- March 2022: incorporated BioLiP.
- April 2020: incorporated TCDB.
- April 2019: EuropePMC now returns table entries in their search results. This has expanded PaperBLAST's database, but most of the new entries are of low relevance, and the resulting snippets are often just lists of locus tags with annotations.
- February 2018: the alignment page reports the conservation of the hit's functional sites (if available from Swiss-Prot or UniProt).
- January 2018: incorporated BRENDA.
- December 2017: incorporated MetaCyc, CharProtDB, CAZy, REBASE, and the reannotations from the Fitness Browser.
- September 2017: EuropePMC no longer returns some table entries in their search results. This has shrunk PaperBLAST's database, but has also reduced the number of low-relevance hits.
Many of these changes are described in Interactive tools for functional annotation of bacterial genomes.
PaperBLAST cannot provide snippets for many of the papers that are
published in non-open-access journals. This limitation applies even if
the paper is marked as "free" on the publisher's web site and is
available in PubMed Central or EuropePMC. If a journal that you publish
in is marked as "secret," please consider publishing elsewhere.
Many important articles are missing from PaperBLAST, either because
the article's full text is not in EuropePMC (as for many older
articles), or because the paper does not mention a protein identifier such as a locus tag, or because of PaperBLAST's heuristics. If you notice an
article that characterizes a protein's function but is missing from
PaperBLAST, please notify the curators at UniProt
or add an entry to GeneRIF.
Entries in either of these databases will eventually be incorporated
into PaperBLAST. Note that to add an entry to UniProt, you will need
to find the UniProt identifier for the protein. If the protein is not
already in UniProt, you can ask them to create an entry. To add an
entry to GeneRIF, you will need an NCBI Gene identifier, but
unfortunately many prokaryotic proteins in RefSeq do not have
corresponding Gene identifiers.
References
PaperBLAST: Text-mining papers for information about homologs.
M. N. Price and A. P. Arkin (2017). mSystems, 10.1128/mSystems.00039-17.
Europe PMC in 2017.
M. Levchenko et al (2017). Nucleic Acids Research, 10.1093/nar/gkx1005.
Gene indexing: characterization and analysis of NLM's GeneRIFs.
J. A. Mitchell et al (2003). AMIA Annu Symp Proc 2003:460-464.
UniProt: the universal protein knowledgebase.
The UniProt Consortium (2016). Nucleic Acids Research, 10.1093/nar/gkw1099.
BRENDA in 2017: new perspectives and new tools in BRENDA.
S. Placzek et al (2017). Nucleic Acids Research, 10.1093/nar/gkw952.
The EcoCyc database: reflecting new knowledge about Escherichia coli K-12.
I. M. Keseler et al (2016). Nucleic Acids Research, 10.1093/nar/gkw1003.
The MetaCyc database of metabolic pathways and enzymes.
R. Caspi et al (2018). Nucleic Acids Research, 10.1093/nar/gkx935.
CharProtDB: a database of experimentally characterized protein annotations.
R. Madupu et al (2012). Nucleic Acids Research, 10.1093/nar/gkr1133.
The carbohydrate-active enzymes database (CAZy) in 2013.
V. Lombard et al (2014). Nucleic Acids Research, 10.1093/nar/gkt1178.
The Transporter Classification Database (TCDB): recent advances.
M. H. Saier, Jr. et al (2016). Nucleic Acids Research, 10.1093/nar/gkv1103.
REBASE - a database for DNA restriction and modification: enzymes, genes and genomes.
R. J. Roberts et al (2015). Nucleic Acids Research, 10.1093/nar/gku1046.
Deep annotation of protein function across diverse bacteria from mutant phenotypes.
M. N. Price et al (2016). bioRxiv, 10.1101/072470.
by Morgan Price,
Arkin group
Lawrence Berkeley National Laboratory