GapMind for catabolism of small carbon sources

 

Alignments for a candidate for mmsA in Erythrobacter marinus HWDM-33

Align Methylmalonate-semialdehyde dehydrogenase [inositol] (EC 1.2.1.27) (characterized)
to candidate WP_047092853.1 AAV99_RS05730 CoA-acylating methylmalonate-semialdehyde dehydrogenase

Query= reanno::Caulo:CCNA_01360
         (500 letters)



>NCBI__GCF_001013305.1:WP_047092853.1
          Length = 498

 Score =  737 bits (1902), Expect = 0.0
 Identities = 360/498 (72%), Positives = 411/498 (82%), Gaps = 1/498 (0%)

Query: 2   MRDIPHFIGGQKVDGASGRFGEVFDPNTGKVQARVALASAGELNTAIANAKVAQAAWAAT 61
           MR + HFI G    G SGR  ++++P+TG+VQA VAL  A  L+ A+ NAK  Q  WAAT
Sbjct: 1   MRQVDHFIQGGSGSG-SGRTHKIWNPSTGEVQAEVALGDAALLDRAMENAKKVQPEWAAT 59

Query: 62  NPQRRARVMFEFKRLLEVHMDELAALLSSEHGKVIADSKGDIQRGLEVIEFACGVPHLLK 121
           NPQ+RARVMF+FK L+E +M +LA LLSSEHGKV+ D+KGD+QRGLEVIE+ACG+P +LK
Sbjct: 60  NPQKRARVMFKFKELIEANMQDLAELLSSEHGKVVDDAKGDVQRGLEVIEYACGIPQVLK 119

Query: 122 GEYTQGAGPGIDVYSMRQPLGVVAGITPFNFPAMIPMWMFGPAIATGNAFILKPSERDPS 181
           GEYTQGAGPGIDVYS RQPLG+ AGITPFNFPAMIPMWMFG AIA GNAFILKPSERDPS
Sbjct: 120 GEYTQGAGPGIDVYSTRQPLGIGAGITPFNFPAMIPMWMFGMAIAAGNAFILKPSERDPS 179

Query: 182 VPVRLAELMIEAGLPPGVLNVVHGDKDCVEAILDHPDIKAVSFVGSSDIAQSVFQRAGAA 241
           VPVRLAEL +EAG P G+L VVHGDK+ V+AILDHPDI AVSFVGSSDIAQ ++ R  A 
Sbjct: 180 VPVRLAELFLEAGAPEGLLQVVHGDKEMVDAILDHPDIAAVSFVGSSDIAQYIYSRGTAN 239

Query: 242 GKRVQAMGGAKNHGLVMPDADLDQAVADIIGAAYGSAGERCMALPVVVPVGEKTATALRE 301
            KRVQA GGAKNHG+VMPDADLDQ V D+ GAA+GSAGERCMALPVVVPVGE TA  LRE
Sbjct: 240 AKRVQAFGGAKNHGVVMPDADLDQVVNDLAGAAFGSAGERCMALPVVVPVGEDTANRLRE 299

Query: 302 KLVAAIGGLRVGVSTDPDAHYGPVVSAAHKARIESYIQMGVDEGAELVVDGRGFSLQGHE 361
           KL+ AI  LR+GVS DPDAHYGPVV+  HKARIE +I    +EG E+VVDGRG+SLQGHE
Sbjct: 300 KLIPAINALRIGVSNDPDAHYGPVVTPEHKARIEEWITTAENEGGEIVVDGRGYSLQGHE 359

Query: 362 EGFFVGPTLFDHVKPTSRSYHDEIFGPVLQMVRAESLEEGIALASRHQYGNGVAIFTRNG 421
           +GFFVGPTL DHV P   SY +EIFGPVLQ+VRA   E  + L S HQYGNGVAIFTRNG
Sbjct: 360 KGFFVGPTLIDHVTPEMESYKEEIFGPVLQIVRATDFEHALRLPSEHQYGNGVAIFTRNG 419

Query: 422 DAAREFADQVEVGMVGINVPIPVPVAYHSFGGWKRSGFGDLNQYGMDGVRFYTRTKTVTQ 481
            AAREFA +V VGMVGINVPIPVPVAYHSFGGWKRSGFGD++QYG +G++F+T+TK VTQ
Sbjct: 420 HAAREFASRVNVGMVGINVPIPVPVAYHSFGGWKRSGFGDIDQYGTEGLKFWTKTKKVTQ 479

Query: 482 RWPKGGAVLDQSFVIPTM 499
           RWP GG     +F+IPTM
Sbjct: 480 RWPDGGGDGSNAFIIPTM 497


Lambda     K      H
   0.320    0.137    0.409 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 900
Number of extensions: 30
Number of successful extensions: 1
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 500
Length of database: 498
Length adjustment: 34
Effective length of query: 466
Effective length of database: 464
Effective search space:   216224
Effective search space used:   216224
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.8 bits)
S2: 52 (24.6 bits)
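
The summary figures in the BLAST report above can be re-derived from its own numbers. The following is a minimal sketch, not part of GapMind: the function name `parse_fractions` and the variable names are illustrative assumptions, and the input values are copied from the report. It computes the unrounded percent identity, the fraction of the query covered by the alignment, and the effective search space as the product of the two effective lengths.

```python
import re

# Summary line copied from the BLAST report above.
summary = "Identities = 360/498 (72%), Positives = 411/498 (82%), Gaps = 1/498 (0%)"


def parse_fractions(line):
    """Return {label: (numerator, denominator)} for each 'Label = a/b' field."""
    return {
        m.group(1): (int(m.group(2)), int(m.group(3)))
        for m in re.finditer(r"(\w+) = (\d+)/(\d+)", line)
    }


fields = parse_fractions(summary)
identical, alignment_length = fields["Identities"]
percent_identity = 100.0 * identical / alignment_length

# Query coverage: the alignment spans query positions 2..499 of 500 letters.
query_start, query_end, query_length = 2, 499, 500
query_coverage = (query_end - query_start + 1) / query_length

# Effective search space: each raw length minus the length adjustment,
# multiplied together (values taken from the statistics block).
length_adjustment = 34
database_length = 498
effective_search_space = (query_length - length_adjustment) * (
    database_length - length_adjustment
)

print(f"identity: {percent_identity:.1f}%")
print(f"query coverage: {query_coverage:.1%}")
print(f"effective search space: {effective_search_space}")
```

The computed effective search space agrees with the value of 216224 that BLAST reports.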

Align candidate WP_047092853.1 AAV99_RS05730 (CoA-acylating methylmalonate-semialdehyde dehydrogenase)
to HMM TIGR01722 (mmsA: methylmalonate-semialdehyde dehydrogenase (acylating) (EC 1.2.1.27))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR01722.hmm
# target sequence database:        /tmp/gapView.1635204.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR01722  [M=477]
Accession:   TIGR01722
Description: MMSDH: methylmalonate-semialdehyde dehydrogenase (acylating)
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                             Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                             -----------
   7.1e-201  653.9   0.1   8.1e-201  653.7   0.1    1.0  1  NCBI__GCF_001013305.1:WP_047092853.1  


Domain annotation for each sequence (and alignments):
>> NCBI__GCF_001013305.1:WP_047092853.1  
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  653.7   0.1  8.1e-201  8.1e-201       1     477 []       4     481 ..       4     481 .. 0.98

  Alignments for each domain:
  == domain 1  score: 653.7 bits;  conditional E-value: 8.1e-201
                             TIGR01722   1 vkhlidGkfvegksdkyipvsnpatnevlakvaeasaeevdaavasaretfaawaetsvaerarvllryqall 73 
                                           v h+i+G    g s++  ++ np t+ev a+va   a+ +d a+  a++  + wa t+  +rarv++++++l+
  NCBI__GCF_001013305.1:WP_047092853.1   4 VDHFIQGGSGSG-SGRTHKIWNPSTGEVQAEVALGDAALLDRAMENAKKVQPEWAATNPQKRARVMFKFKELI 75 
                                           689999987655.677889****************************************************** PP

                             TIGR01722  74 kehrdeiaklisaeqGktledakGdvarGlevvehacsvtslllGetvesvakdvdvysirqplGvvaGitpf 146
                                           +++ +++a+l+s+e+Gk+++dakGdv rGlev+e+ac+++  l+Ge ++     +dvys rqplG+ aGitpf
  NCBI__GCF_001013305.1:WP_047092853.1  76 EANMQDLAELLSSEHGKVVDDAKGDVQRGLEVIEYACGIPQVLKGEYTQGAGPGIDVYSTRQPLGIGAGITPF 148
                                           ************************************************************************* PP

                             TIGR01722 147 nfpamiplwmfplaiacGntfvlkpsekvpsaavklaellseaGapdGvlnvvhGdkeavdrllehpdvkavs 219
                                           nfpamip+wmf +aia Gn+f+lkpse++ps  v+lael++eaGap+G l+vvhGdke vd +l+hpd+ avs
  NCBI__GCF_001013305.1:WP_047092853.1 149 NFPAMIPMWMFGMAIAAGNAFILKPSERDPSVPVRLAELFLEAGAPEGLLQVVHGDKEMVDAILDHPDIAAVS 221
                                           ************************************************************************* PP

                             TIGR01722 220 fvGsvavgeyiyetgsahgkrvqalaGaknhmvvlpdadkeaaldalvgaavGaaGqrcmaisaavlvGaa.. 290
                                           fvGs+++++yiy++g+a++krvqa++Gaknh+vv+pdad+++++++l gaa+G+aG+rcma+ ++v vG+   
  NCBI__GCF_001013305.1:WP_047092853.1 222 FVGSSDIAQYIYSRGTANAKRVQAFGGAKNHGVVMPDADLDQVVNDLAGAAFGSAGERCMALPVVVPVGEDta 294
                                           *********************************************************************8544 PP

                             TIGR01722 291 kelveeireraekvrvgagddpgaelGplitkqakervasliasgakeGaevlldGrgykveGyeeGnfvGit 363
                                           ++l e++  +++ +r+g  +dp a++Gp++t+++k+r++++i+++ +eG e+++dGrgy ++G+e+G fvG+t
  NCBI__GCF_001013305.1:WP_047092853.1 295 NRLREKLIPAINALRIGVSNDPDAHYGPVVTPEHKARIEEWITTAENEGGEIVVDGRGYSLQGHEKGFFVGPT 367
                                           8999********************************************************************* PP

                             TIGR01722 364 llervkpdmkiykeeifGpvlvvleadtleeaiklinespyGnGtaiftsdGaaarkfqheievGqvGvnvpi 436
                                           l+++v p+m+ ykeeifGpvl +++a  +e+a++l  e  yGnG aift++G aar+f  +++vG+vG+nvpi
  NCBI__GCF_001013305.1:WP_047092853.1 368 LIDHVTPEMESYKEEIFGPVLQIVRATDFEHALRLPSEHQYGNGVAIFTRNGHAAREFASRVNVGMVGINVPI 440
                                           ************************************************************************* PP

                             TIGR01722 437 pvplpffsftGwkdslfGdlhiyGkqGvrfytrlktvtarw 477
                                           pvp++++sf+Gwk s fGd + yG +G++f+t++k vt rw
  NCBI__GCF_001013305.1:WP_047092853.1 441 PVPVAYHSFGGWKRSGFGDIDQYGTEGLKFWTKTKKVTQRW 481
                                           ***************************************** PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (477 nodes)
Target sequences:                          1  (498 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.00u 0.01s 00:00:00.01 Elapsed: 00:00:00.01
# Mc/sec: 21.95
//
[ok]
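
Because the HMM-based rules depend on how much of the model is aligned, it can help to read the coverage off the domain table directly. The sketch below is an illustration only, assuming the whitespace-separated column order shown in the hmmsearch domain table above; `parse_domain_row` is a hypothetical helper, not part of HMMER or GapMind.

```python
# Domain row copied from the hmmsearch output above.
row = ("   1 !  653.7   0.1  8.1e-201  8.1e-201       1     477 []"
       "       4     481 ..       4     481 .. 0.98")

MODEL_LENGTH = 477      # from "Query: TIGR01722 [M=477]"
SEQUENCE_LENGTH = 498   # from "Target sequences: 1 (498 residues searched)"


def parse_domain_row(line):
    """Split one domain-table row into named fields (column order assumed)."""
    tok = line.split()
    return {
        "score": float(tok[2]),
        "i_evalue": float(tok[5]),
        "hmm_from": int(tok[6]),
        "hmm_to": int(tok[7]),
        "ali_from": int(tok[9]),
        "ali_to": int(tok[10]),
        "acc": float(tok[-1]),
    }


domain = parse_domain_row(row)

# Fraction of the model's match states covered by the alignment.
model_coverage = (domain["hmm_to"] - domain["hmm_from"] + 1) / MODEL_LENGTH

# Fraction of the candidate protein covered by the alignment.
sequence_coverage = (domain["ali_to"] - domain["ali_from"] + 1) / SEQUENCE_LENGTH

print(f"model coverage: {model_coverage:.1%}")
print(f"sequence coverage: {sequence_coverage:.1%}")
```

For this candidate the alignment spans the whole model (positions 1 to 477) and about 96% of the protein.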

This GapMind analysis is from Sep 24 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
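
To make the term concrete: a six-frame translation translates a DNA sequence in all three reading frames on both strands, so that a gene missed by the gene caller can still be found by protein search. The sketch below is a generic illustration and is not how GapMind or Curated BLAST implement it. It assumes the standard codon assignments and marks any codon containing an ambiguous base as `X`; all function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons only; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translation(dna):
    """Return translations keyed by frame: +1, +2, +3, -1, -2, -3."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


frames = six_frame_translation("ATGGCCTAA")
for name, protein in frames.items():
    print(name, protein)
```

Stop codons appear as `*`, so open reading frames can be recovered by splitting each frame's translation on that character.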

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory