GapMind for catabolism of small carbon sources

 

Alignments for a candidate for glcB in Marinobacter adhaerens HP15

Align Malate synthase G (EC 2.3.3.9) (characterized)
to candidate GFF2238 HP15_2188 malate synthase G

Query= reanno::psRCH2:GFF353
         (726 letters)



>FitnessBrowser__Marino:GFF2238
          Length = 726

 Score = 1020 bits (2638), Expect = 0.0
 Identities = 501/726 (69%), Positives = 593/726 (81%), Gaps = 2/726 (0%)

Query: 1   MTERVQVGGLQVAKVLYDFVNNEAIPGTGVDAAAFWAGADSVIHDLAPKNRALLAKRDDL 60
           MTERVQVGG+QVAK LYDFVNNEAIPGTG+DA  FWA  D ++++LAP+NR LLAKRD +
Sbjct: 1   MTERVQVGGIQVAKNLYDFVNNEAIPGTGIDADKFWAEFDKIVNELAPRNRELLAKRDAI 60

Query: 61  QAQIDAWHQARAGQAHDAVAYKSFLQEIGYLLPEAEDFQATTENVDEEIARMAGPQLVVP 120
           Q ++D+W++   GQ  D   YKSFL++IGYL+ E  +F+ +T NVD E+A MAGPQLVVP
Sbjct: 61  QEKMDSWNRDHKGQKLDMGEYKSFLKDIGYLVDEPSEFKISTSNVDPEVATMAGPQLVVP 120

Query: 121 IMNARFALNAANARWGSLYDALYGTDAISEADGASKGPGYNEIRGNKVIAYARNFLNEAA 180
           IMNARFALNAANARWGSLYDALYGTDAISE DGA KG GYN +RG KVI +ARN L+ +A
Sbjct: 121 IMNARFALNAANARWGSLYDALYGTDAISEEDGAEKGRGYNPVRGAKVIEWARNLLDSSA 180

Query: 181 PLETGSHVDSTGYRIEGGKLVVSLKDGSTTGLKNPAQLQGFQGEASAPIAVLLKNNGIHF 240
           PL +GSH D+  Y + GGKLVV L++G +TGLK+ A   G+ G A AP  VLL  NG+HF
Sbjct: 181 PLASGSHKDAAKYVVVGGKLVVKLQNGESTGLKDEAGFVGYTGAADAPTGVLLVKNGMHF 240

Query: 241 EIQIDPASPIGQTDAAGVKDILMESALTTIMDCEDSIAAVDADDKTVVYRNWLGLMKGDL 300
           EIQID   PIG+ D A VKD+LMESALTTIMDCEDS+AAVDADDK + YRNWLGLMKGDL
Sbjct: 241 EIQIDATHPIGKDDGAHVKDVLMESALTTIMDCEDSVAAVDADDKALAYRNWLGLMKGDL 300

Query: 301 VEELEKGGKRITRAMNPDRVYTKADGNGELTLHGRSLLFIRNVGHLMTNDAILDKEGNEV 360
            E  EKGG+++TR MN DR YT ADG+ EL+L GRSL+FIRNVGHLMTN AIL K+GNEV
Sbjct: 301 QETFEKGGEQLTRKMNADRTYTAADGS-ELSLKGRSLMFIRNVGHLMTNPAILLKDGNEV 359

Query: 361 PEGIMDGLFTSLIAVHNLNGNTSRKNTRTGSMYIVKPKMHGPEEVAFATELFGRVEDVLG 420
           PEG+MDGL TSLIA+H++ GN   +N+  GSMYIVKPKMHGPEEVAF  E FGRVED L 
Sbjct: 360 PEGLMDGLITSLIAIHDMKGNGKFQNSTKGSMYIVKPKMHGPEEVAFTNEFFGRVEDALS 419

Query: 421 LPRNTLKVGIMDEERRTTINLKACIKEARERVVFINTGFLDRTGDEIHTSMEAGPMVRKA 480
           LPR +LKVGIMDEERRTT+NLKACI  A+ER VFINTGFLDRTGDEIHTSME GP +RK 
Sbjct: 420 LPRFSLKVGIMDEERRTTVNLKACIHAAKERAVFINTGFLDRTGDEIHTSMELGPFIRKG 479

Query: 481 AMKAEKWISAYENNNVDVGLACGLQGKAQIGKGMWAMPDLMAAMLEQKVGHPMAGANTAW 540
           AMK   WI+AYE  NVD+GL  G +G AQIGKGMWAMPDLMA MLE K+GHP AGANTAW
Sbjct: 480 AMKQATWINAYEQWNVDIGLETGFRGVAQIGKGMWAMPDLMAGMLEAKIGHPKAGANTAW 539

Query: 541 VPSPTAATLHAMHYHKIDVQARQVELAKREKASIDDILTIPLAQD-TNWSEEEKRNELDN 599
           VPSPTAATLHA HYH++ V   Q +L  R +A++DDILT+P+  D  + S E+ + ELDN
Sbjct: 540 VPSPTAATLHATHYHQVSVADVQKQLESRTRAALDDILTVPVMDDPASLSAEDIQQELDN 599

Query: 600 NSQGILGYMVRWVEQGVGCSKVPDINDIALMEDRATLRISSQHVANWMRHGVVTKDQVVE 659
           N+QGILGY+VRW++QGVGCSKVPDIND+ LMEDRATLRI+SQ +ANW+ HG+ ++DQ++E
Sbjct: 600 NAQGILGYVVRWIDQGVGCSKVPDINDVGLMEDRATLRIASQLLANWLHHGICSEDQIME 659

Query: 660 SLKRMAPVVDRQNQGDPLYRPMAPDFDNSVAFQAALELVLEGTKQPNGYTEPVLHRRRRE 719
           ++KRMA VVD+QN GD  YRPMA +FD+SVAFQAAL+LVL+G +QPNGYTEP+LH  R +
Sbjct: 660 TMKRMAAVVDKQNAGDSAYRPMAGNFDDSVAFQAALDLVLKGREQPNGYTEPLLHAYRLK 719

Query: 720 FKAKNG 725
            KAK G
Sbjct: 720 AKAKYG 725


Lambda     K      H
   0.316    0.133    0.386 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 1471
Number of extensions: 46
Number of successful extensions: 3
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 726
Length of database: 726
Length adjustment: 40
Effective length of query: 686
Effective length of database: 686
Effective search space:   470596
Effective search space used:   470596
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.6 bits)
S2: 55 (25.8 bits)
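
The summary statistics above can be cross-checked with a little arithmetic. The following Python sketch recomputes the percent identity, the bit score, and the effective search space. The bit score uses the standard Karlin-Altschul conversion S' = (lambda * S - ln K) / ln 2 with the gapped Lambda and K values printed above. The function names are illustrative and are not part of BLAST or GapMind. Because the printed Lambda and K are rounded, the recomputed bit score matches the reported 1020 bits only approximately.

```python
import math


def percent_identity(identical: int, aligned: int) -> float:
    """Percentage of aligned columns that are identical."""
    return 100.0 * identical / aligned


def bit_score(raw_score: float, lam: float, k: float) -> float:
    """Convert a raw alignment score to bits: (lambda*S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def effective_search_space(query_len: int, db_len: int,
                           length_adjustment: int, n_seqs: int = 1) -> int:
    """Product of the effective query and database lengths."""
    eff_query = query_len - length_adjustment
    eff_db = db_len - n_seqs * length_adjustment
    return eff_query * eff_db


# Values copied from the BLAST report above.
identity = percent_identity(501, 726)          # "Identities = 501/726"
bits = bit_score(2638, lam=0.267, k=0.0410)    # raw score 2638, gapped Lambda and K
space = effective_search_space(726, 726, 40)   # "Length adjustment: 40"

print(f"identity = {identity:.1f}%")
print(f"bit score = {bits:.1f}")
print(f"effective search space = {space}")
```

The effective search space comes out to 686 x 686, which agrees with the value in the report.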

Align candidate GFF2238 HP15_2188 (malate synthase G)
to HMM TIGR01345 (glcB: malate synthase G (EC 2.3.3.9))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR01345.hmm
# target sequence database:        /tmp/gapView.12341.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR01345  [M=721]
Accession:   TIGR01345
Description: malate_syn_G: malate synthase G
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                           Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                           -----------
          0 1210.5   0.7          0 1210.3   0.7    1.0  1  lcl|FitnessBrowser__Marino:GFF2238  HP15_2188 malate synthase G


Domain annotation for each sequence (and alignments):
>> lcl|FitnessBrowser__Marino:GFF2238  HP15_2188 malate synthase G
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 ! 1210.3   0.7         0         0       2     720 ..       4     723 ..       3     724 .. 0.99

  Alignments for each domain:
  == domain 1  score: 1210.3 bits;  conditional E-value: 0
                           TIGR01345   2 rvdagrlqvakklkdfveeevlpgtgvdaekfwsgfdeivrdlapenrellakrdeiqaaideyhrknk.gvidk 75 
                                         rv++g++qvak+l+dfv++e++pgtg+da+kfw++fd+iv++lap+nrellakrd iq  +d + r+ k    d 
  lcl|FitnessBrowser__Marino:GFF2238   4 RVQVGGIQVAKNLYDFVNNEAIPGTGIDADKFWAEFDKIVNELAPRNRELLAKRDAIQEKMDSWNRDHKgQKLDM 78 
                                         7899*****************************************************************4457** PP

                           TIGR01345  76 eayksflkeigylveepervtietenvdseiasqagpqlvvpvlnaryalnaanarwgslydalygsnvipeedg 150
                                           yksflk+igylv+ep + +i t nvd e+a+ agpqlvvp++nar+alnaanarwgslydalyg+++i+eedg
  lcl|FitnessBrowser__Marino:GFF2238  79 GEYKSFLKDIGYLVDEPSEFKISTSNVDPEVATMAGPQLVVPIMNARFALNAANARWGSLYDALYGTDAISEEDG 153
                                         *************************************************************************** PP

                           TIGR01345 151 aekgkeynpkrgekviefarefldeslplesgsyadvvkykivdkklavqlesgkvtrlkdeeqfvgyrgdaadp 225
                                         aekg+ ynp+rg kvie+ar++ld+s pl sgs++d+ ky +v +kl+v+l++g+ t lkde+ fvgy+g a +p
  lcl|FitnessBrowser__Marino:GFF2238 154 AEKGRGYNPVRGAKVIEWARNLLDSSAPLASGSHKDAAKYVVVGGKLVVKLQNGESTGLKDEAGFVGYTGAADAP 228
                                         *************************************************************************** PP

                           TIGR01345 226 evillktnglhielqidarhpigkadkakvkdivlesaittildcedsvaavdaedkvlvyrnllglmkgtlkek 300
                                         + +ll +ng+h e+qida+hpigk+d a+vkd+++esa+tti+dcedsvaavda+dk l yrn+lglmkg+l+e+
  lcl|FitnessBrowser__Marino:GFF2238 229 TGVLLVKNGMHFEIQIDATHPIGKDDGAHVKDVLMESALTTIMDCEDSVAAVDADDKALAYRNWLGLMKGDLQET 303
                                         *************************************************************************** PP

                           TIGR01345 301 lekngriikrklnedrsytaangeelslhgrsllfvrnvghlmtipviltdegeeipegildgvltsvialydlk 375
                                         +ek g +++rk+n dr+ytaa+g+elsl+grsl+f+rnvghlmt+p+il ++g+e+peg++dg++ts+ia++d+k
  lcl|FitnessBrowser__Marino:GFF2238 304 FEKGGEQLTRKMNADRTYTAADGSELSLKGRSLMFIRNVGHLMTNPAILLKDGNEVPEGLMDGLITSLIAIHDMK 378
                                         *************************************************************************** PP

                           TIGR01345 376 vqnklrnsrkgsvyivkpkmhgpeevafanklftriedllglerhtlkvgvmdeerrtslnlkaciakvkervaf 450
                                          ++k++ns kgs+yivkpkmhgpeevaf+n++f+r+ed l l+r +lkvg+mdeerrt++nlkaci+ +ker +f
  lcl|FitnessBrowser__Marino:GFF2238 379 GNGKFQNSTKGSMYIVKPKMHGPEEVAFTNEFFGRVEDALSLPRFSLKVGIMDEERRTTVNLKACIHAAKERAVF 453
                                         *************************************************************************** PP

                           TIGR01345 451 intgfldrtgdeihtsmeagamvrkadmksapwlkayernnvaagltcglrgkaqigkgmwampdlmaemlekkg 525
                                         intgfldrtgdeihtsme g+++rk+ mk a+w++aye+ nv+ gl +g+rg aqigkgmwampdlma mle k+
  lcl|FitnessBrowser__Marino:GFF2238 454 INTGFLDRTGDEIHTSMELGPFIRKGAMKQATWINAYEQWNVDIGLETGFRGVAQIGKGMWAMPDLMAGMLEAKI 528
                                         *************************************************************************** PP

                           TIGR01345 526 dqlragantawvpsptaatlhalhyhrvdvqkvqkeladaerraelkeiltipvaen.tnwseeeikeeldnnvq 599
                                         + ++agantawvpsptaatlha+hyh+v v  vqk+l +  +ra l++ilt+pv ++ +  s+e+i++eldnn+q
  lcl|FitnessBrowser__Marino:GFF2238 529 GHPKAGANTAWVPSPTAATLHATHYHQVSVADVQKQLESR-TRAALDDILTVPVMDDpASLSAEDIQQELDNNAQ 602
                                         *************************************988.8999********98753899************** PP

                           TIGR01345 600 gilgyvvrwveqgigcskvpdihnvalmedratlrissqhlanwlrhgivskeqvleslermakvvdkqnagdea 674
                                         gilgyvvrw++qg+gcskvpdi +v lmedratlri+sq lanwl hgi s +q++e+++rma vvdkqnagd a
  lcl|FitnessBrowser__Marino:GFF2238 603 GILGYVVRWIDQGVGCSKVPDINDVGLMEDRATLRIASQLLANWLHHGICSEDQIMETMKRMAAVVDKQNAGDSA 677
                                         *************************************************************************** PP

                           TIGR01345 675 yrpmadnleasvafkaakdlilkgtkqpsgytepilharrlefkek 720
                                         yrpma+n+++svaf+aa dl+lkg +qp+gytep+lha+rl+ k+k
  lcl|FitnessBrowser__Marino:GFF2238 678 YRPMAGNFDDSVAFQAALDLVLKGREQPNGYTEPLLHAYRLKAKAK 723
                                         *******************************************987 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (721 nodes)
Target sequences:                          1  (726 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.04u 0.02s 00:00:00.06 Elapsed: 00:00:00.05
# Mc/sec: 9.14
//
[ok]
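
The domain table in the hmmsearch output gives the model and sequence coordinates of the match, from which the fraction of the HMM that is covered can be derived. The sketch below parses the single domain row shown above and computes coverage of the model (M=721) and of the candidate (726 residues). The parser is a hypothetical helper written for this illustration, not part of HMMER, and it assumes the whitespace-separated column layout of the row as printed here.

```python
def parse_domain_row(row: str) -> dict:
    """Parse one row of the hmmsearch per-domain table.

    Columns: #, flag, score, bias, c-Evalue, i-Evalue,
    hmmfrom, hmmto, marker, alifrom, alito, marker,
    envfrom, envto, marker, acc.
    """
    f = row.split()
    return {
        "score": float(f[2]),
        "hmm_from": int(f[6]),
        "hmm_to": int(f[7]),
        "ali_from": int(f[9]),
        "ali_to": int(f[10]),
        "acc": float(f[15]),
    }


def span_coverage(start: int, end: int, length: int) -> float:
    """Fraction of a sequence or model covered by an inclusive span."""
    return (end - start + 1) / length


ROW = ("   1 ! 1210.3   0.7         0         0       2     720 .."
       "       4     723 ..       3     724 .. 0.99")

dom = parse_domain_row(ROW)
model_cov = span_coverage(dom["hmm_from"], dom["hmm_to"], 721)
seq_cov = span_coverage(dom["ali_from"], dom["ali_to"], 726)

print(f"model coverage    = {model_cov:.3f}")
print(f"sequence coverage = {seq_cov:.3f}")
```

Here the match spans model positions 2 to 720, so nearly the entire TIGR01345 model is covered.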

This GapMind analysis is from Sep 17 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMER with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."
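
The 50% coverage cutoff for low-confidence hits can be illustrated with a short Python sketch. It assumes that coverage means the fraction of the characterized (query) protein spanned by the alignment, which this page does not state explicitly. The criteria for high and medium confidence are not listed in this text, so they are not modeled. The `Hit` type and the second example hit are hypothetical.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    subject: str
    query_start: int   # first aligned position in the characterized protein
    query_end: int     # last aligned position (inclusive)
    query_length: int  # full length of the characterized protein


def coverage(hit: Hit) -> float:
    """Fraction of the characterized protein covered by the alignment."""
    return (hit.query_end - hit.query_start + 1) / hit.query_length


def passes_low_confidence_coverage(hit: Hit, min_coverage: float = 0.5) -> bool:
    """True if the hit meets the 50% coverage cutoff described above."""
    return coverage(hit) >= min_coverage


hits = [
    Hit("GFF2238", 1, 725, 726),                     # the alignment shown on this page
    Hit("hypothetical_partial_hit", 100, 300, 726),  # made-up short hit
]
kept = [h.subject for h in hits if passes_low_confidence_coverage(h)]
print(kept)
```

Only the full-length alignment passes; the made-up partial hit covers under a third of the query and is dropped.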

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
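
To make the term concrete, a six-frame translation translates a DNA sequence in all three reading frames on both strands, so that genes missed by gene prediction can still be found. The Python sketch below is a minimal illustration using the standard genetic code (the bacterial code assigns the same amino acids to codons). It is not how Curated BLAST is implemented, and the function names are illustrative.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna: str) -> str:
    """Translate complete codons; unknown codons (e.g. with N) become X."""
    codons = (dna[i:i + 3] for i in range(0, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(c, "X") for c in codons)


def six_frame_translation(dna: str) -> dict:
    """Translate frames +1..+3 (forward) and -1..-3 (reverse strand)."""
    dna = dna.upper()
    rc = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


frames = six_frame_translation("ATGACC")
for name in sorted(frames):
    print(name, frames[name])
```

Stop codons are shown as `*`; a search tool would then compare each translated frame against the characterized proteins.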

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory