GapMind for catabolism of small carbon sources

Alignments for a candidate for glcB in Phaeobacter inhibens BS107

Align Malate synthase G (EC 2.3.3.9) (characterized)
to candidate GFF974 PGA1_c09900 malate synthase GlcB

Query= reanno::psRCH2:GFF353
         (726 letters)



>FitnessBrowser__Phaeo:GFF974
          Length = 710

 Score =  862 bits (2228), Expect = 0.0
 Identities = 427/715 (59%), Positives = 536/715 (74%), Gaps = 16/715 (2%)

Query: 10  LQVAKVLYDFVNNEAIPGTGVDAAAFWAGADSVIHDLAPKNRALLAKRDDLQAQIDAWHQ 69
           +QVA  L  F+ ++A+PGTGV A AFWAG   +++ +  +NRALLAKR DLQ QIDAWH 
Sbjct: 10  MQVADTLVSFIEDKALPGTGVTADAFWAGLAGLVNGMGDENRALLAKRADLQGQIDAWHI 69

Query: 70  ARAGQAHDAVAYKSFLQEIGYLLPEAEDFQATTENVDEEIARMAGPQLVVPIMNARFALN 129
            R GQAHDA AY++FL++IGYLLPE +DF+  T+NVD+EIA + GPQLVVPI NARFALN
Sbjct: 70  ERKGQAHDATAYEAFLRDIGYLLPEGDDFEIETQNVDDEIANVPGPQLVVPITNARFALN 129

Query: 130 AANARWGSLYDALYGTDAISEADGASKGPGYNEIRGNKVIAYARNFLNEAAPLETGSHVD 189
           AANARWGSLYDALYGTDA+ +     +G GY+  RG +VIA+ R FL++  PL  GS  D
Sbjct: 130 AANARWGSLYDALYGTDAMGDLP---EGKGYDAGRGARVIAWGRGFLDQTFPLAEGSWND 186

Query: 190 STGYRIEGGKLVVSLKDGSTTGLKNPAQLQGFQGEASAPIAVLLKNNGIHFEIQIDPASP 249
             G  +EGG LV +LKD         AQ  G++G+A+ P  +LLKNNG+H  I +D    
Sbjct: 187 CVGLSVEGGALVPALKDA--------AQFAGYEGDAATPGKILLKNNGLHAVIVVDANGN 238

Query: 250 IGQTDAAGVKDILMESALTTIMDCEDSIAAVDADDKTVVYRNWLGLMKGDLVEELEKGGK 309
           IG+ D AG+ DI++ESAL+TIMDCEDS+A VD +DK   Y NWLGLMK DL EE+ KGG+
Sbjct: 239 IGKGDQAGINDIVLESALSTIMDCEDSVACVDGEDKVTAYANWLGLMKRDLAEEVTKGGE 298

Query: 310 RITRAMNPDRVYTKADGNGELTLHGRSLLFIRNVGHLMTNDAILDKEGNEVPEGIMDGLF 369
             TR +N D+ +T  DG+ +L L GRSLL +RNVGHLMTN A+ D  G E  EG +D + 
Sbjct: 299 TFTRVLNDDQTFTAPDGS-DLVLKGRSLLLVRNVGHLMTNPAVRDSAGREAGEGFIDAMV 357

Query: 370 TSLIAVHNLNGNTSRKNTRTGSMYIVKPKMHGPEEVAFATELFGRVEDVLGLPRNTLKVG 429
           T L A+H+L       N+  GS+Y+VKPKMHGPEEVAF   +F  VED LGLPR+T+K+G
Sbjct: 358 TVLCAMHDLQAEGG--NSLHGSVYVVKPKMHGPEEVAFTDRIFTHVEDALGLPRHTVKIG 415

Query: 430 IMDEERRTTINLKACIKEARERVVFINTGFLDRTGDEIHTSMEAGPMVRKAAMKAEKWIS 489
           IMDEERRT++NLK+CI+ A+ RV FINTGFLDRTGDEIHTSMEAG M+RK  MK+  WI+
Sbjct: 416 IMDEERRTSVNLKSCIRAAKHRVAFINTGFLDRTGDEIHTSMEAGAMMRKGEMKSTPWIA 475

Query: 490 AYENNNVDVGLACGLQGKAQIGKGMWAMPDLMAAMLEQKVGHPMAGANTAWVPSPTAATL 549
           +YE+ NVD+GLACGL+G+AQIGKGMWAMPD M  MLE K+GHP +GA  AWVPSPTAATL
Sbjct: 476 SYEDRNVDIGLACGLKGRAQIGKGMWAMPDRMGEMLEAKIGHPKSGATCAWVPSPTAATL 535

Query: 550 HAMHYHKIDVQARQVEL-AKREKASIDDILTIPLAQDTNWSEEEKRNELDNNSQGILGYM 608
           HA HYH+++V   Q +L A   + ++DD+LTIP+ Q  N S+ E   E++NN+QGILGY+
Sbjct: 536 HATHYHRVNVHEVQDQLKAGGARGTLDDLLTIPVMQGENLSDAEIAQEIENNAQGILGYV 595

Query: 609 VRWVEQGVGCSKVPDINDIALMEDRATLRISSQHVANWMRHGVVTKDQVVESLKRMAPVV 668
           VRWV+QGVGCSKVPDI+++ LMEDRAT RISSQ +ANW+ HG+V + QV+ ++++MA VV
Sbjct: 596 VRWVDQGVGCSKVPDIHNVGLMEDRATCRISSQALANWLHHGIVDETQVMAAMQKMAAVV 655

Query: 669 DRQNQGDPLYRPMAPDFDNSVAFQAALELVLEGTKQPNGYTEPVLHRRRREFKAK 723
           D QN  D  Y PMAP FD  +AFQAA +LV  G  QP+GYTEPVLH RR E KA+
Sbjct: 656 DGQNASDASYTPMAPGFD-GIAFQAACDLVFRGRIQPSGYTEPVLHARRLELKAQ 709


Lambda     K      H
   0.316    0.133    0.386 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 1475
Number of extensions: 65
Number of successful extensions: 7
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 726
Length of database: 710
Length adjustment: 40
Effective length of query: 686
Effective length of database: 670
Effective search space:   459620
Effective search space used:   459620
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.6 bits)
S2: 55 (25.8 bits)
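
The statistics block above can be cross-checked by hand. The following is a minimal sketch, not part of GapMind or BLAST, that recomputes the effective search space and several bit scores from the reported values. It assumes the standard Karlin-Altschul conversion, bits = (lambda * raw - ln K) / ln 2, that X-drop values are converted without the ln K term, and that effective lengths are the raw lengths minus the length adjustment. All function names are illustrative. Because Lambda and K are printed in rounded form, only approximate agreement with the report should be expected.

```python
import math

def effective_search_space(query_len, db_len, adjustment):
    """Product of the effective query and database lengths."""
    return (query_len - adjustment) * (db_len - adjustment)

def bit_score(raw, lam, k):
    """Convert a raw alignment score to bits (Karlin-Altschul)."""
    return (lam * raw - math.log(k)) / math.log(2)

def xdrop_bits(raw, lam):
    """Convert a raw X-drop value to bits (no K term)."""
    return lam * raw / math.log(2)

# (Lambda, K) pairs as printed in the report above.
UNGAPPED = (0.316, 0.133)
GAPPED = (0.267, 0.0410)

space = effective_search_space(726, 710, 40)   # 686 * 670
s1_bits = round(bit_score(41, *UNGAPPED), 1)   # reported as 21.6
s2_bits = round(bit_score(55, *GAPPED), 1)     # reported as 25.8
x1_bits = round(xdrop_bits(16, UNGAPPED[0]), 1)  # reported as 7.3
hsp_bits = int(bit_score(2228, *GAPPED))       # reported as 862

print(space, s1_bits, s2_bits, x1_bits, hsp_bits)
```

The recomputed values match the report, which suggests that the gapped parameters apply to the final score and S2, while the ungapped parameters apply to S1 and X1.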

Align candidate GFF974 PGA1_c09900 (malate synthase GlcB)
to HMM TIGR01345 (glcB: malate synthase G (EC 2.3.3.9))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR01345.hmm
# target sequence database:        /tmp/gapView.9587.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR01345  [M=721]
Accession:   TIGR01345
Description: malate_syn_G: malate synthase G
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                         Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                         -----------
          0 1066.3   0.0          0 1066.0   0.0    1.0  1  lcl|FitnessBrowser__Phaeo:GFF974  PGA1_c09900 malate synthase GlcB


Domain annotation for each sequence (and alignments):
>> lcl|FitnessBrowser__Phaeo:GFF974  PGA1_c09900 malate synthase GlcB
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 ! 1066.0   0.0         0         0       5     719 ..       7     708 ..       4     710 .] 0.98

  Alignments for each domain:
  == domain 1  score: 1066.0 bits;  conditional E-value: 0
                         TIGR01345   5 agrlqvakklkdfveeevlpgtgvdaekfwsgfdeivrdlapenrellakrdeiqaaideyhrknk.gvidkeayks 80 
                                         ++qva +l++f+e+++lpgtgv a++fw+g++ +v+ +  enr llakr ++q  id++h + k  + d +ay+ 
  lcl|FitnessBrowser__Phaeo:GFF974   7 RHDMQVADTLVSFIEDKALPGTGVTADAFWAGLAGLVNGMGDENRALLAKRADLQGQIDAWHIERKgQAHDATAYEA 83 
                                       568************************************************************99955789****** PP

                         TIGR01345  81 flkeigylveepervtietenvdseiasqagpqlvvpvlnaryalnaanarwgslydalygsnvipeedgaekgkey 157
                                       fl++igyl +e +  +iet+nvd+eia   gpqlvvp++nar+alnaanarwgslydalyg++++ +   + +gk y
  lcl|FitnessBrowser__Phaeo:GFF974  84 FLRDIGYLLPEGDDFEIETQNVDDEIANVPGPQLVVPITNARFALNAANARWGSLYDALYGTDAMGD---LPEGKGY 157
                                       ***************************************************************9865...678889* PP

                         TIGR01345 158 npkrgekviefarefldeslplesgsyadvvkykivdkklavqlesgkvtrlkdeeqfvgyrgdaadpevillktng 234
                                       +  rg +vi+++r fld+++pl +gs++d v  ++  + l+          lkd +qf gy+gdaa p  illk+ng
  lcl|FitnessBrowser__Phaeo:GFF974 158 DAGRGARVIAWGRGFLDQTFPLAEGSWNDCVGLSVEGGALVPA--------LKDAAQFAGYEGDAATPGKILLKNNG 226
                                       ********************************99999988755........6699********************** PP

                         TIGR01345 235 lhielqidarhpigkadkakvkdivlesaittildcedsvaavdaedkvlvyrnllglmkgtlkeklekngriikrk 311
                                       lh  + +da+++igk d+a+++divlesa++ti+dcedsva vd edkv  y n+lglmk +l e+++k g +++r 
  lcl|FitnessBrowser__Phaeo:GFF974 227 LHAVIVVDANGNIGKGDQAGINDIVLESALSTIMDCEDSVACVDGEDKVTAYANWLGLMKRDLAEEVTKGGETFTRV 303
                                       ***************************************************************************** PP

                         TIGR01345 312 lnedrsytaangeelslhgrsllfvrnvghlmtipviltdegeeipegildgvltsvialydlkvqnklrnsrkgsv 388
                                       ln+d+++ta++g++l+l+grsll+vrnvghlmt+p+++++ g e  eg +d+++t ++a++dl+ ++   ns +gsv
  lcl|FitnessBrowser__Phaeo:GFF974 304 LNDDQTFTAPDGSDLVLKGRSLLLVRNVGHLMTNPAVRDSAGREAGEGFIDAMVTVLCAMHDLQAEG--GNSLHGSV 378
                                       *****************************************************************88..9******* PP

                         TIGR01345 389 yivkpkmhgpeevafanklftriedllglerhtlkvgvmdeerrtslnlkaciakvkervafintgfldrtgdeiht 465
                                       y+vkpkmhgpeevaf++++ft +ed lgl+rht+k+g+mdeerrts+nlk+ci  +k+rvafintgfldrtgdeiht
  lcl|FitnessBrowser__Phaeo:GFF974 379 YVVKPKMHGPEEVAFTDRIFTHVEDALGLPRHTVKIGIMDEERRTSVNLKSCIRAAKHRVAFINTGFLDRTGDEIHT 455
                                       ***************************************************************************** PP

                         TIGR01345 466 smeagamvrkadmksapwlkayernnvaagltcglrgkaqigkgmwampdlmaemlekkgdqlragantawvpspta 542
                                       smeagam+rk++mks+pw+ +ye  nv+ gl cgl+g+aqigkgmwampd m emle k++ +++ga  awvpspta
  lcl|FitnessBrowser__Phaeo:GFF974 456 SMEAGAMMRKGEMKSTPWIASYEDRNVDIGLACGLKGRAQIGKGMWAMPDRMGEMLEAKIGHPKSGATCAWVPSPTA 532
                                       ***************************************************************************** PP

                         TIGR01345 543 atlhalhyhrvdvqkvqkeladaerraelkeiltipvaentnwseeeikeeldnnvqgilgyvvrwveqgigcskvp 619
                                       atlha+hyhrv+v++vq++l++ + r +l+++ltipv +  n s+ ei +e++nn+qgilgyvvrwv+qg+gcskvp
  lcl|FitnessBrowser__Phaeo:GFF974 533 ATLHATHYHRVNVHEVQDQLKAGGARGTLDDLLTIPVMQGENLSDAEIAQEIENNAQGILGYVVRWVDQGVGCSKVP 609
                                       ***************************************************************************** PP

                         TIGR01345 620 dihnvalmedratlrissqhlanwlrhgivskeqvleslermakvvdkqnagdeayrpmadnleasvafkaakdlil 696
                                       dihnv lmedrat rissq lanwl hgiv   qv++++++ma vvd qna d +y pma+ ++  +af+aa dl++
  lcl|FitnessBrowser__Phaeo:GFF974 610 DIHNVGLMEDRATCRISSQALANWLHHGIVDETQVMAAMQKMAAVVDGQNASDASYTPMAPGFD-GIAFQAACDLVF 685
                                       ***************************************************************8.69********** PP

                         TIGR01345 697 kgtkqpsgytepilharrlefke 719
                                       +g  qpsgytep+lharrle k+
  lcl|FitnessBrowser__Phaeo:GFF974 686 RGRIQPSGYTEPVLHARRLELKA 708
                                       ********************997 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (721 nodes)
Target sequences:                          1  (710 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.02u 0.02s 00:00:00.04 Elapsed: 00:00:00.04
# Mc/sec: 12.21
//
[ok]
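
Coverage is a convenient summary of the domain table above. As a rough illustration (this is not GapMind's own code), the sketch below computes how much of the model and of the candidate sequence the single domain spans, using the "hmmfrom"/"hmm to" and "alifrom"/"ali to" columns together with the model length (M=721) and the candidate length (710). The function name and the inclusive-coordinate definition of coverage are assumptions.

```python
def coverage(start, end, total_length):
    """Fraction of total_length spanned by the inclusive range start..end."""
    if not (1 <= start <= end <= total_length):
        raise ValueError("coordinates must satisfy 1 <= start <= end <= total_length")
    return (end - start + 1) / total_length

# Values copied from the hmmsearch domain table above.
model_cov = coverage(5, 719, 721)   # hmm from/to against M=721
seq_cov = coverage(7, 708, 710)     # ali from/to against the 710-residue candidate

print(f"model coverage: {model_cov:.3f}")
print(f"sequence coverage: {seq_cov:.3f}")
```

Under these assumptions the domain spans nearly all of both the model and the candidate, consistent with the 0.98 accuracy column and the full-length alignment shown above.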

This GapMind analysis is from Sep 17 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMER with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."
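
To make the coverage threshold concrete, here is a small sketch that reads the summary line of the BLAST alignment shown earlier on this page and checks the 50% coverage criterion. It is an illustration only: the function names are hypothetical, and measuring coverage against the characterized (query) protein is an assumption, since the text above does not say which sequence the coverage refers to.

```python
import re

# HSP summary line and coordinates copied from the BLAST output on this page.
SUMMARY = "Identities = 427/715 (59%), Positives = 536/715 (74%), Gaps = 16/715 (2%)"

def parse_summary(line):
    """Return a dict such as {'Identities': (427, 715), ...}."""
    return {name: (int(num), int(den))
            for name, num, den in re.findall(r"(\w+) = (\d+)/(\d+)", line)}

def query_coverage(q_start, q_end, q_len):
    """Fraction of the query spanned by the inclusive range q_start..q_end."""
    return (q_end - q_start + 1) / q_len

stats = parse_summary(SUMMARY)
identity = stats["Identities"][0] / stats["Identities"][1]
cov = query_coverage(10, 723, 726)   # query residues 10..723 of 726

# Assumed reading of the rule: coverage of at least 50% is required.
meets_coverage_threshold = cov >= 0.50

print(f"identity {identity:.1%}, coverage {cov:.1%}, passes: {meets_coverage_threshold}")
```

Note that the percentages printed by BLAST appear to be truncated rather than rounded: 427/715 is about 59.7%, which the report shows as 59%.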

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory