GapMind for catabolism of small carbon sources

 

Alignments for a candidate for glcB in Polaromonas naphthalenivorans CJ2

Align Malate synthase G (EC 2.3.3.9) (characterized)
to candidate WP_011803045.1 PNAP_RS18410 malate synthase G

Query= reanno::psRCH2:GFF353
         (726 letters)



>NCBI__GCF_000015505.1:WP_011803045.1
          Length = 730

 Score =  965 bits (2494), Expect = 0.0
 Identities = 473/729 (64%), Positives = 585/729 (80%), Gaps = 9/729 (1%)

Query: 1   MTERVQVGGLQVAKVLYDFVNNEAIPGTGVDAAAFWAGADSVIHDLAPKNRALLAKRDDL 60
           MTER     LQVA  LY F+ ++ +PGTGVD+  FW+G D+++ DLAPKN ALLA+RD L
Sbjct: 1   MTERTTRHSLQVATELYRFIEDKVLPGTGVDSDKFWSGFDAIVADLAPKNIALLAERDRL 60

Query: 61  QAQIDAWHQARAGQAHDAVAYKSFLQEIGYLLPEAEDFQATTENVDEEIARMAGPQLVVP 120
           Q ++DAWH+A  G   D  AY++FL+ IGYL+P+ E    TT NVD E+A  AGPQLVVP
Sbjct: 61  QLEMDAWHKANPGPIADMPAYRAFLESIGYLVPQPETVAVTTANVDAELAVQAGPQLVVP 120

Query: 121 IMNARFALNAANARWGSLYDALYGTDAISEADGASKGPGYNEIRGNKVIAYARNFLNEAA 180
           ++NAR+ALNAANARWGSLYDALYGTDAI E DGA KG GYN +RG KVIA+ARN L++AA
Sbjct: 121 VLNARYALNAANARWGSLYDALYGTDAIPETDGAEKGKGYNPVRGAKVIAFARNLLDQAA 180

Query: 181 PLETGSHVDSTGYRIEGGKLVVSLKDGSTTGLKNPAQLQGFQGEASAPIAVLLKNNGIHF 240
           PL TGSH D+TGY IEGG+LVV+   G + GL++PAQL G++G+A+AP +VLL +NG+H 
Sbjct: 181 PLSTGSHKDATGYSIEGGQLVVTQASGMS-GLQDPAQLVGYRGDAAAPSSVLLVHNGLHI 239

Query: 241 EIQIDPASPIGQTDAAGVKDILMESALTTIMDCEDSIAAVDADDKTVVYRNWLGLMKGDL 300
           +I ID A+ +G++DAAG+ D+++ESAL+TI+D EDS+A VDA+DK + Y NWLG+++G L
Sbjct: 240 DIIIDRATTLGKSDAAGISDMVIESALSTILDLEDSVAVVDAEDKVLAYGNWLGILQGTL 299

Query: 301 VEELEKGGKRITRAMNPDRVYTKADGNGELTLHGRSLLFIRNVGHLMTNDAILDKEGNEV 360
            EE+ KGG   TR +NPDRVYT ADG GE+TLHGRSL+F+RNVGHLMTN AIL   G E+
Sbjct: 300 TEEVSKGGTTFTRGLNPDRVYTAADG-GEVTLHGRSLMFVRNVGHLMTNPAILYAGGKEI 358

Query: 361 PEGIMDGLFTSLIAVHNLN--GNTSRKNTRTGSMYIVKPKMHGPEEVAFATELFGRVEDV 418
           PEGIMD + T+ IA+H+    G    KN+RTGS+YIVKPKMHGP EVAFA ELFGRVE +
Sbjct: 359 PEGIMDAVVTTTIAIHDFKRQGQPGIKNSRTGSVYIVKPKMHGPAEVAFAAELFGRVEAL 418

Query: 419 LGLPRNTLKVGIMDEERRTTINLKACIKEARERVVFINTGFLDRTGDEIHTSMEAGPMVR 478
           LGLP NT+K+GIMDEERRT++NLKACI EA  RV FINTGFLDRTGDE+HT+M+AGPM+R
Sbjct: 419 LGLPANTVKLGIMDEERRTSVNLKACIAEAEARVAFINTGFLDRTGDEMHTAMQAGPMIR 478

Query: 479 KAAMKAEKWISAYENNNVDVGLACGLQGKAQIGKGMWAMPDLMAAMLEQKVGHPMAGANT 538
           K  MK   WI+AYE NNV VGL+CGL+GKAQIGKGMWAMPDLMAAMLEQK+GHP AGANT
Sbjct: 479 KGDMKTSAWIAAYEKNNVLVGLSCGLRGKAQIGKGMWAMPDLMAAMLEQKIGHPKAGANT 538

Query: 539 AWVPSPTAATLHAMHYHKIDVQARQVELAKREKAS-----IDDILTIPLAQDTNWSEEEK 593
           AWVPSPTAATLHA+HYH++ V   Q EL K +  +     ++ +L IP+    NWS+ EK
Sbjct: 539 AWVPSPTAATLHALHYHQVLVSDVQKELEKIDANAERGNLLNGLLQIPVTATPNWSDAEK 598

Query: 594 RNELDNNSQGILGYMVRWVEQGVGCSKVPDINDIALMEDRATLRISSQHVANWMRHGVVT 653
           + ELDNN+QGILGY+VRW++QGVGCSKVPDI++IALMEDRATLRISSQH+ANW+ HGVVT
Sbjct: 599 QQELDNNAQGILGYVVRWIDQGVGCSKVPDIHNIALMEDRATLRISSQHMANWLHHGVVT 658

Query: 654 KDQVVESLKRMAPVVDRQNQGDPLYRPMAPDFDNSVAFQAALELVLEGTKQPNGYTEPVL 713
           + QV E+ +RMA VVD QN GDPLY+ MA  FD S A++AA +LV +G +QP+GYTEP+L
Sbjct: 659 EAQVKETFERMAAVVDGQNAGDPLYKNMAGHFDTSAAYKAACDLVFKGLEQPSGYTEPLL 718

Query: 714 HRRRREFKA 722
           H  R + KA
Sbjct: 719 HAWRLKVKA 727


Lambda     K      H
   0.316    0.133    0.386 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 1477
Number of extensions: 59
Number of successful extensions: 5
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 726
Length of database: 730
Length adjustment: 40
Effective length of query: 686
Effective length of database: 690
Effective search space:   473340
Effective search space used:   473340
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.6 bits)
S2: 55 (25.8 bits)
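
The headline numbers in the BLAST report above can be re-derived from the values it prints. The following is a minimal sketch in Python, not part of GapMind: the function names are illustrative, and the inputs are only the figures shown above (473 identical positions in a 729-column alignment, query residues 1 to 722 of 726, subject residues 1 to 727 of 730, raw score 2494, gapped Lambda 0.267 and K 0.0410, effective search space 473340). It assumes the standard Karlin-Altschul conversion from raw score to bit score.

```python
import math

def percent_identity(identical: int, alignment_columns: int) -> float:
    """Identical positions as a percentage of alignment columns (gaps included)."""
    return 100.0 * identical / alignment_columns

def coverage(start: int, end: int, length: int) -> float:
    """Percentage of a sequence spanned by the aligned region (1-based, inclusive)."""
    return 100.0 * (end - start + 1) / length

def bit_score(raw_score: float, lam: float, k: float) -> float:
    """Karlin-Altschul normalization: (lambda * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)

# Values copied from the report above.
identity = percent_identity(473, 729)
query_cov = coverage(1, 722, 726)
subject_cov = coverage(1, 727, 730)
bits = bit_score(2494, lam=0.267, k=0.0410)   # gapped Lambda and K
evalue = 473340 * 2.0 ** (-bits)              # effective search space * 2^-bits

print(f"identity: {identity:.1f}%")
print(f"query coverage: {query_cov:.1f}%, subject coverage: {subject_cov:.1f}%")
print(f"bit score: {bits:.1f}, E-value: {evalue:.3g}")
```

The unrounded identity is slightly above the 64% that BLAST prints, and the computed E-value is far too small to display, which is consistent with the reported "Expect = 0.0".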

Align candidate WP_011803045.1 PNAP_RS18410 (malate synthase G)
to HMM TIGR01345 (glcB: malate synthase G (EC 2.3.3.9))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR01345.hmm
# target sequence database:        /tmp/gapView.3248991.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR01345  [M=721]
Accession:   TIGR01345
Description: malate_syn_G: malate synthase G
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                             Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                             -----------
          0 1204.8   0.0          0 1204.6   0.0    1.0  1  NCBI__GCF_000015505.1:WP_011803045.1  


Domain annotation for each sequence (and alignments):
>> NCBI__GCF_000015505.1:WP_011803045.1  
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 ! 1204.6   0.0         0         0       3     719 ..       5     727 ..       3     729 .. 0.98

  Alignments for each domain:
  == domain 1  score: 1204.6 bits;  conditional E-value: 0
                             TIGR01345   3 vdagrlqvakklkdfveeevlpgtgvdaekfwsgfdeivrdlapenrellakrdeiqaaideyhrknk.gvid 74 
                                            +  +lqva++l++f+e++vlpgtgvd++kfwsgfd+iv dlap+n  lla+rd++q  +d++h+ n+ +  d
  NCBI__GCF_000015505.1:WP_011803045.1   5 TTRHSLQVATELYRFIEDKVLPGTGVDSDKFWSGFDAIVADLAPKNIALLAERDRLQLEMDAWHKANPgPIAD 77 
                                           56789***************************************************************55569 PP

                             TIGR01345  75 keayksflkeigylveepervtietenvdseiasqagpqlvvpvlnaryalnaanarwgslydalygsnvipe 147
                                             ay+ fl+ igylv++pe+v ++t nvd+e+a qagpqlvvpvlnaryalnaanarwgslydalyg+++ipe
  NCBI__GCF_000015505.1:WP_011803045.1  78 MPAYRAFLESIGYLVPQPETVAVTTANVDAELAVQAGPQLVVPVLNARYALNAANARWGSLYDALYGTDAIPE 150
                                           9************************************************************************ PP

                             TIGR01345 148 edgaekgkeynpkrgekviefarefldeslplesgsyadvvkykivdkklavqlesgkvtrlkdeeqfvgyrg 220
                                           +dgaekgk ynp+rg kvi+far++ld++ pl++gs++d+  y+i  ++l+v+     +  l+d++q vgyrg
  NCBI__GCF_000015505.1:WP_011803045.1 151 TDGAEKGKGYNPVRGAKVIAFARNLLDQAAPLSTGSHKDATGYSIEGGQLVVTQA-SGMSGLQDPAQLVGYRG 222
                                           ****************************************************988.56899************ PP

                             TIGR01345 221 daadpevillktnglhielqidarhpigkadkakvkdivlesaittildcedsvaavdaedkvlvyrnllglm 293
                                           daa+p+ +ll +nglhi++ id    +gk+d a++ d+v+esa++tild+edsva vdaedkvl y n+lg+ 
  NCBI__GCF_000015505.1:WP_011803045.1 223 DAAAPSSVLLVHNGLHIDIIIDRATTLGKSDAAGISDMVIESALSTILDLEDSVAVVDAEDKVLAYGNWLGIL 295
                                           ************************************************************************* PP

                             TIGR01345 294 kgtlkeklekngriikrklnedrsytaangeelslhgrsllfvrnvghlmtipviltdegeeipegildgvlt 366
                                           +gtl e+++k g +++r ln dr+ytaa+g e++lhgrsl+fvrnvghlmt+p+il   g+eipegi+d+v+t
  NCBI__GCF_000015505.1:WP_011803045.1 296 QGTLTEEVSKGGTTFTRGLNPDRVYTAADGGEVTLHGRSLMFVRNVGHLMTNPAILYAGGKEIPEGIMDAVVT 368
                                           ************************************************************************* PP

                             TIGR01345 367 svialydlkvqnk..lrnsrkgsvyivkpkmhgpeevafanklftriedllglerhtlkvgvmdeerrtslnl 437
                                           + ia++d+k+q++  ++nsr+gsvyivkpkmhgp evafa +lf+r+e llgl+ +t+k+g+mdeerrts+nl
  NCBI__GCF_000015505.1:WP_011803045.1 369 TTIAIHDFKRQGQpgIKNSRTGSVYIVKPKMHGPAEVAFAAELFGRVEALLGLPANTVKLGIMDEERRTSVNL 441
                                           *********9875559********************************************************* PP

                             TIGR01345 438 kaciakvkervafintgfldrtgdeihtsmeagamvrkadmksapwlkayernnvaagltcglrgkaqigkgm 510
                                           kacia+++ rvafintgfldrtgde+ht+m+ag+m+rk+dmk+++w+ aye+nnv+ gl cglrgkaqigkgm
  NCBI__GCF_000015505.1:WP_011803045.1 442 KACIAEAEARVAFINTGFLDRTGDEMHTAMQAGPMIRKGDMKTSAWIAAYEKNNVLVGLSCGLRGKAQIGKGM 514
                                           ************************************************************************* PP

                             TIGR01345 511 wampdlmaemlekkgdqlragantawvpsptaatlhalhyhrvdvqkvqkeladaerrae....lkeiltipv 579
                                           wampdlma mle+k++ ++agantawvpsptaatlhalhyh+v v  vqkel + + +ae    l+ +l+ipv
  NCBI__GCF_000015505.1:WP_011803045.1 515 WAMPDLMAAMLEQKIGHPKAGANTAWVPSPTAATLHALHYHQVLVSDVQKELEKIDANAErgnlLNGLLQIPV 587
                                           ****************************************************999999885666777899*** PP

                             TIGR01345 580 aentnwseeeikeeldnnvqgilgyvvrwveqgigcskvpdihnvalmedratlrissqhlanwlrhgivske 652
                                           + + nws+ e+++eldnn+qgilgyvvrw++qg+gcskvpdihn+almedratlrissqh+anwl hg+v+  
  NCBI__GCF_000015505.1:WP_011803045.1 588 TATPNWSDAEKQQELDNNAQGILGYVVRWIDQGVGCSKVPDIHNIALMEDRATLRISSQHMANWLHHGVVTEA 660
                                           ************************************************************************* PP

                             TIGR01345 653 qvleslermakvvdkqnagdeayrpmadnleasvafkaakdlilkgtkqpsgytepilharrlefke 719
                                           qv e++erma vvd qnagd+ y++ma+ +++s a+kaa dl++kg +qpsgytep+lha+rl+ k+
  NCBI__GCF_000015505.1:WP_011803045.1 661 QVKETFERMAAVVDGQNAGDPLYKNMAGHFDTSAAYKAACDLVFKGLEQPSGYTEPLLHAWRLKVKA 727
                                           ****************************************************************996 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (721 nodes)
Target sequences:                          1  (730 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.00u 0.01s 00:00:00.01 Elapsed: 00:00:00.01
# Mc/sec: 39.29
//
[ok]
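
How much of the TIGR01345 model and of the candidate protein the domain hit spans can be read off the domain table row in the hmmsearch output above. The sketch below is illustrative only and is not GapMind's own code: it splits that row on whitespace and assumes the column order shown in the header (hmmfrom, hmm to, alifrom, ali to), with the model length (721) and sequence length (730) taken from the pipeline statistics summary.

```python
# Domain table row copied from the hmmsearch output above.
row = ("   1 ! 1204.6   0.0         0         0       3     719 .."
       "       5     727 ..       3     729 .. 0.98")

MODEL_LENGTH = 721     # "Query model(s): 1 (721 nodes)"
SEQUENCE_LENGTH = 730  # "Target sequences: 1 (730 residues searched)"

fields = row.split()
score = float(fields[2])
hmm_from, hmm_to = int(fields[6]), int(fields[7])
ali_from, ali_to = int(fields[9]), int(fields[10])

# Fraction of the model and of the protein covered by the aligned region
# (coordinates are 1-based and inclusive).
hmm_coverage = (hmm_to - hmm_from + 1) / MODEL_LENGTH
seq_coverage = (ali_to - ali_from + 1) / SEQUENCE_LENGTH

print(f"score {score} bits")
print(f"HMM coverage: {hmm_coverage:.1%}")
print(f"sequence coverage: {seq_coverage:.1%}")
```

Both fractions are close to 1, so the hit is essentially full length on both the model and the protein.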

This GapMind analysis is from Apr 09 2024. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMER with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory