Align Malate synthase G (EC 2.3.3.9) (characterized)
to candidate WP_011803045.1 PNAP_RS18410 malate synthase G
Query= reanno::psRCH2:GFF353 (726 letters)

>NCBI__GCF_000015505.1:WP_011803045.1
Length = 730

Score = 965 bits (2494), Expect = 0.0
Identities = 473/729 (64%), Positives = 585/729 (80%), Gaps = 9/729 (1%)

Query: 1   MTERVQVGGLQVAKVLYDFVNNEAIPGTGVDAAAFWAGADSVIHDLAPKNRALLAKRDDL 60
           MTER LQVA LY F+ ++ +PGTGVD+ FW+G D+++ DLAPKN ALLA+RD L
Sbjct: 1   MTERTTRHSLQVATELYRFIEDKVLPGTGVDSDKFWSGFDAIVADLAPKNIALLAERDRL 60

Query: 61  QAQIDAWHQARAGQAHDAVAYKSFLQEIGYLLPEAEDFQATTENVDEEIARMAGPQLVVP 120
           Q ++DAWH+A G D AY++FL+ IGYL+P+ E TT NVD E+A AGPQLVVP
Sbjct: 61  QLEMDAWHKANPGPIADMPAYRAFLESIGYLVPQPETVAVTTANVDAELAVQAGPQLVVP 120

Query: 121 IMNARFALNAANARWGSLYDALYGTDAISEADGASKGPGYNEIRGNKVIAYARNFLNEAA 180
           ++NAR+ALNAANARWGSLYDALYGTDAI E DGA KG GYN +RG KVIA+ARN L++AA
Sbjct: 121 VLNARYALNAANARWGSLYDALYGTDAIPETDGAEKGKGYNPVRGAKVIAFARNLLDQAA 180

Query: 181 PLETGSHVDSTGYRIEGGKLVVSLKDGSTTGLKNPAQLQGFQGEASAPIAVLLKNNGIHF 240
           PL TGSH D+TGY IEGG+LVV+ G + GL++PAQL G++G+A+AP +VLL +NG+H
Sbjct: 181 PLSTGSHKDATGYSIEGGQLVVTQASGMS-GLQDPAQLVGYRGDAAAPSSVLLVHNGLHI 239

Query: 241 EIQIDPASPIGQTDAAGVKDILMESALTTIMDCEDSIAAVDADDKTVVYRNWLGLMKGDL 300
           +I ID A+ +G++DAAG+ D+++ESAL+TI+D EDS+A VDA+DK + Y NWLG+++G L
Sbjct: 240 DIIIDRATTLGKSDAAGISDMVIESALSTILDLEDSVAVVDAEDKVLAYGNWLGILQGTL 299

Query: 301 VEELEKGGKRITRAMNPDRVYTKADGNGELTLHGRSLLFIRNVGHLMTNDAILDKEGNEV 360
           EE+ KGG TR +NPDRVYT ADG GE+TLHGRSL+F+RNVGHLMTN AIL G E+
Sbjct: 300 TEEVSKGGTTFTRGLNPDRVYTAADG-GEVTLHGRSLMFVRNVGHLMTNPAILYAGGKEI 358

Query: 361 PEGIMDGLFTSLIAVHNLN--GNTSRKNTRTGSMYIVKPKMHGPEEVAFATELFGRVEDV 418
           PEGIMD + T+ IA+H+ G KN+RTGS+YIVKPKMHGP EVAFA ELFGRVE +
Sbjct: 359 PEGIMDAVVTTTIAIHDFKRQGQPGIKNSRTGSVYIVKPKMHGPAEVAFAAELFGRVEAL 418

Query: 419 LGLPRNTLKVGIMDEERRTTINLKACIKEARERVVFINTGFLDRTGDEIHTSMEAGPMVR 478
           LGLP NT+K+GIMDEERRT++NLKACI EA RV FINTGFLDRTGDE+HT+M+AGPM+R
Sbjct: 419 LGLPANTVKLGIMDEERRTSVNLKACIAEAEARVAFINTGFLDRTGDEMHTAMQAGPMIR 478

Query: 479 KAAMKAEKWISAYENNNVDVGLACGLQGKAQIGKGMWAMPDLMAAMLEQKVGHPMAGANT 538
           K MK WI+AYE NNV VGL+CGL+GKAQIGKGMWAMPDLMAAMLEQK+GHP AGANT
Sbjct: 479 KGDMKTSAWIAAYEKNNVLVGLSCGLRGKAQIGKGMWAMPDLMAAMLEQKIGHPKAGANT 538

Query: 539 AWVPSPTAATLHAMHYHKIDVQARQVELAKREKAS-----IDDILTIPLAQDTNWSEEEK 593
           AWVPSPTAATLHA+HYH++ V Q EL K + + ++ +L IP+ NWS+ EK
Sbjct: 539 AWVPSPTAATLHALHYHQVLVSDVQKELEKIDANAERGNLLNGLLQIPVTATPNWSDAEK 598

Query: 594 RNELDNNSQGILGYMVRWVEQGVGCSKVPDINDIALMEDRATLRISSQHVANWMRHGVVT 653
           + ELDNN+QGILGY+VRW++QGVGCSKVPDI++IALMEDRATLRISSQH+ANW+ HGVVT
Sbjct: 599 QQELDNNAQGILGYVVRWIDQGVGCSKVPDIHNIALMEDRATLRISSQHMANWLHHGVVT 658

Query: 654 KDQVVESLKRMAPVVDRQNQGDPLYRPMAPDFDNSVAFQAALELVLEGTKQPNGYTEPVL 713
           + QV E+ +RMA VVD QN GDPLY+ MA FD S A++AA +LV +G +QP+GYTEP+L
Sbjct: 659 TEAQVKETFERMAAVVDGQNAGDPLYKNMAGHFDTSAAYKAACDLVFKGLEQPSGYTEPLL 718

Query: 714 HRRRREFKA 722
           H R + KA
Sbjct: 719 HAWRLKVKA 727

Lambda     K      H
   0.316    0.133    0.386

Gapped
Lambda     K      H
   0.267   0.0410    0.140

Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 1477
Number of extensions: 59
Number of successful extensions: 5
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 726
Length of database: 730
Length adjustment: 40
Effective length of query: 686
Effective length of database: 690
Effective search space: 473340
Effective search space used: 473340
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.6 bits)
S2: 55 (25.8 bits)
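The summary figures in this report can be re-derived from its own counts. The following is a minimal sketch and not part of GapMind; the function and key names are illustrative, and the numbers are copied from the alignment above. The whole-number percentages in the report appear to be truncated rather than rounded (473/729 is about 64.9%, shown as 64%).

```python
def blast_summary(identities, positives, gaps, aln_len,
                  query_len, subject_len, length_adjustment,
                  query_start, query_end):
    """Re-derive summary values from counts printed in a BLAST report."""
    # Effective lengths are the raw lengths minus the length adjustment.
    eff_query = query_len - length_adjustment
    eff_subject = subject_len - length_adjustment
    return {
        "pct_identity": 100.0 * identities / aln_len,
        "pct_positives": 100.0 * positives / aln_len,
        "pct_gaps": 100.0 * gaps / aln_len,
        # Fraction of the query covered by the aligned region.
        "query_coverage": (query_end - query_start + 1) / query_len,
        "effective_query_length": eff_query,
        "effective_subject_length": eff_subject,
        "effective_search_space": eff_query * eff_subject,
    }


# Values taken from the report: 473/729 identities, 585/729 positives,
# 9/729 gaps, query 726 letters, subject 730, length adjustment 40,
# aligned query positions 1..722.
summary = blast_summary(473, 585, 9, 729, 726, 730, 40, 1, 722)
for name, value in summary.items():
    print(f"{name}: {value}")
```

The product of the two effective lengths reproduces the "Effective search space" line of the report.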
Align candidate WP_011803045.1 PNAP_RS18410 (malate synthase G)
to HMM TIGR01345 (glcB: malate synthase G (EC 2.3.3.9))
# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR01345.hmm
# target sequence database:        /tmp/gapView.3248991.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR01345  [M=721]
Accession:   TIGR01345
Description: malate_syn_G: malate synthase G
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                              Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                              -----------
          0 1204.8   0.0          0 1204.6   0.0    1.0  1  NCBI__GCF_000015505.1:WP_011803045.1

Domain annotation for each sequence (and alignments):
>> NCBI__GCF_000015505.1:WP_011803045.1
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 ! 1204.6   0.0         0         0       3     719 ..       5     727 ..       3     729 .. 0.98

  Alignments for each domain:
  == domain 1  score: 1204.6 bits;  conditional E-value: 0
                             TIGR01345   3 vdagrlqvakklkdfveeevlpgtgvdaekfwsgfdeivrdlapenrellakrdeiqaaideyhrknk.gvid 74
                                           + +lqva++l++f+e++vlpgtgvd++kfwsgfd+iv dlap+n lla+rd++q +d++h+ n+ + d
  NCBI__GCF_000015505.1:WP_011803045.1   5 TTRHSLQVATELYRFIEDKVLPGTGVDSDKFWSGFDAIVADLAPKNIALLAERDRLQLEMDAWHKANPgPIAD 77
                                           56789***************************************************************55569 PP

                             TIGR01345  75 keayksflkeigylveepervtietenvdseiasqagpqlvvpvlnaryalnaanarwgslydalygsnvipe 147
                                           ay+ fl+ igylv++pe+v ++t nvd+e+a qagpqlvvpvlnaryalnaanarwgslydalyg+++ipe
  NCBI__GCF_000015505.1:WP_011803045.1  78 MPAYRAFLESIGYLVPQPETVAVTTANVDAELAVQAGPQLVVPVLNARYALNAANARWGSLYDALYGTDAIPE 150
                                           9************************************************************************ PP

                             TIGR01345 148 edgaekgkeynpkrgekviefarefldeslplesgsyadvvkykivdkklavqlesgkvtrlkdeeqfvgyrg 220
                                           +dgaekgk ynp+rg kvi+far++ld++ pl++gs++d+ y+i ++l+v+ + l+d++q vgyrg
  NCBI__GCF_000015505.1:WP_011803045.1 151 TDGAEKGKGYNPVRGAKVIAFARNLLDQAAPLSTGSHKDATGYSIEGGQLVVTQA-SGMSGLQDPAQLVGYRG 222
                                           ****************************************************988.56899************ PP

                             TIGR01345 221 daadpevillktnglhielqidarhpigkadkakvkdivlesaittildcedsvaavdaedkvlvyrnllglm 293
                                           daa+p+ +ll +nglhi++ id +gk+d a++ d+v+esa++tild+edsva vdaedkvl y n+lg+
  NCBI__GCF_000015505.1:WP_011803045.1 223 DAAAPSSVLLVHNGLHIDIIIDRATTLGKSDAAGISDMVIESALSTILDLEDSVAVVDAEDKVLAYGNWLGIL 295
                                           ************************************************************************* PP

                             TIGR01345 294 kgtlkeklekngriikrklnedrsytaangeelslhgrsllfvrnvghlmtipviltdegeeipegildgvlt 366
                                           +gtl e+++k g +++r ln dr+ytaa+g e++lhgrsl+fvrnvghlmt+p+il g+eipegi+d+v+t
  NCBI__GCF_000015505.1:WP_011803045.1 296 QGTLTEEVSKGGTTFTRGLNPDRVYTAADGGEVTLHGRSLMFVRNVGHLMTNPAILYAGGKEIPEGIMDAVVT 368
                                           ************************************************************************* PP

                             TIGR01345 367 svialydlkvqnk..lrnsrkgsvyivkpkmhgpeevafanklftriedllglerhtlkvgvmdeerrtslnl 437
                                           + ia++d+k+q++ ++nsr+gsvyivkpkmhgp evafa +lf+r+e llgl+ +t+k+g+mdeerrts+nl
  NCBI__GCF_000015505.1:WP_011803045.1 369 TTIAIHDFKRQGQpgIKNSRTGSVYIVKPKMHGPAEVAFAAELFGRVEALLGLPANTVKLGIMDEERRTSVNL 441
                                           *********9875559********************************************************* PP

                             TIGR01345 438 kaciakvkervafintgfldrtgdeihtsmeagamvrkadmksapwlkayernnvaagltcglrgkaqigkgm 510
                                           kacia+++ rvafintgfldrtgde+ht+m+ag+m+rk+dmk+++w+ aye+nnv+ gl cglrgkaqigkgm
  NCBI__GCF_000015505.1:WP_011803045.1 442 KACIAEAEARVAFINTGFLDRTGDEMHTAMQAGPMIRKGDMKTSAWIAAYEKNNVLVGLSCGLRGKAQIGKGM 514
                                           ************************************************************************* PP

                             TIGR01345 511 wampdlmaemlekkgdqlragantawvpsptaatlhalhyhrvdvqkvqkeladaerrae....lkeiltipv 579
                                           wampdlma mle+k++ ++agantawvpsptaatlhalhyh+v v vqkel + + +ae l+ +l+ipv
  NCBI__GCF_000015505.1:WP_011803045.1 515 WAMPDLMAAMLEQKIGHPKAGANTAWVPSPTAATLHALHYHQVLVSDVQKELEKIDANAErgnlLNGLLQIPV 587
                                           ****************************************************999999885666777899*** PP

                             TIGR01345 580 aentnwseeeikeeldnnvqgilgyvvrwveqgigcskvpdihnvalmedratlrissqhlanwlrhgivske 652
                                           + + nws+ e+++eldnn+qgilgyvvrw++qg+gcskvpdihn+almedratlrissqh+anwl hg+v+
  NCBI__GCF_000015505.1:WP_011803045.1 588 TATPNWSDAEKQQELDNNAQGILGYVVRWIDQGVGCSKVPDIHNIALMEDRATLRISSQHMANWLHHGVVTEA 660
                                           ************************************************************************* PP

                             TIGR01345 653 qvleslermakvvdkqnagdeayrpmadnleasvafkaakdlilkgtkqpsgytepilharrlefke 719
                                           qv e++erma vvd qnagd+ y++ma+ +++s a+kaa dl++kg +qpsgytep+lha+rl+ k+
  NCBI__GCF_000015505.1:WP_011803045.1 661 QVKETFERMAAVVDGQNAGDPLYKNMAGHFDTSAAYKAACDLVFKGLEQPSGYTEPLLHAWRLKVKA 727
                                           ****************************************************************996 PP

Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (721 nodes)
Target sequences:                          1  (730 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.00u 0.01s 00:00:00.01 Elapsed: 00:00:00.01
# Mc/sec: 39.29
//
[ok]
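How much of the model and of the candidate the single domain hit spans can be worked out from the coordinates in the domain table (hmmfrom 3 to 719 of M=721; alifrom 5 to 727 of 730 residues). This is a small sketch, not GapMind's own code; `span_coverage` is a hypothetical helper name.

```python
def span_coverage(start, end, total_length):
    """Fraction of a sequence or model covered by an inclusive 1-based span."""
    return (end - start + 1) / total_length


# Coordinates copied from the hmmsearch domain table above.
model_coverage = span_coverage(3, 719, 721)    # hmmfrom..hmm to over M=721
target_coverage = span_coverage(5, 727, 730)   # alifrom..ali to over 730 residues

print(f"model coverage:  {model_coverage:.3f}")
print(f"target coverage: {target_coverage:.3f}")
```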
This GapMind analysis is from Apr 09 2024. The underlying query database was built on Sep 17 2021.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other blast hits with at least 50% coverage are "low confidence."
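Once every candidate carries a confidence label, a step can be flagged as a gap when none of its candidates is high or medium confidence. The sketch below illustrates only that bookkeeping plus the 50% coverage fallback stated above; it does not reproduce the high- and medium-confidence rules. The function names and the step names `stepA` and `stepB` are hypothetical; `glcB` is taken from the report above.

```python
def fallback_confidence(coverage):
    """Label for a BLAST hit that met neither the high nor the medium rules.

    Per the text, such hits are "low confidence" at 50% coverage or more;
    below that they are not reported as candidates (None here).
    """
    return "low" if coverage >= 0.5 else None


def find_gaps(step_candidates):
    """Return the steps that have no high- or medium-confidence candidate."""
    gaps = []
    for step, confidences in step_candidates.items():
        if not any(c in ("high", "medium") for c in confidences):
            gaps.append(step)
    return gaps


# Illustrative input: confidence labels of the candidates for each step.
example = {
    "glcB": ["high"],
    "stepA": ["low", "low"],   # only low-confidence candidates: a gap
    "stepB": [],               # no candidates at all: a gap
}
gaps = find_gaps(example)
print(gaps)
```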
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
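To illustrate what searching the six-frame translation means: a DNA sequence is translated in three reading frames on the forward strand and three on the reverse complement, so that proteins missed by gene prediction can still be found. The sketch below is not how Curated BLAST is implemented. It assumes the standard codon assignments and plain A/C/G/T input, and all names in it are illustrative.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Reverse complement of an uppercase A/C/G/T string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons; codons with other symbols become 'X'."""
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))


def six_frame(dna):
    """Return translations keyed '+1'..'+3' (forward) and '-1'..'-3' (reverse)."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


frames = six_frame("ATGGCC")
for name, protein in frames.items():
    print(name, protein)
```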
For more information, see:
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory