Align Malate synthase G (EC 2.3.3.9) (characterized)
to candidate 8501453 DvMF_2183 malate synthase G (RefSeq)
Query= reanno::psRCH2:GFF353 (726 letters)

>FitnessBrowser__Miya:8501453
Length = 731

Score = 930 bits (2404), Expect = 0.0
Identities = 460/731 (62%), Positives = 558/731 (76%), Gaps = 6/731 (0%)

Query: 1   MTERVQVGGLQVAKVLYDFVNNEAIPGTGVDAAAFWAGADSVIHDLAPKNRALLAKRDDL 60
           MT+RVQVGGLQ+A  LYD +  +  PGTGVD   FW   ++++  +A +NRALLAKR +L
Sbjct: 1   MTQRVQVGGLQIAAPLYDVIVRDIAPGTGVDPDRFWTALENMVETMADRNRALLAKRAEL 60

Query: 61  QAQIDAWHQARAGQAHDAVAYKSFLQEIGYLLPEAEDFQATTENVDEEIARMAGPQLVVP 120
           Q  IDAWH+ R G  HD  AY++FL+ IGYL+PE  DF  TTE VD EIA +AGPQLVVP
Sbjct: 61  QDAIDAWHRERRGTPHDGAAYEAFLRSIGYLVPEGPDFAVTTEGVDPEIALVAGPQLVVP 120

Query: 121 IMNARFALNAANARWGSLYDALYGTDAISE---ADGASKGPGYNEIRGNKVIAYARNFLN 177
           I NAR+ALNAANARWGSLYDALYGTD I E     GA +G  YN  RG  V+A A  FL+
Sbjct: 121 ITNARYALNAANARWGSLYDALYGTDVIPEDPARGGAPRGGAYNPARGALVVARAAAFLD 180

Query: 178 EAAPLETGSHVDSTGYRIEGGKLVVSLKDGSTTGLKNPAQLQGFQGE-ASAPIAVLLKNN 236
           EA PL TGSH D+  Y + GGKL V LKDG+ TGL +PA+  G  G+ A    AVLL+N+
Sbjct: 181 EAFPLATGSHADAARYDVRGGKLAVILKDGAETGLADPARFVGHAGDPAGGNGAVLLRNH 240

Query: 237 GIHFEIQIDPASPIGQTDAAGVKDILMESALTTIMDCEDSIAAVDADDKTVVYRNWLGLM 296
           G+H EI+ID    IG+  AAGV+D++ME+A+TTI+DCEDS+A VD  DK + YRN LGL
Sbjct: 241 GLHAEIRIDREHAIGRGHAAGVRDVVMEAAMTTILDCEDSVAVVDGADKALAYRNMLGLF 300

Query: 297 KGDLVEELEKGGKRITRAMNPDRVYTKADGNGELTLHGRSLLFIRNVGHLMTNDAILDKE 356
           +GDL  E  KGG+ + R +NPDR YT  DG    +L GRSLL +R VGHLMT DA+L +
Sbjct: 301 RGDLSAEFPKGGRSVLRTLNPDREYTGPDG-AAFSLPGRSLLLVRTVGHLMTTDAVLARS 359

Query: 357 GNEVPEGIMDGLFTSLIAVHNLNGNTSRKNTRTGSMYIVKPKMHGPEEVAFATELFGRVE 416
           G E+PEG++D + T+ IA+H+L G +S +N+RTG +YIVKPK HGPEEVAF  ELF   E
Sbjct: 360 GEEIPEGMLDTMATAYIALHDLRGTSSVRNSRTGGVYIVKPKQHGPEEVAFTVELFRMAE 419

Query: 417 DVLGLPRNTLKVGIMDEERRTTINLKACIKEARERVVFINTGFLDRTGDEIHTSMEAGPM 476
           D LG+PRNTLK+GIMDEERRTT+NLK CI+ A ERV+FINTGFLDRTGDEIHT MEAGP+
Sbjct: 420 DALGMPRNTLKIGIMDEERRTTVNLKECIRAAAERVIFINTGFLDRTGDEIHTCMEAGPV 479

Query: 477 VRKAAMKAEKWISAYENNNVDVGLACGLQGKAQIGKGMWAMPDLMAAMLEQKVGHPMAGA 536
           VRK AM+ E+WI AYE+ NVD GLACGL G+AQ+GKGMWA PD+M  M+E K+GHP AGA
Sbjct: 480 VRKNAMRGERWIIAYEDWNVDTGLACGLAGRAQVGKGMWAKPDMMREMVETKIGHPRAGA 539

Query: 537 NTAWVPSPTAATLHAMHYHKIDVQARQVELAKREKASIDDILTIPL-AQDTNWSEEEKRN 595
           N AWVPSPTAATLHAMHYH +DV A Q  LA + +A++ D+LT+PL    +  + +E
Sbjct: 540 NCAWVPSPTAATLHAMHYHAVDVAAVQKTLAGQRRATLADLLTLPLMGPASRPTPQEVEE 599

Query: 596 ELDNNSQGILGYMVRWVEQGVGCSKVPDINDIALMEDRATLRISSQHVANWMRHGVVTKD 655
           EL NN+Q ILGY+VRWVEQG+GCSKVPDI D+ LMEDRATLRISSQH+ANW+ HG+ T+D
Sbjct: 600 ELANNAQSILGYVVRWVEQGIGCSKVPDITDVGLMEDRATLRISSQHIANWLHHGICTRD 659

Query: 656 QVVESLKRMAPVVDRQNQGDPLYRPMAPDFDNSVAFQAALELVLEGTKQPNGYTEPVLHR 715
           QVV  LKRMA VVDRQN GDP YRPM+ DFD SVAFQAA +LVL G +QP+GYTEP+LH
Sbjct: 660 QVVAVLKRMAAVVDRQNAGDPAYRPMSADFDASVAFQAACDLVLLGREQPSGYTEPILHA 719

Query: 716 RRREFKAKNGL 726
           RR+E KAK G+
Sbjct: 720 RRQEAKAKFGI 730

Lambda   K       H
0.316    0.133   0.386

Gapped
Lambda   K       H
0.267    0.0410  0.140

Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 1424
Number of extensions: 47
Number of successful extensions: 5
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 726
Length of database: 731
Length adjustment: 40
Effective length of query: 686
Effective length of database: 691
Effective search space: 474026
Effective search space used: 474026
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.6 bits)
S2: 55 (25.8 bits)
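The summary figures in the BLAST report can be cross-checked with a few lines of arithmetic. The sketch below copies the counts from the report; the helper name `percent` is only illustrative, and truncating to a whole number is an assumption that happens to reproduce the printed 62%, 76% and 0%.

```python
# Figures copied from the BLAST report above.
identities, positives, gaps, columns = 460, 558, 6, 731
query_len, subject_len, length_adjustment = 726, 731, 40


def percent(count: int, total: int) -> int:
    """Whole-number percentage, truncated (assumed to match the report's rounding)."""
    return 100 * count // total


# Effective lengths are the raw lengths minus the reported length adjustment.
effective_query = query_len - length_adjustment
effective_subject = subject_len - length_adjustment

# The effective search space is the product of the two effective lengths.
search_space = effective_query * effective_subject

print(percent(identities, columns), percent(positives, columns), percent(gaps, columns))
print(effective_query, effective_subject, search_space)
```

The second line of output should agree with the "Effective length" and "Effective search space" entries in the statistics block.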
Align candidate 8501453 DvMF_2183 (malate synthase G (RefSeq))
to HMM TIGR01345 (glcB: malate synthase G (EC 2.3.3.9))
# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:            ../tmp/path.carbon/TIGR01345.hmm
# target sequence database:  /tmp/gapView.16873.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR01345  [M=721]
Accession:   TIGR01345
Description: malate_syn_G: malate synthase G

Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                          Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                          -----------
          0 1070.9   0.0          0 1070.7   0.0    1.0  1  lcl|FitnessBrowser__Miya:8501453  DvMF_2183 malate synthase G (Ref

Domain annotation for each sequence (and alignments):
>> lcl|FitnessBrowser__Miya:8501453  DvMF_2183 malate synthase G (RefSeq)
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 ! 1070.7   0.0         0         0       1     720 [.       3     727 ..       3     728 .. 0.98

Alignments for each domain:
== domain 1  score: 1070.7 bits;  conditional E-value: 0

                       TIGR01345   1 ervdagrlqvakklkdfveeevlpgtgvdaekfwsgfdeivrdlapenrellakrdeiqaaideyhrknk.gvidke 76
                                     +rv++g+lq+a+ l+d + +++ pgtgvd + fw++++++v+ +a +nr llakr e+q aid++hr+ + ++ d +
lcl|FitnessBrowser__Miya:8501453   3 QRVQVGGLQIAAPLYDVIVRDIAPGTGVDPDRFWTALENMVETMADRNRALLAKRAELQDAIDAWHRERRgTPHDGA 79
                                     5799****************************************************************995679*** PP

                       TIGR01345  77 ayksflkeigylveepervtietenvdseiasqagpqlvvpvlnaryalnaanarwgslydalygsnvipee...dg 150
                                     ay+ fl+ igylv+e     ++te+vd eia  agpqlvvp++naryalnaanarwgslydalyg++vipe+   +g
lcl|FitnessBrowser__Miya:8501453  80 AYEAFLRSIGYLVPEGPDFAVTTEGVDPEIALVAGPQLVVPITNARYALNAANARWGSLYDALYGTDVIPEDparGG 156
                                     *********************************************************************99644478 PP

                       TIGR01345 151 aekgkeynpkrgekviefarefldeslplesgsyadvvkykivdkklavqlesgkvtrlkdeeqfvgyrgdaa.dpe 226
                                     a +g+ ynp rg  v++ a  flde++pl +gs+ad+ +y++  +klav l++g +t l d+++fvg+ gd a
lcl|FitnessBrowser__Miya:8501453 157 APRGGAYNPARGALVVARAAAFLDEAFPLATGSHADAARYDVRGGKLAVILKDGAETGLADPARFVGHAGDPAgGNG 233
                                     99********************************************************************9761567 PP

                       TIGR01345 227 villktnglhielqidarhpigkadkakvkdivlesaittildcedsvaavdaedkvlvyrnllglmkgtlkeklek 303
                                     ++ll+++glh e++id +h ig+   a+v+d+v+e+a+ttildcedsva vd  dk l yrn+lgl +g+l +++ k
lcl|FitnessBrowser__Miya:8501453 234 AVLLRNHGLHAEIRIDREHAIGRGHAAGVRDVVMEAAMTTILDCEDSVAVVDGADKALAYRNMLGLFRGDLSAEFPK 310
                                     9**************************************************************************** PP

                       TIGR01345 304 ngriikrklnedrsytaangeelslhgrsllfvrnvghlmtipviltdegeeipegildgvltsvialydlkvqnkl 380
                                      gr + r ln dr+yt+++g  +sl+grsll+vr vghlmt+ ++l  +geeipeg+ld++ t+ ial+dl+  +++
lcl|FitnessBrowser__Miya:8501453 311 GGRSVLRTLNPDREYTGPDGAAFSLPGRSLLLVRTVGHLMTTDAVLARSGEEIPEGMLDTMATAYIALHDLRGTSSV 387
                                     ***************************************************************************** PP

                       TIGR01345 381 rnsrkgsvyivkpkmhgpeevafanklftriedllglerhtlkvgvmdeerrtslnlkaciakvkervafintgfld 457
                                     rnsr+g vyivkpk+hgpeevaf+ +lf   ed lg++r+tlk+g+mdeerrt++nlk ci  ++erv+fintgfld
lcl|FitnessBrowser__Miya:8501453 388 RNSRTGGVYIVKPKQHGPEEVAFTVELFRMAEDALGMPRNTLKIGIMDEERRTTVNLKECIRAAAERVIFINTGFLD 464
                                     ***************************************************************************** PP

                       TIGR01345 458 rtgdeihtsmeagamvrkadmksapwlkayernnvaagltcglrgkaqigkgmwampdlmaemlekkgdqlragant 534
                                     rtgdeiht+meag++vrk+ m+   w+ aye  nv++gl cgl g+aq+gkgmwa pd+m em+e k++ +ragan
lcl|FitnessBrowser__Miya:8501453 465 RTGDEIHTCMEAGPVVRKNAMRGERWIIAYEDWNVDTGLACGLAGRAQVGKGMWAKPDMMREMVETKIGHPRAGANC 541
                                     ***************************************************************************** PP

                       TIGR01345 535 awvpsptaatlhalhyhrvdvqkvqkeladaerraelkeiltipvaent.nwseeeikeeldnnvqgilgyvvrwve 610
                                     awvpsptaatlha+hyh vdv +vqk+la + rra+l ++lt+p+   + + + +e++eel nn+q+ilgyvvrwve
lcl|FitnessBrowser__Miya:8501453 542 AWVPSPTAATLHAMHYHAVDVAAVQKTLAGQ-RRATLADLLTLPLMGPAsRPTPQEVEEELANNAQSILGYVVRWVE 617
                                     *****************************98.99**********9865516799*********************** PP

                       TIGR01345 611 qgigcskvpdihnvalmedratlrissqhlanwlrhgivskeqvleslermakvvdkqnagdeayrpmadnleasva 687
                                     qgigcskvpdi +v lmedratlrissqh+anwl hgi +++qv++ l+rma vvd+qnagd+ayrpm+ +++asva
lcl|FitnessBrowser__Miya:8501453 618 QGIGCSKVPDITDVGLMEDRATLRISSQHIANWLHHGICTRDQVVAVLKRMAAVVDRQNAGDPAYRPMSADFDASVA 694
                                     ***************************************************************************** PP

                       TIGR01345 688 fkaakdlilkgtkqpsgytepilharrlefkek 720
                                     f+aa dl+l g +qpsgytepilharr+e k+k
lcl|FitnessBrowser__Miya:8501453 695 FQAACDLVLLGREQPSGYTEPILHARRQEAKAK 727
                                     ******************************987 PP

Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (721 nodes)
Target sequences:                          1  (731 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.03u 0.01s 00:00:00.04 Elapsed: 00:00:00.04
# Mc/sec: 12.45
//
[ok]
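The domain table gives the aligned coordinates on both the model (hmmfrom 1, hmm to 720, M=721) and the candidate protein (alifrom 3, ali to 727, 731 residues searched). A minimal sketch of turning those coordinates into coverage fractions follows; the function name `coverage` is hypothetical and is not part of HMMER or GapMind.

```python
def coverage(start: int, end: int, total_length: int) -> float:
    """Fraction of a model or sequence spanned by the inclusive range start..end."""
    return (end - start + 1) / total_length


# Coordinates copied from the hmmsearch domain table above.
model_coverage = coverage(1, 720, 721)       # hmmfrom..hmm to over the model length
sequence_coverage = coverage(3, 727, 731)    # alifrom..ali to over the protein length

print(f"model: {model_coverage:.3f}  sequence: {sequence_coverage:.3f}")
```

Both fractions come out above 0.99, meaning the alignment spans nearly the full length of both the model and the protein.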
This GapMind analysis is from Sep 17 2021. The underlying query database was built on Sep 17 2021.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMER with enzyme models (usually from TIGRFAMs). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other BLAST hits with at least 50% coverage are "low confidence."
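The ordering of the three confidence levels can be sketched as a small function. The specific criteria for "high" and "medium" are not listed in this text, so the sketch takes them as already-evaluated booleans (`meets_high` and `meets_medium` are hypothetical inputs); only the 50% coverage threshold for "low" comes from the description above.

```python
from typing import Optional


def confidence_label(meets_high: bool, meets_medium: bool,
                     is_blast_hit: bool, coverage: float) -> Optional[str]:
    """Return 'high', 'medium', 'low', or None for a candidate.

    meets_high / meets_medium: assumed to be computed elsewhere, since the
    detailed criteria are not given here.
    coverage: fraction of the characterized protein covered by the hit (0 to 1).
    """
    if meets_high:
        return "high"
    if meets_medium:
        return "medium"
    if is_blast_hit and coverage >= 0.5:
        return "low"
    return None


print(confidence_label(False, False, True, 0.62))  # low
print(confidence_label(False, False, True, 0.40))  # None
```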
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
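The definition of a gap given above (a step with no high- or medium-confidence candidate) can be illustrated with a short sketch. The step names and the input layout are hypothetical and chosen only for the example; this is not GapMind's own code.

```python
from typing import Dict, List


def find_gaps(candidates_by_step: Dict[str, List[str]]) -> List[str]:
    """Return the steps that have no 'high' or 'medium' confidence candidate."""
    return [
        step
        for step, labels in candidates_by_step.items()
        if not any(label in ("high", "medium") for label in labels)
    ]


# Hypothetical pathway: each step maps to the confidence labels of its candidates.
pathway = {
    "stepA": ["high", "low"],
    "stepB": ["low"],
    "stepC": [],
    "stepD": ["medium"],
}

print(find_gaps(pathway))  # ['stepB', 'stepC']
```

A step with only low-confidence candidates still counts as a gap under this definition, which is why `stepB` is reported alongside the step with no candidates at all.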
For more information, see:
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory