Align malate synthase (EC 2.3.3.9) (characterized)
to candidate WP_078210938.1 BXU11_RS02770 malate synthase A
Query= BRENDA::Q5YLB8 (539 letters)

>NCBI__GCF_002017945.1:WP_078210938.1
Length = 533

Score = 548 bits (1411), Expect = e-160
Identities = 274/504 (54%), Positives = 353/504 (70%), Gaps = 6/504 (1%)

Query: 24  ILTKEATAFLALLHRTFNPTRKALLQRRSDRQAELDRGSLLDFLPETKHIRDNDAWKGAP 83
           ILT+EA FL LH FN R LL+ R +Q + D G L F ETK IR+++ W+ AP
Sbjct: 25  ILTEEAIDFLTALHENFNTKRLELLEARKAQQVQFDNGVLPTFPLETKSIRESN-WQAAP 83

Query: 84  PAPGLVDRRVEITGPTDRKMVVNALNSDVYTYMADFEDSSAPTWDNMVNGQVNLYDAIRR 143
           L+DRRVEITGP DRKMV+NALNS T+MADFEDS++PTWDN++ GQ NL DA+ +
Sbjct: 84  VPKDLIDRRVEITGPVDRKMVINALNSGAKTFMADFEDSTSPTWDNIMEGQQNLKDAVNK 143

Query: 144 QVDFKQGGKDYKLRTDRKLPTLIARARGWHLDEKHLTVDGEPMSGSLFDFGLYFFHNAKE 203
           + + K+ K K LI R RG HL+EKH+ + E SGSL DFGLY FHN +
Sbjct: 144 TITLEDPIKNKKYALKDKTAVLIVRPRGLHLNEKHILIANEEASGSLIDFGLYAFHNHDQ 203

Query: 204 LVKRGAGPYFYLPKMESHLEARMWNDVYNLAQDYIGMPRGTIRATVLIETISAAFEMDEI 263
           L + G+ PYFYLPK+E +LEAR WN+V+ AQ+Y+G GT +ATVLIETI+A+F++DEI
Sbjct: 204 LARNGSAPYFYLPKLEHYLEARWWNEVFEFAQEYLGEQHGTFKATVLIETITASFQLDEI 263

Query: 264 IYELRDHSSGLNCGRWDYIFSFIKKFRQNPSFVLPDRSDVTMTVPFMDAYVKLLIKTCHR 323
           IYELRDH GLNCGRWDYIFS+IKKFR NP+F++P+R VTMT PFMDAY KL+I+ CH+
Sbjct: 264 IYELRDHIVGLNCGRWDYIFSYIKKFRNNPAFIVPNRDQVTMTSPFMDAYSKLVIQRCHK 323

Query: 324 RGVHAMGGMAAQIPIKDDPKANEAAMASVRADKLREVRAGHDGTWVGHPALAKIATDVFD 383
           R +HAMGGMAAQIPIK+DP+ANE A V ADK RE R GHDGTWV HP L IA VF+
Sbjct: 324 RNIHAMGGMAAQIPIKNDPEANEIAFKKVIADKEREARNGHDGTWVAHPDLVPIAMKVFN 383

Query: 384 QYMPTPNQLFVRREDVHITANDLLNTNVPGRITEDGIRKNLNIGLSYMEGWLRGVGCIPI 443
           + M T N + ++R D+HIT DLL V G ITE+GIRKN+N+ + Y+ WL G G I
Sbjct: 384 ENMMTKNHIHIKRADLHITEADLLQIPV-GTITEEGIRKNVNVAVLYITSWLNGQGAAAI 442

Query: 444 NYLMEDAATAEVSRSQLWQWVKHNVTTAEGKRV-DKAY---ALKLLQEQTDELASKAPKG 499
           ++LMEDAATAE+SRSQLWQW+++ VT G+++ K Y AL+ ++ ++ +A +
Sbjct: 443 HHLMEDAATAEISRSQLWQWLQNEVTLDSGEKLTTKLYHRIALEEYEKIRKQVGDRAHEE 502

Query: 500 NRYQLAARYFAGQVAGEDYADFLT 523
           Y LA + V + + +FLT
Sbjct: 503 NYILAEKLLDELVVNKKFVEFLT 526

Lambda K H
0.320 0.136 0.410

Gapped
Lambda K H
0.267 0.0410 0.140

Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 741
Number of extensions: 31
Number of successful extensions: 3
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 539
Length of database: 533
Length adjustment: 35
Effective length of query: 504
Effective length of database: 498
Effective search space: 250992
Effective search space used: 250992
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.8 bits)
S2: 52 (24.6 bits)
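The summary figures in the BLAST report above can be cross-checked with a little arithmetic. The following is a minimal Python sketch, not part of GapMind: the variable names and the `pct` helper are hypothetical, and the coverage definition (aligned positions divided by full sequence length) is an assumption made for illustration. All numbers are copied from the report.

```python
# Values copied from the BLAST report above; names are illustrative only.
identities, positives, gaps, aln_len = 274, 353, 6, 504
query_len, db_len, length_adjustment = 539, 533, 35
query_start, query_end = 24, 523      # first and last aligned query positions
sbjct_start, sbjct_end = 25, 526      # first and last aligned subject positions


def pct(numerator, denominator):
    """Percentage rounded to the nearest integer, as shown in the report."""
    return round(100 * numerator / denominator)


# Effective lengths are the raw lengths minus the length adjustment.
eff_query = query_len - length_adjustment
eff_db = db_len - length_adjustment
search_space = eff_query * eff_db

# Assumed definition of coverage: aligned span / full sequence length.
query_cov = (query_end - query_start + 1) / query_len
sbjct_cov = (sbjct_end - sbjct_start + 1) / db_len

print(pct(identities, aln_len), pct(positives, aln_len), pct(gaps, aln_len))
print(eff_query, eff_db, search_space)
print(f"query coverage {query_cov:.3f}, subject coverage {sbjct_cov:.3f}")
```

The effective search space reproduces the reported 250992 because it is the product of the two effective lengths (504 and 498).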
Align candidate WP_078210938.1 BXU11_RS02770 (malate synthase A)
to HMM TIGR01344 (aceB: malate synthase A (EC 2.3.3.9))
# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file: ../tmp/path.carbon/TIGR01344.hmm
# target sequence database: /tmp/gapView.3081382.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query: TIGR01344 [M=511]
Accession: TIGR01344
Description: malate_syn_A: malate synthase A

Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                              Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                              -----------
     6e-236  769.5   2.7   6.8e-236  769.3   2.7    1.0  1  NCBI__GCF_002017945.1:WP_078210938.1

Domain annotation for each sequence (and alignments):
>> NCBI__GCF_002017945.1:WP_078210938.1
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  769.3   2.7  6.8e-236  6.8e-236       1     510 [.      25     533 .]      25     533 .] 1.00

Alignments for each domain:
== domain 1 score: 769.3 bits; conditional E-value: 6.8e-236

TIGR01344                              1 vltkealeflaelhrrfaerrkellarrekkqakldkgelldflpetkeireddwkvaaipadlldrrveitG 73
                                         +lt+ea+ fl+ lh++f+++r ell++r+++q ++d+g l+ f etk+ire++w+ a++p+dl+drrveitG
NCBI__GCF_002017945.1:WP_078210938.1  25 ILTEEAIDFLTALHENFNTKRLELLEARKAQQVQFDNGVLPTFPLETKSIRESNWQAAPVPKDLIDRRVEITG 97
                                         89*********************************************************************** PP

TIGR01344                             74 PvdrkmvinalnaeakvfladfedsssPtwenvveGqinlkdairgeidftdeesgkeyalkaklavlivrpr 146
                                         Pvdrkmvinaln++ak+f+adfeds+sPtw+n++eGq nlkda++++i+ d+ ++k+yalk k+avlivrpr
NCBI__GCF_002017945.1:WP_078210938.1  98 PVDRKMVINALNSGAKTFMADFEDSTSPTWDNIMEGQQNLKDAVNKTITLEDPIKNKKYALKDKTAVLIVRPR 170
                                         ************************************************************************* PP

TIGR01344                            147 GwhlkerhleidgkaisgslldfglyffhnarellkkGkGPyfylPkleshlearlwndvfllaqevlglprG 219
                                         G+hl+e+h+ i ++ sgsl+dfgly+fhn+ +l ++G+ PyfylPkle++lear+wn+vf +aqe+lg ++G
NCBI__GCF_002017945.1:WP_078210938.1 171 GLHLNEKHILIANEEASGSLIDFGLYAFHNHDQLARNGSAPYFYLPKLEHYLEARWWNEVFEFAQEYLGEQHG 243
                                         ************************************************************************* PP

TIGR01344                            220 tikatvlietlpaafemdeilyelrehssGlncGrwdyifslikklkkaeevvlPdrdavtmdkaflnayskl 292
                                         t katvliet++a+f++dei+yelr+h++GlncGrwdyifs+ikk+++++++++P+rd+vtm+++f++ayskl
NCBI__GCF_002017945.1:WP_078210938.1 244 TFKATVLIETITASFQLDEIIYELRDHIVGLNCGRWDYIFSYIKKFRNNPAFIVPNRDQVTMTSPFMDAYSKL 316
                                         ************************************************************************* PP

TIGR01344                            293 liqtchrrgafalGGmaafiPikddpaaneaalekvradkereaknGhdGtwvahPdlvevalevfdevlgep 365
                                         +iq ch+r+++a+GGmaa+iPik+dp+ane a++kv adkerea+nGhdGtwvahPdlv++a++vf+e + ++
NCBI__GCF_002017945.1:WP_078210938.1 317 VIQRCHKRNIHAMGGMAAQIPIKNDPEANEIAFKKVIADKEREARNGHDGTWVAHPDLVPIAMKVFNENMMTK 389
                                         ************************************************************************* PP

TIGR01344                            366 nqldrvrledvsitaaellevkdasrteeGlrenirvglryieawlrGsGavpiynlmedaataeisraqlwq 438
                                         n ++ +r +d++it+a+ll+++ ++ teeG+r+n++v++ yi +wl+G+Ga +i++lmedaataeisr+qlwq
NCBI__GCF_002017945.1:WP_078210938.1 390 NHIHIKR-ADLHITEADLLQIPVGTITEEGIRKNVNVAVLYITSWLNGQGAAAIHHLMEDAATAEISRSQLWQ 461
                                         *****88.***************************************************************** PP

TIGR01344                            439 wikhGvvledGekvtselvrdllkeeleklkkesgkeeyakarleeaaellerlvlseeledfltlpaydel 510
                                         w+++ v+l+ Gek+t++l ++ ee ek++k++g+ + ++++ a++ll +lv+++++ +flt+p+y++l
NCBI__GCF_002017945.1:WP_078210938.1 462 WLQNEVTLDSGEKLTTKLYHRIALEEYEKIRKQVGDRAHEEENYILAEKLLDELVVNKKFVEFLTIPGYKYL 533
                                         *********************************************************************986 PP

Internal pipeline statistics summary:
-------------------------------------
Query model(s): 1 (511 nodes)
Target sequences: 1 (533 residues searched)
Passed MSV filter: 1 (1); expected 0.0 (0.02)
Passed bias filter: 1 (1); expected 0.0 (0.02)
Passed Vit filter: 1 (1); expected 0.0 (0.001)
Passed Fwd filter: 1 (1); expected 0.0 (1e-05)
Initial search space (Z): 1 [actual number of targets]
Domain search space (domZ): 1 [number of targets reported over threshold]
# CPU time: 0.01u 0.01s 00:00:00.02 Elapsed: 00:00:00.01
# Mc/sec: 21.89
//
[ok]
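The domain row in the hmmsearch output above gives the aligned span on both the model (hmmfrom, hmm to) and the candidate (alifrom, ali to). As a minimal sketch, assuming coverage means aligned span divided by total length, the Python below parses that row and computes how much of the 511-node model and the 533-residue candidate the alignment covers. The function `parse_domain_row` and its field names are hypothetical helpers written for this example; they are not part of HMMER or GapMind.

```python
def parse_domain_row(row):
    """Parse one whitespace-separated row of the hmmsearch domain table.

    Assumes the column order shown in the report:
    # ! score bias c-Evalue i-Evalue hmmfrom hmmto .. alifrom alito .. envfrom envto .. acc
    """
    fields = row.split()
    return {
        "score": float(fields[2]),
        "i_evalue": float(fields[5]),
        "hmm_from": int(fields[6]),
        "hmm_to": int(fields[7]),
        "ali_from": int(fields[9]),
        "ali_to": int(fields[10]),
        "acc": float(fields[15]),
    }


# Row and lengths copied from the report above.
row = "1 ! 769.3 2.7 6.8e-236 6.8e-236 1 510 [. 25 533 .] 25 533 .] 1.00"
model_len = 511    # [M=511]
target_len = 533   # residues searched

dom = parse_domain_row(row)
model_cov = (dom["hmm_to"] - dom["hmm_from"] + 1) / model_len
target_cov = (dom["ali_to"] - dom["ali_from"] + 1) / target_len

print(f"model coverage {model_cov:.3f}, candidate coverage {target_cov:.3f}")
```

Splitting on whitespace works here because the row has no missing fields; real hmmsearch output is usually easier to parse from the tabular `--domtblout` file.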
This GapMind analysis is from Sep 24 2021. The underlying query database was built on Sep 17 2021.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other blast hits with at least 50% coverage are "low confidence."
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory