Align 3-dehydroquinate synthase (EC 4.2.3.4) (characterized)
to candidate WP_025274781.1 HALAL_RS0114995 3-dehydroquinate synthase
Query= BRENDA::P9WPX9 (362 letters)

>NCBI__GCF_000527155.1:WP_025274781.1
          Length = 519

 Score = 345 bits (884), Expect = 2e-99
 Identities = 186/343 (54%), Positives = 229/343 (66%), Gaps = 3/343 (0%)

Query: 17  PYPVVIGTGLLDELEDLLADRHKVAVVHQPGLAETAEEIRKRLAGKGVDAHRIEIPDAEA 76
           PY V IG G    ++++L D  K+A ++ P +     +I   L  +G     +EIPDAEA
Sbjct: 178 PYSVTIGPGTATLVDEVLPDAEKIAFIYSPTVEPQVRDILAALP-QGRRIVTLEIPDAEA 236

Query: 77  GKDLPVVGFIWEVLGRIGIGRKDALVSLGGGAATDVAGFAAATWLRGVSIVHLPTTLLGM 136
           GK+L VVG  WE LG  G  R DA+VS+GGGA +DVAGF AATWLRGV +VHLPT+LLG
Sbjct: 237 GKELDVVGQCWEALGEEGFTRTDAVVSVGGGAVSDVAGFVAATWLRGVDVVHLPTSLLGA 296

Query: 137 VDAAVGGKTGINTDAGKNLVGAFHQPLAVLVDLATLQTLPRDEMICGMAEVVKAGFIADP 196
           VDAAVGGKTGINT AGKNLVGAFH P AVL D   L+TLPR+EM  G+AE++KAGFI D
Sbjct: 297 VDAAVGGKTGINTAAGKNLVGAFHPPKAVLCDTRMLRTLPREEMSNGLAEIIKAGFICDT 356

Query: 197 VILDLIEADPQAALDPAGDVLPELIRRAITVKAEVVAADEKESELREILNYGHTLGHAIE 256
            ILDLIE DP  ALDP G ++PEL+RRAI VKA++V  D  E   R  LN+GHTL HAIE
Sbjct: 357 TILDLIEHDPAGALDPNGSLVPELMRRAIAVKADIVGEDLTEQGKRVWLNFGHTLAHAIE 416

Query: 257 RRERYRWRHGAAVSVGLVFAAELARLAGRLDDATAQRHRTILSSLGLPVSYDPDALPQLL 316
           ++ERYR RHG AV++G+V+AAEL  L  R++    +R R IL S+GLP SY       L
Sbjct: 417 KQERYRLRHGFAVAIGMVYAAELGALTERVN--VTRRLRRILESVGLPTSYSGSDWDSLR 474

Query: 317 EIMAGDKKTRAGVLRFVVLDGLAKPGRMVGPDPGLLVTAYAGV 359
             M  DKK R    RFV+LD +A+P  +      L   A+  V
Sbjct: 475 HAMVVDKKNRGSTQRFVLLDDIAQPAAVSDVPASLQAEAFGKV 517

Lambda     K      H
   0.320    0.139    0.406

Gapped
Lambda     K      H
   0.267   0.0410    0.140

Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 557
Number of extensions: 28
Number of successful extensions: 3
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 362
Length of database: 519
Length adjustment: 32
Effective length of query: 330
Effective length of database: 487
Effective search space: 160710
Effective search space used: 160710
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.8 bits)
S2: 51 (24.3 bits)
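The percentages and the effective search space in the BLAST summary above can be recomputed from the raw counts as a sanity check. The following is a minimal sketch in Python; the variable names are illustrative, and defining query coverage as the aligned query span divided by the query length is an assumption rather than a documented GapMind formula.

```python
# Counts copied from the BLAST report above.
identities, positives, gaps, aln_len = 186, 229, 3, 343
query_len, subject_len = 362, 519
query_start, query_end = 17, 359
length_adjustment = 32


def pct(numerator, denominator):
    """Return numerator/denominator as a percentage."""
    return 100.0 * numerator / denominator


identity_pct = pct(identities, aln_len)
positive_pct = pct(positives, aln_len)
gap_pct = pct(gaps, aln_len)

# Assumed definition: aligned query span over full query length.
query_coverage_pct = pct(query_end - query_start + 1, query_len)

# Effective lengths are the raw lengths minus the length adjustment;
# their product is the effective search space.
effective_search_space = (query_len - length_adjustment) * (
    subject_len - length_adjustment
)

print(f"identity: {identity_pct:.1f}%  positives: {positive_pct:.1f}%")
print(f"query coverage (assumed definition): {query_coverage_pct:.1f}%")
print(f"effective search space: {effective_search_space}")
```

Truncating the computed percentages to whole numbers reproduces the values BLAST prints (54%, 66%, 0%), and the product of the effective lengths matches the reported search space.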
Align candidate WP_025274781.1 HALAL_RS0114995 (3-dehydroquinate synthase)
to HMM TIGR01357 (aroB: 3-dehydroquinate synthase (EC 4.2.3.4))
# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file: ../tmp/path.aa/TIGR01357.hmm
# target sequence database: /tmp/gapView.3719312.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR01357  [M=344]
Accession:   TIGR01357
Description: aroB: 3-dehydroquinate synthase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                              Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                              -----------
   3.6e-107  344.5   0.0   4.2e-105  337.7   0.0    2.0  2  NCBI__GCF_000527155.1:WP_025274781.1

Domain annotation for each sequence (and alignments):
>> NCBI__GCF_000527155.1:WP_025274781.1
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !    4.9   0.0   0.00068   0.00068      47      95 ..      41      84 ..      30      85 .. 0.81
   2 !  337.7   0.0  4.2e-105  4.2e-105       1     340 [.     179     511 ..     179     515 .. 0.94

  Alignments for each domain:
  == domain 1  score: 4.9 bits;  conditional E-value: 0.00068
                             TIGR01357  47 lgvevlvlvvpdgeesKsletvaklldqlleeklerksvlvaiGGGvvg 95
                                           +g++v + v++ge +++ +l+++ +++ + ++ ++++GGG+v+
  NCBI__GCF_000527155.1:WP_025274781.1  41 AGTSVSDIFVNHGE-----DHFRELERKAVAAAIAEHDGVISLGGGAVS 84
                                           56666666666666.....57999***********************95 PP

  == domain 2  score: 337.7 bits;  conditional E-value: 4.2e-105
                             TIGR01357   1 ykvkvgegllkklveelaekasklvvitdeeveklvaekleealkslgvevlvlvvpdgeesKsletvaklld 73
                                           y+v++g g+ + + e l a+k+ +i + +ve +v + l+++ + g ++ +l +pd+e K+l++v ++++
  NCBI__GCF_000527155.1:WP_025274781.1 179 YSVTIGPGTATLVDEVLP-DAEKIAFIYSPTVEPQVRDILAALPQ--GRRIVTLEIPDAEAGKELDVVGQCWE 248
                                           678899888875555555.689**************877777764..8************************* PP

                             TIGR01357  74 qlleeklerksvlvaiGGGvvgDlaGFvAatylRGirlvqvPTtllamvDssvGGKtginlplgkNliGafyq 146
                                           +l ee+++r +++v++GGG+v+D+aGFvAat+lRG+++v++PT+ll++vD++vGGKtgin++ gkNl+Gaf+
  NCBI__GCF_000527155.1:WP_025274781.1 249 ALGEEGFTRTDAVVSVGGGAVSDVAGFVAATWLRGVDVVHLPTSLLGAVDAAVGGKTGINTAAGKNLVGAFHP 321
                                           ************************************************************************* PP

                             TIGR01357 147 PkaVlidlkvletlperelreGmaEviKhgliadaelfeelekneklllklaelealeelikrsievKaevVe 219
                                           PkaVl+d+++l+tlp++e+++G+aE+iK g+i d +++ +e+ + l+ + + + el++r+i vKa++V
  NCBI__GCF_000527155.1:WP_025274781.1 322 PKAVLCDTRMLRTLPREEMSNGLAEIIKAGFICDTTILDLIEHDPAGALD-PNGSLVPELMRRAIAVKADIVG 393
                                           *******************************************9987776.578******************* PP

                             TIGR01357 220 eDekesglRalLNfGHtlgHaiEallkyklsHGeaVaiGmvveaklseklgllkaellerlvallkklglptk 292
                                           eD +e+g R LNfGHtl+HaiE+ y+l+HG aVaiGmv++a+l + ++ +++rl+++l+++glpt+
  NCBI__GCF_000527155.1:WP_025274781.1 394 EDLTEQGKRVWLNFGHTLAHAIEKQERYRLRHGFAVAIGMVYAAELGALTERVN--VTRRLRRILESVGLPTS 464
                                           ***********************************************9999988..9**************** PP

                             TIGR01357 293 lkkklsveellkallkDKKnegskiklvlleeiGkaalasevteeell 340
                                           ++ + ++l +a+ DKKn+gs+ ++vll++i+++a s+v+++ +
  NCBI__GCF_000527155.1:WP_025274781.1 465 YSG-SDWDSLRHAMVVDKKNRGSTQRFVLLDDIAQPAAVSDVPASLQA 511
                                           **7.********************************999998876655 PP

Internal pipeline statistics summary:
-------------------------------------
Query model(s): 1 (344 nodes)
Target sequences: 1 (519 residues searched)
Passed MSV filter: 1 (1); expected 0.0 (0.02)
Passed bias filter: 1 (1); expected 0.0 (0.02)
Passed Vit filter: 1 (1); expected 0.0 (0.001)
Passed Fwd filter: 1 (1); expected 0.0 (1e-05)
Initial search space (Z): 1 [actual number of targets]
Domain search space (domZ): 1 [number of targets reported over threshold]
# CPU time: 0.00u 0.00s 00:00:00.00 Elapsed: 00:00:00.00
# Mc/sec: 18.06
//
[ok]
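To relate the per-domain table above to the idea of model coverage, the fraction of the HMM spanned by each domain can be computed from the `hmmfrom` and `hmm to` columns. This is a minimal sketch in Python; the `Domain` class and `model_coverage` function are hypothetical names, and computing coverage per domain (rather than over all domains combined) is an assumption, since the report does not say how GapMind derives it.

```python
from dataclasses import dataclass


@dataclass
class Domain:
    """One row of the hmmsearch per-domain table (subset of columns)."""
    score: float
    i_evalue: float
    hmm_from: int
    hmm_to: int
    ali_from: int
    ali_to: int


MODEL_LEN = 344  # from "Query: TIGR01357 [M=344]"

# Rows copied from the domain table above.
domains = [
    Domain(4.9, 0.00068, 47, 95, 41, 84),
    Domain(337.7, 4.2e-105, 1, 340, 179, 511),
]


def model_coverage(domain, model_len):
    """Fraction of HMM positions spanned by this domain's alignment."""
    return (domain.hmm_to - domain.hmm_from + 1) / model_len


best = max(domains, key=lambda d: d.score)
best_cov = model_coverage(best, MODEL_LEN)

for d in domains:
    print(f"score {d.score:6.1f}  model coverage {model_coverage(d, MODEL_LEN):.3f}")
```

Under this definition the high-scoring domain spans nearly the whole model, while the weak first domain covers only a small stretch of it.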
This GapMind analysis is from Jul 25 2024. The underlying query database was built on Jul 25 2024.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other blast hits with at least 50% coverage are "low confidence."
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
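The gap rule stated above (a step with no high- or medium-confidence candidate may be considered a gap) can be expressed in a few lines. This is an illustrative Python sketch, not GapMind's own code; `find_gaps` is a hypothetical helper, and the step names `stepX` and `stepY` are made up for the example.

```python
def find_gaps(candidates_by_step):
    """Return steps that have no high- or medium-confidence candidate.

    candidates_by_step maps a step name to the list of confidence labels
    ("high", "medium", "low") of its candidates.
    """
    return [
        step
        for step, levels in candidates_by_step.items()
        if not any(level in ("high", "medium") for level in levels)
    ]


# Hypothetical input: aroB has a high-confidence candidate, stepX has only
# a low-confidence candidate, and stepY has no candidates at all.
example = {"aroB": ["high"], "stepX": ["low"], "stepY": []}
gaps = find_gaps(example)
print(gaps)
```

A step with only low-confidence candidates is treated the same as a step with no candidates, which follows the wording of the rule.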
For more information, see:
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory