Align 3-deoxy-7-phosphoheptulonate synthase (EC 2.5.1.54) (characterized)
to candidate GFF3314 HP15_3256 phospho-2-dehydro-3-heoxyheptonate aldolase
Query= BRENDA::P00888 (356 letters)

>FitnessBrowser__Marino:GFF3314
Length = 350

Score = 339 bits (870), Expect = 6e-98
Identities = 170/332 (51%), Positives = 223/332 (67%), Gaps = 4/332 (1%)

Query: 13  DEQVLMTPEQLKAAFPLSLQQEAQIADSRKSISDIIAGRDPRLLVVCGPCSIHDPETALE 72
           +E VL TP +LK   P S     Q+ + R++I  I+ G D R L+V GPCSIHD   ALE
Sbjct: 22  EEVVLPTPAELKLQMPASDDIVRQVDEHRQAIRRILQGSDTRTLIVMGPCSIHDEVAALE 81

Query: 73  YARRFKALAAEVSDSLYLVMRVYFEKPRTTVGWKGLINDPHMDGSFDVEAGLQIARKLLL 132
           Y  + KALA EVSD   +VMR Y EKPRTTVGWKGL+ DP   G+ D+  GL+ +R+LLL
Sbjct: 82  YGEKLKALADEVSDRFLIVMRAYLEKPRTTVGWKGLLYDPERTGAGDLHEGLRRSRRLLL 141

Query: 133 ELVNMGLPLATEALDPNSPQYLGDLFSWSAIGARTTESQTHREMASGLSMPVGFKNGTDG 192
            L  MGLPLATEAL P +  YLGDL SW+AIGARTTESQ HRE+ SGL MP GFKNGTDG
Sbjct: 142 NLAAMGLPLATEALSPFAMDYLGDLVSWTAIGARTTESQVHREIVSGLPMPTGFKNGTDG 201

Query: 193 SLATAINAMRAAAQPHRFVGINQAGQVALLQTQGNPDGHVILRGGKA-PNYSPADVAQCE 251
            +A A NAM++A+ PH  +G++  G   ++ T+GNPD H++LRGG+   NY  A + Q
Sbjct: 202 GIAVATNAMKSASHPHHHLGVSATGAPVMITTRGNPDTHLVLRGGRGITNYDAASIEQAV 261

Query: 252 KEMEQAGLRPSLMVDCSHGNSNKDYRRQPAVAESVVAQIKDGNRSIIGLMIESNIHEGNQ 311
             + +AGL  ++MVDCSH N+ K   RQ  +A+ V+AQ + GN  I GLM+ES +  G Q
Sbjct: 262 GALAEAGLSTAVMVDCSHDNACKQSERQLDIAQDVMAQRRAGNHHIRGLMLESFLEPGRQ 321

Query: 312 SSEQPRSEMKYGVSVTDACISWEMTDALLREI 343
              +   +++YG S+TD C+ W  T+AL+R +
Sbjct: 322 DDGE---DLRYGCSITDPCLGWAQTEALIRSL 350

Lambda K H
0.316 0.131 0.378

Gapped
Lambda K H
0.267 0.0410 0.140

Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 381
Number of extensions: 13
Number of successful extensions: 3
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 356
Length of database: 350
Length adjustment: 29
Effective length of query: 327
Effective length of database: 321
Effective search space: 104967
Effective search space used: 104967
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.6 bits)
S2: 49 (23.5 bits)
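The statistics in this report can be cross-checked by hand. The sketch below applies the standard Karlin-Altschul relations (bit score = (lambda * S - ln K) / ln 2, and E = K * m * n * exp(-lambda * S)) to the gapped lambda and K, the raw score of 870, and the effective lengths printed above. The function and constant names are illustrative, not part of BLAST or GapMind, and because the report rounds lambda and K, the results only approximately match the printed 339 bits and 6e-98.

```python
import math

# Values copied from the BLAST report above (gapped statistics).
LAMBDA_GAPPED = 0.267
K_GAPPED = 0.0410
RAW_SCORE = 870
EFFECTIVE_QUERY_LENGTH = 327
EFFECTIVE_DB_LENGTH = 321


def bit_score(raw_score: float, lam: float, k: float) -> float:
    """Normalize a raw alignment score to bits."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def expect_value(raw_score: float, lam: float, k: float, search_space: int) -> float:
    """Expected number of chance hits scoring at least raw_score."""
    return k * search_space * math.exp(-lam * raw_score)


# Effective search space is the product of the two effective lengths.
search_space = EFFECTIVE_QUERY_LENGTH * EFFECTIVE_DB_LENGTH
bits = bit_score(RAW_SCORE, LAMBDA_GAPPED, K_GAPPED)
evalue = expect_value(RAW_SCORE, LAMBDA_GAPPED, K_GAPPED, search_space)

# Percentages are taken over the 332 alignment columns.
percent_identity = 100 * 170 / 332
percent_positives = 100 * 223 / 332

print(f"search space = {search_space}")
print(f"bit score    = {bits:.1f}")
print(f"E-value      = {evalue:.1e}")
print(f"identity     = {percent_identity:.0f}%, positives = {percent_positives:.0f}%")
```

The product of the effective lengths reproduces the "Effective search space" line exactly, which is a quick way to confirm that the length adjustment of 29 was applied to both sequences.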
Align candidate GFF3314 HP15_3256 (phospho-2-dehydro-3-heoxyheptonate aldolase)
to HMM TIGR00034 (3-deoxy-7-phosphoheptulonate synthase (EC 2.5.1.54))
# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file: ../tmp/path.aa/TIGR00034.hmm
# target sequence database: /tmp/gapView.7679.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query: TIGR00034 [M=342]
Accession: TIGR00034
Description: aroFGH: 3-deoxy-7-phosphoheptulonate synthase

Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                           Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                           -----------
   9.1e-132  424.9   0.0   1.1e-131  424.6   0.0    1.0  1  lcl|FitnessBrowser__Marino:GFF3314 HP15_3256 phospho-2-dehydro-3-he

Domain annotation for each sequence (and alignments):
>> lcl|FitnessBrowser__Marino:GFF3314 HP15_3256 phospho-2-dehydro-3-heoxyheptonate aldolase
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  424.6   0.0  1.1e-131  1.1e-131      11     334 ..      26     350 .]      18     350 .] 0.98

Alignments for each domain:
== domain 1 score: 424.6 bits; conditional E-value: 1.1e-131

TIGR00034 11 lltPeelkakfpltekaaekvaksrkeiadilaGkddrllvviGPcsihdpeaaleyakrlkklaeklkddleiv 85
l tP+elk ++p++ + +v + r++i +il+G+d+r l+v+GPcsihd aaley ++lk+la++++d++ iv
lcl|FitnessBrowser__Marino:GFF3314 26 LPTPAELKLQMPASDDIVRQVDEHRQAIRRILQGSDTRTLIVMGPCSIHDEVAALEYGEKLKALADEVSDRFLIV 100
459************************************************************************ PP

TIGR00034 86 MrvyfekPrttvGWkGlindPdlnesfdvnkGlriarkllldlvelglplatelldtispqyladllswgaiGar 160
mr+y+ekPrttvGWkGl+ dP+ ++ d+++Glr +r+lll+l+ +glplate+l + + yl+dl+sw+aiGar
lcl|FitnessBrowser__Marino:GFF3314 101 MRAYLEKPRTTVGWKGLLYDPERTGAGDLHEGLRRSRRLLLNLAAMGLPLATEALSPFAMDYLGDLVSWTAIGAR 175
*************************************************************************** PP

TIGR00034 161 ttesqvhrelasglslpvgfkngtdGslkvaidairaaaaehlflsvtkaGqvaivetkGnedghiilrGGkk.p 234
ttesqvhre+ sgl +p gfkngtdG+++va +a+++a+++h+ l+v+ +G+ +++t+Gn+d+h++lrGG+ +
lcl|FitnessBrowser__Marino:GFF3314 176 TTESQVHREIVSGLPMPTGFKNGTDGGIAVATNAMKSASHPHHHLGVSATGAPVMITTRGNPDTHLVLRGGRGiT 250
************************************************************************99* PP

TIGR00034 235 nydaedvaevkeelekaglkeelmidfshgnsnkdykrqlevaesvveqiaeGekaiiGvmiesnleeGnqslke 309
nyda++++++ l +agl+ +m+d+sh n+ k+ +rql++a++v++q + G++ i G+m+es+le G+q+ +e
lcl|FitnessBrowser__Marino:GFF3314 251 NYDAASIEQAVGALAEAGLSTAVMVDCSHDNACKQSERQLDIAQDVMAQRRAGNHHIRGLMLESFLEPGRQDDGE 325
*************************************************************************** PP

TIGR00034 310 elkyGksvtdacigwedteallrkl 334
+l yG+s+td c+gw +teal+r l
lcl|FitnessBrowser__Marino:GFF3314 326 DLRYGCSITDPCLGWAQTEALIRSL 350
**********************976 PP

Internal pipeline statistics summary:
-------------------------------------
Query model(s): 1 (342 nodes)
Target sequences: 1 (350 residues searched)
Passed MSV filter: 1 (1); expected 0.0 (0.02)
Passed bias filter: 1 (1); expected 0.0 (0.02)
Passed Vit filter: 1 (1); expected 0.0 (0.001)
Passed Fwd filter: 1 (1); expected 0.0 (1e-05)
Initial search space (Z): 1 [actual number of targets]
Domain search space (domZ): 1 [number of targets reported over threshold]
# CPU time: 0.02u 0.01s 00:00:00.03 Elapsed: 00:00:00.01
# Mc/sec: 7.72
//
[ok]
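To read the domain hit programmatically, a minimal sketch such as the following splits the single domain-table row from the hmmsearch output into named fields and computes how much of the 342-node model the alignment spans. The `DomainHit` class and both helper functions are hypothetical names chosen for this illustration and are not part of HMMER or GapMind. The field order assumed here is the one shown in the table header (#, flag, score, bias, c-Evalue, i-Evalue, hmmfrom, hmm to, bounds, alifrom, ali to, bounds, envfrom, env to, bounds, acc).

```python
from dataclasses import dataclass


@dataclass
class DomainHit:
    """Selected fields from one row of the hmmsearch domain table."""
    score: float
    i_evalue: float
    hmm_from: int
    hmm_to: int
    ali_from: int
    ali_to: int
    acc: float


def parse_domain_row(row: str) -> DomainHit:
    """Parse a whitespace-separated domain-table row (16 fields assumed)."""
    fields = row.split()
    return DomainHit(
        score=float(fields[2]),
        i_evalue=float(fields[5]),
        hmm_from=int(fields[6]),
        hmm_to=int(fields[7]),
        ali_from=int(fields[9]),
        ali_to=int(fields[10]),
        acc=float(fields[15]),
    )


def model_coverage(hit: DomainHit, model_length: int) -> float:
    """Fraction of the HMM's match states spanned by the alignment."""
    return (hit.hmm_to - hit.hmm_from + 1) / model_length


# The row reported for GFF3314 against TIGR00034 (M=342).
ROW = "1 ! 424.6 0.0 1.1e-131 1.1e-131 11 334 .. 26 350 .] 18 350 .] 0.98"
hit = parse_domain_row(ROW)
coverage = model_coverage(hit, model_length=342)
print(f"model positions {hit.hmm_from}-{hit.hmm_to}, coverage {coverage:.1%}")
```

For this hit the alignment spans model positions 11 to 334, that is 324 of 342 nodes.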
This GapMind analysis is from Apr 09 2024. The underlying query database was built on Apr 09 2024.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other blast hits with at least 50% coverage are "low confidence."
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
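For readers unfamiliar with the term, a six-frame translation translates a DNA sequence in all three reading frames on each strand, so that a protein can be found even where no gene was predicted. The following is a minimal sketch of the idea, assuming the standard codon assignments and an input containing only A, C, G, and T. It is not the code that Curated BLAST or GapMind uses, and all function names are illustrative.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna: str) -> str:
    """Translate complete codons from position 0; unknown codons become X."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translation(dna: str) -> dict:
    """Translate frames +1..+3 (forward) and -1..-3 (reverse strand)."""
    dna = dna.upper()
    reverse = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna[offset:])
        frames[f"-{offset + 1}"] = translate(reverse[offset:])
    return frames


# Stop codons appear as "*"; a real search would split frames at stops.
frames = six_frame_translation("ATGGCC")
for name in ("+1", "+2", "+3", "-1", "-2", "-3"):
    print(name, frames[name])
```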
For more information, see:
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory