Align Dihydroxy-acid dehydratase; DAD; EC 4.2.1.9 (characterized)
to candidate WP_004042767.1 C498_RS08250 dihydroxy-acid dehydratase
Query= SwissProt::P9WKJ5 (575 letters)

>NCBI__GCF_000337315.1:WP_004042767.1 Length = 584

Score = 561 bits (1446), Expect = e-164
Identities = 295/575 (51%), Positives = 391/575 (68%), Gaps = 8/575 (1%)

Query: 2 PQTTDEAASVSTVADIKPRSRDVTDGLEKAAARGMLRAVGMDDEDFAKPQIGVASSWNEI 61
P+ D S+ D RS +VT+G +KA R M RA+G DDEDF+ P +GV + +I
Sbjct: 6 PREEDPDDVFSSGKDPNLRSTEVTEGPDKAPHRAMFRAMGFDDEDFSSPMVGVPNPAADI 65

Query: 62 TPCNLSLDRLANAVKEGVFSAGGYPLEFGTISVSDGISMGHEGMHFSLVSREVIADSVEV 121
TPCN+ LD +A+A EG+ +AGG P+EFGT+++SD ISMG EGM SL+SREVIADSVE+
Sbjct: 66 TPCNVHLDDVADAAIEGIDAAGGMPIEFGTVTISDAISMGTEGMKASLISREVIADSVEL 125

Query: 122 VMQAERLDGSVLLAGCDKSLPGMLMAAARLDLAAVFLYAGSILPGRAKLSDGSERDVTII 181
V ER+D V +AGCDK+LPGM+MAA R DL +VFLY GSI+PG+ + R+VT+
Sbjct: 126 VSFGERMDALVTVAGCDKNLPGMMMAAIRTDLPSVFLYGGSIMPGQHE-----GREVTVQ 180

Query: 182 DAFEAVGACSRGLMSRADVDAIERAICPGEGACGGMYTANTMASAAEALGMSLPGSAAPP 241
+ FE VG + G MS ++D +ER CPG G+CGGM+TANTMAS +EALGM+ GSA+ P
Sbjct: 181 NVFEGVGTYAEGDMSADELDDLERHACPGAGSCGGMFTANTMASISEALGMAPLGSASAP 240

Query: 242 ATDRRRDGFARRSGQAVVELLRRGITARDILTKEAFENAIAVVMAFGGSTNAVLHLLAIA 301
A R ARR+G+ V++ + DILTK++FENAI + +A GGSTNAVLHLLA+A
Sbjct: 241 AESDERYENARRAGEVVLDCVENDRRPSDILTKKSFENAITLQVATGGSTNAVLHLLALA 300

Query: 302 HEANVALSLQDFSRIGSGVPHLADVKPFGRHVMSDVDHIGGVPVVMKALLDAGLLHGDCL 361
EA V L +++F+ I P +A+++P G VM+D+ IGG+PVV++ L++AGL HGD +
Sbjct: 301 AEAGVDLDIEEFNEISRRTPKIANLQPGGTRVMNDLHEIGGIPVVIRRLVEAGLFHGDAM 360

Query: 362 TVTGHTMAENLAAITPPDPDG---KVLRALANPIHPSGGITILHGSLAPEGAVVKTAGFD 418
TVTG T+AE L + PD DG L + P G I IL G+LAP+GAV+K G D
Sbjct: 361 TVTGRTIAEELDHLDLPDDDGLEADFLYTVDEPYQDEGAIKILTGNLAPDGAVLKVTGDD 420

Query: 419 SDVFEGTARVFDGERAALDALEDGTITVGDAVVIRYEGPKGGPGMREMLAITGAIKGAGL 478
+ G ARVF+ E A+ +++G I GD + IR EGP+GGPGMREML +T A+ G G
Sbjct: 421 AFHHTGPARVFENEEDAMRYVQEGHIEEGDVIAIRNEGPRGGPGMREMLGVTAAVVGQGH 480

Query: 479 GKDVLLLTDGRFSGGTTGLCVGHIAPEAVDGGPIALLRNGDRIRLDVAGRVLDVLADPAE 538
DV LLTDGRFSG T G VGH+APEA +GGPI LL +GD + +D+ R L V E
Sbjct: 481 EDDVALLTDGRFSGATRGPMVGHVAPEAAEGGPIGLLEDGDEVTVDIPNRELSVDLSDEE 540

Query: 539 FASRQQDFSPPPPRYTTGVLSKYVKLVSSAAVGAV 573
+R++D+ P PP YT+GVL+KY + SAA GAV
Sbjct: 541 LEARKEDWEPKPPAYTSGVLAKYARDFGSAANGAV 575

Lambda K H
0.318 0.136 0.393

Gapped
Lambda K H
0.267 0.0410 0.140

Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 1066
Number of extensions: 54
Number of successful extensions: 5
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 575
Length of database: 584
Length adjustment: 36
Effective length of query: 539
Effective length of database: 548
Effective search space: 295372
Effective search space used: 295372
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 53 (25.0 bits)
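As a rough check on the summary line of the BLAST report above, the following Python sketch recomputes percent identity and alignment coverage from the reported numbers. The helper names `parse_fractions` and `coverage` are illustrative only, and treating coverage as the aligned span divided by the full sequence length is an assumption, not necessarily how GapMind measures it.

```python
import re

# Summary line copied from the BLAST report above.
summary = "Identities = 295/575 (51%), Positives = 391/575 (68%), Gaps = 8/575 (1%)"


def parse_fractions(line):
    """Return {label: (numerator, denominator)} for each 'Label = a/b' field."""
    return {
        label.lower(): (int(num), int(den))
        for label, num, den in re.findall(r"(\w+) = (\d+)/(\d+)", line)
    }


def coverage(start, end, length):
    """Fraction of a sequence spanned by the alignment (1-based, inclusive).

    Assumption: coverage = aligned span / full sequence length.
    """
    return (end - start + 1) / length


fractions = parse_fractions(summary)
identity = fractions["identities"][0] / fractions["identities"][1]
query_cov = coverage(2, 573, 575)      # query residues 2..573 of 575
candidate_cov = coverage(6, 575, 584)  # candidate residues 6..575 of 584

print(f"identity {identity:.1%}, query coverage {query_cov:.1%}, "
      f"candidate coverage {candidate_cov:.1%}")
```

The start and end positions are read off the first and last `Query:` and `Sbjct:` lines of the alignment; the lengths come from the `Query=` and `Length =` header fields.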
Align candidate WP_004042767.1 C498_RS08250 (dihydroxy-acid dehydratase)
to HMM TIGR00110 (ilvD: dihydroxy-acid dehydratase (EC 4.2.1.9))
# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file: ../tmp/path.aa/TIGR00110.hmm
# target sequence database: /tmp/gapView.367653.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query: TIGR00110 [M=543]
Accession: TIGR00110
Description: ilvD: dihydroxy-acid dehydratase

Scores for complete sequences (score includes all domains):
--- full sequence --- --- best 1 domain --- -#dom-
E-value score bias E-value score bias exp N Sequence Description
------- ------ ----- ------- ------ ----- ---- -- -------- -----------
4.2e-217 707.9 2.4 4.9e-217 707.7 2.4 1.0 1 NCBI__GCF_000337315.1:WP_004042767.1

Domain annotation for each sequence (and alignments):
>> NCBI__GCF_000337315.1:WP_004042767.1
# score bias c-Evalue i-Evalue hmmfrom hmm to alifrom ali to envfrom env to acc
--- ------ ----- --------- --------- ------- ------- ------- ------- ------- ------- ----
1 ! 707.7 2.4 4.9e-217 4.9e-217 1 541 [. 36 576 .. 36 578 .. 0.99

Alignments for each domain:
== domain 1 score: 707.7 bits; conditional E-value: 4.9e-217

TIGR00110 1 aarallkatGlkdedlekPiiavvnsyteivPghvhlkdlaklvkeeieaaGgvakefntiavsDGiamgheG 73
++ra+++a+G+ ded++ P+++v n +i+P++vhl+d+a+++ e+i+aaGg++ ef+t+++sD i+mg+eG
NCBI__GCF_000337315.1:WP_004042767.1 36 PHRAMFRAMGFDDEDFSSPMVGVPNPAADITPCNVHLDDVADAAIEGIDAAGGMPIEFGTVTISDAISMGTEG 108
69*********************************************************************** PP

TIGR00110 74 mkysLpsreiiaDsvetvvkahalDalvvissCDkivPGmlmaalrlniPaivvsGGpmeagktklsekidlv 146
mk sL sre+iaDsve v ++++Dalv+++ CDk++PGm+maa+r+++P+++ +GG++++g+ + ++++++
NCBI__GCF_000337315.1:WP_004042767.1 109 MKASLISREVIADSVELVSFGERMDALVTVAGCDKNLPGMMMAAIRTDLPSVFLYGGSIMPGQHE-GREVTVQ 180
*****************************************************************.9****** PP

TIGR00110 147 dvfeavgeyaagklseeeleeiersacPtagsCsGlftansmacltealGlslPgsstllatsaekkelakks 219
+vfe+vg+ya+g++s +el+++er+acP+agsC+G+ftan+ma+++ealG++ gs++++a s e+ e a+++
NCBI__GCF_000337315.1:WP_004042767.1 181 NVFEGVGTYAEGDMSADELDDLERHACPGAGSCGGMFTANTMASISEALGMAPLGSASAPAESDERYENARRA 253
************************************************************************* PP

TIGR00110 220 gkrivelvkknikPrdiltkeafenaitldlalGGstntvLhllaiakeagvklslddfdrlsrkvPllaklk 292
g+ + + v+++ +P+diltk++fenaitl++a+GGstn+vLhlla+a+eagv+l++++f+++sr++P++a+l+
NCBI__GCF_000337315.1:WP_004042767.1 254 GEVVLDCVENDRRPSDILTKKSFENAITLQVATGGSTNAVLHLLALAAEAGVDLDIEEFNEISRRTPKIANLQ 326
************************************************************************* PP

TIGR00110 293 PsgkkviedlhraGGvsavlkeldkegllhkdaltvtGktlaetlekvkvlr...vdqdvirsldnpvkkegg 362
P+g +v++dlh+ GG++ v++ l ++gl+h da+tvtG+t+ae+l++ ++ ++ d + ++d+p+++eg
NCBI__GCF_000337315.1:WP_004042767.1 327 PGGTRVMNDLHEIGGIPVVIRRLVEAGLFHGDAMTVTGRTIAEELDHLDLPDddgLEADFLYTVDEPYQDEGA 399
***********************************************998754446678************** PP

TIGR00110 363 lavLkGnlaeeGavvkiagveedilkfeGpakvfeseeealeailggkvkeGdvvviryeGPkGgPGmremLa 435
+++L+Gnla++Gav k++g + +++Gpa+vfe+ee+a+ + +g+++eGdv+ ir eGP+GgPGmremL
NCBI__GCF_000337315.1:WP_004042767.1 400 IKILTGNLAPDGAVLKVTGDDA--FHHTGPARVFENEEDAMRYVQEGHIEEGDVIAIRNEGPRGGPGMREMLG 470
*******************988..************************************************* PP

TIGR00110 436 PtsalvglGLgkkvaLitDGrfsGgtrGlsiGhvsPeaaegGaialvedGDkikiDienrkldlevseeelae 508
t+a+vg G +++vaL+tDGrfsG+trG+++Ghv+PeaaegG+i+l+edGD++++Di+nr+l +++s+eel++
NCBI__GCF_000337315.1:WP_004042767.1 471 VTAAVVGQGHEDDVALLTDGRFSGATRGPMVGHVAPEAAEGGPIGLLEDGDEVTVDIPNRELSVDLSDEELEA 543
************************************************************************* PP

TIGR00110 509 rrakakkkearevkgaLakyaklvssadkGavl 541
r++++++k + +++g+Lakya+ sa +Gav+
NCBI__GCF_000337315.1:WP_004042767.1 544 RKEDWEPKPPAYTSGVLAKYARDFGSAANGAVT 576
*******************************97 PP

Internal pipeline statistics summary:
-------------------------------------
Query model(s): 1 (543 nodes)
Target sequences: 1 (584 residues searched)
Passed MSV filter: 1 (1); expected 0.0 (0.02)
Passed bias filter: 1 (1); expected 0.0 (0.02)
Passed Vit filter: 1 (1); expected 0.0 (0.001)
Passed Fwd filter: 1 (1); expected 0.0 (1e-05)
Initial search space (Z): 1 [actual number of targets]
Domain search space (domZ): 1 [number of targets reported over threshold]
# CPU time: 0.01u 0.00s 00:00:00.01 Elapsed: 00:00:00.01
# Mc/sec: 24.22
//
[ok]
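To make the domain table in the hmmsearch report above easier to interpret, here is a small Python sketch that pulls the coordinates out of the single domain row and computes how much of the model and of the candidate the alignment spans. The constant names and the definition of coverage (aligned span divided by total length) are assumptions made for illustration; the numeric values are copied from the report.

```python
# Domain row copied from the hmmsearch report above.
row = "1 ! 707.7 2.4 4.9e-217 4.9e-217 1 541 [. 36 576 .. 36 578 .. 0.99"

MODEL_LENGTH = 543  # from "Query: TIGR00110 [M=543]"
SEQ_LENGTH = 584    # from "Target sequences: 1 (584 residues searched)"

# Whitespace-separated columns, in the order given by the table header:
# # score bias c-Evalue i-Evalue hmmfrom hmmto .. alifrom alito .. envfrom envto .. acc
fields = row.split()
score = float(fields[2])
hmm_from, hmm_to = int(fields[6]), int(fields[7])
ali_from, ali_to = int(fields[9]), int(fields[10])

# Assumption: coverage = aligned span / total length (1-based, inclusive).
model_cov = (hmm_to - hmm_from + 1) / MODEL_LENGTH
seq_cov = (ali_to - ali_from + 1) / SEQ_LENGTH

print(f"score {score} bits, model coverage {model_cov:.1%}, "
      f"candidate coverage {seq_cov:.1%}")
```

Splitting on whitespace works here because the row has a fixed number of columns and no embedded spaces; a report with several domains would need one such row per domain.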
This GapMind analysis is from Jul 25 2024. The underlying query database was built on Jul 25 2024.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMER with enzyme models (usually from TIGRFAMs). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other BLAST hits with at least 50% coverage are "low confidence."
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
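The gap rule stated above (a step with no high- or medium-confidence candidate may be considered a gap) can be sketched in a few lines of Python. This is an illustration only, not GapMind's code: the function name `find_gaps`, the input layout, and the step names `stepA` and `stepB` are hypothetical.

```python
def find_gaps(step_candidates):
    """Return the steps that have no high- or medium-confidence candidate.

    step_candidates maps a step name to the list of confidence labels
    ("high", "medium", "low") of its candidates. This layout is assumed
    for illustration.
    """
    return [
        step
        for step, levels in step_candidates.items()
        if not any(level in ("high", "medium") for level in levels)
    ]


# Hypothetical pathway: one step with a high-confidence candidate,
# one with only low-confidence hits, and one with no candidates at all.
example = {"ilvD": ["high"], "stepA": ["low", "low"], "stepB": []}
print(find_gaps(example))  # ['stepA', 'stepB']
```

A step with only low-confidence hits is reported as a gap here, which matches the wording above but says nothing about why the gap exists.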
For more information, see:
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory