Align Dihydroxy-acid dehydratase, mitochondrial; DAD; 2,3-dihydroxy acid hydrolyase; EC 4.2.1.9 (characterized)
to candidate CA265_RS15795 CA265_RS15795 dihydroxy-acid dehydratase
Query= SwissProt::P39522 (585 letters) >FitnessBrowser__Pedo557:CA265_RS15795 Length = 560 Score = 612 bits (1577), Expect = e-179 Identities = 310/564 (54%), Positives = 406/564 (71%), Gaps = 9/564 (1%) Query: 22 KLNKYSYIITEPKGQGASQAMLYATGFKKEDFKKPQVGVGSCWWSGNPCNMHLLDLNNRC 81 +LNKYS T+ Q A+QAMLY G D K QVG+ S + GN CNMHL DL Sbjct: 3 ELNKYSKTFTQDPTQPAAQAMLYGIGLTDADMAKAQVGIASMGYDGNTCNMHLNDLAKDV 62 Query: 82 SQSIEKAGLKAMQFNTIGVSDGISMGTKGMRYSLQSREIIADSFETIMMAQHYDANIAIP 141 + + K L + FNTIGVSDG+S GT GMRYSL SR++IADS ETI Q+YD I+IP Sbjct: 63 KKGVWKNDLVGLVFNTIGVSDGMSNGTDGMRYSLVSRDVIADSIETICGGQYYDGIISIP 122 Query: 142 SCDKNMPGVMMAMGRHNRPSIMVYGGTILPGHPTCGSSKISKNIDIVSAFQSYGEYISKQ 201 CDKNMPG +MAM R +RPSIMVYGGTI PGH + ++IVSAF++ G+ I Sbjct: 123 GCDKNMPGAIMAMARLDRPSIMVYGGTIAPGHYK------GEELNIVSAFEALGQKICGN 176 Query: 202 FTEEEREDVVEHACPGPGSCGGMYTANTMASAAEVLGLTIPNSSSFPAVSKEKLAECDNI 261 +EE+ + +++H CPG G+CGGMYTANTMASA E LG+++P SSS PA+S+EK EC + Sbjct: 177 LSEEDYQGIIKHTCPGAGACGGMYTANTMASAIEALGMSLPYSSSNPAISEEKKQECLDA 236 Query: 262 GEYIKKTMELGILPRDILTKEAFENAITYVVATGGSTNAVLHLVAVAHSAGVKLSPDDFQ 321 G+YIK +E I P DI+T++AFENAI ++ GGSTNAVLH +A+ + G++++ DDFQ Sbjct: 237 GKYIKILLEKDIKPSDIMTRKAFENAIRSIIILGGSTNAVLHFIAMGKAIGIEITQDDFQ 296 Query: 322 RISDTTPLIGDFKPSGKYVMADLINVGGTQSVIKYLYENNMLHGNTMTVTGDTLAERAKK 381 R+SD TP++ DFKPSGKY+M DL GG +V+KYL +LHG+ +TVTG T+AE Sbjct: 297 RMSDVTPVLADFKPSGKYLMQDLHQYGGIPAVLKYLLNEGLLHGDCLTVTGKTVAENLAD 356 Query: 382 APSLPE-GQEIIKPLSHPIKANGHLQILYGSLAPGGAVGKITGKEGTYFKGRARVFEEEG 440 S+ + Q+II+ LS PIKA GHLQILYG+LA G+V KI+GKEG F+G ARVF+ E Sbjct: 357 VKSIMDYDQKIIQKLSEPIKATGHLQILYGNLAEKGSVAKISGKEGEKFEGPARVFDGEH 416 Query: 441 AFIEALERGEIKKGEKTVVVIRYEGPRGAPGMPEMLKPSSALMGYGLGKDVALLTDGRFS 500 I + G ++ G+ V+VI+ GP GAPGMPEMLKP+SA++G GLGK VAL+TDGRFS Sbjct: 417 DLIAGISSGRVQPGD--VIVIKNSGPVGAPGMPEMLKPTSAIIGAGLGKSVALITDGRFS 474 Query: 501 GGSHGFLIGHIVPEAAEGGPIGLVRDGDEIIIDADNNKIDLLVSDKEMAQRKQSWVAPPP 560 GG+HGF++GHI PE+ +GG IGLV D D I+IDA NN I+L VSD+ +A+R++++V P Sbjct: 475 
GGTHGFVVGHITPESYKGGLIGLVEDEDRILIDAVNNIINLQVSDEVIAERRKNYVQPAL 534 Query: 561 RYTRGTLSKYAKLVSNASNGCVLD 584 + T+G L KYAK VS+A++GCV D Sbjct: 535 KVTKGVLYKYAKTVSDAASGCVTD 558 Lambda K H 0.315 0.134 0.390 Gapped Lambda K H 0.267 0.0410 0.140 Matrix: BLOSUM62 Gap Penalties: Existence: 11, Extension: 1 Number of Sequences: 1 Number of Hits to DB: 1050 Number of extensions: 49 Number of successful extensions: 4 Number of sequences better than 1.0e-02: 1 Number of HSP's gapped: 1 Number of HSP's successfully gapped: 1 Length of query: 585 Length of database: 560 Length adjustment: 36 Effective length of query: 549 Effective length of database: 524 Effective search space: 287676 Effective search space used: 287676 Neighboring words threshold: 11 Window for multiple hits: 40 X1: 16 ( 7.3 bits) X2: 38 (14.6 bits) X3: 64 (24.7 bits) S1: 41 (21.6 bits) S2: 53 (25.0 bits)
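The summary figures in the BLAST report above can be re-derived from the numbers it prints. The following is a minimal sketch, not part of GapMind itself; the constant and function names are illustrative, and all values are copied from the report. It reproduces the reported percentages (BLAST truncates them to whole numbers), the fraction of each sequence spanned by the alignment, and the effective search space.

```python
# Values copied from the BLAST report above; names are illustrative only.
QUERY_LEN = 585                    # SwissProt::P39522
SUBJECT_LEN = 560                  # CA265_RS15795
IDENTITIES, POSITIVES, GAPS, ALN_LEN = 310, 406, 9, 564
QUERY_START, QUERY_END = 22, 584   # aligned range on the query
SBJCT_START, SBJCT_END = 3, 558    # aligned range on the candidate
LENGTH_ADJUSTMENT = 36


def fraction_covered(start: int, end: int, total_len: int) -> float:
    """Fraction of a sequence spanned by an alignment (1-based, inclusive)."""
    return (end - start + 1) / total_len


percent_identity = 100 * IDENTITIES / ALN_LEN
percent_positives = 100 * POSITIVES / ALN_LEN
percent_gaps = 100 * GAPS / ALN_LEN

query_coverage = fraction_covered(QUERY_START, QUERY_END, QUERY_LEN)
subject_coverage = fraction_covered(SBJCT_START, SBJCT_END, SUBJECT_LEN)

# Both effective lengths are the raw lengths minus the length adjustment.
effective_search_space = (QUERY_LEN - LENGTH_ADJUSTMENT) * (
    SUBJECT_LEN - LENGTH_ADJUSTMENT
)

if __name__ == "__main__":
    print(f"identity  {percent_identity:.1f}%")
    print(f"positives {percent_positives:.1f}%")
    print(f"gaps      {percent_gaps:.1f}%")
    print(f"query coverage   {query_coverage:.3f}")
    print(f"subject coverage {subject_coverage:.3f}")
    print(f"effective search space {effective_search_space}")
```

The alignment spans nearly the full length of both proteins, which is what one would expect for two full-length orthologs.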
Align candidate CA265_RS15795 CA265_RS15795 (dihydroxy-acid dehydratase)
to HMM TIGR00110 (ilvD: dihydroxy-acid dehydratase (EC 4.2.1.9))
# hmmsearch :: search profile(s) against a sequence database # HMMER 3.3.1 (Jul 2020); http://hmmer.org/ # Copyright (C) 2020 Howard Hughes Medical Institute. # Freely distributed under the BSD open source license. # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - # query HMM file: ../tmp/path.aa/TIGR00110.hmm # target sequence database: /tmp/gapView.6156.genome.faa # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Query: TIGR00110 [M=543] Accession: TIGR00110 Description: ilvD: dihydroxy-acid dehydratase Scores for complete sequences (score includes all domains): --- full sequence --- --- best 1 domain --- -#dom- E-value score bias E-value score bias exp N Sequence Description ------- ------ ----- ------- ------ ----- ---- -- -------- ----------- 1.1e-217 709.8 9.5 1.3e-217 709.6 9.5 1.0 1 lcl|FitnessBrowser__Pedo557:CA265_RS15795 CA265_RS15795 dihydroxy-acid deh Domain annotation for each sequence (and alignments): >> lcl|FitnessBrowser__Pedo557:CA265_RS15795 CA265_RS15795 dihydroxy-acid dehydratase # score bias c-Evalue i-Evalue hmmfrom hmm to alifrom ali to envfrom env to acc --- ------ ----- --------- --------- ------- ------- ------- ------- ------- ------- ---- 1 ! 709.6 9.5 1.3e-217 1.3e-217 2 542 .. 20 558 .. 19 559 .. 
0.99 Alignments for each domain: == domain 1 score: 709.6 bits; conditional E-value: 1.3e-217 TIGR00110 2 arallkatGlkdedlekPiiavvnsyteivPghvhlkdlaklvkeeieaaGgvakefntiavsDGiam 69 a+a+l+ +Gl+d+d+ k ++++++ + + +++hl+dlak vk+++ ++ v fnti+vsDG++ lcl|FitnessBrowser__Pedo557:CA265_RS15795 20 AQAMLYGIGLTDADMAKAQVGIASMGYDGNTCNMHLNDLAKDVKKGVWKNDLVGLVFNTIGVSDGMSN 87 79****************************************************************** PP TIGR00110 70 gheGmkysLpsreiiaDsvetvvkahalDalvvissCDkivPGmlmaalrlniPaivvsGGpmeagkt 137 g++Gm+ysL+sr++iaDs+et++ ++ +D+++ i+ CDk++PG++ma++rl++P+i+v+GG++++g++ lcl|FitnessBrowser__Pedo557:CA265_RS15795 88 GTDGMRYSLVSRDVIADSIETICGGQYYDGIISIPGCDKNMPGAIMAMARLDRPSIMVYGGTIAPGHY 155 ******************************************************************** PP TIGR00110 138 klsekidlvdvfeavgeyaagklseeeleeiersacPtagsCsGlftansmacltealGlslPgsstl 205 k +e++++v++fea+g+ g+lsee+ + i +++cP+ag+C+G++tan+ma++ ealG+slP+ss+ lcl|FitnessBrowser__Pedo557:CA265_RS15795 156 K-GEELNIVSAFEALGQKICGNLSEEDYQGIIKHTCPGAGACGGMYTANTMASAIEALGMSLPYSSSN 222 *.9***************************************************************** PP TIGR00110 206 latsaekkelakksgkrivelvkknikPrdiltkeafenaitldlalGGstntvLhllaiakeagvkl 273 +a+s+ekk+ + +gk+i+ l++k+ikP+di+t++afenai +++lGGstn+vLh +a+ k +g+++ lcl|FitnessBrowser__Pedo557:CA265_RS15795 223 PAISEEKKQECLDAGKYIKILLEKDIKPSDIMTRKAFENAIRSIIILGGSTNAVLHFIAMGKAIGIEI 290 ******************************************************************** PP TIGR00110 274 slddfdrlsrkvPllaklkPsgkkviedlhraGGvsavlkeldkegllhkdaltvtGktlaetlekvk 341 + ddf+r+s+ +P+la++kPsgk++++dlh+ GG++avlk+l +egllh d+ltvtGkt+ae+l++vk lcl|FitnessBrowser__Pedo557:CA265_RS15795 291 TQDDFQRMSDVTPVLADFKPSGKYLMQDLHQYGGIPAVLKYLLNEGLLHGDCLTVTGKTVAENLADVK 358 ******************************************************************99 PP TIGR00110 342 vlr.vdqdvirsldnpvkkegglavLkGnlaeeGavvkiagveedilkfeGpakvfeseeealeailg 408 +++ dq++i++l++p+k++g+l++L+Gnlae+G+v+ki+g+e kfeGpa+vf+ e++ +++i + 
lcl|FitnessBrowser__Pedo557:CA265_RS15795 359 SIMdYDQKIIQKLSEPIKATGHLQILYGNLAEKGSVAKISGKEG--EKFEGPARVFDGEHDLIAGISS 424 87325666***********************************9..9********************* PP TIGR00110 409 gkvkeGdvvviryeGPkGgPGmremLaPtsalvglGLgkkvaLitDGrfsGgtrGlsiGhvsPeaaeg 476 g+v+ Gdv+vi+ GP G+PGm+emL+Ptsa++g+GLgk+vaLitDGrfsGgt+G+++Gh++Pe+++g lcl|FitnessBrowser__Pedo557:CA265_RS15795 425 GRVQPGDVIVIKNSGPVGAPGMPEMLKPTSAIIGAGLGKSVALITDGRFSGGTHGFVVGHITPESYKG 492 ******************************************************************** PP TIGR00110 477 GaialvedGDkikiDienrkldlevseeelaerrakakkkearevkgaLakyaklvssadkGavld 542 G i+lved D+i iD+ n+ ++l+vs+e +aerr++ +++ + +kg+L kyak vs a +G+v+d lcl|FitnessBrowser__Pedo557:CA265_RS15795 493 GLIGLVEDEDRILIDAVNNIINLQVSDEVIAERRKNYVQPALKVTKGVLYKYAKTVSDAASGCVTD 558 ****************************************************************97 PP Internal pipeline statistics summary: ------------------------------------- Query model(s): 1 (543 nodes) Target sequences: 1 (560 residues searched) Passed MSV filter: 1 (1); expected 0.0 (0.02) Passed bias filter: 1 (1); expected 0.0 (0.02) Passed Vit filter: 1 (1); expected 0.0 (0.001) Passed Fwd filter: 1 (1); expected 0.0 (1e-05) Initial search space (Z): 1 [actual number of targets] Domain search space (domZ): 1 [number of targets reported over threshold] # CPU time: 0.03u 0.01s 00:00:00.04 Elapsed: 00:00:00.04 # Mc/sec: 7.47 // [ok]
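The single domain row in the hmmsearch report above carries the coordinates needed to judge how much of the TIGR00110 model the hit covers. The sketch below parses that row and computes coverage of the model and of the candidate protein. The `DomainHit` class and `parse_domain_row` function are hypothetical helpers written for this illustration, not part of HMMER or GapMind, and the parser assumes the whitespace-separated column order shown in the report's column header.

```python
from dataclasses import dataclass


@dataclass
class DomainHit:
    score: float
    i_evalue: float
    hmm_from: int
    hmm_to: int
    ali_from: int
    ali_to: int
    acc: float


def parse_domain_row(row: str) -> DomainHit:
    """Parse one row of hmmsearch's per-domain table (human-readable output).

    Assumed columns: #, flag, score, bias, c-Evalue, i-Evalue,
    hmmfrom, hmmto, bounds, alifrom, alito, bounds, envfrom, envto, bounds, acc.
    """
    f = row.split()
    return DomainHit(
        score=float(f[2]),
        i_evalue=float(f[5]),
        hmm_from=int(f[6]),
        hmm_to=int(f[7]),
        ali_from=int(f[9]),
        ali_to=int(f[10]),
        acc=float(f[15]),
    )


# The domain row from the report above, rejoined onto one line.
ROW = "1 ! 709.6 9.5 1.3e-217 1.3e-217 2 542 .. 20 558 .. 19 559 .. 0.99"
MODEL_LEN = 543   # "[M=543]" in the report
SEQ_LEN = 560     # "560 residues searched"

hit = parse_domain_row(ROW)
model_coverage = (hit.hmm_to - hit.hmm_from + 1) / MODEL_LEN
sequence_coverage = (hit.ali_to - hit.ali_from + 1) / SEQ_LEN

if __name__ == "__main__":
    print(f"model coverage    {model_coverage:.3f}")
    print(f"sequence coverage {sequence_coverage:.3f}")
```

For scripted use, hmmsearch's tabular output options are generally easier to parse than the human-readable report; the row format above is used here only because it is what the page shows.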
This GapMind analysis is from Aug 03 2021. The underlying query database was built on Aug 03 2021.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMER with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other blast hits with at least 50% coverage are "low confidence."
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on the links to Curated BLAST for each step definition (on the per-step page).
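For readers unfamiliar with the term, a six-frame translation translates a DNA sequence in all three reading frames on each strand, so that a protein-coding region can be found even where no gene was predicted. The following is a minimal sketch using only the Python standard library. It is an illustration of the concept, not the code used by GapMind or Curated BLAST; the function names are made up for this example, and it assumes the standard genetic code, whose codon-to-amino-acid assignments are shared with the bacterial and archaeal code.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order
# (TTT, TTC, TTA, TTG, TCT, ...). '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna: str) -> str:
    """Translate from the first base; codons with non-ACGT bases become 'X'.

    Trailing bases that do not form a complete codon are ignored.
    """
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")
        for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translation(dna: str) -> dict:
    """Translate all three frames of both strands, keyed '+1'..'+3', '-1'..'-3'."""
    dna = dna.upper()
    rc = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


if __name__ == "__main__":
    for name, protein in six_frame_translation("ATGGCCTAA").items():
        print(name, protein)
```

Searching these translated frames with a protein query is how a gene that gene prediction missed, or one disrupted by a frameshift, can still be detected.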
For more information, see the 2019 paper on GapMind for amino acid biosynthesis, the 2022 paper on GapMind for carbon sources, the source code, or the changes to amino acid biosynthesis since publication.
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory