Align dihydroxy-acid dehydratase subunit (EC 4.2.1.9) (characterized)
to candidate NP_349767.1 CA_C3170 dihydroxy-acid dehydratase
Query= metacyc::MONOMER-11919
         (549 letters)

>NCBI__GCF_000008765.1:NP_349767.1
          Length = 552

 Score =  606 bits (1562), Expect = e-178
 Identities = 301/550 (54%), Positives = 416/550 (75%), Gaps = 7/550 (1%)

Query: 1   MKSDTIKRGIQRAPHRSLLARCGLTDDDFEKPFIGIANSYTDIVPGHIHLRELAEAVKEG 60
           M SD +G++RAPHRSL G D++ ++P IGIANSY++++PGH++L ++ +AVK+G
Sbjct: 1   MNSDKAMKGVERAPHRSLFKALGFIDEEMDRPLIGIANSYSELIPGHMNLDKIVKAVKDG 60

Query: 61  VNAAGGVAFEFNTMAICDGIAMNHDGMKYSLASREIVADTVESMAMAHALDGLVLLPTCD 120
           + AGGV EF T+ +CDGI+MNH GM YSL SR+I+AD+VE +A AHALDGLVL+P CD
Sbjct: 61  IRMAGGVPVEFGTIGVCDGISMNHKGMSYSLPSRQIIADSVEIVAKAHALDGLVLVPNCD 120

Query: 121 KIVPGMLMAAARLDIPAIVVTGGPMLPGEFKGRKVDLINVYEGVGTVSAGEMSEDELEEL 180
           K+VPGMLMAA R++IP+IV++GGPML G GR DL +V+E VG VSAG+MS +EL EL
Sbjct: 121 KVVPGMLMAAGRVNIPSIVISGGPMLSGRTNGRVTDLNSVFEAVGAVSAGKMSLEELAEL 180

Query: 181 ERCACPGPRSCAGLFTANTMACLTEALGMSLPGCATAHAVSSRKRQIARLSGKRIVEMVQ 240
           E ACP SC+G+FTAN+M CL+E LG++LP T AV S + ++A+ +G +IVE+VQ
Sbjct: 181 ENTACPTCGSCSGMFTANSMNCLSEVLGLALPYNGTIPAVFSERLRLAKKAGMKIVELVQ 240

Query: 241 ENLKPTMIMSQEAFENAVMVDLALGGSTNTTLHIPAIAAEIDGLNINLDLFDELSRVIPH 300
           ++++P+ I+++ AF NAV +D+ALGGSTN+ LH+PAIA E D ++IN D +E+S IPH
Sbjct: 241 KDIRPSDILTEAAFMNAVAMDMALGGSTNSLLHLPAIAYECD-VDINFDKINEISEKIPH 299

Query: 301 IASISPAGEHMMLDLDRAGGIPAVLKTL--EDHINRECVTCTGRTVQENIENVKVGHRDV 358
           + +SPAG H + DL AGGIPAV+ + + +N EC+T TG+T+ EN+++ K+ + DV
Sbjct: 300 VCKLSPAGFHHIEDLHMAGGIPAVVNGIIKKGLLNGECMTVTGKTLYENVKDAKIKNIDV 359

Query: 359 IRPLDSPVHSEGGLAILRGNLAPRGSVVKQGAVAEDMMVHEGPAKVFNSEDECMEAIFGG 418
           IR +D+P GGL++LRGNLAP G++VK+ AVA +MM H GPA+VFNSE+E +AI GG
Sbjct: 360 IR-IDNPYSETGGLSVLRGNLAPDGAIVKKAAVAPEMMQHTGPARVFNSEEEVSKAILGG 418

Query: 419 RIDEGDVIVIRYEGPKGGPGMREMLNPTSAIAGMGLER-VALITDGRFSGGTRGPCVGHV 477
           +I+ GDV+VIRYEGPKGGPGM+EML+PT+++AGMGL++ VALITDGRFSG TRG +GHV
Sbjct: 419 KINPGDVVVIRYEGPKGGPGMKEMLSPTASLAGMGLDKSVALITDGRFSGATRGASIGHV 478

Query: 478 SPEAMEDGPLAAVNDGDIIRIDIPSRKLEVDLSPREIEERLQSAVKPRRSVKGWLARYRK 537
           SPEA E GP+ V +GD I IDI + + + + +++ R V+ + VKG+L YR+
Sbjct: 479 SPEAAEGGPIGLVEEGDTIEIDIEKKTINLLVPEEKLKNRKPEKVE--KPVKGYLNTYRQ 536

Query: 538 LAGSADTGAV 547
           SA TGAV
Sbjct: 537 GVSSACTGAV 546

Lambda     K      H
   0.319    0.136    0.397

Gapped
Lambda     K      H
   0.267   0.0410    0.140

Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 1014
Number of extensions: 42
Number of successful extensions: 6
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 549
Length of database: 552
Length adjustment: 36
Effective length of query: 513
Effective length of database: 516
Effective search space: 264708
Effective search space used: 264708
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 53 (25.0 bits)
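The summary statistics in the BLAST report can be cross-checked by hand. The following is a minimal sketch, assuming the standard Karlin-Altschul relations (bit score S' = (lambda * S - ln K) / ln 2 and E = m' * n' * 2^(-S')) and the rounded gapped parameters printed in the report, so the results agree with the reported values only approximately. The function names are illustrative and not part of BLAST or GapMind.

```python
import math

# Values copied from the BLAST report above.
RAW_SCORE = 1562
GAPPED_LAMBDA = 0.267
GAPPED_K = 0.0410
QUERY_LENGTH = 549
DATABASE_LENGTH = 552
LENGTH_ADJUSTMENT = 36


def bit_score(raw_score: float, lam: float, k: float) -> float:
    """Convert a raw alignment score to bits (Karlin-Altschul normalization)."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def effective_search_space(query_len: int, db_len: int, adjustment: int) -> int:
    """Product of the effective query and database lengths."""
    return (query_len - adjustment) * (db_len - adjustment)


def expect_value(bits: float, search_space: int) -> float:
    """Expected number of chance hits scoring at least `bits`."""
    return search_space * 2.0 ** (-bits)


bits = bit_score(RAW_SCORE, GAPPED_LAMBDA, GAPPED_K)
space = effective_search_space(QUERY_LENGTH, DATABASE_LENGTH, LENGTH_ADJUSTMENT)
evalue = expect_value(bits, space)

print(f"bit score    ~ {bits:.1f}")
print(f"search space = {space}")
print(f"E-value      ~ {evalue:.1e}")
```

The search space is the product of the two effective lengths (513 and 516), which matches the 264708 in the report. The bit score and E-value come out close to the reported 606 bits and e-178, with small differences expected because lambda and K are printed to only three significant figures.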
Align candidate NP_349767.1 CA_C3170 (dihydroxy-acid dehydratase)
to HMM TIGR00110 (ilvD: dihydroxy-acid dehydratase (EC 4.2.1.9))
# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.aa/TIGR00110.hmm
# target sequence database:        /tmp/gapView.13893.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR00110  [M=543]
Accession:   TIGR00110
Description: ilvD: dihydroxy-acid dehydratase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                               Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                               -----------
   8.6e-243  792.7   9.0     1e-242  792.5   9.0    1.0  1  lcl|NCBI__GCF_000008765.1:NP_349767.1  CA_C3170 dihydroxy-acid dehydrat

Domain annotation for each sequence (and alignments):
>> lcl|NCBI__GCF_000008765.1:NP_349767.1  CA_C3170 dihydroxy-acid dehydratase
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  792.5   9.0    1e-242    1e-242       1     541 [.      14     547 ..      14     549 .. 0.99

  Alignments for each domain:
  == domain 1  score: 792.5 bits;  conditional E-value: 1e-242
                              TIGR00110   1 aarallkatGlkdedlekPiiavvnsyteivPghvhlkdlaklvkeeieaaGgvakefntiavsDGiamghe 72
                                            ++r+l+ka+G+ de++++P+i+++nsy+e++Pgh++l+++ k+vk++i+ aGgv++ef+ti+v+DGi+m+h+
  lcl|NCBI__GCF_000008765.1:NP_349767.1  14 PHRSLFKALGFIDEEMDRPLIGIANSYSELIPGHMNLDKIVKAVKDGIRMAGGVPVEFGTIGVCDGISMNHK 85
                                            69********************************************************************** PP

                              TIGR00110  73 GmkysLpsreiiaDsvetvvkahalDalvvissCDkivPGmlmaalrlniPaivvsGGpmeagktklsekid 144
                                            Gm ysLpsr+iiaDsve v+kahalD+lv++++CDk+vPGmlmaa r+niP+iv+sGGpm +g+t+ ++ d
  lcl|NCBI__GCF_000008765.1:NP_349767.1  86 GMSYSLPSRQIIADSVEIVAKAHALDGLVLVPNCDKVVPGMLMAAGRVNIPSIVISGGPMLSGRTN-GRVTD 156
                                            ******************************************************************.9**** PP

                              TIGR00110 145 lvdvfeavgeyaagklseeeleeiersacPtagsCsGlftansmacltealGlslPgsstllatsaekkela 216
                                            l +vfeavg+++agk+s eel e+e++acPt+gsCsG+ftansm+cl+e+lGl+lP+++t++a+ +e+++la
  lcl|NCBI__GCF_000008765.1:NP_349767.1 157 LNSVFEAVGAVSAGKMSLEELAELENTACPTCGSCSGMFTANSMNCLSEVLGLALPYNGTIPAVFSERLRLA 228
                                            ************************************************************************ PP

                              TIGR00110 217 kksgkrivelvkknikPrdiltkeafenaitldlalGGstntvLhllaiakeagvklslddfdrlsrkvPll 288
                                            kk+g++ivelv+k+i+P+dilt++af na+++d+alGGstn+ Lhl+aia e +v++++d+++++s+k+P++
  lcl|NCBI__GCF_000008765.1:NP_349767.1 229 KKAGMKIVELVQKDIRPSDILTEAAFMNAVAMDMALGGSTNSLLHLPAIAYECDVDINFDKINEISEKIPHV 300
                                            ************************************************************************ PP

                              TIGR00110 289 aklkPsgkkviedlhraGGvsavlkeldkegllhkdaltvtGktlaetlekvkvlrvdqdvirsldnpvkke 360
                                            +kl+P+g + iedlh aGG++av++ + k+gll+ +++tvtGktl+e+++++k++ + dvir +dnp++++
  lcl|NCBI__GCF_000008765.1:NP_349767.1 301 CKLSPAGFHHIEDLHMAGGIPAVVNGIIKKGLLNGECMTVTGKTLYENVKDAKIK--NIDVIR-IDNPYSET 369
                                            ******************************************************9..99***8.8******* PP

                              TIGR00110 361 gglavLkGnlaeeGavvkiagveedilkfeGpakvfeseeealeailggkvkeGdvvviryeGPkGgPGmre 432
                                            ggl vL+Gnla++Ga+vk a+v+ ++++++Gpa+vf+seee+ +ailggk++ GdvvviryeGPkGgPGm+e
  lcl|NCBI__GCF_000008765.1:NP_349767.1 370 GGLSVLRGNLAPDGAIVKKAAVAPEMMQHTGPARVFNSEEEVSKAILGGKINPGDVVVIRYEGPKGGPGMKE 441
                                            ************************************************************************ PP

                              TIGR00110 433 mLaPtsalvglGLgkkvaLitDGrfsGgtrGlsiGhvsPeaaegGaialvedGDkikiDienrkldlevsee 504
                                            mL Pt+ l+g+GL+k+vaLitDGrfsG+trG siGhvsPeaaegG+i+lve+GD+i+iDie+++++l v ee
  lcl|NCBI__GCF_000008765.1:NP_349767.1 442 MLSPTASLAGMGLDKSVALITDGRFSGATRGASIGHVSPEAAEGGPIGLVEEGDTIEIDIEKKTINLLVPEE 513
                                            ************************************************************************ PP

                              TIGR00110 505 elaerrakakkkearevkgaLakyaklvssadkGavl 541
                                            l++r+ ++++k vkg+L+ y++ vssa +Gav
  lcl|NCBI__GCF_000008765.1:NP_349767.1 514 KLKNRKPEKVEKP---VKGYLNTYRQGVSSACTGAVF 547
                                            *****99888877...89*****************97 PP

Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (543 nodes)
Target sequences:                          1  (552 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.02u 0.01s 00:00:00.03 Elapsed: 00:00:00.02
# Mc/sec: 12.62
//
[ok]
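How much of the model and of the protein the domain hit spans can be read from the domain table row. The sketch below assumes the whitespace-separated column order of the hmmsearch domain table shown in this report (#, flag, score, bias, c-Evalue, i-Evalue, hmmfrom, hmm to, boundary, alifrom, ali to, boundary, envfrom, env to, boundary, acc). The helper names are hypothetical and not part of HMMER.

```python
# Domain table row and lengths copied from the hmmsearch report above.
DOMAIN_ROW = "1 ! 792.5 9.0 1e-242 1e-242 1 541 [. 14 547 .. 14 549 .. 0.99"
MODEL_LENGTH = 543      # [M=543]
SEQUENCE_LENGTH = 552   # residues searched


def parse_domain_row(row: str) -> dict:
    """Split one domain table row into named fields (column order assumed)."""
    f = row.split()
    return {
        "score": float(f[2]),
        "i_evalue": float(f[5]),
        "hmm_from": int(f[6]),
        "hmm_to": int(f[7]),
        "ali_from": int(f[9]),
        "ali_to": int(f[10]),
        "acc": float(f[15]),
    }


def span_fraction(start: int, end: int, total: int) -> float:
    """Fraction of `total` positions covered by the inclusive range start..end."""
    return (end - start + 1) / total


domain = parse_domain_row(DOMAIN_ROW)
model_coverage = span_fraction(domain["hmm_from"], domain["hmm_to"], MODEL_LENGTH)
protein_coverage = span_fraction(domain["ali_from"], domain["ali_to"], SEQUENCE_LENGTH)

print(f"model coverage:   {model_coverage:.1%}")
print(f"protein coverage: {protein_coverage:.1%}")
```

For this hit the alignment covers model positions 1 to 541 of 543 and protein residues 14 to 547 of 552, so nearly the whole model is matched.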
This GapMind analysis is from Apr 10 2024. The underlying query database was built on Apr 09 2024.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMER with enzyme models (usually from TIGRFAMs). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other blast hits with at least 50% coverage are "low confidence."
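As a worked illustration of the coverage floor, the sketch below computes percent identity and coverage for the BLAST alignment on this page and checks it against the 50% threshold. It assumes that coverage means the fraction of the characterized (query) protein spanned by the alignment. The class and function names are hypothetical, and the sketch does not attempt the high- and medium-confidence tests, whose criteria are not listed here.

```python
from dataclasses import dataclass


@dataclass
class BlastHit:
    identities: int        # identical positions in the alignment
    alignment_length: int  # alignment columns, including gaps
    query_from: int        # first aligned residue of the characterized protein
    query_to: int          # last aligned residue of the characterized protein
    query_length: int      # full length of the characterized protein

    @property
    def percent_identity(self) -> float:
        return 100.0 * self.identities / self.alignment_length

    @property
    def query_coverage(self) -> float:
        # Assumption: coverage is measured on the characterized protein.
        return (self.query_to - self.query_from + 1) / self.query_length


def meets_coverage_floor(hit: BlastHit, min_coverage: float = 0.5) -> bool:
    """True if the hit reaches the 50% coverage floor quoted above."""
    return hit.query_coverage >= min_coverage


# Numbers taken from the alignment of MONOMER-11919 to NP_349767.1.
hit = BlastHit(identities=301, alignment_length=550,
               query_from=1, query_to=547, query_length=549)

print(f"identity: {hit.percent_identity:.1f}%")
print(f"coverage: {hit.query_coverage:.1%}")
print("reaches 50% coverage floor:", meets_coverage_floor(hit))
```

Here the alignment spans residues 1 to 547 of the 549-residue characterized protein, far above the 50% floor.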
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
For more information, see:
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory