Align Galactarate dehydratase (EC 4.2.1.42) (characterized)
to candidate Ac3H11_3953 D-galactarate dehydratase (EC 4.2.1.42)
Query= reanno::HerbieS:HSERO_RS15800
         (522 letters)

>FitnessBrowser__acidovorax_3H11:Ac3H11_3953
          Length = 525

 Score =  753 bits (1943), Expect = 0.0
 Identities = 372/518 (71%), Positives = 431/518 (83%), Gaps = 11/518 (2%)

Query: 16  IMMNDTDNVAIVVNDGGLPAGTVFPDG-----LTLVDRVPQGHKIALRDLKQGEAIVRYD 70
           I M+  DNVAIV NDGGLPAGTV P G     +TL D+VPQGHK+AL D+ +G+ + RY+
Sbjct: 8   IRMHPDDNVAIVANDGGLPAGTVLPPGVPGAGITLRDKVPQGHKVALSDMAEGDVVRRYN 67

Query: 71  VAIGYAVRDIPKGGWIEESLVQMPPARELDNLPIATKKPAPQPPLEGYTFEGYRNADGSV 130
           V IGYA++ IP G W+ E L+QMP AR L+ LP+AT KP    PL GYTFEGYRNADGSV
Sbjct: 68  VPIGYALKAIPAGSWVHERLLQMPSARALEGLPMATVKPPVLEPLTGYTFEGYRNADGSV 127

Query: 131 GTRNLLAITTTVQCVAGVVEHAVKRIRAELLPKYPNVEDVVALEHTYGCGVAIDAPNAGI 190
           GTRN+LAITTTVQCVAGVV +AV+RI+ ELLP YP V+DV+ LEHTYGCGVAIDAP+A I
Sbjct: 128 GTRNILAITTTVQCVAGVVANAVRRIKDELLPLYPQVDDVIGLEHTYGCGVAIDAPDAII 187

Query: 191 PIRTLRNISLNPNFGGQAMVVSLGCEKLQPNRLLPENMIPIHKQGE----PYVVCLQDAE 246
           PIRTLRNISLNPNFGG+ MVVSLGCEKLQP+RLLP    PI  + +    P  VCLQ  E
Sbjct: 188 PIRTLRNISLNPNFGGEVMVVSLGCEKLQPDRLLPAGSFPIADERDAALGPETVCLQADE 247

Query: 247 HVGFNSMIDSIMNMAEARLTELNKRRRETCPASDLVVGVQCGGSDAFSGVTANPAVGFAT 306
           HVGF SM+D I+  A   L  LN RRRET  AS+LV+GVQCGGSDAFSGVTANPAVGF
Sbjct: 248 HVGFMSMLDHIVQSARPHLERLNARRRETIRASELVLGVQCGGSDAFSGVTANPAVGFCA 307

Query: 307 DLLVRAGASVMFSEVTEVRDGIDQLTSRAVNEEVAQAMIREMDWYDNYLKQGGVDRSANT 366
           DLLVRAGA+VMFSE TEVRD ++QLTSRA   EVA++++RE+ WYD YL +G VDR+ANT
Sbjct: 308 DLLVRAGATVMFSENTEVRDAVEQLTSRAATPEVAESIVRELGWYDRYLDRGRVDRAANT 367

Query: 367 TPGNKKGGLANIVEKAMGSIVKSGSSPISGVLSPGDKLQ--QKGLIYAATPASDFICGTL 424
           TPGNK GGL+NI EKAMGSI+KSG++PIS VL+PG+KL+  Q+GL+YAATPASDFICGTL
Sbjct: 368 TPGNKAGGLSNIAEKAMGSIIKSGTAPISHVLAPGEKLRRDQRGLVYAATPASDFICGTL 427

Query: 425 QLAAGMNLHIFTTGRGTPYGLAAVPVIKVATRNDLARRWHDLMDVNAGRIASGEASIEDV 484
           QLAAGMNLH+FTTGRGTPYGLA  PV+KVATR DLARRWHDLMD+NAG+IA GEASIE++
Sbjct: 428 QLAAGMNLHVFTTGRGTPYGLAECPVVKVATRTDLARRWHDLMDINAGKIADGEASIEEL 487

Query: 485 GWELFQLMLDVASGKKRTWAEQWKLHNALTLFNPAPVT 522
           GWE+F L+LDVASG+K+TWAEQWKLHNAL LFNPAPVT
Sbjct: 488 GWEMFHLLLDVASGRKKTWAEQWKLHNALVLFNPAPVT 525

Lambda     K      H
   0.317    0.134    0.397

Gapped
Lambda     K      H
   0.267   0.0410    0.140

Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 908
Number of extensions: 40
Number of successful extensions: 4
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 522
Length of database: 525
Length adjustment: 35
Effective length of query: 487
Effective length of database: 490
Effective search space: 238630
Effective search space used: 238630
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.6 bits)
S2: 52 (24.6 bits)
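The summary figures in the BLAST report above can be cross-checked by hand. The following is a minimal sketch with the numbers copied from the report. The variable names are illustrative and not part of GapMind, and the query-coverage formula (aligned query range divided by query length) is an assumption about how coverage is measured, not something this page states.

```python
# Values copied from the BLAST report above; variable names are illustrative only.
identities, positives, aligned_columns = 372, 431, 518
query_len, subject_len = 522, 525
query_start, query_end = 16, 522      # aligned range on the query
length_adjustment = 35

percent_identity = 100 * identities / aligned_columns
percent_positives = 100 * positives / aligned_columns

# Assumed definition: fraction of the query covered by the alignment.
query_coverage = 100 * (query_end - query_start + 1) / query_len

# BLAST subtracts the length adjustment from both lengths before multiplying.
effective_search_space = (query_len - length_adjustment) * (subject_len - length_adjustment)

print(f"identity {percent_identity:.1f}%, positives {percent_positives:.1f}%")
print(f"query coverage {query_coverage:.1f}%")
print(f"effective search space {effective_search_space}")
```

The effective search space computed this way (487 x 490) agrees with the 238630 that BLAST reports. BLAST shows the identity as 71% because it drops the fractional part.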
Align candidate Ac3H11_3953 (D-galactarate dehydratase (EC 4.2.1.42))
to HMM TIGR03248 (garD: galactarate dehydratase (EC 4.2.1.42))
# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR03248.hmm
# target sequence database:        /tmp/gapView.31626.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR03248  [M=507]
Accession:   TIGR03248
Description: galactar-dH20: galactarate dehydratase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                                         Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                                         -----------
   1.9e-265  867.0   0.0   2.1e-265  866.8   0.0    1.0  1  lcl|FitnessBrowser__acidovorax_3H11:Ac3H11_3953  D-galactarate dehydratase (EC 4.

Domain annotation for each sequence (and alignments):
>> lcl|FitnessBrowser__acidovorax_3H11:Ac3H11_3953  D-galactarate dehydratase (EC 4.2.1.42)
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  866.8   0.0  2.1e-265  2.1e-265       1     507 []       6     525 .]       6     525 .] 0.97

  Alignments for each domain:
  == domain 1  score: 866.8 bits;  conditional E-value: 2.1e-265
                                        TIGR03248   1 lyirvneqdnvaivvndkGlpagtefe.....dgltlvekipqghkvalvdlekgdaiiryg 57
                                                      l+ir++++dnvaiv+nd Glpagt+++     +g+tl++k+pqghkval d+++gd ++ry+
  lcl|FitnessBrowser__acidovorax_3H11:Ac3H11_3953   6 LTIRMHPDDNVAIVANDGGLPAGTVLPpgvpgAGITLRDKVPQGHKVALSDMAEGDVVRRYN 67
                                                      689**********************98333335689************************** PP

                                        TIGR03248  58 eviGyavkdiarGswvkeellelpsapaleelplatkvpeklapleGytfeGyrnadGsvGt 119
                                                        iGya+k+i+ Gswv+e+ll++psa ale lp+at +p+ l+pl GytfeGyrnadGsvGt
  lcl|FitnessBrowser__acidovorax_3H11:Ac3H11_3953  68 VPIGYALKAIPAGSWVHERLLQMPSARALEGLPMATVKPPVLEPLTGYTFEGYRNADGSVGT 129
                                                      ************************************************************** PP

                                        TIGR03248 120 knilgittsvqcvagvvdyavkrikkellpkypnvddvvalnhsyGcGvaidapdaivpirt 181
                                                      +nil+itt+vqcvagvv  av+rik+ellp yp+vddv++l+h+yGcGvaidapdai+pirt
  lcl|FitnessBrowser__acidovorax_3H11:Ac3H11_3953 130 RNILAITTTVQCVAGVVANAVRRIKDELLPLYPQVDDVIGLEHTYGCGVAIDAPDAIIPIRT 191
                                                      ************************************************************** PP

                                        TIGR03248 182 lrnlalnpnlGGealvvglGceklqperllpeelsa.velkdaav....lrlqdekl.Gfae 237
                                                      lrn++lnpn+GGe++vv+lGceklqp+rllp +  + +++ daa+    ++lq +++ Gf +
  lcl|FitnessBrowser__acidovorax_3H11:Ac3H11_3953 192 LRNISLNPNFGGEVMVVSLGCEKLQPDRLLPAGSFPiADERDAALgpetVCLQADEHvGFMS 253
                                                      *******************************998875566666433333899988888**** PP

                                        TIGR03248 238 mveailelaeerlkklnarkretvpaselvvGlqcGGsdafsGvtanpavGfaadllvraGa 299
                                                      m+++i++ a+ +l++lnar+ret+ aselv+G+qcGGsdafsGvtanpavGf adllvraGa
  lcl|FitnessBrowser__acidovorax_3H11:Ac3H11_3953 254 MLDHIVQSARPHLERLNARRRETIRASELVLGVQCGGSDAFSGVTANPAVGFCADLLVRAGA 315
                                                      ************************************************************** PP

                                        TIGR03248 300 tvlfsevtevrdaihlltpraedaevakaliremkwydeylarGeadrsanttpGnkkGGls 361
                                                      tv+fse+tevrda+++lt+ra++ eva++++re+ wyd+yl+rG++dr+anttpGnk GGls
  lcl|FitnessBrowser__acidovorax_3H11:Ac3H11_3953 316 TVMFSENTEVRDAVEQLTSRAATPEVAESIVRELGWYDRYLDRGRVDRAANTTPGNKAGGLS 377
                                                      ************************************************************** PP

                                        TIGR03248 362 nivekalGsivksGssaivevlspGekvk..kkGliyaatpasdfvcGtlqlasglnlhvft 421
                                                      ni eka+Gsi+ksG+++i++vl+pGek +  ++Gl+yaatpasdf+cGtlqla+g+nlhvft
  lcl|FitnessBrowser__acidovorax_3H11:Ac3H11_3953 378 NIAEKAMGSIIKSGTAPISHVLAPGEKLRrdQRGLVYAATPASDFICGTLQLAAGMNLHVFT 439
                                                      **************************9863379***************************** PP

                                        TIGR03248 422 tGrGtpyGlalvpvikvstrtelaerwadlidldaGriatGeatiedvGwelfrlildvasG 483
                                                      tGrGtpyGla+ pv+kv+trt+la+rw+dl+d++aG+ia Gea+ie++Gwe+f+l+ldvasG
  lcl|FitnessBrowser__acidovorax_3H11:Ac3H11_3953 440 TGRGTPYGLAECPVVKVATRTDLARRWHDLMDINAGKIADGEASIEELGWEMFHLLLDVASG 501
                                                      ************************************************************** PP

                                        TIGR03248 484 rkktwaekyklhndlalfnpapvt 507
                                                      rkktwae++klhn+l lfnpapvt
  lcl|FitnessBrowser__acidovorax_3H11:Ac3H11_3953 502 RKKTWAEQWKLHNALVLFNPAPVT 525
                                                      ***********************8 PP

Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (507 nodes)
Target sequences:                          1  (525 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.03u 0.01s 00:00:00.04 Elapsed: 00:00:00.03
# Mc/sec: 7.28
//
[ok]
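To see how much of the model and of the candidate the HMM hit spans, the domain row from the hmmsearch table above can be parsed directly. This is a rough sketch: the row is pasted in as a whitespace-separated string, the field positions follow the column headers of that table, and the variable names are illustrative rather than anything GapMind defines.

```python
# Domain row copied from the hmmsearch domain table above (whitespace-separated).
row = "1 ! 866.8 0.0 2.1e-265 2.1e-265 1 507 [] 6 525 .] 6 525 .] 0.97"
fields = row.split()

score = float(fields[2])                            # bit score of the domain
hmm_from, hmm_to = int(fields[6]), int(fields[7])   # aligned range on the model
ali_from, ali_to = int(fields[9]), int(fields[10])  # aligned range on the protein

model_len = 507    # from "[M=507]" in the report
target_len = 525   # from "525 residues searched"

model_coverage = (hmm_to - hmm_from + 1) / model_len
target_coverage = (ali_to - ali_from + 1) / target_len

print(f"score {score} bits")
print(f"model coverage {model_coverage:.1%}, target coverage {target_coverage:.1%}")
```

For this candidate the alignment spans every node of the model (1 to 507) and residues 6 to 525 of the 525-residue protein.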
This GapMind analysis is from Sep 17 2021. The underlying query database was built on Sep 17 2021.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other blast hits with at least 50% coverage are "low confidence."
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
For more information, see:
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory