Align 4-hydroxy-tetrahydrodipicolinate reductase (EC 1.17.1.8) (characterized)
to candidate WP_035236887.1 Q366_RS04940 dihydrodipicolinate reductase
Query= BRENDA::D8R6G2
         (288 letters)

>NCBI__GCF_000745975.1:WP_035236887.1
          Length = 278

 Score =  250 bits (639), Expect = 2e-71
 Identities = 127/273 (46%), Positives = 168/273 (61%), Gaps = 2/273 (0%)

Query: 15  PVMVNDCTGKVGQAVAEAAVA-AGLRLVPLSLTGPGRGGKRVVIGNVEVDVREVSEREDV 73
           P+M+N   G V + + EAA       LVP SLTG      +V +   +V + + SERED
Sbjct: 5   PIMINGLPGNVARIMTEAAFQDERFTLVPFSLTGEEISMTQVSVDQTKVTLLKPSEREDR 64

Query: 74  VKEVITEYPNVIVVDYTLPAAVNDNAEFYCKQGLPFVMGTTGGDREKLLDVARKSGTYSI 133
           +  ++  YP  I VDYT P AVN+NA+FY    +PFVMGTTGGDR+ L +        ++
Sbjct: 65  INNILESYPGCICVDYTHPTAVNNNAKFYVAHKIPFVMGTTGGDRQDLENTVNNGSVPAV 124

Query: 134 IAPQMGKQVVAFVAAMEIMAKQFPGAFSGYTLQVTESHQSTKADVSGTALAVISSLRKLG 193
           IAP M KQ+V   A +E  A  FPG F G+TLQV ESHQ  KAD SGTA A+++   +LG
Sbjct: 125 IAPNMAKQIVGLQAMLEYGASTFPGLFKGFTLQVKESHQQAKADTSGTAKALVACFNQLG 184

Query: 194 LDFKDEQVELVRDPKEQMTRMGVPEQHLNGHAFHTYKIISPDGTVFFEFKHNVCGRSIYA 253
            DF    +E +RDPK Q   +GVPEQ++ GH +HTY + +PDG+  FE  HN+ GR IY
Sbjct: 185 TDFNISDIEKIRDPKIQKADLGVPEQYIEGHGWHTYTLKAPDGSALFELTHNINGRQIYV 244

Query: 254 QGTVDAVLFLSKKIQEKS-EKRLYNMIDVLEGG 285
            GT DAV+FL  KI   +  ++L+ MIDVL  G
Sbjct: 245 SGTFDAVVFLKNKIDANAFGRKLFTMIDVLTAG 277

Lambda     K      H
   0.318    0.134    0.379

Gapped
Lambda     K      H
   0.267   0.0410    0.140

Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 215
Number of extensions: 10
Number of successful extensions: 1
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 288
Length of database: 278
Length adjustment: 26
Effective length of query: 262
Effective length of database: 252
Effective search space:    66024
Effective search space used:    66024
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 47 (22.7 bits)
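The summary figures in the BLAST report above can be recomputed from its raw counts. The following is a minimal sketch for checking them, not part of GapMind; the helper names (percent, aligned_span) are hypothetical, and all input values are copied from the report.

```python
# Hypothetical helpers for sanity-checking a BLAST report.
# All numbers below are copied from the alignment output above.

def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage."""
    return 100.0 * part / whole

def aligned_span(start: int, end: int) -> int:
    """Number of residues in an inclusive coordinate range."""
    return end - start + 1

identities, positives, aln_len = 127, 168, 273
query_len, subject_len = 288, 278

identity_pct = percent(identities, aln_len)   # report shows 46%
positive_pct = percent(positives, aln_len)    # report shows 61%

# Fraction of each sequence covered by the alignment.
query_cov = percent(aligned_span(15, 285), query_len)
subject_cov = percent(aligned_span(5, 277), subject_len)

# The report's effective lengths equal the raw lengths minus
# the length adjustment; their product is the search space.
length_adjustment = 26
search_space = (query_len - length_adjustment) * (subject_len - length_adjustment)

print(f"identity {identity_pct:.1f}%, positives {positive_pct:.1f}%")
print(f"query coverage {query_cov:.1f}%, subject coverage {subject_cov:.1f}%")
print(f"effective search space {search_space}")
```

The whole-number percentages printed in the report (46% and 61%) correspond to these values with the fractional part dropped.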
Align candidate WP_035236887.1 Q366_RS04940 (dihydrodipicolinate reductase)
to HMM TIGR02130 (dapB: dihydrodipicolinate reductase (EC 1.17.1.8))
# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.aa/TIGR02130.hmm
# target sequence database:        /tmp/gapView.10599.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR02130  [M=275]
Accession:   TIGR02130
Description: dapB_plant: dihydrodipicolinate reductase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                                  Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                                  -----------
   2.2e-107  344.2   0.0   2.4e-107  344.0   0.0    1.0  1  lcl|NCBI__GCF_000745975.1:WP_035236887.1  Q366_RS04940 dihydrodipicolinate


Domain annotation for each sequence (and alignments):
>> lcl|NCBI__GCF_000745975.1:WP_035236887.1  Q366_RS04940 dihydrodipicolinate reductase
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  344.0   0.0  2.4e-107  2.4e-107       1     272 [.       4     277 ..       4     278 .] 0.97

  Alignments for each domain:
  == domain 1  score: 344.0 bits;  conditional E-value: 2.4e-107
                                 TIGR02130   1 vsvmvngisgkmgkivikaav.aaglelvptsltsviiveqevevagkeilllkpserekvlsevleky 68
                                               + +m+ng++g++ +i+ +aa       lvp+slt+++i   +v v   ++ llkpsere +++++le y
  lcl|NCBI__GCF_000745975.1:WP_035236887.1   4 IPIMINGLPGNVARIMTEAAFqDERFTLVPFSLTGEEISMTQVSVDQTKVTLLKPSEREDRINNILESY 72
                                               569***************99615689******************************************* PP

                                 TIGR02130  69 pelivvdytipsavndnaelyvkvkvpfvmgttggdrealaklveeakiyaviapqmakqvvaflaaie 137
                                               p  i+vdyt+p+avn+na++yv +k+pfvmgttggdr+ l  +v++  ++aviap+makq+v ++a++e
  lcl|NCBI__GCF_000745975.1:WP_035236887.1  73 PGCICVDYTHPTAVNNNAKFYVAHKIPFVMGTTGGDRQDLENTVNNGSVPAVIAPNMAKQIVGLQAMLE 141
                                               ********************************************************************* PP

                                 TIGR02130 138 ilakefptafegyklevveshqkskldasgtakavvstfqklgvdydmddiekvrdekeqievvgvpee 206
                                               + a  fp+ f+g+ l+v eshq++k+d+sgtaka+v++f++lg d+++ diek+rd+k q   +gvpe+
  lcl|NCBI__GCF_000745975.1:WP_035236887.1 142 YGASTFPGLFKGFTLQVKESHQQAKADTSGTAKALVACFNQLGTDFNISDIEKIRDPKIQKADLGVPEQ 210
                                               ********************************************************************* PP

                                 TIGR02130 207 ylsghafhlysldsadktvsfefqhnvcgrkiyaegtvdavlfladkiiaka.ekkiynmidvlreg 272
                                               y+ gh +h+y+l+++d++  fe+ hn+ gr+iy  gt dav fl +ki a+a  +k + midvl  g
  lcl|NCBI__GCF_000745975.1:WP_035236887.1 211 YIEGHGWHTYTLKAPDGSALFELTHNINGRQIYVSGTFDAVVFLKNKIDANAfGRKLFTMIDVLTAG 277
                                               ************************************************998835789*******988 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (275 nodes)
Target sequences:                          1  (278 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.01u 0.01s 00:00:00.02 Elapsed: 00:00:00.00
# Mc/sec: 8.14
//
[ok]
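To see how much of the model and of the candidate the HMM hit spans, the domain row of the hmmsearch output above can be split on whitespace and the coordinates compared with the model length (M=275) and the sequence length (278 residues). This is a rough sketch under that assumption about the row layout; DomainHit and parse_domain_row are hypothetical names, not part of HMMER or GapMind.

```python
from dataclasses import dataclass

@dataclass
class DomainHit:
    score: float
    i_evalue: float
    hmm_from: int
    hmm_to: int
    ali_from: int
    ali_to: int
    acc: float

def parse_domain_row(row: str) -> DomainHit:
    """Parse one whitespace-separated row of the per-domain table.

    Assumes the 16-field layout seen in the output above:
    #, flag, score, bias, c-Evalue, i-Evalue, hmmfrom, hmmto, bounds,
    alifrom, alito, bounds, envfrom, envto, bounds, acc.
    """
    f = row.split()
    return DomainHit(
        score=float(f[2]),
        i_evalue=float(f[5]),
        hmm_from=int(f[6]),
        hmm_to=int(f[7]),
        ali_from=int(f[9]),
        ali_to=int(f[10]),
        acc=float(f[15]),
    )

def coverage(start: int, end: int, total: int) -> float:
    """Fraction of `total` positions covered by the inclusive range."""
    return (end - start + 1) / total

# Domain row copied from the hmmsearch output above.
row = "1 ! 344.0 0.0 2.4e-107 2.4e-107 1 272 [. 4 277 .. 4 278 .] 0.97"
hit = parse_domain_row(row)

model_cov = coverage(hit.hmm_from, hit.hmm_to, 275)   # model length M=275
seq_cov = coverage(hit.ali_from, hit.ali_to, 278)     # candidate length 278

print(f"model coverage {model_cov:.1%}, sequence coverage {seq_cov:.1%}")
```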
This GapMind analysis is from Apr 10 2024. The underlying query database was built on Apr 09 2024.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMER with enzyme models (usually from TIGRFAMs). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other blast hits with at least 50% coverage are "low confidence."
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
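The definition of a gap given above (a step with no high- or medium-confidence candidate) can be sketched as a simple filter. This is an illustration only, not GapMind's implementation; the step names and confidence labels in the example are invented, and find_gaps is a hypothetical helper.

```python
from typing import Dict, List

def find_gaps(candidates_by_step: Dict[str, List[str]]) -> List[str]:
    """Return the steps that have no high- or medium-confidence candidate.

    Each step maps to the confidence labels of its candidates
    ("high", "medium" or "low"); a step with no candidates is also a gap.
    """
    return [
        step
        for step, levels in candidates_by_step.items()
        if not any(level in ("high", "medium") for level in levels)
    ]

# Invented example data: four steps of an imaginary pathway.
example = {
    "step_a": ["high"],
    "step_b": ["low", "low"],
    "step_c": [],
    "step_d": ["medium", "low"],
}

gaps = find_gaps(example)
print(gaps)
```

Here step_b (only low-confidence candidates) and step_c (no candidates at all) are reported as gaps, while step_a and step_d are not.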
For more information, see:
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory