GapMind for Amino acid biosynthesis

 

Alignments for a candidate for ilvD in Bacillus alkalinitrilicus DSM 22532

Align dihydroxy-acid dehydratase (EC 4.2.1.9) (characterized)
to candidate WP_078427497.1 BK574_RS03330 dihydroxy-acid dehydratase

Query= BRENDA::Q9LIR4
         (608 letters)



>NCBI__GCF_002019605.1:WP_078427497.1
          Length = 556

 Score =  592 bits (1525), Expect = e-173
 Identities = 298/557 (53%), Positives = 403/557 (72%), Gaps = 3/557 (0%)

Query: 51  NKLNKYSSRITEPKSQGGSQAILHGVGLSDDDLLKPQIGISSVWYEGNTCNMHLLKLSEA 110
           N  +K+ S +    ++  ++A++  +G+ D+D  KP +GI+S W E   CNMH+ +L+  
Sbjct: 3   NNTSKFRSSVFNDINRAPNRAMIRAMGIKDEDFNKPFVGIASTWSEVTPCNMHIDELARK 62

Query: 111 VKEGVENAGMVGFRFNTIGVSDAISMGTRGMCFSLQSRDLIADSIETVMSAQWYDGNISI 170
            K+G  +AG   F FNTI VSD ISMGT GM FSL SR++IADSIETV+ AQ YDG ++I
Sbjct: 63  AKKGALDAGGTPFIFNTITVSDGISMGTEGMRFSLPSREVIADSIETVVGAQNYDGVVAI 122

Query: 171 PGCDKNMPGTIMAMGRLNRPGIMVYGGTIKPGHFQDKTYDIVSAFQSYGEFVSGSISDEQ 230
            GCDKNMPG ++A+GRLN P + VYGGTI+PG+   K  DIVSAF++ G++ +G I  ++
Sbjct: 123 GGCDKNMPGCMIAIGRLNLPAVFVYGGTIRPGNVDGKDIDIVSAFEAVGKYNNGDIDRDE 182

Query: 231 RKTVLHHSCPGAGACGGMYTANTMASAIEAMGMSLPYSSSIPAEDPLKLDECRLAGKYLL 290
              +  H+CPGAG+CGGMYTANTMASAIEAMGMSLP SSS PAE   KL++C  AGK ++
Sbjct: 183 LHKIECHACPGAGSCGGMYTANTMASAIEAMGMSLPGSSSNPAETEEKLEDCIKAGKAVM 242

Query: 291 ELLKMDLKPRDIITPKSLRNAMVSVMALGGSTNAVLHLIAIARSVGLELTLDDFQKVSDA 350
            LL   + P+DI+T K+  NA+  VMALGGSTNAVLHL+A+A +V ++L LDDF+++   
Sbjct: 243 NLLNKGITPKDIMTKKAFENAITVVMALGGSTNAVLHLLALAHTVDVDLNLDDFERIRKK 302

Query: 351 VPFLADLKPSGKYVMEDIHKIGGTPAVLRYLLELGLMDGDCMTVTGQTLAQNLENVPSLT 410
           VP +ADLKPSG+YVME++ +IGG PAV++ LL+ GL+ GDC+TVT  T+ QNL  +  L 
Sbjct: 303 VPHIADLKPSGRYVMENLSEIGGVPAVMKLLLDKGLLHGDCLTVTSNTIEQNLSEIQPLK 362

Query: 411 EGQEIIRPLSNPIKETGHIQILRGDLAPDGSVAKITGKEGLYFSGPALVFEGEESMLAAI 470
           EGQEII    NP +ETG + IL+G+LAPDG++AK++G +    +GPA VF+ E     A+
Sbjct: 363 EGQEII-SFENPKRETGPLVILKGNLAPDGALAKMSGLKIKKITGPARVFDSETDATNAV 421

Query: 471 SADPMSFKGTVVVIRGEGPKGGPGMPEMLTPTSAIMGAGLGKECALLTDGRFSGGSHGFV 530
             + ++  G V+VIR  GPKGGPGM EML+ T+ ++G G G++  L+TDGRFSGG+HG V
Sbjct: 422 LNNEVN-PGDVIVIRYVGPKGGPGMAEMLSITAIVVGKGFGEKVGLITDGRFSGGTHGLV 480

Query: 531 VGHICPEAQEGGPIGLIKNGDIITIDIGKKRIDTQVSPEEMNDRRKKWTAPAYKVNRGVL 590
           VGHI PEAQ GGPI LIK GD+ITID   + +   VSPE++N+R K W+ PA  + RG+L
Sbjct: 481 VGHISPEAQVGGPIALIKEGDMITIDSELQELAVDVSPEDLNERLKDWSPPAQNL-RGIL 539

Query: 591 YKYIKNVQSASDGCVTD 607
            KY ++V SAS G +TD
Sbjct: 540 AKYARSVSSASKGAITD 556


Lambda     K      H
   0.316    0.135    0.397 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 1003
Number of extensions: 42
Number of successful extensions: 4
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 608
Length of database: 556
Length adjustment: 36
Effective length of query: 572
Effective length of database: 520
Effective search space:   297440
Effective search space used:   297440
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.6 bits)
S2: 53 (25.0 bits)
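
The statistics in the report above can be cross-checked by hand. The sketch below is not part of GapMind. It applies the standard Karlin-Altschul relations to numbers copied from the report, and it assumes the gapped Lambda and K values are the ones used for the reported bit score. The function and constant names are illustrative only.

```python
import math

# Values copied from the BLAST report above.
RAW_SCORE = 1525          # raw score shown in parentheses after the bit score
GAPPED_LAMBDA = 0.267     # "Gapped" Lambda
GAPPED_K = 0.0410         # "Gapped" K
QUERY_LEN = 608           # Length of query
DB_LEN = 556              # Length of database (a single sequence here)
LENGTH_ADJUSTMENT = 36    # Length adjustment


def bit_score(raw, lam, k):
    """Convert a raw score to bits: (lambda * S - ln K) / ln 2."""
    return (lam * raw - math.log(k)) / math.log(2)


def effective_search_space(query_len, db_len, adjustment):
    """Product of the effective query and database lengths."""
    return (query_len - adjustment) * (db_len - adjustment)


def expect_value(bits, search_space):
    """Expected number of chance hits: search_space * 2^(-bits)."""
    return search_space * 2.0 ** (-bits)


bits = bit_score(RAW_SCORE, GAPPED_LAMBDA, GAPPED_K)
space = effective_search_space(QUERY_LEN, DB_LEN, LENGTH_ADJUSTMENT)
evalue = expect_value(bits, space)

print(f"bit score ~ {bits:.1f}")
print(f"effective search space = {space}")
print(f"E-value ~ {evalue:.1e}")
```

Because Lambda and K are printed to only a few significant figures, the recomputed values should be read as approximate agreement with the report, not an exact reproduction.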

Align candidate WP_078427497.1 BK574_RS03330 (dihydroxy-acid dehydratase)
to HMM TIGR00110 (ilvD: dihydroxy-acid dehydratase (EC 4.2.1.9))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.aa/TIGR00110.hmm
# target sequence database:        /tmp/gapView.22246.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR00110  [M=543]
Accession:   TIGR00110
Description: ilvD: dihydroxy-acid dehydratase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                                 Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                                 -----------
   1.6e-227  742.3   7.4   1.8e-227  742.1   7.4    1.0  1  lcl|NCBI__GCF_002019605.1:WP_078427497.1  BK574_RS03330 dihydroxy-acid deh


Domain annotation for each sequence (and alignments):
>> lcl|NCBI__GCF_002019605.1:WP_078427497.1  BK574_RS03330 dihydroxy-acid dehydratase
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  742.1   7.4  1.8e-227  1.8e-227       1     542 [.      20     556 .]      20     556 .] 0.99

  Alignments for each domain:
  == domain 1  score: 742.1 bits;  conditional E-value: 1.8e-227
                                 TIGR00110   1 aarallkatGlkdedlekPiiavvnsyteivPghvhlkdlaklvkeeieaaGgvakefntiavsDGiam 69 
                                               ++ra+++a+G+kded++kP+++++++++e++P+++h+++la+++k++  +aGg+++ fnti+vsDGi+m
  lcl|NCBI__GCF_002019605.1:WP_078427497.1  20 PNRAMIRAMGIKDEDFNKPFVGIASTWSEVTPCNMHIDELARKAKKGALDAGGTPFIFNTITVSDGISM 88 
                                               68******************************************************************* PP

                                 TIGR00110  70 gheGmkysLpsreiiaDsvetvvkahalDalvvissCDkivPGmlmaalrlniPaivvsGGpmeagktk 138
                                               g+eGm++sLpsre+iaDs+etvv a+ +D++v+i+ CDk++PG ++a  rln+Pa++v+GG++ +g++ 
  lcl|NCBI__GCF_002019605.1:WP_078427497.1  89 GTEGMRFSLPSREVIADSIETVVGAQNYDGVVAIGGCDKNMPGCMIAIGRLNLPAVFVYGGTIRPGNVD 157
                                               ********************************************************************* PP

                                 TIGR00110 139 lsekidlvdvfeavgeyaagklseeeleeiersacPtagsCsGlftansmacltealGlslPgsstlla 207
                                                +++id+v++feavg+y++g+++ +el++ie +acP+agsC+G++tan+ma++ ea+G+slPgss+ +a
  lcl|NCBI__GCF_002019605.1:WP_078427497.1 158 -GKDIDIVSAFEAVGKYNNGDIDRDELHKIECHACPGAGSCGGMYTANTMASAIEAMGMSLPGSSSNPA 225
                                               .9******************************************************************* PP

                                 TIGR00110 208 tsaekkelakksgkrivelvkknikPrdiltkeafenaitldlalGGstntvLhllaiakeagvklsld 276
                                                ++ek+e + k+gk +++l++k i+P+di+tk+afenait+++alGGstn+vLhlla+a++++v+l+ld
  lcl|NCBI__GCF_002019605.1:WP_078427497.1 226 ETEEKLEDCIKAGKAVMNLLNKGITPKDIMTKKAFENAITVVMALGGSTNAVLHLLALAHTVDVDLNLD 294
                                               ********************************************************************* PP

                                 TIGR00110 277 dfdrlsrkvPllaklkPsgkkviedlhraGGvsavlkeldkegllhkdaltvtGktlaetlekvkvlrv 345
                                               df+r+++kvP++a+lkPsg++v+e+l + GGv+av+k l  +gllh d+ltvt +t++++l++++ l++
  lcl|NCBI__GCF_002019605.1:WP_078427497.1 295 DFERIRKKVPHIADLKPSGRYVMENLSEIGGVPAVMKLLLDKGLLHGDCLTVTSNTIEQNLSEIQPLKE 363
                                               *******************************************************************99 PP

                                 TIGR00110 346 dqdvirsldnpvkkegglavLkGnlaeeGavvkiagveedilkfeGpakvfeseeealeailggkvkeG 414
                                               +q++i s +np +++g l +LkGnla++Ga++k++g +  i k++Gpa+vf+se +a +a+l+ +v+ G
  lcl|NCBI__GCF_002019605.1:WP_078427497.1 364 GQEII-SFENPKRETGPLVILKGNLAPDGALAKMSGLK--IKKITGPARVFDSETDATNAVLNNEVNPG 429
                                               *9998.79****************************96..59*************************** PP

                                 TIGR00110 415 dvvviryeGPkGgPGmremLaPtsalvglGLgkkvaLitDGrfsGgtrGlsiGhvsPeaaegGaialve 483
                                               dv+viry GPkGgPGm emL  t+ +vg G+g+kv+LitDGrfsGgt+Gl++Gh+sPea +gG+ial++
  lcl|NCBI__GCF_002019605.1:WP_078427497.1 430 DVIVIRYVGPKGGPGMAEMLSITAIVVGKGFGEKVGLITDGRFSGGTHGLVVGHISPEAQVGGPIALIK 498
                                               ********************************************************************* PP

                                 TIGR00110 484 dGDkikiDienrkldlevseeelaerrakakkkearevkgaLakyaklvssadkGavld 542
                                               +GD+i+iD e ++l ++vs e+l+er ++++++  ++ +g+Lakya+ vssa+kGa++d
  lcl|NCBI__GCF_002019605.1:WP_078427497.1 499 EGDMITIDSELQELAVDVSPEDLNERLKDWSPPA-QNLRGILAKYARSVSSASKGAITD 556
                                               *******************************999.88*******************986 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (543 nodes)
Target sequences:                          1  (556 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.01u 0.01s 00:00:00.02 Elapsed: 00:00:00.02
# Mc/sec: 10.77
//
[ok]
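
As a rough illustration of how the domain table above can be read, the sketch below splits the single domain row on whitespace and computes how much of the HMM and of the candidate protein the alignment spans. The column positions follow the header printed in the report. The helper name and the two coverage definitions are assumptions made for this example, not GapMind's own code.

```python
# Domain row and lengths copied from the hmmsearch report above.
DOMAIN_ROW = (
    "   1 !  742.1   7.4  1.8e-227  1.8e-227       1     542 [.      20"
    "     556 .]      20     556 .] 0.99"
)
MODEL_LEN = 543   # from "Query:       TIGR00110  [M=543]"
SEQ_LEN = 556     # from "Target sequences: 1 (556 residues searched)"


def parse_domain_row(row):
    """Pick the score, i-Evalue, and coordinates out of one domain row."""
    fields = row.split()
    return {
        "score": float(fields[2]),
        "i_evalue": float(fields[5]),
        "hmm_from": int(fields[6]),
        "hmm_to": int(fields[7]),
        "ali_from": int(fields[9]),
        "ali_to": int(fields[10]),
    }


domain = parse_domain_row(DOMAIN_ROW)

# Fraction of the model's nodes covered by the alignment (assumed definition).
model_coverage = (domain["hmm_to"] - domain["hmm_from"] + 1) / MODEL_LEN
# Fraction of the candidate protein covered by the alignment (assumed definition).
seq_coverage = (domain["ali_to"] - domain["ali_from"] + 1) / SEQ_LEN

print(f"model coverage:    {model_coverage:.3f}")
print(f"sequence coverage: {seq_coverage:.3f}")
```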

This GapMind analysis is from Apr 10 2024. The underlying query database was built on Apr 09 2024.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."
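
To make the coverage and identity figures concrete, the sketch below recomputes them for the BLAST alignment earlier in this report. It assumes that "coverage" means the fraction of the characterized (query) protein spanned by the alignment. That interpretation, and the names used, are illustrative and not taken from GapMind's source. Only the 50% threshold quoted above is used.

```python
# Numbers copied from the BLAST alignment earlier in this report.
QUERY_LEN = 608          # "(608 letters)"
QUERY_START = 51         # first aligned query position
QUERY_END = 607          # last aligned query position
IDENTICAL = 298          # "Identities = 298/557"
ALIGNMENT_COLUMNS = 557

LOW_CONFIDENCE_MIN_COVERAGE = 0.50  # "at least 50% coverage"


def coverage(start, end, length):
    """Fraction of a sequence spanned by an alignment (assumed definition)."""
    return (end - start + 1) / length


query_coverage = coverage(QUERY_START, QUERY_END, QUERY_LEN)
identity = IDENTICAL / ALIGNMENT_COLUMNS
meets_low_confidence = query_coverage >= LOW_CONFIDENCE_MIN_COVERAGE

print(f"coverage of characterized protein: {query_coverage:.1%}")
print(f"identity over aligned columns:     {identity:.1%}")
print(f"meets the 50% coverage floor:      {meets_low_confidence}")
```

This checks only the lowest tier. The high- and medium-confidence rules depend on further criteria that are not reproduced here.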

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory