GapMind for Amino acid biosynthesis

 

Alignments for a candidate for ilvD in Echinicola vietnamensis KMM 6221, DSM 17526

Align dihydroxy-acid dehydratase (EC 4.2.1.9) (characterized)
to candidate Echvi_2055 Echvi_2055 dihydroxy-acid dehydratase

Query= BRENDA::A0A481UJA7
         (614 letters)



>FitnessBrowser__Cola:Echvi_2055
          Length = 561

 Score =  644 bits (1660), Expect = 0.0
 Identities = 311/557 (55%), Positives = 414/557 (74%), Gaps = 2/557 (0%)

Query: 59  LNKYSSRITEPKSQGGSQAILHGVGLSDEDLNKPQIGISSVWYEGNTCNMHLLRLSEAVK 118
           L KYS  I++ +    + A+L+  G++D+ + +P +G++S  YE N CNMHL   +E +K
Sbjct: 5   LKKYSWEISDNEENPAAMAMLYATGITDKKMKQPFVGVASCGYESNPCNMHLNSFAEDIK 64

Query: 119 EGVKEAGMVGFRFNTIGVSDAISMGTRGMCYSLQSRDLIADSIETVMSAQWYDGNISIPG 178
               +A + GF FNTIG+SD  SMGT GM YSL SR++IADSIE+ +    +DG ++IPG
Sbjct: 65  ASTNQADLSGFIFNTIGISDGQSMGTSGMRYSLPSREVIADSIESFILGHSFDGVVTIPG 124

Query: 179 CDKNMPGTIMAMGRLNRPSIMVYGGTIKPGHYNGHSYDIISAFQAYGEYVNGSISDEDRK 238
           CDKNMPG +M M R+NRP IMV+GGTI+ G+Y G   +I+SAF+AYG+ +NG ISDED  
Sbjct: 125 CDKNMPGVVMGMLRVNRPGIMVFGGTIRSGNYKGEKLNIVSAFEAYGKKINGQISDEDYM 184

Query: 239 NVVHNSCPGAGACGGMYTANTMASAIEAMGMCLPYSSSIPAENPLKLDECRLAGKYLLEL 298
            V+ N+CPGAGACGGMYTANTM+SAIEAMG+ LP+SSS PA +  K +EC+  GKY+ +L
Sbjct: 185 GVIKNACPGAGACGGMYTANTMSSAIEAMGLSLPFSSSYPATSKEKREECKNIGKYIKQL 244

Query: 299 LKMDLKPQNIITPQSLRNAMVVVMALGGSTNAVLHLIPIARSVGLELTLEDFQKVSDEVP 358
           L +D+KP++IIT +SL NA+ V +ALGGSTNA LH++ IAR+ G++ TLEDF++++ E P
Sbjct: 245 LALDIKPKDIITKKSLENAVRVTVALGGSTNAALHILAIARTAGIDFTLEDFKRINAETP 304

Query: 359 FLADLKPSGKYVMEDVHKIGGTPAVLRYLLEHGFLDGDCLTVTGKTLAENVQNCPPLSEG 418
            L D KPSGK++MED++++GG PA L+Y L  G L GDCLTVTGKT+AEN+++  P+   
Sbjct: 305 VLGDFKPSGKFMMEDLYEMGGLPAFLKYFLNEGLLHGDCLTVTGKTMAENLEDIDPVKPS 364

Query: 419 QD-IIRPLENPIKKTGHIQILQGNLAPEGSVAKITGKEGLYFSGPALVFEGEEAMLAAIS 477
           ++ +I PL+NPIK +GH+ +L GNLAPEG+VAKI+GKEG  F+G A VF+ E +  AA+ 
Sbjct: 365 KESVIHPLDNPIKPSGHLCVLHGNLAPEGAVAKISGKEGKSFTGTAKVFDDEPSANAAMK 424

Query: 478 ENPMNFKGKVVVIRGEGPKGGPGMPEMLTPTSAIMGAGLGKDCALLTDGRFSGGSHGFVV 537
              +  KG VVVIR  GPKGGPGMPEML PTS I+GAGLG D AL+TDGRFSGG+HGFVV
Sbjct: 425 NKEIQ-KGDVVVIRYVGPKGGPGMPEMLKPTSIIIGAGLGSDVALITDGRFSGGTHGFVV 483

Query: 538 GHICPEAQEGGPIGLVRNGDIIRIDVRERRIDVDVTDQEMEERRKNWTPPPYKATCGVLY 597
           GH+ PEA  GGPIGL+++GD+I ID     I VDV++ E  ER+KNW         G L 
Sbjct: 484 GHVTPEAYLGGPIGLLKDGDVITIDAESLEIRVDVSEAEFAERKKNWKNKDLSHLQGTLK 543

Query: 598 KYIKNVQSASRGCVTDE 614
           KY++ V +AS GCVTD+
Sbjct: 544 KYVQLVSTASEGCVTDK 560
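
The percentages in the alignment header can be recomputed from the counts and coordinates shown above. The following is a minimal sketch: the variable names are illustrative, and the coverage definition (aligned residues divided by full sequence length) is an assumption about how one might summarize the hit, not necessarily the exact definition GapMind applies.

```python
# Counts and coordinates copied from the BLAST-style report above.
identities, positives, gaps, aln_len = 311, 414, 2, 557
query_start, query_end, query_len = 59, 614, 614
sbjct_start, sbjct_end, sbjct_len = 5, 560, 561

# Percentages are taken over the alignment length (columns, including gaps).
pct_identity = 100 * identities / aln_len
pct_positives = 100 * positives / aln_len
pct_gaps = 100 * gaps / aln_len

# Coverage (assumed definition): aligned residues over full sequence length.
query_coverage = 100 * (query_end - query_start + 1) / query_len
sbjct_coverage = 100 * (sbjct_end - sbjct_start + 1) / sbjct_len

print(f"identity  {pct_identity:.1f}%")
print(f"positives {pct_positives:.1f}%")
print(f"gaps      {pct_gaps:.1f}%")
print(f"query coverage   {query_coverage:.1f}%")
print(f"subject coverage {sbjct_coverage:.1f}%")
```

The alignment starts at query position 59, so the candidate covers most but not all of the characterized protein, while nearly the whole candidate sequence is aligned.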


Lambda     K      H
   0.316    0.135    0.401 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 1060
Number of extensions: 42
Number of successful extensions: 3
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 614
Length of database: 561
Length adjustment: 37
Effective length of query: 577
Effective length of database: 524
Effective search space:   302348
Effective search space used:   302348
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.6 bits)
S2: 53 (25.0 bits)
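
The statistics above are enough to check the reported bit score and effective search space. This sketch uses the standard Karlin-Altschul relations with the gapped Lambda and K values from the report; the function names are illustrative, and because Lambda and K are printed to only a few digits the result agrees with the report only to within rounding.

```python
import math

def bit_score(raw_score, lam, k):
    """Convert a raw alignment score to bits: (lambda*S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)

def effective_search_space(query_len, db_len, length_adjustment, num_sequences=1):
    """Product of the effective query and database lengths."""
    effective_query = query_len - length_adjustment
    effective_db = db_len - num_sequences * length_adjustment
    return effective_query * effective_db

def expect_value(search_space, bits):
    """Expected number of chance hits with at least this bit score."""
    return search_space * 2.0 ** (-bits)

# Values copied from the report: raw score 1660, gapped Lambda and K.
bits = bit_score(1660, lam=0.267, k=0.0410)
space = effective_search_space(query_len=614, db_len=561, length_adjustment=37)
evalue = expect_value(space, bits)

print(f"bit score ~ {bits:.1f}")
print(f"effective search space = {space}")
print(f"E-value ~ {evalue:.1e}")
```

The computed E-value is far below the smallest value the report prints, which is why it appears as 0.0.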

Align candidate Echvi_2055 Echvi_2055 (dihydroxy-acid dehydratase)
to HMM TIGR00110 (ilvD: dihydroxy-acid dehydratase (EC 4.2.1.9))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.aa/TIGR00110.hmm
# target sequence database:        /tmp/gapView.29097.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR00110  [M=543]
Accession:   TIGR00110
Description: ilvD: dihydroxy-acid dehydratase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                            Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                            -----------
     9e-224  729.9   3.6     1e-223  729.7   3.6    1.0  1  lcl|FitnessBrowser__Cola:Echvi_2055  Echvi_2055 dihydroxy-acid dehydr


Domain annotation for each sequence (and alignments):
>> lcl|FitnessBrowser__Cola:Echvi_2055  Echvi_2055 dihydroxy-acid dehydratase
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  729.7   3.6    1e-223    1e-223       2     542 ..      21     559 ..      20     560 .. 0.99

  Alignments for each domain:
  == domain 1  score: 729.7 bits;  conditional E-value: 1e-223
                            TIGR00110   2 arallkatGlkdedlekPiiavvnsyteivPghvhlkdlaklvkeeieaaGgvakefntiavsDGiamgheGmk 75 
                                          a a+l+atG++d+ +++P+++v+++  e +P+++hl+ +a+ +k++ ++a    + fnti++sDG +mg++Gm+
  lcl|FitnessBrowser__Cola:Echvi_2055  21 AMAMLYATGITDKKMKQPFVGVASCGYESNPCNMHLNSFAEDIKASTNQADLSGFIFNTIGISDGQSMGTSGMR 94 
                                          579*********************************************************************** PP

                            TIGR00110  76 ysLpsreiiaDsvetvvkahalDalvvissCDkivPGmlmaalrlniPaivvsGGpmeagktklsekidlvdvf 149
                                          ysLpsre+iaDs+e+ + +h +D++v+i+ CDk++PG++m++lr+n+P i+v GG++ +g++k +ek+++v++f
  lcl|FitnessBrowser__Cola:Echvi_2055  95 YSLPSREVIADSIESFILGHSFDGVVTIPGCDKNMPGVVMGMLRVNRPGIMVFGGTIRSGNYK-GEKLNIVSAF 167
                                          ***************************************************************.9********* PP

                            TIGR00110 150 eavgeyaagklseeeleeiersacPtagsCsGlftansmacltealGlslPgsstllatsaekkelakksgkri 223
                                          ea+g+  +g++s+e+   + ++acP+ag+C+G++tan+m+++ ea+GlslP ss+ +ats+ek+e +k++gk+i
  lcl|FitnessBrowser__Cola:Echvi_2055 168 EAYGKKINGQISDEDYMGVIKNACPGAGACGGMYTANTMSSAIEAMGLSLPFSSSYPATSKEKREECKNIGKYI 241
                                          ************************************************************************** PP

                            TIGR00110 224 velvkknikPrdiltkeafenaitldlalGGstntvLhllaiakeagvklslddfdrlsrkvPllaklkPsgkk 297
                                          ++l+  +ikP+di+tk+++ena+ + +alGGstn+ Lh+laia++ag++++l+df+r++ ++P+l+++kPsgk 
  lcl|FitnessBrowser__Cola:Echvi_2055 242 KQLLALDIKPKDIITKKSLENAVRVTVALGGSTNAALHILAIARTAGIDFTLEDFKRINAETPVLGDFKPSGKF 315
                                          ************************************************************************** PP

                            TIGR00110 298 viedlhraGGvsavlkeldkegllhkdaltvtGktlaetlekvk.vlrvdqdvirsldnpvkkegglavLkGnl 370
                                          ++edl + GG++a lk+  +egllh d+ltvtGkt+ae+le+++ v++ +++vi++ldnp+k  g+l vL+Gnl
  lcl|FitnessBrowser__Cola:Echvi_2055 316 MMEDLYEMGGLPAFLKYFLNEGLLHGDCLTVTGKTMAENLEDIDpVKPSKESVIHPLDNPIKPSGHLCVLHGNL 389
                                          ******************************************9735667888********************** PP

                            TIGR00110 371 aeeGavvkiagveedilkfeGpakvfeseeealeailggkvkeGdvvviryeGPkGgPGmremLaPtsalvglG 444
                                          a+eGav+ki+g+e     f+G+akvf++e  a +a+ ++++++Gdvvviry GPkGgPGm+emL+Pts ++g+G
  lcl|FitnessBrowser__Cola:Echvi_2055 390 APEGAVAKISGKEG--KSFTGTAKVFDDEPSANAAMKNKEIQKGDVVVIRYVGPKGGPGMPEMLKPTSIIIGAG 461
                                          **************..99******************************************************** PP

                            TIGR00110 445 LgkkvaLitDGrfsGgtrGlsiGhvsPeaaegGaialvedGDkikiDienrkldlevseeelaerrakakkkea 518
                                          Lg++vaLitDGrfsGgt+G+++Ghv+Pea+ gG+i+l++dGD i+iD+e+ ++ ++vse+e+aer++++k+k+ 
  lcl|FitnessBrowser__Cola:Echvi_2055 462 LGSDVALITDGRFSGGTHGFVVGHVTPEAYLGGPIGLLKDGDVITIDAESLEIRVDVSEAEFAERKKNWKNKDL 535
                                          ************************************************************************** PP

                            TIGR00110 519 revkgaLakyaklvssadkGavld 542
                                           + +g+L+ky +lvs+a++G+v+d
  lcl|FitnessBrowser__Cola:Echvi_2055 536 SHLQGTLKKYVQLVSTASEGCVTD 559
                                          **********************98 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (543 nodes)
Target sequences:                          1  (561 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.02u 0.01s 00:00:00.03 Elapsed: 00:00:00.03
# Mc/sec: 9.49
//
[ok]
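
The per-domain row of the hmmsearch output can be read programmatically to see how much of the model and of the candidate the hit spans. This is a minimal sketch that splits the row on whitespace; the field names follow the column headers printed above, and the `parse_domain_row` helper is hypothetical rather than part of HMMER or GapMind.

```python
# Per-domain row copied verbatim from the hmmsearch output above.
domain_row = ("   1 !  729.7   3.6    1e-223    1e-223       2     542 .."
              "      21     559 ..      20     560 .. 0.99")

def parse_domain_row(row):
    """Split one per-domain row into named fields (hypothetical helper)."""
    f = row.split()
    return {
        "score": float(f[2]),
        "bias": float(f[3]),
        "c_evalue": float(f[4]),
        "i_evalue": float(f[5]),
        "hmm_from": int(f[6]),
        "hmm_to": int(f[7]),
        "ali_from": int(f[9]),
        "ali_to": int(f[10]),
        "env_from": int(f[12]),
        "env_to": int(f[13]),
        "acc": float(f[15]),
    }

hit = parse_domain_row(domain_row)
model_len = 543   # from "Query: TIGR00110 [M=543]"
target_len = 561  # from "Target sequences: 1 (561 residues searched)"

# Fraction of the model and of the candidate sequence spanned by the alignment.
model_coverage = (hit["hmm_to"] - hit["hmm_from"] + 1) / model_len
target_coverage = (hit["ali_to"] - hit["ali_from"] + 1) / target_len

print(f"model coverage  {model_coverage:.3f}")
print(f"target coverage {target_coverage:.3f}")
```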

This GapMind analysis is from Apr 09 2024. The underlying query database was built on Apr 09 2024.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMER with enzyme models (usually from TIGRFAMs). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
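
The definition of a gap given above (a step with no high- or medium-confidence candidate) can be sketched as a small filter. This is an illustration only: the `find_gaps` function, the step names, and the input layout are hypothetical and do not reflect GapMind's actual data structures.

```python
def find_gaps(candidates_by_step):
    """Return steps that have no high- or medium-confidence candidate."""
    gaps = []
    for step, confidences in candidates_by_step.items():
        if not any(c in ("high", "medium") for c in confidences):
            gaps.append(step)
    return gaps

# Hypothetical steps, each with the confidence levels of its candidates.
example = {
    "stepA": ["high", "low"],
    "stepB": ["low"],
    "stepC": [],
    "stepD": ["medium"],
}

gaps = find_gaps(example)
print(gaps)
```

A step with only low-confidence candidates and a step with no candidates at all are both reported as gaps.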

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory