GapMind for Amino acid biosynthesis

 

Alignments for a candidate for ilvD in Pseudomonas fluorescens GW456-L13

Align dihydroxy-acid dehydratase (EC 4.2.1.9) (characterized)
to candidate PfGW456L13_3725 Dihydroxy-acid dehydratase (EC 4.2.1.9)

Query= BRENDA::A0A481UJA7
         (614 letters)



>FitnessBrowser__pseudo13_GW456_L13:PfGW456L13_3725
          Length = 560

 Score =  588 bits (1516), Expect = e-172
 Identities = 308/556 (55%), Positives = 391/556 (70%), Gaps = 4/556 (0%)

Query: 59  LNKYSSRITEPKSQGGSQAILHGVGLSDEDLNKPQIGISSVWYEGNTCNMHLLRLSEAVK 118
           L KYSS++ +      ++A+L  VG +D D  KPQIGI+S W     CNMH+ +L+   +
Sbjct: 7   LRKYSSQVIDGVEAAPARAMLRAVGFTDADFTKPQIGIASTWAMVTPCNMHIDKLALEAE 66

Query: 119 EGVKEAGMVGFRFNTIGVSDAISMGTRGMCYSLQSRDLIADSIETVMSAQWYDGNISIPG 178
           +G   AG  G  FNTI +SD I+ GT GM YSL SR++IADSIE V   + +DG ++I G
Sbjct: 67  KGANAAGAKGVIFNTITISDGIANGTEGMKYSLVSREVIADSIEVVTGCEGFDGLVTIGG 126

Query: 179 CDKNMPGTIMAMGRLNRPSIMVYGGTIKPGHYNGHSYDIISAFQAYGEYVNGSISDEDRK 238
           CDKNMPG ++ M RLNRPSI VYGGTI+PG   GH+ DIIS F+A G++  G IS+   K
Sbjct: 127 CDKNMPGCLIGMARLNRPSIFVYGGTIQPGA--GHT-DIISVFEAVGQHARGDISEIQVK 183

Query: 239 NVVHNSCPGAGACGGMYTANTMASAIEAMGMCLPYSSSIPAENPLKLDECRLAGKYLLEL 298
            +   + PG G+CGGMYTANTMASAIEA+GM LP SSS  A    K  +   AG+ ++EL
Sbjct: 184 QIEEVAIPGPGSCGGMYTANTMASAIEALGMSLPGSSSQDAIGADKASDSFRAGQQVMEL 243

Query: 299 LKMDLKPQNIITPQSLRNAMVVVMALGGSTNAVLHLIPIARSVGLELTLEDFQKVSDEVP 358
           LK+DLKP++I+T ++  NA+ VV+AL GSTNAVLHL+ +A +V +ELTL+DF ++    P
Sbjct: 244 LKLDLKPRDIMTRKAFENAIRVVIALAGSTNAVLHLLAMANAVDVELTLDDFVELGKVSP 303

Query: 359 FLADLKPSGKYVMEDVHKIGGTPAVLRYLLEHGFLDGDCLTVTGKTLAENVQNCPPLSEG 418
            +ADL+PSGKY+M ++  IGG   +++ +L+ G L GD LTVTG+TLAEN+ + P    G
Sbjct: 304 VVADLRPSGKYMMSELVAIGGIQPLMKRMLDAGMLHGDVLTVTGQTLAENLASVPDYPAG 363

Query: 419 QDIIRPLENPIKKTGHIQILQGNLAPEGSVAKITGKEGLYFSGPALVFEGEEAMLAAISE 478
           QD+IRP + PIKK  H+ IL+GNL+P G+VAKITGKEGL F G A V+ GEE  LA I  
Sbjct: 364 QDVIRPFDQPIKKDSHLVILRGNLSPTGAVAKITGKEGLRFEGTARVYHGEEGALAGILN 423

Query: 479 NPMNFKGKVVVIRGEGPKGGPGMPEMLTPTSAIMGAGLGKDCALLTDGRFSGGSHGFVVG 538
             +   G+V+VIR EGPKGGPGM EML+PTSA+MG GLGK+ AL+TDGRFSGGSHGFVVG
Sbjct: 424 GEVQ-PGEVIVIRYEGPKGGPGMREMLSPTSAVMGKGLGKEVALITDGRFSGGSHGFVVG 482

Query: 539 HICPEAQEGGPIGLVRNGDIIRIDVRERRIDVDVTDQEMEERRKNWTPPPYKATCGVLYK 598
           HI PEA EGGPI LV NGD I ID   R I VDV+D  + ER+  W  P  K   GVL K
Sbjct: 483 HITPEAFEGGPIALVENGDRIIIDAETRLITVDVSDAVLAERKSRWVRPESKYKRGVLAK 542

Query: 599 YIKNVQSASRGCVTDE 614
           Y K V SAS G VTD+
Sbjct: 543 YAKTVSSASEGAVTDK 558


Lambda     K      H
   0.316    0.135    0.401 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 1003
Number of extensions: 43
Number of successful extensions: 3
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 614
Length of database: 560
Length adjustment: 37
Effective length of query: 577
Effective length of database: 523
Effective search space:   301771
Effective search space used:   301771
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.6 bits)
S2: 53 (25.0 bits)
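
The statistics reported above are internally consistent, which a short calculation can confirm. The sketch below is not part of the BLAST or GapMind output. It assumes the standard Karlin-Altschul relations (bit score = (lambda * S - ln K) / ln 2, and E = effective search space * 2^(-bit score)) and uses the gapped Lambda and K values printed in the report. All function and constant names are illustrative.

```python
import math

# Values copied from the BLAST report above.
RAW_SCORE = 1516          # raw score shown in parentheses on the Score line
GAPPED_LAMBDA = 0.267     # "Gapped" Lambda
GAPPED_K = 0.0410         # "Gapped" K
QUERY_LEN = 614
DB_LEN = 560
LENGTH_ADJUSTMENT = 37


def effective_search_space(query_len, db_len, adjustment):
    """Product of the two lengths after subtracting the length adjustment."""
    return (query_len - adjustment) * (db_len - adjustment)


def bit_score(raw_score, lam, k):
    """Convert a raw score to bits (assumed Karlin-Altschul normalization)."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def log10_expect(search_space, bits):
    """log10 of the E-value; logs avoid any risk of floating-point underflow."""
    return math.log10(search_space) - bits * math.log10(2)


space = effective_search_space(QUERY_LEN, DB_LEN, LENGTH_ADJUSTMENT)
bits = bit_score(RAW_SCORE, GAPPED_LAMBDA, GAPPED_K)
log_e = log10_expect(space, bits)

print(space)            # 301771, matching "Effective search space"
print(round(bits, 1))   # compare with the reported 588 bits
print(round(log_e, 1))  # compare with the reported Expect of e-172
```

Small differences between the recomputed and reported values are expected, because the report prints Lambda and K to only three significant figures.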

Align candidate PfGW456L13_3725 (Dihydroxy-acid dehydratase (EC 4.2.1.9))
to HMM TIGR00110 (ilvD: dihydroxy-acid dehydratase (EC 4.2.1.9))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.aa/TIGR00110.hmm
# target sequence database:        /tmp/gapView.15442.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR00110  [M=543]
Accession:   TIGR00110
Description: ilvD: dihydroxy-acid dehydratase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                                               Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                                               -----------
   7.8e-226  736.7   8.1     9e-226  736.5   8.1    1.0  1  lcl|FitnessBrowser__pseudo13_GW456_L13:PfGW456L13_3725  Dihydroxy-acid dehydratase (EC 4


Domain annotation for each sequence (and alignments):
>> lcl|FitnessBrowser__pseudo13_GW456_L13:PfGW456L13_3725  Dihydroxy-acid dehydratase (EC 4.2.1.9)
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  736.5   8.1    9e-226    9e-226       1     542 [.      22     557 ..      22     558 .. 0.99

  Alignments for each domain:
  == domain 1  score: 736.5 bits;  conditional E-value: 9e-226
                                               TIGR00110   1 aarallkatGlkdedlekPiiavvnsyteivPghvhlkdlaklvkeeieaaGgva 55 
                                                             +ara+l+a+G++d+d++kP+i++++++  ++P+++h+++la  ++++ +aaG++ 
  lcl|FitnessBrowser__pseudo13_GW456_L13:PfGW456L13_3725  22 PARAMLRAVGFTDADFTKPQIGIASTWAMVTPCNMHIDKLALEAEKGANAAGAKG 76 
                                                             579**************************************************** PP

                                               TIGR00110  56 kefntiavsDGiamgheGmkysLpsreiiaDsvetvvkahalDalvvissCDkiv 110
                                                             + fnti++sDGia g+eGmkysL+sre+iaDs+e v   + +D+lv+i+ CDk++
  lcl|FitnessBrowser__pseudo13_GW456_L13:PfGW456L13_3725  77 VIFNTITISDGIANGTEGMKYSLVSREVIADSIEVVTGCEGFDGLVTIGGCDKNM 131
                                                             ******************************************************* PP

                                               TIGR00110 111 PGmlmaalrlniPaivvsGGpmeagktklsekidlvdvfeavgeyaagklseeel 165
                                                             PG l++++rln+P+i+v+GG++++g  +     d+++vfeavg+ a g++se ++
  lcl|FitnessBrowser__pseudo13_GW456_L13:PfGW456L13_3725 132 PGCLIGMARLNRPSIFVYGGTIQPGAGH----TDIISVFEAVGQHARGDISEIQV 182
                                                             *************************887....699******************** PP

                                               TIGR00110 166 eeiersacPtagsCsGlftansmacltealGlslPgsstllatsaekkelakksg 220
                                                             ++ie++a P++gsC+G++tan+ma++ ealG+slPgss+  a+ a+k++   ++g
  lcl|FitnessBrowser__pseudo13_GW456_L13:PfGW456L13_3725 183 KQIEEVAIPGPGSCGGMYTANTMASAIEALGMSLPGSSSQDAIGADKASDSFRAG 237
                                                             ******************************************************* PP

                                               TIGR00110 221 krivelvkknikPrdiltkeafenaitldlalGGstntvLhllaiakeagvklsl 275
                                                             ++++el+k ++kPrdi+t++afenai +++al Gstn+vLhlla+a+ ++v+l+l
  lcl|FitnessBrowser__pseudo13_GW456_L13:PfGW456L13_3725 238 QQVMELLKLDLKPRDIMTRKAFENAIRVVIALAGSTNAVLHLLAMANAVDVELTL 292
                                                             ******************************************************* PP

                                               TIGR00110 276 ddfdrlsrkvPllaklkPsgkkviedlhraGGvsavlkeldkegllhkdaltvtG 330
                                                             ddf +l +  P++a+l+Psgk+++ +l + GG++ ++k +  +g+lh d+ltvtG
  lcl|FitnessBrowser__pseudo13_GW456_L13:PfGW456L13_3725 293 DDFVELGKVSPVVADLRPSGKYMMSELVAIGGIQPLMKRMLDAGMLHGDVLTVTG 347
                                                             ******************************************************* PP

                                               TIGR00110 331 ktlaetlekvkvlrvdqdvirsldnpvkkegglavLkGnlaeeGavvkiagveed 385
                                                             +tlae+l++v+  +++qdvir+ d+p+kk+++l +L+Gnl++ Gav+ki+g+e  
  lcl|FitnessBrowser__pseudo13_GW456_L13:PfGW456L13_3725 348 QTLAENLASVPDYPAGQDVIRPFDQPIKKDSHLVILRGNLSPTGAVAKITGKEG- 401
                                                             ******************************************************. PP

                                               TIGR00110 386 ilkfeGpakvfeseeealeailggkvkeGdvvviryeGPkGgPGmremLaPtsal 440
                                                              l+feG+a+v++ ee al++il+g+v+ G+v+viryeGPkGgPGmremL Ptsa+
  lcl|FitnessBrowser__pseudo13_GW456_L13:PfGW456L13_3725 402 -LRFEGTARVYHGEEGALAGILNGEVQPGEVIVIRYEGPKGGPGMREMLSPTSAV 455
                                                             .****************************************************** PP

                                               TIGR00110 441 vglGLgkkvaLitDGrfsGgtrGlsiGhvsPeaaegGaialvedGDkikiDienr 495
                                                             +g GLgk+vaLitDGrfsGg++G+++Gh++Pea egG+ialve+GD+i+iD+e r
  lcl|FitnessBrowser__pseudo13_GW456_L13:PfGW456L13_3725 456 MGKGLGKEVALITDGRFSGGSHGFVVGHITPEAFEGGPIALVENGDRIIIDAETR 510
                                                             ******************************************************* PP

                                               TIGR00110 496 kldlevseeelaerrakakkkearevkgaLakyaklvssadkGavld 542
                                                              + ++vs++ laer+ +++++e ++++g+Lakyak vssa++Gav+d
  lcl|FitnessBrowser__pseudo13_GW456_L13:PfGW456L13_3725 511 LITVDVSDAVLAERKSRWVRPESKYKRGVLAKYAKTVSSASEGAVTD 557
                                                             *********************************************98 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (543 nodes)
Target sequences:                          1  (560 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.01u 0.01s 00:00:00.02 Elapsed: 00:00:00.02
# Mc/sec: 12.74
//
[ok]
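
To see how much of the TIGR00110 model and of the candidate protein the single domain hit covers, the domain table row can be parsed directly. The following is a minimal sketch, not GapMind's own code: it assumes the whitespace-separated column order shown in the domain table header above, and the function names are hypothetical.

```python
# Domain row and lengths copied from the hmmsearch output above.
DOMAIN_ROW = (
    "1 !  736.5   8.1    9e-226    9e-226       1     542 [.      22     557 .."
    "      22     558 .. 0.99"
)
MODEL_LENGTH = 543    # from "Query: TIGR00110 [M=543]"
TARGET_LENGTH = 560   # from "Target sequences: 1 (560 residues searched)"


def parse_domain_row(row):
    """Split one row of the per-domain table into named fields.

    Assumed column order: index, flag, score, bias, c-Evalue, i-Evalue,
    hmmfrom, hmm to, bounds marker, alifrom, ali to, bounds marker,
    envfrom, env to, bounds marker, acc.
    """
    f = row.split()
    return {
        "score": float(f[2]),
        "i_evalue": float(f[5]),
        "hmm_from": int(f[6]),
        "hmm_to": int(f[7]),
        "ali_from": int(f[9]),
        "ali_to": int(f[10]),
        "acc": float(f[15]),
    }


def span_fraction(start, end, total_length):
    """Fraction of a sequence or model covered by an inclusive 1-based span."""
    return (end - start + 1) / total_length


domain = parse_domain_row(DOMAIN_ROW)
hmm_coverage = span_fraction(domain["hmm_from"], domain["hmm_to"], MODEL_LENGTH)
target_coverage = span_fraction(domain["ali_from"], domain["ali_to"], TARGET_LENGTH)

print(f"model coverage:  {hmm_coverage:.1%}")
print(f"target coverage: {target_coverage:.1%}")
```

Here the alignment spans model positions 1 to 542 of 543 and candidate residues 22 to 557 of 560, so nearly the whole model is matched.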

This GapMind analysis is from Apr 09 2024. The underlying query database was built on Apr 09 2024.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMER with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."
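
As an illustration of the coverage criterion, the identity and coverage of the BLAST alignment shown earlier on this page can be computed from its coordinates. This is a hedged sketch only: the text here does not say which sequence coverage is measured against, so the code assumes the characterized (query) protein and also reports the candidate's coverage. The high- and medium-confidence criteria are not modelled, and all names are hypothetical.

```python
from dataclasses import dataclass

MIN_COVERAGE = 0.5  # from "at least 50% coverage" in the text above


@dataclass
class PairwiseHit:
    query_len: int
    query_from: int
    query_to: int
    subject_len: int
    subject_from: int
    subject_to: int
    identities: int
    alignment_columns: int

    def identity(self):
        """Fraction of alignment columns that are identical."""
        return self.identities / self.alignment_columns

    def query_coverage(self):
        """Fraction of the characterized (query) protein that is aligned."""
        return (self.query_to - self.query_from + 1) / self.query_len

    def subject_coverage(self):
        """Fraction of the candidate (subject) protein that is aligned."""
        return (self.subject_to - self.subject_from + 1) / self.subject_len


def meets_coverage_floor(hit, minimum=MIN_COVERAGE):
    """True if the hit reaches the coverage floor (assumed: query coverage)."""
    return hit.query_coverage() >= minimum


# Coordinates copied from the alignment of PfGW456L13_3725 to A0A481UJA7.
hit = PairwiseHit(
    query_len=614, query_from=59, query_to=614,
    subject_len=560, subject_from=7, subject_to=558,
    identities=308, alignment_columns=556,
)

print(f"identity:         {hit.identity():.1%}")
print(f"query coverage:   {hit.query_coverage():.1%}")
print(f"subject coverage: {hit.subject_coverage():.1%}")
print(meets_coverage_floor(hit))
```

The candidate on this page clears the 50% floor by a wide margin, so the floor matters mainly for weaker, partial hits.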

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory