GapMind for Amino acid biosynthesis

 

Alignments for a candidate for ilvD in Desulfovibrio vulgaris Hildenborough

Align dihydroxy-acid dehydratase subunit (EC 4.2.1.9) (characterized)
to candidate 208900 DVU3373 dihydroxy-acid dehydratase

Query= metacyc::MONOMER-11919
         (549 letters)



>MicrobesOnline__882:208900
          Length = 554

 Score =  594 bits (1531), Expect = e-174
 Identities = 301/552 (54%), Positives = 402/552 (72%), Gaps = 7/552 (1%)

Query: 1   MKSDTIKRGIQRAPHRSLLARCGLTDDDFEKPFIGIANSYTDIVPGHIHLRELAEAVKEG 60
           M+S  +  G+++APHRSLL   GLT ++  +P +G+ N+  ++VPGHIHL ++AEAVK G
Sbjct: 1   MRSKKMTHGLEKAPHRSLLHALGLTREELARPLVGVVNAANEVVPGHIHLDDIAEAVKAG 60

Query: 61  VNAAGGVAFEFNTMAICDGIAMNHDGMKYSLASREIVADTVESMAMAHALDGLVLLPTCD 120
           V AAGG   EF  +A+CDG+AMNH+GM++SL SRE++AD++E MA AH  D LV +P CD
Sbjct: 61  VRAAGGTPLEFPAIAVCDGLAMNHEGMRFSLPSRELIADSIEIMATAHPFDALVFIPNCD 120

Query: 121 KIVPGMLMAAARLDIPAIVVTGGPMLPGEFKGRKVDLINVYEGVGTVSAGEMSEDELEEL 180
           K VPGMLMA  RLD+P+++V+GGPML G     + DLI V+EGVG V  G+M+E EL+EL
Sbjct: 121 KSVPGMLMAMLRLDVPSVMVSGGPMLAGATLAGRADLITVFEGVGRVQRGDMTEAELDEL 180

Query: 181 ERCACPGPRSCAGLFTANTMACLTEALGMSLPGCATAHAVSSRKRQIARLSGKRIVEMVQ 240
              ACPG  SCAG+FTAN+M CL E +G++LPG  T  AV++ + ++A+ +G +++EM++
Sbjct: 181 VEGACPGCGSCAGMFTANSMNCLAETIGLALPGNGTTPAVTAARIRLAKHAGMKVMEMLE 240

Query: 241 ENLKPTMIMSQEAFENAVMVDLALGGSTNTTLHIPAIAAEIDGLNINLDLFDELSRVIPH 300
            N++P  I++++A  NAV VD+ALG STNT LH+PA+ AE  GL++ LD+FD++SR  P+
Sbjct: 241 RNIRPRDIVTEKAVANAVAVDMALGCSTNTVLHLPAVFAEA-GLDLTLDIFDKVSRKTPN 299

Query: 301 IASISPAGEHMMLDLDRAGGIPAVLKTLE--DHINRECVTCTGRTVQENIE--NVKVGHR 356
           +  +SPAG H + DL  AGGIPAV+  L+    I+R  +T TGRTV EN++    KV   
Sbjct: 300 LCKLSPAGHHHIQDLHAAGGIPAVMAELDRIGLIDRSAMTVTGRTVGENLDALGAKVRDA 359

Query: 357 DVIRPLDSPVHSEGGLAILRGNLAPRGSVVKQGAVAEDMMVHEGPAKVFNSEDECMEAIF 416
           DVIRP+D+P   +GG+AIL+G+LAP G+VVKQ AVA +MMV E  A+VF+SE+   EAI 
Sbjct: 360 DVIRPVDAPYSPQGGIAILKGSLAPGGAVVKQSAVAPEMMVREAVARVFDSEEAACEAIM 419

Query: 417 GGRIDEGDVIVIRYEGPKGGPGMREMLNPTSAIAGMGL-ERVALITDGRFSGGTRGPCVG 475
           GGRI  GD IVIRYEGPKGGPGMREML PTSAIAGMGL   VALITDGRFSGGTRG  +G
Sbjct: 420 GGRIKAGDAIVIRYEGPKGGPGMREMLTPTSAIAGMGLGADVALITDGRFSGGTRGAAIG 479

Query: 476 HVSPEAMEDGPLAAVNDGDIIRIDIPSRKLEVDLSPREIEERLQSAVKPRRSVKG-WLAR 534
           HVSPEA E GP+  V +GD IRIDIP+R L++ +   E+  R  + V   + +    L R
Sbjct: 480 HVSPEAAEGGPIGLVQEGDRIRIDIPARALDLLVDEDELARRRAAFVPVEKEITSPLLRR 539

Query: 535 YRKLAGSADTGA 546
           Y ++  SA TGA
Sbjct: 540 YARMVSSAATGA 551


Lambda     K      H
   0.319    0.136    0.397 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 1008
Number of extensions: 53
Number of successful extensions: 5
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 549
Length of database: 554
Length adjustment: 36
Effective length of query: 513
Effective length of database: 518
Effective search space:   265734
Effective search space used:   265734
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 53 (25.0 bits)
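
The summary figures in the BLAST report above can be cross-checked by hand. The following is a minimal sketch, assuming the usual Karlin-Altschul relations (bit score S' = (lambda * S - ln K) / ln 2 and E = search space * 2^(-S')) with the gapped Lambda and K printed in the report. The function names are illustrative and are not part of BLAST or GapMind. Because Lambda and K are printed with limited precision, agreement with the reported values is only approximate.

```python
import math

# Values copied from the BLAST report above.
RAW_SCORE = 1531
LAMBDA_GAPPED = 0.267
K_GAPPED = 0.0410
QUERY_LEN, SUBJECT_LEN, LENGTH_ADJUSTMENT = 549, 554, 36
IDENTITIES, POSITIVES, GAPS, ALIGNED_COLUMNS = 301, 402, 7, 552


def bit_score(raw_score, lam, k):
    """Convert a raw alignment score to bits (Karlin-Altschul normalization)."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def effective_search_space(query_len, subject_len, adjustment):
    """Product of the two lengths after subtracting the length adjustment."""
    return (query_len - adjustment) * (subject_len - adjustment)


def expect_value(bits, search_space):
    """Expected number of chance hits with at least this bit score."""
    return search_space * 2.0 ** (-bits)


def percent(count, total):
    """Whole-number percentage via integer division.

    Integer division reproduces the 54%, 72% and 1% shown in the report;
    that the report truncates rather than rounds is an inference from
    these numbers, not a documented rule.
    """
    return count * 100 // total


if __name__ == "__main__":
    bits = bit_score(RAW_SCORE, LAMBDA_GAPPED, K_GAPPED)
    space = effective_search_space(QUERY_LEN, SUBJECT_LEN, LENGTH_ADJUSTMENT)
    print(f"bit score ~ {bits:.1f}")
    print(f"effective search space = {space}")
    print(f"E-value ~ {expect_value(bits, space):.1e}")
    print(f"identities = {percent(IDENTITIES, ALIGNED_COLUMNS)}%")
    print(f"positives  = {percent(POSITIVES, ALIGNED_COLUMNS)}%")
    print(f"gaps       = {percent(GAPS, ALIGNED_COLUMNS)}%")
```

The recomputed bit score is close to the reported 594 bits, the effective search space matches the reported 265734 exactly, and the E-value falls in the same order of magnitude as the reported e-174.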

Align candidate 208900 DVU3373 (dihydroxy-acid dehydratase)
to HMM TIGR00110 (ilvD: dihydroxy-acid dehydratase (EC 4.2.1.9))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.aa/TIGR00110.hmm
# target sequence database:        /tmp/gapView.14587.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR00110  [M=543]
Accession:   TIGR00110
Description: ilvD: dihydroxy-acid dehydratase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                       Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                       -----------
   7.5e-238  776.4   7.5   8.5e-238  776.2   7.5    1.0  1  lcl|MicrobesOnline__882:208900  DVU3373 dihydroxy-acid dehydrata


Domain annotation for each sequence (and alignments):
>> lcl|MicrobesOnline__882:208900  DVU3373 dihydroxy-acid dehydratase
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  776.2   7.5  8.5e-238  8.5e-238       1     539 [.      14     551 ..      14     554 .] 0.99

  Alignments for each domain:
  == domain 1  score: 776.2 bits;  conditional E-value: 8.5e-238
                       TIGR00110   1 aarallkatGlkdedlekPiiavvnsyteivPghvhlkdlaklvkeeieaaGgvakefntiavsDGiamgheGmkysLp 79 
                                     ++r+ll+a+Gl+ e+l +P+++vvn+ +e+vPgh+hl+d+a++vk++++aaGg++ ef  iav+DG+am+heGm++sLp
  lcl|MicrobesOnline__882:208900  14 PHRSLLHALGLTREELARPLVGVVNAANEVVPGHIHLDDIAEAVKAGVRAAGGTPLEFPAIAVCDGLAMNHEGMRFSLP 92 
                                     69***************************************************************************** PP

                       TIGR00110  80 sreiiaDsvetvvkahalDalvvissCDkivPGmlmaalrlniPaivvsGGpmeagktklsekidlvdvfeavgeyaag 158
                                     sre+iaDs+e +++ah +Dalv i++CDk vPGmlma+lrl++P+++vsGGpm ag t  + + dl+ vfe+vg+++ g
  lcl|MicrobesOnline__882:208900  93 SRELIADSIEIMATAHPFDALVFIPNCDKSVPGMLMAMLRLDVPSVMVSGGPMLAGATL-AGRADLITVFEGVGRVQRG 170
                                     ********************************************************999.6789*************** PP

                       TIGR00110 159 klseeeleeiersacPtagsCsGlftansmacltealGlslPgsstllatsaekkelakksgkrivelvkknikPrdil 237
                                     +++e+el+e+++ acP++gsC+G+ftansm+cl+e++Gl+lPg++t++a++a + +lak++g++++e++++ni+Prdi+
  lcl|MicrobesOnline__882:208900 171 DMTEAELDELVEGACPGCGSCAGMFTANSMNCLAETIGLALPGNGTTPAVTAARIRLAKHAGMKVMEMLERNIRPRDIV 249
                                     ******************************************************************************* PP

                       TIGR00110 238 tkeafenaitldlalGGstntvLhllaiakeagvklslddfdrlsrkvPllaklkPsgkkviedlhraGGvsavlkeld 316
                                     t++a+ na+++d+alG stntvLhl+a+ +eag++l+ld fd++srk+P l+kl+P+g++ i+dlh+aGG++av+ eld
  lcl|MicrobesOnline__882:208900 250 TEKAVANAVAVDMALGCSTNTVLHLPAVFAEAGLDLTLDIFDKVSRKTPNLCKLSPAGHHHIQDLHAAGGIPAVMAELD 328
                                     ******************************************************************************* PP

                       TIGR00110 317 kegllhkdaltvtGktlaetlekvkvlrvdqdvirsldnpvkkegglavLkGnlaeeGavvkiagveedilkfeGpakv 395
                                     + gl+++ a+tvtG+t++e+l+    +  d dvir++d p++ +gg+a+LkG+la+ Gavvk+++v+ ++++ e  a+v
  lcl|MicrobesOnline__882:208900 329 RIGLIDRSAMTVTGRTVGENLDALGAKVRDADVIRPVDAPYSPQGGIAILKGSLAPGGAVVKQSAVAPEMMVREAVARV 407
                                     ******************************************************************************* PP

                       TIGR00110 396 feseeealeailggkvkeGdvvviryeGPkGgPGmremLaPtsalvglGLgkkvaLitDGrfsGgtrGlsiGhvsPeaa 474
                                     f+see+a eai+gg++k+Gd +viryeGPkGgPGmremL+Ptsa++g+GLg +vaLitDGrfsGgtrG +iGhvsPeaa
  lcl|MicrobesOnline__882:208900 408 FDSEEAACEAIMGGRIKAGDAIVIRYEGPKGGPGMREMLTPTSAIAGMGLGADVALITDGRFSGGTRGAAIGHVSPEAA 486
                                     ******************************************************************************* PP

                       TIGR00110 475 egGaialvedGDkikiDienrkldlevseeelaerrakakkkearevkgaLakyaklvssadkGa 539
                                     egG+i+lv++GD+i+iDi++r+ldl v+e+ela+rra+ ++ e++ ++ +L++ya++vssa +Ga
  lcl|MicrobesOnline__882:208900 487 EGGPIGLVQEGDRIRIDIPARALDLLVDEDELARRRAAFVPVEKEITSPLLRRYARMVSSAATGA 551
                                     *******************************************99*******************9 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (543 nodes)
Target sequences:                          1  (554 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.03u 0.01s 00:00:00.04 Elapsed: 00:00:00.03
# Mc/sec: 8.47
//
[ok]
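
The domain table in the hmmsearch output above shows how much of the model and of the candidate protein the hit spans. The following is a minimal sketch that parses that one row and computes both coverages. It assumes the whitespace-separated column order shown in the table header (score, bias, c-Evalue, i-Evalue, hmmfrom, hmm to, alifrom, ali to, ...). The class and function names are hypothetical, and this is not how GapMind itself processes HMMER output.

```python
from dataclasses import dataclass

# The single domain row from the hmmsearch report above.
DOMAIN_ROW = (
    "   1 !  776.2   7.5  8.5e-238  8.5e-238       1     539 [.      14"
    "     551 ..      14     554 .] 0.99"
)
MODEL_LENGTH = 543     # "[M=543]" in the Query line
SEQUENCE_LENGTH = 554  # residues searched


@dataclass
class DomainHit:
    score: float
    i_evalue: float
    hmm_from: int
    hmm_to: int
    ali_from: int
    ali_to: int


def parse_domain_row(row: str) -> DomainHit:
    """Split one domain-table row on whitespace and pick out fields by position."""
    f = row.split()
    return DomainHit(
        score=float(f[2]),
        i_evalue=float(f[5]),
        hmm_from=int(f[6]),
        hmm_to=int(f[7]),
        ali_from=int(f[9]),
        ali_to=int(f[10]),
    )


def span_fraction(start: int, end: int, total: int) -> float:
    """Fraction of `total` positions covered by the inclusive range start..end."""
    return (end - start + 1) / total


if __name__ == "__main__":
    hit = parse_domain_row(DOMAIN_ROW)
    model_cov = span_fraction(hit.hmm_from, hit.hmm_to, MODEL_LENGTH)
    seq_cov = span_fraction(hit.ali_from, hit.ali_to, SEQUENCE_LENGTH)
    print(f"score {hit.score} bits, i-Evalue {hit.i_evalue:g}")
    print(f"model coverage    = {model_cov:.1%}")
    print(f"sequence coverage = {seq_cov:.1%}")
```

For this hit the alignment spans model positions 1 to 539 of 543 and sequence positions 14 to 551 of 554, so nearly the whole model and nearly the whole protein are aligned.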

This GapMind analysis is from Apr 09 2024. The underlying query database was built on Apr 09 2024.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMER with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
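
To illustrate what a six-frame translation is, here is a minimal sketch that translates a DNA string in the three forward and three reverse-complement reading frames using the standard genetic code. It is not GapMind or Curated BLAST code. The function names are hypothetical, alternative start codons are not handled, and codons containing anything other than A, C, G or T are translated as "X".

```python
# Standard genetic code, listed in the conventional T, C, A, G base order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an upper- or lower-case DNA string."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def translate(dna: str) -> str:
    """Translate complete codons from position 0; a trailing partial codon is dropped."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3))


def six_frame_translation(dna: str) -> dict:
    """Map frame labels (+1, +2, +3, -1, -2, -3) to their translations."""
    dna = dna.upper()
    rc = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


if __name__ == "__main__":
    for label, protein in six_frame_translation("ATGAAATAG").items():
        print(label, protein)
```

Searching all six frames can reveal a coding region that gene prediction missed, which is why a protein absent from the predicted proteins may still be found this way.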

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory