GapMind for Amino acid biosynthesis

 

Alignments for a candidate for ilvD in Bradyrhizobium sp. BTAi1

Align Dihydroxy-acid dehydratase; DAD; EC 4.2.1.9 (characterized)
to candidate WP_012044488.1 BBTA_RS20815 dihydroxy-acid dehydratase

Query= SwissProt::P9WKJ5
         (575 letters)



>NCBI__GCF_000015165.1:WP_012044488.1
          Length = 574

 Score =  546 bits (1407), Expect = e-160
 Identities = 291/560 (51%), Positives = 372/560 (66%), Gaps = 8/560 (1%)

Query: 16  DIKPR--SRDVTDGLEKAAARGMLRAVGMDDEDFAKPQIGVASSWNEITPCNLSLDRLAN 73
           DIK R  SR VT+G E+A  R  L A+G+      +P +GVAS WNE  PCN++L R A 
Sbjct: 6   DIKRRLPSRHVTEGPERAPHRSYLYAMGLTTAQIHQPFVGVASCWNEAAPCNIALMRQAQ 65

Query: 74  AVKEGVFSAGGYPLEFGTISVSDGISMGHEGMHFSLVSREVIADSVEVVMQAERLDGSVL 133
           AVK+GV  AGG P EF TI+V+DGI+MGH+GM  SL SRE IADSVE+ ++    D  V 
Sbjct: 66  AVKKGVAHAGGTPREFCTITVTDGIAMGHDGMRSSLPSRETIADSVELTIRGHAYDALVG 125

Query: 134 LAGCDKSLPGMLMAAARLDLAAVFLYAGSILPGRAKLSDGSERDVTIIDAFEAVGACSRG 193
           LAGCDKSLPGM+MA  RL++ ++F+Y GSILPG  +      + VT+ D FEAVG  S G
Sbjct: 126 LAGCDKSLPGMMMAMVRLNVPSIFIYGGSILPGNFR-----GQQVTVQDMFEAVGKHSVG 180

Query: 194 LMSRADVDAIERAICPGEGACGGMYTANTMASAAEALGMSLPGSAAPPATDRRRDGFARR 253
            MS  D+D +ER  CP  GACG  +TANTMA+ +EA+G++LP SA  PA    RD F   
Sbjct: 181 QMSDDDLDELERVACPSAGACGAQFTANTMATVSEAIGLALPYSAGAPAPYEIRDSFCMT 240

Query: 254 SGQAVVELLRRGITARDILTKEAFENAIAVVMAFGGSTNAVLHLLAIAHEANVALSLQDF 313
           +G+ V+EL+   I  RDI+T+ A ENA AVV A GGSTNA LHL AIAHEA +   L D 
Sbjct: 241 AGEKVMELIAANIRPRDIVTRAALENAAAVVAASGGSTNAALHLPAIAHEAGIKFDLFDV 300

Query: 314 SRIGSGVPHLADVKPFGRHVMSDVDHIGGVPVVMKALLDAGLLHGDCLTVTGHTMAENLA 373
           + I    P++AD+KP GR+V  D+   GG+P++MK LLD G L+GDCLTVTG T+AENL 
Sbjct: 301 AEIFKKTPYIADLKPGGRYVAKDMFEAGGIPLLMKTLLDHGYLNGDCLTVTGRTIAENLK 360

Query: 374 AITPPDPDGKVLRALANPIHPSGGITILHGSLAPEGAVVKTAGFDSDVFEGTARVFDGER 433
            +   +P   V+R+   PI  +GG+  L G+LAPEGA+VK AG  +  F G AR FD E 
Sbjct: 361 GV-KWNPHQDVVRSADQPITATGGVVGLKGNLAPEGAIVKVAGMSNLKFSGPARCFDREE 419

Query: 434 AALDALEDGTITVGDAVVIRYEGPKGGPGMREMLAITGAIKGAGLGKDVLLLTDGRFSGG 493
            A +A++  T   GD +VIRYEGP+GGPGMREML  T A+ G G+G  + L+TDGRFSG 
Sbjct: 420 DAFEAVQKRTYREGDVIVIRYEGPRGGPGMREMLQTTAALTGQGMGGKIALITDGRFSGA 479

Query: 494 TTGLCVGHIAPEAVDGGPIALLRNGDRIRLDVAGRVLDVLADPAEFASRQQDFSPPPPRY 553
           T G C+GHI PEA  GGPIAL+ +GD I +D     LDV    AE A+R+  ++P   ++
Sbjct: 480 TRGFCIGHIGPEAAIGGPIALVEDGDIIEIDAVAGRLDVKLSDAELAARKTKWTPRETQH 539

Query: 554 TTGVLSKYVKLVSSAAVGAV 573
           T+G L KY + V  A  GAV
Sbjct: 540 TSGALWKYAQQVGPAVAGAV 559
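
The summary line and the alignment coordinates above are enough to recompute percent identity and the fraction of each sequence covered by the alignment. The following is a minimal Python sketch, not GapMind's own code; the helper names `parse_summary` and `coverage` are hypothetical, and all numbers are copied from this alignment.

```python
import re

# Summary line copied from the BLAST report above.
SUMMARY = "Identities = 291/560 (51%), Positives = 372/560 (66%), Gaps = 8/560 (1%)"

def parse_summary(line):
    """Return {name: (count, alignment_length)} for each 'Name = a/b' field."""
    return {name.lower(): (int(count), int(total))
            for name, count, total in re.findall(r"(\w+) = (\d+)/(\d+)", line)}

def coverage(start, end, length):
    """Fraction of a sequence spanned by an alignment (1-based, inclusive)."""
    return (end - start + 1) / length

stats = parse_summary(SUMMARY)
identity = stats["identities"][0] / stats["identities"][1]
query_cov = coverage(16, 573, 575)    # Query: residues 16..573 of 575
subject_cov = coverage(6, 559, 574)   # Sbjct: residues 6..559 of 574

print(f"identity {identity:.1%}, "
      f"query coverage {query_cov:.1%}, "
      f"candidate coverage {subject_cov:.1%}")
```

This gives about 52.0% identity (the report shows 51%, which is consistent with truncation rather than rounding), with the alignment spanning about 97.0% of the query and 96.5% of the candidate.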


Lambda     K      H
   0.318    0.136    0.393 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 968
Number of extensions: 51
Number of successful extensions: 3
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 575
Length of database: 574
Length adjustment: 36
Effective length of query: 539
Effective length of database: 538
Effective search space:   289982
Effective search space used:   289982
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 53 (25.0 bits)
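
The length and search-space figures above are related by simple arithmetic, and the bit score and E-value can be approximated from the gapped Lambda and K. The Python check below is a rough sketch that assumes the standard Karlin-Altschul formulas (bit score S' = (lambda*S - ln K) / ln 2 and E = m*n*2^-S') apply to this report as printed. The variable names are illustrative, and because the printed parameters are rounded, the results only approximately match.

```python
import math

# Values copied from the BLAST report above.
query_len, db_len, length_adjustment = 575, 574, 36
lam, K = 0.267, 0.0410   # gapped Lambda and K
raw_score = 1407         # raw score shown in parentheses after the bit score

# Effective lengths subtract the length adjustment from each sequence.
eff_query = query_len - length_adjustment
eff_db = db_len - length_adjustment
search_space = eff_query * eff_db

# Bit score from the raw score, then E-value on a log10 scale.
bit_score = (lam * raw_score - math.log(K)) / math.log(2)
log10_evalue = math.log10(search_space) - bit_score * math.log10(2)

print(f"effective lengths: {eff_query} x {eff_db} = {search_space}")
print(f"bit score ~ {bit_score:.1f}, log10(E) ~ {log10_evalue:.1f}")
```

The effective search space comes out to 289982, matching the report, and the bit score and E-value land close to the reported 546 bits and e-160.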

Align candidate WP_012044488.1 BBTA_RS20815 (dihydroxy-acid dehydratase)
to HMM TIGR00110 (ilvD: dihydroxy-acid dehydratase (EC 4.2.1.9))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.aa/TIGR00110.hmm
# target sequence database:        /tmp/gapView.2095.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR00110  [M=543]
Accession:   TIGR00110
Description: ilvD: dihydroxy-acid dehydratase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                                 Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                                 -----------
   1.6e-212  692.8   2.0   1.8e-212  692.6   2.0    1.0  1  lcl|NCBI__GCF_000015165.1:WP_012044488.1  BBTA_RS20815 dihydroxy-acid dehy


Domain annotation for each sequence (and alignments):
>> lcl|NCBI__GCF_000015165.1:WP_012044488.1  BBTA_RS20815 dihydroxy-acid dehydratase
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  692.6   2.0  1.8e-212  1.8e-212       1     541 [.      24     560 ..      24     562 .. 0.99

  Alignments for each domain:
  == domain 1  score: 692.6 bits;  conditional E-value: 1.8e-212
                                 TIGR00110   1 aarallkatGlkdedlekPiiavvnsyteivPghvhlkdlaklvkeeieaaGgvakefntiavsDGiam 69 
                                               ++r+ l+a+Gl+ +++++P+++v+++++e +P+++ l   a++vk+++ +aGg++ ef ti+v+DGiam
  lcl|NCBI__GCF_000015165.1:WP_012044488.1  24 PHRSYLYAMGLTTAQIHQPFVGVASCWNEAAPCNIALMRQAQAVKKGVAHAGGTPREFCTITVTDGIAM 92 
                                               699****************************************************************** PP

                                 TIGR00110  70 gheGmkysLpsreiiaDsvetvvkahalDalvvissCDkivPGmlmaalrlniPaivvsGGpmeagktk 138
                                               gh+Gm+ sLpsre iaDsve  +++ha+Dalv ++ CDk +PGm+ma++rln+P+i+++GG++ +g+++
  lcl|NCBI__GCF_000015165.1:WP_012044488.1  93 GHDGMRSSLPSRETIADSVELTIRGHAYDALVGLAGCDKSLPGMMMAMVRLNVPSIFIYGGSILPGNFR 161
                                               ********************************************************************* PP

                                 TIGR00110 139 lsekidlvdvfeavgeyaagklseeeleeiersacPtagsCsGlftansmacltealGlslPgsstlla 207
                                                ++++++ d+feavg+ + g++s+++l+e+er+acP+ag+C+  ftan+ma+++ea+Gl+lP+s+ ++a
  lcl|NCBI__GCF_000015165.1:WP_012044488.1 162 -GQQVTVQDMFEAVGKHSVGQMSDDDLDELERVACPSAGACGAQFTANTMATVSEAIGLALPYSAGAPA 229
                                               .9******************************************************************* PP

                                 TIGR00110 208 tsaekkelakksgkrivelvkknikPrdiltkeafenaitldlalGGstntvLhllaiakeagvklsld 276
                                                 + + +++ ++g++++el+  ni+Prdi+t++a+ena +++ a GGstn+ Lhl+aia+eag+k++l 
  lcl|NCBI__GCF_000015165.1:WP_012044488.1 230 PYEIRDSFCMTAGEKVMELIAANIRPRDIVTRAALENAAAVVAASGGSTNAALHLPAIAHEAGIKFDLF 298
                                               ********************************************************************* PP

                                 TIGR00110 277 dfdrlsrkvPllaklkPsgkkviedlhraGGvsavlkeldkegllhkdaltvtGktlaetlekvkvlrv 345
                                               d+ ++ +k+P +a+lkP+g++v +d+ +aGG++ ++k+l  +g+l+ d+ltvtG+t+ae+l+ vk +  
  lcl|NCBI__GCF_000015165.1:WP_012044488.1 299 DVAEIFKKTPYIADLKPGGRYVAKDMFEAGGIPLLMKTLLDHGYLNGDCLTVTGRTIAENLKGVKWN-P 366
                                               ******************************************************************9.8 PP

                                 TIGR00110 346 dqdvirsldnpvkkegglavLkGnlaeeGavvkiagveedilkfeGpakvfeseeealeailggkvkeG 414
                                               +qdv+rs d+p++++gg+  LkGnla+eGa+vk+ag+++  lkf Gpa+ f+ ee+a ea+ ++  +eG
  lcl|NCBI__GCF_000015165.1:WP_012044488.1 367 HQDVVRSADQPITATGGVVGLKGNLAPEGAIVKVAGMSN--LKFSGPARCFDREEDAFEAVQKRTYREG 433
                                               9**************************************..**************************** PP

                                 TIGR00110 415 dvvviryeGPkGgPGmremLaPtsalvglGLgkkvaLitDGrfsGgtrGlsiGhvsPeaaegGaialve 483
                                               dv+viryeGP+GgPGmremL+ t+al+g G+g k+aLitDGrfsG+trG++iGh+ Peaa gG+ialve
  lcl|NCBI__GCF_000015165.1:WP_012044488.1 434 DVIVIRYEGPRGGPGMREMLQTTAALTGQGMGGKIALITDGRFSGATRGFCIGHIGPEAAIGGPIALVE 502
                                               ********************************************************************* PP

                                 TIGR00110 484 dGDkikiDienrkldlevseeelaerrakakkkearevkgaLakyaklvssadkGavl 541
                                               dGD+i+iD+ + +ld+++s++ela+r+ k++++e+++++gaL kya+ v  a  Gav+
  lcl|NCBI__GCF_000015165.1:WP_012044488.1 503 DGDIIEIDAVAGRLDVKLSDAELAARKTKWTPRETQHTSGALWKYAQQVGPAVAGAVT 560
                                               ********************************************************97 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (543 nodes)
Target sequences:                          1  (574 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.02u 0.01s 00:00:00.03 Elapsed: 00:00:00.03
# Mc/sec: 8.81
//
[ok]
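
The domain table above reports where the alignment starts and ends on both the model and the candidate, so coverage of the HMM can be computed directly. The sketch below is a hedged illustration, not part of GapMind or HMMER; it assumes the whitespace-separated column order shown in the domain table header, and the variable names are hypothetical.

```python
# Domain row copied from the hmmsearch output above.
DOMAIN_ROW = ("   1 !  692.6   2.0  1.8e-212  1.8e-212       1     541 [.  "
              "    24     560 ..      24     562 .. 0.99")
MODEL_LENGTH = 543      # from "Query: TIGR00110 [M=543]"
SEQUENCE_LENGTH = 574   # from "574 residues searched"

fields = DOMAIN_ROW.split()
# Assumed column order: #, flag, score, bias, c-Evalue, i-Evalue,
# hmmfrom, hmmto, bounds, alifrom, alito, bounds, envfrom, envto, bounds, acc
score = float(fields[2])
hmm_from, hmm_to = int(fields[6]), int(fields[7])
ali_from, ali_to = int(fields[9]), int(fields[10])

# Coordinates are 1-based and inclusive.
hmm_coverage = (hmm_to - hmm_from + 1) / MODEL_LENGTH
seq_coverage = (ali_to - ali_from + 1) / SEQUENCE_LENGTH

print(f"score {score}, model coverage {hmm_coverage:.1%}, "
      f"candidate coverage {seq_coverage:.1%}")
```

Here the alignment covers 541 of the 543 model positions (about 99.6%) and 537 of the candidate's 574 residues (about 93.6%).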

This GapMind analysis is from Apr 10 2024. The underlying query database was built on Apr 09 2024.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMER with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory