GapMind for Amino acid biosynthesis

 

Alignments for a candidate for leuC in Dinoroseobacter shibae DFL-12

Align 3-isopropylmalate dehydratase large subunit; EC 4.2.1.33 (characterized)
to candidate 3606654 Dshi_0085 3-isopropylmalate dehydratase, large subunit (RefSeq)

Query= CharProtDB::CH_024771
         (466 letters)



>FitnessBrowser__Dino:3606654
          Length = 467

 Score =  610 bits (1574), Expect = e-179
 Identities = 305/466 (65%), Positives = 374/466 (80%), Gaps = 4/466 (0%)

Query: 2   AKTLYEKLFDAHVVYEAENETPLLYIDRHLVHEVTSPQAFDGLRAHGRPVRQPGKTFATM 61
           A+TLY+K++DAH+ +EAE+ T LLYIDRHLVHEVTSPQAF+GLR  GR VR P KT A  
Sbjct: 3   ARTLYDKIWDAHLAHEAEDGTCLLYIDRHLVHEVTSPQAFEGLRMAGRTVRAPDKTIAVP 62

Query: 62  DHNVSTQT--KDINACGEMARIQMQELIKNCKEFGVELYDLNHPYQGIVHVMGPEQGVTL 119
           DHNV T    ++ +   E +RIQ++ L KN K+FG+  Y ++   QGIVH++GPEQG TL
Sbjct: 63  DHNVPTTLGRENPDQMTEDSRIQVEALDKNAKDFGIHYYPVSDIRQGIVHIVGPEQGWTL 122

Query: 120 PGMTIVCGDSHTATHGAFGALAFGIGTSEVEHVLATQTLKQGRAKTMKIEVQGKAAPGIT 179
           PGMT+VCGDSHTATHGAFGALA GIGTSEVEHVLATQTL Q ++K MK+E+ GK  PG+T
Sbjct: 123 PGMTVVCGDSHTATHGAFGALAHGIGTSEVEHVLATQTLIQKKSKNMKVEITGKLRPGVT 182

Query: 180 AKDIVLAIIGKTGSAGGTGHVVEFCGEAIRDLSMEGRMTLCNMAIEMGAKAGLVAPDETT 239
           AKDI L++IG TG+AGGTG+V+E+CGEAIRDLSMEGRMT+CNMAIE GA+AGL+APDE T
Sbjct: 183 AKDITLSVIGATGTAGGTGYVIEYCGEAIRDLSMEGRMTVCNMAIEGGARAGLIAPDEKT 242

Query: 240 FNYVKGRLHAPKGKDFDDAVAYWKTLQTDEGATFDTVVTLQAEEISPQVTWGTNPGQVIS 299
           + YV+GR HAPKG  ++ A+A+WKTL +D+ A +D VVT++ E+I+P VTWGT+P  V+ 
Sbjct: 243 YEYVQGRPHAPKGAQWEAALAWWKTLYSDDDAHWDKVVTIKGEDIAPVVTWGTSPEDVLP 302

Query: 300 VNDNIPDPASFADPVERASAEKALAYMGLKPGIPLTEVAIDKVFIGSCTNSRIEDLRAAA 359
           ++  +P P  F      A A ++L YMGL PG PL+E+ ID VFIGSCTN RIEDLRAAA
Sbjct: 303 ISAMVPAPEDFTGGKVDA-ARRSLDYMGLTPGTPLSEIEIDTVFIGSCTNGRIEDLRAAA 361

Query: 360 EIAKGRKVAPGVQALVVPGSGPVKAQAEAEGLDKIFIEAGFEWRLPGCSMCLAMNNDRLN 419
           EI KG+K+A   +A+VVPGSG V+AQAE EGL  IF +AGFEWR+ GCSMCLAMN D+L+
Sbjct: 362 EILKGKKIAV-KRAMVVPGSGLVRAQAEEEGLADIFKQAGFEWRMAGCSMCLAMNPDQLS 420

Query: 420 PGERCASTSNRNFEGRQGRGGRTHLVSPAMAAAAAVTGHFADIRNI 465
            GERCASTSNRNFEGRQG  GRTHLVSPAMAAAAAVTG   D+R++
Sbjct: 421 EGERCASTSNRNFEGRQGYKGRTHLVSPAMAAAAAVTGKLTDVRDL 466


Lambda     K      H
   0.317    0.134    0.396 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 767
Number of extensions: 25
Number of successful extensions: 4
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 466
Length of database: 467
Length adjustment: 33
Effective length of query: 433
Effective length of database: 434
Effective search space:   187922
Effective search space used:   187922
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 51 (24.3 bits)
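
The summary statistics above are enough to re-derive the reported bit score and E-value. The following is a minimal sketch, assuming the standard Karlin-Altschul relations (bit score = (lambda * S - ln K) / ln 2, and E = effective search space * 2^(-bit score)) and using the gapped Lambda and K printed in the report. The variable names are illustrative and are not part of BLAST or GapMind.

```python
import math

# Values copied from the BLAST report above.
raw_score = 1574          # raw score shown in parentheses after the bit score
lambda_gapped = 0.267     # gapped Lambda
k_gapped = 0.0410         # gapped K
query_len = 466
db_len = 467
length_adjustment = 33
identities = 305
positives = 374
align_len = 466

# Effective lengths are the raw lengths minus the length adjustment.
eff_query = query_len - length_adjustment
eff_db = db_len - length_adjustment
search_space = eff_query * eff_db

# Karlin-Altschul bit score (assumed formula, see the text above).
bit_score = (lambda_gapped * raw_score - math.log(k_gapped)) / math.log(2)

# Work in log10 space so that very small E-values stay readable.
log10_evalue = math.log10(search_space) - bit_score * math.log10(2)

pct_identity = 100 * identities / align_len
pct_positives = 100 * positives / align_len

print(f"effective search space: {search_space}")
print(f"bit score: {bit_score:.1f}")
print(f"log10(E-value): {log10_evalue:.1f}")
print(f"identity: {pct_identity:.0f}%, positives: {pct_positives:.0f}%")
```

With the rounded Lambda and K from the report, this gives a bit score near 611 and an E-value on the order of 1e-179, in line with the reported 610 bits and e-179. The small difference presumably comes from rounding of the printed parameters.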

Align candidate 3606654 Dshi_0085 (3-isopropylmalate dehydratase, large subunit (RefSeq))
to HMM TIGR00170 (leuC: 3-isopropylmalate dehydratase, large subunit (EC 4.2.1.33))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.aa/TIGR00170.hmm
# target sequence database:        /tmp/gapView.3359.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR00170  [M=466]
Accession:   TIGR00170
Description: leuC: 3-isopropylmalate dehydratase, large subunit
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                         Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                         -----------
   2.3e-241  787.2   0.2   2.7e-241  786.9   0.2    1.0  1  lcl|FitnessBrowser__Dino:3606654  Dshi_0085 3-isopropylmalate dehy


Domain annotation for each sequence (and alignments):
>> lcl|FitnessBrowser__Dino:3606654  Dshi_0085 3-isopropylmalate dehydratase, large subunit (RefSeq)
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  786.9   0.2  2.7e-241  2.7e-241       2     466 .]       3     466 ..       2     466 .. 0.98

  Alignments for each domain:
  == domain 1  score: 786.9 bits;  conditional E-value: 2.7e-241
                         TIGR00170   2 aktlyeklfdahvvkeaenetdllyidrhlvhevtspqafeglraagrkvrrvdktlatldhnistesr..dveike 76 
                                       a+tly+k++dah+ +eae++t llyidrhlvhevtspqafeglr agr vr +dkt+a  dhn++t+    + +++ 
  lcl|FitnessBrowser__Dino:3606654   3 ARTLYDKIWDAHLAHEAEDGTCLLYIDRHLVHEVTSPQAFEGLRMAGRTVRAPDKTIAVPDHNVPTTLGreNPDQMT 79 
                                       89***************************************************************985423667899 PP

                         TIGR00170  77 ekaklqvkeleknvkefgvklfdlssaeqgivhvvgpeegltlpgktivcgdshtathgafgalafgigtsevehvl 153
                                       e++++qv++l+kn+k+fg++++ +s+ +qgivh+vgpe+g tlpg+t+vcgdshtathgafgala gigtsevehvl
  lcl|FitnessBrowser__Dino:3606654  80 EDSRIQVEALDKNAKDFGIHYYPVSDIRQGIVHIVGPEQGWTLPGMTVVCGDSHTATHGAFGALAHGIGTSEVEHVL 156
                                       9**************************************************************************** PP

                         TIGR00170 154 atqtlkqaraktlkievegklakgitakdiilaiigkigvaggtgyvvefageairdlsmeermtvcnmaieagaka 230
                                       atqtl+q+++k++k+e+ gkl +g+takdi l +ig +g+aggtgyv+e++geairdlsme+rmtvcnmaie ga+a
  lcl|FitnessBrowser__Dino:3606654 157 ATQTLIQKKSKNMKVEITGKLRPGVTAKDITLSVIGATGTAGGTGYVIEYCGEAIRDLSMEGRMTVCNMAIEGGARA 233
                                       ***************************************************************************** PP

                         TIGR00170 231 gliapdettfeyvkdrkyapkgkefekavaywktlktdegakfdkvvtleakdispqvtwgtnpgqvlsvneevpdp 307
                                       gliapde t+eyv++r++apkg+++e a+a wktl +d++a++dkvvt++++di+p vtwgt+p++vl+++  vp+p
  lcl|FitnessBrowser__Dino:3606654 234 GLIAPDEKTYEYVQGRPHAPKGAQWEAALAWWKTLYSDDDAHWDKVVTIKGEDIAPVVTWGTSPEDVLPISAMVPAP 310
                                       ***************************************************************************** PP

                         TIGR00170 308 ksladpvekasaekalaylglepgtklkdikvdkvfigsctnsriedlraaaevvkgkkvadnvklalvvpgsglvk 384
                                       ++++   +   a ++l+y+gl+pgt+l +i++d vfigsctn+riedlraaae++kgkk+a  vk+a+vvpgsglv+
  lcl|FitnessBrowser__Dino:3606654 311 EDFTGG-KVDAARRSLDYMGLTPGTPLSEIEIDTVFIGSCTNGRIEDLRAAAEILKGKKIA--VKRAMVVPGSGLVR 384
                                       ***986.456789**********************************************98..99************ PP

                         TIGR00170 385 kqaekegldkifleagfewreagcslclgmnndvldeyercastsnrnfegrqgkgarthlvspamaaaaavagkfv 461
                                       +qae+egl  if +agfewr agcs+cl+mn+d+l+e+ercastsnrnfegrqg ++rthlvspamaaaaav+gk+ 
  lcl|FitnessBrowser__Dino:3606654 385 AQAEEEGLADIFKQAGFEWRMAGCSMCLAMNPDQLSEGERCASTSNRNFEGRQGYKGRTHLVSPAMAAAAAVTGKLT 461
                                       ***************************************************************************** PP

                         TIGR00170 462 direl 466
                                       d+r+l
  lcl|FitnessBrowser__Dino:3606654 462 DVRDL 466
                                       ***85 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (466 nodes)
Target sequences:                          1  (467 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.01u 0.01s 00:00:00.02 Elapsed: 00:00:00.02
# Mc/sec: 8.49
//
[ok]
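
How much of the model and of the candidate the domain alignment covers can be read off the domain table. The following is a rough sketch that parses the single domain row shown above. It assumes the whitespace-separated column order of this HMMER 3 report (score, bias, c-Evalue, i-Evalue, hmmfrom, hmm to, alifrom, ali to, envfrom, env to, acc); the variable names are hypothetical and not part of HMMER or GapMind.

```python
# The domain-table row copied from the hmmsearch output above.
domain_row = (
    "   1 !  786.9   0.2  2.7e-241  2.7e-241       2     466 .]"
    "       3     466 ..       2     466 .. 0.98"
)
model_len = 466    # from "Query: TIGR00170 [M=466]"
target_len = 467   # from "Target sequences: 1 (467 residues searched)"

fields = domain_row.split()
score = float(fields[2])                              # domain bit score
hmm_from, hmm_to = int(fields[6]), int(fields[7])     # span on the model
ali_from, ali_to = int(fields[9]), int(fields[10])    # span on the candidate

# Coverage is the aligned span divided by the full length (inclusive coordinates).
model_coverage = (hmm_to - hmm_from + 1) / model_len
target_coverage = (ali_to - ali_from + 1) / target_len

print(f"score {score} bits")
print(f"model coverage:  {model_coverage:.1%}")
print(f"target coverage: {target_coverage:.1%}")
```

For this hit the alignment spans model positions 2 to 466 and candidate positions 3 to 466, so nearly the whole model and nearly the whole protein are covered.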

This GapMind analysis is from Apr 09 2024. The underlying query database was built on Apr 09 2024.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMER with enzyme models (usually from TIGRFAMs). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
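
The definition of a gap given above (a step with no high- or medium-confidence candidate) can be sketched as a small filter. This is an illustration only, not GapMind's implementation: the function name, the data layout, and the confidence labels assigned to each step in the example are invented for the sketch.

```python
def find_gaps(candidates_by_step):
    """Return the steps that have no high- or medium-confidence candidate."""
    return sorted(
        step
        for step, levels in candidates_by_step.items()
        if not any(level in ("high", "medium") for level in levels)
    )


# Hypothetical input: each step maps to the confidence levels of its candidates.
example = {
    "leuA": ["high"],
    "leuB": ["low"],           # only a low-confidence candidate: counts as a gap
    "leuC": ["high", "low"],
    "leuD": [],                # no candidate at all: counts as a gap
}

gaps = find_gaps(example)
print(gaps)
```

A step with only low-confidence candidates is treated the same as a step with no candidates, which matches the wording of the definition.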

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory