GapMind for Amino acid biosynthesis

 

Alignments for a candidate for leuC in Desulfovibrio vulgaris Miyazaki F

Align isopropylmalate/citramalate isomerase large subunit LeuC (EC 4.2.1.33; EC 4.2.1.35) (characterized)
to candidate 8501056 DvMF_1792 3-isopropylmalate dehydratase large subunit (RefSeq)

Query= reanno::DvH:208495
         (419 letters)



>FitnessBrowser__Miya:8501056
          Length = 418

 Score =  724 bits (1870), Expect = 0.0
 Identities = 359/419 (85%), Positives = 385/419 (91%), Gaps = 1/419 (0%)

Query: 1   MAHTLAQKILQRHTDEAITDAGQIVRCRVSMVLANDITAPLAIKSFRAMGAKRVFDKDRV 60
           MAHTLAQKILQ HTDEAIT AGQIVRCRVS+ LANDITAPLAIKSFRAMGAK+VFD+D+V
Sbjct: 1   MAHTLAQKILQAHTDEAITAAGQIVRCRVSLALANDITAPLAIKSFRAMGAKKVFDRDKV 60

Query: 61  ALVMDHFTPQKDIEAAQQVKLTREFAREMGVTHYYEGGDCGVEHALLPELGLVGPGDVVV 120
           ALVMDHFTPQKDI +AQQVKLTREFAREMGVTHYYEGGDCGVEHALLPELGLVGPGDVVV
Sbjct: 61  ALVMDHFTPQKDIASAQQVKLTREFAREMGVTHYYEGGDCGVEHALLPELGLVGPGDVVV 120

Query: 121 GADSHTCTYGGLGAFATGLGSTDVAGAMALGETWFKVPPTIRATFTGTLPAYVGAKDLIL 180
           GADSHTCTYGGLGAFATG GSTDVAGAMALGETWFKVPPTIRATFTGTLP +VGAKDLIL
Sbjct: 121 GADSHTCTYGGLGAFATGFGSTDVAGAMALGETWFKVPPTIRATFTGTLPKWVGAKDLIL 180

Query: 181 TLIGAIGVDGALYRALEFDGAAIEALDVEGRMTMANMAIEAGGKAGLFAADAKTLTYCTT 240
            LIG IGVDGALYRALEFDGAAIEAL VEGRMT+ANMAIEAGGKAGLFAADAKTL Y   
Sbjct: 181 RLIGEIGVDGALYRALEFDGAAIEALSVEGRMTIANMAIEAGGKAGLFAADAKTLAYTAA 240

Query: 241 AGRTGDTAFSADAGAVYERELSFDVTGMTPVVACPHLPDNVKPVSEVKDVTVQQVVIGSC 300
            GR  D   SAD GA YEREL+FDV+GM P+VACPHLP+NVKPV EV+ VT+ QVVIGSC
Sbjct: 241 RGRK-DAPLSADPGATYERELTFDVSGMEPLVACPHLPENVKPVGEVRGVTLDQVVIGSC 299

Query: 301 TNGRIGDLREAAAVLRGRKVSRDVRCIVLPATPGIWRQALREGLIETFMEAGCIVGPATC 360
           TNGRI D+REAA VL+GRKV++ VRCIVLPATPG+W++AL+EGLIETFME+GCIVGPATC
Sbjct: 300 TNGRISDMREAAEVLKGRKVAKGVRCIVLPATPGVWKEALKEGLIETFMESGCIVGPATC 359

Query: 361 GPCLGGHMGILADGERAIATTNRNFKGRMGSLESEVYLSGPATAAASAVTGVITDPSTL 419
           GPCLGGHMGILADGERAIATTNRNF+GRMGSLESEVYL+ PA AAASAV G+I  P +L
Sbjct: 360 GPCLGGHMGILADGERAIATTNRNFRGRMGSLESEVYLASPAVAAASAVAGIIAHPGSL 418
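
The percentages in the summary line above follow directly from the residue counts. The sketch below recomputes them with two hypothetical helpers, `parse_blast_summary` and `truncated_percent` (neither is part of BLAST or GapMind). It assumes that this output format truncates each percentage to a whole number rather than rounding it, which is consistent with 359/419 (about 85.7%) being reported as 85%.

```python
import re

# Summary line copied from the alignment header above.
SUMMARY = "Identities = 359/419 (85%), Positives = 385/419 (91%), Gaps = 1/419 (0%)"

def parse_blast_summary(line):
    """Return {field: (count, alignment_length, reported_percent)}."""
    pattern = re.compile(r"(\w+) = (\d+)/(\d+) \((\d+)%\)")
    return {name: (int(count), int(total), int(pct))
            for name, count, total, pct in pattern.findall(line)}

def truncated_percent(count, total):
    """Whole-number percentage, truncated (assumed display convention)."""
    return count * 100 // total

fields = parse_blast_summary(SUMMARY)
for name, (count, total, reported) in fields.items():
    print(f"{name}: {count}/{total} = {count / total:.2%} (reported {reported}%)")
```

The alignment length of 419 counts alignment columns, so it includes the single gap column in the 418-residue subject sequence.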


Lambda     K      H
   0.320    0.136    0.401 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 718
Number of extensions: 23
Number of successful extensions: 2
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 419
Length of database: 418
Length adjustment: 32
Effective length of query: 387
Effective length of database: 386
Effective search space:   149382
Effective search space used:   149382
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.8 bits)
S2: 50 (23.9 bits)
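
Several of the statistics above can be checked against one another. The sketch below uses the standard Karlin-Altschul relations, bit score = (lambda * S - ln K) / ln 2 and E = (effective search space) * 2^(-bit score), with the gapped lambda and K printed in the report. The function names are illustrative only, and because the report prints lambda and K rounded to three significant figures, the recomputed bit score is only expected to agree with the reported 724 bits to within about one bit.

```python
import math

def effective_search_space(query_len, db_len, length_adjustment):
    """Product of the effective query and database lengths (single-sequence database)."""
    return (query_len - length_adjustment) * (db_len - length_adjustment)

def bit_score(raw_score, lam, k):
    """Convert a raw alignment score to bits using Karlin-Altschul parameters."""
    return (lam * raw_score - math.log(k)) / math.log(2)

def expect_value(search_space, bits):
    """Expected number of chance hits with at least this bit score."""
    return search_space * 2.0 ** (-bits)

# Values taken from the report: lengths 419 and 418, length adjustment 32,
# raw score 1870, gapped lambda 0.267 and K 0.0410.
space = effective_search_space(419, 418, 32)
bits = bit_score(1870, lam=0.267, k=0.0410)
evalue = expect_value(space, bits)
print(space, round(bits, 1), evalue)
```

The search space reproduces the reported 149382 exactly (387 * 386), and the resulting E-value is vanishingly small, in line with the reported Expect = 0.0.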

Align candidate 8501056 DvMF_1792 (3-isopropylmalate dehydratase large subunit (RefSeq))
to HMM TIGR02083 (leuC: 3-isopropylmalate dehydratase, large subunit (EC 4.2.1.33))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.aa/TIGR02083.hmm
# target sequence database:        /tmp/gapView.25013.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR02083  [M=419]
Accession:   TIGR02083
Description: LEU2: 3-isopropylmalate dehydratase, large subunit
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                         Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                         -----------
   9.8e-193  626.7   0.1   1.1e-192  626.4   0.1    1.0  1  lcl|FitnessBrowser__Miya:8501056  DvMF_1792 3-isopropylmalate dehy


Domain annotation for each sequence (and alignments):
>> lcl|FitnessBrowser__Miya:8501056  DvMF_1792 3-isopropylmalate dehydratase large subunit (RefSeq)
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  626.4   0.1  1.1e-192  1.1e-192       2     417 ..       4     416 ..       3     418 .] 0.99

  Alignments for each domain:
  == domain 1  score: 626.4 bits;  conditional E-value: 1.1e-192
                         TIGR02083   2 tlaekiladkagkeevkpgelilakldlvlgndvttplaikafkelgvkkvfdkdkvalvldhftpnkdikaaeqvk 78 
                                       tla+kil  ++      +g+++  ++ l+l+nd+t+plaik+f+ +g+kkvfd+dkvalv+dhftp+kdi +a+qvk
  lcl|FitnessBrowser__Miya:8501056   4 TLAQKILQAHTDEAITAAGQIVRCRVSLALANDITAPLAIKSFRAMGAKKVFDRDKVALVMDHFTPQKDIASAQQVK 80 
                                       89*************************************************************************** PP

                         TIGR02083  79 lirefakekeiekyfeigelgvehallpekglvvsgdliigadshtctygalgafatgvgstdlavamatgkawfkv 155
                                       l+refa+e ++ +y+e g+ gvehallpe glv +gd+++gadshtctyg lgafatg gstd+a+ama g++wfkv
  lcl|FitnessBrowser__Miya:8501056  81 LTREFAREMGVTHYYEGGDCGVEHALLPELGLVGPGDVVVGADSHTCTYGGLGAFATGFGSTDVAGAMALGETWFKV 157
                                       ***************************************************************************** PP

                         TIGR02083 156 peaikfvlkgklkdyvsakdlilkiigkigvdgalykslefsgeglkelsvddrltianmaieagaktgifevdekt 232
                                       p +i+ ++ g l ++v akdlil++ig+igvdgaly++lef g +++ lsv++r+tianmaieag+k+g+f +d kt
  lcl|FitnessBrowser__Miya:8501056 158 PPTIRATFTGTLPKWVGAKDLILRLIGEIGVDGALYRALEFDGAAIEALSVEGRMTIANMAIEAGGKAGLFAADAKT 234
                                       ***************************************************************************** PP

                         TIGR02083 233 ieyvkgrakrelkiykadedakyervieidlselepqvafphlpentkeideaekeeikidqvvigsctngrledlr 309
                                       ++y+  r++++  + +ad  a yer++ +d+s +ep va phlpen k++ e+    + +dqvvigsctngr++d+r
  lcl|FitnessBrowser__Miya:8501056 235 LAYTAARGRKDAPL-SADPGATYERELTFDVSGMEPLVACPHLPENVKPVGEVR--GVTLDQVVIGSCTNGRISDMR 308
                                       ********999875.67************************************9..9******************** PP

                         TIGR02083 310 laaeilkgkkvakevrliilpasqkvylealkeglleifieagavvstptcgpclgghmgilaegeravsttnrnfv 386
                                        aae+lkg+kvak vr+i+lpa++ v++ealkegl+e+f+e+g++v++ tcgpclgghmgila+gera++ttnrnf 
  lcl|FitnessBrowser__Miya:8501056 309 EAAEVLKGRKVAKGVRCIVLPATPGVWKEALKEGLIETFMESGCIVGPATCGPCLGGHMGILADGERAIATTNRNFR 385
                                       ***************************************************************************** PP

                         TIGR02083 387 grmghpksevylaspavaaasaikgkiaspe 417
                                       grmg  +sevylaspavaaasa++g+ia+p 
  lcl|FitnessBrowser__Miya:8501056 386 GRMGSLESEVYLASPAVAAASAVAGIIAHPG 416
                                       ****************************996 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (419 nodes)
Target sequences:                          1  (418 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.02u 0.01s 00:00:00.03 Elapsed: 00:00:00.02
# Mc/sec: 8.15
//
[ok]
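
To see how much of the TIGR02083 model the hit spans, the domain table row can be parsed and the `hmmfrom`/`hmm to` coordinates compared with the model length (M=419). The following is a minimal sketch that assumes the whitespace-separated column order shown in the domain table above; `parse_domain_row` and `span_coverage` are hypothetical helpers, not part of HMMER.

```python
# Domain row copied from the hmmsearch output above.
DOMAIN_ROW = ("   1 !  626.4   0.1  1.1e-192  1.1e-192       2     417 .."
              "       4     416 ..       3     418 .] 0.99")

def parse_domain_row(row):
    """Extract score, independent E-value, and coordinates from one domain row."""
    f = row.split()
    return {
        "score": float(f[2]),
        "i_evalue": float(f[5]),
        "hmm_from": int(f[6]), "hmm_to": int(f[7]),
        "ali_from": int(f[9]), "ali_to": int(f[10]),
    }

def span_coverage(start, end, total_len):
    """Fraction of a model or sequence covered by an inclusive 1-based span."""
    return (end - start + 1) / total_len

dom = parse_domain_row(DOMAIN_ROW)
model_cov = span_coverage(dom["hmm_from"], dom["hmm_to"], 419)  # model length M=419
seq_cov = span_coverage(dom["ali_from"], dom["ali_to"], 418)    # candidate length 418
print(f"model coverage {model_cov:.1%}, sequence coverage {seq_cov:.1%}")
```

For this hit the aligned region spans 416 of the 419 model positions and 413 of the 418 residues of the candidate.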

This GapMind analysis is from Apr 09 2024. The underlying query database was built on Apr 09 2024.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified either by running ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by running HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."
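
As an illustration of the 50% coverage cutoff, the sketch below computes coverage for the alignment shown on this page. It assumes that coverage means the fraction of the characterized (query) protein spanned by the alignment; the text above does not define it, and the function names are hypothetical.

```python
def query_coverage(query_start, query_end, query_len):
    """Fraction of the characterized protein spanned by the alignment (assumed definition)."""
    return (query_end - query_start + 1) / query_len

def meets_coverage(coverage, minimum=0.5):
    """True if the hit reaches the minimum coverage (50% for low-confidence hits)."""
    return coverage >= minimum

# The alignment above spans query positions 1 to 419 of a 419-residue protein.
cov = query_coverage(1, 419, 419)
print(cov, meets_coverage(cov))
```

This check covers only the coverage threshold; whether a hit is high, medium, or low confidence also depends on the other criteria described in this section.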

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory