GapMind for Amino acid biosynthesis

 

Alignments for a candidate for ilvI in Desulfobulbus mediterraneus DSM 13871

Align Acetolactate synthase isozyme 2 large subunit; AHAS-II; ALS-II; Acetohydroxy-acid synthase II large subunit; EC 2.2.1.6 (characterized)
to candidate WP_028583815.1 G494_RS0106080 acetolactate synthase, large subunit, biosynthetic type

Query= SwissProt::P0DP90
         (548 letters)



>NCBI__GCF_000429965.1:WP_028583815.1
          Length = 560

 Score =  484 bits (1246), Expect = e-141
 Identities = 257/554 (46%), Positives = 351/554 (63%), Gaps = 14/554 (2%)

Query: 1   MNGAQWVVHALRAQGVNTVFGYPGGAIMPVYDALYDGGV-EHLLCRHEQGAAMAAIGYAR 59
           + GA   +  L   G+ T+ G PGGA +P+Y+AL    V  H+L RHEQGA   A G AR
Sbjct: 4   LTGATLTIRLLEHYGITTIAGIPGGANLPLYEALSQSSVIRHVLARHEQGAGFMAQGMAR 63

Query: 60  ATGKTGVCIATSGPGATNLITGLADALLDSIPVVAITGQVSAPFIGTDAFQEVDVLGLSL 119
           ++G+  VC ATSGPGATN++T +ADA LDSIPV+ ITGQV    IGTDAFQEVD+ GL++
Sbjct: 64  SSGRPAVCFATSGPGATNILTAIADAYLDSIPVICITGQVPQDLIGTDAFQEVDIYGLTI 123

Query: 120 ACTKHSFLVQSLEELPRIMAEAFDVACSGRPGPVLVDIPKDIQLASGDLE--PWFTTVEN 177
             TKH++LV+S EEL +++ EAF +A SGRPGPV++DIPKD+QL +  +   P    +  
Sbjct: 124 PITKHNYLVRSGEELLKVIPEAFAIASSGRPGPVVIDIPKDVQLETVQVAGLPEIPALAA 183

Query: 178 EVTFPHAEVEQARQMLAKAQKPMLYVGGGVGMAQAVPALREFLAATKMPATCTLKGLGAV 237
            V  P   + QA  ++ +A+ P+LY+GGGV  + A    R     +++P T TL GLG +
Sbjct: 184 PVQVPAESLHQAASLINQAKAPVLYLGGGVVHSGAGLLARRLAERSQLPTTMTLMGLGII 243

Query: 238 EADYPYYLGMLGMHGTKAANFAVQECDLLIAVGARFDDRVTGKLNTFAPHASVIHMDIDP 297
             ++  Y+GMLGMH  ++ N  + ECDLLIA G RFDDR TGK+  F P A +IH+DIDP
Sbjct: 244 PPEHELYMGMLGMHAARSTNIMLDECDLLIAAGVRFDDRATGKIAEFCPDARIIHLDIDP 303

Query: 298 AEMNKLRQAHVALQGD----LNALLPAL---QQPLNQYDWQQHCAQLRDEHSWRYDHPGD 350
           +E++KL+QAH+ L GD    L  LLP +   ++PL    WQQ   QLR  H +      D
Sbjct: 304 SEISKLKQAHLGLVGDIAQTLERLLPLVSRRERPL----WQQRVRQLRAAHPFFCAEELD 359

Query: 351 AIYAPLLLKQLSDRKPADCVVTTDVGQHQMWAAQHIAHTRPENFITSSGLGTMGFGLPAA 410
                 ++ ++++      +V TDVG+HQMW AQ     RP  F+TS GLGTMGFGLP A
Sbjct: 360 FSRPYGIILRIAEIIGDQALVATDVGKHQMWTAQIYPLQRPRQFLTSGGLGTMGFGLPTA 419

Query: 411 VGAQVARPNDTVVCISGDGSFMMNVQELGTVKRKQLPLKIVLLDNQRLGMVRQWQQLFFQ 470
           +GA +  P+  VVCISGDGS MMN+QEL T   + L LK+V+L+NQ LG+VRQ Q+LF+ 
Sbjct: 420 IGAALQHPDKPVVCISGDGSIMMNIQELATAVEQDLNLKVVVLNNQSLGLVRQQQKLFYG 479

Query: 471 ERYSETTLTDNPDFLMLASAFGIHGQHITRKDQVEAALDTMLNSDGPYLLHVSIDELENV 530
            R   +    NPD   +A  FG+      +    +  L   L + GP L+++ ID  E V
Sbjct: 480 GRIFASEYRHNPDLAAIARGFGMKAFDCGQTLLFDEVLAVALQTPGPCLINIPIDIDEEV 539

Query: 531 WPLVPPGASNSEML 544
           +P+VPPGA+NS M+
Sbjct: 540 YPMVPPGAANSVMI 553


Lambda     K      H
   0.320    0.135    0.410 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 822
Number of extensions: 33
Number of successful extensions: 5
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 548
Length of database: 560
Length adjustment: 36
Effective length of query: 512
Effective length of database: 524
Effective search space:   268288
Effective search space used:   268288
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.8 bits)
S2: 53 (25.0 bits)
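
The statistics above are mutually consistent, which can be checked with a short calculation. This sketch assumes the standard Karlin-Altschul relations (bit score = (lambda * S - ln K) / ln 2, and E = effective search space * 2^(-bit score)) with the gapped Lambda and K; because the report prints those parameters to only three significant figures, the agreement is approximate. All input values are copied from the report, and the variable names are illustrative.

```python
import math

# Values copied from the BLAST report above.
query_len, db_len = 548, 560
length_adjustment = 36
raw_score = 1246                       # raw score printed in parentheses
gapped_lambda, gapped_k = 0.267, 0.0410

# Effective lengths: each length minus the length adjustment.
eff_query = query_len - length_adjustment
eff_db = db_len - length_adjustment
search_space = eff_query * eff_db

# Karlin-Altschul conversion from raw score to bit score.
bit_score = (gapped_lambda * raw_score - math.log(gapped_k)) / math.log(2)

# Expected number of chance hits with at least this score.
e_value = search_space * 2.0 ** (-bit_score)

print(f"effective lengths: {eff_query} x {eff_db} = {search_space}")
print(f"bit score ~ {bit_score:.1f}")
print(f"E-value ~ {e_value:.1e}")
```

The effective lengths and search space match the report exactly (512, 524 and 268288). The computed bit score lands close to the reported 484 bits and the E-value falls in the reported e-141 range. Small differences are expected from the rounding of Lambda and K.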

Align candidate WP_028583815.1 G494_RS0106080 (acetolactate synthase, large subunit, biosynthetic type)
to HMM TIGR00118 (ilvB: acetolactate synthase, large subunit, biosynthetic type (EC 2.2.1.6))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.aa/TIGR00118.hmm
# target sequence database:        /tmp/gapView.12994.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR00118  [M=557]
Accession:   TIGR00118
Description: acolac_lg: acetolactate synthase, large subunit, biosynthetic type
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                                 Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                                 -----------
   3.2e-204  665.4   0.0   3.6e-204  665.2   0.0    1.0  1  lcl|NCBI__GCF_000429965.1:WP_028583815.1  G494_RS0106080 acetolactate synt


Domain annotation for each sequence (and alignments):
>> lcl|NCBI__GCF_000429965.1:WP_028583815.1  G494_RS0106080 acetolactate synthase, large subunit, biosynthetic type
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  665.2   0.0  3.6e-204  3.6e-204       1     556 [.       4     555 ..       4     556 .. 0.98

  Alignments for each domain:
  == domain 1  score: 665.2 bits;  conditional E-value: 3.6e-204
                                 TIGR00118   1 lkgaeilveslkkegvetvfGyPGGavlpiydaly.dselehilvrheqaaahaadGyarasGkvGvvl 68 
                                               l+ga++ ++ l++ g+ t+ G+PGGa lp+y+al  +s ++h+l+rheq+a  +a+G+ar+sG++ v++
  lcl|NCBI__GCF_000429965.1:WP_028583815.1   4 LTGATLTIRLLEHYGITTIAGIPGGANLPLYEALSqSSVIRHVLARHEQGAGFMAQGMARSSGRPAVCF 72 
                                               689*******************************978889***************************** PP

                                 TIGR00118  69 atsGPGatnlvtgiatayldsvPlvvltGqvatsliGsdafqeidilGitlpvtkhsflvkkaedlpei 137
                                               atsGPGatn++t+ia+aylds+P++ +tGqv+++liG+dafqe+di G+t+p+tkh++lv++ e+l ++
  lcl|NCBI__GCF_000429965.1:WP_028583815.1  73 ATSGPGATNILTAIADAYLDSIPVICITGQVPQDLIGTDAFQEVDIYGLTIPITKHNYLVRSGEELLKV 141
                                               ********************************************************************* PP

                                 TIGR00118 138 lkeafeiastGrPGPvlvdlPkdvteaeieleveekvelpgykptvkghklqikkaleliekakkPvll 206
                                               + eaf ias+GrPGPv++d+Pkdv+ +++++      e+p+ +  v+     +++a+ li++ak Pvl+
  lcl|NCBI__GCF_000429965.1:WP_028583815.1 142 IPEAFAIASSGRPGPVVIDIPKDVQLETVQVAGL--PEIPALAAPVQVPAESLHQAASLINQAKAPVLY 208
                                               **************************99988766..689999999999999****************** PP

                                 TIGR00118 207 vGgGviiaeaseelkelaerlkipvtttllGlGafpedhplalgmlGmhGtkeanlavseadlliavGa 275
                                                GgGv++++a    ++laer ++p t+tl+GlG +p +h+l +gmlGmh ++ +n+ ++e+dllia G+
  lcl|NCBI__GCF_000429965.1:WP_028583815.1 209 LGGGVVHSGAGLLARRLAERSQLPTTMTLMGLGIIPPEHELYMGMLGMHAARSTNIMLDECDLLIAAGV 277
                                               ********************************************************************* PP

                                 TIGR00118 276 rfddrvtgnlakfapeakiihididPaeigknvkvdipivGdakkvleellkklkeeekkekeWlekie 344
                                               rfddr tg++a+f+p+a+iih+didP+ei+k  ++++ +vGd  + le+ll  ++++e+    W ++++
  lcl|NCBI__GCF_000429965.1:WP_028583815.1 278 RFDDRATGKIAEFCPDARIIHLDIDPSEISKLKQAHLGLVGDIAQTLERLLPLVSRRERPL--WQQRVR 344
                                               *****************************************************99988777..****** PP

                                 TIGR00118 345 ewkkeyilkldeeeesikPqkvikelskllkdeaivttdvGqhqmwaaqfyktkkprkfitsgGlGtmG 413
                                               ++++ +++   ee +  +P  +i ++ +++ d+a+v+tdvG+hqmw+aq y+ ++pr+f+tsgGlGtmG
  lcl|NCBI__GCF_000429965.1:WP_028583815.1 345 QLRAAHPFFCAEELDFSRPYGIILRIAEIIGDQALVATDVGKHQMWTAQIYPLQRPRQFLTSGGLGTMG 413
                                               ********************************************************************* PP

                                 TIGR00118 414 fGlPaalGakvakpeetvvavtGdgsfqmnlqelstiveydipvkivilnnellGmvkqWqelfyeery 482
                                               fGlP+a+Ga + +p++ vv+++Gdgs++mn+qel+t+ve d+++k+v+lnn+ lG+v+q q+lfy +r 
  lcl|NCBI__GCF_000429965.1:WP_028583815.1 414 FGLPTAIGAALQHPDKPVVCISGDGSIMMNIQELATAVEQDLNLKVVVLNNQSLGLVRQQQKLFYGGRI 482
                                               ********************************************************************* PP

                                 TIGR00118 483 setklaselpdfvklaeayGvkgiriekpeeleeklkealeskepvlldvevdkeeevlPmvapGagld 551
                                                +++   ++pd++++a+++G+k+    +    +e l+ al++ +p l+++ +d +eev+Pmv+pGa+++
  lcl|NCBI__GCF_000429965.1:WP_028583815.1 483 FASEYR-HNPDLAAIARGFGMKAFDCGQTLLFDEVLAVALQTPGPCLINIPIDIDEEVYPMVPPGAANS 550
                                               *****9.59***********************************************************9 PP

                                 TIGR00118 552 elvee 556
                                                +++e
  lcl|NCBI__GCF_000429965.1:WP_028583815.1 551 VMIGE 555
                                               99976 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (557 nodes)
Target sequences:                          1  (560 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.02u 0.01s 00:00:00.03 Elapsed: 00:00:00.02
# Mc/sec: 10.81
//
[ok]
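
The per-domain line in the hmmsearch output can be turned into coverage figures in the same way. This is a minimal sketch that assumes the whitespace-separated column order shown in the table header above (score, bias, c-Evalue, i-Evalue, hmmfrom, hmm to, alifrom, ali to, envfrom, env to, acc). The model length (557) and target length (560) are copied from the output, and the dictionary keys are illustrative names.

```python
# Domain line copied from the hmmsearch output above.
domain_line = (
    "   1 !  665.2   0.0  3.6e-204  3.6e-204       1     556 [.  "
    "     4     555 ..       4     556 .. 0.98"
)

model_len = 557   # from "Query:       TIGR00118  [M=557]"
target_len = 560  # from "Target sequences: 1  (560 residues searched)"

fields = domain_line.split()

# Column positions follow the table header; the two-character
# boundary markers ("[.", "..") sit between the coordinate pairs.
domain = {
    "score": float(fields[2]),
    "bias": float(fields[3]),
    "c_evalue": float(fields[4]),
    "i_evalue": float(fields[5]),
    "hmm_from": int(fields[6]),
    "hmm_to": int(fields[7]),
    "ali_from": int(fields[9]),
    "ali_to": int(fields[10]),
    "env_from": int(fields[12]),
    "env_to": int(fields[13]),
    "acc": float(fields[15]),
}

# Fraction of the model and of the candidate covered by the alignment.
hmm_coverage = (domain["hmm_to"] - domain["hmm_from"] + 1) / model_len
seq_coverage = (domain["ali_to"] - domain["ali_from"] + 1) / target_len

print(f"score {domain['score']} bits, i-Evalue {domain['i_evalue']:.1e}")
print(f"HMM coverage {hmm_coverage:.1%}, sequence coverage {seq_coverage:.1%}")
```

Here the alignment covers model positions 1-556 of 557 and candidate residues 4-555 of 560, so the match spans essentially the full length of both the TIGR00118 model and the protein.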

This GapMind analysis is from Apr 10 2024. The underlying query database was built on Apr 09 2024.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory