GapMind for Amino acid biosynthesis

 

Alignments for a candidate for ilvI in Clostridium tyrobutyricum FAM22553

Align acetohydroxy-acid synthase large subunit (EC 2.2.1.6) (characterized)
to candidate WP_039651790.1 PN53_RS01910 biosynthetic-type acetolactate synthase large subunit

Query= metacyc::MONOMER-11900
         (599 letters)



>NCBI__GCF_000816635.1:WP_039651790.1
          Length = 536

 Score =  491 bits (1263), Expect = e-143
 Identities = 263/557 (47%), Positives = 362/557 (64%), Gaps = 30/557 (5%)

Query: 1   MNGAEAMIKALEAEKVEILFGYPGGALLPFYDALHHSDLIHLLTRHEQAAAHAADGYARA 60
           M  AEA+I+ L+ E+V ++FGYPG A++P Y+AL  SD+ H+L R EQAA H+A GYAR+
Sbjct: 1   MKAAEAIIQYLKKEEVNMVFGYPGAAVVPIYEALRKSDIKHVLVRQEQAAGHSASGYARS 60

Query: 61  SGKVGVCIGTSGPGATNLVTGVATAHSDSSPMVALTGQVPTKLIGNDAFQEIDALGLFMP 120
           +GKVGVCI TSGPGATNL+TG+A+A+ DS PMV +TGQV + LIG D FQE+D  G    
Sbjct: 61  TGKVGVCIVTSGPGATNLITGIASAYMDSIPMVIITGQVKSTLIGRDVFQELDITGATES 120

Query: 121 IVKHNFQIQKTCQIPEIFRSAFEIAQTGRPGPVHIDLPKDVQELELDIDKHPIPSKVKLI 180
             K++F ++++  IP+  + AF IA TGR GPV +D+P D+ E ++D      P  V + 
Sbjct: 121 FTKYSFLVRESKFIPKTIKEAFYIANTGRKGPVLVDIPVDIMEEDIDF---KYPEVVNIR 177

Query: 181 GYNPTTIGHPRQIKKAIKLIASAKRPIILAGGGVLLSGANEELLKLVELLNIPVCTTLMG 240
           GY PTT GH  QI+K I+ I ++KRPII AGGGV+L+ A  +L + VE  NIPV  TLMG
Sbjct: 178 GYKPTTKGHIGQIRKIIERIKTSKRPIICAGGGVILAKAENKLREFVEKSNIPVVHTLMG 237

Query: 241 KGCISENHPLALGMVGMHGTKPANYCLSESDVLISIGCRFSDRITGDIKSFATNAKIIHI 300
           KG I+E+    +G++G HG   AN  +  +DVLI IG R +DR T  IK FA NA +IHI
Sbjct: 238 KGSINEDSNYYVGLIGTHGFDYANKAVDNADVLILIGARATDRTTRGIKDFAKNADVIHI 297

Query: 301 DIDPAEIGKNVNVDVPIVGDAKLILKEVIKQLDYIINKDSKENNDKENISQWIENVNSLK 360
           DIDPAEIGKN+   +P+VGD + +L ++IK++  I         D EN   WI  +  + 
Sbjct: 298 DIDPAEIGKNLETFIPVVGDIENVLSKLIKEITPI---------DTEN---WINQIR-IW 344

Query: 361 KSSIPVMDYDDIPIKPQKIVKELMAVIDDLNINKNTIITTDVGQNQMWMAHYFKTQTPRS 420
           K+SI +       I P+  +  +   +D+      +I+ TDVGQNQ+W A  FK    R 
Sbjct: 345 KNSIEINKNPTDKINPRYALNNVSKKLDE-----ESILITDVGQNQIWCARNFKIMGNRK 399

Query: 421 FLSSGGLGTMGFGFPSAIGAKVAKPDSKVICITGDGGFMMNCQELGTIAEYNIPVVICIF 480
           FL+SGGLGTMG+  P+AIGAK+  PD  VI + GD GF M+  ELGT+ EY++ +V+ IF
Sbjct: 400 FLTSGGLGTMGYSIPAAIGAKMGCPDKNVIAVAGDAGFQMSLFELGTVKEYDVNIVMIIF 459

Query: 481 DNRTLGMVYQWQNLFYGKRQC----SVNFGGAPDFIKLAESYGIKARRIESPNEINEALK 536
           +N  LGMV + Q+     ++C     VNF   PDF+KLAESYGI A R+E  +E     +
Sbjct: 460 NNLGLGMVREIQD-----KKCEGEFGVNFKTNPDFVKLAESYGICAERVEKDDEFETVFE 514

Query: 537 EAINCDEPYLLDFAIDP 553
           +A+N D  +L++  +DP
Sbjct: 515 KALNSDRAFLIECVVDP 531


Lambda     K      H
   0.319    0.137    0.405 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 817
Number of extensions: 45
Number of successful extensions: 4
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 599
Length of database: 536
Length adjustment: 36
Effective length of query: 563
Effective length of database: 500
Effective search space:   281500
Effective search space used:   281500
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 53 (25.0 bits)
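
The summary figures in the BLAST report above can be rechecked by hand. The following is a minimal sketch, not part of GapMind: every constant is copied from the report, and the helper names (`percent`, `span_coverage`, `effective_search_space`) are hypothetical. It assumes the reported percentages are truncated rather than rounded, which is consistent with 362/557 being printed as 64%.

```python
# Values copied from the BLAST report above.
QUERY_LEN = 599            # "(599 letters)"
SUBJECT_LEN = 536          # "Length = 536"
IDENTITIES, POSITIVES, GAPS, ALIGN_LEN = 263, 362, 30, 557
QUERY_SPAN = (1, 553)      # first and last "Query:" coordinates
SUBJECT_SPAN = (1, 531)    # first and last "Sbjct:" coordinates
LENGTH_ADJUSTMENT = 36     # "Length adjustment: 36"


def percent(part, whole):
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


def span_coverage(span, total_len):
    """Percentage of a sequence covered by an inclusive, 1-based span."""
    start, end = span
    return percent(end - start + 1, total_len)


def effective_search_space(query_len, db_len, adjustment):
    """Product of the length-adjusted query and database lengths."""
    return (query_len - adjustment) * (db_len - adjustment)


# Truncated percentages, to compare with the "Identities" line of the report.
identity_pct = int(percent(IDENTITIES, ALIGN_LEN))
positive_pct = int(percent(POSITIVES, ALIGN_LEN))
gap_pct = int(percent(GAPS, ALIGN_LEN))

query_cov = span_coverage(QUERY_SPAN, QUERY_LEN)
subject_cov = span_coverage(SUBJECT_SPAN, SUBJECT_LEN)
search_space = effective_search_space(QUERY_LEN, SUBJECT_LEN, LENGTH_ADJUSTMENT)

print(f"identity {identity_pct}%, positives {positive_pct}%, gaps {gap_pct}%")
print(f"query coverage {query_cov:.1f}%, subject coverage {subject_cov:.1f}%")
print(f"effective search space {search_space}")
```

The alignment spans residues 1-553 of the 599-residue query and residues 1-531 of the 536-residue candidate, so the candidate is covered almost end to end while the last 46 residues of the query are unaligned.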

Align candidate WP_039651790.1 PN53_RS01910 (biosynthetic-type acetolactate synthase large subunit)
to HMM TIGR00118 (ilvB: acetolactate synthase, large subunit, biosynthetic type (EC 2.2.1.6))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.aa/TIGR00118.hmm
# target sequence database:        /tmp/gapView.2802739.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR00118  [M=557]
Accession:   TIGR00118
Description: acolac_lg: acetolactate synthase, large subunit, biosynthetic type
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                             Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                             -----------
   2.4e-210  685.6   2.2   2.7e-210  685.5   2.2    1.0  1  NCBI__GCF_000816635.1:WP_039651790.1  


Domain annotation for each sequence (and alignments):
>> NCBI__GCF_000816635.1:WP_039651790.1  
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  685.5   2.2  2.7e-210  2.7e-210       1     539 [.       1     534 [.       1     536 [] 0.98

  Alignments for each domain:
  == domain 1  score: 685.5 bits;  conditional E-value: 2.7e-210
                             TIGR00118   1 lkgaeilveslkkegvetvfGyPGGavlpiydalydselehilvrheqaaahaadGyarasGkvGvvlatsGP 73 
                                           +k+ae++++ lkke+v++vfGyPG av+piy+al +s+++h+lvr eqaa h a Gyar++GkvGv+++tsGP
  NCBI__GCF_000816635.1:WP_039651790.1   1 MKAAEAIIQYLKKEEVNMVFGYPGAAVVPIYEALRKSDIKHVLVRQEQAAGHSASGYARSTGKVGVCIVTSGP 73 
                                           799********************************************************************** PP

                             TIGR00118  74 GatnlvtgiatayldsvPlvvltGqvatsliGsdafqeidilGitlpvtkhsflvkkaedlpeilkeafeias 146
                                           Gatnl+tgia+ay+ds+P+v++tGqv+++liG d fqe+di+G t + tk+sflv++++ +p+++keaf+ia+
  NCBI__GCF_000816635.1:WP_039651790.1  74 GATNLITGIASAYMDSIPMVIITGQVKSTLIGRDVFQELDITGATESFTKYSFLVRESKFIPKTIKEAFYIAN 146
                                           ************************************************************************* PP

                             TIGR00118 147 tGrPGPvlvdlPkdvteaeieleveekvelpgykptvkghklqikkaleliekakkPvllvGgGviiaeasee 219
                                           tGr GPvlvd+P d++e++i+++++e v+++gykpt+kgh  qi+k++e i+++k+P++ +GgGvi a+a+++
  NCBI__GCF_000816635.1:WP_039651790.1 147 TGRKGPVLVDIPVDIMEEDIDFKYPEVVNIRGYKPTTKGHIGQIRKIIERIKTSKRPIICAGGGVILAKAENK 219
                                           ************************************************************************* PP

                             TIGR00118 220 lkelaerlkipvtttllGlGafpedhplalgmlGmhGtkeanlavseadlliavGarfddrvtgnlakfapea 292
                                           l+e++e+ +ipv++tl+G+G+++ed +  +g++G hG  +an av++ad+li +Gar  dr+t  ++ fa++a
  NCBI__GCF_000816635.1:WP_039651790.1 220 LREFVEKSNIPVVHTLMGKGSINEDSNYYVGLIGTHGFDYANKAVDNADVLILIGARATDRTTRGIKDFAKNA 292
                                           ************************************************************************* PP

                             TIGR00118 293 kiihididPaeigknvkvdipivGdakkvleellkklkeeekkekeWlekieewkkeyilkldeeeesikPqk 365
                                            +ihididPaeigkn+++ ip+vGd ++vl++l+k+++  ++++  W+++i+ wk++  +  ++  ++i+P++
  NCBI__GCF_000816635.1:WP_039651790.1 293 DVIHIDIDPAEIGKNLETFIPVVGDIENVLSKLIKEITPIDTEN--WINQIRIWKNSIEI-NKNPTDKINPRY 362
                                           *************************************9998777..*********98764.5667789***** PP

                             TIGR00118 366 vikelskllkdeaivttdvGqhqmwaaqfyktkkprkfitsgGlGtmGfGlPaalGakvakpeetvvavtGdg 438
                                            ++++sk l++e+i+ tdvGq+q+w a+ +k+   rkf+tsgGlGtmG+ +Paa+Gak++ p+++v+av+Gd+
  NCBI__GCF_000816635.1:WP_039651790.1 363 ALNNVSKKLDEESILITDVGQNQIWCARNFKIMGNRKFLTSGGLGTMGYSIPAAIGAKMGCPDKNVIAVAGDA 435
                                           ************************************************************************* PP

                             TIGR00118 439 sfqmnlqelstiveydipvkivilnnellGmvkqWqelfyeerysetklaselpdfvklaeayGvkgiriekp 511
                                           +fqm+l el t++eyd++++++i+nn  lGmv+  q+   e+ +     +  +pdfvklae+yG+ ++r+ek+
  NCBI__GCF_000816635.1:WP_039651790.1 436 GFQMSLFELGTVKEYDVNIVMIIFNNLGLGMVREIQDKKCEGEFGVNFKT--NPDFVKLAESYGICAERVEKD 506
                                           **************************************999988765555..6******************** PP

                             TIGR00118 512 eeleeklkealeskepvlldvevdkeee 539
                                           +e e  +++al+s++  l++ +vd +e+
  NCBI__GCF_000816635.1:WP_039651790.1 507 DEFETVFEKALNSDRAFLIECVVDPHES 534
                                           ************************8776 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (557 nodes)
Target sequences:                          1  (536 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.00u 0.01s 00:00:00.01 Elapsed: 00:00:00.01
# Mc/sec: 15.53
//
[ok]
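
How much of the TIGR00118 model and of the candidate the single domain hit covers can be read off the domain annotation row above. The sketch below is illustrative only and is not how GapMind itself processes the output: the row is copied from the report, the function name `parse_domain_row` is hypothetical, and the field positions assume the whitespace-separated layout shown in that row.

```python
# Domain row and lengths copied from the hmmsearch report above.
DOMAIN_ROW = ("   1 !  685.5   2.2  2.7e-210  2.7e-210       1     539 [.  "
              "     1     534 [.       1     536 [] 0.98")
MODEL_LEN = 557    # "Query: TIGR00118 [M=557]"
TARGET_LEN = 536   # "Target sequences: 1 (536 residues searched)"


def parse_domain_row(row):
    """Split one domain annotation row into named fields.

    Assumes the column order of the report header:
    #, flag, score, bias, c-Evalue, i-Evalue, hmmfrom, hmm to, (bounds),
    alifrom, ali to, (bounds), envfrom, env to, (bounds), acc.
    """
    f = row.split()
    return {
        "score": float(f[2]),
        "bias": float(f[3]),
        "c_evalue": float(f[4]),
        "i_evalue": float(f[5]),
        "hmm_from": int(f[6]),
        "hmm_to": int(f[7]),
        "ali_from": int(f[9]),
        "ali_to": int(f[10]),
        "env_from": int(f[12]),
        "env_to": int(f[13]),
        "acc": float(f[15]),
    }


dom = parse_domain_row(DOMAIN_ROW)

# Inclusive, 1-based coordinates, so the span length is to - from + 1.
model_cov = 100.0 * (dom["hmm_to"] - dom["hmm_from"] + 1) / MODEL_LEN
target_cov = 100.0 * (dom["ali_to"] - dom["ali_from"] + 1) / TARGET_LEN

print(f"score {dom['score']} bits, i-Evalue {dom['i_evalue']:.1e}")
print(f"model coverage {model_cov:.1f}%, target coverage {target_cov:.1f}%")
```

For anything beyond a one-off check, the `--domtblout` option of hmmsearch writes the same per-domain fields to a tabular file that is easier to parse than the human-readable report.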

This GapMind analysis is from Jul 25 2024. The underlying query database was built on Jul 25 2024.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMER with enzyme models (usually from TIGRFAMs). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory