GapMind for Amino acid biosynthesis

 

Alignments for a candidate for ilvI in Methylocapsa acidiphila B2

Align acetohydroxyacid synthase subunit B (EC 2.2.1.6) (characterized)
to candidate WP_026608213.1 METAC_RS0118905 acetolactate synthase 3 large subunit

Query= metacyc::MONOMER-18810
         (585 letters)



>NCBI__GCF_000427445.1:WP_026608213.1
          Length = 586

 Score =  623 bits (1607), Expect = 0.0
 Identities = 311/569 (54%), Positives = 407/569 (71%), Gaps = 5/569 (0%)

Query: 19  MIGAEILVHALAEEGVEYVWGYPGGAVLYIYDELHKQTKFEHILVRHEQAAVHAADGYAR 78
           + GAE++V AL ++GVE ++GYPGGAVL IYD L  Q   +H+LVRHEQ A HAA+GYAR
Sbjct: 5   LTGAEMVVRALQDQGVETIFGYPGGAVLPIYDALFHQKHIKHVLVRHEQGAAHAAEGYAR 64

Query: 79  ATGKVGVALVTSGPGVTNAVTGIATAYLDSIPMVVITGNVPTHAIGQDAFQECDTVGITR 138
           ++GK GV LVTSGPG TNA+TG+  A +DSIP+V ITG VPTH IG DAFQECDTVGITR
Sbjct: 65  SSGKAGVLLVTSGPGATNAITGLTDALMDSIPLVCITGQVPTHLIGSDAFQECDTVGITR 124

Query: 139 PIVKHNFLVKDVRDLAATIKKAFFIAATGRPGPVVVDIPKDVSRNACKYEYPKSIDMRSY 198
              KHN+LV+ V DLA  + +AF++A  GRPGPVV+DIPKDV      Y  P  ++ ++Y
Sbjct: 125 HCTKHNYLVRHVDDLARVLHEAFYVAQNGRPGPVVIDIPKDVQFAVGAYFGPHKVEHKTY 184

Query: 199 NPVNKGHSGQIRKAVALLQGAERPYIYTGGGVVLA--NASDELRQLAALTGHPVTNTLMG 256
            P  +G   +I +AV ++  A+RP  YTGGGV+ +   AS  LR+L  LTG P+T+TLMG
Sbjct: 185 KPRLEGDPDKIAQAVEMMAAAKRPIFYTGGGVINSGVEASHLLRELVGLTGFPITSTLMG 244

Query: 257 LGAFPGTSKQFVGMLGMHGTYEANMAMQNCDVLIAIGARFDDRVIGNPAHFTSQARKIIH 316
           LG++P + ++++GMLGMHGT+EAN AM +CD+++A+GARFDDR+ G    F+  ++K IH
Sbjct: 245 LGSYPASGEKWLGMLGMHGTFEANNAMHDCDLMVAVGARFDDRITGRLDAFSPGSKK-IH 303

Query: 317 IDIDPSSISKRVKVDIPIVGNVKDVLQELIAQIKASDIKPKREALAKWWEQIEQWRSVDC 376
           +DIDPSSI+K VK+D+ I+G+   VL++++   +A       E LAKWW QI+QWR+   
Sbjct: 304 VDIDPSSINKNVKIDLGIIGDCAHVLRKMVELWRARRHVASAEPLAKWWAQIDQWRARKS 363

Query: 377 LKYDRSSEIIKPQYVVEKIWELTKG-DAFICSDVGQHQMWAAQFYKFDEPRRWINSGGLG 435
           L Y  S E+IKPQ+ V++++ELTK  D +I ++VGQHQMWAAQ Y F+EP RW+ SGGLG
Sbjct: 364 LSYKASKEVIKPQFAVQRLYELTKARDVYITTEVGQHQMWAAQHYHFEEPNRWMTSGGLG 423

Query: 436 TMGVGLPYAMGIKKAFPEKEVVTITGEGSIQMCIQELSTCLQYDTPVKICSLNNGYLGMV 495
           TMG GLP A+G + A P+  VV I GE SI M IQELST  QY  PVK+  LNN Y+GMV
Sbjct: 424 TMGYGLPAAIGAQLAHPDALVVDIAGEASILMNIQELSTAAQYRLPVKVFILNNEYMGMV 483

Query: 496 RQWQEIEYDNRYSHSYMDALPDFVKLAEAYGHVGMRVEKTSDVEPALREAFRLKDRTVFL 555
           RQWQE+ +D RYS SY +ALPDFVKLAEAYG  G+R    + ++ A+ E      RTV  
Sbjct: 484 RQWQELLHDGRYSESYSEALPDFVKLAEAYGAHGIRCSDPAQLDAAILEMID-TPRTVIF 542

Query: 556 DFQTDPTENVWPMVQAGKGISEMLLGAED 584
           D   D TEN  PM+ +GK  +EM+L   D
Sbjct: 543 DCIVDKTENCLPMIPSGKAHNEMILPDHD 571


Lambda     K      H
   0.319    0.135    0.407 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 975
Number of extensions: 30
Number of successful extensions: 5
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 585
Length of database: 586
Length adjustment: 37
Effective length of query: 548
Effective length of database: 549
Effective search space:   300852
Effective search space used:   300852
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.8 bits)
S2: 53 (25.0 bits)
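
As a rough cross-check of the BLAST summary above, the following Python sketch recomputes the percentages, the effective search space, and the bit score from the values printed in this report. The function and variable names are illustrative and are not part of GapMind or BLAST. The bit-score formula is the standard Karlin-Altschul conversion; the small difference from the reported 623 bits is assumed to come from rounding of the printed Lambda and K.

```python
import math

# Values copied from the BLAST report above.
identities, positives, gaps, aln_len = 311, 407, 5, 569
query_len, db_len, length_adjustment = 585, 586, 37
raw_score, gapped_lambda, gapped_k = 1607, 0.267, 0.0410


def percent(count: int, total: int) -> int:
    """Percentage truncated to an integer, which matches the figures printed above."""
    return (100 * count) // total


def effective_search_space(q_len: int, d_len: int, adjust: int) -> int:
    """Product of the length-adjusted query and database lengths."""
    return (q_len - adjust) * (d_len - adjust)


def bit_score(raw: int, lam: float, k: float) -> float:
    """Karlin-Altschul conversion: (lambda * S - ln K) / ln 2."""
    return (lam * raw - math.log(k)) / math.log(2)


pct_identity = percent(identities, aln_len)
pct_positive = percent(positives, aln_len)
pct_gaps = percent(gaps, aln_len)
search_space = effective_search_space(query_len, db_len, length_adjustment)
bits = bit_score(raw_score, gapped_lambda, gapped_k)

# Query residues 19..584 are aligned, out of 585 in total.
query_coverage = (584 - 19 + 1) / query_len

print(pct_identity, pct_positive, pct_gaps, search_space)
print(round(bits, 1), round(query_coverage, 3))
```

The recomputed search space equals the "Effective search space" line, and the alignment covers roughly 97% of the query.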

Align candidate WP_026608213.1 METAC_RS0118905 (acetolactate synthase 3 large subunit)
to HMM TIGR00118 (ilvB: acetolactate synthase, large subunit, biosynthetic type (EC 2.2.1.6))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.aa/TIGR00118.hmm
# target sequence database:        /tmp/gapView.28904.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR00118  [M=557]
Accession:   TIGR00118
Description: acolac_lg: acetolactate synthase, large subunit, biosynthetic type
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                                 Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                                 -----------
   2.4e-243  794.6   0.0   2.8e-243  794.3   0.0    1.0  1  lcl|NCBI__GCF_000427445.1:WP_026608213.1  METAC_RS0118905 acetolactate syn


Domain annotation for each sequence (and alignments):
>> lcl|NCBI__GCF_000427445.1:WP_026608213.1  METAC_RS0118905 acetolactate synthase 3 large subunit
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  794.3   0.0  2.8e-243  2.8e-243       1     555 [.       5     567 ..       5     569 .. 0.97

  Alignments for each domain:
  == domain 1  score: 794.3 bits;  conditional E-value: 2.8e-243
                                 TIGR00118   1 lkgaeilveslkkegvetvfGyPGGavlpiydaly.dselehilvrheqaaahaadGyarasGkvGvvl 68 
                                               l+gae++v++l+++gvet+fGyPGGavlpiydal+ +++++h+lvrheq+aahaa+Gyar+sGk+Gv l
  lcl|NCBI__GCF_000427445.1:WP_026608213.1   5 LTGAEMVVRALQDQGVETIFGYPGGAVLPIYDALFhQKHIKHVLVRHEQGAAHAAEGYARSSGKAGVLL 73 
                                               68********************************98999****************************** PP

                                 TIGR00118  69 atsGPGatnlvtgiatayldsvPlvvltGqvatsliGsdafqeidilGitlpvtkhsflvkkaedlpei 137
                                               +tsGPGatn++tg+++a +ds+Plv +tGqv+t+liGsdafqe+d +Git+ +tkh++lv++++dl+++
  lcl|NCBI__GCF_000427445.1:WP_026608213.1  74 VTSGPGATNAITGLTDALMDSIPLVCITGQVPTHLIGSDAFQECDTVGITRHCTKHNYLVRHVDDLARV 142
                                               ********************************************************************* PP

                                 TIGR00118 138 lkeafeiastGrPGPvlvdlPkdvteaeieleveekvelpgykptvkghklqikkaleliekakkPvll 206
                                               l+eaf++a+ GrPGPv++d+Pkdv+ a   +  ++kve ++ykp+++g++ +i++a+e++++ak+P+ +
  lcl|NCBI__GCF_000427445.1:WP_026608213.1 143 LHEAFYVAQNGRPGPVVIDIPKDVQFAVGAYFGPHKVEHKTYKPRLEGDPDKIAQAVEMMAAAKRPIFY 211
                                               ********************************************************************* PP

                                 TIGR00118 207 vGgGviia..easeelkelaerlkipvtttllGlGafpedhplalgmlGmhGtkeanlavseadlliav 273
                                                GgGvi +  eas+ l+el+  +  p+t+tl+GlG++p+  ++ lgmlGmhGt ean a++++dl++av
  lcl|NCBI__GCF_000427445.1:WP_026608213.1 212 TGGGVINSgvEASHLLRELVGLTGFPITSTLMGLGSYPASGEKWLGMLGMHGTFEANNAMHDCDLMVAV 280
                                               ******873368999****************************************************** PP

                                 TIGR00118 274 GarfddrvtgnlakfapeakiihididPaeigknvkvdipivGdakkvleellkklkee....ekke.k 337
                                               Garfddr+tg l+ f+p +k ih+didP++i+knvk d+ i+Gd+ +vl+++++  +++    + +   
  lcl|NCBI__GCF_000427445.1:WP_026608213.1 281 GARFDDRITGRLDAFSPGSKKIHVDIDPSSINKNVKIDLGIIGDCAHVLRKMVELWRARrhvaSAEPlA 349
                                               *****************************************************9888776665333314 PP

                                 TIGR00118 338 eWlekieewkkeyilkldeeeesikPqkvikelskllkd.eaivttdvGqhqmwaaqfyktkkprkfit 405
                                               +W+++i++w++++ l+++ ++e ikPq  +++l++l+k  ++++tt+vGqhqmwaaq+y++++p++++t
  lcl|NCBI__GCF_000427445.1:WP_026608213.1 350 KWWAQIDQWRARKSLSYKASKEVIKPQFAVQRLYELTKArDVYITTEVGQHQMWAAQHYHFEEPNRWMT 418
                                               5************************************9879**************************** PP

                                 TIGR00118 406 sgGlGtmGfGlPaalGakvakpeetvvavtGdgsfqmnlqelstiveydipvkivilnnellGmvkqWq 474
                                               sgGlGtmG+GlPaa+Ga++a+p++ vv+++G++s+ mn+qelst+++y +pvk+ ilnne++Gmv+qWq
  lcl|NCBI__GCF_000427445.1:WP_026608213.1 419 SGGLGTMGYGLPAAIGAQLAHPDALVVDIAGEASILMNIQELSTAAQYRLPVKVFILNNEYMGMVRQWQ 487
                                               ********************************************************************* PP

                                 TIGR00118 475 elfyeerysetklaselpdfvklaeayGvkgiriekpeeleeklkealeskepvlldvevdkeeevlPm 543
                                               el++++ryse++ ++ lpdfvklaeayG++gir ++p++l++++ e++ + + v++d  vdk+e++lPm
  lcl|NCBI__GCF_000427445.1:WP_026608213.1 488 ELLHDGRYSESYSEA-LPDFVKLAEAYGAHGIRCSDPAQLDAAILEMIDTPRTVIFDCIVDKTENCLPM 555
                                               *************95.***************************************************** PP

                                 TIGR00118 544 vapGagldelve 555
                                               +++G++ +e++ 
  lcl|NCBI__GCF_000427445.1:WP_026608213.1 556 IPSGKAHNEMIL 567
                                               **********96 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (557 nodes)
Target sequences:                          1  (586 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.02u 0.01s 00:00:00.03 Elapsed: 00:00:00.02
# Mc/sec: 11.37
//
[ok]
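
To make the domain table above easier to interpret, here is a minimal Python sketch that splits the single domain row into fields and computes how much of the HMM and of the candidate sequence the alignment spans. The helper names are hypothetical, and the field positions assume the column layout shown in this particular hmmsearch output.

```python
# Domain row and lengths copied from the hmmsearch output above.
domain_row = "   1 !  794.3   0.0  2.8e-243  2.8e-243       1     555 [.       5     567 ..       5     569 .. 0.97"
model_length = 557      # from "Query: TIGR00118 [M=557]"
sequence_length = 586   # from "586 residues searched"


def parse_domain_row(row: str) -> dict:
    """Split one row of the per-domain table into named fields."""
    f = row.split()
    return {
        "score": float(f[2]),
        "i_evalue": float(f[5]),
        "hmm_from": int(f[6]),
        "hmm_to": int(f[7]),
        "ali_from": int(f[9]),
        "ali_to": int(f[10]),
    }


def coverage(start: int, end: int, total: int) -> float:
    """Fraction of `total` positions spanned by the inclusive range start..end."""
    return (end - start + 1) / total


dom = parse_domain_row(domain_row)
model_cov = coverage(dom["hmm_from"], dom["hmm_to"], model_length)
seq_cov = coverage(dom["ali_from"], dom["ali_to"], sequence_length)
print(f"model coverage {model_cov:.3f}, sequence coverage {seq_cov:.3f}")
```

For this hit the alignment spans positions 1 to 555 of the 557-node model, so nearly the whole model is covered.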

This GapMind analysis is from Apr 10 2024. The underlying query database was built on Apr 09 2024.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
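
The definition of a gap given above (a step with no high- or medium-confidence candidate) can be sketched in a few lines of Python. This is only an illustration of that rule, not GapMind's implementation: the step names other than ilvI and all of the confidence labels below are made up.

```python
# Hypothetical per-step results: each step maps to the confidence labels
# of its candidates ("high", "medium", or "low"). The labels are invented.
candidates_by_step = {
    "ilvI": ["high"],
    "stepA": ["low", "low"],
    "stepB": ["medium", "low"],
    "stepC": [],
}


def find_gaps(steps: dict[str, list[str]]) -> list[str]:
    """Return the steps that have no high- or medium-confidence candidate."""
    return [
        step
        for step, labels in steps.items()
        if not any(label in ("high", "medium") for label in labels)
    ]


gaps = find_gaps(candidates_by_step)
print(gaps)
```

With this invented input, the steps that have only low-confidence candidates or no candidates at all are reported as gaps.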

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory