GapMind for Amino acid biosynthesis

 

Alignments for a candidate for ilvI in Streptacidiphilus oryzae TH49

Align acetolactate synthase (subunit 2/2) (EC 2.2.1.6) (characterized)
to candidate WP_037572586.1 BS73_RS14735 acetolactate synthase large subunit

Query= BRENDA::P9WG41
         (618 letters)



>NCBI__GCF_000744815.1:WP_037572586.1
          Length = 618

 Score =  835 bits (2156), Expect = 0.0
 Identities = 406/602 (67%), Positives = 480/602 (79%), Gaps = 15/602 (2%)

Query: 27  AARPKHVALQQLTGAQAVIRSLEELGVDVIFGIPGGAVLPVYDPLFDSKKLRHVLVRHEQ 86
           AA+P     + ++GAQ++IRSLE +GVD +FG+PGGA+LP YDPL DS+KLRHVLVRHEQ
Sbjct: 19  AAQP--TVAETMSGAQSLIRSLEAVGVDTVFGLPGGAILPAYDPLMDSEKLRHVLVRHEQ 76

Query: 87  GAGHAASGYAHVTGRVGVCMATSGPGATNLVTPLADAQMDSIPVVAITGQVGRGLIGTDA 146
           GAGHAA+GYA  TG+VGVCMATSGPGATNLVTP+ADA MDS+P+VAITGQV    IGTDA
Sbjct: 77  GAGHAATGYAQATGKVGVCMATSGPGATNLVTPIADAYMDSVPMVAITGQVASTSIGTDA 136

Query: 147 FQEADISGITMPITKHNFLVRSGDDIPRVLAEAFHIAASGRPGAVLVDIPKDVLQGQCTF 206
           FQEADI GITMPITKHNFLV    DIPRV++EAFHIAA+GRPG VLVD+ KD +Q Q  F
Sbjct: 137 FQEADICGITMPITKHNFLVTDAADIPRVISEAFHIAATGRPGPVLVDVAKDAMQKQTVF 196

Query: 207 SWPPRMELPGYKPNTKPHSRQVREAAKLIAAARKPVLYVGGGVIRGEATEQLRELAELTG 266
            WP   +LPGY+P T+PH +Q+REAA+L+A AR+PVLYVGGGV++  AT +L+ LAELT 
Sbjct: 197 RWPVEHQLPGYRPVTRPHGKQIREAARLLAQARRPVLYVGGGVLKARATAELKILAELTK 256

Query: 267 IPVVTTLMARGAFPDSHRQNLGMPGMHGTVAAVAALQRSDLLIALGTRFDDRVTGKLDSF 326
            PV TTLM  GAFPDSH Q+LGMPGMHGTVAAV +LQ++DL+IALG RFDDRVTGKLD F
Sbjct: 257 APVTTTLMGIGAFPDSHPQHLGMPGMHGTVAAVTSLQKADLIIALGARFDDRVTGKLDGF 316

Query: 327 APEAKVIHADIDPAEIGKNRHADVPIVGDVKAVITELIAMLRHH--HIPGTIEMADWWAY 384
           AP A ++HADIDPAEIGKNR ADVPIVGD + V+ +LI  +++   H PG  +   WW  
Sbjct: 317 APYATIVHADIDPAEIGKNRAADVPIVGDAREVLADLIVAVQNELDHTPGACDYTAWWEQ 376

Query: 385 LNGVRKTYPLSYGPQSDGSLSPEYVIEKLGEIAGPDAVFVAGVGQHQMWAAQFIRYEKPR 444
           L G RKTYPL + P  DG L+P+ VIE++G++ GPDA++ AGVGQHQMWA+QFI++E P 
Sbjct: 377 LGGWRKTYPLGWDPAPDGLLTPQQVIERIGQLVGPDAIYAAGVGQHQMWASQFIQFEHPA 436

Query: 445 SWLNSGGLGTMGFAIPAAMGAKIALPGTEVWAIDGDGCFQMTNQELATCAVEGIPVKVAL 504
           +WLNSGG GTMG+A+PAAMGAK   PGTEVWAIDGDGCFQMTNQEL TCA+  IP+KVA+
Sbjct: 437 TWLNSGGAGTMGYAVPAAMGAKAGQPGTEVWAIDGDGCFQMTNQELVTCALNNIPIKVAV 496

Query: 505 INNGNLGMVRQWQSLFYAERYSQTDL-----------ATHSHRIPDFVKLAEALGCVGLR 553
           INNG+LGMVRQWQ+LFY ERYS T L                RIPDFVKLAEA+GC GLR
Sbjct: 497 INNGSLGMVRQWQTLFYNERYSNTVLHSGPGHDGKEQPAQGTRIPDFVKLAEAMGCHGLR 556

Query: 554 CEREEDVVDVINQARAINDCPVVIDFIVGADAQVWPMVAAGTSNDEIQAARGIRPLFDDI 613
           CE  + +  VI QAR++ND PVVIDFIV  DA VWPMVAAGT+NDEI AAR +RP F D 
Sbjct: 557 CESPDQLDAVIEQARSLNDAPVVIDFIVHQDAMVWPMVAAGTNNDEIMAARDVRPDFGDN 616

Query: 614 TE 615
            E
Sbjct: 617 EE 618


Lambda     K      H
   0.319    0.136    0.414 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 1203
Number of extensions: 45
Number of successful extensions: 3
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 618
Length of database: 618
Length adjustment: 37
Effective length of query: 581
Effective length of database: 581
Effective search space:   337561
Effective search space used:   337561
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.8 bits)
S2: 53 (25.0 bits)
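
The statistics in the BLAST report above can be cross-checked by hand. The following is a minimal sketch, assuming the standard Karlin-Altschul relations (bit score S' = (lambda*S - ln K)/ln 2 and E = m'*n'*2^-S') and the gapped Lambda and K values printed in the report; the variable names are illustrative and are not part of GapMind or BLAST.

```python
import math

# Values copied from the BLAST report above.
raw_score = 2156                     # raw score shown in parentheses after the bit score
lam_gapped, k_gapped = 0.267, 0.0410 # "Gapped" Lambda and K
query_len = db_len = 618
length_adjustment = 37

# Bit score: S' = (lambda * S - ln K) / ln 2
bit_score = (lam_gapped * raw_score - math.log(k_gapped)) / math.log(2)

# Effective search space: product of the length-adjusted query and database lengths.
eff_query = query_len - length_adjustment
eff_db = db_len - length_adjustment
search_space = eff_query * eff_db

# E-value: E = m' * n' * 2^(-S')
e_value = search_space * 2.0 ** (-bit_score)

print(f"bit score    = {bit_score:.1f}")
print(f"search space = {search_space}")
print(f"E-value      = {e_value:.1e}")
```

Under these assumptions the computed bit score rounds to the reported 835 bits and the search space equals the reported 337561. The E-value is vanishingly small, which is consistent with the report showing it as 0.0.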

Align candidate WP_037572586.1 BS73_RS14735 (acetolactate synthase large subunit)
to HMM TIGR00118 (ilvB: acetolactate synthase, large subunit, biosynthetic type (EC 2.2.1.6))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.aa/TIGR00118.hmm
# target sequence database:        /tmp/gapView.380344.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR00118  [M=557]
Accession:   TIGR00118
Description: acolac_lg: acetolactate synthase, large subunit, biosynthetic type
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                             Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                             -----------
     2e-232  758.5   0.5   3.2e-232  757.8   0.5    1.3  1  NCBI__GCF_000744815.1:WP_037572586.1  


Domain annotation for each sequence (and alignments):
>> NCBI__GCF_000744815.1:WP_037572586.1  
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  757.8   0.5  3.2e-232  3.2e-232       1     555 [.      28     605 ..      28     607 .. 0.95

  Alignments for each domain:
  == domain 1  score: 757.8 bits;  conditional E-value: 3.2e-232
                             TIGR00118   1 lkgaeilveslkkegvetvfGyPGGavlpiydaly.dselehilvrheqaaahaadGyarasGkvGvvlatsG 72 
                                           ++ga+ l++sl++ gv+tvfG PGGa+lp yd l  +++l+h+lvrheq+a haa Gya+a+GkvGv++atsG
  NCBI__GCF_000744815.1:WP_037572586.1  28 MSGAQSLIRSLEAVGVDTVFGLPGGAILPAYDPLMdSEKLRHVLVRHEQGAGHAATGYAQATGKVGVCMATSG 100
                                           79*********************************7789********************************** PP

                             TIGR00118  73 PGatnlvtgiatayldsvPlvvltGqvatsliGsdafqeidilGitlpvtkhsflvkkaedlpeilkeafeia 145
                                           PGatnlvt+ia+ay+dsvP+v++tGqva++ iG+dafqe+di Git+p+tkh+flv++a+d+p+++ eaf+ia
  NCBI__GCF_000744815.1:WP_037572586.1 101 PGATNLVTPIADAYMDSVPMVAITGQVASTSIGTDAFQEADICGITMPITKHNFLVTDAADIPRVISEAFHIA 173
                                           ************************************************************************* PP

                             TIGR00118 146 stGrPGPvlvdlPkdvteaeieleveekvelpgykptvkghklqikkaleliekakkPvllvGgGviiaease 218
                                           +tGrPGPvlvd+ kd +++++ ++ + + +lpgy+p +++h +qi++a+ l+++a++Pvl+vGgGv++a a++
  NCBI__GCF_000744815.1:WP_037572586.1 174 ATGRPGPVLVDVAKDAMQKQTVFRWPVEHQLPGYRPVTRPHGKQIREAARLLAQARRPVLYVGGGVLKARATA 246
                                           ************************************************************************* PP

                             TIGR00118 219 elkelaerlkipvtttllGlGafpedhplalgmlGmhGtkeanlavseadlliavGarfddrvtgnlakfape 291
                                           elk lae +k+pvtttl+G+Gafp+ hp+ lgm GmhGt +a +++++adl+ia+Garfddrvtg+l+ fap 
  NCBI__GCF_000744815.1:WP_037572586.1 247 ELKILAELTKAPVTTTLMGIGAFPDSHPQHLGMPGMHGTVAAVTSLQKADLIIALGARFDDRVTGKLDGFAPY 319
                                           ************************************************************************* PP

                             TIGR00118 292 akiihididPaeigknvkvdipivGdakkvleellkklkee......ekkekeWlekieewkkeyilkldeee 358
                                           a+i+h didPaeigkn+++d+pivGda++vl++l+ ++++e        + + W+e++  w+k+y+l  d   
  NCBI__GCF_000744815.1:WP_037572586.1 320 ATIVHADIDPAEIGKNRAADVPIVGDAREVLADLIVAVQNEldhtpgACDYTAWWEQLGGWRKTYPLGWDPAP 392
                                           *************************************9999877664334445***************99888 PP

                             TIGR00118 359 es.ikPqkvikelskllkdeaivttdvGqhqmwaaqfyktkkprkfitsgGlGtmGfGlPaalGakvakpeet 430
                                           +  + Pq+vi+++ +l+  +ai++++vGqhqmwa+qf ++++p ++++sgG+GtmG+ +Paa+Gak ++p ++
  NCBI__GCF_000744815.1:WP_037572586.1 393 DGlLTPQQVIERIGQLVGPDAIYAAGVGQHQMWASQFIQFEHPATWLNSGGAGTMGYAVPAAMGAKAGQPGTE 465
                                           7769********************************************************************* PP

                             TIGR00118 431 vvavtGdgsfqmnlqelstiveydipvkivilnnellGmvkqWqelfyeerysetklas.............. 489
                                           v a++Gdg fqm+ qel t++  +ip+k+ ++nn  lGmv+qWq lfy+erys+t l+s              
  NCBI__GCF_000744815.1:WP_037572586.1 466 VWAIDGDGCFQMTNQELVTCALNNIPIKVAVINNGSLGMVRQWQTLFYNERYSNTVLHSgpghdgkeqpaqgt 538
                                           ******************************************************9886422222222222222 PP

                             TIGR00118 490 elpdfvklaeayGvkgiriekpeeleeklkealesk.epvlldvevdkeeevlPmvapGagldelve 555
                                             pdfvklaea+G +g+r e+p++l++ +++a + + +pv++d+ v++++ v+Pmva G+++de++ 
  NCBI__GCF_000744815.1:WP_037572586.1 539 RIPDFVKLAEAMGCHGLRCESPDQLDAVIEQARSLNdAPVVIDFIVHQDAMVWPMVAAGTNNDEIMA 605
                                           479*****************************998769**************************986 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (557 nodes)
Target sequences:                          1  (618 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.02u 0.01s 00:00:00.03 Elapsed: 00:00:00.03
# Mc/sec: 9.08
//
[ok]
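
How much of the model and of the candidate the single domain hit covers can be read off the domain annotation row above. The sketch below assumes the whitespace-separated column order shown in that table (score, bias, c-Evalue, i-Evalue, hmmfrom, hmm to, alifrom, ali to, envfrom, env to, acc); the row string, model length (M=557), and sequence length (618 residues) are copied from the output, and the variable names are illustrative.

```python
# Domain row copied from the "Domain annotation" table of the hmmsearch output above.
domain_row = ("   1 !  757.8   0.5  3.2e-232  3.2e-232       1     555 [."
              "      28     605 ..      28     607 .. 0.95")
model_len = 557   # from "Query: TIGR00118 [M=557]"
seq_len = 618     # from "Target sequences: 1 (618 residues searched)"

fields = domain_row.split()
score = float(fields[2])
hmm_from, hmm_to = int(fields[6]), int(fields[7])
ali_from, ali_to = int(fields[9]), int(fields[10])

# Fraction of the HMM and of the candidate sequence spanned by the alignment.
hmm_coverage = (hmm_to - hmm_from + 1) / model_len
seq_coverage = (ali_to - ali_from + 1) / seq_len

print(f"domain score      = {score}")
print(f"HMM coverage      = {hmm_coverage:.1%}")
print(f"sequence coverage = {seq_coverage:.1%}")
```

Here the alignment spans model positions 1 to 555 of 557 and candidate residues 28 to 605 of 618, so the hit covers nearly the whole model.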

This GapMind analysis is from Jul 26 2024. The underlying query database was built on Jul 25 2024.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMER with enzyme models (usually from TIGRFAMs). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."
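
As an illustration of the coverage threshold, the sketch below computes percent identity and coverage for the BLAST alignment shown on this page and compares the coverage with the 50% cutoff mentioned above. It assumes that coverage is measured as the aligned fraction of the characterized (query) protein; that choice, the helper function, and the constant name are assumptions for illustration, and the sketch does not implement the full high- or medium-confidence rules.

```python
# Numbers copied from the BLAST alignment on this page.
identities, alignment_len = 406, 602   # "Identities = 406/602"
query_start, query_end = 27, 615       # first and last aligned query positions
query_len = 618                        # "(618 letters)"

# Assumption: the 50% cutoff from the text, applied to query coverage.
MIN_LOW_CONFIDENCE_COVERAGE = 0.50


def coverage(start: int, end: int, length: int) -> float:
    """Fraction of a sequence spanned by an alignment (1-based, inclusive)."""
    return (end - start + 1) / length


percent_identity = 100.0 * identities / alignment_len
query_coverage = coverage(query_start, query_end, query_len)
meets_cutoff = query_coverage >= MIN_LOW_CONFIDENCE_COVERAGE

print(f"identity = {percent_identity:.1f}%")
print(f"coverage = {query_coverage:.1%}")
print(f"meets 50% coverage cutoff: {meets_cutoff}")
```

For this candidate the alignment covers 589 of 618 query positions, well above the 50% cutoff.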

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory