GapMind for Amino acid biosynthesis

 

Alignments for a candidate for leuA in Haloglycomyces albus DSM 45210

Align 2-isopropylmalate synthase (EC 2.3.3.13) (characterized)
to candidate WP_025273525.1 HALAL_RS19070 2-isopropylmalate synthase

Query= BRENDA::P9WQB3
         (644 letters)



>NCBI__GCF_000527155.1:WP_025273525.1
          Length = 956

 Score =  697 bits (1798), Expect = 0.0
 Identities = 358/598 (59%), Positives = 433/598 (72%), Gaps = 43/598 (7%)

Query: 45  RYRPFAEEVEPIRLRNRTWPDRVIDRAPLWCAVDLRDGNQALIDPMSPARKRRMFDLLVR 104
           RYRPF EE+  + L +RTWPDR+ID AP WCAVDLRDGNQALIDPM   RK+ MF  LV 
Sbjct: 10  RYRPFHEEIR-VHLPDRTWPDRIIDHAPTWCAVDLRDGNQALIDPMDLERKKTMFQTLVD 68

Query: 105 MGYKEIEVGFPSASQTDFDFVREIIEQGAIPDDVTIQVLTQCRPELIERTFQACSGAPRA 164
           MGYKEIEVGFP+ASQTDFDFVR +IE+  IP DV+IQVLTQCR  LI+RTF++  GA +A
Sbjct: 69  MGYKEIEVGFPAASQTDFDFVRALIEEERIPADVSIQVLTQCRDHLIKRTFESLRGARQA 128

Query: 165 IVHFYNSTSILQRRVVFRANRAEVQAIATDGARKCVEQAAKY-PGTQWRFEYSPESYTGT 223
           IVHFYNSTSILQR+VVF+A++  +  IATDGA  C++ AA   P T  R+EYSPES+TGT
Sbjct: 129 IVHFYNSTSILQRKVVFQADKPSITKIATDGAELCLKYAADITPDTDIRYEYSPESFTGT 188

Query: 224 ELEYAKQVCDAVGEVIAPTPERPIIFNLPATVEMTTPNVYADSIEWMSRNLANRESVILS 283
           ELEYA  +C+AV EVI P+ +RP+I NLPATVEM TPN+YADSIEW  R+ A  E++ILS
Sbjct: 189 ELEYAVDICNAVAEVIDPSIDRPLILNLPATVEMATPNIYADSIEWFKRHFARPEAMILS 248

Query: 284 LHPHNDRGTAVAAAELGFAAGADRIEGCLFGNGERTGNVCLVTLGLNLFSRGVDPQIDFS 343
           +HPHNDRGT VAAAEL   AGADR+EGCLFGNGERTGNVCLVTLG+NLF++GVDP+IDF 
Sbjct: 249 VHPHNDRGTGVAAAELAVQAGADRVEGCLFGNGERTGNVCLVTLGMNLFTQGVDPEIDFG 308

Query: 344 NIDEIRRTVEYCNQLPVHERHPYGGDLVYTAFSGSHQDAINKGLDAMKLDADAADCDVDD 403
           +ID IRR VEYCN+LPV ERHPY GDLV+TAFSGSHQDAI KGL A++ +AD A   V+ 
Sbjct: 309 DIDSIRRKVEYCNRLPVPERHPYAGDLVFTAFSGSHQDAIKKGLQALESEADQAGVAVEQ 368

Query: 404 MLWQVPYLPIDPRDVGRTYEAVIRVNSQSGKGGVAYIMKTDHGLSLPRRLQIEFSQVIQK 463
             W VPYLP+DP+D+GR YEAVIRVNSQSGKGGV+YIM  D+G  LPRR+QIEFSQ IQK
Sbjct: 369 YPWAVPYLPVDPKDIGRNYEAVIRVNSQSGKGGVSYIMHHDYGFDLPRRMQIEFSQAIQK 428

Query: 464 IAEGTAGEGGEVSPKEMWDAFAEEYLAPVRPLERIRQHVDAADDDGGTTSITATVKINGV 523
               T   G EV+P+E+ D F  EY    +P  R+          G   ++ A V ++G 
Sbjct: 429 ---HTDHSGREVTPEEIHDVFRREYHPEYQPNARLAVGSCNTSTHGDEVTVEAVVTLDGQ 485

Query: 524 ETEISGSGNGPLAAFVHALADVGFDVAVLDYYEHAMSAGDDAQAAAYVEASVTIASPAQP 583
           E +I G GNGPL+AFV ALA+V   V V +Y +H++S G DA++AAYVE ++        
Sbjct: 486 EHKIIGDGNGPLSAFVSALAEVDVAVEVQEYVQHSLSTGSDAKSAAYVECAI-------- 537

Query: 584 GEAGRHASDPVTIASPAQPGEAGRHASDPVTSKTVWGVGIAPSITTASLRAVVSAVNR 641
           GE                              +TVWG G+  +IT A+LRA+ SAVNR
Sbjct: 538 GE------------------------------RTVWGAGVHSNITLATLRALASAVNR 565


Lambda     K      H
   0.317    0.133    0.399 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 1653
Number of extensions: 76
Number of successful extensions: 5
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 2
Number of HSP's successfully gapped: 2
Length of query: 644
Length of database: 956
Length adjustment: 41
Effective length of query: 603
Effective length of database: 915
Effective search space:   551745
Effective search space used:   551745
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 55 (25.8 bits)
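
The summary figures in the report above can be cross-checked by hand. The following Python sketch is an illustration only and is not part of GapMind. It assumes the standard Karlin-Altschul relations (bit score = (lambda * S - ln K) / ln 2, effective search space = effective query length * effective database length, and E = search space * 2^-bits) and uses only numbers printed in the report. All variable names are hypothetical.

```python
import math

# Values copied from the BLAST report above.
raw_score = 1798
gapped_lambda, gapped_k = 0.267, 0.0410
query_len, db_len, length_adjustment = 644, 956, 41
identities, positives, gaps, aln_len = 358, 433, 43, 598

# Bit score from the raw score (assumed Karlin-Altschul normalization,
# using the gapped lambda and K because this is a gapped alignment).
bit_score = (gapped_lambda * raw_score - math.log(gapped_k)) / math.log(2)

# Effective lengths and effective search space.
eff_query = query_len - length_adjustment
eff_db = db_len - length_adjustment
search_space = eff_query * eff_db

# Expected number of chance hits at this score; far too small to print
# as anything other than 0.0 in the report.
evalue = search_space * 2.0 ** (-bit_score)


def pct(count: int, total: int) -> int:
    """Whole-number percentage; integer division reproduces the values printed above."""
    return 100 * count // total


print(f"bit score ~ {bit_score:.1f}")
print(f"effective lengths: query {eff_query}, database {eff_db}")
print(f"effective search space: {search_space}")
print(f"identities {pct(identities, aln_len)}%, "
      f"positives {pct(positives, aln_len)}%, gaps {pct(gaps, aln_len)}%")
```

Because lambda and K are printed to only three significant figures, the recomputed bit score agrees with the reported 697 bits only to within about one bit.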

Align candidate WP_025273525.1 HALAL_RS19070 (2-isopropylmalate synthase)
to HMM TIGR00970 (leuA: 2-isopropylmalate synthase (EC 2.3.3.13))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.aa/TIGR00970.hmm
# target sequence database:        /tmp/gapView.1812010.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR00970  [M=564]
Accession:   TIGR00970
Description: leuA_yeast: 2-isopropylmalate synthase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                             Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                             -----------
   2.8e-255  834.1   0.0   3.4e-255  833.8   0.0    1.1  1  NCBI__GCF_000527155.1:WP_025273525.1  


Domain annotation for each sequence (and alignments):
>> NCBI__GCF_000527155.1:WP_025273525.1  
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  833.8   0.0  3.4e-255  3.4e-255       3     562 ..       9     566 ..       7     568 .. 0.97

  Alignments for each domain:
  == domain 1  score: 833.8 bits;  conditional E-value: 3.4e-255
                             TIGR00970   3 kkykpfk...aiklsnrkwpdkvitraprwlsvdlrdGnqalidpmsverkkryfkllvriGfkeievgfpsa 72 
                                            +y+pf+    ++l++r+wpd++i++ap w++vdlrdGnqalidpm+ erkk +f+ lv++G+keievgfp+a
  NCBI__GCF_000527155.1:WP_025273525.1   9 TRYRPFHeeiRVHLPDRTWPDRIIDHAPTWCAVDLRDGNQALIDPMDLERKKTMFQTLVDMGYKEIEVGFPAA 81 
                                           79****9877789************************************************************ PP

                             TIGR00970  73 sqtdfdfvreiieqglipddvtiqvltqsreelikrtvealsGakkaivhlynatsdlfrevvfrasreevla 145
                                           sqtdfdfvr +ie+  ip dv+iqvltq+r++likrt+e+l+Ga++aivh+yn+ts+l+r+vvf+a++  + +
  NCBI__GCF_000527155.1:WP_025273525.1  82 SQTDFDFVRALIEEERIPADVSIQVLTQCRDHLIKRTFESLRGARQAIVHFYNSTSILQRKVVFQADKPSITK 154
                                           ************************************************************************* PP

                             TIGR00970 146 lavegsklvrklvkdaaksketrwsfeyspesfsdtelefavevceavkeviepteerpiifnlpatvevatp 218
                                           +a++g++l  k + d   +++t+ ++eyspesf++tele+av++c+av evi+p+ +rp+i+nlpatve+atp
  NCBI__GCF_000527155.1:WP_025273525.1 155 IATDGAELCLKYAAD--ITPDTDIRYEYSPESFTGTELEYAVDICNAVAEVIDPSIDRPLILNLPATVEMATP 225
                                           *****7776666544..4689**************************************************** PP

                             TIGR00970 219 nvyadsieylstniaerekvilslhphndrGtavaaaelGllaGadrieGclfGnGertGnvdlvtlalnlyt 291
                                           n+yadsie++ +++a  e +ils+hphndrGt+vaaael + aGadr+eGclfGnGertGnv+lvtl++nl+t
  NCBI__GCF_000527155.1:WP_025273525.1 226 NIYADSIEWFKRHFARPEAMILSVHPHNDRGTGVAAAELAVQAGADRVEGCLFGNGERTGNVCLVTLGMNLFT 298
                                           ************************************************************************* PP

                             TIGR00970 292 qGvspnldfsdldeilrvvercnkipvherhpygGdlvvtafsGshqdaikkGldaldkkkaaa.....dtlw 359
                                           qGv+p++df d+d+i+r ve+cn++pv erhpy+Gdlv+tafsGshqdaikkGl+al+ ++++a     +  w
  NCBI__GCF_000527155.1:WP_025273525.1 299 QGVDPEIDFGDIDSIRRKVEYCNRLPVPERHPYAGDLVFTAFSGSHQDAIKKGLQALESEADQAgvaveQYPW 371
                                           **************************************************************9989997888* PP

                             TIGR00970 360 kvpylpldpkdvgreyeavirvnsqsGkGGvayvlktdlGldlprrlqiefssvvkdiadskGkelsskeisd 432
                                            vpylp+dpkd+gr+yeavirvnsqsGkGGv+y+++ d+G+dlprr+qiefs+++++++d  G+e++++ei d
  NCBI__GCF_000527155.1:WP_025273525.1 372 AVPYLPVDPKDIGRNYEAVIRVNSQSGKGGVSYIMHHDYGFDLPRRMQIEFSQAIQKHTDHSGREVTPEEIHD 444
                                           ************************************************************************* PP

                             TIGR00970 433 lfkeeyllnveqlerislvdyaveddGteskvitavvkikgekkdieGsGnGplsalvdaladllnvdvavad 505
                                           +f+ ey+ + ++  r+ + + + + +G ++ +++avv ++g++ +i G GnGplsa+v ala++  v v+v +
  NCBI__GCF_000527155.1:WP_025273525.1 445 VFRREYHPEYQPNARLAVGSCNTSTHG-DEVTVEAVVTLDGQEHKIIGDGNGPLSAFVSALAEVD-VAVEVQE 515
                                           ***************************.77889******************************98.******* PP

                             TIGR00970 506 ysehalgsGddakaasyvelsvrrasdaekatvwGvGiaedvtsaslravlsavnra 562
                                           y +h+l++G+dak+a+yve ++ + +      vwG G+++++t a+lra+ savnr+
  NCBI__GCF_000527155.1:WP_025273525.1 516 YVQHSLSTGSDAKSAAYVECAIGERT------VWGAGVHSNITLATLRALASAVNRV 566
                                           *******************9988776......9**********************97 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (564 nodes)
Target sequences:                          1  (956 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.00u 0.01s 00:00:00.01 Elapsed: 00:00:00.01
# Mc/sec: 47.07
//
[ok]
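
The domain row in the hmmsearch report above can be turned into coverage figures with a few lines of Python. This is a minimal sketch and not GapMind's own code. It assumes the whitespace-separated column order shown in the report header (#, flag, score, bias, c-Evalue, i-Evalue, hmmfrom, hmm to, "..", alifrom, ali to, ...). The variable names are hypothetical.

```python
# The single domain row from the report above, copied verbatim.
domain_row = (
    "   1 !  833.8   0.0  3.4e-255  3.4e-255       3     562 .."
    "       9     566 ..       7     568 .. 0.97"
)
model_len = 564    # from "Query: TIGR00970 [M=564]"
target_len = 956   # from "Target sequences: 1 (956 residues searched)"

fields = domain_row.split()
score = float(fields[2])
hmm_from, hmm_to = int(fields[6]), int(fields[7])
ali_from, ali_to = int(fields[9]), int(fields[10])

# Coordinates are 1-based and inclusive, hence the +1.
model_coverage = (hmm_to - hmm_from + 1) / model_len
target_coverage = (ali_to - ali_from + 1) / target_len

print(f"domain score: {score} bits")
print(f"model coverage:  {model_coverage:.1%} of {model_len} HMM nodes")
print(f"target coverage: {target_coverage:.1%} of {target_len} residues")
```

In this sketch the alignment spans nearly the whole model but only the first 566 residues of the 956-residue candidate, which matches the BLAST alignment ending near position 565 of the subject.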

This GapMind analysis is from Jul 25 2024. The underlying query database was built on Jul 25 2024.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."
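
As an illustration of the coverage threshold mentioned above, the sketch below computes how much of the characterized query (BRENDA::P9WQB3, 644 letters) is spanned by the BLAST alignment on this page (query positions 45 to 641). It is not GapMind's implementation. Measuring coverage against the characterized protein is an assumption, and only the 50% figure stated in the text is encoded. The criteria for high and medium confidence are not listed on this page and are not modelled.

```python
def coverage(start: int, end: int, length: int) -> float:
    """Fraction of a sequence spanned by an alignment (1-based, inclusive)."""
    return (end - start + 1) / length


# Threshold taken from the sentence above; the name is hypothetical.
LOW_CONFIDENCE_MIN_COVERAGE = 0.5

# Query coordinates and length from the BLAST alignment on this page.
query_cov = coverage(45, 641, 644)
passes = query_cov >= LOW_CONFIDENCE_MIN_COVERAGE

print(f"query coverage: {query_cov:.1%}")
print(f"meets the 50% coverage threshold: {passes}")
```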

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory