GapMind for Amino acid biosynthesis

 

Alignments for a candidate for asp-kinase in Chlorobaculum parvum NCIB 8327

Align aspartate kinase (EC 2.7.2.4) (characterized)
to candidate WP_041466196.1 CPAR_RS10150 lysine-sensitive aspartokinase 3

Query= BRENDA::Q57991
         (473 letters)



>NCBI__GCF_000020505.1:WP_041466196.1
          Length = 470

 Score =  308 bits (788), Expect = 3e-88
 Identities = 191/474 (40%), Positives = 281/474 (59%), Gaps = 32/474 (6%)

Query: 4   VMKFGGTSVGSGERIRHVAKIVTKRKKEDDDVVVVVSAMSEVTNALVEISQQALDVRDIA 63
           VMKFGGTSVG+   +R V   + ++KK    +VV+ SA S +TN L++I+ +A   R + 
Sbjct: 3   VMKFGGTSVGTAAAMRQVIANIAEKKKTSAPLVVL-SACSGITNKLIQIADEAGSGR-LK 60

Query: 64  KVGDFIKFIREKHYKAIEEAIKSEEIKEEVKKIIDSRIEELEKVLIGVAYLGELTPKSRD 123
           +    +  +R+ H   I E I +EE++  V + I   +  LE++  G+  +GELT +SRD
Sbjct: 61  EALKLVGEVRQFHLDLIGELIGNEELRAAVIEKIGVYLTRLERLTEGIEIVGELTERSRD 120

Query: 124 YILSFGERLSSPILSGAIRDLGEKSIALEGGEAGIITDNNFGSAR----VKRLEVKERLL 179
              SFGE LS+ + + A+ + G     L+     +ITD+ +G AR      R    E + 
Sbjct: 121 RFCSFGELLSTSVFAAALNEAGVPCEWLDVRTV-MITDDRYGFARPLAETCRKNTTEIIK 179

Query: 180 PLLKEGIIPVVTGFIGTTEEGYITTLGRGGSDYSAALIGYGLDADIIEIWTDVSGVYTTD 239
           PLL  G + V  G+IG+TE G  TTLGRGGSD SAAL G  L ++ IEIWTDV GV TTD
Sbjct: 180 PLLDAGTVVVTQGYIGSTESGRTTTLGRGGSDLSAALFGAWLHSESIEIWTDVDGVMTTD 239

Query: 240 PRLVPTARRIPKLSYIEAMELAYFGAKVLHPRTIEPAMEKGIPILVKNTFEPESEGTLIT 299
           PR+VP AR I  +++ EA ELAY GAKVLHP TI PA+EK IP+ V NT+ P+S+GTLIT
Sbjct: 240 PRMVPEARSIRVMTFSEAAELAYLGAKVLHPDTIAPAVEKNIPVFVLNTWHPDSKGTLIT 299

Query: 300 NDMEM-----SDSIVKAISTIKNVALINIFGAGMVGVSGTAARIFKALGEEEVNVILISQ 354
           ND E+        +VK+I+  K  A++NI    M G  G  + +F       ++V +IS 
Sbjct: 300 NDPELLAGKSHGGLVKSIAVKKGQAILNIRSNRMFGRHGFMSELFDVFERFAISVEMIS- 358

Query: 355 GSSETNISLVVSEEDVDKA-LKALKREFGDFGKKSFLNNNLIRDVSVDKDVCVISVVGAG 413
            +SE ++SL V +  V +  +KAL               + + +V ++  V  +SVVG  
Sbjct: 359 -TSEVSVSLTVDDGSVGETFIKAL---------------SSLGEVEIEHKVATVSVVGDN 402

Query: 414 MRGAKGIAGKIFTAVSESGANIKMIAQGSSEVNISFVIDEKDLLNCVRKLHEKF 467
           +R ++G+AG+IF ++     N++MI+QG+SE+N+  V+DE D+   V  LH +F
Sbjct: 403 LRMSRGVAGRIFNSL--RNVNLRMISQGASEINVGVVVDESDVAPAVAALHCEF 454


Lambda     K      H
   0.316    0.135    0.364 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 523
Number of extensions: 23
Number of successful extensions: 8
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 473
Length of database: 470
Length adjustment: 33
Effective length of query: 440
Effective length of database: 437
Effective search space:   192280
Effective search space used:   192280
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.6 bits)
S2: 51 (24.3 bits)
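
The summary figures in this report can be cross-checked with a few lines of arithmetic. The sketch below assumes the standard Karlin-Altschul relations (bit score = (lambda * S - ln K) / ln 2, and E = effective search space * 2^-bits) and uses the gapped Lambda and K values printed above. Because the report rounds those parameters, the recomputed values should only be expected to agree approximately. All function names are illustrative.

```python
import math

# Values copied from the BLAST report above.
query_len = 473          # "Length of query"
subject_len = 470        # "Length of database"
length_adjustment = 33   # "Length adjustment"
raw_score = 788          # raw score in "Score = 308 bits (788)"
gapped_lambda = 0.267    # gapped Lambda (rounded in the report)
gapped_k = 0.0410        # gapped K (rounded in the report)

def effective_search_space(m, n, adjustment):
    """Product of the two lengths after subtracting the length adjustment."""
    return (m - adjustment) * (n - adjustment)

def bit_score(raw, lam, k):
    """Normalize a raw alignment score to bits (Karlin-Altschul)."""
    return (lam * raw - math.log(k)) / math.log(2)

def expect_value(search_space, bits):
    """Expected number of chance hits scoring at least this many bits."""
    return search_space * 2.0 ** (-bits)

space = effective_search_space(query_len, subject_len, length_adjustment)
bits = bit_score(raw_score, gapped_lambda, gapped_k)
evalue = expect_value(space, bits)

print("effective search space:", space)
print("bit score (approx.):", round(bits, 1))
print("E-value (approx.):", f"{evalue:.1e}")
```

The effective search space is exact integer arithmetic, (473 - 33) * (470 - 33), and matches the 192280 reported above; the bit score and E-value depend on the rounded parameters and are approximate.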

Align candidate WP_041466196.1 CPAR_RS10150 (lysine-sensitive aspartokinase 3)
to HMM TIGR00657 (aspartate kinase (EC 2.7.2.4))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.aa/TIGR00657.hmm
# target sequence database:        /tmp/gapView.6914.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR00657  [M=442]
Accession:   TIGR00657
Description: asp_kinases: aspartate kinase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                                 Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                                 -----------
   3.2e-122  394.7   1.4   3.6e-122  394.5   1.4    1.0  1  lcl|NCBI__GCF_000020505.1:WP_041466196.1  CPAR_RS10150 lysine-sensitive as


Domain annotation for each sequence (and alignments):
>> lcl|NCBI__GCF_000020505.1:WP_041466196.1  CPAR_RS10150 lysine-sensitive aspartokinase 3
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  394.5   1.4  3.6e-122  3.6e-122       4     441 ..       2     455 ..       1     456 [. 0.95

  Alignments for each domain:
  == domain 1  score: 394.5 bits;  conditional E-value: 3.6e-122
                                 TIGR00657   4 iVqKFGGtSvgnverikkvakivkkekekgnqvvVVvSAmagvTdaLvelaekvsseee...kelieki 69 
                                               +V+KFGGtSvg++  +++v   + ++k k   ++VV+SA +g+T++L+++a+++ s++     +l+ ++
  lcl|NCBI__GCF_000020505.1:WP_041466196.1   2 VVMKFGGTSVGTAAAMRQVIANIAEKK-KTSAPLVVLSACSGITNKLIQIADEAGSGRLkeaLKLVGEV 69 
                                               9******************99998888.5559*************************996666677899 PP

                                 TIGR00657  70 rekhlealeela.sqalkeklkallekeleevkk............ereldlilsvGEklSaallaaal 125
                                               r+ hl+ + el+ +++l++ + + +   l+ +++            er++d+  s+GE lS++++aaal
  lcl|NCBI__GCF_000020505.1:WP_041466196.1  70 RQFHLDLIGELIgNEELRAAVIEKIGVYLTRLERltegieivgeltERSRDRFCSFGELLSTSVFAAAL 138
                                               **************999999999998888888888999******************************* PP

                                 TIGR00657 126 eelgvkavsllgaeagiltdsefgrAk....vleeikterleklleegiivvvaGFiGatekgeittLG 190
                                               +e g    ++l+ + +++td+++g A+    + ++ +te +++ll++g++vv++G+iG+te+g++ttLG
  lcl|NCBI__GCF_000020505.1:WP_041466196.1 139 NEAG-VPCEWLDVRTVMITDDRYGFARplaeTCRKNTTEIIKPLLDAGTVVVTQGYIGSTESGRTTTLG 206
                                               ****.99********************99988899999******************************* PP

                                 TIGR00657 191 RGGSDltAallAaalkAdeveiytDVdGiytaDPrivpeArrldeisyeEalELaslGakvLhprtlep 259
                                               RGGSDl+Aal +a l+ + +ei+tDVdG++t+DPr+vpeAr +  +++ Ea+ELa+lGakvLhp+t+ p
  lcl|NCBI__GCF_000020505.1:WP_041466196.1 207 RGGSDLSAALFGAWLHSESIEIWTDVDGVMTTDPRMVPEARSIRVMTFSEAAELAYLGAKVLHPDTIAP 275
                                               ********************************************************************* PP

                                 TIGR00657 260 amrakipivvkstfnpeaeGTlivaksk....seeepavkalsldknqalvsvsgttmk..pgilaevf 322
                                               a++++ip++v +t+ p+++GTli+++ +    +++   vk+++++k qa+++++++ m    g+++e+f
  lcl|NCBI__GCF_000020505.1:WP_041466196.1 276 AVEKNIPVFVLNTWHPDSKGTLITNDPEllagKSHGGLVKSIAVKKGQAILNIRSNRMFgrHGFMSELF 344
                                               ********************************9999**********************999******** PP

                                 TIGR00657 323 galaeakvnvdlilqsssetsisfvvdkedadkakellkkkvkeekaleevevekklalvslvGagmks 391
                                                  ++  ++v++i+  +se s+s++vd+ ++ ++       +k+++ l+eve+e+k+a+vs+vG++++ 
  lcl|NCBI__GCF_000020505.1:WP_041466196.1 345 DVFERFAISVEMIS--TSEVSVSLTVDDGSVGETF------IKALSSLGEVEIEHKVATVSVVGDNLRM 405
                                               **************..88889*****988765443......457899********************** PP

                                 TIGR00657 392 apgvaakifeaLaeeniniemis..sseikisvvvdekdaekavealheklv 441
                                                 gva++if+ L++  +n +mis  +sei++ vvvde+d+  av alh +++
  lcl|NCBI__GCF_000020505.1:WP_041466196.1 406 SRGVAGRIFNSLRN--VNLRMISqgASEINVGVVVDESDVAPAVAALHCEFF 455
                                               ************98..9******99***********************9997 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (442 nodes)
Target sequences:                          1  (470 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.02u 0.01s 00:00:00.03 Elapsed: 00:00:00.02
# Mc/sec: 8.89
//
[ok]
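
As a rough illustration of how the domain table can be read, the sketch below splits the single domain row on whitespace and computes what fraction of the 442-node model and of the 470-residue candidate the alignment spans. The column positions follow the header row of the table above. How GapMind itself computes coverage is not stated on this page, so the formula here is an assumption, and the variable names are illustrative.

```python
# Domain row copied verbatim from the hmmsearch domain table above.
domain_row = ("   1 !  394.5   1.4  3.6e-122  3.6e-122       4     441 .."
              "       2     455 ..       1     456 [. 0.95")

model_length = 442    # from "Query: TIGR00657 [M=442]"
target_length = 470   # from "Target sequences: 1 (470 residues searched)"

fields = domain_row.split()
# Columns: #, flag, score, bias, c-Evalue, i-Evalue,
#          hmmfrom, hmm to, "..", alifrom, ali to, "..", envfrom, env to, "[.", acc
hmm_from, hmm_to = int(fields[6]), int(fields[7])
ali_from, ali_to = int(fields[9]), int(fields[10])

# Assumed definition: aligned span (inclusive) divided by total length.
model_coverage = (hmm_to - hmm_from + 1) / model_length
target_coverage = (ali_to - ali_from + 1) / target_length

print(f"model coverage:  {model_coverage:.3f}")
print(f"target coverage: {target_coverage:.3f}")
```

Here the alignment spans model positions 4 to 441, so nearly the whole model is covered.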

This GapMind analysis is from Apr 10 2024. The underlying query database was built on Apr 09 2024.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."
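
The identity and coverage figures that these confidence levels refer to can be derived from a pairwise alignment such as the one shown earlier on this page. The sketch below is a minimal example under two assumptions: that coverage is measured on the characterized (query) protein, and that identity is taken over the full alignment length including gaps, as BLAST reports it. The exact definitions GapMind uses are not given here.

```python
import re

# Copied from the BLAST report on this page.
identities_line = (" Identities = 191/474 (40%), Positives = 281/474 (59%), "
                   "Gaps = 32/474 (6%)")
query_start, query_end = 4, 467   # first and last "Query:" coordinates
query_length = 473                # "(473 letters)"

match = re.search(r"Identities = (\d+)/(\d+)", identities_line)
identical, alignment_length = int(match.group(1)), int(match.group(2))

# Fraction of aligned columns that are identical (gaps count as columns).
identity = identical / alignment_length
# Assumed definition: fraction of the characterized protein that is aligned.
coverage = (query_end - query_start + 1) / query_length

print(f"identity: {identity:.1%}, coverage: {coverage:.1%}")
```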

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory