GapMind for Amino acid biosynthesis

 

Alignments for a candidate for asp-kinase in Thermophagus xiamenensis HS1

Align aspartate kinase (EC 2.7.2.4) (characterized)
to candidate WP_010528812.1 GQW_RS0115695 aspartate kinase

Query= BRENDA::O23653
         (544 letters)



>NCBI__GCF_000220155.1:WP_010528812.1
          Length = 436

 Score =  234 bits (596), Expect = 7e-66
 Identities = 161/464 (34%), Positives = 263/464 (56%), Gaps = 38/464 (8%)

Query: 85  VMKFGGSSVESAERMKEVANLILSFPDERPVIVLSAMGKTTNKLLKAGE---KAVTCGVT 141
           V+KFGG+SV SAERM+ VA L+ S   ER ++VLSAM  TTN L++      K    G  
Sbjct: 3   VLKFGGTSVGSAERMRTVAGLVTS--PERKIVVLSAMAGTTNSLVEITNYLYKKNYDGAN 60

Query: 142 NVESIEELSFIKELH-LRTAHEL---GVETTVIEKHLEGLHQLLKGISMMKELTLRTRDY 197
            V +  E  +I  +H L T+ E    G+E  V++ H + +    K +      T+     
Sbjct: 61  EVINRLEKGYIDTVHELFTSREYQSKGLE--VVKSHFDYIRSFTKDV-----FTVFEEKS 113

Query: 198 LVSFGECMSTRLFSAYLNKIGHKARQYDAFEIGFITTDDFTNADILEATYPAVSKTLVGD 257
           +++ GE ++T LF+ +L + G ++    A +  F+ T+     D +      +++ L G 
Sbjct: 114 ILAQGELITTALFNFFLQENGTESVLLPALD--FMRTNKSNEPDTVYIR-ENLNRILKGY 170

Query: 258 WSKENAVPVVTGYLGKGWRSCAITTLGRGGSDLTATTIGKALGLREIQVWKDVDGVLTCD 317
             K+  + +  GY+ +      +  L RGGSD +A+ IG A+   EIQ+W D+DG+   D
Sbjct: 171 SDKQ--LFITQGYICRNAFG-EVDNLQRGGSDYSASLIGAAVNAEEIQIWTDIDGMHNND 227

Query: 318 PNIYPGAQSVPYLTFDEAAELAYFGAQVLHPLSMRPARDGDIPVRVKNSYNPTAPGTVIT 377
           P      +S+  L+FDEAAELAYFGA++LHP S+ PA+  +IPVR+ N+ +P APGT+I+
Sbjct: 228 PRYVENTKSIAELSFDEAAELAYFGAKILHPTSVLPAKLANIPVRLLNTMDPQAPGTIIS 287

Query: 378 RSRDMSKAVLTSIVLKRNVTMLDIASTRMLGQYGFLAKVFTTFEDLGISVDVVATSEVSI 437
            S+   K  LT++  K N+T + I S RML  YGFL KVF  FE     +D++ TSEV +
Sbjct: 288 SSQ--HKGRLTAVAAKDNITAIKIKSGRMLLAYGFLRKVFEIFESYKTPIDMITTSEVGV 345

Query: 438 SLTLDPAKLWGRELIQRVNELDNLVEELEKIAVVKLLQRRSIISLIGN-VQKSSLILEKV 496
           S+T+D  +            L+ +V++L K + V++ + + II ++G+ V ++     ++
Sbjct: 346 SVTIDNDR-----------HLEEIVDDLRKFSTVEVDRDQVIICIVGDLVAENKGYANRI 394

Query: 497 FQVFRSNGVNVQMISQGASKVNISLIVNDEEAEQCVRALHSAFF 540
           F+  +   + ++MIS G S+ NISL+V+ ++  + +RAL +  F
Sbjct: 395 FEALKD--IPIRMISYGGSEHNISLLVSSKDKVRALRALSAKLF 436


Lambda     K      H
   0.318    0.133    0.377 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 441
Number of extensions: 20
Number of successful extensions: 7
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 544
Length of database: 436
Length adjustment: 34
Effective length of query: 510
Effective length of database: 402
Effective search space:   205020
Effective search space used:   205020
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 52 (24.6 bits)
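
The reported bit score and E-value can be cross-checked against the statistics in this block. The sketch below assumes the standard Karlin-Altschul relations, bits = (lambda * S - ln K) / ln 2 and E = K * m * n * exp(-lambda * S), using the gapped Lambda and K and the length adjustment listed above. The function names are illustrative and are not part of BLAST or GapMind.

```python
import math


def effective_search_space(query_len, db_len, length_adjustment):
    """Product of the effective query and database lengths."""
    return (query_len - length_adjustment) * (db_len - length_adjustment)


def bit_score(raw_score, lam, k):
    """Convert a raw alignment score to bits (Karlin-Altschul)."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def expect_value(raw_score, lam, k, search_space):
    """Expected number of chance hits scoring at least raw_score."""
    return k * search_space * math.exp(-lam * raw_score)


# Values copied from the report above (gapped Lambda and K, raw score 596).
RAW_SCORE = 596
GAPPED_LAMBDA = 0.267
GAPPED_K = 0.0410

space = effective_search_space(query_len=544, db_len=436, length_adjustment=34)
bits = bit_score(RAW_SCORE, GAPPED_LAMBDA, GAPPED_K)
evalue = expect_value(RAW_SCORE, GAPPED_LAMBDA, GAPPED_K, space)

print(space)
print(round(bits))
print(f"{evalue:.1e}")
```

Because the report prints Lambda and K to only three significant figures, the recomputed E-value agrees with the reported 7e-66 only approximately; the search space (205020) and the rounded bit score (234) match exactly.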

Align candidate WP_010528812.1 GQW_RS0115695 (aspartate kinase)
to HMM TIGR00657 (aspartate kinase (EC 2.7.2.4))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.aa/TIGR00657.hmm
# target sequence database:        /tmp/gapView.2012228.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR00657  [M=442]
Accession:   TIGR00657
Description: asp_kinases: aspartate kinase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                             Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                             -----------
    1.5e-97  313.4   0.1    1.6e-97  313.3   0.1    1.0  1  NCBI__GCF_000220155.1:WP_010528812.1  


Domain annotation for each sequence (and alignments):
>> NCBI__GCF_000220155.1:WP_010528812.1  
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  313.3   0.1   1.6e-97   1.6e-97       5     441 ..       3     436 .]       1     436 [] 0.94

  Alignments for each domain:
  == domain 1  score: 313.3 bits;  conditional E-value: 1.6e-97
                             TIGR00657   5 VqKFGGtSvgnverikkvakivkkekekgnqvvVVvSAmagvTdaLvelaekvsseee...keliekirekhl 74 
                                           V+KFGGtSvg++er++ va +v++ +      +VV+SAmag+T++Lve+ + + +++    +e+i+++++  +
  NCBI__GCF_000220155.1:WP_010528812.1   3 VLKFGGTSVGSAERMRTVAGLVTSPE----RKIVVLSAMAGTTNSLVEITNYLYKKNYdgaNEVINRLEKGYI 71 
                                           89**********************99....569******************9999998899999********* PP

                             TIGR00657  75 ealeela.sqalkeklkallekeleevkk.......ereldlilsvGEklSaallaaaleelgvkavsllgae 139
                                           ++++el  s + + k  ++++++++ +++         e+  il+ GE +++al+   l+e g  +  ll a 
  NCBI__GCF_000220155.1:WP_010528812.1  72 DTVHELFtSREYQSKGLEVVKSHFDYIRSftkdvftVFEEKSILAQGELITTALFNFFLQENG-TESVLLPAL 143
                                           *****9999999*99999999999999999999998778889*********************.555577778 PP

                             TIGR00657 140 agiltdsefgrAkvleeikterleklleegiivvvaGFiGatekgeittLGRGGSDltAallAaalkAdevei 212
                                             ++t+++ +  +v + ++ +r++k  + +++ +++G+i  +  ge+  L RGGSD++A+l++aa++A+e++i
  NCBI__GCF_000220155.1:WP_010528812.1 144 DFMRTNKSNEPDTVYIRENLNRILKGYSDKQLFITQGYICRNAFGEVDNLQRGGSDYSASLIGAAVNAEEIQI 216
                                           88999999999898999999***************************************************** PP

                             TIGR00657 213 ytDVdGiytaDPrivpeArrldeisyeEalELaslGakvLhprtlepamrakipivvkstfnpeaeGTlivak 285
                                           +tD+dG +  DPr+v++++ ++e+s++Ea+ELa++Gak+Lhp ++ pa+ a+ip++  +t++p+a+GT+i ++
  NCBI__GCF_000220155.1:WP_010528812.1 217 WTDIDGMHNNDPRYVENTKSIAELSFDEAAELAYFGAKILHPTSVLPAKLANIPVRLLNTMDPQAPGTIISSS 289
                                           ***********************************************************************99 PP

                             TIGR00657 286 skseeepavkalsldknqalvsvsgttmk..pgilaevfgalaeakvnvdlilqsssetsisfvvdkedadka 356
                                              +++ +++a++ ++n + +++++  m   +g+l +vf++ ++ k  +d+i+  +se ++s ++d++   + 
  NCBI__GCF_000220155.1:WP_010528812.1 290 ---QHKGRLTAVAAKDNITAIKIKSGRMLlaYGFLRKVFEIFESYKTPIDMIT--TSEVGVSVTIDNDR--HL 355
                                           ...45689*****************999999**********************..88888999998775..23 PP

                             TIGR00657 357 kellkkkvkeekaleevevekklalvslvGagmksapgvaakifeaLaeeniniemis..sseikisvvvdek 427
                                           +e+    v++++++++vev+++  ++ +vG+ +++++g a +ifeaL++  i+i+mis   se++is++v++k
  NCBI__GCF_000220155.1:WP_010528812.1 356 EEI----VDDLRKFSTVEVDRDQVIICIVGDLVAENKGYANRIFEALKD--IPIRMISygGSEHNISLLVSSK 422
                                           333....567899***********************************9..********9************* PP

                             TIGR00657 428 daekavealheklv 441
                                           d ++a++al +kl+
  NCBI__GCF_000220155.1:WP_010528812.1 423 DKVRALRALSAKLF 436
                                           *********99986 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (442 nodes)
Target sequences:                          1  (436 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.00u 0.01s 00:00:00.01 Elapsed: 00:00:00.01
# Mc/sec: 17.15
//
[ok]
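
The fraction of the HMM and of the candidate covered by the domain can be read off the domain table above. The following is a minimal sketch that splits the single domain row on whitespace; it assumes the column order shown in the table header (score, bias, c-Evalue, i-Evalue, hmmfrom, hmm to, alifrom, ali to, ...), and the function name `parse_domain_row` is hypothetical rather than part of HMMER.

```python
def parse_domain_row(row):
    """Parse one row of the hmmsearch per-domain table into a dict."""
    f = row.split()
    return {
        "score": float(f[2]),
        "i_evalue": float(f[5]),
        "hmm_from": int(f[6]),
        "hmm_to": int(f[7]),
        "ali_from": int(f[9]),
        "ali_to": int(f[10]),
    }


# Domain row copied verbatim from the output above.
ROW = ("   1 !  313.3   0.1   1.6e-97   1.6e-97       5     441 .."
       "       3     436 .]       1     436 [] 0.94")
HMM_LENGTH = 442   # from "Query: TIGR00657 [M=442]"
SEQ_LENGTH = 436   # from "436 residues searched"

dom = parse_domain_row(ROW)
hmm_coverage = (dom["hmm_to"] - dom["hmm_from"] + 1) / HMM_LENGTH
seq_coverage = (dom["ali_to"] - dom["ali_from"] + 1) / SEQ_LENGTH

print(f"HMM coverage: {hmm_coverage:.3f}")
print(f"Sequence coverage: {seq_coverage:.3f}")
```

For routine use, the machine-readable tables that hmmsearch can write are a more robust source than the human-readable report parsed here.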

This GapMind analysis is from Jul 26 2024. The underlying query database was built on Jul 25 2024.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMER with enzyme models (usually from TIGRFAMs). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."
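
As a rough illustration of the coverage threshold, the sketch below computes coverage and percent identity for the BLAST alignment shown earlier on this page. It assumes that coverage means the fraction of the characterized query spanned by the alignment (query positions 85 to 540 of 544); the helper names and the constant are illustrative, not GapMind's own code.

```python
def alignment_coverage(start, end, total_length):
    """Fraction of a sequence spanned by an alignment (1-based, inclusive)."""
    return (end - start + 1) / total_length


def percent_identity(identical, alignment_length):
    """Identical positions as a percentage of alignment columns."""
    return 100.0 * identical / alignment_length


# Assumed reading of the 50% coverage floor stated above.
LOW_CONFIDENCE_MIN_COVERAGE = 0.5

# Numbers taken from the BLAST report on this page.
query_coverage = alignment_coverage(start=85, end=540, total_length=544)
identity = percent_identity(identical=161, alignment_length=464)

print(f"coverage={query_coverage:.3f} identity={identity:.1f}%")
print(query_coverage >= LOW_CONFIDENCE_MIN_COVERAGE)
```

This alignment clears the 50% floor. Whether the candidate rates as medium or high confidence depends on the additional criteria, which this sketch does not model.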

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
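
The definition of a gap given above (a step with no high- or medium-confidence candidate) can be sketched as a simple filter. The step names other than asp-kinase and all confidence labels in the example are hypothetical placeholders, not GapMind results, and `find_gaps` is an illustrative helper rather than GapMind's implementation.

```python
def find_gaps(step_candidates):
    """Return the steps that have no high- or medium-confidence candidate.

    step_candidates maps a step name to the list of confidence labels
    ("high", "medium", or "low") of its candidates.
    """
    gaps = []
    for step, confidences in step_candidates.items():
        if not any(c in ("high", "medium") for c in confidences):
            gaps.append(step)
    return gaps


# Hypothetical input: labels are made up for illustration only.
example = {
    "asp-kinase": ["high"],
    "step_b": ["low", "low"],
    "step_c": [],
}

print(find_gaps(example))
```

A step with only low-confidence candidates, or with no candidates at all, is reported as a gap in this sketch.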

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory