GapMind for Amino acid biosynthesis

 

Alignments for a candidate for lysA in Enterococcus termitis LMG 8895

Align diaminopimelate decarboxylase subunit (EC 4.1.1.20) (characterized)
to candidate WP_069662873.1 BCR25_RS06840 diaminopimelate decarboxylase

Query= metacyc::MONOMER-6601
         (439 letters)



>NCBI__GCF_001730305.1:WP_069662873.1
          Length = 433

 Score =  504 bits (1298), Expect = e-147
 Identities = 244/428 (57%), Positives = 310/428 (72%)

Query: 3   LHGTSRQNQHGHLEIGGVDALYLAEKYGTPLYVYDVALIRERAKSFKQAFISAGLKAQVA 62
           L GT+  N+  HL IGG D + LAEKYGTPL++YDVA IRERA+ FKQ   S G+K +V 
Sbjct: 4   LFGTATFNESEHLTIGGCDTVTLAEKYGTPLFIYDVAHIRERARGFKQTLNSLGVKNKVI 63

Query: 63  YASKAFSSVAMIQLAEEEGLSLDVVSGGELYTAVAAGFPAERIHFHGNNKSREELRMALE 122
           YASKAF  +AM +L EEE L  DVVS GE+YTA+ AG   E I FHGNNK++EEL  A+E
Sbjct: 64  YASKAFCCLAMYKLLEEEELGCDVVSAGEIYTAIKAGMSPENIEFHGNNKTKEELLYAVE 123

Query: 123 HRIGCIVVDNFYEIALLEDLCKETGHSIDVLLRITPGVEAHTHDYITTGQEDSKFGFDLH 182
             +G I++DNFYEI LL  + KE      VL RITPG+ A THDYI TGQ DSKFGFD++
Sbjct: 124 QGVGTIIIDNFYEIELLSAILKEKNQKQHVLFRITPGINAETHDYILTGQVDSKFGFDVN 183

Query: 183 NGQTERAIEQVLQSEHIQLLGVHCHIGSQIFDTAGFVLAAEKIFKKLDEWRDSYSFVSKV 242
           +GQ  +A+E++L  +H+ L GVHCHIGSQIF   GF+ A EK+   L+EW+ ++ +   V
Sbjct: 184 SGQATQALERILADDHLVLKGVHCHIGSQIFSAEGFLAAVEKMLTILNEWKQAFGYSVDV 243

Query: 243 LNLGGGFGIRYTEDDEPLHATEYVEKIIEAVKENASRYGFDIPEIWIEPGRSLVGDAGTT 302
           LN+GGGFG++YTE D+PL    +V+ I+ +VK   +   +  PEIW+EPGRS++ +AGTT
Sbjct: 244 LNMGGGFGVQYTEADDPLEPEAFVKAIVNSVKGQCALLDYAFPEIWLEPGRSIIAEAGTT 303

Query: 303 LYTVGSQKEVPGVRQYVAVDGGMNDNIRPALYQAKYEAAAANRIGEAHDKTVSIAGKCCE 362
           +YTVGS+K +P VR YV+VDGGM DNIRPALY AKY+   ANRI        +I GK CE
Sbjct: 304 IYTVGSEKVIPEVRHYVSVDGGMGDNIRPALYGAKYDGFLANRISNEQQSPKTIVGKYCE 363

Query: 363 SGDMLIWDIDLPEVKEGDLLAVFCTGAYGYSMANNYNRIPRPAVVFVENGEAHLVVKRET 422
           SGD+LI DI+LP ++  DL AV  TGAYGYSMANNYNR  +PAVVFVE+G   L ++RET
Sbjct: 364 SGDVLIKDIELPTLRPEDLFAVTSTGAYGYSMANNYNRNLKPAVVFVEDGHEKLAIRRET 423

Query: 423 YEDIVKLD 430
           YED++ LD
Sbjct: 424 YEDLISLD 431
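
The percentages in the alignment header can be reproduced from the reported counts and coordinates. The following is a minimal Python sketch; the helper name span_coverage and the choice to define coverage as aligned span divided by full sequence length are assumptions made for illustration, not taken from GapMind's code.

```python
def span_coverage(start: int, end: int, length: int) -> float:
    """Fraction of a sequence covered by a 1-based, inclusive alignment span."""
    return (end - start + 1) / length

# Counts from the "Identities" / "Positives" line of the alignment above.
identities, positives, aligned_columns = 244, 310, 428

pct_identity = 100 * identities / aligned_columns
pct_positive = 100 * positives / aligned_columns

# Coordinates from the first and last Query/Sbjct lines, and the two lengths.
query_cov = span_coverage(3, 430, 439)     # characterized protein
subject_cov = span_coverage(4, 431, 433)   # candidate WP_069662873.1

print(f"identity {pct_identity:.1f}%, positives {pct_positive:.1f}%")
print(f"query coverage {query_cov:.3f}, candidate coverage {subject_cov:.3f}")
```

Both spans are 428 residues long, which matches the 428 aligned columns and indicates that this alignment contains no gaps.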


Lambda     K      H
   0.319    0.137    0.401 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 575
Number of extensions: 23
Number of successful extensions: 1
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 439
Length of database: 433
Length adjustment: 32
Effective length of query: 407
Effective length of database: 401
Effective search space:   163207
Effective search space used:   163207
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 51 (24.3 bits)
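
The search-space and score figures above are related by the standard Karlin-Altschul formulas: bit score = (lambda * S - ln K) / ln 2 and E = (effective search space) * 2^(-bit score). The sketch below recomputes them in Python from the printed values. It assumes the gapped lambda and K apply to this alignment, and because those parameters are printed with only a few significant digits, the results should agree with the report only approximately.

```python
import math

# Lengths and length adjustment as printed in the statistics block.
query_len, db_len, length_adjustment = 439, 433, 32
eff_query = query_len - length_adjustment
eff_db = db_len - length_adjustment
search_space = eff_query * eff_db

# Gapped Karlin-Altschul parameters (rounded as printed) and the raw score.
lam, K = 0.267, 0.0410
raw_score = 1298

bit_score = (lam * raw_score - math.log(K)) / math.log(2)
# Work in log10 so that very small E-values stay readable.
log10_evalue = math.log10(search_space) - bit_score * math.log10(2)

print(f"effective lengths: {eff_query} x {eff_db} = {search_space}")
print(f"bit score ~ {bit_score:.1f}, log10(E) ~ {log10_evalue:.1f}")
```

The product of the effective lengths reproduces the reported effective search space of 163207, and the recomputed bit score and E-value land close to the reported 504 bits and e-147.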

Align candidate WP_069662873.1 BCR25_RS06840 (diaminopimelate decarboxylase)
to HMM TIGR01048 (lysA: diaminopimelate decarboxylase (EC 4.1.1.20))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.aa/TIGR01048.hmm
# target sequence database:        /tmp/gapView.3740274.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR01048  [M=417]
Accession:   TIGR01048
Description: lysA: diaminopimelate decarboxylase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                             Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                             -----------
   8.6e-149  481.5   0.0   9.8e-149  481.3   0.0    1.0  1  NCBI__GCF_001730305.1:WP_069662873.1  


Domain annotation for each sequence (and alignments):
>> NCBI__GCF_001730305.1:WP_069662873.1  
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  481.3   0.0  9.8e-149  9.8e-149       4     416 ..      11     431 ..       8     432 .. 0.98

  Alignments for each domain:
  == domain 1  score: 481.3 bits;  conditional E-value: 9.8e-149
                             TIGR01048   4 kkdgeleiegvdlkelaeefgtPlYvydeetlrerlealkeafka..eeslvlYAvKAnsnlavlrllaeeGl 74 
                                           +++ +l+i+g+d  +lae++gtPl++yd +++rer++ +k+  ++   +++v+YA+KA+ +la+ +ll+ee l
  NCBI__GCF_001730305.1:WP_069662873.1  11 NESEHLTIGGCDTVTLAEKYGTPLFIYDVAHIRERARGFKQTLNSlgVKNKVIYASKAFCCLAMYKLLEEEEL 83 
                                           67889**************************************9986667*********************** PP

                             TIGR01048  75 gldvvsgGEleralaAgvkaekivfsgngkseeeleaaleleiklinvdsveelelleeiakelgkkarvllR 147
                                           g dvvs GE+++a +Ag+++e+i f+gn+k++eel  a+e ++ +i++d++ e+ell++i ke ++k++vl+R
  NCBI__GCF_001730305.1:WP_069662873.1  84 GCDVVSAGEIYTAIKAGMSPENIEFHGNNKTKEELLYAVEQGVGTIIIDNFYEIELLSAILKEKNQKQHVLFR 156
                                           ************************************************************************* PP

                             TIGR01048 148 vnpdvdaktheyisTGlkesKFGieve..eaeeayelalkleslelvGihvHIGSqildlepfveaaekvvkl 218
                                           ++p+++a+th+yi TG+ +sKFG++v+  +a++a e++l++++l l G+h+HIGSqi+ +e+f +a+ek++++
  NCBI__GCF_001730305.1:WP_069662873.1 157 ITPGINAETHDYILTGQVDSKFGFDVNsgQATQALERILADDHLVLKGVHCHIGSQIFSAEGFLAAVEKMLTI 229
                                           ***************************999******************************************* PP

                             TIGR01048 219 leelkee.gieleeldlGGGlgisyeeeeeapdleeyaeklleklekea.elgl.klklilEpGRslvanagv 288
                                           l+e+k++ g+++ +l++GGG+g++y+e ++++++e +++++++++++++  l    ++++lEpGRs++a+ag+
  NCBI__GCF_001730305.1:WP_069662873.1 230 LNEWKQAfGYSVDVLNMGGGFGVQYTEADDPLEPEAFVKAIVNSVKGQCaLLDYaFPEIWLEPGRSIIAEAGT 302
                                           *****999****************************************96555579***************** PP

                             TIGR01048 289 lltrVesvKeves.rkfvlvDagmndliRpalYeayheiaalkrleeeetetvdvvGplCEsgDvlakdrelp 360
                                           ++++V+s+K  ++ r++v+vD+gm d+iRpalY+a+++ ++++r+++e+++  ++vG+ CEsgDvl+kd+elp
  NCBI__GCF_001730305.1:WP_069662873.1 303 TIYTVGSEKVIPEvRHYVSVDGGMGDNIRPALYGAKYDGFLANRISNEQQSPKTIVGKYCESGDVLIKDIELP 375
                                           **********9888*********************************************************** PP

                             TIGR01048 361 eveeGdllavasaGAYgasmssnYnsrprpaevlveegkarlirrretledllale 416
                                           + ++ dl av+s+GAYg+sm++nYn+  +pa+v+ve+g+ +l  rret+edl++l+
  NCBI__GCF_001730305.1:WP_069662873.1 376 TLRPEDLFAVTSTGAYGYSMANNYNRNLKPAVVFVEDGHEKLAIRRETYEDLISLD 431
                                           *****************************************************987 PP
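
The domain table row above carries the coordinates needed to estimate how much of the HMM and of the candidate the hit covers. A minimal Python sketch follows; it assumes the whitespace-separated column order shown in the table header, and the coverage definition (aligned span over total length) is an illustrative choice rather than GapMind's documented rule.

```python
# Domain row copied from the hmmsearch domain annotation table above.
row = ("   1 !  481.3   0.0  9.8e-149  9.8e-149       4     416 .."
       "      11     431 ..       8     432 .. 0.98")
fields = row.split()

score = float(fields[2])
hmm_from, hmm_to = int(fields[6]), int(fields[7])
ali_from, ali_to = int(fields[9]), int(fields[10])

model_len = 417       # from "Query: TIGR01048 [M=417]"
candidate_len = 433   # from "433 residues searched"

hmm_cov = (hmm_to - hmm_from + 1) / model_len
seq_cov = (ali_to - ali_from + 1) / candidate_len

print(f"score {score}: covers {hmm_cov:.1%} of the model "
      f"and {seq_cov:.1%} of the candidate")
```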



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (417 nodes)
Target sequences:                          1  (433 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.00u 0.01s 00:00:00.01 Elapsed: 00:00:00.00
# Mc/sec: 26.81
//
[ok]

This GapMind analysis is from Jul 26 2024. The underlying query database was built on Jul 25 2024.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
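
For readers unfamiliar with the term, a six-frame translation translates a DNA sequence in all three reading frames on each strand, so that proteins missed by gene prediction can still be found. The sketch below illustrates the idea in Python. It is not how Curated BLAST is implemented; the function names are hypothetical, and it assumes the standard genetic code and an input containing only A, C, G, and T.

```python
from itertools import product

# Standard genetic code in NCBI order (first base varies slowest, bases T, C, A, G).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]

def translate(dna: str) -> str:
    """Translate complete codons from the start of dna; '*' marks a stop."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))

def six_frame_translation(dna: str) -> dict:
    """Map frame labels (+1..+3, -1..-3) to their translated sequences."""
    dna = dna.upper()
    rev = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna[offset:])
        frames[f"-{offset + 1}"] = translate(rev[offset:])
    return frames

for label, protein in six_frame_translation("ATGGCCTAA").items():
    print(label, protein)
```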

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory