GapMind for Amino acid biosynthesis

 

Alignments for a candidate for asd in Halococcus hamelinensis 100A6

Align Aspartate-semialdehyde dehydrogenase; ASA dehydrogenase; ASADH; Aspartate-beta-semialdehyde dehydrogenase; EC 1.2.1.11 (characterized)
to candidate WP_007692528.1 C447_RS07535 aspartate-semialdehyde dehydrogenase

Query= SwissProt::Q57658
         (354 letters)



>NCBI__GCF_000336675.1:WP_007692528.1
          Length = 344

 Score =  312 bits (800), Expect = 7e-90
 Identities = 168/346 (48%), Positives = 230/346 (66%), Gaps = 5/346 (1%)

Query: 7   MKIKVGVLGATGSVGQRFVQLLADHPMFELTALAASERSAGKKYKDACYWFQDRDIPENI 66
           M  +VG+LGATG+VGQR VQLL  HP FE+  + AS+ SAG+ Y +A  W  D  +P+ +
Sbjct: 1   MTTRVGILGATGAVGQRLVQLLDPHPEFEIACVTASDDSAGRPYGEAANWRIDVPMPDAL 60

Query: 67  KDMVVIPTDPKHEEFEDVDIVFSALPSDLAKKFEPEFAKEGKLIFSNASAYRMEEDVPLV 126
            D+ V  T+P     +DV ++FS+LPS + ++ EP   + G ++ SN+S  RM +DVPL 
Sbjct: 61  ADLDVARTEPDAIP-DDVPLLFSSLPSAVGERVEPPLCEAGYVVSSNSSNDRMADDVPLT 119

Query: 127 IPEVNADHLELIEIQREKRGWDGAIITNPNCSTICAVITLKPIMDKFGLEAVFIATMQAV 186
           IPEVN DHL+LIE+QR+ R WDGA++ NPNCSTI AV  L  + + FGLE   +ATMQAV
Sbjct: 120 IPEVNGDHLDLIEVQRDSRDWDGALVKNPNCSTITAVPPLAALAE-FGLETAHVATMQAV 178

Query: 187 SGAGYNGVPSMAILDNLIPFIKNEEEKMQTESLKLLGTLKDGKVELANFKISASCNRVAV 246
           SG GY+GV SM I+DN++P I  EEEK++TE  KLLG     +V+     I+ASCNRVA 
Sbjct: 179 SGGGYSGVTSMEIIDNVLPHIGGEEEKVETEPTKLLGEFDGAEVKRHEVDIAASCNRVAT 238

Query: 247 IDGHTESIFVKTKEGAEPEEIKEVMDKFDPLKDLNLPTYAKPIVIREEIDRPQPRLDRNE 306
           +DGH ES++  T+E    ++    M +  P  DL+  +  + I + E+ DRPQPRLDR  
Sbjct: 239 VDGHLESVWADTREDITADDAARAMREL-PALDLH-SSPDQFIEVFEDPDRPQPRLDRMV 296

Query: 307 GNGMSIVVGRIRKDPIFDVKYTALEHNTIRGAAGASVLNAEYFVKK 352
           G GMS+  G +R +    V++  L HNT+RGAAGASVLN E  V++
Sbjct: 297 GGGMSVAAGGLR-ETTRGVQFNCLAHNTLRGAAGASVLNGELLVER 341
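
The percentages in the summary line can be recomputed from the raw counts, and coverage follows from the alignment coordinates (query 7 to 352 of 354 residues, candidate 1 to 341 of 344). The Python sketch below does this. The function names are illustrative and are not part of BLAST or GapMind.

```python
import re

# Summary line copied from the alignment above.
SUMMARY = "Identities = 168/346 (48%), Positives = 230/346 (66%), Gaps = 5/346 (1%)"

def parse_summary(line):
    """Return {field: (count, alignment_length)} from a BLAST summary line."""
    pairs = re.findall(r"(\w+) = (\d+)/(\d+)", line)
    return {name: (int(n), int(total)) for name, n, total in pairs}

def coverage(start, end, length):
    """Fraction of a sequence spanned by an alignment (1-based, inclusive)."""
    return (end - start + 1) / length

counts = parse_summary(SUMMARY)
identity = counts["Identities"][0] / counts["Identities"][1]
query_cov = coverage(7, 352, 354)    # query coordinates from the alignment
subject_cov = coverage(1, 341, 344)  # candidate coordinates from the alignment

print(f"identity {identity:.1%}, query coverage {query_cov:.1%}, "
      f"candidate coverage {subject_cov:.1%}")
```

Identity here is measured over the alignment length (346 columns, including gap columns), which is the denominator the report itself uses.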


Lambda     K      H
   0.317    0.136    0.388 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 330
Number of extensions: 13
Number of successful extensions: 5
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 354
Length of database: 344
Length adjustment: 29
Effective length of query: 325
Effective length of database: 315
Effective search space:   102375
Effective search space used:   102375
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.6 bits)
S2: 49 (23.5 bits)
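
The statistics above are linked by the standard Karlin-Altschul relations: the effective search space is the product of the length-adjusted query and database lengths, the bit score is (lambda * S - ln K) / ln 2, and the expected number of chance hits is K * (search space) * exp(-lambda * S). The Python sketch below applies these formulas to the values printed in this report. Because the printed gapped lambda and K are rounded, the results only approximate the reported 312 bits and E-value of 7e-90. The function names are illustrative.

```python
import math

# Values copied from the BLAST report above.
RAW_SCORE = 800                           # raw score shown in parentheses
LAMBDA_GAPPED, K_GAPPED = 0.267, 0.0410   # gapped Karlin-Altschul parameters
QUERY_LEN, DB_LEN, LENGTH_ADJUSTMENT = 354, 344, 29

def effective_search_space(query_len, db_len, adjustment):
    """Product of the length-adjusted query and (single-sequence) database lengths."""
    return (query_len - adjustment) * (db_len - adjustment)

def bit_score(raw, lam, k):
    """Normalized score in bits: (lambda * S - ln K) / ln 2."""
    return (lam * raw - math.log(k)) / math.log(2)

def expect_value(raw, lam, k, search_space):
    """Expected number of chance alignments scoring at least `raw`."""
    return k * search_space * math.exp(-lam * raw)

space = effective_search_space(QUERY_LEN, DB_LEN, LENGTH_ADJUSTMENT)
bits = bit_score(RAW_SCORE, LAMBDA_GAPPED, K_GAPPED)
evalue = expect_value(RAW_SCORE, LAMBDA_GAPPED, K_GAPPED, space)

print(f"search space {space}, bit score {bits:.1f}, E-value {evalue:.1e}")
```

The search space works out to 325 * 315 = 102375, matching the "Effective search space" line.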

Align candidate WP_007692528.1 C447_RS07535 (aspartate-semialdehyde dehydrogenase)
to HMM TIGR00978 (asd: aspartate-semialdehyde dehydrogenase (EC 1.2.1.11))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.aa/TIGR00978.hmm
# target sequence database:        /tmp/gapView.642361.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR00978  [M=342]
Accession:   TIGR00978
Description: asd_EA: aspartate-semialdehyde dehydrogenase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                             Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                             -----------
   1.7e-119  384.9   0.0   1.9e-119  384.7   0.0    1.0  1  NCBI__GCF_000336675.1:WP_007692528.1  


Domain annotation for each sequence (and alignments):
>> NCBI__GCF_000336675.1:WP_007692528.1  
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  384.7   0.0  1.9e-119  1.9e-119       2     341 ..       4     339 ..       3     340 .. 0.98

  Alignments for each domain:
  == domain 1  score: 384.7 bits;  conditional E-value: 1.9e-119
                             TIGR00978   2 kvavLGatGlvGqklvkllekhpyfelakvvaserkaGkkygevvkwilsgdipeevrdleiketepaaeekd 74 
                                           +v++LGatG vGq+lv+ll+ hp+fe+a v+as+ +aG+ yge+++w ++ ++p+ + dl + +tep+a  +d
  NCBI__GCF_000336675.1:WP_007692528.1   4 RVGILGATGAVGQRLVQLLDPHPEFEIACVTASDDSAGRPYGEAANWRIDVPMPDALADLDVARTEPDAIPDD 76 
                                           8*******************************************************************9999* PP

                             TIGR00978  75 vdlvfsalpsevaeevEkklaeeGlevfsnasalRldpdvplivpEvnsdhlellkvqker.gwkGvivtnpn 146
                                           v l+fs+lps v e+vE+ l e+G++v sn+s+ R+ +dvpl +pEvn dhl+l++vq++  +w+G +v+npn
  NCBI__GCF_000336675.1:WP_007692528.1  77 VPLLFSSLPSAVGERVEPPLCEAGYVVSSNSSNDRMADDVPLTIPEVNGDHLDLIEVQRDSrDWDGALVKNPN 149
                                           **********************************************************9876*********** PP

                             TIGR00978 147 CstailtlalkPlidaasikkvivatlqavsGAGypGvssldildnviPyikgEEekiekEtkkilGkleegk 219
                                           Cst++   +l+ l  +++++  +vat+qavsG Gy+Gv+s++i+dnv+P+i+gEEek+e+E++k+lG++++ +
  NCBI__GCF_000336675.1:WP_007692528.1 150 CSTITAVPPLAALA-EFGLETAHVATMQAVSGGGYSGVTSMEIIDNVLPHIGGEEEKVETEPTKLLGEFDGAE 221
                                           *****999****99.9********************************************************* PP

                             TIGR00978 220 vepaelevsatttRvPvleGHtesvfveldkkldieeirealkefkklpqklglpsaPekpivlldeedrPqp 292
                                           v+  e++++a+++Rv +++GH+esv  ++ +++  +++ +a++e   l    +l+s P++ i + +++drPqp
  NCBI__GCF_000336675.1:WP_007692528.1 222 VKRHEVDIAASCNRVATVDGHLESVWADTREDITADDAARAMRELPAL----DLHSSPDQFIEVFEDPDRPQP 290
                                           ******************************************998776....9******************** PP

                             TIGR00978 293 rldldaekgmavtvGrlreeseslklvvlghnlvRGAAGaallnaElly 341
                                           rld+ +++gm+v  G lre+++ ++++ l+hn++RGAAGa++ln Ell+
  NCBI__GCF_000336675.1:WP_007692528.1 291 RLDRMVGGGMSVAAGGLRETTRGVQFNCLAHNTLRGAAGASVLNGELLV 339
                                           **********************************************975 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (342 nodes)
Target sequences:                          1  (344 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.00u 0.00s 00:00:00.00 Elapsed: 00:00:00.00
# Mc/sec: 21.93
//
[ok]
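
The per-domain row of the hmmsearch output carries the coordinates needed to judge how much of the model and of the candidate the match covers. The Python sketch below parses that row by whitespace splitting and computes both coverages, using the model length (M=342) and the sequence length (344 residues) printed above. The `DomainHit` class and function names are hypothetical helpers, not part of HMMER. For routine parsing, hmmsearch's `--domtblout` option writes a table intended for programs.

```python
from dataclasses import dataclass

# Domain row copied from the hmmsearch output above.
DOMAIN_LINE = ("   1 !  384.7   0.0  1.9e-119  1.9e-119       2     341 .."
               "       4     339 ..       3     340 .. 0.98")
MODEL_LEN = 342   # from "Query: TIGR00978 [M=342]"
SEQ_LEN = 344     # from "Target sequences: 1 (344 residues searched)"

@dataclass
class DomainHit:
    score: float
    i_evalue: float
    hmm_from: int
    hmm_to: int
    ali_from: int
    ali_to: int

def parse_domain_line(line):
    """Parse one row of the human-readable per-domain table."""
    f = line.split()
    # Fields: 0 '#', 1 '!', 2 score, 3 bias, 4 c-Evalue, 5 i-Evalue,
    # 6 hmmfrom, 7 hmm to, 8 '..', 9 alifrom, 10 ali to, ...
    return DomainHit(score=float(f[2]), i_evalue=float(f[5]),
                     hmm_from=int(f[6]), hmm_to=int(f[7]),
                     ali_from=int(f[9]), ali_to=int(f[10]))

hit = parse_domain_line(DOMAIN_LINE)
hmm_cov = (hit.hmm_to - hit.hmm_from + 1) / MODEL_LEN
seq_cov = (hit.ali_to - hit.ali_from + 1) / SEQ_LEN

print(f"score {hit.score}, model coverage {hmm_cov:.1%}, "
      f"sequence coverage {seq_cov:.1%}")
```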

This GapMind analysis is from Jul 25 2024. The underlying query database was built on Jul 25 2024.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMER with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
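
To illustrate what a six-frame translation is, the Python sketch below translates a DNA string in all three reading frames on both strands. It assumes the standard genetic code and an unambiguous A/C/G/T sequence; the toy input is made up for illustration, and this is not the code that Curated BLAST uses.

```python
from itertools import product

# Standard genetic code, listed in the conventional TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(dna):
    return dna.translate(COMPLEMENT)[::-1]

def translate(dna):
    """Translate complete codons only; unknown codons become 'X'."""
    codons = (dna[i:i + 3] for i in range(0, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)

def six_frame_translation(dna):
    """Return {frame_label: protein} for frames +1..+3 and -1..-3."""
    dna = dna.upper()
    frames = {}
    for sign, strand in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{sign}{offset + 1}"] = translate(strand[offset:])
    return frames

# Hypothetical toy sequence, for illustration only.
for label, protein in six_frame_translation("ATGGCC").items():
    print(label, protein)
```

Searching all six frames can recover a protein that gene prediction missed, which is why it complements a search restricted to predicted proteins.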

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory