GapMind for Amino acid biosynthesis

 

Alignments for a candidate for hisD in Thermithiobacillus tepidarius DSM 3134

Align histidinol dehydrogenase (EC 1.1.1.23) (characterized)
to candidate WP_028988607.1 G579_RS0100400 histidinol dehydrogenase

Query= BRENDA::Q8G2R2
         (430 letters)



>NCBI__GCF_000423825.1:WP_028988607.1
          Length = 432

 Score =  416 bits (1070), Expect = e-121
 Identities = 219/420 (52%), Positives = 285/420 (67%), Gaps = 1/420 (0%)

Query: 10  PDFEQKFAAFLSGKREVSEDVDRAVREIVDRVRREGDSALLDYSRRFDRIDLEKTG-IAV 68
           PDF+Q           +   V+  VR+++  VR +GD A+LDY++RFDR+       + V
Sbjct: 9   PDFDQALRRLTDWDAGLDPAVEATVRDVLHAVREQGDRAVLDYTQRFDRLRANSLAELEV 68

Query: 69  TEAEIDAAFDAAPASTVEALKLARDRIEKHHARQLPKDDRYTDALGVELGSRWTAIEAVG 128
             A++  A      +  EAL LA +RI  +H RQ+ +   Y++A G  LG R T +  VG
Sbjct: 69  PRAKLREALAGLAPADREALALAAERIRSYHERQVAESWEYSEADGTRLGQRVTPLARVG 128

Query: 129 LYVPGGTASYPSSVLMNAMPAKVAGVDRIVMVVPAPDGNLNPLVLVAARLAGVSEIYRVG 188
           +YVPGG A+YPSSVLMNA+PAKVAGV  I+MVVPAP G LNPLVL AA LAGV  ++ +G
Sbjct: 129 IYVPGGKAAYPSSVLMNAIPAKVAGVAEIIMVVPAPGGELNPLVLAAAELAGVDRVFTIG 188

Query: 189 GAQAIAALAYGTETIRPVAKIVGPGNAYVAAAKRIVFGTVGIDMIAGPSEVLIVADKDNN 248
           GAQA+AALAYGTET+  V KIVGPGN YVA AKR+VFG VGIDMIAGPSE+L+++D   +
Sbjct: 189 GAQAVAALAYGTETVPAVDKIVGPGNIYVATAKRMVFGRVGIDMIAGPSEILVISDGQTD 248

Query: 249 PDWIAADLLAQAEHDTAAQSILMTNDEAFAHAVEEAVERQLHTLARTETASASWRDFGAV 308
           PDW+A DLL+QAEHD  AQSIL++ D      V  +V++ L TL R   A  +W    A+
Sbjct: 249 PDWLAMDLLSQAEHDEQAQSILISWDADCLTRVAASVDKLLPTLEREAIARQAWDSRAAL 308

Query: 309 ILVKDFEDAIPLANRIAAEHLEIAVADAEAFVPRIRNAGSIFIGGYTPEVIGDYVGGCNH 368
           ILV+D  +A  +A+RIA EHLE++VAD +A +P + NAG++F+G YT E +GDYV G NH
Sbjct: 309 ILVRDAAEACAVADRIAPEHLELSVADPDALLPALHNAGAVFMGRYTAEALGDYVAGPNH 368

Query: 369 VLPTARSARFSSGLSVLDYMKRTSLLKLGSEQLRALGPAAIEIARAEGLDAHAQSVAIRL 428
           VLPTA SARFSS L V D+ KRTS++   +     LG AA  +A  EGL AHA+S A R+
Sbjct: 369 VLPTAGSARFSSPLGVYDFQKRTSIIYASAVGADQLGRAAKRLAEGEGLTAHARSAAYRI 428
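
The percentages and alignment ranges reported above can be checked by hand. The following is a minimal sketch, not part of GapMind or BLAST: the helper names (parse_summary, truncated_percent, coverage) are hypothetical, and the use of integer truncation is an assumption that happens to reproduce the whole-number percentages printed in this report.

```python
import re

# Summary line copied from the BLAST report above.
summary = "Identities = 219/420 (52%), Positives = 285/420 (67%), Gaps = 1/420 (0%)"


def parse_summary(line):
    """Return {label: (count, alignment_length)} for each 'Label = n/m' field."""
    fields = re.findall(r"(\w+) = (\d+)/(\d+)", line)
    return {label: (int(n), int(m)) for label, n, m in fields}


def truncated_percent(count, total):
    # Assumption: truncating to a whole number matches the printed percentages here.
    return 100 * count // total


def coverage(start, end, length):
    """Fraction of a sequence spanned by an inclusive alignment range."""
    return (end - start + 1) / length


counts = parse_summary(summary)
percents = {label: truncated_percent(n, m) for label, (n, m) in counts.items()}

# Ranges and lengths as shown above: query 10-428 of 430, subject 9-428 of 432.
query_cov = coverage(10, 428, 430)
subject_cov = coverage(9, 428, 432)

print(percents)
print(f"query coverage {query_cov:.1%}, subject coverage {subject_cov:.1%}")
```

Both sequences are covered almost end to end, which is why the coverage figures sit close to 100%.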


Lambda     K      H
   0.318    0.134    0.375 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 556
Number of extensions: 24
Number of successful extensions: 2
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 430
Length of database: 432
Length adjustment: 32
Effective length of query: 398
Effective length of database: 400
Effective search space:   159200
Effective search space used:   159200
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 51 (24.3 bits)
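
The effective lengths and search space listed above follow from the sequence lengths and the length adjustment, and the bit score follows from the raw score and the gapped Lambda and K via the standard Karlin-Altschul normalisation. A minimal sketch, assuming a single database sequence (as reported) and using the rounded Lambda and K printed above, so the bit score only approximately matches the reported 416 bits:

```python
import math

# Values copied from the BLAST statistics above.
query_len, db_len, length_adjustment = 430, 432, 32
gapped_lambda, gapped_k = 0.267, 0.0410
raw_score = 1070

# With one database sequence, the adjustment is subtracted once from each length.
eff_query = query_len - length_adjustment
eff_db = db_len - length_adjustment
search_space = eff_query * eff_db

# Karlin-Altschul normalisation: S' = (lambda * S - ln K) / ln 2
bit_score = (gapped_lambda * raw_score - math.log(gapped_k)) / math.log(2)

print(eff_query, eff_db, search_space)
print(f"approximate bit score: {bit_score:.1f}")
```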

Align candidate WP_028988607.1 G579_RS0100400 (histidinol dehydrogenase)
to HMM TIGR00069 (hisD: histidinol dehydrogenase (EC 1.1.1.23))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.aa/TIGR00069.hmm
# target sequence database:        /tmp/gapView.1435.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR00069  [M=393]
Accession:   TIGR00069
Description: hisD: histidinol dehydrogenase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                                 Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                                 -----------
   3.1e-165  536.1   0.3   3.7e-165  535.9   0.3    1.0  1  lcl|NCBI__GCF_000423825.1:WP_028988607.1  G579_RS0100400 histidinol dehydr


Domain annotation for each sequence (and alignments):
>> lcl|NCBI__GCF_000423825.1:WP_028988607.1  G579_RS0100400 histidinol dehydrogenase
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  535.9   0.3  3.7e-165  3.7e-165       1     393 []      33     427 ..      33     427 .. 0.99

  Alignments for each domain:
  == domain 1  score: 535.9 bits;  conditional E-value: 3.7e-165
                                 TIGR00069   1 vkeiiedvrkeGdeAlleytekfdkv...kleslrvseeeleealeavdeelkealelaaeniekfhek 66 
                                               v++++++vr++Gd+A+l+yt++fd++   +l++l+v++++l+eal+ + ++ +eal+laae+i+++he+
  lcl|NCBI__GCF_000423825.1:WP_028988607.1  33 VRDVLHAVREQGDRAVLDYTQRFDRLranSLAELEVPRAKLREALAGLAPADREALALAAERIRSYHER 101
                                               799**********************966677899*********************************** PP

                                 TIGR00069  67 qlpesveveteegvllgqkvrplervglYvPgGkaaypStvlmtavpAkvAgvkeivvvtPpkkdgkvn 135
                                               q++es+e+++++g+ lgq+v+pl rvg+YvPgGkaaypS+vlm+a+pAkvAgv ei++v P    g++n
  lcl|NCBI__GCF_000423825.1:WP_028988607.1 102 QVAESWEYSEADGTRLGQRVTPLARVGIYVPGGKAAYPSSVLMNAIPAKVAGVAEIIMVVPAP-GGELN 169
                                               **************************************************************6.9**** PP

                                 TIGR00069 136 pavlaaakllgvdevykvGGaqaiaalayGtetvpkvdkivGPGniyVtaAKklvfgevgidmiaGPsE 204
                                               p vlaaa+l+gvd+v+++GGaqa+aalayGtetvp+vdkivGPGniyV++AK++vfg+vgidmiaGPsE
  lcl|NCBI__GCF_000423825.1:WP_028988607.1 170 PLVLAAAELAGVDRVFTIGGAQAVAALAYGTETVPAVDKIVGPGNIYVATAKRMVFGRVGIDMIAGPSE 238
                                               ********************************************************************* PP

                                 TIGR00069 205 vlviadesanpelvaaDllsqaEHdedaqailvttseelaekveeeveeqleelerkeiaeksleknga 273
                                               +lvi+d +++p+++a+DllsqaEHde+aq+il++ +++ +++v ++v++ l +ler+ ia++++++++a
  lcl|NCBI__GCF_000423825.1:WP_028988607.1 239 ILVISDGQTDPDWLAMDLLSQAEHDEQAQSILISWDADCLTRVAASVDKLLPTLEREAIARQAWDSRAA 307
                                               ********************************************************************* PP

                                 TIGR00069 274 iilvddleealelsneyApEHLelqtkdpeellkkiknaGsvflGeytpealgdyvaGpnhvLPTsgtA 342
                                               +ilv+d++ea+++++++ApEHLel ++dp +ll+ ++naG+vf+G+yt+ealgdyvaGpnhvLPT+g+A
  lcl|NCBI__GCF_000423825.1:WP_028988607.1 308 LILVRDAAEACAVADRIAPEHLELSVADPDALLPALHNAGAVFMGRYTAEALGDYVAGPNHVLPTAGSA 376
                                               ********************************************************************* PP

                                 TIGR00069 343 rfasglsvedFlkrisvqelskealeelaeaveklaeaEgLeaHaeavevR 393
                                               rf+s+l+v+dF+kr+s++++s  ++++l++a+++lae EgL+aHa++++ R
  lcl|NCBI__GCF_000423825.1:WP_028988607.1 377 RFSSPLGVYDFQKRTSIIYASAVGADQLGRAAKRLAEGEGLTAHARSAAYR 427
                                               ***********************************************9887 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (393 nodes)
Target sequences:                          1  (432 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.02u 0.01s 00:00:00.03 Elapsed: 00:00:00.02
# Mc/sec: 7.66
//
[ok]
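
The domain row in the hmmsearch output above gives the coordinates needed to work out how much of the model and of the candidate protein the hit covers. The sketch below is illustrative only: parse_domain_row is a hypothetical helper that splits the row on whitespace, and the field order is taken from the column header shown in this output.

```python
# Domain row copied from the hmmsearch output above.
domain_row = ("   1 !  535.9   0.3  3.7e-165  3.7e-165       1     393 []"
              "      33     427 ..      33     427 .. 0.99")
model_length = 393   # from "[M=393]"
target_length = 432  # from "432 residues searched"


def parse_domain_row(row):
    """Split one domain row into named fields (hypothetical helper)."""
    f = row.split()
    return {
        "score": float(f[2]),
        "bias": float(f[3]),
        "c_evalue": float(f[4]),
        "i_evalue": float(f[5]),
        "hmm_from": int(f[6]),
        "hmm_to": int(f[7]),
        "ali_from": int(f[9]),
        "ali_to": int(f[10]),
        "env_from": int(f[12]),
        "env_to": int(f[13]),
        "acc": float(f[15]),
    }


dom = parse_domain_row(domain_row)
model_cov = (dom["hmm_to"] - dom["hmm_from"] + 1) / model_length
target_cov = (dom["ali_to"] - dom["ali_from"] + 1) / target_length

print(f"model coverage {model_cov:.1%}, target coverage {target_cov:.1%}")
```

Here the alignment spans every node of the model (1 to 393), which is what the "[]" marker in the row indicates.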

This GapMind analysis is from Apr 10 2024. The underlying query database was built on Apr 09 2024.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified either by running ublast (a fast alternative to protein BLAST) against a database of manually curated proteins, most of which are experimentally characterized, or by running HMMER with enzyme models (usually from TIGRFAMs). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
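
The definition of a gap given earlier (a step with no high- or medium-confidence candidate) can be sketched as a simple filter. This is an illustration only, not GapMind's code: find_gaps, the step names, and the confidence labels in the example data are all hypothetical.

```python
def find_gaps(candidates_by_step):
    """Return the steps that have no high- or medium-confidence candidate."""
    return [
        step
        for step, levels in candidates_by_step.items()
        if not any(level in ("high", "medium") for level in levels)
    ]


# Hypothetical data: each step maps to the confidence levels of its candidates.
example = {
    "step_a": ["high"],
    "step_b": ["low", "low"],
    "step_c": [],
    "step_d": ["medium", "low"],
}

gaps = find_gaps(example)
print(gaps)
```

A step with only low-confidence candidates is treated the same as a step with no candidates at all, which follows the wording of the definition.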

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory