Align L-glutamate gamma-semialdehyde dehydrogenase (EC 1.2.1.88) (characterized)
to candidate WP_011319687.1 AVA_RS14900 L-glutamate gamma-semialdehyde dehydrogenase
Query= BRENDA::Q9K9B2
         (515 letters)

>NCBI__GCF_000204075.1:WP_011319687.1
          Length = 993

 Score =  487 bits (1253), Expect = e-142
 Identities = 249/503 (49%), Positives = 327/503 (65%), Gaps = 2/503 (0%)

Query: 11  TDFTVEANRKAFEEALGLVEKELGKEYPLIINGERVTTEDKIQSWNPARKDQLVGSVSKA 70
           TDF VE  RK    A   V   LG+ Y  +INGE V T + I S NP+   +++G V
Sbjct: 479 TDFAVEEERKEAARAFAEVRGALGRSYLPLINGEYVQTAEVIDSVNPSNFGEVIGKVGLI 538

Query: 71  NQDLAEKAIQSADEAFQTWRNVNPEERANILVKAAAIIRRRKHEFSAWLVHEAGKPWKEA 130
           + + AE+A+Q+A  AF  WR    +ERA IL +A  ++  R+ E SAW+V E GKP KEA
Sbjct: 539 SVEQAEQAMQAAKAAFPGWRRTPVKERAAILRRAGDLLEERRAELSAWIVLEVGKPVKEA 598

Query: 131 DADTAEAIDFLEYYARQMIELNRGKEILSRPGEQNRYFYTPMGVTVTISPWNFALAIMVG 190
           DA+ +EAIDF  YYA +M  L +G       GE NRY Y P G+ V ISPWNF LAI  G
Sbjct: 599 DAEVSEAIDFCRYYADEMERLYQGINY-DVAGETNRYIYQPRGIVVVISPWNFPLAIACG 657

Query: 191 TAVAPIVTGNTVVLKPASTTPVVAAKFVEVLEDAGLPKGVINYVPGSGAEVGDYLVDHPK 250
             VA +VTGN  +LKPA T+ V+ AK  E+L +AG+PKGV  YVPG G++VG YLV HP
Sbjct: 658 MTVAALVTGNCTLLKPAETSSVITAKLTEILLEAGIPKGVFQYVPGKGSQVGAYLVSHPD 717

Query: 251 TSLITFTGSKDVGVRLYERAAVVRPGQNHLKRVIVEMGGKDTVVVDRDADLDLAAESILV 310
           T LI FTGS++VG R+Y  AA ++P Q H+KRVI EMGGK+ ++VD  ADLD A   ++
Sbjct: 718 THLIAFTGSQEVGCRIYAEAATLKPQQRHMKRVIAEMGGKNAIIVDESADLDQAVIGVVQ 777

Query: 311 SAFGFSGQKCSAGSRAVIHKDVYDEVLEKTVALAKNLTVGDPTNRDNYMGPVIDEKAFEK 370
           SAFG+SGQKCSA SR ++ + +YD  + + V   K+L +G+       +GPVID  A ++
Sbjct: 778 SAFGYSGQKCSACSRVIVVEAIYDAFIHRLVEATKSLNIGEAELPSTQVGPVIDANARDR 837

Query: 371 IMSYIEIGKKEGRLMTGGEGDSSTGFFIQPTIIADLDPEAVIMQEEIFGPVVAFSKANDF 430
           I  YIE GK E ++       +  G+F+ P I  ++ P   I Q+EIFGPV+A  KA DF
Sbjct: 838 IREYIEKGKAESQVALELSAPNH-GYFVGPVIFGEVPPHGTIAQQEIFGPVLAVIKAKDF 896

Query: 431 DHALEIANNTEYGLTGAVITRNRAHIEQAKREFHVGNLYFNRNCTGAIVGYHPFGGFKMS 490
             AL IAN+T+Y LTG + +R  +HI+QA+ EF VGNLY NRN TGAIV   PFGGFK+S
Sbjct: 897 AQALAIANDTDYALTGGLYSRTPSHIQQAQEEFEVGNLYINRNITGAIVARQPFGGFKLS 956

Query: 491 GTDSKAGGPDYLALHMQAKTVSE 513
           G  SKAGGPDYL   ++ +T++E
Sbjct: 957 GVGSKAGGPDYLLQFLEPRTITE 979

Lambda     K      H
   0.316    0.134    0.388

Gapped
Lambda     K      H
   0.267   0.0410    0.140

Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 1089
Number of extensions: 47
Number of successful extensions: 4
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 515
Length of database: 993
Length adjustment: 39
Effective length of query: 476
Effective length of database: 954
Effective search space: 454104
Effective search space used: 454104
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.6 bits)
S2: 55 (25.8 bits)
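The summary numbers in this BLAST report can be cross-checked by hand. The sketch below recomputes the effective search space, the bit score, and an approximate E-value from the values printed above, assuming the usual Karlin-Altschul relations (bit score = (lambda * S - ln K) / ln 2, and E = search space * 2^(-bit score)). The variable names are illustrative, and BLAST's own internal rounding may differ slightly.

```python
import math

# Values copied from the BLAST report above.
query_len, db_len, length_adjustment = 515, 993, 39
raw_score = 1253
gapped_lambda, gapped_k = 0.267, 0.0410
identities, positives, gaps, aln_len = 249, 327, 2, 503

# Effective lengths subtract the length adjustment from each side.
eff_query = query_len - length_adjustment
eff_db = db_len - length_adjustment
search_space = eff_query * eff_db

# Assumed Karlin-Altschul conversion from raw score to bits, then to an E-value.
bit_score = (gapped_lambda * raw_score - math.log(gapped_k)) / math.log(2)
e_value = search_space * 2.0 ** (-bit_score)

# Whole-number percentages; integer truncation reproduces the printed values here.
pct_identity = 100 * identities // aln_len
pct_positive = 100 * positives // aln_len
pct_gaps = 100 * gaps // aln_len

print(f"effective search space: {search_space}")
print(f"bit score: {bit_score:.1f}, E-value: {e_value:.1e}")
print(f"identity {pct_identity}%, positives {pct_positive}%, gaps {pct_gaps}%")
```

The recomputed search space and bit score agree with the report, and the E-value falls in the same order of magnitude as the printed "e-142".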
Align candidate WP_011319687.1 AVA_RS14900 (L-glutamate gamma-semialdehyde dehydrogenase)
to HMM TIGR01237 (pruA: putative delta-1-pyrroline-5-carboxylate dehydrogenase (EC 1.2.1.88))
# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR01237.hmm
# target sequence database:        /tmp/gapView.9187.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR01237  [M=511]
Accession:   TIGR01237
Description: D1pyr5carbox2: putative delta-1-pyrroline-5-carboxylate dehydrogenase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                                  Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                                  -----------
   2.5e-247  807.6   2.0   4.3e-247  806.8   2.0    1.3  1  lcl|NCBI__GCF_000204075.1:WP_011319687.1  AVA_RS14900 L-glutamate gamma-se


Domain annotation for each sequence (and alignments):
>> lcl|NCBI__GCF_000204075.1:WP_011319687.1  AVA_RS14900 L-glutamate gamma-semialdehyde dehydrogenase
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  806.8   2.0  4.3e-247  4.3e-247       5     511 .]     477     981 ..     475     981 .. 0.99

  Alignments for each domain:
  == domain 1  score: 806.8 bits;  conditional E-value: 4.3e-247
                                 TIGR01237   5 pftdfadeelvqafkkalakvkellGkdyplvinGeeveteakidsinpadksevvGkvakasvedaeq 73
                                                +tdfa ee ++++ +a+a+v+  lG++y+++inGe+v+t++ ids+np++  ev+Gkv+++sve+aeq
  lcl|NCBI__GCF_000204075.1:WP_011319687.1 477 TDTDFAVEEERKEAARAFAEVRGALGRSYLPLINGEYVQTAEVIDSVNPSNFGEVIGKVGLISVEQAEQ 545
                                               68******************************************************************* PP

                                 TIGR01237  74 alqaakkafeewkktdveeraaillkaaailkrrrhelsallvlevGkiyaeadaevaeaidfleyyar 142
                                               a+qaak+af+ w++t+v+eraail++a + l++rr elsa++vlevGk+++eadaev+eaidf++yya+
  lcl|NCBI__GCF_000204075.1:WP_011319687.1 546 AMQAAKAAFPGWRRTPVKERAAILRRAGDLLEERRAELSAWIVLEVGKPVKEADAEVSEAIDFCRYYAD 614
                                               ********************************************************************* PP

                                 TIGR01237 143 emiklakskevlsieGeknrylyiplGvavvispwnfplailvGmtvapivtGncvvlkpaeaatviaa 211
                                               em++l  ++ + ++ Ge+nry+y+p+G+ vvispwnfplai+ Gmtva++vtGnc++lkpae+++vi+a
  lcl|NCBI__GCF_000204075.1:WP_011319687.1 615 EMERLY-QGINYDVAGETNRYIYQPRGIVVVISPWNFPLAIACGMTVAALVTGNCTLLKPAETSSVITA 682
                                               *****9.99************************************************************ PP

                                 TIGR01237 212 klveileeaGlpkGvlqfvpGkGsevGeylvdhpktrlitftGsrevGlriyedaakvqpGqkhlkrvi 280
                                               kl eil eaG+pkGv+q+vpGkGs+vG ylv+hp+t+li+ftGs+evG+riy++aa ++p q+h+krvi
  lcl|NCBI__GCF_000204075.1:WP_011319687.1 683 KLTEILLEAGIPKGVFQYVPGKGSQVGAYLVSHPDTHLIAFTGSQEVGCRIYAEAATLKPQQRHMKRVI 751
                                               ********************************************************************* PP

                                 TIGR01237 281 aelGGkdavivdesadieqavaaavtsafGfaGqkcsaasrvvvlekvydevverfveatkslkvgktd 349
                                               ae+GGk+a+ivdesad++qav ++v+safG++Gqkcsa+srv+v+e +yd++++r+veatksl++g+++
  lcl|NCBI__GCF_000204075.1:WP_011319687.1 752 AEMGGKNAIIVDESADLDQAVIGVVQSAFGYSGQKCSACSRVIVVEAIYDAFIHRLVEATKSLNIGEAE 820
                                               ********************************************************************* PP

                                 TIGR01237 350 eadvqvgpvidqksfdkikeyielgkaegklvlggedddskGyfikptifkdvdrkarlaqeeifGpvv 418
                                                +++qvgpvid+++ d+i+eyie gkae++++l   +++++Gyf++p if++v ++ ++aq+eifGpv+
  lcl|NCBI__GCF_000204075.1:WP_011319687.1 821 LPSTQVGPVIDANARDRIREYIEKGKAESQVALE-LSAPNHGYFVGPVIFGEVPPHGTIAQQEIFGPVL 888
                                               *********************************9.99******************************** PP

                                 TIGR01237 419 avlrakdfdealeiansteygltGgvisnsrerierakaefevGnlyfnrkitGaivgvqpfGGfkmsG 487
                                               av++akdf +al ian+t+y+ltGg++s+++ +i++a++efevGnly+nr+itGaiv++qpfGGfk+sG
  lcl|NCBI__GCF_000204075.1:WP_011319687.1 889 AVIKAKDFAQALAIANDTDYALTGGLYSRTPSHIQQAQEEFEVGNLYINRNITGAIVARQPFGGFKLSG 957
                                               ********************************************************************* PP

                                 TIGR01237 488 tdskaGGpdylaqflqaktvteri 511
                                               ++skaGGpdyl+qfl+++t+te+i
  lcl|NCBI__GCF_000204075.1:WP_011319687.1 958 VGSKAGGPDYLLQFLEPRTITENI 981
                                               **********************86 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (511 nodes)
Target sequences:                          1  (993 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.02u 0.01s 00:00:00.03 Elapsed: 00:00:00.02
# Mc/sec: 20.97
//
[ok]
This GapMind analysis is from Sep 24 2021. The underlying query database was built on Sep 17 2021.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other blast hits with at least 50% coverage are "low confidence."
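As a rough illustration of the coverage criterion, the sketch below computes the fraction of a reference (a characterized protein or an HMM) spanned by an alignment and applies it to the two alignments shown above. It assumes that coverage means aligned range divided by reference length; the helper name `coverage` is hypothetical and this is not GapMind's own code.

```python
def coverage(start: int, end: int, length: int) -> float:
    """Fraction of a reference covered by the 1-based inclusive range start..end."""
    if not (1 <= start <= end <= length):
        raise ValueError("alignment range must lie within the reference")
    return (end - start + 1) / length


# BLAST alignment above: query positions 11..513 of the 515-residue characterized protein.
blast_cov = coverage(11, 513, 515)
# hmmsearch alignment above: HMM positions 5..511 of the 511-node model.
hmm_cov = coverage(5, 511, 511)

print(f"BLAST coverage: {blast_cov:.1%}")
print(f"HMM coverage:   {hmm_cov:.1%}")
print("meets the 50% threshold:", blast_cov >= 0.5)
```

Both alignments span nearly the whole reference, far above the 50% floor quoted for low-confidence hits.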
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
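For readers unfamiliar with the term, a six-frame translation translates a DNA sequence in all three reading frames on both strands, so that proteins missed by gene prediction can still be searched. The following is a minimal sketch using the standard genetic code; the function names are illustrative and this is not how Curated BLAST is implemented.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna: str) -> str:
    """Translate complete codons from position 0; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translation(dna: str) -> dict:
    """Map frame labels (+1, +2, +3, -1, -2, -3) to translated sequences."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


frames = six_frame_translation("ATGGCCTAA")
for label, protein in frames.items():
    print(label, protein)
```

Stop codons appear as `*` in the output, so open reading frames can be recovered by splitting each frame on that character.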
For more information, see:
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory