Align aldehyde dehydrogenase (NAD+) (EC 1.2.1.3) (characterized)
to candidate CCNA_00618 CCNA_00618 succinylglutamic semialdehyde dehydrogenase
Query= BRENDA::P76217
         (492 letters)

>FitnessBrowser__Caulo:CCNA_00618
          Length = 472

 Score =  490 bits (1261), Expect = e-143
 Identities = 254/471 (53%), Positives = 321/471 (68%), Gaps = 4/471 (0%)

Query: 15  ASR-VKRNPVSGEVLWQGNDADAAQVEQACRAARAAFPRWARLSFAERHAVVERFAALLE 73
           ASR + R+P +GE +      DA  ++ AC +ARAAF  WA    AER A+  RFA  +
Sbjct: 3   ASRLISRDPYTGEAIADFAVNDARSIDAACHSARAAFAEWAMTPLAERRAIALRFAETVR 62

Query: 74  SNKAELTAIIARETGKPRWEAATEVTAMINKIAISIKAYHVRTGEQRSEMPDGAASLRHR 133
           + + E+  +IARETGKP WEA TE  ++  K+AISI+A   R GE+   M D  A L HR
Sbjct: 63  ARREEIATLIARETGKPMWEALTEADSVAAKVAISIRAQDERAGERSEPMADATARLAHR 122

Query: 134 PHGVLAVFGPYNFPGHLPNGHIVPALLAGNTIIFKPSELTPWSGEAVMRLWQQAGLPPGV 193
           PHGVLAV GP+NFP HL NGHIVPALLAGN ++FKPSE TP  G+ +  LW+ AGLP  V
Sbjct: 123 PHGVLAVIGPFNFPMHLANGHIVPALLAGNAVVFKPSEKTPACGQLMGELWRAAGLPDHV 182

Query: 194 LNLVQGGRETGQALSALEDLDGLLFTGSANTGYQLHRQLSGQPEKILALEMGGNNPLIID 253
           L +V GG E G+AL   E LDG+LFTG    G  +HR L+  P KILALE+GGN PL++
Sbjct: 183 LTIVIGGGEAGEALVRHEALDGVLFTGGVQAGRAIHRALADAPHKILALELGGNAPLVVW 242

Query: 254 EVADIDAAVHLTIQSAFVTAGQRCTCARRLLLKSGAQGDAFLARLVAVSQRLTPGNWDDE 313
           +VADI+AA HL +QSA+VTAGQRCTCARRL+L  GA+GDA L  L  +  RL  G
Sbjct: 243 DVADIEAAAHLIVQSAYVTAGQRCTCARRLILPEGARGDALLEALTMLMDRLVIGGPFQS 302

Query: 314 PQPFIGGLISEQAAQQVVTAWQQLEAMGGRPLLAPRLLQAGTSLLTPGIIEMTGVAGVPD 373
           P PF+G +I   AA QV+ A  ++ A GGRPL    + +A ++LL+PG+IE+T  A + D
Sbjct: 303 PAPFMGPVIDAHAAAQVLAAQDRMTADGGRPLRLAAVREARSALLSPGLIELTD-APLRD 361

Query: 374 EEVFGPLLRVWRYDTFDEAIRMANNTRFGLSCGLVSPEREKFDQLLLEARAGIVNWNKPL 433
           EE+FGPLL+V R   FD A+ +AN TRFGL+ GL+S +   + +     RAGIVNWN+P
Sbjct: 362 EEIFGPLLQVRRAADFDAALALANATRFGLAAGLISDDEALYRRFWTSVRAGIVNWNRPT 421

Query: 434 TGAASTAPFGGIGASGNHRPSAWYAADYCAWPMASLESDS--LTLPATLNP 482
           TGA+S APFGG+G SGNHRPSA+YAADY A+P+A LES S    LP  LNP
Sbjct: 422 TGASSAAPFGGVGGSGNHRPSAYYAADYSAYPVAGLESPSPVYRLPIGLNP 472

Lambda     K      H
   0.318    0.134    0.412

Gapped
Lambda     K      H
   0.267   0.0410    0.140

Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 718
Number of extensions: 27
Number of successful extensions: 2
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 492
Length of database: 472
Length adjustment: 34
Effective length of query: 458
Effective length of database: 438
Effective search space: 200604
Effective search space used: 200604
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 52 (24.6 bits)
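The summary figures in the BLAST report can be cross-checked with a little arithmetic. The sketch below uses only numbers copied from the report; the variable names and the `percent` helper are illustrative and are not part of BLAST or GapMind. It assumes that coverage means aligned residues divided by sequence length, and that the effective search space is the product of the two effective lengths, which is consistent with the values printed in the report.

```python
# Values copied from the BLAST report; names are illustrative only.
identities, positives, alignment_length = 254, 321, 471
query_length, subject_length = 492, 472
query_start, query_end = 15, 482
subject_start, subject_end = 3, 472
length_adjustment = 34


def percent(numerator, denominator):
    """Return numerator/denominator as a percentage."""
    return 100.0 * numerator / denominator


identity_pct = percent(identities, alignment_length)
positive_pct = percent(positives, alignment_length)

# Fraction of each sequence that falls inside the aligned region.
query_coverage = percent(query_end - query_start + 1, query_length)
subject_coverage = percent(subject_end - subject_start + 1, subject_length)

# Effective lengths are the raw lengths minus the length adjustment.
effective_query = query_length - length_adjustment
effective_database = subject_length - length_adjustment
effective_search_space = effective_query * effective_database

print(f"identity {identity_pct:.1f}%, positives {positive_pct:.1f}%")
print(f"query coverage {query_coverage:.1f}%, subject coverage {subject_coverage:.1f}%")
print(f"effective search space {effective_search_space}")
```

The report shows 53% identity where the exact ratio is about 53.9%, which suggests the printed percentages are truncated rather than rounded.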
Align candidate CCNA_00618 CCNA_00618 (succinylglutamic semialdehyde dehydrogenase)
to HMM TIGR03240 (astD: succinylglutamate-semialdehyde dehydrogenase (EC 1.2.1.71))
# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR03240.hmm
# target sequence database:        /tmp/gapView.6693.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR03240  [M=484]
Accession:   TIGR03240
Description: arg_catab_astD: succinylglutamate-semialdehyde dehydrogenase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                              Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                              -----------
   8.3e-195  634.0   1.3     1e-194  633.7   1.3    1.0  1  lcl|FitnessBrowser__Caulo:CCNA_00618  CCNA_00618 succinylglutamic semi

Domain annotation for each sequence (and alignments):
>> lcl|FitnessBrowser__Caulo:CCNA_00618  CCNA_00618 succinylglutamic semialdehyde dehydrogenase
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  633.7   1.3    1e-194    1e-194      14     470 ..       5     460 ..       2     471 .. 0.99

  Alignments for each domain:
  == domain 1  score: 633.7 bits;  conditional E-value: 1e-194
                           TIGR03240  14 slesldpvtqevlwqgkaasaaqvekavkaarkafpawarlsleeriavvkrfaelleeekeelaeviaketg 86
                                         l s+dp t+e++ + + +a ++++a+++ar+af++wa ++l+er a+ rfae ++ ++ee+a++ia+etg
lcl|FitnessBrowser__Caulo:CCNA_00618   5 RLISRDPYTGEAIADFAVNDARSIDAACHSARAAFAEWAMTPLAERRAIALRFAETVRARREEIATLIARETG 77
                                         56799******************************************************************** PP

                           TIGR03240  87 kplweartevasmvakvaisikayeertGekeseladakavlrhrphGvlavfGpynfpGhlpnGhivpalla 159
                                         kp+wea te s++akvaisi+a +er+Ge+++++ada+a l hrphGvlav+Gp+nfp hl nGhivpalla
lcl|FitnessBrowser__Caulo:CCNA_00618  78 KPMWEALTEADSVAAKVAISIRAQDERAGERSEPMADATARLAHRPHGVLAVIGPFNFPMHLANGHIVPALLA 150
                                         ************************************************************************* PP

                           TIGR03240 160 GntvvfkpseltplvaeetvklwekaGlpaGvlnlvqGaretGkalaaeedidGllftGssntGallhrqlag 232
                                         Gn+vvfkpse tp +++ + +lw++aGlp+ vl +v G+ e+G+al+ +e +dG+lftG ++G +hr la+
lcl|FitnessBrowser__Caulo:CCNA_00618 151 GNAVVFKPSEKTPACGQLMGELWRAAGLPDHVLTIVIGGGEAGEALVRHEALDGVLFTGGVQAGRAIHRALAD 223
                                         ************************************************************************* PP

                           TIGR03240 233 rpekilalelGGnnplvveevkdidaavhlivqsafisaGqrctcarrllvkdgaeGdallerlvevaerltv 305
                                         p+kilalelGGn plvv++v+di+aa+hlivqsa+++aGqrctcarrl++++ga+Gdalle+l+ + +rl++
lcl|FitnessBrowser__Caulo:CCNA_00618 224 APHKILALELGGNAPLVVWDVADIEAAAHLIVQSAYVTAGQRCTCARRLILPEGARGDALLEALTMLMDRLVI 296
                                         ************************************************************************* PP

                           TIGR03240 306 GkydaepqpflGavisekaakellaaqekllalggksllelkqleeeaalltpgiidvtevaevpdeeyfgpl 378
                                         g + p+pf+G+vi ++aa ++laaq++++a gg+ l e+ +all+pg+i++t+ a + dee fgpl
lcl|FitnessBrowser__Caulo:CCNA_00618 297 GGPFQSPAPFMGPVIDAHAAAQVLAAQDRMTADGGRPLRLAAVREARSALLSPGLIELTD-APLRDEEIFGPL 368
                                         *9999*********************************9999999***************.699********* PP

                           TIGR03240 379 lkvlrykdfdealaeanntrfGlaaGllsddrelydkflleiraGivnwnkpltGassaapfGGiGasGnhrp 451
                                         l+v r +dfd+ala an+trfGlaaGl+sdd++ly++f++++raGivnwn+p+tGassaapfGG+G sGnhrp
lcl|FitnessBrowser__Caulo:CCNA_00618 369 LQVRRAADFDAALALANATRFGLAAGLISDDEALYRRFWTSVRAGIVNWNRPTTGASSAAPFGGVGGSGNHRP 441
                                         ************************************************************************* PP

                           TIGR03240 452 sayyaadycaypvaslead 470
                                         sayyaady aypva le+
lcl|FitnessBrowser__Caulo:CCNA_00618 442 SAYYAADYSAYPVAGLESP 460
                                         ****************985 PP

Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (484 nodes)
Target sequences:                          1  (472 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.01u 0.01s 00:00:00.02 Elapsed: 00:00:00.01
# Mc/sec: 11.84
//
[ok]
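The single row of the hmmsearch domain table already says how much of the model and of the protein the hit spans. As a rough sketch, the row can be split on whitespace and the coordinates read by position; the field order below follows the column headers printed in the report, while the function name and dictionary keys are illustrative and are not part of HMMER.

```python
# Domain row and lengths copied from the hmmsearch report.
DOMAIN_ROW = "1 ! 633.7 1.3 1e-194 1e-194 14 470 .. 5 460 .. 2 471 .. 0.99"
MODEL_LENGTH = 484    # "[M=484]" in the report
PROTEIN_LENGTH = 472  # "472 residues searched"


def parse_domain_row(row):
    """Split one domain-table row into named fields (illustrative helper)."""
    f = row.split()
    return {
        "score": float(f[2]),
        "i_evalue": float(f[5]),
        "hmm_from": int(f[6]),
        "hmm_to": int(f[7]),
        "ali_from": int(f[9]),
        "ali_to": int(f[10]),
        "acc": float(f[15]),
    }


domain = parse_domain_row(DOMAIN_ROW)

# Coverage is the aligned span divided by the full length.
model_coverage = (domain["hmm_to"] - domain["hmm_from"] + 1) / MODEL_LENGTH
protein_coverage = (domain["ali_to"] - domain["ali_from"] + 1) / PROTEIN_LENGTH

print(f"model coverage   {model_coverage:.1%}")
print(f"protein coverage {protein_coverage:.1%}")
```

For this hit the aligned span is 457 of 484 model positions and 456 of 472 residues.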
This GapMind analysis is from Sep 17 2021. The underlying query database was built on Sep 17 2021.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other blast hits with at least 50% coverage are "low confidence."
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
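The definition of a gap given above (a step with no high- or medium-confidence candidate) can be sketched in a few lines. This is only an illustration of that rule, not GapMind's code: the `find_gaps` function, the data layout, and every step name other than astD are hypothetical.

```python
def find_gaps(candidates_by_step):
    """Return the steps that have no high- or medium-confidence candidate.

    candidates_by_step maps a step name to the list of confidence labels
    ("high", "medium", or "low") of its candidates. Hypothetical layout.
    """
    good = {"high", "medium"}
    return [
        step
        for step, levels in candidates_by_step.items()
        if not any(level in good for level in levels)
    ]


# Hypothetical pathway: only astD appears in this report; the other
# step names are placeholders for illustration.
pathway = {
    "astD": ["high"],
    "hypothetical_transporter": ["low"],
    "hypothetical_step": [],
}

gaps = find_gaps(pathway)
print(gaps)
```

A step with only low-confidence candidates, or with no candidates at all, counts as a gap under this rule.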
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
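For readers unfamiliar with the term, a six-frame translation reads a DNA sequence in all three reading frames on each strand, so that proteins missed by gene prediction can still be found. The following is a minimal sketch of the idea, not the code behind Curated BLAST; it assumes the standard codon-to-amino-acid assignments and unambiguous A/C/G/T input, and the function names are illustrative.

```python
from itertools import product

# Standard genetic code, listed in TCAG order (first base varies slowest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons from the start of dna; 'X' marks unknown codons."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translation(dna):
    """Return translations for frames +1..+3 and -1..-3, keyed by frame label."""
    dna = dna.upper()
    reverse = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna[offset:])
        frames[f"-{offset + 1}"] = translate(reverse[offset:])
    return frames


frames = six_frame_translation("ATGGCCTAA")
for label in ("+1", "+2", "+3", "-1", "-2", "-3"):
    print(label, frames[label])
```

Stop codons appear as `*`; a search tool would typically split each frame at stops and compare the resulting open stretches against the curated proteins.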
For more information, see:
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory