Align aminomuconate-semialdehyde dehydrogenase (EC 1.2.1.32) (characterized)
to candidate WP_012061562.1 SMED_RS23160 aldehyde dehydrogenase family protein
Query= BRENDA::Q83XU8
(485 letters)

>NCBI__GCF_000017145.1:WP_012061562.1
Length = 794

Score = 308 bits (790), Expect = 3e-88
Identities = 176/444 (39%), Positives = 249/444 (56%), Gaps = 22/444 (4%)

Query: 25  PLNNAVIAKVHEAGRAEVDAAVAAAQAALKGAWGRMSLAQRVEVLYAVADGINRRFDDFL 84
           P   A++A +   GR +VDAAVAAA+ A +G W ++S   R   LYA+A  I R
Sbjct: 55  PATGALLAGIARGGREDVDAAVAAARKA-QGPWAKLSGHARARHLYALARLIQRHARLIA 113

Query: 85  AAEVEDTGKPMSLARHVDIPRGAANFKIFADVVKNVPTEFFEMPTPDGVGAINYAVRRPV 144
             E  D GKP+   R +D+P  A +F   A   +   TEF              A + PV
Sbjct: 114 VVEALDNGKPIRETRDIDVPLAARHFYHHAGWAQLQETEF--------------ADQVPV 159

Query: 145 GVVGVICPWNLPLLLMTWKVGPALACGNTVVVKPSEETPQTAALLGEVMNTAGVPPGVYN 204
           GVVG + PWN P L++ WKV PALA GNTV++KP+E TP TA L  E+   +G+PPGV N
Sbjct: 160 GVVGQVIPWNFPFLMLAWKVAPALALGNTVILKPAEYTPLTALLFAEMAAASGLPPGVLN 219

Query: 205 VVHGFGPNSTGEFLTSHPDVNAITFTGETGTGEAIMKAAADGARPVSLELGGKNAAIVFA 264
           VV G G   TG  +  H D++ I FTG T  G  I +  A   + ++LELGGK+  IVF
Sbjct: 220 VVTGEG--ETGALIVEHEDIDKIAFTGSTEIGRLIREKTAGSGKSLTLELGGKSPFIVFD 277

Query: 265 DCDLDKAIEGTLRSCFANCGQVCLGTERVYVERPIFDRFVSRLKKGAEGMQLGRPEDLAT 324
           D D+D A+EG + + + N GQVC    R+ V+  +   F  RLK+  + +++G P D +
Sbjct: 278 DADIDAAVEGVVDAIWFNQGQVCCAGSRLLVQEGVAPVFHERLKRRMDTLRVGHPLDKSI 337

Query: 325 GMGPLISQEHREKVLSYYKKAVEAGATVVTGGGVPEMPEALKGGAWVQPTIWTGLGDDSV 384
            M  +++    +++     K V  GA++      P + E  KGG++ +PT+ TG+   SV
Sbjct: 338 DMAAIVAPVQLQRIAELVAKGVAEGASMHQ----PRI-ELPKGGSFYRPTLLTGVQPTSV 392

Query: 385 VAREEIFGPCALVMPFDSEEEVIRRANDNDYGLARRIWTTNLSRAHRVAGAIEVGIAWVN 444
           VA EEIFGP A+ M F + +E I+ AN + YGLA  +W+  +  A  VA  +  G+ WVN
Sbjct: 393 VATEEIFGPVAVSMTFRTPDEAIQLANHSRYGLAASVWSETIGLALHVAAKLAAGVVWVN 452

Query: 445 SWFLRDLRTAFGGSKQSGIGREGG 468
           +  L D    FGG ++SG GREGG
Sbjct: 453 ATNLFDAAAGFGGKRESGFGREGG 476

Score = 90.1 bits (222), Expect = 3e-22
Identities = 66/201 (32%), Positives = 97/201 (48%), Gaps = 14/201 (6%)

Query: 30  VIAKVHEAGRAEVDAAVAAAQAALKGAWGRMSLAQRVEVLYAVADGINRRFDDFLAAEVE 89
           V+ +V E  R ++  AV AA+ A    W   +   R ++LY +A+ ++ R  +F
Sbjct: 545 VVGEVGEGNRKDIRNAVVAARGA--SGWSSATAHNRAQILYYIAENLSSRGAEFADRIAA 602

Query: 90  DTGKPMSLARHVDIPRGAA---NFKIFADVVKNVPTEFFEMPTPDGVGAINYAVRRPVGV 146
            TG   + AR  ++    A   ++  +AD  + V       P   GV     A+  P GV
Sbjct: 603 MTGASAASAR-TEVEASIARLFSYGAWADKYEGV----VHQPPLRGVAL---AMPEPQGV 654

Query: 147 VGVICPWNLPLLLMTWKVGPALACGNTVVVKPSEETPQTAALLGEVMNTAGVPPGVYNVV 206
           VGVICP   PLL +   V P +A GN V+  PSE  P  A     V+ T+ VPPGV N+V
Sbjct: 655 VGVICPPEAPLLGLVSLVAPLIAVGNRVIAVPSETHPLAATDFYSVLETSDVPPGVINIV 714

Query: 207 HGFGPNSTGEFLTSHPDVNAI 227
            G       + L +H DV+A+
Sbjct: 715 TGLA-TELAKALAAHNDVDAL 734

Lambda     K        H
0.318      0.135    0.406

Gapped
Lambda     K        H
0.267      0.0410   0.140

Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 924
Number of extensions: 48
Number of successful extensions: 5
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 2
Number of HSP's successfully gapped: 2
Length of query: 485
Length of database: 794
Length adjustment: 37
Effective length of query: 448
Effective length of database: 757
Effective search space: 339136
Effective search space used: 339136
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 53 (25.0 bits)
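The statistics above are enough to roughly re-derive the reported bit scores and E-values. The following is a minimal sketch, assuming the standard Karlin-Altschul relations (bit score S' = (lambda * S - ln K) / ln 2 and E = search space * 2^(-S')) and the gapped Lambda and K printed in the report. The function names are hypothetical and not part of BLAST or GapMind, and because the printed parameters are rounded, the results only approximate the reported values.

```python
import math


def effective_search_space(query_len, db_len, length_adjustment, num_seqs=1):
    """Effective search space: both lengths shrink by the length adjustment."""
    eff_query = query_len - length_adjustment
    eff_db = db_len - num_seqs * length_adjustment
    return eff_query * eff_db


def bit_score(raw_score, lam, k):
    """Convert a raw alignment score to bits (Karlin-Altschul normalization)."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def expect_value(raw_score, lam, k, search_space):
    """Expected number of chance hits with at least this score."""
    return search_space * 2.0 ** (-bit_score(raw_score, lam, k))


# Values copied from the report: query 485 aa, subject 794 aa,
# length adjustment 37, gapped Lambda 0.267 and K 0.0410 (both rounded).
space = effective_search_space(485, 794, 37)
for raw in (790, 222):  # raw scores of the two HSPs
    bits = bit_score(raw, 0.267, 0.0410)
    evalue = expect_value(raw, 0.267, 0.0410, space)
    print(f"raw={raw} bits={bits:.1f} E={evalue:.1e}")
```

The search space comes out to the 339136 printed in the report. The bit scores and E-values land close to, but not exactly on, the reported 308 bits / 3e-88 and 90.1 bits / 3e-22, because Lambda and K are shown with only three significant digits.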
This GapMind analysis is from Apr 09 2024. The underlying query database was built on Sep 17 2021.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other blast hits with at least 50% coverage are "low confidence."
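To make the coverage criterion concrete, here is a minimal sketch that computes coverage and percent identity for the two alignment segments shown above. It assumes that coverage is measured as the fraction of the curated protein (the 485-residue query in the alignment) spanned by the hit; that definition and the function names are assumptions for illustration, and only the 50% threshold comes from the text above.

```python
def coverage(start, end, length):
    """Fraction of a protein spanned by an alignment (1-based, inclusive)."""
    return (end - start + 1) / length


def percent_identity(identities, alignment_length):
    """Identical columns as a percentage of all alignment columns."""
    return 100.0 * identities / alignment_length


def has_low_confidence_coverage(cov, min_coverage=0.5):
    """True if a hit reaches the 50% coverage mentioned for low confidence."""
    return cov >= min_coverage


CURATED_LENGTH = 485  # length of BRENDA::Q83XU8

# (query start, query end, identities, alignment length) for each segment
segments = [(25, 468, 176, 444), (30, 227, 66, 201)]
for start, end, ident, aln_len in segments:
    cov = coverage(start, end, CURATED_LENGTH)
    pid = percent_identity(ident, aln_len)
    print(f"{start}-{end}: coverage={cov:.2f} identity={pid:.1f}% "
          f"meets 50% coverage: {has_low_confidence_coverage(cov)}")
```

Under these assumptions, the first segment spans about 92% of the curated protein, while the second spans about 41% and would not reach the 50% threshold on its own.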
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
For more information, see:
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory