Align methylmalonate-semialdehyde dehydrogenase (CoA-acylating) (EC 1.2.1.27) (characterized)
to candidate WP_099017976.1 CCS90_RS02790 aldehyde dehydrogenase family protein
Query= BRENDA::P42412
         (487 letters)

>NCBI__GCF_002591915.1:WP_099017976.1
          Length = 787

 Score = 223 bits (568), Expect = 2e-62
 Identities = 148/474 (31%), Positives = 245/474 (51%), Gaps = 18/474 (3%)

Query: 6   KLKNYINGEWVESKTDQYEDVVNPATKEVLCQVPISTKEDIDYAAQTAAEAFKTWSKVAV 65
           K+ ++ING+W   K     +  NPA+ E L QV  + K  ++ A + A +A   W  ++
Sbjct: 32  KMSHFINGDWHSGKNHFASN--NPASGERLAQVSQADKNTVNQAVKAAEKALPGWQALSG 89

Query: 66  PRRARILFNFQQLLSQHKEELAHLITIENGKNTKEALG-EVGRGIENVEFAAGAPSLMMG 124
             R+  L+   +L+ ++    A L T++NGK  +E+   +V   I +    AG   L+
Sbjct: 90  HARSEYLYAIARLIQKNSRLFAVLETLDNGKPIRESRDIDVPLAIRHFYHHAGWAQLL-- 147

Query: 125 DSLASIATDVEAANYRYPIGVVGGIAPFNFPMMVPCWMFPMAIALGNTFILKPSERTPLL 184
                   D E  ++   IGVVG I P+NFP+++  W    A+A GNT ++KP+E T L
Sbjct: 148 --------DTEFPDHE-AIGVVGQIIPWNFPLLMLAWKIAPALATGNTIVIKPAEFTSLT 198

Query: 185 TEKLVELFEKAGLPKGVFNVVYGAHDVVNGILEHPEIKAISFVGSKPVGEYVYKKGSENL 244
                E+ ++AGLPKGV NVV G       I+EHP+IK I+F GS  VG ++ +  + +
Sbjct: 199 ALLFAEICQQAGLPKGVLNVVTGDGSTGQHIVEHPDIKKIAFTGSTAVGRWIRQATAGSG 258

Query: 245 KRVQSLTGAKNHTIVLNDANLEDTVTNIVGAAFGSAGERCMACAVVTVEEGIADEFMAKL 304
           K++    G K+  IV  DA+L+  V  +V A + + G+ C A + + V+E +A+ F  KL
Sbjct: 259 KKLSLELGGKSAFIVCADADLDAAVEGVVDAIWFNQGQVCCAGSRLLVQEAVAERFHKKL 318

Query: 305 QEKVADIKIGNGLDDGVFLGPVIREDNKKRTLSYIEKGLEEGARLVCDGRENVSDDGYFV 364
             ++  ++IG+ LD  + +G +I    +++  S ++ G++ GA L     E +   G F
Sbjct: 319 ISRMHKLRIGDPLDKSIDMGAIIDPRQQQKIASLVQAGVDAGACLNQSTAE-LPSSGCFY 377

Query: 365 GPTIFDNVTTEMTIWKDEIFAPVLSVIRVKNLKEAIEIANKSEFANGACLFTSNSNAIRY 424
            PT+  +V+    + +DEIF PVL  +  +   EA+ +AN S +   A +++ N N   +
Sbjct: 378 PPTLLTDVSPSDQVVQDEIFGPVLVSMTFRTPSEAVALANNSRYGLAASIWSENINMAMH 437

Query: 425 FRENIDAGMLGINLGVPAPMAFFPFSGWKSSFFGTLHANGKDSVDFYTRKKVVT 478
               + AG++ IN       A   F G K S FG     GK+ + Y   K+ T
Sbjct: 438 LAPLVQAGIVWINC-TNMMDAAAGFGGVKESGFG--REGGKEGLYEYLTLKINT 488

 Score = 64.7 bits (156), Expect = 1e-14
 Identities = 57/240 (23%), Positives = 107/240 (44%), Gaps = 7/240 (2%)

Query: 8   KNYINGEWVESKTDQYEDVVNPATKEVLCQVPISTKEDIDYAAQTAAEAFKTWSKVAVPR 67
           K YI G+ V      Y   ++ +  +++    +  ++D+  A + AA +  +WS +   +
Sbjct: 517 KMYIGGKQVRPDGG-YSYAIHNSKGQIITHAGLGNRKDVRNAVEAAATSH-SWSHMNAYQ 574

Query: 68  RARILFNFQQLLSQHKEELAHLITIENGKNTKEALGEVGRGIENVEFAAGAPSLMMGDSL 127
           R ++++   + LS  K E A  +    G + K+A  EV   I+ + + A       G
Sbjct: 575 RQQVMYFMAENLSYRKAEFATRLE-SFGHSKKQAETEVESSIQRLSYYAAWCDKFDGQIH 633

Query: 128 ASIATDVEAANYRYPIGVVGGIAPFNFPMMVPCWMFPMAIALGNTFILKPSERTPLLTEK 187
           +     V  A +  PIGV+    P   P++    +   AIA+GN  ++  SE  PL
Sbjct: 634 SVPMRGVVMAMHE-PIGVIAIQCPDEAPLLSFISLLAPAIAMGNRVVMTASEAFPLSALD 692

Query: 188 LVELFEKAGLPKGVFNVVYGAH-DVVNGILEHPEIKAISFVGSKPVGEYVYKKGSENLKR 246
             ++   + +P G  N++ G   ++   +  H  + AI   GS    ++V  + + NLKR
Sbjct: 693 FYQVLNTSDVPAGTVNIISGNRLELTKPLAGHMNVDAIWAFGSD--AQWVEHESATNLKR 750

Lambda     K      H
   0.318    0.136    0.396

Gapped
Lambda     K      H
   0.267   0.0410    0.140

Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 815
Number of extensions: 39
Number of successful extensions: 4
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 2
Number of HSP's successfully gapped: 2
Length of query: 487
Length of database: 787
Length adjustment: 37
Effective length of query: 450
Effective length of database: 750
Effective search space: 337500
Effective search space used: 337500
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 53 (25.0 bits)
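The statistics above can be cross-checked with a little arithmetic. The sketch below is not part of GapMind or BLAST. It recomputes the effective search space from the reported lengths and length adjustment, then estimates each Expect value with the standard relation E = (effective search space) x 2^(-bit score). The function names are hypothetical, and because the reported bit scores are rounded, the results should only agree with the reported Expect values to within rounding.

```python
# Values copied from the BLAST statistics above.
QUERY_LEN = 487
DB_LEN = 787
LENGTH_ADJUSTMENT = 37


def effective_search_space(query_len: int, db_len: int, adjustment: int) -> int:
    """Product of the effective query and database lengths."""
    return (query_len - adjustment) * (db_len - adjustment)


def expect_from_bits(bit_score: float, search_space: int) -> float:
    """Estimate the Expect value from a bit score (hypothetical helper)."""
    return search_space * 2.0 ** (-bit_score)


space = effective_search_space(QUERY_LEN, DB_LEN, LENGTH_ADJUSTMENT)
e_first = expect_from_bits(223, space)    # first HSP, reported Expect = 2e-62
e_second = expect_from_bits(64.7, space)  # second HSP, reported Expect = 1e-14

print(f"effective search space: {space}")
print(f"first HSP:  E ~ {e_first:.1e}")
print(f"second HSP: E ~ {e_second:.1e}")
```

The computed search space matches the reported 337500 exactly, since it depends only on integers printed in the report.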
This GapMind analysis is from Sep 24 2021. The underlying query database was built on Sep 17 2021.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other blast hits with at least 50% coverage are "low confidence."
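As an illustration of the 50% coverage threshold, here is a minimal sketch that computes coverage and percent identity for the first alignment shown above (query positions 6 to 478 of 487, with 148 identities over 474 aligned columns). It assumes that coverage means the fraction of the curated query protein spanned by the alignment. That definition, and the function names, are assumptions for illustration rather than GapMind's actual code.

```python
def query_coverage(q_start: int, q_end: int, query_len: int) -> float:
    """Fraction of the query spanned by the alignment (1-based, inclusive)."""
    return (q_end - q_start + 1) / query_len


def percent_identity(identities: int, aligned_columns: int) -> float:
    """Identical positions as a percentage of aligned columns."""
    return 100.0 * identities / aligned_columns


LOW_CONFIDENCE_MIN_COVERAGE = 0.50  # threshold quoted in the text above

coverage = query_coverage(6, 478, 487)
identity = percent_identity(148, 474)
meets_coverage = coverage >= LOW_CONFIDENCE_MIN_COVERAGE

print(f"coverage = {coverage:.1%}, identity = {identity:.1f}%")
print(f"meets the 50% coverage threshold: {meets_coverage}")
```

This only checks the coverage threshold. Whether a hit actually rates as high, medium, or low confidence depends on criteria not reproduced here.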
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
For more information, see:
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory