Align Homocysteine formation from aspartate semialdehyde (NIL/ferredoxin component) (characterized)
to candidate WP_010876559.1 MTH_RS04380 4Fe-4S dicluster domain-containing protein
Query= reanno::Miya:8499492
         (147 letters)

>NCBI__GCF_000008645.1:WP_010876559.1
          Length = 332

 Score = 57.8 bits (138), Expect = 2e-13
 Identities = 27/83 (32%), Positives = 41/83 (49%), Gaps = 14/83 (16%)

Query: 72  REQDVTVTDVSQRISRDEDSCMHCGMCTAICPTSALAMD-------------IEARVVVF 118
           R+ D +   ++  I  DED C++CG+C  ICP  A+ M              +E  V+
Sbjct: 164 RDPDSSNMAIADGIRVDEDKCLYCGICKRICPVGAIRMSCLTCMYNEELKATVEGAVITI 223

Query: 119 DKDRCTACGLCTRVCPVGAMNVE 141
           D +RC  CG C  +CP  A+ V+
Sbjct: 224 D-ERCAHCGWCMEICPANAITVK 245

 Score = 55.1 bits (131), Expect = 1e-12
 Identities = 26/81 (32%), Positives = 44/81 (54%), Gaps = 16/81 (19%)

Query: 76  VTVTDVSQR-------ISRDEDSCMHCGMCTAICPTSALAMDIE---------ARVVVFD 119
           +T+ D+ +R       I+   + C++CG C A+CP SA+ +            A  +  D
Sbjct: 121 LTIRDLPERKSLVKGEINVSMEKCIYCGECAAMCPASAIEISWRDPDSSNMAIADGIRVD 180

Query: 120 KDRCTACGLCTRVCPVGAMNV 140
           +D+C  CG+C R+CPVGA+ +
Sbjct: 181 EDKCLYCGICKRICPVGAIRM 201

 Score = 47.8 bits (112), Expect = 2e-10
 Identities = 21/66 (31%), Positives = 36/66 (54%), Gaps = 9/66 (13%)

Query: 85  ISRDEDSCMHCGMCTAICPTSALAMD---------IEARVVVFDKDRCTACGLCTRVCPV 135
           ++ + D C  CG+C+  CP +A+            I+   V F+K++C  CGLC  VC
Sbjct: 14  LNYNPDLCTGCGLCSETCPVNAIDRAPLLPIARGLIKMNRVSFNKEKCVLCGLCASVCIF 73

Query: 136 GAMNVE 141
           GA++++
Sbjct: 74  GAIDLQ 79

 Score = 43.1 bits (100), Expect = 4e-09
 Identities = 19/55 (34%), Positives = 29/55 (52%), Gaps = 3/55 (5%)

Query: 89  EDSCMHCGMCTAICPTSALAMDIEARVVVFDKD-RC--TACGLCTRVCPVGAMNV 140
           ++ C HCG C  ICP +A+ +    R  +   D RC   +C  C  VCP  A+++
Sbjct: 224 DERCAHCGWCMEICPANAITVKKPIRGTISQADERCRGESCHACVDVCPCNAISI 278

 Score = 40.0 bits (92), Expect = 3e-08
 Identities = 20/60 (33%), Positives = 35/60 (58%), Gaps = 6/60 (10%)

Query: 85  ISRDEDSCM--HCGMCTAICPTSALAM-DIEARVVVFDKDRCTACGLCTRVCPVGAMNVE 141
           IS+ ++ C    C  C  +CP +A+++ +  AR+   D+  C  CG C+ VCP G +++E
Sbjct: 252 ISQADERCRGESCHACVDVCPCNAISIINGTARI---DEKFCVFCGACSSVCPDGLLSIE 308

 Score = 38.9 bits (89), Expect = 8e-08
 Identities = 16/68 (23%), Positives = 32/68 (47%), Gaps = 13/68 (19%)

Query: 84  RISRDEDSCMHCGMCTAICPTSALAMDIEARVV-------------VFDKDRCTACGLCT 130
           R+S +++ C+ CG+C ++C   A+ +  + + +               D ++C  CG C
Sbjct: 53  RVSFNKEKCVLCGLCASVCIFGAIDLQKDGKSIRGADEYPFWDFKLEIDDEKCFLCGNCA 112

Query: 131 RVCPVGAM 138
             CP  A+
Sbjct: 113 DACPRNAL 120

 Score = 37.4 bits (85), Expect = 2e-07
 Identities = 14/31 (45%), Positives = 20/31 (64%)

Query: 109 MDIEARVVVFDKDRCTACGLCTRVCPVGAMN 139
           M  E R + ++ D CT CGLC+  CPV A++
Sbjct: 7   MGSERRTLNYNPDLCTGCGLCSETCPVNAID 37

Lambda     K      H
   0.321    0.137    0.414

Gapped
Lambda     K      H
   0.267   0.0410    0.140

Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 239
Number of extensions: 27
Number of successful extensions: 14
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 8
Number of HSP's successfully gapped: 8
Length of query: 147
Length of database: 332
Length adjustment: 22
Effective length of query: 125
Effective length of database: 310
Effective search space: 38750
Effective search space used: 38750
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.9 bits)
S2: 45 (21.9 bits)
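The statistics block above can be cross-checked by hand. The following is a minimal sketch, not part of GapMind or BLAST: it recomputes the effective search space from the reported lengths and length adjustment, then estimates each Expect value with the standard bit-score relation E = m * n * 2^(-S), where m * n is the effective search space and S is the bit score. The function names are hypothetical, and because the reported bit scores are rounded to one decimal place, the results should only be expected to agree with the reported Expect values approximately.

```python
# Values copied from the BLAST statistics block in this report.
QUERY_LEN = 147
SUBJECT_LEN = 332
LENGTH_ADJUSTMENT = 22

# Bit scores of the seven HSPs listed in this report.
HSP_BIT_SCORES = (57.8, 55.1, 47.8, 43.1, 40.0, 38.9, 37.4)


def effective_search_space(query_len, subject_len, length_adjustment):
    """Product of the effective query and database lengths."""
    return (query_len - length_adjustment) * (subject_len - length_adjustment)


def expect_from_bits(bit_score, search_space):
    """Approximate E-value from a bit score: E = search_space * 2**(-bit_score)."""
    return search_space * 2.0 ** (-bit_score)


space = effective_search_space(QUERY_LEN, SUBJECT_LEN, LENGTH_ADJUSTMENT)
print("effective search space:", space)
for bits in HSP_BIT_SCORES:
    print(f"{bits:5.1f} bits -> E ~ {expect_from_bits(bits, space):.1e}")
```

The computed search space matches the reported 38750 (125 x 310), which is a quick way to confirm that the lengths and the length adjustment were read correctly from the report.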
This GapMind analysis is from Apr 10 2024. The underlying query database was built on Apr 09 2024.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other blast hits with at least 50% coverage are "low confidence."
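The coverage threshold can be illustrated with a small sketch. This page does not say how GapMind computes coverage, so the definition used here is an assumption: the fraction of query positions covered by at least one aligned segment. The function name, the constant name, and the choice to merge overlapping segments are all hypothetical. The ranges are the query coordinates of the alignment shown above; GapMind's own figure comes from ublast and may differ.

```python
# Assumed threshold, taken from the "at least 50% coverage" rule in the text.
LOW_CONFIDENCE_MIN_COVERAGE = 0.5


def query_coverage(hsp_ranges, query_len):
    """Fraction of query positions covered by at least one HSP.

    hsp_ranges holds (start, end) pairs, 1-based and inclusive.
    Overlapping ranges are counted once.
    """
    covered = 0
    current_end = 0  # last query position already counted
    for start, end in sorted(hsp_ranges):
        start = max(start, current_end + 1)  # skip positions counted earlier
        if end >= start:
            covered += end - start + 1
            current_end = end
    return covered / query_len


# Query coordinates of the seven HSPs in the alignment above (query length 147).
ranges = [(72, 141), (76, 140), (85, 141), (89, 140),
          (85, 141), (84, 138), (109, 139)]
coverage = query_coverage(ranges, 147)
print(f"query coverage: {coverage:.1%}")
print("meets assumed threshold:", coverage >= LOW_CONFIDENCE_MIN_COVERAGE)
```

Merging overlaps matters for a query like this one, where several segments align the same ferredoxin-like region to different parts of the candidate; summing segment lengths without merging would count the same query positions repeatedly.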
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
For more information, see:
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory