Align glycolaldehyde oxidoreductase large subunit (characterized)
to candidate GFF1151 PGA1_c11660 molybdenum-containing hydroxylase
Query= metacyc::MONOMER-18071
         (749 letters)

>FitnessBrowser__Phaeo:GFF1151
          Length = 766

 Score = 343 bits (880), Expect = 2e-98
 Identities = 240/763 (31%), Positives = 378/763 (49%), Gaps = 36/763 (4%)

Query: 6   KPVKRIYDDKFVTGRSTYVDDIR-IPALYAGFVRSTYPHAIIKRIDVSDALKVNGIVAVF 64
           +PV+R+ D +F+TG+  Y++D   + AL A  +RS   H  I  +D+  A +  G+ A+
Sbjct: 9   QPVRRVEDIRFLTGQGRYLEDAAPVGALRAWVLRSPVAHGTITHLDLDMAREAEGVQAIL 68

Query: 65  T-----AKEINPLLKGGIRPWPTYIDIRSFRYS---ERKAFPENKVKYVGEPVAIVLGQD 116
           T     A  IN  + G +      +D R    +   ER     ++V++VGEP+A+V+ +
Sbjct: 69  TLADLEAAGINVGMDGAV------VDNRDGSKAAAPERPMLVRDRVRFVGEPIAVVIAET 122

Query: 117 KYSVRDAIDKVVVEYEPLKPVIRMEEAEKDQVIIHEELKTNISYKIPF-KAGEVDKAFSE 175
               RDA + + ++ + L   + +    +    +H E   N ++          + AF+
Sbjct: 123 LEQARDAGEMIELDIDDLPAKVDLTAGGEQ---LHAEAPDNRAFDWSMGDEAATEAAFAS 179

Query: 176 SDKVVRVEAINERLIPNPMEPRGIVSRFEAGTLSIWYSTQVPHYMRSEFARILGIPESKI 235
           +   V ++  + R+I N MEPRG  + +  G L   Y  Q    M+ + +  LG+    +
Sbjct: 180 AAHRVALQVEDNRIIVNTMEPRGCFATWAEGRLQFTYGGQGVWAMKKQLSDKLGLASEAV 239

Query: 236 KVSMPDVGGAFGAKVHLMPEELAVVASSIILGRPVRWTATRSEEMLASEARHNVFT-GEV 294
            V+ PDVGG FG K    PE  AV  ++ +LG PV W + RSE ML+  A  ++ +  E+
Sbjct: 240 HVTTPDVGGGFGMKAMPYPEYFAVAHAARLLGGPVFWMSDRSEAMLSDNAGRDLTSLAEL 299

Query: 295 AVKRDGTILGIKGKLLLDLGAYIT-VTAGIQPLIIPMMIPGPYKIRNLDIESVAVYTNTP 353
           A   D  I G +     +LGAY +     +Q  +   ++ G Y I    +    +YTNT
Sbjct: 300 AFDADLRITGYRVHSHCNLGAYNSHFGQPVQTQLFSRVLAGVYDIDTTWLRVEGIYTNTA 359

Query: 354 PITMYRGASRPEATYIIERIMSTVADELGLDDVSIREKNLVT--ELPYTNPFGLRYDSGD 411
            +  YRGA RPEA Y++ER+M   A ELG+D   +R +N +   + PY       YD GD
Sbjct: 360 QVDAYRGAGRPEAIYVLERVMDHAARELGVDPWELRRRNFIRADQFPYKTATDETYDVGD 419

Query: 412 YVGLLR--EGVKRLGYYELKKWAEEERKKGHRVGVGLAYYLEICSFGPWEYAEVRVDERG 469
           +  LL   E    L  +  +K A+  R  G+  GVGL YY+E     P E AEV  +E G
Sbjct: 420 FHQLLTLTEEAADLAGFAQRKAADALR--GNLRGVGLCYYIESILGDPSESAEVAFEEEG 477

Query: 470 DVLVVTGTTPHGQGTETAIAQIVADALQIDISRVRVIWGDTDTVAASMGTYGSRSVTIGG 529
            V +  GT  +GQG ET  AQ +AD   I  + + V+ GD+D +A   GT GSRSVT
Sbjct: 478 RVSIYVGTQSNGQGHETVYAQFLADQTGIPAADITVVQGDSDRIAQGGGTGGSRSVTTQA 537

Query: 530 SAAIKVAEKILDKMKRIAASTWNVDVQEVQYEKGEFKLKNDPSKKMSWDDVASIAYRSHD 589
           +A +   + +++      A    V    V ++  +F      +++ +  + A +A
Sbjct: 538 NATLAAVDVMIEAFTPFLAEQLGVAPASVVFDGEQFSAPGS-NQRPTLVEAAEMARAQGR 596

Query: 590 PGLVEKIIYEN--DVTFPYGVHIATVEVD-DTGVARVLEYRAYDDIGKVVNPALAEAQIH 646
             L+           +FP G H+A + +D +TGV+ V  Y   DD G ++NP LAE Q+H
Sbjct: 597 HELLRHTARAKLPGRSFPNGAHVAEIVIDPETGVSHVDRYTVVDDFGNLINPLLAEGQVH 656

Query: 647 GGGVQAVGQALYEQALLNENGQLI-VTYADYYVPTAVEAPKFTSVFADQYHPSNYPTGSK 705
           GG  Q +GQAL E+ + + +GQL+  ++ DY +P A + P     FA      N P G K
Sbjct: 657 GGVAQGLGQALLERVVYDADGQLLTASFMDYALPRAADVPMIDVGFAPVPSTQN-PMGMK 715

Query: 706 GVGEAALIVGPAVIIRALEDAI---GTRFTKTPTTPEEILRAI 745
           G GEA  +   A +  A++DA+   G R  + P TP  +  A+
Sbjct: 716 GCGEAGTVGALAAVANAVQDAVWDRGVRQVEMPYTPLRLWEAL 758

Lambda     K      H
   0.317    0.136    0.394

Gapped
Lambda     K      H
   0.267   0.0410    0.140

Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 1233
Number of extensions: 65
Number of successful extensions: 8
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 749
Length of database: 766
Length adjustment: 40
Effective length of query: 709
Effective length of database: 726
Effective search space: 514734
Effective search space used: 514734
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 55 (25.8 bits)
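The summary statistics in this report can be cross-checked against each other. The sketch below is a minimal illustration, not part of GapMind: it applies the standard Karlin-Altschul relations (bit score S' = (lambda * S - ln K) / ln 2 and E = search space * 2^-S') to the raw score, gapped parameters, and effective lengths printed above. The function and constant names are hypothetical. Because the report prints Lambda and K rounded to a few digits, the result should only be expected to land close to the reported 343 bits and 2e-98, not to match them exactly.

```python
import math

# Values copied from the BLAST report above.
RAW_SCORE = 880          # raw score, shown in parentheses after the bit score
GAPPED_LAMBDA = 0.267    # "Gapped Lambda" (rounded as printed)
GAPPED_K = 0.0410        # "Gapped K" (rounded as printed)
EFF_QUERY_LEN = 709      # "Effective length of query"
EFF_DB_LEN = 726         # "Effective length of database"


def bit_score(raw_score, lam, k):
    """Normalized bit score: S' = (lambda * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def expect_value(bits, search_space):
    """Expected number of chance hits: E = search_space * 2**(-S')."""
    return search_space * 2.0 ** (-bits)


# Effective search space is the product of the two effective lengths.
search_space = EFF_QUERY_LEN * EFF_DB_LEN
bits = bit_score(RAW_SCORE, GAPPED_LAMBDA, GAPPED_K)
evalue = expect_value(bits, search_space)

print(f"effective search space: {search_space}")
print(f"bit score (from rounded parameters): {bits:.1f}")
print(f"E-value (from rounded parameters): {evalue:.1e}")
```

The product of the effective lengths reproduces the "Effective search space" line exactly, which is a quick sanity check that the length adjustment of 40 was applied to both sequences.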
This GapMind analysis is from Sep 17 2021. The underlying query database was built on Sep 17 2021.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other blast hits with at least 50% coverage are "low confidence."
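As a rough illustration of the coverage criterion, the sketch below computes percent identity and alignment coverage from the coordinates of the alignment shown on this page. Two things are assumptions: that coverage is measured on the characterized (query) protein, and the helper names, which are hypothetical. The identity and coverage cutoffs for high and medium confidence are not listed in this section, so only the 50% coverage rule for low confidence is checked.

```python
def percent(numerator, denominator):
    """Return numerator / denominator as a percentage."""
    return 100.0 * numerator / denominator


def aligned_coverage(start, end, length):
    """Fraction of a sequence spanned by the alignment (1-based, inclusive)."""
    return (end - start + 1) / length


# Numbers taken from the alignment on this page.
identity_pct = percent(240, 763)              # identities / alignment columns
query_cov = aligned_coverage(6, 745, 749)     # characterized protein (query)
subject_cov = aligned_coverage(9, 758, 766)   # candidate GFF1151 (subject)

# The only threshold stated in this section: at least 50% coverage.
LOW_CONFIDENCE_MIN_COVERAGE = 0.50
meets_low_confidence_coverage = query_cov >= LOW_CONFIDENCE_MIN_COVERAGE

print(f"identity: {identity_pct:.1f}%")
print(f"query coverage: {query_cov:.1%}")
print(f"subject coverage: {subject_cov:.1%}")
print(f"meets 50% coverage: {meets_low_confidence_coverage}")
```

For this hit the alignment spans nearly the whole characterized protein, so coverage is not the limiting factor; the percent identity is what would matter for any stricter tier.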
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
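To make the term concrete, here is a minimal sketch of a six-frame translation: the DNA is translated in three reading frames on the forward strand and three on the reverse complement, so a protein can be found even if no gene was predicted at that location. This is an illustration only and is not how Curated BLAST is implemented; the function names are hypothetical, the standard genetic code is assumed, and any codon containing a character other than A, C, G, or T is reported as X.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons from the start of dna; '*' marks a stop."""
    codons = (dna[i:i + 3] for i in range(0, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def six_frame_translation(dna):
    """Return a dict mapping frame labels (+1..+3, -1..-3) to translations."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


frames = six_frame_translation("ATGGCCTAA")
for label, protein in frames.items():
    print(label, protein)
```

Searching all six frames trades speed for sensitivity: it can recover proteins missed by gene calling, at the cost of many spurious short open reading frames.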
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory