Align glycolaldehyde oxidoreductase large subunit (characterized)
to candidate AZOBR_RS08560 AZOBR_RS08560 carbon-monoxide dehydrogenase
Query= metacyc::MONOMER-18071
         (749 letters)

>FitnessBrowser__azobra:AZOBR_RS08560
          Length = 796

 Score = 470 bits (1209), Expect = e-136
 Identities = 284/784 (36%), Positives = 433/784 (55%), Gaps = 46/784 (5%)

Query: 4   VGKPVKRIYDDKFVTGRSTYVDDIRIPA-LYAGFVRSTYPHAIIKRIDVSDALKVNGIVA 62
           +G  V+R  D +F+TGR TY DDI  P   +A FVRS Y HA I  ID ++A++  G++A
Sbjct: 7   IGASVRRREDARFLTGRGTYTDDINRPGQTHAVFVRSPYAHARITGIDAAEAMRAPGVIA 66

Query: 63  VFTAKEINPLLKGGIRPWPTYIDIRSFRYSERKAFPE-----NKVKYVGEPVAIVLGQDK 117
           V T  ++     G +   P    I S   S  K  P      ++ +YVG+ VA+V+ + +
Sbjct: 67  VLTGADMEADKVGSL---PCGWQIHSKDGSPMKEPPHFPIARDRARYVGDAVAVVIAETR 123

Query: 118 YSVRDAIDKVVVEYEPLKPVIRMEEA-EKDQVIIHEELKTNISYKIPF-KAGEVDKAFSE 175
              +DA + V+V+YE L   +   +A E    ++H+++  N+ +      A  VD AFS+
Sbjct: 124 EQAKDAAELVMVDYEELPAAVTSLKALEGGAPLVHDDVGGNLCFDWHLGDAAAVDAAFSQ 183

Query: 176 SDKVVRVEAINERLIPNPMEPRGIVSRFE--AGTLSIWYSTQVPHYMRSEF-ARILGIPE 232
           +  V +++ +N+RL+PN MEPR  +  ++   G  ++  ++Q PH +R    A +LGIPE
Sbjct: 184 AAHVAKLDLVNQRLVPNAMEPRAALGEYDRATGEHTLTTTSQNPHVIRLLMGAFVLGIPE 243

Query: 233 SKIKVSMPDVGGAFGAKVHLMPEELAVVASSIILGRPVRWTATRSEEMLA-SEARHNVFT 291
            K++V  PDVGG FG+K+    EE  V  ++  +GRPV+WTA RSE  L  +  R +V
Sbjct: 244 HKLRVVAPDVGGGFGSKIFHYGEEAVVTWAAKKVGRPVKWTAERSESFLTDAHGRDHVSH 303

Query: 292 GEVAVKRDGTILGIKGKLLLDLGAYITVTA-GIQPLIIPMMIPGPYKIRNLDIESVAVYT 350
            E+A+ +DG  L ++   + ++GAY++  A  I   +   ++ G YK   +  E  AV+T
Sbjct: 304 AELAMDKDGNFLALRVATIANMGAYLSTFAPSIPTYLYATLLAGQYKTPAIYAEVKAVFT 363

Query: 351 NTPPITMYRGASRPEATYIIERIMSTVADELGLDDVSIREKNLV--TELPYTNPFGLRYD 408
           NT P+  YRGA RPEA Y+IER++   A E G+D   +R +N V  + +PY  P  L+YD
Sbjct: 364 NTVPVDAYRGAGRPEACYLIERLVEVAAAETGIDKAELRRRNFVPASAMPYQTPVALQYD 423

Query: 409 SGDYVGLLREGVKRLGYYELKKWAEEERKKGHRVGVGLAYYLEICSFGP----------- 457
           +GD+   L   +  + Y        E  ++G   G+G A Y+E C   P
Sbjct: 424 TGDFAKNLDIALPLVDYDGFAARKAESARRGKLRGIGFATYIEACGIAPSNVAGALGARA 483

Query: 458 --WEYAEVRVDERGDVLVVTGTTPHGQGTETAIAQIVADALQIDISRVRVIWGDTDTVAA 515
             +E AEVR    G V V TG+  HGQG ET  AQ+V++   + I  V ++ GDT  +
Sbjct: 484 GLYESAEVRFHPTGSVTVFTGSHSHGQGHETTFAQLVSERFGVPIENVEIVHGDTSKIPF 543

Query: 516 SMGTYGSRSVTIGGSAAIKVAEKILDKMKRIAASTWNVDVQEVQYEKGEFKLKNDPSKKM 575
            MGTYGSRS+ +GGSA +K  +K+  K K+IAA        +++ + G F +     K +
Sbjct: 544 GMGTYGSRSLAVGGSAIVKAMDKVERKAKKIAAHMLEAAEADIEVKDGRFVVAG-TDKAL 602

Query: 576 SWDDVASIAYRSH-------DPGLVEKIIYE-NDVTFPYGVHIATVEVD-DTGVARVLEY 626
           +  D+A  AY  H       +PGL E+  Y+  + T+P G H+  VE+D DTGV +V+ +
Sbjct: 603 TIGDIALQAYVPHNFPLDELEPGLDEQAFYDPKNFTYPNGCHVCEVEIDPDTGVVQVVNF 662

Query: 627 RAYDDIGKVVNPALAEAQIHGGGVQAVGQALYEQALLNE-NGQLIV-TYADYYVPTAVEA 684
            A DD G+V+NP + E Q+HGG VQ +GQALYE  + +E +GQLI  +Y DY +P A +
Sbjct: 663 AAVDDFGRVINPLIVEGQVHGGLVQGIGQALYENCVYDEDSGQLITGSYMDYCMPRADDV 722

Query: 685 PKFTSVFADQYHPSNYPTGSKGVGEAALIVGPAVIIRALEDAI---GTRFTKTPTTPEEI 741
           P FT  + +    ++ P G KG GEA  I   A ++ A+  A+   G      P TPE +
Sbjct: 723 PSFTVRYHEDQPCTHNPLGVKGCGEAGTIGASAAVMNAVVHALSEYGVTHLDMPATPERV 782

Query: 742 LRAI 745
            +AI
Sbjct: 783 WQAI 786

Lambda     K      H
   0.317    0.136    0.394

Gapped
Lambda     K      H
   0.267   0.0410    0.140

Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 1472
Number of extensions: 68
Number of successful extensions: 11
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 749
Length of database: 796
Length adjustment: 41
Effective length of query: 708
Effective length of database: 755
Effective search space: 534540
Effective search space used: 534540
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 55 (25.8 bits)
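The summary statistics in the report above can be cross-checked against each other. The following is a minimal sketch in Python, assuming the standard Karlin-Altschul relations (bit score = (lambda * S - ln K) / ln 2, and E = search space * 2^(-bit score)) and using only the numbers printed in the report. Because lambda and K are printed in rounded form, the results should agree with the reported values only approximately. The variable names are illustrative and not part of GapMind or BLAST.

```python
import math

# Values copied from the BLAST report above.
raw_score = 1209          # raw score shown in parentheses after the bit score
gapped_lambda = 0.267     # "Gapped Lambda"
gapped_k = 0.0410         # "Gapped K"
query_len = 749           # "Length of query"
subject_len = 796         # "Length of database" (a single sequence here)
length_adjustment = 41    # "Length adjustment"

# Effective lengths are the raw lengths minus the length adjustment.
eff_query = query_len - length_adjustment
eff_subject = subject_len - length_adjustment
search_space = eff_query * eff_subject

# Karlin-Altschul normalisation of the raw score into bits.
bit_score = (gapped_lambda * raw_score - math.log(gapped_k)) / math.log(2)

# Expected number of chance alignments with at least this score.
e_value = search_space * 2.0 ** (-bit_score)

print(f"effective lengths: {eff_query} x {eff_subject}")
print(f"effective search space: {search_space}")
print(f"bit score: {bit_score:.1f}")
print(f"E-value: {e_value:.1e}")
```

The effective lengths and search space reproduce the reported 708, 755 and 534540 exactly, and the bit score and E-value land close to the reported 470 bits and e-136.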
This GapMind analysis is from Apr 09 2024. The underlying query database was built on Sep 17 2021.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other blast hits with at least 50% coverage are "low confidence."
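As an illustration of the coverage criterion, the sketch below computes alignment coverage and percent identity for the hit shown on this page. It assumes that coverage means the fraction of a sequence spanned by the alignment, which is a common definition but may differ from the exact one GapMind uses. The function names are hypothetical, and only the 50% low-confidence threshold is taken from the text above.

```python
def coverage(start: int, end: int, length: int) -> float:
    """Fraction of a sequence spanned by an alignment (1-based, inclusive)."""
    return (end - start + 1) / length


def meets_low_confidence_coverage(cov: float, threshold: float = 0.5) -> bool:
    """True if coverage reaches the 50% cutoff stated for low-confidence hits."""
    return cov >= threshold


# Coordinates and counts copied from the alignment on this page.
query_cov = coverage(4, 745, 749)      # query residues 4..745 of 749
subject_cov = coverage(7, 786, 796)    # subject residues 7..786 of 796
identity = 284 / 784                   # identities / alignment length

print(f"query coverage:   {query_cov:.1%}")
print(f"subject coverage: {subject_cov:.1%}")
print(f"identity:         {identity:.1%}")
print("passes 50% coverage cutoff:", meets_low_confidence_coverage(query_cov))
```

For this hit the alignment spans nearly the whole query, so it clears the 50% coverage cutoff comfortably. Whether it qualifies as high or medium confidence depends on the additional criteria for those categories.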
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
For more information, see:
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory