Align Glyceraldehyde dehydrogenase large chain; Glyceraldehyde dehydrogenase subunit A; Glyceraldehyde dehydrogenase subunit alpha; EC 1.2.99.8 (characterized)
to candidate 3607804 Dshi_1212 carbon-monoxide dehydrogenase, large subunit (RefSeq)
Query= SwissProt::Q4J6M3
         (748 letters)

>FitnessBrowser__Dino:3607804
          Length = 806

 Score = 357 bits (915), Expect = e-102
 Identities = 244/776 (31%), Positives = 391/776 (50%), Gaps = 67/776 (8%)

Query: 5   GKSIKRLNDDKFITGRSNYIDDIKIPSLYAG-FVRSPYPHAIIKRIDATDALKVNGIVAV 63
           G S KR+ D +F G+ NY+DDIK+P + G FVRSPY HA + I+A AL + G+VAV
Sbjct: 20  GSSRKRVEDARFTQGKGNYVDDIKLPGMLHGDFVRSPYAHARVVSINAEAALALPGVVAV 79

Query: 64  FSGKDINPMLKGGVGVLSAYVNPSLFRFKERKAFPEDNKVKYVGEPVAIVIGQDKYAVRD 123
           + KD+ P LS + P+L K+ D KV + G+ VA V+ +D+Y D
Sbjct: 80  LTAKDLEP--------LSLHWMPTLAGDKQMVL--ADGKVLFQGQEVAFVVAEDRYIAAD 129

Query: 124 AIDRVNVEYEQLKPVIKMEDAEKDEVIVHDELKTNV----------SYKIPFKAGD---I 170
           A++ V VEYE+L ++ +A + +V++ ++L+ ++ ++ GD
Sbjct: 130 AVELVEVEYEELPVLVDPFEALQSDVVLREDLEPGADGAHGPRRHHNHIFLWEEGDKAAT 189

Query: 171 EKAFSQADKVVKVEAINERLIPNPMEPRGILSVYD--GNSLSVWYSTQVPHFARSEFARI 228
           E+ A+ VV+ R P P+E G ++ D L++W + Q PH R+ + +
Sbjct: 190 EQVIENAEVVVEEMVYYHRTHPCPLETCGSVASMDKVNGKLTLWGTFQAPHVVRTVASLL 249

Query: 229 FGIPETKIRVAMPDVGGAFGSKVHIMAEELAVIASSILLRRPVRWTATRSEEMLASE-AR 287
           GI E IRV PD+GG FG+KV + + I +SI+ +PV++ R + ++A+ AR
Sbjct: 250 SGIEEHNIRVISPDIGGGFGNKVGVYPGYVCSIVASIVTGKPVKFIEDRMDNLMATAFAR 309

Query: 288 SNVFTGEVAVKKDGTVLGIKGKLLLDLGAYLTLTAGIQPTIIPV----MIPGPYKVRDLE 343
           G ++ K+G + G+ + D GA+ A PT P + G Y +
Sbjct: 310 DYWMKGRISATKEGKITGLHCHVTADHGAF---DACADPTKFPAGFFHICTGSYDIPVAY 366

Query: 344 IESTAVYTTTPPITM-YRGASR-PEATYIIERIMSTVADELGLDDVTIRERNLI--DQLP 399
           + VYT P + YR + R EA Y IER++ +A EL +D +R N I +Q P
Sbjct: 367 VGVDGVYTNKAPGGVAYRCSFRVTEAAYFIERMIEVLAMELNMDAAELRRINFIRKEQFP 426

Query: 400 YTNPFGLRYDTGDYIRVFKDGVAKLEYNELRKWAQQERSKGHR-------VGVGLAFYLE 452
           YT+ G YD+GDY + + ++Y LR Q ER + + +G+GL + E
Sbjct: 427 YTSALGWEYDSGDYHTAWDKALEAVDYKGLRA-EQAERVEAFKRGETRKLLGIGLTHFTE 485

Query: 453 ICSFGP-----------WEYGEIKVDNKGNVLVITGTTPHGQGTETAIAQIVADALQIPI 501
           I GP ++ EI++ G+ + GT GQG T AQI+A IP
Sbjct: 486 IVGAGPVKNCDILGMGMFDSCEIRIHPTGSAVARLGTISQGQGHATTFAQILASETGIPA 545

Query: 502 EKIRVVWGDTDIVEGSFGTYGSRSLTIGGSAALKVAERVLDKMKRAAASYFNADVQEIRY 561
           + I + GDTD GTYGSRS + G+A ++ K + AA ++ +
Sbjct: 546 DSITIEEGDTDTAPYGLGTYGSRSTPVAGAATALAGRKIRAKAQMIAAYLLEVHDNDVEW 605

Query: 562 ENEEFSVKNDPSKKASWDEIASLATTK-----EPIVEKIYYEN--DVTFPYGVHVAVVEV 614
           + + F VK P + + EIA A + EP +E + Y + ++T+P+G ++ V+E+
Sbjct: 606 DVDRFVVKGAPERFKTIQEIAYAAYNQAIPGVEPGLEAVSYYDPPNMTYPFGAYICVMEI 665

Query: 615 D-DLGMARVVEYRAYDDIGKVINPALAEAQIHGGGVQGVGQALYEKAIINENGQLSV-TY 672
           D D G + ++ A DD G INP + E Q+HGG + + A+ ++ ++ G T
Sbjct: 666 DVDTGEHEIRQFYALDDCGTRINPMVIEGQVHGGLTEALAIAMGQEIAYDDIGNCKTGTL 725

Query: 673 ADYYVPTAVEAPRFISYFADKSHPSNYPTGTKGVGEAALIVGPAAIIRAIEDAVGA 728
           D+++PTA E P + + F + P ++P G KGVGE+ + G A A+ DA A
Sbjct: 726 MDFFIPTAWETPNYTTDFTETPSP-HHPIGAKGVGESPNVGGVPAFSNAVHDAFRA 780

Lambda     K      H
   0.317    0.136    0.390

Gapped
Lambda     K      H
   0.267   0.0410    0.140

Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 1458
Number of extensions: 89
Number of successful extensions: 11
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 748
Length of database: 806
Length adjustment: 41
Effective length of query: 707
Effective length of database: 765
Effective search space: 540855
Effective search space used: 540855
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.6 bits)
S2: 55 (25.8 bits)
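The summary figures in the alignment report above can be cross-checked with a little arithmetic. The following is a minimal sketch, not part of GapMind or BLAST: the variable names are illustrative, the input numbers are copied from the report, and the E-value line assumes the standard bit-score relation E = (effective search space) x 2^(-bit score).

```python
# Figures copied from the BLAST report above (variable names are illustrative).
query_len, subject_len = 748, 806
identities, positives, gaps, aln_len = 244, 391, 67, 776
query_start, query_end = 5, 728        # first and last aligned query positions
bit_score = 357
length_adjustment = 41


def percent(part, whole):
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


pct_identity = percent(identities, aln_len)
pct_positive = percent(positives, aln_len)
pct_gaps = percent(gaps, aln_len)

# Fraction of the query that is covered by the aligned region.
query_coverage = percent(query_end - query_start + 1, query_len)

# Effective lengths and search space, as listed in the report trailer.
eff_query = query_len - length_adjustment
eff_subject = subject_len - length_adjustment
search_space = eff_query * eff_subject

# Assumed relation between bit score and E-value: E = m * n * 2**(-S').
expect = search_space * 2.0 ** (-bit_score)

print(f"identity  {pct_identity:.1f}%")
print(f"positives {pct_positive:.1f}%")
print(f"gaps      {pct_gaps:.1f}%")
print(f"query coverage {query_coverage:.1f}%")
print(f"effective search space {search_space}")
print(f"expect {expect:.1e}")
```

The effective search space computed this way equals the 540855 listed in the report, and the estimated E-value falls in the same order of magnitude as the reported e-102. The gap percentage comes out slightly above 8, which suggests the report truncates rather than rounds that figure.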
This GapMind analysis is from Sep 17 2021. The underlying query database was built on Sep 17 2021.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other blast hits with at least 50% coverage are "low confidence."
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
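The definition of a gap given earlier (a step with no high- or medium-confidence candidate) can be sketched as a small filter. This is an illustration only, not GapMind's code: the function name `find_gaps`, the step names, and the input layout (a mapping from step name to a list of confidence labels) are all hypothetical.

```python
def find_gaps(step_candidates):
    """Return the steps that have no high- or medium-confidence candidate.

    step_candidates maps a step name to the confidence labels
    ("high", "medium" or "low") of its candidates. This layout is assumed
    for illustration; it is not GapMind's actual data format.
    """
    return [
        step
        for step, levels in step_candidates.items()
        if not any(level in ("high", "medium") for level in levels)
    ]


# Hypothetical pathway with four steps.
pathway = {
    "stepA": ["high", "low"],
    "stepB": ["low"],        # only a low-confidence candidate: counts as a gap
    "stepC": [],             # no candidates at all: counts as a gap
    "stepD": ["medium"],
}

gaps = find_gaps(pathway)
print(gaps)
```

A step with only low-confidence candidates is treated the same as a step with none, which follows the wording of the definition.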
For more information, see:
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory