Align Isocitrate dehydrogenase [NADP]; IDH; Oxalosuccinate decarboxylase; EC 1.1.1.42 (characterized)
to candidate GFF2791 PGA1_c28340 isocitrate dehydrogenase [NADP]
Query= SwissProt::P16100 (741 letters)

>FitnessBrowser__Phaeo:GFF2791
Length = 790

Score = 809 bits (2089), Expect = 0.0
Identities = 422/738 (57%), Positives = 528/738 (71%), Gaps = 7/738 (0%)

Query: 2    STPKIIYTLTDEAPALATYSLLPIIKAFTGSSGIAVETRDISLAGRLIATFPEYLTDTQK 61
            STP I+YT+ DEAP LA+ SLLPII++F ++G++V T+DISLAGR++ATFPE L+D Q+
Sbjct: 57   STPDILYTIVDEAPELASASLLPIIRSFAAAAGVSVGTKDISLAGRILATFPEALSDAQR 116

Query: 62   ISDDLAELGKLATTPDANIIKLPNISASVPQLKAAIKELQQQGYKLPDYPEEPKTDTEKD 121
            S+DLAELG+L TP+AN+IKLPNISASVPQL AAI ELQ QG+ +PDYP +P TD EK+
Sbjct: 117  QSNDLAELGQLVKTPEANVIKLPNISASVPQLTAAIAELQAQGFGIPDYPADPATDAEKE 176

Query: 122  VKARYDKIKGSAVNPVLREGNSDRRAPLSVKNYARKHPHKMGAWSADSKSHVAHMDNGDF 181
            ++ARYD IKGSAVNPVLREGNSDRRA +VKN+A+ +PH MG WSADSK+ V+ M DF
Sbjct: 177  IRARYDTIKGSAVNPVLREGNSDRRAAKAVKNFAQNNPHSMGKWSADSKTKVSSMLGNDF 236

Query: 182  YGSEKAALIGAP--GSVKIELIAKDGSSTVLKAKTSVQAGEIIDSSVMSKNALRNFIAAE 239
            Y +E +A I A G+ KIE + KDG+ TVLK ++ G + D++ MS AL +F+
Sbjct: 237  YANEVSATISAAQAGTAKIEFVGKDGAVTVLKDSWPLEEGTVADATFMSAKALSSFLKDA 296

Query: 240  IEDAKKQGVLLSVHLKATMMKVSDPIMFGQIVSEFYKDALTKHAEVLKQIGFDVNNGIGD 299
            IED K G + S+H+KATMMKVSDPI+FG V + K ++ G N+G+G
Sbjct: 297  IEDTKADGTMFSLHMKATMMKVSDPIIFGHAVKAWLGPVWDKFGAEIEAAGGSANSGLGA 356

Query: 300  LYARIKTLPEAKQKEIEADIQAVYAQRPQLAMVNSDKGITNLHVPSDVIVDASMPAMIRD 359
            + A I LP I+A+I A+ RP + MV+SDKGITNLHVPSDVI+DASMPA+IR
Sbjct: 357  VLATIDGLPNGDA--IKAEIAAL--DRPSMYMVDSDKGITNLHVPSDVIIDASMPAVIRA 412

Query: 360  SGKMWGPDGKLHDTKAVIPDRCYAGVYQVVIEDCKQHGAFDPTTMGSVPNVGLMAQKAEE 419
            GK W G DT VIPDRCY+ VY I K +GA D TT GSV NVGLMAQKAEE
Sbjct: 413  GGKGWDEAGNKGDTNCVIPDRCYSTVYDESINFFKANGALDVTTAGSVANVGLMAQKAEE 472

Query: 420  YGSHDKTFQIPADGVVRVTDESGKLLLEQSVEAGDIWRMCQAKDAPIQDWVKLAVNRARA 479
            YGSH TF+ PADG +R+ +G+ L VEAGDIWR C K API++W++LA++R R
Sbjct: 473  YGSHPTTFEAPADGTIRIVLANGETLHAHEVEAGDIWRSCTVKKAPIENWIELAMDRQRL 532

Query: 480  TNTPAVFWLDPARAHDAQVIAKVERYLKDYDTSGLDIRILSPVEATRFSLARIREGKDTI 539
            T + A+FWLD RAHDA++I V L+ S L +I++P EAT SL I GKD+I
Sbjct: 533  TGSEAIFWLDENRAHDAELIKYVTPALEAAGKSDL-FQIMAPREATAQSLKTITAGKDSI 591

Query: 540  SVTGNVLRDYLTDLFPIMELGTSAKMLSIVPLMSGGGLFETGAGGSAPKHVQQFLEEGYL 599
            ++TGNVLRDYLTDLFPI+ELGTSAKMLSIV LM+GGGLFETGAGGSAPKHVQQ +EE +L
Sbjct: 592  AITGNVLRDYLTDLFPILELGTSAKMLSIVKLMNGGGLFETGAGGSAPKHVQQLVEENHL 651

Query: 600  RWDSLGEFLALAASLEHLGNAYKNPKALVLASTLDQATGKILDNNKSPARKVGEIDNRGS 659
            RWDS+GEF AL SL L ++ N KA VL + + AT ILD+N+SP+RKVGE DNR S
Sbjct: 652  RWDSMGEFCALGESLNFLADSKGNAKAGVLGAAAEAATQGILDHNRSPSRKVGEPDNRAS 711

Query: 660  HFYLALYWAQALAAQTEDKELQAQFTGIAKALTDNETKIVGELAAAQGKPVDIAGYYHPN 719
            H++ A YWA+ALAAQ +D EL A F IA L E I+ ELA QGK VD+ GY+H +
Sbjct: 712  HYWFARYWAEALAAQGDDAELAAHFAPIAAELAAKEEAILSELAEVQGKAVDLGGYFHAD 771

Query: 720  TDLTSKAIRPSATFNAAL 737
            T+ +RPSAT N +
Sbjct: 772  PVKTAAVMRPSATLNGII 789

Lambda     K      H
   0.315    0.131    0.374
Gapped
Lambda     K      H
   0.267   0.0410    0.140

Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 1396
Number of extensions: 46
Number of successful extensions: 4
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 741
Length of database: 790
Length adjustment: 40
Effective length of query: 701
Effective length of database: 750
Effective search space: 525750
Effective search space used: 525750
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.5 bits)
S2: 55 (25.8 bits)
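The summary figures in the BLAST report above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not part of GapMind: it assumes the standard Karlin-Altschul relations (bit score = (lambda * raw score - ln K) / ln 2, and E = effective search space * 2^(-bit score)) and that percentages are truncated to whole numbers. All variable names are illustrative; the numbers are copied from the report.

```python
import math

# Values copied from the BLAST report above.
identities, positives, gaps, aligned_columns = 422, 528, 7, 738
raw_score, reported_bits = 2089, 809
gapped_lambda, gapped_k = 0.267, 0.0410
query_len, database_len, length_adjustment = 741, 790, 40


def whole_percent(part, total):
    """Percentage truncated to a whole number (reproduces the figures in this report)."""
    return int(100 * part / total)


pct_identity = whole_percent(identities, aligned_columns)
pct_positive = whole_percent(positives, aligned_columns)
pct_gaps = whole_percent(gaps, aligned_columns)

# Assumed relation: bit score from the raw score and the gapped Lambda and K.
bit_score = (gapped_lambda * raw_score - math.log(gapped_k)) / math.log(2)

# Effective lengths are the raw lengths minus the length adjustment.
effective_query = query_len - length_adjustment
effective_database = database_len - length_adjustment
search_space = effective_query * effective_database

# Assumed relation: expected number of chance hits at this bit score.
expect = search_space * 2.0 ** -reported_bits

print(f"identity {pct_identity}%, positives {pct_positive}%, gaps {pct_gaps}%")
print(f"bit score ~ {bit_score:.1f} (reported {reported_bits})")
print(f"effective search space = {search_space}")
print(f"expect ~ {expect:.1e}")
```

Under these assumptions the recomputed search space equals the reported 525750, and the expect value is vanishingly small, consistent with the reported Expect = 0.0.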
Align candidate GFF2791 PGA1_c28340 (isocitrate dehydrogenase [NADP])
to HMM TIGR00178 (isocitrate dehydrogenase, NADP-dependent (EC 1.1.1.42))
# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR00178.hmm
# target sequence database:        /tmp/gapView.17643.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR00178  [M=744]
Accession:   TIGR00178
Description: monomer_idh: isocitrate dehydrogenase, NADP-dependent
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                          Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                          -----------
          0 1071.8   1.5          0 1071.6   1.5    1.0  1  lcl|FitnessBrowser__Phaeo:GFF2791 PGA1_c28340 isocitrate dehydroge

Domain annotation for each sequence (and alignments):
>> lcl|FitnessBrowser__Phaeo:GFF2791 PGA1_c28340 isocitrate dehydrogenase [NADP]
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 ! 1071.6   1.5         0         0       5     739 ..      58     789 ..      53     790 .]  0.99

Alignments for each domain:
== domain 1  score: 1071.6 bits;  conditional E-value: 0

                        TIGR00178   5 kakiiytltdeapllatysllpivkafaasaGievetrdislagrilaefpeylteeqkvddalaelGelaktpea 80
                                      ++ i+yt+ deap la+ sllpi+++faa+aG++v t+dislagrila+fpe l++ q+ +++laelG+l ktpea
lcl|FitnessBrowser__Phaeo:GFF2791  58 TPDILYTIVDEAPELASASLLPIIRSFAAAAGVSVGTKDISLAGRILATFPEALSDAQRQSNDLAELGQLVKTPEA 133
                                      578************************************************************************* PP

                        TIGR00178  81 niiklpnisasvpqlkaaikelqdkGydlpdypeepktdeekdikaryakikGsavnpvlreGnsdrraplavkey 156
                                      n+iklpnisasvpql+aai elq++G+ +pdyp++p td+ek+i+ary+ ikGsavnpvlreGnsdrra +avk++
lcl|FitnessBrowser__Phaeo:GFF2791 134 NVIKLPNISASVPQLTAAIAELQAQGFGIPDYPADPATDAEKEIRARYDTIKGSAVNPVLREGNSDRRAAKAVKNF 209
                                      **************************************************************************** PP

                        TIGR00178 157 arkhphkmGewsadskshvahmdagdfyaseksvlldaae..evkieliakdGketvlkaklklldgevidssvls 230
                                      a+++ph+mG+wsadsk++v+ m +dfya+e s+++ aa+ kie++ kdG +tvlk++ +l++g v d++++s
lcl|FitnessBrowser__Phaeo:GFF2791 210 AQNNPHSMGKWSADSKTKVSSMLGNDFYANEVSATISAAQagTAKIEFVGKDGAVTVLKDSWPLEEGTVADATFMS 285
                                      ************************************98762269******************************** PP

                        TIGR00178 231 kkalvefleeeiedakeegvllslhlkatmmkvsdpivfGhvvrvfykdvfakhaelleqlGldvenGladlyaki 306
                                      +kal+ fl++ ied+k++g ++slh+katmmkvsdpi+fGh+v+++ v+ k+++ +e++G ++ Gl++++a i
lcl|FitnessBrowser__Phaeo:GFF2791 286 AKALSSFLKDAIEDTKADGTMFSLHMKATMMKVSDPIIFGHAVKAWLGPVWDKFGAEIEAAGGSANSGLGAVLATI 361
                                      **************************************************************************** PP

                        TIGR00178 307 eslpaakkeeieadlekvyeerpelamvdsdkGitnlhvpsdvivdasmpamirasGkmygkdgklkdtkavipds 382
                                      + lp++ + i+a++ + +rp + mvdsdkGitnlhvpsdvi+dasmpa+ira+Gk +++ g++ dt+ vipd+
lcl|FitnessBrowser__Phaeo:GFF2791 362 DGLPNG--DAIKAEIAAL--DRPSMYMVDSDKGITNLHVPSDVIIDASMPAVIRAGGKGWDEAGNKGDTNCVIPDR 433
                                      ******..99***99875..8******************************************************* PP

                        TIGR00178 383 syagvyqaviedckknGafdpttmGtvpnvGlmaqkaeeyGshdktfeieadGvvrvvdssGevlleeeveagdiw 458
                                      +y+ vy++ i++ k nGa+d tt G+v nvGlmaqkaeeyGsh tfe +adG++r+v ++Ge l +eveagdiw
lcl|FitnessBrowser__Phaeo:GFF2791 434 CYSTVYDESINFFKANGALDVTTAGSVANVGLMAQKAEEYGSHPTTFEAPADGTIRIVLANGETLHAHEVEAGDIW 509
                                      **************************************************************************** PP

                        TIGR00178 459 rmcqvkdapiqdwvklavtrarlsgtpavfwldperahdeelikkvekylkdhdteGldiqilspvkatrfsleri 534
                                      r c vk api++w+ la+ r rl+g a+fwld++rahd+elik v l+ + l qi+ p +at sl+ i
lcl|FitnessBrowser__Phaeo:GFF2791 510 RSCTVKKAPIENWIELAMDRQRLTGSEAIFWLDENRAHDAELIKYVTPALEAAGKSDL-FQIMAPREATAQSLKTI 584
                                      **************************************************99998888.7**************** PP

                        TIGR00178 535 rrGedtisvtGnvlrdyltdlfpilelGtsakmlsvvplmaGGGlfetGaGGsapkhvqqleeenhlrwdslGefl 610
                                      + G+d+i++tGnvlrdyltdlfpilelGtsakmls+v lm+GGGlfetGaGGsapkhvqql+eenhlrwds+Gef
lcl|FitnessBrowser__Phaeo:GFF2791 585 TAGKDSIAITGNVLRDYLTDLFPILELGTSAKMLSIVKLMNGGGLFETGAGGSAPKHVQQLVEENHLRWDSMGEFC 660
                                      **************************************************************************** PP

                        TIGR00178 611 alaaslehvavktgnekakvladtldaatgklldeekspsrkvGeldnrgskfylakywaqelaaqtedkelaasf 686
                                      al++sl+ +a ++gn+ka vl+ + +aat+ +ld+++spsrkvGe dnr s++++a+ywa++laaq +d+elaa+f
lcl|FitnessBrowser__Phaeo:GFF2791 661 ALGESLNFLADSKGNAKAGVLGAAAEAATQGILDHNRSPSRKVGEPDNRASHYWFARYWAEALAAQGDDAELAAHF 736
                                      **************************************************************************** PP

                        TIGR00178 687 asvaealtkneekivaelaavqGeavdlgGyyapdtdlttkvlrpsatfnail 739
                                      a++a l+ +ee i +ela+vqG+avdlgGy+++d +t +v+rpsat+n i+
lcl|FitnessBrowser__Phaeo:GFF2791 737 APIAAELAAKEEAILSELAEVQGKAVDLGGYFHADPVKTAAVMRPSATLNGII 789
                                      **************************************************986 PP

Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (744 nodes)
Target sequences:                          1  (790 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.03u 0.01s 00:00:00.04 Elapsed: 00:00:00.04
# Mc/sec: 13.17
//
[ok]
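The per-domain row of the hmmsearch report above gives the coordinates needed to see how much of the model and of the candidate the hit spans. The sketch below is illustrative only: the column names are labels chosen here for the fields in that row, and the row string is typed in by hand from the report rather than parsed from a file.

```python
# The single domain row from the hmmsearch report above, whitespace-separated.
row = "1 ! 1071.6 1.5 0 0 5 739 .. 58 789 .. 53 790 .] 0.99"

# Illustrative labels for the fields, in the order they appear in the row.
columns = [
    "index", "significance", "score", "bias", "c_evalue", "i_evalue",
    "hmm_from", "hmm_to", "hmm_bounds",
    "ali_from", "ali_to", "ali_bounds",
    "env_from", "env_to", "env_bounds", "acc",
]
domain = dict(zip(columns, row.split()))

model_len = 744   # from "Query: TIGR00178 [M=744]"
target_len = 790  # from "790 residues searched"


def span(start, end):
    """Length of a 1-based inclusive coordinate range."""
    return int(end) - int(start) + 1


model_coverage = span(domain["hmm_from"], domain["hmm_to"]) / model_len
target_coverage = span(domain["ali_from"], domain["ali_to"]) / target_len

print(f"model coverage  {model_coverage:.1%}")
print(f"target coverage {target_coverage:.1%}")
```

Here the hit spans model positions 5 to 739 of 744 and candidate residues 58 to 789 of 790, so the alignment covers nearly the whole model.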
This GapMind analysis is from Sep 17 2021. The underlying query database was built on Sep 17 2021.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMER with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other blast hits with at least 50% coverage are "low confidence."
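As a rough illustration of the 50% coverage floor for low-confidence hits, the sketch below computes the fraction of a curated protein covered by one or more alignment intervals. It is not GapMind's code. It assumes that coverage is measured over the curated (query) protein and that a hit split across two proteins is scored by the union of its intervals; the split-hit coordinates are hypothetical.

```python
LOW_CONFIDENCE_MIN_COVERAGE = 0.5  # "at least 50% coverage"


def covered_fraction(intervals, protein_len):
    """Fraction of residues covered by the union of 1-based inclusive intervals."""
    covered = 0
    last_end = 0
    for start, end in sorted(intervals):
        start = max(start, last_end + 1)  # skip residues already counted
        if end >= start:
            covered += end - start + 1
            last_end = end
    return covered / protein_len


# The BLAST hit on this page aligns query residues 2-737 of 741.
single_hit = covered_fraction([(2, 737)], 741)

# Hypothetical hit split across two candidate proteins, with overlapping query ranges.
split_hit = covered_fraction([(1, 200), (150, 330)], 741)

for name, cov in (("single hit", single_hit), ("split hit", split_hit)):
    verdict = "meets" if cov >= LOW_CONFIDENCE_MIN_COVERAGE else "is below"
    print(f"{name}: {cov:.1%} coverage {verdict} the 50% floor")
```

The hit on this page covers over 99% of the curated protein, far above the floor; the hypothetical split hit covers about 45% and would not qualify.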
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
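For readers unfamiliar with the term, a six-frame translation reads the DNA in all three reading frames on both strands, so that proteins missed by gene prediction can still be found. The sketch below is a minimal standalone illustration, not how Curated BLAST is implemented. It assumes the standard amino acid assignments for codons, and the frame labels (+1 to -3) and the example DNA string are made up for this example.

```python
from itertools import product

# Standard codon table, built with bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons; codons with ambiguous bases become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translation(dna):
    """Return translations keyed by frame label: +1, +2, +3, -1, -2, -3."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


# Made-up 15-base example; stop codons are shown as '*'.
for frame, protein in six_frame_translation("ATGAGCACCCCGTAA").items():
    print(frame, protein)
```

A search against these six translations can recover a protein whose gene was not called, at the cost of many spurious short reading frames.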
For more information, see:
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory