Align Isocitrate dehydrogenase [NADP]; IDH; Oxalosuccinate decarboxylase; EC 1.1.1.42 (characterized)
to candidate 201768 SO2629 isocitrate dehydrogenase, NADP-dependent (NCBI ptt file)
Query= SwissProt::P16100
         (741 letters)

>FitnessBrowser__MR1:201768
          Length = 741

 Score = 1069 bits (2764), Expect = 0.0
 Identities = 534/736 (72%), Positives = 609/736 (82%)

Query: 2    STPKIIYTLTDEAPALATYSLLPIIKAFTGSSGIAVETRDISLAGRLIATFPEYLTDTQK 61
            ++P IIYT TDEAPALAT SLLPIIK FT ++ +AVETRDISL+GR+IA FPE LTD QK
Sbjct: 4    NSPTIIYTETDEAPALATLSLLPIIKTFTNAADVAVETRDISLSGRVIANFPEKLTDAQK 63

Query: 62   ISDDLAELGKLATTPDANIIKLPNISASVPQLKAAIKELQQQGYKLPDYPEEPKTDTEKD 121
            I D LAELG LA P+ANIIKLPNISAS+PQLKA I ELQQ+GY +P+YP+EPKTD EK
Sbjct: 64   IGDHLAELGDLANQPEANIIKLPNISASIPQLKACILELQQKGYDIPNYPDEPKTDEEKS 123

Query: 122  VKARYDKIKGSAVNPVLREGNSDRRAPLSVKNYARKHPHKMGAWSADSKSHVAHMDNGDF 181
            +KARYDKIKGSAVNPVLREGNSDRRAPLSVKN+A+K+PH MG W DSKSHVAHM GDF
Sbjct: 124  IKARYDKIKGSAVNPVLREGNSDRRAPLSVKNFAKKNPHSMGKWVKDSKSHVAHMSEGDF 183

Query: 182  YGSEKAALIGAPGSVKIELIAKDGSSTVLKAKTSVQAGEIIDSSVMSKNALRNFIAAEIE 241
            YGSE + + +V I L KDG VLK+ + AGEIID+SVMSK AL +F EI
Sbjct: 184  YGSELSVTLSNADTVNIVLAQKDGQEVVLKSGLKLLAGEIIDASVMSKKALVSFFEREIA 243

Query: 242  DAKKQGVLLSVHLKATMMKVSDPIMFGQIVSEFYKDALTKHAEVLKQIGFDVNNGIGDLY 301
            +AK + VLLS+HLKATMMKVSDPIMFG V F+K KHA + ++G DVNNG GD+Y
Sbjct: 244  NAKAENVLLSLHLKATMMKVSDPIMFGHAVKVFFKPVFDKHAALFAELGVDVNNGFGDVY 303

Query: 302  ARIKTLPEAKQKEIEADIQAVYAQRPQLAMVNSDKGITNLHVPSDVIVDASMPAMIRDSG 361
            A+I +LP + +IEADI AVYA+ P LAMV+SDKGITNLHVPSD+I+DASMPA IR SG
Sbjct: 304  AKIASLPTDVRSQIEADIAAVYAEGPALAMVDSDKGITNLHVPSDIIIDASMPAAIRSSG 363

Query: 362  KMWGPDGKLHDTKAVIPDRCYAGVYQVVIEDCKQHGAFDPTTMGSVPNVGLMAQKAEEYG 421
            +MWGPDGKLHDTKA+IPDRCYAGVYQ I CK+HGAFDP+TMGSVPNVGLMAQKAEEYG
Sbjct: 364  QMWGPDGKLHDTKALIPDRCYAGVYQETIAFCKEHGAFDPSTMGSVPNVGLMAQKAEEYG 423

Query: 422  SHDKTFQIPADGVVRVTDESGKLLLEQSVEAGDIWRMCQAKDAPIQDWVKLAVNRARATN 481
            SHDKTF+IPADGVV V D SGK+L+ +VEAGDIWRMCQ KDAPI+DWVKLAV RAR +N
Sbjct: 424  SHDKTFEIPADGVVNVIDASGKVLMSHNVEAGDIWRMCQVKDAPIRDWVKLAVRRARLSN 483

Query: 482  TPAVFWLDPARAHDAQVIAKVERYLKDYDTSGLDIRILSPVEATRFSLARIREGKDTISV 541
            TPAVFWLD RAHDAQ+I KV++YL ++DTSGLDI I+SP EATRFSLARI+EGKDTISV
Sbjct: 484  TPAVFWLDANRAHDAQLIVKVKQYLPEHDTSGLDISIMSPEEATRFSLARIKEGKDTISV 543

Query: 542  TGNVLRDYLTDLFPIMELGTSAKMLSIVPLMSGGGLFETGAGGSAPKHVQQFLEEGYLRW 601
            TGNVLRDYLTDLFPI+ELGTSAKMLSIVPLM+GGGLFETGAGGSAPKHVQQ +EG+LRW
Sbjct: 544  TGNVLRDYLTDLFPILELGTSAKMLSIVPLMNGGGLFETGAGGSAPKHVQQVEKEGHLRW 603

Query: 602  DSLGEFLALAASLEHLGNAYKNPKALVLASTLDQATGKILDNNKSPARKVGEIDNRGSHF 661
            DSLGEFLALAASLEHL NPKA VLA TLDQA G+ LD+NKSP+R+VGE+DNRGSHF
Sbjct: 604  DSLGEFLALAASLEHLSQTVGNPKAQVLADTLDQAIGQFLDSNKSPSRRVGELDNRGSHF 663

Query: 662  YLALYWAQALAAQTEDKELQAQFTGIAKALTDNETKIVGELAAAQGKPVDIAGYYHPNTD 721
            YLA+YWAQALA QT+D++LQA F +A AL+ NE IV EL AQG PVD+ GYY +
Sbjct: 664  YLAMYWAQALAVQTKDEQLQAHFIPLAHALSANEQVIVAELNNAQGAPVDLGGYYRLDAV 723

Query: 722  LTSKAIRPSATFNAAL 737
            KA+RPS T N L
Sbjct: 724  KAEKAMRPSETLNKLL 739

Lambda     K      H
   0.315    0.131    0.374

Gapped
Lambda     K      H
   0.267   0.0410    0.140

Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 1512
Number of extensions: 60
Number of successful extensions: 1
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 741
Length of database: 741
Length adjustment: 40
Effective length of query: 701
Effective length of database: 701
Effective search space: 491401
Effective search space used: 491401
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.5 bits)
S2: 55 (25.8 bits)
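The summary figures in the BLAST report above can be cross-checked with a little arithmetic. The sketch below is illustrative only: the function names are hypothetical and are not part of BLAST or GapMind. It recomputes the identity and positive percentages from the residue counts, and the effective search space from the sequence lengths and the length adjustment. It assumes the percentages are truncated rather than rounded, which is consistent with the values shown (534/736 is about 72.6% but is reported as 72%).

```python
def percent_truncated(matches: int, aligned: int) -> int:
    """Integer percentage, truncated toward zero (assumed display convention)."""
    return (100 * matches) // aligned


def effective_search_space(query_len: int, db_len: int, adjustment: int) -> int:
    """Product of the effective lengths.

    Subtracting the adjustment once from the database length is only
    appropriate here because the database holds a single sequence.
    """
    return (query_len - adjustment) * (db_len - adjustment)


# Values copied from the report: 534 identities and 609 positives over
# 736 aligned positions; both sequences are 741 residues, adjustment 40.
identities = percent_truncated(534, 736)
positives = percent_truncated(609, 736)
space = effective_search_space(741, 741, 40)

print(identities, positives, space)
```

The three printed values should agree with the report's `Identities`, `Positives`, and `Effective search space` entries.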
Align candidate 201768 SO2629 (isocitrate dehydrogenase, NADP-dependent (NCBI ptt file))
to HMM TIGR00178 (isocitrate dehydrogenase, NADP-dependent (EC 1.1.1.42))
# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR00178.hmm
# target sequence database:        /tmp/gapView.20959.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR00178  [M=744]
Accession:   TIGR00178
Description: monomer_idh: isocitrate dehydrogenase, NADP-dependent
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                       Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                       -----------
          0 1340.8   1.1          0 1340.6   1.1    1.0  1  lcl|FitnessBrowser__MR1:201768 SO2629 isocitrate dehydrogenase,

Domain annotation for each sequence (and alignments):
>> lcl|FitnessBrowser__MR1:201768  SO2629 isocitrate dehydrogenase, NADP-dependent (NCBI ptt file)
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 ! 1340.6   1.1         0         0       2     739 ..       2     739 ..       1     741 [] 1.00

  Alignments for each domain:
  == domain 1  score: 1340.6 bits;  conditional E-value: 0
                       TIGR00178   2 stekakiiytltdeapllatysllpivkafaasaGievetrdislagrilaefpeylteeqkvddalaelGelaktpea 80
                                     ++++iiyt tdeap+lat sllpi+k+f+++a ++vetrdisl+gr++a+fpe+lt+ qk++d+laelG+la+ pea
  lcl|FitnessBrowser__MR1:201768   2 KDNSPTIIYTETDEAPALATLSLLPIIKTFTNAADVAVETRDISLSGRVIANFPEKLTDAQKIGDHLAELGDLANQPEA 80
                                     56789************************************************************************** PP

                       TIGR00178  81 niiklpnisasvpqlkaaikelqdkGydlpdypeepktdeekdikaryakikGsavnpvlreGnsdrraplavkeyark 159
                                     niiklpnisas+pqlka+i elq+kGyd+p+yp+epktdeek+ikary+kikGsavnpvlreGnsdrrapl+vk++a+k
  lcl|FitnessBrowser__MR1:201768  81 NIIKLPNISASIPQLKACILELQQKGYDIPNYPDEPKTDEEKSIKARYDKIKGSAVNPVLREGNSDRRAPLSVKNFAKK 159
                                     ******************************************************************************* PP

                       TIGR00178 160 hphkmGewsadskshvahmdagdfyaseksvlldaaeevkieliakdGketvlkaklklldgevidssvlskkalvefl 238
                                     +ph+mG+w +dskshvahm++gdfy+se sv+l +a+ v+i l kdG+e+vlk+ lkll+ge+id+sv+skkalv f+
  lcl|FitnessBrowser__MR1:201768 160 NPHSMGKWVKDSKSHVAHMSEGDFYGSELSVTLSNADTVNIVLAQKDGQEVVLKSGLKLLAGEIIDASVMSKKALVSFF 238
                                     ******************************************************************************* PP

                       TIGR00178 239 eeeiedakeegvllslhlkatmmkvsdpivfGhvvrvfykdvfakhaelleqlGldvenGladlyakieslpaakkeei 317
                                     e ei++ak+e+vllslhlkatmmkvsdpi+fGh+v+vf+k vf kha+l+ +lG+dv+nG++d+yaki+slp + +i
  lcl|FitnessBrowser__MR1:201768 239 EREIANAKAENVLLSLHLKATMMKVSDPIMFGHAVKVFFKPVFDKHAALFAELGVDVNNGFGDVYAKIASLPTDVRSQI 317
                                     ******************************************************************************* PP

                       TIGR00178 318 eadlekvyeerpelamvdsdkGitnlhvpsdvivdasmpamirasGkmygkdgklkdtkavipdssyagvyqaviedck 396
                                     ead+ +vy+e+p lamvdsdkGitnlhvpsd+i+dasmpa+ir+sG+m+g+dgkl+dtka+ipd++yagvyq+ i +ck
  lcl|FitnessBrowser__MR1:201768 318 EADIAAVYAEGPALAMVDSDKGITNLHVPSDIIIDASMPAAIRSSGQMWGPDGKLHDTKALIPDRCYAGVYQETIAFCK 396
                                     ******************************************************************************* PP

                       TIGR00178 397 knGafdpttmGtvpnvGlmaqkaeeyGshdktfeieadGvvrvvdssGevlleeeveagdiwrmcqvkdapiqdwvkla 475
                                     ++Gafdp+tmG+vpnvGlmaqkaeeyGshdktfei+adGvv+v+d+sG+vl+ ++veagdiwrmcqvkdapi+dwvkla
  lcl|FitnessBrowser__MR1:201768 397 EHGAFDPSTMGSVPNVGLMAQKAEEYGSHDKTFEIPADGVVNVIDASGKVLMSHNVEAGDIWRMCQVKDAPIRDWVKLA 475
                                     ******************************************************************************* PP

                       TIGR00178 476 vtrarlsgtpavfwldperahdeelikkvekylkdhdteGldiqilspvkatrfslerirrGedtisvtGnvlrdyltd 554
                                     v rarls+tpavfwld +rahd++li kv++yl +hdt+Gldi i+sp +atrfsl+ri++G+dtisvtGnvlrdyltd
  lcl|FitnessBrowser__MR1:201768 476 VRRARLSNTPAVFWLDANRAHDAQLIVKVKQYLPEHDTSGLDISIMSPEEATRFSLARIKEGKDTISVTGNVLRDYLTD 554
                                     ******************************************************************************* PP

                       TIGR00178 555 lfpilelGtsakmlsvvplmaGGGlfetGaGGsapkhvqqleeenhlrwdslGeflalaaslehvavktgnekakvlad 633
                                     lfpilelGtsakmls+vplm+GGGlfetGaGGsapkhvqq+e+e+hlrwdslGeflalaasleh++++ gn+ka+vlad
  lcl|FitnessBrowser__MR1:201768 555 LFPILELGTSAKMLSIVPLMNGGGLFETGAGGSAPKHVQQVEKEGHLRWDSLGEFLALAASLEHLSQTVGNPKAQVLAD 633
                                     ******************************************************************************* PP

                       TIGR00178 634 tldaatgklldeekspsrkvGeldnrgskfylakywaqelaaqtedkelaasfasvaealtkneekivaelaavqGeav 712
                                     tld+a+g++ld++kspsr+vGeldnrgs+fyla+ywaq+la qt+d++l+a+f ++a+al+ ne++ivael+++qG +v
  lcl|FitnessBrowser__MR1:201768 634 TLDQAIGQFLDSNKSPSRRVGELDNRGSHFYLAMYWAQALAVQTKDEQLQAHFIPLAHALSANEQVIVAELNNAQGAPV 712
                                     ******************************************************************************* PP

                       TIGR00178 713 dlgGyyapdtdlttkvlrpsatfnail 739
                                     dlgGyy d +++k++rps+t+n++l
  lcl|FitnessBrowser__MR1:201768 713 DLGGYYRLDAVKAEKAMRPSETLNKLL 739
                                     ************************987 PP

Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (744 nodes)
Target sequences:                          1  (741 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.04u 0.02s 00:00:00.06 Elapsed: 00:00:00.05
# Mc/sec: 9.21
//
[ok]
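One quantity that can be read off the domain table above is how much of the profile and of the protein the alignment spans. The following is a minimal sketch, assuming coverage is simply the number of aligned positions divided by the total length. The helper name is hypothetical, and this is not necessarily how GapMind itself computes coverage.

```python
def span_coverage(start: int, end: int, total_len: int) -> float:
    """Fraction of a model or sequence covered by an inclusive [start, end] span."""
    return (end - start + 1) / total_len


# From the domain table: hmmfrom=2, hmm to=739, model length M=744;
# alifrom=2, ali to=739, protein length 741.
model_cov = span_coverage(2, 739, 744)
seq_cov = span_coverage(2, 739, 741)

print(f"model coverage:    {model_cov:.1%}")
print(f"sequence coverage: {seq_cov:.1%}")
```

Both spans cover 738 positions, so the alignment accounts for nearly all of the 744-node model and of the 741-residue protein.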
This GapMind analysis is from Sep 17 2021. The underlying query database was built on Sep 17 2021.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other blast hits with at least 50% coverage are "low confidence."
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
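The definition of a gap given above (a step with no high- or medium-confidence candidate) can be sketched in a few lines. The step names and the dictionary layout below are hypothetical, chosen only for illustration, and do not reflect GapMind's actual data model or code.

```python
from typing import Dict, List


def find_gaps(candidates_by_step: Dict[str, List[str]]) -> List[str]:
    """Return the steps that have no high- or medium-confidence candidate.

    Each step maps to the confidence levels of its candidates
    ("high", "medium", or "low"); an empty list means no candidates.
    """
    acceptable = {"high", "medium"}
    return [
        step
        for step, levels in candidates_by_step.items()
        if not acceptable.intersection(levels)
    ]


# Hypothetical pathway with four steps.
example = {
    "stepA": ["high", "low"],
    "stepB": ["low"],
    "stepC": [],
    "stepD": ["medium"],
}
gaps = find_gaps(example)
print(gaps)
```

In this made-up example, `stepB` (only a low-confidence candidate) and `stepC` (no candidates) are reported as gaps.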
For more information, see the paper from 2019 on GapMind for amino acid biosynthesis, the paper from 2022 on GapMind for carbon sources, or view the source code.
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory