Align Isocitrate dehydrogenase [NADP]; IDH; Oxalosuccinate decarboxylase; EC 1.1.1.42 (characterized)
to candidate WP_066324211.1 BLR17_RS08165 NADP-dependent isocitrate dehydrogenase
Query= SwissProt::P16100
         (741 letters)

>NCBI__GCF_900100165.1:WP_066324211.1
          Length = 744

 Score = 1028 bits (2659), Expect = 0.0
 Identities = 510/733 (69%), Positives = 601/733 (81%), Gaps = 2/733 (0%)

Query: 5   KIIYTLTDEAPALATYSLLPIIKAFTGSSGIAVETRDISLAGRLIATFPEYLTDTQKISD 64
           KI+YTLTDEAP LATYS LPI++AFT ++GI +ET DIS+A R++A FPE+LT+ Q++ D
Sbjct: 6   KIVYTLTDEAPLLATYSFLPIVEAFTATAGIEIETEDISVAARILANFPEFLTEEQRVKD 65

Query: 65  DLAELGKLATTPDANIIKLPNISASVPQLKAAIKELQQQGYKLPDYPEEPKTDTEKDVKA 124
            LAELGKLAT P+ANIIKLPN+SASVPQLK AI ELQ  GY +P+YPE P+ D E  +KA
Sbjct: 66  SLAELGKLATAPEANIIKLPNVSASVPQLKGAIAELQAHGYAVPNYPEAPQNDAETAIKA 125

Query: 125 RYDKIKGSAVNPVLREGNSDRRAPLSVKNYARKHPHKMGAWSADSKSHVAHMDNGDFYGS 184
           +Y KI GSAVNPVLREGNSDRRAP +VKNYA+ +PH MGAWSADSK+ VA M++GDFYGS
Sbjct: 126 KYAKILGSAVNPVLREGNSDRRAPKAVKNYAKANPHSMGAWSADSKTAVASMESGDFYGS 185

Query: 185 EKAALIGAPGSVKIELIAKDGSSTVLKAKTSVQAGEIIDSSVMSKNALRNFIAAEIEDAK 244
           E++  +     VKIE + +DG++TVLKA T ++AGEIIDSSVM  NAL++F+A  I +AK
Sbjct: 186 EQSVTVAEATDVKIEFVGQDGTTTVLKASTPLKAGEIIDSSVMHLNALKSFVAKTIAEAK 245

Query: 245 KQGVLLSVHLKATMMKVSDPIMFGQIVSEFYKDALTKHAEVLKQIGFDVNNGIGDLYARI 304
            QGVLLSVHLKATMMKVSDPI+FG IV  ++ D   K+A +  ++G +  NG+GD+YA+I
Sbjct: 246 AQGVLLSVHLKATMMKVSDPIIFGAIVEVYFADVFAKYAALFAELGVNTKNGLGDVYAKI 305

Query: 305 KTLPEAKQKEIEADIQAVYAQRPQLAMVNSDKGITNLHVPSDVIVDASMPAMIRDSGKMW 364
                A++ E++A I       P LAMVNSDKGITNLHVPSDVIVDASMPAMIR SG+MW
Sbjct: 306 AG--NAQEAEVKAAIDQAIENGPALAMVNSDKGITNLHVPSDVIVDASMPAMIRTSGQMW 363

Query: 365 GPDGKLHDTKAVIPDRCYAGVYQVVIEDCKQHGAFDPTTMGSVPNVGLMAQKAEEYGSHD 424
             +GK  DT A+IPDR YAGVY   I+ CK+HGAFDPTTMGSVPNVGLMAQKAEEYGSHD
Sbjct: 364 NKEGKQQDTIAIIPDRSYAGVYTATIDFCKKHGAFDPTTMGSVPNVGLMAQKAEEYGSHD 423

Query: 425 KTFQIPADGVVRVTDESGKLLLEQSVEAGDIWRMCQAKDAPIQDWVKLAVNRARATNTPA 484
           KTFQ+  +GVVRV D +G +L+EQ+VEA DI+RMCQAKDAPIQDWVKLAVNRAR +NTPA
Sbjct: 424 KTFQMSTNGVVRVVDVNGNVLMEQTVEANDIFRMCQAKDAPIQDWVKLAVNRARLSNTPA 483

Query: 485 VFWLDPARAHDAQVIAKVERYLKDYDTSGLDIRILSPVEATRFSLARIREGKDTISVTGN 544
           VFWLD  RAHD ++I KV++YLKDYDT+ LDIRIL+P+ AT F+L RI +G+DTISVTGN
Sbjct: 484 VFWLDENRAHDRELIVKVQKYLKDYDTTALDIRILNPIAATEFTLDRIIKGQDTISVTGN 543

Query: 545 VLRDYLTDLFPIMELGTSAKMLSIVPLMSGGGLFETGAGGSAPKHVQQFLEEGYLRWDSL 604
           VLRDYLTDLFPI+E+GTSAKMLSIVPLM+GGGLFETGAGGSAPKHV+QF+ EGYLRWDSL
Sbjct: 544 VLRDYLTDLFPILEVGTSAKMLSIVPLMNGGGLFETGAGGSAPKHVEQFVTEGYLRWDSL 603

Query: 605 GEFLALAASLEHLGNAYKNPKALVLASTLDQATGKILDNNKSPARKVGEIDNRGSHFYLA 664
           GEFLAL ASLEHLG    N KA+VL+ TLDQA    L N+KSPARKVG+IDNRGSHFYLA
Sbjct: 604 GEFLALGASLEHLGQTLNNEKAIVLSETLDQANDAFLKNDKSPARKVGQIDNRGSHFYLA 663

Query: 665 LYWAQALAAQTEDKELQAQFTGIAKALTDNETKIVGELAAAQGKPVDIAGYYHPNTDLTS 724
           LYWAQALAAQT+D +LQA F  IAK LT+NE KI  EL  AQGKP +I GYY PN  L S
Sbjct: 664 LYWAQALAAQTKDADLQAIFAPIAKELTENEAKIDAELIGAQGKPQEIGGYYQPNPALVS 723

Query: 725 KAIRPSATFNAAL 737
           KA+RPS TFN  L
Sbjct: 724 KAMRPSTTFNTIL 736

Lambda     K      H
   0.315    0.131    0.374

Gapped
Lambda     K      H
   0.267   0.0410    0.140

Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 1446
Number of extensions: 52
Number of successful extensions: 3
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 741
Length of database: 744
Length adjustment: 40
Effective length of query: 701
Effective length of database: 704
Effective search space: 493504
Effective search space used: 493504
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.5 bits)
S2: 55 (25.8 bits)
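The summary numbers in the BLAST report above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not part of GapMind: the variable names are hypothetical, the inputs are copied from the report, and "query coverage" is assumed here to mean the aligned query span divided by the query length, which may differ from the coverage definition GapMind itself applies. The bit score uses the standard Karlin-Altschul conversion with the rounded gapped Lambda and K printed in the report, so it only approximately reproduces the reported value.

```python
import math

# Values copied from the BLAST report above.
identities, positives, aln_len = 510, 601, 733
query_len, db_len, length_adjustment = 741, 744, 40
query_start, query_end = 5, 737
raw_score = 2659
gapped_lambda, gapped_k = 0.267, 0.0410  # rounded values as printed


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


identity_pct = percent(identities, aln_len)
positive_pct = percent(positives, aln_len)

# Assumption: coverage = aligned query span / full query length.
query_coverage_pct = percent(query_end - query_start + 1, query_len)

# Effective search space = effective query length * effective database length.
effective_query = query_len - length_adjustment
effective_db = db_len - length_adjustment
search_space = effective_query * effective_db

# Karlin-Altschul bit score: (lambda * S - ln K) / ln 2.
bit_score = (gapped_lambda * raw_score - math.log(gapped_k)) / math.log(2)

print(f"identity  {identity_pct:.1f}%")
print(f"positives {positive_pct:.1f}%")
print(f"query coverage {query_coverage_pct:.1f}%")
print(f"effective search space {search_space}")
print(f"bit score ~{bit_score:.1f}")
```

The percentages printed in the report (69% and 81%) are consistent with these values being truncated rather than rounded, and the product 701 x 704 matches the reported effective search space.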
Align candidate WP_066324211.1 BLR17_RS08165 (NADP-dependent isocitrate dehydrogenase)
to HMM TIGR00178 (isocitrate dehydrogenase, NADP-dependent (EC 1.1.1.42))
# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR00178.hmm
# target sequence database:        /tmp/gapView.2109254.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR00178  [M=744]
Accession:   TIGR00178
Description: monomer_idh: isocitrate dehydrogenase, NADP-dependent
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                             Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                             -----------
          0 1316.8   7.7          0 1316.6   7.7    1.0  1  NCBI__GCF_900100165.1:WP_066324211.1

Domain annotation for each sequence (and alignments):
>> NCBI__GCF_900100165.1:WP_066324211.1
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  1316.6   7.7         0         0       3     742 ..       2     739 ..       1     741 [.  0.99

  Alignments for each domain:
  == domain 1  score: 1316.6 bits;  conditional E-value: 0
                             TIGR00178   3 tekakiiytltdeapllatysllpivkafaasaGievetrdislagrilaefpeylteeqkvddalaelGela 75
                                           t+k+ki+ytltdeapllatys+lpiv+af+a+aGie+et dis+a+rila+fpe+lteeq+v+d+laelG+la
  NCBI__GCF_900100165.1:WP_066324211.1   2 TQKSKIVYTLTDEAPLLATYSFLPIVEAFTATAGIEIETEDISVAARILANFPEFLTEEQRVKDSLAELGKLA 74
                                           6789********************************************************************* PP

                             TIGR00178  76 ktpeaniiklpnisasvpqlkaaikelqdkGydlpdypeepktdeekdikaryakikGsavnpvlreGnsdrr 148
                                           + peaniiklpn+sasvpqlk ai elq++Gy++p+ype p++d+e +ika+yaki+GsavnpvlreGnsdrr
  NCBI__GCF_900100165.1:WP_066324211.1  75 TAPEANIIKLPNVSASVPQLKGAIAELQAHGYAVPNYPEAPQNDAETAIKAKYAKILGSAVNPVLREGNSDRR 147
                                           ************************************************************************* PP

                             TIGR00178 149 aplavkeyarkhphkmGewsadskshvahmdagdfyaseksvlldaaeevkieliakdGketvlkaklklldg 221
                                           ap+avk+ya+ +ph+mG+wsadsk+ va m++gdfy+se+sv++ +a++vkie++ +dG++tvlka+++l++g
  NCBI__GCF_900100165.1:WP_066324211.1 148 APKAVKNYAKANPHSMGAWSADSKTAVASMESGDFYGSEQSVTVAEATDVKIEFVGQDGTTTVLKASTPLKAG 220
                                           ************************************************************************* PP

                             TIGR00178 222 evidssvlskkalvefleeeiedakeegvllslhlkatmmkvsdpivfGhvvrvfykdvfakhaelleqlGld 294
                                           e+idssv+  +al+ f+++ i++ak++gvlls+hlkatmmkvsdpi+fG +v v+++dvfak+a+l+ +lG++
  NCBI__GCF_900100165.1:WP_066324211.1 221 EIIDSSVMHLNALKSFVAKTIAEAKAQGVLLSVHLKATMMKVSDPIIFGAIVEVYFADVFAKYAALFAELGVN 293
                                           ************************************************************************* PP

                             TIGR00178 295 venGladlyakieslpaakkeeieadlekvyeerpelamvdsdkGitnlhvpsdvivdasmpamirasGkmyg 367
                                           ++nGl+d+yaki+   +a++ e++a+++++ e++p lamv+sdkGitnlhvpsdvivdasmpamir+sG+m++
  NCBI__GCF_900100165.1:WP_066324211.1 294 TKNGLGDVYAKIA--GNAQEAEVKAAIDQAIENGPALAMVNSDKGITNLHVPSDVIVDASMPAMIRTSGQMWN 364
                                           ************8..58999***************************************************** PP

                             TIGR00178 368 kdgklkdtkavipdssyagvyqaviedckknGafdpttmGtvpnvGlmaqkaeeyGshdktfeieadGvvrvv 440
                                           k+gk++dt a+ipd+syagvy a i++ckk+GafdpttmG+vpnvGlmaqkaeeyGshdktf++  +Gvvrvv
  NCBI__GCF_900100165.1:WP_066324211.1 365 KEGKQQDTIAIIPDRSYAGVYTATIDFCKKHGAFDPTTMGSVPNVGLMAQKAEEYGSHDKTFQMSTNGVVRVV 437
                                           ************************************************************************* PP

                             TIGR00178 441 dssGevlleeeveagdiwrmcqvkdapiqdwvklavtrarlsgtpavfwldperahdeelikkvekylkdhdt 513
                                           d +G+vl+e+ vea+di+rmcq+kdapiqdwvklav+rarls+tpavfwld++rahd+eli kv+kylkd+dt
  NCBI__GCF_900100165.1:WP_066324211.1 438 DVNGNVLMEQTVEANDIFRMCQAKDAPIQDWVKLAVNRARLSNTPAVFWLDENRAHDRELIVKVQKYLKDYDT 510
                                           ************************************************************************* PP

                             TIGR00178 514 eGldiqilspvkatrfslerirrGedtisvtGnvlrdyltdlfpilelGtsakmlsvvplmaGGGlfetGaGG 586
                                           + ldi+il+p+ at+f+l+ri +G+dtisvtGnvlrdyltdlfpile+Gtsakmls+vplm+GGGlfetGaGG
  NCBI__GCF_900100165.1:WP_066324211.1 511 TALDIRILNPIAATEFTLDRIIKGQDTISVTGNVLRDYLTDLFPILEVGTSAKMLSIVPLMNGGGLFETGAGG 583
                                           ************************************************************************* PP

                             TIGR00178 587 sapkhvqqleeenhlrwdslGeflalaaslehvavktgnekakvladtldaatgklldeekspsrkvGeldnr 659
                                           sapkhv+q++ e+ lrwdslGeflal+asleh++++ +neka vl++tld+a + +l+++ksp+rkvG++dnr
  NCBI__GCF_900100165.1:WP_066324211.1 584 SAPKHVEQFVTEGYLRWDSLGEFLALGASLEHLGQTLNNEKAIVLSETLDQANDAFLKNDKSPARKVGQIDNR 656
                                           ************************************************************************* PP

                             TIGR00178 660 gskfylakywaqelaaqtedkelaasfasvaealtkneekivaelaavqGeavdlgGyyapdtdlttkvlrps 732
                                           gs+fyla ywaq+laaqt+d++l+a fa++a+ lt+ne+ki+ael  +qG++ ++gGyy+p+  l++k++rps
  NCBI__GCF_900100165.1:WP_066324211.1 657 GSHFYLALYWAQALAAQTKDADLQAIFAPIAKELTENEAKIDAELIGAQGKPQEIGGYYQPNPALVSKAMRPS 729
                                           ************************************************************************* PP

                             TIGR00178 733 atfnaileal 742
                                            tfn+il+ +
  NCBI__GCF_900100165.1:WP_066324211.1 730 TTFNTILDKI 739
                                           *******976 PP

Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (744 nodes)
Target sequences:                          1  (744 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.01u 0.00s 00:00:00.01 Elapsed: 00:00:00.01
# Mc/sec: 39.79
//
[ok]
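How much of the model and of the candidate the single domain hit spans can be read off the domain annotation row in the hmmsearch report above. The following is a minimal sketch under stated assumptions: `DomainHit`, `parse_domain_row` and `span_coverage` are hypothetical helper names, the row is copied from the report with whitespace collapsed, and coverage is taken to be the inclusive span divided by the total length (744 for both the model, from `[M=744]`, and the candidate).

```python
from dataclasses import dataclass


@dataclass
class DomainHit:
    score: float
    hmm_from: int
    hmm_to: int
    ali_from: int
    ali_to: int
    acc: float


def parse_domain_row(row: str) -> DomainHit:
    """Parse one whitespace-separated row of the hmmsearch domain table."""
    f = row.split()
    # Column order: # ! score bias c-Evalue i-Evalue hmmfrom hmmto ..
    #               alifrom alito .. envfrom envto [. acc
    return DomainHit(
        score=float(f[2]),
        hmm_from=int(f[6]),
        hmm_to=int(f[7]),
        ali_from=int(f[9]),
        ali_to=int(f[10]),
        acc=float(f[15]),
    )


def span_coverage(start: int, end: int, total: int) -> float:
    """Fraction of `total` positions covered by the inclusive span."""
    return (end - start + 1) / total


# Domain row copied from the report above.
row = "1 ! 1316.6 7.7 0 0 3 742 .. 2 739 .. 1 741 [. 0.99"
hit = parse_domain_row(row)

model_coverage = span_coverage(hit.hmm_from, hit.hmm_to, 744)    # model length M
target_coverage = span_coverage(hit.ali_from, hit.ali_to, 744)   # candidate length

print(f"model coverage  {model_coverage:.1%}")
print(f"target coverage {target_coverage:.1%}")
```

Both spans cover nearly the full length, which fits the near end-to-end alignment shown in the report.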
This GapMind analysis is from Sep 24 2021. The underlying query database was built on Sep 17 2021.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other blast hits with at least 50% coverage are "low confidence."
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
For more information, see:
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory