Align Isocitrate dehydrogenase [NADP]; IDH; Oxalosuccinate decarboxylase; EC 1.1.1.42 (characterized)
to candidate Echvi_1839 Echvi_1839 isocitrate dehydrogenase, NADP-dependent, monomeric type
Query= SwissProt::P16100
         (741 letters)

>FitnessBrowser__Cola:Echvi_1839
          Length = 762

 Score = 1053 bits (2724), Expect = 0.0
 Identities = 520/735 (70%), Positives = 606/735 (82%)

Query: 3   TPKIIYTLTDEAPALATYSLLPIIKAFTGSSGIAVETRDISLAGRLIATFPEYLTDTQKI 62
           TPKI+YTLTDEAPALATYSLLPIIK+FT S+G+ VETRDISL+GR+IA FPEYL + Q+I
Sbjct: 21  TPKILYTLTDEAPALATYSLLPIIKSFTDSAGVVVETRDISLSGRIIANFPEYLKEDQRI 80

Query: 63  SDDLAELGKLATTPDANIIKLPNISASVPQLKAAIKELQQQGYKLPDYPEEPKTDTEKDV 122
            D LAELG++A TP+ANI+KLPNISAS+PQLKAAIKELQ++GY LPDYP+EPK   EK V
Sbjct: 81  GDALAELGEIAKTPEANIVKLPNISASIPQLKAAIKELQEKGYALPDYPDEPKDQEEKGV 140

Query: 123 KARYDKIKGSAVNPVLREGNSDRRAPLSVKNYARKHPHKMGAWSADSKSHVAHMDNGDFY 182
           KA+YDKIKGSAVNPVLREGNSDRRAP +VK +AR +PH MG WSADSKSHVA M  GDFY
Sbjct: 141 KAKYDKIKGSAVNPVLREGNSDRRAPQAVKQFARNNPHSMGEWSADSKSHVASMSEGDFY 200

Query: 183 GSEKAALIGAPGSVKIELIAKDGSSTVLKAKTSVQAGEIIDSSVMSKNALRNFIAAEIED 242
           GSE++  +   G+VKI+L A DG+ TVLK    +Q GE+IDSSVMS   L+ F+A +  D
Sbjct: 201 GSEQSLTMSEAGTVKIQLEAGDGTVTVLKEGLELQEGEVIDSSVMSVKKLQAFLAEQKAD 260

Query: 243 AKKQGVLLSVHLKATMMKVSDPIMFGQIVSEFYKDALTKHAEVLKQIGFDVNNGIGDLYA 302
           AK +G+L S+H+KATMMKVSDPI+FG  V  F+     KHA  +K++G DVNNG GDL +
Sbjct: 261 AKAKGILFSLHMKATMMKVSDPIIFGHAVKVFFAPVFEKHAATIKKLGVDVNNGFGDLVS 320

Query: 303 RIKTLPEAKQKEIEADIQAVYAQRPQLAMVNSDKGITNLHVPSDVIVDASMPAMIRDSGK 362
            ++ LP  K+KEIEADI+A  A  P LAMVNS KGITNLHVPSDVI+DASMPAMIR SG+
Sbjct: 321 ALEKLPADKRKEIEADIEACLADSPDLAMVNSHKGITNLHVPSDVIIDASMPAMIRSSGQ 380

Query: 363 MWGPDGKLHDTKAVIPDRCYAGVYQVVIEDCKQHGAFDPTTMGSVPNVGLMAQKAEEYGS 422
           MW  +  L DTKA+IPDR YAGVYQ  I+ CKQHGAFDPTTMGSVPNVGLMAQKAEEYGS
Sbjct: 381 MWNKNDALQDTKAIIPDRSYAGVYQETIDFCKQHGAFDPTTMGSVPNVGLMAQKAEEYGS 440

Query: 423 HDKTFQIPADGVVRVTDESGKLLLEQSVEAGDIWRMCQAKDAPIQDWVKLAVNRARATNT 482
           HDKTF+  ADG ++V + +G+ L+E  VE GDI+RMCQ KDAPIQDWVKLAVNRAR++NT
Sbjct: 441 HDKTFEAAADGAIKVLNAAGETLMEHKVEKGDIFRMCQTKDAPIQDWVKLAVNRARSSNT 500

Query: 483 PAVFWLDPARAHDAQVIAKVERYLKDYDTSGLDIRILSPVEATRFSLARIREGKDTISVT 542
           PAVFWLD  RAHDAQ+I KV +YL  +DT GLDIRILSPVEATRFSL RI++GKDTISVT
Sbjct: 501 PAVFWLDEHRAHDAQLIQKVNQYLPQHDTEGLDIRILSPVEATRFSLERIKDGKDTISVT 560

Query: 543 GNVLRDYLTDLFPIMELGTSAKMLSIVPLMSGGGLFETGAGGSAPKHVQQFLEEGYLRWD 602
           GNVLRDYLTDLFPI+ELGTSAKMLSIVPLM+GGGLFETGAGGSAPKHVQQF+EEG+LRWD
Sbjct: 561 GNVLRDYLTDLFPILELGTSAKMLSIVPLMNGGGLFETGAGGSAPKHVQQFVEEGHLRWD 620

Query: 603 SLGEFLALAASLEHLGNAYKNPKALVLASTLDQATGKILDNNKSPARKVGEIDNRGSHFY 662
           SLGEFLALA SLEHLG  + N +A+VL  TLD ATGK L+N KSP+RKV E+DNRGSHFY
Sbjct: 621 SLGEFLALAVSLEHLGETFDNNRAIVLGKTLDTATGKFLENGKSPSRKVNELDNRGSHFY 680

Query: 663 LALYWAQALAAQTEDKELQAQFTGIAKALTDNETKIVGELAAAQGKPVDIAGYYHPNTDL 722
           LA+YWA+ALA Q ED  L+  FT +AKA+ + E +++ EL  AQG PVDI GY+ P  D
Sbjct: 681 LAMYWAEALANQDEDAALKEIFTKVAKAMIEKEEQVIAELNGAQGSPVDIGGYFKPAEDK 740

Query: 723 TSKAIRPSATFNAAL 737
           TSKA+RPS T N  L
Sbjct: 741 TSKAMRPSQTLNGIL 755

Lambda     K      H
   0.315    0.131    0.374

Gapped
Lambda     K      H
   0.267   0.0410    0.140

Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 1506
Number of extensions: 50
Number of successful extensions: 1
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 741
Length of database: 762
Length adjustment: 40
Effective length of query: 701
Effective length of database: 722
Effective search space: 506122
Effective search space used: 506122
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.5 bits)
S2: 55 (25.8 bits)
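The summary figures in the BLAST report above can be cross-checked with a few lines of arithmetic. The sketch below is not part of GapMind: it hard-codes the values printed in the report, and the helper names (`percent`, `coverage`) are illustrative. It derives percent identity, percent positives, query and subject coverage, and the effective search space.

```python
def percent(numerator: int, denominator: int) -> float:
    """Return numerator/denominator expressed as a percentage."""
    return 100.0 * numerator / denominator


def coverage(start: int, end: int, length: int) -> float:
    """Percentage of a sequence spanned by the 1-based inclusive range start..end."""
    return percent(end - start + 1, length)


# Values copied from the BLAST report above.
identities, positives, aligned_columns = 520, 606, 735
query_start, query_end, query_length = 3, 737, 741
sbjct_start, sbjct_end, sbjct_length = 21, 755, 762
length_adjustment = 40

identity_pct = percent(identities, aligned_columns)
positive_pct = percent(positives, aligned_columns)
query_cov = coverage(query_start, query_end, query_length)
sbjct_cov = coverage(sbjct_start, sbjct_end, sbjct_length)

# Effective search space = effective query length * effective database length.
effective_space = (query_length - length_adjustment) * (sbjct_length - length_adjustment)

print(f"identity:        {identity_pct:.1f}%")
print(f"positives:       {positive_pct:.1f}%")
print(f"query coverage:  {query_cov:.1f}%")
print(f"sbjct coverage:  {sbjct_cov:.1f}%")
print(f"effective space: {effective_space}")
```

The computed percentages are slightly higher than the whole-number values in the report (70% and 82%), which suggests the report truncates rather than rounds; the effective search space reproduces the reported 506122.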
Align candidate Echvi_1839 Echvi_1839 (isocitrate dehydrogenase, NADP-dependent, monomeric type)
to HMM TIGR00178 (isocitrate dehydrogenase, NADP-dependent (EC 1.1.1.42))
# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR00178.hmm
# target sequence database:        /tmp/gapView.19190.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR00178  [M=744]
Accession:   TIGR00178
Description: monomer_idh: isocitrate dehydrogenase, NADP-dependent
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                             Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                             -----------
          0 1325.0   2.3          0 1324.8   2.3    1.0  1  lcl|FitnessBrowser__Cola:Echvi_1839  Echvi_1839 isocitrate dehydrogen

Domain annotation for each sequence (and alignments):
>> lcl|FitnessBrowser__Cola:Echvi_1839  Echvi_1839 isocitrate dehydrogenase, NADP-dependent, monomeric type
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 ! 1324.8   2.3         0         0       3     742 ..      19     758 ..      16     760 .. 1.00

  Alignments for each domain:
  == domain 1  score: 1324.8 bits;  conditional E-value: 0
                            TIGR00178   3 tekakiiytltdeapllatysllpivkafaasaGievetrdislagrilaefpeylteeqkvddalaelGelak 76
                                          t+++ki+ytltdeap+latysllpi+k+f++saG+ vetrdisl+gri+a+fpeyl e+q+++dalaelGe+ak
  lcl|FitnessBrowser__Cola:Echvi_1839  19 TKTPKILYTLTDEAPALATYSLLPIIKSFTDSAGVVVETRDISLSGRIIANFPEYLKEDQRIGDALAELGEIAK 92
                                          6789********************************************************************** PP

                            TIGR00178  77 tpeaniiklpnisasvpqlkaaikelqdkGydlpdypeepktdeekdikaryakikGsavnpvlreGnsdrrap 150
                                          tpeani+klpnisas+pqlkaaikelq+kGy+lpdyp+epk +eek +ka+y+kikGsavnpvlreGnsdrrap
  lcl|FitnessBrowser__Cola:Echvi_1839  93 TPEANIVKLPNISASIPQLKAAIKELQEKGYALPDYPDEPKDQEEKGVKAKYDKIKGSAVNPVLREGNSDRRAP 166
                                          ************************************************************************** PP

                            TIGR00178 151 lavkeyarkhphkmGewsadskshvahmdagdfyaseksvlldaaeevkieliakdGketvlkaklklldgevi 224
                                          +avk++ar++ph+mGewsadskshva m++gdfy+se+s+++ +a  vki+l a dG++tvlk+ l+l++gevi
  lcl|FitnessBrowser__Cola:Echvi_1839 167 QAVKQFARNNPHSMGEWSADSKSHVASMSEGDFYGSEQSLTMSEAGTVKIQLEAGDGTVTVLKEGLELQEGEVI 240
                                          ************************************************************************** PP

                            TIGR00178 225 dssvlskkalvefleeeiedakeegvllslhlkatmmkvsdpivfGhvvrvfykdvfakhaelleqlGldvenG 298
                                          dssv+s k l++fl+e+ +dak++g+l+slh+katmmkvsdpi+fGh+v+vf++ vf kha+++++lG+dv+nG
  lcl|FitnessBrowser__Cola:Echvi_1839 241 DSSVMSVKKLQAFLAEQKADAKAKGILFSLHMKATMMKVSDPIIFGHAVKVFFAPVFEKHAATIKKLGVDVNNG 314
                                          ************************************************************************** PP

                            TIGR00178 299 ladlyakieslpaakkeeieadlekvyeerpelamvdsdkGitnlhvpsdvivdasmpamirasGkmygkdgkl 372
                                          ++dl + +e+lpa k++eiead+e+++++ p+lamv+s kGitnlhvpsdvi+dasmpamir+sG+m++k++ l
  lcl|FitnessBrowser__Cola:Echvi_1839 315 FGDLVSALEKLPADKRKEIEADIEACLADSPDLAMVNSHKGITNLHVPSDVIIDASMPAMIRSSGQMWNKNDAL 388
                                          ************************************************************************** PP

                            TIGR00178 373 kdtkavipdssyagvyqaviedckknGafdpttmGtvpnvGlmaqkaeeyGshdktfeieadGvvrvvdssGev 446
                                          +dtka+ipd+syagvyq+ i++ck++GafdpttmG+vpnvGlmaqkaeeyGshdktfe  adG ++v ++ Ge
  lcl|FitnessBrowser__Cola:Echvi_1839 389 QDTKAIIPDRSYAGVYQETIDFCKQHGAFDPTTMGSVPNVGLMAQKAEEYGSHDKTFEAAADGAIKVLNAAGET 462
                                          ************************************************************************** PP

                            TIGR00178 447 lleeeveagdiwrmcqvkdapiqdwvklavtrarlsgtpavfwldperahdeelikkvekylkdhdteGldiqi 520
                                          l+e++ve+gdi+rmcq kdapiqdwvklav+rar s+tpavfwld++rahd++li+kv++yl +hdteGldi+i
  lcl|FitnessBrowser__Cola:Echvi_1839 463 LMEHKVEKGDIFRMCQTKDAPIQDWVKLAVNRARSSNTPAVFWLDEHRAHDAQLIQKVNQYLPQHDTEGLDIRI 536
                                          ************************************************************************** PP

                            TIGR00178 521 lspvkatrfslerirrGedtisvtGnvlrdyltdlfpilelGtsakmlsvvplmaGGGlfetGaGGsapkhvqq 594
                                          lspv+atrfsleri+ G+dtisvtGnvlrdyltdlfpilelGtsakmls+vplm+GGGlfetGaGGsapkhvqq
  lcl|FitnessBrowser__Cola:Echvi_1839 537 LSPVEATRFSLERIKDGKDTISVTGNVLRDYLTDLFPILELGTSAKMLSIVPLMNGGGLFETGAGGSAPKHVQQ 610
                                          ************************************************************************** PP

                            TIGR00178 595 leeenhlrwdslGeflalaaslehvavktgnekakvladtldaatgklldeekspsrkvGeldnrgskfylaky 668
                                          ++ee+hlrwdslGeflala sleh++ + +n++a vl++tld atgk+l++ kspsrkv eldnrgs+fyla+y
  lcl|FitnessBrowser__Cola:Echvi_1839 611 FVEEGHLRWDSLGEFLALAVSLEHLGETFDNNRAIVLGKTLDTATGKFLENGKSPSRKVNELDNRGSHFYLAMY 684
                                          ************************************************************************** PP

                            TIGR00178 669 waqelaaqtedkelaasfasvaealtkneekivaelaavqGeavdlgGyyapdtdlttkvlrpsatfnaileal 742
                                          wa++la q ed+ l++ f+ va+a+ ++ee+++ael+ +qG++vd+gGy++p +d+t+k++rps+t+n ile +
  lcl|FitnessBrowser__Cola:Echvi_1839 685 WAEALANQDEDAALKEIFTKVAKAMIEKEEQVIAELNGAQGSPVDIGGYFKPAEDKTSKAMRPSQTLNGILEMV 758
                                          ***********************************************************************976 PP

Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (744 nodes)
Target sequences:                          1  (762 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.04u 0.03s 00:00:00.07 Elapsed: 00:00:00.06
# Mc/sec: 9.17
//
[ok]
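The per-domain row of the hmmsearch report above carries the coordinates needed to judge how much of the model and of the protein the alignment covers. The following is a minimal sketch, not GapMind's own code: `DomainHit` and `parse_domain_row` are hypothetical names, the row is copied from the report with its whitespace collapsed, and the parser assumes the sixteen whitespace-separated fields shown in that row.

```python
from typing import NamedTuple


class DomainHit(NamedTuple):
    score: float
    hmm_from: int
    hmm_to: int
    ali_from: int
    ali_to: int
    env_from: int
    env_to: int
    acc: float


def parse_domain_row(row: str) -> DomainHit:
    """Parse one row of the hmmsearch per-domain table.

    Expected fields: #, '!' or '?', score, bias, c-Evalue, i-Evalue,
    hmmfrom, hmm to, flag, alifrom, ali to, flag, envfrom, env to, flag, acc.
    """
    f = row.split()
    return DomainHit(
        score=float(f[2]),
        hmm_from=int(f[6]),
        hmm_to=int(f[7]),
        ali_from=int(f[9]),
        ali_to=int(f[10]),
        env_from=int(f[12]),
        env_to=int(f[13]),
        acc=float(f[15]),
    )


# Row copied from the report above; model and protein lengths come from the
# "[M=744]" header and the "762 residues searched" statistic.
row = "1 ! 1324.8 2.3 0 0 3 742 .. 19 758 .. 16 760 .. 1.00"
model_length, protein_length = 744, 762

hit = parse_domain_row(row)
model_cov = 100.0 * (hit.hmm_to - hit.hmm_from + 1) / model_length
protein_cov = 100.0 * (hit.ali_to - hit.ali_from + 1) / protein_length

print(f"score {hit.score} bits, model coverage {model_cov:.1f}%, "
      f"protein coverage {protein_cov:.1f}%")
```

Splitting on whitespace keeps the sketch short; it would need adjusting for rows whose layout differs from the one shown here.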
This GapMind analysis is from Sep 17 2021. The underlying query database was built on Sep 17 2021.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by running ublast (a fast alternative to protein BLAST) against a database of manually curated proteins, most of which are experimentally characterized, or by running HMMER with enzyme models (usually from TIGRFAMs). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other BLAST hits with at least 50% coverage are "low confidence."
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
For more information, see:
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory