Align isocitrate dehydrogenase (NADP+) (EC 1.1.1.42) (characterized)
to candidate GFF3859 HP15_3800 isocitrate dehydrogenase, NADP-dependent
Query= BRENDA::O53611 (745 letters) >FitnessBrowser__Marino:GFF3859 Length = 747 Score = 1024 bits (2647), Expect = 0.0 Identities = 501/741 (67%), Positives = 599/741 (80%), Gaps = 1/741 (0%) Query: 1 MSAEQPTIIYTLTDEAPLLATYAFLPIVRAFAEPAGIKIEASDISVAARILAEFPDYLTE 60 M++ + I+YTLTDEAP LAT + LPI+ +A+PAGI+ E SDIS+AARILA FPDYL E Sbjct: 1 MTSSKAKIVYTLTDEAPALATRSLLPILETYAKPAGIEFETSDISLAARILANFPDYLEE 60 Query: 61 EQRVPDNLAELGRLTQLPDTNIIKLPNISASVPQLVAAIKELQDKGYAVPDYPADPKTDQ 120 +QRVPD LAELG T+ PD NIIKLPNISAS+PQL AAIKEL ++GY VP+Y +P+ D+ Sbjct: 61 DQRVPDALAELGEYTKDPDANIIKLPNISASIPQLRAAIKELNEQGYNVPEYKENPENDE 120 Query: 121 EKAIKERYARCLGSAVNPVLRQGNSDRRAPKAVKEYARKHPHSMGEWSMASRTHVAHMRH 180 EK I+ RYA+ LGSAVNPVLR+GNSDRRAP AVK +ARK+PHSMGEWS ASRTHVAHMR Sbjct: 121 EKEIQSRYAKVLGSAVNPVLREGNSDRRAPTAVKAFARKYPHSMGEWSPASRTHVAHMRG 180 Query: 181 GDFYAGEKSMTLDRARNVRMELLAKSGKTIVLKPEVPLDDGDVIDSMFMSKKALCDFYEE 240 GDFY+ E+S+TLD+A + K GK VLK ++PL +G+V+D MFMSKKAL F+E+ Sbjct: 181 GDFYSSEQSVTLDKATKANIVFENKQGKQTVLKSDLPLQEGEVLDGMFMSKKALVKFFED 240 Query: 241 QMQDAFETGVMFSLHVKATMMKVSHPIVFGHAVRIFYKDAFAKHQELFDDLGVNVNNGLS 300 + D TGVMFSLHVKATMMK+SHPIVFGHAV++FYKD F K+ ELFD++GVN NNGLS Sbjct: 241 AIADCENTGVMFSLHVKATMMKISHPIVFGHAVKVFYKDLFDKYGELFDEIGVNPNNGLS 300 Query: 301 DLYSKIESLPASQRDEIIEDLHRCHEHRPELAMVDSARGISNFHSPSDVIVDASMPAMIR 360 + KI+ LP S++++I EDLH C+EHRPE+AMVDS +GI+N H PSDVIVDASMPAMIR Sbjct: 301 SVVEKIKQLPESKQEQIQEDLHACYEHRPEIAMVDSVKGITNLHVPSDVIVDASMPAMIR 360 Query: 361 AGGKMYGADGKLKDTKAVNPESTFSRIYQEIINFCKTNGQFDPTTMGTVPNVGLMAQQAE 420 GKM+ D KLKDTKAV PEST++ IYQE+INFCKT+G FDPTTMGTVPNVGLMAQ+AE Sbjct: 361 NSGKMWARDNKLKDTKAVMPESTYATIYQEVINFCKTHGAFDPTTMGTVPNVGLMAQKAE 420 Query: 421 EYGSHDKTFEIPEDGVANIVDVATGEVLLTENVEAGDIWRMCIVKDAPIRDWVKLAVTRA 480 EYGSHDKTFEI EDGV +V G VL NVE GDIWR C KD PIRDWVKLAV RA Sbjct: 421 EYGSHDKTFEIKEDGVVRVV-AEDGTVLTEHNVEKGDIWRACQTKDLPIRDWVKLAVNRA 479 Query: 481 RISGMPVLFWLDPYRPHENELIKKVKTYLKDHDTEGLDIQIMSQVRSMRYTCERLVRGLD 540 R +GMP +FWLD R H+ +LI+KV TYLKDHDTEGLDI+IMS VR++R+T ERL+RGLD 
Sbjct: 480 RATGMPAVFWLDDERAHDAQLIQKVNTYLKDHDTEGLDIRIMSPVRAIRWTMERLIRGLD 539 Query: 541 TIAATGNILRDYLTDLFPILELGTSAKMLSVVPLMAGGGMYETGAGGSAPKHVKQLVEEN 600 TI+ TGN+LRDYLTDLFPILELGTSAKMLS+VPL+ GGG+YETGAGGSAPKHV+QL++EN Sbjct: 540 TISVTGNVLRDYLTDLFPILELGTSAKMLSIVPLLNGGGLYETGAGGSAPKHVQQLIQEN 599 Query: 601 HLRWDSLGEFLALGAGFEDIGIKTGNERAKLLGKTLDAAIGKLLDNDKSPSRKTGELDNR 660 HLRWDSLGEFLA +++G K NERA+LLG+TLD A +LL+N++SPSR TGELDNR Sbjct: 600 HLRWDSLGEFLATAVSLDELGEKQNNERARLLGQTLDKATERLLENNQSPSRVTGELDNR 659 Query: 661 GSQFYLAMYWAQELAAQTDDQQLAEHFASLADVLTKNEDVIVRELTEVQGEPVDIGGYYA 720 GS F+LA YWA+ELA Q D++L E F L+ L +N+D I+ E+T VQG P DIGGYY Sbjct: 660 GSHFHLARYWAEELANQDSDKELKEFFTKLSAQLEENKDKILEEMTVVQGNPADIGGYYH 719 Query: 721 PDSDMTTAVMRPSKTFNAALE 741 P + VM+PS T N LE Sbjct: 720 PPMEKVCEVMQPSATLNRILE 740 Lambda K H 0.317 0.134 0.388 Gapped Lambda K H 0.267 0.0410 0.140 Matrix: BLOSUM62 Gap Penalties: Existence: 11, Extension: 1 Number of Sequences: 1 Number of Hits to DB: 1478 Number of extensions: 52 Number of successful extensions: 2 Number of sequences better than 1.0e-02: 1 Number of HSP's gapped: 1 Number of HSP's successfully gapped: 1 Length of query: 745 Length of database: 747 Length adjustment: 40 Effective length of query: 705 Effective length of database: 707 Effective search space: 498435 Effective search space used: 498435 Neighboring words threshold: 11 Window for multiple hits: 40 X1: 16 ( 7.3 bits) X2: 38 (14.6 bits) X3: 64 (24.7 bits) S1: 41 (21.7 bits) S2: 55 (25.8 bits)
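The summary numbers in the BLAST report above can be cross-checked with the standard Karlin-Altschul relations. The following is a minimal sketch, not GapMind's own code: the function names are hypothetical, and it assumes the gapped Lambda and K (0.267 and 0.0410), the raw score (2647), the sequence lengths and the length adjustment are exactly as printed in the report.

```python
import math

def bit_score(raw_score, lam, k):
    """Convert a raw alignment score to bits: S' = (lambda*S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)

def effective_search_space(query_len, db_len, length_adjustment):
    """Product of the effective query and database lengths."""
    return (query_len - length_adjustment) * (db_len - length_adjustment)

def e_value(bits, search_space):
    """Expected number of chance hits scoring at least `bits`."""
    return search_space * 2.0 ** (-bits)

# Values copied from the report above (gapped statistics).
bits = bit_score(2647, lam=0.267, k=0.0410)
space = effective_search_space(745, 747, length_adjustment=40)
percent_identity = 100.0 * 501 / 741   # Identities = 501/741
percent_positive = 100.0 * 599 / 741   # Positives = 599/741

print(f"bit score      ~ {bits:.1f}")
print(f"search space   = {space}")
print(f"E-value        ~ {e_value(bits, space):.3g}")
print(f"identity       = {percent_identity:.1f}%")
print(f"positives      = {percent_positive:.1f}%")
```

With these inputs the bit score rounds to the reported 1024 and the search space equals the reported 498435. The E-value is too small to distinguish from zero at the report's precision, which is consistent with "Expect = 0.0".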
Align candidate GFF3859 HP15_3800 (isocitrate dehydrogenase, NADP-dependent)
to HMM TIGR00178 (isocitrate dehydrogenase, NADP-dependent (EC 1.1.1.42))
# hmmsearch :: search profile(s) against a sequence database # HMMER 3.3.1 (Jul 2020); http://hmmer.org/ # Copyright (C) 2020 Howard Hughes Medical Institute. # Freely distributed under the BSD open source license. # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - # query HMM file: ../tmp/path.carbon/TIGR00178.hmm # target sequence database: /tmp/gapView.14741.genome.faa # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Query: TIGR00178 [M=744] Accession: TIGR00178 Description: monomer_idh: isocitrate dehydrogenase, NADP-dependent Scores for complete sequences (score includes all domains): --- full sequence --- --- best 1 domain --- -#dom- E-value score bias E-value score bias exp N Sequence Description ------- ------ ----- ------- ------ ----- ---- -- -------- ----------- 0 1306.8 0.1 0 1306.6 0.1 1.0 1 lcl|FitnessBrowser__Marino:GFF3859 HP15_3800 isocitrate dehydrogena Domain annotation for each sequence (and alignments): >> lcl|FitnessBrowser__Marino:GFF3859 HP15_3800 isocitrate dehydrogenase, NADP-dependent # score bias c-Evalue i-Evalue hmmfrom hmm to alifrom ali to envfrom env to acc --- ------ ----- --------- --------- ------- ------- ------- ------- ------- ------- ---- 1 ! 1306.6 0.1 0 0 1 741 [. 1 741 [. 1 744 [. 
1.00 Alignments for each domain: == domain 1 score: 1306.6 bits; conditional E-value: 0 TIGR00178 1 mstekakiiytltdeapllatysllpivkafaasaGievetrdislagrilaefpeylteeqkvddalaelGela 75 m+++kaki+ytltdeap+lat sllpi++++a++aGie et+disla+rila+fp+yl e+q+v+dalaelGe + lcl|FitnessBrowser__Marino:GFF3859 1 MTSSKAKIVYTLTDEAPALATRSLLPILETYAKPAGIEFETSDISLAARILANFPDYLEEDQRVPDALAELGEYT 75 67889********************************************************************** PP TIGR00178 76 ktpeaniiklpnisasvpqlkaaikelqdkGydlpdypeepktdeekdikaryakikGsavnpvlreGnsdrrap 150 k p+aniiklpnisas+pql+aaikel+++Gy++p+y e+p++deek+i+ ryak++GsavnpvlreGnsdrrap lcl|FitnessBrowser__Marino:GFF3859 76 KDPDANIIKLPNISASIPQLRAAIKELNEQGYNVPEYKENPENDEEKEIQSRYAKVLGSAVNPVLREGNSDRRAP 150 *************************************************************************** PP TIGR00178 151 lavkeyarkhphkmGewsadskshvahmdagdfyaseksvlldaaeevkieliakdGketvlkaklklldgevid 225 +avk +ark+ph+mGews +s++hvahm+ gdfy+se+sv+ld+a++ +i + k+Gk+tvlk++l+l++gev+d lcl|FitnessBrowser__Marino:GFF3859 151 TAVKAFARKYPHSMGEWSPASRTHVAHMRGGDFYSSEQSVTLDKATKANIVFENKQGKQTVLKSDLPLQEGEVLD 225 *************************************************************************** PP TIGR00178 226 ssvlskkalvefleeeiedakeegvllslhlkatmmkvsdpivfGhvvrvfykdvfakhaelleqlGldvenGla 300 ++++skkalv+f+e+ i+d ++gv++slh+katmmk+s+pivfGh+v+vfykd f k++el++++G++ +nGl+ lcl|FitnessBrowser__Marino:GFF3859 226 GMFMSKKALVKFFEDAIADCENTGVMFSLHVKATMMKISHPIVFGHAVKVFYKDLFDKYGELFDEIGVNPNNGLS 300 *************************************************************************** PP TIGR00178 301 dlyakieslpaakkeeieadlekvyeerpelamvdsdkGitnlhvpsdvivdasmpamirasGkmygkdgklkdt 375 + ki++lp++k+e+i++dl+++ye+rpe+amvds kGitnlhvpsdvivdasmpamir+sGkm+++d+klkdt lcl|FitnessBrowser__Marino:GFF3859 301 SVVEKIKQLPESKQEQIQEDLHACYEHRPEIAMVDSVKGITNLHVPSDVIVDASMPAMIRNSGKMWARDNKLKDT 375 *************************************************************************** PP TIGR00178 376 
kavipdssyagvyqaviedckknGafdpttmGtvpnvGlmaqkaeeyGshdktfeieadGvvrvvdssGevllee 450 kav+p+s+ya +yq+vi++ck++GafdpttmGtvpnvGlmaqkaeeyGshdktfei++dGvvrvv ++G vl e+ lcl|FitnessBrowser__Marino:GFF3859 376 KAVMPESTYATIYQEVINFCKTHGAFDPTTMGTVPNVGLMAQKAEEYGSHDKTFEIKEDGVVRVVAEDGTVLTEH 450 *************************************************************************** PP TIGR00178 451 eveagdiwrmcqvkdapiqdwvklavtrarlsgtpavfwldperahdeelikkvekylkdhdteGldiqilspvk 525 +ve+gdiwr+cq kd pi+dwvklav+rar++g+pavfwld+erahd++li+kv++ylkdhdteGldi+i+spv+ lcl|FitnessBrowser__Marino:GFF3859 451 NVEKGDIWRACQTKDLPIRDWVKLAVNRARATGMPAVFWLDDERAHDAQLIQKVNTYLKDHDTEGLDIRIMSPVR 525 *************************************************************************** PP TIGR00178 526 atrfslerirrGedtisvtGnvlrdyltdlfpilelGtsakmlsvvplmaGGGlfetGaGGsapkhvqqleeenh 600 a+r+++er+ rG dtisvtGnvlrdyltdlfpilelGtsakmls+vpl++GGGl+etGaGGsapkhvqql +enh lcl|FitnessBrowser__Marino:GFF3859 526 AIRWTMERLIRGLDTISVTGNVLRDYLTDLFPILELGTSAKMLSIVPLLNGGGLYETGAGGSAPKHVQQLIQENH 600 *************************************************************************** PP TIGR00178 601 lrwdslGeflalaaslehvavktgnekakvladtldaatgklldeekspsrkvGeldnrgskfylakywaqelaa 675 lrwdslGefla a sl++++ k++ne+a++l++tld+at++ll++++spsr +Geldnrgs+f+la+ywa+ela lcl|FitnessBrowser__Marino:GFF3859 601 LRWDSLGEFLATAVSLDELGEKQNNERARLLGQTLDKATERLLENNQSPSRVTGELDNRGSHFHLARYWAEELAN 675 *************************************************************************** PP TIGR00178 676 qtedkelaasfasvaealtkneekivaelaavqGeavdlgGyyapdtdlttkvlrpsatfnailea 741 q dkel++ f+ ++ l++n++ki +e++ vqG++ d+gGyy+p +++ +v++psat+n ile lcl|FitnessBrowser__Marino:GFF3859 676 QDSDKELKEFFTKLSAQLEENKDKILEEMTVVQGNPADIGGYYHPPMEKVCEVMQPSATLNRILEE 741 ***************************************************************985 PP Internal pipeline statistics summary: ------------------------------------- Query model(s): 1 (744 nodes) Target sequences: 1 (747 residues searched) Passed MSV filter: 1 (1); expected 0.0 (0.02) Passed bias 
filter: 1 (1); expected 0.0 (0.02) Passed Vit filter: 1 (1); expected 0.0 (0.001) Passed Fwd filter: 1 (1); expected 0.0 (1e-05) Initial search space (Z): 1 [actual number of targets] Domain search space (domZ): 1 [number of targets reported over threshold] # CPU time: 0.05u 0.01s 00:00:00.06 Elapsed: 00:00:00.06 # Mc/sec: 8.06 // [ok]
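As a rough illustration of how the domain table above can be read, the sketch below computes how much of the HMM and of the candidate protein the single domain hit covers. The helper name is hypothetical, and the coordinates are copied by hand from the hmmsearch output (hmmfrom 1, hmm to 741, model length M=744; alifrom 1, ali to 741, target length 747).

```python
def coverage(start, end, total_length):
    """Fraction of a sequence or model spanned by an inclusive 1-based range."""
    return (end - start + 1) / total_length

# Coordinates taken from the domain table above.
model_coverage = coverage(1, 741, 744)    # hmmfrom..hmm to over M=744
target_coverage = coverage(1, 741, 747)   # alifrom..ali to over 747 residues

print(f"HMM coverage:    {model_coverage:.1%}")
print(f"target coverage: {target_coverage:.1%}")
```

Both fractions are above 99%, meaning the hit spans essentially the full length of both the model and the candidate.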
This GapMind analysis is from Sep 17 2021. The underlying query database was built on Sep 17 2021.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other blast hits with at least 50% coverage are "low confidence."
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
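For readers unfamiliar with the term, a six-frame translation translates a DNA sequence in all three reading frames on both strands, so that proteins missed by gene prediction can still be found. The following is a minimal sketch of the idea, not the implementation used by Curated BLAST; the function names are hypothetical, it uses the standard codon assignments, and it marks any codon containing a non-ACGT base as "X".

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]

def translate(dna):
    """Translate complete codons from position 0; '*' marks a stop codon."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")
        for i in range(0, len(dna) - 2, 3)
    )

def six_frame_translation(dna):
    """Map frame labels (+1, +2, +3, -1, -2, -3) to translated sequences."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames

for frame, protein in six_frame_translation("ATGGCCTAA").items():
    print(frame, protein)
```

Searching these six translations with a protein query is what allows a hit to be found even when the corresponding gene was not called in the genome annotation.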
For more information, see:
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory