Align isocitrate dehydrogenase (EC 1.1.1.42) (characterized)
to candidate WP_099019356.1 CCS90_RS09825 NADP-dependent isocitrate dehydrogenase
Query= metacyc::MONOMER-11847 (741 letters) >NCBI__GCF_002591915.1:WP_099019356.1 Length = 740 Score = 952 bits (2462), Expect = 0.0 Identities = 486/739 (65%), Positives = 573/739 (77%) Query: 3 SKSTIIYTKIDEAPALATYSLLPIIQAFTRGTGVDVETRDISLAGRIIANFPENLTEEQR 62 +K TIIYT DEAPALATYS LPI+Q F V+VET+DISLA RI+A FPE L Q+ Sbjct: 2 TKQTIIYTITDEAPALATYSFLPIVQKFAGQADVEVETKDISLASRILAQFPERLKAGQK 61 Query: 63 IPDYLAQLGELALTPEANIIKLPNISASIPQLKAAIKELQEHGYNVPNYPEAPSNDEEKA 122 D L QLG LA TPEANIIKLPNISAS+PQL AAIKELQ GY++P+ PE P +EEKA Sbjct: 62 CEDALGQLGVLAKTPEANIIKLPNISASVPQLNAAIKELQAAGYDLPDCPEDPETEEEKA 121 Query: 123 IQARYAKVLGSAVNPVLREGNSDRRAPLSVKAYAQKHPHRMAAWSKDSKAHVSHMNEGDF 182 I A+YA VLGSAVNPVLREGNSDRR +VKAYA+K+PHR+ A +K+SK HV+HM+ DF Sbjct: 122 IMAKYANVLGSAVNPVLREGNSDRRVAAAVKAYARKNPHRLGAITKNSKTHVAHMDANDF 181 Query: 183 YGSEQSVTVPAATTVRIEYVNGANEVTVLKEKTALLAGEVIDTSVMNVRKLRDFYAEQIE 242 YGSEQS +PAA VRIE+V +VTV+K AL AGEVID+S M+VR LR F+A++I+ Sbjct: 182 YGSEQSYVMPAADEVRIEHVGTDGQVTVMKSAVALQAGEVIDSSCMSVRALRQFFAKEIK 241 Query: 243 DAKSQGVLLSLHLKATMMKISDPIMFGHAVSVFYKDVFDKHGALLAELGVNVNNGLGDLY 302 DAKS GVLLSLHLKATMMK+SDPIMFGH+V V+Y +VF KH A ++GV VNNG+G +Y Sbjct: 242 DAKSSGVLLSLHLKATMMKVSDPIMFGHSVEVYYAEVFKKHAATFEKIGVEVNNGIGSVY 301 Query: 303 AKIQTLPEDKRAEIEADIMAVYKTRPELAMVDSDKGITNLHVPNDIIIDASMPVVVRDGG 362 KI LP K+AEI+ADI A Y +P LAMVDSDKGITNLHVPND+IIDASMP ++ Sbjct: 302 NKITELPAAKQAEIKADIAACYANQPALAMVDSDKGITNLHVPNDVIIDASMPAAIKSSA 361 Query: 363 KMWGPDGQLHDCKAVIPDRCYATMYGEIVDDCRKNGAFDPSTIGSVPNVGLMAQKAEEYG 422 KMW +G+L D KA+IPDR YA +Y EI+ C+++GAFD T+G+V NVGLMAQKAEEYG Sbjct: 362 KMWNAEGKLQDTKAMIPDRSYAGIYQEIIAYCKEHGAFDVVTMGNVCNVGLMAQKAEEYG 421 Query: 423 SHDKTFTAAGDGVIRVVDADGTVLMSQKVETGDIFRMCQAKDAPIRDWVGLAVRRAKATG 482 SHDKTF A DG++RVV++ G +LM +VE GDI+R CQ KD PIRDWV LAV RAK TG Sbjct: 422 SHDKTFEMASDGMMRVVNSSGEILMQHEVEQGDIWRACQTKDLPIRDWVKLAVTRAKNTG 481 Query: 483 APAVFWLDSNRAHDAQIIAKVNEYLKDLDTDGVEIKIMPPVEAMRFTLGRFRAGQDTISV 542 + AVFWLD NRAHD +I KV YL+D DT G+EIKIM P AMR+T R AG DTISV Sbjct: 482 
SAAVFWLDENRAHDRNLIEKVELYLQDHDTTGLEIKIMNPKAAMRYTCERVTAGLDTISV 541 Query: 543 TGNVLRDYLTDLFPIIELGTSAKMLSIVPLLNGGGLFETGAGGSAPKHVQQFQKEGYLRW 602 TGNVLRDYLTDLFPI+ELGTSAKMLSIVPLL GGGLFETGAGGSAPKHVQQF EG+LRW Sbjct: 542 TGNVLRDYLTDLFPILELGTSAKMLSIVPLLAGGGLFETGAGGSAPKHVQQFVTEGHLRW 601 Query: 603 DSLGEFSALAASLEHLAQTFGNPKAQVLADTLDQAIGKFLDNQKSPARKVGQIDNRGSHF 662 DSLGEF ALA SLE +A+ N +A+VLA LD A ++L KSP+RKV ++DNRGS + Sbjct: 602 DSLGEFLALAVSLEDIAEKTNNQQAKVLAAALDDANERYLSFNKSPSRKVNELDNRGSQY 661 Query: 663 YLALYWAEALAAQDSDAEMKARFAGVASSLAAKEELINAELIAAQGSPVDMGGYYQPDDE 722 YL YWA+AL+AQD DA +KA F V +L E I ELI AQGSPVD+GGY+QP+D Sbjct: 662 YLTKYWAKALSAQDEDAALKATFTQVFEALKNNEAQILQELIDAQGSPVDIGGYFQPNDA 721 Query: 723 KTAAAMRPSGTLNAIIDAM 741 AAMRPS T N IID++ Sbjct: 722 LAKAAMRPSSTFNQIIDSI 740 Lambda K H 0.316 0.133 0.380 Gapped Lambda K H 0.267 0.0410 0.140 Matrix: BLOSUM62 Gap Penalties: Existence: 11, Extension: 1 Number of Sequences: 1 Number of Hits to DB: 1396 Number of extensions: 53 Number of successful extensions: 1 Number of sequences better than 1.0e-02: 1 Number of HSP's gapped: 1 Number of HSP's successfully gapped: 1 Length of query: 741 Length of database: 740 Length adjustment: 40 Effective length of query: 701 Effective length of database: 700 Effective search space: 490700 Effective search space used: 490700 Neighboring words threshold: 11 Window for multiple hits: 40 X1: 16 ( 7.3 bits) X2: 38 (14.6 bits) X3: 64 (24.7 bits) S1: 41 (21.6 bits) S2: 55 (25.8 bits)
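The summary figures in the BLAST report above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch that uses only numbers printed in the report; the function names are illustrative and are not part of BLAST or GapMind. The bit-score conversion is the standard Karlin-Altschul formula, so with the rounded gapped Lambda and K shown in the report it should land close to, but not necessarily exactly on, the reported 952 bits.

```python
import math

def effective_search_space(query_len, db_len, length_adjustment):
    """Product of the effective (length-adjusted) query and database lengths."""
    return (query_len - length_adjustment) * (db_len - length_adjustment)

def bit_score(raw_score, lam, k):
    """Karlin-Altschul conversion of a raw alignment score to bits."""
    return (lam * raw_score - math.log(k)) / math.log(2)

def percent(matches, aligned_columns):
    """Integer percentage, truncated (486/739 is about 65.8%, printed as 65%)."""
    return (100 * matches) // aligned_columns

# Values copied from the report: query 741, database 740, adjustment 40.
space = effective_search_space(741, 740, 40)
# Raw score 2462 with the gapped Lambda and K from the report footer.
bits = bit_score(2462, 0.267, 0.0410)
identity = percent(486, 739)
positives = percent(573, 739)
```

Here `space` matches the "Effective search space" line of the report, and `identity` and `positives` match the percentages shown next to "Identities" and "Positives".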
Align candidate WP_099019356.1 CCS90_RS09825 (NADP-dependent isocitrate dehydrogenase)
to HMM TIGR00178 (isocitrate dehydrogenase, NADP-dependent (EC 1.1.1.42))
# hmmsearch :: search profile(s) against a sequence database # HMMER 3.3.1 (Jul 2020); http://hmmer.org/ # Copyright (C) 2020 Howard Hughes Medical Institute. # Freely distributed under the BSD open source license. # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - # query HMM file: ../tmp/path.carbon/TIGR00178.hmm # target sequence database: /tmp/gapView.3347674.genome.faa # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Query: TIGR00178 [M=744] Accession: TIGR00178 Description: monomer_idh: isocitrate dehydrogenase, NADP-dependent Scores for complete sequences (score includes all domains): --- full sequence --- --- best 1 domain --- -#dom- E-value score bias E-value score bias exp N Sequence Description ------- ------ ----- ------- ------ ----- ---- -- -------- ----------- 0 1248.2 8.2 0 1248.1 8.2 1.0 1 NCBI__GCF_002591915.1:WP_099019356.1 Domain annotation for each sequence (and alignments): >> NCBI__GCF_002591915.1:WP_099019356.1 # score bias c-Evalue i-Evalue hmmfrom hmm to alifrom ali to envfrom env to acc --- ------ ----- --------- --------- ------- ------- ------- ------- ------- ------- ---- 1 ! 1248.1 8.2 0 0 5 741 .. 3 739 .. 
1 740 [] 1.00 Alignments for each domain: == domain 1 score: 1248.1 bits; conditional E-value: 0 TIGR00178 5 kakiiytltdeapllatysllpivkafaasaGievetrdislagrilaefpeylteeqkvddalaelGelakt 77 k++iiyt+tdeap+latys+lpiv+ fa++a +evet+disla+rila+fpe+l qk +dal +lG lakt NCBI__GCF_002591915.1:WP_099019356.1 3 KQTIIYTITDEAPALATYSFLPIVQKFAGQADVEVETKDISLASRILAQFPERLKAGQKCEDALGQLGVLAKT 75 579********************************************************************** PP TIGR00178 78 peaniiklpnisasvpqlkaaikelqdkGydlpdypeepktdeekdikaryakikGsavnpvlreGnsdrrap 150 peaniiklpnisasvpql+aaikelq+ Gydlpd pe+p+t+eek+i a+ya+++GsavnpvlreGnsdrr NCBI__GCF_002591915.1:WP_099019356.1 76 PEANIIKLPNISASVPQLNAAIKELQAAGYDLPDCPEDPETEEEKAIMAKYANVLGSAVNPVLREGNSDRRVA 148 ************************************************************************* PP TIGR00178 151 lavkeyarkhphkmGewsadskshvahmdagdfyaseksvlldaaeevkieliakdGketvlkaklklldgev 223 +avk yark+ph++G+ +++sk+hvahmda+dfy+se+s ++ aa+ev+ie++ dG++tv+k+ + l++gev NCBI__GCF_002591915.1:WP_099019356.1 149 AAVKAYARKNPHRLGAITKNSKTHVAHMDANDFYGSEQSYVMPAADEVRIEHVGTDGQVTVMKSAVALQAGEV 221 ************************************************************************* PP TIGR00178 224 idssvlskkalvefleeeiedakeegvllslhlkatmmkvsdpivfGhvvrvfykdvfakhaelleqlGldve 296 idss +s +al++f+++ei+dak gvllslhlkatmmkvsdpi+fGh v v+y++vf kha+++e++G++v+ NCBI__GCF_002591915.1:WP_099019356.1 222 IDSSCMSVRALRQFFAKEIKDAKSSGVLLSLHLKATMMKVSDPIMFGHSVEVYYAEVFKKHAATFEKIGVEVN 294 ************************************************************************* PP TIGR00178 297 nGladlyakieslpaakkeeieadlekvyeerpelamvdsdkGitnlhvpsdvivdasmpamirasGkmygkd 369 nG++ +y ki +lpaak+ ei+ad+ ++y+++p lamvdsdkGitnlhvp dvi+dasmpa+i++s km++++ NCBI__GCF_002591915.1:WP_099019356.1 295 NGIGSVYNKITELPAAKQAEIKADIAACYANQPALAMVDSDKGITNLHVPNDVIIDASMPAAIKSSAKMWNAE 367 ************************************************************************* PP TIGR00178 370 gklkdtkavipdssyagvyqaviedckknGafdpttmGtvpnvGlmaqkaeeyGshdktfeieadGvvrvvds 442 
gkl+dtka+ipd+syag+yq++i +ck++Gafd tmG v nvGlmaqkaeeyGshdktfe+ +dG++rvv+s NCBI__GCF_002591915.1:WP_099019356.1 368 GKLQDTKAMIPDRSYAGIYQEIIAYCKEHGAFDVVTMGNVCNVGLMAQKAEEYGSHDKTFEMASDGMMRVVNS 440 ************************************************************************* PP TIGR00178 443 sGevlleeeveagdiwrmcqvkdapiqdwvklavtrarlsgtpavfwldperahdeelikkvekylkdhdteG 515 sGe+l+++eve+gdiwr+cq kd pi+dwvklavtra+ +g avfwld++rahd++li+kve yl+dhdt+G NCBI__GCF_002591915.1:WP_099019356.1 441 SGEILMQHEVEQGDIWRACQTKDLPIRDWVKLAVTRAKNTGSAAVFWLDENRAHDRNLIEKVELYLQDHDTTG 513 ************************************************************************* PP TIGR00178 516 ldiqilspvkatrfslerirrGedtisvtGnvlrdyltdlfpilelGtsakmlsvvplmaGGGlfetGaGGsa 588 l+i+i++p a+r+++er++ G dtisvtGnvlrdyltdlfpilelGtsakmls+vpl+aGGGlfetGaGGsa NCBI__GCF_002591915.1:WP_099019356.1 514 LEIKIMNPKAAMRYTCERVTAGLDTISVTGNVLRDYLTDLFPILELGTSAKMLSIVPLLAGGGLFETGAGGSA 586 ************************************************************************* PP TIGR00178 589 pkhvqqleeenhlrwdslGeflalaaslehvavktgnekakvladtldaatgklldeekspsrkvGeldnrgs 661 pkhvqq++ e+hlrwdslGeflala sle++a kt+n++akvla +ld+a ++ l +kspsrkv eldnrgs NCBI__GCF_002591915.1:WP_099019356.1 587 PKHVQQFVTEGHLRWDSLGEFLALAVSLEDIAEKTNNQQAKVLAAALDDANERYLSFNKSPSRKVNELDNRGS 659 ************************************************************************* PP TIGR00178 662 kfylakywaqelaaqtedkelaasfasvaealtkneekivaelaavqGeavdlgGyyapdtdlttkvlrpsat 734 ++yl+kywa++l+aq ed+ l+a+f++v eal++ne++i +el ++qG++vd+gGy++p++ l+++++rps t NCBI__GCF_002591915.1:WP_099019356.1 660 QYYLTKYWAKALSAQDEDAALKATFTQVFEALKNNEAQILQELIDAQGSPVDIGGYFQPNDALAKAAMRPSST 732 ************************************************************************* PP TIGR00178 735 fnailea 741 fn+i+++ NCBI__GCF_002591915.1:WP_099019356.1 733 FNQIIDS 739 ****997 PP Internal pipeline statistics summary: ------------------------------------- Query model(s): 1 (744 nodes) Target sequences: 1 (740 residues searched) Passed MSV filter: 1 (1); expected 
0.0 (0.02) Passed bias filter: 1 (1); expected 0.0 (0.02) Passed Vit filter: 1 (1); expected 0.0 (0.001) Passed Fwd filter: 1 (1); expected 0.0 (1e-05) Initial search space (Z): 1 [actual number of targets] Domain search space (domZ): 1 [number of targets reported over threshold] # CPU time: 0.01u 0.01s 00:00:00.02 Elapsed: 00:00:00.01 # Mc/sec: 39.41 // [ok]
This GapMind analysis is from Sep 24 2021. The underlying query database was built on Sep 17 2021.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMER with enzyme models (usually from TIGRFAMs). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other blast hits with at least 50% coverage are "low confidence."
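The 50% coverage cutoff for low-confidence hits can be illustrated with the alignment coordinates from the BLAST report on this page. This is only a sketch: the helper names are hypothetical, and measuring coverage against the curated query sequence is an assumption made here, not something this page states.

```python
def coverage(start, end, full_length):
    """Fraction of a sequence covered by the inclusive range start..end."""
    return (end - start + 1) / full_length

def meets_low_confidence_coverage(start, end, full_length, threshold=0.5):
    """True if the aligned range covers at least `threshold` of the sequence."""
    return coverage(start, end, full_length) >= threshold

# The alignment on this page spans query positions 3..741 of 741 letters.
query_cov = coverage(3, 741, 741)
passes = meets_low_confidence_coverage(3, 741, 741)
```

For the alignment shown on this page, the aligned range covers 739 of the 741 query residues, far above the cutoff.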
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
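The definition of a gap given above (a step with no high- or medium-confidence candidate) can be sketched as a simple filter. The step names and the dictionary layout below are hypothetical and chosen only for illustration; they do not reflect GapMind's actual data format.

```python
def find_gaps(step_candidates):
    """Return the steps that have no high- or medium-confidence candidate.

    step_candidates maps a step name to a list of confidence labels
    ("high", "medium", or "low"), one label per candidate protein.
    """
    gaps = []
    for step, confidences in step_candidates.items():
        if not any(label in ("high", "medium") for label in confidences):
            gaps.append(step)
    return gaps

# Hypothetical pathway with three steps.
example = {
    "stepA": ["high", "low"],  # has a high-confidence candidate
    "stepB": ["low"],          # only a low-confidence candidate
    "stepC": [],               # no candidates at all
}
gaps = find_gaps(example)
```

In this example, `stepB` and `stepC` would be reported as gaps, because neither has a high- or medium-confidence candidate.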
For more information, see:
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory