Align isocitrate dehydrogenase (EC 1.1.1.42) (characterized)
to candidate WP_013460323.1 SULKU_RS07370 NADP-dependent isocitrate dehydrogenase
Query= metacyc::MONOMER-11847 (741 letters)

>NCBI__GCF_000183725.1:WP_013460323.1
          Length = 731

 Score = 889 bits (2296), Expect = 0.0
 Identities = 458/735 (62%), Positives = 550/735 (74%), Gaps = 7/735 (0%)

Query: 7   IIYTKIDEAPALATYSLLPIIQAFTRGTGVDVETRDISLAGRIIANFPENLTEEQRIPDY 66
           II++ IDEAPALATYSLLPI+ AFT+  GV+V T DISLAGR+IA FPE +T+ QRIPD
Sbjct: 4   IIWSVIDEAPALATYSLLPIVNAFTKAAGVEVVTSDISLAGRVIATFPERMTDAQRIPDN 63

Query: 67  LAQLGELALTPEANIIKLPNISASIPQLKAAIKELQEHGYNVPNYPEAPSNDEEKAIQAR 126
           LAQLG +A  P+  IIKLPNISASIPQLKA I+ELQ  GY++PNYPE P NDEEK IQAR
Sbjct: 64  LAQLGVVATQPDGVIIKLPNISASIPQLKACIEELQSQGYDLPNYPEEPKNDEEKEIQAR 123

Query: 127 YAKVLGSAVNPVLREGNSDRRAPLSVKAYAQKHPHRMAAWSKDSKAHVSHMNEGDFYGSE 186
           Y+  LGSAVNPVLREGNSDRRA ++VK +A+K+PH++  W + SK  V+HMN GDFY +E
Sbjct: 124 YSACLGSAVNPVLREGNSDRRAAVAVKNFAKKNPHKLRGWDEGSKTRVAHMNSGDFYANE 183

Query: 187 QSVTVPAATTVRIEYVNGANEVTVLKEKTALLAGEVIDTSVMNVRKLRDFYAEQIEDAKS 246
           QS+       V I  +NG      LKE  A    EV+D + M+ + LR F AE +++AK
Sbjct: 184 QSIIKNGDGKVTIA-LNGKT----LKEIDAK-DKEVLDGTFMSAKALRSFLAESMDEAKE 237

Query: 247 QGVLLSLHLKATMMKISDPIMFGHAVSVFYKDVFDKHGALLAELGVNVNNGLGDLYAKIQ 306
           +G+L S+HLKATMMK+SDPIMFGHAVSVF+KDV++K       +GVN N GLGDL  K+Q
Sbjct: 238 KGILWSIHLKATMMKVSDPIMFGHAVSVFFKDVYEKFADDFKSVGVNPNLGLGDLEKKMQ 297

Query: 307 TLPEDKRAEIEADIMAVYKTRPELAMVDSDKGITNLHVPNDIIIDASMPVVVRDGGKMWG 366
            L  +K+AEI A + AVY +R  +AMVDSDKGITNLH  NDIIIDAS+PVVVRDGGKMWG
Sbjct: 298 KLSAEKQAEINAAMQAVYASRAPMAMVDSDKGITNLHFSNDIIIDASLPVVVRDGGKMWG 357

Query: 367 PDGQLHDCKAVIPDRCYATMYGEIVDDCRKNGAFDPSTIGSVPNVGLMAQKAEEYGSHDK 426
           PDG+++ C   IPDRCYATMY EI++DC+ NG FD +T+GSV NVGLMAQKAEEYGSH
Sbjct: 358 PDGKVNQCIVTIPDRCYATMYSEIIEDCKINGQFDVTTMGSVSNVGLMAQKAEEYGSHPT 417

Query: 427 TFTAAGDGVIRVVDADGTVLMSQKVETGDIFRMCQAKDAPIRDWVGLAVRRAKATGAPAV 486
           TF    DGV+ V DA G  LMS  VE GDI+RM + KD P++DWV LAV R++    P V
Sbjct: 418 TFEMNEDGVVTVSDASGA-LMSFNVEKGDIWRMSRTKDIPVKDWVRLAVERSEIENVPVV 476

Query: 487 FWLDSNRAHDAQIIAKVNEYLKDLDTDGVEIKIMPPVEAMRFTLGRFRAGQDTISVTGNV 546
           FWLD NRAHDA II KVN YL + +   VE  IM P +AM F+L R RAGQ+TIS TGNV
Sbjct: 477 FWLDENRAHDANIIKKVNAYLPEFNKSNVEYHIMAPEKAMAFSLKRVRAGQNTISATGNV 536

Query: 547 LRDYLTDLFPIIELGTSAKMLSIVPLLNGGGLFETGAGGSAPKHVQQFQKEGYLRWDSLG 606
           LRDYLTDLFPI+ELGTSAKMLSIVPLL GGGLFETGAGGSAPKHV QF  EG+LRWDSLG
Sbjct: 537 LRDYLTDLFPILELGTSAKMLSIVPLLAGGGLFETGAGGSAPKHVDQFVNEGHLRWDSLG 596

Query: 607 EFSALAASLEHLAQTFGNPKAQVLADTLDQAIGKFLDNQKSPARKVGQIDNRGSHFYLAL 666
           EF ALA ++ H+ +T  + K   L   LD+A   +LDN K P+RK G+ DN+ SHF+LA
Sbjct: 597 EFLALAEAMRHINKTEKSNKLSALTAALDEANAAYLDNNKEPSRKAGEPDNKASHFFLAQ 656

Query: 667 YWAEALAAQDSDAEMKARFAGVASSLAAKEELINAELIAAQGSPVDMGGYYQPDDEKTAA 726
           +WA+ALA Q  DAE+ ARFA VA +L+  E  I  EL++ +G   D+GGY+ PD  KT
Sbjct: 657 FWAKALANQTVDAELAARFAPVAKALSENEAKIMEELLSVEGKAQDIGGYFHPDFAKTEK 716

Query: 727 AMRPSGTLNAIIDAM 741
           AMRPS TLNAII A+
Sbjct: 717 AMRPSATLNAIIAAI 731

Lambda     K      H
   0.316    0.133    0.380

Gapped
Lambda     K      H
   0.267   0.0410    0.140

Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 1374
Number of extensions: 59
Number of successful extensions: 3
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 741
Length of database: 731
Length adjustment: 40
Effective length of query: 701
Effective length of database: 691
Effective search space: 484391
Effective search space used: 484391
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.6 bits)
S2: 55 (25.8 bits)
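The summary numbers in this alignment can be cross-checked with a little arithmetic. The sketch below assumes the standard Karlin-Altschul relations (bit score = (lambda * S - ln K) / ln 2, and E = effective search space * 2^(-bit score)) and uses the gapped Lambda and K values printed in the report. The function names are illustrative and are not part of BLAST or GapMind. The reported percentages appear to be truncated rather than rounded, so the sketch uses integer division.

```python
import math

def bit_score(raw_score, lam, k):
    """Convert a raw alignment score to bits (Karlin-Altschul normalization)."""
    return (lam * raw_score - math.log(k)) / math.log(2)

def effective_search_space(query_len, db_len, length_adjustment):
    """Product of the effective query and database lengths."""
    return (query_len - length_adjustment) * (db_len - length_adjustment)

def expect_value(search_space, bits):
    """Expected number of chance hits scoring at least `bits`."""
    return search_space * 2.0 ** (-bits)

# Values copied from the report: raw score 2296, gapped Lambda 0.267, gapped K 0.0410.
bits = bit_score(2296, 0.267, 0.0410)
space = effective_search_space(741, 731, 40)
evalue = expect_value(space, bits)

# Percentages over the 735 alignment columns, truncated to whole percent.
pct_identity = 100 * 458 // 735
pct_positive = 100 * 550 // 735
pct_gaps = 100 * 7 // 735

print(f"bits={bits:.1f} space={space} E={evalue:.1e}")
print(f"identity={pct_identity}% positives={pct_positive}% gaps={pct_gaps}%")
```

The recomputed bit score and search space agree with the 889 bits and 484391 shown in the report, and the E-value is small enough that BLAST prints it as 0.0.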
Align candidate WP_013460323.1 SULKU_RS07370 (NADP-dependent isocitrate dehydrogenase)
to HMM TIGR00178 (isocitrate dehydrogenase, NADP-dependent (EC 1.1.1.42))
# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR00178.hmm
# target sequence database:        /tmp/gapView.9645.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR00178  [M=744]
Accession:   TIGR00178
Description: monomer_idh: isocitrate dehydrogenase, NADP-dependent
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                                  Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                                  -----------
          0 1113.2   4.3          0 1113.0   4.3    1.0  1  lcl|NCBI__GCF_000183725.1:WP_013460323.1  SULKU_RS07370 NADP-dependent iso

Domain annotation for each sequence (and alignments):
>> lcl|NCBI__GCF_000183725.1:WP_013460323.1  SULKU_RS07370 NADP-dependent isocitrate dehydrogenase
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 ! 1113.0   4.3         0         0       5     742 ..       1     731 []       1     731 [] 0.99

Alignments for each domain:
== domain 1  score: 1113.0 bits;  conditional E-value: 0

TIGR00178 5 kakiiytltdeapllatysllpivkafaasaGievetrdislagrilaefpeylteeqkvddalaelGe 73
+akii+++ deap+latysllpiv+af+++aG+ev t+dislagr++a+fpe++t+ q+++d+la+lG
lcl|NCBI__GCF_000183725.1:WP_013460323.1 1 MAKIIWSVIDEAPALATYSLLPIVNAFTKAAGVEVVTSDISLAGRVIATFPERMTDAQRIPDNLAQLGV 69
58******************************************************************* PP

TIGR00178 74 laktpeaniiklpnisasvpqlkaaikelqdkGydlpdypeepktdeekdikaryakikGsavnpvlre 142
+a+ p+ iiklpnisas+pqlka+i elq++Gydlp+ypeepk+deek+i+ary++++Gsavnpvlre
lcl|NCBI__GCF_000183725.1:WP_013460323.1 70 VATQPDGVIIKLPNISASIPQLKACIEELQSQGYDLPNYPEEPKNDEEKEIQARYSACLGSAVNPVLRE 138
********************************************************************* PP

TIGR00178 143 GnsdrraplavkeyarkhphkmGewsadskshvahmdagdfyaseksvlldaaeevkieliakdGketv 211
Gnsdrra +avk++a+k+phk+ w sk++vahm++gdfya+e+s++ ++ +v i a +Gk+
lcl|NCBI__GCF_000183725.1:WP_013460323.1 139 GNSDRRAAVAVKNFAKKNPHKLRGWDEGSKTRVAHMNSGDFYANEQSIIKNGDGKVTI---ALNGKT-- 202
******************************************************9986...678886.. PP

TIGR00178 212 lkaklklldgevidssvlskkalvefleeeiedakeegvllslhlkatmmkvsdpivfGhvvrvfykdv 280
l ++++ +d+ev+d++++s+kal+ fl+e +++ake+g+l s+hlkatmmkvsdpi+fGh+v+vf+kdv
lcl|NCBI__GCF_000183725.1:WP_013460323.1 203 L-KEIDAKDKEVLDGTFMSAKALRSFLAESMDEAKEKGILWSIHLKATMMKVSDPIMFGHAVSVFFKDV 270
4.47999************************************************************** PP

TIGR00178 281 fakhaelleqlGldvenGladlyakieslpaakkeeieadlekvyeerpelamvdsdkGitnlhvpsdv 349
+ k+a+ ++++G++ + Gl+dl k+++l a k+ ei+a++++vy++r +amvdsdkGitnlh d+
lcl|NCBI__GCF_000183725.1:WP_013460323.1 271 YEKFADDFKSVGVNPNLGLGDLEKKMQKLSAEKQAEINAAMQAVYASRAPMAMVDSDKGITNLHFSNDI 339
********************************************************************* PP

TIGR00178 350 ivdasmpamirasGkmygkdgklkdtkavipdssyagvyqaviedckknGafdpttmGtvpnvGlmaqk 418
i+das+p ++r++Gkm+g+dgk ++ + ipd++ya +y+++iedck nG+fd ttmG+v nvGlmaqk
lcl|NCBI__GCF_000183725.1:WP_013460323.1 340 IIDASLPVVVRDGGKMWGPDGKVNQCIVTIPDRCYATMYSEIIEDCKINGQFDVTTMGSVSNVGLMAQK 408
********************************************************************* PP

TIGR00178 419 aeeyGshdktfeieadGvvrvvdssGevlleeeveagdiwrmcqvkdapiqdwvklavtrarlsgtpav 487
aeeyGsh tfe+++dGvv v d+sG l+ +ve+gdiwrm + kd p++dwv+lav r+ + ++p+v
lcl|NCBI__GCF_000183725.1:WP_013460323.1 409 AEEYGSHPTTFEMNEDGVVTVSDASGA-LMSFNVEKGDIWRMSRTKDIPVKDWVRLAVERSEIENVPVV 476
************************996.7899************************************* PP

TIGR00178 488 fwldperahdeelikkvekylkdhdteGldiqilspvkatrfslerirrGedtisvtGnvlrdyltdlf 556
fwld++rahd+++ikkv+ yl +++ + ++ +i+ p ka+ fsl+r+r G++tis+tGnvlrdyltdlf
lcl|NCBI__GCF_000183725.1:WP_013460323.1 477 FWLDENRAHDANIIKKVNAYLPEFNKSNVEYHIMAPEKAMAFSLKRVRAGQNTISATGNVLRDYLTDLF 545
********************************************************************* PP

TIGR00178 557 pilelGtsakmlsvvplmaGGGlfetGaGGsapkhvqqleeenhlrwdslGeflalaaslehvavktgn 625
pilelGtsakmls+vpl+aGGGlfetGaGGsapkhv+q+++e+hlrwdslGeflala++++h+ +++
lcl|NCBI__GCF_000183725.1:WP_013460323.1 546 PILELGTSAKMLSIVPLLAGGGLFETGAGGSAPKHVDQFVNEGHLRWDSLGEFLALAEAMRHINKTEKS 614
********************************************************************* PP

TIGR00178 626 ekakvladtldaatgklldeekspsrkvGeldnrgskfylakywaqelaaqtedkelaasfasvaealt 694
+k l+ +ld+a ld++k psrk Ge dn+ s+f+la++wa++la qt d+elaa+fa+va+al+
lcl|NCBI__GCF_000183725.1:WP_013460323.1 615 NKLSALTAALDEANAAYLDNNKEPSRKAGEPDNKASHFFLAQFWAKALANQTVDAELAARFAPVAKALS 683
********************************************************************* PP

TIGR00178 695 kneekivaelaavqGeavdlgGyyapdtdlttkvlrpsatfnaileal 742
+ne+ki++el +v+G+a d+gGy++pd +t+k++rpsat+nai++a+
lcl|NCBI__GCF_000183725.1:WP_013460323.1 684 ENEAKIMEELLSVEGKAQDIGGYFHPDFAKTEKAMRPSATLNAIIAAI 731
*********************************************985 PP

Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (744 nodes)
Target sequences:                          1  (731 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.03u 0.02s 00:00:00.05 Elapsed: 00:00:00.04
# Mc/sec: 12.92
//
[ok]
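How much of the model and of the candidate the hit spans can be read from the domain table: hmmfrom and hmm to against the model length M, and alifrom and ali to against the sequence length. A minimal sketch of that calculation follows; the helper name `coverage` is hypothetical and is not a HMMER or GapMind function.

```python
def coverage(start, end, total_length):
    """Fraction of a model or sequence spanned by an inclusive, 1-based range."""
    return (end - start + 1) / total_length

# Values copied from the domain table: hmmfrom=5, hmm to=742, M=744;
# alifrom=1, ali to=731, candidate length=731.
model_coverage = coverage(5, 742, 744)
candidate_coverage = coverage(1, 731, 731)

print(f"model coverage: {model_coverage:.1%}")
print(f"candidate coverage: {candidate_coverage:.1%}")
```

For this hit, 738 of the 744 model positions are aligned and the alignment spans the entire candidate.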
This GapMind analysis is from Sep 24 2021. The underlying query database was built on Sep 17 2021.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other blast hits with at least 50% coverage are "low confidence."
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
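For readers unfamiliar with the term, a six-frame translation reads the DNA in three offsets on each strand, so a gene that the protein predictions missed can still be found. The sketch below is a generic illustration only, not the code used by GapMind or Curated BLAST. It assumes the standard genetic code, and all function names are hypothetical. Codons containing characters other than A, C, G, or T are translated as X.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]

def translate(dna):
    """Translate complete codons from position 0; '*' marks a stop codon."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )

def six_frame_translation(dna):
    """Map frame labels (+1..+3, -1..-3) to their translations."""
    dna = dna.upper()
    rc = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames

frames = six_frame_translation("ATGGCCTAA")
for label, protein in sorted(frames.items()):
    print(label, protein)
```

A search tool would then split each frame at the stop codons and compare the resulting peptides with the query protein.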
For more information, see:
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory