Align isocitrate dehydrogenase (NADP+) (EC 1.1.1.42) (characterized)
to candidate WP_004323643.1 C665_RS16920 NADP-dependent isocitrate dehydrogenase
Query= BRENDA::O53611
         (745 letters)

>NCBI__GCF_000310185.1:WP_004323643.1
          Length = 746

 Score = 1166 bits (3016), Expect = 0.0
 Identities = 565/744 (75%), Positives = 648/744 (87%)

Query: 1   MSAEQPTIIYTLTDEAPLLATYAFLPIVRAFAEPAGIKIEASDISVAARILAEFPDYLTE 60
           MSA +  IIYTLTDEAPLLAT AFLPI+R F  PAG+++E +DISV+AR+LAEFP+YL+E
Sbjct: 1   MSAGKSKIIYTLTDEAPLLATCAFLPIIRTFTGPAGVEVEKADISVSARVLAEFPEYLSE 60

Query: 61  EQRVPDNLAELGRLTQLPDTNIIKLPNISASVPQLVAAIKELQDKGYAVPDYPADPKTDQ 120
           +QRVPD LAELGRLT  PDTNIIKLPNISASV QL A +KELQ+KGY +PDYP DPKTD+
Sbjct: 61  DQRVPDTLAELGRLTLEPDTNIIKLPNISASVAQLKACVKELQEKGYKIPDYPEDPKTDE 120

Query: 121 EKAIKERYARCLGSAVNPVLRQGNSDRRAPKAVKEYARKHPHSMGEWSMASRTHVAHMRH 180
           EKA++ R+ +CLGSAVNPVLR+GNSDRRAP AVK YA+KHPHSMGEW   S+THV+HM H
Sbjct: 121 EKALRVRFGKCLGSAVNPVLREGNSDRRAPAAVKNYAKKHPHSMGEWKQWSQTHVSHMHH 180

Query: 181 GDFYAGEKSMTLDRARNVRMELLAKSGKTIVLKPEVPLDDGDVIDSMFMSKKALCDFYEE 240
           GDFY GEKSMTLDRAR+V+MEL+ KSGKTIVLKP+V L +G++IDSMFMSKKALC+FYE+
Sbjct: 181 GDFYHGEKSMTLDRARDVKMELITKSGKTIVLKPKVSLLEGEIIDSMFMSKKALCEFYEK 240

Query: 241 QMQDAFETGVMFSLHVKATMMKVSHPIVFGHAVRIFYKDAFAKHQELFDDLGVNVNNGLS 300
           +++D  E G++FSLHVKATMMKVSHPIVFGH V+I+YKDAF KH +LF++LG+NVNNG+
Sbjct: 241 ELEDCREAGILFSLHVKATMMKVSHPIVFGHCVKIYYKDAFEKHGKLFEELGINVNNGMV 300

Query: 301 DLYSKIESLPASQRDEIIEDLHRCHEHRPELAMVDSARGISNFHSPSDVIVDASMPAMIR 360
           DLY KI++LP S+RDEII DLH C EHRP LAMVDSA+GI+NFHSP+D+IVDASMPAMIR
Sbjct: 301 DLYEKIKTLPESKRDEIIRDLHACQEHRPALAMVDSAKGITNFHSPNDIIVDASMPAMIR 360

Query: 361 AGGKMYGADGKLKDTKAVNPESTFSRIYQEIINFCKTNGQFDPTTMGTVPNVGLMAQQAE 420
            GGKM+GADGK  D+K V PESTF+RIYQE+INFCK +G FDP TMGTVPNVGLMAQ+AE
Sbjct: 361 QGGKMWGADGKQYDSKCVMPESTFARIYQEMINFCKWHGNFDPRTMGTVPNVGLMAQKAE 420

Query: 421 EYGSHDKTFEIPEDGVANIVDVATGEVLLTENVEAGDIWRMCIVKDAPIRDWVKLAVTRA 480
           EYGSHDKTFEI EDGVANI D+ATGEVLL++NVEAGDIWRMC VKDAPIRDWVKLAVTRA
Sbjct: 421 EYGSHDKTFEIAEDGVANITDLATGEVLLSQNVEAGDIWRMCQVKDAPIRDWVKLAVTRA 480

Query: 481 RISGMPVLFWLDPYRPHENELIKKVKTYLKDHDTEGLDIQIMSQVRSMRYTCERLVRGLD 540
           R SGMP +FWLDPYRPHENELIKKVKTYLKDHDT GLDIQIMSQVR+MR T ER+ RGLD
Sbjct: 481 RNSGMPAIFWLDPYRPHENELIKKVKTYLKDHDTSGLDIQIMSQVRAMRVTLERVARGLD 540

Query: 541 TIAATGNILRDYLTDLFPILELGTSAKMLSVVPLMAGGGMYETGAGGSAPKHVKQLVEEN 600
           TI+ TGNILRDYLTDLFPI+ELGTSAKMLS+VPLM GGGMYETGAGGSAPKHV+QLV+EN
Sbjct: 541 TISVTGNILRDYLTDLFPIMELGTSAKMLSIVPLMNGGGMYETGAGGSAPKHVQQLVQEN 600

Query: 601 HLRWDSLGEFLALGAGFEDIGIKTGNERAKLLGKTLDAAIGKLLDNDKSPSRKTGELDNR 660
           HLRWDSLGEFLAL    ED+GIKTGN RAKLL KTLD A G+LLD +KSPS KTG+LDNR
Sbjct: 601 HLRWDSLGEFLALAVSLEDLGIKTGNARAKLLAKTLDEATGRLLDENKSPSPKTGQLDNR 660

Query: 661 GSQFYLAMYWAQELAAQTDDQQLAEHFASLADVLTKNEDVIVRELTEVQGEPVDIGGYYA 720
           GSQ+YLA +WA+ LA Q +D +LA  FA LA  L +NE  IV EL EVQG+ VDIGGYY
Sbjct: 661 GSQYYLARFWAERLAGQAEDAELAGKFAPLAKALAENEQKIVAELAEVQGKAVDIGGYYK 720

Query: 721 PDSDMTTAVMRPSKTFNAALEAVQ 744
            D++   AVMRPS T NA L+  +
Sbjct: 721 ADAEKCKAVMRPSATLNAILKGAR 744

Lambda     K      H
   0.317    0.134    0.388

Gapped
Lambda     K      H
   0.267   0.0410    0.140

Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 1512
Number of extensions: 35
Number of successful extensions: 1
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 745
Length of database: 746
Length adjustment: 40
Effective length of query: 705
Effective length of database: 706
Effective search space: 497730
Effective search space used: 497730
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 55 (25.8 bits)
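The summary figures in the BLAST report above can be rechecked with a little arithmetic. The following is a minimal sketch, not part of GapMind: the variable names are hypothetical and the numbers are copied from the report. It recomputes the identity and positives percentages, the fraction of the query covered by the alignment, and the effective search space (each length minus the length adjustment, multiplied together).

```python
# Illustrative recomputation of the BLAST summary statistics.
# All names here are hypothetical; the values are taken from the report.

identities, positives, aligned_columns = 565, 648, 744
query_len, subject_len, length_adjustment = 745, 746, 40

# Exact percentages over the aligned columns.
identity_pct = 100.0 * identities / aligned_columns
positives_pct = 100.0 * positives / aligned_columns

# Integer division reproduces the whole-number percentages printed above.
identity_whole = 100 * identities // aligned_columns
positives_whole = 100 * positives // aligned_columns

# The alignment spans query residues 1..744 of 745.
query_coverage_pct = 100.0 * aligned_columns / query_len

# Effective lengths and the resulting search space.
effective_query = query_len - length_adjustment
effective_subject = subject_len - length_adjustment
search_space = effective_query * effective_subject

print(f"identity:       {identity_pct:.2f}% (reported as {identity_whole}%)")
print(f"positives:      {positives_pct:.2f}% (reported as {positives_whole}%)")
print(f"query coverage: {query_coverage_pct:.2f}%")
print(f"search space:   {effective_query} x {effective_subject} = {search_space}")
```

The product 705 x 706 agrees with the "Effective search space" line of the report.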
Align candidate WP_004323643.1 C665_RS16920 (NADP-dependent isocitrate dehydrogenase)
to HMM TIGR00178 (isocitrate dehydrogenase, NADP-dependent (EC 1.1.1.42))
# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR00178.hmm
# target sequence database:        /tmp/gapView.451925.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR00178  [M=744]
Accession:   TIGR00178
Description: monomer_idh: isocitrate dehydrogenase, NADP-dependent
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                             Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                             -----------
          0 1351.1   0.0          0 1350.9   0.0    1.0  1  NCBI__GCF_000310185.1:WP_004323643.1

Domain annotation for each sequence (and alignments):
>> NCBI__GCF_000310185.1:WP_004323643.1
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 ! 1350.9   0.0         0         0       1     740 [.       1     741 [.       1     745 [. 1.00

  Alignments for each domain:
  == domain 1  score: 1350.9 bits;  conditional E-value: 0
                             TIGR00178   1 mstekakiiytltdeapllatysllpivkafaasaGievetrdislagrilaefpeylteeqkvddalaelGe 73
                                           ms+ k+kiiytltdeapllat ++lpi+++f+++aG+eve  dis+++r+laefpeyl+e+q+v+d laelG+
  NCBI__GCF_000310185.1:WP_004323643.1   1 MSAGKSKIIYTLTDEAPLLATCAFLPIIRTFTGPAGVEVEKADISVSARVLAEFPEYLSEDQRVPDTLAELGR 73
                                           8999********************************************************************* PP

                             TIGR00178  74 laktpeaniiklpnisasvpqlkaaikelqdkGydlpdypeepktdeekdikaryakikGsavnpvlreGnsd 146
                                           l+  p++niiklpnisasv qlka++kelq+kGy++pdype+pktdeek+++ r+ k++GsavnpvlreGnsd
  NCBI__GCF_000310185.1:WP_004323643.1  74 LTLEPDTNIIKLPNISASVAQLKACVKELQEKGYKIPDYPEDPKTDEEKALRVRFGKCLGSAVNPVLREGNSD 146
                                           ************************************************************************* PP

                             TIGR00178 147 rraplavkeyarkhphkmGewsadskshvahmdagdfyaseksvlldaaeevkieliakdGketvlkaklkll 219
                                           rrap+avk+ya+khph+mGew + s++hv+hm++gdfy++eks++ld+a++vk+eli+k+Gk++vlk+k++ll
  NCBI__GCF_000310185.1:WP_004323643.1 147 RRAPAAVKNYAKKHPHSMGEWKQWSQTHVSHMHHGDFYHGEKSMTLDRARDVKMELITKSGKTIVLKPKVSLL 219
                                           ************************************************************************* PP

                             TIGR00178 220 dgevidssvlskkalvefleeeiedakeegvllslhlkatmmkvsdpivfGhvvrvfykdvfakhaelleqlG 292
                                           +ge+ids+++skkal+ef+e+e+ed +e g+l+slh+katmmkvs+pivfGh v+++ykd+f kh++l+e+lG
  NCBI__GCF_000310185.1:WP_004323643.1 220 EGEIIDSMFMSKKALCEFYEKELEDCREAGILFSLHVKATMMKVSHPIVFGHCVKIYYKDAFEKHGKLFEELG 292
                                           ************************************************************************* PP

                             TIGR00178 293 ldvenGladlyakieslpaakkeeieadlekvyeerpelamvdsdkGitnlhvpsdvivdasmpamirasGkm 365
                                           ++v+nG+ dly ki++lp++k++ei+ dl+++ e+rp lamvds+kGitn+h+p d+ivdasmpamir++Gkm
  NCBI__GCF_000310185.1:WP_004323643.1 293 INVNNGMVDLYEKIKTLPESKRDEIIRDLHACQEHRPALAMVDSAKGITNFHSPNDIIVDASMPAMIRQGGKM 365
                                           ************************************************************************* PP

                             TIGR00178 366 ygkdgklkdtkavipdssyagvyqaviedckknGafdpttmGtvpnvGlmaqkaeeyGshdktfeieadGvvr 438
                                           +g+dgk+ d+k v+p+s++a++yq++i++ck +G+fdp tmGtvpnvGlmaqkaeeyGshdktfei +dGv++
  NCBI__GCF_000310185.1:WP_004323643.1 366 WGADGKQYDSKCVMPESTFARIYQEMINFCKWHGNFDPRTMGTVPNVGLMAQKAEEYGSHDKTFEIAEDGVAN 438
                                           ************************************************************************* PP

                             TIGR00178 439 vvd.ssGevlleeeveagdiwrmcqvkdapiqdwvklavtrarlsgtpavfwldperahdeelikkvekylkd 510
                                           ++d ++Gevll ++veagdiwrmcqvkdapi+dwvklavtrar sg+pa+fwldp+r+h++elikkv++ylkd
  NCBI__GCF_000310185.1:WP_004323643.1 439 ITDLATGEVLLSQNVEAGDIWRMCQVKDAPIRDWVKLAVTRARNSGMPAIFWLDPYRPHENELIKKVKTYLKD 511
                                           ***99******************************************************************** PP

                             TIGR00178 511 hdteGldiqilspvkatrfslerirrGedtisvtGnvlrdyltdlfpilelGtsakmlsvvplmaGGGlfetG 583
                                           hdt+Gldiqi+s+v+a+r +ler++rG dtisvtGn+lrdyltdlfpi+elGtsakmls+vplm+GGG++etG
  NCBI__GCF_000310185.1:WP_004323643.1 512 HDTSGLDIQIMSQVRAMRVTLERVARGLDTISVTGNILRDYLTDLFPIMELGTSAKMLSIVPLMNGGGMYETG 584
                                           ************************************************************************* PP

                             TIGR00178 584 aGGsapkhvqqleeenhlrwdslGeflalaaslehvavktgnekakvladtldaatgklldeekspsrkvGel 656
                                           aGGsapkhvqql++enhlrwdslGeflala sle++++ktgn++ak+la+tld+atg+llde+ksps k+G+l
  NCBI__GCF_000310185.1:WP_004323643.1 585 AGGSAPKHVQQLVQENHLRWDSLGEFLALAVSLEDLGIKTGNARAKLLAKTLDEATGRLLDENKSPSPKTGQL 657
                                           ************************************************************************* PP

                             TIGR00178 657 dnrgskfylakywaqelaaqtedkelaasfasvaealtkneekivaelaavqGeavdlgGyyapdtdlttkvl 729
                                           dnrgs++yla++wa+ la q+ed+ela +fa++a+al++ne+kivaela+vqG+avd+gGyy++d ++ ++v+
  NCBI__GCF_000310185.1:WP_004323643.1 658 DNRGSQYYLARFWAERLAGQAEDAELAGKFAPLAKALAENEQKIVAELAEVQGKAVDIGGYYKADAEKCKAVM 730
                                           ************************************************************************* PP

                             TIGR00178 730 rpsatfnaile 740
                                           rpsat+nail+
  NCBI__GCF_000310185.1:WP_004323643.1 731 RPSATLNAILK 741
                                           *********97 PP

Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (744 nodes)
Target sequences:                          1  (746 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.01u 0.00s 00:00:00.01 Elapsed: 00:00:00.01
# Mc/sec: 41.66
//
[ok]
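To see how much of the TIGR00178 model the hit covers, the single domain row of the hmmsearch report above can be split into fields. This is only a sketch under stated assumptions: `DomainHit`, `parse_domain_row`, and `model_coverage` are hypothetical names, not part of HMMER or GapMind, and the parser assumes the whitespace-separated column order shown in the domain table (number, flag, score, bias, c-Evalue, i-Evalue, hmm coordinates, alignment coordinates, envelope coordinates, acc).

```python
from dataclasses import dataclass


@dataclass
class DomainHit:
    """One row of the per-domain table (hypothetical container)."""
    score: float
    hmm_from: int
    hmm_to: int
    ali_from: int
    ali_to: int
    env_from: int
    env_to: int
    acc: float


def parse_domain_row(row: str) -> DomainHit:
    """Split a domain row on whitespace and pick out the numeric columns.

    Fields 8, 11 and 14 are the two-character boundary flags (such as "[.")
    and are skipped.
    """
    f = row.split()
    return DomainHit(
        score=float(f[2]),
        hmm_from=int(f[6]), hmm_to=int(f[7]),
        ali_from=int(f[9]), ali_to=int(f[10]),
        env_from=int(f[12]), env_to=int(f[13]),
        acc=float(f[15]),
    )


def model_coverage(hit: DomainHit, model_len: int) -> float:
    """Fraction of model positions spanned by the alignment."""
    return (hit.hmm_to - hit.hmm_from + 1) / model_len


# Row copied from the report; the model length M=744 comes from the Query line.
ROW = "1 ! 1350.9 0.0 0 0 1 740 [. 1 741 [. 1 745 [. 1.00"
hit = parse_domain_row(ROW)
coverage = model_coverage(hit, 744)
print(f"model positions {hit.hmm_from}-{hit.hmm_to} of 744: {coverage:.1%} coverage")
```

For this candidate the alignment spans model positions 1 to 740, so nearly the whole model is covered.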
This GapMind analysis is from Sep 24 2021. The underlying query database was built on Sep 17 2021.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other blast hits with at least 50% coverage are "low confidence."
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
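The notion of a gap described above (a step with no high- or medium-confidence candidate) can be illustrated with a short sketch. This is not GapMind's code: the function name, the step names, and the dictionary layout are all hypothetical, and the confidence labels are assumed to have been assigned already.

```python
from typing import Dict, List

# Confidence levels that keep a step from being counted as a gap.
CONFIDENT_LEVELS = {"high", "medium"}


def find_gaps(candidates_by_step: Dict[str, List[str]]) -> List[str]:
    """Return the steps that have no high- or medium-confidence candidate.

    `candidates_by_step` maps a step name to the confidence labels of its
    candidates; an empty list means no candidate was found at all.
    """
    return [
        step
        for step, levels in candidates_by_step.items()
        if not CONFIDENT_LEVELS.intersection(levels)
    ]


# Hypothetical pathway with four steps.
example = {
    "step_a": ["high"],
    "step_b": ["low", "low"],
    "step_c": [],
    "step_d": ["medium", "low"],
}
gaps = find_gaps(example)
print(gaps)
```

In this made-up example, `step_b` (only low-confidence hits) and `step_c` (no hits) are reported as gaps.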
For more information, see:
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory