GapMind for catabolism of small carbon sources

 

Alignments for a candidate for icd in Sulfurimonas denitrificans DSM 1251

Align isocitrate dehydrogenase (EC 1.1.1.42) (characterized)
to candidate WP_011372677.1 SUDEN_RS05495 NADP-dependent isocitrate dehydrogenase

Query= metacyc::MONOMER-11847
         (741 letters)



>NCBI__GCF_000012965.1:WP_011372677.1
          Length = 729

 Score =  850 bits (2197), Expect = 0.0
 Identities = 440/737 (59%), Positives = 545/737 (73%), Gaps = 9/737 (1%)

Query: 5   STIIYTKIDEAPALATYSLLPIIQAFTRGTGVDVETRDISLAGRIIANFPENLTEEQRIP 64
           S II++KIDEAPALATYSLLPI+ AFT+  GV+V T DISLAGR+I+ F + L  EQRI 
Sbjct: 2   SKIIWSKIDEAPALATYSLLPIVNAFTKAAGVEVVTSDISLAGRVISKFSDRLKPEQRIN 61

Query: 65  DYLAQLGELALTPEANIIKLPNISASIPQLKAAIKELQEHGYNVPNYPEAPSNDEEKAIQ 124
           D LA+LG + L P+ NIIKLPNISAS+ QL   I ELQ  GY+VPNYPE    DEEKA++
Sbjct: 62  DELAELGNVVLQPDGNIIKLPNISASVGQLIECITELQAQGYDVPNYPEDAKTDEEKALK 121

Query: 125 ARYAKVLGSAVNPVLREGNSDRRAPLSVKAYAQKHPHRMAAWSKDSKAHVSHMNEGDFYG 184
             Y+  LGSAVNPVLREGNSDRRA  +VK +AQK+PH++ A+   SKA+V+HMN GDF+ 
Sbjct: 122 EIYSTCLGSAVNPVLREGNSDRRAAAAVKRFAQKNPHKLRAFESPSKAYVAHMNGGDFFS 181

Query: 185 SEQSVTVPAATTVRIEYVNGANEVTVLKEKTALLAGEVIDTSVMNVRKLRDFYAEQIEDA 244
           +E+SV V     V IE +NG     VLK    ++  E++D + M+ + L+ FY + ++DA
Sbjct: 182 NEKSVIVEGNAPVTIE-LNGK----VLKTLNDVVDKEIMDATFMSAKALQAFYQKTLDDA 236

Query: 245 KSQGVLLSLHLKATMMKISDPIMFGHAVSVFYKDVFDKHGALLAELGVNVNNGLGDLYAK 304
           K  GVL SLHLKATMMK+SDPIMFGHAV VF+KDVF K+G  L  +G N N GLGDLY K
Sbjct: 237 KKNGVLWSLHLKATMMKVSDPIMFGHAVKVFFKDVFAKYGDELTAIGYNANMGLGDLYKK 296

Query: 305 IQTLPEDKRAEIEADIMAVYKTRPELAMVDSDKGITNLHVPNDIIIDASMPVVVRDGGKM 364
           ++     K+AEI A I A Y  +P +AMVDSDKGITNLH  NDIIIDASMPVVVRDGGKM
Sbjct: 297 LEK--SSKKAEIIAAIEATYDIQPPMAMVDSDKGITNLHASNDIIIDASMPVVVRDGGKM 354

Query: 365 WGPDGQLHDCKAVIPDRCYATMYGEIVDDCRKNGAFDPSTIGSVPNVGLMAQKAEEYGSH 424
           W  DG++ +C +VIPD  YA  +  +VDDC  NG +D +T+G+V NVGLMAQKAEEYGSH
Sbjct: 355 WNRDGKVQECVSVIPDASYAFFHKAMVDDCVANGQYDVTTMGNVANVGLMAQKAEEYGSH 414

Query: 425 DKTFTAAGDGVIRVVDADGTVLMSQKVETGDIFRMCQAKDAPIRDWVGLAVRRAKATGAP 484
             TF  A +G + V   +G VLMS  VE GDI+RM + KD PI+DWV LAV RA+ TG+P
Sbjct: 415 PTTFEIAENGSVEVKQ-NGKVLMSHNVEAGDIWRMSRVKDIPIQDWVRLAVERARLTGSP 473

Query: 485 AVFWLDSNRAHDAQIIAKVNEYLKDLDTDGVEIKIMPPVEAMRFTLGRFRAGQDTISVTG 544
           AVFWLD NRAHDA +I KVNEYLK+ DT G+E+ IM    A ++T  R R G+DTISVTG
Sbjct: 474 AVFWLDKNRAHDANMIKKVNEYLKNHDTTGLELPIMDVASAAKYTNARVRKGEDTISVTG 533

Query: 545 NVLRDYLTDLFPIIELGTSAKMLSIVPLLNGGGLFETGAGGSAPKHVQQFQKEGYLRWDS 604
           NVLRD+LTD++PI+ELGTSAKMLSIVPLL GGG+FETGAGGSAPKHV QF +EG+LRWDS
Sbjct: 534 NVLRDHLTDMYPILELGTSAKMLSIVPLLAGGGVFETGAGGSAPKHVDQFLEEGHLRWDS 593

Query: 605 LGEFSALAASLEHLAQTFGNPKAQVLADTLDQAIGKFLDNQKSPARKVGQIDNRGSHFYL 664
           LGEF ALA SL  +AQ   + K   +   LD A   +LDN K+P+RK G+ DN+ SHF+L
Sbjct: 594 LGEFLALAESLRFIAQKNSDDKLAAVTTALDIANAAYLDNNKAPSRKAGEPDNKASHFFL 653

Query: 665 ALYWAEALAAQDSDAEMKARFAGVASSLAAKEELINAELIAAQGSPVDMGGYYQPDDEKT 724
           A YWA+AL ++  +AE+ A+FA +A +L   E  I  EL++ +G   D+GGYY PDD+K 
Sbjct: 654 AQYWAKAL-SEGKNAELAAKFAPIAKALTENEAKIIEELLSIEGKAQDIGGYYHPDDKKA 712

Query: 725 AAAMRPSGTLNAIIDAM 741
            AAMRPS TLN I+D++
Sbjct: 713 EAAMRPSATLNKIVDSI 729


Lambda     K      H
   0.316    0.133    0.380 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 1388
Number of extensions: 64
Number of successful extensions: 5
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 741
Length of database: 729
Length adjustment: 40
Effective length of query: 701
Effective length of database: 689
Effective search space:   482989
Effective search space used:   482989
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.6 bits)
S2: 55 (25.8 bits)
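
The summary figures in the BLAST report above can be rechecked from the numbers it prints. The following is a minimal sketch, not part of GapMind itself: the variable names are illustrative, and it assumes the standard Karlin-Altschul conversions (bit score from raw score, and E-value from bit score and effective search space). Because the report prints lambda and K rounded, the recomputed bit score only approximately matches the reported 850 bits.

```python
import math

# Values copied from the BLAST report above.
query_len, subject_len = 741, 729
identities, aln_len = 440, 737
query_start, query_end = 5, 741
raw_score = 2197
gapped_lambda, gapped_k = 0.267, 0.0410   # "Gapped" Lambda and K
length_adjustment = 40

# Percent identity over the aligned columns (the report truncates it to 59%).
pct_identity = 100 * identities / aln_len

# Percentage of the query covered by the alignment (positions 5..741 of 741).
query_coverage = 100 * (query_end - query_start + 1) / query_len

# Bit score from the raw score: S' = (lambda * S - ln K) / ln 2
bit_score = (gapped_lambda * raw_score - math.log(gapped_k)) / math.log(2)

# Effective search space: product of the length-adjusted query and
# database lengths, 701 * 689 = 482989, matching the report.
search_space = (query_len - length_adjustment) * (subject_len - length_adjustment)

# E-value: E = (search space) * 2^(-S'). It is far too small to print
# at the report's precision, which is why the report shows 0.0.
e_value = search_space * 2.0 ** (-bit_score)

print(f"identity {pct_identity:.1f}%, query coverage {query_coverage:.1f}%")
print(f"bit score ~{bit_score:.0f}, search space {search_space}, E-value {e_value:.1e}")
```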

Align candidate WP_011372677.1 SUDEN_RS05495 (NADP-dependent isocitrate dehydrogenase)
to HMM TIGR00178 (isocitrate dehydrogenase, NADP-dependent (EC 1.1.1.42))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR00178.hmm
# target sequence database:        /tmp/gapView.1226782.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR00178  [M=744]
Accession:   TIGR00178
Description: monomer_idh: isocitrate dehydrogenase, NADP-dependent
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                             Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                             -----------
          0 1089.9   7.7          0 1089.7   7.7    1.0  1  NCBI__GCF_000012965.1:WP_011372677.1  


Domain annotation for each sequence (and alignments):
>> NCBI__GCF_000012965.1:WP_011372677.1  
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 ! 1089.7   7.7         0         0       5     741 ..       1     728 [.       1     729 [] 0.99

  Alignments for each domain:
  == domain 1  score: 1089.7 bits;  conditional E-value: 0
                             TIGR00178   5 kakiiytltdeapllatysllpivkafaasaGievetrdislagrilaefpeylteeqkvddalaelGelakt 77 
                                           ++kii++  deap+latysllpiv+af+++aG+ev t+dislagr++++f ++l  eq+++d+laelG++   
  NCBI__GCF_000012965.1:WP_011372677.1   1 MSKIIWSKIDEAPALATYSLLPIVNAFTKAAGVEVVTSDISLAGRVISKFSDRLKPEQRINDELAELGNVVLQ 73 
                                           58*********************************************************************** PP

                             TIGR00178  78 peaniiklpnisasvpqlkaaikelqdkGydlpdypeepktdeekdikaryakikGsavnpvlreGnsdrrap 150
                                           p+ niiklpnisasv ql  +i elq++Gyd+p+ype++ktdeek++k+ y+ ++GsavnpvlreGnsdrra 
  NCBI__GCF_000012965.1:WP_011372677.1  74 PDGNIIKLPNISASVGQLIECITELQAQGYDVPNYPEDAKTDEEKALKEIYSTCLGSAVNPVLREGNSDRRAA 146
                                           ************************************************************************* PP

                             TIGR00178 151 lavkeyarkhphkmGewsadskshvahmdagdfyaseksvlldaaeevkieliakdGketvlkaklklldgev 223
                                           +avk++a+k+phk+ ++   sk+ vahm+ gdf+++eksv++++   v iel   +G   vlk+  ++ d+e+
  NCBI__GCF_000012965.1:WP_011372677.1 147 AAVKRFAQKNPHKLRAFESPSKAYVAHMNGGDFFSNEKSVIVEGNAPVTIEL---NG--KVLKTLNDVVDKEI 214
                                           **************************************************98...44..48************ PP

                             TIGR00178 224 idssvlskkalvefleeeiedakeegvllslhlkatmmkvsdpivfGhvvrvfykdvfakhaelleqlGldve 296
                                           +d++++s+kal++f+++ ++dak++gvl slhlkatmmkvsdpi+fGh+v+vf+kdvfak+++ l+++G +++
  NCBI__GCF_000012965.1:WP_011372677.1 215 MDATFMSAKALQAFYQKTLDDAKKNGVLWSLHLKATMMKVSDPIMFGHAVKVFFKDVFAKYGDELTAIGYNAN 287
                                           ************************************************************************* PP

                             TIGR00178 297 nGladlyakieslpaakkeeieadlekvyeerpelamvdsdkGitnlhvpsdvivdasmpamirasGkmygkd 369
                                            Gl+dly k+e+  ++kk ei+a++e++y+ +p +amvdsdkGitnlh+  d+i+dasmp ++r++Gkm+++d
  NCBI__GCF_000012965.1:WP_011372677.1 288 MGLGDLYKKLEK--SSKKAEIIAAIEATYDIQPPMAMVDSDKGITNLHASNDIIIDASMPVVVRDGGKMWNRD 358
                                           **********97..6789******************************************************* PP

                             TIGR00178 370 gklkdtkavipdssyagvyqaviedckknGafdpttmGtvpnvGlmaqkaeeyGshdktfeieadGvvrvvds 442
                                           gk ++   vipd sya  ++a+++dc  nG++d ttmG v nvGlmaqkaeeyGsh  tfei ++G v+v ++
  NCBI__GCF_000012965.1:WP_011372677.1 359 GKVQECVSVIPDASYAFFHKAMVDDCVANGQYDVTTMGNVANVGLMAQKAEEYGSHPTTFEIAENGSVEV-KQ 430
                                           ********************************************************************97.67 PP

                             TIGR00178 443 sGevlleeeveagdiwrmcqvkdapiqdwvklavtrarlsgtpavfwldperahdeelikkvekylkdhdteG 515
                                           +G+vl+ ++veagdiwrm +vkd piqdwv+lav rarl+g pavfwld++rahd+++ikkv++ylk+hdt+G
  NCBI__GCF_000012965.1:WP_011372677.1 431 NGKVLMSHNVEAGDIWRMSRVKDIPIQDWVRLAVERARLTGSPAVFWLDKNRAHDANMIKKVNEYLKNHDTTG 503
                                           9************************************************************************ PP

                             TIGR00178 516 ldiqilspvkatrfslerirrGedtisvtGnvlrdyltdlfpilelGtsakmlsvvplmaGGGlfetGaGGsa 588
                                           l++ i++   a++++ +r+r+GedtisvtGnvlrd+ltd++pilelGtsakmls+vpl+aGGG+fetGaGGsa
  NCBI__GCF_000012965.1:WP_011372677.1 504 LELPIMDVASAAKYTNARVRKGEDTISVTGNVLRDHLTDMYPILELGTSAKMLSIVPLLAGGGVFETGAGGSA 576
                                           ************************************************************************* PP

                             TIGR00178 589 pkhvqqleeenhlrwdslGeflalaaslehvavktgnekakvladtldaatgklldeekspsrkvGeldnrgs 661
                                           pkhv+q+ ee+hlrwdslGeflala+sl+ +a+k+ ++k   ++ +ld a    ld++k+psrk Ge dn+ s
  NCBI__GCF_000012965.1:WP_011372677.1 577 PKHVDQFLEEGHLRWDSLGEFLALAESLRFIAQKNSDDKLAAVTTALDIANAAYLDNNKAPSRKAGEPDNKAS 649
                                           ************************************************************************* PP

                             TIGR00178 662 kfylakywaqelaaqtedkelaasfasvaealtkneekivaelaavqGeavdlgGyyapdtdlttkvlrpsat 734
                                           +f+la+ywa++l+ + +++elaa+fa++a+alt+ne+ki++el +++G+a d+gGyy+pd++++++++rpsat
  NCBI__GCF_000012965.1:WP_011372677.1 650 HFFLAQYWAKALS-EGKNAELAAKFAPIAKALTENEAKIIEELLSIEGKAQDIGGYYHPDDKKAEAAMRPSAT 721
                                           ***********97.67899****************************************************** PP

                             TIGR00178 735 fnailea 741
                                           +n+i+++
  NCBI__GCF_000012965.1:WP_011372677.1 722 LNKIVDS 728
                                           ****997 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (744 nodes)
Target sequences:                          1  (729 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.01u 0.00s 00:00:00.01 Elapsed: 00:00:00.02
# Mc/sec: 24.46
//
[ok]
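
For the HMM hit, the fraction of the model and of the candidate covered by the alignment can be read off the domain table row. The sketch below is an illustration under stated assumptions rather than GapMind's own code: `parse_domain_row` is a hypothetical helper, and it assumes the whitespace-separated column order shown in the table header above (score, bias, c-Evalue, i-Evalue, hmmfrom, hmm to, alifrom, ali to, envfrom, env to, acc).

```python
def parse_domain_row(row: str) -> dict:
    """Split one row of the hmmsearch per-domain table into named fields.

    Hypothetical helper: assumes the column layout printed above, where
    the '..', '[.' and '[]' boundary markers are separate tokens.
    """
    f = row.split()
    return {
        "score": float(f[2]),
        "hmm_from": int(f[6]),
        "hmm_to": int(f[7]),
        "ali_from": int(f[9]),
        "ali_to": int(f[10]),
        "acc": float(f[15]),
    }

# Domain row copied from the hmmsearch output above.
row = ("   1 ! 1089.7   7.7         0         0       5     741 .."
       "       1     728 [.       1     729 [] 0.99")

model_len = 744   # TIGR00178 [M=744]
seq_len = 729     # length of WP_011372677.1

dom = parse_domain_row(row)

# Fraction of the HMM positions spanned by the alignment (5..741 of 744).
model_coverage = (dom["hmm_to"] - dom["hmm_from"] + 1) / model_len

# Fraction of the candidate protein spanned by the alignment (1..728 of 729).
seq_coverage = (dom["ali_to"] - dom["ali_from"] + 1) / seq_len

print(f"model coverage {model_coverage:.3f}, sequence coverage {seq_coverage:.3f}")
```

Here the alignment spans nearly the whole model and nearly the whole candidate, consistent with a full-length match to the monomeric NADP-dependent isocitrate dehydrogenase family.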

This GapMind analysis is from Apr 09 2024. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory