GapMind for catabolism of small carbon sources

Alignments for a candidate for icd in Echinicola vietnamensis KMM 6221, DSM 17526

Align Isocitrate dehydrogenase [NADP]; IDH; Oxalosuccinate decarboxylase; EC 1.1.1.42 (characterized)
to candidate Echvi_1839 Echvi_1839 isocitrate dehydrogenase, NADP-dependent, monomeric type

Query= SwissProt::P16100
         (741 letters)



>FitnessBrowser__Cola:Echvi_1839
          Length = 762

 Score = 1053 bits (2724), Expect = 0.0
 Identities = 520/735 (70%), Positives = 606/735 (82%)

Query: 3   TPKIIYTLTDEAPALATYSLLPIIKAFTGSSGIAVETRDISLAGRLIATFPEYLTDTQKI 62
           TPKI+YTLTDEAPALATYSLLPIIK+FT S+G+ VETRDISL+GR+IA FPEYL + Q+I
Sbjct: 21  TPKILYTLTDEAPALATYSLLPIIKSFTDSAGVVVETRDISLSGRIIANFPEYLKEDQRI 80

Query: 63  SDDLAELGKLATTPDANIIKLPNISASVPQLKAAIKELQQQGYKLPDYPEEPKTDTEKDV 122
            D LAELG++A TP+ANI+KLPNISAS+PQLKAAIKELQ++GY LPDYP+EPK   EK V
Sbjct: 81  GDALAELGEIAKTPEANIVKLPNISASIPQLKAAIKELQEKGYALPDYPDEPKDQEEKGV 140

Query: 123 KARYDKIKGSAVNPVLREGNSDRRAPLSVKNYARKHPHKMGAWSADSKSHVAHMDNGDFY 182
           KA+YDKIKGSAVNPVLREGNSDRRAP +VK +AR +PH MG WSADSKSHVA M  GDFY
Sbjct: 141 KAKYDKIKGSAVNPVLREGNSDRRAPQAVKQFARNNPHSMGEWSADSKSHVASMSEGDFY 200

Query: 183 GSEKAALIGAPGSVKIELIAKDGSSTVLKAKTSVQAGEIIDSSVMSKNALRNFIAAEIED 242
           GSE++  +   G+VKI+L A DG+ TVLK    +Q GE+IDSSVMS   L+ F+A +  D
Sbjct: 201 GSEQSLTMSEAGTVKIQLEAGDGTVTVLKEGLELQEGEVIDSSVMSVKKLQAFLAEQKAD 260

Query: 243 AKKQGVLLSVHLKATMMKVSDPIMFGQIVSEFYKDALTKHAEVLKQIGFDVNNGIGDLYA 302
           AK +G+L S+H+KATMMKVSDPI+FG  V  F+     KHA  +K++G DVNNG GDL +
Sbjct: 261 AKAKGILFSLHMKATMMKVSDPIIFGHAVKVFFAPVFEKHAATIKKLGVDVNNGFGDLVS 320

Query: 303 RIKTLPEAKQKEIEADIQAVYAQRPQLAMVNSDKGITNLHVPSDVIVDASMPAMIRDSGK 362
            ++ LP  K+KEIEADI+A  A  P LAMVNS KGITNLHVPSDVI+DASMPAMIR SG+
Sbjct: 321 ALEKLPADKRKEIEADIEACLADSPDLAMVNSHKGITNLHVPSDVIIDASMPAMIRSSGQ 380

Query: 363 MWGPDGKLHDTKAVIPDRCYAGVYQVVIEDCKQHGAFDPTTMGSVPNVGLMAQKAEEYGS 422
           MW  +  L DTKA+IPDR YAGVYQ  I+ CKQHGAFDPTTMGSVPNVGLMAQKAEEYGS
Sbjct: 381 MWNKNDALQDTKAIIPDRSYAGVYQETIDFCKQHGAFDPTTMGSVPNVGLMAQKAEEYGS 440

Query: 423 HDKTFQIPADGVVRVTDESGKLLLEQSVEAGDIWRMCQAKDAPIQDWVKLAVNRARATNT 482
           HDKTF+  ADG ++V + +G+ L+E  VE GDI+RMCQ KDAPIQDWVKLAVNRAR++NT
Sbjct: 441 HDKTFEAAADGAIKVLNAAGETLMEHKVEKGDIFRMCQTKDAPIQDWVKLAVNRARSSNT 500

Query: 483 PAVFWLDPARAHDAQVIAKVERYLKDYDTSGLDIRILSPVEATRFSLARIREGKDTISVT 542
           PAVFWLD  RAHDAQ+I KV +YL  +DT GLDIRILSPVEATRFSL RI++GKDTISVT
Sbjct: 501 PAVFWLDEHRAHDAQLIQKVNQYLPQHDTEGLDIRILSPVEATRFSLERIKDGKDTISVT 560

Query: 543 GNVLRDYLTDLFPIMELGTSAKMLSIVPLMSGGGLFETGAGGSAPKHVQQFLEEGYLRWD 602
           GNVLRDYLTDLFPI+ELGTSAKMLSIVPLM+GGGLFETGAGGSAPKHVQQF+EEG+LRWD
Sbjct: 561 GNVLRDYLTDLFPILELGTSAKMLSIVPLMNGGGLFETGAGGSAPKHVQQFVEEGHLRWD 620

Query: 603 SLGEFLALAASLEHLGNAYKNPKALVLASTLDQATGKILDNNKSPARKVGEIDNRGSHFY 662
           SLGEFLALA SLEHLG  + N +A+VL  TLD ATGK L+N KSP+RKV E+DNRGSHFY
Sbjct: 621 SLGEFLALAVSLEHLGETFDNNRAIVLGKTLDTATGKFLENGKSPSRKVNELDNRGSHFY 680

Query: 663 LALYWAQALAAQTEDKELQAQFTGIAKALTDNETKIVGELAAAQGKPVDIAGYYHPNTDL 722
           LA+YWA+ALA Q ED  L+  FT +AKA+ + E +++ EL  AQG PVDI GY+ P  D 
Sbjct: 681 LAMYWAEALANQDEDAALKEIFTKVAKAMIEKEEQVIAELNGAQGSPVDIGGYFKPAEDK 740

Query: 723 TSKAIRPSATFNAAL 737
           TSKA+RPS T N  L
Sbjct: 741 TSKAMRPSQTLNGIL 755


Lambda     K      H
   0.315    0.131    0.374 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 1506
Number of extensions: 50
Number of successful extensions: 1
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 741
Length of database: 762
Length adjustment: 40
Effective length of query: 701
Effective length of database: 722
Effective search space:   506122
Effective search space used:   506122
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.5 bits)
S2: 55 (25.8 bits)
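
The summary figures in the BLAST report above can be re-derived from the numbers it prints. The Python sketch below is not part of GapMind or BLAST, and its function names are illustrative. It assumes the standard Karlin-Altschul conversion from raw score to bit score, S' = (lambda * S - ln K) / ln 2, applied with the gapped parameters, and it assumes coverage is measured as aligned positions divided by full sequence length. Because the report prints lambda and K rounded, the recomputed bit score only approximately matches the reported 1053 bits.

```python
import math

def percent_identity(identical: int, aligned_columns: int) -> float:
    """Percentage of aligned columns with identical residues."""
    return 100.0 * identical / aligned_columns

def coverage(start: int, end: int, length: int) -> float:
    """Fraction of a sequence spanned by the alignment (1-based, inclusive)."""
    return (end - start + 1) / length

def bit_score(raw_score: float, lam: float, k: float) -> float:
    """Karlin-Altschul bit score: (lambda * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)

def effective_search_space(query_len: int, db_len: int, adjustment: int) -> int:
    """Product of the effective lengths (single-sequence database assumed)."""
    return (query_len - adjustment) * (db_len - adjustment)

# Values copied from the report above.
identity = percent_identity(520, 735)          # reported as 70%
query_cov = coverage(3, 737, 741)              # query positions 3..737 of 741
subject_cov = coverage(21, 755, 762)           # subject positions 21..755 of 762
bits = bit_score(2724, 0.267, 0.0410)          # gapped lambda and K
space = effective_search_space(741, 762, 40)   # reported as 506122

print(f"identity {identity:.1f}%, query coverage {query_cov:.3f}, "
      f"subject coverage {subject_cov:.3f}")
print(f"bit score {bits:.1f}, effective search space {space}")
```

The effective search space reproduces the reported value exactly, since it depends only on the integer lengths and the length adjustment.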

Align candidate Echvi_1839 Echvi_1839 (isocitrate dehydrogenase, NADP-dependent, monomeric type)
to HMM TIGR00178 (isocitrate dehydrogenase, NADP-dependent (EC 1.1.1.42))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR00178.hmm
# target sequence database:        /tmp/gapView.19190.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR00178  [M=744]
Accession:   TIGR00178
Description: monomer_idh: isocitrate dehydrogenase, NADP-dependent
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                            Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                            -----------
          0 1325.0   2.3          0 1324.8   2.3    1.0  1  lcl|FitnessBrowser__Cola:Echvi_1839  Echvi_1839 isocitrate dehydrogen


Domain annotation for each sequence (and alignments):
>> lcl|FitnessBrowser__Cola:Echvi_1839  Echvi_1839 isocitrate dehydrogenase, NADP-dependent, monomeric type
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 ! 1324.8   2.3         0         0       3     742 ..      19     758 ..      16     760 .. 1.00

  Alignments for each domain:
  == domain 1  score: 1324.8 bits;  conditional E-value: 0
                            TIGR00178   3 tekakiiytltdeapllatysllpivkafaasaGievetrdislagrilaefpeylteeqkvddalaelGelak 76 
                                          t+++ki+ytltdeap+latysllpi+k+f++saG+ vetrdisl+gri+a+fpeyl e+q+++dalaelGe+ak
  lcl|FitnessBrowser__Cola:Echvi_1839  19 TKTPKILYTLTDEAPALATYSLLPIIKSFTDSAGVVVETRDISLSGRIIANFPEYLKEDQRIGDALAELGEIAK 92 
                                          6789********************************************************************** PP

                            TIGR00178  77 tpeaniiklpnisasvpqlkaaikelqdkGydlpdypeepktdeekdikaryakikGsavnpvlreGnsdrrap 150
                                          tpeani+klpnisas+pqlkaaikelq+kGy+lpdyp+epk +eek +ka+y+kikGsavnpvlreGnsdrrap
  lcl|FitnessBrowser__Cola:Echvi_1839  93 TPEANIVKLPNISASIPQLKAAIKELQEKGYALPDYPDEPKDQEEKGVKAKYDKIKGSAVNPVLREGNSDRRAP 166
                                          ************************************************************************** PP

                            TIGR00178 151 lavkeyarkhphkmGewsadskshvahmdagdfyaseksvlldaaeevkieliakdGketvlkaklklldgevi 224
                                          +avk++ar++ph+mGewsadskshva m++gdfy+se+s+++ +a  vki+l a dG++tvlk+ l+l++gevi
  lcl|FitnessBrowser__Cola:Echvi_1839 167 QAVKQFARNNPHSMGEWSADSKSHVASMSEGDFYGSEQSLTMSEAGTVKIQLEAGDGTVTVLKEGLELQEGEVI 240
                                          ************************************************************************** PP

                            TIGR00178 225 dssvlskkalvefleeeiedakeegvllslhlkatmmkvsdpivfGhvvrvfykdvfakhaelleqlGldvenG 298
                                          dssv+s k l++fl+e+ +dak++g+l+slh+katmmkvsdpi+fGh+v+vf++ vf kha+++++lG+dv+nG
  lcl|FitnessBrowser__Cola:Echvi_1839 241 DSSVMSVKKLQAFLAEQKADAKAKGILFSLHMKATMMKVSDPIIFGHAVKVFFAPVFEKHAATIKKLGVDVNNG 314
                                          ************************************************************************** PP

                            TIGR00178 299 ladlyakieslpaakkeeieadlekvyeerpelamvdsdkGitnlhvpsdvivdasmpamirasGkmygkdgkl 372
                                          ++dl + +e+lpa k++eiead+e+++++ p+lamv+s kGitnlhvpsdvi+dasmpamir+sG+m++k++ l
  lcl|FitnessBrowser__Cola:Echvi_1839 315 FGDLVSALEKLPADKRKEIEADIEACLADSPDLAMVNSHKGITNLHVPSDVIIDASMPAMIRSSGQMWNKNDAL 388
                                          ************************************************************************** PP

                            TIGR00178 373 kdtkavipdssyagvyqaviedckknGafdpttmGtvpnvGlmaqkaeeyGshdktfeieadGvvrvvdssGev 446
                                          +dtka+ipd+syagvyq+ i++ck++GafdpttmG+vpnvGlmaqkaeeyGshdktfe  adG ++v ++ Ge 
  lcl|FitnessBrowser__Cola:Echvi_1839 389 QDTKAIIPDRSYAGVYQETIDFCKQHGAFDPTTMGSVPNVGLMAQKAEEYGSHDKTFEAAADGAIKVLNAAGET 462
                                          ************************************************************************** PP

                            TIGR00178 447 lleeeveagdiwrmcqvkdapiqdwvklavtrarlsgtpavfwldperahdeelikkvekylkdhdteGldiqi 520
                                          l+e++ve+gdi+rmcq kdapiqdwvklav+rar s+tpavfwld++rahd++li+kv++yl +hdteGldi+i
  lcl|FitnessBrowser__Cola:Echvi_1839 463 LMEHKVEKGDIFRMCQTKDAPIQDWVKLAVNRARSSNTPAVFWLDEHRAHDAQLIQKVNQYLPQHDTEGLDIRI 536
                                          ************************************************************************** PP

                            TIGR00178 521 lspvkatrfslerirrGedtisvtGnvlrdyltdlfpilelGtsakmlsvvplmaGGGlfetGaGGsapkhvqq 594
                                          lspv+atrfsleri+ G+dtisvtGnvlrdyltdlfpilelGtsakmls+vplm+GGGlfetGaGGsapkhvqq
  lcl|FitnessBrowser__Cola:Echvi_1839 537 LSPVEATRFSLERIKDGKDTISVTGNVLRDYLTDLFPILELGTSAKMLSIVPLMNGGGLFETGAGGSAPKHVQQ 610
                                          ************************************************************************** PP

                            TIGR00178 595 leeenhlrwdslGeflalaaslehvavktgnekakvladtldaatgklldeekspsrkvGeldnrgskfylaky 668
                                          ++ee+hlrwdslGeflala sleh++ + +n++a vl++tld atgk+l++ kspsrkv eldnrgs+fyla+y
  lcl|FitnessBrowser__Cola:Echvi_1839 611 FVEEGHLRWDSLGEFLALAVSLEHLGETFDNNRAIVLGKTLDTATGKFLENGKSPSRKVNELDNRGSHFYLAMY 684
                                          ************************************************************************** PP

                            TIGR00178 669 waqelaaqtedkelaasfasvaealtkneekivaelaavqGeavdlgGyyapdtdlttkvlrpsatfnaileal 742
                                          wa++la q ed+ l++ f+ va+a+ ++ee+++ael+ +qG++vd+gGy++p +d+t+k++rps+t+n ile +
  lcl|FitnessBrowser__Cola:Echvi_1839 685 WAEALANQDEDAALKEIFTKVAKAMIEKEEQVIAELNGAQGSPVDIGGYFKPAEDKTSKAMRPSQTLNGILEMV 758
                                          ***********************************************************************976 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (744 nodes)
Target sequences:                          1  (762 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.04u 0.03s 00:00:00.07 Elapsed: 00:00:00.06
# Mc/sec: 9.17
//
[ok]
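
How much of the model and of the candidate the domain hit spans can be read from the single row of the domain annotation table above. The Python sketch below is illustrative rather than GapMind's own code: the helper name `parse_domain_row` and the dictionary keys are assumptions, and it relies on the whitespace-separated column order shown in the header of that table. Coverage is taken here as the aligned span divided by the full length of the model (M=744) or of the sequence (762 residues).

```python
def parse_domain_row(row: str) -> dict:
    """Parse one row of the hmmsearch per-domain table into named fields.

    Expected columns: #, significance flag, score, bias, c-Evalue, i-Evalue,
    hmmfrom, hmm to, bounds, alifrom, ali to, bounds, envfrom, env to,
    bounds, acc.
    """
    f = row.split()
    if len(f) != 16:
        raise ValueError(f"expected 16 fields, found {len(f)}")
    return {
        "score": float(f[2]),
        "bias": float(f[3]),
        "hmm_from": int(f[6]),
        "hmm_to": int(f[7]),
        "ali_from": int(f[9]),
        "ali_to": int(f[10]),
        "env_from": int(f[12]),
        "env_to": int(f[13]),
        "acc": float(f[15]),
    }

# Row copied from the domain annotation table above.
row = ("   1 ! 1324.8   2.3         0         0       3     742 .. "
       "     19     758 ..      16     760 .. 1.00")
dom = parse_domain_row(row)

model_length = 744      # from "Query: TIGR00178 [M=744]"
sequence_length = 762   # from "762 residues searched"

model_cov = (dom["hmm_to"] - dom["hmm_from"] + 1) / model_length
seq_cov = (dom["ali_to"] - dom["ali_from"] + 1) / sequence_length

print(f"model coverage {model_cov:.3f}, sequence coverage {seq_cov:.3f}")
```

For this candidate the hit spans 740 of the 744 model positions, so it covers nearly the whole model.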

This GapMind analysis is from Sep 17 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory