GapMind for catabolism of small carbon sources

 

Alignments for a candidate for icd in Arenimonas metalli CF5-1

Align Isocitrate dehydrogenase [NADP]; IDH; Oxalosuccinate decarboxylase; EC 1.1.1.42 (characterized)
to candidate WP_034212296.1 N787_RS07575 NADP-dependent isocitrate dehydrogenase

Query= SwissProt::P16100
         (741 letters)



>NCBI__GCF_000747155.1:WP_034212296.1
          Length = 741

 Score = 1090 bits (2819), Expect = 0.0
 Identities = 542/738 (73%), Positives = 620/738 (84%)

Query: 3   TPKIIYTLTDEAPALATYSLLPIIKAFTGSSGIAVETRDISLAGRLIATFPEYLTDTQKI 62
           TPKI+YTLTDEAP LAT SLLPI++AF G++GIAVETRDISL+GR++A FPE L + Q+I
Sbjct: 4   TPKILYTLTDEAPFLATQSLLPIVQAFAGAAGIAVETRDISLSGRILAQFPERLREDQRI 63

Query: 63  SDDLAELGKLATTPDANIIKLPNISASVPQLKAAIKELQQQGYKLPDYPEEPKTDTEKDV 122
            D LAELG+LATTP+ANIIKLPNISAS+PQLKAAIKELQ QGY LPDYPEEP+ D EKDV
Sbjct: 64  GDHLAELGQLATTPEANIIKLPNISASLPQLKAAIKELQAQGYDLPDYPEEPRDDAEKDV 123

Query: 123 KARYDKIKGSAVNPVLREGNSDRRAPLSVKNYARKHPHKMGAWSADSKSHVAHMDNGDFY 182
           KARY K  GSAVNPVLREGNSDRRAP +VK   RKHPH+MG W++DSKSHVAHM   DF+
Sbjct: 124 KARYGKAMGSAVNPVLREGNSDRRAPKAVKENVRKHPHRMGKWASDSKSHVAHMAGDDFF 183

Query: 183 GSEKAALIGAPGSVKIELIAKDGSSTVLKAKTSVQAGEIIDSSVMSKNALRNFIAAEIED 242
           GSEK+A + A GS+KI  +  DG+  VLK    V+ GEI+D+SVM + +L  F A +I D
Sbjct: 184 GSEKSATMAAAGSLKIAFVGADGAERVLKESVKVKEGEIVDASVMRRASLARFAAEQIAD 243

Query: 243 AKKQGVLLSVHLKATMMKVSDPIMFGQIVSEFYKDALTKHAEVLKQIGFDVNNGIGDLYA 302
           AK  GVL S+HLKATMMKVSDPIMFG  V+EFYKDAL +HA+VL ++GFD NNGIGDLYA
Sbjct: 244 AKATGVLFSLHLKATMMKVSDPIMFGVFVTEFYKDALARHAKVLDEVGFDPNNGIGDLYA 303

Query: 303 RIKTLPEAKQKEIEADIQAVYAQRPQLAMVNSDKGITNLHVPSDVIVDASMPAMIRDSGK 362
           R+  LPEA Q  I+ADI AVYA  P LAMVNSDKGITNLHVPSDVIVDASMPAMIRDSG+
Sbjct: 304 RLGQLPEATQAAIKADIDAVYASAPGLAMVNSDKGITNLHVPSDVIVDASMPAMIRDSGR 363

Query: 363 MWGPDGKLHDTKAVIPDRCYAGVYQVVIEDCKQHGAFDPTTMGSVPNVGLMAQKAEEYGS 422
           MW   G+L DTKAVIPDRCYAG+YQ VI+DCK +GAFDP TMGSVPNVGLMAQKAEEYGS
Sbjct: 364 MWNAKGELQDTKAVIPDRCYAGIYQAVIDDCKANGAFDPATMGSVPNVGLMAQKAEEYGS 423

Query: 423 HDKTFQIPADGVVRVTDESGKLLLEQSVEAGDIWRMCQAKDAPIQDWVKLAVNRARATNT 482
           HDKTFQ+ A G VRVTD SG ++ E +V+AGDIWRMCQ KDAPIQDWVKLAV+R+R ++T
Sbjct: 424 HDKTFQLDAAGTVRVTDASGAVVFEHAVQAGDIWRMCQTKDAPIQDWVKLAVDRSRLSST 483

Query: 483 PAVFWLDPARAHDAQVIAKVERYLKDYDTSGLDIRILSPVEATRFSLARIREGKDTISVT 542
           PA+FWLD ARAHDAQ+IAKVERYLKD+DT GLDIRIL PVEA + SL RIR G+DTISVT
Sbjct: 484 PAIFWLDAARAHDAQLIAKVERYLKDFDTQGLDIRILPPVEAMKVSLQRIRAGQDTISVT 543

Query: 543 GNVLRDYLTDLFPIMELGTSAKMLSIVPLMSGGGLFETGAGGSAPKHVQQFLEEGYLRWD 602
           GNVLRDYLTDLFPIMELGTSAKMLSIVPLM+GGGLFETGAGGSAPKHVQQF++E YLRWD
Sbjct: 544 GNVLRDYLTDLFPIMELGTSAKMLSIVPLMAGGGLFETGAGGSAPKHVQQFVQENYLRWD 603

Query: 603 SLGEFLALAASLEHLGNAYKNPKALVLASTLDQATGKILDNNKSPARKVGEIDNRGSHFY 662
           SLGEFLALAASLEH+   + N  A VLA TLD A G+ILD ++ P+RKVG +DNRGSHFY
Sbjct: 604 SLGEFLALAASLEHISERWLNSSAPVLAKTLDLANGRILDEDRGPSRKVGGLDNRGSHFY 663

Query: 663 LALYWAQALAAQTEDKELQAQFTGIAKALTDNETKIVGELAAAQGKPVDIAGYYHPNTDL 722
           LALYWAQALAAQ EDK L+A+F  +AK L++ E+ IV EL A QG+PVDI GYY P+   
Sbjct: 664 LALYWAQALAAQDEDKALKAKFAPLAKLLSEQESTIVDELLAVQGQPVDIGGYYRPDMAK 723

Query: 723 TSKAIRPSATFNAALAPL 740
           T+ A+RPS TFN AL+ L
Sbjct: 724 TTAAMRPSNTFNIALSTL 741


Lambda     K      H
   0.315    0.131    0.374 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 1514
Number of extensions: 63
Number of successful extensions: 1
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 741
Length of database: 741
Length adjustment: 40
Effective length of query: 701
Effective length of database: 701
Effective search space:   491401
Effective search space used:   491401
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.5 bits)
S2: 55 (25.8 bits)
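
The summary figures in the BLAST report above can be rechecked from the numbers it prints. The following is a minimal sketch, not part of GapMind itself: the variable names are illustrative, and the bit-score line assumes the standard Karlin-Altschul conversion from raw score to bits using the gapped Lambda and K values as rounded in the report, so it only approximately reproduces the printed 1090 bits.

```python
import math

# Values copied from the BLAST report above.
identities, positives, aligned_columns = 542, 620, 738
query_length = 741
db_length = 741            # a single database sequence of 741 residues
length_adjustment = 40
raw_score = 2819
gapped_lambda, gapped_k = 0.267, 0.0410  # rounded as printed in the report


def percent(part, whole):
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


identity_pct = percent(identities, aligned_columns)
positive_pct = percent(positives, aligned_columns)

# Query residues 3..740 take part in the alignment.
query_coverage_pct = percent(740 - 3 + 1, query_length)

# Effective lengths subtract the length adjustment (one sequence in the database).
effective_query = query_length - length_adjustment
effective_db = db_length - length_adjustment
search_space = effective_query * effective_db

# Assumed Karlin-Altschul conversion: bits = (lambda * S - ln K) / ln 2.
bit_score = (gapped_lambda * raw_score - math.log(gapped_k)) / math.log(2)

print(f"identity  {identity_pct:.1f}%")
print(f"positives {positive_pct:.1f}%")
print(f"query coverage {query_coverage_pct:.1f}%")
print(f"effective search space {search_space}")
print(f"approximate bit score {bit_score:.1f}")
```

The effective search space computed this way matches the 491401 printed in the report, and the percentages round to the reported 73% and 84%.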

Align candidate WP_034212296.1 N787_RS07575 (NADP-dependent isocitrate dehydrogenase)
to HMM TIGR00178 (isocitrate dehydrogenase, NADP-dependent (EC 1.1.1.42))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR00178.hmm
# target sequence database:        /tmp/gapView.2983301.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR00178  [M=744]
Accession:   TIGR00178
Description: monomer_idh: isocitrate dehydrogenase, NADP-dependent
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                             Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                             -----------
          0 1313.5   0.5          0 1313.3   0.5    1.0  1  NCBI__GCF_000747155.1:WP_034212296.1  


Domain annotation for each sequence (and alignments):
>> NCBI__GCF_000747155.1:WP_034212296.1  
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 ! 1313.3   0.5         0         0       3     742 ..       2     741 .]       1     741 [] 1.00

  Alignments for each domain:
  == domain 1  score: 1313.3 bits;  conditional E-value: 0
                             TIGR00178   3 tekakiiytltdeapllatysllpivkafaasaGievetrdislagrilaefpeylteeqkvddalaelGela 75 
                                           ++++ki+ytltdeap lat sllpiv+afa++aGi+vetrdisl+grila+fpe+l e+q+++d+laelG+la
  NCBI__GCF_000747155.1:WP_034212296.1   2 ADTPKILYTLTDEAPFLATQSLLPIVQAFAGAAGIAVETRDISLSGRILAQFPERLREDQRIGDHLAELGQLA 74 
                                           6789********************************************************************* PP

                             TIGR00178  76 ktpeaniiklpnisasvpqlkaaikelqdkGydlpdypeepktdeekdikaryakikGsavnpvlreGnsdrr 148
                                           +tpeaniiklpnisas+pqlkaaikelq++Gydlpdypeep+ d+ekd+kary k +GsavnpvlreGnsdrr
  NCBI__GCF_000747155.1:WP_034212296.1  75 TTPEANIIKLPNISASLPQLKAAIKELQAQGYDLPDYPEEPRDDAEKDVKARYGKAMGSAVNPVLREGNSDRR 147
                                           ************************************************************************* PP

                             TIGR00178 149 aplavkeyarkhphkmGewsadskshvahmdagdfyaseksvlldaaeevkieliakdGketvlkaklklldg 221
                                           ap+avke +rkhph+mG+w+ dskshvahm+ +df++seks+++ aa ++ki ++ +dG e vlk+++k+++g
  NCBI__GCF_000747155.1:WP_034212296.1 148 APKAVKENVRKHPHRMGKWASDSKSHVAHMAGDDFFGSEKSATMAAAGSLKIAFVGADGAERVLKESVKVKEG 220
                                           ************************************************************************* PP

                             TIGR00178 222 evidssvlskkalvefleeeiedakeegvllslhlkatmmkvsdpivfGhvvrvfykdvfakhaelleqlGld 294
                                           e++d+sv+ + +l+ f +e+i+dak++gvl+slhlkatmmkvsdpi+fG  v +fykd++a+ha++l+++G+d
  NCBI__GCF_000747155.1:WP_034212296.1 221 EIVDASVMRRASLARFAAEQIADAKATGVLFSLHLKATMMKVSDPIMFGVFVTEFYKDALARHAKVLDEVGFD 293
                                           ************************************************************************* PP

                             TIGR00178 295 venGladlyakieslpaakkeeieadlekvyeerpelamvdsdkGitnlhvpsdvivdasmpamirasGkmyg 367
                                            +nG++dlya++ +lp+a +  i+ad+++vy++ p lamv+sdkGitnlhvpsdvivdasmpamir+sG+m++
  NCBI__GCF_000747155.1:WP_034212296.1 294 PNNGIGDLYARLGQLPEATQAAIKADIDAVYASAPGLAMVNSDKGITNLHVPSDVIVDASMPAMIRDSGRMWN 366
                                           ************************************************************************* PP

                             TIGR00178 368 kdgklkdtkavipdssyagvyqaviedckknGafdpttmGtvpnvGlmaqkaeeyGshdktfeieadGvvrvv 440
                                           + g+l+dtkavipd++yag+yqavi+dck nGafdp+tmG+vpnvGlmaqkaeeyGshdktf+++a+G+vrv+
  NCBI__GCF_000747155.1:WP_034212296.1 367 AKGELQDTKAVIPDRCYAGIYQAVIDDCKANGAFDPATMGSVPNVGLMAQKAEEYGSHDKTFQLDAAGTVRVT 439
                                           ************************************************************************* PP

                             TIGR00178 441 dssGevlleeeveagdiwrmcqvkdapiqdwvklavtrarlsgtpavfwldperahdeelikkvekylkdhdt 513
                                           d+sG v++e+ v+agdiwrmcq kdapiqdwvklav r+rls+tpa+fwld +rahd++li+kve+ylkd+dt
  NCBI__GCF_000747155.1:WP_034212296.1 440 DASGAVVFEHAVQAGDIWRMCQTKDAPIQDWVKLAVDRSRLSSTPAIFWLDAARAHDAQLIAKVERYLKDFDT 512
                                           ************************************************************************* PP

                             TIGR00178 514 eGldiqilspvkatrfslerirrGedtisvtGnvlrdyltdlfpilelGtsakmlsvvplmaGGGlfetGaGG 586
                                           +Gldi+il pv+a++ sl+rir G+dtisvtGnvlrdyltdlfpi+elGtsakmls+vplmaGGGlfetGaGG
  NCBI__GCF_000747155.1:WP_034212296.1 513 QGLDIRILPPVEAMKVSLQRIRAGQDTISVTGNVLRDYLTDLFPIMELGTSAKMLSIVPLMAGGGLFETGAGG 585
                                           ************************************************************************* PP

                             TIGR00178 587 sapkhvqqleeenhlrwdslGeflalaaslehvavktgnekakvladtldaatgklldeekspsrkvGeldnr 659
                                           sapkhvqq+++en+lrwdslGeflalaasleh++ +  n+ a vla+tld a g++lde++ psrkvG ldnr
  NCBI__GCF_000747155.1:WP_034212296.1 586 SAPKHVQQFVQENYLRWDSLGEFLALAASLEHISERWLNSSAPVLAKTLDLANGRILDEDRGPSRKVGGLDNR 658
                                           ************************************************************************* PP

                             TIGR00178 660 gskfylakywaqelaaqtedkelaasfasvaealtkneekivaelaavqGeavdlgGyyapdtdlttkvlrps 732
                                           gs+fyla ywaq+laaq edk l+a+fa++a+ l+++e++iv el avqG++vd+gGyy pd  +tt+++rps
  NCBI__GCF_000747155.1:WP_034212296.1 659 GSHFYLALYWAQALAAQDEDKALKAKFAPLAKLLSEQESTIVDELLAVQGQPVDIGGYYRPDMAKTTAAMRPS 731
                                           ************************************************************************* PP

                             TIGR00178 733 atfnaileal 742
                                           +tfn +l++l
  NCBI__GCF_000747155.1:WP_034212296.1 732 NTFNIALSTL 741
                                           *****99875 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (744 nodes)
Target sequences:                          1  (741 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.01u 0.01s 00:00:00.02 Elapsed: 00:00:00.01
# Mc/sec: 38.02
//
[ok]
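
The domain table in the hmmsearch output above gives the aligned range on the model (hmmfrom, hmm to) and on the candidate (alifrom, ali to). As a rough sketch, the fraction of the model and of the candidate covered by the alignment can be derived from those columns. The helper name `span_coverage` is hypothetical and this is not GapMind's own code.

```python
# Values copied from the hmmsearch domain table above.
model_length = 744        # M=744 on the Query line
hmm_from, hmm_to = 3, 742
ali_from, ali_to = 2, 741
sequence_length = 741     # residues searched


def span_coverage(start, end, total):
    """Fraction of `total` positions covered by the inclusive range start..end."""
    return (end - start + 1) / total


model_coverage = span_coverage(hmm_from, hmm_to, model_length)
sequence_coverage = span_coverage(ali_from, ali_to, sequence_length)

print(f"model coverage    {100 * model_coverage:.1f}%")
print(f"sequence coverage {100 * sequence_coverage:.1f}%")
```

Both ranges span 740 positions, so the alignment covers nearly all of the model and nearly all of the candidate protein.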

This GapMind analysis is from Sep 24 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
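
The definition of a gap given above (a step with no high- or medium-confidence candidate) can be sketched as a small filter. This is an illustration under assumptions, not GapMind's implementation: the function name, the data layout, and the step names other than `icd` are hypothetical.

```python
from typing import Dict, List


def find_gaps(candidates_by_step: Dict[str, List[str]]) -> List[str]:
    """Return the steps that have no high- or medium-confidence candidate."""
    gaps = []
    for step, confidences in candidates_by_step.items():
        if not any(level in ("high", "medium") for level in confidences):
            gaps.append(step)
    return gaps


# Hypothetical input: confidence levels of the candidates found for each step.
example = {
    "icd": ["high"],        # step from this page, with a high-confidence candidate
    "stepA": ["low"],       # hypothetical step with only a low-confidence candidate
    "stepB": [],            # hypothetical step with no candidates at all
}

print(find_gaps(example))
```

With this input, `stepA` and `stepB` are reported as gaps, because a low-confidence candidate alone does not fill a step.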

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory