GapMind for catabolism of small carbon sources

 

Alignments for a candidate for icd in Mariniradius saccharolyticus AK6

Align Isocitrate dehydrogenase [NADP]; IDH; Oxalosuccinate decarboxylase; EC 1.1.1.42 (characterized)
to candidate WP_008629425.1 C943_RS15090 NADP-dependent isocitrate dehydrogenase

Query= SwissProt::P16100
         (741 letters)



>NCBI__GCF_000330725.2:WP_008629425.1
          Length = 742

 Score = 1056 bits (2731), Expect = 0.0
 Identities = 522/735 (71%), Positives = 607/735 (82%)

Query: 3   TPKIIYTLTDEAPALATYSLLPIIKAFTGSSGIAVETRDISLAGRLIATFPEYLTDTQKI 62
           T KI+YTLTDEAPALATYSLLPI+KAFT S+GIAVETRDISL+GR+IA FPE+L   Q+I
Sbjct: 5   TAKILYTLTDEAPALATYSLLPIVKAFTHSAGIAVETRDISLSGRIIALFPEFLKPQQRI 64

Query: 63  SDDLAELGKLATTPDANIIKLPNISASVPQLKAAIKELQQQGYKLPDYPEEPKTDTEKDV 122
           SDDLAELG++A  PDANI+KLPNISAS+PQLKAAIKELQ QGY LPDYP++PK++ EKD+
Sbjct: 65  SDDLAELGEIAKQPDANIVKLPNISASIPQLKAAIKELQAQGYALPDYPDDPKSEQEKDI 124

Query: 123 KARYDKIKGSAVNPVLREGNSDRRAPLSVKNYARKHPHKMGAWSADSKSHVAHMDNGDFY 182
           K RYDK+KGSAVNPVLREGNSDRRAPL+VK YA+ +PH MG WS+DSK+HVA M  GDFY
Sbjct: 125 KGRYDKVKGSAVNPVLREGNSDRRAPLAVKQYAKNNPHSMGKWSSDSKTHVASMSRGDFY 184

Query: 183 GSEKAALIGAPGSVKIELIAKDGSSTVLKAKTSVQAGEIIDSSVMSKNALRNFIAAEIED 242
           GSEK+  I +   V I+L    G+   LK   ++QA E+ID++V+S   L  F+  +  D
Sbjct: 185 GSEKSVTIASATKVSIDLHTSAGTKISLKENLALQAEEVIDAAVLSVADLIEFLQQQKAD 244

Query: 243 AKKQGVLLSVHLKATMMKVSDPIMFGQIVSEFYKDALTKHAEVLKQIGFDVNNGIGDLYA 302
           AKKQGVL S+H+KATMMKVSDPI+FG  V  F+     KH  V ++IG DVNNG GDL A
Sbjct: 245 AKKQGVLFSLHMKATMMKVSDPIIFGHAVKVFFAQLFEKHQAVFEKIGVDVNNGFGDLIA 304

Query: 303 RIKTLPEAKQKEIEADIQAVYAQRPQLAMVNSDKGITNLHVPSDVIVDASMPAMIRDSGK 362
           +++TLPE+++K IE+DI+A  A+ P LAMVNSDKGITNLHVPSDVI+DASMPAMIR SG+
Sbjct: 305 KMQTLPESERKAIESDIEATLAEGPDLAMVNSDKGITNLHVPSDVIIDASMPAMIRSSGQ 364

Query: 363 MWGPDGKLHDTKAVIPDRCYAGVYQVVIEDCKQHGAFDPTTMGSVPNVGLMAQKAEEYGS 422
           MW   GKL DTKAVIPDRCYAGVY+  I  CK+HGAFDP TMGSVPNVGLMAQKAEEYGS
Sbjct: 365 MWNKAGKLQDTKAVIPDRCYAGVYEETINFCKKHGAFDPATMGSVPNVGLMAQKAEEYGS 424

Query: 423 HDKTFQIPADGVVRVTDESGKLLLEQSVEAGDIWRMCQAKDAPIQDWVKLAVNRARATNT 482
           HDKTF+IPADG V VTD SG++L+   V+ GDIWRMCQ KDAPIQDWVKLAVNRARAT  
Sbjct: 425 HDKTFEIPADGTVTVTDSSGEVLISHDVKKGDIWRMCQTKDAPIQDWVKLAVNRARATGV 484

Query: 483 PAVFWLDPARAHDAQVIAKVERYLKDYDTSGLDIRILSPVEATRFSLARIREGKDTISVT 542
           PAVFWLD  RAHDA++I KV++YL D+DT GLDIRILSPV+ATR SL RI+EGKDTISVT
Sbjct: 485 PAVFWLDEKRAHDAELIKKVKKYLGDHDTKGLDIRILSPVDATRLSLERIKEGKDTISVT 544

Query: 543 GNVLRDYLTDLFPIMELGTSAKMLSIVPLMSGGGLFETGAGGSAPKHVQQFLEEGYLRWD 602
           GNVLRDYLTDLFPI+ELGTSAKMLSIVPLM+GGGLFETGAGGSAPKHV+QFLEE +LRWD
Sbjct: 545 GNVLRDYLTDLFPILELGTSAKMLSIVPLMNGGGLFETGAGGSAPKHVEQFLEENHLRWD 604

Query: 603 SLGEFLALAASLEHLGNAYKNPKALVLASTLDQATGKILDNNKSPARKVGEIDNRGSHFY 662
           SLGEFLALA SLEHL + + N +A VLA TLDQ TGK L+ +KSP+R   E+DNRGSHFY
Sbjct: 605 SLGEFLALAVSLEHLADTFHNDRAKVLAETLDQGTGKFLEQDKSPSRNAKELDNRGSHFY 664

Query: 663 LALYWAQALAAQTEDKELQAQFTGIAKALTDNETKIVGELAAAQGKPVDIAGYYHPNTDL 722
           LA+YWA+ALA Q +D EL+ +FT IAKAL +NE KIV EL  AQG  +DI GYY P+   
Sbjct: 665 LAMYWAEALAGQDKDLELKTKFTPIAKALQENEQKIVSELNEAQGVKMDIGGYYKPDESK 724

Query: 723 TSKAIRPSATFNAAL 737
           TS A+RPS TFNA L
Sbjct: 725 TSAAMRPSPTFNAIL 739


Lambda     K      H
   0.315    0.131    0.374 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 1488
Number of extensions: 55
Number of successful extensions: 1
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 741
Length of database: 742
Length adjustment: 40
Effective length of query: 701
Effective length of database: 702
Effective search space:   492102
Effective search space used:   492102
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.5 bits)
S2: 55 (25.8 bits)
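
The summary figures in the BLAST report above can be re-derived from the numbers it prints. The following is a minimal Python sketch and is not part of GapMind; the variable names are illustrative. The bit-score conversion assumes the standard Karlin-Altschul relation, bits = (lambda * S - ln K) / ln 2, applied with the gapped Lambda and K shown in the report. Because the report prints those parameters rounded, the recomputed bit score should agree only approximately.

```python
import math

# Values copied from the BLAST report above.
query_len = 741                 # "Length of query"
subject_len = 742               # "Length of database" (a single sequence)
length_adjustment = 40          # "Length adjustment"
identities, positives, aligned_cols = 522, 607, 735
query_start, query_end = 3, 737 # first and last aligned query positions
raw_score = 2731                # raw score in parentheses after the bit score
gapped_lambda, gapped_k = 0.267, 0.0410


def percent(part, whole):
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


# Alignment-level summaries.
identity_pct = percent(identities, aligned_cols)
positive_pct = percent(positives, aligned_cols)
query_coverage_pct = percent(query_end - query_start + 1, query_len)

# Effective lengths subtract the length adjustment from each sequence.
effective_query = query_len - length_adjustment
effective_db = subject_len - length_adjustment
effective_search_space = effective_query * effective_db

# Raw score to bit score (assumed Karlin-Altschul conversion, rounded inputs).
bit_score = (gapped_lambda * raw_score - math.log(gapped_k)) / math.log(2)

print(f"identity:        {identity_pct:.1f}%")
print(f"positives:       {positive_pct:.1f}%")
print(f"query coverage:  {query_coverage_pct:.1f}%")
print(f"search space:    {effective_search_space}")
print(f"bit score:       {bit_score:.1f}")
```

The effective search space computed this way matches the 492102 printed in the report, and the recomputed bit score lands within one bit of the reported 1056. The report shows positives as 82% although 607/735 is closer to 82.6%, which suggests the printed percentages are truncated rather than rounded.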

Align candidate WP_008629425.1 C943_RS15090 (NADP-dependent isocitrate dehydrogenase)
to HMM TIGR00178 (isocitrate dehydrogenase, NADP-dependent (EC 1.1.1.42))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR00178.hmm
# target sequence database:        /tmp/gapView.1606341.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR00178  [M=744]
Accession:   TIGR00178
Description: monomer_idh: isocitrate dehydrogenase, NADP-dependent
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                             Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                             -----------
          0 1329.3   2.3          0 1329.1   2.3    1.0  1  NCBI__GCF_000330725.2:WP_008629425.1  


Domain annotation for each sequence (and alignments):
>> NCBI__GCF_000330725.2:WP_008629425.1  
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 ! 1329.1   2.3         0         0       1     741 [.       1     741 [.       1     742 [] 1.00

  Alignments for each domain:
  == domain 1  score: 1329.1 bits;  conditional E-value: 0
                             TIGR00178   1 mstekakiiytltdeapllatysllpivkafaasaGievetrdislagrilaefpeylteeqkvddalaelGe 73 
                                           m+ ++aki+ytltdeap+latysllpivkaf++saGi+vetrdisl+gri+a fpe+l  +q+++d+laelGe
  NCBI__GCF_000330725.2:WP_008629425.1   1 MTDKTAKILYTLTDEAPALATYSLLPIVKAFTHSAGIAVETRDISLSGRIIALFPEFLKPQQRISDDLAELGE 73 
                                           566789******************************************************************* PP

                             TIGR00178  74 laktpeaniiklpnisasvpqlkaaikelqdkGydlpdypeepktdeekdikaryakikGsavnpvlreGnsd 146
                                           +ak p+ani+klpnisas+pqlkaaikelq++Gy+lpdyp++pk+++ekdik ry+k+kGsavnpvlreGnsd
  NCBI__GCF_000330725.2:WP_008629425.1  74 IAKQPDANIVKLPNISASIPQLKAAIKELQAQGYALPDYPDDPKSEQEKDIKGRYDKVKGSAVNPVLREGNSD 146
                                           ************************************************************************* PP

                             TIGR00178 147 rraplavkeyarkhphkmGewsadskshvahmdagdfyaseksvlldaaeevkieliakdGketvlkaklkll 219
                                           rraplavk+ya+++ph+mG+ws dsk+hva m+ gdfy+seksv++ +a++v i+l +  G+++ lk++l l+
  NCBI__GCF_000330725.2:WP_008629425.1 147 RRAPLAVKQYAKNNPHSMGKWSSDSKTHVASMSRGDFYGSEKSVTIASATKVSIDLHTSAGTKISLKENLALQ 219
                                           ************************************************************************* PP

                             TIGR00178 220 dgevidssvlskkalvefleeeiedakeegvllslhlkatmmkvsdpivfGhvvrvfykdvfakhaelleqlG 292
                                           ++evid++vls   l efl+++ +dak++gvl+slh+katmmkvsdpi+fGh+v+vf+++ f kh++++e++G
  NCBI__GCF_000330725.2:WP_008629425.1 220 AEEVIDAAVLSVADLIEFLQQQKADAKKQGVLFSLHMKATMMKVSDPIIFGHAVKVFFAQLFEKHQAVFEKIG 292
                                           ************************************************************************* PP

                             TIGR00178 293 ldvenGladlyakieslpaakkeeieadlekvyeerpelamvdsdkGitnlhvpsdvivdasmpamirasGkm 365
                                           +dv+nG++dl+ak+++lp+++++ ie+d+e++++e+p+lamv+sdkGitnlhvpsdvi+dasmpamir+sG+m
  NCBI__GCF_000330725.2:WP_008629425.1 293 VDVNNGFGDLIAKMQTLPESERKAIESDIEATLAEGPDLAMVNSDKGITNLHVPSDVIIDASMPAMIRSSGQM 365
                                           ************************************************************************* PP

                             TIGR00178 366 ygkdgklkdtkavipdssyagvyqaviedckknGafdpttmGtvpnvGlmaqkaeeyGshdktfeieadGvvr 438
                                           ++k gkl+dtkavipd++yagvy++ i++ckk+Gafdp+tmG+vpnvGlmaqkaeeyGshdktfei+adG+v 
  NCBI__GCF_000330725.2:WP_008629425.1 366 WNKAGKLQDTKAVIPDRCYAGVYEETINFCKKHGAFDPATMGSVPNVGLMAQKAEEYGSHDKTFEIPADGTVT 438
                                           ************************************************************************* PP

                             TIGR00178 439 vvdssGevlleeeveagdiwrmcqvkdapiqdwvklavtrarlsgtpavfwldperahdeelikkvekylkdh 511
                                           v+dssGevl+ ++v++gdiwrmcq kdapiqdwvklav+rar++g+pavfwld++rahd+elikkv+kyl dh
  NCBI__GCF_000330725.2:WP_008629425.1 439 VTDSSGEVLISHDVKKGDIWRMCQTKDAPIQDWVKLAVNRARATGVPAVFWLDEKRAHDAELIKKVKKYLGDH 511
                                           ************************************************************************* PP

                             TIGR00178 512 dteGldiqilspvkatrfslerirrGedtisvtGnvlrdyltdlfpilelGtsakmlsvvplmaGGGlfetGa 584
                                           dt+Gldi+ilspv+atr+sleri++G+dtisvtGnvlrdyltdlfpilelGtsakmls+vplm+GGGlfetGa
  NCBI__GCF_000330725.2:WP_008629425.1 512 DTKGLDIRILSPVDATRLSLERIKEGKDTISVTGNVLRDYLTDLFPILELGTSAKMLSIVPLMNGGGLFETGA 584
                                           ************************************************************************* PP

                             TIGR00178 585 GGsapkhvqqleeenhlrwdslGeflalaaslehvavktgnekakvladtldaatgklldeekspsrkvGeld 657
                                           GGsapkhv+q+ eenhlrwdslGeflala sleh+a +  n++akvla+tld+ tgk+l+++kspsr   eld
  NCBI__GCF_000330725.2:WP_008629425.1 585 GGSAPKHVEQFLEENHLRWDSLGEFLALAVSLEHLADTFHNDRAKVLAETLDQGTGKFLEQDKSPSRNAKELD 657
                                           ************************************************************************* PP

                             TIGR00178 658 nrgskfylakywaqelaaqtedkelaasfasvaealtkneekivaelaavqGeavdlgGyyapdtdlttkvlr 730
                                           nrgs+fyla+ywa++la q +d el+ +f+++a+al++ne+kiv+el+++qG  +d+gGyy+pd+ +t++++r
  NCBI__GCF_000330725.2:WP_008629425.1 658 NRGSHFYLAMYWAEALAGQDKDLELKTKFTPIAKALQENEQKIVSELNEAQGVKMDIGGYYKPDESKTSAAMR 730
                                           ************************************************************************* PP

                             TIGR00178 731 psatfnailea 741
                                           ps tfnaile 
  NCBI__GCF_000330725.2:WP_008629425.1 731 PSPTFNAILER 741
                                           ********985 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (744 nodes)
Target sequences:                          1  (742 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.01u 0.00s 00:00:00.01 Elapsed: 00:00:00.01
# Mc/sec: 42.51
//
[ok]
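
How much of the TIGR00178 model and of the candidate protein the domain hit spans can be read off the domain table row above. The sketch below is an illustration only and is not GapMind code; the helper names `parse_domain_row` and `span_fraction` are hypothetical, and the field positions assume the column order shown in the table header (score, bias, c-Evalue, i-Evalue, hmmfrom, hmm to, alifrom, ali to, envfrom, env to, acc).

```python
# Domain-table row copied verbatim from the hmmsearch output above.
DOMAIN_ROW = (
    "   1 ! 1329.1   2.3         0         0       1     741 [."
    "       1     741 [.       1     742 [] 1.00"
)
MODEL_LEN = 744   # from "Query: TIGR00178 [M=744]"
TARGET_LEN = 742  # from "Target sequences: 1 (742 residues searched)"


def parse_domain_row(row):
    """Split one domain-table row into named fields (hypothetical helper)."""
    f = row.split()
    return {
        "score": float(f[2]),
        "hmm_from": int(f[6]),
        "hmm_to": int(f[7]),
        "ali_from": int(f[9]),
        "ali_to": int(f[10]),
        "env_from": int(f[12]),
        "env_to": int(f[13]),
        "acc": float(f[15]),
    }


def span_fraction(start, end, total):
    """Fraction of a sequence or model covered by the inclusive span."""
    return (end - start + 1) / total


domain = parse_domain_row(DOMAIN_ROW)
model_coverage = span_fraction(domain["hmm_from"], domain["hmm_to"], MODEL_LEN)
target_coverage = span_fraction(domain["ali_from"], domain["ali_to"], TARGET_LEN)

print(f"domain score:    {domain['score']}")
print(f"model coverage:  {model_coverage:.1%}")
print(f"target coverage: {target_coverage:.1%}")
```

For this hit the alignment covers 741 of the 744 model positions and 741 of the 742 residues in the candidate, so both fractions are above 99%.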

This GapMind analysis is from Sep 24 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory