GapMind for catabolism of small carbon sources

 

Alignments for a candidate for icd in Herbaspirillum seropedicae SmR1

Align isocitrate dehydrogenase (NADP+) (EC 1.1.1.42) (characterized)
to candidate HSERO_RS12570 HSERO_RS12570 isocitrate dehydrogenase

Query= BRENDA::O53611
         (745 letters)



>FitnessBrowser__HerbieS:HSERO_RS12570
          Length = 742

 Score = 1111 bits (2873), Expect = 0.0
 Identities = 546/738 (73%), Positives = 630/738 (85%), Gaps = 1/738 (0%)

Query: 6   PTIIYTLTDEAPLLATYAFLPIVRAFAEPAGIKIEASDISVAARILAEFPDYLTEEQRVP 65
           PTIIYTLTDEAP LAT++ LPIV+ F   AGI +  SDISVAARILAEFP+YLT+EQ+VP
Sbjct: 6   PTIIYTLTDEAPYLATHSLLPIVKKFTAAAGIDVVESDISVAARILAEFPEYLTDEQKVP 65

Query: 66  DNLAELGRLTQLPDTNIIKLPNISASVPQLVAAIKELQDKGYAVPDYPADPKTDQEKAIK 125
           DNLA LG LTQ PD NIIKLPNISAS+ QL AAI+ELQ +GY +PDYP +P T++EKA++
Sbjct: 66  DNLAALGALTQSPDANIIKLPNISASILQLQAAIRELQSRGYKLPDYPENPTTEEEKALQ 125

Query: 126 ERYARCLGSAVNPVLRQGNSDRRAPKAVKEYARKHPHSMGEWSMASRTHVAHMRHGDFYA 185
           +RYA+  GSAVNPVLR+GNSDRRAP AVK YARKHPHSM +WSMASRTHVAHM  GDFYA
Sbjct: 126 KRYAKVTGSAVNPVLREGNSDRRAPNAVKNYARKHPHSMAKWSMASRTHVAHMHGGDFYA 185

Query: 186 GEKSMTLDRARNVRMELLAKSGKTIVLKPEVPLDDGDVIDSMFMSKKALCDFYEEQMQDA 245
           GEKS+TLD+A +V+M+L+ KSG+TIVLKP++ L  G++IDSMFMSKKALC +YEEQMQDA
Sbjct: 186 GEKSLTLDKAVDVKMDLVTKSGETIVLKPKISLQAGEIIDSMFMSKKALCAYYEEQMQDA 245

Query: 246 FETGVMFSLHVKATMMKVSHPIVFGHAVRIFYKDAFAKHQELFDDLGVNVNNGLSDLYSK 305
           FET ++ S+HVKATMMKVSHPIVFGHAVR +YKD F KH +LF ++GVN NNG+S +Y K
Sbjct: 246 FETDLLLSVHVKATMMKVSHPIVFGHAVRTYYKDTFTKHAKLFAEIGVNPNNGMSGVYEK 305

Query: 306 IESLPASQRDEIIEDLHRCHEHRPELAMVDSARGISNFHSPSDVIVDASMPAMIRAGGKM 365
           I +LP +++ E++ DL      RP LAMVDSA+GI+N H+P+DVIVDASMPAMIRAGGKM
Sbjct: 306 INTLPDAKKAEVLADLKADEAKRPRLAMVDSAKGITNLHAPNDVIVDASMPAMIRAGGKM 365

Query: 366 YGADGKLKDTKAVNPESTFSRIYQEIINFCKTNGQFDPTTMGTVPNVGLMAQQAEEYGSH 425
           + A GK +DTKA+ PESTF+RIYQE+INFCKTNG FDPTTMGTVPNVGLMAQ+AEEYGSH
Sbjct: 366 WNAAGKTEDTKALLPESTFARIYQEMINFCKTNGAFDPTTMGTVPNVGLMAQKAEEYGSH 425

Query: 426 DKTFEIPEDGVANIVDVATGEVLLTENVEAGDIWRMCIVKDAPIRDWVKLAVTRARISGM 485
           DKTFEI + GVA IV +  G+VLL +NVE GDIWRMC VKDA +RDWVKLAVTRAR+SGM
Sbjct: 426 DKTFEIAQAGVARIVTL-DGQVLLEQNVEEGDIWRMCQVKDAAVRDWVKLAVTRARLSGM 484

Query: 486 PVLFWLDPYRPHENELIKKVKTYLKDHDTEGLDIQIMSQVRSMRYTCERLVRGLDTIAAT 545
           P +FWLDPYRPHE ELIKKV TYLKDHD  G DIQIMSQVR+MRYT ER++RGLDTI+ T
Sbjct: 485 PAVFWLDPYRPHEAELIKKVNTYLKDHDLTGADIQIMSQVRAMRYTLERVIRGLDTISVT 544

Query: 546 GNILRDYLTDLFPILELGTSAKMLSVVPLMAGGGMYETGAGGSAPKHVKQLVEENHLRWD 605
           GNILRDYLTDLFPI+ELGTSAKMLS+VPLMAGGGM+ETGAGGSAPKHV+QLV ENHLRWD
Sbjct: 545 GNILRDYLTDLFPIMELGTSAKMLSIVPLMAGGGMFETGAGGSAPKHVQQLVGENHLRWD 604

Query: 606 SLGEFLALGAGFEDIGIKTGNERAKLLGKTLDAAIGKLLDNDKSPSRKTGELDNRGSQFY 665
           SLGEFLAL    E++GIKTGN +AK+L KTLDAA GKLLDN+KSPS KTGELDNRGS FY
Sbjct: 605 SLGEFLALAVSIEEVGIKTGNSKAKILAKTLDAATGKLLDNNKSPSPKTGELDNRGSHFY 664

Query: 666 LAMYWAQELAAQTDDQQLAEHFASLADVLTKNEDVIVRELTEVQGEPVDIGGYYAPDSDM 725
           LA+YWAQELAAQ DD +LA+ FA LA  L +NE  IV EL  VQG+ VDIGGYY PD + 
Sbjct: 665 LALYWAQELAAQKDDAELAKQFAPLAKALAENEQKIVEELNSVQGKQVDIGGYYLPDREK 724

Query: 726 TTAVMRPSKTFNAALEAV 743
           T AVMRPS T N ALE +
Sbjct: 725 TFAVMRPSATLNQALETL 742


Lambda     K      H
   0.317    0.134    0.388 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 1513
Number of extensions: 37
Number of successful extensions: 2
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 745
Length of database: 742
Length adjustment: 40
Effective length of query: 705
Effective length of database: 702
Effective search space:   494910
Effective search space used:   494910
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 55 (25.8 bits)
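
The bit score and effective search space reported above can be reproduced from the other values in this report. The following is a minimal sketch, assuming the standard Karlin-Altschul conversion from raw score to bits, (Lambda * S - ln K) / ln 2, with the gapped Lambda and K. The function names are illustrative and are not part of BLAST or GapMind. Because Lambda is printed to only three digits, the result agrees with the reported score only approximately.

```python
import math

def bit_score(raw_score, lam, k):
    """Convert a raw alignment score to bits: (lambda * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)

def effective_search_space(query_len, db_len, length_adjustment, num_seqs=1):
    """Multiply the effective query length by the effective database length."""
    eff_query = query_len - length_adjustment
    eff_db = db_len - num_seqs * length_adjustment
    return eff_query * eff_db

# Values copied from the report: raw score 2873, gapped Lambda and K,
# query length 745, database length 742, length adjustment 40.
bits = bit_score(2873, lam=0.267, k=0.0410)
space = effective_search_space(745, 742, 40)

print(f"bit score ~ {bits:.1f}")        # close to the reported 1111 bits
print(f"effective search space = {space}")  # 705 * 702 = 494910
```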

Align candidate HSERO_RS12570 HSERO_RS12570 (isocitrate dehydrogenase)
to HMM TIGR00178 (isocitrate dehydrogenase, NADP-dependent (EC 1.1.1.42))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR00178.hmm
# target sequence database:        /tmp/gapView.24210.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR00178  [M=744]
Accession:   TIGR00178
Description: monomer_idh: isocitrate dehydrogenase, NADP-dependent
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                                  Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                                  -----------
          0 1341.9   2.8          0 1341.7   2.8    1.0  1  lcl|FitnessBrowser__HerbieS:HSERO_RS12570  HSERO_RS12570 isocitrate dehydro


Domain annotation for each sequence (and alignments):
>> lcl|FitnessBrowser__HerbieS:HSERO_RS12570  HSERO_RS12570 isocitrate dehydrogenase
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 ! 1341.7   2.8         0         0       3     742 ..       3     742 .]       1     742 [] 1.00

  Alignments for each domain:
  == domain 1  score: 1341.7 bits;  conditional E-value: 0
                                  TIGR00178   3 tekakiiytltdeapllatysllpivkafaasaGievetrdislagrilaefpeylteeqkvddalae 70 
                                                t +++iiytltdeap lat+sllpivk f+a+aGi+v  +dis+a+rilaefpeylt+eqkv+d+la 
  lcl|FitnessBrowser__HerbieS:HSERO_RS12570   3 TGNPTIIYTLTDEAPYLATHSLLPIVKKFTAAAGIDVVESDISVAARILAEFPEYLTDEQKVPDNLAA 70 
                                                7789**************************************************************** PP

                                  TIGR00178  71 lGelaktpeaniiklpnisasvpqlkaaikelqdkGydlpdypeepktdeekdikaryakikGsavnp 138
                                                lG l++ p+aniiklpnisas+ ql+aai+elq++Gy+lpdype+p+t+eek++++ryak+ Gsavnp
  lcl|FitnessBrowser__HerbieS:HSERO_RS12570  71 LGALTQSPDANIIKLPNISASILQLQAAIRELQSRGYKLPDYPENPTTEEEKALQKRYAKVTGSAVNP 138
                                                ******************************************************************** PP

                                  TIGR00178 139 vlreGnsdrraplavkeyarkhphkmGewsadskshvahmdagdfyaseksvlldaaeevkieliakd 206
                                                vlreGnsdrrap avk+yarkhph+m +ws++s++hvahm+ gdfya+eks++ld+a +vk++l++k+
  lcl|FitnessBrowser__HerbieS:HSERO_RS12570 139 VLREGNSDRRAPNAVKNYARKHPHSMAKWSMASRTHVAHMHGGDFYAGEKSLTLDKAVDVKMDLVTKS 206
                                                ******************************************************************** PP

                                  TIGR00178 207 GketvlkaklklldgevidssvlskkalvefleeeiedakeegvllslhlkatmmkvsdpivfGhvvr 274
                                                G+++vlk+k++l++ge+ids+++skkal++++ee+++da+e+++lls+h+katmmkvs+pivfGh+vr
  lcl|FitnessBrowser__HerbieS:HSERO_RS12570 207 GETIVLKPKISLQAGEIIDSMFMSKKALCAYYEEQMQDAFETDLLLSVHVKATMMKVSHPIVFGHAVR 274
                                                ******************************************************************** PP

                                  TIGR00178 275 vfykdvfakhaelleqlGldvenGladlyakieslpaakkeeieadlekvyeerpelamvdsdkGitn 342
                                                ++ykd+f+kha+l+ ++G++ +nG++ +y ki++lp+akk e+ adl++  ++rp lamvds+kGitn
  lcl|FitnessBrowser__HerbieS:HSERO_RS12570 275 TYYKDTFTKHAKLFAEIGVNPNNGMSGVYEKINTLPDAKKAEVLADLKADEAKRPRLAMVDSAKGITN 342
                                                ******************************************************************** PP

                                  TIGR00178 343 lhvpsdvivdasmpamirasGkmygkdgklkdtkavipdssyagvyqaviedckknGafdpttmGtvp 410
                                                lh+p dvivdasmpamira+Gkm+++ gk +dtka+ p+s++a++yq++i++ck+nGafdpttmGtvp
  lcl|FitnessBrowser__HerbieS:HSERO_RS12570 343 LHAPNDVIVDASMPAMIRAGGKMWNAAGKTEDTKALLPESTFARIYQEMINFCKTNGAFDPTTMGTVP 410
                                                ******************************************************************** PP

                                  TIGR00178 411 nvGlmaqkaeeyGshdktfeieadGvvrvvdssGevlleeeveagdiwrmcqvkdapiqdwvklavtr 478
                                                nvGlmaqkaeeyGshdktfei ++Gv+r+v  +G+vlle++ve+gdiwrmcqvkda ++dwvklavtr
  lcl|FitnessBrowser__HerbieS:HSERO_RS12570 411 NVGLMAQKAEEYGSHDKTFEIAQAGVARIVTLDGQVLLEQNVEEGDIWRMCQVKDAAVRDWVKLAVTR 478
                                                ******************************************************************** PP

                                  TIGR00178 479 arlsgtpavfwldperahdeelikkvekylkdhdteGldiqilspvkatrfslerirrGedtisvtGn 546
                                                arlsg+pavfwldp+r+h++elikkv++ylkdhd +G diqi+s+v+a+r++ler+ rG dtisvtGn
  lcl|FitnessBrowser__HerbieS:HSERO_RS12570 479 ARLSGMPAVFWLDPYRPHEAELIKKVNTYLKDHDLTGADIQIMSQVRAMRYTLERVIRGLDTISVTGN 546
                                                ******************************************************************** PP

                                  TIGR00178 547 vlrdyltdlfpilelGtsakmlsvvplmaGGGlfetGaGGsapkhvqqleeenhlrwdslGeflalaa 614
                                                +lrdyltdlfpi+elGtsakmls+vplmaGGG+fetGaGGsapkhvqql+ enhlrwdslGeflala 
  lcl|FitnessBrowser__HerbieS:HSERO_RS12570 547 ILRDYLTDLFPIMELGTSAKMLSIVPLMAGGGMFETGAGGSAPKHVQQLVGENHLRWDSLGEFLALAV 614
                                                ******************************************************************** PP

                                  TIGR00178 615 slehvavktgnekakvladtldaatgklldeekspsrkvGeldnrgskfylakywaqelaaqtedkel 682
                                                s+e+v++ktgn+kak+la+tldaatgklld++ksps k+Geldnrgs+fyla ywaqelaaq +d+el
  lcl|FitnessBrowser__HerbieS:HSERO_RS12570 615 SIEEVGIKTGNSKAKILAKTLDAATGKLLDNNKSPSPKTGELDNRGSHFYLALYWAQELAAQKDDAEL 682
                                                ******************************************************************** PP

                                  TIGR00178 683 aasfasvaealtkneekivaelaavqGeavdlgGyyapdtdlttkvlrpsatfnaileal 742
                                                a++fa++a+al++ne+kiv+el++vqG+ vd+gGyy pd ++t +v+rpsat+n++le+l
  lcl|FitnessBrowser__HerbieS:HSERO_RS12570 683 AKQFAPLAKALAENEQKIVEELNSVQGKQVDIGGYYLPDREKTFAVMRPSATLNQALETL 742
                                                *********************************************************976 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (744 nodes)
Target sequences:                          1  (742 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.04u 0.02s 00:00:00.06 Elapsed: 00:00:00.04
# Mc/sec: 11.69
//
[ok]
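
The domain row above gives the coordinates needed to see how much of the model and of the candidate the alignment spans. The following is a minimal sketch that splits that row on whitespace and computes both fractions. The helper names are hypothetical, and the field positions are an assumption based on the column header printed in this report.

```python
def parse_domain_row(row):
    """Extract score and coordinates from one hmmsearch domain annotation row."""
    fields = row.split()
    return {
        "score": float(fields[2]),
        "hmm_from": int(fields[6]),
        "hmm_to": int(fields[7]),
        "ali_from": int(fields[9]),
        "ali_to": int(fields[10]),
    }

def span_coverage(start, end, total):
    """Fraction of a sequence or model covered by an inclusive range."""
    return (end - start + 1) / total

# Domain row copied from the report; model length 744 and
# candidate length 742 are also taken from the report.
row = ("   1 ! 1341.7   2.8         0         0       3     742 .."
       "       3     742 .]       1     742 [] 1.00")
dom = parse_domain_row(row)
model_cov = span_coverage(dom["hmm_from"], dom["hmm_to"], 744)  # 740 of 744 positions
seq_cov = span_coverage(dom["ali_from"], dom["ali_to"], 742)    # 740 of 742 residues

print(f"model coverage {model_cov:.3f}, sequence coverage {seq_cov:.3f}")
```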

This GapMind analysis is from Sep 17 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."
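
As an illustration of the quantities these rules depend on, the sketch below computes percent identity and coverage for the BLAST alignment shown earlier on this page and checks it against the 50% coverage floor stated for low-confidence hits. It is a sketch under assumptions: the function names are hypothetical, coverage is taken here over the characterized query sequence, and this is not GapMind's own code.

```python
def percent_identity(identities, alignment_length):
    """Identical positions as a percentage of the aligned length."""
    return 100.0 * identities / alignment_length

def coverage(start, end, total_length):
    """Fraction of a sequence covered by an inclusive aligned range."""
    return (end - start + 1) / total_length

# Values from the BLAST alignment: 546 identities over 738 aligned positions,
# query residues 6..743 of 745.
identity = percent_identity(546, 738)
query_cov = coverage(6, 743, 745)

# The only threshold used here is the 50% coverage floor stated in the text.
passes_coverage_floor = query_cov >= 0.5

print(f"identity {identity:.1f}%, coverage {query_cov:.1%}, "
      f"passes 50% floor: {passes_coverage_floor}")
```

BLAST reports the identity as 73% because it truncates rather than rounds the percentage.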

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory