GapMind for catabolism of small carbon sources

 

Alignments for a candidate for icd in Marinobacter adhaerens HP15

Align isocitrate dehydrogenase (NADP+) (EC 1.1.1.42) (characterized)
to candidate GFF3859 HP15_3800 isocitrate dehydrogenase, NADP-dependent

Query= BRENDA::O53611
         (745 letters)



>FitnessBrowser__Marino:GFF3859
          Length = 747

 Score = 1024 bits (2647), Expect = 0.0
 Identities = 501/741 (67%), Positives = 599/741 (80%), Gaps = 1/741 (0%)

Query: 1   MSAEQPTIIYTLTDEAPLLATYAFLPIVRAFAEPAGIKIEASDISVAARILAEFPDYLTE 60
           M++ +  I+YTLTDEAP LAT + LPI+  +A+PAGI+ E SDIS+AARILA FPDYL E
Sbjct: 1   MTSSKAKIVYTLTDEAPALATRSLLPILETYAKPAGIEFETSDISLAARILANFPDYLEE 60

Query: 61  EQRVPDNLAELGRLTQLPDTNIIKLPNISASVPQLVAAIKELQDKGYAVPDYPADPKTDQ 120
           +QRVPD LAELG  T+ PD NIIKLPNISAS+PQL AAIKEL ++GY VP+Y  +P+ D+
Sbjct: 61  DQRVPDALAELGEYTKDPDANIIKLPNISASIPQLRAAIKELNEQGYNVPEYKENPENDE 120

Query: 121 EKAIKERYARCLGSAVNPVLRQGNSDRRAPKAVKEYARKHPHSMGEWSMASRTHVAHMRH 180
           EK I+ RYA+ LGSAVNPVLR+GNSDRRAP AVK +ARK+PHSMGEWS ASRTHVAHMR 
Sbjct: 121 EKEIQSRYAKVLGSAVNPVLREGNSDRRAPTAVKAFARKYPHSMGEWSPASRTHVAHMRG 180

Query: 181 GDFYAGEKSMTLDRARNVRMELLAKSGKTIVLKPEVPLDDGDVIDSMFMSKKALCDFYEE 240
           GDFY+ E+S+TLD+A    +    K GK  VLK ++PL +G+V+D MFMSKKAL  F+E+
Sbjct: 181 GDFYSSEQSVTLDKATKANIVFENKQGKQTVLKSDLPLQEGEVLDGMFMSKKALVKFFED 240

Query: 241 QMQDAFETGVMFSLHVKATMMKVSHPIVFGHAVRIFYKDAFAKHQELFDDLGVNVNNGLS 300
            + D   TGVMFSLHVKATMMK+SHPIVFGHAV++FYKD F K+ ELFD++GVN NNGLS
Sbjct: 241 AIADCENTGVMFSLHVKATMMKISHPIVFGHAVKVFYKDLFDKYGELFDEIGVNPNNGLS 300

Query: 301 DLYSKIESLPASQRDEIIEDLHRCHEHRPELAMVDSARGISNFHSPSDVIVDASMPAMIR 360
            +  KI+ LP S++++I EDLH C+EHRPE+AMVDS +GI+N H PSDVIVDASMPAMIR
Sbjct: 301 SVVEKIKQLPESKQEQIQEDLHACYEHRPEIAMVDSVKGITNLHVPSDVIVDASMPAMIR 360

Query: 361 AGGKMYGADGKLKDTKAVNPESTFSRIYQEIINFCKTNGQFDPTTMGTVPNVGLMAQQAE 420
             GKM+  D KLKDTKAV PEST++ IYQE+INFCKT+G FDPTTMGTVPNVGLMAQ+AE
Sbjct: 361 NSGKMWARDNKLKDTKAVMPESTYATIYQEVINFCKTHGAFDPTTMGTVPNVGLMAQKAE 420

Query: 421 EYGSHDKTFEIPEDGVANIVDVATGEVLLTENVEAGDIWRMCIVKDAPIRDWVKLAVTRA 480
           EYGSHDKTFEI EDGV  +V    G VL   NVE GDIWR C  KD PIRDWVKLAV RA
Sbjct: 421 EYGSHDKTFEIKEDGVVRVV-AEDGTVLTEHNVEKGDIWRACQTKDLPIRDWVKLAVNRA 479

Query: 481 RISGMPVLFWLDPYRPHENELIKKVKTYLKDHDTEGLDIQIMSQVRSMRYTCERLVRGLD 540
           R +GMP +FWLD  R H+ +LI+KV TYLKDHDTEGLDI+IMS VR++R+T ERL+RGLD
Sbjct: 480 RATGMPAVFWLDDERAHDAQLIQKVNTYLKDHDTEGLDIRIMSPVRAIRWTMERLIRGLD 539

Query: 541 TIAATGNILRDYLTDLFPILELGTSAKMLSVVPLMAGGGMYETGAGGSAPKHVKQLVEEN 600
           TI+ TGN+LRDYLTDLFPILELGTSAKMLS+VPL+ GGG+YETGAGGSAPKHV+QL++EN
Sbjct: 540 TISVTGNVLRDYLTDLFPILELGTSAKMLSIVPLLNGGGLYETGAGGSAPKHVQQLIQEN 599

Query: 601 HLRWDSLGEFLALGAGFEDIGIKTGNERAKLLGKTLDAAIGKLLDNDKSPSRKTGELDNR 660
           HLRWDSLGEFLA     +++G K  NERA+LLG+TLD A  +LL+N++SPSR TGELDNR
Sbjct: 600 HLRWDSLGEFLATAVSLDELGEKQNNERARLLGQTLDKATERLLENNQSPSRVTGELDNR 659

Query: 661 GSQFYLAMYWAQELAAQTDDQQLAEHFASLADVLTKNEDVIVRELTEVQGEPVDIGGYYA 720
           GS F+LA YWA+ELA Q  D++L E F  L+  L +N+D I+ E+T VQG P DIGGYY 
Sbjct: 660 GSHFHLARYWAEELANQDSDKELKEFFTKLSAQLEENKDKILEEMTVVQGNPADIGGYYH 719

Query: 721 PDSDMTTAVMRPSKTFNAALE 741
           P  +    VM+PS T N  LE
Sbjct: 720 PPMEKVCEVMQPSATLNRILE 740
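
The percentages in the alignment header can be rechecked from the raw counts. The following is a minimal sketch, not part of GapMind or BLAST; the function name and dictionary keys are hypothetical. It assumes the printed percentages are truncated rather than rounded, which is consistent with the values shown above (501/741 is about 67.6% but is reported as 67%).

```python
def alignment_summary(identities, positives, gaps, aln_len,
                      q_start, q_end, q_len):
    """Recompute header percentages and query coverage (hypothetical helper)."""
    return {
        # Integer division truncates, matching the 67% / 80% / 0% shown above.
        "identity_pct": 100 * identities // aln_len,
        "positive_pct": 100 * positives // aln_len,
        "gap_pct": 100 * gaps // aln_len,
        # Fraction of the query spanned by the aligned region (1-based, inclusive).
        "query_coverage": (q_end - q_start + 1) / q_len,
    }


# Values copied from the report: Identities = 501/741, Positives = 599/741,
# Gaps = 1/741, query positions 1..741 of a 745-letter query.
summary = alignment_summary(501, 599, 1, 741, 1, 741, 745)
print(summary)
```

Coverage is computed on the query here; the same function could be called with the subject coordinates (1..740 of 747) to get subject coverage.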


Lambda     K      H
   0.317    0.134    0.388 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 1478
Number of extensions: 52
Number of successful extensions: 2
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 745
Length of database: 747
Length adjustment: 40
Effective length of query: 705
Effective length of database: 707
Effective search space:   498435
Effective search space used:   498435
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 55 (25.8 bits)
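
The statistics above are enough to reproduce the reported bit score and effective search space. This sketch uses the standard Karlin-Altschul relations (bit score = (lambda * S - ln K) / ln 2, E = search space * 2^-bits) with the gapped Lambda and K printed in the report. The function names are hypothetical, and the length-adjustment formula assumes the single-sequence database shown here.

```python
import math


def bit_score(raw_score, lam, k):
    """Convert a raw alignment score to bits using gapped Lambda and K."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def effective_search_space(query_len, db_len, length_adjustment, n_seqs=1):
    """Product of the effective query and database lengths."""
    return (query_len - length_adjustment) * (db_len - n_seqs * length_adjustment)


def expect_value(search_space, bits):
    """Expected number of chance hits scoring at least this many bits."""
    return search_space * 2.0 ** (-bits)


# Values copied from the report: raw score 2647, gapped Lambda 0.267, K 0.0410,
# query 745 letters, database 747 letters, length adjustment 40.
bits = bit_score(2647, 0.267, 0.0410)
space = effective_search_space(745, 747, 40)
print(round(bits), space, expect_value(space, bits))
```

Because Lambda and K are printed to only three significant figures, the recomputed bit score agrees with the report only after rounding to the nearest integer.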

Align candidate GFF3859 HP15_3800 (isocitrate dehydrogenase, NADP-dependent)
to HMM TIGR00178 (isocitrate dehydrogenase, NADP-dependent (EC 1.1.1.42))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR00178.hmm
# target sequence database:        /tmp/gapView.14741.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR00178  [M=744]
Accession:   TIGR00178
Description: monomer_idh: isocitrate dehydrogenase, NADP-dependent
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                           Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                           -----------
          0 1306.8   0.1          0 1306.6   0.1    1.0  1  lcl|FitnessBrowser__Marino:GFF3859  HP15_3800 isocitrate dehydrogena


Domain annotation for each sequence (and alignments):
>> lcl|FitnessBrowser__Marino:GFF3859  HP15_3800 isocitrate dehydrogenase, NADP-dependent
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 ! 1306.6   0.1         0         0       1     741 [.       1     741 [.       1     744 [. 1.00

  Alignments for each domain:
  == domain 1  score: 1306.6 bits;  conditional E-value: 0
                           TIGR00178   1 mstekakiiytltdeapllatysllpivkafaasaGievetrdislagrilaefpeylteeqkvddalaelGela 75 
                                         m+++kaki+ytltdeap+lat sllpi++++a++aGie et+disla+rila+fp+yl e+q+v+dalaelGe +
  lcl|FitnessBrowser__Marino:GFF3859   1 MTSSKAKIVYTLTDEAPALATRSLLPILETYAKPAGIEFETSDISLAARILANFPDYLEEDQRVPDALAELGEYT 75 
                                         67889********************************************************************** PP

                           TIGR00178  76 ktpeaniiklpnisasvpqlkaaikelqdkGydlpdypeepktdeekdikaryakikGsavnpvlreGnsdrrap 150
                                         k p+aniiklpnisas+pql+aaikel+++Gy++p+y e+p++deek+i+ ryak++GsavnpvlreGnsdrrap
  lcl|FitnessBrowser__Marino:GFF3859  76 KDPDANIIKLPNISASIPQLRAAIKELNEQGYNVPEYKENPENDEEKEIQSRYAKVLGSAVNPVLREGNSDRRAP 150
                                         *************************************************************************** PP

                           TIGR00178 151 lavkeyarkhphkmGewsadskshvahmdagdfyaseksvlldaaeevkieliakdGketvlkaklklldgevid 225
                                         +avk +ark+ph+mGews +s++hvahm+ gdfy+se+sv+ld+a++ +i +  k+Gk+tvlk++l+l++gev+d
  lcl|FitnessBrowser__Marino:GFF3859 151 TAVKAFARKYPHSMGEWSPASRTHVAHMRGGDFYSSEQSVTLDKATKANIVFENKQGKQTVLKSDLPLQEGEVLD 225
                                         *************************************************************************** PP

                           TIGR00178 226 ssvlskkalvefleeeiedakeegvllslhlkatmmkvsdpivfGhvvrvfykdvfakhaelleqlGldvenGla 300
                                         ++++skkalv+f+e+ i+d  ++gv++slh+katmmk+s+pivfGh+v+vfykd f k++el++++G++ +nGl+
  lcl|FitnessBrowser__Marino:GFF3859 226 GMFMSKKALVKFFEDAIADCENTGVMFSLHVKATMMKISHPIVFGHAVKVFYKDLFDKYGELFDEIGVNPNNGLS 300
                                         *************************************************************************** PP

                           TIGR00178 301 dlyakieslpaakkeeieadlekvyeerpelamvdsdkGitnlhvpsdvivdasmpamirasGkmygkdgklkdt 375
                                          +  ki++lp++k+e+i++dl+++ye+rpe+amvds kGitnlhvpsdvivdasmpamir+sGkm+++d+klkdt
  lcl|FitnessBrowser__Marino:GFF3859 301 SVVEKIKQLPESKQEQIQEDLHACYEHRPEIAMVDSVKGITNLHVPSDVIVDASMPAMIRNSGKMWARDNKLKDT 375
                                         *************************************************************************** PP

                           TIGR00178 376 kavipdssyagvyqaviedckknGafdpttmGtvpnvGlmaqkaeeyGshdktfeieadGvvrvvdssGevllee 450
                                         kav+p+s+ya +yq+vi++ck++GafdpttmGtvpnvGlmaqkaeeyGshdktfei++dGvvrvv ++G vl e+
  lcl|FitnessBrowser__Marino:GFF3859 376 KAVMPESTYATIYQEVINFCKTHGAFDPTTMGTVPNVGLMAQKAEEYGSHDKTFEIKEDGVVRVVAEDGTVLTEH 450
                                         *************************************************************************** PP

                           TIGR00178 451 eveagdiwrmcqvkdapiqdwvklavtrarlsgtpavfwldperahdeelikkvekylkdhdteGldiqilspvk 525
                                         +ve+gdiwr+cq kd pi+dwvklav+rar++g+pavfwld+erahd++li+kv++ylkdhdteGldi+i+spv+
  lcl|FitnessBrowser__Marino:GFF3859 451 NVEKGDIWRACQTKDLPIRDWVKLAVNRARATGMPAVFWLDDERAHDAQLIQKVNTYLKDHDTEGLDIRIMSPVR 525
                                         *************************************************************************** PP

                           TIGR00178 526 atrfslerirrGedtisvtGnvlrdyltdlfpilelGtsakmlsvvplmaGGGlfetGaGGsapkhvqqleeenh 600
                                         a+r+++er+ rG dtisvtGnvlrdyltdlfpilelGtsakmls+vpl++GGGl+etGaGGsapkhvqql +enh
  lcl|FitnessBrowser__Marino:GFF3859 526 AIRWTMERLIRGLDTISVTGNVLRDYLTDLFPILELGTSAKMLSIVPLLNGGGLYETGAGGSAPKHVQQLIQENH 600
                                         *************************************************************************** PP

                           TIGR00178 601 lrwdslGeflalaaslehvavktgnekakvladtldaatgklldeekspsrkvGeldnrgskfylakywaqelaa 675
                                         lrwdslGefla a sl++++ k++ne+a++l++tld+at++ll++++spsr +Geldnrgs+f+la+ywa+ela 
  lcl|FitnessBrowser__Marino:GFF3859 601 LRWDSLGEFLATAVSLDELGEKQNNERARLLGQTLDKATERLLENNQSPSRVTGELDNRGSHFHLARYWAEELAN 675
                                         *************************************************************************** PP

                           TIGR00178 676 qtedkelaasfasvaealtkneekivaelaavqGeavdlgGyyapdtdlttkvlrpsatfnailea 741
                                         q  dkel++ f+ ++  l++n++ki +e++ vqG++ d+gGyy+p  +++ +v++psat+n ile 
  lcl|FitnessBrowser__Marino:GFF3859 676 QDSDKELKEFFTKLSAQLEENKDKILEEMTVVQGNPADIGGYYHPPMEKVCEVMQPSATLNRILEE 741
                                         ***************************************************************985 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (744 nodes)
Target sequences:                          1  (747 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.05u 0.01s 00:00:00.06 Elapsed: 00:00:00.06
# Mc/sec: 8.06
//
[ok]
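
The domain table row in the hmmsearch report gives the model and sequence coordinates needed to compute coverage, which is one of the quantities the confidence rules described below refer to. This is a rough sketch that splits the row on whitespace; the function names and dictionary keys are hypothetical, and the column order is taken from the header line printed in the report above.

```python
def parse_domain_row(row):
    """Parse one row of the per-domain table from hmmsearch text output."""
    f = row.split()
    # Columns: # flag score bias c-Evalue i-Evalue hmmfrom hmmto bounds
    #          alifrom alito bounds envfrom envto bounds acc
    return {
        "score": float(f[2]),
        "hmm_from": int(f[6]),
        "hmm_to": int(f[7]),
        "ali_from": int(f[9]),
        "ali_to": int(f[10]),
        "env_from": int(f[12]),
        "env_to": int(f[13]),
        "acc": float(f[15]),
    }


def coverage(start, end, length):
    """Fraction of a model or sequence spanned by 1-based inclusive coordinates."""
    return (end - start + 1) / length


# Row copied from the report; the model has M=744 nodes, the protein 747 residues.
ROW = ("   1 ! 1306.6   0.1         0         0       1     741 [.       "
       "1     741 [.       1     744 [. 1.00")
dom = parse_domain_row(ROW)
model_cov = coverage(dom["hmm_from"], dom["hmm_to"], 744)
seq_cov = coverage(dom["ali_from"], dom["ali_to"], 747)
print(f"model coverage {model_cov:.3f}, sequence coverage {seq_cov:.3f}")
```

For scripted use, hmmsearch can also write a tabular per-domain file, which is easier to parse than the human-readable report.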

This GapMind analysis is from Sep 17 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
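
For readers unfamiliar with the term, a six-frame translation translates a DNA sequence in all three reading frames on both strands, so that genes missed by gene prediction can still be searched at the protein level. The sketch below is an illustration only, not the code used by GapMind or Curated BLAST. It assumes the standard genetic code, and all names in it are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons; unknown codons (e.g. with N) become X."""
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))


def six_frame(dna):
    """Translate all three frames on the forward (+) and reverse (-) strands."""
    dna = dna.upper()
    rc = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


frames = six_frame("ATGGCC")
print(frames)
```

Stop codons are kept as `*` in the output, so a downstream search would typically split each frame on `*` and keep only the longer open stretches.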

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory