GapMind for catabolism of small carbon sources

 

Alignments for a candidate for icd in Rhodanobacter denitrificans 2APBS1

Align Isocitrate dehydrogenase [NADP]; IDH; Oxalosuccinate decarboxylase; EC 1.1.1.42 (characterized)
to candidate WP_015449052.1 R2APBS1_RS18160 NADP-dependent isocitrate dehydrogenase

Query= SwissProt::P16100
         (741 letters)



>NCBI__GCF_000230695.2:WP_015449052.1
          Length = 741

 Score = 1122 bits (2901), Expect = 0.0
 Identities = 564/739 (76%), Positives = 630/739 (85%), Gaps = 2/739 (0%)

Query: 3   TPKIIYTLTDEAPALATYSLLPIIKAFTGSSGIAVETRDISLAGRLIATFPEYLTDTQKI 62
           TP+IIYTLTDEAP LAT SLLPI+ AF G++GI VETRDISLA R+IA FPE L + Q++
Sbjct: 4   TPRIIYTLTDEAPFLATQSLLPIVAAFAGTAGIKVETRDISLAARIIALFPEALKEGQRL 63

Query: 63  SDDLAELGKLATTPDANIIKLPNISASVPQLKAAIKELQQQGYKLPDYPEEPKTDTEKDV 122
            DDLAELGKLATTP+ANIIKLPNISAS+PQLKAAIKELQ QGY LPDYP+ PK D+EKD+
Sbjct: 64  PDDLAELGKLATTPEANIIKLPNISASMPQLKAAIKELQGQGYALPDYPDAPKDDSEKDI 123

Query: 123 KARYDKIKGSAVNPVLREGNSDRRAPLSVKNYARKHPHKMGAWSADSKSHVAHMDNGDFY 182
           +ARYD++KGSAVNPVLREGNSDRRAP SVKNYARKHPHKMGAWS DSK+HVAHM  GDFY
Sbjct: 124 RARYDRVKGSAVNPVLREGNSDRRAPASVKNYARKHPHKMGAWSRDSKTHVAHMGGGDFY 183

Query: 183 GSEKAALIGAPGSVKIELIAKDGSSTVLKAKTSVQAGEIIDSSVMSKNALRNFIAAEIED 242
           G+EK+ALIG  G+V IE   KDGS  VLK KT++ AGEIID +VMS+ AL  FI A+I D
Sbjct: 184 GTEKSALIGQAGNVSIEWFGKDGSHAVLKPKTALLAGEIIDGAVMSRRALAGFIDAQIAD 243

Query: 243 AKKQGVLLSVHLKATMMKVSDPIMFGQIVSEFYKDALTKHAEVLKQIGFDVNNGIGDLYA 302
           AK+QGVL S+HLKATMMKVSDPIMFG  V EFY+D L KHA  LKQ+GF+ NNGIGDLYA
Sbjct: 244 AKQQGVLFSLHLKATMMKVSDPIMFGVAVGEFYRDVLAKHAAALKQVGFNPNNGIGDLYA 303

Query: 303 RIKTLPEAKQKEIEADIQAVYAQRPQLAMVNSDKGITNLHVPSDVIVDASMPAMIRDSGK 362
           R+ +LPEA+Q  I+AD+ A YAQRP +AMVNSDKGITNLHVPSDVIVDASMPAMIRDSGK
Sbjct: 304 RLGSLPEAQQAAIKADLDAEYAQRPGVAMVNSDKGITNLHVPSDVIVDASMPAMIRDSGK 363

Query: 363 MWGPDGKLHDTKAVIPDRCYAGVYQVVIEDCKQHGAFDPTTMGSVPNVGLMAQKAEEYGS 422
           MW   G+L D KAVIPDR YAGVYQ  I+DCK HGAFDP TMGSVPNVGLMAQ AEEYGS
Sbjct: 364 MWNAQGQLQDVKAVIPDRSYAGVYQATIDDCKAHGAFDPATMGSVPNVGLMAQAAEEYGS 423

Query: 423 HDKTFQIPADGVVRVTDESGKLLLEQSVEAGDIWRMCQAKDAPIQDWVKLAVNRARATNT 482
           HDKTFQI  DGVV+V DE+G +LL+  VEAGDIWRMCQ KDAPI+DWVKLAV+RAR  +T
Sbjct: 424 HDKTFQIAGDGVVKVLDEAGTVLLQHDVEAGDIWRMCQTKDAPIRDWVKLAVSRAR--HT 481

Query: 483 PAVFWLDPARAHDAQVIAKVERYLKDYDTSGLDIRILSPVEATRFSLARIREGKDTISVT 542
           PAVFWLDPARAHDAQVIAKVERYLKD+DTSGLDIRI+ PV AT+FSL RIR+G DTISVT
Sbjct: 482 PAVFWLDPARAHDAQVIAKVERYLKDHDTSGLDIRIMDPVAATKFSLERIRQGLDTISVT 541

Query: 543 GNVLRDYLTDLFPIMELGTSAKMLSIVPLMSGGGLFETGAGGSAPKHVQQFLEEGYLRWD 602
           GNVLRDYLTDLFPIMELGTSAKMLSIVPLM+GGGLFETGAGGSAPKHVQQFL+E YLRWD
Sbjct: 542 GNVLRDYLTDLFPIMELGTSAKMLSIVPLMAGGGLFETGAGGSAPKHVQQFLQENYLRWD 601

Query: 603 SLGEFLALAASLEHLGNAYKNPKALVLASTLDQATGKILDNNKSPARKVGEIDNRGSHFY 662
           SLGEFLALAASLE +     + +  VLA TLDQA GK LD +KSP+RK+G IDNRGSHFY
Sbjct: 602 SLGEFLALAASLEFVAGRQGSAEVDVLAKTLDQANGKFLDTDKSPSRKLGGIDNRGSHFY 661

Query: 663 LALYWAQALAAQTEDKELQAQFTGIAKALTDNETKIVGELAAAQGKPVDIAGYYHPNTDL 722
           LALYWAQALAAQ  +  L+A+F  +AKALT++E  IV EL   QGKP DI GYYHP+   
Sbjct: 662 LALYWAQALAAQDVNAALKAKFAPLAKALTEHEATIVDELVRVQGKPADIGGYYHPDLAR 721

Query: 723 TSKAIRPSATFNAALAPLA 741
            S A+RPSATFNAALA L+
Sbjct: 722 VSAAMRPSATFNAALALLS 740


Lambda     K      H
   0.315    0.131    0.374 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 1610
Number of extensions: 58
Number of successful extensions: 3
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 741
Length of database: 741
Length adjustment: 40
Effective length of query: 701
Effective length of database: 701
Effective search space:   491401
Effective search space used:   491401
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.5 bits)
S2: 55 (25.8 bits)
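
As a rough cross-check of the BLAST report above, the percent identity, query coverage, and effective search space can be recomputed from the reported numbers. This is a minimal sketch: the function names are illustrative and are not part of GapMind or BLAST, and the effective-length formula assumes the length adjustment is subtracted once from the query and once per database sequence, which matches the values printed in this report.

```python
def percent_identity(identities: int, alignment_length: int) -> float:
    """Percent of alignment columns (including gap columns) that are identical."""
    return 100.0 * identities / alignment_length


def coverage(first: int, last: int, total_length: int) -> float:
    """Fraction of a sequence spanned by the aligned region (1-based, inclusive)."""
    return (last - first + 1) / total_length


def effective_search_space(query_len: int, db_len: int,
                           length_adjustment: int, num_seqs: int = 1) -> int:
    """Product of the effective query and database lengths.

    Assumption: the adjustment is subtracted once from the query and
    once per database sequence.
    """
    eff_query = query_len - length_adjustment
    eff_db = db_len - num_seqs * length_adjustment
    return eff_query * eff_db


# Values copied from the report above.
print(round(percent_identity(564, 739), 1))   # 76.3
print(round(coverage(3, 741, 741), 4))        # query residues 3..741 of 741
print(effective_search_space(741, 741, 40))   # 491401
```

The identity figure agrees with the reported 76%, and 701 x 701 reproduces the reported effective search space of 491401.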

Align candidate WP_015449052.1 R2APBS1_RS18160 (NADP-dependent isocitrate dehydrogenase)
to HMM TIGR00178 (isocitrate dehydrogenase, NADP-dependent (EC 1.1.1.42))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR00178.hmm
# target sequence database:        /tmp/gapView.2995467.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR00178  [M=744]
Accession:   TIGR00178
Description: monomer_idh: isocitrate dehydrogenase, NADP-dependent
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                             Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                             -----------
          0 1322.4   0.3          0 1322.2   0.3    1.0  1  NCBI__GCF_000230695.2:WP_015449052.1  


Domain annotation for each sequence (and alignments):
>> NCBI__GCF_000230695.2:WP_015449052.1  
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 ! 1322.2   0.3         0         0       4     742 ..       3     739 ..       1     741 [] 0.99

  Alignments for each domain:
  == domain 1  score: 1322.2 bits;  conditional E-value: 0
                             TIGR00178   4 ekakiiytltdeapllatysllpivkafaasaGievetrdislagrilaefpeylteeqkvddalaelGelak 76 
                                           ++++iiytltdeap lat sllpiv afa++aGi+vetrdisla+ri+a fpe l e q+++d+laelG+la+
  NCBI__GCF_000230695.2:WP_015449052.1   3 QTPRIIYTLTDEAPFLATQSLLPIVAAFAGTAGIKVETRDISLAARIIALFPEALKEGQRLPDDLAELGKLAT 75 
                                           4589********************************************************************* PP

                             TIGR00178  77 tpeaniiklpnisasvpqlkaaikelqdkGydlpdypeepktdeekdikaryakikGsavnpvlreGnsdrra 149
                                           tpeaniiklpnisas+pqlkaaikelq +Gy+lpdyp+ pk d+ekdi+ary+++kGsavnpvlreGnsdrra
  NCBI__GCF_000230695.2:WP_015449052.1  76 TPEANIIKLPNISASMPQLKAAIKELQGQGYALPDYPDAPKDDSEKDIRARYDRVKGSAVNPVLREGNSDRRA 148
                                           ************************************************************************* PP

                             TIGR00178 150 plavkeyarkhphkmGewsadskshvahmdagdfyaseksvlldaaeevkieliakdGketvlkaklklldge 222
                                           p++vk+yarkhphkmG+ws+dsk+hvahm  gdfy++eks+l+++a +v ie+  kdG++ vlk+k+ ll+ge
  NCBI__GCF_000230695.2:WP_015449052.1 149 PASVKNYARKHPHKMGAWSRDSKTHVAHMGGGDFYGTEKSALIGQAGNVSIEWFGKDGSHAVLKPKTALLAGE 221
                                           ************************************************************************* PP

                             TIGR00178 223 vidssvlskkalvefleeeiedakeegvllslhlkatmmkvsdpivfGhvvrvfykdvfakhaelleqlGldv 295
                                           +id++v+s++al+ f++++i+dak++gvl+slhlkatmmkvsdpi+fG +v +fy+dv+akha+ l+q+G++ 
  NCBI__GCF_000230695.2:WP_015449052.1 222 IIDGAVMSRRALAGFIDAQIADAKQQGVLFSLHLKATMMKVSDPIMFGVAVGEFYRDVLAKHAAALKQVGFNP 294
                                           ************************************************************************* PP

                             TIGR00178 296 enGladlyakieslpaakkeeieadlekvyeerpelamvdsdkGitnlhvpsdvivdasmpamirasGkmygk 368
                                           +nG++dlya++ slp+a++  i+adl++ y++rp +amv+sdkGitnlhvpsdvivdasmpamir+sGkm+++
  NCBI__GCF_000230695.2:WP_015449052.1 295 NNGIGDLYARLGSLPEAQQAAIKADLDAEYAQRPGVAMVNSDKGITNLHVPSDVIVDASMPAMIRDSGKMWNA 367
                                           ************************************************************************* PP

                             TIGR00178 369 dgklkdtkavipdssyagvyqaviedckknGafdpttmGtvpnvGlmaqkaeeyGshdktfeieadGvvrvvd 441
                                           +g+l+d kavipd+syagvyqa i+dck +Gafdp+tmG+vpnvGlmaq aeeyGshdktf+i  dGvv+v d
  NCBI__GCF_000230695.2:WP_015449052.1 368 QGQLQDVKAVIPDRSYAGVYQATIDDCKAHGAFDPATMGSVPNVGLMAQAAEEYGSHDKTFQIAGDGVVKVLD 440
                                           ************************************************************************* PP

                             TIGR00178 442 ssGevlleeeveagdiwrmcqvkdapiqdwvklavtrarlsgtpavfwldperahdeelikkvekylkdhdte 514
                                           + G vll+++veagdiwrmcq kdapi+dwvklav+rar   tpavfwldp+rahd+++i+kve+ylkdhdt+
  NCBI__GCF_000230695.2:WP_015449052.1 441 EAGTVLLQHDVEAGDIWRMCQTKDAPIRDWVKLAVSRARH--TPAVFWLDPARAHDAQVIAKVERYLKDHDTS 511
                                           *************************************996..7****************************** PP

                             TIGR00178 515 GldiqilspvkatrfslerirrGedtisvtGnvlrdyltdlfpilelGtsakmlsvvplmaGGGlfetGaGGs 587
                                           Gldi+i++pv at+fslerir+G dtisvtGnvlrdyltdlfpi+elGtsakmls+vplmaGGGlfetGaGGs
  NCBI__GCF_000230695.2:WP_015449052.1 512 GLDIRIMDPVAATKFSLERIRQGLDTISVTGNVLRDYLTDLFPIMELGTSAKMLSIVPLMAGGGLFETGAGGS 584
                                           ************************************************************************* PP

                             TIGR00178 588 apkhvqqleeenhlrwdslGeflalaaslehvavktgnekakvladtldaatgklldeekspsrkvGeldnrg 660
                                           apkhvqq+ +en+lrwdslGeflalaasle va ++g +   vla+tld+a gk+ld++kspsrk G +dnrg
  NCBI__GCF_000230695.2:WP_015449052.1 585 APKHVQQFLQENYLRWDSLGEFLALAASLEFVAGRQGSAEVDVLAKTLDQANGKFLDTDKSPSRKLGGIDNRG 657
                                           ************************************************************************* PP

                             TIGR00178 661 skfylakywaqelaaqtedkelaasfasvaealtkneekivaelaavqGeavdlgGyyapdtdlttkvlrpsa 733
                                           s+fyla ywaq+laaq  ++ l+a+fa++a+alt++e++iv el  vqG++ d+gGyy+pd   +++++rpsa
  NCBI__GCF_000230695.2:WP_015449052.1 658 SHFYLALYWAQALAAQDVNAALKAKFAPLAKALTEHEATIVDELVRVQGKPADIGGYYHPDLARVSAAMRPSA 730
                                           ************************************************************************* PP

                             TIGR00178 734 tfnaileal 742
                                           tfna+l+ l
  NCBI__GCF_000230695.2:WP_015449052.1 731 TFNAALALL 739
                                           *****9866 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (744 nodes)
Target sequences:                          1  (741 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.01u 0.01s 00:00:00.02 Elapsed: 00:00:00.01
# Mc/sec: 37.59
//
[ok]
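
The domain table in the hmmsearch output above can be used to estimate how much of the HMM is covered by the candidate. The following is a minimal sketch that splits the single domain row on whitespace and computes model coverage; the helper names are hypothetical, and the column positions are an assumption based on the layout of the table shown here.

```python
# The domain row copied from the hmmsearch output above.
ROW = ("   1 ! 1322.2   0.3         0         0       4     742 .."
       "       3     739 ..       1     741 [] 0.99")


def parse_domain_row(row: str) -> dict:
    """Extract the score and the HMM/sequence coordinates from one domain row.

    Assumed column order: #, flag, score, bias, c-Evalue, i-Evalue,
    hmmfrom, hmmto, '..', alifrom, alito, '..', envfrom, envto, '[]', acc.
    """
    f = row.split()
    return {
        "score": float(f[2]),
        "hmm_from": int(f[6]),
        "hmm_to": int(f[7]),
        "ali_from": int(f[9]),
        "ali_to": int(f[10]),
    }


def hmm_coverage(domain: dict, model_length: int) -> float:
    """Fraction of HMM nodes spanned by the aligned region (inclusive)."""
    return (domain["hmm_to"] - domain["hmm_from"] + 1) / model_length


domain = parse_domain_row(ROW)
# TIGR00178 has M=744 nodes according to the Query line.
print(round(hmm_coverage(domain, 744), 3))  # 0.993
```

Here the alignment spans HMM positions 4 to 742 of 744, so nearly the whole model is covered.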

This GapMind analysis is from Sep 24 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified either by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
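
The definition of a gap given above (a step with no high- or medium-confidence candidates) can be illustrated with a short sketch. This is not GapMind's implementation: the function name, the data layout, and the step names other than icd are hypothetical, and the confidence labels are assumed to have been assigned already.

```python
from typing import Dict, List


def find_gaps(step_candidates: Dict[str, List[str]]) -> List[str]:
    """Return the steps that have no high- or medium-confidence candidate.

    step_candidates maps a step name to the confidence labels
    ("high", "medium", or "low") of its candidates.
    """
    return [
        step
        for step, labels in step_candidates.items()
        if not any(label in ("high", "medium") for label in labels)
    ]


# Hypothetical input: only "icd" appears in this report.
example = {
    "icd": ["high"],
    "hypothetical_step_a": ["low"],
    "hypothetical_step_b": [],
}
print(find_gaps(example))  # ['hypothetical_step_a', 'hypothetical_step_b']
```

A step with only low-confidence candidates, or with no candidates at all, is reported as a gap.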

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory