GapMind for catabolism of small carbon sources

 

Alignments for a candidate for iolC in Rhizobium freirei PRF 81

Align 5-dehydro-2-deoxygluconokinase (EC 2.7.1.92); possible 5-dehydro-2-deoxyphosphogluconate aldolase DUF2090 (EC 4.1.2.29) (characterized)
to candidate WP_004121794.1 RHSP_RS20075 5-dehydro-2-deoxygluconokinase

Query= reanno::Smeli:SMc01165
         (650 letters)



>NCBI__GCF_000359745.1:WP_004121794.1
          Length = 644

 Score =  893 bits (2308), Expect = 0.0
 Identities = 442/635 (69%), Positives = 517/635 (81%), Gaps = 4/635 (0%)

Query: 13  AGAKP---LDLITIGRASVDLYGQQIGTRLEDVASFAKSVGGCPCNISVGTARLGLKSAL 69
           +GA+P   LD+ITIGR+SVDLYGQQIG++LED+ASFAKSVGGCP NI++GTARLGLKS L
Sbjct: 6   SGAEPEARLDVITIGRSSVDLYGQQIGSKLEDIASFAKSVGGCPANIAIGTARLGLKSGL 65

Query: 70  LTRVGDEQMGRFIREQLQREGVETRGIVTDPERLTALAILSVENDKSFPLLFYRDNCADN 129
           +TRVGDEQMGRFIREQ  REGV   GI TD ERLTAL +L+VE +   P++FYR +CAD 
Sbjct: 66  ITRVGDEQMGRFIREQAAREGVAVDGIATDKERLTALVLLAVEAEGVSPMIFYRSDCADM 125

Query: 130 ALCEDDISEDFIRSARAVLVTGTHFAKPNADAAQRKAIRIAKESGARIVFDIDYRPNLWG 189
           AL EDDI E FI+S+ AVLV+GTHF+KPN +AAQRKAIRIAK +G +++FDIDYRPNLWG
Sbjct: 126 ALNEDDIDESFIKSSNAVLVSGTHFSKPNTEAAQRKAIRIAKANGRKVIFDIDYRPNLWG 185

Query: 190 LAGHDAGESRYIASDRVSAHLKTVLGDCDLIVGTEEEVLIASGESDLLAALKTIRSLSKA 249
           LAGH  G  RY+ SDRVS+ +K  L DCDLIVGTEEE++IASG  D+L ALK IR LS A
Sbjct: 186 LAGHAEGFERYVKSDRVSSKMKETLPDCDLIVGTEEEIMIASGADDVLGALKEIRRLSSA 245

Query: 250 TIVLKRGPMGCIVYDGPISDDLEDGIVGKGFPIEVYNVLGAGDAFMSGFLRGWLTGEPHA 309
            IVLKRG MGCIVY+GPISDDLE GIVG+GFPIEV+NVLGAGDAFMSGFLRG+L GEP  
Sbjct: 246 VIVLKRGAMGCIVYEGPISDDLEAGIVGEGFPIEVFNVLGAGDAFMSGFLRGYLRGEPLK 305

Query: 310 TSATWANACGAFAVSRLLCAPEIPTWTELQYFLEHGSKEKALRKDEAINHVHWATTRR-R 368
           TSATWANACGAFAVSRLLC+PE PTW EL +FL+ GSK +ALRKDEAINH+HWATTRR  
Sbjct: 306 TSATWANACGAFAVSRLLCSPEYPTWAELDFFLKTGSKHRALRKDEAINHIHWATTRRSE 365

Query: 369 DIPLLMALAVDHRSQLEDIAEGNPELLSRIPAFKVLAVKAAAEVAQGRSGFGMLIDDKYG 428
           +IPLLMALA+DHRSQL  + +       +I AFK LAV+AAA VA GRSG+GMLID+++G
Sbjct: 366 EIPLLMALAIDHRSQLVSVCDELGIGHDKIVAFKQLAVEAAARVADGRSGYGMLIDERFG 425

Query: 429 RDALYAAGAHRDFWIGKPIELPGSRPLTFEFSQDLGSRLVDWPVDHCIKVLSFYHPDDPA 488
           RDA + A      WIG+P+ELPGS+PL FEFSQD+GS+LV+WP+ HCIK L FYHPDDP 
Sbjct: 426 RDAFFDAATKNFSWIGRPVELPGSKPLRFEFSQDIGSQLVEWPLSHCIKCLCFYHPDDPK 485

Query: 489 ELKAAQVAKLRSAFEAARKVGREILIEIIAGKHGKLDDRTIPRALEELYDAGLKPDWWKL 548
           ELK  Q  KLR+ FEAARKVGRE+L+EIIA K+G L D TIP ALEELY  G+KPDWWKL
Sbjct: 486 ELKEEQQEKLRTLFEAARKVGRELLVEIIASKNGPLTDDTIPTALEELYALGIKPDWWKL 545

Query: 549 EPQASRAAWAAIDAVIETRDPLCRGVVLLGLEAPYEVLKDGFAAARTSKTVKGFAVGRTI 608
           EPQ S  AW  IDAVI   DPLCRG+VLLGLEAP E L   F A   + +VKGFAVGRTI
Sbjct: 546 EPQESTTAWKKIDAVIAKNDPLCRGIVLLGLEAPAEELIRSFEATLAAPSVKGFAVGRTI 605

Query: 609 FADAAKAWLAGRMTDEQAVSDMAAKFKALVDLWLQ 643
           F+DAA+AWL+G M DE+A++DMA +F+ L   WL+
Sbjct: 606 FSDAARAWLSGGMNDEEAIADMAGRFRQLTAAWLK 640


Lambda     K      H
   0.320    0.136    0.407 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 1219
Number of extensions: 52
Number of successful extensions: 3
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 650
Length of database: 644
Length adjustment: 38
Effective length of query: 612
Effective length of database: 606
Effective search space:   370872
Effective search space used:   370872
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.8 bits)
S2: 54 (25.4 bits)
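
The summary numbers in the BLAST report above can be cross-checked by hand. The sketch below is illustrative only and is not part of GapMind. It assumes the usual Karlin-Altschul conversion from raw score to bits, and it assumes that the effective search space is the product of the two length-adjusted sequence lengths for a single-sequence database. The function names are hypothetical.

```python
import math

# Values copied from the BLAST report above.
RAW_SCORE = 2308
GAPPED_LAMBDA, GAPPED_K = 0.267, 0.0410
QUERY_LEN, DB_LEN, LENGTH_ADJUSTMENT = 650, 644, 38
IDENTITIES, POSITIVES, GAPS, ALIGNED_COLUMNS = 442, 517, 4, 635


def bit_score(raw: int, lam: float, k: float) -> float:
    """Convert a raw alignment score to bits: (lambda * S - ln K) / ln 2."""
    return (lam * raw - math.log(k)) / math.log(2)


def effective_search_space(query_len: int, db_len: int, adjustment: int) -> int:
    """Product of the length-adjusted query and database lengths.

    Assumes the database holds a single sequence, as in this report.
    """
    return (query_len - adjustment) * (db_len - adjustment)


def truncated_percent(count: int, total: int) -> int:
    """Integer percentage, truncated rather than rounded."""
    return count * 100 // total


bits = bit_score(RAW_SCORE, GAPPED_LAMBDA, GAPPED_K)
space = effective_search_space(QUERY_LEN, DB_LEN, LENGTH_ADJUSTMENT)

print(f"bit score: about {bits:.1f}")
print(f"effective search space: {space}")
print(f"identities: {truncated_percent(IDENTITIES, ALIGNED_COLUMNS)}%")
print(f"positives: {truncated_percent(POSITIVES, ALIGNED_COLUMNS)}%")
print(f"gaps: {truncated_percent(GAPS, ALIGNED_COLUMNS)}%")
```

The percentages in the report appear to be truncated rather than rounded: 442/635 is 69.6%, which the report shows as 69%. The recomputed bit score can differ slightly from the reported 893 bits because Lambda and K are printed with limited precision.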

Align candidate WP_004121794.1 RHSP_RS20075 (5-dehydro-2-deoxygluconokinase)
to HMM TIGR04382 (iolC: 5-dehydro-2-deoxygluconokinase (EC 2.7.1.92))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR04382.hmm
# target sequence database:        /tmp/gapView.2734548.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR04382  [M=309]
Accession:   TIGR04382
Description: myo_inos_iolC_N: 5-dehydro-2-deoxygluconokinase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                             Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                             -----------
   3.8e-120  386.8   0.0   5.2e-120  386.3   0.0    1.2  1  NCBI__GCF_000359745.1:WP_004121794.1  


Domain annotation for each sequence (and alignments):
>> NCBI__GCF_000359745.1:WP_004121794.1  
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  386.3   0.0  5.2e-120  5.2e-120       2     309 .]      14     338 ..      13     338 .. 0.99

  Alignments for each domain:
  == domain 1  score: 386.3 bits;  conditional E-value: 5.2e-120
                             TIGR04382   2 ldlitiGRvgvDlyaqqigasledvksfakylGGspaNiavgaarlGlktalitkvgddqlGrfvreelereg 74 
                                           ld+itiGR++vDly+qqig++led+ sfak++GG+paNia+g+arlGlk++lit+vgd+q+Grf+re+ +reg
  NCBI__GCF_000359745.1:WP_004121794.1  14 LDVITIGRSSVDLYGQQIGSKLEDIASFAKSVGGCPANIAIGTARLGLKSGLITRVGDEQMGRFIREQAAREG 86 
                                           8************************************************************************ PP

                             TIGR04382  75 vdtshvvtdkeartslvlleikdpdefpllfYRenaaDlaltvddvdeeliaeakallvsgtalskepsreav 147
                                           v ++++ tdke++t+lvll ++ +  +p++fYR+++aD+al+ dd+de +i++++a+lvsgt++sk+++++a+
  NCBI__GCF_000359745.1:WP_004121794.1  87 VAVDGIATDKERLTALVLLAVEAEGVSPMIFYRSDCADMALNEDDIDESFIKSSNAVLVSGTHFSKPNTEAAQ 159
                                           ************************************************************************* PP

                             TIGR04382 148 lkalelakkagvkvvlDiDYRpvlWk............skeeasaalqlvlkkvdviiGteeEfeiaagekdd 208
                                            ka+++ak++g kv++DiDYRp+lW+            +++++s ++++ l+++d+i+GteeE+ ia+g++d 
  NCBI__GCF_000359745.1:WP_004121794.1 160 RKAIRIAKANGRKVIFDIDYRPNLWGlaghaegferyvKSDRVSSKMKETLPDCDLIVGTEEEIMIASGADDV 232
                                           ************************************************************************* PP

                             TIGR04382 209 eaaakallelgaelvvvKrGeeGslvytkd.....eeevevkgfkvevlkvlGaGDaFasgllygllegedle 276
                                             a+k++++l+ +++v+KrG+ G++vy++      e  +  +gf++ev++vlGaGDaF+sg+l+g+l+ge+l+
  NCBI__GCF_000359745.1:WP_004121794.1 233 LGALKEIRRLSSAVIVLKRGAMGCIVYEGPisddlEAGIVGEGFPIEVFNVLGAGDAFMSGFLRGYLRGEPLK 305
                                           ****************************9999999999*********************************** PP

                             TIGR04382 277 kalelanAagaivvsrlscaeamptleeleefl 309
                                           +++++anA+ga++vsrl c++++pt++el+ fl
  NCBI__GCF_000359745.1:WP_004121794.1 306 TSATWANACGAFAVSRLLCSPEYPTWAELDFFL 338
                                           *****************************9886 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (309 nodes)
Target sequences:                          1  (644 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.00u 0.00s 00:00:00.00 Elapsed: 00:00:00.00
# Mc/sec: 44.70
//
[ok]
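
To read the domain row of the hmmsearch report above, it helps to turn the coordinates into coverage fractions. The following is a minimal sketch, not GapMind's own parser. It assumes the whitespace-separated column order shown in the domain table header, and the `DomainHit` class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DomainHit:
    score: float
    i_evalue: float
    hmm_from: int
    hmm_to: int
    ali_from: int
    ali_to: int


def parse_domain_row(row: str) -> DomainHit:
    """Parse one row of the per-domain table printed by hmmsearch.

    Assumed field order: #, flag, score, bias, c-Evalue, i-Evalue,
    hmmfrom, hmm to, bounds, alifrom, ali to, bounds, envfrom, env to, bounds, acc.
    """
    fields = row.split()
    return DomainHit(
        score=float(fields[2]),
        i_evalue=float(fields[5]),
        hmm_from=int(fields[6]),
        hmm_to=int(fields[7]),
        ali_from=int(fields[9]),
        ali_to=int(fields[10]),
    )


def coverage(start: int, end: int, total_length: int) -> float:
    """Fraction of a sequence or model spanned by an inclusive interval."""
    return (end - start + 1) / total_length


# Domain row copied from the report above; lengths are M=309 and 644 residues.
ROW = "   1 !  386.3   0.0  5.2e-120  5.2e-120       2     309 .]      14     338 ..      13     338 .. 0.99"
MODEL_LENGTH = 309
PROTEIN_LENGTH = 644

hit = parse_domain_row(ROW)
model_coverage = coverage(hit.hmm_from, hit.hmm_to, MODEL_LENGTH)
protein_coverage = coverage(hit.ali_from, hit.ali_to, PROTEIN_LENGTH)

print(f"model coverage: {model_coverage:.1%}")
print(f"protein coverage: {protein_coverage:.1%}")
```

The hit spans nearly the whole model but only about half of the protein. This is consistent with the model name `myo_inos_iolC_N`, which suggests that the model describes the N-terminal region only.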

This GapMind analysis is from Sep 24 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified either by running ublast (a fast alternative to protein BLAST) against a database of manually curated proteins, most of which are experimentally characterized, or by running HMMER with enzyme models (usually from TIGRFAMs). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
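
For readers unfamiliar with the term, a six-frame translation translates a DNA sequence in all three reading frames on both strands, so that proteins missing from the gene predictions can still be found. The sketch below illustrates the idea only. It is not the code used by GapMind or Curated BLAST, it assumes the standard genetic code, and all names in it are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def translate(dna: str) -> str:
    """Translate complete codons; '*' marks a stop, 'X' an unknown codon."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translation(dna: str) -> dict:
    """Translate frames +1..+3 on the given strand and -1..-3 on the other."""
    frames = {}
    for strand, sequence in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(sequence[offset:])
    return frames


frames = six_frame_translation("ATGGCCTAA")
for name, protein in frames.items():
    print(name, protein)
```

A search tool would then compare the query protein against each of the six translated frames instead of against the predicted proteins.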

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory