GapMind for catabolism of small carbon sources

Alignments for a candidate for xylB in Caldicellulosiruptor hydrothermalis 108

Align Xylulose kinase; Short=Xylulokinase; EC 2.7.1.17 (characterized, see rationale)
to candidate WP_013402101.1 CALHY_RS00630 xylulokinase

Query= uniprot:Q97FW4
         (500 letters)



>NCBI__GCF_000166355.1:WP_013402101.1
          Length = 497

 Score =  302 bits (773), Expect = 2e-86
 Identities = 164/500 (32%), Positives = 271/500 (54%), Gaps = 10/500 (2%)

Query: 1   MRYLLGIDVGTSGTKTALFDECGNTIKTSTHEYELFQPQVGWAEQNPENWWTACVKGIRE 60
           M  +L ID+GT+  K  +FD  GN +  +  EY  + PQ+ WAEQ+P +WW   V+GI+E
Sbjct: 1   MEKILTIDIGTTACKVIVFDLQGNILAKANREYPTYTPQIEWAEQDPLDWWNEVVEGIKE 60

Query: 61  VIEKSKIDPLDIKGIGISGQMHGLVLIDKEYKVIRNSIIWCDQRTEKECTQITDTIGKEK 120
           V + +  D   I  IG+S Q   +V IDKE  V+  +I W D+R+  E  +I+   GK+ 
Sbjct: 61  VAQAAGAD--GIVAIGLSSQRETVVPIDKEGNVLSRAISWMDRRSRLEAEEISHQFGKDT 118

Query: 121 LIRITGNPALTGFTLSKLLWVRNNEPDNYKRIYKVLLPKDYIRFKLTGVFAAEVSDASGT 180
           + +ITG    + FT +KLLW++ ++P+  ++ Y  L PK++I + LTG  A + S AS T
Sbjct: 119 IHKITGLIPDSTFTATKLLWLKKHQPEILQKAYIFLQPKEFIGYMLTGEAATDHSLASRT 178

Query: 181 QMLDINTRNWSEELLDDLRIDKNILPDVYESVVVSGCVIEKASKETKLAVNTPVVGGAGD 240
            M D+N R W E++ + + +  +  P +  +  V G + E  +K   L    PVV G GD
Sbjct: 179 MMFDVNKRQWWEDIFEFVGVKTSQFPRLCYADEVIGYLKEDVAKILGLKSGIPVVSGGGD 238

Query: 241 QAAGAIGNGIVREGLISTVIGTSGVVFAATDTPRFDSKGRVHTLCHAVPNKWHIMGVTQG 300
           +   A+G GIV   ++ +    + V  ++   P      RV   CH + + + I      
Sbjct: 239 RPLEALGAGIVGSRVMESTGTATNVSMSSNKVPE-SLDPRVVCSCHVIRDHYLIEQGINT 297

Query: 301 AGLSLNWFKRTFCAKEILESKEAGINIYDLLTEKASQSKPGSNGIIYLPYLMGERTPHID 360
           +G  L W +  F   E    KE G N+Y+L+  +A  S PG+NG++ LP+ MG R    +
Sbjct: 298 SGTILRWIRDNFYRGE----KEKGENVYELIDSEAESSSPGANGVVLLPFFMGSRATRWN 353

Query: 361 PNVKGAFLGISLINNHNDFVRSILEGVGFSLKNCLDIIENMKVNIEEIRVSGGGAESSIW 420
           P+ KG   G++L ++  D  R++LEG+ + ++ C++I+E+M +  E I   GGGA+S +W
Sbjct: 354 PDAKGVLFGLTLTHSRADIARAVLEGISYEIRACIEILESMGLKAESIVSMGGGAKSRVW 413

Query: 421 RQILSDIFNYELTTVKASEGPALGVAILAGVGAGIYNSVEEACDKIVKGNEKVMPNANLI 480
            +I +DI    +   K  E  + G  +LA    G   S+ E   +++    +  P++   
Sbjct: 414 SKIKADILGKNVVVEKVQEAASKGAMLLASYAIGARESLIEEKREVL---FEYQPDSKNH 470

Query: 481 EVYSKVYEVYNSAYPKIKDI 500
           E+Y++VYE+YN  Y  +  +
Sbjct: 471 EIYNRVYEIYNQLYNSVSPL 490
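
The summary statistics of this alignment can be re-derived from the counts printed above. The following is a minimal sketch, not part of GapMind: the helper names `parse_fractions` and `coverage` are hypothetical, and the aligned ranges (query 1-500 of 500, subject 1-490 of 497) are read off the alignment by hand.

```python
import re

# Summary line copied from the BLAST report above.
stats_line = "Identities = 164/500 (32%), Positives = 271/500 (54%), Gaps = 10/500 (2%)"

def parse_fractions(line):
    """Return {label: (count, alignment_length)} for each 'Label = a/b' field."""
    return {
        label.lower(): (int(num), int(den))
        for label, num, den in re.findall(r"(\w+) = (\d+)/(\d+)", line)
    }

def coverage(start, end, length):
    """Fraction of a sequence spanned by an aligned range (1-based, inclusive)."""
    return (end - start + 1) / length

fractions = parse_fractions(stats_line)
identity = fractions["identities"][0] / fractions["identities"][1]

# Aligned ranges taken from the first and last Query/Sbjct lines above.
query_cov = coverage(1, 500, 500)    # query is 500 letters long
subject_cov = coverage(1, 490, 497)  # candidate is 497 residues long

print(f"identity={identity:.1%} query_cov={query_cov:.1%} subject_cov={subject_cov:.1%}")
```

The identity works out to 164/500 = 32.8%, which the report prints as 32%. The alignment spans the full query and all but the last seven residues of the candidate.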


Lambda     K      H
   0.317    0.136    0.404 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 639
Number of extensions: 26
Number of successful extensions: 5
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 2
Number of HSP's successfully gapped: 1
Length of query: 500
Length of database: 497
Length adjustment: 34
Effective length of query: 466
Effective length of database: 463
Effective search space:   215758
Effective search space used:   215758
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.6 bits)
S2: 52 (24.6 bits)
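
The bit score and E-value reported above follow from the gapped Karlin-Altschul parameters in this statistics block. As a rough check, assuming the standard relations S' = (lambda*S - ln K) / ln 2 and E = (search space) * 2^(-S'), the reported values can be reproduced as follows. The function names are hypothetical and the constants are copied from the report.

```python
import math

# Values copied from the report above ("Gapped" Lambda and K, raw score in
# parentheses on the Score line, and "Effective search space").
GAPPED_LAMBDA = 0.267
GAPPED_K = 0.0410
RAW_SCORE = 773
SEARCH_SPACE = 215758

def bit_score(raw, lam, k):
    """Convert a raw alignment score to a normalized bit score."""
    return (lam * raw - math.log(k)) / math.log(2)

def expect(bits, search_space):
    """Expected number of chance hits with at least this bit score."""
    return search_space * 2.0 ** (-bits)

bits = bit_score(RAW_SCORE, GAPPED_LAMBDA, GAPPED_K)
evalue = expect(bits, SEARCH_SPACE)

print(f"bit score = {bits:.1f}, E-value = {evalue:.1e}")
```

This gives a bit score of about 302.4 and an E-value of about 2e-86, consistent with the "Score = 302 bits (773), Expect = 2e-86" line. Small differences are expected because the printed parameters are rounded.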

Align candidate WP_013402101.1 CALHY_RS00630 (xylulokinase)
to HMM TIGR01312 (xylB: xylulokinase (EC 2.7.1.17))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR01312.hmm
# target sequence database:        /tmp/gapView.4175829.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR01312  [M=481]
Accession:   TIGR01312
Description: XylB: xylulokinase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                             Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                             -----------
   1.4e-124  402.2   0.0   1.6e-124  402.0   0.0    1.0  1  NCBI__GCF_000166355.1:WP_013402101.1  


Domain annotation for each sequence (and alignments):
>> NCBI__GCF_000166355.1:WP_013402101.1  
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  402.0   0.0  1.6e-124  1.6e-124       2     481 .]       6     485 ..       5     485 .. 0.94

  Alignments for each domain:
  == domain 1  score: 402.0 bits;  conditional E-value: 1.6e-124
                             TIGR01312   2 GiDlgTssvKallvdekgeviasgsasltvispkpgwsEqdpeewlealeealkellekakeekkeikaisis 74 
                                            iD+gT+++K+++ d +g+++a+++ ++++++p+  w+Eqdp +w++ + e +ke+ ++a    + i ai++s
  NCBI__GCF_000166355.1:WP_013402101.1   6 TIDIGTTACKVIVFDLQGNILAKANREYPTYTPQIEWAEQDPLDWWNEVVEGIKEVAQAAG--ADGIVAIGLS 76 
                                           69*********************************************************99..599******* PP

                             TIGR01312  75 GQmHglvlLDeegkvlrpaiLWnDtrtaeeceeleeelgeeelleltgnlalegfTapKllWvrkhepevfar 147
                                            Q   +v +D+eg+vl  ai W D r+  e+ee++++ g+++++++tg ++   fTa+KllW++kh+pe+ ++
  NCBI__GCF_000166355.1:WP_013402101.1  77 SQRETVVPIDKEGNVLSRAISWMDRRSRLEAEEISHQFGKDTIHKITGLIPDSTFTATKLLWLKKHQPEILQK 149
                                           ************************************************************************* PP

                             TIGR01312 148 iakvlLPkDylrykLtgevvteysDAsGTllfdvkkrewskellkaldleesllPklvessekaGkvreevak 220
                                              +l Pk ++ y+Ltge++t++s As T++fdv+kr+w +++++ + +++s++P+l+ ++e++G ++e+vak
  NCBI__GCF_000166355.1:WP_013402101.1 150 AYIFLQPKEFIGYMLTGEAATDHSLASRTMMFDVNKRQWWEDIFEFVGVKTSQFPRLCYADEVIGYLKEDVAK 222
                                           ************************************************************************* PP

                             TIGR01312 221 klGleegvkvaaGggdnaagAiGlgivkegkvlvslGtSGvvlavedkaesdpegavhsFchalpgkwyplgv 293
                                            lGl++g++v++Gggd    A+G+giv +  v+ s Gt   v + ++k  ++ + +v   ch+  ++++    
  NCBI__GCF_000166355.1:WP_013402101.1 223 ILGLKSGIPVVSGGGDRPLEALGAGIVGS-RVMESTGTATNVSMSSNKVPESLDPRVVCSCHVIRDHYLIEQG 294
                                           *************************8875.6899******************************777777777 PP

                             TIGR01312 294 tlsatsalewlkellg.......eldveelneeaekvevgaegvlllPylsGERtPhldpqargsliGltant 359
                                           +   ++ l w+++++        e  +e ++ eae +++ga+gv+llP+++G R    +p+a+g+l+Glt ++
  NCBI__GCF_000166355.1:WP_013402101.1 295 INTSGTILRWIRDNFYrgekekgENVYELIDSEAESSSPGANGVVLLPFFMGSRATRWNPDAKGVLFGLTLTH 367
                                           7778999********87775554445677889999************************************** PP

                             TIGR01312 360 tradlarAvlegvafalrdsldilkelkglkikeirliGGGaksevwrqiladilglevvvpeeeegaalGaA 432
                                           +rad+arAvleg+++ +r+ ++il++  glk ++i+++GGGaks+vw +i adilg++vvv++ +e+a+ Ga 
  NCBI__GCF_000166355.1:WP_013402101.1 368 SRADIARAVLEGISYEIRACIEILES-MGLKAESIVSMGGGAKSRVWSKIKADILGKNVVVEKVQEAASKGAM 439
                                           **************************.66******************************************** PP

                             TIGR01312 433 ilAaialgekdlveecseavvkqkesvepiaenveayeelyerykklye 481
                                            lA +a g     e++ e+  +   +++p+ +n+e y+++ye y++ly+
  NCBI__GCF_000166355.1:WP_013402101.1 440 LLASYAIGAR---ESLIEEKREVLFEYQPDSKNHEIYNRVYEIYNQLYN 485
                                           *******954...34444444444557899****************995 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (481 nodes)
Target sequences:                          1  (497 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.01u 0.00s 00:00:00.01 Elapsed: 00:00:00.00
# Mc/sec: 32.40
//
[ok]
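
How much of the model and of the candidate the HMM hit covers can be read from the domain table above. The sketch below parses that single row by whitespace position. `parse_domain_row` is a hypothetical helper, the column positions assume the HMMER 3 table layout shown here, and the lengths come from the `[M=481]` and "497 residues searched" lines.

```python
# Domain row copied from the hmmsearch output above.
domain_line = (
    "   1 !  402.0   0.0  1.6e-124  1.6e-124       2     481 .]"
    "       6     485 ..       5     485 .. 0.94"
)
MODEL_LENGTH = 481  # from "[M=481]"
SEQ_LENGTH = 497    # from "497 residues searched"

def parse_domain_row(line):
    """Pick the numeric columns out of one domain-table row."""
    f = line.split()
    return {
        "score": float(f[2]),
        "i_evalue": float(f[5]),
        "hmm_from": int(f[6]),
        "hmm_to": int(f[7]),
        "ali_from": int(f[9]),
        "ali_to": int(f[10]),
        "acc": float(f[15]),
    }

row = parse_domain_row(domain_line)

# Coverage of the model and of the candidate sequence (inclusive ranges).
model_cov = (row["hmm_to"] - row["hmm_from"] + 1) / MODEL_LENGTH
seq_cov = (row["ali_to"] - row["ali_from"] + 1) / SEQ_LENGTH

print(f"model coverage = {model_cov:.1%}, sequence coverage = {seq_cov:.1%}")
```

Here the hit spans model positions 2-481 (480 of 481 nodes) and candidate residues 6-485 (480 of 497 residues).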

This GapMind analysis is from Sep 24 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory