Align Xylulose kinase; Short=Xylulokinase; EC 2.7.1.17 (characterized, see rationale)
to candidate WP_013402101.1 CALHY_RS00630 xylulokinase
Query= uniprot:Q97FW4
         (500 letters)

>NCBI__GCF_000166355.1:WP_013402101.1
          Length = 497

 Score = 302 bits (773), Expect = 2e-86
 Identities = 164/500 (32%), Positives = 271/500 (54%), Gaps = 10/500 (2%)

Query: 1   MRYLLGIDVGTSGTKTALFDECGNTIKTSTHEYELFQPQVGWAEQNPENWWTACVKGIRE 60
           M  +L ID+GT+  K  +FD  GN +  +  EY  + PQ+ WAEQ+P +WW   V+GI+E
Sbjct: 1   MEKILTIDIGTTACKVIVFDLQGNILAKANREYPTYTPQIEWAEQDPLDWWNEVVEGIKE 60

Query: 61  VIEKSKIDPLDIKGIGISGQMHGLVLIDKEYKVIRNSIIWCDQRTEKECTQITDTIGKEK 120
           V + +  D   I  IG+S Q   +V IDKE  V+  +I W D+R+  E  +I+   GK+
Sbjct: 61  VAQAAGAD--GIVAIGLSSQRETVVPIDKEGNVLSRAISWMDRRSRLEAEEISHQFGKDT 118

Query: 121 LIRITGNPALTGFTLSKLLWVRNNEPDNYKRIYKVLLPKDYIRFKLTGVFAAEVSDASGT 180
           + +ITG    + FT +KLLW++ ++P+  ++ Y  L PK++I + LTG  A + S AS T
Sbjct: 119 IHKITGLIPDSTFTATKLLWLKKHQPEILQKAYIFLQPKEFIGYMLTGEAATDHSLASRT 178

Query: 181 QMLDINTRNWSEELLDDLRIDKNILPDVYESVVVSGCVIEKASKETKLAVNTPVVGGAGD 240
            M D+N R W E++ + + +  +  P +  +  V G + E  +K   L    PVV G GD
Sbjct: 179 MMFDVNKRQWWEDIFEFVGVKTSQFPRLCYADEVIGYLKEDVAKILGLKSGIPVVSGGGD 238

Query: 241 QAAGAIGNGIVREGLISTVIGTSGVVFAATDTPRFDSKGRVHTLCHAVPNKWHIMGVTQG 300
           +   A+G GIV   ++ +    + V  ++   P      RV   CH + + + I
Sbjct: 239 RPLEALGAGIVGSRVMESTGTATNVSMSSNKVPE-SLDPRVVCSCHVIRDHYLIEQGINT 297

Query: 301 AGLSLNWFKRTFCAKEILESKEAGINIYDLLTEKASQSKPGSNGIIYLPYLMGERTPHID 360
           +G  L W +  F   E    KE G N+Y+L+  +A  S PG+NG++ LP+ MG R    +
Sbjct: 298 SGTILRWIRDNFYRGE----KEKGENVYELIDSEAESSSPGANGVVLLPFFMGSRATRWN 353

Query: 361 PNVKGAFLGISLINNHNDFVRSILEGVGFSLKNCLDIIENMKVNIEEIRVSGGGAESSIW 420
           P+ KG   G++L ++  D  R++LEG+ + ++ C++I+E+M +  E I   GGGA+S +W
Sbjct: 354 PDAKGVLFGLTLTHSRADIARAVLEGISYEIRACIEILESMGLKAESIVSMGGGAKSRVW 413

Query: 421 RQILSDIFNYELTTVKASEGPALGVAILAGVGAGIYNSVEEACDKIVKGNEKVMPNANLI 480
            +I +DI    +   K  E  + G  +LA    G   S+ E   +++    +  P++
Sbjct: 414 SKIKADILGKNVVVEKVQEAASKGAMLLASYAIGARESLIEEKREVL---FEYQPDSKNH 470

Query: 481 EVYSKVYEVYNSAYPKIKDI 500
           E+Y++VYE+YN  Y  +  +
Sbjct: 471 EIYNRVYEIYNQLYNSVSPL 490

Lambda     K      H
   0.317    0.136    0.404

Gapped
Lambda     K      H
   0.267   0.0410    0.140

Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 639
Number of extensions: 26
Number of successful extensions: 5
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 2
Number of HSP's successfully gapped: 1
Length of query: 500
Length of database: 497
Length adjustment: 34
Effective length of query: 466
Effective length of database: 463
Effective search space: 215758
Effective search space used: 215758
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.6 bits)
S2: 52 (24.6 bits)
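The summary figures in the BLAST report above can be turned into fractions with a few lines of Python. This is a minimal sketch, not part of GapMind itself: the names `SUMMARY`, `parse_counts` and `coverage` are made up for illustration, the summary text and the alignment endpoints (query 1 to 500 of 500 letters, subject 1 to 490 of 497) are copied from the report, and "coverage" is assumed here to mean aligned span divided by sequence length.

```python
import re

# Summary text copied from the BLAST report above.
SUMMARY = (
    "Score = 302 bits (773), Expect = 2e-86 "
    "Identities = 164/500 (32%), Positives = 271/500 (54%), "
    "Gaps = 10/500 (2%)"
)


def parse_counts(summary):
    """Return {name: (count, alignment_length)} for Identities, Positives, Gaps."""
    pattern = r"(Identities|Positives|Gaps) = (\d+)/(\d+)"
    return {
        name: (int(count), int(total))
        for name, count, total in re.findall(pattern, summary)
    }


def coverage(start, end, length):
    """Fraction of a sequence spanned by the alignment (assumed definition)."""
    return (end - start + 1) / length


counts = parse_counts(SUMMARY)
identity = counts["Identities"][0] / counts["Identities"][1]
query_cov = coverage(1, 500, 500)    # query aligned over 1..500 of 500 letters
subject_cov = coverage(1, 490, 497)  # candidate aligned over 1..490 of 497

print(f"identity {identity:.1%}, query coverage {query_cov:.1%}, "
      f"subject coverage {subject_cov:.1%}")
```

The identity fraction is computed over the alignment length (500 columns, gaps included), which is how BLAST reports it.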
Align candidate WP_013402101.1 CALHY_RS00630 (xylulokinase)
to HMM TIGR01312 (xylB: xylulokinase (EC 2.7.1.17))
# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR01312.hmm
# target sequence database:        /tmp/gapView.4175829.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR01312  [M=481]
Accession:   TIGR01312
Description: XylB: xylulokinase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                              Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                              -----------
   1.4e-124  402.2   0.0   1.6e-124  402.0   0.0    1.0  1  NCBI__GCF_000166355.1:WP_013402101.1

Domain annotation for each sequence (and alignments):
>> NCBI__GCF_000166355.1:WP_013402101.1
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  402.0   0.0  1.6e-124  1.6e-124       2     481 .]       6     485 ..       5     485 .. 0.94

  Alignments for each domain:
  == domain 1  score: 402.0 bits;  conditional E-value: 1.6e-124
                             TIGR01312   2 GiDlgTssvKallvdekgeviasgsasltvispkpgwsEqdpeewlealeealkellekakeekkeikaisis 74
                                           iD+gT+++K+++ d +g+++a+++ ++++++p+ w+Eqdp +w++ + e +ke+ ++a + i ai++s
  NCBI__GCF_000166355.1:WP_013402101.1   6 TIDIGTTACKVIVFDLQGNILAKANREYPTYTPQIEWAEQDPLDWWNEVVEGIKEVAQAAG--ADGIVAIGLS 76
                                           69*********************************************************99..599******* PP

                             TIGR01312  75 GQmHglvlLDeegkvlrpaiLWnDtrtaeeceeleeelgeeelleltgnlalegfTapKllWvrkhepevfar 147
                                           Q +v +D+eg+vl ai W D r+ e+ee++++ g+++++++tg ++ fTa+KllW++kh+pe+ ++
  NCBI__GCF_000166355.1:WP_013402101.1  77 SQRETVVPIDKEGNVLSRAISWMDRRSRLEAEEISHQFGKDTIHKITGLIPDSTFTATKLLWLKKHQPEILQK 149
                                           ************************************************************************* PP

                             TIGR01312 148 iakvlLPkDylrykLtgevvteysDAsGTllfdvkkrewskellkaldleesllPklvessekaGkvreevak 220
                                           +l Pk ++ y+Ltge++t++s As T++fdv+kr+w +++++ + +++s++P+l+ ++e++G ++e+vak
  NCBI__GCF_000166355.1:WP_013402101.1 150 AYIFLQPKEFIGYMLTGEAATDHSLASRTMMFDVNKRQWWEDIFEFVGVKTSQFPRLCYADEVIGYLKEDVAK 222
                                           ************************************************************************* PP

                             TIGR01312 221 klGleegvkvaaGggdnaagAiGlgivkegkvlvslGtSGvvlavedkaesdpegavhsFchalpgkwyplgv 293
                                           lGl++g++v++Gggd A+G+giv + v+ s Gt v + ++k ++ + +v ch+ ++++
  NCBI__GCF_000166355.1:WP_013402101.1 223 ILGLKSGIPVVSGGGDRPLEALGAGIVGS-RVMESTGTATNVSMSSNKVPESLDPRVVCSCHVIRDHYLIEQG 294
                                           *************************8875.6899******************************777777777 PP

                             TIGR01312 294 tlsatsalewlkellg.......eldveelneeaekvevgaegvlllPylsGERtPhldpqargsliGltant 359
                                           + ++ l w+++++ e +e ++ eae +++ga+gv+llP+++G R +p+a+g+l+Glt ++
  NCBI__GCF_000166355.1:WP_013402101.1 295 INTSGTILRWIRDNFYrgekekgENVYELIDSEAESSSPGANGVVLLPFFMGSRATRWNPDAKGVLFGLTLTH 367
                                           7778999********87775554445677889999************************************** PP

                             TIGR01312 360 tradlarAvlegvafalrdsldilkelkglkikeirliGGGaksevwrqiladilglevvvpeeeegaalGaA 432
                                           +rad+arAvleg+++ +r+ ++il++ glk ++i+++GGGaks+vw +i adilg++vvv++ +e+a+ Ga
  NCBI__GCF_000166355.1:WP_013402101.1 368 SRADIARAVLEGISYEIRACIEILES-MGLKAESIVSMGGGAKSRVWSKIKADILGKNVVVEKVQEAASKGAM 439
                                           **************************.66******************************************** PP

                             TIGR01312 433 ilAaialgekdlveecseavvkqkesvepiaenveayeelyerykklye 481
                                           lA +a g e++ e+ + +++p+ +n+e y+++ye y++ly+
  NCBI__GCF_000166355.1:WP_013402101.1 440 LLASYAIGAR---ESLIEEKREVLFEYQPDSKNHEIYNRVYEIYNQLYN 485
                                           *******954...34444444444557899****************995 PP

Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (481 nodes)
Target sequences:                          1  (497 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.01u 0.00s 00:00:00.01 Elapsed: 00:00:00.00
# Mc/sec: 32.40
//
[ok]
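The domain row in the hmmsearch report above gives the model and sequence coordinates of the hit, so the fraction of the HMM that is aligned can be worked out directly. The following is a rough sketch under stated assumptions: `Domain`, `parse_domain_row` and `MODEL_LENGTH` are illustrative names, the row text and the model length (`[M=481]`) are copied from the report, and the field positions follow the column header shown in the report's domain table.

```python
from dataclasses import dataclass

# Domain row and model length copied from the hmmsearch report above.
DOMAIN_ROW = "1 ! 402.0 0.0 1.6e-124 1.6e-124 2 481 .] 6 485 .. 5 485 .. 0.94"
MODEL_LENGTH = 481  # from "[M=481]"


@dataclass
class Domain:
    score: float
    i_evalue: float
    hmm_from: int
    hmm_to: int
    ali_from: int
    ali_to: int
    acc: float


def parse_domain_row(row):
    """Split one whitespace-separated domain row into typed fields."""
    f = row.split()
    # Columns: # ! score bias c-Evalue i-Evalue hmmfrom hmmto .. alifrom alito .. envfrom envto .. acc
    return Domain(
        score=float(f[2]),
        i_evalue=float(f[5]),
        hmm_from=int(f[6]),
        hmm_to=int(f[7]),
        ali_from=int(f[9]),
        ali_to=int(f[10]),
        acc=float(f[15]),
    )


dom = parse_domain_row(DOMAIN_ROW)
model_coverage = (dom.hmm_to - dom.hmm_from + 1) / MODEL_LENGTH
aligned_residues = dom.ali_to - dom.ali_from + 1

print(f"model coverage {model_coverage:.1%} over {aligned_residues} residues")
```

For this hit the alignment spans model positions 2 to 481, so all but the first node of the 481-node model is covered.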
This GapMind analysis is from Sep 24 2021. The underlying query database was built on Sep 17 2021.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other blast hits with at least 50% coverage are "low confidence."
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
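The notion of a "gap" described above (a step with no high- or medium-confidence candidate) can be illustrated with a small sketch. This is not GapMind's code: the function name `find_gaps`, the step names and the confidence labels in the example are all hypothetical, and the rules that assign confidence levels are taken as already applied.

```python
def find_gaps(candidates_by_step):
    """Return the steps that have no high- or medium-confidence candidate."""
    return [
        step
        for step, levels in candidates_by_step.items()
        if not any(level in ("high", "medium") for level in levels)
    ]


# Hypothetical pathway: confidence levels of the candidates found for each step.
example = {
    "stepA": ["high", "low"],  # has a high-confidence candidate
    "stepB": ["low"],          # only a low-confidence candidate
    "stepC": [],               # no candidates at all
}
gaps = find_gaps(example)
print(gaps)
```

Under this definition a step with only low-confidence candidates counts as a gap, just like a step with no candidates.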
For more information, see:
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory