GapMind for catabolism of small carbon sources

 

Alignments for a candidate for xylB in Acidovorax sp. GW101-3H11

Align Xylulose kinase (EC 2.7.1.17) (characterized)
to candidate Ac3H11_2937 Xylulose kinase (EC 2.7.1.17)

Query= reanno::Smeli:SMc03164
         (484 letters)



>FitnessBrowser__acidovorax_3H11:Ac3H11_2937
          Length = 515

 Score =  448 bits (1152), Expect = e-130
 Identities = 247/513 (48%), Positives = 309/513 (60%), Gaps = 34/513 (6%)

Query: 1   MYLGLDLGTSGVKAMLMDGEQRIIGSASGALDVDRPHPGWSEQDPADWIRAAEEAIARLR 60
           MYLG+DLGTSGVK +L++G Q ++ +A  A+   RP P WSEQ+PADW+ A E A+A+LR
Sbjct: 1   MYLGIDLGTSGVKLLLLNGAQTLVATADAAVPQHRPQPTWSEQNPADWMAAVEAAVAQLR 60

Query: 61  ETHAQALAAVRGIGLSGQMHGATLLDEGDAVLRPCILWNDTRSFREAAAL-DGDPQFRAL 119
                A A VRGIGLSG MHGA +L     VLRP ILWND R+  E A L D  P  R +
Sbjct: 61  AQAPAAWAQVRGIGLSGHMHGAVVLGAQGHVLRPAILWNDGRASAECAQLEDAVPTSRQI 120

Query: 120 TGNIVFPGFTAPKLAWVRENEPEIFARVRWVLLPKDYLRLWLTGEHMSEMSDSAGTSWLD 179
           TGN+  PGFTAPKL W+R +EP +FA+VR VLLPKD+LRL LTG+ +S+MSD++GT WLD
Sbjct: 121 TGNLAMPGFTAPKLLWLRTHEPAVFAQVRTVLLPKDWLRLQLTGDAVSDMSDASGTLWLD 180

Query: 180 TGKRKWSASLLAATHLEERQMPDLVEGTDAAGTLRPELAARWGMGPGVVVAGGAGDNAAS 239
              R WS ++L A  L+   MP L EG+   GTLR ++A RWG+G GVVVA GAGDNAAS
Sbjct: 181 VQARTWSPAMLQACGLDVSHMPKLAEGSAPTGTLRSDVARRWGLGEGVVVAAGAGDNAAS 240

Query: 240 ACGMGTVGEGQAFVSLGTSGVLFAANASYLPNPESAVHAFCHALPNTWHQMGVILSATDA 299
           A G+G    GQ FVSLGTSGV+F    ++ P  E AVHAF HALP  WH M V+LSA  A
Sbjct: 241 AVGVGARTAGQGFVSLGTSGVVFRVTDAFAPATERAVHAFAHALPQRWHHMSVMLSAASA 300

Query: 300 LNWHSGVTGRS-AAELTSELG--ESLKAPGSVTFLPYLSGERTPHNDATIRGVFAGLGHE 356
             W + +TGRS  A+L+  +G   + +   +  FLPYLSGERTPHN+A   GVF GL  E
Sbjct: 301 FGWVTRLTGRSDEAQLSDAVGALSTSRQAQAPLFLPYLSGERTPHNNAAATGVFMGLRAE 360

Query: 357 SSRAVLTQAVLEGVSFAIRDSLEALRAAGTKLKRVTA----------------------- 393
              A L  AV+EGV F + D L A+RAAG    R                          
Sbjct: 361 HEAADLAYAVMEGVGFGLLDGLNAMRAAGAGQGRAAGEAVGSTTPVQAELVEARTAPGAT 420

Query: 394 ------IGGGSRSRYWLSSIATALNLPVDLPADGDFGAAFGAARLGLIAATGADPAAVCT 447
                 +GGG+RS  W   +A+AL  P+  P      AA GAARL  +A  G D A  C 
Sbjct: 421 DSALALVGGGARSNPWAQLLASALGTPLQRPQGAHAAAALGAARLAAMAC-GGDEAHWCQ 479

Query: 448 APETAETIAPEASLVPAYEDAYQRYRRLYPAIK 480
                 T  P+ +      + Y R+  LYPA++
Sbjct: 480 PLPADATFMPQPAQQALLAERYARFVALYPALQ 512


Lambda     K      H
   0.317    0.132    0.401 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 715
Number of extensions: 35
Number of successful extensions: 5
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 2
Number of HSP's successfully gapped: 1
Length of query: 484
Length of database: 515
Length adjustment: 34
Effective length of query: 450
Effective length of database: 481
Effective search space:   216450
Effective search space used:   216450
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 52 (24.6 bits)
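
The summary figures in the BLAST report above can be cross-checked with simple arithmetic. The following is a minimal sketch that uses only numbers printed in the report. The variable names are illustrative and are not part of BLAST or GapMind, and the query-coverage formula (aligned query span divided by query length) is an assumption about how coverage might be computed, not necessarily GapMind's exact definition. Integer division is used because the reported percentages appear to be truncated rather than rounded.

```python
# Values copied from the BLAST report above (names are illustrative).
query_len, subject_len = 484, 515
length_adjustment = 34
identities, positives, gaps, align_len = 247, 309, 34, 513
query_start, query_end = 1, 480  # first and last aligned query positions

# Effective lengths and search space, as listed in the statistics block.
eff_query = query_len - length_adjustment
eff_db = subject_len - length_adjustment
search_space = eff_query * eff_db

# Percentages over alignment columns (truncated to whole numbers).
pct_identity = 100 * identities // align_len
pct_positive = 100 * positives // align_len
pct_gaps = 100 * gaps // align_len

# Assumed definition: fraction of the query covered by the alignment.
query_coverage = (query_end - query_start + 1) / query_len

print(eff_query, eff_db, search_space)
print(pct_identity, pct_positive, pct_gaps)
print(f"query coverage: {query_coverage:.3f}")
```

Under these assumptions the recomputed values agree with the report: effective lengths of 450 and 481, a search space of 216450, and 48%, 60% and 6% for identities, positives and gaps.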

Align candidate Ac3H11_2937 (Xylulose kinase (EC 2.7.1.17))
to HMM TIGR01312 (xylB: xylulokinase (EC 2.7.1.17))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR01312.hmm
# target sequence database:        /tmp/gapView.8476.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR01312  [M=481]
Accession:   TIGR01312
Description: XylB: xylulokinase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                                        Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                                        -----------
   1.2e-162  527.8   2.7     5e-149  482.8   0.9    2.0  2  lcl|FitnessBrowser__acidovorax_3H11:Ac3H11_2937  Xylulose kinase (EC 2.7.1.17)


Domain annotation for each sequence (and alignments):
>> lcl|FitnessBrowser__acidovorax_3H11:Ac3H11_2937  Xylulose kinase (EC 2.7.1.17)
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  482.8   0.9    5e-149    5e-149       1     387 [.       3     389 ..       3     397 .. 0.98
   2 !   45.0   0.0   2.9e-16   2.9e-16     387     480 ..     417     508 ..     393     509 .. 0.84

  Alignments for each domain:
  == domain 1  score: 482.8 bits;  conditional E-value: 5e-149
                                        TIGR01312   1 lGiDlgTssvKallvdekgeviasgsasltvispkpgwsEqdpeewlealeealkellekak 62 
                                                      lGiDlgTs+vK ll++  ++++a++ a+++  +p+p+wsEq+p +w++a+e a+++l+++a 
  lcl|FitnessBrowser__acidovorax_3H11:Ac3H11_2937   3 LGIDLGTSGVKLLLLNGAQTLVATADAAVPQHRPQPTWSEQNPADWMAAVEAAVAQLRAQAP 64 
                                                      7************************************************************* PP

                                        TIGR01312  63 eekkeikaisisGQmHglvlLDeegkvlrpaiLWnDtrtaeeceeleeelgeeelleltgnl 124
                                                        ++++++i++sG mHg+v+L ++g+vlrpaiLWnD r+++ec++le++++  +++++tgnl
  lcl|FitnessBrowser__acidovorax_3H11:Ac3H11_2937  65 AAWAQVRGIGLSGHMHGAVVLGAQGHVLRPAILWNDGRASAECAQLEDAVP--TSRQITGNL 124
                                                      **************************************************9..99******* PP

                                        TIGR01312 125 alegfTapKllWvrkhepevfariakvlLPkDylrykLtgevvteysDAsGTllfdvkkrew 186
                                                      a++gfTapKllW+r hep vfa++++vlLPkD+lr++Ltg++v+++sDAsGTl++dv+ r+w
  lcl|FitnessBrowser__acidovorax_3H11:Ac3H11_2937 125 AMPGFTAPKLLWLRTHEPAVFAQVRTVLLPKDWLRLQLTGDAVSDMSDASGTLWLDVQARTW 186
                                                      ************************************************************** PP

                                        TIGR01312 187 skellkaldleesllPklvessekaGkvreevakklGleegvkvaaGggdnaagAiGlgivk 248
                                                      s ++l+a+ l+ s +Pkl e+s+ +G++r++va+++Gl egv vaaG+gdnaa+A+G+g+ +
  lcl|FitnessBrowser__acidovorax_3H11:Ac3H11_2937 187 SPAMLQACGLDVSHMPKLAEGSAPTGTLRSDVARRWGLGEGVVVAAGAGDNAASAVGVGART 248
                                                      ************************************************************** PP

                                        TIGR01312 249 egkvlvslGtSGvvlavedkaesdpegavhsFchalpgkwyplgvtlsatsalewlkellge 310
                                                      +g+ +vslGtSGvv+ v+d+  + +e avh+F+halp++w++++v+lsa+sa  w+++l+g+
  lcl|FitnessBrowser__acidovorax_3H11:Ac3H11_2937 249 AGQGFVSLGTSGVVFRVTDAFAPATERAVHAFAHALPQRWHHMSVMLSAASAFGWVTRLTGR 310
                                                      *************************************************************9 PP

                                        TIGR01312 311 ldveelneeaekvevg..aegvlllPylsGERtPhldpqargsliGltanttradlarAvle 370
                                                      +d ++l  ++ + +++  a++ l+lPylsGERtPh++++a+g+++Gl+a+++ adla+Av+e
  lcl|FitnessBrowser__acidovorax_3H11:Ac3H11_2937 311 SDEAQLSDAVGALSTSrqAQAPLFLPYLSGERTPHNNAAATGVFMGLRAEHEAADLAYAVME 372
                                                      9999**9999887665559******************************************* PP

                                        TIGR01312 371 gvafalrdsldilkelk 387
                                                      gv f+l d+l+++++++
  lcl|FitnessBrowser__acidovorax_3H11:Ac3H11_2937 373 GVGFGLLDGLNAMRAAG 389
                                                      **************966 PP

  == domain 2  score: 45.0 bits;  conditional E-value: 2.9e-16
                                        TIGR01312 387 kglkikeirliGGGaksevwrqiladilglevvvpe.eeegaalGaAilAaialgekdlvee 447
                                                       g + + + l+GGGa+s+ w q+la  lg++++ p+ ++ +aalGaA+lAa+a+g  d+++ 
  lcl|FitnessBrowser__acidovorax_3H11:Ac3H11_2937 417 PGATDSALALVGGGARSNPWAQLLASALGTPLQRPQgAHAAAALGAARLAAMACGG-DEAHW 477
                                                      556678899***************************7788899***********96.68999 PP

                                        TIGR01312 448 cseavvkqkesvepiaenveayeelyerykkly 480
                                                      c+   ++++    p+ ++++  +e+y+r+ +ly
  lcl|FitnessBrowser__acidovorax_3H11:Ac3H11_2937 478 CQPLPADATFM--PQPAQQALLAERYARFVALY 508
                                                      98887777766..77777778999999998887 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (481 nodes)
Target sequences:                          1  (515 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.08u 0.01s 00:00:00.09 Elapsed: 00:00:00.07
# Mc/sec: 3.12
//
[ok]
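
The domain table in the hmmsearch report above shows two domains that together span nearly the whole model. The following is a minimal sketch of how the model coverage and the unaligned stretch of the candidate between the two domains could be computed from those rows. The two rows are copied from the report; the parsing by whitespace-separated column position and the helper name `parse_domain` are assumptions made for illustration, not part of HMMER or GapMind.

```python
# Domain rows copied from the hmmsearch domain table above.
domain_rows = """\
1 !  482.8   0.9    5e-149    5e-149       1     387 [.       3     389 ..       3     397 .. 0.98
2 !   45.0   0.0   2.9e-16   2.9e-16     387     480 ..     417     508 ..     393     509 .. 0.84
"""
model_len = 481  # from "Query: TIGR01312 [M=481]"


def parse_domain(line):
    """Pick out score and coordinates by column position (illustrative)."""
    f = line.split()
    return {
        "score": float(f[2]),
        "hmm_from": int(f[6]),
        "hmm_to": int(f[7]),
        "ali_from": int(f[9]),
        "ali_to": int(f[10]),
    }


domains = [parse_domain(row) for row in domain_rows.splitlines() if row.strip()]

# Union of model positions matched by any domain (handles overlaps).
covered_positions = set()
for d in domains:
    covered_positions.update(range(d["hmm_from"], d["hmm_to"] + 1))
covered = len(covered_positions)
model_coverage = covered / model_len

# Candidate residues lying between the two aligned domains.
insert_len = domains[1]["ali_from"] - domains[0]["ali_to"] - 1

print(f"model coverage: {covered}/{model_len} = {model_coverage:.3f}")
print(f"unaligned residues between domains: {insert_len}")
```

With these rows, 480 of the 481 model positions are covered, and 27 candidate residues (390 to 416) fall between the two domains, which corresponds to the long gapped region in the BLAST alignment.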

This GapMind analysis is from Sep 17 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMER with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
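
To illustrate what searching the six-frame translation involves, here is a minimal sketch that translates a DNA sequence in all six reading frames. It is not GapMind's or Curated BLAST's implementation. It assumes the standard genetic code, marks stop codons as `*`, and writes `X` for any codon containing a character other than A, C, G or T. The function names are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(dna):
    """Translate complete codons from the start of dna; '*' marks a stop."""
    codons = (dna[i:i + 3] for i in range(0, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def six_frame_translation(dna):
    """Return translations for frames +1..+3 and -1..-3."""
    dna = dna.upper()
    reverse_complement = dna.translate(COMPLEMENT)[::-1]
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna[offset:])
        frames[f"-{offset + 1}"] = translate(reverse_complement[offset:])
    return frames


frames = six_frame_translation("ATGGCC")
for name in sorted(frames):
    print(name, frames[name])
```

A protein search against these translated frames can find a gene that was missed or truncated during gene calling, which a search restricted to predicted proteins cannot.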

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory