GapMind for catabolism of small carbon sources

 

Alignments for a candidate for glk in Leptospirillum ferrooxidans C2-3

Align Glucokinase; EC 2.7.1.2; Glucose kinase (uncharacterized)
to candidate WP_081495339.1 LFE_RS04280 glucokinase

Query= curated2:B2J224
         (341 letters)



>NCBI__GCF_000284315.1:WP_081495339.1
          Length = 400

 Score =  197 bits (501), Expect = 4e-55
 Identities = 120/348 (34%), Positives = 185/348 (53%), Gaps = 22/348 (6%)

Query: 3   LLLAGDIGGTKTIL---QLVETSDSQGLHTIYQESYHSADFPDLVPIVQQFLI-----KA 54
           L+L GDIGGTKT L    L E    +    +    Y S  +  L P++Q++L        
Sbjct: 55  LVLVGDIGGTKTSLALFSLAERGSREHFVPLCPAVYPSKRYDSLEPMIQEYLSWVRERLV 114

Query: 55  NTPIPEKACFAIAGPIVKNTAKLTNLAWFLDTERLQQELGIP--HIYLINDFAAVGYGIS 112
             P    A F +AGP+     + TNL W +D   L++ + +   ++ L+ND  A+ + + 
Sbjct: 115 GDPNIVAATFGVAGPVTGRVCRTTNLPWIVDAGSLEKLISVQEGNVNLLNDLEALAWSVG 174

Query: 113 GLQKQDLHPLQVGKPQPETPIGIIGAGTGLGQGFLIKQGNNYQVFPSEGGHADFAPRNEI 172
           G     L  +Q G+P   T + ++  GTGLG+  L  +  N+   PSEGGHAD+AP N+ 
Sbjct: 175 GDPLIPLATIQEGEPGTSTTMVLVAPGTGLGEAVLCGRDENFFALPSEGGHADWAPVNKE 234

Query: 173 EFQLLKYLLDKHDIQRISVERVVSGMGIVAIYQFLRDRKFAAESPDIAQIVRTWEQEAGQ 232
           + +LL+YL +  +   +SVERV+SGMGIV +Y FL     A   P              +
Sbjct: 235 QARLLEYLWE--EFSHVSVERVLSGMGIVRLYTFLAKDIPAPNRP----------LPIPK 282

Query: 233 EEKSVDPGAAIGTAALEKRDRLSEQTLQLFIEAYGAEAGNLALKLLPYGGLYIAGGIAPK 292
            E   D  + I   AL+++D +  + L++F E    EA N+ LK+L  GG +I GGI  +
Sbjct: 283 AEPPADFPSQISQKALDEKDPVCMKALEMFAEILAQEASNMVLKVLARGGCFIGGGIPIR 342

Query: 293 ILPLIQNSGFLLNFTQKGRMRPLLEEIPVYIILNPQVGLIGAALCAAR 340
           I P +++  F   F +KGR   LL ++PV+IIL+P+  L+G+A  A+R
Sbjct: 343 IRPFLESVAFRRRFVEKGRYSELLAKVPVWIILDPEAPLVGSAFYASR 390
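
The percentages in the alignment header can be re-derived from the reported counts and coordinates. The following is a minimal sketch in Python, not part of the GapMind output; the variable names are illustrative, and the numbers are copied from the report above. Coverage here is simply the aligned span divided by the sequence length, which may differ from how GapMind itself computes coverage.

```python
def percent(numerator: int, denominator: int) -> float:
    """Return numerator/denominator as a percentage."""
    return 100.0 * numerator / denominator

# Counts from the "Identities / Positives / Gaps" line of the report.
identities, positives, gaps, alignment_columns = 120, 185, 22, 348

# Coordinates from the first and last Query/Sbjct lines, and the lengths.
query_start, query_end, query_length = 3, 340, 341
subject_start, subject_end, subject_length = 55, 390, 400

identity_pct = percent(identities, alignment_columns)
positive_pct = percent(positives, alignment_columns)
gap_pct = percent(gaps, alignment_columns)

# Residues of each sequence that fall inside the alignment.
query_span = query_end - query_start + 1
subject_span = subject_end - subject_start + 1

query_coverage_pct = percent(query_span, query_length)
subject_coverage_pct = percent(subject_span, subject_length)

# Every alignment column that lacks a residue from one sequence is a gap,
# so the gap count can be cross-checked against the spans.
implied_gaps = (alignment_columns - query_span) + (alignment_columns - subject_span)

print(f"identity {identity_pct:.1f}%, positives {positive_pct:.1f}%, gaps {gap_pct:.1f}%")
print(f"query coverage {query_coverage_pct:.1f}%, subject coverage {subject_coverage_pct:.1f}%")
```

The cross-check on `implied_gaps` is a quick way to confirm that the coordinates and the gap count in a report agree with each other.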


Lambda     K      H
   0.320    0.140    0.408 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 320
Number of extensions: 18
Number of successful extensions: 5
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 341
Length of database: 400
Length adjustment: 30
Effective length of query: 311
Effective length of database: 370
Effective search space:   115070
Effective search space used:   115070
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.8 bits)
S2: 49 (23.5 bits)
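
The effective lengths and search space in the statistics above follow from the raw lengths and the length adjustment. The sketch below reproduces that arithmetic in Python. It also estimates an E-value with the standard relation E = search space × 2^(-bit score). Because the report rounds the bit score, that estimate is only expected to match the reported E-value to within an order of magnitude. The variable names are illustrative.

```python
# Values copied from the BLAST statistics block above.
query_length = 341
database_length = 400
number_of_sequences = 1
length_adjustment = 30
bit_score = 197  # rounded in the report

# The adjustment is subtracted once from the query and once per database sequence.
effective_query_length = query_length - length_adjustment
effective_database_length = database_length - number_of_sequences * length_adjustment

effective_search_space = effective_query_length * effective_database_length

# Approximate E-value from the rounded bit score (assumption: standard
# Karlin-Altschul relation between bit score and E-value).
approx_evalue = effective_search_space * 2.0 ** (-bit_score)

print(effective_query_length, effective_database_length, effective_search_space)
print(f"approximate E-value: {approx_evalue:.1e}")
```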

Align candidate WP_081495339.1 LFE_RS04280 (glucokinase)
to HMM TIGR00749 (glk: glucokinase (EC 2.7.1.2))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR00749.hmm
# target sequence database:        /tmp/gapView.3696649.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR00749  [M=315]
Accession:   TIGR00749
Description: glk: glucokinase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                             Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                             -----------
    1.7e-77  246.7   0.0    2.1e-77  246.5   0.0    1.1  1  NCBI__GCF_000284315.1:WP_081495339.1  


Domain annotation for each sequence (and alignments):
>> NCBI__GCF_000284315.1:WP_081495339.1  
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  246.5   0.0   2.1e-77   2.1e-77       1     315 []      57     385 ..      57     385 .. 0.94

  Alignments for each domain:
  == domain 1  score: 246.5 bits;  conditional E-value: 2.1e-77
                             TIGR00749   1 lvgdiGGtnarlalvevapgeieqv......ktyssedfpsleavvrvyleeakvel.kdpi..kgcfaiatP 64 
                                           lvgdiGGt++ lal   a +  +         +y s+ ++sle++++ yl+  ++ l +dp    ++f +a+P
  NCBI__GCF_000284315.1:WP_081495339.1  57 LVGDIGGTKTSLALFSLAERGSREHfvplcpAVYPSKRYDSLEPMIQEYLSWVRERLvGDPNivAATFGVAGP 129
                                           89*************8888777665567889********************999875267644699******* PP

                             TIGR00749  65 iigdfvrltnldWalsieelkqelala..klelindfaavayailalkeedliqlggakveesaaiailGaGt 135
                                           ++g ++r tnl W +    l++ ++++  +++l+nd+ a a+++ +     l+ ++  ++ +s+++ ++ +Gt
  NCBI__GCF_000284315.1:WP_081495339.1 130 VTGRVCRTTNLPWIVDAGSLEKLISVQegNVNLLNDLEALAWSVGGDPLIPLATIQEGEPGTSTTMVLVAPGT 202
                                           ************************99834699**************9999*********************** PP

                             TIGR00749 136 GlGvatliqqsdgrykvlageGghvdfaPrseleillleylrkkygrvsaervlsGsGlvliyealskrkger 208
                                           GlG a l    d+++ +l++eGgh+d+aP +++++ lleyl +++++vs+ervlsG+G+v +y +l k  +  
  NCBI__GCF_000284315.1:WP_081495339.1 203 GLGEAVLCG-RDENFFALPSEGGHADWAPVNKEQARLLEYLWEEFSHVSVERVLSGMGIVRLYTFLAKDIPAP 274
                                           *****9998.899********************************************************9887 PP

                             TIGR00749 209 e....vsklskeelkekdiseaalegsdvlarralelflsilGalagnlalklgarGGvyvaGGivPrfiell 277
                                           +    + k       + +is++al+++d+++ +ale+f  il ++a+n+ lk++arGG ++ GGi  r+ ++l
  NCBI__GCF_000284315.1:WP_081495339.1 275 NrplpIPKAEPPADFPSQISQKALDEKDPVCMKALEMFAEILAQEASNMVLKVLARGGCFIGGGIPIRIRPFL 347
                                           766777777777888999******************************************************* PP

                             TIGR00749 278 kkssfraafedkGrlkellasiPvqvvlkkkvGllGag 315
                                           ++ +fr +f +kGr+ ella++Pv ++l+ ++ l+G++
  NCBI__GCF_000284315.1:WP_081495339.1 348 ESVAFRRRFVEKGRYSELLAKVPVWIILDPEAPLVGSA 385
                                           ***********************************986 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (315 nodes)
Target sequences:                          1  (400 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.00u 0.00s 00:00:00.00 Elapsed: 00:00:00.00
# Mc/sec: 19.86
//
[ok]
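
The "About GapMind" text below refers to coverage of HMM models, so it can be useful to see how much of the model a hit spans. The following Python sketch splits the domain row from the hmmsearch output above on whitespace and computes the fraction of the model covered. It is an illustration only: the column positions are assumed from the header printed above the row, and for real pipelines the tabular output written by hmmsearch's `--domtblout` option is likely a more robust thing to parse.

```python
# Domain row and model length ("[M=315]") copied from the hmmsearch output above.
row = ("   1 !  246.5   0.0   2.1e-77   2.1e-77       1     315 []"
       "      57     385 ..      57     385 .. 0.94")
model_length = 315

fields = row.split()

# Column order per the header: #, flag, score, bias, c-Evalue, i-Evalue,
# hmmfrom, hmm to, bounds, alifrom, ali to, bounds, envfrom, env to, bounds, acc
score = float(fields[2])
hmm_from, hmm_to = int(fields[6]), int(fields[7])
ali_from, ali_to = int(fields[9]), int(fields[10])

# Fraction of the model's match states covered by this domain.
model_coverage = (hmm_to - hmm_from + 1) / model_length

# Number of candidate residues inside the aligned region.
aligned_residues = ali_to - ali_from + 1

print(f"score {score} bits, model coverage {model_coverage:.0%}, "
      f"{aligned_residues} residues aligned")
```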

This GapMind analysis is from Sep 24 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
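
The definition of a gap given above (a step with no high- or medium-confidence candidate) can be sketched in a few lines of Python. This is not GapMind's code: the function name, the data layout, and the step names in the example are all hypothetical and serve only to illustrate the rule.

```python
def find_gaps(candidates_by_step: dict[str, list[str]]) -> list[str]:
    """Return the steps that have no high- or medium-confidence candidate.

    candidates_by_step maps a step name to the confidence labels
    ("high", "medium", or "low") of its candidates. This layout is an
    assumption made for the example.
    """
    usable = {"high", "medium"}
    return [
        step
        for step, confidences in candidates_by_step.items()
        if not usable.intersection(confidences)
    ]

# Hypothetical input, not taken from any real GapMind result.
example = {
    "step_with_good_candidate": ["high", "low"],
    "step_with_only_weak_candidate": ["low"],
    "step_with_no_candidate": [],
}

gaps = find_gaps(example)
print(gaps)
```

A step with only low-confidence candidates still counts as a gap under this rule, which is why the second hypothetical step is reported.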

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory