GapMind for catabolism of small carbon sources

 

Alignments for a candidate for sdaB in Bacteroides thetaiotaomicron VPI-5482

Align L-serine ammonia-lyase (EC 4.3.1.17) (characterized)
to candidate 354204 BT4678 L-serine dehydratase (NCBI ptt file)

Query= reanno::acidovorax_3H11:Ac3H11_929
         (466 letters)



>FitnessBrowser__Btheta:354204
          Length = 411

 Score =  298 bits (762), Expect = 3e-85
 Identities = 190/459 (41%), Positives = 252/459 (54%), Gaps = 67/459 (14%)

Query: 4   SVFDLFKIGIGPSSSHTVGPMIAARQFACHVQGTLGLDAVHHVTVELFGSLSATGVGHGT 63
           S+ +L++IG GPSSSHT+GP  AA  F                 V L+GSL+ATG GH T
Sbjct: 12  SIKELYRIGTGPSSSHTMGPRKAAEMFVERHPDAASFK------VTLYGSLAATGKGHMT 65

Query: 64  DKAVLLGLAGHEPDHIDPDQILPAIADIRTRQTLALLGEHPVPFVEKEHLLFRRKSLPLH 123
           D A++             D + PA                PV  V +  +      LP H
Sbjct: 66  DVAII-------------DTLQPAA---------------PVEIVWQPKVF-----LPFH 92

Query: 124 PNGMAFHAFDAQGNEIATREYYSVGGGFVIDAAGERVLNSAATAGPDAVGHAQGLPHPFR 183
           PNGM F A DA    +     YS+GGG + +      + S    G + +           
Sbjct: 93  PNGMTFAALDADNKILENWTVYSIGGGALAENNDNPTIESPEVYGMNNM----------- 141

Query: 184 TGAELLAQCQATGLTPAQLM--AANEQHWRSASEVRRQLMAIWKTMAGAVQRGCASTGTL 241
              E+L  C+ TG +  + +    NE  W   +EV       W TM  A+ RG  + G L
Sbjct: 142 --TEILQWCERTGKSYWEYVKECENEDIWDYLAEV-------WDTMKDAIHRGLEAEGVL 192

Query: 242 PGPMHVRRRAAELHHKLSSAPEAALRDPLSMLDWVNLYAMAVNEENAAGGRVVTAPTNGA 301
           PGP+++RR+A+  + + +   ++     L     V  YA+AV+EENA+GG++VTAPT G+
Sbjct: 193 PGPLNLRRKASTYYIRATGYKQS-----LQSRGLVFSYALAVSEENASGGKIVTAPTCGS 247

Query: 302 AGVIPAVLHYYVNFLPGANEDGIATFLLTAGAIGIIYKENASLSGAEVGCQGEVGVACSM 361
            GV+PAVL Y++      ++  I   L TAG IG I K NAS+SGAEVGCQGEVGVAC+M
Sbjct: 248 CGVMPAVL-YHLQKSRDFSDMRILRALATAGLIGNIVKFNASISGAEVGCQGEVGVACAM 306

Query: 362 AAGALAAVMGGSPEQIENAAEIGMEHNLGMTCDPVGGLVQIPCIERNAMGAIKAINAARM 421
           A+ A   + GGSP QIE AAE+G+EH+LGMTCDPV GLVQIPCIERNA  A +A++A   
Sbjct: 307 ASAAANQLFGGSPAQIEYAAEMGLEHHLGMTCDPVCGLVQIPCIERNAYAAARALDANLY 366

Query: 422 ALRGDGQHVVSLDKVIKTMMQTGADMKVKYKETSRGGLA 460
           +   DG H VS DKV++ M QTG D+   YKETS GGLA
Sbjct: 367 SAFTDGMHRVSFDKVVQVMKQTGHDLPSLYKETSEGGLA 405


Lambda     K      H
   0.319    0.134    0.398 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 521
Number of extensions: 32
Number of successful extensions: 6
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 466
Length of database: 411
Length adjustment: 32
Effective length of query: 434
Effective length of database: 379
Effective search space:   164486
Effective search space used:   164486
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.8 bits)
S2: 51 (24.3 bits)
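
The score, E-value and search-space figures in the report above are related by the standard Karlin-Altschul formulas. The following is a minimal sketch that recomputes them from the numbers printed in the report; the function and constant names are illustrative and are not part of GapMind or BLAST.

```python
import math

# Values copied from the BLAST report above (gapped statistics).
LAMBDA_GAPPED = 0.267
K_GAPPED = 0.0410
RAW_SCORE = 762
QUERY_LEN = 466
SUBJECT_LEN = 411
LENGTH_ADJUSTMENT = 32


def bit_score(raw_score, lam, k):
    """Convert a raw alignment score to a bit score."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def effective_search_space(query_len, subject_len, adjustment):
    """Product of the length-adjusted query and database lengths."""
    return (query_len - adjustment) * (subject_len - adjustment)


def expect_value(bits, search_space):
    """Expected number of chance hits scoring at least `bits`."""
    return search_space * 2.0 ** (-bits)


bits = bit_score(RAW_SCORE, LAMBDA_GAPPED, K_GAPPED)
space = effective_search_space(QUERY_LEN, SUBJECT_LEN, LENGTH_ADJUSTMENT)
e_value = expect_value(bits, space)

print(f"bit score    ~ {bits:.1f}")
print(f"search space = {space}")
print(f"E-value      ~ {e_value:.1e}")
```

The recomputed values should agree with the reported 298 bits, effective search space of 164486, and an Expect on the order of 3e-85, up to rounding of Lambda and K in the printed report.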

Align candidate 354204 BT4678 (L-serine dehydratase (NCBI ptt file))
to HMM TIGR00720 (L-serine ammonia-lyase (EC 4.3.1.17))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR00720.hmm
# target sequence database:        /tmp/gapView.4609.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR00720  [M=450]
Accession:   TIGR00720
Description: sda_mono: L-serine ammonia-lyase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                          Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                          -----------
   1.1e-131  425.8   0.7   7.3e-112  360.4   0.1    2.0  2  lcl|FitnessBrowser__Btheta:354204  BT4678 L-serine dehydratase (NCB


Domain annotation for each sequence (and alignments):
>> lcl|FitnessBrowser__Btheta:354204  BT4678 L-serine dehydratase (NCBI ptt file)
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !   64.9   0.0   3.5e-22   3.5e-22       2      77 ..      12      81 ..      11      86 .. 0.91
   2 !  360.4   0.1  7.3e-112  7.3e-112     114     448 ..      84     406 ..      76     408 .. 0.93

  Alignments for each domain:
  == domain 1  score: 64.9 bits;  conditional E-value: 3.5e-22
                          TIGR00720  2 svfdlfkiGiGPssshtvGPmkaakefveelkkkgkleqvkrvkvdlyGslaltGkGhktdkavllGleGelpeev 77
                                       s+ +l++iG GPsssht+GP kaa++fve+  ++      ++ kv+lyGsla+tGkGh td a++  l+   p e+
  lcl|FitnessBrowser__Btheta:354204 12 SIKELYRIGTGPSSSHTMGPRKAAEMFVERHPDA------ASFKVTLYGSLAATGKGHMTDVAIIDTLQPAAPVEI 81
                                       7889************************998655......899**********************99988877665 PP

  == domain 2  score: 360.4 bits;  conditional E-value: 7.3e-112
                          TIGR00720 114 k.devlplhenglrlkaydeegevlkektyysvGGGfivdeeelkkeeeeeeevpypfksaaellelCkeeglsis 188
                                        + + +lp+h+ng+++ a+d ++++l + t ys+GGG + ++++     + e+   y  ++++e+l+ C+++g+s  
  lcl|FitnessBrowser__Btheta:354204  84 QpKVFLPFHPNGMTFAALDADNKILENWTVYSIGGGALAENNDN---PTIESPEVYGMNNMTEILQWCERTGKSYW 156
                                        546789******************************99776544...44555667********************9 PP

                          TIGR00720 189 evvlenekalrseeevraklleiwkvmeecierglkaegvlpGglkvkrraaslkrklkakeetskdplavldwvn 264
                                        e v e e     +e++ ++l+e+w++m+++i+rgl+aegvlpG+l+++r+a++ + + +  +++    l+    v 
  lcl|FitnessBrowser__Btheta:354204 157 EYVKECE-----NEDIWDYLAEVWDTMKDAIHRGLEAEGVLPGPLNLRRKASTYYIRATGYKQS----LQSRGLVF 223
                                        9998766.....47899************************************99887665555....67778999 PP

                          TIGR00720 265 lyalavneenaaGgrvvtaPtnGaagiiPavlayykkfveeaseekvvrflltagaiGilykenasisgaevGCqg 340
                                         yalav+eena+Gg++vtaPt G++g++Pavl++  +  ++ s+ ++ r l tag iG + k nasisgaevGCqg
  lcl|FitnessBrowser__Btheta:354204 224 SYALAVSEENASGGKIVTAPTCGSCGVMPAVLYH-LQKSRDFSDMRILRALATAGLIGNIVKFNASISGAEVGCQG 298
                                        9******************************955.5568899********************************** PP

                          TIGR00720 341 evGvacsmaaaglaellggtpeqvenaaeiamehnlGltCdPvgGlvqiPCiernaiaavkainaarlalkedgkk 416
                                        evGvac+ma+a+  +l+gg+p q+e aae+++eh+lG+tCdPv GlvqiPCierna aa +a++a   +   dg +
  lcl|FitnessBrowser__Btheta:354204 299 EVGVACAMASAAANQLFGGSPAQIEYAAEMGLEHHLGMTCDPVCGLVQIPCIERNAYAAARALDANLYSAFTDGMH 374
                                        **************************************************************************** PP

                          TIGR00720 417 kvsldkvietmretGkdmkakyketskgGlav 448
                                        +vs+dkv+++m++tG+d+ + ykets+gGla+
  lcl|FitnessBrowser__Btheta:354204 375 RVSFDKVVQVMKQTGHDLPSLYKETSEGGLAK 406
                                        ******************************97 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (450 nodes)
Target sequences:                          1  (411 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.07u 0.01s 00:00:00.08 Elapsed: 00:00:00.07
# Mc/sec: 2.48
//
[ok]
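
As a rough illustration of how the per-domain table above can be read, the sketch below parses the two domain rows and sums how much of the HMM and of the candidate protein they cover. The column positions are taken from the table header shown above; the helper names are hypothetical, and the simple summation assumes the domains do not overlap (which holds for these two rows).

```python
# Domain rows copied from the hmmsearch table above.
DOMAIN_ROWS = """\
   1 !   64.9   0.0   3.5e-22   3.5e-22       2      77 ..      12      81 ..      11      86 .. 0.91
   2 !  360.4   0.1  7.3e-112  7.3e-112     114     448 ..      84     406 ..      76     408 .. 0.93
"""
HMM_LENGTH = 450   # "[M=450]" in the report
SEQ_LENGTH = 411   # "411 residues searched"


def parse_domain_row(line):
    """Split one whitespace-delimited domain row into named fields."""
    f = line.split()
    return {
        "score": float(f[2]),
        "i_evalue": float(f[5]),
        "hmm_from": int(f[6]),
        "hmm_to": int(f[7]),
        "ali_from": int(f[9]),
        "ali_to": int(f[10]),
    }


domains = [parse_domain_row(row) for row in DOMAIN_ROWS.splitlines() if row.strip()]

# Count aligned positions on the model and on the sequence (inclusive ranges).
hmm_aligned = sum(d["hmm_to"] - d["hmm_from"] + 1 for d in domains)
seq_aligned = sum(d["ali_to"] - d["ali_from"] + 1 for d in domains)

hmm_coverage = hmm_aligned / HMM_LENGTH
seq_coverage = seq_aligned / SEQ_LENGTH

print(f"HMM positions aligned: {hmm_aligned} of {HMM_LENGTH} ({hmm_coverage:.1%})")
print(f"Residues aligned:      {seq_aligned} of {SEQ_LENGTH} ({seq_coverage:.1%})")
```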

This GapMind analysis is from Sep 17 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMER with enzyme models (usually from TIGRFAMs). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."
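
To make the 50% coverage threshold concrete, here is a minimal sketch that computes alignment coverage for the BLAST hit shown on this page. It assumes that coverage is measured as the aligned fraction of the characterized (query) protein, which is not stated here; the function and constant names are illustrative, and the high- and medium-confidence rules are not modeled.

```python
# Assumed threshold from the text: hits with at least 50% coverage
# count as (at least) low confidence.
MIN_LOW_CONFIDENCE_COVERAGE = 0.5


def query_coverage(q_start, q_end, q_len):
    """Fraction of the query spanned by the alignment (inclusive coordinates)."""
    return (q_end - q_start + 1) / q_len


# Coordinates from the alignment on this page: query residues 4..460 of 466.
coverage = query_coverage(4, 460, 466)
meets_threshold = coverage >= MIN_LOW_CONFIDENCE_COVERAGE

print(f"coverage = {coverage:.1%}, meets 50% threshold: {meets_threshold}")
```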

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
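
For readers unfamiliar with the term, a six-frame translation translates a DNA sequence in all three reading frames on both strands, so that proteins missed by gene prediction can still be found. The sketch below is a generic illustration using the standard genetic code; it is not the implementation used by GapMind or Curated BLAST, and all names in it are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons only; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")
        for i in range(0, len(dna) - 2, 3)
    )


def six_frame(dna):
    """Translate all three frames on the forward and reverse strands."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


for name, protein in six_frame("ATGGCC").items():
    print(name, protein)
```

Stop codons are rendered as `*`, so open reading frames can be recovered by splitting each translated frame on that character.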

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory