GapMind for catabolism of small carbon sources

 

Alignments for a candidate for pobA in Caulobacter crescentus NA1000

Align p-hydroxybenzoate hydroxylase; PHBH; 4-hydroxybenzoate 3-monooxygenase; EC 1.14.13.2 (characterized)
to candidate CCNA_02487 CCNA_02487 4-hydroxybenzoate 3-monooxygenase

Query= SwissProt::P20586
         (394 letters)



>FitnessBrowser__Caulo:CCNA_02487
          Length = 406

 Score =  491 bits (1263), Expect = e-143
 Identities = 238/389 (61%), Positives = 290/389 (74%)

Query: 1   MKTQVAIIGAGPSGLLLGQLLHKAGIDNVILERQTPDYVLGRIRAGVLEQGMVDLLREAG 60
           ++TQVAI+GAGP+GL LG LL +AG+D VILER+   YV GR+RAGVLE+  V+L+   G
Sbjct: 16  VRTQVAIVGAGPAGLFLGHLLRQAGVDVVILERKDRAYVEGRVRAGVLERITVELMERLG 75

Query: 61  VDRRMARDGLVHEGVEIAFAGQRRRIDLKRLSGGKTVTVYGQTEVTRDLMEAREACGATT 120
           VD RM R+GLVH G  +A  G+  RID+  L+GG TV VYGQ EV +DL +A E      
Sbjct: 76  VDERMRREGLVHAGANLASDGEMFRIDMAELTGGSTVMVYGQQEVMKDLFDAAEQRDLRI 135

Query: 121 VYQAAEVRLHDLQGERPYVTFERDGERLRLDCDYIAGCDGFHGISRQSIPAERLKVFERV 180
           V+ A  VRLHD++GERP++T+ +DG   RLDCD+IAGCDG+HG+SR +IP + LK FERV
Sbjct: 136 VFDADAVRLHDVEGERPHITWRKDGAEHRLDCDFIAGCDGYHGVSRATIPDKVLKTFERV 195

Query: 181 YPFGWLGLLADTPPVSHELIYANHPRGFALCSQRSATRSRYYVQVPLSEKVEDWSDERFW 240
           YPFGWLG+LA+ PP  HELIY+NH RGFAL S RS TRSRYYVQ  L +++EDWSDERFW
Sbjct: 196 YPFGWLGILAEAPPCDHELIYSNHDRGFALASMRSPTRSRYYVQCSLDDRLEDWSDERFW 255

Query: 241 TELKARLPSEVAEKLVTGPSLEKSIAPLRSFVVEPMQHGRLFLAGDAAHIVPPTGAKGLN 300
            E+  RL  E A ++V  PS EKSIAPLRSFV EPM++GRLFLAGDAAHIVPPTGAKG+N
Sbjct: 256 DEVSVRLGPEAAARIVRAPSFEKSIAPLRSFVSEPMRYGRLFLAGDAAHIVPPTGAKGMN 315

Query: 301 LAASDVSTLYRLLLKAYREGRGELLERYSAICLRRIWKAERFSWWMTSVLHRFPDTDAFS 360
           LA SDV  L   L++ Y E     ++ YSA  L R+WKAERFSWW TS+ HRFPD D F 
Sbjct: 316 LAVSDVIMLSEALVEHYHERSSAGIDGYSARALARVWKAERFSWWFTSLTHRFPDQDGFD 375

Query: 361 QRIQQTELEYYLGSEAGLATIAENYVGLP 389
           +++Q  EL Y  GS A   T+AENYVGLP
Sbjct: 376 RKMQVAELAYIKGSRAAQVTLAENYVGLP 404


Lambda     K      H
   0.321    0.138    0.413 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 502
Number of extensions: 23
Number of successful extensions: 1
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 394
Length of database: 406
Length adjustment: 31
Effective length of query: 363
Effective length of database: 375
Effective search space:   136125
Effective search space used:   136125
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.8 bits)
S2: 50 (23.9 bits)
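
The statistics in the report above can be cross-checked with the standard Karlin-Altschul relations. The following is a minimal sketch, not part of the GapMind output: the function names are illustrative, and it assumes the gapped Lambda and K are the parameters used to normalize the raw score. All input numbers are copied from the report.

```python
import math

# Values copied from the BLAST report above (gapped Lambda and K).
GAPPED_LAMBDA = 0.267
GAPPED_K = 0.0410
RAW_SCORE = 1263
QUERY_LEN = 394
DB_LEN = 406
LENGTH_ADJUSTMENT = 31


def bit_score(raw_score, lam, k):
    """Normalize a raw alignment score to bits."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def effective_search_space(query_len, db_len, adjustment):
    """Product of the length-adjusted query and database lengths."""
    return (query_len - adjustment) * (db_len - adjustment)


def log10_evalue(search_space, bits):
    """log10 of E = search_space * 2**(-bits); log space keeps tiny values readable."""
    return math.log10(search_space) - bits * math.log10(2)


space = effective_search_space(QUERY_LEN, DB_LEN, LENGTH_ADJUSTMENT)
bits = bit_score(RAW_SCORE, GAPPED_LAMBDA, GAPPED_K)

print(space)                                  # 136125
print(round(bits))                            # 491
print(math.floor(log10_evalue(space, bits)))  # -143
```

The results agree with the reported effective search space (136125), bit score (491), and E-value (e-143). Because Lambda and K are printed to only three significant digits, the recomputed bit score matches the report to within about one bit rather than exactly.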

Align candidate CCNA_02487 CCNA_02487 (4-hydroxybenzoate 3-monooxygenase)
to HMM TIGR02360 (pobA: 4-hydroxybenzoate 3-monooxygenase (EC 1.14.13.2))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR02360.hmm
# target sequence database:        /tmp/gapView.5156.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR02360  [M=390]
Accession:   TIGR02360
Description: pbenz_hydroxyl: 4-hydroxybenzoate 3-monooxygenase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                             Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                             -----------
   1.2e-219  714.9   0.0   1.3e-219  714.7   0.0    1.0  1  lcl|FitnessBrowser__Caulo:CCNA_02487  CCNA_02487 4-hydroxybenzoate 3-m


Domain annotation for each sequence (and alignments):
>> lcl|FitnessBrowser__Caulo:CCNA_02487  CCNA_02487 4-hydroxybenzoate 3-monooxygenase
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  714.7   0.0  1.3e-219  1.3e-219       2     390 .]      17     405 ..      16     405 .. 1.00

  Alignments for each domain:
  == domain 1  score: 714.7 bits;  conditional E-value: 1.3e-219
                             TIGR02360   2 ktqvaiigaGpsGlllgqllhkaGidavilerksrdyvlgriraGvleqgtvdlleeagvderldreglvheG 74 
                                           +tqvai+gaGp+Gl+lg+ll++aG+d+vilerk+r+yv+gr+raGvle++tv+l+e++gvder++reglvh+G
  lcl|FitnessBrowser__Caulo:CCNA_02487  17 RTQVAIVGAGPAGLFLGHLLRQAGVDVVILERKDRAYVEGRVRAGVLERITVELMERLGVDERMRREGLVHAG 89 
                                           9************************************************************************ PP

                             TIGR02360  75 veiafegekvrvdlkkltggksvlvyGqtevtrdlyeareaaglktvyeadevrlhdlesdrpkvtfekdgee 147
                                           +++a++ge++r+d+++ltgg++v+vyGq+ev++dl++a+e+++l++v++ad+vrlhd+e++rp++t++kdg+e
  lcl|FitnessBrowser__Caulo:CCNA_02487  90 ANLASDGEMFRIDMAELTGGSTVMVYGQQEVMKDLFDAAEQRDLRIVFDADAVRLHDVEGERPHITWRKDGAE 162
                                           ************************************************************************* PP

                             TIGR02360 148 krldcdfiaGcdGfhGvsrksipaeklkefekvypfGwlGilsetppvsdeliysnserGfalcslrsetrsr 220
                                           +rldcdfiaGcdG+hGvsr++ip+++lk+fe+vypfGwlGil+e+pp+++eliysn++rGfal+s+rs trsr
  lcl|FitnessBrowser__Caulo:CCNA_02487 163 HRLDCDFIAGCDGYHGVSRATIPDKVLKTFERVYPFGWLGILAEAPPCDHELIYSNHDRGFALASMRSPTRSR 235
                                           ************************************************************************* PP

                             TIGR02360 221 yyvqvsltdkvedwsddrfweelkrrldeeaaeklvtgpsieksiaplrsfvaepmryGrlflaGdaahivpp 293
                                           yyvq+sl+d++edwsd+rfw+e++ rl +eaa+++v++ps+eksiaplrsfv+epmryGrlflaGdaahivpp
  lcl|FitnessBrowser__Caulo:CCNA_02487 236 YYVQCSLDDRLEDWSDERFWDEVSVRLGPEAAARIVRAPSFEKSIAPLRSFVSEPMRYGRLFLAGDAAHIVPP 308
                                           ************************************************************************* PP

                             TIGR02360 294 tGakGlnlaasdvaylyealleaykekdsaglerysakalarvwkaerfswwltsllhrfpdedefdkkiqqa 366
                                           tGakG+nla+sdv++l+eal+e+y+e++sag+++ysa+alarvwkaerfsww+tsl+hrfpd+d fd+k+q+a
  lcl|FitnessBrowser__Caulo:CCNA_02487 309 TGAKGMNLAVSDVIMLSEALVEHYHERSSAGIDGYSARALARVWKAERFSWWFTSLTHRFPDQDGFDRKMQVA 381
                                           ************************************************************************* PP

                             TIGR02360 367 eleylleseaaqktlaenyvGlpy 390
                                           el+y+++s+aaq tlaenyvGlp+
  lcl|FitnessBrowser__Caulo:CCNA_02487 382 ELAYIKGSRAAQVTLAENYVGLPL 405
                                           **********************95 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (390 nodes)
Target sequences:                          1  (406 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.01u 0.01s 00:00:00.02 Elapsed: 00:00:00.01
# Mc/sec: 11.38
//
[ok]
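
The domain row in the hmmsearch report above can be read programmatically to check how much of the model the alignment covers. This is a minimal sketch that assumes the whitespace-separated column layout shown in this report; `parse_domain_row` is a hypothetical helper, not part of HMMER or GapMind.

```python
# Domain row copied from the hmmsearch report above.
DOMAIN_ROW = (
    "   1 !  714.7   0.0  1.3e-219  1.3e-219       2     390 .]"
    "      17     405 ..      16     405 .. 1.00"
)
MODEL_LENGTH = 390  # from "Query: TIGR02360 [M=390]"


def parse_domain_row(row):
    """Split one domain row into named fields (layout assumed from this report)."""
    tokens = row.split()
    return {
        "score": float(tokens[2]),
        "i_evalue": float(tokens[5]),
        "hmm_from": int(tokens[6]),
        "hmm_to": int(tokens[7]),
        "ali_from": int(tokens[9]),
        "ali_to": int(tokens[10]),
    }


dom = parse_domain_row(DOMAIN_ROW)

# Fraction of the HMM's match states spanned by the alignment.
hmm_coverage = (dom["hmm_to"] - dom["hmm_from"] + 1) / MODEL_LENGTH

print(dom["score"])           # 714.7
print(f"{hmm_coverage:.1%}")  # 99.7%
```

Here the alignment spans model positions 2 to 390, so 389 of the 390 match states are covered.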

This GapMind analysis is from Sep 17 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."
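
As an illustration of the quantities involved, identity and coverage can be computed from the coordinates of the BLAST alignment shown earlier on this page. This is a sketch under stated assumptions: the text does not say which sequence the 50% coverage is measured against, so both query and subject coverage are computed, and the function names are illustrative.

```python
def percent_identity(identities, aligned_columns):
    """Identical positions as a percentage of alignment columns."""
    return 100.0 * identities / aligned_columns


def coverage(start, end, total_length):
    """Fraction of a sequence spanned by the alignment (1-based, inclusive)."""
    return (end - start + 1) / total_length


# Numbers copied from the BLAST report: 238/389 identities,
# query 1-389 of 394 residues, subject 16-404 of 406 residues.
identity = percent_identity(238, 389)
query_cov = coverage(1, 389, 394)
subject_cov = coverage(16, 404, 406)

print(f"{identity:.1f}% identity")          # 61.2% identity
print(f"query coverage {query_cov:.1%}")    # query coverage 98.7%
print(f"subject coverage {subject_cov:.1%}")  # subject coverage 95.8%

# Assumption: the 50% threshold is applied as a simple fraction.
meets_coverage_threshold = query_cov >= 0.5
```

Both coverage values are far above the 50% threshold mentioned in the text, whichever sequence it is measured against.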

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory