GapMind for catabolism of small carbon sources

 

Alignments for a candidate for fahA in Dinoroseobacter shibae DFL-12

Align Fumarylacetoacetase; FAA; Beta-diketonase; Fumarylacetoacetate hydrolase; EC 3.7.1.2 (characterized)
to candidate 3610430 Dshi_3811 fumarylacetoacetase (RefSeq)

Query= SwissProt::P16930
         (419 letters)



>FitnessBrowser__Dino:3610430
          Length = 413

 Score =  404 bits (1037), Expect = e-117
 Identities = 210/416 (50%), Positives = 272/416 (65%), Gaps = 15/416 (3%)

Query: 2   SFIPVAED--SDFPIHNLPYGVFSTRGDPRPRIGVAIGDQILDLSIIKHLFTGPVLSKHQ 59
           S++P A D    FP++NL +GVFST GD  PR  VAIGD++LDL+ ++      +L  H 
Sbjct: 7   SWVPGANDPAGAFPLNNLAFGVFST-GDG-PRCAVAIGDKVLDLAALQ---AAGLLPDHG 61

Query: 60  DVFNQPTLNSFMGLGQAAWKEARVFLQNLLSVSQARLRDDTELRKCAFISQASATMHLPA 119
             F+ P L++FMG GQ AW+  R  L  LL     R   +T   + A   +A   +HLP 
Sbjct: 62  --FDAPALDTFMGRGQPAWQAVREALTELL-----RAGAETAPVRAALHDRAGVRLHLPF 114

Query: 120 TIGDYTDFYSSRQHATNVGIMFRDKENALMPNWLHLPVGYHGRASSVVVSGTPIRRPMGQ 179
           T+ ++TDFY+ RQHA NVG +FRD  NAL PNWLH+P+GY+GRAS+VVVSGTPI RP GQ
Sbjct: 115 TLAEFTDFYAGRQHAFNVGSLFRDPANALPPNWLHMPIGYNGRASTVVVSGTPIHRPAGQ 174

Query: 180 MKPDDSKPPVYGACKLLDMELEMAFFVGPGNRLGEPIPISKAHEHIFGMVLMNDWSARDI 239
           +K      P +G C+ LD ELE+   VG  +++G P+ + +A E IFG VL+NDWSARDI
Sbjct: 175 IKDPSDPMPRFGPCERLDFELELGAVVGTPSQMGVPVTVDEADEMIFGYVLLNDWSARDI 234

Query: 240 QKWEYVPLGPFLGKSFGTTVSPWVVPMDALMPFAVPNPKQDPRPLPYLCHDEPYTFDINL 299
           Q WEYVPLGPF GK+F TT+SPWVVP  AL PF    P ++   LP+L    P   DI+L
Sbjct: 235 QAWEYVPLGPFQGKAFATTISPWVVPRAALAPFRCGPPVREVPLLPHLRDTGPMFHDIDL 294

Query: 300 SVNLKGEGMSQAATICKSNFKYMYWTMLQQLTHHSVNGCNLRPGDLLASGTISGPEPENF 359
           +V L   G      +C++N   +Y++  Q L HHS +GC +R GDLL SGTISGPE   F
Sbjct: 295 AVTLAPPG-GAPTEVCRTNSNALYYSAAQLLAHHSTSGCAMRTGDLLGSGTISGPEKGMF 353

Query: 360 GSMLELSWKGTKPIDLGNGQTRKFLLDGDEVIITGYCQGDGYRIGFGQCAGKVLPA 415
           GS+LE++W G  P+ L  G TR+FL DGD V + G  +GDGYRIGFG C G +LPA
Sbjct: 354 GSLLEITWGGRDPVALAGGATRRFLADGDTVTLKGEARGDGYRIGFGTCTGTILPA 409


Lambda     K      H
   0.321    0.139    0.440 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 612
Number of extensions: 20
Number of successful extensions: 5
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 419
Length of database: 413
Length adjustment: 31
Effective length of query: 388
Effective length of database: 382
Effective search space:   148216
Effective search space used:   148216
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.8 bits)
S2: 50 (23.9 bits)
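
The reported score, E-value, and search space can be cross-checked from the statistics printed above. The following is a minimal sketch, assuming the standard Karlin-Altschul relations (bit score = (lambda * S - ln K) / ln 2, E = search space * 2^-bits) and assuming that percentages are truncated to whole numbers; the function names are illustrative and are not part of BLAST or GapMind.

```python
import math

# Values copied from the BLAST report above (gapped Lambda and K).
GAPPED_LAMBDA = 0.267
GAPPED_K = 0.0410
RAW_SCORE = 1037
QUERY_LEN, DB_LEN, LENGTH_ADJUSTMENT = 419, 413, 31


def bit_score(raw_score, lam, k):
    """Convert a raw alignment score to a bit score."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def effective_search_space(query_len, db_len, adjustment):
    """Product of the adjusted lengths (single-sequence database assumed)."""
    return (query_len - adjustment) * (db_len - adjustment)


def expect_value(bits, search_space):
    """Expected number of chance hits scoring at least `bits`."""
    return search_space * 2.0 ** (-bits)


def percent(count, total):
    """Whole-number percentage, truncated (assumed to match the report)."""
    return count * 100 // total


bits = bit_score(RAW_SCORE, GAPPED_LAMBDA, GAPPED_K)
space = effective_search_space(QUERY_LEN, DB_LEN, LENGTH_ADJUSTMENT)
evalue = expect_value(bits, space)

print(f"bit score ~ {bits:.1f}, search space = {space}, E ~ {evalue:.1e}")
print(percent(210, 416), percent(272, 416), percent(15, 416))
```

Under these assumptions the computed values agree with the report: about 404 bits, a search space of 148216, and an E-value on the order of 1e-117.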

Align candidate 3610430 Dshi_3811 (fumarylacetoacetase (RefSeq))
to HMM TIGR01266 (fahA: fumarylacetoacetase (EC 3.7.1.2))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR01266.hmm
# target sequence database:        /tmp/gapView.25010.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR01266  [M=420]
Accession:   TIGR01266
Description: fum_ac_acetase: fumarylacetoacetase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                         Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                         -----------
   5.6e-157  508.8   0.0   6.3e-157  508.6   0.0    1.0  1  lcl|FitnessBrowser__Dino:3610430  Dshi_3811 fumarylacetoacetase (R


Domain annotation for each sequence (and alignments):
>> lcl|FitnessBrowser__Dino:3610430  Dshi_3811 fumarylacetoacetase (RefSeq)
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  508.6   0.0  6.3e-157  6.3e-157       1     419 [.       7     409 ..       7     410 .. 0.97

  Alignments for each domain:
  == domain 1  score: 508.6 bits;  conditional E-value: 6.3e-157
                         TIGR01266   1 sfvavak..nsdfplqnlPyGvfstkadssrrigvaiGdqildlskiaaaglfeglalkehqevfkestlnaflalg 75 
                                       s+v+ a+     fpl+nl +Gvfs  ++  +r++vaiGd++ldl++++aagl+ +       + f+  +l++f++ g
  lcl|FitnessBrowser__Dino:3610430   7 SWVPGANdpAGAFPLNNLAFGVFS--TGDGPRCAVAIGDKVLDLAALQAAGLLPD-------HGFDAPALDTFMGRG 74 
                                       789888844468************..56789********************9988.......89************* PP

                         TIGR01266  76 rparkevrerlqkllsesaevlrdnaalrkeallaqaeatmhlPaqiGdytdfyssirhatnvGilfrgkdnallPn 152
                                       +pa+++vre l +ll + ae+     a  + al+ +a ++ hlP ++ ++tdfy++++ha nvG lfr + nal Pn
  lcl|FitnessBrowser__Dino:3610430  75 QPAWQAVREALTELLRAGAET-----APVRAALHDRAGVRLHLPFTLAEFTDFYAGRQHAFNVGSLFRDPANALPPN 146
                                       ***************966655.....5566799******************************************** PP

                         TIGR01266 153 ykhlPvgyhGrassvvvsGtelrrPvGqikadnakePvfgpckkldlelelaffvgtenelGeavpiekaeehifGv 229
                                       + h+P+gy Gras+vvvsGt+++rP Gqik +   +P fgpc++ld+elel+  vgt+ ++G +v ++ a+e ifG 
  lcl|FitnessBrowser__Dino:3610430 147 WLHMPIGYNGRASTVVVSGTPIHRPAGQIKDPSDPMPRFGPCERLDFELELGAVVGTPSQMGVPVTVDEADEMIFGY 223
                                       ***************************************************************************** PP

                         TIGR01266 230 vllndwsardiqaweyvPlGPflaksfattvsPwvvsiealePfrvaqlePeqdpkplpylredradtafdielevs 306
                                       vllndwsardiqaweyvPlGPf +k+fatt+sPwvv+  al Pfr     P+++  +lp+lr+   ++ +di+l+v+
  lcl|FitnessBrowser__Dino:3610430 224 VLLNDWSARDIQAWEYVPLGPFQGKAFATTISPWVVPRAALAPFRCG--PPVREVPLLPHLRDT-GPMFHDIDLAVT 297
                                       *********************************************99..888888899****99.9*********** PP

                         TIGR01266 307 lkteGlaeaavisrsnakslywtlkqqlahhsvnGcnlraGdllgsGtisGkeeeafGsllelsakGkkevkladge 383
                                       l + G + ++ ++r+n+  ly++ +q lahhs +Gc +r+GdllgsGtisG+e++ fGslle++++G+++v la g 
  lcl|FitnessBrowser__Dino:3610430 298 LAPPG-GAPTEVCRTNSNALYYSAAQLLAHHSTSGCAMRTGDLLGSGTISGPEKGMFGSLLEITWGGRDPVALAGGA 373
                                       *****.99********************************************************************* PP

                         TIGR01266 384 trkfledGdevilrgvckkeGvrvGfGecaGkvlpa 419
                                       tr+fl dGd+v l+g ++ +G+r+GfG c+G++lpa
  lcl|FitnessBrowser__Dino:3610430 374 TRRFLADGDTVTLKGEARGDGYRIGFGTCTGTILPA 409
                                       **********************************98 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (420 nodes)
Target sequences:                          1  (413 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.01u 0.01s 00:00:00.02 Elapsed: 00:00:00.01
# Mc/sec: 9.99
//
[ok]
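
The domain table row above can be read programmatically to see how much of the model and of the protein the alignment spans. This is a minimal sketch that splits the row on whitespace and assumes the 16-column layout shown in the header line; `parse_domain_row` is a hypothetical helper, not part of HMMER.

```python
# Domain row copied from the hmmsearch output above.
ROW = ("   1 !  508.6   0.0  6.3e-157  6.3e-157       1     419 [.  "
       "     7     409 ..       7     410 .. 0.97")

MODEL_LEN = 420    # "Query: TIGR01266 [M=420]"
PROTEIN_LEN = 413  # "Target sequences: 1 (413 residues searched)"


def parse_domain_row(row):
    """Extract the score and coordinates from one hmmsearch domain row."""
    f = row.split()
    if len(f) != 16:
        raise ValueError(f"expected 16 fields, got {len(f)}")
    return {
        "score": float(f[2]),
        "i_evalue": float(f[5]),
        "hmm_from": int(f[6]),
        "hmm_to": int(f[7]),
        "ali_from": int(f[9]),
        "ali_to": int(f[10]),
        "acc": float(f[15]),
    }


dom = parse_domain_row(ROW)
# Coordinates are 1-based and inclusive, hence the +1.
hmm_coverage = (dom["hmm_to"] - dom["hmm_from"] + 1) / MODEL_LEN
seq_coverage = (dom["ali_to"] - dom["ali_from"] + 1) / PROTEIN_LEN

print(f"model coverage:   {hmm_coverage:.3f}")
print(f"protein coverage: {seq_coverage:.3f}")
```

For this hit the alignment spans 419 of the 420 model positions and 403 of the 413 residues of the candidate.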

This GapMind analysis is from Sep 17 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMER with enzyme models (usually from TIGRFAMs). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."
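
The 50% coverage threshold can be illustrated with a short check. This is only a sketch: this page does not define how coverage is measured, so the code assumes it is the fraction of the curated (query) protein spanned by the alignment, and both function names are hypothetical.

```python
def alignment_coverage(start, end, length):
    """Fraction of a sequence spanned by an alignment (1-based, inclusive)."""
    if not 1 <= start <= end <= length:
        raise ValueError("coordinates must satisfy 1 <= start <= end <= length")
    return (end - start + 1) / length


def meets_coverage_threshold(start, end, length, threshold=0.5):
    """True if the alignment spans at least `threshold` of the sequence."""
    return alignment_coverage(start, end, length) >= threshold


# The BLAST alignment above spans query positions 2-415 of 419.
cov = alignment_coverage(2, 415, 419)
print(f"coverage = {cov:.3f}")
print(meets_coverage_threshold(2, 415, 419))
```

The hit shown on this page spans nearly the whole query, so it is far above the threshold under this assumed definition.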

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
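
The definition of a gap given above (a step with no high- or medium-confidence candidate) can be sketched as a simple filter. The data layout, the function name `find_gaps`, and the step names other than fahA are hypothetical and are used for illustration only; this is not GapMind's own code.

```python
def find_gaps(step_candidates):
    """Return the steps that have no 'high' or 'medium' confidence candidate.

    `step_candidates` maps a step name to the confidence labels of its
    candidates; an empty list means no candidate was found at all.
    """
    acceptable = {"high", "medium"}
    return sorted(
        step
        for step, confidences in step_candidates.items()
        if not acceptable & set(confidences)
    )


# Illustrative input; "stepB" and "stepC" are made-up step names.
example = {
    "fahA": ["high"],
    "stepB": ["low", "low"],
    "stepC": [],
}

print(find_gaps(example))
```

Steps with only low-confidence candidates are treated as gaps in the same way as steps with no candidates.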

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory