GapMind for catabolism of small carbon sources

 

Alignments for a candidate for hmgA in Paraburkholderia sp. CCGE1002

Align Homogentisate 1,2-dioxygenase (EC 1.13.11.5) (characterized)
to candidate WP_013091977.1 BC1002_RS20860 homogentisate 1,2-dioxygenase

Query= reanno::pseudo13_GW456_L13:PfGW456L13_4962
         (434 letters)



>NCBI__GCF_000092885.1:WP_013091977.1
          Length = 434

 Score =  548 bits (1413), Expect = e-161
 Identities = 271/415 (65%), Positives = 321/415 (77%), Gaps = 4/415 (0%)

Query: 12  YQSGFGNEFSSEALPGALPVGQNSPQKAPYGLYTELFSGTAFTMARSEARRTWMYRIQPS 71
           Y SGFGN+FS+EA+PGALP G+NSPQ+AP+GL+ EL SG AFT  R+E RRTWMYRI PS
Sbjct: 12  YMSGFGNQFSTEAIPGALPEGRNSPQRAPHGLFAELLSGAAFTAPRAENRRTWMYRILPS 71

Query: 72  ANHPAFFKLDRQL-TGGPLGEV-TP-NRLRWNPLDIPAEPTDFIDGLVSMAANSGAEKPA 128
           A H A+ K+++     GP GEV TP NR RW+P   P  PTDF+DG V++A N   E  +
Sbjct: 72  AMHGAYRKIEQPAWESGPFGEVDTPANRFRWDPWPAPTLPTDFVDGTVTIAGNGSPEAQS 131

Query: 129 GISIYNYRANRSM-ERAFFNADGEMLLVPELGRLRIATELGVLELEPLEIAVLPRGLKFR 187
           G++ + YRAN SM  R   NADGEML+VP+LGRL I TELGVL+L P E+AVLPRGL F 
Sbjct: 132 GMTAHMYRANASMANRYLLNADGEMLIVPQLGRLTIRTELGVLDLAPGEVAVLPRGLHFA 191

Query: 188 IELLDPQARGYVAENHGAPLRLPDLGPIGSNGLANPRDFLTPVAAYENLQQPTTLVQKFL 247
           ++L D +A GY+AEN+GAP RLP+LGPIGSNGLAN RDFLTPVAAYE+ Q+   +V+KFL
Sbjct: 192 VDLDDGEASGYIAENYGAPFRLPELGPIGSNGLANHRDFLTPVAAYEDGQRNVKIVRKFL 251

Query: 248 GQLWACELNHSPLNVVAWHGNNVPYKYDLRRFNTIGTVSFDHPDPSIFTVLTSPTSVHGL 307
           G+ W    NHSPLNVVAWHGN VPYKYDL RF  IGTVSFDHPDPSI+TVLTSP++V G 
Sbjct: 252 GKFWEGTQNHSPLNVVAWHGNLVPYKYDLARFMAIGTVSFDHPDPSIYTVLTSPSTVPGT 311

Query: 308 ANLDFVIFPPRWMVAEKTFRPPWFHRNLMNEFMGLIQGEYDAKAEGFVPGGASLHSCMSA 367
           AN DFVIFPPRW+VAE TFRPPWFHRNLM+E MGL+ G YDAKAEGFVPGG SLH+CM  
Sbjct: 312 ANCDFVIFPPRWLVAEDTFRPPWFHRNLMSELMGLVYGVYDAKAEGFVPGGVSLHNCMMP 371

Query: 368 HGPDGETCTKAINAELKPAKIDNTMAFMFETSQVLRPSRFALDCPQLQNTYDACW 422
           HGPD  T  KA + EL P KI NT+AFMFE+S+V + +R +L+    Q  YDA W
Sbjct: 372 HGPDAATFDKASSTELIPHKIANTLAFMFESSRVFKLTRSSLEAVNRQTGYDAVW 426
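
The summary percentages for the alignment above can be rechecked from the reported counts. The following is a minimal sketch, not part of GapMind itself. The variable names are illustrative, and the coverage figures assume that coverage means aligned residues divided by full sequence length, which may differ from the definition GapMind uses internally.

```python
# Numbers copied from the BLAST report above; names are illustrative only.
identities, positives, gaps, alignment_columns = 271, 321, 4, 415
query_len = 434          # "(434 letters)"
subject_len = 434        # "Length = 434"
q_start, q_end = 12, 422  # first and last aligned query residue
s_start, s_end = 12, 426  # first and last aligned subject residue


def percent(part, whole):
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


pct_identity = percent(identities, alignment_columns)
pct_positives = percent(positives, alignment_columns)

# Assumed definition: aligned residues over the full sequence length.
query_coverage = percent(q_end - q_start + 1, query_len)
subject_coverage = percent(s_end - s_start + 1, subject_len)

print(f"identity  {pct_identity:.1f}%")
print(f"positives {pct_positives:.1f}%")
print(f"query coverage   {query_coverage:.1f}%")
print(f"subject coverage {subject_coverage:.1f}%")
```

The query spans 411 residues and contains the 4 reported gaps, which accounts for the 415 alignment columns used as the denominator.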


Lambda     K      H
   0.320    0.137    0.426 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 741
Number of extensions: 27
Number of successful extensions: 3
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 434
Length of database: 434
Length adjustment: 32
Effective length of query: 402
Effective length of database: 402
Effective search space:   161604
Effective search space used:   161604
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.8 bits)
S2: 51 (24.3 bits)
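
The search-space figures in the statistics above follow from the sequence lengths and the length adjustment. The sketch below reproduces them and also estimates the bit score and E-value from the raw score with the standard Karlin-Altschul formulas. It assumes the gapped Lambda and K values apply to this hit. Because those parameters are printed with only a few significant digits, the bit score and E-value come out close to, but not exactly equal to, the reported 548 bits and e-161.

```python
import math

# Values copied from the BLAST statistics above.
query_len = 434
db_len = 434
length_adjustment = 32
num_sequences = 1
raw_score = 1413
lam, K = 0.267, 0.0410   # gapped Lambda and K (rounded in the report)

# Effective lengths subtract the length adjustment
# (once per database sequence on the database side).
eff_query = query_len - length_adjustment
eff_db = db_len - num_sequences * length_adjustment
search_space = eff_query * eff_db

# Karlin-Altschul: bits = (lambda * S - ln K) / ln 2, E = search_space * 2**(-bits).
# The E-value underflows a float easily, so work in log10.
bit_score = (lam * raw_score - math.log(K)) / math.log(2)
log10_evalue = math.log10(search_space) - bit_score * math.log10(2)

print(f"effective query length:    {eff_query}")
print(f"effective database length: {eff_db}")
print(f"effective search space:    {search_space}")
print(f"approximate bit score:     {bit_score:.1f}")
print(f"approximate log10(E):      {log10_evalue:.1f}")
```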

Align candidate WP_013091977.1 BC1002_RS20860 (homogentisate 1,2-dioxygenase)
to HMM TIGR01015 (hmgA: homogentisate 1,2-dioxygenase (EC 1.13.11.5))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR01015.hmm
# target sequence database:        /tmp/gapView.14406.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR01015  [M=429]
Accession:   TIGR01015
Description: hmgA: homogentisate 1,2-dioxygenase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                                 Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                                 -----------
     3e-199  648.1   0.0   3.4e-199  647.9   0.0    1.0  1  lcl|NCBI__GCF_000092885.1:WP_013091977.1  BC1002_RS20860 homogentisate 1,2


Domain annotation for each sequence (and alignments):
>> lcl|NCBI__GCF_000092885.1:WP_013091977.1  BC1002_RS20860 homogentisate 1,2-dioxygenase
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  647.9   0.0  3.4e-199  3.4e-199       3     426 ..      11     430 ..       9     433 .. 0.99

  Alignments for each domain:
  == domain 1  score: 647.9 bits;  conditional E-value: 3.4e-199
                                 TIGR01015   3 kylsGfgnefeseavpgalPvGqnsPqkapyglyaeqlsGsaftaPraenkrswlyrirPsaaheafee 71 
                                               +y+sGfgn+f++ea+pgalP+G+nsPq+ap+gl+ae lsG+aftaPraen+r+w+yri+Psa+h a+ +
  lcl|NCBI__GCF_000092885.1:WP_013091977.1  11 TYMSGFGNQFSTEAIPGALPEGRNSPQRAPHGLFAELLSGAAFTAPRAENRRTWMYRILPSAMHGAYRK 79 
                                               7****************************************************************9999 PP

                                 TIGR01015  72 lkeseekltanfkeeasdpnqlrwspleipseeavdfveglvtlagagdaksraGlavhlyavnasmed 140
                                               ++ +  + + +f e+++  n++rw+p ++p+   +dfv+g vt+ag+g +++++G+++h+y +nasm +
  lcl|NCBI__GCF_000092885.1:WP_013091977.1  80 IE-QPAWESGPFGEVDTPANRFRWDPWPAPT-LPTDFVDGTVTIAGNGSPEAQSGMTAHMYRANASMAN 146
                                               98.699************************7.99*********************************** PP

                                 TIGR01015 141 evfynadGdllivpqkGaleittelGrlkvePneiaviprGvrfrveve.eearGyilevygakfqlPd 208
                                               +++ nadG++livpq G l+i+telG+l+++P+e+av+prG++f+v+++ +ea Gyi e+yga f+lP+
  lcl|NCBI__GCF_000092885.1:WP_013091977.1 147 RYLLNADGEMLIVPQLGRLTIRTELGVLDLAPGEVAVLPRGLHFAVDLDdGEASGYIAENYGAPFRLPE 215
                                               ************************************************9899***************** PP

                                 TIGR01015 209 lGPiGanglanprdfeaPvaafedkevkdevrviskfqgklfaakqdhspldvvawhGnyvPykydlkk 277
                                               lGPiG+nglan rdf +Pvaa+ed +   +v++++kf gk++  +q+hspl+vvawhGn+vPykydl +
  lcl|NCBI__GCF_000092885.1:WP_013091977.1 216 LGPIGSNGLANHRDFLTPVAAYEDGQR--NVKIVRKFLGKFWEGTQNHSPLNVVAWHGNLVPYKYDLAR 282
                                               ************************998..89************************************** PP

                                 TIGR01015 278 fnvinsvsfdhpdPsiftvltapsdkeGtaiadfvifpPrwlvaektfrPPyyhrnvmsefmGlikGky 346
                                               f++i++vsfdhpdPsi+tvlt+ps+ +Gta++dfvifpPrwlvae+tfrPP++hrn+mse mGl++G y
  lcl|NCBI__GCF_000092885.1:WP_013091977.1 283 FMAIGTVSFDHPDPSIYTVLTSPSTVPGTANCDFVIFPPRWLVAEDTFRPPWFHRNLMSELMGLVYGVY 351
                                               ********************************************************************* PP

                                 TIGR01015 347 dakeeGfvpgGaslhnimsahGPdveafekasnaelkPekiddgtlafmfesslslavtklakelekld 415
                                               dak+eGfvpgG slhn+m +hGPd+++f+kas +el P+ki++ tlafmfess ++++t+ + e+ + +
  lcl|NCBI__GCF_000092885.1:WP_013091977.1 352 DAKAEGFVPGGVSLHNCMMPHGPDAATFDKASSTELIPHKIAN-TLAFMFESSRVFKLTRSSLEAVNRQ 419
                                               *******************************************.************************* PP

                                 TIGR01015 416 edyeevwqglk 426
                                               + y++vw+g+ 
  lcl|NCBI__GCF_000092885.1:WP_013091977.1 420 TGYDAVWDGFT 430
                                               ********986 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (429 nodes)
Target sequences:                          1  (434 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.01u 0.00s 00:00:00.01 Elapsed: 00:00:00.01
# Mc/sec: 10.87
//
[ok]
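
The About section below mentions coverage for ublast hits; an analogous figure for the HMM alignment can be read off the domain table in the hmmsearch output above. The following sketch parses that one table row by whitespace splitting. The field positions are an assumption based on the column header shown in this report, and the coverage definitions (aligned span over model length or sequence length) are illustrative rather than GapMind's exact rule.

```python
# The single domain row from the hmmsearch report above.
domain_row = (
    "   1 !  647.9   0.0  3.4e-199  3.4e-199       3     426 .."
    "      11     430 ..       9     433 .. 0.99"
)
model_len = 429   # "Query: TIGR01015 [M=429]"
seq_len = 434     # "(434 residues searched)"

fields = domain_row.split()
# Assumed layout after splitting:
# 0:#  1:!  2:score  3:bias  4:c-Evalue  5:i-Evalue
# 6:hmmfrom  7:hmm to  8:..  9:alifrom  10:ali to  11:..
# 12:envfrom  13:env to  14:..  15:acc
score = float(fields[2])
hmm_from, hmm_to = int(fields[6]), int(fields[7])
ali_from, ali_to = int(fields[9]), int(fields[10])

# Assumed definition: aligned span divided by total length.
model_coverage = (hmm_to - hmm_from + 1) / model_len
sequence_coverage = (ali_to - ali_from + 1) / seq_len

print(f"domain score:      {score}")
print(f"model coverage:    {model_coverage:.1%}")
print(f"sequence coverage: {sequence_coverage:.1%}")
```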

This GapMind analysis is from Sep 24 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory