GapMind for catabolism of small carbon sources

 

Alignments for a candidate for hutU in Shewanella loihica PV-4

Align Urocanate hydratase (EC 4.2.1.49) (characterized)
to candidate 5211345 Shew_3757 urocanate hydratase (RefSeq)

Query= reanno::pseudo6_N2E2:Pf6N2E2_3805
         (562 letters)



>FitnessBrowser__PV4:5211345
          Length = 556

 Score =  877 bits (2267), Expect = 0.0
 Identities = 423/544 (77%), Positives = 468/544 (86%)

Query: 16  IRAPRGNTLTAKSWLTEAPLRMLMNNLDPEVAENPKELVVYGGIGRAARNWECYDKIVES 75
           I AP G TL+ KSWLTEAP+RMLMNNL P+VAE P++LVVYGGIGRAAR+W+CYDKIVE 
Sbjct: 11  IIAPHGTTLSCKSWLTEAPMRMLMNNLHPDVAERPEDLVVYGGIGRAARDWQCYDKIVEV 70

Query: 76  LTNLNDDETLLVQSGKPVGVFKTHSNAPRVLIANSNLVPHWASWEHFNELDAKGLAMYGQ 135
           L  L +DETLLVQSGKPVGVFKTHSNAPRV+IANSNLVPHWA+WEHFNELD KGLAMYGQ
Sbjct: 71  LQRLEEDETLLVQSGKPVGVFKTHSNAPRVIIANSNLVPHWANWEHFNELDKKGLAMYGQ 130

Query: 136 MTAGSWIYIGSQGIVQGTYETFVEAGRQHYDSNLKGRWVLTAGLGGMGGAQPLAATLAGA 195
           MTAGSWIYIGSQGIVQGTYETFV   +QH+  +  G+W+LT GLGGMGGAQPLA T+AG 
Sbjct: 131 MTAGSWIYIGSQGIVQGTYETFVAMAKQHFGGDASGKWILTGGLGGMGGAQPLAGTMAGY 190

Query: 196 CSLNIECQQVSIDFRLKTRYVDEQATDLDDALARIEKYTAEGKAISIALCGNAAEILPEM 255
             L  E  +  IDFRL+TRYVD++AT LD+ALA I++    GK +S+ L  NAA+I  E+
Sbjct: 191 SVLACEVDETRIDFRLRTRYVDKKATSLDEALAMIDEANKSGKPVSVGLLANAADIFAEL 250

Query: 256 VRRGVRPDMVTDQTSAHDPLNGYLPAGWTWDEYRARAKTEPAAVVKAAKQSMAIHVKAML 315
           V RG+ PD+VTDQTSAHDPLNGYLP GWT +      K + AAVVKAAKQSMA+ VKAML
Sbjct: 251 VERGITPDVVTDQTSAHDPLNGYLPQGWTLEYAAEMRKQDEAAVVKAAKQSMAVQVKAML 310

Query: 316 AFQKMGVPTFDYGNNIRQMAQEEGVENAFDFPGFVPAYIRPLFCRGIGPFRWAALSGDPQ 375
           A Q  G  T DYGNNIRQMA EEGVENAFDFPGFVPAY+RPLFC GIGPFRWAALSGDP+
Sbjct: 311 ALQAAGAATTDYGNNIRQMAFEEGVENAFDFPGFVPAYVRPLFCEGIGPFRWAALSGDPE 370

Query: 376 DIYKTDAKVKELIPDDAHLHNWLDMARERISFQGLPARICWVGLGQRAKLGLAFNEMVRS 435
           DIYKTDAKVKELIPD+ HLHNWLDMARERI+FQGLPARICWVGL  RA+L  AFNEMV++
Sbjct: 371 DIYKTDAKVKELIPDNPHLHNWLDMARERIAFQGLPARICWVGLKDRARLAKAFNEMVKN 430

Query: 436 GELSAPIVIGRDHLDSGSVASPNRETESMQDGSDAVSDWPLLNALLNTASGATWVSLHHG 495
           GELSAPIVIGRDHLDSGSVASPNRETESM DGSDAVSDWPL+NALLNTASGATWVSLHHG
Sbjct: 431 GELSAPIVIGRDHLDSGSVASPNRETESMLDGSDAVSDWPLMNALLNTASGATWVSLHHG 490

Query: 496 GGVGMGFSQHSGMVIVCDGTDEAAERIARVLHNDPGTGVMRHADAGYQIAIDCAKEQGLN 555
           GGVGMGFSQHSG+VIV DGTDEA  R+ RVL NDP TGVMRHADAGY+IA  CAKEQGL+
Sbjct: 491 GGVGMGFSQHSGVVIVADGTDEAEARLGRVLWNDPATGVMRHADAGYEIAKQCAKEQGLD 550

Query: 556 LPMI 559
           LPM+
Sbjct: 551 LPML 554


Lambda     K      H
   0.318    0.134    0.412 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 1107
Number of extensions: 51
Number of successful extensions: 1
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 562
Length of database: 556
Length adjustment: 36
Effective length of query: 526
Effective length of database: 520
Effective search space:   273520
Effective search space used:   273520
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 53 (25.0 bits)
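
The summary figures in the BLAST report above can be recomputed from the values it prints. The following is a minimal sketch, not part of GapMind itself; the function names are hypothetical and chosen for illustration. It assumes the report truncates percentages to whole numbers (423/544 is about 77.8%, printed as 77%) and that the bit score follows the standard Karlin-Altschul conversion using the gapped Lambda and K. Because those two parameters are printed in rounded form, the recomputed bit score only approximates the reported 877 bits.

```python
import math


def truncated_percent(count, aligned_columns):
    """Whole-number percentage, truncated (assumed to match the report)."""
    return count * 100 // aligned_columns


def coverage(first, last, sequence_length):
    """Fraction of a sequence spanned by the alignment (1-based, inclusive)."""
    return (last - first + 1) / sequence_length


def effective_search_space(query_length, db_length, length_adjustment):
    """Product of the effective query and database lengths."""
    return (query_length - length_adjustment) * (db_length - length_adjustment)


def bit_score(raw_score, lam, k):
    """Karlin-Altschul conversion of a raw score to bits."""
    return (lam * raw_score - math.log(k)) / math.log(2)


if __name__ == "__main__":
    # All numbers below are copied from the report above.
    print(truncated_percent(423, 544))            # identities
    print(truncated_percent(468, 544))            # positives
    print(coverage(16, 559, 562))                 # query coverage
    print(coverage(11, 554, 556))                 # candidate coverage
    print(effective_search_space(562, 556, 36))   # 273520, as reported
    print(bit_score(2267, 0.267, 0.0410))         # close to the reported 877 bits
```

The alignment spans query positions 16 to 559 and candidate positions 11 to 554, so it covers nearly the full length of both proteins.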

Align candidate 5211345 Shew_3757 (urocanate hydratase (RefSeq))
to HMM TIGR01228 (hutU: urocanate hydratase (EC 4.2.1.49))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR01228.hmm
# target sequence database:        /tmp/gapView.27002.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR01228  [M=545]
Accession:   TIGR01228
Description: hutU: urocanate hydratase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                        Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                        -----------
   5.2e-297  971.7   1.2     6e-297  971.5   1.2    1.0  1  lcl|FitnessBrowser__PV4:5211345  Shew_3757 urocanate hydratase (R


Domain annotation for each sequence (and alignments):
>> lcl|FitnessBrowser__PV4:5211345  Shew_3757 urocanate hydratase (RefSeq)
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  971.5   1.2    6e-297    6e-297       2     545 .]      10     553 ..       9     553 .. 1.00

  Alignments for each domain:
  == domain 1  score: 971.5 bits;  conditional E-value: 6e-297
                        TIGR01228   2 eiraprGkeleakgweqeaalrllmnnldpevaedpeelvvyGGkGkaarnweafdkiveelkrleddetllvqsGkp 79 
                                       i ap+G++l++k+w +ea++r+lmnnl p+vae pe+lvvyGG+G+aar+w+++dkive l+rle+detllvqsGkp
  lcl|FitnessBrowser__PV4:5211345  10 RIIAPHGTTLSCKSWLTEAPMRMLMNNLHPDVAERPEDLVVYGGIGRAARDWQCYDKIVEVLQRLEEDETLLVQSGKP 87 
                                      578*************************************************************************** PP

                        TIGR01228  80 vgvfkthekaprvliansnlvpkwadwekfeeleakGlimyGqmtaGswiyiGtqGilqGtyetlaelarkhfggslk 157
                                      vgvfkth++aprv+iansnlvp+wa+we+f+el++kGl+myGqmtaGswiyiG+qGi+qGtyet++++a++hfgg+ +
  lcl|FitnessBrowser__PV4:5211345  88 VGVFKTHSNAPRVIIANSNLVPHWANWEHFNELDKKGLAMYGQMTAGSWIYIGSQGIVQGTYETFVAMAKQHFGGDAS 165
                                      ****************************************************************************** PP

                        TIGR01228 158 gklvltaGlGgmGGaqplavtlneavsiavevdeeridkrletkyldektddldealaraeeakaeGkalsigllGna 235
                                      gk++lt GlGgmGGaqpla t+++ +++a evde+rid+rl+t+y+d+k+++ldeala+++ea++ Gk++s+gll na
  lcl|FitnessBrowser__PV4:5211345 166 GKWILTGGLGGMGGAQPLAGTMAGYSVLACEVDETRIDFRLRTRYVDKKATSLDEALAMIDEANKSGKPVSVGLLANA 243
                                      ****************************************************************************** PP

                        TIGR01228 236 aevleellergvvpdvvtdqtsahdellGyipegytvedadklrdeepeeyvkaakaslakhvrallalqkkGavtfd 313
                                      a+++ el+erg++pdvvtdqtsahd+l+Gy+p+g+t+e a+++r+++++++vkaak+s+a++v+a+lalq +Ga t d
  lcl|FitnessBrowser__PV4:5211345 244 ADIFAELVERGITPDVVTDQTSAHDPLNGYLPQGWTLEYAAEMRKQDEAAVVKAAKQSMAVQVKAMLALQAAGAATTD 321
                                      ****************************************************************************** PP

                        TIGR01228 314 yGnnirqvakeeGvedafdfpGfvpayirdlfceGkGpfrwvalsGdpadiyrtdkavkelfpedeelhrwidlakek 391
                                      yGnnirq+a+eeGve+afdfpGfvpay+r+lfceG Gpfrw+alsGdp+diy+td++vkel+p++ +lh+w+d+a+e+
  lcl|FitnessBrowser__PV4:5211345 322 YGNNIRQMAFEEGVENAFDFPGFVPAYVRPLFCEGIGPFRWAALSGDPEDIYKTDAKVKELIPDNPHLHNWLDMARER 399
                                      ****************************************************************************** PP

                        TIGR01228 392 vafqGlparicwlgygereklalainelvrsGelkapvvigrdhldaGsvaspnreteamkdGsdavadwpllnalln 469
                                      +afqGlparicw+g+++r++la+a+ne+v++Gel+ap+vigrdhld+Gsvaspnrete+m dGsdav+dwpl+nalln
  lcl|FitnessBrowser__PV4:5211345 400 IAFQGLPARICWVGLKDRARLAKAFNEMVKNGELSAPIVIGRDHLDSGSVASPNRETESMLDGSDAVSDWPLMNALLN 477
                                      ****************************************************************************** PP

                        TIGR01228 470 taaGaswvslhhGGGvglGfslhaglvivadGtdeaaerlkrvltadpGlGvirhadaGyesaldvakeqgldlpm 545
                                      ta+Ga+wvslhhGGGvg+Gfs+h+g+vivadGtdea+ rl rvl +dp +Gv+rhadaGye a ++akeqgldlpm
  lcl|FitnessBrowser__PV4:5211345 478 TASGATWVSLHHGGGVGMGFSQHSGVVIVADGTDEAEARLGRVLWNDPATGVMRHADAGYEIAKQCAKEQGLDLPM 553
                                      ***************************************************************************8 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (545 nodes)
Target sequences:                          1  (556 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.02u 0.01s 00:00:00.03 Elapsed: 00:00:00.03
# Mc/sec: 8.74
//
[ok]
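
The per-domain row in the hmmsearch output above carries the coordinates needed to judge how much of the model and of the candidate the hit covers. The sketch below parses that row by whitespace; the field names and helper functions are hypothetical, and the column order is taken from the header line printed in the report. The model length (545) comes from the "[M=545]" line.

```python
# Domain row copied verbatim from the hmmsearch output above.
DOMAIN_ROW = (
    "   1 !  971.5   1.2    6e-297    6e-297       2     545 .]"
    "      10     553 ..       9     553 .. 1.00"
)


def parse_domain_row(row):
    """Split one per-domain row into named fields (names are illustrative)."""
    f = row.split()
    return {
        "domain": int(f[0]),
        "passes_threshold": f[1] == "!",
        "score": float(f[2]),
        "bias": float(f[3]),
        "c_evalue": float(f[4]),
        "i_evalue": float(f[5]),
        "hmm_from": int(f[6]),
        "hmm_to": int(f[7]),
        "ali_from": int(f[9]),
        "ali_to": int(f[10]),
        "env_from": int(f[12]),
        "env_to": int(f[13]),
        "acc": float(f[15]),
    }


def hmm_coverage(domain, model_length):
    """Fraction of the model's match states spanned by the domain."""
    return (domain["hmm_to"] - domain["hmm_from"] + 1) / model_length


if __name__ == "__main__":
    dom = parse_domain_row(DOMAIN_ROW)
    print(dom["score"], dom["hmm_from"], dom["hmm_to"])
    print(hmm_coverage(dom, 545))
```

Here the domain spans model positions 2 to 545 and candidate residues 10 to 553, so 544 of the 545 model positions are aligned.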

This GapMind analysis is from Sep 17 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMER with enzyme models (usually from TIGRFAMs). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
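
To illustrate what searching the six-frame translation means, the sketch below translates a DNA sequence in all three reading frames on each strand. It is a generic illustration, not the code used by GapMind or Curated BLAST; the function names are hypothetical, the standard genetic code is assumed, and codons containing characters other than A, C, G, or T are translated as "X".

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Reverse complement of a DNA string."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons from position 0; '*' marks a stop codon."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translation(dna):
    """Return translations keyed by frame: +1, +2, +3, -1, -2, -3."""
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


if __name__ == "__main__":
    for frame, protein in six_frame_translation("ATGGCCTAA").items():
        print(frame, protein)
```

A protein search against these translations can find genes that were missed or truncated during gene prediction, which is why a step that appears to be a gap may still be encoded in the genome.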

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory