GapMind for catabolism of small carbon sources

 

Alignments for a candidate for hutU in Rhizobium johnstonii 3841

Align Urocanate hydratase; Urocanase; Imidazolonepropionate hydrolase; EC 4.2.1.49 (characterized)
to candidate WP_011650166.1 RL_RS01900 urocanate hydratase

Query= SwissProt::Q5L084
         (551 letters)



>NCBI__GCF_000009265.1:WP_011650166.1
          Length = 553

 Score =  574 bits (1479), Expect = e-168
 Identities = 283/542 (52%), Positives = 371/542 (68%), Gaps = 4/542 (0%)

Query: 10  PAGTERRAKGWIQEAALRMLNNNLHPDVAERPDELIVYGGIGKAARNWECYEAIVDTLLR 69
           P G E RAKGW QEA LR+L N L   V E PD LIVY  +GKAARNW  +  IV  L  
Sbjct: 14  PGGPELRAKGWRQEALLRLLENVL--SVGEDPDNLIVYAALGKAARNWAAHRGIVKALTE 71

Query: 70  LENDETLLIQSGKPVAVFRTHPDAPRVLIANSNLVPAWATWDHFHELDKKGLIMYGQMTA 129
           +E D+TLLIQSGKP+ + RTH  AP V++AN N+V  WA  + F+EL +KGLI +G +TA
Sbjct: 72  MEEDQTLLIQSGKPIGLVRTHAKAPLVIMANCNIVGQWAKAEVFYELQRKGLICWGGLTA 131

Query: 130 GSWIYIGSQGIVQGTYETFAEVARQHFGGTLAGTITLTAGLGGMGGAQPLAVTMNGGVCL 189
           G+W YIGSQG++QGTYE F  +A + FGG L G   LTAGLGGMGGAQPLA  M G   L
Sbjct: 132 GAWQYIGSQGVIQGTYEIFMRIAERRFGGDLLGRFVLTAGLGGMGGAQPLAGRMAGAAIL 191

Query: 190 AIEVDPARIQRRIDTNYLDTMTDSLDAALEMAKQAKEEKKALSIGLVGNAAEVLPRLVEM 249
            +++DP R ++R    YL  +   LD+AL+M   A ++K+ALS+GLVGNAAEV P +   
Sbjct: 192 CVDIDPERARKRQQIGYLQEIAPDLDSALQMIDAAVKDKRALSVGLVGNAAEVYPEIARR 251

Query: 250 GFVPDVLTDQTSAHDPLNGYIPAGLTLDEAAELRARDPKQYIARAKQSIAAHVRAMLAMQ 309
           G VPD++TDQTSAHD + GY+P G+ LD+   LR     Q +A ++ SI  HV AML  Q
Sbjct: 252 GIVPDIVTDQTSAHDLVYGYVPKGMNLDQVKGLRDDGQGQLMAASRASIVEHVTAMLEFQ 311

Query: 310 KQGAVTFDYGNNIRQVAKDEGVDDAFSFPGFVPAYIRPLFCEGKGPFRWVALSGDPEDIY 369
           K+G+  FD GN IR  AK+ GV +AF  P F  AY+RPLF    GPFRW+ALSG+  DI 
Sbjct: 312 KKGSEVFDNGNLIRTQAKEGGVANAFDIPIFTEAYLRPLFARAIGPFRWMALSGEESDIA 371

Query: 370 KTDEVILREFSDNERLCHWIRMAQKRIKFQGLPARICWLGYGERAKFGKIINDMVAKGEL 429
           + D+++L  F DN+ + +WIR+A++ + F+GLPARI WLG+GER    + +N +VA GEL
Sbjct: 372 RIDDLLLEMFPDNKIITNWIRLAREHVPFEGLPARIAWLGHGERTALARRVNALVASGEL 431

Query: 430 KAPIVIGRDHLDSGSVASPNRETEGMKDGSDAIADWPILNALLNAVGGASWVSVHHGGGV 489
           K P+   RDHLD+G++A PN  TEGMKDGSDAIADWP+++A++     A  V +H GGG 
Sbjct: 432 KGPVAFSRDHLDAGAMAHPNIMTEGMKDGSDAIADWPLIDAMMLCSSMADLVVIHSGGGG 491

Query: 490 GMGYSIHAGMVIVADGTKEAEKRLERVLTTDPGLGVVRHADAGYELAIRTAKEKGIDMPM 549
             GY    G+ +VADGT +A++RL+  LT D  LGV+R+ADAGYE A+    +K  D+P 
Sbjct: 492 YAGYMTSCGVTVVADGTDDADERLDHALTNDTALGVMRYADAGYEEALDEVAKK--DVPY 549

Query: 550 LK 551
           ++
Sbjct: 550 IR 551


Lambda     K      H
   0.319    0.136    0.411 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 776
Number of extensions: 27
Number of successful extensions: 2
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 551
Length of database: 553
Length adjustment: 36
Effective length of query: 515
Effective length of database: 517
Effective search space:   266255
Effective search space used:   266255
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 53 (25.0 bits)
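
The bit score and E-value on the Score line can be recomputed from the statistics listed above. The following is a minimal sketch, assuming the standard Karlin-Altschul relations (bit score = (lambda * raw - ln K) / ln 2, and E = search space * 2^-bits) applied with the gapped Lambda and K. The constant and function names are illustrative and are not part of GapMind or BLAST.

```python
import math

# Values copied from the BLAST report above.
RAW_SCORE = 1479          # raw score, shown in parentheses on the Score line
GAPPED_LAMBDA = 0.267     # "Gapped" Lambda
GAPPED_K = 0.0410         # "Gapped" K
QUERY_LENGTH = 551        # "Length of query"
DATABASE_LENGTH = 553     # "Length of database"
LENGTH_ADJUSTMENT = 36    # "Length adjustment"


def effective_search_space(query_len, db_len, adjustment):
    """Product of the two lengths after subtracting the length adjustment."""
    return (query_len - adjustment) * (db_len - adjustment)


def bit_score(raw_score, lam, k):
    """Convert a raw alignment score to bits (assumed Karlin-Altschul form)."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def expect_value(bits, search_space):
    """Expected number of chance hits scoring at least `bits`."""
    return search_space * 2.0 ** (-bits)


space = effective_search_space(QUERY_LENGTH, DATABASE_LENGTH, LENGTH_ADJUSTMENT)
bits = bit_score(RAW_SCORE, GAPPED_LAMBDA, GAPPED_K)
evalue = expect_value(bits, space)

print(f"effective search space: {space}")
print(f"bit score: {bits:.1f}")
print(f"E-value: {evalue:.1e}")
```

With the rounded Lambda and K printed in the report, the result should agree with the reported 574 bits and an E-value on the order of e-168. Small differences in the last digit are expected because the report rounds its parameters.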

Align candidate WP_011650166.1 RL_RS01900 (urocanate hydratase)
to HMM TIGR01228 (hutU: urocanate hydratase (EC 4.2.1.49))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR01228.hmm
# target sequence database:        /tmp/gapView.1213420.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR01228  [M=545]
Accession:   TIGR01228
Description: hutU: urocanate hydratase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                             Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                             -----------
   2.7e-213  695.4   0.9   3.5e-213  695.0   0.9    1.0  1  NCBI__GCF_000009265.1:WP_011650166.1  


Domain annotation for each sequence (and alignments):
>> NCBI__GCF_000009265.1:WP_011650166.1  
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  695.0   0.9  3.5e-213  3.5e-213       6     541 ..      14     547 ..      10     550 .. 0.99

  Alignments for each domain:
  == domain 1  score: 695.0 bits;  conditional E-value: 3.5e-213
                             TIGR01228   6 prGkeleakgweqeaalrllmnnldpevaedpeelvvyGGkGkaarnweafdkiveelkrleddetllvqsGk 78 
                                           p G el akgw+qea lrll+n l   v edp++l+vy   Gkaarnw a+  iv+ l+ +e+d+tll+qsGk
  NCBI__GCF_000009265.1:WP_011650166.1  14 PGGPELRAKGWRQEALLRLLENVL--SVGEDPDNLIVYAALGKAARNWAAHRGIVKALTEMEEDQTLLIQSGK 84 
                                           7899*******************9..69********************************************* PP

                             TIGR01228  79 pvgvfkthekaprvliansnlvpkwadwekfeeleakGlimyGqmtaGswiyiGtqGilqGtyetlaelarkh 151
                                           p+g+++th kap v++an+n+v++wa+ e+f el++kGli +G +taG+w yiG+qG++qGtye +  +a ++
  NCBI__GCF_000009265.1:WP_011650166.1  85 PIGLVRTHAKAPLVIMANCNIVGQWAKAEVFYELQRKGLICWGGLTAGAWQYIGSQGVIQGTYEIFMRIAERR 157
                                           ************************************************************************* PP

                             TIGR01228 152 fggslkgklvltaGlGgmGGaqplavtlneavsiavevdeeridkrletkyldektddldealaraeeakaeG 224
                                           fgg+l g++vltaGlGgmGGaqpla ++++a+ + v++d+er +kr +  yl+e + dld al+++++a ++ 
  NCBI__GCF_000009265.1:WP_011650166.1 158 FGGDLLGRFVLTAGLGGMGGAQPLAGRMAGAAILCVDIDPERARKRQQIGYLQEIAPDLDSALQMIDAAVKDK 230
                                           ************************************************************************* PP

                             TIGR01228 225 kalsigllGnaaevleellergvvpdvvtdqtsahdellGyipegytvedadklrdeepeeyvkaakaslakh 297
                                           +als+gl Gnaaev++e+ +rg+vpd+vtdqtsahd + Gy+p+g+ +++ + lrd+ + + + a++as+++h
  NCBI__GCF_000009265.1:WP_011650166.1 231 RALSVGLVGNAAEVYPEIARRGIVPDIVTDQTSAHDLVYGYVPKGMNLDQVKGLRDDGQGQLMAASRASIVEH 303
                                           ************************************************************************* PP

                             TIGR01228 298 vrallalqkkGavtfdyGnnirqvakeeGvedafdfpGfvpayirdlfceGkGpfrwvalsGdpadiyrtdka 370
                                           v a+l++qkkG+ +fd Gn ir++ake Gv++afd+p f  ay+r+lf++  Gpfrw+alsG+  di r+d+ 
  NCBI__GCF_000009265.1:WP_011650166.1 304 VTAMLEFQKKGSEVFDNGNLIRTQAKEGGVANAFDIPIFTEAYLRPLFARAIGPFRWMALSGEESDIARIDDL 376
                                           ************************************************************************* PP

                             TIGR01228 371 vkelfpedeelhrwidlakekvafqGlparicwlgygereklalainelvrsGelkapvvigrdhldaGsvas 443
                                           ++e+fp+++ +++wi la+e+v f+Glpari wlg+ger  la ++n lv sGelk pv+ +rdhldaG++a 
  NCBI__GCF_000009265.1:WP_011650166.1 377 LLEMFPDNKIITNWIRLAREHVPFEGLPARIAWLGHGERTALARRVNALVASGELKGPVAFSRDHLDAGAMAH 449
                                           ************************************************************************* PP

                             TIGR01228 444 pnreteamkdGsdavadwpllnallntaaGaswvslhhGGGvglGfslhaglvivadGtdeaaerlkrvltad 516
                                           pn  te mkdGsda+adwpl++a++   + a++v +h GGG   G+   +g+ +vadGtd+a+erl  +lt+d
  NCBI__GCF_000009265.1:WP_011650166.1 450 PNIMTEGMKDGSDAIADWPLIDAMMLCSSMADLVVIHSGGGGYAGYMTSCGVTVVADGTDDADERLDHALTND 522
                                           ************************************************************************* PP

                             TIGR01228 517 pGlGvirhadaGyesaldvakeqgl 541
                                             lGv+r+adaGye+ald++ ++++
  NCBI__GCF_000009265.1:WP_011650166.1 523 TALGVMRYADAGYEEALDEVAKKDV 547
                                           ******************9999887 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (545 nodes)
Target sequences:                          1  (553 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.00u 0.00s 00:00:00.00 Elapsed: 00:00:00.00
# Mc/sec: 38.43
//
[ok]
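
The domain table row above gives the matched range on the model (hmmfrom, hmm to) and on the candidate (alifrom, ali to). As a rough sketch, the fraction of the model covered by the hit can be derived from that row and the model length M=545. The parsing below assumes the whitespace-separated column order shown in this report. The function names are hypothetical helpers, not part of HMMER.

```python
MODEL_LENGTH = 545  # from "[M=545]" on the Query line

# The single domain row, copied from the report above.
DOMAIN_LINE = (
    "   1 !  695.0   0.9  3.5e-213  3.5e-213       6     541 .."
    "      14     547 ..      10     550 .. 0.99"
)


def parse_domain_line(line):
    """Split one hmmsearch domain row into named fields.

    Assumed column order: #, flag, score, bias, c-Evalue, i-Evalue,
    hmmfrom, hmmto, '..', alifrom, alito, '..', envfrom, envto, '..', acc.
    """
    fields = line.split()
    return {
        "score": float(fields[2]),
        "i_evalue": float(fields[5]),
        "hmm_from": int(fields[6]),
        "hmm_to": int(fields[7]),
        "ali_from": int(fields[9]),
        "ali_to": int(fields[10]),
    }


def model_coverage(domain, model_length):
    """Fraction of model positions spanned by the hit (inclusive range)."""
    return (domain["hmm_to"] - domain["hmm_from"] + 1) / model_length


domain = parse_domain_line(DOMAIN_LINE)
coverage = model_coverage(domain, MODEL_LENGTH)
print(f"model positions {domain['hmm_from']}-{domain['hmm_to']} "
      f"of {MODEL_LENGTH}: {coverage:.1%} coverage")
```

For this hit the matched range is 6 to 541, that is 536 of 545 model positions.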

This GapMind analysis is from Apr 09 2024. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."
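
As an illustration of the 50% coverage floor, the sketch below computes coverage for the alignment shown earlier on this page. It assumes that coverage means the aligned fraction of the characterized protein. That definition and the function names are assumptions made for this example. The check covers only the minimum floor and says nothing about whether a hit qualifies as high or medium confidence.

```python
LOW_CONFIDENCE_MIN_COVERAGE = 0.5  # "at least 50% coverage"


def alignment_coverage(align_from, align_to, protein_length):
    """Fraction of the protein spanned by the alignment (inclusive range)."""
    return (align_to - align_from + 1) / protein_length


def meets_coverage_floor(align_from, align_to, protein_length):
    """True if the hit spans at least half of the characterized protein."""
    coverage = alignment_coverage(align_from, align_to, protein_length)
    return coverage >= LOW_CONFIDENCE_MIN_COVERAGE


# Alignment from the report above: positions 10 to 551 of the
# 551-residue characterized protein (SwissProt::Q5L084).
coverage = alignment_coverage(10, 551, 551)
print(f"coverage: {coverage:.1%}")
print("meets 50% floor:", meets_coverage_floor(10, 551, 551))
```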

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
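
The definition of a gap given above (a step with no high- or medium-confidence candidate) can be sketched as a simple filter. The data structure, the function name, and the step names other than hutU are hypothetical and chosen only for illustration. They do not reflect GapMind's actual implementation or the results for this genome.

```python
def find_gaps(candidates_by_step):
    """Return the steps that have no high- or medium-confidence candidate.

    `candidates_by_step` maps a step name to the list of confidence
    levels ("high", "medium", or "low") of its candidates.
    """
    return [
        step
        for step, levels in candidates_by_step.items()
        if not any(level in ("high", "medium") for level in levels)
    ]


# Hypothetical input: "stepA" and "stepB" are made-up step names, and the
# confidence levels are invented for the example.
example = {
    "hutU": ["high"],
    "stepA": ["low", "low"],  # only low-confidence candidates: a gap
    "stepB": [],              # no candidates at all: a gap
}

print(find_gaps(example))
```

A step with only low-confidence candidates counts as a gap in the same way as a step with no candidates.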

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory