Align Urocanate hydratase (EC 4.2.1.49) (characterized)
to candidate WP_036139592.1 N800_RS13305 urocanate hydratase
Query= reanno::pseudo6_N2E2:Pf6N2E2_3805
         (562 letters)

>NCBI__GCF_000768355.1:WP_036139592.1
          Length = 554

 Score =  914 bits (2361), Expect = 0.0
 Identities = 438/544 (80%), Positives = 485/544 (89%)

Query: 16  IRAPRGNTLTAKSWLTEAPLRMLMNNLDPEVAENPKELVVYGGIGRAARNWECYDKIVES 75
           IRAPRG   + KSWL+EA  RM+ NNLDP+VAENP ELVVYGGIGRAAR+WE YD I++S
Sbjct: 11  IRAPRGAEKSCKSWLSEAAYRMIQNNLDPDVAENPTELVVYGGIGRAARDWESYDAILKS 70

Query: 76  LTNLNDDETLLVQSGKPVGVFKTHSNAPRVLIANSNLVPHWASWEHFNELDAKGLAMYGQ 135
           L  L D+ETLLVQSGKPVGVF TH +APRVLIANSNLVPHWA+WEHFNELD KGL MYGQ
Sbjct: 71  LRELEDNETLLVQSGKPVGVFPTHPDAPRVLIANSNLVPHWATWEHFNELDKKGLMMYGQ 130

Query: 136 MTAGSWIYIGSQGIVQGTYETFVEAGRQHYDSNLKGRWVLTAGLGGMGGAQPLAATLAGA 195
           MTAGSWIYIGSQGIVQGTYETFVE GRQHY  +L G+W+LTAGLGGMGGAQPLAAT+AGA
Sbjct: 131 MTAGSWIYIGSQGIVQGTYETFVEMGRQHYGGDLTGKWILTAGLGGMGGAQPLAATMAGA 190

Query: 196 CSLNIECQQVSIDFRLKTRYVDEQATDLDDALARIEKYTAEGKAISIALCGNAAEILPEM 255
           C L IEC+Q SID RL+TRYVDEQATDLDDALARIEKYTA G+A SIAL GNAAEILPE+
Sbjct: 191 CMLAIECRQSSIDMRLRTRYVDEQATDLDDALARIEKYTAAGEAKSIALLGNAAEILPEL 250

Query: 256 VRRGVRPDMVTDQTSAHDPLNGYLPAGWTWDEYRARAKTEPAAVVKAAKQSMAIHVKAML 315
           VRRGVRPD VTDQTSAHDPL+GYLPAGWT +++  + KTEP  VV+AAK SM +HV+AML
Sbjct: 251 VRRGVRPDAVTDQTSAHDPLHGYLPAGWTLEQWLDKQKTEPQEVVEAAKHSMRVHVEAML 310

Query: 316 AFQKMGVPTFDYGNNIRQMAQEEGVENAFDFPGFVPAYIRPLFCRGIGPFRWAALSGDPQ 375
            F+ MGVPTFDYGNNIRQMA + G ++AFDFPGFVPAY+RPLFCRG+GPFRW ALSGDP+
Sbjct: 311 HFEDMGVPTFDYGNNIRQMAFDMGCKDAFDFPGFVPAYVRPLFCRGVGPFRWVALSGDPE 370

Query: 376 DIYKTDAKVKELIPDDAHLHNWLDMARERISFQGLPARICWVGLGQRAKLGLAFNEMVRS 435
           DI KTDAKVKELIP+D HLH WLDMA ERISFQGLPARICWVGLG R +LGLAFNEMVR+
Sbjct: 371 DIAKTDAKVKELIPNDPHLHKWLDMAHERISFQGLPARICWVGLGDRHRLGLAFNEMVRN 430

Query: 436 GELSAPIVIGRDHLDSGSVASPNRETESMQDGSDAVSDWPLLNALLNTASGATWVSLHHG 495
           GEL AP+VIGRDHLDSGSV+SPNRETESM DGSDAVSDWPLLNA+LN A GATWVSLHHG
Sbjct: 431 GELKAPVVIGRDHLDSGSVSSPNRETESMMDGSDAVSDWPLLNAMLNVAGGATWVSLHHG 490

Query: 496 GGVGMGFSQHSGMVIVCDGTDEAAERIARVLHNDPGTGVMRHADAGYQIAIDCAKEQGLN 555
           GGVGMG+SQHSG+VIVCDG++ A +R+ARVL NDPGTGVMRHADAGY+IA  CAKEQGL
Sbjct: 491 GGVGMGYSQHSGVVIVCDGSEAADQRLARVLWNDPGTGVMRHADAGYEIAKQCAKEQGLK 550

Query: 556 LPMI 559
           LPM+
Sbjct: 551 LPML 554

Lambda     K      H
   0.318    0.134    0.412

Gapped
Lambda     K      H
   0.267   0.0410    0.140

Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 1166
Number of extensions: 44
Number of successful extensions: 1
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 562
Length of database: 554
Length adjustment: 36
Effective length of query: 526
Effective length of database: 518
Effective search space: 272468
Effective search space used: 272468
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 53 (25.0 bits)
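The summary numbers in the BLAST report can be cross-checked against each other. The sketch below is only an illustration, not part of GapMind: it applies the standard Karlin-Altschul conversion from raw score to bits, (lambda * S - ln K) / ln 2, using the gapped Lambda and K printed in the report, and recomputes the effective search space from the reported lengths and length adjustment. The function names are hypothetical.

```python
import math

def bit_score(raw_score: int, lam: float, k: float) -> float:
    """Convert a raw alignment score to bits: (lambda * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)

def effective_search_space(query_len: int, db_len: int, adjustment: int) -> int:
    """Product of the query and database lengths after the length adjustment."""
    return (query_len - adjustment) * (db_len - adjustment)

# Values copied from the report (gapped Lambda and K, raw score 2361).
bits = bit_score(2361, lam=0.267, k=0.0410)
space = effective_search_space(562, 554, 36)

# Alignment-level fractions from the "Identities" and "Positives" fields.
identity = 438 / 544
positives = 485 / 544
query_coverage = (559 - 16 + 1) / 562  # aligned query positions 16..559

print(f"bits={bits:.0f} space={space}")
print(f"identity={identity:.1%} positives={positives:.1%} coverage={query_coverage:.1%}")
```

With the reported values this reproduces the 914 bits and the effective search space of 272468 shown in the report.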
Align candidate WP_036139592.1 N800_RS13305 (urocanate hydratase)
to HMM TIGR01228 (hutU: urocanate hydratase (EC 4.2.1.49))
# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR01228.hmm
# target sequence database:        /tmp/gapView.4089871.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR01228  [M=545]
Accession:   TIGR01228
Description: hutU: urocanate hydratase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                              Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                              -----------
   1.7e-298  976.6   0.1   1.9e-298  976.4   0.1    1.0  1  NCBI__GCF_000768355.1:WP_036139592.1

Domain annotation for each sequence (and alignments):
>> NCBI__GCF_000768355.1:WP_036139592.1
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  976.4   0.1  1.9e-298  1.9e-298       1     545 []       9     553 ..       9     553 .. 1.00

  Alignments for each domain:
  == domain 1  score: 976.4 bits;  conditional E-value: 1.9e-298
                             TIGR01228   1 keiraprGkeleakgweqeaalrllmnnldpevaedpeelvvyGGkGkaarnweafdkiveelkrleddetll 73
                                           ++iraprG e+++k+w  eaa r+++nnldp+vae+p elvvyGG+G+aar+we++d+i+++l+ led+etll
  NCBI__GCF_000768355.1:WP_036139592.1   9 RTIRAPRGAEKSCKSWLSEAAYRMIQNNLDPDVAENPTELVVYGGIGRAARDWESYDAILKSLRELEDNETLL 81
                                           689********************************************************************** PP

                             TIGR01228  74 vqsGkpvgvfkthekaprvliansnlvpkwadwekfeeleakGlimyGqmtaGswiyiGtqGilqGtyetlae 146
                                           vqsGkpvgvf th +aprvliansnlvp+wa+we+f+el++kGl+myGqmtaGswiyiG+qGi+qGtyet++e
  NCBI__GCF_000768355.1:WP_036139592.1  82 VQSGKPVGVFPTHPDAPRVLIANSNLVPHWATWEHFNELDKKGLMMYGQMTAGSWIYIGSQGIVQGTYETFVE 154
                                           ************************************************************************* PP

                             TIGR01228 147 larkhfggslkgklvltaGlGgmGGaqplavtlneavsiavevdeeridkrletkyldektddldealaraee 219
                                           ++r+h+gg+l gk++ltaGlGgmGGaqpla+t+++a+++a+e+ ++ id rl+t+y+de+++dld+alar+e+
  NCBI__GCF_000768355.1:WP_036139592.1 155 MGRQHYGGDLTGKWILTAGLGGMGGAQPLAATMAGACMLAIECRQSSIDMRLRTRYVDEQATDLDDALARIEK 227
                                           ************************************************************************* PP

                             TIGR01228 220 akaeGkalsigllGnaaevleellergvvpdvvtdqtsahdellGyipegytvedadklrdeepeeyvkaaka 292
                                           ++a+G+a+si+llGnaae+l+el++rgv+pd vtdqtsahd+l Gy+p+g+t+e+ + +++ep+e+v+aak+
  NCBI__GCF_000768355.1:WP_036139592.1 228 YTAAGEAKSIALLGNAAEILPELVRRGVRPDAVTDQTSAHDPLHGYLPAGWTLEQWLDKQKTEPQEVVEAAKH 300
                                           ************************************************************************* PP

                             TIGR01228 293 slakhvrallalqkkGavtfdyGnnirqvakeeGvedafdfpGfvpayirdlfceGkGpfrwvalsGdpadiy 365
                                           s+ +hv+a+l++++ G+ tfdyGnnirq+a++ G +dafdfpGfvpay+r+lfc+G GpfrwvalsGdp+di
  NCBI__GCF_000768355.1:WP_036139592.1 301 SMRVHVEAMLHFEDMGVPTFDYGNNIRQMAFDMGCKDAFDFPGFVPAYVRPLFCRGVGPFRWVALSGDPEDIA 373
                                           ************************************************************************* PP

                             TIGR01228 366 rtdkavkelfpedeelhrwidlakekvafqGlparicwlgygereklalainelvrsGelkapvvigrdhlda 438
                                           +td++vkel+p+d +lh+w+d+a+e+++fqGlparicw+g+g+r++l+la+ne+vr+Gelkapvvigrdhld+
  NCBI__GCF_000768355.1:WP_036139592.1 374 KTDAKVKELIPNDPHLHKWLDMAHERISFQGLPARICWVGLGDRHRLGLAFNEMVRNGELKAPVVIGRDHLDS 446
                                           ************************************************************************* PP

                             TIGR01228 439 GsvaspnreteamkdGsdavadwpllnallntaaGaswvslhhGGGvglGfslhaglvivadGtdeaaerlkr 511
                                           Gsv+spnrete+m dGsdav+dwpllna+ln+a+Ga+wvslhhGGGvg+G+s+h+g+viv+dG+++a++rl+r
  NCBI__GCF_000768355.1:WP_036139592.1 447 GSVSSPNRETESMMDGSDAVSDWPLLNAMLNVAGGATWVSLHHGGGVGMGYSQHSGVVIVCDGSEAADQRLAR 519
                                           ************************************************************************* PP

                             TIGR01228 512 vltadpGlGvirhadaGyesaldvakeqgldlpm 545
                                           vl +dpG+Gv+rhadaGye a ++akeqgl+lpm
  NCBI__GCF_000768355.1:WP_036139592.1 520 VLWNDPGTGVMRHADAGYEIAKQCAKEQGLKLPM 553
                                           *********************************8 PP

Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (545 nodes)
Target sequences:                          1  (554 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.01u 0.01s 00:00:00.02 Elapsed: 00:00:00.01
# Mc/sec: 27.71
//
[ok]
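How much of the model and of the candidate the domain hit covers follows directly from the coordinates in the hmmsearch domain row. The sketch below is a rough illustration rather than GapMind's own code: it splits that row on whitespace and computes both coverages. The `DomainHit` class and `parse_domain_row` function are hypothetical names, and the field positions assume the HMMER 3 domain-table layout shown in the report.

```python
from dataclasses import dataclass

@dataclass
class DomainHit:
    score: float
    i_evalue: float
    hmm_from: int
    hmm_to: int
    ali_from: int
    ali_to: int

def parse_domain_row(row: str) -> DomainHit:
    """Parse one whitespace-separated row of the per-domain table."""
    f = row.split()
    # Columns: # flag score bias c-Evalue i-Evalue hmmfrom hmmto [] alifrom alito ...
    return DomainHit(
        score=float(f[2]),
        i_evalue=float(f[5]),
        hmm_from=int(f[6]),
        hmm_to=int(f[7]),
        ali_from=int(f[9]),
        ali_to=int(f[10]),
    )

# Domain row and lengths copied from the report (M=545, 554 residues searched).
row = "1 ! 976.4 0.1 1.9e-298 1.9e-298 1 545 [] 9 553 .. 9 553 .. 1.00"
hit = parse_domain_row(row)

model_coverage = (hit.hmm_to - hit.hmm_from + 1) / 545
sequence_coverage = (hit.ali_to - hit.ali_from + 1) / 554

print(f"model coverage={model_coverage:.1%} sequence coverage={sequence_coverage:.1%}")
```

For this hit the alignment spans all 545 model positions, so the model coverage is 100%.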
This GapMind analysis is from Sep 24 2021. The underlying query database was built on Sep 17 2021.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other blast hits with at least 50% coverage are "low confidence."
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
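The definition of a gap given above (a step with no high- or medium-confidence candidate) can be sketched in a few lines. This is only an illustration under stated assumptions: the step names and confidence labels in the example dictionary are made up, and `find_gaps` is a hypothetical helper, not a GapMind function.

```python
def find_gaps(best_confidence: dict[str, str]) -> list[str]:
    """Return the steps whose best candidate is neither high nor medium confidence."""
    return [
        step
        for step, confidence in best_confidence.items()
        if confidence not in ("high", "medium")
    ]

# Made-up example: best candidate confidence per pathway step.
example = {
    "hutH": "high",
    "hutU": "high",
    "hutI": "low",   # only a low-confidence candidate, so this counts as a gap
    "hutG": "none",  # no candidate at all
}

gaps = find_gaps(example)
print(gaps)
```

A step with only a low-confidence candidate is still reported as a gap, which matches the wording above.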
For more information, see:
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory