Align fumarylacetoacetase (EC 3.7.1.2) (characterized)
to candidate WP_038200417.1 Q392_RS01235 fumarylacetoacetase
Query= BRENDA::Q94272
         (418 letters)

>NCBI__GCF_000745855.1:WP_038200417.1
          Length = 422

 Score = 363 bits (933), Expect = e-105
 Identities = 213/417 (51%), Positives = 264/417 (63%), Gaps = 32/417 (7%)

Query: 11  SDFPIQNLPYGVFSTKADSSR-HIGVAIGDQILNLAEIANLFDGPQLKAHQDVFKQSTLN 69
           SDFPIQNLP+G F  K       IGVAIGDQ+L+L   A L D               +N
Sbjct: 24  SDFPIQNLPFGRFRRKGSGEAFRIGVAIGDQVLDL-RAAGLVD------------TDDMN 70

Query: 70  AFMALPRPAWLEARARIQQLLSEDCAVLRDNAHLRSRALVAQSDATMHLPAQIGDYTDFY 129
           A MA   PA    R  ++  LS+  A         ++ALV Q+D    +P ++GDYTDFY
Sbjct: 71  ALMAAA-PA---GRRALRAKLSDGLAAGSTQQAAWAQALVPQADCEYTVPCRVGDYTDFY 126

Query: 130 SSIHHATNVGIMFRGKENALMPNWKWLPVGYHGRASSIVVSGTDLKRPVGQTKAPDAEVP 189
           + IHHAT +G +FR  +  LMPN+KW+P+GYHGRASSIVVSGT  KRP GQTKAPDA  P
Sbjct: 127 TGIHHATTIGKLFR-PDQPLMPNYKWVPIGYHGRASSIVVSGTPFKRPQGQTKAPDAAEP 185

Query: 190 SFGPSKLMDFELEMAFFVGGPENELGTRVPIEKAEDRIFGVVLMNDWSARDIQAWEYVPL 249
           SFGP K +D+ELE+ F+V G  N LG  V I+ AE  +FGV L NDWSARD+QAWEY PL
Sbjct: 186 SFGPCKRLDYELELGFYV-GQGNALGQPVGIDDAEAHLFGVGLFNDWSARDLQAWEYQPL 244

Query: 250 GPFLAKSFATTVSPWVVSIEALRPYFV--ENPVQDPVPPAYLH---HDDPFTLDINLAV- 303
           GPFLAK+FA+TVSPW+V++EAL P+      P  DP P  YL    + +   LDI L V
Sbjct: 245 GPFLAKNFASTVSPWIVTMEALAPFRAPFTRPAGDPQPLPYLDGAANREGGQLDITLEVL 304

Query: 304 ----SIRPEGDAVDHIVCKTNFKHLYWTLKQQLAHHTVNGCNLRAGDLLGSGTVSGPEEG 359
                +R +G+A   +      +  YWT  Q LAHHTVNGCNL+ GDLLGSGT+SGP
Sbjct: 305 VQTAKMREQGEAPARLTHGLVKEAAYWTAAQLLAHHTVNGCNLQPGDLLGSGTLSGPTPD 364

Query: 360 AYGSMLELSWRGAKEVPV-GSEIRKFLKDGDEVNLSGVCEKNG-VRIGFGECRGKVL 414
           A GS++EL+  G + + + G E R FL+DGD + L G CE+ G VRIGFGE  G VL
Sbjct: 365 AAGSLMELTLGGKQPITLPGGEQRTFLQDGDTLVLRGWCERAGAVRIGFGEASGTVL 421

Lambda     K      H
   0.319    0.137    0.421

Gapped
Lambda     K      H
   0.267   0.0410    0.140

Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 551
Number of extensions: 28
Number of successful extensions: 8
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 418
Length of database: 422
Length adjustment: 32
Effective length of query: 386
Effective length of database: 390
Effective search space: 150540
Effective search space used: 150540
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.8 bits)
S2: 50 (23.9 bits)
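As a rough check on the BLAST report above, its summary numbers can be recomputed from the values it prints. The sketch below is illustrative only. The function and variable names are made up here. Integer truncation is assumed for the percentages because it reproduces the values shown. The E-value uses the standard approximation E = m' * n' * 2^(-S') with the rounded bit score, so it can only agree with the report to within rounding.

```python
def percent(part: int, total: int) -> int:
    """Percentage with integer truncation (an assumption; it matches the report)."""
    return 100 * part // total


def approx_evalue(bit_score: float, eff_query_len: int, eff_db_len: int) -> float:
    """Approximate E-value: effective search space times 2^(-bit score)."""
    return eff_query_len * eff_db_len * 2.0 ** (-bit_score)


# Values copied from the report above.
query_len, db_len, length_adjustment = 418, 422, 32
alignment_len, identities, positives, gaps = 417, 213, 264, 32
bit_score = 363  # rounded in the report, so the E-value below is approximate

# Effective lengths are the raw lengths minus the length adjustment.
eff_query_len = query_len - length_adjustment
eff_db_len = db_len - length_adjustment
search_space = eff_query_len * eff_db_len
e_value = approx_evalue(bit_score, eff_query_len, eff_db_len)

print("effective lengths:", eff_query_len, eff_db_len)
print("effective search space:", search_space)
print("identity / positives / gaps (%):",
      percent(identities, alignment_len),
      percent(positives, alignment_len),
      percent(gaps, alignment_len))
print(f"approximate E-value: {e_value:.1e}")
```

The effective lengths and search space come out as 386, 390, and 150540, the same as in the report, and the approximate E-value is on the same order as the reported e-105.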
Align candidate WP_038200417.1 Q392_RS01235 (fumarylacetoacetase)
to HMM TIGR01266 (fahA: fumarylacetoacetase (EC 3.7.1.2))
# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR01266.hmm
# target sequence database:        /tmp/gapView.24270.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR01266  [M=420]
Accession:   TIGR01266
Description: fum_ac_acetase: fumarylacetoacetase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                                  Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                                  -----------
   3.2e-158  512.9   0.0   1.3e-157  511.0   0.0    1.7  1  lcl|NCBI__GCF_000745855.1:WP_038200417.1  Q392_RS01235 fumarylacetoacetase

Domain annotation for each sequence (and alignments):
>> lcl|NCBI__GCF_000745855.1:WP_038200417.1  Q392_RS01235 fumarylacetoacetase
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  511.0   0.0  1.3e-157  1.3e-157       1     418 [.      14     422 .]      14     422 .] 0.94

  Alignments for each domain:
  == domain 1  score: 511.0 bits;  conditional E-value: 1.3e-157
                                 TIGR01266   1 sfvavak..nsdfplqnlPyGvfstk.adssrrigvaiGdqildlskiaaaglfeglalkehqevfkes 66
                                               s+v++a+ sdfp+qnlP+G f k ++ rigvaiGdq+ldl + agl +
  lcl|NCBI__GCF_000745855.1:WP_038200417.1  14 SWVESANaaGSDFPIQNLPFGRFRRKgSGEAFRIGVAIGDQVLDLRA---AGLVDT------------D 67
                                               89999986679************9762456779***********975...466655............5 PP

                                 TIGR01266  67 tlnaflalgrparkevrerlqkllsesaevlrdnaalrkeallaqaeatmhlPaqiGdytdfyssirha 135
                                               +na++a + + r+++r++l + l++ + + +a +al++qa+++ +P ++Gdytdfy++i+ha
  lcl|NCBI__GCF_000745855.1:WP_038200417.1  68 DMNALMAAAPAGRRALRAKLSDGLAAGSTQ----QAAWAQALVPQADCEYTVPCRVGDYTDFYTGIHHA 132
                                               68********************99955544....677889***************************** PP

                                 TIGR01266 136 tnvGilfrgkdnallPnykhlPvgyhGrassvvvsGtelrrPvGqikadnakePvfgpckkldlelela 204
                                               t +G+lfr +d +l+Pnyk++P+gyhGrass+vvsGt+ +rP+Gq+ka++a eP+fgpck+ld+elel+
  lcl|NCBI__GCF_000745855.1:WP_038200417.1 133 TTIGKLFR-PDQPLMPNYKWVPIGYHGRASSIVVSGTPFKRPQGQTKAPDAAEPSFGPCKRLDYELELG 200
                                               ********.************************************************************ PP

                                 TIGR01266 205 ffvgtenelGeavpiekaeehifGvvllndwsardiqaweyvPlGPflaksfattvsPwvvsiealePf 273
                                               f+vg++n+lG++v i+ ae h+fGv l+ndwsard+qawey+PlGPflak+fa+tvsPw+v++eal Pf
  lcl|NCBI__GCF_000745855.1:WP_038200417.1 201 FYVGQGNALGQPVGIDDAEAHLFGVGLFNDWSARDLQAWEYQPLGPFLAKNFASTVSPWIVTMEALAPF 269
                                               ********************************************************************* PP

                                 TIGR01266 274 rvaqlePeqdpkplpylredr..adtafdielevslkteGlae....aavisrsnaks.lywtlkqqla 335
                                               r + ++P +dp+plpyl + +di lev ++t+ ++e +a++++ +k+ ywt +q la
  lcl|NCBI__GCF_000745855.1:WP_038200417.1 270 RAPFTRPAGDPQPLPYLDGAAnrEGGQLDITLEVLVQTAKMREqgeaPARLTHGLVKEaAYWTAAQLLA 338
                                               *****************7643228899************999988889999999988637********* PP

                                 TIGR01266 336 hhsvnGcnlraGdllgsGtisGkeeeafGsllelsakGkkevkladgetrkfledGdevilrgvckkeG 404
                                               hh+vnGcnl++GdllgsGt+sG+ ++a Gsl+el+ +Gk+++ l ge+r+fl+dGd+++lrg c++ G
  lcl|NCBI__GCF_000745855.1:WP_038200417.1 339 HHTVNGCNLQPGDLLGSGTLSGPTPDAAGSLMELTLGGKQPITLPGGEQRTFLQDGDTLVLRGWCERAG 407
                                               ********************************************************************* PP

                                 TIGR01266 405 vr.vGfGecaGkvlp 418
                                               + +GfGe G+vl+
  lcl|NCBI__GCF_000745855.1:WP_038200417.1 408 AVrIGFGEASGTVLA 422
                                               877**********96 PP

Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (420 nodes)
Target sequences:                          1  (422 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.05u 0.01s 00:00:00.06 Elapsed: 00:00:00.04
# Mc/sec: 3.59
//
[ok]
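To see how much of the model and of the protein this hmmsearch domain hit spans, the coordinates in the domain table can be turned into coverage fractions. This is a minimal sketch, not part of GapMind: `DomainHit`, `parse_domain_row`, and `coverage` are hypothetical names, and the parser assumes the whitespace-separated column order of the domain row shown in the report.

```python
from dataclasses import dataclass


@dataclass
class DomainHit:
    score: float
    hmm_from: int
    hmm_to: int
    ali_from: int
    ali_to: int


def parse_domain_row(row: str) -> DomainHit:
    """Parse one row of the per-domain table (assumed column order:
    #, !/?, score, bias, c-Evalue, i-Evalue, hmmfrom, hmm to, flags,
    alifrom, ali to, flags, envfrom, env to, flags, acc)."""
    fields = row.split()
    return DomainHit(
        score=float(fields[2]),
        hmm_from=int(fields[6]),
        hmm_to=int(fields[7]),
        ali_from=int(fields[9]),
        ali_to=int(fields[10]),
    )


def coverage(start: int, end: int, total_length: int) -> float:
    """Fraction of a model or sequence spanned by an inclusive 1-based range."""
    return (end - start + 1) / total_length


# Domain row and lengths copied from the report above (M=420, 422 residues).
ROW = "1 ! 511.0 0.0 1.3e-157 1.3e-157 1 418 [. 14 422 .] 14 422 .] 0.94"
MODEL_LENGTH = 420
SEQUENCE_LENGTH = 422

hit = parse_domain_row(ROW)
model_coverage = coverage(hit.hmm_from, hit.hmm_to, MODEL_LENGTH)
sequence_coverage = coverage(hit.ali_from, hit.ali_to, SEQUENCE_LENGTH)

print(f"model coverage:    {model_coverage:.1%}")
print(f"sequence coverage: {sequence_coverage:.1%}")
```

For this hit, the alignment covers model positions 1 to 418 of 420 and protein positions 14 to 422 of 422.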
This GapMind analysis is from Sep 24 2021. The underlying query database was built on Sep 17 2021.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other blast hits with at least 50% coverage are "low confidence."
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
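For readers unfamiliar with the term, a six-frame translation translates a DNA sequence in all three reading frames on both strands, so that a protein missed by gene prediction can still be found by a protein search. The sketch below shows the idea only. It is not how GapMind or Curated BLAST implements it, the function names are hypothetical, and it assumes the standard genetic code with unknown or ambiguous codons written as "X".

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna: str) -> str:
    """Translate complete codons from the start of dna; '*' marks a stop."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")
        for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translation(dna: str) -> dict:
    """Map frame labels (+1, +2, +3, -1, -2, -3) to protein strings."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


for label, protein in six_frame_translation("ATGGCCTAA").items():
    print(label, protein)
```

In practice, each frame would then be split at stop codons and the resulting open reading frames searched with a protein aligner.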
For more information, see:
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory