Align subunit of catechol 1,2-dioxygenase (EC 1.13.11.1) (characterized) to candidate GFF2640 Psest_2692 (catechol 1,2-dioxygenase, proteobacterial)
Query= metacyc::MONOMER-3422
         (311 letters)

>FitnessBrowser__psRCH2:GFF2640
          Length = 313

 Score = 422 bits (1086), Expect = e-123
 Identities = 216/330 (65%), Positives = 244/330 (73%), Gaps = 40/330 (12%)

Query: 1   MTVKISHTADVQAFFNKVAGLDHAEGNPRFKQIILRVLQDTARLVEDLEITEDEFWHAID 60
           MTVKIS T+DVQ FF + +G  +  G+ R K +I R+L DTA++VEDLEIT+DEFW A+D
Sbjct: 1   MTVKISQTSDVQNFFKEASGFGNDAGSSRMKTVINRILVDTAKIVEDLEITQDEFWKAVD 60

Query: 61  YLNRLGGRNEAGLLAAGLGIEHFLDLLQDAKDAEAGLGGGTPRTIEGPLYVAGAPLAQGE 120
           YLNRLGGR+EAGLL AGLG+EHFLDLL+DAKD + GL GGTPRTIEGPLYVAGAP++Q E
Sbjct: 61  YLNRLGGRHEAGLLVAGLGLEHFLDLLEDAKDEQQGLTGGTPRTIEGPLYVAGAPISQAE 120

Query: 121 ARMDDGT--DPGVVMFLQGQVFDADGKPLAGATVDLWHANTQGTYSYFDSTQSEYNLRRR 178
            RMDDG+  D   VMFLQGQV   DGKP+A A VDLWHANT+G YSYFD +QSEYNLRRR
Sbjct: 121 TRMDDGSELDVATVMFLQGQVTGPDGKPVANAVVDLWHANTKGNYSYFDKSQSEYNLRRR 180

Query: 179 IITDAVGRYRARSIVPSGYGCDPQGTTQECLDLLGRHGQRPAHVHFFISAPGFRHLTTQI 238
           I+TD  G YRARSIVPSGYGC   G TQE LD LGRHG+RPAH+HFFISAPG RHLTTQI
Sbjct: 181 IVTDENGNYRARSIVPSGYGCSLDGPTQEVLDHLGRHGRRPAHIHFFISAPGHRHLTTQI 240

Query: 239 NLKMPLPRVIAVFRASALPNCEGDKYLWDDFAYATRDGLIGELRFV-------------- 284
           NL                    GD+YLWDDFAYATRDGL+G++RFV
Sbjct: 241 NL-------------------AGDEYLWDDFAYATRDGLVGDIRFVEDAEAARARGIEGS 281

Query: 285 -----AFDFHLQAAAAPEAEARSHRPRALQ 309
                 FDF LQAA APEAE RS RPRALQ
Sbjct: 282 RFAELTFDFQLQAAPAPEAEQRSARPRALQ 311

Lambda     K      H
   0.321    0.139    0.424

Gapped
Lambda     K      H
   0.267   0.0410    0.140

Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 372
Number of extensions: 11
Number of successful extensions: 3
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 311
Length of database: 313
Length adjustment: 27
Effective length of query: 284
Effective length of database: 286
Effective search space: 81224
Effective search space used: 81224
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.9 bits)
S2: 48 (23.1 bits)
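The summary numbers in the BLAST report above can be cross-checked with a little arithmetic. The sketch below assumes the standard Karlin-Altschul relations (bit score S' = (λS − ln K) / ln 2 and E = m'n' · 2^(−S')) and plugs in the gapped λ and K, the raw score, and the lengths printed in the report. The function and constant names are illustrative only and are not part of GapMind or BLAST.

```python
import math

# Values copied from the BLAST report above (gapped statistics).
GAPPED_LAMBDA = 0.267
GAPPED_K = 0.0410
RAW_SCORE = 1086            # the number in parentheses after "422 bits"
QUERY_LEN, SUBJECT_LEN = 311, 313
LENGTH_ADJUSTMENT = 27


def bit_score(raw_score: float, lam: float, k: float) -> float:
    """Normalized (bit) score: S' = (lambda * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def effective_search_space(query_len: int, subject_len: int, adjustment: int) -> int:
    """Product of both lengths after subtracting the length adjustment."""
    return (query_len - adjustment) * (subject_len - adjustment)


def expect_value(search_space: int, bits: float) -> float:
    """Expected number of chance hits: E = m'n' * 2^(-S')."""
    return search_space * 2.0 ** (-bits)


bits = bit_score(RAW_SCORE, GAPPED_LAMBDA, GAPPED_K)
space = effective_search_space(QUERY_LEN, SUBJECT_LEN, LENGTH_ADJUSTMENT)
evalue = expect_value(space, bits)
print(f"bit score ~ {bits:.1f}, search space = {space}, E ~ {evalue:.1e}")
```

The search space reproduces the reported 81224 exactly (284 × 286). The bit score and E-value should land close to the reported 422 bits and e-123, but not exactly on them, because the report prints λ and K rounded to three significant figures.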
Align candidate GFF2640 Psest_2692 (catechol 1,2-dioxygenase, proteobacterial) to HMM TIGR02439 (catA: catechol 1,2-dioxygenase (EC 1.13.11.1))
# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR02439.hmm
# target sequence database:        /tmp/gapView.10386.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR02439  [M=285]
Accession:   TIGR02439
Description: catechol_proteo: catechol 1,2-dioxygenase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                            Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                            -----------
     6e-146  470.8   0.2   6.9e-146  470.6   0.2    1.0  1  lcl|FitnessBrowser__psRCH2:GFF2640  Psest_2692 catechol 1,2-dioxygen

Domain annotation for each sequence (and alignments):
>> lcl|FitnessBrowser__psRCH2:GFF2640  Psest_2692 catechol 1,2-dioxygenase, proteobacterial
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  470.6   0.2  6.9e-146  6.9e-146       2     285 .]       8     292 ..       7     292 .. 0.99

  Alignments for each domain:
  == domain 1  score: 470.6 bits;  conditional E-value: 6.9e-146
                           TIGR02439   2 tkevqallkkvagleqeggnarikqivlrvlsdlfkaiedlditedefwaaveylnklGqanelgllaaGlGleh 76
                                         t++vq+++k+++g+ ++ g +r+k +++r+l d++k +edl+it+defw+av+yln+lG ++e+gll+aGlGleh
  lcl|FitnessBrowser__psRCH2:GFF2640   8 TSDVQNFFKEASGFGNDAGSSRMKTVINRILVDTAKIVEDLEITQDEFWKAVDYLNRLGGRHEAGLLVAGLGLEH 82
                                         789************************************************************************ PP

                           TIGR02439  77 fldlrldaadakagleggtPrtieGPlyvaGapvseGfarlddgseddkaetlvlkGqvldaeGkpiagakvevw 151
                                         fldl+ da+d+++gl+ggtPrtieGPlyvaGap+s+ ++r+ddgse d a++++l+Gqv+  +Gkp+a+a+v++w
  lcl|FitnessBrowser__psRCH2:GFF2640  83 FLDLLEDAKDEQQGLTGGTPRTIEGPLYVAGAPISQAETRMDDGSELDVATVMFLQGQVTGPDGKPVANAVVDLW 157
                                         *************************************************************************** PP

                           TIGR02439 152 hanskGnysffdksqsefnlrrtiitdaeGkyrarsvvPvGygvppqgptqqllnllGrhGerPahvhffvsapg 226
                                         han+kGnys+fdksqse+nlrr+i+td++G+yrars+vP+Gyg+  +gptq++l++lGrhG+rPah+hff+sapg
  lcl|FitnessBrowser__psRCH2:GFF2640 158 HANTKGNYSYFDKSQSEYNLRRRIVTDENGNYRARSIVPSGYGCSLDGPTQEVLDHLGRHGRRPAHIHFFISAPG 232
                                         *************************************************************************** PP

                           TIGR02439 227 yrklttqinlegdkylyddfafatreglvaevkevedaaaakrrgveg.rfaeiefdlel 285
                                         +r+lttqinl+gd+yl+ddfa+atr+glv+++++veda+aa++rg+eg rfae++fd++l
  lcl|FitnessBrowser__psRCH2:GFF2640 233 HRHLTTQINLAGDEYLWDDFAYATRDGLVGDIRFVEDAEAARARGIEGsRFAELTFDFQL 292
                                         ***********************************************989********97 PP

Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (285 nodes)
Target sequences:                          1  (313 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.01u 0.00s 00:00:00.01 Elapsed: 00:00:00.00
# Mc/sec: 10.30
//
[ok]
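To read the hmmsearch result above at a glance, it can help to turn the per-domain row into coverage fractions. The following is a minimal sketch that assumes the row is whitespace-delimited in the column order shown in the report (score, bias, c-Evalue, i-Evalue, hmm from/to, ali from/to, env from/to, acc). The `DomainHit` class and helper names are hypothetical and are not part of HMMER or GapMind.

```python
from dataclasses import dataclass

# Domain row and lengths copied from the hmmsearch report above.
DOMAIN_ROW = "1 ! 470.6 0.2 6.9e-146 6.9e-146 2 285 .] 8 292 .. 7 292 .. 0.99"
MODEL_LENGTH = 285      # from "[M=285]"
SEQUENCE_LENGTH = 313   # from "313 residues searched"


@dataclass
class DomainHit:
    score: float
    i_evalue: float
    hmm_from: int
    hmm_to: int
    ali_from: int
    ali_to: int


def parse_domain_row(row: str) -> DomainHit:
    """Split one whitespace-delimited row of the per-domain table."""
    f = row.split()
    # f[0]=domain number, f[1]=inclusion flag, f[2]=score, f[3]=bias,
    # f[4]=c-Evalue, f[5]=i-Evalue, f[6..7]=hmm range, f[9..10]=ali range
    return DomainHit(float(f[2]), float(f[5]), int(f[6]), int(f[7]), int(f[9]), int(f[10]))


def span_coverage(start: int, end: int, total_length: int) -> float:
    """Fraction of total_length covered by the inclusive range start..end."""
    return (end - start + 1) / total_length


hit = parse_domain_row(DOMAIN_ROW)
model_cov = span_coverage(hit.hmm_from, hit.hmm_to, MODEL_LENGTH)
seq_cov = span_coverage(hit.ali_from, hit.ali_to, SEQUENCE_LENGTH)
print(f"model coverage {model_cov:.1%}, sequence coverage {seq_cov:.1%}")
```

For this hit the alignment spans model positions 2 to 285 (284 of 285 nodes) and sequence positions 8 to 292 (285 of 313 residues), so nearly the whole model is matched.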
This GapMind analysis is from Sep 17 2021. The underlying query database was built on Sep 17 2021.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMER with enzyme models (usually from TIGRFAMs). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other blast hits with at least 50% coverage are "low confidence."
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
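For readers unfamiliar with the term, a six-frame translation reads a DNA sequence in three reading frames on each strand, so that proteins missed by gene prediction can still be searched. The sketch below is a generic illustration in plain Python using the standard genetic code. It is not GapMind's or Curated BLAST's implementation, and the function names and the example DNA string are made up for illustration.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G ('*' = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna: str) -> str:
    """Translate complete codons from position 0; unknown codons become 'X'."""
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))


def six_frame_translation(dna: str) -> dict:
    """Map frame labels (+1..+3, -1..-3) to their translations."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


# Hypothetical DNA fragment chosen so that frame +1 spells MTVKIS.
for label, protein in six_frame_translation("ATGACCGTGAAAATTTCC").items():
    print(label, protein)
```

Each of the six translations can then be searched like an ordinary protein sequence, which is why a hit may appear there even when the gene is absent from the predicted proteins.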
For more information, see:
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory