Align acetate kinase (EC 2.7.2.1) (characterized)
to candidate WP_028313010.1 H566_RS0121200 acetate/propionate family kinase
Query= BRENDA::P38502
         (408 letters)

>NCBI__GCF_000482785.1:WP_028313010.1
          Length = 408

 Score =  254 bits (648), Expect = 4e-72
 Identities = 159/400 (39%), Positives = 231/400 (57%), Gaps = 25/400 (6%)

Query: 3   VLVINAGSSSLKYQLIDMTN----ESALAVGLCERIGIDNSIITQKKFDGKKL------- 51
           +LVINAGSSS+K+ +         E A   G  E IG    ++  K   G+KL
Sbjct: 4   ILVINAGSSSVKFAVFQTDGHDLREPAALHGQLEGIGGAPHLVA-KDAAGRKLADDALPH 62

Query: 52  EKLTDLPTHKD-ALEEVVKALTDDEFGVIKDMGEINAVGHRVVHGGEKFTTSALYDEGVE 110
           +    L T  D AL  ++  + + E G       I A+GHRVVHGG       + D+ V
Sbjct: 63  DPAASLDTAYDTALAALLAWIGEHEVGFA-----IEAIGHRVVHGGPTHAAPVVVDDAVL 117

Query: 111 KAIKDCFELAPLHNPPNMMGISACAEIMPGTPMVIVFDTAFHQTMPPYAYMYALPYDLYE 170
             +     LAPLH P N+  I A   I P    +  FDTAFH++ P  A  +ALP   +E
Sbjct: 118 AELDRYVPLAPLHQPHNLDAIRALGRIEPDAIQIACFDTAFHRSQPAVAQRFALPRR-FE 176

Query: 171 KHGVRKYGFHGTSHKYVAERAALMLGKPAEETKIITCHLGNGSSITAVEGGKSVETSMGF 230
             GVR+YGFHG S++Y+A R   +LG+ A+  ++I  HLGNG+S+ A+ G +SV ++MGF
Sbjct: 177 AAGVRRYGFHGLSYEYIAGRLPELLGERADG-RVIVAHLGNGASLCALHGRRSVASTMGF 235

Query: 231 TPLEGLAMGTRCGSIDPAIVPFLMEKEGLTTREIDTLMNKKSGVLGVSGLSNDFRDLDEA 290
           + L+GL MGTR G+IDP ++ +L++ E + +  I  L+ K+SG+LGVSG+S D R L
Sbjct: 236 SALDGLMMGTRTGAIDPGVLLYLLDTEKMDSARISRLLYKESGLLGVSGISADMRTL--L 293

Query: 291 ASKGNRKAELALEIFAYKVKKFIGEYSAVLNGADAVVFTAGIGENSASIRKRILTGLDGI 350
           AS     AE A+E+F Y++ + IG  +A L G DA+VFTAGIGE++A +R+R+      +
Sbjct: 294 ASDAPEAAE-AVELFCYRIAREIGSLAAALGGLDALVFTAGIGEHAAPVRERVSAACGWL 352

Query: 351 GIKIDDEKNKIRGQEIDISTPDAKVRVFVIPTNEELAIAR 390
           G+K+D E N     ++ +    + V V  IPT+EE+ IAR
Sbjct: 353 GVKLDAEAN--AAGQLRLDAAGSAVAVACIPTDEEVMIAR 390

Lambda     K      H
   0.316    0.135    0.380

Gapped
Lambda     K      H
   0.267   0.0410    0.140

Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 415
Number of extensions: 24
Number of successful extensions: 6
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 408
Length of database: 408
Length adjustment: 31
Effective length of query: 377
Effective length of database: 377
Effective search space: 142129
Effective search space used: 142129
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.6 bits)
S2: 50 (23.9 bits)
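The statistics reported for this alignment are internally consistent, which a few lines of arithmetic can confirm. The sketch below assumes the standard Karlin-Altschul relations, bit score S' = (lambda * S - ln K) / ln 2 and expect value E = m * n * 2^(-S'), and it uses the rounded gapped parameters printed in the output, so it reproduces BLAST's own figures only approximately. The function and variable names are illustrative and not part of BLAST or GapMind.

```python
import math

def bit_score(raw_score, lam, k):
    """Convert a raw alignment score to bits (Karlin-Altschul)."""
    return (lam * raw_score - math.log(k)) / math.log(2)

def expect_value(bits, search_space):
    """Expected number of chance hits scoring at least `bits`."""
    return search_space * 2.0 ** (-bits)

# Values copied from the BLAST report above.
query_len = db_len = 408
length_adjustment = 31

# Effective lengths subtract the length adjustment from each side.
eff_query = query_len - length_adjustment
eff_db = db_len - length_adjustment
search_space = eff_query * eff_db

# Raw score 648 with the rounded gapped lambda and K from the report.
bits = bit_score(648, lam=0.267, k=0.0410)
evalue = expect_value(bits, search_space)

print(search_space, round(bits), f"{evalue:.1e}")
```

The search space comes out to 142129 and the bit score rounds to 254, matching the report; the E-value lands within rounding distance of the reported 4e-72.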
Align candidate WP_028313010.1 H566_RS0121200 (acetate/propionate family kinase)
to HMM TIGR00016 (ackA: acetate kinase (EC 2.7.2.1))
# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR00016.hmm
# target sequence database:        /tmp/gapView.1884983.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR00016  [M=405]
Accession:   TIGR00016
Description: ackA: acetate kinase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                             Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                             -----------
   4.8e-110  354.0   0.0   5.8e-110  353.7   0.0    1.0  1  NCBI__GCF_000482785.1:WP_028313010.1

Domain annotation for each sequence (and alignments):
>> NCBI__GCF_000482785.1:WP_028313010.1
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  353.7   0.0  5.8e-110  5.8e-110       5     400 ..       3     392 ..       1     397 [. 0.87

Alignments for each domain:
== domain 1 score: 353.7 bits; conditional E-value: 5.8e-110

TIGR00016 5 kilvlnaGssslkfalldaen...sekvllsglverikleeariktvedgekkeeeklai.......edheea 67
ilv+naGsss+kfa++++ + e l+g e i + + + g k + l + +++++a
NCBI__GCF_000482785.1:WP_028313010.1 3 AILVINAGSSSVKFAVFQTDGhdlREPAALHGQLEGIGGAPHLVAKDAAGRKLADDALPHdpaasldTAYDTA 75
69****************9874444455788888988887776665555544433333221110001344455 PP

TIGR00016 68 vkkllntlkkdkkilkelseialiGHRvvhGgekftesvivtdevlkkikdiselAPlHnpaelegieavlkl 140
+++ll + + ++ i++iGHRvvhGg +++ v+v+d vl+++ + ++lAPlH p +l++i+a+
NCBI__GCF_000482785.1:WP_028313010.1 76 LAALLAWIGE----HEVGFAIEAIGHRVVHGGPTHAAPVVVDDAVLAELDRYVPLAPLHQPHNLDAIRALG-- 142
5555555553....566779**************************************************9.. PP

TIGR00016 141 kvllkaknvavFDtafHqtipeeaylYalPyslykelgvRrYGfHGtshkyvtqraakllnkplddlnlivcH 213
+ ++a ++a+FDtafH+ p a +alP + ++ gvRrYGfHG+s++y++ r+ +ll+ +d ++iv+H
NCBI__GCF_000482785.1:WP_028313010.1 143 RIEPDAIQIACFDTAFHRSQPAVAQRFALP-RRFEAAGVRRYGFHGLSYEYIAGRLPELLGE-RADGRVIVAH 213
8889999***********************.567899************************9.7789****** PP

TIGR00016 214 lGnGasvsavknGksidtsmGltPLeGlvmGtRsGdiDpaiisylaetlglsldeieetlnkksGllgisgls 286
lGnGas++a++ +s+ +mG+ L+Gl+mGtR+G iDp+++ yl +t+++ + i ++l k+sGllg+sg+s
NCBI__GCF_000482785.1:WP_028313010.1 214 LGNGASLCALHGRRSVASTMGFSALDGLMMGTRTGAIDPGVLLYLLDTEKMDSARISRLLYKESGLLGVSGIS 286
************************************************************************* PP

TIGR00016 287 sDlRdildkkeegneeaklAlkvyvhRiakyigkyiaslegelDaivFtgGiGenaaevrelvleklevlGlk 359
+D+R++l+ ea+ A++++++Ria+ ig+ +a+l g lDa+vFt+GiGe aa vre+v + lG+k
NCBI__GCF_000482785.1:WP_028313010.1 287 ADMRTLLASD---APEAAEAVELFCYRIAREIGSLAAALGG-LDALVFTAGIGEHAAPVRERVSAACGWLGVK 355
******9887...567999********************76.******************************* PP

TIGR00016 360 ldlelnnaarsgkesvisteeskvkvlviptneelviaeDa 400
ld e n a + + s+v v+ ipt+ee++ia+ a
NCBI__GCF_000482785.1:WP_028313010.1 356 LDAEANAAGQL----RLDAAGSAVAVACIPTDEEVMIARHA 392
******95444....4556788****************976 PP

Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (405 nodes)
Target sequences:                          1  (408 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.00u 0.00s 00:00:00.00 Elapsed: 00:00:00.00
# Mc/sec: 20.30
//
[ok]
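The domain table above gives the coordinates needed to judge how much of the model and of the candidate the hit spans: hmmfrom 5 to 400 on a 405-node model, and alifrom 3 to 392 on a 408-residue protein. A minimal sketch of that calculation follows. Treating coverage as the aligned span divided by the full length is an assumption made for illustration, and the dictionary keys are hypothetical names rather than HMMER's.

```python
def span_coverage(start, end, total_length):
    """Fraction of a sequence or model covered by an inclusive span."""
    return (end - start + 1) / total_length

# Coordinates copied from the hmmsearch domain table above.
domain = {
    "score": 353.7,
    "hmm_from": 5, "hmm_to": 400,   # span on the TIGR00016 model
    "ali_from": 3, "ali_to": 392,   # span on the candidate protein
}
model_length = 405      # [M=405] in the report
protein_length = 408    # residues searched

hmm_cov = span_coverage(domain["hmm_from"], domain["hmm_to"], model_length)
protein_cov = span_coverage(domain["ali_from"], domain["ali_to"], protein_length)

print(f"model coverage: {hmm_cov:.1%}, protein coverage: {protein_cov:.1%}")
```

Both fractions are well above 90%, so the hit spans nearly the whole model and nearly the whole candidate.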
This GapMind analysis is from Apr 09 2024. The underlying query database was built on Sep 17 2021.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified either by running ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by running HMMER with enzyme models (usually from TIGRFAMs). A ublast hit may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other blast hits with at least 50% coverage are "low confidence."
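As a worked example of the coverage threshold, the figures from the BLAST alignment at the top of this page can be plugged in directly. This is only a sketch: it assumes that coverage means the fraction of the characterized query protein spanned by the alignment, which this page does not define, and the helper names are illustrative.

```python
def percent_identity(identities, alignment_length):
    """Identical positions as a percentage of alignment columns."""
    return 100.0 * identities / alignment_length

def coverage(start, end, total_length):
    """Fraction of a sequence covered by an inclusive aligned span."""
    return (end - start + 1) / total_length

# From the alignment above: 159/400 identities, query residues 3..390 of 408.
identity = percent_identity(159, 400)
query_cov = coverage(3, 390, 408)

# The 50% figure is the low-confidence coverage floor quoted above.
meets_coverage_floor = query_cov >= 0.5

print(f"identity {identity:.2f}%, coverage {query_cov:.1%}, floor met: {meets_coverage_floor}")
```

Under that assumed definition the hit covers about 95% of the query at just under 40% identity, comfortably above the 50% coverage floor.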
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
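The definition of a gap given above (a step with no high- or medium-confidence candidate) can be expressed in a few lines. The sketch below is not GapMind's code: the data structure, the step names other than ackA, and the confidence labels assigned to them are all hypothetical and chosen only to illustrate the rule.

```python
def find_gaps(step_candidates):
    """Return the steps that have no high- or medium-confidence candidate."""
    return [
        step
        for step, levels in step_candidates.items()
        if not any(level in ("high", "medium") for level in levels)
    ]

# Hypothetical pathway: each step maps to the confidence levels of its candidates.
pathway = {
    "ackA": ["high"],          # step with a high-confidence candidate
    "pta": ["low", "low"],     # only low-confidence candidates: a gap
    "transporter": [],         # no candidates at all: a gap
}

gaps = find_gaps(pathway)
print(gaps)
```

A step with only low-confidence candidates counts as a gap just like a step with no candidates, which is why both of the last two hypothetical steps are reported.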
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory