Align Acetate kinase; EC 2.7.2.1; Acetokinase (uncharacterized)
to candidate WP_068006986.1 PsAD2_RS14055 acetate/propionate family kinase
Query= curated2:B8EIS2
         (404 letters)

>NCBI__GCF_001623255.1:WP_068006986.1
          Length = 388

 Score =  342 bits (876), Expect = 1e-98
 Identities = 178/386 (46%), Positives = 254/386 (65%), Gaps = 15/386 (3%)

Query: 7   TLNAGSSSIKFALFREGLTQEGSAKELTPMAIGLAEMVGEERRITVHDGAGAKIYEVKRT 66
           TLN GSSS+KF+L++     E           G  E +G   ++ +       I +V+R
Sbjct: 6   TLNTGSSSVKFSLYKAHDKPEFYIS-------GSIECLGPSAKLKMQ----TPIEQVRR- 53

Query: 67  EHVDAPFHAEALRRILAWRQSAFPDAEVVAAGHRVVHGGVHYSAPVIVTDEVLKYLHTLI 126
             +    HA A+  I +  +     + VV  GHR+VHGG  +  P  +T +V+  L  LI
Sbjct: 54  -ELGPIDHASAVPAIFSALEPFLKGSSVVGIGHRIVHGGRSFFEPTELTPKVMDELEQLI 112

Query: 127 PLAPLHEPYNIAGILGAREAWPHVEQVACFDTAFHRTHPFVNDVFALPRRFYDEGVRRYG 186
           PLAPLH+PY++A I  A++ +P   Q+ CFDTAFH  H F ND FA+PR FY+EGVRRYG
Sbjct: 113 PLAPLHQPYSLATIQAAKQVFPDALQIGCFDTAFHAGHSFPNDAFAIPRHFYEEGVRRYG 172

Query: 187 FHGLSYEYIVRRLREIAPLHAAGRVVVAHLGNGASMCAIRDGLSVASSMGFTALDGLPMG 246
           FHGLS++ I   +R+     A  R+V+AHLGNGASMCA+ +G S+++S GF+A+DGLPMG
Sbjct: 173 FHGLSFDAICNEMRDRYSQVANERLVIAHLGNGASMCAVNNGHSISASTGFSAVDGLPMG 232

Query: 247 TRCGQLDPGVVLYLMQEKKMSAAEITDLLYRESGLKGLSGLSHDMRELEAADTLEAQQAI 306
           TRCG+LDPGV+LYLMQEKK+S  EI  ++YR SGL GLSG+S DMR+LE ++   A +AI
Sbjct: 233 TRCGRLDPGVMLYLMQEKKLSPEEIETIIYRRSGLLGLSGISSDMRDLEESNDPFAIEAI 292

Query: 307 EYFVFRIRRELGGLAAVLKGIDAIVFCGGIGENSRHVRERVLEGMEWIGVELDRSANSAN 366
           +Y+ ++ RRE+  +AA L+GIDA++F  G+GENS  VR ++ + + ++G+ +D+  N+ N
Sbjct: 293 DYYCYQARREVATMAASLEGIDALIFSAGVGENSALVRSKICQPLAFLGITIDQKKNTVN 352

Query: 367 AEVISSERSRTRVFVIPTDEEGMIAR 392
           A  I +     RVF++ TDEE +I R
Sbjct: 353 APEIGT--GPVRVFIVATDEEQVIVR 376

Lambda     K      H
   0.322    0.137    0.404

Gapped
Lambda     K      H
   0.267   0.0410    0.140

Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 428
Number of extensions: 22
Number of successful extensions: 3
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 404
Length of database: 388
Length adjustment: 31
Effective length of query: 373
Effective length of database: 357
Effective search space: 133161
Effective search space used: 133161
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.9 bits)
S2: 50 (23.9 bits)
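The summary figures in the BLAST report can be re-derived from the report's own numbers. The Python sketch below recomputes the percentages, the fraction of each sequence covered by the alignment, the effective search space, and an approximate E-value. It assumes the usual bit-score relation E = (effective search space) x 2^(-bit score); the constant and function names are illustrative and are not part of BLAST or GapMind. Because BLAST prints truncated percentages and a rounded bit score, the recomputed values should agree only approximately.

```python
# Values copied from the BLAST report above.
IDENTITIES, POSITIVES, GAPS, ALIGN_LEN = 178, 254, 15, 386
QUERY_LEN, SBJCT_LEN = 404, 388
QUERY_SPAN = (7, 392)    # first and last aligned query positions
SBJCT_SPAN = (6, 376)    # first and last aligned subject positions
BIT_SCORE = 342
LENGTH_ADJUSTMENT = 31


def percent(count, total):
    """Return count as a percentage of total."""
    return 100.0 * count / total


def coverage(span, length):
    """Percentage of a sequence of the given length spanned by (start, end)."""
    start, end = span
    return percent(end - start + 1, length)


def effective_search_space(query_len, db_len, adjustment):
    """Product of the two lengths after subtracting the length adjustment."""
    return (query_len - adjustment) * (db_len - adjustment)


def expect_value(bit_score, search_space):
    """Assumed relation: E = search_space * 2 ** (-bit_score)."""
    return search_space * 2.0 ** (-bit_score)


identity_pct = percent(IDENTITIES, ALIGN_LEN)
positive_pct = percent(POSITIVES, ALIGN_LEN)
gap_pct = percent(GAPS, ALIGN_LEN)
query_cov = coverage(QUERY_SPAN, QUERY_LEN)
sbjct_cov = coverage(SBJCT_SPAN, SBJCT_LEN)
search_space = effective_search_space(QUERY_LEN, SBJCT_LEN, LENGTH_ADJUSTMENT)
e_value = expect_value(BIT_SCORE, search_space)

print(f"identity {identity_pct:.1f}%, positives {positive_pct:.1f}%, gaps {gap_pct:.1f}%")
print(f"query coverage {query_cov:.1f}%, subject coverage {sbjct_cov:.1f}%")
print(f"effective search space {search_space}, E-value about {e_value:.1e}")
```

The recomputed search space matches the reported 133161 exactly, and the E-value lands on the same order of magnitude as the reported 1e-98.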
Align candidate WP_068006986.1 PsAD2_RS14055 (acetate/propionate family kinase)
to HMM TIGR00016 (ackA: acetate kinase (EC 2.7.2.1))
# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR00016.hmm
# target sequence database:        /tmp/gapView.3302522.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR00016  [M=405]
Accession:   TIGR00016
Description: ackA: acetate kinase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                              Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                              -----------
   6.6e-105  337.0   0.0   7.5e-105  336.9   0.0    1.0  1  NCBI__GCF_001623255.1:WP_068006986.1

Domain annotation for each sequence (and alignments):
>> NCBI__GCF_001623255.1:WP_068006986.1
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  336.9   0.0  7.5e-105  7.5e-105       8     401 ..       6     379 ..       1     383 [.  0.94

  Alignments for each domain:
  == domain 1  score: 336.9 bits;  conditional E-value: 7.5e-105
                             TIGR00016   8 vlnaGssslkfalldaensekvllsglverikleeariktvedgekkeeeklaiedheeavkkllntlkkdkk 80
                                           ln+Gsss+kf+l++a++ + +sg +e++ + + ++++ ++ + +l dh++av +++++l+
  NCBI__GCF_001623255.1:WP_068006986.1   6 TLNTGSSSVKFSLYKAHDKPEFYISGSIECLGPSAK--LKMQTPIEQVRRELGPIDHASAVPAIFSALEP--- 73
                                           69****************98999*******997776..45667778888889999**************6... PP

                             TIGR00016  81 ilkelseialiGHRvvhGgekftesvivtdevlkkikdiselAPlHnpaelegieavlklkvllkaknvavFD 153
                                           + s++ iGHR+vhGg +f e + +t +v+++++++++lAPlH p l +i+a++ +v ++a ++ +FD
  NCBI__GCF_001623255.1:WP_068006986.1  74 -FLKGSSVVGIGHRIVHGGRSFFEPTELTPKVMDELEQLIPLAPLHQPYSLATIQAAK--QVFPDALQIGCFD 143
                                           .55678999*************************************************..7788899****** PP

                             TIGR00016 154 tafHqtipeeaylYalPyslykelgvRrYGfHGtshkyvtqraakllnkplddlnlivcHlGnGasvsavknG 226
                                           tafH ++ +a+P+++y+e gvRrYGfHG+s+ + +++ + + +++ +l+++HlGnGas++av+nG
  NCBI__GCF_001623255.1:WP_068006986.1 144 TAFHAGHSFPNDAFAIPRHFYEE-GVRRYGFHGLSFDAICNEMRDRYSQ-VANERLVIAHLGNGASMCAVNNG 214
                                           ******999999*****998865.9*********************999.9999******************* PP

                             TIGR00016 227 ksidtsmGltPLeGlvmGtRsGdiDpaiisylaetlglsldeieetlnkksGllgisglssDlRdildkkeeg 299
                                           +si s G+ ++Gl mGtR+G +Dp+++ yl+++++ls +eie+++ ++sGllg+sg+ssD+Rd+++
  NCBI__GCF_001623255.1:WP_068006986.1 215 HSISASTGFSAVDGLPMGTRCGRLDPGVMLYLMQEKKLSPEEIETIIYRRSGLLGLSGISSDMRDLEESN--- 284
                                           ******************************************************************9988... PP

                             TIGR00016 300 neeaklAlkvyvhRiakyigkyiaslegelDaivFtgGiGenaaevrelvleklevlGlkldlelnnaarsgk 372
                                           + a A++ y++ ++ +++++asleg +Da++F +G+Gen+a vr ++++ l++lG+ +d+++n+ +
  NCBI__GCF_001623255.1:WP_068006986.1 285 DPFAIEAIDYYCYQARREVATMAASLEG-IDALIFSAGVGENSALVRSKICQPLAFLGITIDQKKNT----VN 352
                                           566899********************99.*************************************9....45 PP

                             TIGR00016 373 esvisteeskvkvlviptneelviaeDal 401
                                           i + v+v+++ t+ee vi++ ++
  NCBI__GCF_001623255.1:WP_068006986.1 353 APEIG--TGPVRVFIVATDEEQVIVRAVA 379
                                           55666..689***************9776 PP

Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (405 nodes)
Target sequences:                          1  (388 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.01u 0.00s 00:00:00.01 Elapsed: 00:00:00.00
# Mc/sec: 19.70
//
[ok]
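As a hedged illustration of reading the domain table in the hmmsearch output, the Python sketch below splits the single domain row into named fields and computes how much of the 405-node model and of the 388-residue candidate the alignment spans. The `DomainHit` fields and the `parse_domain_row` helper are names invented for this example; they are not part of HMMER, and the parser assumes the whitespace-separated column order shown in the report.

```python
from typing import NamedTuple


class DomainHit(NamedTuple):
    score: float
    i_evalue: float
    hmm_from: int
    hmm_to: int
    ali_from: int
    ali_to: int
    env_from: int
    env_to: int
    acc: float


def parse_domain_row(row: str) -> DomainHit:
    """Parse one row of the per-domain table (assumed column order)."""
    f = row.split()
    # f[0] is the domain number, f[1] the '!' flag, f[3] the bias, f[4] the
    # c-Evalue; f[8], f[11] and f[14] are the boundary markers ('..', '[.').
    return DomainHit(
        score=float(f[2]),
        i_evalue=float(f[5]),
        hmm_from=int(f[6]),
        hmm_to=int(f[7]),
        ali_from=int(f[9]),
        ali_to=int(f[10]),
        env_from=int(f[12]),
        env_to=int(f[13]),
        acc=float(f[15]),
    )


def span_coverage(start: int, end: int, length: int) -> float:
    """Fraction of a sequence or model covered by positions start..end."""
    return (end - start + 1) / length


# Row and lengths copied from the hmmsearch report above.
ROW = "1 ! 336.9 0.0 7.5e-105 7.5e-105 8 401 .. 6 379 .. 1 383 [. 0.94"
MODEL_LEN = 405    # [M=405]
TARGET_LEN = 388   # residues searched

hit = parse_domain_row(ROW)
model_cov = span_coverage(hit.hmm_from, hit.hmm_to, MODEL_LEN)
target_cov = span_coverage(hit.ali_from, hit.ali_to, TARGET_LEN)
print(f"model coverage: {model_cov:.1%}")
print(f"candidate coverage: {target_cov:.1%}")
```

For this hit the alignment spans 394 of the 405 model positions and 374 of the 388 candidate residues.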
This GapMind analysis is from Sep 24 2021. The underlying query database was built on Sep 17 2021.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other blast hits with at least 50% coverage are "low confidence."
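The three confidence labels above can be rolled up per step. The following Python sketch assumes each step already has a list of labels for its candidates (how the labels are assigned is not reproduced here) and flags every step whose best candidate is neither high nor medium confidence. The step names and labels in the example are hypothetical, as are the function names.

```python
from typing import Dict, List, Optional

# Higher rank means a more trusted candidate.
CONFIDENCE_RANK = {"low": 0, "medium": 1, "high": 2}


def best_confidence(labels: List[str]) -> Optional[str]:
    """Return the strongest label in the list, or None if there are no candidates."""
    if not labels:
        return None
    return max(labels, key=lambda label: CONFIDENCE_RANK[label])


def steps_without_good_candidate(step_to_labels: Dict[str, List[str]]) -> List[str]:
    """List the steps whose best candidate is neither high nor medium confidence."""
    return [
        step
        for step, labels in step_to_labels.items()
        if best_confidence(labels) not in ("high", "medium")
    ]


# Hypothetical input: step name -> confidence labels of its candidates.
example = {
    "step_1": ["high", "low"],
    "step_2": ["medium"],
    "step_3": ["low"],
    "step_4": [],
}
flagged = steps_without_good_candidate(example)
print(flagged)  # ['step_3', 'step_4']
```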
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
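To make "six-frame translation" concrete, here is a minimal Python sketch that translates a DNA string in all three forward frames and all three frames of the reverse complement, using the standard genetic code. It is only an illustration of the idea and is not how GapMind or Curated BLAST implement it; the function names and the example sequence are made up, and codons containing characters other than A, C, G, T are rendered as X by assumption.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Reverse-complement an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna: str) -> str:
    """Translate complete codons from position 0; '*' marks a stop codon."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translation(dna: str) -> dict:
    """Return translations keyed by frame: +1, +2, +3, -1, -2, -3."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


# Made-up 12-base example sequence.
for frame, protein in six_frame_translation("ATGACCCTGAAC").items():
    print(frame, protein)
```

Searching all six frames can recover a protein whose gene was missed or truncated during gene calling, which is why it complements a search restricted to predicted proteins.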
For more information, see:
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory