Align trehalose-specific PTS system, I, HPr, and IIA components (characterized)
to candidate WP_068174654.1 HTA01S_RS20310 phosphoenolpyruvate--protein phosphotransferase
Query= reanno::pseudo3_N2E3:AO353_15995
         (844 letters)

>NCBI__GCF_001592305.1:WP_068174654.1
          Length = 588

 Score = 262 bits (669), Expect = 5e-74
 Identities = 194/581 (33%), Positives = 282/581 (48%), Gaps = 43/581 (7%)

Query: 277 LRGVCASAGSAFGYVVQVAERTLEMPEF--AADQQL-ERESLERA-------LMHATQAL 326
           + G   S G A G  V VA   +++  +   ADQ   E E L +A       ++   Q L
Sbjct: 5   VHGQAVSRGIAIGRAVIVASSRVDVAHYFVTADQTTAEIERLRQARNAVMEEIVRVQQGL 64

Query: 327 QRLRDNAAGEAQADIFKAHQELLEDPSLLEQAQALIAEGK-SAAFAWNSATEATATLFKS 385
             L  N A    + +   H  LL+D  L+   +  I +   +A +A  +  E  A  F
Sbjct: 65  GELGSNDAHPELSALLDVHLMLLQDEQLISGVKHWIVDRHYNAEWALTTQLEVIARQFDE 124

Query: 386 LGSTLLAERALDLMDVGQRVLKLILGVP-----------------DGVWELPDQAILIAE 428
           +    L ER  DL  V +R+L+ + GV                  D V ++P   +LIA
Sbjct: 125 MEDPYLRERKADLEQVVERMLRFMRGVASPIMAPVRTEARRDAAHDSVVDVP--LVLIAH 182

Query: 429 QLTPSQTAALDTGKVLGFATVGGGATSHVAILARALGLPAVCGLPLQVLSLASGTRVLLD 488
            L+P+           GF T  GG TSH AI+AR++ +PAV G       +     V++D
Sbjct: 183 DLSPADMLQFKQSVFAGFVTDVGGKTSHTAIVARSMDIPAVVGARSASHLVEQDDWVIID 242

Query: 489 ADKGELHLDPAVSVIEQLHAKRQQQRQRHQHELENAARAAVTRDGHHFEVTANVASLAET 548
            D G + +DP+  ++ +   K++Q     +         AVT DG   E+ AN+    +
Sbjct: 243 GDAGVVLVDPSPVLLAEYGFKQRQGEVERERLSRLKNTPAVTLDGQRIELQANIEQPDDA 302

Query: 549 EQAMSLGAEGIGLLRSEFLYQQRS-VAPSHDEQAGTYSAIARALGPQRNL--VVRTLDVG 605
             A+  GA G+GL R+EFL+  R+   P  +EQ   Y A  RA+   + L   +RT+DVG
Sbjct: 303 VAALKAGAVGVGLFRTEFLFMGRNGKLPDEEEQ---YQAYRRAVEGMQGLPVTIRTVDVG 359

Query: 606 GDKPLAYVPM---DSEANPFLGMRGIRLCLERPQLLREQFRAILSSAGLARLHIMLPMVS 662
            DKPL   P+   +   NP LG+R IR  L  P +   Q RAIL +A    +++++PM++
Sbjct: 360 ADKPLDRTPVRAGEDHLNPALGLRAIRWSLADPAMFLAQLRAILRAAAHGSVNLLIPMLA 419

Query: 663 QLSELRLARLMLEEEALALGLRELP----KLGIMIEVPAAALMADLFAPEVDFFSIGTND 718
             SE+R   ++++     L  R       +LG MIE+PAAAL   LF    DF SIGTND
Sbjct: 420 HASEIRQTLVLVDRAREQLNTRGQVYGPVRLGAMIEIPAAALTIPLFLRHFDFLSIGTND 479

Query: 719 LTQYTLAMDRDHPRLASQADSFHPSVLRLIASTVKAAHAHGKWVGVCGALASETLAVPLL 778
           L QYTLA+DR    +A   D  HP+VL+L+AST+    A GK V VCG +A +     LL
Sbjct: 480 LIQYTLAIDRADEAVAHLYDPVHPAVLQLLASTIAQCRAQGKGVSVCGEMAGDVTMTRLL 539

Query: 779 LGLGVDELSVSVPLIPAIKAAIREVELSDCQAIAHQVLGLE 819
           LGLG+   S+    I A+K  I   + +  Q  A  VL  E
Sbjct: 540 LGLGLRTFSMHPSQILAVKQQILRSDTARLQVWAQSVLVAE 580

Lambda     K      H
   0.318    0.132    0.370

Gapped
Lambda     K      H
   0.267   0.0410    0.140

Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 1029
Number of extensions: 44
Number of successful extensions: 6
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 844
Length of database: 588
Length adjustment: 39
Effective length of query: 805
Effective length of database: 549
Effective search space: 441945
Effective search space used: 441945
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 54 (25.4 bits)
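The statistics in the BLAST report above can be cross-checked by hand. The following is a minimal Python sketch, not part of GapMind or BLAST itself, that recomputes the effective search space, bit score, E-value, percent identity, and query coverage from the numbers printed in the report. It assumes the standard Karlin-Altschul formulas with the gapped Lambda and K values; the variable names are illustrative only.

```python
import math

# Values copied from the BLAST report above.
raw_score = 669                    # "Score = 262 bits (669)"
lam, K = 0.267, 0.0410             # gapped Lambda and K
query_len, subject_len = 844, 588
length_adjustment = 39
identities, aligned_columns = 194, 581
query_start, query_end = 277, 819  # aligned range on the query

# Effective lengths and search space after the length adjustment.
eff_query = query_len - length_adjustment
eff_subject = subject_len - length_adjustment
search_space = eff_query * eff_subject

# Karlin-Altschul statistics (assumed standard formulas):
# bit score S' = (Lambda * S - ln K) / ln 2, E-value = m * n * 2^(-S').
bit_score = (lam * raw_score - math.log(K)) / math.log(2)
e_value = search_space * 2.0 ** (-bit_score)

percent_identity = 100.0 * identities / aligned_columns
query_coverage = 100.0 * (query_end - query_start + 1) / query_len

print(f"search space = {search_space}")
print(f"bit score = {bit_score:.1f}, E-value = {e_value:.1e}")
print(f"identity = {percent_identity:.0f}%, query coverage = {query_coverage:.0f}%")
```

The search space should match the reported 441945 exactly, while the bit score and E-value should land close to the reported 262 bits and 5e-74; small differences are expected because Lambda and K are printed with limited precision.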
This GapMind analysis is from Apr 09 2024. The underlying query database was built on Sep 17 2021.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other blast hits with at least 50% coverage are "low confidence."
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
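The gap rule described above (a step with no high- or medium-confidence candidate may be considered a gap) can be sketched in a few lines of Python. This is an illustration only, not GapMind's code: the function name `find_gaps`, the input layout, and the step names are hypothetical, and the confidence labels are assumed to have been assigned already.

```python
from typing import Dict, List


def find_gaps(step_candidates: Dict[str, List[str]]) -> List[str]:
    """Return the steps that have no high- or medium-confidence candidate.

    step_candidates maps a step name to the confidence labels
    ("high", "medium", or "low") of its candidates (hypothetical layout).
    """
    return [
        step
        for step, labels in step_candidates.items()
        if not any(label in ("high", "medium") for label in labels)
    ]


# Hypothetical pathway with four steps.
example = {
    "stepA": ["high", "low"],
    "stepB": ["low"],       # only a low-confidence candidate: a gap
    "stepC": [],            # no candidates at all: a gap
    "stepD": ["medium"],
}
gaps = find_gaps(example)
print(gaps)
```

Under this rule a step with only low-confidence candidates is treated the same way as a step with no candidates at all.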
For more information, see:
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory