Align ABC-type sugar transport system, ATPase component protein (characterized, see rationale)
to candidate 3607097 Dshi_0519 ABC transporter related (RefSeq)
Query= uniprot:D8IUD1
         (522 letters)

>FitnessBrowser__Dino:3607097
          Length = 498

 Score = 285 bits (730), Expect = 2e-81
 Identities = 187/473 (39%), Positives = 260/473 (54%), Gaps = 10/473 (2%)

Query: 16  LSGIGKRYAAP-VLDGIDLDLRPGQVLALTGENGAGKSTLSKIICGLVDASAGGMMLDGQ 74
           ++G+ K +    VL GIDL L PG V  L G NGAGKSTL K+ICG   A  G M L
Sbjct: 1   MNGLQKSFGKNNVLRGIDLTLDPGSVTVLMGANGAGKSTLVKVICGQHRADGGTMRLATN 60

Query: 75  PYAPASRTQAEGLGIRMVMQELN--LIPTLSIAENLFLEKLPRR-FGWIDRKKLAEAARA 131
            + P     A   G+  V Q ++  +IP L +A NL L++L     G   R++      A
Sbjct: 61  AFDPEDAADAIRQGVVTVHQSIDDGVIPDLDVANNLMLDRLAEHSHGLFVRERHLRTEAA 120

Query: 132 QMEVVGLGELDPWTPVGDLGLGHQQMVEIARNLIGSCRCLILDEPTAMLTNREVELLFSR 191
           ++      E++    V DL +  +QM+ IAR +  + + LILDEPT+ L+  E + LF
Sbjct: 121 KVAAAMGIEVNLRARVSDLSVADRQMIAIARAMARAPKVLILDEPTSSLSATEADRLFEL 180

Query: 192 IERLRAEGVAIIYISHRLEELKRIADRIVVLRDGKLVCNDDIGRYSTEQLVQLMAGELTK 251
           I+RLRA+GVAI+YISHR+ +++RIADRIVV+RDG +    +      E  V  M G
Sbjct: 181 IDRLRAQGVAILYISHRMSDIRRIADRIVVMRDGMISGVFETEPLDLEAAVTAMLGH-RM 239

Query: 252 VDLDAEHRRIGAPVLRIRGLGRAPVVHPASLALHAGEVLGIAGLIGSGRTELLRLIFGAD 311
            ++DAE R+   PVL I+ L       P +L  H GEV+ + GL+GSG++ L  ++FG
Sbjct: 240 TEVDAEVRQGTHPVLEIKNLQLFEGASPITLTAHDGEVVALVGLLGSGKSRLAEILFGIA 299

Query: 312 RAEQGEIFI-GDSQEPARIRSPKDAVKAGIAMVTEDRKGQGLLLPQAISVNTSLANLGSV 370
           R  +G I I G    P   RS KDA+  G+ M  +DR    ++    I+ N +L  L  +
Sbjct: 300 RPIRGSIRIKGKDYSP---RSVKDAIAQGVFMSPKDRGTNAVIPAFDIADNMTLPFLQGM 356

Query: 371 SRGGMLDHAAESSVAQDYVKKLRIRSGSVAQAAGELSGGNQQKVVIARWLYRDCPIMLFD 430
           S G  L    +   AQ  V +L I   SV    G LSGGNQQKV+IARWL     ++L D
Sbjct: 357 SVGSFLKSRQQRGTAQGMVDRLGIVCQSVRDGIGTLSGGNQQKVMIARWLLEPAQVLLLD 416

Query: 431 EPTRGIDIGAKSDIYRLFAELAAQGKGLLVVSSDLRELMQICDRIAVMSAGRI 483
           EP +G+DIGA+ DI         QG+  LV  +++ E ++I DRI VMS G I
Sbjct: 417 EPFQGVDIGARRDIGH-HIRATTQGRATLVFLAEIDEALEIADRIVVMSEGAI 468

 Score = 77.0 bits (188), Expect = 1e-18
 Identities = 66/260 (25%), Positives = 114/260 (43%), Gaps = 22/260 (8%)

Query: 269 RGLGRAPVVHPASLALHAGEVLGIAGLIGSGRTELLRLIFGADRAEQGEIFIGDSQ-EPA 327
           +  G+  V+    L L  G V  + G  G+G++ L+++I G  RA+ G + +  +  +P
Sbjct: 6   KSFGKNNVLRGIDLTLDPGSVTVLMGANGAGKSTLVKVICGQHRADGGTMRLATNAFDP- 64

Query: 328 RIRSPKDAVKAGIAMVTEDRKGQGLLLPQAISVNTSLANLGSVSRG------GMLDHAAE 381
                 DA++ G+  V +     G++    ++ N  L  L   S G       +   AA+
Sbjct: 65  --EDAADAIRQGVVTVHQSID-DGVIPDLDVANNLMLDRLAEHSHGLFVRERHLRTEAAK 121

Query: 382 SSVAQDYVKKLRIRSGSVAQAAGELSGGNQQKVVIARWLYRDCPIMLFDEPTRGIDIGAK 441
            + A      LR R         +LS  ++Q + IAR + R   +++ DEPT  +
Sbjct: 122 VAAAMGIEVNLRAR-------VSDLSVADRQMIAIARAMARAPKVLILDEPTSSLSATEA 174

Query: 442 SDIYRLFAELAAQGKGLLVVSSDLRELMQICDRIAVMSAGRIADTFSRDDWSQERILAAA 501
             ++ L   L AQG  +L +S  + ++ +I DRI VM  G I+  F  +       L AA
Sbjct: 175 DRLFELIDRLRAQGVAILYISHRMSDIRRIADRIVVMRDGMISGVFETEPLD----LEAA 230

Query: 502 FSGYVGRQEAAAAAHVAGNT 521
            +  +G +     A V   T
Sbjct: 231 VTAMLGHRMTEVDAEVRQGT 250

 Score = 75.1 bits (183), Expect = 6e-18
 Identities = 60/203 (29%), Positives = 100/203 (49%), Gaps = 6/203 (2%)

Query: 31  IDLDLRPGQVLALTGENGAGKSTLSKIICGLVDASAGGMMLDGQPYAPASRTQAEGLGIR 90
           I L    G+V+AL G  G+GKS L++I+ G+     G + + G+ Y+P S   A   G+
Sbjct: 268 ITLTAHDGEVVALVGLLGSGKSRLAEILFGIARPIRGSIRIKGKDYSPRSVKDAIAQGVF 327

Query: 91  MVMQELN---LIPTLSIAENLFLEKLPRRF--GWIDRKKLAEAARAQMEVVGLGELDPWT 145
           M  ++     +IP   IA+N+ L  L       ++  ++    A+  ++ +G+
Sbjct: 328 MSPKDRGTNAVIPAFDIADNMTLPFLQGMSVGSFLKSRQQRGTAQGMVDRLGIVCQSVRD 387

Query: 146 PVGDLGLGHQQMVEIARNLIGSCRCLILDEPTAMLTNREVELLFSRIERLRAEGVAIIYI 205
            +G L  G+QQ V IAR L+   + L+LDEP   +       +   I R   +G A +
Sbjct: 388 GIGTLSGGNQQKVMIARWLLEPAQVLLLDEPFQGVDIGARRDIGHHI-RATTQGRATLVF 446

Query: 206 SHRLEELKRIADRIVVLRDGKLV 228
              ++E   IADRIVV+ +G +V
Sbjct: 447 LAEIDEALEIADRIVVMSEGAIV 469

Lambda     K      H
   0.320    0.137    0.390

Gapped
Lambda     K      H
   0.267   0.0410    0.140

Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 643
Number of extensions: 28
Number of successful extensions: 10
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 3
Number of HSP's successfully gapped: 3
Length of query: 522
Length of database: 498
Length adjustment: 34
Effective length of query: 488
Effective length of database: 464
Effective search space: 226432
Effective search space used: 226432
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.8 bits)
S2: 52 (24.6 bits)
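The statistics in the report can be cross-checked by hand. The sketch below is not part of GapMind or BLAST. It recomputes the effective search space from the reported lengths and converts raw scores to bit scores and E-values with the standard Karlin-Altschul relations, using the gapped Lambda and K printed in the footer. The function names are illustrative, and because the printed parameters are rounded, the results should agree with the report only approximately.

```python
import math

# Values copied from the BLAST report (gapped parameters).
LAMBDA = 0.267
K = 0.0410
QUERY_LEN, SUBJECT_LEN, LENGTH_ADJUSTMENT = 522, 498, 34


def effective_search_space(query_len, subject_len, adjustment):
    """Product of the two lengths after subtracting the length adjustment."""
    return (query_len - adjustment) * (subject_len - adjustment)


def bit_score(raw_score, lam=LAMBDA, k=K):
    """Karlin-Altschul normalization: S' = (lambda * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def expect_value(raw_score, search_space, lam=LAMBDA, k=K):
    """E = search_space * 2 ** (-S')."""
    return search_space * 2.0 ** (-bit_score(raw_score, lam, k))


space = effective_search_space(QUERY_LEN, SUBJECT_LEN, LENGTH_ADJUSTMENT)
print("effective search space:", space)
for raw in (730, 188, 183):  # raw scores of the three HSPs in the report
    print(raw, round(bit_score(raw), 1), f"{expect_value(raw, space):.0e}")
```

With these inputs the search space evaluates to 488 × 464 = 226432, which matches the value in the report.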
This GapMind analysis is from Sep 17 2021. The underlying query database was built on Sep 17 2021.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other blast hits with at least 50% coverage are "low confidence."
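As a rough illustration of the coverage test, the sketch below computes coverage and percent identity for the top-scoring alignment on this page. It assumes that coverage means the aligned span of the characterized protein (the query here) divided by its length. That definition, the `Hsp` class, and the function names are assumptions for illustration and are not taken from GapMind's code. Only the 50% low-confidence threshold comes from the text above.

```python
from dataclasses import dataclass


@dataclass
class Hsp:
    query_start: int  # first aligned query position (1-based)
    query_end: int    # last aligned query position (inclusive)
    identities: int   # number of identical aligned positions
    align_len: int    # alignment length, including gap columns


def query_coverage(hsp: Hsp, query_len: int) -> float:
    """Fraction of the query spanned by the alignment (assumed definition)."""
    return (hsp.query_end - hsp.query_start + 1) / query_len


def percent_identity(hsp: Hsp) -> float:
    """Identities as a percentage of the alignment length."""
    return 100.0 * hsp.identities / hsp.align_len


def passes_low_confidence_coverage(hsp: Hsp, query_len: int,
                                   min_coverage: float = 0.5) -> bool:
    """True if the hit reaches the 50% coverage threshold stated above."""
    return query_coverage(hsp, query_len) >= min_coverage


# Top-scoring alignment: query 16-483 of 522 letters, 187/473 identities.
best = Hsp(query_start=16, query_end=483, identities=187, align_len=473)
print(f"coverage = {query_coverage(best, 522):.1%}")
print(f"identity = {percent_identity(best):.1f}%")
print("meets 50% coverage:", passes_low_confidence_coverage(best, 522))
```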
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
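For readers unfamiliar with the term, a six-frame translation translates a DNA sequence in all three reading frames on each strand, so that proteins missed by gene prediction can still be found. The sketch below is a minimal illustration using the standard genetic code. It is not the method used by Curated BLAST, and the function names and the example sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna: str) -> str:
    """Translate complete codons; unknown codons become 'X', stops '*'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translation(dna: str) -> dict:
    """Map frame labels (+1..+3, -1..-3) to their translations."""
    dna = dna.upper()
    rc = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


# Hypothetical example sequence, not taken from the genome analyzed here.
for frame, protein in six_frame_translation("ATGGCCTAA").items():
    print(frame, protein)
```

Keeping stop codons as `*` rather than splitting on them leaves each frame the same shape as the input, which makes it easier to map a hit back to genome coordinates.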
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory