Align phenylpyruvate ferredoxin oxidoreductase (EC 1.2.7.8) (characterized)
to candidate PfGW456L13_3456 Indolepyruvate ferredoxin oxidoreductase, alpha and beta subunits
Query= reanno::Marino:GFF880
         (1172 letters)

>FitnessBrowser__pseudo13_GW456_L13:PfGW456L13_3456
          Length = 1187

 Score = 1087 bits (2811), Expect = 0.0
 Identities = 563/1166 (48%), Positives = 765/1166 (65%), Gaps = 28/1166 (2%)

Query: 13   LEDRYLRESGRVFLTGTQALVRIPLMQAALDRKQGLNTAGLVSGYRGSPLGAVDQALWQA 72
            L+D+Y++ SG+V LTG QALVR+PLMQ D GLNTAG +SGYRGSPLG DQALW+A
Sbjct: 7    LDDKYIQHSGKVLLTGIQALVRLPLMQRQRDLANGLNTAGFISGYRGSPLGGFDQALWKA 66

Query: 73   KDLLDENRIDFVPAINEDLAATILLGTQQVETDEDRQVEGVFGLWYGKGPGVDRAGDALK 132
            +D L E+ F P +NEDLAAT + GTQQV E +GVFG+WYGKGPGVDR GD +
Sbjct: 67   RDYLKEHHTVFHPGMNEDLAATSIWGTQQVNIFEGATYDGVFGMWYGKGPGVDRCGDVFR 126

Query: 133  HGTTYGSSPHGGVLVVAGDDHGCVSSSMPHQSDVAFMSFFMPTINPANIAEYLEFGLWGY 192
            H G+S GGVL +AGDDHG SSS+PHQ++ F + MP + P+ + EYL++G+ G+
Sbjct: 127  HANAAGTSQFGGVLAIAGDDHGARSSSLPHQTEHIFKAVMMPVLAPSGVQEYLDYGMHGW 186

Query: 193  ALSRYSGCWVGFKAISETVESAASVEIPPAPDFVTPDDFTAPESGLHYRWPDLPGPQLET 252
            A+SRYSGCWV KA+++TVESAA V+I D PE GL+ RWPD P Q +
Sbjct: 187  AMSRYSGCWVALKAVADTVESAAVVDIDIHRVQPIIPDIPLPEGGLNIRWPDPPLAQEQR 246

Query: 253  RIEHKLAAVQAFARANRIDRCLFDNKEARFGIVTTGKGHLDLLEALDLLGIDEDKARDMG 312
            +EHKL A A+AR NR+DR + D+ +AR GI+T+GK +LD+ +AL +LGIDE A+ +G
Sbjct: 247  LLEHKLYAALAYARVNRLDRIVMDSPKARIGIITSGKSYLDVCQALKILGIDETLAQQIG 306

Query: 313  LDIYKVGMVWPLERRGILDFVHGKEEVLVIEEKRGIIESQIKEYMSEPDRPGEVLITGKQ 372
            L +YKVGMVWPLE G+ F G EE++V+EEKR +IE Q+KE + I GK
Sbjct: 307  LRVYKVGMVWPLEAEGVRQFAEGLEEIVVVEEKRHMIEYQLKEELYNWREDVRPRIVGKF 366

Query: 373  DELGRPLIPYVG-------ELSPKLVAGFLAARLGRFFE---VDFSERMAEISAMTTAQD 422
            D+ G +P+ G +L+P ++A LA R+ + + S + + + Q
Sbjct: 367  DDKGEWSLPHTGWLLPATNDLTPAMIARALAKRILHLHQNGPLQVSLAVLDAQLASKGQF 426

Query: 423  PGGVKRMPYFCSGCPHNTSTKVPEGSKALAGIGCHFMASWMGRNTESLIQMGGEGVNWIG 482
            ++R+P++CSGCPHNTSTKVP+GS+ALAGIGCH+MA+W+ T++ QMGGEGV WIG
Sbjct: 427  SNLMERVPHYCSGCPHNTSTKVPQGSRALAGIGCHYMAAWIYPQTQTFSQMGGEGVAWIG 486

Query: 483  KSRYTGNPHVFQNLGEGTYFHSGSMAIRQAVAAGINITYKILFNDAVAMTGGQPVDGQIT 542
            ++ +T HVF NLG+GTYFHSG +AIR A+AA + ITYKIL+NDAVAMTGGQPVDG ++
Sbjct: 487  QAPFTKTRHVFANLGDGTYFHSGILAIRAAIAAKVQITYKILYNDAVAMTGGQPVDGSLS 546

Query: 543  VDRIAQQMAAEGVNRVVVLSDEPEKYDGHHDLFPKDVTFHDRSELDQVQRELRDIPGCTV 602
            V +I++Q+AAEGV R+VV+SD+ EKY DL V R ++D+VQ +LR G +
Sbjct: 547  VAQISRQLAAEGVQRIVVVSDDVEKYQHIRDL-ADGVPVLRRDKMDEVQEQLRQFQGVSA 605

Query: 603  LIYDQTCAAEKRRRRKRKQFPDPAKRAFINHHVCEGCGDCSVQSNCLSVVPRKTELGRKR 662
            +IYDQTCAAEKRRRRKR +FPDPA+R IN VCEGCGDCS +SNC+SVV +TE GRKR
Sbjct: 606  IIYDQTCAAEKRRRRKRGKFPDPARRVVINEAVCEGCGDCSSKSNCMSVVAVETEYGRKR 665

Query: 663  KIDQSSCNKDFSCVNGFCPSFVTIEGGQLRKSRGVDTGSVLTRKLADIPAPKLPEMTGSY 722
            +IDQSSCNKDF+C+NGFCPSFVT+EGG LRK + + + + D+P P + Y
Sbjct: 666  EIDQSSCNKDFTCLNGFCPSFVTVEGGTLRKPKALASAK---NDVWDLPTPATVALEEPY 722

Query: 723  DLLVGGVGGTGVVTVGQLITMAAHLESRGASVLDFMGFAQKGGTVLSYVRMAPSPDKLHQ 782
            +LV GVGGTGVVT+G L+ MAA +E +G LD G AQKGG V S++R+A ++L
Sbjct: 723  SILVTGVGGTGVVTIGALLGMAAFIEGKGTLNLDMAGMAQKGGAVWSHIRIAAHQEQLFA 782

Query: 783  VRISNGQADAVIACDLVVASSQKALSVLRPNHTRIVANEAE-------------LPTADY 829
            RI+ G+ ++ CDLVV+++ + LS LR T + N E T D
Sbjct: 783  PRIAEGETALLLGCDLVVSANTETLSKLRHGVTHALINSEETITSAFVRTFAQQAETGDL 842

Query: 830  VLFRDADMKADKRLGLLKNAVGEDHFDQLDANGIAEKLMGDTVFSNVMMLGFAWQKGLLP 889
            + D + + AVG +H D +DA+ IA LMGD++ +N MLG+A+QKG LP
Sbjct: 843  LKHPDPTFQTGNMSEQIAEAVGTEHADFIDASKIATALMGDSIATNTFMLGYAYQKGWLP 902

Query: 890  LSEAALMKAIELNGVAIDRNKEAFGWGRLSAVDPSAVTDLLDDSNAQVVEVKPEPTLDEL 949
            + +AAL++AIELNG A+ N AF WGR SA D V L+ Q + +L+E
Sbjct: 903  VGKAALLQAIELNGTAVPFNLSAFDWGRRSAHDLPRVLRKLEAGKVQSPDRLLSQSLEET 962

Query: 950  INTRHKHLVNYQNQRWADQYRDAVAGVRKAEESLGETNLLLTRAVAQQLYRFMAYKDEYE 1009
            + R + L YQN +A +YR V KAE +L L+ +VA+ ++ +A KDEYE
Sbjct: 963  LARRMEFLTAYQNYAYAQRYRYRVEQFIKAETALLGQPGKLSASVARYYFKVLAIKDEYE 1022

Query: 1010 VARLFAETDFMKEVNETFEGDFKVHFHLAPPLLSGETDAQGRPKKRRFGPWMFRAFRLLA 1069
            VARLF + F++++ FEGD+++ FHLAPPLL+ + + PKKR FGPWM + F+LLA
Sbjct: 1023 VARLFTDGQFLEKIQAGFEGDYRLRFHLAPPLLN-DNGSGREPKKRSFGPWMLQGFKLLA 1081

Query: 1070 KLRGLRGTAIDPFRYSADRKLDRAMLKDYQSLVDRIGRELNASNYETFLQLAELPADVRG 1129
            +L+ LR T +DPF + +RK++R L +Y+ ++D + L L ELP VRG
Sbjct: 1082 RLKFLRNTWLDPFGRTHERKVERNWLANYEQILDEVLAGLTTQKLGLAQDLVELPESVRG 1141

Query: 1130 YGPVREQAAESIREKQTQLIKALDTG 1155
            YGPV+E+ +++Q QL++ G
Sbjct: 1142 YGPVKERFLGHAQQRQAQLLEQWRNG 1167

Lambda     K      H
   0.319    0.136    0.405

Gapped
Lambda     K      H
   0.267   0.0410    0.140

Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 3030
Number of extensions: 152
Number of successful extensions: 9
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 1172
Length of database: 1187
Length adjustment: 47
Effective length of query: 1125
Effective length of database: 1140
Effective search space: 1282500
Effective search space used: 1282500
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 58 (26.9 bits)
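The summary numbers in the report above can be cross-checked against each other. The following is a minimal sketch, not part of GapMind: it assumes the usual Karlin-Altschul normalization for the bit score, S' = (lambda * S - ln K) / ln 2, applied with the gapped Lambda and K values, and it assumes the effective search space is the product of the two effective lengths. The function names are illustrative only.

```python
import math

# Values copied from the BLAST report above.
raw_score = 2811
gapped_lambda = 0.267
gapped_k = 0.0410
query_len, database_len = 1172, 1187
length_adjustment = 47
identities, positives, gaps, aligned_columns = 563, 765, 28, 1166


def bit_score(raw, lam, k):
    """Normalized bit score, assuming S' = (lambda * S - ln K) / ln 2."""
    return (lam * raw - math.log(k)) / math.log(2)


def effective_search_space(qlen, dblen, adjustment, n_seqs=1):
    """Product of effective query length and effective database length."""
    return (qlen - adjustment) * (dblen - n_seqs * adjustment)


def percent(count, total):
    """Integer percentage, truncated as the percentages in this report appear to be."""
    return 100 * count // total


bits = bit_score(raw_score, gapped_lambda, gapped_k)
space = effective_search_space(query_len, database_len, length_adjustment)

print(f"bit score ~ {bits:.1f}")
print(f"effective search space = {space}")
print(f"identities = {percent(identities, aligned_columns)}%")
print(f"positives  = {percent(positives, aligned_columns)}%")
print(f"gaps       = {percent(gaps, aligned_columns)}%")
```

Because Lambda and K are printed with only three significant digits, the recomputed bit score agrees with the reported 1087 bits only to within rounding.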
This GapMind analysis is from Sep 17 2021. The underlying query database was built on Sep 17 2021.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other blast hits with at least 50% coverage are "low confidence."
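As a rough illustration of the 50% coverage floor, the sketch below computes coverage from alignment coordinates, using the coordinates of the alignment on this page. It assumes coverage means the aligned fraction of a sequence's length; which sequence GapMind uses as the denominator is not stated here, so both are shown. The function and constant names are hypothetical.

```python
# Hypothetical constant: the 50% floor quoted for "low confidence" hits.
LOW_CONFIDENCE_MIN_COVERAGE = 0.50


def alignment_coverage(start, end, length):
    """Fraction of a sequence spanned by the alignment (1-based, inclusive coordinates)."""
    return (end - start + 1) / length


# Coordinates taken from the alignment on this page.
query_cov = alignment_coverage(13, 1155, 1172)    # characterized protein
subject_cov = alignment_coverage(7, 1167, 1187)   # candidate PfGW456L13_3456

print(f"query coverage   = {query_cov:.1%}")
print(f"subject coverage = {subject_cov:.1%}")
print("meets 50% floor:", min(query_cov, subject_cov) >= LOW_CONFIDENCE_MIN_COVERAGE)
```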
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
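For readers unfamiliar with the term, a six-frame translation translates a DNA sequence in all three reading frames on both strands, so that proteins missing from the gene predictions can still be found. The sketch below is a generic illustration in plain Python, not the code used by GapMind or Curated BLAST. It assumes the standard genetic code for amino acid assignments and marks codons containing ambiguous bases as X.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons from position 0; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translation(dna):
    """Return translations keyed by frame: +1, +2, +3, -1, -2, -3."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


frames = six_frame_translation("ATGGCCTAA")
for name, protein in frames.items():
    print(name, protein)
```

Stop codons appear as `*` in the output; a search tool would typically split each frame at the stops and compare the resulting open reading frames against the protein database.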
For more information, see the paper from 2019 on GapMind for amino acid biosynthesis, the paper from 2022 on GapMind for carbon sources, or view the source code.
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory