Align phenylpyruvate ferredoxin oxidoreductase (EC 1.2.7.8) (characterized)
to candidate RR42_RS10350 RR42_RS10350 indolepyruvate ferredoxin oxidoreductase
Query= reanno::Marino:GFF880 (1172 letters)

>FitnessBrowser__Cup4G11:RR42_RS10350
Length = 1162

Score = 993 bits (2566), Expect = 0.0
Identities = 545/1168 (46%), Positives = 713/1168 (61%), Gaps = 36/1168 (3%)

Query: 9    DDYKLEDRYLRESGRVFLTGTQALVRIPLMQAALDRKQGLNTAGLVSGYRGSPLGAVDQA 68
            D Y+LEDRY RE+G VFLTGTQALVRI + QA DR GL T GLVSGYRGSPLG DQ
Sbjct: 3    DTYRLEDRYRREAGSVFLTGTQALVRILVEQARADRIAGLKTGGLVSGYRGSPLGGFDQE 62

Query: 69   LWQAKDLLDENRIDFVPAINEDLAATILLGTQQVETDEDRQVEGVFGLWYGKGPGVDRAG 128
            LW+ + LL E I F P +NEDL AT+L G QQ++ ++VEGVF +WYGKGPGVDR G
Sbjct: 63   LWRQRSLLAEYEIRFEPGLNEDLGATMLWGAQQIDAFPGKRVEGVFSMWYGKGPGVDRTG 122

Query: 129  DALKHGTTYGSSPHGGVLVVAGDDHGCVSSSMPHQSDVAFMSFFMPTINPANIAEYLEFG 188
            D ++ G+S HGGVL +AGDDH SS PHQ+D F MP + PA++ EY+EFG
Sbjct: 123  DVFRNANVLGTSRHGGVLAIAGDDHAAQSSMFPHQTDHVFEGAMMPVLFPASVEEYVEFG 182

Query: 189  LWGYALSRYSGCWVGFKAISETVESAASVEIPPAPD-----FVTPDDFTAPESGLHY--- 240
            L+GYALSR+SG WV FKAI+ETVES S+ I A F P D PE G Y
Sbjct: 183  LFGYALSRFSGLWVAFKAITETVESGRSMLIGGAGSARGARFSLPGDIDIPERGFGYDTG 242

Query: 241  -RWPDLPGPQLETRIEHKLAAVQAFARANRIDRCLFDNKEARFGIVTTGKGHLDLLEALD 299
            +WP +E +L A QAFARAN IDR + ++AR GIVT GK H DLL AL
Sbjct: 243  VKWPGQRAELERRLLEERLPAAQAFARANPIDRTIVRPRDARIGIVTVGKAHGDLLAALA 302

Query: 300  LLGIDEDKARDMGLDIYKVGMVWPLERRGILDFVHGKEEVLVIEEKRGIIESQIKEYMSE 359
            LG+DE + ++G+ +YK+GM WP+E G+ F G +LV+EEKR +E QI+E +
Sbjct: 303  RLGLDEPRLAELGIGLYKIGMTWPIEGEGVRRFASGMRALLVVEEKRSFVERQIQETLFN 362

Query: 360  PDRPGEVLITGKQDELGRPLIPYVGELSP----KLVAGFL------AARLGRFFEVDFSE 409
            P + GK+ PL+P E +P K + FL A G
Sbjct: 363  VAAPQRPEVFGKRGPNDAPLLPATLEFAPDQLQKALRQFLAYTGVHALPPGNTGSAAALP 422

Query: 410  RMAEISAMTTAQDPGGVKRMPYFCSGCPHNTSTKVPEGSKALAGIGCHFMASWMGRNTES 469
            R + ++ P + R P+FC+GCPHN+STK+P+GS A AGIGCH MA G NT +
Sbjct: 423  RQPRVIPLSAQLQPDVLTRKPFFCAGCPHNSSTKLPDGSYAAAGIGCHIMALGQGDNTAT 482

Query: 470  LIQMGGEGVNWIGKSRYTGNPHVFQNLGEGTYFHSGSMAIRQAVAAGINITYKILFNDAV 529
            QMGGEGV W+G S ++ PH+F NLG+GTY HSGS+AIRQAVAAG +TYKILFNDAV
Sbjct: 483  FCQMGGEGVQWVGLSSFSDLPHLFVNLGDGTYQHSGSLAIRQAVAAGTAVTYKILFNDAV 542

Query: 530  AMTGGQPVDGQITVDRIAQQMAAEGVNRVVVLSDEPEKYDGHHDLFPKDVTFHDRSELDQ 589
            AMTGGQP +G +TV R+ Q+ AEGV +VV++SD P++Y G + P V R LD
Sbjct: 543  AMTGGQPTEGGLTVPRMVAQLIAEGVGKVVLVSDHPQRYWGAKSI-PASVEIAHRDALDD 601

Query: 590  VQRELRDIPGCTVLIYDQTCAAEKRRRRKRKQFPDPAKRAFINHHVCEGCGDCSVQSNCL 649
            VQR LR G + ++YDQTCAAEKRRRRKR DP +R IN VCEGCGDCSVQSNC+
Sbjct: 602  VQRRLRAYRGVSAIVYDQTCAAEKRRRRKRGTLADPVRRVVINPSVCEGCGDCSVQSNCI 661

Query: 650  SVVPRKTELGRKRKIDQSSCNKDFSCVNGFCPSFVTIEGGQLRKSRGVDTGSVLTRKLAD 709
            ++ P +T LGRKR ++QSSCNKD SC+ GFCPSFVTIEG Q +++ + + A
Sbjct: 662  AIEPLETPLGRKRAVNQSSCNKDMSCLKGFCPSFVTIEGLQPKRANQRRIQQMEAQWRAS 721

Query: 710  IPAPKLPEMTGSY----DLLVGGVGGTGVVTVGQLITMAAHLESRGASVLDFMGFAQKGG 765
            +P P P G+ +LV GVGGTGVVTVG ++ MAAHLE +GA+ LDF G AQK G
Sbjct: 722  LPPPAGPAALGALLEHARILVTGVGGTGVVTVGAILAMAAHLEGKGAATLDFTGLAQKNG 781

Query: 766  TVLSYVRMAPSPDKLHQVRISNGQADAVIACDLVVASSQKALSVLRPNHTRIVANEAELP 825
            V+S+V++A +++ RI AD +I CD VVA+S L+ LR TR V N A P
Sbjct: 782  AVVSHVQLADRRERIVTARIEACSADVMIGCDAVVAASPDVLARLRKCGTRAVVNSAVAP 841

Query: 826  TADYVLFRDADMKADKRLGLLKNAVGEDHFDQLDANGIAEKLMGDTVFSNVMMLGFAWQK 885
            TAD+V D + + +++ VG + D A L GD + +N+M++G A+Q+
Sbjct: 842  TADFVANGDLPISREIHQAAIESVVGAGQAEFFDCTAAAMTLFGDAIATNMMLVGHAYQR 901

Query: 886  GLLPLSEAALMKAIELNGVAIDRNKEAFGWGRLSAVDPSAV--TDLLDDSNAQVVEVKPE 943
            G +PLSE A+ +AIELNG A++ N+ AF WGR+ A +P+A+ T + AQ E
Sbjct: 902  GWIPLSEMAIARAIELNGAAVELNRRAFLWGRILACEPNALRRTSAAEAQAAQPFE---- 957

Query: 944  PTLDELINTRHKHLVNYQNQRWADQYRDAVAGVRKAEESLGETNLLLTRAVAQQLYRFMA 1003
            L + R + L YQN +A++Y VA V AE+ + LL AVA+ YR +A
Sbjct: 958  --LARFVAERKRDLSAYQNAAYAERYGRMVAAVAGAEQRIAGEAGLLAEAVARSYYRVLA 1015

Query: 1004 YKDEYEVARLFAETDFMKEVNETFEGDFKVHFHLAPPLLSGETDAQGRPKKRRF-GPWMF 1062
            YKDEYEVARL ++ F + + TF+G K FH+APP L+ GR K G M
Sbjct: 1016 YKDEYEVARLHSDPAFARSLESTFDGHGKKTFHMAPPWLTRVDRNTGRRNKIVLSGTVMS 1075

Query: 1063 RAFRLLAKLRGLRGTAIDPFRYSADRKLDRAMLKDYQSLVDRIGRELNASNYETFLQLAE 1122
            RLL + LRGT DPF +DR+++R M+ + + V + R L+ + L
Sbjct: 1076 PLLRLLRHGKILRGTPFDPFGRQSDRRIERRMIVECEDDVQLVLRTLSERTLASAAALVG 1135

Query: 1123 ELPADVRGYGPVREQAAESIREKQTQLIK 1150
            A +RG+G ++E+ + R Q L K
Sbjct: 1136 GAYAQIRGFGVIKER---NYRSAQADLQK 1160

Lambda K H
0.319 0.136 0.405

Gapped
Lambda K H
0.267 0.0410 0.140

Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 3065
Number of extensions: 120
Number of successful extensions: 7
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 1172
Length of database: 1162
Length adjustment: 47
Effective length of query: 1125
Effective length of database: 1115
Effective search space: 1254375
Effective search space used: 1254375
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 58 (26.9 bits)
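The summary statistics reported with the alignment above are related by the usual Karlin-Altschul formulas, and a short check can make that relationship concrete. The sketch below is not part of GapMind or BLAST; the function names are hypothetical, and it assumes the gapped Lambda and K values are the ones used to convert the raw score (2566) into bits. The percentages in the report appear consistent with truncation rather than rounding, which is also an assumption here.

```python
import math

# Values copied from the BLAST report above.
RAW_SCORE = 2566
GAPPED_LAMBDA = 0.267
GAPPED_K = 0.0410
QUERY_LEN, SUBJECT_LEN = 1172, 1162
LENGTH_ADJUSTMENT = 47
IDENTITIES, POSITIVES, GAPS, ALIGNED_COLUMNS = 545, 713, 36, 1168


def bit_score(raw_score, lam, k):
    """Normalized score in bits: S' = (lambda * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def effective_search_space(query_len, subject_len, adjustment):
    """Product of the two lengths after subtracting the length adjustment."""
    return (query_len - adjustment) * (subject_len - adjustment)


def truncated_percent(part, whole):
    """Integer percentage, truncated (assumed to match the report's display)."""
    return 100 * part // whole


bits = bit_score(RAW_SCORE, GAPPED_LAMBDA, GAPPED_K)
space = effective_search_space(QUERY_LEN, SUBJECT_LEN, LENGTH_ADJUSTMENT)
# Expected number of chance hits at this score: E = search space * 2^(-bits).
evalue = space * 2.0 ** (-bits)

print(f"bit score ~ {bits:.1f}")
print(f"effective search space = {space}")
print(f"E-value ~ {evalue:.3g}")
print(f"identity = {truncated_percent(IDENTITIES, ALIGNED_COLUMNS)}%")
```

With these inputs the bit score comes out close to the reported 993 bits and the search space matches the reported 1254375; the E-value is so small that the report displays it as 0.0.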
This GapMind analysis is from Sep 17 2021. The underlying query database was built on Sep 17 2021.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMER with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other blast hits with at least 50% coverage are "low confidence."
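As an illustration of the 50% coverage floor, the sketch below computes how much of each sequence the alignment on this page spans, using the first and last coordinates of the Query and Sbjct lines. The text does not say which sequence's coverage GapMind measures, so both are computed; the function and constant names are hypothetical, and this is not GapMind's own code.

```python
def coverage(start, end, length):
    """Fraction of a sequence spanned by an alignment (1-based, inclusive)."""
    return (end - start + 1) / length


# Floor stated in the text for "low confidence" blast hits.
LOW_CONFIDENCE_MIN_COVERAGE = 0.50

# Coordinates and lengths taken from the alignment on this page.
query_cov = coverage(9, 1150, 1172)    # characterized protein (query)
subject_cov = coverage(3, 1160, 1162)  # candidate RR42_RS10350 (subject)

# Assumption: a hit passes if the relevant coverage reaches the floor.
meets_floor = min(query_cov, subject_cov) >= LOW_CONFIDENCE_MIN_COVERAGE

print(f"query coverage   = {query_cov:.1%}")
print(f"subject coverage = {subject_cov:.1%}")
print(f"meets 50% floor  = {meets_floor}")
```

Passing this floor says nothing about whether the hit also satisfies the stricter high- or medium-confidence criteria, which are not reproduced here.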
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
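For readers unfamiliar with the term, a six-frame translation translates a DNA sequence in all three reading frames on both strands, so that proteins missed by gene prediction can still be found. The following is a minimal sketch of the idea, assuming the standard genetic code and plain A/C/G/T input; it is not the code used by GapMind or Curated BLAST, and the function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an uppercase A/C/G/T string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons only; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")
        for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translation(dna):
    """Frames +1, +2, +3 on the given strand, then -1, -2, -3 on the other."""
    dna = dna.upper()
    strands = (dna, reverse_complement(dna))
    return [translate(s[offset:]) for s in strands for offset in range(3)]


frames = six_frame_translation("ATGGCC")
print(frames)
```

Stop codons appear as `*` in the output, so open reading frames can be recovered by splitting each frame on that character.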
For more information, see:
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory