Align phenylpyruvate ferredoxin oxidoreductase (EC 1.2.7.8) (characterized)
to candidate WP_041097026.1 SUTH_RS03175 indolepyruvate ferredoxin oxidoreductase family protein
Query= reanno::Marino:GFF880
         (1172 letters)

>NCBI__GCF_000828635.1:WP_041097026.1
          Length = 1155

 Score = 1124 bits (2908), Expect = 0.0
 Identities = 582/1146 (50%), Positives = 770/1146 (67%), Gaps = 17/1146 (1%)

Query: 13   LEDRYLRESGRVFLTGTQALVRIPLMQAALDRKQGLNTAGLVSGYRGSPLGAVDQALWQA 72
            LED+Y   SGRVFLTGTQALVR+ L+Q   D   GLNTAG VSGYRGSPLG +DQALW+A
Sbjct: 10   LEDKYALASGRVFLTGTQALVRLLLLQRQRDALAGLNTAGYVSGYRGSPLGGLDQALWKA 69

Query: 73   KDLLDENRIDFVPAINEDLAATILLGTQQVETDEDRQVEGVFGLWYGKGPGVDRAGDALK 132
            +  L ++ + F P +NEDLAAT L GTQQV      + +GVFG+WYGKGPGVDR GD  K
Sbjct: 70   RPHLAQSHVVFQPGVNEDLAATALWGTQQVNLSPGAKHDGVFGMWYGKGPGVDRCGDVFK 129

Query: 133  HGTTYGSSPHGGVLVVAGDDHGCVSSSMPHQSDVAFMSFFMPTINPANIAEYLEFGLWGY 192
            H  + G+  HGG+L VAGDDH   SS++ HQS+ AF +  MP + PA + +YL+ GL G+
Sbjct: 130  HANSAGTWKHGGILAVAGDDHAARSSTVAHQSEHAFKAAMMPVLVPAGVQDYLDLGLHGW 189

Query: 193  ALSRYSGCWVGFKAISETVESAASVEI-PPAPDFVTPDDFTAPESGLHYRWPDLPGPQLE 251
            A+SRYSGCWVGFKA+++TVES+ASV+I P     V PDD+  P  GL+ RWPD    Q
Sbjct: 190  AMSRYSGCWVGFKAVADTVESSASVDISPDRVRIVLPDDYALPAGGLNIRWPDDRLLQEA 249

Query: 252  TRIEHKLAAVQAFARANRIDRCLFDNKEARFGIVTTGKGHLDLLEALDLLGIDEDKARDM 311
              ++HKL A  A+ RAN++++ + D    R GI+TTGK +LD+ +A D LGID+  A ++
Sbjct: 250  RLLDHKLYAALAYCRANKLNQVVIDAPNPRLGIITTGKSYLDVRQAFDDLGIDDALAAEI 309

Query: 312  GLDIYKVGMVWPLERRGILDFVHGKEEVLVIEEKRGIIESQIKEYMSEPDRPGEVLITGK 371
            G+ +YKVGMVWPLE  G+  F  G EE+LV+EEKR +IE Q+KE +          + GK
Sbjct: 310  GIRLYKVGMVWPLESDGVRRFAEGLEEILVVEEKRQLIEYQLKEELYNWREDVRPRVIGK 369

Query: 372  QDEL-------GRPLIPYVGELSPKLVAGFLAARLGRFF-EVDFSERMAEISAMTTAQDP 423
             DE        GR L+P  GELSP  +A  +A R+GR F      ER+A I A   +  P
Sbjct: 370  FDEKGEWALPNGRWLLPASGELSPAQIARVIADRIGRHFTSPRIRERLAIIEAKERSAAP 429

Query: 424  G-GVKRMPYFCSGCPHNTSTKVPEGSKALAGIGCHFMASWMGRNTESLIQMGGEGVNWIG 482
               V R PYFC GCPHNTST VPEGS+ALAGIGCHFM  WM R+T +   MGGEG  W+G
Sbjct: 430  AIQVARTPYFCPGCPHNTSTCVPEGSRALAGIGCHFMVLWMNRSTATYSHMGGEGAAWMG 489

Query: 483  KSRYTGNPHVFQNLGEGTYFHSGSMAIRQAVAAGINITYKILFNDAVAMTGGQPVDGQIT 542
            ++ +T   HVF NLG+GTYFHSGS+AIR AVA+G+N TYKIL+NDAVAMTGGQPVDG +T
Sbjct: 490  QAPFTEQRHVFVNLGDGTYFHSGSLAIRAAVASGVNATYKILYNDAVAMTGGQPVDGNLT 549

Query: 543  VDRIAQQMAAEGVNRVVVLSDEPEKYDGHHDLFPKDVTFHDRSELDQVQRELRDIPGCTV 602
            V +IA Q+ AEGV+ VVV++D   +  GH DL P  V    R ELD +QRE+R+ PG +
Sbjct: 550  VPQIAHQLHAEGVHHVVVVTDGTARAYGHPDL-PHGVPIRHRDELDAIQREMRECPGVSA 608

Query: 603  LIYDQTCAAEKRRRRKRKQFPDPAKRAFINHHVCEGCGDCSVQSNCLSVVPRKTELGRKR 662
            +IYDQTCAAEKRRRRKR +  DP +R FIN  VCEGCGDC VQSNCL+VVP +TE GRKR
Sbjct: 609  IIYDQTCAAEKRRRRKRGKMIDPPRRLFINEAVCEGCGDCGVQSNCLAVVPVETEFGRKR 668

Query: 663  KIDQSSCNKDFSCVNGFCPSFVTIEGGQLRKSRGVDTGSVLTRKLADIPAPKLPEMTGSY 722
             IDQS+CNKD+SC  GFCPSFV++ GG ++K RG+   +     +A  PAP L      Y
Sbjct: 669  AIDQSACNKDYSCEKGFCPSFVSVLGGGVKKGRGLAGSTNGGDFIAVPPAPTLASTADPY 728

Query: 723  DLLVGGVGGTGVVTVGQLITMAAHLESRGASVLDFMGFAQKGGTVLSYVRMAPSPDKLHQ 782
             +L+ GVGGTGVVT+G LI MAAH++ +G +VLD  G AQKGG V S+VR+   P+ +H
Sbjct: 729  GILITGVGGTGVVTIGALIGMAAHIDGKGVTVLDMTGLAQKGGAVFSHVRICDDPEAIHA 788

Query: 783  VRISNGQADAVIACDLVVASSQKALSVLRPNHTRIVANEAELPTADYVLFRDADMKADKR 842
            VR++ G+ADAVI  D++V +S  AL+ ++   TR+V N AE PTAD+    D      +
Sbjct: 789  VRVATGEADAVIGGDVIVTASPDALTRMQSGRTRVVVNCAETPTADFTRNPDWQFPLARM 848

Query: 843  LGLLKNAVGEDHFDQLDANGIAEKLMGDTVFSNVMMLGFAWQKGLLPLSEAALMKAIELN 902
              ++   VG      +DA+ +A +L+GD++ SN+ +LG+AWQ+GL+P+S  A+ +AIELN
Sbjct: 849  QAVVGETVGAGAAHFVDASDLAVRLLGDSIASNLFLLGYAWQQGLVPVSWDAIDRAIELN 908

Query: 903  GVAIDRNKEAFGWGRLSAVDPSAVTDLLDDSNAQVVEVKPEPTLDELINTRHKHLVNYQN 962
            G A+  ++ AF WGR +A DP+ V           + V P PTLDELI  R + L  YQ+
Sbjct: 909  GTAVPLSRAAFLWGRRAAHDPAGVAAYARPK----IAVPPAPTLDELIAKRVRFLTEYQD 964

Query: 963  QRWADQYRDAVAGVRKAEESLGETNLLLTRAVAQQLYRFMAYKDEYEVARLFAETDFMKE 1022
              +A++YR  V  +R AE  +  +   LT  VA  L++ MA KDEYEVARL+AETDF+++
Sbjct: 965  AAYAERYRTQVEKIRTAEAFIDSSQ--LTETVAHNLFKLMAIKDEYEVARLYAETDFLQK 1022

Query: 1023 VNETFEGDFKVHFHLAPPLLSGETDAQGRPKKRRFGPWMFRAFRLLAKLRGLRGTAIDPF 1082
            + E FEGD+ + FHLAPPLL+      G+ KK  FGPWM   F+ LAK R  RG+  D F
Sbjct: 1023 IGERFEGDYTLQFHLAPPLLARPDPKTGKVKKLAFGPWMLTGFKWLAKARRYRGSRWDVF 1082

Query: 1083 RYSADRKLDRAMLKDYQSLVDRIGRELNASNYETFLQLAELPADVRGYGPVREQAAESIR 1142
              SA+R+L+R++L DY++ + R+  +L+ +     + LA LP  +RG+G V+ +  ++
Sbjct: 1083 GRSAERQLERSLLADYEADLARMAGKLDRTTLGDAIALANLPEKIRGFGHVKRRNIDAAM 1142

Query: 1143 EKQTQL 1148
             ++  L
Sbjct: 1143 PERDAL 1148

Lambda     K      H
   0.319    0.136    0.405

Gapped
Lambda     K      H
   0.267   0.0410    0.140

Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 3215
Number of extensions: 141
Number of successful extensions: 8
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 1172
Length of database: 1155
Length adjustment: 47
Effective length of query: 1125
Effective length of database: 1108
Effective search space: 1246500
Effective search space used: 1246500
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 58 (26.9 bits)
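As a rough cross-check of the summary numbers in the report above, the following Python sketch recomputes percent identity, percent positives, query coverage, the effective search space, and the bit score from the values printed in the alignment. The function and variable names are illustrative and are not part of GapMind or BLAST. Coverage is taken here as the aligned query span divided by the query length, which is an assumption and may differ from how GapMind computes it. The bit score uses the standard Karlin-Altschul normalization with the gapped Lambda and K, so it should land close to the reported 1124 bits; any small difference would come from the rounded Lambda and K shown in the report.

```python
import math

# Values copied from the BLAST report above.
QUERY_LEN, SUBJECT_LEN = 1172, 1155
QUERY_START, QUERY_END = 13, 1148
IDENTITIES, POSITIVES, ALIGN_LEN = 582, 770, 1146
RAW_SCORE = 2908
GAPPED_LAMBDA, GAPPED_K = 0.267, 0.0410
LENGTH_ADJUSTMENT = 47


def percent(part, whole):
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


def query_coverage(start, end, length):
    """Fraction of the query spanned by the alignment (inclusive coordinates).

    Assumption: coverage = aligned query span / query length.
    """
    return (end - start + 1) / length


def bit_score(raw, lam, k):
    """Karlin-Altschul normalization: S' = (lambda * S - ln K) / ln 2."""
    return (lam * raw - math.log(k)) / math.log(2)


def effective_search_space(query_len, db_len, adjustment):
    """Product of the two lengths after subtracting the length adjustment."""
    return (query_len - adjustment) * (db_len - adjustment)


identity_pct = percent(IDENTITIES, ALIGN_LEN)
positive_pct = percent(POSITIVES, ALIGN_LEN)
coverage = query_coverage(QUERY_START, QUERY_END, QUERY_LEN)
bits = bit_score(RAW_SCORE, GAPPED_LAMBDA, GAPPED_K)
space = effective_search_space(QUERY_LEN, SUBJECT_LEN, LENGTH_ADJUSTMENT)

print(f"identity:  {identity_pct:.1f}%")
print(f"positives: {positive_pct:.1f}%")
print(f"query coverage: {coverage:.1%}")
print(f"bit score: {bits:.1f}")
print(f"effective search space: {space}")
```

BLAST prints the identity and positive percentages as whole numbers, so the values computed here carry more decimal places than the report shows.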
This GapMind analysis is from Apr 09 2024. The underlying query database was built on Sep 17 2021.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other blast hits with at least 50% coverage are "low confidence."
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory