GapMind for catabolism of small carbon sources

Alignments for a candidate for iorAB in Sulfuritalea hydrogenivorans DSM 22779

Align phenylpyruvate ferredoxin oxidoreductase (EC 1.2.7.8) (characterized)
to candidate WP_041097026.1 SUTH_RS03175 indolepyruvate ferredoxin oxidoreductase family protein

Query= reanno::Marino:GFF880
         (1172 letters)



>NCBI__GCF_000828635.1:WP_041097026.1
          Length = 1155

 Score = 1124 bits (2908), Expect = 0.0
 Identities = 582/1146 (50%), Positives = 770/1146 (67%), Gaps = 17/1146 (1%)

Query: 13   LEDRYLRESGRVFLTGTQALVRIPLMQAALDRKQGLNTAGLVSGYRGSPLGAVDQALWQA 72
            LED+Y   SGRVFLTGTQALVR+ L+Q   D   GLNTAG VSGYRGSPLG +DQALW+A
Sbjct: 10   LEDKYALASGRVFLTGTQALVRLLLLQRQRDALAGLNTAGYVSGYRGSPLGGLDQALWKA 69

Query: 73   KDLLDENRIDFVPAINEDLAATILLGTQQVETDEDRQVEGVFGLWYGKGPGVDRAGDALK 132
            +  L ++ + F P +NEDLAAT L GTQQV      + +GVFG+WYGKGPGVDR GD  K
Sbjct: 70   RPHLAQSHVVFQPGVNEDLAATALWGTQQVNLSPGAKHDGVFGMWYGKGPGVDRCGDVFK 129

Query: 133  HGTTYGSSPHGGVLVVAGDDHGCVSSSMPHQSDVAFMSFFMPTINPANIAEYLEFGLWGY 192
            H  + G+  HGG+L VAGDDH   SS++ HQS+ AF +  MP + PA + +YL+ GL G+
Sbjct: 130  HANSAGTWKHGGILAVAGDDHAARSSTVAHQSEHAFKAAMMPVLVPAGVQDYLDLGLHGW 189

Query: 193  ALSRYSGCWVGFKAISETVESAASVEI-PPAPDFVTPDDFTAPESGLHYRWPDLPGPQLE 251
            A+SRYSGCWVGFKA+++TVES+ASV+I P     V PDD+  P  GL+ RWPD    Q  
Sbjct: 190  AMSRYSGCWVGFKAVADTVESSASVDISPDRVRIVLPDDYALPAGGLNIRWPDDRLLQEA 249

Query: 252  TRIEHKLAAVQAFARANRIDRCLFDNKEARFGIVTTGKGHLDLLEALDLLGIDEDKARDM 311
              ++HKL A  A+ RAN++++ + D    R GI+TTGK +LD+ +A D LGID+  A ++
Sbjct: 250  RLLDHKLYAALAYCRANKLNQVVIDAPNPRLGIITTGKSYLDVRQAFDDLGIDDALAAEI 309

Query: 312  GLDIYKVGMVWPLERRGILDFVHGKEEVLVIEEKRGIIESQIKEYMSEPDRPGEVLITGK 371
            G+ +YKVGMVWPLE  G+  F  G EE+LV+EEKR +IE Q+KE +          + GK
Sbjct: 310  GIRLYKVGMVWPLESDGVRRFAEGLEEILVVEEKRQLIEYQLKEELYNWREDVRPRVIGK 369

Query: 372  QDEL-------GRPLIPYVGELSPKLVAGFLAARLGRFF-EVDFSERMAEISAMTTAQDP 423
             DE        GR L+P  GELSP  +A  +A R+GR F      ER+A I A   +  P
Sbjct: 370  FDEKGEWALPNGRWLLPASGELSPAQIARVIADRIGRHFTSPRIRERLAIIEAKERSAAP 429

Query: 424  G-GVKRMPYFCSGCPHNTSTKVPEGSKALAGIGCHFMASWMGRNTESLIQMGGEGVNWIG 482
               V R PYFC GCPHNTST VPEGS+ALAGIGCHFM  WM R+T +   MGGEG  W+G
Sbjct: 430  AIQVARTPYFCPGCPHNTSTCVPEGSRALAGIGCHFMVLWMNRSTATYSHMGGEGAAWMG 489

Query: 483  KSRYTGNPHVFQNLGEGTYFHSGSMAIRQAVAAGINITYKILFNDAVAMTGGQPVDGQIT 542
            ++ +T   HVF NLG+GTYFHSGS+AIR AVA+G+N TYKIL+NDAVAMTGGQPVDG +T
Sbjct: 490  QAPFTEQRHVFVNLGDGTYFHSGSLAIRAAVASGVNATYKILYNDAVAMTGGQPVDGNLT 549

Query: 543  VDRIAQQMAAEGVNRVVVLSDEPEKYDGHHDLFPKDVTFHDRSELDQVQRELRDIPGCTV 602
            V +IA Q+ AEGV+ VVV++D   +  GH DL P  V    R ELD +QRE+R+ PG + 
Sbjct: 550  VPQIAHQLHAEGVHHVVVVTDGTARAYGHPDL-PHGVPIRHRDELDAIQREMRECPGVSA 608

Query: 603  LIYDQTCAAEKRRRRKRKQFPDPAKRAFINHHVCEGCGDCSVQSNCLSVVPRKTELGRKR 662
            +IYDQTCAAEKRRRRKR +  DP +R FIN  VCEGCGDC VQSNCL+VVP +TE GRKR
Sbjct: 609  IIYDQTCAAEKRRRRKRGKMIDPPRRLFINEAVCEGCGDCGVQSNCLAVVPVETEFGRKR 668

Query: 663  KIDQSSCNKDFSCVNGFCPSFVTIEGGQLRKSRGVDTGSVLTRKLADIPAPKLPEMTGSY 722
             IDQS+CNKD+SC  GFCPSFV++ GG ++K RG+   +     +A  PAP L      Y
Sbjct: 669  AIDQSACNKDYSCEKGFCPSFVSVLGGGVKKGRGLAGSTNGGDFIAVPPAPTLASTADPY 728

Query: 723  DLLVGGVGGTGVVTVGQLITMAAHLESRGASVLDFMGFAQKGGTVLSYVRMAPSPDKLHQ 782
             +L+ GVGGTGVVT+G LI MAAH++ +G +VLD  G AQKGG V S+VR+   P+ +H 
Sbjct: 729  GILITGVGGTGVVTIGALIGMAAHIDGKGVTVLDMTGLAQKGGAVFSHVRICDDPEAIHA 788

Query: 783  VRISNGQADAVIACDLVVASSQKALSVLRPNHTRIVANEAELPTADYVLFRDADMKADKR 842
            VR++ G+ADAVI  D++V +S  AL+ ++   TR+V N AE PTAD+    D      + 
Sbjct: 789  VRVATGEADAVIGGDVIVTASPDALTRMQSGRTRVVVNCAETPTADFTRNPDWQFPLARM 848

Query: 843  LGLLKNAVGEDHFDQLDANGIAEKLMGDTVFSNVMMLGFAWQKGLLPLSEAALMKAIELN 902
              ++   VG      +DA+ +A +L+GD++ SN+ +LG+AWQ+GL+P+S  A+ +AIELN
Sbjct: 849  QAVVGETVGAGAAHFVDASDLAVRLLGDSIASNLFLLGYAWQQGLVPVSWDAIDRAIELN 908

Query: 903  GVAIDRNKEAFGWGRLSAVDPSAVTDLLDDSNAQVVEVKPEPTLDELINTRHKHLVNYQN 962
            G A+  ++ AF WGR +A DP+ V           + V P PTLDELI  R + L  YQ+
Sbjct: 909  GTAVPLSRAAFLWGRRAAHDPAGVAAYARPK----IAVPPAPTLDELIAKRVRFLTEYQD 964

Query: 963  QRWADQYRDAVAGVRKAEESLGETNLLLTRAVAQQLYRFMAYKDEYEVARLFAETDFMKE 1022
              +A++YR  V  +R AE  +  +   LT  VA  L++ MA KDEYEVARL+AETDF+++
Sbjct: 965  AAYAERYRTQVEKIRTAEAFIDSSQ--LTETVAHNLFKLMAIKDEYEVARLYAETDFLQK 1022

Query: 1023 VNETFEGDFKVHFHLAPPLLSGETDAQGRPKKRRFGPWMFRAFRLLAKLRGLRGTAIDPF 1082
            + E FEGD+ + FHLAPPLL+      G+ KK  FGPWM   F+ LAK R  RG+  D F
Sbjct: 1023 IGERFEGDYTLQFHLAPPLLARPDPKTGKVKKLAFGPWMLTGFKWLAKARRYRGSRWDVF 1082

Query: 1083 RYSADRKLDRAMLKDYQSLVDRIGRELNASNYETFLQLAELPADVRGYGPVREQAAESIR 1142
              SA+R+L+R++L DY++ + R+  +L+ +     + LA LP  +RG+G V+ +  ++  
Sbjct: 1083 GRSAERQLERSLLADYEADLARMAGKLDRTTLGDAIALANLPEKIRGFGHVKRRNIDAAM 1142

Query: 1143 EKQTQL 1148
             ++  L
Sbjct: 1143 PERDAL 1148
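
The summary percentages in the alignment header can be rechecked from the counts and coordinates shown above. The following is a minimal sketch, not part of GapMind itself; the variable names are illustrative, and coverage is assumed here to mean the aligned span divided by the full sequence length. The numbers are copied from this report.

```python
# Recompute identity, similarity, gap fraction, and coverage for the
# alignment above. All numbers are copied from the report; the helper
# and variable names are illustrative, not GapMind's own.

def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage."""
    return 100.0 * part / whole

# From the "Identities / Positives / Gaps" line.
identities, positives, gaps, aln_len = 582, 770, 17, 1146

# Alignment coordinates (1-based, inclusive) and full sequence lengths.
query_start, query_end, query_len = 13, 1148, 1172
sbjct_start, sbjct_end, sbjct_len = 10, 1148, 1155

identity_pct = percent(identities, aln_len)
positive_pct = percent(positives, aln_len)
gap_pct = percent(gaps, aln_len)

# Coverage, assumed to be the aligned span over the full length.
query_span = query_end - query_start + 1
sbjct_span = sbjct_end - sbjct_start + 1
query_cov = percent(query_span, query_len)
sbjct_cov = percent(sbjct_span, sbjct_len)

print(f"identity {identity_pct:.1f}%, positives {positive_pct:.1f}%, gaps {gap_pct:.1f}%")
print(f"query coverage {query_cov:.1f}%, subject coverage {sbjct_cov:.1f}%")
```

The report appears to truncate percentages to whole numbers, which is why 582/1146 is shown as 50% rather than 51%.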


Lambda     K      H
   0.319    0.136    0.405 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 3215
Number of extensions: 141
Number of successful extensions: 8
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 1172
Length of database: 1155
Length adjustment: 47
Effective length of query: 1125
Effective length of database: 1108
Effective search space:  1246500
Effective search space used:  1246500
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 58 (26.9 bits)
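
The statistics above are related by the standard Karlin-Altschul formulas: the bit score is (lambda * S - ln K) / ln 2, and the expected number of chance hits is the effective search space times 2 to the power of minus the bit score. The sketch below applies them to the values in this report, assuming the gapped lambda and K are the ones used for the reported score. The function names are illustrative. Because lambda and K are printed with limited precision, the recomputed bit score only approximately matches the reported 1124 bits.

```python
import math

# Values copied from the report above.
RAW_SCORE = 2908                        # raw score in "Score = 1124 bits (2908)"
GAPPED_LAMBDA, GAPPED_K = 0.267, 0.0410
QUERY_LEN, DB_LEN, LENGTH_ADJUSTMENT = 1172, 1155, 47

def bit_score(raw: float, lam: float, k: float) -> float:
    """Convert a raw alignment score to bits (Karlin-Altschul)."""
    return (lam * raw - math.log(k)) / math.log(2)

def log10_expect(bits: float, search_space: int) -> float:
    """log10 of the E-value, computed in log space to avoid underflow."""
    return math.log10(search_space) - bits * math.log10(2)

# Effective lengths are the raw lengths minus the length adjustment.
eff_query = QUERY_LEN - LENGTH_ADJUSTMENT
eff_db = DB_LEN - LENGTH_ADJUSTMENT
search_space = eff_query * eff_db

bits = bit_score(RAW_SCORE, GAPPED_LAMBDA, GAPPED_K)
log_e = log10_expect(bits, search_space)

print(f"effective search space: {search_space}")
print(f"bit score: {bits:.1f}")
print(f"log10(E-value): {log_e:.1f}")
```

The logarithm of the E-value is far below the range of a double-precision float, which is consistent with the report printing "Expect = 0.0".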

This GapMind analysis is from Apr 09 2024. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory