GapMind for catabolism of small carbon sources

Alignments for a candidate for iorAB in Rhodococcus qingshengii djl-6-2

Align phenylpyruvate ferredoxin oxidoreductase (EC 1.2.7.8) (characterized)
to candidate WP_076948421.1 C1M55_RS23815 indolepyruvate ferredoxin oxidoreductase family protein

Query= reanno::Marino:GFF880
         (1172 letters)



>NCBI__GCF_002893965.1:WP_076948421.1
          Length = 1151

 Score =  708 bits (1827), Expect = 0.0
 Identities = 430/1158 (37%), Positives = 634/1158 (54%), Gaps = 64/1158 (5%)

Query: 7    QLDDYKLEDRYLRESGRVFLTGTQALVRIPLMQAALDRKQGLNTAGLVSGYRGSPLGAVD 66
            Q   Y L+DR    SG V LTG QA+ R  + Q   D + G   A  VSGY+GSPLG VD
Sbjct: 10   QAKPYDLDDRCRSGSGPVLLTGVQAIARGFVEQHVRDIRAGKRIATFVSGYQGSPLGGVD 69

Query: 67   QALWQAKDLLDENRIDFVPAINEDLAATILLGTQQVETDEDRQVEGVFGLWYGKGPGVDR 126
            + L     +L E+ I F+P  NE+LAAT + G+Q          +GV G+WYGKGPGVDR
Sbjct: 70   KMLHGMPKVLAEHDITFIPGFNEELAATAVWGSQTDLPAGTATHDGVTGVWYGKGPGVDR 129

Query: 127  AGDALKHGTTYGSSPHGGVLVVAGDDHGCVSSSMPHQSDVAFMSFFMPTINPANIAEYLE 186
            A D ++H   YG +P GGV+++ GDD    SS++P  S+ +  +  +P + P N  E + 
Sbjct: 130  ATDTIRHANMYGVNPKGGVVLLVGDDPASKSSTVPAVSERSLAAMGVPVLFPRNAEEIIT 189

Query: 187  FGLWGYALSRYSGCWVGFKAISETVESAASVEIPPAP-DFVTPD-------------DFT 232
             G+   ALSR SGC V  K +++  + A +++   A  D V P+                
Sbjct: 190  MGMHAVALSRVSGCVVALKIVADVADGAWTIDGSIADFDIVVPEVQFEGKPFVYKQRQMA 249

Query: 233  APESGLHYRWPDLPGPQLETRIEHKLAAVQAFARANRIDRCLFDNKEARFGIVTTGKGHL 292
            AP   +     DL GP+ +         V A+  AN +D    +   A+ GI  TG    
Sbjct: 250  APPDSVIAE-ADLYGPRWDL--------VHAYGTANNLDVIEVNPSGAKIGIAATGTTFD 300

Query: 293  DLLEALDLLGIDEDKARDMGLDIYKVGMVWPLERRGILDFVHGKEEVLVIEEKRGIIESQ 352
             + +AL  LG+D+      G+ + ++GM +P+    I +F  G E+++V+E+K   IE+Q
Sbjct: 301  SVRQALADLGVDDAALHRAGIRLLRIGMCYPVVGEKIKEFAEGLEQIVVVEDKTAFIEAQ 360

Query: 353  IKEYMSEPDRPGEVLITGKQDELGRPLIPYVGELSPKLVAGFLAARLGRFFE--VDFSER 410
            I+E +   D    ++  GK+D  GR L+P  GEL+    AG L A L R  +  V+    
Sbjct: 361  IREVLYGTDDAPRII--GKRDGQGRLLMPASGELT----AGRLLAPLRRVLKPHVELKRT 414

Query: 411  MAEISAMTTAQDPGGVKRMPYFCSGCPHNTSTKVPEGSKALAGIGCHFMASWMGRNTES- 469
            +    ++         KR  YFCSGCPHN ST +PEGS    GIGCH + +  GR   + 
Sbjct: 415  LPAPLSLNVLS----AKRTAYFCSGCPHNRSTAIPEGSIGGGGIGCHTLVTMSGREDSAV 470

Query: 470  --LIQMGGEGVNWIGKSRYTGNPHVFQNLGEGTYFHSGSMAIRQAVAAGINITYKILFND 527
              L QMGGEG  WIG++ +T  PH+FQN+G+GT+FHSG +A++  +AAG+NITYK+L+N+
Sbjct: 471  TGLTQMGGEGSQWIGQAPFTDVPHLFQNIGDGTFFHSGQLAVQACIAAGVNITYKLLYNE 530

Query: 528  AVAMTGGQPVDGQITVDRIAQQMAAEGVNRVVVLSDEPEKYDGHHDLFPKDVTFHDRSEL 587
             VAMTG Q  +G ++V +++ ++  EGV ++++ +DEP +++       K      R  L
Sbjct: 531  VVAMTGAQDAEGALSVAQLSHKLTTEGVKQIIICADEPSRHNKR--ALAKGTKLWHRDRL 588

Query: 588  DQVQRELRDIPGCTVLIYDQTCAAEKRRRRKRKQFPDPAKRAFINHHVCEGCGDCSVQSN 647
            D+ Q+ELR+I G TVLIYDQ CAA+ RR+RKR   P    R  IN  VCEGCGDC V+SN
Sbjct: 589  DEAQKELREIKGVTVLIYDQHCAADARRQRKRGTLPTRNTRVVINEAVCEGCGDCGVKSN 648

Query: 648  CLSVVPRKTELGRKRKIDQSSCNKDFSCVNGFCPSFVTIEGGQLRKSRGVDTGSVLTRKL 707
            CLSV P  TE GRK KIDQ+SCN D++C++G CPSFVT+E   +   +       +   +
Sbjct: 649  CLSVQPVDTEYGRKTKIDQTSCNTDYTCMDGDCPSFVTVE--LVPGKKAPKAARPVPPVV 706

Query: 708  ADIPAPKLPEMTGSYDLLVGGVGGTGVVTVGQLITMAAHLESRGASVLDFMGFAQKGGTV 767
            AD   P    +T + ++ + G+GGTG+VTV Q++  AA         LD +G +QK G V
Sbjct: 707  AD---PDFGPLTSTQNVFLAGIGGTGIVTVNQVLATAALRAGLEVESLDQIGLSQKAGPV 763

Query: 768  LSYVRMAPSPDKLHQVRISNGQADAVIACDLVVASSQKALSVLRPNHTRIVANEAELPTA 827
            +S++R + S  +    R++ G AD ++A DL+ A+  K L       T  +A+ ++ PT 
Sbjct: 764  VSHLRFSSSALEPSN-RLTPGSADCIVAFDLLTATDNKNLDYGNAEKTVSIASTSQTPTG 822

Query: 828  DYVLFRDADMKADKRLGLLKNAVGEDHFDQLDANGIAEKLMGDTVFSNVMMLGFAWQKGL 887
            D V  +      +  L    + V        DA   A++L G+T  +N +++G A+Q G 
Sbjct: 823  DMVYDKAVRYPEETSLLARLDKVSRT-LRSFDALAAAQELFGNTAAANFLLIGAAYQSGG 881

Query: 888  LPLSEAALMKAIELNGVAIDRNKEAFGWGRLSAVDP---SAVTDLLDDSNAQVV------ 938
            L L   A+ +AI +NGVA+D N  AF WGR +  D    SAVT         VV      
Sbjct: 882  LRLPAEAIEEAIGINGVAVDANIAAFRWGRAAIADTAAFSAVTTPAKSVREPVVAPAHMF 941

Query: 939  --EVKPEPTLDELINTRHKHLVNYQNQRWADQYRDAVAGVRKAEESLGETNLLLTRAVAQ 996
                    TL  L+  R   L+ +Q+ + A +Y D V  V  AE +  E     + AVA+
Sbjct: 942  AGTTFTGETL-RLVELRAAQLIEFQSVKIAQRYIDQVQAVWTAERAATE-RTDFSEAVAR 999

Query: 997  QLYRFMAYKDEYEVARLFAETDFMKEVNETFEGDFKVHFHLAPPLLSGETDAQGRPKKRR 1056
             L++F AYKDEYEVAR+  +  F+++V         + + L PP+L       GR KK  
Sbjct: 1000 GLFKFTAYKDEYEVARMLVDPAFIEDVKSQVPAGENLTYKLHPPILR----VMGRKKKIG 1055

Query: 1057 FGPWMFRAFRLLAKLRGLRGTAIDPFRYSADRKLDRAMLKDYQSLVDRIGRELNASNYET 1116
             GP    A ++LAK + LRGT +DPF Y   RK++R +L +Y++++  + R L    Y+ 
Sbjct: 1056 LGPKSHVALKVLAKGKKLRGTKLDPFGYMHVRKVERELLAEYEAMIADLSRTLATDGYDR 1115

Query: 1117 FLQLAELPADVRGYGPVR 1134
             +++A LP  VRGY  ++
Sbjct: 1116 AVEIAALPDVVRGYEDIK 1133
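
The summary line and the alignment coordinates above are enough to recompute percent identity and coverage by hand. The following is a minimal sketch, not GapMind's own code: the constants are copied from this report, the helper names are illustrative, and "coverage" is assumed here to mean the aligned span divided by the full sequence length.

```python
# Values copied from the alignment report above; names are illustrative only.
QUERY_LEN, SUBJECT_LEN = 1172, 1151
IDENTITIES, POSITIVES, GAPS, ALIGNED_COLUMNS = 430, 634, 64, 1158
QUERY_SPAN = (7, 1134)     # first and last aligned query positions
SUBJECT_SPAN = (10, 1133)  # first and last aligned subject positions


def span_length(span):
    """Number of residues between the first and last aligned position, inclusive."""
    start, end = span
    return end - start + 1


def coverage(span, length):
    """Fraction of a sequence that falls inside the aligned span (assumed definition)."""
    return span_length(span) / length


identity = IDENTITIES / ALIGNED_COLUMNS
positives = POSITIVES / ALIGNED_COLUMNS
query_cov = coverage(QUERY_SPAN, QUERY_LEN)
subject_cov = coverage(SUBJECT_SPAN, SUBJECT_LEN)

# Consistency check: every alignment column that is not a query residue is a
# gap in the query, and likewise for the subject.
gap_columns = (ALIGNED_COLUMNS - span_length(QUERY_SPAN)) + (
    ALIGNED_COLUMNS - span_length(SUBJECT_SPAN)
)

print(f"identity {identity:.1%}, positives {positives:.1%}")
print(f"query coverage {query_cov:.1%}, subject coverage {subject_cov:.1%}")
print(f"gap columns {gap_columns} (report says {GAPS})")
```

With these numbers the alignment spans nearly the full length of both proteins, while the identity is a little over 37%.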


Lambda     K      H
   0.319    0.136    0.405 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 2832
Number of extensions: 129
Number of successful extensions: 12
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 1172
Length of database: 1151
Length adjustment: 47
Effective length of query: 1125
Effective length of database: 1104
Effective search space:  1242000
Effective search space used:  1242000
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 58 (26.9 bits)
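
The statistics block can be cross-checked against the score line using the standard Karlin-Altschul relations: bit score = (lambda * S - ln K) / ln 2, and E = effective search space * 2^(-bit score). The sketch below assumes the gapped Lambda and K apply to the reported HSP and that the length adjustment is subtracted once from the query and once from the single database sequence; the function names are illustrative, not part of any BLAST API.

```python
import math

# Values copied from the report above.
LAMBDA_GAPPED, K_GAPPED = 0.267, 0.0410
RAW_SCORE = 1827
QUERY_LEN, DB_LEN, LENGTH_ADJUSTMENT, NUM_SEQUENCES = 1172, 1151, 47, 1


def bit_score(raw_score, lam, k):
    """Convert a raw alignment score to bits (Karlin-Altschul normalization)."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def effective_search_space(query_len, db_len, adjustment, num_sequences):
    """Product of the edge-corrected query and database lengths."""
    return (query_len - adjustment) * (db_len - num_sequences * adjustment)


def expect_value(bits, search_space):
    """Expected number of chance hits scoring at least `bits`."""
    return search_space * 2.0 ** (-bits)


bits = bit_score(RAW_SCORE, LAMBDA_GAPPED, K_GAPPED)
space = effective_search_space(QUERY_LEN, DB_LEN, LENGTH_ADJUSTMENT, NUM_SEQUENCES)
e_value = expect_value(bits, space)

print(f"bit score {bits:.1f}, search space {space}, E-value {e_value:.1e}")
```

Under these assumptions the recomputed bit score rounds to the reported 708 bits, the search space equals the reported 1242000, and the E-value is so small that the report shows it as 0.0.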

This GapMind analysis is from Sep 24 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
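
For readers unfamiliar with the term, a six-frame translation translates a DNA sequence in all three reading frames on both strands, so that proteins missed by gene prediction can still be found. The following is a minimal illustrative sketch, not the code used by GapMind or Curated BLAST; it assumes the standard genetic code, ignores alternative start codons, and reports codons containing ambiguous bases as `X`.

```python
from itertools import product

# Standard genetic code in NCBI order (bases T, C, A, G; first base varies slowest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons from the start of `dna`; '*' marks a stop codon."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translation(dna):
    """Map frame labels (+1, +2, +3, -1, -2, -3) to their translations."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


# Toy example: a start codon, one alanine codon, and a stop codon.
for label, protein in six_frame_translation("ATGGCCTAA").items():
    print(label, protein)
```

A search tool would then compare the query protein against each of the six translated strings rather than against the annotated proteins only.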

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory