GapMind for catabolism of small carbon sources

 

Alignments for a candidate for ofo in Caulobacter crescentus NA1000

Align 3-methyl-2-oxobutanoate:ferredoxin oxidoreductase (EC 1.2.7.7) (characterized)
to candidate CCNA_03280 CCNA_03280 pyruvate ferredoxin/flavodoxin oxidoreductase family protein

Query= reanno::Cup4G11:RR42_RS19540
         (1197 letters)



>lcl|FitnessBrowser__Caulo:CCNA_03280 CCNA_03280 pyruvate
            ferredoxin/flavodoxin oxidoreductase family protein
          Length = 1146

 Score = 1074 bits (2778), Expect = 0.0
 Identities = 585/1176 (49%), Positives = 756/1176 (64%), Gaps = 48/1176 (4%)

Query: 18   ANVSLEDKYTLERGRVYISGTQALVRLPMLQRERDRAAGLNTAGFISGYRGSPLGALDQS 77
            + V+L+DKY LE GR +I+G QAL+R+ + ++  DR AGLNT G++SGYRGSPLG LDQ 
Sbjct: 4    SEVTLDDKYVLEDGRAFITGVQALLRVLLDRKRLDRKAGLNTGGYLSGYRGSPLGGLDQQ 63

Query: 78   LWKAKQHLAAHDIVFQAGLNEDLAATSVWGSQQVNMYPDARFEGVFGMWYGKGPGVDRTS 137
              + K+ L AHD+VFQ GLNEDLAAT+VWGSQQ N++P A ++GVFGMWYGK PGVDRT 
Sbjct: 64   AARIKKLLTAHDVVFQEGLNEDLAATAVWGSQQANLFPGALYDGVFGMWYGKAPGVDRTG 123

Query: 138  DVFKHANSAGSSRHGGVLVLAGDDHAAKSSTLAHQSEHIFKACGLPVLYPSNVQEYLDYG 197
            DVFKHAN AG+   GGVL +AGDDH  KSSTL  QSE  F+   +PVL P++VQE LDYG
Sbjct: 124  DVFKHANFAGTFPTGGVLAVAGDDHGCKSSTLPSQSEFAFQDFEMPVLSPADVQEVLDYG 183

Query: 198  LHAWAMSRYSGLWVSMKCVTDVVESSASVELDPHRVEIVLPQDFILPPGGLNIRWPDPPL 257
            L   +MSR+SGLW  M  + D ++S  ++++   R +IV+P+ F  PPGGL IR  D P+
Sbjct: 184  LLGISMSRFSGLWTGMIALADTMDSGVTIDVSLDRHQIVVPE-FAFPPGGLGIRQKDQPM 242

Query: 258  EQEARLLDYKWYAGLAYVRANKIDRIEIDSPH-----ARFGIMTGGKAYLDTRQALANLG 312
            E+E R+  +K  A LA+ RAN IDR+ + + H     AR GI+  G+AY D  +A   +G
Sbjct: 243  EKERRMRLHKIPAALAFARANNIDRVVLGASHVKVGKARLGIVCQGQAYKDVLEAFTAMG 302

Query: 313  LDDETCARIGIRLYKVGCVWPLEAHGARAFAEGLQEILVVEEKRQIMEYALKEELYNWRD 372
            +  +  A +G+ +YKVG  WPLE  G RAFA GL+ ++V+E KR ++E   +  LY+   
Sbjct: 303  MTLQEAADLGVSIYKVGMPWPLEPLGLRAFAAGLETLMVIEHKRALIEPQARAALYDLPA 362

Query: 373  DVRPKVYGKFDEKDNAGGEWSIPQSNWLLPAHYELSPAIIARAIATRLDKFELPADVRAR 432
              RP+V GK DEK   GG         LL     LS A IA AI  RL +         R
Sbjct: 363  QARPRVIGKTDEK---GGP--------LLSELGSLSVAEIALAIYDRLPQ----GPHMER 407

Query: 433  IAARIAVIEAKEKAMAVPRVAAERKPWFCSGCPHNTSTNVPEGSRALAGIGCHYMTVWMD 492
              A +  + A   A         RKP+FCSGCPHNTST +PEGSRALAGIGCHYM  + D
Sbjct: 408  AQAYLNRVSAAGVAAVSLAADQARKPFFCSGCPHNTSTKLPEGSRALAGIGCHYMAGFND 467

Query: 493  RSTSTFSQMGGEGVAWIGQAPFAGDKHVFANLGDGTYFHSGLLAIRASIAAGVNITYKIL 552
              T   + MGGEG+ W+G APF  +KHVF NLGDGTY HSG LAIR ++AAG NITYK+L
Sbjct: 468  PMTDLNTHMGGEGLTWVGAAPFTSEKHVFQNLGDGTYNHSGSLAIRGAVAAGTNITYKLL 527

Query: 553  YNDAVAMTGGQPIDGKLSVQDVANQVAAEGARKIVVVTDEPEKYSAAIKLPQGVEVHHRD 612
            YNDAVAMTGGQ  +   +   +  Q+AAEG +K V+V DE E+Y     L  GVE+  R 
Sbjct: 528  YNDAVAMTGGQRAESGFTPAQITRQLAAEGVKKTVIVVDELERYQGVNDLAPGVEIFPRS 587

Query: 613  ELDRIQRELREVPGATILIYDQTCATEKRRRRKRGTYPDPAKRAFINDAVCEGCGDCSVK 672
            +L R+Q  LRE PG T+L+YDQTCATEKRRRRKRG+ P   +R FIN  VCEGCGDCSVK
Sbjct: 588  DLMRVQEMLRETPGTTVLLYDQTCATEKRRRRKRGSMPKATQRVFINPLVCEGCGDCSVK 647

Query: 673  SNCLSVEPLETELGTKRQINQSSCNKDFSCVNGFCPSFVTAEGAQVKKPERHGVSMDNLP 732
            SNC+SVEPL TE G KR+INQSSCN+D+SCV GFCPSF+T EGA+  + ++   ++    
Sbjct: 648  SNCVSVEPLATEFGRKRKINQSSCNQDYSCVEGFCPSFITLEGAESAQSKKTPAAL-TAE 706

Query: 733  ALPQPALPGLEHPYGVLVTGVGGTGVVTIGGLLGMAAHLENKGVTVLDMAGLAQKGGAVL 792
            + P P    L     +L TGVGGTGV T+  ++ MAAH++ +  +V+DM GLAQKGG+V 
Sbjct: 707  STPLPEFEPLTGVRKILFTGVGGTGVTTVASIMAMAAHIDGRAGSVVDMTGLAQKGGSVF 766

Query: 793  SHVQIAAHPDQLHATRIAMGEADLVIGCDAIVSAIDDVISKTQVGRTRAIVNTAQTPTAE 852
            SHV+I    + +   R+    AD++I CD +V+A  + +S     RTRA  N+   PTA+
Sbjct: 767  SHVKIGKTEETIVGGRVPAASADVLIACDLLVAASPEGLSLYAKDRTRAFGNSDFAPTAD 826

Query: 853  FIKNPKWQFPGLSAEQDVRNAVGEACDFINASGLAVALIGDAIFTNPLVLGYAWQKGWLP 912
            F+ +   +F   +  + V+ A  +  D   A  LA    GDAI+ N +++G+AWQ+G +P
Sbjct: 827  FVTSRDVRFDSGAMARRVKGAT-KTFDACPAQRLAETEFGDAIYANMIMVGFAWQRGVIP 885

Query: 913  LSLDALVRAIELNGTAVEKNKAAFDWGRHMAHDPEHVLSLTGKLRNTAEGAEVVKLPTSS 972
            LS  A+ RAI+LNG   E N  AF+ GR +AHDP    +LT K   T         PT  
Sbjct: 886  LSSRAVYRAIKLNGVDAEANLQAFELGRRVAHDPS---TLTVKEDTT---------PTPE 933

Query: 973  GALLEKLIAHRAEHLTAYQDAAYAQTFRDTVSRVRAAESALVGNGKPLPLTEAAARNLSK 1032
               L+ LIAHR   LTAYQ+AAYAQ + D V++VRAAE+A+ G    LPLT AAA NL K
Sbjct: 934  TMPLDALIAHRIAQLTAYQNAAYAQRYADKVAKVRAAETAVSGEDGALPLTRAAAVNLYK 993

Query: 1033 LMAYKDEYEVARLYTDPIFLDKLRNQFEGEPGRDYQLNFWLAPPLMAKRDEKGHLVKRRF 1092
            LMAYKDEYEVARLYTD  F  +L   F+G   +      WLAPPL+A +   G   K  F
Sbjct: 994  LMAYKDEYEVARLYTDGRFAAELAGTFKGGKAK-----VWLAPPLLAPKGPDGKPKKIAF 1048

Query: 1093 GPSTMKL-FGVLAKLKGLRGGVFDVFGKTAERRTERALIGEYRALLEELTRGLSAANHAT 1151
            G   + L F ++AK+KGLRG   D+FGKT ERR ER LI  Y   L+ L  GL A +   
Sbjct: 1049 GGWMLDLAFPMMAKMKGLRGTALDIFGKTEERRMERGLIASYETGLDRLAAGLKAESLPL 1108

Query: 1152 AITLASLPDDIRGFGHVKDDNL-------AKVRTRW 1180
            A+ +A +P  IRGFGHVK+ ++       AK+ T+W
Sbjct: 1109 AVKIAEVPQAIRGFGHVKEASVVTAKAAEAKLWTQW 1144


Lambda     K      H
   0.319    0.135    0.407 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 3176
Number of extensions: 131
Number of successful extensions: 9
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 2
Number of HSP's successfully gapped: 1
Length of query: 1197
Length of database: 1146
Length adjustment: 47
Effective length of query: 1150
Effective length of database: 1099
Effective search space:  1263850
Effective search space used:  1263850
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 58 (26.9 bits)
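
The bit values shown in parentheses in the statistics above can be recovered from the raw scores with the standard Karlin-Altschul normalization, S' = (lambda * S - ln K) / ln 2, and the effective search space is the product of the length-adjusted query and database lengths. The Python sketch below reproduces these figures from the reported parameters. The function names are illustrative and are not part of BLAST or GapMind. Because Lambda and K are printed in rounded form, the computed bit scores match the report only approximately.

```python
import math

def bit_score(raw_score, lam, k):
    """Convert a raw alignment score to bits: (lambda * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)

def dropoff_bits(raw_dropoff, lam):
    """Convert an X-dropoff value to bits: lambda * X / ln 2 (no K term)."""
    return lam * raw_dropoff / math.log(2)

def effective_search_space(query_len, db_len, length_adjustment, num_seqs=1):
    """Multiply the length-adjusted query and database lengths."""
    effective_query = query_len - length_adjustment
    effective_db = db_len - num_seqs * length_adjustment
    return effective_query * effective_db

# Parameters copied from the report above (rounded as printed).
UNGAPPED_LAMBDA, UNGAPPED_K = 0.319, 0.135
GAPPED_LAMBDA, GAPPED_K = 0.267, 0.0410

# Raw score 2778 of the gapped alignment; the report lists 1074 bits.
alignment_bits = bit_score(2778, GAPPED_LAMBDA, GAPPED_K)

# X1 is an ungapped dropoff, X2 a gapped one; the report lists 7.4 and 14.6 bits.
x1_bits = dropoff_bits(16, UNGAPPED_LAMBDA)
x2_bits = dropoff_bits(38, GAPPED_LAMBDA)

# Query of 1197 letters, database of 1146 letters in 1 sequence, adjustment 47.
search_space = effective_search_space(1197, 1146, 47)

print(f"alignment: {alignment_bits:.1f} bits")
print(f"X1: {x1_bits:.1f} bits, X2: {x2_bits:.1f} bits")
print(f"effective search space: {search_space}")
```

The effective search space is exact (1150 x 1099 = 1263850), whereas the bit scores depend on how many digits of Lambda and K are available.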

This GapMind analysis is from Sep 17 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."
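
As a rough illustration of the identity and coverage figures these rules refer to, the Python sketch below computes them for the alignment shown on this page. It assumes that "coverage" means the fraction of a sequence spanned by the aligned region; that definition, and the function names, are assumptions made for this example rather than GapMind's own code.

```python
def percent_identity(identical, alignment_length):
    """Percentage of alignment columns with identical residues."""
    return 100.0 * identical / alignment_length

def coverage(start, end, seq_length):
    """Fraction of a sequence spanned by the aligned region (1-based, inclusive)."""
    return (end - start + 1) / seq_length

# Values taken from the alignment above.
identity = percent_identity(585, 1176)      # Identities = 585/1176
query_cov = coverage(18, 1180, 1197)        # characterized protein, 1197 letters
subject_cov = coverage(4, 1144, 1146)       # candidate CCNA_03280, 1146 letters

# The only threshold stated explicitly in the text: at least 50% coverage.
meets_coverage_threshold = query_cov >= 0.5

print(f"identity: {identity:.1f}%")
print(f"query coverage: {query_cov:.1%}, subject coverage: {subject_cov:.1%}")
print(f"at least 50% coverage: {meets_coverage_threshold}")
```

BLAST reports the identity as 49% because it truncates the percentage to a whole number.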

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
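
To make the term concrete, a six-frame translation reads a DNA sequence in three reading frames on each strand. The Python sketch below is a minimal illustration and is not how GapMind or Curated BLAST is implemented. It assumes the standard genetic code, whose codon assignments are the same as in the bacterial table, and the hypothetical helper names are chosen for this example only.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G (first base slowest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]

def translate(dna):
    """Translate complete codons; unknown codons become 'X', stops become '*'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )

def six_frame_translation(dna):
    """Translate frames +1..+3 on the given strand and -1..-3 on the other."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames

for name, protein in six_frame_translation("ATGGCC").items():
    print(name, protein)
```

Searching all six frames can reveal a gene that was missed or truncated during gene prediction, which is why it complements a search restricted to predicted proteins.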

For more information, see the paper from 2019 on GapMind for amino acid biosynthesis, the paper from 2022 on GapMind for carbon sources, or view the source code.

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory