GapMind for catabolism of small carbon sources

 

Alignments for a candidate for ydiJ in Dinoroseobacter shibae DFL-12

Align 2-hydroxyglutarate oxidase (EC 1.1.3.15) (characterized)
to candidate 3609942 Dshi_3324 FAD linked oxidase domain protein (RefSeq)

Query= reanno::Putida:PP_4493
         (1006 letters)



>FitnessBrowser__Dino:3609942
          Length = 973

 Score =  338 bits (866), Expect = 1e-96
 Identities = 295/1010 (29%), Positives = 458/1010 (45%), Gaps = 103/1010 (10%)

Query: 28   GQISADYATRTVLATDNSIYQRLPQAAVFPLDADDVARVATLMGEPRFQQVKLTPRGGGT 87
            G++  D  +R   ATD SIYQ  P  AV P   DD+    T +G  R     + PRGGGT
Sbjct: 19   GELLFDAFSRGRYATDASIYQMTPLGAVVPKSLDDIE---TALGAARELGASVLPRGGGT 75

Query: 88   GTNGQSLTDGIVVDLSRHMNNILEINVEERWVRVQAGTVKDQLNAALKPHGLFFAPELST 147
               GQ++   +V+D S++ N IL+++V  R   VQ G V D+LN ALKPHGL+F  ++ST
Sbjct: 76   SQCGQTVNHSLVLDNSKYFNRILDLDVANRRCVVQPGIVLDELNRALKPHGLWFPVDVST 135

Query: 148  SNRATVGGMINTDASGQGSCTYGKTRDHVLELHSVLLGGERLHSLPIDDAALEQACAAPG 207
            ++RAT+GGM   ++ G  S  YG  RD+VL + ++L  G R    P+D   +        
Sbjct: 136  ASRATIGGMAANNSCGGRSLRYGTMRDNVLSIEAILADGTRTTFGPVDRNGVLGNDPTSA 195

Query: 208  RVGEVYRM-AREIQETQAELIETTFPKLNRCLTGYDLAHLRDEQGRFNLNSVLCGAEGSL 266
             + E+  + ARE    QAE I   FP + R + GY++  L  +  R N+  +L G+EG+L
Sbjct: 196  LISEMLALGARE----QAE-IAARFPNIIRRVGGYNIDALTPDHARNNMAHLLVGSEGTL 250

Query: 267  GYVVEAKLNVLPIPKYAVLVNVRYTSFMDALRDANALMAHKPLSIETVDSKVLMLAMKDI 326
            GY    +L + P+    V     + +F  A+  A  L+  KP ++E VD  ++ LA +DI
Sbjct: 251  GYFTAIELKLSPLQSTKVTGACHFPTFHAAMDAAQHLVPLKPAAVELVDDTMIALA-RDI 309

Query: 327  VW--HSVAEYFPADPERPTLGINLVEFCGDEPAEVNAKVQAFIQHLQS-----DTSVERL 379
                 +++++   DP      + LVEF     A   AK++     + +     D S +  
Sbjct: 310  PMFRQTISQFVQGDP----AALLLVEFAEGSDAANAAKLEELSDKMAALGFGFDRSGKHW 365

Query: 380  GHTLA-EGAEAVTRVYTMRKRSVGLLGNVEGEVRPQPFVEDTAVPPEQLADYIADFRALL 438
            G ++A    +    +  +RK  + ++ +++ E +P  FVED AV    LA Y A    + 
Sbjct: 366  GGSVAVTDLKLQNDIGEVRKSGLNIMMSMKSEGKPVSFVEDCAVELPDLAAYTAGLTEIF 425

Query: 439  DGYGLAYGMFGHVDAGVLHVRPALDMKDPVQAALVKPISDAVAALTKRYGGLLWGEHGKG 498
            + +G     + H   G LHVRP L+MK       ++ I++    L   Y G   GEHG G
Sbjct: 426  EKHGTKGTWYAHASVGCLHVRPVLNMKLDKDVKTMRAIAEECFDLVAHYKGSHSGEHGDG 485

Query: 499  L-RSEYVPEYFG-ELYPALQRLKGAFDPHNQLNPGKICTP--LGSAEGLTPVDGVTLRGD 554
            + RSE+  + FG ++  +   +K  FDP+   NPGKI  P  +            T++  
Sbjct: 486  IVRSEFHEKMFGPQMVASFNEVKDRFDPNRLFNPGKIVQPPKMDDRRLFRYGPDYTVK-P 544

Query: 555  LDRTIDERVWQ----DFPSAVH-CNGNGACYNYDPNDAMCPSWKATRERQHSPKGRASLM 609
            +   +D   W      F  AV  CN NGAC        MCPS++ TR  +   +GRA+ +
Sbjct: 545  MKTALDWSAWPGAGGGFQGAVEMCNNNGACRKL-KGGVMCPSYRVTRNERDVVRGRANSL 603

Query: 610  REWLRLQGEANIDVLAAARNKVSWLKGLPARLRNNRARNQGQEDFSHEVYDAMAGCLACK 669
            R  L + G+   D  A                             S ++  AM  C++CK
Sbjct: 604  R--LAISGQLGPDAFA-----------------------------SDDMVSAMKLCVSCK 632

Query: 670  SCAGQCPIKVNVPDFRSRFLELYHGRYQRPLRDYLIGSLEFTIPYLAHAPGLYNAVMGSK 729
             C  +CP  V++   +   L     ++     D L+G +    P+ A  P L NA     
Sbjct: 633  GCKRECPTGVDMAKMKIEVLAARAEKHGLSFHDKLVGYMPAYAPFAAKLPWLANARNWLP 692

Query: 730  WVSQLLADKVGMVDSPLISRFNFQATLTRCRVGMATVPALRELTPAQRERSIVLVQDAFT 789
              ++L     G   S  +  ++    L              +      +  +VL  D F 
Sbjct: 693  GAAKLTERLTGFAASRKLPAWSNDPFL--------------DTEFQDADPDVVLWADIFN 738

Query: 790  RYFETPLLSAFIDLAHRLGHRVFLAPYSANGKPL-----HVQGFLGAFAKAAIRNATQLK 844
            RYFE   L A   +    G RV +   +    PL     ++   L   A+A+ R A  LK
Sbjct: 739  RYFEPENLRAAGRVLRAAGLRVAVVRPADGKDPLDCGRTYLAVGLVEEARASARRA--LK 796

Query: 845  AL---ADCGVPLVGLDPAMTLVYRQEYQ-KVPGLEG---CPKVLLPQEWLMD---VLPEQ 894
            AL   A+ GVP+VGL+P+  L  R E+   VPG E        L+ +E+L+     LP +
Sbjct: 797  ALVPHAEKGVPIVGLEPSSLLTLRDEWPVLVPGPEADLVAQHALMLEEYLVRNEIALPLK 856

Query: 895  APAAPGSFRLMAHCTEKT-NVPASTRQWEQVFARLGLKLVTEATGCCGMSGTYGHEARNQ 953
                P    L  HC +K  NV    ++   +  +  + +V   T CCGM+G +G+     
Sbjct: 857  PLGRP--ILLHGHCHQKAHNVMPDVQKTLAMIPQAEVSVV--ETSCCGMAGAFGYGKDTV 912

Query: 954  ETSRTIFEQSWATKL---DKDGEPLATGYSCRSQVKRMTERKMRHPLEVV 1000
            ETS  + E      +    +D   +A G SCR Q+K    R+  H   ++
Sbjct: 913  ETSLKMAELDLLPAVRAAAQDALVVADGTSCRHQIKDGAAREAIHVARIL 962
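
The summary percentages in the alignment header can be recomputed from the counts it reports. The following is a minimal sketch, not part of GapMind or BLAST: the function name `parse_alignment_stats` is hypothetical, and the only input is the summary line copied from the report above. Each count is divided by the number of alignment columns (1010), which includes gap columns.

```python
import re

# Summary line copied verbatim from the BLAST report above.
STATS_LINE = (
    "Identities = 295/1010 (29%), Positives = 458/1010 (45%), "
    "Gaps = 103/1010 (10%)"
)


def parse_alignment_stats(line):
    """Return {name: (count, alignment_columns, fraction)} for each statistic.

    Hypothetical helper for illustration; it only understands the
    'Name = count/total' pattern used in the line above.
    """
    stats = {}
    for name, count, total in re.findall(r"(\w+) = (\d+)/(\d+)", line):
        stats[name.lower()] = (int(count), int(total), int(count) / int(total))
    return stats


stats = parse_alignment_stats(STATS_LINE)
for name, (count, total, fraction) in stats.items():
    # Print each statistic as a percentage of alignment columns.
    print(f"{name}: {count}/{total} = {fraction:.1%}")
```

The denominator is the alignment length rather than the length of either protein, so percent identity computed this way is lowered by gap columns.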


Lambda     K      H
   0.320    0.135    0.409 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 2214
Number of extensions: 121
Number of successful extensions: 11
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 3
Number of HSP's successfully gapped: 3
Length of query: 1006
Length of database: 973
Length adjustment: 44
Effective length of query: 962
Effective length of database: 929
Effective search space:   893698
Effective search space used:   893698
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.8 bits)
S2: 57 (26.6 bits)
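
The bit score, effective search space, and expect value reported above are related by the standard Karlin-Altschul formulas: bit score S' = (λS − ln K) / ln 2 and E = m·n·2^(−S'), where m and n are the effective lengths. The sketch below recomputes them from the values printed in the report, as a consistency check rather than a reimplementation of BLAST. The function names are hypothetical, and because λ and K are rounded in the report, the results only approximate the printed figures.

```python
import math

# Values copied from the report above.
RAW_SCORE = 866            # raw score shown in parentheses after the bit score
GAPPED_LAMBDA = 0.267      # gapped Lambda
GAPPED_K = 0.0410          # gapped K
QUERY_LENGTH = 1006
DATABASE_LENGTH = 973
NUM_SEQUENCES = 1
LENGTH_ADJUSTMENT = 44


def bit_score(raw_score, lam, k):
    """Convert a raw alignment score to bits: (lambda*S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def effective_search_space(query_len, db_len, num_seqs, adjustment):
    """Product of the effective query length and effective database length."""
    effective_query = query_len - adjustment
    effective_db = db_len - num_seqs * adjustment
    return effective_query * effective_db


def expect_value(search_space, bits):
    """Expected number of chance hits with at least this bit score."""
    return search_space * 2.0 ** (-bits)


bits = bit_score(RAW_SCORE, GAPPED_LAMBDA, GAPPED_K)
space = effective_search_space(
    QUERY_LENGTH, DATABASE_LENGTH, NUM_SEQUENCES, LENGTH_ADJUSTMENT
)
evalue = expect_value(space, bits)

print(f"bit score ~ {bits:.1f}")
print(f"effective search space = {space}")
print(f"E-value ~ {evalue:.1e}")
```

The effective search space works out to 962 × 929 = 893698, matching the report, and the recomputed bit score rounds to the reported 338. The recomputed E-value is of the same order as the reported 1e-96; the small difference is consistent with the rounding of λ, K, and the printed expect value.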

This GapMind analysis is from Sep 17 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."
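
As an illustration of the coverage criterion, the sketch below computes coverage from the alignment coordinates shown in the report above (query residues 28 to 1000 of 1006, subject residues 19 to 962 of 973). The 50% threshold is taken from the text. Treating coverage as the aligned fraction of the characterized (query) protein is an assumption, and the function name `coverage` is hypothetical. Only the single alignment shown above is considered, and the identity and coverage criteria for the higher confidence tiers are not reproduced here.

```python
# Alignment coordinates copied from the BLAST report above.
QUERY_START, QUERY_END, QUERY_LENGTH = 28, 1000, 1006
SBJCT_START, SBJCT_END, SBJCT_LENGTH = 19, 962, 973

# Threshold stated in the text for "low confidence" hits.
LOW_CONFIDENCE_MIN_COVERAGE = 0.50


def coverage(start, end, length):
    """Fraction of a sequence spanned by the alignment (1-based, inclusive)."""
    return (end - start + 1) / length


# Assumption: coverage is measured on the characterized (query) protein.
query_coverage = coverage(QUERY_START, QUERY_END, QUERY_LENGTH)
sbjct_coverage = coverage(SBJCT_START, SBJCT_END, SBJCT_LENGTH)
meets_low_confidence_coverage = query_coverage >= LOW_CONFIDENCE_MIN_COVERAGE

print(f"query coverage:   {query_coverage:.1%}")
print(f"subject coverage: {sbjct_coverage:.1%}")
print(f"meets 50% coverage threshold: {meets_low_confidence_coverage}")
```

The alignment spans 973 of 1006 query residues and 944 of 973 subject residues, so coverage is well above 50% on either sequence.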

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory