GapMind for catabolism of small carbon sources

 

Alignments for a candidate for ydiJ in Pseudomonas fluorescens FW300-N2C3

Align 2-hydroxyglutarate oxidase (EC 1.1.3.15) (characterized)
to candidate AO356_29050 AO356_29050 hypothetical protein

Query= reanno::Putida:PP_4493
         (1006 letters)



>lcl|FitnessBrowser__pseudo5_N2C3_1:AO356_29050 AO356_29050
            hypothetical protein
          Length = 1015

 Score = 1378 bits (3566), Expect = 0.0
 Identities = 699/1010 (69%), Positives = 796/1010 (78%), Gaps = 10/1010 (0%)

Query: 1    MIAQLSTVAPS-ANYPEFLEALRNSGFRGQISADYATRTVLATDNSIYQRLPQAAVFPLD 59
            MIA+LS   P  A Y  FL AL+ +GF+G+I+ D+  RTVLATDNSIYQRLPQAAVFPLD
Sbjct: 1    MIARLSPPDPQKATYDAFLNALQVAGFQGEIARDHGDRTVLATDNSIYQRLPQAAVFPLD 60

Query: 60   ADDVARVATLMGEPRFQQVKLTPRGGGTGTNGQSLTDGIVVDLSRHMNNILEINVEERWV 119
              DV  ++ L  EP  + V LTPRGGGTGTNGQSLTDG+VVDLSRH+NNILEINVEERW 
Sbjct: 61   EQDVMILSRLAAEPAHRAVVLTPRGGGTGTNGQSLTDGVVVDLSRHLNNILEINVEERWA 120

Query: 120  RVQAGTVKDQLNAALKPHGLFFAPELSTSNRATVGGMINTDASGQGSCTYGKTRDHVLEL 179
            RVQ G VKDQLNAALKP+GLFFAPELSTSNRAT+GGMINTDASGQGSCTYGKTRDHVLEL
Sbjct: 121  RVQNGVVKDQLNAALKPYGLFFAPELSTSNRATIGGMINTDASGQGSCTYGKTRDHVLEL 180

Query: 180  HSVLLGGERLHSLPIDDAALEQACAAPGRVGEVYRMAREIQETQAELIETTFPKLNRCLT 239
             +VLLGG RL S  +D A L        RVGEVY+ A +I +T AELI  TFPKLNRCLT
Sbjct: 181  STVLLGGARLTSSAVDAALLSALVERQDRVGEVYQCALDIADTHAELIRDTFPKLNRCLT 240

Query: 240  GYDLAHLRDEQGRFNLNSVLCGAEGSLGYVVEAKLNVLPIPKYAVLVNVRYTSFMDALRD 299
            GYDLAHLR+  GRFNLNSVLCG+EGSLG++VEAKLNVLPIP Y++LVNVRY  FM+ALRD
Sbjct: 241  GYDLAHLREPDGRFNLNSVLCGSEGSLGFIVEAKLNVLPIPTYSILVNVRYAGFMEALRD 300

Query: 300  ANALMAHKPLSIETVDSKVLMLAMKDIVWHSVAEYFPADPERPTLGINLVEFCGDEPAEV 359
            A ALMA KPLSIETVDSKVLMLAMKDIVWH VAEYFP DP+RPTLGINL+EF GD+ A V
Sbjct: 301  AKALMAMKPLSIETVDSKVLMLAMKDIVWHGVAEYFPEDPDRPTLGINLIEFSGDDEAAV 360

Query: 360  NAKVQAFIQHLQSDTSVERLGHTLAEGAEAVTRVYTMRKRSVGLLGNVEGEVRPQPFVED 419
              +V  F+ HLQ DTSV RLGHTLA GA+A+ RVY MRKR+VGLLGNV+GE RPQPFVED
Sbjct: 361  QQQVHEFVLHLQRDTSVVRLGHTLAIGADALKRVYAMRKRAVGLLGNVKGEARPQPFVED 420

Query: 420  TAVPPEQLADYIADFRALLDGYGLAYGMFGHVDAGVLHVRPALDMKDPVQAALVKPISDA 479
            TAVPPE LAD+I  FRALLD + L YGMFGHVDAGVLHVRP LDMKDP QAAL++PISDA
Sbjct: 421  TAVPPEHLADFIQAFRALLDRHDLQYGMFGHVDAGVLHVRPILDMKDPTQAALIRPISDA 480

Query: 480  VAALTKRYGGLLWGEHGKGLRSEYVPEYFGELYPALQRLKGAFDPHNQLNPGKICTP--L 537
            VAALT++YGGLLWGEHGKGLRS+YVP+YFG+LYPALQ LK AFDP NQLNPGKI TP  L
Sbjct: 481  VAALTQQYGGLLWGEHGKGLRSQYVPDYFGDLYPALQALKAAFDPFNQLNPGKIATPNTL 540

Query: 538  GSAEGLTPVDGVTLRGDLDRTIDERVWQDFPSAVHCNGNGACYNYDPNDAMCPSWKATRE 597
              A  LT VD V +RG LDRTIDERVWQ + +AVHCNGNGACYN+DP+DAMCPSWKATR 
Sbjct: 541  PGAR-LTRVDEVQMRGALDRTIDERVWQSYDTAVHCNGNGACYNFDPDDAMCPSWKATRN 599

Query: 598  RQHSPKGRASLMREWLRLQGEANIDVLAAAR--NKVSWLKGLPARLRNNRARNQGQEDFS 655
            R HSPKGRASL+REWLRLQGE  +DVLAA+          G+  R  N+ AR  GQ DFS
Sbjct: 600  RIHSPKGRASLIREWLRLQGEQGVDVLAASERVRGAGGAPGVARRAVNSLARKMGQPDFS 659

Query: 656  HEVYDAMAGCLACKSCAGQCPIKVNVPDFRSRFLELYHGRYQRPLRDYLIGSLEFTIPYL 715
            HEVY+AM+GCLACKSCAGQCP+KVNVP+FRSRFLELYH RY RPL+DYLIGSLE+T+PYL
Sbjct: 660  HEVYEAMSGCLACKSCAGQCPVKVNVPEFRSRFLELYHSRYLRPLKDYLIGSLEYTVPYL 719

Query: 716  AHAPGLYNAVMGSKWVSQLLADKVGMVDSPLISRFNFQATLTRCRVGMATVPALRELTPA 775
            A  P LYN VM ++ +   L    GMVDSPL+S  +F A   R  V +AT   L  L+  
Sbjct: 720  AKMPRLYNFVMSARPIRWGLERVAGMVDSPLLSLVDFAAVCRRWNVEVATPERLEALSAE 779

Query: 776  QRERSIVLVQDAFTRYFETPLLSAFIDLAHRLGHRVFLAPYSANGKPLHVQGFLGAFAKA 835
            QRE+S++LVQDAFTR+FETPLL  +++L  RLG+RVFLAPY+ NGKPL VQGFL AF KA
Sbjct: 780  QREKSVILVQDAFTRFFETPLLVDWVELISRLGYRVFLAPYAPNGKPLQVQGFLKAFEKA 839

Query: 836  AIRNATQLKALADCGVPLVGLDPAMTLVYRQEYQKVPGLEGCPKVLLPQEWLMDVLPEQA 895
            A  N   L +L    VPLVGLDPAMTLVYRQEY K+      P+V+L QEWLM+ LPE A
Sbjct: 840  ASFNGQTLLSLQQYRVPLVGLDPAMTLVYRQEYAKILDAGNAPQVMLAQEWLMNALPEDA 899

Query: 896  PAAPGS-FRLMAHCTEKTNVPASTRQWEQVFARLGLKLVTEATGCCGMSGTYGHEARNQE 954
                   +  + HCTEKTN P S  QW+++F R+GL L  +A+GCCGMSGTYGHE RN  
Sbjct: 900  AVEEQEPYHFLPHCTEKTNEPGSIGQWQKIFQRMGLTLQVQASGCCGMSGTYGHETRNAG 959

Query: 955  TSRTIFEQSW---ATKLDKDGEPLATGYSCRSQVKRMTERKMRHPLEVVL 1001
            TS TI+ QSW     KL++ G  +A GYSCRSQVKR     + HPL+V+L
Sbjct: 960  TSDTIYGQSWKPLLQKLNRSGRVVADGYSCRSQVKRQEGVAVLHPLQVLL 1009


Lambda     K      H
   0.320    0.135    0.409 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 2756
Number of extensions: 97
Number of successful extensions: 5
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 1006
Length of database: 1015
Length adjustment: 45
Effective length of query: 961
Effective length of database: 970
Effective search space:   932170
Effective search space used:   932170
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.8 bits)
S2: 57 (26.6 bits)
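
The statistics in this footer can be cross-checked against the reported score. The sketch below is not part of GapMind or BLAST. It assumes the standard Karlin-Altschul relations, bits = (Lambda * raw - ln K) / ln 2 and E = (effective search space) * 2^(-bits), and the helper names are hypothetical. The gapped Lambda and K are assumed to apply to the final score and to S2, and the ungapped pair to S1.

```python
import math

# Values copied from the report above.
UNGAPPED = {"lam": 0.320, "k": 0.135}
GAPPED = {"lam": 0.267, "k": 0.0410}


def bit_score(raw_score, lam, k):
    """Convert a raw alignment score to bits (Karlin-Altschul normalization)."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def effective_search_space(query_len, db_len, length_adjustment, num_seqs=1):
    """Product of the effective query and database lengths."""
    return (query_len - length_adjustment) * (db_len - num_seqs * length_adjustment)


def log10_expect(bits, search_space):
    """log10 of the E-value, computed in log space to avoid float underflow."""
    return math.log10(search_space) - bits * math.log10(2)


bits = bit_score(3566, **GAPPED)
space = effective_search_space(1006, 1015, 45)
print(round(bits), space, log10_expect(bits, space))
```

With these inputs the bit score rounds to 1378 and the effective search space is 932170, both matching the report. The log10 of the E-value is far below the smallest value a double-precision float can represent, which is consistent with the printed Expect = 0.0.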

This GapMind analysis is from Sep 17 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."
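
As a rough illustration of how identity and coverage can be read off an alignment such as the one at the top of this page, the sketch below parses the BLAST summary line and computes coverage. It is a sketch under assumptions, not GapMind's code: the function names are hypothetical, and coverage is assumed to mean the fraction of the characterized protein's length that lies inside the aligned range. The printed percentages appear to be truncated rather than rounded (796/1010 is 78.8% but is shown as 78%), so the sketch uses integer division.

```python
import re

# Summary line copied from the alignment report on this page.
STATS_LINE = " Identities = 699/1010 (69%), Positives = 796/1010 (78%), Gaps = 10/1010 (0%)"


def parse_alignment_stats(line):
    """Return {name: (count, alignment_length)} for each 'Name = a/b' field."""
    fields = re.findall(r"(\w+) = (\d+)/(\d+)", line)
    return {name.lower(): (int(count), int(total)) for name, count, total in fields}


def truncated_percent(count, total):
    """Integer percentage, truncated the way the report appears to print it."""
    return count * 100 // total


def coverage(first, last, length):
    """Fraction of a sequence of the given length spanned by positions first..last."""
    return (last - first + 1) / length


stats = parse_alignment_stats(STATS_LINE)
query_coverage = coverage(1, 1001, 1006)  # query positions 1-1001 of 1006 letters
print(truncated_percent(*stats["identities"]), round(query_coverage, 3))
```

For this candidate the alignment spans query positions 1 to 1001 of 1006, so coverage is about 99.5%, well above the 50% mentioned above.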

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
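
To make "six-frame translation" concrete, here is a minimal sketch that translates a DNA string in all three offsets on both strands. It is not how Curated BLAST is implemented. It assumes the standard genetic code, marks stop codons as `*`, writes codons with ambiguous bases as `X`, and all names are hypothetical.

```python
# Standard genetic code, listed with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def reverse_complement(dna):
    """Complement each base, then reverse the strand."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons only; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translation(dna):
    """Return translations keyed by frame: +1, +2, +3, -1, -2, -3."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


print(six_frame_translation("ATGGCC"))
```

Searching these translations instead of the predicted proteins can recover genes that were missed or truncated by gene calling.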

For more information, see the paper from 2019 on GapMind for amino acid biosynthesis, the preprint on GapMind for carbon sources, or view the source code.

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory