GapMind for catabolism of small carbon sources

 

Alignments for a candidate for rocA in Pseudarthrobacter sulfonivorans Ar51

Align L-glutamate gamma-semialdehyde dehydrogenase (EC 1.2.1.88); Proline dehydrogenase (EC 1.5.5.2) (characterized)
to candidate WP_058930844.1 AU252_RS11615 bifunctional proline dehydrogenase/L-glutamate gamma-semialdehyde dehydrogenase

Query= reanno::Marino:GFF2744
         (1209 letters)



>NCBI__GCF_001484605.1:WP_058930844.1
          Length = 1165

 Score =  247 bits (630), Expect = 5e-69
 Identities = 252/903 (27%), Positives = 393/903 (43%), Gaps = 104/903 (11%)

Query: 147  VGETLRKLLKRFGEPVIRTVAGQAMKEMGRQFVLGRDIDEAQDEAKEYMAKGYTYSYDML 206
            +G T+  +L +   P+ R V    ++EM    ++     +      +    G   + ++L
Sbjct: 101  LGGTMAPVLPQVVIPIARRV----LREMVGHLIVDATDAKLGPAIAKIRKDGIKLNVNLL 156

Query: 207  GEAARTDDDAKRYYDSYSNAIDSIAKASKGDVRKNPGISVKLSALLARYEYGNKERVMNE 266
            GEA   + +A R        +      ++ DV     +S+K+S+ +A +     +  +  
Sbjct: 157  GEAVLGEHEASRRLAGTHTLL------ARPDVDY---VSIKVSSTVAPHSAWAFDEAVEH 207

Query: 267  LLPRARELVKKAA-----------AANMGF-NIDAEEQDRLDLSLDVIEELVADPELAGW 314
            ++ +   L  KAA           A N  F N+D EE   LD+++ V   ++  PE    
Sbjct: 208  VVEKLTPLFTKAASFAGAASGAGAATNAKFINLDMEEYKDLDMTIAVFTRILDKPEFKNL 267

Query: 315  DGFGVVVQAYGKRSSFVL----DWLYGLAEKYDRKFMVRLVKGAYWDAEIKRAQVMGLNG 370
            +  G+V+QAY   +   +    DW             VR+VKGA    E   +    L+ 
Sbjct: 268  EA-GIVLQAYLPDALSAMIRLQDWAAERRANGGAAIKVRVVKGANLPMEQVESS---LHD 323

Query: 371  FPVFT--RKACSDVSFLSCATKLLN--MTNRIYPQFATHNAHSVSAILEMAKTKGVDN-Y 425
            +P+ T   K  SD ++       L+    N +    A HN   ++    +AK +GV++  
Sbjct: 324  WPLATWHTKQDSDTNYKRVINYSLHPDRINNVRIGVAGHNLFDIAFAWLLAKQRGVESGI 383

Query: 426  EFQRLHGMGESLHNEVLKVSGVPCRIYAPVGPHKDL---LAYLVRRLLENGANSSFV--- 479
            EF+ L GM +     V K  G    +Y PV    +    +AYL+RRL E  +  +F+   
Sbjct: 384  EFEMLLGMAQGQAEAVKKDVG-SLLLYTPVVHPAEFDVAIAYLIRRLEEGASQDNFMSAV 442

Query: 480  -----NQIVDKRITPEEIAKDPIVSVEEMGNNISSKAIVHPFKLFGDQRRNSKGWDITDP 534
                 NQ++ +R     +A    +  E    N      + P  L  D   N+   D + P
Sbjct: 443  FELSENQVLFEREKQRFLASLDTLDDEVPPANRQQNRSLPPQPLPRDTFANTPDTDPSLP 502

Query: 535  VTVNEIEKGRGAYKDYRWKGGPLIAGEVAGTEIQVVRNPADPDDLVGH--VTQASDADVD 592
                          +  W  G  I   V             P   +G+  V  A+ +D D
Sbjct: 503  A-------------NRTW--GRAILDRV-------------PTSTLGNAAVDAATISDAD 534

Query: 593  TAITSAAAAFE---SWSAKSAEERAACVRKVGDLYEENYAELFALTTREAGKSLLDAVAE 649
            T  T  A A E   +W A + ++RA  + + GD+ E   A+L  +   E GK++     E
Sbjct: 535  TLNTVIATAVEKGKAWGALTGDQRAEILHRAGDVLEARRADLLEVMASETGKTIDQGDPE 594

Query: 650  IREAVDFSQYYANEAIRYKDSGDARGVMCCIS----PWNFPLAIFTGQILANLAAGNTVV 705
            + EAVDF+ YYA  A + +    A  V   ++    PWNFP+AI  G  LA LAAG+ VV
Sbjct: 595  VSEAVDFAHYYAESARKLEKVDGATFVPAKLTVVTPPWNFPVAIPAGSTLAALAAGSAVV 654

Query: 706  AKPAEQTSLLAIRAVELMHQAGIPKDAIQLVPGTGATVGAALTSDSRVSGVCFTGSTATA 765
             KPA+Q        +E + +AG+PKD + +V      +G  L S   V  V  TG   TA
Sbjct: 655  IKPAKQARRSGAVMIEALWEAGVPKDVLTMVQLGERELGQQLISHPSVDRVILTGGYETA 714

Query: 766  QRINKVMTENMAPDAPLVAETGGLNAMIVDSTALPEQVVRDVLASSFQSAGQRCSALRML 825
            +     +  +   D PL+AET G NA+IV  +A  +   +DV  S+F  AGQ+CSA  ++
Sbjct: 715  E-----LFRSFRKDLPLLAETSGKNAIIVTPSADLDLAAKDVAYSAFGHAGQKCSAASLV 769

Query: 826  YVQRDIADG--LLEMLYGAMEELGIGDPWLLSTDVGPVIDENARKKIVDHCEKFERNGKL 883
             +   +A        L  A+  L +G P   ++ +GP+I+    K +       E     
Sbjct: 770  ILVGSVAKSKRFHNQLVDAVTSLKVGYPQDPTSQMGPIIEPADGKLLNALTTLGEGETWA 829

Query: 884  LKKMKVPEKGLFVSPAVLSVSGIEELE----EEIFGPVLHVATFEAKNIDKVVDDINAKG 939
            ++  K+ E G   SP V    G+         E FGPVL V T  A  +++ +   N   
Sbjct: 830  VEPKKLDETGRLWSPGVR--YGVRRGSYFHLTEFFGPVLGVMT--ADTLEEAIAIQNQIE 885

Query: 940  YGLTFGIHSRVDRRVERITSRIKVGNTYVNRNQIGAIVGSQPFGG--EGLSGTGPKAGGP 997
            YGLT G+HS     +      ++ GN YVNR   GAIV  QPFGG  +   G G KAGGP
Sbjct: 886  YGLTAGLHSLNPDELGTWLDSVQAGNLYVNRGITGAIVQRQPFGGWKKSAVGAGTKAGGP 945

Query: 998  QYV 1000
             Y+
Sbjct: 946  NYL 948


Lambda     K      H
   0.316    0.133    0.378 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 2897
Number of extensions: 133
Number of successful extensions: 7
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 2
Number of HSP's successfully gapped: 2
Length of query: 1209
Length of database: 1165
Length adjustment: 47
Effective length of query: 1162
Effective length of database: 1118
Effective search space:  1299116
Effective search space used:  1299116
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.6 bits)
S2: 59 (27.3 bits)
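
The statistics above can be cross-checked against the reported score and E-value. The sketch below assumes the standard Karlin-Altschul relations (bit score = (lambda * S - ln K) / ln 2, and E = search space * 2^(-bit score)) and uses the gapped Lambda and K values printed in this report. The function names are illustrative and are not part of BLAST or GapMind. Because the report prints Lambda and K to only a few digits, the result should match the reported values only approximately.

```python
import math

def bit_score(raw_score, lam, k):
    """Convert a raw alignment score to a bit score (Karlin-Altschul)."""
    return (lam * raw_score - math.log(k)) / math.log(2)

def expect_value(bits, search_space):
    """Expected number of chance hits with at least this bit score."""
    return search_space * 2.0 ** (-bits)

# Values copied from the report above.
QUERY_LENGTH = 1209
DATABASE_LENGTH = 1165
LENGTH_ADJUSTMENT = 47
RAW_SCORE = 630          # the number in parentheses after "247 bits"
LAMBDA_GAPPED = 0.267
K_GAPPED = 0.0410

# Effective lengths are the raw lengths minus the length adjustment.
effective_query = QUERY_LENGTH - LENGTH_ADJUSTMENT
effective_database = DATABASE_LENGTH - LENGTH_ADJUSTMENT
search_space = effective_query * effective_database

bits = bit_score(RAW_SCORE, LAMBDA_GAPPED, K_GAPPED)
evalue = expect_value(bits, search_space)

print(f"effective search space: {search_space}")
print(f"bit score: {bits:.1f}")
print(f"E-value: {evalue:.1e}")
```

The computed search space reproduces the "Effective search space" line exactly, and the bit score and E-value land close to the reported 247 bits and 5e-69.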

This GapMind analysis is from Apr 09 2024. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."
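
As an illustration of the coverage threshold, the sketch below computes identity and coverage for the alignment shown on this page. It assumes that "coverage" means the fraction of the characterized (query) protein spanned by the alignment; this page does not define the term, so treat that as an assumption. The helper names are hypothetical, and the coordinates are read from the first and last `Query:` and `Sbjct:` lines of the alignment.

```python
import re

# Summary line copied from the alignment above.
SUMMARY = "Identities = 252/903 (27%), Positives = 393/903 (43%), Gaps = 104/903 (11%)"

def parse_counts(line):
    """Return {'Identities': (252, 903), ...} from a BLAST summary line."""
    pattern = r"(\w+) = (\d+)/(\d+)"
    return {name: (int(n), int(total)) for name, n, total in re.findall(pattern, line)}

def coverage(start, end, length):
    """Fraction of a sequence spanned by an alignment (1-based, inclusive)."""
    return (end - start + 1) / length

counts = parse_counts(SUMMARY)
matches, aligned_columns = counts["Identities"]
identity = matches / aligned_columns

# Alignment spans read from the report: query 147-1000 of 1209,
# subject 101-948 of 1165.
query_coverage = coverage(147, 1000, 1209)
subject_coverage = coverage(101, 948, 1165)

# Threshold stated in the text: at least 50% coverage.
LOW_CONFIDENCE_MIN_COVERAGE = 0.50
meets_coverage_threshold = query_coverage >= LOW_CONFIDENCE_MIN_COVERAGE

print(f"identity: {identity:.1%}")
print(f"query coverage: {query_coverage:.1%}")
print(f"subject coverage: {subject_coverage:.1%}")
print(f"meets 50% coverage threshold: {meets_coverage_threshold}")
```

Under that assumption, this alignment covers roughly 71% of the characterized protein, so it clears the 50% coverage threshold. Whether it qualifies for a higher confidence level depends on the criteria for those levels, which this sketch does not attempt to encode.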

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory