GapMind for catabolism of small carbon sources

Alignments for a candidate for astD in Sphingomonas koreensis DSMZ 15582

Align N-succinylglutamate 5-semialdehyde dehydrogenase; Succinylglutamic semialdehyde dehydrogenase; SGSD; EC 1.2.1.71 (characterized)
to candidate Ga0059261_4132 Ga0059261_4132 succinylglutamic semialdehyde dehydrogenase (EC 1.2.1.71)

Query= SwissProt::Q8ZPV0
         (492 letters)



>FitnessBrowser__Korea:Ga0059261_4132
          Length = 471

 Score =  472 bits (1214), Expect = e-137
 Identities = 247/457 (54%), Positives = 314/457 (68%), Gaps = 6/457 (1%)

Query: 20  TNPVSAEILWQGNDANAAQVAEACQAARAAFPRWARQPFAARQAIVEKFAALLEAHKAEL 79
           TNP + E LW+G  A+A   A A + AR AFP WA Q   AR AI+ ++AA+L   K  L
Sbjct: 8   TNPATGETLWEGPVASAEDCARAVERARTAFPAWAAQSPDARAAILTRYAAVLGERKDAL 67

Query: 80  TEVIARETGKPRWEAATEVTAMINKIAISIKAYHARTGAQKSELVDGAATLRHRPHGVLA 139
            E IARETGKP WE ATEV +MI K+AISI+A  AR G ++S +  G A L HRPHGV+A
Sbjct: 68  AEAIARETGKPLWETATEVASMIGKVAISIEAMAARAGTRESAMPFGRAVLAHRPHGVMA 127

Query: 140 VFGPYNFPGHLPNGHIVPALLAGNTLIFKPSELTPWTGETVIKLWERAGLPAGVLNLVQG 199
           V GPYNFPGHLPNGHIVPALLAGNTL+FKPSE TP  G+ +++    AG+P  V  L+QG
Sbjct: 128 VLGPYNFPGHLPNGHIVPALLAGNTLVFKPSEETPLVGQLMVEALHAAGIPEDVAILLQG 187

Query: 200 GRETGQALSSLDDLDGLLFTGSASTGYQLHRQLSGQPEKILALEMGGNNPLIIEDAANID 259
           GRETG AL S  D+DGLLFTGSA  G    R  + +P  ILALE+GGNNPL+  D    +
Sbjct: 188 GRETGAALVS-QDIDGLLFTGSAGAGMHFRRSFAERPAVILALELGGNNPLVAWD-GEPE 245

Query: 260 AAVHLTLQSAFITAGQRCTCARRLLVKQGAQGDAFLARLVDVAGRLQPGRWDDDPQPFIG 319
           A   + + S FIT GQRC+CARRL+V +GA GDA +  +  ++ RL+ GRWD+ P+PF+G
Sbjct: 246 AVASIVVASTFITTGQRCSCARRLIVPEGAAGDAIVDAVAALSDRLRIGRWDETPEPFMG 305

Query: 320 GLISAQAAQHVMEAWRQREALGGRTLLAPRKVKEGTS--LLTPGIIELTGVADVPDEEVF 377
            L+S  AA+           LG R  + P    EG S   + P I+++TG+ +VPDEE+F
Sbjct: 306 PLVSTGAAERAAAQVAALVGLGARE-IRPFGGVEGRSGAFVRPAILDVTGL-EVPDEEIF 363

Query: 378 GPLLNVWRYAHFDEAIRLANNTRFGLSCGLVSTDRAQFEQLLLEARAGIVNWNKPLTGAA 437
            P+L V R A FD A+  AN TRFGL+ GL+S D A + +   +ARAG+VN N+P TGAA
Sbjct: 364 APVLQVRRVADFDAALAAANQTRFGLAAGLISDDDALWARFQAQARAGVVNRNRPTTGAA 423

Query: 438 STAPFGGVGASGNHRPSAWYAADYCAWPMASLESPEL 474
           S+ PFGG+G SGNHRPSA+YAADYCA+P+ASLE+  +
Sbjct: 424 SSMPFGGLGDSGNHRPSAYYAADYCAYPVASLEAERI 460


Lambda     K      H
   0.319    0.134    0.411 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 711
Number of extensions: 32
Number of successful extensions: 4
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 492
Length of database: 471
Length adjustment: 34
Effective length of query: 458
Effective length of database: 437
Effective search space:   200146
Effective search space used:   200146
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 52 (24.6 bits)
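
The summary numbers in the BLAST report above can be cross-checked with a few lines of arithmetic. The sketch below is not part of GapMind. It assumes the standard Karlin-Altschul relations (bit score S' = (lambda * S - ln K) / ln 2 and E = search space * 2^(-S')), and it takes the gapped Lambda and K, the raw score, the lengths, and the length adjustment from the report. The variable names are illustrative.

```python
import math

# Values copied from the BLAST report above.
raw_score = 1214                 # raw score shown in parentheses after the bit score
lam, k = 0.267, 0.0410           # gapped Lambda and K
query_len, db_len = 492, 471     # "Length of query" and "Length of database"
length_adjustment = 34           # "Length adjustment"

# Effective lengths and effective search space.
eff_query = query_len - length_adjustment
eff_db = db_len - length_adjustment
search_space = eff_query * eff_db

# Bit score from the raw score (assumed Karlin-Altschul normalization).
bit_score = (lam * raw_score - math.log(k)) / math.log(2)

# E-value on a log10 scale: log10(E) = log10(search_space) - S' * log10(2).
log10_evalue = math.log10(search_space) - bit_score * math.log10(2)

print(eff_query, eff_db, search_space)
print(round(bit_score), math.floor(log10_evalue))
```

With these inputs the effective search space is 458 * 437 = 200146, the bit score rounds to 472, and the E-value falls in the 1e-137 decade, which agrees with the report.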

Align candidate Ga0059261_4132 Ga0059261_4132 (succinylglutamic semialdehyde dehydrogenase (EC 1.2.1.71))
to HMM TIGR03240 (astD: succinylglutamate-semialdehyde dehydrogenase (EC 1.2.1.71))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR03240.hmm
# target sequence database:        /tmp/gapView.15515.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR03240  [M=484]
Accession:   TIGR03240
Description: arg_catab_astD: succinylglutamate-semialdehyde dehydrogenase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                                 Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                                 -----------
   1.1e-203  663.2   3.3   1.5e-203  662.8   3.3    1.0  1  lcl|FitnessBrowser__Korea:Ga0059261_4132  Ga0059261_4132 succinylglutamic 


Domain annotation for each sequence (and alignments):
>> lcl|FitnessBrowser__Korea:Ga0059261_4132  Ga0059261_4132 succinylglutamic semialdehyde dehydrogenase (EC 1.2.1.71)
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  662.8   3.3  1.5e-203  1.5e-203      15     475 ..       5     463 ..       2     470 .. 0.98

  Alignments for each domain:
  == domain 1  score: 662.8 bits;  conditional E-value: 1.5e-203
                                 TIGR03240  15 lesldpvtqevlwqgkaasaaqvekavkaarkafpawarlsleeriavvkrfaelleeekeelaeviak 83 
                                               ++s++p+t+e+lw+g  asa++ ++av+ ar+afpawa+ s ++r+a+++r+a++l e+k++lae+ia+
  lcl|FitnessBrowser__Korea:Ga0059261_4132   5 FASTNPATGETLWEGPVASAEDCARAVERARTAFPAWAAQSPDARAAILTRYAAVLGERKDALAEAIAR 73 
                                               789****************************************************************** PP

                                 TIGR03240  84 etgkplweartevasmvakvaisikayeertGekeseladakavlrhrphGvlavfGpynfpGhlpnGh 152
                                               etgkplwe++tevasm++kvaisi+a + r+G++es+++ ++avl hrphGv+av+GpynfpGhlpnGh
  lcl|FitnessBrowser__Korea:Ga0059261_4132  74 ETGKPLWETATEVASMIGKVAISIEAMAARAGTRESAMPFGRAVLAHRPHGVMAVLGPYNFPGHLPNGH 142
                                               ********************************************************************* PP

                                 TIGR03240 153 ivpallaGntvvfkpseltplvaeetvklwekaGlpaGvlnlvqGaretGkalaaeedidGllftGssn 221
                                               ivpallaGnt+vfkpse+tplv++ +v+++++aG+p+ v  l+qG+retG al+++ didGllftGs+ 
  lcl|FitnessBrowser__Korea:Ga0059261_4132 143 IVPALLAGNTLVFKPSEETPLVGQLMVEALHAAGIPEDVAILLQGGRETGAALVSQ-DIDGLLFTGSAG 210
                                               ******************************************************98.9*********** PP

                                 TIGR03240 222 tGallhrqlagrpekilalelGGnnplvveevkdidaavhlivqsafisaGqrctcarrllvkdgaeGd 290
                                               +G++++r +a+rp +ilalelGGnnplv ++  + +a++  +v s+fi++Gqrc+carrl+v++ga Gd
  lcl|FitnessBrowser__Korea:Ga0059261_4132 211 AGMHFRRSFAERPAVILALELGGNNPLVAWDG-EPEAVASIVVASTFITTGQRCSCARRLIVPEGAAGD 278
                                               ****************************9996.589********************************* PP

                                 TIGR03240 291 allerlvevaerltvgkydaepqpflGavisekaakellaaqekllalggksllelkqlee.eaalltp 358
                                               a+++++ ++++rl++g++d+ p+pf+G+++s+ aa++  a  + l+ lg++++  +  +e  + a+++p
  lcl|FitnessBrowser__Korea:Ga0059261_4132 279 AIVDAVAALSDRLRIGRWDETPEPFMGPLVSTGAAERAAAQVAALVGLGAREIRPFGGVEGrSGAFVRP 347
                                               *************************************999999***************9984568**** PP

                                 TIGR03240 359 giidvtevaevpdeeyfgpllkvlrykdfdealaeanntrfGlaaGllsddrelydkflleiraGivnw 427
                                               +i+dvt++ evpdee f+p+l+v r++dfd+ala+an+trfGlaaGl+sdd++l+ +f  ++raG+vn 
  lcl|FitnessBrowser__Korea:Ga0059261_4132 348 AILDVTGL-EVPDEEIFAPVLQVRRVADFDAALAAANQTRFGLAAGLISDDDALWARFQAQARAGVVNR 415
                                               *******8.9*********************************************************** PP

                                 TIGR03240 428 nkpltGassaapfGGiGasGnhrpsayyaadycaypvasleadslalp 475
                                               n+p+tGa+s++pfGG+G sGnhrpsayyaadycaypvaslea+++a  
  lcl|FitnessBrowser__Korea:Ga0059261_4132 416 NRPTTGAASSMPFGGLGDSGNHRPSAYYAADYCAYPVASLEAERIADQ 463
                                               *******************************************99865 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (484 nodes)
Target sequences:                          1  (471 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.02u 0.01s 00:00:00.03 Elapsed: 00:00:00.03
# Mc/sec: 7.26
//
[ok]
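
The fraction of the HMM covered by the alignment can be read off the domain table in the hmmsearch output above. The sketch below is not GapMind's code. It splits the single domain row on whitespace and assumes the column order shown in the table header (hmmfrom, hmm to, alifrom, ali to). The model length (M=484) and the sequence length (471) are copied from the report.

```python
# Domain table row copied from the hmmsearch output above.
row = ("1 !  662.8   3.3  1.5e-203  1.5e-203      15     475 .."
       "       5     463 ..       2     470 .. 0.98")

fields = row.split()
# Column positions after splitting on whitespace:
# 0 '#', 1 '!', 2 score, 3 bias, 4 c-Evalue, 5 i-Evalue,
# 6 hmmfrom, 7 hmm to, 8 '..', 9 alifrom, 10 ali to, 11 '..', ...
hmm_from, hmm_to = int(fields[6]), int(fields[7])
ali_from, ali_to = int(fields[9]), int(fields[10])

model_len = 484   # from "Query: TIGR03240 [M=484]"
seq_len = 471     # from "Target sequences: 1 (471 residues searched)"

# Coordinates are 1-based and inclusive, hence the +1.
model_coverage = (hmm_to - hmm_from + 1) / model_len
seq_coverage = (ali_to - ali_from + 1) / seq_len

print(f"model coverage: {model_coverage:.3f}")
print(f"sequence coverage: {seq_coverage:.3f}")
```

For this candidate the alignment spans 461 of the 484 model positions, or about 95% of TIGR03240.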

This GapMind analysis is from Sep 17 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."
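
Percent identity and coverage for the BLAST alignment on this page can be computed as in the sketch below. It is illustrative only and is not GapMind's implementation. It assumes that coverage is measured against the length of the characterized protein (the query, 492 letters). The 50% threshold comes from the sentence above, and `LOW_CONFIDENCE_MIN_COVERAGE` is a hypothetical name.

```python
import re

# Summary line and coordinates copied from the BLAST alignment on this page.
summary = "Identities = 247/457 (54%), Positives = 314/457 (68%), Gaps = 6/457 (1%)"
query_start, query_end, query_len = 20, 474, 492

match = re.search(r"Identities = (\d+)/(\d+)", summary)
identical, aligned = int(match.group(1)), int(match.group(2))
percent_identity = 100 * identical / aligned

# Assumption: coverage is the aligned fraction of the characterized (query) protein.
# Coordinates are 1-based and inclusive, hence the +1.
query_coverage = (query_end - query_start + 1) / query_len

LOW_CONFIDENCE_MIN_COVERAGE = 0.50  # hypothetical name for the 50% threshold in the text
meets_low_confidence_coverage = query_coverage >= LOW_CONFIDENCE_MIN_COVERAGE

print(f"identity: {percent_identity:.1f}%")
print(f"coverage: {query_coverage:.1%}")
print(meets_low_confidence_coverage)
```

This checks only the minimum coverage for a low-confidence hit. It does not encode the high-confidence or medium-confidence criteria.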

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory