GapMind for catabolism of small carbon sources

 

Alignments for a candidate for astD in Amantichitinum ursilacus IGB-41

Align Succinylglutamic semialdehyde dehydrogenase (EC 1.2.1.71) (characterized)
to candidate WP_053935844.1 WG78_RS00590 succinylglutamate-semialdehyde dehydrogenase

Query= reanno::pseudo13_GW456_L13:PfGW456L13_1974
         (488 letters)



>NCBI__GCF_001294205.1:WP_053935844.1
          Length = 485

 Score =  644 bits (1660), Expect = 0.0
 Identities = 318/484 (65%), Positives = 385/484 (79%)

Query: 4   LYIAGEWLAGGGEAFESLNPVTQQVLWSGVGATAGQVESAVQAARQAFPDWARRTLEERI 63
           L I GEW AG GE ++S NPVTQ ++W G  A A +V++AV  AR AF  WAR+  E R+
Sbjct: 2   LMINGEWRAGSGERWQSRNPVTQHIVWDGQAANAAEVDAAVANARAAFKSWARQEPEARL 61

Query: 64  SVLEAFAAALKNHADELAHTIGEETGKPLWEAATEVTSMVNKIAISVQSYRERTGEKSGP 123
           +V   FA  LK +   LA TIG ETGKP WEA TEVTSM+NK+ IS+++  ERTG K   
Sbjct: 62  AVARNFAELLKTNQAMLADTIGLETGKPRWEALTEVTSMINKVDISIRALDERTGSKEAT 121

Query: 124 LGDATAVLRHKPHGVVAVFGPYNFPGHLPNGHIVPALLAGNSVLFKPSELTPKVAELTVK 183
            GDA AVLRH+PHGV+AVFGPYNFPGHLPNGHIVPAL+AGN V+FKPSEL P VA+ T +
Sbjct: 122 QGDALAVLRHRPHGVLAVFGPYNFPGHLPNGHIVPALIAGNCVVFKPSELAPLVAQKTAE 181

Query: 184 CWIEAGLPAGVLNLLQGARETGIALAANPGIDGLFFTGSSRTGNHLHQQFAGRPDKILAL 243
            W+ AGLPAGVLNLLQG R+TGIAL+ + GIDGL FTGS+ TG HLH+QFAG+PDK+LAL
Sbjct: 182 LWLAAGLPAGVLNLLQGGRDTGIALSKHAGIDGLLFTGSATTGYHLHRQFAGQPDKMLAL 241

Query: 244 EMGGNNPLVVDQVADLDAAVYTIIQSAFISAGQRCTCARRLLVPQGAWGDSLLARLVAVS 303
           EMGGNNPL+V+QVAD++AA++ ++QSAF+SAGQRCTCARRLLVPQG+WGD+ + RL  V+
Sbjct: 242 EMGGNNPLIVEQVADVNAALHHVVQSAFVSAGQRCTCARRLLVPQGSWGDAFIDRLSEVT 301

Query: 304 STLSVGAFDQQPAPFMGSVVSLGAAKALMDAQEHLLANGAVALLEMTQPQAQSALLTPGI 363
             L+VGA+D +P PFMG+V+SL AA+ L+ AQ  L A GA ++L M +    +ALL+PGI
Sbjct: 302 RNLTVGAWDAEPQPFMGAVISLHAAEQLLKAQTELHAMGAASILTMRRLVEGTALLSPGI 361

Query: 364 LDVSAVADRPDEELFGPLLQVIRYADFEAAIAEANDTAYGLAAGLLSDSEARYQQFWLES 423
           LDV+ VA  PD+E FGPLLQV RYADF  AI  AN T YGLAAGLLSD EA+Y+ FWLES
Sbjct: 362 LDVTHVAHLPDDEYFGPLLQVQRYADFNEAITLANRTRYGLAAGLLSDDEAQYRTFWLES 421

Query: 424 RAGIVNWNKQLTGAASSAPFGGVGASGNHRASAYYAADYCAYPVASLETPSLVLPSALTP 483
           RAGIVNWNK LTGA+S+APFGGVGASGNHR SA+YAADYCA+PVASLE+ +L LP  L P
Sbjct: 422 RAGIVNWNKPLTGASSAAPFGGVGASGNHRPSAWYAADYCAWPVASLESNALTLPGQLPP 481

Query: 484 GVKM 487
           G+ +
Sbjct: 482 GLTL 485


Lambda     K      H
   0.316    0.132    0.388 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 750
Number of extensions: 28
Number of successful extensions: 1
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 488
Length of database: 485
Length adjustment: 34
Effective length of query: 454
Effective length of database: 451
Effective search space:   204754
Effective search space used:   204754
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.6 bits)
S2: 52 (24.6 bits)
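
The summary figures in the BLAST report above can be cross-checked with a little arithmetic. The following Python sketch recomputes the identity and positive percentages, the fraction of the query covered by the alignment, and the effective search space from numbers copied out of the report. The variable names are illustrative, and rounding the percentages down is an assumption that happens to reproduce the printed 65% and 79%.

```python
# Values copied from the BLAST report above.
identities, positives, aligned_columns = 318, 385, 484
query_len, subject_len, length_adjustment = 488, 485, 34
query_start, query_end = 4, 487  # first and last aligned query positions


def percent_floor(numerator, denominator):
    """Integer percentage, rounded down (assumed to match the report)."""
    return numerator * 100 // denominator


identity_pct = percent_floor(identities, aligned_columns)
positive_pct = percent_floor(positives, aligned_columns)

# Fraction of the query sequence that falls inside the alignment.
query_coverage = (query_end - query_start + 1) / query_len

# Effective lengths are the raw lengths minus the length adjustment;
# their product is the effective search space.
effective_search_space = (query_len - length_adjustment) * (
    subject_len - length_adjustment
)

print(identity_pct, positive_pct, round(query_coverage, 3), effective_search_space)
```

With these inputs, the effective lengths are 454 and 451, and their product matches the "Effective search space" line of the report.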

Align candidate WP_053935844.1 WG78_RS00590 (succinylglutamate-semialdehyde dehydrogenase)
to HMM TIGR03240 (astD: succinylglutamate-semialdehyde dehydrogenase (EC 1.2.1.71))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR03240.hmm
# target sequence database:        /tmp/gapView.2952462.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR03240  [M=484]
Accession:   TIGR03240
Description: arg_catab_astD: succinylglutamate-semialdehyde dehydrogenase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                             Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                             -----------
   6.3e-252  822.4   1.0   7.1e-252  822.2   1.0    1.0  1  NCBI__GCF_001294205.1:WP_053935844.1  


Domain annotation for each sequence (and alignments):
>> NCBI__GCF_001294205.1:WP_053935844.1  
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  822.2   1.0  7.1e-252  7.1e-252       2     483 ..       3     484 ..       2     485 .] 1.00

  Alignments for each domain:
  == domain 1  score: 822.2 bits;  conditional E-value: 7.1e-252
                             TIGR03240   2 fidGkwraGqGeslesldpvtqevlwqgkaasaaqvekavkaarkafpawarlsleeriavvkrfaelleeek 74 
                                            i+G+wraG+Ge  +s++pvtq+ +w+g+aa+aa+v++av+ ar+af++war + e+r+av ++faell+ ++
  NCBI__GCF_001294205.1:WP_053935844.1   3 MINGEWRAGSGERWQSRNPVTQHIVWDGQAANAAEVDAAVANARAAFKSWARQEPEARLAVARNFAELLKTNQ 75 
                                           69*********************************************************************** PP

                             TIGR03240  75 eelaeviaketgkplweartevasmvakvaisikayeertGekeseladakavlrhrphGvlavfGpynfpGh 147
                                           + la++i+ etgkp+wea tev+sm++kv+isi+a++ertG+ke++ +da avlrhrphGvlavfGpynfpGh
  NCBI__GCF_001294205.1:WP_053935844.1  76 AMLADTIGLETGKPRWEALTEVTSMINKVDISIRALDERTGSKEATQGDALAVLRHRPHGVLAVFGPYNFPGH 148
                                           ************************************************************************* PP

                             TIGR03240 148 lpnGhivpallaGntvvfkpseltplvaeetvklwekaGlpaGvlnlvqGaretGkalaaeedidGllftGss 220
                                           lpnGhivpal+aGn+vvfkpsel+plva++t +lw +aGlpaGvlnl+qG+r+tG al+++ +idGllftGs+
  NCBI__GCF_001294205.1:WP_053935844.1 149 LPNGHIVPALIAGNCVVFKPSELAPLVAQKTAELWLAAGLPAGVLNLLQGGRDTGIALSKHAGIDGLLFTGSA 221
                                           ************************************************************************* PP

                             TIGR03240 221 ntGallhrqlagrpekilalelGGnnplvveevkdidaavhlivqsafisaGqrctcarrllvkdgaeGdall 293
                                           +tG++lhrq+ag+p+k+lale+GGnnpl+ve+v+d++aa h++vqsaf+saGqrctcarrllv++g++Gda++
  NCBI__GCF_001294205.1:WP_053935844.1 222 TTGYHLHRQFAGQPDKMLALEMGGNNPLIVEQVADVNAALHHVVQSAFVSAGQRCTCARRLLVPQGSWGDAFI 294
                                           ************************************************************************* PP

                             TIGR03240 294 erlvevaerltvgkydaepqpflGavisekaakellaaqekllalggksllelkqleeeaalltpgiidvtev 366
                                           +rl ev+++ltvg++daepqpf+Gavis +aa++ll+aq++l+a+g++s+l++++l e++all+pgi+dvt+v
  NCBI__GCF_001294205.1:WP_053935844.1 295 DRLSEVTRNLTVGAWDAEPQPFMGAVISLHAAEQLLKAQTELHAMGAASILTMRRLVEGTALLSPGILDVTHV 367
                                           ************************************************************************* PP

                             TIGR03240 367 aevpdeeyfgpllkvlrykdfdealaeanntrfGlaaGllsddrelydkflleiraGivnwnkpltGassaap 439
                                           a++pd+eyfgpll+v ry+df+ea++ an tr+GlaaGllsdd+++y++f+le+raGivnwnkpltGassaap
  NCBI__GCF_001294205.1:WP_053935844.1 368 AHLPDDEYFGPLLQVQRYADFNEAITLANRTRYGLAAGLLSDDEAQYRTFWLESRAGIVNWNKPLTGASSAAP 440
                                           ************************************************************************* PP

                             TIGR03240 440 fGGiGasGnhrpsayyaadycaypvasleadslalpatlspGlk 483
                                           fGG+GasGnhrpsa+yaadyca+pvasle++ l+lp +l pGl+
  NCBI__GCF_001294205.1:WP_053935844.1 441 FGGVGASGNHRPSAWYAADYCAWPVASLESNALTLPGQLPPGLT 484
                                           ******************************************97 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (484 nodes)
Target sequences:                          1  (485 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.00u 0.00s 00:00:00.00 Elapsed: 00:00:00.00
# Mc/sec: 28.75
//
[ok]
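
The domain table above also shows how much of the model and of the candidate the hit spans. As a minimal sketch, the Python below splits the single domain row on whitespace and computes both coverages, taking the model length from the "[M=484]" line and the candidate length from the "Length = 485" line. The field positions are read off this one row and are an assumption, not a general hmmsearch parser.

```python
# The single domain row from the hmmsearch output above.
domain_row = (
    "1 !  822.2   1.0  7.1e-252  7.1e-252       2     483 .."
    "       3     484 ..       2     485 .] 1.00"
)
model_length = 484   # from "Query: TIGR03240 [M=484]"
target_length = 485  # from "Length = 485"

fields = domain_row.split()
# Field order in this row: index, '!', score, bias, c-Evalue, i-Evalue,
# hmmfrom, hmmto, '..', alifrom, alito, '..', envfrom, envto, '.]', acc
score = float(fields[2])
hmm_from, hmm_to = int(fields[6]), int(fields[7])
ali_from, ali_to = int(fields[9]), int(fields[10])

# Fraction of the model and of the candidate spanned by the alignment.
hmm_coverage = (hmm_to - hmm_from + 1) / model_length
seq_coverage = (ali_to - ali_from + 1) / target_length

print(score, round(hmm_coverage, 4), round(seq_coverage, 4))
```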

This GapMind analysis is from Sep 24 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
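
The definition of a gap given above (a step with no high- or medium-confidence candidate) can be sketched in a few lines of Python. This is an illustration only and not GapMind's own code: the function name, the dictionary layout, and the steps "stepX" and "stepY" are hypothetical, and the confidence labels are assumed to have been assigned already.

```python
def find_gaps(candidates_by_step):
    """Return the steps that have no high- or medium-confidence candidate."""
    return [
        step
        for step, confidences in candidates_by_step.items()
        if not any(level in ("high", "medium") for level in confidences)
    ]


# Hypothetical input: each step maps to the confidence labels of its candidates.
example = {
    "astD": ["high"],   # at least one high-confidence candidate: not a gap
    "stepX": ["low"],   # only a low-confidence candidate: a gap
    "stepY": [],        # no candidates at all: a gap
}

gaps = find_gaps(example)
print(gaps)
```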

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory