GapMind for catabolism of small carbon sources

Alignments for a candidate for putA in Pontibacillus litoralis JSM 072002

Align 1-pyrroline-5-carboxylate dehydrogenase 2; P5C dehydrogenase 2; L-glutamate gamma-semialdehyde dehydrogenase; EC 1.2.1.88 (characterized)
to candidate WP_036835911.1 N784_RS14095 L-glutamate gamma-semialdehyde dehydrogenase

Query= SwissProt::P94391
         (515 letters)



>NCBI__GCF_000775615.1:WP_036835911.1
          Length = 515

 Score =  780 bits (2013), Expect = 0.0
 Identities = 381/514 (74%), Positives = 439/514 (85%)

Query: 1   MTTPYKHEPFTNFQDQNNVEAFKKALATVSEYLGKDYPLVINGERVETEAKIVSINPADK 60
           M TPYKHEPFT+F  + N +A++  L  V  YL ++Y L++NGER++T+ KIVS NPA  
Sbjct: 1   MYTPYKHEPFTDFSIEENKKAYEAGLEKVKSYLNQEYDLIVNGERIKTDDKIVSTNPAKT 60

Query: 61  EEVVGRVSKASQEHAEQAIQAAAKAFEEWRYTSPEERAAVLFRAAAKVRRRKHEFSALLV 120
           +EVVG VSKA+QE AEQA+QAA+ AFE+WR  S E RA +LFRAAAK+RRRKHEFSALL 
Sbjct: 61  DEVVGTVSKATQEIAEQAMQAASDAFEDWRKWSAEARAGILFRAAAKIRRRKHEFSALLS 120

Query: 121 KEAGKPWNEADADTAEAIDFMEYYARQMIELAKGKPVNSREGEKNQYVYTPTGVTVVIPP 180
            E GKPW EADADTAEAIDF+EYYARQ IE+ KGK V SREGE N YVYTP GV VVIPP
Sbjct: 121 YEVGKPWKEADADTAEAIDFLEYYARQAIEVDKGKHVESREGEMNCYVYTPCGVAVVIPP 180

Query: 181 WNFLFAIMAGTTVAPIVTGNTVVLKPASATPVIAAKFVEVLEESGLPKGVVNFVPGSGAE 240
           WN   AIMAGTTVAP+VTGNTVV+KPAS +PV AAKFVEVLEESGLPKGV+NFVPGSG E
Sbjct: 181 WNLALAIMAGTTVAPLVTGNTVVMKPASNSPVTAAKFVEVLEESGLPKGVLNFVPGSGKE 240

Query: 241 VGDYLVDHPKTSLITFTGSREVGTRIFERAAKVQPGQQHLKRVIAEMGGKDTVVVDEDAD 300
           VGDYLVDHPKT+LI+FTGSR+VG RI ERAAK+QPGQ HLKRVIAEMGGKDTVVVD+ AD
Sbjct: 241 VGDYLVDHPKTALISFTGSRDVGVRIMERAAKLQPGQNHLKRVIAEMGGKDTVVVDKSAD 300

Query: 301 IELAAQSIFTSAFGFAGQKCSAGSRAVVHEKVYDQVLERVIEITESKVTAKPDSADVYMG 360
           IE A  +I  SAFGF+GQKCS+GSRAVVHE +YD+VL+RV ++T+         +++YMG
Sbjct: 301 IETAVNAIVVSAFGFSGQKCSSGSRAVVHEDIYDEVLDRVAKLTKELTVGNATESNIYMG 360

Query: 361 PVIDQGSYDKIMSYIEIGKQEGRLVSGGTGDDSKGYFIKPTIFADLDPKARLMQEEIFGP 420
           PV+DQ ++DKIMSY+EIGK+EGRLV GGTGDDS GYFI+PTIFADL P +R+ QEEIFGP
Sbjct: 361 PVVDQAAFDKIMSYMEIGKEEGRLVVGGTGDDSTGYFIQPTIFADLAPTSRMQQEEIFGP 420

Query: 421 VVAFCKVSDFDEALEVANNTEYGLTGAVITNNRKHIERAKQEFHVGNLYFNRNCTGAIVG 480
           VV F KV DFDEA+EVANNTEYGLTGAVI+ +R++IE+A ++FHVGNLYFNRNCTGAIVG
Sbjct: 421 VVCFTKVKDFDEAIEVANNTEYGLTGAVISEDRQNIEKAARDFHVGNLYFNRNCTGAIVG 480

Query: 481 YHPFGGFKMSGTDSKAGGPDYLALHMQAKTISEM 514
           Y PFGGFKMSGTDSKAGGPDYLALHMQAKTISEM
Sbjct: 481 YQPFGGFKMSGTDSKAGGPDYLALHMQAKTISEM 514


Lambda     K      H
   0.316    0.133    0.379 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 905
Number of extensions: 26
Number of successful extensions: 1
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 515
Length of database: 515
Length adjustment: 35
Effective length of query: 480
Effective length of database: 480
Effective search space:   230400
Effective search space used:   230400
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.6 bits)
S2: 52 (24.6 bits)
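
The figures in this report can be cross-checked with a little arithmetic. The following is a minimal sketch in Python, assuming the standard Karlin-Altschul relations (bit score = (lambda * S - ln K) / ln 2, and E = effective search space * 2^-bits) and using the gapped Lambda and K printed above. The function names are illustrative and are not part of BLAST or GapMind.

```python
import math

def bit_score(raw_score, lam, k):
    """Convert a raw alignment score to bits using the gapped Lambda and K."""
    return (lam * raw_score - math.log(k)) / math.log(2)

def effective_search_space(query_len, db_len, length_adjustment, n_seqs=1):
    """Shorten query and database by the length adjustment, then multiply."""
    return (query_len - length_adjustment) * (db_len - n_seqs * length_adjustment)

def expect_value(bits, search_space):
    """Expected number of chance hits scoring at least `bits`."""
    return search_space * 2.0 ** (-bits)

# Values copied from the report above.
bits = bit_score(2013, lam=0.267, k=0.0410)
space = effective_search_space(515, 515, 35)

print(round(bits))                # 780, matching "Score = 780 bits (2013)"
print(space)                      # 230400, matching "Effective search space"
print(expect_value(bits, space))  # vanishingly small, reported as "Expect = 0.0"

# Identity and positive percentages over the 514 aligned columns.
print(f"{381 / 514:.0%} identities, {439 / 514:.0%} positives")
```

Because Lambda and K are printed to only three significant figures, the recomputed bit score agrees with the report after rounding rather than exactly.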

Align candidate WP_036835911.1 N784_RS14095 (L-glutamate gamma-semialdehyde dehydrogenase)
to HMM TIGR01237 (pruA: putative delta-1-pyrroline-5-carboxylate dehydrogenase (EC 1.2.1.88))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR01237.hmm
# target sequence database:        /tmp/gapView.2446601.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR01237  [M=511]
Accession:   TIGR01237
Description: D1pyr5carbox2: putative delta-1-pyrroline-5-carboxylate dehydrogenase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                             Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                             -----------
   2.9e-260  850.2   5.9   3.3e-260  850.1   5.9    1.0  1  NCBI__GCF_000775615.1:WP_036835911.1  


Domain annotation for each sequence (and alignments):
>> NCBI__GCF_000775615.1:WP_036835911.1  
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  850.1   5.9  3.3e-260  3.3e-260       1     510 [.       5     514 ..       5     515 .] 1.00

  Alignments for each domain:
  == domain 1  score: 850.1 bits;  conditional E-value: 3.3e-260
                             TIGR01237   1 yknepftdfadeelvqafkkalakvkellGkdyplvinGeeveteakidsinpadksevvGkvakasvedaeq 73 
                                           yk+epftdf+ ee+++a+++ l kvk +l ++y l++nGe+++t++ki+s npa+++evvG+v+ka++e aeq
  NCBI__GCF_000775615.1:WP_036835911.1   5 YKHEPFTDFSIEENKKAYEAGLEKVKSYLNQEYDLIVNGERIKTDDKIVSTNPAKTDEVVGTVSKATQEIAEQ 77 
                                           9************************************************************************ PP

                             TIGR01237  74 alqaakkafeewkktdveeraaillkaaailkrrrhelsallvlevGkiyaeadaevaeaidfleyyaremik 146
                                           a+qaa  afe+w+k++ e+ra+il++aaa+++rr+he+sall+ evGk+++eada++aeaidfleyyar++i+
  NCBI__GCF_000775615.1:WP_036835911.1  78 AMQAASDAFEDWRKWSAEARAGILFRAAAKIRRRKHEFSALLSYEVGKPWKEADADTAEAIDFLEYYARQAIE 150
                                           ************************************************************************* PP

                             TIGR01237 147 lakskevlsieGeknrylyiplGvavvispwnfplailvGmtvapivtGncvvlkpaeaatviaaklveilee 219
                                           ++k+k+v s+eGe n y+y+p Gvavvi+pwn+ lai++G+tvap+vtGn+vv+kpa++++v aak+ve+lee
  NCBI__GCF_000775615.1:WP_036835911.1 151 VDKGKHVESREGEMNCYVYTPCGVAVVIPPWNLALAIMAGTTVAPLVTGNTVVMKPASNSPVTAAKFVEVLEE 223
                                           ************************************************************************* PP

                             TIGR01237 220 aGlpkGvlqfvpGkGsevGeylvdhpktrlitftGsrevGlriyedaakvqpGqkhlkrviaelGGkdavivd 292
                                           +GlpkGvl+fvpG+G+evG+ylvdhpkt li+ftGsr+vG+ri+e+aak+qpGq+hlkrviae+GGkd+v+vd
  NCBI__GCF_000775615.1:WP_036835911.1 224 SGLPKGVLNFVPGSGKEVGDYLVDHPKTALISFTGSRDVGVRIMERAAKLQPGQNHLKRVIAEMGGKDTVVVD 296
                                           ************************************************************************* PP

                             TIGR01237 293 esadieqavaaavtsafGfaGqkcsaasrvvvlekvydevverfveatkslkvgktdeadvqvgpvidqksfd 365
                                           +sadie+av+a+v safGf+Gqkcs++sr+vv+e++ydev++r+ + tk l+vg++ e+++++gpv+dq +fd
  NCBI__GCF_000775615.1:WP_036835911.1 297 KSADIETAVNAIVVSAFGFSGQKCSSGSRAVVHEDIYDEVLDRVAKLTKELTVGNATESNIYMGPVVDQAAFD 369
                                           ************************************************************************* PP

                             TIGR01237 366 kikeyielgkaegklvlggedddskGyfikptifkdvdrkarlaqeeifGpvvavlrakdfdealeianstey 438
                                           ki++y+e+gk+eg+lv+gg++dds Gyfi+ptif+d+ + +r+ qeeifGpvv++ ++kdfdea+e+an+tey
  NCBI__GCF_000775615.1:WP_036835911.1 370 KIMSYMEIGKEEGRLVVGGTGDDSTGYFIQPTIFADLAPTSRMQQEEIFGPVVCFTKVKDFDEAIEVANNTEY 442
                                           ************************************************************************* PP

                             TIGR01237 439 gltGgvisnsrerierakaefevGnlyfnrkitGaivgvqpfGGfkmsGtdskaGGpdylaqflqaktvter 510
                                           gltG+vis+ r++ie+a+ +f+vGnlyfnr++tGaivg+qpfGGfkmsGtdskaGGpdyla ++qakt++e+
  NCBI__GCF_000775615.1:WP_036835911.1 443 GLTGAVISEDRQNIEKAARDFHVGNLYFNRNCTGAIVGYQPFGGFKMSGTDSKAGGPDYLALHMQAKTISEM 514
                                           ***********************************************************************8 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (511 nodes)
Target sequences:                          1  (515 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.01u 0.01s 00:00:00.02 Elapsed: 00:00:00.01
# Mc/sec: 15.16
//
[ok]
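
To read the domain table programmatically, the single domain row above can be split on whitespace. This is a minimal sketch, assuming the column order shown in the header of that table (score, bias, c-Evalue, i-Evalue, hmmfrom, hmm to, alifrom, ali to, and so on). The `DomainHit` type and the helper names are hypothetical and are not part of HMMER or GapMind.

```python
from typing import NamedTuple

class DomainHit(NamedTuple):
    score: float
    i_evalue: float
    hmm_from: int
    hmm_to: int
    ali_from: int
    ali_to: int

def parse_domain_line(line):
    """Parse one row of the hmmsearch per-domain table (whitespace-separated)."""
    f = line.split()
    # f[0] is the domain number, f[1] the '!' or '?' significance flag.
    return DomainHit(
        score=float(f[2]),
        i_evalue=float(f[5]),
        hmm_from=int(f[6]),
        hmm_to=int(f[7]),
        ali_from=int(f[9]),
        ali_to=int(f[10]),
    )

def coverage(start, end, length):
    """Fraction of a model or sequence spanned by an inclusive interval."""
    return (end - start + 1) / length

row = "   1 !  850.1   5.9  3.3e-260  3.3e-260       1     510 [.       5     514 ..       5     515 .] 1.00"
hit = parse_domain_line(row)

print(hit.score)                                          # 850.1
print(round(coverage(hit.hmm_from, hit.hmm_to, 511), 3))  # 0.998 of the 511-node model
print(round(coverage(hit.ali_from, hit.ali_to, 515), 3))  # 0.99 of the 515-residue protein
```

For larger jobs, hmmsearch can also write tabular output (for example with its `--domtblout` option), which is easier to parse than the human-readable report.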

This GapMind analysis is from Sep 24 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."
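
The 50% coverage cutoff can be illustrated with a short check. This is only a sketch: it assumes that "coverage" means the fraction of the characterized (query) protein spanned by the alignment, which this page does not define, and it does not implement the high- or medium-confidence rules. The function names are hypothetical.

```python
def alignment_coverage(q_start, q_end, q_len):
    """Fraction of the query covered by an alignment with inclusive 1-based ends."""
    return (q_end - q_start + 1) / q_len

def meets_coverage_cutoff(q_start, q_end, q_len, min_coverage=0.5):
    """True if the alignment spans at least `min_coverage` of the query."""
    return alignment_coverage(q_start, q_end, q_len) >= min_coverage

# The BLAST alignment above spans query positions 1-514 of 515.
print(meets_coverage_cutoff(1, 514, 515))  # True
# A hypothetical hit covering only positions 1-200 would fall short.
print(meets_coverage_cutoff(1, 200, 515))  # False
```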

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory