GapMind for Amino acid biosynthesis

 

Alignments for a candidate for cysK in Desulfobacter vibrioformis DSM 8776

Align cysteine synthase (EC 2.5.1.47); L-3-cyanoalanine synthase (EC 4.4.1.9) (characterized)
to candidate WP_035238797.1 Q366_RS10715 cysteine synthase A

Query= BRENDA::Q84IF9
         (308 letters)



>NCBI__GCF_000745975.1:WP_035238797.1
          Length = 313

 Score =  269 bits (687), Expect = 7e-77
 Identities = 142/300 (47%), Positives = 191/300 (63%), Gaps = 2/300 (0%)

Query: 6   NSITELIGDTPAVKLNRIVDEDSADVYLKLEFMNPGSSVKDRIALAMIEAAEKAGKLKPG 65
           N I   IG TP  ++  +   D  ++Y K E++NPG S+KDR+AL +IE AEK G+LK G
Sbjct: 4   NEILNQIGATPMFRIG-LGGNDDMNIYAKAEYLNPGGSIKDRVALFIIEQAEKKGRLKKG 62

Query: 66  DTIVEPTSGNTGIGLAMVAAAKGYKAVLVMPDTMSLERRNLLRAYGAELVLTPGAQGMRG 125
            +IVE TSGNTGI +AMV   KGY   ++MP+ MS ER+ ++RA  AEL+LTP  + + G
Sbjct: 63  MSIVEATSGNTGIAVAMVGLVKGYDVRIIMPENMSDERKKMIRALNAELILTPPEKNVAG 122

Query: 126 PIAKAEELVREH-GYFMPQQFKNEANPEIHRLTTGKEIVEQMGDQLDAFVAGVGTGGTTT 184
            + K +E++ E    F+P QF+N  N   H L+TG EI + M   +D FV+G+G+GGT  
Sbjct: 123 AVEKLKEIMAEDDNIFVPDQFENHDNSMSHYLSTGPEIWKNMNGHVDIFVSGLGSGGTLM 182

Query: 185 GAGKVLREAYPNIKIYAVEPADSPVLSGGKPGPHKIQGIGAGFVPDILDTSIYDGVITVT 244
           G GK L+E  P I I AVEP ++  L G +PG HKI+GIG GFVP I+DTS+ D VI V 
Sbjct: 183 GTGKYLKEKNPEIMIVAVEPKNASALLGHEPGLHKIEGIGDGFVPSIVDTSLIDNVIEVD 242

Query: 245 TEEAFAAARRAAREEGILGGISSGAAIHAALKVAKELGKGKKVLAIIPSNGERYLSTPLY 304
            + A    R  A ++G L GISSGA + AAL++    GK K +  I P   ERY ST L+
Sbjct: 243 DDSAVEMTRWLASKQGFLVGISSGANVCAALEMRHLFGKDKNIATIFPDGAERYFSTALF 302


Lambda     K      H
   0.314    0.134    0.374 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 257
Number of extensions: 6
Number of successful extensions: 2
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 308
Length of database: 313
Length adjustment: 27
Effective length of query: 281
Effective length of database: 286
Effective search space:    80366
Effective search space used:    80366
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 42 (21.9 bits)
S2: 48 (23.1 bits)
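
The statistics reported above can be cross-checked by hand. The sketch below assumes the standard Karlin-Altschul relations used by BLAST, with the gapped Lambda and K: bit score = (Lambda * raw score - ln K) / ln 2, and Expect = effective search space * 2^(-bit score). The function names are illustrative and are not part of BLAST or GapMind; the numbers are copied from the report.

```python
import math

def effective_search_space(query_len, db_len, length_adjustment):
    """Product of the query and database lengths after the length adjustment."""
    return (query_len - length_adjustment) * (db_len - length_adjustment)

def bit_score(raw_score, lam, k):
    """Convert a raw alignment score to bits using the gapped Lambda and K."""
    return (lam * raw_score - math.log(k)) / math.log(2)

def expect_value(search_space, bits):
    """Expected number of chance hits scoring at least `bits`."""
    return search_space * 2.0 ** (-bits)

# Values taken from the BLAST report above.
space = effective_search_space(query_len=308, db_len=313, length_adjustment=27)
bits = bit_score(raw_score=687, lam=0.267, k=0.0410)
evalue = expect_value(space, bits)

print(space, round(bits), f"{evalue:.0e}")
```

With these inputs the effective search space is 80366 and the bit score rounds to 269, in line with the report. The Expect value comes out near 7e-77, with small differences possible because Lambda and K are printed to only three significant digits.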

Align candidate WP_035238797.1 Q366_RS10715 (cysteine synthase A)
to HMM TIGR01136 (cysteine synthase (EC 2.5.1.47))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.aa/TIGR01136.hmm
# target sequence database:        /tmp/gapView.958537.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR01136  [M=299]
Accession:   TIGR01136
Description: cysKM: cysteine synthase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                             Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                             -----------
     3e-117  377.2   0.1   3.5e-117  377.0   0.1    1.0  1  NCBI__GCF_000745975.1:WP_035238797.1  


Domain annotation for each sequence (and alignments):
>> NCBI__GCF_000745975.1:WP_035238797.1  
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  377.0   0.1  3.5e-117  3.5e-117       2     299 .]       7     302 ..       6     302 .. 0.98

  Alignments for each domain:
  == domain 1  score: 377.0 bits;  conditional E-value: 3.5e-117
                             TIGR01136   2 eeliGntPlvrlnlseelkaevlvKlEsrnPsgSvKdRialsmildAekrgllkkgktiieatSGNtGiaLAm 74 
                                            + iG tP+ r+ l  + ++++++K E+ nP+gS+KdR+al +i++Aek+g lkkg  i+eatSGNtGia+Am
  NCBI__GCF_000745975.1:WP_035238797.1   7 LNQIGATPMFRIGLGGNDDMNIYAKAEYLNPGGSIKDRVALFIIEQAEKKGRLKKGMSIVEATSGNTGIAVAM 79 
                                           578********************************************************************** PP

                             TIGR01136  75 vaaakgyklilvmpetmslERrkllkayGaelvlteaeegmkgaiekakelaeeepekyvllkqfeNpaNpea 147
                                           v+  kgy + ++mpe+ms ER+k+++a+ ael+lt++e+   ga+ek ke+++e+ +++++++qfeN +N  +
  NCBI__GCF_000745975.1:WP_035238797.1  80 VGLVKGYDVRIIMPENMSDERKKMIRALNAELILTPPEKNVAGAVEKLKEIMAED-DNIFVPDQFENHDNSMS 151
                                           ****************************************************997.5788************* PP

                             TIGR01136 148 HrkttgpEilkdtdgkidafvagvGtgGtitGvgrvlkekkpnvkivavePaespvlsegkpgphkiqgigag 220
                                           H+ +tgpEi+k+++g++d+fv+g+G gGt++G+g++lkek+p++ ivaveP+++++l + +pg hki+gig+g
  NCBI__GCF_000745975.1:WP_035238797.1 152 HYLSTGPEIWKNMNGHVDIFVSGLGSGGTLMGTGKYLKEKNPEIMIVAVEPKNASALLGHEPGLHKIEGIGDG 224
                                           ************************************************************************* PP

                             TIGR01136 221 fiPkildeelldevikvededaietarrlakeegilvGiSsGaavaaalkvakklekedkkivvilpdagerY 293
                                           f+P+i d++l+d+vi+v+d+ a+e++r la+++g lvGiSsGa+v aal++ +    +dk+i +i+pd +erY
  NCBI__GCF_000745975.1:WP_035238797.1 225 FVPSIVDTSLIDNVIEVDDDSAVEMTRWLASKQGFLVGISSGANVCAALEMRHLFG-KDKNIATIFPDGAERY 296
                                           ***************************************************99998.7*************** PP

                             TIGR01136 294 Lstelf 299
                                           +st+lf
  NCBI__GCF_000745975.1:WP_035238797.1 297 FSTALF 302
                                           *****9 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (299 nodes)
Target sequences:                          1  (313 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.00u 0.01s 00:00:00.01 Elapsed: 00:00:00.00
# Mc/sec: 15.78
//
[ok]
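
As a rough illustration of how the domain table can be read programmatically, the sketch below splits the single domain row on whitespace and computes the fraction of the model covered by the hit. The helper names and the dictionary keys are hypothetical. The column positions assume the hmmsearch domain-table layout shown above, and the model length (299) is the M value from the Query line.

```python
def parse_domain_row(row):
    """Split one hmmsearch domain-table row into the fields used here."""
    fields = row.split()
    return {
        "score": float(fields[2]),
        "i_evalue": float(fields[5]),
        "hmm_from": int(fields[6]),
        "hmm_to": int(fields[7]),
        "ali_from": int(fields[9]),
        "ali_to": int(fields[10]),
        "acc": float(fields[15]),
    }

def model_coverage(hit, model_len):
    """Fraction of HMM match positions spanned by the alignment."""
    return (hit["hmm_to"] - hit["hmm_from"] + 1) / model_len

# Domain row copied from the report above.
row = ("   1 !  377.0   0.1  3.5e-117  3.5e-117       2     299 .]"
       "       7     302 ..       6     302 .. 0.98")
hit = parse_domain_row(row)
coverage = model_coverage(hit, model_len=299)

print(f"model coverage: {coverage:.3f}")
```

Here the alignment spans model positions 2 to 299, so 298 of the 299 positions are covered.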

This GapMind analysis is from Jul 25 2024. The underlying query database was built on Jul 25 2024.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."
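
To make the identity and coverage measures concrete, the sketch below computes them for the alignment to BRENDA::Q84IF9 shown earlier and checks the 50% coverage floor mentioned above. It assumes that coverage is measured over the characterized (query) protein. The function names are hypothetical, and the check says only that the hit clears the low-confidence floor; it does not assign a confidence level.

```python
def percent_identity(identities, aligned_columns):
    """Identical positions as a percentage of the aligned columns."""
    return 100.0 * identities / aligned_columns

def coverage(start, end, full_length):
    """Fraction of a sequence spanned by the aligned region (1-based, inclusive)."""
    return (end - start + 1) / full_length

def meets_coverage_floor(cov, minimum=0.5):
    """True if coverage reaches the 50% floor stated for low-confidence hits."""
    return cov >= minimum

# From the report: Identities = 142/300; query positions 6-304 of 308 letters.
identity = percent_identity(142, 300)
query_cov = coverage(6, 304, 308)

print(f"identity {identity:.1f}%, coverage {query_cov:.1%}, "
      f"clears floor: {meets_coverage_floor(query_cov)}")
```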

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
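
For readers unfamiliar with the term, a six-frame translation translates a DNA sequence in all three reading frames on both strands, so that proteins missed by gene prediction can still be found. The sketch below is a minimal illustration using the standard codon table; it is not GapMind or Curated BLAST code, and all names in it are hypothetical.

```python
# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]

def translate(dna):
    """Translate complete codons from position 0; unknown codons become 'X'."""
    codons = (dna[i:i + 3] for i in range(0, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)

def six_frame_translation(dna):
    """Translate three frames on the forward strand and three on the reverse."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames

frames = six_frame_translation("ATGGCCTAA")
for name, protein in frames.items():
    print(name, protein)
```

Stop codons are shown as `*`; a real search would typically split each frame at stops and compare the resulting open reading frames against the curated proteins.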

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory