GapMind for Amino acid biosynthesis

 

Alignments for a candidate for aroA in Desulfobacca acetoxidans DSM 11109

Align 3-phosphoshikimate 1-carboxyvinyltransferase (EC 2.5.1.19) (characterized)
to candidate WP_013705642.1 DESAC_RS03210 3-phosphoshikimate 1-carboxyvinyltransferase

Query= BRENDA::Q8YMB5
         (425 letters)



>NCBI__GCF_000195295.1:WP_013705642.1
          Length = 432

 Score =  350 bits (899), Expect = e-101
 Identities = 186/413 (45%), Positives = 268/413 (64%), Gaps = 3/413 (0%)

Query: 11  RPVDATVEIPGSKSITNRALLVAALAQGDSTLENALFSEDSEYFAKCVEQLGIPITLHPH 70
           + ++A + +PGSKS ++RAL+ A LA+G S+L N L ++D+   A+ +EQLG+ IT    
Sbjct: 10  KAMEAVITLPGSKSFSHRALIAAGLARGSSSLRNLLRADDTLMTARALEQLGVRITWQEK 69

Query: 71  LAQIQVSGKGGDIPAKQADLFVGLAGTAARFITALVALGNGEYRLDGVPRMRERPMGDLV 130
             +  + G GG +      +++G +GT+ RF+TA+ ALGNG Y L G PR+ +RP+ DL+
Sbjct: 70  --ECLLEGAGGRLKVPTEPIYLGDSGTSMRFLTAVAALGNGRYVLTGSPRLCQRPIQDLL 127

Query: 131 TVLQNSGITINFEGNSGFMPYTIYGQQFAGGHFRLKANQTSQQLSALLMIAPYAQQDTTI 190
             L   G+  + E ++G  P  I  +  AGG  R+    +SQ LSALL+I+P+A +D  I
Sbjct: 128 DALTLLGVVAHCENHNGCPPVIIQARGLAGGESRVSGGISSQFLSALLLISPFAARDVEI 187

Query: 191 EVEGTLVSQSYVKMTCRLMADFGVDVTQTDDNQFHIKAGQRYQARHYTIEPDASNASYFF 250
           EV G LVS+ YV +T  +M  FG+   +     F + AGQRYQAR Y +E DAS+ASYF 
Sbjct: 188 EVVGELVSRPYVDITLSVMEAFGIAYYRRGYQNFCVPAGQRYQARDYEVEGDASSASYFL 247

Query: 251 AAAAVTGGRVRVNHLTKQSCQGDILWLNVLEQMGCQVIEGADYTEVIGPEQLQGIDIDMN 310
            AAA+TGGR+ + +L  QSCQGDI +L VL+QMGCQV E      V+   QLQ I I+M 
Sbjct: 248 GAAALTGGRITLTNLNPQSCQGDIGFLEVLQQMGCQV-EPTGSGVVLRGRQLQAIRINMA 306

Query: 311 DMSDLVQTLGAIAPYASSPVIIRNVEHIRYKETERIRAVVTELRRLGVKVEEFADGMKIE 370
            M DLV TL  +A YA    +I  V H+R+KE++R++AV TEL ++G+ V +  DG+ I+
Sbjct: 307 HMPDLVPTLAVLAAYAQGETVITGVPHLRHKESDRLQAVATELAKMGITVNQTKDGLIIQ 366

Query: 371 PTPITPAAIETYHDHRMAMAFAVTGLKTPGIVIQDPGCTAKTFPDYFTRFFKM 423
                   IETY+DHR+AM+FA+ GLKTPG++I +P C AK+FPD++  F K+
Sbjct: 367 GGKPRGVVIETYNDHRIAMSFALAGLKTPGVMIANPDCVAKSFPDFWDYFAKL 419



 Score = 32.7 bits (73), Expect = 2e-05
 Identities = 38/142 (26%), Positives = 56/142 (39%), Gaps = 13/142 (9%)

Query: 6   IPALNRPVDATVEIPGSKSITNRALLVAALAQGDSTLENALFSEDSEYFAKCVEQLGIPI 65
           +PA  R      E+ G  S  +  L  AAL  G  TL N   +  S     C   +G   
Sbjct: 223 VPAGQRYQARDYEVEGDASSASYFLGAAALTGGRITLTN--LNPQS-----CQGDIGFLE 275

Query: 66  TLHPHLAQIQVSGKGGDIPAKQAD-LFVGLA--GTAARFITALVALGNGEYRLDGVPRMR 122
            L     Q++ +G G  +  +Q   + + +A        +  L A   GE  + GVP +R
Sbjct: 276 VLQQMGCQVEPTGSGVVLRGRQLQAIRINMAHMPDLVPTLAVLAAYAQGETVITGVPHLR 335

Query: 123 ERPMGDL---VTVLQNSGITIN 141
            +    L    T L   GIT+N
Sbjct: 336 HKESDRLQAVATELAKMGITVN 357


Lambda     K      H
   0.320    0.136    0.393 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 445
Number of extensions: 16
Number of successful extensions: 5
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 2
Number of HSP's successfully gapped: 2
Length of query: 425
Length of database: 432
Length adjustment: 32
Effective length of query: 393
Effective length of database: 400
Effective search space:   157200
Effective search space used:   157200
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.8 bits)
S2: 51 (24.3 bits)
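
The statistics above can be cross-checked by hand. The sketch below assumes the standard Karlin-Altschul relations (bit score = (lambda * raw - ln K) / ln 2, and E = effective search space * 2^(-bits)) and uses the gapped Lambda and K printed in the report. The function names are illustrative and not part of BLAST or GapMind. Because the printed Lambda and K are rounded, the recomputed values match the report only approximately.

```python
import math

# Values copied from the BLAST report above.
QUERY_LEN = 425
DB_LEN = 432
LENGTH_ADJUSTMENT = 32
GAPPED_LAMBDA = 0.267
GAPPED_K = 0.0410


def effective_search_space(query_len, db_len, adjustment):
    """Product of the effective query and database lengths."""
    return (query_len - adjustment) * (db_len - adjustment)


def bit_score(raw_score, lam=GAPPED_LAMBDA, k=GAPPED_K):
    """Convert a raw alignment score to a bit score (assumed Karlin-Altschul form)."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def expect_value(bits, search_space):
    """Expected number of chance hits with at least this bit score."""
    return search_space * 2.0 ** (-bits)


space = effective_search_space(QUERY_LEN, DB_LEN, LENGTH_ADJUSTMENT)
print(f"effective search space = {space}")

# Raw scores of the two HSPs reported above (899 and 73).
for raw in (899, 73):
    bits = bit_score(raw)
    print(f"raw={raw} bits={bits:.1f} E={expect_value(bits, space):.1e}")
```

The effective search space reproduces the reported 157200 exactly, since it depends only on the integer lengths and the length adjustment.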

Align candidate WP_013705642.1 DESAC_RS03210 (3-phosphoshikimate 1-carboxyvinyltransferase)
to HMM TIGR01356 (aroA: 3-phosphoshikimate 1-carboxyvinyltransferase (EC 2.5.1.19))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.aa/TIGR01356.hmm
# target sequence database:        /tmp/gapView.11572.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR01356  [M=415]
Accession:   TIGR01356
Description: aroA: 3-phosphoshikimate 1-carboxyvinyltransferase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                                 Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                                 -----------
     5e-142  459.5   0.0   5.7e-142  459.3   0.0    1.0  1  lcl|NCBI__GCF_000195295.1:WP_013705642.1  DESAC_RS03210 3-phosphoshikimate


Domain annotation for each sequence (and alignments):
>> lcl|NCBI__GCF_000195295.1:WP_013705642.1  DESAC_RS03210 3-phosphoshikimate 1-carboxyvinyltransferase
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  459.3   0.0  5.7e-142  5.7e-142       3     414 ..      16     421 ..      14     422 .. 0.98

  Alignments for each domain:
  == domain 1  score: 459.3 bits;  conditional E-value: 5.7e-142
                                 TIGR01356   3 ikipgsKSishRalllaaLaegetvvtnlLkseDtlatlealrklGakveeekeelviegvgg.lkepe 70 
                                               i++pgsKS+shRal++a+La+g++ ++nlL+++Dtl+t +al++lG++++ +++e+  eg gg lk p+
  lcl|NCBI__GCF_000195295.1:WP_013705642.1  16 ITLPGSKSFSHRALIAAGLARGSSSLRNLLRADDTLMTARALEQLGVRITWQEKECLLEGAGGrLKVPT 84 
                                               89************************************************888899****9999***** PP

                                 TIGR01356  71 aeldlgnsGttaRlltgvlalasgevvltgdeslkkRPierlveaLrelgaeieskeeegslPlaisgp 139
                                               + ++lg+sGt++R+lt+v+al +g +vltg+++l +RPi++l++aL  lg+  ++++++g++P+ i++ 
  lcl|NCBI__GCF_000195295.1:WP_013705642.1  85 EPIYLGDSGTSMRFLTAVAALGNGRYVLTGSPRLCQRPIQDLLDALTLLGVVAHCENHNGCPPVIIQAR 153
                                               ********************************************************************9 PP

                                 TIGR01356 140 .lkggivelsgsaSsQyksalllaaplalqavtleivgeklisrpyieitLkllksfgveveeederki 207
                                                l gg +++sg +SsQ++salll  p a ++v++e+vg +l+srpy++itL ++++fg+   +   +++
  lcl|NCBI__GCF_000195295.1:WP_013705642.1 154 gLAGGESRVSGGISSQFLSALLLISPFAARDVEIEVVG-ELVSRPYVDITLSVMEAFGIAYYRRGYQNF 221
                                               8999**********************************.*************************99*** PP

                                 TIGR01356 208 vvkggqkykqkevevegDaSsAafflaaaaitgeevtvenlgenstqgdkaiiivLeemGadveveeqr 276
                                                v+ gq y+ +++evegDaSsA++fl aaa+tg+++t++nl  +s qgd  +++vL++mG++ve + + 
  lcl|NCBI__GCF_000195295.1:WP_013705642.1 222 CVPAGQRYQARDYEVEGDASSASYFLGAAALTGGRITLTNLNPQSCQGDIGFLEVLQQMGCQVEPTGS- 289
                                               ********************************************************************. PP

                                 TIGR01356 277 dvevegasklkgvkvdidvdsliDelptlavlaafAegetriknieelRvkEsdRiaaiaeeLeklGve 345
                                                v+++g ++l++++  i++++++D++ptlavlaa+A+get+i+++ +lR kEsdR++a+a+eL+k+G++
  lcl|NCBI__GCF_000195295.1:WP_013705642.1 290 GVVLRG-RQLQAIR--INMAHMPDLVPTLAVLAAYAQGETVITGVPHLRHKESDRLQAVATELAKMGIT 355
                                               799996.68*9***..***************************************************** PP

                                 TIGR01356 346 veeledgllieGkkkelkgavvdtydDHRiamalavlglaaegeveiedaecvaksfPeFfevleqlga 414
                                               v++++dgl+i+G+  + +g v++ty+DHRiam++a++gl+   +v i ++ cvaksfP+F++  ++lg 
  lcl|NCBI__GCF_000195295.1:WP_013705642.1 356 VNQTKDGLIIQGG--KPRGVVIETYNDHRIAMSFALAGLKTP-GVMIANPDCVAKSFPDFWDYFAKLGT 421
                                               *************..6*************************9.********************999986 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (415 nodes)
Target sequences:                          1  (432 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.02u 0.01s 00:00:00.03 Elapsed: 00:00:00.02
# Mc/sec: 7.29
//
[ok]
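
To see how much of the model and of the protein the single domain hit spans, the domain row of the hmmsearch report can be parsed directly. This is a minimal sketch that assumes the whitespace-separated column layout shown in the table header above; `parse_domain_row` and `covered_fraction` are hypothetical helper names, not part of HMMER.

```python
# Domain row copied verbatim from the hmmsearch report above.
DOMAIN_ROW = (
    "   1 !  459.3   0.0  5.7e-142  5.7e-142       3     414 .."
    "      16     421 ..      14     422 .. 0.98"
)
HMM_LENGTH = 415     # from "Query: TIGR01356 [M=415]"
TARGET_LENGTH = 432  # from "432 residues searched"


def parse_domain_row(row):
    """Split one per-domain row into named fields (assumed column order)."""
    f = row.split()
    return {
        "score": float(f[2]),
        "i_evalue": float(f[5]),
        "hmm_from": int(f[6]),
        "hmm_to": int(f[7]),
        "ali_from": int(f[9]),
        "ali_to": int(f[10]),
        "acc": float(f[15]),
    }


def covered_fraction(start, end, total):
    """Fraction of a sequence or model covered by an inclusive 1-based range."""
    return (end - start + 1) / total


dom = parse_domain_row(DOMAIN_ROW)
hmm_cov = covered_fraction(dom["hmm_from"], dom["hmm_to"], HMM_LENGTH)
seq_cov = covered_fraction(dom["ali_from"], dom["ali_to"], TARGET_LENGTH)
print(f"score={dom['score']} HMM coverage={hmm_cov:.3f} protein coverage={seq_cov:.3f}")
```
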

This GapMind analysis is from Apr 10 2024. The underlying query database was built on Apr 09 2024.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMER with enzyme models (usually from TIGRFAMs). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."
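
As an illustration of the 50% coverage threshold, the sketch below computes coverage and percent identity for the two BLAST HSPs shown earlier on this page. It assumes that coverage means the aligned span of the characterized (query) protein divided by its length; that definition, and the `Hsp` class itself, are assumptions made for this example rather than GapMind's actual implementation. It does not assign confidence levels.

```python
from dataclasses import dataclass


@dataclass
class Hsp:
    """Minimal record of one BLAST HSP (hypothetical helper type)."""
    query_start: int
    query_end: int
    identities: int
    align_len: int

    def coverage(self, query_len: int) -> float:
        """Fraction of the query spanned by this HSP (inclusive coordinates)."""
        return (self.query_end - self.query_start + 1) / query_len

    def percent_identity(self) -> float:
        """Identical positions as a percentage of alignment columns."""
        return 100.0 * self.identities / self.align_len


QUERY_LEN = 425  # length of BRENDA::Q8YMB5, from the report above

# Coordinates and identity counts copied from the two HSPs above.
hsps = [
    Hsp(query_start=11, query_end=423, identities=186, align_len=413),
    Hsp(query_start=6, query_end=141, identities=38, align_len=142),
]

for hsp in hsps:
    cov = hsp.coverage(QUERY_LEN)
    print(
        f"coverage={cov:.1%} identity={hsp.percent_identity():.1f}% "
        f"meets 50% coverage: {cov >= 0.5}"
    )
```

Under this assumed definition, the first HSP spans nearly the whole query, while the second, weaker HSP covers less than half of it.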

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory