GapMind for Amino acid biosynthesis

 

Alignments for a candidate for trpE in Teredinibacter turnerae T7901

Align anthranilate synthase (subunit 1/2) (EC 4.1.3.27) (characterized)
to candidate WP_015816820.1 TERTU_RS13865 anthranilate synthase component I

Query= BRENDA::P20580
         (492 letters)



>NCBI__GCF_000023025.1:WP_015816820.1
          Length = 492

 Score =  596 bits (1537), Expect = e-175
 Identities = 309/489 (63%), Positives = 367/489 (75%), Gaps = 6/489 (1%)

Query: 1   MNREEFLRLAADGYNRIPLSFETLADFDTPLSIYLKLADAPNSYLLESVQGGEKWGRYSI 60
           M  E F +LAA G+NRIP+  E LAD +TPLS Y KLA+ P SYL ESVQGGEKWGRYSI
Sbjct: 1   MTPELFSQLAAAGHNRIPVRREVLADTETPLSSYFKLANGPYSYLFESVQGGEKWGRYSI 60

Query: 61  IGLPCRTVLRVYDHQVRISIDGVETERFDCADPLAFVEEFKARYQVPTVPGLPRFDGGLV 120
           IGLP R  L V ++ V+   +    E  DC DPL FV+ ++  + V  V GLP F+GGLV
Sbjct: 61  IGLPARERLEVRENTVQFFNESGLVESSDCDDPLEFVKNWQKSFNVAEVDGLPSFNGGLV 120

Query: 121 GYFGYDCVRYVEKRLATCPNPDPLGNPDILLMVSDAVVVFDNLAGKIHAIVLADPSEENA 180
           GYF YDCVRYVE RL      D +G P+ILLMVSD ++VFDNL GKIH IVLADP+  NA
Sbjct: 121 GYFAYDCVRYVEPRLQGNAPDDEIGTPEILLMVSDEILVFDNLKGKIHLIVLADPARANA 180

Query: 181 YERGQARLEELLERLRQ-----PITPRRGLDLEAAQGREPAFRASFTREDYENAVGRIKD 235
            +    RL+ L  +L       P  P      +     E  F +S+  E ++  VGR+KD
Sbjct: 181 LQLANDRLDALEAKLASGPGNIPAMPAMNTS-KGITACEGDFESSYGCEKFQADVGRLKD 239

Query: 236 YILAGDCMQVVPSQRMSIEFKAAPIDLYRALRCFNPTPYMYFFNFGDFHVVGSSPEVLVR 295
           YILAGD MQ+V SQRMS  F A P++LYRALRC NP+PYMYF N GD HVVGSSPE+L R
Sbjct: 240 YILAGDTMQIVLSQRMSYPFTAPPVNLYRALRCLNPSPYMYFMNLGDHHVVGSSPEILAR 299

Query: 296 VEDGLVTVRPIAGTRPRGINEEADLALEQDLLSDAKEIAEHLMLIDLGRNDVGRVSDIGA 355
           +E+G +TVRPIAGTR RG +E  D ALE +L++D KEIAEHLMLIDLGRNDVGRV++IG+
Sbjct: 300 LENGEMTVRPIAGTRRRGYSEAEDKALEAELVADPKEIAEHLMLIDLGRNDVGRVAEIGS 359

Query: 356 VKVTEKMVIERYSNVMHIVSNVTGQLREGLSAMDALRAILPAGTLSGAPKIRAMEIIDEL 415
           VK+TEKMV+ER+S+VMHI SNVTG+L+    AMD LRA LPAGTLSGAPKIRAMEIIDEL
Sbjct: 360 VKLTEKMVVERFSHVMHITSNVTGRLKADKDAMDVLRAALPAGTLSGAPKIRAMEIIDEL 419

Query: 416 EPVKRGVYGGAVGYLAWNGNMDTAIAIRTAVIKNGELHVQAGGGIVADSVPALEWEETIN 475
           EPVKRG+YGGA+GYLAWNGNMDTAIAIRTAVIK+G++ VQAG G+VADS P LEW+ET+N
Sbjct: 420 EPVKRGIYGGAIGYLAWNGNMDTAIAIRTAVIKDGKIFVQAGAGVVADSQPELEWKETMN 479

Query: 476 KRRAMFRAV 484
           K RA+F AV
Sbjct: 480 KARALFSAV 488


Lambda     K      H
   0.321    0.139    0.408 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 723
Number of extensions: 28
Number of successful extensions: 2
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 492
Length of database: 492
Length adjustment: 34
Effective length of query: 458
Effective length of database: 458
Effective search space:   209764
Effective search space used:   209764
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.8 bits)
S2: 52 (24.6 bits)
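
The bit score and E-value reported above can be cross-checked against the statistical parameters in this block. The following is a minimal sketch that assumes the standard Karlin-Altschul formulas; the function names are illustrative and are not part of BLAST or GapMind. Because the printed Lambda and K values are rounded, the results only approximate the reported 596 bits and e-175.

```python
import math

def bit_score(raw_score, lam, k):
    """Convert a raw alignment score to bits: (lambda*S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)

def expect_value(bits, search_space):
    """Expected number of chance hits: search_space * 2^(-bits)."""
    return search_space * 2.0 ** (-bits)

# Values copied from the report above (gapped Lambda and K).
raw = 1537
lam, k = 0.267, 0.0410

# Effective length = sequence length minus the length adjustment.
effective_len = 492 - 34
# Query and database have the same effective length in this one-sequence search.
search_space = effective_len * effective_len

bits = bit_score(raw, lam, k)
evalue = expect_value(bits, search_space)

print(f"effective search space = {search_space}")
print(f"bit score ~ {bits:.1f}")
print(f"E-value ~ {evalue:.1e}")
```

The computed search space matches the "Effective search space" line exactly, while the bit score and E-value agree with the report only to within rounding.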

Align candidate WP_015816820.1 TERTU_RS13865 (anthranilate synthase component I)
to HMM TIGR00564 (trpE: anthranilate synthase component I (EC 4.1.3.27))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.aa/TIGR00564.hmm
# target sequence database:        /tmp/gapView.11741.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR00564  [M=455]
Accession:   TIGR00564
Description: trpE_most: anthranilate synthase component I
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                                 Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                                 -----------
   1.4e-172  560.7   0.0   1.6e-172  560.5   0.0    1.0  1  lcl|NCBI__GCF_000023025.1:WP_015816820.1  TERTU_RS13865 anthranilate synth


Domain annotation for each sequence (and alignments):
>> lcl|NCBI__GCF_000023025.1:WP_015816820.1  TERTU_RS13865 anthranilate synthase component I
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  560.5   0.0  1.6e-172  1.6e-172       1     454 [.      25     488 ..      25     489 .. 0.92

  Alignments for each domain:
  == domain 1  score: 560.5 bits;  conditional E-value: 1.6e-172
                                 TIGR00564   1 adtltpisvylklakrkesfllEsvekeeelgRySliglnpvleikakdgkavlleaddeeak.ieede 68 
                                               adt+tp+s y kla+ ++s+l+Esv+ +e++gRyS+igl  + ++++++++++ +++++  ++   +d+
  lcl|NCBI__GCF_000023025.1:WP_015816820.1  25 ADTETPLSSYFKLANGPYSYLFESVQGGEKWGRYSIIGLPARERLEVRENTVQFFNESGLVESsDCDDP 93 
                                               699***********99**********************************9999998876665478999 PP

                                 TIGR00564  69 lkelrklleka.eesedeldeplsggavGylgydtvrlveklke.ea.edelelpdlllllvetvivfD 134
                                               l+ ++++ +++   + d+l+  + gg+vGy++yd+vr+ve+  + +a  de+ +p++ll++ ++++vfD
  lcl|NCBI__GCF_000023025.1:WP_015816820.1  94 LEFVKNWQKSFnVAEVDGLPS-FNGGLVGYFAYDCVRYVEPRLQgNApDDEIGTPEILLMVSDEILVFD 161
                                               ******999995556666766.******************98775445999****************** PP

                                 TIGR00564 135 hvekkvilienarteaersaeeeaaarleellaelqkeleka..vkaleekkes.........ftsnve 192
                                               + + k++li  a  +  + a + a++rl++l a+l +   +   ++a+ +   s         f+s++ 
  lcl|NCBI__GCF_000023025.1:WP_015816820.1 162 NLKGKIHLIVLADPARAN-ALQLANDRLDALEAKLASGPGNIpaMPAMNT---SkgitacegdFESSYG 226
                                               ***********9777666.8888899999988888775432111333333...2233566667888888 PP

                                 TIGR00564 193 keeyeekvakakeyikaGdifqvvlSqrleakveakpfelYrkLRtvNPSpylyyldledfelvgsSPE 261
                                                e+++++v ++k+yi aGd +q+vlSqr++ +++a+p++lYr+LR  NPSpy+y+++l d ++vgsSPE
  lcl|NCBI__GCF_000023025.1:WP_015816820.1 227 CEKFQADVGRLKDYILAGDTMQIVLSQRMSYPFTAPPVNLYRALRCLNPSPYMYFMNLGDHHVVGSSPE 295
                                               8******************************************************************** PP

                                 TIGR00564 262 llvkvkgkrvetrPiAGtrkRGatkeeDealeeeLladeKerAEHlmLvDLaRNDigkvaklgsvevke 330
                                               +l ++++ ++++rPiAGtr+RG +++eD+ale+eL+ad+Ke AEHlmL+DL+RND+g+va++gsv+ +e
  lcl|NCBI__GCF_000023025.1:WP_015816820.1 296 ILARLENGEMTVRPIAGTRRRGYSEAEDKALEAELVADPKEIAEHLMLIDLGRNDVGRVAEIGSVKLTE 364
                                               ******999************************************************************ PP

                                 TIGR00564 331 llkiekyshvmHivSeVeGelkdeltavDalraalPaGTlsGAPKvrAmelidelEkekRgiYgGavgy 399
                                                + +e++shvmHi+S+V+G+lk +++a+D+lraalPaGTlsGAPK+rAme+idelE++kRgiYgGa+gy
  lcl|NCBI__GCF_000023025.1:WP_015816820.1 365 KMVVERFSHVMHITSNVTGRLKADKDAMDVLRAALPAGTLSGAPKIRAMEIIDELEPVKRGIYGGAIGY 433
                                               ********************************************************************* PP

                                 TIGR00564 400 lsfdgdvdtaiaiRtmvlkdgvayvqAgaGiVaDSdpeaEyeEtlnKakallrai 454
                                               l  +g++dtaiaiRt+v+kdg+++vqAgaG+VaDS+pe E++Et+nKa+al +a+
  lcl|NCBI__GCF_000023025.1:WP_015816820.1 434 LAWNGNMDTAIAIRTAVIKDGKIFVQAGAGVVADSQPELEWKETMNKARALFSAV 488
                                               **************************************************99886 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (455 nodes)
Target sequences:                          1  (492 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.01u 0.01s 00:00:00.02 Elapsed: 00:00:00.02
# Mc/sec: 10.45
//
[ok]
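
As a reading aid for the domain table above, the sketch below pulls the coordinates out of the single domain row and computes how much of the model and of the candidate the alignment spans. The function name and the whitespace-splitting approach are assumptions made for illustration; this is not how GapMind itself parses hmmsearch output.

```python
def parse_domain_row(row):
    """Split one row of the hmmsearch per-domain table into named fields."""
    f = row.split()
    # Column order: #, flag, score, bias, c-Evalue, i-Evalue,
    # hmmfrom, hmmto, marker, alifrom, alito, marker, envfrom, envto, marker, acc
    return {
        "score": float(f[2]),
        "i_evalue": float(f[5]),
        "hmm_from": int(f[6]),
        "hmm_to": int(f[7]),
        "ali_from": int(f[9]),
        "ali_to": int(f[10]),
        "acc": float(f[15]),
    }

# Row copied from the report above.
row = ("   1 !  560.5   0.0  1.6e-172  1.6e-172       1     454 [. "
       "     25     488 ..      25     489 .. 0.92")
dom = parse_domain_row(row)

model_length = 455      # from "Query: TIGR00564 [M=455]"
sequence_length = 492   # from "492 residues searched"

model_coverage = (dom["hmm_to"] - dom["hmm_from"] + 1) / model_length
sequence_coverage = (dom["ali_to"] - dom["ali_from"] + 1) / sequence_length

print(f"model coverage    = {model_coverage:.3f}")
print(f"sequence coverage = {sequence_coverage:.3f}")
```

Here the alignment spans 454 of the 455 model positions and residues 25 to 488 of the 492-residue candidate.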

This GapMind analysis is from Apr 10 2024. The underlying query database was built on Apr 09 2024.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."
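
To make the identity and coverage quantities concrete, here is a minimal sketch that derives both from the pairwise alignment shown at the top of this page and applies the one threshold stated explicitly above (at least 50% coverage). It assumes that coverage is measured as the fraction of the characterized protein spanned by the alignment; the helper names are hypothetical, and the high- and medium-confidence thresholds are not modeled here.

```python
import re

def parse_identities(line):
    """Extract (identical, aligned_columns) from a BLAST 'Identities' line."""
    m = re.search(r"Identities = (\d+)/(\d+)", line)
    if m is None:
        raise ValueError("no Identities field found")
    return int(m.group(1)), int(m.group(2))

def coverage(start, end, length):
    """Fraction of a sequence spanned by an alignment (1-based, inclusive)."""
    return (end - start + 1) / length

# Values copied from the alignment report above.
line = " Identities = 309/489 (63%), Positives = 367/489 (75%), Gaps = 6/489 (1%)"
identical, columns = parse_identities(line)
identity = identical / columns

# The query (characterized protein) is aligned over positions 1-484 of 492.
query_coverage = coverage(1, 484, 492)

# Only the 50% coverage rule is stated explicitly in the text above.
meets_low_confidence_coverage = query_coverage >= 0.5

print(f"identity = {identity:.1%}, coverage = {query_coverage:.1%}")
print("meets 50% coverage rule:", meets_low_confidence_coverage)
```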

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
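
For readers unfamiliar with the term, a six-frame translation translates a DNA sequence in all three reading frames on each strand, so that proteins missed by gene prediction can still be found. The sketch below illustrates the idea only; it is not the Curated BLAST implementation. It assumes unambiguous uppercase A/C/G/T input and the standard codon assignments, and the function names are hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard codon table in TCAG order; '*' marks a stop codon.
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def translate(dna):
    """Translate complete codons from the start of dna; drop trailing bases."""
    return "".join(CODON_TABLE[dna[i:i + 3]]
                   for i in range(0, len(dna) - 2, 3))

def six_frame(dna):
    """Return translations for frames +1, +2, +3, -1, -2, -3."""
    rc = reverse_complement(dna)
    return {
        **{f"+{k + 1}": translate(dna[k:]) for k in range(3)},
        **{f"-{k + 1}": translate(rc[k:]) for k in range(3)},
    }

frames = six_frame("ATGAAATAG")
for name, protein in frames.items():
    print(name, protein)
```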

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory