GapMind for Amino acid biosynthesis

 

Alignments for a candidate for trpE in Methanosarcina barkeri Fusaro

Align Anthranilate synthase component 1 2; EC 4.1.3.27; Anthranilate synthase component I 2 (uncharacterized)
to candidate WP_011308530.1 MBAR_RS19240 anthranilate synthase component I

Query= curated2:Q5V213
         (536 letters)



>NCBI__GCF_000195895.1:WP_011308530.1
          Length = 547

 Score =  397 bits (1020), Expect = e-115
 Identities = 245/570 (42%), Positives = 323/570 (56%), Gaps = 80/570 (14%)

Query: 1   MTLDISREEFVEHAKA-DRPVVVRTAAELD-VDVEPLTAYAALTGRTSDVAANDYTFLLE 58
           ++ D+ +EEF + A    +P  ++  A +D +   PL  Y AL         + Y++LLE
Sbjct: 2   LSFDLGKEEFKKLASGVSQPGFIQLLARIDNLTCSPLELYHALRAS----GTSGYSYLLE 57

Query: 59  SAEKVASSDPDGAFAPETDDRHARFSFVGYDPRAVVTV---------------------- 96
           S EK               +  AR+SFVG DP AVV +                      
Sbjct: 58  SVEK--------------QETKARYSFVGNDPDAVVKIGDRKISLELLNPNASPFFEEIQ 103

Query: 97  ----------TGDESEVEAFDDRYADLVTT----DGGDVVDDLRAAMPD---VALRNFPA 139
                     T +E   E  +    +L  T     G D  D LR   P    + L N   
Sbjct: 104 SKTKDACGCETIEEENPEKENSELGNLKFTAPIPKGKDGFDALRLVFPSANGMGLLNTKR 163

Query: 140 MDRQHLEGGLVGFLSYDAVYDLWLDEVGLDRP-DSRFPDAQFVLTTSTVRFDHVEDTVSL 198
            DRQ   GG +G+ +YDA+YD WL   G+ +  +S  P+ Q++L + T  FDH+ + + +
Sbjct: 164 FDRQTFLGGAIGYTAYDAIYDSWL---GVKKGFESEIPELQYLLVSKTFVFDHMTEEIYI 220

Query: 199 VFTPVVRQGEDAGERYGELVAEAERVEAVLSDLSPLSTG------------GFRREDEVA 246
           V TP +  G + GE Y + + EAE++   L + +                 G    D  A
Sbjct: 221 VVTPFIIPGANVGEIYEKALLEAEKLYTTLKEAALSGDSVEIAIPGGSIFPGLPVSDCNA 280

Query: 247 GPRDEYEDAVERAKEYVLSGDIYQGVISRTRELYGDVDPLGFYEALRAVNPSPYMYLLGY 306
           G + ++ED+V +AKE++ +GDI+Q V+SR  E   +  P   Y  LRA+NPSPYMY+  +
Sbjct: 281 GKK-KFEDSVVQAKEHIFAGDIFQVVLSRKCEFTLEQSPFELYMQLRAINPSPYMYIFEF 339

Query: 307 DDLTIVGASPETLVSVAGDHVVSNPIAGTCPRGNSPVEDRRLAGEMLADGKERAEHTMLV 366
            DL IVGASPETL++V    +++NPIAGTCPRG +  ED   A  M+ D KERAEH MLV
Sbjct: 340 GDLAIVGASPETLLTVHERTLITNPIAGTCPRGKTEAEDEAFAAHMMHDEKERAEHVMLV 399

Query: 367 DLARNDVRRVAEAGSVRVPEFMNVLKYSHVQHIESTVTGRLAEDKDAFDAARATFPAGTL 426
           DL RNDVR V E+GSV+V EFM VLKYSHVQHIES V G L  + D FDA RA FPAGTL
Sbjct: 400 DLGRNDVRMVTESGSVKVSEFMKVLKYSHVQHIESKVIGTLRPECDQFDAFRAIFPAGTL 459

Query: 427 SGAPKIRAMEIIDELERSPRGPYGGGVGYFDWDGDTDFAIVIRSATVEDEGDRDRITVQA 486
           SGAPKIRAMEII ELE SPRG YGGGVGY+ W+GD DFAIVIR+  V+ +    + +VQA
Sbjct: 460 SGAPKIRAMEIISELETSPRGIYGGGVGYYSWNGDADFAIVIRTIIVQGK----KASVQA 515

Query: 487 GAGIVADSDPESEYVETEQKMDGVLTALEE 516
           GAGIVADSDP  E+ ETE+KM  +L A+ E
Sbjct: 516 GAGIVADSDPGYEFRETERKMGAMLAAIGE 545


Lambda     K      H
   0.315    0.135    0.383 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 786
Number of extensions: 38
Number of successful extensions: 7
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 536
Length of database: 547
Length adjustment: 35
Effective length of query: 501
Effective length of database: 512
Effective search space:   256512
Effective search space used:   256512
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 42 (22.0 bits)
S2: 52 (24.6 bits)
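
The summary figures in the BLAST report above can be re-derived from its own numbers. The following is a minimal Python sketch, not part of GapMind: the variable and function names are illustrative assumptions, and the values are copied by hand from the report. It assumes that the reported percentages are truncated integers (which matches all three values here) and that the effective search space is the product of the two effective lengths.

```python
import re

# Copied from the report above.
stats_line = "Identities = 245/570 (42%), Positives = 323/570 (56%), Gaps = 80/570 (14%)"
query_len, subject_len, length_adjustment = 536, 547, 35
query_span = (1, 516)      # first and last "Query:" coordinates
subject_span = (2, 545)    # first and last "Sbjct:" coordinates


def parse_stats(line):
    """Return {name: (count, alignment_length)} for each 'Name = a/b' field."""
    return {
        name.lower(): (int(count), int(total))
        for name, count, total in re.findall(r"(\w+) = (\d+)/(\d+)", line)
    }


def coverage(span, length):
    """Fraction of a sequence covered by an inclusive 1-based span."""
    start, end = span
    return (end - start + 1) / length


stats = parse_stats(stats_line)
# Integer division reproduces the truncated percentages shown in the report.
percents = {name: 100 * count // total for name, (count, total) in stats.items()}

query_coverage = coverage(query_span, query_len)
subject_coverage = coverage(subject_span, subject_len)

# Effective lengths are the raw lengths minus the length adjustment.
effective_search_space = (query_len - length_adjustment) * (subject_len - length_adjustment)

print(percents)
print(f"query coverage {query_coverage:.1%}, subject coverage {subject_coverage:.1%}")
print("effective search space", effective_search_space)
```

With these inputs the alignment covers about 96% of the query and about 99% of the candidate, and the product of the effective lengths (501 x 512) agrees with the reported effective search space of 256512.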

Align candidate WP_011308530.1 MBAR_RS19240 (anthranilate synthase component I)
to HMM TIGR01820 (trpE: anthranilate synthase component I (EC 4.1.3.27))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.aa/TIGR01820.hmm
# target sequence database:        /tmp/gapView.649.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR01820  [M=449]
Accession:   TIGR01820
Description: TrpE-arch: anthranilate synthase component I
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                                 Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                                 -----------
   4.7e-204  664.7   0.0   5.6e-204  664.4   0.0    1.0  1  lcl|NCBI__GCF_000195895.1:WP_011308530.1  MBAR_RS19240 anthranilate syntha


Domain annotation for each sequence (and alignments):
>> lcl|NCBI__GCF_000195895.1:WP_011308530.1  MBAR_RS19240 anthranilate synthase component I
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  664.4   0.0  5.6e-204  5.6e-204       1     449 []      37     544 ..      37     544 .. 0.94

  Alignments for each domain:
  == domain 1  score: 664.4 bits;  conditional E-value: 5.6e-204
                                 TIGR01820   1 Plelykalrk..eseysflLesvekqskkaryslvgaspeavvkiner.........kavelfeeivsk 58 
                                               Plely+alr+  +s+ys+lLesvekq++karys+vg++p+avvki +r         +a+ +feei+sk
  lcl|NCBI__GCF_000195895.1:WP_011308530.1  37 PLELYHALRAsgTSGYSYLLESVEKQETKARYSFVGNDPDAVVKIGDRkislellnpNASPFFEEIQSK 105
                                               9********988888****************************************************** PP

                                 TIGR01820  59 vkkleg......kkkae...................gkdvldalrkalkklkeiellee...erqtflG 99 
                                               +k+ +g      +   +                   gkd +dalr ++++++++ ll++   +rqtflG
  lcl|NCBI__GCF_000195895.1:WP_011308530.1 106 TKDACGcetieeE---NpekenselgnlkftapipkGKDGFDALRLVFPSANGMGLLNTkrfDRQTFLG 171
                                               ******7655441...145556789999999**********************99988877899***** PP

                                 TIGR01820 100 glvGyvaYdavrdywedaekekeseipeaefllvtkvlvfdhleeevslvvteevsad........... 157
                                               g++Gy+aYda++d+w++++k +eseipe+++llv+k++vfdh++ee+++vvt+++ +            
  lcl|NCBI__GCF_000195895.1:WP_011308530.1 172 GAIGYTAYDAIYDSWLGVKKGFESEIPELQYLLVSKTFVFDHMTEEIYIVVTPFIIPGanvgeiyekal 240
                                               ******************************************************999899999999999 PP

                                 TIGR01820 158 .eaekiveklkeaekeeeekkeaeleslae............keefeeavekakekifeGdifqvvlSr 213
                                                eaek++++lkea    ++   a+    +             k++fe++v +ake+if+GdifqvvlSr
  lcl|NCBI__GCF_000195895.1:WP_011308530.1 241 lEAEKLYTTLKEAALSGDSVEIAIP-GGSIfpglpvsdcnagKKKFEDSVVQAKEHIFAGDIFQVVLSR 308
                                               9999999999999887665433322.2222245666667777999************************ PP

                                 TIGR01820 214 klelrldldplelYaklreiNPSPYmyllefgdraivGaSPEtlvrvekrtveinPiAGtapRgkseee 282
                                               k+e++l+++p+elY +lr+iNPSPYmy++efgd+aivGaSPEtl++v++rt+++nPiAGt+pRgk+e+e
  lcl|NCBI__GCF_000195895.1:WP_011308530.1 309 KCEFTLEQSPFELYMQLRAINPSPYMYIFEFGDLAIVGASPETLLTVHERTLITNPIAGTCPRGKTEAE 377
                                               ********************************************************************* PP

                                 TIGR01820 283 DeelakelLsdeKerAEHvmLvDLaRNDvrkvsesgsvkvsefmkvlkyshvqHieSevvgtLkkeada 351
                                               De++a+++++deKerAEHvmLvDL+RNDvr+v+esgsvkvsefmkvlkyshvqHieS+v+gtL++e+d+
  lcl|NCBI__GCF_000195895.1:WP_011308530.1 378 DEAFAAHMMHDEKERAEHVMLVDLGRNDVRMVTESGSVKVSEFMKVLKYSHVQHIESKVIGTLRPECDQ 446
                                               ********************************************************************* PP

                                 TIGR01820 352 fdalkAvfPAGtlsGaPKirAmeiieelEkepRgvYgGgvGyfslngdadlAiviRtaliekkklriqa 420
                                               fda++A+fPAGtlsGaPKirAmeii+elE++pRg+YgGgvGy+s+ngdad+AiviRt+++++kk+++qa
  lcl|NCBI__GCF_000195895.1:WP_011308530.1 447 FDAFRAIFPAGTLSGAPKIRAMEIISELETSPRGIYGGGVGYYSWNGDADFAIVIRTIIVQGKKASVQA 515
                                               ********************************************************************* PP

                                 TIGR01820 421 GAGivaDSdPekEfeEterKmkavlkaig 449
                                               GAGivaDSdP +Ef+EterKm a+l+aig
  lcl|NCBI__GCF_000195895.1:WP_011308530.1 516 GAGIVADSDPGYEFRETERKMGAMLAAIG 544
                                               ***************************96 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (449 nodes)
Target sequences:                          1  (547 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.02u 0.01s 00:00:00.03 Elapsed: 00:00:00.02
# Mc/sec: 8.25
//
[ok]
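
How much of the model the hit covers can be read off the domain table in the hmmsearch report above. The sketch below is an illustration only and is not GapMind's own code: it splits the single domain row on whitespace, and the field positions are an assumption based on the column header printed in that report. For real pipelines, hmmsearch's tabular output (the --domtblout option) is a more robust thing to parse than the human-readable report.

```python
# Domain row and lengths copied from the hmmsearch report above.
domain_line = (
    "   1 !  664.4   0.0  5.6e-204  5.6e-204       1     449 []"
    "      37     544 ..      37     544 .. 0.94"
)
MODEL_LENGTH = 449     # from "Query: TIGR01820 [M=449]"
TARGET_LENGTH = 547    # from "Target sequences: 1 (547 residues searched)"


def parse_domain_line(line):
    """Parse one row of the per-domain table into a dict of typed fields."""
    fields = line.split()
    return {
        "score": float(fields[2]),
        "i_evalue": float(fields[5]),
        "hmm_from": int(fields[6]),
        "hmm_to": int(fields[7]),
        "ali_from": int(fields[9]),
        "ali_to": int(fields[10]),
        "acc": float(fields[15]),
    }


domain = parse_domain_line(domain_line)

# Coordinates are 1-based and inclusive, hence the +1.
hmm_coverage = (domain["hmm_to"] - domain["hmm_from"] + 1) / MODEL_LENGTH
aligned_residues = domain["ali_to"] - domain["ali_from"] + 1
target_coverage = aligned_residues / TARGET_LENGTH

print(f"model coverage {hmm_coverage:.0%}")
print(f"aligned residues {aligned_residues} of {TARGET_LENGTH} ({target_coverage:.1%})")
```

For this candidate the alignment spans model positions 1 to 449, so the whole model is covered, and it spans residues 37 to 544 of the 547-residue protein.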

This GapMind analysis is from Apr 10 2024. The underlying query database was built on Apr 09 2024.


About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."
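
As a sketch of how these labels roll up per step: a step counts as a gap when none of its candidates is labeled high or medium confidence. The Python below is an illustration under stated assumptions, not GapMind's implementation. The function name, the input layout (a mapping from step name to a list of confidence labels) and the step names are all hypothetical.

```python
def find_gaps(candidates_by_step):
    """Return the steps that have no high- or medium-confidence candidate.

    candidates_by_step maps a step name to a list of confidence labels,
    each one of "high", "medium" or "low". A step with no candidates at
    all is also reported as a gap.
    """
    confident = {"high", "medium"}
    return sorted(
        step
        for step, labels in candidates_by_step.items()
        if not confident.intersection(labels)
    )


# Hypothetical input: the step names and labels are made up for illustration.
example = {
    "step1": ["high"],
    "step2": ["low", "low"],
    "step3": [],
    "step4": ["medium", "low"],
}

gaps = find_gaps(example)
print(gaps)
```

Here step2 (only low-confidence candidates) and step3 (no candidates) are reported as gaps.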

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
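
For readers unfamiliar with the term, a six-frame translation translates a DNA sequence in all three reading frames on each strand, so that a protein can be found even if no gene was predicted at that location. The sketch below is a minimal illustration and is not how Curated BLAST is implemented. It assumes the standard codon assignments, an unambiguous DNA alphabet (any other codon becomes "X"), and translation straight through stop codons, which are written as "*". All names in it are illustrative.

```python
from itertools import product

# Standard codon table, built in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna, offset):
    """Translate dna starting at a 0-based offset; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(offset, len(dna) - 2, 3)
    )


def six_frames(dna):
    """Return the translations of frames +1..+3 and -1..-3 as a dict."""
    forward = dna.upper()
    reverse = reverse_complement(forward)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(forward, offset)
        frames[f"-{offset + 1}"] = translate(reverse, offset)
    return frames


frames = six_frames("ATGGCCTAA")
for name in ("+1", "+2", "+3", "-1", "-2", "-3"):
    print(name, frames[name])
```

A protein search against these six translations can recover a hit whose gene was missed or disrupted in the annotation, which is the situation the Curated BLAST links are meant to cover.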

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory