GapMind for Amino acid biosynthesis

 

Alignments for a candidate for asnB in Thioalkalivibrio halophilus HL17

Align Putative asparagine synthetase [glutamine-hydrolyzing] 2; EC 6.3.5.4 (uncharacterized)
to candidate WP_077244558.1 B1A74_RS10125 asparagine synthase (glutamine-hydrolyzing)

Query= curated2:Q58456
         (515 letters)



>NCBI__GCF_001995255.1:WP_077244558.1
          Length = 648

 Score =  228 bits (580), Expect = 7e-64
 Identities = 154/426 (36%), Positives = 226/426 (53%), Gaps = 39/426 (9%)

Query: 1   MCGINGIIRFGKEVIKEEINKMNKAIKHRGPDDEGIFIYNFKNYSIGLGHVRLAILDLSE 60
           MCGI G        + +  ++M   I  RGPDD G++          L H RL+I+DLS 
Sbjct: 1   MCGIVGYWSRNSRSV-DVASRMAHRIATRGPDDAGVWAEG--EGEPVLAHRRLSIVDLSP 57

Query: 61  KGHQPMGYNVDEDKIIYRDDELDRADIIIVYNGEIYNYLELKEKFNLETETG-------T 113
            GHQPM                     ++ YNGEIYN+ EL+++  LE E G       +
Sbjct: 58  AGHQPMVSPCGR--------------YVLSYNGEIYNHTELRQE--LEREGGGFDWRGHS 101

Query: 114 DTEVILKLYNKLGFD-CVKEFNGMWAFCIFDKKKGLIFCSRDRLGVKPFYYYWDGNEFIF 172
           DTE +L      G    ++  NGM+AF ++D+ +  +F +RDR+G KP YY   G+ F+F
Sbjct: 102 DTETLLAALRYWGVHGALERLNGMFAFALWDRTERTLFLARDRMGEKPLYYGRSGDTFLF 161

Query: 173 SSELKGILAVKEINKKENINKDAVELYFALGFIPSPYSIYKNTFKLEARQNLIFDLDKRE 232
            SELK + A  +   +  +++DA+ L      +P+P+SIY+  +KL     L+      +
Sbjct: 162 GSELKALAAHPDW--RGEVDRDALALMLRYNNVPAPWSIYRGIYKLPPAHYLVVRGQGHQ 219

Query: 233 IRK-YYYWELPDY------KPIYDKKKLIEEGKKLLYDAVKIRMRSDVPVGAFLSGGLDS 285
           + +   YW+LP+       +   +   L +E  +LL D+V  RM +DVP+GAFLSGG DS
Sbjct: 220 VGEPECYWDLPEVASEGAREAKGEPDALADELDELLRDSVGRRMMADVPLGAFLSGGFDS 279

Query: 286 STVVGVMREFTDLSKLHTFSIG-FEGKYDETPYIKIVVDYFKTQHHHYYFKERDFEELID 344
           + VV  M+       + TFSIG  + +YDE  +   V  +  T H   Y    D + +I 
Sbjct: 280 TMVVAQMQA-QSARPVKTFSIGNADAEYDEAHHAAAVARHLGTDHRELYVTPEDAQAVIP 338

Query: 345 KYSWIYDEPFGDYSGFPTYKVSEMARKFVTVVLSGDGGDEVFGGYMTHLNGYRM-DFIRK 403
           +   I+DEPF D S  PTY VSE+AR+ VTV LSGDGGDE+ GGY  H+ G  +   + +
Sbjct: 339 RLPEIFDEPFADSSQIPTYLVSELARRDVTVTLSGDGGDELLGGYNRHVVGPGVWRRVNR 398

Query: 404 LPKFLR 409
           LP +LR
Sbjct: 399 LPGWLR 404


Lambda     K      H
   0.322    0.144    0.434 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 800
Number of extensions: 37
Number of successful extensions: 8
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 2
Number of HSP's successfully gapped: 1
Length of query: 515
Length of database: 648
Length adjustment: 36
Effective length of query: 479
Effective length of database: 612
Effective search space:   293148
Effective search space used:   293148
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.9 bits)
S2: 53 (25.0 bits)
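
The statistics above can be cross-checked by hand. The following is a minimal sketch, assuming the standard Karlin-Altschul relations (bit score = (lambda * S - ln K) / ln 2, and E = effective search space * 2^(-bit score)) and using the gapped Lambda and K printed in the report. The function names are illustrative and are not part of BLAST or GapMind.

```python
import math

# Values copied from the BLAST report above.
query_len, db_len, length_adjustment = 515, 648, 36
raw_score = 580                       # the number in parentheses after "228 bits"
gapped_lambda, gapped_k = 0.267, 0.0410


def effective_search_space(m, n, adj):
    """Product of the query and database lengths after the length adjustment."""
    return (m - adj) * (n - adj)


def bit_score(raw, lam, k):
    """Convert a raw alignment score to bits (Karlin-Altschul normalization)."""
    return (lam * raw - math.log(k)) / math.log(2)


def expect_value(space, bits):
    """Expected number of chance hits scoring at least `bits` in `space`."""
    return space * 2.0 ** (-bits)


space = effective_search_space(query_len, db_len, length_adjustment)
bits = bit_score(raw_score, gapped_lambda, gapped_k)
evalue = expect_value(space, bits)
print(space, round(bits), f"{evalue:.0e}")
```

Because Lambda and K are printed to only three significant figures, the recomputed E-value should agree with the reported 7e-64 only approximately.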

Align candidate WP_077244558.1 B1A74_RS10125 (asparagine synthase (glutamine-hydrolyzing))
to HMM TIGR01536 (asnB: asparagine synthase (glutamine-hydrolyzing) (EC 6.3.5.4))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.aa/TIGR01536.hmm
# target sequence database:        /tmp/gapView.3293292.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR01536  [M=517]
Accession:   TIGR01536
Description: asn_synth_AEB: asparagine synthase (glutamine-hydrolyzing)
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                             Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                             -----------
   7.7e-153  496.1   0.0     1e-152  495.7   0.0    1.2  1  NCBI__GCF_001995255.1:WP_077244558.1  


Domain annotation for each sequence (and alignments):
>> NCBI__GCF_001995255.1:WP_077244558.1  
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  495.7   0.0    1e-152    1e-152       1     517 []       2     574 ..       2     574 .. 0.89

  Alignments for each domain:
  == domain 1  score: 495.7 bits;  conditional E-value: 1e-152
                             TIGR01536   1 CgiagivdlkakakeeeeaikemletlahRGPDaegvwkdekeenailghrRLaiidlseg.aQPlsnek.ev 71 
                                           Cgi+g   ++     + + +++m++++a RGPD+ gvw +  e + +l+hrRL+i+dls + +QP+ +   ++
  NCBI__GCF_001995255.1:WP_077244558.1   2 CGIVGYWSRNSR---SVDVASRMAHRIATRGPDDAGVWAE-GEGEPVLAHRRLSIVDLSPAgHQPMVSPCgRY 70 
                                           *****9999665...55789********************.899***************998******999** PP

                             TIGR01536  72 vivfnGEIYNheeLreeleekG..yeFetksDtEViLaayeewg.eelverLeGmFAfalwdekkgelflaRD 141
                                           v+ +nGEIYNh eLr+ele++G  ++++++sDtE +Laa+++wg + ++erL+GmFAfalwd+ +++lflaRD
  NCBI__GCF_001995255.1:WP_077244558.1  71 VLSYNGEIYNHTELRQELEREGggFDWRGHSDTETLLAALRYWGvHGALERLNGMFAFALWDRTERTLFLARD 143
                                           ******************9985338889****************999************************** PP

                             TIGR01536 142 rlGikPLYyaseqgkllfaSEiKallalkeikaeldkealaelltlqlvptektlfkevkelepakal..... 209
                                           r+G kPLYy++ ++++lf+SE+Kal a+++ + e+d++ala +l +++vp + ++++++++l+pa++l     
  NCBI__GCF_001995255.1:WP_077244558.1 144 RMGEKPLYYGRSGDTFLFGSELKALAAHPDWRGEVDRDALALMLRYNNVPAPWSIYRGIYKLPPAHYLvvrgq 216
                                           ********************************************************************99763 PP

                             TIGR01536 210 .dgeeklee..ywevekee......vkeseeelveelrelledavkkrlvadvpvgvllSGGlDSslvaaiak 273
                                            ++  +     yw++ + +       k + ++l +el+ell+d+v +r++advp+g++lSGG DS++v+a ++
  NCBI__GCF_001995255.1:WP_077244558.1 217 gHQVGE--PecYWDLPEVAsegareAKGEPDALADELDELLRDSVGRRMMADVPLGAFLSGGFDSTMVVAQMQ 287
                                           333333..266***99888888988889999****************************************** PP

                             TIGR01536 274 keaksevktFsigfedskdldeskaarkvadelgtehkevliseeevlkeleevilaleeptairasiplyll 346
                                           ++++++vktFsig + ++++de+++a++va++lgt+h+e+++++e++ + ++++  +++ep+a++++ip+yl+
  NCBI__GCF_001995255.1:WP_077244558.1 288 AQSARPVKTFSIGNA-DAEYDEAHHAAAVARHLGTDHRELYVTPEDAQAVIPRLPEIFDEPFADSSQIPTYLV 359
                                           ***************.9******************************************************** PP

                             TIGR01536 347 sklarekgvkVvLsGeGaDElfgGYeyfreakaeeale...............................lpea 388
                                           s+lar++ v+V LsG+G+DEl+gGY+++   +    ++                               lp  
  NCBI__GCF_001995255.1:WP_077244558.1 360 SELARRD-VTVTLSGDGGDELLGGYNRHVVGPG--VWRrvnrlpgwlrgflggavgnlsqrdlrqwrrrLPAR 429
                                           *******.*******************987543..33333489999999999999988888876666663333 PP

                             TIGR01536 389 selaekkl....................llqaklakeselkellkakleeelkekeelkkelkeeseleellr 441
                                           ++     +                      ++ l+++++  e +    +e++++   ++  +      e+++ 
  NCBI__GCF_001995255.1:WP_077244558.1 430 MQ-----VpnlelkleklakalgasdgpGFYRALRSRWKAPEGMVLGASEQQEQEGPVDWLSDLPGLREQMML 497
                                           33.....033333444555666777777555667777776666666666666666666666666699****** PP

                             TIGR01536 442 ldlelllsdllrak.DrvsmahslEvRvPflDkelvelalsippelklrdgkeKvlLreaaeellPeeileRk 513
                                           ld+ ++l+d+++ k Dr+sma slE+RvP+lD++lve+a+++p+e+k+rdg+ K+lLr+++++++P++++eR+
  NCBI__GCF_001995255.1:WP_077244558.1 498 LDMLTYLPDDILTKvDRASMAVSLEARVPLLDHRLVEFAWQVPTEYKVRDGQGKWLLRKVLDRYVPSSLMERP 570
                                           ************************************************************************* PP

                             TIGR01536 514 Keaf 517
                                           K++f
  NCBI__GCF_001995255.1:WP_077244558.1 571 KQGF 574
                                           **99 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (517 nodes)
Target sequences:                          1  (648 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.01u 0.00s 00:00:00.01 Elapsed: 00:00:00.00
# Mc/sec: 38.46
//
[ok]

This GapMind analysis is from Jul 25 2024. The underlying query database was built on Jul 25 2024.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMER with enzyme models (usually from TIGRFAMs). Ublast hits may be split across two different proteins.
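
For the HMM route, the per-domain row of the hmmsearch output shown earlier carries the coordinates needed to judge how much of the model and of the candidate is aligned. The following is a minimal sketch, assuming the whitespace-separated column order of that row (score, bias, c-Evalue, i-Evalue, hmm from/to, ali from/to, env from/to, acc). `parse_domain_row` and `span_coverage` are hypothetical helpers, not GapMind or HMMER functions.

```python
def parse_domain_row(row):
    """Pick out a few columns from one hmmsearch per-domain row."""
    fields = row.split()
    return {
        "score": float(fields[2]),
        "i_evalue": float(fields[5]),
        "hmm_from": int(fields[6]),
        "hmm_to": int(fields[7]),
        "ali_from": int(fields[9]),
        "ali_to": int(fields[10]),
        "acc": float(fields[15]),
    }


def span_coverage(start, end, total):
    """Fraction of a sequence or model covered by an inclusive span."""
    return (end - start + 1) / total


# Domain row copied from the hmmsearch output for TIGR01536 above.
row = ("1 !  495.7   0.0    1e-152    1e-152       1     517 []"
       "       2     574 ..       2     574 .. 0.89")
domain = parse_domain_row(row)

# Model length (M=517) and target length (648 residues) come from the report.
model_cov = span_coverage(domain["hmm_from"], domain["hmm_to"], 517)
target_cov = span_coverage(domain["ali_from"], domain["ali_to"], 648)
print(f"model coverage {model_cov:.2f}, target coverage {target_cov:.2f}")
```

For scripted use, HMMER's `--domtblout` option writes the same per-domain information in a tabular file that is easier to parse than the human-readable report.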

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other BLAST hits with at least 50% coverage are "low confidence."
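
As a worked example, the identity and coverage of the BLAST alignment to curated2:Q58456 shown earlier can be computed from its summary lines. This is a minimal sketch, assuming that coverage means the aligned span of the curated query divided by the query length; the exact definition GapMind applies is not stated here, and the function names are illustrative only.

```python
def percent_identity(identical, alignment_length):
    """Identical positions as a fraction of alignment columns."""
    return identical / alignment_length


def query_coverage(query_start, query_end, query_length):
    """Fraction of the query spanned by the alignment (inclusive coordinates)."""
    return (query_end - query_start + 1) / query_length


# Numbers copied from the BLAST alignment above:
# Identities = 154/426, query residues 1 to 409 of 515.
identity = percent_identity(154, 426)
coverage = query_coverage(1, 409, 515)

# The 50% coverage threshold is the one quoted for low-confidence hits.
has_min_coverage = coverage >= 0.5
print(f"identity {identity:.1%}, coverage {coverage:.1%}, "
      f"at least 50% coverage: {has_min_coverage}")
```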

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory