GapMind for Amino acid biosynthesis

 

Alignments for a candidate for asnB in Hydrogenovibrio halophilus DSM 15072

Align asparagine synthase (glutamine-hydrolysing) 1; EC 6.3.5.4 (characterized)
to candidate WP_019894569.1 A377_RS0102545 N-acetylglutaminylglutamine amidotransferase

Query= CharProtDB::CH_005185
         (632 letters)



>NCBI__GCF_000384235.1:WP_019894569.1
          Length = 618

 Score =  275 bits (703), Expect = 4e-78
 Identities = 194/626 (30%), Positives = 311/626 (49%), Gaps = 66/626 (10%)

Query: 20  EELIKQMNQMIVHRGPDSDGYFHDEHVGFGFRRLSIIDVENGG-QPLSYEDETYWIIFNG 78
           E  +  M + +  RGPD  G +    VG G RRLSIID+ + G QP+   DE   ++FNG
Sbjct: 17  EATLAPMLEKLAKRGPDDGGIWLQNQVGLGHRRLSIIDLSDAGHQPMV--DEELTLVFNG 74

Query: 79  EIYNYIELREELEAKGYTFNTDSDTEVLLATYRHYKEEAASKLRGMFAFLIWNKNDHVLY 138
            IYNY+ LRE+L   G+ F + SDTEV+L  YR +  E  ++  GMFAF +W+ + H L 
Sbjct: 75  CIYNYVALREQLIELGHAFRSHSDTEVILKAYRQWGMECVTRFEGMFAFALWDDHQHQLM 134

Query: 139 GARDPFGIKPLYYTTINDQVYFASERKSLMVAQN-DIEIDKEAL-QQYMSFQFVPEPSTL 196
            ARD FGIKPLYY  +   V FAS  ++L+ A   +I++D   L  Q+     VP P T+
Sbjct: 135 LARDRFGIKPLYYAPVEGGVRFASNTQALLAAGGVNIDLDPVGLHHQFTLHGVVPAPHTV 194

Query: 197 DAHVKKVEPGSQFTIRPDGDITFKTYF--------------KANFKPVQTEEDKLVKEVR 242
              V+K+ PG   T+ PDG +  ++++              +       T+    V+ V 
Sbjct: 195 LKGVRKLAPGQWMTVNPDGQMYQRSWWHLKAERPGPDHPSSRLQALSGVTDHQAWVEAVH 254

Query: 243 DAIYDSVNVHM-RSDVPVGSFLSGGIDSSFIVSVAKEFH-PSLKTFSVGFE---QQGFSE 297
           D + ++V+  +  SDVPVG  LSGG+DSS IV++  E     ++TFS+GFE   ++  SE
Sbjct: 255 DTLKEAVHKRLTASDVPVGVLLSGGLDSSLIVALLDEAGVEDIRTFSIGFEDVPEEKGSE 314

Query: 298 VDVAKETAAALGIENISKVISPEEYMNELPKIVWHFDDPLADPAAIPLYFVAKEAKKHVT 357
            D + +        +   +I  E  +  L + V    +P+    A+  Y ++++  K V 
Sbjct: 315 FDYSDQVVERFQTRHQKFLIPNEAVLPRLQEAVDAMSEPMFAQDAVAFYLLSEQVSKEVK 374

Query: 358 VALSGEGADELFGGYNIYREPLSLKPFERIPSGLKKMLLHVAAVMPEGMRGKSLLERGCT 417
           V +SG+GADE+FGGY  Y  P   +  E++               PE     +       
Sbjct: 375 VVMSGQGADEVFGGYFWY--PQMAQAAEKLG--------------PEAQAVDAF------ 412

Query: 418 PLQDRYIGNAKIFEESVKKQLLKHYNPNLSYRDVTKTYFTESSSYSD----INKMQYVDI 473
                    A  + +    +  +  +P+   RDVT  +  +  S  D    I+++  +D 
Sbjct: 413 ---------APFYFDRDHAEWSEMIHPDYHIRDVTTEWVQDRLSEEDADTFIDQVLRLDA 463

Query: 474 HTWMRGDILLKADKMTMANSLELRVPFLDKVVFDVASKIPDELKTKNGTTKYLLRKAAEG 533
              +  D + + D MTMA  LE RVPFLD  + +VA   P E K ++   KYLL+  A G
Sbjct: 464 SHLIVDDPVKRVDNMTMAWGLEARVPFLDHSLVEVAMTAPPETKLQD--FKYLLKAVARG 521

Query: 534 IVPEHVLNRKKLGFPVPIRHWLKNEMNEWVRNII--QESQTDAYIHKDYVLQLLEDHCAD 591
            VP+ V++R K  FP+P   +++    + +R+++  + +Q       DY+ +LL +  AD
Sbjct: 522 RVPDSVIDRPKGYFPMPALKYVRGPFYDMMRSVLTSEVAQKRGLFQSDYIERLLAEPEAD 581

Query: 592 KA---DNSRKIWTVLIFMIWHSINIE 614
            +       K+W   +  +W   +++
Sbjct: 582 ASFTRIQGSKLWHAALLELWLQTHVD 607


Lambda     K      H
   0.319    0.136    0.403 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 968
Number of extensions: 53
Number of successful extensions: 7
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 2
Number of HSP's successfully gapped: 2
Length of query: 632
Length of database: 618
Length adjustment: 37
Effective length of query: 595
Effective length of database: 581
Effective search space:   345695
Effective search space used:   345695
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 54 (25.4 bits)
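
The statistics above can be cross-checked by hand. The following is a minimal sketch, assuming the standard Karlin-Altschul relations (bit score = (lambda * S - ln K) / ln 2, and E = effective search space * 2^-bits) and using the rounded gapped parameters printed in this report; the function names are illustrative and are not part of BLAST or GapMind.

```python
import math

def bit_score(raw_score, lam, k):
    """Convert a raw alignment score to bits using Karlin-Altschul parameters."""
    return (lam * raw_score - math.log(k)) / math.log(2)

def expect_value(bits, search_space):
    """Expected number of chance hits with at least this bit score."""
    return search_space * 2.0 ** (-bits)

# Values copied from the report above (gapped Lambda and K are rounded there).
raw_score = 703
gapped_lambda, gapped_k = 0.267, 0.0410
query_len, db_len, length_adjustment = 632, 618, 37

# Effective search space = effective query length * effective database length.
search_space = (query_len - length_adjustment) * (db_len - length_adjustment)

bits = bit_score(raw_score, gapped_lambda, gapped_k)
evalue = expect_value(bits, search_space)

print(f"effective search space: {search_space}")
print(f"bit score: {bits:.1f}")
print(f"E-value: {evalue:.1e}")
```

Because the printed Lambda and K are rounded, the recomputed values should agree with the reported 275 bits and 4e-78 only approximately.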

Align candidate WP_019894569.1 A377_RS0102545 (N-acetylglutaminylglutamine amidotransferase)
to HMM TIGR01536 (asnB: asparagine synthase (glutamine-hydrolyzing) (EC 6.3.5.4))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.aa/TIGR01536.hmm
# target sequence database:        /tmp/gapView.3274554.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR01536  [M=517]
Accession:   TIGR01536
Description: asn_synth_AEB: asparagine synthase (glutamine-hydrolyzing)
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                             Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                             -----------
     2e-140  455.1   0.0   2.3e-140  454.9   0.0    1.0  1  NCBI__GCF_000384235.1:WP_019894569.1  


Domain annotation for each sequence (and alignments):
>> NCBI__GCF_000384235.1:WP_019894569.1  
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  454.9   0.0  2.3e-140  2.3e-140       1     516 [.       2     534 ..       2     535 .. 0.87

  Alignments for each domain:
  == domain 1  score: 454.9 bits;  conditional E-value: 2.3e-140
                             TIGR01536   1 CgiagivdlkakakeeeeaikemletlahRGPDaegvwkdekeenailghrRLaiidlseg.aQPlsnekevv 72 
                                           Cgi g +  + + ++ e++++ mle+la+RGPD+ g+w +   ++++lghrRL+iidls++ +QP+ +e+  +
  NCBI__GCF_000384235.1:WP_019894569.1   2 CGICGEIYWDGQKAS-EATLAPMLEKLAKRGPDDGGIWLQ---NQVGLGHRRLSIIDLSDAgHQPMVDEE-LT 69 
                                           9***98877666444.58**********************...79**************999*******9.79 PP

                             TIGR01536  73 ivfnGEIYNheeLreeleekGyeFetksDtEViLaayeewgeelverLeGmFAfalwdekkgelflaRDrlGi 145
                                           +vfnG IYN+ +Lre+l+e G+ F+++sDtEViL+ay++wg e+v r+eGmFAfalwd+++++l+laRDr+Gi
  NCBI__GCF_000384235.1:WP_019894569.1  70 LVFNGCIYNYVALREQLIELGHAFRSHSDTEVILKAYRQWGMECVTRFEGMFAFALWDDHQHQLMLARDRFGI 142
                                           ************************************************************************* PP

                             TIGR01536 146 kPLYyaseqgkllfaSEiKallalkeikaeldkealaelltlq.lvptektlfkevkelepakal....dgee 213
                                           kPLYya  +g + faS  +alla+  ++  ld  +l++++tl+ +vp ++t+ k+v++l p++ +    dg+ 
  NCBI__GCF_000384235.1:WP_019894569.1 143 KPLYYAPVEGGVRFASNTQALLAAGGVNIDLDPVGLHHQFTLHgVVPAPHTVLKGVRKLAPGQWMtvnpDGQM 215
                                           ******************************************99*******************9999666666 PP

                             TIGR01536 214 kleeywevekee..............vkeseeelveelrelledavkkrlv.advpvgvllSGGlDSslvaai 271
                                           + +++w+++ e+                ++ ++ ve + ++l++av+krl  +dvpvgvllSGGlDSsl++a+
  NCBI__GCF_000384235.1:WP_019894569.1 216 YQRSWWHLKAERpgpdhpssrlqalsGVTDHQAWVEAVHDTLKEAVHKRLTaSDVPVGVLLSGGLDSSLIVAL 288
                                           6666******999*********9998778899******************989******************** PP

                             TIGR01536 272 akkeaksevktFsigfe..dskdldeskaarkvadelgtehkevliseeevlkeleevilaleeptairasip 342
                                           +++   ++++tFsigfe   +++ +e ++  +v++ + t+h+++li +e+vl  l+e + a+ ep+  ++++ 
  NCBI__GCF_000384235.1:WP_019894569.1 289 LDEAGVEDIRTFSIGFEdvPEEKGSEFDYSDQVVERFQTRHQKFLIPNEAVLPRLQEAVDAMSEPMFAQDAVA 361
                                           ***9999**********33344555666********************************************* PP

                             TIGR01536 343 lyllsklarekgvkVvLsGeGaDElfgGYeyfrea.kaeealelpeaselaekklllqaklakeselkellka 414
                                            ylls++++++ vkVv+sG+GaDE+fgGY ++ ++ +a+e+l  pea+ + + + ++ ++ +   e  e+++ 
  NCBI__GCF_000384235.1:WP_019894569.1 362 FYLLSEQVSKE-VKVVMSGQGADEVFGGYFWYPQMaQAAEKLG-PEAQAVDAFAPFYFDRDH--AEWSEMIHP 430
                                           **********9.********************98614444554.554443332222222222..222333333 PP

                             TIGR01536 415 kleeelkekeelkkelkee...seleellrldlelll.sdllrakDrvsmahslEvRvPflDkelvelalsip 483
                                           +++    ++e ++++l+ee   + ++++lrld+  l+  d +++ D ++ma++lE+RvPflD+ lve+a++ p
  NCBI__GCF_000384235.1:WP_019894569.1 431 DYHIRDVTTEWVQDRLSEEdadTFIDQVLRLDASHLIvDDPVKRVDNMTMAWGLEARVPFLDHSLVEVAMTAP 503
                                           33333333333333333333448999***9998766505567777**************************** PP

                             TIGR01536 484 pelklrdgkeKvlLreaaeellPeeileRkKea 516
                                           pe kl+d   K+lL+ +a++ +P+++ +R+K  
  NCBI__GCF_000384235.1:WP_019894569.1 504 PETKLQD--FKYLLKAVARGRVPDSVIDRPKGY 534
                                           *****86..79*******************965 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (517 nodes)
Target sequences:                          1  (618 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.01u 0.00s 00:00:00.01 Elapsed: 00:00:00.01
# Mc/sec: 26.21
//
[ok]

This GapMind analysis is from Jul 25 2024. The underlying query database was built on Jul 25 2024.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."
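
As an illustration of the coverage and identity figures these thresholds refer to, here is a minimal sketch that recomputes them for the alignments shown on this page. It assumes coverage means the aligned span divided by the full length of the query protein or HMM; GapMind's exact calculation may differ, and the `coverage` helper is hypothetical.

```python
def coverage(start, end, total_length):
    """Fraction of a sequence or model spanned by an alignment (1-based, inclusive)."""
    return (end - start + 1) / total_length

# BLAST alignment above: query residues 20..614 of a 632-residue characterized protein.
query_cov = coverage(20, 614, 632)

# hmmsearch alignment above: HMM positions 1..516 of the 517-node TIGR01536 model.
hmm_cov = coverage(1, 516, 517)

# Identities reported by BLAST: 194 identical positions over 626 alignment columns.
identity = 194 / 626

print(f"query coverage: {query_cov:.1%}")
print(f"HMM coverage:   {hmm_cov:.1%}")
print(f"identity:       {identity:.1%}")
```

Under this assumption the BLAST hit covers about 94% of the characterized protein at roughly 31% identity, and the HMM hit covers nearly the whole model.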

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory