Align anthranilate synthase (subunit 1/2) (EC 4.1.3.27) (characterized)
to candidate WP_029132343.1 A3GO_RS0104225 anthranilate synthase component I
Query= BRENDA::P20580
         (492 letters)

>NCBI__GCF_000428045.1:WP_029132343.1
          Length = 497

 Score =  669 bits (1726), Expect = 0.0
 Identities = 331/497 (66%), Positives = 398/497 (80%), Gaps = 12/497 (2%)

Query: 1   MNREEFLRLAADGYNRIPLSFETLADFDTPLSIYLKLADAPNSYLLESVQGGEKWGRYSI 60
           M  E+F +LAA G+NRIPL  E LAD DTPLS+YLKLA+APNSYLLESVQGGEKWGRYSI
Sbjct: 1   MTPEQFEQLAAQGHNRIPLMCEVLADLDTPLSVYLKLANAPNSYLLESVQGGEKWGRYSI 60

Query: 61  IGLPCRTVLRVYDHQVRISIDGVETERFDCADPLAFVEEFKARYQVPTVPGLPRFDGGLV 120
           IGLPCRT+LRV  H + I  DG   E  +  DPLAF+E F+ RY+V + P LPRF GGLV
Sbjct: 61  IGLPCRTLLRVTGHGITIETDGTIAESHEVEDPLAFIESFQQRYRVASHPDLPRFTGGLV 120

Query: 121 GYFGYDCVRYVEKRLATCPNPDPLGNPDILLMVSDAVVVFDNLAGKIHAIVLADPSEENA 180
           GYFGYD +RY+E +LA CPNPD +G PDILL+VSD VVVFDNL G+++ IV  DP+
Sbjct: 121 GYFGYDTIRYIEPKLAHCPNPDGIGTPDILLLVSDEVVVFDNLRGRLYVIVHVDPTSGGT 180

Query: 181 YERGQARLEELLERLR-----QPITPRRGLDLEAAQGREPAFRASFTREDYENAVGRIKD 235
           +E G+ RL  ++ER+R     QP +PRR ++       E  F + FT + ++ AV RIK
Sbjct: 181 FESGRKRLRTVVERMRECLPDQPASPRRRIN-------EADFISGFTEQGFKQAVERIKG 233

Query: 236 YILAGDCMQVVPSQRMSIEFKAAPIDLYRALRCFNPTPYMYFFNFGDFHVVGSSPEVLVR 295
           YIL GDCMQ V SQR+SI + A P+DLYRALR  NP+PYMYF+N  +FH+VGSSPE+L R
Sbjct: 234 YILDGDCMQTVISQRLSIPYHAQPLDLYRALRGLNPSPYMYFYNLDEFHIVGSSPEILTR 293

Query: 296 VEDGLVTVRPIAGTRPRGINEEADLALEQDLLSDAKEIAEHLMLIDLGRNDVGRVSDIGA 355
           +EDG VTVRPIAGTRPRG  E  DLALEQ+LL+D KE+AEHLMLIDLGRND GRV+  G+
Sbjct: 294 LEDGEVTVRPIAGTRPRGKTEAEDLALEQELLADPKELAEHLMLIDLGRNDAGRVAKTGS 353

Query: 356 VKVTEKMVIERYSNVMHIVSNVTGQLREGLSAMDALRAILPAGTLSGAPKIRAMEIIDEL 415
           VK+T+KM++ERYS+VMHIVSNVTGQL+EG++A+D LRA  PAGT+SGAPKIRAMEIIDEL
Sbjct: 354 VKLTDKMIVERYSHVMHIVSNVTGQLKEGMNAIDVLRATFPAGTVSGAPKIRAMEIIDEL 413

Query: 416 EPVKRGVYGGAVGYLAWNGNMDTAIAIRTAVIKNGELHVQAGGGIVADSVPALEWEETIN 475
           EPVKRG+Y GAVGYL+WNGNMDTAIAIRTAVIK+ +LH+QAG G+VADS P LEW+ET+N
Sbjct: 414 EPVKRGIYSGAVGYLSWNGNMDTAIAIRTAVIKDKQLHIQAGAGVVADSQPQLEWDETMN 473

Query: 476 KRRAMFRAVALAEQSVE 492
           K RA+FRAVALAE  +E
Sbjct: 474 KGRAVFRAVALAEAGLE 490

Lambda     K      H
   0.321    0.139    0.408

Gapped
Lambda     K      H
   0.267   0.0410    0.140

Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 742
Number of extensions: 27
Number of successful extensions: 2
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 492
Length of database: 497
Length adjustment: 34
Effective length of query: 458
Effective length of database: 463
Effective search space: 212054
Effective search space used: 212054
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.8 bits)
S2: 52 (24.6 bits)
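The summary values in the BLAST report can be cross-checked from the statistics printed with it. The following is a minimal Python sketch, not part of GapMind itself. It assumes the standard Karlin-Altschul relations (bit score = (lambda * raw score - ln K) / ln 2, and E-value = effective search space * 2^-bits) and assumes that the report truncates percentages to whole numbers; all function and variable names are illustrative.

```python
import math

# Values copied from the BLAST report above.
RAW_SCORE = 1726
GAPPED_LAMBDA = 0.267
GAPPED_K = 0.0410
QUERY_LEN = 492
SUBJECT_LEN = 497
LENGTH_ADJUSTMENT = 34
IDENTITIES, POSITIVES, GAPS, ALIGNMENT_LEN = 331, 398, 12, 497


def bit_score(raw, lam, k):
    """Convert a raw alignment score to bits (Karlin-Altschul)."""
    return (lam * raw - math.log(k)) / math.log(2)


def effective_search_space(query_len, subject_len, adjustment):
    """Product of the length-adjusted query and database lengths."""
    return (query_len - adjustment) * (subject_len - adjustment)


def percent(count, total):
    """Whole-number percentage, truncated (assumed to match the report)."""
    return int(100 * count / total)


bits = bit_score(RAW_SCORE, GAPPED_LAMBDA, GAPPED_K)
space = effective_search_space(QUERY_LEN, SUBJECT_LEN, LENGTH_ADJUSTMENT)
evalue = space * 2.0 ** (-bits)

print(f"bit score: {bits:.1f}")
print(f"effective search space: {space}")
print(f"E-value: {evalue:.1e}")
print(f"identity: {percent(IDENTITIES, ALIGNMENT_LEN)}%")
print(f"positives: {percent(POSITIVES, ALIGNMENT_LEN)}%")
print(f"gaps: {percent(GAPS, ALIGNMENT_LEN)}%")
```

The computed search space equals the reported 212054 (458 x 463), and the E-value is small enough that the report prints it as 0.0.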
Align candidate WP_029132343.1 A3GO_RS0104225 (anthranilate synthase component I)
to HMM TIGR00564 (trpE: anthranilate synthase component I (EC 4.1.3.27))
# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.aa/TIGR00564.hmm
# target sequence database:        /tmp/gapView.9244.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR00564  [M=455]
Accession:   TIGR00564
Description: trpE_most: anthranilate synthase component I
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                                 Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                                 -----------
   1.7e-174  567.0   0.0   1.9e-174  566.8   0.0    1.0  1  lcl|NCBI__GCF_000428045.1:WP_029132343.1 A3GO_RS0104225 anthranilate synt

Domain annotation for each sequence (and alignments):
>> lcl|NCBI__GCF_000428045.1:WP_029132343.1  A3GO_RS0104225 anthranilate synthase component I
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  566.8   0.0  1.9e-174  1.9e-174       2     454 ..      26     482 ..      25     483 .. 0.95

  Alignments for each domain:
  == domain 1  score: 566.8 bits;  conditional E-value: 1.9e-174
                               TIGR00564   2 dtltpisvylklakrkesfllEsvekeeelgRySliglnpvleikakdgkavlleaddeeak..ieede 68
                                             d tp+svylkla+ ++s+llEsv+ +e++gRyS+igl ++ ++++++ + ++e+d++ a+ ++ed+
lcl|NCBI__GCF_000428045.1:WP_029132343.1  26 DLDTPLSVYLKLANAPNSYLLESVQGGEKWGRYSIIGLPCRTLLRVTGHGI-TIETDGTIAEshEVEDP 93
                                             778************************************999999999844.4444554444458999* PP

                               TIGR00564  69 lkelrklleka.eesedeldeplsggavGylgydtvrlveklke..eaedelelpdlllllvetvivfD 134
                                             l ++++ +++ s+++l+ ++gg+vGy+gydt+r++e++ + +++d + +pd+lll+ ++v+vfD
lcl|NCBI__GCF_000428045.1:WP_029132343.1  94 LAFIESFQQRYrVASHPDLPR-FTGGLVGYFGYDTIRYIEPKLAhcPNPDGIGTPDILLLVSDEVVVFD 161
                                             *****9999997678888887.******************98875566********************* PP

                               TIGR00564 135 hvekkvilienarteaersaeeeaaarleellaelqkeleka..vkaleekkesftsnvekeeyeekva 201
                                             + + ++++i ++ +++ ++e ++rl+++++++++ l ++ ++ ++ +f s ++++ ++++v+
lcl|NCBI__GCF_000428045.1:WP_029132343.1 162 NLRGRLYVIVHVDPTSGG-TFESGRKRLRTVVERMRECLPDQpaSPRRRINEADFISGFTEQGFKQAVE 229
                                             ************999888.999************99977655447888889999*************** PP

                               TIGR00564 202 kakeyikaGdifqvvlSqrleakveakpfelYrkLRtvNPSpylyyldledfelvgsSPEllvkvkgkr 270
                                             ++k yi +Gd +q v+Sqrl+ + +a+p++lYr+LR NPSpy+y+ +l++f++vgsSPE+l ++++ +
lcl|NCBI__GCF_000428045.1:WP_029132343.1 230 RIKGYILDGDCMQTVISQRLSIPYHAQPLDLYRALRGLNPSPYMYFYNLDEFHIVGSSPEILTRLEDGE 298
                                             ********************************************************************* PP

                               TIGR00564 271 vetrPiAGtrkRGatkeeDealeeeLladeKerAEHlmLvDLaRNDigkvaklgsvevkellkiekysh 339
                                             v++rPiAGtr+RG+t++eD ale+eLlad+Ke AEHlmL+DL+RND g+vak+gsv+ ++ + +e+ysh
lcl|NCBI__GCF_000428045.1:WP_029132343.1 299 VTVRPIAGTRPRGKTEAEDLALEQELLADPKELAEHLMLIDLGRNDAGRVAKTGSVKLTDKMIVERYSH 367
                                             ********************************************************************* PP

                               TIGR00564 340 vmHivSeVeGelkdeltavDalraalPaGTlsGAPKvrAmelidelEkekRgiYgGavgylsfdgdvdt 408
                                             vmHivS+V+G+lk++++a+D+lra++PaGT+sGAPK+rAme+idelE++kRgiY+Gavgyls +g++dt
lcl|NCBI__GCF_000428045.1:WP_029132343.1 368 VMHIVSNVTGQLKEGMNAIDVLRATFPAGTVSGAPKIRAMEIIDELEPVKRGIYSGAVGYLSWNGNMDT 436
                                             ********************************************************************* PP

                               TIGR00564 409 aiaiRtmvlkdgvayvqAgaGiVaDSdpeaEyeEtlnKakallrai 454
                                             aiaiRt+v+kd+++++qAgaG+VaDS+p+ E++Et+nK +a+ ra+
lcl|NCBI__GCF_000428045.1:WP_029132343.1 437 AIAIRTAVIKDKQLHIQAGAGVVADSQPQLEWDETMNKGRAVFRAV 482
                                             *****************************************99986 PP

Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (455 nodes)
Target sequences:                          1  (497 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.02u 0.01s 00:00:00.03 Elapsed: 00:00:00.02
# Mc/sec: 8.14
//
[ok]
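The domain table in the hmmsearch output reports where the alignment starts and ends on the model (hmmfrom, hmm to) and on the protein (alifrom, ali to). As a rough illustration, the Python sketch below computes the fraction of the model and of the protein covered by this single domain hit. It is not GapMind's own code, and the helper name `coverage` is an assumption made for this example.

```python
# Values copied from the hmmsearch domain table above.
MODEL_LENGTH = 455          # [M=455]
HMM_FROM, HMM_TO = 2, 454   # hmmfrom, hmm to
PROTEIN_LENGTH = 497        # residues searched
ALI_FROM, ALI_TO = 26, 482  # alifrom, ali to


def coverage(start, end, total_length):
    """Fraction of a sequence or model spanned by an inclusive range."""
    if not 1 <= start <= end <= total_length:
        raise ValueError("range must lie within 1..total_length")
    return (end - start + 1) / total_length


model_coverage = coverage(HMM_FROM, HMM_TO, MODEL_LENGTH)
protein_coverage = coverage(ALI_FROM, ALI_TO, PROTEIN_LENGTH)

print(f"model coverage:   {model_coverage:.1%}")
print(f"protein coverage: {protein_coverage:.1%}")
```

Here the alignment spans 453 of the 455 model positions and 457 of the 497 residues of the protein.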
This GapMind analysis is from Apr 10 2024. The underlying query database was built on Apr 09 2024.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other blast hits with at least 50% coverage are "low confidence."
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
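The definition of a gap given above (a step with no high- or medium-confidence candidate) can be sketched in a few lines of Python. This is only an illustration of that rule, not GapMind's implementation; the step names, the confidence labels in the example data, and the function name `find_gaps` are all hypothetical.

```python
# Hypothetical steps and candidate confidence labels, for illustration only.
example_candidates = {
    "stepA": ["high", "low"],
    "stepB": ["low"],
    "stepC": [],
    "stepD": ["medium"],
}


def find_gaps(candidates_by_step):
    """Return the steps that have no high- or medium-confidence candidate."""
    supported = {"high", "medium"}
    return [
        step
        for step, labels in candidates_by_step.items()
        if not supported.intersection(labels)
    ]


gaps = find_gaps(example_candidates)
print(gaps)
```

In this made-up example, stepB (only a low-confidence candidate) and stepC (no candidates) are reported as gaps.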
For more information, see:
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory