Align Homocitrate synthase; EC 2.3.3.14 (characterized)
to candidate WP_011764286.1 AZO_RS02775 homocitrate synthase
Query= SwissProt::P05342 (385 letters)

>NCBI__GCF_000061505.1:WP_011764286.1
Length = 396

Score = 393 bits (1010), Expect = e-114
Identities = 214/374 (57%), Positives = 258/374 (68%), Gaps = 5/374 (1%)

Query: 2   ASVIIDDTTLRDGEQSAGVAFNADEKIAIARALAELGVPELEIGIPSMGEEE----REVM 57
           A V IDDTTLRDGEQSAGVAF   EK  IAR LAE+GVPELEIGIP+MG EE    R +
Sbjct: 3   APVTIDDTTLRDGEQSAGVAFTRAEKCGIARILAEIGVPELEIGIPAMGSEECGDIRAIA 62

Query: 58  HAIAGLGLSSRLLAWCRLCDVDLAAARSTGVTMVDLSLPVSDLMLHHKLNRDRDWALREV 117
             +A  G  +RL+ W RL   D+ A RS  V M++LS+PVSD  + HKL   RD  L  +
Sbjct: 63  DTLADGGHDTRLIVWGRLTSADIDACRSLPVQMLELSVPVSDQQIRHKLGATRDEVLARI 122

Query: 118 ARLVGEARMAGLEVCLGCEDASRADLEFVVQVGEVAQAAGARRLRFADTVGVMEPFGMLD 177
           AR V  AR AG +V +G EDASRAD  F+ +V + A+AAGARR RFADT+G+++PF   +
Sbjct: 123 ARWVPVAREAGFDVGVGGEDASRADPAFLAEVIQAAEAAGARRFRFADTLGILDPFATFE 182

Query: 178 RFRFLSRRLDMELEVHAHDDFGLATANTLAAVMGGATHINTTVNGLGERAGNAALEECVL 237
             R L     +E+E+HAHDD GLATANTLAAV  GA+H+NTTVNGLGERAGNAALEE  L
Sbjct: 183 AIRALRAASKLEIEMHAHDDLGLATANTLAAVRAGASHVNTTVNGLGERAGNAALEEVAL 242

Query: 238 ALKNLHGIDTGIDTRGIPAISALVERASGRQVAWQKSVVGAGVFTHEAGIHVDGLLKHRR 297
            L+  HG+   ID   +   S  V RASGR V W KSVVG GVFTHEAGIHVDGLLK
Sbjct: 243 GLRQFHGMGEIIDFTRLLDTSEAVARASGRPVGWHKSVVGEGVFTHEAGIHVDGLLKDPA 302

Query: 298 NYEGLNPDELGRSHSLVLGKHSGAHMVRNTYRDLGIELADWQSQALLGRIRAFSTRTKRR 357
           NY+G++P  LGRSH ++LGKHSG   V   Y  LGIEL   ++  LL RIR F++RTK R
Sbjct: 303 NYQGIDPALLGRSHRMLLGKHSGGRGVAAAYAGLGIELDAARTACLLARIREFTSRTK-R 361

Query: 358 SPQPAELQDFYRQL 371
           +PQ  +L  F+ +L
Sbjct: 362 TPQREDLLAFWEEL 375

Lambda     K        H
0.320      0.135    0.396

Gapped
Lambda     K        H
0.267      0.0410   0.140

Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 511
Number of extensions: 22
Number of successful extensions: 2
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 385
Length of database: 396
Length adjustment: 31
Effective length of query: 354
Effective length of database: 365
Effective search space: 129210
Effective search space used: 129210
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.8 bits)
S2: 50 (23.9 bits)
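The summary figures in the BLAST report above can be cross-checked with a little arithmetic. The following is a minimal sketch, not part of GapMind: the variable and function names are illustrative, the numbers are copied from the report, and two things are assumptions rather than statements from this page, namely that the effective search space is the product of the two effective lengths, and that the report truncates percentages instead of rounding them (258/374 is 68.98%, shown as 68%).

```python
# Values copied from the BLAST report; all names here are illustrative.
query_len, subject_len = 385, 396
length_adjustment = 31
identities, positives, gaps, aln_len = 214, 258, 5, 374
query_start, query_end = 2, 371  # aligned range on the query


def pct(part, whole):
    """Integer percentage, truncated rather than rounded (assumed convention)."""
    return 100 * part // whole


# Effective lengths are the raw lengths minus the length adjustment.
effective_query = query_len - length_adjustment
effective_db = subject_len - length_adjustment

# Assumption: search space is the product of the two effective lengths.
search_space = effective_query * effective_db

# Fraction of the query covered by the alignment.
query_coverage = (query_end - query_start + 1) / query_len

print(pct(identities, aln_len), pct(positives, aln_len), pct(gaps, aln_len))
print(effective_query, effective_db, search_space)
print(f"query coverage: {query_coverage:.1%}")
```

Under these assumptions the computed values agree with the report: 57%, 68% and 1% for identities, positives and gaps, effective lengths of 354 and 365, and a search space of 129210. The alignment covers 370 of the 385 query residues.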
Align candidate WP_011764286.1 AZO_RS02775 (homocitrate synthase)
to HMM TIGR02660 (nifV: homocitrate synthase (EC 2.3.3.14))
# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.aa/TIGR02660.hmm
# target sequence database:        /tmp/gapView.3960842.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR02660  [M=365]
Accession:   TIGR02660
Description: nifV_homocitr: homocitrate synthase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                              Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                              -----------
     2e-156  506.8   1.5   2.2e-156  506.6   1.5    1.0  1  NCBI__GCF_000061505.1:WP_011764286.1

Domain annotation for each sequence (and alignments):
>> NCBI__GCF_000061505.1:WP_011764286.1
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  506.6   1.5  2.2e-156  2.2e-156       2     364 ..       5     371 ..       4     372 .. 0.98

  Alignments for each domain:
  == domain 1  score: 506.6 bits;  conditional E-value: 2.2e-156
                             TIGR02660   2 vlinDttLRDGEqaagvaFsaeEKlaiAkaLdeaGvdelEvGipamgeeEraairaiaal....glkarllaW 70
                                           v+i+DttLRDGEq+agvaF+++EK+ iA++L+e+Gv+elE+Gipamg+eE+ +iraia++    g+++rl++W
  NCBI__GCF_000061505.1:WP_011764286.1   5 VTIDDTTLRDGEQSAGVAFTRAEKCGIARILAEIGVPELEIGIPAMGSEECGDIRAIADTladgGHDTRLIVW 77
                                           89*******************************************************98722225689***** PP

                             TIGR02660  71 cRlraedieaaaevGvkavdlsvpvsdlqleaklkkdrawvleelkelvslakeeglkvsvgaeDasRadeef 143
                                            Rl+++di+a +++ v++++lsvpvsd+q+++kl  +r+ vl+++++ v +a+e+g +v vg+eDasRad++f
  NCBI__GCF_000061505.1:WP_011764286.1  78 GRLTSADIDACRSLPVQMLELSVPVSDQQIRHKLGATRDEVLARIARWVPVAREAGFDVGVGGEDASRADPAF 150
                                           ************************************************************************* PP

                             TIGR02660 144 lvelaevakeagakRlRfaDtvgvldPfstyelvkalraalelelElHaHnDlGlAtAntlaavkaGassvsv 216
                                           l+e+++ a++aga+R+RfaDt+g+ldPf+t+e+++alraa++le+E+HaH+DlGlAtAntlaav+aGas+v++
  NCBI__GCF_000061505.1:WP_011764286.1 151 LAEVIQAAEAAGARRFRFADTLGILDPFATFEAIRALRAASKLEIEMHAHDDLGLATANTLAAVRAGASHVNT 223
                                           ************************************************************************* PP

                             TIGR02660 217 tvlGlGERAGnAaleevalalkellgldtgidlselkelsqlvakasgraleaqkavvGesvFaHEsGiHvdg 289
                                           tv+GlGERAGnAaleeval+l++ +g+   id+++l   s++va+asgr++ ++k+vvGe vF+HE+GiHvdg
  NCBI__GCF_000061505.1:WP_011764286.1 224 TVNGLGERAGNAALEEVALGLRQFHGMGEIIDFTRLLDTSEAVARASGRPVGWHKSVVGEGVFTHEAGIHVDG 296
                                           ************************************************************************* PP

                             TIGR02660 290 llkdeatYesldPeevGrerelviGKHsgraaviealkelgleleeeeaeelleavravaerlKrsleeeela 362
                                           llkd+a+Y+++dP+++Gr++++ +GKHsg ++v++a++ lg+el++++++ ll+++r++++r+Kr++++e+l
  NCBI__GCF_000061505.1:WP_011764286.1 297 LLKDPANYQGIDPALLGRSHRMLLGKHSGGRGVAAAYAGLGIELDAARTACLLARIREFTSRTKRTPQREDLL 369
                                           ***********************************************************************98 PP

                             TIGR02660 363 al 364
                                           a+
  NCBI__GCF_000061505.1:WP_011764286.1 370 AF 371
                                           76 PP

Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (365 nodes)
Target sequences:                          1  (396 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.01u 0.01s 00:00:00.02 Elapsed: 00:00:00.00
# Mc/sec: 20.93
//
[ok]
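To read the domain row of the hmmsearch report programmatically, it is enough to split it on whitespace. The sketch below is illustrative only and is not how GapMind itself processes HMMER output: the function name and the returned dictionary keys are hypothetical, the row is copied from the report, and the column order is the one shown in the report's domain table. It also computes how much of the 365-node model and the 396-residue protein the domain alignment spans.

```python
# Domain row copied from the hmmsearch report; field order follows its header:
# #, flag, score, bias, c-Evalue, i-Evalue, hmmfrom, hmmto, "..",
# alifrom, alito, "..", envfrom, envto, "..", acc
ROW = "1 ! 506.6 1.5 2.2e-156 2.2e-156 2 364 .. 5 371 .. 4 372 .. 0.98"


def parse_domain_row(row, model_len, target_len):
    """Parse one domain row and add model/target coverage (hypothetical helper)."""
    f = row.split()
    hmm_from, hmm_to = int(f[6]), int(f[7])
    ali_from, ali_to = int(f[9]), int(f[10])
    return {
        "score": float(f[2]),
        "i_evalue": float(f[5]),
        "hmm_range": (hmm_from, hmm_to),
        "ali_range": (ali_from, ali_to),
        "acc": float(f[15]),
        # Fraction of model nodes and of target residues inside the alignment.
        "model_coverage": (hmm_to - hmm_from + 1) / model_len,
        "target_coverage": (ali_to - ali_from + 1) / target_len,
    }


hit = parse_domain_row(ROW, model_len=365, target_len=396)
print(hit["score"], hit["i_evalue"], hit["acc"])
print(f"model coverage:  {hit['model_coverage']:.1%}")
print(f"target coverage: {hit['target_coverage']:.1%}")
```

For this hit the alignment spans 363 of the 365 model positions and 367 of the 396 residues of the candidate protein.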
This GapMind analysis is from Jul 25 2024. The underlying query database was built on Jul 25 2024.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other blast hits with at least 50% coverage are "low confidence."
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
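For readers unfamiliar with the term, a six-frame translation translates a DNA sequence in all three reading frames on both strands, so that genes missing from the predicted protein set can still be found by protein search. The following is a minimal sketch of the idea, not the code used by GapMind or Curated BLAST: the function names are hypothetical, the codon assignments are the standard genetic code, and codons containing anything other than A, C, G or T are written as "X" by assumption.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons from position 0; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame(dna):
    """Map frame labels (+1, +2, +3, -1, -2, -3) to translated sequences."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


for label, protein in six_frame("ATGGCCTAA").items():
    print(label, protein)
```

Stop codons appear as "*" in the output; a real search tool would typically split each frame at stop codons and keep only the longer open stretches before comparing them with curated proteins.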
For more information, see:
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory