GapMind for Amino acid biosynthesis

 

Alignments for a candidate for cimA in Phaeobacter inhibens BS107

Align (R)-citramalate synthase (EC 2.3.3.21) (characterized)
to candidate GFF1624 PGA1_c16460 2-isopropylmalate synthase/homocitrate synthase-like protein

Query= BRENDA::Q74C76
         (528 letters)



>FitnessBrowser__Phaeo:GFF1624
          Length = 543

 Score =  424 bits (1090), Expect = e-123
 Identities = 243/542 (44%), Positives = 336/542 (61%), Gaps = 31/542 (5%)

Query: 6   LYDTTLRDGTQAEDISFLVEDKIRIAHKLDEIGIHYIEGGWPGSNPKDVAFFKDIKKEKL 65
           LYDTTLRDG Q + + F   +K++IA  LD +G+ YIEGGWPG+NP D  FF    +   
Sbjct: 8   LYDTTLRDGQQTQGVQFSTTEKVQIATALDGLGVDYIEGGWPGANPTDSGFFDAAPR--- 64

Query: 66  SQAKIAAFGSTRRAKVTPDKDHNLKTLIQAEPDVCTIFGKTWDFHVHEALRISLEENLEL 125
           ++A + AFG T+RA  + + D  L  ++ A      + GK+ D+HV  AL I+LEENL+ 
Sbjct: 65  TRATMTAFGMTKRAGRSAENDDVLAAVLNAGTAAVCLVGKSHDYHVTHALGITLEENLDN 124

Query: 126 IFDSLEYLKANVPEVFYDAEHFFDGYKANPDYAIKTLKAAQDAKADCIVLCDTNGGTMPF 185
           I  S+ +L A   E  +DAEHFFDGYK NPDYA+   +AA +A A  +VLCDTNGGT+P 
Sbjct: 125 IRASIAHLVAQGREALFDAEHFFDGYKDNPDYALAACRAALEAGARWVVLCDTNGGTLPG 184

Query: 186 ELVEIIREVRKHITAPL-----GIHTHNDSECAVANSLHAVSEGIVQVQGTINGFGERCG 240
           ++  I+ EV   I A L     GIHTHND+E AVA SL AV  G  Q+QGT+NG GERCG
Sbjct: 185 DVGRIVAEV---IAAGLPGDHLGIHTHNDTENAVACSLAAVDAGARQIQGTLNGLGERCG 241

Query: 241 NANLCSIIPALKLKMKREC-----IGDDQLRKLRDLSRFVYELANLSPNKHQAYVGNSAF 295
           NANL ++IP L LK          +  + L  L  LSR + E+ N  P K  AYVG SAF
Sbjct: 242 NANLTTLIPTLLLKPPYADQFDIGVSHEGLSTLTALSRMLDEILNRVPTKQAAYVGASAF 301

Query: 296 AHKGGVHVSAIQRHPETYEHLRPELVGNMTRVLVSDLSGRSNILAKAEEFNIKMDSKDPV 355
           AHK G+H SAI + P TYEH+ P LVGN   + +S+ +G+SN+  +  E  +++++ DP 
Sbjct: 302 AHKAGLHASAILKDPSTYEHIDPALVGNARIIPMSNQAGQSNLRRRLSEAGLRVENGDPA 361

Query: 356 TLEILENIKEMENRGYQFEGAEASFELLMKRALGTHRKFFSVIGFRVIDEKR---HEDQK 412
              ILE IK  E  GY ++ A+ASFE+L +  LG    FF V  ++V  E+R   ++   
Sbjct: 362 LARILERIKTREAEGYSYDTAQASFEILAREELGQLPSFFEVKRYKVTVERRKNKYDRMV 421

Query: 413 PLSEATIMVKVGGK--------IEHTAAEGNGPVNALDNALRKALEKFYPRLKEVKLLDY 464
            LSEA ++VKV G+        ++ T ++  GPVNAL  AL K L ++   L +++L+D+
Sbjct: 422 SLSEAVVVVKVDGQKLLSVSESLDETGSD-RGPVNALAKALTKDLGQYSKALDDMRLVDF 480

Query: 465 KVRVLPAGQGTASSIRVLIESGDKES-RWGTVGVSENIVDASYQALLDSVEYKLHKSEEI 523
           KVR+     GT +  RV+I+S D    RW TVGVS NI+DAS++ALLD++ +KL +  + 
Sbjct: 481 KVRITQG--GTEAVTRVIIDSEDGAGRRWSTVGVSANIIDASFEALLDAIRWKLLRDTDA 538

Query: 524 EG 525
            G
Sbjct: 539 GG 540
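
The percentages in the summary line can be checked directly from the raw counts. The sketch below is an illustration rather than part of the GapMind output. It assumes that this BLAST version reports percentages rounded down to a whole number, which is inferred only from the figures shown here; `truncated_percent` is a hypothetical helper name.

```python
# Counts copied from the BLAST summary line of the alignment above.
identities, positives, gaps, columns = 243, 336, 31, 542

def truncated_percent(count: int, total: int) -> int:
    """Percentage rounded down to a whole number (assumed BLAST behaviour)."""
    return count * 100 // total

print(truncated_percent(identities, columns))  # 44
print(truncated_percent(positives, columns))   # 61
print(truncated_percent(gaps, columns))        # 5

# Sanity check: each alignment column holds a residue from both sequences,
# or a residue from one and a gap in the other, so the aligned residues
# plus the gaps must add up to twice the number of columns.
query_residues = 525 - 6 + 1   # Query: 6..525
sbjct_residues = 540 - 8 + 1   # Sbjct: 8..540
assert query_residues + sbjct_residues + gaps == 2 * columns
```

The unrounded identity is about 44.8%, so the reported 44% slightly understates it.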


Lambda     K      H
   0.317    0.135    0.387 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 692
Number of extensions: 30
Number of successful extensions: 7
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 528
Length of database: 543
Length adjustment: 35
Effective length of query: 493
Effective length of database: 508
Effective search space:   250444
Effective search space used:   250444
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.6 bits)
S2: 52 (24.6 bits)
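
The search-space figures above follow from the sequence lengths and the length adjustment. The reported E-value can be approximated from the bit score with the standard relation E = m * n * 2^(-S'), where m * n is the effective search space and S' is the bit score. The sketch below is a worked check, not part of the BLAST output. It uses the rounded bit score of 424, so it only reproduces the order of magnitude.

```python
import math

# Values copied from the BLAST statistics above.
query_len, db_len, length_adjustment = 528, 543, 35

# The database holds a single sequence, so the adjustment is applied once.
eff_query = query_len - length_adjustment   # 493
eff_db = db_len - length_adjustment         # 508
search_space = eff_query * eff_db           # 250444

# E = m * n * 2^(-S'), evaluated in log10 space.
bit_score = 424
log10_evalue = math.log10(search_space) - bit_score * math.log10(2)

print(search_space)
print(f"E ~ 10^{log10_evalue:.1f}")  # on the order of 1e-123
```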

Align candidate GFF1624 PGA1_c16460 (2-isopropylmalate synthase/homocitrate synthase-like protein)
to HMM TIGR00977 (cimA: citramalate synthase (EC 2.3.1.182))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.aa/TIGR00977.hmm
# target sequence database:        /tmp/gapView.27202.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR00977  [M=526]
Accession:   TIGR00977
Description: citramal_synth: citramalate synthase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                          Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                          -----------
   2.9e-184  599.1   0.0   3.6e-184  598.8   0.0    1.0  1  lcl|FitnessBrowser__Phaeo:GFF1624  PGA1_c16460 2-isopropylmalate sy


Domain annotation for each sequence (and alignments):
>> lcl|FitnessBrowser__Phaeo:GFF1624  PGA1_c16460 2-isopropylmalate synthase/homocitrate synthase-like protein
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  598.8   0.0  3.6e-184  3.6e-184       2     522 ..       6     537 ..       5     540 .. 0.94

  Alignments for each domain:
  == domain 1  score: 598.8 bits;  conditional E-value: 3.6e-184
                          TIGR00977   2 lklydttlrdGaqaeGvslsledkiriaeklddlGihyieGGwpganpkdvaffekvkeenlknakvvafsstrrp 77 
                                        l+lydttlrdG q++Gv +s ++k++ia +ld lG++yieGGwpganp d  ff ++ +    +a ++af+ t+r 
  lcl|FitnessBrowser__Phaeo:GFF1624   6 LYLYDTTLRDGQQTQGVQFSTTEKVQIATALDGLGVDYIEGGWPGANPTDSGFFDAAPR---TRATMTAFGMTKRA 78 
                                        89****************************************************98755...6899********** PP

                          TIGR00977  78 dkkveedkqlqalikaetpvvtifGkswdlhveealkttleenlkmiydtveylkrfadeviydaehffdGykanp 153
                                         +  e+d  l a+++a+t  v + Gks d hv++al  tleenl+ i  ++++l  +++e ++daehffdGyk np
  lcl|FitnessBrowser__Phaeo:GFF1624  79 GRSAENDDVLAAVLNAGTAAVCLVGKSHDYHVTHALGITLEENLDNIRASIAHLVAQGREALFDAEHFFDGYKDNP 154
                                        **************************************************************************** PP

                          TIGR00977 154 eyalktlkvaekaGadwlvladtnGGtlpheieeitkkvk.krlkdpqlGihahndsetavansllaveaGavqvq 228
                                        +yal++ ++a +aGa w+vl+dtnGGtlp+++  i+ +v    l   +lGih+hnd+e ava sl+av aGa+q+q
  lcl|FitnessBrowser__Phaeo:GFF1624 155 DYALAACRAALEAGARWVVLCDTNGGTLPGDVGRIVAEVIaAGLPGDHLGIHTHNDTENAVACSLAAVDAGARQIQ 230
                                        ***********************************9998615689999**************************** PP

                          TIGR00977 229 GtinGlGercGnanlcslipnlqlkl....gldv.iekenlkkltevarlvaeivnlaldenmpyvGesafahkGG 299
                                        Gt+nGlGercGnanl +lip l lk     ++d+ + +e l  lt ++r++ ei n+ +++++ yvG safahk G
  lcl|FitnessBrowser__Phaeo:GFF1624 231 GTLNGLGERCGNANLTTLIPTLLLKPpyadQFDIgVSHEGLSTLTALSRMLDEILNRVPTKQAAYVGASAFAHKAG 306
                                        *************************7222245666899************************************** PP

                          TIGR00977 300 vhvsavkrnpktyehidpelvGnkrkivvselaGksnvleklkelGieidekspkvrkilkkikelekqGyhfeaa 375
                                        +h+sa+ ++p tyehidp lvGn r+i +s++aG+sn+ ++l e G+ +++ +p++ +il++ik  e++Gy +++a
  lcl|FitnessBrowser__Phaeo:GFF1624 307 LHASAILKDPSTYEHIDPALVGNARIIPMSNQAGQSNLRRRLSEAGLRVENGDPALARILERIKTREAEGYSYDTA 382
                                        **************************************************************************** PP

                          TIGR00977 376 easlellvrdalGkrkkyfevdgfrvliakrrdee..slseaeatvrvsvegae.......eltaaeGnGpvsald 442
                                        +as+e+l r+ lG+  ++fev+ ++v++++r+++    +s +ea v v v+g++         ++    Gpv+al 
  lcl|FitnessBrowser__Phaeo:GFF1624 383 QASFEILAREELGQLPSFFEVKRYKVTVERRKNKYdrMVSLSEAVVVVKVDGQKllsvsesLDETGSDRGPVNALA 458
                                        ***************************998877544467777788888888998455554444455568******* PP

                          TIGR00977 443 ralrkalekfypslkdlkltdykvrilnesaGtsaktrvliessdGk.rrwgtvGvseniieasytallesieykl 517
                                        +al k l ++   l d++l+d+kvri   + Gt+a trv+i+s dG  rrw+tvGvs nii+as+ all++i +kl
  lcl|FitnessBrowser__Phaeo:GFF1624 459 KALTKDLGQYSKALDDMRLVDFKVRIT--QGGTEAVTRVIIDSEDGAgRRWSTVGVSANIIDASFEALLDAIRWKL 532
                                        **************************7..579*************9659*************************** PP

                          TIGR00977 518 rkdee 522
                                         +d+ 
  lcl|FitnessBrowser__Phaeo:GFF1624 533 LRDTD 537
                                        99975 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (526 nodes)
Target sequences:                          1  (543 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.03u 0.02s 00:00:00.05 Elapsed: 00:00:00.03
# Mc/sec: 7.18
//
[ok]
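
The domain table row above records where the alignment starts and ends on both the HMM and the candidate protein, which is enough to compute how much of each is covered. The sketch below parses that single row by splitting on whitespace. It assumes the column order shown in the table header and is not a general hmmsearch parser. The variable names are illustrative.

```python
# Domain row copied from the hmmsearch output above.
domain_row = ("   1 !  598.8   0.0  3.6e-184  3.6e-184       2     522 .."
              "       6     537 ..       5     540 .. 0.94")

fields = domain_row.split()
score = float(fields[2])                            # domain bit score
hmm_from, hmm_to = int(fields[6]), int(fields[7])   # span on the model
ali_from, ali_to = int(fields[9]), int(fields[10])  # span on the protein

model_len = 526   # from "Query: TIGR00977 [M=526]"
seq_len = 543     # from "(543 residues searched)"

model_coverage = (hmm_to - hmm_from + 1) / model_len
seq_coverage = (ali_to - ali_from + 1) / seq_len

print(f"score {score}, model coverage {model_coverage:.3f}, "
      f"sequence coverage {seq_coverage:.3f}")
```

The alignment spans 521 of the 526 model positions and 532 of the 543 residues in the candidate.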

This GapMind analysis is from Apr 09 2024. The underlying query database was built on Apr 09 2024.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
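
For readers unfamiliar with the term, a six-frame translation translates a DNA sequence in all three reading frames on both strands, so that genes missed by gene prediction can still be found by a protein search. The sketch below is a minimal illustration and not GapMind's or Curated BLAST's implementation. It assumes the standard genetic code for codon-to-amino-acid assignments, and the function names are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def translate(dna: str) -> str:
    """Translate complete codons; unknown codons become 'X', stops '*'."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, usable, 3))

def six_frame_translation(dna: str) -> dict:
    """Translate all three frames of both strands, keyed '+1'..'-3'."""
    dna = dna.upper()
    frames = {}
    for sign, strand in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{sign}{offset + 1}"] = translate(strand[offset:])
    return frames

frames = six_frame_translation("ATGGCCTAA")
print(frames["+1"])  # MA*
print(frames["-1"])  # LGH
```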

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory