GapMind for catabolism of small carbon sources

 

Alignments for a candidate for acn in Dyella japonica UNC79MFTsu3.2

Align Aconitate hydratase (EC 4.2.1.3) (characterized)
to candidate N515DRAFT_1419 N515DRAFT_1419 aconitate hydratase

Query= reanno::Marino:GFF3491
         (919 letters)



>lcl|FitnessBrowser__Dyella79:N515DRAFT_1419 N515DRAFT_1419
           aconitate hydratase
          Length = 916

 Score = 1130 bits (2923), Expect = 0.0
 Identities = 568/913 (62%), Positives = 694/913 (76%), Gaps = 11/913 (1%)

Query: 9   DSLNTLSSLDAGGKTFHYYSLPKAADTLGDLNRLPFSLKVLMENLLRNEDGTTVDRSHID 68
           DS  T  +L   G ++   SL K      D+  LP+S+K+L+ENLLR+EDG  V    I+
Sbjct: 3   DSFATRDTLKVNGSSYQIASLAKLGQRF-DIKHLPYSMKILLENLLRHEDGVNVTAKEIE 61

Query: 69  AMVQWMKDRHSDTEIQFRPARVLMQDFTGVPGVVDLAAMREAVQAAGKDPAMINPLSPVD 128
           A+ +W      DTEI F PARV++QDFTGVP VVDLAAMR+AV   G D   INPL+P +
Sbjct: 62  AVARWNPKAEPDTEIAFMPARVVLQDFTGVPCVVDLAAMRDAVVKLGGDAKQINPLAPAE 121

Query: 129 LVIDHSVMVDKFGDASSFKDNVAIEMERNQERYEFLRWGQQAFDNFRVVPPGTGICHQVN 188
           LVIDHSV VD +G  S+ + NVAIE +RNQERY FLRWGQ+AFDNF+VVPP TGI HQVN
Sbjct: 122 LVIDHSVQVDVYGSESALEQNVAIEFQRNQERYAFLRWGQKAFDNFKVVPPRTGIVHQVN 181

Query: 189 LEYLGKTVWQKDQDGKTIAYPDTLVGTDSHTTMINGLGILGWGVGGIEAEAAMLGQPVSM 248
           LEYLG+ V+  ++DG++ AYPDT+ GTDSHTTMING+G+LGWGVGGIEAEAAMLGQP SM
Sbjct: 182 LEYLGRVVFTGEKDGQSWAYPDTVFGTDSHTTMINGVGVLGWGVGGIEAEAAMLGQPSSM 241

Query: 249 LIPEVVGFKITGKLREGITATDLVLTVTEMLRKKGVVGKFVEFYGDGLKDMPVADRATIA 308
           LIP+VVGFK+TGKL EG+TATDLVLTVT+MLRK GVVGKFVEF+G GLKD+ +ADRATI 
Sbjct: 242 LIPQVVGFKLTGKLAEGVTATDLVLTVTQMLRKLGVVGKFVEFFGPGLKDLALADRATIG 301

Query: 309 NMAPEYGATCGFFPVDEQTIKYMRLTGREEEQLELVEAYAKAQGLWR-EPGHEPVYTDNL 367
           NMAPEYGATCG FPVD++ + Y+RL+GR EE +ELV+AYA+AQGLW  E      +T  L
Sbjct: 302 NMAPEYGATCGIFPVDQEALNYLRLSGRSEEHIELVKAYAQAQGLWHDENTPHAQFTTTL 361

Query: 368 ELDMGEVEASLAGPKRPQDRVALKNMKSSFELL---METAEGPAENREANLESEGGQTAV 424
           ELD+G+V  SLAGPKRPQDRV L++++ SF      +     P     +N  +EGG  A+
Sbjct: 362 ELDLGDVRPSLAGPKRPQDRVLLQDVEKSFRDALGPLTANRRPRNGDTSNFINEGGSAAI 421

Query: 425 GVDDSYKHHASQPLEMNGEKSRLDPGAVVIAAITSCTNTSNPSVMMAAGLIAQKAVQKGL 484
           G   +    +   +E NGE  RL  GAVVIAAITSCTNTSNP+VM+ AGL+A+KA  KGL
Sbjct: 422 GNPANAVSESGVLVEKNGESFRLGDGAVVIAAITSCTNTSNPAVMLGAGLLAKKAAAKGL 481

Query: 485 STKPWVKTSLAPGSKVVTDYLKVGGFQDDLDKLGFNLVGYGCTTCIGNSGPLPDAVEKAI 544
             +PWVKTSL PGSKVVTDYL+  G   +L+K+GF +VGYGCTTCIGNSGPLP  + K I
Sbjct: 482 KAQPWVKTSLGPGSKVVTDYLEKTGLLQELEKVGFYVVGYGCTTCIGNSGPLPAEISKGI 541

Query: 545 SDGDLTVASVLSGNRNFEGRVHPLVKTNWLASPPLVVAYALAGNVRLDLSQDPLGNDKDG 604
           ++GDL VASVLSGNRNFEGRVHP VK N+LASPPLVVAYALAG++ +DLS+DPLG   DG
Sbjct: 542 AEGDLAVASVLSGNRNFEGRVHPEVKMNYLASPPLVVAYALAGSLDVDLSKDPLGTGSDG 601

Query: 605 NPVYLKDLWPSQQEIAEAVE-KVKTDMFRKEYAEVFDGDATWKSIKVPESKVYEWSDKST 663
            PVYL+D+WPS QEI++ +   +   MF K YA+VF GD  W  I  P+  VY+W D ST
Sbjct: 602 QPVYLRDIWPSNQEISDTIAGAINPAMFAKNYADVFQGDDRWNHIASPDGSVYQWGD-ST 660

Query: 664 YIQHPPFFEGLKEEPDAIDDIKDANILALLGDSVTTDHISPAGSFKPDTPAGKYLQEHGV 723
           YI++PP+F+G+  E   ++DI  A +L L GDS+TTDHISPAGS K D+PAG++L   GV
Sbjct: 661 YIKNPPYFDGMTREVGKVEDIHGARVLGLFGDSITTDHISPAGSIKKDSPAGRFLIGKGV 720

Query: 724 EPKDFNSYGSRRGNHEVMMRGTFANVRIRNEMLDGVEGGYTKFVPTGEQMAIYDAAMKYQ 783
           EPKDFNSYGSRRGN +VM+RGTFAN+RIRN MLDGVEGGYT  VP+GEQ+AIYDAAMKY+
Sbjct: 721 EPKDFNSYGSRRGNDDVMVRGTFANIRIRNLMLDGVEGGYTLHVPSGEQLAIYDAAMKYK 780

Query: 784 EKGTPLVVIAGKEYGTGSSRDWAAKGTRLLGVKAVVAESYERIHRSNLIGMGVMPLQFPE 843
            + TPLVV+AGKEYGTGSSRDWAAKGT LLGVKAV+AES+ERIHRSNL+GMGV+P QF +
Sbjct: 781 AEHTPLVVLAGKEYGTGSSRDWAAKGTLLLGVKAVIAESFERIHRSNLVGMGVLPCQFED 840

Query: 844 GTDRKSLKLTGEETISIEGLS-GEIKPGQTLKMTVKYKDGSTETCELKSRIDTANEAVYF 902
           G   ++L LTG+E   I GL+ GE K     K+T    DGS +   +K  + T  E  +F
Sbjct: 841 GQSAQTLGLTGKEVFDITGLNDGESK---VAKVTATAPDGSRKEFIVKVLLLTPKEREFF 897

Query: 903 KHGGILHYVVREM 915
           +HGGIL YV+R++
Sbjct: 898 RHGGILQYVLRQL 910


Lambda     K      H
   0.315    0.134    0.390 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 2290
Number of extensions: 103
Number of successful extensions: 6
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 919
Length of database: 916
Length adjustment: 43
Effective length of query: 876
Effective length of database: 873
Effective search space:   764748
Effective search space used:   764748
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 42 (22.0 bits)
S2: 57 (26.6 bits)
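
The summary figures in the BLAST report above can be rechecked with a little arithmetic. The following is a minimal sketch, assuming the standard Karlin-Altschul conversion from raw score to bit score (using the gapped Lambda and K values) and assuming the effective search space is the product of the two effective lengths. The function names are illustrative and are not part of BLAST or GapMind.

```python
import math

def bit_score(raw_score, lam, k):
    """Convert a raw alignment score to bits: (lambda * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)

def effective_search_space(query_len, db_len, length_adjustment):
    """Product of the effective query and database lengths."""
    return (query_len - length_adjustment) * (db_len - length_adjustment)

def percent(numerator, denominator):
    """Ratio expressed as a percentage."""
    return 100.0 * numerator / denominator

# Values copied from the report above.
bits = bit_score(2923, 0.267, 0.0410)         # raw score, gapped Lambda, gapped K
space = effective_search_space(919, 916, 43)  # query length, database length, adjustment
identity = percent(568, 913)                  # identities / alignment length
query_coverage = percent(915 - 9 + 1, 919)    # aligned query positions 9..915 of 919

print(f"bit score ~ {bits:.1f}")
print(f"effective search space = {space}")
print(f"identity = {identity:.1f}%, query coverage = {query_coverage:.1f}%")
```

Because Lambda and K are printed to only a few digits, the recomputed bit score agrees with the reported 1130 bits only to within rounding.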

Align candidate N515DRAFT_1419 N515DRAFT_1419 (aconitate hydratase)
to HMM TIGR01341 (acnA: aconitate hydratase 1 (EC 4.2.1.3))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR01341.hmm
# target sequence database:        /tmp/gapView.18288.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR01341  [M=876]
Accession:   TIGR01341
Description: aconitase_1: aconitate hydratase 1
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                                    Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                                    -----------
          0 1375.9   0.0          0 1375.3   0.0    1.2  1  lcl|FitnessBrowser__Dyella79:N515DRAFT_1419  N515DRAFT_1419 aconitate hydrata


Domain annotation for each sequence (and alignments):
>> lcl|FitnessBrowser__Dyella79:N515DRAFT_1419  N515DRAFT_1419 aconitate hydratase
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 ! 1375.3   0.0         0         0       7     875 ..      22     910 ..      16     911 .. 0.96

  Alignments for each domain:
  == domain 1  score: 1375.3 bits;  conditional E-value: 0
                                    TIGR01341   7 slkaleeslekisklpkslrillesvlrnldgskikeedveallkwkkeelkdeeiafkparvvlq 72 
                                                  sl +l  +  +i++lp+s++ille++lr+ dg +++ +++ea++ w+ ++  d+eiaf+parvvlq
  lcl|FitnessBrowser__Dyella79:N515DRAFT_1419  22 SLAKLG-QRFDIKHLPYSMKILLENLLRHEDGVNVTAKEIEAVARWNPKAEPDTEIAFMPARVVLQ 86 
                                                  555555.5678******************************************************* PP

                                    TIGR01341  73 dftGvpavvdlaalreavknlgkdpekinplvpvdlvidhsvqvdkageeealeanvelefernke 138
                                                  dftGvp vvdlaa+r+av +lg+d+++inpl p++lvidhsvqvd++g+e+ale+nv +ef+rn+e
  lcl|FitnessBrowser__Dyella79:N515DRAFT_1419  87 DFTGVPCVVDLAAMRDAVVKLGGDAKQINPLAPAELVIDHSVQVDVYGSESALEQNVAIEFQRNQE 152
                                                  ****************************************************************** PP

                                    TIGR01341 139 rykflkwakkafknlkvvppgtGivhqvnleylakvvfeaekdgellaypdslvGtdshttminGl 204
                                                  ry+fl+w++kaf n+kvvpp tGivhqvnleyl++vvf+ ekdg+  aypd++ GtdshttminG+
  lcl|FitnessBrowser__Dyella79:N515DRAFT_1419 153 RYAFLRWGQKAFDNFKVVPPRTGIVHQVNLEYLGRVVFTGEKDGQSWAYPDTVFGTDSHTTMINGV 218
                                                  ****************************************************************** PP

                                    TIGR01341 205 GvlGwGvGGieaeaallGqpvslsvpeviGvkltGklreGvtatdlvltvtellrkkgvvgkfvef 270
                                                  GvlGwGvGGieaeaa+lGqp+s+ +p+v+G+kltGkl eGvtatdlvltvt++lrk gvvgkfvef
  lcl|FitnessBrowser__Dyella79:N515DRAFT_1419 219 GVLGWGVGGIEAEAAMLGQPSSMLIPQVVGFKLTGKLAEGVTATDLVLTVTQMLRKLGVVGKFVEF 284
                                                  ****************************************************************** PP

                                    TIGR01341 271 fGeglkelsladratianmapeyGataaffpiddvtlqylrltgrdedkvelvekylkaqelfvd. 335
                                                  fG+glk l+ladrati nmapeyGat+++fp+d+++l+ylrl+gr+e+++elv++y++aq+l++d 
  lcl|FitnessBrowser__Dyella79:N515DRAFT_1419 285 FGPGLKDLALADRATIGNMAPEYGATCGIFPVDQEALNYLRLSGRSEEHIELVKAYAQAQGLWHDe 350
                                                  *****************************************************************5 PP

                                    TIGR01341 336 dseepkytdvveldlsdveasvaGpkrpqdrvalkevkaafksslesnagekglalr......... 392
                                                  ++ ++++t+++eldl dv++s+aGpkrpqdrv l++v+++f+ +l   ++++              
  lcl|FitnessBrowser__Dyella79:N515DRAFT_1419 351 NTPHAQFTTTLELDLGDVRPSLAGPKRPQDRVLLQDVEKSFRDALGPLTANRRPRNGdtsnfineg 416
                                                  55669**************************************97654444432111223455678 PP

                                    TIGR01341 393 .............keakekklegkeaelkdgavviaaitsctntsnpsvllgagllakkavelGlk 445
                                                                  +    +g++ +l dgavviaaitsctntsnp+v+lgagllakka   Glk
  lcl|FitnessBrowser__Dyella79:N515DRAFT_1419 417 gsaaignpanavsESGVLVEKNGESFRLGDGAVVIAAITSCTNTSNPAVMLGAGLLAKKAAAKGLK 482
                                                  88888877764332222233359999**************************************** PP

                                    TIGR01341 446 vkpyvktslapGskvvtdylaesgllpyleelGfnlvGyGcttciGnsGpleeeveeaikendlev 511
                                                   +p+vktsl pGskvvtdyl ++gll+ le++Gf++vGyGcttciGnsGpl+ e+++ i+e+dl v
  lcl|FitnessBrowser__Dyella79:N515DRAFT_1419 483 AQPWVKTSLGPGSKVVTDYLEKTGLLQELEKVGFYVVGYGCTTCIGNSGPLPAEISKGIAEGDLAV 548
                                                  ****************************************************************** PP

                                    TIGR01341 512 savlsGnrnfegrihplvkanylaspplvvayalaGtvdidlekepigtdkdGkkvylkdiwpsak 577
                                                  ++vlsGnrnfegr+hp vk nylaspplvvayalaG++d+dl+k+p+gt+ dG++vyl+diwps++
  lcl|FitnessBrowser__Dyella79:N515DRAFT_1419 549 ASVLSGNRNFEGRVHPEVKMNYLASPPLVVAYALAGSLDVDLSKDPLGTGSDGQPVYLRDIWPSNQ 614
                                                  ****************************************************************** PP

                                    TIGR01341 578 eiaelvkkavkkelfkkeyeevtegnerwnelevtssdlyewdekstyireppffeelklepeeve 643
                                                  ei++++  a+++ +f k+y+ v++g++rwn++  +++++y+w  +styi++pp+f++++ e  +ve
  lcl|FitnessBrowser__Dyella79:N515DRAFT_1419 615 EISDTIAGAINPAMFAKNYADVFQGDDRWNHIASPDGSVYQWG-DSTYIKNPPYFDGMTREVGKVE 679
                                                  ******************************************6.7********************* PP

                                    TIGR01341 644 dikgarillllGdsittdhispaGsikkdspaakylkekGverrdfnsyGsrrGnhevmlrGtfan 709
                                                  di+gar+l l+GdsittdhispaGsikkdspa+++l+ kGve++dfnsyGsrrGn++vm+rGtfan
  lcl|FitnessBrowser__Dyella79:N515DRAFT_1419 680 DIHGARVLGLFGDSITTDHISPAGSIKKDSPAGRFLIGKGVEPKDFNSYGSRRGNDDVMVRGTFAN 745
                                                  ****************************************************************** PP

                                    TIGR01341 710 iriknklvkgkeGgltvylpdsevvsvydaamkykkegvplvvlaGkeyGsGssrdwaakgtkllG 775
                                                  iri+n +++g eGg+t+++p +e++++ydaamkyk e++plvvlaGkeyG+Gssrdwaakgt llG
  lcl|FitnessBrowser__Dyella79:N515DRAFT_1419 746 IRIRNLMLDGVEGGYTLHVPSGEQLAIYDAAMKYKAEHTPLVVLAGKEYGTGSSRDWAAKGTLLLG 811
                                                  ****************************************************************** PP

                                    TIGR01341 776 vkaviaesferihrsnlvgmGvlplefkqgedaetlgltgeetidvddieelkpkkevtvelvked 841
                                                  vkaviaesferihrsnlvgmGvlp +f++g++a+tlgltg+e  d+ ++++ +  k ++v+++  d
  lcl|FitnessBrowser__Dyella79:N515DRAFT_1419 812 VKAVIAESFERIHRSNLVGMGVLPCQFEDGQSAQTLGLTGKEVFDITGLNDGES-KVAKVTATAPD 876
                                                  ***********************************************9999765.5689******* PP

                                    TIGR01341 842 geketveavlridtevelayvkkgGilqyvlrkl 875
                                                  g+++ + +++ + t+ e +++++gGilqyvlr+l
  lcl|FitnessBrowser__Dyella79:N515DRAFT_1419 877 GSRKEFIVKVLLLTPKEREFFRHGGILQYVLRQL 910
                                                  *******************************985 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (876 nodes)
Target sequences:                          1  (916 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.06u 0.02s 00:00:00.08 Elapsed: 00:00:00.08
# Mc/sec: 9.51
//
[ok]
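
How much of the HMM is covered by the alignment can be read off the per-domain row in the hmmsearch output above. The sketch below is a rough illustration that assumes the whitespace-separated column layout shown in that row (score, bias, E-values, then the hmm, ali and env coordinate pairs). The helper names are hypothetical, and for anything beyond a quick check the tabular output written by hmmsearch's --domtblout option is a more robust thing to parse.

```python
def parse_domain_row(row):
    """Pick the fields of interest out of one per-domain row by position."""
    fields = row.split()
    return {
        "score": float(fields[2]),
        "hmm_from": int(fields[6]),
        "hmm_to": int(fields[7]),
        "ali_from": int(fields[9]),
        "ali_to": int(fields[10]),
        "acc": float(fields[15]),
    }

def model_coverage(domain, model_length):
    """Fraction of the model's match states spanned by the alignment."""
    return (domain["hmm_to"] - domain["hmm_from"] + 1) / model_length

# Row copied from the domain annotation above; TIGR01341 has M=876.
ROW = ("   1 ! 1375.3   0.0         0         0       7     875 .."
       "      22     910 ..      16     911 .. 0.96")

domain = parse_domain_row(ROW)
coverage = model_coverage(domain, 876)
print(f"model coverage = {coverage:.3f}")
```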

This GapMind analysis is from Sep 17 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
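
The definition of a gap given above (a step with no high- or medium-confidence candidate) can be sketched in a few lines. This is only an illustration: the dictionary layout, the step names and the function name are hypothetical and do not reflect GapMind's actual data structures or source code.

```python
def find_gaps(candidates_by_step):
    """Return the steps that have no high- or medium-confidence candidate."""
    gaps = []
    for step, confidences in candidates_by_step.items():
        if not any(c in ("high", "medium") for c in confidences):
            gaps.append(step)
    return gaps

# Hypothetical pathway: each step maps to the confidence labels of its candidates.
example = {
    "step_a": ["high"],
    "step_b": ["low", "low"],
    "step_c": [],
    "step_d": ["medium", "low"],
}

gaps = find_gaps(example)
print(gaps)
```

Here step_b (only low-confidence candidates) and step_c (no candidates at all) are reported as gaps.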

For more information, see the paper from 2019 on GapMind for amino acid biosynthesis, the preprint on GapMind for carbon sources, or view the source code.

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory