GapMind for catabolism of small carbon sources

 

Alignments for a candidate for acn in Mariniradius saccharolyticus AK6

Align aconitate hydratase (EC 4.2.1.3) (characterized)
to candidate WP_008623754.1 C943_RS02195 aconitate hydratase

Query= BRENDA::Q8RP87
         (747 letters)



>NCBI__GCF_000330725.2:WP_008623754.1
          Length = 753

 Score = 1041 bits (2692), Expect = 0.0
 Identities = 507/747 (67%), Positives = 595/747 (79%), Gaps = 2/747 (0%)

Query: 1   MVYDLNMLKNFYASYKGKMEHVRAALKRPLTLAEKILYTHLYNVADLKNYERGEDYVNFR 60
           M +D+ M+K  Y +Y  ++   R A+ RPLTL EKILY HL   A  + Y+RG  YV+F 
Sbjct: 1   MAFDIEMIKAVYGNYPTRIAAARKAVGRPLTLTEKILYAHLTQGAATEAYQRGGSYVDFN 60

Query: 61  PDRVAMQDATAQMALLQFMNAGKEAVAVPSTVHCDHLIQAYRGAERDIETATQTNREVYD 120
           PDRVAMQDATAQMALLQFM AG++ VAVPSTVHCDHLIQA  GA+ D++ A   NREVYD
Sbjct: 61  PDRVAMQDATAQMALLQFMQAGRDKVAVPSTVHCDHLIQAEIGADADLQKAKDKNREVYD 120

Query: 121 FLRDVSSRYGIGFWKPGAGIIHQVVLENYAFPGGMMVGTDSHTPNAGGLGMVAIGVGGAD 180
           FL  VS++YGIGFWKPGAGIIHQVVLENYAFPGGMM+GTDSHTPNAGGLGM+AIGVGGAD
Sbjct: 121 FLASVSNKYGIGFWKPGAGIIHQVVLENYAFPGGMMIGTDSHTPNAGGLGMIAIGVGGAD 180

Query: 181 AVDVMTGMEWELKMPKLIGVRLTGELNGWTAPKDVILKLAGILTVKGGTNAIIEYFGPGT 240
           A DVM G+ WELK PKLIGVRLTG L+GWT+ KDVILK+AGILTVKGGT A++EYFG G 
Sbjct: 181 ACDVMAGLPWELKFPKLIGVRLTGRLSGWTSAKDVILKVAGILTVKGGTGAVVEYFGEGA 240

Query: 241 ASLSATGKATICNMGAEVGATTSLFPYDERMAVYLKATGREEVAAMADSVAADLRADDEV 300
            SLSATGK TICNMGAE+GATTS+F YDE+   YL+ T R E+A MA+S+   L  D EV
Sbjct: 241 RSLSATGKGTICNMGAEIGATTSIFGYDEKSEAYLRGTDRAEIAEMANSIKEHLTGDAEV 300

Query: 301 MARPDDFYDRVIEINLSELEPYINGPFTPDAATPISEFAEKVVTNGYPRKMEVGLIGSCT 360
            A P+ ++D+VI+INLSELEP+INGPFTPD A P+S+FA+ V  NG+P K+EVGLIGSCT
Sbjct: 301 YANPEKYFDQVIDINLSELEPHINGPFTPDLAWPLSKFAQAVKENGWPAKLEVGLIGSCT 360

Query: 361 NSSYQDISRAVSVARQVNEKNLGVAAPLIVNPGSEQIRATAERDGMMDVFEKMGATIMAN 420
           NSSY+DISRA S+A+Q  +K L   +   + PGSEQ+R T ERDG + VFEKMG  ++AN
Sbjct: 361 NSSYEDISRAASLAQQAVDKKLIAKSEYTITPGSEQVRFTVERDGFLGVFEKMGGVVLAN 420

Query: 421 ACGPCIGQWKRHTDDPTRKNSIVTSFNRNFAKRADGNPNTFAFVASPEIVLALTIAGDLC 480
           ACGPCIGQW RH  +   KNSI+TSFNRNFAKRADGNPNT AFVASPEIV A+ I+GDL 
Sbjct: 421 ACGPCIGQWARHGAEKQEKNSIITSFNRNFAKRADGNPNTHAFVASPEIVTAMAISGDLT 480

Query: 481 FNPLKDRLVNHDGEKVKLSEPQGDELPSAGFVAGNQGYQAPG--GEKNEIRVAPDSQRLQ 538
           FNP+ D L+N +G+ VKL EP+G ELP+ GF   + GYQAP   G K ++ V P S RLQ
Sbjct: 481 FNPVTDTLINEEGKAVKLDEPKGLELPTKGFAVEDAGYQAPASDGSKVQVIVDPKSDRLQ 540

Query: 539 LLTPFPAWDGNDFLNMPLLIKAQGKCTTDHISMAGPWLRFRGHLENISDNMLMGAVNAFN 598
           LL PFPAW+G D   + LLIKA+GKCTTDHISMAGPWLRFRGHL+NIS+NML+GAVNA+N
Sbjct: 541 LLAPFPAWEGTDLKGLKLLIKAKGKCTTDHISMAGPWLRFRGHLDNISNNMLIGAVNAYN 600

Query: 599 GETNKVWNRLTNTYETVSGTAKQYKADGISSIVVAEENYGEGSSREHAAMEPRFLHVKVI 658
           GETNKV N+L   Y  V    +QYKA GI SIVV +ENYGEGSSREHAAMEPRFL V+ I
Sbjct: 601 GETNKVKNQLGGGYGEVPAVQRQYKAHGIGSIVVGDENYGEGSSREHAAMEPRFLGVRAI 660

Query: 659 LAKSFARIHETNLKKQGMLAVTFADKADYDRIREHDLISVVGLKEFSPGRNLEVILHHED 718
           L KSFARIHETNLKKQGMLA+TFA+ ADYD I+E D I ++GL +F+PG+ L V+LHH D
Sbjct: 661 LVKSFARIHETNLKKQGMLALTFANPADYDLIQEDDSIDILGLTDFAPGKPLTVVLHHAD 720

Query: 719 GTEERFAVQHTYNEQQIGWFRAGSALN 745
           G++    V HTYNE QIGWFRAGSALN
Sbjct: 721 GSKNEIKVNHTYNEGQIGWFRAGSALN 747


Lambda     K      H
   0.317    0.134    0.395 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 1539
Number of extensions: 54
Number of successful extensions: 2
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 747
Length of database: 753
Length adjustment: 40
Effective length of query: 707
Effective length of database: 713
Effective search space:   504091
Effective search space used:   504091
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 55 (25.8 bits)
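
The headline numbers in the BLAST report above can be recomputed from the text itself. The following is a minimal sketch, not part of GapMind: the function names `parse_blast_summary` and `effective_search_space` are hypothetical, and the regular expressions assume the classic BLAST text layout shown in this report.

```python
import re


def parse_blast_summary(text):
    """Extract the bit score and identity counts from a BLAST text report."""
    score = re.search(r"Score\s*=\s*([\d.]+)\s*bits", text)
    ident = re.search(r"Identities\s*=\s*(\d+)/(\d+)", text)
    if not score or not ident:
        raise ValueError("no Score/Identities lines found")
    identical, aligned = int(ident.group(1)), int(ident.group(2))
    return {
        "bits": float(score.group(1)),
        "identical": identical,
        "aligned": aligned,
        # Fraction of alignment columns that are identical, as a percentage.
        "percent_identity": 100.0 * identical / aligned,
    }


def effective_search_space(query_len, db_len, length_adjustment):
    """Product of the effective lengths, using the reported length adjustment."""
    return (query_len - length_adjustment) * (db_len - length_adjustment)


# Lines copied from the report above.
report = """ Score = 1041 bits (2692), Expect = 0.0
 Identities = 507/747 (67%), Positives = 595/747 (79%), Gaps = 2/747 (0%)"""

summary = parse_blast_summary(report)
space = effective_search_space(query_len=747, db_len=753, length_adjustment=40)
print(summary["bits"], summary["identical"], summary["aligned"])
print(space)
```

With the lengths reported above (747 and 753, adjustment 40), the effective lengths are 707 and 713, and their product matches the reported effective search space of 504091.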

Align candidate WP_008623754.1 C943_RS02195 (aconitate hydratase)
to HMM TIGR01340 (aconitate hydratase, mitochondrial (EC 4.2.1.3))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR01340.hmm
# target sequence database:        /tmp/gapView.1437006.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR01340  [M=745]
Accession:   TIGR01340
Description: aconitase_mito: aconitate hydratase, mitochondrial
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                             Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                             -----------
          0 1168.6   0.1          0 1168.4   0.1    1.0  1  NCBI__GCF_000330725.2:WP_008623754.1  


Domain annotation for each sequence (and alignments):
>> NCBI__GCF_000330725.2:WP_008623754.1  
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 ! 1168.4   0.1         0         0       6     745 .]      17     749 ..      12     749 .. 0.97

  Alignments for each domain:
  == domain 1  score: 1168.4 bits;  conditional E-value: 0
                             TIGR01340   6 ekldkvrrvlnsrpltlaekvlyshlddpeesllsqdiedvrGksylklkpdrvamqdasaqmallqflsagl 78 
                                           +++   r+ ++ rpltl+ek+ly+hl++ +    ++  +  rG sy+ ++pdrvamqda+aqmallqf+ ag 
  NCBI__GCF_000330725.2:WP_008623754.1  17 TRIAAARKAVG-RPLTLTEKILYAHLTQGAA---TEAYQ--RGGSYVDFNPDRVAMQDATAQMALLQFMQAGR 83 
                                           56667788888.**************99888...55555..******************************** PP

                             TIGR01340  79 kkvavpasvhcdhlivakkGeekdlarakelnkevfdflesaakkygidfwkpGsGiihqivlenyavpGllm 151
                                           +kvavp++vhcdhli+a++G++ dl++ak+ n+ev+dfl+s+++kygi+fwkpG+Giihq+vlenya+pG++m
  NCBI__GCF_000330725.2:WP_008623754.1  84 DKVAVPSTVHCDHLIQAEIGADADLQKAKDKNREVYDFLASVSNKYGIGFWKPGAGIIHQVVLENYAFPGGMM 156
                                           ************************************************************************* PP

                             TIGR01340 152 lGtdshtpnaGGlaaiaiGvGGadavdvlagipwelkapkvlGvkltGklsgwtspkdvilklaglltvkGGt 224
                                           +GtdshtpnaGGl++iaiGvGGada dv+ag+pwelk pk++Gv+ltG+lsgwts+kdvilk+ag+ltvkGGt
  NCBI__GCF_000330725.2:WP_008623754.1 157 IGTDSHTPNAGGLGMIAIGVGGADACDVMAGLPWELKFPKLIGVRLTGRLSGWTSAKDVILKVAGILTVKGGT 229
                                           ************************************************************************* PP

                             TIGR01340 225 GaiveyfGeGveslsctGmaticnmGaeiGattslfpfneaskdylkatnraeiaeeakvakdka...ellka 294
                                           Ga+veyfGeG  sls tG +ticnmGaeiGatts+f ++e+s+ yl+ t+raeiae a+  k++      + a
  NCBI__GCF_000330725.2:WP_008623754.1 230 GAVVEYFGEGARSLSATGKGTICNMGAEIGATTSIFGYDEKSEAYLRGTDRAEIAEMANSIKEHLtgdAEVYA 302
                                           **********************************************************966655411134457 PP

                             TIGR01340 295 dkdaeydelieidlsklephvnGpftpdlstpiskfkekvkkekwpeklkvGliGsctnssyedmsrvasivk 367
                                           + +  +d++i+i+ls+leph+nGpftpdl+ p+skf+++vk+++wp kl+vGliGsctnssyed+sr+as+++
  NCBI__GCF_000330725.2:WP_008623754.1 303 NPEKYFDQVIDINLSELEPHINGPFTPDLAWPLSKFAQAVKENGWPAKLEVGLIGSCTNSSYEDISRAASLAQ 375
                                           788899******************************************************************* PP

                             TIGR01340 368 daekaGlkskidftvtpGseqiratlerdgilevfekaGgvvlanacGpciGqwdrkdvvkkgekntiltsyn 440
                                           +a ++ l +k+++t+tpGseq+r t+erdg l vfek+GgvvlanacGpciGqw r+   +k+ekn+i+ts+n
  NCBI__GCF_000330725.2:WP_008623754.1 376 QAVDKKLIAKSEYTITPGSEQVRFTVERDGFLGVFEKMGGVVLANACGPCIGQWARHG-AEKQEKNSIITSFN 447
                                           *********************************************************9.99************ PP

                             TIGR01340 441 rnfrgrndanratmaflaspelvtalsvaGslkfnpltdslktkdGkefklkapkGdelpekgfeaGrdtfqa 513
                                           rnf  r d+n++t+af+aspe+vta++++G+l+fnp+td+l +++Gk  kl+ pkG elp+kgf      +qa
  NCBI__GCF_000330725.2:WP_008623754.1 448 RNFAKRADGNPNTHAFVASPEIVTAMAISGDLTFNPVTDTLINEEGKAVKLDEPKGLELPTKGFAVEDAGYQA 520
                                           ************************************************************************* PP

                             TIGR01340 514 esdspdenvevavdpksdrlqllepfekwngkdlkglrvlikvkGkcttdhisaaGpwlkykGhldnisentl 586
                                           ++++  ++v+v vdpksdrlqll+pf +w+g dlkgl++lik kGkcttdhis+aGpwl+++Ghldnis+n+l
  NCBI__GCF_000330725.2:WP_008623754.1 521 PASD-GSKVQVIVDPKSDRLQLLAPFPAWEGTDLKGLKLLIKAKGKCTTDHISMAGPWLRFRGHLDNISNNML 592
                                           *999.9******************************************************************* PP

                             TIGR01340 587 igavnaetgevnkvkdk.dGskgavpelakdykargvkwvvvaeenyGeGsarehaaleprylGgriiivksf 658
                                           igavna +ge nkvk++  G +g+vp++ ++yka+g+  +vv++enyGeGs+rehaa+epr+lG r+i+vksf
  NCBI__GCF_000330725.2:WP_008623754.1 593 IGAVNAYNGETNKVKNQlGGGYGEVPAVQRQYKAHGIGSIVVGDENYGEGSSREHAAMEPRFLGVRAILVKSF 665
                                           *****************8999**************************************************** PP

                             TIGR01340 659 arihetnlkkqGvlpltfaneadydkiqaedkvellnlkellknnngkevdlrvkkkngkvveiklkhtlskd 731
                                           arihetnlkkqG+l ltfan+adyd+iq +d++++l+l+++++   gk++++ +++ +g+  eik++ht ++ 
  NCBI__GCF_000330725.2:WP_008623754.1 666 ARIHETNLKKQGMLALTFANPADYDLIQEDDSIDILGLTDFAP---GKPLTVVLHHADGSKNEIKVNHTYNEG 735
                                           ******************************************9...*************************** PP

                             TIGR01340 732 qieffkaGsalnll 745
                                           qi++f+aGsalnl+
  NCBI__GCF_000330725.2:WP_008623754.1 736 QIGWFRAGSALNLI 749
                                           ************86 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (745 nodes)
Target sequences:                          1  (753 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.02u 0.01s 00:00:00.03 Elapsed: 00:00:00.02
# Mc/sec: 27.28
//
[ok]
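
The per-domain row of the hmmsearch report above gives the aligned region on both the model and the candidate. As a rough sketch (the helper name `parse_domain_row` is hypothetical, and the column positions are assumed from the domain table header printed in this report), the row can be split on whitespace to recover how much of the 745-node model is covered:

```python
def parse_domain_row(line):
    """Split one row of the hmmsearch per-domain table into named fields.

    Assumed column order (from the table header in the report):
    #, !/?, score, bias, c-Evalue, i-Evalue, hmmfrom, hmmto, <bounds>,
    alifrom, alito, <bounds>, envfrom, envto, <bounds>, acc
    """
    fields = line.split()
    return {
        "score": float(fields[2]),
        "hmm_from": int(fields[6]),
        "hmm_to": int(fields[7]),
        "ali_from": int(fields[9]),
        "ali_to": int(fields[10]),
        "acc": float(fields[15]),
    }


# Row copied from the report above; the model length comes from "[M=745]".
row = "   1 ! 1168.4   0.1         0         0       6     745 .]      17     749 ..      12     749 .. 0.97"
model_length = 745

dom = parse_domain_row(row)
# Fraction of model positions spanned by the alignment (inclusive coordinates).
hmm_coverage = (dom["hmm_to"] - dom["hmm_from"] + 1) / model_length
print(dom)
print(round(hmm_coverage, 3))
```

Here the alignment spans model positions 6 to 745, that is, 740 of the 745 nodes.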

This GapMind analysis is from Sep 24 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMER with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
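
The definition of a gap given above (a step with no high- or medium-confidence candidate) can be illustrated with a short sketch. This is not GapMind's code: the function name `find_gaps`, the input layout, and the step names other than `acn` are hypothetical, and the confidence labels are assumed to have been assigned already.

```python
def find_gaps(step_candidates):
    """Return the steps that have no high- or medium-confidence candidate.

    step_candidates maps a step name to the list of confidence labels
    ("high", "medium" or "low") of its candidates.
    """
    return [
        step
        for step, levels in step_candidates.items()
        if not any(level in ("high", "medium") for level in levels)
    ]


# Hypothetical pathway: only "acn" is taken from this page.
pathway = {
    "acn": ["high"],            # at least one high-confidence candidate
    "stepB": ["low", "low"],    # only low-confidence candidates: a gap
    "stepC": [],                # no candidates at all: a gap
}

gaps = find_gaps(pathway)
print(gaps)
```

A step with only low-confidence candidates is treated the same as a step with no candidates, which follows the wording of the definition.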

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory